Python help wanted - minimize SVG (vs. PNG)

tldr, I’d like someone to help write some Python to scour, simplify, and compress SVG depictions of our ligands and fragments. Should save a bunch of space…

In the new template tool, there are preview images for the ligands, functional groups, etc.

Both @matterhorn103 and @thomas have helped with the scripts:

  • cleaning and minimizing the cjson files
  • cleaning up the image generation

Since SVG support is generally good (e.g., we’re using SVG tool icons), I think we should drop the PNG files in favor of using SVG depictions directly.

Here’s where I need some help. The current SVG depictions generated by RDKit can be optimized a lot if someone can help with some Python.

  • There’s a Python module, scour, that reduced one image by 57% (e.g., the PNG is 10K vs. 3K for the compressed SVG)
  • There’s clearly irrelevant info because these are auto-generated, e.g.
<path class="bond-0 atom-0 atom-1" d="m21.6 106.6 15.1-8.7" fill="none" stroke="#191919" stroke-width="2px"/>
<path class="bond-0 atom-0 atom-1" d="m36.7 97.9 15.1-8.8" fill="none" stroke="#000" stroke-width="2px"/>
<path class="bond-1 atom-1 atom-2" d="m54.4 90.6v-12.4" fill="none" stroke="#000" stroke-width="2px"/>

Clearly all the black bonds, red bonds, etc. can be compressed to one class, and then fill, stroke, etc, can be simplified.
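
As a rough illustration of the idea above, here is a minimal sketch that collapses the repeated fill/stroke/stroke-width attributes on `<path>` elements into a few shared CSS classes. The function name `collapse_path_styles` and the class names (`s0`, `s1`, ...) are made up for this example. It assumes that nothing downstream relies on the auto-generated `bond-N atom-N` classes and that the SVG renderer in use honors `<style>` blocks, which would need checking.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
STYLE_ATTRS = ("fill", "stroke", "stroke-width")


def collapse_path_styles(svg_text: str) -> str:
    """Replace per-path presentation attributes with shared CSS classes."""
    ET.register_namespace("", SVG_NS)  # serialize without an ns0: prefix
    root = ET.fromstring(svg_text)
    classes = {}  # style tuple -> short class name (hypothetical naming)

    for path in root.iter(f"{{{SVG_NS}}}path"):
        style = tuple(
            (attr, path.get(attr)) for attr in STYLE_ATTRS if path.get(attr) is not None
        )
        if not style:
            # Nothing to collapse; just drop the auto-generated class.
            path.attrib.pop("class", None)
            continue
        name = classes.setdefault(style, f"s{len(classes)}")
        for attr, _ in style:
            del path.attrib[attr]
        # Overwrites the auto-generated "bond-N atom-N ..." class (assumed unused).
        path.set("class", name)

    if classes:
        css = "".join(
            f".{name}{{" + ";".join(f"{a}:{v}" for a, v in style) + "}"
            for style, name in classes.items()
        )
        style_el = ET.Element(f"{{{SVG_NS}}}style")
        style_el.text = css
        root.insert(0, style_el)

    return ET.tostring(root, encoding="unicode")
```

Paths that share the same three attribute values end up with the same class, so a file with only black and red bonds would carry two short rules instead of repeating the attributes on every path.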

Anyone have a bit of free time? I can send a bunch of SVG depictions as examples:
svg.tar.gz (12.8 KB)

I’m a bit on the fence here: on one hand, I like the CPK-like color highlights on N and O because they are easier to discern; on the other, F, Cl, and S are less legible against a white background (either paper, or glossy laptop screens in a bright environment).

Inspecting the test SVG you shared, many coordinates are defined with four decimals; this high precision likely isn’t necessary (and eats storage). I’m aware of svgcleaner, archived on GitHub; it is implemented in Rust and can run from the shell (e.g., bash on Debian Linux after a chmod +x). If you don’t intend to have a git diff view later, it works well even on larger sets of .svg files (example).
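
To make the precision point concrete, here is a small sketch that shortens decimals inside path `d` attributes with the standard library only. The function names `round_coords` and `round_svg_paths` are invented for this example. It assumes numbers are separated by whitespace, commas, signs, or command letters (not the fully compact form some minifiers emit, where `.5.5` means two numbers), so it would be run before any other optimizer.

```python
import re

# Unsigned decimal numbers; a leading minus sign is left in place as-is.
_NUMBER = re.compile(r"\d*\.\d+")
# A d="..." or d='...' attribute (the leading whitespace avoids matching id=).
_D_ATTR = re.compile(r"""(\sd=)(["'])(.*?)\2""", re.S)


def round_coords(d: str, digits: int = 1) -> str:
    """Round every decimal number in a path data string to `digits` places."""

    def shorten(match: re.Match) -> str:
        text = f"{float(match.group()):.{digits}f}"
        if "." in text:
            # Drop trailing zeros and a dangling decimal point: "2.0" -> "2".
            text = text.rstrip("0").rstrip(".")
        return text

    return _NUMBER.sub(shorten, d)


def round_svg_paths(svg_text: str, digits: int = 1) -> str:
    """Apply round_coords to path data only, leaving other attributes alone."""
    return _D_ATTR.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{round_coords(m.group(3), digits)}{m.group(2)}",
        svg_text,
    )
```

One decimal is only a starting guess; whether it is enough depends on the size of the drawing canvas and would need a visual check.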

Attached is an MWE with your test data (incl. one file compressed manually; a loop in the shell can handle the others).

smaller.tar.gz (826.7 KB)

We can certainly tweak the element colors for better contrast on white background. (Hmm, seems like another recent thread.) Certainly the yellow S isn’t great.

Seems like svgcleaner is doing a better job than scour. I was curious about a script because it was obvious that a variety of text was still unnecessary.

So it seems like that’s the best solution, albeit after picking some colors with better contrast?

Any suggestions for color replacements? Or should we add a thin black stroke to the letters like sulfur?
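
One way to compare candidate colors objectively is the WCAG contrast ratio against the white background (WCAG asks for at least 4.5:1 for normal text and 3:1 for large text). The sketch below implements that formula with the standard library. The hex values in `candidates` are illustrative guesses for this example, not the colors RDKit or the current depictions actually use.

```python
def _linear(channel_8bit: int) -> float:
    """Convert an sRGB channel (0-255) to linear light, per the WCAG definition."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    h = hex_color.lstrip("#")
    if len(h) == 3:  # expand shorthand such as "#fff"
        h = "".join(ch * 2 for ch in h)
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str = "#ffffff") -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


# Illustrative candidates only; these hex values are assumptions.
candidates = {
    "S (yellow)": "#cccc00",
    "S (dark gold)": "#806600",
    "N (blue)": "#3333ff",
}
for label, color in candidates.items():
    print(f"{label}: {contrast_ratio(color):.2f}:1 on white")
```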

This is not an easy question. On one hand, there are the color schemes already around (CPK, old/new Rasmol, as illustrated by Jmol’s documentation); on the other, colorbrewer2 knows only one scale with four qualitative levels that is considered (by their measures) colorblind safe.

Will the background for the previews remain fixed to bright/white (like the light theme here: black characters on a white ground), or optionally be dark (like the dark theme here: white, for non-link text, in front of charcoal)?

It seems sensible to check RDKit’s default, bitonal BW, Avalon, and CDK palettes (as described on Greg Landrum’s blog) against the default (white) and/or other backgrounds, using an experimental branch of depict_ligands.py in fragments.


Even though “dark mode” is becoming common, I think it’s better for now to stick to a white background for these. Admittedly if it’s an SVG it would be a bit easier to program color swaps (e.g. you swap the text for #000 and #fff before rendering).
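
The color swap mentioned above could look roughly like this, as a plain text substitution done before the SVG is handed to the renderer. The function name `swap_black_white` is made up for this sketch, and it assumes black and white appear as hex literals (`#000`, `#000000`, `#fff`, `#ffffff`) rather than as named colors or `rgb(...)` values.

```python
import re

# Black <-> white, in both short and long hex notation.
_SWAP = {
    "#000": "#fff",
    "#000000": "#ffffff",
    "#fff": "#000",
    "#ffffff": "#000000",
}
# Try six-digit colors first so "#000000" is not split into "#000" + "000".
_HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def swap_black_white(svg_text: str) -> str:
    """Exchange black and white in one pass; every other color is untouched."""
    return _HEX_COLOR.sub(
        lambda m: _SWAP.get(m.group().lower(), m.group()),
        svg_text,
    )
```

Doing the exchange in a single regex pass avoids the classic pitfall of two sequential `replace` calls, where everything would end up one color.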

If you can check out a few atom palettes in the fragments scripts, that would be great (or just change a few light elements with setAtomPalette)

It’s also fairly easy in the SVG to add stroke="#000" stroke-width="0.5px".
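
A minimal sketch of that outline idea: add a thin black stroke to every element drawn in one low-contrast fill color. The function name `outline_fill` and the example color are assumptions for illustration. It also assumes the color is given as a plain `fill` attribute; if the generator writes it inside a `style` attribute instead, the matching would need adjusting.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def outline_fill(svg_text: str, fill: str, stroke: str = "#000", width: str = "0.5px") -> str:
    """Add a thin outline to every element whose fill attribute equals `fill`."""
    ET.register_namespace("", SVG_NS)  # serialize without an ns0: prefix
    root = ET.fromstring(svg_text)
    for element in root.iter():
        if (element.get("fill") or "").lower() == fill.lower():
            element.set("stroke", stroke)
            element.set("stroke-width", width)
    return ET.tostring(root, encoding="unicode")
```

For example, `outline_fill(svg_text, "#cccc00")` would outline only the shapes filled with that (hypothetical) yellow and leave every other element as it was.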

Hi @ghutchis (and everyone),
I’ve just opened a pull request that attempts to address this (PR #2071). I’ve wired in scour for now, but I’m happy to swap to svgcleaner if that’s the preferred optimizer. In summary, the PR:

  1. Replaces all ligand & fragment PNGs with optimized SVGs
  • The .png preview icons have been swapped out for .svg
  • SVGs were generated via RDKit, cleaned/minified (using a small Python script + scour), then pushed through an SVG optimizer
  2. Contains build scripts (in scripts/)
  • read_cjson.py: extract SMILES from .cjson templates
  • generate_svgs.py: batch-generate raw SVG depictions from those SMILES
  • optimize_svgs.py: run scour to strip metadata, collapse IDs, shorten precision, etc.
  3. Updated files
  • template.qrc & CMakeLists.txt: now reference .svg files instead of .png, and link against Qt’s SVG module
  • Minor tweaks in templatetoolwidget.cpp to load the SVG icons correctly
  4. File size decrease
  • On average, each ligand SVG is ~60% smaller than its PNG
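
For anyone who wants to check a size figure like this on their own checkout, a sketch along these lines would do it. The function name `average_reduction` and the directory arguments are placeholders, and it assumes each PNG has an SVG with the same base name.

```python
from pathlib import Path


def average_reduction(png_dir, svg_dir) -> float:
    """Mean per-file size reduction (in percent) of SVGs relative to their PNGs."""
    reductions = []
    for png in sorted(Path(png_dir).glob("*.png")):
        svg = Path(svg_dir) / (png.stem + ".svg")
        png_size = png.stat().st_size
        if svg.exists() and png_size > 0:
            reductions.append(1 - svg.stat().st_size / png_size)
    # PNGs without an SVG counterpart are skipped rather than counted as 0%.
    return 100 * sum(reductions) / len(reductions) if reductions else 0.0
```

Note this averages the per-file percentages; the reduction in total bytes across all files can differ if file sizes vary a lot.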

I’m still getting up to speed with the Avogadro codebase and C++ development more broadly, so I’d really appreciate any feedback on style, CMake conventions, or anything I may have overlooked. Thanks! :slightly_smiling_face:


Wow, thanks and welcome! :rocket:

In general, this looks great. I’ll do a code review tomorrow on the pull request. You somewhat did things the hard way (i.e., CJSON ⇒ SMILES).

The various fragments, etc. actually come from the OpenChemistry/fragments repository on GitHub (molecular fragments and inorganic ligands for rapidly building structures), and they’re generally built up from SMILES. There’s a discussion in Pull Request #32 of that repository (“light revision of and about usage of `depict_ligands.py`” by nbehrnd), with @Thomas recently suggesting switching to plain black-on-white for contrast and colorblind safety. (This also has the benefit of making it easy to swap to “dark mode” with the SVG, since the code can just exchange #fff with #000.)

But that’s somewhat separate. Updating the SVG is easy once template.qrc and templatetoolwidget.cpp handle these.

No worries. It’s a big codebase. Please ask and I’ll be happy to point you in the right direction or otherwise offer help.
