Unusual molecules / geometries for UFF tests

I’ve started work on a new implementation of UFF for Avogadro. Since this one won’t require Open Babel, it should be a good fallback.

Right now, I’m working on angle terms.

So I’d like to get a set of unusual molecules (e.g., rare coordination, distorted geometries) to use as tests.

Why? For example, some force field implementations have terms like

\frac{1}{\sin \theta}

Obviously, these blow up with angles near 0° or 180°.
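
As a rough illustration of how fast such a factor grows, here is a minimal sketch. It assumes nothing about any particular force field implementation; the helper name `inv_sin` is made up for this example. One common origin of the factor is the chain rule: if the angle is computed as θ = arccos(u), then dθ/du = -1/sin θ, so any energy term written directly in θ inherits it in the gradient.

```python
import math

def inv_sin(theta_deg):
    """Return 1/sin(theta) for an angle given in degrees (hypothetical helper)."""
    return 1.0 / math.sin(math.radians(theta_deg))

# Print the factor for angles approaching the linear limit of 180 degrees.
for angle in (90.0, 150.0, 175.0, 179.0, 179.9, 179.99):
    print(f"{angle:7.2f} deg   1/sin = {inv_sin(angle):12.1f}")
```

The factor is harmless near 90° but grows without bound towards linear geometries, which is why near-linear and near-collapsed angles are worth including in any test set.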

So before I do much more work, I want to make sure the gradients work with all sorts of unusual geometries.

I’m not sure whether this touches on the parametrization of a force field (like Rappé’s UFF, attached). However, the COD allows a text search for “distorted”; some examples are attached separately. I’m not sure whether they qualify as “unusual” or “distorted [enough]”, though.

It might be worth adding that the COD is much more open/permissive about extracting the relevant data from its .cif files to build and publish force fields than the CCDC (Cambridge, UK) is about the CSD.

rappe_uff.tar.gz (2.4 MB)
distorted.tar.gz (20.9 KB)


I’ll validate the UFF implementation, but note that the publication has a number of known typos:
https://towhee.sourceforge.net/forcefields/uff.html

But yes, my goal has always been to use COD data. All the crystal files in Avogadro have come from COD and I’ve always recommended it to others.

I’ll take a look at the distorted structures from COD. Should be interesting…


Incidentally, I’ll mostly move towards some simpler unit tests (e.g., a water molecule with a bond angle anywhere between 0° and 180°). I can see that my derived gradients aren’t implemented correctly right now because they don’t match the numeric ones.
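
A minimal sketch of that kind of check, comparing an analytic gradient against central finite differences on a water-like geometry. This is not the Avogadro code, and the energy term is a placeholder: E = k/2 (cos θ - cos θ0)² stands in for the real UFF angle term, and the force constant, equilibrium angle, O-H distance and all function names are assumptions chosen only for illustration.

```python
import math

K = 100.0          # assumed force constant (arbitrary units)
THETA0 = 104.5     # assumed equilibrium angle in degrees
R_OH = 0.96        # assumed O-H distance

def _cos_theta(coords):
    """Return (u, v, |u|, |v|, cos theta) for atoms a-b-c with b at the apex."""
    a, b, c = coords
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    cos_t = sum(u[i] * v[i] for i in range(3)) / (nu * nv)
    return u, v, nu, nv, cos_t

def angle_energy(coords):
    """Placeholder term E = K/2 (cos theta - cos theta0)^2, not the UFF form."""
    cos_t = _cos_theta(coords)[4]
    d = cos_t - math.cos(math.radians(THETA0))
    return 0.5 * K * d * d

def analytic_gradient(coords):
    """Hand-derived dE/dx for each atom; contains no 1/sin(theta) factor."""
    u, v, nu, nv, cos_t = _cos_theta(coords)
    pref = K * (cos_t - math.cos(math.radians(THETA0)))
    ga = [pref * (v[i] / (nu * nv) - cos_t * u[i] / nu**2) for i in range(3)]
    gc = [pref * (u[i] / (nu * nv) - cos_t * v[i] / nv**2) for i in range(3)]
    gb = [-(ga[i] + gc[i]) for i in range(3)]  # translational invariance
    return [ga, gb, gc]

def numeric_gradient(coords, h=1e-5):
    """Central finite differences, one Cartesian coordinate at a time."""
    grad = []
    for atom in range(3):
        row = []
        for axis in range(3):
            plus = [list(p) for p in coords]
            minus = [list(p) for p in coords]
            plus[atom][axis] += h
            minus[atom][axis] -= h
            row.append((angle_energy(plus) - angle_energy(minus)) / (2.0 * h))
        grad.append(row)
    return grad

def water(theta_deg):
    """H-O-H coordinates (order: H, O, H) with the requested bond angle."""
    t = math.radians(theta_deg)
    return [[R_OH, 0.0, 0.0],
            [0.0, 0.0, 0.0],
            [R_OH * math.cos(t), R_OH * math.sin(t), 0.0]]

def max_gradient_error(theta_deg):
    """Largest absolute difference between analytic and numeric components."""
    coords = water(theta_deg)
    ana = analytic_gradient(coords)
    num = numeric_gradient(coords)
    return max(abs(ana[i][j] - num[i][j]) for i in range(3) for j in range(3))

# Scan the full range, including the degenerate end points.
for angle in (0.0, 1.0, 60.0, 104.5, 120.0, 179.0, 180.0):
    print(f"{angle:6.1f} deg   max |analytic - numeric| = {max_gradient_error(angle):.2e}")
```

A mismatch that shows up at every angle points to an error in the derivation, while one that only appears near 0° or 180° points to a numerical issue such as the 1/sin θ factor mentioned earlier.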


This is an interesting link, not only for UFF but also because it compiles commented details about many other force fields in one spot. Developers in the field likely know it; at the user level (which includes me), it is interesting to see which motifs contribute(d) to the parametrizations (example). This is a bit like consulting table 9.5.1.1 in chapter 9.5 of volume C of the International Tables for Crystallography: instead of “length of a C-C single bond” alone, there is a line about “C(sp³)-C(sp²) if C(sp²) is within a cyclopentanone”, plus some statistics about the CSD data known at the time of writing. (The adjacent chapters survey interatomic distances/bond lengths in inorganic compounds (9.4) and in organometallic and coordination complexes (9.6).)

table_9_5_1_1.tar.gz (143.0 KB)

Just to bump this thread up… I haven’t started on tests yet (since some of the gradients are buggy) but I have most energy terms implemented.

Every time I work on UFF code, I’m wondering why there isn’t a more modern version using the abundant CSD / COD files:

  • tweaks for bond distances based on formal charges (e.g., \ce{Pd^{2+}} vs. \ce{Pd^{4+}} or \ce{Fe^0} vs. \ce{Fe^{3+}})
  • adding electrostatics with atomic polarizable sites
  • etc.

It probably won’t be anywhere near as good as GFN2, but it seems like it might still be useful as a cleanup step.

Today, Ulrich Schatzschneider contributed an unusual thallium complex, with a formal coordination number of 24, to the InChI Trust’s discussion of InChI (thread on GitHub). The archive below includes his deposited .sdf (V2000) for CSD code INOCEW.

INOCEW.tar.gz (4.9 KB)
