Avogadro 1.95: Problems in bond perception (correctly performed under OB 3.1.1)

I believe these are three related bugs with Avogadro:

  1. when importing XYZ files no automatic bond perception is performed, resulting in all single bonds;
  2. when reading CML files already featuring double bonds (even as simple as 2-butene and even when generated by Avogadro 1.95 itself), they are not recognized/drawn. SDF files (generated by Avogadro or Open Babel) are correctly read;
  3. if bond perception on the imported structure is attempted through:
    Extensions->Open Babel->Perceive Bonds
    Avogadro either crashes or gets stuck forever while converting XYZ to CML.

All the structures are correctly read and processed (with inclusion of double bond even from the XYZ file) in Avogadro 1.2.0 (featuring OB 2.3.90).
This same conversion is performed flawlessly by Open Babel 3.1.1 (on the same machine, run through OpenBabelGUI), providing correct bond identification and CML or SDF file outputs.

Environment Information

Avogadro version: 1.95.0
Operating system and version:
Windows 10 Home; Version: 20H2; Build: 19042.1237; Windows Feat. Exp. Pack: 120.2212.3530.0

Expected Behavior

a) Correctly perceiving bonds when reading XYZ files.
b) Correctly interpreting bonds described in CML files.
c) Correctly perceiving bonds (or, at least, issuing an informative message) when launching the “Perceive Bonds” command.

Actual Behavior

a) XYZ files are read, but all bonds are interpreted as single bonds.
b) The same holds for CML files featuring the correct double bond patterns.
c) When trying the perception through “Extensions->Open Babel->Perceive Bonds” the program gets stuck during XYZ to CML conversion or crashes.

Steps to Reproduce

Read an XYZ or CML file (the procedure failed even for molecules as simple as trans-2-butene when reading the correct CML file generated by Avogadro 1.95 itself!).
For issues a) and b): observe the imported molecule.
For issue c): launch the Extensions->Open Babel->Perceive Bonds command.

Draw a simple molecule containing one or more double bonds.
Save in XYZ, CML and SDF
Read back the three files in Avogadro 1.95 (as a comparison, read them in Avogadro 1.2.0).

############## 2-butene XYZ file #################
12
XYZ file generated by Avogadro.
C -5.23179 0.54409 -0.05424
C -3.88234 1.20713 -0.21043
H -0.71979 0.67518 -0.55012
H -1.47327 2.27460 -0.49759
H -0.93564 1.50060 0.99977
H -5.71686 0.43454 -1.02665
H -5.14037 -0.44825 0.39364
H -5.88068 1.14713 0.58434
C -2.74076 0.65826 0.18670
C -1.39070 1.31790 0.02378
H -3.88641 2.18751 -0.68142
H -2.73659 -0.32262 0.65695
######################################

2-butene

Ignoring bond perception in XYZ is by design. It might be debatable, but OB bond perception can become very slow on fused ring systems (e.g., nanotubes, graphene sheets, etc.). People took this as “Avogadro crashed.”

I can’t reproduce the CML issue. Perhaps you can post a link to a CML file with that issue?

As far as the crash in “Perceive Bonds,” my first question would be whether other Open Babel tasks work for you (e.g., force fields, importing files through Open Babel, etc.)

Ignoring bond perception in XYZ is by design. It might be debatable, but OB bond perception can become very slow on fused ring systems (e.g., nanotubes, graphene sheets, etc.). People took this as “Avogadro crashed.”

I suspected this. Probably, introducing bond perception (by OB and/or a simplified distance-based method) as an optional feature in file import could be the best solution.
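To make the "simplified distance-based method" idea concrete, here is a minimal Python sketch. It is not how Open Babel or Avogadro perceive bonds: the covalent radii, the 0.45 Å tolerance and the C-C length thresholds used to guess bond orders are all assumed, illustrative values, and the function names are hypothetical. It only handles C and H and runs on the 2-butene coordinates posted above.

```python
import math
from itertools import combinations

# Approximate covalent radii in angstroms (assumed values; only C and H are needed here).
COVALENT_RADII = {"H": 0.31, "C": 0.76}

XYZ_TEXT = """12
XYZ file generated by Avogadro.
C -5.23179 0.54409 -0.05424
C -3.88234 1.20713 -0.21043
H -0.71979 0.67518 -0.55012
H -1.47327 2.27460 -0.49759
H -0.93564 1.50060 0.99977
H -5.71686 0.43454 -1.02665
H -5.14037 -0.44825 0.39364
H -5.88068 1.14713 0.58434
C -2.74076 0.65826 0.18670
C -1.39070 1.31790 0.02378
H -3.88641 2.18751 -0.68142
H -2.73659 -0.32262 0.65695
"""


def parse_xyz(text):
    """Return a list of (element, (x, y, z)) tuples from XYZ-format text."""
    lines = text.strip().splitlines()
    atom_count = int(lines[0])
    atoms = []
    for line in lines[2:2 + atom_count]:  # skip the count and comment lines
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def guess_order(sym_a, sym_b, distance):
    """Very crude order guess for C-C bonds only; thresholds are illustrative."""
    if sym_a == sym_b == "C":
        if distance < 1.25:
            return 3
        if distance < 1.40:
            return 2
    return 1


def perceive_bonds(atoms, tolerance=0.45):
    """Bond two atoms when their distance is within the sum of radii plus a tolerance."""
    bonds = []
    for (i, (sym_i, pos_i)), (j, (sym_j, pos_j)) in combinations(enumerate(atoms), 2):
        distance = math.dist(pos_i, pos_j)
        if distance <= COVALENT_RADII[sym_i] + COVALENT_RADII[sym_j] + tolerance:
            bonds.append((i, j, guess_order(sym_i, sym_j, distance)))
    return bonds


bonds = perceive_bonds(parse_xyz(XYZ_TEXT))
double_bonds = [(i, j) for i, j, order in bonds if order == 2]
print(len(bonds), "bonds; double bonds between atom indices:", double_bonds)
```

The pairwise loop scales quadratically with atom count, so a real implementation would likely need a neighbor grid for large systems, and length-based order guessing will not handle aromatic or conjugated systems properly.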

I can’t reproduce the CML issue. Perhaps you can post a link to a CML file with that issue?

Here is the link to the CML file of the very simple trans-2-butene (generated by Avogadro 1.95 itself): https://cloud.icb.cnr.it/s/PFHkpqkt74aQ8ZQ
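To check whether the double bond is actually present in a CML file before Avogadro reads it, a small script like the following sketch can count the bond orders stored in the file. The embedded CML fragment is hand-written for illustration only (it is not the content of the linked file), and the function name is hypothetical; the sketch assumes that bond orders are stored in the `order` attribute of `bond` elements.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written CML fragment for illustration only (carbon skeleton of 2-butene,
# hydrogens omitted); this is NOT the content of the linked file.
SAMPLE_CML = """<molecule xmlns="http://www.xml-cml.org/schema">
  <atomArray>
    <atom id="a1" elementType="C"/>
    <atom id="a2" elementType="C"/>
    <atom id="a3" elementType="C"/>
    <atom id="a4" elementType="C"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a2 a3" order="2"/>
    <bond atomRefs2="a3 a4" order="1"/>
  </bondArray>
</molecule>
"""


def count_bond_orders(cml_text):
    """Count how many <bond> elements carry each value of the 'order' attribute."""
    root = ET.fromstring(cml_text)
    orders = Counter()
    for elem in root.iter():
        # Strip any XML namespace so the check works with or without one.
        if elem.tag.rsplit("}", 1)[-1] == "bond":
            orders[elem.get("order", "unspecified")] += 1
    return orders


print(dict(count_bond_orders(SAMPLE_CML)))
```

If the file on disk reports a bond of order 2 (or "D") but the molecule is drawn with single bonds only, the problem is on the reading side rather than in the file.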

As far as the crash in “Perceive Bonds,” my first question would be whether other Open Babel tasks work for you (e.g., force fields, importing files through Open Babel, etc.)

Technically, changing force field settings and performing minimization generally works (there are issues with the force fields or topologies generated by OB, but these are OB-related problems that I’m trying to characterize better before posting about them). Adding or deleting hydrogens sometimes works, but in other cases it crashes Avogadro. I haven’t posted about this yet because I haven’t performed a more systematic analysis.
As written in my previous post, standalone OB 3.1.1 works flawlessly on the same PC.

I don’t know, but the current build works fine. There were a few bond bugs fixed in 1.95.1. Want to try a nightly?
https://nightly.link/OpenChemistry/avogadrolibs/workflows/build_cmake/master/Win64.exe

I’ll look through the Open Babel integration. I’m surprised that it could crash, since obabel is run in a separate process, but I also don’t have a Windows machine to try. I’m working to load a Win10 VM on my desktop.

As to bond order perception… At the moment, my plate is overflowing. I’m trying to finish off three significant features for 2.0 (new force field framework, new click-to-bond-fragment builder, and editable property table). I had something like that for OB-2.0 so I’ll see what I can dig out.

I’ve tried the nightly build. To avoid possible disturbance from other programs, I also deleted any user or system path entry pointing to OB or Avogadro related folders except for the Avogadro2 one.
This is a summary of the main results of my tests (the corresponding 1.95.0 behavior is given in parentheses for comparison):

  • file formats including complete topological info (SDF, CML) are correctly interpreted and read in the same way by Avogadro and OB filters (under 1.95.0 only SDF files are correctly interpreted by Avogadro filter; double bonds are totally missing when reading CML with Avogadro filtering and both SDF and CML with OB filtering);
  • file formats with no or partial topological info (XYZ, PDB) are correctly interpreted by OB filter and, by design, only translated into single bonds with Avogadro filter (all single bonds in both cases under 1.95.0);
  • OB “Bond perception” applied to an Avogadro-filtered XYZ file (apparently randomly and even on the same molecules) works, crashes or gets stuck (crashes or gets stuck in 1.95.0);
  • Energy Minimization (and changes in FF/EM setup) works flawlessly (OK in 1.95.0);
  • hydrogen removal through OB crashes Avogadro (crashes in 1.95.0);
  • hydrogen addition through OB crashes Avogadro (crashes in 1.95.0);
  • Avogadro’s own hydrogen addition from the Build menu (automatic and manual) occasionally fails to add some H atoms (see NOTE b) (OK in 1.95.0).

Incidentally:
thanks to initial problems during installation, I discovered an orphaned obmm.exe process still running. This process could explain some of the apparently odd/erratic behavior seen in previous tests, although apparently erratic behavior also occurs when no other process is running or left over. In particular, no such processes are found after nightly build crashes;
the problem with ORCA outputs reported in “OB issues” (Error converting ORCA output files #1361), where OB reads the starting structure rather than the final one (or all of them) in EM calculations, still persists in the latest versions of OB and ORCA and in the ORCA file reading implemented in Avogadro.

NOTES:
a) since you mentioned a new FF framework, I got strange results on some heterocyclic molecules for GAFF and Ghemical FFs (MMFF94, MMFF94s and UFF results are OK). This is a FF/topological problem deserving further characterization and I hope the new FF framework will help. At present, upon independent tests on other programs featuring GAFF implementations, I have clues that part of the problem may arise from GAFF itself and/or critical atom type assignments. I should get further info on this issue in some days;
b) the apparently erratic Build -> Add hydrogens problem may be related to more extensive structure build/edit issues I’m experiencing in 1.95.1 (OK in 1.95.0): hydrogens disappear and/or apparently random long bonds are created when drawing a structure by dragging; drawing by dragging only works fine for two atoms, while building by changing H atoms into other elements seems to work. I’ll post about these issues after further tests;
c) when Avogadro 1.95.1 gets stuck, there is only an unresponsive Avogadro process running, with no related OB exe in the full process list;
d) the OB 3.1.0 version installed with Avogadro 1.95.0, when used in a terminal to convert XYZ to either SDF or CML files, produces structures with correct double bond patterns. The H deletion option also works flawlessly, while OB (as well as the independently installed OB 3.1.1) gets stuck when adding hydrogens with the -h option.
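For anyone who wants to repeat the terminal tests in note d) without waiting indefinitely on a hang, here is a hedged Python sketch that wraps the `obabel` command line in a timeout. The file names, the 60-second limit and both helper function names are hypothetical, and it assumes `obabel` is on the PATH; `-O` (output file), `-h` (add hydrogens) and `-d` (delete hydrogens) are the usual obabel options.

```python
import subprocess


def build_obabel_command(input_path, output_path, add_hydrogens=False, delete_hydrogens=False):
    """Assemble an obabel conversion command as an argument list."""
    cmd = ["obabel", input_path, "-O", output_path]
    if add_hydrogens:
        cmd.append("-h")  # add hydrogens
    if delete_hydrogens:
        cmd.append("-d")  # delete hydrogens
    return cmd


def run_with_timeout(cmd, timeout_s=60):
    """Run a command and report 'ok', 'failed', 'timeout' or 'not found'."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"  # the process was killed after timeout_s seconds
    except FileNotFoundError:
        return "not found"  # the executable is not on the PATH
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Hypothetical file names; adjust to the files being tested.
    for label, kwargs in [("plain", {}), ("add H", {"add_hydrogens": True}),
                          ("delete H", {"delete_hydrogens": True})]:
        command = build_obabel_command("butene.xyz", "butene.cml", **kwargs)
        print(label, "->", run_with_timeout(command))
```

A "timeout" result for the `-h` run alongside "ok" for the others would match the behavior described above and gives something reproducible to attach to a bug report.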


Great, thanks that helps.

There are a few existing bug reports about Avogadro’s hydrogen addition. I’m pretty sure I know where the bug is, but it’s a sneaky one. :bug:

I’ll take a look at the Orca bug. I can suggest as a workaround installing the avogadro-cclib plugin, which will add an import option through cclib if it’s installed in your Python environment. No bond order perception, but we use it in the group to grab all the optimization steps, trajectories, vibrations, etc.

a) Please report any GAFF issues to the Open Babel GitHub tracker. David van der Spoel’s group worked with us to fix atom type assignments, but we can work on that. I have a project that should result in a more modern version of UFF, so I’d be curious to see your test cases too. (My project allows non-integer bond orders, which helps conjugated systems a lot.)

b) As I said, I’m aware of the bugs with Avogadro2 hydrogen addition. Pretty sure I’ll be able to squash it in the next week or two, but I’m not going to release 1.96 until it’s fixed.

c) and d) Good to know, thanks.

I’m glad I was able to help.

As for the cclib plugin to read ORCA output, I’m trying to find a reasonable way to manage Python versions, environments, tools and applications on my PC. This problem, already annoying under Linux, turns into a real nightmare under Windows!

FF-related issues, and GAFF issues in particular, are presently under investigation in my lab. I’ll report on the Open Babel GitHub tracker as soon as I have consistent results on some test systems. I’ll also send you the most relevant molecules. As for non-integer bond orders, I’d like to discuss some related subjects I encountered when implementing metal-organics in our molecular database. At present, I’m facing similar problems again in implementing mass spectrometry fragments, since we are turning the DB into a complete platform to manage full natural product chemistry and metabolomics workflows.

As for hydrogen addition and, more generally, molecule drawing issues, they occur more heavily in 1.95.1, leading to a network of long bonds and making it impossible to draw new bonds or adjust the H distribution. In 1.95.0, some H atoms disappeared only occasionally when attaching new atoms, and the correct protonation was recovered after H addition.

Yes. We’re looking at ways to add Avogadro2 as a conda package. (Tomviz is already there)

In any case, OB needs support for Orca5.

As far as force field issues, I’d be interested in further discussion - send me an e-mail. :slightly_smiling_face:

I’ll look more deeply into the hydrogen addition issues. Please feel free to open new bug reports, they’re very helpful.