Full and privileged xtb integration?

Enthusiasm is great, and I agree 100% with the point. We definitely want to make Avogadro easy to use … and to help synthetic chemists who don’t have a great intuition in 3D or understand conformers.

On the conformer side, the Open Babel conformer generator will be up and running soon. I honestly thought it was already in there. (My group will have more conformer options sometime next year.)

Yes. Conceivably it could resume after release.

Let’s see if we can come up with a checklist of smaller tasks to get this started.

As far as tasks, I’m going to suggest the task list go on GitHub for tracking: Full and privileged xtb integration · Issue #1447 · OpenChemistry/avogadrolibs · GitHub

I think the first step would be to figure out how to get Mac static binaries for the xtb releases to go with the current Linux and Win-64 builds.

That can certainly occur in parallel with itemizing the list of features (e.g., you seem very interested in crest conformers) and starting with some Python scripts because they’re easy / fast to try before the C++ implementations.

Hello,

I truly think that this kind of organization is really nice and neat.
Currently, my focus is on implementing the AMOEBA FF (and others) directly within AVOGADRO using C/C++. If you are working on the python scripts for xtb and crest, I would be more than happy to assist in integrating them directly into C, eliminating the need for a separate script after the different FFs implementation. While my proficiency in C/C++ and AVOGADRO might be basic, I am enthusiastic about contributing and learning more.

My initial thought was that would be tricky, but it does look like xtb does offer a C interface, which looks pretty similar to the xtb-python interface.
https://xtb-docs.readthedocs.io/en/latest/capi.html

This would require compiling the xtb pieces with a Fortran compiler so libxtb can be distributed (e.g., Windows, Mac, Linux AppImage) … I can provide some pointers or a branch if you’re interested in going this route.

Perhaps using that would be faster for use in your framework? And allow using GFN2-xTB for live optimization? :smiley: It would also solve any issues with python/conda packaging, I guess.

Of course, while energies and gradients (and solvation) are available via the API, frequency calculations and conformer searches are not. So I will still look at making that python script/plugin to enable use of command-line xtb via the Avogadro GUI.

Maybe? It would certainly solve the issue with GFN-FF returns an energy of 0 … but GFN2 is decidedly slower than GFN-FF.

And yes, I think having some simple scripts to calculate orbitals, vibrations, and run crest would be great.

Hi ghutchis,

Sorry to be such a pain, but I’m having difficulty even getting plugins to work. Wasn’t sure whether to open a new thread/issue for it because maybe it’s just user error rather than a general bug…

I’m on OpenSUSE Tumbleweed with KDE Plasma, up to date. I have the nightly build of 1.98.1 as an AppImage.

For a start, the Download Plugins window is completely unpopulated, with only the column headers and the Download Selected button to be seen.

I tried cloning the avogadro-scikit-nano repo, then dragging the python scripts into the Avogadro window. The “Install plugin script” dialog appears and I can choose the command option, then click ok, but nothing changes.

I tried making a folder at ~/.local/share/avogadro (it didn’t yet exist) and adding the avogadro-scikit-nano directory to it, but it changed nothing.

I also started playing around making my own python script and can also find no way to add it to Avogadro.

Any ideas? Sadly my only machines are Linux and a Windows ARM Surface, so I can’t just use a Windows or Mac version.

Ok I just about managed to build Avogadro from source on my Linux PC. Something is screwy with the graphics (possibly due to Wayland) so it’s not a useable version yet but it allows me to have a look at how some things work. For example, I can run the binary from the command line and now I see that I had the wrong path for plugin script location. I have moved my plugin and the avogadro-scikit-nano to ~/.local/share/OpenChemistry/Avogadro/commands/
I checked the AppImage and the plugins are still not loaded.

The GUI functions enough for me to open Extensions|Download Plugins… and it seems that now it is populated the way it should be. (So it seems something is broken in the AppImage, I’ll open an issue for it.) I could successfully download e.g. the nanocar plugin.

On starting the binary Avogadro prints that it found all three scripts but that it can’t load them. The same applies for the other 48 scripts that it finds in the various directories. This also explains why I only have the Lennard-Jones force field available and none of the others.

Bit of a facepalm moment for me when I just now realised I can also get the printed terminal output by starting the AppImage from the CLI. But there it says exactly the same, that the scripts are found but can’t be loaded. Though it also finds an extra 6 scripts, not sure why.

Ok yes, the graphical issues are only present when running on Wayland and switching to an X11 session made them disappear. Interesting that the AppImage looks and behaves identical on both X and Wayland though.

Yes, one project before 2.0 (ideally before 1.99) would be to switch from GLEW to glbinding in part because GLEW does not support Wayland. (And in part to switch to the OpenGL 3.2 or OpenGL 4 core profile.)

A post was split to a new topic: Challenges with Python Extensions

I’ve got a command script for a geometry optimization working nicely. Now I’m looking to add menu items for single point and frequency calculations.

Even though Avogadro does not yet have a good universal way to display diverse calculation results (I have lots of ideas for this, I’ll put them in a thread some time), I would like to pass energy results and freq results back in a way that is not dependent on Avogadro’s current implementations but is more agnostic for the future. cjson supports the inclusion of various properties and results, so I’d like to just return various results as part of a cjson file/object.

Three questions on this:

  1. The json object that should be passed back by scripts (as a string) can contain a cjson, or another geometry specification such as xyz, as well as attributes such as moleculeFormat. However, xtb handles .xyz files well so I’m using predominantly that format and currently simply return an xyz geometry. Is it possible to return a json object, called e.g. output, with output["moleculeFormat"], output["cjson"], and output["xyz"], and have all read by Avogadro?
  2. The current cjson spec requires at minimum an entry for atoms. Presumably this means I can’t just pass a cjson object with entries under properties and otherwise leave it empty as it will be read as invalid?
  3. Alternatively I suppose I could convert the output .xyz to .cjson using openbabel, add the extra information to the cjson, and then pass that back to Avo. Is there an easy way to go about using the openbabel included with Avogadro from the command scripts, or is it better to install separately using pip/conda?

By the way, openbabel.org is only giving me redirection errors.

I’ve thought about this a bit, not surprisingly.

There’s definitely a need for scripts to indicate “I have some new properties, just add them to the current molecule.” For example, it’s easy to get xtb to punch a Molden file, and then ask Avogadro to read that. But you don’t really want to replace the current molecule … maybe you’ve carefully set up bond orders, put some atoms in different layers … whatever.

In the meantime, I think you want to request cjson from Avogadro.

  • The cjson will track total charge and spin multiplicity
  • The cjson will include lattice vectors if you have a periodic system.
  • It’s easy enough to just update the coordinates when you pass back the cjson to Avogadro.
  • You can add properties to the cjson before you pass it back.
    atoms = np.array(mol_cjson["atoms"]["elements"]["number"])
    coord_list = mol_cjson["atoms"]["coords"]["3d"]
    coordinates = np.array(coord_list, dtype=float).reshape(-1, 3)

    charge = None  # neutral
    spin = None  # singlet
    if "properties" in mol_cjson:
        if "totalCharge" in mol_cjson["properties"]:
            charge = mol_cjson["properties"]["totalCharge"]
        if "totalSpinMultiplicity" in mol_cjson["properties"]:
            spin = mol_cjson["properties"]["totalSpinMultiplicity"]

Then let’s say you get new coordinates from the XYZ output from xtb:

mol_cjson["atoms"]["coords"]["3d"] = new_coords.reshape(-1)

For cases where you want to return multiple conformers or a trajectory:

mol_cjson["atoms"]["coords"]["3dsets"] = [ [ … ], [ …] ]

I think there are probably other use cases (e.g., orbitals, vibrations) in which it’s probably easier to say “I want add properties from this file, but not change atoms / bonds.”

I’m leaning towards this syntax:

{
  "moleculeFormat": "molden",
  "ignoreAtoms": true
}

I’m not sure the best name for the key… “ignoreAtoms” or “readProperties”.

I think there’s probably also merit to just returning properties (e.g., some script that calculates pKa or whatever):

{
  properties: { "pkA": 6.74 }
}

The CJSON specification is on GitHub - I’m happy to consider improvements: https://github.com/OpenChemistry/chemicaljson/master/chemicaljson.py

Yes, at the moment I think you’re right, requesting cjson and then converting to xyz manually is probably the only sound way to do it.

I think something like this would be by far the most sensible implementation. Sending changes back to Avogadro that are then added rather than replacing the whole file. Otherwise the burden is on the script to retain every piece of information in the original cjson, even if a different format was passed to the script!

I have now managed to get energy, optimization, and frequency calculations working. :smiley:

Unfortunately, as the result of an energy calc is currently passed back without coordinates, the molecule disappears, but I shall try to implement your suggested approach to solve it, thanks for that. The optimization works very nicely though, and is really pretty fast!

Again, currently after doing an opt any other information about the molecule gets wiped. In a way though this is correct behaviour – I should make sure that any energies or frequencies are wiped after doing an optimization anyway, since they no longer have meaning.

Couple of problems for the frequency calculation are:

  1. openbabel seems to convert the Gaussian format frequencies file produced by xtb incorrectly – all the atoms are present in the coords, all the frequencies are there, but atom 1 gets dropped from all the entries "eigenvectors" section. I’ve attached the results of doing a frequency calculation on acetone: output.out and g98.out from xtb and the freq.cjson generated by openbabel. If you load the cjson, you’ll see the strong mode at 1700 cm^-1 is not moving quite the atoms one would expect…

  2. The Vibrational Modes window only appears after switching to a different molecule and back again. Even Analysis | Vibrational Modes... remains greyed out until that is done. So the user currently receives no information that the calculation was successful :frowning:
    g98.out (12.8 KB)
    freq.cjson (12.2 KB)
    output.out (21.2 KB)

  3. It relies on openbabel being present in the system path, and a recently compiled version at that, as the last release on github doesn’t seem to have cjson support. So I need to work out a way to use the bundled version of openbabel really.

But I’m happy that the key functionality is essentially already there. Now I just need to implement a script to execute any arbitrary command provided by the user.

I don’t think I’d rely on openbabel in your script - there does need to be a 3.2 release, but in theory, you can request Avogadro read the g98 file as g98 format. Avogadro will hand it off to openbabel for you.

For now, for the energy calculation, you could simply indicate append:

{
  "append": true
}

https://two.avogadro.cc/develop/scripts/commands.html#appending-new-atoms-and-bonds

I’ll create a PR for the ignoreAtoms and reading properties directly.

As far as updating the vibrational modes window, I’d have to think about that. I guess it means after the commands are returned, the signal isn’t sent to update the molecule.

This would be brilliant but for whatever reason Avogadro doesn’t manage the conversion. Even though command-line openbabel can convert it, albeit with the error I described (though I have to specify the input format as otherwise the automatic format recognition fails). Avo can’t open the file directly using File | Open, and passing it back as a string within {"moleculeFormat": "g98", "g98": file_as_string} doesn’t work either. I’ll attach the json, maybe you can see what is wrong with it.
result_json.txt (9.1 KB)

This seems to work, thanks, at least in that the molecule no longer disappears. It is impossible to tell whether the energy is being passed back since Avogadro doesn’t seem to process/store the cjson["properties"]["totalEnergy"] field anyway.

Two small requests… :innocent:

(1) Would it be possible to add a field to the return json for a string that should be displayed to the user? So that for now my plugin can at least show the calculated energy to the user in the same way as happens when calculating using the forcefield framework. Something like output = {"message": ["First message to display to user", "Second message to display to user"], "cjson": cjson}

It would also be nice to use this to propagate any errors thrown by xtb back to the user, or to warn the user that negative frequencies were found.

(2) Since json.loads(sys.stdin.read()) already returns a json object with not only the requested geometry format but also other info (seemingly charge, spin, selected atoms) could it maybe become the case that the stdin json always contains the cjson of the molecule, even when another geometry format is requested and supplied? This would let me/others use another format to do the actual work but have the original cjson available to reference and return, as well as get any other useful info from.

Try this - it also adds a commit for the “message” option you described.

Oh, I almost forgot…

As far as I can tell, the best file format for xtb right now is actually coord for Turbomole. I added this to Avogadro2 because it supports both molecule-like XYZ format and unit cells.

water.coord

$coord angs
      -1.6750493050       1.2462366819       0.0000031757     o
      -0.7446617903       1.0957257275       0.1192738870     h
      -2.1597555509       0.4860377420       0.2994316384     h
$end

silver.coord

$coord angs
       0.0000000000       0.0000000000       0.0000000000    ag
       0.0000000000       2.0431000000       2.0431000000    ag
       2.0431000000       0.0000000000       2.0431000000    ag
       2.0431000000       2.0431000000       0.0000000000    ag
       4.0862000000       0.0000000000       0.0000000000    ag
       4.0862000000       4.0862000000       0.0000000000    ag
       4.0862000000       0.0000000000       4.0862000000    ag
       0.0000000000       4.0862000000       0.0000000000    ag
       0.0000000000       4.0862000000       4.0862000000    ag
       0.0000000000       0.0000000000       4.0862000000    ag
       4.0862000000       4.0862000000       4.0862000000    ag
       4.0862000000       2.0431000000       2.0431000000    ag
       2.0431000000       4.0862000000       2.0431000000    ag
       2.0431000000       2.0431000000       4.0862000000    ag
$periodic 3
$lattice angs
4.0862000000 0.0000000000 0.0000000000
0.0000000000 4.0862000000 0.0000000000
0.0000000000 0.0000000000 4.0862000000
$end

Give me a bit. That’s certainly used by other parts of the code. And if you have 3dSets of coordinates, you can use:

cjson["properties"]["energies"] = [ .. list ]

Let’s just say it’s on the TODO list (e.g. an energy plot).

Have you tried getting back a Molden file for the orbitals?

This works nicely, thanks! Though sometimes a second empty message box pops up too, and I can’t quite work out why. Perhaps it will become clear.

This does seem like a nice benefit, and I’m trying to implement it so that .coord is used instead of .xyz. Tediously though, xtb always returns .coord files in bohr, even if the input was in angstrom. It looks like crest might be even fussier as well. I would also hazard to suggest that .xyz is useful as input for more different other programs e.g. Orca than .coord is.

So I might always request .coord from Avo to make parsing easier, but convert to and run with .xyz unless the $periodic or $lattice or $cell blocks are detected. Of course, if I could request Avogadro for both formats rather than a single one that would become even easier… I don’t know how much time the conversion to each format with openbabel costs though.

No, not yet. If I have Orbitals as a separate command script, can it pass back the .molden file with "append": true and have any previously computed properties e.g. frequencies be retained and not deleted?