Full and privileged xtb integration?

Ok yes, the graphical issues are only present when running on Wayland and switching to an X11 session made them disappear. Interesting that the AppImage looks and behaves identical on both X and Wayland though.

Yes, one project before 2.0 (ideally before 1.99) would be to switch from GLEW to glbinding in part because GLEW does not support Wayland. (And in part to switch to the OpenGL 3.2 or OpenGL 4 core profile.)

A post was split to a new topic: Challenges with Python Extensions

Iā€™ve got a command script for a geometry optimization working nicely. Now Iā€™m looking to add menu items for single point and frequency calculations.

Even though Avogadro does not yet have a good universal way to display diverse calculation results (I have lots of ideas for this, Iā€™ll put them in a thread some time), I would like to pass energy results and freq results back in a way that is not dependent on Avogadroā€™s current implementations but is more agnostic for the future. cjson supports the inclusion of various properties and results, so Iā€™d like to just return various results as part of a cjson file/object.

Three questions on this:

  1. The json object that should be passed back by scripts (as a string) can contain a cjson, or another geometry specification such as xyz, as well as attributes such as moleculeFormat. However, xtb handles .xyz files well so Iā€™m using predominantly that format and currently simply return an xyz geometry. Is it possible to return a json object, called e.g. output, with output["moleculeFormat"], output["cjson"], and output["xyz"], and have all read by Avogadro?
  2. The current cjson spec requires at minimum an entry for atoms. Presumably this means I canā€™t just pass a cjson object with entries under properties and otherwise leave it empty as it will be read as invalid?
  3. Alternatively I suppose I could convert the output .xyz to .cjson using openbabel, add the extra information to the cjson, and then pass that back to Avo. Is there an easy way to go about using the openbabel included with Avogadro from the command scripts, or is it better to install separately using pip/conda?

By the way, openbabel.org is only giving me redirection errors.

Iā€™ve thought about this a bit, not surprisingly.

Thereā€™s definitely a need for scripts to indicate ā€œI have some new properties, just add them to the current molecule.ā€ For example, itā€™s easy to get xtb to punch a Molden file, and then ask Avogadro to read that. But you donā€™t really want to replace the current molecule ā€¦ maybe youā€™ve carefully set up bond orders, put some atoms in different layers ā€¦ whatever.

In the meantime, I think you want to request cjson from Avogadro.

  • The cjson will track total charge and spin multiplicity
  • The cjson will include lattice vectors if you have a periodic system.
  • Itā€™s easy enough to just update the coordinates when you pass back the cjson to Avogadro.
  • You can add properties to the cjson before you pass it back.
    atoms = np.array(mol_cjson["atoms"]["elements"]["number"])
    coord_list = mol_cjson["atoms"]["coords"]["3d"]
    coordinates = np.array(coord_list, dtype=float).reshape(-1, 3)

    charge = None  # neutral
    spin = None  # singlet
    if "properties" in mol_cjson:
        if "totalCharge" in mol_cjson["properties"]:
            charge = mol_cjson["properties"]["totalCharge"]
        if "totalSpinMultiplicity" in mol_cjson["properties"]:
            spin = mol_cjson["properties"]["totalSpinMultiplicity"]

Then letā€™s say you get new coordinates from the XYZ output from xtb:

mol_cjson["atoms"]["coords"]["3d"] = new_coords.reshape(-1)

For cases where you want to return multiple conformers or a trajectory:

mol_cjson["atoms"]["coords"]["3dsets"] = [ [ ā€¦ ], [ ā€¦] ]

I think there are probably other use cases (e.g., orbitals, vibrations) in which itā€™s probably easier to say ā€œI want add properties from this file, but not change atoms / bonds.ā€

Iā€™m leaning towards this syntax:

{
  "moleculeFormat": "molden",
  "ignoreAtoms": true
}

Iā€™m not sure the best name for the keyā€¦ ā€œignoreAtomsā€ or ā€œreadPropertiesā€.

I think thereā€™s probably also merit to just returning properties (e.g., some script that calculates pKa or whatever):

{
  properties: { "pkA": 6.74 }
}

The CJSON specification is on GitHub - Iā€™m happy to consider improvements: https://github.com/OpenChemistry/chemicaljson/master/chemicaljson.py

Yes, at the moment I think youā€™re right, requesting cjson and then converting to xyz manually is probably the only sound way to do it.

I think something like this would be by far the most sensible implementation. Sending changes back to Avogadro that are then added rather than replacing the whole file. Otherwise the burden is on the script to retain every piece of information in the original cjson, even if a different format was passed to the script!

I have now managed to get energy, optimization, and frequency calculations working. :smiley:

Unfortunately, as the result of an energy calc is currently passed back without coordinates, the molecule disappears, but I shall try to implement your suggested approach to solve it, thanks for that. The optimization works very nicely though, and is really pretty fast!

Again, currently after doing an opt any other information about the molecule gets wiped. In a way though this is correct behaviour ā€“ I should make sure that any energies or frequencies are wiped after doing an optimization anyway, since they no longer have meaning.

Couple of problems for the frequency calculation are:

  1. openbabel seems to convert the Gaussian format frequencies file produced by xtb incorrectly ā€“ all the atoms are present in the coords, all the frequencies are there, but atom 1 gets dropped from all the entries "eigenvectors" section. Iā€™ve attached the results of doing a frequency calculation on acetone: output.out and g98.out from xtb and the freq.cjson generated by openbabel. If you load the cjson, youā€™ll see the strong mode at 1700 cm^-1 is not moving quite the atoms one would expectā€¦

  2. The Vibrational Modes window only appears after switching to a different molecule and back again. Even Analysis | Vibrational Modes... remains greyed out until that is done. So the user currently receives no information that the calculation was successful :frowning:
    g98.out (12.8 KB)
    freq.cjson (12.2 KB)
    output.out (21.2 KB)

  3. It relies on openbabel being present in the system path, and a recently compiled version at that, as the last release on github doesnā€™t seem to have cjson support. So I need to work out a way to use the bundled version of openbabel really.

But Iā€™m happy that the key functionality is essentially already there. Now I just need to implement a script to execute any arbitrary command provided by the user.

I donā€™t think Iā€™d rely on openbabel in your script - there does need to be a 3.2 release, but in theory, you can request Avogadro read the g98 file as g98 format. Avogadro will hand it off to openbabel for you.

For now, for the energy calculation, you could simply indicate append:

{
  "append": true
}

https://two.avogadro.cc/develop/scripts/commands.html#appending-new-atoms-and-bonds

Iā€™ll create a PR for the ignoreAtoms and reading properties directly.

As far as updating the vibrational modes window, Iā€™d have to think about that. I guess it means after the commands are returned, the signal isnā€™t sent to update the molecule.

This would be brilliant but for whatever reason Avogadro doesnā€™t manage the conversion. Even though command-line openbabel can convert it, albeit with the error I described (though I have to specify the input format as otherwise the automatic format recognition fails). Avo canā€™t open the file directly using File | Open, and passing it back as a string within {"moleculeFormat": "g98", "g98": file_as_string} doesnā€™t work either. Iā€™ll attach the json, maybe you can see what is wrong with it.
result_json.txt (9.1 KB)

This seems to work, thanks, at least in that the molecule no longer disappears. It is impossible to tell whether the energy is being passed back since Avogadro doesnā€™t seem to process/store the cjson["properties"]["totalEnergy"] field anyway.

Two small requestsā€¦ :innocent:

(1) Would it be possible to add a field to the return json for a string that should be displayed to the user? So that for now my plugin can at least show the calculated energy to the user in the same way as happens when calculating using the forcefield framework. Something like output = {"message": ["First message to display to user", "Second message to display to user"], "cjson": cjson}

It would also be nice to use this to propagate any errors thrown by xtb back to the user, or to warn the user that negative frequencies were found.

(2) Since json.loads(sys.stdin.read()) already returns a json object with not only the requested geometry format but also other info (seemingly charge, spin, selected atoms) could it maybe become the case that the stdin json always contains the cjson of the molecule, even when another geometry format is requested and supplied? This would let me/others use another format to do the actual work but have the original cjson available to reference and return, as well as get any other useful info from.

Try this - it also adds a commit for the ā€œmessageā€ option you described.

Oh, I almost forgotā€¦

As far as I can tell, the best file format for xtb right now is actually coord for Turbomole. I added this to Avogadro2 because it supports both molecule-like XYZ format and unit cells.

water.coord

$coord angs
      -1.6750493050       1.2462366819       0.0000031757     o
      -0.7446617903       1.0957257275       0.1192738870     h
      -2.1597555509       0.4860377420       0.2994316384     h
$end

silver.coord

$coord angs
       0.0000000000       0.0000000000       0.0000000000    ag
       0.0000000000       2.0431000000       2.0431000000    ag
       2.0431000000       0.0000000000       2.0431000000    ag
       2.0431000000       2.0431000000       0.0000000000    ag
       4.0862000000       0.0000000000       0.0000000000    ag
       4.0862000000       4.0862000000       0.0000000000    ag
       4.0862000000       0.0000000000       4.0862000000    ag
       0.0000000000       4.0862000000       0.0000000000    ag
       0.0000000000       4.0862000000       4.0862000000    ag
       0.0000000000       0.0000000000       4.0862000000    ag
       4.0862000000       4.0862000000       4.0862000000    ag
       4.0862000000       2.0431000000       2.0431000000    ag
       2.0431000000       4.0862000000       2.0431000000    ag
       2.0431000000       2.0431000000       4.0862000000    ag
$periodic 3
$lattice angs
4.0862000000 0.0000000000 0.0000000000
0.0000000000 4.0862000000 0.0000000000
0.0000000000 0.0000000000 4.0862000000
$end

Give me a bit. Thatā€™s certainly used by other parts of the code. And if you have 3dSets of coordinates, you can use:

cjson["properties"]["energies"] = [ .. list ]

Letā€™s just say itā€™s on the TODO list (e.g. an energy plot).

Have you tried getting back a Molden file for the orbitals?

This works nicely, thanks! Though sometimes a second empty message box pops up too, and I canā€™t quite work out why. Perhaps it will become clear.

This does seem like a nice benefit, and Iā€™m trying to implement it so that .coord is used instead of .xyz. Tediously though, xtb always returns .coord files in bohr, even if the input was in angstrom. It looks like crest might be even fussier as well. I would also hazard to suggest that .xyz is useful as input for more different other programs e.g. Orca than .coord is.

So I might always request .coord from Avo to make parsing easier, but convert to and run with .xyz unless the $periodic or $lattice or $cell blocks are detected. Of course, if I could request Avogadro for both formats rather than a single one that would become even easierā€¦ I donā€™t know how much time the conversion to each format with openbabel costs though.

No, not yet. If I have Orbitals as a separate command script, can it pass back the .molden file with "append": true and have any previously computed properties e.g. frequencies be retained and not deleted?

Good question. Not 100% sure ā€¦ ā€œappendā€ probably still triggers an indication that the atoms / bonds changed, which is why I was suggesting the separate ā€œreadPropertiesā€ key.

Even this very simple command spawns two message boxes, by the way.
debug.py.txt (1.4 KB)

It looks like an empty dialog shows up even if there arenā€™t any options. Then the debug / error message shows up.

Iā€™ll go fix that. The code to detect that use case clearly isnā€™t good enough.

Okay, so the ā€œappendā€ part doesnā€™t work because it handles atoms and bonds. This part adds support for ā€œreadPropertiesā€ which just copies properties, charges, basis set, vibrations, etc. and leaves everything else alone.

It doesnā€™t yet work for these use cases because the surfaces and vibrations extensions donā€™t pay attention to the molecule changes. Iā€™ll take care of that tomorrow - have a few other things to do tonight.

Nice, great. :slight_smile:

Will readProperties replace the current set of properties with the new ones or just add the differences? I.e. would an extant set of vibrations be deleted by passing only orbitals?

At the moment, it replaces the properties. Safely merging properties is a bit trickier.

In the meantime, I can verify that your Frequencies and Orbitals script do work with the patch.

Would support for QLabels in the script interface specification via userOptions be possible? As the user is basically left to themselves to set up calculations other than the preset Optimize, Orbitals etc., and is required to provide the terminal command they want executed, Iā€™d like to refer the user to the xtb docs from the "Run..." window. I have already got an xtb Help option in the menu, but wouldnā€™t hurt to have a hint exactly when the user is most likely to need it.

Maybe an alternative would be a QButton that can be set to run another command script like docs.py, which is what is run by clicking xtb Help but that seems unnecessarily complex.

By the way, passing a molden file with "readProperties": True doesnā€™t seem to remove previously calculated frequencies after all. They remain available in the interface and saving the molecule and opening the cjson shows both MOs and frequencies present. :+1:

Yes. I fixed that before merging the PR.

Maybe? I mean, I can see the use case, although toolTips are also supported. I guess youā€™d want the text to span both rows of the form? Or more like:

instructions: Type a command for xtb .. see https://xtb-docs.readthedocs.io/ for more

I mean, from my perspective, you want to cover most use cases with separate scripts. Itā€™s the 80/20 or 90/10 rule ā€¦ most usage will be ā€œoptimize this,ā€ ā€œgive me some quick orbitals,ā€ and ā€œshow me the vibrations.ā€ Maybe add in ā€œcan I see a quick MD run?ā€

Adding a generic ā€œrunā€ command is somewhat useful, but more likely for people who already know what they want to do.

One other possibility would be an editable combo menu (eg., "type": "stringList") that offers a few examples, but allows free-form text entry?