Full and privileged xtb integration?

matterhorn103 · November 13, 2023, 6:26pm

Ok yes, the graphical issues are only present when running on Wayland and switching to an X11 session made them disappear. Interesting that the AppImage looks and behaves identically on both X and Wayland though.

ghutchis · November 13, 2023, 6:42pm

Yes, one project before 2.0 (ideally before 1.99) would be to switch from GLEW to glbinding in part because GLEW does not support Wayland. (And in part to switch to the OpenGL 3.2 or OpenGL 4 core profile.)

ghutchis · November 22, 2023, 4:33pm

A post was split to a new topic: Challenges with Python Extensions

matterhorn103 · November 23, 2023, 12:57pm

I’ve got a command script for a geometry optimization working nicely. Now I’m looking to add menu items for single point and frequency calculations.

Even though Avogadro does not yet have a good universal way to display diverse calculation results (I have lots of ideas for this, I’ll put them in a thread some time), I would like to pass energy results and freq results back in a way that is not dependent on Avogadro’s current implementations but is more agnostic for the future. cjson supports the inclusion of various properties and results, so I’d like to just return various results as part of a cjson file/object.

Three questions on this:

1. The json object that should be passed back by scripts (as a string) can contain a cjson, or another geometry specification such as xyz, as well as attributes such as moleculeFormat. However, xtb handles .xyz files well so I’m using predominantly that format and currently simply return an xyz geometry. Is it possible to return a json object, called e.g. output, with output["moleculeFormat"], output["cjson"], and output["xyz"], and have all read by Avogadro?
2. The current cjson spec requires at minimum an entry for atoms. Presumably this means I can’t just pass a cjson object with entries under properties and otherwise leave it empty as it will be read as invalid?
3. Alternatively I suppose I could convert the output .xyz to .cjson using openbabel, add the extra information to the cjson, and then pass that back to Avo. Is there an easy way to go about using the openbabel included with Avogadro from the command scripts, or is it better to install separately using pip/conda?

By the way, openbabel.org is only giving me redirection errors.

ghutchis · November 24, 2023, 12:37am

I’ve thought about this a bit, not surprisingly.

There’s definitely a need for scripts to indicate “I have some new properties, just add them to the current molecule.” For example, it’s easy to get xtb to punch a Molden file, and then ask Avogadro to read that. But you don’t really want to replace the current molecule … maybe you’ve carefully set up bond orders, put some atoms in different layers … whatever.

In the meantime, I think you want to request cjson from Avogadro.

The cjson will track total charge and spin multiplicity.
The cjson will include lattice vectors if you have a periodic system.
It’s easy enough to just update the coordinates when you pass back the cjson to Avogadro.
You can add properties to the cjson before you pass it back.

    import numpy as np

    # mol_cjson is the cjson dict that Avogadro passed to the script
    atoms = np.array(mol_cjson["atoms"]["elements"]["number"])
    coord_list = mol_cjson["atoms"]["coords"]["3d"]
    coordinates = np.array(coord_list, dtype=float).reshape(-1, 3)

    charge = None  # neutral
    spin = None  # singlet
    if "properties" in mol_cjson:
        if "totalCharge" in mol_cjson["properties"]:
            charge = mol_cjson["properties"]["totalCharge"]
        if "totalSpinMultiplicity" in mol_cjson["properties"]:
            spin = mol_cjson["properties"]["totalSpinMultiplicity"]

Then let’s say you get new coordinates from the XYZ output from xtb:

mol_cjson["atoms"]["coords"]["3d"] = new_coords.reshape(-1).tolist()  # plain list, so it serializes to JSON

For cases where you want to return multiple conformers or a trajectory:

mol_cjson["atoms"]["coords"]["3dSets"] = [ [ … ], [ … ] ]
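
To make the round trip concrete, here is a minimal sketch, assuming the script requested cjson but wants to hand xtb an XYZ file and read the optimized XYZ back. The helper names and the tiny symbol table are made up for illustration (they are not part of Avogadro or xtb), and a real script would need a full element table.

```python
# Hypothetical, deliberately tiny lookup table; a real script needs every element.
SYMBOLS = {1: "H", 6: "C", 7: "N", 8: "O"}


def cjson_to_xyz(mol_cjson):
    """Build XYZ-format text from the atoms block of a cjson dict."""
    numbers = mol_cjson["atoms"]["elements"]["number"]
    flat = mol_cjson["atoms"]["coords"]["3d"]
    lines = [str(len(numbers)), "generated from cjson"]
    for i, number in enumerate(numbers):
        x, y, z = flat[3 * i : 3 * i + 3]
        lines.append(f"{SYMBOLS[number]} {x:.8f} {y:.8f} {z:.8f}")
    return "\n".join(lines) + "\n"


def update_coords_from_xyz(mol_cjson, xyz_text):
    """Replace the flat "3d" coordinate list with values parsed from XYZ text."""
    # Skip the atom-count and comment lines, and ignore trailing blank lines.
    atom_lines = [ln for ln in xyz_text.splitlines()[2:] if ln.strip()]
    flat = [float(v) for line in atom_lines for v in line.split()[1:4]]
    expected = 3 * len(mol_cjson["atoms"]["elements"]["number"])
    if len(flat) != expected:
        raise ValueError("XYZ atom count does not match the cjson")
    mol_cjson["atoms"]["coords"]["3d"] = flat  # plain list, so json.dumps works
    return mol_cjson
```

Because only the coordinate list is overwritten, everything else in the original cjson (bond orders, layers, charge, spin) is passed back untouched.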

I think there are other use cases (e.g., orbitals, vibrations) in which it’s probably easier to say “I want to add properties from this file, but not change atoms / bonds.”

I’m leaning towards this syntax:

{
  "moleculeFormat": "molden",
  "ignoreAtoms": true
}

I’m not sure the best name for the key… “ignoreAtoms” or “readProperties”.

I think there’s probably also merit to just returning properties (e.g., some script that calculates pKa or whatever):

{
  "properties": { "pKa": 6.74 }
}

The CJSON specification is on GitHub - I’m happy to consider improvements: https://github.com/OpenChemistry/chemicaljson/master/chemicaljson.py

matterhorn103 · November 24, 2023, 1:20am

Yes, at the moment I think you’re right, requesting cjson and then converting to xyz manually is probably the only sound way to do it.

I think something like this would be by far the most sensible implementation. Sending changes back to Avogadro that are then added rather than replacing the whole file. Otherwise the burden is on the script to retain every piece of information in the original cjson, even if a different format was passed to the script!

I have now managed to get energy, optimization, and frequency calculations working.

Unfortunately, as the result of an energy calc is currently passed back without coordinates, the molecule disappears, but I shall try to implement your suggested approach to solve it, thanks for that. The optimization works very nicely though, and is really pretty fast!

Again, currently after doing an opt any other information about the molecule gets wiped. In a way though this is correct behaviour – I should make sure that any energies or frequencies are wiped after doing an optimization anyway, since they no longer have meaning.

A couple of problems with the frequency calculation:

openbabel seems to convert the Gaussian format frequencies file produced by xtb incorrectly – all the atoms are present in the coords, all the frequencies are there, but atom 1 gets dropped from all the entries in the "eigenvectors" section. I’ve attached the results of doing a frequency calculation on acetone: output.out and g98.out from xtb and the freq.cjson generated by openbabel. If you load the cjson, you’ll see the strong mode at 1700 cm^-1 is not moving quite the atoms one would expect…
The Vibrational Modes window only appears after switching to a different molecule and back again. Even Analysis | Vibrational Modes... remains greyed out until that is done. So the user currently receives no indication that the calculation was successful.
g98.out (12.8 KB)
freq.cjson (12.2 KB)
output.out (21.2 KB)
The script currently relies on openbabel being present in the system path, and a recently compiled version at that, as the last release on GitHub doesn’t seem to have cjson support. So I really need to work out a way to use the bundled version of openbabel.

But I’m happy that the key functionality is essentially already there. Now I just need to implement a script to execute any arbitrary command provided by the user.

ghutchis · November 24, 2023, 2:51am

I don’t think I’d rely on openbabel in your script - there does need to be a 3.2 release, but in theory, you can request Avogadro read the g98 file as g98 format. Avogadro will hand it off to openbabel for you.

For now, for the energy calculation, you could simply indicate append:

{
  "append": true
}

https://two.avogadro.cc/develop/scripts/commands.html#appending-new-atoms-and-bonds
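
As a rough sketch of what that could look like for an energy-only result: the function name is made up, the energy is a placeholder rather than a real xtb result, and whether Avogadro accepts a cjson without an atoms entry here is an assumption based on this thread, not something the spec guarantees.

```python
import json


def energy_only_output(energy):
    """Return a result that adds an energy without sending any atoms back."""
    return {
        "moleculeFormat": "cjson",
        # No atoms are included, so "append" should leave the existing ones alone.
        "cjson": {"properties": {"totalEnergy": energy}},
        "append": True,
    }


energy = -1.0  # placeholder value, not a real xtb result
# A command script would print this string to stdout for Avogadro to read.
output_text = json.dumps(energy_only_output(energy))
```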

I’ll create a PR for the ignoreAtoms and reading properties directly.

As far as updating the vibrational modes window, I’d have to think about that. I guess it means after the commands are returned, the signal isn’t sent to update the molecule.

matterhorn103 · November 24, 2023, 10:51am

This would be brilliant but for whatever reason Avogadro doesn’t manage the conversion. Even though command-line openbabel can convert it, albeit with the error I described (though I have to specify the input format as otherwise the automatic format recognition fails). Avo can’t open the file directly using File | Open, and passing it back as a string within {"moleculeFormat": "g98", "g98": file_as_string} doesn’t work either. I’ll attach the json, maybe you can see what is wrong with it.
result_json.txt (9.1 KB)

This seems to work, thanks, at least in that the molecule no longer disappears. It is impossible to tell whether the energy is being passed back since Avogadro doesn’t seem to process/store the cjson["properties"]["totalEnergy"] field anyway.

Two small requests…

(1) Would it be possible to add a field to the return json for a string that should be displayed to the user? So that for now my plugin can at least show the calculated energy to the user in the same way as happens when calculating using the forcefield framework. Something like output = {"message": ["First message to display to user", "Second message to display to user"], "cjson": cjson}

It would also be nice to use this to propagate any errors thrown by xtb back to the user, or to warn the user that negative frequencies were found.
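
A minimal sketch of how a script might assemble such a message. It assumes the proposed "message" key is accepted as a single string (the list form would need the reader to handle arrays), and that frequencies are stored under cjson["vibrations"]["frequencies"]; both are assumptions, and the function names are hypothetical.

```python
def build_message(energy, mol_cjson):
    """Assemble a user-facing summary string for the proposed "message" key."""
    lines = [f"Energy: {energy:.6f}"]
    # Assumption: frequencies live under cjson["vibrations"]["frequencies"].
    frequencies = mol_cjson.get("vibrations", {}).get("frequencies", [])
    negative = [f for f in frequencies if f < 0]
    if negative:
        lines.append(f"Warning: negative frequencies found ({len(negative)})")
    return "\n".join(lines)


def with_message(mol_cjson, energy):
    """Bundle the cjson and the summary message into one return object."""
    return {
        "moleculeFormat": "cjson",
        "cjson": mol_cjson,
        "message": build_message(energy, mol_cjson),
    }
```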

(2) Since json.loads(sys.stdin.read()) already returns a json object with not only the requested geometry format but also other info (seemingly charge, spin, selected atoms) could it maybe become the case that the stdin json always contains the cjson of the molecule, even when another geometry format is requested and supplied? This would let me/others use another format to do the actual work but have the original cjson available to reference and return, as well as get any other useful info from.
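
On the script side that would allow something like the sketch below. The key names "charge" and "spin" and the defaults are guesses from what the stdin json seems to contain, and the "cjson" entry is only present if Avogadro actually supplies it; the function name is hypothetical.

```python
import json


def split_payload(payload_text, requested_format="xyz"):
    """Separate the requested geometry from the other data in the stdin JSON."""
    payload = json.loads(payload_text)
    geometry = payload.get(requested_format)  # e.g. the XYZ text to give to xtb
    mol_cjson = payload.get("cjson")          # None if Avogadro did not send it
    charge = payload.get("charge", 0)         # assumed key name and default
    spin = payload.get("spin", 1)             # assumed key name and default
    return geometry, mol_cjson, charge, spin
```

In a real command script the argument would be sys.stdin.read(); the work is done on the geometry string, while the cjson is kept around to be modified and returned.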

ghutchis · November 24, 2023, 4:31pm

Try this - it also adds a commit for the “message” option you described.

github.com/OpenChemistry/avogadrolibs

Always supply cjson to scripts

OpenChemistry:master ← ghutchis:always-supply-cjson-scripts

opened 04:30PM - 24 Nov 23 UTC

ghutchis

+50 -21


ghutchis · November 24, 2023, 5:02pm

Oh, I almost forgot…

As far as I can tell, the best file format for xtb right now is actually coord for Turbomole. I added this to Avogadro2 because it supports both molecule-like XYZ format and unit cells.

water.coord

$coord angs
      -1.6750493050       1.2462366819       0.0000031757     o
      -0.7446617903       1.0957257275       0.1192738870     h
      -2.1597555509       0.4860377420       0.2994316384     h
$end

silver.coord

$coord angs
       0.0000000000       0.0000000000       0.0000000000    ag
       0.0000000000       2.0431000000       2.0431000000    ag
       2.0431000000       0.0000000000       2.0431000000    ag
       2.0431000000       2.0431000000       0.0000000000    ag
       4.0862000000       0.0000000000       0.0000000000    ag
       4.0862000000       4.0862000000       0.0000000000    ag
       4.0862000000       0.0000000000       4.0862000000    ag
       0.0000000000       4.0862000000       0.0000000000    ag
       0.0000000000       4.0862000000       4.0862000000    ag
       0.0000000000       0.0000000000       4.0862000000    ag
       4.0862000000       4.0862000000       4.0862000000    ag
       4.0862000000       2.0431000000       2.0431000000    ag
       2.0431000000       4.0862000000       2.0431000000    ag
       2.0431000000       2.0431000000       4.0862000000    ag
$periodic 3
$lattice angs
4.0862000000 0.0000000000 0.0000000000
0.0000000000 4.0862000000 0.0000000000
0.0000000000 0.0000000000 4.0862000000
$end

ghutchis · November 24, 2023, 6:26pm

Give me a bit. That’s certainly used by other parts of the code. And if you have 3dSets of coordinates, you can use:

cjson["properties"]["energies"] = [ .. list ]

Let’s just say it’s on the TODO list (e.g. an energy plot).

Have you tried getting back a Molden file for the orbitals?

matterhorn103 · November 25, 2023, 1:03am

This works nicely, thanks! Though sometimes a second empty message box pops up too, and I can’t quite work out why. Perhaps it will become clear.

This does seem like a nice benefit, and I’m trying to implement it so that .coord is used instead of .xyz. Tediously though, xtb always returns .coord files in bohr, even if the input was in angstrom. It looks like crest might be even fussier as well. I would also hazard to suggest that .xyz is useful as input for a wider range of other programs (e.g. Orca) than .coord is.

So I might always request .coord from Avo to make parsing easier, but convert to and run with .xyz unless the $periodic or $lattice or $cell blocks are detected. Of course, if I could ask Avogadro for both formats rather than a single one that would become even easier… I don’t know how much time the conversion to each format with openbabel costs though.
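
A rough sketch of the parsing side, assuming a plain Turbomole-style coord file where the $coord line is either bare (bohr) or carries an angs keyword. The function name is made up, the lattice itself is not parsed, and other variants such as fractional coordinates are simply rejected.

```python
BOHR_TO_ANGSTROM = 0.529177210903  # CODATA 2018 value of the bohr radius in angstrom


def parse_coord(text):
    """Return (symbols, coordinates in angstrom, is_periodic) from coord text."""
    symbols, coords = [], []
    periodic = False
    in_coord = False
    scale = BOHR_TO_ANGSTROM
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("$"):
            keyword = stripped.split()
            in_coord = keyword[0] == "$coord"
            if in_coord:
                unit = keyword[1].lower() if len(keyword) > 1 else "bohr"
                if unit.startswith("angs"):
                    scale = 1.0
                elif unit == "bohr":
                    scale = BOHR_TO_ANGSTROM
                else:
                    raise ValueError(f"unsupported $coord unit: {unit}")
            elif keyword[0] in ("$periodic", "$lattice", "$cell"):
                periodic = True  # only flagged here; the cell is not read
            continue
        if in_coord:
            x, y, z, symbol = stripped.split()[:4]
            coords.append([float(x) * scale, float(y) * scale, float(z) * scale])
            symbols.append(symbol.capitalize())
    return symbols, coords, periodic
```

The periodic flag is what would decide whether to convert to .xyz for the run or to keep .coord.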

No, not yet. If I have Orbitals as a separate command script, can it pass back the .molden file with "append": true and have any previously computed properties e.g. frequencies be retained and not deleted?

ghutchis · November 25, 2023, 2:19am

Good question. Not 100% sure … “append” probably still triggers an indication that the atoms / bonds changed, which is why I was suggesting the separate “readProperties” key.

matterhorn103 · November 25, 2023, 11:36am

Even this very simple command spawns two message boxes, by the way.
debug.py.txt (1.4 KB)

ghutchis · November 25, 2023, 5:10pm

It looks like an empty dialog shows up even if there aren’t any options. Then the debug / error message shows up.

I’ll go fix that. The code to detect that use case clearly isn’t good enough.

ghutchis · November 27, 2023, 3:30am

Okay, so the “append” part doesn’t work because it handles atoms and bonds. This part adds support for “readProperties” which just copies properties, charges, basis set, vibrations, etc. and leaves everything else alone.
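
From the script side, a minimal sketch of the return object, following the same pattern as the other formats (the format name as the key for the file contents). The function name is hypothetical and the Molden text would come from whatever file xtb wrote.

```python
import json


def properties_from_molden(molden_text):
    """Wrap Molden text so that only its properties are read, not its atoms."""
    return {
        "moleculeFormat": "molden",
        "molden": molden_text,
        "readProperties": True,  # key name as described in this thread
    }


# A command script would print the serialized object to stdout.
example_text = json.dumps(properties_from_molden("[Molden Format]\n"))
```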

It doesn’t yet work for these use cases because the surfaces and vibrations extensions don’t pay attention to the molecule changes. I’ll take care of that tomorrow - have a few other things to do tonight.

github.com/OpenChemistry/avogadrolibs

Allow scripts to add properties (orbitals, vibrations, cubes)

OpenChemistry:master ← ghutchis:read-script-properties

opened 03:20AM - 27 Nov 23 UTC

ghutchis

+147 -25


matterhorn103 · November 27, 2023, 9:19am

Nice, great.

Will readProperties replace the current set of properties with the new ones or just add the differences? I.e. would an extant set of vibrations be deleted by passing only orbitals?

ghutchis · November 27, 2023, 1:38pm

At the moment, it replaces the properties. Safely merging properties is a bit trickier.

In the meantime, I can verify that your Frequencies and Orbitals scripts do work with the patch.

matterhorn103 · November 29, 2023, 12:33am

Would support for QLabels in the script interface specification via userOptions be possible? The user is basically left to themselves to set up calculations other than the preset Optimize, Orbitals etc., and is required to provide the terminal command they want executed, so I’d like to refer them to the xtb docs from the "Run..." window. I have already got an xtb Help option in the menu, but it wouldn’t hurt to have a hint exactly when the user is most likely to need it.

Maybe an alternative would be a QButton that can be set to run another command script like docs.py, which is what is run by clicking xtb Help, but that seems unnecessarily complex.

By the way, passing a molden file with "readProperties": True doesn’t seem to remove previously calculated frequencies after all. They remain available in the interface and saving the molecule and opening the cjson shows both MOs and frequencies present.

ghutchis · November 29, 2023, 1:19am

Yes. I fixed that before merging the PR.

Maybe? I mean, I can see the use case, although toolTips are also supported. I guess you’d want the text to span both rows of the form? Or more like:

instructions: Type a command for xtb .. see https://xtb-docs.readthedocs.io/ for more

I mean, from my perspective, you want to cover most use cases with separate scripts. It’s the 80/20 or 90/10 rule … most usage will be “optimize this,” “give me some quick orbitals,” and “show me the vibrations.” Maybe add in “can I see a quick MD run?”

Adding a generic “run” command is somewhat useful, but more likely for people who already know what they want to do.

One other possibility would be an editable combo menu (e.g., "type": "stringList") that offers a few examples, but allows free-form text entry?
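
Something along these lines, as a sketch only: the option name, the example commands and the input.xyz filename are placeholders, the "values", "default" and "toolTip" keys are assumed to work here as they do for other option types, and a stringList that accepts free-form text is the hypothetical part.

```python
import json


def get_options():
    """Options a generic "Run..." script might report back to Avogadro."""
    return {
        "userOptions": {
            "Command": {
                "type": "stringList",
                # Placeholder examples; input.xyz is a hypothetical filename.
                "values": [
                    "xtb input.xyz --opt",
                    "xtb input.xyz --hess",
                    "xtb input.xyz --md",
                ],
                "default": 0,  # index into "values"
                "toolTip": "Type a command for xtb; see "
                           "https://xtb-docs.readthedocs.io/ for more",
            }
        }
    }


options_text = json.dumps(get_options())
```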