Hi @ghutchis,
I’ve been working fairly intensely on a complete overhaul of the avo_xtb plugin and will be releasing a new version in the coming days. As part of this I have been immersed again in the plugin API as it stands.
I’d like to make some proposals:
As a first small change, could it be possible to request for an arbitrary number of different formats, all of which will be passed in the json under their respective keys?
One way would just be to allow inputMoleculeFormat
to be either a single string or a list of strings. Having two valid types might be something you find undesirable though I guess? Only allowing a list would break existing scripts, so if you aren’t a fan of that, I’d suggest a new field in the options print-out; it could be called inputMoleculeFormats
plural, or, to be more distinct, something like requestedFormats
, i.e.
if args.print_options:
options = {"requestedFormats": ["xyz", "mol"]}
print(json.dumps(options))
# or, if you don't hate it:
if args.print_options:
options = {"inputMoleculeFormat": ["xyz", "mol"]}
print(json.dumps(options))
gets the plugin
{
"charge": 0,
"cjson": {
"atoms": {
"coords": {
"3d": [
-3.7502126693725586,
2.830498456954956,
0,
-4.15645283088028,
3.491789003415348,
-0.5478719938913942,
-4.068236302382353,
2.918291411901762,
0.8908722373377651
]
},
"elements": {
"number": [
8,
1,
1
]
},
"formalCharges": [
0,
0,
0
],
"layer": [
0,
0,
0
]
},
"bonds": {
"connections": {
"index": [
0,
1,
0,
2
]
},
"order": [
1,
1
]
},
"chemicalJson": 1,
"layer": {
"enable": {
"Ball and Stick": [
true
],
"Cartoons": [
true
],
"Close Contacts": [
false
],
"Crystal Lattice": [
false
],
"Force": [
false
],
"Labels": [
true
],
"Licorice": [
false
],
"Meshes": [
true
],
"Non-Covalent": [
false
],
"Van der Waals": [
false
],
"Wireframe": [
false
],
"undef": [
false
]
},
"locked": [
false
],
"settings": {
"Ball and Stick": [
"true true 0.300000 0.100000"
],
"Cartoons": [
"false false false false false true false"
],
"Labels": [
"2 0 0.5 255 255 255"
]
},
"visible": [
true
]
},
"properties": {
"totalCharge": 0,
"totalSpinMultiplicity": 1
}
},
"mol": "\n Avogadro\n\n 3 2 0 0 0 0 0 0 0 0999 V2000\n -3.7502 2.8305 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0\n -4.1565 3.4918 -0.5479 H 0 0 0 0 0 0 0 0 0 0 0 0\n -4.0682 2.9183 0.8909 H 0 0 0 0 0 0 0 0 0 0 0 0\n 1 2 1 0 0 0 0\n 1 3 1 0 0 0 0\nM END\n",
"selectedAtoms": [],
"spin": 1,
"xyz": "3\nXYZ file generated by Avogadro.\nO -3.75021 2.83050 0.00000\nH -4.15645 3.49179 -0.54787\nH -4.06824 2.91829 0.89087\n"
}
for water.
Sure, needing more than two formats is probably not all that common, but I don’t see a disadvantage in this approach, as it allows maximum flexibility and means that plugin authors can benefit from Avogadro’s excellent I/O capabilities without having to convert things themselves.
My second request would be similar, but for the json passed back to Avogadro by plugins.
If anything is passed back to Avogadro it replaces the molecule’s current state wholesale, so only data that the plugin passes back will be retained. If the plugin author has made changes to the bonds, for example, they can’t just pass back a CJSON with bond data – they need to pass back the original atom data too. As a general rule, therefore, my plugin does a calculation on the provided geometry, then takes the input CJSON, takes out values that are no longer physical, supplements it with the new data, then passes the whole thing back, to ensure nothing gets lost.
Having to combine some new data in e.g. molden format with the original CJSON in order to pass back everything in a single CJSON is tedious for the plugin to have to handle. Or at least, much more tedious than it is for Avogadro with its sophisticated IO.
When I raised this you addressed this by adding a readProperties
flag, so that when the following is passed back:
output = {
"readProperties": True,
"moleculeFormat": "molden",
"molden": result_molden_string,
}
Avogadro keeps the original atoms and bonds as they are and only reads the various properties. The way this is currently implemented it seems “properties” actually doesn’t mean some_cjson["properties"]
, it seems to just mean anything in some_cjson
that isn’t in the atoms
or bonds
objects, so it includes frequencies, orbitals, and so on).
Unfortunately this only partially achieves the desired outcome; the atoms and bonds are maintained but any other data that was in the file before will be lost, and only that “property” data contained in the molden file will be available in Avogadro.
Can we please have it so that when the readProperties
flag is found, the molecule state in Avogadro is only updated with the new data, and old data that doesn’t have new values is left alone rather than deleted?
Can we please also make readProperties
take a string like moleculeFormat
does, so that Avogadro can take the data given as moleculeFormat
as the base and supplement it with the data given as readProperties
?
To show what I mean:
output = {
"readProperties": "molden",
"moleculeFormat": "cjson",
"molden": result_molden_string,
"cjson": result_cjson,
}
or if we’d like a change of syntax for compatibility reasons
output = {
"updateFrom": "molden",
"moleculeFormat": "cjson",
"molden": result_molden_string,
"cjson": result_cjson,
}
In this situation, I’d like Avogadro to:
- Load the CJSON, just like it would if solely a CJSON was passed back
- Convert the Molden-format string to CJSON using Avogadro’s usual IO functions
- Add only those parts of the Molden “properties” CJSON to the main CJSON that aren’t already present in the main CJSON
- Pull the main CJSON into memory
This would be a great help to plugin authors because it is far easier for them to take stuff out of a CJSON (so that it’ll be overwritten) than add it from another file format.