Full and privileged xtb integration?

ghutchis · December 14, 2023, 3:12pm

Well, it’s obviously less space to include just the binaries. The discussion wasn’t so much about Python and more that conda and other package managers often bring in 10 things when you just want one package.

(For example, I recently installed a Python package through conda, which evidently pulled in an extra copy of Qt … even though I already have one.)

I just want to be careful on how big the packages get.

matterhorn103 · December 14, 2023, 3:18pm

Sure but not by much… xtb in xtb/bin/ seems to be 53 MiB, and the 28 MiB I mentioned for crest was the binary!

Don’t know how big the Windows or macOS executables of Avogadro are but that comes to like half the size of the AppImage.

I wasn’t trying to make any point, just checking you know how big xtb is, as I was quite surprised how big it was myself.

matterhorn103 · February 12, 2024, 9:20pm

I’ve been using the obabel binary that ships with Avogadro to convert calculation results back to cjson, which was working quite well.

Of course in some ways it would be preferable to pass back the raw output files and let Avogadro + Open Babel handle the conversion, but this approach gives me more flexibility and lets me format and combine various bits of data from multiple files before passing it all back.
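A minimal sketch of the conversion step described above, assuming the path to an obabel build with cjson support is already known. The function names are made up for illustration and are not taken from avo_xtb; the only obabel feature relied on is the standard `-O` output-file option, with formats inferred from the file extensions.

```python
import json
import subprocess
from pathlib import Path
from typing import List


def build_obabel_command(obabel: Path, infile: Path, outfile: Path) -> List[str]:
    """Return the argument list for an obabel conversion (hypothetical helper)."""
    # obabel infers the input and output formats from the file extensions;
    # -O names the output file.
    return [str(obabel), str(infile), "-O", str(outfile)]


def convert_to_cjson(obabel: Path, infile: Path) -> dict:
    """Convert a result file to cjson and load it as a dict (hypothetical helper)."""
    outfile = infile.with_suffix(".cjson")
    # check=True raises CalledProcessError if obabel exits with an error,
    # e.g. because this build does not know the cjson format.
    subprocess.run(
        build_obabel_command(obabel, infile, outfile),
        check=True,
        capture_output=True,
        text=True,
    )
    with open(outfile, encoding="utf-8") as f:
        return json.load(f)
```

Loading the result into a dict is what makes it possible to merge in data from other output files before handing everything back to Avogadro.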

I was getting the path to the program, i.e. ./prefix/bin/avogadro2, using Path.cwd(), which until recently was the initial working directory of the Python interpreter. Now, at least when Avogadro is started from the command line, it returns the cwd where the command was entered.

This is no doubt for a good reason, probably to do with the changes to the way Avo manages environments and stuff, and it’s fine.

However, it would be good if there were a way for avo_xtb to get a hold of Avogadro’s path. I can’t figure out any way to do it reliably without changing something in Avogadro. Could perhaps an environment variable be set pointing to Avogadro’s location?

ghutchis · February 13, 2024, 12:54am

Are you just saying that sometimes you started it from the build directory and now you’re starting it from somewhere else? When you start from the build directory, it’s usually in ./prefix/bin/avogadro2 on Linux platforms.

Sure. It’s possible to set an environment variable.

Just remember that some users will have a default build of obabel that doesn’t speak cjson because OB-3.2 hasn’t been released yet.

matterhorn103 · February 13, 2024, 7:54am

No, it used to be that regardless of how I started Avogadro or which location I started from, the scripts would be run from the Avogadro program directory and I could get that directory in Python using Path.cwd(). Whereas now, if I for example launch Avogadro from ~/Documents with:

user@pc:~/Documents $ ./Applications/avogadro2

then Path.cwd() will give me ~/Documents instead. At least that’s how it seems to be working.

It’s not the end of the world since it felt fairly hacky to have to rely on the cwd being a specific thing. An environment variable or something passed via the input json would be a much more robust solution.
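Roughly what that could look like on the plugin side, as a sketch only. The variable name `AVO_APP_DIR` and the json key `avogadroPath` are placeholders invented here; nothing in this thread fixes what Avogadro would actually provide.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Placeholder names, NOT real Avogadro identifiers.
ENV_VAR = "AVO_APP_DIR"
JSON_KEY = "avogadroPath"


def avogadro_dir(
    env: Mapping[str, str] = os.environ,
    options: Optional[dict] = None,
) -> Optional[Path]:
    """Return Avogadro's directory if the host program told us, else None."""
    # Prefer an environment variable injected by Avogadro.
    value = env.get(ENV_VAR)
    if value:
        return Path(value)
    # Otherwise look for a (hypothetical) key in the input json.
    if options and options.get(JSON_KEY):
        return Path(options[JSON_KEY])
    # Deliberately no Path.cwd() fallback: the cwd depends on how
    # Avogadro was launched, so guessing from it is unreliable.
    return None
```

Returning `None` instead of falling back to the cwd lets the caller decide whether to ask the user for a path.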

Noted, this is essentially just users getting Avogadro and therefore Open Babel from their distro repositories, right?

I’ll make a note in the readme that the latest obabel is necessary. I’ve made it possible to manually set the path to it as well.
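One possible lookup order, sketched with made-up names (this is not the actual avo_xtb code): a user-set path first, then the obabel shipped next to Avogadro, and only then whatever is on the PATH, since a distro build may not speak cjson.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_obabel(
    user_path: Optional[str] = None,
    avogadro_bin_dir: Optional[Path] = None,
) -> Optional[Path]:
    """Locate an obabel binary (hypothetical helper)."""
    # 1. A path set manually by the user always wins.
    if user_path and Path(user_path).is_file():
        return Path(user_path)
    # 2. The binary bundled alongside Avogadro, if that directory is known.
    if avogadro_bin_dir is not None:
        for name in ("obabel", "obabel.exe"):
            candidate = avogadro_bin_dir / name
            if candidate.is_file():
                return candidate
    # 3. Last resort: the system PATH (may be too old for cjson).
    found = shutil.which("obabel")
    return Path(found) if found else None
```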

ghutchis · February 13, 2024, 8:54pm

It’s probably easier to inject the environment variable. I’ll put together a pull request.

One other question … any interest in a crest plugin?

I’d think a few of the examples might be useful: Useful xtb / crest features

If you don’t have the time, I might put a few of them into place myself.

matterhorn103 · February 13, 2024, 9:37pm

Do you mean do I have interest in making one or has anyone shown interest?

I’ll make a start on one. I will make avo_crest a separate plugin to avo_xtb, mirroring the Grimme group’s recent approach, but it’ll integrate tightly with it and have it as a dependency.

You’ve added Python dependency management, right? Is there any way to make other plugins dependencies? It’s fine if not, I’ll just make a note in the readme that users need both.

matterhorn103 · February 14, 2024, 12:06pm

By the way, I tried to extend avo_xtb with scripts for the force field framework, and as in your previous experience, it is just unbearably slow when called directly from the command line. So GFN-FF will remain viable only via an API, as it is currently.

Now that the plugin is fairly stable, can I suggest that you/we take out GFN1-xTB and GFN2-xTB from the force field framework, and just implement GFN-FF?

Stops duplication of functionality then, and better to restrict the integrated methods to force field methods, no? Under Extensions > Calculate > Configure... the selection box is called “Force field”, after all.

The plugin offers more methods and options such as solvation, and it is much quicker and easier to add more to it in future than to the main program.

ghutchis · February 14, 2024, 3:36pm

I think you’re asking if we can remove the scripts (e.g., just leaving the gfnff.py script).

I’m leaning more towards removing all of the Python energy scripts and making them into separate downloads. For example: Cannot load script /usr/local/lib/avogadro2/scripts/energy/ani2x.py · Issue #1620 · OpenChemistry/avogadrolibs · GitHub … people take these messages as errors rather than debugging messages.

I’d probably consider leaving the GFN* methods available … people have asked even if they know they’re slow.

matterhorn103 · February 14, 2024, 4:14pm

This reminds me – what would really help me and others writing scripts would be if Python errors were passed on a bit more transparently to the user.

When an error occurs it does sometimes get printed to the console by Avogadro, but not always (I think only when it in turn causes an error in Avo?). And if it does get printed, the formatting gets messed up because all the escape characters are printed literally rather than interpreted, so newlines, for example, are not rendered as line breaks, which can make understanding the error harder – see e.g. here, where the error messages are each printed as one long string.

It would be nice if, when writing a script, we could handle and raise exceptions in the way we normally would when writing Python, and rely on them being relayed to the user.

For example if I add:

raise RuntimeError("example error")

at the top level of an installed command script, the script won’t run because the error will always be raised, right? If I run the script directly with the normal Python interpreter I see:

$ python3 config.py
Traceback (most recent call last):
  File "/home/matterhorn103/.local/share/OpenChemistry/Avogadro/commands/avo_xtb/config.py", line 44, in <module>
    raise RuntimeError("example error")
RuntimeError: example error

But when the script is run by Avogadro, no error message is printed, it just fails silently. This prevents us from using Python exceptions to communicate with the user.

So could the Python stderr stream simply be redirected verbatim to Avogadro’s stderr? Pretty please?
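On the script side, the pattern that such a redirect would make useful could look like the sketch below. The wrapper name is invented for illustration, and whether Avogadro then shows the output to the user is exactly the open question here.

```python
import sys
import traceback


def run_reporting_errors(main) -> int:
    """Run main(); on failure write the full traceback to stderr.

    Returns a process exit code (0 on success, 1 on error).
    """
    try:
        main()
    except Exception:
        # Same multi-line traceback the plain interpreter would print.
        traceback.print_exc(file=sys.stderr)
        sys.stderr.flush()
        return 1
    return 0
```

A script would end with something like `sys.exit(run_reporting_errors(main))`, so the non-zero exit code also signals the failure to the calling program.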

ghutchis · February 14, 2024, 9:20pm

It’s probably better if plugins don’t depend on each other … at least for now. I guess you want to minimize any duplicated code. For now, it’s probably better to copy a few things and we can figure out how to minimize this in the future.

Let’s think about how that would be handled. Seems potentially useful, yes.

Ineluki · February 16, 2024, 11:16am

Also, a way to write to the Avogadro2.log from the plugin would be nice for debugging purposes. One might want to communicate more than just errors.

matterhorn103 · February 16, 2024, 11:28am

This would be the advantage of simply redirecting the Python stderr stream as is to Avogadro’s stderr stream, and therefore to the log on Windows. While you wouldn’t want to overuse the possibility, there would then always be the option of writing whatever non-error message you like from Python using sys.stderr.write().
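A tiny sketch of such a helper, assuming the redirect existed. The function name and the `[avo_xtb]` prefix are only illustrative.

```python
import sys


def log(message: str) -> None:
    """Write a non-error diagnostic line to stderr (hypothetical helper)."""
    # The prefix makes plugin messages easy to pick out of a shared log.
    sys.stderr.write(f"[avo_xtb] {message}\n")
    # Flush so the line is not lost if the script exits abruptly.
    sys.stderr.flush()
```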