Plugin feature names and categories

Currently there are five types of plugin, which after much discussion are destined to become “feature” types (i.e. sorts of functionality) that a plugin may offer. Those five plugin/feature types/categories are:

| Name 1 (documentation) | Name 2 (UI) | Name 3 (some code) | Name 4 (folder, plugin index, some code) |
|---|---|---|---|
| Charges / Electrostatics | Charges | charges | charges |
| Energy / Force Fields | ? | energy | energy |
| File Formats | File Formats | formats | formatScripts |
| Input Generators | Input Generators | generators | inputGenerators |
| Command Scripts | Commands | commands | commands |

As can be seen, each different type is referred to in different ways in different places, and speaking from experience, this can cause confusion as to which one to use.

Relatedly, in a discussion with @brockdyer03 he opined that the current names are not very clear in indicating what they actually do, particularly commands. Calling them “scripts” is also no longer really valid in the new system, and the grammar/pluralization is also a little inconsistent.

So I’d like to start a discussion as to what (ideally unambiguous) names could be used for each feature type in the new plugin system. There will probably still need to be different names in user-facing situations (documentation, UI) than in code/metadata, but at least reducing the number to one of each and using them universally would be great IMO.


On a similar note, the new system makes it easy for a single plugin to offer several feature types, and the way plugins are invoked is now much more consistent between types. As such, it might well be worth considering splitting up some categories. Note that:

  • energy scripts provide not just the calculation of energies but also gradients – but often only one or the other
  • charges scripts provide not just the calculation of charges but also electrostatic potentials – but often only one or the other
  • formats scripts provide both read and write operations – but sometimes only one or the other
  • commands scripts do a huge variety of different things – in particular @brockdyer03 felt like this covers far too many functionalities with greatly differing effects, interfaces, output formats, and input and configuration requirements

At the moment, the categories correspond primarily to the different systems that they extend/fit into:

  • energy scripts are used with the force field framework for the Auto Optimize Tool, and are called many times in quick succession
  • charges scripts are only called to do a single calculation, to provide partial charges or electrostatic potential surfaces
  • formats add extra options to the normal file I/O (Import, Export)
  • generators work with the input file generation dialogs for immediate preview and feedback
  • commands add entries to the menus accessible from the top menu bar (whether that’s under File, Build, Extensions or whatever)

The slightly odd one out is charges because – and correct me if I’m wrong – the partial charge calculation is accessed via and used for atom labels, while the electrostatic potential calculation is accessed via the Create Surfaces dialog, right?

So though I don’t actually really mind the current situation all that much, I did think it was worth taking this opportunity to reconsider if the categories are set up in the ideal way.

Naturally, it’s important to consider that adding categories requires a fair bit of plumbing to be done on the backend. It’s also worth noting that it’s not imperative that it’s done now, since extra categories can be added at any time later as well. This is the best time to change any existing categories, though, should we want to.

What are people’s thoughts?

A couple of initial personal suggestions would be:

Re: names – favour longer, more enlightening names like inputGenerators. Perhaps fileFormats or formatParsers, and menuCommands, or something like that?

Re: categories – I think the main thing that I would propose would be to make a distinction between different commands where the sort of things that happen as a result are radically different and thus probably require slightly different handling. Off the top of my head I can think of e.g.:

  • commands that cause a molecule to undergo some transformation
    • return new geometries that replace the previous state
  • commands that calculate some property
    • don’t affect the geometry at all, just return data that needs to be added on top
  • commands that do something entirely independent of the current molecule or file’s state
    • Avogadro doesn’t really need to process any output at all
    • the xtb plugin has commands that install xtb, for example, or open the xtb docs
  • maybe some catch-all category for whatever doesn’t fit into the other flavours?

At the moment, commands have to return the whole CJSON they are provided with, otherwise the current molecule vanishes! This could be one way round that.

To be honest, I don’t really know whether a split like this would make sense, or whether any split would make sense at all. But keen to hear other people’s ideas!

Yeah, except there are a variety of models in which the electrostatic potential is from a set of atomic partial charges.

The reason for the distinction is that there are better electrostatic models that compute the potential in other ways (polarizable models, machine learning, quantum chemistry, etc.)

I certainly don’t mind discussing wording .. but what’s wrong with commands when they add menu commands?

I know ASE, which is pretty popular in the community, calls them calculators rather than energy, but I’d probably stick with the latter (e.g. energy plugins).

Sure. Because sometimes it’s easier to calculate the energy of the system rather than deriving analytic gradients. In general, most methods would provide both.
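To make the energy-only case concrete, here is a minimal sketch of how gradients can be recovered from an energy function by central finite differences. It is not how Avogadro or any particular plugin does it; the function names and the toy harmonic bond model are invented purely for illustration.

```python
import math


def harmonic_bond_energy(coords, k=1.0, r0=1.0):
    """Toy energy model (hypothetical): one harmonic bond between two atoms."""
    r = math.dist(coords[0], coords[1])
    return 0.5 * k * (r - r0) ** 2


def numerical_gradient(energy, coords, step=1e-5):
    """Central-difference gradient of energy(coords) for every atom and axis."""
    gradient = []
    for i in range(len(coords)):
        row = []
        for axis in range(3):
            plus = [list(atom) for atom in coords]
            minus = [list(atom) for atom in coords]
            plus[i][axis] += step
            minus[i][axis] -= step
            row.append((energy(plus) - energy(minus)) / (2 * step))
        gradient.append(row)
    return gradient


# Two atoms 1.5 apart along x, with an equilibrium distance of 1.0
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
gradient = numerical_gradient(harmonic_bond_energy, coords)
```

This costs two energy evaluations per coordinate, which is why a method that can supply analytic gradients is usually preferable for something called many times in quick succession.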

That’s not true. A script can return append, for example, and it wouldn’t be hard to do other things.

I’m not sure it makes sense to have such a fine-grained split of commands. There was some request for a new properties plugin, like calculating the logP with RDKit or similar fast descriptors. But I’d highly recommend that be a new type, e.g. “Property Calculator Scripts” (Issue #1439 in OpenChemistry/avogadrolibs on GitHub).

Since I was tagged in the original post vis-à-vis the label commands, I figure I should share my opinions.

Simply put, the scope of possibilities in a menu command is exceptionally large.

Off the top of my head, some things that could happen via a menu command (regardless of what the actual plugin type is, I am saying purely things that would show up in the menus and are clickable as commands) are

  1. Call xTB to run a calculation (SCF, Opt, Freq, Opt+Freq, Orbitals)
  2. Call CREST to run a calculation (Generate Conformers, Tautomerize, Protonate, Deprotonate, Solvate)
  3. Open a configure menu
  4. Open a folder
  5. Open a website
  6. Manipulate a unit cell (e.g. convert to primitive)
  7. Generate a non-diagonal supercell for numeric phonon calculations
  8. Generate an NEB start and end point
  9. Run a linear interpolation between NEB start and end points
  10. Run an NEB-IDPP calculation
  11. Manipulate bonds (e.g. make all H-C bonds 0.05 Å shorter)
  12. Rotate a molecule such that its principal inertial axes are aligned on the unit axes.
  13. Calculate the difference in the electron density from the superposition of atomic densities to the final SCF electron density.
  14. Group atoms by type
  15. Sort atoms by coordinates
  16. Rotate a geometry about a specific axis
  17. Reflect a geometry across a plane
  18. Perform numerical integration of a charge density to calculate the dipole moment
  19. Project a radial wavefunction into 3 dimensions and run a Fourier interpolation to generate a .cube file
  20. Integrate a charge density to calculate the 2nd order Makov-Payne correction

While I could likely go on, I think my point is made by now. The difference between something that groups atoms by their symbol and something that runs a Fourier interpolation on a 1D radial wavefunction projected into 3D to generate a .cube file to me feels like the difference between dilute hydrochloric acid and pure dimethylmercury. Sure, they’re both technically dangerous, but one requires a rinse in the sink and the other requires chelation.

I don’t think I have a fully developed replacement for commands, but perhaps a good starting point is to group up the commands into their categories. Here are a few examples from the list I gave.

  1. Calling an External QM/MM Program (xTB, CREST, etc.)
  2. Move a Geometry (rotate, reflect, inertialize)
  3. Add to a Geometry (generate NEB start/end points, run an image-dependent pair potential calculation to generate an initial image path)
  4. Calculate a Property (integrate charge density for 2nd order Makov-Payne, numerically integrate the dipole moment)
  5. Open an external application (open a file browser, open a website)

If nothing else (though there likely are other benefits) then the benefit here is scope reduction. I myself find it exceptionally difficult to know just what something can do as a command. Can I spawn parallel MPI processes? Can I call Python code that has Rust bindings? How do I communicate to Avogadro what I’ve done? How would I inform the user what is happening? How would I detect selected atoms if I need to?

Where do I look in the documentation to learn how to make <plugin type>?

Well-defined scope is easily one of the most important things I think of when I am writing a program, and currently the commands feature has the most nebulous scope of any of the features.

Sure. That’s the point of Avogadro. You have menu commands that recenter the camera, build a peptide chain (or polymer), optimize geometries, display point groups, generate conformers, etc. Should we limit the scope of menu items outside of scripts? I would say no.

Same thing with input generators… there’s a huge range between MOPAC and ORCA and LAMMPS.

The same thing is true in almost all editors. I can load an extension into VS Code that either formats some code, opens a website for tutorials, or runs a sophisticated LLM to find security bugs.

Photoshop (or GIMP) commands can crop a photo, invert colors, .. or perform sophisticated up-scaling, AI generation, or color correction programs.

That sounds more like “I need better documentation, tutorials, and example code.” (There are already some examples indicating how you get the list of selected atoms or layers.)

I’m not opposed to categories .. but I’m not really getting it.

The main thing I wouldn’t suggest for a command is something that’s likely to take a while because the user is going to wonder what’s happening. That’s why xtb is okay because it’s fairly fast. crest can be problematic, since it can definitely take hours for some tasks. So you could write a command that spawns MPI processes, but it doesn’t seem like a great idea.

Again, I can understand that you want examples of different kinds of plugins. I’ve been holding off on a long list because I didn’t want to write 20 and then refactor all of them for the new framework.

Let me be a bit clearer. I’m not opposed to having sub-categories of commands, but it’s adding something for the plugin author to do - classify their code (e.g., what if a logical action calls an ML model to calculate a property.. or they want to add a geometry and calculate a property of the revised structure, etc.)

And on the Avogadro C++ side, what would the categories do? Am I supposed to impose restrictions that a calculate-property script can’t run a program?

I’m happy to have the discussion, but it seems more that you want examples of a wider variety of plugins so if you have an idea (e.g., move a geometry, build a nanotube, calculate a property) you have a good starting point for coding. Is that correct?

I had floated the idea with @matterhorn103 many days ago, but it felt to me like there were really only two categories for the commands type.

This list of categories contains far more specificity than I had in my original set of categories. If I remember correctly I broke it down into two main parts. One was calculators and the other was transformers.

The idea is just that realistically there are two types of return that you could provide to the user, either some numbers or a different geometry.

Under the calculators/transformers categories, the classification for either of these cases is simple. What does it return? If the ML model calculates a property then it returns a set of numbers, it’s a calculator. If it has another setting where it can calculate a property and change a geometry, then that part is a transformer. It might return additional metadata in the CJSON of the geometry for the properties it calculates to determine that geometry, but it’s still changing the geometry.

There’s a benefit here on the C++ end too I think. If you have a plugin that is a calculator, then you don’t need to wipe the previous CJSON. If you have a plugin that is a transformer then you do need to wipe the previous CJSON.

It also makes it easier for plugin authors to decide what to return.

Instead of having a plugin author try to pick and choose between return types the calculators/transformers distinction means there are only two options. You’re either modifying data in-place with a calculator (e.g. appending/replacing a total energy), or you’re completely returning a new CJSON.

Additionally, within the calculators/transformers distinction, I think there’s an argument to add additional, optional plugin tags that are not used in the C++ of Avo2, but could be used in the documentation. This sort of tagging would make it far easier to document the various types of plugin, and additionally could in theory be used in the plugin download window to let users sort or search for a plugin type, importantly in a programmatic way.
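To illustrate the tagging idea, here is a rough sketch of how a download dialog or a docs generator could filter plugins on optional tags. Everything in it is hypothetical: the metadata layout, the `tags` key, and the plugin names are made up for the example and are not part of any existing plugin format.

```python
# Hypothetical plugin metadata, e.g. as it might look after parsing each
# plugin's metadata file. None of these names are real plugins.
plugins = [
    {"name": "plugin-a", "feature": "menu-commands", "tags": ["transformer", "crystals"]},
    {"name": "plugin-b", "feature": "menu-commands", "tags": ["calculator", "QM"]},
    {"name": "plugin-c", "feature": "input-generators"},  # tags are optional
]


def find_plugins(plugin_list, tag):
    """Return the names of plugins carrying the given tag (case-insensitive)."""
    wanted = tag.lower()
    return [
        plugin["name"]
        for plugin in plugin_list
        if wanted in (t.lower() for t in plugin.get("tags", []))
    ]


calculators = find_plugins(plugins, "calculator")
qm_plugins = find_plugins(plugins, "qm")
```

Because the tags are optional and ignored by the application itself, untagged plugins simply never match a search; they are not rejected.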

Fair enough, if you’re happy with the categories, that’s fine, I just thought it was worth raising. :slight_smile:

I would agree that it’s useful to have a category for commands that is powerful and flexible, because it enables a lot of possibilities, so I think the good option would be like what you said, adding further categories that are more specific. That can always be done later.

Given how powerful and flexible commands are, and the likelihood that their capabilities and scope will continue to increase, that means more and more options that a plugin author has to think about. So at some point adding an additional category or two, with a more defined purpose and less to think about or configure, actually makes things less complicated for authors rather than more. That would be the advantage that I’d see. For example, the append option is a subtlety of the category that the plugin author has to find rather than a difference made obvious by a different category.

To me, you could make the case for categorizing the plugins in a few different ways:

  1. By where they appear/what they add for the user within the GUI
  2. The actual type of process/calculation/algorithm that the plugin does in Python
  3. The way the Avogadro-plugin interface works

I find the current categories to be a slightly weird mix of all three, that’s all. If we say that it should be based on (1) then commands make sense, but charges really don’t. Brock is kind of advocating for (2), and that’s kind of how you were justifying charges and potentials being together, but I am not convinced that that is useful to either users or authors. (3) seems like a slightly weak basis for the categories but in any case the current categories don’t really fit that either.

To be honest though, my feelings aren’t strong on the categories, they do the job as they are. I’m more fussed about the naming.

“Commands” can be many different things, and a menu is not the only place that something called a command might be found. You could have a command running dialog, for example. So I just always found it quite ambiguous. To be honest though, I don’t find “command” so objectionable, I’d just say it could do with some qualification.

Looking at the Apple human interface guidelines as well as those of GNOME, they both favour the term “menu items”, which I think would be clearer to a user browsing in the plugin dialog as to what the plugin provides. KDE seem to use “menu action”.

So I’d argue for menu-commands or menu-items as a better name.

Hmm. Well, to my mind charges is definitely the category with the best case for being split up. After all, it is now very easy for a plugin to provide both. If you wanted to keep charges and potentials functionality within the same type then I’d maybe suggest electrostatic-models, or at least electrostatics?

calculators feels a bit too generic since it doesn’t encompass things that calculate e.g. frequencies. I honestly can’t think of anything better than energy, but it somehow doesn’t sit right, because it also does gradients. Force fields doesn’t work because not everything is a force field. Maybe calculation-methods or something? energy-calculators? Don’t know.

For formats and generators, I feel a long form name is better. input-generators already exists and is actually usually what they’re called, it just needs to be used universally.

How about:

| UI, documentation | C++ | Plugin (e.g. TOML) |
|---|---|---|
| Electrostatic Models | electrostaticModels | electrostatic-models |
| ? | ? | ? |
| File Formats | fileFormats | file-formats |
| Input Generators | inputGenerators | input-generators |
| Menu Commands | menuCommands | menu-commands |

There’s less ambiguity as to the function this way and no mismatch between the names between the UI, code, and docs, which really would make things easier for plugin authors.
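One nice side effect of a scheme like this is that the three spellings can be derived mechanically from a single canonical name, so they can never drift apart. A small sketch, assuming the kebab-case form is treated as canonical (the helper names are my own, not anything in Avogadro):

```python
def to_camel_case(kebab_name):
    """'input-generators' -> 'inputGenerators' (code-style name)."""
    head, *tail = kebab_name.split("-")
    return head + "".join(word.capitalize() for word in tail)


def to_display_name(kebab_name):
    """'input-generators' -> 'Input Generators' (UI/documentation name)."""
    return " ".join(word.capitalize() for word in kebab_name.split("-"))


feature_types = ["electrostatic-models", "file-formats", "input-generators", "menu-commands"]
names = {kebab: (to_display_name(kebab), to_camel_case(kebab)) for kebab in feature_types}
```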

Okay, that I understand better. Certainly returning properties / data vs. changing the current geometry is a useful classification. But many things don’t fit that classification. If I optimize the geometry and want to return the energy and other properties?

Yeah, but the current system lets you simply append, for example a Python package that generates nanotubes .. you just return new XYZ coordinates and ask to add them to the current system.
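As a rough sketch of what that looks like from the plugin side: the script builds an XYZ block for just the new atoms and asks for it to be appended. The key names used below (`moleculeFormat`, `xyz`, `append`) are written from memory of the command-script output format and should be treated as assumptions to check against the scripting docs; the helper function is invented for the example.

```python
import json


def build_append_response(symbols, coords, comment="generated fragment"):
    """Build a command result asking the host to append atoms given as XYZ.

    The output keys are assumptions, not a verified description of the API.
    """
    lines = [str(len(symbols)), comment]
    for symbol, (x, y, z) in zip(symbols, coords):
        lines.append(f"{symbol} {x:.4f} {y:.4f} {z:.4f}")
    return {
        "moleculeFormat": "xyz",
        "xyz": "\n".join(lines),
        "append": True,  # add to the current molecule instead of replacing it
    }


# Two carbon atoms standing in for the output of a nanotube generator
response = build_append_response(["C", "C"], [(0.0, 0.0, 0.0), (1.42, 0.0, 0.0)])
print(json.dumps(response, indent=2))
```

The plugin never has to parse or re-emit the existing molecule here; things like bond perception and selecting the new atoms are left to the application.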

Again, I think that’s mostly a case for better documentation and more examples. For example if the docs have more example use cases, then the append or other options become easier.

I think we can imagine adding tags or keywords to plugins could absolutely help in discovery. That sounds great (and doesn’t affect much in the way of implementation).

I certainly hope there will quickly be dozens and if you’ve used VS Code or something similar, searching is important. At one point, a friend-of-a-friend was working on a plugins.avogadro.cc website, but he disappeared to a better-paying corporate job. Something to consider if / when the new plugins start to take off.

Sure, no problem. It would certainly make things clearer if we have some other sort of “no menu, just calculate some properties on the fly” option in future versions.

If you want electrostatics or electrostaticModels that’s fine. I don’t mind more descriptive names - I just get tired of always typing inputGenerators which is why generators and formats have popped up in different places.

So then maybe energyModels? Even though ASE calculators can be fairly easily adapted, I’ve never liked that phrase.

(BTW, at some point, I’ll add in code to generate frequencies from the energy Hessians because it can be useful, but that’s a separate thing)

If you’re optimizing a geometry and returning the energy and other properties but not returning the optimized geometry? To me that seems like it should be forbidden. What would you do with the returned properties without the geometry they come from? They certainly shouldn’t be appended to the current molecule’s CJSON, and without doing that I think the only other option is to print them to the screen.

Granted, printing to the screen is something I think the two categories I proposed do not cover, but this feels like it may be a good case for a generic category, perhaps something that specifically does nothing to the current geometry? In this way there are three total expectations for the plugin interface:

  1. Append properties
  2. Completely replace the whole CJSON
  3. Do nothing to the molecule

I agree with @matterhorn103 about the discoverability aspect of this. Even with better documentation a plugin author may struggle to determine if they really want to append or if they want to replace the whole document. Additionally, if you’re doing something like generating a nanotube then the CJSON is still going to need to be reset. I also struggle to imagine a use case where you’d want to just plop a brand new structure into an existing document from a plugin. If it were me I’d greatly prefer to put the nanotube in a new document and then cut+paste it into my existing document.

That’s kinda my point. You’re separating transforming the current frame from returning properties. There are a lot of use cases where you want to change geometry and return properties. So why create an artificial distinction?

Huh? I’m not going to dictate someone’s workflow. If you want to create a new empty frame and cut and paste into your document, you can absolutely use that workflow. Not everyone wants to work that way.

Maybe the user is starting from an empty geometry?

As is, if you “append” the new atoms are selected and the manipulate tool is picked as if you copied and pasted.

Look, reasonable comparisons are Chimera X and PyMol. They let Python extensions do anything including popping up windows and PyQt interfaces.

Notice that they can:

  • add properties
  • change the molecule or build things
  • change the rendering
  • etc.

I completely understand where you’re coming from, but I don’t think imposing specific contracts for menuCommands is a great idea.

Everything else has a specific contract:

  • formats: read or write a file to a format that Avogadro understands
  • energy: given a molecule / system, QUICKLY get back energy and gradients
  • inputGenerators : pop up a dialog to help the user create input for other programs
    • implied to be long-running things, potentially on a remote cluster
  • electrostatics: given a molecule / system, return back the requested properties
    • somewhat specific given that you want this in different places

When we added commands the point was to reuse the dialog API from inputGenerators which was a good start for many simple dialogs, and let Avogadro reach out to other packages, generating nanotubes, ice unit cells, sugars, etc.

I don’t think they need a specific contract. I think we want a more flexible API to enable a wide range of actions.

I believe he meant: what if a plugin wants to change the geometry and return properties as well as the new geometry all in one go?

This ties in a little to something that I was trying to work out when writing the xtb plugin: what, if anything, remains valid in a CJSON if the geometry is changed? Off the top of my head, the only things are things like the name of the molecule. Every actual property I can think of is a function of the atomic positions. So if that’s true, then my thinking was basically that any hypothetical new sub-types of commands would essentially be a contract with the plugin such that it commits to one of three possibilities:

  1. the plugin indicates that it will change the geometry, in which case Avogadro knows to look for the new geometry data in the plugin’s output, and that it should discard all property data it already has, and the molecular data consists only of the data that the plugin returns

  2. the plugin commits to not changing the geometry, in which case Avogadro can retain the current geometry information and any molecular data it already has without concerns that it is invalid, it doesn’t need to bother checking for new geometry information, and any property data the plugin returns can be merged in (replacing previously held data where appropriate)

  3. the plugin indicates that it won’t return any molecular data at all, so Avogadro can skip all those steps and only needs to check the plugin’s output for things like the plugin’s config, or messages to the user; in fact, Avogadro probably doesn’t need to bother even passing the plugin the current molecule at all, which simplifies things and saves on the amount the plugin has to parse

So in answer to your question, to me it’s obvious that a contract/category that says that a plugin can only return a new geometry is entirely pointless, because if the geometry has changed then there’s no need to retain any properties, so you might as well make them fair game for the plugin.

Therefore to my mind in situation (1) the plugin would also be free to return any property data it likes – the new CJSON will just be a combination of everything the plugin returns. I guess basically that category would essentially be saying “this command takes the current molecule/file and returns a new one”, while (2) is saying “this command calculates or determines something for the current molecule/file”, and (3) is saying “this command does something for which the current molecule is entirely irrelevant”.
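To make the three possibilities concrete, here is a sketch of the host-side logic they would imply. The real implementation would live in the C++ code; this is Python only to keep it short. The contract names, the `GEOMETRY_KEYS` set, and the one-level merge rule are all my assumptions, not anything that exists in Avogadro.

```python
from copy import deepcopy
from enum import Enum


class Contract(Enum):
    REPLACES_MOLECULE = 1   # (1) plugin returns a whole new molecule
    ANNOTATES_MOLECULE = 2  # (2) plugin promises not to touch the geometry
    IGNORES_MOLECULE = 3    # (3) plugin returns no molecular data at all


# Assumed set of top-level CJSON keys that describe the geometry
GEOMETRY_KEYS = {"atoms", "bonds", "unitCell"}


def apply_result(current, returned, contract):
    """Return the molecule the host should hold after the plugin has run."""
    if contract is Contract.REPLACES_MOLECULE:
        # Old properties are invalid for the new geometry: keep nothing.
        return deepcopy(returned)
    if contract is Contract.ANNOTATES_MOLECULE:
        merged = deepcopy(current)
        for key, value in returned.items():
            if key in GEOMETRY_KEYS:
                continue  # geometry is guaranteed unchanged, so skip it
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key].update(deepcopy(value))  # new values win
            else:
                merged[key] = deepcopy(value)
        return merged
    return deepcopy(current)  # nothing molecular to process


current = {
    "name": "water",
    "atoms": {"elements": {"number": [8, 1, 1]}},
    "properties": {"totalCharge": 0, "totalEnergy": -1.0},
}
annotated = apply_result(
    current, {"properties": {"totalEnergy": -2.0}}, Contract.ANNOTATES_MOLECULE
)
replaced = apply_result(
    current, {"atoms": {"elements": {"number": [6]}}}, Contract.REPLACES_MOLECULE
)
untouched = apply_result(current, {}, Contract.IGNORES_MOLECULE)
```

The point is that each branch is short because the contract tells the host which assumptions are safe; the generic commands type would still need the fully general path.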

I am definitely in agreement with you that a generic, all-powerful category is extremely useful to have, and that it’d be a shame to limit that category. I just wonder if some other categories that reduce complexity for the C++ side (by allowing assumptions to be made about what the plugin needs or will return) and for the plugin author (by significantly reducing the number of configurable options vs the all-powerful category) might be useful.

In summary: I wouldn’t be in favour of getting rid of commands by splitting it up and restricting the resulting categories, but I would be in favour of the addition of a few related categories that are aimed at specific subsets of the all-powerful, generic commands type.

Very well put. This is exactly what I had in mind with my replies.

By the way, unless I’m misunderstanding you, I can immediately think of a few:

  1. A command that adds the amino acid residue of the user’s choice to the existing polypeptide/polymer
  2. A command that extends a polymer chain by a single repeating unit, determined by inspection of the existing polymer
  3. A command that does explicit solvation (like the xtb/crest plugin will)

In these situations the plugin would already need to have the original geometry to determine where to put the new information, so the plugin author wouldn’t need to do much, if any, extra work. They’re already writing the coordinates of the new part to CJSON, and since the original CJSON is almost certainly stored in Python as a dictionary, you just call dict.update() and throw in the data.
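One caveat worth spelling out: dict.update() replaces top-level keys wholesale, so calling it with a new "atoms" entry would overwrite the existing atoms rather than add to them. The nested arrays have to be extended explicitly. A minimal sketch, assuming the usual CJSON layout of atomic numbers under atoms.elements.number and a flat coordinate list under atoms.coords.3d (the helper name is my own):

```python
from copy import deepcopy


def append_atoms(cjson, atomic_numbers, coords):
    """Return a copy of cjson with extra atoms added to the existing ones."""
    result = deepcopy(cjson)
    atoms = result.setdefault("atoms", {})
    atoms.setdefault("elements", {}).setdefault("number", []).extend(atomic_numbers)
    # CJSON stores 3D coordinates as one flat list: x1, y1, z1, x2, y2, z2, ...
    flat = [component for xyz in coords for component in xyz]
    atoms.setdefault("coords", {}).setdefault("3d", []).extend(flat)
    return result


original = {
    "atoms": {
        "elements": {"number": [8, 1]},
        "coords": {"3d": [0.0, 0.0, 0.0, 0.96, 0.0, 0.0]},
    }
}
extended = append_atoms(original, [1], [(-0.24, 0.93, 0.0)])
```

This only handles elements and coordinates; bonds and any per-atom data already present for the old atoms would need the same treatment.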

Oh, and re: names…

Haha, I get that. I’m lucky enough to have learnt coding in a time where tab completion is ubiquitous and powerful, but every time I don’t have it available I end up cursing every name longer than three characters.

To be clear, while I said I’d like it to be uniform across everything and included camelCase versions of the names in the table under a “C++” heading, I’m not actually suggesting that the names should never be abbreviated within the C++ code. What I think is important is that any time a user or a plugin author sees a feature type name it follows the same scheme, so that they never have to work out what corresponds to what. In the C++ code it is therefore only really relevant when doing UI stuff, or in the plugin API, or when doing things like caching the plugin data.

OK, I like that. I think that and electrostaticModels really convey nicely what they provide to Avogadro: they don’t provide an interface to calculate energies or electrostatics, they provide a model which Avogadro itself makes use of in its own interfaces.

An idea that occurs to me, incidentally, is that it might be a good idea to raise awareness of what plugins can add to the Avogadro experience by putting a message or tooltip or something in the dialogs where the models are selected, that says “Get more options here by adding plugins”.

Oh I see, we were talking at cross-purposes. You don’t mean that you can’t see a use-case for adding atoms/geometry data to an existing file full stop, you mean that you can’t see a use-case for append in the context of geometry data because the plugin will generally need the current geometry anyway, so it might as well handle the appending/merging of the additions itself.

Why should the plugin author have to write that when Avogadro already has plenty of code for appending/merging additions?

I mean, plugins can but that shouldn’t mean they must do it.

For example, should the plugins have to implement bond perception too? Or can my prototypical nanotube tool just say “hey here are a bunch of carbon atoms in XYZ format, can you please handle it?”

And even if the plugin does do it, it might be better for Avogadro to know “here are these new things, please select them”

In the use case of adding solvent molecules, you might want them selected to put them in a layer or change the rendering options on them.

Yes, exactly. And while some force fields implement electrostatics, most want RESP or AM1-BCC or another electrostatic model. So they’re two different pieces.