More UI options for scripts (e.g., a WebMO plugin)

ghutchis · August 20, 2025, 9:53pm

I came back from the ACS meeting, which had a symposium on the 25th anniversary of WebMO. WebMO is used a lot for teaching (e.g., no need to install software, works on tablets, etc.).

It seemed like it might be useful to have better integration between Avogadro and WebMO, e.g.:

- log in to WebMO server (username / password)
- get current jobs
- download a job result

Now my main concern would be securely handling the username / password, but the other side of this is displaying a table of job results and/or filtering. (Tables aren’t currently part of the plugin GUI options.)

So it seems like to do this right the script plugins need a few more UI options

- need a password dialog option
- need a table view (e.g., tab-separated data to columns)
  - the table view probably needs an option to display a search / filter row

Both of these seem like they’d be useful for database scripts too (e.g., query a server, pick an option, download the data)

Thoughts?

ghutchis · September 2, 2025, 3:04pm

I want to bring this back up. Here’s an example of the HTML table for the WebMO demo server.

I can think of other plugins wanting to display some sort of table (e.g., properties or whatever).

So the plugin grabs the data from the server before popping up the Avogadro dialog. The user picks a row (calculation) and then it gets downloaded and opened.

Maybe @matterhorn103 has a thought here… I think the plugin should indicate headers as well as some data, e.g. tab-separated text that gets turned into the table. Should there be some option to specify the delimiter (e.g., some other character)?

ghutchis · September 2, 2025, 3:05pm

If you’re thinking this seems like a bunch of work for WebMO - it’s also useful for other kinds of “fetch something” scripts, like searching the Crystallography Open Database (COD).

matterhorn103 · September 2, 2025, 9:49pm

I take it the idea is that a plugin script would, on being run with the --print-options flag, return to Avogadro a JSON object that contains the tabular data? Avogadro would display that data to the user in a table when the command is clicked, the user would then select something from the table and press OK, and Avogadro would then run the plugin script while passing along the details of what was selected?

I’d have thought the best way to pass the tabular data would be in JSON format as well, in columnar fashion, with all data quoted as strings:

{
  "table": {
    "headers": [
      "Number",
      "Name",
      "Description",
      ...,
    ],
    "data": [
      ["1739327", "1739326", ...],
      ["CH2O", "[C2O2N3S4]+ 2", ...],
      ["Molecular Energy - Gaussian", "Geometry Optimization - XTB", ...],
      ...
    ]
  },
  ...
}

(If strings prevent the data from being formatted satisfactorily then we’d maybe want to add a list of datatypes; passing the data as native JSON types seems too likely to be altered over a round trip, given the multiple casts that’d occur?)
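For illustration, a plugin could assemble that columnar payload roughly as follows. This is only a sketch: the `to_columnar_table` helper and the `jobs` records are hypothetical, and the key names simply follow the JSON proposed above rather than any existing Avogadro API.

```python
import json

# Hypothetical job records, as a plugin might assemble them from a server reply.
jobs = [
    {"Number": 1739327, "Name": "CH2O",
     "Description": "Molecular Energy - Gaussian"},
    {"Number": 1739326, "Name": "[C2O2N3S4]+ 2",
     "Description": "Geometry Optimization - XTB"},
]


def to_columnar_table(records, headers):
    """Build the proposed {"table": ...} object with one list per column."""
    # Every value is converted to a string, as suggested for the interface.
    columns = [[str(record[h]) for record in records] for h in headers]
    return {"table": {"headers": list(headers), "data": columns}}


payload = to_columnar_table(jobs, ["Number", "Name", "Description"])
print(json.dumps(payload, indent=2))
```

Each inner list of `data` holds one whole column, in the same order as `headers`.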

When it comes to views, I would have thought that the plugin doesn’t need to know anything about any views or specify any view to show, as the plugin can restrict the data by only passing a subset if it so chooses (i.e. it can run queries/filters in advance). So views can just be handled exclusively within Avogadro, no?

Then wouldn’t we ideally avoid having to pick a full-blown DSL for instructions on how to manipulate the data and instead just enable a few specific details to be passed back and forth? In terms of what a plugin might want to receive back from Avogadro after the user input has been obtained, right now I can only think that the plugin would want to know which rows have been selected. Conceivably, it might alternatively want the user to select specific cells. But if there’s no need for more than that, I’m thinking that things could be kept simple by implementing something like the following:

The plugin could request a response at the same time as passing the data to Avogadro, like so:

{
  "table": {
    ...
  },
  "request":  {
    "selection": "row",
    "multiple": true,
    "columns": ["Number", "Date"]
  },
  ...
}

The plugin would indicate with columns the combination of values needed to uniquely identify a row and therefore what it needs the selection specification to include (in case the data doesn’t have a unique ID-like column).

When Avogadro receives the above request it would allow the user to select whole rows at a time (in whatever fashion, I don’t know what you’re envisaging – highlighting? checkboxes?) and would then tell the plugin which rows were selected. This would be as a list of rows, probably including the column names again just to allow the plugin to sanity check that the order hasn’t changed if it wishes:

{
  "response": {
    "headers": ["Number", "Date"],
    "rows": [
      {
        "data": ["1739326", "9/2/2025 14:55"],
        "selected": []
      },
      {
        "data": ["1739324", "9/2/2025 14:52"],
        "selected": []
      }
    ]
  },
  ...
}

rows would always be an array but would naturally only ever contain a single item if multiple were set to false. If row selection were requested then selected would always just be an empty array and the plugin would just ignore it, but if cell selection were requested then selected would contain a list of the columns which had been selected on that row. For example, if for whatever reason the plugin had actually wanted to allow selection of specific cells, Avogadro might return to the plugin:

{
  "response": {
    "headers": ["Number", "Date"],
    "rows": [
      {
        "data": ["1739326", "9/2/2025 14:55"],
        "selected": ["Name"]
      },
      {
        "data": ["1739324", "9/2/2025 14:52"],
        "selected": ["Name", "Description"]
      }
    ]
  },
  ...
}

I think with a result returned in that format, a plugin could fairly easily adapt it to whichever DSL it wants in order to perform its own operations on the original data.
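As a rough sketch of that adaptation, the plugin could map the identifying values in the response back onto its original columnar data. The `rows_from_response` helper is hypothetical, the job names and the date for job 1739327 are made-up placeholders, and the field names follow the proposal above rather than an implemented interface.

```python
def rows_from_response(table, response):
    """Return the full original rows (as dicts) identified by the response."""
    headers = table["headers"]
    columns = table["data"]
    # Positions of the identifying columns within the original table.
    key_columns = [headers.index(name) for name in response["headers"]]
    n_rows = len(columns[0]) if columns else 0
    # Map each row's identifying values to its position in the original data.
    lookup = {
        tuple(columns[c][r] for c in key_columns): r
        for r in range(n_rows)
    }
    selected = []
    for row in response["rows"]:
        r = lookup[tuple(row["data"])]  # raises KeyError for an unknown row
        selected.append({h: columns[c][r] for c, h in enumerate(headers)})
    return selected


table = {
    "headers": ["Number", "Name", "Date"],
    "data": [
        ["1739327", "1739326", "1739324"],
        ["job A", "job B", "job C"],  # placeholder names
        ["9/2/2025 14:58", "9/2/2025 14:55", "9/2/2025 14:52"],
    ],
}
response = {
    "headers": ["Number", "Date"],
    "rows": [
        {"data": ["1739326", "9/2/2025 14:55"], "selected": []},
        {"data": ["1739324", "9/2/2025 14:52"], "selected": []},
    ],
}

chosen = rows_from_response(table, response)
print([row["Name"] for row in chosen])
```

Because the lookup uses the values of the identifying columns, it does not matter how the rows were ordered when they were displayed.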

ghutchis · September 2, 2025, 11:59pm

I guess it depends on whether you want column-first or row-first, since the latter can be a bit easier to format in some cases (e.g., in this WebMO example, you get back a bunch of information on each job: ID, name, type, success, date / time, etc.). I guess that could easily be an option in the JSON, rather than Avogadro parsing the “default” text with delimiters.

I wanted to allow delimited text because right now the scheme for other pieces (e.g., combo menus, etc.) has a default. But this is also why I was looking for input on the concept while it’s a pull request.

The one catch with keeping the data as strings is that if you allow the table to be sorted, you can get weird results (vs. recognizing that some columns are numbers or dates).
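A small illustration of that sorting catch, using Python's built-in `sorted`. The values "99" and "1000" are made up for the example.

```python
job_numbers = ["1739327", "99", "1739326", "1000"]

as_text = sorted(job_numbers)              # compares character by character
as_numbers = sorted(job_numbers, key=int)  # compares the numeric values

print(as_text)     # ['1000', '1739326', '1739327', '99']
print(as_numbers)  # ['99', '1000', '1739326', '1739327']
```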

Probably highlighting / selection rather than checkboxes, but it’s not too hard to change the view on the Avogadro side. Similarly, there’s some potential to add a search bar (e.g., if the data table is really big, Avogadro will filter on text).

As the return, I was thinking of just returning row IDs, e.g. [0, 13, 15 ] or row / cell combinations if the plugin wants a full spreadsheet vs. row-only.

But yes, I think this covers a lot of obvious use-cases without a huge API.

Of course I can also imagine a two-dialog use case:

- plugin pops up a first dialog (e.g., search COD for specific elements)
  - this would also work for authentication cases
- second dialog comes up with the data table
- user selects data to download / filter / process, etc.

matterhorn103 · September 3, 2025, 12:25pm

If it’s easy to write it in such a way that you have the option, then being able to use either possibility would be nice for sure. The logic behind columnar format is that it’s the way pandas and polars dataframes are structured, making column-wise export more ergonomic and more efficient, and it’s also how CJSON is structured. I feel like it’d also be easier to infer the type of each column when the data’s in columns.
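To make the row-first vs. column-first point concrete, here is a minimal stdlib-only sketch of converting row-first data to columns and then guessing a type per column. Both helpers (`to_columns`, `infer_column`) are hypothetical, and the int-then-float-then-string rule is just one possible heuristic, not something either side has agreed on.

```python
def to_columns(rows):
    """Transpose row-first data into one list per column."""
    # Note: zip() stops at the shortest row, so ragged rows get truncated.
    return [list(column) for column in zip(*rows)]


def infer_column(values):
    """Cast a whole column to int or float if every value parses."""
    for cast in (int, float):
        try:
            return [cast(value) for value in values]
        except ValueError:
            continue
    return list(values)  # fall back to plain strings


rows = [
    ["1739327", "CH2O", "Molecular Energy - Gaussian"],
    ["1739326", "[C2O2N3S4]+ 2", "Geometry Optimization - XTB"],
]

columns = to_columns(rows)
typed = [infer_column(column) for column in columns]
print(typed[0])  # job numbers as integers
print(typed[1])  # names stay as strings
```

With the data already in columns, the type check runs once per column instead of having to gather the n-th field from every row first.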

Yeah, agreed, the strings suggestion was purely for the Avogadro-plugin communication interface; I think it’d be pretty much essential to parse them into appropriate types for the table itself, for that exact reason.

That’d be really nice!

Sure, that would be a possibility. If Avogadro provides search and sort functionality it could be hard to reliably keep track of indices, though. It’s also a shame not to use an appropriate ID-like field to identify rows if there’s one present. It might be interesting to know that polars by design does not use indices and the user guide gives some insights as to why the use of indices can be bad. Plugins could be required to include an explicit unique index column in the data rather than relying on positions.
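A quick sketch of the index problem, assuming the view sorts rows by job number before the user picks one. The third row's name is a placeholder.

```python
# Rows in the order the plugin originally sent them.
rows = [
    ("1739327", "CH2O"),
    ("1739326", "[C2O2N3S4]+ 2"),
    ("1739324", "placeholder"),
]

# The table view sorts ascending by job number before displaying.
view = sorted(rows, key=lambda row: int(row[0]))

picked_position = 0                   # user picks the first displayed row
picked_id = view[picked_position][0]  # the ID-like value of that row

# A bare position points at a different row in the plugin's original order...
by_position = rows[picked_position]
# ...whereas an ID-like column identifies the row regardless of ordering.
by_id = next(row for row in rows if row[0] == picked_id)

print(by_position[0], by_id[0])
```

Returning positions would therefore only be safe if Avogadro mapped view positions back to the original order before replying.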

matterhorn103 · September 3, 2025, 1:18pm

Delimited text would be the most straightforward, I suppose?

For reference, the output formats pandas and polars have in common are CSV and JSON for plain text formats and Excel and Parquet for binary formats.

All four have their disadvantages. The nicest and fastest is Parquet, but that would mean adding dependencies.

The CSV support makes producing delimited text fairly easy, I guess, as the CSV methods of the dataframe libraries can be (ab)used to create and read it, e.g. by specifying tab as the delimiter and setting the line terminator explicitly. And actually even the stdlib csv module can do that.
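For example, with only the standard library, tab-delimited text can be written and read back like this. It is a sketch with sample rows taken from the table example earlier in the thread; `delimiter` and `lineterminator` are standard `csv` formatting parameters.

```python
import csv
import io

headers = ["Number", "Name", "Description"]
rows = [
    ["1739327", "CH2O", "Molecular Energy - Gaussian"],
    ["1739326", "[C2O2N3S4]+ 2", "Geometry Optimization - XTB"],
]

# Write tab-delimited text with "\n" as the line terminator.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(headers)
writer.writerows(rows)
text = buffer.getvalue()

# Read it back using the same delimiter.
parsed = list(csv.reader(io.StringIO(text), delimiter="\t"))
print(parsed[0])
```

One caveat: `csv.reader` always splits records on line breaks and does not honor a custom `lineterminator`, so a non-newline record separator would need its own parsing on the reading side.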

I can see the disadvantage of delimited text being that in tricky situations it might have to be prepared manually by iterating over the data, which’d be pretty inefficient for large datasets. Since control characters have to be escaped in JSON strings, there’d end up being a lot of backslash escapes going on as well.
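The escaping is easy to see with Python's `json` module. In this sketch the "table" key holding delimited text is only an assumed layout for the example.

```python
import json

delimited = "Number\tName\n1739327\tCH2O\n"

# Serializing replaces every tab and newline with a backslash escape.
encoded = json.dumps({"table": delimited})
print(encoded)

# Parsing restores the original control characters.
decoded = json.loads(encoded)["table"]

# A raw, unescaped tab inside a JSON string is rejected by the default parser.
try:
    json.loads('{"table": "Number\tName"}')
    raw_tab_accepted = True
except json.JSONDecodeError:
    raw_tab_accepted = False
print(raw_tab_accepted)
```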

The advantage of delimited text is that you could have quite a strict specification IMO, so I would go in the opposite direction to what you suggest and require a specific delimiter I think; ideally we’d want one of the ASCII delimiter control characters.
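As a sketch of what a strict specification could look like, the ASCII unit separator (0x1F) and record separator (0x1E) can delimit fields and rows, so that tabs, commas and newlines inside values need no quoting. The `encode_table` and `decode_table` helpers are hypothetical, and the sketch assumes the values themselves never contain those two characters.

```python
UNIT_SEPARATOR = "\x1f"    # ASCII US: separates fields within a row
RECORD_SEPARATOR = "\x1e"  # ASCII RS: separates rows


def encode_table(rows):
    """Join fields with US and rows with RS."""
    return RECORD_SEPARATOR.join(
        UNIT_SEPARATOR.join(fields) for fields in rows
    )


def decode_table(text):
    """Split on RS, then on US. Empty text decodes to one empty field."""
    return [record.split(UNIT_SEPARATOR)
            for record in text.split(RECORD_SEPARATOR)]


rows = [
    ["Number", "Name", "Description"],
    ["1739327", "CH2O", "has, a comma\tand a tab"],
    ["1739326", "[C2O2N3S4]+ 2", "two\nlines"],
]

encoded = encode_table(rows)
print(decode_table(encoded) == rows)
```

These characters would still be written as `\u001f` and `\u001e` escapes if the text were embedded in a JSON string.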

CSV more generally does pose various challenges, one of which is that many languages/locales use the comma for the decimal separator rather than the period.