In my group I just experienced an example of how the current way Avogadro handles indexing can lead to significant confusion and potentially to incorrect scientific conclusions. In this case it might have even influenced a publication we have in preparation…
Indeed, I was about to open an issue about Avogadro rendering orbitals from cube files differently to orbitals from ORCA output files, until I realised what the true source of the discrepancy was.
Essentially the specific problem arises because Avogadro is set up to do all indexing starting at 1, while ORCA does all indexing starting at 0.
I can also report that it is not obvious to other users that the difference between “Unique ID” and “Index” is the indexing scheme used: my colleague tried to switch the atom labels to 0-indexing but didn’t discover that “Unique ID” was the way to do it.
But even when one knows that, it only helps for the labels, while the 0-indexing approach used by ORCA applies not just to atom numbers, but also vibrational modes and orbitals.
At the moment, Avogadro reads everything from ORCA output files correctly (it doesn’t fail to read atom 0, for example) but then renumbers it, i.e. it displays a different index for each atom/vibration/orbital from the one in the output file, which is also the one needed when giving ORCA further instructions. This makes it difficult to work out the correct orbital to render for an image, which atoms to constrain, and so on.
A user who isn’t aware of the indexing difference can load an ORCA file in Avogadro, look for a specific atom or orbital by index, and identify the completely wrong one. I would expect that the overlap between ORCA users and Avogadro users is, due to their respective license conditions, fairly large, so I see this as a big problem.
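To make the failure mode concrete, here is a toy Python sketch. The atom list is made up for illustration and is not taken from a real calculation; it only shows how the same number points at different atoms under the two schemes.

```python
# Toy data (hypothetical): atoms in the order they appear in the ORCA output.
atoms = ["C", "O", "H", "H"]

# ORCA counts from 0, so the first atom in the file is atom 0.
orca_labels = {i: element for i, element in enumerate(atoms)}

# A viewer that counts from 1 shows the same atoms with every label shifted by one.
displayed_labels = {i + 1: element for i, element in enumerate(atoms)}

wanted = 1  # an index read from the ORCA output file

print(orca_labels[wanted])       # the atom ORCA means: O
print(displayed_labels[wanted])  # the atom carrying that label on screen: C
```

The same shift applies to any list read from the file, so an orbital or vibrational mode looked up by its ORCA number would be off by one in the same way.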
Can we please discuss and come up with a broad plan on how to handle indexing better?
I would suggest three conditions that we need to fulfil in order to eliminate this confusion while also enabling people to work the way they want to work (i.e. some people’s general preference for 0- or 1-indexing):
- The current indexing scheme should apply to all indexing, not just to atom labels, for example;
- After loading a file into Avogadro, whatever index is used to refer to an atom/vibration/orbital within the file should be the one displayed to the user;
- The user should be able to switch between 0-indexing and 1-indexing at any time, the current indexing scheme should be made prominently visible to the user, and the option to switch should be easy to access.
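As a rough illustration of how these three conditions could fit together, here is a minimal Python sketch. It is not Avogadro's actual code or API: `IndexScheme`, `scheme_for_format` and the format table are hypothetical names of my own, and the only assumptions taken from above are that ORCA counts from 0 and that Avogadro currently counts from 1.

```python
from dataclasses import dataclass


@dataclass
class IndexScheme:
    """Hypothetical program-wide setting: labels start at 0 or at 1."""
    base: int = 1

    def __post_init__(self) -> None:
        if self.base not in (0, 1):
            raise ValueError("base must be 0 or 1")

    def display(self, position: int) -> int:
        """Turn a 0-based storage position into the label shown to the user."""
        return position + self.base

    def position(self, label: int) -> int:
        """Turn a user-facing label back into a 0-based storage position."""
        pos = label - self.base
        if pos < 0:
            raise ValueError(f"label {label} is not valid with base {self.base}")
        return pos


def scheme_for_format(file_format: str) -> IndexScheme:
    """Pick the base a file format uses; unknown formats fall back to 1.

    The table is illustrative only and lists just the one case discussed here.
    """
    native_base = {"orca": 0}
    return IndexScheme(native_base.get(file_format.lower(), 1))


# One scheme object is used for atoms, vibrations and orbitals alike (condition 1).
scheme = scheme_for_format("orca")  # matches the file on load (condition 2)
atom_label = scheme.display(2)      # third atom in the file
orbital_label = scheme.display(2)   # third orbital in the file

scheme.base = 1                     # the user switches scheme at any time (condition 3)
atom_label_after_switch = scheme.display(2)
```

The point of the design is that the data itself is never renumbered: only the label shown to the user changes, so switching schemes is cheap and reversible.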
Obviously, actually implementing this may be a fair bit of work, but agreeing on a big-picture approach to how it should be handled would be the first step towards it.
At the moment I see this as a significant practical barrier to the use of Avogadro, particularly with ORCA, so I definitely think we need an agreed plan for how it should look.