Question about how Avogadro reads in files

Hi,

I’ve been trying to add support for DL-POLY CONFIG and HISTORY files to Avogadro and am a bit confused by what happens when Avogadro tries to open a file.

While I was developing the parser I noticed that the ReadMolecule function was being called twice whenever I opened a file.

The parser was called once in avogadro/src/mainwindow.cpp, in MainWindow::loadFile, where there is the call:

    d->moleculeFile = MoleculeFile::readFile(fileName, formatType.trimmed(),
                                             options, false);

The readFile function creates a ReadFileThread, which creates an OBConversion object that calls the ReadMolecule function.

Further down in MainWindow::loadFile, there is a call to firstMolReady, which contains the call:

    OBMol *obMolecule = d->moleculeFile->OBMol();

and this call to OBMol() also ends up calling ReadMolecule.

Is that the correct behaviour? I’m working with fairly big files, so parsing the file twice is quite an overhead.

I’ve also discovered a related bug. If I try to open the attached XYZ file, Avogadro hangs, stuck in a polling loop. If I remove one atom (delete the last line and change the atom count on the first line to 441), it reads in fine.

Is this a quirk on my system, or a bug in the code?

Thanks,

Jens


Scanned by iCritical.

> Hi, I’ve been trying to add support for DL-POLY CONFIG and HISTORY files to Avogadro and am a bit confused by what happens when Avogadro tries to open a file.

Don’t you think that OpenBabel is a better place for implementing file format parsers?

> If I try and open the attached xyz file, it hangs, with avogadro stuck in a polling loop.

By default, a molecule is read from an XYZ file with bond perception enabled. Please use the import dialog to read the file (and disable the “perceive bonding” flag).
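
For context: an XYZ file stores only element symbols and coordinates, so connectivity has to be inferred from interatomic distances when a file is read. The block below is a minimal sketch of that idea, not Open Babel's implementation. The `Atom` struct, the `perceiveBonds` helper, the radii supplied by the caller, and the 0.45 angstrom tolerance are all assumptions made for illustration. Assigning bond orders is a separate, later step that is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical atom record: position plus a covalent radius, in angstroms.
struct Atom {
    double x, y, z;
    double covalentRadius;  // illustrative value supplied by the caller
};

// Returns index pairs (i, j) with i < j whose separation is below the sum of
// the two covalent radii plus a tolerance. This is a naive all-pairs scan,
// so it performs O(N^2) distance checks for N atoms.
std::vector<std::pair<std::size_t, std::size_t>>
perceiveBonds(const std::vector<Atom>& atoms, double tolerance = 0.45)
{
    std::vector<std::pair<std::size_t, std::size_t>> bonds;
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        for (std::size_t j = i + 1; j < atoms.size(); ++j) {
            const double dx = atoms[i].x - atoms[j].x;
            const double dy = atoms[i].y - atoms[j].y;
            const double dz = atoms[i].z - atoms[j].z;
            const double cutoff =
                atoms[i].covalentRadius + atoms[j].covalentRadius + tolerance;
            // Compare squared distances to avoid a square root per pair.
            if (dx * dx + dy * dy + dz * dz < cutoff * cutoff)
                bonds.emplace_back(i, j);
        }
    }
    return bonds;
}
```

Calling `perceiveBonds` on a carbon at the origin, a hydrogen 1.09 angstroms away, and a third atom 10 angstroms away would report only the carbon-hydrogen pair.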


Regards,
Konstantin

>> Hi, I’ve been trying to add support for DL-POLY CONFIG and HISTORY files to Avogadro and am a bit confused by what happens when Avogadro tries to open a file.

> Don’t you think that OpenBabel is a better place for implementing file format parsers?

Sorry, I probably wasn’t clear about that. I’m implementing the parser in Open Babel (src/formats/dlpolyformat.cpp), and it works fine there. The problem comes when I try to open the same files with Avogadro.

>> If I try and open the attached xyz file, it hangs, with avogadro stuck in a polling loop.

> By default, a molecule is read from an XYZ file with bond perception enabled. Please use the import dialog to read the file (and disable the “perceive bonding” flag).

That works fine, so it looks like the bug is somewhere in the chain of code it goes through when it tries to calculate the bonding.

Thanks,

Jens



> That works fine, so it looks like the bug is somewhere in the chain of code it goes through when it tries to calculate the bonding.

There is probably no bug (I’m not sure, though): the bonding algorithm is not optimal and scales horribly on large molecules.


Regards,
Konstantin

Hi,

On Mon, Sep 13, 2010 at 5:36 PM, jens.thomas@stfc.ac.uk wrote:

> Hi,
>
> I’ve been trying to add support for DL-POLY CONFIG and HISTORY files to Avogadro and am a bit confused by what happens when Avogadro tries to open a file.
>
> While I was developing the parser I noticed that the ReadMolecule function was being called twice whenever I opened a file.
>
> The parser was called once in avogadro/src/mainwindow.cpp, in MainWindow::loadFile, where there is the call:
>
>     d->moleculeFile = MoleculeFile::readFile(fileName, formatType.trimmed(),
>                                              options, false);
>
> The readFile function creates a ReadFileThread, which creates an OBConversion object that calls the ReadMolecule function.
>
> Further down in MainWindow::loadFile, there is a call to firstMolReady, which contains the call:
>
>     OBMol *obMolecule = d->moleculeFile->OBMol();
>
> and this call to OBMol() also ends up calling ReadMolecule.
>
> Is that the correct behaviour? I’m working with fairly big files, so parsing the file twice is quite an overhead.

Yes, this is the intended behaviour, but it was written for files
containing one or many small molecules. If you work with large single
molecules, it should be possible to modify the MoleculeFile class to
only read the file once. For protein PDB files we probably want the
same behaviour.
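
A minimal sketch of the read-once idea, assuming hypothetical names throughout: `ParsedMolecule`, `CachedMoleculeFile`, and the parser callback are illustrative stand-ins, not the actual MoleculeFile API. The wrapper runs the expensive parse on the first request and returns the cached result afterwards.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a parsed molecule; not an Avogadro or Open Babel type.
struct ParsedMolecule {
    std::string title;
    int atomCount = 0;
};

// Hypothetical wrapper: runs the (expensive) parser at most once and hands
// out the cached result on every later request.
class CachedMoleculeFile {
public:
    using Parser = std::function<ParsedMolecule(const std::string&)>;

    CachedMoleculeFile(std::string fileName, Parser parser)
        : m_fileName(std::move(fileName)), m_parser(std::move(parser)) {}

    // Returns the first molecule, parsing the file only on the first call.
    const ParsedMolecule& firstMolecule() {
        if (!m_cached) {
            m_cached = std::make_unique<ParsedMolecule>(m_parser(m_fileName));
            ++m_parseCount;
        }
        return *m_cached;
    }

    // Number of times the parser has actually been invoked.
    int parseCount() const { return m_parseCount; }

private:
    std::string m_fileName;
    Parser m_parser;
    std::unique_ptr<ParsedMolecule> m_cached;
    int m_parseCount = 0;
};
```

With this shape, a caller that asks for the first molecule twice (as loadFile and firstMolReady effectively do) would trigger only one parse. For multi-molecule files the cache would need to hold more than the first entry, which this sketch does not attempt.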

> I’ve also discovered a related bug. If I try to open the attached XYZ file, Avogadro hangs, stuck in a polling loop. If I remove one atom (delete the last line and change the atom count on the first line to 441), it reads in fine.
>
> Is this a quirk on my system, or a bug in the code?

I have to investigate this further, but this is an Open Babel issue.

    babel 442.xyz.gz -oxyz ---errorlevel 5

*** Open Babel Audit Log in Kekulize
Ran OpenBabel::Kekulize

*** Open Babel Audit Log in FindLSSR
Ran OpenBabel::FindLSSR

FindLSSR is not the problem but I think the Kekulize code takes
“forever”. This is not my area of expertise though.

Tim

> Thanks,
>
> Jens






Avogadro-devel mailing list
Avogadro-devel@lists.sourceforge.net

On Sep 13, 2010, at 12:01 PM, Konstantin Tokarev wrote:

>> That works fine, so it looks like the bug is somewhere in the chain of code it goes through when it tries to calculate the bonding.

> There is probably no bug (I’m not sure, though): the bonding algorithm is not optimal and scales horribly on large molecules.

That’s not quite right. The assignment of bond orders is difficult on systems with many fused rings (e.g., graphene, nanotubes). If that doesn’t describe your system, please send me the file or open a bug on the Open Babel tracker.
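
To illustrate why fused rings are awkward: placing alternating double bonds so that every ring atom gets exactly one is a perfect-matching problem on the ring-bond graph, and a naive search may have to try and undo many placements as rings are fused together. The block below is a minimal backtracking sketch under that framing, not Open Babel's Kekulize code. The helper names `countMatchings` and `countKekuleStructures` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Counts the ways to pair up all unmatched atoms along bonds (one double bond
// per atom) by plain backtracking. Hypothetical helper for illustration only.
long countMatchings(const std::vector<std::vector<int>>& neighbours,
                    std::vector<bool>& matched)
{
    // Find the first atom that still needs a double bond.
    std::size_t atom = 0;
    while (atom < matched.size() && matched[atom]) ++atom;
    if (atom == matched.size()) return 1;  // every atom is paired

    long total = 0;
    matched[atom] = true;
    for (int nbr : neighbours[atom]) {
        if (matched[nbr]) continue;
        matched[nbr] = true;    // try a double bond between atom and nbr
        total += countMatchings(neighbours, matched);
        matched[nbr] = false;   // undo and try the next neighbour
    }
    matched[atom] = false;
    return total;
}

// Builds an adjacency list from a bond list and counts the valid
// double-bond arrangements for the whole system.
long countKekuleStructures(int atomCount,
                           const std::vector<std::pair<int, int>>& bonds)
{
    std::vector<std::vector<int>> neighbours(atomCount);
    for (const auto& bond : bonds) {
        neighbours[bond.first].push_back(bond.second);
        neighbours[bond.second].push_back(bond.first);
    }
    std::vector<bool> matched(atomCount, false);
    return countMatchings(neighbours, matched);
}
```

For a single six-membered ring this finds the two familiar arrangements, and for two fused six-membered rings (a naphthalene skeleton) it finds three. Larger fused systems such as graphene sheets or nanotubes give a search of this kind far more branches to explore.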

Thanks,
-Geoff