Interactivity, Large Molecules and Rendering

Carsten,

Marcus tells me that you were asking around about performance and
ribbons.

I think the problem is not necessarily ribbons (although they probably
can use some performance work too). I think the problem is general
interactivity with rotation, etc. and large molecules.

For example, in many proteins, we’re talking about a few thousand
atoms and bonds.

One possible solution used by many molecular visualization tools is
the “quickdraw” option. Basically, whenever the user is doing a rotation
or translation (i.e., any time the mouse is down), rendering drops in
quality.

So here’s my suggestion:

  • Drop the global quality by 1-2 notches
  • Add a “quickdraw” flag to the render engines – i.e., the engines
    should turn off the dynamic culling, etc.
  • Save a display list over all engines (i.e., on mousePressEvent)
  • Only use the display list while the mouse is pressed
  • After a slight timeout and mouseRelease, re-render with normal
    settings.
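In code, the mouse-driven part of this could look something like the sketch below (all names are hypothetical stand-ins for the real GLWidget/engine plumbing; the compile step is where the glNewList work would go):

```cpp
#include <cassert>

// Hypothetical sketch of the quickdraw state machine described above.
class QuickdrawState {
public:
    void mousePressed()  { m_mouseDown = true;  m_listValid = false; }
    void mouseReleased() { m_mouseDown = false; }

    // Called once per frame by the widget's paint routine. Returns true
    // when the cached display list should be replayed instead of asking
    // every engine to re-render every primitive.
    bool useDisplayList() {
        if (!m_mouseDown)
            return false;          // normal path: full-quality render
        if (!m_listValid) {
            compileDisplayList();  // one low-quality pass, saved once
            m_listValid = true;
        }
        return true;               // replay the cached list
    }

    int compileCount() const { return m_compiles; }

private:
    void compileDisplayList() { ++m_compiles; }  // stand-in for glNewList()
    bool m_mouseDown = false;
    bool m_listValid = false;
    int  m_compiles  = 0;
};
```

The point being that the expensive per-atom, per-engine work happens once per drag, not once per frame.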

I think this would help a lot. Right now, we do a lot of work for each
atom and bond, for each engine. So on a big system, that’s a lot of
work for each frame.

Thoughts? Marcus, Donald, and Benoit? You three seem to be the most
involved with the rendering framework.

Cheers,
-Geoff

Not much to add; we just had a conversation about this on #avogadro with
Marcus, and you summed it up very well.

My initial objection was that one couldn’t make a display list of a scene that
uses a variable level of detail and then rotate it, since low-detail objects
could then get close to the camera. Marcus definitively solved this by
proposing that the display list use a constant low level of detail, i.e. that
we skip dynamic level-of-detail handling while rotating.

Another thing we discussed is that part of the computational overhead is time
spent in eigen, so moving to eigen2 (which should be possible around
February 2008) will help. For instance, defining NDEBUG in avogadro
currently gives +30% framerate, which indicates that a significant amount of
time is spent in eigen. I don’t know, however, how much time is spent in
eigen once NDEBUG is defined; it’s very hard to measure since eigen is a
template library.
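To make the NDEBUG point concrete, here is the kind of per-coefficient check involved (a simplified stand-in, not eigen’s actual code): with -DNDEBUG the assert expands to nothing, the branch disappears, and the accessor becomes a trivially inlinable load.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for a debug-checked coefficient access.
struct Vector3 {
    double v[3];
    double operator()(std::size_t i) const {
        assert(i < 3 && "index out of range");  // compiled out by -DNDEBUG
        return v[i];
    }
};
```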

Cheers,

Benoit

On Tuesday 04 December 2007 23:08:01 Geoffrey Hutchison wrote:



Avogadro-devel mailing list
Avogadro-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/avogadro-devel

Another thing we discussed is that part of the overhead of the
computations we do is time spent in eigen, so moving to eigen2 (which
should be possible around february 2008) will help.

Certainly some benchmarking and performance profiling is needed here.
I’m hoping to do some of that over the weekend.

Even as a template library, at least with debugging turned on, I could
find rough percentages of time spent in Eigen. Initial testing found a
lot of time spent in glPushAttrib/glPopAttrib.

But I’ll detail what I find this weekend.

Cheers,
-Geoff

On Tuesday 04 December 2007 23:31:54 you wrote:

Even as a template library, at least with debugging turned on, I could
find rough percentages of time spent in Eigen.

The problem is that such numbers are irrelevant, because Eigen is an extreme
example of debugging code harming performance. With debugging turned on,
every access to a coefficient in a vector is subject to check of the ranges,
which means a conditional jump, which in turn means that many functions don’t
get inlined (because the debugging code makes them too complex), etc… I
think eigen-with-debug is an order of magnitude slower than
eigen-without-debug.

For eigen2, I’m devising ways of limiting that trend, but that’s only
compensating for the fact that expression templates make this even worse.

So to get even a rough idea of the speed of eigen, you must define
NDEBUG. In eigen1, many functions are big enough not to get inlined, so
you should still be able to see the time spent in them.

If most of the eigen code really does get inlined (as will be the case with
eigen2), the easiest way to profile would be to isolate into separate
functions the portions of avogadro code that consist mostly of eigen calls.
These separate functions would then show up in the profiling results.
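That trick could be sketched as follows (the noinline attribute is GCC/Clang syntax, and the arithmetic here is a trivial stand-in for the eigen-heavy code):

```cpp
#include <cassert>

// Hoist the hot, mostly-inlined math into a named function marked
// noinline, so its cost shows up as its own line in profiler output
// instead of being smeared over the caller.
__attribute__((noinline))
static double rotateY(double y, double z, double c, double s) {
    return c * y - s * z;  // would be a block of eigen calls in avogadro
}

double sumRotated(int n) {
    double acc = 0.0;
    for (int i = 0; i < n; ++i)
        acc += rotateY(1.0, 0.0, 1.0, 0.0);  // profiler attributes this here
    return acc;
}
```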

Cheers,

Benoit



argh.

I’m sorry the list got my mail three times. KMail misled me into believing
the mail had been sent only to Geoff, so I retried and retried…

Well, the other problem is that we can’t naively make a display list
when the mouse is down unless the engines somehow say whether they are
going to modify the molecule. What we need to do is build the display
list while the mouse is down, but re-make it any time the molecule gets
updated (e.g. by the draw tool).
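As a sketch (hypothetical names; in avogadro this would presumably hang off the molecule’s update signal):

```cpp
#include <cassert>

// Cached display list that is invalidated whenever a primitive changes,
// e.g. when the draw tool adds an atom mid-drag.
class CachedScene {
public:
    void primitivesChanged() { m_dirty = true; }  // slot for molecule updates

    // Recompiles the list only when stale; returns the number of compile
    // passes so far (a stand-in for glNewList/glEndList + glCallList).
    int draw() {
        if (m_dirty) {
            ++m_compiles;     // glNewList() ... glEndList()
            m_dirty = false;
        }
        return m_compiles;    // glCallList()
    }

private:
    bool m_dirty = true;
    int  m_compiles = 0;
};
```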

The other option, which may be more complicated, is to leave it up to
the tools to draw whatever is new while the mouse is down. You could
build a single display list when the mouse gets pressed (a snapshot of
the original molecule), then have the draw tool render the new atoms and
bonds by calling something like GLWidget::render(Primitive *primitive),
but then each engine would need a function like that, and it gets sticky.

So my suggestion would be to do something about the global quality while
modifying, or something of that nature: just another slider that says
“Modification quality”, or something more intuitive.

Also having it be an “option” to enable / disable display lists during
navigation would be good. Enabled by default. Because I personally
think it’s cool to have dynamic quality as things move closer and
farther.

How much of a speedup do we get from actually making a master display
list? Did anyone mention this already? I may have missed it. I know
there was talk of profiling.


Donald


Yes, there are times we’re pushing all attributes on the GL stack when we
don’t need to be. The reason we ended up doing it was that the
attributes that were supposed to be pushed on the stack with
the more specific parameters were not in fact getting pushed, and there
were some rendering problems. So I changed it to
glPushAttrib(GL_ALL_ATTRIB_BITS).

Further testing of which GL_* we need to push might help this.

The other thing: the GLPainter class is initializing and deinitializing
the TextRenderer on every text call, because we don’t have a “state” in
GLPainter to correctly keep track of this. This was a problem when
things were being rendered outside of the GLPainter class:
GLPainter would get a call to render text and then initialize the
TextRenderer instance, but then something else would make a call to
OpenGL without using GLPainter and things would get funky. I can’t
remember the exact problem, but there is a bug report and I’ll try to
look it up later.
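The missing “state” could be as small as a flag, something like this sketch (a hypothetical simplification of GLPainter; TextRenderer::begin/end are the calls being saved):

```cpp
#include <cassert>

// Hypothetical simplification: enter text mode once per frame, not once
// per drawText() call.
class Painter {
public:
    void drawText(const char *) {
        if (!m_textActive) {
            ++m_begins;            // TextRenderer::begin(): push GL state once
            m_textActive = true;
        }
        // ... render the glyphs ...
    }
    void endFrame() {
        if (m_textActive)
            m_textActive = false;  // TextRenderer::end(): pop GL state once
    }
    int begins() const { return m_begins; }

private:
    bool m_textActive = false;
    int  m_begins = 0;
};
```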


Donald


On Dec 4, 2007, at 6:48 PM, Donald Ephraim Curtis wrote:

Well, the other problem is that we can’t naively make a display list
when the mouse is down unless we have the engines somehow say whether
they are going to be modifying the molecule.

Good point. But this just tells me that the tool itself should set the
"quickdraw" flag. So navigate, for example, sets the flag, then the
GLWidget compiles the display list, etc. The draw tool, for example,
would not set the flag. (I think most people’s complaint is with
rotating very large molecules.)

Also having it be an “option” to enable / disable display lists during
navigation would be good. Enabled by default. Because I personally
think it’s cool to have dynamic quality as things move closer and
farther.

Yes, it should be a setting. But seriously, try loading a big protein
and see what you think about interaction. Dynamic quality is cool, but
if it’s killing us on performance, we should think hard about it.

How much of a speedup do we get from actually making a master display
list? Did anyone mention this already?

No, the code hasn’t been written yet.

Cheers,
-Geoff

Well, one instance of the TextRenderer remains for each MainWindow
(GLWidgets that share a context share the same GLPainter), but the
TextRenderer::begin and TextRenderer::end functions are being called for
each call to Painter::drawText. So I don’t think the cache (display
lists of the glyphs, right?) is getting re-rendered; I think it’s just a
matter of the OpenGL “settings” being pushed and popped from the stack.

(Wed, Dec 05, 2007 at 07:07:56AM +0100) Benoît Jacob jacob@math.jussieu.fr:

On Wednesday 05 December 2007 00:53:28 you wrote:

The other thing. The GLPainter class is initializing and deinitializing
the TextRenderer every call for text because we don’t have a “state” in
GLPainter to correctly keep track of this.

Wow, I didn’t know this was the case. Then there really is a lot of room for
improvement! The TextRenderer is built around the assumption that each glyph
is rendered only once and then cached. It’s really important to keep the
TextRenderer’s cache alive!

The easiest way to tell is to add a qDebug() to CharRenderer::initialize().
Normally (if the cache is being used) this function should be called only
once per distinct character used.
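The behaviour being checked can be sketched like this (std::unordered_map standing in for the QHash, and a counter in place of the qDebug()):

```cpp
#include <cassert>
#include <unordered_map>

// Sketch of a glyph cache: the expensive initialize step should run
// once per distinct character, no matter how often it is drawn.
class GlyphCache {
public:
    void render(char c) {
        if (m_cache.find(c) == m_cache.end()) {
            ++m_initCalls;      // CharRenderer::initialize() equivalent
            m_cache[c] = true;  // would store the textures + display list
        }
        // ... replay the cached display list for c ...
    }
    int initCalls() const { return m_initCalls; }

private:
    std::unordered_map<char, bool> m_cache;
    int m_initCalls = 0;
};
```

If the counter rises once per call rather than once per distinct character, the cache is being bypassed.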

Cheers,

Benoit

I was also thinking about making a very simple force field with very
few parameters and a single term for non-bonded interactions, to let
the user run the AutoOpt tool on large molecules. The final energy
refinement could then be done with another force field and/or a
RotorSearch. It would allow a user with a fast computer to edit small
biomolecules, I think/hope…
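As a sketch of the idea: a single 12-6 Lennard-Jones term with one global (sigma, epsilon) pair for all atoms. The parameter values are placeholders, not real force-field parameters.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Atom { double x, y, z; };

// One-term non-bonded energy over all atom pairs (placeholder parameters).
double nonbondedEnergy(const std::vector<Atom> &atoms,
                       double sigma = 3.0, double epsilon = 0.1) {
    double e = 0.0;
    for (std::size_t i = 0; i < atoms.size(); ++i)
        for (std::size_t j = i + 1; j < atoms.size(); ++j) {
            double dx = atoms[i].x - atoms[j].x;
            double dy = atoms[i].y - atoms[j].y;
            double dz = atoms[i].z - atoms[j].z;
            double r  = std::sqrt(dx * dx + dy * dy + dz * dz);
            double s6 = std::pow(sigma / r, 6);
            e += 4.0 * epsilon * (s6 * s6 - s6);  // 12-6 Lennard-Jones
        }
    return e;
}
```

A real AutoOpt pass would also need gradients, and the O(n²) pair loop is where a distance cutoff would later go.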

Tim


On Dec 5, 2007, at 5:56 PM, Tim Vandermeersch wrote:

I was also thinking about making a very simple force field with very
few parameters and a single term for non-bonded interactions, to let
the user run the AutoOpt tool on large molecules; the final energy
refinement could be done with another force field,

I think one strategy might be to implement CHARMM or AMBER for
OBResidues. This would certainly solve “small biomolecules.”

My biggest concern about interactivity is in simple things like
rotating a protein. It’s simply too slow. So we definitely need to
come up with a “performance hit list,” since it seems we have several
areas in need of improvement. :-)

Cheers,
-Geoff

On Dec 6, 2007 1:50 AM, Geoffrey Hutchison geoff.hutchison@gmail.com wrote:


My biggest concern about interactivity is in simple things like
rotating a protein. It’s simply too slow. So we definitely need to
come up with a “performance hit list,” since it seems we have several
areas in need of improvement. :-)

Yes, but I don’t know much about rendering and so on, so I’ll leave
that part for other people. You all have done a great job so far I
think.

OK, as long as the TextRenderer remains alive, the cache remains too, so
there’s no problem.

The cache is a QHash of (QChar,CharRenderer) pairs, where a CharRenderer
is two textures and one display list.

Cheers,

Benoit


Sorry for the spam; just for those it might interest: I just completely
reworked the way eigen2 makes its asserts, so as to minimize the impact
of debugging code on performance, so that it should be possible to leave
debugging code in and still get decent performance.

The end result is that my benchmark runs in 5.0s with -DNDEBUG and in
7.5s without (which is still faster than Eigen1 with -DNDEBUG). So
leaving the asserts in only increases the execution time by 50%; I
think that’s reasonable.

Cheers,
Benoit



debugging code on performance, so that it should be possible to leave
debugging code and still get decent performance.

Well, the key point is that when we make binary releases, we really
should be compiling in Release mode and turning off debugging code
completely. On the Mac, I also turn on threaded rendering.

But I’m glad you brought this thread back. Last night I did some
benchmarking. So this is on my old G4 PowerBook. All debugging is
turned on. I’m using 1CR5.pdb (4555 atoms, 4389 bonds). The benchmark
is to turn on the auto-rotate tool and let it turn.

Wireframe-only + Debug engine:

Current SVN trunk: ~2.0 fps (default settings)
New wireframe code: ~12.0 fps

The new code is a start towards the “quick draw mode.” That is, the
wireengine will compile a new display list when the primitives change,
and call it at all other times. Of course at the moment, that means
the dynamic level of detail is broken. (It can come back when I push
in a re-render after a timeout.)

Today I’m going to try the same benchmark on different engines. Of
course I have a better graphics card at work, so the change is most
obvious here. You go from completely unusable interactivity to fairly
reasonable.

Cheers,
-Geoff