Velvet and the QML Scene Graph

Published Thursday December 2nd, 2010 | by

First of all, let me start with a bit of clarification. I was at DevDays this year and met a lot of people, and I came to understand that we had done a poor job naming the scene graph project. Because the names are similar, it gives the impression that it is comparable to projects like Open Scene Graph, which is not the case and was never the intention. Our scene graph is a small, compact 2D scene graph for rendering QML files. So, from now on, we will refer to it as the QML Scene Graph. Its sole purpose is to make QML better.

Moving on…

Animations should feel like velvet – silky smooth and pleasing. From the technical perspective this requires a few things:

  • Draw one frame for every frame the display can draw. Modern displays, such as the LCD or LED screens you are using to read this, are almost always clocked at 60Hz. It depends on the resolution (dpi) of the display of course, but some magic happens around 60Hz. If you update 2D graphics at 30Hz, you can really see every single frame as individual frames. As you get closer to 60Hz, the frames start to blend together and the eyes perceive fluid motion rather than frames and that is the key difference between velvet and sand paper.
  • Be done on time. To reach 60 Hz, you need to be done in no more than 16.66 ms. If you ever go over, you have missed your mark. That means that while doing animations, you cannot do anything other than update a few properties and then draw. If something takes time, it either needs to be ready beforehand, or you do the work in a background thread as described in my previous post. Until recently, I was convinced that missing the mark now and again would not be disastrous, but it really makes the difference between velvet and sand paper.
  • Do not draw in the middle of vertical refresh. That leads to Screen Tearing, basically chopping your nice frame in two and ruining what should be a moment of visual pleasure. The solution is to use some form of synchronization that ties you to the screen’s vertical refresh rate. Out of the box, Qt does not always help you here. On Mac OS X and on the Symbian N8, we are locked to vertical refresh and if you try to draw more often than that, the windowing system will block your rendering thread. On Linux / X11, Maemo and Windows, we are not locked and you can typically see tearing. Fortunately, there is a pretty simple solution. QGLWidget combined with QGLFormat::setSwapInterval set to 1, will enable the QGLWidget to be synchronized to vertical refresh. In the QML Scene Graph, we require OpenGL and we set the swap interval to 1 by default.
  • Advance the animation relative to the time between frames. If your object is moving at 1 pixel per millisecond, then it needs to move 16.66 pixels per frame. If you missed a frame, then it needs to move 33.33 pixels on the next one. If you always hit your 16.66 ms mark, then this problem goes away, of course.
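
To make the arithmetic in the last two points concrete, here is a minimal sketch. The function names are made up for illustration and are not part of Qt or the QML Scene Graph; the code only restates the numbers above (a 16.66 ms budget at 60 Hz, and distance scaled by elapsed time).

```cpp
#include <cmath>

// Frame budget in milliseconds for a given refresh rate (16.66 ms at 60 Hz).
double frameBudgetMs(double refreshHz)
{
    return 1000.0 / refreshHz;
}

// Hypothetical helper: distance an object travels during one frame,
// given its speed and the time that actually elapsed since the last frame.
double advanceDistance(double pixelsPerMs, double elapsedMs)
{
    return pixelsPerMs * elapsedMs;
}

// Number of refresh intervals a frame occupies if it took renderMs to produce.
// Anything above 1 means at least one deadline was missed.
int intervalsConsumed(double renderMs, double refreshHz)
{
    return static_cast<int>(std::ceil(renderMs / frameBudgetMs(refreshHz)));
}
```

With these, a frame that takes 17 ms at 60 Hz occupies two refresh intervals, so the next call to advanceDistance should be given roughly 33.33 ms rather than 16.66 ms.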

Yet with all these things in place, Qt cannot yet guarantee me velvet…

The missing ingredient is where the animation “tick” comes from. Qt’s animation framework uses a timer, which advances the animations and updates properties in all the objects. This sends off several update requests and eventually the frame gets repainted. There are two faults in this setup. Firstly the timer is started at 1000 / 60, which rounds down to 16, so it’s not 16.66 as it should have been. In addition, the timer is not accurate. It can fire at 15 or 17, and if it fires at 17, then the animation misses a frame.
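
The effect of the rounding can be illustrated with a small simulation. This is a simplified model I am assuming for illustration, not Qt's timer code: a timer fires at a fixed whole-millisecond interval, the display refreshes at exactly 60 Hz, and we count how many refresh intervals receive no tick or more than one tick. The struct and function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct TickStats {
    int emptyIntervals;   // refresh intervals in which the timer never fired
    int doubledIntervals; // refresh intervals in which the timer fired more than once
};

// Counts how timer ticks fall into the first refreshCount refresh intervals.
TickStats countTicksPerRefresh(int timerIntervalMs, int refreshHz, int refreshCount)
{
    std::vector<int> ticks(static_cast<std::size_t>(refreshCount), 0);
    for (long long k = 1; ; ++k) {
        // Tick k fires at k * timerIntervalMs. Integer math keeps the
        // mapping to refresh intervals exact.
        const long long interval = (k * timerIntervalMs * refreshHz) / 1000;
        if (interval >= refreshCount)
            break;
        ++ticks[static_cast<std::size_t>(interval)];
    }

    TickStats stats{0, 0};
    for (int n : ticks) {
        if (n == 0)
            ++stats.emptyIntervals;
        else if (n > 1)
            ++stats.doubledIntervals;
    }
    return stats;
}
```

In this model, over one second at 60 Hz, a 16 ms timer ticks twice inside two of the refresh intervals, and a 17 ms timer leaves two refresh intervals without any tick. Either way the animation steps and the displayed frames no longer line up one to one.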

This had been bothering me for a while, so while in San Francisco for DevDays, I had some “time off” to start digging. The result is that in the QML Scene Graph we drive the animations a bit differently. We do something along the lines of:

while (animationIsRunning) {
    processEvents();
    advanceAnimations();
    paintQMLScene();
    swapAndBlockForNextVSync();
}

The result is that we are always progressing the animation in sync with the vertical refresh, so once every 16.66 ms, and exactly once per frame. I said that I was initially not convinced that missing the occasional frame was that bad, but it took me ONE look at the result and I realized we finally had it. Velvet!

We cannot do this generally in Qt because the method we have for vertical synchronization is only through OpenGL’s swapBuffers(), so we can only tie it to one window. With Wayland or through custom OpenGL extensions, we can potentially get the vertical synchronization without going through swap, which means we could in theory advance animations across multiple windows, but that is out of scope for me right now. For now, it is fixed for that single window running the QML Scene Graph.

The repository is here: http://qt.gitorious.org/qt-labs/scene-graph

Posted in OpenGL, Painting, Performance

32 comments to Velvet and the QML Scene Graph

Arnaud Vrac says:

The change to the rendering loop is good because the assumption that the refresh rate is always 60Hz is false. For example on TVs the refresh rate is more likely to be 50Hz (at least in Europe). Is the rendering done in a separate thread in the QML scene graph ? With wayland waiting for the vsync is not blocking but it is not the case with opengl’s swapBuffers().

qtnext says:

Good news :) …. Qml rocks but for the moment we have choppy animation … not because it’s not efficient but just because the synchronisation is not good … see QTBUG-13600 “QML Animation not smooth” … for sure there is a big visual difference and user experience when you have a true 60 fps refresh … And I insist to be able to choose refresh rate … In Usa, Japan, it’s 60Hz per seconds for video, in Europe 50HZ !!! Imagine a media player application … the frequency of the video are synchronised to 50Hz .. so you need to have a qml synchronised to 50 instead of 60Hz

Tim says:

I’ve used loads of LCD monitors that are set to 75 Hz, so it is good that the method you used (which incidentally is exactly what all games do), should work for any refresh rate.

tassa says:

Actually on modern TV (LCD etc.) where Qt will be potentially used the refresh rate purely depends on the display mode. Thus you can choose between 1080p24, 1080p30 and 1080p60 and so on and with that tie yourself to 60 fps.

Nevertheless the new approach is MUCH better as it is really device independent and guarantees “velvet” instead of “sand paper” (which you get even with 60 Hz devices). Now if only we could see this in productive use (as in part of Qt 4.8) any time soon ;-)

Scorp1us says:

What is the status of qmlscene? Is it updated with bug fixes as of 4.7 or 4.7.1? Feature parity With qmlviewer?

I agree with everyone about making the vsync adjustable. Some TVs here (US) are 120, and even 240 Hz. Though they can emulate these by interpolating frames of a 60 Hz source. Seems a simple command line “-hz XXX” should be able to specify your refresh rate…

Note though the processing power required increases exponentially, since all active elements must be serviced between frames.

Juha Turunen says:

Good post. Thanks for writing it. I’ve been wondering about the systematic periodical dropped frames when running a mostly 60fps QML UI on N8. The reason must of course be the inaccurate timer and the accumulating error.

Dennis says:

@Gunnar:

There is still the issue of parts of the UI thread taking too long and missing the deadline; there are a lot of things you simply cannot do in a background thread, such as working with QPixmaps. As an example (shameless plug), I wrote a tutorial at Forum Nokia (http://wiki.forum.nokia.com/index.php/Custom_QML_Component:_Website_Thumbnails) where I would love to have the QML appear “velvet”, but as far as I know it simply cannot be done (or not be done simply) because the QWebView cannot be created in a background thread. It’s not near velvet even on a fast desktop while loading, so what chance does it have on an N8?

And suppose that with enough time and effort you could achieve it, it would probably complicate the code a lot to interleave running the eventloop with the CPU heavy parts. And, in case I did miss an obvious solutions to this particular example, there are surely similar examples.

It’s almost like there should be a dedicated display thread, with a separate processing thread which behaves much like the current eventloop, and where it doesn’t matter if QtWebkit spends 100ms in one go processing javascript, because your animations keep going anyway. Look at the complications the QML image background loader goes through now to avoid delaying the eventloop, and it still stutters when loading images from disk in a fast pathview.

Would love to hear more about this topic . Thanks for taking the time to write the scene graph articles!

No'am Rosenthal says:

@Dennis:
With WebKit the direction is WebKit2, which separates the UI from the web processing (JS, loading and the rest) to two processes. It’s not operational yet, but it would potentially solve this problem.

gunnar says:

Thanks everyone for good comments!

Regarding the actual display rate, I’m thinking that it will be pluggable. Right now I’m using the GL swapBuffer() call because it works pretty much everywhere, but I know that for wayland we will need to drive the synchronization based on some other source. So, keep the feedback coming, and let us know if what we have doesn’t fit your needs :)

@Scorp1us: The QML Scene Graph is still based on the early pre-4.7.0 version of QML, so there is still work to be done there. We hope to get started on that in the near future, but we also have some conceptual things we want to try out, so we’ll see which gets done first.

@Dennis: This problem would be solved if WebKit could render in a background thread, wouldn’t it? To my knowledge, the trunk version of WebKit allows that, so that should solve that problem. As for rendering to pixmaps, that is allowed on any platform where you run the raster graphics system or where pixmaps are backed by QImages, which is the case for the N8 :) . There is also the option of using a QImage as the render target instead of a QPixmap and you don’t need to care about the target platform. Combine this with QDeclarativeImageProvider and an asynchronous Image element in QML and it should be possible to get pretty far. So, the solution is a bit away, but we’re getting there :)

Mike Inman says:

Finally! Back in 1986 I was learning about the Vertical Blank Interrupt on my 8 bit PC. Since those days, the vertical blank has been an elusive beast, difficult to track and harder to hold on to. If Qt can manage to expose the VBI once and for all, my animations can finally return to the high quality they had over 20 years ago!

Kaitsu says:

Have I ever told you that I love… Velvet! =)

Gopala Krishna A says:

Great blog – very informative.

BTW George Costanza (Seinfeld character) loves Velvet :P

Dennis says:

@Gunnar, @No’am
That is great to hear. This should also enable a Symbian web browser that pans and zooms “velvet” :) (what happened to silk ?)

Looking forward to a future blog telling us more, or some experimental repository. And of course many Qt fans are very interested in Qt Quick on Symbian; the QML scenegraph sounds like it should enable very good performance (it runs great on desktop). And things like why OpenVG as the backend instead of OpenGL? How can outsiders help to improve Qt on Symbian? Etc :)

Thanks again!

2beers says:

@gunnar
“Our scene graph is a compact and small 2D scene graph for rendering QML files.”
There is a project for QML 3D objects. Aren’t that rendered using QML Scene Graph?

gunnar says:

@2beers: The Qt/3D rendering is not using the QML Scene Graph, no. It has its own scene graph which supports 3D. The rendering algorithm we use is very much tailored for 2.5D and abuses the Z-buffer in ways that would not support 3D. In theory, one could write a 3D renderer for our QML Scene Graph, and we have talked about it with the Qt/3D team, but we haven’t started down that path yet.

Cheung says:

Great post, very informative.

I am wondering: since Qt is event driven, why not make vsync a high-priority and unmaskable event? The while loop treating vsync like something alien confuses me a bit.

Thomas says:

@Cheung, I think the problem is that unlike consoles, PCs have traditionally been designed without a vsync interrupt. I’m pretty sure most graphics hardware actually supports it, but X11, etc don’t provide any API for the application to receive it. This means that everything drawn with normal X11 cannot be synchronized. OpenGL does provide a way to block until a vsync, which allows you to accomplish the same thing, but it won’t work with the rest of the window. The only way to fix is to fix X11.

As a side note, even if you get vsync working in your application, the window compositor can screw it up. For example, the KWin compositor’s VSync is broken – it will lock to the frame rate, but will draw in the wrong place and still tear, even if your application uses vsync.

Anyway, I’m happy that at least one part of Qt got this right. I do hope that eventually this will work for classic QWidgets too (with window system improvements), as animated themes suffer from the same problem. BTW, do OpenGL-backed QGraphicsViews support vsync?

gunnar says:

Thomas: You explain the problem exactly. OpenGL’s swapBuffer is the only thing we are guaranteed to have. As for your other questions: If you use a QGLWidget with setSwapInterval set to 1, then the QGraphicsView will not tear, but you still need to drive animations similar to the loop above to have the animations be 100% smooth.

Dennis says:

@Thomas, @Gunnar

But of course X11 is only part of the picture. How do things look on Symbian and MeeGo’s version of X11? I suppose you guys can’t really say anything about the extent of Qt’s planned role on Symbian (though I would love to discuss that topic further :) , but Qt Quick is absolutely vital to attract developers and enable beautiful mobile apps (and if it works as well as it does on the desktop, I don’t see why it wouldn’t), and of course having 50M projected units with virtually identical hardware/features and drivers encourages specific optimizations for these platforms. The 4.7.0 developer preview QML examples aren’t all that smooth yet on the N8, but bursting with potential.

StefanMohl says:

What you are doing here is classic real-time rendering with a hard time-limit (i.e. if you miss your deadline the rendering has failed, it won’t be smooth). Anyone who was around for the Commodore 64 or Amiga will recognize this stuff. On those old machines, you would actually render to the screen-buffer _as it was being drawn_. Since old CRT monitors render from top of screen to bottom, you could change the data in the screen buffer, as long as you kept your changes ahead of the reading point for the electron beam from the screen buffer. That way you didn’t need to double-buffer and saved precious memory (in effect you were using the phosphorous layer of the screen to retain the image while you did your rendering).

Nowadays, double buffering is cheap, but a missed deadline will still be obvious in the form of a “hiccup” in the rendering smoothness. Unfortunately it is really hard to find all cases when you miss timing. Checking if your code is fast enough cannot be done in a debugger! To aid debugging, one thing you could consider adding (if you can find a way of detecting it) is a hard fail on missed timing. Something that can be switched on for debug only. Perhaps some exception mechanism that is called if you miss your deadline so that you can print out some relevant variable values in your exception handler and figure out why you are too slow from time to time.

There is another big problem for smoothness, at least on the N900: Swap! It isn’t uncommon for my N900 to lock up completely for seconds at a time, and the small memory space (compared to a desktop) means that swap management becomes critical.

I actually think it would be better to turn off automatic swap altogether and replace it by some (mostly) manual mechanism. Perhaps some way of requiring users to select applications to “park” (i.e. manually push to swap) when more primary memory than available is needed. That app (all its processes) would then be fully suspended until the user manually wakes it up again (possibly requiring something else to be parked instead). Hmmm, there are lots of problems with that scheme (what happens when some daemon needs more memory in the middle of the night? Does the whole handset get stuck?). Anyhow, some much more direct manual control would really help, perhaps with some more traditional automatic system for back-up.

There is some real-time version of Linux, but I think it does away with swap altogether. Nevertheless, perfect real-time control is a requirement for silky-velvet smoothness!

MNaydenov says:

I am a bit worried about this being OpenGL exclusive.
Two things (still) keep me away of it
– it is slow to initialize, measuring 340ms on my machine to switch graphic view to QGLWidget, which is really noticeable at start up. (Observable on the QMLView with -opengl also)
– As of 4.7.1 big textures (images) still (silently) fail. There should be a tiling implementation, much like everyone else using OpenGL for drawing images (Clutter)

The other worry is the QML thing.
Though I kind of like it [QML] and use it here and there, nothing scares me more than the possibility that I will be *forced* to use it.
I pray *all* advancements in Qt will be available to C++ also. (Wherever technically possible)

Thanks

Dimitris says:

Excellent discussion! Indeed, silky smooth UI animation was among the deciding factors when I bought my iPhone 4. Maybe now the next one will be powered by Qt :)

Dimitris says:

1UP for Amiga and C64!

Cheung says:

Just saw this when I was checking out the new qscroller stuff:
http://doc.troll.no/master-snapshot/qanimationdriver.html
Very close to what we are discussing here. Is it used in QML Scene Graph?

przemo_li says:

Small Off-topic:

Can you make some nice tut (or doc section) for enabling Qt4.7 with OpenGL 3, 4?
(Yes I know that is some contradiction to idea of cross-platform but this topic is non-existing on net!!!)

And as for post. Renaming is really good move. And I’m in favor of QML and any improvements you can get for it!

gunnar says:

@Cheung: Yes, the QML Scene Graph drives animations based on vsync.

Juha Turunen says:

@gunnar: Will the QAnimationDriver related changes be in 4.7.2?

gunnar says:

@Juha: No, these are changes for 4.8 and the implementation is done in the QML scene graph.

Emilio Coronado says:

Amiga forever !!, that was the way to do proper graphics programming , straight exposure of the display HW to the apps :)

as far as i know not sure if Symbian Devices does have any functionality in the RDisplay API to expose the display Vsync interrupts, even i have some suspicious than even internally is not used at all for posting the buffers into the display memory as everything is handled automatically by the graphics hardware.

this lack of symbian windows server display drivers synchronizations, has been historically a pain in the ass for Symbian, and quite visible in a way of terrible flickering in some “old” Nseries devices

even today within the NGA architecture it looks like this is still missing, and it still relies on the MCU MHz and DMA speeds and double buffering to handle everything in the display backend, making everything happen faster than the display refresh. that seems to me not an optimal way.

I guess , this Vsync synchronization updates within the display and TV Outputs ( making the framerate timings fixed and predictable ) , a proper zero mem copies double buffering implementation , ( changing the buffer pointers when Vsync happens ) , to both platforms Symbian and Maemo/Meego, not and easy thing to do though as require to change the display drivers,, but should be a must if QT / QML / even OpenGL want to be serious “velvet” about animations and high FPS games.

Rob Palmer says:

@Emilio: Symbian (N8 onwards) can achieve proper zero-copy tear-free framebuffer updates. EGL window surfaces are implemented using this method, so long as you don’t enable framebuffer preservation across swaps (via EGL_SWAP_BEHAVIOR). Window Server rendering (via CWindowGc) gets routed through to an EGL window surface with anti-tearing switched on (eglSwapInterval==0).

Emilio Coronado says:

That’s nice to hear Rob, NGA has improved a lot the graphics area and is quite noticeable in the N8. is this anti tearing implemented in the display drivers as internal double buffering during the DMA transfers ?

it would be nice to see some graphics latencies data from QT, like EGL surfaces swapping brute force in the N8. would be useful for some developers to know those timings. and specially interesting to see if those are constant.

Commenting closed.