Wednesday, October 15, 2014

melonJS should be *All About Speed* Part 3

In the first two parts of this blog series, I discussed classical inheritance and object pooling. You can find links to both parts below.

Part 1 :
Part 2 :

Inheritance Wrap-Up

After a long hiatus, I'm back to work on melonJS. Continuing with the performance enhancements we introduced with v1.1.0, we've removed a lot of the performance burden from the collision detection system for v1.2.0.

But before we get into that, let's see what kind of performance increase was actually gained by transplanting our classical inheritance microlib and getting rid of the deepcopy that it had to perform. I haven't actually prepared a benchmark of any kind. Fortunately, there's not much need for building a benchmark, since we can just use the Particle Editor example that ships with melonJS since v1.0.0! The nice thing about the Particle Editor is that it includes a custom particle-debug-panel plugin which reports information like how long the update and draw steps take, and even renders a nice little performance graph.

Here's what the Particle Editor looks like in v1.0.0:

melonJS 1.0.0 Particle Editor

This is what it looked like upon initial release. While the general display hasn't changed much (some rendering bugs aside), you should take note of the performance graph in the lower left corner. In 1.0.0, 300 particles take roughly 3ms to update the Particle Container, and 3.3ms (total) to update the entire frame. Similarly for drawing: 3.7ms to draw the Particle Container, and 4.5ms (total) to draw the entire frame. That's certainly not terrible: 7.9ms used over the entire frame, which leaves about 8.7ms for game logic to maintain 60fps.

How does 1.1.0 compare, with its improved object inheritance?

melonJS 1.1.0 Particle Editor; Note the missing control widgets. A bug that went unnoticed.

Wow ... wait, what?

This is the same Particle Editor example, which itself hasn't really changed. We also didn't touch the Particle Emitter or Container code. Ignoring the missing widgets (which was a bug introduced by the new video renderer) this example is exactly the same. And yet, it's overall 4x faster.

The Particle Container update time dropped from 3ms to under one tenth of a millisecond. And total frame update time is down to about 0.16ms! The performance increase with drawing is equally stunning: Particle Container draw time is down to 1.9ms (from 3.7ms) and total frame drawing time is down to 2.8ms (from 4.5ms). The overall frame time with 300 particles is about 2.9ms! Which means the biggest remaining impact is the rendering of 300 individual images. Nothing a little WebGL can't solve! But more on that later...

These numbers tell us that even with 300 particles, you still have over 13.7ms to do your game logic on each frame to maintain a buttery-smooth 60fps. (It might be telling to note that I ran these tests on a Late-2011 MacBook Pro, with Chrome 38.0.2125.101, and "automatic graphics switching" disabled in System Preferences.)
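If you want to check the headroom math yourself, it all comes from the 60fps frame budget of 1000 / 60, or roughly 16.67ms. Here's a quick sketch using the overall frame times quoted above; `remainingBudget` is just a helper name I made up for illustration, not anything in melonJS.

```javascript
// Time available per frame at the target frame rate, in milliseconds.
var TARGET_FPS = 60;
var FRAME_BUDGET_MS = 1000 / TARGET_FPS; // roughly 16.67ms

// Hypothetical helper: how much of the frame is left for game logic
// once the engine's update + draw time has been spent.
function remainingBudget(frameMs) {
    return FRAME_BUDGET_MS - frameMs;
}

// Overall frame times quoted above for 300 particles.
var left100 = remainingBudget(7.9); // melonJS 1.0.0
var left110 = remainingBudget(2.9); // melonJS 1.1.0

console.log("1.0.0 headroom (ms):", left100);
console.log("1.1.0 headroom (ms):", left110);
```

Swap in the numbers from your own debug panel to see how much room your game logic really has.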

Furthermore, I've fixed the rendering issues with the control widgets for 1.2.0, and just for kicks here's what the Particle Editor now looks like running under tonight's build!

melonJS 1.2.0 Particle Editor (WIP)

As you can see, the performance numbers have not changed at all. That's because the widgets in 1.1.0 were being drawn, just incorrectly. For some reason, I managed to capture this screenshot on a frame that was actually drawing faster than what 1.1.0 showed. But that's just coincidence. The average frame times are identical.

So there you go! I actually do know what I'm talking about. ;) Go update your JavaScript projects to Jay Inheritance if you care about speed.

Collision Detection

This is the next major performance boost on our end! Olivier finally ripped out all of the old tile-based collision detection code and replaced it with the lovely SAT.js (Separating Axis Theorem). This gives us pixel-precise polygon collision and hit detection using an efficient algorithm invented by people much smarter than any of us. Olivier also incorporated a QuadTree implementation for the broad phase, which reduces the number of tests that need to be done when collision detection is requested. Together, these two features make collision detection incredibly fast and flexible.
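For the curious, the core idea behind the Separating Axis Theorem fits in a few lines: two convex polygons do not overlap if you can find an axis (an edge normal of either polygon) along which their projections don't touch. The sketch below is my own minimal illustration of that idea, not the SAT.js source and not the melonJS API; the function names and the plain `{x, y}` point format are assumptions made for the example.

```javascript
// Minimal Separating Axis Theorem sketch for two convex polygons.
// Each polygon is an array of {x, y} vertices listed in order.

// Project every vertex onto an axis and return the covered interval.
function project(points, axis) {
    var min = Infinity, max = -Infinity;
    for (var i = 0; i < points.length; i++) {
        var d = points[i].x * axis.x + points[i].y * axis.y;
        if (d < min) { min = d; }
        if (d > max) { max = d; }
    }
    return { min: min, max: max };
}

// True if some edge normal of `a` separates `a` from `b`.
function hasSeparatingAxis(a, b) {
    for (var i = 0; i < a.length; i++) {
        var p = a[i], q = a[(i + 1) % a.length];
        // Edge normal; not normalized, since only overlap matters here.
        var axis = { x: -(q.y - p.y), y: q.x - p.x };
        var pa = project(a, axis), pb = project(b, axis);
        if (pa.max < pb.min || pb.max < pa.min) { return true; }
    }
    return false;
}

// Convex polygons overlap only if no edge normal of either separates them.
function polygonsOverlap(a, b) {
    return !hasSeparatingAxis(a, b) && !hasSeparatingAxis(b, a);
}

var square   = [{x: 0, y: 0}, {x: 40, y: 0}, {x: 40, y: 40}, {x: 0, y: 40}];
var triangle = [{x: 30, y: 0}, {x: 60, y: 0}, {x: 30, y: 30}];
var farAway  = [{x: 100, y: 100}, {x: 140, y: 100}, {x: 120, y: 140}];

console.log(polygonsOverlap(square, triangle)); // true
console.log(polygonsOverlap(square, farAway));  // false
```

A real implementation also reports how far the shapes overlap so they can be pushed apart, and it handles circles. This sketch only answers yes or no, and it treats shapes that merely touch as overlapping.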

For an example of the flexibility, melonJS now has proper collision detection support for isometric maps for the first time. It is possible to define arbitrary shapes within your maps for world collisions (create bumpy terrain or funky shapes!) and you're also able to define multiple arbitrary shapes for your entities' physics bodies. For anyone familiar with 2D or 3D physics engines, we have the same concept of a "body" where you apply forces, and specify collision shapes. The entity references its body, so the entity itself is no longer responsible for physics within melonJS.

There are still bugs and missing features (like ellipse shapes not working 100% -- perfect circles are fine, though). However, it is a really good start for fixing performance issues around collision detection. There are also a few more things we can do to increase performance even more, and there are open tickets for all of them:

Of these, #551 is probably the least important for speed. It will help a bit, but the algorithm is tricky, and it won't be a super big win. #587 will make a noticeable improvement on maps with many world collision shapes. And #574 will be the biggest win of the three.

What #574 promises is to remove the user's responsibility for performing collision detection, which in turn means the user won't be able to slow down her game by doing collision detection calls too often. In other words, the user won't be able to penalize herself just because she wants to perform collision detection against 1000 entities. She should absolutely be allowed to do that! Bear in mind that such a game might only be feasible on desktop. Collision detection is still hard to optimize for mobile platforms.
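To put 1000 entities in perspective, here's a rough sketch of why the broad phase matters so much at that scale. Testing every entity against every other one means n * (n - 1) / 2 narrow-phase checks, while a broad phase only hands over pairs that are near each other. I'm using a plain uniform grid here as a simpler stand-in for the QuadTree, and the entity format (`x`, `y`, `w`, `h`) and function names are made up for the example; none of this is the melonJS implementation.

```javascript
// Brute force: every entity tested against every other entity.
function bruteForcePairCount(n) {
    return n * (n - 1) / 2;
}

// Broad phase with a uniform grid (a stand-in for a QuadTree):
// only entities whose bounding boxes share a cell become candidate pairs.
function candidatePairs(entities, cellSize) {
    var cells = {}; // "cx,cy" -> indices of entities touching that cell
    var seen = {};  // pairs already emitted, so each pair appears once
    var pairs = [];
    entities.forEach(function (e, index) {
        var x0 = Math.floor(e.x / cellSize);
        var y0 = Math.floor(e.y / cellSize);
        var x1 = Math.floor((e.x + e.w) / cellSize);
        var y1 = Math.floor((e.y + e.h) / cellSize);
        for (var cx = x0; cx <= x1; cx++) {
            for (var cy = y0; cy <= y1; cy++) {
                var key = cx + "," + cy;
                var bucket = cells[key] || (cells[key] = []);
                bucket.forEach(function (other) {
                    var pairKey = other + ":" + index;
                    if (!seen[pairKey]) {
                        seen[pairKey] = true;
                        pairs.push([other, index]);
                    }
                });
                bucket.push(index);
            }
        }
    });
    return pairs;
}

// 1000 small entities spread over a 40-column layout.
var entities = [];
for (var i = 0; i < 1000; i++) {
    entities.push({ x: (i % 40) * 20, y: Math.floor(i / 40) * 20, w: 16, h: 16 });
}

console.log("brute force pairs:", bruteForcePairCount(entities.length));
console.log("grid candidates:  ", candidatePairs(entities, 64).length);
```

The candidate pairs still need a precise narrow-phase test afterwards, but there are far fewer of them than the brute-force count.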

Onward to WebGL

Aaron has been busy since 1.0.0 adding support for WebGL! This will be the biggest performance improvement on mobile platforms, hands-down. Even the WIP 1.2.0 doesn't have a working WebGL renderer, so I'm not able to show off any crazy benchmarks. Primarily we're still in the architecture stage for implementing WebGL. We're currently missing useful things like font rendering! And our 2D/3D matrix code seems to have some issues, causing the WebGL renderer to display all black. ;)

WebGL is kind of a buzzword for speed, so let me clarify what WebGL means to the melonJS team: WebGL means that we can offload some of the heavy drawing code to the GPU (where it belongs). That seems pretty obvious. WebGL also means we can offload some of the heavy non-drawing code to the GPU. That's where we will gain the biggest wins. GPUs are exceptionally good at parallelizing computations, including the kinds of computations we currently have to do on the CPU, like vector math for physics. Why not write a physics shader to handle some of those transformations on the GPU?

Speaking of parallelization, desktop and mobile platforms have multiple CPU cores, which are typically idle, even in the computationally heaviest of melonJS examples and games. We have an opportunity to utilize all of the cores for the best player experience using Web Workers. We haven't ironed out all the details, but there's an ongoing discussion in the ticket.
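Since we haven't settled on a design, take this as a thought experiment rather than a plan: one way to spread physics math over several cores is to split the bodies into contiguous chunks and hand each chunk to a Web Worker. Everything named here is hypothetical, including `splitRanges`, `integrate`, `dispatch`, and the `physics-worker.js` script; only `Worker`, `postMessage`, and typed arrays are standard browser features.

```javascript
// Split `count` items into at most `parts` contiguous [start, end) ranges.
function splitRanges(count, parts) {
    var ranges = [];
    var size = Math.ceil(count / parts);
    for (var start = 0; start < count; start += size) {
        ranges.push({ start: start, end: Math.min(start + size, count) });
    }
    return ranges;
}

// The per-chunk work a worker would do: advance positions by velocity.
// Both arrays are flat Float32Arrays laid out as [x0, y0, x1, y1, ...].
function integrate(positions, velocities, dt) {
    for (var i = 0; i < positions.length; i++) {
        positions[i] += velocities[i] * dt;
    }
    return positions;
}

// Browser-only wiring (not run here). "physics-worker.js" is a
// hypothetical script that calls integrate() in its onmessage handler
// and posts the updated positions back.
function dispatch(positions, velocities, dt, workerCount, onChunk) {
    splitRanges(positions.length / 2, workerCount).forEach(function (range) {
        var worker = new Worker("physics-worker.js");
        worker.onmessage = function (e) {
            onChunk(range, e.data);
            worker.terminate();
        };
        // slice() copies only this chunk; two floats per body.
        worker.postMessage({
            positions: positions.slice(range.start * 2, range.end * 2),
            velocities: velocities.slice(range.start * 2, range.end * 2),
            dt: dt
        });
    });
}

console.log(splitRanges(10, 4));
```

A real engine would keep a pool of long-lived workers instead of spawning them per call, and the cost of copying data back and forth might eat the gains for small scenes. That trade-off is exactly the kind of detail still to be ironed out.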

WebGL won't be ready for the 1.2.0 release at the end of the month. But the looming thought of ultra fast sexy graphics in melonJS makes my mouth water!
