Tuesday, December 16, 2014

melonJS should be *All About Speed* Part 4

We've reached Part 4 of the melonJS performance improvement series! In the last post, I discussed some benefits that we can get out of WebGL, and wrapped up the inheritance improvements with some simple benchmark numbers. We've seen remarkable speed increases so far, and there's no reason to stop there! Today we'll discuss the game loop, one of the hottest parts of the framework (in terms of profiling), and what we can do to make it faster.

But first, here are the first three posts in this series, in case you want to catch up:

Part 1 : http://blog.kodewerx.org/2014/03/melonjs-should-be-all-about-speed.html
Part 2 : http://blog.kodewerx.org/2014/04/melonjs-should-be-all-about-speed-part-2.html
Part 3 : http://blog.kodewerx.org/2014/10/melonjs-should-be-all-about-speed-part-3.html

Processing AI on Non-Displayed Entities

A common question on the melonJS forum is, "Why don't my entities move when they are outside of the screen?" The answer is short and sweet, but understanding it will require the complete content of this article: melonJS will not attempt to perform updates on entities the player cannot see or interact with.

The usual followup question is "How do I make my entities update anyway?" And the answer to this question is something that most inexperienced developers do not want to hear: You do not want to update entities the player cannot see or interact with.

Trust me, you do not want to do it.

I know, you have big plans for a dynamic world where every NPC goes about its business and may even make some game-changing alterations when they are 1,000 miles away. Or your multiplayer game needs to track projectiles shot by other players on the server. And on, and on.

You still don't want to do it.

I'll tell you why you don't want to do it: The biggest hit to performance your game will face is rendering frames. But it is easy to optimize; only draw the parts that the player can currently see. Once that has been solved, the second performance challenge is going to be updating AI on all of the entities within the game world. This is also easy to optimize; only update the entities that the player can currently see.

The good news is that the end result you want can easily be "faked". You don't have to let AI wander and interact with stuff to create random world-events. Nor do you have to test every projectile for collisions when the player can't even see the things. These two concepts can be categorized under "mental masturbation"; it seems like a really interesting idea, and it also seems like the obvious way that things in a game world should work. The truth is, these are both very naïve implementations of game features that can be done in O(1) constant time, with the calculations taking far less than even a single frame to complete.

The first one: making NPCs move about the world? That's wasteful! It can be simulated by giving the NPC a new random off-screen position as soon as it leaves the viewport. And the interactions can also be done with a much simpler random event system; with some random/rare chance, perform a random interaction. It's two lines of code, it doesn't burn CPU time unnecessarily, it has the same end result, and it's really quite convincing.
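Here is a minimal sketch of that trick in plain JavaScript. None of this is melonJS API: `contains`, `relocateOffscreen`, and `maybeRandomEvent` are hypothetical helpers, rectangles are assumed to be plain `{ x, y, width, height }` objects, and the random source is passed in only so the behavior is easy to test.

```javascript
// Hypothetical helpers for illustration; not part of melonJS.

// True when the point (px, py) lies inside rect.
function contains(rect, px, py) {
    return px >= rect.x && px < rect.x + rect.width &&
           py >= rect.y && py < rect.y + rect.height;
}

// Give the NPC a new random position inside the world but outside the
// viewport. Returns false (and leaves the NPC alone) if no off-screen
// spot turns up after a bounded number of tries.
function relocateOffscreen(npc, world, viewport, random) {
    random = random || Math.random;
    for (let i = 0; i < 100; i++) {
        const x = world.x + random() * world.width;
        const y = world.y + random() * world.height;
        if (!contains(viewport, x, y)) {
            npc.x = x;
            npc.y = y;
            return true;
        }
    }
    return false;
}

// With probability `chance`, pick one random event; otherwise null.
function maybeRandomEvent(events, chance, random) {
    random = random || Math.random;
    if (random() < chance) {
        return events[Math.floor(random() * events.length)];
    }
    return null;
}
```

Call `relocateOffscreen` once when the NPC leaves the viewport, and `maybeRandomEvent` whenever you want a chance of a world-event. Neither costs anything per frame while the NPC is out of sight.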

The second one: Handling all remote player actions client-side, even when not viewable? Use events instead! Your server will be tracking the projectiles for collision detection. (Right? Right!) It should also track every player's viewport, and signal the remote player when a projectile first enters their viewport.
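A rough sketch of what the server side of that could look like, in plain JavaScript. Everything here is an assumption for illustration: the `players` records, the per-player `seen` set, and the `send` callback stand in for whatever your server actually uses, and the projectile is treated as a single point.

```javascript
// Hypothetical server-side sketch; not melonJS API and not a real protocol.

// True when the point (px, py) lies inside rect { x, y, width, height }.
function contains(rect, px, py) {
    return px >= rect.x && px < rect.x + rect.width &&
           py >= rect.y && py < rect.y + rect.height;
}

// Call after each projectile move. Each player record is assumed to look
// like { id, viewport, seen } where `seen` is a Set of projectile ids
// currently inside that player's viewport.
function notifyOnEnter(projectile, players, send) {
    players.forEach(function (player) {
        const inside = contains(player.viewport, projectile.x, projectile.y);
        if (inside && !player.seen.has(projectile.id)) {
            // First frame inside this viewport: signal the client once.
            player.seen.add(projectile.id);
            send(player.id, projectile);
        } else if (!inside) {
            // Left the viewport: forget it, so a re-entry signals again.
            player.seen.delete(projectile.id);
        }
    });
}
```

The client only ever hears about a projectile at the moment it becomes visible, so it never has to simulate the ones it cannot see.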

Here's a quote from a forum post I made a while ago, which I think describes the issue as well as anything: "I like to think of the problem in a similar way to Schrödinger's cat; if I can't observe it, it may or may not exist." It's better to be on the safe side, and assume it doesn't exist until the player can see it.

Why go to this much trouble to discard things the player cannot see? Because the performance win is unprecedented! Imagine a fairly big game that has 300 "alive" objects. Many of the objects are entities that have a renderable, but not all; there are also objects for the HUD, objects for the map, even objects for performing tweens and timers. Now let's say you also want the game to run at 60 fps. So that's 300 objects to update and draw within 16.7ms.

Because there are two loops (update loop, and draw loop) you can divide 16.7 by 300*2 and derive the average amount of time your game may spend processing an update or a draw from a single object. Pencils down! It's about 28µs! That's 0.000028 seconds each. For reference, JavaScript's Date() resolution only goes down to 1ms (0.001 seconds), and the timer (setTimeout and friends) resolution is only about 4ms (0.004 seconds) in recent UAs, down from about 10ms (0.01 seconds) in older ones. Most developers using JavaScript will not have to deal with time slices any smaller than this.

So, can your objects update or draw in just 28µs or less? Well, maybe. Depending on the overall CPU speed (something you do not control!) and whether your code is highly optimized (often for a specific browser) ... maybe. Instead, let's assume we don't have to update and draw all 300 objects. (Because you don't!) Maybe only 50 of them are on screen at any one time. Punch in the new number, and you get 167µs, much better! It's still not a whole lot of time, but it is six times more available time than when operating on all 300 objects.
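If you want to punch in your own numbers, the arithmetic is tiny. This is just the division described above wrapped in a function; `budgetMicros` is a name made up for this post.

```javascript
// Average time available per update-or-draw call, in microseconds.
// fps: target frame rate; objects: how many objects are processed;
// passes: loops over the objects per frame (2 = update + draw).
function budgetMicros(fps, objects, passes) {
    const frameMs = 1000 / fps;              // ~16.7ms at 60 fps
    return (frameMs / (objects * passes)) * 1000;
}

console.log(Math.round(budgetMicros(60, 300, 2))); // 28
console.log(Math.round(budgetMicros(60, 50, 2)));  // 167
```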

Now that you have heard the theory and some very simple workarounds, I will tell you how to ignore all of this advice and allow your objects to update while off-screen anyway: Never use the `alwaysUpdate` flag. It is your enemy. It was put in so that it could be used responsibly to provide incredibly powerful functionality, like me.Tween. A tween is an example of an object for which the viewport is irrelevant, so it should update every frame.

How we plan to make it better

Some day this might change, and objects like me.Tween could get their own cozy little area to live, outside of the normal game loop. That would allow us to get rid of the alwaysUpdate flag, for the most part. One exception is destroying projectiles when they leave the viewport. For this, melonJS needs viewport-enter and viewport-leave events (see #191). This ticket primarily covers an entity "sleep" API, which will allow entities to be put to sleep even if they are in the viewport, further saving precious CPU cycles for entities you know will not need AI updates.

There is a goal, here, of course. melonJS should automatically take care of putting entities to sleep when they are not needed. The obvious heuristic is whether the player can see the entity; the inViewport flag. Some other heuristics may be chosen automatically, but I don't know exactly what they will be.

For the inViewport stuff, it's pretty easy to optimize, now that we have a QuadTree. This data structure can be used to select all entities near the viewport. The selected list of entities is then considered "awake" for the frame (minus the entities that were forced to sleep). I have a hunch the event handling can be done efficiently with set theory operations.

Effectively we have four sets: all, awake, sleeping, and inViewport.

  • all: All entities in the world container.
  • awake: A subset of all, containing the awake entities.
  • sleeping: The set difference (all \ awake), containing all sleeping entities.
  • inViewport: A subset of all, containing entities that are in the viewport rectangle.

To automatically sleep entities, we take a difference: (awake \ inViewport), that is, everything in awake that is not in inViewport. The result is a set containing all of the awake entities outside of the viewport. For each entity in this result set, trigger the viewport-leave event, and move it from awake to sleeping.

To automatically wake up entities when they enter the viewport, we can take the intersection: (inViewport & sleeping). The result is a set containing all sleeping entities within the viewport. melonJS can then decide whether each of these entities needs to be woken: whether it was put to sleep automatically, is set to wake up at some time in the future, etc. Waking an entity is a matter of triggering the viewport-enter event on it, and moving it from sleeping to awake.

Finally we are left with a synchronized awake set that can be updated for the frame. And the inViewport set is drawn.
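To make the bookkeeping concrete, here is a small sketch using the built-in JavaScript Set. It is not how melonJS implements (or will implement) any of this. The `onEnter` and `onLeave` callbacks stand in for the proposed viewport events, and the `forcedSleep` flag is a made-up way to mark entities that were put to sleep on purpose. Note that the entities leaving the viewport are computed as awake minus inViewport.

```javascript
// Illustrative sketch only; names and flags are hypothetical.

// Elements of a that are not in b.
function difference(a, b) {
    const out = new Set();
    a.forEach(function (e) { if (!b.has(e)) out.add(e); });
    return out;
}

// Elements that are in both a and b.
function intersection(a, b) {
    const out = new Set();
    a.forEach(function (e) { if (b.has(e)) out.add(e); });
    return out;
}

// Synchronize the awake and sleeping sets with this frame's inViewport set.
function syncAwake(awake, sleeping, inViewport, onEnter, onLeave) {
    // Awake but no longer visible: fire "leave" and put to sleep.
    difference(awake, inViewport).forEach(function (e) {
        onLeave(e);
        awake.delete(e);
        sleeping.add(e);
    });
    // Sleeping but visible: wake up, unless it was forced to sleep.
    intersection(inViewport, sleeping).forEach(function (e) {
        if (e.forcedSleep) return;
        onEnter(e);
        sleeping.delete(e);
        awake.add(e);
    });
}
```

Both helpers return fresh sets, so it is safe to move entities between awake and sleeping while looping over the results.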

This is not exactly how it will be implemented, but it shows the basic idea: only entities that are awake can be updated, and by default only entities that are in the viewport are awake. There will likely be some overrides available for forcing entities to be awake regardless of their visibility, just because we know how game developers love to shoot themselves in the foot! But we're hoping the viewport events will be used instead for things like destroying projectiles when they leave the screen.

The benefit of doing all of this work with set theory is that melonJS won't have to iterate the entire list of 300 entities each frame; it only has to iterate the list of entities which are potentially in the viewport. I say "potentially" because there is a chance that the list provided by QuadTree contains entities just outside the viewport boundaries. Those edge cases still must be filtered by doing rectangle hit testing. This is an overall win as long as not all of your entities are within the viewport. (That can happen! Especially in single-screen maps.)
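The filtering step is a plain rectangle overlap test. In this sketch, `candidates` is assumed to be whatever array the QuadTree hands back; I'm deliberately not calling any QuadTree API here, and entities are assumed to be axis-aligned `{ x, y, width, height }` boxes.

```javascript
// Illustrative sketch; `overlaps` and `filterInViewport` are hypothetical names.

// True when two axis-aligned rectangles share some area.
// Rectangles that only touch along an edge do not count.
function overlaps(a, b) {
    return a.x < b.x + b.width && a.x + a.width > b.x &&
           a.y < b.y + b.height && a.y + a.height > b.y;
}

// Keep only the candidates that really intersect the viewport.
function filterInViewport(candidates, viewport) {
    return candidates.filter(function (e) {
        return overlaps(e, viewport);
    });
}
```

The cost is proportional to the number of candidates, not to the number of entities in the world, which is the whole point.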

The second benefit is that keeping track of entities in the awake and sleeping sets makes event handling efficient: each step works on a subset of a subset, further reducing the number of iterations required to process each frame.

The viewport events, along with the related entity lifecycle events, can make all the difference in a game with large maps. Developers need to be cautious about unnecessarily using the CPU, but the game engine also needs to provide utilities to directly manage how the CPU is used by game entities. That's what these events are all about. And that's why melonJS is all about speed!