Hypothesis for this update
In the last post we observed that although we shared the vertex buffer, program and handles among all circles (rather than creating new ones for each), the rendering code in each circle was still re-setting many uniform and GL state values that only need to be set once per frame (assuming we render all circles in a batch--that is, with nothing else intervening).

We hypothesized that removing the redundant GL API calls would improve performance, so we planned to move them to a Circle.prepare() method that we would call just once at the start of each frame.
Code changes
The Circle.js class now has a new static method, Circle.prepare(), that contains many of the calls removed from the render() method. Also, the width and height parameters are removed from render() and used in prepare().

    function Circle(gl, x, y, mass, r, g, b) {
      this.pos = new Vec(x,y);    // Position, in pixel coords with (0,0) at center.
      this.vel = new Vec(0,0);    // Velocity (in pixels per second).
      this.mass = mass;           // Arbitrary measure; affects circle size.
      this.r = r;
      this.g = g;
      this.b = b;
      this.selected = false;      // Not used at this time.

      // Only create vertex buffer, load program and collect handles once.
      // Store them in static state.
      //
      if( !Circle.program ) {
        Circle.geoBuffer = gl.createBuffer();
        gl.bindBuffer(gl.ARRAY_BUFFER, Circle.geoBuffer);
        gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([
          -1, -1,
           1, -1,
          -1,  1,
          -1,  1,
           1, -1,
           1,  1]), gl.STATIC_DRAW);

        Circle.program = loadShaderProgram("circle", "circle");
        gl.useProgram(Circle.program);

        // Vertex attribute handle.
        Circle.geoHandle = gl.getAttribLocation(Circle.program, "a_position");

        // Resolution handle. Resolution is a vec2 in pixels.
        Circle.resHandle = gl.getUniformLocation(Circle.program, "u_resolution");

        // Position handle. Position is the center of the circle in pixel coords.
        Circle.posHandle = gl.getUniformLocation(Circle.program, "u_pos");

        // World Size handle. World Size is a vec2 containing the pixel width/height of the context.
        Circle.worldSizeHandle = gl.getUniformLocation(Circle.program, "u_worldSize");

        // Color is the [RGB] color of the circle.
        Circle.colorHandle = gl.getUniformLocation(Circle.program, "u_color");
      }
    }

    // Shared state: set once per frame for the whole batch of circles.
    //
    Circle.prepare = function(gl, width, height) {
      gl.useProgram(Circle.program);
      gl.enable(gl.BLEND);
      gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA);
      gl.bindBuffer(gl.ARRAY_BUFFER, Circle.geoBuffer);
      gl.vertexAttribPointer(Circle.geoHandle, 2, gl.FLOAT, false, 0, 0);
      gl.enableVertexAttribArray(Circle.geoHandle);
      gl.uniform2f(Circle.worldSizeHandle, width, height);
    };

    Circle.prototype.render = function(gl) {
      // Pipeline state setup (per-circle values only).
      //
      gl.uniform2f(Circle.resHandle, this.mass * 300, this.mass * 300);
      gl.uniform2f(Circle.posHandle, this.pos.x, this.pos.y);
      gl.uniform3f(Circle.colorHandle, this.r, this.g, this.b);

      // Draw.
      //
      gl.drawArrays(gl.TRIANGLES, 0, 6);
    };
The script in default.html has one change: the addition of a call to Circle.prepare() just prior to rendering the circles.
    ...
    // Clear the screen.
    // Note: we're not using a depth or stencil buffer, so we only clear the color pixels.
    //
    gl.clear(gl.COLOR_BUFFER_BIT);

    Circle.prepare(gl, canvas.width, canvas.height);

    // Render each circle.
    //
    for (var i = circles.length - 1; i >= 0; i--) {
      circles[i].render(gl);
    }
    ...
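To get a feel for how many calls this removes, here is a rough sketch that counts GL calls per frame against a stand-in `gl` object. Everything in it is hypothetical scaffolding (`makeCountingGl`, `setSharedState`, `drawCircle`, `countCallsPerFrame` and the 3840x2160 size are not part of the demo), and it assumes the old render() issued the shared-state calls once per circle, as described above.

```javascript
// Hypothetical stand-in for a WebGL context: reading any property returns a
// function, and each time such a function is invoked the counter goes up.
function makeCountingGl() {
  const counter = { calls: 0 };
  const gl = new Proxy({}, {
    get() {
      return function () { counter.calls++; };
    }
  });
  return { gl, counter };
}

// State that is identical for every circle (mirrors Circle.prepare()).
function setSharedState(gl, width, height) {
  gl.useProgram(null);
  gl.enable(gl.BLEND);
  gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA);
  gl.bindBuffer(gl.ARRAY_BUFFER, null);
  gl.vertexAttribPointer(0, 2, gl.FLOAT, false, 0, 0);
  gl.enableVertexAttribArray(0);
  gl.uniform2f(null, width, height);
}

// Per-circle uniforms plus the draw call (mirrors Circle.prototype.render()).
function drawCircle(gl, c) {
  gl.uniform2f(null, c.mass * 300, c.mass * 300);
  gl.uniform2f(null, c.x, c.y);
  gl.uniform3f(null, c.r, c.g, c.b);
  gl.drawArrays(gl.TRIANGLES, 0, 6);
}

// Count the GL calls needed for one frame, with the shared state set either
// once per frame or once per circle.
function countCallsPerFrame(circles, sharedOncePerFrame) {
  const { gl, counter } = makeCountingGl();
  if (sharedOncePerFrame) setSharedState(gl, 3840, 2160);
  for (const c of circles) {
    if (!sharedOncePerFrame) setSharedState(gl, 3840, 2160);
    drawCircle(gl, c);
  }
  return counter.calls;
}

const circles = Array.from({ length: 500 }, () => (
  { x: 0, y: 0, mass: 1, r: 1, g: 0, b: 0 }
));
console.log(countCallsPerFrame(circles, false)); // 5500 (11 calls per circle)
console.log(countCallsPerFrame(circles, true));  // 2007 (7 shared + 4 per circle)
```

The count says nothing about how expensive each call is; it only shows that with 500 circles, well over half of the per-frame GL calls were redundant under this assumption.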
Results
Performance results are consistent with expectations, as discussed below.
Test set 1: requestAnimationFrame()
| # Circles | Physics | Firefox (FPS) | Opera (FPS) | Chrome (FPS) | Firefox difference | Opera difference | Chrome difference |
|---|---|---|---|---|---|---|---|
| 50 | No | 60 | 23 | 25 | 0 | 0 | 0 |
| 50 | Yes | 60 | 26 | 25 | 0 | +3 | +2 |
| 500 | No | 60 | 23 | 25 | 0 | 0 | 0 |
| 500 | Yes | 60 | 27 | 25 | +10 | +4 | +7 |
Again, Firefox amazed me, being the only browser to perform well in all tests. With our small change, we picked up those last 10 frames on the hardest test.
However, overall, these are very small improvements, and we would expect that to be the case when using requestAnimationFrame(), as it is intended to let the browser choose the best refresh rate. Opera and Chrome's numbers are so low given the work level that I strongly suspect their frame scheduling logic does not scale well to UHD resolutions.
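Since every number in these tables is a frame rate, here is one way such a figure could be collected. This is only a sketch: the post does not show how the demo measures FPS, and `FpsCounter` and `tick()` are hypothetical names. It counts frames and recomputes the rate once at least a second has elapsed.

```javascript
// Hypothetical helper: counts rendered frames and reports frames per second.
class FpsCounter {
  constructor() {
    this.frames = 0;         // frames seen in the current window
    this.windowStart = null; // timestamp (ms) at which the current window began
    this.fps = 0;            // most recently computed rate
  }

  // Call once per rendered frame with a millisecond timestamp.
  // Returns the most recently computed FPS (0 until a full window has passed).
  tick(nowMs) {
    if (this.windowStart === null) {
      // The first call only starts the clock; it does not count as a frame.
      this.windowStart = nowMs;
      return this.fps;
    }
    this.frames++;
    const elapsed = nowMs - this.windowStart;
    if (elapsed >= 1000) {
      this.fps = (this.frames * 1000) / elapsed;
      this.frames = 0;
      this.windowStart = nowMs;
    }
    return this.fps;
  }
}

// Browser usage (not run here): requestAnimationFrame passes a millisecond
// timestamp to its callback, which can be fed straight into tick().
//
//   const counter = new FpsCounter();
//   requestAnimationFrame(function loop(t) {
//     counter.tick(t);
//     /* render the frame here */
//     requestAnimationFrame(loop);
//   });
```

Under requestAnimationFrame() a counter like this can never report more than the rate at which the browser chooses to call back, which is why the differences in this test set are small.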
Test set 2: postMessage()
| # Circles | Physics | Firefox (FPS) | Opera (FPS) | Chrome (FPS) | Firefox difference | Opera difference | Chrome difference |
|---|---|---|---|---|---|---|---|
| 50 | No | 1480 | 5000 | 4500 | +350 | +3150 | +2600 |
| 50 | Yes | 1480 | 2840 | 2900 | +455 | +1340 | +1875 |
| 500 | No | 600 | 490 | 500 | +427 | +280 | +300 |
| 500 | Yes | 68 | 42 | 42 | +15 | 0 | +2 |
These results are something else entirely; we more than doubled some of our frame rates.
This seems ridiculous, but considering how simple our rendering is, adding a per-circle state change to the GL pipeline can easily cripple performance, and what we did here was remove that.
Of course, even when we are rendering at 5000 FPS (Opera), the browser still gets to decide how often to show our masterpiece to the user, and that will be somewhere between 1 and 60 FPS. Nonetheless, these numbers show us how quickly our frames can be rendered on the GPU using this technique.
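For readers who have not seen the postMessage() technique, the general idea is to schedule the next frame by posting a message to ourselves, so the loop runs as fast as the event queue allows instead of waiting for the display refresh. The sketch below is an assumption about the general shape, not the demo's actual loop; `startPostMessageLoop`, `renderFrame` and `shouldContinue` are hypothetical names.

```javascript
// Drive a render loop by posting a message to `target` after every frame.
// In a browser, `target` would be `window`; any object that provides
// addEventListener("message", ...) and postMessage(...) will do.
function startPostMessageLoop(target, renderFrame, shouldContinue) {
  function onMessage(event) {
    // Ignore messages that were not posted by this loop.
    if (event.data !== "frame") return;
    renderFrame();
    // Queue the next frame unless the caller asked us to stop.
    if (shouldContinue()) target.postMessage("frame", "*");
  }
  target.addEventListener("message", onMessage);
  target.postMessage("frame", "*"); // kick off the first frame
}

// Browser usage (not run here):
//
//   let running = true;
//   startPostMessageLoop(window, drawScene, () => running);
//
// where drawScene is whatever function clears the screen, calls
// Circle.prepare() and renders the circles.
```

Because each frame is queued as an ordinary message, other events still get a turn between frames, but nothing ties the loop to the display refresh. That is why the frame rates in this test set can climb into the thousands.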
Conclusion, and what's next
For those of you who haven't noticed, what we are building is ironically close to a particle system: a collection of physically interacting objects that we draw on the screen. Particle systems are common in games and other programs, and most often run entirely on the GPU--that means both the rendering and the physics are computed on the GPU.

However, I want to leave our CPU involved, based upon the assumption that at some point our circles will be user-interactive and/or dynamically updated from outside our simulation.
Given that, the next change I would like to make is to move our physics to a separate thread (via a WebWorker), freeing up the main thread somewhat in the process. I expect this to have little to no impact on FPS using the postMessage() technique, but a moderate impact on the requestAnimationFrame() technique (at least for Opera and Chrome).

Stay tuned!
I almost forgot: the current demo is here: http://experiments.uhdcoder.com/circles2