Hi friends,
One of the reasons I enjoy working with my hands is that I get instant feedback.
Unfortunately, this isn’t always the case with software tools. Often when I click a button or try to scroll, nothing happens for a second or two, shattering the feeling that I’m productively using something tangible.
This happens even in my own web apps, when I know that there’s no network activity, difficult computation, or other justifiable reason why feedback should be delayed.
So I’ve decided to investigate this matter and find out what it takes to actually make a fast web application.
The feeling of speed is subjective, and probably depends on both physiological and psychological factors.
I don’t know much about the physiological properties of the human perceptual system:
so I’ll ignore those factors for now.
I don’t know much about the psychological factors either:
so I’ll ignore those factors too.
Ignoring the feeling of speed, all we have left are cold, hard, physical measurements: What does an application do to draw itself to the screen?
Unfortunately, I don’t know much about the general, low-level operating system and graphics hardware stuff.
However, I do know that Chrome tries to maintain a refresh rate of 60 frames per second, which — finally — gives us a place to start: How can we make sure our application has a new frame ready every (1 second / 60) ~= 16.7 ms?
Helpfully, Google provides a rendering performance overview that explains the basics. Chrome draws a webpage (DOM + CSS styles) in roughly four steps:

1. Style: figure out which CSS rules apply to which elements
2. Layout: calculate each element's size and position on the page
3. Paint: fill in the pixels for text, colors, images, borders, etc.
4. Composite: draw the painted layers to the screen in the correct order
Chrome does lots of caching to minimize how many of these steps need to be performed on each new frame. For example, if some JavaScript changes only background colors, then on the next frame the “layout” step can be skipped (i.e., the previous layout can be reused), since background colors don’t affect layout.
So, for our application to maintain 60 fps, our JavaScript and these four steps must run within 16.7 ms.
Profiling JavaScript is easy: just sprinkle profiling statements throughout the code.
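For example, a minimal timing helper along these lines (the `timeIt` name and the sample workload are just illustrative) works the same in the browser and in Node:

```javascript
// Wrap a chunk of work in timing statements. performance.now() returns a
// high-resolution, monotonic timestamp in milliseconds.
function timeIt(label, fn) {
  const start = performance.now();
  const result = fn();
  const elapsed = performance.now() - start;
  console.log(label + ": " + elapsed.toFixed(2) + " ms");
  return { result, elapsed };
}

// Illustrative workload: build a 10,000-character string.
const run = timeIt("build string", () => {
  let s = "";
  for (let i = 0; i < 10000; i++) s += "x";
  return s.length;
});
```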
Measuring the full browser rendering pipeline is trickier, because rendering is asynchronous. That is, DOM and style manipulation calls don’t block until the changes have been painted; they return immediately, and the browser renders them at its own pace.
Chrome DevTools provides a timeline tool that depicts rendering with helpful visualizations, but the tool can only be run manually.
However, I’d like a “performance test suite” that is fully automated, so that I can performance test every change.
Beyond application code changes, styling changes and browser implementation changes can all affect performance.
Furthermore, this test suite should run as quickly as possible, which means that each individual test should wait as little as possible: each test should run immediately after the previous test’s frame renders.
My first attempt to measure rendering time relied on Electron’s webContents.beginFrameSubscription.
The client JavaScript would perform some action (e.g., change the document.body background color) and measure the time it’d take for the next frame to arrive.
I ran all actions within core.async go routines so that I could “block” and ensure that an action wouldn’t run until the previous action’s frame arrived.
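In plain JavaScript, the same structure can be sketched with promises instead of go routines (here `nextFrame` is just a stand-in for whatever signals frame arrival; in this setup, the beginFrameSubscription listener):

```javascript
// Stand-in for a "frame arrived" signal; `subscribe` registers a one-shot
// callback (e.g., backed by webContents.beginFrameSubscription in Electron).
function nextFrame(subscribe) {
  return new Promise((resolve) => subscribe(resolve));
}

// Run each action only after the previous action's frame has arrived,
// recording how long each action took to reach the screen.
async function runActions(actions, subscribe) {
  const timings = [];
  for (const action of actions) {
    const start = performance.now();
    action(); // e.g., change document.body's background color
    await nextFrame(subscribe); // "block" until that change's frame arrives
    timings.push(performance.now() - start);
  }
  return timings;
}
```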
This scheme worked, but unfortunately the beginFrameSubscription listener added roughly 70 ms of overhead to each frame, making the results useless at the timescale I needed to measure (~1 ms).
I opened up an issue about this, but for the time being this approach won’t work.
I then looked into using window.requestAnimationFrame. In particular, I was curious if requestAnimationFrame callbacks are only called after any pending changes have been fully painted. I couldn’t find a clear answer reading the spec, so I decided to test it by deliberately hanging a requestAnimationFrame callback.
In particular, I set up two requestAnimationFrame callbacks: one that deliberately hangs (doing busy-work for much longer than a frame) and one that simply records when it’s invoked. Both callbacks queued themselves up for the next requestAnimationFrame tick.
I expected two possible outcomes:

1. The next round of callbacks wouldn’t fire until the changes from the previous (hung) frame had been fully painted.
2. The browser would keep firing callbacks every ~16.7 ms, regardless of whether the previous frame had been painted yet.

I was hoping for outcome 1, since this would mean:
1. I could use the time between requestAnimationFrame callbacks as an upper bound on the rendering speed of certain operations. E.g., if I performed operation X in a requestAnimationFrame callback, I would know that it rendered in less time than the interval between the original and the next requestAnimationFrame callback.
2. I could use requestAnimationFrame to run the actions I wanted to benchmark sequentially — I wouldn’t need to worry about multiple actions being consolidated into a single repaint.
I ran the experiment and outcome 1 held! Hurray!
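For reference, the experiment can be sketched like this (a reconstruction, not the original code; the hang duration and the injectable `raf` parameter are just there to make the sketch self-contained, and in a browser you’d pass window.requestAnimationFrame):

```javascript
// Spin the CPU for `ms` milliseconds, simulating a callback that hangs.
function busyWait(ms) {
  const end = performance.now() + ms;
  while (performance.now() < end) { /* spin */ }
}

// Schedule the two callbacks against a requestAnimationFrame-like function.
// Returns an array that fills with the intervals between invocations of the
// well-behaved callback; under outcome 1, those intervals stretch to roughly
// the hang duration instead of the usual 16.7 ms.
function runExperiment(raf, hangMs, rounds) {
  const intervals = [];
  let last = performance.now();
  let remaining = rounds;

  function hog() { // callback 1: deliberately hangs every frame
    busyWait(hangMs);
    if (remaining > 0) raf(hog);
  }

  function probe() { // callback 2: records time since its last invocation
    const now = performance.now();
    intervals.push(now - last);
    last = now;
    if (--remaining > 0) raf(probe);
  }

  raf(hog);
  raf(probe);
  return intervals;
}
```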
I then reused the first method’s code, but instead of using the webContents.beginFrameSubscription callback to indicate the arrival of the next frame (i.e., the end of the action being measured), I used requestAnimationFrame.
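In JavaScript terms, that measurement reads roughly like this (a sketch; `measurePaint` is my own name for it, and `raf` is injectable so the snippet is self-contained — pass window.requestAnimationFrame in a real page):

```javascript
// Resolve with the time (ms) between starting `action` and the following
// requestAnimationFrame callback, which serves as an upper bound on how long
// the action took to render, given that callbacks only fire after pending
// changes have been painted.
function measurePaint(action, raf) {
  return new Promise((resolve) => {
    raf(() => {                 // start at a fresh frame boundary
      const start = performance.now();
      action();                 // e.g., mutate styles or the DOM
      raf(() => resolve(performance.now() - start));
    });
  });
}
```

In a browser you’d call, say, `measurePaint(() => { document.body.style.background = "red"; }, requestAnimationFrame)`.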
Now that I had a method for measuring paint times, I could finally run some benchmarks. Before wiring up my application, I decided to first test a variety of simple actions to just get a feel for things:
This graphic shows the timing distributions, annotated with the mean ± standard deviation of the paint time. Translucent lines are drawn for the JavaScript (blue for React.js, orange for plain JS) and paint (green) timings. The scale at the bottom has ticks every 10 ms.
A few interesting things to note about these benchmarks:
- paint timings tend to cluster around 16 ms, but not always — the “100 children” paints occur in less than 16 ms. I’m not quite sure how this is happening, since requestAnimationFrame should only fire every 16 ms.
- nesting elements is computationally more expensive than constructing sibling elements. TBD whether this is overhead from React.js, Rum (a ClojureScript wrapper on React.js), or my own code:
```clojure
(rum/defc *sibling-boxes
  [num-boxes]
  [:div
   (for [idx (range num-boxes)]
     [:div {:style {:position "absolute"
                    :border "1px solid black"
                    :top (rand-int 100)
                    :left (rand-int 100)
                    :width 100
                    :height 100}}])])

(rum/defc *nested-boxes
  [num-boxes]
  (letfn [(*box [n]
            [:div {:style {:position "absolute"
                           :border "1px solid black"
                           :top (rand-int 100)
                           :left (rand-int 100)
                           :width 100
                           :height 100}}
             (when-not (zero? n)
               (*box (dec n)))])]
    (*box num-boxes)))
```
These code-execution and paint timings tell us which operations definitely take too long, but they don’t let us compare the relative performance of operations that take less than 16 ms.
To get that information, we’ll likely need to dig into Chrome’s high-resolution tracing system — but that’s a post for another month. =)
Have a happy new year!
Best,
Kevin