Wednesday, September 24, 2014

Dynamic Worker Threads In JavaScript

Patrick knows what to do

Recently, I've been doing a lot of performance work on a real-time data visualization app at Netflix. This app in particular is getting event streams (via web sockets) from the Netflix cloud. These event streams are aggregated in various ways so I don't get a firehose of events via some brilliant work by my colleagues, but nonetheless, it's a lot for single-threaded browser app to process, especially when that browser app also has a rich, interactive UI.

This means that what I really need to do is spin up another thread to separate the work load between UI related concerns and I/O or data map/reduce concerns. The idea is to free the UI thread up as much as possible, so it doesn't "jank" when I'm doing things like parsing JSON that arrived on the web socket. Generally, a snapshot of this concept looks a *little* like this:

Behold: An overly simplified, off-center, and crappy diagram of two event loop threads mating in their natural habitat!



There are other things I'm doing in this area, for example using RxJS to buffer incoming pushes on the socket, setting up back-pressure handling, etc. But I'm not going to get into that stuff in this post. What I want to talk about here is web workers. Just the basics really.


How can you make a multi-threaded JavaScript app?


The first question someone might have is simple: "You're gonna to what?"

It's actually easier that it sounds. You just create a new Worker:


Then your script file will be something like this:

Pretty darn simple. If your browser supports it, you just create a new worker, and point it at a script you'd like to run in that thread. Both the worker object in the window, and the worker script's global scope have access to an `onmessage` event, and a `postMessage` method. The `postMessage` method will send a message event to the other side, which is handled by the `onmessage` event. Really, really basic.


How can I build one of these dynamically?


So then there's problem developers face for a variety of reasons. Maybe you need to create some worker on the fly because of variable conditions in your app. Maybe your JavaScript build environment makes unit testing separate, standalone scripts a PITA. Who knows? So what do you do? Well, Blob URLs to the rescue:



Easy peasy.  You can take any string containing JavaScript and use it to create a worker. Just make sure you also teardown the URL when you're done:



"But dynamic workers seem less testable, Ben, you dork."


Yeah. The above example is probably going to be even harder to test, I suppose. You have this string, that's a script. Evaling that string and then testing does not at all sound appealing. At least to me. So there are a few other approaches. Here are two that I like the best:

1. Behavioral testing of the live worker

Actually spin up the worker in your test, send messages to it and check the responses. This could get sloppy because maybe your worker doesn't always send responses to things you post to it. Also async gets nutty in tests sometimes.


2. Build your worker from functions and test those

For his approach, we create functions on some testable object or in some testable space we already have access to in our tests. So in Angular it might be a service or controller, and in Ember it could be any Ember Object (Controller, a Route, Model, etc). The idea is we just add a function to it, that we're going to use to create the worker.

The idea is that you can create a function that takes a global context as an argument. The reason I'm choosing to pass it as an argument, and not as `this` via binding is simply because a lot of frameworks (JQuery and Ember for example) make a lot of use of the `this` keyword, and I want to limit confusion about what this is. It's a global scope that doesn't belong to the current class.

Once you do that, you'll be able to call the worker function, or even bits and pieces of it, depending on how you use this trick and unit test those parts along with the rest of your framework code.

For my example, I'll use a plain JS class:


To use this Muppet class it might look a little like this:

And now some tests (in a Pseudo-Jasmine style, but this will work in any testing framework):


As a side note, I've "DRY-ed" up the tests above for the terseness of this blog entry. I'm actually against DRY tests, but that's a whole other blog post I suppose. This StackOverflow answer sums up some of my feelings on that, I suppose.

Also, one goofy thing about Jasmine is the need to reset spies after they've been called once, so the code above might not exactly just "work" if you copy-pasta, because I've taken out the bits that will reset the spy. Again, for terseness.


Additional things to know about Workers


  1. importScripts() - is a globally available function for loading other scripts into your worker. For example you could load RxJS in like so: `importScripts('http://example.com/rxjs.min.js', 'http://example.com/scripts/rxjs.async.js');`.  Be aware if you're using Blob URLs you might run into issue with relative URLs in import scripts.
  2. Workers have no document!  This means any script you might want to load that requires the DOM (Ember, Angular, or D3 for example) will not load properly in your worker.
  3. Workers, therefore have no DOM-related stuffs. You won't be able to build HTML via DOM objects and send them back to your UI thread.
  4. You can only "postMessage" objects that can be "structured cloned": This is complicated to explain, I guess, but to keep it simple: you can send POJOs back and forth or simple struct-type class objects.
  5. Workers can be a PITA to debug. For example: breakpoints aren't always hit, and console.log behaves differently. Therefor, I think it's a *really really* good idea to unit test your worker directly.

Go forth and enthreadify all the things!


I kid, I kid. You probably shouldn't use Workers unless you have to. But in the end, there are a LOT of ways to skin this cat. Now with the tools to convert any string containing JavaScript into a URL that can be spun up in a different thread, developers should be able to use threading in a variety of interesting, and testable(!!) ways.

There are still a lot of other things to talk about: MessageChannels, for example. But that, again would be a different blog entry. I just want to mention them so anyone reading this hopefully does some Googling.

No comments:

Post a Comment

This form allows some basic HTML. It will only create links if you wrap the URL in an anchor tag (Sorry, it's the Blogger default)