Writing Node.js Applications that Leverage Multiple Processors
Node.js
Monday, 06 September 2010 20:59

Since Node.js runs in a single process with no threads by default, your application cannot leverage multiple CPUs out of the box. The Node.js home page states:

Processes are necessary to scale to multi-core computers, not memory-sharing threads. The fundamentals of scalable systems are fast networking and non-blocking design—the rest is message passing. In future versions, Node will be able to fork new processes (using the Web Workers API ) which fits well into the current design.
I checked into the timeline for this promise and found a few interesting things.

First, there is now an implementation of the Web Workers API called node-webworker, which is built on node-websocket-server/client. I haven't tried it out myself, and I get the impression that it needs some work before it is ready for prime time, but it looks promising.

Second, there is a way to manually spawn child processes in Node.js, and from what I can tell, this is fully baked. The API call looks like "require('child_process').spawn()". One noted downside of this method is that the child process is not well integrated with the main process, so messaging between the processes is a bit painful, and things like output to stdout won't work as you might hope. My reaction was that I wanted something much more transparent: I want my app to "just work" on a multi-CPU server, without modifications to my application.

The response I got to this concern was that I/O, not the CPU, is usually the bottleneck for socket applications. Yeah, but what if you have a particularly CPU-intensive application? Well, in that case, it might make sense to spawn a child process just for the CPU-intensive logic, rather than having your whole application constantly pay the overhead of process context switching when CPU isn't the bottleneck anyway. And if tons of code in your application is CPU-intensive, then perhaps the Web Workers API implementation above is right for you.
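
To make the "spawn a child just for the CPU-intensive logic" idea concrete, here is a minimal sketch. It hands one heavy calculation to a child process with require('child_process').spawn() and reads the answer back from the child's stdout. The fib function, the fibInChild helper, and the trick of passing the work as a "-e" script are assumptions made purely for illustration, and the code is written against a recent Node.js release (Promises, template strings), so treat it as a sketch rather than the one right way to do it.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: compute fib(n) in a separate process so the
// parent's event loop stays free to handle I/O in the meantime.
function fibInChild(n) {
  // Source code the child will run. The naive recursive fib stands in
  // for whatever CPU-heavy logic a real app might have.
  const source = `
    function fib(k) { return k < 2 ? k : fib(k - 1) + fib(k - 2); }
    process.stdout.write(String(fib(${Number(n)})));
  `;

  return new Promise((resolve, reject) => {
    // process.execPath is the node binary running this script.
    const child = spawn(process.execPath, ['-e', source]);

    let output = '';
    child.stdout.on('data', (chunk) => { output += chunk; });

    child.on('error', reject); // the child could not be started
    child.on('close', (code) => {
      if (code === 0) resolve(Number(output));
      else reject(new Error('child exited with code ' + code));
    });
  });
}
```

Calling fibInChild(30).then(console.log) would print the result once the child finishes, while the parent keeps serving connections. Note that the only "messaging" here is a string written to stdout, which is exactly the kind of manual plumbing described above.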

So, in summary, CPU probably won't be your bottleneck. You can manually spawn child processes for CPU-intensive activities as needed, and you can use node-webworker if you really need it. It all makes quite a bit of sense to me now. It is somewhat annoying that there's an extra layer of complexity here that you wouldn't normally have to think about, but the design is well reasoned. And once again, since CPU is typically not the bottleneck in socket applications, this is a moot point in most cases and isn't something you need to think about constantly as you develop Node.js applications.

So I should just let my other CPUs sit idle?

Absolutely not. I'm merely saying that you don't need a single instance of your app to leverage multiple CPUs. You should write your app in such a way that multiple instances of it can run at once. A notable gotcha is in-memory variables that must be global to your app: each instance gets its own copy, so the instances will not see each other's changes. Really, this is just bad design anyway, because it limits you to one server whether you use Node.js or not. Spark and Spark2 are my favorite means of running multiple instances of an app in an organized manner.
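
If you want to see what "multiple instances" amounts to without a tool like Spark, the following is a rough sketch and not how Spark itself works. It assumes a hypothetical app.js that reads its listening port from a PORT environment variable; the planInstances and launchInstances helpers and the base port 8000 are likewise made up for illustration.

```javascript
const { spawn } = require('child_process');
const os = require('os');

// Assign one port per instance, counting up from basePort.
function planInstances(basePort, count) {
  return Array.from({ length: count }, (_, i) => ({ id: i, port: basePort + i }));
}

// Start `script` once per planned instance. Each child is a separate
// process with its own memory, and is told its port via PORT.
function launchInstances(script, plan) {
  return plan.map(({ id, port }) => {
    const child = spawn(process.execPath, [script], {
      env: Object.assign({}, process.env, { PORT: String(port) }),
      stdio: 'inherit', // let each instance write to this terminal
    });
    child.on('exit', (code) => {
      console.log('instance ' + id + ' exited with code ' + code);
    });
    return child;
  });
}

// Example (not run here): one instance per CPU core of a hypothetical app.js.
// launchInstances('app.js', planInstances(8000, os.cpus().length));
```

Something in front of these instances, such as a reverse proxy, would still have to spread incoming connections across the ports, and any state the instances need to share would have to live outside their memory.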

