Toward a million-user long-poll HTTP application – nginx + erlang + mochiweb :)

First, this post and title are inspired by the following great article, you absolutely must read if you are interested in the subject:

I’m quite far from 1M users, but still getting a load which out-of-the-box configuration can not cope with. What I’m doing is long-poll HTTP server which implements chat and some other real-time notifications for web clients. Nginx is used as a reverse proxy (HTTP router basically). Let’s say we want to handle N connections simultaneously (N should be much bigger than 1000 šŸ™‚ ). What parameters need to be changed? Note: all numbers are approximate.

1. Nginx
Number of file handlers (descriptors) provided by OS. It is configured in the script which launches nginx. In my case it is /etc/init.d/nginx. So I just add

ulimit -n N*2

Mind that proxy naturally uses two sockets per connection (uplink and downlink).

Now let’s take a look at Nginx config file (/etc/nginx/nginx.conf for me). Here is extract from Nginx manual on events module:

The worker_connections and worker_proceses from the main section allows you to calculate maxclients value:
max_clients = worker_processes * worker_connections
In a reverse proxy situation, max_clients becomes
max_clients = worker_processes * worker_connections/4

So we have to set

events {
worker_connections N*4;

Then as we use Nginx as a proxy not for a regular HTTP server, but for long-polling one, we have to tune Nginx parameters:

location /my_url {
proxy_buffering off;
proxy_read_timeout 3600;

So we forward all requests to “/my_url” to HTTP server running on localhost port 8000. We disable buffering in Nginx, and tell that it may take a while for the server to respond, so Nginx should wait and not timeout.

ok, we are done with Nginx, let’s go to the server

2. Erlang
Again in start script we set number of available file descriptors:

ulimit -n N

Then we are passing two additional params to erlang:

erl +K true +P N ...your params here

+K option enables kernel polling, which saves some CPU cycles when handling multitude of connections. +P says how many parallel processes Eralng VM can have, by default it is 32768. My application is based on mochiweb library and uses one process per connection (usual case for Erlang server applications), so we need to have at least N processes. Note that some recommend to set it to maximum possible value, roughly 137M, but It leads to that Erlang VM allocates about 1GB of memory, not a big deal per se as it is in virtual address space, but I can imagine what internal reference tables for this memory heap and erlang process tables/mailboxes/whatever are also big and can cause some overhead. So I’d prefer to be on a safer side and set it to the really required value.

3. Application
The last thing we want is tell mochiweb server that we want more than default 2048 simultaneous connections. So I changing start parameters:

mochiweb_http:start([{max, N },{name, ?MODULE}, {loop, Loop} | Options]).

5 thoughts on “Toward a million-user long-poll HTTP application – nginx + erlang + mochiweb :)

  1. Pingback: Solving performance problems in pub-sub erlang server « Alexey’s Random Notes

  2. Pingback: Toward a million-user long-poll HTTP application ā€“ nginx + erlang + mochiweb - Nginx Erlang

  3. Pingback: Toward a million-user long-poll HTTP application ā€“ nginx + erlang + mochiweb | Nginx Lighttpd Tutorial

  4. Cerys

    Way cool! Some very valid points! I appreciate you writing this post and also the rest of the
    website is very good.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s