High Availability of hg web server through NFS share
Brodie Rao
brodie at bitheap.org
Wed Aug 10 22:34:36 UTC 2011
On Wed, Aug 10, 2011 at 1:09 PM, Christophe Furmaniak
<christophe.furmaniak.ml at gmail.com> wrote:
> Thanks Isaac for the pointer! (I may have searched with the wrong keywords
> on the bitbucket blog).
>
> These two articles give more information:
>
> http://blog.bitbucket.org/2010/08/25/bitbucket-downtime-for-a-hardware-upgrade/
> http://blog.bitbucket.org/2010/09/16/outage-incident-and-our-new-monitoring-setup/
>
> From what I understand (or guess), they seem to have at least 2 front
> end machines and shared storage (a Dell MD1120 DAS array).
> I don't know much about Dell DAS arrays; I'll keep searching.
>
> Anybody from Bitbucket on the mailing list?
We're still using the Dell array for repository storage, but we're
planning to switch to a NetApp array in the near future.
Here's a rough guide to our current architecture:
- 2 load balancers, each running HAProxy for HTTP, HTTPS, and SSH
traffic. For HTTPS, we terminate SSL using stunnel before it hits
HAProxy. We're using a patched version of stunnel that lets us
generate X-Forwarded-For headers (there's a sketch of how a backend
can consume that header after this list).
- 3 front end servers, each running nginx and 2 instances of Celery
(one for repository import jobs and another for everything else; see
the Celery sketch after this list).
Behind nginx, we have Gunicorn instances running our Django app,
hgweb, and our source tarball generator. hgweb gets 6 Gunicorn workers
per front end. These machines also run OpenSSH servers for SSH access.
- 2 DB servers running PostgreSQL 9. One is the master and the other
is the slave.
- A couple other machines that run RabbitMQ, Redis, and probably some
other things.
- 2 repository storage servers that are connected to our Dell storage
array (which is divided into two). Other servers get data from these
via NFS.
- Amazon CloudFront for static media, which proxies to nginx on our
front end servers (as opposed to manually uploading media to S3).
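As an aside, here's roughly what consuming those X-Forwarded-For
headers can look like in a WSGI backend. This is a sketch, not our
production code; it assumes every proxy hop in front of the app is
trusted to set or append to the header:

    class XForwardedForMiddleware(object):
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            forwarded = environ.get('HTTP_X_FORWARDED_FOR')
            if forwarded:
                # The first address is the original client; each proxy
                # hop appends its own address to the end.
                environ['REMOTE_ADDR'] = forwarded.split(',')[0].strip()
            return self.app(environ, start_response)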
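Similarly, here's a rough sketch of the Celery split on the front
ends. The broker URL, task name, and queue name are made up for
illustration:

    from celery import Celery

    # Illustrative broker URL; RabbitMQ is the broker, as noted above.
    app = Celery('tasks', broker='amqp://guest@rabbitmq-host//')

    # Route repository imports to a dedicated queue so one worker
    # instance handles imports and another handles everything else.
    app.conf.task_routes = {'tasks.import_repository': {'queue': 'imports'}}

    @app.task
    def import_repository(source_url, dest_path):
        # Illustrative: pull an external repository into local storage.
        import subprocess
        subprocess.check_call(['hg', 'clone', source_url, dest_path])

One worker instance then consumes only the imports queue (celery -A
tasks worker -Q imports) while the other consumes the default queue.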
To put that all together, the typical hgweb request looks something like this:
Client <-> HAProxy <-> nginx <-> Gunicorn/hgweb <-> storage server (NFS)
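The Gunicorn/hgweb piece of that chain is a small WSGI app. Roughly
(the config path here is made up):

    # wsgi.py -- minimal WSGI entry point for hgweb
    from mercurial.hgweb import hgwebdir

    # hgweb.config lists the repositories to publish
    application = hgwebdir('/etc/mercurial/hgweb.config')

Gunicorn loads it with something like: gunicorn -w 6 wsgi:application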
Also note that because we proxy everything through nginx, it can
quickly consume hgweb responses, slowly stream them out to the client,
and buffer request bodies before sending them to hgweb. This means
that the Gunicorn workers don't stay tied up for long.
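To make that concrete, the worker pool config might look something
like this (values are illustrative, not our exact settings):

    # gunicorn.conf.py -- sketch of an hgweb worker pool
    bind = "127.0.0.1:8000"   # nginx proxies to this upstream
    workers = 6               # the 6 hgweb workers per front end
    worker_class = "sync"     # plain sync workers; nginx shields them
    timeout = 120             # headroom for large repository operations

With nginx buffering in both directions, sync workers spend almost no
time waiting on slow clients.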