Load balancing architecture

Originally created in January 2014; edited from a conversation between Roan Kattouw, Gabriel Wicke and Inez.

LVS is the load balancer in front of the frontend Varnishes. Every multi-server cluster has an LVS in front of it to load-balance requests. The path of a request to the Parsoid backend is thus:

API LVS --> API server --> Varnish LVS --> Varnish frontend --> Varnish backend --> Parsoid LVS --> Parsoid server

How do LVS load balancing and hashing work together? For every cluster of servers there is a service IP: a public IP address that points to the LVS server, which selects a backend in round-robin fashion. LVS only does simple load balancing and failover, while the frontend Varnishes do consistent hashing based on the request, so LVS's choice of backend is deliberately dumb. The only smart backend selection happens from the Varnish frontend to the Varnish backend: the frontend knows exactly which backend has (or is supposed to have) the cache entry, so it hits that backend's IP directly.
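
To make that contrast concrete, here is a minimal Python sketch of what "dumb" round-robin selection means; the hostnames are hypothetical and LVS itself is kernel-level (IPVS), not Python:

 # Illustrative only: hostnames are made up; this just shows that the
 # request is never inspected when LVS picks a real server.
 import itertools

 class RoundRobinLVS:
     def __init__(self, backends):
         self._cycle = itertools.cycle(backends)

     def pick(self, request_url):
         # The URL is ignored: every new connection simply goes to the
         # next server in the rotation.
         return next(self._cycle)

 lvs = RoundRobinLVS(["cp1001", "cp1002", "cp1003"])
 print(lvs.pick("/wiki/Foo"))  # cp1001
 print(lvs.pick("/wiki/Foo"))  # cp1002 -- same URL, different box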

The main purpose of splitting Varnish into a frontend and a backend is to do the hashing on the frontend. The frontend Varnishes are basically glorified hash-based routers: they hash the URL and route the request to the right backend. They also have smallish in-memory caches to serve very popular URLs from. As an implementation detail, the frontend Varnishes aren't actually separate boxes: each Varnish box runs one frontend and one backend, each on a different port. With probability 1/n the relevant backend is on the same box, in which case the frontend talks to it over loopback.
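
A rough sketch of what the frontend's hash-based routing amounts to, again with made-up hostnames and a plain SHA-1 hash standing in for whatever director logic the real VCL uses:

 # Sketch of hash-based routing from a frontend Varnish to a backend Varnish.
 # Hostnames are hypothetical; the real setup lives in the Varnish config.
 import hashlib

 BACKENDS = ["cp1001", "cp1002", "cp1003", "cp1004"]
 LOCAL_HOST = "cp1002"  # the box this particular frontend runs on

 def route(url):
     # The same URL always hashes to the same backend box, so that box is
     # the one place the cache entry is expected to live.
     digest = hashlib.sha1(url.encode()).hexdigest()
     backend = BACKENDS[int(digest, 16) % len(BACKENDS)]
     # With probability 1/n the chosen backend is this very box; then the
     # frontend reaches its co-located backend over loopback.
     return "127.0.0.1" if backend == LOCAL_HOST else backend

 print(route("/wiki/Main_Page"))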

In the Parsoid case the frontend Varnishes don't cache anything; they are pure routers.

Where can I see that hashing code? The Puppet manifests for setting up our caches are in manifests/cache.pp (IIRC) in our Puppet repo.

See also