Varnish

Varnish is a fast caching proxy, and can be used as an alternative to Squid in a reverse caching accelerator setup.

We currently use Varnish for serving bits.wikimedia.org, upload.wikimedia.org, page views and API requests on Wikimedia wiki domains (e.g. *.wikipedia.org; aka text domains), and various miscellaneous web services.

As of February 2014, Squid is no longer in use.

HOWTO

See Varnish statistics

Run

# varnishstat

Fred has also written a Ganglia plugin in Python for Varnish, which Puppet installs automatically. All varnishstat metrics are therefore visible in Ganglia.
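
For a quick look at cache efficiency without Ganglia, the counters can also be post-processed by hand. The following is a minimal sketch, assuming the one-shot output of varnishstat -1 (counter name, cumulative value, per-second rate, description) and the cache_hit and cache_miss counters. The sample text and the helper names parse_varnishstat and hit_ratio are made up for illustration; this is not the Ganglia plugin.

```python
# Hypothetical sample in the shape of `varnishstat -1` output:
# counter name, cumulative value, per-second rate, description.
SAMPLE = """\
client_conn            1200         1.20 Client connections accepted
client_req             5000         5.00 Client requests received
cache_hit              4500         4.50 Cache hits
cache_miss              500         0.50 Cache misses
"""


def parse_varnishstat(text):
    """Map each counter name to its cumulative integer value."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything without a numeric second column.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        counters[fields[0]] = int(fields[1])
    return counters


def hit_ratio(counters):
    """Fraction of cache lookups that were hits, or None if there were none."""
    hits = counters.get("cache_hit", 0)
    misses = counters.get("cache_miss", 0)
    total = hits + misses
    return hits / total if total else None


if __name__ == "__main__":
    print(hit_ratio(parse_varnishstat(SAMPLE)))
```

On a cache host the same functions could be fed the real output, for example by piping varnishstat -1 into the script.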

Set runtime parameters

Run

# varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082

Add/remove backends

Puppet can automatically generate Varnish backend and director statements by setting the $varnish_backends and $varnish_directors variables (see below).

Alternatively, backends can be defined manually in the templates/varnish/wikimedia.vcl.erb ERB template file.

See request logs

As explained below, there are no access logs. However, you can see NCSA style log entries for current requests using:

# varnishncsa

See backend health

Run

# varnishlog -i Backend_health -O

Some more tricks

// Query Times
varnishncsa -F '%t %{VCL_Log:Backend}x %Dμs %bB %s %{Varnish:hitmiss}x "%r"'

// Top URLs
varnishtop -i RxURL

// Top Referer, User-Agent, etc.
varnishtop -i RxHeader -I Referer
varnishtop -i RxHeader -I User-Agent

// Cache Misses
varnishtop -i TxURL
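
The output of the "Query Times" format above can be filtered for slow requests with a few lines of Python. This is a rough sketch, assuming each line follows that exact format string; the sample lines, the threshold, and the helper names parse_line and slow_requests are invented for illustration. The parser tolerates "-" for the byte count and backend as a defensive assumption.

```python
import re

# Unit suffix that the format string above appends to %D (microseconds).
USEC = "μs"

# Fields: [time] backend <usec>μs <bytes>B status hit/miss "request line"
LINE_RE = re.compile(
    r'^\[(?P<time>[^\]]+)\] (?P<backend>\S+) (?P<usec>\d+)' + re.escape(USEC) +
    r' (?P<size>\d+|-)B (?P<status>\d{3}) (?P<hitmiss>\S+) "(?P<request>.*)"$'
)

# Made-up sample lines, not real traffic.
SAMPLE_LINES = [
    f'[12/Feb/2014:10:00:00 +0000] appservers 1532{USEC} 10234B 200 miss "GET /wiki/Foo HTTP/1.1"',
    f'[12/Feb/2014:10:00:01 +0000] - 85{USEC} -B 304 hit "GET /wiki/Bar HTTP/1.1"',
]


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["usec"] = int(record["usec"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    record["status"] = int(record["status"])
    return record


def slow_requests(lines, threshold_usec):
    """Keep only parsed records that took at least threshold_usec microseconds."""
    records = (parse_line(line) for line in lines)
    return [r for r in records if r and r["usec"] >= threshold_usec]


if __name__ == "__main__":
    for record in slow_requests(SAMPLE_LINES, 1000):
        print(record["usec"], record["backend"], record["request"])
```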

Configuration

Varnish is deployed with Puppet, through the varnish class in manifests/varnish.pp.

We use the varnish package of Ubuntu/Debian, minimum version 2.1.2. Puppet installs this, and replaces its /etc/default/varnish file to set some startup parameters (discussed below). Varnish uses a VCL file (Varnish Configuration Language), a DSL where Varnish behaviour is controlled using subroutines that are compiled into C and executed during each request. The Wikimedia VCL file is

/etc/varnish/wikimedia.vcl

As with Squid, Puppet installs special sysctl settings to tune the system for high HTTP traffic.

Varnish does not use log files, but instead writes detailed information about its operations to a SHM ring buffer of a fixed size. Any interested program can read along and produce statistics or log output without slowing down the Varnish daemon itself. The SHM file is mlocked in memory, but Linux insists on writing its buffers to disk anyway; Puppet therefore mounts the directory /var/lib/varnish, where the SHM file lives, as a 150M tmpfs filesystem to avoid this.

Startup parameters

The following parameters have been changed from the defaults. Most of these are set in the (now Puppet generated) file /etc/default/varnish.

NFILES=500000

For ulimit -n, the number of files/sockets/file descriptors Varnish can open

MEMLOCK=90000

For ulimit -l, so Varnish can lock the entire SHM log buffer (default 80M) into memory.
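
As a quick sanity check of that value: ulimit -l is expressed in units of 1024 bytes, so the limit can be compared against the log buffer size directly. The sketch below assumes that "80M" means 80 MiB; the helper name memlock_covers is made up for illustration.

```python
KIB = 1024


def memlock_covers(memlock_kib, shm_log_mib):
    """True if a `ulimit -l` value (in KiB) is at least the SHM log size (in MiB)."""
    return memlock_kib * KIB >= shm_log_mib * KIB * KIB


memlock_kib = 90000  # MEMLOCK from /etc/default/varnish
shm_log_mib = 80     # default SHM log size quoted above (assumed to be MiB)

# 80 MiB is 81920 KiB, which leaves 8080 KiB of headroom under the limit.
headroom_kib = memlock_kib - shm_log_mib * KIB
print(memlock_covers(memlock_kib, shm_log_mib), headroom_kib)
```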

ulimit -s 128

Sets the per-thread stack size to 128 KB, to reduce the VSIZE when Varnish runs many threads

VARNISH_VCL_CONF=/etc/varnish/wikimedia.vcl

Points to the Wikimedia specific VCL file

VARNISH_LISTEN_ADDRESS=
VARNISH_LISTEN_PORT=80

Bind to TCP port 80 on all IPs

VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1
VARNISH_ADMIN_LISTEN_PORT=6082

Listening socket for the administrative interface

VARNISH_MIN_THREADS=500
VARNISH_MAX_THREADS=8000

The minimum and maximum number of threads Varnish will keep around for requests, per thread pool.

VARNISH_STORAGE="malloc,1G"

For bits, which has a small content set, we want to keep everything in memory.

EXTRA_OPTS="-p thread_pools=8 -p thread_pool_add_delay=1 -p send_timeout=30 -p listen_depth=4096"

Extra runtime parameters, explained below:

thread_pools=8

One thread pool per CPU core; this reduces mutex contention.

thread_pool_add_delay=1

Create more threads quickly when needed

send_timeout=30

Keep the number of open connections low, and close idle connections quickly.

listen_depth=4096

Allow many new connections to wait in the accept() queue before Varnish can accept them.
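
Because thread_pools is tied to the number of CPU cores, the option string could be derived per host rather than hard-coded. The following is only a sketch of that idea, not how the Puppet manifest builds it; the function name build_extra_opts is hypothetical, and the other values are copied from EXTRA_OPTS above.

```python
import os


def build_extra_opts(cpu_cores=None):
    """Render -p options mirroring EXTRA_OPTS, with one thread pool per core."""
    if cpu_cores is None:
        # Fall back to a single pool if the core count cannot be determined.
        cpu_cores = os.cpu_count() or 1
    params = {
        "thread_pools": cpu_cores,
        "thread_pool_add_delay": 1,
        "send_timeout": 30,
        "listen_depth": 4096,
    }
    return " ".join(f"-p {name}={value}" for name, value in params.items())


if __name__ == "__main__":
    print(f'EXTRA_OPTS="{build_extra_opts()}"')
```

Called as build_extra_opts(8), this reproduces the string shown above for an 8-core machine.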

Puppet configuration

In Puppet, two variables can be defined:

$varnish_backends: an array of fully qualified hostnames that will be used to generate Varnish backends in the VCL, e.g. $varnish_backends = [ "srv191.pmtpa.wmnet", "srv192.pmtpa.wmnet" ]
$varnish_directors: a hash of director names to backends, e.g. $varnish_directors = { "appservers" => [ "srv191.pmtpa.wmnet", "srv192.pmtpa.wmnet" ] }
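
To make the effect of these two variables concrete, here is an illustrative sketch of the kind of VCL that such lists could expand to. It is written in Python rather than ERB and is not the actual template: the random director type, the weight of 1, backend port 80, and the names backend_name and render_vcl are all assumptions made for this example.

```python
import re


def backend_name(hostname):
    """Turn a hostname into a VCL-safe identifier (dots and dashes become underscores)."""
    return re.sub(r"[^A-Za-z0-9_]", "_", hostname)


def render_vcl(backends, directors, port=80):
    """Render backend and director statements from lists shaped like the Puppet variables."""
    lines = []
    for host in backends:
        lines += [
            f"backend {backend_name(host)} {{",
            f'    .host = "{host}";',
            f'    .port = "{port}";',  # assumed backend port
            "}",
        ]
    for name, members in directors.items():
        # Assumed director type: random, with equal weights.
        lines.append(f"director {name} random {{")
        for host in members:
            lines.append(f"    {{ .backend = {backend_name(host)}; .weight = 1; }}")
        lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    hosts = ["srv191.pmtpa.wmnet", "srv192.pmtpa.wmnet"]
    print(render_vcl(hosts, {"appservers": hosts}))
```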

You may also want to read Bits varnish testing for Domas's findings during a pilot project.

One-off purges

$ dsh -c -g bits varnishadm -T 127.0.0.1:6082 -S /etc/varnish/secret ban.url <your url here>

Don't do this. Consult a varnish specialist first.

< bblack> (it's generally a varnishadm ban command, but the syntax/caveats are not exactly intuitive)
< bblack> as in, figure out the right ban command in https://www.varnish-cache.org/docs/3.0/tutorial/purging.html , then push that around to the correct cluster of caches via salt
< bblack> the commands I did were:
< bblack> root@palladium:~# salt -G 'cluster:cache_text' cmd.run 'varnishadm -S /etc/varnish/secret -T 127.0.0.1:6083 ban req.http.host == "w.wiki"'
< bblack> and then repeat, but change :6083 to :6082 (to hit frontends after fixing on backends)
< bblack> and variations of course: cluster:cache_* or cluster:cache_mobile, etc... and the ban expression can be complex, a lot of the same things available in req/resp objects in VCL in general

Mobile

When mobile wants you to 'clear the varnish cache', you should read MobileFrontend#Flushing_the_cache instead of this page.

Things that need special consideration

HTCP purging
Immediate purging of cache objects (nuke?)
Header normalization (Host, Accept-Encoding...)
Two-layer setup (CARP style)
Compatible logging
Request stats

Would be nice

SSL
IPv6

Many of these are probably already taken care of by our friends at Wikia, and therefore possibly also within Varnish itself...

External links

Varnish main web site
OSCON presentation about Varnish at Wikia by Artur Bergman, containing useful information about performance tuning and some neat features as well.
HTTP headers and their treatment in different stages -- the ones with HTTPH_R_FETCH are the ones varnish filters out when copying req.* to bereq.*
Diagram of Varnish request processing