Multicast HTCP purging
This page was heavily edited by RobLa on 2013-01-28, and could use a review by a knowledgeable opsen
- The MediaWiki instance in eqiad detects that a purge is needed. It sends an HTCP purge request to a multicast group for each individual URI that needs to be purged.
- Native multicast routing is enabled in eqiad and pmtpa, and multicast packets should route natively between the two datacenters.
- Multicast is sent to esams via a multicast->unicast->multicast relay located in eqiad (as of 2013-11-04).
- All Squid/Varnish caches subscribe to the multicast feed.
Note that multicast HTCP is a one-way, fire-and-forget protocol: requests are fired off with no acknowledgement. If there is a problem anywhere in the system, the HTCP origin has no way of knowing about the failure, and thus assumes the request went through.
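The fire-and-forget send can be sketched as a short Python script that builds an HTCP CLR datagram (RFC 2756) and fires it at the multicast group. This is a simplified illustration, not the production code: the packet layout mirrors MediaWiki's simplified HTCP encoding, and the group address, port, and TTL are the values described on this page.

```python
import socket
import struct

def build_htcp_clr(url, trans_id=0):
    """Build a minimal HTCP CLR datagram (RFC 2756).

    A sketch of the simplified layout: 4-byte HTCP header, a DATA
    section carrying opcode CLR and a length-prefixed specifier
    (METHOD, URI, VERSION, empty REQ-HDRS), then an empty AUTH block.
    """
    url = url.encode()
    specifier = (
        struct.pack('!H', 4) + b'HEAD' +            # METHOD
        struct.pack('!H', len(url)) + url +         # URI
        struct.pack('!H', 8) + b'HTTP/1.0' +        # VERSION
        struct.pack('!H', 0)                        # empty REQ-HDRS
    )
    data_len = 8 + 2 + len(specifier)   # DATA header + RESPONSE word + specifier
    htcp_len = 4 + data_len + 2         # HTCP header + DATA + empty AUTH
    return (
        # LENGTH, MAJOR, MINOR; then DATA: LENGTH, OPCODE (4 = CLR), TRANS-ID
        struct.pack('!H2xHBxI2x', htcp_len, data_len, 4, trans_id) +
        specifier +
        struct.pack('!H', 2)            # empty AUTH block (just its own length)
    )

def send_purge(url, group='239.128.0.112', port=4827, ttl=2):
    """Fire-and-forget: one CLR datagram per URL, no reply expected."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    sock.sendto(build_htcp_clr(url), (group, port))
    sock.close()
```

Because UDP send is non-blocking and connectionless, the sender finishes immediately whether or not any cache actually received the purge, which is exactly the one-way behaviour described above.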
HTCP modifications to Squid
Mark Bergsma modified the HTCP support in Squid to do the following:
- work without requiring HTCP CLR responses
- work at all when not requesting HTCP CLR responses
- use a different store searching algorithm instead of htcpCheckHit(), which was intended for finding cache entries for URI hits instead of URI purges
- allow the simultaneous removal of both HEAD and GET entries with a single HTCP request, by specifying NONE as the HTTP method
The Squids are all configured with a line in their configuration that has them join the relevant multicast group and receive all the purge requests.
Varnish relies on a separate listener daemon (varnishhtcpd) to listen for purge requests and respond to them.
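A varnishhtcpd-style listener can be sketched as follows: join the purge multicast group, pull the URL out of each HTCP CLR datagram, and (in the real daemon) issue a purge to the local Varnish. This is a Python illustration, not the actual varnishhtcpd implementation; the 14-byte header offset matches the simplified HTCP layout, and the group address is an assumption.

```python
import socket
import struct

GROUP, PORT = '239.128.0.112', 4827   # purge group/port as assumed on this page

def extract_url(pkt):
    """Pull the URI out of a simplified HTCP CLR datagram: skip the
    4-byte HTCP header and 10-byte DATA header, then read the
    length-prefixed METHOD followed by the length-prefixed URI."""
    off = 14
    mlen = struct.unpack_from('!H', pkt, off)[0]
    off += 2 + mlen                    # skip METHOD
    ulen = struct.unpack_from('!H', pkt, off)[0]
    off += 2
    return pkt[off:off + ulen].decode()

def listen_for_purges(handle):
    """Join the multicast group and pass each purged URL to handle()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('', PORT))
    mreq = socket.inet_aton(GROUP) + socket.inet_aton('0.0.0.0')
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    while True:
        pkt, _ = sock.recvfrom(2048)
        handle(extract_url(pkt))       # real daemon: HTTP PURGE to Varnish
```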
MediaWiki was extended with a SquidUpdate::HTCPPurge method that takes an HTCP multicast group address, an HTCP port number, and a multicast TTL (see DefaultSettings.php), and sends every URL that needs purging to that group. It cannot make use of persistent sockets, but the overhead of setting up a UDP socket is minimal, and it does not have to worry about handling responses.
All Apaches are configured through CommonSettings.php to send HTCP purge requests to the multicast group address 239.128.0.112. They use a multicast Time To Live of 2 (instead of the default of 1) because the messages need to cross a single subnet/router boundary.
udpmcast is a small application-level multicast relay tool written in Python. It joins a given multicast group on startup, listens on a specified UDP port, and then forwards all received packets to a given set of (unicast or multicast) destinations.
Its options can be found by running it with the -h argument.
As of November 2013, chromium is running udpmcast via /etc/rc.local and forwarding to dobson. The group is 239.128.0.112, port 4827.
udpmcast.py supports forwarding rules, where it selects the destination address list based on the source address that sent the packet. These forward rules can be specified as a Python dictionary on the command line.
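The source-based forwarding can be sketched as a small relay loop. The dictionary shape below (source address mapped to a destination list, with None as a catch-all) is an assumption about udpmcast's rule semantics, and all addresses are illustrative, not real hosts.

```python
import socket

# Hypothetical rules: packets from 10.64.0.10 go to one relay,
# everything else falls through to the default (None) destination list.
RULES = {
    '10.64.0.10': [('10.0.0.20', 4827)],
    None: [('10.0.0.21', 4827)],          # catch-all destinations
}

def pick_destinations(rules, src_addr):
    """Select the destination list for a packet's source address,
    falling back to the None (catch-all) entry if present."""
    return rules.get(src_addr, rules.get(None, []))

def relay(rules, listen_port=4827):
    """Forward every received datagram to the selected destinations."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(('', listen_port))
    while True:
        data, (src, _) = sock.recvfrom(2048)
        for dest in pick_destinations(rules, src):
            sock.sendto(data, dest)
```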
Multicast breakage troubleshooting
(current as of November 2013)
This section covers troubleshooting the UDP multicast-to-unicast proxy that enables purges to work in pmtpa.
First, tcpdump on chromium:
tcpdump -n -v udp port 4827 and host 239.128.0.112
Is there a large amount of traffic? If yes, the network on the eqiad side is fine. If no, the problem is on the eqiad side.
If there is a lot of traffic, then tcpdump on hooft:
tcpdump -n -v udp port 4827 and host 239.128.0.112
Do you see a large amount of traffic? If yes, the network is fine. For the rest of this walkthrough, let's say that chromium has no traffic.
After that, make sure it is listening:
root@chromium:/var/log# netstat -nl | grep 4827
udp        0      0 0.0.0.0:4827            0.0.0.0:*
Then check whether chromium can receive multicast traffic on the correct group. Start iperf as a server on chromium:
iperf -s -B 239.128.0.112 -u -p 1337 -i 5
Then go to a Varnish machine (such as cp1041) and start iperf as a client:
iperf -c 239.128.0.112 -b 50K -t 300 -T 5 -u -p 1337 -i 5
Notice the port is NOT one used by a real service. This is important: the test traffic must not be mistaken for, or interfere with, real purge traffic.
You should see output on chromium like:
root@chromium:~# iperf -s -B 239.128.0.112 -u -p 1337 -i 5
------------------------------------------------------------
Server listening on UDP port 1337
Binding to local address 239.128.0.112
Joining multicast group 239.128.0.112
Receiving 1470 byte datagrams
UDP buffer size:  122 KByte (default)
------------------------------------------------------------
[  3] local 239.128.0.112 port 1337 connected with 10.64.0.169 port 8442
[ ID] Interval       Transfer     Bandwidth        Jitter   Lost/Total Datagrams
[  3]  0.0- 5.0 sec  30.1 KBytes  49.4 Kbits/sec   0.038 ms    0/   21 (0%)
[  3]  5.0-10.0 sec  30.1 KBytes  49.4 Kbits/sec   0.025 ms    0/   21 (0%)
[  3] 10.0-15.0 sec  30.1 KBytes  49.4 Kbits/sec   0.023 ms    0/   21 (0%)
If you do not, multicast has gone wrong.
Try this step again with a different group address. If it still does not work, multicast routing is broken between the datacenters.
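If iperf is not available, a minimal loopback smoke test of a host's multicast stack can be sketched in Python. It only proves the local kernel can join a group and loop a datagram back to itself, not that multicast routing between hosts or datacenters works; the group and port are arbitrary test values, not a real service.

```python
import socket
import struct

def multicast_selftest(group='239.128.0.112', port=1337, timeout=5.0):
    """Send one datagram to a multicast group over loopback and check
    that a joined receiver on the same host gets it back."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    rx.bind(('', port))
    # ip_mreq: group address + local interface (loopback)
    mreq = socket.inet_aton(group) + socket.inet_aton('127.0.0.1')
    rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    rx.settimeout(timeout)

    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                  socket.inet_aton('127.0.0.1'))
    tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
    tx.sendto(b'htcp-selftest', (group, port))
    try:
        data, _ = rx.recvfrom(1500)
        return data == b'htcp-selftest'
    except socket.timeout:
        return False
    finally:
        rx.close()
        tx.close()
```

A False result here points at the local kernel or firewall configuration rather than the inter-datacenter path.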
Purge a URL
On terbium, run:
$ echo 'https://example.org/foo?x=y' | mwscript purgeList.php
Previous methods of Squid purging implemented in MediaWiki, SquidUpdate::purge and SquidUpdate::fastPurge, used HTTP PURGE requests over unicast TCP connections from all Apaches to all Squids. This had a few drawbacks:
- All Apaches needed to be able to connect to all Squids
- Handling Squid's replies and setting up TCP connections added overhead
The biggest drawback was that it was simply slow: profiling runs showed the current multicast method to be about 8000 times faster than the older fastPurge method.