20:14 logmsgbot: ori Synchronized php-1.26wmf16/includes/objectcache/ObjectCacheSessionHandler.php: Uncommitted revert of I4afaecd to test impact on T102199 (duration: 00m 12s)
20:11 godog: revert to openjdk8 and restart cassandra on restbase1008
19:55 logmsgbot: ori Synchronized php-1.26wmf16/includes/User.php: More debug logging for T102199 (duration: 00m 13s)
19:54 godog: revert to openjdk8 and restart cassandra on restbase1007
19:51 logmsgbot: ori Synchronized php-1.26wmf16/includes/EditPage.php: More debug logging for T102199 (duration: 00m 12s)
19:21 godog: revert to openjdk8 and restart cassandra on restbase1006
19:02 godog: revert to openjdk8 and restart cassandra on restbase1005
18:44 twentyafterfour: oddly, the symptom was that there were logs about apc cache entries that had been on the GC queue for too long, I guess this is due to phd being stuck
18:43 twentyafterfour: restarted phd on iridium. I had to forcefully kill one stuck repository worker to get the daemons to restart properly.
18:36 godog: revert to openjdk8 and restart cassandra on restbase1004
08:19 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I7be6dd2f5: Set $wgAjaxEditStash to false, on suspicion of being implicated in T102199 (duration: 00m 12s)
07:35 _joe_: powercycling analytics1013, no ssh, console unresponsive
04:45 logmsgbot: @tin ResourceLoader cache refresh completed at Fri Jul 31 04:45:41 UTC 2015 (duration 45m 40s)
22:57 awight: updating paymentswiki from 6854683083cabc730f37b6a79d559f23e7ff7b0f to 02db5f7f77b667da06b882b2f66de9c5546230bc
22:43 awight: paymentswiki config rolled back
22:42 awight: paymentswiki: config the IIIrd
22:34 awight: paymentswiki: rolled back again
22:31 awight: redeploying paymentswiki config: with password this time
22:21 awight: rolled back paymentswiki config
22:01 logmsgbot: ori Synchronized php-1.26wmf16/includes/page/WikiPage.php: I73fba15c26c1: Defer the InfoAction purge in onArticleEdit() (duration: 00m 11s)
21:58 awight: paymentswiki config: jiggle the handle
21:42 awight: updated paymentswiki from fd0060bf86777ee6b7acd205d134066356da69e8 to 6854683083cabc730f37b6a79d559f23e7ff7b0f
21:06 logmsgbot: ori Synchronized php-1.26wmf16/includes/Message.php: c72b7c435f: Debug logging for T102199 (take 2) (duration: 00m 11s)
21:06 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I1bbf3f0: Add a debug log channel for bug T102199 (duration: 00m 12s)
15:02 gwicke: manually cleaned up RB code on 1007 and 1008
14:37 moritzm: installed openjdk security updates on analytics*
14:05 moritzm: restarted opendj on nembus/neptunium to effect OpenJDK security updates
13:44 godog: downgrade openjdk-7-jre on restbase1007, nodetool flush and cassandra restart
13:39 godog: downgrade openjdk-7-jre on restbase1006, nodetool flush and cassandra restart
13:29 godog: downgrade openjdk-7-jre on restbase1005, nodetool flush and cassandra restart
13:25 moritzm: installed openjdk updates on gallium, restarting jenkins
13:17 godog: downgrade openjdk-7-jre on restbase1004, nodetool flush and cassandra restart
13:02 godog: downgrade openjdk-7-jre on restbase1003, nodetool flush and cassandra restart
12:47 godog: downgrade openjdk-7-jre on restbase1002, nodetool flush and cassandra restart
12:36 godog: downgrade openjdk-7-jre on restbase1001, nodetool flush and cassandra restart
09:18 hashar: Upgraded Zuul on all CI slaves. Should be a noop for zuul-cloner.
07:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 30 07:10:39 UTC 2015 (duration 10m 38s)
04:06 Krenair: Ignore that last error
04:05 logmsgbot: LocalisationUpdate failed: git pull of core failed
03:33 mutante: killing processes by ellery on stat1002 - load avg was over 1500 and users reported pagecounts are broken (possibly all other crons as well)
03:01 logmsgbot: LocalisationUpdate completed (1.26wmf16) at 2015-07-30 03:01:49+00:00
02:59 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 04m 25s)
02:40 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-30 02:40:38+00:00
02:36 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 07m 45s)
02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 30 02:07:40 UTC 2015 (duration 7m 39s)
02:03 logmsgbot: LocalisationUpdate failed (1.26wmf16) at 2015-07-30 02:03:29+00:00
02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-30 02:03:29+00:00
01:30 springle: MIMEsearchPage::reallyDoQuery queries with crazy eg, LIMIT 10405000,501, on commonswiki vslow slave, from tide***.microsoft.com bots. log noise is queries hitting 5min limit and auto-killed
00:48 logmsgbot: ori Synchronized php-1.26wmf15/includes/Message.php: 160f69871c: Debug logging for T102199 (duration: 00m 13s)
00:36 logmsgbot: ori Synchronized php-1.26wmf16/includes/Message.php: eb281630ce: Debug logging for T102199 (duration: 00m 11s)
00:10 awight: rolled back config
00:09 awight: crazy previous message was all about: I pointed the DonationInterface frontends to mirror limbo messages to a Redis server on localhost.
04:00 logmsgbot: demon Synchronized database lists: moving special wikipedias to wikipedia.dblist (duration: 00m 13s)
04:00 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: moving special wikipedias to wikipedia.dblist (duration: 00m 12s)
03:25 springle: upgrade reboot db1011 trusty
03:15 logmsgbot: LocalisationUpdate completed (1.26wmf16) at 2015-07-29 03:15:56+00:00
03:09 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 10m 47s)
02:43 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-29 02:43:27+00:00
02:37 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 10m 08s)
02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 29 02:07:17 UTC 2015 (duration 7m 16s)
02:03 logmsgbot: LocalisationUpdate failed (1.26wmf16) at 2015-07-29 02:03:04+00:00
02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-29 02:03:03+00:00
00:43 logmsgbot: ori Synchronized php-1.26wmf15/extensions/AbuseFilter: Revert "Revert "Conversion to using getMainStashInstance()"" (duration: 00m 12s)
00:02 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Iccd317c6: Switch over the 'sessions' ObjectCache to nutcracker (T106986) (duration: 00m 13s)
00:01 ori: Switching over the sessions ObjectCache instance to use nutcracker. Users with an existing edit session in progress will have their session reset and will need to re-login.
2015-07-28
23:50 logmsgbot: ori Synchronized php-1.26wmf15/includes/objectcache/RedisBagOStuff.php: I3812ec5a0b: RedisBagOStuff: if no alternatives, skip master link status check (duration: 00m 12s)
23:50 logmsgbot: ori Synchronized php-1.26wmf16/includes/objectcache/RedisBagOStuff.php: I3812ec5a0b: RedisBagOStuff: if no alternatives, skip master link status check (duration: 00m 12s)
23:36 bblack: rebooting cp20xx.codfw.wmnet for kernel updates (downtimed)
18:33 andrewbogott: disabling puppet and nova-network on labnet1002 to avoid possible conflict between two different dhcp servers
17:04 godog: start cassandra on restbase1007, tentative bootstrap
16:24 YuviPanda: bounced create-dbusers on labstore1002
16:03 bd808: logstash1002 conversion to jessie done; log event volume returning to normal in index
16:01 godog: bounce cassandra on xenon to test logstash logging
15:52 bd808: installed logstash on logstash1002; forced puppet run
15:03 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor for 5% of new accounts on enwiki gerrit:226338 (duration: 00m 12s)
14:43 cmjohnson1: powering down logstash1002 to remove disk and install jessie
14:28 moritzm: restarted zookeeper on conf1003 to effect OpenJDK security update
14:16 _joe_: re-enabled puppet on mw1152 for testing
14:16 moritzm: restarted zookeeper on conf1002 to effect OpenJDK security update
13:58 paravoid: upgrading baham to gdnsd 2.2.0
13:41 _joe_: disabled puppet on mw1152, thumb_handler testing
13:40 moritzm: restarted zookeeper on conf1001 to effect OpenJDK security update
13:13 jynus: temporarily changing master of db1069(s1) to db1051 in order to fix some labsdb inconsistencies on enwiki_p
23:23 logmsgbot: catrope Synchronized w/static/images/project-logos/suwikiquote.png: Localized logo for suwikiquote (duration: 00m 12s)
23:17 ejegg: updated crm from 83cacfa1e0852ffaf47d2f02e7d843cf6f3bcda4 to db417a28a247a3fdf3e3023a700d6266e04f3e9d
22:19 andrewbogott: rebooting labvirt1005
21:50 bd808: updated scap to dc8eda5 (Don't exclude PHP files from being synced)
21:34 logmsgbot: ori Synchronized php-1.26wmf15/extensions/AbuseFilter: I13d29ea6: Revert "Conversion to using getMainStashInstance()" (duration: 00m 12s)
21:24 andrewbogott: rebooting labnet1002, just to see if I can
19:40 awight: updated DjangoBannerStats from 3db799dc8705c728c7261ae433e8197f5498fa1b to 57a0392b3f43b65050b01a0465e120ed609a769e
19:08 YuviPanda: remove others20150724183453 on labstore1002
18:39 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ib7c7861e: Point to a no-op /beacon URL rather than Special:RecordImpression (duration: 00m 12s)
18:38 ori: Merging Ib7c7861e: Point to a no-op /beacon URL rather than Special:RecordImpression
18:30 ori: Depooled Precise image scalers (mw1159 and mw1160)
18:29 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Idfe1fa60: testwiki: Point to a no-op /beacon URL rather than Special:RecordImpression (duration: 00m 12s)
18:17 YuviPanda: removed labstore/others20150724 on labstore1002
18:15 YuviPanda: running others20150724 on labstore1002
16:51 bd808: Upgraded logstash1006 to elasticsearch 1.7.0
16:48 bd808: Upgraded logstash1005 to elasticsearch 1.7.0
16:36 bd808: Upgraded logstash1004 to elasticsearch 1.7.0
16:27 bd808: Upgraded logstash1003 to elasticsearch 1.7.0
16:26 bd808: Upgraded logstash1002 to elasticsearch 1.7.0
16:25 bd808: Upgraded logstash1001 to elasticsearch 1.7.0
13:44 cmjohnson1: swapping failed disk db1058
13:11 cmjohnson1: swapping ssds in restbase1007
12:47 hashar: restarting Jenkins
12:47 hashar: Jenkins: switching gearman plugin from our custom compiled 0.1.1-9-g08e9c42-change_192429_2 to upstream 0.1.2. They are actually the exact same versions.
10:23 logmsgbot: legoktm Synchronized php-1.26wmf15/extensions/AbuseFilter/: Special:AbuseFilter on all large Wikipedias is returning errors - T106798 (duration: 00m 13s)
05:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 24 05:53:16 UTC 2015 (duration 53m 15s)
05:52 Krinkle: Added rl-test.php on testwiki (mw1017) to gather stats about cache-control rollover (Catrope, Krinkle). Used by testwiki/test2wiki/mediawikiwiki Common.js (sampled). See T105255.
02:29 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-24 02:29:25+00:00
02:26 urandom: restarting restbase on restbase1006
02:25 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 07m 12s)
02:06 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 24 02:06:41 UTC 2015 (duration 6m 40s)
02:02 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-24 02:02:31+00:00
23:09 ori: T84842: Requests to thumb_handler.php/.* don't match the ProxyPass rule and get handled by Zend instead. To see how HHVM actually handles these requests, I'm disabling Puppet on mw1153 and dropping the '$' anchor from the ProxyPass rules.
23:02 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: Enable geo feature usage tracking on all wikis (duration: 00m 12s)
21:19 hashar: is already a nice improvement
20:33 twentyafterfour: deployed hotfix for T106716, restarted apache on iridium
15:26 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Add wgSitename and wgMetaNamespace for pnbwiki gerrit:226543 (duration: 00m 12s)
15:15 logmsgbot: thcipriani Synchronized wmf-config/CommonSettings.php: SWAT: Set a different wmgContentTranslationDefaultSourceLanguage for English part II gerrit:224031 (duration: 00m 12s)
15:14 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Set a different wmgContentTranslationDefaultSourceLanguage for English part I gerrit:224031 (duration: 00m 13s)
15:04 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Add wgSitename and wgMetaNamespace for pnbwikipedia gerrit:225322 (duration: 00m 12s)
13:08 mobrovac: graphoid deploying 81b9633
10:56 jynus: disabling puppet on maps-test hosts to debug service issue
07:28 _joe_: upgrading hhvm on the canary appservers
06:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 23 06:59:44 UTC 2015 (duration 59m 43s)
23:56 cwdent: updated civicrm from 292ad137f6b3ffc818a3bd617ca4f335931091f3 to 83cacfa1e0852ffaf47d2f02e7d843cf6f3bcda4
23:55 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: re-try reverted portion of https://gerrit.wikimedia.org/r/#/c/118654/ using NS IDs instead of not-necessarily-defined constants which were causing warning flood (duration: 00m 13s)
22:09 Reedy: running in screen as reedy on tin foreachwikiindblist wikidataclient.dblist extensions/Wikidata/extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https
16:20 logmsgbot: thcipriani Synchronized php-1.26wmf14/extensions/Wikidata: SWAT: Update Wikibase: Add api featureLog for ungroupedlist param gerrit:226086 (duration: 00m 20s)
16:01 logmsgbot: thcipriani Synchronized php-1.26wmf13/extensions/Wikidata: SWAT: Update Wikibase: Add api featureLog for ungroupedlist param gerrit:226086 (duration: 00m 20s)
15:37 godog: cleanup ganglia temp files on uranium
15:34 logmsgbot: thcipriani Synchronized php-1.26wmf14/includes/filerepo/file/File.php: SWAT: Thumbnail logging and stats part II gerrit:225936 (duration: 00m 12s)
15:34 logmsgbot: thcipriani Synchronized php-1.26wmf14/thumb.php: SWAT: Thumbnail logging and stats part I gerrit:225936 (duration: 00m 12s)
15:29 logmsgbot: thcipriani Synchronized php-1.26wmf14/includes/filerepo/file/File.php: SWAT: Thumbnail logging and stats part II gerrit:225936 (duration: 00m 13s)
15:28 logmsgbot: thcipriani Synchronized php-1.26wmf14/thumb.php: SWAT: Thumbnail logging and stats part I gerrit:225936 (duration: 00m 11s)
15:20 cmjohnson1: re-installing mw1090
15:12 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Offer 400px as a thumbnail size available in Special:Preferences gerrit:226051 (duration: 00m 12s)
19:10 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: Ic0573f26: Follow-up for I189d748: whitelist 'archive.org' too (duration: 00m 12s)
19:06 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I189d748a: Whitelist *.archive.org for wgCopyUploadsDomains (T106293) (duration: 00m 13s)
18:29 logmsgbot: hoo Synchronized wmf-config/CommonSettings.php: Enable IP user page creation on fawiki's Draft ns (duration: 00m 11s)
18:18 logmsgbot: ori Synchronized php-1.26wmf14/includes/site/SiteSQLStore.php: I0e5f2d3b2: Use CACHE_ACCEL for SiteLists if on HHVM (duration: 00m 12s)
21:09 logmsgbot: twentyafterfour Synchronized php-1.26wmf14: Really Sync If0237cdd0d66634d75b2bab8bc4292c0f3ef75ef this time (duration: 01m 32s)
20:41 bblack: restarted salt-master service on palladium
20:33 bblack: globally cleaning up dangling symlinks left in /etc/certs from before Id7d2447 via salted 'find /etc/ssl/certs -type l -xtype l|xargs rm'
20:30 logmsgbot: twentyafterfour Synchronized php-1.26wmf14: Sync If0237cdd0d66634d75b2bab8bc4292c0f3ef75ef (revert Count API module instantiations and Hook runs) (duration: 01m 48s)
03:14 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-15 03:14:21+00:00
03:10 logmsgbot: reedy Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 13m 32s)
03:03 manybubbles: es1.6 upgrade: raised limits on shard migration rate - should speed up the restart. we should lower it before we do restarts during europe's morning
02:10 Reedy: Running LU manually to see what's wrong with it
02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 15 02:07:48 UTC 2015 (duration 7m 47s)
02:02 logmsgbot: LocalisationUpdate failed (1.26wmf13) at 2015-07-15 02:02:55+00:00
07:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 14 07:09:10 UTC 2015 (duration 9m 9s)
06:48 dcausse: es1.6 step 6: upgrade elastic1005
06:41 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I9c9bf0f4: Use LCStoreStaticArray unconditionally (duration: 03m 02s)
05:26 ori: Cleaned up now-unused hhbc files from /run/hhvm/cache on job runners
04:58 ori: Enabling LCStoreStaticArray in production. May be reverted by running: 'salt -G deployment_target:scap/scap cmd.run "rm /etc/lcstore"' on palladium.
04:48 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Follow-up for Ieb62ee050e: allow LCStoreStaticArray in server mode (duration: 00m 13s)
02:35 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-14 02:35:21+00:00
02:31 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 07m 27s)
02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 14 02:07:32 UTC 2015 (duration 7m 30s)
02:02 logmsgbot: LocalisationUpdate failed (1.26wmf13) at 2015-07-14 02:02:33+00:00
22:16 logmsgbot: legoktm Synchronized php-1.26wmf13/includes/User.php: Add 'AuthPluginStrict' log to identify users who are unable to authenticate (duration: 00m 13s)
22:15 logmsgbot: legoktm Synchronized php-1.26wmf13/includes/api/ApiMain.php: Revert "Revert "Revert Count API module instantiations and Hook runs"" (duration: 00m 12s)
22:15 logmsgbot: legoktm Synchronized php-1.26wmf13/includes/Hooks.php: Revert "Revert "Revert Count API module instantiations and Hook runs"" (duration: 00m 13s)
22:13 ejegg: updated payments from ec34ebf61e5962f66b807abdcb519ff323d41e8e to 4ca95d55a9745c05ccfbb16ee6f23a6f75328824
15:09 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT enable footer contact link on ukwiki (duration: 00m 11s)
14:55 manybubbles_: after upgrading elasticsearch its init script no longer shuts down the old version of elasticsearch. so you have to manually kill it. that means the upgrade instructions will be "special" this time around. hopefully this is a one time thing.
14:45 manybubbles_: es1.6 step 1: upgrade elasticsearch on elastic1001 -starting
14:45 manybubbles_: es1.6 step 0: successfully synced new versions of plugins
14:30 manybubbles_: es1.6 step 0: sync new versions of plugins
14:30 manybubbles_: starting the elasticsearch 1.6.0 upgrade
08:51 godog: bounce carbon daemons on graphite1001
08:50 godog: upgrade graphite to 0.9.13 on graphite1001 and bounce one instance of carbon/cache
07:29 logmsgbot: ori Synchronized php-1.26wmf13/includes/cache/LCStoreStaticArray.php: I3f63594a4: Fix variable name (follows Ib2c5856d) (duration: 00m 11s)
06:25 logmsgbot: LocalisationUpdate failed: git pull of core failed
06:24 ori: Experimenting with altering the localisation cache implementation for testwiki, operations/mediawiki-config on tin will have a local hack for a little bit
05:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 13 05:07:32 UTC 2015 (duration 7m 31s)
02:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 13 02:25:58 UTC 2015 (duration 25m 57s)
02:23 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-13 02:23:43+00:00
02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 06m 16s)
02:10 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-13 02:10:25+00:00
02:10 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 00m 34s)
01:47 springle: restarted labsdb1002 mysqld while troubleshooting replication
04:49 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 12 04:49:08 UTC 2015 (duration 49m 7s)
02:26 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-12 02:26:52+00:00
02:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 12 02:25:33 UTC 2015 (duration 25m 32s)
02:23 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 06m 12s)
02:10 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-12 02:10:00+00:00
02:09 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 00m 34s)
2015-07-11
19:48 jynus: stopping labsdb1002 after table corruption has been detected
19:37 urandom: from restbase1002, starting revision culling process (node thin_out_key_rev_value_data.js `hostname -i` local_group_wikimedia_T_parsoid_html 2>&1 | tee >(gzip -c > local_group_wikimedia_T_parsoid_html.log.`date +%s`.gz))
19:33 urandom: restbase: setting gc_grace_seconds to 604800 (1 week) on local_group_wikipedia_T_parsoid_html.data
04:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 11 04:55:56 UTC 2015 (duration 55m 55s)
04:21 bd808: Logstash cluster upgrade complete! Kibana working again
04:21 bd808: Upgraded Elasticsearch to 1.6.0 on logstash1006
04:12 bd808: rebooting logstash1006
04:06 bd808: logstash1005 fully recovered all shards
03:21 logmsgbot: mattflaschen Synchronized php-1.26wmf13/extensions/Flow/includes/Parsoid/Utils.php: Bump Flow to encode page name when sending to Parsoid (duration: 00m 13s)
02:28 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-11 02:28:18+00:00
02:25 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 06m 07s)
02:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 11 02:25:19 UTC 2015 (duration 25m 18s)
02:09 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-11 02:09:45+00:00
02:09 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 00m 35s)
00:46 bd808: Upgraded Elasticsearch to 1.6.0 on logstash1005; replicas recovering now
00:34 bd808: rebooting logstash1005
00:30 bd808: logstash1004 fully recovered all shards
2015-07-10
22:51 mutante: tendril: very short maintenance downtime
20:10 bd808: `service elasticsearch start` not starting on logstash1004; investigating
20:07 bd808: ran apt-get upgrade on logstash1004
19:52 mutante: adminbot - built and imported 1.7.10 into APT repo
19:43 bd808: rebooting logstash1004
19:40 bd808: Kibana seems to be broken by mixed 1.6.0/1.3.9 cluster
19:32 bd808: kibana not seeing indices after upgrading elasticsearch to 1.6.0; investigating
19:26 bd808: Upgraded logstash1003 to elasticsearch 1.6.0
19:22 bd808: Upgraded logstash1002 to elasticsearch 1.6.0
19:19 bd808: Upgraded logstash1001 to elasticsearch 1.6.0
22:09 logmsgbot: oblivian Synchronized wmf-config/PoolCounterSettings-eqiad.php: I don't think we want to keep poolcounter running on an imagescaler (duration: 00m 12s)
21:30 logmsgbot: tgr Synchronized php-1.26wmf13/extensions/OAuth/api/MWOAuthAPI.setup.php: no canonical redirects for requests with OAuth headers (duration: 00m 12s)
21:51 logmsgbot: manybubbles Synchronized php-1.26wmf12/extensions/CirrusSearch/: Stop some fatals in cirrus (duration: 00m 13s)
21:41 logmsgbot: bd808 Synchronized php-1.26wmf13/includes/api/ApiMain.php: Revert Count API module instantiations and Hook runs (2/2) (duration: 00m 12s)
21:40 logmsgbot: bd808 Synchronized php-1.26wmf13/includes/Hooks.php: Revert Count API module instantiations and Hook runs (1/2) (duration: 00m 12s)
21:39 logmsgbot: bd808 Synchronized php-1.26wmf13/extensions/CirrusSearch/includes/CirrusSearch.php: Suppress interwiki results when they would break (duration: 00m 12s)
02:31 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-08 02:31:24+00:00
02:16 logmsgbot: LocalisationUpdate completed (1.26wmf12) at 2015-07-08 02:16:50+00:00
02:16 logmsgbot: l10nupdate Synchronized php-1.26wmf12/cache/l10n: (no message) (duration: 00m 48s)
2015-07-07
23:54 jgage: kafka brokers 1018 & 1021 were demoted; i have triggered a leader election and they are leaders again
23:05 logmsgbot: catrope Synchronized visualeditor-default.dblist: Enable VE by default on labswiki (duration: 00m 12s)
21:56 hoo: Restarted hhvm on mw1003 "Fatal error: Function already defined: wmfLoadInitialiseSettings in /srv/mediawiki/wmf-config/CommonSettings.php on line 187"
11:49 logmsgbot: hoo Started scap: Update WikibaseQuality and WikibaseQualityConstraint
11:40 hoo: Created the `wbqc_constraints` table on wikidatawiki
09:02 _joe_: restarted the appserver on mw1059 with hhvm.server.apc.expire_on_sets = true, restarted the heap profiling to confirm my hypothesis on T104769
08:31 _joe_: restarted cassandra on rb1004. again.
22:34 logmsgbot: legoktm Synchronized php-1.26wmf12/extensions/CentralAuth/: Made use of new USE_MULTI_COMMIT flag in user merge jobs (duration: 00m 18s)
22:31 logmsgbot: legoktm Synchronized php-1.26wmf12/extensions/UserMerge/: Added USE_MULTI_COMMIT flag to enable query batching (duration: 00m 26s)
20:13 andrewbogott: restarted keystone on labcontrol1001
18:54 bd808: Running sync-common on mw1111; fatal log showed it to be running 1.26wmf9
18:30 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: all wikis to 1.26wmf12
18:02 YuviPanda: running exportfs -ra on labstore1002
16:40 bd808: Restarted logstash on logstash1001 due to OOM
16:05 bblack: cp1065 undowntimed/repooled
16:04 YuviPanda: clean out exports.d in labstore1002, will get regenerated. backup in /root/exports.backup
15:18 logmsgbot: anomie Synchronized php-1.26wmf12/extensions/Wikidata/: SWAT: Update Wikibase: SearchEntities return 'aliases' when not same as label gerrit:222311 (duration: 00m 20s)
15:18 YuviPanda: killed icinga-wm again
15:17 bblack: depooled cp1065 in pybal/puppet
14:57 mutante: restarting gitblit on antimony for the 123443th time
14:54 mutante: restarted apache on strontium
14:50 YuviPanda: killed icinga-wm for a bit
14:43 YuviPanda: kicked puppetmaster on palladium
14:28 YuviPanda: restarted apache on labcontrol1001
19:49 ori: mw1152 not actually re-pooled because of ongoing work on palladium. I'm undoing the change and hanging back now.
19:41 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf12
19:36 logmsgbot: twentyafterfour Synchronized php-1.26wmf12: sync 1.26wmf12 branch revert of "Implement support for Google reCAPTCHA 2.0" 90665a737bc25ff3c859044755d662c6cd700573 (duration: 02m 04s)
19:31 jynus: replication issues for shard s7 on dbstore2001 and dbstore2002, production applications *not* affected
03:00 logmsgbot: LocalisationUpdate completed (1.26wmf12) at 2015-07-01 03:00:21+00:00
02:53 logmsgbot: l10nupdate Synchronized php-1.26wmf12/cache/l10n: (no message) (duration: 10m 12s)
02:26 logmsgbot: LocalisationUpdate completed (1.26wmf11) at 2015-07-01 02:26:55+00:00
02:23 logmsgbot: l10nupdate Synchronized php-1.26wmf11/cache/l10n: (no message) (duration: 06m 50s)
02:12 springle: upgrade db1034 trusty
01:37 ori: Depooled mw1152. Req error dashboard shows elevated 5xx rates correlating with the server getting pooled, but the logs don't appear to corroborate it. Odd.
01:03 ori: Disabling Puppet on mw1152 for 12h to hack apache config to log locally
00:42 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I9a8018981: Double $wgMaxShellMemory on HHVM scalers (512 Mb => 1024 Mb) (duration: 00m 12s)
00:34 ori: pooled mw1152 (HHVM rendering) at weight 10 for testing
00:33 gwicke: rolling cassandra restart done
00:23 gwicke: starting rolling restart of cassandra nodes to apply new config
22:13 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: Flow-occupy Wikipedia talk namespace on cawiki (duration: 00m 11s)
22:09 matt_flaschen: Done converting wikitext namespace to Flow on Catalan Wikipedia
22:03 matt_flaschen: Started convertNamespaceFromWikitext.php for Project_talk on Catalan Wikipedia
21:46 RoanKattouw: Also ran populateContentModel.php --table=archive for talk namespaces on officewiki
21:45 RoanKattouw: Ran populateContentModel.php --table=archive --ns=5 on officewiki
21:29 RoanKattouw: Ran populateContentModel.php --table=page --ns=5 on cawiki
21:19 logmsgbot: catrope Synchronized php-1.26wmf12/extensions/Flow: (no message) (duration: 00m 14s)
21:19 logmsgbot: catrope Synchronized php-1.26wmf11/extensions/Flow: (no message) (duration: 00m 14s)
21:14 logmsgbot: catrope Synchronized php-1.26wmf12/extensions/Flow: (no message) (duration: 00m 14s)
21:14 logmsgbot: catrope Synchronized php-1.26wmf11/extensions/Flow: (no message) (duration: 00m 13s)
21:01 RoanKattouw: Running populateContentModel.php on officewiki for page table in namespaces occupied by Flow (1,3,5,7,9,11,13,15,91,93,101,111,113,829)
08:12 _joe_: adding conf1002 to the etcd cluster as a member
07:46 akosiaris: disabling ntp everywhere except selected hosts in anticipation of the leap second
04:51 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 29 04:51:48 UTC 2015 (duration 51m 47s)
03:08 jgage: jmxtrans filled disks on all kafka brokers, 21GB log files. removed logs and restarted services.
02:23 logmsgbot: LocalisationUpdate completed (1.26wmf11) at 2015-06-29 02:23:47+00:00
02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf11/cache/l10n: (no message) (duration: 05m 53s)
00:52 springle: restart eventlogging auto-purge on m4
00:51 springle: restart replication on dbstore2002
00:00 springle: pausing replication on dbstore2002
2015-06-28
23:51 logmsgbot: ori Synchronized php-1.26wmf11/extensions/CentralNotice/modules/ext.centralNotice.bannerController/bannerController.js: I6ffdc977e87: Parse older format of Geo cookies (duration: 00m 13s)
04:30 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 28 04:30:54 UTC 2015 (duration 30m 53s)
02:20 logmsgbot: LocalisationUpdate completed (1.26wmf11) at 2015-06-28 02:20:52+00:00
02:17 logmsgbot: l10nupdate Synchronized php-1.26wmf11/cache/l10n: (no message) (duration: 05m 56s)
2015-06-27
23:30 bd808: Deleted corrupt shards on logstash1004 and logstash1005. Recovery in progress
20:12 ori: Delegated full access to Google Webmaster Tools for myself (olivneh@).
04:58 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 27 04:58:46 UTC 2015 (duration 58m 45s)
02:23 logmsgbot: LocalisationUpdate completed (1.26wmf11) at 2015-06-27 02:23:40+00:00
02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf11/cache/l10n: (no message) (duration: 05m 46s)
2015-06-26
23:57 bd808: Logstash log ingestion working again after forcing recovery of replicas for logstash-2015.06.26; new logs were being rejected with only a primary shard available
23:54 bd808: re-enabled allocation on logstash elasticsearch cluster
23:05 bblack: restarted gitblit on antimony, AGAIN
22:57 mutante: restarted gitblit
22:43 logmsgbot: catrope Synchronized php-1.26wmf11/extensions/Flow: Temporarily make subpages in Flow-occupied namespaces non-Flow again (duration: 00m 14s)
22:36 bd808: set indices.recovery.concurrent_streams to 4 on logstash ES cluster
22:36 godog: set indices.recovery.max_bytes_per_sec to 10mb on logstash ES cluster
22:25 godog: set indices.recovery.max_bytes_per_sec to 50mb on logstash ES cluster
22:25 jamesofur: Reset email address of User:Chwms; identity verified in person at editathon
22:09 bd808: restarted logstash on logstash1001
21:10 urandom: taking xenon down to be rebootstrapped
20:10 bd808: Deleted 4 corrupt indices (logstash-2015.05.30 logstash-2015.05.31 logstash-2015.06.03 logstash-2015.06.06) on logstash1004
19:58 bd808: stopping elasticsearch on logstash1004 to cleanup corrupt shards
08:33 logmsgbot: ori Synchronized php-1.26wmf11/resources/src/mediawiki.skinning/elements.css: I0e5f2d3b2: Wrap lines in <pre> and .mw-code by default (duration: 00m 12s)
06:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 25 06:59:13 UTC 2015 (duration 59m 12s)
04:04 ori: restarted apache2 on palladium
03:11 logmsgbot: LocalisationUpdate completed (1.26wmf11) at 2015-06-25 03:11:01+00:00
03:04 logmsgbot: l10nupdate Synchronized php-1.26wmf11/cache/l10n: (no message) (duration: 10m 19s)
02:40 bblack: puppet re-enabled on caches
02:37 logmsgbot: LocalisationUpdate completed (1.26wmf10) at 2015-06-25 02:37:44+00:00
02:34 logmsgbot: l10nupdate Synchronized php-1.26wmf10/cache/l10n: (no message) (duration: 06m 44s)
02:04 bblack: disabling puppet on cp* caches for patch-testing
00:43 awight: update crm from bd8a00196071ddd04efbff7b30567dd9357c9000 to e923225e423948bd70440e2d1131460b10cefac1
00:38 godog: upgrade cassandra to 2.1.7 on restbase1008
June 24
15:17 logmsgbot: thcipriani Synchronized php-1.26wmf11/extensions/ContentTranslation: SWAT: Enable publish button when the preference is not to use initial translation (duration: 00m 12s)
15:08 logmsgbot: thcipriani Synchronized php-1.26wmf10/extensions/ContentTranslation: SWAT: Enable publish button when the preference is not to use initial translation (duration: 00m 13s)
14:53 logmsgbot: jynus Synchronized wmf-config/db-codfw.php: depool es2001 and es2002 for maintenance (duration: 00m 13s)
05:03 jgage: removed old logs and did 'apt-get clean' on analytics1021 to make space
03:00 logmsgbot: LocalisationUpdate completed (1.26wmf11) at 2015-06-24 03:00:45+00:00
02:54 logmsgbot: l10nupdate Synchronized php-1.26wmf11/cache/l10n: (no message) (duration: 10m 34s)
02:28 logmsgbot: LocalisationUpdate completed (1.26wmf10) at 2015-06-24 02:28:16+00:00
02:24 logmsgbot: l10nupdate Synchronized php-1.26wmf10/cache/l10n: (no message) (duration: 07m 21s)
01:39 logmsgbot: ori Synchronized php-1.26wmf11/extensions/SyntaxHighlight_GeSHi: I0e5f2d3b2 (duration: 00m 13s)
01:01 gwicke: rolling restart of cassandra instances to rule out a single node in funky state causing elevated p99 latency
00:43 ori: experimenting with httpd on mw1041 again
00:19 gwicke: rolling restart of restbase instances to rule out backend connections as a source for high p99 latencies
00:14 ori: experimenting with HHVM shutdown via /stop on the admin server on mw1041
June 23
23:38 logmsgbot: ori Finished scap: scapping to all apaches for --restart test (duration: 07m 03s)
23:30 logmsgbot: ori Started scap: scapping to all apaches for --restart test
23:24 bblack: nginxes all updated for ssl stapling bugfix
23:24 logmsgbot: ori Finished scap: scapping to scap-test dsh group for --restart test (duration: 06m 02s)
23:18 logmsgbot: ori Started scap: scapping to scap-test dsh group for --restart test
23:16 logmsgbot: ori scap aborted: scapping to scap-test dsh group for --restart test (duration: 00m 06s)
23:16 logmsgbot: ori Started scap: scapping to scap-test dsh group for --restart test
22:14 logmsgbot: legoktm Synchronized php-1.26wmf11/extensions/SyntaxHighlight_GeSHi/SyntaxHighlight_GeSHi.class.php: RejectParserCacheValue may pass a WikiPage or Article (duration: 00m 13s)
22:07 mutante: tmp. disabling puppet on mw1033
21:53 logmsgbot: legoktm Synchronized php-1.26wmf11/extensions/SyntaxHighlight_GeSHi/SyntaxHighlight_GeSHi.class.php: (no message) (duration: 00m 15s)
21:50 logmsgbot: ori Synchronized php-1.26wmf11/includes/parser/ParserCache.php: (no message) (duration: 00m 12s)
21:40 mutante: starting instance planet1001 on ganeti1003 - can't get console
21:40 logmsgbot: legoktm Synchronized php-1.26wmf11/includes/parser/ParserCache.php: (no message) (duration: 00m 13s)
21:36 bd808: updated scap to 33f3002 (Ensure that the minimum batch size used by cluster_ssh is 1)
21:34 logmsgbot: ori Synchronized php-1.26wmf11/extensions/SyntaxHighlight_GeSHi: 3c8bb2c493: Update SyntaxHighlight_GeSHi for cherry-pick (duration: 00m 13s)
20:32 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.26wmf11
20:19 logmsgbot: mattflaschen Synchronized wmf-config/InitialiseSettings-labs.php: Beta-only change to add Flow_test to enwiki (duration: 00m 11s)
19:59 logmsgbot: ori scap failed: OSError [Errno 10] No child processes (duration: 01m 46s)
19:58 logmsgbot: ori Started scap: (no message)
19:52 ori: updated scap to master
19:11 ori: running apache graceful-stop on mw1042 to test mod_status behavior during graceful stop
18:31 godog: start rolling-downgrade of cassandra to 2.1.3 T102015
18:27 logmsgbot: twentyafterfour Started scap: New deployment branch: 1.26wmf11
18:13 logmsgbot: ori Finished scap: (no message) (duration: 04m 34s)
18:11 paravoid: reloading nginx on all cp* for reuseport
18:08 logmsgbot: ori Started scap: (no message)
17:57 ori: repooled scap-test servers (mw1170-mw1175 and mw1270-mw1275)
17:16 logmsgbot: ori Finished scap: (no message) (duration: 01m 42s)
17:14 logmsgbot: ori Started scap: (no message)
17:10 logmsgbot: ori Finished scap: (no message) (duration: 01m 34s)
17:09 logmsgbot: ori Started scap: (no message)
17:06 logmsgbot: ori scap aborted: (no message) (duration: 01m 23s)
17:04 logmsgbot: ori Started scap: (no message)
16:53 logmsgbot: bd808 Finished scap: no-op sync to scap-test dsh group; Testing HHVM restart take 4 (duration: 01m 30s)
16:52 logmsgbot: bd808 Started scap: no-op sync to scap-test dsh group; Testing HHVM restart take 4
16:45 cscott: updated OCG to version db7a56965233a74c73917c78b5c8c84c867321d9
16:37 logmsgbot: bd808 Finished scap: no-op sync to scap-test dsh group; Testing HHVM restart take 3 (duration: 01m 12s)
16:35 logmsgbot: bd808 Started scap: no-op sync to scap-test dsh group; Testing HHVM restart take 3
16:35 bd808: updated scap to da64a65 (Cast pid read from file to an int)
16:26 logmsgbot: bd808 Finished scap: no-op sync to scap-test dsh group; Testing HHVM restart take 2 (duration: 01m 26s)
16:25 logmsgbot: bd808 Started scap: no-op sync to scap-test dsh group; Testing HHVM restart take 2
16:22 bd808: updated scap to 947b93f (Fix reference to _get_apache_list)
16:12 logmsgbot: bd808 scap failed: AttributeError 'Scap' object has no attribute '_get_apache_list' (duration: 02m 15s)
16:10 logmsgbot: bd808 Started scap: no-op sync to scap-test dsh group; Testing HHVM restart
16:01 paravoid: staggered upgrade of cp* fleet to nginx 1.9.2
15:57 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: Follow-up 94e5fd2: Default wmgUseContentTranslation true only on Wikipedias gerrit:220161 (duration: 00m 16s)
15:49 jynus: rebooting es1004
15:09 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: CX: Enable CX as default except where it is not deployed gerrit:220078 (duration: 00m 12s)
12:54 moritzm: ssh on precise hosts has been updated to a backport of 6.6p1-2ubuntu2 (the version from trusty). this allows us to use modern crypto (plus labs can simplify key handling)
12:45 jynus: rebooting es1003
12:18 moritzm: uploaded openssh_6.6p1-2ubuntu2~wmfprecise2 to precise-wikimedia on apt.wikimedia.org
12:10 logmsgbot: hoo Synchronized arbitraryaccess.dblist: Arbitrary access for ruwiki and cswiki. T102122 (duration: 00m 12s)
09:41 moritzm: updated jsch on gallium and lanthanum to support modern SSH key exchange in Jenkins (actually that happened yesterday, but I forgot to log it back then)
09:41 moritzm: added jsch_0.1.50-1ubuntu1~wmfprecise1 to precise-wikimedia on carbon
09:09 akosiaris: failing over etherpad to db1016
04:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 23 04:53:17 UTC 2015 (duration 53m 16s)
03:33 springle: xtrabackup clone db2023 to db1045
02:26 logmsgbot: LocalisationUpdate completed (1.26wmf10) at 2015-06-23 02:26:44+00:00
02:23 logmsgbot: l10nupdate Synchronized php-1.26wmf10/cache/l10n: (no message) (duration: 06m 47s)
01:17 logmsgbot: krinkle Synchronized docroot and w: (no message) (duration: 00m 12s)
16:07 ottomata: deploying eventlogging 0.9. This includes changes for arbitrary eventlogging URIs in all eventlogging stages, as well as support for schema based kafka topic URIs.
22:09 gwicke: rolling restart of restbase instances to apply puppet change after puppet actually ran on all nodes
21:58 gwicke: rolling restart of restbase instances to apply config change
21:56 godog: restart nutcracker on mw1145
21:35 gwicke: restarting cassandra on restbase1005
20:47 mutante: temp. stopped icinga-wm
20:37 gwicke: deployed RESTBase 7ffaf94bfc
20:24 cscott: updated Parsoid to version 402ddf66
20:01 ottomata: resized antimony's / LV from 30G to 100G. looks like /var/lib/git was getting filled up
19:43 jynus: rolling schema changes on hewiki
19:29 godog: downgrade and restart cassandra to 2.1.3 on restbase1001, metrics not being pushed to graphite with 2.1.6
19:05 godog: bounce cassandra on xenon
18:46 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: Ic03b152de: Make $wgUploadPath for commons https only for the benefit of InstantCommons (duration: 00m 14s)
18:11 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf10
15:37 logmsgbot: thcipriani Started scap: Wikitech-Ldap host record roll-out
15:19 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Give patrolmarks right to "*" on dewiki gerrit:218901 (duration: 00m 13s)
15:17 logmsgbot: anomie Synchronized wmf-config/throttle.php: SWAT: Add a throttle exception for United Islands of Prague gerrit:217413 (duration: 00m 14s)
15:15 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable captcha on labswiki for now gerrit:218908 (duration: 00m 13s)
15:10 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add extra namespace aliases for Italian Wikipedia gerrit:215708 (duration: 00m 13s)
15:08 anomie: SWAT: Enable anti-abuse features on labswiki gerrit:218903
15:08 jynus: testing some schema changes on testwiki
15:00 logmsgbot: aude Synchronized usagetracking.dblist: Enable Wikibase usage tracking on nowiki and plwiki (duration: 00m 13s)
13:56 logmsgbot: aude Synchronized usagetracking.dblist: Enable Wikibase usage tracking on fiwiki and idwiki (duration: 00m 13s)
13:26 logmsgbot: aude Synchronized usagetracking.dblist: Enable Wikibase usage tracking on bgwiki and eowiki (duration: 00m 13s)
10:52 akosiaris: reload pybal on lvs1006
10:50 mobrovac: finished deploying mathoid I40ef68 on SCA
08:37 YuviPanda: run sudo salt -t 20 -b 100 '*' cmd.run 'sudo service salt-minion restart' on virt1000, attempt to get them to answer on labcontrol1001 instead
06:52 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun 17 06:52:58 UTC 2015 (duration 52m 57s)
02:56 logmsgbot: LocalisationUpdate completed (1.26wmf10) at 2015-06-17 02:56:49+00:00
June 16
10:36 akosiaris: rebooting ganeti200{1..6}.codfw.wmnet for kernel upgrades
09:33 logmsgbot: jynus Synchronized wmf-config/db-codfw.php: Depool es2005, es2006 and es2007 for maintenance (duration: 00m 14s)
09:10 YuviPanda: deleted huge puppet-master.log on labcontrol1001
08:05 jynus: added m5-slave to dns servers
07:52 paravoid: restarting hhvm on mw1121
07:52 moritzm: blacklisted the overlayfs kernel module (prevents a reliable local root exploit on all Ubuntu systems). no systems in the fleet had an overlayfs mount present or the kernel module loaded, so there should be no impact on existing systems. Note: This is a bandaid, I'll create a Phab task to deploy this via puppet in the future (and to also blacklist additional desktopy kernel modules which increase our attack
06:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 16 06:24:04 UTC 2015 (duration 24m 3s)
06:18 godog: restore ES replication throttling to 20mb/s
06:13 godog: restore ES replication throttling to 40mb/s
06:08 logmsgbot: filippo Synchronized wmf-config/PoolCounterSettings-common.php: unthrottle ES (duration: 00m 14s)
05:56 godog: bump ES replication throttling to 60mb/s
05:50 manybubbles: ok - we're yellow and recovering. ops can take this from here. We have a root cause and we have things I can complain about to the elastic folks I plan to meet with today anyway. I'm going to finish waking up now.
05:49 manybubbles: reenabling puppet agent on elasticsearch machines
05:46 manybubbles: I expect them to be red for another few minutes during the initial master recovery
05:45 manybubbles: started all elasticsearch nodes and now they are recovering.
05:41 godog: restart gmond on elastic1007
05:39 logmsgbot: filippo Synchronized wmf-config/PoolCounterSettings-common.php: throttle ES (duration: 00m 13s)
05:25 manybubbles: shutting down all the elasticsearch on the elasticsearch nodes again - another full cluster restart should fix it like it did last time...
22:05 logmsgbot: twentyafterfour Synchronized wmf-config/InitialiseSettings-labs.php: deploy: Never use wgServer/wgCanonicalServer values from production in labs (duration: 00m 12s)
20:18 godog: start cassandra on restbase1008, bootstrapping
20:04 godog: sign restbase1008 key, run puppet
20:00 godog: powercycle restbase1007, investigate disk issue
19:07 logmsgbot: ori Synchronized php-1.26wmf9/includes/jobqueue: 0a32aa3be4: jobqueue: use more sensible metric key names (duration: 00m 13s)
16:57 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Grant cloudadmins the 'editallhiera' right gerrit:218115 (duration: 00m 14s)
16:48 logmsgbot: thcipriani Synchronized php-1.26wmf9/extensions/OpenStackManager/OpenStackManagerHooks.php: SWAT: refer to user the right way (duration: 00m 13s)
16:48 godog: powercycle graphite1002, no ssh, unresponsive console
16:19 jynus: upgrading es1005 mysql service while depooled
16:12 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Grant cloudadmins the 'editallhiera' right gerrit:218115 (duration: 00m 12s)
16:10 bblack: pybal restarts complete, all ok
16:09 logmsgbot: thcipriani Finished scap: SWAT: Openstack manager and language updates (duration: 21m 27s)
15:47 logmsgbot: thcipriani Started scap: SWAT: Openstack manager and language updates
16:19 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Disable Parsoid update jobs (duration: 00m 14s)
16:18 logmsgbot: thcipriani Finished scap: SWAT: Update namespaces and special pages for Northern Luri (lrc) from translatewiki gerrit:216533 gerrit:217327 (duration: 32m 11s)
15:46 logmsgbot: thcipriani Started scap: SWAT: Update namespaces and special pages for Northern Luri (lrc) from translatewiki gerrit:216533 gerrit:217327
15:27 logmsgbot: thcipriani Synchronized php-1.26wmf9/extensions/OpenStackManager: SWAT: update OpenStackManager to disable unused sudoer features gerrit:217407 (duration: 00m 13s)
15:11 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Make VisualEditor access RESTbase directly on all public wikis gerrit:214833 (duration: 00m 12s)
15:05 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: CX: Add wikis for deployment on 20150611 gerrit:217460 (duration: 00m 12s)
20:38 logmsgbot: ori Synchronized php-1.26wmf8/includes/Hooks.php: d6802ad7d6: Avoid section profiling in Hooks::run due to high overhead (duration: 00m 14s)
20:37 logmsgbot: ori Synchronized php-1.26wmf9/includes/Hooks.php: e552f4942d: Avoid section profiling in Hooks::run due to high overhead (duration: 00m 17s)
20:36 logmsgbot: ori Synchronized php-1.26wmf9/includes/User.php: 2f4f1e279d: Fixed "wfTimestamp() fed bogus time value" errors (duration: 00m 12s)
20:36 logmsgbot: ori Synchronized php-1.26wmf8/includes/User.php: 55e18123ca: Fixed "wfTimestamp() fed bogus time value" errors (duration: 00m 15s)
18:07 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: Group1 wikis to 1.26wmf9
16:14 godog: reboot ms-be2008 to check disk swap config
15:34 Krenair: sync failed to something like 25 hosts, cannot directly log into any of them either
15:17 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/215030/ - no code change, just docs - should not have to wait 9 days for this (duration: 01m 08s)
15:20 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT take II: Enabled Guided Tour on th.wikipedia gerrit:216950 (duration: 01m 08s)
15:19 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Enabled Guided Tour on th.wikipedia gerrit:216950 (duration: 01m 08s)
15:05 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: CX: Add wikis for deployment on 20150609 gerrit:216622 (duration: 01m 09s)
11:09 Krenair: Email set for User:GifTagger@commonswiki per phab:T100889
09:05 akosiaris: uploaded etherpad-lite_1.5.6-2 on apt.wikimedia.org/jessie-wikimedia/main component
08:22 akosiaris: upload etherpad-lite_1.5.6-1 on apt.wikimedia.org, jessie-wikimedia dist, main component
04:35 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 9 04:34:08 UTC 2015 (duration 34m 7s)
02:28 logmsgbot: LocalisationUpdate completed (1.26wmf8) at 2015-06-09 02:27:30+00:00
02:23 logmsgbot: l10nupdate Synchronized php-1.26wmf8/cache/l10n: (no message) (duration: 07m 12s)
01:42 godog: stop icinga-wm on neon
June 8
23:43 bblack: repooled cp3030/cp1065 in pybal
23:11 logmsgbot: ebernhardson Synchronized php-1.26wmf8/extensions/UploadWizard/: Bump UploadWizard in 1.26wmf8 for evening SWAT (duration: 01m 09s)
22:21 bblack: depooled cp3030, cp1065 in pybal for ipsec
20:17 subbu: deployed parsoid sha 131554ba
19:18 jynus: RAID degradation (disk failure) on s5 master (db1058), no production impact, replacement on the way
17:13 ottomata: restarted eventlogging services on eventlog1001 after disabling kafka pieces
16:13 _joe_: powercycling tmh1001, console blank, unresponsive to pings
16:00 logmsgbot: thcipriani Synchronized commonsuploads.dblist: SWAT: Revert Temporarily re-enable uploads on Marathi Wikipedia, for real gerrit:216719 (duration: 01m 07s)
15:58 logmsgbot: thcipriani Synchronized commonsuploads.dblist: SWAT: Revert Temporarily re-enable uploads on Marathi Wikipedia gerrit:216719 (duration: 01m 08s)
15:40 logmsgbot: thcipriani Synchronized php-1.26wmf8/extensions/Cite: SWAT: Revert Do all of Cite's real work during unstrip and followup gerrit:216715 (duration: 01m 08s)
15:19 Coren: T96063: process halted for now as store/backup is unmovable and on slice5
15:17 logmsgbot: thcipriani Synchronized w/static/images/project-logos/pflwiki.png: SWAT: Fix transparency of pflwiki logo gerrit:216595 (duration: 01m 08s)
15:15 akosiaris: disabled ircecho on neon for a while
14:53 Coren: T96063: starting pvmove from slice5 to slice2
14:48 Coren: T96063: dropped volume slice1 from vg store
05:16 andrewbogott: we did a whole lot of things to labstore1001 while morebots was away
05:14 andrewbogott: service nfs-kernel-server restart on labstore1001
02:26 logmsgbot: LocalisationUpdate completed (1.26wmf8) at 2015-06-07 02:25:13+00:00
02:21 logmsgbot: l10nupdate Synchronized php-1.26wmf8/cache/l10n: (no message) (duration: 07m 09s)
June 6
23:46 subbu: deployed parsoid 5172a446 (cherry-pick of 719c736f) -- hotfix for T101599
05:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 6 05:47:40 UTC 2015 (duration 47m 39s)
02:31 logmsgbot: LocalisationUpdate completed (1.26wmf8) at 2015-06-06 02:30:24+00:00
02:26 logmsgbot: l10nupdate Synchronized php-1.26wmf8/cache/l10n: (no message) (duration: 07m 10s)
June 5
22:42 godog: powercycle graphite2001, no console no ssh
22:06 andrewbogott: restarted apache on virt1000
20:49 ori: Upgrading hhvm-fss on application servers to 1.1.7; expect brief 5xx spike.
20:14 logmsgbot: demon Synchronized php-1.26wmf8: live hack (duration: 02m 32s)
20:10 mutante: apt-get upgrade on terbium
19:52 godog: bounce redis on rdb1001/rdb1003 to pick up new slave limits
19:51 mutante: chown root:root / on terbium
19:50 godog: bounce redis on rdb1002/rdb1004 to pick up new slave limits
19:29 godog: bounce redis again on rdb1003 after increasing the slave limits more
19:17 godog: bounce redis on rdb1003 after bumping slave limits
19:07 godog: redis master logs show periodic 'cmd=sync scheduled to be closed ASAP for overcoming of output buffer limits.' indicating the slave fails to sync
18:40 godog: spike in redis network starting at ~15.00 UTC, correlates with ocg failures
18:01 moritzm: restarted gerrit on ytterbium for java update
14:43 jynus: short lag period on db1049, traffic automatically redirected to other slave and back to normal
14:07 moritzm: added ubuntu-meta-1.325+wmf1 for trusty-wikimedia to apt.wikimedia.org (T100004)
14:07 moritzm: added ubuntu-meta-1.267.1+wmf1 for precise-wikimedia to apt.wikimedia.org (T100004)
June 4
16:23 logmsgbot: kartik Started scap: Update ContentTranslation
15:54 moritzm: added redis_2.8.4-2+wmf1 to trusty-wikimedia on apt.wikimedia.org
15:48 logmsgbot: anomie Synchronized php-1.26wmf8/includes/jobqueue/: SWAT: jobqueue: Record stats on how long it takes before a job is run gerrit:215748 (duration: 00m 14s)
05:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 4 05:11:32 UTC 2015 (duration 11m 31s)
02:30 logmsgbot: LocalisationUpdate completed (1.26wmf8) at 2015-06-04 02:28:54+00:00
02:25 logmsgbot: l10nupdate Synchronized php-1.26wmf8/cache/l10n: (no message) (duration: 07m 22s)
June 3
23:42 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings.php: syncing ImportSource change for meta (duration: 00m 13s)
23:34 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings.php: syncing config change for mediawiki logo on mobile, take 2 (duration: 00m 12s)
23:26 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings.php: syncing config change for mediawiki logo on mobile (duration: 00m 12s)
23:25 logmsgbot: kaldari Synchronized images/mobile/mediawiki.png: syncing mediawiki logo for mobile (duration: 00m 12s)
22:02 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable Wikibase usage tracking on ukwiki and viwiki (duration: 00m 15s)
21:58 mutante: restarted gitblit
21:53 logmsgbot: ori Synchronized php-1.26wmf8/includes/resourceloader/ResourceLoader.php: 7f49853fc9: ResourceLoader::filter: use APC when running under HHVM (did not sync correct file previously) (duration: 00m 12s)
21:20 andrewbogott: restarting pdns on virt1000 and labcontrol1001
21:05 Jamesofur: decryption key for Board Election inserted into voteWiki
20:50 hashar: restarted zuul entirely to remove some stalled jobs
20:29 paravoid: kafka preferred-replica-election on an1021
20:28 hashar: Restarting Jenkins to release a deadlock
20:23 logmsgbot: ori Synchronized php-1.26wmf8/resources/Resources.php: 7f49853fc9: ResourceLoader::filter: use APC when running under HHVM (duration: 00m 13s)
20:19 subbu: deployed parsoid sha ab675400
19:08 bblack: changed ops/puppet repo to ff-only in gerrit config, feel free to scream/revert if necc!
18:46 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: All wikis to 1.26wmf8, no new branch until next Tuesday, June 9th
00:33 ejegg: updated payments-wiki from a4fef65ec1dd3db1fb1d7ceb797b2c7485c722d2 to d22e44e3fab2b937707c2776384cb93a49b4cfd3
00:07 ori: Updated jobrunner for I1d351d8d1: Made periodictasks stats calls more useful
00:02 logmsgbot: ori Synchronized php-1.26wmf8/extensions/RSS/RSSParser.php: Ice44740fb: Don't rely on strip marker uniqueness (T10104) (duration: 00m 14s)
00:01 logmsgbot: ori Synchronized php-1.26wmf7/extensions/RSS/RSSParser.php: Ice44740fb: Don't rely on strip marker uniqueness (T10104) (duration: 00m 13s)
June 1
23:36 mutante: restarted gitblit ..
23:15 ori: Deployed jobchron / jobrunner change Icab05090b and restarted jobchron / jobrunner on job queue runners.
22:51 ejegg: updated payments from 60c160110a20cf763b82677ff1501e9ce0c919bc to a4fef65ec1dd3db1fb1d7ceb797b2c7485c722d2
21:36 godog: doing some local testing on carbon for T100636 fwiw, thus puppet disabled
21:35 ejegg: update paymentswiki from aa66797553fbcfb63f7cf29abccc44d060b65db0 to 60c160110a20cf763b82677ff1501e9ce0c919bc
21:13 logmsgbot: ori Synchronized php-1.26wmf7/languages/LanguageConverter.php: 1d054ce6d3: Use a fixed marker prefix string in the Parser and MWTidy (duration: 00m 14s)
20:40 logmsgbot: ori Synchronized php-1.26wmf8/languages/LanguageConverter.php: 1d054ce6d3: Use a fixed marker prefix string in the Parser and MWTidy (duration: 00m 13s)
20:29 twentyafterfour: disabled several no-longer-existent repositories in phabricator which apparently have been deleted in gerrit
20:26 subbu: deployed parsoid sha 73445bfd
20:05 twentyafterfour: restarted apache2 and phd on iridium (phabricator)
19:52 MaxSem: Repopulated gis.spatial_ref_sys on labsdb1004 with postgis 2.1 data, old contents backed up as spatial_ref_sys_bak
18:55 logmsgbot: ori Synchronized php-1.26wmf7/extensions/SemanticForms/includes/SF_FormUtils.php: I7ed3996a1: Stop using StripState (duration: 00m 13s)
18:55 logmsgbot: ori Synchronized php-1.26wmf8/extensions/SemanticForms/includes/SF_FormUtils.php: I7ed3996a1: Stop using StripState (duration: 00m 15s)
17:46 yurik: deployed graphoid service update - grafana logging cleanup
16:06 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: T99491, T100925: Sysops to add users to import group on maiwiki, newiki (duration: 00m 14s)
15:24 logmsgbot: thcipriani Synchronized php-1.26wmf8/includes/resourceloader/ResourceLoaderWikiModule.php: SWAT: Make ResourceLoaderWikiModule support custom position gerrit:214741 (duration: 00m 15s)
15:23 logmsgbot: thcipriani Synchronized php-1.26wmf8/extensions/WikiEditor: SWAT: Make ResourceLoaderWikiModule support custom position gerrit:214741 (duration: 00m 13s)
15:22 logmsgbot: thcipriani Synchronized php-1.26wmf8/extensions/VectorBeta: SWAT: Make ResourceLoaderWikiModule support custom position gerrit:214741 (duration: 00m 15s)
15:21 logmsgbot: thcipriani Synchronized php-1.26wmf8/extensions/SyntaxHighlight_GeSHi: SWAT: Make ResourceLoaderWikiModule support custom position gerrit:214741 (duration: 00m 14s)
15:20 logmsgbot: thcipriani Synchronized php-1.26wmf8/extensions/MobileFrontend: SWAT: Make ResourceLoaderWikiModule support custom position gerrit:214741 (duration: 00m 13s)
15:18 logmsgbot: thcipriani Synchronized php-1.26wmf8/extensions/Gather: SWAT: Make ResourceLoaderWikiModule support custom position gerrit:214741 (duration: 00m 13s)
14:42 cmjohnson1: powering down analytics1028 to swap the bad DIMM
13:48 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable arbitrary access on wikisource and itwiki, and make other projects sidebar feature default for ptwiki (for real) (duration: 00m 12s)
13:45 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable arbitrary access on wikisource and itwiki, and make other projects sidebar feature default for ptwiki (duration: 00m 15s)
May 31
17:36 Krinkle: Confirmed RL problem solved. The jquery|mediawiki&version=bizqqnC request was cached with an old mw.loader implementation somehow. After the touch and sync, the version is now dQAzAsdU and the implementation is up to date.
17:20 Krinkle: Investigating RL issues (clients are loading mediawiki.notification&version=19700101T000000Z, mw.loader.moduleRegistry contains NaN for versions)
17:12 gwicke: performed a rolling restart of RESTBase Cassandra nodes to address elevated request error rates apparently related to schema disagreement
05:35 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun May 31 05:34:36 UTC 2015 (duration 34m 35s)
02:47 logmsgbot: LocalisationUpdate completed (1.26wmf8) at 2015-05-31 02:46:41+00:00
02:43 logmsgbot: l10nupdate Synchronized php-1.26wmf8/cache/l10n: (no message) (duration: 05m 51s)
02:26 logmsgbot: LocalisationUpdate completed (1.26wmf7) at 2015-05-31 02:25:44+00:00
02:21 logmsgbot: l10nupdate Synchronized php-1.26wmf7/cache/l10n: (no message) (duration: 06m 41s)
May 30
21:07 bd808: Upgraded Elasticsearch cluster to 1.3.9 on logstash100[1-6]
00:08 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/212436/ - docs only, no code change (how was this waiting 10 days?) (duration: 00m 14s)
May 29
23:56 logmsgbot: ori Synchronized w/static/images/project-logos: Ic62747f37: Optimise project logos added since I8c9a6a56 (duration: 00m 13s)
21:21 logmsgbot: ori Synchronized wmf-config/throttle.php: Ife45684c5: Add another IP address for Santiago edit-a-thon (duration: 00m 13s)
20:43 logmsgbot: ori Synchronized robots.txt: I7b321b62d: allow robots to use RL on domains (duration: 00m 14s)
17:18 mutante: fix client_max_body_size syntax error in nginx config of payments1001
15:19 logmsgbot: anomie Synchronized php-1.26wmf8/extensions/ConfirmEdit/: Update ConfirmEdit to fix API breakage gerrit:214620 (duration: 00m 14s)
14:52 paravoid: re-redirecting ns0 traffic back to rubidium
14:17 jynus: Moving pdns and designate databases from m1 to m5
13:30 logmsgbot: aude Synchronized php-1.26wmf8/extensions/Wikidata: touch js and css files to try to fix issues on test.wikidata (duration: 00m 26s)
13:17 godog: roll-restart cassandra on cerium / xenon / praseodymium following java upgrade
11:53 paravoid: reimaging rubidium
11:45 _joe_: restart nutcracker on mw1150
11:41 paravoid: redirecting ns0 traffic to baham (= ns1) in preparation for rubidium upgrade
06:52 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri May 29 06:51:45 UTC 2015 (duration 51m 44s)
06:13 logmsgbot: ori Synchronized php-1.26wmf7/includes/deferred/SiteStatsUpdate.php: Icc12c07ab: Update context stats in SiteStatsUpdate (duration: 00m 13s)
06:12 logmsgbot: ori Synchronized php-1.26wmf8/includes/deferred/SiteStatsUpdate.php: Icc12c07ab: Update context stats in SiteStatsUpdate (duration: 00m 14s)
06:03 apergos: salt keys regenerated on all production hosts (minions, not master key)
03:09 logmsgbot: LocalisationUpdate completed (1.26wmf8) at 2015-05-29 03:08:15+00:00
03:02 logmsgbot: l10nupdate Synchronized php-1.26wmf8/cache/l10n: (no message) (duration: 10m 08s)
02:36 logmsgbot: LocalisationUpdate completed (1.26wmf7) at 2015-05-29 02:35:10+00:00
02:31 logmsgbot: l10nupdate Synchronized php-1.26wmf7/cache/l10n: (no message) (duration: 06m 54s)
00:07 logmsgbot: ori Synchronized php-1.26wmf7/includes/diff/UnifiedDiffFormatter.php: d95cac90c7: Make the output of UnifiedDiffFormatter match diff -u (duration: 00m 14s)
00:06 logmsgbot: ori Synchronized php-1.26wmf7/extensions/Echo/includes/DiffParser.php: 41d27c4a26: Update Echo for cherry-picks (duration: 00m 13s)
May 28
23:33 jgage: restarted nutcracker on mw1056 due to errors, per bd808
23:04 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: Enable A/B test of VE for new accounts on enwiki (duration: 00m 13s)
22:48 logmsgbot: hoo Synchronized php-1.26wmf7/: Touching some JS, re-syncing resource definitions to rule out causes for Wikidata JS problem. (duration: 01m 00s)
21:52 logmsgbot: ori Synchronized php-1.26wmf7/resources/src/mediawiki/mediawiki.toc.js: Touching file on unconfirmed suspicion of stale cache (duration: 00m 16s)
21:51 logmsgbot: ori Synchronized php-1.26wmf8/resources/src/mediawiki/mediawiki.toc.js: Touching file on unconfirmed suspicion of stale cache (duration: 00m 15s)
20:03 cscott: updated Parsoid to version 497da30e ; canary restart of wtp1001; observed network TX spike (possibly UDP, possibly logging); reverted to 8ed6fd0b and restarted all parsoids.
18:22 logmsgbot: krenair Synchronized php-1.26wmf6/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/214397/ - in case we have to go back to wmf6 again for whatever reason (duration: 00m 15s)
20:46 logmsgbot: twentyafterfour Purged l10n cache for 1.26wmf6
20:45 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.26wmf8
20:41 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.26wmf7
20:36 logmsgbot: twentyafterfour Finished scap: testwiki to php-1.26wmf8 and rebuild l10n cache (duration: 67m 53s)
19:40 akosiaris: removed operations/puppet/varnish from gerrit, git.wikimedia.org and github. The repo was used as a git submodule but the workflow turned out to be cumbersome approximately a year ago and was no longer updated. Up to a few minutes ago, it only served as a source of confusion. It no longer does.
19:28 logmsgbot: twentyafterfour Started scap: testwiki to php-1.26wmf8 and rebuild l10n cache
19:22 logmsgbot: twentyafterfour scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_1863397713" --threads=4 --lang en --quiet' returned non-zero exit status 255 (duration: 03m 38s)
19:18 logmsgbot: twentyafterfour Started scap: testwiki to php-1.26wmf8 and rebuild l10n cache
18:12 moritzm: Uploaded gridengine_6.2u5-4+wmf2 for precise-wikimedia to apt.wikimedia.org
10:00 ^d: gerrit: manually gc'd all repos to help with clone times
08:55 godog: resize existing whisper files with new retention on graphite2001
05:42 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun May 24 05:41:35 UTC 2015 (duration 41m 34s)
02:58 logmsgbot: LocalisationUpdate completed (1.26wmf7) at 2015-05-24 02:57:17+00:00
02:53 logmsgbot: l10nupdate Synchronized php-1.26wmf7/cache/l10n: (no message) (duration: 06m 57s)
02:34 logmsgbot: LocalisationUpdate completed (1.26wmf6) at 2015-05-24 02:33:23+00:00
02:29 logmsgbot: l10nupdate Synchronized php-1.26wmf6/cache/l10n: (no message) (duration: 06m 34s)
May 23
23:30 logmsgbot: ori Synchronized php-1.26wmf7/extensions/Gadgets: b592efa5fe: Update Gadgets for I6da3eede0: Conversion to using WAN cache (duration: 00m 13s)
12:54 godog: remove MediaWiki.xhprof to pick up new retention schema
12:53 godog: bounce carbon on graphite1001 to pick up new retention schema
11:16 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ic258d01a7: Revert "Change StatsD port to another value temporarily" (duration: 00m 13s)
10:22 ori: Metrics from MediaWiki to graphite are temporarily suspended while xhprof profiling work is ongoing.
10:21 logmsgbot: ori Synchronized wmf-config/StartProfiler.php: Exclude xhprof.run_init from being reported (duration: 00m 13s)
10:03 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 13s)
21:18 logmsgbot: twentyafterfour Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 14s)
21:01 cscott: updated OCG to version ca4f64852de5b1de782b292b50038fbd2dd84266
20:59 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.26wmf7
20:58 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.26wmf6
20:50 logmsgbot: twentyafterfour Finished scap: retry: testwiki to php-1.26wmf7 and rebuild l10n cache (duration: 26m 02s)
20:42 ebernhardson: restarted gmond on elastic10{01..31}.eqiad.wmnet
20:24 logmsgbot: twentyafterfour Started scap: retry: testwiki to php-1.26wmf7 and rebuild l10n cache
20:12 subbu: deployed parsoid version 8ed6fd0b
19:35 logmsgbot: twentyafterfour scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_3448528422" --threads=4 --lang en --quiet' returned non-zero exit status 255 (duration: 03m 22s)
19:32 logmsgbot: twentyafterfour Started scap: testwiki to php-1.26wmf7 and rebuild l10n cache
17:41 bblack: esams+eqiad upload varnish caches will be downtimed+rebooted today, experimenting with depool effects as well (next several hours)
16:03 logmsgbot: manybubbles Synchronized php-1.26wmf5/extensions/Flow/: SWAT update flow for wmf5 to fix two issues (duration: 00m 14s)
15:54 godog: rolling restart restbase on restbase1003-1006
15:52 mobrovac: restbase restarted on restbase1002
15:47 godog: restbase restarted on restbase1001
15:35 logmsgbot: manybubbles Synchronized php-1.26wmf6/extensions/Flow/: SWAT update flow for wmf6 to fix two issues (duration: 00m 12s)
15:22 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT new namespaces for ptwikinews (duration: 00m 11s)
15:18 logmsgbot: manybubbles Synchronized wmf-config/throttle.php: SWAT clean old throttle rule and add a new one for an upcoming festival (duration: 00m 13s)
20:22 mutante: mailman: killed processes by user "list". started mailman
19:40 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ia6a2cb7: Removed "refreshLinks" from $wgJobBackoffThrottling (duration: 00m 12s)
19:37 logmsgbot: anomie Finished scap: Step 2 for deploying ApiFeatureUsage: sync the config, and l10n data again because I don't think it did last time (duration: 44m 34s)
19:25 robh: mailman permission errors abound! had to take it offline again and am fixing
19:02 robh: mailman is back to routing mail normally (still testing rename parts)
18:53 logmsgbot: anomie Started scap: Step 2 for deploying ApiFeatureUsage: sync the config, and l10n data again because I don't think it did last time
18:51 logmsgbot: anomie Finished scap: Step 1 for deploying ApiFeatureUsage: sync the code and l10n data (duration: 05m 39s)
18:46 logmsgbot: anomie Started scap: Step 1 for deploying ApiFeatureUsage: sync the code and l10n data
18:38 yuvipanda: issuing start command for all hosts on labvirt1006, just to make sure
18:35 yuvipanda: labvirt1006 rebooting, long POST
18:31 yuvipanda: restarted labvirt1006
18:20 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.26wmf6
18:15 robh: stopping mailman again for further planned work T99098
17:43 robh: mailing lists still down, scrubbing list archives is painful and error prone
15:16 logmsgbot: anomie Synchronized php-1.26wmf5/includes/registration/ExtensionRegistry.php: SWAT: registration: Don't array_unique() over the queue before loading it [[gerrit:211948]] (duration: 00m 12s)
15:15 logmsgbot: anomie Synchronized php-1.26wmf6/includes/registration/ExtensionRegistry.php: SWAT: registration: Don't array_unique() over the queue before loading it [[gerrit:211947]] (duration: 00m 12s)
14:43 jynus: back to read/write after virt1000 database migration - migration seems ok
14:41 godog: purge cassandra system CF metrics from graphite1001
14:29 jynus: temporarily going read-only for virt1000 for database migration
14:24 mobrovac: enabled puppet on restbase1001
14:19 mobrovac: restbase group1 wiki keyspaces created
14:15 mobrovac: starting manually RB with group1 wikis enabled on restbase1001
14:11 mobrovac: restbase100x: removed superfluous keyspaces by hand from Cassandra
13:47 bblack: done with cp40xx reboot process
13:32 bblack: rebooting ulsfo caches (cp40xx - currently depooled from all traffic + downtimed in icinga)
13:09 mobrovac: disabled puppet on restbase100x
12:51 godog: bounce hhvm on mw1152
08:26 _joe_: restarting a few HHVM instances with a full TC space
05:05 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue May 19 05:03:56 UTC 2015 (duration 3m 55s)
02:46 logmsgbot: LocalisationUpdate completed (1.26wmf6) at 2015-05-19 02:45:17+00:00
23:21 hoo: Reverting my changes to the sites and site_identifiers tables from earlier on... apparently the export/importSites.php maintenance scripts don't work as advertised
23:03 logmsgbot: ori Synchronized php-1.26wmf6/extensions/Echo: 8609cb6b90: Update Echo for cherry-picks (duration: 00m 30s)
23:02 logmsgbot: ori Synchronized php-1.26wmf5/extensions/Echo: 8c619b99a6: Update Echo for cherry-picks (duration: 00m 57s)
22:46 hoo: Updating the sites table on all wikis to reflect the language code change of bhwiki (from bh to bho). I have a backup of the old table from Wikidata in my home, should things go wrong.
20:38 mforns: upgraded and restarted EventLogging server: 19b5b7ae719321c4b8fb112890b574051b090571
11:07 jynus: depooling db1063 from cluster for maintenance
09:02 godog: loss on ulsfo-eqiad, depooled ulsfo
05:18 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon May 18 05:17:50 UTC 2015 (duration 17m 49s)
02:46 logmsgbot: LocalisationUpdate completed (1.26wmf6) at 2015-05-18 02:45:52+00:00
02:42 logmsgbot: l10nupdate Synchronized php-1.26wmf6/cache/l10n: (no message) (duration: 05m 35s)
02:26 logmsgbot: LocalisationUpdate completed (1.26wmf5) at 2015-05-18 02:25:54+00:00
02:21 logmsgbot: l10nupdate Synchronized php-1.26wmf5/cache/l10n: (no message) (duration: 06m 24s)
May 17
05:06 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun May 17 05:05:16 UTC 2015 (duration 5m 15s)
02:44 logmsgbot: LocalisationUpdate completed (1.26wmf6) at 2015-05-17 02:43:13+00:00
02:39 logmsgbot: l10nupdate Synchronized php-1.26wmf6/cache/l10n: (no message) (duration: 05m 18s)
02:25 logmsgbot: LocalisationUpdate completed (1.26wmf5) at 2015-05-17 02:24:09+00:00
02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf5/cache/l10n: (no message) (duration: 06m 10s)
May 16
13:27 manybubbles: that was the last server in the elasticsearch rolling restart. all done. now we have new versions of the plugins. Let's try not to do that again.
13:25 manybubbles: es-tool restart-fast on elastic1031
09:15 godog: bounce hhvm on mw1196
09:10 godog: bounce hhvm on mw1141
07:49 godog: restart hhvm on mw1234, still pushing xhprof metrics
06:03 _joe_: killed nrpe on labvirt1003 - see T99341
05:02 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat May 16 05:01:02 UTC 2015 (duration 1m 1s)
04:11 andrewbogott: restarting sshd and generally poking around on labvirt1003
02:47 logmsgbot: LocalisationUpdate completed (1.26wmf6) at 2015-05-16 02:46:08+00:00
02:43 logmsgbot: l10nupdate Synchronized php-1.26wmf6/cache/l10n: (no message) (duration: 04m 55s)
02:29 logmsgbot: LocalisationUpdate completed (1.26wmf5) at 2015-05-16 02:28:37+00:00
02:25 logmsgbot: l10nupdate Synchronized php-1.26wmf5/cache/l10n: (no message) (duration: 05m 55s)
May 15
22:35 ejegg: updated crm from 03eb4cff1b009e8abaceec250f9a1c5d1f3c6b18 to 7ffe0cefb019828a09c9369187f14518847b5f41
10:19 godog: bounce statsite and uwsgi on graphite1001
09:29 godog: restart carbon on graphite1001
09:15 godog: restart hhvm on mw1018, straggling
09:07 godog: rm MediaWiki.run_init from graphite1001 / graphite2001
09:04 ori: restarted hhvm / jobrunner on jobrunners to force them to pick up I6a516a0da ; re-cleared /var/lib/carbon/whisper/MediaWiki/query_* on graphite1001 and graphite2001
08:49 kart_: Updated cxserver to 1cb6cec
08:21 jynus: reenabling icinga check for MySQL on db1009
08:15 logmsgbot: oblivian Synchronized wmf-config/StartProfiler.php: Null-sync to touch the file (duration: 00m 12s)
07:20 ori: rm -rf /var/lib/carbon/whisper/MediaWiki/query_* on graphite1001 and graphite2001, as follow-up cleanup for I6a516a0da
07:14 logmsgbot: ori Synchronized wmf-config/StartProfiler.php: I6a516a0da: Don't send profiling data to graphite for now (duration: 00m 11s)
06:23 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri May 15 06:22:19 UTC 2015 (duration 22m 18s)
05:38 jynus: temporarily opening mysql port on firewall from db1009 to virt1000
22:20 logmsgbot: ori Synchronized php-1.26wmf6/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php: (no message) (duration: 00m 15s)
21:39 manybubbles: I'm going to be done doing rolling restarts for a couple of hours. If someone wants to pick them up and do the next one after the cluster goes green again then be my guest.
21:35 manybubbles: es-tool restart-fast on elastic1016
21:27 logmsgbot: ori Synchronized php-1.26wmf6/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php: (no message) (duration: 00m 12s)
21:27 logmsgbot: ori Synchronized php-1.26wmf5/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php: (no message) (duration: 00m 12s)
21:14 logmsgbot: ori Synchronized php-1.26wmf6/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php: I3df6713a1: Log request times to StatsD (duration: 00m 13s)
21:14 logmsgbot: ori Synchronized php-1.26wmf5/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php: I3df6713a1: Log request times to StatsD (duration: 00m 15s)
19:43 robh: mass unsubscription in listadmins list, resulting in unsuppressed mass unsubscribe notices to all listadmin email addresses (sorry about the emails!)
15:04 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: Open external links on votewiki in new tab gerrit:210849 (duration: 00m 12s)
22:26 logmsgbot: twentyafterfour Purged l10n cache for 1.26wmf4
22:25 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: Group 0 to 1.26wmf6
22:21 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.26wmf5
22:17 twentyafterfour: restarted phd on iridium (phabricator) to sync the daemons' configuration
21:28 manybubbles: restarting elasticsearch on elastic1005
21:12 cscott: updated OCG to version c7c75e5b03ad9096571dc6dbfcb7022c924ccb4f
21:03 awight: updated payments from f97f8f99268974cfdb0182f178955bd627137842 to e89d18ee20abcb1ca3c455e6a298bf8a6aa84442
20:28 subbu: deployed parsoid version a8108fe6
20:15 manybubbles: restarted elasticsearch on elastic1004
20:12 logmsgbot: twentyafterfour Finished scap: testwiki to php-1.26wmf6 and rebuild l10n cache (duration: 47m 24s)
20:11 manybubbles: cancel that - I just realized I can't do that.
20:10 manybubbles: elastic1003 restarted elasticsearch just fine. the cluster restart is going awesome. I'm going to rig the other 28 to restart via a script, one after the other. Expect nagios to complain about them some.
20:03 bblack: restarting hhvm on mw1190
19:25 logmsgbot: twentyafterfour Started scap: testwiki to php-1.26wmf6 and rebuild l10n cache
19:11 awight: payments rolled back to f97f8f99268974cfdb0182f178955bd627137842
19:10 awight: payments updated from f97f8f99268974cfdb0182f178955bd627137842 to 5c326a521120a904a2012654e9287757dc5a8ca2
19:00 manybubbles: elastic1002 restart went well - starting elastic1003
18:45 awight: rolled back payments to f97f8f99268974cfdb0182f178955bd627137842
18:43 awight: update payments from f97f8f99268974cfdb0182f178955bd627137842 to 5c326a521120a904a2012654e9287757dc5a8ca2
18:05 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: undo all the nostalgia (duration: 00m 10s)
17:14 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: because sometimes moving code helps (duration: 00m 15s)
17:10 manybub|lunch: elastic1002 restarted and rejoined the cluster - now the cluster is repairing. hurray.
17:08 manybub|lunch: elastic1001 restarted and rejoined the cluster happily while I was at lunch. it looks good - no errors beyond the ones we have fixes in flight for. So I'm going to do elastic1002
17:03 hashar: Zuul clone failures solved. Was due to network traffic being interrupted between labs and prod.
16:49 andrewbogott: re-enabling puppet on labnet1001
16:46 mutante: es2010 failed disk, reopening ticket for last fail in January
16:41 jynus: Enabling puppet agent in db1009.eqiad after reinstall
16:40 logmsgbot: ori Synchronized php-1.26wmf4/includes/resourceloader/ResourceLoader.php: I30b490e5b: ResourceLoader::filter: use APC when running under HHVM (duration: 00m 11s)
16:38 logmsgbot: ori Synchronized php-1.26wmf5/includes/resourceloader/ResourceLoader.php: I30b490e5b: ResourceLoader::filter: use APC when running under HHVM (duration: 00m 14s)
16:28 andrewbogott: disabling puppet on labnet1001 to tinker with nova config
15:44 mark: Disregard cr2-knams:xe-0/0/0; we're working on it
15:21 manybubbles: I think the elasticsearch cluster got stuck with allocation disabled after the rolling restart. Funky. Haven't seen that one before. Probably a problem with our instructions. Anyway, unstuck it and recovery is going faster now
23:46 logmsgbot: mattflaschen Synchronized wmf-config: Sync wmf-config for CirrusSearch PoolCounter change; applies to group 0 initially (duration: 00m 12s)
23:37 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings-labs.php: sync InitialiseSettings-labs.php for Browse experiment in mobile (duration: 00m 13s)
16:33 logmsgbot: manybubbles Started scap: SWAT js config vargs changes
15:59 manybubbles: waiting a few minutes after that last set of patches before we're sure that the load is down and then, hopefully, we'll scap to get the core changes that are already merged and sitting on tin that we had to ignore while we handled the traffic spike.
15:53 logmsgbot: manybubbles Synchronized php-1.26wmf4/includes/media/DjVu.php: SWAT: 10 mb djvu files are expensive to thumbnail (wmf4) (duration: 00m 13s)
15:52 logmsgbot: manybubbles Synchronized php-1.26wmf5/includes/media/DjVu.php: SWAT: 10 mb djvu files are expensive to thumbnail (wmf5) (duration: 00m 11s)
15:33 manybubbles: stopping SWAT due to some incident that just picked up. Right now Ib990f00ebe974008cea4dccbaa212ec20c846674 and Ida3fd5f8808202892001f66c4a534c1725e769a6 are merged awaiting a scap.
15:05 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT: send all mediawiki events from all wikis to logstash (duration: 00m 12s)
15:03 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: enable graph extension in beta. this should be a noop (duration: 00m 13s)
14:01 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable arbitrary Wikibase access for nlwiki and frwikisource (duration: 00m 16s)
05:10 ori: upgrading canary appservers to 3.6.1+dfsg1-1+wm2
04:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon May 11 04:53:58 UTC 2015 (duration 53m 57s)
04:17 springle: restarted hhvm on mw1020. lots of fatal noise about N4HPHP13DataBlockFullE
02:43 logmsgbot: LocalisationUpdate completed (1.26wmf5) at 2015-05-11 02:42:42+00:00
02:39 logmsgbot: l10nupdate Synchronized php-1.26wmf5/cache/l10n: (no message) (duration: 05m 37s)
02:23 logmsgbot: LocalisationUpdate completed (1.26wmf4) at 2015-05-11 02:22:25+00:00
02:18 logmsgbot: l10nupdate Synchronized php-1.26wmf4/cache/l10n: (no message) (duration: 06m 19s)
May 10
17:45 ori: App server traffic coincides with spike on S4 dbs, lots of commons sleeper queries, fatal log contains many references to User:Richenza/gallery, so nuking.
21:01 apergos: dumps are interrupted on snapshot1004 while I do a manual run for testing/debugging purposes. please let it run and don't start any other processes on the box, thanks
20:53 bd808: Updated kibana to bb9fcf6 (Merge remote-tracking branch 'upstream/kibana3')
23:12 RoanKattouw: Created shorturls table on knwiki
20:39 logmsgbot: twentyafterfour Purged l10n cache for 1.26wmf3
20:37 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.26wmf5
20:32 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.26wmf4
20:29 apergos: salt upgraded to 2014.7.5 on all precise/trusty/jessie hosts in production except for: labcontrol2001, tin, virt1000 (deferred) and dysprosium/labvirt1005/labstore1002 (down)
20:14 twentyafterfour: ignore all rumors of scap failures, the scaps were successful, with the exception of snapshot1004.eqiad.wmnet which hangs every time
07:20 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I019944f42: Change EventLogging endpoint to /beacon/event (duration: 00m 14s)
06:51 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed May 6 06:50:27 UTC 2015 (duration 50m 26s)
03:14 logmsgbot: LocalisationUpdate completed (1.26wmf4) at 2015-05-06 03:13:28+00:00
03:09 logmsgbot: l10nupdate Synchronized php-1.26wmf4/cache/l10n: (no message) (duration: 08m 46s)
02:46 logmsgbot: LocalisationUpdate completed (1.26wmf3) at 2015-05-06 02:45:26+00:00
02:36 logmsgbot: l10nupdate Synchronized php-1.26wmf3/cache/l10n: (no message) (duration: 10m 46s)
02:27 springle: xtrabackup clone db1060 to db1021
02:04 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I83ad6d060: Remove wmgUseBits setting, now that the migration is complete (duration: 00m 18s)
01:52 logmsgbot: ori Synchronized multiversion/MWWikiversions.php: Ib08e36901: MWWikiversions::readDbListFile: allow single-line ("#" or "//") comments (duration: 00m 18s)
01:40 springle: upgrade db1021 trusty
00:51 springle: schema change running T95179 wikidata, bit unusual, dropping a not-null field
00:46 logmsgbot: bd808 Synchronized wmf-config/CommonSettings.php: Add AffCom user group application contact page on meta 207332 (duration: 00m 20s)
00:45 logmsgbot: bd808 Synchronized docroot/noc/createTxtFileSymlinks.sh: Add AffCom user group application contact page on meta 207332 (duration: 00m 17s)
00:45 logmsgbot: bd808 Synchronized docroot/noc/conf/AffComContactPages.php.txt: Add AffCom user group application contact page on meta 207332 (duration: 00m 15s)
00:44 logmsgbot: bd808 Synchronized wmf-config/AffComContactPages.php: Add AffCom user group application contact page on meta 207332 (duration: 00m 33s)
19:23 yuvipanda: disabled puppet on zookeeper hosts
18:49 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I5978a3910: Update $wgULSFontRepositoryBasePath for post-bits world (duration: 00m 18s)
18:43 logmsgbot: ori Synchronized wmf-config: Ia98fc4c5d: wmgUseBits: false for enwiki (duration: 00m 17s)
18:33 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I2ee277293: wmgUseBits: false for all but enwiki (duration: 00m 13s)
17:50 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Enable graph extension on all wikis except wikidata (duration: 00m 19s)
02:44 logmsgbot: l10nupdate Synchronized php-1.26wmf4/cache/l10n: (no message) (duration: 07m 11s)
02:27 logmsgbot: LocalisationUpdate completed (1.26wmf3) at 2015-05-03 02:26:02+00:00
02:22 logmsgbot: l10nupdate Synchronized php-1.26wmf3/cache/l10n: (no message) (duration: 08m 11s)
May 2
22:16 ori: Deployed change I3bc87f3a5 to fix UBN! bug T97912. Bug was affecting ability to translate messages needed for running upcoming board election.
22:16 logmsgbot: ori Synchronized php-1.26wmf4/extensions/Translate/api/ApiQueryMessageGroups.php: I3bc87f3a5: ApiQueryMessageGroups: mark '_canchange' and '_name' as non-API-metadata (duration: 00m 30s)
22:09 logmsgbot: ori Synchronized php-1.26wmf3/extensions/Translate/api/ApiQueryMessageGroups.php: I3bc87f3a5: ApiQueryMessageGroups: mark '_canchange' and '_name' as non-API-metadata (duration: 00m 31s)
20:25 windowcat: Updated jobrunners to c95d565e242e6fa3706c088ddab1cc6f716408e1
22:19 awight: payments redeployed, revision for payments-wiki changed... from df8aeb5d1c5f595348f77cb56d3975eca19a65a2 to 3ab89e2b14eb449f7ceddf2325493d6235395ecd
22:17 awight: payments rolled back from 3ab89e2b14eb449f7ceddf2325493d6235395ecd to df8aeb5d1c5f595348f77cb56d3975eca19a65a2
22:14 logmsgbot: mattflaschen Started scap: Deploy Flow changes to 1.26wmf4 to facilitate LQT->Flow conversion
22:10 awight: updating payments from df8aeb5d1c5f595348f77cb56d3975eca19a65a2 to 3ab89e2b14eb449f7ceddf2325493d6235395ecd
21:46 awight: update payments from 83d09e09178c634ad35dbb684d1c3aebbb709969 to df8aeb5d1c5f595348f77cb56d3975eca19a65a2
21:05 bd808: Finally got sync-common to run to completion on snapshot1004; runtime 45 minutes!
20:43 legoPanda: renaming <2k users who were missed in the original run (SUL finalization)
19:23 awight: enabling Thank You job
19:23 awight: updated crm from 59f03df6b689ef443cc7b7e31e6f5b2986bc8bc9 to 514e7ea41acd14e1565b31b76621ea840d209e07
19:07 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I93cdc4a2e and I9ee6bec1f: Define $wgAssetsHost based on wmgUseBits; use it to reference standard chrome (duration: 00m 16s)
18:46 Coren: rebooting labstore1002 in anticipation of switch to make sure it starts up cleanly.
18:14 K4-713: disabled Thank You mail send
17:41 bd808: sync-common on snapshot1004 failed after 33 minutes with rsync timeout
23:21 logmsgbot: catrope Synchronized wmf-config/CommonSettings.php: Disable Graph namespace on all wikis except the ones that already have it (duration: 00m 22s)
15:55 logmsgbot: anomie Synchronized wmf-config/abusefilter.php: SWAT: Add abusefilter-modify-restricted right to sysop user group for idwiki gerrit:206080 (duration: 00m 25s)
15:53 logmsgbot: anomie Synchronized php-1.26wmf2/extensions/MobileFrontend: SWAT: Ah, git rebasing was rebasing the reverted commits on top of the revert... (duration: 00m 21s)
15:44 logmsgbot: anomie Synchronized php-1.26wmf2/extensions/MobileFrontend: SWAT: MobileFrontend: API: "editable" is a legacy boolean, don't convert it gerrit:207403 (duration: 00m 23s)
15:43 _joe_: restarting HHVM on mw1132 too, same reason.
15:41 logmsgbot: anomie Synchronized php-1.26wmf3/extensions/MobileFrontend: SWAT: MobileFrontend: API: "editable" is a legacy boolean, don't convert it gerrit:207403 (duration: 00m 37s)
15:40 _joe_: restarting HHVM on mw1232, stuck on __lll_lock_wait from HPHP::StatCache::refresh ()
15:30 logmsgbot: anomie Synchronized php-1.26wmf3/includes/api/ApiResult.php: SWAT: API: ApiResult must validate even when using numeric auto-indexes gerrit:207456 (duration: 00m 26s)
15:05 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Load HTML directly from RESTBase on all wikipedias gerrit:206320 (duration: 00m 17s)
13:03 paravoid: disabling netflows on cr1/2-ulsfo
07:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 29 07:11:38 UTC 2015 (duration 11m 37s)
05:28 logmsgbot: tstarling Synchronized php-1.26wmf3/extensions/SecurePoll: (no message) (duration: 00m 13s)
03:47 logmsgbot: LocalisationUpdate completed (1.26wmf3) at 2015-04-29 03:46:05+00:00
03:40 logmsgbot: l10nupdate Synchronized php-1.26wmf3/cache/l10n: (no message) (duration: 39m 55s)
02:48 springle: killed eight stalled commonswiki.transcode transactions on db1040
02:45 logmsgbot: LocalisationUpdate completed (1.26wmf2) at 2015-04-29 02:43:54+00:00
02:40 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable Wikibase usage tracking on nlwiki and frwikisource (duration: 00m 12s)
02:40 logmsgbot: l10nupdate Synchronized php-1.26wmf2/cache/l10n: (no message) (duration: 25m 50s)
00:38 springle: xtrabackup clone db2029 to db2047
00:38 springle: xtrabackup clone db2028 to db2046
00:20 logmsgbot: gwicke Synchronized wmf-config/InitialiseSettings.php: VE: Load HTML directly from RESTBase for enwiki (duration: 00m 22s)
00:07 logmsgbot: bd808 Synchronized docroot/noc/createTxtFileSymlinks.sh: Revert of AffCom contact form 207328 (duration: 00m 35s)
00:06 logmsgbot: bd808 Synchronized wmf-config/CommonSettings.php: Revert of AffCom contact form 207328 (duration: 00m 19s)
April 28
23:57 logmsgbot: bd808 Synchronized docroot/noc/conf/AffComContactPages.php.txt: Add AffCom user group application contact page on meta 207319 (duration: 00m 28s)
23:51 logmsgbot: bd808 Synchronized wmf-config/CommonSettings.php: Add AffCom user group application contact page on meta 204205 (duration: 00m 11s)
23:50 logmsgbot: bd808 Synchronized docroot/noc/createTxtFileSymlinks.sh: Add AffCom user group application contact page on meta 204205 (duration: 00m 21s)
23:48 logmsgbot: bd808 Synchronized wmf-config/AffComContactPages.php: Add AffCom user group application contact page on meta 204205 (duration: 00m 25s)
23:35 bd808|deploy: mw2031.codfw.wmnet syncing very slowly for SWAT
15:35 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Content Translation in cs, el, kk and zu gerrit:207048 (duration: 00m 27s)
15:31 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Content Translation in cs, el, kk and zu gerrit:207048 (duration: 00m 21s)
05:44 logmsgbot: ori Synchronized php-1.26wmf1/includes/filerepo/file/LocalFile.php: Undo local hack on version that is inactive (1.26wmf1). No-op. (duration: 00m 17s)
05:05 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 24 05:04:42 UTC 2015 (duration 4m 41s)
04:47 logmsgbot: ori Synchronized php-1.26wmf2/includes/filerepo/file/LocalFile.php: Short-circuit LocalFile::loadExtraFromDB in attempt to mitigate outage (duration: 00m 12s)
04:42 springle: killing LocalFile::loadExtraFromDB wholesale on s4
04:32 logmsgbot: ori Synchronized php-1.26wmf1/includes/filerepo/file/LocalFile.php: Short-circuit LocalFile::loadExtraFromDB in attempt to mitigate outage (duration: 00m 14s)
04:25 ori: Did a cluster-wide 'service hhvm restart'.
02:48 logmsgbot: LocalisationUpdate completed (1.26wmf3) at 2015-04-24 02:47:12+00:00
02:44 logmsgbot: l10nupdate Synchronized php-1.26wmf3/cache/l10n: (no message) (duration: 06m 00s)
02:30 logmsgbot: LocalisationUpdate completed (1.26wmf2) at 2015-04-24 02:28:58+00:00
02:25 logmsgbot: l10nupdate Synchronized php-1.26wmf2/cache/l10n: (no message) (duration: 06m 35s)
21:56 ori: Additional (planned) outcome of Ie22658727 and Ice65e7e70: xff log flowing to fluorine, causing bytes-in to climb from ~1.2M/s to ~2.1M/s
21:54 ori: Syncing Ie22658727 and Ice65e7e70 (which introduce new InitialiseSettings vars) in one go caused a small burst of 500s (peaking at 500/sec and lasting a few seconds) on four app servers.
21:42 logmsgbot: ori Synchronized wmf-config: Ie22658727 and Ice65e7e70: use Monolog to configure logging (duration: 00m 15s)
21:04 awight: update payments from 88b9f621bfee1de14a8cdef556a90e5567721754 to 83d09e09178c634ad35dbb684d1c3aebbb709969
19:31 mutante: restarting icinga-wm for config change
18:05 andrewbogott: rebooting labvirt1006
17:51 logmsgbot: kartik Synchronized php-1.26wmf2/extensions/ContentTranslation: (no message) (duration: 00m 15s)
12:29 godog: investigating icinga UNKNOWN for hhvm queue/threads
09:15 godog: restart carbon on graphite1001, replace with carbon-c-relay
08:31 godog: restart carbon on labmon1001, replace with carbon-c-relay
05:22 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 23 05:21:17 UTC 2015 (duration 21m 16s)
02:49 logmsgbot: LocalisationUpdate completed (1.26wmf3) at 2015-04-23 02:48:40+00:00
02:46 logmsgbot: l10nupdate Synchronized php-1.26wmf3/cache/l10n: (no message) (duration: 03m 46s)
02:28 logmsgbot: LocalisationUpdate completed (1.26wmf2) at 2015-04-23 02:27:39+00:00
02:24 logmsgbot: l10nupdate Synchronized php-1.26wmf2/cache/l10n: (no message) (duration: 05m 46s)
00:15 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings.php: Turning on WikiGrok on English Wikipedia (for 2 week test) (duration: 00m 11s)
20:10 logmsgbot: twentyafterfour Started scap: testwiki to php-1.26wmf3 and rebuild l10n cache
20:08 subbu: deployed parsoid version 3311936a
19:51 hashar: Zuul / Jenkins back up and processing the 1+ hour backlog of changes. Will take a while. Multiple causes: Zuul gearmand being stalled on a socket that has no more data to emit and Jenkins being deadlocked due to an IRC plugin
19:44 hashar: Killing Jenkins cause .... we know
19:27 hashar: zuul gearman server is stalled
15:30 gwicke: stopped restbase on restbase1002 in preparation for cmjohnson1 checking the hardware
15:30 logmsgbot: demon Finished scap: 1.26wmf2 was tracking master. should be fixed, being paranoid and doing full sync + i18n rebuild (duration: 08m 11s)
15:21 logmsgbot: demon Started scap: 1.26wmf2 was tracking master. should be fixed, being paranoid and doing full sync + i18n rebuild
15:19 logmsgbot: demon Synchronized php-1.26wmf2/extensions/VisualEditor/: (no message) (duration: 00m 12s)
15:19 logmsgbot: demon Synchronized php-1.26wmf2/extensions/WikiEditor/: (no message) (duration: 00m 11s)
15:12 logmsgbot: demon Synchronized php-1.26wmf1/extensions/WikiEditor/: (no message) (duration: 00m 13s)
13:37 godog: ms-be101[678] weight to 2820
13:25 paravoid: switched eqiad<->ulsfo link to Giglinx
11:11 godog: begin reimaging xenon, cerium and praseodymium
07:39 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 22 07:38:22 UTC 2015 (duration 38m 21s)
07:28 legoktm: SULF is done, post-rename notifications are being sent out on the last large wikis
03:20 logmsgbot: ori Synchronized hhvm-fatal-error.php: I528e5384c: Increment a counter on fatals (duration: 00m 12s)
02:56 logmsgbot: LocalisationUpdate completed (1.26wmf2) at 2015-04-22 02:55:44+00:00
02:50 logmsgbot: l10nupdate Synchronized php-1.26wmf2/cache/l10n: (no message) (duration: 08m 31s)
02:26 logmsgbot: LocalisationUpdate completed (1.26wmf1) at 2015-04-22 02:25:40+00:00
02:22 logmsgbot: l10nupdate Synchronized php-1.26wmf1/cache/l10n: (no message) (duration: 05m 45s)
23:04 logmsgbot: krenair Synchronized php-1.26wmf2/extensions/VisualEditor: https://gerrit.wikimedia.org/r/205774 - should effectively be a no-op until config (duration: 00m 12s)
22:24 robh: disabled a bunch of old rt queues from allowing ticket creation, tired of spam
12:14 hashar: Switching Zuul scheduler on gallium.wikimedia.org to the Debian package version
09:02 hashar: apt-get upgrade on gallium and lanthanum
08:09 godog: reboot ms-be1009, xfs woes
05:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 17 05:47:10 UTC 2015 (duration 47m 9s)
04:36 logmsgbot: aaron Synchronized wmf-config/db-eqiad.php: Set "recentchanges" group for s2-s7 (duration: 00m 11s)
04:33 logmsgbot: aaron Synchronized wmf-config/db-eqiad.php: (no message) (duration: 00m 12s)
03:33 legoktm: restarting forceRenameUsers.php (SUL finalization) on the rest of the small wikis, starting with wm2008wiki
03:26 legoktm: attached CheckUser@dewiki,enwiki,metawiki to CheckUser@global
03:25 legoktm: attached Checkuser@enwiki to Checkuser@global
02:47 logmsgbot: LocalisationUpdate completed (1.26wmf2) at 2015-04-17 02:46:38+00:00
02:43 logmsgbot: l10nupdate Synchronized php-1.26wmf2/cache/l10n: (no message) (duration: 05m 10s)
02:29 logmsgbot: LocalisationUpdate completed (1.26wmf1) at 2015-04-17 02:28:41+00:00
02:25 logmsgbot: l10nupdate Synchronized php-1.26wmf1/cache/l10n: (no message) (duration: 05m 39s)
01:49 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I6fa034f4a: Enable Hovercards by default on Catalan and Greek Wikipedias (T88164) (duration: 00m 12s)
01:41 legoktm: paused forceRenameUsers around wm2008wiki
01:41 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I95f8c010e: Popups: enable as beta feature by default (duration: 00m 12s)
01:37 legoktm: marked "Steward" accounts as not to be renamed (utr_status=11)
01:34 logmsgbot: ori Synchronized php-1.26wmf1/extensions/Popups: Update Popups for Ie4cc455f: Act as a beta feature if so configured (duration: 00m 12s)
01:33 logmsgbot: ori Synchronized php-1.26wmf2/extensions/Popups: Update Popups for Ie4cc455f: Act as a beta feature if so configured (duration: 00m 12s)
01:32 logmsgbot: ori Synchronized wmf-config: I7fde63453: PopUps: disabled by default; requires BetaFeatures if set as beta feature (duration: 00m 11s)
01:31 legoktm: marked "Oversight" accounts as not to be renamed (utr_status=11)
April 16
23:43 logmsgbot: legoktm Synchronized php-1.26wmf2/extensions/Gather/includes/specials/SpecialGather.php: Make Special:Gather show pages for that user https://gerrit.wikimedia.org/r/#/c/204671/ (duration: 00m 13s)
23:27 logmsgbot: legoktm Synchronized php-1.26wmf1/extensions/CentralAuth/includes/CentralAuthUser.php: Fix CentralAuthUser::loadAttached if no accounts are attached (duration: 00m 13s)
23:26 logmsgbot: legoktm Synchronized php-1.26wmf2/extensions/CentralAuth/includes/CentralAuthUser.php: Fix CentralAuthUser::loadAttached if no accounts are attached (duration: 00m 13s)
23:25 logmsgbot: legoktm Synchronized php-1.26wmf2/extensions/Gather/includes/specials/SpecialGather.php: Error in regex broke User lists pages https://gerrit.wikimedia.org/r/#/c/204499/ (duration: 00m 12s)
23:08 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings.php: Set meta namespace on or.wiktionary (duration: 00m 14s)
23:06 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings.php: User rights configuration on ne.wikipedia - Filemover (duration: 00m 11s)
23:05 logmsgbot: legoktm Synchronized wmf-config/: User rights configuration on ne.wikipedia - Abusefilter (duration: 00m 12s)
22:25 legoktm: running forceRenameUsers.php (SUL finalization) on all small wikis
20:37 ori: MediaWiki stats flowing into StatsD again.
20:34 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I31c7b2c3d5: Reset port of $wgStatsdServer to default (8125) (duration: 00m 14s)
20:32 logmsgbot: ori Synchronized php-1.26wmf1/includes/libs/BufferingStatsdDataFactory.php: 3077a66625: Don't bother buffering a counter update with a delta of zero. (duration: 00m 13s)
19:44 blazecat: Updated jobqueue:aggregator:s-wikis:v2 key on 10.64.32.76 to $wgLocalDatabases (sans labswiki)
19:15 paravoid: depooling esams, network issues
18:53 andrewbogott: rebooting labvirt100x to turn on virtualization in bios
18:40 andrewbogott: rebooting labvirt1001
18:38 legoktm: creating "Maintenance script" account on all SUL wikis for globaluserpage
16:46 csteipp: removed oauth-headers.php since that allowed stealing httponly cookies
16:44 logmsgbot: csteipp Synchronized w: (no message) (duration: 00m 11s)
12:35 akosiaris: uploaded etherpad-lite_1.4.1-2 on apt.wikimedia.org
11:58 Krenair: restarted apache on silver, wikitech login seems to work again
11:56 andrewbogott: disabling puppet on virt1000 so that I can prevent a questionable cron (purging tokens from the keystone db) from running while I sleep.
04:00 logmsgbot: legoktm Synchronized php-1.26wmf2/includes/DefaultSettings.php: The 'spambot_username' message is a reserved username (duration: 00m 11s)
03:29 logmsgbot: legoktm Synchronized php-1.26wmf1/includes/DefaultSettings.php: The 'spambot_username' message is a reserved username (duration: 00m 12s)
03:07 bd808: Updated iegreview to e126f7c (Fix aggregated reports to work on the new reviews system)
02:56 logmsgbot: LocalisationUpdate completed (1.26wmf2) at 2015-04-16 02:55:08+00:00
02:52 logmsgbot: l10nupdate Synchronized php-1.26wmf2/cache/l10n: (no message) (duration: 04m 38s)
02:34 legoktm: starting forceRenameUsers.php (SUL finalization) on non-test*wikis
02:33 logmsgbot: LocalisationUpdate completed (1.26wmf1) at 2015-04-16 02:32:26+00:00
02:29 logmsgbot: l10nupdate Synchronized php-1.26wmf1/cache/l10n: (no message) (duration: 05m 57s)
02:24 andrewbogott: but the ‘token’ table is still too big to manage
02:24 andrewbogott: restarted mysql on virt1000 because keystone was stuck. It seems to have helped, eventually
02:24 andrewbogott: restarted keystone and nova-scheduler in a failed attempt to unstick things
02:23 andrewbogott: testing the log by logging a test
April 15
20:28 subbu: deployed parsoid version ac7a01b9
18:25 legoktm: running forceRenameUsers.php (SUL finalization) on test* wikis
19:14 logmsgbot: aaron Synchronized wmf-config/PoolCounterSettings-common.php: Add pool counter config for Translate (duration: 01m 11s)
18:28 legoktm: mw2129.codfw.wmnet still timing out
18:28 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings.php: Enable SandboxLink on all projects where it is a default gadget https://gerrit.wikimedia.org/r/203109 (duration: 01m 06s)
21:09 logmsgbot: gwicke Synchronized wmf-config/InitialiseSettings.php: Make VisualEditor load HTML directly from rest.wikimedia.org on enwiki (duration: 00m 11s)
20:51 logmsgbot: twentyafterfour Purged l10n cache for 1.25wmf23
20:49 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.26wmf1
20:48 logmsgbot: aaron Synchronized wmf-config/db-eqiad.php: Set "recentchanges" query group (duration: 00m 16s)
20:46 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.25wmf24
20:32 cscott: updated Parsoid to version a76bd8a3
20:25 logmsgbot: twentyafterfour Finished scap: testwiki to php-1.26wmf1 and rebuild l10n cache (duration: 25m 38s)
19:59 logmsgbot: twentyafterfour Started scap: testwiki to php-1.26wmf1 and rebuild l10n cache
15:59 logmsgbot: legoktm Finished scap: Log promote to global renames in the global rename log https://gerrit.wikimedia.org/r/202742 (duration: 22m 27s)
20:17 arlolra: updated Parsoid to version d5aa726ebe831e6e7d3343f1dd01d8cc11fba1c3
19:33 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/202094/ - should basically be a no-op for now (duration: 00m 13s)
17:38 nuria: restarted eventlogging to deal with log issues
15:36 logmsgbot: anomie Synchronized wmf-config/: SWAT: Enable ContentTranslation in the Vietnamese and Gujarati Wikipedia, and sync some other changes that naughty people didn't sync themselves but say are safe. (duration: 00m 12s)
18:55 mutante: restarted grrrit-wm for config change
18:40 ori: Restarted nutcracker on HHVM and mw1147 and repooled
18:35 ori: Depooled mw1147. Spamming fluorine:/a/mw-log/memcache-serious.log. Some nutcracker issue most likely.
17:49 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I9c4de264: Send server side eventlogging logs to eventlog1001 instead of vanadium (duration: 00m 11s)
15:14 logmsgbot: kartik Synchronized php-1.25wmf24/extensions/ContentTranslation: (no message) (duration: 00m 14s)
15:14 logmsgbot: kartik Synchronized php-1.25wmf23/extensions/ContentTranslation: (no message) (duration: 00m 17s)
11:14 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: db1049 to normal load (duration: 00m 11s)
16:12 logmsgbot: kartik Started scap: Update ContentTranslation
15:51 manybubbles: actually that last patch seems to be working too. cool. sweet. still running the cirrus script just in case.
15:50 manybubbles: last sync accidentally picked up 'Add 100/106 namespaces to be searched by default at frwiktionary' - that one might require a cirrus script to finish running before it's working properly
15:49 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT Set $wgRestrictDisplayTitle to false at cawikimedia (duration: 00m 11s)
15:42 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT disable mobile ip editing at kowiki 2/2 (duration: 00m 12s)
15:42 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings.php: SWAT disable mobile ip editing at kowiki 1/2 (duration: 00m 11s)
15:15 logmsgbot: manybubbles Synchronized php-1.25wmf23/includes/User.php: SWAT user preferences load from the master by default (duration: 00m 12s)
14:13 hashar: Jenkins: migrated Zuul cloner on Precise labs slaves (100[1-4]) to a version provided by a Debian package. Jobs console output should now show Zuul version: 2.0.0-304-g685ca22-wmf1precise1
12:51 andrewbogott: restarted opendj, pdns on neptunium, nembus, virt1000, labcontrol2001
12:48 paravoid: repooling esams
12:42 paravoid: upgrading junos on mr1-esams
12:15 mark: Shutting down cp3014 for 10G upgrade
11:02 mark: Shutting down cp3012 for 10G upgrade
10:38 _joe_: stopping pybal on lvs2003, running manually to help debugging
23:15 urandom: restarting cassandra on restbase1001
18:39 awight: rollback crm from b4268a60225ae11f2c2b58d3b1f1c44e282f9ec6 to 59f03df6b689ef443cc7b7e31e6f5b2986bc8bc9
18:09 twentyafterfour: mw2213.codfw.wmnet still timing out
18:07 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to $VERSION, mw2213.codfw.wmnet failed, trying one more time
18:06 twentyafterfour: sync_wikiversions failed for host mw2213.codfw.wmnet port 22: Connection timed out
18:04 twentyafterfour: group1 to VERSION=1.25wmf23
18:03 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to VERSION
20:10 gwicke: thinning out old renders in restbase, keeping only the latest per revision; starting with group0, followed by wikipedia once done
16:59 mutante: mount /mnt/data on praseodymium to fix cassandra
14:28 _joe_: restarted mw1034, stuck in HPHP::StatCache::refresh
11:40 godog: reboot ms-be1009, xfs stuck
07:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Mar 27 07:14:37 UTC 2015 (duration 14m 36s)
02:46 logmsgbot: LocalisationUpdate completed (1.25wmf23) at 2015-03-27 02:45:22+00:00
02:43 logmsgbot: l10nupdate Synchronized php-1.25wmf23/cache/l10n: (no message) (duration: 03m 08s)
02:27 logmsgbot: LocalisationUpdate completed (1.25wmf22) at 2015-03-27 02:26:36+00:00
02:24 logmsgbot: l10nupdate Synchronized php-1.25wmf22/cache/l10n: (no message) (duration: 05m 03s)
01:01 awight: updated payments from 32e860bd304763ccedc7110dee828249daa2b154 to f617326761887ed9a9100b472ea3b5736e2c10e6
00:10 gwicke: updated fstab data array name from md2 to md127 on cerium, xenon and praseodymium; naming changed after reboot; should probably use uuid instead
00:02 mutante: remounted /mnt/data on xenon
March 26
23:47 logmsgbot: ebernhardson Synchronized php-1.25wmf22/extensions/EventLogging/: Bump EventLogging in 1.25wmf22 for SWAT (duration: 00m 07s)
23:44 logmsgbot: ebernhardson Synchronized php-1.25wmf23/extensions/EventLogging/: Bump EventLogging in 1.25wmf23 for SWAT (duration: 00m 08s)
23:42 mutante: starting ferm service on holmium
23:40 ejegg: updated dash from 038bdc4c60697ac738eaeae384d91579710ff85a to 5a6b2dda71e6ce76d7bbba853acae8dc9416052c
23:34 mutante: cerium, xenon, praseodymium - stuck at boot because /mnt/data not ready, skipped mounting to reboot
23:26 gwicke: rebooted xenon, cerium, praseodymium to reload the firewall from scratch
23:23 logmsgbot: ebernhardson Synchronized php-1.25wmf22/extensions/Flow: Bump flow submodule in 1.25wmf22 for swat (duration: 00m 08s)
22:20 MaxSem: Created wikigrok_claims and wikigrok_responses tables on wikidatawiki and testwikidatawiki. Before that, accidentally created on enwiki, so had to uncreate.
22:07 ejegg: Re-enabled Jenkins civi jobs
22:02 ejegg: updated civicrm from f8fb0f61531431348f3a8a3ee107056a864d537b to 4c459f3dbf3c3466cdc26a351ba589f4f1aef587
22:01 ejegg: disabled Jenkins civi jobs
20:35 Coren: rebooting labstore2001 to look at its bios
18:25 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to php-1.25wmf22
18:18 twentyafterfour: Starting deployment train: group1 to 1.25wmf22
18:08 legoktm: manually attached User:Secret@enwiki to global
18:00 logmsgbot: demon Synchronized wmf-config/extension-list: (no message) (duration: 00m 12s)
17:56 legoktm: set email for User:ProGTX@global, attached enwiki
01:19 bd808: Updated scap to Ie1d1642 (Have utils.check_php_opening_tag check the file extension suffix)
01:16 mutante: mw2008 rebooting to fix BIOS HT setting
01:16 bd808: Trebuchet error from mw1222 for scap deploy (status code 128), no response from mw2003
01:06 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Idf3491140: Drop support for 75 languages in SyntaxHighlighter_GeSHi (duration: 00m 05s)
01:05 logmsgbot: kaldari Synchronized php-1.25wmf21/extensions/VisualEditor: syncing update to VE to fix mobile (duration: 00m 06s)
14:59 bblack: rebooting cp1072-4, cp3030-49 (none in production)
12:57 _joe_: updating sudo across all production
12:05 _joe_: upgraded libicu48 and mediawiki-math-texvc across the cluster
11:16 YuviPanda: ran chown -R gitpuppet:gitpuppet /var/lib/git/operations/puppet on palladium, fix permission issues
11:14 YuviPanda: ran chown -R gitpuppet:gitpuppet /var/lib/git/operations/puppet on strontium, fix permission issues
11:10 akosiaris: chown gitpuppet:gitpuppet /var/lib/git/operations/puppet/.git/logs/refs/remotes/origin/production on strontium, palladium. Somehow it was owned by root
10:59 godog: depool restbase1006, provisioning
07:40 _joe_: powercycled mw2027, went down with an unresponsive console
07:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Mar 17 07:06:36 UTC 2015 (duration 6m 35s)
04:46 ejegg: updated tools from 84442d51a841af4265ff103827cda83d5dd9dc54 to 9fd0a885e84074f215082aad689649a0684660f9
02:36 logmsgbot: LocalisationUpdate completed (1.25wmf21) at 2015-03-17 02:35:00+00:00
02:34 logmsgbot: l10nupdate Synchronized php-1.25wmf21/cache/l10n: (no message) (duration: 00m 04s)
02:22 logmsgbot: LocalisationUpdate completed (1.25wmf20) at 2015-03-17 02:20:58+00:00
02:20 logmsgbot: l10nupdate Synchronized php-1.25wmf20/cache/l10n: (no message) (duration: 00m 03s)
02:03 ori: applied I98d383a629 locally on mw1017
01:45 logmsgbot: ori scap failed: CalledProcessError Command 'cp -r "/tmp/scap_l10n_0.713383032704/*" "/srv/mediawiki-staging/php-1.25wmf20/cache/l10n"' returned non-zero exit status 1 (duration: 00m 10s)
21:19 akosiaris: installing a non-puppetized version of the puppet cronjob on nescio, sodium. The new well thought out puppet-run cannot run on lucid hosts since https://gerrit.wikimedia.org/r/#/c/196162/ . Given they go away soon, it is better to not do weird puppet tricks to accommodate just 2 old, soon to be deprecated, boxes.
16:06 logmsgbot: anomie Synchronized php-1.25wmf20/extensions/BounceHandler/: SWAT: BounceHandler: Removed repetitive un-subscribe action on a global user gerrit:196878 (duration: 01m 06s)
16:04 logmsgbot: anomie Synchronized php-1.25wmf20/extensions/RestBaseUpdateJobs/: SWAT: RestBaseUpdateJobs: Set HTTP headers as an associative array gerrit:197041 (duration: 01m 03s)
16:00 logmsgbot: anomie Synchronized php-1.25wmf21/extensions/RestBaseUpdateJobs/: SWAT: RestBaseUpdateJobs: Set HTTP headers as an associative array gerrit:197042 (duration: 01m 03s)
15:53 logmsgbot: anomie Synchronized php-1.25wmf21/extensions/BounceHandler/: SWAT: BounceHandler: Removed repetitive un-subscribe action on a global user gerrit:196877 (duration: 01m 04s)
15:33 logmsgbot: anomie Synchronized php-1.25wmf21/extensions/Flow/: SWAT: Flow: base href fix and dependency gerrit:196996 (duration: 01m 10s)
15:29 logmsgbot: anomie Synchronized php-1.25wmf20/includes/Html.php: SWAT: Fix for mediawiki.ui style for wpTextbox1 and wpSummary in preview if text includes inputbox element gerrit:196897 (duration: 01m 03s)
15:25 logmsgbot: anomie Synchronized php-1.25wmf21/includes/Html.php: SWAT: Fix for mediawiki.ui style for wpTextbox1 and wpSummary in preview if text includes inputbox element gerrit:196896 (duration: 01m 03s)
15:23 logmsgbot: anomie Synchronized php-1.25wmf21/includes/Html.php: SWAT: Fix for mediawiki.ui style for wpTextbox1 and wpSummary in preview if text includes inputbox element gerrit:196896 (duration: 01m 03s)
21:02 logmsgbot: twentyafterfour Started scap: Sync security patches
20:52 mutante: cp1061 repooled in pybal
20:44 logmsgbot: mobrovac Synchronized wmf-config/CommonSettings.php: Activate the RESTBase Virtual REST Service on test.wp (duration: 00m 06s)
20:43 logmsgbot: mobrovac Synchronized wmf-config/InitialiseSettings.php: Activate the RESTBase Virtual REST Service on test.wp (duration: 00m 07s)
20:42 logmsgbot: twentyafterfour Finished scap: testwiki to php-1.25wmf21 and rebuild l10n cache (duration: 20m 59s)
20:21 logmsgbot: twentyafterfour Started scap: testwiki to php-1.25wmf21 and rebuild l10n cache
20:11 subbu: deployed parsoid sha 73bf3162
19:49 mutante: cp1061 - comment in pybal, reinstalling
18:44 mutante: cp1053 - reinstalling, PXE boot
18:31 mutante: cp1053 - comment in pybal for reinstall
18:09 twentyafterfour: branching wmf/1.25wmf21
17:18 Coren: trying other ways to restart uwsgi on labmod1001
16:21 logmsgbot: catrope Synchronized php-1.25wmf20/extensions/VisualEditor/: Update and unbreak VE (duration: 00m 06s)
05:47 YuviPanda: testing sync-file to make sure I didn’t break anything
05:47 logmsgbot: yuvipanda Synchronized README: (no message) (duration: 00m 07s)
02:31 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Mar 11 02:30:39 UTC 2015 (duration 30m 38s)
02:11 logmsgbot: ori Synchronized php-1.25wmf20/extensions/WikimediaEvents: 2nd iteration of HTTPS test (duration: 00m 05s)
02:11 logmsgbot: ori Synchronized php-1.25wmf19/extensions/WikimediaEvents: 2nd iteration of HTTPS test (duration: 00m 05s)
02:07 logmsgbot: LocalisationUpdate completed (1.25wmf20) at 2015-03-11 02:05:58+00:00
02:05 logmsgbot: l10nupdate Synchronized php-1.25wmf20/cache/l10n: (no message) (duration: 00m 01s)
02:05 logmsgbot: LocalisationUpdate completed (1.25wmf19) at 2015-03-11 02:04:22+00:00
02:04 logmsgbot: l10nupdate Synchronized php-1.25wmf19/cache/l10n: (no message) (duration: 00m 02s)
01:22 bblack: reinstalling cp4007 + cp4015
00:55 logmsgbot: ori Synchronized docroot/foundation/misc/blank.gif: (no message) (duration: 00m 05s)
March 10
23:31 logmsgbot: ebernhardson Synchronized php-1.25wmf19/extensions/RestBaseUpdateJobs/: Update RestBaseUpdateJobs to master in 1.25wmf19 (duration: 00m 09s)
23:30 logmsgbot: ebernhardson Synchronized php-1.25wmf20/extensions/RestBaseUpdateJobs: Update RestBaseUpdateJobs to master in 1.25wmf20 (duration: 00m 06s)
23:19 logmsgbot: ebernhardson Synchronized php-1.25wmf19/extensions/Flow: Bump flow submodule in 1.25wmf19 for SWAT (duration: 00m 07s)
23:17 logmsgbot: ebernhardson Synchronized php-1.25wmf20/extensions/Flow: Bump flow submodule in 1.25wmf20 for SWAT (duration: 00m 08s)
21:27 andrewbogott: erased some api-feature-usage.logs from fluorine to make breathing room; merged a patch that will purge _all_ such logs older than 90 days.
21:16 mutante: cp1057 - repooled, all bits eqiad are jessie now
20:31 mutante: cp1057 - disabled in pybal, reinstalling
20:10 twentyafterfour: finished train deployment, logs look ok
19:45 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.25wmf20 for real this time
19:44 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.25wmf20
19:18 gwicke: re-enabled puppet on cerium, xenon and praseodymium
18:27 twentyafterfour: starting the Tuesday "train" deployment
17:51 mutante: cp1056 - disabled in pybal, reboot to PXE for reinstall
22:29 tgr: doing an extensions/GlobalUsage/refreshGlobalimagelinks.php --pages=nonexistent test run on aawiki
20:29 logmsgbot: mobrovac Synchronized wmf-config/CommonSettings.php: Set the correct RESTBase server for the RESTBaseUpdateJobs extension (duration: 00m 07s)
20:28 logmsgbot: mobrovac Synchronized wmf-config/InitialiseSettings.php: Enable the RESTBaseUpdateJobs extension on testwiki (duration: 00m 06s)
20:09 arlolra: updated Parsoid to version c8370a480636c3a0d47ed5090dd29efcb72591e2
16:14 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT: CX: Publish translations to the Main namespace by default (duration: 00m 05s)
16:03 bblack: repooled cp301[48] in pybal
15:31 akosiaris: restarted phd (phabricator daemon) on iridium
16:39 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - Change templateeditor user group rights on fawiki (duration: 00m 07s)
16:35 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - Set $wgBabelCategoryNames true at outreachwiki (duration: 00m 06s)
05:29 logmsgbot: tstarling Started scap: Ieb27df7ef470cbda06b5b0f5bfb372bd7279c183
05:29 Tim: on tin: updating deployment branches for Ieb27df7ef470cbda06b5b0f5bfb372bd7279c183
March 1
06:23 andrewbogott: logging a test to test the logging
02:17 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Mar 1 02:16:24 UTC 2015 (duration 16m 23s)
02:06 logmsgbot: LocalisationUpdate completed (1.25wmf19) at 2015-03-01 02:05:02+00:00
02:04 logmsgbot: l10nupdate Synchronized php-1.25wmf19/cache/l10n: (no message) (duration: 00m 01s)
02:04 logmsgbot: LocalisationUpdate completed (1.25wmf18) at 2015-03-01 02:03:30+00:00
02:03 logmsgbot: l10nupdate Synchronized php-1.25wmf18/cache/l10n: (no message) (duration: 00m 01s)
00:56 gwicke: stopped cassandra on cerium and praseodymium temporarily for testing
February 28
02:18 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Feb 28 02:17:47 UTC 2015 (duration 17m 46s)
02:06 logmsgbot: LocalisationUpdate completed (1.25wmf19) at 2015-02-28 02:05:51+00:00
02:05 logmsgbot: l10nupdate Synchronized php-1.25wmf19/cache/l10n: (no message) (duration: 00m 01s)
02:05 logmsgbot: LocalisationUpdate completed (1.25wmf18) at 2015-02-28 02:04:14+00:00
02:04 logmsgbot: l10nupdate Synchronized php-1.25wmf18/cache/l10n: (no message) (duration: 00m 01s)
00:16 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: If2704b0f7: Change metric prefix from 'mw' back to 'MediaWiki', for back-compat (duration: 00m 06s)
February 27
23:47 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I8fa0649ab: Set $wgUDPProfilerPort back to 8125 (duration: 00m 06s)
23:33 ori: pushing a config change to txstatsd on graphite1001, the service may complain briefly
22:19 logmsgbot: reedy Synchronized docroot and w: no-op for dbtree (already reverted by prior deploy) (duration: 00m 05s)
17:07 legoktm: running CentralAuth's migratePass0.php on all wikis
14:59 hoo: Ran mysql:wikiadmin@db1033 [metawiki]> UPDATE ipblocks SET ipb_deleted = 1 WHERE ipb_id = 16659; to actually suppress a suppressed name
08:41 andrewbogott: upgraded virt1012 to Trusty; starting all instances
06:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Feb 27 06:24:00 UTC 2015 (duration 23m 59s)
05:41 andrewbogott: upgrading virt1012 to Trusty because labs networking failed twice in two hours, and how could it be worse?
02:19 logmsgbot: LocalisationUpdate completed (1.25wmf19) at 2015-02-27 02:18:19+00:00
02:18 logmsgbot: l10nupdate Synchronized php-1.25wmf19/cache/l10n: (no message) (duration: 00m 01s)
02:17 logmsgbot: LocalisationUpdate completed (1.25wmf18) at 2015-02-27 02:16:14+00:00
02:16 logmsgbot: l10nupdate Synchronized php-1.25wmf18/cache/l10n: (no message) (duration: 00m 02s)
01:40 springle: switch db1046 to master of m4 (eventlogging). deployed dbproxy1004 with m4-master CNAME
21:55 gwicke: disabled puppet on cassandra test hosts cerium and praseodymium as well (in addition to xenon) to manually fix incompatible puppet config & re-initialize cluster after cluster name change; see https://phabricator.wikimedia.org/T90955 for upgrade to jessie
21:00 ^d: mw1161 is complaining about permissions on setting mtime during rsync
20:40 gwicke: issue with cassandra test cluster is actually that it's still running cassandra 2.1.2, which is incompatible with the current puppet config; should probably update the test cluster to jessie soon
20:38 gwicke: cassandra on test cluster seems to be broken, investigating
20:16 gwicke: disabled puppet on xenon to test bulk db creation with restbase
02:19 logmsgbot: LocalisationUpdate completed (1.25wmf18) at 2015-02-24 02:18:03+00:00
02:17 logmsgbot: l10nupdate Synchronized php-1.25wmf18/cache/l10n: (no message) (duration: 00m 01s)
02:17 logmsgbot: LocalisationUpdate completed (1.25wmf17) at 2015-02-24 02:16:30+00:00
02:16 logmsgbot: l10nupdate Synchronized php-1.25wmf17/cache/l10n: (no message) (duration: 00m 02s)
01:52 logmsgbot: ori Synchronized php-1.25wmf17/extensions/MobileFrontend/includes/modules/MobileUserModule.php: Reverting live-hack (duration: 00m 07s)
01:47 logmsgbot: ori Synchronized php-1.25wmf17/extensions/MobileFrontend/includes/modules/MobileUserModule.php: Testing a theory for T90411 with a live-hack to MobileFrontend. Will revert momentarily. (duration: 00m 07s)
00:56 Tim: on osmium, removing the packages I just installed since I will do it in a chroot instead
00:51 logmsgbot: twentyafterfour Synchronized php-1.25wmf18/cache/l10n: (no message) (duration: 00m 03s)
00:43 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 07s)
00:10 logmsgbot: demon Synchronized php-1.25wmf18/extensions/MultimediaViewer: (no message) (duration: 00m 09s)
00:10 logmsgbot: demon Synchronized php-1.25wmf17/extensions/MultimediaViewer: (no message) (duration: 00m 06s)
February 23
23:28 Tim: on osmium installing packages necessary for building hhvm
21:06 subbu: deployed parsoid version d9ac8c21
20:43 awight: update crm from f594a66694d52af1c604b1813ac94e9592b6c81e to 3c002f32e04652ae56a4fe791bc6158ab981ed8d
18:19 ^d: created education program tables for hewiktionary
23:06 logmsgbot: ori Synchronized php-1.25wmf17/extensions/WikimediaEvents: (no message) (duration: 00m 06s)
23:06 logmsgbot: ori Synchronized php-1.25wmf17/extensions/VisualEditor: (no message) (duration: 00m 06s)
23:05 logmsgbot: ori Synchronized php-1.25wmf18/extensions/WikimediaEvents: (no message) (duration: 00m 07s)
23:05 logmsgbot: ori Synchronized php-1.25wmf18/extensions/VisualEditor: (no message) (duration: 00m 06s)
21:57 logmsgbot: hoo Synchronized php-1.25wmf18/extensions/Wikidata/: Update Wikibase to fix langlink updates in the client API et al (duration: 00m 12s)
21:57 logmsgbot: hoo Synchronized php-1.25wmf17/extensions/Wikidata/: Update Wikibase to fix langlink updates in the client API et al (duration: 00m 14s)
21:28 gwicke: restbase now up on all live (3 of 6) prod nodes
21:20 gwicke: cleanly re-initialized prod cassandra cluster after puppet run; picked up local dc from property file
20:49 chasemp: restart ntp on mw1009
20:15 mutante: readding mw1062 to puppet, signing new cert and salt-key
19:54 mutante: reinstalling mw1062 after disk has been replaced
02:11 mutante: restbase1004/1005 systemctl daemon-reload to run systemd-sysv-generator to make it create missing unit for restbase and unbreak puppet running the service
01:12 logmsgbot: ejegg Synchronized wmf-config/CommonSettings.php: Use URLs without mobile redirects for CentralNotice (duration: 00m 07s)
00:54 logmsgbot: demon Finished scap: global user page extension-list fix + l10n rebuild (duration: 15m 21s)
00:39 AaronS: Deleted labswiki redis jobs (labswiki uses the db queue) for GlobalUserPage and flushed the queue aggregator
00:38 logmsgbot: demon Started scap: global user page extension-list fix + l10n rebuild
00:33 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 06s)
00:26 logmsgbot: demon Synchronized php-1.25wmf17/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: (no message) (duration: 00m 06s)
23:25 logmsgbot: twentyafterfour Purged l10n cache for 1.25wmf16
23:23 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.25wmf18
23:23 ori: HHVM on mw1141 locked up (threads stuck in __lll_lock_wait). Depooling for further investigation.
23:17 logmsgbot: twentyafterfour Finished scap: testwiki to php-1.25wmf18 and rebuild l10n cache (duration: 42m 58s)
22:34 logmsgbot: twentyafterfour Started scap: testwiki to php-1.25wmf18 and rebuild l10n cache
22:30 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: rollback group0 to 1.25wmf17
22:28 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.25wmf18
22:24 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.25wmf17
21:41 subbu: deployed parsoid version 17f68256
19:24 bd808|LUNCH: pruned stale members from trebuchet minions set for scap/scap: redis-cli srem "deploy:scap/scap:minions" fenari.wikimedia.org virt0.wikimedia.org nickel.wikimedia.org searchidx1001.eqiad.wmnet
19:01 godog: restart txstatsd on graphite1001 to flush old metrics
18:40 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Id215ff962: Change $wgUDPProfilerPort to 8135. (duration: 00m 05s)
17:58 _joe_: fixed scap on mw1158, moving /srv/deployment/scap away made puppet perform the redeploy
17:56 _joe_: fixed scap on mw1154, moving /srv/deployment/scap away made puppet perform the redeploy
17:51 bd808: fixing scap on mw1158 and mw1154 will take a root to fix bad trebuchet git clones -- cd /srv/deployment/scap; sudo mv scap scap-broken; sudo salt-call deploy.fetch 'scap/scap'; sudo salt-call deploy.checkout 'scap/scap'
17:22 _joe_: shutting down mc1014, moving to a different rack
17:19 _joe_: mw1158 and mw1154 report broken python imports during scap
22:37 csteipp: deploy fixes for T85850, T88310, T85855
22:13 ejegg: updated payments-wiki-staging from ce73ed11de9775a596c51acdc036503751961bc8 to cbaf66e7705789f37117ec6edc4d936c6174d511
21:42 hoo: Set email for dewiki account "Ar-ras" to the email of the commons account with the same name
20:41 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to $VERSION
19:57 logmsgbot: ori Synchronized wmf-config/StartProfiler.php: I6fbd48e6b: Revert "Revert "Revert "Use ProfilerSectionOnly to handle DB/filebackend entries and the like""" (duration: 00m 05s)
19:15 logmsgbot: yurik scap failed: OSError [Errno 2] No such file or directory: '/var/lock/scap' (duration: 33m 42s)
18:52 andrewbogott: cold-migrating all instances from virt1005 to virt1012
18:41 logmsgbot: yurik Started scap: (no message)
18:23 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ie5879ec6a: Set $wgUDPProfilerPort to 8125 (duration: 00m 07s)
18:21 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Icd6766440: Correct StatsFormatString so it emits valid statsd data (duration: 00m 07s)
18:02 andrewbogott: adding virt1012 to the nova virt pool
17:25 andrewbogott: powering down virt1005, waiting a few seconds, power on
17:05 logmsgbot: marktraceur Synchronized php-1.25wmf17/tests/phpunit/includes/StatusTest.php: [SWAT] [wmf17] Make sure Commons file deletion is still working later today (duration: 00m 06s)
17:04 logmsgbot: marktraceur Synchronized php-1.25wmf17/includes/Status.php: [SWAT] [wmf17] Make sure Commons file deletion is still working later today (duration: 00m 06s)
16:55 logmsgbot: marktraceur Synchronized php-1.25wmf17/includes/filerepo/FileRepo.php: [SWAT] [wmf17] Make sure Commons uploading is still working later today (duration: 00m 05s)
16:52 _joe_: upgrading testwiki to use www-data, may cause a brief downtime
23:06 logmsgbot: awight Synchronized wmf-config: Set up a new debug logging group for T89258 (take 2) (duration: 00m 06s)
22:56 logmsgbot: awight Synchronized wmf-config: Set up a new debug logging group for T89258 (duration: 00m 06s)
22:53 logmsgbot: awight Synchronized php-1.25wmf17/extensions/CentralNotice: CentralNotice fixes for T89258 and T45250 (duration: 00m 06s)
22:52 logmsgbot: awight Synchronized php-1.25wmf16/extensions/CentralNotice: CentralNotice fixes for T89258 and T45250 (duration: 00m 07s)
21:38 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: debug time over (duration: 00m 05s)
21:37 logmsgbot: demon Synchronized php-1.25wmf16/includes/resourceloader/ResourceLoaderImage.php: debug time over (duration: 00m 05s)
21:24 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Debug fun (duration: 00m 05s)
21:22 logmsgbot: demon Synchronized php-1.25wmf16/includes/resourceloader/ResourceLoaderImage.php: Debug fun (duration: 00m 05s)
18:17 robh: morebots, you doing yer thing?
15:59 godog: es-tool restart-fast on elastic1011
15:08 godog: correction, elastic1010
15:08 godog: es-tool restart-fast on elastic1019
14:41 godog: es-tool restart-fast on elastic1009
13:25 hoo: Started rebuildItemsPerSite for wikidata on terbium
11:34 godog: restart elasticsearch on logstash1001 logstash1002 logstash1003
11:26 paravoid: mw1095/mw1192: service hhvm restart, alerts for 10h30/9h35 respectively
11:18 godog: es-tool restart-fast on elastic1008
04:56 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Feb 13 04:55:10 UTC 2015 (duration 55m 9s)
02:32 logmsgbot: LocalisationUpdate completed (1.25wmf17) at 2015-02-13 02:31:34+00:00
02:31 logmsgbot: l10nupdate Synchronized php-1.25wmf17/cache/l10n: (no message) (duration: 00m 01s)
02:19 ori: ran redis commands 'HDEL jobqueue:aggregator:h-queue-types:v2 LocalGlobalUserPageCacheUpdateJob/labswiki' and 'HDEL jobqueue:aggregator:h-queue-types:v2 LocalGlobalUserPageCacheUpdateJob' on rdb1001
02:18 logmsgbot: LocalisationUpdate completed (1.25wmf16) at 2015-02-13 02:17:27+00:00
02:17 logmsgbot: l10nupdate Synchronized php-1.25wmf16/cache/l10n: (no message) (duration: 00m 01s)
02:04 logmsgbot: legoktm Synchronized wmf-config/CommonSettings.php: Set ['LocalGlobalUserPageCacheUpdateJob'] = 'NullJob' to clear queues (duration: 00m 06s)
19:18 logmsgbot: demon Synchronized php-1.25wmf16/extensions/CentralNotice/special/SpecialBannerRandom.php: rm live hack leftovers, now being worked on (duration: 00m 05s)
18:47 logmsgbot: demon Synchronized php-1.25wmf16/extensions/CentralNotice/special/SpecialBannerRandom.php: rm live hack, have our data (duration: 00m 06s)
18:44 logmsgbot: demon Synchronized php-1.25wmf16/extensions/CentralNotice/special/SpecialBannerRandom.php: live hack (duration: 00m 08s)
14:15 Krenair: Manually logged a missing cross-wiki rights log change entry on meta "Avraham changed group membership for User:Bencmq@zhwiki from bureaucrat, check user and administrator to bureaucrat and administrator (requested)". See T89205 for details
11:28 godog: es-tool restart-fast on elastic1006
10:12 hashar: gallium and lanthanum: dpkg --purge locate
10:09 hashar: gallium: uninstalling locate package from gallium. Has been installed on 2015-01-30 00:31:39 apparently manually by root@iron.wikimedia.org
10:02 godog: es-tool restart-fast on elastic1005
08:28 hashar: puppet-lint now complains on error (not warnings) \O/ {{bug:T87132}}
04:54 springle: broke puppet db grant. fixed puppet db grant
04:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Feb 12 04:52:16 UTC 2015 (duration 52m 15s)
04:09 springle: sign puppet cert dbproxy1003, first run
02:48 hoo: Manually logged a missing global rights log change entry on meta "Ajraddatz changed global group membership for Benoit Rochon from (none) to OTRS-member with the following comment: request". See also T89205
02:27 logmsgbot: LocalisationUpdate completed (1.25wmf16) at 2015-02-11 02:26:55+00:00
02:26 logmsgbot: l10nupdate Synchronized php-1.25wmf16/cache/l10n: (no message) (duration: 00m 02s)
02:13 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-02-11 02:12:40+00:00
02:12 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 02s)
01:56 logmsgbot: krenair Synchronized php-1.25wmf16/includes/UserRightsProxy.php: https://gerrit.wikimedia.org/r/#/c/189879/ - same thing for interwiki user rights logs (duration: 00m 07s)
01:52 logmsgbot: krenair Synchronized php-1.25wmf16/extensions/CentralAuth/includes/CentralAuthGroupMembershipProxy.php: https://gerrit.wikimedia.org/r/#/c/189888/ - fix lack of global group membership change logging (duration: 00m 05s)
01:44 springle: puppet disabled on labsdb1001 labsdb1002. needs restart
16:24 logmsgbot: anomie Synchronized wmf-config/CommonSettings.php: SWAT: Revert "Whitelist application/x-gzip on private wikis to fully allow dia files", wasn't a correct fix for the issue (duration: 00m 05s)
16:12 logmsgbot: marktraceur Synchronized php-1.25wmf16/extensions/OAuth/: [SWAT] [wmf16] OAuth: Support ListDefinedTags and ChangeTagsListActive hooks (duration: 00m 11s)
15:00 cmjohnson1: cp1070 down for h/w troubleshooting. Already depooled by bblack
11:58 godog: bounce mwprof-profiler-to-carbon on tungsten
10:47 hoo: Manually removed wikidatawiki.wb_changes_dispatch entries for test wikis (test2wiki, testwiki, testwikidata).
09:11 gwicke: cassandra load testing on xenon, praseodymium and cerium; disk space is tight, might run out on one of those boxes but they are purely test boxes right now, so np
05:32 gwicke: stopped puppet on cerium, praseodymium & xenon
03:50 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Feb 9 03:48:57 UTC 2015 (duration 48m 56s)
02:13 logmsgbot: LocalisationUpdate completed (1.25wmf16) at 2015-02-09 02:12:45+00:00
02:12 logmsgbot: l10nupdate Synchronized php-1.25wmf16/cache/l10n: (no message) (duration: 00m 02s)
02:12 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-02-09 02:11:13+00:00
02:11 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 01s)
February 8
03:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Feb 8 03:52:49 UTC 2015 (duration 52m 48s)
02:14 logmsgbot: LocalisationUpdate completed (1.25wmf16) at 2015-02-08 02:13:11+00:00
02:13 logmsgbot: l10nupdate Synchronized php-1.25wmf16/cache/l10n: (no message) (duration: 00m 01s)
02:12 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-02-08 02:11:41+00:00
02:11 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 01s)
February 7
15:35 apergos: started nginx on dataset1001, it was not running for some reason
09:40 bblack: depooled cp1070 in pybal
09:33 bblack: rebooting cp1070 (dead network, dead console)
05:10 subbu: deployed parsoid hotfix 8ca7ef40 (cherry-pick of 447a0565)
04:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Feb 7 04:47:30 UTC 2015 (duration 47m 29s)
03:13 gwicke: restarting parsoid cluster
02:34 logmsgbot: LocalisationUpdate completed (1.25wmf16) at 2015-02-07 02:33:09+00:00
02:33 logmsgbot: l10nupdate Synchronized php-1.25wmf16/cache/l10n: (no message) (duration: 00m 01s)
02:20 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-02-07 02:19:09+00:00
02:19 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 02s)
02:11 qchris: Ran kafka leader re-election as analytics1021 dropped out of its partition leader role.
01:48 bblack: leaving cp1064 (jessie upload eqiad) pooled front+back. it's experimental but looks stable. if upload-related 503 spikes and I'm not around, feel free to depool it.
00:18 qchris: Manually bumped heap for the Hadoop namenodes and revived them after both ran out of heap and did not come back.
19:37 logmsgbot: reedy Started scap: testwiki to 1.25wmf16
19:09 legoktm: clearing bad sidebar memcache entries on commonswiki
18:08 _joe_: restarting nutcracker on jobrunners
18:06 _joe_: restarting nutcracker on api appservers
18:01 ori: restarting nutcracker on all appservers
17:54 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I4f28205e6: Set $wmgUseMonologLogger to false (duration: 00m 06s)
17:47 logmsgbot: ori Synchronized wmf-config/logging.php: Live hack: disable Logstash logging on suspicion that it is acting up (duration: 00m 05s)
17:34 paravoid: restarting HHVM on all appservers/API appservers in 10%/6s batches
17:26 bblack: repooled cp1063 frontend-only
16:21 godog: bounce jmxtrans on analytics1018, analytics1021 and analytics1022
16:15 godog: bounce jmxtrans on analytics1012
16:03 godog: re-enabled puppet on graphite1001, bounce uwsgi
14:34 godog: upload txstatsd 1.0.0-3 to trusty-wikimedia
12:42 paravoid: cp*/amssq*: salt rm /etc/logrotate.d/varnishkafka-frontend-stats to fix cronspam
12:30 hashar: Upgrading Jenkins and restarting it
06:43 springle: upgrade silver to mariadb 10
04:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Feb 5 04:54:00 UTC 2015 (duration 53m 59s)
02:37 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ia59e654e8: Set a statsd-compatible $wgStatsFormatString (duration: 00m 07s)
02:35 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-02-05 02:34:02+00:00
02:34 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 01s)
02:20 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-02-05 02:19:31+00:00
02:19 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 02s)
01:41 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I7b270eb8a: Set $wgUDPProfilerHost to service alias rather than hard-code IP (duration: 00m 05s)
00:54 bd808: truncated redis input queues for logstash on all 3 hosts to see if cluster can keep up now with 3 elasticsearch writer threads
00:08 Krinkle: Added 'dduvall' to integration group ACL on Gerrit
00:06 springle: xtrabackup clone virt1000 to silver
February 4
23:38 mutante: starting memcached on virt1000
23:21 qchris: Manual failover of Hadoop namenode from analytics1001 to analytics1002, as analytics1001 had Heap space errors
22:50 ejegg: updated payments from 1e9b78e9a8bf557a710988620bd6f1a335787173 to cbaf66e7705789f37117ec6edc4d936c6174d511
22:49 manybubbles: this is certainly a bug in Elasticsearch, but I imagine it's one solved in newer versions. I hope, more like.
22:49 manybubbles: not sure what happened but now space is freeing up on 1001. the disk was never in danger of filling up but it was full enough not to allocate more to it. Now that stuff is allocating elsewhere elasticsearch is clearing the used space.
22:41 manybubbles: looks like elastic1001 doesn't have much free space left. I think that might have something to do with this....
22:38 manybubbles: Elasticsearch wasn't initializing shards to elastic1001 after its restart. Didn't check why. Set allocation to primaries then back to all and that unstuck it.
21:16 arlolra: updated Parsoid to version dd4721f4
20:33 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files: I4fb67945b: Revert "[Regression] Revert "Non wikipedias to 1.25wmf15"
20:18 logmsgbot: aude Synchronized wmf-config/Wikibase.php: set useLegacyChangesSubscription to true for Wikidata (duration: 00m 07s)
18:30 godog: bounce txstatsd on cache hosts in eqiad
18:17 godog: bounce txstatsd on cache hosts in ulsfo
18:08 godog: bounce txstatsd on cache hosts in esams
17:30 logmsgbot: marktraceur Synchronized php-1.25wmf14/extensions/UploadWizard/: Touching pretty much everything in UploadWizard, maybe it will help (duration: 00m 07s)
17:22 logmsgbot: marktraceur Synchronized php-1.25wmf14/extensions/UploadWizard/resources/mw.UploadWizard.js: Touch an UploadWizard file to try and fix caching (duration: 00m 07s)
16:58 robh: replacing the intermediary cert on dumps.w.o (so nginx will flap on it shortly)
16:56 godog: restart ES on elastic1001
15:43 logmsgbot: marktraceur Synchronized php-1.25wmf14/extensions/UploadWizard/resources/controller/uw.controller.Upload.js: Touch an UploadWizard file to try and fix caching (duration: 00m 07s)
15:25 logmsgbot: marktraceur Synchronized php-1.25wmf14/extensions/UploadWizard/resources/controller/uw.controller.Upload.js: Touch an UploadWizard file to try and fix caching (duration: 00m 05s)
15:22 godog: graphite move close to completion, updating dashboards
15:16 godog: bounce diamond in batches in eqiad
14:50 logmsgbot: marktraceur Synchronized php-1.25wmf15/extensions/UploadWizard/resources/controller/uw.controller.Upload.js: Touch an UploadWizard file to try and fix caching (duration: 00m 05s)
14:14 godog: bounce webperf-related services on hafnium too: ve, statsd-mw-js-deprecate, statsv, asset-check
14:10 godog: bounce navtiming on hafnium to pick up dns changes
12:42 godog: stop bacula-fd on tungsten, backups running during migration
12:41 _joe_: installing the new HHVM package on jobrunners
12:28 godog: bounce txstatsd on ms-fe*
12:28 godog: bounce txstatsd on ms-be*
12:00 godog: bounce diamond in batches in ulsfo
11:57 godog: bounce diamond in batches in esams
11:51 godog: bounce mwprof on tungsten to force picking up dns changes
11:35 _joe_: installing the new hhvm package on api, one at a time
08:20 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: Ie7b32e3d8: Add log group for T87645 (duration: 00m 05s)
08:19 logmsgbot: ori Synchronized php-1.25wmf14/includes/EditPage.php: Id376f9e75: Hack for T87645, since maybe it is still happening (duration: 00m 07s)
08:17 logmsgbot: ori Synchronized php-1.25wmf15/includes/EditPage.php: Id376f9e75: Hack for T87645, since maybe it is still happening (duration: 00m 05s)
08:14 paravoid: radium: upgrade tor to the latest torproject.org version
08:10 springle: wikitech mysql restart to fix novaold errors
05:14 springle: wikitech virt1000 test db dump T88311
04:49 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Feb 3 04:48:38 UTC 2015 (duration 48m 37s)
02:31 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-02-03 02:30:18+00:00
02:30 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 02s)
02:19 mutante: installing package upgrades on radium
02:17 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-02-03 02:16:37+00:00
02:16 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 03s)
01:59 bblack: depool cp1065 (text eqiad in pybal -> jessie)
01:59 bd808: Manually created apifeatureusage-2015.02.02 and apifeatureusage-2015.02.03 indices in elasticsearch; cluster needs rolling restart for autocreate to work for these names
01:51 bd808: restarted logstash on logstash1001
01:51 bd808: restarted elasticsearch on logstash1003
00:03 mutante: rbf2002 - error while setting up RAID during installer (rbf2001 did not have this? or did it?)
16:52 yuvipanda: kill opendj on virt1000, it shouldn't have been running there in the first place
16:34 logmsgbot: anomie Synchronized wmf-config: SWAT: Have ContentTranslate publish article to Main namespace for cawiki gerrit:186358 (duration: 00m 07s)
16:11 aude: added and populated wbc_entity_usage table for wikidatawiki
10:54 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Exempt Item and Property namespaces from ConfirmEdit (duration: 00m 07s)
10:44 logmsgbot: hoo Synchronized php-1.25wmf14/extensions/Wikidata/: Update Wikibase: Fixes for UsageTracking and the anon edit warning (duration: 00m 14s)
10:43 logmsgbot: hoo Synchronized php-1.25wmf15/extensions/Wikidata/: Update Wikibase: Fixes for UsageTracking and the anon edit warning (duration: 00m 12s)
10:28 yuvipanda: been restarting pdns, opendj, apache, mysql, keystone left and right on virt1000 all day.
04:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Feb 2 04:08:14 UTC 2015 (duration 8m 13s)
02:44 springle: virt1000 mysqld restart, shrink buffer pool
02:20 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-02-02 02:19:21+00:00
02:19 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 02s)
02:11 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-02-02 02:09:59+00:00
02:09 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 04s)
00:13 subbu: restarted parsoid service on the parsoid cluster to free up leaked memory on several processes (seems to have happened in the 21:30 - 22:30 UTC on 31st Jan time frame)
February 1
04:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Feb 1 04:11:30 UTC 2015 (duration 11m 29s)
02:21 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-02-01 02:20:24+00:00
02:20 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 01s)
02:12 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-02-01 02:10:58+00:00
02:10 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 02s)
January 31
14:20 logmsgbot: hoo Synchronized wmf-config/CommonSettings-labs.php: (no message) (duration: 00m 06s)
04:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jan 31 04:11:31 UTC 2015 (duration 11m 29s)
02:25 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-01-31 02:24:27+00:00
02:24 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 02s)
02:12 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-01-31 02:10:58+00:00
02:10 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 02s)
04:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jan 29 04:51:55 UTC 2015 (duration 51m 54s)
02:35 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-01-29 02:34:03+00:00
02:34 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 02s)
02:21 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-01-29 02:20:02+00:00
02:20 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 02s)
01:12 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Id5186348f: Set $wgResourceLoaderStorageEnabled to false on osmium (duration: 00m 07s)
00:19 logmsgbot: demon Synchronized wmf-config/: (no message) (duration: 00m 06s)
00:18 logmsgbot: demon Synchronized php-1.25wmf14/includes/Title.php: (no message) (duration: 00m 06s)
00:17 logmsgbot: demon Synchronized php-1.25wmf15/includes/Title.php: (no message) (duration: 00m 06s)
00:17 logmsgbot: demon Synchronized php-1.25wmf15/includes/api/ApiPageSet.php: (no message) (duration: 00m 05s)
00:17 logmsgbot: demon Synchronized php-1.25wmf14/includes/api/ApiPageSet.php: (no message) (duration: 00m 06s)
19:35 godog: (after the fact) reboot gadolinium, currently not coming back
19:23 mutante: brought ircd back up on argon
19:19 YuviPanda: run sysctl -w net.netfilter.nf_conntrack_max=131072 on labnet1001
19:19 YuviPanda: run sysctl -w net.netfilter.nf_conntrack_max=131072 on labmon1001
19:15 Krinkle: irc.wikimedia.org is down. "Connection refused."
19:14 Krenair: IRC RC seems broken
18:31 YuviPanda: rebooting tungsten
18:27 godog: reboot swift in esams
17:59 YuviPanda: rebooting labmon1001
17:49 godog: reboot all swift machines in eqiad, in turn
17:47 bblack: rebooting various LVSes...
17:23 marktraceur: I am consciously leaving NavigationTiming unsynced because nobody seems that concerned about it, and nobody is here to shepherd the patch. If you *are* concerned about it, then contact ori.
01:53 superm401: Re-ran GettingStarted populate_categories.php, also populating ptwiki for the first time.
01:43 logmsgbot: mattflaschen Finished scap: Turning off WikiGrok, enabling GettingStarted copyediting suggestions on ptwiki, and upgrading ContentTranslation (duration: 28m 26s)
01:15 logmsgbot: mattflaschen Started scap: Turning off WikiGrok, enabling GettingStarted copyediting suggestions on ptwiki, and upgrading ContentTranslation
19:40 hoo: Set email of commons user Tatobot to the email of the owning account
19:30 akosiaris: https://gerrit.wikimedia.org/r/185610 merged, tested on wtp1024, wtp1023, caused 0 problems, rolling out to the rest of parsoid machines
05:54 ori: <jgage> mtr shows me packet loss between cr2-eqiad.wikimedia.org and 206.126.236.21 aka eqixva-google-gige.google.com
04:40 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jan 16 04:40:10 UTC 2015 (duration 40m 9s)
04:22 Tim: on mw1228 doing some tests to figure out why incorrect Expires header is being sent on requests for /images/*
03:09 logmsgbot: ori Synchronized php-1.25wmf14/includes/content/JsonContent.php: I2f4f9cb343: Let subclasses specify content model in JsonContent (duration: 00m 06s)
03:01 springle: xtrabackup clone db1020 to db1046
02:31 logmsgbot: LocalisationUpdate completed (1.25wmf15) at 2015-01-16 02:31:37+00:00
02:31 logmsgbot: l10nupdate Synchronized php-1.25wmf15/cache/l10n: (no message) (duration: 00m 01s)
02:19 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-01-16 02:19:04+00:00
02:19 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 01s)
02:06 ori: EventLogging syncs were of I335ad42bb: JsonSchemaContent: Fix html rendering of objects and arrays
02:03 logmsgbot: ori Synchronized php-1.25wmf14/extensions/EventLogging: (no message) (duration: 00m 05s)
02:03 logmsgbot: ori Synchronized php-1.25wmf15/extensions/EventLogging: (no message) (duration: 00m 06s)
00:46 mutante: on both puppetmasters: chown gitpuppet /var/lib/git/operations/puppet/.git/logs/refs/heads/production & .git/logs/HEAD & .git/logs/refs/remotes/origin to fix puppet-merge. git pulled on strontium
00:46 mutante: restarted morebots
January 15
23:09 bd808: Updated scholarships.wikimedia.org to d598e0d
22:08 bd808: restarted elasticsearch on logstash1003; died from OOM
21:06 subbu: deployed parsoid version 2fdf9298
20:38 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I250ecfceb: Switch all wikis to monolog logger (duration: 00m 05s)
20:04 bd808: logstash redis queue backlog 384k events and climbing; likely related to the elasticsearch cluster flapping
19:53 Coren: aborting labs filesystem move (not enough contiguous free space) and postponing until new shelf
18:59 YuviPanda: this works?
18:23 csteipp: deployed patches for T85349 T85850 T86711
17:26 ejegg: updated crm from bb05adf9279bd7a795906ca476e1850a85c21711 to d648ededf5c9fc2b0ebf989300ca2037956418e3
16:51 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 06s)
16:09 bd808: Deleted 2015-12-* indices from logstash elasticsearch cluster
22:57 logmsgbot: ori Synchronized wmf-config/StartProfiler.php: I5bd397456: Restrict "forceprofile" to requests that set X-Wikimedia-Debug header (duration: 00m 06s)
22:47 logmsgbot: kartik Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 05s)
22:40 logmsgbot: kartik Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 06s)
22:31 logmsgbot: reedy Synchronized php-1.25wmf14/includes/libs/virtualrest/ParsoidVirtualRESTService.php: (no message) (duration: 00m 05s)
22:18 logmsgbot: reedy Synchronized php-1.25wmf15/includes/libs/virtualrest/ParsoidVirtualRESTService.php: (no message) (duration: 00m 05s)
22:18 logmsgbot: reedy Synchronized php-1.25wmf15/includes/libs/virtualrest/ParsoidVirtualRESTService.php: (no message) (duration: 00m 05s)
19:06 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.25wmf14
18:40 ori: mw1062: sync-file failed, read-only file system. Host should be removed from dsh group.
18:38 logmsgbot: ori Synchronized wmf-config/StartProfiler.php: xenon: Skip frames that don't have a 'phpStack' key (duration: 00m 06s)
17:48 hashar: If Zuul status page ( https://integration.wikimedia.org/zuul/ ) shows a lot of changes with completed jobs and the number of results growing, Zuul is deadlocked waiting for Gerrit. Have to restart it on gallium.wikimedia.org with /etc/init.d/zuul restart
17:39 hashar: Zuul back in action. Changes that were discarded need a recheck or +2 again.
04:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jan 13 04:07:54 UTC 2015 (duration 7m 52s)
02:29 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-01-13 02:29:21+00:00
02:29 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 03s)
02:17 logmsgbot: LocalisationUpdate completed (1.25wmf13) at 2015-01-13 02:17:08+00:00
02:17 logmsgbot: l10nupdate Synchronized php-1.25wmf13/cache/l10n: (no message) (duration: 00m 03s)
00:44 logmsgbot: ori Finished scap: Updates to MobileFrontend, CentralAuth, EventLogging and WikimediaEvents (duration: 06m 27s)
00:38 logmsgbot: ori Started scap: Updates to MobileFrontend, CentralAuth, EventLogging and WikimediaEvents
January 12
23:06 ejegg: updated crm from d8a1160bca99354a856b1595cedf5c33f9ac255c to bb05adf9279bd7a795906ca476e1850a85c21711
21:38 hoo: Set email for global account "Carol.Christiansen" after having it confirmed by a steward and a dewiki bureaucrat (also based on old OTRS records)
21:12 subbu: deployed parsoid version 2cd6fefa
18:48 hoo: Ran sync-common on osmium
18:21 mutante: purging 'mlocate' package from neon as well to fix Icinga DPKG crits
18:04 bd808: Deployed scholarships at hash a5bc6fd
18:04 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 06s)
18:03 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 08s)
18:01 bd808: Applied 2015 schema changes to scholarships database on m2-master
17:33 hoo: mw1010: rsync: failed to set times on "/srv/mediawiki/.": Read-only file system (30)
17:31 logmsgbot: hoo Synchronized php-1.25wmf13/extensions/CentralAuth/: Only test passwords once in CentralAuthUser::prepareMigration - 2nd try (duration: 00m 07s)
17:31 logmsgbot: hoo Synchronized php-1.25wmf14/extensions/CentralAuth/: Only test passwords once in CentralAuthUser::prepareMigration (duration: 00m 06s)
17:31 logmsgbot: hoo Synchronized php-1.25wmf13/extensions/CentralAuth/: Only test passwords once in CentralAuthUser::prepareMigration (duration: 00m 06s)
17:22 mutante: restarted icinga-wm to join -releng
17:02 mutante: labmon1001 - purging mlocate package that was status 'rc'
16:30 godog: stop/start graphite-web on tungsten to clear logs
16:28 bd808: deleted 2014-01-* and 2015-12-* indices from logstash elasticsearch cluster
16:13 bd808: logs on logstash1001 reporting elasticsearch connection errors; restarted logstash service
16:10 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable "Other projects sidebar" by default on frwiki gerrit:183288 (duration: 00m 05s)
16:09 bd808: logstash elasticsearch cluster has strange indices dated 2014-01-* and 2015-12-* again
16:05 bd808: restarted elasticsearch on logstash1001
16:05 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable thumbnail prerendering in production gerrit:183885 (duration: 00m 06s)
16:03 _joe_: depooling mw1062, disk errors
16:03 bd808: elasticsearch on logstash1001 not responding to http requests
16:01 logmsgbot: aude Synchronized wmf-config/Wikibase.php: Enable usage tracking on test.wikidata and testwiki (duration: 00m 05s)
16:00 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 05s)
15:59 bd808: logstash not showing any events at all since 2015-01-12T13:58:59.728Z
09:06 hashar: Tweak Zuul configuration to pin python-daemon <= 2.0 and deploying tag wmf-deploy-20150112-1. bug T86513
07:24 andrewbogott: on virt1005 and virt1006, ran 'ln -s /usr/bin/qemu-system-x86_64 /usr/bin/kvm' that allows nova to migrate instances between hosts.
03:54 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jan 12 03:54:34 UTC 2015 (duration 54m 33s)
02:17 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-01-12 02:17:48+00:00
02:17 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 01s)
02:11 logmsgbot: LocalisationUpdate completed (1.25wmf13) at 2015-01-12 02:10:54+00:00
02:10 logmsgbot: l10nupdate Synchronized php-1.25wmf13/cache/l10n: (no message) (duration: 00m 01s)
17:19 logmsgbot: marktraceur Synchronized php-1.25wmf14/includes/filerepo/file/File.php: Remove silly debug line from File class (duration: 00m 07s)
17:18 logmsgbot: marktraceur Synchronized php-1.25wmf13/includes/filerepo/file/File.php: Remove silly debug line from File class (duration: 00m 08s)
16:58 Reedy: CREATE INDEX /*i*/br_timestamp ON /*_*/bounce_records(br_timestamp); for bounce_records on wikishared on extension1
15:24 Jeff_Green: deployed DNS dmarc record for wikipedia.*
10:19 _joe_: reimaging mw1152 as a HAT imagescaler
05:36 ori: repooled mw123[12]
05:32 logmsgbot: ori Synchronized wmf-config/mc.php: I33ff81e6a: memcached: set server address to localhost rather than 127.0.0.1 on mw123* (duration: 00m 05s)
04:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jan 9 04:32:31 UTC 2015 (duration 32m 30s)
03:05 springle: upgrade db1016 trusty
03:01 MaxSem: Running mwscript extensions/WikiGrok/maintenance/refreshCampaigns.php --wiki=enwiki --version=1 in screen session on terbium, feel free to kill if causes problems
02:50 logmsgbot: maxsem Synchronized php-1.25wmf13/extensions/MobileFrontend/: (no message) (duration: 00m 07s)
02:49 logmsgbot: maxsem Synchronized php-1.25wmf13/extensions/Mantle: (no message) (duration: 00m 07s)
02:42 logmsgbot: maxsem Synchronized php-1.25wmf14/extensions/MobileFrontend/: (no message) (duration: 00m 06s)
02:42 logmsgbot: maxsem Synchronized php-1.25wmf13/extensions/MobileFrontend/: (no message) (duration: 00m 07s)
02:42 logmsgbot: maxsem Synchronized php-1.25wmf13/extensions/Mantle: (no message) (duration: 00m 05s)
02:30 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1003 db1005 db1006 db1009. repool db1050 in s6, db1015 in s3 (duration: 00m 06s)
02:24 logmsgbot: LocalisationUpdate completed (1.25wmf14) at 2015-01-09 02:24:15+00:00
02:24 logmsgbot: l10nupdate Synchronized php-1.25wmf14/cache/l10n: (no message) (duration: 00m 01s)
02:18 logmsgbot: LocalisationUpdate completed (1.25wmf13) at 2015-01-09 02:18:23+00:00
02:18 logmsgbot: l10nupdate Synchronized php-1.25wmf13/cache/l10n: (no message) (duration: 00m 01s)
23:46 logmsgbot: ori Synchronized wmf-config/mc.php: (no message) (duration: 00m 07s)
23:32 logmsgbot: ori Synchronized wmf-config/mc.php: Revert: I4c4691e26: memcached: use a unix socket instead of a tcp connection on selected hosts (duration: 00m 06s)
23:30 logmsgbot: ori Synchronized wmf-config/mc.php: I4c4691e26: memcached: use a unix socket instead of a tcp connection on selected hosts (duration: 00m 06s)
23:26 ori: depooling mw1230 and mw1231 for a couple of minutes for I4c4691e26
21:53 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: noop to test scap update (duration: 00m 06s)
21:52 mutante: fixing scap permissions on mediawiki-installation servers via dsh
21:29 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: noop to test scap update (duration: 00m 06s)
21:04 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: noop to test scap update (duration: 00m 09s)
21:00 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: noop to test scap update (duration: 00m 06s)
20:39 Reedy: Scap deployed at a78ddec
20:22 Reedy: moved /srv/deployment/scap to scap.old as git repo seems busted. Hoping puppet puts it back again correctly...
19:07 logmsgbot: reedy Started scap: testwiki to 1.25wmf14...
18:42 godog: reboot ms-be2011, megacli in a funny state and unable to bring new drive into service
17:20 logmsgbot: ori Synchronized php-1.25wmf13/extensions/EventLogging/modules/ext.eventLogging.core.js: I5470424: Correct events to send schema name (duration: 00m 05s)
17:20 logmsgbot: ori Synchronized php-1.25wmf12/extensions/EventLogging/modules/ext.eventLogging.core.js: I5470424: Correct events to send schema name (duration: 00m 06s)
17:15 godog: reboot ms-be2003
15:36 springle: xtrabackup clone codfw slaves db2034 db2035 db2036 db2037 db2038 db2039 db2040 from other codfw slaves
14:40 _joe_: upgrading hhvm on testwiki
14:16 springle: xtrabackup clone db1027 to db1015
11:42 _joe_: reimaging mw1009-mw1012
10:19 godog: reboot ms-be2003, deleted LD should disappear
09:47 hashar: restarting Jenkins to resolve a deadlock with the beta cluster jobs
07:53 _joe_: reimaging jobrunners mw1013-mw1016 (in batch of two)
10:21 hashar: Restarted Zuul. Gerrit transiently died, just as it did ~10 hours ago, which locked Zuul entirely
09:41 springle: started an analytics ETL run on dbstore1002. to disable: set global event_scheduler=0;
09:41 godog: starting hhvm-profiler-to-carbon on tungsten T85641
09:16 mutante: upgraded python version on zirconium
09:13 logmsgbot: hashar Synchronized wmf-config/CommonSettings-labs.php: (no message) (duration: 00m 05s)
08:42 hashar: Zuul scheduler was stuck while reporting a change back to Gerrit, waiting for data to be received. For some reason none came back and Zuul halted entirely. Restarting Gerrit killed the stalled connection and made Zuul drop all events and resume operations.
08:38 andrewbogott: restarted gerrit service on ytterbium
08:11 hashar: Zuul stalled for some reason :(
08:07 andrewbogott: restarted pdns on virt1000 and labcontrol2001 to handle the change to nembus (just in case pdns is upset by the change!)
08:07 andrewbogott: moved codfw ldap service to nembus
17:00 bd808: restarted logstash on logstash1001 to see if that will make syslog events come back
16:58 bd808: syslog events not being recorded in logstash as expected (apache2, hhvm)
16:21 logmsgbot: manybubbles Synchronized php-1.25wmf13/extensions/VisualEditor/: SWAT fix switching between wikitext and VE on mobile (duration: 00m 14s)
16:14 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT disable creating books in the wikipedia namespace AND shuffle some upload permissions on kowiki (duration: 00m 05s)
16:12 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT disable creating books in the wikipedia namespace (duration: 00m 06s)
16:03 logmsgbot: manybubbles Synchronized wmf-config/Wikibase.php: SWAT Display links to Wikidata in the other project sidebar (duration: 00m 06s)
10:40 springle: xtrabackup clone: db1037 to db1061, db1039 to db1062