Trebuchet

From Wikitech

Trebuchet is a SaltStack-based deployment system composed of three components:

  • trebuchet-deploy, the SaltStack-based deployment backend. This component handles the actual deployment of code in git repositories.
  • trebuchet trigger, the git interface to Trebuchet. This component provides a method of running deployment commands via "git deploy <action>".
  • trebuchet ricochet, the web interface to Trebuchet. This component provides a reporting interface for the deployment status of repositories.

Trebuchet is currently in use by Wikimedia for all non-MediaWiki deployment.

Deployment location

On both the deployment server (tin) and the deployment targets, your repository will live in /srv/deployment/<repo-name>.

Adding a new repo

Add the new repo's configuration to puppet

The puppet configuration is at hieradata/common/role/deployment.yaml. Each entry is specified by repo, with the following settings for each repo:

upstream (default: none)
A string that defines the url of the upstream repo associated with this deployment repo. This is used to initialize the repo on the deployment server. If no upstream is defined then an empty repository will be created.
shadow_reference (default: false)
A boolean that defines whether or not this repo will create a reference clone of this repo during the fetch stage. Example: test/testrepo would also have a test/.testrepo clone on the targets that is fully checked out to the deployment tag during the fetch stage.
fetch_module_calls (default: {})
A hash of salt modules with a list of arguments that will get called on the minion at the end of the fetch stage of deployment.
checkout_module_calls (default: {})
A hash of salt modules with a list of arguments that will get called on the minion at the end of the checkout stage of deployment. The following argument expansions exist: __REPO__ expands to the name of the repo.
checkout_submodules (default: false)
A boolean that defines whether or not this repo should also do submodule actions. The following argument expansions exist: __REPO__ expands to the name of the repo.
gitfat_enabled (default: false)
A boolean that defines whether or not this repo should also do git-fat actions. If true, git-fat init and git-fat pull will be run at the end of the checkout phase during deployment.
service_name (default: none)
A string that defines a service name associated with this repo. Restarting services via service-restart requires this to be set.
location (default: /srv/deployment/<repo-name>)
The string location of the repository on the deployment system and on the minion. You should probably never set this.
sync_script (default: shared.py)
A string that defines the sync script used for this repository. Options are depends.py and shared.py.
dependencies (default: {})
A hash of repositories with dependency scripts that this repository depends on. These repositories will be deployed automatically before this repository. Example to add the l10n-slot0 dependency for a repo, with that dependency using the l10n dependency script: {'l10n-slot0' => 'l10n'}
automated (default: false)
A boolean that defines whether or not this repository is automatically or manually deployed.
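As an illustration of how the defaults listed above combine with a repo's own entry, here is a minimal Python sketch. It is not Trebuchet's code: the names DEFAULTS and effective_config are hypothetical, and None is used as a stand-in for the "none" defaults.

```python
# Defaults as documented in the settings list above; None stands in for "none".
DEFAULTS = {
    "upstream": None,
    "shadow_reference": False,
    "fetch_module_calls": {},
    "checkout_module_calls": {},
    "checkout_submodules": False,
    "gitfat_enabled": False,
    "service_name": None,
    "sync_script": "shared.py",
    "dependencies": {},
    "automated": False,
}


def effective_config(repo_name, overrides):
    """Return one repo's settings with the documented defaults filled in."""
    unknown = set(overrides) - set(DEFAULTS) - {"location"}
    if unknown:
        raise KeyError("unknown settings: " + ", ".join(sorted(unknown)))
    # Copy dict defaults so callers cannot mutate the shared DEFAULTS table.
    config = {key: (dict(value) if isinstance(value, dict) else value)
              for key, value in DEFAULTS.items()}
    # location defaults to a path derived from the repo name.
    config["location"] = "/srv/deployment/" + repo_name
    config.update(overrides)
    return config


config = effective_config("test/testrepo", {"service_name": "puppet"})
print(config["location"])     # /srv/deployment/test/testrepo
print(config["sync_script"])  # shared.py
```

Rejecting unknown keys is a choice made for this sketch only; how the real puppet and salt code treats unrecognized settings is not covered here.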

Run puppet on salt master

Puppet needs to run on the salt master (palladium) for the deployment repository to be created. If for some reason that fails to update tin.eqiad.wmnet, you can run salt-call deploy.deployment_server_init on tin manually to create the initial structure.

Deploy the repo via tin.eqiad.wmnet

After puppet has run everywhere, your repo will automatically exist on tin. Force a new deployment to finish the setup:

umask 022
cd <your-new-repo>
git deploy start
git deploy sync

Example of adding a new repo

Assume we're adding a repo named test/testrepo and it has no submodules and doesn't need to trigger anything on fetch or checkout. We'll take the following steps to add this new repo:

Step 1: Add repo to puppet

In hieradata/common/role/deployment.yaml add the repo's config:

  test/testrepo:
    service_name: puppet
    checkout_submodules: true

A second example:

  cassandra/metrics-collector:
    gitfat_enabled: true
    upstream: https://gerrit.wikimedia.org/r/operations/software/cassandra-metrics-collector

Once the puppet change is merged, puppet will need to run on palladium, tin, and whichever minions are targeted.

Step 2: Force a deployment from tin

umask 002
cd /srv/deployment/test/testrepo
git deploy start
git deploy sync

Deploying

Initialize the git-deploy environment:

$ git deploy start

You can now update the code base using git pull, git checkout, or, for a private repository such as the Parsoid config, git cherry-pick or git commit: whatever is needed to bring the repository into the desired state for deployment.

Once you have finished updating the code base, ask git-deploy to actually deploy the modifications:

$ git deploy sync

If something went wrong during the code update, you can abort your current work using:

$ git deploy abort
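The start, update, sync sequence, with an abort when the update step fails, could be scripted roughly as follows. This is a sketch under assumptions, not part of Trebuchet: the function name deploy, the update callable, and the injectable run parameter are all hypothetical, and only the three git deploy actions shown above are used.

```python
import subprocess


def deploy(repo_path, update, run=subprocess.check_call):
    """Start a deployment, apply `update`, then sync; abort if `update` fails.

    `update` is a caller-supplied function (hypothetical) that brings the
    repository into the desired state, e.g. by running git pull.
    `run` defaults to subprocess.check_call and can be replaced for dry runs.
    """
    run(["git", "deploy", "start"], cwd=repo_path)
    try:
        update(repo_path)
    except Exception:
        # Discard the in-progress deployment before re-raising the error.
        run(["git", "deploy", "abort"], cwd=repo_path)
        raise
    run(["git", "deploy", "sync"], cwd=repo_path)
```

Passing a recording function as run makes it possible to check the command sequence without touching a real repository.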

Restarting a service

First, ensure your repo config has service_name set. Next, you can use the service subcommand from git deploy:

/srv/deployment/test/testrepo$ git deploy service restart
i-00000821.pmtpa.wmflabs: True
/srv/deployment/test/testrepo$ git deploy report service
i-00000821.pmtpa.wmflabs: True

This command will restart your service with a default batch size of 10%. It's possible to adjust that size with --batch='<size>'.
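To give a feel for what a percentage batch size means, the sketch below splits a list of minions into groups. The exact rounding that Trebuchet and Salt apply is not stated here, so this assumes the percentage is rounded up and that a batch always contains at least one minion; the function name batches and the host names are hypothetical.

```python
import math


def batches(minions, size="10%"):
    """Split `minions` into consecutive groups of the given batch size.

    `size` is either a percentage string such as "10%" or a plain count
    such as "5". Rounding up is an assumption of this sketch.
    """
    if size.endswith("%"):
        count = math.ceil(len(minions) * float(size[:-1]) / 100)
    else:
        count = int(size)
    count = max(1, count)  # never form an empty batch
    return [minions[i:i + count] for i in range(0, len(minions), count)]


hosts = ["host%02d" % n for n in range(20)]
groups = batches(hosts)
print(len(groups), len(groups[0]))  # 10 2
```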

Reporting

It's possible to show the status of any repository's deployment at any time, using the report subcommand of git deploy.

/srv/deployment/test/testrepo$ git deploy report sync
Repo: test/testrepo
Tag: test/testrepo-sync-20140305-043426

1/1 minions completed fetch; 1/1 minions completed checkout
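The tag in the report above appears to combine the repo name with a timestamp. Assuming the format is <repo>-sync-YYYYMMDD-HHMMSS (inferred from this one example, not from a specification), it can be split apart as in this sketch; parse_sync_tag is a hypothetical helper name.

```python
from datetime import datetime


def parse_sync_tag(tag):
    """Split a sync tag into (repo, timestamp), assuming <repo>-sync-<stamp>."""
    repo, separator, stamp = tag.rpartition("-sync-")
    if not separator:
        raise ValueError("not a sync tag: " + tag)
    return repo, datetime.strptime(stamp, "%Y%m%d-%H%M%S")


repo, when = parse_sync_tag("test/testrepo-sync-20140305-043426")
print(repo)              # test/testrepo
print(when.isoformat())  # 2014-03-05T04:34:26
```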

Show a detailed report:

/srv/deployment/test/testrepo$ git deploy report sync --detailed
Repo: test/testrepo
Tag: test/testrepo-sync-20140305-043426

1/1 minions completed fetch; 1/1 minions completed checkout

Details:

i-00000821.pmtpa.wmflabs:
       fetch status: 0 [started: 149 mins ago, last-return: 149 mins ago]
       checkout status: 0 [started: 149 mins ago, last-return: 149 mins ago]

Removing minions from redis

Trebuchet keeps a list of minions that it expects to hear from in a Redis server on the deployment server. As of September 2015, there is no automated pruning of this list when the minion has been retired.

Minions are stored in a redis SET with keys following the format: deploy:<repo>:minions. To list members of a set:

redis-cli SMEMBERS "deploy:<repo>:minions"

Minions must be removed manually from Redis per-repo using the SREM command:

redis-cli SREM "deploy:<repo>:minions" <minion> <minion2> <minion3> <..>

For instance:

redis-cli SREM "deploy:parsoid/config:minions" mexia.pmtpa.wmnet tola.pmtpa.wmnet
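When several retired minions need pruning, the key and the SREM command line can be generated rather than typed by hand. This sketch only builds the argument list and does not talk to Redis; the function names are hypothetical, and the list of still-active minions is assumed to come from some other source not covered here.

```python
def minion_set_key(repo):
    """Key of the Redis SET holding the minions expected for `repo`."""
    return "deploy:{}:minions".format(repo)


def stale_minions(members, active):
    """Members of the Redis set that are no longer in the active list."""
    return sorted(set(members) - set(active))


def srem_command(repo, minions):
    """Build the redis-cli SREM argument list for the given minions."""
    if not minions:
        raise ValueError("at least one minion is required")
    return ["redis-cli", "SREM", minion_set_key(repo)] + list(minions)


members = ["mexia.pmtpa.wmnet", "tola.pmtpa.wmnet", "wtp1001.eqiad.wmnet"]
active = ["wtp1001.eqiad.wmnet"]  # hypothetical list of live hosts
print(srem_command("parsoid/config", stale_minions(members, active)))
```

The resulting list can be passed to subprocess.check_call on the deployment server once the set of stale minions has been reviewed.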

You can verify the status in git-deploy using: git deploy report sync.

In the future this should be available via a git deploy subcommand.

Trying it

tin.eqiad.wmnet is the eqiad deployment host. There's a testrepo that can be used for testing. Simply go into /srv/deployment/test/testrepo and try it out. It is not necessary to forward your ssh agent to this host.

Troubleshooting

Repo doesn't exist on tin

When a new repository is added to the deployment configuration hash, the puppet master should automatically update the pillars of all deployment targets (/usr/bin/salt -C 'G@deployment_server:true or G@deployment_target:*' saltutil.refresh_pillar) and then run a module on tin that sets up the new repository (/usr/bin/salt -G 'deployment_server:true' deploy.deployment_server_init).

However, sometimes this automation fails due to bugs. If this occurs, you should run the command on tin manually (as root, or through sudo):

salt-call deploy.deployment_server_init

Modules not available on minion

Push the modules (custom python code) from the master to minions:

salt '<node-regex>' saltutil.sync_all

Fetch modules from master to a minion:

salt-call saltutil.sync_all

Initial fetches are failing (minions forever pending)

If this occurs, something is broken on the salt/puppet master and a bug report should be opened. As a workaround to move the process along, restart the salt-minion on the affected nodes:

salt '<node-regex>' service.restart salt-minion

After doing so you can restart the deployment.

Troubleshooting the deployment from multiple locations

Via the runner on the salt master:

salt-run deploy.fetch 'test/testrepo'
salt-run deploy.checkout 'test/testrepo'

Via direct peer runner calls on tin:

sudo salt-call -l quiet publish.runner deploy.fetch 'test/testrepo'
sudo salt-call -l quiet publish.runner deploy.checkout '[test/testrepo,True]'

Via direct commands on the minions:

salt-call deploy.fetch 'test/testrepo'
salt-call deploy.checkout 'test/testrepo'

The direct command calls on the minions are likely to be the most informative, as they output the commands being run. If a fetch or checkout is failing with a status other than 0, you can run the last command yourself, which will give you the git-specific error. Any such error is a bug and should be reported upstream.

Also, on tin, you can debug the repository creation module via:

salt-call deploy.deployment_server_init

Bringing up new minions

Minions are automatically added to the cluster via puppet. Puppet will also do a deployment to the minion for all repos the minion is targeted for.

Using Trebuchet in Labs

Any Labs project can have its own Trebuchet deployment infrastructure, but this seems slightly broken. A hacky way to get it working:

  1. Add an instance named <project>-deploy in your project to act as a deployment server. (You can name it differently, but in that case you'll need to set the deployment_server_override field to the FQDN of the deployment server on all involved instances - the deployment server, the salt master and all the minions.)
    1. Configure the <project>-deploy instance to use the role::deployment::server puppet class.
    2. Set up the repo_config and deployment_server hiera keys (see e.g. Hiera:Sentry).
  2. Host a salt master in your project. (Can be the same instance as the deployment server.)
    1. Add the role::deployment::salt_masters puppet class to the salt master's configuration.
    2. Run puppet on the salt master.
  3. Point the instances in your project to your master.
  4. Run puppet on the deployment server again. (There will be several errors, but it will set up the trebuchet repos nevertheless.)
  5. For every trebuchet repo you intend to use, do cd /srv/deployment/<repo>; git deploy start; git deploy sync; git deploy abort --force

Design

/Design