Conftool

From Wikitech

Conftool is a set of tools we use to sync and manage the dynamic state configuration for a few services (as of June 2015, only the varnish backend lists and the pybal pools). This configuration is stored in a distributed key/value store, Etcd.

Overview

Conftool takes its input from a series of configuration files in the conftool-data/ directory of the Puppet repository. These files represent a static view of the configuration: information about the services we manage, and which services are installed on which hosts. The other part of the equation is the dynamic state of that configuration (such as the weight of a server in its pool, and whether the server is pooled or not), which the sync leaves untouched apart from setting default values on newly added hosts.

The config files

Relative to the conftool root, configuration files are organized as follows:

  • the services directory contains a single file called data.yaml, which holds all the information on services in the following form:
cluster_name:
  service_name:
    port: 1234
    default_values:
      pooled: inactive
      weight: 10
    datacenters:
      - eqiad
  another_service:
  ...

Here, cluster_name should almost always correspond 1:1 to the 'cluster' we define in puppet.

  • the nodes directory, which instead contains one file per datacenter, for example nodes/eqiad.yaml. The format is again quite simple:
cluster_name:
  node_name.eqiad.wmnet:
    - service_name
    - another_service
  another_node.eqiad.wmnet:
    - service_name
...
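To make the split between static files and dynamic state concrete, here is a minimal Python sketch of the behaviour described in the Overview. It is not the conftool-sync implementation: the dict standing in for the key/value store, the layout of the key tuple, and the sync_nodes function are all assumptions made for illustration, and removal of stale entries is not modeled.

```python
# Hypothetical in-memory stand-ins for conftool-data/services/data.yaml
# and conftool-data/nodes/eqiad.yaml (names follow the examples above).
services = {
    "cluster_name": {
        "service_name": {
            "port": 1234,
            "default_values": {"pooled": "inactive", "weight": 10},
            "datacenters": ["eqiad"],
        },
    },
}
nodes_eqiad = {
    "cluster_name": {
        "node_name.eqiad.wmnet": ["service_name"],
        "another_node.eqiad.wmnet": ["service_name"],
    },
}


def sync_nodes(dc, nodes, services, store):
    """Add missing node entries with default values; leave existing state alone."""
    for cluster, hosts in nodes.items():
        for host, host_services in hosts.items():
            for service in host_services:
                key = (dc, cluster, service, host)  # assumed key layout
                if key not in store:
                    defaults = services[cluster][service]["default_values"]
                    store[key] = dict(defaults)
    return store


# A plain dict stands in for the key/value store; one node already has state.
store = {
    ("eqiad", "cluster_name", "service_name", "node_name.eqiad.wmnet"): {
        "pooled": "yes",
        "weight": 20,
    },
}
sync_nodes("eqiad", nodes_eqiad, services, store)
```

After the call, the node that already had state keeps it, while the newly added node receives the service's default_values.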

The tools

Currently, we have two tools, both installed on the puppetmaster:

  • conftool-sync, which syncs what we write in the files described above to the distributed key/value store (as of June 2015 this is Etcd, but it may well change in the future). In most cases you will not call conftool-sync directly; you will just call conftool-merge (in the near future, it will be invoked directly by our puppet-merge utility on the puppetmaster).
  • confctl, the tool to interact with the key/value store. A typical invocation could be:
confctl --tags dc=eqiad,cluster=cache_text,service=varnish-be --action get cp1052.eqiad.wmnet

{"cp1052": {"pooled": "no", "weight": 0}}

where the tags argument is a comma-separated list of key=value pairs that identifies the service you want to query. The example above selects the varnish backend service of the cache_text cluster in the eqiad datacenter.
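As an illustration of the tags format only, the following Python sketch splits such a string into its key=value pairs. The parse_tags function is hypothetical and is not part of conftool; it only mirrors the syntax shown above.

```python
def parse_tags(tags):
    """Split a comma-separated 'key=value' list into a dict."""
    result = {}
    for item in tags.split(","):
        key, sep, value = item.partition("=")
        # Reject items with no '=', an empty key, or an empty value.
        if not sep or not key or not value:
            raise ValueError(f"malformed tag: {item!r}")
        result[key] = value
    return result


selector = parse_tags("dc=eqiad,cluster=cache_text,service=varnish-be")
```

Here selector maps dc, cluster and service to eqiad, cache_text and varnish-be respectively.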

The required tag list changes with the object type, and conftool will complain if you don't specify the tags correctly. You can work on any object; you just need to specify the object-type parameter. For example:

confctl --object-type service --tags cluster=cache_text --action get varnish-be

will work as well.

In puppet

Conftool is installed by including the conftool class in your node manifest. This does not install the conftool-data directory, though, which is part of the puppet git repository. The puppetmaster (palladium) is therefore the standard machine on which to run conftool.

Operating

Add a service

If you need to add a service to a cluster, just edit the relevant yaml file under conftool-data/services, adding a service entry, and then run conftool-sync.

So for now you typically:

  • Create a puppet change adding the service stanza
  • On palladium, you run puppet-merge
  • Again on palladium, you run conftool-merge without arguments (this is a wrapper script that "does the right thing")

Add a server node to a service

If you need to add a server node to a pool, find the corresponding cluster in conftool-data/nodes/ and check whether the node stanza is present. If it is, just add the service to its list of services; if not, add the node's FQDN as a key under the cluster, with a list containing the service as its value.

After you have done that, you will need to merge the change in puppet and follow the steps outlined before for adding a service. Typically, though, new nodes will NOT be pooled, so if you want to pool your service you will need to modify the state of the node as shown below.

Modify the state of a server in a pool

Let's say we want to depool the server mw1018.eqiad.wmnet. The steps are as follows:

  • The server is in the eqiad datacenter, is part of the appserver cluster in puppet, and the service we want to change is apache2. We need all this information as we'll see next.
  • Run, from any host where conftool is installed:
confctl --tags dc=eqiad,cluster=appserver,service=apache2 --action set/pooled=no mw1018.eqiad.wmnet
  • Verify that it worked with
confctl --tags dc=eqiad,cluster=appserver,service=apache2 --action get mw1018.eqiad.wmnet
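If you want to script the two steps above, a small Python helper can assemble the same command lines. This is only a sketch: confctl_command is a hypothetical helper, it uses nothing beyond the flags shown on this page, and it builds the commands without running them.

```python
import shlex


def confctl_command(tags, action, host):
    """Build a confctl argument list from a tag dict, an action and a host."""
    tag_arg = ",".join(f"{key}={value}" for key, value in tags.items())
    return ["confctl", "--tags", tag_arg, "--action", action, host]


tags = {"dc": "eqiad", "cluster": "appserver", "service": "apache2"}
depool = confctl_command(tags, "set/pooled=no", "mw1018.eqiad.wmnet")
verify = confctl_command(tags, "get", "mw1018.eqiad.wmnet")

# Show the commands as they would be typed in a shell.
print(shlex.join(depool))
print(shlex.join(verify))
```

The argument lists can be passed to subprocess.run on a host where conftool is installed; keeping them as lists avoids shell quoting problems.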

The syntax for the set action is: set/key1=value1:key2=value2. A small note on the pooled value meaning:

  • yes means the server is pooled
  • no means the server is not pooled but (only in pybal) present in the config
  • inactive means the server is not in the config we write at all
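The set syntax and the three pooled values can be checked with a short Python sketch. The parse_set_action function is hypothetical and does not reflect how confctl parses its arguments; values are kept as strings, and any type conversion (for example of weight) is left out.

```python
POOLED_VALUES = {"yes", "no", "inactive"}


def parse_set_action(action):
    """Parse 'set/key1=value1:key2=value2' into a dict of string values."""
    verb, sep, assignments = action.partition("/")
    if verb != "set" or not sep or not assignments:
        raise ValueError(f"not a set action: {action!r}")
    changes = {}
    for pair in assignments.split(":"):
        key, eq, value = pair.partition("=")
        if not eq or not key:
            raise ValueError(f"malformed assignment: {pair!r}")
        changes[key] = value
    # Only the three pooled states listed above are accepted.
    if "pooled" in changes and changes["pooled"] not in POOLED_VALUES:
        raise ValueError(f"unknown pooled value: {changes['pooled']!r}")
    return changes


changes = parse_set_action("set/pooled=yes:weight=10")
```

A string such as set/pooled=maybe is rejected with a ValueError, since maybe is not one of yes, no or inactive.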

Pooling/depooling a server from all the related services

When a server is in maintenance mode or needs to be depooled/repooled in all of its services, you can use the --find argument instead of the tags. In this way, confctl will act on every service that is present on the server you indicate:

confctl --find --action set/pooled=(yes|no) foo.example.com
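Conceptually, --find has to discover every service the host takes part in. The Python sketch below shows that lookup over data shaped like the nodes files; it is an assumption about the idea, not the confctl implementation, and services_for_host, hypothetical_service and hypothetical_cluster are made-up names.

```python
def services_for_host(host, nodes_by_dc):
    """Return every (dc, cluster, service) triple the host appears in."""
    matches = []
    for dc, clusters in nodes_by_dc.items():
        for cluster, hosts in clusters.items():
            for service in hosts.get(host, []):
                matches.append((dc, cluster, service))
    return matches


# Hypothetical data in the shape of conftool-data/nodes/<datacenter>.yaml.
nodes_by_dc = {
    "eqiad": {
        "appserver": {
            "mw1018.eqiad.wmnet": ["apache2", "hypothetical_service"],
        },
        "hypothetical_cluster": {
            "other.eqiad.wmnet": ["apache2"],
        },
    },
}
targets = services_for_host("mw1018.eqiad.wmnet", nodes_by_dc)
```

Each triple in targets identifies one entry whose pooled state would be changed.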

Decommission a server

Decommissioning a server is as simple as:

  • Depool it from all services (as seen above)
  • Remove its stanza from conftool-data, then sync the data exactly in the way you did for adding a node.