Obsolete:Bots project documentation

From Wikitech

All of this was at Nova_Resource:Bots/Documentation but that Labs project is obsolete now, replaced by the Tools project (Tool Labs).

Architecture

Topology of tools on labs

Bots run on application servers. They can write to shared Gluster storage at /data/project/userdata or to the shared NFS server at /mnt/secure. An Apache server serves bot user directories from /data/project/public_html. Bots also have MySQL storage available on bots-bsql01: every user has an account on this MySQL server and can access it by typing mysql -h bots-bsql01. If you need to create a database there, execute:

call system.create_db("database_name");

at the mysql prompt. The user who creates the new database is granted all SQL rights on it.

Please note: user accounts are created by a cron job that runs every hour. If you have a newly created account on bots, it may take a while before you have access to SQL. Please wait. If your account still doesn't work after more than an hour, you should contact a sysadmin.

Login server

  • bots-login

Application servers:

  • bots-bnr(1,2,3) - application node of grid
  • bots-4 - testing instance
  • bots-cb - cluebot only

Each user has a directory in /data/project/public_html on each app server which can be accessed from outside via bots.wmflabs.org/~user.

How to install and run your bot

Use Oracle Grid Engine (OGE) to schedule your tasks. Alternatively, you can run your bot manually on any application server.

Using OGE

You need to install your bot to a folder that is accessible on all instances, such as your home directory or /data/project.

Once you install your bot there, you need to create a start script. It can be a simple shell script:

 #!/bin/dash
 # replace all variables for your need
 logfile=/data/project/mybot/syslog

 echo "Started bot on `hostname` at `date`" >> $logfile

 #here is the startup command for your bot - an example is below, commented out
 #mono /data/project/mybot/bot.exe >> $logfile 2>&1

 echo "Stopped the bot at `date`" >> $logfile
 exit 0

This simple script starts the bot and exits once the bot has finished. Save it as start.sh, run chmod a+x start.sh, and then schedule the script by running this from bots-gs:

qsub -q main.q /data/project/mybot/start.sh

Now your bot should be running on the application nodes.

qstat

will tell you whether it is running. If it isn't, you won't see anything; check the log file ($logfile in the start script) to see why it didn't run:

# in our example $logfile was set to this file
cat /data/project/mybot/syslog
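
If the log shows only the start and stop lines, it can help to record the bot's exit status as well. The following is a minimal sketch of a start-script variant, assuming a POSIX shell. run_bot is a hypothetical helper name, and the log path and bot command are the placeholders from the example above.

```shell
#!/bin/sh
# Sketch only: run_bot is a hypothetical helper, not part of the bots cluster.
# Usage: run_bot LOGFILE COMMAND [ARGS...]

run_bot() {
    logfile=$1
    shift
    echo "Started bot on $(hostname) at $(date)" >> "$logfile"
    # Run the bot command given as the remaining arguments and keep its exit status.
    status=0
    "$@" >> "$logfile" 2>&1 || status=$?
    echo "Stopped the bot at $(date) with exit status $status" >> "$logfile"
    return "$status"
}

# Example call, commented out like the startup line in the script above:
# run_bot /data/project/mybot/syslog mono /data/project/mybot/bot.exe
```

A non-zero exit status in the last log line suggests the bot stopped because of an error rather than finishing normally.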

To see an overall summary of the current state of the grid use the command

qtop

which displays the number of jobs, loads and any broken queues.

If you are running an interactive bot that must run on a server with minimal load (so that it responds quickly and isn't affected by a lagging system), submit it to the minimalload queue:

qsub -q minimalload /data/project/mybot/start.sh
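
The two queue choices shown in this section can be wrapped in one small helper. This is only a sketch: submit_bot, its -i flag and the QSUB variable are hypothetical names introduced for illustration; only the queue names main.q and minimalload come from the examples above.

```shell
#!/bin/sh
# Sketch only: submit_bot and QSUB are hypothetical, not cluster-provided.
# QSUB can be overridden to preview the command instead of submitting it.
QSUB=${QSUB:-qsub}

submit_bot() {
    queue=main.q
    # -i marks an interactive bot that should go to the minimalload queue.
    if [ "$1" = "-i" ]; then
        queue=minimalload
        shift
    fi
    "$QSUB" -q "$queue" "$1"
}

# Example calls (run from bots-gs):
# submit_bot /data/project/mybot/start.sh
# submit_bot -i /data/project/mybot/start.sh
```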

Running by hand

You can install the bot to any storage, including /mnt/share (local storage, shared among users of server). Then, you can just run it as you like.

How to request a project membership

You need to ask a current member. It's easiest to go on IRC and ask there. Everyone who demonstrates even a modicum of trustworthiness can have access.

How to add a project member

Here is a rough guideline for project users for adding new project members.

  • Ask nicely what the user wants to do in the bots project (e.g. which bot they want to run).
  • Add the user to the project using Special:NovaProject (scroll to the bots project).

... and you are done! Note that the public_html directories, mysql login and /mnt/secure are automatically created via a cron on bots-apache01 and cron on bots-secure.

MySQL

If you want to use MySQL, there is a shared server called bots-bsql01. A script that runs from cron every hour creates accounts for all users who don't have one yet.

mysql -h bots-bsql01

will get you there.

mysql> call system.create_db("quack");

creates a new database "quack" and gives you all grants on it.

There are NO LIMITS for users at the moment. That means you can create as big a database as you need and your queries can use as many system resources as they need. Just don't abuse it, or we will need to change it. ;)
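
The database can also be created without opening an interactive session, using the mysql client's -e option. The following is a minimal sketch: create_bot_db and the MYSQL variable are hypothetical names, and the name check is an added precaution rather than something the server is known to require.

```shell
#!/bin/sh
# Sketch only: create_bot_db and MYSQL are hypothetical helper names.
# MYSQL can be overridden to preview the command instead of running it.
MYSQL=${MYSQL:-mysql}

create_bot_db() {
    name=$1
    # Accept only letters, digits and underscores so the name is safe to embed in SQL.
    case $name in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid database name: $name" >&2
            return 1
            ;;
    esac
    # -e runs the given statement and exits.
    "$MYSQL" -h bots-bsql01 -e "call system.create_db(\"$name\");"
}

# Example call:
# create_bot_db quack
```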

Requesting packages to be installed

If you need a global package installed on the bots cluster, you have a few options:

  • Ask a sysadmin - we live in #wikimedia-labs. A list of active sysadmins can be found in the motd of each instance.
  • Create a puppet class for it.

Getting help

Do you need help? We are happy to help you - just join #wikimedia-labs and ask, or email labs-l@lists.wikimedia.org.

The bots cluster is maintained by the community and the WMF. The list of sysadmins can be found in the motd, in the form Real Name (nickname). If you need a sysadmin for anything, just go ahead and ping one on IRC.

Servers

Application servers

  • bots-gs - Grid scheduler - this is the server you should control the grid from
  • bots-bnr(1,2,3) - Grid node (4 cores, 8 GB RAM)
  • bots-ibnr(1) - Grid node (smaller)

Testing servers

bots-4

This server runs:

  • DrTrigonBot by DrTrigon
    • rewrite part, see log: rewrite
    • trunk part follows (currently still on TS)
  • BeneBot* (.NET) by Bene*
  • FIXME by Fastily
  • JohnFLBot by John F. Lewis
  • VoxelBot by Fox Wilson and Vacation9

Infrastructure servers

bots-apache01

This is our main public web server. No need to log in to this server to change files in your http://bots.wmflabs.org/~<user>/ directories, as it is already shared across all servers at /data/project/public_html.

bots-bsql01

A shared SQL server (8 cores, 16 GB RAM)

bots-sql2

This server is scheduled for removal. Please don't use it.

Dedicated servers

bots-cb

This server powers Cluebot, an antivandal bot on the English Wikipedia.

bots-labs

This server runs Labs-specific bots (i.e. labs-morebots, etc.).

bots-liwa

This is the instance running 'linkwatcher' (Perl) by Dirk Beetstra. See "User:Beetstra/MiniManual".

  • LinkWatcher (m:User:LiWa) is a bot that for every edit parses out link additions, reports them to freenode and stores them in a database on bots-sql2

bots-salebot

This server runs Salebot, an antivandal bot by User:Gribeco on frwiki and ptwiki.

Purpose unknown

bots-abogott-devel

FIXME

bots-analytics

FIXME

bots-dev

FIXME

bots instances

language-pack-en-base
needed for locales in some bot scripts

IRC bots

wm-bot

An IRC bot. See the documentation for sysadmins and the documentation for users.

labs-morebots

Another IRC logging bot, written in Python. labs-morebots is the instance providing the !log command. analytics-logbot is another instance.

It runs on bots-labs.

There is an init script for it called "adminbot" (/etc/init.d/adminbot).

It needs /var/run/adminbot as a cache directory and permissions to write to it. It also needs working LDAP to fetch the project list.

Troubleshooting

Connect to bots-labs and run /etc/init.d/adminbot start (or restart; check first whether the process is already running).

Bot dies on !log command

Check that the cache directory described above exists and that the bot user can write to it.
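
The cache directory check can be scripted. This is a minimal sketch: check_cache_dir is a hypothetical helper, and because the -w test applies to the user running the script, it has to be run as the account the bot runs under (that account name is not given here).

```shell
#!/bin/sh
# Sketch only: check_cache_dir is a hypothetical helper.
# Run it as the bot's own user, since -w tests the current user's permissions.

check_cache_dir() {
    dir=${1:-/var/run/adminbot}
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}

# Example call on bots-labs:
# check_cache_dir /var/run/adminbot
```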

Bot says <x> is not a valid project

Either you misspelled a project name or there is an LDAP connection issue. Does it also say "Can't contact LDAP for project list"? If yes, check with ops for possible LDAP and/or NFS issues.


Pywikipedia bot framework

For operators of Python bots, snapshots (updated daily) of the Pywikipedia framework trunk and rewrite versions are maintained at /data/project/pywikipedia/trunk and /data/project/pywikipedia/rewrite, respectively. Note that these are just the source files; each bot operator will need to create their own configuration files, such as user-config.py, and set up their PYTHONPATH and other environment variables.
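
As an illustration of the PYTHONPATH step, the lines below prepend one of the shared snapshots to the search path. This is a sketch under assumptions: PYWIKIPEDIA_DIR is a hypothetical variable name, the rewrite snapshot is picked only as an example, and where user-config.py should live is not covered.

```shell
#!/bin/sh
# Sketch only: PYWIKIPEDIA_DIR is a hypothetical variable, defaulting to the rewrite snapshot.
PYWIKIPEDIA_DIR=${PYWIKIPEDIA_DIR:-/data/project/pywikipedia/rewrite}

# Prepend the snapshot, keeping any existing PYTHONPATH entries after it.
PYTHONPATH=$PYWIKIPEDIA_DIR${PYTHONPATH:+:$PYTHONPATH}
export PYTHONPATH
```

These lines could go in the bot's start script before the command that launches the bot, so that jobs started through the grid see the same path.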