Occasionally we need to run maintenance scripts to update records on production.
Getting started
The following resources will help you get started:
- First you need shell access and access to the analytics servers for querying data.
- How to query the analytics replicas.
ssh stat1006.eqiad.wmnet analytics-mysql <wiki>
- Information on WMF's maintenance servers.
- How to run a maintenance script using binaries.
- Generic info on running maintenance scripts on MediaWiki.
Recommendations
- Make a dry run the script's default behavior, and add a --commit option that must be passed to perform the actual updates.
- Capture the stdout of the script to a file.
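- Optimize database CRUD operations by passing arrays of IDs as parameters, so that many rows are handled in one query rather than one query per row.
- Use BatchRowIterator to walk large tables in fixed-size batches, rather than letting ballooning queries scan millions of rows in a loop.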
- If a warning function is part of your script, add a --no-warn option that skips it to speed up the script.
- Wikimedia-specific scripts should be added to the WikimediaMaintenance extension unless the script is generic enough to be added to core.
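The dry-run and output-capture recommendations above can be combined on the command line. The following is a minimal sketch, not an established Wikimedia tool: run_logged is a hypothetical helper name, the log file paths are arbitrary, and the --commit flag only exists if your script implements it as recommended. It also captures stderr, which goes slightly beyond the stdout recommendation.

```shell
# Sketch only: run_logged is a hypothetical helper, not a Wikimedia binary.
# It runs a command, shows its output on the terminal, and keeps a copy in a file.
run_logged() {
    log_file=$1
    shift
    # 2>&1 also captures stderr, so warnings and errors land in the same log.
    # Note: the pipeline's exit status is tee's, not the command's.
    "$@" 2>&1 | tee "$log_file"
}

# Intended use on a maintenance host (nameOfScript.php and its --commit flag
# are placeholders following the recommendations above):
#   run_logged ~/nameOfScript.dry-run.log \
#       mwscript extensions/WikimediaMaintenance/nameOfScript.php --wiki=testwiki
#   run_logged ~/nameOfScript.commit.log \
#       mwscript extensions/WikimediaMaintenance/nameOfScript.php --wiki=testwiki --commit
```

Keeping the dry-run log next to the commit log makes it easy to compare what the script said it would do with what it actually did.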
Running on Production
- Scripts can be run on individual wikis from the current release branch using a maintenance script runner like so:
mwscript extensions/WikimediaMaintenance/nameOfScript.php --wiki <wiki> [args]
- You can leverage the Wikimedia binaries to run scripts for dblists. For example, each of the following commands runs a script on every wiki in a dblist except the wikis listed in the exclusion pattern:
expanddblist desktop-improvements | egrep -xv "bnwiki|mediawikiwiki|ptwiki|testwiki|trwiki" | while read wiki; do echo $wiki && mwscript extensions/WikimediaMaintenance/nameOfScript.php --wiki=$wiki [args]; done
expanddblist all | egrep -xv "arywiki|bnwiki|collabwiki|dewikivoyage|euwiki|fawiki|foundationwiki|frwiki" | while read wiki; do echo $wiki && mwscript extensions/WikimediaMaintenance/nameOfScript.php --wiki=$wiki [args]; done
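The one-liners above can be hard to read and edit safely. As a sketch under stated assumptions, the same loop can be wrapped in a small function. run_on_wikis and the MWSCRIPT override variable are hypothetical names introduced here for illustration; only expanddblist, mwscript, and the script path come from the commands above. Unlike the one-liners, this version also reports a wiki on stderr when the script exits with a non-zero status.

```shell
# Runner command: mwscript on maintenance hosts. MWSCRIPT is a hypothetical
# override, useful for rehearsing the loop (for example MWSCRIPT=echo).
: "${MWSCRIPT:=mwscript}"

# Sketch only: run_on_wikis is a hypothetical helper.
# Reads wiki names on stdin, skips excluded ones, runs the script on the rest.
#   $1  = extended-regex alternation of wikis to skip, e.g. "bnwiki|testwiki"
#   $2  = path of the maintenance script
#   $3+ = extra arguments passed through to the script
run_on_wikis() {
    exclude=$1
    script=$2
    shift 2
    # -E extended regex (as egrep), -x whole-line match, -v invert the match
    grep -Exv -e "$exclude" | while read -r wiki; do
        echo "$wiki"
        "$MWSCRIPT" "$script" --wiki="$wiki" "$@" || echo "FAILED: $wiki" >&2
    done
}

# Intended use, mirroring the first command above:
#   expanddblist desktop-improvements \
#       | run_on_wikis "bnwiki|mediawikiwiki|ptwiki|testwiki|trwiki" \
#           extensions/WikimediaMaintenance/nameOfScript.php [args]
```

Setting MWSCRIPT=echo before calling the function prints the commands that would be run without executing anything, which is a cheap way to check the exclusion pattern first.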
- Before running scripts on production, ping the #wikimedia-data-persistence and #wikimedia-operations channels on IRC to make sure it's OK to run them (scripts should be run outside of deployment windows).
- It is good practice to log in #wikimedia-operations when you start and when you finish running a maintenance script, e.g.
!log Start running maintenance script for updating user preferences T299104
- You can query the analytics-replicas databases in real time to verify that the script has made the expected updates.
- While running scripts, monitor the production databases to ensure things look OK: https://grafana.wikimedia.org/d/000000278/mysql-aggregated?orgId=1
Examples
A recent example of running a maintenance script can be found at T299104.