SQL/XML Dumps
This document is for people adding features or new datasets to the SQL/XML dumps. It is focused on the implementation as deployed at the Wikimedia Foundation; if you are dumping a large number of wikis elsewhere, you will need to make appropriate adjustments.
A few documents are for adding new dump items to the non-SQL/XML dumps. These are extremely specific to the Wikimedia Foundation, but they might prove interesting to third-party users.
Background reading
- End-user documentation: meta:Data_dumps
- Maintainer documentation: wikitech:Dumps
- How to write a dumps maintenance script: SQL/XML_Dumps/Writing_maintenance_scripts
- Other dumps-related pages on this wiki: Category:Import/Export
Workshop documents
- August 2020: "SQL/XML Dumps/Daily life with the dumps"
- August 2020: "SQL/XML Dumps/Anatomy of a dumps job"
- September 2020: "SQL/XML Dumps/A dump job using an existing MediaWiki script"
- September 2020: "SQL/XML Dumps/Command management walkthrough"
- September 2020: "SQL/XML Dumps/Stubs, page logs, abstracts"
- September 2020: "SQL/XML Dumps/Wikibase dumps via cron"
- October 2020: "SQL/XML Dumps/Running a dump job"
- October 2020: "SQL/XML Dumps/"Other" dump jobs via cron"
- December 2020: "SQL/XML Dumps/Puppet for dumps maintainers" plus slides, speaker notes
Becoming a dumps co-maintainer
- February 2021: "Setup"
- Unknown (WIP): "Deployment-prep"
- March 2021: "Access"
- March 2021: Dumps high level overview (6 slides)
- April 2021: "Deploying MW changes"
- April 2021: "Phabricator task management & future"
- July 2021: Useful skills checklist for dumps co-maintainers
- October 2024: Debugging possible PHP logic issue.
General talks
- November 2020: "Dumps are not backups": slides, speaker notes
- December 2020: "10 years of Dumps at the WMF": slides, speaker notes
Getting around the code
This is currently a stub. Nag the editor(s) if a few weeks pass with no activity.