Wikimedia Release Engineering Team/Checkin archive/20150901
2015-09-01
Team Business
- FYI
- Aaron's multi-DC work in public RFC meeting, Wednesday Sept 2, 21:00 UTC
- Q2 Goals
Deploy tooling
Objective: Reduce number of deploy tools from 3 to 2
Key result: Migrate all Service team owned services and MW deploys to scap3 - task T109926
We don't have a KPI to track this one, but it's easy to measure success ("did we retire Trebuchet and Ansible?").
Open questions:
1) Still doable (retirement of Trebuchet and Ansible by December)?
2) Request: We probably want specific subtasks of that one for the individual Service Team owned services.
3) Note: The current "scap3" sprint does not need to be tracked by this task (T109926) because it's our Q1 work.
Migrate Gerrit to Differential
Objective: Retire Gerrit in favor of Phabricator (Differential)
Key results: .... tbd ...
Open questions:
1) We need to own the Gerrit-Migration board and make it reflect reality (what actually needs to happen). https://phabricator.wikimedia.org/project/board/9/ I presume that will take a conversation between at least Chad, Mukunda, and myself (and Quim? Andre?).
2) I've filed a meta-task to track this work (the creation of the plan): https://phabricator.wikimedia.org/T110623
TODO: Meeting of Roadmap doom: Greg (optional), Chad, Mukunda, Antoine?
TODO: Greg to make the business case stuff
- get rid of all the various glue bots
NB: Keep in mind that we've allocated ourselves two quarters to complete this work.
CI Scaling
Objective: Reduce CI wait time
Key result: CI cluster responds to spike in queued builds by starting and registering additional Jenkins slaves
We can use the "Jenkins/Zuul queue wait" KPI to track the effectiveness of this work: https://phabricator.wikimedia.org/T108750 .
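For discussion, a minimal sketch of the scaling decision the key result describes: given a spike in queued builds, how many additional slaves to start. This is not how Nodepool or Zuul decide; the function name, its parameters, and the quota default are all hypothetical and only meant to make the idea concrete.

```python
import math


def slaves_to_launch(queued_builds, idle_slaves, booting_slaves,
                     current_total, max_total=10, builds_per_slave=1):
    """Return how many extra slaves to start for the current queue.

    All names and defaults here are illustrative assumptions, not
    Nodepool settings.
    """
    # Builds that idle or already-booting slaves will not absorb.
    covered = (idle_slaves + booting_slaves) * builds_per_slave
    uncovered = queued_builds - covered
    if uncovered <= 0:
        return 0

    # One slave per `builds_per_slave` uncovered builds, rounded up.
    wanted = math.ceil(uncovered / builds_per_slave)

    # Never exceed the pool quota (current_total counts busy, idle
    # and booting slaves).
    headroom = max(0, max_total - current_total)
    return min(wanted, headroom)
```

With a quota of 10 and 8 slaves already running, a queue of 20 builds would only trigger 2 new slaves, so the quota (not the queue) becomes the limit on wait time in that case.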
Open questions:
1) Is this task still the right task to judge the completion of this work? https://phabricator.wikimedia.org/T47499
- or look at the Phabricator board https://phabricator.wikimedia.org/tag/continuous-integration-scaling/
2) Are the blockers of that task still accurate? In other words: when all of those blockers are completed, can we consider the work done (whether or not it makes a change to the KPI)?
3) Doable in 3 months?
Zuul gate time KPI attempt: https://grafana.wikimedia.org/#/dashboard/db/releng-zuul
Ongoing:
- Jessie image: https://phabricator.wikimedia.org/T110735 Give https://gerrit.wikimedia.org/r/#/c/234975/ a try?
- Current process is lame:
** build on a labs instance (integration-dev)
** copy .qcow2 image to /var/www/html
** curl from labnodepool
** sudo -H -u nodepool -s
** cd && . .profile
** openstack image create .....
- bump Nodepool to support python-statsd 3.x https://phabricator.wikimedia.org/T107268
- Create a MySQL DB (Jaime on it) https://phabricator.wikimedia.org/T110693
Todo:
- Refactor MediaWiki tests. Split unit tests into their own jobs and speed up the lame 'integration' tests (10 minutes with Zend).
- DOCUMENTATION (T2001)
- figure out a solution to cache npm/pip/composer/rubygems modules (tarballs and compiled)
Q1 goal: Nodepool infra build, with at least 1 production-grade job using it
Q2: CI cluster responds to spike in queued builds by starting and registering additional Jenkins slaves (and migrate more jobs)
Q3: migrate rest / phase out legacy
#together
- https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Skill_matrix
- if you want to know how to say #together in baby sign language :) http://www.babysignlanguage.com/dictionary/t/together/
Scrum of Scrums
- https://phabricator.wikimedia.org/project/board/64/
- Blocked on us: https://phabricator.wikimedia.org/maniphest/?statuses=open%28%29&projects=PHID-PROJ-arpazvuktn2l647rb6us#R
Isolated CI instances / CI Scaling
- https://phabricator.wikimedia.org/tag/continuous-integration/board/?order=priority
- Quarterly Priority: Disposable VMs - https://phabricator.wikimedia.org/T47499
Deployment Cabal
Developer Tooling (MW-Vagrant, MW-Selenium, etc.)
Phabricator
Beta Cluster
Other Work
- Željko blocked on https://phabricator.wikimedia.org/T102020, waiting for somebody who knows which folders in operations/puppet contain upstream code
- put on SOS (done)
Vacations/Confs/etc
Please add your time off to your gcal, **Phabricator**, and ADP, as appropriate
- Chad - Sept 7-11 (last-minute vacation, mostly reachable by e-mail); Sept 18th & 28th (music festivals/shows)
- Željko planned to be offline on Wednesday September 2, but that has a 1% chance of happening (sick kid)
- Monday Sept 7th - US Holiday (Labor Day)
- Tyler: Sept 8th, in the mountains
- Mukunda: Sept 4th (this Friday; I can attend the sprint meeting, taking the afternoon off)
- Andrew out this Friday