Wikimedia Release Engineering Team/Checkin archive/20160829

2016-08-29

Vacations/Important dates

How to do it: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/Time_off

  • Sept 02: Q2 goals draft published, Dan out
  • Sept 05: US Holiday (Labor Day)
  • Sept 23: Q2 goals finalized
  • Oct 01: Start of Q2
  • October 10: US Holiday (Indigenous People's Day)
  • October 17-21: Offsite in Washington D.C.
  • October 31: Mukunda
  • October 28 - Nov 2 (ish) - Chad
  • November 24: US Holiday (Thanksgiving)
  • January 9-11: Dev Summit
  • January 12-13: All Hands

Team Business

Time spent spreadsheet

Rotating positions and absences

Maniphest query for deployment blocker tasks: https://phabricator.wikimedia.org/u/blockers

weeks of Aug 22 and Aug 29

weeks of Sep 05 and Sep 12

Actions from last meeting

Scrum of Scrums

https://phabricator.wikimedia.org/project/board/64/
Blocked on us: https://phabricator.wikimedia.org/maniphest/query/h7YTCBTJsepS/#R

This week

Last week

Other Team Business

Offsite

"Upgrade all mw* servers to debian jessie"

Q2 (Oct - Dec) Goals

Previously listed goals
  • Differential
    • fix Jenkins tests, maybe
    • migrate android
    • not a goal
  • Malu
    • pause
  • LLB + MW + Extension deploys to scap3 ?
    • not a goal
New goal proposals
  • Python software deployment via scap3 (Zuul + Nodepool)
    • think more on it (Tyler and Antoine), not a goal for now
  • CI Tech Debt
    • Determine long term plan for Nodepool
      • This needs to be more specific
    • Anything else?
    • think about how we use queues
      • split the queue per branch (e.g. security releases hitting multiple branches, 600 jobs) so more jobs can run in parallel
      • SWAT deploys: make the wmf branch go through as fast as it can
      • "Review and adjust CI queues for more parallel operations"
  • MW deploy tech debt (Experiment/Stretch)
    • scap swat
    • ability to have multiple checks on MW deploys (in addition to logstash), e.g. a Swagger spec for MW (node endpoint checking)
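
The per-branch queue idea from the CI Tech Debt item can be illustrated with a toy model. This is only a sketch and not Zuul configuration: the `Change` tuple, the `split_by_branch` helper, and the sample branch names are hypothetical, chosen for illustration.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a change waiting in a CI queue.
Change = namedtuple("Change", ["number", "branch"])


def split_by_branch(changes):
    """Group changes into one queue per branch, preserving arrival order."""
    queues = defaultdict(list)
    for change in changes:
        queues[change.branch].append(change)
    return dict(queues)


# Illustrative data only: a release touching several branches at once.
shared_queue = [
    Change(1, "master"),
    Change(2, "REL1_27"),
    Change(3, "wmf/1.28.0-wmf.16"),
    Change(4, "master"),
    Change(5, "REL1_27"),
]

queues = split_by_branch(shared_queue)
# In one shared queue a change waits behind everything ahead of it;
# with per-branch queues it only waits behind changes on its own branch.
for branch, queue in queues.items():
    print(branch, [change.number for change in queue])
```

In this model the wmf branch change no longer sits behind the master and release branch changes, which is the effect wanted for SWAT deploys.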



Q1 goal/project check-in

Phase out Ubuntu Precise

Replace primary production Continuous Integration host (gallium) - task T95757

Upgrade Phabricator database servers to Maria10/Jessie - task T138460

  • Done

Upgrade Beta Cluster database servers to Maria10/Jessie - task T138778

  • Chad will review Dan's patches
  • Dan will coordinate with Jaime for $whenever_works_for_them

Move Gerrit off of ytterbium - task T125018

  • Done

Reduce Technical Debt

Perform a technical debt analysis of software and services maintained by WMF Release Engineering - task T138225

Next steps:

  • Greg to get the documentation written up and call it done (for this goal for this quarter)


Streamline deployments (long-lived branches)

keyresult task:

  • Convert our production deployment strategy to use long-lived branches - task T89945

project view: https://phabricator.wikimedia.org/project/view/2117/

  • Tooling will probably be done
  • static asset conclusion might not be
  • scap swat coming along nicely
    • use the Gerrit REST API (has needed features not available over SSH)
      • will need some sort of shared account (with frequent credential rotation, potentially each deploy)
    • can use a .netrc right now
    • scraping Deployment calendar page is crappy
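
The `.netrc` approach to the Gerrit REST API mentioned above could look roughly like this. It is a minimal sketch, not the actual `scap swat` code: the `load_credentials` and `parse_gerrit_json` helpers are hypothetical names. It relies on two documented Gerrit behaviours: JSON responses start with a `)]}'` guard line that must be stripped before decoding, and authenticated endpoints live under the `/a/` URL prefix.

```python
import json
import netrc

# Gerrit prepends this line to JSON responses to guard against XSSI.
GERRIT_MAGIC_PREFIX = ")]}'"


def load_credentials(host, netrc_path=None):
    """Return (login, password) for host from a .netrc file, or None.

    With netrc_path=None the user's default ~/.netrc is read.
    """
    auth = netrc.netrc(netrc_path).authenticators(host)
    if auth is None:
        return None
    login, _account, password = auth
    return login, password


def parse_gerrit_json(body):
    """Decode a Gerrit REST response body, dropping the guard line."""
    if body.startswith(GERRIT_MAGIC_PREFIX):
        body = body[len(GERRIT_MAGIC_PREFIX):]
    return json.loads(body)


# Example with a made-up response body; no network access is needed.
sample_body = ")]}'\n" + '[{"_number": 12345, "status": "NEW"}]'
print(parse_gerrit_json(sample_body))
```

A matching `.netrc` entry has the form `machine <host> login <user> password <secret>`. Rotating the shared account's credentials would then only mean rewriting that one line.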

Non-Quarterly goal work

SWAT deploy changes

  • European SWAT deploys (task T137970)
  • Future changes?
    • requiring a task associated with each change being pushed out?
    • Add all swatters to each swat window, stop segmenting based on their availability (worst case they get a ping when they're not online)
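
If a task were required for each change, the check itself could be small. The sketch below assumes the task is linked through a `Bug: T<number>` footer in the commit message. The `linked_tasks` helper and the sample messages are hypothetical.

```python
import re

# Matches footers such as "Bug: T137970" on a line of their own.
TASK_FOOTER = re.compile(r"^Bug:\s*(T\d+)\s*$", re.MULTILINE)


def linked_tasks(commit_message):
    """Return the task IDs referenced in the commit message footers."""
    return TASK_FOOTER.findall(commit_message)


# Made-up commit messages for illustration.
with_task = "Enable feature on testwiki\n\nBug: T137970\nChange-Id: I0123abc\n"
without_task = "Quick config tweak\n"

for message in (with_task, without_task):
    tasks = linked_tasks(message)
    print("ok" if tasks else "missing task", tasks)
```

A SWAT tool could refuse, or merely warn about, any change for which the returned list is empty.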

CI Scaling/Nodepool

Browser tests

Beta Cluster

  • Long lived cherry pick stuff popping up again
  • Antoine got one out thanks to Brandon

Other

DB Inconsistencies

https://phabricator.wikimedia.org/T132416 and https://phabricator.wikimedia.org/T104459 (see also: https://www.mediawiki.org/wiki/Development_policy#Database_patches )


People status updates

Antoine

Last week

  • Catch up on Nodepool incident - DONE
  • Migrate jobs back to Nodepool instance - Week of Aug 29
    • Ideally get quota raised
  • Figure out contint1001 network with ops / Tyler
  • done: clear out 3 weeks' worth of mail
  • personal: learn how to play https://www.youtube.com/watch?v=d9i_zXmULyk
  • pet project: rake / rspec on puppet.git and tox for operations/software.git

This week

  • Migrate jobs back to Nodepool instance. Chase to monitor wmflabs as we progressively switch back. Starting on Tuesday Aug 30th
  • Figure out contint1001 network with ops / Tyler
    • Haven't pushed for it. Faidon is on vacation this week. Update: solved with Mark, using a public IP
  • personal: working on Ukulele major chords. C, D, F, G done. Todo: A B E. Probably gonna buy a guitar.
  • Branch cut / train deploy - done

Chad

Last week

  • MW release today (finally)
  • Finally going to do DB consistency script -- per our 1:1 this shouldn't be so hard
  • Long lived branches (long may they live)

This week

  • Diving into the DB consistency script. Doable, but hard :)
  • More long lived branches

Last week

This week

Mukunda

Last week

  • Finish the `scap swat` tool which is taking shape nicely.
  • Propose improvements to the scap remote execution API to make it easy to use from scap plugins
    • This could facilitate development of arbitrary scap checks which can be run separately from deployments
    • Will discuss with Tyler during the deployments meeting and go from there.

This week

  • Still finishing up scap swat
    • Hopefully, resolve the screen scraping debate
  • Start on `scap merge` branch management tooling

Tyler

Last week

  • Bugfix scap update
  • nodepool things

This week

Željko

Last week

This week
