Wikimedia Release Engineering Team/Checkin archive/20160829


Vacations/Important dates

How to do it:

  • Sept 02: Q2 goals draft published, Dan out
  • Sept 05: US Holiday (Labor Day)
  • Sept 23: Q2 goals finalized
  • Oct 01: Start of Q2
  • October 10: US Holiday (Indigenous Peoples' Day)
  • October 17-21: Offsite in Washington D.C.
  • October 31: Mukunda
  • October 28 - Nov 2 (ish) - Chad
  • November 24: US Holiday (Thanksgiving)
  • January 9-11: Dev Summit
  • January 12-13: All Hands

Team Business

Time spent spreadsheet

Rotating positions and absences

Maniphest query for deployment blocker tasks:

weeks of Aug 22 and Aug 29

weeks of Sep 05 and Sep 12

Actions from last meeting

Scrum of Scrums
Blocked on us:

This week

Last week

Other Team Business


"Upgrade all mw* servers to debian jessie"

Q2 (Oct - Dec) Goals

Previously listed goals
  • Differential
    • fix Jenkins tests, maybe
    • migrate android
    • not a goal
  • Malu
    • pause
  • LLB + MW + Extension deploys to scap3 ?
    • not a goal
New goal proposals
  • Python software deployment via scap3 (Zuul + Nodepool)
    • think more on it (Tyler and Antoine), not a goal for now
  • CI Tech Debt
    • Determine long term plan for Nodepool
      • This needs to be more specific
    • Anything else?
    • think about how we use queues
      • split queue per branch (eg: security releases hitting multiple branches, 600 jobs), can make more run parallel
      • SWAT deploys: make the wmf branch go through as fast as it can
      • "Review and adjust CI queues for more parallel operations"
  • MW deploy tech debt (Experiment/Stretch)
    • scap swat
    • ability to have multiple checks on MW deploys (in addition to logstash), eg swagger spec for MW (node endpoints checking)
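The "split queue per branch" idea above could be sketched roughly as follows. This is a minimal illustration and not how Zuul is configured: the function names, the change IDs, the sample branch names, and the rule that `wmf/` branches go first are all assumptions made up for the example.

```python
from collections import defaultdict, deque


def split_by_branch(changes):
    """Group (change_id, branch) pairs into one FIFO queue per branch.

    With a single shared queue, a security release touching several
    branches is serialized; with one queue per branch, each branch's
    jobs could run in parallel with the others.
    """
    queues = defaultdict(deque)
    for change_id, branch in changes:
        queues[branch].append(change_id)
    return dict(queues)


def branch_order(queues):
    """Return branch names with wmf/* deployment branches first.

    Assumed priority rule, reflecting the SWAT note that the wmf branch
    should go through as fast as it can.
    """
    return sorted(queues, key=lambda b: (not b.startswith("wmf/"), b))


# Hypothetical sample data, for illustration only.
changes = [
    ("c1", "master"),
    ("c2", "wmf/1.28.0-wmf.16"),
    ("c3", "master"),
    ("c4", "REL1_27"),
]
queues = split_by_branch(changes)
order = branch_order(queues)
```

Changes on the same branch keep their relative order, while the three branches no longer wait on each other.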

Q1 goal/project check-in

Phase out Ubuntu Precise

Replace primary production Continuous Integration host (gallium) - task T95757

Upgrade Phabricator database servers to Maria10/Jessie - task T138460

  • Done

Upgrade Beta Cluster database servers to Maria10/Jessie - task T138778

  • Chad will review Dan's patches
  • Dan will coordinate with Jaime for $whenever_works_for_them

Move Gerrit off of ytterbium - task T125018

  • Done

Reduce Technical Debt

Perform a technical debt analysis of software and services maintained by WMF Release Engineering - task T138225

Next steps:

  • Greg to get the documentation written and call it done (for this goal, this quarter)

Streamline deployments (long-lived branches)

keyresult task:

  • Convert our production deployment strategy to use long-lived branches - task T89945

project view:

  • Tooling will probably be done
  • static asset conclusion might not be
  • scap swat coming along nicely
    • use the Gerrit REST API (has needed features not available over SSH)
      • will need some sort of shared account (with frequent credential rotation, potentially each deploy)
    • can use a .netrc right now
    • scraping Deployment calendar page is crappy

Non-Quarterly goal work

SWAT deploy changes

  • European SWAT deploys (task T137970)
  • Future changes?
    • requiring a task associated with each change being pushed out?
    • Add all swatters to each swat window, stop segmenting based on their availability (worst case they get a ping when they're not online)

CI Scaling/Nodepool

Browser tests

Beta Cluster

  • Long lived cherry pick stuff popping up again
  • Antoine got one out thanks to Brandon


DB Inconsistencies

People status updates


Last week

  • Catch up on Nodepool incident - DONE
  • Migrate jobs back to Nodepool instance - Week of Aug 29
    • Ideally get quota raised
  • Figure out contint1001 network with ops / Tyler
  • done: clear out 3 weeks worth of mails
  • personal: learn how to play
  • pet project: rake / rspec on puppet.git and tox for operations/software.git

This week

  • Migrate jobs back to Nodepool instance. Chase to monitor wmflabs as we progressively switch back. Starting on Tuesday Aug 30th
  • Figure out contint1001 network with ops / Tyler
    • Haven't pushed for it. Faidon is on vacation this week. -- solved with Mark --> public IP
  • personal: working on Ukulele major chords. C, D, F, G done. Todo: A B E. Probably gonna buy a guitar.
  • Branch cut / train deploy - done


Last week

  • MW release today (finally)
  • Finally going to do DB consistency script -- per our 1:1 this shouldn't be so hard
  • Long lived branches (long may they live)

This week

  • Diving into the DB consistency script. Doable, but hard :)
  • More long lived branches


Last week

This week


Last week

  • Finish the `scap swat` tool which is taking shape nicely.
  • Propose improvements to the scap remote execution API to make it easy to use from scap plugins
    • This could facilitate development of arbitrary scap checks which can be run separately from deployments
    • Will discuss with Tyler during the deployments meeting and go from there.

This week

  • Still finishing up scap swat
    • Hopefully, resolve the screen scraping debate
  • Start on `scap merge` branch management tooling


Last week

  • Bugfix scap update
  • nodepool things

This week


Last week

This week