Wikimedia Release Engineering Team/SpiderPig/Meeting notes/2024-08-22

So, this meeting time sucks.

  • Decided to leave it here for now; if it conflicts with team members' train responsibilities, we can move or postpone on those weeks

WE6.2.3: If we create a new deployment UI that provides more information to the deployer and reduces the amount of privilege needed to deploy, it will make deployment easier and open deployments to more users, as measured by the number of unique deployers and the number of patches backported as a percentage of our overall deployments.

  • Total number of unique deployers by method
  • Total deployers overall by method vs total users of a particular method
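
A minimal sketch of how the two measures above could be computed, assuming each deployment is logged as a (deployer, method) pair. The record shape, the method names, and the function name are hypothetical, not something scap provides today.

```python
from collections import defaultdict


def deployer_metrics(deployments, backport_method="backport"):
    """Summarize (deployer, method) records, one record per deployment."""
    deployers_by_method = defaultdict(set)
    backports = 0
    total = 0
    for deployer, method in deployments:
        deployers_by_method[method].add(deployer)
        total += 1
        if method == backport_method:
            backports += 1
    # A deployer who used several methods is counted once overall.
    everyone = set().union(*deployers_by_method.values())
    return {
        "unique_by_method": {m: len(d) for m, d in deployers_by_method.items()},
        "unique_overall": len(everyone),
        # Backports as a percentage of all deployments (0.0 when there is no data).
        "backport_pct": 100.0 * backports / total if total else 0.0,
    }
```

Counting deployers with sets means a person who deploys ten times through one method still counts once for that method.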

Goal of this meeting:

  • Introduce the project
  • Determine appropriate next steps, refine roadmap
  • Determine missing roles/competencies
  • Determine timeline of next steps

Project roadmap

By end of August

By end of Q1 (September 30th)

  • Finish the refactor of backport
  • Not realistic at this point; straw-dog estimate: end of November.

Future work

  • Rollback: anything to do explicitly here?
  • Adding a queue to scap backport
  • Adding a scap spiderpig command to create a daemon
  • Adding IDP auth to spiderpig/web (https://phabricator.wikimedia.org/T372892)
  • Design work for web ui and flow (could be done at any time)
  • Creating an output mechanism suitable for web UI (with secrets masking)
  • Job history (presentation)
  • Adding deployment monitoring to web UI
    • Progress reporting, time estimation
  • Creating a websocket within spiderpig
  • Puppet work for:
    • Daemon monitoring/logging/logrotate
    • Websocket passthrough ATS (Apache Traffic Server)
    • TLS termination (unknowns here)
  • Admin tools
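
For the "output mechanism with secrets masking" item above, one possible approach is sketched here, assuming the set of secret values is known before output reaches the web UI. The function name and placeholder are hypothetical; nothing here reflects an existing scap API.

```python
def mask_secrets(line, secrets, placeholder="****"):
    """Return `line` with every known secret value replaced by `placeholder`."""
    # Replace longer secrets first so a secret that contains a shorter one
    # is masked whole rather than partially.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # skip empty strings, which would match everywhere
            line = line.replace(secret, placeholder)
    return line
```

Masking by known value is simple but only catches secrets the daemon already knows about; pattern-based detection would be a separate design decision.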

Conflicts I'm aware of

  • Jeena and Jaime: Catalyst work, expected to continue
  • Ahmon: Working on group -1, initial hypothesis wrapping soon-ish
  • Jeena out almost all Sept
  • Train
  • Essential bugfix work

Next steps

  • Jeena can start working on the refactor
  • Jaime/Jeena to sync on steps
  • Tyler to check in with design research for design support
  • Ahmon to start working on user flow through web-ui

Roles

  • Tyler: Responsible for resourcing, timelines, scope, reporting
  • Ahmon: tech lead of web -- responsible for scap web
  • Jeena + Jaime: tech leads for backport -- accountable for the technical path of scap web through backport

Questions

  • 2FA for IDP: https://phabricator.wikimedia.org/T372892
  • How does a failed rollback or deployment affect the deploys in the queue?
    • Failed rollback should admin lock
  • Queue persistence?
    • Yes, small file on disk, access will need to be serialized
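
A rough sketch of the "small file on disk, access serialized" idea, assuming a JSON list as the queue format and an advisory lock file next to it. The file layout and the function names are hypothetical; `fcntl.flock` is Unix-only and only serializes processes that also take the lock.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked_queue(path):
    """Yield the queue (a list) under an exclusive lock; persist it on clean exit."""
    # Hypothetical layout: the queue lives at `path`, the lock at `path + ".lock"`.
    with open(path + ".lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks until no other process holds the lock
        try:
            try:
                with open(path) as f:
                    queue = json.load(f)
            except FileNotFoundError:
                queue = []  # first use: start with an empty queue
            yield queue
            # Write to a temp file and rename it into place, so a crash
            # mid-write never leaves a half-written queue file.
            fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
            with os.fdopen(fd, "w") as f:
                json.dump(queue, f)
            os.replace(tmp, path)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)


def enqueue(path, change_id):
    """Append a change to the end of the queue."""
    with locked_queue(path) as queue:
        queue.append(change_id)


def dequeue(path):
    """Remove and return the oldest change, or None if the queue is empty."""
    with locked_queue(path) as queue:
        return queue.pop(0) if queue else None
```

If the body of the `with` block raises, the queue file is left untouched, which fits the idea that a failure should stop processing rather than silently drop queued deploys.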