Wikimedia Release Engineering Team/Group -1/Progress reports/2024-07-25


Report on activities in the Group -1 project for the week ending 2024-07-25.

[WE6.2.1] Publish pre-train single version containers

Progress update
  • Stage: Design
  • Discussions within the team and with SRE Service Operations are clarifying the scope of the project in relation to other hypotheses. We now believe that we must prepare to produce a matrix of containers that vary along several independent dimensions. The dimensions identified so far are:
    • MediaWiki version
    • debug/non-debug runtime support
    • maintenance scripts/web runtime
    • embargoed security patches/public code
    • deployment environment specific secrets
    • PHP runtime version
  • Ahmon's changes to add FORCE_MW_VERSION environment variable support to the Multiversion runtime scripts have been merged.
  • Ahmon is cleaning up his initial proof-of-concept implementation that builds a single-version image.
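
To give a feel for how quickly the container matrix described above can grow, here is a minimal Python sketch that enumerates every combination of the six identified dimensions. The dimension names follow the list in this report; every value (such as "wmf-version-a" or "php-x"), the function names, and the tag naming scheme are hypothetical placeholders chosen for illustration, not actual build targets or the team's design.

```python
from itertools import product

# Dimension names follow the report. Every value below is an illustrative
# placeholder (assumption), not a confirmed build target.
DIMENSIONS = {
    "mediawiki_version": ["wmf-version-a", "wmf-version-b"],
    "debug": [False, True],
    "runtime": ["web", "maintenance"],
    "security_patches": ["public", "embargoed"],
    "secrets_env": ["env-a", "env-b"],
    "php_version": ["php-x", "php-y"],
}


def container_matrix(dimensions):
    """Return one dict per combination of dimension values."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


def image_tag(variant):
    """Build a tag string; this naming scheme is invented for the sketch."""
    parts = [
        variant["mediawiki_version"],
        variant["php_version"],
        variant["runtime"],
        "debug" if variant["debug"] else "nodebug",
        variant["security_patches"],
        variant["secrets_env"],
    ]
    return "-".join(parts)


matrix = container_matrix(DIMENSIONS)
print(len(matrix))  # 2**6 = 64 with two values per dimension
print(image_tag(matrix[0]))
```

With only two values per dimension, the full cross product already contains 64 variants. The report does not say whether every combination will be built, so a real implementation would likely filter this list down to the combinations that are actually needed.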
Any new metrics related to the hypothesis
  • None
Any emerging blockers or risks
  • Discussions with the SRE Service Operations team about their upcoming project to support regular PHP version upgrades (WE5.4.2) have flagged the lack of MediaWiki container use in the beta cluster as an emerging risk. The risk as currently envisioned is that PHP 8.1 will only be introduced in production as a container, and that this will happen before we are ready to move the pre-train testing use cases of Quality and Test Engineering (QTE) to a new environment. We will therefore need to find a way for QTE to test under a PHP version that SRE does not plan to support via the Puppet-driven provisioning currently used in the beta cluster. At this point we feel that the proper way to address this risk is to formulate a new hypothesis, following WE6.2.1, that will result in a containerized runtime for MediaWiki in the beta cluster. This need was already under consideration, but it has been elevated from a nice-to-have to a hard requirement. ServiceOps has indicated that they expect to start on WE5.4.2 between mid-Q2 (November 2024) and early Q3 (January 2025). We currently think this start date will be late enough for us to be ready to work on the new hypothesis for the beta cluster in parallel.
Any unresolved dependencies - do you depend on another team that hasn’t already given you what you need? Are you on the hook to give another team something you aren’t able to give right now?
  • Not yet, but see emerging risks
Have there been any new lessons from the hypothesis?
  • This is more reinforced knowledge than a new lesson: the regular IC-level sync between Release Engineering and SRE Service Operations, established to support the MediaWiki on Kubernetes project, continues to provide good value. This meeting of committed participants regularly surfaces both issues and experimental solutions for both groups.
Have there been any changes to the hypothesis scope or timeline?
  • Discussions with SRE Service Operations uncovered the need for several additional dimensions of container variation. These are currently seen as clarifying information rather than disruptive requirements.