Scaling out in Kubernetes happens by increasing the number of pods, so this sentence needs some rewording. A Deployment spins up a batch of new pods (25% of the total by default, though configurable) and terminates the same number of pods from the previous deploy. Rinse and repeat in an A/B fashion until all of the pods have been replaced.
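For reference, the rolling-update behaviour described above corresponds to the standard `apps/v1` Deployment strategy fields; a minimal sketch (the name `mediawiki` and replica count are hypothetical, and the 25% values shown are the Kubernetes defaults):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mediawiki            # hypothetical name, for illustration only
spec:
  replicas: 8
  selector:
    matchLabels:
      app: mediawiki
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%          # how many extra new pods may be created at once
      maxUnavailable: 25%    # how many old pods may be killed before replacements are Ready
  template:
    metadata:
      labels:
        app: mediawiki
    spec:
      containers:
        - name: mediawiki
          image: example/mediawiki:latest   # placeholder image
```

Both `maxSurge` and `maxUnavailable` accept either a percentage or an absolute pod count, which is where the "configurable" part comes in.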
Topic on Talk:Wikimedia Release Engineering Team/MW-in-Containers thoughts
Ah, yes, will re-word.
Is this change sufficient?
Yes, I think so. I am still a bit unclear on the "How do we tell the controller (?) which deployment state to route a given request to?" part, but judging from the question mark after "controller", we probably want to define that first.
That's wrapped up in the decision about how we want to do the A/B split of traffic to the new deploy: do we split uniformly at random across all request types (standard k8s behaviour), or do we do something closer to what we do now (roll out by request type, sharded by target wiki), or something else? I've left that open, as I don't think it's been discussed.
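To make the two options concrete: with a service mesh such as Istio (just one possible implementation, not a decision) the uniform split is a weighted route, while the "shard by target wiki" option becomes a host-based match routed ahead of the weighted default. A hedged sketch, with all names (`mediawiki`, the subsets, the wiki hostname) hypothetical:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: mediawiki
spec:
  hosts:
    - mediawiki.svc.cluster.local
  http:
    # Option B: shard by target wiki — requests for one wiki go to the new deploy.
    - match:
        - headers:
            host:
              exact: test.wikipedia.org   # hypothetical pilot wiki
      route:
        - destination:
            host: mediawiki
            subset: new                   # pods from the new deploy
    # Option A: uniform random split across everything else.
    - route:
        - destination:
            host: mediawiki
            subset: old                   # pods from the previous deploy
          weight: 90
        - destination:
            host: mediawiki
            subset: new
          weight: 10
```

Whichever way we go, this is the piece that would answer the "which controller?" question above.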