Wikimedia Release Engineering Team/Deployment pipeline/2019-02-28
Last Time
Current Quarter Goals
- Roughly a month left!
- TEC3:O6:O6.1:Q3: Deployment Pipeline Documentation
- cf: [#TODOs_from_last_time]
- TEC3:O3:O3.1:Q3: Move cxserver, citoid, changeprop, eventgate (new service) and ORES (partially) through the production CD Pipeline
- In progress cxserver
- In progress citoid
- Images built via deployment pipeline
- Deployed
- Traffic not switched yet
- changeprop
- Done eventgate
- In progress ORES
- cf: Dan's comments
General
- Missed talking about last time: CI for Go
- https://phabricator.wikimedia.org/T209106
- There is a blubber.yaml now
- Runs through service-pipeline-test and service-pipeline-test-and-publish as of yesterday
- beta code stewardship request
- Staging discussion -- first canaries then staging
- Staging supplants some uses of beta, but not all
- Joe has a thing to ensure services exist in beta during transition
- Databases hosted on bare metal?
- No clear cut answer as yet
- Discussion still ongoing
- aside: outgoing firewalls in place now as part of transition
TODOs from last time
- Done TODO: thcipriani to make a task for continuous deployment. What's missing? A k8s API token on contint1001 to deploy
- Jenkins attack vectors
- We worry about escape/security holes
- Jenkins is not known to be highly secure
- What about a way to put the logs somewhere else and make Jenkins non-public?
- nginx, for example - they're on disk
- The real solution here is probably 2 separate instances of Jenkins
- TODO: start a document listing the various attack vectors
- There's a releases Jenkins - https://releases-jenkins.wikimedia.org
- Maybe things meant for production use should be built there
- Hosted on releases1001
- thcipriani/hashar to work on figuring out what stages happen on which Jenkins
- +2 meaning deployment is a cultural shift
- MediaWiki vs MediaWiki/config
- In progress TODO: support documentation like the one Tyler did for the portal and pipeline/helmfile and deployment
- https://wikitech.wikimedia.org/wiki/Deployment_pipeline now exists, https://wikitech.wikimedia.org/wiki/Continuous_Delivery has been deleted.
- TODO: Joe & James_F to work on eventual 2019-04-01 email
- Beware: announcements made on 04/01 can be mistaken for an April Fools' joke
- In progress TODO: improve feedback from pipeline -- link to actual failing job, show images, and tags as applicable
- Done create task: Pipeline feedback https://phabricator.wikimedia.org/T177868
- Done Gerrit Styling patch: https://gerrit.wikimedia.org/r/c/operations/puppet/+/490640
- Done Groovy in integration/pipelinelib patch: https://gerrit.wikimedia.org/r/#/c/integration/pipelinelib/+/492225/
- Done Bot created for commenting https://wikitech.wikimedia.org/wiki/User:PipelineBot
- Add PipelineBot credentials to Jenkins
- Patch service-pipeline-* jobs to use new integration/pipelinelib code
- Aim for end of week
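The pipeline feedback item above (link to the actual failing job, show images and tags as applicable) could be sketched roughly as follows. This is only an illustration in Python, not the Groovy code in integration/pipelinelib; the `PipelineResult` class, its fields, and `format_feedback` are hypothetical names introduced here, and the URLs in the usage note are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PipelineResult:
    """Hypothetical summary of one pipeline run (not the pipelinelib data model)."""
    job_name: str
    build_url: str                      # link to the actual job, e.g. its console page
    succeeded: bool
    image: Optional[str] = None         # published image name, if one was built
    tags: List[str] = field(default_factory=list)


def format_feedback(result: PipelineResult) -> str:
    """Build the text of a review comment from a pipeline result.

    Image and tag lines are only included when they apply.
    """
    status = "SUCCESS" if result.succeeded else "FAILURE"
    lines = [
        f"pipeline job {result.job_name}: {status}",
        f"build: {result.build_url}",
    ]
    if result.image:
        lines.append(f"image: {result.image}")
    if result.tags:
        lines.append("tags: " + ", ".join(result.tags))
    return "\n".join(lines)
```

For a failing run, `format_feedback(PipelineResult("service-pipeline-test", "https://example.org/job/1/console", False))` yields a two-line comment naming the job and linking to the build, with no image or tag lines.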
RelEng
- Proposals written by Dan could use feedback if anyone has any:
- The pipeline should provide a way to save artifacts from a stage
- .pipeline/config.yaml Proposal The Latest™ (top level task still ORES one "The continuous release pipeline should support more than one service per repo")
- Local charts for dev
- https://gerrit.wikimedia.org/g/releng/local-charts
- How to push docker images somewhere for local development
- Parsoid image for local development or MediaWiki for local dev
- docker-pkg for localdev images, upload from contint1001
- TODO: Jeena, Brennen, Tyler to sort out the specific process
- CI future working group: https://www.mediawiki.org/wiki/Wikimedia_Release_Engineering_Team/CI_Futures_WG
- Blog on Phabricator later today
- Has just started; a preliminary list of requirements and candidates is on the above page
- Would it be feasible to use spare servers for CI in the future? Even if they're only available for a short period?
- Alex: There are resources for CI coming up, quite a large amount of money
Serviceops
- citoid deployed on staging and production
- Still working out the kinks (e.g. metrics)
- Need to switch traffic to it and deprecate the old method.
- eventgate done \o/