Castor is an umbrella term for the caching of dependencies and package manager material for the isolated instances.
CI jobs start up in a fresh environment and have to retrieve dependencies over the internet and, for native dependencies, compile them. The download phase can be arbitrarily long with package managers such as maven, which download a long list of dependencies, and it carries the risk of upstream blacklisting our network for abusing bandwidth. The installation and compile phase can be quite slow as well, and it does not make sense to compile the same material again and again.
We introduced a rudimentary system based on rsync. Whenever a change succeeds in the Zuul gate-and-submit pipeline, it copies a list of directories from the instance to a central place. When a job starts, it first attempts to retrieve the material from the central cache, thus warming up the local cache before invoking the package manager. The cache itself is namespaced by:
- The git project name
- The git branch the patch has been made against
- The Jenkins job name
For reference see
integration-castor05.integration.eqiad.wmflabs configured in Jenkins via
When a job in gate-and-submit is successful, it triggers the Jenkins job castor-save, which runs on the Castor instance. That job connects to the instance the original gate job ran on and rsyncs the package manager caches to the Castor instance.
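The save step can be sketched as the construction of a single rsync invocation. This is only an illustration under stated assumptions: the function name `build_save_command`, the host name `agent-1`, the cache directories `.npm` and `.m2`, and the choice of the default rsync-over-ssh transport are all hypothetical, and the real castor-save job may use a different transport and different flags.

```python
def build_save_command(source_host: str, cache_dirs: list[str], dest_dir: str) -> list[str]:
    """Build an rsync argv that pulls cache directories from a build instance.

    Hypothetical sketch: it only assembles the command and does not run it.
    """
    cmd = ["rsync", "--archive"]  # --archive preserves permissions, times and symlinks
    # Paths without a leading "/" are resolved relative to the remote home directory.
    cmd += [f"{source_host}:{cache_dir}" for cache_dir in cache_dirs]
    cmd.append(dest_dir)  # local destination on the Castor instance
    return cmd


if __name__ == "__main__":
    # All values below are placeholders chosen for illustration.
    print(build_save_command("agent-1", [".npm", ".m2"], "/srv/castor/example-namespace/"))
```

The resulting list can be handed to `subprocess.run` on the Castor instance. Building the command separately from running it keeps the sketch testable without network access.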
The cache is namespaced by: Gerrit project name with / replaced by - (eg: mediawiki-core), target branch (eg: master) and job name.
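As a minimal sketch of that namespacing, the three components can be joined into a relative directory. The function name `cache_namespace`, the ordering of the components as nested directories, and the job name `example-job` are assumptions made for illustration. The text only states that `/` is replaced in the project name, so the branch is left untouched here.

```python
from pathlib import PurePosixPath


def cache_namespace(project: str, branch: str, job_name: str) -> PurePosixPath:
    """Return a relative cache directory for one project/branch/job triple."""
    # Gerrit project names contain "/", which would otherwise create nested directories.
    flat_project = project.replace("/", "-")
    # Assumption: components are nested as project/branch/job.
    return PurePosixPath(flat_project) / branch / job_name


if __name__ == "__main__":
    # "example-job" is a placeholder job name.
    print(cache_namespace("mediawiki/core", "master", "example-job"))
```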
Each job has a builder macro that attempts to rsync the cache from castor into the home directory, thus populating the local cache. When the package manager installer is run (eg: npm install), it hits the local cache, saving it from having to download packages over the internet.
The JJB macro refers to the host using the CASTOR_HOST environment variable, which is configured as a global variable on the Jenkins controller.
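The load step could look roughly like the following sketch, which reads `CASTOR_HOST` from the environment and treats the cache as best effort. The function names, the namespace string, the remote path layout under `/srv/castor`, and the rsync-over-ssh transport are assumptions for illustration. The actual builder macro is a JJB definition and may differ in every detail.

```python
import os
import subprocess
from collections.abc import Mapping


def build_load_command(namespace: str, home: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Build an rsync argv that pulls one cache namespace into the home directory."""
    host = env["CASTOR_HOST"]  # global variable set on the Jenkins controller
    # Trailing slashes copy the directory contents rather than the directory itself.
    return ["rsync", "--archive", f"{host}:/srv/castor/{namespace}/", f"{home}/"]


def warm_cache(cmd: list[str]) -> bool:
    """Run the command and report success without ever raising on failure.

    Assumption: a missing or unreachable cache only costs time, since the
    package manager can still download everything itself.
    """
    try:
        return subprocess.run(cmd, check=False).returncode == 0
    except OSError:
        # For example, the rsync binary is not installed on this instance.
        return False
```

A job would call `warm_cache(build_load_command(...))` before the package manager runs and carry on regardless of the result.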
/srv/castor is a Cinder Volume mounted in the instance.