Flow/Architecture/Memcache

Flow is written to take full advantage of a memcached caching layer. To this end we have implemented something very roughly equivalent to queryable indexes in memcache. We have considered moving this to Redis, but for now it uses the memcache infrastructure.

What is being cached?

All of the Flow-specific data models. The reverse-proxy cluster that serves most wiki content to visitors does not help most editors, who receive uncached pages. To give editors the responsiveness they deserve we cache aggressively within the application. We decided to cache at the data-model level rather than alternatives such as caching view fragments, both to simplify cache invalidation and to handle the variation in what different editors see based on their roles (although see Flow/Architecture/HTML caching).

What is actually written in the cache?

The Flow "Indexes" are always nested arrays of arrays. e.g.

array( array( /* first data model */ ), array( /* second data model */ ), ... );

Each index is defined with a memcache key prefix (e.g. 'flow_definition:pk'). This is combined with an equality condition such as array( 'definition_id' => 12345 ) to generate a cache key like 'flow_definition:pk:12345'. When the index is defined it is told which keys within the data model are used for indexing, so that it can always build cache keys with properties in the same order.
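As an illustration, key construction might look like the following minimal sketch (buildCacheKey is a hypothetical helper written for this page, not the actual Flow implementation):

  // Hypothetical sketch of cache key construction. The real logic lives
  // in the FeatureIndex classes; names here are illustrative only.
  function buildCacheKey( $prefix, array $indexedColumns, array $equalityCondition ) {
      $parts = array( $prefix );
      // Always iterate the columns in their defined order so the same
      // condition produces the same key regardless of argument order.
      foreach ( $indexedColumns as $column ) {
          $parts[] = $equalityCondition[$column];
      }
      return implode( ':', $parts );
  }

  // buildCacheKey( 'flow_definition:pk', array( 'definition_id' ),
  //     array( 'definition_id' => 12345 ) )
  // returns 'flow_definition:pk:12345'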

Currently we have two main index implementations, both of which build off a common parent, FeatureIndex. The UniqueFeatureIndex is mostly used for the primary index of each data model and holds exactly one data model matching a unique set of features. The TopKIndex stores the top K items matching a defined group of features (properties), sorted by a pre-defined feature of the data model. In addition to these two, a few indexes extend TopKIndex and attach additional related data to the model, allowing models to be indexed on data that is not explicitly part of the data model.
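In rough outline, the hierarchy looks like the skeleton below (a sketch only; the class names come from Flow, but the properties and methods are simplified and are not the extension's real API):

  // Illustrative skeleton of the index hierarchy.
  abstract class FeatureIndex {
      protected $prefix;          // e.g. 'flow_definition:pk'
      protected $indexedColumns;  // columns used to build cache keys
      abstract public function getLimit();
  }

  class UniqueFeatureIndex extends FeatureIndex {
      // Holds exactly one full data model per unique set of features.
      public function getLimit() { return 1; }
  }

  class TopKIndex extends FeatureIndex {
      protected $limit;   // K: maximum number of rows kept in the index
      protected $sortBy;  // the pre-defined feature the index is sorted on
      public function getLimit() { return $this->limit; }
  }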

For every data model there is exactly one primary index. This primary index generally uses the same keys as the primary key in MySQL. The primary index should be (though this is not specifically enforced) the only index that contains the full data models. For consistency the primary index still stores its value as a nested array of arrays, but there is always exactly one data model in it.

Most data models then use supplementary secondary (i.e. not primary) indexes. All secondary indexes limit their stored data to the piece of the data model necessary for looking up the full data model in the primary index, along with whatever values are needed for sorting. These secondary indexes are always pre-sorted and limited in length. From a developer's perspective, querying a data model's primary or secondary index is the same: both always return the full data model.
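Conceptually, a secondary-index query resolves in two steps, as in this hedged sketch ($secondaryIndex, $primaryIndex, and the method names are assumed index objects invented for illustration):

  // Sketch of a two-step lookup; all names here are hypothetical.
  // 1. The secondary index returns only the primary-key columns
  //    (plus sort values), pre-sorted and length-limited.
  $ids = $secondaryIndex->find( array( 'workflow_id' => $workflowId ) );

  // 2. Those keys are resolved against the primary index, which holds
  //    the full data models, ideally in a single multi-get.
  $models = $primaryIndex->findMulti( $ids );

  // The caller always receives full data models, regardless of which
  // index actually satisfied the query.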

Will this affect other users of memcache, like the parser cache?

It could, eventually. In the initial deployment Flow will be restricted to a small subset of pages and will use a relatively tiny amount of memory within the memcache cluster. Eventually, when deployed to millions of pages, the memory requirements could conceivably grow into the tens of gigabytes or more, depending on configuration and usage levels. Our plan to address this issue is outlined below. This plan is still very much a work in progress and will be refined before any wider release.

Data we need to collect about existing usage patterns

Memcache usage will be directly related to the number of distinct conversations accessed over the cache lifetime.

Data to be collected to help inform estimates:

  • Number of unique talk pages viewed over time spans from 12 hours to 2 weeks
  • Breakdown of talk pages viewed over a time span against how long since it was last requested
    • e.g. 50% of pages were requested within the last day, 20% within the last week, and 30% were last requested more than a week ago.

How long is it cached?

Memory usage within the cache is directly tied to usage levels, so there is no set-and-forget value for the cache lifetime. We instead need to determine which measurable values are affected and tune the cache lifetime to suit our requirements.

Possibly measurable data points:

  • cache hit and miss ratios (per-index?)
  • total cache memory usage
  • page response time
  • database load

For shorter lifetimes we expect the cache miss ratio to climb, total cache memory usage to decrease, database load to increase, and page response time to possibly increase. For longer lifetimes we expect the opposite.

Measuring these values outside the production environment seems difficult; without any real-world usage we don't really know how many pages will be in cache for a given lifetime. We are looking into a combination of running independent memcache instances for Flow to collect stats from, along with EventLogging a statistically significant sample of requests to Flow-enabled pages.

We expect the usable lower bound of the cache lifetime to be between 24 and 36 hours. This way, pages with a daily cycle of views in one part of the day and nothing in other parts stay in cache. The upper bound is limited by the amount of memory we want to dedicate to this caching, which depends on what measurable effect more or less cache actually has on the application.
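For instance, the lower bound might translate into a configuration setting along these lines (a hypothetical sketch; the variable name is illustrative, not a documented setting):

  // Hypothetical: pin the index cache lifetime at the suggested lower
  // bound of 24 hours (86400 seconds) until real measurements exist.
  $wgFlowCacheTime = 86400;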

Reads

All reads in Flow hit the memcache infrastructure first. Queries to storage are almost entirely expressed as equality conditions (along with options that specify details such as the preferred sort order), so we can use those equality conditions as a cache key. Reads within Flow always prefer to multi-get content from memcache to reduce the number of round trips required. This works well combined with twemproxy, which receives the full multi-get and parallelizes those requests to the appropriate memcache servers.
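For example, a batched read might look like this minimal sketch using the php-memcached client (the endpoint and keys are illustrative; Flow itself goes through MediaWiki's cache abstraction rather than the raw client):

  // Sketch: one getMulti round trip instead of N sequential gets.
  // twemproxy splits the batch across the appropriate backend servers.
  $cache = new Memcached();
  $cache->addServer( '127.0.0.1', 11211 ); // twemproxy endpoint (illustrative)

  $keys = array(
      'flow_definition:pk:12345',
      'flow_definition:pk:12346',
  );
  $indexes = $cache->getMulti( $keys ); // missing keys are simply absent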

We prefer there to be a single source of truth within the memcache cluster. This is accomplished by using a single primary index holding the full data model, along with multiple secondary indexes that point into the primary index. This is a convention and not enforced by the implementation; we can see future use cases having different requirements.

Writes

For the most part wikis don't delete anything. Even things that are "deleted" are really only hidden from view, except in specific, very limited cases. This helps to limit (but not remove) our exposure to race conditions.

Updating data in memcache is done after the MySQL transaction has been committed. To update an existing index we use a CAS (check-and-set) operation to (see the sketch after this list):

  1. Pull the current index value from memcache
  2. Add the new value to the index
  3. Sort the index
  4. Limit the index to the predefined size
  5. Write it back to memcache
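A minimal sketch of that sequence using the php-memcached 3.x client's CAS primitives (the key name, sort field, and size limit are illustrative; the real Flow code works through its index classes):

  // Sketch of the post-commit index update; retried if another writer
  // races us between the get and the cas.
  function updateIndex( Memcached $cache, $key, array $newRow, $limit ) {
      do {
          // 1. Pull the current index value along with its CAS token.
          $result = $cache->get( $key, null, Memcached::GET_EXTENDED );
          if ( $result === false ) {
              return; // index not in cache; nothing to update
          }
          $index = $result['value'];

          // 2. Add the new value to the index.
          $index[] = $newRow;

          // 3. Sort the index on its pre-defined feature (illustrative field).
          usort( $index, function ( $a, $b ) {
              return $b['timestamp'] - $a['timestamp'];
          } );

          // 4. Limit the index to the predefined size.
          $index = array_slice( $index, 0, $limit );

          // 5. Write it back; cas() fails if the token is stale, so we retry.
      } while ( !$cache->cas( $result['cas'], $key, $index ) );
  }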

See also