Core Platform Team/Initiative/Multi-DC Echo Notification Storage/Initiative Description

Summary

We will provide a key-value store backend for the Echo notification extension that will work in an active-active data centre environment.

Significance and Motivation

Once multiple data centres run active MediaWiki servers, our application and storage infrastructure has to adapt. In particular, the assumption that databases will be read in only one data centre is no longer valid; we need replicated databases in our two North American data centres, and our architecture should support further decentralization of data.

Echo is an extension for in-wiki notifications. Because it lists notifications in the Web interface, it keeps a "last-read" timestamp so that only unread notifications are shown. Each registered user of Wikimedia sites has two such timestamps: one for alerts and one for notifications.
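As a rough illustration of how a per-section last-read timestamp separates unread from read items, consider the sketch below. It is not Echo's actual code: the `Notification` class, the `unread` helper, and the section labels "alerts" and "notifications" are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Notification:
    section: str    # hypothetical label: "alerts" or "notifications"
    timestamp: int  # creation time, seconds since the Unix epoch
    text: str


def unread(items: List[Notification], last_read: Dict[str, int]) -> List[Notification]:
    """Keep only items newer than the last-read timestamp of their section.

    A section with no stored timestamp is treated as never read.
    """
    return [n for n in items if n.timestamp > last_read.get(n.section, 0)]


items = [
    Notification("alerts", 100, "You were mentioned"),
    Notification("alerts", 300, "Your edit was reverted"),
    Notification("notifications", 200, "You were thanked"),
]
# The user last read alerts at t=150 and has never opened the other section.
pending = unread(items, {"alerts": 150})
```

Under these assumptions, only two small integers per user need to be stored, which is why a simple key-value store is sufficient for this data.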

The notification timestamps are currently stored using the MainStash abstraction in MediaWiki. MainStash is a global key-value store that, unlike most database tables, requires no schema or database setup. In the Wikimedia environment, MainStash is implemented as a RedisBagOStuff, writing to Redis servers in the Wikimedia main data centre.
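The value of a key-value abstraction here is that the calling code does not care which backend holds the data. The sketch below shows that idea in Python rather than in MediaWiki's PHP. Everything in it is an assumption made for illustration: `KeyValueStore`, `InMemoryStore`, `LastReadTimestamps`, and the key layout are hypothetical and do not reflect the real BagOStuff interface or Echo's key names.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class KeyValueStore(ABC):
    """Hypothetical minimal store interface: string keys, string values."""

    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...

    @abstractmethod
    def set(self, key: str, value: str) -> None: ...


class InMemoryStore(KeyValueStore):
    """Stand-in backend; a Redis- or REST-backed class would expose the same methods."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class LastReadTimestamps:
    """Stores one last-read timestamp per user and section in any KeyValueStore."""

    def __init__(self, store: KeyValueStore) -> None:
        self._store = store

    @staticmethod
    def _key(user_id: int, section: str) -> str:
        # Assumed key layout, for illustration only.
        return f"echo:last-read:{section}:{user_id}"

    def mark_read(self, user_id: int, section: str, timestamp: int) -> None:
        self._store.set(self._key(user_id, section), str(timestamp))

    def last_read(self, user_id: int, section: str) -> Optional[int]:
        value = self._store.get(self._key(user_id, section))
        return int(value) if value is not None else None


timestamps = LastReadTimestamps(InMemoryStore())
timestamps.mark_read(42, "alerts", 1700000000)
```

Because `LastReadTimestamps` depends only on the interface, moving to a different backend in this sketch means passing a different store object to the constructor, with no change to the calling code.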

We do not expect Redis to be able to extend to a multi-DC environment. Much of the data currently stored in Redis has been or will be moved to other storage. The goal of this project is to move Echo notification timestamps to storage that is more multi-DC-friendly.

Our experience with moving Wikimedia session storage to the new Kask key-value server should be helpful in determining storage requirements for Echo.

Outcomes
  • Echo notifications are not a blocker for active-active DC deployment
  • No noticeable degradation of performance for Echo notification UI
  • No noticeable errors in Echo notification UI
Baseline Metrics
  • Echo notification last-read timestamps are stored in MainStash
  • MainStash is implemented as a Redis server
  • The WMF Redis configuration is not multi-DC-ready, and we do not foresee a configuration that will be
Target Metrics
  • Echo notification last-read timestamps are stored in Cassandra
  • The Echo extension uses a configurable key-value store, such as RESTBagOStuff, to store notification data
  • Kask REST service brokers data storage
Stakeholders
  • Growth (Echo owners)
  • SRE (for new services)
Known Dependencies/Blockers

None given