Core Platform Team/Initiative/Multi-DC Echo Notification Storage/Epics, User Stories, and Requirements


Personas

  • User - a registered Wikimedia user
  • Infrequent User - a registered Wikimedia user who logs in "infrequently" (boundary of frequent/infrequent TBD)
  • Systems Administrator - a systems administrator

Epic 1

User Stories
ID | Description | Priority | Notes
1 | As a User, I want to see unread notifications, so that I get timely notifications without a lot of confusing older messages. | Must have | This is the "read" function for the timestamps. The timestamps are read on the server side by the Echo notification code, and change the appearance of the "notifications" and "alerts" indicators on each wiki page. There are potential race conditions if the timestamp was recently written in another data centre and the row has not yet propagated.
2 | As a User, I want the system to remember that I read notifications when I click the Notices popup on the main Web UI, so that I don't have to get notified of those messages again. | Must have | This is one "write" function for the timestamps. When the alerts or notifications UI is popped open, a POST request is sent from the browser to the MediaWiki server, handled by the Echo extension, to reset the last-read timestamp to now.
3 | As a User, I want the system to remember that I read notifications when I go to the All Notifications page, so that I don't have to get notified of those messages again. | Must have | This is the other "write" function for the timestamps. When the user navigates to the Special:Notifications page, the server resets the last-read timestamp, even though it is an HTTP GET request. In our multi-DC environment, we usually don't write to data storage on an HTTP GET, since only POST, PUT, and DELETE requests are routed to the primary data centre. In this case, it can mean writing to data storage in a secondary data centre.
4 | As a Systems Administrator, I want to configure Echo to write its notification timestamps to the storage engine of my choice, so that I can architect my storage systems without changing the extension's code. | Must have | Echo currently writes to MainStash without any override capability. We need a way to configure Echo to use a different storage server.
5 | As an Infrequent User, I want to see unread notifications even if I haven't logged in for a long time, so that I am not surprised when I look at my notifications list and see notifications marked as "read" that I have not read. | Must have | This is the user story for migrating data from the Redis server to the Kask server.
6 | As a Systems Administrator, I want to decommission the Redis server once no software component uses it, because it is not well suited to our multi-DC configuration. | Must have | This is for shifting out of the migration period into the final configuration.
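
The read and write functions in user stories 1 to 3 can be sketched roughly as follows. This is an illustration only, not Echo's actual code: the InMemoryStore class, the key layout in seen_key, and the function names are all hypothetical, and treating a missing timestamp as "everything is unseen" is an assumption, since that case is still an open question for us.

```python
from datetime import datetime, timezone


class InMemoryStore:
    """Stand-in for whatever key-value store Echo is configured to use."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def seen_key(user_id, section):
    # Hypothetical key layout; the real keys used by Echo may differ.
    return f"echo:seen:{section}:{user_id}"


def count_unseen(store, user_id, section, notification_times):
    """Story 1: count notifications newer than the last-read timestamp."""
    last_read = store.get(seen_key(user_id, section))
    if last_read is None:
        # Assumption: with no stored timestamp, everything counts as unseen.
        return len(notification_times)
    return sum(1 for t in notification_times if t > last_read)


def mark_seen(store, user_id, section, now=None):
    """Stories 2 and 3: reset the last-read timestamp to now."""
    now = now or datetime.now(timezone.utc)
    store.set(seen_key(user_id, section), now)
    return now
```

In this sketch the popup POST and the Special:Notifications GET would both end up in mark_seen; the difference between them is only which data centre handles the request.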

Engineering tasks

This is informational, based on Evan's understanding of what needs to be done.

  • Stand up a new Kask server (user stories 1, 2, 3)
  • Change Echo so that it can use a configured object store, with MainStash as a fallback (user story 4)
  • Configure WMF MediaWiki servers to use MultiWriteBagOStuff with Kask as the primary store and Redis as a fallback, so that timestamps gradually migrate from Redis to Kask (user story 5)
  • Write and run a maintenance script to copy all or some Echo notification timestamps from Redis to Kask (user story 5)
  • Configure WMF MediaWiki servers to use RESTBagOStuff only, without the Redis fallback (user story 6)
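
To make the configuration tasks above more concrete, here is a toy model of a configurable store with a fallback default, plus a tiered store that writes to every tier and reads from the first tier that has a value. It is a sketch of the idea, not the MediaWiki classes themselves: DictStore, MultiWriteStore, select_store, and the "seen_time_store" config key are hypothetical names, and the back-fill on read is an assumption we should verify against how MultiWriteBagOStuff behaves in our configuration.

```python
class DictStore:
    """Minimal key-value store standing in for Kask, Redis, or MainStash."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class MultiWriteStore:
    """Toy tiered store: the first tier is primary, later tiers are fallbacks."""

    def __init__(self, tiers):
        self.tiers = list(tiers)

    def set(self, key, value):
        # Writes go to every tier, so new timestamps land in both stores.
        for tier in self.tiers:
            tier.set(key, value)

    def get(self, key):
        # Reads try tiers in order. A hit in a lower tier is copied upward
        # (assumed behaviour), which is what makes the migration gradual.
        for index, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                for upper in self.tiers[:index]:
                    upper.set(key, value)
                return value
        return None


def select_store(config, stores, fallback="main-stash"):
    """Pick the configured store, or the fallback when nothing is configured."""
    return stores[config.get("seen_time_store", fallback)]
```

With tiers ordered Kask first and Redis second, a timestamp that only exists in Redis reaches Kask the first time it is read. Timestamps that are never read during the migration period stay in Redis only, which is the gap the maintenance script has to cover.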

The maintenance script is tricky. There are tens of millions of timestamps in Redis, so it will take a long time to run, even though it only needs to run once. Without it, any timestamp older than the start of the multi-write configuration will be lost when we go to the Kask-only configuration. It's important for us to figure out how far back we need to go, and what happens if there's no data for a user who hasn't been back to Wikipedia in a year or two.
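
One possible shape for the copy step is sketched below, under several assumptions: timestamps are comparable values (integers here), the source can be iterated as (key, timestamp) pairs, and the destination behaves like a dictionary. The names copy_timestamps, batched, cutoff, and batch_size are hypothetical. The cutoff parameter stands for "how far back we need to go", which is still undecided.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items so the copy can run in chunks."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def copy_timestamps(source_items, dest, cutoff=None, batch_size=1000):
    """Copy (key, timestamp) pairs into dest; return (copied, skipped)."""
    copied = skipped = 0
    for batch in batched(source_items, batch_size):
        for key, timestamp in batch:
            # Skip timestamps older than the cutoff, if one is set.
            if cutoff is not None and timestamp < cutoff:
                skipped += 1
                continue
            # Never overwrite a value written during the multi-write
            # period that is at least as new as the one being copied.
            existing = dest.get(key)
            if existing is not None and existing >= timestamp:
                skipped += 1
                continue
            dest[key] = timestamp
            copied += 1
        # A real script would likely pause or wait for replication here.
    return copied, skipped
```

Skipping keys that already hold a newer value makes the script safe to re-run if it is interrupted partway through the tens of millions of rows.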