Wikimedia Technology/Annual Plans/FY2019/TEC2: Modern Event Platform
TEC2: Modern Event Platform
This program focuses on building an event data platform that both data-driven production features and analytics can use.
Our goal is to lower the difficulty in building interoperable systems for both production and analytics purposes, by encouraging event-first oriented services. We will build the backend systems and conventions that support this architecture.
EventLogging is homegrown and was not designed for purposes other than low-volume analytics in MySQL databases. However, the ideas it was based on are solid and have converged with what has become an industry standard, often called a Stream Data Platform. Over the last two years, we have been developing the EventBus sub-system with the aim of standardizing events, both for internal use (propagating changes so that dependent artifacts are updated) and for exposure to clients. While this has been a success, integrating these events with different systems requires a lot of custom, cumbersome glue code. Open source technologies for integrating and processing streams of events already exist. This program is about modernising our event production and collection systems with strong open source technologies and best practices.
We often have use cases that depend on the same 'event' happening. For example, a cache purge needs to know that a page was edited, and in the analytics arena we track the same events for various purposes. With a comprehensive event-based architecture, all consumers, analytics or otherwise, can share streams of events and each act on them as needed. This is somewhat possible with our current system, but it is cumbersome, static, and coarse-grained. We need a more robust solution that allows us to build services that can both consume and produce standardised data in a predictable fashion. We are already moving towards this type of system design with EventBus and Change Prop, but slowly and without a larger vision. To do this right, we need a standardised, organization-wide way of producing, transforming, and consuming events. This will make it easier to share data for production features and to integrate analytics systems for querying and dashboarding.
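The idea of several consumers sharing one stream can be illustrated with a minimal in-memory sketch. It is not the platform's design: `EventStreamHub`, the `page-edit` stream name, and the event fields are hypothetical names chosen for illustration, and a real deployment would sit on a durable event log rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class EventStreamHub:
    """Toy in-memory stand-in for a shared event platform (hypothetical)."""

    def __init__(self) -> None:
        # stream name -> handlers subscribed to that stream
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, stream: str, handler: Handler) -> None:
        self._subscribers[stream].append(handler)

    def produce(self, stream: str, event: Event) -> None:
        # Every subscriber sees the same event; the producer does not
        # need to know who is listening.
        for handler in self._subscribers.get(stream, []):
            handler(event)


# Two independent consumers of the same stream.
purged_pages: List[str] = []
edit_counts: Dict[str, int] = defaultdict(int)


def purge_cache(event: Event) -> None:
    # Production consumer: record which page would be purged.
    purged_pages.append(event["page_title"])


def count_edit(event: Event) -> None:
    # Analytics consumer: count edits per wiki.
    edit_counts[event["wiki"]] += 1


hub = EventStreamHub()
hub.subscribe("page-edit", purge_cache)
hub.subscribe("page-edit", count_edit)

# A single produced event reaches both consumers.
hub.produce("page-edit", {"wiki": "enwiki", "page_title": "Example"})
```

The producer emits the edit event once; adding a third consumer later would require no change to the producer, which is the property the paragraph above argues for.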
Engineering teams should be able to quickly develop features that are easy to instrument and measure, and that can react to events from other systems.
Additionally, experience in our existing infrastructure informs the need for ordering and de-duplication of interdependent event-driven tasks, and so as part of this work we plan to explore an implementation of fine-grained dependency tracking. For example, when we utilize an event stream to purge entities from caches, we often issue many unnecessary purges out of an abundance of caution, since information about the relationships of entities is unknown. A dependency tracking system will allow these relationships to be known and up to date. Event Data Platform components can support a dependency tracking service that can intelligently update dependencies in real time. As such, the choice of a stream processing system is tightly coupled with a dependency tracking solution. As part of this program, we will collaboratively research and choose a stream processing system that enables complex dependency tracking, as well as make architecture decisions about how to eventually build a dependency tracking system.
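To make the purge example concrete, here is a rough sketch of how known entity relationships could limit and de-duplicate purges. It assumes a simple in-memory graph; `DependencyTracker` and the page and template names are hypothetical, and the program itself leaves the actual technology choice open.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set


class DependencyTracker:
    """Toy dependency graph: which entities must be refreshed when one changes."""

    def __init__(self) -> None:
        # dependency -> set of entities that directly depend on it
        self._dependents: Dict[str, Set[str]] = defaultdict(set)

    def add_dependency(self, dependent: str, dependency: str) -> None:
        self._dependents[dependency].add(dependent)

    def remove_dependency(self, dependent: str, dependency: str) -> None:
        self._dependents[dependency].discard(dependent)

    def affected_by(self, changed: str) -> List[str]:
        """Return the changed entity and everything downstream of it,
        each exactly once, in breadth-first order."""
        seen = {changed}
        order = [changed]
        queue = deque([changed])
        while queue:
            current = queue.popleft()
            # sorted() only makes the output deterministic for the example
            for dependent in sorted(self._dependents.get(current, ())):
                if dependent not in seen:
                    seen.add(dependent)
                    order.append(dependent)
                    queue.append(dependent)
        return order


tracker = DependencyTracker()
tracker.add_dependency("Page:A", "Template:Infobox")
tracker.add_dependency("Page:B", "Template:Infobox")
tracker.add_dependency("Page:B", "Template:Citation")
tracker.add_dependency("Page:C", "Page:A")
tracker.add_dependency("Page:C", "Page:B")

# Page:C is reachable through both Page:A and Page:B but is listed once.
to_purge = tracker.affected_by("Template:Infobox")
```

With the relationships known, an edit to `Template:Citation` yields purges only for that template and `Page:B`, instead of a broad purge issued out of caution. Keeping such a graph up to date in real time is the part that would depend on the stream processing system.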
Program outline
Teams contributing to the program
Site Reliability Engineering, Analytics, Services
Annual Plan priorities
Primary Goal: 3. Knowledge as a Service - evolve our systems and structures
How does your program affect annual plan priority?
By building a comprehensive event data platform, we will reduce the friction involved in building analytics and production services that need to reliably share data with current and future systems.
Program Goal
A modern event data platform will make it easier for engineers to build infrastructure for Knowledge as a Service. It will enable measuring the effectiveness of engineering projects, and also provide a base for smart reactive services, such as dependency tracking.
Outcomes
Outcome 1 |
---|
Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production. |
Output 1.1 |
Output 1.2 |
Output 1.3 |
Output 1.4 |
Outcome 2 |
---|
Stream processing system with dependency tracking system conceptual design. |
Output 2.1 |
Output 2.2 |
Output 2.3 |
Output 2.4 |
Targets
Outcome 1 Measurement
The Analytics team has deployed at least one event-based service or automated dashboard using the Event Data Platform. WMF engineers are satisfied with the Event Data Platform and willing to use it to build services.
Outcome 2 Measurement
Consensus on stream processing system choice and preliminary dependency tracking technologies.
Resources
People | FY2017–18 | FY2018–19 |
---|---|---|
Analytics | | |
Services | | |
SRE | | |
CapEx | | |
Travel and Other | | |
Dependencies
- Output 1.3 requires upfront input from many Audiences and Technology engineers, to ensure confidence in system architecture choices.
- Output 2.2, including its stretch goal, requires that Streamlined Service Delivery (Kubernetes) succeeds and is usable.
Other Docs
See also