Wikimedia Technology/Annual Plans/FY2019/TEC2: Modern Event Platform/Goals

Program Goals and Status for FY18/19

  • Goal Owner: Nuria Ruiz
  • Program Goals for FY18/19: A modern event data platform will make it easier for engineers to build infrastructure for Knowledge as a Service. It will enable measuring the effectiveness of engineering projects, and also provide a base for smart reactive services, such as dependency tracking.
  • Annual Plan: TEC2: Modern Event Platform

Outcome 1 / Output 1.1 - 1.4


Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.

Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics, Services, SRE

Goal

  • TechCom RFCs underway and technical decisions made. (more)

Status


  Note: July 2018

Discussed JSONSchema versus Avro; the decision was taken to use JSONSchema.   Done task T198256

  Note: August 2018

Discussed schema registry and metadata service   Done task T201643
Discussed scalable event intake service   Done task T201963

  Note: September 18, 2018

  Done The RFC for the schema registry is closing soon. We will leave the metadata/config service out of the MVP and work on scalable event intake as part of next quarter's goals.


Outcome 1 / Output 1.1


Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.

Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Core Platform Team, SRE

Goal

  • Development of intake service for events whose transport is JSONSchema/http task T201068   Done

Status


  Note: October 19, 2018

Planning which language/platform we will use to build the intake service   Done

  Note: November 14, 2018

Intake service prototype is being built in Node.js task T206815   Done

  Note: December 12, 2018

Code can be found here: https://github.com/wikimedia/eventgate   Done
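
The repository linked above holds the actual implementation. Purely as an illustration of what intake of JSONSchema-described events over HTTP involves, here is a minimal Python sketch. Everything in it is an assumption made for the example: the names `validation_errors` and `handle_post`, the `EXAMPLE_SCHEMA`, the status codes, and the simplified validation (only `required` and flat `type` checks) are hypothetical and do not describe EventGate's API or behavior.

```python
import json

# Hypothetical mapping from JSON Schema type names to Python types.
TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

# Hypothetical schema used only for this example.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {"id": {"type": "string"}, "count": {"type": "integer"}},
    "required": ["id"],
}


def _matches(value, type_name):
    """Check one value against one JSON Schema type name."""
    # bool is a subclass of int in Python, so exclude it from numeric types.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    expected = TYPES.get(type_name)
    return expected is None or isinstance(value, expected)


def validation_errors(event, schema):
    """Return a list of problems; an empty list means the event is valid."""
    errors = []
    for name in schema.get("required", []):
        if name not in event:
            errors.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name in event and not _matches(event[name], spec.get("type")):
            errors.append(f"wrong type for field: {name}")
    return errors


def handle_post(body, schema, produce):
    """Parse a request body, validate it, and hand valid events to produce().

    Returns an HTTP-style status code (assumed: 201 accepted, 400 rejected).
    """
    try:
        event = json.loads(body)
    except json.JSONDecodeError:
        return 400
    if not isinstance(event, dict) or validation_errors(event, schema):
        return 400
    produce(event)
    return 201
```

In this sketch `produce` stands in for whatever sends the event onward (for example, a Kafka producer); passing `some_list.append` is enough to try it out locally.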


Outcome 1 / Output 1.1


Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.

Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics Engineering, Core Platform Team, SRE, Release Engineering

Goal(s)

  • Deployment of Stream Intake Service (AKA EventGate) using TEC3: Deployment Pipeline   Done
  • MediaWiki Monolog+Kafka usage migrated to EventGate. task T214080 task T216163   In progress
  • STRETCH GOAL: Migration of some MediaWiki 'Eventbus' events to EventGate.
  • STRETCH GOAL: Decommission old 'analytics' Kafka cluster.

Status


  To do January 2019

Deployment via Docker & Kubernetes in beta and then production. task T211247   Done

  Note: February 25, 2019

Produce Monolog-based events from MediaWiki to EventGate and create Hive tables in Hadoop.   In progress

  To do March 2019 Proposed:

Get users of Monolog-based events to use the new tables.
Begin migrating some MediaWiki 'Eventbus' events to EventGate.
If possible, decommission the 'analytics' Kafka cluster.

Marked   Done on March 14, 2019:

  • Deployment of Stream Intake Service

Outcome 1 / Output 1.3


It is clear to engineers how to design event schemas to support analytics and production features to ease future maintenance and evolution of those systems.

Dependencies on: Analytics Engineering, Core Platform Team

Goal(s)

  • Backwards compatibility checks in schema repository CI task T206814   In progress

Status


  To do January 2019

JSONSchema backwards compatibility library implemented task T206889   In progress

  Note: February 25, 2019

mediawiki/event-schemas repository Jenkins CI with backwards compatibility checks   In progress

  To do March 2019

Code for this task is ready to go, but the checks are not yet enabled because only two schemas would currently benefit from them.
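
To make the idea of a backwards compatibility check concrete, here is a minimal Python sketch of the kind of comparison a CI job could run between the old and new versions of a schema. The compatibility rules used here are an assumption for illustration (no existing field may be removed or change type, and no new required field may be added); they are not taken from the library tracked in task T206889, and the function name `find_incompatibilities` and the example schemas are hypothetical.

```python
def find_incompatibilities(old, new, path=""):
    """Compare two JSONSchema dicts; return a list of problems (empty if compatible).

    Assumed rules: existing fields must stay, keep their type, and
    no new field may become required.
    """
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    for name, old_spec in old_props.items():
        field = f"{path}{name}"
        if name not in new_props:
            problems.append(f"removed field: {field}")
            continue
        new_spec = new_props[name]
        if old_spec.get("type") != new_spec.get("type"):
            problems.append(f"type changed: {field}")
        elif old_spec.get("type") == "object":
            # Recurse into nested objects with a dotted path prefix.
            problems.extend(find_incompatibilities(old_spec, new_spec, field + "."))

    added_required = set(new.get("required", [])) - set(old.get("required", []))
    for name in sorted(added_required):
        problems.append(f"new required field: {path}{name}")
    return problems


# Hypothetical schemas used only for this example.
OLD = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "meta": {"type": "object", "properties": {"dt": {"type": "string"}}},
    },
    "required": ["id"],
}
# Adding an optional field is treated as compatible.
NEW_OK = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "meta": {"type": "object", "properties": {"dt": {"type": "string"}}},
        "comment": {"type": "string"},
    },
    "required": ["id"],
}
# Changing a type, dropping a nested field, and adding a required field are not.
NEW_BAD = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "meta": {"type": "object", "properties": {}},
        "comment": {"type": "string"},
    },
    "required": ["id", "comment"],
}
```

A CI job built on this sketch would fail the build whenever `find_incompatibilities(old, new)` returns a non-empty list, printing the returned messages so the schema author can see which change broke compatibility.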


Outcome / Output


Wikimedia engineers have a reliable, scalable, and comprehensive platform for building services that produce and consume event data for analytics and production.

Events can easily and reliably be produced by internal and external clients and consumed by other internal services.

Dependencies on: Analytics Engineering, Core Platform Team, SRE, Release Engineering

Goal(s)

  • Decommission old 'analytics' Kafka cluster. task T183303   Done
  • Deploy an instance of EventGate that processes events sent to the Kafka main cluster. task T218346   Done
  • Schema Repository CI for convention and backwards compatibility enforcement

Status


  To do May 2019

  In progress To decommission the cluster we need to turn off the old Avro workflow; we hope to do that in June.

Also, the Kafka main EventGate deployment is   In progress

  To do June 2019 Some of the CI work will move to next quarter.