Product Analytics/Event Platform recommendations

Schema Organization

In the legacy EventLogging system, schemas were organized in a flat structure – all within the Schema namespace on Meta wiki. The only way to "organize" schemas was to follow a naming convention, such as the shared prefix of the MobileWikiApp* schemas used by the Android and iOS teams.

In the modern Event Platform system, schemas exist as actual files and directories in a Git repository on Gerrit. This allows schemas to be organized hierarchically. That is already the case for the legacy schemas migrated to the modern system, which are stored under the jsonschema/analytics/legacy/ path.

The Product Analytics team identified two issues with the old approach:

  1. It was difficult to find relevant schemas.
  2. The list of schemas was overwhelming to browse.

After some deliberation we decided on a recommendation – not a requirement – to organize schemas by which aspect of the user experience they relate to. We identified these key categories to start with:

Proposed categories for organizing future schemas under

Category      Examples
Mobile Apps   Android, iOS, and KaiOS app-specific usage tracking
Reading       Browsing/engaging with content (desktop, mobile web, mobile apps), interacting with UI in a non-contributing capacity, searching
Onboarding    Growth's experimental features/interventions
Editing       Usage of VisualEditor and WikiEditor
Moderating    Usage of Anti-Harassment Tools
Fundraising   Campaign banners and donate links clicks and impressions
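
To illustrate how these categories could translate into a directory layout, here is a minimal Python sketch. Only the jsonschema/analytics root comes from the text above; the directory slugs (such as mobile_apps), the schema_dir helper, and the example schema name are hypothetical and are not taken from the actual repository.

```python
from pathlib import PurePosixPath

# Root directory mentioned in the text above.
SCHEMA_ROOT = PurePosixPath("jsonschema/analytics")

# Assumption: each proposed category maps to a lowercase directory slug.
# These slugs are illustrative only.
CATEGORY_DIRS = {
    "Mobile Apps": "mobile_apps",
    "Reading": "reading",
    "Onboarding": "onboarding",
    "Editing": "editing",
    "Moderating": "moderating",
    "Fundraising": "fundraising",
}


def schema_dir(category: str, schema_name: str) -> PurePosixPath:
    """Return the directory a schema would live in under its category."""
    try:
        slug = CATEGORY_DIRS[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None
    return SCHEMA_ROOT / slug / schema_name


# Hypothetical schema name, used only to show the resulting path.
print(schema_dir("Editing", "example_editor_usage"))
```

Because the categories are a recommendation and not a requirement, a helper like this would only suggest a location; nothing in the platform is assumed to enforce it.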

We plan to eventually use a single generalized, reusable schema that covers 95% of use cases. This is possible because the new system lets us collect data into different tables through streams. For the remaining 5%, where a specialized schema is required, we think that grouping similar or related schemas under these categories will help keep jsonschema/analytics clean and organized.
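
The idea of one schema feeding several tables can be sketched as follows. This is a simplified model and not the Event Platform API: the Stream class, the schema and stream names, the "stream" key on each event, and the rule that a table is named after its stream are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical name for the single generalized schema.
GENERAL_SCHEMA = "analytics/general_interaction"


@dataclass(frozen=True)
class Stream:
    name: str    # hypothetical stream name
    schema: str  # schema that events in this stream conform to


# Two hypothetical streams that share the same schema.
STREAMS = [
    Stream("android.reading_list_interaction", GENERAL_SCHEMA),
    Stream("web.search_interaction", GENERAL_SCHEMA),
]


def table_for(stream: Stream) -> str:
    """Assumed naming rule: one table per stream, named after the stream."""
    return re.sub(r"[^0-9a-z]+", "_", stream.name.lower())


def route(events, streams):
    """Group events into per-stream tables, regardless of shared schema."""
    known = {s.name: s for s in streams}
    tables = {}
    for event in events:
        stream = known[event["stream"]]  # KeyError for undeclared streams
        tables.setdefault(table_for(stream), []).append(event)
    return tables


events = [
    {"stream": "android.reading_list_interaction", "action": "add"},
    {"stream": "web.search_interaction", "action": "submit"},
    {"stream": "web.search_interaction", "action": "click"},
]
for table, rows in route(events, STREAMS).items():
    print(table, len(rows))
```

In this model both streams validate against the same schema, yet their events end up in separate tables, which is why a specialized schema would be needed only for the minority of cases the general one cannot describe.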