Canonical data modeling
Allows content to be understood by people, programs, and machines outside the boundaries of the system
Last updated: 2022-12-16 by APaskulin (WMF)
Status: v1 published September 2021
Summary
A canonical data model is a predictably structured, technology-agnostic data structure that represents the system as a whole, rather than each component keeping its own representation of the data. Discrete pieces of information are interconnected based on the relationships between them and contextualized with metadata. This allows users and machines to consume content easily without needing to know the underlying technologies driving the system.
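As an illustration of the description above, the following is a minimal sketch of what one unit in a canonical model might look like. The names `ContentUnit`, `Relation`, and `to_canonical_json`, the identifier scheme, and the `partOf` relation kind are all hypothetical and are not taken from any published schema. The sketch only shows the general shape: a discrete piece of content, its metadata, and its typed links to other pieces, serialized to a neutral format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Relation:
    """A typed link from one content unit to another (hypothetical shape)."""
    kind: str       # illustrative vocabulary, e.g. "partOf" or "cites"
    target_id: str  # identifier of the related unit


@dataclass
class ContentUnit:
    """One discrete piece of information in the canonical model (hypothetical)."""
    id: str
    type: str
    body: str
    metadata: dict = field(default_factory=dict)    # context such as language
    relations: list = field(default_factory=list)   # list of Relation objects


def to_canonical_json(unit: ContentUnit) -> str:
    """Serialize a unit to JSON so any consumer can read it
    without knowing which component produced it."""
    return json.dumps(asdict(unit), sort_keys=True)


# Example unit: a section that belongs to an article.
section = ContentUnit(
    id="unit:section-1",
    type="section",
    body="Volcanoes form where magma reaches the surface.",
    metadata={"language": "en"},
    relations=[Relation(kind="partOf", target_id="unit:article-1")],
)

print(to_canonical_json(section))
```

Because every component would emit and accept the same shape, a consumer only needs to understand this one structure, not the storage or rendering technology behind each component.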
Related to
- Knowledge as a service: This strategic initiative transforms knowledge created as a single web page into discrete, interrelated units of predictably structured information.
- Federated API: Defines a unified, consistent response to all API queries regardless of the module or product being queried, while allowing individual subsystems to evolve and change independently.
Product benefits
- Structured content: Having an agreed-upon, standardized, technology-agnostic data structure enables the universal structuring of content across our different products.
- Interconnectedness: Each piece of structured data includes information about how it relates to other pieces (for example, by keyword or by hierarchy). This makes it possible to find and use contextual links between pieces of information when building product outputs.
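The interconnectedness benefit can be sketched with a small lookup over structured units. This is an illustrative example only: the unit fields (`id`, `keywords`, `parent`), the sample data, and the helper names are assumptions made for the sketch, not part of any existing system.

```python
from collections import defaultdict

# Hypothetical structured units: each carries keywords and a parent link.
units = [
    {"id": "a1", "keywords": {"volcano", "geology"}, "parent": None},
    {"id": "s1", "keywords": {"volcano", "eruption"}, "parent": "a1"},
    {"id": "s2", "keywords": {"history"}, "parent": "a1"},
    {"id": "a2", "keywords": {"geology"}, "parent": None},
]


def build_keyword_index(units):
    """Map each keyword to the set of unit ids that carry it."""
    index = defaultdict(set)
    for unit in units:
        for keyword in unit["keywords"]:
            index[keyword].add(unit["id"])
    return index


def related_by_keyword(unit_id, units, index):
    """Return ids of other units sharing at least one keyword with unit_id."""
    unit = next(u for u in units if u["id"] == unit_id)
    related = set()
    for keyword in unit["keywords"]:
        related |= index[keyword]
    related.discard(unit_id)  # a unit is not related to itself
    return sorted(related)


def children_of(unit_id, units):
    """Return ids of units whose parent is unit_id (hierarchy relation)."""
    return sorted(u["id"] for u in units if u["parent"] == unit_id)


index = build_keyword_index(units)
print(related_by_keyword("a1", units, index))  # ['a2', 's1']
print(children_of("a1", units))                # ['s1', 's2']
```

Because the relationships are stored with the data rather than inferred from page layout, the same two lookups would work for any product that consumes the units.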
Example product narratives
This architecture pattern enables the following product narrative examples:
- New editor experiences: Creating an article
- New editor experiences: Editing citations
- Wikipedia in the classroom