Core Platform Team/Initiatives/API Gateway

Initiative Vision

Vision:
  • Easier-to-use APIs, accessed and managed in a consistent way
  • Easier-to-create APIs, including support for cross-property and non-core APIs
  • Improved API stability, better able to withstand attacks and high traffic levels
Target Group(s):
  • Traffic
  • Product Infrastructure
  • Analytics
  • Internal developers
  • External developers
  • Search
Needs:
  • our existing APIs require immersion in Wikimedia culture and use techniques unfamiliar to modern client developers
  • no satisfactory way to create cross-project or non-core APIs
  • no comprehensive way to protect our infrastructure from excessive API traffic
Product:
  • Single API domain at api.wikimedia.org, with associated documentation
  • Ability to publish APIs regardless of implementation technology
  • Ability to limit rates across APIs, leveraging appropriate monitoring and authentication technologies
Aligned Goals:
  • Knowledge as a service
  • Increase ease and speed of feature development
  • Increase the stability of the platform

Initiative Description

Summary

This project will curate a selection of APIs and present them in a favoured service to make it easier for client developers to work with our systems.

Significance and Motivation

We have many different API endpoints and modes of access. We'd like to establish a best-practice, recommended API framework that works the way we want APIs to work in the future.

Outcomes

The outcome of this project will be a single virtual API service that developers can use for all their Wikimedia project needs.

Baseline Metrics
  • many APIs
  • different calling paradigms (RPC, REST, query, ...)
  • document-oriented datatypes
  • User-Agent for client IDs
  • Sessions for user IDs
Target Metrics
  • Increased OAuth 2.0 key registration by X%
  • Reduced 503 errors (service not available) by X%
  • More evenly distributed API traffic
Stakeholders
  • Partnerships
  • Developer Advocacy
  • Product Infrastructure
  • SRE
Known Dependencies/Blockers
  • Epic 1 of the Core REST API, to have a basic document-level REST API available for all projects
  • OAuth 2.0 with client IDs (API keys)
  • availability of api.wiki[mp]edia.org
  • developer portal

Epics, User Stories, and Requirements

Personas

This epic has the following personas:

  • Client Developer - A developer working with Wikimedia project data and content. Could be internal or external.
  • Administrator - A system administrator working for Wikimedia Foundation and implementing Foundation policies and priorities
  • API Developer - An internal developer creating APIs for internal or external use.
  • API Curator - A person who curates APIs for public use, such as the Product Manager for APIs (currently Evan).

Epic 1: API domain

At the end of this epic, we will have a single preferred API domain to point developers to. Note that user story IDs are not consecutive, since this epic was adapted from an earlier initiative.

User stories
ID | Title | Description | Priority | Notes
1 | API domain | As a Client Developer, I want to have a single root domain for API calls, so that I don't have to manage different roots in my code. | Must have | We'd like to get api.wiki[mp]edia.org. Should we have wikimedia.org or wikipedia.org? Probably the first for now, and the second when the movement brand changes.
9 | Mounting an API | As an API Developer, I want to mount an internal network URL as a path in the API gateway tree, so that Client Developers can use my API. | Must have | For example, a Kubernetes-hosted service becomes available at api.wikimedia.org/example/v2/. This should be easy, but does it need to be automated? I think not for an MVP. Do we need different mechanisms for mounting APIs that are for limited or internal use (like Citoid) versus those that are for public use?
16 | Unmount an API | As an Administrator, I want to remove an API from the tree, so that unsupported or deprecated APIs are no longer available through the gateway. | Must have | This is just the inverse of 9, "Mounting an API". Solutions where we can't remove APIs for whatever reason don't meet our needs.
17 | Curate APIs | As an API Curator, I want to choose which APIs are available through the gateway, so more preferred APIs are highlighted and less preferred APIs are backgrounded. | Must have | Just adding this to point out that we're specifically selecting APIs for the gateway. We don't need to remap every URL on every project server.
4 | RESTful CMS API | As a Client Developer, I want a RESTful CMS-level API available for all projects and languages, so I can provide CMS-style functionality to my end user. | Must have | Example: api.wikimedia.org/core/v1/travel/fr/page/Maldives. This is the MediaWiki REST API. We should launch with one API, and this one is relatively new and preferred, so it makes a good candidate.
15 | Feed API | As a Client Developer, I want a RESTful API with information on timely or featured articles, so I can show my users new time-sensitive Wikimedia project data. | Must have | After talking with PI, they think Wikifeeds makes a good candidate for the public API. I do, too. So, this is the user story for "Mount Wikifeeds in the tree."
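
User stories 9 and 16 (mounting and unmounting an API) can be sketched as a small prefix-routing table. This is only an illustration of the idea, not a proposed implementation: the class name `GatewayRoutes`, its methods, and the backend URL `http://example.svc.internal:8080` are all hypothetical, and a real gateway would more likely express this in its own routing configuration.

```python
from typing import Dict, Optional


class GatewayRoutes:
    """Hypothetical mount table: public path prefix -> internal backend URL."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    @staticmethod
    def _norm(prefix: str) -> str:
        # Treat "/example/v2" and "/example/v2/" as the same mount point.
        return "/" + prefix.strip("/") + "/"

    def mount(self, prefix: str, backend: str) -> None:
        # Story 9: expose an internal URL under a path in the gateway tree.
        self._mounts[self._norm(prefix)] = backend.rstrip("/")

    def unmount(self, prefix: str) -> None:
        # Story 16: removing a mount must always be possible.
        self._mounts.pop(self._norm(prefix), None)

    def resolve(self, path: str) -> Optional[str]:
        # Longest matching prefix wins; None means the gateway does not serve it.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path.startswith(prefix):
                return self._mounts[prefix] + "/" + path[len(prefix):]
        return None


routes = GatewayRoutes()
routes.mount("/example/v2", "http://example.svc.internal:8080")
print(routes.resolve("/example/v2/items/1"))
routes.unmount("/example/v2")
print(routes.resolve("/example/v2/items/1"))
```

Because only explicitly mounted prefixes resolve, the table also reflects story 17: anything the API Curator has not chosen to mount is simply not reachable through the gateway.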

Epic 2: API keys

At the end of this epic, we will have API keys that client developers can use to access Wikimedia APIs.

ID | Title | Description | Priority | Notes
5 | OAuth 2.0 API key | As a Client Developer, I want to have a single API key that I use across all Wikimedia projects and language versions, so I have one relationship to manage with the organization. | Must have |
10 | Universal OAuth 2.0 | As an API Developer, I want to support OAuth 2.0 authentication regardless of whether my API is implemented within MediaWiki or in a microservice, so that I can use the best tool for my job. | Optional | Optional, since the first API mounted (MW REST) is built inside MediaWiki and has direct access to OAuth through session management. It's likely that microservice-based APIs will be coming soon. At the very least, we should not prevent this user story from coming later. MediaWiki OAuth 2.0 access tokens are JWTs, so it should be possible for microservices to validate tokens directly by checking the signature.

Epic 3: Rate limits

At the end of this epic, we can limit the rates of API calls by client developers.

ID | Title | Description | Priority | Notes
6 | Rate Limit, Admin | As an Administrator, I want to define a maximum number of API calls that can be made with a given API key during a particular period of time, so I can plan for and manage API traffic usage. | Must have | Same as 7 (?), but from the administrator's point of view. Note that this only covers API calls that come through the API gateway, and not all API traffic (yet). A rate limit is defined as a number of API calls per time period; these are not yet fixed, and will probably be adjusted over time, so they should be variable. For estimation, likely to be O(10^4)/hour to start off. All API calls count the same. Just one hard limit (no soft limit). Limit is by key, not by developer account (developers can have multiple keys).
7 | Rate Limit, Client | As a Client Developer, I want to have an explicit pool of API calls I can make, so I can plan on more reliable API support. | Must have | This is just 6, but from the developer's POV.
8 | Rate Limit Class | As an Administrator, I want to define a class of Client Developers and assign them a rate limit, so I can have different limits for different classes. | Optional | This makes managing rate limits much, much easier, but it's not crucial for this product.
11 | Universal Rate Limit | As an API Developer, I want to support a global rate limit regardless of whether my API is implemented within MediaWiki or in a microservice, so that I can use the best tool for my job. | Optional | Optional, since the first API mounted (MW REST) is built inside MediaWiki. It's likely that microservice-based APIs will be coming soon. At the very least, we should not prevent this user story from coming later. Note that this doesn't have to be transparent to the API Developer; it could be handled in the service (or in the service-runner framework).
12 | API usage data | As a Client Developer, I want to know how many API calls I made during a particular period, so I can plan for the future. | Must have | We've got a visualization in the Developer Portal prototype. The Dev Portal is implemented with MediaWiki within our network, and could have access to a global data store. Doesn't need to be exact logs, just the number of calls in the time period. Currently the period (hours? days?) and the number of calls per period (50? 500,000?) are unspecified. Reasonable to only keep these logs for 90 days, like other private (?) data.
13 | Anonymous rate limit | As an Administrator, I want to limit the number of API calls that can be made without an OAuth 2.0 API key during a particular period of time, so I can plan for and manage API traffic usage. | Must have | This is the "global pool" of API calls. No fixed value yet, but 100 x the per-key limit is good for an estimate.
18 | Informative errors | As a Client Developer, if I go over my API rate limit, I want an informative error response that tells me why my request failed and what my rate limit is, so I can correct my program's behaviour. | Must have | We should return a 429 Too Many Requests status when a developer has gone over their limit, preferably with rate-limit headers.
19 | Rate Limit Headers | As a Client Developer, I want to have informational HTTP headers in my response that say what my rate limit is and how many requests I still have available, so that I can throttle my requests to avoid going over the limit. | Optional | Different services use different proprietary headers for rate-limit information. We can use those, define our own, or use the draft RFC for rate-limit headers from the IETF.
20 | Rate limit notification | As a Client Developer, I want to receive a notification when my code goes over an API rate limit, so I can check my code and make changes. | Optional | This could be an Echo notification, an email, or something else. It's a friendly feature for services that have API rate limits.
21 | Assign Client ID to Class | As an Administrator, I want to assign a Client Developer to a rate limit class, so that the developer has the rate limit of that class. | Optional | User story 8 implies that it should be possible to assign client IDs to a class; this makes it explicit. I think this would be 1:many; a client ID would be a member of exactly one class. I don't believe it makes sense to be a member of more than one class (which limit would you use?) or zero classes.
22 | Default Rate Limit Class | As an Administrator, I want to define a default rate limit class for new Client Developers, so that every Client Developer has a rate limit class. | Optional | I think this is required to keep every client ID in exactly one rate limit class.
23 | Burst Rate Limit | As an Administrator, I want to define a maximum number of API calls that can be made with a given API key during a small period of time, to control for bursts of high usage. | Optional | Typically our rate limits will be by hour or day, like "50,000 API calls/day". To protect our infrastructure, we'd like to prevent high bursts of activity, like making all 50K API calls in the first second of the day! This user story is for adding a second, burst rate limit. It would be an acceptable default to have this be a fixed multiple of the average rate, such as 100x. So if the rate limit is 50K calls/day, the default burst rate limit would be 100 * (50K/86400) ≈ 58, or roughly 60 API calls/second. Different developer classes will have different burst limits.
24 | Connection Limit | As an Administrator, I want to define a maximum number of Web connections per running instance of an app, to control for excessive connections. | Optional | I'm thinking of a limit of N connections per IP address/client ID pair, where N is somewhere between 1 and 5. So as an end user I wouldn't see weird behaviour if I was running two apps on the same device concurrently, both of which are keeping their total number of connections under N. I believe the recommended N for Web browsers is 4...? Different developer classes will have different connection limits.
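
To make stories 6, 18, 19, 21, 22 and 23 concrete, here is a minimal in-memory sketch of a per-key hard limit with a default class and a derived burst limit. Everything named here is an assumption for illustration: the classes `LimitClass` and `RateLimiter`, the fixed-window counting strategy, and the header names (loosely modelled on early versions of the IETF rate-limit header draft, which this document has not yet chosen). A production limiter would need a shared data store and would expire old counters.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Tuple

BURST_MULTIPLE = 100  # story 23 default: burst limit = 100x the average rate


@dataclass
class LimitClass:
    calls_per_period: int
    period_seconds: int

    @property
    def burst_per_second(self) -> int:
        # e.g. 50,000 calls/day -> 100 * (50000 / 86400) ~= 58 calls/second
        return max(1, round(BURST_MULTIPLE * self.calls_per_period / self.period_seconds))


@dataclass
class RateLimiter:
    classes: Dict[str, LimitClass]
    default_class: str                                       # story 22
    key_class: Dict[str, str] = field(default_factory=dict)  # story 21
    _counts: Dict[Tuple[str, str, int], int] = field(default_factory=dict)

    def check(self, api_key: str, now: float) -> Tuple[int, Dict[str, str]]:
        """Count one call; return (HTTP status, rate-limit headers)."""
        cls = self.classes[self.key_class.get(api_key, self.default_class)]
        period_idx = int(now // cls.period_seconds)
        period_win = (api_key, "period", period_idx)
        burst_win = (api_key, "burst", int(now))  # one-second burst window
        used = self._counts.get(period_win, 0)
        burst_used = self._counts.get(burst_win, 0)
        headers = {"RateLimit-Limit": str(cls.calls_per_period)}
        over_period = used >= cls.calls_per_period
        if over_period or burst_used >= cls.burst_per_second:
            # Story 18: reject with 429 and say when to retry.
            period_end = (period_idx + 1) * cls.period_seconds
            retry = math.ceil(period_end - now) if over_period else 1
            headers["RateLimit-Remaining"] = str(max(0, cls.calls_per_period - used))
            headers["Retry-After"] = str(retry)
            return 429, headers
        # Only count the call once both limits have been checked.
        self._counts[period_win] = used + 1
        self._counts[burst_win] = burst_used + 1
        headers["RateLimit-Remaining"] = str(cls.calls_per_period - used - 1)
        return 200, headers


limiter = RateLimiter(classes={"default": LimitClass(50_000, 86_400)}, default_class="default")
print(limiter.check("key-abc", now=0.0))
```

The anonymous "global pool" of story 13 could reuse the same mechanism by treating all unauthenticated traffic as one key assigned to a class with a larger limit.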

Epic 4: API Portal

ID | Title | Description | Priority | Notes
1 | Create developer account | As a Client Developer, I want to create a developer account, so that I have an identity as a developer. | Must have | This probably means mapping a developer account onto a single user login (SUL) account. That's how the current OAuth extension works. Some developer tools have ways to connect a developer account to an organisation or group (AWS, GitHub). We're not going to do that initially.
2 | Developer account login | As a Client Developer, I want to log in to my account, so that I can use the developer UI with my own authorization. | Must have |
3 | List all developer clients | As a Client Developer, I want to see a list of all my Clients, so I can know which clients I am responsible for and can manage. | Must have | Not sure whether we should show the client secret also, or if it should be view-once on client creation.
4 | Create a client | As a Client Developer, I want to create a new Client, so I can use that identity for a new program. | Must have | This should be as simple as possible. Only OAuth 2.0 clients. The scopes should be simplified; probably bundled into "identity", "read", "write", "admin". Open question whether the client secret should be "view once" or not.
5 | Disable a client | As a Client Developer, I want to disable an existing Client, so that I'm no longer responsible for it. | Must have | Client Developers should disable keys they no longer use. We can call this "deleting" a client ID if we want. Depending on privacy requirements, we can even delete the client ID in the DB.
6 | Create a developer token | As a Client Developer, I want to create an access token for a Client ID I own for my own account without going through the authorization flow, so I can use it for debugging and testing. | Optional | This is a helpful feature that some developer platforms provide to help with testing, development and debugging. Some call it a "personal access token". The token should be view-once and copyable. There shouldn't be more than one developer access token per client.
7 | Delete a developer token | As a Client Developer, I want to delete an access token for a Client I own for my own account, so that I'm no longer responsible for it. | Optional |
8 | Replace a developer token | As a Client Developer, I want to replace an access token for a Client I own for my own account, so that if I lose or forget a token I can get a new one. | Optional | If it's view-once, we need a way to generate a new one.
9 | Request higher access level | As a Client Developer, I want to request a higher access level for one of my Clients, so that I can get a higher API rate limit or different scope permissions. | Optional | What the access levels are, and what they affect, TBD.
10 | Read notifications | As a Client Developer, I want to read any notifications for my account, so that I'm informed about how my client is being used or if it is causing problems. | Must have | The Echo notification interface is probably fine for this.
11 | View API usage graph | As a Client Developer, I want to see my API usage in graph form, so that I know how many API requests my clients are making. | Must have | This needs to integrate with the rate limiting back end; see Epic 3, user story 12.
12 | View API usage table | As a Client Developer, I want to see my API usage in table form, so that I know how many API requests my clients are making. | Must have | This needs to integrate with the rate limiting back end; see Epic 3, user story 12.
13 | Developer account logout | As a Client Developer, I want to log out of my account, so that the device I use can't be used by someone else to change my account. | Must have |
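
Stories 3 to 5 leave open whether the client secret is "view once". Assuming the view-once option is chosen, one way it could work is sketched below: the plaintext secret is returned only at creation and only a digest is stored. The `ClientRegistry` class, its method names, and the use of SHA-256 over a randomly generated secret are illustrative assumptions, not a description of the existing OAuth extension.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


def _digest(secret: str) -> str:
    # The secret is long and random, so a plain SHA-256 digest is used here.
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


@dataclass
class Client:
    client_id: str
    owner: str
    secret_hash: str
    enabled: bool = True


@dataclass
class ClientRegistry:
    _clients: Dict[str, Client] = field(default_factory=dict)

    def create(self, owner: str) -> Tuple[str, str]:
        # Story 4: this is the only moment the plaintext secret is available.
        client_id = secrets.token_hex(16)
        secret = secrets.token_urlsafe(32)
        self._clients[client_id] = Client(client_id, owner, _digest(secret))
        return client_id, secret

    def list_for(self, owner: str) -> List[str]:
        # Story 3: list client IDs only; the secret cannot be shown again.
        return [c.client_id for c in self._clients.values() if c.owner == owner and c.enabled]

    def disable(self, owner: str, client_id: str) -> bool:
        # Story 5: only the owning developer may disable a client.
        client = self._clients.get(client_id)
        if client is None or client.owner != owner:
            return False
        client.enabled = False
        return True

    def authenticate(self, client_id: str, secret: str) -> bool:
        client = self._clients.get(client_id)
        if client is None or not client.enabled:
            return False
        return hmac.compare_digest(client.secret_hash, _digest(secret))


registry = ClientRegistry()
client_id, secret = registry.create("alice")
print(registry.authenticate(client_id, secret))
```

With this approach a lost secret cannot be recovered, which is why a "replace" operation like story 8 becomes necessary once anything is view-once.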

Open Questions

Open questions
Question | Reason we're asking | Responsible for the answer
What rate limit policies do other major API providers have? | To figure out and justify our rate limit policies | Evan
What formal or informal API rate limit policies does WMF have in place already? | To make sure new rate limit policies don't conflict too much | Evan
What initial rate limit classes should we have, and what should their rate limits be? | To help with estimation and architecture of the solution | Evan
Do we need soft limits as well as hard limits? | Validate the assumption that we only need hard limits | Evan
Is it OK to weight every API call equally? | Balance simplicity for clients with resource management | Evan
Is it OK to have a global rate limit, or do we need different limits for different APIs or endpoints? | Balance simplicity for clients with resource management | Evan
Do we limit API calls by API key, or by developer account? | Validate requirements | Evan
How do we differentiate between APIs intended for public use versus those for private use? | Validate requirements | Evan
Do we use api.wikimedia.org, api.wikipedia.org, or something else? | Validate requirements | Evan
How long can and should we keep historical API usage data? | Scaling for data storage | Evan
What should the anonymous rate limit be? | Validate requirements | Evan
Can we do a first release without user story #14? | Simplify implementation and launch policy | Evan
Is the feeds API a good candidate for the initial release? | Validate user story | Evan
What are our criteria for choosing APIs to route through the gateway? | Validate whether API-related user stories meet our criteria | Evan
Which rate-limit header format should we use? | Clearer interface definition | Evan, Bill
What problems are people trying to solve? | | Eric, Bill
What software exists? | | Eric, Bill
What are the pros and cons? | | Eric, Bill
What are people doing at our scale? | | Eric, Bill
What are the lessons learned? | | Eric, Bill

Documentation Links

Phabricator

API Gateway

Plans/RFCs

None given

Other Documents


See also