Extension:External Data/Throttling data retrievals
Retrieval of web data and program data can be throttled per data source; that is, a minimum delay between calls to the same web service or program can be enforced. If a throttled data source is accessed before the specified delay has passed, a stale cached value is returned if a cache is in use; otherwise, a message informing users that calls are throttled is shown, and a job that actually fetches the data is scheduled.
For web sites, SOAP services and server-side programs, throttling is configured by the settings throttle key and throttle interval within $wgExternalDataSources:
- throttle key is a template string from which the throttle key is generated. Wildcards within the string, such as $host$, $url$ or $2nd_lvl_domain$ (for web services) or $param$ (for parameters to programs), will be replaced with their corresponding values. The default value is $2nd_lvl_domain$, meaning that, for example, any call to any page in Wikipedia in any language will have the throttle key wikipedia.org.
- throttle interval is a float holding the minimal interval, in seconds, between calls to web services or server-side programs with the same throttle key.
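As a rough sketch, the two settings might look like this in LocalSettings.php. It assumes the usual $wgExternalDataSources['data source']['setting'] layout; the 2.5-second interval is an arbitrary illustrative value, not a recommendation.

```php
// Group all calls to any Wikipedia page, in any language, under one throttle key.
// Single quotes are used so that PHP does not try to interpolate the $...$ wildcards.
$wgExternalDataSources['wikipedia.org']['throttle key'] = '$2nd_lvl_domain$';

// Require at least 2.5 seconds between calls that share that key (illustrative value).
$wgExternalDataSources['wikipedia.org']['throttle interval'] = 2.5;
```

Writing the key in double quotes would be risky: PHP would treat a wildcard such as $host$ as a reference to a variable named $host.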
As with other settings, these are per data source, which can be:
- The full URL,
- host,
- second-level domain,
- '*' as the default fallback for any site.
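The scopes listed above could be combined roughly as follows. This is only a sketch: api.example.org and the URL are hypothetical, the intervals are arbitrary, and it assumes that a more specific data source takes precedence over '*', as the word "fallback" suggests.

```php
// Default fallback: throttle every site per second-level domain.
$wgExternalDataSources['*']['throttle key'] = '$2nd_lvl_domain$';
$wgExternalDataSources['*']['throttle interval'] = 1.0;

// Host-level setting (hypothetical host): one key per host, longer interval.
$wgExternalDataSources['api.example.org']['throttle key'] = '$host$';
$wgExternalDataSources['api.example.org']['throttle interval'] = 10.0;

// Full-URL setting (hypothetical URL): throttle this one endpoint on its own key.
$wgExternalDataSources['https://api.example.org/heavy-report']['throttle key'] = '$url$';
$wgExternalDataSources['https://api.example.org/heavy-report']['throttle interval'] = 60.0;
```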
Throttling makes sense when numerous parser function calls (e.g., caused by a template embedded many times) address the same external service or program, and that program either requires substantial computational resources or is, effectively, itself a call to an external service, like youtube-dl.
If there is no throttle key, or the throttle interval is zero or not set, there will be no throttling. This is the default setting.