Wikimedia Product/Data dictionary/pageviews_hourly
This page describes the data set pageviews_hourly, which is stored as a Druid datasource and can be accessed via Superset or Turnilo. pageviews_hourly on Druid is generated by aggregating wmf.pageview_hourly on Hive by hour; wmf.pageview_hourly on Hive is in turn extracted from wmf.pageview_actor.
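The hourly aggregation described above can be illustrated with a minimal Python sketch. It is not the production job (which runs on Hive and Druid); the function name `aggregate_hourly`, the timestamp key `dt`, and the choice of three dimensions are assumptions made for illustration only.

```python
from collections import Counter
from datetime import datetime

def aggregate_hourly(rows, dimensions=("project", "agent_type", "access_method")):
    """Count rows per (hour, dimension values), mimicking count(1) grouped by hour.

    `rows` is an iterable of dicts; each dict carries a datetime under the
    hypothetical key "dt" plus one value per name in `dimensions`.
    """
    counts = Counter()
    for row in rows:
        # Truncate the timestamp to the start of its hour.
        hour = row["dt"].replace(minute=0, second=0, microsecond=0)
        key = (hour,) + tuple(row[d] for d in dimensions)
        counts[key] += 1
    # Emit one record per distinct key, with the count stored as view_count.
    return [
        dict(zip(("hour",) + tuple(dimensions), key), view_count=n)
        for key, n in sorted(counts.items())
    ]

# Illustrative input: three page views, two of them in the same hour.
sample = [
    {"dt": datetime(2020, 1, 1, 10, 5), "project": "aa.wikipedia",
     "agent_type": "user", "access_method": "desktop"},
    {"dt": datetime(2020, 1, 1, 10, 40), "project": "aa.wikipedia",
     "agent_type": "user", "access_method": "desktop"},
    {"dt": datetime(2020, 1, 1, 11, 2), "project": "aa.wikipedia",
     "agent_type": "user", "access_method": "desktop"},
]
hourly = aggregate_hourly(sample)
```

With this input, `hourly` holds two records: one for the 10:00 hour with a view_count of 2 and one for the 11:00 hour with a view_count of 1.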
Schema
Field name | data type | description | data example | source schema | source field |
---|---|---|---|---|---|
project | string | Project name, taken from the request's hostname | aa.wikipedia | wmf.pageview_actor | pageview_info['project'] |
agent_type | string | Agent accessing the pages, can be spider, user or automated (see BotDetection) | user | wmf.pageview_actor | agent_type |
ua_browser_family | string | Name of web browser (if not using an official Wikipedia mobile app), extracted from the client device's User-Agent | Firefox | wmf.pageview_actor | user_agent_map['browser_family'] |
ua_device_family | string | Client device family (e.g. brand of manufacturer, product name), extracted from the client device's User-Agent if provided | Other | wmf.pageview_actor | user_agent_map['device_family'] |
city | string | City name of the accessing agents (computed using MaxMind GeoIP database) | Apple Valley | wmf.pageview_actor | geocoded_data['city'] |
subdivision | string | Subdivision of the accessing agents (computed using MaxMind GeoIP database) | California | wmf.pageview_actor | geocoded_data['subdivision'] |
ua_wmf_app_version | string | Version of official Wikipedia mobile app (for iOS, Android, and KaiOS), extracted from the client device's User-Agent | - | wmf.pageview_actor | user_agent_map['wmf_app_version'] |
country | string | Country name (text) of the accessing agents (computed using MaxMind GeoIP database) | United States | wmf.pageview_actor | geocoded_data['country'] |
country_code | string | Country ISO code of the accessing agents (computed using MaxMind GeoIP database) | US | wmf.pageview_actor | geocoded_data['country_code'] |
ua_os_major | string | Major version of the operating system, extracted from the client device's User-Agent | 10 | wmf.pageview_actor | user_agent_map['os_major'] |
continent | string | Continent of the accessing agents (computed using MaxMind GeoIP database) | North America | wmf.pageview_actor | geocoded_data['continent'] |
ua_os_family | string | Operating System family used by the client device, extracted from the User-Agent | Mac OS X | wmf.pageview_actor | user_agent_map['os_family'] |
language_variant | string | Language variant, taken from the request's path (not set if present in project name) | default | wmf.pageview_actor | pageview_info['language_variant'] |
ua_os_minor | string | Minor version of the operating system, extracted from the client device's User-Agent | 14 | wmf.pageview_actor | user_agent_map['os_minor'] |
referer_class | string | One of: none (referer is null, empty or '-'); unknown (domain extraction failed); internal (domain is a Wikimedia project); external (search engine) (domain is one of google, yahoo, bing, yandex, baidu, duckduckgo); external (any other) | none | wmf.pageview_actor | referer_class |
zero_carrier | string | Always NULL, as the Wikipedia Zero program is over | NULL | NULL | |
access_method | string | Method used to access the pages, can be desktop, mobile web, or mobile app | desktop | wmf.pageview_actor | access_method |
ua_browser_major | string | Major version of the client browser, extracted from the client device's User-Agent | 68 | wmf.pageview_actor | user_agent_map['browser_major'] |
project_family | string | Project family | wikipedia | canonical_data.wikis | database_group |
view_count | bigint | Number of views | 1 | wmf.pageview_actor | count(1) then aggregated by hour |
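
The referer_class categories listed in the table can be read as a small decision procedure. The following Python sketch illustrates that reading only; it is not the classifier used to populate wmf.pageview_actor. The function name `classify_referer` and the `WIKIMEDIA_SUFFIXES` list are hypothetical and deliberately incomplete, and matching a search engine by hostname label is an assumption.

```python
from urllib.parse import urlparse

# Search engine names as listed in the referer_class description.
SEARCH_ENGINES = ("google", "yahoo", "bing", "yandex", "baidu", "duckduckgo")

# Assumption: a short, illustrative subset of Wikimedia project domains.
WIKIMEDIA_SUFFIXES = ("wikipedia.org", "wikimedia.org", "wiktionary.org")

def classify_referer(referer):
    """Return a referer_class-style label for a raw referer string."""
    # none: referer is null, empty or '-'
    if referer is None or referer in ("", "-"):
        return "none"
    # unknown: the domain could not be extracted
    try:
        host = urlparse(referer).hostname
    except ValueError:
        host = None
    if not host:
        return "unknown"
    # internal: the domain belongs to a Wikimedia project
    if any(host == s or host.endswith("." + s) for s in WIKIMEDIA_SUFFIXES):
        return "internal"
    # external (search engine): one hostname label names a known engine
    if any(engine in host.split(".") for engine in SEARCH_ENGINES):
        return "external (search engine)"
    return "external (any other)"
```

For example, under these assumptions `classify_referer("-")` yields `none`, while a referer on `www.google.com` falls into `external (search engine)`.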