Wikimedia Product/Data dictionary/edit_hourly
This page describes the edit_hourly data set, which is stored as a Druid datasource and can be accessed via Superset or Turnilo. edit_hourly on Druid is loaded directly from wmf.edit_hourly on Hive, which in turn is derived from wmf.mediawiki_history.
Schema
Field name | Data type | Description | Data example | Source schema | Source field |
---|---|---|---|---|---|
creates_new_page | boolean | Whether the edit was the first revision of a page (page creation), i.e. revision_parent_id == 0 | TRUE/FALSE | wmf.mediawiki_history | revision_parent_id |
edit_count | bigint | Number of edits belonging to this hourly bucket (for the given dimension value set) | 2 | wmf.mediawiki_history | COUNT(*) |
interface | string | Editing interface | VisualEditor, 2017 wikitext editor, Switched from VisualEditor to wikitext editor, Other | wmf.mediawiki_history | revision_tags |
is_deleted | boolean | Whether the edit has been deleted | TRUE/FALSE | wmf.mediawiki_history | revision_is_deleted_by_page_deletion |
is_reverted | boolean | Whether the edit has been reverted | TRUE/FALSE | wmf.mediawiki_history | revision_is_identity_reverted |
namespace_is_content | boolean | Whether the namespace is of type content or not | TRUE/FALSE | wmf.mediawiki_history | page_namespace_is_content_historical |
namespace_is_talk | boolean | Whether the namespace is of type talk or not | TRUE/FALSE | wmf.mediawiki_history | page_namespace_historical |
namespace_name | string | Namespace name | Main, Talk, User, User talk, etc. | wmf.mediawiki_history | page_namespace_historical |
platform | string | Access method | iOS, Android, Mobile web, Other | wmf.mediawiki_history | revision_tags |
project | string | The project this event belongs to | ar.wikipedia | canonical_data.wikis | domain_name |
revision_tags | array<string> | Revision tags (change tags) array | ["External Link added to disambiguation page","Possible disruption","visualeditor"] | wmf.mediawiki_history | revision_tags |
text_bytes_diff | bigint | Number of bytes added minus number of bytes removed belonging to this hourly bucket (for the given dimension value set) | 2 | wmf.mediawiki_history | revision_text_bytes_diff |
user_edit_count_bucket | string | Author's edit count bucket | 1-4, 5-99, 100-999, 1000-9999, 10000+ | wmf.mediawiki_history | event_user_revision_count |
user_groups | array<string> | User groups array | ["Image-reviewer","OTRS-member","patroller","rollbacker"] | wmf.mediawiki_history | event_user_groups_historical |
user_is_administrator | boolean | Whether the user is an administrator or not, i.e. ARRAY_CONTAINS(event_user_groups_historical, 'sysop') | TRUE/FALSE | wmf.mediawiki_history | event_user_groups_historical |
user_is_anonymous | boolean | Whether the user is anonymous or not | TRUE/FALSE | wmf.mediawiki_history | event_user_is_anonymous |
user_is_bot | boolean | Whether the user is a bot or not | TRUE/FALSE | wmf.mediawiki_history | event_user_is_bot_by_historical |
user_tenure_bucket | string | Bucketed time between user creation and edit | Under 1 day, 1 to 7 days, 7 to 30 days, ..., Over 10 years, Undefined | wmf.mediawiki_history | event_user_registration_timestamp, event_user_creation_timestamp, event_user_first_edit_timestamp |
language | string | Project language | Arabic | canonical_data.wikis | language |
project_family | string | Project family name | wikipedia | canonical_data.wikis | project_family |
is_redirect_currently | boolean | Whether the page is *currently* a redirect (no historical information available) | TRUE/FALSE | wmf.mediawiki_history | page_is_redirect |
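
The derivations in the table can be illustrated with a minimal Python sketch. It is not the actual loading job: it only mimics, on in-memory records, how a few dimensions (creates_new_page, user_is_administrator, user_edit_count_bucket) might be derived and how edit_count and text_bytes_diff roll up per hourly bucket. The function names, the use of `datetime` values for an assumed `event_timestamp` field, and the handling of missing values are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def edit_count_bucket(revision_count):
    """Map event_user_revision_count to the bucket labels listed in the table."""
    # Assumption: missing or non-positive counts get no bucket; the table
    # above does not say how such rows are labelled.
    if revision_count is None or revision_count < 1:
        return None
    if revision_count < 5:
        return "1-4"
    if revision_count < 100:
        return "5-99"
    if revision_count < 1000:
        return "100-999"
    if revision_count < 10000:
        return "1000-9999"
    return "10000+"


def dimensions(rev):
    """Derive a small subset of edit_hourly dimensions from one revision record."""
    return (
        # Truncate the timestamp to the start of its hour (the hourly bucket).
        # Assumption: event_timestamp is already a datetime object here.
        rev["event_timestamp"].replace(minute=0, second=0, microsecond=0),
        # creates_new_page: revision_parent_id == 0
        rev["revision_parent_id"] == 0,
        # user_is_administrator: 'sysop' is among the historical user groups
        "sysop" in rev["event_user_groups_historical"],
        edit_count_bucket(rev["event_user_revision_count"]),
    )


def roll_up(revisions):
    """Aggregate revisions into buckets keyed by their dimension values."""
    buckets = defaultdict(lambda: {"edit_count": 0, "text_bytes_diff": 0})
    for rev in revisions:
        bucket = buckets[dimensions(rev)]
        bucket["edit_count"] += 1  # corresponds to COUNT(*)
        # Assumption: a missing byte diff is counted as 0.
        bucket["text_bytes_diff"] += rev["revision_text_bytes_diff"] or 0
    return dict(buckets)
```

Calling `roll_up` on a list of such records returns one entry per distinct combination of hour and dimension values, which matches the phrase "for the given dimension value set" in the edit_count and text_bytes_diff descriptions.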