User:Brooke Vibber/Compacting the revision table round 2
Overview
Per ongoing discussion in ArchCom and at WikiDev17 about performance, future requirements, and future-proofing for table size, it's proposed to do a major overhaul of the revision table, combining the following improvements:
- Normalization of frequently duplicated data to separate tables, reducing the dupe strings to integer keys
- Separation of content-specific from general-revision metadata to support:
  - Multi-content revisions, allowing storage of multiple content blobs per revision -- not related to compaction, but of great interest for the structured data additions planned for multimedia and articles
- General reduction in revision table width / on-disk size, which will make schema changes easier in future - trying to avoid inconsistencies in live index deployments
- Ideally all indexes should fit on all servers, making it easier to switch database backends around in production
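The first bullet is the core of the compaction: a duplicated string is stored once in a lookup table and referenced by integer key everywhere else. A minimal sketch of that pattern, using SQLite and illustrative (not final) table/column names:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE actor (
    actor_id   INTEGER PRIMARY KEY,
    actor_text TEXT UNIQUE NOT NULL
);
CREATE TABLE revision (
    rev_id    INTEGER PRIMARY KEY,
    rev_actor INTEGER NOT NULL REFERENCES actor(actor_id)
);
""")

def actor_id_for(text):
    """Insert the name if unseen, then return its integer key."""
    db.execute("INSERT OR IGNORE INTO actor (actor_text) VALUES (?)", (text,))
    return db.execute("SELECT actor_id FROM actor WHERE actor_text = ?",
                      (text,)).fetchone()[0]

# Three edits by the same user store the 255-byte-capable name only once;
# each revision row carries just an integer.
for rev_id in (1, 2, 3):
    db.execute("INSERT INTO revision VALUES (?, ?)",
               (rev_id, actor_id_for("ExampleUser")))

print(db.execute("SELECT COUNT(*) FROM actor").fetchone()[0])  # → 1
```

The same shape applies to comments, content models, content formats, and slot roles below.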
cf other & older notes:
Current
The revision table:

Column | Type |
---|---|
rev_id | int |
rev_page -> page_id | int |
rev_text_id -> old_id | int |
rev_comment | varbinary(767) |
rev_user -> user_id | int |
rev_user_text | varchar(255) |
rev_timestamp | binary(14) |
rev_minor_edit | tinyint |
rev_deleted | tinyint |
rev_len | int |
rev_parent_id -> rev_id | int |
rev_sha1 | varbinary(32) |
rev_content_model | varbinary(32) |
rev_content_format | varbinary(64) |
The archive table:

Column | Type |
---|---|
ar_id | int |
ar_namespace | int |
ar_title | varchar(255) |
ar_text | mediumblob |
ar_comment | varbinary(255) |
ar_user | int |
ar_user_text | varbinary(255) |
ar_timestamp | binary(14) |
ar_minor_edit | tinyint |
ar_flags | tinyblob |
ar_rev_id | int |
ar_text_id | int |
ar_deleted | tinyint |
ar_len | int |
ar_page_id | int |
ar_parent_id | int |
ar_sha1 | varbinary(32) |
ar_content_model | varbinary(32) |
ar_content_format | varbinary(64) |
Provisional
/tables.sql (needs updating)
The revision table:

Column | Type | Change |
---|---|---|
rev_id | bigint | int -> bigint |
rev_page -> page_id | int | — |
rev_comment_id -> comment_id | bigint | Moved to own table. |
rev_actor -> actor_id | bigint | Separated user/user_text to own table. |
rev_timestamp | binary(14) | — |
rev_minor_edit | tinyint | — |
rev_deleted | tinyint | — |
rev_parent_id -> rev_id | bigint | int -> bigint |
The archive table:

Column | Type | Change |
---|---|---|
ar_id | bigint | int -> bigint |
ar_namespace | int | — |
ar_title | varchar(255) | — |
ar_page_id | int | — |
ar_rev_id | bigint | int -> bigint |
ar_comment_id -> comment_id | bigint | Moved to own table. |
ar_actor -> actor_id | bigint | Separated user/user_text to own table. |
ar_timestamp | binary(14) | — |
ar_minor_edit | tinyint | — |
ar_deleted | tinyint | — |
ar_parent_id -> rev_id | bigint | int -> bigint |
The slots table:

Column | Type | Change |
---|---|---|
slot_revision -> rev_id/ar_rev_id | bigint | Link back to revision or archive |
slot_content -> cont_id | bigint | Link to content |
slot_role -> cr_id | smallint | Link to content_role |
The content table:

Column | Type | Change |
---|---|---|
cont_id | bigint | N/A |
cont_address | int | Replaces rev_text_id. |
cont_len | int | Moved; no other change. |
cont_sha1 | varbinary(32) | Moved; no other change. |
cont_model -> cm_id | smallint | Link to content_model |
cont_format -> cf_id | smallint | Link to content_format |
The comment table:

Column | Type | Change |
---|---|---|
comment_id | bigint | N/A |
comment_text | mediumblob | varbinary(767) -> mediumblob (for phab:T6714) |
The actor table:

Column | Type | Change |
---|---|---|
actor_id | bigint | N/A |
actor_user -> user_id | int | Link to user; moved from rev_user |
actor_text | varchar(255) | Moved from rev_user_text; no other change. |
The content_model table:

Column | Type | Change |
---|---|---|
cm_id | smallint | N/A |
cm_model | varbinary(32) | Moved; no other change. |
The content_format table:

Column | Type | Change |
---|---|---|
cf_id | smallint | N/A |
cf_format | varbinary(64) | Moved; no other change. |
The content_role table:

Column | Type | Change |
---|---|---|
cr_id | smallint | N/A |
cr_role | varbinary(32) | New concept for MCR. |
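To make the slots/content split concrete, here is a hedged sketch of how one revision would carry two content blobs under MCR. The role names, IDs, and the textual `cont_address` values are illustrative assumptions (the address format is still an open question, per the TODOs below):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE slots (
    slot_revision INTEGER NOT NULL,   -- link back to revision/archive
    slot_content  INTEGER NOT NULL,   -- link to content
    slot_role     INTEGER NOT NULL    -- link to content_role
);
CREATE TABLE content (
    cont_id      INTEGER PRIMARY KEY,
    cont_address TEXT NOT NULL        -- illustrative; final form undecided
);
CREATE TABLE content_role (
    cr_id   INTEGER PRIMARY KEY,
    cr_role TEXT UNIQUE NOT NULL
);
""")
db.executemany("INSERT INTO content_role VALUES (?, ?)",
               [(1, "main"), (2, "structured-data")])
db.executemany("INSERT INTO content VALUES (?, ?)",
               [(10, "tt:12345"), (11, "tt:12346")])
# Revision 100 carries two content blobs, one per role.
db.executemany("INSERT INTO slots VALUES (?, ?, ?)",
               [(100, 10, 1), (100, 11, 2)])

rows = db.execute("""
    SELECT cr_role, cont_address
    FROM slots
    JOIN content      ON cont_id = slot_content
    JOIN content_role ON cr_id   = slot_role
    WHERE slot_revision = 100
    ORDER BY cr_id
""").fetchall()
print(rows)  # → [('main', 'tt:12345'), ('structured-data', 'tt:12346')]
```

A single-slot wiki just always has one `main` row per revision in slots.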
TODO: consider further changes to archive table (full revdel?)
TODO: maybe switch the text table around and change the text_id ref to a URL? (still thinking this through)
TODO: plan updates to other tables that have the user_text pattern.
Thoughts
- That seems like a lot of tables!
- Most of them are the small tables for inlining strings -- content models, content formats, content slot roles for MCR, and user refs/IP addresses for actor. These should save a fair chunk of duplicated space. Additionally, the MCR split between revision & content makes each of the two tables smaller and more malleable.
- What happened to rev_text_id?
- content.cont_address replaces it.
- It may be an open question whether we want to make that change immediately, or whether to change the 'text' table as well, etc.
- Why isn't rev_deleted moved to content?
- rev_deleted is a bitfield and most of its options apply to things that aren't part of a Content object, such as the edit comment and the username. If separately "rev-deleting" just one content item is needed, a second bitfield or flag will need to be added on the content table too...
- What about rev_len, rev_sha1 -- do they belong in content?
- Not sure about this. Do we need to keep the fields for summing from multiple content objects?
- How hard will it be to change queries?
- Those that WHERE on rev_user/rev_user_text directly, or read fields directly, etc will need to be updated. :(
- Things that just use Revision::*FromConds() and the accessor functions will be able to fall back to lazy loading without changes.
- Stuff that touches rev_text_id directly will need changing.
- Stuff that wants to pre-join a bunch of data may need changing. May be able to add abstractions on Revision function to hide some of that, or build new abstractions that are less freaky.
- Stuff that inserts manually will need updating; stuff that uses Revision accessors should remain safe, but should update to MCR-specific interfaces in future
- What would a transition look like? What kind of background processing and what kind of downtime to expect?
- See #Transition plan -- we'll need a transitional mode in MediaWiki where a background process runs filling out the new tables. This may take some time -- several weeks sounds likely for the biggest sites. This may increase load on servers and will require additional disk space.
- Most likely we will first add the new fields to revision, allowing them to be filled out without disturbing ongoing actions, and then remove the now redundant fields afterwards.
- In principle, once the background process is complete, it should be possible to switch a wiki to read-only, flip its mode, and then switch back to read-write with little downtime for editors.
- This could also allow quickly aborting/reverting to the previous state of revisions if things look bad in read-only... but if a rollback after going read-write is desired, that's a lot freakier to deal with.
- What about manual queries (tool labs, etc) that might not want to update?
- In principle a view can be created that replicates the classic view of the revision table... except for that text_id maybe. #Compatibility view
- Space and performance concerns about indexes; possibility of "sharding" different sub-tables with specific indexes?
- Could consider extra tables to support usage patterns, as long as they all fit on standard servers. #Denormalization
- How will the archive table be transitioned?
- In current model, archive may drop some legacy fields from ancient MediaWiki versions, which may require upgrade of existing rows in these circumstances: separation of inline text storage to 'text' table, setup of 'content' rows, and creation of an ar_rev_id for those that lack it (to be used in content slot role assignment).
- It's possible we could make a bigger change where deleted rows stay in the 'revision' table (full 'revision deletion') but this is not yet decided.
Transition plan
Clean up archive table (it's conceivable this has already been done, but expect surprises)
- background process finishes any outstanding migrations on the archive table for deleted revs
- migrate any inline text storage to text table
- assign ar_rev_id if none present
Intermediate schema
- intermediate schema will have all the old fields *and* all the new fields
- apply intermediate schema on DB replicas while out of rotation, until all servers ready
Transition mode
- keep reading from the old fields, but write to the new fields too when updating/inserting
- set up the content model, content format, slot role entries.
- background process churns through existing rows:
  - break out comment, actor entries
  - break out content entries and create slot rows assigning them to the revs
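The comment/actor breakout step above can be sketched as a batched backfill. This is a hedged illustration, not the migration script: batch size, the temporary rev_comment_id column, and the restartability scheme are all assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Intermediate schema: old rev_comment still present, new rev_comment_id
-- added alongside it and initially NULL.
CREATE TABLE revision (
    rev_id         INTEGER PRIMARY KEY,
    rev_comment    TEXT,
    rev_comment_id INTEGER
);
CREATE TABLE comment (
    comment_id   INTEGER PRIMARY KEY,
    comment_text TEXT UNIQUE NOT NULL
);
""")
db.executemany("INSERT INTO revision (rev_id, rev_comment) VALUES (?, ?)",
               [(1, "fix typo"), (2, "fix typo"), (3, "revert")])

BATCH = 100  # real runs would use a tuned batch size and wait for replication
while True:
    batch = db.execute("""SELECT rev_id, rev_comment FROM revision
                          WHERE rev_comment_id IS NULL LIMIT ?""",
                       (BATCH,)).fetchall()
    if not batch:
        break  # backfill complete; process is restartable at any point
    for rev_id, text in batch:
        # Deduplicate: identical comment strings share one comment row.
        db.execute("INSERT OR IGNORE INTO comment (comment_text) VALUES (?)",
                   (text,))
        (cid,) = db.execute(
            "SELECT comment_id FROM comment WHERE comment_text = ?",
            (text,)).fetchone()
        db.execute("UPDATE revision SET rev_comment_id = ? WHERE rev_id = ?",
                   (cid, rev_id))

print(db.execute("SELECT COUNT(*) FROM comment").fetchone()[0])  # → 2
```

Because new writes in transition mode already fill rev_comment_id, the NULL check naturally skips rows written after the mode flip.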
Provisional switchover
- (?) activate full usage of the new fields, but write to the old fields too when updating/inserting
- (?) gives us a chance to test without losing data
Full switchover
- activate full usage of new fields, ignore use of the old fields
Final schema
- drop the old fields on DB replicas while out of rotation, until all servers are updated.
Compatibility view
Something like this could work for providing a back-compatible view of revision:
The main pain points are rev_text_id (if we change it to a full address in content, it's harder to just use a reference) and whether to explicitly pull a main content slot or not...
create view revision_old
as select
  rev_id,
  rev_page,
  0 as rev_text_id, -- XXXX no exact match?
  comment_text as rev_comment,
  actor_user as rev_user,
  actor_text as rev_user_text,
  rev_timestamp,
  rev_minor_edit,
  rev_deleted,
  cont_len as rev_len,
  rev_parent_id,
  cont_sha1 as rev_sha1,
  cm_model as rev_content_model,
  cf_format as rev_content_format
from revision
left join slots on slot_revision=rev_id and slot_role=1 -- ????
left join content on cont_id=slot_content
left join content_model on cm_id=cont_model
left join content_format on cf_id=cont_format
left join actor on actor_id=rev_actor
left join comment on comment_id=rev_comment_id;
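A reduced, runnable version of the idea (comment and actor joins only, SQLite, illustrative data) shows that an unmodified tool query keeps working against the view:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE revision (
    rev_id INTEGER PRIMARY KEY, rev_comment_id INTEGER,
    rev_actor INTEGER, rev_timestamp TEXT
);
CREATE TABLE comment (comment_id INTEGER PRIMARY KEY, comment_text TEXT);
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, actor_user INTEGER,
                    actor_text TEXT);
-- Back-compat view exposing the classic column names.
CREATE VIEW revision_old AS
    SELECT rev_id, rev_timestamp,
           comment_text AS rev_comment,
           actor_user   AS rev_user,
           actor_text   AS rev_user_text
    FROM revision
    LEFT JOIN comment ON comment_id = rev_comment_id
    LEFT JOIN actor   ON actor_id   = rev_actor;
""")
db.execute("INSERT INTO comment VALUES (5, 'fix typo')")
db.execute("INSERT INTO actor VALUES (9, 42, 'ExampleUser')")
db.execute("INSERT INTO revision VALUES (1, 5, 9, '20170101000000')")

# A legacy query, unchanged, now served by the view:
row = db.execute(
    "SELECT rev_comment, rev_user_text FROM revision_old WHERE rev_id = 1"
).fetchone()
print(row)  # → ('fix typo', 'ExampleUser')
```

Queries that filter on the aliased columns will still pay the join cost, so the view is a compatibility shim, not a performance fix.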
Denormalization
Part of the space usage of the revision table is the several indexes on it:
CREATE INDEX /*i*/rev_page_id ON /*_*/revision (rev_page, rev_id); -- used for direct current-ver lookups
CREATE INDEX /*i*/rev_timestamp ON /*_*/revision (rev_timestamp); -- is this used directly? looks bogus
CREATE INDEX /*i*/page_timestamp ON /*_*/revision (rev_page,rev_timestamp); -- used for history, editing, lots of stuff
CREATE INDEX /*i*/user_timestamp ON /*_*/revision (rev_user,rev_timestamp); -- used for contribs
CREATE INDEX /*i*/usertext_timestamp ON /*_*/revision (rev_user_text,rev_timestamp); -- used for contribs
CREATE INDEX /*i*/page_user_timestamp ON /*_*/revision (rev_page,rev_user,rev_timestamp); -- what is this for? seems weird
(Question: are there any extra indexes added in production or tools that might be needed too?)
There might or might not be benefit to creating denormalized summary tables containing only a few key columns and the indexes needed, like this:
--
-- Summary table to isolate the usage of the indexes for Special:Contributions
-- and ApiQueryUserContributions.
--
CREATE TABLE /*_*/contribs (
  -- Key to rev_id
  contribs_revision bigint unsigned NOT NULL PRIMARY KEY,
  -- Key to actor.actor_id of the user who made this edit.
  -- actor_user or actor_text will have the actual id or IP address.
  contribs_actor bigint unsigned NOT NULL default 0,
  -- The timestamp, copied from rev_timestamp
  contribs_timestamp binary(14) NOT NULL default ''
) /*$wgDBTableOptions*/;
CREATE INDEX /*i*/actor_timestamp ON /*_*/contribs (contribs_actor,contribs_timestamp);
This can then be joined to the main tables to get the rest of the information. Does this make sense? We're not sure... yet :)
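A hedged sketch of that join (SQLite, illustrative column names and data): the narrow summary table carries the contribs index, and the wide revision row is fetched by primary key only for matching rows.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE revision (
    rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_minor_edit INTEGER
);
CREATE TABLE contribs (
    contribs_revision  INTEGER PRIMARY KEY,   -- key to rev_id
    contribs_actor     INTEGER NOT NULL,      -- key to actor_id
    contribs_timestamp TEXT NOT NULL
);
-- The contribs index lives only on this narrow table.
CREATE INDEX actor_timestamp ON contribs (contribs_actor, contribs_timestamp);
""")
db.execute("INSERT INTO revision VALUES (1, 10, 0)")
db.execute("INSERT INTO contribs VALUES (1, 9, '20170101000000')")

# Special:Contributions-style query: scan the summary index, then join
# back to revision for the remaining fields.
row = db.execute("""
    SELECT rev_id, rev_page
    FROM contribs
    JOIN revision ON rev_id = contribs_revision
    WHERE contribs_actor = 9
    ORDER BY contribs_timestamp DESC
""").fetchone()
print(row)  # → (1, 10)
```

The trade-off is double writes (revision plus contribs) on every edit, in exchange for keeping the contribs index off the wide table.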
Questions and comments
Use the talk page, please.