VisualEditor/Internals/DM

The VisualEditor linear model is optimized for transactional editing. It is similar to an HTML token stream, except that inline formatting is composed onto each character, which makes arbitrary slicing of content simple and efficient. The transaction system makes modifications of the document safe and reversible. Transactions are prepared against the current document state and then committed. They can later be rolled back, or "undone".

The linear model

Classes: ve.dm.LinearModel, ve.dm.ContentBranchNode, ve.dm.*Annotation

The linear model is a clean representation of document content, in a format conceptually similar to HTML but using an array data structure that is optimal for transactional editing.

  • BranchNodes are block-level nodes, represented using open/close tag pairs (just like in HTML). BranchNodes can have other BranchNodes as children.
  • ContentBranchNodes are special branch nodes that can contain inline content, such as a paragraph or a heading.
  • Inline content such as text can only appear inside a ContentBranchNode. Text is represented with a separate array item for each JavaScript character (i.e. each UTF-16 code unit).
  • Annotations such as italic or link are not represented as nodes. Instead each text character carries its annotations independently. This makes arbitrary slicing of content simple and efficient.

Example linear model data (simplified for clarity):

[
 { type: 'paragraph' },
 'h',
 'e',
 'l',
 'l',
 'o',
 ' ',
 [ 'w', [ { type: 'textStyle/italic' } ] ],
 [ 'o', [ { type: 'textStyle/italic' } ] ],
 [ 'r', [ { type: 'textStyle/italic' } ] ],
 [ 'l', [ { type: 'textStyle/italic' } ] ],
 [ 'd', [ { type: 'textStyle/italic' } ] ],
 { type: '/paragraph' }
]
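Because every character carries its own annotations, any slice of the array can be read without looking for matching open/close formatting tags. The following is a minimal sketch of that idea using the simplified format above; `getChar`, `getAnnotationTypes` and `plainText` are hypothetical helpers written for this illustration, not VisualEditor APIs.

```javascript
// Simplified linear data, as in the example above.
const data = [
	{ type: 'paragraph' },
	'h', 'e', 'l', 'l', 'o', ' ',
	[ 'w', [ { type: 'textStyle/italic' } ] ],
	[ 'o', [ { type: 'textStyle/italic' } ] ],
	[ 'r', [ { type: 'textStyle/italic' } ] ],
	[ 'l', [ { type: 'textStyle/italic' } ] ],
	[ 'd', [ { type: 'textStyle/italic' } ] ],
	{ type: '/paragraph' }
];

// Hypothetical helper: the character of a text item, or null for a tag.
function getChar( item ) {
	if ( typeof item === 'string' ) {
		return item;
	}
	return Array.isArray( item ) ? item[ 0 ] : null;
}

// Hypothetical helper: the annotation type names carried by one item.
function getAnnotationTypes( item ) {
	return Array.isArray( item ) ? item[ 1 ].map( ( ann ) => ann.type ) : [];
}

// Hypothetical helper: join the characters of a slice, skipping tags.
function plainText( items ) {
	return items.map( getChar ).filter( ( ch ) => ch !== null ).join( '' );
}

// Slice from the middle of the plain text into the middle of the italic text.
const slice = data.slice( 4, 9 );
console.log( plainText( slice ) );
console.log( slice.map( getAnnotationTypes ) );
```

The slice covers the characters "lo wo", and each italic character still reports its own annotation even though the slice starts and ends in the middle of a formatting run.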

The spanning tree

Classes: ve.dm.Document, ve.dm.ContentBranchNode, ve.dm.*Node

The spanning tree allows efficient access to the portion of the data model that corresponds to a particular node. Only offsets and parent-child relationships are stored in the spanning tree (not actual document data); as such it is a cache of data derivable from the data model alone.
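To illustrate why the tree is derivable from the linear data alone, here is a rough sketch that rebuilds offsets and parent-child relationships in one pass. The function `buildTree` and the node shape `{ type, offset, length, children }` are assumptions made for this example and do not reflect the actual ve.dm.Document implementation.

```javascript
// Hypothetical sketch: derive { type, offset, length, children } nodes from
// linear data. offset is the index of the open tag; length counts every item
// from the open tag through the close tag inclusive.
function buildTree( data ) {
	const root = { type: 'document', offset: 0, length: data.length, children: [] };
	const stack = [ root ];
	data.forEach( ( item, i ) => {
		// Text items are strings or [ char, annotations ] arrays; they add no nodes.
		if ( typeof item === 'string' || Array.isArray( item ) ) {
			return;
		}
		if ( item.type.startsWith( '/' ) ) {
			const node = stack.pop();
			if ( stack.length === 0 || node.type !== item.type.slice( 1 ) ) {
				throw new Error( 'Unbalanced close tag at offset ' + i );
			}
			node.length = i - node.offset + 1;
		} else {
			const node = { type: item.type, offset: i, length: 0, children: [] };
			stack[ stack.length - 1 ].children.push( node );
			stack.push( node );
		}
	} );
	if ( stack.length !== 1 ) {
		throw new Error( 'Unclosed node: ' + stack[ stack.length - 1 ].type );
	}
	return root;
}

const data = [
	{ type: 'heading', attributes: { level: 1 } }, 'H', 'i', { type: '/heading' },
	{ type: 'paragraph' }, 'a', 'b', 'c', { type: '/paragraph' }
];
const tree = buildTree( data );

// A node's offset and length locate its data without rescanning the document.
const para = tree.children[ 1 ];
const paraData = data.slice( para.offset, para.offset + para.length );
console.log( para.offset, para.length, paraData.length );
```

Only offsets, lengths and child lists are stored here; the document content itself stays in the linear array.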

The internal list

Classes: ve.dm.InternalList, ve.dm.MWReferenceNode (in Cite extension)

Loading and outputting HTML

Classes: ve.dm.Converter

Transactions

Classes: ve.dm.Transaction

A transaction represents a change to the document starting from a particular state. Every such change can be described as a sequence of insert/remove operations at particular locations in the linear model.

However, the reverse is not true: not every possible sequence of linear insert/remove operations forms a valid transaction, because some won't result in a valid document tree. For example, removing just an open tag will obviously break tree validity.

So we can alternatively define a transaction as being a sequence of linear insert/remove operations that, taken as a whole, preserve tree validity. Inside the ve.dm.Transaction class, we actually define the transaction that way.
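The point about tree validity can be made concrete with a small check. The sketch below only tests that open and close tags are balanced and properly nested; real tree validity involves more than this (for example, inline content may only appear inside a ContentBranchNode). `isBalanced` is a hypothetical helper, not part of ve.dm.Transaction.

```javascript
// Hypothetical check: every open tag has a matching, properly nested close tag.
function isBalanced( data ) {
	const open = [];
	for ( const item of data ) {
		// Text items (strings or [ char, annotations ] arrays) carry no structure.
		if ( typeof item === 'string' || Array.isArray( item ) ) {
			continue;
		}
		if ( item.type.startsWith( '/' ) ) {
			if ( open.pop() !== item.type.slice( 1 ) ) {
				return false;
			}
		} else {
			open.push( item.type );
		}
	}
	return open.length === 0;
}

const doc = [ { type: 'paragraph' }, 'a', 'b', { type: '/paragraph' } ];

// Removing only the open tag leaves a dangling close tag.
const openTagRemoved = doc.slice( 1 );

// Swapping only the open tag leaves mismatched tags.
const oneTagSwapped = [ { type: 'heading', attributes: { level: 1 } }, 'a', 'b', { type: '/paragraph' } ];

// Swapping both tags as one unit keeps the structure balanced.
const bothTagsSwapped = [ { type: 'heading', attributes: { level: 1 } }, 'a', 'b', { type: '/heading' } ];

console.log( isBalanced( doc ), isBalanced( openTagRemoved ), isBalanced( oneTagSwapped ), isBalanced( bothTagsSwapped ) );
```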

As the user edits, transactions are built to modify the linear model state to match the user modifications. Here are some examples:

Transaction replacing some text

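const replaceSomeText = new ve.dm.Transaction( [
    // Step past the first 10 items unchanged
    { type: 'retain', length: 10 },
    // Remove 'a', 'b' and insert 'x' in their place
    { type: 'replace', remove: [ 'a', 'b' ], insert: [ 'x' ] },
    // Step past the next 17 items unchanged
    { type: 'retain', length: 17 }
] );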

Transaction changing a paragraph into a heading

const changeParaToHeading = new ve.dm.Transaction( [
    { type: 'retain', length: 2 },
    // Swap the paragraph open tag for a heading open tag
    {
        type: 'replace',
        remove: [ { type: 'paragraph' } ],
        insert: [ { type: 'heading', attributes: { level: 1 } } ]
    },
    { type: 'retain', length: 20 },
    // Swap the matching close tag in the same transaction
    {
        type: 'replace',
        remove: [ { type: '/paragraph' } ],
        insert: [ { type: '/heading' } ]
    },
    { type: 'retain', length: 5 }
] );

The key types of operations are:

  • retain, which steps past content unmodified
  • replace, which splices content at the current location
  • attribute, which changes attributes on an open tag at the current location
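As a rough illustration of how the three operation types listed above walk the linear model, the sketch below applies a list of operations to a copy of the data. It is not the ve.dm.TransactionProcessor algorithm; `applyOperations` is a hypothetical function, and the shape assumed for the attribute operation (`key`, `from`, `to`) is an assumption of this example.

```javascript
// Hypothetical sketch: apply retain / replace / attribute operations to a copy
// of the linear data. The attribute shape { key, from, to } is an assumption.
function applyOperations( data, operations ) {
	const result = data.slice();
	let cursor = 0;
	for ( const op of operations ) {
		if ( op.type === 'retain' ) {
			// Step past content unmodified.
			if ( cursor + op.length > result.length ) {
				throw new Error( 'Retain runs past the end of the data' );
			}
			cursor += op.length;
		} else if ( op.type === 'replace' ) {
			// Splice at the current location, checking the removed data first.
			const found = result.slice( cursor, cursor + op.remove.length );
			if ( JSON.stringify( found ) !== JSON.stringify( op.remove ) ) {
				throw new Error( 'Data at offset ' + cursor + ' does not match remove' );
			}
			result.splice( cursor, op.remove.length, ...op.insert );
			cursor += op.insert.length;
		} else if ( op.type === 'attribute' ) {
			// Change one attribute on the open tag at the cursor; the cursor
			// does not move, so a following retain steps past the tag.
			const element = result[ cursor ];
			result[ cursor ] = {
				...element,
				attributes: { ...element.attributes, [ op.key ]: op.to }
			};
		} else {
			throw new Error( 'Unknown operation type: ' + op.type );
		}
	}
	if ( cursor !== result.length ) {
		throw new Error( 'Operations do not cover the whole document' );
	}
	return result;
}

const paragraph = [ { type: 'paragraph' }, 'a', 'b', 'c', { type: '/paragraph' } ];
const replaced = applyOperations( paragraph, [
	{ type: 'retain', length: 1 },
	{ type: 'replace', remove: [ 'a', 'b' ], insert: [ 'x' ] },
	{ type: 'retain', length: 2 }
] );

const heading = [ { type: 'heading', attributes: { level: 1 } }, 'T', { type: '/heading' } ];
const releveled = applyOperations( heading, [
	{ type: 'attribute', key: 'level', from: 1, to: 2 },
	{ type: 'retain', length: 3 }
] );
console.log( replaced, releveled );
```

Checking the removed data against the document before splicing is one way to detect a transaction being applied to a state it was not built for.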

Important properties of transactions

Transactions are built to apply to a particular initial state of the linear model. Applying the transaction to the initial state transforms the linear model to another state, the final state. Transactions are not intended to be applied to any other state (at least, not directly; see "Rebasing transactions" below).

Transactions preserve tree validity (when applied to the intended linear model initial state). Note that an individual operation within the transaction may not preserve tree validity when applied in isolation. Therefore the transaction is the basic unit of change that preserves tree validity.

Transactions are reversible. This is because the operations they contain are reversible. (To reverse a replace operation, just swap the remove and insert values.) The reverse transaction will therefore transform the linear model from the original transaction's final state back to its initial state.

Reversibility implies the edit history can be stored as a list of transactions in the order they were applied. Undo can be implemented by popping a transaction from the history and applying its reverse.
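The history-and-undo idea can be sketched in a few lines. This is an illustration under simplifying assumptions, not the ve.dm.Surface implementation: `reverseOperations`, `apply`, `commit` and `undo` are hypothetical names, the applier handles only retain and replace, and the attribute shape (`from`, `to`) is assumed.

```javascript
// Reverse each operation: swap remove/insert for replace, from/to for attribute.
function reverseOperations( operations ) {
	return operations.map( ( op ) => {
		if ( op.type === 'replace' ) {
			return { type: 'replace', remove: op.insert, insert: op.remove };
		}
		if ( op.type === 'attribute' ) {
			return { ...op, from: op.to, to: op.from };
		}
		return op; // retain lengths are unchanged
	} );
}

// Minimal applier for retain and replace only (hypothetical, for this demo).
function apply( data, operations ) {
	const result = [];
	let cursor = 0;
	for ( const op of operations ) {
		if ( op.type === 'retain' ) {
			result.push( ...data.slice( cursor, cursor + op.length ) );
			cursor += op.length;
		} else {
			result.push( ...op.insert );
			cursor += op.remove.length;
		}
	}
	return result;
}

// History is a list of transactions in the order they were applied.
const history = [];
let doc = [ { type: 'paragraph' }, 'a', 'b', 'c', { type: '/paragraph' } ];

function commit( operations ) {
	doc = apply( doc, operations );
	history.push( operations );
}

function undo() {
	const last = history.pop();
	if ( last ) {
		doc = apply( doc, reverseOperations( last ) );
	}
}

const initial = JSON.stringify( doc );
commit( [
	{ type: 'retain', length: 1 },
	{ type: 'replace', remove: [ 'a', 'b' ], insert: [ 'x' ] },
	{ type: 'retain', length: 2 }
] );
commit( [
	{ type: 'retain', length: 3 },
	{ type: 'replace', remove: [], insert: [ '!' ] },
	{ type: 'retain', length: 1 }
] );
const edited = JSON.stringify( doc );
undo();
undo();
console.log( JSON.stringify( doc ) === initial );
```

Undo must pop transactions in reverse order of application, because each reverse transaction only applies to the final state of the transaction it reverses.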

Applying transactions

Classes: ve.dm.TransactionProcessor, ve.dm.TreeModifier

See VisualEditor/Internals/DM/TreeModifier for more information.

VisualEditor applies transactions incrementally to the linear model while keeping the spanning tree in sync at all times. This requires a special "TreeModifier" algorithm, because the individual linear splices in a transaction do not necessarily preserve tree structure validity.

Incremental validity is essential, because it means event listeners on data model tree nodes can fire while the update is in progress and perform corresponding updates to ContentEditable tree nodes, relying on the guarantee that the document structure will remain valid even partway through the transaction being applied.

History and undo

Classes: ve.dm.Change

Rebasing transactions

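Rebasing only applies to two transactions that were built against the same starting state.

Two such transactions conflict if the ranges they modify overlap.

Otherwise, one can be rebased onto the other just by adjusting its retain lengths to account for the content the other transaction inserted or removed.

Because the modified ranges do not overlap, the rebased transaction still preserves tree validity.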
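A simplified sketch of the retain-length adjustment follows. It assumes each transaction has exactly the shape retain, replace, retain, and it conservatively treats ranges that touch as conflicting, which is stricter than the overlap rule described above. `modifiedRange` and `rebaseOnto` are hypothetical names for this example and are not the VisualEditor rebasing API.

```javascript
// Hypothetical helper: the range touched by a retain/replace/retain
// transaction, plus the net change in document length it causes.
function modifiedRange( operations ) {
	const [ before, replace ] = operations;
	return {
		start: before.length,
		end: before.length + replace.remove.length,
		delta: replace.insert.length - replace.remove.length
	};
}

// Rebase b so that it applies after a. Both must have been built against the
// same starting state. Returns null on conflict.
function rebaseOnto( a, b ) {
	const ra = modifiedRange( a );
	const rb = modifiedRange( b );
	const [ before, replace, after ] = b;
	if ( ra.end < rb.start ) {
		// a changed content before b's range: shift b's leading retain.
		return [ { type: 'retain', length: before.length + ra.delta }, replace, after ];
	}
	if ( rb.end < ra.start ) {
		// a changed content after b's range: adjust b's trailing retain.
		return [ before, replace, { type: 'retain', length: after.length + ra.delta } ];
	}
	// Overlapping (or, in this sketch, touching) ranges conflict.
	return null;
}

// Starting state: [ paragraph, a, b, c, d, e, f, /paragraph ] (8 items).
const txA = [
	{ type: 'retain', length: 1 },
	{ type: 'replace', remove: [ 'a' ], insert: [ 'x', 'y' ] },
	{ type: 'retain', length: 6 }
];
const txB = [
	{ type: 'retain', length: 5 },
	{ type: 'replace', remove: [ 'e' ], insert: [] },
	{ type: 'retain', length: 2 }
];
const txC = [
	{ type: 'retain', length: 1 },
	{ type: 'replace', remove: [ 'a', 'b' ], insert: [] },
	{ type: 'retain', length: 5 }
];

const bAfterA = rebaseOnto( txA, txB ); // leading retain grows by one
const aAfterB = rebaseOnto( txB, txA ); // trailing retain shrinks by one
const cAfterA = rebaseOnto( txA, txC ); // both modify 'a': conflict
console.log( bAfterA, aAfterB, cAfterA );
```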

The DM Surface

Classes: ve.dm.Surface

The DM surface builds a number of features on top of the DM Document, including:

  • Selection
  • Undo stack
  • Session storage
  • Active annotations at the selection