Extension:DiscussionTools/How it works

This page documents the internals of DiscussionTools for developers of the extension and tools that build on top of it, like JS gadgets or SQL queries.

To learn about common reasons why DiscussionTools might not work as expected on a specific page, see Help:DiscussionTools/Why can't I reply to this comment?.

Parser

Most DiscussionTools features rely on the talk page parser introduced in this extension (no relation to the MediaWiki wikitext parser).

The parser takes as input the HTML rendering of the discussion page (produced by either Parsoid or the old wikitext parser), and gives as output a representation of the comments and threads on the page.

Note that DiscussionTools does not deal with the wikitext at all, only with HTML.

Data structures

DiscussionTools recognizes two kinds of items: headings and comments. Other content on the page is not included in the representation.

Headings and comments form a tree structure. Comments can be top-level (represented as replies to headings), or be replies to other comments. Headings can be top-level, or be sub-headings (represented as replies to other headings). A thread is a heading together with its tree of replies.

Each item has the following properties:

  • ID and name, which are used to identify the item in different contexts
  • Range, referencing the HTML DOM nodes where it was detected. The range may begin or end in the middle of an element, and may span multiple elements in different parent nodes.
  • Indentation level (always 0 for headings, 1 for top-level comments, 2+ for replies)
  • References to parent item and reply items

Comments additionally have:

  • Signature ranges, as above, referencing the HTML DOM nodes of signatures
  • Author name
  • Date and time

Headings additionally have:

  • Heading level (1-6)
  • Whether it is a placeholder heading, used when comment items appear before the first heading on the page

This data structure is ephemeral and not stored anywhere. When it's needed, it is constructed from scratch from the page HTML. (The information is encoded back into the HTML in the formatter though, as described below.)
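
The properties listed above can be pictured with a small sketch. This is only an illustration: the `ThreadItem` base class, the field names, and the use of a plain string in place of a DOM range are assumptions made for the example, not the extension's actual API.

```javascript
// Illustrative sketch only: names and shapes are assumptions, not the real classes.
class ThreadItem {
  constructor({ type, level, range, id = null, name = null }) {
    this.type = type;   // 'heading' or 'comment'
    this.level = level; // indentation level: 0 for headings, 1+ for comments
    this.range = range; // stand-in for a DOM range (here just a label)
    this.id = id;
    this.name = name;
    this.parent = null; // reference to the parent item
    this.replies = [];  // references to reply items
  }

  addReply(item) {
    item.parent = this;
    this.replies.push(item);
    return item;
  }
}

class HeadingItem extends ThreadItem {
  constructor({ headingLevel, placeholder = false, ...rest }) {
    super({ ...rest, type: 'heading', level: 0 });
    this.headingLevel = headingLevel; // 1-6
    this.placeholder = placeholder;   // true when comments precede the first heading
  }
}

class CommentItem extends ThreadItem {
  constructor({ signatureRanges = [], author, timestamp, ...rest }) {
    super({ ...rest, type: 'comment' });
    this.signatureRanges = signatureRanges;
    this.author = author;
    this.timestamp = timestamp;
  }
}

// Build a fragment of a thread: heading A, comment B, and a reply C to B.
const when = new Date(Date.UTC(2021, 5, 24, 0, 9));
const heading = new HeadingItem({ headingLevel: 2, range: 'h2: A' });
const b = heading.addReply(new CommentItem({
  level: 1, range: 'p: B', author: 'Matma Rex', timestamp: when,
}));
const c = b.addReply(new CommentItem({
  level: 2, range: 'li: C, li: C', author: 'Matma Rex', timestamp: when,
}));

console.log(c.parent === b, heading.replies.length); // true 1
```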

Example

Below is an example discussion, and the parser's representation of it (pseudocode):

A

B. Matma Rex (talk) 00:09, 24 June 2021 (UTC)

    C.
    C. Matma Rex (talk) 00:09, 24 June 2021 (UTC)
        D. Matma Rex (talk) 00:09, 24 June 2021 (UTC)
            E. Matma Rex (talk) 00:09, 24 June 2021 (UTC)
            F. Matma Rex (talk) 00:09, 24 June 2021 (UTC)
    G. Matma Rex (talk) 00:09, 24 June 2021 (UTC)

H. Matma Rex (talk) 00:09, 24 June 2021 (UTC)

    I. Matma Rex (talk) 00:09, 24 June 2021 (UTC)
[
  HeadingItem( { level: 0, range: (h2: A), replies: [
    CommentItem( { level: 1, range: (p: B), replies: [
      CommentItem( { level: 2, range: (li: C, li: C), replies: [
        CommentItem( { level: 3, range: (li: D), replies: [
          CommentItem( { level: 4, range: (li: E), replies: [] } ),
          CommentItem( { level: 4, range: (li: F), replies: [] } ),
        ] } ),
      ] } ),
      CommentItem( { level: 2, range: (li: G), replies: [] } ),
    ] } ),
    CommentItem( { level: 1, range: (p: H), replies: [
      CommentItem( { level: 2, range: (li: I), replies: [] } ),
    ] } ),
  ] } )
]

Parsing algorithm

Detecting comments

The first step in obtaining the above is to find the comments and headings that exist on the page.

  • For each text node in the DOM, excluding those inside blockquotes etc.:
    • If its text contains a timestamp formatted according to the wiki's language, and
    • the text node is preceded by a signature (that is, a link to a user page, user talk page, or user contributions page),
    • then output a comment with the following properties:
      • Range beginning at the first "leaf" node following the previous comment, heading, or start of document; and ending at the end of the "paragraph" containing the signature
      • Indentation level computed as the minimum of the indentation of the beginning and end of the range
      • Signature range from the first detected link to the end of the timestamp
      • Author name parsed from the signature
      • Date and time parsed from the timestamp
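
The detection steps above can be sketched on a simplified model. This is a rough illustration under stated assumptions: instead of a real DOM it takes a flat list of inline nodes in document order (a hypothetical shape), the timestamp pattern is hardcoded to the English format, the look-behind for signature links is unbounded, and comment ranges and indentation are not computed at all.

```javascript
// Assumption: English timestamp format, e.g. "00:09, 24 June 2021 (UTC)".
const TIMESTAMP = /\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)/;
// Links that count as part of a signature: user page, user talk page, contributions.
const USER_LINK = /^(?:User:|User talk:|Special:Contributions\/)(.+)$/;

// nodes: hypothetical flat list of { type: 'text', text } or { type: 'link', target }.
function findSignatures(nodes) {
  const signatures = [];
  nodes.forEach((node, i) => {
    if (node.type !== 'text') return;
    const timestamp = node.text.match(TIMESTAMP);
    if (!timestamp) return;

    // Walk backwards to find the first user link of this signature.
    // Stop at an earlier timestamp, which belongs to a previous comment.
    let firstLink = -1;
    for (let j = i - 1; j >= 0; j--) {
      const prev = nodes[j];
      if (prev.type === 'text' && TIMESTAMP.test(prev.text)) break;
      if (prev.type === 'link' && USER_LINK.test(prev.target)) firstLink = j;
    }
    if (firstLink === -1) return; // a timestamp without a signature is not a comment

    signatures.push({
      author: nodes[firstLink].target.match(USER_LINK)[1],
      timestamp: timestamp[0],
      signatureStart: firstLink, // index of the first detected link
      signatureEnd: i,           // index of the node holding the timestamp
    });
  });
  return signatures;
}

const found = findSignatures([
  { type: 'text', text: 'See the log from 00:09, 24 June 2021 (UTC) for details. ' },
  { type: 'link', target: 'User:Matma Rex' },
  { type: 'text', text: ' (' },
  { type: 'link', target: 'User talk:Matma Rex' },
  { type: 'text', text: ') 00:10, 24 June 2021 (UTC)' },
]);
console.log(found);
```

In this input the first timestamp is not preceded by a user link, so only the second one yields a comment.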

Parsing timestamps

Timestamps are parsed by an algorithm that reverses the steps taken by MediaWiki to output them. Only timestamps that exactly match MediaWiki's date formats are accepted, to guarantee that they can be parsed unambiguously. DST timezones and language variants are supported.
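
As a minimal sketch of the idea of strict, reversible parsing, the following handles only one hardcoded format (English, UTC). That single format, the function name, and the month list are assumptions for the example; the text above describes an algorithm that works from the wiki's own date format and also covers timezones and language variants, which this sketch does not attempt.

```javascript
// Assumption: the wiki outputs timestamps like "00:09, 24 June 2021 (UTC)".
const MONTHS = [
  'January', 'February', 'March', 'April', 'May', 'June',
  'July', 'August', 'September', 'October', 'November', 'December',
];
const TIMESTAMP_RE = new RegExp(
  '(\\d{2}):(\\d{2}), (\\d{1,2}) (' + MONTHS.join('|') + ') (\\d{4}) \\(UTC\\)'
);

// Returns a Date, or null if the text holds no exactly matching timestamp.
function parseTimestamp(text) {
  const m = text.match(TIMESTAMP_RE);
  if (!m) return null;
  const [, hour, minute, day, monthName, year] = m.map((part, i) =>
    i === 4 || i === 0 ? part : Number(part)
  );
  const date = new Date(Date.UTC(year, MONTHS.indexOf(monthName), day, hour, minute));
  // Reject values that Date silently rolls over, such as "31 June" or "25:00".
  if (date.getUTCDate() !== day || date.getUTCHours() !== hour ||
      date.getUTCMinutes() !== minute) {
    return null;
  }
  return date;
}

const parsed = parseTimestamp('B. Matma Rex (talk) 00:09, 24 June 2021 (UTC)');
console.log(parsed.toISOString()); // 2021-06-24T00:09:00.000Z
console.log(parseTimestamp('24 June 2021, 00:09')); // null
```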

Threading comments

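Comments are threaded by indentation level: each comment is assigned as a reply to the nearest preceding item with a lower indentation level, which is either another comment or a heading.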
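
One way to turn a flat, ordered list of items with indentation levels into a tree is a stack of "open" ancestors. The sketch below is an assumption about how this could be done, not a copy of the extension's code; the input shape (`label`, `level`) is hypothetical, and sub-headings are ignored (every heading is treated as level 0).

```javascript
// items: flat list in document order, each with a numeric `level`.
function buildThreads(items) {
  const roots = [];
  const stack = []; // open ancestors, with strictly increasing levels
  for (const source of items) {
    const item = { ...source, replies: [] };
    // An item at the same or a deeper level cannot be the parent: close it.
    while (stack.length > 0 && stack[stack.length - 1].level >= item.level) {
      stack.pop();
    }
    if (stack.length > 0) {
      stack[stack.length - 1].replies.push(item);
    } else {
      roots.push(item);
    }
    stack.push(item);
  }
  return roots;
}

// The example discussion from this page, flattened.
const threads = buildThreads([
  { label: 'A', level: 0 },
  { label: 'B', level: 1 },
  { label: 'C', level: 2 },
  { label: 'D', level: 3 },
  { label: 'E', level: 4 },
  { label: 'F', level: 4 },
  { label: 'G', level: 2 },
  { label: 'H', level: 1 },
  { label: 'I', level: 2 },
]);

const labels = (list) => list.map((item) => item.label).join(',');
console.log(labels(threads[0].replies));            // B,H
console.log(labels(threads[0].replies[0].replies)); // C,G
```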

Assigning ID and name

Item IDs and names are computed based only on the author, date and time, and thread structure. They do not depend on the text of the comment or the heading. This allows identical IDs/names to be assigned to the same comment even if it is modified in later revisions of the page, or the same heading even if it is renamed, and to be identical when language variants are in use.

Item IDs are unique within the page being parsed. If two items were to be otherwise indistinguishable, they are numbered sequentially.

Item names are consistent across all pages and revisions where the item might appear, even when it's moved or changed. In rare cases, multiple comments or headings may have the same name. Use ID to distinguish them if necessary.
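
The difference between names and IDs can be illustrated as follows. The string formats used here (`c-author-timestamp`, and the parent name appended for the ID) are invented for the example and should not be relied on as the extension's actual format; the point is only that the comment text never enters the computation and that otherwise identical IDs are numbered sequentially.

```javascript
// Hypothetical name format: depends only on author and timestamp.
function computeName(comment) {
  return `c-${comment.author.replace(/ /g, '_')}-${comment.timestamp}`;
}

// Hypothetical ID format: name plus thread position, made unique within the page.
function assignIdsAndNames(comments) {
  const usedIds = new Set();
  return comments.map((comment) => {
    const name = computeName(comment);
    const base = `${name}-${comment.parentName}`;
    let id = base;
    // Number otherwise indistinguishable items sequentially.
    for (let n = 1; usedIds.has(id); n++) {
      id = `${base}-${n}`;
    }
    usedIds.add(id);
    return { ...comment, name, id };
  });
}

const ts = '2021-06-24T00:09:00.000Z';
const parentName = 'c-Matma_Rex-' + ts;
const items = assignIdsAndNames([
  { author: 'Matma Rex', timestamp: ts, parentName, text: 'E.' },
  { author: 'Matma Rex', timestamp: ts, parentName, text: 'F.' },
]);
// The same first comment after its text was edited in a later revision.
const edited = assignIdsAndNames([
  { author: 'Matma Rex', timestamp: ts, parentName, text: 'E, reworded.' },
]);

console.log(items[0].name === items[1].name); // true: names may collide
console.log(items[0].id !== items[1].id);     // true: IDs are unique on the page
console.log(edited[0].id === items[0].id);    // true: text edits do not change the ID
```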

Reply tool

Adding reply links

The formatter inserts reply links into the DOM in PHP, as well as comment start and end markers.

Care is taken not to insert them in invalid places, like inside a <style> or a <br> tag.

Item properties from the comment tree data structure are included as a JSON data attribute on the reply links. Together with the markers, they are later used in JS code to reconstruct the comment tree without running the parser.

We use markers instead of directly storing the range to allow some compatibility with other extensions and gadgets that modify the client-side DOM.

Inserting the reply widget

The modifier inserts the reply widget into the DOM in JS, as if the reply widget were a new reply to the comment.

The DOM tree is suitably rearranged to ensure correct indentation level of the reply (wrapper nodes are added, and other nodes may be moved around).

The reply is added below all existing replies to the given comment (and replies to them), at the indentation level of the given comment plus 1.
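
The placement rule can be sketched on the comment tree alone, leaving out the DOM rearrangement. The function names and the returned object shape are assumptions for the example.

```javascript
// Follow the last reply at each level to find the end of the comment's subtree.
function lastDescendant(comment) {
  let current = comment;
  while (current.replies.length > 0) {
    current = current.replies[current.replies.length - 1];
  }
  return current;
}

// Where a new reply to `comment` goes, and at which indentation level.
function planReply(comment) {
  return { insertAfter: lastDescendant(comment), level: comment.level + 1 };
}

// Part of the example discussion from this page: B with replies C (D (E, F)) and G.
const e = { label: 'E', level: 4, replies: [] };
const f = { label: 'F', level: 4, replies: [] };
const d = { label: 'D', level: 3, replies: [e, f] };
const cItem = { label: 'C', level: 2, replies: [d] };
const g = { label: 'G', level: 2, replies: [] };
const bItem = { label: 'B', level: 1, replies: [cItem, g] };

const toB = planReply(bItem);
const toC = planReply(cItem);
console.log(toB.insertAfter.label, toB.level); // G 2
console.log(toC.insertAfter.label, toC.level); // F 3
```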

Saving comments

Saving comments uses the same modifier algorithm, implemented in PHP. The contents of each paragraph in the reply are inserted inside a list item node. Then the HTML is converted back to wikitext using Parsoid, which is saved as a new revision of the page.

When replying in wikitext mode, each line of wikitext is added inside a list item node as a transclusion. Parsoid includes the wikitext unchanged in its output.

Why not wikitext

Saving comments does not operate directly on wikitext, but rather uses HTML throughout the process and Parsoid to convert it. This has some benefits and drawbacks.

Benefits:

  • We do not need to maintain a whole separate parser and modifier that would implement a similar algorithm for wikitext.
  • The reply widget and the actual reply are placed on the page in the same way, so the "preview" will always match the final result.
  • We can more easily recognize "frames" around the content, such as barnstar/wikilove messages, and add replies outside of them, regardless of the markup they use.
  • It better handles edge cases where a single line of wikitext contains fragments of multiple comments (occasionally occurring when the page was previously edited using visual editor).
  • We will not need to make major changes once multi-line list items are introduced in wikitext.
  • Comments transcluded from other pages can usually be detected and replied to.

Drawbacks:

  • Any Parsoid bugs affect the reply tool and can potentially cause content corruption. This was a significant issue at the beginning of the project, but since then we've developed a tool to detect issues and the Parsing team has been fixing them. (One remaining issue is that Parsoid incorrectly handles pages that contain fostered content in HTML (T240280). The reply tool refuses to edit such pages.)
  • Parsoid's handling of HTML comments and whitespace is unintuitive, and it took a lot of effort to get it to produce reasonable wikitext.
  • Comments marked as template-generated but not transcluded from other pages usually can't be replied to.

Transcluded comments

When running the parser on Parsoid HTML, we can use the information about comment ranges from our parser and information about template-generated content from Parsoid HTML to determine whether a comment visible on the page has been transcluded from a different page, and post the reply there.

Parallel implementations

Most of the parser, modifier, and data structure code has two implementations: in PHP and JavaScript. It is a historical accident, as the tools were first prototyped in JS to make it easy to test them with live content on Wikipedia, and then reimplemented in PHP to improve performance (particularly to avoid fetching and sending the full page's Parsoid HTML when saving replies). But once we had them, we kept them both: it helps avoid bugs by comparing the two implementations and allows some client-side actions to happen without consulting the server, e.g. inserting the reply widget.

New discussion tool

Unlike the reply tool, the new discussion tool saves the comment as wikitext, using the existing APIs to add a new section to a page. In visual mode the comment is converted to wikitext first.

Conceptually, in our data structure, adding a new discussion thread is the same as adding a new heading and then adding a top-level comment as a reply to that heading. The interface code reuses much of the reply tool by putting that concept into reality. It seemed like a good idea at the time.

Notifications

Subscribing

Users can subscribe to receive notifications about new replies in the comment tree of an item (including direct replies, and any replies to replies, and so on). We currently only allow subscribing to level 2 headings.

This model could theoretically support subscribing to notifications about replies to any comment or heading. However, it would require much more complexity in the user interface (particularly in managing subscriptions when multiple subscriptions with different states could overlap), so we gave up on it.

Each subscription has the following properties:

  • Subscription item name, that is the name (as defined above) of the heading of the thread. This is used when generating notifications.
  • Subscription link target, that is the page title and section title where this item appeared when the subscription was created. This is not used when generating notifications, and may not match where the thread actually appears (if it was archived, or renamed). It's only intended to be used as a human-readable label when managing subscriptions (not implemented yet).
  • State, subscribed or unsubscribed. Currently unused but intended to be used for unsubscribing from automatic subscriptions.
  • User who is subscribed
  • Time when this subscription was created
  • Time when a notification about the item was last sent

This data is stored in a database table.

Generating notifications

Echo separates the concepts of events and notifications. A single event can result in notifications sent to many users, depending on its user locators (to include users) and user filters (to exclude them).

Whenever an edit to a talk page is saved, Echo compares the previous and new page revision to generate its events, e.g. mentions. DiscussionTools extends this mechanism, and compares the previous and new comment trees to find new comments and generate events for them.
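
Comparing two comment trees to find new comments could look roughly like this. Matching comments by name, and counting occurrences so that comments sharing a name are handled, are assumptions of this sketch rather than a description of the actual implementation; the item shape is hypothetical.

```javascript
// Depth-first list of all items in a forest of threads.
function flatten(items, out = []) {
  for (const item of items) {
    out.push(item);
    flatten(item.replies, out);
  }
  return out;
}

// Comments present in the new revision that cannot be matched to an old one.
function findNewComments(oldRoots, newRoots) {
  const remaining = new Map(); // name -> number of unmatched old comments
  for (const item of flatten(oldRoots)) {
    if (item.type === 'comment') {
      remaining.set(item.name, (remaining.get(item.name) || 0) + 1);
    }
  }
  return flatten(newRoots).filter((item) => {
    if (item.type !== 'comment') return false;
    const count = remaining.get(item.name) || 0;
    if (count > 0) {
      remaining.set(item.name, count - 1); // matched an existing comment
      return false;
    }
    return true;
  });
}

const oldTree = [{ type: 'heading', name: 'h-Alice-1', replies: [
  { type: 'comment', name: 'c-Alice-1', replies: [] },
] }];
const newTree = [{ type: 'heading', name: 'h-Alice-1', replies: [
  { type: 'comment', name: 'c-Alice-1', replies: [
    { type: 'comment', name: 'c-Bob-2', replies: [] },
  ] },
] }];

const added = findNewComments(oldTree, newTree);
console.log(added.map((item) => item.name)); // [ 'c-Bob-2' ]
```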

Each event has the following properties:

  • (built into Echo) Page title
  • (built into Echo) Agent (user who caused the event, by leaving the comment)
  • (built into Echo) Section title
  • (built into Echo) Page revision
  • Subscription item name. A locator is used to include all users subscribed to it in the notifications. Note that we ignore the page title and section title here, and users will still get notifications if the section was renamed or archived to a different page.
  • New comment's ID and name. The ID is used to show a direct link to the comment. The name is intended to be used in the future to allow linking to the comment if it has been archived to a different page.
  • New comment's content, a snippet of which is shown in the notifications
  • List of users who were mentioned in the comment

This data is stored in one of Echo's database tables; however, only the title and agent can be queried directly. Everything else is in a serialized blob.

We generate an event for every new talk page comment, regardless of whether anyone is subscribed to the thread it's in. We generate notifications only for subscribed users.

If the edit would result in an Echo event related to talk pages (that is: mention, mention-summary, or edit-user-talk) as well as a DiscussionTools comment event, we avoid sending double notifications by using a filter to exclude the users who were mentioned and, if the edit was to a user talk page, its owner. Instead we enhance the Echo event with the comment's ID and name to show a direct link to the new comment (rather than just a section where it was added) and the comment's content to show a snippet (unless Echo provided one).
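
The interplay of the locator and the filter described above can be sketched as a plain function. The property names (`subscribedItemName`, `mentionedUsers`, `userTalkOwner`, `itemName`) are hypothetical, and this does not use or imitate Echo's real interfaces; it only shows which users end up receiving the DiscussionTools notification.

```javascript
// event: the new-comment event; subscriptions: rows of { user, itemName }.
function notificationRecipients(event, subscriptions) {
  // Locator: everyone subscribed to the item name, regardless of page or section title.
  const located = subscriptions
    .filter((sub) => sub.itemName === event.subscribedItemName)
    .map((sub) => sub.user);

  // Filter: users already notified through mention or user-talk events.
  const excluded = new Set(event.mentionedUsers);
  if (event.userTalkOwner) {
    excluded.add(event.userTalkOwner);
  }

  return [...new Set(located)].filter((user) => !excluded.has(user));
}

const recipients = notificationRecipients(
  {
    subscribedItemName: 'h-Alice-1',
    mentionedUsers: ['Bob'],
    userTalkOwner: 'Carol', // set only when the edit was to a user talk page
  },
  [
    { user: 'Alice', itemName: 'h-Alice-1' },
    { user: 'Bob', itemName: 'h-Alice-1' },
    { user: 'Carol', itemName: 'h-Alice-1' },
    { user: 'Dave', itemName: 'h-Other-2' },
  ]
);
console.log(recipients); // [ 'Alice' ]
```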

Duplicate notifications

Sections you subscribe to are identified by the username and the timestamp of the oldest comment. If two sections have identical username and timestamp (even on different pages), and you subscribe to one of them, everything behaves as if you had subscribed to both – you'll get notifications for both of them. Unsubscribing from one also unsubscribes you from all others.

It is perhaps not the ideal behavior, but doing the above allows for sections to be moved, renamed, or archived/unarchived, without losing the subscriptions.