Parsoid/PostProcessor:DOM Tag Minimization

The image on the right is a thumbnail of the paper sketch of the algorithm currently implemented in Parsoid. The algorithm runs as a post-processor that minimizes tag use by maximizing tag overlap and merging adjacent identical tags. The sketch is the best way to understand the algorithm. It is currently applied to a set of four HTML tags (B, I, U, and S), but can be extended to other inline tags.

[Figure: Parsoid DOM tag minimization, algorithm sketch]

Example 1:

<b><i>BI</i></b><i>I</i>

gets restructured to:

<i><b>BI</b>I</i>

Example 2:

<b><i><u>BIU</u></i></b><u><i>UI</i></u><i>I</i>

gets restructured to:

<i><u><b>BIU</b>UI</u>I</i>
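
The restructuring in both examples can be reproduced with a small greedy sketch. This is not Parsoid's implementation and does not follow the paper sketch step by step; it is only an illustration, under the assumption that the input can be flattened into text runs, each carrying the set of formatting tags that applies to it. At each position it opens the tag that stays active over the longest stretch of following runs, which pushes long-lived tags outward and merges adjacent identical tags. The names `to_runs`, `minimize`, and `minimize_html` are hypothetical, and any tag outside B, I, U, and S is simply dropped.

```python
from html import escape
from html.parser import HTMLParser
from typing import FrozenSet, List, Tuple

# A run is a piece of text plus the set of formatting tags applied to it.
Run = Tuple[FrozenSet[str], str]
FORMAT_TAGS = {"b", "i", "u", "s"}  # the four tags named in the text


class _RunCollector(HTMLParser):
    """Flattens markup into runs (hypothetical helper, not Parsoid code)."""

    def __init__(self) -> None:
        super().__init__()
        self.open_tags: List[str] = []
        self.runs: List[Run] = []

    def handle_starttag(self, tag, attrs):
        if tag in FORMAT_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close the innermost open occurrence of this tag.
            idx = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[idx]

    def handle_data(self, data):
        self.runs.append((frozenset(self.open_tags), data))


def to_runs(html: str) -> List[Run]:
    collector = _RunCollector()
    collector.feed(html)
    collector.close()  # flush any buffered trailing text
    return collector.runs


def _extent(runs: List[Run], start: int, tag: str) -> int:
    """Index just past the last consecutive run, from start, that has tag."""
    end = start
    while end < len(runs) and tag in runs[end][0]:
        end += 1
    return end


def minimize(runs: List[Run]) -> str:
    out: List[str] = []
    i = 0
    while i < len(runs):
        tags, text = runs[i]
        if not tags:
            out.append(escape(text, quote=False))
            i += 1
            continue
        # Greedy choice: open the tag covering the most consecutive runs;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(tags), key=lambda t: _extent(runs, i, t))
        end = _extent(runs, i, best)
        inner = [(t - {best}, s) for t, s in runs[i:end]]
        out.append(f"<{best}>{minimize(inner)}</{best}>")
        i = end
    return "".join(out)


def minimize_html(html: str) -> str:
    return minimize(to_runs(html))
```

Under these assumptions, `minimize_html("<b><i>BI</i></b><i>I</i>")` yields the output of Example 1, and the input of Example 2 yields its restructured form. The greedy choice is a heuristic and is not claimed to produce the minimum number of tags for every input.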