Parsoid/Deployments/2015

Wednesday, Dec 16, 2015 around 1:25 pm PT: Deployed 64029e12

  • task T86271: Serialize <link>s on own line always
  • task T121174: Carry over paramInfo when unpacking DOMFragment
  • Use Node.replaceChild() instead of delete/insert
  • Non-functional changes:
    • Fix stale paths to core-upgrade.js
    • task T108140: Automate some of the tedium of manual regression testing

Monday, Dec 14, 2015 around 1:15 pm PT: Deployed df3171e6

  • Use babybird as the underlying Promise implementation
    • This is faster than the currently used implementation provided by core-js.
    • We have seen a 30% slowdown in WTS performance since the async WTS version was deployed on Dec 9th.
  • Tweaks to the resource limits enforcing code

Friday, Dec 11, 2015 around 4:20pm PT: Deployed ebd62ab5

  • task T120972: Introduce configurable wt2html/html2wt resource limits

Thursday, Dec 10, 2015 around 2:25pm PT: Config change deployed

  • Config change: reduce request time out to 3 mins (from 4 mins earlier)

Wednesday, Dec 9, 2015 around 1:35pm PT: Deployed a0c626e4

  • task T107818: Record first wikitext node in multi-template-content-block scenario
  • task T104032: Fix html2wt newline constraints for paragraphs
  • task T115720: Refactor WTS to be async
  • Bunch of code cleanup:
    • Cleanup in WikiConfig and parser environment constructors
    • Cleanup list handling in the serializer

Monday, Dec 7, 2015 around 1:15pm PT: Deployed 4a7df427 (cf0b9ef + cherry-pick of d65debd)

  • task T115717: Strip trailing <nowiki />s
  • Update core-js to v1.2.6 and prfun to v2.1.2
  • Consolidate setting separators into a method to ensure consistent updates of SOL state

Wednesday, Nov 18, 2015 around 1:15 pm PT: Deployed e0a4fc91

  • task T115327: Log errors passed along in express
  • task T118715: Improvements to broken attribute parsing in self-closing tags
  • Non-functional changes
    • task T93974: Allocate native extension objects once per doc
    • Removed dead code (Remove unnecessary indent pre stripping for refs)

Monday, Nov 16, 2015 around 1:15 pm PT: Deployed 3a6f3b9e

  • task T118462: Support the newer scrub_wikitext form as well
  • task T53444: Strip <br>s from headings via new HTML normalization routine

Thursday, Nov 12, 2015 around 9:25 am PT: Deployed 392e25eb

  • task T118367: Kill dead code + fix bad perf in pathological scenarios.

Wednesday, Nov 11, 2015 around 1:15 pm PT: Deployed 7ca999c1

  • Remove api/server.js symlink to bin/server.js (no longer needed since the puppet patch updating paths has been merged and deployed)
  • task T88827: Provide srcset attribute for images
  • task T117566: Optimize insertion of transclusion shadow metas -- these metas are added for detecting fostered content from transclusions. This set of patches greatly reduces the volume of these meta tags and improves performance on a subset of pages that previously took too long and caused timeouts.
  • When a template range is expanded to include a table, expand it to include fostered content from it.
  • Code cleanup in template wrapping + removal of some potentially edge case bug scenarios.

Wednesday, Nov 4, 2015 around 1:15 pm PT: Deployed 04893a18

  • Reduce logging volume for empty/li entries + turn off logging for empty/tr entries
  • Put express in production mode by default (enables view caching)
  • Non-functional changes: Code cleanup of the wikitext serializer

Monday, Nov 2, 2015 around 1:25 pm PT: Deployed f0d77afc

  • task T115464: Add ability to sample log requests
  • Log template names that produce stripped empty elements
  • Fix sol handling in separators
  • Update DU.hasDiffMarkers helper
  • Non-functional changes: Reorganization of the Parsoid code repo + code cleanup.
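
The log-sampling item above (task T115464) can be illustrated with a small sketch. This is not Parsoid's implementation; the names `makeSampler` and `sampled` and the meaning of `rate` (roughly one logged entry per `rate` events) are assumptions made for illustration only.

```javascript
// Hypothetical sketch: let only a fraction of log calls through.
// `rate` = number of events per logged entry (rate = 100 means roughly 1 in 100).
// `random` is injectable so the behaviour can be checked deterministically.
function makeSampler(rate, random = Math.random) {
  if (!(rate >= 1)) {
    throw new RangeError('rate must be >= 1');
  }
  return function shouldLog() {
    return random() < 1 / rate;
  };
}

// Wrap a logging function so that only sampled calls reach it.
function sampled(logFn, rate, random) {
  const shouldLog = makeSampler(rate, random);
  return function (...args) {
    if (shouldLog()) {
      logFn(...args);
    }
  };
}
```

A caller would wrap a noisy logger once, for example `const logEmpty = sampled(console.warn, 100);`, and then call `logEmpty(...)` wherever the original logger was used.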

Monday, Oct 26, 2015 around 1:50 pm PT: Deployed 660c59a9

Wikitext -> HTML fixes

  • DSR: Fix bugs in LTR propagation + fix buggy tests in DOMUtil helpers. This fix eliminates O(n^2) behavior in some cases.
  • Fix OOM issue: our old favourite (exp*)+ (cherry-picked to production on Oct 19)
  • An inline_break is a fine way to end a list

HTML -> Wikitext fixes

  • nowiki escaping: Reduce use of fullWrap scenarios

Other fixes

  • Remove forked _http_agent.js
  • Move stack suppression to the logger
  • Remove some dead code from parser.defines
  • Improve ApiRequest logging
  • task T115185: Graph worker exit code / signals (cherry-picked to production on Oct 19)

Monday, October 19, 2015 around 1:20 pm PT: cherrypicked b317f33f and 60a82ae0

  • task T115072: Fix out-of-memory parse errors on some pages (regression since deploy of 44d657de on Wednesday, August 26, 2015)
  • task T115185: Graph worker exit code / signals

These patches are being cherry-picked since master is not currently in deployable state.

Thursday, October 8, 2015 around 1pm PT: 998db843 to be deployed

  • task T86271: Serialize <link>s on own line always (affects newly added categories, magic words, and <*include*> directives).

Continues to be postponed since this deploy is dependent on a patch that needs review and testing. We have been backlogged because of parsing team offsite, vacations, quarterly planning and reviews. This should get unblocked this week.

Thursday, October 1, 2015 around 1:30 pm PT: Deployed 62971510

  • Set Main_Page as the default page name if none is provided in API requests.

This cuts down the errors showing up in Kibana: hundreds of thousands of errors arrived in 3-4 bursts over the last 2 days.

Wednesday, September 30, 2015 around 1pm PT: Deployed 39c60c67

  • task T114185: Support body_only parameter in v3 API.
  • Minor fix to WTS nowiki-ing of links whose hrefs could be magic links but whose content isn't appropriate.
  • task T113666: Terminate autolinks on double or triple quotes
  • task T84937: Terminate autolinks on &nbsp; and numeric entity encodings of <>

Tuesday, September 29, 2015 around 9:15 am PT: Turned on use of ParsoidBatchAPI in production

  • Expected to reduce Parsoid's load on the Mediawiki API cluster
  • Expected to improve parse latencies
  • Improves image handling in some scenarios (task T112631, task T112045)

Monday, September 28, 2015 around 1:45 pm PT: Deployed b9e5244e

  • Update request to 2.63
  • task T113206: Fix batch retries
  • task T105413: Do not allow data-ooui attributes in wikitext
  • Turn on use of ParsoidBatchAPI in production
    • Expected to reduce Parsoid's load on the Mediawiki API cluster
    • Expected to improve parse latencies
    • Improves image handling in some scenarios (task T112631, task T112045)

Wednesday, September 23, 2015 around 1:45 pm PT: Deployed 6619409e

  • Count non-200 http status codes in the API (will show up in grafana)
  • Log 4xx API responses in Kibana
  • task T113044: Render default part of parameters at the top level
  • A bit of bonus cleanup in the tokenizer
  • task T112631: Attempt to match tpl(arg) brace precedence
  • task T111151: Drop <font> tags without attributes if scrubWikitext=true

Monday, September 21, 2015 around 1:25 pm PT: Deployed 9984d221

  • task T31919: Update parsoid sitematrix (et.wikimedia.org -> ee.wikimedia.org and other sitename updates)
  • task T111213, task T111225: Release version 0.4.1
  • task T112686: Use a timer to ensure forward progress in batched dispatches (fixes bug in use of batching API which is not enabled in production)
  • task T112668: Fix denial of client-side upscaling in thumb and frameless format (primarily related to batching API, but also some thumbnail scaling fixes in the non-batching API usecase)

Monday, September 14, 2015 around 1:15 pm PT: Deployed 3d5f4359

Bunch of edge-case tweaks and fixes to parsing of attributes in tables (rows, cells, table) -- improves compatibility with PHP parser output:

  • Pop tableCellArg before parsing template args
  • task T95131: Content on table start / row is all attributes
  • Remove single_cell_table_args
  • Match broken attribute parsing with the PHP parser
  • Handle broken_table_attribute_name_char in table_attributes (improves handling of broken table attributes task T51839, task T95131, task T93769)

Other wikitext -> HTML fixes:

  • TSP: Retokenize tokens that get converted to strings
  • Handle [[[Foo]]] and [[[[Foo]]]] properly

HTML -> wikitext fixes:

  • Move popping EOFTk inside tokenizeStr
  • Nowiki escaping: Process multi-line text nodes line-by-line

Other fixes:

  • Log the signal, if available, when a Parsoid worker exits
  • task T111092: Batching API use (not yet enabled in production): Fix totally broken interpretation of parse batch response

Wednesday, September 9, 2015 around 1:15pm PT: Deployed ffd0b444

npm dependency tweaks to eliminate version variability in installed packages:

  • Shrinkwrap npm dependencies
  • Bump several dependencies to what's in production
  • Prefer tilde ranges in package.json

Logging and error reporting fixes:

  • Downgrade duplicate id warnings
  • DOMDiff: Use more descriptive error prefixes
  • Improved Mediawiki API error reporting for ease of debugging

Wikitext -> HTML fixes:

HTML -> wikitext fixes (specifically nowiki escaping code):

  • Fix logic in hasWikitextTokens when asking for linksOnly

Other:

  • task T111818: Update sitematrix.json for be-tarask and affcom wikis

Wednesday, September 2, 2015 around 1:15pm PT: Deployed 5f2fae6c

  • task T110692: Massage batching API imageinfo width/height to numbers
  • Tabs are preventing nowiki pre protection
  • Consolidate test to determine if separator introduced SOL
  • Implement Sanitizer's escapeId
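
As an illustration of the imageinfo item above (task T110692), the following sketch coerces string dimensions to numbers. It assumes, without confirmation from this log, that the batching API could return `width` and `height` as strings; the function name `normalizeImageInfo` and the object shape are hypothetical.

```javascript
// Hypothetical sketch: coerce string dimensions in an imageinfo-like
// object to numbers, leaving anything non-numeric untouched.
function normalizeImageInfo(info) {
  const out = Object.assign({}, info); // do not mutate the caller's object
  for (const key of ['width', 'height']) {
    if (typeof out[key] === 'string' && out[key].trim() !== '') {
      const n = Number(out[key]);
      if (Number.isFinite(n)) {
        out[key] = n;
      }
    }
  }
  return out;
}
```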

Monday, August 31, 2015 around 1:20pm PT: Deployed c3e4df5e

  • task T110037: WTS support for localized ISBN magic links
  • Be careful about using tsr in tokens/x-mediawiki phase
  • Don't ignore errors in extension parsing
  • task T23261: Support IPv6 addresses in URLs
  • Drop bad extension HTML and continue html2wt instead of returning HTTP 500
  • Let the OS randomize ports
  • Allow non-newline whitespace in RFC/PMID/ISBN autolinks

Wednesday, August 26, 2015 around 1:10 pm PT: Deployed 44d657de

  • Fix the profile quoting in our content type strings (currently in production via a cherry-picked deploy on Tue, Aug 25)
  • task T110206: Fix couple regexps in tokenizer
  • task T110206: Fix html2wt crasher on eswiki:Usme
  • task T110206: Fix pathological backtracking regexp
  • task T100680: Implement Parsoid v3 API (and add test suite)
  • Several cleanups and improvements to attribute parsing in the tokenizer
    • Improve broken attribute heuristics
    • Cleanup _att_value rules
    • Remove resetting the parse position
    • Move location of tokenizing tags in attributes

Tuesday, August 25, 2015 around 3:00pm PT: Deployed c3b037b0 (cherry-pick of 437cac80)

Tuesday, August 25, 2015 around 1:10 pm PT: Deployed 759916fc

  • task T64326: Upgrade express to 4.x from 2.x, use connect-busboy and upgrade other dependencies
  • Finish up fixing profile values in all content-type strings

Monday, August 24, 2015 around 1:15pm PT: Deployed 0b2fbae7

  • serializeChildrenToString shouldn't clobber sol state
  • Allow configuration of the "domain" separate from the MW API URL
  • Deprecate "prefix" parameter of setMwApi/removeMwApi
  • Match separator heuristic to its description
  • Quote the profile in our content type strings

Thursday, August 20, 2015 around 1:30pm PT: Deployed db6e6404

  • task T109686: Fix crasher in normalizer
  • Use rel="mw:WikiLink" for ISBN magic links
  • task T109371: Protect RFC/PMID/ISBN magic links with <nowiki> during WTS
  • Bracketed links must have at least one valid character after protocol
  • task T109358: Escape serialized nowiki DOM elements

Wednesday, August 19, 2015 around 1:15pm PT: Deployed 8d617c99

  • Followup to T93580 fix: Save data-attribs in DOMs of nested refs (improves serialization and editability)
  • task T93580: Fix buggy regexp in strip meta tags DOM pass
  • task T106945: Bare protocols are not autolinks
  • task T107474: Fix <nowiki> escape of | in image captions
  • task T78425, task T108563: Fix WTS of autolink-like text after [^W]
  • task T45888: Batch MW parser and imageinfo API requests (batching disabled currently -- will be enabled once the batching extension is deployed and we test latency impacts).
  • Code cleanup:
    • Remove special case in nowiki serializing
    • WikiConfig: remove dead code for hasValidProtocol / findValidProtocol
    • Convert bugzilla references in source code to phabricator references.
    • Documentation updates

Monday, August 17, 2015 around 1:15pm PT: Deployed 4b656b72

  • task T108563: fix WTS of autolink-like text after [^W]
  • Allow ISBNs which end with a lowercase `x`
  • Support bitcoin:, redis:, urn:, xmpp:, etc protocols (part 2)
  • Newlines in html table attributes are valid
  • Normalizer: Tweaks to <td> escapable prefix normalization
  • Normalizer: Deal with "chameleon node" effect as in 7608aeab
  • WTS: Strip spans added for misnested a-tags
  • Other fixes: documentation, testing related code updates, code cleanup
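
To illustrate the item about ISBNs ending in a lowercase `x`, here is a simplified pattern. It is only modelled on the general shape of an ISBN magic link (optional 978/979 prefix, nine digits, then a digit or check character) and is not Parsoid's actual tokenizer rule; the name `ISBN_RE` is hypothetical.

```javascript
// Hypothetical, simplified ISBN matcher. The final character class accepts
// a digit, "X", or the lowercase "x" that the change above allows.
const ISBN_RE = /\bISBN\s+(?:97[89][ -]?)?(?:[0-9][ -]?){9}[0-9Xx]\b/;

function looksLikeIsbnLink(text) {
  return ISBN_RE.test(text);
}
```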

Wednesday, August 12, 2015 around 1:40pm PT: Deployed a271c205

Monday, August 10, 2015 around 1:10pm PT: Deployed 7b554ce2

  • task T50958, task T107435: Parse non-block image caption all the way to Parsoid DOM
  • task T95730: Scrub empty anchors
  • Support bitcoin:, redis:, urn:, xmpp:, etc protocols
  • DOMDiff: Get rid of 'modified' diff marker - reduces dirty diffs by improving reusability of original wikitext during serialization.
  • HTML pres should permit newline attributes
  • task T108216: Disable pre_indent_in_tags rule for now
  • Check for null nodes in DOM helpers that test for node type

Wednesday, August 5, 2015 around 2:40pm PT: Deployed cherry-picked hotfix ba49b80b

  • Check for null nodes in DOM helpers that test for node type -- should fix crashers on saves to VE edits that involved empty table cells.

Wednesday, August 5, 2015 around 1:25pm PT: Deployed d5a5722c

  • task T93116: Add a space after the | char in table cells if it contains +/- as the first char (fix for new table cells only)
  • Normalize links that end in spaces to prevent nowikis
  • task T107652: Don't strip <ref> span tags in templated <td>-attr scenarios
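
The table-cell item above (task T93116) exists because `|-` starts a new table row and `|+` starts a table caption in wikitext, so a new cell whose content begins with `-` or `+` would be misread. A minimal sketch of the idea, with the hypothetical name `serializeNewCell` and covering only the single-cell-per-line form:

```javascript
// Hypothetical sketch: serialize a new table cell, inserting a space after
// the pipe when the content would otherwise form "|-" (row) or "|+" (caption).
function serializeNewCell(content) {
  const needsSpace = /^[+-]/.test(content);
  return '|' + (needsSpace ? ' ' : '') + content;
}
```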

Monday, August 3, 2015 around 1:25pm PT: Deployed 38d0cdb1

  • Enforce single-line context for definition lists
  • task T65642, task T76377: Additional scenarios dealing with treebuilder fixup
  • task T107622: <nowiki> tags don't properly protect table-related content
  • Remove smart nowikier
    • nowiki wrappers are now added around smallest string (instead of trying to minimize nowiki additions).
    • Addresses comments like this and others in the past.
  • Update sitematrix.json
    • Fetched latest changes in wiki configs - gom, lrc, azb wikis added + TLS added to most urls
  • Update domino to 1.0.19

Wednesday, July 29, 2015 around 1:30pm PT: Deployed 6e095a92

  • Move sol transparent link hoisting behind scrubWikitext (since VE is now passing in that API flag)
  • Disable single-line wikitext mode in selser in the same places as in non-selser serialization
  • task T104554: Prevent nowiki protection around leading whitespace in paragraphs by deleting that whitespace.

Monday, July 27, 2015 around 1:30 pm PT: Deployed 92f1cd6d

  • Bug fix stripping indent-pre nowikis in scrubWikitext mode

Wednesday, July 22, 2015 around 1:15pm PT: Deployed 6befc44e

  • task T104502: Redirects no longer create categories
  • task T104918: Fix redirects to non-local targets
  • task T103364: Edited autolink-like text becomes an autolink
  • task T105997: Fix crash on __proto__
  • Escape data-mw as well as data-parsoid in tokenizer
  • Refactor comment regexp into a constant and reuse everywhere
  • Use the new fork of PEG.js master
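
This log does not say what exactly crashed on `__proto__` (task T105997). As general background only, the sketch below shows why that string is a hazardous key when a plain object is used as a lookup table, and two common alternatives. All variable names are illustrative and none of this is taken from the Parsoid patch.

```javascript
// On a plain object, "__proto__" is an inherited accessor, not a normal key.
// Assigning a string to it is silently ignored, so the entry is never stored.
const plain = {};
plain['__proto__'] = 'some value';
const plainHasKey = Object.prototype.hasOwnProperty.call(plain, '__proto__');

// An object without a prototype has no such accessor, so the key behaves normally.
const bare = Object.create(null);
bare['__proto__'] = 'some value';
const bareHasKey = Object.prototype.hasOwnProperty.call(bare, '__proto__');

// A Map treats every string key the same way.
const map = new Map();
map.set('__proto__', 'some value');
```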

No deployments week of July 13 - 17th

Parsoid deployments paused this week because of Wikimania. Only emergency cherry-picks, when required, this week.

Wednesday, July 8, 2015 around 1:25pm PT: Deployed c4cfc527

  • Scrub empty styles tags (if scrubWikitext API param is enabled)
  • Scrub whitespace at the start of paragraphs (if scrubWikitext API param is enabled)
  • Disentangle versioned APIs
  • task T102117: Improve validating dp in the api
  • Remove old-style url redirects
  • Tweak td-fixup dom pass to handle some unhandled scenarios
  • Generate <head> only for the final document

Monday, July 6, 2015 around 2:10pm PT: Deployed 87a746e6

  • Bump HTML version because of cite html changes
  • task T86782: Use CSS to style Cite references

Monday, June 29, 2015 around 1:20pm PT: Deployed ea98be88

  • Suppress newlines before category links + Don't swallow newlines & categories into last <li> of a list (Fixes task T95988, related to task T2087).
  • task T96673: Serialize new display space hacks.

Monday, June 22, 2015 around 1:25pm PT: Deployed d488783e

  • task T91411: Tokenizer incorrectly parses a "!!" inside a HTML <td> cell as a <th>
  • Newlines in comments shouldn't affect SOL state
  • Give nested blocks a chance to break on end delimiters
  • Only normalize new nodes
  • task T94723: Fix serialization of `mw:WikiLink` which use absolute URLs
  • task T69540: Include RL style modules from parser functions in <head>
  • Use DOMTraverser instead of DOMUtils.traverseWithTplOrExtInfo
  • Further tests and fixes to SOL behavior switches
  • Make tokenizer errors be more vague
  • Remove the last use of peg$FAILED from the PEG grammar
  • Eliminate the possibility of expansion reuse for private routes

Wednesday, June 17, 2015 around 1:21pm PT: Deployed 402ddf66

  • Don't stop on "!!" in templates
  • More cleanup in the tokenizer
  • task T97430: Ignore marker meta tags during nowiki escaping
  • Refine DSR algo to use end-tag width info in the right context
  • Fix bug in computation of end-tag widths of wikitext constructs
  • Update sitematrix to include cnwikimedia
  • task T99802: Don't prevent fostering of meta tags in our DOM spec
  • task T102117: Return 400 if the passed in data-parsoid is empty

This is a repeat of Monday's postponed deploy.

Monday, June 15, 2015 around 1:15pm PT: to be deployed 402ddf66 (cancelled)

  • Don't stop on "!!" in templates
  • More cleanup in the tokenizer
  • task T97430: Ignore marker meta tags during nowiki escaping
  • Refine DSR algo to use end-tag width info in the right context
  • Fix bug in computation of end-tag widths of wikitext constructs
  • Update sitematrix to include cnwikimedia
  • task T99802: Don't prevent fostering of meta tags in our DOM spec
  • task T102117: Return 400 if the passed in data-parsoid is empty

We couldn't perform pre-deploy checks on the beta cluster since VisualEditor was broken there. Postponing deploy to Wednesday.

Monday, June 8, 2015 around 1:15pm PT: Deployed 131554ba

  • More thorough job of stripping unneeded data-parsoid from templated content
  • Code cleanup and improvements in PEG tokenizer
  • Minor code refactoring in serializer and template encapsulation code

Saturday, June 6, 2015 around 4:40 PT: 5172a446 (cherry-pick of 719c736f) deployed as a hotfix

  • task T101599: Don't hoist category links out of headings when they come from templates

Wednesday, June 3, 2015 around 1:15pm PST: Deployed ab675400

  • Be more careful about which MW API warnings we suppress
  • task T97386: Make behavior switches SOL transparent
  • API: If "wt" parameter is passed in, set it as the page source unconditionally
  • task T100225: DOM normalization: Move meta-tag hoisting from core serializer to DOM normalization pass
  • DOM normalization: Merge adjacent <a> tags with identical attrs

Monday, June 1, 2015 around 1pm PST: Deployed 73445bfd

This is the same as the previous deploy attempt:

  • task T73161: Support subst: of transclusion blocks in the parseFragment API endpoint
  • DOMDiff: For <ref> id properties in data-mw, fetch HTML and compare DOMs to detect edits to <ref>s without requiring clients to dirty the <ref> nodes
  • DOMDiff: Improve robustness of data-mw diff testing
  • Suppress separators in single-line context (part of task T52683)
  • task T86882, task T87513: Make hardcoded config values configurable
  • Blank template parameters should be preserved
  • Code cleanup in mediawiki wiki config
    • Use interwikiMap, not mwApiMap, to normalize titles
    • Store apiConf as an object in the mwApiMap
    • Make proxy_strip_https into a general proxy configuration option
  • Code cleanup and fixes in mediawiki API request handling
    • Refactor request default options into ApiRequest.prototype.request()
    • Strip UTF8 BOM so that JSON.parse() doesn't throw
    • Ignore modulemessages in api=parse result
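
The BOM item above reflects a general property of `JSON.parse()`: a leading U+FEFF byte order mark is not valid JSON whitespace, so parsing throws. A minimal sketch of the workaround, using the hypothetical helper name `parseJsonStrippingBom` (not the name used in Parsoid):

```javascript
// Hypothetical sketch: drop a leading byte order mark before parsing.
function parseJsonStrippingBom(text) {
  const clean = text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
  return JSON.parse(clean);
}
```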

Plus two new cherry-picked patches:

  • task T100696: suppress modulemessages deprecation warnings in logs.
    • A new version of mediawiki core was deployed earlier in the day which caused a spike in these warning messages. With this patch, we are suppressing all warning/api messages.
  • Fix typo in config property used for sampling heap usage
    • this should fix the outgoing network spike seen in previous attempt.

Thursday, May 28, 2015 around 12:40pm PST: 497da30e to be deployed (Reverted)

  • task T73161: Support subst: of transclusion blocks in the parseFragment API endpoint
  • DOMDiff: For <ref> id properties in data-mw, fetch HTML and compare DOMs to detect edits to <ref>s without requiring clients to dirty the <ref> nodes
  • DOMDiff: Improve robustness of data-mw diff testing
  • Suppress separators in single-line context (part of task T52683)
  • task T86882, task T87513: Make hardcoded config values configurable
  • Blank template parameters should be preserved
  • Code cleanup in mediawiki wiki config
    • Use interwikiMap, not mwApiMap, to normalize titles
    • Store apiConf as an object in the mwApiMap
    • Make proxy_strip_https into a general proxy configuration option
  • Code cleanup and fixes in mediawiki API request handling
    • Refactor request default options into ApiRequest.prototype.request()
    • Strip UTF8 BOM so that JSON.parse() doesn't throw
    • Ignore modulemessages in api=parse result

This is the same as yesterday's attempted deploy, which we had to defer due to task T100439.

Reverted after observing an outgoing network traffic spike on our canary deploy machine (wtp1001). We initially suspected a stats or logging misconfiguration. The cause turned out to be a typo in one of the parsoid-config properties that determines the heap usage sample interval. Because of the typo, instead of sending heap usage samples every 5 mins, Parsoid was sending samples all the time, which caused the network spike seen on wtp1001.
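
One plausible mechanism for this class of bug, offered as an assumption rather than a description of the actual Parsoid code: a misspelled property name reads as `undefined`, and Node.js timers treat an invalid delay as the minimum delay, so a job meant to run every 5 minutes runs almost continuously. The sketch below shows a defensive lookup; the property names and the helper `resolveInterval` are invented for illustration.

```javascript
// Hypothetical config; the property name is invented for illustration.
const config = { heapUsageSampleInterval: 5 * 60 * 1000 };

// Resolve a timer delay defensively instead of handing `undefined`
// (from a misspelled key) straight to setInterval.
function resolveInterval(conf, key, fallbackMs) {
  const value = conf[key];
  if (typeof value !== 'number' || !Number.isFinite(value) || value <= 0) {
    return fallbackMs;
  }
  return value;
}

const good = resolveInterval(config, 'heapUsageSampleInterval', 300000);
// Misspelled key: the lookup yields undefined, so the fallback is used.
const typo = resolveInterval(config, 'heapUsageSamplingInterval', 300000);
```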

Wednesday, May 27, 2015 around 1pm PST: 497da30e to be deployed (Cancelled)

  • task T73161: Support subst: of transclusion blocks in the parseFragment API endpoint
  • DOMDiff: For <ref> id properties in data-mw, fetch HTML and compare DOMs to detect edits to <ref>s without requiring clients to dirty the <ref> nodes
  • DOMDiff: Improve robustness of data-mw diff testing
  • Suppress separators in single-line context (part of task T52683)
  • task T86882, task T87513: Make hardcoded config values configurable
  • Blank template parameters should be preserved
  • Code cleanup in mediawiki wiki config
    • Use interwikiMap, not mwApiMap, to normalize titles
    • Store apiConf as an object in the mwApiMap
    • Make `proxy_strip_https` into a general proxy configuration option
  • Code cleanup and fixes in mediawiki API request handling
    • Refactor request default options into ApiRequest.prototype.request()
    • Strip UTF8 BOM so that JSON.parse() doesn't throw
    • Ignore modulemessages in api=parse result

Because of task T100439, we cannot currently verify the deploy by examining the wikitext diffs of VE edits to confirm that nothing broke. Parsoid deploys are paused until that ticket is resolved and a patch is deployed to production.

Wednesday, May 20, 2015 around 1pm PST: Deployed 8ed6fd0b

  • task T94509: Add mw:DisplaySpace to typeof for nbsp before colon
  • task T96279: Provide section-offsets for immediate children of <body> to support section editing in VE and other clients

Monday, May 18, 2015 around 1:10pm PST: Deployed 8ed3e503

  • task T93824: Put escaped HTML tags inside <nowiki>
  • task T96923: html2wt should not need access to original source
  • Restore speedy non-selser serialization
  • Don't use selser if oldid is missing

Wednesday, May 13, 2015 around 1:25pm PST: Deployed a8108fe6

  • task T96090: Allow quotes as template targets
  • Normalize empty headings only if they are newly inserted content
  • A bunch of code cleanup patches (including some refactoring of server configuration)

Monday, May 4, 2015 11:44am PST: Deployed b53a7272

  • Avoid deep freezing some parsoidConfig properties
    • This patch fixes the bug that kept the Parsoid service from starting up in production and caused the revert on Wednesday, April 29
  • Ensure that embedded Maps and Sets are properly deep-frozen
  • Freeze parsoidConfig to avoid shared mutable state
  • Remove uri fallback when switching wiki configs
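
The deep-freeze items above touch on a JavaScript detail: `Object.freeze()` on a Map or Set only freezes its own properties, and `set()`, `add()`, `delete()` and `clear()` keep working. The sketch below shows one way to deep-freeze a config object including embedded collections. It is an illustration under that assumption, not the code Parsoid used; `deepFreeze` and `readOnly` are hypothetical names.

```javascript
// Hypothetical sketch: recursively freeze an object graph, and make
// embedded Maps and Sets read-only by shadowing their mutating methods.
function deepFreeze(value, seen = new Set()) {
  if (value === null || typeof value !== 'object' || seen.has(value)) {
    return value;
  }
  seen.add(value); // guard against cycles
  if (value instanceof Map) {
    value.forEach((v, k) => { deepFreeze(k, seen); deepFreeze(v, seen); });
    value.set = value.delete = value.clear = readOnly;
  } else if (value instanceof Set) {
    value.forEach((v) => deepFreeze(v, seen));
    value.add = value.delete = value.clear = readOnly;
  } else {
    Object.getOwnPropertyNames(value).forEach((name) => deepFreeze(value[name], seen));
  }
  return Object.freeze(value);
}

function readOnly() {
  throw new TypeError('frozen collection is read-only');
}
```

Shadowing the mutators is a deliberate simplification: `Map.prototype.set.call(frozenMap, ...)` would still bypass it, so this guards against accidental mutation rather than a determined caller.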

Wednesday, April 29, 2015 around 1pm PST: 45b54f63 to be deployed (Reverted)

  • Freeze parsoidConfig to avoid shared mutable state
  • Remove uri fallback when switching wiki configs

See outage report for more details.

Monday, April 27, 2015 around 1pm PST: Deployed ebdac59b

  • task T97207: Forward the X-Request-ID header
  • task T97204: Exponentially increase the request timeout
  • Reduce API concurrency and retries (to deal with overload on API cluster)
  • Don't strip \r in API routes
  • Remove redundant \r handling
  • Upgrade to prfun 2.0.0 and smash the global Promise
  • Performance: Use core-js/shim instead of es6-shim
  • A lot of code cleanup
    • This includes bcea0ab0 which is a fix for the cleanup patch 915ea3f6 which was causing last week's corruptions.
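
The item on exponentially increasing the request timeout (task T97204) can be sketched as follows. The base timeout, growth factor and cap below are assumptions chosen for illustration; this log does not state the values Parsoid used, and `timeoutForAttempt` is a hypothetical name.

```javascript
// Hypothetical sketch: the timeout grows geometrically with each retry
// and is capped so that it cannot grow without bound.
function timeoutForAttempt(attempt, baseMs = 1000, factor = 2, maxMs = 60000) {
  return Math.min(maxMs, baseMs * Math.pow(factor, attempt));
}
```

With these assumed defaults, attempt 0 waits up to 1000 ms, attempt 3 up to 8000 ms, and later attempts are capped at 60000 ms.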

Saturday, April 25, 2015 around 8:25 am PST: Deployed fca17070 (cherry-pick of d2135c6b on parsoid master)

Cherry-picked "Reduce API concurrency and retries" from parsoid master to reduce the number of retries and the concurrency level with which the Mediawiki API is hit.

Friday, April 24, 2015 around 12:50 pm PST: Reverted deploy to 3311936a

Thursday late night deploy reverted due to corruptions reported.

See outage report for more details

Thursday, April 23, 2015 around 11:45pm PST: Deployed d2135c6b

This was meant to be an emergency deploy of one patch but unintentionally deployed all changes from master.

  • Reduce API concurrency and retries (to deal with overload on API cluster)
  • Don't strip \r in API routes
  • Remove redundant \r handling
  • Upgrade to prfun 2.0.0 and smash the global Promise
  • Performance: Use core-js/shim instead of es6-shim
  • A lot of code cleanup

Wednesday, April 22, 2015 around 1:05pm PST: Deployed 3311936a

  • task T95794: Enforce <pre> for all lines when escaping wikitext
  • Fix base href on _rt routes
  • Accept scrubWikitext as a query parameter

Monday, April 20, 2015 around 1pm PST: Deployed 0cabb5b2

  • task T94867: Suppress empty headings if scrubWikitext param is provided
  • Add a scrubWikitext param to the API to (optionally) apply normalizations that won't roundtrip
  • task T93368: Fix crasher seen in production
  • task T96197: <ref> marker metas should remain fosterable
  • Log uncaught exceptions in Parsoid service
  • Edge case bug fix in migrateTrailingNLs DOM pass (for example, in en:SM U-66)
  • Other code cleanup that doesn't affect functionality

Wednesday, April 15, 2015 around 1:20pm PST: Deployed ac7a01b9

  • Bug fix serializing nested refs (would refuse to save because of missing <ref> content)
  • Bug fix in selser tests that sometimes normalized element attributes unnecessarily
  • Handle empty content string ("") returned by the API
  • Normalize DOM before running DOM-Diff
  • Fix findFirstEncapsulationWrapperNode -- eliminates dirty diffs in some edge case scenarios
  • Other code cleanup that doesn't affect functionality

Monday, April 13, 2015 around 1pm PST: 8f35374d (skipped)

  • Bug fix serializing nested refs (would refuse to save because of missing <ref> content)
  • Bug fix in selser tests that sometimes normalized element attributes unnecessarily
  • Handle empty content string ("") returned by the API
  • Normalize DOM before running DOM-Diff
  • Other code cleanup that doesn't affect functionality

Deploy postponed because beta cluster is down and it is not possible to verify this in beta cluster beforehand.

Wednesday, April 8, 2015 around 1pm PST: Deployed a76bd8a3

Other changes:

  • Various code style tweaks and clean ups.

Monday, April 6, 2015 around 1pm PST: Deployed d5aa726eb

  • task T94055: Normalize comments so that Parsoid output is valid XML
  • Edge-case fix for hoisting embedded <link>s from headings
  • task T94799: Preserve querystring params while redirecting
  • Don't serialize <a> tags as <a> ever
  • task T93973: Remove state from Cite extension
  • Log with supplied x-request-id header

Monday, Mar 30, 2015 around 1pm PST: Deployed 29a5dafb

  • Skip link validity tests for strings that won't be used as hrefs: Eliminates erroneous "bad title text" logging messages
  • cleanupAndSaveDataParsoid should be done in its own pass: Fixes incorrect HTML generated in <li>-hack scenarios when v2 API is used
  • Replace duplicate ids in wikitext: Allows Parsoid to handle pages with duplicated ids without corrupting them on serialization (task T93739 is an instance of this)
  • task T93926: Never serialize a-tag as html
  • task T64881: Add original dimension information for images.
  • task T93839: Normalize wikilink targets to strip leading "./"
  • task T63165, task T93715: Ensure reference index is reset at the end of document
  • Use tokenizer info to fix/cleanup tdFixups dom pass
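
The wikilink normalization item above (task T93839) amounts to dropping a leading "./" from a link target. A minimal sketch, with the hypothetical name `stripDotSlash`; only a prefix at the very start of the target is removed:

```javascript
// Hypothetical sketch: strip one leading "./" from a wikilink target.
function stripDotSlash(target) {
  return target.replace(/^\.\//, '');
}
```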

Wednesday, Mar 25, 2015 around 1pm PST: Deployed 0313fcc7

  • task T87069: Pop comments from the end of table tag attributes
  • Strip out X-Parsoid-Performance headers and associated code -- no longer useful since Parsoid now sends lots of metrics to statsd
  • Bug fix setting TSR in defn lists - fixes DSR inconsistency warnings
  • task T93369: Nulls in DSR computation should not be coerced to 0
  • Edge case fix for definition lists: Only return colon when ignoring in tags
  • Associate data-parsoid with duplicated ids (copy-paste in VE can introduce duplicate element ids)
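
The item about nulls in DSR computation (task T93369) relates to a general JavaScript pitfall: in arithmetic, `null` is silently converted to 0, so an unknown offset can turn into a plausible but wrong number. The sketch below shows the pitfall and a null-preserving helper; `addOffset` is a hypothetical name and this is not the actual DSR code.

```javascript
// In JavaScript arithmetic, null is coerced to 0, so an "unknown" offset
// quietly becomes a concrete value.
const coerced = null + 5;

// Hypothetical helper: keep "unknown" as null instead of letting it
// take part in arithmetic.
function addOffset(offset, delta) {
  return offset === null || offset === undefined ? null : offset + delta;
}
```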

Monday, Mar 23, 2015 around 1:25pm PST: Deployed a5d7483f

  • task T88081: Fix tokenizing redirect context
  • Use more specific warning labels to help sift through logs in Kibana
  • Use fatal/request instead of fatal when we can't serialize a <ref> ( https://gerrit.wikimedia.org/r/198176 ). This should send a 500 response, not kill the entire worker.

Thursday, Mar 19, 2015 around 6:45 pm PST: Deployed 99d1b214

  • task T93228: Don't strip id attributes from DOM nodes -- required for <ref> tags
  • task T73708: Serialize category redirects with a ':'
  • Additional logging to help debug Visual Editor issues

Thursday, Mar 19, 2015 around 9 am PST: Deployed f5f5f0ed

  • task T93228: Abort html -> wt serialization when we encounter a <ref> DOM id without a matching DOM element
  • Log errors when Parsoid-like element ids are stripped from HTML elements

Wednesday, Mar 18, 2015 around 1pm PST: Deployed b48f6e25

  • Don't serialize HTML id attributes with Parsoid-like elt ids
  • task T54341: Ensure that alt image option is handled properly even when it has complex wikitext
  • v2 API: Explicitly set a utf-8 charset in text content-types

Monday, Mar 16, 2015 around 1pm PST: Yes Deployed ccf4c140

  • task T69850, task T90028: Handle entities/nowikis in templated attributes
  • task T52683: Enforce single-line context in the serializer
  • task T71123: Table cells not properly parsed in an implicit-td context
  • task T53961: Improve escaping and nowikiing of template arguments
  • Additional fixes to selective serializer around reusing original source in lists and list items
  • Additional instrumentation (input/output sizes, init times) of Parsoid endpoints

Wednesday, Mar 11, 2015 around 1pm PST: Yes Deployed 73bf3162

  • task T88318: Fix serialization of table cells with "-" and "+" in them
  • task T71482: Convert | to {{!}} in template parameters
  • task T92177: Eliminate fatal assertion failures seen in production (found on kibana)
  • task T71950: Improvements to <nowiki> wrapping for strings that needed them
  • Fixes to DSR computation algo to eliminate negative DSR deltas (should eliminate the warnings seen in kibana)
  • Updated sitematrix.json to latest changes
  • Explicitly pass rawcontinue=1 to the MediaWiki API (to eliminate deprecation warnings logged on the MediaWiki API end)
  • Log MediaWiki API warnings (so we can find and fix API deprecations in future)
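As a rough illustration of the T71482 item above, the sketch below replaces pipes in a template parameter value with {{!}} so that they are not read as parameter separators. It is a deliberately naive example, not Parsoid's serializer: the function name `escapeTopLevelPipes` is hypothetical, and it only skips pipes nested inside `[[...]]` or `{{...}}` pairs, ignoring nowiki, extension tags, and triple-brace arguments.

```javascript
// Hypothetical helper (not Parsoid's API): escape pipes that sit at the
// top level of a template parameter value, leaving pipes inside nested
// links or templates alone.
function escapeTopLevelPipes(value) {
  let depth = 0; // nesting depth of [[...]] and {{...}} pairs
  let out = '';
  let i = 0;
  while (i < value.length) {
    const two = value.slice(i, i + 2);
    if (two === '{{' || two === '[[') {
      depth += 1;
      out += two;
      i += 2;
    } else if ((two === '}}' || two === ']]') && depth > 0) {
      depth -= 1;
      out += two;
      i += 2;
    } else {
      // Only a pipe outside any nested construct needs escaping.
      out += (value[i] === '|' && depth === 0) ? '{{!}}' : value[i];
      i += 1;
    }
  }
  return out;
}

console.log(escapeTopLevelPipes('a|b'));             // a{{!}}b
console.log(escapeTopLevelPipes('[[Foo|bar]] | x')); // [[Foo|bar]] {{!}} x
```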

Monday, Mar 9, 2015 around 1pm PST: Yes Deployed c8370a48

Wednesday, Mar 4, 2015 around 1pm PST: Yes Deployed 06c8cf33

  • task T90517: Fix selser bugs that would occasionally lose newly added comments
  • task T85782: Fix broken serialization in some scenarios after table columns are deleted
  • Fix broken performance timer code (broken in the Monday, Mar 2 deploy)

Monday, Mar 2, 2015 around 1:15pm PST: Yes Deployed 08643f53

  • task T88290: Remove duplication of <ref> content in the data-mw.body.html property of <ref> tags
  • task T88017: Remove more cases of data-parsoid.src from mw:Extensions
  • Memory usage reports are now generated once every 5 mins and sent to the statsd server

Wednesday, Feb 25, 2015 around 1:00 pm PST: Yes Deployed 5a3aaf71

  • Serialize new anchor links (w/o rel) as internal
  • Amend timing metrics
  • task T87708: Open tags only affect line when parsing definition list colon
  • task T90452: Fix nowiki escaping for <td>

Monday, Feb 23, 2015 around 1pm PST: Yes Deployed d9ac8c21

  • Workaround for task T90463. (Will be reverted once citoid bug is fixed: task T90479.)
  • task T90309: Ensure that implicitly-added <references> output have unique ids
  • Fix serializing categories without indent-pre protection (tweaks #REDIRECT handling as well)
  • Don't crash when revision is hidden
  • task T88495: Handle more templated <td>-attr scenarios
  • Tweaked naming of selser-related timing stats.
  • Enable timing stats in production (localsettings.js change).

Wednesday, Feb 18, 2015 around 1:30 pm PST: Yes Deployed 17f68256

  • task T88660: Emit reflists for <ref> with no explicit <references>
  • task T85232, task T66171: Enable timing stats for Parsoid wt2html and html2wt requests
    • not yet enabled in production (requires change to localsettings.js)

Monday, Feb 16, 2015 around 1pm PST: Yes Deployed 86e76a30

  • task T51075: Handle template-generated DISPLAYTITLE and DEFAULTSORT
  • task T89383: Fix selser regression introduced in the Feb 11 deploy
  • task T89411: Fix selser in v2 API (to be used by RESTBase)
  • For older MW APIs that don't provide the list of supported link protocols, default to the cached enwiki config

Wednesday, Feb 11, 2015, around 1:35pm PST: Yes Deployed 4fc3b43d

  • task T88017: Remove data-parsoid.src for elts with valid data-mw and DSR info
  • task T88019: Remove unnecessary <meta> transclusion tags
  • Fixes to handle high load on Parsoid cluster
    • Don't reprocess same token in AttributeExpander unless necessary (eliminates infinite loop scenarios found on some pages)
    • Fixes to make sure fatal errors more consistently force process restarts without leaving behind stuck processes
  • Categories on their own line don't need nowikis around any leading whitespace
  • Non-word characters shouldn't terminate tag names
  • task T52373: Hoist categories, language links, redirects, comments out of headings when serializing them
  • task T72960: Fix serializing new links with "./" in content string

Monday, Feb 9, 2015, around 1pm PST: dd98dea0 to be deployed (Cancelled)

  • task T88017: Remove data-parsoid.src for elts with valid data-mw and DSR info
  • task T88019: Remove unnecessary <meta> transclusion tags
  • Fixes to handle high load on Parsoid cluster
    • Don't reprocess same token in AttributeExpander unless necessary (eliminates infinite loop scenarios found on some pages)
    • Fixes to make sure fatal errors more consistently force process restarts without leaving behind stuck processes
  • Categories on their own line don't need nowikis around any leading whitespace
  • Non-word characters shouldn't terminate tag names
  • task T52373: Hoist categories, language links, redirects, comments out of headings when serializing them

Deployment cancelled. We found some regressions, and the fixes for them are still going through round-trip testing at this time, so we'll get these out on Wednesday.

Friday, Feb 6, 2015, around 9:10pm PST: Hotfix of Gerrit change 189036 cherry-pick

The Jan 28, 2015 deploy of task T48811 exposed a longstanding bug in Parsoid, which was fixed by Gerrit change 189036. On some pages, because some templates weren't being expanded (task T88864), the Attribute Expander was effectively being asked to re-expand the same template over and over in an infinite loop. This was triggered on a few enwiki pages today and caused processes to get stuck without being restarted. This hotfix prevents the infinite loop.

Friday, Feb 6, 2015, around 11:20am PST: Hotfix of Gerrit change 188982 cherry-pick

A bug in our process restart (on fatal errors) was exposed by unrelated bugs in our parse pipeline which manifested as stuck processes on the cluster. This hotfix fixes that by ensuring that fatals continue to restart processes.

Wednesday, Feb 4, 2015, around 1pm PST: Yes Deployed dd4721f4

  • Switch to using the compression package instead of the outdated version bundled with connect. In testing, that cleared up the memory leak noticed since the Jan. 15th deploy.
  • Some cleanup including:
    • Changing a few on handlers to once.
    • Using request's qs option for apiargs instead of stringifying those manually.
    • Better error handling for config requests.
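The cleanup item about changing on handlers to once refers to the two listener-registration methods of Node's EventEmitter. The sketch below shows only the difference in behavior between them. The emitter and the counters are stand-ins invented for this example and do not reflect the handlers actually changed in Parsoid.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a request or response object.
const stream = new EventEmitter();

let onCalls = 0;
let onceCalls = 0;

// A handler registered with `on` runs on every emission.
stream.on('error', () => { onCalls += 1; });

// A handler registered with `once` is removed after its first run,
// so a repeated event cannot trigger the same cleanup twice.
stream.once('error', () => { onceCalls += 1; });

stream.emit('error', new Error('first'));
stream.emit('error', new Error('second'));

console.log(onCalls, onceCalls); // 2 1
```

Using once also lets the listener be released after it fires, which can matter when many short-lived handlers are attached to a long-lived object.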

Monday, Feb 2, 2015, around 1pm PST: Yes Deployed e3c9ae99

  • Set X-Forwarded-Proto when proxying https. This fixes timeouts for ruwikinews which is strict about accepting only https connections.
  • Some performance tweaks to attribute expander to eliminate useless work and useless memory allocation
  • task T86902: Fixes to resource module loading URI in the <head> section of Parsoid HTML
  • Fixes to tokenizer to ensure that strings starting with '-' are parsed for directives like language variant markup

Friday, Jan 30, 2015 around 2:35 pm PST: Yes Deployed 2abd0eb6

The Jan 15th deploy, in which Parsoid started using sitematrix info for configuring wikis, was missing special handling for some wikis (commonswiki being one of them). This caused timeouts, which repeatedly exercised an existing memory leak. The result was a slow buildup of leaked memory on the cluster and a higher than normal CPU load. This special Friday deploy fixed the config issues.

Specifically, the following two patches were deployed:

  • Some special wikis should use the default proxy
  • Strip TLS from sitematrix url if we're using the default proxy

Wednesday, Jan 28, 2015 around 1pm PST: Yes Deployed 88605a4a

  • task T48811: Correctly handle templates that generate part-attribute and part-content of a DOM node.
  • task T73412: Preserve blank template parameters
  • task T71859: Cleanup of behavior switch production
  • Updates to wikitext serializer to simplify and enable more robust wikitext escaping
  • task T66300, task T67278, task T73462: Magic link fixes (wt2html and html2wt nowiki handling)

Thursday, Jan 15, 2015 around 1pm PST: Yes Deployed 2fdf9298

On Jan 14th at 1:20 pm PST, we reverted Parsoid to the older deployed version after dirty diffs were seen during post-deploy testing. It turned out that the dirty diffs weren't related to the Parsoid deploy, but now that those issues have been fixed, we'll revisit the Parsoid deployment on Thursday.

Monday, Jan 12, 2015 around 1pm PST: Yes Deployed 2cd6fefa

  • Include location of titles in timeout logs
  • Tweaks to Parsoid's cite port to generate the same ref ids as the native cite implementation

Wednesday, Jan 7, 2015 around 1pm PST: Yes Deployed 904fab9e

Monday, Jan 5, 2015 around 2pm PST: Yes Deployed 0e2997d2

Wikitext -> HTML

  • task T72786: data-parsoid stripped from template content
  • task T71219: Context-aware parsing of definition list colon
  • task T58916: Parse extension parameters as plain text
  • task T57531: Stray is parsed to meta
  • Marginal improvement parsing templates in definition lists

HTML -> Wikitext

  • task T74844, task T84921: Several improvements and fixes to nowiki protection for quotes
  • Other improvements and bug fixes to nowiki protection in headings, lists, tables.
  • task T72791: Insert an extra newline after new content and existing headings

Other (API, logging, etc)

  • Add logging for html2wt API endpoints
  • Fix robots.txt route
  • Send SIGKILL to kill a timed out worker
  • task T75955: API v2 parsing and serialization routes