Parsoid/Deployments/2016
Wednesday, December 21, 2016 around 5:03 am PT: Deployed e7e3a4dc on the deploy-20161221 branch
- ApiRequest: Clone the request options before modifying them.
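The fix above guards against a caller's options object being changed by the request code. A minimal sketch of that pattern follows; the `prepareRequest` helper, the option names, and the default values are hypothetical and are not taken from Parsoid's ApiRequest.

```javascript
'use strict';

// Hypothetical helper: returns request options with defaults applied,
// without mutating the object the caller passed in.
function prepareRequest(options) {
	// Shallow-clone first so later assignments touch only the copy.
	var opts = Object.assign({}, options);
	// Clone the nested headers object too; a shallow clone would share it.
	opts.headers = Object.assign({}, options.headers);
	opts.headers['User-Agent'] = 'example-agent'; // illustrative value
	opts.timeout = opts.timeout || 40000; // illustrative default
	return opts;
}

var shared = { uri: 'https://example.org/w/api.php', headers: {} };
var req = prepareRequest(shared);
```

Because `shared` is left untouched, it can be reused for a retry or a second request without carrying over state from the first.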
Tuesday, December 20, 2016 around 7:48 am PT: Deployed 5eb649e8
- Use mwApiServer as the provider of the full URI of the MW API
- Add a mwApiServer configuration variable
- Add arbcom_cswiki to site matrix
Thursday, December 15, 2016 around 10:24 am PT: Deployed 6719e240
- task T96555: Ignore self-closed tags when extending source
- Drop native LST altogether
- Fix DOMDiff annotations
- Linter:
- Fix bug in self-closing-tag category + other cleanup
- Fix crasher when linting a gallery
- Apply lint sampling when sending it to the logger as well
- Don't provide 'src'
Wednesday, December 14, 2016 around 1:24 pm PT: Deployed 60ee19ac
wt2html:
- task T119265: Add more page-level metadata that MCS can use
- Support extension tags which shadow block-level elements
- Move section handling to the LST extension
- task T104523: Prevent infinite recursion
- task T104662: Allow nested ref tags only in templates
Linting (disabled in production):
- Use ApiRequest.js to post results
- Handle MW API errors that come with an HTTP 200
Debugging:
- Let extensions supply the pp tracing name
Monday, December 12, 2016 around 1:35 pm PT: Updated production config
- Bump table cell and list item resource limits to 40K (from 30K)
Wednesday, December 7, 2016 around 1:21 pm PT: Deployed 3cf19c6b
- Bump HTML contentVersion to 1.3.0 (see updated spec)
- task T110910: Native <gallery> extension
- task T102209: Assign ids to headings to match core's section anchors
- task T94949, task T150112: Munge link fragments and element ids as in the php parser
- task T151570: Update SiteMatrix data fork for last 3 wiki creations
- task T149209: Deal with newlines in <td> and <th> cells
- task T150213: Suppress logs for known unknown contentmodels
- task T152073: Reduce request timeout to 110s (from 3min) and worker timeout to 115s (from 3min); Increase M/W batcher API timeout to 65s
- Some configurations moved to vars.yaml in the deploy repo
- s/warning/warn/ to match service-runner's levels
- Don't entity escape extension attribute values from data-mw
- Normalize all extension options, not just native
- Remove unused package gelf-stream
- Linter: Add linting of self-closed tags
- Testing:
- Remove scrolling by access key
- require('should') in lintertests.js for standalone runs
Monday, November 7, 2016 around 1:29 pm PT: Deployed 2c2fe425
- Cleanup http redirects
- Send error responses in the requested format
- Fix processing listeners in node v7.x
Wednesday, November 2, 2016 around 1:27 pm PT: Deployed 173d7e32
- task T149241: Whitelist content model fallback
- Testing:
- Don't expose dev routes in production
- Get rid of simple debug helpers
- task T119228: Stop testing on node v0.10.x
- Linter:
- Add node name for missing-end-tag
- Remove higher resource limits (max wikitext page size, max # list items, max # table cells per page) and fall back to default limits.
This deploy also included the commits from the attempted Oct. 26th deploy (ede4353):
- task T141723: Bump mediawiki-title
- task T141905: Fix crasher and other bugs of that category
- service-runner doesn't recognize warning level
- Stop asserting that we'll never be encapsulating a flipped range
- Lots of linter fixes / features (currently, linting is disabled in production though)
- Remove html5 treebuilder in favour of domino's
- Bump domino to 1.0.27
- task T147742: Trim template target after stripping comments
- task T48580, task T133320: Allow extensions to handle specific contentmodels
Tuesday, November 1, 2016: Parsoid cluster upgraded to node v4.6
Ops upgraded node on the Parsoid eqiad cluster to node v4.6. The (backup) codfw cluster had been upgraded on Monday.
Monday, October 31, 2016 around 1:34 pm PT: Deployed e503e801
- task T149504: Fix reflected XSS
Wednesday, October 26, 2016 around 1:15 PT: Attempted deploy of ede4353; reverted to 63f1e151 because of contentmodel errors
- task T141723: Bump mediawiki-title
- task T141905: Fix crasher and other bugs of that category
- service-runner doesn't recognize warning level
- Stop asserting that we'll never be encapsulating a flipped range
- Lots of linter fixes / features (currently, linting is disabled in production though)
- Remove html5 treebuilder in favour of domino's
- Bump domino to 1.0.27
- task T147742: Trim template target after stripping comments
- task T48580, task T133320: Allow extensions to handle specific contentmodels
Monday, October 24, 2016 around 1:42 pm PT: Deployed 63f1e151
- task T146612, task T139032: Site matrix update for olowiki
- task T141905: Fix crasher in table fixups
Wednesday, September 21, 2016 around 1:17 pm PT: Deployed a802de0
- Tokenizer:
- Encapsulate protected table attributes from wt
- Inline generic_attribute_newline_value and table_attribute_value
- Set srcOffsets for table_attribute and generic_newline_attribute
- HTTP API:
- Page id and revid aren't the same thing
- html2html should require an original or previous revision
Wednesday, September 14, 2016 around 1:11 pm PT: Deployed aed15dda
- Let native extensions add stylesheets
- Move getAPIProxy to parsoidConfig
- Other minor refactorings and parserTest changes
Monday, September 12, 2016 around 1:40 pm PT: Deployed f7c43009
- Handle HTML tags in attribute text properly
- AttributeExpander: Tweak check for improved code readability
- Testing:
- Bump worker_heartbeat_timeout to 2mins for testing
- Allow specifying a specific revision for roundtrip-test.js
Tuesday, September 6, 2016 around 10:37 am PT: Deployed 7863e6ad
- task T142617: Handle invalid titles in transclusions
- Sanitizer fixes:
- Decode all char refs in text
- Ignore some fields when freezing SanitizerConstants for node v6.5 -- no-op for Wikimedia cluster that runs node v4.x
- node-module updates:
- Bump service-runner to v2.1.0
- Remove bunyan
- Some minor cleanups
Monday, August 29, 2016 around 1:10 pm PT: Deployed 48cf803e
- Run localSettings.setup after assigning options
- Use service-runner's metrics reporter in the http api
- Updates in preparation for supporting version 2.x content in the future -- should be no-op for version 1.x content
- Support downgrading 2.x content to 1.x
- No content reuse from semantically different content versions
- task T143356: Establish precedence for data-mw in 2.0.0 content
Monday, August 22, 2016 around 1:12 pm PT: Deployed df53a991
- task T142998: html2wt: Fix crasher in DOM normalization code
- task T141370: Use service-runner's logger as a backend to Parsoid's logger
Wednesday, August 17, 2016 around 1:09 pm PT: Deployed 3cf877bb
- html2wt: Always emit canonical wikitext for url links
- html2wt: Emit url-links where appropriate no matter what rel attribute says
Monday, August 15, 2016 around 1:09 pm PT: Deployed f039dcf6
- migrateTrailingNLs DOM pass: Code simplifications and some subtle edge case bug fixes
- task T138864: Deal with edge cases serializing links
- Remove deprecated "disablepp" MediaWiki API param and pass "disablelimitreport" instead
- Increase resource limits for wikitext size, max table cells, and max list items
- With the upgrade to node v4, we have more breathing room for parsing large pages
Wednesday, August 10, 2016 around 1:10 pm PT: Deployed 4de49e26
- Handle caption-like text outside tables
- Table captions: Remove unneeded mw:TSRMarker meta token + add TSR info in tokenizer which leads to more accurate DSR offsets.
- When table wikitext shows up outside tables and is converted to strings, strip attached mw:TSRMarker tags
- computeDSR: Fix source of pathological O(n^2) behavior
Tuesday, August 9, 2016 around 11:15 am PT: Deployed a577d80e
- Fix crasher in escapeWikitext
- task T140898: Update site matrix for tcy.wikipedia.org
Tuesday, August 2, 2016 - Tuesday, August 9, 2016: Upgrade Parsoid cluster to node v4.x and Jessie
- task T135176: Over the week, Operations upgraded the cluster gradually.
- The eqiad cluster was fully migrated by Friday, August 5th.
- The codfw cluster was fully migrated by Tuesday, August 9th.
Monday, August 1, 2016 around 1:15 pm PT: Deployed abf396eb
- Fix title parsing of subpages during initialization (addresses crashers while parsing these pages)
- Only apply data-* attributes in /pagebundle/ paths (API cleanup)
- Determine the content version in the html2wt direction, enabling content upgrade
Tuesday, July 26, 2016 around 10:12 am PT: Deployed 285b6983
- Use mediawiki-title package to replace homegrown Title code (resolves task T113322, task T133425, and task T139135)
- Reintroduce a 3-minute request timeout
- Bump some minor / patch level versions of dependencies (addresses a security advisory)
- Prevent JSON.stringify circular refs in template wrapping trace/error logs
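The last item above concerns `JSON.stringify` throwing a TypeError when a logged object refers back to itself. One common way to avoid that is a replacer that tracks visited objects. The sketch below is an illustration under that assumption; `safeStringify` is a hypothetical name and is not necessarily how the Parsoid fix works.

```javascript
'use strict';

// Hypothetical helper: JSON.stringify that substitutes a marker string
// for any object it has already visited, instead of throwing.
// Note: this also replaces repeated (non-circular) references, which is
// usually acceptable for trace/error log output.
function safeStringify(value) {
	var seen = new WeakSet();
	return JSON.stringify(value, function(key, val) {
		if (typeof val === 'object' && val !== null) {
			if (seen.has(val)) { return '[Circular]'; }
			seen.add(val);
		}
		return val;
	});
}

var token = { name: 'tpl' };
token.self = token; // circular reference
var out = safeStringify(token);
```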
Thursday, July 21, 2016 around 9:30 am PT: Deployed ed2f8228
- Test deploy to verify that trebuchet deployment is not broken after all the tinkering done during the service-runner deploy. The deployed change only affects parser tests.
Wednesday, July 20, 2016 between 7:30 - 8:20 am PT: Deployed 45beb6c0
- task T90668: Update Parsoid to use the service-runner framework
- In collaboration with Services & Ops teams
- wtp1001 and wtp1002 were transitioned over July 19, 2016 between 8:00 - 9:00 am PT
Monday, July 11, 2016 around 1:10 pm PT: Deployed e738c415
- task T131564: Respect $wgInterwikiMagic setting while parsing lang-links
- task T139388: DOMDiff: Skip over encapsulated content rather than about-id content (fixes problem with lost edits in content nested in elements with templated attributes)
- Code cleanup (don't expect functional changes): Use a more appropriate DOM helper (s/hasParsoidAboutId/isEncapsulationWrapper/) where appropriate
Monday, June 27, 2016 around 1:08 pm PT: Deployed dd8e644d
- Template wrapping: Eliminate pathological tpl-range nesting scenario
Thursday, June 23, 2016 around 10:30 am PT: Deployed 18022c96
- Emit single newline separator in table wikitext for new content
- Make the http connect timeout configurable
- Update many deps by minor version
- task T137406: Ensure newlines are added where required around thead/tbody/tfoot
- task T96195: Remove node 0.8 support (does not affect WMF deploy of Parsoid)
Wednesday, June 15, 2016 around 1:10 pm PT: Deployed 3445eceb
- task T137406: Emit |- between thead/tbody/tfoot
Non-functional changes (these will come into play once we move to v2.0.0 of Parsoid HTML):
- Roundtrip 2.0.0 content
- task T114413: Provide HTML2HTML endpoint in Parsoid
Monday, June 6, 2016 around 1:15 pm PT: Deployed e8d6092e
- Normalize all lists to not mix wikitext and HTML list syntax (selser prevents unnecessary dirty diffs in production)
Thursday, June 2, 2016 around 10:40pm PT: Deployed 7188080b
- task T134389: Serialize content in HTML tables using HTML tags
- task T125419: Fix selser issues serializing first table row
- Selser: Bug fix reusing separator text from original source
Wednesday, June 1, 2016 around 1:15 pm PT: Deployed afb0d522
- Bump core-js from v1.2.6 to v2.4.0
- Bump yargs from v1.3.1 to v4.7.1
- Don't use non-standard array generic functions (Array.reduce, etc.) - removed from newer version of core-js
- Use normalized form of default page "Main_Page" instead of "Main Page"
- task T135596: Return client error for missing data attributes
- Fix up the internal forms to use v3 post endpoint
- Add a page/wikitext/:title route to GET wikitext for a page
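The array generics item in the deploy above refers to static forms such as `Array.reduce(list, fn, init)`, which are not part of standard JavaScript. A short sketch of the standard equivalent, using illustrative data that does not come from Parsoid:

```javascript
'use strict';

// An array-like object: it has length and indexed keys, but no Array methods.
var arrayLike = { 0: 'a', 1: 'b', 2: 'c', length: 3 };

// Non-standard generic form (not available in standard JavaScript):
//   Array.reduce(arrayLike, fn, init)
// Standard equivalent: borrow the prototype method with call().
var joined = Array.prototype.reduce.call(arrayLike, function(acc, ch) {
	return acc + ch;
}, '');
```

Here `joined` is the string `'abc'`. The same `Array.prototype.<method>.call(...)` form works for `map`, `forEach`, and the other iteration methods.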
Thursday, May 19, 2016 around 11:38am PT: Deployed 67816adf
- task T100681: Remove deprecated v1/v2 HTTP APIs.
- task T130638: Content negotiation; Add data-mw as separate JSON blob in the page bundle.
- Strict Accept header checking is turned off; we will return 1.2.x format if an invalid Accept header is provided (which is allowed by RFC 2616).
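The lenient Accept handling described above can be pictured as a lookup with a fallback. This is only a sketch: `negotiateVersion` is a hypothetical helper, the profile string format is borrowed from the meta tags quoted in the Feb 24, 2016 entry, and the default of 1.2.1 is an assumption (the log only says "1.2.x").

```javascript
'use strict';

var DEFAULT_VERSION = '1.2.1'; // assumed default; the log only says "1.2.x"

// Hypothetical helper: extract the HTML spec version from an Accept header,
// falling back to the default instead of rejecting the request.
function negotiateVersion(acceptHeader) {
	var m = /profile="mediawiki\.org\/specs\/html\/(\d+\.\d+\.\d+)"/
		.exec(acceptHeader || '');
	return m ? m[1] : DEFAULT_VERSION;
}

var explicit = negotiateVersion(
	'text/html; profile="mediawiki.org/specs/html/1.2.0"');
var lenient = negotiateVersion('text/plain');
var missing = negotiateVersion(undefined);
```

With strict checking turned on, the last two calls would instead produce a client error rather than the default version.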
CLEARED DIRTY REPOS which had this patch applied as root during the restbase/changeprop/parsoid outage:
diff --git a/lib/api/routes.js b/lib/api/routes.js
index 4d08922..d372c2f 100644
--- a/lib/api/routes.js
+++ b/lib/api/routes.js
@@ -377,6 +377,7 @@ module.exports = function(parsoidConfig, processLogger) {
 	var v1Wt2html = function(req, res, wt) {
 		var env = res.locals.env;
 		var p = apiUtils.startWt2html(req, res, wt).then(function(ret) {
+			if ( ret.oldid === 106801025 ) { return false; }
 			if (typeof ret.wikitext === 'string') {
 				return apiUtils.parseWt(ret)
 					// .timeout(REQ_TIMEOUT)
Wednesday, May 4, 2016 around 1:15 pm PT: Deployed b0d015fa
- task T134017: Update cached SiteMatrix, mainly for jamwiki
Monday, May 2, 2016 around 1:15 pm PT: Deployed 0a26f3a4
- html -> wt: For invalid links, text doesn't need escaping in link context
- DOMDiff: Fix marking data-is-block on extra base nodes
- Add autoload mechanism for user extension code -- proof-of-concept for future use
- Update shrinkwrap after 23c97752
- Code cleanup: should not affect functionality
- Keep the data-* attributes at the edges of the DOM
- Remove ParsoidCacheRequest
- Organize post-processors distinguishing handlers
- Move the dumper to DOMUtils and use more widely
Monday, April 25, 2016 around 1:05 pm PT: Deployed d5363193
- task T130645: Pass the right title to PHPParseRequest
- Don't allow unclosed extension tags
- Code cleanup: should not affect functionality
- task T95325: Move tsrDelta to dp.tmp
- Rename DU.serializeChilden to DU.serializeToXML
- storeDataParsoid is an env variable, not a Parsoid config property
Monday, April 11, 2016 around 1:15pm PT: Deployed e3766b79
- Count api version use
- Don't dom-diff on a cloned node
- task T95325: Migrate temporary data to dp.tmp
- Suppress errors raised when getting debugging info
- Code cleanup: should not affect functionality
- Fix some variable shadowing
- Stop working on cloned nodes in parserTests
- Rename timer to stats, since we do counting too
- Fix regression testing tool
- Fix crasher and more informative rt errors
Wednesday, April 6, 2016 around 1:15 pm PT: Deployed 5f6c0c60
- task T116020, task T53852: Serialize localized image options (already cherry-picked yesterday)
- Stop suppressing escaping errors
- Remove the broken_template rule in the PEG tokenizer -- no need to wrap {{, {{{, }}, }}} in <nowiki> spans
- Code cleanup: should not affect functionality
- Cleanup some fallback rules in the PEG tokenizer
- Use Util.placeholder in a few more places
- Be consistent with dp.src check
Tuesday, April 5, 2016 around 2:40pm PT: Deployed a5be1cdc
- task T116020, task T53852: Cherry-pick of image option localization patch to match alias reordering in mediawiki core version 1.27.0-wmf.20.
- Deployed cherry-pick from deploy-20160405 branch.
Monday, April 4, 2016 around 1:10 pm PT: Deployed 579ec3e6
- Fix log type in cite implementation
- Code cleanup: should not affect functionality
- Move dp.src handlers to their respective dom handlers
- Add new env.normalizeAndResolvePageTitle helper and use it
Wednesday, March 30, 2016 around 1:15 pm PT: Deployed a20ef276
- Bump HTML version number to 1.2.1
- Declare charset with <meta charset>
- Add html/dp version numbers in <head> instead of full content type
- task T113331: Move auto-generated refs flag from data-parsoid to data-mw
- Default ParsoidConfig.loadWMF to false
- Bump node-uuid to 1.4.7 for nsp
Wednesday, March 23, 2016 around 1:15 pm PT: Deployed 5538d868
- Don't construct regexp with a regexp when flags need to be set
- Don't export Namespace since it isn't used anywhere else
- task T129752: Include user agent in request logs
- Tweak error prefixes for ease of browsing in logstash
- Promisify the exposed batching methods
- task T128659: Handle async createSocket
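The first item in the deploy above (not constructing a regexp from a regexp when flags are needed) reflects an ES5 rule: `new RegExp(existingRegExp, flags)` throws a TypeError there, while ES2015 permits it. A portable sketch, with an illustrative pattern that does not come from Parsoid:

```javascript
'use strict';

var base = /\bfoo\b/;

// ES5 engines throw a TypeError for new RegExp(base, 'g'); ES2015 allows it.
// Passing the pattern source as a string works in both.
var globalRe = new RegExp(base.source, 'g');

var matches = 'foo bar foo'.match(globalRe);
```

Using `base.source` keeps the pattern defined in one place while letting the flags differ per use.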
Monday, March 7, 2016 around 1:15pm PT: Deployed 5db1d28b
- Cleanup and tweaks of transclusion formatting for clarity and fewer dirty diffs
- Fix breakage in counting of HTTP status codes (broken by fix for T127983)
Tuesday, March 1, 2016 around 10:50am PT: Deployed 1f7ed5d0
- task T128319: Fix bug in formatting of transclusions for block-format templates
- Remove overloading of pipe stop in the PEG tokenizer -- eliminates incorrect parsing of pipes in external links
Monday, February 29, 2016 around 1:25pm PT: Deployed d809ad7a
- task T127983: Don't crash on misconfigured statsd host
- task T108134: Match html5 unquoted attribute parsing
- Break for [[ in table attribute values too
Wednesday, Feb 24, 2016 around 1:15 pm PT: Deployed 581a43c7
- Bump HTML content-type version to 1.2.0 (from 1.1.0) and data-parsoid content-type version to 0.0.2 (from 0.0.1)
- Update parsoid content type meta tags in the <head>
- <meta property="mw:parsoidVersion" content="0"/> is now changed to <meta property="mw:html-content-type" content='text/html; charset=utf-8; profile="mediawiki.org/specs/html/1.2.0"'/> to be more consistent with the version information that is output in the response headers.
- For the non-pagebundle API endpoints, <meta property="mw:data-parsoid-content-type" content='application/json; charset=utf-8; profile="mediawiki.org/specs/data-parsoid/0.0.2"'/> is also emitted.
- task T125266: Remove user/contribution information from header
- task T90479: Assert param value serializes to a string
- task T104599, task T111674: Fetch and use templatedata while serializing transclusions
- data-parsoid semantics updated to use 'foo=bar' as the default transclusion arg spacing.
- Remove data-mw.body.extsrc for the <references> tag (unused, and bloats data-mw)
Thursday, Feb 18, 2016 around 11:00 am PT: Deployed dfbafb60
- task T127218: Update sitematrix for ady.wikipedia.org
Wednesday, Feb 10, 2016 around 1:15 pm PT: Deployed 8976ab93
- Assert when flipped ranges are expected in template wrapping
- This should have no functional changes in parsing. At best, it will catch a bug / failed expectation in the template wrapping code.
Monday, Feb 8, 2016 around 1:15 pm PT: Deployed 4d44fcc7
- Fix worker shutdown code in server.js + use it to restart stuck workers and to shutdown the Parsoid service
- Expect that this will fix the scenario with stuck worker processes when Parsoid service is restarted during deploys.
Wednesday, Feb 3, 2016 around 2:45 pm PT: Deployed 98619f7f
- Fix complex single-line nowiki handling
- More robust algorithm + can eliminate some spurious nowikis
- task T115289: Disable migrateTrailingNLs if table has had content fostered out of it
- Some code cleanup
- Removed some FIXMEs in nowiki escaping in <td>s
- Tweaks to attribute parsing in the PEG tokenizer
- Warn if prefix/domain is not unique during configuration
- ParsoidConfig changes: Don't proxy nonglobal wikis (temporary special handling for labswiki and labstestwiki)
- Config changes:
- Remove hardcoded references to internal API LVS endpoint.
- Removed references to unused parsoidcache.
- Removed explicit config entry for labswiki (ParsoidConfig handles it now).
Monday, Feb 1, 2016 around 1:15 pm PT: Planned deploy of 2fcc841f; cancelled to fix nowiki regressions
- Warn if prefix/domain is not unique during configuration
- Fix complex single-line nowiki tests
- Can eliminate some spurious nowikis
- But, can introduce spurious nowikis around [{{echo|foo}}] style wikitext -- 0.07% of pages in rt testing were affected, but with selective serialization, we expect impact to be small. We will consider possible solutions to minimize nowikis in this scenario, nevertheless.
- task T115289: Disable migrateTrailingNLs if table has had content fostered out of it
- Config changes: Remove hardcoded references to internal API LVS endpoint + removed references to unused parsoidcache.
Wednesday, Jan 20, 2016 around 1:45 pm PT: Deployed f1ddfb88
- task T122816: Record when a range is subsumed from overlapping
- Temporarily disable request timeouts (since they don't abort request processing and cancel cpu timeouts as well)
- Reduce cpu timeout value to 3 minutes
Monday, Jan 11, 2016 around 1:15 pm PT: Deployed 07494cf2
wt2html
- task T73154: Remove the vestiges of pipetrick entirely
- task T114225, task T121611: Note that DOM tree building uses restrictive checks (documentation fix)
- task T122054: Strip nowiki spans from templated / extension content
- Match permitted attributes to php's getAttribsRegex
html2wt
- Normalize DOM by stripping \u200e, \u200f next to category links (This is controlled by a config switch that we will turn on, if necessary)
- Edge case fixes to serializing lists with templated portions
task T119883: Performance fixes (for large DOMs)
- Use startsWith() instead of regex to match tag names in the DOM
- Optimise shadow meta deletion
- Bump domino to 1.0.21 (with performance fixes)
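The `startsWith()` change in the performance list above swaps a regex test for a plain string comparison. The sketch below shows the two forms side by side; the helper names and the sample tag name are hypothetical, and it makes no claim about how large the speedup was in Parsoid.

```javascript
'use strict';

// Hypothetical prefix check on a node name, written both ways.
function hasPrefixRegex(name, prefix) {
	// Builds a new RegExp on every call; prefix is assumed to contain
	// no regex metacharacters.
	return new RegExp('^' + prefix).test(name);
}

function hasPrefixStartsWith(name, prefix) {
	// Plain string comparison, with no regex compilation involved.
	return name.startsWith(prefix);
}

var viaRegex = hasPrefixRegex('TBODY', 'T');
var viaStartsWith = hasPrefixStartsWith('TBODY', 'T');
```

Both calls give the same answer; the second avoids regex work in code that runs once per DOM node.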
Other
- task T55874: Add a generic extension registration mechanism
- task T50891: Register <translate> and <tvar> natively
- task T122062: Update SiteMatrix, another wiki created
- task T121611: Use httpStatus instead of code as the property on errors