Wikimedia Discovery/Meetings/Checkin/2017-11-28
Topics from the past
- Deployment schedule (train and SWAT) for upcoming (end of year) holiday period:
- Please note any holiday emergency/contingency information at https://office.wikimedia.org/wiki/Technology/2017_holiday_coverage
- 2017 Community Wishlist Survey is open to proposals. Folks should probably keep an eye on the Search category
- Voting runs from Nov 27 to Dec 10
- Discovery-related wishlist items we should probably respond to:
- https://meta.wikimedia.org/wiki/2017_Community_Wishlist_Survey/Search/Preferences_settings_to_modify_crosswiki_search_results
- https://meta.wikimedia.org/wiki/2017_Community_Wishlist_Survey/Search#Search_page:_integrated_.22incategory.22_functionality - we can point to WMDE’s AdvancedSearch project for this one. :)
- https://meta.wikimedia.org/wiki/2017_Community_Wishlist_Survey/Search/Order_search_results_by_date_of_last_edit_or_alphabetically
- Note, much overlap with https://meta.wikimedia.org/wiki/2017_Community_Wishlist_Survey/Search/Search_by_date
- Holiday party for SF-based folks 11/30
Announcements, Information, Questions
- Make sure your travel is booked for dev summit/all hands
- Jan has too many meetings! (now with Reading) Will this be his last discovery weekly?
- Discovery Weekly email will continue
- Jan is wrapping up portal, so there probably won’t be any portal reporting after that
- This meeting is becoming much more search-oriented; it should be merged with the search planning meeting and become a search status and planning meeting instead
- Add goals to Disco Weekly Status email
- T-shirts? Yes!
- Guillaume will lead the last retro for Discovery, and the last of the year
Scrum of Scrums
Are we blocked?
- None
Are we blocking?
- None
Other dependencies (in either direction) which don’t need to be called out as “blocked” (e.g. are progressing smoothly, have no urgency, etc.)
- None
Discovery News
Quick Quarterly Goals/KPI Update (if needed)
Discovery Roadmap FY 2017/18: https://docs.google.com/a/wikimedia.org/presentation/d/1N41eNrz0vFHJamLkhQjSFCOSiDg-bKr5H3ROchqwVJU/edit?usp=sharing
FY 2017-18 Q2 (Oct-Dec) goals: https://www.mediawiki.org/wiki/Wikimedia_Discovery#Projects
This status was last updated 2017-11-14. Completed/dropped goals may not be shown.
Tech:Search Platform
Search:
1. Implement advanced methodologies such as “learning to rank” machine learning techniques and signals to improve search result relevance across language Wikipedias.
- Begin to automate the machine learning pipeline, starting by targeting eight to ten languages, other than English, that match (at a minimum) current performance and then deploy those models. (IN PROGRESS)
2. Improve support for multiple languages by researching and deploying new language analyzers as they make sense to individual language wikis.
- Investigate open source language software that is available and see if it can be converted into ElasticSearch plugins. (IN PROGRESS)
- Investigate usage of fall-back languages (DONE)
- Investigate fuzzy (phonetic) matching.
- Continue general language support. (ONGOING)
3. Investigate how to expand and scale Wikidata Query Service to improve its ability to power features on-wiki for readers
- Work on sub-category filtering and searching within the Wikidata Query Service. (IN PROGRESS)
4. Address technical debt:
- Convert existing Selenium tests to Node.js (IN PROGRESS)
- Investigate ownership and maintenance of Logstash (IN PROGRESS)
Structured Data on Commons:
1. Commons search will be extended via CirrusSearch and ElasticSearch and Wikidata Query Service, to support searching based on structured data elements describing media.
- Determine advanced search requirements and measures for structured data on commons. (NOT STARTED YET)
2. Advanced search capabilities (e.g., Wikidata Query Service, SPARQL queries) will be updated to support the more specific media search filters and the relationships to the topics they represent
- Begin work on prefix- and full-text search in ElasticSearch on Wikidata in preparation for the Structured Data on Commons project. (IN PROGRESS)
WDQS: The Wikidata Query Service goal for this quarter is to work on sub-category filtering and searching within the service. Stas and Guillaume will maintain the service to support its continued growth and use, and the Analysis team will help with statistics.
Audiences:Readers:Discovery
https://www.mediawiki.org/wiki/Wikimedia_Audiences/2017-18_Q2_Goals#Readers
Portal: Update the Wikipedia.org portal codebase to be completely automated for ease of ongoing maintenance.
- Automate portal project updates: statistics and translations (IN PROGRESS)
Maps: Support the move to be more operationally centralized and roll out a new map style that has numerous updates and enhancements.
- Finalize and deploy new map style; replicate maps test cluster in Wikimedia Cloud Service; monitor for critical bugs (IN PROGRESS)
Analysis: The team will continue to work closely with the Search Platform team to analyze A/B tests and other assorted data; they will also begin working on determining a baseline set of metrics for Structured Data on Commons. (IN PROGRESS)
FYI
- Dec 25-29: WMF Holiday week
- Jan 22-23: DevSummit
- Jan 22-23: Readers:Audiences offsite
- Jan 25-26: All Hands
- Jan 27-28: Search Platform offsite (after all-hands)
- OOO
- Guillaume 2 weeks in Kenya probably in March… details to come
- David out last week of February 2018 (2/26 - 3/2)