Topic on Talk:Wikimedia Apps/Flow

Geraki (talkcontribs)

The Android app displays an "On this day" card with a structured timeline of events for English, but not for other languages. How is this data extracted, so that we can implement it for other languages?

Amire80 (talkcontribs)

I'd like to know, too.

MHurd (WMF) (talkcontribs)

I'm so sorry I didn't see this question!


The 'onthisday' data is served via a REST endpoint. Here's an example of a call to the endpoint:

https://en.wikipedia.org/api/rest_v1/feed/onthisday/events/01/15
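
For anyone who wants to try the endpoint from a script, here is a minimal Python sketch. The URL pattern is copied from the example above; the helper names (`onthisday_url`, `fetch_onthisday`) and the User-Agent string are made up for illustration, and the shape of the JSON response is not assumed here beyond it being JSON.

```python
import json
import urllib.request


def onthisday_url(lang: str, month: int, day: int) -> str:
    """Build the 'onthisday' events URL for a language wiki (hypothetical helper)."""
    # Month and day are zero-padded to two digits, as in the example call (01/15).
    return (
        f"https://{lang}.wikipedia.org/api/rest_v1/"
        f"feed/onthisday/events/{month:02d}/{day:02d}"
    )


def fetch_onthisday(lang: str, month: int, day: int):
    """Fetch and decode the JSON response (performs a network request)."""
    request = urllib.request.Request(
        onthisday_url(lang, month, day),
        # Placeholder User-Agent; replace with something that identifies your tool.
        headers={"User-Agent": "onthisday-example/0.1 (example script)"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `fetch_onthisday("en", 1, 15)` should request the same URL as the example; a language without parser support will presumably not return a usable timeline.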


The endpoint extracts this data by teasing out some structure from "day pages", like https://en.wikipedia.org/wiki/January_15.
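
To give a rough idea of what "teasing out some structure" can mean, here is a toy Python sketch. It is not the actual parser, which may work quite differently; it assumes a simplified plain-text layout with `== Heading ==` lines and `* YEAR – text` bullets, and the sample text and names (`SAMPLE`, `parse_day_page`) are invented for illustration.

```python
import re

# Simplified stand-in for a day page; real pages are far messier.
SAMPLE = """\
== Events ==
* 1559 – Elizabeth I is crowned Queen of England.
* 2001 – Wikipedia goes online.
== Births ==
* 1929 – Martin Luther King Jr., American minister and activist
"""

HEADING = re.compile(r"^==\s*(.+?)\s*==$")
# A bullet that starts with a year, then an en dash or hyphen, then the text.
ITEM = re.compile(r"^\*\s*(\d{1,4})\s*[–-]\s*(.+)$")


def parse_day_page(text):
    """Group 'year – text' bullets under the section heading they follow."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        heading = HEADING.match(line)
        if heading:
            current = heading.group(1)
            sections[current] = []
            continue
        item = ITEM.match(line)
        if item and current is not None:
            sections[current].append(
                {"year": int(item.group(1)), "text": item.group(2)}
            )
    return sections
```

Anything that does not look like a heading or a year bullet is silently skipped, which is one reason this kind of extraction is fragile when pages are edited freely.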


It's not a perfect approach - ideally we'll get this data from Wikidata at some point (or use this endpoint to seed Wikidata records for days of the year?) - but it's been useful in the meantime.


The structure of these day pages varies a bit (or a lot) by language wiki, so I tried to abstract away the language logic. Here's a mirror of the file that contains the language-specific logic used by the onthisday parser (which I wrote). I added handling for a few languages which, at the time, seemed to have a page for each day of the year, but I got sidetracked by other work since.
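
As an illustration of what abstracting the language logic might look like, here is a small Python sketch with one settings entry per language. It is not the actual file: the table layout, the names (`LANGUAGES`, `day_page_title`), and the choice of English and German as examples are assumptions made for illustration.

```python
MONTHS_EN = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
MONTHS_DE = [
    "Januar", "Februar", "März", "April", "Mai", "Juni",
    "Juli", "August", "September", "Oktober", "November", "Dezember",
]

# Hypothetical per-language settings: how the day page title is built
# and which section heading holds the events.
LANGUAGES = {
    "en": {
        "title": lambda month, day: f"{MONTHS_EN[month - 1]}_{day}",
        "events_heading": "Events",
    },
    "de": {
        "title": lambda month, day: f"{day}._{MONTHS_DE[month - 1]}",
        "events_heading": "Ereignisse",
    },
}


def day_page_title(lang, month, day):
    """Return the day page title for a configured language."""
    if lang not in LANGUAGES:
        raise ValueError(f"no day page settings for language {lang!r}")
    return LANGUAGES[lang]["title"](month, day)
```

With a layout like this, adding a language means adding one entry rather than touching the parsing code, provided that language really has a page for every day of the year.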


I'd like to circle back and audit the state of all language wikis' day page coverage again. If a language has a page for every day of the year, the parser isn't too hard to update, iirc. I seem to remember it taking an hour or 2 per lang?


Hope this helps!!

NickK (talkcontribs)

Hi @MHurd (WMF): and thank you for working on this!

This topic was raised by a few Ukrainian Wikipedia users on Twitter here: they were quite disappointed that Ukrainian is not supported, while Russian is.

We have standard pages for Ukrainian Wikipedia (actually Main page sections) named per Ukrainian MONTHNAMEGEN, e.g. uk:Вікіпедія:Проект:Цей день в історії/2 квітня for today.
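
For reference, the naming scheme can be reproduced mechanically. This is a small Python sketch; the prefix and the example title are taken from the page name above, while the function and constant names are made up for illustration.

```python
# Ukrainian month names in the genitive case, January to December.
MONTHS_UK_GENITIVE = [
    "січня", "лютого", "березня", "квітня", "травня", "червня",
    "липня", "серпня", "вересня", "жовтня", "листопада", "грудня",
]

# Prefix copied from the example page title above.
PREFIX = "Вікіпедія:Проект:Цей день в історії/"


def uk_day_page_title(month, day):
    """Build the Ukrainian 'this day in history' page title for a date."""
    return f"{PREFIX}{day} {MONTHS_UK_GENITIVE[month - 1]}"
```

So `uk_day_page_title(4, 2)` gives the title quoted above for 2 April.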

Can you please add them to processing, or is there something we need to do to make them machine-readable? Thanks

Amire80 (talkcontribs)

Thanks! It's a start.

Here's an idea for a future improvement: Instead of collecting month names and section names in all languages, could you make a standardized set of internal formats and names that could be used in all languages? For example, instead of "January 16, 2019" people would write "2019-01-16"? If people really want dates written in words in page titles, it could probably be handled with redirects.
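
To make the standardized-format idea concrete, here is a tiny Python sketch that turns an English date written in words into the ISO form. It only handles the English "Month day, year" pattern and relies on Python's default (English) month names; the function name is made up for illustration.

```python
from datetime import datetime


def to_iso_date(text):
    """Convert e.g. 'January 16, 2019' to '2019-01-16' (English month names only)."""
    return datetime.strptime(text, "%B %d, %Y").date().isoformat()
```

The point is that the ISO form needs no per-language month table at all, while every worded form needs its own parsing rule.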

Another idea is to reuse the same ultramagical parser that automatically converts dates written with words into an internal representation on Wikidata.

And instead of section names, could it have HTML classes or ids that would be the same across all languages?
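
Here is a rough Python sketch of how a shared class could be used. The class name `otd-event` and the sample markup are invented for illustration; nothing like this exists on the wikis as far as I know.

```python
from html.parser import HTMLParser

# Invented markup: the heading is localized, the class name is not.
SAMPLE_HTML = """
<h2>Події</h2>
<ul>
<li class="otd-event"><a href="/wiki/2001">2001</a> – Wikipedia goes online.</li>
<li>Not an event</li>
</ul>
"""


class EventExtractor(HTMLParser):
    """Collect the text of every element carrying a given class."""

    def __init__(self, class_name="otd-event"):
        super().__init__()
        self.class_name = class_name
        self.depth = 0  # greater than 0 while inside a matching element
        self.events = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            # Nested tag inside a matching element (void tags are not handled).
            self.depth += 1
        elif self.class_name in classes:
            self.depth = 1
            self.events.append("")

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.events[-1] += data


def extract_events(html):
    parser = EventExtractor()
    parser.feed(html)
    return parser.events
```

The same code would work unchanged on any language wiki that used the class, which is the whole attraction compared with matching translated section names.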

Or maybe these section names could be made into a translatable file and translated on translatewiki? This would be far easier to contribute to than to submit Git pull requests to JS code.

Amire80 (talkcontribs)

Oh, and what about the other main page sections, such as "In the news"?

Geraki (talkcontribs)
Amire80 (talkcontribs)

Well, yeah, that's the nature of change :)

Making it more uniform will make it easier to do for all languages. Currently, it can be done only for languages that created their own structure and submitted a Git patch. I propose to reduce the number of steps by making the Git patch unnecessary.
