Topic on User talk:TJones (WMF)/Notes/DWIM as API

use "canonical form" for universal dwimming

קיפודנחש (talkcontribs)

i mentioned it in the ticket, but i'll repeat here:

i think it makes sense to define "universal" keymapping. this may be tricky for same-alphabet-different-key mapping (i think German vs. English has this little quirk wrt the location of Z and/or Y? maybe French too. dunno).

basically, "canonize" all the article names, and when "dwim" needs to be invoked (currently, it means less than 10 hits), canonize the search word, and take a 2nd pass over the "canonized articles".

by "canonical" i mean map the characters in whatever language to some standard set of keys (prolly standard us keyboard, but also consider skipping it, and map directly to "keycodes" - those should be pretty standard, i think), using the most common keyboard for this language.

this way, it will not be just "whatever <==> english", or some other pairing you have to define somewhere, but rather "whatever <==> whatever".
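A minimal sketch of the idea above, under stated assumptions: the names `canonize`, `build_index`, `search` and `MIN_HITS` are made up for illustration, the layout tables cover only the letter keys of the standard Hebrew and Russian layouts, and the exact-match first pass is a stand-in for whatever the real search does. Latin text is left as-is, so the same-alphabet-different-key case (German Z/Y) is not handled here.

```python
# Characters of each layout, listed in the same order as the US keys
# found at the same physical positions (partial tables, letters only).
US_FOR_HEBREW = "ertyuiopasdfghjkl;zxcvbnm,."
HEBREW = "קראטוןםפשדגכעיחלךףזסבהנמצתץ"
US_FOR_RUSSIAN = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
RUSSIAN = "йцукенгшщзхъфывапролджэячсмитьбю"

# One merged table works because the scripts do not share characters.
TO_KEY = {**dict(zip(HEBREW, US_FOR_HEBREW)),
          **dict(zip(RUSSIAN, US_FOR_RUSSIAN))}

MIN_HITS = 10  # the "less than 10 hits" threshold mentioned above


def canonize(text):
    """Map each character to the US key at the same position."""
    return "".join(TO_KEY.get(ch, ch) for ch in text.lower())


def build_index(titles):
    """Group titles by canonical form (done once, ahead of time)."""
    index = {}
    for title in titles:
        index.setdefault(canonize(title), []).append(title)
    return index


def search(query, titles, index):
    """First pass: plain match. Second pass: match on canonical form."""
    hits = [t for t in titles if t.lower() == query.lower()]
    if len(hits) < MIN_HITS:
        for title in index.get(canonize(query), []):
            if title not in hits:
                hits.append(title)
    return hits


titles = ["привет", "שלום", "hello"]
index = build_index(titles)
print(search("ghbdtn", titles, index))   # Russian word typed on a US layout
print(search("עינגאמ", titles, index))   # same keys typed on a Hebrew layout
print(search("akuo", titles, index))     # Hebrew word typed on a US layout
```

The first two queries both find "привет" and the third finds "שלום", with no language pair defined anywhere: every title and every query is reduced to the same key sequence, so Hebrew-to-Russian works the same way as anything-to-English.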

just my $.02

peace

TJones (WMF) (talkcontribs)

Sorry for not replying sooner. I didn't quite understand what you meant on the Phab ticket, but I've looked into it more now and I see what you are talking about. I'm not sure we could support the extra in-memory index required for autocomplete title lookups, but it's an interesting idea. I've explained it a bit more in Phab, and I'll share the idea with the rest of the search team. Thanks for the info and ideas! (Also, do you have a reference for Google using a keycode index? I'd love to read more about it.)

קיפודנחש (talkcontribs)

Sorry, the reference to Google was made purely based on observation. I have no idea how Google actually implements their dwim. Peace
