Transkribus Enabled Wikisources
This page lists the Wikisources where Transkribus has been enabled as an OCR engine alongside Google and Tesseract, together with the Transkribus models available for each.
Sl. No. | Wikisource | Name of Model | Model ID |
---|---|---|---|
1 | Balinese | Balinese palm-leaf manuscripts 16th century | bali |
2 | Bengali | Bengali printed books | ben-print |
3 | German | Dutch_XVII_Century | de-17 * |
| | | Transkribus Dutch Handwriting | de-hd-m1 |
| | | Transkribus German Handwriting | ger-hd-m1 |
| | | 15-16th Century German | ger-15 |
4 | English | Transkribus B2022 English Model M4 | en-b2022 * |
| | | Transkribus English Handwriting M3 | en-handwritten-m3 |
| | | Transkribus Print M1 | en-print-m1 |
| | | Transkribus Typewriter | en-typewriter |
5 | Spanish | Diario de Madrid 1788-1825 | es-md * |
| | | SpanishRedonda_sXVI-XVII_extended_v1.2 | es-redonda-extended-v1_2 |
6 | Finnish | NLF_Newseye_GT_FI_M2+ | fin |
7 | French | Transkribus French Model 1 | fr-m1 |
8 | Italian | Transkribus Italian Handwriting M1 | it-hd-m1 |
9 | Polish | Transkribus Polish M2 | pl-m2 |
10 | Russian | Russian generic handwriting 2 | rus-hd-m2 |
| | | Russian print of the 18th century | rus-print |
11 | Sanskrit | Devanagari Mixed M1A | san |
12 | Swedish | Stockholm Notaries 1700 2.1 | swe-2.1 |
| | | The Swedish Lion I | swe-lion-i |
13 | Yiddish | The Dybbuk for Yiddish Handwriting | yi-hd |
14 | Hindi | Devanagari Mixed M1A | dev |
15 | Czech | Old Czech Handwriting (with spaces) | cs-space * |
| | | Old Czech Handwriting (without spaces) | cs-no-space |
16 | Danish | 19th century Danish Gothic handwriting v.1.1 | da-goth * |
| | | Danish gothic print 1859-1888 v4 | da-goth-print |
| | | Gjentofte 1881-1913 Denmark | da-gjen |
17 | Greek | Ligorio 0.3 PyL | el-ligo * |
| | | Noscemus GM 6 | el-print |
18 | Estonian | Estonian Court Records 19thC | et-court |
19 | Hebrew | Hebrew DiJeSt 2.0 | he-dijest |
20 | Hungarian | Hungarian handwriting 19th–20th cent. | hu-hand-19 |
21 | Latin | Carolingian Minuscule Model CMM 9th-11th c. | la-caro |
| | | UCL–University of Toronto #7 | la-med |
| | | Pylaia_NeoLatin_Ravenstein | la-neo |
22 | Dutch | Admiraliteit Zeeland 1605-1609 compleet | nl-1605 * |
| | | Dutch Mountains (18th Century) | nl-mount |
| | | Dutch newspapers 17th century | nl-news |
23 | Norwegian | NorHand 1820-1940 | no-1820 * |
| | | Sunnhordland Partition Protocols | no-1874 |
24 | Portuguese | General Portuguese M1 | pt-m1 * |
| | | SPJCL17C V4.2 | pt-17 |
25 | Romanian | RTA2 (Romanian Transition Alphabet) | ro-print |
26 | Slovenian | Slovenian 18th century manuscript | sl-hand-18 |
27 | Slovak | Handwritten Glagolitic | sk-hand |
For Wikisources with more than one model listed, the models marked with * are the ones currently active on the wiki. All listed models remain available in the OCR tool until a model selector is integrated on the wiki side (see T279405).
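The * convention can be read as a simple lookup. The following is a minimal sketch only, not how the OCR tool or the wiki integration is actually implemented: the `MODELS` mapping and the `active_model` helper are hypothetical names, the data is a hand-copied subset of the table, and treating a Wikisource's only model as its active one is an assumption.

```python
from typing import Optional

# Hypothetical, hand-copied subset of the table:
# Wikisource -> list of (model ID, marked with * in the table).
MODELS = {
    "German": [
        ("de-17", True),
        ("de-hd-m1", False),
        ("ger-hd-m1", False),
        ("ger-15", False),
    ],
    "Finnish": [("fin", False)],
    "Russian": [("rus-hd-m2", False), ("rus-print", False)],
}


def active_model(wikisource: str) -> Optional[str]:
    """Return the model ID treated as active for a Wikisource, if any.

    Assumption: a Wikisource with a single listed model uses that model.
    With several models, the one marked with * is returned; if none is
    marked in the table, None is returned rather than guessing.
    """
    models = MODELS[wikisource]
    if len(models) == 1:
        return models[0][0]
    for model_id, starred in models:
        if starred:
            return model_id
    return None


print(active_model("German"))
print(active_model("Finnish"))
print(active_model("Russian"))
```

In this sketch, a Wikisource that lists several models without a * (Russian in the subset above) yields `None`, which reflects that the table does not say which of its models is active.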
The Transkribus Pilot project is being undertaken in collaboration with IIIT Hyderabad and the Balinese Community. We have successfully integrated the Balinese OCR model created by IIIT Hyderabad into Wikimedia OCR.