Citoid is usually pretty good about UTF-8 conversion, but http://w.genealogy.euweb.cz/hung/batth3.html has the á come back as �.
Topic on Talk:Citoid
When I request http://w.genealogy.euweb.cz/hung/batth3.html in my browser, the content comes back encoded as non-UTF-8 (perhaps Windows-1252?), e.g. <TITLE>Batthy�ny 3</TITLE>. I think your browser is doing some magic to turn it into readable text, but I think the issue is in the source?
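For anyone who wants to reproduce the symptom locally, here is a minimal sketch. It assumes the page sends á as the single byte 0xE1 (which is what ISO-8859-1 and Windows-1252 both use); the `raw` bytes are built by hand for illustration, not fetched from the site.

```python
# Assumption: the server sends "á" as the single byte 0xE1 (ISO-8859-1 / Windows-1252).
raw = "Batthyány 3".encode("iso-8859-1")

# Treating those bytes as UTF-8 fails on 0xE1, because it announces a
# multi-byte sequence that never arrives. With errors="replace" the decoder
# substitutes U+FFFD, which is the � seen in the title.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with either single-byte encoding recovers the intended text.
as_latin1 = raw.decode("iso-8859-1")
as_cp1252 = raw.decode("cp1252")

print(repr(as_utf8))
print(repr(as_latin1))
```

Both single-byte encodings agree on 0xE1, so this particular title cannot tell ISO-8859-1 and Windows-1252 apart.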
Playing around with the encoding in Safari, I found that the "Default" option looks great, but UTF-8 is wrong. I went through some of the other options, and ISO Latin is the one that matches. Chrome and Safari on macOS are both smart enough (perhaps via the same OS library) to figure out that it is not UTF-8. Character set guessing is not an exact science sadly (harder than NP-complete, blah blah blah), but heuristic guesses are usually pretty good.
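To give a rough idea of what such a heuristic can look like, here is a sketch of the simplest one I know: try strict UTF-8 first and fall back to a single-byte Western encoding if that fails. This is only an illustration; `decode_guess` is a made-up helper, and I am not claiming this is what Citoid, Safari, or Chrome actually do.

```python
from typing import Tuple


def decode_guess(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding_used) for raw bytes of unknown encoding.

    Hypothetical helper: strict UTF-8 first, then Windows-1252 as a fallback.
    """
    try:
        # Text in a single-byte encoding with accented letters is very
        # unlikely to be valid UTF-8 by accident, so success here is a
        # strong signal.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: anything that is not UTF-8 is Western European.
        # A few bytes are undefined in cp1252, hence errors="replace".
        return raw.decode("cp1252", errors="replace"), "cp1252"


print(decode_guess(b"Batthy\xe1ny 3"))
print(decode_guess("Batthyány 3".encode("utf-8")))
```

The fallback is the weak part: it will happily mis-decode pages in other legacy encodings, which is why real detectors also look at byte frequencies and any declared charset.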