Topic on Project:Support desk

Corrupted database - where to from here?

14
Summary by Taavi

eight-year-old topi; hopefully resolved by now

220.244.174.211 (talkcontribs)

A recent server move at my host ended up corrupting my database. So, I've come to grips with the fact that my written content is gone (or at least, horribly mangled - working on that separately). All the uploaded materials appear to be in order, however.

Does anyone have suggestions as to the easiest way to go about a fresh install, given that part of the old install is there? I've never encountered an issue of this magnitude, so any suggestions as to the easiest path forward would be appreciated.

88.130.117.234 (talkcontribs)

Things would be far easier for you, if you had a backup - even if it was a few days or weeks old. Do you have one?

Another idea I have right now would be to fix the "mangled characters". Maybe your host copied the database and "only" changed the charset of the DB? So that basically your content is fine, but MySQL displays it incorrectly?

220.244.174.211 (talkcontribs)

Nope, sadly no backup. Lesson learnt, etc. The database is completely wrecked, to the point where my host is refusing to even speak about it any more. Regular pages are filled with junk between the words, and all forms of punctuation have been stripped:

"theu800 8thu800 episode ofu800 poor andu800 weird features kenu800 andu800 jimu800 ether covering takin' itu800 offu800 andu800 takin' itu800 allu800 offu800. poor andu800 weird poor andu800 weird poor andu800 weird inu800 au800 rare moment ofu800 loquaciousness jimu800 informs usu800 that nothing much hasu800 been going onu800 lately. file pwe8-nothingu82emp3u800 theu800 show show'su800 rssu800 feed doesn'tu800 work."

I can get everything back eventually, I suppose. Going to save what's there and try and recreate the wiki as best I can. Fresh install time?

MarkAHershberger (talkcontribs)

Yes, I would dump the current wiki (if you can still do that) and then install fresh. If you can dump it, you may be able to see some usable revisions.

220.244.174.211 (talkcontribs)

So, install as per the directions as if I were doing it the first time? But don't delete the images, extensions, etc. folders?

Just want to make sure I'm doing this right.

MarkAHershberger (talkcontribs)

right... though, the images will not magically re-appear on the wiki till you re-upload them.

88.130.117.234 (talkcontribs)

If it is not too much content, I think doing a new install would be faster than trying to fix the broken characters...

However, what exactly should the original text of what you posted above look like? Maybe one of the experts here can then tell, what the hoster might have done to get it broken exactly that way.

  • You sometimes have "u800" inserted inbetween words.
  • Some other characters are broken: "file pwe8-nothingu82emp3u800 theu800 show" What was that before?
220.244.174.211 (talkcontribs)

It was a lot of content. But I think the database is ruined beyond just the characters - I can only see tables for hitcounter and searchindex. I'm pulling the pages from searchindex.

The text should, obviously, not have u800 between words. I can't recall exactly what that filename should be - I think pwe8-nothing.mp3. So it looks like files have u82e inserted instead of a dot. It's going to be rough, but I should be able to get things to a vaguely workable state in a word processor. Unfortunately, looks like pretty much all the formatting - capital letters, tables, etc - is gone.

88.130.117.234 (talkcontribs)

The structure of tables and so are not inside the searchindex table. The search is not case sensitive, so there are no capital letters...

hitcounter and searchindex are the only tables, which use the storage engine MyISAM or HEAP. All others are set up to use InnoDB. Did they uninstall InnoDB or what?

If you want to proceed that way, you should do a row of search and replace operations for each page, e.g. "u800" => space, u82e => dot and so on.

When you start with the wiki all over again, do the installation and before you insert any content into the DB check the following:

  • The character set for the DB and for the tables should be utf8_...something (e.g. "utf8_general_ci" or "utf8_unicode_ci"). If it is different from that remove everything from the new DB again and fix the charset first.
  • Check that the tables are using InnoDB as storage engine. (hitcounter and searchindex should use something else, that's fine).

Btw: Have you considered moving to another provider? Seems like a good time to think about that.

220.244.174.211 (talkcontribs)

I have no idea what they did, sadly. I think they uninstalled InnoDB. I don't know why, though.

Yep, search and replace was the idea. Slow work, but at least I've still got part of what I had.

Thanks, I'll make sure to check the tables as suggested.

220.244.174.211 (talkcontribs)

As for moving providers, this one has been quite good, up until the last three weeks. Plus (and this is the real killer) I've still got a year's worth paid for.

88.130.117.234 (talkcontribs)

It's always the same: When you just paid them, they screw up. Sometimes I think that cannot be unintentional.

As for the images: MediaWiki has a script, which allows you to do a mass import of images. I would put my images in a folder on the server and then run that script. See Manual:importImages.php for the details.

220.244.174.211 (talkcontribs)

It's certainly taught me to keep backups. If I'd backed up the database just once I could have saved all this messing about.

Thanks for the suggestion. Most of the content is not images, but rather audio files. So I'll just install MSupload again and do them page by page.

88.130.102.253 (talkcontribs)