"Can't you just convert the encoding of the file?" they asked happily…
Well, have a look at this problem. Add several cascaded incorrect encodings of unknown nature plus mixed styles of line endings and multiply that by about 8000.
Now you have a basic idea of the task at hand. Oh, and did I mention that most of the text content you're working with is in languages you do not speak at all?
Yes, it's kinda absurd… I agree.
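For a sense of what untangling even a single file involves, here is a minimal Python sketch. It rests on a big assumption, since the real encodings are of unknown nature: every layer is taken to be UTF-8 text misread as Windows-1252. The function names `undo_layers` and `normalize_newlines` are made up for illustration.

```python
def normalize_newlines(text: str) -> str:
    """Turn CRLF and lone CR line endings into LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def undo_layers(text: str, wrong: str = "cp1252", right: str = "utf-8",
                max_layers: int = 5) -> str:
    """Peel off stacked mis-decodings, one layer per loop pass.

    ASSUMPTION: each layer is `right`-encoded bytes that were wrongly
    decoded as `wrong`. Real files may mix other encodings entirely.
    """
    for _ in range(max_layers):
        try:
            # Reverse one bad decode: recover the raw bytes, then decode properly.
            candidate = text.encode(wrong).decode(right)
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # this layer does not fit the assumed pattern; stop here
        if candidate == text:
            break  # nothing changed (e.g. plain ASCII), so we are done
        text = candidate
    return text


# Simulate a doubly mangled string with mixed line endings.
broken = "caf\u00e9\r\nna\u00efve\rdone\n"
for _ in range(2):
    broken = broken.encode("utf-8").decode("cp1252")

fixed = normalize_newlines(undo_layers(broken))
```

Note that this happily "repairs" text that was never broken if it just happens to look like mojibake, which is exactly why it hurts when you cannot read the language to check the result.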
#KaraokeLeaks