https://chaos.social/@Emathion/114613267697396447
@cstross Well, not banning, just running any vague matches of author+title involving unusual Unicode ranges through a manual queue during title setup.
Amazon is a big company, pretty sure they could handle this pretty smoothly IF they wanted to.
@tsturm @cstross My guess is that the problem is volume.
Not only would they need enough staff to process those manual queues (in multiple languages), but those staff would have to understand the field well enough to tell which author is real but merely similar to some other author and which is a grifter fake.
Bearing in mind that Goodreads relies on a lot of volunteer librarians, I'd be surprised if it were practical at scale.
Of course, that *shouldn't* be an excuse, but enshittification…
@cstross @tsturm OTOH they could auto-ban names that use a suspicious mix of specific characters from different unicode blocks, as defined by the unicode consortium itself
(there are libraries that do all of the dirty work for you)
if they allowed for a manual override (after reasonable checks) for that one author who really wants to sell a book titled “don't go to aⅿazon.com” I'd think it would be a pretty reasonable restriction
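A rough sketch of the kind of check described above, using only Python's standard library. The function names (`rough_scripts`, `is_suspicious`) are made up for illustration, and guessing a letter's script from the first word of its Unicode character name is a crude stand-in for the real Script property and the confusable-detection rules in the Unicode Consortium's security mechanisms report (UTS #39), which a dedicated library would implement properly.

```python
import unicodedata


def rough_scripts(text):
    """Approximate the set of scripts used by the letters in text.

    Assumption: the first word of a character's Unicode name
    ("LATIN", "CYRILLIC", "GREEK", ...) is good enough as a script label.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_suspicious(name):
    """Flag names for manual review (hypothetical policy, not Amazon's)."""
    # Compatibility look-alikes: NFKC folds them to plain characters,
    # while NFC leaves them alone, so a difference means one is present.
    nfc = unicodedata.normalize("NFC", name)
    nfkc = unicodedata.normalize("NFKC", name)
    if nfc != nfkc:
        return True
    # Letters drawn from more than one script in a single name.
    return len(rough_scripts(name)) > 1


print(is_suspicious("amazon.com"))        # False
print(is_suspicious("a\u217fazon.com"))   # True: U+217F is a Roman numeral, not "m"
print(is_suspicious("Str\u043ess"))       # True: U+043E is a Cyrillic "o"
```

Anything flagged would go to the manual override queue rather than being rejected outright, since some legitimate names really do mix scripts.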
That’s terrible
It’s strange that Amazon allowed these accounts to sell the books at all; when I self-publish titles that either are, or have been, also published by traditional publishers, I have to jump through all kinds of hoops to prove to Amazon that I have the rights for the territory and format in question — sending them scans of my publishing contracts and letters of reversion. I’ve sometimes had to argue for *months* with multiple different Amazon employees following their opaque procedures to convince them that I’ve proved my case. So I don’t know what these grifters are doing to get their wholly fraudulent authorisation with such ease.
@lispi314 @cstross Multiple different author accounts can have books with perfectly overlapping names. Thus, someone could abuse the system by making a “C. Stross” account and listing books with all the same titles. A search for “Invisible Sun Stross” would return both. Scummy search engine gaming tricks could result in the fake being ranked higher. Doesn’t even have to be the same book. They just throw the cover on some LLM garbage or whatever.
It’s difficult for automated systems to catch this reliably, and book marketplaces like Amazon are unwilling to hire enough people to review what the automated systems don’t catch.
@cstross
Well, yes.
And Unicode has a number of even more advanced fun topics. Like optional decomposition of diacritics. Yes, there are combining code points that follow a base character to add all kinds of marks to it. So à and ä each have two Unicode representations and thus also two UTF-8 encodings.
The macOS filesystem is one place that uses decomposed Unicode. So zip files containing filenames with diacritics look fine, but actually break when used on, say, Linux.
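To make the two representations concrete, here is a small Python sketch using the standard `unicodedata` module. The helper `same_name` is a hypothetical name for illustration; normalizing both sides to NFC before comparing is one common way to sidestep the problem, not the only one.

```python
import unicodedata

composed = "\u00e0"                                   # "à" as one code point
decomposed = unicodedata.normalize("NFD", composed)   # "a" + U+0300 combining grave

print(len(composed), len(decomposed))   # 1 2
print(composed == decomposed)           # False, although both render as "à"
print(composed.encode("utf-8"))         # b'\xc3\xa0'
print(decomposed.encode("utf-8"))       # b'a\xcc\x80'


def same_name(a, b):
    """Compare two names after normalizing both to NFC (illustrative helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_name(composed, decomposed))  # True
```

A tool that compares filenames byte for byte will treat the two forms as different files, which is how an archive created on one system can misbehave on another.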