So I'm hacking on Family Feud for the Switch, and it turns out it stores all its questions in one two-megabyte JSON file.

with no linebreaks

it is (of course!) Unity
the file is 6mb if you run it through a json formatter.
So they saved 4mb! in this 2gb rom! that is probably compressed!
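For a sense of where the extra size comes from, here is a minimal sketch using Python's standard `json` module. The tiny `data` dict is a made-up stand-in for the real file, and the 4-space indent is an assumption about what the formatter does.

```python
import json

# Made-up stand-in for the real file; only the shape matters here.
data = {
    "AnswerText": ["Cry", "Sob", "Get sad"],
    "AnswerAudioCues": ["A_Cry", "UNKNOWN", "UNKNOWN"],
}

# No spaces, no linebreaks: everything on one line.
compact = json.dumps(data, separators=(",", ":"))

# What a typical formatter produces (assuming 4-space indentation).
pretty = json.dumps(data, indent=4)

print(len(compact), len(pretty))
```

Both strings parse back to the same data; the only difference is whitespace, which adds up fast when the file is mostly long lists of short strings.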

the layout of the json is... weird. I think they're trying to do some basic manual compression?

There's string pools.

So there's primarily:
1. list of questions (each of these is a complex structure)
2. list of strings of questions
3. list of strings of answers (true or false or unseen)
4. list of names of audio files or "UNKNOWN"

the last two are aligned.

So #4 is called "AnswerAudioCues" and goes:
[
"A_Cry",
"UNKNOWN",
"UNKNOWN",
"UNKNOWN",
"UNKNOWN",
"A_Howl",
"UNKNOWN",
"A_BreakDown",
...

and #3 is called "AnswerText" and goes:
[
"Cry",
"Sob",
"Get sad",
"Weep",
"Cries",
"Howl",
"Feel sad",
"Break down",
...
See how it lines up? Cry, Howl, and Break Down are answers with matching audio files, so when they're selected, the game will play the audio file of the host reading them out.
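The alignment can be sketched like this, using only the first eight entries quoted above. The helper name `audio_for` is made up for illustration, and treating "UNKNOWN" as "no recording" is a guess from how the lists line up.

```python
# First eight entries of each pool, copied from the excerpts above.
answer_text = ["Cry", "Sob", "Get sad", "Weep", "Cries", "Howl", "Feel sad", "Break down"]
answer_audio_cues = ["A_Cry", "UNKNOWN", "UNKNOWN", "UNKNOWN", "UNKNOWN", "A_Howl", "UNKNOWN", "A_BreakDown"]

def audio_for(index):
    """Return the audio cue at the same index, or None if it's "UNKNOWN"."""
    cue = answer_audio_cues[index]
    return None if cue == "UNKNOWN" else cue

# Pair every answer that has a recording with its cue name.
voiced = [(text, audio_for(i)) for i, text in enumerate(answer_text) if audio_for(i)]
print(voiced)
```

Same index, two lists: one holds the text, the other holds the audio cue (or a placeholder).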

okay so #1, "Questions".

Each structure here has an ID, TextIndexUS, Quality, Validity, ValidForOnline, Answers, and FalseAnswers.

The ID just looks like "Q_00001". TextIndexUS indexes into the #2 string table, "QuestionText". Here it's 0, which matches up with:
"Name something you do after a fight with your partner."
Quality & Validity are integers of unknown use. (1 and 3, here)
ValidForOnline is a boolean (false, here).
Answers is a list of objects. Each has its own Id (named like "Q_00001_A01"), a TextIndexUS (indexing into AnswerText, here 0 or "Cry"), an AudioIndexUS (indexing into AnswerAudioCues, here 0 or "A_Cry"), a points value (here 62), and then a VariantIndicesUS sub-list.
VariantIndicesUS is the indexes of AnswerTexts that are alternate aliases of "Cry". Here it's 1-9, or:
"Sob",
"Get sad",
"Weep",
"Cries",
"Howl",
"Feel sad",
"Break down",
"Wail",
"Moan",
FalseAnswers has the same sort of structure as Answers: Id, TextIndexUS, AudioIndexUS, VariantIndicesUS. all that's missing is the points (since you get no points for an invalid answer)
So I think this is just here for the AI players: it lets them generate plausible incorrect answers.
for example, the first false answer is 45, with variants of 46-51.
That matches to "Call ex", with variants of:
"Call ex-boyfriend",
"Call ex-girlfriend",
"Call ex-husband",
"Call ex-wife",
"Call ex-partner",
"Message ex",
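Putting the pieces together, here is a rough sketch of resolving one question's indices back into strings. The exact key spellings ("ID", "Id", "Points") are assumptions, the pool is a sparse stand-in holding only the indices quoted above, and `resolve` is a made-up helper. The AudioIndexUS fields are left out to keep it short.

```python
# Sparse stand-in for the "AnswerText" pool: only the indices quoted in the thread.
answer_text = {
    0: "Cry", 1: "Sob", 2: "Get sad", 3: "Weep", 4: "Cries",
    5: "Howl", 6: "Feel sad", 7: "Break down", 8: "Wail", 9: "Moan",
    45: "Call ex", 46: "Call ex-boyfriend", 47: "Call ex-girlfriend",
    48: "Call ex-husband", 49: "Call ex-wife", 50: "Call ex-partner",
    51: "Message ex",
}
question_text = ["Name something you do after a fight with your partner."]

# One question, shaped as described above. "Points" is a guessed key name;
# the false answer's Id is omitted because the thread doesn't quote it.
question = {
    "ID": "Q_00001",
    "TextIndexUS": 0,
    "Answers": [
        {"Id": "Q_00001_A01", "TextIndexUS": 0, "Points": 62,
         "VariantIndicesUS": list(range(1, 10))},
    ],
    "FalseAnswers": [
        {"TextIndexUS": 45, "VariantIndicesUS": list(range(46, 52))},
    ],
}

def resolve(entry):
    """Swap an answer's pool indices for the strings they point at."""
    return {
        "text": answer_text[entry["TextIndexUS"]],
        "variants": [answer_text[i] for i in entry["VariantIndicesUS"]],
        "points": entry.get("Points"),  # None for false answers
    }

print(question_text[question["TextIndexUS"]])
for entry in question["Answers"] + question["FalseAnswers"]:
    print(resolve(entry))
```

Every string lives in a pool exactly once, and the question structures only carry integers pointing into it.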
it looks like the game ships with 1595 questions
although there are duplicates and near-duplicates
the game also contains a 3036-line file titled "swear_filter"
some items:
bloblos
bullet vibe
clover clamps
commie
cunillingus
dolcett (OKAY, WHO DATAMINED A VORE WIKI?)
flow
glazeddonut
gramps
hahal (?)
heterosexual
masterblaster
mcfagget
moist
mr hands (immortality for the horsefucker!)
octopussy (the only banned Bond movie)
paginate (what)
pile
plump
plug (no, not buttplug. just plug)
raging boner (boner is on the list separately, so I don't know why they needed to call this one out)
smeg (red dwarf: banned)
soviet
tainted love
yaoi
oh and tutelage. the game bans tutelage
and stew
doom? they banned DOOM!?
they ban "bimbos" but not "bimbo". the singular is fine.
australian, athletesfoot, church, color (not colored: that's separate), dendrophilia (attraction to trees?), and dumas (NO COUNT OF MONTE CRISTO HERE)
they also ban the term "black cock", despite the fact that both "cock" and "black" are banned separately.
they ban:
boobs
booobs
boooobs
booooobs
booooooobs
but not:
boooooobs
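A quick way to spot the hole, plus the obvious alternative. This assumes the filter is a list of exact spellings, which is only a guess from how the file looks; the regular expression is a swapped-in alternative, not something the game is known to use.

```python
import re

# The five spellings quoted above as banned.
banned = {"boobs", "booobs", "boooobs", "booooobs", "booooooobs"}

def spelling(n):
    """'b', then n copies of 'o', then 'bs'."""
    return "b" + "o" * n + "bs"

# Which o-counts from 2 through 7 are missing from the list?
gaps = [n for n in range(2, 8) if spelling(n) not in banned]
print(gaps)  # [6]

# One pattern that covers every count of two or more.
pattern = re.compile(r"bo{2,}bs")
print(all(pattern.fullmatch(spelling(n)) for n in range(2, 8)))  # True
print(bool(pattern.fullmatch("bobs")))  # False
```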
inert, itch, jade, jam, jelly, juice, latin (?!)
it also has the scunthorpe problem
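A minimal sketch of why that happens, assuming the filter does plain substring matching (how the game actually matches isn't known). The four banned words are taken from the list above; the function names are made up.

```python
import re

# A few of the milder entries from the list above.
banned = ["pile", "plug", "stew", "jam"]

def substring_hits(text):
    """Naive filter: flag a banned word if it appears anywhere in the text."""
    lowered = text.lower()
    return [word for word in banned if word in lowered]

def whole_word_hits(text):
    """Stricter filter: flag a banned word only when it is a whole word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [word for word in banned if word in words]

print(substring_hits("Unplug the compiler, steward"))   # ['pile', 'plug', 'stew']
print(whole_word_hits("Unplug the compiler, steward"))  # []
print(whole_word_hits("stew"))                          # ['stew']
```

Whole-word matching has its own gaps: it misses run-together entries like "glazeddonut" typed with a space, and this version can't handle multi-word entries like "raging boner" at all.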
@foone it is an outrage that i, william kenneth galboobsloway, must suffer this erasure
@foone Some of these describe ethnicities and nationalities, which tells me the developer wanted to head off online abuse. Either that, or someone on the team REALLY had it out for Latin Australian bimbos.
@foone there's jade again. i wonder why jade is offensive
@foone profanity filters are my favourite example to demonstrate to people why regex is actually hard. 😬 A lot of programmers think they are big brain enough that they always get it right.
@foone looking down my shirt to see whether I have bobs, boobs, booobs, boooobs, booooobs, boooooobs, or booooooobs
@foone (look at this overconfident bitch, so sure she doesn’t have bbs)
@foone bo{2,}bs?
@ericspittle @foone I’m assuming unity has a regex engine built in, but if this was a webpack, there’d be like three, and none of them in use anywhere.
@foone
somehow that one feels less lewd
somehow it feels related to a multitude of bobs
@foone banning boobs is wrong
@foone ^if only there were some way to have expressions to match a set of strings in a regular way^
@foone They had some really specific ideas…

@foone

Wait, the word black is banned???

"how do you like your coffee?"

"Black."

"YOU HAVE BEEN BANNED"

😑

@foone

maybe a sloppy import from a word list like this one

https://github.com/dsojevic/profanity-list/blob/main/en.json

it attaches variants with adjective prefixes to the same profanity id, but if someone just stripped all the words out of it they'd get all the adjectives too.

@foone my favorite item in this list is "butthole engineer"

that's a valid medical profession!

@gloriouscow @foone The exceptions are hilarious
@foone is that what they call the treehuggers these days? Interesting
@foone what about colour? are the bri'ish allowed to be sloightly rude?
@foone You can have one (1) bimbo, as a treat.
@foone it’s a brand. 🤷‍♂️
@foone
One bimbo is godly, two is just hedonism.