RFC 9839 and Bad Unicode

ongoing by Tim Bray

«Unicode is good. If you’re designing a data structure or protocol that has text fields, they should contain #Unicode characters encoded in #UTF8. There’s another question, though: “Which Unicode characters?” The answer is “Not all of them, please exclude some.”

This issue keeps coming up, so [ @paulehoffman and @timbray ] put together an individual-submission draft to the IETF and now (where by “now” I mean “two years later”) it’s been published as #RFC9839. It explains which characters are bad, and why, then offers three plausible less-bad subsets that you might want to use.»

https://www.tbray.org/ongoing/When/202x/2025/08/14/RFC9839 by @timbray

#programming #CharacterEncoding #LML
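
The quoted post does not list which code points count as "bad", so the following is only a minimal sketch, assuming the three categories usually singled out in this discussion: surrogates, legacy control characters (keeping tab, line feed and carriage return), and noncharacters. The function names are hypothetical, and RFC 9839 itself remains the authority on the exact ranges and subsets.

```python
# Hypothetical helper names; the ranges below come from the Unicode
# definitions of surrogates, control codes and noncharacters. Check
# RFC 9839 before relying on them.

USEFUL_CONTROLS = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return


def is_surrogate(cp: int) -> bool:
    # U+D800..U+DFFF are reserved for UTF-16 surrogate pairs.
    return 0xD800 <= cp <= 0xDFFF


def is_legacy_control(cp: int) -> bool:
    # C0 controls (minus the useful ones), DEL, and the C1 controls.
    if cp in USEFUL_CONTROLS:
        return False
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def is_noncharacter(cp: int) -> bool:
    # U+FDD0..U+FDEF, plus the last two code points of every plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def problematic_code_points(text: str) -> list[tuple[int, str]]:
    """Return (index, category) for each suspicious code point in text."""
    found = []
    for index, ch in enumerate(text):
        cp = ord(ch)
        if is_surrogate(cp):
            found.append((index, "surrogate"))
        elif is_legacy_control(cp):
            found.append((index, "legacy control"))
        elif is_noncharacter(cp):
            found.append((index, "noncharacter"))
    return found


if __name__ == "__main__":
    sample = "ok\x00\ud800\ufffe\n"
    for index, category in problematic_code_points(sample):
        print(index, category)
```

A protocol that wants to exclude these would reject or replace any field for which `problematic_code_points` returns a non-empty list.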

I made a typo and tried to encode a string as "UTC-8".

#ProgrammingMistake #UTF8 #UTC8
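
The post does not say which language was involved. As an illustration only, assuming Python, this sketch shows how a mistyped encoding label such as "UTC-8" fails at codec lookup, and how a label can be checked up front. `canonical_encoding` is a hypothetical helper name.

```python
import codecs
from typing import Optional


def canonical_encoding(label: str) -> Optional[str]:
    """Return Python's canonical codec name for label, or None if unknown."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        # Raised for labels Python has no codec for, such as "UTC-8".
        return None


if __name__ == "__main__":
    for label in ("UTF-8", "utf8", "UTC-8"):
        print(label, "->", canonical_encoding(label))

    try:
        "hello".encode("UTC-8")
    except LookupError as error:
        print("encode failed:", error)
```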

It is 2025 and it has been 0 days since I wasted way too much time due to f"¿Quéucked up character encoding.

#utf8

30 years in, UTF-8 remains a mystery.

cc @schei_encoding

#ScreenshotSaturday #utf8 #Neuland

The following hashtags are trending across South African Mastodon instances:

#trivia
#English
#words
#bible
#jesuschrist
#salvation
#surveillance
#ai
#southafrica
#utf8

Based on recent posts made by non-automated accounts. Posts with more boosts, favourites, and replies are weighted higher.

The following hashtags are trending across South African Mastodon instances:

#flysafair
#southafrica
#Wordle
#wordle1497
#dreamed
#utf8
#ethiopia
#worldivfday
#ivf
#infertilityawareness

Based on recent posts made by non-automated accounts. Posts with more boosts, favourites, and replies are weighted higher.

Every time I read about how we encode characters as bit-patterns on computers, I feel impressed all over again.
#UTF8 #Ethiopia
https://www.unicode.org/charts/PDF/U1200.pdf
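
To make the bit patterns concrete, here is a small sketch that encodes U+1200 (ETHIOPIC SYLLABLE HA, the first code point in the linked chart) as UTF-8, once with Python's built-in codec and once by hand for the three-byte form. The helper names are hypothetical, and the manual encoder deliberately covers only the three-byte range.

```python
def utf8_bits(ch: str) -> str:
    """Show the UTF-8 encoding of ch as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in ch.encode("utf-8"))


def encode_three_byte(cp: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF by hand (surrogates excluded).

    Layout: 1110xxxx 10xxxxxx 10xxxxxx, where the x bits are the
    16 bits of the code point, most significant first.
    """
    if not 0x0800 <= cp <= 0xFFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("code point is outside the three-byte UTF-8 range")
    return bytes([
        0xE0 | (cp >> 12),          # top 4 bits
        0x80 | ((cp >> 6) & 0x3F),  # middle 6 bits
        0x80 | (cp & 0x3F),         # low 6 bits
    ])


if __name__ == "__main__":
    ha = "\u1200"  # ETHIOPIC SYLLABLE HA
    print(utf8_bits(ha))
    print(encode_three_byte(ord(ha)).hex(" "))
```

The hand-built bytes match what the built-in codec produces for the same code point.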