@emilymbender Is this tendency to find and fixate early on meaning analogous to this problem from a PL context?
There've been security problems when a system assumes that, because a string is in one structured language, say GIF, no downstream part of the system will interpret it as a string in another language, say JavaScript.
https://0x00sec.org/t/gif-javascript-polyglots-abusing-gifs-tags-and-mime-types-for-evil/5088
This happens, not infrequently, when one part of a system receives Content-Type metadata specifying the language, makes some security-relevant decision, and forwards the content to another part of the system without that metadata. The downstream subsystem then uses a heuristic to pick a language while still relying on the earlier security decision.
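A rough sketch of that failure mode, in Python. Everything here is hypothetical: `upstream_accepts`, `downstream_guess_type`, and the toy sniffing rule are made up for illustration and are not how any particular browser or server behaves.

```python
# Magic bytes that open a GIF file.
GIF_MAGIC = (b"GIF87a", b"GIF89a")


def upstream_accepts(body: bytes, content_type: str) -> bool:
    """Security decision made WITH metadata: only declared GIFs that start like GIFs."""
    return content_type == "image/gif" and body.startswith(GIF_MAGIC)


def downstream_guess_type(body: bytes) -> str:
    """Hypothetical heuristic used after the Content-Type has been dropped.

    Toy rule (an assumption): script-looking tokens win over magic bytes.
    """
    if b"=" in body and b";" in body and b"(" in body:
        return "text/javascript"
    if body.startswith(GIF_MAGIC):
        return "image/gif"
    return "application/octet-stream"


# Starts with the GIF magic bytes, but also reads as "assign to a variable
# named GIF89a, then call alert".
polyglot = b"GIF89a/*\x00\x00*/=1;alert(1)//"

accepted = upstream_accepts(polyglot, "image/gif")  # checked as a GIF
guessed = downstream_guess_type(polyglot)           # treated as a script
```

The upstream check passes the payload as an image, and the downstream guess then treats the same bytes as script, while still trusting the upstream verdict.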
I conjecture that this kind of flaw makes it past design review because people work with two different senses of "language":
- A language is a set of strings plus some semantics. In this view, it's clear that a string can be in more than one set.
- A string of JavaScript is that which is produced by a JavaScript programmer, even if the product has flaws that put it outside the set a JavaScript interpreter can deal with. In this view, the provenance of the string separates it from other languages.
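The two senses can be put side by side in a small Python sketch. The recognizers below are toys standing in for real parsers, and the names (`RECOGNIZERS`, `languages_by_membership`, `Authored`, `language_by_provenance`) are invented for this illustration.

```python
import re
from dataclasses import dataclass

# Sense 1: a language is a set of strings. Each toy recognizer decides membership.
RECOGNIZERS = {
    "gif-like": lambda s: s.startswith("GIF89a"),
    "js-like": lambda s: re.fullmatch(r"[A-Za-z_$][\w$]*\s*=\s*\d+;?", s) is not None,
}


def languages_by_membership(s: str) -> set:
    """Return every toy language whose recognizer accepts the string."""
    return {name for name, accepts in RECOGNIZERS.items() if accepts(s)}


# Sense 2: a string's language is whatever its producer meant it to be.
@dataclass(frozen=True)
class Authored:
    text: str
    provenance: str  # label for who or what produced the string


def language_by_provenance(a: Authored) -> str:
    """Ignore the content entirely and report the producer's label."""
    return a.provenance


sample = Authored("GIF89a=1;", provenance="gif-like")
by_membership = languages_by_membership(sample.text)
by_provenance = language_by_provenance(sample)
```

Under the membership view the sample belongs to both toy languages. Under the provenance view it has exactly one, and that single answer is lost once the provenance label stops travelling with the string.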