@b0rk Excellent in-depth dive into how to investigate these types of issues! strace helped my team identify that Python was segfaulting due to UCS-4 string use attempting to allocate space for a >57KiB regular expression.
(Switch the internal representation to UCS-2, the problem goes away. Only happened in dev/test, 'cause only in dev/test do we enable every single route in the entire application, resulting in that monster regex. Thanks, Django.)
strace is love.
strace is life.
@b0rk But this also touches on a peculiarity of Python I've mentioned before. There are edge cases where Python is legitimately faster than equivalent C code: memory pre-allocation (including no-memmove array/list growth), complex dead code removal, and so forth.
Python cheats.
As much as it can possibly get away with.
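A minimal sketch of the pre-allocation point, assuming CPython (sys.getsizeof is not supported on PyPy). The helper name growth_points is made up for illustration. It records the list lengths at which the reported allocation size changes while appending; if the list over-allocates, there are far fewer resizes than appends.

```python
import sys


def growth_points(n_appends=64):
    """Return the list lengths at which sys.getsizeof() changes while appending.

    Hypothetical helper for illustration; relies on CPython reporting the
    allocated (not just used) size of a list.
    """
    items = []
    last_size = sys.getsizeof(items)
    points = []
    for _ in range(n_appends):
        items.append(None)
        size = sys.getsizeof(items)
        if size != last_size:
            # The underlying buffer was reallocated on this append.
            points.append(len(items))
            last_size = size
    return points


resize_points = growth_points()
print(resize_points)  # far fewer entries than appends when the list over-allocates
```

The exact resize points are an implementation detail and differ between versions, so don't rely on them.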
We joke that Python's hash table implementation powering sets, dictionaries, &c. couldn't be optimized further without risking the creation of a singularity.
"Complex dead code removal" being, for example, the deletion of the entirety of this function, becoming a no-op.
def do():
    n = 0
    for i in range(27*10**42):
        n += 2
n is never used, so the loop is irrelevant and just goes away. (PyPy JIT to blame for this one.)
C… would iterate 27 tredecillion times to do nothing.