Recursion Wars Episode II: Python strikes back

Because Python does not have real support for recursion (if there is a hard limit, it's not actually supported, in my opinion), I had to write this monstrosity to create a list of nested dicts from an object tree.

You'd think that calling dict on the root objects would be enough, but noooo, I have to unfold the trees, break them up into distinct depth levels, then fold everything back up while converting each object one at a time.
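The shape of it, as a minimal sketch with a made-up `Node` class (not the actual code): unfold the tree first, then fold the dicts back together, so nesting depth never touches the call stack.

```python
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def tree_to_dict(root):
    """Convert a Node tree to nested dicts without any recursion."""
    # Unfold: discover every node, giving each one an empty dict shell.
    shells = {}   # id(node) -> its dict-in-progress
    order = []    # nodes in discovery order
    stack = [root]
    while stack:
        node = stack.pop()
        shells[id(node)] = {"value": node.value, "children": []}
        order.append(node)
        stack.extend(node.children)
    # Fold: attach each child's shell to its parent's list.
    # Mutating a shared dict object means deeper levels fill in later for free.
    for node in order:
        for child in node.children:
            shells[id(node)]["children"].append(shells[id(child)])
    return shells[id(root)]
```

Same idea as the depth-level trick: every object is converted exactly once, and a tree deeper than the recursion limit is no problem.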

#programming #Python

@FluffyDeveloper it’s disingenuous (nearing misinformation) to say that python doesn’t have real recursion support. Outside of carefully crafted cases to permit tail-chaining, EVERY language has a limited amount of recursion. Every time you recurse, another stack frame has to be added, and that takes memory to do - you’re eventually going to run out. If anything, python’s intentionally limited depth is a lot safer than other implementations that can chew through your systems memory until they wake the OOM killer, which has all sorts of bad consequences. The claims you’re making betray you don’t actually understand how recursion really works or it’s pitfalls - it’s not a good look 😬

@puparlen Yeah, too bad that languages like C, Go, Rust, and even JavaScript can effectively recurse a LOT deeper than Python without running into any error :/

A quick test will tell you that Python throws a recursion limit error REGARDLESS of its memory usage; it will do it with an empty function calling itself! I got it to trigger with a whopping 200KB of memory used up XD

And you can still overflow the stack before reaching the limit because, again, it's arbitrary and independent of memory.
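For the curious, the whole test fits in a few lines:

```python
import sys

def plunge(depth=0):
    # An "empty" recursive function: no data, barely any memory used.
    try:
        return plunge(depth + 1)
    except RecursionError:
        return depth

print(sys.getrecursionlimit())  # 1000 by default on CPython
print(plunge())                 # bails out just under the limit, memory be damned
```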

@FluffyDeveloper but honestly, 99.9% of the time if you’re running into the 1000 limit, it suggests code smell and it’s time to reevaluate the implementation. Most efficient recursive algorithms use far less depth for reasonable cases. And for what it’s worth, Python is arguably more functional than most languages - internally it inherits a lot of zero-copy and tail-chaining from lisp. Unfortunately as it’s evolved it gets lost in syntax, but comprehensions and generators do use it

@puparlen Not really, I tend to write using functional paradigms and recursion is just natural to use when parsing trees. In this case the data comes from a forum-like website so I cannot set an arbitrary limit to the recursion depth. Maybe if I knew that they had one I could use that, but alas that's not the case.

Oh I know that it can do that in the background, it’s what annoys me the most x3

It could do SO much more, but it's limited on purpose and that's a pity :(

@FluffyDeveloper Fair enough; agreed that changing the limit internal to a framework/library is bad practice - I didn’t have that context. However, it feels naive to claim that “setting a hard limit is not always doable”. The real world is full of hard limits - computing resources are finite (we only pretend they’re not). If you’re approaching those limits, coding defensively and handling those real limits is crucial. In this context, that might mean batching or a compromise limit

@puparlen True, but when dealing with pure data, memory can be considered effectively limitless in some scenarios. Take the case of these comments: even when reaching the recursion limit it used only a few MB of memory, and from stress testing I'd say it could go hundreds of thousands of levels deep before running into memory issues (more with some optimisation).

That’s not a depth you’d ever see in this specific scenario: a few hundred chained comments? Yes. Tens of thousands? Not a chance.

@FluffyDeveloper Great. You literally just bounded your depth by an order of magnitude. Since you’ve effectively set that at max 1000 deep (Fermi estimation), as long as you’re one scale above (10k) you should be good to go
@puparlen There is still the library problem. I can't go and set the recursion limit to some other level that may break the caller program, so I am stuck with the default 1000.
@FluffyDeveloper Fair. Agreed, I wasn’t suggesting that - merely that you have a bound to write against. Any library solution would have to fit in the 1000, but could optimize internally on 10k max/overall

@puparlen That's what that new function does. It takes the recursion out of the equation entirely, so now it can parse as many comments as can fit into memory, regardless of nesting depth.

I just liked my one-liner better x3

@FluffyDeveloper I get that your forum gives the appearance of infinite depth. That doesn’t absolve you of needing enough memory and compute to handle your algorithm, and writing it within reasonable limits. What do you do in the circumstance that it has 100,000 depth? Crash? Even your iterative approach will break at some scale - it can’t truly be infinite
@puparlen It's not truly infinite obviously, but a case where you encounter enough comments to exhaust memory is not realistically possible, so you would never reach the crash point.
@FluffyDeveloper it’s limited on purpose to keep the average user from blowing their own foot off when writing a recursive algorithm, because it’s incredibly easy to do so, especially with a non-trivial leaf case (although I know you know that). I don’t see that as a bad thing; it makes the 80% case reasonable and gives power users an escape (or a prod to reconsider what they’re doing)

@puparlen I am of the school “crash to learn” x3

It's not necessarily bad, but it can lead to issues when you wanna build libraries, because you can't (shouldn’t) take control of system variables, so you are effectively stuck with the default :(

@FluffyDeveloper believe it or not, I actually agree with you. I’ve given the impression I’m gung ho about altering settings like that; my position was just that it could be changed at an application level (where it’s the final tool) or to diagnose why. Sorry, should have clarified. My personal style is very much the fail-fast/crash/restart of Erlang

@puparlen Erlang ❤️

That, Go, and Haskell are the languages that defined my coding style. Crash when necessary, pass errors up the chain, and be functional :3

PS: you do come across a bit strong ;)

@FluffyDeveloper I’m piecing together the details from your posts and I suspect it’s something to do with the yields in your __iter__ (I’m not 100% on that, just tracing very roughly). Those calls will retain the containing stack frame so they can resume; if your recursion touches that callback your stack depth would balloon. Even just a small retention (the one or two extra you mentioned) would cause problems pretty quickly
@puparlen That was my exact thought. I am actually planning on getting rid of the __iter__ method and just add an as_dict() one instead that doesn't use yield. Not as pretty but much safer.
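Roughly this shape (toy `Comment` class for illustration, not my real one): the generator version keeps a suspended frame per level, while the explicit-stack version keeps none.

```python
class Comment:
    def __init__(self, text, replies=()):
        self.text = text
        self.replies = list(replies)

    def __iter__(self):
        # Recursive generator: every nesting level is a suspended frame,
        # and resuming the chain nests real calls, so deep trees blow the limit.
        yield self
        for reply in self.replies:
            yield from reply

    def as_dict(self):
        # Plain method with an explicit stack: no suspended frames at all.
        out = {"text": self.text, "replies": []}
        stack = [(self, out)]
        while stack:
            node, shell = stack.pop()
            for reply in node.replies:
                child = {"text": reply.text, "replies": []}
                shell["replies"].append(child)
                stack.append((reply, child))
        return out
```

On a chain a few thousand comments deep, `as_dict()` is fine but iterating the `yield from` chain raises RecursionError.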
@FluffyDeveloper dataclass? https://docs.python.org/3/library/dataclasses.html. asdict out of the box and nice efficiency from __slots__
@puparlen That was my first implementation, but I didn’t feel it suitable for “interactive” objects. Controlling the __init__ is quite annoying, and because I use recursive and looping references, I'd have to change so many methods that I might as well create a manual class :(
@FluffyDeveloper fair. I run into this a fair amount myself. They’re handy for some things but yeah, cumbersome quickly. Nothing wrong with manual classes :3