“With the increased limit of the acceptance queue, and a patched version of wrk, we can now conclude that swift is a good competitor speed-wise as a web application server.

Memory wise it blows all the other technologies away, using only 2.5% of the amount of memory that the java implementation needs, and 10% of node-js.”

https://tech.phlux.us/Juice-Sucking-Servers-Part-Trois/

#swift #swiftlang

@finestructure I’m not sure it’s actually fair to compare to Java memory usage. But still, this is good stuff!
@mattiem I think it is in this case, because the question is: what size VM do I need to run this workload? That has a massive cost impact.
@finestructure Don’t get me wrong - Java VM tuning is an enormous pain point. You basically cannot run JVM stuff without becoming an expert. But there *is* some optimized setting that could conceivably compete quite well with Swift’s memory usage when handling a similar load. So I think this could be hard to compare. But it’s still great and I vastly prefer it!
@finestructure You know, thinking about this more, I could be wrong. I guess the fixed VM cost is already represented in this workload. So, I take it back: this rocks.
@mattiem @finestructure RC is conceptually better here *if* the RC is implemented properly, i.e. the code produces no retain cycles. Memory is freed when it isn't needed which is particularly important if you have high scale (which 98% of devs don't have, fwiw).
I once heard that there have been high-speed trading systems that could trade favourably by predicting the GC cycles of large banks' Java systems.
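The "implemented properly" caveat above can be sketched in Swift. `Strong` and `Weakly` are hypothetical types, not from any project in the thread: ARC frees an object as soon as its count hits zero, unless a strong cycle keeps the counts from ever getting there.

```swift
// Two objects that point at each other strongly form a retain cycle:
// their reference counts never reach zero, so ARC never frees them.
final class Strong {
    var other: Strong?            // strong back-reference
    static var live = 0
    init()  { Strong.live += 1 }
    deinit  { Strong.live -= 1 }
}

// Same shape, but the back-reference is weak, so no cycle forms.
final class Weakly {
    weak var other: Weakly?       // weak: does not keep `other` alive
    static var live = 0
    init()  { Weakly.live += 1 }
    deinit  { Weakly.live -= 1 }
}

do {
    let a = Strong(), b = Strong()
    a.other = b
    b.other = a                   // cycle: a -> b -> a
}
// both are now unreachable, yet still alive: Strong.live == 2 (leaked)

do {
    let a = Weakly(), b = Weakly()
    a.other = b
    b.other = a                   // back-edge is weak, so no cycle
}
// no cycle: Weakly.live == 0, both were freed at end of scope
```

A tracing GC would reclaim the first pair too, since reachability from the roots, not a reference count, decides what is garbage.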
@helge @finestructure I saw a few cool presentations from the JVM team at Twitter about GC performance, and let me tell you, they did not share this opinion. But intuitively, RC always made sense to me.
@mattiem @finestructure Render me surprised that the JVM team didn't share that opinion! 🙂 FWIW I'm a big GC fan.
The bigger problem w/ RC is that the application has to be memory correct, i.e. not leak, and that is quite hard. It's a big issue w/ NIO-based things: isolated failure points (whether OOMs or fatalErrors) bring down the whole stack at once (conceptually tens of thousands of connections!).
So you either go ownership (hard+fast) or GC (easy+mem).
@mattiem @finestructure (or you use Apache, which is a protocol aware host and can recycle subprocesses cleanly if anomalies are detected, or even just after every 1k requests, because it's cheap to fork 🙃).
@helge @mattiem @finestructure I implemented throw-everything GC in a search engine in the ‘90s. Not as brutal as killing the whole process, but the idea was that no memory allocated during a request was needed after the request returned. I didn’t know then that NeXT had implemented that and called it autorelease.
@ahltorp @mattiem @finestructure Autorelease has a different purpose, it isn't for grouping frees. It exists mainly for API reasons, so that a method can return an "unowned" reference (which has a logical RC of 0).
NeXT actually had bulk free, but that was never really used AFAIK (NXZones/NSZones, hence +allocWithZone:).
An example for alloc, then throw-away everything are Apache pools: https://apr.apache.org/docs/apr/1.5/group__apr__pools.html
Apache Portable Runtime: Memory Pool Functions
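The pool idea can be sketched in Swift as well. This is a toy `RequestPool`, not the actual APR C API: everything registered while handling a request is released in one shot when the pool goes away, like apr_pool_destroy() frees everything allocated from a pool.

```swift
// Instrumented object so we can watch allocations come and go.
final class Tracked {
    static var live = 0
    init()  { Tracked.live += 1 }
    deinit  { Tracked.live -= 1 }
}

// Toy pool: it holds the only long-lived strong reference to each
// object; dropping the pool bulk-releases all of them at once.
final class RequestPool {
    private var owned: [AnyObject] = []
    func register<T: AnyObject>(_ obj: T) -> T {
        owned.append(obj)
        return obj
    }
}

var liveDuringRequest = 0
do {
    let pool = RequestPool()
    for _ in 0..<3 { _ = pool.register(Tracked()) }
    liveDuringRequest = Tracked.live   // 3: all owned by the pool
}
// pool gone -> bulk free: everything allocated "during the request"
// is released, Tracked.live drops back to 0
```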

@helge @mattiem @finestructure If by RC you mean automatic RC and not manual, the leak problems are not very different from GC. If the program hangs on to a reference, the object will not be collected, in both RC and GC.

Cycles will not be detected in pure RC, sure, but forgetting to let go of references is not handled in either.

@ahltorp @mattiem @finestructure The RC being automatic or manual doesn't matter in any way. W/ RC you can (easily) create unreachable graphs that stick around in memory (there is whole tooling around that). Which you have to protect against by embedding memory management into your software architecture. GC frees you from that (for memory, not for everything), which is a very useful abstraction (that isn't free like most).
Not applicable for everything, but fine in many domains.
@helge @mattiem @finestructure Still doesn’t handle reachable but unused objects, which is a big problem.
@ahltorp @mattiem @finestructure Reachable and unused are polar opposites. Either it is used (reachable) or not.
Do you have a specific example in mind?
@helge @mattiem @finestructure An object can be reachable but never used again. Global variables pointing to huge objects can become serious leaks. Hence the need to set these to something else when they’re not needed anymore.
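The global-variable case above can be sketched in Swift; `importScratch` and `BigBuffer` are hypothetical names. The buffer stays reachable through the global, so neither RC nor a tracing GC will reclaim it; only a manual reset does.

```swift
// A buffer used only during startup import, parked in a global.
final class BigBuffer {
    static var live = 0
    let bytes: [UInt8]
    init(count: Int) {
        bytes = [UInt8](repeating: 0, count: count)
        BigBuffer.live += 1
    }
    deinit { BigBuffer.live -= 1 }
}

var importScratch: BigBuffer? = BigBuffer(count: 1_000_000)
// ... import runs, finishes, and never touches importScratch again ...
let residentAfterImport = BigBuffer.live  // 1: reachable, so no
                                          // collector will free it

// The only fix is the manual one: break the reference yourself.
importScratch = nil
```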
@ahltorp @mattiem @finestructure Storing something in a global variable is a definitive sign that you want to use it again. Why would you store in a global variable otherwise? If that's unintended it is a logic error and only your AI compiler would be able to protect you against that.
It would be useful to stop throwing in red herrings. GC's solve a specific (and real) issue RC has. At a cost. Just like RC solves specific (and real) issues straight malloc/free has. Also at a cost.

@helge Of course storing something in a way that it can’t be garbage collected when it’s never going to be used again is a logic error. But people make errors, and they sometimes have to be handled outside the tools (i.e. me cleaning up).

But you know what else is a logic error? Not marking your cycles in a pure RC system, or not doing free correctly in a fully manual system.

I just think cycles are not a big problem compared to improperly reachable memory, so full GC is not as attractive.

@ahltorp "Not marking your cycles in a pure RC system" is a logic fault. That's the whole point, this issue is the thing GC fixes and which simply doesn't exist there.
If you think that RC leaks aren't a real world issue in apps today, you live on a different planet 🙂 (and FWIW I think *A*RC is responsible for that to a significant degree)

@helge I have not worked on Swift projects collaboratively, so I might miss something there. Maybe cycles are a huge leak generator in Swift projects. Are they larger than semantic leaks?

I have worked on collaborative projects for LISA-grade systems in Java and Python projects though, and there I have seen semantic leaks all the time. But people rarely talk about semantic leaks, because they don’t think they have to care about memory management when there’s a GC.

@ahltorp I think they are a significant leak generator, but that's not even my point.
W/ RC you have to take care of cycles as part of your application design, that is the worse part. E.g. "properly" applying weak, which, out of fear of cycles, is _way_ overused in Swift. (And `weak` isn't even a solution here, just a workaround with its own problems .. like objects going away 🙈)
A GC might still be used improperly, but in Swift, RC is often used as if it were a GC, because it looks so simple.

@helge Weak pointers are one tool in the toolbox to avoid semantic leaks, so they are something that is needed in, for example, Python, Java and Javascript too. Why do you think they exist in those languages if they aren’t needed?

If you’re designing with hard pointer cycles, you’re most likely doing something that you should not be doing. If you use cycles in any way, they should always be an integral part of your application design. Regardless of your type of GC.

@ahltorp Can you give an example of what you'd call a "semantic leak"? Just the globals a user might inappropriately use?
(Weak still makes sense in a GC context if the holder is not actually materially interested in the object, but e.g. is a cache, think ORM - which is different to the usage in RC environments, where you need weak to break cycles).
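The cache use of weak can be sketched in Swift with a toy `RowCache` (hypothetical names, not a real ORM): the cache serves the row while something else keeps it alive, but never extends its lifetime on its own.

```swift
final class Row {
    let id: Int
    init(id: Int) { self.id = id }
}

// Weak-valued cache: holding a Row here does not keep it alive.
final class RowCache {
    private struct WeakBox { weak var row: Row? }
    private var boxes: [Int: WeakBox] = [:]
    func insert(_ row: Row) { boxes[row.id] = WeakBox(row: row) }
    func lookup(_ id: Int) -> Row? { boxes[id]?.row }
}

let cache = RowCache()
var row: Row? = Row(id: 7)        // someone else owns the row
cache.insert(row!)
let hitWhileOwned = cache.lookup(7) != nil   // true: served from cache
row = nil                                    // last strong reference gone
let hitAfterRelease = cache.lookup(7) != nil // false: the cache did not
                                             // keep the row alive
```

This is the GC-style use of weak (a cache that must not pin its entries), as opposed to the RC use (breaking cycles).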