Days since I last spent way too long trying to track down a bug in my `malloc` implementation: 0

Days since I last blamed my `malloc` implementation when I'd actually forgotten to use the new pointer returned by `realloc`: 0

Don't code in C if you don't have to...

For a bit of background, I've been trying to get a web version of my atto programming language working correctly with all sorts of advanced features, like procedures with scoped variables.

It works by running catto (my C implementation of atto) in a WebAssembly module, which is then wrapped with JavaScript.

(Sounds a bit ridiculous given that the original atto was written in JavaScript, but I want to ensure that the behaviour of atto is consistent across platforms.)

You can have a play around with it here: https://attojs.devicefuture.org/examples/editor/

Anyway, I was getting my Fibonacci sequence code working as a recursive procedure. But a weird thing occurred: when it's first executed, it prints:

0, 1, 1, 2, 3, 5, 8, 13, 20, 34, 55

That should be a 21, not a 20...

When I click on 'run' again, it works fine though:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55

So I keep spamming the 'run' button, which causes a `memory access out of bounds` error. 🥲

(Image: atto.js editor demo)

So I spent *hours* scouring through my code for catto, trying to see where the issue is with my procedures implementation.

But when I run this demo on my computer outside of WebAssembly, it works fine.

Even so, I was still convinced it was an issue with atto, so I wasted even more time trying to find this bug.

Spoiler alert: catto wasn't the problem.

Turns out it was an issue with my custom memory allocator I wrote for the WebAssembly environment catto was running in.

Yeah, I didn't want to use Emscripten for this project as I felt it'd be overkill. So I wrote my own `malloc`, `calloc`, `realloc` and `free`.

Or rather, I stole it from a previous project I made and repurposed it for use here.

The problem was indeed with `malloc`. Here's what happens:

When I want to allocate some new memory, I call `malloc`. Usually, it returns a fresh bit of memory that hasn't been used before. However, sometimes, if you call `malloc` after `free`, it'll return the memory you've just freed, provided that freed memory is the same size or larger than the amount of memory you're requesting.

If you call `free` multiple times for a contiguous set of memory blocks, then the next call to `malloc` will attempt to stitch together those individual free blocks into a larger free block before attempting allocation.

Here's a hastily-drawn diagram to show you what I mean.

My code *was* merging free blocks, but it only merged as many blocks as were needed to satisfy the requested size, instead of all the subsequent free blocks.

That approach is perfectly valid, but there was a bug along the way: the code didn't demarcate where the next free block after the large merged block starts.

I changed the behaviour to merge *all* of the subsequent free blocks right up to the point where either there is a block in use, or we reach the end.
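To make the merge-everything behaviour concrete, here's a minimal sketch of a first-fit allocator over a fixed array. It is not catto's actual allocator: `sketch_malloc`, `sketch_free`, `Cell` and `HEAP_CELLS` are all made-up names, and I'm assuming a simple layout where each block is a one-cell header followed by its payload. It also never splits an oversized block, which a real allocator probably would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One allocation unit. The union forces every cell to the strictest
   alignment, so a payload starting right after a header is aligned. */
typedef union Cell {
    struct {
        size_t cells; /* payload length, counted in cells */
        bool used;
    } header;
    max_align_t align;
} Cell;

#define HEAP_CELLS 1024 /* hypothetical heap size */

static Cell heap[HEAP_CELLS];
static size_t heapEnd = 0; /* index of the first cell never handed out */

void *sketch_malloc(size_t size) {
    if (size == 0 || size > sizeof(heap)) return NULL;
    size_t needed = (size + sizeof(Cell) - 1) / sizeof(Cell);

    /* Walk the blocks in address order: header cell, then payload cells. */
    for (size_t i = 0; i < heapEnd; i += 1 + heap[i].header.cells) {
        if (heap[i].header.used) continue;

        /* Absorb *every* free block that follows, stopping only at a
           block in use or at the end of the allocated region. */
        size_t next = i + 1 + heap[i].header.cells;
        while (next < heapEnd && !heap[next].header.used) {
            heap[i].header.cells += 1 + heap[next].header.cells;
            next = i + 1 + heap[i].header.cells;
        }

        if (heap[i].header.cells >= needed) {
            heap[i].header.used = true;
            return &heap[i + 1];
        }
    }

    /* Nothing reusable: carve a fresh block off the end, if it fits. */
    if (needed >= HEAP_CELLS - heapEnd) return NULL;
    size_t start = heapEnd;
    heap[start].header.cells = needed;
    heap[start].header.used = true;
    heapEnd += 1 + needed;
    return &heap[start + 1];
}

void sketch_free(void *ptr) {
    if (ptr == NULL) return;
    Cell *block = (Cell *)ptr - 1;
    assert(block->header.used); /* catches a double free in debug builds */
    block->header.used = false;
}
```

Because the merged block's header always records the full merged length, the walk can't lose track of where the next block begins, which is the kind of bookkeeping the partial-merge version has to get exactly right.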

That seemed to have done the job.

Given how hard that was to discover, I was so overjoyed that I managed to track down yet another hard-to-find bug with my `malloc`!

I might switch back to the original behaviour of merging only enough blocks as necessary for allocation in the future, so long as I get the implementation right.

But there lay another issue with catto...

I wanted to see how well atto could cope with returning strings from procedures.

(For context, to return something from a procedure, you have to store the return value into a parameter variable, which is basically given by reference.)

So I wrote a simple program to return the string `"yo"` with the value of `n` appended to it.

I clicked on 'run' multiple times. Looks like atto didn't work...

(Each line in this console output should be exactly the same.)

Why on Earth is it printing `yoyo`???

Back to the drawing board then. 🫤

At first, I thought it would be an issue with my memory allocator again: it could possibly be returning the wrong memory addresses or overwriting areas of memory it shouldn't.

Nope, it turns out it's a problem with my use of this function here...

It wasn't a problem with that function itself. The problem was that my code was littered with calls to `catto_appendCharToString`, and I was throwing away its return value in each case.

You'll notice in the function declaration that I return the value of `CATTO_REALLOC`, AKA `realloc`.

Oh dear.

The problem with `realloc` is that in 95% of cases, the pointer that you pass to it for reallocating memory will still be the same pointer for the newly-allocated memory, so any time you forget to reassign the pointer, nothing bad happens.

But then, there'll be a case where the memory allocator can't increase the size of the memory you want to reallocate in-place (because there's a block of memory immediately after the block you're reallocating), so it'll move your memory to a new address and give you back the new pointer, making the old pointer invalid.

Since I was throwing that new pointer away, I was still referencing my old pointer, which now is so useless that it might as well point to a random area of memory.

The solution was easy: find all calls to `catto_appendCharToString`, and prefix them with `string = `, or whatever the variable that holds the string is called at that call site.
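Here's a small sketch of that pattern using plain `realloc`. `appendChar` is a hypothetical stand-in I've made up for illustration; the real `catto_appendCharToString` may well have a different signature and body.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for catto_appendCharToString.
   `string` must be a heap-allocated, NUL-terminated string.
   Returns the (possibly moved) string, or NULL if realloc fails,
   in which case the original block is left untouched. */
char *appendChar(char *string, char c) {
    assert(string != NULL);

    size_t length = strlen(string);
    char *grown = realloc(string, length + 2); /* room for c and the NUL */

    if (grown == NULL) return NULL;

    grown[length] = c;
    grown[length + 1] = '\0';
    return grown;
}

/* Wrong: the result is discarded, so `string` dangles whenever
   realloc had to move the block:

       appendChar(string, 'y');

   Right: keep whatever pointer comes back:

       string = appendChar(string, 'y');
*/
```

One caveat with the "right" version: if `realloc` fails, assigning straight back to `string` loses the only pointer to the original block, so code that cares about out-of-memory handling would want to check a temporary first.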

However, I'd say I lost about an hour, or an hour and a half, on this problem.

C is an immensely fun programming language when your code works...

...but C is an absolute nightmare to debug, as your code will more often than not have bugs in it that you'll only realise exist months later.

(That statement applies to both bugs I've written about here. They've been hiding from me for the best part of a month or two now.)

`realloc`'s signature should really be

`void realloc(void** ptr, size_t size)`

instead of

`void* realloc(void* ptr, size_t size)`

so that you *have to* pass in a pointer to your pointer, which `realloc` is then able to update. That way, you don't just accidentally discard the new pointer that was given to you as a return value.
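You can get something close to that today with a thin wrapper around the standard `realloc`. This is only a sketch: `reallocInPlace` is a name I've invented, and I've made it return a `bool` rather than `void` so that failure is still reportable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper: resizes the block that *ptr refers to and
   updates *ptr itself. Returns false and leaves *ptr untouched if
   the underlying realloc fails. `size` must be non-zero. */
bool reallocInPlace(void **ptr, size_t size) {
    assert(ptr != NULL);
    assert(size > 0);

    void *moved = realloc(*ptr, size);

    if (moved == NULL) return false; /* old block is still valid */

    *ptr = moved;
    return true;
}
```

A call then looks like `reallocInPlace(&buffer, newSize)`, and there's no new pointer left to forget about. The catch is that C won't implicitly convert a `char **` to a `void **`, so `buffer` has to be declared as `void *` (or you end up writing one wrapper per pointer type), which may be part of why the standard library didn't go this way.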

But hey, it's not like I was the author of this memory allocator. ... Oh, yeah, I am.

Still, I'm not going to change how `realloc` or `CATTO_REALLOC` works, as someone's already set out a standard for how `realloc` should be called.

But I might modify `catto_appendCharToString`'s function signature to prevent my own self from falling over this problem again.

You can see what I fixed at: https://github.com/devicefuture/catto/commit/b85adf7 and https://github.com/devicefuture/atto.js/commit/111f01e

+102 -28 lines changed total (delta +74).

Oh yeah, and the good ending is, it works now!

I was very hesitant to see if I could even return a list from a procedure.

By some miracle, that works too! And also note the local scoping of `i` in that `for` loop! 😲