Mastodawn

witch_t *navi Jan 22

i really hate this idea that every non-trivial struct in c must be an opaque heap allocated pointer because people can't trust their users to read documentation, or what?

abi stability is another topic, and that one is more valid of a reason, but i'd still say it's not that hard, use sonames properly -- but if you *know* the struct definition will be changing a lot, then it's fair (but consider structuring your api differently, be modular and rely less on one single big struct type)

@navi It's not really just about that. If you're trying to create a compile time (or hell, runtime) backend system, you can snip off the opaque data into its own struct within its own file. If you've ever worked with C++, one of the worst things you'll ever deal with is header bloat, and over time as headers get pulled in, you'll end up overincluding things you don't want and compile times end up actually stacking up. Trust me, a project I work on happens to take ~8 minutes to compile on a few cores and even simpler files can take 6-7 seconds each to build, because the original devs just exposed everything over each header (with occasional forward decls when possible, which can sorta do something similar to what im gonna describe), and it just ends up being a bit of extra work and everyone has just kept putting up with that. The solution to this is to just make private implementations of the data, so at least the c++ file wont be pulling in STL and sdl or wayland or glib headers that you dont need (this gets grosser for those "unity headers" which just include everything for convenience).

It's just generally good practice to not expose a lot of stuff that isn't needed to the user. If your struct is mostly ints and chars and a pointer here and there, its not necessarily something you need to hide. But even then, in some more complex codebases where you DO care about api stability/abi, you shouldnt really be just giving someone a struct anyway and should have some getters in the case where you need to do "backwards compat" or actually modify what happens to get at the data someday.

But its never, to my knowledge, about "treating the user as dumb." in the case of public, private it kind of is.. but.... if you intend for data to never be poked at, keep it hidden, because otherwise people may screw with a struct they saw and then risk things breaking (oh i changed the x value but forgot to update prevx and change this thing, this wouldnt have happened if i used change_x(int)...)
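To make the x/prevx case concrete, here is a minimal sketch. The `struct thing` type, its fields, and both function names are made up for illustration; only the idea of a `change_x`-style setter comes from the post above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: prev_x is supposed to hold the value x had
 * before the most recent change. */
struct thing {
    int x;
    int prev_x;
};

/* Setter that updates both fields together, so they cannot drift apart. */
void thing_change_x(struct thing *t, int new_x)
{
    assert(t != NULL);
    t->prev_x = t->x;
    t->x = new_x;
}

/* How far x moved in the last change. Only meaningful when prev_x has
 * been kept in sync, which a direct write to t->x would not do. */
int thing_last_delta(const struct thing *t)
{
    assert(t != NULL);
    return t->x - t->prev_x;
}
```

Writing `t.x = 7;` directly compiles fine when the struct is exposed, but it leaves `prev_x` stale, and `thing_last_delta` then reports the wrong movement.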

I think ive only had to request data be exposed once in my life for a library many years ago, and even then, the guy just exposed a function to get at what I wanted.

@swags

most of c++'s compile time include costs is template instantiation, and those can't really be "opaque" anyway, and a better solution to it is to modularize your api, and to use https://include-what-you-use.org/

headers should only include other headers if they *need* the definitions that are in those headers. if your types or functions don't, then you include the other header in the translation unit that does -- yes, a lot of people include random things in the header because they use it in the c file, but that doesn't make it a good practice

and yes, you can also do some sort of polymorphism by making types opaque and changing the definition based on a backend, but that's an exception rather than a rule

"hiding things from the user" is not worth the mess and runtime costs you get from it, now every structure needs to be heap allocated, now every public field needs to have a getter and setter, your library is twice as painful to use, unnecessarily slower, for a small benefit on the general case

if i intend data to be poked at, i document it, otherwise don't touch it. in c23 you can even go a step further:

```c
#ifndef mylib_private
# define mylib_private [[deprecated("private field")]]
#endif
```

now you compile your code with `-Dmylib_private=` (a bare `-Dmylib_private` would define it to `1` instead of nothing), but clients of the library would need to actively bypass compiler warnings to do stupid shit, and if you really really want, pre c23, you can even do things like wrapping private data in a `.private` struct field
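A minimal single-file sketch of how that macro might be used. `struct mylib_counter`, its fields, and both functions are hypothetical. The `#define mylib_private` at the top stands in for the library's own build passing an empty definition on the command line (`-Dmylib_private=`), so the attribute never expands here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: this stands in for building the library itself with an
 * empty definition of mylib_private given on the command line. */
#define mylib_private

/* ---- mylib.h (hypothetical) ---- */
#ifndef mylib_private
# define mylib_private [[deprecated("private field")]]
#endif

struct mylib_counter {
    int step;                 /* documented, public: callers may set it */
    mylib_private int total;  /* internal bookkeeping */
};

void mylib_counter_tick(struct mylib_counter *c);
int mylib_counter_total(const struct mylib_counter *c);

/* ---- mylib.c (hypothetical) ---- */
void mylib_counter_tick(struct mylib_counter *c)
{
    assert(c != NULL);
    c->total += c->step;      /* no warning: the macro is empty here */
}

int mylib_counter_total(const struct mylib_counter *c)
{
    assert(c != NULL);
    return c->total;
}
```

A client that includes the header without defining the macro still gets the full struct, so it can live on the stack or in an array. Touching `total` directly in C23 mode should then produce a deprecation warning rather than an error.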

making the whole struct opaque doesn't just impact "users can't touch it", it impacts your whole api, adds runtime constraints and limitations, and complicates memory management in many cases

include-what-you-use - A tool for use with clang to analyze #includes in C and C++ source files

@navi > headers should only include other headers if they *need* the definitions that are in those headers.

that was.. quite literally my point. I am explaining to you why people _need_ (or rather, like) to have private implementations, because they want to dig out stuff being exposed. This applies to both libraries and applications.

> every structure needs to be heap allocated, now every public field needs to have a getter and setter, your library is twice as painful to use, unnecessarily slower, for a small benefit on the general case

do you... actually believe this is 'slower'? like esp on modern memory and being paired with a good malloc impl and all that... a little heap redirection (and allocation) isnt going to cost that much. Unless you're in a tight for loop where you really care about how your CPU is caching, this is not something you should _actually_ be worrying about in your code, like.. unless you are really tight on constraints. Seriously. Benchmark a running program and tell me how many waiting games you're playing where this is genuinely a noticeable issue. I'm no java programmer, i tell you, but there are plenty of areas where performance means a lot, and this one isnt really it....

its also worth noting that heap allocation whilst keeping private is a _deliberate_ choice by the dev, not just because they dont want you digging but quite literally because that struct always changes and usually you just pass it around... opaquely. its not done because they hate the users but because they just want their data in the heap to begin with. Otherwise you'll end up with padding tricks and v2/v3 structs to work around it. It's a technique to save energy on maintenance, not performance.
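For reference, the pattern being argued about looks roughly like this. It is a single-file sketch with made-up names (`struct widget`, `widget_create`, `widget_get_x`, `widget_destroy`), not any particular library's API.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- widget.h (hypothetical public header) ---- */
struct widget;                              /* layout hidden from callers */
struct widget *widget_create(int x);
int widget_get_x(const struct widget *w);
void widget_destroy(struct widget *w);

/* ---- widget.c (hypothetical) ---- */
struct widget {
    int x;
    int generation;  /* internal; can be added or removed without
                        changing what callers compile against */
};

struct widget *widget_create(int x)
{
    /* Callers cannot know sizeof(struct widget), so the library has
     * to allocate on their behalf. */
    struct widget *w = calloc(1, sizeof *w);
    if (w != NULL)
        w->x = x;
    return w;
}

int widget_get_x(const struct widget *w)
{
    assert(w != NULL);
    return w->x;
}

void widget_destroy(struct widget *w)
{
    free(w);
}
```

Both sides of the thread show up here: the layout can change without callers recompiling, and in exchange every `struct widget` lives on the heap behind a pointer and every field read goes through a function call.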

> "users can't touch it"

once again, it was never really about this.

> most of c++'s compile time include costs is template instantiation, and those can't really be "opaque" anyway

im aware, but i dont believe i brought that up.

(sorry if this comes off harshly, i just woke up, headache etc)

@navi and its important to note: there is a time and a place for private impls :) there is also a time and a place for raw structs youd just work with. Its a design decision because sometimes one makes more sense than the other. Not everything is fit for the pimpl or opaque struct design. You deliberately pick and choose the one which you think fits, and thats up to how you plan to use the struct.

Its never a project specific thing but just something you do when it makes sense and youre confident the data will change a lot, for example, and yes, the pointer redirection, like x y values could maybe slightly possibly have a tiny performance overhead. :)

@navi Take sdl for a simpler example, there are many things that get pimpl'd where it fits and plenty of things where structs are deliberately exposed for you to fill in. It entirely depends on what youre doing and how youre doing it.

witch_t *navi Jan 22

@swags my problem is the vast majority of opaque structs i see don't need to be opaque and when i ask why it is, i get told either "but what if we want to change it in the future????" (you bump soname) or "but the users are not supposed to have access to the fields" (tell them not to)

it's so painful to work with, say, libinput, having all event objects be opaque structures, when the event data itself is basically just a few numbers from hardware, and those ain't changing since hardware ain't changing, if anything would change it would be a new event, but instead, opaque structs with _get functions, just because

and while libinput is the most fresh bad example i have, there's plenty others too

witch_t *navi Jan 22

> that was.. quite literally my point. I am explaining to you why people need (or rather, like) to have private implementations

since when do self-contained headers require private implementations? my point is about being mindful of what your headers declare, nothing to do with whether ‘thing’ is declared at all or not

foo.h can have struct nya; and use struct nya * in its declarations, then nya.h can have struct nya { ... };, and now the user can allocate nya anywhere they want, stack, heap, array, inside a hashmap, and pass that in to say foo_do_thing(&nya), then foo.c includes nya.h and then it does whatever it needs

and so what i meant in the end was that foo.h should not include nya.h just to use struct nya *, but a lot of developers do that and unnecessarily re-export symbols
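The foo.h / nya.h split can be sketched in one file, with comments marking where each file would begin. The `count` field and `user_example` are hypothetical; `struct nya` and `foo_do_thing` are the names used above.

```c
#include <assert.h>

/* ---- foo.h: only uses struct nya *, so a forward declaration is
 *      enough and nya.h is not included ---- */
struct nya;
int foo_do_thing(struct nya *n);

/* ---- nya.h: the full definition, for code that needs the fields ---- */
struct nya {
    int count;  /* hypothetical field */
};

/* ---- foo.c: includes nya.h because it touches the fields ---- */
int foo_do_thing(struct nya *n)
{
    n->count += 1;
    return n->count;
}

/* ---- user.c: includes both headers and picks its own storage ---- */
int user_example(void)
{
    struct nya on_stack = { 0 };
    struct nya in_array[4] = { { 0 } };

    int first = foo_do_thing(&on_stack);
    int second = foo_do_thing(&in_array[2]);
    assert(first == 1 && second == 1);

    return on_stack.count + in_array[2].count;
}
```

Nothing here is opaque and nothing is heap allocated, yet a translation unit that only includes foo.h never sees the layout of `struct nya`.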

> do you… actually believe this is ‘slower’? like esp on modern memory and being paired with a good malloc impl and all that… a little heap redirection (and allocation) isnt going to cost that much

it’s not going to cost that much, sure, but a) embedded and old hardware still exist and are still functional, and b) being “unnecessarily slower” is just one of the points

but it can also be tangibly slower, if you’re keeping an array of $thing, e.g. some library context per-connection, you could have an array of packed connection data to iterate, but instead you end up with an array of pointers and iteration times get painfully slower – and if you don’t believe me that it matters, see this talk: https://www.youtube.com/watch?v=IroPQ150F6c
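The two layouts being compared look roughly like this. `struct conn` and both function names are made up for illustration, and this only shows the shape of the access pattern, it is not a benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection data. */
struct conn {
    int fd;
    unsigned bytes;
};

/* Packed layout: elements sit next to each other in memory, so the
 * loop walks one contiguous block. Only possible when callers can see
 * the struct definition. */
unsigned long total_bytes_packed(const struct conn *conns, size_t n)
{
    assert(conns != NULL || n == 0);
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += conns[i].bytes;
    return sum;
}

/* Array of pointers: what you end up with when each object has to be
 * allocated separately. Every iteration follows a pointer to wherever
 * that object happens to live. */
unsigned long total_bytes_indirect(struct conn *const *conns, size_t n)
{
    assert(conns != NULL || n == 0);
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += conns[i]->bytes;
    return sum;
}
```

Both functions compute the same result. The difference is only in how memory is touched, which is why the cost depends so much on how many elements there are and how hot the loop is.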

> Otherwise you’ll end up with padding tricks and v2 v3 stuff structs to work around it. It’s a technique to save energy on maintenance, not performance.

i did address this on op:

> but if you know the struct definition will be changing a lot, then it’s fair (but consider structuring your api differently, be modular and rely less on one single big struct type)

but if you’re not breaking abi that often, then when you do, bump the SONAME, and that’ll make (good) build systems and distro tooling rebuild things pretty much automatically

Andrew Kelley: A Practical Guide to Applying Data Oriented Design (DoD)

@navi > but instead you end up with an array of pointers and iteration times get painfully slower

I was deliberately implying that by the way, which is why I was referring more to "pointers." It's not cache friendly at all, but its not something you need to pull hair over. Hell you could have that sort of thing in a game loop these days. In fact in my old game engine i made when i first started programming this was one of the first things i learned not to do (it was for tile data, mind you, lmao), but even then, given the amount of data i was actually dealing with, it wasn't significant enough, and i mean damn, youre meddling into a hearty dose of nanoseconds at that point, which isnt great but its not really as insanely slow as you think it is.

> since when self-contained headers require private implementations?

never said that, in fact i later said that its not even about that. You pick and choose what you think fits.

RE: bump SONAME

a lot of these changes tend to be internal though, which is my point, so youd just make the functions change and well, deprecate the behavior but still keep the functionality. That is precisely why people do this sort of thing. bumping soname is fine but it can get pretty obnoxious doing it once a week.

In a big, heavy moving project where even little abi breaks can cost a fortune, you have to account for this stuff; you need to provide backwards compat and consistent behavior between functions, even if internally the logic for say, getting the x value of something is to change, or libinput junk. Sometimes it can be misused, wont deny that, but its more cautious programming if anything...

I want to reply more but i need to head off here. Thanks for chatting! ☺️

witch_t *navi Jan 22

@swags

> You pick and choose what you think fits.

i'm not trying to, i'm trying to understand what your point was and talk about that, but it seems like i can't understand what it is

> bumping soname is fine but it can get pretty obnoxious doing it once a week.

well, releasing once a week would be hellish, but, versioned symbols are also an option, or, try to not change things so much, the more modular your interface is, the less any individual thing needs to change, and you end up adding new structures way more often than changing existing ones

> I want to reply more but i need to head off here. Thanks for chatting! ☺️

good luck out there, thanks asw

mirabilos Jan 22

@swags @navi there’s still enough not-modern systems around, and quite a part of embedded is a modern use case in which you don’t generally want to malloc, ever. Or at least, not more than once, as there is no free.

@mirabilos @navi Ok... When was i ever talking about "those" embedded systems? At the lowest common denominator of performance i am referring to a pentium D. On "those" embedded systems you definitely wouldn't really even have a malloc to begin with (or at least one that works akin to glibc malloc), and will probably even have magic globals littered everywhere or a little scratchpad buffer if you're really tight.

But even modern embedded systems are considered a pleroma in this case. Alas, I cannot make any assumptions of what type of "embedded" system you are referring to.

Not a single day on this earth would you _ever_ do what I'm describing on a system like that. And even if you presumably did, it probably wouldn't be bad enough to genuinely make a huge difference. I'm not even really sure why you brought that up, really.

mirabilos Jan 25

@swags @navi uhm, I have 80486 systems and SPARCstations around me, I care very much about these.

@mirabilos @navi Dude............ You were talking about embedded. Those are not embedded. Those are just insanely old. Nobody is talking about those. Nobody was ever hinting about those. I love old hardware, but this is a fetish. Im not, and absolutely nobody else is, actively designing software to still run fine on these unless it's deliberate. You wouldnt do anything i described with a system like that, but even then you could, but i really have no clue when it becomes noticeable, unless you can give a benchmark side by side.

edit: ok i misread, you were talking about not-modern hardware, not just embedded. Even then, I stand by my point.

I still have absolutely no clue why you brought this up, though. That's not what i ever had in mind. Plenty of software will do "array of pointers."

mirabilos Jan 25

@navi @swags I can choose to selectively reply to individual points of your replies as I find headspace to address them, and I don’t have anything new to add on embedded and am merely surprised you not only don’t consider them but cannot even imagine them, so I chose to not proceed on the embedded part because you seem to be miles away

@mirabilos @navi cool. you are very pleasant to speak with 🫤

mirabilos Jan 25

@swags @navi just sleepy and annoyed by wall of english text

@mirabilos @navi ... im sorry? i type a lot, but nobody is forcing you to reply