https://stackoverflow.com/questions/33051108/how-to-get-around-the-linux-too-many-arguments-limit/33278482

> I have to pass 256Kb of text as an argument to the "aws sqs"

what, uhhh, what

> MAX_ARG_STRLEN is defined as 32 times the page size in linux/include/uapi/linux/binfmts.h:
> The default page size is 4 KB so you cannot pass arguments longer than 128 KB.
> I modified linux/include/uapi/linux/binfmts.h to #define MAX_ARG_STRLEN (PAGE_SIZE * 64), recompiled my kernel and now your code produces
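The arithmetic in the quoted answer can be checked with a small sketch. It assumes Linux and the 32-page figure quoted above; the helper names `max_single_arg_bytes` and `fits_in_one_arg` are made up for illustration, and the `+ 1` reflects my understanding that the kernel counts the terminating NUL against the limit.

```python
import os

# Assumption: Linux, with MAX_ARG_STRLEN defined as PAGE_SIZE * 32
# (the figure quoted from binfmts.h above).
MAX_ARG_STRLEN_PAGES = 32


def max_single_arg_bytes(page_size=None):
    """Upper bound for one argv/envp string, in bytes."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")  # typically 4096
    return page_size * MAX_ARG_STRLEN_PAGES


def fits_in_one_arg(text, page_size=None):
    """True if `text` should fit in a single argument.

    The + 1 accounts for the terminating NUL, which (as far as I can
    tell) is included in the length the kernel checks.
    """
    return len(text.encode("utf-8")) + 1 <= max_single_arg_bytes(page_size)
```

With a 4 KB page this gives 131072 bytes, so the 256 KB payload from the question is about twice the size that a single argument allows.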

casually patching the kernel to send a quarter megabyte as a *single* argument oh my god i'm laughing hard
@navi well, in the early Rust for Linux days we hit this limit when passing kconfig options to rustc. Fun times

@kloenk @navi Back when 128 kB was the limit for argv+envp, Google was hitting it too because they passed all the configuration for their whole software stack on the command line as --long-option=value switches.

Their solution? Compress the command line. So every binary started by ungzipping argv[1] and parsing it to get the configuration.

The person explaining this to me saw my horrified face, and said with the perfect Hide The Pain Harold smile: "a series of individually completely rational and reasonable decisions led to this." and I have been thinking a lot about it since.

@ska @navi And I guess null bytes in gzipped form must have been funny to handle

@lanodan @navi I don't think that's necessarily a problem. argv[1] doesn't have to be a string, it's a character array. Null is used as a separator when the kernel puts the whole argv on the stack, yes, but argv[1] is still just a pointer and if you know you're expecting a blob and have a way to know where the blob ends, it should work, I think.

Or they could have been base64-encoding the gzip for all I know, it's probably still smaller than the uncompressed argv.
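A rough sketch of the base64 variant, purely as a guess at the shape of such a scheme and not what Google actually did: `pack_args` and `unpack_args` are hypothetical names, and joining options with newlines assumes no option contains one.

```python
import base64
import gzip


def pack_args(args):
    """Compress a list of --option=value strings into one NUL-free token."""
    # Assumption: no option contains a newline, so it can act as separator.
    joined = "\n".join(args).encode("utf-8")
    # base64 output is plain ASCII, so it is safe to pass as argv[1].
    return base64.b64encode(gzip.compress(joined)).decode("ascii")


def unpack_args(blob):
    """Reverse of pack_args: what each binary would do on startup."""
    joined = gzip.decompress(base64.b64decode(blob))
    return joined.decode("utf-8").split("\n")
```

For repetitive `--long-option=value` switches the compression gain should outweigh the roughly one-third overhead that base64 adds.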

(Edit: typo)

@ska @lanodan @navi Nope, it's a string. execve will stop processing it at the first null byte.

The execve syscall essentially acts as a scatter-gather (in this case gather) operation running over the argv and environ pointer arrays in user memory, performing a string-copy-from-user operation for each one to build the object that will be prepopulated into the new process image.
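To see why that matters for a raw gzip blob, here is a small simulation of the string copy. It does not call execve; `as_c_string` is a hypothetical helper that just mimics "stop at the first null byte".

```python
import gzip


def as_c_string(buf):
    """Return the part of `buf` a C-string copy would keep."""
    return buf.split(b"\0", 1)[0]


# mtime=0 makes the output reproducible; the 4-byte mtime field in the
# 10-byte gzip header is then all zero bytes.
raw = gzip.compress(b"--long-option=value " * 100, mtime=0)

# What the new process would receive if `raw` were passed as argv[1].
seen = as_c_string(raw)
```

Here `seen` is cut off inside the gzip header, well before any compressed data, which is why an encoding step such as base64 would be needed.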

@dalias @lanodan @navi Oh? It's a shame, then, that there isn't an execae() primitive that takes a char array as argv and a char array as envp and splits them following null bytes.

Because I have to do that all the time in execline and I hate to do it just to follow the API when the kernel is going to do the exact same right afterwards.
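The userspace splitting step could look roughly like this. `execae()` does not exist, and `split_nul_block` is a hypothetical name; this is only a sketch of the "split following null bytes" part, not execline's actual code.

```python
def split_nul_block(block):
    """Split a NUL-terminated block like b"ls\\0-l\\0/tmp\\0" into a list.

    Every argument, including the last one, must end with a NUL so that
    empty arguments are preserved.
    """
    if not block:
        return []
    if not block.endswith(b"\0"):
        raise ValueError("block must end with a NUL terminator")
    # Drop the final terminator, then split on the remaining separators.
    return block[:-1].split(b"\0")
```

The resulting list is what then has to be turned into a pointer array for execve, which copies the strings back into one contiguous block.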

@ska @navi @lanodan @dalias Why not just pass the data in through stdin?
Laurent Bercot:

Sending the configuration to stdin is more difficult than storing it in a config file, because you have to have a process writing to the daemon's stdin. It's easier for the cluster manager to scp the config and give "-f configfile" to the daemon's command line. The point is that they didn't even want to scp a config file. The agent was just reading and running a command line and they didn't want to modify it. That, I think, is the more questionable design decision.
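For reference, the stdin route is mechanically simple when the launcher is under your control, which is exactly the part that was not true in the story above. A minimal sketch, with `run_with_stdin` as a hypothetical helper and a throwaway Python child standing in for the daemon:

```python
import subprocess
import sys


def run_with_stdin(cmd, payload):
    """Run `cmd`, feed `payload` to its stdin, and return its stdout."""
    result = subprocess.run(cmd, input=payload, capture_output=True, check=True)
    return result.stdout


# 256 KB, the size from the original question; too big for one argument
# but unremarkable for a pipe.
payload = b"x" * (256 * 1024)

# Stand-in child process that just reports how many bytes it read.
child = [sys.executable, "-c", "import sys; print(len(sys.stdin.buffer.read()))"]
out = run_with_stdin(child, payload)
```

The cost is the one described above: some process has to exist to do the writing, whereas a "-f configfile" argument needs nothing beyond the file being in place.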
