The convention of having the command shell replace "*" with a list of all matching files in the current folder[1] is... well.

I understand how it's useful for really basic core file utilities.

For anything that needs to do recursive directory searches, though, it really gets in the way and raises the bar for what the user has to know in order to make use of the CLI in Linux.

I just now had a long conversation with an advanced bash user[2], and apparently there really is no way for a program to get the original, unexpanded pattern without setting an option in bash before running the command (and then presumably unsetting it afterwards, so as not to break other programs).

Just... why.

#softwareGripe

[1] ...and *only* the current folder... and only including folders that match the same pattern -- like "*.rb" would include a folder named "foldername.rb", which pretty much never happens

[2] Much thanks to sophia kara for hashing through this with me. I was very grumpy about it.
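
To make the complaint concrete, here is a minimal sketch of what a program actually receives. The `show_args` helper and the file names are made up for illustration, and it assumes the option in question is `noglob` (`set -f`), which the post itself does not name.

```shell
# Show what a command receives when the shell expands a glob.
show_args() {
    # Print each argument on its own line so the expansion is visible.
    printf '%s\n' "$@"
}

demo_dir=$(mktemp -d)              # scratch directory, hypothetical layout
touch "$demo_dir/a.rb" "$demo_dir/b.rb"
mkdir "$demo_dir/foldername.rb" "$demo_dir/sub"
touch "$demo_dir/sub/c.rb"         # lives one level down

cd "$demo_dir" || exit 1

expanded=$(show_args *.rb)         # the shell expands *.rb before show_args runs
quoted=$(show_args '*.rb')         # quoting passes the pattern through untouched

set -f                             # disable pathname expansion (same as: set -o noglob)
noglob=$(show_args *.rb)
set +f                             # turn it back on for later commands

echo "expanded: $expanded"
echo "quoted:   $quoted"
echo "noglob:   $noglob"
```

The expanded form should list `a.rb`, `b.rb` and the folder `foldername.rb`, but not `sub/c.rb`; the quoted and `noglob` forms hand over the literal `*.rb`.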

@woozle 1. What option is this?

2. Globbing is a convenience, but generally *Not Good Programming Practice*.

3. find | read or find | xargs is probably what you want. More specifically:

find . <args> | while IFS= read -r file; do echo ">>> $file <<<"; <processing on file>; done

I like to echo the name of the file(s) found, first, both as a verification of the find command/results, and as a progress indicator.
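
A slightly fuller sketch of the same pattern, wrapped in a function. The name `process_tree` and the `wc -c` step are placeholders, not anything from the thread; substitute the real processing.

```shell
# Process every matching file under a directory, one at a time.
process_tree() {
    # $1 = directory to search
    # $2 = name pattern, quoted by the caller so that find (not the shell)
    #      does the matching
    find "$1" -type f -name "$2" | while IFS= read -r file; do
        echo ">>> $file <<<"       # progress indicator and sanity check
        wc -c < "$file"            # placeholder for <processing on file>
    done
}
```

Usage would look like `process_tree . '*.rb'`. `IFS=` and `-r` keep leading spaces and backslashes in file names intact. Names containing newlines would still break this; where that matters, `find ... -print0` paired with bash's `read -d ''` is the usual alternative.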

@dredmorbius

#1: I've documented my findings -- https://htyp.org/bash/globbing

#2: Hard agree -- especially when there's no way to access the raw information (without making the user jump through extra hoops to provide it).

#3: I'd consider this an "extra hoop".

It seems to me that bash needs to be patched to provide the information in the execution environment. It already provides all kinds of other information of more dubious value, e.g. the format of the command-prompt, so why not this?

@woozle Bash is (at least) two things:

1. An interactive command environment.

2. A scripting tool.

The *benefit* of combining these features is that _what you use daily to interact with the system_ is *also* what you can use _for basic system automation tasks_.

In fact you can segue from one to the other through "shell one-liners" and the like. As a consequence, bash is the one programming tool I know best, _simply from daily familiarity_.

The combination also forces compromises.

1/

@woozle The reason for globs is that *when used interactively* they are convenient.

When used *programmatically* (as scripts) ... they're convenient but also dangerous.

And you're looking at decades of legacy, so drastic changes are ... problematic. Many old scripts will break. This can mean difficult-to-understand elements, but also means tools remain stable with time.

Another result is that Unix / Linux end up being a mix of technical domains *and* a social lore. Both matter.

3/

[1/2] @dredmorbius

Working hypothesis:

Globbing was created/designed with the idea that there would be (or are) a lot of Really Simple Utility Programs that couldn't afford to be smart enough to do anything but take input from a single file and do something with it. Globbing therefore allows the user to perform those operations on multiple files without having to type a command for each file.

Problems:

  • globbing does not handle recursion at all. So if you want to perform the operation recursively, some other mechanism has to be employed.
  • ...and of course it prevents more sophisticated applications from doing their own globbing.

[2/2] @dredmorbius

Solutions:

  • Backwards-compatible: provide the raw command-line (up to the first operator -- pipe, <, > maybe others, but basically anything that divides {input to the command} from anything else) as an environment variable.
  • Backwards-breaking-ish: turn off globbing and train users to use external utilities for globbing. (This gives the user much more control over how globbing should be interpreted, allowing for things like folder-recursion, and also makes it clearer wtf is going on.)
    • Optional backwards-compatibility variation: have a (user-editable) list of legacy apps that expect globbing, and turn it back on when running any of those.
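
Something in the spirit of these proposals can be roughly approximated in today's shells, one command at a time. The sketch below is only that: `raw`, `with_raw_args` and `RAW_ARGS` are hypothetical names, not bash features, and the command ends up with the unexpanded arguments both on its command line and in the environment variable.

```shell
# Run a command with globbing switched off just for its own arguments.
with_raw_args() {
    set +f                     # restore globbing for everything that follows
    # Record the unexpanded arguments in a (hypothetical) RAW_ARGS variable
    # for this one command, then run the command with them as given.
    RAW_ARGS="$*" "$@"
}

# The alias turns globbing off *before* the rest of the line is expanded.
# Aliases are expanded in interactive shells; a bash script would first
# need: shopt -s expand_aliases
alias raw='set -f; with_raw_args'
```

Typing `raw mytool *.rb` at a prompt would then hand `mytool` the literal `*.rb`. One known weak spot: inside a pipeline the function runs in a subshell, so the `set +f` never reaches the calling shell and globbing stays off until it is re-enabled by hand.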