@rl_dane @amin @kabel42 @sotolf @thedoctor obviously not, because it’s written differently ;)

re_format(7) knows:

There are two special cases** of bracket expressions: the bracket expressions '[[:<:]]' and '[[:>:]]' match the null string at the beginning and end of a word, respectively. A word is defined as a sequence of characters starting and ending with a word character which is neither preceded nor followed by word characters. A word character is an alnum character (as defined by ctype(3)) or an underscore. This is an extension, compatible with but not specified by POSIX, and should be used with caution in software intended to be portable to other systems.

(as for the mark:)

POSIX leaves some aspects of RE syntax and semantics open; '**' marks decisions on these aspects that may not be fully portable to other POSIX implementations.

The definition for \< / \> differs between less, perlre, pcre, … I believe, but they are all somewhat similar.

@rl_dane @amin @kabel42 @sotolf @thedoctor perlre(1) actually has…

A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it and a "\W" on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a "\W".

… so the \< probably comes from less(1)?

… hm, no. But, where then?
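
The perlre definition quoted above can be tried out in Python, whose `re` module documents `\b` in much the same terms. This is only a sketch: Python's `re` has neither `\<`/`\>` nor `[[:<:]]`/`[[:>:]]`, so the lookaround patterns below are stand-ins assumed to be close to them, and the sample string is made up.

```python
import re

text = "foo bar_1, baz"

# Every \b position: a \w on one side and a \W (or the string edge) on the other.
boundaries = [m.start() for m in re.finditer(r"\b", text)]

# Rough stand-ins for "start of word" and "end of word": a boundary that is
# followed by a word character, or one that is preceded by a word character.
word_starts = [m.start() for m in re.finditer(r"\b(?=\w)", text)]
word_ends = [m.start() for m in re.finditer(r"\b(?<=\w)", text)]

print(boundaries)   # [0, 3, 4, 9, 11, 14]
print(word_starts)  # [0, 4, 11]
print(word_ends)    # [3, 9, 14]
```

Note that the underscore in `bar_1` does not produce a boundary, because it counts as a word character.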

@mirabilos @amin @kabel42 @sotolf @thedoctor

I used to use \b a lot, but \< and \> are just as easy to use, and POSIX. ;)

\w is nice, though. I think the closest POSIX one is [[:graph:]]? (Not super close, though)

@rl_dane @amin @kabel42 @sotolf @thedoctor \< and \> are not POSIX.

perlre(1) \w is identical to POSIX [a-zA-Z0-9_] in the C locale, so [[:alnum:]_] if you have support for POSIX character classes.
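
A quick way to convince yourself of that equivalence, sketched in Python rather than Perl: with the `re.ASCII` flag, Python's `\w` is documented to match only `[a-zA-Z0-9_]`. Python's `re` does not support POSIX classes such as `[[:alnum:]]`, so the explicit range is used here instead.

```python
import re

perl_style = re.compile(r"\w", re.ASCII)
explicit = re.compile(r"[a-zA-Z0-9_]")

# Compare both patterns on every code point from 0 to 255.
mismatches = [
    chr(cp)
    for cp in range(256)
    if bool(perl_style.fullmatch(chr(cp))) != bool(explicit.fullmatch(chr(cp)))
]
print(mismatches)  # []

# Without re.ASCII, \w follows Unicode and also accepts letters like "é".
unicode_word = bool(re.fullmatch(r"\w", "é"))
print(unicode_word)  # True
```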

@mirabilos @amin @kabel42 @sotolf @thedoctor

Ah, yes. [[:alnum:]] was the one I was thinking of.

@mirabilos @amin @kabel42 @sotolf @thedoctor

Waiiiiit, what does the underscore before the second bracket do? I've never seen that before.

No mention of it in RE_FORMAT(7) on FreeBSD.

@rl_dane @amin @kabel42 @sotolf @thedoctor the exact same thing as the underscore in [a-zA-Z0-9_], and I’d be surprised if the FreeBSD manpage would not document it

@rl_dane @amin @kabel42 @sotolf @thedoctor let me blow your mind if that was news to you:

[[:alpha:][:digit:]_]

@mirabilos @rl_dane @amin @sotolf @thedoctor yay context sensitive [], there is no way that can go wrong \s
@kabel42 @rl_dane @amin @sotolf @thedoctor it’s actually not, the first unescaped [ switches from RE context to RE-Bracket context in the bracket-begin state, in which you can have an optional ^ (except in shellglobs where it is spelt !), then an optional ] not taken as the end of the RE-Bracket, then an optional -, then any amount of expressions of the type a-z, [:charclass:], [=equivalenceclass=], x, then an optional -, then a closing ] which terminates the RE-Bracket context.
@kabel42 @rl_dane @amin @sotolf @thedoctor (I erred: you can have either the ] or the - at the beginning, not both)
@kabel42 @rl_dane @amin @sotolf @thedoctor (and I forgot collating elements, which is totally fucked up, [a[.ch.]] in e.g. es_ES.UTF-8 matches either a or ch, so a bracket expression in POSIX has a variable matching length…)
@kabel42 @rl_dane @amin @sotolf @thedoctor these are rare-to-never-used features, thankfully
@kabel42 @rl_dane @amin @sotolf @thedoctor tbh the only time I use something other than simple chars and ranges in bracket expressions is the BSD [[:<:]] and [[:>:]] extension (which matches a zero-length string)
@kabel42 @rl_dane @amin @sotolf @thedoctor no, the zero-length string between a nōn-word‑ and a word character
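
The placement rules from the bracket-expression posts above (`]` first, `-` last, `^` anywhere but first) can be sketched as a small helper. `bracket_for` is a hypothetical name, not part of any library; it handles literal characters only and leaves out character classes, equivalence classes, collating elements, backslash and `[`. The result is checked with Python's `re`, which happens to accept the same placements.

```python
import re


def bracket_for(chars, negate=False):
    """Build a bracket expression matching exactly the given literal characters."""
    wanted = set(chars)
    if wanted & set("\\["):
        raise ValueError("backslash and [ are outside this sketch")
    plain = "".join(sorted(wanted - set("]^-")))
    # A ] is only literal directly after the opening [ (or [^).
    body = ("]" if "]" in wanted else "") + plain
    if "^" in wanted:
        if body or negate:
            body += "^"      # literal, because it is not in first position
        elif "-" in wanted:
            return "[-^]"    # let the - go first so the ^ is not first
        else:
            raise ValueError("a lone ^ cannot be written this way")
    if "-" in wanted:
        body += "-"          # a - is literal when it comes last
    if not body:
        raise ValueError("nothing to match")
    return "[" + ("^" if negate else "") + body + "]"


print(bracket_for("a]^-"))        # []a^-]
print(bracket_for("]", True))     # [^]]
```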

@kabel42 @mirabilos @amin @sotolf @thedoctor

Basically spaces and punctuation.

@rl_dane @kabel42 @amin @sotolf @thedoctor no, literally [^a-zA-Z0-9_]
@mirabilos @rl_dane @amin @sotolf @thedoctor and ^ here is negation?
@kabel42 @rl_dane @amin @sotolf @thedoctor no, [^char-class] matches “any single character, other than newline, not in char-class”
@mirabilos @rl_dane @amin @sotolf @thedoctor yeah, basically what i meant except for the newline maybe

@kabel42 @rl_dane @amin @sotolf @thedoctor yea, I’m just pedantic.

In the RE ^foo[^bar^]baz$ there technically are exactly two carets.

@kabel42 @rl_dane @amin @sotolf @thedoctor this is important when you want to include a ] or - in a bracket expression, and for the newline ofc.

@mirabilos @kabel42 @amin @sotolf @thedoctor

Don't you have to backslash escape a right bracket, like [a-z\]]?

@sotolf @thedoctor @amin @rl_dane @kabel42 not if it’s the first character of a bracket expression, like []a-z]
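
A small Python check of the leading-`]` form. Python's `re` accepts both spellings; in POSIX bracket expressions the backslash is an ordinary character, so there only the leading position works. The sample strings are made up.

```python
import re

leading = re.compile(r"[]a-z]+")   # ] directly after [ is a literal ]
escaped = re.compile(r"[a-z\]]+")  # backslash form, fine in Python's re
not_close = re.compile(r"[^]]+")   # after [^ the ] is literal as well

sample = "ab]c"
print(bool(leading.fullmatch(sample)))  # True
print(bool(escaped.fullmatch(sample)))  # True
print(not_close.findall("a]bc]d"))      # ['a', 'bc', 'd']
```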

@mirabilos @sotolf @thedoctor @amin @kabel42

Ahhhh, good to know. Mentally filed. ;)

@kabel42 @amin @thedoctor @sotolf @rl_dane I often go through logs by first cutting off timestamp and host using rectangle mode in jupp, then replacing ^([^ ]*)\[[^]]*\]: with \1: and sort -uing.

I’ve also used [][0-9a-fA-F:] to match IP addresses…
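
The same log cleanup, sketched in Python instead of an editor plus `sort -u`. The log lines are invented, and timestamp and host are assumed to be cut off already; the pattern and replacement are the ones from the post.

```python
import re

# Hypothetical syslog-style lines, timestamp and host already removed.
lines = [
    "sshd[1234]: Accepted publickey for alice",
    "sshd[5678]: Accepted publickey for alice",
    "cron[42]: (root) CMD (run-parts)",
]

# Program name, then "[pid]:" where the pid part is anything except a ].
pid_tag = re.compile(r"^([^ ]*)\[[^]]*\]:")

# Drop the "[pid]" part, then deduplicate and sort like `sort -u`.
cleaned = sorted({pid_tag.sub(r"\1:", line) for line in lines})
for line in cleaned:
    print(line)
# cron: (root) CMD (run-parts)
# sshd: Accepted publickey for alice

# The bracket expression from the post: ], [, hex digits and colon.
addr_chars = re.compile(r"[][0-9a-fA-F:]+")
print(bool(addr_chars.fullmatch("[2001:db8::1]")))  # True
print(bool(addr_chars.fullmatch("192.0.2.1")))      # False, no dot in the set
```

As written, the set only covers the characters of a bracketed IPv6 literal; it checks which characters appear, not whether the address is valid.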

@mirabilos @kabel42 @amin @thedoctor @sotolf

I love editors with rectangle selection and editing modes. vim has it, and my first exposure to it was actually in Microsoft Word 4.0 for mac. Obviously not something I use today. XD

@rl_dane @mirabilos @amin @thedoctor @sotolf kate had that for a time and now i can't find it anymore... :(

@mirabilos @kabel42 @sotolf @thedoctor @amin

Respect to your efforts, but for me, it's modal editing or die. XD

@rl_dane @kabel42 @sotolf @thedoctor @amin just imagine ^K and ^Q as starting the action and movement modes, respectively, and otherwise you’re in insert mode, with a few shortcuts

@mirabilos @kabel42 @sotolf @thedoctor @amin

Maybe if I had caps lock mapped to control, rather than escape. ;)

@thedoctor @kabel42 @amin @rl_dane @sotolf solvable problem ;) PCs (including my first) did have Control there, after all

@mirabilos @thedoctor @kabel42 @amin @sotolf

I don't recall seeing anything other than unix terminals and workstations with control to the left of "A"

But yeah, capslock is a dumb key, or at least, that's a dumb placement for it. ;)

@rl_dane @thedoctor @kabel42 @amin @sotolf https://commons.wikimedia.org/wiki/File:IBM_Model_F_XT.png used to be the standard layout for PCs, though the F keys could also go up to where they are now (only up to and including F10, mind you)

@rl_dane @thedoctor @kabel42 @amin @sotolf and here I thought you were older than me?

@mirabilos @thedoctor @kabel42 @amin @sotolf

Maybe? I'm half a century, roughly. ;)