File this under #shell #functions I should have written years ago:
function grepc {
#Do a grep -c, but skipping files with no results
grep -c "$@" |grep -v ':0$'
}
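A quick way to see what the function does, as a minimal sketch. The scratch directory and the two file names are made up for the demo. One caveat: with a single file argument, grep -c prints a bare count with no file name, so the ':0$' filter only kicks in when grep prefixes names (several files, -r, or -H).

```shell
# Same function as in the post, written in the portable name() form.
grepc() {
    # Count matches per file, then drop the "file:0" lines.
    grep -c "$@" | grep -v ':0$'
}

# Hypothetical scratch files, invented for this demo.
tmp=$(mktemp -d)
printf 'TODO one\nTODO two\n' > "$tmp/a.txt"
printf 'nothing here\n' > "$tmp/b.txt"

# Run from inside the scratch dir so the output has short file names.
out=$(cd "$tmp" && grepc TODO a.txt b.txt)
echo "$out"

rm -rf "$tmp"
```

This should print a.txt:2 and stay silent about b.txt.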
Oh, didn't know about -c. I usually just pipe to wc -l I guess.
-c, -l, -h, -H, and -q are my favorite #grep flags. :D
Huh, that almost became a [Marcel Duchamp] reference.
I just use -v and -E
...and bash instead of zsh
...and grep/awk/sed instead of jq
...and firefox instead of chrome
...and the fediverse instead of facebook
Face it... I'm an unpopular-opinion neckbeard level boss. XD
cc: @mirabilos
@rl_dane Those are so not comparable!
@sotolf @thedoctor @rl_dane @mirabilos
Mm, not really though? ripgrep is meant for bulk grepping of files
@sotolf @thedoctor @rl_dane @mirabilos
I mostly just use it to run rg TODO and see all the spots in a codebase I marked as still needing work.
@amin @sotolf @thedoctor @mirabilos
Why is ripgrep better than just grep -R?
@kabel42 @amin @sotolf @thedoctor @mirabilos
Interesting! I wonder what kind of algorithmic optimizations (as opposed to compiler optimizations) they're using to do that, and if regular (GNU/BSD) grep could do the same.
Because I'll wear clown shoes and a tutu before changing to a "rewrite the world in rust!" utility
@kabel42 @rl_dane @amin @sotolf @thedoctor eww, it’s not even a drop-in then…
(For not-a-drop-in, I found pcregrep interesting. Sadly, Debian recently dropped it, but in the versions which don’t have pcregrep any more, you can use grep -P for many use cases. pcre2grep is not a drop-in for pcregrep either…)
@mirabilos @kabel42 @amin @sotolf @thedoctor
I was a total PCRE stan in the olden days, but I've steered more towards regular extended regexp for compatibility. I do miss \d, \w and \s, though. [[:space:]] feels so clumsy to type and use several times in a regex, I'll sometimes put a sp="[[:space:]]" line at the start of a script, and you'll see several invocations of "${sp}" in my regex strings.
But again... compatibility. ;)
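The sp trick could look roughly like this. It's only a sketch: the input line and the "name" key are invented for the demo, and sed -E is assumed to be available (GNU and BSD sed both have it).

```shell
# Shorthand for the POSIX whitespace class, as described in the post.
sp='[[:space:]]'

# Hypothetical input: a key, messy whitespace (spaces and a tab), a value.
line=$(printf 'name  =\t value')

# ERE: strip "name", any whitespace, "=", any whitespace from the front.
out=$(printf '%s\n' "$line" | sed -E "s/^name${sp}*=${sp}*//")
echo "$out"
```

The double quotes around the sed script matter here, since "${sp}" has to be expanded by the shell before sed sees the regex.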
Is there a big difference between (GNU) grep -P and pcregrep? I hadn't heard of that utility before.
@amin @kabel42 @rl_dane @sotolf @thedoctor I never used \d and the likes, always felt them much too complicated. I almost never use POSIX character classes (besides the BSD [[:<:]] and [[:>:]]), rather I just hit [ tab space ] quickly.
GNU grep -P does a PCRE grep, it doesn’t support all of the extra flags of pcregrep though, and before the version in IIRC trixie was very broken.
@mirabilos @amin @kabel42 @sotolf @thedoctor
is [[:<:]] and [[:>:]] the same as \< and \>?
@rl_dane @amin @kabel42 @sotolf @thedoctor obviously not, because it’s written differently ;)
re_format(7) knows:
There are two special cases** of bracket expressions: the bracket expres-
sions '[[:<:]]' and '[[:>:]]' match the null string at the beginning and
end of a word, respectively. A word is defined as a sequence of charac-
ters starting and ending with a word character which is neither preceded
nor followed by word characters. A word character is an alnum character
(as defined by ctype(3)) or an underscore. This is an extension, compati-
ble with but not specified by POSIX, and should be used with caution in
software intended to be portable to other systems.
(as for the mark:)
POSIX leaves some aspects of RE syntax and semantics open; '**' marks de-
cisions on these aspects that may not be fully portable to other POSIX
implementations.
The definition for \< / \> differs between less, perlre, pcre, … I believe, but they all are somewhat similar.
@rl_dane @amin @kabel42 @sotolf @thedoctor perlre(1) actually has…
A word boundary ("\b") is a spot between two characters that
has a "\w" on one side of it and a "\W" on the other side of
it (in either order), counting the imaginary characters off
the beginning and end of the string as matching a "\W".
… so the \< probably comes from less(1)?
… hm, no. But, where then?
@mirabilos @amin @kabel42 @sotolf @thedoctor
I used to use \b a lot, but \< and \> are just as easy to use, and POSIX. ;)
\w is nice, though. I think the closest POSIX one is [[:graph:]]? (Not super close, though)
@rl_dane @amin @kabel42 @sotolf @thedoctor \< and \> are not POSIX.
perlre(1) \w is identical to POSIX [a-zA-Z0-9_] in the C locale, so [[:alnum:]_] if you have support for POSIX character classes.
@mirabilos @amin @kabel42 @sotolf @thedoctor
Ah, yes. [[:alnum:]] was the one I was thinking of.
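For the record, a small sketch of the \w replacement in action. The sample text is made up, and grep -o is a GNU/BSD extension rather than POSIX, so treat that part as an assumption about your grep.

```shell
# A sample line, invented for the demo.
text='foo_bar-baz 42'

# [[:alnum:]_]+ is the POSIX-class spelling of \w+ in the C locale.
# grep -o (print only the matched parts) is a GNU/BSD extension, not POSIX.
out=$(printf '%s\n' "$text" | LC_ALL=C grep -oE '[[:alnum:]_]+')
echo "$out"
```

The underscore stays inside a word while the hyphen and the space split words, so this should give foo_bar, baz and 42 on separate lines.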
@mirabilos @amin @kabel42 @sotolf @thedoctor
Waiiiiit, what does the underscore before the second bracket do? I've never seen that before.
No mention of it in RE_FORMAT(7) on FreeBSD.
[a-zA-Z0-9_], and I’d be surprised if the FreeBSD manpage would not document it
@rl_dane @amin @kabel42 @sotolf @thedoctor let me blow your mind if that was news to you:
[[:alpha:][:digit:]_]
[ switches from RE context to RE-Bracket context in the bracket-begin state, in which you can have an optional ^ (except in shellglobs where it is spelt !), then an optional ] not taken as the end of the RE-Bracket, then an optional -, then any amount of expressions of the type a-z, [:charclass:], [=equivalenceclass=], x, then an optional -, then a closing ] which terminates the RE-Bracket context.
] or the - at the beginning, not both)
[a[.ch.]] in e.g. es_ES.UTF-8 matches either a or ch, so a bracket expression in POSIX has a variable matching length…)
[[:<:]] and [[:>:]] extension (which matches a zero-length string)
@kabel42 @rl_dane @amin @sotolf @thedoctor (does your font lack the β ?)
https://toot.mirbsd.org/@mirabilos/statuses/01KGZBC68X0E7AQEK692CGSK65
@kabel42 @rl_dane @amin @sotolf @thedoctor and, duh, it’s a Fediverse link, you copy/paste it into the Search form of your client to read it, not the browser…
… I wish GtS would go on and support web+ap://…
@kabel42 @mirabilos @amin @sotolf @thedoctor
I think he's using the German keyboard on his phone.
@sotolf Definitely not. I can readily do ø and Γ but not that one.
@kabel42 @mirabilos @amin @sotolf @thedoctor
Basically spaces and punctuation.
[^char-class] matches “any single character, other than newline, not in char-class”
@kabel42 @rl_dane @amin @sotolf @thedoctor yea, I’m just pedantic.
In the RE ^foo[^bar^]baz$ there technically are exactly two carets.
] or - in a bracket expression, and for the newline ofc.
@mirabilos @kabel42 @amin @sotolf @thedoctor
Don't you have to backslash escape a right bracket, like [a-z\]]?
[]a-z]
@mirabilos @sotolf @thedoctor @amin @kabel42
Ahhhh, good to know. Mentally filed. ;)
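A tiny check of the "] goes first" rule, as a sketch. The input string is invented, and grep -o is assumed (GNU/BSD extension, not POSIX).

```shell
# ']' right after the opening '[' is a literal member of the set.
# grep -o is a GNU/BSD extension; tr glues the matches back together.
out=$(printf 'a]b!\n' | LC_ALL=C grep -o '[]a-z]' | tr -d '\n')
echo "$out"
```

The ! is not in the set, so only a, ] and b come through and the result should be a]b.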
@kabel42 @amin @thedoctor @sotolf @rl_dane I often go through logs by first cutting off timestamp
and host using rectangle mode in jupp, then replacing ^([^ ]*)\[[^]]*\]: with \1: and sort -uing.
I’ve also used [][0-9a-fA-F:] to match IP addresses…
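Outside an editor, the same log cleanup might be sketched with sed instead of jupp's search and replace. This is a swapped-in tool, not what the post describes doing, and the log lines are hypothetical, with timestamp and host assumed to be already removed.

```shell
# Hypothetical log lines, timestamp and host already cut off.
log='sshd[1234]: Connection closed
sshd[5678]: Connection closed
cron[42]: job started'

# Drop the "[pid]" part after the program name, then de-duplicate.
# [^]]* means "anything but ]": the ] comes first, so it is literal.
out=$(printf '%s\n' "$log" |
    sed -E 's/^([^ ]*)\[[^]]*\]:/\1:/' |
    LC_ALL=C sort -u)
echo "$out"
```

With the PIDs gone, the two sshd lines collapse into one, leaving one cron line and one sshd line.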
@mirabilos @kabel42 @amin @thedoctor @sotolf
I love editors with rectangle selection and editing modes. vim has it, and my first exposure to it was actually in Microsoft Word 4.0 for mac. Obviously not something I use today. XD
@kabel42 @mirabilos @amin @thedoctor @sotolf
Looking online... is it ctrl+shift+B?
@thedoctor @rl_dane @kabel42 @sotolf @amin why not?
@bentsukun made the first editions of the MirBSD flyers in Quark Xpress on MacOS.
@kabel42 @mirabilos @amin @sotolf @thedoctor
Aye. You can use it in bracket expressions, even with character classes, like:
[^0-9] Everything but digits
[^[:space:]] Everything but spaces
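Both forms can be tried out with sed, roughly like this. The sample text is made up for the demo.

```shell
# Hypothetical sample text.
text='a1 b2 c3'

# Delete everything that is not a digit.
digits=$(printf '%s\n' "$text" | sed 's/[^0-9]//g')

# Replace everything that is not whitespace with an x.
mask=$(printf '%s\n' "$text" | sed 's/[^[:space:]]/x/g')

echo "$digits"
echo "$mask"
```

The first line of output should be 123, the second xx xx xx.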
@mirabilos @kabel42 @amin @sotolf @thedoctor
The caret? Outside of brackets, the caret matches the beginning of the line.
[^foo] and the other goes [foo]
@mirabilos @kabel42 @amin @sotolf @thedoctor
(as long as it's not the first character after the [, kind of like - is a hyphen character only if it's the last character before the ])
Everything is fine.
@kabel42 @rl_dane @amin @sotolf @thedoctor Python and py3k use Perl-style (PCRE-like) regexes.
Shell globs have [!…] for negated bracket expressions.
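A minimal sketch of the glob spelling in a case statement. The classify function is a made-up helper, not something from the thread.

```shell
# classify is a hypothetical helper for this demo.
classify() {
    case $1 in
        # [!0-9] is the glob form of the regex [^0-9].
        [!0-9]*) echo "starts with a non-digit" ;;
        *)       echo "starts with a digit (or is empty)" ;;
    esac
}

first=$(classify abc)
second=$(classify 9lives)
echo "$first"
echo "$second"
```

An empty argument falls through to the second branch, because [!0-9] still has to match one character.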
@kabel42 @rl_dane @amin @sotolf @thedoctor even jupp REs use \[^…] for them.
I can do PCRE, too: https://mbsd.evolvis.org/cvs.cgi/contrib/code/Snippets/uricheck.py?rev=HEAD :þ