Curious which words (>3 letters) in your system dictionary have all the letters in alphabetical order? Sate your curiosity with a little #awk:

$ awk 'length>3 && /^a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*$/' /usr/share/dict/words

Optionally sort them by length:

$ awk 'length>3 && /^a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*$/{print length, $0}' /usr/share/dict/words | sort -n

which gives me "billowy" and "beefily" as words of interest. If you don't like repeated letters, use "?" instead of "*":

$ awk 'length>3 && /^a?b?c?d?e?f?g?h?i?j?k?l?m?n?o?p?q?r?s?t?u?v?w?x?y?z?$/{print length, $0}' /usr/share/dict/words | sort -n

which gives "almost", "biopsy", and "chintz" as nice long runs.
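To compare the two patterns without relying on whatever happens to be in /usr/share/dict/words, here is a minimal sketch that pipes a few hand-picked words through both. The function names `ordered_star` and `ordered_opt` are made up for this example, and the word list is only sample input.

```shell
# Letters in alphabetical order, repeated letters allowed (the "*" version).
ordered_star() {
  awk 'length>3 && /^a*b*c*d*e*f*g*h*i*j*k*l*m*n*o*p*q*r*s*t*u*v*w*x*y*z*$/'
}

# Letters in strictly increasing order, no repeats (the "?" version).
ordered_opt() {
  awk 'length>3 && /^a?b?c?d?e?f?g?h?i?j?k?l?m?n?o?p?q?r?s?t?u?v?w?x?y?z?$/'
}

words='billowy beefily almost biopsy chintz zebra ace'

# "zebra" is out of order and "ace" is too short, so both are dropped.
printf '%s\n' $words | ordered_star
echo ---
# "billowy" and "beefily" contain doubled letters, so only the "*" version keeps them.
printf '%s\n' $words | ordered_opt
```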

@gumnos looking at those regexes first makes me shudder at the amount of backtracking they would kick off, then makes me remember just how unbelievably fast modern CPUs actually are.

@gnomon

Given the ratcheting nature of them and the initial anchoring at the front/back, there's minimal backtracking. The first letter of a word zips right through zero-of-everything-before, and the instant an out-of-sequence letter is found, it rejects. It may be *ugly*, but it's fast 😆

@gumnos awwww *pats regex on the head* me too l'il guy

@gnomon

If you don't mind uglier and are using PCRE2 instead of awk's EREs, you can make the * operators possessive (greedy, and never giving characters back) with *+ like

^a*+b*+c*+d*+e*+f*+g*+h*+i*+j*+k*+l*+m*+n*+o*+p*+q*+r*+s*+t*+u*+v*+w*+x*+y*+z*+$

which backtracks at roughly O(N) rather than O(N log N) 😉

https://regex101.com/r/hjuQRY/1

(changing "billowya" to "billowy" goes from not matching in 29 steps to matching in 30 steps, compared to the plain greedy "*" operator, which rejects "billowya" in 120 steps while accepting "billowy" in 30)

(and yes, I used regex to edit my regex… 😆)

@gnomon @gumnos
(sorry but I like blabbering about regexp implementation, so...)
there's no need for backtracking *at all* in regexps; you need it only for some extensions (which are no longer regular expressions in the mathematical sense).

Russ Cox gives a very nice explanation here: https://swtch.com/~rsc/regexp/regexp2.html

Regular Expression Matching: the Virtual Machine Approach
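For this particular pattern the no-backtracking idea can be sketched by hand. It is not the virtual-machine approach from the article, just a hand-rolled single pass whose only state is the alphabet position of the previous letter. The names `ordered_scan` and `ordered` are made up for this example.

```shell
# One left-to-right pass over each word; nothing is ever re-read,
# so there is nothing to backtrack over.
ordered_scan() {
  awk '
    function ordered(w,    i, pos, prev) {
      prev = 0
      for (i = 1; i <= length(w); i++) {
        pos = index("abcdefghijklmnopqrstuvwxyz", substr(w, i, 1))
        if (pos == 0 || pos < prev) return 0   # not a-z, or out of order
        prev = pos
      }
      return 1
    }
    length > 3 && ordered($0)
  '
}

# Keeps "billowy" and "almost"; rejects "billowya" and "zebra".
printf '%s\n' billowy billowya almost zebra | ordered_scan
```

Each character is looked at exactly once, so the work per word is linear in its length, which is the behaviour an automaton-based matcher would give for the original "*" regex.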