Special points for ^$ vs \A\z with Ruby.
Ruby handles regular expression anchors differently from most languages:
^ and $ always match at the beginning and end of each line, as if
multi-line mode were enabled by default. This is not the case in, for
instance, Perl and many other programming languages.
To demonstrate this behavior, compare the two command lines below:
$ perl -e '$a="foo\nbar"; $a =~ /^foo$/ ? print "match" : print "no match"'
no match
$ ruby -e 'a="foo\nbar"; if a =~ /^foo$/; puts "match"; \
else puts "no match"; end'
match
The string "foo\nbar" does not match the regular expression /^foo$/ in the
Perl code snippet, but it does match in the Ruby code snippet.
The main problem with this regular expression handling is that quite a lot
of developers are not aware of this subtle difference. This results in
improper checks and validations. As an example, the controller below comes
close to what can be observed in real-world code (the regex is somewhat
simplified here):
class PingController < ApplicationController
  def ping
    # Vulnerable: ^ and $ anchor to a line, not to the whole string
    if params[:ip] =~ /^\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}$/
      render :text => `ping -c 4 #{params[:ip]}`
    else
      render :text => "Invalid IP"
    end
  end
end
The developer expects the above IP address validation to accept only
digits and dots. But because Ruby's ^ and $ match per line, the check can
be circumvented by a string like "1.2.3.4\nsomething". The $ in the regex
matches right before the \n, so the first line alone satisfies the check
and the rest of the string reaches the shell unvalidated. The above code
is therefore command injectable with a simple request like this:
$ curl localhost:3000/ping/ping -H "Content-Type: application/json" \
--data '{"ip" : "127.0.0.999\n id"}'
Instead of ^ and $, the anchors \A and \z should be used. They match the
beginning and end of the string, rather than the beginning or end of a
line.
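The difference can be sketched in a few lines of plain Ruby. This is only
an illustration: the helper names line_anchored_ip? and valid_ip? are
hypothetical, and the dots in the pattern are escaped here, which the
simplified controller regex does not do.

```ruby
# Hypothetical helper using the per-line anchors ^ and $ (vulnerable)
def line_anchored_ip?(input)
  /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.match?(input)
end

# Hypothetical helper using the whole-string anchors \A and \z
def valid_ip?(input)
  /\A\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\z/.match?(input)
end

payload = "127.0.0.999\n id"

puts line_anchored_ip?(payload)  # true: the first line alone satisfies ^...$
puts valid_ip?(payload)          # false: \z only matches at the end of the string
puts valid_ip?("127.0.0.1")      # true
```

Note that \z is used rather than \Z, since \Z would still accept a single
trailing newline.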
Another common case where this regex behavior matters is the verification
of user-supplied links. For instance, the regex /^https?:\/\// can be
bypassed by supplying a link like:
"javascript:alert('lol')/\nhttp:///" (note the newline)
When this input is rendered into the href attribute of an anchor tag, the
result is straightforward Cross-Site Scripting.
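The same fix applies to the link check. The following is a minimal sketch,
assuming a hypothetical helper named safe_link? that only inspects the
scheme prefix; a real application would likely want stricter URL parsing
on top of this.

```ruby
# Hypothetical helper: accept only links that start with http:// or https://
def safe_link?(url)
  %r{\Ahttps?://}.match?(url)
end

payload = "javascript:alert('lol')/\nhttp:///"

puts %r{^https?://}.match?(payload)      # true: the second line starts with http://
puts safe_link?(payload)                 # false: the string starts with javascript:
puts safe_link?("https://example.com/")  # true
```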
