Regular Expressions
Revision as of 15:10, 9 January 2014
References
- Regular-Expressions.info, The Premier website about Regular Expressions
- Regular expression on Wikipedia
Engines
Powerful engines:
- Perl
- PCRE: open source regex engine, used by PHP for instance
Less powerful engines:
- sed: use extended regular expressions (switch -r) so that the meta-characters (){} have their special meaning when unquoted.
Character Classes
Class | Meaning | Comment |
---|---|---|
[ae] | Matches a or e | |
[a-z] | Matches any char in range a...z | |
[^a-z] | Matches any char not in range a...z | |
\d | Digit - Equivalent to [0-9] | |
\w | Word character - Equivalent to [A-Za-z0-9_] | |
\s | Whitespace character - Equivalent to [ \t\r\n] | Most engines also match form feed (\f) |
\D | Negated \d, i.e. [^\d] | |
\W | Negated \w, i.e. [^\w] | |
\S | Negated \s, i.e. [^\s] | |
About negated classes: note that [\D\S] is not the same as [^\d\s]. The latter matches only characters that are neither a digit nor whitespace. The former matches any character that is either not a digit or not whitespace; since no character is both a digit and whitespace, it matches any character.
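The difference between the two classes is easy to check. The sketch below assumes Python's `re` module (the page itself does not name a language) and uses a made-up sample string:

```python
import re

# Hypothetical sample: a letter, a digit, a space and a tab.
sample = "a1 \t"

# [\D\S] is the union of "not a digit" and "not whitespace",
# so every character of the sample matches.
union_matches = re.findall(r"[\D\S]", sample)

# [^\d\s] excludes digits and whitespace, so only the letter matches.
negated_matches = re.findall(r"[^\d\s]", sample)

print(union_matches)
print(negated_matches)
```

Only the letter survives the second class, while the first class accepts all four characters.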
Zero-length matches
The regexes in this section are zero-length, meaning they match a zero-length string, either because they match a particular position in the string (such as the end-of-line or beginning-of-line anchors), or because the matched string is dropped after evaluation (like assertions, which only yield a boolean value: matched or not matched).
Anchors:
Anchor | Meaning | Comment |
---|---|---|
^ | Beginning-of-line anchor | |
$ | End-of-line anchor | |
\b | Word boundary anchor | |
\B | Negated word boundary anchor | |
\< | Start-of-a-word anchor | GNU extension |
\> | End-of-a-word anchor | GNU extension |
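As a rough illustration of these anchors, here is a sketch assuming Python's `re` module and made-up sample strings. Python's `re` does not provide the GNU `\<` and `\>` anchors, so they are left out.

```python
import re

text = "one two\nthree four"  # hypothetical two-line sample

# With re.MULTILINE, ^ and $ match at the start and end of every line.
first_words = re.findall(r"^\w+", text, re.MULTILINE)
last_words = re.findall(r"\w+$", text, re.MULTILINE)

words = "cat concat cats"  # hypothetical sample

# \bcat\b only matches "cat" as a whole word (start offsets are collected).
whole_word = [m.start() for m in re.finditer(r"\bcat\b", words)]

# \Bcat only matches "cat" when it does NOT start a word (inside "concat").
inside_word = [m.start() for m in re.finditer(r"\Bcat", words)]

# An anchor on its own matches a position: the match is the empty string.
end_anchor = re.search(r"$", "abc")
end_span = end_anchor.span()
end_text = end_anchor.group()

print(first_words, last_words, whole_word, inside_word, end_span)
```

The last match has identical start and end offsets, which is what "zero-length" means in practice.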
Assertions:
Assertion | Meaning | Comment |
---|---|---|
(?=regex) | Lookahead positive assertion | e.g. \b(?=\w{0,3}cat)\w{6}\b , matches locate |
(?!regex) | Lookahead negative assertion | e.g. \b(?!\w{0,3}cat)\w{6}\b , matches relica but not locate |
(?<=regex) | Lookbehind positive assertion | |
(?<!regex) | Lookbehind negative assertion | |
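The two lookahead examples from the table can be tried directly. The sketch below assumes Python's `re` module, which supports all four assertions (lookbehind only with fixed-width patterns); the price string used for the lookbehind cases is a made-up illustration, not taken from this page.

```python
import re

words = "locate relica"

# Positive lookahead: six-letter words with "cat" starting within
# the first four letters.
positive = re.findall(r"\b(?=\w{0,3}cat)\w{6}\b", words)

# Negative lookahead: six-letter words without such a "cat".
negative = re.findall(r"\b(?!\w{0,3}cat)\w{6}\b", words)

prices = "cost $42 or 17"  # hypothetical sample

# Positive lookbehind: digits directly preceded by a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: whole numbers NOT preceded by a dollar sign.
not_after_dollar = re.findall(r"(?<!\$)\b\d+", prices)

print(positive, negative, after_dollar, not_after_dollar)
```

In each case the asserted text is only tested, never consumed: the dollar sign, for instance, is not part of the returned match.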
Examples
Sed - The list below is for extended regular expressions (switch -r).
Regexp | Description |
---|---|
. | Match any character |
gray|grey | Match gray or grey |
gr(a|e)y | Match gray or grey |
gr[ae]y | Match gray or grey |
file[^0-2] | Match file3 or file4, but not file0, file1, file2. |
colou?r | (zero or one) - Match color or colour. |
ab*c | (zero or more) - Match ac, abc, abbc, .... |
ab+c | (one or more) - Match abc, abbc, abbbc, .... |
a{3,5} | (at least 3 and not more than 5 times) - Match aaa, aaaa, aaaaa. |
^on single line$ | (start and end of line) - Match a line that contains exactly the text on single line. |
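The patterns in the table above are written for sed -r, but they use syntax common to most engines. As a sanity check, the sketch below runs a few of them through Python's `re` module (an assumption, since the page targets sed); the `matches` helper and the test strings are hypothetical.

```python
import re

def matches(pattern: str, subject: str) -> bool:
    """Return True if the pattern matches anywhere in the subject."""
    return re.search(pattern, subject) is not None

# (pattern, subject, expected result)
cases = [
    (r"gray|grey", "grey", True),
    (r"gr(a|e)y", "gray", True),
    (r"gr[ae]y", "groy", False),
    (r"file[^0-2]", "file3", True),
    (r"file[^0-2]", "file1", False),
    (r"colou?r", "colour", True),
    (r"ab*c", "ac", True),
    (r"ab+c", "ac", False),
    (r"a{3,5}", "aa", False),
    (r"^on single line$", "on single line", True),
    (r"^on single line$", "not on single line", False),
]

results = [(p, s, matches(p, s)) for p, s, _ in cases]
for pattern, subject, matched in results:
    print(f"{pattern!r} on {subject!r}: {matched}")
```

Note that `re.search` looks for a match anywhere in the subject, much like sed does on each input line, so only the last pattern, which carries explicit anchors, constrains the whole line.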
Regex Golf
- http://regex.alf.nu/
My solutions so far (other scores are listed at http://www.reddit.com/r/programming/comments/1tb0go/regex_golf/):
- Plain strings (207): foo
- Anchors (208): k$
- Ranges (202): ^[a-f]+$
- Backrefs (201): (...).*\1
- Abba (190): ^(?!.*(.)(.)\2\1).*$
- A man, a plan (176): ^(.)(.).*\2\1$
- Prime (232): ^(xx|xxx|x{5}|x{7}|x{11}|x{13}|x{17}|x{19}|x{23}|x{29}|x{31})$|x{33}
- Four (198): (.).\1.\1.\1
- Order (156): ^a?b?c?c?d?e?e?f?g?h?i?l?l?m?n?o?o?p?r?s?s?t?t?y?w?z?$
- Triples (0): NONE
- Glob: NONE
- Balance: NONE
- Powers (60): ^(((((((((xx?)\9?)\8?)\7?)\6?)\5?)\4?)\3?)\2?)\1?$
- Long count (0)
- Long count v2 (0)
- Alphabetical (0)
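A few of the solutions above can be tried locally. The sketch below assumes Python's `re` module; the test words are made up for illustration and are not the word lists used by regex.alf.nu.

```python
import re

# Solutions copied from the list above.
backrefs = re.compile(r"(...).*\1")
abba = re.compile(r"^(?!.*(.)(.)\2\1).*$")
plan = re.compile(r"^(.)(.).*\2\1$")

# Backrefs: some three-character sequence occurs twice.
backrefs_hit = backrefs.search("abcxabc") is not None
backrefs_miss = backrefs.search("abcdef") is not None

# Abba: reject any string containing an a-b-b-a style sequence.
abba_hit = abba.search("banana") is not None
abba_miss = abba.search("abba") is not None

# A man, a plan: the first two characters are mirrored at the end.
plan_hit = plan.search("kayak") is not None
plan_miss = plan.search("hello") is not None

print(backrefs_hit, backrefs_miss, abba_hit, abba_miss, plan_hit, plan_miss)
```

The Abba solution relies on a negative lookahead, so it matches exactly the strings in which the mirrored pattern never occurs.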