1.1 Basic RE2
Felix provides Google's RE2 engine for regular expressions. The basic syntax and capabilities are a subset of those of Perl-compatible regular expressions (PCRE), but RE2 actually works correctly and performs well. RE2 does not support backreferences.
1.1.2 Compiling a regexp.
A regexp can be compiled by passing the pattern string to RE2:
var r = RE2(" *([A-Za-z_][A-Za-z0-9]*).*");
Matching is done with the Match function:
var line = "Hello World";
var maybe_subgroups = Match (r, line);
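Match only supports a complete match: the pattern must account for the whole string. There's no searching or partial matching. Instead, just use repeated wildcards, as in the pattern above, which allows leading spaces and ends with .* to absorb the rest of the line.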
The best way to check the result of a Match is with a pattern match:
match maybe_subgroups with
| #None => println$ "No match";
| Some a => println$ "Matched " + a.1;
endmatch;
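To illustrate the complete-match rule, here is a minimal sketch that uses only the RE2, Match and println calls already shown in this section. The variable names exact and padded are made up for the example. If Match behaves as described, the first pattern should fail on the sample line because it does not account for the space and the second word, while the second pattern should succeed and capture the first identifier.

```felix
// Sketch only: assumes RE2, Match and println$ behave as described above.
var exact = RE2("([A-Za-z_][A-Za-z0-9]*)");       // no wildcards around the group
var padded = RE2(" *([A-Za-z_][A-Za-z0-9]*).*");  // leading spaces, trailing wildcard
var text = "Hello World";

// The whole string must match, so a bare identifier pattern is not enough here.
var r_exact = Match (exact, text);
match r_exact with
| #None => println$ "exact: no match";
| Some a => println$ "exact: matched " + a.1;
endmatch;

// The trailing .* absorbs the rest of the line; group 1 holds the identifier.
var r_padded = Match (padded, text);
match r_padded with
| #None => println$ "padded: no match";
| Some a => println$ "padded: matched " + a.1;
endmatch;
```

The design point is that any "search" has to be spelled out in the pattern itself: whatever may surround the part of interest needs its own wildcard.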
1.1.5 Streamable matching
You may want to match more than one instance of a pattern in a string. For example, you may want to capture each word in a line of text. This can be done by iterating over a pair of the regex and the string, like the following:
var r2 = RE2("\w+"); // try to match a word
var sentence = "Hello World";
for x in (r2, sentence) do
  println$ x.0;
done
If you use the simple method, a single call to Match, you'll get at most one match, but with the for loop you get every match.
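As a small variation on the loop above, the following sketch counts the words while printing them. It assumes the same iteration form, for x in (regex, string), with x.0 holding the matched text. The names words, text and count are made up for the example, and a character class is used instead of \w to avoid any question of string escaping.

```felix
// Sketch only: relies on the for-loop iteration over (RE2, string) shown above.
var words = RE2("[A-Za-z]+");   // one or more letters
var text = "one two three";
var count = 0;

for x in (words, text) do
  println$ x.0;                 // the text of the current match
  count = count + 1;
done

println$ "words found: " + count.str;
```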
1.1.6 Supported Syntax.
See RE2 syntax.
1.2 Regular definitions.
Regular expressions are quoting hell. Luckily, Felix provides a solution, regular definitions:
begin
  regdef lower = charset "abcdefghijklmnopqrstuvwxyz";
  regdef upper = charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  regdef digit = charset "0123456789";
  regdef alpha = upper | lower;
  regdef cid0 = alpha | "_";
  regdef cid1 = cid0 | digit;
  regdef cid = cid0 cid1 *;
  regdef space = " ";
  regdef white = space +;
  regdef integer = digit+;
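These are some basic definitions. Note that regdef introduces a new syntax corresponding with the notation usually used for regular expressions: juxtaposition for sequencing, | for alternatives, and postfix *, + and ? for repetition and optional parts.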
This is called a DSSL, or Domain Specific Sub-Language. It's not a DSL, because that would be a complete new language; rather, the "sub" suggests it's an extension of normal Felix. The extension is written entirely in user space.
Now to use these definitions:
  // match an assignment statement
  regdef sassign =
    white? "var" white? group (cid) white? "=" white?
    (group (cid) | group (integer)) white? ";" white? ;

  var rstr : string = sassign.Regdef::render;
  var ra = RE2 rstr;
  var result = Match (ra, " var a = b; ");
  match result with
  | #None => println$ "No match?";
  | Some groups =>
    if groups.2 != "" do
      println$ "Assigned " + groups.1 + " from variable " + groups.2;
    else
      println$ "Assigned " + groups.1 + " from integer " + groups.3;
    done;
  endmatch;
end
Assigned a from variable b
Note that a regdef variable cannot be passed to RE2 directly. It must first be converted to a Perl regexp in string form using the render function, as in sassign.Regdef::render above.
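The conversion step can be seen in isolation with a smaller definition. This is a minimal sketch that reuses only constructs from the example above (regdef, charset, group, Regdef::render, RE2 and Match). The names digit, number, pattern and rn are chosen for illustration. Printing the rendered string shows what is actually handed to RE2; its exact form depends on the renderer, so no particular output is assumed here.

```felix
begin
  // Sketch only: a tiny regular definition for an unsigned integer.
  regdef digit = charset "0123456789";
  regdef number = group (digit+);

  // Render the definition to a regexp string, then compile that string.
  var pattern : string = number.Regdef::render;
  println$ "Rendered regexp: " + pattern;

  var rn = RE2 pattern;
  var res = Match (rn, "12345");
  match res with
  | #None => println$ "No match";
  | Some g => println$ "Matched integer " + g.1;
  endmatch;
end
```

Keeping the rendered string in its own variable, as the original example does with rstr, makes it easy to inspect when a definition does not match what you expect.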