Regexp::Parser::Handlers 0.20 review
Regexp::Parser::Handlers is a Perl module with handlers for Perl 5 regexes.
This module holds the init() method for the Regexp::Parser class, which installs all the handlers for standard Perl 5 regexes. This documentation contains a sub-classing tutorial.
SUB-CLASSING
I will present two example sub-classes, Regexp::NoCode, and Regexp::AndBranch.
Parser Internals
The parser object is a hash reference with the following keys:
regex
A reference to the original string representation of the regex.
len
The length of the original string representation of the regex.
tree
During the first pass, tree is undef, which instructs the object() method not to actually create any objects. Afterwards, it is an array reference of (node) objects.
stack
Initially an array reference, used to store the tree as a new scope is entered and then exited. The general concept is:
  if (into_scope) {
    push STACK, TREE;
    TREE = CURRENT->DATA;
  }
  if (outof_scope) {
    TREE = pop STACK;
  }
After the tree has been created, this key is deleted; this gives the code a way to be sure compilation was successful.
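The scope handling described above can be sketched in plain Perl. This is only an illustration, not the module's own code: the `enter_scope` and `leave_scope` helpers and the node layout (a hash with a `data` array) are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser object; only the two keys
# discussed above (tree and stack) are modelled.
my $parser = { tree => [], stack => [] };

# Entering a scope: store the node in the current tree, save that tree
# on the stack, then make the node's data array the current tree.
sub enter_scope {
    my ($p, $node) = @_;
    push @{ $p->{tree} }, $node;
    push @{ $p->{stack} }, $p->{tree};
    $p->{tree} = $node->{data};
}

# Leaving a scope: restore the tree that was active before.
sub leave_scope {
    my ($p) = @_;
    $p->{tree} = pop @{ $p->{stack} };
}

push @{ $parser->{tree} }, 'a';                         # outer scope
enter_scope($parser, { type => 'group', data => [] });
push @{ $parser->{tree} }, 'b';                         # lands inside the group
leave_scope($parser);
push @{ $parser->{tree} }, 'c';                         # outer scope again

# Tree-building is finished, so the stack key is deleted.
delete $parser->{stack};
```

After this runs, the outer tree holds 'a', the group node (whose data holds 'b'), and 'c', and the absence of the stack key signals that the build completed.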
maxpar
The highest number of parentheses. It will end up being identical to nparen, but it is incremented during the initial pass, so that on the second pass (the tree-building), it can distinguish back-references from octal escapes. (The source code to Perl's regex compiler does the same thing.)
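To illustrate why the group count has to be known before the tree is built, here is a minimal sketch of the back-reference versus octal decision. The `classify_escape` helper is hypothetical and simplified; it follows the rule documented in perlre (a single digit 1 to 9 is always a back-reference, while a longer number is a back-reference only if that many groups exist) and is not taken from the module.

```perl
use strict;
use warnings;

# Hypothetical helper: decide how to read a backslash followed by digits,
# given the number of capturing groups counted on the first pass.
sub classify_escape {
    my ($digits, $maxpar) = @_;
    return 'octal'   if $digits =~ /^0/;        # a leading zero is always octal
    return 'backref' if length($digits) == 1;   # \1 .. \9 are always back-references
    # Multi-digit case: a back-reference only if the group exists.
    return $digits <= $maxpar ? 'backref' : 'octal';
}

my $few  = classify_escape('10', 3);    # only 3 groups, so \10 is octal
my $many = classify_escape('10', 12);   # 12 groups, so \10 is a back-reference
```

Without the first-pass count, the second call could not be told apart from the first.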
nparen
The number of OPENs (capturing groups) in the regex.
captures
An array reference to the 'open' nodes.
flags
An array reference of flag values. When a scope is entered, the top value is copied and pushed onto the stack. When a scope is left, the top value is popped and discarded.
It is important to do this copy-and-push before you do any flag-parsing, if you're adding a handle that might parse flags, because you do not want to accidentally affect the previous scope's flag values.
Here is example code from the handler for (?ismx) and (?ismx:...):
  # (?i:...)
  ...
    push @{ $S->{next} }, qw< c) atom >;
  }
  for (split //, $on) {
    if (my $h = $S->can("FLAG_$_")) {
      my $v = $h->(1);          # 1 means this is 'on'
      if ($v) { &Rf |= $v }     # turn the flag on
      else { ... }              # the flag's value is 0
      next;
    }
    # throw an error if the flag isn't supported
  }
  for (split //, $off) {        # "FLAG_" is prepended in the can() call below
    if (my $h = $S->can("FLAG_$_")) {
      my $v = $h->(0);          # 0 means this is 'off'
      if ($v) { &Rf &= ~$v }    # turn the flag off
      else { ... }              # the flag's value is 0
      next;
    }
    # throw an error if the flag isn't supported
  }
You'll probably not be adding handlers that have to parse flags, but if you do, remember to follow this model correctly.
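The copy-and-push rule can be seen in isolation with a plain array standing in for the flags key. This is a sketch under assumptions: the bit values `FLAG_I` and `FLAG_S` are made up for the example, whereas in the module the values come from the FLAG_* methods.

```perl
use strict;
use warnings;

# Hypothetical flag bits, chosen only for this illustration.
use constant { FLAG_I => 0x1, FLAG_S => 0x2 };

my @flags = (FLAG_S);        # outermost scope already has /s set

# Entering "(?i:" : copy-and-push first, then change only the new top value.
push @flags, $flags[-1];
$flags[-1] |= FLAG_I;
my $inner = $flags[-1];      # /s inherited, /i added

# Leaving the group: pop and discard, which restores the outer value.
pop @flags;
my $outer = $flags[-1];      # still only /s

printf "inner=%d outer=%d\n", $inner, $outer;   # prints "inner=3 outer=2"
```

Had the flag been OR-ed into the top value before the push, the outer scope would have kept /i after the group closed.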
next
An array reference of what handles (or "rules") to try to match next.
Requirements:
Perl