Regexp::Parser::Handlers 0.20 review

License: Perl Artistic License
File size: 39K
Developer: Jeff Pinyan

Regexp::Parser::Handlers is a Perl module with handlers for Perl 5 regexes.

This module holds the init() method for the Regexp::Parser class, which installs all the handlers for standard Perl 5 regexes. This documentation contains a sub-classing tutorial.

SUB-CLASSING

I will present two example sub-classes: Regexp::NoCode and Regexp::AndBranch.

Parser Internals

The parser object is a hash reference with the following keys:

regex

A reference to the original string representation of the regex.

len

The length of the original string representation of the regex.

tree

During the first pass, tree is undef, which instructs the object() method not to actually create any objects. Afterwards, it is an array reference of (node) objects.
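The effect of an undefined tree can be illustrated with a minimal sketch. The helper make_node below is hypothetical and is not the module's object() method; it only mimics the described behaviour of creating nothing while tree is undef. Appending the new node to the tree is likewise an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser hash; only the 'tree' key matters here.
sub make_node {
    my ($parser, $type, %args) = @_;
    # First pass: tree is undef, so no object is created.
    return unless defined $parser->{tree};
    my $node = { type => $type, %args };
    push @{ $parser->{tree} }, $node;   # assumption: nodes are appended to the tree
    return $node;
}

my %parser = (tree => undef);
my $first  = make_node(\%parser, 'exact', text => 'a');   # first pass: nothing created

$parser{tree} = [];                                        # second pass begins
my $second = make_node(\%parser, 'exact', text => 'a');   # node is created and stored
```

In this sketch $first is undef, while $second is a hash reference that also appears as the only element of the tree.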

stack

Initially an array reference, used to save the enclosing tree when a new scope is entered and to restore it when that scope is exited. The general concept is:

if (into_scope) {
  push STACK, TREE;
  TREE = CURRENT->DATA;
}
if (outof_scope) {
  TREE = pop STACK;
}

After the tree has been created, this key is deleted; this gives the code a way to be sure compilation was successful.
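The pseudocode above could be written in plain Perl roughly as follows. This is a sketch under stated assumptions, not the module's code: the names enter_scope and leave_scope, the node layout (type, text, data), and the step that adds the group node to the outer tree are all invented for illustration.

```perl
use strict;
use warnings;

my @stack;        # saved outer trees (plays the role of the 'stack' key)
my $tree = [];    # the tree currently being filled (the 'tree' key)

# Hypothetical helper: entering a group.
sub enter_scope {
    my ($node) = @_;
    push @$tree, $node;       # assumption: the group node belongs to the outer tree
    push @stack, $tree;       # push STACK, TREE
    $tree = $node->{data};    # TREE = CURRENT->DATA
}

# Hypothetical helper: leaving a group.
sub leave_scope {
    $tree = pop @stack;       # TREE = pop STACK
}

# Build the shape of a(b)c by hand.
my $root = $tree;
push @$tree, { type => 'exact', text => 'a' };
enter_scope({ type => 'open', data => [] });
push @$tree, { type => 'exact', text => 'b' };
leave_scope();
push @$tree, { type => 'exact', text => 'c' };
```

After this runs, $root holds three nodes (a, the group, c), the group's data holds the single node for b, and the stack is empty again.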

maxpar

The highest capturing-parenthesis number. It ends up identical to nparen, but it is incremented during the initial pass so that the second (tree-building) pass can distinguish back-references from octal escapes. (The source code to Perl's regex compiler does the same thing.)
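To show why the group count is needed before the tree is built, here is a simplified sketch of the decision. The function classify_escape is hypothetical, and the rule it encodes is an approximation of Perl's behaviour: a single digit 1 to 9 is always a back-reference, and a longer number is a back-reference only if at least that many groups exist.

```perl
use strict;
use warnings;

# Hypothetical helper: classify the digits following a backslash,
# given the group count collected during the first pass.
sub classify_escape {
    my ($digits, $maxpar) = @_;
    # Single digits 1-9 are always treated as back-references.
    return 'backref' if $digits =~ /^[1-9]$/;
    # Longer numbers are back-references only if that many groups exist.
    return 'backref' if $digits !~ /^0/ && $digits <= $maxpar;
    # Everything else (including a leading zero) is read as an octal escape.
    return 'octal';
}

my $few  = classify_escape('10', 3);    # only 3 groups in the regex
my $many = classify_escape('10', 12);   # 12 groups in the regex
my $one  = classify_escape('1', 0);     # single digit, no groups at all
```

With three groups, \10 is classified as an octal escape; with twelve groups, the same text is a back-reference. Without the count from the first pass, the second pass could not tell the two apart.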

nparen

The number of OPENs (capturing groups) in the regex.

captures

An array reference to the 'open' nodes.

flags

An array reference of flag values. When a scope is entered, the top value is copied and pushed onto the stack. When a scope is left, the top value is popped and discarded.

It is important to do this copy-and-push before you do any flag-parsing, if you're adding a handle that might parse flags, because you do not want to accidentally affect the previous scope's flag values.

Here is example code from the handler for (?ismx) and (?ismx:...):

# (?i:...) {next} }, qw< c) atom >;
}

for (split //, $on) {
  if (my $h = $S->can("FLAG_$_")) {
    my $v = $h->(1);        # 1 means this is 'on'
    if ($v) { &Rf |= $v }   # turn the flag on
    else { ... }            # the flag's value is 0
    next;
  }
  # throw an error if the flag isn't supported
}

for (split //, $off) {
  if (my $h = $S->can("FLAG_$_")) {
    my $v = $h->(0);        # 0 means this is 'off'
    if ($v) { &Rf &= ~$v }  # turn the flag off
    else { ... }            # the flag's value is 0
    next;
  }
  # throw an error if the flag isn't supported
}

You'll probably not be adding handlers that have to parse flags, but if you do, remember to follow this model correctly.
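The copy-and-push rule described above can be seen in isolation in the following sketch. The bit values in %BIT and the helpers enter_scope and leave_scope are hypothetical; the real module defines its own flag values and scope handling.

```perl
use strict;
use warnings;

# Hypothetical bit values, for illustration only.
my %BIT = (i => 0x1, m => 0x2, s => 0x4, x => 0x8);

my @flags = (0);    # one entry per open scope; the last entry is the current one

sub enter_scope { push @flags, $flags[-1] }   # copy the top value, then push it
sub leave_scope { pop @flags }                # discard the inner scope's value

enter_scope();              # entering something like (?i:...)
$flags[-1] |= $BIT{i};      # only the new top entry changes
my $inner = $flags[-1];     # flags as seen inside the group
leave_scope();
my $outer = $flags[-1];     # flags as seen after the group
```

Because the copy is pushed before the flag is modified, $inner has the i bit set while $outer is still 0. Modifying the top entry before pushing would have leaked the flag into the enclosing scope.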

next

An array reference of what handles (or "rules") to try to match next.

Requirements:
Perl
