HTML::Parser 3.54
HTML::Parser is an HTML parser class
HTML::Parser is an HTML parser class. Objects of the HTML::Parser class recognize markup and separate it from plain text (also called data content) in HTML documents. As each kind of markup or text is recognized, the corresponding event handler is invoked.
HTML::Parser is not a generic SGML parser.
We have tried to make it cope with the HTML that is actually "out there". By default it parses as closely as possible to the way popular web browsers do, rather than strictly following one of the many HTML specifications from the W3C. Where the two disagree, there is often an option that you can enable to get the official behaviour.
The document to be parsed may be supplied in arbitrary chunks, which makes it possible to parse documents on the fly as they are received from the network.
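As a minimal sketch of chunked, event-driven parsing: the handlers below record each event in an array, and the input is fed in two pieces with a start tag deliberately split across the chunk boundary. The `@events` array and the sample HTML are illustrative assumptions, not part of the module. Text may be reported in more than one piece, so the sketch does not rely on exact text boundaries.

```perl
use strict;
use warnings;
use HTML::Parser ();

my @events;   # events recorded in the order the parser reports them

my $p = HTML::Parser->new(
    api_version => 3,
    start_h     => [ sub { push @events, "start:$_[0]" }, "tagname" ],
    end_h       => [ sub { push @events, "end:$_[0]" },   "tagname" ],
    text_h      => [ sub { push @events, "text:$_[0]" },  "dtext" ],
);

# The <b> start tag is deliberately split across two chunks.
$p->parse("<p>Hello <");
$p->parse("b>world</b></p>");
$p->eof;    # flush anything the parser is still holding back

print "$_\n" for @events;
```

The parser buffers the incomplete `<` at the end of the first chunk and reports the `b` start tag only once the rest arrives, so chunk boundaries do not need to line up with markup.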
If event-driven parsing does not feel right for your application, you might want to use HTML::PullParser, an HTML::Parser subclass that allows a more conventional program structure.
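For comparison, here is a hedged sketch of the pull style using HTML::PullParser: instead of registering callbacks, the program asks for the next token in an ordinary loop. The sample document, the `@links` array and the choice to collect `href` values are assumptions made for illustration.

```perl
use strict;
use warnings;
use HTML::PullParser ();

my $html = '<p>See <a href="http://example.com/">this</a> link</p>';

# Request only start and end events; each token is an array reference
# whose fields follow the argspec string given for that event.
my $pp = HTML::PullParser->new(
    doc   => \$html,
    start => 'event, tagname, attr',
    end   => 'event, tagname',
) or die "Cannot create parser: $!";

my @links;
while (defined(my $token = $pp->get_token)) {
    my ($event, $tagname, $attr) = @$token;
    next unless $event eq 'start' && $tagname eq 'a';
    push @links, $attr->{href} if defined $attr->{href};
}

print "$_\n" for @links;
```

The control flow stays in the caller's loop, which can be easier to follow when the processing depends on what was seen earlier in the document.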
SYNOPSIS:

    use HTML::Parser ();

    # Create parser object; handlers are passed as code references
    $p = HTML::Parser->new( api_version => 3,
                            start_h => [\&start, "tagname, attr"],
                            end_h   => [\&end,   "tagname"],
                            marked_sections => 1,
                          );

    # Parse document text chunk by chunk
    $p->parse($chunk1);
    $p->parse($chunk2);
    #...
    $p->eof;                 # signal end of document

    # Parse directly from file
    $p->parse_file("foo.html");
    # or from an already opened filehandle
    open(my $fh, "<:utf8", "foo.html") || die;
    $p->parse_file($fh);
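The synopsis refers to `start` and `end` handlers without defining them. The following is one possible sketch of such handlers, assuming we want an indented outline of the start tags; the `@outline` array, the `$depth` counter and the sample input are hypothetical. It reads from an in-memory filehandle so that it runs without a `foo.html` on disk.

```perl
use strict;
use warnings;
use HTML::Parser ();

my @outline;      # one line per start tag, indented by nesting depth
my $depth = 0;

# Receives the arguments named in the "tagname, attr" argspec.
sub start {
    my ($tagname, $attr) = @_;
    my $attrs = join " ", map { "$_=$attr->{$_}" } sort keys %$attr;
    push @outline,
        ("  " x $depth) . $tagname . (length $attrs ? " [$attrs]" : "");
    $depth++;
}

# Receives the argument named in the "tagname" argspec.
sub end {
    my ($tagname) = @_;
    $depth-- if $depth > 0;
}

my $p = HTML::Parser->new(
    api_version => 3,
    start_h     => [ \&start, "tagname, attr" ],
    end_h       => [ \&end,   "tagname" ],
);

my $html = '<div id="main"><p class="x">Hi</p></div>';
open(my $fh, "<", \$html) or die "Cannot open in-memory file: $!";
$p->parse_file($fh);
close($fh);

print "$_\n" for @outline;
```

This simple depth counter assumes every start tag has a matching end tag; elements such as `br` that have no end tag would need extra handling.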
