
HTML::TableExtract 2.07


Developer:   Matthew P Sisk
Price:  0.00
License:   GPL (GNU General Public License)
File size:   23K


HTML::TableExtract is a Perl module that simplifies the extraction of information from tables within HTML documents.

Tables, no matter how nested or clustered, can be targeted symbolically with column headers or by more specific depth and count information.

Each table is labeled in the first row with its coordinates in terms of depth and count, both of which start at 0. Some of the tables have headers in the second row; in this example these header cells are <th> tags, but header cells can be either <th> or <td>. The remaining cells in each table show that cell's row and column along with the table coordinates, in the form depth,count:row,column. Rows and columns also begin at 0, so the table label and headers, if present, affect these cell coordinates.
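
The coordinate scheme described above can be sketched in a few lines. The following is a minimal illustration in Python using only the standard library; it is not HTML::TableExtract's Perl API. The names TableCoords, label_cells and SAMPLE are hypothetical, and the sketch assumes that count numbers the tables found at a given depth in document order.

```python
from html.parser import HTMLParser


class TableCoords(HTMLParser):
    """Collect cell text keyed by (depth, count, row, column), all 0-based."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tables, innermost last
        self.seen = {}    # depth -> number of tables already seen at that depth
        self.cells = {}   # (depth, count, row, column) -> cell text

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            depth = len(self.stack)           # nesting level of this table
            count = self.seen.get(depth, 0)   # assumed: per-depth counter
            self.seen[depth] = count + 1
            self.stack.append(
                {"depth": depth, "count": count, "row": -1, "col": -1, "key": None}
            )
        elif self.stack:
            table = self.stack[-1]
            if tag == "tr":
                table["row"] += 1
                table["col"] = -1
            elif tag in ("td", "th"):         # header cells may be either tag
                table["col"] += 1
                table["key"] = (
                    table["depth"], table["count"], table["row"], table["col"]
                )
                self.cells[table["key"]] = ""

    def handle_endtag(self, tag):
        if not self.stack:
            return
        if tag == "table":
            self.stack.pop()
        elif tag in ("td", "th"):
            self.stack[-1]["key"] = None

    def handle_data(self, data):
        # Text belongs to the open cell of the innermost open table, if any.
        if self.stack and self.stack[-1]["key"] is not None:
            self.cells[self.stack[-1]["key"]] += data.strip()


def label_cells(html):
    """Return a dict mapping (depth, count, row, column) to cell text."""
    parser = TableCoords()
    parser.feed(html)
    return parser.cells


# Hypothetical input: two top-level tables, each holding one nested table.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Detail</th></tr>
  <tr><td>outer</td><td>
    <table><tr><td>inner A</td></tr></table>
  </td></tr>
</table>
<table>
  <tr><td>second</td><td>
    <table><tr><td>inner B</td></tr></table>
  </td></tr>
</table>
"""

if __name__ == "__main__":
    for (depth, count, row, col), text in sorted(label_cells(SAMPLE).items()):
        print(f"{depth},{count}:{row},{col} {text!r}")
```

Because the header row occupies row 0 of the first table in this sample, its first data cell is addressed as 0,0:1,0, which is the effect on cell coordinates noted above.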

In the illustrations of what is extracted from these tables, content in italics is notational in nature; it was not actually extracted from the tables. In particular, whenever headers are used for extraction, the order in which the headers were provided is noted by listing the headers, but the header row is not actually extracted from the target table.
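
The header behavior described above (columns come back in the order the headers were provided, and the header row itself is not part of the result) can be illustrated with a small sketch. This is Python for illustration only, not the module's Perl interface. The function extract_by_headers and the sample data are hypothetical, and case-insensitive substring matching of header text is an assumption made for this sketch.

```python
def extract_by_headers(rows, headers):
    """Return the data rows below the header row, with columns in `headers` order.

    `rows` is a table already split into lists of cell strings. The first row
    in which every requested header matches a cell is treated as the header row.
    """
    wanted = [h.lower() for h in headers]
    for i, row in enumerate(rows):
        cells = [c.lower() for c in row]
        cols = []
        for w in wanted:
            # First column whose text contains the requested header.
            match = next((j for j, c in enumerate(cells) if w in c), None)
            if match is None:
                break              # this row is not the header row
            cols.append(match)
        else:
            # Every header matched: skip the header row, reorder the rest.
            return [
                [r[j] if j < len(r) else None for j in cols]
                for r in rows[i + 1:]
            ]
    return []                      # no row contained all requested headers


# Hypothetical table: a label row, a header row, then two data rows.
TABLE = [
    ["0,0 (label row)"],
    ["Date", "Price", "Cost"],
    ["2024-01-02", "10.50", "9.00"],
    ["2024-01-03", "11.25", "9.40"],
]

# Headers requested as Cost, Date: the result follows that order.
result = extract_by_headers(TABLE, ["Cost", "Date"])

if __name__ == "__main__":
    for row in result:
        print(row)
```

Requesting a header that appears in no row yields an empty result in this sketch, since no row qualifies as the header row.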

What's New in This Release:
  • A subtable slicing bug and an hrow() attachment bug were fixed.
  • Tests were added.

    Download HTML::TableExtract 2.07


     http://mirrors.evolva.ro/CPAN/authors/id/M/MS/MSISK/HTML-TableExtract-2.07.tar.gz


    Author's software

    Finance-QuoteHist 1.07 (by Matthew P Sisk)
    Top level aggregator that will select a default lineup of site instances



    Similar software

    PyHtmlTable 1.13 (by Joe Pasko)
    PyHtmlTable is a class for Python CGIs to generate HTML tables on the fly

    TableMatrix 1.22 (by TableMatrix Team)

    PHP SQLDiff 2.2 (by Terry Gliedt)
    PHP SQLDiff is a Web application that shows the difference between two SQL database tables.

    If you manage your database tables lik

    t2t 5.1 (by Steve Scholnick)
    t2t is a Perl script that converts standard ASCII text to HTML 4.0 tables

    PHP Data Grid Class 1.0 Beta (by Stefan Gabos)
    PHP Data Grid Class can be used to display MySQL query results in HTML tables

    HTML::TableTiler 1.21 (by Domizio Demichelis)
    HTML::TableTiler can easily generate complex graphic styled HTML tables.

    HTML::TableTiler uses a minimum HTML table as a tile to

    PackRat 0.28 (by Stewart Allen)
    PackRat is a compact, minimalist personal information manager

    HTML::QuickTable 1.12 (by Nathan Wiger)
    HTML::QuickTable is a Perl module to quickly create fairly complex HTML tables.

    SYNOPSIS

    use HTML::QuickTable;

    my $q

    Basset::DB::Table 1.03 (by Jim Thomason)
    Basset::DB::Table is used to define database tables, ways to load that data into memory and build queries based upon the table inform


    Other software in this category

    zlib 1.2.3 (by Jean-loup Gailly)
    zlib is designed to be a free, general-purpose, legally unencumbered, lossless data-compression library for use on virtually any comp

    libjpeg v6b (by Independent JPEG Group)
    libjpeg is a library for handling the JPEG (JFIF) image format

    OpenSSL 0.9.7c (by The OpenSSL Project Team)
    The OpenSSL Project is a collaborative effort to develop a robust, commercial-grade, full-featured, and Open Source toolkit implement

    libxml2 2.6.27 (by DV)
    Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform), libxml2 library i

    GNU C library 2.4 (by Andreas Jaeger)
    GNU C library (glibc) is one of the most important components of GNU Hurd and most modern Linux distributions.

    GNU C library is us
