
Bug#448872: ITP: libhtmlparser-java -- Java library to parse HTML



Package: wnpp
Severity: wishlist
Owner: Tiago Saboga <tiagosaboga@gmail.com>


* Package name    : libhtmlparser-java
  Version         : 1.6
  Upstream Author : Derrick Oswald
* URL             : http://htmlparser.sourceforge.net/
* License         : LGPL
  Programming Lang: Java
  Description     : Java library to parse HTML

 HTML Parser is a Java library used to parse HTML in either a linear
 or nested fashion. Primarily used for transformation or extraction,
 it features filters, visitors, custom tags, and easy-to-use
 JavaBeans.

 The two fundamental use cases handled by the parser are
 extraction and transformation (the synthesis use case, where HTML
 pages are created from scratch, is better handled by other tools
 closer to the source of data).

 In general, to use HTML Parser you will need to be able to write
 code in the Java programming language. Although some example programs
 are provided that may be useful as they stand, it's more than likely
 you will need (or want) to create your own programs or modify the
 ones provided to match your intended application.

-- System Information:
Debian Release: lenny/sid
  APT prefers testing
  APT policy: (500, 'testing'), (500, 'stable'), (50, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.22sofocles1 (SMP w/1 CPU core)
Locale: LANG=pt_BR.UTF-8, LC_CTYPE=pt_BR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash



