
Bug#359170: RFP: tagsoup - SAX-compliant HTML parser for Java



Package: wnpp
Severity: wishlist

Package name: tagsoup
Version: 1.0rc3
Upstream Author: John Cowan <cowan@ccil.org>
URL: http://mercury.ccil.org/~cowan/XML/tagsoup/
License: GPL or AFL
Description: SAX-compliant HTML parser for Java

TagSoup is a SAX-compliant parser written in
Java that, instead of parsing well-formed or valid XML, parses HTML as
it is found in the wild: nasty and brutish, though quite often far
from short. TagSoup is designed for people who have to process this
stuff using some semblance of a rational application design. By
providing a SAX interface, it allows standard XML tools to be applied
to even the worst HTML.
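
As a rough illustration of what "providing a SAX interface" means in practice, here is a minimal sketch of an ordinary SAX ContentHandler that collects link targets. The class name LinkCollector and the method collect are made up for this example. The handler only depends on the standard org.xml.sax API, so it works with any XMLReader. The demo in main uses the JDK's built-in parser on well-formed markup so that it runs without extra jars; the commented-out line assumes the TagSoup jar is on the classpath and that its reader class is org.ccil.cowan.tagsoup.Parser, which would let the same handler run over messy real-world HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: gathers the href of every <a> element it sees.
public class LinkCollector extends DefaultHandler {
    private final List<String> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Match on qName, which is filled in whether or not the reader is namespace-aware.
        if ("a".equalsIgnoreCase(qName)) {
            String href = atts.getValue("href");
            if (href != null) {
                links.add(href);
            }
        }
    }

    // Runs the given SAX reader over the markup and returns the links in document order.
    public static List<String> collect(XMLReader reader, String markup) throws Exception {
        LinkCollector handler = new LinkCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.links;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in reader from the JDK; it accepts only well-formed input.
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Assumption: with TagSoup installed, swap in its reader instead:
        // XMLReader reader = new org.ccil.cowan.tagsoup.Parser();
        String markup = "<html><body><a href=\"http://example.org/\">x</a></body></html>";
        System.out.println(collect(reader, markup));
    }
}
```

The point of the design is that the handler never learns which parser is feeding it events, so code written against SAX for XML can be reused for HTML by changing only the line that creates the reader.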

TagSoup is free and Open Source software, licensed under the Academic
Free License, a cleaned-up and patent-safe BSD-style license which
allows proprietary re-use. It's also licensed under the GNU GPL, since
unfortunately the GPL and the AFL are incompatible. Users can choose to
license TagSoup from the upstream author under either the GPL or the AFL.
