
Bug#470369: RFP: haskell-tagsoup -- Haskell library to robustly parse non-standards-compliant HTML



Package: wnpp
Severity: wishlist

* Package name    : haskell-tagsoup
  Version         : 0.4
  Upstream Author : Neil Mitchell <ndm@cs.york.ac.uk>
* URL             : http://www-users.cs.york.ac.uk/~ndm/tagsoup/
* License         : 3-clause BSD
  Programming Lang: Haskell
  Description     : Haskell library to robustly parse unstructured HTML

TagSoup extracts information from unstructured HTML, sometimes known
as tag soup. It does not require the HTML to be well-formed or
standards-compliant, or to render correctly in any particular
rendering engine.  TagSoup transforms HTML into a flat list of open
tags with attributes, close tags, and text, but makes no attempt to
group these into any kind of structure.
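
The flat-list output described above can be illustrated with a short
sketch. TagSoup itself is a Haskell library; the Python below only
mimics the shape of its output using the standard-library html.parser
module, and it is not TagSoup's API. The names FlatTagCollector and
flatten_html are hypothetical, chosen for this illustration.

```python
from html.parser import HTMLParser


class FlatTagCollector(HTMLParser):
    """Hypothetical helper: records open tags, close tags, and text
    in document order, without building any tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        self.tokens.append(("open", tag, attrs))

    def handle_endtag(self, tag):
        self.tokens.append(("close", tag))

    def handle_data(self, data):
        self.tokens.append(("text", data))


def flatten_html(html):
    """Return a flat token list for possibly malformed HTML."""
    collector = FlatTagCollector()
    collector.feed(html)
    collector.close()  # flush any buffered trailing text
    return collector.tokens


# The <b> tag is never closed; the token list simply omits its close tag.
tokens = flatten_html('<p class="x">Hello <b>world</p>')
for token in tokens:
    print(token)
```

Note that the unclosed <b> is neither rejected nor repaired: no nesting
is tracked, so any grouping of tags is left to the caller.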

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.24-1-686 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash


