Bug#374643: O: htdig - WWW search system for an intranet or small internet
Package: wnpp
Severity: normal
The current maintainer of htdig, Robert Ribnitz <ribnitz@linuxbourg.ch>,
has orphaned this package.
If you want to be the new maintainer, please take it -- see
http://www.debian.org/devel/wnpp/index.html#howto-o for detailed
instructions on how to adopt a package properly.
Some information about this package:
Package: htdig
Priority: optional
Section: web
Installed-Size: 2940
Maintainer: Robert Ribnitz <ribnitz@linuxbourg.ch>
Architecture: i386
Version: 1:3.1.6-11.1
Depends: libc6 (>= 2.3.5-1), libdb2 (>= 2:2.7.7.0-7), libgcc1 (>= 1:4.0.2), libstdc++6 (>= 4.0.2-4), zlib1g (>= 1:1.2.1), debconf (>= 0.5) | debconf-2.0, perl, lockfile-progs, gawk, sed (>= 4.0)
Recommends: wwwoffle, apache | httpd, htdig-doc
Suggests: catdoc, pstotext | gs | xpdf | xpdf-i
Conflicts: htdig3.2
Filename: pool/main/h/htdig/htdig_3.1.6-11.1_i386.deb
Size: 987894
MD5sum: d273677fa6273d644de0ac19f8673d5b
Description: WWW search system for an intranet or small internet
The ht://Dig system is a complete world wide web indexing and searching
system for a small domain or intranet. This system is not meant to
replace the need for powerful internet-wide search systems like Lycos,
Infoseek, Webcrawler and AltaVista. Instead it is meant to cover the
search needs for a single company, campus, or even a particular sub
section of a web site.
.
As opposed to some WAIS-based or web-server based search engines,
ht://Dig can span several web servers at a site. The type of these
different web servers doesn't matter as long as they understand the
HTTP 1.0 protocol.
.
Features:
* Intranet searching
* It is free
* Robot exclusion is supported
* Boolean expression searching
* Configurable search results
* Fuzzy searching
* Searching of HTML and text files
* Keywords can be added to HTML documents
* Email notification of expired documents
* A protected server can be indexed
* Searches on subsections of the database
* Full source code included
* The depth of the search can be limited
* Full support for the ISO-Latin-1 character set
.
Disk space requirements:
.
The search engine will require lots of disk space to store its
databases. Unfortunately, there is no exact formula to compute the
space requirements. It depends on the number of documents you are
going to index but also on the various options you use. To give you
an idea of the space requirements, here is what I have deduced from
our own database size at San Diego State University.
.
If you keep around the wordlist database (for update digging instead
of initial digging) I found that multiplying the number of documents
covered by 12,000 will come pretty close to the space required.
.
We have about 13,000 documents:
  150MB index size with a 'wordlist' database
   93MB index size without a 'wordlist' database
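The rule of thumb above can be checked with a little arithmetic. The following is only a minimal sketch: it assumes the factor of 12,000 is meant in bytes per document (the description does not state the unit), and the function names are made up for illustration, not part of ht://Dig.

```python
# Rough disk-space estimate for an index that keeps the 'wordlist' database.
# ASSUMPTION: the factor 12,000 from the description is in bytes per document.
BYTES_PER_DOCUMENT = 12_000


def estimate_index_bytes(num_documents: int,
                         bytes_per_document: int = BYTES_PER_DOCUMENT) -> int:
    """Return the estimated index size in bytes (hypothetical helper)."""
    if num_documents < 0:
        raise ValueError("num_documents must be non-negative")
    return num_documents * bytes_per_document


def to_megabytes(num_bytes: int) -> float:
    """Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return num_bytes / 1_000_000


if __name__ == "__main__":
    estimate = estimate_index_bytes(13_000)
    print(f"Estimated index size: {to_megabytes(estimate):.0f} MB")
```

Under that assumption, 13,000 documents come out at 156,000,000 bytes, which is in the same range as the 150MB figure reported for the index with a 'wordlist' database.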
.
The package is available in two varieties: the 'stable', well-tested
version (this one) and a less tested version (as 'htdig3.2').
Tag: interface::web, made-of::lang:c++, protocol::http, role::sw:server, use::searching, web::cgi, works-with::text:html
--
Address: Daniel Baumann, Burgunderstrasse 3, CH-4562 Biberist
Email: daniel.baumann@panthera-systems.net
Internet: http://people.panthera-systems.net/~daniel-baumann/