Re: new search engine for our web pages?
Hi.
I sent this mail to debian-devel@debian.or.jp to ask for information,
and Nokubi replied to it.
at Date: Mon, 21 Feb 2000 18:08:06 +0900,
on Subject: [debian-devel:11686] Re: new search engine for our web pages? [was: masayuki-h@geocities.co.jp: Re: ITP: namazu2],
Taketoshi Sano <kgh12351@nifty.ne.jp> writes:
> Are there any ideas for improving that point?
>
> This issue is about using namazu (or namazu2) to search all the pages
> of www.debian.org. This mail is sent to debian-devel@debian.or.jp, with
> Cc: to debian-www@debian.or.jp and debian-www@lists.debian.org.
>
> In article <20000219234133.B27024@landru.home.net>,
> at Sat, 19 Feb 2000 23:41:33 -0500,
> "James A. Treacy" <treacy@debian.org> writes:
>
> > On Sat, Feb 19, 2000 at 12:33:37PM +0100, Josip Rodin wrote:
> > > Hi everyone,
> > >
> > > Could we use this namazu program for searching the web pages?
> > >
> > Here is the part that stopped me cold:
> > Indexing process will take fifty minutes to index 25 MByte files
> > with Linux Box has Pentium 166 MHz + 64 MB.
> >
> > That's over 8 hours just to index the main part of the site (roughly
> > 200MB), which should be reindexed every day. For comparison, swish++
> > takes less than 10 minutes to index the Package section of the site
> > (roughly 97MB). Using this for the List Archives (around 2GB) isn't
> > even funny.
> >
> > --
> > James (Jay) Treacy
> > treacy@debian.org
>
> The past records of the List Archives (which will not be updated) can
> probably be indexed step by step. But re-indexing every day may be too much.
>
> What do you think, Kitame and Nokubi? (Masayuki wrote that you are
> the namazu "demigods", so I hope you can answer this question.)
>
> FYI:
>
> The size of my local cvs copy for www.debian.or.jp:
> $ du -s /Home/sano/work/Debian/Web/www.debian.or.jp/
> 5089 /Home/sano/work/Debian/Web/www.debian.or.jp
>
> The size of my local cvs copy for www.debian.org/english:
> $ du -s /Home/sano/work/Debian/Web/webwml/english/
> 6809 /Home/sano/work/Debian/Web/webwml/english
>
> The size of my local cvs copy for www.debian.org/japanese:
> $ du -s /Home/sano/work/Debian/Web/webwml/japanese/
> 1679 /Home/sano/work/Debian/Web/webwml/japanese
>
> # I have not checked out the other language trees, but there may be many
> # language trees besides these two.
>
> The size of my local cvs copy for www.linux.or.jp/public:
> $ du -s /Home/sano/work/JLUG/Web/main/public/
> 6220 /Home/sano/work/JLUG/Web/main/public
>
> --
> Taketoshi Sano: <sano@debian.org>,<sano@debian.or.jp>,<kgh12351@nifty.ne.jp>
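As a rough check of the indexing times discussed in the quoted mail, here is a small Python sketch that scales the quoted namazu figure (fifty minutes for 25 MB on a Pentium 166 MHz) linearly to the site sizes mentioned. The function name and the reading of "2GB" as 2000 MB are my own assumptions, and real indexing time may well not scale linearly.

```python
def minutes_to_index(size_mb, ref_mb=25.0, ref_minutes=50.0):
    """Linearly extrapolate indexing time from one reference measurement.

    The defaults come from the namazu figure quoted above; the linear
    scaling itself is only an assumption.
    """
    return size_mb / ref_mb * ref_minutes


# Sizes as quoted: main site roughly 200 MB, list archives around 2 GB
# (taken here as 2000 MB, which is an assumption).
for label, size_mb in [("main site", 200), ("list archives", 2000)]:
    minutes = minutes_to_index(size_mb)
    print(f"{label}: about {minutes / 60:.1f} hours")
```

On these assumptions the main site comes out at about 6.7 hours and the list archives at about 66.7 hours for a full first-time index. Either way the order of magnitude is the concern raised above.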
In <200002220011.JAA02875@ns.eal.or.jp>,
at Date: Tue, 22 Feb 2000 09:11:18 +0900,
on Subject: [debian-devel:11700] Re: new search engine for our web pages? [was:masayuki-h@geocities.co.jp: Re: ITP: namazu2],
knok@daionet.gr.jp (NOKUBI Takatsugu) writes:
knok> Excuse me, I'm not currently subscribed to debian-www. I expect the
knok> linux.debian.www newsgroup to work.
knok>
knok> In article <y5azosvv7ln.fsf@kgh12351.nifty.ne.jp>
knok> kgh12351@nifty.ne.jp writes:
knok>
knok> >> The past records of the List Archives (which will not be updated) can
knok> >> probably be indexed step by step. But re-indexing every day may be too much.
knok> >>
knok> >> What do you think, Kitame and Nokubi? (Masayuki wrote that you are
knok> >> the namazu "demigods", so I hope you can answer this question.)
knok>
knok> First, re-indexing is not as heavy as first-time indexing. It is
knok> "difference indexing": untouched files are not processed.
knok>
knok> Second, namazu/namazu2 can handle multiple index files, so the index
knok> processing can be divided (and could be spread over several machines).
knok>
knok> BTW, I'm not a "demigod". I'm just a member of the Namazu Project :-)
knok> --
knok> NOKUBI Takatsugu
knok> E-mail: knok@daionet.gr.jp
knok> knok@debian.or.jp (Debian-JP)
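To make the two points above concrete, here is a minimal Python sketch of the general idea. It is not namazu's actual implementation: the use of file modification times to detect changes, the one-group-per-top-level-directory split, and both function names are assumptions of mine for illustration only.

```python
import os


def changed_files(root, last_seen):
    """Return the files under root whose mtime differs from the last run.

    last_seen maps path -> mtime recorded by the previous run and is
    updated in place, so untouched files are skipped next time.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.stat(path).st_mtime
            if last_seen.get(path) != mtime:
                changed.append(path)
                last_seen[path] = mtime
    return sorted(changed)


def split_by_top_dir(paths, root):
    """Group paths by their first directory below root.

    Each group could then be given its own index, possibly on a
    different machine. Files directly in root go to the "." group.
    """
    groups = {}
    for path in paths:
        parts = os.path.relpath(path, root).split(os.sep)
        key = parts[0] if len(parts) > 1 else "."
        groups.setdefault(key, []).append(path)
    return groups
```

With something like this, a daily run would only hand the changed files to the indexer, and a tree such as webwml could be split per language directory into separate indexes.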
What do you think?
--
Taketoshi Sano: <sano@debian.org>,<sano@debian.or.jp>,<kgh12351@nifty.ne.jp>