
Re: new search engine for our web pages?



Hi.

I sent the mail quoted below to debian-devel@debian.or.jp to ask for
information, and Nokubi replied to it.

 at Date: Mon, 21 Feb 2000 18:08:06 +0900,
  on Subject: [debian-devel:11686] Re: new search engine for our web pages? [was: masayuki-h@geocities.co.jp: Re: ITP: namazu2],
   Taketoshi Sano <kgh12351@nifty.ne.jp> writes:

> Are there any idea to improve that point ?
> 
> This issue is about to use namazu (or namazu2) for search of the whole page
> of www.debian.org. This mail is sent to debian-devel@debian.or.jp, with 
> Cc: to debian-www.debian.or.jp and debian-www.lists.debian.org.
> 
> In article <20000219234133.B27024@landru.home.net>,
>   at Sat, 19 Feb 2000 23:41:33 -0500,
>  "James A. Treacy" <treacy@debian.org> writes:
> 
> > On Sat, Feb 19, 2000 at 12:33:37PM +0100, Josip Rodin wrote:
> > > Hi everyone,
> > > 
> > > Could we use this namazu program for searching the web pages?
> > > 
> > Here is the part that stopped me cold:
> >    Indexing process will take fifty minutes to index 25 MByte files
> >    with Linux Box has Pentium 166 MHz + 64 MB.
> > 
> > That's over 8 hours just to index the main part of the site (roughly
> > 200MB), which should be reindexed every day. For comparison, swish++
> > takes less than 10 minutes to index the Package section of the site
> > (roughly 97MB). Using this for the List Archives (around 2GB) isn't
> > even funny.
> > 
> > -- 
> > James (Jay) Treacy
> > treacy@debian.org
> 
> The past (not to be updated) record of the List Archives can be
> indexed step by step, maybe. But everyday re-indexing may be too much.
> 
> How do you think, Kitame, and Nokubi ? (Masayuki wrote you are
> the namazu "demigods", so you can answer to this issue, I hope.)
> 
> FYI:
> 
>  The size of my local cvs copy for www.debian.or.jp:
>  $ du -s /Home/sano/work/Debian/Web/www.debian.or.jp/
>  5089    /Home/sano/work/Debian/Web/www.debian.or.jp
> 
>  The size of my local cvs copy for www.debian.org/english:
>  $ du -s /Home/sano/work/Debian/Web/webwml/english/  
>  6809    /Home/sano/work/Debian/Web/webwml/english
> 
>  The size of my local cvs copy for www.debian.org/japanese:
>  $ du -s /Home/sano/work/Debian/Web/webwml/japanese/
>  1679    /Home/sano/work/Debian/Web/webwml/japanese
> 
> # I don't have the other language trees, but there may be many language
> # trees other than these two.
> 
>  The size of my local cvs copy for www.linux.or.jp/public:
>  $ du -s /Home/sano/work/JLUG/Web/main/public/      
>  6220    /Home/sano/work/JLUG/Web/main/public
> 
> -- 
>   Taketoshi Sano: <sano@debian.org>,<sano@debian.or.jp>,<kgh12351@nifty.ne.jp>

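As a rough check of the estimate quoted above, here is a minimal sketch
that extrapolates from the single figure in the namazu documentation
(fifty minutes for 25 MB). It assumes indexing time grows linearly with
input size, which is my assumption and not something the documentation
states; the function name is only for illustration.

```python
def estimated_minutes(size_mb, ref_minutes=50.0, ref_mb=25.0):
    """Linearly extrapolate indexing time from the quoted benchmark.

    Assumption: time is proportional to the amount of data indexed.
    """
    return size_mb * ref_minutes / ref_mb


if __name__ == "__main__":
    # Sizes are the rough figures given in the quoted mail.
    for label, size_mb in [("main site", 200), ("List Archives", 2000)]:
        minutes = estimated_minutes(size_mb)
        print(f"{label}: about {minutes / 60:.1f} hours")
```

Under that assumption the main part of the site (200 MB) comes to about
6.7 hours, somewhat under the "over 8 hours" quoted, and the List
Archives (2 GB) to about 67 hours. Either way a full re-index every day
looks impractical on such hardware.
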
In <200002220011.JAA02875@ns.eal.or.jp>,
 at Date: Tue, 22 Feb 2000 09:11:18 +0900,
  on Subject: [debian-devel:11700] Re: new search engine for our web pages? [was:masayuki-h@geocities.co.jp: Re: ITP: namazu2],
   knok@daionet.gr.jp (NOKUBI Takatsugu) writes:

knok> Excuse me, I'm not subscribe debian-www currently. I expect to work
knok> linux.debian.www newsgroup.
knok> 
knok> In article <y5azosvv7ln.fsf@kgh12351.nifty.ne.jp>
knok> kgh12351@nifty.ne.jp writes:
knok> 
knok> >> The past (not to be updated) record of the List Archives can be
knok> >> indexed step by step, maybe. But everyday re-indexing may be too much.
knok> >> 
knok> >> How do you think, Kitame, and Nokubi ? (Masayuki wrote you are
knok> >> the namazu "demigods", so you can answer to this issue, I hope.)
knok> 
knok> At first, re-indexing is not heavier than first time indexing. It is
knok> "difference indexing". Untouched files are not targets of processing.
knok> 
knok> Second, namazu/namazu2 can handle multiple index files. So some index
knok> processing can divide (and could be use some machines).
knok> 
knok> BTW, I'm not "demigod". I'm just a member of Namazu Project :-)
knok> -- 
knok> NOKUBI Takatsugu
knok> E-mail: knok@daionet.gr.jp
knok> 	knok@debian.or.jp (Debian-JP)

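To illustrate the idea of "difference indexing" described above, here is
a generic sketch that picks out only the files that are new or changed
since the previous run. This is not how namazu itself is implemented
(I have not checked its internals); the function name and the
path-to-mtime dictionary are hypothetical, chosen only for this example.

```python
import os


def files_to_reindex(root, last_indexed):
    """Return paths under root that are new or modified since the last run.

    last_indexed maps each path to the modification time recorded when it
    was last indexed; it is updated in place for the files returned.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            # Untouched files keep the same mtime and are skipped.
            if last_indexed.get(path) != mtime:
                changed.append(path)
                last_indexed[path] = mtime
    return sorted(changed)
```

On the first run every file is returned; on later runs only the touched
files are, so the daily cost depends on how much of the site changed and
not on its total size.
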
What do you think?

-- 
  Taketoshi Sano: <sano@debian.org>,<sano@debian.or.jp>,<kgh12351@nifty.ne.jp>

