
Re: ARCHI server for Linux ? (and FTP-Search Project)



On 2005-04-13 14:10:50, Andreas Rippl wrote:

> Hi Michelle,
> 
> not to destroy your enthusiasm, just a word of warning:
> When I was young and foolish I made a little script-crawler in Perl
> which connected to randomly created IPs and checked for an open
> port 21. Then I would try to log in anonymously and get a directory
> listing of the contents. It worked beautifully on the Debian/Sparc20 I
> had running back then...until I had a visit by the upset Admin from Uni
> one morning. Some kind of jackass Admin from one site or another had
> observed that I tried to *hack into his site* and complained to Uni. I
> can't even blame him, he needs to show results to his boss as well...
> 
> Funnily, I was on good terms with the Admin, and after I explained what
> I had tried to do, the angry mails by the guy (he proposed a proper
> spanking for me or some such thing) just went directly to the trash.
> 
> Ah, the good old days...

:-)

In the last 2 days I have tried to spider 480 FTP servers (220 in
Germany), but if there is no ls-lR(.gz) on the machine, it is too
slow. Even though my current internet connection has 8 MBit, most
servers give me only 100-300 kByte/second.

Have you tried <wuarchive.wustl.edu> ?

This server, with its 2 TeraByte of disk space, has a CVS on it...

Now I have limited the level to 4 and I am only spidering for ls-lR(.gz):
480 servers and 3.5 GByte of ls-lR(.gz) files.
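
A depth-limited search like the one described above could be sketched as
follows. This is only an illustration, not the script actually used: the
names find_listings and ftp_lister are hypothetical, the root directory is
counted as level 1, and the FTP adapter assumes the server supports the
MLSD command (many older servers do not).

```python
from collections import deque
from ftplib import FTP, error_perm

# File names we are looking for (taken from the text: ls-lR and ls-lR.gz).
LISTING_NAMES = {"ls-lR", "ls-lR.gz"}


def find_listings(list_dir, root="/", max_depth=4):
    """Breadth-first search for ls-lR files.

    list_dir(path) must return a list of (name, is_dir) pairs.
    The root counts as level 1; directories deeper than max_depth
    are never listed.
    """
    found = []
    queue = deque([(root, 1)])
    while queue:
        path, level = queue.popleft()
        for name, is_dir in list_dir(path):
            full = path.rstrip("/") + "/" + name
            if is_dir:
                if level + 1 <= max_depth:
                    queue.append((full, level + 1))
            elif name in LISTING_NAMES:
                found.append(full)
    return found


def ftp_lister(ftp):
    """Adapt an ftplib.FTP connection to the list_dir interface (assumes MLSD)."""
    def list_dir(path):
        try:
            entries = list(ftp.mlsd(path, facts=["type"]))
        except error_perm:
            return []  # directory not readable for anonymous users
        return [(name, facts.get("type", "").lower() == "dir")
                for name, facts in entries
                if name not in (".", "..")]
    return list_dir


# Usage against a real server (host name is a placeholder):
#   ftp = FTP("ftp.example.org"); ftp.login()
#   print(find_listings(ftp_lister(ftp), "/", max_depth=4))
```

Keeping the directory lister separate from the search means the depth limit
can be tried out on a fake directory tree without touching any server.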

Now I only need a script which puts the info into my PostgreSQL database.

1. Table        relations       holds only numerical indexes into the
                                other tables
2. Table        servers         on average about 45 bytes per server
3. Table        directories     very big table; how long can a path be?
4. Table        files           name, ext, perm, size
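
One way to picture the four tables is the sketch below. It uses Python's
built-in sqlite3 as a stand-in so that it runs without a server; the real
target is PostgreSQL, where the DDL would differ slightly (for example
SERIAL keys and ON CONFLICT DO NOTHING). All column names other than name,
ext, perm and size are assumptions, as is the choice to store each distinct
path and file record only once. On the path length question: a PostgreSQL
text column does not need a declared maximum length.

```python
import sqlite3

# Stand-in schema; column names beyond name/ext/perm/size are assumptions.
SCHEMA = """
CREATE TABLE servers     (id INTEGER PRIMARY KEY, host TEXT UNIQUE NOT NULL);
CREATE TABLE directories (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL);
CREATE TABLE files       (id INTEGER PRIMARY KEY,
                          name TEXT NOT NULL, ext TEXT NOT NULL,
                          perm TEXT NOT NULL, size INTEGER NOT NULL,
                          UNIQUE (name, ext, perm, size));
CREATE TABLE relations   (server_id    INTEGER NOT NULL REFERENCES servers(id),
                          directory_id INTEGER NOT NULL REFERENCES directories(id),
                          file_id      INTEGER NOT NULL REFERENCES files(id),
                          PRIMARY KEY (server_id, directory_id, file_id));
"""


def get_id(conn, table, **cols):
    """Insert the row if it is new and return its id (table names are internal)."""
    names = list(cols)
    values = [cols[n] for n in names]
    marks = ", ".join("?" for _ in names)
    conn.execute(
        f"INSERT OR IGNORE INTO {table} ({', '.join(names)}) VALUES ({marks})",
        values)
    where = " AND ".join(f"{n} = ?" for n in names)
    row = conn.execute(f"SELECT id FROM {table} WHERE {where}", values).fetchone()
    return row[0]


def add_file(conn, host, path, name, ext, perm, size):
    """Store one file; relations only holds the three numeric ids."""
    ids = (get_id(conn, "servers", host=host),
           get_id(conn, "directories", path=path),
           get_id(conn, "files", name=name, ext=ext, perm=perm, size=size))
    conn.execute("INSERT OR IGNORE INTO relations VALUES (?, ?, ?)", ids)


def search(conn, name):
    """Find every server and directory that holds a file with this name."""
    return conn.execute(
        "SELECT s.host, d.path, f.name, f.size FROM relations r "
        "JOIN servers s ON s.id = r.server_id "
        "JOIN directories d ON d.id = r.directory_id "
        "JOIN files f ON f.id = r.file_id "
        "WHERE f.name = ? ORDER BY s.host, d.path", (name,)).fetchall()
```

Because identical paths and identical file records are stored once and only
referenced by id, mirrors of the same tree add rows mainly to relations.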

So the database is not really difficult...

Maybe I should split the 2nd table by domains and the 4th table by
extensions...

And I think the size of the 3rd table will be a problem too.

OK, I already have a PostgreSQL database of 120 GByte running on a 3Ware 3c9xxx
RAID-5 with 4 Raptors (SATA)  :-)  but this new project will be a nice
experience.

Greetings
Michelle

-- 
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack   Apt. 917                  ICQ #328449886
                   50, rue de Soultz         MSM LinuxMichi
0033/3/88452356    67100 Strasbourg/France   IRC #Debian (irc.icq.com)
