On 2005-04-13 14:10:50, Andreas Rippl wrote:
> Hi Michelle,
>
> not to destroy your enthusiasm, just a word of warning:
> When I was young and foolish I made a little script-crawler in Perl
> which connected to randomly created IPs and checked for an open
> port 21. Then I would try to log in anonymously and get a directory
> listing of the contents. It worked beautifully on the Debian/Sparc20 I
> had running back then...until I had a visit by the upset Admin from Uni
> one morning. Some kind of jackass Admin from one site or another had
> observed that I tried to *hack into his site* and complained to Uni. I
> can't even blame him, he needs to show results to his boss as well...
>
> Funnily, I was on good terms with the Admin, and after I explained what
> I had tried to do, the angry mails by the guy (he proposed a proper
> spanking for me or some such thing) just went directly to the trash.
>
> Ah, the good old days... :-)

In the last 2 days I have tried to spider 480 FTP servers (220 in
Germany), but if there is no ls-lR(.gz) on the machine, it is too
slow. Even though my current internet connection has 8 MBit, most
servers give me only 100-300 kByte/second.

Have you tried <wuarchive.wustl.edu>? This server, with its 2 TeraByte
of disk space, has a CVS on it...

Now I have limited the level to 4 and I am only spidering for
ls-lR(.gz): 480 servers and 3.5 GByte of ls-lR(.gz) files.

Now I only need a script which puts the info into my PostgreSQL:

1. Table relations     holds only numerical indexes of the other tables
2. Table servers       in the middle, around 45 Bytes per server
3. Table directories   very big table. How long can a path be?
4. Table files         name, ext, perm, size

So the database is not really difficult...

Maybe I should split the 2nd table by domains and the 4th table by
extensions... And I think the size of the 3rd table will be a problem
too.
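A minimal sketch of what such a loader script could look like, not a finished implementation. It assumes the common GNU-style `ls -lR` layout (a `path:` header per directory, then one long-format line per entry), which not every FTP server produces. It uses Python's built-in sqlite3 as a stand-in so it runs without a server; on PostgreSQL the `INSERT OR IGNORE` statements would need rewriting. All column names beyond name, ext, perm and size, the helper names, and the host `ftp.example.org` are made up for illustration.

```python
import os
import re
import sqlite3

# Four tables as outlined above; column names other than
# name/ext/perm/size are assumptions made for this sketch.
SCHEMA = """
CREATE TABLE servers     (id INTEGER PRIMARY KEY, host TEXT UNIQUE);
CREATE TABLE directories (id INTEGER PRIMARY KEY, path TEXT UNIQUE);
CREATE TABLE files       (id INTEGER PRIMARY KEY, name TEXT, ext TEXT,
                          perm TEXT, size INTEGER);
CREATE TABLE relations   (server_id INTEGER, directory_id INTEGER,
                          file_id INTEGER);
"""

# Regular files only: perm, links, owner, group, size, 3 date fields, name.
FILE_LINE = re.compile(
    r"^(-[rwxsStT-]{9})\S*\s+\d+\s+\S+\s+\S+\s+(\d+)\s+\S+\s+\S+\s+\S+\s+(.+)$"
)


def parse_ls_lr(text):
    """Yield (directory, name, ext, perm, size) for each regular file."""
    directory = "."
    for line in text.splitlines():
        line = line.rstrip()
        match = FILE_LINE.match(line)
        if match:
            perm, size, name = match.groups()
            ext = os.path.splitext(name)[1].lstrip(".")
            yield directory, name, ext, perm, int(size)
        elif line.endswith(":"):
            # Header such as "./pub:" starts a new directory block.
            directory = line[:-1]
        # "total N", blank lines, directories, links etc. are skipped.


def get_id(cur, table, column, value):
    """Insert value if missing and return its numeric id."""
    cur.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    cur.execute(f"SELECT id FROM {table} WHERE {column} = ?", (value,))
    return cur.fetchone()[0]


def load_listing(conn, host, text):
    """Store one server's listing; return the number of files stored."""
    cur = conn.cursor()
    server_id = get_id(cur, "servers", "host", host)
    count = 0
    for directory, name, ext, perm, size in parse_ls_lr(text):
        directory_id = get_id(cur, "directories", "path", directory)
        cur.execute(
            "INSERT INTO files (name, ext, perm, size) VALUES (?, ?, ?, ?)",
            (name, ext, perm, size),
        )
        cur.execute(
            "INSERT INTO relations VALUES (?, ?, ?)",
            (server_id, directory_id, cur.lastrowid),
        )
        count += 1
    conn.commit()
    return count


SAMPLE = """\
.:
total 8
drwxr-xr-x 2 root root 4096 Apr 13 2005 pub

./pub:
total 4
-rw-r--r-- 1 ftp ftp 1234 Apr 13 2005 README.txt
-rw-r--r-- 1 ftp ftp 567890 Apr 12 14:10 ls-lR.gz
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
print(load_listing(conn, "ftp.example.org", SAMPLE))
```

Storing each directory path only once and pointing to it by numeric id from the relations table keeps the long path strings out of the per-file rows, which is where most of the size would otherwise go.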
OK, I already have a PostgreSQL database of 120 GByte running on a 3Ware
3c9xxx RAID-5 with 4 Raptors (SATA) :-) but this new project will be
a nice experience.

Greetings
Michelle

-- 
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack   Apt. 917                  ICQ #328449886
                   50, rue de Soultz         MSM LinuxMichi
0033/3/88452356    67100 Strasbourg/France   IRC #Debian (irc.icq.com)