Re: sed question
Thanks for your help.
On Sun, 31 Aug 2003 02:10:14 +0100
Carlos Sousa <email@example.com> wrote:
> On Sun, 31 Aug 2003 10:20:46 +1000 John Habermann wrote:
> > I have tried things like the following:
> > sed -e 's/^w.*\s//' > log
> > thinking that it would delete from the beginning of the line to the
> > first white space but it deletes all matched expressions.
> (man sed, man grep)
> It seems you mean 's/^\w*\s//'
> Unfortunately, it seems sed doesn't understand the \w and \s escape
> sequences, unlike grep. Better try:
> sed 's/^[[:alnum:]]*[[:space:]]*//'
I tried:
cat temp | sed 's/^[[:alpha:]]*[[:space:]]*//' > log
Where temp is:
test.wilderness.org.au/about_us/whatistwsck 184.108.40.206 - - [26/Aug/2003 08:14:01] "GET http://test.wilderness.org.au/about_us/whatistws HTTP/1.0" 200 20872 "-" "Dillo/0.7.3" TCP_MISS:DIRECT
but that just removes the "test" from the front, leaving ".wilderness....".
What I want to do is go through a log file and delete the virtual host entry at the front of each log entry, so I end up with something like this:
220.127.116.11 - - [26/Aug/2003 08:14:01] "GET http://test.wilderness.org.au/about_us/whatistws HTTP/1.0" 200 20872 "-" "Dillo/0.7.3" TCP_MISS:DIRECT
I just can't seem to figure out how to match that first virtual host path and delete it. I have seen Perl scripts written to sort virtual hosts into separate log files for Apache, but they don't seem to work for the logs produced by Squid, and I don't have any Perl knowledge, so looking at the scripts didn't help me. I thought I would be able to use grep and sed to do the job. I can use grep to filter the entries into separate log files, but I now need to delete that first virtual host entry so I can use Webalizer to analyse the separate log files. I have read the info page for sed and looked at tutorials and the FAQ, but haven't been able to really understand the options.
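For reference, here is a minimal sketch of one way to get that result. It assumes the virtual host is always the first field on the line and never contains whitespace. The function name strip_vhost and the 192.0.2.1 address are made up for illustration. The idea is to match "anything that is not whitespace" rather than letters only, since [[:alpha:]]* stops at the first "." in the host name.

```shell
#!/bin/sh
# strip_vhost: hypothetical helper name, not part of sed or squid.
# Reads log lines on stdin and deletes everything up to and including
# the first run of whitespace, i.e. the leading virtual host field.
strip_vhost() {
    # [^[:space:]]* matches the whole first field, dots and slashes included.
    # [[:space:]]*  then eats the whitespace that separates it from the rest.
    sed 's/^[^[:space:]]*[[:space:]]*//'
}

# Sample line modelled on the entry above (the IP address is a placeholder).
printf '%s\n' 'test.wilderness.org.au/about_us/whatistwsck 192.0.2.1 - - [26/Aug/2003 08:14:01] "GET http://test.wilderness.org.au/about_us/whatistws HTTP/1.0" 200 20872 "-" "Dillo/0.7.3" TCP_MISS:DIRECT' | strip_vhost
```

On the real file this would presumably be run as "strip_vhost < temp > log". Note that a line containing no whitespace at all would be emptied completely, so it only suits logs where every entry really starts with the virtual host field.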
> Carlos Sousa