
sed question



Hi 

I am just beginning to learn how to use sed in order to sort my squid log files by virtualhost, and I am having trouble getting my head around how regular expressions work.

I can sort my log files into the different virtual hosts using grep, e.g. "grep '^test' access-sed.txt > test.wilderness.log", because I have got squid to write the logs with the virtualhost entry added to the front of every log entry, as illustrated below. What I am having problems with is using sed to strip the virtualhost entry from the front of each log entry once the logs have been sorted, so that I can then use webalizer to analyse them and get a separate webalizer report for each virtual host.

www.sydney.wilderness.org.au/docs/node.php? 203.48.59.163 - - [26/Aug/2003 08:09:56] "GET http://www.sydney.wilderness.org.au/docs/node.php? HTTP/1.1" 200 8719 "http://www.sydney.wilderness.org.au/docs/module.php?mod=book"; "Mozilla/5.0 (X11; U; Linux ppc) Gecko/20030714 Galeon/1.3.7 Debian/1.3.7.20030723-1" TCP_MISS:DIRECT

I have tried things like the following:

sed -e 's/^w.*\s//' > log

thinking that it would delete from the beginning of the line to the first whitespace, but it deletes every match, not just the first. I was wondering how I could get sed to match only up to the first whitespace, or whether there is a better way to do this. I am having a bit of trouble understanding exactly how regular expressions work in sed.
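
One likely explanation for the behaviour described above is that ".*" is greedy: it matches as much as it can, so the expression runs to the last whitespace on the line, not the first. A minimal sketch of the difference follows. It assumes a shortened copy of the sample log line and uses the POSIX class [[:space:]] in place of \s, which is a GNU sed extension. The variable names are only for illustration.

```shell
# Shortened copy of the sample log entry (virtualhost field first).
line='www.sydney.wilderness.org.au/docs/node.php? 203.48.59.163 - - [26/Aug/2003 08:09:56] "GET http://www.sydney.wilderness.org.au/docs/node.php? HTTP/1.1" 200 8719'

# Greedy: .* extends to the LAST whitespace, so only the final field survives.
greedy=$(printf '%s\n' "$line" | sed -e 's/^w.*[[:space:]]//')

# Negated class: [^[:space:]]* cannot cross whitespace, so the match
# ends at the FIRST whitespace and only the virtualhost field is removed.
stripped=$(printf '%s\n' "$line" | sed -e 's/^[^[:space:]]*[[:space:]]//')

printf 'greedy:   %s\n' "$greedy"      # prints "greedy:   8719"
printf 'stripped: %s\n' "$stripped"
```

If this works for the real logs, the same expression could follow the grep step in a pipe, for example: grep '^test' access-sed.txt | sed -e 's/^[^[:space:]]*[[:space:]]//' > test.wilderness.log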

Thanks for any help

John



