Re: cleaning up messy capitalization to transfer a web page from wintoes
On July 8, 2003 11:24 pm, K S Sreeram wrote:
> Check out this page:
>
> http://linuxvm.org/Info/HOWTOs/win2lin.html
>
> Looks like it has what you need
This script would be great, except that it doesn't actually work. I first got
an error because one line was missing the \ at its end. Once I fixed that it
ran, but it didn't correct the capitalization in a way that fixes the broken
hyperlinks. It seemed to lower-case the first letter of the <a href=... but
do nothing to the filenames. Oh well.
If anyone's curious to look at the short bit of code, here it is. Maybe the
problem would be obvious to someone who really knows their sed and shell
programming. (I've already fixed the first bug: the \ missing from the end of
the line -e 's/|/ /'.)

#!/bin/bash
D='/tmp/htmlrename/firelab/'
grep -ni "<a .*href=" `find ${D} -name \*.html` \
  | sed -e 's/:.*href="/|/i' -e 's/".*//' \
        -e 's/<a//' \
        -e 's/#.*//' \
        -e 's/|/ /' \
  | grep '\..*\.' \
  | grep -v http: \
  | sort -u \
  | while read source link
do
    fd=`dirname $source`
    EF=$fd/$link
    [ -n "${link}" ] && \
    [ ! -f ${EF} ] && \
    {
        echo EF=${EF} s=$source d=$fd l=,$link,
        AF=`find ${D} -iname ${link}`
        echo AF=${AF}
        cmd="grep -ni $link $source /dev/null"
        # echo ${cmd}
        ${cmd}
        # grep -i "$link" $source /dev/null
        echo
    }
done
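
For what it's worth, nothing in the script above writes to any file: it only
echoes EF/AF and runs grep, so at best it reports links whose case doesn't
match a file on disk. A minimal sketch of something that actually rewrites
the links might look like the following. It is untested on the real site and
rests on several assumptions: GNU find, grep and sed (for -iname, -o, sed -i
and the I flag); links are double-quoted, relative, and point to a file in
the same directory as the page; and file names contain only letters, digits,
dots, hyphens and underscores. The function name fix_link_case is made up
for this example.

```shell
#!/bin/bash
# Sketch only: rewrite href targets whose capitalization differs from the
# file that actually exists next to the page. fix_link_case is a
# hypothetical name, not part of the win2lin script.
fix_link_case() {
    local root=$1 page dir link actual base esc
    find "$root" -name '*.html' | while IFS= read -r page; do
        dir=$(dirname "$page")
        # Extract each href value, drop any #anchor, and keep only plain
        # file names (this skips URLs and links containing a path).
        grep -io 'href="[^"]*"' "$page" \
        | sed -e 's/^[^"]*"//' -e 's/"$//' -e 's/#.*//' \
        | grep -E '^[A-Za-z0-9._-]+$' \
        | sort -u \
        | while IFS= read -r link; do
            [ -f "$dir/$link" ] && continue   # link already resolves as written
            actual=$(find "$dir" -maxdepth 1 -type f -iname "$link" | head -n 1)
            [ -n "$actual" ] || continue      # no file differs only by case
            base=$(basename "$actual")
            esc=${link//./\\.}                # make dots literal for sed
            # Replace the link text in place, keeping the href= part as written.
            sed -i "s/\(href=\"\)$esc\([\"#]\)/\1$base\2/gI" "$page"
        done
    done
}

# Usage (run it on a copy of the site first):
#   fix_link_case /tmp/htmlrename/firelab/
```

Like the original, this only looks at files ending in lower-case .html, and
it edits the pages rather than renaming files, so the names on disk are
treated as the correct ones.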