Help needed for converting strings in a file
Hi,
I am making a small tool for offline web browsing. For this I need to change the source of HTML files.
Let me explain:
In a web page the hyperlinks are written as
href="http://www.micronux.com/catalog/"
I want to convert this particular string to
href="./micronux.com_catalog"
The logic is:
1) delete http://www.
2) replace '/', '?' etc. with '_'
I want to write a script using sed or awk which will do the whole conversion in a file.
Any help in this regard will be highly appreciated.
Thanking you
Sourabh Bora
Symbol Technologies India Ltd
Bangalore
India
Once I cross this stage I plan to make further improvements to build a user-friendly offline browser.
PS:
I have written a C program, but it is very limited in functionality.
Usage:
./a.out "http://www.micronux.com/catalog/"
output is
micronux.com_catalog
The limitation is that it converts only the one URL given on the command line; it cannot search a whole file for every URL that follows href=
The code:
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    char url[200];
    char output[200];
    int urlCount;
    int outputCount;

    //need exactly one url, and it must fit in the buffer
    if (argc != 2 || strlen(argv[1]) >= sizeof url) {
        fprintf(stderr, "usage: %s url\n", argv[0]);
        return 1;
    }
    strcpy(url, argv[1]);
    //skip "http://" (7 chars) or "https://" (8 chars) if present
    if (strncmp(url, "http://", 7) == 0)
        urlCount = 7;
    else if (strncmp(url, "https://", 8) == 0)
        urlCount = 8;
    else
        urlCount = 0;
    //now check for "www."
    if (strncmp(url + urlCount, "www.", 4) == 0)
        urlCount += 4;
    //now do the actual conversion--convert / \ = : * ? " < > and | to _
    for (outputCount = 0; url[urlCount] != '\0'; urlCount++, outputCount++) {
        if (strchr("/\\=:*\"?<>|", url[urlCount]) != NULL)
            output[outputCount] = '_';
        else
            output[outputCount] = url[urlCount];
    }
    //if the last character is '_' remove it
    if (outputCount > 0 && output[outputCount - 1] == '_')
        outputCount--;
    output[outputCount] = '\0';
    printf("%s\n", output);
    return 0;
}
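One possible way around the limitation mentioned above, staying in C rather than sed or awk, is to scan each line of the HTML file for href=" and convert whatever sits between the quotes. The following is only a rough sketch under some assumptions: the function names convert_url and rewrite_line are made up for illustration, links are assumed to be written in lower case with double quotes, one link value is assumed to be shorter than 1024 characters, and every href value is rewritten (including relative links, which may not be what is wanted).

```c
#include <stdio.h>
#include <string.h>

/* Characters that get replaced by '_' (same set as the program above). */
static int is_unsafe(char c)
{
    return c != '\0' && strchr("/\\=:*\"?<>|", c) != NULL;
}

/* Hypothetical helper: turn one URL into a local name such as
   "./micronux.com_catalog". Returns 0 on success, -1 if out is too small. */
int convert_url(const char *url, char *out, size_t outsize)
{
    size_t n = 0;

    if (outsize < 3)
        return -1;
    /* step 1: drop the scheme and a leading "www." */
    if (strncmp(url, "http://", 7) == 0)
        url += 7;
    else if (strncmp(url, "https://", 8) == 0)
        url += 8;
    if (strncmp(url, "www.", 4) == 0)
        url += 4;
    out[n++] = '.';
    out[n++] = '/';
    /* step 2: replace unsafe characters with '_' */
    for (; *url != '\0'; url++) {
        if (n + 1 >= outsize)
            return -1;
        out[n++] = is_unsafe(*url) ? '_' : *url;
    }
    /* drop a trailing '_' left over from a final '/' */
    if (n > 2 && out[n - 1] == '_')
        n--;
    out[n] = '\0';
    return 0;
}

/* Hypothetical helper: copy line to out, rewriting every href="..." value.
   Returns 0 on success, -1 if a buffer is too small. */
int rewrite_line(const char *line, char *out, size_t outsize)
{
    static const char key[] = "href=\"";
    const size_t keylen = sizeof key - 1;
    size_t n = 0;

    if (outsize == 0)
        return -1;
    while (*line != '\0') {
        const char *end = NULL;
        if (strncmp(line, key, keylen) == 0)
            end = strchr(line + keylen, '"');   /* closing quote, if any */
        if (end != NULL) {
            char url[1024], local[1024 + 2];
            size_t len = (size_t)(end - (line + keylen));
            int w;

            if (len >= sizeof url)
                return -1;
            memcpy(url, line + keylen, len);
            url[len] = '\0';
            if (convert_url(url, local, sizeof local) != 0)
                return -1;
            w = snprintf(out + n, outsize - n, "href=\"%s\"", local);
            if (w < 0 || (size_t)w >= outsize - n)
                return -1;
            n += (size_t)w;
            line = end + 1;                     /* continue after the quote */
        } else {
            if (n + 1 >= outsize)
                return -1;
            out[n++] = *line++;                 /* ordinary character */
        }
    }
    out[n] = '\0';
    return 0;
}
```

Calling rewrite_line() on each line read from the HTML file with fgets() and writing the result with fputs() would turn this into a filter for a whole file. A link that is split across two lines would be missed by this line-by-line approach.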