
Re[2]: OT: how to strip out SGML tags?



erik <erik@bossa.org> wrote:

> > ##  Use STDIN if no files are given
> > $ARGV[0] = "-" unless @ARGV;
> > 
> > ##  Strip out anything contained in an SGML markup tag.  This is not
> > ##  very pretty and rather inefficient, but it does take care of tags
> > ##  which cross line or paragraph boundaries.
> > foreach $file (@ARGV) {
> >   open(INPUT,$file);
> >   while($char = getc(INPUT)) {
> >     if($char eq "<") {
> >       IGNORE: for(;;) {
> >         last IGNORE if (getc(INPUT) eq ">");
> > 
>  ... not sure why the IGNORE thing is in here; it seems like this should
> work but I would have simply done :
> 	if($char eq "<") {
> 	   while(getc(INPUT) ne ">") {
> 		;
> 	    }
> 	}
> 

I had trouble with your idea, but I went back to the original script I posted
and discovered that the problem is that it stops reading whenever it hits the
character '0'! Apart from that it works fine. It just so happened that I had a
'0' in the first few lines of my SGML, but I didn't grasp the implication.

So a '0' makes the condition '$char = getc(INPUT)' evaluate to false, since
Perl treats the string "0" as false, and that drops the flow out of the loop
and down to closing the file. What's the perl equivalent of WHILE NOT EOF? <g>
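
For what it's worth, here is a minimal sketch of one way around the '0'
problem, assuming the fix is to test whether getc returned a defined value
rather than a true one (getc returns undef at end of file). The sub name
strip_tags and the in-memory sample string are my own inventions for
illustration and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# strip_tags is a hypothetical helper name, not from the original script.
# It reads a filehandle one character at a time and returns the text with
# everything between "<" and ">" removed.
sub strip_tags {
    my ($fh) = @_;
    my $out = "";
    # Test definedness, not truth: "0" is defined, so it no longer ends
    # the loop, while undef (end of file) still does.
    while (defined(my $char = getc($fh))) {
        if ($char eq "<") {
            # Skip to the closing ">", also stopping at end of file so an
            # unterminated tag cannot loop forever.
            while (defined(my $skipped = getc($fh))) {
                last if $skipped eq ">";
            }
            next;
        }
        $out .= $char;
    }
    return $out;
}

# Small demonstration on an in-memory filehandle (assumed sample input).
my $sample = "<p>Chapter 10</p>\n";
open(my $fh, "<", \$sample) or die "Cannot open in-memory handle: $!";
print strip_tags($fh);    # prints "Chapter 10" followed by a newline
close($fh);
```

The more literal counterpart to WHILE NOT EOF would be something like
'until (eof(INPUT)) { $char = getc(INPUT); ... }', which should behave the
same way here; the defined() test just avoids the extra eof call per
character.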

> Look reasonable? 


--
Bob Bernstein                  http://www.ruptured-duck.com



