
Is this tidy conversion correct?



Hi, 

I have this file:

$ cat test | od -t u1
0000000 181 220 210 221 149 132 163 168 183 189 193 166 201 234 163 169
                        ^^^ ^^^

When I run it through tidy, it gives a weird result:

$ cat test | tidy -quiet -numeric
[...]
&#181;&#220;&#210;&#221;&#8226;&#8222;&#163;&#168;&#183;&#189;&#193;&#166;&#201;&#234;&#163;&#169;
                        ^^^^^^^^^^^^^^
[...]

Notice the extremely large numbers pointed to by ^^^?

Why can't tidy encode these bytes exactly as in the "od -t u1" output, i.e.,
&#149;&#132; instead of &#8226;...? This will cause me trouble when the
result is further processed by other tools, e.g., Perl XML::XPath.
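
One guess, not a confirmed answer: the numbers 8226 and 8222 are exactly what
bytes 149 and 132 become if they are read as Windows-1252 instead of Latin-1.
The Python sketch below only illustrates that arithmetic. That tidy applies
such a mapping is an assumption, and the helper names are made up for this
example.

```python
# Sketch: how bytes 149 and 132 map to code points under two encodings.
# Assumption (not confirmed here): tidy treats bytes 128-159 as Windows-1252.

def code_points(raw, encoding):
    """Decode raw bytes with the given encoding; return decimal code points."""
    return [ord(ch) for ch in raw.decode(encoding)]

def numeric_entities(raw, encoding):
    """Render the decoded characters as decimal numeric character references."""
    return "".join("&#{};".format(cp) for cp in code_points(raw, encoding))

suspect = bytes([149, 132])
print(numeric_entities(suspect, "latin-1"))  # &#149;&#132;
print(numeric_entities(suspect, "cp1252"))   # &#8226;&#8222;
```

The other bytes in the file (181, 220, ...) are above 159, where the two
encodings agree, so only the two marked bytes would change.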

Moreover, after adding the "-bare" option to "strip out smart quotes and em
dashes, etc.", the result is even weirder, and even seems wrong to me:

$ cat test | tidy -quiet -numeric -bare
[...]
&#181;&#220;&#210;&#221;&#8226;"&#163;&#168;&#183;&#189;&#193;&#166;&#201;&#234;&#163;&#169;
                        ^^^^^^^^
[...]

Does anybody have any comments on this?

Thanks

PS. To produce the test file:

echo '181 220 210 221 149 132 163 168 183 189 193 166 201 234 163 169' | perl -ne 'print chr $_ for split /\s+/'  > test 

tong




