Re: utf
> You just seem to have Decided, for reasons known only to you, that
> The Character Length Of A String Is Not Useful. Despite literally
> decades of programs that have used strlen() in various ways.
strlen was mostly used in contexts where char-length = byte-length =
display-width. Most of those calls to strlen have nothing to do with
char-length as such: what they actually need is display-width or
byte-length.
In the context of Unicode, using utf-8 doesn't make byte-length any
harder to get than with ASCII. Display-width, on the other hand, is a
lot more complex than strlen regardless of which encoding you use,
because any given Unicode char can have a display-width of 0, 1, or
2 (even if you disregard proportional fonts and other fancy rendering
tricks). So utf-8 doesn't make the computation of display-width any
more complex than utf-32 does.
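To make the three notions concrete, here is a rough sketch in Python.
The helper display_width is my own illustrative name, not a library
function, and it only approximates what a terminal does: it assumes
combining marks and format characters take 0 columns and East Asian
wide/fullwidth characters take 2.

```python
import unicodedata

def display_width(text: str) -> int:
    """Approximate terminal columns: 0, 1, or 2 per code point (a simplification)."""
    width = 0
    for ch in text:
        # Combining marks and format characters occupy no column of their own.
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # East Asian wide/fullwidth characters occupy two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

s = "e\u0301日本"  # 'e' + combining acute accent, then two wide CJK characters

byte_length = len(s.encode("utf-8"))  # what memory allocation and I/O care about
char_length = len(s)                  # number of code points
columns = display_width(s)            # what column alignment cares about
```

The three numbers all differ for this string, and the display-width
loop has to look at every code point whether the input arrived as
utf-8 or utf-32.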
> What if the question is "Find all the English words that have an E
> in the 5th position and a U in the 7th"?
That can be answered just as easily and efficiently from a utf-8
representation of the string as from a utf-32 representation.
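As an illustration only (char_at and matches are hypothetical names,
and this is a sketch rather than a tuned implementation), the query
can run directly on utf-8 bytes by counting the bytes that are not
continuation bytes (those of the form 10xxxxxx):

```python
def char_at(utf8: bytes, position: int):
    """Return the bytes of the 1-based position-th code point, or None if too short."""
    count = 0
    start = None
    for i, b in enumerate(utf8):
        if b & 0xC0 != 0x80:  # not a continuation byte, so a new code point starts here
            if start is not None:
                return utf8[start:i]  # the wanted code point ended just before i
            count += 1
            if count == position:
                start = i
    return utf8[start:] if start is not None else None

def matches(word_utf8: bytes) -> bool:
    """True if the 5th character is E and the 7th is U (either case)."""
    return (char_at(word_utf8, 5) in (b"e", b"E")
            and char_at(word_utf8, 7) in (b"u", b"U"))

words = ["plateau", "château", "bureau", "naïve"]
hits = [w for w in words if matches(w.encode("utf-8"))]
```

Here "château" is found even though its third character takes two
bytes. For words this short the scan costs a handful of byte
comparisons per word.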
Stefan