* Colin Watson <cjwatson@debian.org> [20030325 18:50 PST]:
> On Tue, Mar 25, 2003 at 04:01:51PM -0800, Vineet Kumar wrote:
> > First of all, '-' renders as a hyphen (U+2010) instead of as ASCII 0x2D.
> > The correct groff escape to use in things like command-line options is
> > '\-', which renders as the 0x2D minus sign in both UTF-8 and ASCII
> > locales. Hyphenated words such as "read-only" or "command-line" should
> > properly be printed with actual hyphens instead of minus signs, and do
> > not need to be changed. For clarity, though, I recommend that
> > intentional hyphens be specified with the escape \(hy, to emphasize that
> > they are actually intended to be hyphens and not mistakenly-unescaped
> > minus signs.
>
> I find '-' much clearer to read myself, but I don't think it's too
> important either way; leave that one up to the author of the page.
> Replacing '-' with '\-' when a literal dash is desired is the important
> part.

Sure. It would also make for smaller patches to leave hyphens as '-'
instead of changing them to \(hy, which is good.

> > Accents: grave (U+0060) and acute (U+00B4) should be given as \` and \'
> > respectively. According to groff(7), a bare, unescaped ` should also
> > render as "left quote, backquote (ASCII 0x60)". The left quote (U+2018)
> > is different from the backquote (ASCII 0x60), so I think that "left
> > quote" should be deleted from the groff manpage, and groff should be
> > changed to display ` as ` (U+0060) and not as U+2018.
>
> I'm not sure I agree. I think groff(7) is simply unduly ASCII-centric,
> unlike groff_char(7).

Well, that's fine, too, as long as \` or \(ga is used when U+0060 is
intended, and not bare `. This does make that line of groff(7) pretty
misleading, but I guess it's really a consequence of the long-standing
`-as-left-quote mess. It would help if groff(7) could make this clearer,
but as you point out, groff_char(7) makes it pretty clear.

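To illustrate the "replace '-' with '\-' when a literal dash is desired"
rule, here is a minimal sketch of a helper that escapes option-like dashes
in a line of man page source while leaving hyphenated words alone. The
function name and the regular expression are hypothetical, and the
heuristic is an assumption: it only treats one or two dashes followed by a
letter, and not preceded by a word character or a backslash, as an option
prefix. It is not a complete roff parser.

```python
import re

# Assumed heuristic: one or two bare dashes that begin an option-like token.
# The lookbehind skips dashes inside words ("read-only") and dashes that
# are already escaped ("\-").
BARE_OPTION_DASH = re.compile(r"(?<![\\\w])-{1,2}(?=[A-Za-z])")


def escape_option_dashes(line: str) -> str:
    """Replace each bare option dash with the groff escape \\-."""
    return BARE_OPTION_DASH.sub(lambda m: "\\-" * len(m.group(0)), line)


if __name__ == "__main__":
    print(escape_option_dashes("Use -v or --verbose for read-only mode."))
```

Running the helper a second time on its own output leaves the line
unchanged, since escaped dashes are skipped. Anything it misses (numeric
options such as -1, for instance) would still need a manual look.
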
> > Most of these things don't make any difference in ASCII locales, but
> > break in UTF-8 locales in which the special characters are actually
> > rendered specially. For example, searching for a particular
> > command-line option is unnecessarily difficult if it is incorrectly
> > specified with a hyphen instead of a minus sign.
>
> The last time this came up in detail on the upstream groff mailing list,
> it was pointed out that Unicode-capable pagers really ought to start
> regarding different types of spaces and dashes as similar for searching
> purposes. I've not seen much evidence of this yet, but I think this
> would be a good time for such support to start happening.

Right. That would help usability in the short term, but it feels more
like a workaround than a fix. It still doesn't solve the copy-and-paste
problem, either. Having the pagers change what they display when given
the proper multibyte characters would really be a step backwards.

Thanks for the input.

good times,
Vineet
--
http://www.doorstop.net/
--
"Those who desire to give up freedom in order to gain security will not
have, nor do they deserve, either one." --President Thomas Jefferson
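
The search problem discussed above can be sketched in a few lines. This is
only an illustration, assuming an unescaped "--verbose" is rendered with
U+2010 in a UTF-8 locale; the set of characters folded to ASCII '-' and the
name fold_dashes are hypothetical and do not describe any existing pager.

```python
# Assumed set of dash-like characters a pager might fold for searching.
DASH_LIKE = {
    "\u2010": "-",  # HYPHEN
    "\u2011": "-",  # NON-BREAKING HYPHEN
    "\u2212": "-",  # MINUS SIGN
}
_FOLD_TABLE = str.maketrans(DASH_LIKE)


def fold_dashes(text: str) -> str:
    """Map dash-like characters to ASCII 0x2D for comparison only."""
    return text.translate(_FOLD_TABLE)


# What an unescaped "--verbose" may look like after rendering.
rendered = "\u2010\u2010verbose"

if __name__ == "__main__":
    print("--verbose" in rendered)               # False
    print("--verbose" in fold_dashes(rendered))  # True
```

Folding at search time makes the match succeed, but the displayed text
still contains U+2010, so copying it out of the pager still yields
something that is not a valid option.
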