
Re: [RFR] templates://lynx-cur/{templates}



Atsuhito Kohda wrote:
> Thanks for your test.  I'll check how LOCALE_CHARSET works 
> afterwards.  If it works as expected we don't need a wrapper
> script perhaps.

Except that I can't see any other way of setting PREFERRED_LANGUAGE.
It seems obvious that it should vary automatically to match LANG,
but it doesn't: you can't set it with a command-line option, you
can't set it with an environment variable; it has to be set in a
config file.
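
To make the wrapper idea concrete, here is a rough sketch of what such
a script might do: derive a language tag from LANG, write a small
config file that pulls in the system one and then sets
PREFERRED_LANGUAGE, and start lynx with -cfg pointing at it.  The
path in SYSTEM_CFG, the fallback to "en", and all the function names
are my own assumptions for illustration, and I'm assuming a later
setting overrides one from the INCLUDEd file; this is not what the
package actually ships.

```python
import os
import subprocess
import tempfile

# Assumed location of the system-wide config; adjust for the real package.
SYSTEM_CFG = "/etc/lynx-cur/lynx.cfg"


def language_from_lang(lang):
    """Turn a LANG value such as 'fr_FR.UTF-8' into a tag such as 'fr'."""
    # Drop the codeset ('.UTF-8') and any modifier ('@euro').
    base = lang.split(".", 1)[0].split("@", 1)[0]
    if base in ("", "C", "POSIX"):
        return "en"  # assumed fallback when no real language is set
    # Keep only the language part, dropping the territory ('_FR').
    return base.split("_", 1)[0].lower()


def build_cfg(lang, system_cfg=SYSTEM_CFG):
    """Return the text of a config that includes the system one and
    then sets PREFERRED_LANGUAGE from LANG."""
    return "INCLUDE:%s\nPREFERRED_LANGUAGE:%s\n" % (
        system_cfg, language_from_lang(lang))


def main(argv):
    """Write the temporary config, run lynx with it, then clean up."""
    cfg_text = build_cfg(os.environ.get("LANG", ""))
    with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as f:
        f.write(cfg_text)
        path = f.name
    try:
        return subprocess.call(["lynx", "-cfg=" + path] + list(argv))
    finally:
        os.unlink(path)
```

Calling main(sys.argv[1:]) from a small launcher would pass any other
arguments straight through to lynx.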

[...] 
> I set CHARACTER_SET=iso-8859-1 only because it is the default
> setting in the upstream.  I basically follow the upstream.
> If utf-8 is much more appropriate for Debian I'm willing
> to modify it.

There are (at least) two different issues here:         

1) If lynx encounters a page with no charset tag in the headers, how
	should it interpret it?
          
The official answer to this question is set by the W3C standards,
which are always the same regardless of whether the user doing the
browsing is Korean or Russian or French.  Is there a good reason to
enforce iso-8859-1 on some browsers and not others?  There's stuff
in lynx.cfg about "raw (CJK) mode", but that seems to be an
independent variable controlled by a -raw command line option. 

Of course it turns out that pages with missing charset tags are
usually in Windows-1252, so that's arguably the best fallback - it
even looks like they're going to make this part of the HTML 5
standard.
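
As an illustration of that fallback rule (a sketch in Python, not a
description of how lynx does it internally): trust the declared
charset when there is one, otherwise decode as Windows-1252.  The
function name and the choice to replace undecodable bytes rather than
fail are assumptions of mine.

```python
def decode_page(body, declared_charset=None, fallback="cp1252"):
    """Decode page bytes, falling back to Windows-1252 when the page
    declares no charset (or one that Python does not know)."""
    charset = declared_charset or fallback
    try:
        # errors="replace" keeps going on bytes the charset leaves undefined.
        return body.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset label: treat it like a missing one.
        return body.decode(fallback, errors="replace")
```

The difference matters for exactly the bytes where the two encodings
disagree: byte 0x80 decodes to the euro sign under Windows-1252 but to
an invisible control character under iso-8859-1.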

2) If lynx determines that a page contains a euro symbol (which may
	happen even on 7-bit-clean pages, via the &euro; entity),
	how should it display that?

This is the question locale variables are designed to answer.  If
I'm using a Unicode locale and browsing pages correctly tagged as
being in Unicode, Lynx shouldn't be mangling the text just because
there's no "€" character in iso-8859-1.  But that's what happens if
Lynx thinks I want "CHARACTER_SET:iso-8859-1"... and that's its
default assumption, unless I either overrule it directly or switch
on LOCALE_CHARSET.
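
The mangling is easy to reproduce outside lynx.  The following Python
sketch only models the problem: it resolves entities and then encodes
for a given display charset, substituting "?" for anything that
charset can't represent.  The function name is made up, and lynx's
actual substitution may well differ.

```python
import html


def render_for_display(markup_text, display_charset):
    """Resolve character entities, then encode for the terminal,
    replacing characters the display charset lacks with '?'."""
    text = html.unescape(markup_text)
    return text.encode(display_charset, errors="replace")
```

With a 7-bit-clean input such as "5 &euro;", a utf-8 display charset
keeps the euro sign, while iso-8859-1 has no way to represent it and
the character is lost.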

I think I'm going to give up and just stick with w3m.
-- 
JBR	with qualifications in linguistics, experience as a Debian
	sysadmin, and probably no clue about this particular package

