
Re: Correct one-line-description of cookie setting in textbrowser w3m needed

markus.hiereth@freenet.de wrote:
> thank You for Your mail. One additional information on the background:
> w3m is a textbrowser of age. Regrettably, the development team
> vanished and documentation is scarce or difficult to understand. So I
> hoped for help from other sides. A couple of problems with strings in
> the configuration panel of w3m were already solved by reflecting the
> context; it is the field of http and html.

I'm glad to see w3m getting this work; I've long been a fan of "the
pager that thinks it's a tabbed graphical web browser".  It was
particularly enjoyable ten years ago when the version of w3m in Debian 
Stable supported tabbed browsing years before MSIE did!

> Justin B Rye schrieb am 17. Oct 2014 um 00:53
>>> 1 use_cookie=<bool>                Enable cookie processing
>> (That really ought to be "use_cookieS", and likewise for most of the
>> other variable names below.)
> Indeed. But only the compiler needs to digest such variable names :-)

True as far as I know for *some* of them, but README.cookie adds to
the obscurity by expecting end users to recognise the variable
"cookie_avoid_wrong_number_of_dots" by that name.

>> It's not a great variable name, either, because the user isn't being
>> asked to specify one "wrong" number N which will configure w3m to
>> reject cookies for domains with N dots - if I understand correctly,
>> setting N=1 will cause w3m to reject as invalid all cookies for
>> single-dotted *or* dotless domains (that is, it sets the minimum
>> valid number of dots to two, which is just begging for off-by-one
>> errors).  It's not quite clear if you can meaningfully set N=255.
>> And... wait, why on earth is it a <string>?  What would it mean if I
>> set N=fish?
> The variable really gets a string.

Indeed, so the interpretation suggested by the documentation in
README.cookie was just as wrong as the interpretation suggested by the
variable name.

> The RFC
> (https://www.ietf.org/rfc/rfc2109.txt) establishes a general rule on
> the position and the number of dots in the domain attribute of cookies

Yes, you already quoted that, thanks.

> and the item of w3m's configuration we are talking about allows one to
> define exceptions.

In that case it ought to be really easy to describe:

  cookie_gibberish_name=<string> Domains to exempt from cookie format validation

But then it's imperative that the explanation in README.cookie
 1) avoids mentioning cookie_gibberish_name;
 2) avoids getting bogged down in RFC technicalities; and above all
 3) gives a useful example.

> Maybe the string fish as value for cookie_avoid_wrong_number_of_dots
> is not as absurd as it seems. It fits one case of the scheme for
> domain-lists given in w3m's README.cookie.
>    domain-list = domains
>                | ""
>    domains     = domain
>                | domain + "," + domains
>    domain      = "." + domain-name      ; match with domain name
>                | host-domain-name       ; match with HDN
>                | ".local"               ; match with all HDN except which include  "."
>                | "."                    ; match with all
>                            (HDN: host domain name)
> "fish" is no Fully qualified Domain Name FQDN but a host domain
> name. For tests purposes, a local webserver installation could send
> cookies with fish.local or .fish as value of the domain
> attribute. According to 4.3.2 of the RFC mentioned above, such a
> cookie is not acceptable.

Given that fish.local has just as many dots as microsoft.com, it's a
bad idea to summarise this validation as "wrong number of dots".
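
To make that concrete, here is a minimal Python sketch of how I read the
domain-list scheme quoted above. It is not w3m's code: the function names
are mine, and treating a leading-dot entry as a plain suffix match is an
assumption that README.cookie doesn't spell out. The first print also shows
why a dot count can't distinguish the two names.

```python
def entry_matches(entry: str, host: str) -> bool:
    """Return True if one domain-list entry matches the host name."""
    if entry == ".":
        return True  # "." matches every host
    if entry == ".local":
        return "." not in host  # ".local" matches dotless host names only
    if entry.startswith("."):
        return host.endswith(entry)  # assumed: plain suffix match
    return host == entry  # bare entry: exact host domain name


def in_domain_list(domain_list: str, host: str) -> bool:
    """Check a host against a comma-separated domain-list string."""
    entries = [e.strip() for e in domain_list.split(",") if e.strip()]
    return any(entry_matches(e, host) for e in entries)


# Both names contain exactly one dot.
print("fish.local".count("."), "microsoft.com".count("."))
print(in_domain_list("fish,.example.org", "fish"))
print(in_domain_list(".local", "fish.local"))
```

Under this reading, "fish" is a perfectly legitimate list entry, and
".local" says nothing about hosts whose names end in ".local".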

> I assume that the two configuration items cookie_accept_domains and
> cookie_avoid_wrong_number_of_dots reflect to subsequent steps of
> setting a cookie on the client side, i.e. computer running the
> browser: 1. Is the sender of the cookie listed in domains to reject
> cookies from?  2. Is the domain attribute of the cookie in accordance
> with RFC 2109?  3. In case it is not: Does the domain attribute of the
> cookie appear in the string value of cookie_avoid_wrong_number_of_dots?

Yes, this interpretation makes sense to me, too.  It's a format
validation exemption list.

> I did not want to reveal the suggestion I made in my bug report within
> the first mail towards Your list. But my one line explanation for the
> configuration item would be:
>   cookie_avoid_wrong_number_of_dots=<string> "Do not reject cookies having one of the domain attributes"

That sounds like a boolean; you'd have to say something like
    cookie_avoid_wrong_number_of_dots=<string> "Domain attributes not to reject cookies having"
which is a horribly contorted phrase and still doesn't convey anything
close to what we want it to.  The "not to reject" part makes it sound
as if it conflicts with the "Domains to accept/reject" options.

Did the previous two options work in terms of the actual FQDN of the
web server that the cookie comes from, or was it the Domain= attribute
declared in the cookie?  If they're all talking about the same thing
then we shouldn't raise the topic here.  On the other hand if we need
the distinction then we should make that clearer on both sides - but
I'd suggest using terminology like "server domain" versus "cookie
domain" rather than this obscure stuff about "attributes".

>>> The RFC (https://www.ietf.org/rfc/rfc2109.txt) explains
>> [...]
>>> I assume that the option in question refers to differences between the
>>> domain of the server which is about to set a cookie on the computer of
>>> the internet user and the domain attribute inside the cookie.
>> I don't see any reason to assume that.
> But the background of the RFC rules is security! A cookie shall only
> interfere in communications between a specified webserver. A cookie
> with a domain attribute "com.uk" would request for activation in any
> communication with UK commercial sites.

Yes, this is part of the reason why there's validation for domains
declared in cookies.  But the option isn't talking about these

>>> Has anyone in Your team a suggestion for a-one-line description of
>>> this option? I delivered one in my bug report [1]. Tatsuya as the
>>> maintainer of the package w3m would as well appreciate Your help.
>>> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=765068
>> Meanwhile in ja.po it's:
>> 	"[wrong number of dots] ???????????????????????????"

(Oops, "charset=us-ascii"!)

>> (where the part in Japanese means something like "domains to ignore").
>> This doesn't seem helpful at all.
> If You know Japanese, give it a second thought! Think of a domain
> attribute inside the cookie that shall not be checked for accordance
> with the RFC. This would be a request to ignore something.

Except that w3m won't ignore those domains, or even ignore cookies
that specify Domain= attributes on that list - it will skip format
validation and *accept* cookies for those domains.
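
As a rough Python sketch of the three steps as I understand them (all names
here are hypothetical, and I'm assuming that the reject list is checked
against the server's host name while the exemption list is checked against
the cookie's Domain= value, which is exactly the distinction I asked about
above). The format check is a simplified stand-in for the RFC 2109 rule
that the domain must start with a dot and contain an embedded dot, and the
lists use exact membership rather than the full domain-list matching.

```python
def passes_format_check(cookie_domain: str) -> bool:
    """Simplified RFC 2109 style check: leading dot plus an embedded dot."""
    return cookie_domain.startswith(".") and "." in cookie_domain[1:-1]


def decide(server_host: str, cookie_domain: str,
           reject_list: set, exempt_list: set) -> str:
    """Return a verdict string for one cookie (illustrative only)."""
    # Step 1: is the sender on the list of domains to reject cookies from?
    if server_host in reject_list:
        return "reject: server is on the reject list"
    # Step 2: does the cookie's declared domain pass format validation?
    if passes_format_check(cookie_domain):
        return "accept"
    # Step 3: if not, is that domain exempt from format validation?
    if cookie_domain in exempt_list:
        return "accept: exempt from format validation"
    return "reject: failed format validation"


print(decide("fish", ".fish", set(), set()))
print(decide("fish", ".fish", set(), {".fish"}))
```

The point is that a listed domain is neither ignored nor rejected: the
exemption only matters for cookies that would otherwise fail validation,
and its effect is to let them through.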

I'm not sure about the range of senses of that Japanese verb, but as
far as I can make out so far the English we need is something like:
	Domains to exempt from cookie format validation
or	Domains to exempt from cookie RFC2109 validation
or	Domains to exempt from cookie dot-count checking

> Finally, a charming statement, found in a README file of the w3m
> development team:
>   If you can read English, see doc/*.
>   If you can read Japanese, see doc-jp/*.
>   If you can read both, read both and correct English. :-)

I can puzzle out bits of Japanese if I'm lucky, but the day that
maintaining my technological lifestyle requires fluency in written
Japanese will be the day I give up and go and live in a cave.

JBR	with qualifications in linguistics, experience as a Debian
	sysadmin, and probably no clue about this particular package
