Re: content negotiation for language in web pages

I've moved this to debian-www where it should have been in the first place.

> > Mirrors that don't support content negotiation would be stuck serving
> > in one language (the pages would be set up to default to English).
> Not necessarily <sp> true. I mean, the English part. For example, if the
> German mirror doesn't support negotiation, under the previous scheme, it
> can only mirror the German directories. (Flattening symlinks, of course)
> > It has the benefit of supporting partial translations. If a
> Yes, that's why I used this in the first place. It works great.
> > Also, if a browser doesn't know about content negotiation or the user
> > hasn't configured it to use their preferred language (and the default
> > is usually English), the user will get English docs.
> This again may not be always true. If the browser doesn't support content
> negotiation, it has an internal list (at least Apache does). It knows what
> language to serve by default.
The big problem with this is that it all hinges on every server supporting
content negotiation (CN from here on in), and we don't control the mirrors.
Without CN, you also run into a problem deciding how to write links. If you
just write <foo>.html, then CN does the right thing; on a server without CN,
that link can only ever lead to one language.
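
To make that concrete, here is a rough sketch in Python of what a CN-capable
server does when a browser asks for <foo>.html. It is not Apache's actual
algorithm, only an illustration: the function names (parse_accept_language,
choose_variant), the <foo>.html.<lang> naming, and the fallback to English
are assumptions.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without an explicit q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by quality (descending), then by order of appearance.
            prefs.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(prefs)]


def choose_variant(base, available, header, default="en"):
    """Pick a file such as 'vendors.html.de' for the request 'vendors.html'.

    `available` is the set of languages this page has been translated into,
    so a partially translated site falls through to the next preference.
    """
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]  # 'de-at' may be served by 'de'
        for candidate in (tag, primary):
            if candidate in available:
                return f"{base}.{candidate}"
    return f"{base}.{default}"
```

A server without CN skips all of this and serves whatever single file the
link names, which is the one-language problem described above.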

> > 3. Similar to 2, but each language references the pages in its language,
> >    e.g. index.html.de would reference vendors.html.de . At the main
> ugly, ugly, ugly. It's a nightmare to maintain. Plus, the server has to be
> reconfigured to understand that html.en is text/html, and that is not
> always possible because of the "extra" dot.
I don't see why it's ugly. It's a compromise for an imperfect world. It's
not any more difficult to maintain than any of the other methods.
Also, a server that doesn't understand content negotiation doesn't need to
worry about html.en, as every English file would have a link from .html to
.html.en .
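
Those links could be generated by a small script run over the tree before it
is mirrored. The following is only a sketch: the function name
link_default_language and the choice of relative symlinks are assumptions,
not a description of how the site is built today.

```python
import os


def link_default_language(root, lang="en"):
    """Create foo.html -> foo.html.<lang> symlinks below root.

    Returns the list of links that were created.
    """
    suffix = ".html." + lang
    created = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(suffix):
                continue
            # Strip the trailing ".<lang>" to get the plain .html name.
            link = os.path.join(dirpath, name[: -len(lang) - 1])
            if os.path.lexists(link):
                continue  # never overwrite an existing file or link
            # The target is relative, so the tree can be copied elsewhere.
            os.symlink(name, link)
            created.append(link)
    return created
```

A mirror that flattens symlinks would end up with a plain English copy under
the .html name, which still works on a server without CN.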

> >    page the user would get a language (either by content negotiation
> >    or by explicitly choosing the language by using one of the cross-links)
> >    and all links followed after that would be in that language.
> >    Someone jumping into a different page would have no idea other languages
> >    existed.
> With the setup I presented, this can be solved in this way:
> http://www.debian.org/lang-1 reads DocumentRoot.lang-1 and it DOESN'T do
> content negotiation. The other languages are treated in the same way.
> http://www.debian.org/ reads DocumentRoot and it DOES content negotiation.
> Drawback: you have to remember to use relative links only, that is, <A
> HREF="/dir/document.html"> is not allowed. You have to <A
> HREF="../../dir/document.html">. This almost always limits the usefulness
> of server generated footer and headers that contain links.
This all supposes that we have some control over the mirrors. Many of the
mirror administrators have no time for this sort of thing. We are lucky
that they have convinced their superiors to donate space on the machines.

BTW, all of the web pages already use relative links. Works great.
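
For anyone unsure how to work out the relative form of a link, it can be
computed from the two site paths. This is a minimal sketch; relative_href is
a made-up helper name, and both arguments are assumed to be site-absolute
paths to files.

```python
import posixpath


def relative_href(from_page, to_page):
    """Return the relative link from one page to another.

    Both arguments are site-absolute paths such as '/devel/index.html'.
    """
    start = posixpath.dirname(from_page) or "/"
    return posixpath.relpath(to_page, start)
```

For a page two directories deep, relative_href("/a/b/page.html",
"/dir/document.html") gives "../../dir/document.html", the same form as in
the quoted example.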

> I really think content-negotiation is the way to go, considering that it's
> something that can be configured on a server by server basis. For example,
> www.es.debian.org (the mirror in Spain, not the server in Spanish that
> someone else proposed) can be configured to provide documents in Spanish
> by default. www.it.debian.org provides documents in Italian,
> www.us.debian.org in English, and so on.
Of course CN is the way to go. At the same time it is important,
given the structure of Debian, that we also make the pages accessible in
all languages even if CN isn't available. That's why number 3 was proposed.
When you catch up on the thread, you'll see a few changes have been
proposed which should make this work quite well.

> The problem I saw, and still see, is search engines are stupid enough not
> to know about content-negotiation (well, I complained, and someone at
> Altavista emailed me saying they were considering that, maybe they have
> implemented it by now). For example, http://www.debian.org/ may appear in
> search engines only in English, but when the user gets there it suddenly
> starts speaking German (because the browser asks for "de fr en", for
> example). For me, that's really nice, but others may not think so. That's
> the other reason I came up with the DocumentRoot.lang thing. 
As for setting up searching, each file should say what language it is in
(say with a meta tag). Searches will check for this tag so only the language(s)
specified will be returned. I've used htdig and glimpse and found that they both
had annoying limitations. It looks like glimpse has been improved in the
last 6 months so I will take a look at it again.
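
As a sketch of the meta tag idea, an indexer could read the declared language
from each page and filter on it. Which tag to use is not settled; the code
below assumes <meta http-equiv="Content-Language" content="...">, and the
names LanguageMetaParser, page_languages and filter_by_language are invented
for the example. It has nothing to do with how htdig or glimpse work
internally.

```python
from html.parser import HTMLParser


class LanguageMetaParser(HTMLParser):
    """Collect languages declared in Content-Language meta tags."""

    def __init__(self):
        super().__init__()
        self.languages = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases tag and attribute names for us.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("http-equiv", "").lower() != "content-language":
            return
        for lang in attributes.get("content", "").split(","):
            if lang.strip():
                self.languages.append(lang.strip().lower())


def page_languages(html_text):
    """Return the list of languages a page declares (possibly empty)."""
    parser = LanguageMetaParser()
    parser.feed(html_text)
    return parser.languages


def filter_by_language(pages, wanted):
    """Return the names of pages whose declared language is in `wanted`.

    `pages` maps a file name to its HTML text.
    """
    wanted = {w.lower() for w in wanted}
    return sorted(
        name for name, text in pages.items()
        if wanted & set(page_languages(text))
    )
```

Pages that declare no language are left out of the filtered results here;
whether they should instead be treated as English is a separate decision.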

- Jay
