
Bug#905126: www.debian.org: Website search box unhelpful for common names (e.g. Buster) in certain character sets



Hello Jonathan

Thanks for reporting this bug.

On Tue, 31 Jul 2018 21:40:18 +0800 Jonathan Wiltshire <jmw@debian.org>
wrote:
> Package: www.debian.org
> Severity: normal
> 
> A number of search languages end up with no results for contextually
> common search terms, for example "debian" or "buster".
> 
> To reproduce:
>  - use the search box for the term "buster" in English. There are a
>    number of results including release information, news items and
>    errata.
>  - set the language to Vietnamese, Chinese or similar and search again
>  - there are no results.

I can reproduce that. However, searching in Vietnamese for "Debian" or
"Buster" (capitalized) does show results.

E.g. the search for "Buster" in Vietnamese produces this link as first
match:

https://www.debian.org/releases/index.vi.html
100% relevant, matching: buster

Interestingly, it says "matching: buster" (lowercase, although I
searched for "Buster").

If I search for "buster" (with the quotes), I also get results.

> 
> I assume that this is an issue with translations into non-Latin
> character sets without hint words nearby the translated word.
> 

I don't think it is related to character sets, because I tested with
Esperanto (Latin character set) and saw the same behaviour.

The relevant code in the Debian website for this bug is in the file
webwml/english/search.xml.in, which I think just sends the search term
to the search engine (which runs at search.debian.org):

<:	my $ext = lc('$(CUR_ISO_LANG)');  $ext =~ s/-/_/;
	print
'template="https://search.debian.org/cgi-bin/omega?P={searchTerms}&amp;HITSPERPAGE={count?}&amp;DB='.$ext.'[CN:-cn:][TW:-tw:][HK:-hk:]"/>';
:>
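
To make it easier to compare languages and query variants outside the
website, here is a rough sketch in Python of the URL I understand that
template to produce. It is an illustration only: the function names and
the default of 10 hits per page are my own assumptions, and it ignores
the [CN:...], [TW:...] and [HK:...] suffixes that the template adds for
the Chinese variants.

```python
from urllib.parse import urlencode

# Endpoint taken from the template in search.xml.in.
OMEGA_URL = "https://search.debian.org/cgi-bin/omega"


def db_name(iso_lang: str) -> str:
    """Mirror the Perl above: lc(...) followed by s/-/_/.

    The Perl substitution has no /g flag, so only the first hyphen
    is replaced.
    """
    return iso_lang.lower().replace("-", "_", 1)


def search_url(term: str, iso_lang: str, hits_per_page: int = 10) -> str:
    """Build the query URL for one search term and one language database.

    hits_per_page=10 is an assumed default, not taken from the website.
    """
    params = {
        "P": term,
        "HITSPERPAGE": hits_per_page,
        "DB": db_name(iso_lang),
    }
    return f"{OMEGA_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # The three variants discussed in this mail, against the Vietnamese DB.
    for term in ("buster", "Buster", '"buster"'):
        print(search_url(term, "vi"))
```

Opening the three printed URLs side by side should show whether the
difference between "buster", "Buster" and the quoted form comes from the
search engine itself rather than from the website's search box.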

I couldn't find a canonical repository or pseudopackage related to
search.debian.org. From what I've found, it is "a slightly patched
xapian-omega instance". I've logged in to the machine, and the code
there has two remote repositories. I'm CC'ing Raphael Geissert (shown as
the contact for comments in the search result pages) and Olly Betts
(shown as the author of the last commits in the repo that is currently
deployed on search.debian.org). I hope they can help or tell us how to
proceed.

Cheers
-- 
Laura Arjona Reina
https://wiki.debian.org/LauraArjona

