Re: Bug#1020387: dictionaries-common: Consensus regarding the packaging of the Qt WebEngine hunspell binary dictionaries



On Wednesday, October 5, 2022 5:07:50 AM MST Agustin Martin wrote:
> On Thu, 22 Sept 2022 at 21:30, Soren Stoutner
> One noticeable thing is that bdic generation failed for some hunspell
> dicts I have installed

That’s concerning.

> ++ processing an_ES.aff
> [1003/125813.760330:FATAL:aff_reader.cc(305)] Did not find a space in 'y   
> i'. Trace/breakpoint trap

This is caused by line 90 of an_ES.aff:

REP y<tab character>i

All the previous REP entries in this file separate the two arguments with a 
space.  This is the first one to use a tab.  After line 90, both tabs and 
spaces are used.

I don’t know enough about the Hunspell file format to know what is expected.  
Is this an incorrectly formatted .aff file, or is qwebengine_convert_dict 
failing to read valid Hunspell formatting?
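
For anyone who wants to check other dictionaries for the same pattern, here is 
a minimal Python sketch that lists REP lines containing a tab.  The function 
name is hypothetical, and the script only locates such lines; it does not 
settle whether a tab is a valid separator in the Hunspell format.

```python
import re

def find_tab_separated_rep(aff_text):
    """Return (line_number, line) pairs for REP lines that contain a tab."""
    hits = []
    for number, line in enumerate(aff_text.split("\n"), start=1):
        line = line.rstrip("\r")
        # Only REP entries are of interest here; skip everything else.
        if not re.match(r"REP\s", line):
            continue
        if "\t" in line:
            hits.append((number, line))
    return hits

# Tiny made-up sample, not the real an_ES.aff contents.
sample = "REP 2\nREP a e\nREP y\ti\n"
for number, line in find_tab_separated_rep(sample):
    print(number, repr(line))
```

To run it against a real file, the text has to be decoded with the encoding 
that file declares in its SET line before it is passed to the function.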

> ++ processing ar.aff
> [1003/125813.796753:FATAL:aff_reader.cc(123)] We don't support the
> IGNORE command yet. This would change how we would insert things in
> our lookup table.

Based on this error message, it seems fairly obvious that 
qwebengine_convert_dict does not fully support the Hunspell format.  The line 
in question is 24142 from ar.aff which reads as follows:

IGNORE ٌٍَُِّْ

I will file an upstream bug to see if that can be corrected in some way, but I 
think I will wait until I have the answers to these other questions to decide 
if I should file one bug or three.
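
To see which other dictionaries would hit the same limitation, a sketch along 
these lines could be used to look for IGNORE lines before attempting a 
conversion.  The function name is hypothetical, and the only unsupported 
command assumed here is the one named in the error message above.

```python
def find_directive(aff_text, directive="IGNORE"):
    """Return (line_number, line) pairs whose first field is `directive`."""
    hits = []
    for number, line in enumerate(aff_text.split("\n"), start=1):
        fields = line.split()  # splits on any run of spaces or tabs
        if fields and fields[0] == directive:
            hits.append((number, line.rstrip("\r")))
    return hits

# Tiny made-up sample, not the real ar.aff contents.
sample = "SET UTF-8\nTRY abc\nIGNORE xyz\n"
for number, line in find_directive(sample):
    print(number, line)
```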

> ++ processing gl_ES.aff
> gl_ES.dic_delta not found.
> Reading gl_ES.aff
> Reading gl_ES.dic
> Serializing...
> Verifying...
> Word does not match!
>   Index:    2126
>   Expected: Abū po:antropónimo is:ngrama_Abū_ʿAbdullāh_Muḥammad_ibn_Jābir_ibn_Sinān_ar_Raqqī_al_Ḥarrani_aṣ_Ṣabiʾ_al_Battānī
>   Actual:   Abū po:antropónimo is:ngrama_Abū_ʿAbdullāh_Muḥammad_ibn_Jābir_ibn_Sinān_ar_Raqqī_al_Ḥarrani_aṣ_Ṣabiʾ_al_Battā
> ERROR converting, the dictionary does not check out OK.

I am not exactly sure what is causing this error, but I would assume that it 
is some mismatch between the .aff and the .dic files.  The line it appears to be 
complaining about is 2095 from gl_ES.dic, which reads as follows:

Abū po:antropónimo is:ngrama_Abū_ʿAbdullāh_Muḥammad_ibn_Jābir_ibn_Sinān_ar_Raqqī_al_Ḥarrani_aṣ_Ṣabiʾ_al_Battānī

However, for some reason the converted word comes back shorter than the 
original: the "Actual" value in the log ends in "al_Battā" where "al_Battānī" 
is expected.
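
A small sketch for pinning down what differs between the two values in the 
log.  The helper name is hypothetical, and only the final part of each string 
is compared here, since that is where the log shows them diverging.

```python
def missing_suffix(expected, actual):
    """Return the tail of `expected` that `actual` lacks.

    Returns None when `actual` is not a plain prefix of `expected`.
    """
    if expected.startswith(actual):
        return expected[len(actual):]
    return None

# Final parts of the "Expected" and "Actual" values from the log above.
expected_tail = "al_Battānī"
actual_tail = "al_Battā"

lost = missing_suffix(expected_tail, actual_tail)
print("missing:", repr(lost))
print("characters:", len(lost), "UTF-8 bytes:", len(lost.encode("utf-8")))
```

Running the same comparison on the full strings, together with their total 
lengths in bytes, might help show whether the word is being cut at a fixed 
size, but that is only a guess at this point.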

-- 
Soren Stoutner
soren@stoutner.com
