Bug#929923: missing dictionaries.xcu confuses non-US English locales (e.g. en_AU)
tag 929923 + patch
thanks
Rene Engelhard wrote:
> On Mon, Jun 03, 2019 at 07:21:47PM +1000, Trent W. Buck wrote:
>> Upstream, LibreOffice uses a dictionaries.xcu file to say "use the en_US thesaurus for ALL en locales".
>> AFAICT Debian doesn't ship dictionaries.xcu files, though they are present in the libreoffice-dictionaries source package.
>
> Yeah, that is a side effect of how we package it. Application-agnostic.
>
> If we packaged it as a LO-only extension we could include that file, as
> it is now, not. (and mythes is not used by LO only ttbomk).
>
> [code showing mythes-en-us is used by lyx and texlive]
>
> And packaging this up as an extension would either lose the ability to use it
> in other applications or duplicate it.
I agree that the thesaurus should be application-agnostic.
I agree that packaging as a LO-only extension is a bad idea.
My original bug report was not very clear (sorry!); I will rephrase below.
Short version: I propose 929923.patch (attached).
The problem
===========
1. LibreOffice only has one English-language thesaurus, th_en_US_v2.
2. On Windows LibreOffice, Tools > Thesaurus is clickable for non-US English.
3. On Debian LibreOffice, Tools > Thesaurus is greyed out for non-US English.
This is the bug.
To reproduce:
* Install libreoffice and mythes-en-us.
* Open LO writer.
* Tools > Language > For All Text > English (UK)
* Tools > Thesaurus is greyed out.
* Tools > Language > For All Text > English (USA)
* Tools > Thesaurus is clickable.
I want Debian LibreOffice to behave like non-Debian LibreOffice.
Do you agree so far?
The solution
============
I think on non-Debian, LibreOffice knows that en_* should use
th_en_US_v2 because of dictionaries/en/dictionaries.xcu:
https://sources.debian.org/src/libreoffice-dictionaries/1:6.2.0-1/dictionaries/en/dictionaries.xcu/#L82
https://sources.debian.org/src/libreoffice-dictionaries/1:6.2.0-1/dictionaries/en/dictionaries.xcu/#L90
I think without that XML, LibreOffice guesses based on the file name.
This causes Tools > Thesaurus to work for en_US, but
not for other locales mentioned on #L90, above.
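To make the file-name guess concrete, here is a minimal sketch of the check I have in mind. It assumes (my hypothesis, not confirmed from the LibreOffice source) that a locale such as en-GB only gets a thesaurus if both th_en_GB_v2.dat and th_en_GB_v2.idx exist in /usr/share/mythes. The function names are made up for illustration.

```python
import os


def thesaurus_stem(locale_tag):
    """Map a locale tag such as 'en-GB' to the assumed file stem 'th_en_GB_v2'."""
    return "th_" + locale_tag.replace("-", "_") + "_v2"


def locales_without_thesaurus(locale_tags, directory="/usr/share/mythes"):
    """Return the locale tags that lack a .dat or .idx file under `directory`.

    This only mirrors the assumed file-name convention; it does not ask
    LibreOffice anything.
    """
    missing = []
    for tag in locale_tags:
        stem = thesaurus_stem(tag)
        present = all(
            os.path.exists(os.path.join(directory, stem + ext))
            for ext in (".dat", ".idx")
        )
        if not present:
            missing.append(tag)
    return missing
```

On a system with only mythes-en-us installed, I would expect locales_without_thesaurus(["en-US", "en-GB", "en-AU"]) to report en-GB and en-AU, which matches the greyed-out menu entry.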
I can see three ways to address this:
a. like upstream, bundle the .dic and .xcu files into an .oxt (LibreOffice Extension).
This makes the thesaurus inaccessible to non-LO apps, which is unacceptable.
b. distribute the XML somewhere else, such as into
/usr/lib/libreoffice/share/registry/mythes-en-us.xcd
I couldn't get this to work (see attached draft).
c. make some crappy symlinks th_en_XX_v2.{dat,idx} -> th_en_US_v2.{dat,idx}.
This works for me.
The downside is that debian/*.links and
dictionaries/*/dictionaries.xcu can get out of sync.
UPDATE: I just noticed you are already doing (c) for other languages, e.g.
https://sources.debian.org/src/libreoffice-dictionaries/1:6.2.0-1/debian/mythes-es.links/
So, I am simply proposing to do the same for English.
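To reduce the risk of debian/*.links and dictionaries.xcu drifting apart, the links file could be regenerated from the Locales list instead of being maintained by hand. The following is only a sketch: the helper names are hypothetical, and it assumes the .xcu keeps the ThesDic_en-US node with a space-separated Locales value, as in the upstream file linked above.

```python
import xml.etree.ElementTree as ET

# Namespace of the oor:* attributes used in .xcu/.xcd registry files.
OOR = "{http://openoffice.org/2001/registry}"


def locales_from_xcu(xml_text, node_name="ThesDic_en-US"):
    """Return the Locales list of the named dictionary node, or [] if absent."""
    root = ET.fromstring(xml_text)
    for node in root.iter("node"):
        if node.get(OOR + "name") != node_name:
            continue
        for prop in node.findall("prop"):
            if prop.get(OOR + "name") == "Locales":
                return (prop.findtext("value") or "").split()
    return []


def links_lines(locale_tags, source="en-US", directory="usr/share/mythes"):
    """Build dh_link-style 'target link-name' lines for every non-source locale."""

    def stem(tag):
        return "th_" + tag.replace("-", "_") + "_v2"

    lines = []
    for tag in locale_tags:
        if tag == source:
            continue  # the real files already exist for the source locale
        for ext in ("dat", "idx"):
            lines.append(
                f"{directory}/{stem(source)}.{ext} {directory}/{stem(tag)}.{ext}"
            )
    return lines
```

Feeding it the 17 locales listed for ThesDic_en-US should yield 32 lines (16 non-US locales, .dat and .idx each), which could be compared against the packaged links file at build time.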
Attached is a simple draft patch.
<?xml version="1.0" encoding="UTF-8"?>
<!-- REFERENCE: http://web.archive.org/web/20101103025920/http://util.openoffice.org/common/configuration/oor-document-format.html -->
<!-- REFERENCE: https://people.freedesktop.org/~vmiklos/2013/oor-document-format.html (another mirror) -->
<oor:data xmlns:xs="http://www.w3.org/2001/XMLSchema"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:oor="http://openoffice.org/2001/registry">
  <!-- AFAICT <dependency> means "load me AFTER /usr/lib/libreoffice/share/registry/foo.xcd". -->
  <!-- We must load after lingucomponent, because we're editing (oor:op=fuse) its ServiceManager node. -->
  <dependency file="lingucomponent" />
  <oor:component-data oor:package="org.openoffice.Office" oor:name="Linguistic">
    <node oor:name="ServiceManager">
      <node oor:name="Dictionaries">
        <node oor:name="ThesDic_en-US" oor:op="replace">
          <prop oor:name="Locations" oor:type="oor:string-list">
            <value>/usr/share/mythes/th_en_US_v2.dat</value>
          </prop>
          <prop oor:name="Format" oor:type="xs:string">
            <value>DICT_THES</value>
          </prop>
          <prop oor:name="Locales" oor:type="oor:string-list">
            <value>en-GB en-US en-PH en-ZA en-NA en-ZW en-AU en-CA en-IE en-IN en-BZ en-BS en-GH en-JM en-MW en-NZ en-TT</value>
          </prop>
        </node>
      </node>
    </node>
  </oor:component-data>
</oor:data>
commit 4f2ef16491ab99544e46a8380324a1aecc2ae210
Author: root <root@localhost>
Date: Tue Jun 4 12:41:58 2019 +1000
Non-maintainer upload.
* Non-maintainer upload.
* Fix Tools > Thesaurus for English (non-USA) in LibreOffice.
Closes: #929923
diff --git a/debian/changelog b/debian/changelog
index fc56faf..aaf9096 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,11 @@
+libreoffice-dictionaries (1:6.2.0-1.1) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Fix Tools > Thesaurus for English (non-USA) in LibreOffice.
+    Closes: #929923
+
+ -- Trent W. Buck <trentbuck@gmail.com>  Tue, 04 Jun 2019 12:32:55 +1000
+
 libreoffice-dictionaries (1:6.2.0-1) unstable; urgency=medium
 
   * New upstream version 6.2.0.
diff --git a/debian/control b/debian/control
index da6f316..a791361 100644
--- a/debian/control
+++ b/debian/control
@@ -433,6 +433,10 @@ Multi-Arch: foreign
 Depends: dictionaries-common, ${misc:Depends}
 Suggests: libreoffice-writer
 Provides: mythes-thesaurus, mythes-thesaurus-en-us
+# This is the ancient mythes-en-au from OpenOffice.org, from way back before Oracle bought Sun.
+# It hasn't had an update since 2011. Upstream LibreOffice uses the English (USA) thesaurus for all English languages.
+Breaks: mythes-en-au (<< 2.1-5.4)
+Replaces: mythes-en-au (<< 2.1-5.4)
 Description: English (USA) Thesaurus for LibreOffice
  Libreoffice is a full-featured office productivity suite that provides a
  near drop-in replacement for Microsoft(R) Office.
diff --git a/debian/mythes-en-us.links b/debian/mythes-en-us.links
new file mode 100644
index 0000000..4efe431
--- /dev/null
+++ b/debian/mythes-en-us.links
@@ -0,0 +1,34 @@
+# These associations are taken from dictionaries/en/dictionaries.xcu:ServiceManager.Dictionaries.ThesDic_en-US.Locales.
+# FIXME: now that these are provided, should mythes-en-us be renamed to mythes-en (similar to mythes-es et al)?
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_GB_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_GB_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_PH_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_PH_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_ZA_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_ZA_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_NA_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_NA_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_ZW_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_ZW_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_AU_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_AU_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_CA_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_CA_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_IE_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_IE_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_IN_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_IN_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_BZ_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_BZ_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_BS_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_BS_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_GH_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_GH_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_JM_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_JM_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_MW_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_MW_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_NZ_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_NZ_v2.idx
+usr/share/mythes/th_en_US_v2.dat usr/share/mythes/th_en_TT_v2.dat
+usr/share/mythes/th_en_US_v2.idx usr/share/mythes/th_en_TT_v2.idx