[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#984717: marked as done (unblock: opencc/1.1.1+git20200624+ds2-10)



Your message dated Sun, 7 Mar 2021 21:16:50 +0100
with message-id <57d6c166-039a-c3f4-74ee-f99fbfa64d90@debian.org>
and subject line Re: Bug#984717: unblock: opencc/1.1.1+git20200624+ds2-10
has caused the Debian Bug report #984717,
regarding unblock: opencc/1.1.1+git20200624+ds2-10
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
984717: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=984717
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: release.debian.org
Severity: normal
User: release.debian.org@packages.debian.org
Usertags: unblock
X-Debbugs-Cc: zhsj@debian.org, byang@debian.org

Please unblock package opencc

Summary of changes since testing/1.1.1+git20200624+ds2-5

+ Enable upstream tests in package building.
+ Add autopkgtest
+ Backport 2 upstream patches to fix performance regression.

[ Reason ]

This fix is small, but the performance is improved a lot.
As opencc is also used in generating Debian official website
for Simplified and Traditional Chinese version, and a lot of text
needs to be processed by opencc, so the performance matters.

opencc is key package, so it needs manual unblock.

[ Impact ]
Without this patch, the performance drops a lot.

[ Tests ]
+ Upstream unit and integration tests
+ Autopkgtest for installed library and tool
+ Manual tests
  + With opencc in testing:

    $ time opencc -c /usr/share/opencc/t2s.json -i <(printf "Open Chinese Convert 開放中文轉換\n%.0s" {1..50000}) -o /dev/null

    real    0m40.328s
    user    0m40.272s
    sys     0m0.105s


  + With opencc in unstable:

    $ time opencc -c /usr/share/opencc/t2s.json -i <(printf "Open Chinese Convert 開放中文轉換\n%.0s" {1..50000}) -o /dev/null

    real    0m0.556s
    user    0m0.551s
    sys     0m0.065s


[ Risks ]

+ Patch is small
+ Key package

[ Checklist ]
  [x] all changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in testing

[ Other info ]
The upstream patch was first backported in 1.1.1+git20200624+ds2-6, but the tests
were not run, so Boyuan didn't notice the backport is incomplete. Then the backport
was reverted in -7. After adding tests and autopkgtest, the patches are backported
again in -10.


unblock opencc/1.1.1+git20200624+ds2-10

Diff:

Real effected code added are only 13 lines.
+ https://salsa.debian.org/debian/opencc/-/blob/debian/1.1.1+git20200624+ds2-10/debian/patches/0006-Fix-a-bug-in-the-calculation-of-DictGroup-keyMaxLeng.patch
  L28-L42
+ https://salsa.debian.org/debian/opencc/-/blob/debian/1.1.1+git20200624+ds2-10/debian/patches/0007-Fix-a-severe-performance-bug-in-Conversion-Convert-t.patch
  L78-L80

Full diff:

Also available at:
https://salsa.debian.org/debian/opencc/-/compare/debian%2F1.1.1+git20200624+ds2-5...debian%2F1.1.1+git20200624+ds2-10

Some long test code added by upstream commit are skipped below.

diff --git a/debian/changelog b/debian/changelog
index eee8331..69db793 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,48 @@
+opencc (1.1.1+git20200624+ds2-10) unstable; urgency=medium
+
+  * Team upload.
+  * Upload to unstable.
+  * Backport patch to fix performance regression again.
+    Add
+    + 0006-Fix-a-bug-in-the-calculation-of-DictGroup-keyMaxLeng.patch
+    + 0007-Fix-a-severe-performance-bug-in-Conversion-Convert-t.patch
+
+ -- Shengjing Zhu <zhsj@debian.org>  Sun, 07 Mar 2021 14:20:40 +0800
+
+opencc (1.1.1+git20200624+ds2-9) experimental; urgency=medium
+
+  * Team upload.
+  * Remove unused command from autopkgtest scripts
+  * Add build-essential to autopkgtest
+
+ -- Shengjing Zhu <zhsj@debian.org>  Sun, 07 Mar 2021 00:54:22 +0800
+
+opencc (1.1.1+git20200624+ds2-8) experimental; urgency=medium
+
+  * Team upload.
+  * Enable test when building
+  * Add autopkgtest
+
+ -- Shengjing Zhu <zhsj@debian.org>  Sat, 06 Mar 2021 17:57:00 +0800
+
+opencc (1.1.1+git20200624+ds2-7) unstable; urgency=high
+
+  * Team upload.
+  * Drop debian/patches/0005 for now due to regression reported.
+    See also https://github.com/fcitx/fcitx5/issues/238 .
+
+ -- Boyuan Yang <byang@debian.org>  Fri, 05 Mar 2021 09:37:48 -0500
+
+opencc (1.1.1+git20200624+ds2-6) unstable; urgency=high
+
+  * Team upload.
+  * debian/patches/0005: Add upstream patch to fix severe performance
+    regression in `Conversion::Convert` that caused O(N^2) complexity.
+  * debian/rules: Disable parallel build to workaround some random
+    build error for now.
+
+ -- Boyuan Yang <byang@debian.org>  Sun, 28 Feb 2021 19:48:01 -0500
+
 opencc (1.1.1+git20200624+ds2-5) unstable; urgency=medium
 
   * Team upload.
diff --git a/debian/control b/debian/control
index 2eadce4..834dae0 100644
--- a/debian/control
+++ b/debian/control
@@ -13,6 +13,7 @@ Build-Depends:
  darts,
  debhelper-compat (= 13),
  doxygen <!nodoc>,
+ googletest <!nocheck>,
  libmarisa-dev,
  libtclap-dev,
  python3:any,
diff --git a/debian/gbp.conf b/debian/gbp.conf
new file mode 100644
index 0000000..cec628c
--- /dev/null
+++ b/debian/gbp.conf
@@ -0,0 +1,2 @@
+[DEFAULT]
+pristine-tar = True
diff --git a/debian/patches/use-cmake-install-libdir.patch b/debian/patches/0001-use-cmake-install-libdir.patch
similarity index 100%
rename from debian/patches/use-cmake-install-libdir.patch
rename to debian/patches/0001-use-cmake-install-libdir.patch
diff --git a/debian/patches/0003-use-system-libraries.patch b/debian/patches/0002-use-system-libraries.patch
similarity index 100%
rename from debian/patches/0003-use-system-libraries.patch
rename to debian/patches/0002-use-system-libraries.patch
diff --git a/debian/patches/no-remote-images-when-reading-docs-on-disk.patch b/debian/patches/0004-no-remote-images-when-reading-docs-on-disk.patch
similarity index 77%
rename from debian/patches/no-remote-images-when-reading-docs-on-disk.patch
rename to debian/patches/0004-no-remote-images-when-reading-docs-on-disk.patch
index 3ea24ee..60b1bbc 100644
--- a/debian/patches/no-remote-images-when-reading-docs-on-disk.patch
+++ b/debian/patches/0004-no-remote-images-when-reading-docs-on-disk.patch
@@ -1,9 +1,17 @@
-Description: Don't fetch remote images when reading docs on disk
- This fixes a privacy breach previously reported as Lintian warnings
-Author: Gunnar Hjalmarsson <gunnarhj@ubuntu.com>
+From: Gunnar Hjalmarsson <gunnarhj@ubuntu.com>
+Date: Sat, 6 Mar 2021 17:45:20 +0800
+Subject: Don't fetch remote images when reading docs on disk
+
 Forwarded: no
 Last-Update: 2021-01-14
 
+This fixes a privacy breach previously reported as Lintian warnings
+---
+ README.md | 8 ++++----
+ 1 file changed, 4 insertions(+), 4 deletions(-)
+
+diff --git a/README.md b/README.md
+index d941d6b..0663dcd 100644
 --- a/README.md
 +++ b/README.md
 @@ -1,12 +1,12 @@
diff --git a/debian/patches/0005-Use-system-googletest.patch b/debian/patches/0005-Use-system-googletest.patch
new file mode 100644
index 0000000..783d6bb
--- /dev/null
+++ b/debian/patches/0005-Use-system-googletest.patch
@@ -0,0 +1,22 @@
+From: Shengjing Zhu <zhsj@debian.org>
+Date: Sat, 6 Mar 2021 17:53:25 +0800
+Subject: Use system googletest
+
+Forwarded: not-needed
+---
+ CMakeLists.txt | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/CMakeLists.txt b/CMakeLists.txt
+index efb51ae..e26b934 100644
+--- a/CMakeLists.txt
++++ b/CMakeLists.txt
+@@ -195,7 +195,7 @@ add_subdirectory(test)
+ ######## Testing
+ 
+ if (ENABLE_GTEST)
+-  add_subdirectory(deps/gtest-1.11.0)
++  add_subdirectory(/usr/src/googletest/googletest ${CMAKE_BINARY_DIR}/googletest-build EXCLUDE_FROM_ALL)
+   enable_testing()
+ endif()
+ 
diff --git a/debian/patches/0006-Fix-a-bug-in-the-calculation-of-DictGroup-keyMaxLeng.patch b/debian/patches/0006-Fix-a-bug-in-the-calculation-of-DictGroup-keyMaxLeng.patch
new file mode 100644
index 0000000..d89f9a3
--- /dev/null
+++ b/debian/patches/0006-Fix-a-bug-in-the-calculation-of-DictGroup-keyMaxLeng.patch
@@ -0,0 +1,97 @@
+From: Carbo Kuo <byvoid@byvoid.com>
+Date: Thu, 25 Feb 2021 20:48:50 +0900
+Subject: Fix a bug in the calculation of DictGroup::keyMaxLength.
+
+The length should be the maximum of all sub-dictionaries in the dictionary group.
+---
+ src/DictGroup.cpp     | 16 ++++++++++++++--
+ src/DictGroupTest.cpp | 32 ++++++++++++++++++++++++--------
+ 2 files changed, 38 insertions(+), 10 deletions(-)
+
+diff --git a/src/DictGroup.cpp b/src/DictGroup.cpp
+index 4ca9e33..c682e96 100644
+--- a/src/DictGroup.cpp
++++ b/src/DictGroup.cpp
+@@ -1,7 +1,7 @@
+ /*
+  * Open Chinese Convert
+  *
+- * Copyright 2010-2014 Carbo Kuo <byvoid@byvoid.com>
++ * Copyright 2010-2021 Carbo Kuo <byvoid@byvoid.com>
+  *
+  * Licensed under the Apache License, Version 2.0 (the "License");
+  * you may not use this file except in compliance with the License.
+@@ -24,8 +24,20 @@
+ 
+ using namespace opencc;
+ 
++namespace {
++
++size_t GetKeyMaxLength(const std::list<DictPtr>& dicts) {
++  size_t keyMaxLength = 0;
++  for (const DictPtr& dict : dicts) {
++    keyMaxLength = std::max(keyMaxLength, dict->KeyMaxLength());
++  }
++  return keyMaxLength;
++}
++
++} // namespace
++
+ DictGroup::DictGroup(const std::list<DictPtr>& _dicts)
+-    : keyMaxLength(0), dicts(_dicts) {}
++    : keyMaxLength(GetKeyMaxLength(_dicts)), dicts(_dicts) {}
+ 
+ DictGroup::~DictGroup() {}
+ 
+diff --git a/src/DictGroupTest.cpp b/src/DictGroupTest.cpp
+index 7003506..c91731e 100644
+--- a/src/DictGroupTest.cpp
++++ b/src/DictGroupTest.cpp
<skip, upstream test file>
diff --git a/debian/patches/0007-Fix-a-severe-performance-bug-in-Conversion-Convert-t.patch b/debian/patches/0007-Fix-a-severe-performance-bug-in-Conversion-Convert-t.patch
new file mode 100644
index 0000000..a614830
--- /dev/null
+++ b/debian/patches/0007-Fix-a-severe-performance-bug-in-Conversion-Convert-t.patch
@@ -0,0 +1,188 @@
+From: Carbo Kuo <byvoid@byvoid.com>
+Date: Thu, 25 Feb 2021 21:13:38 +0900
+Subject: Fix a severe performance bug in `Conversion::Convert` that caused
+ O(N^2) complexity.
+
<skip, upstream log commit message>
+This bug was reported in https://github.com/BYVoid/OpenCC/issues/478 and https://github.com/BYVoid/OpenCC/issues/517.
+---
+ src/Dict.hpp                  |  7 ++++
+ src/benchmark/Performance.cpp | 77 ++++++++++++++++++++++++++++++++++---------
+ 2 files changed, 69 insertions(+), 15 deletions(-)
+
+diff --git a/src/Dict.hpp b/src/Dict.hpp
+index 461d6d2..1c81034 100644
+--- a/src/Dict.hpp
++++ b/src/Dict.hpp
+@@ -49,6 +49,13 @@ public:
+   virtual Optional<const DictEntry*> MatchPrefix(const char* word,
+                                                  size_t len) const;
+ 
++  /**
++   * Matches the longest matched prefix of a word.
++   */
++  Optional<const DictEntry*> MatchPrefix(const char* word) const {
++    return MatchPrefix(word, KeyMaxLength());
++  }
++
+   /**
+    * Matches the longest matched prefix of a word.
+    */
+diff --git a/src/benchmark/Performance.cpp b/src/benchmark/Performance.cpp
+index cf8d3aa..d1b6468 100644
+--- a/src/benchmark/Performance.cpp
++++ b/src/benchmark/Performance.cpp
+@@ -1,7 +1,26 @@
<skip, upstream performance test>
diff --git a/debian/patches/series b/debian/patches/series
index 74933a6..ad3c3d6 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -1,4 +1,7 @@
-use-cmake-install-libdir.patch
-0003-use-system-libraries.patch
+0001-use-cmake-install-libdir.patch
+0002-use-system-libraries.patch
 0003-data-Explicitly-use-python3.patch
-no-remote-images-when-reading-docs-on-disk.patch
+0004-no-remote-images-when-reading-docs-on-disk.patch
+0005-Use-system-googletest.patch
+0006-Fix-a-bug-in-the-calculation-of-DictGroup-keyMaxLeng.patch
+0007-Fix-a-severe-performance-bug-in-Conversion-Convert-t.patch
diff --git a/debian/rules b/debian/rules
index 9bfc988..250a007 100755
--- a/debian/rules
+++ b/debian/rules
@@ -8,8 +8,10 @@ export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 
 include /usr/share/dpkg/architecture.mk
 
+# Disable parallel build to circumvent some random build error
+# (needs further investigation)
 %:
-       dh $@ --buildsystem=cmake --with pkgkde_symbolshelper
+       dh $@ --buildsystem=cmake --with pkgkde_symbolshelper --no-parallel
 
 BUILD_OPTIONS = \
     -DCMAKE_INSTALL_PREFIX=/usr \
@@ -23,6 +25,11 @@ else
 BUILD_OPTIONS += -DBUILD_DOCUMENTATION=OFF
 endif
 
+ifeq (,$(filter nocheck,$(DEB_BUILD_OPTIONS)))
+BUILD_OPTIONS += -DENABLE_GTEST=ON
+else
+BUILD_OPTIONS += -DENABLE_GTEST=OFF
+endif
 
 override_dh_auto_configure:
        dh_auto_configure -- $(BUILD_OPTIONS)
diff --git a/debian/tests/CMakeLists.txt b/debian/tests/CMakeLists.txt
new file mode 100644
index 0000000..e6f88c7
--- /dev/null
+++ b/debian/tests/CMakeLists.txt
@@ -0,0 +1,24 @@
+cmake_minimum_required(VERSION 3.18)
+project (opencc-integration CXX)
+include (CTest)
+enable_testing()
+
+find_package(PkgConfig REQUIRED)
+find_program (OPENCC_TOOL opencc REQUIRED)
+pkg_check_modules(OPENCC REQUIRED IMPORTED_TARGET opencc)
+
+add_definitions(
+  -DCMAKE_BINARY_DIR="${CMAKE_BINARY_DIR}"
+  -DCMAKE_SOURCE_DIR="${CMAKE_SOURCE_DIR}"
+  -DPROJECT_BINARY_PATH="${OPENCC_TOOL}"
+)
+make_directory(${CMAKE_BINARY_DIR}/test)
+add_subdirectory(/usr/src/googletest/googletest ${CMAKE_BINARY_DIR}/googletest-build EXCLUDE_FROM_ALL)
+set(UNITTESTS
+  CommandLineConvertTest
+)
+foreach(UNITTEST ${UNITTESTS})
+  add_executable(${UNITTEST} ${UNITTEST}.cpp)
+  target_link_libraries(${UNITTEST} gtest gtest_main PkgConfig::OPENCC)
+  add_test(${UNITTEST} ${UNITTEST})
+endforeach(UNITTEST)
diff --git a/debian/tests/CommandLineConvertTest.cpp b/debian/tests/CommandLineConvertTest.cpp
new file mode 100644
index 0000000..6ca6068
--- /dev/null
+++ b/debian/tests/CommandLineConvertTest.cpp
@@ -0,0 +1,104 @@
+/*
+ * Open Chinese Convert
+ *
+ * Copyright 2015 Carbo Kuo <byvoid@byvoid.com>
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <fstream>
+
+#include "Common.hpp"
+#include "gtest/gtest.h"
+
+namespace opencc {
+
+class CommandLineConvertTest : public ::testing::Test {
+protected:
+  CommandLineConvertTest() { GetCurrentWorkingDirectory(); }
+
+  virtual ~CommandLineConvertTest() { free(originalWorkingDirectory); }
+
+  virtual void SetUp() {
+    ASSERT_NE("", CMAKE_BINARY_DIR);
+    ASSERT_NE("", CMAKE_SOURCE_DIR);
+    ASSERT_EQ(0, chdir("/usr/share/opencc/"));
+  }
+
+  virtual void TearDown() { ASSERT_EQ(0, chdir(originalWorkingDirectory)); }
+
+  std::string GetFileContents(const std::string& fileName) const {
+    std::ifstream fs(fileName);
+    EXPECT_TRUE(fs.is_open());
+    const std::string content((std::istreambuf_iterator<char>(fs)),
+                              (std::istreambuf_iterator<char>()));
+    fs.close();
+    return content;
+  }
+
+  void GetCurrentWorkingDirectory() {
+    originalWorkingDirectory = getcwd(nullptr, 0);
+  }
+
+  const char* OpenccCommand() const {
+    return PROJECT_BINARY_PATH;
+  }
+
+  const char* InputDirectory() const {
+    return CMAKE_SOURCE_DIR "/../../test/testcases/";
+  }
+
+  const char* OutputDirectory() const { return CMAKE_BINARY_DIR "/test/"; }
+
+  const char* AnswerDirectory() const {
+    return CMAKE_SOURCE_DIR "/../../test/testcases/";
+  }
+
+  const char* ConfigurationDirectory() const {
+    return "/usr/share/opencc/";
+  }
+
+  std::string OutputFile(const char* config) const {
+    return std::string(OutputDirectory()) + config + ".out";
+  }
+
+  std::string AnswerFile(const char* config) const {
+    return std::string(AnswerDirectory()) + config + ".ans";
+  }
+
+  std::string TestCommand(const char* config) const {
+    return OpenccCommand() + std::string("") + " -i " + InputDirectory() +
+           config + ".in" + " -o " + OutputFile(config) + " -c " +
+           ConfigurationDirectory() + config + ".json";
+  }
+
+  char* originalWorkingDirectory;
+};
+
+class ConfigurationTest : public CommandLineConvertTest,
+                          public ::testing::WithParamInterface<const char*> {};
+
+TEST_P(ConfigurationTest, Convert) {
+  const char* config = GetParam();
+  ASSERT_EQ(0, system(TestCommand(config).c_str()));
+  const std::string& output = GetFileContents(OutputFile(config));
+  const std::string& answer = GetFileContents(AnswerFile(config));
+  ASSERT_EQ(answer, output);
+}
+
+INSTANTIATE_TEST_CASE_P(CommandLine, ConfigurationTest,
+                        ::testing::Values("hk2s", "hk2t", "jp2t", "s2hk", "s2t",
+                                          "s2tw", "s2twp", "t2hk", "t2jp", "t2s",
+                                          "tw2s", "tw2sp", "tw2t"));
+
+} // namespace opencc
diff --git a/debian/tests/README.md b/debian/tests/README.md
new file mode 100644
index 0000000..12d6cec
--- /dev/null
+++ b/debian/tests/README.md
@@ -0,0 +1,3 @@
+Fork from ../../test, please refresh this file if source has changed.
+
+Test with installed opencc tool and library.
diff --git a/debian/tests/control b/debian/tests/control
new file mode 100644
index 0000000..36df47b
--- /dev/null
+++ b/debian/tests/control
@@ -0,0 +1,8 @@
+Tests: integration
+Depends:
+ build-essential,
+ cmake,
+ googletest,
+ pkg-config,
+ @,
+Restrictions: allow-stderr,
diff --git a/debian/tests/integration b/debian/tests/integration
new file mode 100755
index 0000000..c1f1746
--- /dev/null
+++ b/debian/tests/integration
@@ -0,0 +1,12 @@
+#!/bin/sh
+
+set -ex
+
+cd "$(dirname "$0")"
+mkdir -p build
+cd build
+cmake ..
+make
+make test
+cd ..
+rm -rf build

Attachment: signature.asc
Description: PGP signature


--- End Message ---
--- Begin Message ---
Hi,

On 07-03-2021 17:39, Shengjing Zhu wrote:
> Please unblock package opencc

Unblocked.

Paul
PS: please don't rename patches during the freeze period, really
annoying to review.

Attachment: OpenPGP_signature
Description: OpenPGP digital signature


--- End Message ---

Reply to: