pixman: Changes to 'debian-unstable'
.gitignore | 46
ChangeLog | 1955 +++++++++++++++++++++++
configure.ac | 5
debian/changelog | 17
debian/control | 9
debian/patches/ppc64el.diff | 14
debian/patches/series | 1
debian/rules | 5
pixman/Makefile.am | 2
pixman/pixman-arm-asm.h | 37
pixman/pixman-arm-common.h | 11
pixman/pixman-arm-neon-asm-bilinear.S | 12
pixman/pixman-arm-neon-asm.S | 12
pixman/pixman-arm-neon-asm.h | 20
pixman/pixman-arm-neon.c | 24
pixman/pixman-arm-simd-asm-scaled.S | 11
pixman/pixman-arm-simd-asm.S | 525 ++++++
pixman/pixman-arm-simd-asm.h | 116 +
pixman/pixman-arm-simd.c | 44
pixman/pixman-combine-float.c | 338 ++--
pixman/pixman-combine32.c | 1686 +-------------------
pixman/pixman-fast-path.c | 2
pixman/pixman-general.c | 27
pixman/pixman-gradient-walker.c | 2
pixman/pixman-inlines.h | 3
pixman/pixman-mips-dspr2-asm.S | 2
pixman/pixman-mips-dspr2-asm.h | 4
pixman/pixman-mips-dspr2.c | 10
pixman/pixman-mips-dspr2.h | 8
pixman/pixman-mmx.c | 109 +
pixman/pixman-private.h | 6
pixman/pixman-sse2.c | 24
pixman/pixman-vmx.c | 1315 +++++++++++++++-
pixman/pixman.c | 18
test/Makefile.sources | 60
test/affine-bench.c | 436 +++++
test/blitters-test.c | 20
test/check-formats.c | 176 --
test/composite.c | 11
test/lowlevel-blt-bench.c | 507 +++++-
test/pixel-test.c | 2780 +++++++++++++++++++++++++++++++++-
test/radial-invalid.c | 54
test/solid-test.c | 353 ++++
test/thread-test.c | 29
test/tolerance-test.c | 360 ++++
test/utils.c | 653 ++++++-
test/utils.h | 13
47 files changed, 9417 insertions(+), 2455 deletions(-)
New commits:
commit 42fab57651e2ebdde5d260ae76809a2500086839
Author: Andreas Boll <andreas.boll.dev@gmail.com>
Date: Fri Sep 4 13:40:42 2015 +0200
Bump standards version to 3.9.6.
diff --git a/debian/changelog b/debian/changelog
index 245fb5c..e73a52d 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -6,6 +6,7 @@ pixman (0.33.2-1) UNRELEASED; urgency=medium
* Update Vcs-* fields.
* Add upstream url.
* Drop XC- prefix from Package-Type field.
+ * Bump standards version to 3.9.6.
[ intrigeri ]
* Simplify hardening build flags handling (closes: #760100).
diff --git a/debian/control b/debian/control
index c78d8b6..6188e41 100644
--- a/debian/control
+++ b/debian/control
@@ -7,7 +7,7 @@ Build-Depends:
dh-autoreconf,
pkg-config,
quilt,
-Standards-Version: 3.9.2
+Standards-Version: 3.9.6
Vcs-Git: https://anonscm.debian.org/git/pkg-xorg/lib/pixman.git
Vcs-Browser: https://anonscm.debian.org/cgit/pkg-xorg/lib/pixman.git
Homepage: http://pixman.org/
commit 56432ef5e5a38ddd77e23d10e1e8f724afcbedd8
Author: Andreas Boll <andreas.boll.dev@gmail.com>
Date: Fri Sep 4 13:38:49 2015 +0200
Drop XC- prefix from Package-Type field.
diff --git a/debian/changelog b/debian/changelog
index e6627d6..245fb5c 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -5,6 +5,7 @@ pixman (0.33.2-1) UNRELEASED; urgency=medium
* Enable vmx on ppc64el (closes: #786345).
* Update Vcs-* fields.
* Add upstream url.
+ * Drop XC- prefix from Package-Type field.
[ intrigeri ]
* Simplify hardening build flags handling (closes: #760100).
diff --git a/debian/control b/debian/control
index 03277a6..c78d8b6 100644
--- a/debian/control
+++ b/debian/control
@@ -28,7 +28,7 @@ Description: pixel-manipulation library for X and cairo
Package: libpixman-1-0-udeb
Section: debian-installer
-XC-Package-Type: udeb
+Package-Type: udeb
Architecture: any
Depends:
${shlibs:Depends},
commit c0f98e1cf4fa897eb67a3ef737b24deacda5ae7e
Author: Andreas Boll <andreas.boll.dev@gmail.com>
Date: Fri Sep 4 11:47:45 2015 +0200
Add upstream url.
diff --git a/debian/changelog b/debian/changelog
index 05d7550..e6627d6 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -4,6 +4,7 @@ pixman (0.33.2-1) UNRELEASED; urgency=medium
* New upstream release candidate.
* Enable vmx on ppc64el (closes: #786345).
* Update Vcs-* fields.
+ * Add upstream url.
[ intrigeri ]
* Simplify hardening build flags handling (closes: #760100).
diff --git a/debian/control b/debian/control
index a56b239..03277a6 100644
--- a/debian/control
+++ b/debian/control
@@ -10,6 +10,7 @@ Build-Depends:
Standards-Version: 3.9.2
Vcs-Git: https://anonscm.debian.org/git/pkg-xorg/lib/pixman.git
Vcs-Browser: https://anonscm.debian.org/cgit/pkg-xorg/lib/pixman.git
+Homepage: http://pixman.org/
Package: libpixman-1-0
Section: libs
commit 03e2d2138b1248c79658e5edeaf66b283a278ff2
Author: Andreas Boll <andreas.boll.dev@gmail.com>
Date: Fri Sep 4 11:46:39 2015 +0200
Update Vcs-* fields.
diff --git a/debian/changelog b/debian/changelog
index 4cdb1aa..05d7550 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -3,6 +3,7 @@ pixman (0.33.2-1) UNRELEASED; urgency=medium
[ Andreas Boll ]
* New upstream release candidate.
* Enable vmx on ppc64el (closes: #786345).
+ * Update Vcs-* fields.
[ intrigeri ]
* Simplify hardening build flags handling (closes: #760100).
diff --git a/debian/control b/debian/control
index 18a1b7f..a56b239 100644
--- a/debian/control
+++ b/debian/control
@@ -8,8 +8,8 @@ Build-Depends:
pkg-config,
quilt,
Standards-Version: 3.9.2
-Vcs-Git: git://git.debian.org/git/pkg-xorg/lib/pixman
-Vcs-Browser: http://git.debian.org/?p=pkg-xorg/lib/pixman.git
+Vcs-Git: https://anonscm.debian.org/git/pkg-xorg/lib/pixman.git
+Vcs-Browser: https://anonscm.debian.org/cgit/pkg-xorg/lib/pixman.git
Package: libpixman-1-0
Section: libs
commit e6fce5e4e47a7a1597defa0c8f89eba0222b8953
Author: intrigeri <intrigeri@debian.org>
Date: Sun Aug 31 16:56:42 2014 +0000
Update changelog.
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
diff --git a/debian/changelog b/debian/changelog
index 37ddf53..4cdb1aa 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,8 +1,14 @@
pixman (0.33.2-1) UNRELEASED; urgency=medium
+ [ Andreas Boll ]
* New upstream release candidate.
* Enable vmx on ppc64el (closes: #786345).
+ [ intrigeri ]
+ * Simplify hardening build flags handling (closes: #760100).
+ Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
+ * Enable all hardening build flags. Thanks to Simon Ruderich too.
+
-- Andreas Boll <andreas.boll.dev@gmail.com> Fri, 04 Sep 2015 11:29:52 +0200
pixman (0.32.6-3) sid; urgency=medium
commit 7bc925aa5056ea114822bd9d06d94852946ba3d4
Author: intrigeri <intrigeri@debian.org>
Date: Sun Aug 31 16:54:54 2014 +0000
Enable all hardening build flags. Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
Quoting Simon again: "It currently has the same effect as hardening=+bindnow,
but will automatically enable future hardening options and in case the package
will ever build binaries those are immediately protected with PIE as well."
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
diff --git a/debian/rules b/debian/rules
index 99d67fc..a0e0b9e 100755
--- a/debian/rules
+++ b/debian/rules
@@ -3,7 +3,7 @@
PACKAGE = libpixman-1-0
SHLIBS = 0.25.2
-export DEB_BUILD_MAINT_OPTIONS = hardening=+bindnow
+export DEB_BUILD_MAINT_OPTIONS = hardening=+all
# Disable Gtk+ autodetection:
override_dh_auto_configure:
commit 2fb4da778cc2ce30df4e1e692dc82d00c6593137
Author: intrigeri <intrigeri@debian.org>
Date: Sun Aug 31 16:53:25 2014 +0000
Simplify hardening build flags handling. Thanks to Simon Ruderich <simon@ruderich.org> for the patch.
Quoting Simon Ruderich <simon@ruderich.org>:
"There's no need to use dpkg-buildflags manually in debian/rules.
Debhelper with compat=9 automatically enables the hardening flags when
dh_auto_configure is used. So just by calling dh_auto_configure [...]
the hardening flags get automatically passed to the build system.
DEB_BUILD_MAINT_OPTIONS is also respected."
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
diff --git a/debian/rules b/debian/rules
index a8100d2..99d67fc 100755
--- a/debian/rules
+++ b/debian/rules
@@ -11,8 +11,7 @@ override_dh_auto_configure:
# changelog entry:
LS_CFLAGS=" " dh_auto_configure -- --disable-gtk \
--disable-silent-rules \
- --disable-arm-iwmmxt \
- $(shell dpkg-buildflags --export=configure)
+ --disable-arm-iwmmxt
# Install in debian/tmp to retain control through dh_install:
override_dh_auto_install:
commit e47fb32ae3180d847a4f0e8f88f71174004b90b3
Author: Andreas Boll <andreas.boll.dev@gmail.com>
Date: Fri Sep 4 11:34:44 2015 +0200
Enable vmx on ppc64el (closes: #786345).
diff --git a/debian/changelog b/debian/changelog
index 7db916f..37ddf53 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,6 +1,7 @@
pixman (0.33.2-1) UNRELEASED; urgency=medium
* New upstream release candidate.
+ * Enable vmx on ppc64el (closes: #786345).
-- Andreas Boll <andreas.boll.dev@gmail.com> Fri, 04 Sep 2015 11:29:52 +0200
diff --git a/debian/patches/ppc64el.diff b/debian/patches/ppc64el.diff
deleted file mode 100644
index 34a4aa0..0000000
--- a/debian/patches/ppc64el.diff
+++ /dev/null
@@ -1,14 +0,0 @@
-diff --git a/configure.ac b/configure.ac
-index dce76b3..172de8b 100644
---- a/configure.ac
-+++ b/configure.ac
-@@ -540,6 +540,9 @@ AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
- #if defined(__GNUC__) && (__GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ < 4))
- #error "Need GCC >= 3.4 for sane altivec support"
- #endif
-+#if defined(__PPC64__) && (__BYTE_ORDER__==__ORDER_LITTLE_ENDIAN__)
-+#error VMX utilization is still not ready on ppc64el
-+#endif
- #include <altivec.h>
- int main () {
- vector unsigned int v = vec_splat_u32 (1);
diff --git a/debian/patches/series b/debian/patches/series
index eebecc8..708b774 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -1,2 +1 @@
-ppc64el.diff
test-increase-timeout.diff
commit 18e4bdcadf77910f2e22ce66b01b5bd98006c9fa
Author: Andreas Boll <andreas.boll.dev@gmail.com>
Date: Fri Sep 4 11:30:12 2015 +0200
Bump changelogs.
diff --git a/ChangeLog b/ChangeLog
index 2f951b8..96b8c28 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,10 +1,1548 @@
-commit 87eea99e443b389c978cf37efc52788bf03a0ee0
+commit ee790044b08e3b668e6aa5d9229f46ed7295ebf0
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Sat Aug 1 22:34:53 2015 +0300
+
+ Pre-release version bump to 0.33.2
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+
+commit 8d9be3619a906855a3e3a1e052317833cb24cabe
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Wed Jul 1 14:34:07 2015 +0300
+
+ vmx: implement fast path iterator vmx_fetch_a8
+
+ no changes were observed when running cairo trimmed benchmarks.
+
+ Running "lowlevel-blt-bench src_8_8888" on POWER8, 8 cores,
+ 3.4GHz, RHEL 7.1 ppc64le gave the following results:
+
+ reference memcpy speed = 25197.2MB/s (6299.3MP/s for 32bpp fills)
+
+ Before After Change
+ --------------------------------------------
+ L1 965.34 3936 +307.73%
+ L2 942.99 3436.29 +264.40%
+ M 902.24 2757.77 +205.66%
+ HT 448.46 784.99 +75.04%
+ VT 430.05 819.78 +90.62%
+ R 412.9 717.04 +73.66%
+ RT 168.93 220.63 +30.60%
+ Kops/s 1025 1303 +27.12%
+
+ It was benchmarked against commit id e2d211a from pixman/master
+
+ Siarhei Siamashka reported that on playstation3, it shows the following
+ results:
+
+ == before ==
+
+ src_8_8888 = L1: 194.37 L2: 198.46 M:155.90 (148.35%)
+ HT: 59.18 VT: 36.71 R: 38.93 RT: 12.79 ( 106Kops/s)
+
+ == after ==
+
+ src_8_8888 = L1: 373.96 L2: 391.10 M:245.81 (233.88%)
+ HT: 80.81 VT: 44.33 R: 48.10 RT: 14.79 ( 122Kops/s)
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit 47f74ca94637d79ee66c37a81eea0200e453fcc1
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Mon Jun 29 15:31:02 2015 +0300
+
+ vmx: implement fast path iterator vmx_fetch_x8r8g8b8
+
+ It was benchmarked against commit id 2be523b from pixman/master
+
+ POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
+
+ cairo trimmed benchmarks :
+
+ Speedups
+ ========
+ t-firefox-asteroids 533.92 -> 489.94 : 1.09x
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit fcbb97d4458d717b9c15858aedcbee2d33c8ac5a
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Sun Jun 28 23:25:24 2015 +0300
+
+ vmx: implement fast path scaled nearest vmx_8888_8888_OVER
+
+ It was benchmarked against commit id 2be523b from pixman/master
+
+ POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
+ reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)
+
+ Before After Change
+ ---------------------------------------------
+ L1 134.36 181.68 +35.22%
+ L2 135.07 180.67 +33.76%
+ M 134.6 180.51 +34.11%
+ HT 121.77 128.79 +5.76%
+ VT 120.49 145.07 +20.40%
+ R 93.83 102.3 +9.03%
+ RT 50.82 46.93 -7.65%
+ Kops/s 448 422 -5.80%
+
+ cairo trimmed benchmarks :
+
+ Speedups
+ ========
+ t-firefox-asteroids 533.92 -> 497.92 : 1.07x
+ t-midori-zoomed 692.98 -> 651.24 : 1.06x
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit ad612c4205f0ae46fc72a50e0c90ccd05487fcba
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Sun Jun 28 22:23:44 2015 +0300
+
+ vmx: implement fast path vmx_composite_src_x888_8888
+
+ It was benchmarked against commit id 2be523b from pixman/master
+
+ POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
+ reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)
+
+ Before After Change
+ ---------------------------------------------
+ L1 1115.4 5006.49 +348.85%
+ L2 1112.26 4338.01 +290.02%
+ M 1110.54 2524.15 +127.29%
+ HT 745.41 1140.03 +52.94%
+ VT 749.03 1287.13 +71.84%
+ R 423.91 547.6 +29.18%
+ RT 205.79 194.98 -5.25%
+ Kops/s 1414 1361 -3.75%
+
+ cairo trimmed benchmarks :
+
+ Speedups
+ ========
+ t-gnome-system-monitor 1402.62 -> 1212.75 : 1.16x
+ t-firefox-asteroids 533.92 -> 474.50 : 1.13x
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit fafc1d403b8405727d3918bcb605cb98044af90a
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Sun Jun 28 10:14:20 2015 +0300
+
+ vmx: implement fast path vmx_composite_over_n_8888_8888_ca
+
+ It was benchmarked against commit id 2be523b from pixman/master
+
+ POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
+
+ reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)
+
+ Before After Change
+ ---------------------------------------------
+ L1 61.92 244.91 +295.53%
+ L2 62.74 243.3 +287.79%
+ M 63.03 241.94 +283.85%
+ HT 59.91 144.22 +140.73%
+ VT 59.4 174.39 +193.59%
+ R 53.6 111.37 +107.78%
+ RT 37.99 46.38 +22.08%
+ Kops/s 436 506 +16.06%
+
+ cairo trimmed benchmarks :
+
+ Speedups
+ ========
+ t-xfce4-terminal-a1 1540.37 -> 1226.14 : 1.26x
+ t-firefox-talos-gfx 1488.59 -> 1209.19 : 1.23x
+
+ Slowdowns
+ =========
+ t-evolution 553.88 -> 581.63 : 1.05x
+ t-poppler 364.99 -> 383.79 : 1.05x
+ t-firefox-scrolling 1223.65 -> 1304.34 : 1.07x
+
+ The slowdowns can be explained in cases where the images are small and
+ un-aligned to 16-byte boundary. In that case, the function will first
+ work on the un-aligned area, even in operations of 1 byte. In case of
+ small images, the overhead of such operations can be more than the
+ savings we get from using the vmx instructions that are done on the
+ aligned part of the image.
+
+ In the C fast-path implementation, there is no special treatment for the
+ un-aligned part, as it works in 4 byte quantities on the entire image.
+
+ Because llbb is a synthetic test, I would assume it has much less
+ alignment issues than "real-world" scenario, such as cairo benchmarks,
+ which are basically recorded traces of real application activity.
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
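The alignment overhead described above can be illustrated with a minimal sketch. It is not pixman's code: the function names are hypothetical, a saturating 8-bit add stands in for the real operation, and the 16-byte body uses a plain inner loop where a VMX path would issue vector instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturating 8-bit add, used here as a stand-in per-byte operation. */
static uint8_t
add_sat_u8 (uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned) a + b;

    return (uint8_t) (sum > 255 ? 255 : sum);
}

/* Hypothetical sketch: dst[i] = saturate (dst[i] + src[i]) for n bytes. */
static void
add_8_8_sketch (uint8_t *dst, const uint8_t *src, size_t n)
{
    /* Head: one byte at a time until dst reaches a 16-byte boundary.
     * For small images this part can dominate the run time. */
    while (n > 0 && ((uintptr_t) dst & 15) != 0)
    {
        *dst = add_sat_u8 (*dst, *src);
        dst++;
        src++;
        n--;
    }

    /* Body: 16 bytes per iteration. A SIMD path would replace the
     * inner loop with vector loads, one vector add and a store. */
    while (n >= 16)
    {
        for (int i = 0; i < 16; i++)
            dst[i] = add_sat_u8 (dst[i], src[i]);

        dst += 16;
        src += 16;
        n -= 16;
    }

    /* Tail: whatever is left after the last full 16-byte block. */
    while (n > 0)
    {
        *dst = add_sat_u8 (*dst, *src);
        dst++;
        src++;
        n--;
    }
}
```

With a short, misaligned scanline most bytes go through the head and tail loops, so the vector body saves little and the extra branching is pure overhead.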
+commit a3e914407e354df70b9200e263608f1fc2e686cf
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Thu Jun 18 15:05:49 2015 +0300
+
+ vmx: implement fast path composite_add_8888_8888
+
+ Copied impl. from sse2 file and edited to use vmx functions
+
+ It was benchmarked against commit id 2be523b from pixman/master
+
+ POWER8, 16 cores, 3.4GHz, ppc64le :
+
+ reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)
+
+ Before After Change
+ ---------------------------------------------
+ L1 248.76 3284.48 +1220.34%
+ L2 264.09 2826.47 +970.27%
+ M 261.24 2405.06 +820.63%
+ HT 217.27 857.3 +294.58%
+ VT 213.78 980.09 +358.46%
+ R 176.61 442.95 +150.81%
+ RT 107.54 150.08 +39.56%
+ Kops/s 917 1125 +22.68%
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit d5b5343c7df99082597e0c37aec937dcf5b6602d
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Thu Jun 18 14:56:47 2015 +0300
+
+ vmx: implement fast path composite_add_8_8
+
+ Copied impl. from sse2 file and edited to use vmx functions
+
+ It was benchmarked against commit id 2be523b from pixman/master
+
+ POWER8, 16 cores, 3.4GHz, ppc64le :
+
+ reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)
+
+ Before After Change
+ ---------------------------------------------
+ L1 687.63 9140.84 +1229.33%
+ L2 715 7495.78 +948.36%
+ M 717.39 8460.14 +1079.29%
+ HT 569.56 1020.12 +79.11%
+ VT 520.3 1215.56 +133.63%
+ R 514.81 874.35 +69.84%
+ RT 341.28 305.42 -10.51%
+ Kops/s 1621 1579 -2.59%
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit 339eeaf095f949694d7f79a45171ac03a3b06f90
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Thu Jun 18 14:12:05 2015 +0300
+
+ vmx: implement fast path composite_over_8888_8888
+
+ Copied impl. from sse2 file and edited to use vmx functions
+
+ It was benchmarked against commit id 2be523b from pixman/master
+
+ POWER8, 16 cores, 3.4GHz, ppc64le :
+
+ reference memcpy speed = 27036.4MB/s (6759.1MP/s for 32bpp fills)
+
+ Before After Change
+ ---------------------------------------------
+ L1 129.47 1054.62 +714.57%
+ L2 138.31 1011.02 +630.98%
+ M 139.99 1008.65 +620.52%
+ HT 122.11 468.45 +283.63%
+ VT 121.06 532.21 +339.62%
+ R 108.48 240.5 +121.70%
+ RT 77.87 116.7 +49.87%
+ Kops/s 758 981 +29.42%
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit 0cc8a2e9714efcb7cdd7e2a94c9cba49c3e29e00
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Sun Jun 28 09:42:19 2015 +0300
+
+ vmx: implement fast path vmx_fill
+
+ Based on sse2 impl.
+
+ It was benchmarked against commit id e2d211a from pixman/master
+
+ Tested cairo trimmed benchmarks on POWER8, 8 cores, 3.4GHz,
+ RHEL 7.1 ppc64le :
+
+ speedups
+ ========
+ t-swfdec-giant-steps 1383.09 -> 718.63 : 1.92x speedup
+ t-gnome-system-monitor 1403.53 -> 918.77 : 1.53x speedup
+ t-evolution 552.34 -> 415.24 : 1.33x speedup
+ t-xfce4-terminal-a1 1573.97 -> 1351.46 : 1.16x speedup
+ t-firefox-paintball 847.87 -> 734.50 : 1.15x speedup
+ t-firefox-asteroids 565.99 -> 492.77 : 1.15x speedup
+ t-firefox-canvas-swscroll 1656.87 -> 1447.48 : 1.14x speedup
+ t-midori-zoomed 724.73 -> 642.16 : 1.13x speedup
+ t-firefox-planet-gnome 975.78 -> 911.92 : 1.07x speedup
+ t-chromium-tabs 292.12 -> 274.74 : 1.06x speedup
+ t-firefox-chalkboard 690.78 -> 653.93 : 1.06x speedup
+ t-firefox-talos-gfx 1375.30 -> 1303.74 : 1.05x speedup
+ t-firefox-canvas-alpha 1016.79 -> 967.24 : 1.05x speedup
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit c12ee95089e7d281a29a24bf56b81f5c16dec6ee
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Sun Jun 28 09:42:08 2015 +0300
+
+ vmx: add helper functions
+
+ This patch adds the following helper functions for reuse of code,
+ hiding BE/LE differences and maintainability.
+
+ All of the functions were defined as static force_inline.
+
+ Names were copied from pixman-sse2.c so conversion of fast-paths between
+ sse2 and vmx would be easier from now on. Therefore, I tried to keep the
+ input/output of the functions to be as close as possible to the sse2
+ definitions.
+
+ The functions are:
+
+ - load_128_aligned : load 128-bit from a 16-byte aligned memory
+ address into a vector
+
+ - load_128_unaligned : load 128-bit from memory into a vector,
+ without guarantee of alignment for the
+ source pointer
+
+ - save_128_aligned : save 128-bit vector into a 16-byte aligned
+ memory address
+
+ - create_mask_16_128 : take a 16-bit value and fill with it
+ a new vector
+
+ - create_mask_1x32_128 : take a 32-bit pointer and fill a new
+ vector with the 32-bit value from that pointer
+
+ - create_mask_32_128 : take a 32-bit value and fill with it
+ a new vector
+
+ - unpack_32_1x128 : unpack 32-bit value into a vector
+
+ - unpacklo_128_16x8 : unpack the eight low 8-bit values of a vector
+
+ - unpackhi_128_16x8 : unpack the eight high 8-bit values of a vector
+
+ - unpacklo_128_8x16 : unpack the four low 16-bit values of a vector
+
+ - unpackhi_128_8x16 : unpack the four high 16-bit values of a vector
+
+ - unpack_128_2x128 : unpack the eight low 8-bit values of a vector
+ into one vector and the eight high 8-bit
+ values into another vector
+
+ - unpack_128_2x128_16 : unpack the four low 16-bit values of a vector
+ into one vector and the four high 16-bit
+ values into another vector
+
+ - unpack_565_to_8888 : unpack an RGB_565 vector to 8888 vector
+
+ - pack_1x128_32 : pack a vector and return the LSB 32-bit of it
+
+ - pack_2x128_128 : pack two vectors into one and return it
+
+ - negate_2x128 : xor two vectors with mask_00ff (separately)
+
+ - is_opaque : returns whether all the pixels contained in
+ the vector are opaque
+
+ - is_zero : returns whether the vector equals 0
+
+ - is_transparent : returns whether all the pixels
+ contained in the vector are transparent
+
+ - expand_pixel_8_1x128 : expand an 8-bit pixel into lower 8 bytes of a
+ vector
+
+ - expand_alpha_1x128 : expand alpha from vector and return the new
+ vector
+
+ - expand_alpha_2x128 : expand alpha from one vector and another alpha
+ from a second vector
+
+ - expand_alpha_rev_2x128 : expand a reversed alpha from one vector and
+ another reversed alpha from a second vector
+
+ - pix_multiply_2x128 : do pix_multiply for two vectors (separately)
+
+ - over_2x128 : perform over op. on two vectors
+
+ - in_over_2x128 : perform in-over op. on two vectors
+
+ v2: removed expand_pixel_32_1x128 as it was not used by any function and
+ its implementation was erroneous
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
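For readers unfamiliar with the operations these helpers vectorize, here is a scalar, single-pixel sketch of three of them: the multiply, the opacity check and the over operator. It assumes premultiplied a8r8g8b8 pixels. The names `mul_un8`, `is_opaque_1` and `over_1` are illustrative and do not appear in pixman-vmx.c, where each helper works on whole 128-bit vectors instead.

```c
#include <stdint.h>

/* Multiply two 8-bit channels, treating 255 as 1.0, with rounding. */
static uint8_t
mul_un8 (uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t) a * b + 0x80;

    return (uint8_t) ((t + (t >> 8)) >> 8);
}

/* Alpha channel of an a8r8g8b8 pixel. */
static uint8_t
alpha_of (uint32_t p)
{
    return (uint8_t) (p >> 24);
}

/* Single-pixel counterpart of an "is opaque" check. */
static int
is_opaque_1 (uint32_t p)
{
    return alpha_of (p) == 0xff;
}

/* OVER for premultiplied pixels, channel by channel:
 * result = src + dst * (255 - src_alpha) / 255, saturated at 255. */
static uint32_t
over_1 (uint32_t src, uint32_t dst)
{
    uint8_t  inv_a = (uint8_t) (255 - alpha_of (src));
    uint32_t result = 0;

    for (int shift = 0; shift < 32; shift += 8)
    {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = mul_un8 ((uint8_t) ((dst >> shift) & 0xff), inv_a);
        uint32_t sum = s + d;

        if (sum > 255)
            sum = 255;

        result |= sum << shift;
    }

    return result;
}
```

An opaque source replaces the destination outright, which is why a fast path can test for opacity first and skip the multiply.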
+commit 034149537be94862b43fb09699b8c2149bfe948d
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Thu Jul 2 11:04:20 2015 +0300
+
+ vmx: add LOAD_VECTOR macro
+
+ This patch adds a macro for loading a single vector.
+ It also makes the other LOAD_VECTORx macros use this macro as a base so
+ code would be re-used.
+
+ In addition, I fixed minor coding style issues.
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit 744134025609a0a5805c2d3b4d34856eb75cb711
+Author: Nemanja Lukic <nemanja.lukic@rt-rk.com>
+Date: Fri Jun 27 18:05:39 2014 +0200
+
+ MIPS: update author's e-mail address
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+
+commit e2d211ac491cd9884aae7ccaf18e5b3042469cf2
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 13:54:01 2015 +0300
+
+ lowlevel-blt-bench: add option to skip memcpy measurement
+
+ The memcpy speed measurement takes several seconds. When you are running
+ single tests in a harness that iterates dozens or hundreds of times, the
+ repeated measurements are redundant and take a lot of time. It is also
+ an open question whether the measured speed changes over long test runs
+ due to unidentified platform reasons (Raspberry Pi).
+
+ Add a command line option to set the reference memcpy speed, skipping
+ the measuring.
+
+ The speed is mainly used to compute how many iterations to run inside
+ the bench_*() functions, so for repeated testing on the same hardware,
+ it makes sense to lock that number to a constant.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
+commit 31cb0d4267f4f358b62f75fd42c4b1ae625be7ee
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 13:20:47 2015 +0300
+
+ lowlevel-blt-bench: add CSV output mode
+
+ Add a command line option for choosing CSV output mode.
+
+ In CSV mode, only the results in Mpixels/s are printed in an easily
+ machine-parseable format. All user-friendly printing is suppressed.
+
+ This is intended for cases where you benchmark one particular operation
+ at a time. Running the "all" set of benchmarks will print just fine, but
+ you may have trouble matching rows to operations as you have to look at
+ the tests_tbl[] to see what row is which.
+
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
+ v2: don't add a space after comma in CSV.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+
+commit 9a7e0bc6d08c0324f09d6440270cd07201929f3f
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 12:41:57 2015 +0300
+
+ lowlevel-blt-bench: refactor to Mpx_per_sec()
+
+ Refactor the Mpixels/s computations into a function. Easier to read and
+ better documents what is being computed.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
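As a rough illustration of the refactoring described above, a helper of this kind might look like the following. The name, signature and zero-duration guard are assumptions made for this sketch. The actual Mpx_per_sec() in test/lowlevel-blt-bench.c may differ.

```c
#include <stdint.h>

/* Hypothetical helper: convert a pixel count and an elapsed time in
 * seconds into megapixels per second. Returns 0 for a non-positive
 * duration so the caller never divides by zero. */
static double
mpx_per_sec_sketch (uint64_t pix_cnt, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;

    return (double) pix_cnt / seconds / 1e6;
}
```

Keeping the conversion in one place means every bench function can report its own pixel count and share the same arithmetic.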
+commit 6e9c48c579e3325506234fa2ee7635f08f2c5a33
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 12:53:09 2015 +0300
+
+ lowlevel-blt-bench: all bench funcs to return pix_cnt
+
+ The bench_* functions, that did not already do it, are modified to
+ return the number of pixels processed during the benchmark. This moves
+ the computation to the site that actually determines the number, and
+ simplifies bench_composite() a bit.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
+commit 9e8f2bcaf5fabd3729ee0ecc90009fd6cea9e8e9
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 12:02:17 2015 +0300
+
+ lowlevel-blt-bench: move speed and scaling printing
+
+ Move the printing of the memory speed and scaling mode into a new
+ function. This will help with implementing a machine-readable output
+ option.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
+commit a33c2e6853fe0a76da42a43ed7ed9095e2dbe6a2
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 11:56:39 2015 +0300
+
+ lowlevel-blt-bench: print single pattern details
+
+ When given just a single test pattern instead of "all", print the test
+ details. This can be used to verify the pattern parser agrees with the
+ user, just like scaling settings are printed.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
+commit 3ac7ae201758fe99627fdb2adf783be4063a9b1f
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 11:34:45 2015 +0300
+
+ lowlevel-blt-bench: make test_entry::testname const
+
+ We assign string literals to it, so it better be const.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
+commit 56d8b365f5944bf78a427ac65c5a0d0311e0da5e
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 11:21:14 2015 +0300
+
+ lowlevel-blt-bench: move explanation printing
+
+ Move explanation printing to a new function. This will help with
+ implementing a machine-readable output option.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
+commit bddff993ed734f4b9030c1960bcb3ebe1caca807
+Author: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+Date: Wed Jun 10 11:14:38 2015 +0300
+
+ lowlevel-blt-bench: move usage to a function
+
+ Move printing of usage into a new function and use argv[0] as the
+ program name. This will help printing usage from multiple places.
+
+ Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Reviewed-by: Ben Avison <bavison@riscosopen.org>
+
+commit 2be523b20402b7c9f548ac33b8c0f0ed00156c64
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Thu Jun 25 15:59:57 2015 +0300
+
+ vmx: fix pix_multiply for ppc64le
+
+ vec_mergeh/l operates differently for BE and LE, because of the order of
+ the vector elements (l->r in BE and r->l in LE).
+ To fix that, we simply need to swap between the input parameters, in case
+ we are working in LE.
+
+ v2:
+
+ - replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
+ - fixed whitespaces and indentation issues
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Reviewed-by: Adam Jackson <ajax@redhat.com>
+ Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+
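Why the operands have to be swapped can be shown with a portable emulation. This is not AltiVec code: `merge_first_half` is a hypothetical stand-in that interleaves bytes in memory order, and the real vec_mergeh semantics depend on the compiler's element numbering. The sketch only shows that zero-extending bytes to 16-bit lanes by interleaving with zeros needs the zero on opposite sides on big-endian and little-endian hosts.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int
host_is_little_endian (void)
{
    const uint16_t probe = 1;
    uint8_t        first;

    memcpy (&first, &probe, 1);
    return first == 1;
}

/* Interleave the first 8 bytes of a and b: a0 b0 a1 b1 ... a7 b7. */
static void
merge_first_half (const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    for (int i = 0; i < 8; i++)
    {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Zero-extend the first 8 bytes of px into eight 16-bit lanes. */
static void
widen_first_half (const uint8_t px[16], uint16_t out[8])
{
    static const uint8_t zero[16] = { 0 };
    uint8_t              merged[16];

    if (host_is_little_endian ())
        merge_first_half (px, zero, merged);  /* low byte comes first */
    else
        merge_first_half (zero, px, merged);  /* high byte comes first */

    memcpy (out, merged, sizeof merged);
}
```

Passing the operands in the big-endian order on a little-endian host would put each pixel byte in the high half of its lane, which is the kind of error the commit above corrects.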
+commit 8d379ad88e208bed9697065f6911c9ef83d85276
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Thu Jun 25 15:59:56 2015 +0300
+
+ vmx: fix unused var warnings
+
+ v2: don't put ';' at the end of macro definition. Instead, move it to
+ each line the macro is used.
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Reviewed-by: Adam Jackson <ajax@redhat.com>
+ Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+
+commit ff66a4a3ce95f2adcbf30b354eac60944596d6a2
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Thu Jun 25 15:59:55 2015 +0300
+
+ vmx: encapsulate the temporary variables inside the macros
+
+ v2: fixed whitespaces and indentation issues
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Reviewed-by: Adam Jackson <ajax@redhat.com>
+ Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+
+commit f6a26d09257dde9cd41144120543c8b754de515f
+Author: Fernando Seiti Furusato <ferseiti@linux.vnet.ibm.com>
+Date: Thu Jun 25 15:59:54 2015 +0300
+
+ vmx: adjust macros when loading vectors on ppc64le
+
+ Replaced usage of vec_lvsl with a direct unaligned assignment
+ operation (=). That is because, according to Power ABI Specification,
+ the usage of lvsl is deprecated on ppc64le.
+
+ Changed COMPUTE_SHIFT_{MASK,MASKS,MASKC} macro usage to no-op for powerpc
+ little endian since unaligned access is supported on ppc64le.
+
+ v2:
+
+ - replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
+ - fixed whitespaces and indentation issues
+
+ Signed-off-by: Fernando Seiti Furusato <ferseiti@linux.vnet.ibm.com>
+ Reviewed-by: Adam Jackson <ajax@redhat.com>
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+
+commit b3a61703f41c6b34ba2ec9736030e1df04f53ab4
+Author: Oded Gabbay <oded.gabbay@gmail.com>
+Date: Thu Jun 25 15:59:53 2015 +0300
+
+ vmx: fix splat_alpha for ppc64le
+
+ The permutation vector isn't correct for LE, so correct its values
+ in case we are in LE mode.
+
+ v2:
+
+ - replace _LITTLE_ENDIAN with WORDS_BIGENDIAN for consistency
+ - change #ifndef to #ifdef for readability
+
+ Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
+ Reviewed-by: Adam Jackson <ajax@redhat.com>
+ Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+
+commit eebc1b78200aff075dbcae9c8d00edad1f830d91
+Author: Ben Avison <bavison@riscosopen.org>
+Date: Tue May 26 23:58:29 2015 +0100
+
+ mmx/sse2: Use SIMPLE_NEAREST_SOLID_MASK_FAST_PATH for NORMAL repeat
+
+ These two architectures were the only place where
+ SIMPLE_NEAREST_SOLID_MASK_FAST_PATH was used, and in both cases the
+ equivalent SIMPLE_NEAREST_SOLID_MASK_FAST_PATH_NORMAL macro was used
+ immediately afterwards, so including the NORMAL case in the main macro
+ simplifies the fast path table.
+
+ [Pekka: removed extra comma from the end of
+ SIMPLE_NEAREST_SOLID_MASK_FAST_PATH]
+
+ Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+
+commit 7f6692807902b840b81f860fb2196d2fb242d977
+Author: Ben Avison <bavison@riscosopen.org>
+Date: Tue May 26 23:58:28 2015 +0100
+
+ mmx/sse2: Use SIMPLE_NEAREST_FAST_PATH macro
+
+ There is some reordering, but the only significant thing to ensure that
+ the same routine is chosen is that a COVER fast path for a given
+ combination of operator and source/destination pixel formats must
+ precede all the variants of repeated fast paths for the same
+ combination. This patch (and the other mmx/sse2 one) still follows that
+ rule.
+
+ I believe that in every other case, the set of operations that match any
+ pair of fast paths that are reordered in these patches are mutually
+ exclusive. While there will be a very subtle timing difference due to
+ the distance through the table we have to search to find a match
+ (sometimes faster, sometimes slower) there is no evidence that the tables
+ have been carefully ordered by frequency of occurrence - just for ease
+ of copy-and-pasting.
+
+ Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
+ Acked-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
+