Some bignum speed tests using Cryptokit
On Wednesday 17 December 2003 05:36 am, Sven Luther wrote:
> Hello,
>
> I have finally found time to make a separate package of the new
> nat/bignum implementation, and uploaded it today, but as always, it will
> be in the NEW queue for some time.
>
> The package is called ocaml-nums (after the dllnums.so name), and
> produces two packages called libnums-ocaml and libnums-ocaml-dev.
[snip]
> If i have time (or someone else feels like it), i will also provide the
> non-free version as a separate package providing libnums-ocaml, but i am
> not entirely sure it can go into non-free even. It would be nice for
> comparisons though between the old and the new code. The new code should
> be 50% faster on SSE2 enabled CPUs and on powerpc ones (not sure if only
> altivec supporting or not). I am sure that if someone feels like it,
> other vector engines support could be added for other kind of cpus.
I ran the new ocaml-nums through its paces using Cryptokit's speedtest on RSA
operations, and have attached my results. Note that Xavier's original
implementation of these ops in Cryptokit uses the Nat type directly and
explicitly allocates space for results, based on knowing in advance how many
"digits" will be needed to store them. I expect that's one reason why the
Nat implementation is so much faster for some operations than Numerix.Big,
which uses the same underlying ocaml-nums primitives. I haven't built
Numerix.Big against the old non-free nums, so I don't know whether it was
slower before.
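As a rough illustration of the pre-allocation point above, here is a minimal sketch in Python (not OCaml, and not taken from Cryptokit or ocaml-nums). It multiplies two numbers stored as digit arrays into a result buffer that the caller sizes up front from the operand lengths, so nothing is resized during the computation. The names to_digits, from_digits and mul_into, and the 32-bit digit size, are assumptions made for the sketch only.

```python
# Hypothetical illustration; NOT Cryptokit's or ocaml-nums' actual code.
BASE = 1 << 32  # assumed digit size for this sketch


def to_digits(n):
    """Split a non-negative int into little-endian base-2^32 digits."""
    digits = []
    while True:
        digits.append(n % BASE)
        n //= BASE
        if n == 0:
            return digits


def from_digits(digits):
    """Inverse of to_digits: rebuild the integer from its digits."""
    n = 0
    for d in reversed(digits):
        n = n * BASE + d
    return n


def mul_into(result, a, b):
    """Schoolbook multiply a*b into a caller-supplied, zero-filled buffer.

    A product of len(a) and len(b) digits never needs more than
    len(a) + len(b) digits, so the caller can size the buffer in advance.
    """
    assert len(result) >= len(a) + len(b)
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            t = result[i + j] + ai * bj + carry
            result[i + j] = t % BASE
            carry = t // BASE
        # This slot has not been touched by earlier rows, so no overflow.
        result[i + len(b)] += carry


# Usage: the buffer size comes from the operand lengths alone.
a = to_digits(3 ** 200)
b = to_digits(7 ** 150)
out = [0] * (len(a) + len(b))
mul_into(out, a, b)
assert from_digits(out) == 3 ** 200 * 7 ** 150
```

A general-purpose bignum type usually has to work out the result size, allocate and normalize on every operation; whether that accounts for the gap seen with Numerix.Big is only my guess, as stated above.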
I will do some test runs on PowerPC as well, but perhaps not before the new
year. My current numerix and cryptokit packages are at
http://www-static.sane.net/
and testers are welcome. (Thanks, Sylvain!) The easiest thing for
comparative speed testing is probably to build and install numerix and then
rebuild the cryptokit source package for each variation (no need to install
the debs) using:
    dpkg-buildpackage -rfakeroot && make speedtest && ./speedtest
The packages use dpatch, so edit a file in debian/patches rather than
editing cryptokit.ml directly. See 11_numerix_slong.dpatch for an
example (this one-line patch is applied only on i386, where Slong is the fast
assembly implementation).
An older cryptokit-1.2-1 package is also there; it contains the original
Nat-based cryptokit.ml and builds well enough to run speedtest. This is what
I used for the Nat cases below.
> Friendly,
>
> Sven Luther
- Michael
The following performance comparisons were done on 2003-12-18 on a
2.4 GHz Pentium IV with 1 GB of RAM (mostly free) using Cryptokit's
speedtest.ml (built as optimized native code). Each number is the time in
seconds to perform the stated operation the stated number of times (x 10,
x 100 or x 1000), so smaller is faster.

Numerix.Slong:
  1.36 RSA key generation (1024 bits) x 10
  0.42 RSA public-key operation (1024 bits, exponent 65537) x 1000
  2.40 RSA private-key operation (1024 bits) x 100
  0.77 RSA private-key operation with CRT (1024 bits) x 100

Numerix.Gmp:
  1.49 RSA key generation (1024 bits) x 10
  1.21 RSA public-key operation (1024 bits, exponent 65537) x 1000
  2.17 RSA private-key operation (1024 bits) x 100
  0.73 RSA private-key operation with CRT (1024 bits) x 100

Original (OCaml <= 3.07) Nat:
  2.31 RSA key generation (1024 bits) x 10
  0.47 RSA public-key operation (1024 bits, exponent 65537) x 1000
  3.08 RSA private-key operation (1024 bits) x 100
  0.95 RSA private-key operation with CRT (1024 bits) x 100

New (OCaml >= 3.08) Nat:
  2.84 RSA key generation (1024 bits) x 10
  0.73 RSA public-key operation (1024 bits, exponent 65537) x 1000
  2.86 RSA private-key operation (1024 bits) x 100
  1.09 RSA private-key operation with CRT (1024 bits) x 100

Numerix.Big:
  2.31 RSA key generation (1024 bits) x 10
  2.28 RSA public-key operation (1024 bits, exponent 65537) x 1000
  3.30 RSA private-key operation (1024 bits) x 100
  1.48 RSA private-key operation with CRT (1024 bits) x 100

Numerix.Dlong:
  4.86 RSA key generation (1024 bits) x 10
  1.14 RSA public-key operation (1024 bits, exponent 65537) x 1000
  8.78 RSA private-key operation (1024 bits) x 100
  2.61 RSA private-key operation with CRT (1024 bits) x 100

Numerix.Clong:
  6.35 RSA key generation (1024 bits) x 10
  1.17 RSA public-key operation (1024 bits, exponent 65537) x 1000
  8.77 RSA private-key operation (1024 bits) x 100
  2.98 RSA private-key operation with CRT (1024 bits) x 100
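
Because the four tests use different repeat counts, the raw figures above are easier to compare once they are converted to time per operation and to a ratio against a baseline. The following Python sketch does that arithmetic on the numbers listed above. The dictionary layout and the function names per_op_ms and relative_to are my own choices for the sketch, and taking the original Nat as the baseline is just one reasonable option.

```python
# Timings copied from the results above, in seconds, in the order:
# key generation x10, public-key x1000, private-key x100, private+CRT x100.
RESULTS = {
    "Numerix.Slong": (1.36, 0.42, 2.40, 0.77),
    "Numerix.Gmp": (1.49, 1.21, 2.17, 0.73),
    "Original Nat": (2.31, 0.47, 3.08, 0.95),
    "New Nat": (2.84, 0.73, 2.86, 1.09),
    "Numerix.Big": (2.31, 2.28, 3.30, 1.48),
    "Numerix.Dlong": (4.86, 1.14, 8.78, 2.61),
    "Numerix.Clong": (6.35, 1.17, 8.77, 2.98),
}
REPEATS = (10, 1000, 100, 100)
LABELS = ("keygen", "public", "private", "private+CRT")


def per_op_ms(name):
    """Milliseconds per single operation for one implementation."""
    return tuple(1000.0 * t / n for t, n in zip(RESULTS[name], REPEATS))


def relative_to(name, baseline="Original Nat"):
    """Ratio of each timing to the baseline (below 1.0 means faster)."""
    return tuple(t / b for t, b in zip(RESULTS[name], RESULTS[baseline]))


def print_table():
    """Print per-operation times and ratios for every implementation."""
    for name in RESULTS:
        print(name)
        for label, ms, ratio in zip(LABELS, per_op_ms(name), relative_to(name)):
            print(f"  {label:12s} {ms:8.2f} ms/op   x{ratio:.2f} vs original Nat")


if __name__ == "__main__":
    print_table()
```

Read this way, the new Nat is slower than the original Nat on three of the four tests in this run (for example 0.73 s against 0.47 s for the public-key operation) and somewhat faster only on the plain private-key operation (2.86 s against 3.08 s).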