
Re: hardware encryption



On Wednesday 20 January 2021 11:40:26 CEST brainfart@posteo.net wrote:
> hardware accelerated encryption is a bit of a mystery to me
> some processors advertise it but how do we know if it's being used
> is there a way to test if hardware accelerated encryption is being used
> or if it's just advertising hype

I would very much like to understand this as well.
I have several Rock64 devices, which are supposed to have the ARMv8 Cryptography
Extensions according to https://wiki.pine64.org/wiki/ROCK64#CPU_Architecture.

As a result of bug #976635, several CRYPTO modules were enabled in the 5.10 kernel,
but I don't know whether that's relevant for ARMv8 CE.

https://turecki.net/content/getting-most-out-ssh-hardware-acceleration-tuning-aes-ni
contains a test to check the speed of some crypto operations.
Based on that I've made a procedure which I've now run on several devices:

# adduser test
$ ssh-add (make sure ssh agent is running)
$ ssh-copy-id test@localhost
$ ssh test@localhost (verify key based auth works)
$ exit
$ for i in $(ssh -Q cipher); do dd if=/dev/zero bs=1M count=100 2> /dev/null | \
ssh -c "$i" test@localhost "(time -p cat) > /dev/null" 2>&1 | grep real | \
awk -v c="$i" '{print c ": " 100 / $2 " MB/s"}'; done
$ grep -i -E "(flags|features)" /proc/cpuinfo | tail -n1
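The last command prints every CPU flag, which gets long on amd64. A minimal sketch that keeps only the crypto-related ones could look like this. The helper name `crypto_flags` is made up for illustration, and the flag list (aes, pmull, sha1, sha2 on arm64; aes, pclmulqdq, sha_ni on amd64) is my assumption of which flags matter here, not an authoritative list:

```shell
# Hypothetical helper: read a cpuinfo "flags"/"Features" line on stdin and
# print only the flags that (by assumption) indicate crypto instructions.
crypto_flags() {
    tr -s ' \t' '\n\n' |
        grep -x -E 'aes|pmull|sha1|sha2|pclmulqdq|sha_ni' |
        paste -sd' ' -
}

# Same grep as in the procedure, filtered through the helper.
grep -i -E "(flags|features)" /proc/cpuinfo | tail -n1 | crypto_flags
```

A flag being listed only says the CPU offers the instructions; it does not show that any particular program actually uses them.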

On a Rock64 with kernel 5.8.0-1-arm64, I got these results:
aes128-ctr: 45.8716 MB/s
aes192-ctr: 45.6621 MB/s
aes256-ctr: 44.6429 MB/s
aes128-gcm@openssh.com: 49.505 MB/s
aes256-gcm@openssh.com: 48.7805 MB/s
chacha20-poly1305@openssh.com: 36.9004 MB/s

Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

But on kernel 5.10.0-7-arm64, with those CRYPTO modules, I got this:
aes128-ctr: 42.735 MB/s
aes192-ctr: 44.4444 MB/s
aes256-ctr: 44.0529 MB/s
aes128-gcm@openssh.com: 48.0769 MB/s
aes256-gcm@openssh.com: 46.0829 MB/s
chacha20-poly1305@openssh.com: 37.037 MB/s

Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

If you run the test several times, you'll get slightly different results 
each time, so I consider these results the same.

For comparison (I don't remember which kernel version) on Ryzen 7 1800X:
aes128-ctr: 714.286 MB/s
aes192-ctr: 714.286 MB/s
aes256-ctr: 769.231 MB/s
aes128-gcm@openssh.com: 1000 MB/s
aes256-gcm@openssh.com: 1000 MB/s
chacha20-poly1305@openssh.com: 294.118 MB/s

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat 
pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp 
lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni 
pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx 
f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 
3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext 
perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 
avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 
xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale 
vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic 
v_vmsave_vmload vgif overflow_recov succor smca

with kernel 5.10.0-7-amd64:
aes128-ctr: 714.286 MB/s
aes192-ctr: 769.231 MB/s
aes256-ctr: 714.286 MB/s
aes128-gcm@openssh.com: 909.091 MB/s
aes256-gcm@openssh.com: 909.091 MB/s
chacha20-poly1305@openssh.com: 500 MB/s

Very odd that aes192-ctr and aes256-ctr seem to have switched places, but the
values are otherwise EXACTLY the same :-O
Very impressive speed improvement with chacha20-poly1305 though :D
(Note that the aforementioned bug report was about arm64, not amd64)
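
A possible explanation for the identical values, assuming the remote shell is bash: `time -p` reports elapsed time in steps of 0.01 s, and on the Ryzen the 100 MB transfer finishes in roughly 0.1 to 0.3 s, so only a handful of distinct results can come out. A small sketch of that arithmetic follows; the helper name `throughput` and the list of timings are mine, chosen to match the numbers above, and LC_ALL=C is only there to force a decimal point:

```shell
# Hypothetical helper: MB/s for a 100 MB transfer that took $1 seconds,
# using the same "100 / seconds" formula as the awk step in the test loop.
throughput() {
    LC_ALL=C awk -v t="$1" 'BEGIN { print 100 / t }'
}

# Elapsed times in 0.01 s steps, as time -p would report them.
for t in 0.10 0.11 0.13 0.14 0.20 0.34; do
    echo "$t s -> $(throughput "$t") MB/s"
done
```

If that is what is going on, a one-step difference in the reported time (0.13 s vs 0.14 s) is enough to swap two ciphers, and a much larger count= on the fast machine should give finer-grained numbers.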

On an RPi2, the values were around 12 MB/s.


I don't find the scores of the Rock64 impressive, but that may be because
I've read somewhere that the ARMv8 Cryptography Extensions could/should
result in a FACTOR 10 speed improvement for cryptography.

There could be a number of issues here:
1) The 'factor 10' is horseshit
2) The 'factor 10' is true, but it doesn't work on Rock64 (yet?)
3) The 'factor 10' is true and working and without it, the scores would be abysmal.
4) The test is all wrong 

If I do 'cat /proc/crypto' I get a long list, but I have no idea what the output means.
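
As far as I understand it, /proc/crypto lists the algorithm implementations registered with the kernel's crypto API, one block per implementation with fields such as name, driver, module and priority, and the highest-priority driver for a given name is the one the kernel picks. A rough sketch to condense that list is below. The helper name `crypto_drivers` is made up, and filtering on "-ce" assumes that the Crypto Extensions drivers carry that suffix in their driver name:

```shell
# Hypothetical helper: condense /proc/crypto-style input on stdin into
# "name driver priority" lines, one per registered implementation.
crypto_drivers() {
    awk -F' *: *' '
        $1 == "name"     { name = $2 }
        $1 == "driver"   { driver = $2 }
        $1 == "priority" { print name, driver, $2 }
    '
}

# Show only drivers whose name contains "-ce" (assumed naming convention).
if [ -r /proc/crypto ]; then
    crypto_drivers < /proc/crypto | grep -e '-ce' || echo "no -ce drivers registered"
fi
```

Note that this only covers in-kernel users such as dm-crypt. OpenSSH does its crypto in userspace (normally via OpenSSL), so the ssh test above probably does not exercise these kernel modules at all.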


So essentially I have the same question as OP.
How can I/we know if it's present and working as intended?
What kind of speed improvement can/should one expect?
What is needed to take advantage of it? Kernel modules and if so, which?
The CRYPTO_XYZ_CE ones? Others? Something else entirely?

Cheers,
  Diederik
