
Re: Bits from /me: A humble draft policy on "deep learning v.s. freedom"



Hi PICCA,

On 2019-05-24 12:01, PICCA Frederic-Emmanuel wrote:
> What about ibm power9 with pocl ?
> 
> it seems that this is better than the latest NVIDIA GPU.

The typical workload when training neural networks is dominated
by dense linear operations such as general matrix-matrix
multiplication (GEMM) and convolution.

I know nothing about pocl, but it's hard for a CPU to beat a
GPU at these highly parallelizable linear operations.
Try a 4096x4096 matrix multiplication and you will quickly
see the difference.

E.g. my CPU is an i5-7440HQ (a mid-range mobile CPU) and my GPU
is an Nvidia 940MX (junk).
Even the junk GPU (CUDA) is roughly 100x faster than my CPU (MKL).

~ ❯❯❯ optirun ipython3
Python 3.7.3 (default, Apr  3 2019, 05:39:12) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import torch as th

In [2]: x = th.rand(4096, 4096)

In [3]: %time x@x
CPU times: user 1.65 s, sys: 38.7 ms, total: 1.69 s
Wall time: 449 ms
Out[3]: 
tensor([[1015.7596, 1004.2767, 1001.6245,  ..., 1026.8447,  996.3105,
         1002.7847],
        [1047.8833, 1014.3856, 1020.8246,  ..., 1055.3224, 1021.6126,
         1031.0334],
        [1049.3168, 1027.7637, 1030.9961,  ..., 1054.3218, 1015.3804,
         1031.6709],
        ...,
        [1039.6516, 1024.6678, 1021.1326,  ..., 1047.0674, 1015.1402,
         1029.5969],
        [1020.1988,  994.0073, 1005.5823,  ..., 1015.6786,  990.2491,
         1008.1358],
        [1022.9388,  991.9886,  990.4608,  ..., 1013.9000,  998.8676,
         1007.8554]])

In [4]: x = x.cuda()

In [5]: %time x@x
CPU times: user 1.1 ms, sys: 174 µs, total: 1.27 ms
Wall time: 2.67 ms
Out[5]: 
tensor([[1015.7591, 1004.2764, 1001.6254,  ..., 1026.8447,  996.3105,
         1002.7841],
        [1047.8838, 1014.3846, 1020.8243,  ..., 1055.3209, 1021.6123,
         1031.0328],
        [1049.3174, 1027.7644, 1030.9971,  ..., 1054.3210, 1015.3800,
         1031.6727],
        ...,
        [1039.6511, 1024.6686, 1021.1323,  ..., 1047.0674, 1015.1404,
         1029.5974],
        [1020.1982,  994.0067, 1005.5826,  ..., 1015.6784,  990.2482,
         1008.1347],
        [1022.9395,  991.9879,  990.4588,  ..., 1013.9014,  998.8687,
         1007.8544]], device='cuda:0')
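One caveat worth noting about the GPU number above: CUDA kernel
launches in PyTorch are asynchronous, so a plain %time can return
before the GPU has actually finished. A sketch of a more careful
measurement (assuming torch is installed; the script skips itself
when no CUDA device is present):

```python
import time

try:
    import torch
    has_cuda = torch.cuda.is_available()
except ImportError:
    has_cuda = False

if has_cuda:
    x = torch.rand(4096, 4096, device='cuda')
    torch.cuda.synchronize()  # wait for allocation/transfer to finish
    t0 = time.perf_counter()
    y = x @ x
    torch.cuda.synchronize()  # wait for the matmul kernel to finish
    print(f"synchronized wall time: {(time.perf_counter() - t0) * 1e3:.2f} ms")
else:
    print("no CUDA device available; skipping")
```

Even with synchronization the GPU stays far ahead at this size; the
point is only that unsynchronized timings can flatter the GPU further.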

