Hi Mo,
I presume it's still a draft, but d/rules creates
`third_party/dlpack` and does not create `third_party/jitify`.
The d/copyright also does not yet seem to be complete.
> cupy-rocm now compiles on amd64, arm64, and ppc64el locally, but I
> don't know whether it is really working. Could someone help me test
> it on real hardware before I upload it to NEW?
> https://salsa.debian.org/science-team/cupy
I built and tested the current version in salsa and it seems to work on my Radeon VII.
Input:
# python3 <<EOF
import cupy as cp
import numpy as np
x_cpu = np.array([1, 2, 3])
x_gpu = cp.array([1, 2, 3])
l2_cpu = np.linalg.norm(x_cpu)
l2_gpu = cp.linalg.norm(x_gpu)
print("Using Numpy: ", l2_cpu)
print("\nUsing Cupy: ", l2_gpu)
EOF
Output:
Using Numpy:  3.7416573867739413

Using Cupy:  3.7416573867739413
Input/Output:
>>> import cupy as cp
>>> import numpy as np
>>> ary_cpu = np.arange(10)
>>> ary_gpu = cp.asarray(ary_cpu)
>>> print('cpu:', ary_cpu)
cpu: [0 1 2 3 4 5 6 7 8 9]
>>> print('gpu:', ary_gpu)
gpu: [0 1 2 3 4 5 6 7 8 9]
>>> print(ary_gpu.device)
<CUDA Device 0>
>>> ary_cpu_returned = cp.asnumpy(ary_gpu)
>>> print(repr(ary_cpu_returned))
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> print(type(ary_cpu_returned))
<class 'numpy.ndarray'>
>>> print(ary_gpu * 2)
[ 0  2  4  6  8 10 12 14 16 18]
>>> print(cp.exp(-0.5 * ary_gpu**2))
[1.00000000e+00 6.06530660e-01 1.35335283e-01 1.11089965e-02
 3.35462628e-04 3.72665317e-06 1.52299797e-08 2.28973485e-11
 1.26641655e-14 2.57675711e-18]
>>> print(cp.linalg.norm(ary_gpu))
16.881943016134134
>>> print(cp.random.normal(loc=5, scale=2.0, size=10))
[ 7.17178344  8.17284596  5.72956002  5.16859175  4.29981156  6.99345567
  2.62313118  3.33248402 10.09166774  6.32673795]
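For anyone repeating this on other hardware, the deterministic checks above could be scripted roughly as follows. This is only a sketch: `check_backend` is a hypothetical helper name (not part of cupy or the package), and the tolerance is an assumption. It runs the same operations on NumPy and on the backend under test and reports whether the results agree.

```python
import numpy as np


def check_backend(xp, rtol=1e-6):
    """Run a few array ops on backend `xp` and compare against NumPy.

    `xp` is a NumPy-compatible module (e.g. cupy). Hypothetical helper,
    not part of cupy itself.
    """
    data = np.arange(10, dtype=np.float64)
    # Same operations as in the interactive session above.
    checks = {
        "multiply": lambda m, a: a * 2,
        "exp": lambda m, a: m.exp(-0.5 * a**2),
        "norm": lambda m, a: m.linalg.norm(a),
    }
    results = {}
    for name, op in checks.items():
        expected = np.asarray(op(np, data))
        got = op(xp, xp.asarray(data))
        # cupy arrays expose .get() to copy back to the host; NumPy results do not.
        got = got.get() if hasattr(got, "get") else np.asarray(got)
        results[name] = bool(np.allclose(got, expected, rtol=rtol))
    return results


# Usage on a machine with a GPU (assumes cupy is installed):
#   import cupy as cp
#   print(check_backend(cp))
```

Passing `np` itself as the backend is a quick way to confirm the script runs before trying it against cupy.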
- Cory