
Bug#996958: ITP: mumax -- GPU accelerated micromagnetic simulator



That's a neat project.

The README.md says:
> if you don't have git:
> 
> *  seriously, no git?

Unfortunately, the question is not whether one has git, but whether one
has CUDA.

> The Design and Verification of mumax3:
> 
> http://scitation.aip.org/content/aip/journal/adva/4/10/10.1063/1.4899186

The hyperlink seems to be paywalled or broken.

You write:
> A speed-up of the order of 100x compared to CPU-based simulations can
> easily be reached....

Since I am unable to view the paper, would you tell me briefly and
approximately how you achieved the speed-up?  Alternatively, would you
link me to relevant presentation slides, a presentation video, or the
like?  Failing that, would you advise me in which source file one should
look for the core of the main loop, where the 100x speed-up is
implemented?

I ask because I have a simulation that improperly relies on g++'s
optimizer to vectorize the simulation's main loop, the elements
being 64+64 = 128-bit complex doubles.  Even if my loop technique were
not clumsy and 15 years outdated, the optimizer targets only the CPU's
SSE hardware, and not (as far as I can tell by reviewing the
disassembly) the GPU at all.  One could try OpenCL, of course; but
without a good example to follow, I'd probably flounder around for six
months trying to figure out how to apply OpenCL intelligently....
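
To make the question concrete, here is a minimal sketch of the kind of
loop meant above.  It is an illustration only: the update rule
(state = state * factor + source), the function names step_soa and
step_ref, and the split real/imaginary array layout are all assumptions
chosen for the example, not code taken from the actual simulation or
from mumax.  Keeping the real and imaginary parts in separate arrays
tends to give the optimizer a simpler loop to vectorize than an array of
std::complex<double>, though whether g++ actually vectorizes it depends
on the compiler version and flags.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical update rule, for illustration only:
//     state[i] = state[i] * factor[i] + source[i]
// Real and imaginary parts live in separate arrays (structure of arrays),
// so each statement in the loop body is plain double arithmetic.
void step_soa(std::vector<double>& re, std::vector<double>& im,
              const std::vector<double>& fre, const std::vector<double>& fim,
              const std::vector<double>& sre, const std::vector<double>& sim)
{
    const std::size_t n = re.size();
    for (std::size_t i = 0; i < n; ++i) {
        const double a = re[i];  // old real part
        const double b = im[i];  // old imaginary part
        re[i] = a * fre[i] - b * fim[i] + sre[i];
        im[i] = a * fim[i] + b * fre[i] + sim[i];
    }
}

// Reference version of the same rule on 128-bit complex doubles,
// useful for checking that the split-array loop gives the same answer.
void step_ref(std::vector<std::complex<double>>& state,
              const std::vector<std::complex<double>>& factor,
              const std::vector<std::complex<double>>& source)
{
    for (std::size_t i = 0; i < state.size(); ++i) {
        state[i] = state[i] * factor[i] + source[i];
    }
}
```

With GCC, building with -O3 and adding -fopt-info-vec asks the compiler
to report which loops it vectorized, which is quicker than reading the
disassembly.  Neither version touches the GPU; that would still require
CUDA, OpenCL, or the like.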

Anyway, if you believe that your code is a good example, then I'd be
interested to see how you have achieved the 100x.
