
Re: Any input for some talk about usage of Debian in HPC



On 20/05/2024 21:00, Steven Robbins wrote:
Hello,

On Sunday, May 19, 2024 9:31:02 A.M. CDT Tony Travis wrote:

You can't ignore the host OS when you talk about HPC applications and
the HEP (High Energy Physics) community put a lot of effort into
developing good node provisioning systems and job-scheduling for HPC.
Consequently, there was a significant bias towards support for HEP
applications running under CentOS and less support for bioinformatics.

I've been out of academia for decades, but HEP was my first love and
neuroimaging my second, so this paragraph really piqued my interest.  Can you
briefly say what are the different needs of HEP and bioinformatics and how they
are in conflict?

Hi, Steve.

Many HEP applications involve a lot of floating-point arithmetic and are computationally intensive. By contrast, most bioinformatics applications do not require floating-point arithmetic: they are dominated by memory size and the speed of memory access. Optimisations used in HEP calculations to keep everything in the high-speed CPU cache don't help with this kind of memory access.
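
As a rough illustration of that point, here is a minimal Python sketch of k-mer counting, a common building block in sequence analysis. The function name count_kmers and the toy sequence are made up for this example; real tools use far more compact data structures. The thing to notice is that there is no floating-point arithmetic at all: the work is string slicing, hashing and integer increments scattered across a table that, for a real genome, is much larger than any CPU cache.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer (substring of length k) in a sequence.

    Hypothetical helper for illustration only: integer counts in a hash
    table, with no floating-point arithmetic involved.
    """
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        # Each update lands at an effectively random place in the table,
        # which is why memory access, not arithmetic, dominates at scale.
        counts[sequence[i:i + k]] += 1
    return counts


# Toy input; a real read set would contain billions of bases.
counts = count_kmers("ACGTACGTAC", 3)
```

On a ten-base toy sequence the table is tiny, but the same loop over a whole genome produces a table of many gigabytes, and that is where memory size and access speed become the limiting factors.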

Sequence alignment and sequence assembly are areas of bioinformatics that have this sort of memory requirement in particular. Some efforts have been made to speed these up using GPGPU and SIMD CPU instructions but, to be honest, I've found it all very complicated and disappointing.
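
To make the alignment case concrete, here is a minimal sketch of Smith-Waterman local alignment scoring in Python. The function name and the scoring values (match=2, mismatch=-1, gap=-2) are assumptions chosen for illustration, not taken from any particular tool. Every operation is an integer add, compare or max over a dynamic-programming table.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring parameters; linear gap penalty; integers only.
    Only two rows of the table are kept, which is enough for the score
    but not for recovering the alignment itself.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


score = smith_waterman_score("TTACGTT", "ACG")
```

Recovering the actual alignment, rather than just the score, needs the whole table (or a more elaborate scheme), so memory grows with the product of the two sequence lengths. The cell-by-cell dependencies are also what make GPGPU and SIMD versions awkward to write.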

Recent successes for GPGPU applications in bioinformatics are, however, base-calling of 'long' DNA sequencing reads and TensorFlow ML (Machine Learning) methods for, e.g., error-correcting DNA/RNA sequence reads and predicting Transcription Factor Binding Sites.

None of this requires the floating-point calculations in the frequency/Fourier domain that many HEP applications do. I must admit that my views are largely based on experience of helping my friend do DFT (Density Functional Theory) simulations of protein 'docking' domains on a Beowulf cluster that we built for chemical modelling and bioinformatics!

I also worked for several years doing image analysis with Physicists :-)

Bye,

  Tony.

--
Minke Informatics Limited, Registered in Scotland - Company No. SC419028
Registered Office: 3 Donview, Bridge of Alford, AB33 8QJ, Scotland (UK)
tel. +44(0)19755 63548                    http://minke-informatics.co.uk
mob. +44(0)7985 078324        mailto:tony.travis@minke-informatics.co.uk

