
[sarah-julia.kriesch@gmx.de: „Missing“ Support for AI (New IBM z17 announced)]



Hi,

forwarding this for comments from the Debian AI team.

Kind regards
    Andreas.

----- Forwarded message from Sarah-Julia Kriesch <sarah-julia.kriesch@gmx.de> -----

Date: Thu, 1 May 2025 14:53:16 +0000
From: Sarah-Julia Kriesch <sarah-julia.kriesch@gmx.de>
To: wg-linux-distros@lists.openmainframeproject.org, tille@debian.org, afedorova@redhat.com
Subject: „Missing“ Support for AI (New IBM z17 announced)


Hello everyone,

IBM announced the new IBM z17 at the IBM Z Day Special Edition (8 April 2025). The main focus is AI, with the new IBM Spyre™ Accelerator [0] for optimized ML/AI training alongside the integrated AI Accelerator of the z16. The fact is that this cannot work without all the Python modules "supported" by IBM for AI on Linux on IBM Z [1]. But so many tests are failing in our Linux distributions that, from my point of view, we cannot support it.

Last year I went through the list of Python packages from the presentation by Andreas Krebbel [2] and checked for build failures caused by tests and dependencies. I also brought this topic to the Linux Distributions Working Group. Two example issues/bugs that are still open:
1) ONNX (required for the connection between the software and hardware layers for AI):
https://bugzilla.suse.com/show_bug.cgi?id=1215337#c30 (some tests fail continuously)
2) oneDNN (hardware-specific optimizations for neural networks): https://github.com/uxlfoundation/oneDNN/issues/2228

The AI developers at IBM are so overloaded that they cannot respond to these test issues or fix them.

The suggestion from the IBM side has been to disable the failing tests, but we deliver the upstream tests with our binaries and use that verification to guarantee the functionality to our users. If that does not work, we do not provide or support the software for the specific hardware architecture by default in our Linux distributions.

Last year, we discussed this topic at the openSUSE Conference (with Fedora). The suggestion was not to support the next mainframe version if it contains more AI features that we cannot verify because of missing test systems and failing tests.
PJ Catalano wanted to sponsor a z16-based LinuxONE for us after an IBM developer escalated at the IBM Z Symposium 2024 that Linux distributions should receive newer hardware for verification. It seems that this is blocked by IBM regulations.

My suggestion is that we, as Linux distributions, do not support the z17 as long as we cannot verify, as a minimum, the functionality with the AI Accelerator of the z16, and as long as the test cases of all required AI/ML software packages "supported by IBM" are not fixed. Until then, we will support only the s390x hardware based on the z15, which is provided for development and testing, and we will offer the basic Linux operating system functionality for it. I am also open to finding a common solution by working together with IBM to fix this problem. But the required hardware has to be available.

If you agree, I would also publish an announcement about this for IBM customers in the IBM Community. What do you think about this suggestion?

Best regards,
Sarah J. Kriesch

Release Engineer for openSUSE zSystems
(Co-)Chair of the Linux Distributions Working Group

P.S. I also received feedback from the Enterprise Linux level, with this statement by IBM: "No certification that Linux is AI certified"


[0] https://newsroom.ibm.com/z17
[1] https://www.redbooks.ibm.com/redpapers/pdfs/redp5712.pdf
[2] https://www.youtube.com/watch?v=_t7O6C0kU1Y



----- End of forwarded message -----

-- 
https://fam-tille.de

