
Re: Question about distributed FS for high-performance I/O


On 25/01/2016 18:01, Serge Cohen wrote:
> Dear list,
> I am looking for a distributed FS for high-performance I/O (not high
> availability) that is well suited to be both served by debian systems
> and on which it is easy to have debian clients. The clients of the FS
> will be performing scientific computation which are often I/O bound
> (few large files, rather than many small files/DB like)
> A single file is in the range of 100s of MB to 100s of GB. Datasets
> can be going up to a few TB (eg. 10 files of 200GB each). The
> computation is embarrassingly parallel but mostly I/O bound (one of
> the typical problem is related to transposition of arrays of 100GB
> size, each element being a few kB).
> I am in a small lab, we already have some (or all) of the hardware:
> - 1 HP RAID arrays in fibre-channel and SAS,
> - 3 servers for OSS + MDS types,
> - an infiniband fabric,
> - 2 extra servers on the fabric for computations and as «NAS head»
>   for the rest of the network (partly 10GbE) for serving to client
>   running unsupported OS/fabric.

I have a few questions about this paragraph, if you don't mind, because
I am not sure I understand it correctly:
- How is the disk array attached to the servers?
- What is the unsupported OS? This makes me think that your Debian
  constraint below is not a hard requirement for the storage part.

If your storage is attached to two servers, then maybe your best option
today is to use NFS? Correctly configured and tuned for this kind of
setup, it can run reasonably well! Did you consider that option?
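
To compare NFS with the other candidates on the actual hardware, a rough
sequential throughput check along the following lines may help. This is
only a sketch in Python: the function name, the file size and the block
size are illustrative assumptions, not part of any of the tools mentioned
in this thread. The read figure will be inflated by the client page cache
unless the test file is larger than RAM or the cache is dropped first.

```python
import os
import tempfile
import time


def sequential_throughput(directory, size_mb=64, block_mb=4):
    """Write, then read back, one file in `directory`.

    Returns (write_MB_per_s, read_MB_per_s). `size_mb` should be a
    multiple of `block_mb`; both defaults are arbitrary and far too
    small for a real measurement.
    """
    block = os.urandom(block_mb * 1024 * 1024)
    n_blocks = size_mb // block_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # include the time to push data to storage
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        total = 0
        with open(path, "rb") as f:
            while True:
                chunk = f.read(len(block))
                if not chunk:
                    break
                total += len(chunk)
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)  # never leave the test file behind

    mb = total / (1024 * 1024)
    return mb / write_s, mb / read_s
```

Calling `sequential_throughput("/mnt/scratch", size_mb=200 * 1024)` (the
mount point is hypothetical) from several clients at once gives a first
idea of how a given setup copes with files of the size described above.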

> Initially this system was supposed to run Lustre as FS, but since the
> support never went into stable and now is not even into unstable
> anymore it is no more an option given our limited resources in term
> of sys-admin and related activities.
> One option I have recently seen is BeeGFS, and it seems a reasonable
> solution… but the documentation is sparse and there seems to be not
> that many users already.

There are a few more options you can try:
- GlusterFS
- Ceph

Documentation for both is rather good, and both are supported on Debian
(on both the client and the server side).

Alternatively, there is also MooseFS (or its fork LizardFS). You can have
a metadata server with MooseFS, but I'm not sure this would help a lot given
your current hardware setup.

> Is there a plan for Lustre to be back into stable distribution, do
> any of you have experience with BeeGFS (ex. FraunhoferGFS/FhGFS). Or
> do some of you have better (or even interesting) experience with
> other solutions ?

I haven't investigated the current state of Lustre, but I suspect the
server part still needs a patched kernel. That is less true for the
client side, though. That's why it didn't make much sense to provide
Lustre in Debian. There was an ongoing effort to bring the client part
back [1] into Debian but, as far as I know, the people involved have been
stuck due to a lack of testing.

[1] http://anonscm.debian.org/gitweb/?p=users/waldi/lustre.git

> Thanks in advance for any comments/help/pointers.
> Sincerely,
> Serge.
> PS : We are using Debian for both servers and clients, and it is not
> an option for us to be using another system or distribution.

Kind regards,
