
Re: Applying for "A cloud image for bioinformatics with Debian"



On Fri, Apr 26, 2013 at 7:32 AM, Charles Plessy <plessy@debian.org> wrote:
> Le Thu, Apr 25, 2013 at 05:26:34PM +0800, harryxiyou a écrit :
>>
>> 1, I have launched and maintained the sub-project HLFS (Hadoop Distributed
>> File System Based Log-Structured File System) of Cloudxy
>> (http://code.google.com/p/cloudxy/   in Chinese
>> http://code.google.com/p/cloudxy/wiki/WHAT_IS_CLOUDXY  in English).
>>
>> I have also developed HLFS drivers for QEMU, Libvirt, Openstack and
>> they all work well for HLFS. I am submitting HLFS driver patches to
>> these communities. Maybe I will let HLFS support Cloudsim ;-).
>>
>> 2, Past open source contributions.
>> Maintain Cloudxy: http://code.google.com/p/cloudxy/
>> Patch for Libvirt:
>> http://libvirt.org/git/?p=libvirt.git&a=search&h=HEAD&st=commit&s=Harry+Wei
>> Patches for Linux Kernel:
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/log/?id=refs%2Ftags%2Fv3.9-rc7&qt=author&q=Harry+Wei
>> Some other trivial patches for Sheepdog and other projects are not listed here.
>>
>> 3, My basic information is here: http://houjiaoshe.com/~harry/
>
> Dear Harry,

Dear Charles,

Sorry to reply so late :-/

>
> thank you for your interest in bioinformatics and Debian.
>
> I have posted on the debian-cloud mailing list an explanation of the bottom-up
> aspect of the project:  the precise goal is to be defined together.  This is
> quite challenging (especially since the deadline is on May 3rd), but this way we
> can make a project that wouldn't have been thought of by biologists or
> computer scientists alone.
>
>     https://lists.debian.org/debian-cloud/2013/04/msg00006.html
>
> You have a strong background in computer science, so please ask as many questions as
> you need to better see where your skills can solve common problems we have in
> bioinformatics.  I have listed some of them in the email cited above.  Please
> use the debian-cloud or debian-med mailing lists so that you can have more
> answers than just mine.
>
> I had a look at Cloudxy.  Forgive me if I misunderstood, but is it a system
> that would deduplicate common contents between images that are very similar?

No, it wouldn't. Here is some basic information about HLBS.

We have launched and developed a new block storage system, HLBS
(Hadoop Distributed File System Based Log-Structured Block System).
HLBS is a sub-project of Cloudxy
(http://code.google.com/p/cloudxy/ in Chinese,
http://code.google.com/p/cloudxy/wiki/WHAT_IS_CLOUDXY in English).

My brother (Kang hua, kuanghua151@gmail.com) and
I (http://houjiaoshe.com/~harry/) have been developing HLBS for more
than two years. HDFS only supports writing once and reading many
times, and this is not suitable for a back-end storage system. So we
built on top of HDFS a new LBS (Log-Structured Block Storage system)
using the concepts of LFS (Log-Structured File System).
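
As a rough illustration of the idea above (a toy sketch, not the HLBS
code): if the underlying store can only append, a block "overwrite" can
be expressed as one more append plus an index update. Everything named
here (AppendOnlyLog, LogStructuredBlockStore, BLOCK_SIZE) is
hypothetical, and an in-memory list stands in for an HDFS file.

```python
BLOCK_SIZE = 4096  # hypothetical block size, not taken from HLBS


class AppendOnlyLog:
    """Stand-in for a write-once store such as an HDFS file: append and read only."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # position of the new record

    def read(self, position):
        return self._records[position]

    def __len__(self):
        return len(self._records)


class LogStructuredBlockStore:
    """Toy block device: every write is appended; an index maps block -> newest record."""

    def __init__(self, log):
        self._log = log
        self._index = {}  # block number -> position of the newest record in the log

    def write_block(self, block_no, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        # Nothing is modified in place: the old record stays in the log.
        self._index[block_no] = self._log.append((block_no, bytes(data)))

    def read_block(self, block_no):
        position = self._index.get(block_no)
        if position is None:
            return bytes(BLOCK_SIZE)  # never-written blocks read as zeros
        return self._log.read(position)[1]


log = AppendOnlyLog()
store = LogStructuredBlockStore(log)
store.write_block(7, b"a" * BLOCK_SIZE)
store.write_block(7, b"b" * BLOCK_SIZE)  # an "overwrite" is just another append
```

In this sketch the stale first record for block 7 stays in the log,
which is the kind of garbage a segment cleaner would later reclaim.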

HLBS design ideas are here: http://code.google.com/p/cloudxy/wiki/HlfsDesign

HLBS currently has many features, including the following.
1, Support Snapshots -- http://code.google.com/p/cloudxy/wiki/HlfsSnapshotDesign

2, Support Cache -- http://code.google.com/p/cloudxy/wiki/HLFS_CACHE_DESIGN

3, Support Block Compression --
http://code.google.com/p/cloudxy/wiki/About_block_Compression

4, Support Segment cleaning (simple garbage collection) --
http://code.google.com/p/cloudxy/wiki/SEG_CLEAN_USAGE

...

We also support many well-known software projects, such as NBD (Network
Block Device), QEMU, Libvirt, Openstack, and iSCSI. The patches are here:
http://cloudxy.googlecode.com/svn/branches/hlfs/features/multi-file/patches/
I am now submitting these patches to the respective communities.

> One recurrent problem in bioinformatics is that each project is tied to the
> version numbers of the programs it used at the beginning (most bioinformatics
> software is not as strong as more core applications when it comes to stable
> APIs, etc.).  But there may be even more interesting things to do together.
>
> Let's continue this discussion on a public mailing list.

I have cc'ed this thread to debian-cloud. My understanding is this:
HLBS could be used as an image that stores bioinformatics data, and we
can deploy HLBS on Debian.


--
Thanks
Harry Wei

