
Re: Thoughts on distributing virtual machine images to promote Debian



On 4/16/07, Michael Hanke <michael.hanke@gmail.com> wrote:

I maintain an unofficial package of FSL (a non-(commercial|free) toolkit for
fMRI analysis). For a special-interest package like this it has a pretty high
number of users, but from reading the upstream mailing list and other
user feedback I know that far more people are using FSL via cygwin on the
win32 platform. However, upstream decided that the win32 port is too
complicated to maintain and they want to abandon it (going Mac & Linux
only). They expressed their intention to distribute a VMWare image
running some Linux distribution and FSL pre-installed to be used by the
Windows users instead. They will probably choose a recent Fedora or OpenSuse
release.

I have some experience running NASA SeaDAS, a remote sensing application,
in VMware players.  I do this for testing, as the application is much
too I/O intensive for real work to happen in a VM.

Reading this I wondered whether Debian wouldn't be a better choice. Of course
it is, but I tried to collect some arguments. I got:

 - Easy to maintain.
 - Easy to upgrade (even across releases).
 - Reduces bandwidth on the download server as a virtual machine can be
   updated via the package management system and does not have to be
   downloaded again and again with every new release. This is a Debian
   advantage as APT makes sure that this works in all but the most
   unlikely situations.
 - The Debian archive contains far more software than any other.
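
The bandwidth argument in the list above can be made concrete with a
little arithmetic. The following is only a rough sketch: the image size,
the per-release update size and the number of releases are made-up
figures chosen for illustration, not measurements from FSL or Debian.

```python
def cumulative_download_mb(releases, image_mb, update_mb):
    """Return (full_redownload, incremental) totals in MB for one user.

    full_redownload: the user fetches a complete VM image for every release.
    incremental:     the user fetches the image once and then only package
                     updates for each later release.
    All sizes are hypothetical inputs supplied by the caller.
    """
    full = image_mb * releases
    incremental = image_mb + update_mb * (releases - 1)
    return full, incremental


# Hypothetical figures: 4 releases, a 700 MB image, 40 MB of updates each.
full, incremental = cumulative_download_mb(releases=4, image_mb=700, update_mb=40)
print(f"re-download every release:  {full} MB")
print(f"one image plus APT updates: {incremental} MB")
```

With these assumed numbers the server delivers 2800 MB per user in the
first case and 820 MB in the second; the real ratio depends entirely on
how large the updates actually are.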

I have used mostly Fedora, as the NASA group uses that as their development
platform.  Kernel updates are a problem because VMware tools use kernel modules
that have to be built for each install.  I gather there is some lack of
clarity over whether the license permits distributing the tools with a
linux kernel.

I think a VM solution would work quite well for FSL, but the situation isn't
much different for other software packages (at least for science-related
packages). Therefore I'd like to ask whether you think that it would be
reasonable for Debian to provide a virtual machine image with the
most recent stable release that can be customized to perform a specific
task on a win32 machine?

In principle, it would be good if the standard kernels included the
hooks required to run in VM's with licenses that permit such things.
This gives people flexibility to run Debian in places where it would not
otherwise be available.   It is a big step for people to just switch from
Windows to linux, even if they are inclined that way.  The ability to keep
their existing environment while exploring linux makes it much easier for
people to make the switch.

The ability to install software is key to the unix/linux toolkit
approach -- most apps are not standalone but rely heavily on other tools.

The target audience would be win32 users that are forced to use this OS
(or simply aren't brave enough to try something different). As
installing and updating software on win32 isn't really a pleasure
(especially for special-interest or hacked-together tools) I assume that
this service could gain some popularity -- always given that some
software is actually available in Debian.

My current concept of the virtual machine setup is this:

 * Minimal installation to keep the size of the image as low as
   possible.
 * One user is completely configured (root via sudo). (Almost) no
   questions are asked and no configuration besides installing the
   virtualization software is required. It should be sufficient to boot
   the VM, log in and start the app you want.
 * All software installed in the image (except for the packages the user
   is interested in) should be selected/configured to honour that most
   win32 machines that are likely to run this image won't have much RAM
   (i.e. several GBs) installed. Low memory consumption is therefore
   more or less critical.
 * Given that most win32 users expect a graphical desktop it should have
   one. Taking the memory consumption and disk space issue into account
   it should probably be XFCE.

I find that running Xming so linux apps appear in windows beside Win
apps is useful: it allows cut/paste between linux and windows and teaches
the important lesson that linux separates the OS from the graphics.

 * It must provide easy access to data on the host machine. Either via
   directly mounting the host filesystem or via a samba server inside
   the VM.

A VMware player can mount Windows' shared filesystems using cifs, but
I haven't found a way to map permissions properly.

There is also the question of which virtualization software should be used.
While FSL upstream favors VMWare I think that Debian should not
encourage someone to use closed-source software. Therefore I tested
VirtualBox and it works pretty well. While previous releases had some
problems (high cpu load, even when idle) the latest release seems to be
quite stable.

Additionally VirtualBox has a number of other advantages:

 * Runs headless providing video output via an integrated RDP server -- one
   of the few technologies Windows can handle without additional software.
 * It is able to mount local folders/drives on the host system without a
   network setup -- removes the need for a samba server and therefore
   reduces the VM image size.
 * Easy to set up.
 * Codebase is GPL'ed.

But unfortunately VirtualBox has at least one disadvantage, as it requires
administrator privileges to run on win32. Although this is on the
upstream TODO list, nobody knows whether it will change soon.

So many apps need admin that many sites have been forced to give admin
rights to large numbers of users, so for a technical audience this may not
be a big problem.

Somebody might ask whether a Live-CD can do the job as well, especially
as there already is a project working on Debian Live-CDs. My opinion is:
no, it can't.

I think VMs are superior to Live-CDs for this task as the VMs can be
used as a living Debian system that can be further customized. In
contrast Live-CDs always feel like a snapshot of a certain system that
one has to live with.

If a user discovers any problem with a Live-CD image, the best he/she
can do is report it and wait for it to get fixed, redownload and try
again. But most likely it won't happen. Why should one replace one set
of installation/maintenance problems with another?

Some Live-CDs support updates using unionfs, but the updates have to
be installed again for each boot.

A VM image can be easily modified/fixed/customized. Additionally
IMHO it demonstrates much better the real advantages of Debian: the
wealth of high quality free software only one 'apt-get' away.

The popular commercial distros (R.H., SUSE) both have lots of apps that can
generally be installed quite easily using a GUI tool.

How could the virtual machine image be maintained? At the moment I see
four possibilities:

 1) Integrating its installation setup in tasksel, although it might be
    overkill as only relatively few people would use it.
 2) Maintaining a configuration that can be used to pre-seed the
    installer.
 3) Maintaining a full master copy of a VM image.
 4) Maintaining a wiki page with a detailed installation description
    tailored for this purpose.
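
To give a feel for what option 2) might involve, here is a minimal
sketch that renders a few pre-seed lines for the installer. The helper
name render_preseed, the user name and the package selection are
hypothetical choices for illustration. The debconf keys are the ones I
recall from debian-installer's example pre-seed file and should be
checked against the release actually targeted; a usable file would need
more answers than this (password, partitioning, mirror, and so on).

```python
def render_preseed(username, fullname, packages):
    """Render a small debian-installer pre-seed fragment as text.

    Hypothetical helper: it covers only the account and package-selection
    questions discussed in the thread, not a full unattended install.
    """
    lines = [
        # No separate root account; the first user gets root via sudo.
        "d-i passwd/root-login boolean false",
        f"d-i passwd/user-fullname string {fullname}",
        f"d-i passwd/username string {username}",
        # Keep the base selection small to limit the image size.
        "tasksel tasksel/first multiselect standard",
        # Extra packages on top of the base system (assumed selection).
        "d-i pkgsel/include string " + " ".join(packages),
    ]
    return "\n".join(lines) + "\n"


# Assumed values: a single pre-configured user and an XFCE desktop.
print(render_preseed("user", "Debian User", ["xfce4", "sudo"]), end="")
```

Keeping the answers in a small generated text file like this would make
the setup easy to review and to adapt per application, which is one
argument for 2) over a binary master image.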

ATM I don't know what is best. 4) has the advantage that it has the
lowest threshold to start working.

Also, people can see what is being done and exactly where things
break so you get better bug reports.

I apologize for this rather long message, but I wanted to provide
something we can hopefully talk about.

I'd be glad to hear if anyone already did a similar thing, or if anyone
is interested in doing it now, and of course any technical advice.

Many important packages that run on linux have "why isn't there a
Windows version?" in the FAQ.  A few (R, the S-plus clone, and
ghostscript) do have ports, but maintaining a Windows port takes
a huge effort and is dangerous to the health of smaller projects.

Additionally I'm interested in comments about how we could ensure the
educational aspect of this project. I'd love it if we could transport the
message that whatever problem one has, in whatever environment, Debian is the
universal solution (I read that somewhere ;).

There are also advantages to running certain linux apps in linux hosted VM's.
You can test the impact of proposed changes before updating your production
machines, etc., or as in my case, check bugs found when running the app on
SGI IRIX64 against the linux version the developers use.   For this to
be useful, you need to be able to acquire the VM's transparently, with
minimal effort and resources.

I suggest moving the discussion away from a specific app towards the problem
of providing "throwaway" VM's.


--
George N. White III <aa056@chebucto.ns.ca>
Head of St. Margarets Bay, Nova Scotia


