
Re: Publishing raw generic{,cloud} images without tar, and without compression, plus versioning of point releases



On 5/25/20 7:36 PM, Bastian Blank wrote:
> On Mon, May 25, 2020 at 02:21:48AM +0200, Thomas Goirand wrote:
>>>> So I was wondering if we could:
>>>> 1/ Make the resulting extracted disk smaller. That'd be done in FAI, and
>>>> I have no idea how that would be done. Thomas, can you help, at least
>>>> giving some pointers on how we could fix this?
>>> Fix what?
>> The fact that the raw image is 2GB once extracted, when it could be
>> 1/4th of that.
> 
> Please provide a prototype.

I already linked to a description of how it could be done. I just asked
(mostly Thomas) whether FAI is able to do it by itself; let's wait for
his answer.

>>>> 2/ Published the raw disk directly without compression (together with
>>>> its compressed form), so one can just point to it with Glance for
>>>> downloading. BTW, I don't see the point of having a tarball around the
>>>> compressed form, raw.xz is really enough, and would be nicer because
>>>> then one can pipe the output of xz directly to the OpenStack client (I
>>>> haven't checked, but I think that's maybe possible).
>>> No. Nothing in the download chain supports sparse files, so unwrapped
>>> raw images are somewhat out of the question.
>> I've done this for 3 Debian releases [2], I don't see why we would lose
>> the feature because of a "sparse files" thing which you somehow find
>> important. Truth is: nobody cares about storing the raw image as sparse
>> on an OpenStack cluster because:
> 
> Truth is: Nobody cares about OpenStack while we are talking about how to
> store images on our own infrastructure.

We're already wasting so much space with daily testing/sid images, and
it didn't seem to be a problem so far. As I also pointed out already,
we've done this for the last 3 Debian releases, and so far it hasn't
been an issue at all (and those images were 2GB, while I'm proposing to
reduce them to 512MB).

> Maybe ask Glance to process the images as needed? Wait, it already is
> able to do that.

Able to extract the .raw file from a tar.xz? I don't think so. If it
can, please provide the command line to do that; as far as I know, it's
currently not possible. I'd love to be wrong here...
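For reference, this is what a client currently has to do to get the .raw out of the published tar.xz — a minimal Python sketch (file names are hypothetical), streaming so the archive never needs to be fully decompressed to disk first:

```python
import tarfile

def extract_raw(archive_path, dest_path):
    """Stream-extract the first *.raw member from a .tar.xz archive."""
    # mode "r|xz" reads the archive as a forward-only stream, so the
    # xz layer is decompressed on the fly instead of seeking around.
    with tarfile.open(archive_path, mode="r|xz") as tar:
        for member in tar:
            if member.name.endswith(".raw"):
                src = tar.extractfile(member)
                with open(dest_path, "wb") as dst:
                    while chunk := src.read(1 << 20):  # 1 MiB chunks
                        dst.write(chunk)
                return member.name
    raise FileNotFoundError("no .raw member found in archive")
```

The point is that the tar layer forces exactly this extra unwrap step on every consumer, whereas a plain raw.xz needs only an xz decompression pass.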

> The images have 700-800MB of space used, which is still three times the
> size of the qcow2 file.  Why do you refuse to use what's already there?

I already explained: because downloading over ADSL, extracting, then
re-uploading to a cloud provider is a stupid way of doing things, when
one could just point the cloud provider at the URL of the artifact and
let it do the download.

Also, as I wrote earlier, it's not 700-800MB, but less than 512MB, and
we could reduce the size of the raw images to that.
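The pipe-through workflow I keep referring to — decompressing the raw.xz on the fly while handing the bytes to the uploader, with no intermediate file — is trivial; here is a sketch using Python's stdlib lzma module (the upload callback is a hypothetical stand-in for whatever the Glance client does with the stream):

```python
import lzma

def stream_raw_xz(xz_path, upload_chunk, chunk_size=1 << 20):
    """Decompress a .raw.xz file chunk by chunk, feeding each
    decompressed chunk to upload_chunk() (e.g. an image upload)."""
    total = 0
    # lzma.open() gives transparent streaming decompression: only
    # chunk_size bytes of decompressed data are in memory at a time.
    with lzma.open(xz_path, "rb") as f:
        while chunk := f.read(chunk_size):
            upload_chunk(chunk)
            total += len(chunk)
    return total  # decompressed size in bytes
```

Nothing here ever needs the full 2GB (or 512MB) raw image on local disk, which is the whole argument for raw.xz over a tarball.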

>> - the users that would download raw OpenStack images would be mainly
>> those willing to store them with Ceph as a backend (where sparse files
>> don't exist anyway, unless I'm mistaken).
> 
> Sorry, but this is bullshit.  Ceph RBD will try to store only the
> minimal amount of data.  A RBD image is sliced into a lot of raw objects
> and missing ones are considered holes.  It's even written in the
> manpage:
> 
> | Create a new image and imports its data from path (use - for stdin). The
> | import operation will try to create sparse rbd images if possible.
> (https://docs.ceph.com/docs/master/man/8/rbd/)

When saying "bullshit", this doesn't imply that I'm mistaken, but that
I'm *WILLINGLY* trying to mislead my readers. This clearly isn't my
intention, and you know it, so it feels insulting. This is uncalled for.
Moreover, you still don't understand what I'm trying to tell you, and
you don't know any better. So please drop this condescending tone of
yours; it's annoying not only to me, but to everyone reading you.

Now, while what you're saying about RBD may be right, this still doesn't
help OpenStack users, as the images must be uploaded through Glance, and
sparse files aren't understood by glanceclient.
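To make the sparseness point concrete: a raw image is mostly holes, which occupy no blocks on a filesystem that supports them, but the holes are gone the moment the file passes through a channel that doesn't (an HTTP download, or an upload client that reads it as a plain byte stream). A small Linux-oriented sketch showing apparent size vs. actually allocated space:

```python
import os
import tempfile

def sparse_sizes(apparent_size=10 * 1024 * 1024):
    """Create a sparse file and return (apparent size, allocated bytes)."""
    fd, path = tempfile.mkstemp()
    try:
        os.lseek(fd, apparent_size - 1, os.SEEK_SET)  # seek past a hole
        os.write(fd, b"\0")                           # one real byte at the end
        st = os.fstat(fd)
        # st_blocks counts 512-byte blocks actually allocated on disk,
        # so for a sparse file this is far smaller than st_size.
        return st.st_size, st.st_blocks * 512
    finally:
        os.close(fd)
        os.unlink(path)
```

A tool that copies this file byte by byte produces a target whose allocated size equals the apparent size — which is exactly what happens to a "sparse" raw image once it goes through a non-sparse-aware transfer.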

>>>> One thing though: if I understand well, artifacts are first stored on
>>>> Salsa, and currently, there's a size limit. What is the max size? If I'm
>>>> not mistaken, it's 1GB max, right? If that's the case, then maybe
>>>> that's a problem with the current 2GB decompressed disk.raw image.
>>> It's 250MB.
>> Then how are the ppc64el images generated? (they are bigger than this)
> 
> No, they are not.  It's just 180MB:
> 
> | $ xz -vk5T0 *.tar
> | debian-sid-generic-ppc64el-official-20200525-274.tar: 184.0 MiB / 889.7 MiB = 0.207, 8.1 MiB/s, 1:49
> (https://salsa.debian.org/cloud-admin-team/debian-cloud-images-daily/-/jobs/762303)

That's not the location I had in mind, but this one:
http://cdimage.debian.org/cdimage/cloud/buster/20200511-260/

In there, debian-10-generic-ppc64el-20200511-260.qcow2 (for example) is
more than 256 MB. How was it generated? Directly on Casulana?

Cheers,

Thomas Goirand (zigo)

