
Re: jigdo-port/lite/lite2win (Re: web pages)

On Sun, 20 Jan 2002, Richard Atterer wrote:

[Sorry for the late reply. My mail-R/W intervals are short and sparsely
distributed in time. This message was written during several of these
intervals the past few days. I just read that jigdo 0.6.2 is available, but I
haven't had time to check that out yet (and probably will not before the end
of next week), so everything here concerns 0.6.1.]

> On Fri, Jan 18, 2002 at 01:59:11PM +0100, J.A. Bezemer wrote:
> > As far as I'm concerned, jigdo has just left beta stage. (That's the
> > jigdo scheme, not necessarily jigdo 0.6.1 ;-)
> Hm, for *me* jigdo will be finished the moment my mother can download
> a CD image without me giving her any instructions! ;-]

I'll keep that in mind ;-)

> BTW, Anne (or anybody else who is interested), should you have more
> spare time in the future, here are some fun projects:
>  - Hack debian-cd to allow output of .jigdo/.template files without
>    saving a temporary image to disc. A must for DVD jigdo generation.

Done, will commit shortly.

>  - Add DVD support to debian-cd (trivial AFAIK?)

If you want an iso9660 on the DVD, then trivial (SIZELIMIT to 4GB or so)
(eh, if this is used in shell calculations it might not work, use bc instead).
BUT at least Macs will only recognize a UDF filesystem on DVD; you have to do
something special to make it recognize iso9660. AFAIK we have UDF R/W support
in recent kernels but that needs a full partition/disk/loopfile, nothing like
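
The SIZELIMIT/bc remark above comes down to integer width. Here is a minimal
sketch of why a 4 GB limit can break shell arithmetic. It assumes the limit is
given in bytes and that the shell computes with signed 32-bit integers; both
are assumptions, and the names below are only illustrative.

```python
# Assumption: SIZELIMIT is expressed in bytes; 4 GB taken as 4 * 1024^3.
SIZELIMIT_BYTES = 4 * 1024**3
# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def wrap_int32(value):
    """Emulate signed 32-bit wraparound, as a shell limited to 32-bit
    arithmetic would silently do."""
    return (value + 2**31) % 2**32 - 2**31


# The limit does not fit into 32 bits ...
print(SIZELIMIT_BYTES > INT32_MAX)
# ... so a 32-bit calculation sees a wrapped value instead of the real one.
print(wrap_int32(SIZELIMIT_BYTES))
```

An arbitrary-precision tool such as bc does not have this limit, which is why
it is the safer choice for the calculation.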

>  - Hack mkhybrid to output .jigdo/.template instead of raw ISO9660
>    data. This would make the template generation a lot faster -
>    imagine daily "testing" DVD images!

I don't think that'll work well. The stuff on CD has (or can have) quite
different paths than on the mirrors, so forget the .jigdo. We _can_ do the
.template (i.e. only the md5sums) but then there is the problem that mkisofs
doesn't know which files were generated (README, Packages) and which weren't.
(hardlink count or symlink-outside-tree might do the trick, but that isn't
really reliable). The current piping works very well and circumvents all these

On the other hand, you might use isoinfo or some specially patched mkisofs to
provide hints to jigdo-file so that, for specified offsets, it tries one
specific file first before trying everything else. To use this with jigdo on
the fly, it'd be optimal if jigdo-file could actually parse the iso's
directory structure... 

>  - A CGI script (or mini web server, or Apache module...:) which
>    assembles the CD image on the fly from .jigdo, .template and a
>    local mirror. Essentially, this allows mirror admins to offer
>    direct HTTP downloads without the need to store the complete images
>    locally. (HTTP/1.1 ranges support would be cool.)

Have jigdo-file support --start=byte --end=byte and it can probably be done in
a small perl script (but my perl is read-only ;-)
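
To make the idea a bit more concrete: the script's main job would be to turn
an HTTP/1.1 Range header into a start and end offset. A minimal sketch of that
step follows, in Python rather than Perl. The function name is hypothetical,
only single ranges are handled, and treating the end offset as inclusive is an
assumption; --start/--end are the options proposed above, not something
jigdo-file is known to support.

```python
import re


def parse_byte_range(header, total_size):
    """Return (start, end) as inclusive byte offsets for a single-range
    header such as "bytes=0-499", or None if it cannot be satisfied."""
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", header.strip())
    if not match:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the image.
        length = int(last)
        if length == 0:
            return None
        start = max(total_size - length, 0)
        end = total_size - 1
    else:
        start = int(first)
        # Open-ended form "bytes=N-" runs to the end of the image.
        end = int(last) if last else total_size - 1
        end = min(end, total_size - 1)
    if start > end or start >= total_size:
        return None
    return start, end


# Example: a client resuming a download of a 1000-byte "image".
print(parse_byte_range("bytes=500-", 1000))
```

The two numbers returned are what would be handed to the proposed
--start/--end options; everything else in the script is plumbing.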

> [snip]
> > It didn't compile on my potato box, 'cause it requires libdb3 and
> > sstream.h. libdb3 was out-configurable, but replacing sstream by
> > strstream did compile but produced weird output in the list files.
> /me awards himself the Portable Programming of the Month award. :-/
> OH NO! It is definitely my intention to make jigdo-file compilable on
> potato! After I noticed that sstream worked with GCC 2.95 under
> testing, I started to use it. I didn't know that the GCC 2.95 in
> potato does not cope with it.
> I'll try to remove the dependency on sstream soon. As for the other
> C++ features that many C++ compilers out there probably don't support
> yet: The C++ ISO standard has been out for three years, I can't be
> bothered to enter "write once, test everywhere" mode. GCC 2.95 or 3 is
> available for all the important arches...

Just to compare: (unextended) Pascal was standardized before 1990, but you
won't find many (if any) machines with any Pascal support installed, even
though there's an excellent gpc. Availability isn't the issue; the willingness
of the sysadmin to install it is.

> [snip]
> > Now for the future. I don't plan to do anything to
> > jigdo-port/lite/lite2win any more (unless I've made some stupid
> > mistake somewhere) except fanatically using it. It's my idea that
> > -port and -lite get merged into the "official" jigdo distribution. 
> > Please NOT hidden in some contrib/ dir, _many_ people need it. Maybe
> > also -lite could better be placed in its own top-level dir, along
> > with its README and myprintf.c, and not in scripts/ along with all
> > those things that "end users" aren't interested in.
> That's fine with me, as soon as the remaining issues (see below) are
> resolved! But hopefully you won't just "dump" this on me and then
> disappear? :) Your shell scripting expertise and test machines will be
> needed for future modifications to the script!
> But I'd rather not include the zlib source in the jigdo source
> distribution - doing that increases its size quite a bit, also it's
> easy for people to download it...

Well, jigdo-port links statically to it (correct dynamic linking is a real
portability mess), so people compiling jigdo-port NEED to have the sources,
one way or another, so it's very handy if they already have a tested&working
version available. But if size is the problem, just delete all zlib's subdirs
and it'll still work. Saves ~450kB unzipped. Don't forget to update the last
few lines of jigdo-port's README.

> Should jigdo-port.c use the configure mechanism?

No. Configure adds a POSIX.2 requirement (shell) to what only needs POSIX.1
(C compiler) and probably not even that. And there's only one thing that needs
configuring (WORDS_BIGENDIAN) which jigdo-port tests itself. KISS.
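
For illustration, a byte-order self-test needs very little. How jigdo-port
actually does it is not shown in this mail; the following is just a sketch of
the general run-time approach, written in Python, with a hypothetical function
name.

```python
import struct
import sys


def detect_byteorder():
    """Return 'little' or 'big' by looking at how the native byte order
    stores the 16-bit integer 1."""
    # "=" selects native byte order with standard sizes; "H" is 16 bits.
    first_byte = struct.pack("=H", 1)[0]
    return "little" if first_byte == 1 else "big"


# Cross-check against what the interpreter itself reports.
print(detect_byteorder(), sys.byteorder)
```

A test like this replaces the one configure check that would otherwise be
needed.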

> > Mainly to Richard (already mentioned in the READMEs as maintainer --
> > unless you don't want that ;-) : be _very_ careful with -port and
> > especially -lite. EVERYTHING has a good reason, even if that usually
> > isn't documented.
> That's what I really hate about shell scripting (and why I'd prefer to
> leave future changes to the script to you) - making any non-trivial
> shell script portable is a nightmare!

Replacing your man pages with
will help a lot. And everything in there now does work, so reusing it in
exactly the same way will still work. Or just don't make any changes at
all ;-) 

> > Short-term future: You can take over the online menu stuff; look in
> > my mess called public_html/jigdo on open how I did things.
> The menu stuff is the one big thing where I don't like what you've
> done. The idea I had was eventually to have just *one* jigdo file per
> Debian CD release. That file would contain info on all the CDs plus
> the mirror list. IMHO this is plain and simple, and easy to understand
> for users. There's no reason to spread the info across a menu.jigdo,
> mirrors.jigdo, binary*.jigdo; that only increases the complexity.

I already suspected that you were planning that. But please NO! And I'll tell
you exactly why. 

The 2.2 rev5 jigdo files combined are 3 MB unzipped and 1.02 MB gzip -9'd [*].
This is for 28 images. With woody, we'll have >80 images, so the complete
jigdo will be about 3 MB even when gzipped.

You know how long this takes to download over a modem? Dual-ISDN (128 kbit) 
does about 1 MB per minute, so after you start jigdo you'll have to wait a
full three minutes to get just the menu. With the 56k modems doing free local
calls in the US it takes more than twice as long. Even fast cable modems (150
KB/s) will have to wait longer than half a minute. 
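
The arithmetic behind these figures can be checked with a short sketch. It
assumes the gzipped size scales linearly with the number of images and ignores
protocol overhead and modem compression, so the results are rough estimates
only; the function names are illustrative.

```python
def scaled_size_mb(size_mb, images_now, images_future):
    """Extrapolate the combined .jigdo size, assuming linear growth
    with the number of images."""
    return size_mb * images_future / images_now


def download_seconds(size_mb, link_kbit_per_s):
    """Transfer time for size_mb megabytes over a link of the given speed.
    1 MB is taken as 1024 * 1024 bytes, 1 kbit as 1000 bits."""
    bits = size_mb * 1024 * 1024 * 8
    return bits / (link_kbit_per_s * 1000)


# 1.02 MB gzipped for 28 images, extrapolated to 80 images.
woody_mb = scaled_size_mb(1.02, 28, 80)
print(f"estimated size: {woody_mb:.2f} MB")

# Time to fetch 3 MB over dual ISDN and over a 56k modem.
for name, kbit in (("dual ISDN", 128), ("56k modem", 56)):
    minutes = download_seconds(3, kbit) / 60
    print(f"{name}: {minutes:.1f} minutes")
```

Under these assumptions the size estimate lands just under 3 MB, and dual ISDN
needs a bit over three minutes, with the 56k modem taking more than twice that.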

This is a BIG annoyance (sorry, I mean it) -- and that while I'll only use
less than two percent of it!! (approx 1/80'th to be "exact")

So what I've effectively done is split your original .jigdo idea into the
"real" .jigdo, a separate selection/menu system, and a separate mirror list.
You never download anything that you won't use (except the mirror list if you
use the same mirror as before; hmm, should move that to _after_ the
mirror-location questions and only download if a search action is requested). 

Have you actually used the new jigdo-lite? Did you ever need to select ONE of
those separate files? No. You had to select a CD image. Everything else was
done "under the hood". I left the wget messages visible (originally I had
--quiet!) just because that's the most usual problem cause and I didn't like
to write my own error handling for it. (Best approach: no --quiet but output
to logfile and get the error message from there if necessary.) 

From the user's point of view, the entire menu is one single entity, but it
can actually come from many different places and all image providers can
maintain their own piece. For example, I've added pointers to Attila's
woody/sid snapshots at ftp.fsn.hu; that menu is on cdimage.d.o at this moment,
but he can take over himself whenever he likes. The menu also includes the
2.2rev5 update CDs and the net-install CDs.

And all that time, the user doesn't have to know the concept "jigdo" at all,
he/she just downloads the CD he/she wants. Simple. Can your mother understand
that? Or does she want to choose the right one from a list of 80+ CDs (Ugh!)
and need to cut'n'paste another .jigdo URL if she wants the update CD, or the
netinst CD, or one of the latest woody/sid versions? Really?

The separate mirrors.jigdo has the added advantage that Attila (or anyone) 
doesn't have to worry about it any more. There is one well-maintained
(hopefully) mirrors list for Debian, and everyone can use it.

[*] gzip: yes, that's portable: zlib comes with its own minigzip. However,
current jigdo-file does not yet have support for it.

> (So why don't I offer just /one/ .jigdo? Laziness... ;-)

No, you're unconsciously doing just the right thing ;-)

> The "Info" label in the [Jigdo] section would contain an introductory
> text along the lines of "choose the correct arch, you only need one
> out of binary-1 and binary-1-NONUS", etc. When the user selects one
> image, that image's "Info" label would be shown.

If you want fancy things like HTML formatting and "tooltip"-style hints, the
menu system can easily be used for that. Use TitleHTML= and OptionHint= or
even OptionHintHTML=. Easy. And jigdo-lite won't even notice.

> Your menu structure also has the problem that it's a bit too flexible
> - making a GUI app parse and display it wouldn't be impossible, but
> I'd rather keep things simple and have just one list (generated from
> the "ShortInfo" labels of all the "[Image]" sections) showing all the
> available images.

I refuse to believe that it would be hard to implement in a GUI app. If I can
parse it in <100 lines of (portable!) shell code, then it can't be difficult.
Think "web browser": instead of exiting the selection on the first mouse
click, just fill the selection window again with another list.

> Another, minor point: Hard-coding the initial .jigdo URL into the
> script doesn't seem right to me. Of course you can always download to
> a file and supply that, but that will not be obvious to new users who
> want to download non-Debian related stuff.

Oh, and Netscape and Mozilla and IE and Opera and whatever do not have
hard-coded "home" URLs? 

The purpose of all these "home" sites is that people can start using the
browser (and jigdo-lite) immediately. If they need another page (menu),
they'll get the URL from whoever told them. Note that you can also enter
menu.jigdo URLs directly on any menu prompt, and it'll go just there.

And the "home" menu isn't really fixed. I mean, you can have your official
jigdo source point at a general menu (with Debian accessible as [main]->
Linux->Debian), the Debian package starting with Debian directly, and the Red
Hat package starting with their own stuff.

(Okay, so the "home menu" of jigdo should be in the .jigdorc. That's for the
next version.)

> > On the future of the electronic Debian CD image distribution: I
> > propose that cdimage.debian.org will stop offering the CD images via
> > rsync, as soon as possible.
> We discussed this before - essentially, despite being flattered, I
> still disagree. ;) Let rsync and HTTP/FTP mirrors co-exist with jigdo
> throughout "woody==stable" - that way, there's enough time to improve
> jigdo and fix bugs without being flooded by angry users.

I haven't encountered any bug yet (except that it didn't compile on !=woody). And
the only way to get good testing is to force people to use it. Just like I did
with the Kit and what Linus did with 2.4...

With that in mind, I'd very much like to see the webpage organized slightly
differently:

  "Want a CD?"
   o  netinst CD --> netinst page  (okay now)
   o  buy --> vendors list  (okay now)
   o  download the smart way --> something like
        http://cdimage.debian.org/~costar/jigdo/ but without the first
        paragraphs (i.e. only the download links & quickstart help)
   o  download the old-fashioned way --> http/ftp list  (okay now)

Start pushing. Now.

> > Jigdo does a _much_ better job: the biggest template is only 8 MB
> > (alpha binary-1) and offering the templates via HTTP shouldn't
> > produce any significant load.
> Beware - you need to include debian-keyring.tar.gz (~4MB) in the
> .template, because it changes quite often on the FTP server!

The doc/ version you mean. Yes, doc/ is allowed to change at any time. But the
question is of course if we need the keyring from doc/ on the CDs, as we've
also got it as a regular .deb package that doesn't change.

> > So for end users we'll have two download methods: jigdo and
> > HTTP/FTP. First-tier mirrors (mirroring from cdimage.d.o) can only
> > use jigdo. The Pseudo-Image Kit and rsync download/mirroring will be
> > discontinued.
> There is no compelling reason ATM to discontinue rsync, so IMHO we
> should not do it. Of course, when DVD images become available, there
> won't be any way of providing them as raw images.

This is the first time I'll publicly admit it (because there's a better
alternative ;-) : one rsync download == load +1 for 10-20 minutes on the
server and much disk activity during that time. I guess you haven't witnessed
the load on open when potato was released. With ONLY the first-tier mirrors
having access it was almost impossible to get through, and mirrors had to
mirror from each other and communicate over this list who had what image
available where. I'd like to avoid that now, and only doing jigdo will
accomplish just that. 

Of course, if individual mirror maintainers want to "blow up" the full image
and make it available, that's their responsibility. And their system load and
their network traffic. We should just not do it on the master site.

> > So what I've done for the currently-still-available-via-jigdo 2.2
> > rev4 images is extracting the needed files directly from the .iso's
> > and putting them in my public_html on cdimage.d.o, and rewriting the
> > .jigdo files. This works great.
> ...as long as you still *have* the original images!

The only other option is to hardlink every single file on those images
during creation, which (obviously) wasn't done with rev4.

> > You'll also see that I've radically changed the Filenames that are
> > suggested by the .jigdos. Since there's no need any longer to keep
> > these names the same for mirroring purposes, I figured it wouldn't
> > hurt to make them more descriptive.
> Agreed in principle, but
>  - The names should be identical to those of the .iso files, else it's
                                                             ^ on the HTTP/FTP
mirrors you mean?
>    a bit confusing. So once we change the jigdo names, the .iso names
>    should change as well.

Not necessarily. The names on HTTP/FTP aren't visible to people using jigdo,
and vice versa. Mirror maintainers offering HTTP/FTP downloads of full images
should use the names of the appropriate .template files (which can be anything
since the end user doesn't ever see them).

>  - Let's not do this change now, in the middle of 2.2, let's switch
>    over to a new naming scheme for 3.0

For HTTP/FTP: IF we switch then 3.0's release is the right time.
For jigdo: nothing depends on it (besides user-friendliness), so we can
change at any time.

> > Finally one technical issue. Richard: I really don't know why you
> > introduced that quoting mess in the .jigdo format specs.
> :-)
> It's because sooner or later there will be support for switches after
> the label values. Things like different --priority values for
> different mirrors, --referer to allow you to upload jigdo stuff to
> geocities.com ;), --jpeg-steg to extract data out of a JPEG, --decrypt
> with a passphrase, --make-coffee, --dominate-world, ... endless
> possibilities!

Okay, switches. But do we need switches in the Info/ShortInfo? Don't think so,
so why use quotes there? Gets messy very quickly if you want HTML in Info=
(note:  I'd rather have that as InfoHTML then, and Info as plain text
rendering, like the ALT= tags). Define both switches and quoting for
individual labels, not for the entire file.

> Last words: I'm graduating from uni in two months. I'll be very busy
> indeed, don't expect any major updates anytime soon. :-/

And I was expected to graduate six months ago...

  Anne Bezemer
