Re: Debian (would like) to do list

[moving the thread to debian-devel --- see Mail-Followup-To:]

On Fri, 26 Jul 2002 16:07:41 -0500 (CDT),
Drew Scott Daniels <umdanie8@cc.UManitoba.CA> wrote:
> I would like to become a Debian developer to help accomplish these tasks,
> but my time is limited and I do not need to be a developer to help if some
> developers pick up these tasks. Also my computing resources are limited so
> projects like scanning source code and brute force "now" checking of
> packages would be too time consuming without help or more resources.
As you said, you can help Debian even if you don't have a Debian
account.  Why don't you pick a package you are interested in and
check it?

> I'm not sure what the appropriate forum for discussing my list is, but
> debian-user seemed to be the best fit as I am a user.
debian-devel is a better fit.

> First some clarification. When I say "before, after, now" I mean that the
> uploader should(?) check this before uploading (perhaps this can be
> automated), the archive maintainers or upload procedure should(?) check
> this after it has been uploaded (perhaps this can be automated), and this
> should(?) be checked for now to catch any violations that have been missed
> (perhaps this can be automated).

> Debian related tasks:

> QA and improvements:
> Continue the spell check campaign and look at improving it (before, after,
> now)
> Add grammar checking (before, after, now).
Go ahead, do it yourself.

> Add watch files to as many package sources (or diffs) as possible (before,
> now, after may be unnecessarily complex).
Patches welcome.

> Add Trove descriptions to packages and source (before, after, now). This
> would be nice. Perhaps this may help improve the trove format.
I don't know why Debian should bother about
http://www.tuxedo.org/~esr/trove/ .

> Why are packages removed from the archives? There are many reasons, but
> sometimes it's hard to find out. There should be some way of recording
> this especially for those who track unstable on an infrequent basis.
> Perhaps an entry into the Debian BTS under the package name?

> Packages should purge configuration files before purging directories
> otherwise empty directories can be left behind. (now)
If a package foo has package.d/ style configuration files and another
package bar creates a configuration file in package.d/, foo must not
remove package.d/bar .

> Scanning package descriptions, documentation and other package related
> areas for URL's and seeing if they are active URL's. (before, now)
Go ahead, do it yourself.
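
A minimal sketch of how such a URL scan could start, in Python.  The
regular expression and the helper names extract_urls and is_alive are
assumptions made for illustration, not part of any existing Debian
tool.  Only the extraction step runs offline; is_alive needs network
access and is deliberately not called here.

```python
import re
import urllib.error
import urllib.request

# Rough pattern for http/https/ftp URLs; an assumption, not a full URL parser.
URL_RE = re.compile(r"""(?:https?|ftp)://[^\s<>"')]+""")

def extract_urls(text):
    """Return the URLs found in text, in order, without trailing punctuation."""
    return [m.group(0).rstrip(".,;:") for m in URL_RE.finditer(text)]

def is_alive(url, timeout=10):
    """Return True if the URL can be opened (needs network access)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, ValueError, OSError):
        return False

# Hypothetical description text used only for the demonstration.
sample = "Homepage: http://www.debian.org/ . See also (ftp://ftp.debian.org/debian)."
print(extract_urls(sample))
```

A real scan would feed each Description: field (and documentation
file) through extract_urls and report the URLs for which is_alive
returns False.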

> Check to see if a package depends on a pseudopackage, transitional (also
> dummy packages?) or other package that will be removed from the archives.
> (before, after, now) Should Debian have a way to mark packages that are
> going to be removed from the archives, pseudopackages, transitional and
> dummy packages? A common word to describe such packages may help users to
> better identify these packages and deal with them (users may want to
> remove them, developers must want to depend on other packages).
Why don't you read the Description: field of installed packages?
grep-status is your friend.
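
For readers without grep-status at hand, here is a rough Python sketch
of the same idea: read dpkg-status-style stanzas and list the packages
whose Description: mentions "transitional" or "dummy".  The keyword
list, the function names and the sample stanzas are assumptions for
illustration; this is not how grep-status itself is implemented.

```python
import re

# Keywords assumed to mark transitional or dummy packages.
KEYWORDS = re.compile(r"\b(transitional|dummy)\b", re.IGNORECASE)

def parse_stanzas(text):
    """Split dpkg-status-style text into a list of {field: value} dicts."""
    stanzas = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields, last = {}, None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and last is not None:
                # Continuation line: append it to the previous field.
                fields[last] += "\n" + line.strip()
            elif ":" in line:
                name, _, value = line.partition(":")
                last = name.strip()
                fields[last] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas

def transitional_packages(text):
    """Return names of packages whose description matches KEYWORDS."""
    return [s["Package"] for s in parse_stanzas(text)
            if "Package" in s and KEYWORDS.search(s.get("Description", ""))]

# Hypothetical sample data; on a real system read /var/lib/dpkg/status.
sample = """\
Package: foo-old
Status: install ok installed
Description: transitional dummy package
 This package can be safely removed.

Package: bar
Status: install ok installed
Description: a useful tool
 Does useful things.
"""
print(transitional_packages(sample))  # prints ['foo-old']
```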

> Check for bash specific pieces of shell scripts where it may cause
> problems such as in install scripts. (before, after, now)
> Checking for policy violations or better fits:
> Section 2.3.4 of the policy manual says:
> "Packages are not required to declare any dependencies they have on other
> packages which are marked Essential (see below), and should not do so
> unless they depend on a particular version of that package.", this should
> be checked for (before, after, now).
Go ahead, do it yourself.
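
The Essential check from section 2.3.4 is simple enough to sketch in
Python.  The function name, the sample Depends: value and the set of
essential packages below are all hypothetical; a real check would take
the essential set from the archive's Packages file.

```python
def unversioned_essential_deps(depends, essential):
    """Return essential packages named in a Depends: value without a version."""
    flagged = []
    for group in depends.split(","):
        for alt in group.split("|"):
            alt = alt.strip()
            if not alt:
                continue
            # The package name is everything before a version or arch qualifier.
            name = alt.split("(")[0].split("[")[0].strip()
            has_version = "(" in alt
            if name in essential and not has_version:
                flagged.append(name)
    return flagged

# Hypothetical input, for illustration only.
essential = {"bash", "dpkg", "grep"}
depends = "libc6 (>= 2.2.4), bash, perl | grep, dpkg (>= 1.10)"
flagged = unversioned_essential_deps(depends, essential)
print(flagged)  # prints ['bash', 'grep']
```

The versioned dependency on dpkg is not flagged, because the Policy
text allows depending on a particular version of an Essential package.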

> Check for packages that use old policies (before, after, now) and see if
> the policy version can be updated or what needs to be done and file bug
> reports against the package.
lintian does this.

> Check for contrib packages that can be moved to main (before, after, now).
> Check for non-free packages that can be moved (before, after, now).
Go ahead, do it yourself.  (Yes, I moved gpgp to main.)

> From dpkg (1.10.1) unstable's changelog:
> "* Add conflict with dpkg-iasearch which intruded on our namespace." by:
> -- Wichert Akkerman <wakkerma@debian.org>  Tue,  2 Jul 2002 12:34:07
> +0200. Is this a policy violation? Did dpkg-iasearch violate a policy?
This seems to be a first-come, first-served situation.

> Automate testing of policy musts and where approval must be met create an
> automated system for approval by people (may require authority structure
> to be created). Many parts of Debian policy say to get approval from
> debian-devel. I would like to avoid having people upload packages without
> explicit approval which an automated mechanism could check for. (after,
> now)
Debian does not need bureaucracy.

> Reducing the size of the distribution & packages, cleaning up, and backing
> up:
> Look at not only gziping documentation but also compressing other files
> such as png files using pngcrush or other files using other utilities.
> (before, after, now)
I don't think png files can be compressed well without reducing the number
of colors in them.

> Why not bzip2 instead of gzip? New upcoming algorithms are being worked on
> and there are known deficiencies in bzip2. See the bzip2 homepage and read
> about how the author thinks that he can make some significant
> improvements. Also see http://www.compression.ca for some comparisons of
> archives and note that PPM variants compress things more. CTW is pretty
> good too, but the algorithm that bzip2 is based on is lower on the list
> for compression ratio. Using bzip2 on source files is a wishlist item for
> Debian policy. I'm arguing that it's a good idea to look at algorithms
> other than gzip, but jumping on bzip2 may be a large transition that may
> be made unnecessary by another large transition to a new compression
> format. I'm hoping to help in the development of new compression formats
> some of which should have better performance than bzip2.
You don't know dbs or dpkg-source v2.  Read debian-devel.

> Section 2.4.1 of the policy manual says:
> "only the first three components of the policy version are significant in
> the Standards-Version control field, and so either these three components
> or the all four components may be specified." As this is a may, I would
> prefer the saved space over the acknowledgement of cosmetic differences.
> If the cosmetic difference is found to cause a meaning to change then a
> higher version number will be changed.
Replacing it with 3.5.6 saves only 2 bytes per package.  Rebuilding the
entire archive (which takes at least one week) to save at most 20k bytes
is too much.

> A policy for reducing the length of changelogs may help reduce package
> sizes. "Before, after, now" only after a policy has been chosen. I know
> changelogs can be needed and useful. Changelogs can also be useless and
> consume precious space, especially on minimal installations. Perhaps
> packages could have a ranking of what files in them are necessary? This
> may imply splitting the archives, but you can't split some files like
> changelogs as they are required(2.4.4) for every package.
I think changelogs should be more verbose.  You can remove
/usr/share/doc/package/changelog.Debian.gz (or even /usr/share/doc/ itself)
manually after install.

> Optimize ordering of files in tar archives. tar files are usually
> compressed, but if files of similar types are put closer together they can
> compress better. I am looking at a simple method using 'file' and sorting
> by 'file' type first, then looking at mime types, and then looking at
> doing some statistical testing for file information. I may also create a
> utility for using brute force to try every combination and then
> compressing them and checking for the best order. Note that this may be
> affected by concepts discussed in the gzip/bzip argument above as
> compression methods do prefer different orderings in different cases.
> (before, after, now)
Contact the dpkg maintainers.  Note that such brute force optimization
is a nightmare for autobuilders.
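
To make the proposal concrete, here is a small Python sketch that
builds an in-memory tar in two orders and measures the zlib-compressed
size of each.  It groups files by extension, which is only a crude
stand-in for the 'file'-type sort described above, and the member
names and contents are made up.  Whether grouping actually helps
depends on the data and the compressor, so the sketch only reports
the sizes.

```python
import io
import os
import tarfile
import zlib

def tar_size(members, order):
    """Return the zlib-compressed size of a tar built from members in order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in order:
            data = members[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return len(zlib.compress(buf.getvalue(), 9))

def order_by_extension(names):
    """Group similar files together: sort by extension first, then by name."""
    return sorted(names, key=lambda n: (os.path.splitext(n)[1], n))

# Hypothetical archive contents, for illustration only.
members = {
    "a.txt": b"plain text " * 200,
    "b.png": bytes(range(256)) * 8,
    "c.txt": b"plain text " * 200,
    "d.png": bytes(range(256)) * 8,
}
original = list(members)
grouped = order_by_extension(original)
print(grouped)
print(tar_size(members, original), tar_size(members, grouped))
```

A fixed sort key like this one gives the same order on every build,
which is not true of a brute force search over all orderings.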

> Removing unnecessary directories from package listings. Some .deb's
> contain lists of directories that they need. Even when it is not required
> that they list certain directories, they are still allowed to. (before,
> and now, but as this is a 'may' then not after)
Go ahead, do it yourself.  I don't think there are many.

> Detecting the 'want' for virtual packages (when many "depend" and/or
> "require" have or's, or a virtual package is provided by few packages).
> This may cause virtual packages to be either created or removed. (before,
> after, now)
This sounds like Provides: .

> Using upx or alike for minimal installs, boot disks, base? Making it an
> option? Perhaps this could be an option integrated into apt.
A working debian-installer has much higher priority.

> Some programs use static code for things like regex expressions and
> handling tar archives. A program to go through the source code of all the
> programs (or a developer effort) may help to find common code that could
> be put into a library or that already is in a library. This could make
> packages smaller, but if we're not careful, creating new libraries could
> increase the overall installed size for a program. (before, now) An
> additional benefit would be fewer places to change code (good for
> security, good for efficiency, good for all updates). Are there any
> security issues to exporting code from packages? (This should be looked at
> whenever code's exported.)
Do you know why apt-get works so wonderfully?
Have you ever heard anything about SONAME?

> Searching for more ways of removing unnecessary content from debs.
Why do you install unnecessary packages?

> Using a thesaurus such as Aiksaurus may help to reduce the size of
> descriptions. Shorter descriptions and more clear descriptions would be a
> good project (aka laconic's good). Automated tools could help (before,
> now).
I don't think newspeak is the best language for Description: .

> http://lists.debian.org/debian-mentors/1999/debian-mentors-199901/msg00051.html
> talks about putting datasets into Debian or non-free. I wonder what has
> become of this particular dataset and if there has been a policy developed
> for datasets. I would like to see astronomical, meteorological,
> geographical and other data sets easily available. If a data set is
> DFSG-free then I feel it should be put in main, but segregated somehow (in
> extra?). Data sets may require maintenance too. For example, recently new,
> more accurate data was collected about the distance certain stars are from
> our sun. When I did some more investigation, I found out that a vote to
> include a dataset section was made and it was decided to create a dataset
> section. No such section was created and the astronomical data is sitting
> with the person who made this proposal. Special handling of datasets may be
> required to reduce the impact on Debian distribution infrastructure. I
> recommend updates and distribution only be allowed through diffs or some
> other method that uses less bandwidth than is used now.
I didn't hear about such a vote.

> findimagedups and other such packages could be used to search for
> duplicate or near duplicate files in Debian packages. Then 'common' packages
> which have these files can be created and/or symbolic links may be used to
> save space. (before, now) Perhaps a program that makes symbolic links to
> common files where necessary?
Did you find any such duplicate files?
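
Answering that question for exact duplicates takes little code.  The
Python sketch below hashes every regular file under a directory and
groups paths with identical content.  It does not detect near
duplicates the way findimagedups is meant to, and the function names
and the demo files are assumptions for illustration.

```python
import hashlib
import os
import tempfile
from collections import defaultdict

def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root):
    """Return lists of paths under root that have identical content."""
    by_digest = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks so that existing links are not reported.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]

# Demonstration on a throwaway directory with made-up files.
with tempfile.TemporaryDirectory() as root:
    for name, data in [("a.png", b"same"), ("b.png", b"same"), ("c.txt", b"other")]:
        with open(os.path.join(root, name), "wb") as f:
            f.write(data)
    demo_names = [[os.path.basename(p) for p in group]
                  for group in find_duplicates(root)]
print(demo_names)  # prints [['a.png', 'b.png']]
```

Running find_duplicates over the unpacked contents of several packages
would show whether there is enough duplication to justify 'common'
packages.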

> Create Debian cleanup procedure and program(s) (cruft, deborphan...), now.

> Create Debian backup procedure and program(s) (debian cleanup, cruft to
> backup, dpkg --getselections > myselections, backup config files possibly
> checking md5's which more than should be in every package), now.
Use "tar cf etc.tar /etc".  If you want to verify the tarball later,
sign it with gpg.

> Creating a version of Debian that binds a writeable filesystem onto a read
> only filesystem (floppy writeable with a readonly CD). I would love to
> have this to carry around and run Linux on any machine with a CDROM drive
> that I could boot with. upx may be useful. A compressed filesystem for
> writing may be useful. Support for umsdos, NFS, samba, and/or mounting
> file systems, creating a file and mounting the file could be useful.
> http://www.debian.org/CD/faq/index#live-cd is something I later found. I
> would like to see more development, and official Debian development. Upon
> further investigation bootcd seems to be a start, but how much of this can
> it do? Maybe these features should be wishlist bugs, but the CD faq needs
> to be updated, and I would still like to see an official CD image.
Go ahead, do it yourself.

> Should CD images be optimized for space? I saw an option to optimize CD
> images for space in Roxio's Ez-cdcreator (formerly by Adaptec).
Does it work?  Is it free (in the sense of DFSG)?

> Security, Policy and other bug stuff:
> Automated rough security audit of all source code (rats, splint & other
> programs can be used, before, after, now).
_Rough_ security audit does more harm than good because it provides
a false sense of security.  Are you sure such an auditing tool has no bugs,
or are you volunteering for a line-by-line audit of all 10k+ packages?

> Programs that use keyahead or mouseahead routines may be a security risk
> or cause other undesirable results. One example is my apt-get using
> readline has keyahead so if I accidentally hit enter, the enter is saved
> until the next question and then inputted. Instead I'd prefer it be
> disregarded so I can read the arbitrary (it's hard to predict the order)
> question that appears next. Mouseahead can be very dangerous if the
> program hasn't updated the interface, the user will likely have no idea
> what they will have clicked on ("it didn't work. I'll click again. What? I
> didn't select that second option."). Yes, these are probably wishlist
> bugs, but they could be a normal bug as this can affect the desired
> functionality of programs. These bugs may also need to be tagged security in
> some situations (the default password gets set by accident, etc). This may
> tie into scanning code for security vulnerabilities. (With scanning before
> and after, but this should be checked for now, especially where system
> security can be involved).
Think before you type.

> 'popularity-contest' and other methods can be very helpful in finding out
> what users are interested in seeing being developed and maintained.
> Perhaps this, archive (mirrors too?) statistics and other methods can be
> used to create a priority list for the qa group. Perhaps a system should
> be put in place to allow user input into package importance/maintenance
> priority levels. Currently I would assume that a good system would be by
> the subtype such as essential, optional...
If you want some package well-maintained, help with it yourself.
Note that a "thanks" mail to a maintainer (or the upstream) is much more
appreciated than a vote on popularity-contest.

> Campaigning for signed debs to be a must (if not already). Signed debs
> more than should be used (before, after, now).
I have heard of signed Packages.gz .  Individual .deb files are not signed.
I don't know what the dpkg maintainers are planning for signed .deb files.
> Campaigning for md5 lists in debs to be a must. md5 values for all files
> in packages more than should be done (before, after, now).
I don't think md5 lists are worth a must clause of the Policy.
Why do you trust md5 lists in /var/lib/dpkg/info ?

> A procedure should be put in place to ensure installation starvation due
> to dependencies does not occur in the unstable distribution. (perhaps
> waiting a day for dependencies to catch up?) I feel this could be
> automated or automated better (before, after, now).
Testing is for this.

> Find a way to reduce the chance of bad NMU's (accidental, malicious,
> poorly done, etc.). I haven't looked into how this is done now (if at
> all), but the developer making the NMU should be warned that it's an NMU.
> It may be good to list NMU policy for the first time for an NMU by a
> specific developer and ask for confirmation. It may be good to have an
> automated system where maintainers can block NMU's except by permission of
> an authority such as the security or qa group.
I repeat: Debian does not need bureaucracy.  By the way, currently
the QA group is defined as "subscribers of debian-qa" or
"anyone who is interested in QA work".

> Joey says at http://www.debian.org/devel/website/todo#misc that security
> updates are on the same server as the signatures for the updates. This
> could be a potential security issue as if one method is exploited to
> change the files, it can be used to change the signatures at the same
> time. wyrmbait at debianplanet.org says in his article Security with apt
> (found at http://www.debianplanet.org/article.php?sid=643 ), that apt can
> be viewed as a single point of failure. While his arguments may not quite
> be thorough, he does bring up some issues of security. Why not have a
> package for keys/certificates, then have dpkg complain if a new package
> has not been correctly signed. Also packages in the archive should be
> signed by a public key that is available on many public key servers and
> available offline (on CD perhaps). Changes to the keyring packages would
> need to have the appropriate signature(s).
This sounds like signed Packages.gz .

> Packages being signed by multiple people and allowing users to assign
> trust levels (checked before installing an upgrade) to people could
> improve security.
Signing a package means "_I_ can't find a bug", not "there is no bug".

> I would like to encourage distribution of public key server media. Having
> keys stored online lends them to potential man in the middle attacks even
> if multiple protocols are used. It's much more difficult to circumvent an
> offline signature.
Why do you trust the post office (or whatever)?
Do you know about the web of trust?

> One of the reasons for the delay of the release of Woody was said to have
> been security concerns. It has also been reported (see the glibc example
> at http://www.debianplanet.org/article.php?sid=568 ) that it takes a long
> time for security patches to get through due to the compiling and testing
> on the 68k and arm architectures. I would like to bring forward the idea
> of using emulation to help speed things up. There was recently (March?) a
> patch for UAE (a 68k emulator) to support running Linux. There are also
> emulators for the arm architecture such as arcem (
> http://bugs.debian.org/cgi-bin/bugreport.cgi?archive=no\&bug=136844 for
> details ). arcem is said to be quite fast on Intel architecture. Emulation
> of old architecture brings two advantages and one disadvantage: it's
> usually faster, it's easier to get, it could have trouble being a
> completely accurate emulation of the original hardware (bugs not emulated
> or new bugs not yet found/patched).
The new security infrastructure seems to work fast enough.
Read the recent ssh story.

> Many of the patches and programs found at
> http://www.theaimsgroup.com/~hlein/haqs/ can be quite useful. The programs
> can be packaged. The patches, when useful, should be added to existing
> packages or modified to make them run time options. For example the idle
> connection traffic patch for ssh may be a useful option that may be
> possible to be chosen at runtime.
File an RFP bug against wnpp or a wishlist bug against the specific package.

> Performance issues:
> Someone mentioned the idea of ordering the startup scripts into a
> dependency tree, and have programs startup in parallel. I feel this would
> be useful for many people running many startup scripts such as myself.
> Perhaps this should be a before, after, now. If nothing else, it should be
> looked at now and a policy document regarding this may be useful. I forget
> which Debian developer I read this idea from.
Write working code and provide a transition plan.
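
As a starting point, the core of the dependency-tree idea is a
topological sort that groups scripts into "waves": everything in one
wave has all of its dependencies satisfied and could be started in
parallel.  The Python sketch below uses the standard graphlib module
(Python 3.9 or later).  The service names and their dependencies are
made up, and nothing here addresses the transition plan.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def startup_waves(deps):
    """Group services into waves; each wave can be started in parallel."""
    ts = TopologicalSorter(deps)
    ts.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while ts.is_active():
        ready = sorted(ts.get_ready())
        waves.append(ready)
        ts.done(*ready)
    return waves

# Hypothetical services: each key maps to the set of services it needs first.
deps = {
    "syslog": set(),
    "network": set(),
    "ssh": {"network", "syslog"},
    "apache": {"network"},
    "cron": {"syslog"},
}
print(startup_waves(deps))
# prints [['network', 'syslog'], ['apache', 'cron', 'ssh']]
```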

> Having package install in parallel may speed up installation. This may be
> a wishlist item for dpkg or apt.
It will break the system.

> Should CD images be optimized for speed? I saw an option to optimize CD
> images for creation speed in Roxio's Ez-cdcreator (formerly by Adaptec).
> Also speed of installation or other reads of the CD. Seek time might also
> be a consideration when choosing what order to put data on CD's.
Does it work?  Is it free (in the sense of DFSG)?
I think CDs are fast enough.

> Using programs like cmix, the performance of programs may be able to be
> optimized (before and now, but not after upload as optimizing programs may
> not work desirably).
Is it better than gcc -O flags?

> Sometimes threads come up about performance optimization done at compile
> time. Yes, numbers have not been provided, but some should be. As
> such a comparison of compilers of gcc, icc (Intel's compiler if allowed),
> tcc and any other compilers should be done. Binaries could be compiled
> with each available compiler and then checked to see which produces the
> best results in application performance, binary size and perhaps compile
> time. (Smallest binary size usually means high compile time and better
> application performance, or so I've been told.) This should be done before
> upload and now.
I think gcc -O2 is enough in most cases.

> Other sometimes bigger ideas:
> Should there be a method to force retirement of developers? I don't
> believe so, I believe that a new category should be created for developers
> who are not active. Why separate inactive developers? To limit the
> security risk and make managing developers easier.
They may be active again.  Insert a rant about bureaucracy here.
Any Debian developer can take over an unmaintained package as a last resort.

> A restructuring of the online distribution protocol is needed. Recently in
> the Debian Weekly news this was mentioned and this has been discussed. A
> BTS location may be a good place to start putting won't fix, wishlist and
> other bug information regarding the distribution protocol(s). Personally,
> I'd like to see low server loads, compressed files, deltas, and have
> upgrade priorities visible before downloading the package/archive.
Go ahead, do it yourself.

> It might be nice to make debian/watch files separately available and to
> have a watch file for all upstream sources even when it's version
> specific. It would also be nice to carry md5's for upstream sources (last
> known version of course) so when upstream sources get modified (like the
> dsniff security issue), users of watch files to grab the current source
> get a heads up that there may be something wrong.
A URL in Description: will help you.  I have no means of knowing the
md5sum of the next upstream release before it is released.

> Support for installing Debian via a netboot/bootp by distributing an
> official netboot image.
I think there are already such boot floppies.

> A comparison of xwindows terminals (or is it terminal emulators?) is
> desirable. xvt seems to have a smaller footprint than rxvt which, I
> thought, was supposed to be reduced xvt.
> http://dickey.his.com/xterm/xterm.faq.html has some starting information.
> This would be useful for creating a small RAM xwindows install.
The priority of a terminal emulator is 20.
See the Policy, section 12.8.3.

> Other related projects that aren't Debian specific:
> RATS for gnu assembly (note: intel2gas) may be more useful if it existed,
> but it doesn't, yet.
File an RFP bug.

> An open source grammar checker (not EBNF or alike) doesn't seem to exist.
> Openoffice lacks a grammar checker and does not plan to add one. A grammar
> checker is a major proofing tool that would be extremely useful to many
> people. I did find one open source grammar checker called Link Grammar
> http://www.link.cs.cmu.edu/link/ . I disagree with the evolution of
> English being too fast for creating a static grammar checker as many in
> the commercial world have done so.
Debian has nothing to do with the evolution of English.

> Update File's database. This would be useful for my projects looking at
> reordering files in tar archives and other compression projects of mine.
> (This may have to be a Debian thing as I don't see updates to the database
> very often.)
What kind of update are you talking about?

> Other related projects (to those discussed):
> I'm working on some compression algorithms (Charles Bloom at
> http://www.cbloom.com and xiph have some starting work of what I can do).
> I believe I can improve existing compression. I don't have much time as
> I'm a full time student and I need money to pay rent. I will be graduating
> with a computer science degree in December.

Oohara Yuuma <oohara@libra.interq.or.jp>
Debian developer
PGP key (key ID F464A695) http://www.interq.or.jp/libra/oohara/pub-key.txt
Key fingerprint = 6142 8D07 9C5B 159B C170  1F4A 40D6 F42E F464 A695

her occasionally near suicidal sense of loyal self-sacrifice
--- Luke Seubert, about what Rei Ayanami and Debian developers have in common
