Re: Bug#688210: condor: Multiple security issues

[CC the release team to get an opinion on incorporating bugfixes
 from upstream stable/bugfix releases during the freeze]

On Fri, Sep 21, 2012 at 09:11:56AM +0200, Moritz Muehlenhoff wrote:
> On Thu, Sep 20, 2012 at 01:55:52PM -0500, Jaime Frey wrote:
> > The commits were made on the V7_6-branch, then merged into the
> > V7_8-branch. We had to manually resolve conflicts during the merge,
> > as the affected code had been modified during the 7.7.x series.
> > Thus, there's no commit that can be cleanly cherry-picked. I can
> > provide patch files that will apply cleanly.
> > 
> > We should certainly get Condor 7.8.4 into Unstable. It only contains
> > bug fixes. I would prefer it if we could get it into Debian Testing
> > as well, but I thought we were too far into the freeze for that.
>
> During the freeze it's preferred to upload a 7.8.2~dfsg.1-1+deb7u1
> version to unstable, which only contains the isolated security fixes.
> This version can then be unblocked by the Debian release managers (by
> filing a bug against release.debian.org)

It is indeed preferred. However, while it makes perfect sense for many
projects to use this as a stabilization method, I think the situation
here is a little different.

Condor maintains two branches: stable (even minor version) and
development (odd minor version). The current stable series, 7.8, was
uploaded to wheezy to avoid shipping a development version in a stable
Debian release. Every single update to the 7.8 branch is a bugfix-only
release. If you look into the changelog you find:

	New Features:

for all 7.8.* releases after the one we have in wheezy right now. So the
purpose of the branch is identical to the purpose of the wheezy freeze
-- stabilization. In this particular case I find it difficult to see
why we would want one kind of bugfix but not the other, especially at
the cost of breaking stuff when having to backport the patches.
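
To make the even/odd convention concrete, here is a minimal sketch that
classifies a Condor version string by its second component. The function
name is hypothetical and this is not taken from Condor or any Debian
tooling; it only assumes the scheme described above.

```python
def is_stable_series(version: str) -> bool:
    """Return True if the minor (second) version component is even.

    Assumes the convention described above: even minor numbers are
    stable series, odd minor numbers are development series.
    """
    minor = int(version.split(".")[1])
    return minor % 2 == 0


if __name__ == "__main__":
    for v in ("7.8.2", "7.8.4", "7.9.0", "7.7.5"):
        kind = "stable" if is_stable_series(v) else "development"
        print(f"{v}: {kind}")
```

Under this convention both 7.8.2 (currently in wheezy) and 7.8.4 belong
to the same stable series, while 7.7.x and 7.9.x are development series.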

IMHO it makes perfect sense to base stabilization efforts for Condor in
Debian wheezy on top of the continuous work on stabilizing Condor 7.8
done by the Condor development team (with pretty impressive manpower).

Here is the current list of upstream fixes that we would not be shipping
in wheezy when following the preferred procedure (excluding those
already filed in the Debian BTS):

1. Fixed the condor_schedd daemon; it would crash when a submit description
   file contained a malformed $$() expansion macro that contained a period.
   (Ticket #3216).
2. Fixed a case in which a daemon could crash and leave behind a log file
   owned by root. This root-owned file would then cause subsequent attempts
   to restart the daemon to fail. (Ticket #2894).
3. Fixed a special case bug in which configuration variables defined utilizing
   initial substrings of $(DOLLAR), for example $(D) and $(DO), were not
   expanded properly. (Ticket #3217).
4. Fixed a bug in which usage of cgroups incorrectly included the page cache
   in the maximum memory usage. This bug fix is also included in Condor version
   7.9.0. (Ticket #3003).
5. Jobs from a hook to fetch work, where the hook is defined by configuration
   variable <Keyword>_HOOK_FETCH_WORK, now correctly receive dynamic slots
   from a partitionable slot instead of claiming the entire partitionable slot.
   (Ticket #2819).
6. Fixed a bug in which a slot might become stuck in the Preempting state when
   a condor_startd is configured with a hook to fetch work, as defined by
   <Keyword>_HOOK_FETCH_WORK . (Ticket #3076).
7. Fixed a bug that caused Condor to transfer a job's input files from the
   execute machine back to the submit machine as if they were output files.
   This would happen if the job's input files were stored in Condor's spool
   directory, which occurred if the job was submitted via Condor-C or via
   condor_submit with the -spool or -remote options. (Ticket #2406).
8. Fixed a bug that could cause the first grid-type cream jobs destined for a
   particular CREAM server to never be submitted to that server. This bug was
   probably introduced in Condor version 7.6.5. (Ticket #3054).
9. Fixed several problems with the XML parsing class ClassAdXMLParser in the
   ClassAds library:
   - Several methods named ParseClassAd() were declared, but never implemented.
     (Ticket #3049).
   - The parser silently dropped leading white space in string values.
     (Ticket #3042).
   - The parser could go into an infinite loop or leak memory when reading a
     malformed ClassAd XML document. (Ticket #3045).
10. Fixed a bug that prevented the -f command line option to condor_history
    from being recognized. The -f option was being interpreted as -forward. At
    least four letters are now required for the -forward option (-forw) to
    prevent ambiguity. (Ticket #3044).
11. The implementation of the condor_history -backwards option, which is the
    default ordering for reading the history file, in the 7.7 series did not
    work on Windows platforms. This has been fixed. (Ticket #3055).
12. Fixed a bug that caused an invalid proxy to be delegated when refreshing
    the job's X.509 proxy when configuration variable
    DELEGATE_JOB_GSI_CREDENTIALS_LIFETIME was set to 0. (Ticket #3059).
13. Fixed a bug in which DAGMan did not account properly for jobs being
    suspended and then unsuspended. (Ticket #3108).
14. condor_dagman now takes note of job reconnect failed events (event code 24)
    in the user log, for counting idle jobs. (Ticket #3189).
15. Job IDs generated by NorduGrid ARC 12.05 and above are now properly
    recognized. (Ticket #3062).
16. Fixed a bug in which Condor would not mark grid-type nordugrid jobs as
    Running due to variation in the format of the job status value. NorduGrid
    ARC job statuses of the form INLRMS: ? are now properly recognized both
    with and without the space after the colon. (Ticket #3118).
17. The condor_gridmanager now properly handles X.509 proxy files that are
    specified in the job ClassAd with a relative path name. (Ticket #3027).
18. Fixed a bug that caused daemon names, as set in configuration variables
    such as STARTD_NAME, containing a period character to be ignored.
    (Ticket #3172).
19. Fixed a bug that prevented the condor_schedd from removing old execute
    directories for local universe jobs on start up. (Ticket #3176).
20. The condor_defrag daemon sometimes scheduled fewer draining attempts
    than specified. (Ticket #3199).
21. Fixed a bug that could cause the condor_gridmanager to crash if a grid
    universe job's X.509 user certificate did not contain an e-mail address.
    (Ticket #3203).
22. Fixed a bug introduced in Condor version 7.7.5 that caused multiple
    condor_schedd daemons running on the same machine to share the job queue
    with each other due to the way in which the default value of configuration
    variable JOB_QUEUE_LOG was set. (Ticket #3196).
23. Fixed a bug that could cause condor_q to not print all jobs when it
    thought it was querying an old condor_schedd daemon. (Ticket #3206).
24. Fixed a bug that could cause a job's standard output and standard error
    files to be written in the job's initial working directory, despite the
    submit description file's specification to write them to a different
    directory. This would happen when the file transfer mechanism was used,
    the execution machine was running Condor version 7.7.1 or earlier, and
    either Condor's security negotiation was disabled or the configuration
    (Ticket #3208).
25. The log message generated when the EXECUTE directory is missing is now
    more helpful. (Ticket #3194).
26. The load average was incorrect for non-English versions on Windows
    platforms. This has been fixed for Windows Vista and more recent versions.
    (Ticket #3182).
27. The command condor_q -run now displays correct HOST field information for
    local universe jobs. (Ticket #3150).
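
As an illustration of the kind of fix in item 10, the following is a
rough sketch of abbreviation matching with a minimum length, so that a
short option such as -f is no longer swallowed by a longer one. This is
not Condor's actual implementation; the function name and parameters are
made up for the example.

```python
def matches_long_option(arg: str, option: str = "-forward",
                        min_letters: int = 4) -> bool:
    """Return True if arg is an accepted abbreviation of option.

    An abbreviation is accepted only if it is a prefix of the option
    and has at least min_letters letters after the leading dash.
    """
    if not arg.startswith("-"):
        return False
    letters = arg[1:]  # the part after the leading dash
    return len(letters) >= min_letters and option.startswith(arg)


if __name__ == "__main__":
    for candidate in ("-f", "-for", "-forw", "-forward", "-forx"):
        print(candidate, matches_long_option(candidate))
```

With a minimum of four letters, -f and -for are rejected as
abbreviations of -forward, leaving -f free to be handled as its own
option.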

Given these facts, and unless someone convinces me otherwise, I'm
inclined to upload Condor 7.8.4 with all the bugfixes to unstable. All
the sites I have talked to that use the Debian Condor package have no
interest in testing a version that has known but unfixed bugs. If the
release team objects to a transition of this package into wheezy, a
security-fix-only version could go through proposed-updates. The
reduction in testing exposure for this package from bypassing unstable
is probably negligible anyway.



Michael Hanke
