
Bug#956671: marked as done (htcondor: preemption not working if NEGOTIATOR_INTERVAL is set)



Your message dated Thu, 2 May 2024 15:02:56 -0500
with message-id <1267e14b-b9f7-44f3-a762-9539d27d7c4f@cs.wisc.edu>
and subject line Re: htcondor: preemption not working if NEGOTIATOR_INTERVAL is set
has caused the Debian Bug report #956671,
regarding htcondor: preemption not working if NEGOTIATOR_INTERVAL is set
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
956671: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=956671
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: htcondor
Version: 8.8.7-1
Severity: normal

Dear Maintainer,

*** Reporter, please consider answering these questions, where appropriate ***

   * What led up to the situation?
We are configuring preemption settings. Running jobs which have reached
the guaranteed minimum run time should be preempted if jobs with a
priority 1.2 times better are in the idle queue.
Here is the NEGOTIATOR configuration:

use ROLE : CENTRALMANAGER
NEGOTIATOR_INTERVAL     = 3600
JobExceedsMinRunTime    = ($(ActivationTimer)) > MinRunTimeHours * 60
NewUserBetterPrio       =  RemoteUserPrio > SubmitterUserPrio * 1.2 
IsGPUJob                = Target.RequestGPUs > 0 && My.RequestGPUs =!= 0 && My.TotalGPUs =!= 0
PREEMPTION_REQUIREMENTS = debug( ( ($(NewUserBetterPrio)) && ($(JobExceedsMinRunTime))) || ($(IsGPUJob)))
ALLOW_PSLOT_PREEMPTION  = True
NEGOTIATOR_DEBUG = D_FULLDEBUG
NEGOTIATOR_CONSIDER_EARLY_PREEMPTION = True
NEGOTIATOR_CONSIDER_PREEMPTION = True
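
For readers unfamiliar with the expression syntax, the boolean structure of
PREEMPTION_REQUIREMENTS above can be sketched in Python. This is only an
illustration of the intended policy, not HTCondor code: the function and
parameter names are hypothetical, the minimum run time is passed in as a
plain threshold in seconds, and it assumes the HTCondor convention that a
numerically larger user priority is a worse priority.

```python
def should_preempt(remote_user_prio, submitter_user_prio,
                   activation_seconds, min_run_time_seconds,
                   is_gpu_job=False, factor=1.2):
    """Mirror the boolean structure of the PREEMPTION_REQUIREMENTS macro.

    All names here are illustrative stand-ins for the config macros.
    """
    # NewUserBetterPrio: the running user's priority value is more than
    # `factor` times the waiting submitter's (larger value = worse priority).
    new_user_better_prio = remote_user_prio > submitter_user_prio * factor
    # JobExceedsMinRunTime: the running job has used up its guaranteed time.
    job_exceeds_min_run_time = activation_seconds > min_run_time_seconds
    # (NewUserBetterPrio && JobExceedsMinRunTime) || IsGPUJob
    return (new_user_better_prio and job_exceeds_min_run_time) or is_gpu_job


# Running user at 12.5, waiting submitter at 10.0, job past its minimum time.
print(should_preempt(12.5, 10.0, activation_seconds=7200,
                     min_run_time_seconds=3600))
```

With these inputs both conditions hold, so the sketch reports that the
running job is eligible for preemption.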

   * What exactly did you do (or not do) that was effective (or
     ineffective)?
Preemption works unless we set NEGOTIATOR_INTERVAL.
   * What was the outcome of this action?
Jobs were no longer preempted.
   * What outcome did you expect instead?
We would expect preemption to work regardless of whether
NEGOTIATOR_INTERVAL is set.


-- System Information:
Debian Release: 10.3
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-8-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages htcondor depends on:
ii  adduser                       3.118
ii  debconf [debconf-2.0]         1.5.71
ii  libboost-python1.67.0         1.67.0-13+deb10u1
ii  libc6                         2.28-10
ii  libcgroup1                    0.41-8.1
ii  libclassad10                  8.8.7-1
ii  libcom-err2                   1.44.5-1+deb10u3
ii  libcurl4                      7.64.0-4+deb10u1
ii  libdate-manip-perl            6.76-1
ii  libexpat1                     2.2.6-2+deb10u1
ii  libgcc1                       1:8.3.0-6
ii  libglobus-callout0            4.1-1
ii  libglobus-common0             18.2-1
ii  libglobus-ftp-client2         9.2-1
ii  libglobus-gass-transfer2      9.1-1
ii  libglobus-gram-client3        14.2-1
ii  libglobus-gram-protocol3      13.2-1
ii  libglobus-gsi-callback0       6.1-1
ii  libglobus-gsi-cert-utils0     10.2-1
ii  libglobus-gsi-credential1     8.1-1
ii  libglobus-gsi-openssl-error0  4.1-1
ii  libglobus-gsi-proxy-core0     9.2-1
ii  libglobus-gsi-proxy-ssl1      6.1-1
ii  libglobus-gsi-sysconfig1      9.2-1
ii  libglobus-gss-assist3         12.2-1
ii  libglobus-gssapi-error2       6.1-1
ii  libglobus-gssapi-gsi4         14.10-1
ii  libglobus-io3                 12.1-1
ii  libglobus-openssl-module0     5.1-1
ii  libglobus-rsl2                11.1-1
ii  libglobus-xio0                6.1-1
ii  libgomp1                      8.3.0-6
ii  libgssapi-krb5-2              1.17-3
ii  libk5crypto3                  1.17-3
ii  libkrb5-3                     1.17-3
ii  libkrb5support0               1.17-3
ii  libldap-2.4-2                 2.4.47+dfsg-3+deb10u1
ii  libltdl7                      2.4.6-9
ii  libmunge2                     0.5.13-2
ii  libpcre3                      2:8.39-12
ii  libpython2.7                  2.7.16-2+deb10u1
ii  libpython3.7                  3.7.3-2+deb10u1
ii  libsqlite3-0                  3.27.2-3
ii  libssl1.1                     1.1.1d-0+deb10u2
ii  libstdc++6                    8.3.0-6
ii  libuuid1                      2.33.1-0.1
ii  libvirt0                      5.0.0-4+deb10u1
ii  libx11-6                      2:1.6.7-1
ii  libxext6                      2:1.3.3-1+b2
ii  libxss1                       1:1.2.3-1
ii  lsb-base                      10.2019051400
ii  perl                          5.28.1-6
ii  python                        2.7.16-1
ii  zlib1g                        1:1.2.11.dfsg-1

Versions of packages htcondor recommends:
pn  ecryptfs-utils  <none>

Versions of packages htcondor suggests:
pn  coop-computing-tools   <none>
pn  docker.io              <none>
pn  singularity-container  <none>
pn  slurm-client           <none>

-- Configuration Files:
/etc/condor/condor_config changed:
RELEASE_DIR = /usr
LOCAL_DIR = /local/condor
LOCAL_CONFIG_FILE = /etc/condor/condor_config.local
REQUIRE_LOCAL_CONFIG_FILE = false
LOCAL_CONFIG_DIR = /etc/condor/config.d
LOCAL_CONFIG_DIR_EXCLUDE_REGEXP = ^((\..*)|(.*~)|(#.*)|(.*\.rpmsave)|(.*\.rpmnew))$
use SECURITY : HOST_BASED
ALLOW_WRITE = 10.0.0.0/9
RUN     = /run/condor
LOG     = /var/log/condor
LOCK    = /run/lock/condor
SPOOL   = $(LOCAL_DIR)/spool
EXECUTE = $(LOCAL_DIR)/execute
BIN     = $(RELEASE_DIR)/bin
LIB     = $(RELEASE_DIR)/lib/condor
INCLUDE = $(RELEASE_DIR)/include/condor
SBIN    = $(RELEASE_DIR)/sbin
LIBEXEC = $(RELEASE_DIR)/lib/condor/libexec
SHARE   = $(RELEASE_DIR)/share/condor
PROCD_ADDRESS = $(RUN)/procd_pipe


-- debconf information:
  condor/daemons: COLLECTOR:NEGOTIATOR, SCHEDD, STARTD
  condor/reservedmemory:
* condor/wantdebconf: false
  condor/allowwrite: $(CONDOR_HOST) $(IP_ADDRESS) 127.*
  condor/title:
  condor/admin: root@localhost
  condor/personal: true
  condor/startpolicy: false
  condor/phonehome: false
  condor/filesystemdomain: $(FULL_HOSTNAME)
  condor/uiddomain: $(FULL_HOSTNAME)
  condor/centralmanager: $(FULL_HOSTNAME)

--- End Message ---
--- Begin Message ---
I am sorry that I did not see this bug for a very long time. It was
filed against an old version of HTCondor. I'll close this now. Feel free
to open another bug report if this is still a problem.

...Tim Theisen

--
Tim Theisen (he, him, his)
Release Manager
HTCondor & Open Science Grid
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736

--- End Message ---
