[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#698382: s390/time: fix sched_clock() overflow



Source: linux-2.6
Version: 2.6.32-46

Hi,

this should be backported/included into our kernel when/if it's accepted
upstream. 2.6.32 and 3.2.0 are both affected. (So given the source
package change I guess this bug would need to be cloned for the latter,
but I'll leave that up to you.)

Kind regards
Philipp Kern

----- Forwarded message from Heiko Carstens <heiko.carstens@de.ibm.com> -----

Date: Tue, 15 Jan 2013 10:05:03 +0100
From: Heiko Carstens <heiko.carstens@de.ibm.com>
To: LINUX-390@VM.MARIST.EDU
Subject: Re: Twin penguins prospered for 417 days, then died
User-Agent: Mutt/1.5.21 (2010-09-15)
Message-ID: <20130115090503.GB3227@osiris>

On Mon, Jan 14, 2013 at 08:36:58PM +0100, Philipp Kern wrote:
> On Mon, Jan 14, 2013 at 01:11:34PM +0100, Heiko Carstens wrote:
> > That patch is for x86 only and won't fix your problem. However we do have a
> > similar bug in the s390 kernel code. The only difference is that it will
> > indeed trigger after 417 days instead of 208 days.
> > 
> > The reason is that we calculate with differences of the TOD clock register.
> > The TOD clock wraps after appr. 143 years. So far no problem...
> > 
> > However when converting a difference to nanoseconds we must divide the value
> > by 4.096. Without floating point arithmetics in the kernel we do that by
> > multiplying with 125 and afterwards dividing by 512... and there you can
> > see when the overflow happens:
> > 
> > 143 years / 125 = 1.114 years. And 365 days * 1.114 = 417.56 days.
> > 
> > So, that's when we hit the overflow.
> > 
> > We are working on a fix!
> > 
> > Thanks again for reporting!
> 
> I'd be cool if you could followup here with the patch when it's available. For
> the sake of Debian and others…

The patch below is what we came up with. The patch is against current
kernel version 3.8-rc3+, however I added the "Cc: stable@kernel.org"
tag, so it should be backported for all maintained stable kernels at
kernel.org.


From bf01d5d698b7d4a2ec42d55af19fb6d5c0bc0fe0 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Mon, 14 Jan 2013 16:55:55 +0100
Subject: [PATCH] s390/time: fix sched_clock() overflow

Converting a 64 Bit TOD format value to nanoseconds means that the value
must be divided by 4.096. In order to achieve that we multiply with 125
and divide by 512.
When used within sched_clock() this triggers an overflow after appr.
417 days. Resulting in a sched_clock() return value that is much smaller
than previously and therefore may cause all sort of weird things in
subsystems that rely on a monotonic sched_clock() behaviour.

To fix this implement a tod_to_ns() helper function which converts TOD
values without overflow and call this function from both places that
open coded the conversion: sched_clock() and kvm_s390_handle_wait().

Cc: stable@kernel.org
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 arch/s390/include/asm/timex.h | 28 ++++++++++++++++++++++++++++
 arch/s390/kernel/time.c       |  2 +-
 arch/s390/kvm/interrupt.c     |  2 +-
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/timex.h b/arch/s390/include/asm/timex.h
index fba4d66..4c060bb 100644
--- a/arch/s390/include/asm/timex.h
+++ b/arch/s390/include/asm/timex.h
@@ -128,4 +128,32 @@ static inline unsigned long long get_clock_monotonic(void)
 	return get_clock_xt() - sched_clock_base_cc;
 }
 
+/**
+ * tod_to_ns - convert a TOD format value to nanoseconds
+ * @todval: to be converted TOD format value
+ * Returns: number of nanoseconds that correspond to the TOD format value
+ *
+ * Converting a 64 Bit TOD format value to nanoseconds means that the value
+ * must be divided by 4.096. In order to achieve that we multiply with 125
+ * and divide by 512:
+ *
+ *    ns = (todval * 125) >> 9;
+ *
+ * In order to avoid an overflow with the multiplication we can rewrite this.
+ * With a split todval == 2^32 * th + tl (th upper 32 bits, tl lower 32 bits)
+ * we end up with
+ *
+ *    ns = ((2^32 * th + tl) * 125 ) >> 9;
+ * -> ns = (2^23 * th * 125) + ((tl * 125) >> 9);
+ *
+ */
+static inline unsigned long long tod_to_ns(unsigned long long todval)
+{
+	unsigned long long ns;
+
+	ns = ((todval >> 32) << 23) * 125;
+	ns += ((todval & 0xffffffff) * 125) >> 9;
+	return ns;
+}
+
 #endif
diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index aff0e35..a5f4f5a 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -63,7 +63,7 @@ static DEFINE_PER_CPU(struct clock_event_device, comparators);
  */
 unsigned long long notrace __kprobes sched_clock(void)
 {
-	return (get_clock_monotonic() * 125) >> 9;
+	return tod_to_ns(get_clock_monotonic());
 }
 
 /*
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index c30615e..82c481d 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -408,7 +408,7 @@ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
-	sltime = ((vcpu->arch.sie_block->ckc - now)*125)>>9;
+	sltime = tod_to_ns(vcpu->arch.sie_block->ckc - now);
 
 	hrtimer_start(&vcpu->arch.ckc_timer, ktime_set (0, sltime) , HRTIMER_MODE_REL);
 	VCPU_EVENT(vcpu, 5, "enabled wait via clock comparator: %llx ns", sltime);
-- 
1.7.12.4

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to LISTSERV@VM.MARIST.EDU with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
----------------------------------------------------------------------
For more information on Linux on System z, visit
http://wiki.linuxvm.org/

----- End forwarded message -----

Attachment: signature.asc
Description: Digital signature


Reply to: