
Re: [Jackit-devel] Re: Re: little NPTL SCHED_FIFO test program



On Saturday 21 August 2004 00:36, Lee Revell wrote:
> On Fri, 2004-08-20 at 19:29, Martijn Sipkema wrote:
> >
> > It's a pretty serious bug though, in that very basic functionality
> > does not work correctly, and I personally wouldn't go out of my way to
> > program a workaround.
>
> Worse, it claims to be POSIX compliant, and silently does the wrong
> thing.
>
> This 'bug' is not even a bug, it's a stopgap measure, the developers
> intentionally broke POSIX compliance because the kernel support wasn't
> good enough.  Now that it's good enough, glibc should be fixed.  QED.

While I, too, believe that this stopgap measure was a bad idea, as far as I
can tell the kernel support is still lacking. The issue was never latency:
current NPTL mutexes are not POSIX compliant because waiters are woken up in
FIFO order instead of priority order, and that is a kernel problem (the futex
synchronization primitive has this limitation). In addition, there is no
mechanism for implementing priority inheritance and/or priority protection
that is reasonably efficient compared to plain mutexes. See

 http://developer.osdl.org/dev/robustmutexes/ 

for the current state of one effort to fix this, and, specifically,

 http://developer.osdl.org/dev/robustmutexes/fusyn-doc/fusyn-why.txt

for an in-depth explanation of the problems and proposed solutions. Also,
Inaky Perez-Gonzalez posted a proof-of-concept patch to linux-kernel
yesterday that fixes the wakeup order without requiring changes to glibc, so
that part might make it into the real world fairly soon.
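
To make the priority inheritance/protection point concrete, here is a minimal
sketch of how a program can ask for a mutex protocol through the POSIX
attribute API and find out whether the implementation honors it. The helper
names init_mutex_with_protocol and describe_protocol_result are hypothetical
and exist only for this illustration; pthread_mutexattr_setprotocol and the
PTHREAD_PRIO_* constants come from POSIX. Whether the request succeeds depends
on the C library and kernel in use, so treat the result as something to probe
rather than assume.

```c
#define _XOPEN_SOURCE 700   /* expose the POSIX mutex protocol API */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

/*
 * Hypothetical helper (not part of glibc or JACK): initialize *m with the
 * requested protocol (PTHREAD_PRIO_NONE, PTHREAD_PRIO_INHERIT or
 * PTHREAD_PRIO_PROTECT). Returns 0 on success, otherwise the error number
 * reported by the first pthread call that failed.
 */
int init_mutex_with_protocol(pthread_mutex_t *m, int protocol)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* This is the step an implementation without support may reject. */
    rc = pthread_mutexattr_setprotocol(&attr, protocol);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: turn the result above into a readable message. */
const char *describe_protocol_result(int rc)
{
    if (rc == 0)
        return "supported";
    if (rc == ENOTSUP)
        return "not supported by this implementation";
    return strerror(rc);
}
```

A caller could try PTHREAD_PRIO_INHERIT first and fall back to
PTHREAD_PRIO_NONE when the helper returns a nonzero value. Note that this only
probes the protocol request; it says nothing about the order in which waiters
are woken up.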

BTW, when I first read about the troubles with jack and NPTL, I suspected the
wakeup order problem. After hours of tracing call chains I convinced myself
that this probably wasn't the cause, though I could imagine unaware clients
with other high-priority threads suffering from it.

Daniel.


