From: Mark Kettenis Subject: Re: pthread_mutex_timedlock(3): don't block forever if the timeout is NULL To: Scott Cheloha Cc: tech@openbsd.org, mpi@openbsd.org, guenther@openbsd.org Date: Mon, 05 Feb 2024 19:50:01 +0100 > Date: Mon, 5 Feb 2024 11:44:18 -0600 > From: Scott Cheloha > > On Mon, Feb 05, 2024 at 04:50:18PM +0100, Martin Pieuchot wrote: > > On 27/01/24(Sat) 18:19, Scott Cheloha wrote: > > > On Tue, Jan 09, 2024 at 11:14:08PM -0600, Scott Cheloha wrote: > > > > Both pthread_mutex_timedlock(3) implementations pass the timeout > > > > parameter from the caller through to the internal implementation > > > > function. In both of those functions, a NULL timeout pointer means > > > > "no timeout" or "block indefinitely". > > > > > > > > So, we have accidentally extended the pthread_mutex_timedlock(3) > > > > specification with special, undocumented behavior when the timeout > > > > pointer is NULL. > > > > Is this a bug? It sounds sane to me. Behave just like > > pthread_mutex_lock(3). > > I agree that it sounds sane from an interface design perspective to > have NULL mean "no timeout". At least, that would be consistent with > many other POSIX interfaces. However... > > > > > The easy fix is to copy the timeout to the stack before passing it on. > > > > > > > > This change will cause any code that depends upon the current behavior > > > > to segfault instead of blocking indefinitely. I think it's unlikely > > > > such code exists. It it does exist, it is not portable. > > > > I'm not sure adding an explicit NULL-dereference in a library is the way to > > go. If anything it should be better to return an error. > > We're talking about a programmer error here. I think failing hard and > faulting in this case would be better than returning an error. There > is a real possibility that the programmer will not see the EINVAL and > the caller will proceed as if they have locked the mutex. > > > But I see no evidence that passing a NULL pointer should behave > > differently than pthread_mutex_lock(3). > > The behavior is undefined: > > https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_timedlock.html > > Snipped to only include text relevant to the timeout: > > > #include > > #include > > > > int pthread_mutex_timedlock(pthread_mutex_t *restrict mutex, > > const struct timespec *restrict abstime); > > > > The pthread_mutex_timedlock() function shall lock the mutex object > > referenced by mutex. If the mutex is already locked, the calling thread > > shall block until the mutex becomes available as in the pthread_mutex_lock() > > function. If the mutex cannot be locked without waiting for another thread > > to unlock the mutex, this wait shall be terminated when the specified > > timeout expires. > > > > The timeout shall expire when the absolute time specified by abstime > > passes, as measured by the clock on which timeouts are based (that is, > > when the value of that clock equals or exceeds abstime), or if the > > absolute time specified by abstime has already been passed at the time > > of the call. > > > > The timeout shall be based on the CLOCK_REALTIME clock. The resolution > > of the timeout shall be the resolution of the clock on which it is > > based. The timespec data type is defined in the header. > > > > Under no circumstance shall the function fail with a timeout if the > > mutex can be locked immediately. The validity of the abstime parameter > > need not be checked if the mutex can be locked immediately. > > > > [...] > > > > [EINVAL] > > The process or thread would have blocked, and the abstime parameter > > specified a nanoseconds field value less than zero or greater than > > or equal to 1000 million. > > > > [...] > > > > [ETIMEDOUT] > > The mutex could not be locked before the specified timeout expired. > > There is no mention of special behavior when the timeout input is > NULL, so I don't think we should provide special behavior. If NULL > didn't mean "no timeout" in so many other interfaces there wouldn't > even be anything to discuss. > > The simplest thing is to interpret the standard literally. > > The current behavior looks like an accident. At the same time your proposed code looks artificial. Not convinced this is worth complicating the code for. > Index: rthread_mutex.c > =================================================================== > RCS file: /cvs/src/lib/libc/thread/rthread_mutex.c,v > diff -u -p -r1.5 rthread_mutex.c > --- rthread_mutex.c 13 Feb 2019 13:09:32 -0000 1.5 > +++ rthread_mutex.c 5 Feb 2024 17:38:03 -0000 > @@ -211,9 +211,11 @@ pthread_mutex_trylock(pthread_mutex_t *m > } > > int > -pthread_mutex_timedlock(pthread_mutex_t *mutexp, const struct timespec *abs) > +pthread_mutex_timedlock(pthread_mutex_t *mutexp, const struct timespec *absp) > { > - return (_rthread_mutex_timedlock(mutexp, 0, abs, 1)); > + struct timespec abs = *absp; > + > + return (_rthread_mutex_timedlock(mutexp, 0, &abs, 1)); > } > > int > Index: rthread_sync.c > =================================================================== > RCS file: /cvs/src/lib/libc/thread/rthread_sync.c,v > diff -u -p -r1.6 rthread_sync.c > --- rthread_sync.c 10 Jan 2024 04:28:43 -0000 1.6 > +++ rthread_sync.c 5 Feb 2024 17:38:03 -0000 > @@ -179,9 +179,11 @@ pthread_mutex_trylock(pthread_mutex_t *p > } > > int > -pthread_mutex_timedlock(pthread_mutex_t *p, const struct timespec *abstime) > +pthread_mutex_timedlock(pthread_mutex_t *p, const struct timespec *abstimep) > { > - return (_rthread_mutex_lock(p, 0, abstime)); > + struct timespec abstime = *abstimep; > + > + return (_rthread_mutex_lock(p, 0, &abstime)); > } > > int > >