Dag-Erling Smørgrav wrote:
> Julian Elischer <julian@FreeBSD.org> writes:
>> This removes a reproducible lockup in NFS.
> Could you elaborate on that?
The single-threading mode that fork() selects has an error:
some "optimization" code (added at some point, maybe by me)
suspends threads that are already sleeping with PCATCH by simply
setting the suspended bit. It turns out this is a bad idea.
NFS sometimes sleeps with a vnode lock held and PCATCH set
(and is thus a candidate for the above).
Now, the mechanism:
Thread A does an NFS operation, locks an NFS vnode, and sleeps
with PCATCH waiting for a reply from the server.
Thread B enters NFS, hits the locked vnode, and waits (no PCATCH).
Thread C calls fork().
Thread A is suspended and cannot proceed
(a bug, but let's get past that);
it is counted as quiesced for thread_single().
Thread B cannot proceed, so it cannot be suspended and
counted as quiesced (also a bug, I think).
Thread C never reaches the single-threaded state (B is not yet quiesced)
and cannot proceed.
Thread A cannot be reawakened.
There are so many bugs here that one loses count.
However, it turns out that the whole idea of single-threading
in fork() is unneeded, thanks to all the locking introduced for
all the components altered in fork().
To fix the problem: use another mode of thread_single()
that counts quiesced threads differently and doesn't do the suspend stupidity.
Having fixed that,
the whole thing can be removed anyhow.
(analysis by davidxu, alc, me, alfred in concert)