Re: yup, found it (NFS)

From: Karl Denninger <karl@Denninger.Net>
Date: 16 Dec 1998 21:09:05
Subject: Re: yup, found it (NFS)
Message-ID: 19981216230855.A27443@Denninger.Net

On Wed, Dec 16, 1998 at 11:51:39PM -0500, Alfred Perlstein wrote:
> On Wed, 16 Dec 1998, Karl Denninger wrote:
>
> > Remove the intr for now. If that fixes it then at least we have
> > hard proof of where it is.
>
> Already done. I'm silly, not suicidal about things :)
>
> > The problem is that vinvlbuf is not the only place you can get screwed.
> > There is also a problem in the vm pager (it can hang in there too, as I've
> > now been able to prove and isolate) due to what I *believe* is the same
> > cause. This of course assumes you mount executable directories (very
> > common in clusters) across NFS.
>
> You mean, if I'm running an executable over NFS? I've seen this, but not
> nearly as often. In my case pine is local to the machine, but my mailbox
> isn't.
>
> Just out of curiosity: is it hanging because the program text
> retrieval from the binary (not swap) has a similar loop?

Yep. It locks up the process in question. I suspect, but haven't yet
proven, that if that lockup bites "pagedaemon" you're fucked on a system
level. I *have* proven that the process in question gets hosed and
deadlocks.

Example:
www 11988 0.0 0.5 6260 612 ?? D 8:12AM 0:00.99 /lbin/httpd.apa
www 11994 0.0 0.5 6288 620 ?? D 8:12AM 0:06.68 /lbin/httpd.apa

Guess what. Right at 8:12 in the morning the server gets "kicked" to
produce logs (it gets sent a SIGINT). Hmmm.....
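For anyone who wants to watch for the same symptom, here is a minimal sketch in Python that picks the stuck processes out of "ps" output like the lines above. It assumes the usual BSD "ps aux" column order (USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND) and that no column before COMMAND contains spaces. The function name and the header line in the sample are mine, purely for illustration.

```python
def uninterruptible(ps_lines):
    """Return (pid, started, command) for every process whose STAT starts with 'D'."""
    hits = []
    for line in ps_lines:
        # Split into at most 11 fields so the command keeps its own spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something we cannot parse
        stat = fields[7]
        if stat.startswith("D"):  # 'D' = uninterruptible (disk) wait
            hits.append((int(fields[1]), fields[8], fields[10]))
    return hits


# The header is assumed; the two process lines are the ones quoted above.
SAMPLE = [
    "USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND",
    "www 11988 0.0 0.5 6260 612 ?? D 8:12AM 0:00.99 /lbin/httpd.apa",
    "www 11994 0.0 0.5 6288 620 ?? D 8:12AM 0:06.68 /lbin/httpd.apa",
]

for pid, started, command in uninterruptible(SAMPLE):
    print(pid, started, command)
```

A process that shows up here on every pass, with the same start time and no growth in CPU time, is a candidate for the deadlock described above. A single "D" on its own only means the process was waiting on I/O at that instant.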

> > Certainly the expected execution path is basically the same, and I can
> > *trigger it* with a SIGINT to a running process which happens to have some
> > of its working set paged out at the time it receives the signal (ouch!)
>
> That doesn't seem very good at all. Is this second case for all
> NFS mounts? or only intr mounts?

Don't know yet - still testing.

> Thanks for the attention. Sorry I took so long to get some proof
> of this bug, it's just that it's a work machine and taking time
> out to do this isn't always possible.
>
> I'm sure tracking down/fixing the problem is on a totally different
> level, so thanks,
>
> -Alfred

Yep. I understand fully.

What I want to know is whether a "ro,soft" mount has the same
vulnerability. We use them around here for things like mounting
the Usenet spool.

--
Karl Denninger (karl@denninger.net) http://www.mcs.net/~karl
I ain't even *authorized* to speak for anyone other than myself, so give
up now on trying to associate my words with any particular organization.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message

