> I'm just an observer, and I may be confused, but it seems to me that this is
> motion in the wrong direction (at least, it's not going to fix the actual
> problem). As I understand the problem, once you reach a certain point, the
> system slows down *every* 30.999 seconds. Now, it's possible for the code to
> cause one slowdown as it cleans up, but why does it need to clean up so much
> 31 seconds later?
> Why not find/fix the actual bug? Then work on getting the yield right if it
> turns out there's an actual problem for it to fix.
> If the problem is that too much work is being done at a stretch and it turns
> out this is because work is being done erroneously or needlessly, fixing
> that should solve the whole problem. Doing the work that doesn't need to be
> done more slowly is at best an ugly workaround.
> Or am I misunderstanding?
It's the syncer that is causing the problem, and it runs every 31 seconds.
Historically, the syncer ran every 30 seconds, but things have changed a
bit over time.
The reason that the syncer takes so much time is that ffs_sync is a bit
stupid in how it works - it loops through all of the vnodes on each ffs
mountpoint (typically almost all of the vnodes in the system) to see if
any of them need to be synced out. This was marginally okay when there
were perhaps a thousand vnodes in the system, but when the maximum number
of vnodes was dramatically increased in FreeBSD some years ago (to
typically 50000-100000) and combined with the kernel threads of FreeBSD 5,
this has resulted in some rather bad side effects.
I think the proper solution would be to create an ffs_sync work list
(another TAILQ/LIST), probably with the head in the mountpoint struct,
that has on it any vnodes that need to be synced. Unfortunately, such a
change would be extensive, scattered throughout much of the ufs/ffs code.
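
To make the idea concrete, here is a rough userland sketch of such a work
list using the TAILQ macros from <sys/queue.h>. The struct layouts and the
names sync_head, sync_link, mount_init, vnode_mark_dirty and mount_sync are
all hypothetical stand-ins, not the real kernel structures, and the locking
a real implementation would need is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct vnode {
    int dirty;                      /* nonzero while queued on the work list */
    TAILQ_ENTRY(vnode) sync_link;   /* linkage on the per-mount work list */
};

TAILQ_HEAD(sync_list, vnode);

struct mount {
    struct sync_list sync_head;     /* holds only vnodes that need syncing */
};

static void
mount_init(struct mount *mp)
{
    TAILQ_INIT(&mp->sync_head);
}

/*
 * Would be called from every place that dirties a vnode. O(1), and a
 * no-op if the vnode is already queued.
 */
static void
vnode_mark_dirty(struct mount *mp, struct vnode *vp)
{
    if (vp->dirty)
        return;
    vp->dirty = 1;
    TAILQ_INSERT_TAIL(&mp->sync_head, vp, sync_link);
}

/*
 * One sync pass: visits only the queued vnodes instead of every vnode
 * on the mountpoint. Returns the number of vnodes processed.
 */
static int
mount_sync(struct mount *mp)
{
    struct vnode *vp;
    int n = 0;

    while ((vp = TAILQ_FIRST(&mp->sync_head)) != NULL) {
        assert(vp->dirty);          /* only dirty vnodes are ever queued */
        TAILQ_REMOVE(&mp->sync_head, vp, sync_link);
        vp->dirty = 0;
        /* real code would write the vnode's data out here */
        n++;
    }
    return n;
}
```

With something like this, the cost of a sync pass depends on the number of
dirty vnodes rather than the total number of vnodes on the mountpoint. The
hard part is the one mentioned above: every place that dirties a vnode
would have to call the equivalent of vnode_mark_dirty.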
David G. Lawrence
Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
The FreeBSD Project - http://www.freebsd.org
Pave the road of life with opportunities.