Attilio Rao wrote:
> 2006/10/18, Attilio Rao <firstname.lastname@example.org>:
>> In my P4 branch: //depot/user/attilio/attilio_smpng/...
>> you can find a sx locks rewriting using the optimized semantic of
>> rwlocks; in the end this might result in a valuable performance
Do you have any benchmark results? You might have a hard time convincing
someone to commit this if you don't show that it's worth it, especially
if you consider how much it complicates the code.
> After have received very positive stress-test feedbacks from pho@
> (that I would like to thank for his patience and work) I went ahead
> and rather completed the implementation. Now this is ready to be
> reviewed and possibly committed into the CVS.
> Even if I plan a longer work on this branch (about syncronizing
> primitives), I think it is time for a revision of the work done until
> now from SMPng people (jhb, kmacy, etc.) and possibily an inclusion
> into HEAD (patch actually is 58k...).
> Some few hints about the patch:
> - LOCK_DEBUG adds a dependence between sx.h and lock.h (as for rwlocks
> and mutex) and a the new options SXLOCK_NOINLINE is added
> - XFS locking is still disabled in the patch (I hope to do it for
> tomorrow, I'm in GMT+1).
> - Possibly the sleepqueue interface modifies and new flags might be
> documented in the manpages (and NOTES file too, in order to reflect
> SXLOCK_NOINLINE inclusion).
> - It misses still of the adaptive spinning code, but it can be
> inserted after without problems.
We might want to avoid adaptive spinning, since sx locks may be held for
quite long periods of time.
> You can download the code directly from perforce
> (//depot/user/attilio/attilio_smpng/...) but patches are available
> I hope you will enjoy it (feedbacks, ideas, comments would be very
A few comments:
--- //depot/vendor/freebsd/src/sys/i386/include/param.h 2006/01/09 06:10:20
+++ //depot/user/attilio/attilio_smpng/i386/include/param.h 2006/10/03
@@ -109,6 +109,15 @@
+ * Define our own cache alignment mask for syncronizing primitives.
+ * and Xeon want a 128-byte wide aligned syncronizing primitive in order to
+ * minimize cache bus traffic on CPUs cache lines movements.
+#define SYNC_ALIGN (128 - 1)
Please also add this to amd64, and maybe rename it to UMA_ALIGN_SYNC.
On other architectures, have it be the same as UMA_ALIGN_CACHE. This
way, you don't have to define SYNC_ALIGN in every C file that uses it.
While there, you might also want to make UMA_ALIGN_CACHE the correct
size for i386/amd64 (64 bytes, on most machines, i believe?). Ideally,
this would be a variable set at boot time, depending on what the CPU
I also feel that the part that makes turnstiles and sleepqueues 128 byte
aligned should be broken up as as a separate patch/commit, as it doesn't
really have much to do with your sx rewriting, as I understand it.
--- //depot/vendor/freebsd/src/sys/kern/kern_sx.c 2006/08/15 18:31:36
+++ //depot/user/attilio/attilio_smpng/kern/kern_sx.c 2006/11/06 02:24:27
@@ -78,13 +50,8 @@
sx_init(struct sx *sx, const char *description)
- sx->sx_lock = mtx_pool_find(mtxpool_lockbuilder, sx);
- sx->sx_cnt = 0;
- cv_init(&sx->sx_shrd_cv, description);
- sx->sx_shrd_wcnt = 0;
- cv_init(&sx->sx_excl_cv, description);
- sx->sx_excl_wcnt = 0;
- sx->sx_xholder = NULL;
+ sx->sx_lock = SX_UNHELD;
+ sx->sx_desc = "sx lock";
lock_init(&sx->sx_object, &lock_class_sx, description, NULL,
LO_WITNESS | LO_RECURSABLE | LO_SLEEPABLE | LO_UPGRADABLE);
Shouldn't it be sx->sx_desc = description; ?
Keep up the good work! I hope to see some benchmark results and to see
it committed soon! :-)