: Why do you think that hlt-ing the CPU(s) when idle would actually
: improve performance in this case? My only suspicion is that perhaps
: this reduces scheduling on the auxiliary 'logical' (fake) CPUs,
: thereby indirectly reducing cache ping-ponging and abuse. I would
: imagine that both units sharing the same execution engine in the
: HTT-enabled model would be effectively 'hlt'-ed when one of the two
: threads executes an 'hlt' until the next timer tick.
: I guess we'll wait for the two other data sets from Trish: one with
: HTT off, and cpu_idle_hlt=0, and the other with HTT off, and
: cpu_idle_hlt=1, before figuring this out.
:Bosko Milekic * firstname.lastname@example.org * bmilekic@FreeBSD.org
I am almost certain that it is related to pipeline stalls created
by the fairly long (in instructions) idle loop and the locked bus
cycles used by the mutex code. It could also possibly be related to
L1 cache contention.
I think that if we can fit the idle loop for the 'logical' processor
into a single instruction cache line and a single data cache line,
accessing a single memory location without any locked bus cycles, it
would solve the problem. Unfortunately I have no boxes I can test on,
so this is just a guess.
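To make the idea concrete, here is a rough userland sketch, not kernel code. It assumes a 64-byte cache line, and the names idle_slot, work_pending and idle_spin are made up for illustration. The loop polls one word in its own cache line with a relaxed atomic load, which on x86 is an ordinary load rather than a locked bus cycle. The max_spins bound is only there so the sketch can be exercised; a real idle loop would not have one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size; differs between CPUs */

/* Hypothetical per-CPU slot: the only data the idle loop touches. */
struct idle_slot {
	_Alignas(CACHE_LINE) atomic_int work_pending;
	char pad[CACHE_LINE - sizeof(atomic_int)];
};

static_assert(sizeof(struct idle_slot) == CACHE_LINE,
    "slot must fill exactly one cache line");

/*
 * Poll the slot with plain (unlocked) loads until work shows up or
 * max_spins polls have been made. Returns the number of polls that
 * found no work.
 */
static unsigned long
idle_spin(struct idle_slot *slot, unsigned long max_spins)
{
	unsigned long spins = 0;

	while (spins < max_spins &&
	    atomic_load_explicit(&slot->work_pending,
	    memory_order_relaxed) == 0)
		spins++;
	return (spins);
}
```

Whether the compiled loop also fits in one instruction cache line depends on the compiler and on code alignment, which this sketch does not control.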
Another solution would be to keep a global mask of 'idle' cpus and send
them an IPI when a new KSE is scheduled on a non-idle cpu; the IPI would
simply serve to wake them from the HLT. IPIs are nasty, but there are large
(power consumption) advantages to standardizing on the HLT methodology.
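The bookkeeping for the mask approach could look something like the sketch below. Again this is userland C with invented names (idle_cpus, idle_enter, idle_exit, wake_idle_cpus), and send_wakeup_ipi is a stub that only records its target instead of touching real IPI machinery. It assumes at most 32 cpus so the mask fits in one word.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAXCPU 32	/* assumed limit so the mask fits in one word */

static_assert(MAXCPU <= 32, "mask is a single 32-bit word");

static atomic_uint idle_cpus;	/* bit n set => cpu n is sitting in HLT */
static unsigned int ipis_sent;	/* stand-in record of IPI targets */

/* Hypothetical stand-in for the real IPI primitive; records the target. */
static void
send_wakeup_ipi(int cpu)
{
	ipis_sent |= 1u << cpu;
}

/* Called by a cpu just before it executes HLT. */
static void
idle_enter(int cpu)
{
	atomic_fetch_or(&idle_cpus, 1u << cpu);
}

/* Called by a cpu after it comes out of HLT. */
static void
idle_exit(int cpu)
{
	atomic_fetch_and(&idle_cpus, ~(1u << cpu));
}

/* Called when a new KSE becomes runnable; returns the number of IPIs sent. */
static int
wake_idle_cpus(void)
{
	unsigned int mask = atomic_load(&idle_cpus);
	int cpu, sent = 0;

	for (cpu = 0; cpu < MAXCPU; cpu++) {
		if (mask & (1u << cpu)) {
			send_wakeup_ipi(cpu);
			sent++;
		}
	}
	return (sent);
}
```

Note that updating the mask is itself a locked operation, but it happens once on entry to and exit from idle rather than on every pass through a polling loop.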