Ewan Todd wrote:
> Hi All,
>
> I have what I think is a serious performance issue with fbsd 5.3
> release. I've read about threading issues, and it seems to me that
> that is what I'm looking at, but I'm not confident enough to rule out
> that it might be a hardware issue, a kernel configuration issue, or
> something to do with the python port. I'd appreciate it if someone
> would it point out if I'm overlooking something obvious. Otherwise,
> if it is the problem I think it is, then there seems entirely too
> little acknowledgement of a major issue.
>
> Here's the background. I just got a new (to me) AMD machine and put
> 5.3 release on it. I'd been very happy with the way my old Intel
> machine had been performing with 4.10 stable, and I decided to run a
> simple performance diagnostic on both machines, to wow myself with the
> amazing performance of the new hardware / kernel combination.
> However, the result was pretty disappointing.
>
> Here are what I think are the pertinent dmesg details.
>
> Old rig:
>
> FreeBSD 4.10-RELEASE #0: Thu Jul 1 22:47:08 EDT 2004
> Timecounter "i8254" frequency 1193182 Hz
> Timecounter "TSC" frequency 449235058 Hz
> CPU: Pentium III/Pentium III Xeon/Celeron (449.24-MHz 686-class CPU)
>
> New rig:
>
> FreeBSD 5.3-RELEASE #0: Fri Nov 5 04:19:18 UTC 2004
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: AMD Athlon(tm) Processor (995.77-MHz 686-class CPU)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> Timecounter "TSC" frequency 995767383 Hz quality 800
> Timecounters tick every 10.000 msec
>
> The diagnostic I selected was a python program to generate 1 million
> pseudo-random numbers and then to perform a heap sort on them. That
> code is included at the foot of this email. I named the file
> "heapsort.py". I ran it on both machines, using the "time" utility in
> /usr/bin/ (not the builtin tcsh "time"). So the command line was
>
> /usr/bin/time -al -o heapsort.data ./heapsort.py 1000000
>
> A typical result for the old rig was
>
> 130.78 real 129.86 user 0.11 sys
> 22344 maximum resident set size
> 608 average shared memory size
> 20528 average unshared data size
> 128 average unshared stack size
> 5360 page reclaims
> 0 page faults
> 0 swaps
> 0 block input operations
> 0 block output operations
> 0 messages sent
> 0 messages received
> 0 signals received
> 0 voluntary context switches
> 2386 involuntary context switches
>
> Whereas, the typical result for the new rig looked more like
>
> 105.36 real 71.10 user 33.41 sys
> 23376 maximum resident set size
> 659 average shared memory size
> 20796 average unshared data size
> 127 average unshared stack size
> 5402 page reclaims
> 0 page faults
> 0 swaps
> 0 block input operations
> 0 block output operations
> 0 messages sent
> 0 messages received
> 0 signals received
> 0 voluntary context switches
> 10548 involuntary context switches
>
> You'll notice that the new rig is indeed a little faster (times in
> seconds): 105.36 real (new rig) compared with 130.78 real (old rig).
>
> However, the new rig spends about 33.41 seconds on system overhead
> compared with just 0.11 seconds on the old rig. Comparing the rusage
> stats, the only significant difference is the "involuntary context
> switches" field, where the old rig has 2386 and the new rig has a
> whopping 10548. Further, I noticed that the number of context
> switches on the new rig seems to be more or less exactly one per 10
> msec of real time, that is, one per timecounter tick. (I saw this
> when comparing heapsort.py runs with arguments other than 1000000.)
>
> I think the new rig ought to execute this task in about 70 seconds:
> just over the amount of user time. Assuming that I'm not overlooking
> something obvious, and that I'm not interpreting a feature as a bug,
> this business with the context switches strikes me as a bit of a
> show-stopper. If that's right, it appears to be severely underplayed
> in the release documentation.
>
> I'll be happy if someone would kindly explain to me what's going on
> here. I'll be even happier to hear of a fix or workaround to remedy
> the situation.
>
> Thanks in advance,
>
> -e
>
>
First of all, make sure that you have WITNESS and INVARIANTS off in your
kernel. You might also want to recompile your kernel with the SMP
option turned off.
Scott