FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang

Many FreeBSD users (e.g. server admins) want their binaries to run as fast as possible. There are many ways to improve the speed of binaries: we can use different compilers and, for each compiler, different optimizations. But which combination is best for which processor?

We have benchmarked the perl binary compiled with gcc from the FreeBSD base system against gcc from ports and the new clang compiler. We have tested different optimizations on 8 different processors, all on the amd64 platform. The benchmark software we used is the perlbench benchmark running on Perl 5.12.3 on top of FreeBSD 8.2. This benchmark can also serve as a reference for users of other scripting languages (e.g. PHP, Python or Ruby), as these use similar structures and methods.

We are benchmarking the speed of the generated binaries, not the speed of compiling, as run-time speed is what matters most to us.
“Compile once, run many.”
Benchmarked compilers:

gcc 4.2.1 from FreeBSD Base
gcc 4.5 (or 4.6 for corei7) from the FreeBSD ports tree
llvm/clang rev. 127334 from the FreeBSD ports tree

Tested optimization flags (depending on processor type):
none, -march=atom, -march=nocona, -march=core2, -march=corei7, -march=opteron-sse3, -march=barcelona

What do the general results look like?
First, based on this benchmark, we can say the following in general:

clang was 10% slower on average than FreeBSD base gcc (4.2.1) on most of the tested CPUs
gcc 4.5 was 5-10% faster on average than FreeBSD base gcc on most of the tested CPUs

The test results are relative: the baseline score of 100 is gcc 4.2.1 with CPUTYPE unset.
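
To make the base-100 scale concrete, here is a minimal sketch of how such relative scores could be computed from raw run times. It assumes that a score is 100 times the baseline time divided by the measured time, so that values above 100 mean faster than the baseline; the article does not state the exact formula. The function name and the timings are hypothetical and invented purely for illustration; they are not measured results.

```python
def relative_scores(timings, baseline):
    """Scale run times so that the baseline configuration scores 100.

    Assumption: lower run time is better, so a score above 100 means
    the configuration is faster than the baseline.
    """
    base_time = timings[baseline]
    return {name: round(100.0 * base_time / t, 1) for name, t in timings.items()}


# Hypothetical run times in seconds, NOT taken from the benchmark.
example_timings = {
    "gcc 4.2.1, no CPUTYPE": 20.0,
    "gcc45": 18.0,
    "clang": 22.0,
}

scores = relative_scores(example_timings, "gcc 4.2.1, no CPUTYPE")
for name, score in scores.items():
    print(f"{name}: {score}")
```

With this convention the baseline always comes out at exactly 100, and every other configuration reads directly as a percentage of baseline speed.
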
The following table summarizes the processors tested.

CPU	Family	Recommended for system gcc	Recommended ports compiler
Intel Atom D525	atom	CPUTYPE=core2 (*)	gcc45 -march=atom
Intel Xeon 3065	core2	CPUTYPE=core2 (*)	gcc45
Intel Xeon E5310	core2	CPUTYPE=core2 (*)	gcc45 -march=core2
Intel Xeon E5405	core2	no CPUTYPE	gcc45 -march=core2
Intel Core i7-920	nehalem	CPUTYPE=nocona	gcc45 -march=nocona
Intel Xeon X3450	nehalem	CPUTYPE=nocona	gcc45 -march=nocona
Intel Xeon E5620	nehalem	CPUTYPE=nocona	gcc45 -march=nocona
AMD Opteron 6128	barcelona	CPUTYPE=opteron-sse3	gcc45 -march=barcelona

(*) with SSSE3 patch
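
The system-gcc column of the table can be turned into a small lookup, sketched below. The helper name make_conf_line is hypothetical, and the output uses the CPUTYPE?= form commonly seen in FreeBSD's example make.conf; treat it as an illustration of how the recommendations map to a setting, not as a tested configuration.

```python
# Recommended CPUTYPE for the base system gcc, transcribed from the table.
# None means the table recommends leaving CPUTYPE unset.
RECOMMENDED_CPUTYPE = {
    "Intel Atom D525": "core2",
    "Intel Xeon 3065": "core2",
    "Intel Xeon E5310": "core2",
    "Intel Xeon E5405": None,
    "Intel Core i7-920": "nocona",
    "Intel Xeon X3450": "nocona",
    "Intel Xeon E5620": "nocona",
    # Spelled opteron-sse3, matching the list of tested optimization flags.
    "AMD Opteron 6128": "opteron-sse3",
}

# CPUs whose recommendation is marked (*) in the table (needs the SSSE3 patch).
NEEDS_SSSE3_PATCH = {"Intel Atom D525", "Intel Xeon 3065", "Intel Xeon E5310"}


def make_conf_line(cpu):
    """Return a make.conf-style line for the given CPU (hypothetical helper)."""
    cputype = RECOMMENDED_CPUTYPE[cpu]
    if cputype is None:
        return "# CPUTYPE left unset"
    note = "  # requires the SSSE3 patch" if cpu in NEEDS_SSSE3_PATCH else ""
    return f"CPUTYPE?={cputype}{note}"


print(make_conf_line("Intel Core i7-920"))
```

Looking up a CPU that was not part of the benchmark raises a KeyError, which is deliberate: the table makes no recommendation for untested processors.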

Did we see any surprises? Yes, here they are:

Core i7 based processors run slower with -march=core2 (new option) on the system compiler than with -march=nocona
For Core i7, the new optimization -march=corei7 from gcc 4.6 is still slower on average than -march=nocona
On the Intel Atom, -march=nocona hurts performance in many tests with both base gcc and ports gcc
New AMD processors perform best with -march=opteron-sse3 if using the base compiler (otherwise -march=barcelona)

The full benchmark results are available at this URL:
http://www.vx.sk/benchmarks/perlbench/20110311
