Geek Patrol



Geekbench Comparison

A while ago we (finally) released a preview version of Geekbench, our cross-platform processor benchmark. I thought it was about time we put Geekbench to use and compared the performance of a variety of machines. Even though the tests in Geekbench haven’t been finalized, this article should give you a good idea of what kinds of comparisons you’ll be able to make with Geekbench in the future.

I gathered Geekbench results from twelve different machines for this article, including results from both of the new Intel-based iMacs, dual-core and dual-processor Power Mac G5s, and a few PCs.

Test Machines

The twelve machines I got results from span a fairly wide spectrum of computers currently (and recently) available, ranging from a low-end desktop to a rack-mounted server. Rather than arrange them by clock speed, I’ve grouped them by type (low-end Mac, high-end Mac, PC), and within the group I’ve arranged the machines by release date.

  1. Mac Mini 1.42 GHz
    • 512 KB L2 cache
    • 167 MHz system bus
    • 1 GB DDR 333 SDRAM
  2. iMac G5 2.1 GHz
    • 512 KB L2 cache
    • 700 MHz system bus
    • 1.5 GB DDR2 533 SDRAM
  3. iMac Core Duo 1.83 GHz
    • 2 MB L2 cache shared between cores
    • 667 MHz system bus
    • 1 GB DDR2 667 SDRAM
  4. iMac Core Duo 2.0 GHz
    • 2 MB L2 cache shared between cores
    • 667 MHz system bus
    • 512 MB DDR2 667 SDRAM
  5. Power Mac G4 Dual 1.25 GHz
    • 256 KB L2 cache per cpu
    • 2 MB backside L3 cache per cpu
    • 167 MHz system bus
    • 1.75 GB DDR 333 SDRAM
  6. Power Mac G5 1.6 GHz
    • 512 KB L2 cache
    • 800 MHz system bus
    • 768 MB DDR 333 SDRAM
  7. Power Mac G5 Dual 1.8 GHz
    • 512 KB L2 cache per processor
    • 900 MHz system bus
    • 1 GB DDR 400 SDRAM
  8. Power Mac G5 Dual Core 2.0 GHz
    • 1 MB L2 cache per core
    • 1000 MHz system bus
    • 2.5 GB DDR2 533 SDRAM
  9. Power Mac G5 ‘Quad’ 2.5 GHz
    • 1 MB L2 cache per core
    • 1.25 GHz system bus
    • 1.5 GB DDR2 533 SDRAM
  10. AMD Athlon 64 3200+ (2.2 GHz)
    • 512 KB L2 cache
    • 800 MHz system bus
    • 1 GB DDR 400 SDRAM
  11. Intel Pentium 4c 2.4 GHz HT
    • 512 KB L2 cache
    • 800 MHz system bus
    • 1 GB DDR 400 SDRAM
  12. Intel Xeon Dual 3.2 GHz HT
    • 1 MB L2 cache per CPU
    • 800 MHz system bus
    • 1 GB DDR2 400 SDRAM

Update: It has been brought to my attention that knowing what OS the machines are running would be of use. All of the Windows machines are running WinXP SP2, the Power Mac G5 Dual 1.8 GHz is running OS X 10.3.9, and all the other Macs are running OS X 10.4.4.

Floating Point Performance

Geekbench measures floating point performance by computing a Mandelbrot set in one or four threads, using the square root function in one set of tests, and without using the square root function in another set of tests. All results are in megaflops.
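
To make the difference between the two variants concrete, here is a minimal Python sketch. It is not Geekbench's code; the function name `escape_iterations` and its parameters are made up for illustration. Both variants iterate z → z² + c and stop once z leaves the circle of radius 2. The square-root variant checks |z| > 2 by calling `sqrt`, while the other checks |z|² > 4, which is mathematically the same test without the square root.

```python
import math


def escape_iterations(cr, ci, max_iter=100, use_sqrt=True):
    """Count iterations of z -> z*z + c before z leaves the radius-2 circle.

    Hypothetical helper for illustration only, not Geekbench's implementation.
    """
    zr = zi = 0.0
    for n in range(max_iter):
        mag_squared = zr * zr + zi * zi
        if use_sqrt:
            # Square-root variant: compare the magnitude |z| against 2.
            escaped = math.sqrt(mag_squared) > 2.0
        else:
            # No-square-root variant: compare |z|^2 against 4 instead.
            escaped = mag_squared > 4.0
        if escaped:
            return n
        # z = z*z + c, written out with real and imaginary parts.
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
    return max_iter
```

A benchmark built along these lines would presumably count the floating point operations performed and divide by the elapsed time to get megaflops, and a multi-threaded run would split the grid of c values between the threads. On a CPU without a hardware square root instruction, the first variant pays for a software square root on every iteration.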

Legend
mandelbrot, square root

On the single-threaded square-root Mandelbrot test the scores are fairly even; except for the two G4 machines (which lack a hardware square root instruction), most machines scored between 450 and 700 megaflops.

On the multi-threaded version of the square-root Mandelbrot test all of the single-processor machines are, not surprisingly, left behind. The Intel-based iMacs score in the same range as the dual-processor Power Mac 1.8GHz and dual-core Power Mac 2.0GHz. The Xeon and the Quad Power Mac scored the highest since both are able to process four threads at once (the Xeon has two hyper-threaded processors, the Power Mac has two dual-core processors). What’s interesting is the Pentium 4c didn’t seem to benefit from hyper-threading, since the Athlon 64 managed to achieve a higher score.

It’s worth pointing out that the single-processor Power Mac G5 1.6GHz beat the dual-processor Power Mac G4 1.25GHz, probably because of the lack of a hardware square root instruction in the G4.

Legend
mandelbrot, no square root

On the single-threaded, non-square-root Mandelbrot test all of the machines perform better, and the results are much more even. The Quad Power Mac G5 takes the top score, followed by the iMac Core Duo and the Athlon 64.

On the multi-threaded test the Quad G5 dominates the results, scoring nearly two gigaflops higher than the Xeon. The Intel Core Duo iMacs make a strong showing, beating both the dual-processor Power Mac G5 1.8GHz and the dual-core Power Mac G5 2.0GHz. Neither the Athlon 64 nor the Pentium 4c made a similarly strong showing, despite the Pentium 4c being hyper-threaded. In fact, the Athlon 64 performed better than the Pentium 4c on both the single-threaded and multi-threaded tests, despite having only one logical processor!

Integer Performance

Geekbench measures integer performance by performing Blowfish encryption on one or four threads, using a data set small enough for a processor cache in one set of tests, and a data set too large for a processor cache in another set of tests. All results are in megabytes per second.

Legend
blowfish, cache

The Core Duo iMacs really did well here thanks to their rather generous cache (2 MB). It’s also not surprising to see them on top, as x86 processors have typically been as good as or better than PPC processors at integer calculations.

What’s surprising here is that the Quad Power Mac G5 was beaten by the Dual Power Mac G4 and the Mac Mini. I know the G4 has more cache (256KB L2 and 2MB L3 per CPU) and more integer units than the G5 (the G4 has three simple and one complex, while the G5 has two), but I still didn’t expect it to do better than a G5 at twice the clock speed. John’s suggested that Blowfish uses simple operations and can keep all four integer units on the G4 busy, which would explain how the G4 is able to outperform the G5.

Also surprising is how poorly not only the Athlon 64 and Pentium 4c fared, but the Xeon as well. Even on the multi-threaded test the Xeon (which has four logical processors) is beaten by the Mac Mini (which has only one logical processor)!

The Quad Power Mac G5 regained its top spot in the multi-threaded test, but the rest of the G5 machines were left behind by the Dual G4, and the Mac Mini dominated the single processor/core machines as well as the hyper-threaded ones. Just shows to go you that the G4 may still be good for something.

Legend
blowfish, memory

When running the Blowfish tests from memory the landscape changes quite a bit. As you can see the graph is fairly flat on the single-threaded test. The Xeon is back on top, followed by the Quad Power Mac G5. The G4 machines did well, besting the Power Mac G5 1.6GHz, the Dual Power Mac G5 1.8GHz and the Pentium 4c. The Core Duo iMacs also did well thanks to their shiny new DDR2 667 RAM.

The results from the multi-threaded version of the test are a lot different, though. The Xeon shot back up to the top of the pile with the Power Mac G5 Quad right on its heels. The dual-processor and dual-core Power Mac G5s did better, but the Dual Power Mac G4 1.25GHz managed to keep up with them. The Athlon 64 and Pentium 4c again performed rather poorly by comparison.

More Integer Performance

For another measure of integer performance, Geekbench executes code for the MOS Technology 6502 CPU in a virtual CPU. This test is single threaded, and results are in megahertz.
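
For readers wondering how an emulator can report a score in megahertz, here is a rough Python sketch. It is an illustration, not Geekbench's emulator: it handles only four real 6502 opcodes (LDX immediate, DEX, BNE and BRK), treats BRK as a simple halt, ignores the extra cycle a branch costs when it crosses a page boundary, and uses the made-up names `run_6502` and `emulated_megahertz`. The idea is that the emulator adds up the clock cycles the original chip would have spent, and the score is those cycles divided by the wall-clock time the host needed.

```python
def run_6502(program, max_steps=1_000_000):
    """Interpret a tiny subset of 6502 opcodes; return (x_register, cycles).

    Hypothetical sketch: only LDX #imm, DEX, BNE and BRK are supported.
    """
    mem = bytearray(65536)
    mem[0:len(program)] = program
    pc, x, zero_flag, cycles = 0, 0, False, 0
    for _ in range(max_steps):
        opcode = mem[pc]
        pc += 1
        if opcode == 0xA2:        # LDX #imm: load X with the next byte
            x = mem[pc]
            pc += 1
            zero_flag = (x == 0)
            cycles += 2
        elif opcode == 0xCA:      # DEX: decrement X, wrapping at 8 bits
            x = (x - 1) & 0xFF
            zero_flag = (x == 0)
            cycles += 2
        elif opcode == 0xD0:      # BNE rel: branch if the zero flag is clear
            offset = mem[pc]
            pc += 1
            if offset >= 0x80:    # the operand is a signed 8-bit offset
                offset -= 0x100
            cycles += 2
            if not zero_flag:
                pc = (pc + offset) & 0xFFFF
                cycles += 1       # taken branch (page-cross penalty ignored)
        elif opcode == 0x00:      # BRK: treated as "halt" in this sketch
            break
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x}")
    return x, cycles


def emulated_megahertz(cycles, seconds):
    """Emulated clock rate: 6502 cycles executed per second of host time."""
    return cycles / seconds / 1e6
```

For example, the six-byte program `A2 05 CA D0 FD 00` (load 5 into X, count down to zero, halt) costs 26 emulated cycles in this sketch. Timing many runs of such a program with `time.perf_counter()` and passing the totals to `emulated_megahertz` would give a figure in the same units as the graph.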

Legend
emulate 6502

On this test the G4 machines did not do as well, being surpassed by every other machine tested. The Xeon and Core Duo machines came out on top, emulating at nearly 400 MHz. The G5 machines all perform right about where you’d expect them to, except maybe for the Power Mac G5 Quad which I expected to be one of the top two, but instead came in fourth.

What’s interesting about this test is that while most of the x86 CPUs performed quite well, the Pentium 4c is left behind. Granted it’s an older processor, but it is hyper-threaded and runs at a higher clock speed than either of the Core Duo machines. It also runs at a higher clock speed than the Athlon 64 3200+ (the Athlon 64 is 2.2 GHz, the Pentium 4c is 2.4 GHz), and yet the Athlon 64 performed nearly twice as fast. While that’s not really surprising, it does highlight how little megahertz really means to performance these days.

Memory Performance

Geekbench uses two methods to test memory performance. The first set of four tests all use functions from the standard library: first to fill a block of memory, then to sequentially access chunks of memory within a larger block, then to randomly access chunks of memory within a larger block, and finally to copy one block of memory to another. These tests are single threaded, and results are in megabytes per second.

Legend
memory, stdlib 1

Here it seems that the Core Duo iMacs do very well when writing to memory, but when accessing memory the G5 machines dominate the field. On the sequential access test the Core Duo iMacs are in the same general area as the G4 machines.

Legend
memory, stdlib 2

The G5s continued to dominate these tests, and the iMac Core Duo made a small comeback on the copy test (where overall scores were lower). The Athlon 64 scored similarly to the Core Duos on the access tests, but was beaten handily by them on the write tests.

It’s worth noting that the Pentium 4c scored better than the Xeon on all four of these tests.

The second set of memory tests uses processor operations, first copying data from one block to another, then three operations that manipulate data (scale, add and triad) before writing it back to memory. All of these tests are single threaded, and results are in gigabytes per second.
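
For anyone unfamiliar with the copy, scale, add and triad operations, the sketch below shows them as they are defined in the well-known STREAM benchmark, written in plain Python for readability. This is only an assumption about what Geekbench's versions look like; the function names and the `bandwidth_gb_per_sec` helper are invented here. STREAM counts the bytes of every array an operation reads or writes, so copy and scale touch two arrays while add and triad touch three.

```python
BYTES_PER_ELEMENT = 8  # assuming 64-bit floating point elements

# Number of arrays each kernel reads or writes, following STREAM's accounting.
ARRAYS_TOUCHED = {"copy": 2, "scale": 2, "add": 3, "triad": 3}


def stream_copy(a, c):
    for i in range(len(a)):
        c[i] = a[i]


def stream_scale(b, c, q):
    for i in range(len(c)):
        b[i] = q * c[i]


def stream_add(a, b, c):
    for i in range(len(a)):
        c[i] = a[i] + b[i]


def stream_triad(a, b, c, q):
    for i in range(len(b)):
        a[i] = b[i] + q * c[i]


def bandwidth_gb_per_sec(kernel_name, n, seconds):
    """Bytes moved by one pass of a kernel over n elements, per second, in GB."""
    return ARRAYS_TOUCHED[kernel_name] * n * BYTES_PER_ELEMENT / seconds / 1e9
```

In a real benchmark the arrays would be much larger than any processor cache, so that the time measured is dominated by trips to main memory rather than by the arithmetic. Python's interpreter overhead would swamp that effect, which is why this is a sketch of the operations and not a usable memory benchmark.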

Legend
memory, stream 1

Here the G5 remains on top, but not so clearly. What is more interesting here is that the G4 is so far behind. Admittedly, it isn’t that surprising considering the incredibly slow bus speed (167 MHz), less than half the memory bandwidth of the G5 (2.7 vs 6.4 GB per second), less than a quarter the bus bandwidth of the G5 (1.3 vs 8.0 GB per second) and just the fact that it’s an older CPU.
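
The bandwidth figures quoted above can be reproduced with some back-of-the-envelope arithmetic: peak bandwidth is transfers per second times bytes per transfer times the number of channels or links. The short Python sketch below does this under assumptions that are mine and not stated in the article: a single 64-bit channel of DDR 333 and a 64-bit 167 MHz bus for the G4, and dual-channel DDR 400 plus a 1 GHz bus made of two 32-bit one-way links for the G5. The helper name `peak_gb_per_sec` is made up for illustration.

```python
def peak_gb_per_sec(mega_transfers_per_sec, bytes_per_transfer, links=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer * links / 1e9


# Assumed configurations; widths and channel counts are not from the article.
g4_memory = peak_gb_per_sec(333, 8)             # DDR 333, one 64-bit channel
g5_memory = peak_gb_per_sec(400, 8, links=2)    # DDR 400, two 64-bit channels
g4_bus = peak_gb_per_sec(167, 8)                # 167 MHz, 64-bit bus
g5_bus = peak_gb_per_sec(1000, 4, links=2)      # 1 GHz, two 32-bit one-way links

print(round(g4_memory, 1), round(g5_memory, 1))  # memory bandwidth, GB/s
print(round(g4_bus, 1), round(g5_bus, 1))        # bus bandwidth, GB/s
```

Under those assumptions the G4 comes out at roughly 2.7 and 1.3 GB per second against 6.4 and 8.0 for the G5. These are theoretical peaks, so the scores measured by the benchmark will be lower.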

Legend
memory, stream 2

In these latter two tests the G4 continued to do poorly, and the Core Duo iMac overtook the iMac G5 and the Power Mac G5s 1.6 and Dual 1.8, but not the Power Mac G5 2.0 Dual Core.

The Athlon 64 did poorly on these tests also, outscored by all the other x86 CPUs on all four tests. The Pentium 4c managed to keep up with the Xeon on all four tests, and even surpassed it on one.

Conclusions

While I can tell you more about CPU design than the average person on the street, I’m no Jon “Hannibal” Stokes; there are technical differences between CPUs which account for some of the more interesting results (such as the PowerPC G4 doing so well in the integer tests) that I am unaware of. That said, I can tell you what the results mean to me.

The PowerPC G4 has served us well, but its time has passed. It still performs reasonably well as a lower-end CPU, but when compared to the PowerPC G5 and the Intel Core Duo it’s sorely lacking; it was blown away by both processors in almost every test (save for the Blowfish tests). With the Intel Core Duo in both the iMac and the PowerBook¹, hopefully we’ll see the Intel Core Solo replace the PowerPC G4 in the Mac Mini and the iBook.

The PowerPC G5 is still a good processor. In fact, it’s still a great processor. Apple isn’t switching to Intel chips because Intel chips perform better, but rather because a G5 would melt through the bottom of a laptop.

The Athlon 64 edged out the Pentium 4c on all the CPU tests, while the Pentium 4c edged out the Athlon 64 on all the memory tests. It seems to me that Intel and AMD have had their different strengths all along, so I don’t find this surprising.

The Intel Core Duo is a great processor. It performed as well as or better than the PowerPC G5 at similar clock speeds (1.8GHz and 2.0GHz), and has nowhere to go but up. Better still, it’s likely that Intel-based Power Macs (or Mac Pros, or whatever they’ll be called) will use the successor to Intel’s current Pentium D processor, which won’t be a mobile processor (like the Intel Core Duo) and which will only perform better than the Intel Core Duo.

We’ll be doing another comparison (hopefully with a wider variety of machines²) after the next preview of Geekbench is released. If anyone wants to help out with the next round of tests please drop me a line.

Addendum (30-Jan-06, 3pm Pacific)

I already posted this in the comments, but since it seems that most of you aren’t reading the comments I thought I should make it more visible.

I was reading the comments on this article over at digg.com, and someone said this:

they just used what they had on hand. The point wasn’t to compare new technology. It was to introduce a cross platform benchmarking tool. AMDers, don’t get all uppity. They’re not claiming to represent high end machines. Apple fan boys, shut up, shit down and recognize that you’re rather favored in the line up. This has no bearing on processor superiority.

And that is pretty much exactly right. In short, this article is less about the machines tested than it is about the testing itself.

I know that without a Pentium D or an Athlon64 X2 the results aren’t necessarily what you’d like to see. By using the Athlon64 3200+ sitting on my desk I wasn’t meaning to sell AMD short; I was trying to make sure they had at least one result in the mix. I don’t have a Pentium D or Athlon64 X2 to test! Which is why (as I previously mentioned) I’m hoping that some of you who want to see those machines compared are willing to help me out when I write another article. Are you willing to help me make a better comparison next time around? Then get in touch with me for Pete’s sake!

Also, here are two things to bear in mind when considering the results:

  1. Geekbench is new, and not yet finished.
  2. Geekbench is not currently optimized for any CPU specifically.

Further Addendum

There are now two mailing lists for Geekbench: geekbench-announce, wherein we will announce things like new versions, and geekbench-discuss, in which we will discuss new features, feature requests, methods, etc. You can subscribe to either list (or both) here.

Or, if you prefer your browser to your email client there is now a forum too. Check it out.

Notice (20-Feb-05)

Since the writing of this article a new version of Geekbench has been released. Please see the Geekbench page for more info.

¹ I’m still not calling them “MacBook Pros”.

² With any systems, really, but I’ll be specifically looking for desktop Pentium D, Pentium M, Core Solo, Athlon 64 X2, Sempron, Athlon XP, Celeron, G3 and G4 machines.


Comments

  1. 1 Micheal says:

    Nicely done! Certainly gives a good idea as to where the new machines fit in the current world of processors.

    Posted January 30, 2006, 6:00 pm
  2. 2 the krok says:

    I’m surprised that new processors can loose against a G4… and that the “PC” processors where so slow in general. Quads rule, but we all knew that already? Atleast us owners! ;)

    Posted January 30, 2006, 8:53 pm
  3. 3 AndrewZ says:

    really a nice roundup of performance data, thanks for putting it together.

    Posted January 30, 2006, 9:15 pm
  4. 4 Oliver says:

    Would have been nice if you had more recent “PC” processors in these benchmarks. The Athlon and Pentium 4 chips used are pretty dated and not at all representative of current generation PC hardware. Both AMD and Intel have dual-core chips out that are readily available…

    Posted January 30, 2006, 9:17 pm
  5. 5 Frankie says:

    The PowerMacs won’t switch over to Intel until the next-gen 64-bit CPUs are available.

    The word of the day will be Conroe

    Posted January 30, 2006, 9:20 pm
  6. 6 Matt Simpson says:

    “Would have been nice if you had more recent “PC” processors in these benchmarks. ”

    I know, I know. This is why I have asked for help in the next comparison article. I’d really like to get some Pentium D and Athlon64 X2s in there, among others.

    Posted January 30, 2006, 9:22 pm
  7. 7 Geek says:

    Help, I can’t discern the grey shades. Can you put strip in or something.

    Posted January 30, 2006, 9:31 pm
  8. 8 ktolis says:

    so… where are the altivec accelerated tests? I only see integer and fp. and the intel preprocessor does optimize for mmx etc.. these tests are not altivec accelerated probably. Any intention to finally publish a decent test against altivec implementations?…

    Posted January 30, 2006, 9:35 pm
  9. 9 rick says:

    “Would have been nice if you had more recent “PC” processors in these benchmarks. ”

    You didn’t know the iMac Core 1.8 and 2.0 machines ARE the most recent PC processors right?

    Get out more?

    Posted January 30, 2006, 9:38 pm
  10. 10 Aaron says:

    Do you seriously suggest that the A64/3200+ has a 1GB/sec bandwidth? something is really wrong with your testing methodology man, same box, same proc, I score around 5.8GB/s for mem access… Please check your non-mac tests…

    Posted January 30, 2006, 9:42 pm
  11. 11 blackrim says:

    I am receiving a AMD Athlon64 4800+ with 2 GB RAM in a couple of days and I will surely pass along the results, I think, as you have mentioned, that you are greatly under-representing the Athlon and other Intel chips. They are dated…understatement.

    Posted January 30, 2006, 9:53 pm
  12. I was just reading the comments on this article over at digg.com, and someone said this:

    they just used what they had on hand. The point wasn’t to compare new technology. It was to introduce a cross platform benchmarking tool. AMDers, don’t get all uppity. They’re not claiming to represent high end machines. Apple fan boys, shut up, shit down and recognize that you’re rather favored in the line up. This has no bearing on processor superiority.

    And that is pretty much exactly right. By using the Athlon64 3200+ sitting on my desk I wasn’t meaning to sell AMD short; I was trying to make sure they had at least one result in the mix. As I have already said, I will be soliciting a greater number of results for an article after Geekbench Preview 2 is out.

    Two things to bear in mind:

    1. Geekbench is new, and not yet finished.
    2. Geekbench is not optimized for any CPU specifically.
    Posted January 30, 2006, 10:06 pm
  13. Interesting results. Will be interesting to see how far Intel can push the Yonah.

    Posted January 30, 2006, 10:07 pm
  14. 14 Richard says:

    unbelievable. everyone wants to be a testbench publisher. limited tests, no qualification on overall system used. dated CPUs. no optimised/unoptimised comparison. not worth a digg (how I got here). not worth anything. people like you give bad names to brands that suffer from your ignorant, irrelevant results

    Posted January 30, 2006, 10:09 pm
  15. 15 Barrett says:

    Please add stripes or texture to the grey bars, you could also have a text identifier. Otherwise, awesome work, can’t wait to see the next roundup with newer processors!

    Posted January 30, 2006, 10:09 pm
  16. 16 Nash says:

    Where is the Opteron dual core the AMD server CPU?

    Posted January 30, 2006, 10:20 pm
  17. 17 samsonbull says:

    Opteron??? Well, i guess that is a good question too but I wanted to ask: Where is the Dual Core Athlon64 (like a 3800 or something)?

    Posted January 30, 2006, 10:27 pm
  18. Where is the Dual Core Athlon64 (like a 3800 or something)?

    See either of my comments.

    Posted January 30, 2006, 10:33 pm
  19. 19 Ken says:

    Uh, a much fairer fight would be:

    AMD Athlon 64 FX-60

    AMD Athlon X2 4800+

    Those are AMD’s flagships.

    Posted January 30, 2006, 10:35 pm
  20. 20 Markos says:

    Good work, the critics are quick to point out your flaws but aren’t willing to help or conduct their own tests. I for one applaud you.

    Posted January 30, 2006, 10:39 pm
  21. 21 Emanuel says:

    Your tests are a tad flawed, you used dated CPUs! Come on. Where are the X2s? The Pentium D830s??

    It was a nice effort but you can’t expect anyone to take these tests too seriously with dated hardware.

    Posted January 30, 2006, 10:44 pm
  22. 22 kyle says:

    I would appreciate to see some real world application benchmarks. especially photoshop ones…

    Posted January 30, 2006, 10:46 pm
  23. 23 Tux says:

    Bias, what about the latest AMD? Clearly pro-mac… clearly pro-ignorant

    Posted January 30, 2006, 10:46 pm
  24. 24 chris says:

    the Athlon64 @ 2,2GHz ist not that dated, the P4 however should be around ~ 3 GHz to achieve the same performance.

    I’m a little surprised about the sometimes low scores of Athlon64 and P4. But with 1-4 CPU configurations in one Diagramm, it’s not always easy to read.

    However I don’t need multitple cores, hyperthreading…, just a single fast core. not the best times for this nowadays.

    Posted January 30, 2006, 10:50 pm
  25. 25 Gabriel says:

    “You didn’t know the iMac Core 1.8 and 2.0 machines ARE the most recent PC processors right?”

    No, the AMD Athlon FX-60’s Dual-Core and the 65 nm Pentium D 900’s are the most recent.

    Posted January 30, 2006, 10:50 pm
  26. 26 ras says:

    Interesting results. I hope you get some assistance so you can quiet the cynics. I don’t really care which platform is fastest; they appear to be close in a number of tests and it is the applications that my decisions are primarily based upon.

    Posted January 30, 2006, 10:54 pm
  27. 27 Rick Tugman says:

    It is entirely possible that either Apple’s OS or the Applications running are not taking advantage of the Dual-Core Chips—It’s been a topic of conversation lately that only part of the Dual-Core chips are running since there is nothing is enabling the real power of the chip to kick in.

    Posted January 30, 2006, 11:01 pm
  28. 28 Simon Fearby says:

    Where is the Athlon 64 X2 or even an Athlon 64 4800?

    Great review, NOT.

    Posted January 30, 2006, 11:06 pm
  29. 29 riskin says:

    I appreciate anyone willing to take the time to develop a useful cross-platform benchmark. However, I think the results indicate that this benchmark isn’t ready for primetime. I’m less concerned with the specs of the hardware you’ve represented here. Obviously your article here is more of an analysis of the benchmark itself, not the hardware that ran it. I don’t care that the specs of the Athlon64 machine isn’t up to date. What DOES concern me is that the benchmark does not properly indicate the level of performance expected from a given piece of hardware.

    To put as fine a point on this as possible: to all the folks who are demanding this benchmark be shown running top of the line hardware of a given manufacturer, you are missing the point. Giving this guy a hard time about the specs of the machines used make you look both uninformed and like a jerk. The issue is about the software and its accuracy, NOT on how a G5, an Athlon64, and a Xeon compare to each other. It’s completely irrelevant to this discussion, and continuing to harp on it is unhelpful and distracting.

    The real issue here is whether or not this benchmark is an accurate measuring tool of performance for the hardware tested. As one poster above me mentioned, if you’re only getting 1 GB/s of memory bandwidth out of that AMD system there’s clearly something wrong with your code. I’m not suggesting you trash this project, just that it needs more refinement.

    Similarly, I’m not sure that running a single Mandelbrot set computation which includes a SQRT calculation is in any way indicative of FPU performance. For one, this is only one type of floating point computation, and fractals are not really that common a calculation in real world applications. Neither is SQRT a common operation, so emphasizing it in the single, narrow-focused floating point benchmark is disingenuous and does not provide an accurate measurement of performance. Accurate performance measurement is PRECISELY what a benchmark is supposed to provide.

    I was initially going to take issue with the number of threads you’ve chosen, either 1 or 4, as opposed to two for hyperthreaded or dual CPU or dual core systems. However, the overhead of two CPUs dealing with an extra set of threads isn’t large enough to really give you grief for it. I would suggest exploring that in your testing just to make sure. For an example, see this link (it’s also a great example of a floating point benchmark that accurately reflects real world usage):

    http://www.techreport.com/reviews/2006q1/fx60-vs-955xe/index.x?pg=13

    I take issue with your use of a single task, Blowfish encryption, as your defining measurement of integer performance. (I see you have also included some obscure CPU emulation code as a benchmark for integer performance as well. Considering how esoteric such an endeavor is, I don’t even consider it to be an actual benchmark. For my purposes I am completely disregarding it, as it has little to no practical use in the real world.) While I would definitely include this measurement in your suite of integer measurements, I would not make it the only one. Chess benchmarks are good for measuring integer as well as branch prediction performance and would be useful to include. In short, there are a myriad of ways to measure integer performance across platforms in a way that is tied more closely to hardware than the limitations of that hardware’s OS (such as the downfall of OS X when running MySQL, see link: http://www.anandtech.com/mac/showdoc.aspx?i=2436&p=6 ). I would encourage you to research more of those tests and include them in your benchmark. One narrow type of calculation is effectively useless as a benchmark for an overall category of computation. All you’ve done is show which hardware runs your Blowfish encryption code the best.

    Your memory benchmarks are better (at least you included STREAM code), but they are still not quite an accurate measurement of performance. I see in your conclusion that you read Ars Technica and know of Hannibal’s articles. I would encourage you to explore the Ars Technica forums, particularly the Battlefront (it can get a bit flamey, so be aware) because of the ArsTestbench discussion. There are a lot of folks working on cross platform benchmarks and working together might benefit everyone.

    Despite my feedback, don’t be discouraged. You have a good start, but there is much more work to be done. Good luck!

    Posted January 30, 2006, 11:07 pm
  30. Holy crap, a long, thought-out comment. I’d almost forgotten they existed. Thanks riskin.

    We know there is work to be done, we know it’s not really ready for prime time. This is why it’s only a preview release at the moment. Preview 2 (getting close to release) will already have some refinements in it, we’ll look at what you’ve contributed and see what we can do with it for Preview 3.

    Posted January 30, 2006, 11:15 pm
  31. 31 Eric says:

    Is it possible to make a graph with the ouput automatically yet?

    I can provide results for G4 1.42ghz, AMD XP 2600+, AMD64 S754 3000+, AMD64 S939 3500+, dual AMD Opteron 244 also :)

    Posted January 31, 2006, 12:06 am
  32. 32 Johan Krüger-Haglert says:

    I think it was very intresting, especially to see that the Intel chips performed quite well. I would to have appreciated an Athlon64 X2 4800+ or something but I guess it will get added later.

    Posted January 31, 2006, 12:11 am
  33. 33 drowelf says:

    ************************************

    WHERE ARE THE MODERN AMD CHIPS??

    ************************************

    As Johan writes, until you get an Athlon64 X2 4800+ and an Athlon 64 FX 60 in this roundup, you don’t have anything substantial from AMD and are getting the wrong impression about which processors are on top.

    Posted January 31, 2006, 12:14 am
  34. This is probably the best benchmarking I’ve seen done in a while, and by far the best looking.

    What did you use to make those graphs? iWork?

    Posted January 31, 2006, 12:31 am
  35. 35 Mike says:

    All of these CPU are for games, or high graphical thing. Most people wouldn’t know the difference between these and the last generation CPU. If you have the money to get one of these CPU then go for it, but if your happy with your own computer then save your money for other thing.

    Posted January 31, 2006, 12:37 am
  36. 36 robot says:

    Hm, it is impossible to se the difference between the grayscales… You have to fix this

    Posted January 31, 2006, 12:37 am
  37. 37 me says:

    Get real. Yet another benchmark with an iMac with 512MB of ram against machines with double that or more. Just plain dumb.

    Posted January 31, 2006, 1:42 am
  38. John,

    Thanks for your input.

    Eric,

    Not yet. I’ll get in touch when we do more testing, though.

    Johan,

    Yes, in the next battery of tests we’ll hopefully have at least one A64 X2 and Pentium D.

    drowelf,

    Thanks for your input.

    Mike,

    You make a good point. Most people probably don’t need the horsepower offered by the newest of the new CPUs.

    robot,

    I don’t have a problem differentiating the grey bars, but I’ve received enough feedback that I will change it in the next article.

    me,

    The iMac with the least RAM was one of the best overall machines. Plus, the Geekbench tests don’t use anywhere near as much RAM as any of these machines have, so much so that I doubt it has any bearing on performance unless you are running a bunch of other stuff at the same time.

    Posted January 31, 2006, 2:44 am
  39. 39 dh says:

    Nice article and thoughtful well balanced conclusions.

    As this article is described as being as much about the testing as the results I would like to make a suggestion / request.

    It seems that it would be useful if you were to augment your testing by also doing tests with linux or netBSD based tools running on each machine.

    Both of these os’s are available for all of the cpu’s you are testing.

    This would allow you to run the same tests on all machines, even optimizing the tests to show each cpu at its best advantage while also allowing for a more direct comparison of different cpu families.

    Posted January 31, 2006, 3:47 am
  40. 40 DanoX says:

    Nice test one of the best so far.

    Posted January 31, 2006, 5:39 am
  41. 41 Sturmmann says:

    Interesting tests…

    1) Could you also mention the used compiler and compiler switches for the benchmarks?

    2) Athlon64 may suffer significantly in the integer (blowfish) test if 64bit multiplication instructions are not used.

    Posted January 31, 2006, 7:49 am
  42. 42 Jason T. says:

    I ran the benchmarks under linux using wine on my Athlon64 3200. I had firefox, azureus (with a few uploads going), evolution, and a few gnome-terminals open. Here are the results:

    os: Windows 2000 Service Pack 4

    cpu: x 1

    (x86 Family 6 Model 6 Stepping 8)

    cpu frequency: 2003 MHz

    bus frequency: ?

    memory: 1011 MB

    version: Geekbench Preview (10)

    compiler: Visual C++ v1310 (600)

    cpu (float) mandelbrot (sqrt) 1 thread 649.04 megaflops

    cpu (float) mandelbrot (sqrt) 4 threads 670.48 megaflops

    cpu (float) mandelbrot (nosqrt) 1 thread 885.31 megaflops

    cpu (float) mandelbrot (nosqrt) 4 threads 881.78 megaflops

    cpu (integer) blowfish (cache) 1 thread 45.29 megabytes/sec

    cpu (integer) blowfish (cache) 4 threads 49.91 megabytes/sec

    cpu (integer) blowfish (memory) 1 thread 39.39 megabytes/sec

    cpu (integer) blowfish (memory) 4 threads 44.51 megabytes/sec

    cpu (integer) emulate 6502 1 thread 384.68 megahertz

    memory (stdlib) fill 1 thread 1.18 gigabytes/sec

    memory (stdlib) sequential access 1 thread 1.14 gigabytes/sec

    memory (stdlib) random access 1 thread 1.16 gigabytes/sec

    memory (stdlib) copy 1 thread 899.62 megabytes/sec

    memory (stream) copy 1 thread 1.56 gigabytes/sec

    memory (stream) scale 1 thread 1.57 gigabytes/sec

    memory (stream) add 1 thread 1.65 gigabytes/sec

    memory (stream) triad 1 thread 1.64 gigabytes/sec

    Posted January 31, 2006, 9:29 am
  43. 43 bloodline says:

    Hi, could you give an average result for each CPU at the end of the article? That should even out the results, and give a better real world indicator.

    Thanx

    Posted January 31, 2006, 9:36 am
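One way to produce the kind of single average asked for here is a geometric mean of per-test ratios against a reference machine. The sketch below is only an illustration: the choice of geometric mean, the choice of reference machine, and the function name `geometric_mean_ratio` are assumptions, not anything Geekbench is known to do. The figures are the single-thread results readers posted in comments 42 and 48 of this thread.

```python
from math import prod

# Single-thread results copied from two reader comments in this thread.
# Units differ from test to test, so only per-test ratios are meaningful.
athlon64_3200 = {  # comment 42 (run under Wine)
    "mandelbrot (sqrt)": 649.04,    # megaflops
    "mandelbrot (nosqrt)": 885.31,  # megaflops
    "blowfish (cache)": 45.29,      # megabytes/sec
    "blowfish (memory)": 39.39,     # megabytes/sec
    "emulate 6502": 384.68,         # megahertz
}
pentium_d = {  # comment 48
    "mandelbrot (sqrt)": 870.12,
    "mandelbrot (nosqrt)": 777.89,
    "blowfish (cache)": 79.89,
    "blowfish (memory)": 73.69,
    "emulate 6502": 372.11,
}


def geometric_mean_ratio(machine, reference):
    """Geometric mean of machine/reference ratios over the reference's tests."""
    ratios = [machine[test] / reference[test] for test in reference]
    return prod(ratios) ** (1.0 / len(ratios))


overall = geometric_mean_ratio(pentium_d, athlon64_3200)
print(f"Pentium D vs Athlon64 3200 (single thread): {overall:.2f}x")
```

A geometric mean is used rather than an arithmetic one so that a single outlying test does not dominate and so that the result does not depend on which machine is picked as the reference (swapping the two simply inverts the ratio).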
  44. 44 nicolas says:

    Look at this, man!

    Posted January 31, 2006, 3:28 pm
  45. These aren’t TOO relevant as integer, floating point, and memory speeds are only pieces of a puzzle that aren’t put together. Bus speeds, integrated memory controllers, etc. will have such a huge effect on performance as to seriously cloud any conclusions you might draw from these results. Integrated Memory Control is the Crown Jewels of AMD. This can only be shown to shine with real running applications compared to another computer w/ a different proc running the same app.

    my 2 cents. ;)

    Posted January 31, 2006, 6:01 pm
  46. 46 Bill Leeper says:

    It was an interesting read for two reasons.

    1. It was guaranteed to generate a lot of negative comments by people who need to feel that their choice of processor is superior to everyone else’s. With all the ranting about the latest technology not being used it tells me that most of them don’t have a clue about what most people in the real world use. Your tests are very relevant to those people. Most users can’t afford the latest and greatest Pentium or Athlon.

    2. With the exception of the Core Duo machines none of the Apple hardware could really be considered state of the art. Even the G5 quad is pretty much old technology simply packaged in a new form.

    I for one hope the author does not let all the disparaging remarks deter him from proceeding. Benchmarking the latest and greatest really has a lot less relevance to computing in general because only the elite few can afford the equipment. These tests give the rest of us some useful information about purchasing less than bleeding edge equipment.

    Posted January 31, 2006, 6:29 pm
  47. 47 dragonace says:

    Hi,

    First of all, thanks for the benchmarks. I think it’s a good start as well. Hopefully, with some good feedback you’ll be able to put together a top-notch benchmark. (Especially liked riskin’s input!)

    Not sure if it’s important or not, but I’d be interested in knowing the specs on the memory in the machines as well.

    Such as the manufacturer, CAS latency, ECC/non-ECC, buffered/non-buffered, etc. Basically the more info you give on the specs would help the rest of us determine if there are any obvious differences that could skew the results.

    I saw your comment on the amount of memory not really coming into play for the benchmark results, but since it’s a benchmark, the closer the systems are to each other in every aspect except the processor, the fewer variables there are to cause anomalies.

    Just wondering if there would be any way the hard drives could affect the results as well? Are all the tests and results done totally in RAM? Are the systems that were used for the tests running anything else? Perhaps a list of running processes when the test is being run for each system would be helpful as well?

    As you can probably tell, just throwing out some ideas/questions. You’ve probably already thought of all them, but figured since you asked for input it wouldn’t hurt. :)

    Good Luck.

    Posted January 31, 2006, 6:51 pm
  48. 48 stanner says:

    I ran Geekbench on a new, out-of-the-box Pentium D running XP SP2. Here are my findings.

    os: Windows XP Service Pack 2

    cpu: Intel® Pentium® D CPU 3.20GHz x 2

    (x86 Family 15 Model 4 Stepping 4)

    cpu frequency: 3199 MHz

    bus frequency: ?

    memory: 2045 MB

    version: Geekbench Preview (10)

    compiler: Visual C++ v1310 (600)

    cpu (float) mandelbrot (sqrt) 1 thread 870.12 megaflops

    cpu (float) mandelbrot (sqrt) 4 threads 1.96 gigaflops

    cpu (float) mandelbrot (nosqrt) 1 thread 777.89 megaflops

    cpu (float) mandelbrot (nosqrt) 4 threads 1.48 gigaflops

    cpu (integer) blowfish (cache) 1 thread 79.89 megabytes/sec

    cpu (integer) blowfish (cache) 4 threads 152.15 megabytes/sec

    cpu (integer) blowfish (memory) 1 thread 73.69 megabytes/sec

    cpu (integer) blowfish (memory) 4 threads 123.66 megabytes/sec

    cpu (integer) emulate 6502 1 thread 372.11 megahertz

    memory (stdlib) fill 1 thread 1.75 gigabytes/sec

    memory (stdlib) sequential access 1 thread 1.74 gigabytes/sec

    memory (stdlib) random access 1 thread 1.89 gigabytes/sec

    memory (stdlib) copy 1 thread 1.22 gigabytes/sec

    memory (stream) copy 1 thread 2.81 gigabytes/sec

    memory (stream) scale 1 thread 2.57 gigabytes/sec

    memory (stream) add 1 thread 2.63 gigabytes/sec

    memory (stream) triad 1 thread 2.63 gigabytes/sec

    Posted January 31, 2006, 7:36 pm
  49. 49 LoLL says:

    A try on an AMD X2 3800+ (2000 MHz) under XP SP1 (apps running in the background)

    os: Windows XP Service Pack 1

    cpu: AMD Athlon™ 64 X2 Dual Core Processor 3800+ x 2

    (x86 Family 15 Model 35 Stepping 2)

    cpu frequency: 2002 MHz

    bus frequency: ?

    memory: 1535 MB

    version: Geekbench Preview (10)

    compiler: Visual C++ v1310 (600)

    cpu (float) mandelbrot (sqrt) 1 thread 643.85 megaflops

    cpu (float) mandelbrot (sqrt) 4 threads 1.26 gigaflops

    cpu (float) mandelbrot (nosqrt) 1 thread 950.62 megaflops

    cpu (float) mandelbrot (nosqrt) 4 threads 1.79 gigaflops

    cpu (integer) blowfish (cache) 1 thread 54.12 megabytes/sec

    cpu (integer) blowfish (cache) 4 threads 106.18 megabytes/sec

    cpu (integer) blowfish (memory) 1 thread 49.84 megabytes/sec

    cpu (integer) blowfish (memory) 4 threads 88.21 megabytes/sec

    cpu (integer) emulate 6502 1 thread 297.26 megahertz

    memory (stdlib) fill 1 thread 1.37 gigabytes/sec

    memory (stdlib) sequential access 1 thread 1.37 gigabytes/sec

    memory (stdlib) random access 1 thread 1.38 gigabytes/sec

    memory (stdlib) copy 1 thread 848.56 megabytes/sec

    memory (stream) copy 1 thread 1.84 gigabytes/sec

    memory (stream) scale 1 thread 1.77 gigabytes/sec

    memory (stream) add 1 thread 1.84 gigabytes/sec

    memory (stream) triad 1 thread 1.82 gigabytes/sec

    Posted January 31, 2006, 7:53 pm
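The two dual-core result sets above (comments 48 and 49) are easier to compare once each 4-thread figure is divided by its 1-thread figure. The short sketch below does only that arithmetic on the posted numbers; the gigaflops values are converted to megaflops, and the `speedup` helper is a name made up for this illustration.

```python
# (1 thread, 4 threads) pairs from comments 48 and 49, in a common unit per test.
results = {
    "Pentium D 3.2 GHz (comment 48)": {
        "mandelbrot (sqrt)": (870.12, 1960.0),    # megaflops (1.96 gigaflops)
        "mandelbrot (nosqrt)": (777.89, 1480.0),  # megaflops (1.48 gigaflops)
        "blowfish (cache)": (79.89, 152.15),      # megabytes/sec
        "blowfish (memory)": (73.69, 123.66),     # megabytes/sec
    },
    "Athlon 64 X2 3800+ (comment 49)": {
        "mandelbrot (sqrt)": (643.85, 1260.0),    # megaflops (1.26 gigaflops)
        "mandelbrot (nosqrt)": (950.62, 1790.0),  # megaflops (1.79 gigaflops)
        "blowfish (cache)": (54.12, 106.18),      # megabytes/sec
        "blowfish (memory)": (49.84, 88.21),      # megabytes/sec
    },
}


def speedup(one_thread, four_threads):
    """How many times faster the 4-thread run is than the 1-thread run."""
    return four_threads / one_thread


for machine, tests in results.items():
    print(machine)
    for name, (t1, t4) in tests.items():
        print(f"  {name:22s} {speedup(t1, t4):.2f}x")
```

On a two-core machine a ratio near 2.0 means the test scales almost perfectly across cores. The gigaflops figures were posted with only three significant digits, so the floating-point ratios are approximate.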
  50. Someone mentioned that the 64-bit Athlon was hampered by not using 64-bit multiply. Well, if you want to get right down to it, so is the G5. You can’t reasonably expect to see everything built 64-bit, but it would be interesting to see the benchmarks redone with gcc configured for a 64-bit PPC or Intel target on the machines that support 64-bit just to see what the performance impact is.

    It would also be useful to have a much broader spread of integer and floating point tests.

    Someone else suggested optimization, however, and I think that would be a big mistake. Real-world code rarely is optimized for a particular processor. In a few cases it might be (audio plug-ins, Photoshop), but in general, you should assume that most of the software you deal with was written in straight C with only a limited amount of optimization. In a typical app, you might see the use of optimized libraries for certain heavy duty transforms like FFT (which would be a good benchmark test), but I wouldn’t expect the bulk of the code to be optimized, so any optimized tests should be taken with a grain of salt.

    Posted January 31, 2006, 9:53 pm
  51. 51 DrPizza says:

    Synthetic tests aren’t accurate at all for real performance.

    Posted February 1, 2006, 12:07 am
  52. 52 Seiya says:

    well..

    in the floating point test the Athlon64 has great performance, but in integer performance the first test is almost zero..

    aren’t the tests a bit too much in favor of PPC?

    Posted February 1, 2006, 1:34 am
  53. 53 Jeremy Richey says:

    Most of the benchmarks in this suite seem very informative and accurate. The one exception is the blowfish algorithm, which generally takes endianness into account.

    If you want to measure integer performance, then the algorithm should be free from endian-swapping functions and things of that nature.

    Also, if it’s going to be run in cache, then the text and data should be small enough to fit in the cache of all the processors.

    Guess I’m saying that while blowfish is an awesome encryption protocol, it has a couple of things that should disqualify it as a benchmark.

    We need an endian-safe algorithm. And for measuring from within the cache, the whole thing needs to be in there for all the processors.

    Of course the second blowfish test is more relevant because we only have the endian issue with it. However, it should still probably be replaced by one that finds a prime number or something else that doesn’t care what the endianness of the processor is.

    Posted February 1, 2006, 5:02 am
  54. 54 l'Ours says:

    OK, OK, benchmarks are always interesting, but why is my 50 MHz MC68060 more responsive than the most powerful systems tested here?

    Posted February 1, 2006, 10:12 am
  55. This is only a synthetic benchmark; we need a real-world benchmark based on pro applications like Final Cut, Adobe AfterEffects, Flash, etc.

    Please check http://www.maZintel.com for a benchmark clip

    Posted February 1, 2006, 10:28 am
  56. 56 Sturmmann says:

    I don’t have the sources for these benchmarks, but I checked two different blowfish implementations. Blowfish consists mainly of 32-bit xor operations and additions. The algorithm is specified so that it cannot be optimized for 64-bit processors. The algorithm uses 32-bit subkeys and 32-bit S-boxes. It is possible to initialize subkeys and S-boxes in little or big endian format. In this benchmark, I suspect that subkeys and S-boxes are always initialized in big endian format, so it is necessary to convert these from big endian to little endian format during the encryption. This would cripple performance on Pentium and Athlon64. However, I don’t have the sources so I cannot confirm this.

    Posted February 1, 2006, 1:07 pm
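To make the suspected conversion concrete, here is a small sketch of what swapping a 32-bit word between big endian and little endian looks like. It is not Geekbench code (the source was not available to the commenter) and `swap32` is a name made up for this illustration; the sample words are the leading hexadecimal digits of pi, which Blowfish uses to seed its subkeys.

```python
import struct


def swap32(word):
    """Reverse the byte order of one 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack(">I", word))[0]


# Two sample 32-bit words (leading hex digits of pi).
words = [0x243F6A88, 0x85A308D3]

# The same eight bytes, read with each byte order, give different integers.
raw = struct.pack(">2I", *words)          # stored big endian
as_big = struct.unpack(">2I", raw)        # read back correctly
as_little = struct.unpack("<2I", raw)     # misread on a little-endian reader

for original, misread in zip(as_big, as_little):
    # One swap per word repairs the misread value.
    print(f"{original:#010x} read as {misread:#010x}, swapped back {swap32(misread):#010x}")
```

If a swap like this had to be done for every word inside the encryption loop, a little-endian processor would do extra work that a big-endian one would not, which is the effect the comment describes.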
  57. 57 Anonymous Coward says:

    I realize that your tool is an attempt at writing a good set of synthetic benchmarks, but your results for the Intel and AMD memory tests are both way off. The results roughly state that a memory fill on the Intel or AMD chips hits around 1 gig/second, which is really only a fraction of the actual memory bandwidth. The AMD chip should be running PC3200 (3.2 gig/sec), and if you’re running dual channel, then that should be bumped up to 6.4 gig/sec (theoretical max).

    Even accounting for the fact that you’re not going to hit the theoretical max, your benchmark is doing something very poorly to score so low on memory benchmarks.

    Look at the SiSoft Sandra set of benchmarks as a good set of synthetic benchmarks. As it stands, your tool does not seem to handle the Intel or AMD chips even roughly correctly.

    Posted February 1, 2006, 8:11 pm
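The 3.2 and 6.4 gig/sec figures quoted above follow from the memory's transfer rate and bus width: PC3200 is DDR-400, which makes 400 million transfers per second over a 64-bit (8-byte) channel. The helper below just spells out that multiplication; the function name and parameters are made up for this illustration.

```python
def peak_bandwidth_mb_s(mega_transfers_per_sec, bus_width_bits=64, channels=1):
    """Theoretical peak bandwidth in MB/s: transfers x bytes per transfer x channels."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_sec * bytes_per_transfer * channels


single = peak_bandwidth_mb_s(400)              # DDR-400, one channel
dual = peak_bandwidth_mb_s(400, channels=2)    # DDR-400, dual channel

print(f"single channel: {single} MB/s")  # 3200 MB/s, hence the name PC3200
print(f"dual channel:   {dual} MB/s")    # 6400 MB/s
```

These are upper bounds set by the memory interface; measured throughput from portable code is always lower.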
  58. The AMD only pulled in around 1 GB/sec and the Pentium came in at 1.5 or so, both out of a possible 3.2 GB/sec. The PowerMac G5 Quad came in at ~3.5 GB/sec, but out of a possible 8.5 GB/sec.

    In other words, all the machines are coming in at about 30-40% of their theoretical top score. The scores may not be as pretty as you want, but I don’t believe they are unrealistic or inconsistent, given that Geekbench isn’t written in any machine-specific assembler like most other synthetic benchmarks we’ve looked at.

    For what it’s worth, the G4 machines only scored at about 25% of their theoretical limit.

    Posted February 1, 2006, 8:49 pm
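For readers who want to check the percentages, the sketch below divides each measured figure quoted in this comment by its quoted theoretical maximum. The numbers are taken as approximate ("around", "or so", "~") exactly as written above; the `fraction_of_peak` helper is a name made up for this illustration.

```python
# (measured GB/s, theoretical GB/s) as quoted in the comment above.
quoted = {
    "AMD": (1.0, 3.2),
    "Pentium": (1.5, 3.2),
    "PowerMac G5 Quad": (3.5, 8.5),
}


def fraction_of_peak(measured, theoretical):
    """Measured bandwidth as a fraction of the theoretical maximum."""
    return measured / theoretical


for machine, (measured, theoretical) in quoted.items():
    pct = 100 * fraction_of_peak(measured, theoretical)
    print(f"{machine:18s} {measured} / {theoretical} GB/s = {pct:.0f}%")
```

With these rounded inputs the AMD and G5 Quad land at roughly 31% and 41%, while the Pentium figure as quoted works out a little higher, at about 47%.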
  59. 59 syko says:

    The Athlon64 had 3.2GB/s of potential bandwidth if it is S754, or 6.4GB/s if it is S939, but you don’t specify this. (Look at reviews on Anandtech, Tech Report, etc., to see how to report hardware configurations accurately.) The 1GB/s measured does seem to indicate an issue with the code there. But maybe the A64 just copes badly with that particular implementation? The G5 has a TERRIBLE memory controller (very high latency) yet performs better, so something isn’t right. Have you got a latency test?

    Posted February 2, 2006, 11:34 pm
  60. 60 tom ward says:

    I always find these tests bring out the “pissing contest” mentality … Nice to see some fast computers, and good to have a comparison of them, but I use an “operating system” and “applications”, i.e. figures of speed don’t mean a great deal if half my time is spent trying to work out how to use an application or get from point A to point B.

    I think these are the points that those comparing their respective “size” and “girth”, and who wave the numbers like a flag will always fail to see.

    Posted February 7, 2006, 1:09 am
  61. 61 Mathias says:

    Nice. A really useful website to compare the strengths and weaknesses of the CPUs.
    I was surprised about the comparison of the iMac G5 and the iMac Core Duo.
    In some tests the G5 even has better results, which I wonder about.
    To be fair, the G5 is still a 64-bit CPU and processes bigger packages.
    A 1.67 GHz G4 CPU, used in the current PowerBook 15″ and 17″, is somewhat missing.
    As the MacBook Pro now ships with the same CPU that is in the iMac Core Duo,
    this would have been a direct comparison.
    Anyway. Having seen the results I will buy a MacBook Pro as soon as I can afford one.
    Seeing the difference, and the ‘not really competitive’ performance of the G4, is
    another argument for the MacBook Pro.

    Posted February 21, 2006, 2:31 am
  62. Hello.

    I think your benchmark might have more meaning from a hardware perspective if you wrote it so that it would run all by its lonesome from some sort of bootable media, say a bootable CD. I’m not saying a bootable mac/linux/windows CD. I’m suggesting booting your benchmark application. The variances between operating system schedulers and the decisions they make for short-term application scheduling and whatnot are just too much to take into account if you really want to compare the processors and system memory bandwidth. I’d recommend dumping your results to the nearest serial port. :-) This is really the only way to have a true hardware comparison across all of these different architectures and operating systems. Eliminate as much system overhead and noise as possible.

    Yes, doing a benchmark suite in this manner would be quite difficult. It would also have real meaning at the end of the day…

    Posted March 4, 2006, 6:33 am
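Short of booting the benchmark directly, a much cheaper way to reduce (though not eliminate) the scheduler noise described above is to repeat each measurement and report the minimum and median rather than a single run. The sketch below only illustrates that idea; it is not how Geekbench works, and `time_runs`, `summarize`, and `fill_buffer` are names made up for this example.

```python
import statistics
import time


def time_runs(workload, repeats=9):
    """Run the workload several times and return the elapsed seconds of each run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Minimum approximates the undisturbed run; spread hints at system noise."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "spread": max(samples) - min(samples),
    }


def fill_buffer(size=1_000_000):
    """A stand-in workload: allocate a buffer and touch one byte per 4 KB."""
    buf = bytearray(size)
    for i in range(0, size, 4096):
        buf[i] = 1
    return buf


if __name__ == "__main__":
    stats = summarize(time_runs(fill_buffer))
    print({name: f"{value * 1e3:.3f} ms" for name, value in stats.items()})
```

A large spread relative to the minimum suggests that other processes or the scheduler interfered with some of the runs, which is exactly the effect a bootable benchmark would avoid.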
  63. 63 Mikael says:

    The PPC processors are more sensitive to optimizations and it’s in practice necessary to use the Mac OS X libraries to get the performance they are capable of.

    Memory access speed is excellent with the G5 processors and less good with the G4, but that is somewhat compensated for by proper AltiVec usage, which is why programming for the Mac OS is so crucial. A quick and dirty port from Windows will not do, and generic software packages are no good either.

    Integer calculation speed is excellent with both the G5 and G4 PPC, but they need top optimizations (gcc -fast -mcpu=powerpc) to really shine. I’d expect even the older G4 to outshine the Pentium by a factor of at least 2.

    Floating point is great with the G5 with its dual FPU architecture, while the G4 is somewhat lacking. With proper optimizations it should perform roughly the same as the Pentium though.

    So while it was interesting to see these test results, they are in practice completely meaningless.

    Posted March 15, 2006, 4:00 am
  64. 64 Matt says:

    Mikael,

    Thanks for the input, but I’m afraid you are missing the point of Geekbench.

    Posted March 15, 2006, 8:28 pm
  65. 65 Mikael says:

    Then what is it?
    If the Mac OS PPC version is so crippled (read: unoptimized) then these results are not realistic.

    Posted March 26, 2006, 7:17 am
  66. 66 Matt says:

    Geekbench isn’t optimized for Altivec because it also isn’t optimized for SSE. You won’t see a version optimized for one before you see it optimized for both.

    The key is balance. You may think that the OS X version is crippled, but rest assured that the Windows and Linux versions are just as crippled.

    Posted March 26, 2006, 1:29 pm
  67. 67 Mikael says:

    I’m not talking only about AltiVec but about a number of special treatments of the resulting machine code which the PPC requires. GCC has been used on the x86 platform for many, many more years, so it’s better adapted to that specific CPU.

    Do a man gcc in the Mac terminal app.
    Search by entering “/-fast”.
    Press n once to find the explanation of -fast.

    I’d optimize with the G4 as the base CPU by giving GCC these options:
    -fast -mcpu=G4

    If fast math is not wanted then use the options individually as desired.
    -malign-natural will align floating-point values along their natural boundaries.
    -falign-functions to the best boundaries as well, and so on.

    This is a price to be paid for using a compiler which is not fully designed for the PPC CPU.

    Posted April 2, 2006, 7:24 am
  68. 68 John says:

    I’ve talked about the compilers and compiler flags we use before, but it’s worth repeating here: we’re using the standard compilers for the platform, and the vendor-recommended compiler flags for release code.

    In other words, Apple doesn’t set -fast on by default (for whatever reason), so we’re not going to turn it on.

    Posted April 2, 2006, 4:40 pm
  69. 69 Mikael says:

    “In other words, Apple doesn’t set -fast on by default (for whatever reason), so we’re not going to turn it on.”

    That’s almost funny.

    Posted April 8, 2006, 4:59 pm
  70. 70 loko says:

    i see now by your posts there is no doubt i will stay rendering using my sgi 2048 cpu and 4096 GB ram instead of buying those new “chip chaw” intels or ibm, motorolas and whatsoever

    …but i still love that aluminium finish of the quad…

    anyone interested in a cluster? starting at 224k… ferraris and astons accepted as a deposit

    Posted April 21, 2006, 8:41 pm
  71. 71 Wibby says:

    Here is my FX60

    Geekbench Information
    Version: Geekbench Preview 2 (r72)
    Compiler: Visual C++ 1400

    System Information
    OS: Microsoft Windows XP Professional
    Model: NVIDIA AWRDACPI
    Motherboard: KN1 SLI Lite
    CPU: AMD Athlon(tm) 64 FX-60 Dual Core Processor
    CPU ID: Family 15 Model 35 Stepping 2
    CPU Count (Physical): 2
    CPU Count (Logical): 2
    CPU Frequency: 2612 MHz
    Bus Frequency: 0 MHz
    Memory: 2046 MB

    CPU Integer Performance
    Emulate 6502 119 (1 thread, 212.8 megahertz)
    Emulate 6502 237 (4 threads, 434.4 megahertz)
    Blowfish 51 (1 thread, 71.11 megabytes/sec)
    Blowfish 92 (4 threads, 139.9 megabytes/sec)
    bzip2 Compress 214 (1 thread, 38.34 megabytes/sec)
    bzip2 Compress 414 (4 threads, 77.05 megabytes/sec)
    bzip2 Decompress 199 (1 thread, 82.97 megabytes/sec)
    bzip2 Decompress 383 (4 threads, 162.2 megabytes/sec)

    CPU Floating Point Performance
    Mandelbrot 185 (1 thread, 1.246 gigaflops)
    Mandelbrot 345 (4 threads, 2.41 gigaflops)

    Memory Performance
    Latency 698 (1 thread, 14.99 nanoseconds/load)
    Read Sequential 340 (1 thread, 2.474 gigabytes/sec)
    Write Sequential 332 (1 thread, 1.913 gigabytes/sec)
    Stdlib Allocate 3721 (1 thread, 2.899 megaallocs/sec)
    Stdlib Allocate 674 (4 threads, 528.5 kiloallocs/sec)
    Stdlib Write 122 (1 thread, 1.919 gigabytes/sec)
    Stdlib Copy 151 (1 thread, 1.129 gigabytes/sec)

    Stream Performance
    Stream Copy 172 (1 thread, 2.353 gigabytes/sec)
    Stream Scale 173 (1 thread, 2.363 gigabytes/sec)
    Stream Add 174 (1 thread, 2.407 gigabytes/sec)
    Stream Triad 169 (1 thread, 2.387 gigabytes/sec)

    Posted May 4, 2006, 4:45 am