Wednesday, 07 December 2011

Why Intel Ivy Bridge Still Quad Core




A couple of days ago, we published our Ivy Bridge Desktop Lineup Overview, in which we mentioned that Ivy Bridge will remain a quad-core solution. There are dozens of forum posts with people asking why there's no hex-core Ivy Bridge, so now seems like a good time to address the question. Fundamentally, Ivy Bridge is a die shrink of Sandy Bridge (a "tick" in Intel's world), and a die shrink usually means that either the core count or the frequency is increased thanks to the lower power consumption of the smaller process node. This time the core count stays put: instead of hex-core, we get a chip that looks much the same as a year-old Sandy Bridge, only with improved efficiency and some other moderate tweaks to the design. Let's go through some of the elements that influence the design of a new processor, and when we're done we will hopefully have clarified why Ivy Bridge remains a quad-core solution.

Marketing
If we look at the situation from the marketing standpoint first, having a hex-core Ivy Bridge die would more or less kill the just-released Sandy Bridge E. Sure, IVB is about five months away, but I doubt Intel wants to relive the Sandy Bridge vs. Nehalem (i7-9xx) situation--even Bloomfield vs. Lynnfield was quite bad. If Intel created a hex-core IVB die, they would also have to cut the prices of SNB-E substantially, or very few SNB-E systems would be sold: the current cheapest hex-core SNB-E is $555, while a hex-core IVB would most likely be priced at $300 to $400 since it's aimed at the mainstream. Even then, most consumers would opt for the IVB platform due to cheaper motherboard costs and lower TDP. PCIe 3.0 should also make 16 lanes fine for dual-GPU setups, reducing the market for SNB-E even more.
Differentiating the lineup by keeping Ivy Bridge quad-core allows some market for SNB-E among enthusiast consumers. Ivy Bridge E isn't coming before H2 2012 anyway so SNB-E must please the high-end until IVB-E hits. In the end, we still recommend SNB-E primarily for servers and workstations where the extra memory channels, PCIe lanes, and dual-socket support are more important, but the lack of hex-core IVB parts at least gives the platform a bit more of an advantage.
Evolution from traditional CPU to SoC
There are more than just marketing reasons, though. If we look at the following die shots, we can see that CPUs are becoming increasingly similar to SoCs.

Quad-core Kentsfield package (2006)

Quad-core Nehalem die (2008)

Quad-core Sandy Bridge die (2011)

These three chips (well, technically two, because Kentsfield consists of two dual-core Conroe dies) are the only "real" quad-core CPUs from Intel. There are quad-core Gulftown Xeons, and there will soon be quad-core SNB-E CPUs, but they all have more cores on the actual die; some of them have just been disabled. Comparing the die shots, we notice that our definition of CPU has changed a lot in only five years or so. Kentsfield is a traditional CPU, consisting of processing cores and L2 cache. In 2008, Nehalem moved the memory controller onto the CPU die, which allowed Intel to get rid of the Northbridge-Southbridge combination and replace it with their Platform Controller Hub. A year and a half later, Westmere (e.g. Arrandale and Clarkdale) brought us on-package graphics--note that it was on-package, not on-die, as the GPU was on a separate die. It wasn't until Sandy Bridge that we got on-die graphics. The SNB graphics occupy roughly 25% of the total die area, or the space of three cores if you prefer to look at it that way, and IVB's graphics (a "tock" on the GPU side, as opposed to a "tick") will occupy even more space.

While we don't have a close-up die shot of Ivy Bridge (yet), we do know its approximate die size, and the layout should be similar to the Sandy Bridge die as well. Anand estimated the die size to be around 162mm^2 for what appears to be the quad-core die (dual-core SNB with GT2 is 149mm^2, and even with the more complex IGP we wouldn't expect dual-core IVB to be larger). That's a 25% reduction in die size compared with the quad-core SNB die (216mm^2). A 22nm quad-core SNB die would measure in at 102mm^2 with perfect scaling and assuming all the logic/architecture is the same; however, scaling is never perfect and we know there are a few new additions to IVB, so 162mm^2 for the IVB die sounds right. Transistor-wise, IVB counts in at around 1.4 billion, a 20.7% increase over quad-core SNB.
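The die-size arithmetic above can be reproduced with a short sketch. It uses only the figures quoted in this article and assumes the simplest possible model, in which die area scales with the square of the process node; the function and variable names are ours, purely for illustration.

```python
# Figures quoted in the article.
SNB_QUAD_AREA_MM2 = 216.0   # quad-core Sandy Bridge die (32nm)
IVB_QUAD_AREA_MM2 = 162.0   # Anand's estimate for quad-core Ivy Bridge (22nm)
OLD_NODE_NM = 32.0
NEW_NODE_NM = 22.0
IVB_TRANSISTORS = 1.4e9     # "around 1.4 billion"
TRANSISTOR_INCREASE = 0.207 # "a 20.7% increase over quad-core SNB"


def ideal_shrink_area(area_mm2, old_nm, new_nm):
    """Area after a perfect shrink: area scales with the square of the node size."""
    return area_mm2 * (new_nm / old_nm) ** 2


def percent_change(old, new):
    """Signed percentage change from old to new."""
    return (new - old) / old * 100.0


# What a 22nm quad-core SNB would measure with perfect scaling and no design changes.
ideal_area = ideal_shrink_area(SNB_QUAD_AREA_MM2, OLD_NODE_NM, NEW_NODE_NM)

# How the estimated IVB die actually compares with the SNB die.
area_change = percent_change(SNB_QUAD_AREA_MM2, IVB_QUAD_AREA_MM2)

# SNB transistor count implied by the two IVB figures above (not stated in the text).
implied_snb_transistors = IVB_TRANSISTORS / (1.0 + TRANSISTOR_INCREASE)

print(f"Ideal 22nm quad-core SNB area: {ideal_area:.0f} mm^2")
print(f"Estimated IVB vs. SNB die area: {area_change:.0f}%")
print(f"Implied SNB transistor count: {implied_snb_transistors / 1e9:.2f} billion")
```

The gap between the ideal figure and the 162mm^2 estimate is the room taken up by imperfect scaling plus the new additions, mostly the larger IGP.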
To the point, today's CPUs have much more than just CPU cores in them. We could easily have had a hex-core 32nm SNB die at the same die size if the graphics and memory controller were not on-die. We've actually got a pretty good reference point with SNB and Gulftown; accounting for the larger L3 cache and extra QPI link, Gulftown checks in at 240mm^2, though TDP is higher than SNB thanks to the extra cores. The same applies to Ivy Bridge. If Intel took away the graphics, or even kept the same die size as SNB, a hex-core would be more or less a given. Instead, Intel has chosen to boost the graphics and decrease the die size.

Subjectively, this is not a bad decision. Intel needs to increase graphics performance, and will do just that in IVB. Intel's IGP solutions account for over 50% of the PC marketshare, yet the graphics are their Achilles' Heel. All modern laptops have integrated graphics (though many still opt to go discrete-only or use switchable graphics), and having more CPU cores isn't that useful if your system will be severely handicapped by a weak GPU. We've also shown in numerous articles how hex-core scaling over quad-core is largely unnecessary on desktop workloads (more on this below). Increasing the graphics' EU count and complexity while also adding CPU cores would have led to a larger than ideal die, not to mention the increased complexity and cost. Remember, Moore's Law was more an observation of the ideal size/complexity relationship of microprocessors rather than pure transistor count, and smaller die sizes generally improve yields in addition to being less expensive.


Performance
While six cores is obviously 50% more than four cores, performance doesn't increase in proportion to the core count. More cores put out more heat, and hence clock speeds must be lower unless the TDP is increased. Intel couldn't have achieved the 77W TDP at reasonable clock speeds if Ivy Bridge were hex-core. On top of that, there is still plenty of software that is not fully multithreaded or fails to scale linearly with core count, so you would rarely be using all six cores (plus six more virtual cores thanks to Hyper-Threading). More cores will only help if you can actually use them, while higher frequencies universally improve performance (all other things being equal). We can give some clear examples of this with a few graphs from our Sandy Bridge E review.
Photoshop is a prime example of software that has limited multithreading. We used the older CS4 in our tests, but CS5 isn't any better, unfortunately. Photoshop can actively take advantage of four threads, and thus the hex-core i7-3960X isn't really faster than the quad-core i7-2600K. The slight difference is most likely due to the difference in Turbo (3.9GHz vs 3.8GHz) or the quad-channel vs. dual-channel memory configuration. There are also a few peaks where more than four threads are used, which is why the i7-2600K is faster than the i5-2500K thanks to Hyper-Threading, on top of the extra cache and higher Turbo of course.
In general, games are poorly multithreaded. DiRT 3 is an example of a typical game engine, and adding more cores and enabling Hyper-Threading actually hurts performance. There are only a handful of games that benefit from more cores, although there are still obstacles to overcome even then (see below).
Civilization V fits in the handful of games that can scale across multiple cores. However, you will still be bottlenecked by your GPU in GPU bound scenarios (like in the second graph), which makes the usefulness of more cores questionable in this case. It's irrelevant whether you get 60 or 120 FPS in CPU bound scenarios if the real gaming performance is ultimately bound by your GPU speed.
The above graphs are biased in the sense that they are for tests where SNB-E is roughly on par with regular quad-core SNB. However, keep in mind that we are comparing a 130W hex-core and a 95W quad-core; a 77W hex-core part might need lower clock speeds and could perform worse in lightly threaded tasks (depending on the Turbo speeds of course). In general, tasks like video encoding, 3D rendering, and archiving scale well with additional cores, but how many consumers run these tasks on a day-to-day basis? If you know you will be doing a lot of CPU-intensive work that can benefit from additional cores, SNB-E (and later IVB-E) will always be an option--though you'll give up Quick Sync and the integrated graphics in the process. For most consumers, higher frequencies will likely prove far more useful due to the limited multithreading of everyday applications.
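To put rough numbers on the cores vs. frequency trade-off, here is a minimal sketch based on Amdahl's law. This model is our own illustration and not something the benchmarks measure directly. The clock speeds (3.5GHz for a quad-core, 3.0GHz for a hex-core squeezed into the same TDP) are hypothetical, and the model assumes performance scales linearly with clock speed while ignoring Turbo, Hyper-Threading, cache, and memory effects.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(parallel_fraction, cores, clock_ghz):
    """Toy performance score: clock speed times the Amdahl speedup (assumed model)."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


# Hypothetical parts at the same TDP: the hex-core has to give up clock speed.
QUAD = {"cores": 4, "clock_ghz": 3.5}
HEX = {"cores": 6, "clock_ghz": 3.0}

for parallel_fraction in (0.50, 0.75, 0.95):
    quad_score = relative_throughput(parallel_fraction, **QUAD)
    hex_score = relative_throughput(parallel_fraction, **HEX)
    if abs(hex_score - quad_score) < 1e-9:
        winner = "tie"
    elif hex_score > quad_score:
        winner = "hex-core"
    else:
        winner = "quad-core"
    print(f"{parallel_fraction:.0%} parallel: quad {quad_score:.2f}, "
          f"hex {hex_score:.2f} -> {winner}")
```

With these assumed clocks, the hex-core only pulls ahead once more than about three quarters of the work runs in parallel, which is the territory of video encoding and 3D rendering rather than everyday applications.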
There is also the AMD point of view. Bulldozer hasn't exactly been a success story, and there is no real competition in the high-end CPU market because of that. Intel could skip Ivy Bridge altogether and their position at the top of the performance charts would still hold. With no real competition, there's no need to push the performance much higher. Four cores is enough to keep the performance higher than AMD's, and reducing the TDP as a side effect is a big plus, especially when thinking about the future and ARM. As another point of comparison with AMD, look at Llano: it's a quad-core CPU that focuses more on improved graphics. The now rather "old" Lynnfield i5-750 is able to surpass the CPU performance of Llano, but that hasn't stopped plenty of people from picking up Llano as an inexpensive solution that provides all the performance needed for most tasks.


Wrap-Up
When looking at the big picture, there really aren't any compelling reasons why Intel should have gone with hex-core design for Ivy Bridge. Just like the Sandy Bridge vs. Gulftown comparison, IVB vs. SNB-E looks like a good use of market segmentation. Sure, some enthusiasts will argue that having a quad-core CPU is so 2007, but don't let the number of cores fool you. The only thing that 2007 and 2012 quad-cores share is the core count; otherwise they are very different animals (see for example i7-2600K vs Q6600). It also appears that even without additional cores or clock speed improvements, Ivy Bridge will be around 15% faster clock for clock than Sandy Bridge (according to Intel's own tests; a deeper performance analysis will come soon).
Increasing the frequencies and boosting the clock-for-clock performance yields increased performance in every CPU-bound task, and improving the quality of the on-die graphics helps in other areas. In contrast, increasing the core count only helps if the software has proper multithreading and can scale to additional cores--both of which are easier said than done. Given all of the possibilities, it would appear that Intel has done the right thing, and in the process there's no need to try to convince consumers that they need more cores than they actually do.



