Topic: Intel Core i7 Tidbits  (Read 2164 times)
Kay Tiong
Administrator
« on: October 15, 2008, 09:27:19 PM »

Read the article on Intel Core i7 over on Tom's Hardware. Things to consider if you're coding for this architecture:

http://www.tomshardware.com/reviews/Intel-i7-nehalem-cpu,2041.html

1. With SMT, CPU affinity is now important.

"To help solve this puzzle, Intel provides a way of precisely determining the exact topology of the processor (the number of physical and logical processors), and programmers can then use the operating system affinity mechanism to assign each thread to a processor. This kind of thing shouldn’t be a problem for game programmers, who are already in the habit of working that way because of the way the Xenon processor (the one used in the Xbox 360) works."

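A minimal sketch of the affinity step, assuming Linux and Python's os.sched_getaffinity / os.sched_setaffinity (these are not available on Windows or macOS). The helper name pin_to_cpu is made up for illustration, and the sketch does not work out which logical CPUs share a physical core; that still needs the topology enumeration mentioned in the quote.

```python
import os


def pin_to_cpu(cpu_index=None):
    """Pin the caller to one logical CPU.

    Returns the CPU number chosen, or None when the platform does not
    expose the Linux affinity calls. pin_to_cpu is a hypothetical helper,
    not part of any library.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # affinity API not available on this platform

    allowed = sorted(os.sched_getaffinity(0))  # CPUs we may currently run on
    cpu = allowed[0] if cpu_index is None else cpu_index
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {allowed}")

    os.sched_setaffinity(0, {cpu})  # pid 0 means "the caller"
    return cpu


if __name__ == "__main__":
    print("pinned to logical CPU:", pin_to_cpu())
```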
2. Unaligned loads and stores are now optimized.

"The Intel engineers have optimized these accesses to make them faster. First of all, there’s no performance penalty for using the unaligned versions of load/store instructions in cases where the data are aligned in memory. In other cases, Intel has optimized these accesses to reduce the performance hit compared to that of the Core architecture."

3. Use large pages if possible.

"The level 1 data TLB now stores 64 entries for small pages (4K) or 32 for large pages (2M/4M), while the level 1 instruction TLB stores 128 entries for small pages (the same as with Core 2) and seven for large pages."

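To see why the large-page entries matter, here is a back-of-the-envelope sketch: the memory a TLB level can cover is roughly entries times page size. The entry counts are the data TLB figures from the quote; the tlb_reach helper is only for illustration, and any second-level TLB is not modeled.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_size):
    """Bytes of address space covered when every TLB entry is in use."""
    return entries * page_size


# Level 1 data TLB figures quoted for Nehalem.
small_reach = tlb_reach(64, 4 * KIB)     # 64 entries of 4K pages
large_reach_2m = tlb_reach(32, 2 * MIB)  # 32 entries of 2M pages
large_reach_4m = tlb_reach(32, 4 * MIB)  # 32 entries of 4M pages

print(f"4K pages: {small_reach // KIB} KiB")     # 4K pages: 256 KiB
print(f"2M pages: {large_reach_2m // MIB} MiB")  # 2M pages: 64 MiB
print(f"4M pages: {large_reach_4m // MIB} MiB")  # 4M pages: 128 MiB
```

So even with half as many entries, 2M pages cover 256 times more memory before the level 1 data TLB starts missing.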
4. Leave prefetching to the hardware.

"Intel says the problem is solved now, but provides no details on the operation of the new prefetch algorithms; all it says is that it won’t be necessary to disable them for server configurations."

In my opinion, software prefetching is more an art than a science, and hand-tuned prefetches are almost always fragile.

5. Keep unrolled loops small.

"With the Nehalem architecture, Intel has improved the functionality of the Loop Stream Detector. First of all the buffer is larger—it can now store 28 instructions. But what’s more, its position in the pipeline has changed. In Conroe, it was located just after the instruction fetch phase. It’s now located after the decoders; this new position allows a larger part of the pipeline to be disabled. The Nehalem’s Loop Stream Detector no longer stores x86 instructions, but rather µops."
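
A rough sketch of what that means for unrolling, assuming you can estimate the µop count of one loop iteration (for example from a disassembly) and treating the 28-entry figure from the quote as a µop budget. The max_unroll_factor helper and the µop counts in the example are hypothetical, and real Loop Stream Detector eligibility has further conditions that this does not model.

```python
LSD_CAPACITY_UOPS = 28  # buffer size quoted for Nehalem, read here as µops


def max_unroll_factor(body_uops, overhead_uops=2, capacity=LSD_CAPACITY_UOPS):
    """Largest unroll factor whose loop still fits in the given µop budget.

    body_uops:     estimated µops for one copy of the loop body (assumption)
    overhead_uops: estimated µops for the counter update and branch (assumption)
    Returns 0 when even a single copy of the body does not fit.
    """
    if body_uops <= 0:
        raise ValueError("body_uops must be positive")
    if overhead_uops < 0:
        raise ValueError("overhead_uops must not be negative")
    return max((capacity - overhead_uops) // body_uops, 0)


# Example with made-up numbers: a 5-µop body plus 2 µops of loop overhead.
print(max_unroll_factor(5))   # 5, because 5 * 5 + 2 = 27 <= 28
print(max_unroll_factor(30))  # 0, the body alone exceeds the budget
```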
Logged
Sanjay Patel
Administrator
« Reply #1 on: February 13, 2009, 12:16:51 PM »

Here's an Intel Nehalem doc with less marketing fluff than the official product docs on Intel's site. It discusses the Loop Stream Detector (LSD), branch predictor, memory hierarchy, and sync primitive performance:
https://intel.wingateweb.com/SHchina/published/NGMS001/SP08_NGMS001_100r_eng.pdf
Logged
foodype09
« Reply #2 on: December 09, 2009, 03:00:31 PM »

I posted a link to just such an article yesterday, but today it's been pulled due to a request by Intel.
Logged
foodype09
« Reply #3 on: December 13, 2009, 03:27:08 AM »

Originally Posted by alzaeem: You guys are talking about the distant future. AMD's next architecture will probably take 9 to 12 months to be released, if not more, so until then there is a new king in town.

Anyways, AMD are going to cut down their prices so very much on the day Core 2 is released, but it won't be enough unless the FX-62 sells for like 300.

AMD won't likely be reducing their prices so drastically.

Games are GPU limited these days, not CPU.  Gamers will see little improvement.

AMD's cycle is behind Intel's, but you seriously think they'd let Intel have their way for a year?  I know it's PR so take it with a grain of salt, but to
Logged
pheftsype
« Reply #4 on: December 29, 2009, 10:32:33 AM »

You don't need new memory, you just need to change the RAM:CPU ratio so that the RAM runs slower at the same bus speed and thus the same CPU speed.

But if you still have the standard Intel cooler, then the H50 is probably a good idea.


M
Logged