Reposted from: http://www.lighterra.com/papers/modernmicroprocessors/
by Jason Robert Carey Patterson, last updated Aug 2012 (orig Feb 2001)
WARNING: This article is meant to be informal and fun!
Okay, so you're a CS graduate and you did a hardware/assembly course as part of your degree, but perhaps that was a few years ago now and you haven't really kept up with the details of processor designs since then.
In particular, you might not be aware of some key topics that developed rapidly in recent times...
pipelining (superscalar, OoO, VLIW, branch prediction, predication)
multi-core & simultaneous multithreading (SMT, hyper-threading)
SIMD vector instructions (MMX/SSE/AVX, AltiVec)
caches and the memory hierarchy
Fear not! This article will get you up to speed fast. In no time you'll be discussing the finer points of in-order vs out-of-order, hyper-threading, multi-core and cache organization like a pro.
But be prepared – this article is brief and to-the-point. It pulls no punches and the pace is pretty fierce (really). Let's get into it...
More Than Just Megahertz
The first issue that must be cleared up is the difference between clock speed and a processor's performance. They are not the same thing. Look at the results for processors of a few years ago (the late 1990s)...
Processor               SPECint95   SPECfp95
195 MHz MIPS R10000        11.0       17.0
400 MHz Alpha 21164        12.3       17.2
300 MHz UltraSPARC         12.1       15.5
300 MHz Pentium-II         11.6        8.8
300 MHz PowerPC G3         14.8       11.4
135 MHz POWER2              6.2       17.6
A 195 MHz MIPS R10000, a 300 MHz UltraSPARC and a 400 MHz Alpha 21164 were all about the same speed at running most programs, yet they differed by a factor of two in clock speed. A 300 MHz Pentium-II was also about the same speed for many things, yet it was about half that speed for floating-point code such as scientific number crunching. A PowerPC G3 at that same 300 MHz was somewhat faster than the others for normal integer code, but still far slower than the top three for floating-point. At the other extreme, an IBM POWER2 processor at just 135 MHz matched the 400 MHz Alpha 21164 in floating-point speed, yet was only half as fast for normal integer programs.
How can this be?
[b]Obviously, there's more to it than just clock speed – it's all about how much work gets done in each clock cycle[/b]. Which leads to...
(Reposter's note: what matters most is how much work the CPU does in each clock cycle, not merely how high its clock frequency is.)
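To make that concrete, here is a minimal Python sketch that divides each benchmark score from the table above by the clock speed. The scores and clock speeds are the ones listed in the table; the per-100-MHz normalization and the function name are illustrative assumptions, and a SPEC score per MHz is only a rough proxy for work done per cycle.

```python
# Figures copied from the table above: (clock in MHz, SPECint95, SPECfp95).
results = {
    "MIPS R10000": (195, 11.0, 17.0),
    "Alpha 21164": (400, 12.3, 17.2),
    "UltraSPARC":  (300, 12.1, 15.5),
    "Pentium-II":  (300, 11.6, 8.8),
    "PowerPC G3":  (300, 14.8, 11.4),
    "POWER2":      (135, 6.2, 17.6),
}


def per_100mhz(score, mhz):
    """Benchmark score per 100 MHz of clock (hypothetical helper, rough proxy for work per cycle)."""
    return score / mhz * 100


for name, (mhz, spec_int, spec_fp) in results.items():
    print(
        f"{name:12s} {mhz:4d} MHz  "
        f"int per 100 MHz = {per_100mhz(spec_int, mhz):5.2f}  "
        f"fp per 100 MHz = {per_100mhz(spec_fp, mhz):5.2f}"
    )
```

Normalized this way, the 135 MHz POWER2 does roughly three times as much floating-point work per clock as the 400 MHz Alpha 21164, which is how it reaches a similar SPECfp95 score at about a third of the clock speed.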
Pipelining & Instruction-Level Parallelism
Instructions are executed one after the other inside the processor, right? Well, that makes it easy to understand, but that's not really what happens. In fact, that hasn't happened since the middle of the 1980s. Instead, several instructions are all partially executing at the same time.
Consider how an instruction is executed – first it is fetched, then decoded, then executed by the appropriate functional unit, and finally the result is written into place. With this scheme, a simple processor might take 4 cycles per instruction (CPI = 4)...
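The arithmetic behind CPI = 4 can be sketched in a few lines of Python. This is a simplified model, not a description of any real processor: it assumes exactly one cycle per stage, and the overlapped case assumes an ideal pipeline with no stalls, where a new instruction enters every cycle once the first has been fetched. The function names are hypothetical.

```python
# The four steps named in the text, assumed to take one clock cycle each.
STAGES = ["fetch", "decode", "execute", "writeback"]


def sequential_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when each instruction finishes all stages before the next starts."""
    return n_instructions * n_stages


def overlapped_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when instructions overlap, one new instruction entering per cycle.

    Idealized assumption: no stalls, so the first instruction takes n_stages
    cycles and each later instruction completes one cycle after the previous one.
    """
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def cpi(cycles, n_instructions):
    """Average cycles per instruction."""
    return cycles / n_instructions


n = 100
print("sequential:", sequential_cycles(n), "cycles, CPI =", cpi(sequential_cycles(n), n))
print("overlapped:", overlapped_cycles(n), "cycles, CPI =", cpi(overlapped_cycles(n), n))
```

For 100 instructions the sequential scheme takes 400 cycles (CPI = 4), while the idealized overlapped scheme takes 103 cycles, so its CPI approaches 1 as the instruction count grows.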