Computer Organization and Design

Publication date: 2010-4  Publisher: China Machine Press  Authors: (US) Patterson / Hennessy  Pages: 689

Preface

We believe that learning in computer science and engineering should reflect the current state of the field, as well as introduce the principles that are shaping computing. We also feel that readers in every specialty of computing need to appreciate the organizational paradigms that determine the capabilities, performance, and, ultimately, the success of computer systems.

Modern computer technology requires professionals of every computing specialty to understand both hardware and software. The interaction between hardware and software at a variety of levels also offers a framework for understanding the fundamentals of computing. Whether your primary interest is hardware or software, computer science or electrical engineering, the central ideas in computer organization and design are the same. Thus, our emphasis in this book is to show the relationship between hardware and software and to focus on the concepts that are the basis for current computers.

The recent switch from uniprocessor to multicore microprocessors confirmed the soundness of this perspective, given since the first edition. While programmers could once ignore that advice and rely on computer architects, compiler writers, and silicon engineers to make their programs run faster without change, that era is now over. For programs to run faster, they must become parallel. While the goal of many researchers is to make it possible for programmers to be unaware of the underlying parallel nature of the hardware they are programming, it will take many years to realize this vision. Our view is that for at least the next decade, most programmers are going to have to understand the hardware/software interface if they want programs to run efficiently on parallel computers.

The audience for this book includes those with little experience in assembly language or logic design who need to understand basic computer organization, as well as readers with backgrounds in assembly language and/or logic design who want to learn how to design a computer or understand how a system works and why it performs as it does.

Content Summary

This best-selling computer organization text has been thoroughly updated to address the revolutionary change now taking place in computer architecture: the move from uniprocessors to multicore microprocessors. In addition, this ARM edition was published to underscore the importance of embedded systems to the computing industry across Asia; it uses the ARM processor to discuss a real computer's instruction set and arithmetic, since ARM is the most popular instruction set architecture for embedded devices, of which roughly 4 billion are sold worldwide each year. As in earlier editions, a MIPS processor is used to present the fundamentals of computer hardware technology, pipelining, memory hierarchies, and I/O, and the book also includes an introduction to the x86 architecture.

Highlights of this edition:

·Uses the ARMv6 (ARM11 family) as the primary architecture to present the fundamentals of instruction sets and computer arithmetic.
·Covers the revolutionary shift from sequential to parallel computing, with a new chapter on parallelism and sections in every chapter highlighting parallel hardware and software topics.
·Adds a new appendix, written by NVIDIA's chief scientist and director of architecture, on the emergence and importance of the modern GPU: the first in-depth description of this highly parallel, multithreaded, multicore processor optimized for visual computing.
·Describes the Roofline model, a unique method for measuring multicore performance, with bundled benchmarks, and uses it to analyze the AMD Opteron X4, Intel Xeon 5000, Sun UltraSPARC T2, and IBM Cell.
·Includes new coverage of flash memory and virtual machines.
·Provides a wealth of thought-provoking exercises, more than 200 pages in all.
·Uses the AMD Opteron X4 and Intel Nehalem as running examples throughout the book.
·Updates all processor performance examples with the SPEC CPU2006 suite.

About the Authors

David A. Patterson is a professor of computer science at the University of California, Berkeley, a member of the US National Academy of Engineering, and a Fellow of both the IEEE and the ACM. He received the IEEE James H. Mulligan, Jr. Education Medal for his inspiring approach to teaching, the 1995 IEEE Technical Achievement Award for his contributions to RISC technology, and the 1999 IEEE Reynold B. Johnson Information Storage Award for his work on RAID. In 2000 he shared the John von Neumann Medal with John L. Hennessy.

John L. Hennessy is the president of Stanford University, a Fellow of the IEEE and the ACM, and a member of the US National Academy of Engineering and the American Academy of Arts and Sciences. Professor Hennessy received the 2001 Eckert-Mauchly Award for his outstanding contributions to RISC technology and is also the recipient of the 2001 Seymour Cray Computer Engineering Award. He shared the 2000 John von Neumann Medal with David A. Patterson.

Table of Contents

Preface xv

CHAPTERS

Computer Abstractions and Technology 2
1.1 Introduction 3
1.2 Below Your Program 10
1.3 Under the Covers 13
1.4 Performance 26
1.5 The Power Wall 39
1.6 The Sea Change: The Switch from Uniprocessors to Multiprocessors 41
1.7 Real Stuff: Manufacturing and Benchmarking the AMD Opteron X4 44
1.8 Fallacies and Pitfalls 51
1.9 Concluding Remarks 54
1.10 Historical Perspective and Further Reading 55
1.11 Exercises 56

Instructions: Language of the Computer 74
2.1 Introduction 76
2.2 Operations of the Computer Hardware 77
2.3 Operands of the Computer Hardware 80
2.4 Signed and Unsigned Numbers 86
2.5 Representing Instructions in the Computer 93
2.6 Logical Operations 100
2.7 Instructions for Making Decisions 104
2.8 Supporting Procedures in Computer Hardware 113
2.9 Communicating with People 122
2.10 ARM Addressing for 32-Bit Immediates and More Complex Addressing Modes 127
2.11 Parallelism and Instructions: Synchronization 133
2.12 Translating and Starting a Program 135
2.13 A C Sort Example to Put It All Together 143
2.14 Arrays versus Pointers 152
2.15 Advanced Material: Compiling C and Interpreting Java 156
2.16 Real Stuff: MIPS Instructions 156
2.17 Real Stuff: x86 Instructions 161
2.18 Fallacies and Pitfalls 170
2.19 Concluding Remarks 171
2.20 Historical Perspective and Further Reading 174
2.21 Exercises 174

Arithmetic for Computers 214
3.1 Introduction 216
3.2 Addition and Subtraction 216
3.3 Multiplication 220
3.4 Division 226
3.5 Floating Point 232
3.6 Parallelism and Computer Arithmetic: Associativity 258
3.7 Real Stuff: Floating Point in the x86 259
3.8 Fallacies and Pitfalls 262
3.9 Concluding Remarks 265
3.10 Historical Perspective and Further Reading 268
3.11 Exercises 269

The Processor 284
4.1 Introduction 286
4.2 Logic Design Conventions 289
4.3 Building a Datapath 293
4.4 A Simple Implementation Scheme 302
4.5 An Overview of Pipelining 316
4.6 Pipelined Datapath and Control 330
4.7 Data Hazards: Forwarding versus Stalling 349
4.8 Control Hazards 361
4.9 Exceptions 370
4.10 Parallelism and Advanced Instruction-Level Parallelism 377
4.11 Real Stuff: the AMD Opteron X4 (Barcelona) Pipeline 390
4.12 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations 392
4.13 Fallacies and Pitfalls 393
4.14 Concluding Remarks 394
4.15 Historical Perspective and Further Reading 395
4.16 Exercises 395

Large and Fast: Exploiting Memory Hierarchy 436
5.1 Introduction 438
5.2 The Basics of Caches 443
5.3 Measuring and Improving Cache Performance 461
5.4 Virtual Memory 478
5.5 A Common Framework for Memory Hierarchies 504
5.6 Virtual Machines 511
5.7 Using a Finite-State Machine to Control a Simple Cache 515
5.8 Parallelism and Memory Hierarchies: Cache Coherence 520
5.9 Advanced Material: Implementing Cache Controllers 524
5.10 Real Stuff: the AMD Opteron X4 (Barcelona) and Intel Nehalem Memory Hierarchies 525
5.11 Fallacies and Pitfalls 529
5.12 Concluding Remarks 533
5.13 Historical Perspective and Further Reading 534
5.14 Exercises 534

Storage and Other I/O Topics 554
6.1 Introduction 556
6.2 Dependability, Reliability, and Availability 559
6.3 Disk Storage 561
6.4 Flash Storage 566
6.5 Connecting Processors, Memory, and I/O Devices 568
6.6 Interfacing I/O Devices to the Processor, Memory, and Operating System 572
6.7 I/O Performance Measures: Examples from Disk and File Systems 582
6.8 Designing an I/O System 584
6.9 Parallelism and I/O: Redundant Arrays of Inexpensive Disks 585
6.10 Real Stuff: Sun Fire x4150 Server 592
6.11 Advanced Topics: Networks 598
6.12 Fallacies and Pitfalls 599
6.13 Concluding Remarks 603
6.14 Historical Perspective and Further Reading 604
6.15 Exercises 605

Multicores, Multiprocessors, and Clusters 616
7.1 Introduction 618
7.2 The Difficulty of Creating Parallel Processing Programs 620
7.3 Shared Memory Multiprocessors 624
7.4 Clusters and Other Message-Passing Multiprocessors 627
7.5 Hardware Multithreading 631
7.6 SISD, MIMD, SIMD, SPMD, and Vector 634
7.7 Introduction to Graphics Processing Units 640
7.8 Introduction to Multiprocessor Network Topologies 646
7.9 Multiprocessor Benchmarks 650
7.10 Roofline: A Simple Performance Model 653
7.11 Real Stuff: Benchmarking Four Multicores Using the Roofline Model 661
7.12 Fallacies and Pitfalls 670
7.13 Concluding Remarks 672
7.14 Historical Perspective and Further Reading 674
7.15 Exercises 674

Index I-1

CD-ROM CONTENT

Graphics and Computing GPUs A-2
A.1 Introduction A-3
A.2 GPU System Architectures A-7
A.3 Scalable Parallelism – Programming GPUs A-12
A.4 Multithreaded Multiprocessor Architecture A-25
A.5 Parallel Memory System A-36
A.6 Floating Point Arithmetic A-41
A.7 Real Stuff: The NVIDIA GeForce 8800 A-46
A.8 Real Stuff: Mapping Applications to GPUs A-55
A.9 Fallacies and Pitfalls A-72
A.10 Concluding Remarks A-76
A.11 Historical Perspective and Further Reading A-77

ARM and Thumb Assembler Instructions B1-2
B1.1 Using This Appendix B1-3
B1.2 Syntax B1-4
B1.3 Alphabetical List of ARM and Thumb Instructions B1-8
B1.4 ARM Assembler Quick Reference B1-49
B1.5 GNU Assembler Quick Reference B1-60

ARM and Thumb Instruction Encodings B2-2
B2.1 ARM Instruction Set Encodings B2-3
B2.2 Thumb Instruction Set Encodings B2-9
B2.3 Program Status Registers B2-11

Instruction Cycle Timings B3-2
B3.1 Using the Instruction Set Cycle Timing Tables B3-3
B3.2 ARM7TDMI Instruction Cycle Timings B3-5
B3.3 ARM9TDMI Instruction Cycle Timings B3-6
B3.4 StrongARM1 Instruction Cycle Timings B3-8
B3.5 ARM9E Instruction Cycle Timings B3-9
B3.6 ARM10E Instruction Cycle Timings B3-11
B3.7 Intel XScale Instruction Cycle Timings B3-12
B3.8 ARM11 Cycle Timings B3-14

C The Basics of Logic Design C-2
C.1 Introduction C-3
C.2 Gates, Truth Tables, and Logic Equations C-4
C.3 Combinational Logic C-9
C.4 Using a Hardware Description Language C-20
C.5 Constructing a Basic Arithmetic Logic Unit C-26
C.6 Faster Addition: Carry Lookahead C-38
C.7 Clocks C-48
C.8 Memory Elements: Flip-Flops, Latches, and Registers C-50
C.9 Memory Elements: SRAMs and DRAMs C-58
C.10 Finite-State Machines C-67
C.11 Timing Methodologies C-72
C.12 Field Programmable Devices C-78
C.13 Concluding Remarks C-79
C.14 Exercises C-80

D Mapping Control to Hardware D-2
D.1 Introduction D-3
D.2 Implementing Combinational Control Units D-4
D.3 Implementing Finite-State Machine Control D-8
D.4 Implementing the Next-State Function with a Sequencer D-22
D.5 Translating a Microprogram to Hardware D-28
D.6 Concluding Remarks D-32
D.7 Exercises D-33

ADVANCED CONTENT
Section 2.15 Compiling C and Interpreting Java
Section 4.12 An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations
Section 5.9 Implementing Cache Controllers
Section 6.11 Networks

HISTORICAL PERSPECTIVES & FURTHER READING
Chapter 1 Computer Abstractions and Technology: Section 1.10
Chapter 2 Instructions: Language of the Computer: Section 2.20
Chapter 3 Arithmetic for Computers: Section 3.10
Chapter 4 The Processor: Section 4.15
Chapter 5 Large and Fast: Exploiting Memory Hierarchy: Section 5.13
Chapter 6 Storage and Other I/O Topics: Section 6.14
Chapter 7 Multicores, Multiprocessors, and Clusters: Section 7.14
Appendix A Graphics and Computing GPUs: Section A.11

TUTORIALS
VHDL
Verilog

SOFTWARE
Xilinx FPGA Design, Simulation and Synthesis Software
QEMU http://www.nongnu.org/qemu/about.html

Glossary G-1
Index I-1
Further Reading FR-1

Excerpt

Diameters of hard disks vary by more than a factor of 3 today, from 1 inch to 3.5 inches, and have been shrunk over the years to fit into new products; workstation servers, personal computers, laptops, palmtops, and digital cameras have all inspired new disk form factors. Traditionally, the widest disks have the highest performance and the smallest disks have the lowest unit cost. The best cost per gigabyte varies. Although most hard drives appear inside computers, as in Figure 1.7, hard drives can also be attached using external interfaces such as universal serial bus (USB).

The use of mechanical components means that access times for magnetic disks are much slower than for DRAMs: disks typically take 5-20 milliseconds, while DRAMs take 50-70 nanoseconds, making DRAMs about 100,000 times faster. Yet disks have much lower costs than DRAM for the same storage capacity, because the production costs for a given amount of disk storage are lower than for the same amount of integrated circuit. In 2008, the cost per gigabyte of disk is 30 to 100 times less expensive than DRAM.

Thus, there are three primary differences between magnetic disks and main memory: disks are nonvolatile because they are magnetic; they have a slower access time because they are mechanical devices; and they are cheaper per gigabyte because they have very high storage capacity at a modest cost.

Many have tried to invent a technology cheaper than DRAM but faster than disk to fill that gap, but many have failed. Challengers have never had a product to market at the right time. By the time a new product would ship, DRAMs and disks had continued to make rapid advances, costs had dropped accordingly, and the challenging product was immediately obsolete.

Flash memory, however, is a serious challenger. This semiconductor memory is nonvolatile like disks and has about the same bandwidth, but latency is 100 to 1000 times faster than disk. Flash is popular in cameras and portable music players because it comes in much smaller capacities, it is more rugged, and it is more power efficient than disks, despite the cost per gigabyte in 2008 being about 6 to 10 times higher than disk. Unlike disks and DRAM, flash memory bits wear out after 100,000 to 1,000,000 writes. Thus, file systems must keep track of the number of writes and have a strategy to avoid wearing out storage, such as by moving popular data. Chapter 6 describes flash in more detail.

Although hard drives are not removable, there are several removable storage technologies in use, including the following:

·Optical disks, including both compact disks (CDs) and digital video disks (DVDs), constitute the most common form of removable storage. The Blu-Ray (BD) optical disk standard is the heir apparent to DVD.
·Flash-based removable memory cards typically attach to a USB connection and are often used to transfer files.
·Magnetic tape provides only slow serial access and has been used to back up disks, a role now often replaced by duplicate hard drives.
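The excerpt's "about 100,000 times faster" claim follows directly from the latency figures it quotes. A quick back-of-the-envelope calculation (a sketch using the excerpt's 2008-era numbers, not code from the book) shows the ratio range:

```python
# Latency figures quoted in the excerpt (2008-era technology).
disk_ms = (5.0, 20.0)   # typical magnetic-disk access time, milliseconds
dram_ns = (50.0, 70.0)  # typical DRAM access time, nanoseconds

# Convert disk times to nanoseconds (1 ms = 1e6 ns) and compute the
# speedup range: best case pairs the fastest disk with the slowest DRAM,
# worst case pairs the slowest disk with the fastest DRAM.
best = (disk_ms[0] * 1e6) / dram_ns[1]   # ~71,000x
worst = (disk_ms[1] * 1e6) / dram_ns[0]  # 400,000x

print(f"DRAM is roughly {best:,.0f}x to {worst:,.0f}x faster than disk")
```

The geometric middle of that range sits near the "about 100,000 times" figure the excerpt cites.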

Media Reviews

"What is special about this edition is that it uses ARM in place of the MIPS processor of earlier editions as the core processor for teaching the basic principles of computer design, which gives the book another layer of meaning. As the mainstream processor of the embedded field, ARM is enormously important to embedded computing. The book fills a gap in the existing curriculum by teaching the fundamentals of computer organization specifically to students of embedded systems. As in previous editions, it still focuses on the computer hardware/software interface, and it skillfully connects that interface to the basics of embedded system design."
— Ranjani Parthasarathi, Anna University, Chennai, India

Editor's Recommendation

Computer Organization and Design: The Hardware/Software Interface (English, 4th Edition, ARM Edition) is part of the Classic Original Books series.




User Reviews (18 total)

 
 

  •   In short, I bought the wrong one. This is the ARM edition; the old edition was the MIPS version, but the MIPS version is no longer sold in the Asian region, so look carefully before you buy!
  •   It only has A.11; A.1-A.10 are all missing, and Appendix A was the reason I bought it.
  •   The book itself looks good, though buying the English edition was a bit of a mistake for me, since my English isn't great. Still, reading the original is worthwhile...
  •   Well written, but the type is a bit small and tiring to read.
  •   Bought it on sale for 47.5 yuan. The paper is a bit dark, and the grayish tone takes some getting used to; otherwise it's fine, and I plan to study it properly.
  •   Excellent quality, and carefully packaged.
  •   Not bad, and good English practice on the side.
  •   The book itself is a good one, but the type is too small; an electronic edition would be great.
  •   The appendix only has A.11, not A.1-A.10. The original book has more than 900 pages, while this one has only 600, and the print quality is very poor. China Machine Press is notorious; buyers will regret it! A real waste of a classic!
  •   This is the best edition. 1) The CD companion for the MIPS edition is available online; comparing the two, the ARM edition has a few more appendices. 2) The revised MIPS edition is also online; mainly Chapter 2 differs, the rest is the same. 3) The GPU appendix really isn't on the CD, but the Hua Zhang website offers it for download. I don't know whether a reprint added it to the CD; if not, that deserves criticism and is genuinely annoying... Read more
  •   The CD-ROM CONTENT does not include all of Appendix A!
  •   I had wanted to buy the MIPS edition.
  •   A classic textbook written by masters.
  •   A classic textbook on computer organization.
  •   These are all books for learning the fundamentals; worth a try if you're interested.
  •   One of the classic good books.
  •   A classic textbook on computer architecture.
 

