Computer Architecture and Program Optimization
Introduction to Intel® 64 Architectures Optimization

Main Purpose
• Introduction to processor architecture
• Introduction to SIMD instructions (SSE & AVX)
• Guidelines for program optimization
• Optimization for Intel 64
• Example (convolution code)

Introduction to Processor
• Brief Description of Computer System
• The Memory Hierarchy
• Intel® 64 and IA32 Processor
• AMD64 Processor
• Latency and [...]

Computer System
[Figure: block diagram of a computer system. Labels: CPU (ALU, registers), system bus, I/O bridge, memory bus, memory interface, RAM, I/O bus, USB controller, graphics adapter, disk, network adapter, expansion slots for other devices such as network adapters.]

Where Are the Caches?

The Memory Hierarchy

What About a Real Processor?

Intel® Sandy Bridge Pipeline Functionality Overview
Copied from the Intel® 64 and IA-32 Architectures Optimization Reference Manual [248966-24]

Intel® Sandy Bridge Overview
• The Front End
  ◦ Decode Pipeline
  ◦ Decoded ICache
  ◦ Branch Prediction Unit (BPU)
  ◦ Micro-op queue
• The Out-of-Order Engine
• The Execution Core
• Cache Hierarchy

Intel® Sandy Bridge Overview
• The Front End
• The Out-of-Order Engine
  ◦ Renamer
  ◦ Dependency Breaking Idioms
• The Execution Core
• Cache Hierarchy

Execution Core
Scheduler (figure labels: In order, Out of order)
• Port 0: ALU, V-Mul, V-Shuffle, F-DIV, 256-Mul
• Port 1: ALU, V-Add, V-Shuffle, 256-Add
• Port 5: ALU, JMP, 256-Shuf, 256-Bool
• Port 2: Load, St Add
• Port 3: Load, St Add
• Port 4: St Data
Memory Control, L1 Data Cache: 48 bytes/cycle

Intel® Sandy Bridge
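The execution-core slide lists separate multiply and add units on different ports, and the outline names a convolution code example. The slides' actual convolution code is not reproduced in this text, so the following is only a minimal sketch in portable C under stated assumptions. The function names `conv1d_ref` and `conv1d_unroll4` are hypothetical. It contrasts a baseline 1D convolution with a version that computes four outputs per iteration, giving the out-of-order engine four independent accumulation chains instead of one.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline "valid" 1D convolution (hypothetical helper, not from the slides).
 * Produces n - klen + 1 outputs; each output is one serial chain of adds. */
void conv1d_ref(const float *in, size_t n,
                const float *kernel, size_t klen, float *out)
{
    assert(klen > 0 && n >= klen);
    size_t out_len = n - klen + 1;
    for (size_t i = 0; i < out_len; ++i) {
        float acc = 0.0f;
        for (size_t k = 0; k < klen; ++k)
            acc += in[i + k] * kernel[klen - 1 - k];
        out[i] = acc;
    }
}

/* Same arithmetic, four outputs per outer iteration.
 * a0..a3 do not depend on each other, so their multiplies and adds
 * can overlap in the pipeline. The summation order of each output
 * matches conv1d_ref. */
void conv1d_unroll4(const float *in, size_t n,
                    const float *kernel, size_t klen, float *out)
{
    assert(klen > 0 && n >= klen);
    size_t out_len = n - klen + 1;
    size_t i = 0;

    for (; i + 4 <= out_len; i += 4) {
        float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
        for (size_t k = 0; k < klen; ++k) {
            float w = kernel[klen - 1 - k];  /* loaded once, used four times */
            a0 += in[i + k]     * w;
            a1 += in[i + k + 1] * w;
            a2 += in[i + k + 2] * w;
            a3 += in[i + k + 3] * w;
        }
        out[i]     = a0;
        out[i + 1] = a1;
        out[i + 2] = a2;
        out[i + 3] = a3;
    }

    /* Remainder: fewer than four outputs left. */
    for (; i < out_len; ++i) {
        float acc = 0.0f;
        for (size_t k = 0; k < klen; ++k)
            acc += in[i + k] * kernel[klen - 1 - k];
        out[i] = acc;
    }
}
```

The four-output layout is also the shape a 128-bit SSE version would take, with `a0..a3` packed into one vector register. Whether it is actually faster depends on the compiler and the kernel length, so it should be measured rather than assumed.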