Wednesday, 15 January 2014

OAK lecture material

OAK material

What is Superscalar?
Common instructions (arithmetic, load/store, conditional branch) can be initiated and executed independently
Equally applicable to RISC & CISC
In practice usually RISC

Why Superscalar?
Most operations are on scalar quantities (see RISC notes)
Improve these operations to get an overall improvement

General Superscalar Organization


Superpipelined
-Many pipeline stages need less than half a clock cycle
-Double internal clock speed gets two tasks per external clock cycle
-Superscalar allows parallel fetch execute

Superscalar v Superpipeline
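
A rough way to compare the two approaches is to count cycles for a short, hazard-free instruction stream. The sketch below is only an idealized model: the function names, the stage count and the assumption that nothing ever stalls are illustrative and not taken from these notes. Time is measured in base (external) clock cycles.

```python
from math import ceil

def base_cycles(n_instr, stages):
    # Plain pipeline: the first instruction needs `stages` cycles,
    # every later one completes one cycle after its predecessor.
    return stages + (n_instr - 1)

def superscalar_cycles(n_instr, stages, degree):
    # `degree` instructions enter the pipeline together each cycle.
    return stages + ceil(n_instr / degree) - 1

def superpipelined_cycles(n_instr, stages, degree):
    # One instruction enters every 1/degree of a base cycle.
    return stages + (n_instr - 1) / degree

for name, cycles in [
    ("base", base_cycles(6, 4)),
    ("superscalar x2", superscalar_cycles(6, 4, 2)),
    ("superpipelined x2", superpipelined_cycles(6, 4, 2)),
]:
    print(name, cycles)
```

With 6 instructions and 4 stages, this model gives 9 cycles for the base pipeline, 6 for the degree-2 superscalar and 6.5 for the degree-2 superpipeline. Under these assumptions the superpipeline falls slightly behind because its instructions start half a cycle apart rather than in pairs.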

Limitations
-Instruction level parallelism
-Compiler based optimisation
-Hardware techniques
-Limited by
  --True data dependency
  --Procedural dependency
  --Resource conflicts
  --Output dependency
  --Antidependency
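
The three register-based dependencies in the list above can be detected by comparing what each instruction reads and writes. This is a minimal sketch; the `Instr` type, the register names and the RAW/WAR/WAW labels are assumptions made for illustration. Procedural dependencies and resource conflicts are not modelled here.

```python
from typing import NamedTuple, Tuple

class Instr(NamedTuple):
    dest: str               # register written
    srcs: Tuple[str, ...]   # registers read

def dependencies(first, second):
    """Dependencies of `second` on an earlier instruction `first`."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")  # true data dependency: read after write
    if second.dest in first.srcs:
        found.add("WAR")  # antidependency: write after read
    if first.dest == second.dest:
        found.add("WAW")  # output dependency: write after write
    return found

i1 = Instr("r3", ("r3", "r5"))  # r3 = r3 op r5
i2 = Instr("r4", ("r3",))       # r4 = r3 + 1
i3 = Instr("r3", ("r5",))       # r3 = r5 + 1

print(dependencies(i1, i2))  # i2 needs the r3 produced by i1
print(dependencies(i2, i3))  # i3 overwrites r3, which i2 still reads
print(dependencies(i1, i3))  # both write r3, and i1 reads r3
```

An empty result means the pair could, as far as registers are concerned, be issued in the same cycle.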



IA64

Background to IA-64
Pentium 4 appeared, at the time, to be the last in the x86 line
Intel & Hewlett-Packard (HP) jointly developed
New architecture
--64 bit architecture
--Not extension of x86
--Not adaptation of HP 64bit RISC architecture
Exploits vast circuitry and high speeds
Systematic use of parallelism
Departure from superscalar

Motivation
Instruction level parallelism
--Implicit in machine instruction
--Not determined at run time by processor
Long or very long instruction words (LIW/VLIW)
Branch predication (not the same as branch prediction)
Speculative loading
Intel & HP call this Explicit Parallel Instruction Computing (EPIC)
IA-64 is an instruction set architecture intended for implementation on EPIC
Itanium is first Intel product

Superscalar v IA-64

Why a New Architecture?
Not hardware compatible with x86
Tens of millions of transistors are now available on a chip
Could build bigger caches
- Diminishing returns
Add more execution units
- Increases superscaling
- "Complexity wall"
- More units make the processor "wider"
- More logic needed to orchestrate them
- Improved branch prediction required
- Longer pipelines required
- Greater penalty for misprediction
- Larger number of renaming registers required
- At most six instructions per cycle


Explicit Parallelism
Instruction parallelism is scheduled at compile time
-Included with the machine instruction
Processor uses this info to perform parallel execution
Requires less complex circuitry
Compiler has much more time to determine possible parallel operations
Compiler sees the whole program

General Organization


Key Features
#Large number of registers
-IA-64 instruction format assumes 256
 --128 * 64 bit integer, logical & general purpose
 --128 * 82 bit floating point and graphics
-64 * 1 bit predicated execution registers (see later)
-To support a high degree of parallelism
#Multiple execution units
-Expected to be 8 or more
-Depends on number of transistors available
-Execution of parallel instructions depends on hardware available
-8 parallel instructions may be split into two lots of four if only four execution units are available
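
The last point can be pictured as cutting a group of independent instructions into lots no larger than the number of execution units. This is a simplified sketch: `issue_lots` is a hypothetical helper, and it assumes every unit can run every instruction, which ignores the different unit types of a real IA-64 processor.

```python
def issue_lots(instructions, units):
    """Split independent instructions into lots of at most `units` each."""
    if units < 1:
        raise ValueError("need at least one execution unit")
    return [instructions[i:i + units]
            for i in range(0, len(instructions), units)]

group = [f"i{n}" for n in range(8)]   # 8 instructions marked as parallel
for cycle, lot in enumerate(issue_lots(group, 4)):
    print(cycle, lot)
```

With 8 units the same group would issue as a single lot, so the compiled code does not need to know how many units the hardware has.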

IA-64 Execution Units
I-Unit
--Integer arithmetic
--Shift and add
--Logical
--Compare
--Integer multimedia ops
M-Unit
--Load and store
  ---Between register and memory
--Some integer ALU
B-Unit
--Branch instructions
F-Unit
--Floating point instructions

Instruction Format Diagram


Instruction Format
128 bit bundle
--Holds three instructions (syllables) plus template
--Can fetch one or more bundles at a time
--Template contains info on which instructions can be executed in parallel
----Not confined to single bundle
----e.g. a stream of 8 instructions may be executed in parallel
----Compiler will have re-ordered instructions to form contiguous bundles
----Can mix dependent and independent instructions in same bundle
--Instruction is 41 bit long
----More registers than usual RISC
----Predicated execution registers (see later)
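
The bundle layout can be checked with a little bit arithmetic: three 41-bit instructions take 123 bits, which leaves 5 bits of a 128-bit bundle for the template. The sketch below packs and unpacks a bundle as a Python integer. The placement of the template in the lowest 5 bits, followed by slots 0 to 2, is an assumption based on the Itanium layout and is not stated in these notes; the function names are illustrative.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1

def pack_bundle(template, slots):
    """Build a 128-bit bundle from a template and three instruction slots."""
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three instructions")
    value = template & TEMPLATE_MASK
    for i, slot in enumerate(slots):
        value |= (slot & SLOT_MASK) << (TEMPLATE_BITS + SLOT_BITS * i)
    return value

def unpack_bundle(bundle):
    """Split a 128-bit bundle into (template, (slot0, slot1, slot2))."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("bundle must fit in 128 bits")
    template = bundle & TEMPLATE_MASK
    slots = tuple((bundle >> (TEMPLATE_BITS + SLOT_BITS * i)) & SLOT_MASK
                  for i in range(3))
    return template, slots

bundle = pack_bundle(0x10, (1, 2, 3))
print(unpack_bundle(bundle))
```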






CONTROL UNIT OPERATION 


Micro-Operation
A computer executes a program
Fetch/execute cycle
Each cycle has a number of steps
--see pipelining
Called micro-operations
Each step does very little
Atomic operation of CPU

Constituent Elements of Program Execution


Fetch - 4 Registers
#Memory Address Register (MAR)
--Connected to address bus
--Specifies address for read or write op
#Memory Buffer Register (MBR)
--Connected to data bus
--Holds data to write or last data read
#Program Counter (PC)
--Holds address of next instruction to be fetched
#Instruction Register (IR)
--Holds last instruction fetched


Fetch Sequence
Address of next instruction is in PC
Address (MAR) is placed on address bus
Control unit issues READ command
Result (data from memory) appears on data bus
Data from data bus is copied into MBR
PC is incremented by 1 (in parallel with the data fetch from memory)
Data (instruction) is moved from MBR to IR
MBR is now free for further data fetches
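
The fetch sequence can be traced with a small register-level model. This is only a sketch: the `CPU` class, the word-addressed list used as memory and the grouping into three time steps are assumptions made for illustration.

```python
class CPU:
    def __init__(self, memory):
        self.memory = memory  # list of words, index = address
        self.pc = 0           # Program Counter
        self.mar = 0          # Memory Address Register
        self.mbr = 0          # Memory Buffer Register
        self.ir = 0           # Instruction Register

    def fetch(self):
        # t1: address of the next instruction goes from PC to MAR
        self.mar = self.pc
        # t2: memory read into MBR; PC is incremented in the same step
        self.mbr = self.memory[self.mar]
        self.pc += 1
        # t3: the fetched instruction moves from MBR to IR
        self.ir = self.mbr
        return self.ir

cpu = CPU([0x1A, 0x2B, 0x3C])
first = cpu.fetch()
second = cpu.fetch()
print(hex(first), hex(second), cpu.pc)
```

After two fetches the PC holds 2 and the IR holds the second word, while the MBR could already be reused for a data access.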


Flowchart for Instruction Cycle




Functional Requirements
Define basic elements of processor
Describe micro-operations processor performs
Determine functions control unit must perform


Basic Elements of Processor
ALU
Registers
Internal data paths
External data paths
Control Unit


Types of Micro-operation
Transfer data between registers
Transfer data from register to external
Transfer data from external to register
Perform arithmetic or logical ops


Functions of Control Unit
Sequencing
--Causing the CPU to step through a series of micro-operations
Execution
--Causing the performance of each micro-op
This is done using Control Signals


Control Signals - input
Clock
--One micro-instruction (or set of parallel micro-instructions) per clock cycle
Instruction register
--Op-code for current instruction
--Determines which micro-instructions are performed
Flags
--State of CPU
--Results of previous operations
From control bus
--Interrupts
--Acknowledgements


Model of Control Unit

Control Signals - output
Within CPU
--Cause data movement
--Activate specific functions
Via control bus
--To memory
--To I/O modules


Example Control Signal Sequence - Fetch
MAR <- (PC)
--Control unit activates signal to open gates between PC and MAR
MBR <- (memory)
--Open gates between MAR and address bus
--Memory read control signal
--Open gates between data bus and MBR


Data Paths and Control Signals

Internal Organization
Usually a single internal bus
Gates control movement of data onto and off the bus
Control signals control data transfer to and from external systems bus
Temporary registers needed for proper operation of ALU

Intel 8085 CPU Block Diagram



Intel 8085 Pin Configuration


Intel 8085 OUT Instruction Timing Diagram





Problems With Hard Wired Designs
Complex sequencing & micro-operation logic
Difficult to design and test
Inflexible design
Difficult to add new instructions





Micro-programmed Control

Control Unit Organization


Micro-programmed Control
Use sequences of instructions (see earlier notes) to control complex operations
Called micro-programming or firmware

Implementation (1)
All the control unit does is generate a set of control signals
Each control signal is on or off
Represent each control signal by a bit
Have a control word for each micro-operation
Have a sequence of control words for each machine code instruction
Add an address to specify the next micro-instruction, depending on conditions
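
The idea of one bit per control signal, plus a next-address field, can be sketched as follows. The signal names, their bit positions and the three-step fetch routine are hypothetical and chosen only to illustrate the scheme.

```python
from enum import IntFlag

class Signal(IntFlag):
    # One bit per control signal (names are illustrative only)
    PC_TO_MAR = 1
    MEM_READ = 2
    BUS_TO_MBR = 4
    PC_INC = 8
    MBR_TO_IR = 16

# Each micro-instruction: (control word, address of next micro-instruction)
FETCH_ROUTINE = [
    (Signal.PC_TO_MAR, 1),
    (Signal.MEM_READ | Signal.BUS_TO_MBR | Signal.PC_INC, 2),
    (Signal.MBR_TO_IR, 0),
]

def active_signals(word):
    """Names of the control signals that are switched on in a control word."""
    return [s.name for s in Signal if s & word]

def as_bits(word):
    """Control word as a 5-character bit string, highest bit first."""
    return format(int(word), "05b")

for word, next_addr in FETCH_ROUTINE:
    print(as_bits(word), active_signals(word), "next:", next_addr)
```

A machine instruction then corresponds to one such list of control words, and the control unit only has to read the words out in order and drive the lines whose bits are set.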

Implementation (2)
Today’s large microprocessor
--Many instructions and associated register-level hardware
--Many control points to be manipulated
This results in control memory that
--Contains a large number of words
----corresponding to the number of instructions to be executed
--Has a wide word width 
----Due to the large number of control points to be manipulated

Micro-program Word Length
Based on 3 factors
--Maximum number of simultaneous micro-operations supported
--The way control information is represented or encoded
--The way in which the next micro-instruction address is specified

Vertical Micro-programming
Width is narrow
n control signals encoded into log2 n bits
Limited ability to express parallelism
Considerable encoding of control information requires external memory word decoder to identify the exact control line being manipulated


Horizontal Micro-programming
Wide memory word
High degree of parallel operations possible
Little encoding of control information


Compromise
Divide control signals into disjoint groups
Implement each group as separate field in memory word
Supports reasonable levels of parallelism without too much complexity
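
The trade-off between the three encodings shows up directly in the width of the control field. The sketch below counts bits only; the function names and the example of 16 signals are assumptions. The field-encoded variant also assumes one extra code per group meaning "no signal active", which these notes do not specify.

```python
def horizontal_bits(n_signals):
    # One bit per control signal, no encoding
    return n_signals

def vertical_bits(n_signals):
    # n signals encoded into ceil(log2 n) bits: one signal at a time
    return max(1, (n_signals - 1).bit_length())

def field_encoded_bits(group_sizes):
    # Each disjoint group gets its own field; one signal per group can be
    # active at once. Each field reserves one code for "none" (assumption).
    return sum(size.bit_length() for size in group_sizes)

n = 16
groups = (4, 4, 8)   # a hypothetical split of the 16 signals
print("horizontal:", horizontal_bits(n))
print("vertical:", vertical_bits(n))
print("field encoded:", field_encoded_bits(groups))
```

For 16 signals this gives 16 bits (horizontal), 4 bits (vertical) and 10 bits (three fields). The field-encoded word is narrower than the horizontal one while still allowing up to three signals, one per group, in the same micro-instruction.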

Organization of Control Memory


Control Unit


Control Unit Function
Sequencing logic unit issues read command
Word specified in control address register is read into control buffer register
Control buffer register contents generate control signals and next address information
Sequencing logic loads new address into control address register based on next address information from control buffer register and ALU flags


Next Address Decision
-Depending on ALU flags and control buffer register
--Get next instruction
----Add 1 to control address register
--Jump to new routine based on jump microinstruction
----Load address field of control buffer register into control address register
--Jump to machine instruction routine
----Load control address register based on opcode in IR
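
The three cases above amount to a small selection function for the control address register (CAR). This is a minimal sketch: the `kind` field, the `routine_start` table and the parameter names are hypothetical, and a real control unit encodes this choice in the micro-instruction itself.

```python
def next_car(car, kind, address_field, flag_set, opcode, routine_start):
    """Pick the next control address register value.

    kind          -- "next", "jump" or "dispatch" (illustrative encoding)
    address_field -- address field of the control buffer register
    flag_set      -- True if the tested ALU flag condition holds
    opcode        -- opcode currently in the IR
    routine_start -- maps opcode -> first address of its micro-routine
    """
    if kind == "dispatch":
        # Jump to the machine instruction routine selected by the opcode
        return routine_start[opcode]
    if kind == "jump" and flag_set:
        # Jump to a new routine: load the address field into the CAR
        return address_field
    # Otherwise get the next micro-instruction in sequence
    return car + 1

routines = {0x01: 40, 0x02: 56}   # hypothetical micro-routine start addresses
print(next_car(5, "next", 0, False, 0x01, routines))
print(next_car(5, "jump", 20, True, 0x01, routines))
print(next_car(5, "jump", 20, False, 0x01, routines))
print(next_car(5, "dispatch", 0, False, 0x02, routines))
```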

Functioning of Microprogrammed Control Unit


rangga bahtera
