DRAM Tutorial
1. DRAM Tutorial
18-447 Lecture
Vivek Seshadri
2. DRAM Module and Chip
Vivek Seshadri – Thesis Proposal
3. Goals
Cost
Latency
Bandwidth
Parallelism
Power
Energy
4. DRAM Chip
Figure: a DRAM bank built from row decoders, cell arrays, arrays of sense amplifiers, and bank I/O.
5. Sense Amplifier
Figure: a sense amplifier built from inverters, with "top" and "bottom" terminals and an "enable" signal.
6. Sense Amplifier – Two Stable States
Figure: the two stable states of an enabled sense amplifier.
Logical “1”: top at VDD, bottom at 0
Logical “0”: top at 0, bottom at VDD
7. Sense Amplifier Operation
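Figure: the terminals start at voltages VT (top) and VB (bottom), with VT > VB. Once enabled, the sense amplifier drives the top to VDD and the bottom to 0.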
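A sense amplifier, once enabled, settles into one of its two stable states depending on which terminal starts at the higher voltage. The following is a minimal behavioral sketch of that rule, not a circuit model. The function name `sense_amplify` and the normalized `VDD = 1.0` are assumptions made for illustration.

```python
VDD = 1.0  # assumed supply voltage, normalized for illustration


def sense_amplify(v_top: float, v_bottom: float, vdd: float = VDD) -> tuple[float, float]:
    """Return the stable (top, bottom) voltages after the amplifier is enabled.

    The terminal that starts higher is driven to VDD, the other to 0.
    """
    if v_top == v_bottom:
        # With no difference there is nothing to amplify in this idealized model.
        raise ValueError("no voltage difference to amplify")
    return (vdd, 0.0) if v_top > v_bottom else (0.0, vdd)


if __name__ == "__main__":
    print(sense_amplify(0.55, 0.50))  # (1.0, 0.0)
```

Even a small difference between the terminals is enough to pick the final state in this model, which is the property the rest of the DRAM cell operation relies on.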
8. DRAM Cell – Capacitor
Empty State: Logical “0”
Fully Charged State: Logical “1”
1 Small – Cannot drive circuits
2 Reading destroys the state
9. Capacitor to Sense Amplifier
Figure: a cell capacitor connected to a sense amplifier, with terminal voltages labeled VDD and 0.
10. DRAM Cell Operation
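Figure: both sense amplifier terminals start at ½VDD. Connecting a charged cell raises one terminal to ½VDD + δ, and the enabled sense amplifier then drives that terminal to VDD and the other to 0.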
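The small deviation δ from ½VDD comes from charge sharing between the cell capacitor and the much larger capacitance of the wire it connects to. The sketch below applies the standard charge-conservation formula to estimate δ. The capacitance ratio (1 : 4) and the normalized VDD are illustrative assumptions, not values from the slides, and the function name is hypothetical.

```python
def voltage_after_charge_sharing(v_cell: float, c_cell: float, c_bitline: float,
                                 vdd: float = 1.0) -> float:
    """Voltage on the wire after a cell is connected to it.

    The wire is assumed to be precharged to vdd / 2. Total charge is conserved:
    (c_cell + c_bitline) * v_final = c_cell * v_cell + c_bitline * vdd / 2.
    """
    v_precharge = vdd / 2
    return (c_cell * v_cell + c_bitline * v_precharge) / (c_cell + c_bitline)


if __name__ == "__main__":
    vdd = 1.0
    for name, v_cell in (("fully charged", vdd), ("empty", 0.0)):
        v = voltage_after_charge_sharing(v_cell, c_cell=1.0, c_bitline=4.0, vdd=vdd)
        delta = v - vdd / 2
        print(f"{name} cell: delta = {delta:+.3f}")
```

A fully charged cell gives a positive δ and an empty cell a negative one, so the sign of δ is what the sense amplifier resolves into a logical “1” or “0”.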
11. DRAM Subarray – Building Block for DRAM Chip
Figure: a subarray with a row decoder, cell arrays, and an array of sense amplifiers (row buffer, 8Kb).
12. DRAM Bank
Figure: a DRAM bank with row decoders (address in), cell arrays, arrays of sense amplifiers (8Kb), and bank I/O (64b; address in, data out).
13. DRAM Chip
Figure: a DRAM chip with multiple banks, each made of row decoders, cell arrays, arrays of sense amplifiers, and bank I/O. The banks connect through a shared internal bus to the memory channel (8 bits).
14. DRAM Operation
1 ACTIVATE Row
2 READ/WRITE Column
3 PRECHARGE
Figure: a bank with row decoders (row address in), cell arrays, an array of sense amplifiers, and bank I/O (column address in, data out).
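The three-step sequence (ACTIVATE a row, READ/WRITE columns, PRECHARGE) can be sketched as a tiny state machine. This is a simplified functional model under stated assumptions: the class `Bank` and its methods are hypothetical names, timing is ignored, and writes are applied to both the row buffer and the cells at once.

```python
class Bank:
    """Toy functional model of one DRAM bank (no timing, no refresh)."""

    def __init__(self, rows: int, cols: int) -> None:
        self.cells = [[0] * cols for _ in range(rows)]
        self.open_row = None    # None means the bank is precharged
        self.row_buffer = None  # models the array of sense amplifiers

    def activate(self, row: int) -> None:
        """Step 1: copy the addressed row into the sense amplifiers."""
        if self.open_row is not None:
            raise RuntimeError("bank must be precharged before ACTIVATE")
        self.row_buffer = list(self.cells[row])
        self.open_row = row

    def _require_open(self) -> None:
        if self.open_row is None:
            raise RuntimeError("no row is open; issue ACTIVATE first")

    def read(self, col: int) -> int:
        """Step 2: read one column from the row buffer."""
        self._require_open()
        return self.row_buffer[col]

    def write(self, col: int, value: int) -> None:
        """Step 2: write one column through the row buffer into the open row."""
        self._require_open()
        self.row_buffer[col] = value
        self.cells[self.open_row][col] = value

    def precharge(self) -> None:
        """Step 3: close the open row so another row can be activated."""
        self.open_row = None
        self.row_buffer = None


if __name__ == "__main__":
    bank = Bank(rows=4, cols=8)
    bank.activate(2)
    bank.write(3, 1)
    bank.precharge()
    bank.activate(2)
    print(bank.read(3))  # 1
```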
15. RowClone
Fast and Energy-Efficient In-DRAM Bulk Data Copy and Initialization
Vivek Seshadri
Y. Kim, C. Fallin, D. Lee, R. Ausavarungnirun,
G. Pekhimenko, Y. Luo, O. Mutlu,
P. B. Gibbons, M. A. Kozuch, T. C. Mowry
16. Memory Channel – Bottleneck
Figure: cores and cache connect through the memory controller (MC) and the channel to memory.
Channel: Limited Bandwidth, High Energy
17. Goal: Reduce Memory Bandwidth Demand
Figure: cores and cache connect through the memory controller (MC) and the channel to memory.
Reduce unnecessary data movement
18. Bulk Data Copy and Initialization
Bulk Data Copy: src to dst
Bulk Data Initialization: val to dst
20. Bulk Copy and Initialization – Applications
Forking
Zero initialization (e.g., security)
Checkpointing
VM Cloning
Deduplication
Page Migration
Many more
21. Shortcomings of Existing Approach
High Energy (3600nJ to copy 4KB)
High latency (1046ns to copy 4KB)
Interference
Figure: copying src to dst moves the data from memory through the channel, MC, and cache to the core.
22. Our Approach: In-DRAM Copy with Low Cost
Crossed out (X): High Energy, Interference, High latency
Figure: src is copied to dst inside memory (marked "?"), without moving data through the channel, MC, cache, or core.
23. RowClone: In-DRAM Copy
24. Two Key Observations
1 Any operation on one sense amplifier can be easily performed in bulk
2 Many DRAM cells share the same sense amplifier
25. Bulk Copy in DRAM – RowClone
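Figure: the source cell moves a sense amplifier terminal from ½VDD to ½VDD + δ, and the sense amplifier drives it to VDD (the other terminal goes from ½VDD to 0). A second cell is then connected to the same sense amplifier.
Data gets copied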
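Because many cells share one sense amplifier, a row connected to sense amplifiers that already hold data takes on that data. The sketch below models this at the row level, assuming the copy is triggered by activating the destination row while the sense amplifiers still hold the source row, with no PRECHARGE in between. The names `Subarray` and `rowclone_copy` are hypothetical, and the model ignores timing and analog effects.

```python
class Subarray:
    """Toy model: rows of cells that share one array of sense amplifiers."""

    def __init__(self, rows: list[list[int]]) -> None:
        self.rows = [list(r) for r in rows]
        self.sense_amps = None  # None models the precharged state

    def activate(self, row: int) -> None:
        if self.sense_amps is None:
            # Precharged: the cells determine what the sense amplifiers latch.
            self.sense_amps = list(self.rows[row])
        else:
            # Already driven: the sense amplifiers overwrite the connected cells.
            self.rows[row] = list(self.sense_amps)

    def precharge(self) -> None:
        self.sense_amps = None


def rowclone_copy(subarray: Subarray, src: int, dst: int) -> None:
    """Copy a whole row using two back-to-back activations (assumed sequence)."""
    subarray.activate(src)  # source row is latched by the sense amplifiers
    subarray.activate(dst)  # destination row is overwritten with that data
    subarray.precharge()


if __name__ == "__main__":
    sa = Subarray([[1, 0, 1, 1], [0, 0, 0, 0]])
    rowclone_copy(sa, src=0, dst=1)
    print(sa.rows[1])  # [1, 0, 1, 1]
```

The whole row moves in one step because every sense amplifier in the array acts at once, which is the "in bulk" part of the first observation.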
26. Fast Parallel Mode – Benefits
Bulk Data Copy (4KB across a module)
Latency: 11X (1046ns to 90ns)
Energy: 74X (3600nJ to 40nJ)
No bandwidth consumption
Very few changes to the DRAM chip
27. Fast Parallel Mode – Constraints
• Location constraint
– Source and destination in same subarray
• Size constraint
– Entire row gets copied (no partial copy)
1 Can still accelerate many existing primitives
(copy-on-write, bulk zeroing)
2 Alternate mechanism to copy data across banks
(pipelined serial mode – lower benefits than Fast Parallel)
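The two constraints suggest a simple dispatch rule for choosing a copy mechanism. The sketch below is one possible reading, not the design from the slides: `RowAddress`, `choose_copy_mode`, and the returned mode strings are hypothetical, and the case of two different subarrays within the same bank is not covered by the slide, so the sketch marks it as unspecified.

```python
from typing import NamedTuple


class RowAddress(NamedTuple):
    """Hypothetical decoded DRAM location of a row."""
    bank: int
    subarray: int
    row: int


def choose_copy_mode(src: RowAddress, dst: RowAddress,
                     copies_whole_row: bool = True) -> str:
    """Pick a copy mechanism from the location and size constraints."""
    if not copies_whole_row:
        # Size constraint: partial-row copies cannot use the in-DRAM mechanism.
        return "cpu_copy"
    if src.bank == dst.bank and src.subarray == dst.subarray:
        # Location constraint satisfied: same subarray.
        return "fast_parallel"
    if src.bank != dst.bank:
        # Alternate mechanism for copies across banks.
        return "pipelined_serial"
    # Same bank, different subarray: not described on the slide.
    return "unspecified"


if __name__ == "__main__":
    a = RowAddress(bank=0, subarray=3, row=10)
    b = RowAddress(bank=0, subarray=3, row=42)
    c = RowAddress(bank=1, subarray=0, row=7)
    print(choose_copy_mode(a, b))  # fast_parallel
    print(choose_copy_mode(a, c))  # pipelined_serial
```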
28. End-to-end System Design
• Software interface
– memcpy and meminit instructions
• Managing cache coherence
– Use existing DMA support!
• Maximizing use of Fast Parallel Mode
– Smart OS page allocation
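One way an OS could maximize use of Fast Parallel Mode is to prefer destination page frames that sit in the same subarray as the source. The sketch below illustrates that idea only; the slides do not describe the allocation policy. The class `SubarrayAwareAllocator`, its methods, and the frame-to-subarray mapping (consecutive frames grouped per subarray) are all assumptions.

```python
from collections import defaultdict


class SubarrayAwareAllocator:
    """Toy page-frame allocator that prefers frames in the source's subarray."""

    def __init__(self, frames_per_subarray: int, num_frames: int) -> None:
        self.frames_per_subarray = frames_per_subarray
        self.free = defaultdict(list)  # subarray id -> list of free frames
        for frame in range(num_frames):
            self.free[self.subarray_of(frame)].append(frame)

    def subarray_of(self, frame: int) -> int:
        # Assumed mapping: consecutive frames share a subarray.
        return frame // self.frames_per_subarray

    def allocate_near(self, src_frame: int) -> int:
        """Return a free frame, preferring the subarray that holds src_frame."""
        preferred = self.subarray_of(src_frame)
        if self.free[preferred]:
            return self.free[preferred].pop()
        # Fall back to any subarray; the copy then cannot use the fast mode.
        for frames in self.free.values():
            if frames:
                return frames.pop()
        raise MemoryError("no free frames")


if __name__ == "__main__":
    alloc = SubarrayAwareAllocator(frames_per_subarray=4, num_frames=8)
    dst = alloc.allocate_near(src_frame=1)
    print(alloc.subarray_of(dst) == alloc.subarray_of(1))  # True
```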
29. Applications Summary
Figure: fraction of memory traffic (0 to 1) broken down into Zero, Copy, Write, and Read for bootup, compile, forkbench, mcached, mysql, and shell.
30. Results Summary
Figure: IPC improvement and memory energy reduction compared to baseline (axis from 0% to 70%) for bootup, compile, forkbench, mcached, mysql, and shell.