Adaptive Techniques for Dynamic Processor Optimization: Theory and Practice
Chapter 6 Dynamic Voltage Scaling with the XScale Embedded Microprocessor 129

As a concrete example, assume the processor is running at 1 GHz and VDD = 1.75 V. If half of the cycles are stalls waiting for the bus, as determined by a combination of the total clock count, instructions executed, and data dependency stall or bus request counts, VDD can be adjusted to 1.2 V (see Figure 6.2) and the core frequency reduced to 500 MHz. Useful work is then performed in a greater fraction of the (fewer overall) core clock cycles. Referring to Figure 6.2, the power savings are nearly 50%, with the same work finished in the same amount of time.
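
The arithmetic in this example can be sketched in C. This is a hypothetical heuristic, not the authors' DVS policy: it assumes the stall fraction has already been derived from the PMU counts, and that the core clock can be scaled down by exactly the fraction of cycles that do useful work. The names `stall_fraction` and `scaled_core_mhz` are illustrative only.

```c
#include <assert.h>

/* Fraction of core clocks spent stalled, from 0.0 to 1.0.
   Both counts are assumed to cover the same measurement interval. */
static double stall_fraction(unsigned int total_clocks, unsigned int stall_clocks)
{
    assert(total_clocks > 0 && stall_clocks <= total_clocks);
    return (double)stall_clocks / (double)total_clocks;
}

/* Hypothetical heuristic: keep only enough core clocks to cover
   the cycles that perform useful work. */
static double scaled_core_mhz(double core_mhz, double stall_frac)
{
    assert(stall_frac >= 0.0 && stall_frac <= 1.0);
    return core_mhz * (1.0 - stall_frac);
}
```

With the numbers from the text, `scaled_core_mhz(1000.0, stall_fraction(1000000u, 500000u))` yields 500 MHz. The matching VDD would still have to come from a characterization such as Figure 6.2.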

6.2 Dynamic Voltage Scaling on the XScale Microprocessor

This section describes experimental results from running DVS on the 180 nm XScale microprocessor. The value of DVS is evident in Figure 6.3. Here, the 80200 microprocessor is shown functioning across a power range from 10 mW in idle mode up to 1.5 W at a 1 GHz clock frequency. The idle mode power is dominated by the PLL and clock generation unit. The processor core includes the capacity to apply reverse-body bias and supply collapse [10, 11] to the core transistors for fully state-retentive power-down. The microprocessor core consumes 100 μW in the low standby “Drowsy” mode [12]. The PLL and clock divider unit must be restarted when leaving Drowsy mode. When running with a clock frequency of 200 MHz, VDD can be reduced to 700 mV, giving a power dissipation of less than 45 mW.
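
Dividing the two operating points quoted above by their clock frequencies gives a rough energy-per-cycle comparison. The sketch below is only that arithmetic; the helper name `energy_per_cycle_nj` is illustrative, and the 200 MHz result is an upper bound because the text gives that power only as less than 45 mW.

```c
#include <assert.h>

/* Energy per core clock cycle in nanojoules.
   1 mW / 1 MHz = 1e-3 W / 1e6 Hz = 1e-9 J, so mW divided by MHz is nJ. */
static double energy_per_cycle_nj(double power_mw, double freq_mhz)
{
    assert(freq_mhz > 0.0);
    return power_mw / freq_mhz;
}
```

For the quoted figures, `energy_per_cycle_nj(1500.0, 1000.0)` is 1.5 nJ per cycle at 1 GHz, while `energy_per_cycle_nj(45.0, 200.0)` is about 0.225 nJ per cycle at 200 MHz and 700 mV.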

Figure 6.3 The value of dynamic voltage scaling is evident from this plot of the 80200 power and VDD voltage over time. The power lags due to the latency of the measurement and time averaging.

[Figure 6.3 plot. Horizontal axis: Time (arbitrary scale). Vertical axes: Clock Frequency (MHz) and Power (mW); VDD (V). Traces: Frequency, Power (averaged over samples), Voltage.]

130 Lawrence T. Clark, Franco Ricci, William E. Brown

6.2.1 Running DVS

To demonstrate DVS on the XScale, a synthetic benchmark programmed on the LRH demonstration board is used here. The onboard voltage regulator is bypassed, and a daughter-card using a Lattice GAL22v10 PLD controller and a Maxim MAX1855 DC-DC converter evaluation kit is added. The DC-DC converter output voltage can vary from 0.6 to 1.75 V. The control is memory mapped, allowing software to control the processor core VDD.
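
A memory-mapped control of this kind is typically driven through a `volatile` pointer. The following is a minimal sketch under stated assumptions: the text gives neither the register address nor its encoding, so the function takes the register pointer as a parameter and simply writes the clamped request in millivolts. The names `clamp_vdd_mv` and `set_core_vdd`, and the millivolt encoding, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VDD_MIN_MV 600u   /* lower end of the converter range quoted in the text */
#define VDD_MAX_MV 1750u  /* upper end of the converter range quoted in the text */

/* Clamp a requested core voltage to the converter's output range. */
static uint32_t clamp_vdd_mv(uint32_t mv)
{
    if (mv < VDD_MIN_MV) return VDD_MIN_MV;
    if (mv > VDD_MAX_MV) return VDD_MAX_MV;
    return mv;
}

/* Write the clamped request to the control register and return it.
   Hypothetical encoding: the register is assumed to take millivolts directly;
   the real encoding is defined by the PLD on the daughter-card. */
static uint32_t set_core_vdd(volatile uint32_t *vdd_reg, uint32_t mv)
{
    assert(vdd_reg != 0);
    uint32_t clamped = clamp_vdd_mv(mv);
    *vdd_reg = clamped;   /* volatile store, so the compiler cannot drop it */
    return clamped;
}
```

Passing the register as a pointer keeps the sketch testable on a host machine with an ordinary variable standing in for the hardware register.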

The synthetic benchmark loops between a basic block of code that has a data set that fits entirely in the cache (these pages are configured for write-back mode) and one that is non-cacheable and non-bufferable. The latter requires many more bus operations, since the bus frequency of 100 MHz is lower than the core clock frequency, which must be at least 3× the bus frequency on the demonstration board.

The code monitors the actual operational CPI using the processor PMU. It tracks the number of executed instructions and the number of clocks elapsed since the PMU was initialized and counting began. The C code, with inline assembly to perform low-level functions, is:

unsigned int count0, count1, count2;

int cpi() {
    int val;

    // stop the performance counters before reading them
    // (one asm statement, so r0 and r1 cannot be reused in between)
    asm volatile("mrc p14, 0, r0, c0, c0, 0\n\t"   // read the PMNC register
                 "bic r1, r0, #1\n\t"              // clear the enable bit
                 "mcr p14, 0, r1, c0, c0, 0"       // clear interrupt flag, disable counting
                 ::: "r0", "r1");

    // read CCNT register
    asm volatile("mrc p14, 0, %0, c1, c0, 0" : "=r" (count0));
    // read the two event counters
    asm volatile("mrc p14, 0, %0, c2, c0, 0" : "=r" (count1));
    asm volatile("mrc p14, 0, %0, c3, c0, 0" : "=r" (count2));

    return (val = count0);
}

int startcounters() {
    unsigned int z;

    // set up and turn on the performance counters
    z = 0x00710707;
    asm volatile("mov r1, %0\n\t"                  // initialization value in reg. 1
                 "mcr p14, 0, r1, c0, c0, 0"       // write reg. 1 to PMNC
                 :: "r" (z) : "r1");
    return 0;
}
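
Once the clock and instruction counts have been read, the CPI itself is a simple ratio. The helper below is a portable sketch rather than the authors' code: which counter holds the instruction count depends on the event selection written to PMNC, so both counts are passed in explicitly, and the name `cpi_from_counts` is hypothetical.

```c
#include <assert.h>

/* Cycles per instruction over one measurement interval.
   Returns 0.0 when no instructions were counted, to avoid dividing by zero. */
static double cpi_from_counts(unsigned int clocks, unsigned int instructions)
{
    if (instructions == 0)
        return 0.0;
    return (double)clocks / (double)instructions;
}
```

For example, 2000 clocks and 1000 executed instructions give a CPI of 2.0. Because the counters are 32 bits wide, the interval between reads has to be short enough that neither count wraps.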
