Design of Energy-Efficient
Application-Specific
Instruction Set Processors
by
Tilman Glökler
IBM Deutschland Entwicklung GmbH,
Böblingen, Germany
and
Heinrich Meyr
Integrated Signal Processing Systems,
Aachen University of Technology, Germany
KLUWER ACADEMIC PUBLISHERS
NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 1-4020-2540-8
Print ISBN: 1-4020-7730-0
©2004 Springer Science + Business Media, Inc.
Print ©2004 Kluwer Academic Publishers
All rights reserved
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic,
mechanical, recording, or otherwise, without written consent from the Publisher
Created in the United States of America
Visit Springer's eBookstore at: http://www.ebooks.kluweronline.com
and the Springer Global Website Online at: http://www.springeronline.com
Contents
Foreword
Acknowledgments
About the Authors
List of Figures
List of Tables
1 Introduction
2 Focus and Related Work
2.1 Focus of This Work
2.2 Previous Work
2.2.1 ASIP Design Methodologies
2.2.2 ASIP Case Studies
2.2.3 Basic Low-Power Design Techniques
2.2.4 Verification
2.3 Differences to Previous Work
3 Efficient Low-Power Hardware Design
3.1 Metrics of the Implementation and the Hardware Design Methodology
3.1.1 Characteristics of the Implementation
3.1.2 Characteristics of the Design Methodology
3.2 Basics of Low-Energy Hardware Design
3.2.1 Sources of CMOS Energy Consumption
3.2.2 Basic Principles of Lowering the Power Consumption
3.2.3 Measuring and Quantifying Energy-Efficiency
3.3 Techniques to Reduce the Energy Consumption
3.3.1 System and Architecture Level
3.3.2 Register Transfer and Logic Level
3.3.3 Physical Level
3.4 Concluding Remarks
4 Application-Specific Processor Architectures
4.1 Definitions of ASIP Related Terms
4.2 ASIP Applications
4.3 ASIP Design Space
4.3.1 Functional Units
4.3.2 Storage elements
4.3.3 Pipelining
4.3.4 Interconnection Structure
4.3.5 Control Mechanisms
4.3.6 Storage Access
4.3.7 Instruction Coding and Instruction Fetch Mechanisms
4.3.8 Interface Mechanisms
4.3.9 Tightly-Coupled ASIP Accelerators
4.4 Critical Factors for Energy-Efficient ASIPs
4.4.1 Timing and Computational Performance
4.4.2 Energy Consumption
4.4.3 Implementation Area
4.5 Concluding Remarks
5 The ASIP Design Flow
5.1 Example Applications
5.2 Application Profiling and Partitioning
5.2.1 Stimulus Generation for Application Profiling
5.2.2 Application Profiling
5.2.3 HW/SW Partitioning
5.2.4 ASIP Class Selection
5.3 Combined ASIP HW/SW Synthesis and Profiling
5.3.1 ASIP Interface Definition
5.3.2 ASIP ISA Definition
5.3.3 Software Implementation and Tools
5.3.4 Hardware Implementation and Logic Synthesis
5.3.5 Implementation Profiling and Worst Case Runtime Analysis
5.3.6 Iterative ASIP Optimization
5.3.7 Definition of a tightly coupled ASIP Accelerator
5.4 Verification
5.5 Concluding Remarks
6 The ASIP Design Environment
6.1 The LISA Language
6.2 The LISA Design Environment
6.3 Extensions to the LISA Design Environment
6.3.1 Instruction Encoding and Decoder Generation
6.3.1.1 Minimization of the instruction width
6.3.1.2 Minimization of the Toggle Activity
6.3.2 Semi-Automatic Test Case Generation
6.4 Concluding Remarks
7 Case Studies
7.1 Case Study I: DVB-T Acquisition and Tracking
7.1.1 Application Profiling and ASIP Class Selection
7.1.2 Iterative Instruction Set Optimization
7.1.2.1 Example 1: Saturation
7.1.2.2 Example 2: CORDIC
7.1.3 Overall Energy Optimization Results
7.2 Case Study II: Linear Algebra Kernels and Eigenvalue Decomposition
7.2.1 Implementation I: Optimized ASIP with Accelerator
7.2.2 Implementation II: Compiler-Programmed Parameterizable Core with Accelerator
7.2.3 Evaluation Results
7.3 Concluding Remarks
8 Summary
A ASIP Development Using LISA 2.0
A.1 The LISA 2.0 Language
A.2 Design Space Exploration
A.3 Design Implementation
A.4 Software Tools Generation
A.4.1 Compiler Generation
A.4.2 Assembler and Linker Generation
A.4.3 Simulator Generation
A.4.3.1 Interpretive Simulation
A.4.3.2 Compiled Simulation
A.4.3.3 Just-In-Time Cache Compiled Simulation (JIT-CCS)
A.5 System Integration
A.6 Summary
B Computational Kernels
B.1 The CORDIC Algorithm
B.2 FIR Filter
B.3 The Fast Fourier Transformation
B.4 Vector/Matrix Operations
B.5 Complex EVD using a Jacobi-like Algorithm
C ICORE Instruction Set Architecture
C.1 Processor Resources
C.2 Pipeline Organization
C.3 Instruction Summary
C.4 Exceptions to the Hidden Pipeline Model
C.5 ICORE Memory Organization and I/O Space
C.6 Instruction Coding
D Different ICORE Pipeline Organizations
E ICORE HDL Description Templates
E.1 Generic Register File Entity
E.2 Generic Bit-Manipulation Unit
F Area, Power and Design Time for ICORE
G Acronyms
Bibliography
Foreword
It would appear we have reached the limits of what is possible to achieve with computer technology, although one should be careful with such statements – they tend to sound pretty silly in five years.
John von Neumann, 1949.
Application-specific instruction set processors (ASIPs) have the potential to become a key building block of future integrated circuits for digital signal processing. ASIPs combine the flexibility and competitive
time-to-market of embedded processors with the computational performance and energy-efficiency of dedicated VLSI hardware implementations. Furthermore, ASIPs can easily be integrated into existing semicustom design flows: the ASIP designer has full control of the implementation and verification. As ASIPs replace commercial embedded
processors, there is no need to pay royalties to third parties.
This book was written for hard- and software design engineers as well
as students with a fundamental knowledge of VLSI logic design. The
benefits of ASIPs can only be exploited by designers with expertise
in the fields of VLSI hardware, computer architecture, and embedded software design. This book provides the essential knowledge in
each of these disciplines and focuses on the practical implementation
of ASIPs for real-world applications. Many examples illustrate the proposed methodology; theoretical discussions are kept to a minimum.
This book constitutes my Ph.D. thesis, which was carried out at the
Institute for Integrated Signal Processing Systems at Aachen University
of Technology (ISS/RWTH Aachen/Germany). My reviewers encouraged me to extend my thesis and publish this comprehensive book about
ASIP design.
The first chapter of this book introduces the advantages of ASIPs and
motivates the need for an elaborate design methodology. In
Chapter 2, the focus of this work is described in detail and an overview
of related work is given. Chapter 3 introduces and summarizes the
basics of low-energy VLSI design. This chapter is a prerequisite for
the design space definition of ASIPs and the discussion of critical factors for energy-efficient ASIP architectures in Chapter 4. The proposed
ASIP design flow is presented in Chapter 5 with a special focus on design tasks to obtain an energy-efficient implementation. The LISA tool
suite, which was developed at the ISS, and enhancements of these tools
triggered by this work are presented in Chapter 6. The described tools
support the generation of critical hardware parts in order to save energy as well as the verification of the implemented ASIP hard- and software. Quantitative results of two case studies are given in Chapter 7,
which prove the applicability of the proposed design flow and the developed tools. The first case study demonstrates the impressive potential of
ASIP performance and energy optimizations, whereas the second case
study compares the architectural and implementation efficiency of two
different ASIP design approaches.
Acknowledgments
I would like to express my sincere gratitude to Professor Heinrich Meyr,
the coauthor of this book, for supervising my Ph.D. thesis. Frequent
discussions with him added greatly to my work and his guidance and
support have been invaluable to my academic development over the last
five years.
Furthermore, I would like to thank Professor Stefan Heinen for generously spending his time to advise me and for his valuable contributions to improving this thesis.
I am especially grateful to Dr. Stephan Bitterlich for many fruitful discussions and various good ideas. Moreover, I would like to thank all
my colleagues at the Institute for Integrated Signal Processing Systems
(ISS) for five pleasant years together. Special recognition is given
to Tim Kogel, Dr. Falco Munsche, Dr. Jens Horstmannshoff, Oliver
Wahlen and Manuel Hohenauer for proof-reading and for many valuable proposals to improve this thesis. I also would like to thank Oliver
Schliebusch for updating the LISA appendix. Moreover, I am very
grateful to Dr. Stefan A. Fechtel and the design team of Infineon Technologies AG for supporting the ICORE DVB-T chip project.
Finally, I would like to thank my parents for their support during my
studies and my girlfriend Eva-Marie for her understanding and patience during the writing of this thesis.
Tilman Glökler
October, 2003
About the Authors
T. Glökler received his diploma degree with honors in Electrical
Engineering from Technical University of Stuttgart, Germany, in 1997.
He spent five years working on his Ph.D. thesis at the Institute for
Integrated Signal Processing Systems (ISS) at Aachen University of
Technology (RWTH Aachen). At the ISS he was primarily involved in
ASIP design and low-power hardware design methodology as well as
in the development of EDA tools. He has written about 10 scientific
conference and journal papers. His research interests include advanced
algorithms for design automation and digital signal processing with
a special focus on programmable architectures for efficient HW/SW
codesign. Currently, he is with IBM Deutschland Entwicklung GmbH,
Germany, where he is working on the design and verification of high-end microprocessors for consumer applications. Tilman Glökler is a
member of the IEEE.
H. Meyr received his M.Sc. and Ph.D. from ETH Zurich, Switzerland.
He spent over 12 years in various research and management positions
in industry before accepting a professorship in Electrical Engineering
at Aachen University of Technology (RWTH Aachen) in 1977. He has
worked extensively in the areas of communication theory, digital signal
processing and CAD tools for system level design for the last thirty
years. His research has been applied to the design of many industrial
products. At RWTH Aachen he is a co-director of the Institute for
Integrated Signal Processing Systems (ISS), which is involved in the analysis
and design of complex signal processing systems for communication
applications. He was a co-founder of CADIS GmbH (acquired in 1993 by
Synopsys, Mountain View, California), a company which commercialized the tool suite COSSAP. In 2001 he co-founded LISATek Inc.,
a company with breakthrough technology to design application specific
processors. Recently (February 2003) LISATek was acquired by
CoWare, an acknowledged leader in the area of system level design. At
CoWare Dr. Meyr has accepted the position of Chief Scientist. He also
serves as a member of the board of directors at CoWare and another
large corporation. Dr. Meyr has published numerous IEEE papers and
holds many patents. He is author (together with Dr. G. Ascheid) of
the book "Synchronization in Digital Communications", Wiley 1990
and of the book "Digital Communication Receivers. Synchronization,
Channel Estimation, and Signal Processing" (together with Dr. M.
Moeneclaey and Dr. S. Fechtel), Wiley, October 1997. He has received
two IEEE best paper awards. Dr. Meyr is also the recipient of the
prestigious Vodafone Innovation Prize for the year 2000. The Vodafone
prize is awarded for outstanding contribution to the area of wireless
communication. As well as being a Fellow of the IEEE he has served as
Vice President for International Affairs of the IEEE Communications
Society.