
Speech coding algorithms
SPEECH CODING
ALGORITHMS
Foundation and Evolution
of Standardized Coders
WAI C. CHU
Mobile Media Laboratory
DoCoMo USA Labs
San Jose, California
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright © 2003 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee
to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400,
fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should
be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken,
NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail: [email protected].
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be suitable
for your situation. You should consult with a professional where appropriate. Neither the publisher nor
author shall be liable for any loss of profit or any other commercial damages, including but not limited to
special, incidental, consequential, or other damages.
For general information on our other products and services please contact our Customer Care Department
within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print,
however, may not be available in electronic format.
Library of Congress Cataloging-in-Publication Data:
Chu, Wai C. —
Speech coding algorithms: Foundation and evolution of standardized coders
ISBN 0-471-37312-5
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
Intelligence is the fruit of industriousness
Accretion of knowledge creates genii
A Chinese proverb
CONTENTS
PREFACE xiii
ACRONYMS xix
NOTATION xxiii
1 INTRODUCTION 1
1.1 Overview of Speech Coding / 2
1.2 Classification of Speech Coders / 8
1.3 Speech Production and Modeling / 11
1.4 Some Properties of the Human Auditory System / 18
1.5 Speech Coding Standards / 22
1.6 About Algorithms / 26
1.7 Summary and References / 31
2 SIGNAL PROCESSING TECHNIQUES 33
2.1 Pitch Period Estimation / 33
2.2 All-Pole and All-Zero Filters / 45
2.3 Convolution / 52
2.4 Summary and References / 57
Exercises / 57
3 STOCHASTIC PROCESSES AND MODELS 61
3.1 Power Spectral Density / 62
3.2 Periodogram / 67
3.3 Autoregressive Model / 69
3.4 Autocorrelation Estimation / 73
3.5 Other Signal Models / 85
3.6 Summary and References / 86
Exercises / 87
4 LINEAR PREDICTION 91
4.1 The Problem of Linear Prediction / 92
4.2 Linear Prediction Analysis of Nonstationary Signals / 96
4.3 Examples of Linear Prediction Analysis of Speech / 101
4.4 The Levinson–Durbin Algorithm / 107
4.5 The Leroux–Gueguen Algorithm / 114
4.6 Long-Term Linear Prediction / 120
4.7 Synthesis Filters / 127
4.8 Practical Implementation / 131
4.9 Moving Average Prediction / 137
4.10 Summary and References / 138
Exercises / 139
5 SCALAR QUANTIZATION 143
5.1 Introduction / 143
5.2 Uniform Quantizer / 147
5.3 Optimal Quantizer / 149
5.4 Quantizer Design Algorithms / 151
5.5 Algorithmic Implementation / 155
5.6 Summary and References / 158
Exercises / 158
6 PULSE CODE MODULATION AND ITS VARIANTS 161
6.1 Uniform Quantization / 161
6.2 Nonuniform Quantization / 166
6.3 Differential Pulse Code Modulation / 172
6.4 Adaptive Schemes / 175
6.5 Summary and References / 180
Exercises / 181
7 VECTOR QUANTIZATION 184
7.1 Introduction / 185
7.2 Optimal Quantizer / 188
7.3 Quantizer Design Algorithms / 189
7.4 Multistage VQ / 194
7.5 Predictive VQ / 216
7.6 Other Structured Schemes / 219
7.7 Summary and References / 221
Exercises / 222
8 SCALAR QUANTIZATION OF LINEAR PREDICTION COEFFICIENT 227
8.1 Spectral Distortion / 227
8.2 Quantization Based on Reflection Coefficient and Log Area Ratio / 232
8.3 Line Spectral Frequency / 239
8.4 Quantization Based on Line Spectral Frequency / 252
8.5 Interpolation of LPC / 256
8.6 Summary and References / 258
Exercises / 260
9 LINEAR PREDICTION CODING 263
9.1 Speech Production Model / 264
9.2 Structure of the Algorithm / 268
9.3 Voicing Detector / 271
9.4 The FS1015 LPC Coder / 275
9.5 Limitations of the LPC Model / 277
9.6 Summary and References / 280
Exercises / 281
10 REGULAR-PULSE EXCITATION CODERS 285
10.1 Multipulse Excitation Model / 286
10.2 Regular-Pulse-Excited–Long-Term Prediction / 289
10.3 Summary and References / 295
Exercises / 296
11 CODE-EXCITED LINEAR PREDICTION 299
11.1 The CELP Speech Production Model / 300
11.2 The Principle of Analysis-by-Synthesis / 301
11.3 Encoding and Decoding / 302
11.4 Excitation Codebook Search / 308
11.5 Postfilter / 317
11.6 Summary and References / 325
Exercises / 326
12 THE FEDERAL STANDARD VERSION OF CELP 330
12.1 Improving the Long-Term Predictor / 331
12.2 The Concept of the Adaptive Codebook / 333
12.3 Incorporation of the Adaptive Codebook to the CELP Framework / 336
12.4 Stochastic Codebook Structure / 338
12.5 Adaptive Codebook Search / 341
12.6 Stochastic Codebook Search / 344
12.7 Encoder and Decoder / 346
12.8 Summary and References / 349
Exercises / 350
13 VECTOR SUM EXCITED LINEAR PREDICTION 353
13.1 The Core Encoding Structure / 354
13.2 Search Strategies for Excitation Codebooks / 356
13.3 Excitation Codebook Searches / 357
13.4 Gain Related Procedures / 362
13.5 Encoder and Decoder / 366
13.6 Summary and References / 368
Exercises / 369
14 LOW-DELAY CELP 372
14.1 Strategies to Achieve Low Delay / 373
14.2 Basic Operational Principles / 375
14.3 Linear Prediction Analysis / 377
14.4 Excitation Codebook Search / 380
14.5 Backward Gain Adaptation / 385
14.6 Encoder and Decoder / 389
14.7 Codebook Training / 391
14.8 Summary and References / 393
Exercises / 394
15 VECTOR QUANTIZATION OF LINEAR PREDICTION COEFFICIENT 396
15.1 Correlation Among the LSFs / 396
15.2 Split VQ / 399
15.3 Multistage VQ / 403
15.4 Predictive VQ / 407
15.5 Summary and References / 418
Exercises / 419
16 ALGEBRAIC CELP 423
16.1 Algebraic Codebook Structure / 424
16.2 Adaptive Codebook / 425
16.3 Encoding and Decoding / 433
16.4 Algebraic Codebook Search / 437
16.5 Gain Quantization Using Conjugate VQ / 443
16.6 Other ACELP Standards / 446
16.7 Summary and References / 451
Exercises / 451
17 MIXED EXCITATION LINEAR PREDICTION 454
17.1 The MELP Speech Production Model / 455
17.2 Fourier Magnitudes / 456
17.3 Shaping Filters / 464
17.4 Pitch Period and Voicing Strength Estimation / 466
17.5 Encoder Operations / 474
17.6 Decoder Operations / 477
17.7 Summary and References / 481
Exercises / 482
18 SOURCE-CONTROLLED VARIABLE BIT-RATE CELP 486
18.1 Adaptive Rate Decision / 487
18.2 LP Analysis and LSF-Related Operations / 494
18.3 Decoding and Encoding / 496
18.4 Summary and References / 498
Exercises / 499
19 SPEECH QUALITY ASSESSMENT 501
19.1 The Scope of Quality and Measuring Conditions / 501
19.2 Objective Quality Measurements for Waveform Coders / 502
19.3 Subjective Quality Measures / 504
19.4 Improvements on Objective Quality Measures / 505
APPENDIX A MINIMUM-PHASE PROPERTY OF THE FORWARD PREDICTION-ERROR FILTER 507
APPENDIX B SOME PROPERTIES OF LINE SPECTRAL FREQUENCY 514
APPENDIX C RESEARCH DIRECTIONS IN SPEECH CODING 518
APPENDIX D LINEAR COMBINER FOR PATTERN CLASSIFICATION 522
APPENDIX E CELP: OPTIMAL LONG-TERM PREDICTOR TO MINIMIZE THE WEIGHTED DIFFERENCE 531
APPENDIX F REVIEW OF LINEAR ALGEBRA: ORTHOGONALITY, BASIS, LINEAR INDEPENDENCE, AND THE GRAM–SCHMIDT ALGORITHM 537
BIBLIOGRAPHY 542
INDEX 553