SPEECH CODING ALGORITHMS

Foundation and Evolution of Standardized Coders

WAI C. CHU

Mobile Media Laboratory

DoCoMo USA Labs

San Jose, California

A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 2003 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400, fax 978-750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, e-mail: [email protected].

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.

Library of Congress Cataloging-in-Publication Data:

Chu, Wai C. —

Speech coding algorithms: Foundation and evolution of standardized coders

ISBN 0-471-37312-5

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

Intelligence is the fruit of industriousness

Accretion of knowledge creates genii

A Chinese proverb

CONTENTS

PREFACE xiii

ACRONYMS xix

NOTATION xxiii

1 INTRODUCTION 1

1.1 Overview of Speech Coding / 2

1.2 Classification of Speech Coders / 8

1.3 Speech Production and Modeling / 11

1.4 Some Properties of the Human Auditory System / 18

1.5 Speech Coding Standards / 22

1.6 About Algorithms / 26

1.7 Summary and References / 31

2 SIGNAL PROCESSING TECHNIQUES 33

2.1 Pitch Period Estimation / 33

2.2 All-Pole and All-Zero Filters / 45

2.3 Convolution / 52

2.4 Summary and References / 57

Exercises / 57

3 STOCHASTIC PROCESSES AND MODELS 61

3.1 Power Spectral Density / 62

3.2 Periodogram / 67

3.3 Autoregressive Model / 69

3.4 Autocorrelation Estimation / 73

3.5 Other Signal Models / 85

3.6 Summary and References / 86

Exercises / 87

4 LINEAR PREDICTION 91

4.1 The Problem of Linear Prediction / 92

4.2 Linear Prediction Analysis of Nonstationary Signals / 96

4.3 Examples of Linear Prediction Analysis of Speech / 101

4.4 The Levinson–Durbin Algorithm / 107

4.5 The Leroux–Gueguen Algorithm / 114

4.6 Long-Term Linear Prediction / 120

4.7 Synthesis Filters / 127

4.8 Practical Implementation / 131

4.9 Moving Average Prediction / 137

4.10 Summary and References / 138

Exercises / 139

5 SCALAR QUANTIZATION 143

5.1 Introduction / 143

5.2 Uniform Quantizer / 147

5.3 Optimal Quantizer / 149

5.4 Quantizer Design Algorithms / 151

5.5 Algorithmic Implementation / 155

5.6 Summary and References / 158

Exercises / 158

6 PULSE CODE MODULATION AND ITS VARIANTS 161

6.1 Uniform Quantization / 161

6.2 Nonuniform Quantization / 166

6.3 Differential Pulse Code Modulation / 172

6.4 Adaptive Schemes / 175

6.5 Summary and References / 180

Exercises / 181

7 VECTOR QUANTIZATION 184

7.1 Introduction / 185

7.2 Optimal Quantizer / 188

7.3 Quantizer Design Algorithms / 189

7.4 Multistage VQ / 194

7.5 Predictive VQ / 216

7.6 Other Structured Schemes / 219

7.7 Summary and References / 221

Exercises / 222

8 SCALAR QUANTIZATION OF LINEAR PREDICTION COEFFICIENT 227

8.1 Spectral Distortion / 227

8.2 Quantization Based on Reflection Coefficient and Log Area Ratio / 232

8.3 Line Spectral Frequency / 239

8.4 Quantization Based on Line Spectral Frequency / 252

8.5 Interpolation of LPC / 256

8.6 Summary and References / 258

Exercises / 260

9 LINEAR PREDICTION CODING 263

9.1 Speech Production Model / 264

9.2 Structure of the Algorithm / 268

9.3 Voicing Detector / 271

9.4 The FS1015 LPC Coder / 275

9.5 Limitations of the LPC Model / 277

9.6 Summary and References / 280

Exercises / 281

10 REGULAR-PULSE EXCITATION CODERS 285

10.1 Multipulse Excitation Model / 286

10.2 Regular-Pulse-Excited–Long-Term Prediction / 289

10.3 Summary and References / 295

Exercises / 296

11 CODE-EXCITED LINEAR PREDICTION 299

11.1 The CELP Speech Production Model / 300

11.2 The Principle of Analysis-by-Synthesis / 301

11.3 Encoding and Decoding / 302

11.4 Excitation Codebook Search / 308

11.5 Postfilter / 317

11.6 Summary and References / 325

Exercises / 326

12 THE FEDERAL STANDARD VERSION OF CELP 330

12.1 Improving the Long-Term Predictor / 331

12.2 The Concept of the Adaptive Codebook / 333

12.3 Incorporation of the Adaptive Codebook to the CELP Framework / 336

12.4 Stochastic Codebook Structure / 338

12.5 Adaptive Codebook Search / 341

12.6 Stochastic Codebook Search / 344

12.7 Encoder and Decoder / 346

12.8 Summary and References / 349

Exercises / 350

13 VECTOR SUM EXCITED LINEAR PREDICTION 353

13.1 The Core Encoding Structure / 354

13.2 Search Strategies for Excitation Codebooks / 356

13.3 Excitation Codebook Searches / 357

13.4 Gain Related Procedures / 362

13.5 Encoder and Decoder / 366

13.6 Summary and References / 368

Exercises / 369

14 LOW-DELAY CELP 372

14.1 Strategies to Achieve Low Delay / 373

14.2 Basic Operational Principles / 375

14.3 Linear Prediction Analysis / 377

14.4 Excitation Codebook Search / 380

14.5 Backward Gain Adaptation / 385

14.6 Encoder and Decoder / 389

14.7 Codebook Training / 391

14.8 Summary and References / 393

Exercises / 394

15 VECTOR QUANTIZATION OF LINEAR PREDICTION COEFFICIENT 396

15.1 Correlation Among the LSFs / 396

15.2 Split VQ / 399

15.3 Multistage VQ / 403

15.4 Predictive VQ / 407

15.5 Summary and References / 418

Exercises / 419

16 ALGEBRAIC CELP 423

16.1 Algebraic Codebook Structure / 424

16.2 Adaptive Codebook / 425

16.3 Encoding and Decoding / 433

16.4 Algebraic Codebook Search / 437

16.5 Gain Quantization Using Conjugate VQ / 443

16.6 Other ACELP Standards / 446

16.7 Summary and References / 451

Exercises / 451

17 MIXED EXCITATION LINEAR PREDICTION 454

17.1 The MELP Speech Production Model / 455

17.2 Fourier Magnitudes / 456

17.3 Shaping Filters / 464

17.4 Pitch Period and Voicing Strength Estimation / 466

17.5 Encoder Operations / 474

17.6 Decoder Operations / 477

17.7 Summary and References / 481

Exercises / 482

18 SOURCE-CONTROLLED VARIABLE BIT-RATE CELP 486

18.1 Adaptive Rate Decision / 487

18.2 LP Analysis and LSF-Related Operations / 494

18.3 Decoding and Encoding / 496

18.4 Summary and References / 498

Exercises / 499

19 SPEECH QUALITY ASSESSMENT 501

19.1 The Scope of Quality and Measuring Conditions / 501

19.2 Objective Quality Measurements for Waveform Coders / 502

19.3 Subjective Quality Measures / 504

19.4 Improvements on Objective Quality Measures / 505

APPENDIX A MINIMUM-PHASE PROPERTY OF THE FORWARD PREDICTION-ERROR FILTER 507

APPENDIX B SOME PROPERTIES OF LINE SPECTRAL FREQUENCY 514

APPENDIX C RESEARCH DIRECTIONS IN SPEECH CODING 518

APPENDIX D LINEAR COMBINER FOR PATTERN CLASSIFICATION 522

APPENDIX E CELP: OPTIMAL LONG-TERM PREDICTOR TO MINIMIZE THE WEIGHTED DIFFERENCE 531

APPENDIX F REVIEW OF LINEAR ALGEBRA: ORTHOGONALITY, BASIS, LINEAR INDEPENDENCE, AND THE GRAM–SCHMIDT ALGORITHM 537

BIBLIOGRAPHY 542

INDEX 553
