Multimedia Communication Technology

Pages: 858

File size: 25.3 MB

Format: PDF

Jens-Rainer Ohm

Multimedia Communication Technology

Springer-Verlag Berlin Heidelberg GmbH

Engineering ONLINE LIBRARY

springeronline.com

Jens-Rainer Ohm

Multimedia Communication Technology

Representation, Transmission and Identification of Multimedia Signals

With 441 Figures

Springer

Professor Jens-Rainer Ohm

RWTH Aachen University

Chair and Institute of Communications Engineering

Melatener Str. 23

52074 Aachen

Germany

Cataloging-in-Publication Data applied for

ISBN 978-3-642-62277-9 ISBN 978-3-642-18750-6 (eBook)

DOI 10.1007/978-3-642-18750-6

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in other ways, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under German Copyright Law.

springeronline.com

© Springer-Verlag Berlin Heidelberg 2004

Originally published by Springer-Verlag Berlin Heidelberg New York in 2004

Softcover reprint of the hardcover 1st edition 2004

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Digital data supplied by author

Cover-Design: Design & Production, Heidelberg

Printed on acid-free paper 62/3020 Rw 5 4 3 2 1 0

Preface

Information technology provides plenty of new ways to process, store, distribute and access audiovisual information. Beyond traditional broadcast and telephone channels and analog storage media like film or tapes, the emerging Internet, mobile networks and digital storage are going to revolutionize the terms of distribution and access. This development is ruled by the convergence of audiovisual media technology, information technology and telecommunications technology. Through the capabilities of digital processing, established media like photography, movies, television and radio are changing their roles and are becoming subsumed by new integrated services which are mobile, interactive, pervasive, usable from anywhere, giving freedom to play with, and penetrating everyday life. Multimedia communication establishes new forms of communication between people and between people and machines, and also allows communication between machines using audiovisual information or related feature parameters. Intelligent media interfaces are becoming increasingly important, and machine assistance in accessing media, and in acquiring, organizing, distributing, manipulating and consuming audiovisual information, will become inevitable in the future.

This book intends to provide a deep insight into important enabling technologies of multimedia communication systems: methods of multimedia signal processing, analysis, identification and recognition, and schemes for multimedia signal representation, compression and expression by features or other properties. All of these are currently lively and highly innovative areas. The book reviews state-of-the-art technology and its scientific foundations, but is primarily meant to support a systematic understanding of the underlying methods, algorithms and their theoretical foundations. It is strongly believed that this is the best approach to contribute to future improvements in the field.

In part, the book is a substantially upgraded translation of my German-language textbook on digital image and video coding, which was published in the mid-'90s. Since then, the progress made in the compression of audiovisual data has been breathtaking, and consequently the newest developments are reflected, including the Advanced Video Coding standard and motion-compensated Wavelet coding. The second basis for this book is my lectures on topics of multimedia communications, held regularly at RWTH Aachen University. These treat all aspects of image, video and audio compression, including networking interfaces, and also include multimedia signal identification and recognition. These latter aspects, topically related to the MPEG-7 multimedia content description standard, establish a profound basis for intelligent multimedia systems.

Most chapters are supplemented by homework problems, for which solutions are available from http://www.ient.rwth-aachen.de.

The book would not have been possible without the contributions of numerous students and many other people who have worked with me on topics of image, video and audio processing, encoding and recognition over more than 15 years. These are (in alphabetical order) Sven Bauer, Michael Becker, Markus Beermann, Sven Brandau, Nicole Brandenburg, Michael Brünig, Ferry Bunjamin, Kai Clüver, Emmanuelle Come, Holger Crysandt, Sila Ekmekci, Christoph Fehn, Ingo Feldmann, Oliver Fromm, Karsten Grüneberg, Karsten Grünheit, Jens Guther, Hafez Hadinejad, Konstantin Hanke, Guido Heising, Hans Dieter Holme, Michael Hoynck, Laetitia Hue, Ebroul Izquierdo, Peter Kauff, Jorg Kramer, Silko Kruse, Patrick Laurent, Thomas Ledworuski, Wolfram Liebsch, Oliver Lietz, Phuong Ma, Bela Makai, Claudia Mayer, Bernd Menser, Domingo Mery, Karsten Muller, Patrick Ndjiki-Nya, Bernhard Pasewaldt, Andreas Praatz, Lars Prokop, Oliver Rockinger, Katrin Rümmler, Thomas Rusert, Mihaela van der Schaar, Ansgar Schiffler, Oliver Schreer, Holger Schulz, Aljoscha Smolic, Frank Sperling, Peter Stammnitz, Jens Wellhausen, Mathias Wien and Detlef Zier. Please forgive me if I forgot anybody.

Very special thanks are also directed to my scientific mentors Peter Noll, Hans Dieter Luke and Irmfried Hartmann, all people of IENT and to my family.

Aachen, August 15, 2003

Jens-Rainer Ohm

Table of Contents

1 Introduction

1.1 Concepts and Terminology

1.1.1 Signal Representation by Source Coding

1.1.2 Optimization of Transmission

1.1.3 Content Identification

1.2 Signal Sources and Acquisition

1.3 Digital Representation of Multimedia Signals

1.3.1 Image and Video Signals

1.3.2 Speech and Audio Signals

1.4 Problems

Part A: Multimedia Signal Processing and Analysis

2 Signals and Sampling

2.1 Signals and Fourier Spectra

2.1.1 Spatial Signals and Two-dimensional Spectra

2.1.2 Spatio-temporal Signals

2.2 Sampling of Multimedia Signals

2.2.1 The Sampling Theorem

2.2.2 Separable Two-dimensional Sampling

2.2.3 Non-separable Two-dimensional Sampling

2.2.4 Sampling of Video Signals

2.3 Problems

3 Statistical Analysis of Multimedia Signals

3.1 Properties Related to Sample Statistics

3.2 Joint Statistical Properties

3.3 Spectral Properties

3.4 Statistical Modeling and Tests

3.5 Statistical Foundations of Information Theory

3.6 Problems

4 Linear Systems and Transforms

4.1 Two- and Multi-dimensional Linear Systems

4.1.1 Properties of Two-dimensional Filters

4.1.2 Frequency Transfer Functions of Multi-dimensional Filters

4.1.3 Image filtering by Matrix Operations

4.1.4 Realization of Two-dimensional Filters

4.2 Linear Prediction

4.2.1 One- and Two-dimensional Autoregressive Models

4.2.2 Linear Prediction

4.3 Linear Block Transforms

4.3.1 Orthogonal Basis Functions

4.3.2 Basis Functions of Orthogonal Transforms

4.3.3 Efficiency of Transforms

4.3.4 Fast Transform Algorithms

4.3.5 Transforms with Block Overlap

4.4 Filterbank Transforms

4.4.1 Decimation and Interpolation

4.4.2 Properties of Subband Filters

4.4.3 Implementation of Filterbank Structures

4.4.4 Wavelet Transform

4.4.5 Two- and Multi-dimensional Filter Banks

4.4.6 Pyramid Decomposition

4.5 Problems

5 Pre- and Postprocessing

5.1 Nonlinear Filters

5.1.1 Median Filters and Rank Order Filters

5.1.2 Morphological Filters

5.1.3 Polynomial Filters

5.2 Signal Enhancement

5.3 Amplitude-value transformations

5.3.1 Amplitude Mapping Functions

5.3.2 Probability Distribution Modification and Equalization

5.4 Interpolation

5.4.1 Zero- and First-order Interpolators

5.4.2 Interpolation using linear Filters

5.4.3 Interpolation based on Frequency Extension

5.4.4 Spline and Lagrangian Interpolation

5.4.5 Interpolation on Irregular 2D Grids

5.5 Problems

Part B: Content-related Multimedia Signal Analysis

6 Perceptual Properties of Vision and Hearing

6.1 Properties of Vision

6.1.1 Physiology of the Eye

6.1.2 Sensitivity Functions

6.1.3 Color Vision

6.2 Properties of Hearing

6.2.1 Physiology of the Ear

6.2.2 Sensitivity Functions

7 Features of Multimedia Signals

7.1 Color

7.1.1 Color Space Transformations

7.1.2 Representation of Color Features

7.2 Texture

7.2.1 Statistical Texture Analysis

7.2.2 Spectral Features of Texture

7.3 Edge Analysis

7.3.1 Edge Detection by Gradient Operators

7.3.2 Edge Characterization by second Derivative

7.3.3 Edge Finding and Consistency Analysis

7.3.4 Edge Model Fitting

7.3.5 Description and Analysis of Edge Properties

7.4 Contour and Shape Analysis

7.4.1 Contour fitting

7.4.2 Contour Description by Orientation and Curvature

7.4.3 Geometric Features and Binary Shape Features

7.4.4 Projection and geometric mapping

7.4.5 Moment analysis

7.4.6 Shape Analysis by Basis Functions

7.4.7 Three-dimensional Shapes

7.5 Correspondence analysis

7.6 Motion Analysis

7.6.1 Mapping of motion into the image plane

7.6.2 Motion Estimation by the Optical Flow Principle

7.6.3 Motion Estimation by Matching

7.6.4 Estimation of Parameters for Warping Grids

7.6.5 Estimation of non-translational Motion Parameters

7.6.6 Estimation of Motion Vector Fields at Object Boundaries

7.6.7 Analysis of 3D Motion

7.7 Disparity and Depth Analysis

7.7.1 Central Projection in Stereoscopic and Multiple-camera Systems

7.7.2 Epipolar Geometry for arbitrary Camera Configurations

7.8 Mosaics

7.9 Face Detection and Description

7.10 Audio Signal Features

7.10.1 Basic Features

7.10.2 Speech Signal Analysis

7.10.3 Musical Signals, Instruments and Sounds

7.10.4 Room Properties

7.11 Problems

8 Signal and Parameter Estimation

8.1 Observation and Degradation Models

8.2 Estimation based on linear filters

8.2.1 Inverse Filtering

8.2.2 Wiener Filtering

8.3 Least Squares Estimation

8.4 Singular Value Decomposition

8.5 ML and MAP Estimation

8.6 Kalman Estimation

8.7 Outlier rejection in estimation

8.8 Problems

9 Feature Transforms and Classification

9.1 Feature Transforms

9.1.1 Eigenvector Analysis of Feature Value Sets

9.1.2 Independent Component Analysis

9.1.3 Generalized Hough Transform

9.2 Feature Value Normalization and Weighting

9.2.1 Normalization of Feature Values

9.2.2 Simple Distance Metrics

9.2.3 Distance Metrics related to Statistical Distributions

9.2.4 Distance Metrics based on Class Features

9.2.5 Reliability measures

9.3 Feature-based Comparison

9.4 Feature-based Classification

9.4.1 Linear Classification of two Classes

9.4.2 Generalization of Linear Classification

9.4.3 Nearest-neighbor and Cluster-based Methods

9.4.4 Maximum a Posteriori (Bayes) Classification

9.4.5 Artificial Neural Networks

9.4.6 Hidden Markov Models

9.5 Problems

10 Signal Decomposition

10.1 Segmentation of Image Signals

10.1.1 Pixel-based Segmentation

10.1.2 Region-based Methods

10.1.3 Texture Elimination

10.1.4 Relaxation Methods

10.1.5 Image Region Labeling

10.2 Segmentation of Video Signals

10.2.1 Temporal Segmentation for Scene Changes

10.2.2 Combination of Spatial and Temporal Segmentation

10.2.3 Segmentation of Objects based on Motion Information

10.3 Segmentation and Decomposition of Audio Signals

10.4 Problems

Part C: Coding of Multimedia Signals

11 Quantization and Coding

11.1 Scalar Quantization

11.2 Coding Theory

11.2.1 Source Coding Theorem and Rate Distortion Function

11.2.2 Rate-Distortion Function for Correlated Signals

11.2.3 Rate Distortion Function for Multi-dimensional Signals

11.3 Rate-Distortion Optimization of Quantizers

11.4 Entropy Coding

11.4.1 Properties of Variable-length Codes

11.4.2 Huffman Codes

11.4.3 Systematic Variable-length Codes

11.4.4 Arithmetic Coding

11.4.5 Context-dependent Entropy Coding

11.4.6 Adaptive Entropy Coding

11.4.7 Entropy Coding and Transmission Errors

11.4.8 Run-length Coding

11.4.9 Lempel-Ziv Coding

11.5 Vector Quantization

11.5.1 Basic Principles of Vector Quantization

11.5.2 Vector Quantization with Uniform Codebooks

11.5.3 Vector Quantization with Non-uniform Codebooks

11.5.4 Structured Codebooks

11.5.5 Rate-constrained Vector Quantization

11.6 Sliding Block Coding

11.6.1 Trellis Coding

11.6.2 Tree Coding

11.7 Problems

12 Still Image Coding

12.1 Compression of Binary Images

12.2 Vector Quantization of Images

12.3 Predictive Coding

12.3.1 DPCM Systems

12.3.2 Predictor filters in 2D DPCM

12.3.3 Quantization and Encoding of Prediction Errors

12.3.4 Error propagation in DPCM

12.4 Transform Coding

12.4.1 Block Transform Coding

12.4.2 Subband and Wavelet Transform Coding

12.4.3 Vector Quantization of Transform Coefficients

12.4.4 Adaptation of transform bases to signal properties

12.4.5 Transform coding and transmission losses

12.5 Fractal Coding

12.5.1 Principles of Fractal Transforms

12.5.2 Collage Theorem

12.5.3 Fractal Decoding

12.6 Region-based coding

12.6.1 Binary Shape Coding

12.6.2 Contour shape coding

12.6.3 Coding within arbitrary-shaped Regions

12.7 Problems

13 Video Coding

13.1 Methods without Motion Compensation

13.1.1 Frame Replenishment

13.1.2 3D Transform and Subband coding

13.2 Hybrid Video Coding

13.2.1 Motion-compensated Hybrid Coders

13.2.2 Characteristics of Interframe Prediction Error Signals

13.2.3 Quantization error feedback and error propagation

13.2.4 Forward, Backward and Multiframe Prediction

13.2.5 Bi-directional Prediction

13.2.6 Improved Methods of motion compensation

13.2.7 Hybrid Coding of Interlaced Video Signals

13.2.8 Scalable Hybrid Coding

13.2.9 Multiple-description Video Coding

13.2.10 Optimization of Hybrid Encoders

13.3 MC Prediction Coding using the Wavelet Transform

13.3.1 Wavelet Transform in the Prediction Loop

13.3.2 Frequency Coding with In-band Motion Compensation

13.4 Spatio-temporal Frequency Coding with MC

13.4.1 Temporal-axis Haar Filters with MC

13.4.2 Temporal-axis Lifting Filters for arbitrary MC

13.4.3 Improvements on Motion Compensation

13.4.4 Quantization and Encoding of 3D Wavelet Coefficients

13.4.5 Delay and Complexity of 3D Wavelet Coders

13.5 Encoding of Motion Parameters

13.5.1 Spatial Contexts in Motion Coding

13.5.2 Temporal Contexts in Motion Coding

13.5.3 Fractal Video Coding

13.6 Problems

14 Audio Coding

14.1 Coding of Speech Signals

14.2 Waveform Coding of Audio signals

14.3 Parametric Coding of Audio and Sound Signals

Part D: Applications and Standards

15 Transmission and Storage

15.1 Convergence of Digital Multimedia Services

15.2 Adaptation to Channel Characteristics

15.2.1 Rate and Transmission Control

15.2.2 Error Control

15.3 Digital Broadcast

15.4 Media Streaming

15.5 Content-based Media Access

15.6 Content Protection

16 Signal Composition, Rendering and Presentation

16.1 Composition and Mixing of Visual Signals

16.2 Warping and Morphing

16.3 Viewpoint Adaptation

16.4 Frame Rate Conversion

16.5 Rendering of Image and Video Signals

16.6 Composition and Rendering of Audio Signals

17 Multimedia Representation Standards

17.1 Interoperability and Compatibility

17.2 Definitions at Systems Level

17.3 Still Image Coding

17.3.1 The JBIG Standards

17.3.2 The JPEG Standards

17.3.3 MPEG-4 Still Texture Coding

17.4 Video Coding

17.4.1 ITU-T Recommendations H.261 and H.263

17.4.2 MPEG-1 and MPEG-2

17.4.3 MPEG-4 Visual

17.4.4 H.264/MPEG-4 Part 10 Advanced Video Coding (AVC)

17.5 Audio Coding

17.5.1 Speech Coding

17.5.2 Music and Sound Coding

17.6 Multimedia Content Description Standard MPEG-7

17.6.1 Elements of MPEG-7 Descriptions

17.6.2 Generic Multimedia Description Concepts

17.6.3 Visual Descriptors

17.6.4 Audio Descriptors

17.7 Multimedia Framework MPEG-21

Appendices

A Quality Measurement

A.1 Signal Quality

A.1.1 Objective Signal Quality Measurements

A.1.2 Subjective Assessment

A.2 Classification Quality

B Vector and Matrix Algebra

C Symbols and Variables

D Acronyms

References

Index

Part A: Multimedia Signal Processing and Analysis
