Multimedia Image and Video Processing

MULTIMEDIA

IMAGE and VIDEO

PROCESSING

IMAGE PROCESSING SERIES

Series Editor: Phillip A. Laplante

Forthcoming Titles

Adaptive Image Processing: A Computational Intelligence Perspective

Ling Guan, Hau-San Wong, and Stuart William Perry

Shape Analysis and Classification: Theory and Practice

Luciano da Fontoura Costa and Roberto Marcondes Cesar, Jr.

Published Titles

Image and Video Compression for Multimedia Engineering

Yun Q. Shi and Huiyang Sun

Boca Raton London New York Washington, D.C.

CRC Press

Edited by

Ling Guan

Sun-Yuan Kung

Jan Larsen

This book contains information obtained from authentic and highly regarded sources. Reprinted material is

quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts

have been made to publish reliable data and information, but the author and the publisher cannot assume

responsibility for the validity of all materials or for the consequences of their use.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or

mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval

system, without prior permission in writing from the publisher.

Authorization to photocopy items for internal or personal use, or the personal or internal use of specific clients, may be granted by CRC Press LLC, provided that $.50 per page photocopied is paid

directly to Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923 USA. The fee code for

users of the Transactional Reporting Service is ISBN 0-8493-3492-6/01/$0.00+$.50. The fee is subject to

change without notice. For organizations that have been granted a photocopy license by the CCC, a separate

system of payment has been arranged.

The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating

new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such

copying.

Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used

only for identification and explanation, without intent to infringe.

No claim to original U.S. Government works

International Standard Book Number 0-8493-3492-6

Library of Congress Card Number 00-030341

Printed in the United States of America 1 2 3 4 5 6 7 8 9 0

Printed on acid-free paper

Library of Congress Cataloging-in-Publication Data

Multimedia image and video processing / edited by Ling Guan, Sun-Yuan Kung, Jan Larsen.

p. cm.

Includes bibliographical references and index.

ISBN 0-8493-3492-6 (alk.)

1. Multimedia systems. 2. Image processing—Digital techniques. I. Guan, Ling. II.

Kung, S.Y. (Sun Yuan) III. Larsen, Jan.

QA76.575 2000

006.4′2—dc21 00-030341

Contents

1 Emerging Standards for Multimedia Applications

Tsuhan Chen

1.1 Introduction

1.2 Standards

1.3 Fundamentals of Video Coding

1.3.1 Transform Coding

1.3.2 Motion Compensation

1.3.3 Summary

1.4 Emerging Video and Multimedia Standards

1.4.1 H.263

1.4.2 H.26L

1.4.3 MPEG-4

1.4.4 MPEG-7

1.5 Standards for Multimedia Communication

1.6 Conclusion

References

2 An Efficient Algorithm and Architecture for Real-Time Perspective Image Warping

Yi Kang and Thomas S. Huang

2.1 Introduction

2.2 A Fast Algorithm for Perspective Transform

2.2.1 Perspective Transform

2.2.2 Existing Approximation Methods

2.2.3 Constant Denominator Method

2.2.4 Simulation Results

2.2.5 Sprite Warping Algorithm

2.3 Architecture for Sprite Warping

2.3.1 Implementation Issues

2.3.2 Memory Bandwidth Reduction

2.3.3 Architecture

2.4 Conclusion

References

3 Application-Specific Multimedia Processor Architecture

Yu Hen Hu and Surin Kittitornkun

3.1 Introduction

3.1.1 Requirements of Multimedia Signal Processing (MSP) Hardware

3.1.2 Strategies: Matching Micro-Architecture and Algorithm

3.2 Systolic Array Structure Micro-Architecture

3.2.1 Systolic Array Design Methodology

3.2.2 Array Structures for Motion Estimation

3.3 Dedicated Micro-Architecture

3.3.1 Design Methodologies for Dedicated Micro-Architecture

3.3.2 Feed-Forward Direct Synthesis: Fast Discrete Cosine Transform (DCT)

3.3.3 Feedback Direct Synthesis: Huffman Coding

3.4 Concluding Remarks

References

4 Superresolution of Images with Learned Multiple Reconstruction Kernels

Frank M. Candocia and Jose C. Principe

4.1 Introduction

4.2 An Approach to Superresolution

4.2.1 Comments and Observations

4.2.2 Finding Bases for Image Representation

4.2.3 Description of the Methodology

4.3 Image Acquisition Model

4.4 Relating Kernel-Based Approaches

4.4.1 Single Kernel

4.4.2 Family of Kernels

4.5 Description of the Superresolution Architecture

4.5.1 The Training Data

4.5.2 Clustering of Data

4.5.3 Neighborhood Association

4.5.4 Superresolving Images

4.6 Results

4.7 Issues and Notes

4.8 Conclusions

References

5 Image Processing Techniques for Multimedia Processing

N. Herodotou, K.N. Plataniotis, and A.N. Venetsanopoulos

5.1 Introduction

5.2 Color in Multimedia Processing

5.3 Color Image Filtering

5.3.1 Fuzzy Multichannel Filters

5.3.2 The Membership Functions

5.3.3 A Combined Fuzzy Directional and Fuzzy Median Filter

5.3.4 Application to Color Images

5.4 Color Image Segmentation

5.4.1 Histogram Thresholding

5.4.2 Postprocessing and Region Merging

5.4.3 Experimental Results

5.5 Facial Image Segmentation

5.5.1 Extraction of Skin-Tone Regions

5.5.2 Postprocessing

5.5.3 Shape and Color Analysis

5.5.4 Fuzzy Membership Functions

5.5.5 Meta-Data Features

5.5.6 Experimental Results

5.6 Conclusions

References

6 Intelligent Multimedia Processing

Ling Guan, Sun-Yuan Kung, and Jenq-Neng Hwang

6.1 Introduction

6.1.1 Neural Networks and Multimedia Processing

6.1.2 Focal Technical Issues Addressed in the Chapter

6.1.3 Organization of the Chapter

6.2 Useful Neural Network Approaches to Multimedia Data Representation, Classification, and Fusion

6.2.1 Multimedia Data Representation

6.2.2 Multimedia Data Detection and Classification

6.2.3 Hierarchical Fuzzy Neural Networks as Linear Fusion Networks

6.2.4 Temporal Models for Multimodal Conversion and Synchronization

6.3 Neural Networks for IMP Applications

6.3.1 Image Visualization and Segmentation

6.3.2 Personal Authentication and Recognition

6.3.3 Audio-to-Visual Conversion and Synchronization

6.3.4 Image and Video Retrieval, Browsing, and Content-Based Indexing

6.3.5 Interactive Human–Computer Vision

6.4 Open Issues, Future Research Directions, and Conclusions

References

7 On Independent Component Analysis for Multimedia Signals

Lars Kai Hansen, Jan Larsen, and Thomas Kolenda

7.1 Background

7.2 Principal and Independent Component Analysis

7.3 Likelihood Framework for Independent Component Analysis

7.3.1 Generalization and the Bias-Variance Dilemma

7.3.2 Noisy Mixing of White Sources

7.3.3 Separation Based on Time Correlation

7.3.4 Likelihood

7.4 Separation of Sound Signals

7.4.1 Sound Separation using PCA

7.4.2 Sound Separation using Molgedey–Schuster ICA

7.4.3 Sound Separation using Bell–Sejnowski ICA

7.4.4 Comparison

7.5 Separation of Image Mixtures

7.5.1 Image Segmentation using PCA

7.5.2 Image Segmentation using Molgedey–Schuster ICA

7.5.3 Discussion

7.6 ICA for Text Representation

7.6.1 Text Analysis

7.6.2 Latent Semantic Analysis — PCA

7.6.3 Latent Semantic Analysis — ICA

7.7 Conclusion

Acknowledgment

Appendix A

References

8 Image Analysis and Graphics for Multimedia Presentation

Tülay Adali and Yue Wang

8.1 Introduction

8.2 Image Analysis

8.2.1 Pixel Modeling

8.2.2 Model Identification

8.2.3 Context Modeling

8.2.4 Applications

8.3 Graphics Modeling

8.3.1 Surface Reconstruction

8.3.2 Physical Deformable Models

8.3.3 Deformable Surface–Spine Models

8.3.4 Numerical Implementation

8.3.5 Applications

References

9 Combined Motion Estimation and Transform Coding in Compressed Domain

Ut-Va Koc and K.J. Ray Liu

9.1 Introduction

9.2 Fully DCT-Based Motion-Compensated Video Coder Structure

9.3 DCT Pseudo-Phase Techniques

9.4 DCT-Based Motion Estimation

9.4.1 The DXT-ME Algorithm

9.4.2 Computational Issues and Complexity

9.4.3 Preprocessing

9.4.4 Adaptive Overlapping Approach

9.4.5 Simulation Results

9.5 Subpixel DCT Pseudo-Phase Techniques

9.5.1 Subpel Sinusoidal Orthogonality Principles

9.6 DCT-Based Subpixel Motion Estimation

9.6.1 DCT-Based Half-Pel Motion Estimation Algorithm (HDXT-ME)

9.6.2 DCT-Based Quarter-Pel Motion Estimation Algorithm (QDXT-ME and Q4DXT-ME)

9.6.3 Simulation Results

9.7 DCT-Based Motion Compensation

9.7.1 Integer-Pel DCT-Based Motion Compensation

9.7.2 Subpixel DCT-Based Motion Compensation

9.7.3 Simulation

9.8 Conclusion

References

10 Object-Based Analysis–Synthesis Coding Based on Moving 3D Objects

Jörn Ostermann

10.1 Introduction

10.2 Object-Based Analysis–Synthesis Coding

10.3 Source Models for OBASC

10.3.1 Camera Model

10.3.2 Scene Model

10.3.3 Illumination Model

10.3.4 Object Model

10.4 Image Analysis for 3D Object Models

10.4.1 Overview

10.4.2 Motion Estimation for R3D

10.4.3 MF Objects

10.5 Optimization of Parameter Coding for R3D and F3D

10.5.1 Motion Parameter Coding

10.5.2 2D Shape Parameter Coding

10.5.3 Coding of Component Separation

10.5.4 Flexible Shape Parameter Coding

10.5.5 Color Parameters

10.5.6 Control of Parameter Coding

10.6 Experimental Results

10.7 Conclusions

References

11 Rate-Distortion Techniques in Image and Video Coding

Aggelos K. Katsaggelos and Gerry Melnikov

11.1 The Multimedia Transmission Problem

11.2 The Operational Rate-Distortion Function

11.3 Problem Formulation

11.4 Mathematical Tools in RD Optimization

11.4.1 Lagrangian Optimization

11.4.2 Dynamic Programming

11.5 Applications of RD Methods

11.5.1 QT-Based Motion Estimation and Motion-Compensated Interpolation

11.5.2 QT-Based Video Encoding

11.5.3 Hybrid Fractal/DCT Image Compression

11.5.4 Shape Coding

11.6 Conclusions

References

12 Transform Domain Techniques for Multimedia Image and Video Coding

S. Suthaharan, S.W. Kim, H.R. Wu, and K.R. Rao

12.1 Coding Artifacts Reduction

12.1.1 Introduction

12.1.2 Methodology

12.1.3 Experimental Results

12.1.4 More Comparison

12.2 Image and Edge Detail Detection

12.2.1 Introduction

12.2.2 Methodology

12.2.3 Experimental Results

12.3 Summary

References

13 Video Modeling and Retrieval

Yi Zhang and Tat-Seng Chua

13.1 Introduction

13.2 Modeling and Representation of Video: Segmentation vs. Stratification

13.2.1 Practical Considerations

13.3 Design of a Video Retrieval System

13.3.1 Video Segmentation

13.3.2 Logging of Shots

13.3.3 Modeling the Context between Video Shots

13.4 Retrieval and Virtual Editing of Video

13.4.1 Video Shot Retrieval

13.4.2 Scene Association Retrieval

13.4.3 Virtual Editing

13.5 Implementation

13.6 Testing and Results

13.7 Conclusion

References

14 Image Retrieval in Frequency Domain Using DCT Coefficient Histograms

Jose A. Lay and Ling Guan

14.1 Introduction

14.1.1 Multimedia Data Compression

14.1.2 Multimedia Data Retrieval

14.1.3 About This Chapter

14.2 The DCT Coefficient Domain

14.2.1 A Matrix Description of the DCT

14.2.2 The DCT Coefficients in JPEG and MPEG Media

14.2.3 Energy Histograms of the DCT Coefficients

14.3 Frequency Domain Image/Video Retrieval Using DCT Coefficients

14.3.1 Content-Based Retrieval Model

14.3.2 Content-Based Search Processing Model

14.3.3 Perceiving the MPEG-7 Search Engine

14.3.4 Image Manipulation in the DCT Domain

14.3.5 The Energy Histogram Features

14.3.6 Proximity Evaluation

14.3.7 Experimental Results

14.4 Conclusions

References

15 Rapid Similarity Retrieval from Image and Video

Kim Shearer, Svetha Venkatesh, and Horst Bunke

15.1 Introduction

15.1.1 Definitions

15.2 Image Indexing and Retrieval

15.3 Encoding Video Indices

15.4 Decision Tree Algorithms

15.4.1 Decision Tree-Based LCSG Algorithm

15.5 Decomposition Network Algorithm

15.5.1 Decomposition-Based LCSG Algorithm

15.6 Results of Tests Over a Video Database

15.6.1 Decomposition Network Algorithm

15.6.2 Inexact Decomposition Algorithm

15.6.3 Decision Tree

15.6.4 Results of the LCSG Algorithms

15.7 Conclusion

References

16 Video Transcoding

Tzong-Der Wu, Jenq-Neng Hwang, and Ming-Ting Sun

16.1 Introduction

16.2 Pixel-Domain Transcoders

16.2.1 Introduction

16.2.2 Cascaded Video Transcoder

16.2.3 Removal of Frame Buffer and Motion Compensation Modules

16.2.4 Removal of IDCT Module

16.3 DCT Domain Transcoder

16.3.1 Introduction

16.3.2 Architecture of DCT Domain Transcoder

16.3.3 Full-Pixel Interpolation

16.3.4 Half-Pixel Interpolation

16.4 Frame-Skipping in Video Transcoding

16.4.1 Introduction

16.4.2 Interpolation of Motion Vectors

16.4.3 Search Range Adjustment

16.4.4 Dynamic Frame-Skipping

16.4.5 Simulation and Discussion

16.5 Multipoint Video Bridging

16.5.1 Introduction

16.5.2 Video Characteristics in Multipoint Video Conferencing

16.5.3 Results of Using the Coded Domain and Transcoding Approaches

16.6 Summary

References

17 Multimedia Distance Learning

Sachin G. Deshpande, Jenq-Neng Hwang, and Ming-Ting Sun

17.1 Introduction

17.2 Interactive Virtual Classroom Distance Learning Environment

17.2.1 Handling the Electronic Slide Presentation

17.2.2 Handling Handwritten Text

17.3 Multimedia Features for On-Demand Distance Learning Environment

17.3.1 Hypervideo Editor Tool

17.3.2 Automating the Multimedia Features Creation for On-Demand System

17.4 Issues in the Development of Multimedia Distance Learning

17.4.1 Error Recovery, Synchronization, and Delay Handling

17.4.2 Fast Encoding and Rate Control

17.4.3 Multicasting

17.4.4 Human Factors

17.5 Summary and Conclusion

References

18 A New Watermarking Technique for Multimedia Protection

Chun-Shien Lu, Shih-Kun Huang, Chwen-Jye Sze, and Hong-Yuan Mark Liao

18.1 Introduction

18.1.1 Watermarking

18.1.2 Overview

18.2 Human Visual System-Based Modulation

18.3 Proposed Watermarking Algorithms

18.3.1 Watermark Structures

18.3.2 The Hiding Process

18.3.3 Semipublic Authentication

18.4 Watermark Detection/Extraction

18.4.1 Gray-Scale Watermark Extraction

18.4.2 Binary Watermark Extraction

18.4.3 Dealing with Attacks Including Geometric Distortion

18.5 Analysis of Attacks Designed to Defeat HVS-Based Watermarking

18.6 Experimental Results

18.6.1 Results of Hiding a Gray-Scale Watermark

18.6.2 Results of Hiding a Binary Watermark

18.7 Conclusion

References

19 Telemedicine: A Multimedia Communication Perspective

Chang Wen Chen and Li Fan

19.1 Introduction

19.2 Telemedicine: Need for Multimedia Communication

19.3 Telemedicine over Various Multimedia Communication Links

19.3.1 Telemedicine via ISDN

19.3.2 Medical Image Transmission via ATM

19.3.3 Telemedicine via the Internet

19.3.4 Telemedicine via Mobile Wireless Communication

19.4 Conclusion

References

Preface

Multimedia is one of the most important aspects of the information era. Although there are

books dealing with various aspects of multimedia, a book comprehensively covering system,

processing, and application aspects of image and video data in a multimedia environment is

urgently needed. Written by experts in the field, this book serves that purpose.

Our goal is to provide in a single volume an introduction to a variety of topics in image and

video processing for multimedia. An edited compilation is an ideal format for treating a broad

spectrum of topics because it provides the opportunity for each topic to be written by an expert

in that field.

The topic of the book is processing images and videos in a multimedia environment. It covers

the following subjects arranged in two parts: (1) standards and fundamentals: standards, multimedia architecture for image processing, multimedia-related image processing techniques,

and intelligent multimedia processing; (2) methodologies, techniques, and applications: image and video coding, image and video storage and retrieval, digital video transmission, video

conferencing, watermarking, distance education, video on demand, and telemedicine.

The book begins with the existing standards for multimedia, discussing their impact on

multimedia image and video processing and pointing out possible directions for new standards.

The design of multimedia architectures is based on the standards. It addresses, at a more

practical level, the way visual data is processed and transmitted. Current and new

architectures, and their pros and cons, are presented and discussed in Chapters 2 to 4.

Chapters 5 to 8 focus on conventional and intelligent image processing techniques relevant to

multimedia, including preprocessing, segmentation, and feature extraction techniques utilized

in coding, storage and retrieval, transmission, media fusion, and graphical interfaces.

Compression and coding of video and images are among the central issues in multimedia.

New developments in transform- and motion-based algorithms in the compressed domain,

content- and object-based algorithms, and rate–distortion-based encoding are presented in

Chapters 9 to 12.

Chapters 13 to 15 tackle content-based image and video retrieval. They cover video modeling

and retrieval, retrieval in the transform domain, indexing, parsing, and real-time aspects of

retrieval.

The last chapters of the book (Chapters 16 to 19) present new results in multimedia application areas, including transcoding for multipoint video conferencing, distance education,

watermarking techniques for multimedia processing, and telemedicine.

Each chapter has been organized so that it can be covered in 1 to 2 weeks when this book is

used as a principal reference or text in a senior or graduate course at a university.

It is generally assumed that the reader has prior exposure to the fundamentals of image and

video processing. The chapters have been written with an emphasis on a tutorial presentation

so that the reader interested in pursuing a particular topic further will be able to obtain a solid

introduction to the topic through the appropriate chapter in this book. While the topics covered

are related, each chapter can be read and used independently of the others.

This book is primarily a result of the collective efforts of the chapter authors. We are

very grateful for their enthusiastic support, timely response, and willingness to incorporate

suggestions from us, from other contributing authors, and from a number of our colleagues

who served as reviewers.

Ling Guan

Sun-Yuan Kung

Jan Larsen
