MULTIMEDIA
IMAGE and VIDEO
PROCESSING
© 2001 by CRC Press LLC
IMAGE PROCESSING SERIES
Series Editor: Phillip A. Laplante
Forthcoming Titles
Adaptive Image Processing: A Computational Intelligence Perspective
Ling Guan, Hau-San Wong, and Stuart William Perry
Shape Analysis and Classification: Theory and Practice
Luciano da Fontoura Costa and Roberto Marcondes Cesar, Jr.
Published Titles
Image and Video Compression for Multimedia Engineering
Yun Q. Shi and Huiyang Sun
MULTIMEDIA
IMAGE and VIDEO
PROCESSING
Edited by
Ling Guan
Sun-Yuan Kung
Jan Larsen
CRC Press
Boca Raton London New York Washington, D.C.
This book contains information obtained from authentic and highly regarded sources. Reprinted material is
quoted with permission, and sources are indicated. A wide variety of references are listed. Reasonable efforts
have been made to publish reliable data and information, but the author and the publisher cannot assume
responsibility for the validity of all materials or for the consequences of their use.
Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval
system, without prior permission in writing from the publisher.
All rights reserved. Authorization to photocopy items for internal or personal use, or the personal or internal
use of specific clients, may be granted by CRC Press LLC, provided that $.50 per page photocopied is paid
directly to Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923 USA. The fee code for
users of the Transactional Reporting Service is ISBN 0-8493-3492-6/01/$0.00+$.50. The fee is subject to
change without notice. For organizations that have been granted a photocopy license by the CCC, a separate
system of payment has been arranged.
The consent of CRC Press LLC does not extend to copying for general distribution, for promotion, for creating
new works, or for resale. Specific permission must be obtained in writing from CRC Press LLC for such
copying.
Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation, without intent to infringe.
© 2001 by CRC Press LLC
No claim to original U.S. Government works
International Standard Book Number 0-8493-3492-6
Library of Congress Card Number 00-030341
Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
Printed on acid-free paper
Library of Congress Cataloging-in-Publication Data
Multimedia image and video processing / edited by Ling Guan, Sun-Yuan Kung, Jan Larsen.
p. cm.
Includes bibliographical references and index.
ISBN 0-8493-3492-6 (alk.)
1. Multimedia systems. 2. Image processing—Digital techniques. I. Guan, Ling. II.
Kung, S.Y. (Sun Yuan) III. Larsen, Jan.
QA76.575 2000
006.4′2—dc21 00-030341
Contents
1 Emerging Standards for Multimedia Applications
Tsuhan Chen
1.1 Introduction
1.2 Standards
1.3 Fundamentals of Video Coding
1.3.1 Transform Coding
1.3.2 Motion Compensation
1.3.3 Summary
1.4 Emerging Video and Multimedia Standards
1.4.1 H.263
1.4.2 H.26L
1.4.3 MPEG-4
1.4.4 MPEG-7
1.5 Standards for Multimedia Communication
1.6 Conclusion
References
2 An Efficient Algorithm and Architecture for Real-Time Perspective Image Warping
Yi Kang and Thomas S. Huang
2.1 Introduction
2.2 A Fast Algorithm for Perspective Transform
2.2.1 Perspective Transform
2.2.2 Existing Approximation Methods
2.2.3 Constant Denominator Method
2.2.4 Simulation Results
2.2.5 Sprite Warping Algorithm
2.3 Architecture for Sprite Warping
2.3.1 Implementation Issues
2.3.2 Memory Bandwidth Reduction
2.3.3 Architecture
2.4 Conclusion
References
3 Application-Specific Multimedia Processor Architecture
Yu Hen Hu and Surin Kittitornkun
3.1 Introduction
3.1.1 Requirements of Multimedia Signal Processing (MSP) Hardware
3.1.2 Strategies: Matching Micro-Architecture and Algorithm
3.2 Systolic Array Structure Micro-Architecture
3.2.1 Systolic Array Design Methodology
3.2.2 Array Structures for Motion Estimation
3.3 Dedicated Micro-Architecture
3.3.1 Design Methodologies for Dedicated Micro-Architecture
3.3.2 Feed-Forward Direct Synthesis: Fast Discrete Cosine Transform (DCT)
3.3.3 Feedback Direct Synthesis: Huffman Coding
3.4 Concluding Remarks
References
4 Superresolution of Images with Learned Multiple Reconstruction Kernels
Frank M. Candocia and Jose C. Principe
4.1 Introduction
4.2 An Approach to Superresolution
4.2.1 Comments and Observations
4.2.2 Finding Bases for Image Representation
4.2.3 Description of the Methodology
4.3 Image Acquisition Model
4.4 Relating Kernel-Based Approaches
4.4.1 Single Kernel
4.4.2 Family of Kernels
4.5 Description of the Superresolution Architecture
4.5.1 The Training Data
4.5.2 Clustering of Data
4.5.3 Neighborhood Association
4.5.4 Superresolving Images
4.6 Results
4.7 Issues and Notes
4.8 Conclusions
References
5 Image Processing Techniques for Multimedia Processing
N. Herodotou, K.N. Plataniotis, and A.N. Venetsanopoulos
5.1 Introduction
5.2 Color in Multimedia Processing
5.3 Color Image Filtering
5.3.1 Fuzzy Multichannel Filters
5.3.2 The Membership Functions
5.3.3 A Combined Fuzzy Directional and Fuzzy Median Filter
5.3.4 Application to Color Images
5.4 Color Image Segmentation
5.4.1 Histogram Thresholding
5.4.2 Postprocessing and Region Merging
5.4.3 Experimental Results
5.5 Facial Image Segmentation
5.5.1 Extraction of Skin-Tone Regions
5.5.2 Postprocessing
5.5.3 Shape and Color Analysis
5.5.4 Fuzzy Membership Functions
5.5.5 Meta-Data Features
5.5.6 Experimental Results
5.6 Conclusions
References
6 Intelligent Multimedia Processing
Ling Guan, Sun-Yuan Kung, and Jenq-Neng Hwang
6.1 Introduction
6.1.1 Neural Networks and Multimedia Processing
6.1.2 Focal Technical Issues Addressed in the Chapter
6.1.3 Organization of the Chapter
6.2 Useful Neural Network Approaches to Multimedia Data Representation, Classification, and Fusion
6.2.1 Multimedia Data Representation
6.2.2 Multimedia Data Detection and Classification
6.2.3 Hierarchical Fuzzy Neural Networks as Linear Fusion Networks
6.2.4 Temporal Models for Multimodal Conversion and Synchronization
6.3 Neural Networks for IMP Applications
6.3.1 Image Visualization and Segmentation
6.3.2 Personal Authentication and Recognition
6.3.3 Audio-to-Visual Conversion and Synchronization
6.3.4 Image and Video Retrieval, Browsing, and Content-Based Indexing
6.3.5 Interactive Human–Computer Vision
6.4 Open Issues, Future Research Directions, and Conclusions
References
7 On Independent Component Analysis for Multimedia Signals
Lars Kai Hansen, Jan Larsen, and Thomas Kolenda
7.1 Background
7.2 Principal and Independent Component Analysis
7.3 Likelihood Framework for Independent Component Analysis
7.3.1 Generalization and the Bias-Variance Dilemma
7.3.2 Noisy Mixing of White Sources
7.3.3 Separation Based on Time Correlation
7.3.4 Likelihood
7.4 Separation of Sound Signals
7.4.1 Sound Separation using PCA
7.4.2 Sound Separation using Molgedey–Schuster ICA
7.4.3 Sound Separation using Bell–Sejnowski ICA
7.4.4 Comparison
7.5 Separation of Image Mixtures
7.5.1 Image Segmentation using PCA
7.5.2 Image Segmentation using Molgedey–Schuster ICA
7.5.3 Discussion
7.6 ICA for Text Representation
7.6.1 Text Analysis
7.6.2 Latent Semantic Analysis — PCA
7.6.3 Latent Semantic Analysis — ICA
7.7 Conclusion
Acknowledgment
Appendix A
References
8 Image Analysis and Graphics for Multimedia Presentation
Tülay Adali and Yue Wang
8.1 Introduction
8.2 Image Analysis
8.2.1 Pixel Modeling
8.2.2 Model Identification
8.2.3 Context Modeling
8.2.4 Applications
8.3 Graphics Modeling
8.3.1 Surface Reconstruction
8.3.2 Physical Deformable Models
8.3.3 Deformable Surface–Spine Models
8.3.4 Numerical Implementation
8.3.5 Applications
References
9 Combined Motion Estimation and Transform Coding in Compressed Domain
Ut-Va Koc and K.J. Ray Liu
9.1 Introduction
9.2 Fully DCT-Based Motion-Compensated Video Coder Structure
9.3 DCT Pseudo-Phase Techniques
9.4 DCT-Based Motion Estimation
9.4.1 The DXT-ME Algorithm
9.4.2 Computational Issues and Complexity
9.4.3 Preprocessing
9.4.4 Adaptive Overlapping Approach
9.4.5 Simulation Results
9.5 Subpixel DCT Pseudo-Phase Techniques
9.5.1 Subpel Sinusoidal Orthogonality Principles
9.6 DCT-Based Subpixel Motion Estimation
9.6.1 DCT-Based Half-Pel Motion Estimation Algorithm (HDXT-ME)
9.6.2 DCT-Based Quarter-Pel Motion Estimation Algorithm (QDXT-ME and Q4DXT-ME)
9.6.3 Simulation Results
9.7 DCT-Based Motion Compensation
9.7.1 Integer-Pel DCT-Based Motion Compensation
9.7.2 Subpixel DCT-Based Motion Compensation
9.7.3 Simulation
9.8 Conclusion
References
10 Object-Based Analysis–Synthesis Coding Based on Moving 3D Objects
Jörn Ostermann
10.1 Introduction
10.2 Object-Based Analysis–Synthesis Coding
10.3 Source Models for OBASC
10.3.1 Camera Model
10.3.2 Scene Model
10.3.3 Illumination Model
10.3.4 Object Model
10.4 Image Analysis for 3D Object Models
10.4.1 Overview
10.4.2 Motion Estimation for R3D
10.4.3 MF Objects
10.5 Optimization of Parameter Coding for R3D and F3D
10.5.1 Motion Parameter Coding
10.5.2 2D Shape Parameter Coding
10.5.3 Coding of Component Separation
10.5.4 Flexible Shape Parameter Coding
10.5.5 Color Parameters
10.5.6 Control of Parameter Coding
10.6 Experimental Results
10.7 Conclusions
References
11 Rate-Distortion Techniques in Image and Video Coding
Aggelos K. Katsaggelos and Gerry Melnikov
11.1 The Multimedia Transmission Problem
11.2 The Operational Rate-Distortion Function
11.3 Problem Formulation
11.4 Mathematical Tools in RD Optimization
11.4.1 Lagrangian Optimization
11.4.2 Dynamic Programming
11.5 Applications of RD Methods
11.5.1 QT-Based Motion Estimation and Motion-Compensated Interpolation
11.5.2 QT-Based Video Encoding
11.5.3 Hybrid Fractal/DCT Image Compression
11.5.4 Shape Coding
11.6 Conclusions
References
12 Transform Domain Techniques for Multimedia Image and Video Coding
S. Suthaharan, S.W. Kim, H.R. Wu, and K.R. Rao
12.1 Coding Artifacts Reduction
12.1.1 Introduction
12.1.2 Methodology
12.1.3 Experimental Results
12.1.4 More Comparison
12.2 Image and Edge Detail Detection
12.2.1 Introduction
12.2.2 Methodology
12.2.3 Experimental Results
12.3 Summary
References
13 Video Modeling and Retrieval
Yi Zhang and Tat-Seng Chua
13.1 Introduction
13.2 Modeling and Representation of Video: Segmentation vs. Stratification
13.2.1 Practical Considerations
13.3 Design of a Video Retrieval System
13.3.1 Video Segmentation
13.3.2 Logging of Shots
13.3.3 Modeling the Context between Video Shots
13.4 Retrieval and Virtual Editing of Video
13.4.1 Video Shot Retrieval
13.4.2 Scene Association Retrieval
13.4.3 Virtual Editing
13.5 Implementation
13.6 Testing and Results
13.7 Conclusion
References
14 Image Retrieval in Frequency Domain Using DCT Coefficient Histograms
Jose A. Lay and Ling Guan
14.1 Introduction
14.1.1 Multimedia Data Compression
14.1.2 Multimedia Data Retrieval
14.1.3 About This Chapter
14.2 The DCT Coefficient Domain
14.2.1 A Matrix Description of the DCT
14.2.2 The DCT Coefficients in JPEG and MPEG Media
14.2.3 Energy Histograms of the DCT Coefficients
14.3 Frequency Domain Image/Video Retrieval Using DCT Coefficients
14.3.1 Content-Based Retrieval Model
14.3.2 Content-Based Search Processing Model
14.3.3 Perceiving the MPEG-7 Search Engine
14.3.4 Image Manipulation in the DCT Domain
14.3.5 The Energy Histogram Features
14.3.6 Proximity Evaluation
14.3.7 Experimental Results
14.4 Conclusions
References
15 Rapid Similarity Retrieval from Image and Video
Kim Shearer, Svetha Venkatesh, and Horst Bunke
15.1 Introduction
15.1.1 Definitions
15.2 Image Indexing and Retrieval
15.3 Encoding Video Indices
15.4 Decision Tree Algorithms
15.4.1 Decision Tree-Based LCSG Algorithm
15.5 Decomposition Network Algorithm
15.5.1 Decomposition-Based LCSG Algorithm
15.6 Results of Tests Over a Video Database
15.6.1 Decomposition Network Algorithm
15.6.2 Inexact Decomposition Algorithm
15.6.3 Decision Tree
15.6.4 Results of the LCSG Algorithms
15.7 Conclusion
References
16 Video Transcoding
Tzong-Der Wu, Jenq-Neng Hwang, and Ming-Ting Sun
16.1 Introduction
16.2 Pixel-Domain Transcoders
16.2.1 Introduction
16.2.2 Cascaded Video Transcoder
16.2.3 Removal of Frame Buffer and Motion Compensation Modules
16.2.4 Removal of IDCT Module
16.3 DCT Domain Transcoder
16.3.1 Introduction
16.3.2 Architecture of DCT Domain Transcoder
16.3.3 Full-Pixel Interpolation
16.3.4 Half-Pixel Interpolation
16.4 Frame-Skipping in Video Transcoding
16.4.1 Introduction
16.4.2 Interpolation of Motion Vectors
16.4.3 Search Range Adjustment
16.4.4 Dynamic Frame-Skipping
16.4.5 Simulation and Discussion
16.5 Multipoint Video Bridging
16.5.1 Introduction
16.5.2 Video Characteristics in Multipoint Video Conferencing
16.5.3 Results of Using the Coded Domain and Transcoding Approaches
16.6 Summary
References
17 Multimedia Distance Learning
Sachin G. Deshpande, Jenq-Neng Hwang, and Ming-Ting Sun
17.1 Introduction
17.2 Interactive Virtual Classroom Distance Learning Environment
17.2.1 Handling the Electronic Slide Presentation
17.2.2 Handling Handwritten Text
17.3 Multimedia Features for On-Demand Distance Learning Environment
17.3.1 Hypervideo Editor Tool
17.3.2 Automating the Multimedia Features Creation for On-Demand System
17.4 Issues in the Development of Multimedia Distance Learning
17.4.1 Error Recovery, Synchronization, and Delay Handling
17.4.2 Fast Encoding and Rate Control
17.4.3 Multicasting
17.4.4 Human Factors
17.5 Summary and Conclusion
References
18 A New Watermarking Technique for Multimedia Protection
Chun-Shien Lu, Shih-Kun Huang, Chwen-Jye Sze, and Hong-Yuan Mark Liao
18.1 Introduction
18.1.1 Watermarking
18.1.2 Overview
18.2 Human Visual System-Based Modulation
18.3 Proposed Watermarking Algorithms
18.3.1 Watermark Structures
18.3.2 The Hiding Process
18.3.3 Semipublic Authentication
18.4 Watermark Detection/Extraction
18.4.1 Gray-Scale Watermark Extraction
18.4.2 Binary Watermark Extraction
18.4.3 Dealing with Attacks Including Geometric Distortion
18.5 Analysis of Attacks Designed to Defeat HVS-Based Watermarking
18.6 Experimental Results
18.6.1 Results of Hiding a Gray-Scale Watermark
18.6.2 Results of Hiding a Binary Watermark
18.7 Conclusion
References
19 Telemedicine: A Multimedia Communication Perspective
Chang Wen Chen and Li Fan
19.1 Introduction
19.2 Telemedicine: Need for Multimedia Communication
19.3 Telemedicine over Various Multimedia Communication Links
19.3.1 Telemedicine via ISDN
19.3.2 Medical Image Transmission via ATM
19.3.3 Telemedicine via the Internet
19.3.4 Telemedicine via Mobile Wireless Communication
19.4 Conclusion
References
Preface
Multimedia is one of the most important aspects of the information era. Although there are
books dealing with various aspects of multimedia, a book that comprehensively covers the system,
processing, and application aspects of image and video data in a multimedia environment is
urgently needed. Written by experts in the field, this book serves that purpose.
Our goal is to provide in a single volume an introduction to a variety of topics in image and
video processing for multimedia. An edited compilation is an ideal format for treating a broad
spectrum of topics because it provides the opportunity for each topic to be written by an expert
in that field.
The topic of the book is the processing of images and video in a multimedia environment. It covers
the following subjects, arranged in two parts: (1) standards and fundamentals: standards,
multimedia architecture for image processing, multimedia-related image processing techniques,
and intelligent multimedia processing; (2) methodologies, techniques, and applications: image
and video coding, image and video storage and retrieval, digital video transmission, video
conferencing, watermarking, distance education, video on demand, and telemedicine.
The book begins with the existing standards for multimedia, discussing their impact on
multimedia image and video processing and pointing out possible directions for new standards.
The design of multimedia architectures is based on the standards and deals with the way
visual data is processed and transmitted at a more practical level. Current and new
architectures, and their pros and cons, are presented and discussed in Chapters 2 to 4.
Chapters 5 to 8 focus on conventional and intelligent image processing techniques relevant to
multimedia, including preprocessing, segmentation, and feature extraction techniques used
in coding, storage, retrieval, transmission, media fusion, and graphical interfaces.
Compression and coding of video and images are among the central issues in multimedia.
New developments in transform- and motion-based algorithms in the compressed domain,
content- and object-based algorithms, and rate–distortion-based encoding are presented in
Chapters 9 to 12.
Chapters 13 to 15 tackle content-based image and video retrieval. They cover video modeling
and retrieval, retrieval in the transform domain, indexing, parsing, and real-time aspects of
retrieval.
The last chapters of the book (Chapters 16 to 19) present new results in multimedia
application areas, including transcoding for multipoint video conferencing, distance education,
watermarking techniques for multimedia protection, and telemedicine.
Each chapter has been organized so that it can be covered in 1 to 2 weeks when this book is
used as a principal reference or text in a senior or graduate course at a university.
It is generally assumed that the reader has prior exposure to the fundamentals of image and
video processing. The chapters have been written with an emphasis on a tutorial presentation
so that the reader interested in pursuing a particular topic further will be able to obtain a solid
introduction to the topic through the appropriate chapter in this book. While the topics covered
are related, each chapter can be read and used independently of the others.
This book is primarily a result of the collective efforts of the chapter authors. We are
very grateful for their enthusiastic support, timely response, and willingness to incorporate
suggestions from us, from other contributing authors, and from a number of our colleagues
who served as reviewers.
Ling Guan
Sun-Yuan Kung
Jan Larsen