Feature Extraction & Image Processing for Computer Vision
We would like to dedicate this book to our parents.
To Gloria and to Joaquin Aguado,
and to Brenda and the late Ian Nixon.
Feature Extraction &
Image Processing for
Computer Vision
Third edition
Mark S. Nixon
Alberto S. Aguado
AMSTERDAM • BOSTON • HEIDELBERG • LONDON
NEW YORK • OXFORD • PARIS • SAN DIEGO
SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Academic Press is an imprint of Elsevier
The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, UK
84 Theobald’s Road, London WC1X 8RR, UK
First edition 2002
Reprinted 2004, 2005
Second edition 2008
Third edition 2012
Copyright © 2012 Professor Mark S. Nixon and Alberto S. Aguado. Published by Elsevier Ltd.
All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, recording, or any information storage and retrieval
system, without permission in writing from the publisher. Details on how to seek permission, further
information about the Publisher’s permissions policies and our arrangements with organizations
such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our
website: www.elsevier.com/permissions
This book and the individual contributions contained in it are protected under copyright by the
Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience
broaden our understanding, changes in research methods, professional practices, or medical treatment
may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating
and using any information, methods, compounds, or experiments described herein. In using such
information or methods they should be mindful of their own safety and the safety of others, including
parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume
any liability for any injury and/or damage to persons or property as a matter of products liability,
negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas
contained in the material herein.
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
ISBN: 978-0-123-96549-3
For information on all Academic Press publications visit
our website at books.elsevier.com
Printed and bound in the UK
12 10 9 8 7 6 5 4 3 2 1
Contents
Preface
CHAPTER 1 Introduction
  1.1 Overview
  1.2 Human and computer vision
  1.3 The human vision system
    1.3.1 The eye
    1.3.2 The neural system
    1.3.3 Processing
  1.4 Computer vision systems
    1.4.1 Cameras
    1.4.2 Computer interfaces
    1.4.3 Processing an image
  1.5 Mathematical systems
    1.5.1 Mathematical tools
    1.5.2 Hello Matlab, hello images!
    1.5.3 Hello Mathcad!
  1.6 Associated literature
    1.6.1 Journals, magazines, and conferences
    1.6.2 Textbooks
    1.6.3 The Web
  1.7 Conclusions
  1.8 References
CHAPTER 2 Images, Sampling, and Frequency Domain Processing
  2.1 Overview
  2.2 Image formation
  2.3 The Fourier transform
  2.4 The sampling criterion
  2.5 The discrete Fourier transform
    2.5.1 1D transform
    2.5.2 2D transform
  2.6 Other properties of the Fourier transform
    2.6.1 Shift invariance
    2.6.2 Rotation
    2.6.3 Frequency scaling
    2.6.4 Superposition (linearity)
  2.7 Transforms other than Fourier
    2.7.1 Discrete cosine transform
    2.7.2 Discrete Hartley transform
    2.7.3 Introductory wavelets
    2.7.4 Other transforms
  2.8 Applications using frequency domain properties
  2.9 Further reading
  2.10 References
CHAPTER 3 Basic Image Processing Operations
  3.1 Overview
  3.2 Histograms
  3.3 Point operators
    3.3.1 Basic point operations
    3.3.2 Histogram normalization
    3.3.3 Histogram equalization
    3.3.4 Thresholding
  3.4 Group operations
    3.4.1 Template convolution
    3.4.2 Averaging operator
    3.4.3 On different template size
    3.4.4 Gaussian averaging operator
    3.4.5 More on averaging
  3.5 Other statistical operators
    3.5.1 Median filter
    3.5.2 Mode filter
    3.5.3 Anisotropic diffusion
    3.5.4 Force field transform
    3.5.5 Comparison of statistical operators
  3.6 Mathematical morphology
    3.6.1 Morphological operators
    3.6.2 Gray-level morphology
    3.6.3 Gray-level erosion and dilation
    3.6.4 Minkowski operators
  3.7 Further reading
  3.8 References
CHAPTER 4 Low-Level Feature Extraction (including edge detection)
  4.1 Overview
  4.2 Edge detection
    4.2.1 First-order edge-detection operators
    4.2.2 Second-order edge-detection operators
    4.2.3 Other edge-detection operators
    4.2.4 Comparison of edge-detection operators
    4.2.5 Further reading on edge detection
  4.3 Phase congruency
  4.4 Localized feature extraction
    4.4.1 Detecting image curvature (corner extraction)
    4.4.2 Modern approaches: region/patch analysis
  4.5 Describing image motion
    4.5.1 Area-based approach
    4.5.2 Differential approach
    4.5.3 Further reading on optical flow
  4.6 Further reading
  4.7 References
CHAPTER 5 High-Level Feature Extraction: Fixed Shape Matching
  5.1 Overview
  5.2 Thresholding and subtraction
  5.3 Template matching
    5.3.1 Definition
    5.3.2 Fourier transform implementation
    5.3.3 Discussion of template matching
  5.4 Feature extraction by low-level features
    5.4.1 Appearance-based approaches
    5.4.2 Distribution-based descriptors
  5.5 Hough transform
    5.5.1 Overview
    5.5.2 Lines
    5.5.3 HT for circles
    5.5.4 HT for ellipses
    5.5.5 Parameter space decomposition
    5.5.6 Generalized HT
    5.5.7 Other extensions to the HT
  5.6 Further reading
  5.7 References
CHAPTER 6 High-Level Feature Extraction: Deformable Shape Analysis
  6.1 Overview
  6.2 Deformable shape analysis
    6.2.1 Deformable templates
    6.2.2 Parts-based shape analysis
  6.3 Active contours (snakes)
    6.3.1 Basics
    6.3.2 The Greedy algorithm for snakes
    6.3.3 Complete (Kass) snake implementation
    6.3.4 Other snake approaches
    6.3.5 Further snake developments
    6.3.6 Geometric active contours (level-set-based approaches)
  6.4 Shape skeletonization
    6.4.1 Distance transforms
    6.4.2 Symmetry
  6.5 Flexible shape models—active shape and active appearance
  6.6 Further reading
  6.7 References
CHAPTER 7 Object Description
  7.1 Overview
  7.2 Boundary descriptions
    7.2.1 Boundary and region
    7.2.2 Chain codes
    7.2.3 Fourier descriptors
  7.3 Region descriptors
    7.3.1 Basic region descriptors
    7.3.2 Moments
  7.4 Further reading
  7.5 References
CHAPTER 8 Introduction to Texture Description, Segmentation, and Classification
  8.1 Overview
  8.2 What is texture?
  8.3 Texture description
    8.3.1 Performance requirements
    8.3.2 Structural approaches
    8.3.3 Statistical approaches
    8.3.4 Combination approaches
    8.3.5 Local binary patterns
    8.3.6 Other approaches
  8.4 Classification
    8.4.1 Distance measures
    8.4.2 The k-nearest neighbor rule
    8.4.3 Other classification approaches
  8.5 Segmentation
  8.6 Further reading
  8.7 References
CHAPTER 9 Moving Object Detection and Description
  9.1 Overview
  9.2 Moving object detection
    9.2.1 Basic approaches
    9.2.2 Modeling and adapting to the (static) background
    9.2.3 Background segmentation by thresholding
    9.2.4 Problems and advances
  9.3 Tracking moving features
    9.3.1 Tracking moving objects
    9.3.2 Tracking by local search
    9.3.3 Problems in tracking
    9.3.4 Approaches to tracking
    9.3.5 Meanshift and Camshift
    9.3.6 Recent approaches
  9.4 Moving feature extraction and description
    9.4.1 Moving (biological) shape analysis
    9.4.2 Detecting moving shapes by shape matching in image sequences
    9.4.3 Moving shape description
  9.5 Further reading
  9.6 References
CHAPTER 10 Appendix 1: Camera Geometry Fundamentals
  10.1 Image geometry
  10.2 Perspective camera
  10.3 Perspective camera model
    10.3.1 Homogeneous coordinates and projective geometry
    10.3.2 Perspective camera model analysis
    10.3.3 Parameters of the perspective camera model
  10.4 Affine camera
    10.4.1 Affine camera model
    10.4.2 Affine camera model and the perspective projection
    10.4.3 Parameters of the affine camera model
  10.5 Weak perspective model
  10.6 Example of camera models
  10.7 Discussion
  10.8 References
CHAPTER 11 Appendix 2: Least Squares Analysis
  11.1 The least squares criterion
  11.2 Curve fitting by least squares
CHAPTER 12 Appendix 3: Principal Components Analysis
  12.1 Principal components analysis
  12.2 Data
  12.3 Covariance
  12.4 Covariance matrix
  12.5 Data transformation
  12.6 Inverse transformation
  12.7 Eigenproblem
  12.8 Solving the eigenproblem
  12.9 PCA method summary
  12.10 Example
  12.11 References
CHAPTER 13 Appendix 4: Color Images
  13.1 Color images
  13.2 Tristimulus theory
  13.3 Color models
    13.3.1 The colorimetric equation
    13.3.2 Luminosity function
    13.3.3 Perception based color models: the CIE RGB and CIE XYZ
    13.3.4 Uniform color spaces: CIE LUV and CIE LAB
    13.3.5 Additive and subtractive color models: RGB and CMY
    13.3.6 Luminance and chrominance color models: YUV, YIQ, and YCbCr
    13.3.7 Perceptual color models: HSV and HLS
    13.3.8 More color models
  13.4 References
Preface
What is new in the third edition?
Image processing and computer vision have been, and continue to be, subject to
much research and development. The research develops into books and so the
books need updating. We have always been interested to note that our book contains stock image processing and computer vision techniques which are yet to be
found in other regular textbooks (OK, some is to be found in specialist books,
though these rarely include much tutorial material). This has been true of the previous editions and certainly occurs here.
In this third edition, the completely new material is on new methods for low- and high-level feature extraction and description and on moving object detection,
tracking, and description. We have also extended the book to use color and more
modern techniques for object extraction and description, especially those capitalizing on wavelets and on scale space. We have of course corrected the previous
production errors and included more tutorial material where appropriate. We continue to update the references, especially to those containing modern survey material and performance comparison. As such, this book—IOHO—remains the most
up-to-date text in feature extraction and image processing in computer vision.
Why did we write this book?
We always expected to be asked: “why on earth write a new book on computer
vision?”, and we have been. A fair question is “there are already many good
books on computer vision out in the bookshops, as you will find referenced later,
so why add to them?” Part of the answer is that any textbook is a snapshot of
material that exists prior to it. Computer vision, the art of processing images
stored within a computer, has seen a considerable amount of research by highly
qualified people and the volume of research would appear even to have increased
in recent years. That means a lot of new techniques have been developed, and
many of the more recent approaches are yet to migrate to textbooks. It is not just
the new research: part of the speedy advance in computer vision technique has
left some areas covered only in scanty detail. By the nature of research, one cannot publish material on technique that is seen more to fill historical gaps, rather
than to advance knowledge. This is again where a new text can contribute.
Finally, the technology itself continues to advance. This means that there is
new hardware, new programming languages, and new programming environments. In particular for computer vision, the advance of technology means that
computing power and memory are now relatively cheap. They are certainly considerably cheaper than when computer vision was starting as a research field. One of
the authors here notes that the laptop on which his portion of the book was written
has considerably more memory, is faster, and has bigger disk space and better
graphics than the computer that served the entire university of his student days.
And he is not that old! One of the more advantageous recent changes brought by
progress has been the development of mathematical programming systems. These
allow us to concentrate on mathematical technique itself rather than on implementation detail. There are several sophisticated flavors of which Matlab, one of the
chosen vehicles here, is (arguably) the most popular. We have been using these
techniques in research and in teaching, and we would argue that they have been
of considerable benefit there. In research, they help us to develop technique faster
and to evaluate its final implementation. For teaching, the power of a modern laptop and a mathematical system combine to show students, in lectures and in
study, not only how techniques are implemented but also how and why they work
with an explicit relation to conventional teaching material.
We wrote this book for these reasons. There is a host of material we could
have included but chose to omit; the taxonomy and structure we use to expose the
subject are of our own construction. Our apologies to other academics if it was
your own, or your favorite, technique that we chose to omit. By virtue of the
enormous breadth of the subject of image processing and computer vision, we
restricted the focus to feature extraction and image processing in computer vision,
for this has been not only the focus of our research but also an area where the attention of
established textbooks, with some exceptions, can be rather scanty. It is, however,
one of the prime targets of applied computer vision, so would benefit from better
attention. We have aimed to clarify some of its origins and development, while
also exposing implementation using mathematical systems. As such, we have
written this text with our original aims in mind and maintained the approach
through the later editions.
The book and its support
Each chapter of this book presents a particular package of information concerning
feature extraction in image processing and computer vision. Each package is
developed from its origins and later referenced to more recent material. Naturally,
there is often theoretical development prior to implementation. We have provided
working implementations of most of the major techniques we describe, and
applied them to process a selection of imagery. Though the focus of our work has
been more in analyzing medical imagery or in biometrics (the science of recognizing people by behavioral or physiological characteristics, like face recognition),
the techniques are general and can migrate to other application domains.
You will find a host of further supporting information at the book’s web site
http://www.ecs.soton.ac.uk/~msn/book/. First, you will find the worksheets (the
Matlab and Mathcad implementations that support the text) so that you can study
the techniques described herein. The demonstration site too is there. The web
site will be kept up-to-date as much as possible, for it also contains links to other
material such as web sites devoted to techniques and applications as well as to
available software and online literature. Finally, any errata will be reported there.
It is our regret and our responsibility that these will exist, and our inducement for
their reporting concerns a pint of beer. If you find an error that we don’t know
about (not typos like spelling, grammar, and layout) then use the “mailto” on the
web site and we shall send you a pint of good English beer, free!
There is a certain amount of mathematics in this book. The target audience is
the third- or fourth-year students of BSc/BEng/MEng in electrical or electronic
engineering, software engineering, and computer science, or in mathematics or
physics, and this is the level of mathematical analysis here. Computer vision can
be thought of as a branch of applied mathematics, though this does not really
apply to some areas within its remit; it certainly applies to the material herein.
The mathematics mainly concerns calculus and geometry, though
some of it is rather more detailed than the constraints of a conventional lecture
course might allow. Certainly, not all the material here is covered in detail in
undergraduate courses at Southampton.
Chapter 1 starts with an overview of computer vision hardware, software, and
established material, with reference to the most sophisticated vision system yet
“developed”: the human vision system. Though the precise details of the nature
of processing that allows us to see are yet to be determined, there is a considerable range of hardware and software that allow us to give a computer system
the capability to acquire, process, and reason with imagery, the function of
“sight.” The first chapter also provides a comprehensive bibliography of material
you can find on the subject including not only textbooks but also available software and other material. As this will no doubt be subject to change, it might well
be worth consulting the web site for more up-to-date information. The preference
for journal references is those which are likely to be found in local university
libraries or on the Web, IEEE Transactions in particular. These are often subscribed to as they are of relatively low cost and are often of very high quality.
Chapter 2 concerns the basics of signal processing theory for use in computer
vision. It introduces the Fourier transform that allows you to look at a signal in
a new way, in terms of its frequency content. It also allows us to work out the
minimum size of a picture to conserve information, to analyze the content in
terms of frequency, and even helps to speed up some of the later vision algorithms. Unfortunately, it does involve a few equations, but it is a new way of
looking at data and at signals and proves to be a rewarding topic of study in its
own right. It extends to wavelets, which are a popular analysis tool in image
processing.
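As a minimal sketch of what "frequency content" means here, the following Python fragment computes a discrete Fourier transform directly from its definition and finds the dominant frequency of a sampled cosine. It is not taken from the book's Matlab or Mathcad worksheets: the function name `dft`, the 1/N scaling of the forward transform, and the test signal are all assumptions chosen for illustration.

```python
import cmath
import math


def dft(samples):
    """Discrete Fourier transform of a 1D sequence, straight from the definition.

    The 1/N scaling on the forward transform is an assumed convention.
    """
    n = len(samples)
    return [
        sum(samples[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n)) / n
        for u in range(n)
    ]


# Illustrative signal: a cosine completing 3 cycles over 32 samples.
N = 32
signal = [math.cos(2 * math.pi * 3 * x / N) for x in range(N)]

magnitudes = [abs(c) for c in dft(signal)]

# For a real signal the second half of the spectrum mirrors the first,
# so only the first half is searched for the dominant component.
peak = max(range(N // 2), key=lambda u: magnitudes[u])
print(peak)  # → 3
```

The direct sum costs on the order of N² operations; it is used here only because it mirrors the definition, whereas practical code would normally call a fast Fourier transform.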
In Chapter 3, we start to look at basic image processing techniques, where
image points are mapped into a new value first by considering a single point in
an original image and then by considering groups of points. Not only do we see
common operations to make a picture’s appearance better, especially for human
vision, but also we see how to reduce the effects of different types of commonly
encountered image noise. We shall see some of the modern ways to remove noise
and thus clean images, and we shall also look at techniques which process an
image using notions of shape rather than mapping processes.
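The distinction between the two kinds of mapping can be sketched in a few lines of Python. This is an illustration under assumptions, not the book's implementation: the function names `invert` and `average3x3`, the choice to copy border pixels unchanged, and the tiny test image are all hypothetical.

```python
def invert(image, max_level=255):
    """Point operator: each output value depends only on the same input pixel."""
    return [[max_level - p for p in row] for row in image]


def average3x3(image):
    """Group operator: each interior output is the mean of a 3x3 neighbourhood.

    Border pixels are copied unchanged, which is one of several possible
    border treatments (an assumption made here for brevity).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            total = sum(
                image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
            out[y][x] = total / 9
    return out


# A flat image with a single bright "noise" pixel.
image = [
    [10, 10, 10, 10],
    [10, 100, 10, 10],
    [10, 10, 10, 10],
]

inverted = invert(image)
smoothed = average3x3(image)
print(inverted[1][1], smoothed[1][1])  # → 155 20.0
```

Averaging pulls the isolated spike from 100 down to 20, but it also raises its neighbour at `smoothed[1][2]` to 20: the noise is reduced by spreading it, which is why averaging blurs detail as well.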
Chapter 4 concerns low-level features which are the techniques that describe
the content of an image, at the level of a whole image rather than in distinct
regions of it. One of the most important processes we shall meet is called edge
detection. Essentially, this reduces an image to a form of a caricaturist’s sketch,
though without a caricaturist’s exaggerations. The major techniques are presented
in detail, together with descriptions of their implementation. Other image properties we can derive include measures of curvature, which developed into modern
methods of feature extraction, and measures of movement. These are also covered in this chapter.
These edges, the curvature, or the motion need to be grouped in some way so
that we can find shapes in an image; this grouping is dealt with in Chapter 5. Using basic
thresholding rarely suffices for shape extraction. One of the newer approaches is
to group low-level features to find an object—in a way this is object extraction
without shape. Another approach to shape extraction concerns analyzing the
match of low-level information to a known template of a target shape. As this
can be computationally very cumbersome, we then progress to a technique that
reduces the computational cost while maintaining optimal performance.
The technique is known as the Hough transform and it has long been a popular
target for researchers in computer vision who have sought to clarify its basis,
improve its speed, and increase its accuracy and robustness. Essentially, by the
Hough transform, we estimate the parameters that govern a shape’s appearance,
where the shapes range from lines to ellipses and even to unknown shapes.
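The idea of estimating the parameters that govern a shape's appearance can be sketched for the simplest case, a straight line. In the fragment below, each point votes for every line that could pass through it, using the polar form rho = x cos(theta) + y sin(theta), and the most-voted (theta, rho) cell is taken as the estimate. This is a minimal illustration under assumed names and an assumed one-degree angular step, not the formulation developed in Chapter 5.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180):
    """Accumulate votes in (theta index, rho) space for a set of (x, y) points."""
    votes = Counter()
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta  # angle in radians, one step per index
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(i, rho)] += 1
    return votes


def strongest_line(points, n_theta=180):
    """Return (theta index, rho, vote count) for the most-voted accumulator cell."""
    (theta_index, rho), count = hough_lines(points, n_theta).most_common(1)[0]
    return theta_index, rho, count


# Hypothetical edge points lying on the horizontal line y = 5.
points = [(x, 5) for x in range(0, 100, 10)]
theta_index, rho, count = strongest_line(points)
```

With the default step, `theta_index` is the angle of the line's normal in degrees and `rho` is its distance from the origin; only the cell matching the true line collects a vote from every point.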
Some applications of shape extraction need to determine rather
more than the parameters that control appearance, and need to be able to
deform or flex to match the image template. For this reason, the chapter on shape
extraction by matching is followed by Chapter 6, on flexible shape analysis. This is a
topic that has shown considerable progress of late, especially with the introduction of snakes (active contours). The newer material is the formulation by level
set methods, which brings new power to shape extraction techniques. These seek to
match a shape to an image by analyzing local properties. Further, we shall see
how we can describe a shape by its skeleton, although this has practical difficulties
which can be alleviated by using symmetry (though this can be slow), and also how
global constraints concerning the statistics of a shape’s appearance can be used
to guide final extraction.
Up to this point, we have not considered techniques that can be used to
describe the shape found in an image. In Chapter 7, we shall find that the two
major approaches concern techniques that describe a shape’s perimeter and those
that describe its area. Some of the perimeter description techniques, the Fourier
descriptors, are even couched using Fourier transform theory that allows analysis
of their frequency content. One of the major approaches to area description, statistical moments, also has a form of access to frequency components, though it is
of a very different nature to the Fourier analysis. One advantage is that insight
into descriptive ability can be achieved by reconstruction which should get back
to the original shape.
Chapter 8 describes texture analysis and also serves as a vehicle for introductory material on pattern classification. Texture describes patterns with no known
analytical description and has been the target of considerable research in computer vision and image processing. It is used here more as a vehicle for material
that precedes it, such as the Fourier transform and area descriptions, though references are provided for access to other generic material. There is also introductory
material on how to classify these patterns against known data, with a selection of
the distance measures that can be used within that, and this is a window on a
much larger area, to which appropriate pointers are given.
Finally, Chapter 9 concerns detecting and analyzing moving objects. Moving
objects are detected by separating the foreground from the background, known as
background subtraction. Having separated the moving components, one
approach is then to follow or track the object as it moves within a sequence of
image frames. The moving object can be described and recognized from the
tracking information or by collecting together the sequence of frames to derive
moving object descriptions.
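Background subtraction in its most basic form can be sketched as a thresholded difference between the current frame and a stored background image. The fragment below is a minimal illustration only: the function name, the threshold of 30 grey levels, and the tiny test images are assumptions, and practical methods usually also have to estimate and update the background.

```python
def background_subtract(frame, background, threshold=30):
    """Label a pixel 1 (foreground) when it differs from the background by more than threshold."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Hypothetical 3x4 grey-level images: a static background and a frame
# in which two pixels have been covered by a bright moving object.
background = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
frame = [
    [10, 12, 10, 10],
    [10, 200, 210, 10],
    [10, 10, 11, 10],
]
mask = background_subtract(frame, background)
```

The resulting `mask` marks the two bright pixels as foreground, while the small variations elsewhere stay below the threshold and remain background.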
The appendices include materials that are germane to the text, such as camera
models and coordinate geometry, the method of least squares, a topic known as
principal components analysis, and methods of color description. These are
intended as short introductions and are appendices since they are germane to
much of the material throughout but not needed directly to cover it. Other related
material is referenced throughout the text, especially online material.
In this way, the text covers all major areas of feature extraction and image processing in computer vision. There is considerably more material in the subject than
is presented here; for example, there is an enormous volume of material in 3D computer vision and in 2D signal processing, which is only alluded to here. Topics that
are specifically not included are 3D processing, watermarking, and image coding.
To include all these topics would lead to a monstrous book that no one could afford
or even pick up. So we admit we give a snapshot, and we hope that it is considered to open another window on a fascinating and rewarding subject.
In gratitude
We are immensely grateful for the input of our colleagues, in particular, Prof.
Steve Gunn, Dr. John Carter, and Dr. Sasan Mahmoodi. The family who put up
with it are Maria Eugenia and Caz and the nippers. We are also very grateful to
past and present researchers in computer vision at the Information: Signals,
Images, Systems (ISIS) research group under (or who have survived?) Mark’s
supervision at the School of Electronics and Computer Science, University of
Southampton. In addition to Alberto and Steve, these include Dr. Hani Muammar,