Feature Extraction
and
Image Processing
Dedication
We would like to dedicate this book to our parents:
To Gloria and Joaquin Aguado, and to Brenda and the late Ian Nixon.
Feature Extraction
and
Image Processing
Second edition
Mark S. Nixon
Alberto S. Aguado
Amsterdam • Boston • Heidelberg • London • New York • Oxford
Paris • San Diego • San Francisco • Singapore • Sydney • Tokyo
Academic Press is an imprint of Elsevier
Linacre House, Jordan Hill, Oxford OX2 8DP, UK
84 Theobald’s Road, London WC1X 8RR, UK
First edition 2002
Reprinted 2004, 2005
Second edition 2008
Copyright © 2008 Elsevier Ltd. All rights reserved
No part of this publication may be reproduced, stored in a retrieval system
or transmitted in any form or by any means electronic, mechanical, photocopying,
recording or otherwise without the prior written permission of the publisher
Permissions may be sought directly from Elsevier’s Science & Technology Rights
Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333;
email: [email protected]. Alternatively you can submit your request online by
visiting the Elsevier web site at http://elsevier.com/locate/permissions, and selecting
Obtaining permission to use Elsevier material
Notice
No responsibility is assumed by the publisher for any injury and/or damage to persons
or property as a matter of products liability, negligence or otherwise, or from any use
or operation of any methods, products, instructions or ideas contained in the material
herein. Because of rapid advances in the medical sciences, in particular, independent
verification of diagnoses and drug dosages should be made
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
ISBN: 978-0-12372-538-7
For information on all Academic Press publications
visit our web site at books.elsevier.com
Printed and bound in Hungary
08 09 10 10 9 8 7 6 5 4 3 2 1
Working together to grow
libraries in developing countries
www.elsevier.com | www.bookaid.org | www.sabre.org
Contents
Preface xi
1 Introduction 1
1.1 Overview 1
1.2 Human and computer vision 1
1.3 The human vision system 3
1.3.1 The eye 4
1.3.2 The neural system 6
1.3.3 Processing 7
1.4 Computer vision systems 9
1.4.1 Cameras 10
1.4.2 Computer interfaces 12
1.4.3 Processing an image 14
1.5 Mathematical systems 15
1.5.1 Mathematical tools 16
1.5.2 Hello Mathcad, hello images! 16
1.5.3 Hello Matlab! 21
1.6 Associated literature 24
1.6.1 Journals and magazines 24
1.6.2 Textbooks 25
1.6.3 The web 28
1.7 Conclusions 29
1.8 References 29
2 Images, sampling and frequency domain processing 33
2.1 Overview 33
2.2 Image formation 34
2.3 The Fourier transform 37
2.4 The sampling criterion 43
2.5 The discrete Fourier transform 47
2.5.1 One-dimensional transform 47
2.5.2 Two-dimensional transform 49
2.6 Other properties of the Fourier transform 54
2.6.1 Shift invariance 54
2.6.2 Rotation 56
2.6.3 Frequency scaling 56
2.6.4 Superposition (linearity) 57
2.7 Transforms other than Fourier 58
2.7.1 Discrete cosine transform 58
2.7.2 Discrete Hartley transform 59
2.7.3 Introductory wavelets: the Gabor wavelet 61
2.7.4 Other transforms 63
2.8 Applications using frequency domain properties 64
2.9 Further reading 65
2.10 References 66
3 Basic image processing operations 69
3.1 Overview 69
3.2 Histograms 70
3.3 Point operators 71
3.3.1 Basic point operations 71
3.3.2 Histogram normalization 74
3.3.3 Histogram equalization 75
3.3.4 Thresholding 77
3.4 Group operations 81
3.4.1 Template convolution 81
3.4.2 Averaging operator 84
3.4.3 On different template size 87
3.4.4 Gaussian averaging operator 88
3.5 Other statistical operators 90
3.5.1 More on averaging 90
3.5.2 Median filter 91
3.5.3 Mode filter 94
3.5.4 Anisotropic diffusion 96
3.5.5 Force field transform 101
3.5.6 Comparison of statistical operators 102
3.6 Mathematical morphology 103
3.6.1 Morphological operators 104
3.6.2 Grey-level morphology 107
3.6.3 Grey-level erosion and dilation 108
3.6.4 Minkowski operators 109
3.7 Further reading 112
3.8 References 113
4 Low-level feature extraction (including edge detection) 115
4.1 Overview 115
4.2 First order edge detection operators 117
4.2.1 Basic operators 117
4.2.2 Analysis of the basic operators 119
4.2.3 Prewitt edge detection operator 121
4.2.4 Sobel edge detection operator 123
4.2.5 Canny edge detection operator 129
4.3 Second order edge detection operators 137
4.3.1 Motivation 137
4.3.2 Basic operators: the Laplacian 137
4.3.3 Marr–Hildreth operator 139
4.4 Other edge detection operators 144
4.5 Comparison of edge detection operators 145
4.6 Further reading on edge detection 146
4.7 Phase congruency 147
4.8 Localized feature extraction 152
4.8.1 Detecting image curvature (corner extraction) 153
4.8.1.1 Definition of curvature 153
4.8.1.2 Computing differences in edge direction 154
4.8.1.3 Measuring curvature by changes in intensity (differentiation) 156
4.8.1.4 Moravec and Harris detectors 159
4.8.1.5 Further reading on curvature 163
4.8.2 Modern approaches: region/patch analysis 163
4.8.2.1 Scale invariant feature transform 163
4.8.2.2 Saliency 166
4.8.2.3 Other techniques and performance issues 167
4.9 Describing image motion 167
4.9.1 Area-based approach 168
4.9.2 Differential approach 171
4.9.3 Further reading on optical flow 177
4.10 Conclusions 178
4.11 References 178
5 Feature extraction by shape matching 183
5.1 Overview 183
5.2 Thresholding and subtraction 184
5.3 Template matching 186
5.3.1 Definition 186
5.3.2 Fourier transform implementation 193
5.3.3 Discussion of template matching 196
5.4 Hough transform 196
5.4.1 Overview 196
5.4.2 Lines 197
5.4.3 Hough transform for circles 203
5.4.4 Hough transform for ellipses 207
5.4.5 Parameter space decomposition 210
5.4.5.1 Parameter space reduction for lines 210
5.4.5.2 Parameter space reduction for circles 212
5.4.5.3 Parameter space reduction for ellipses 217
5.5 Generalized Hough transform 221
5.5.1 Formal definition of the GHT 221
5.5.2 Polar definition 223
5.5.3 The GHT technique 224
5.5.4 Invariant GHT 228
5.6 Other extensions to the Hough transform 235
5.7 Further reading 236
5.8 References 237
6 Flexible shape extraction (snakes and other techniques) 241
6.1 Overview 241
6.2 Deformable templates 242
6.3 Active contours (snakes) 244
6.3.1 Basics 244
6.3.2 The greedy algorithm for snakes 246
6.3.3 Complete (Kass) snake implementation 252
6.3.4 Other snake approaches 257
6.3.5 Further snake developments 257
6.3.6 Geometric active contours 261
6.4 Shape skeletonization 266
6.4.1 Distance transforms 266
6.4.2 Symmetry 268
6.5 Flexible shape models: active shape and active appearance 272
6.6 Further reading 275
6.7 References 276
7 Object description 281
7.1 Overview 281
7.2 Boundary descriptions 282
7.2.1 Boundary and region 282
7.2.2 Chain codes 283
7.2.3 Fourier descriptors 285
7.2.3.1 Basis of Fourier descriptors 286
7.2.3.2 Fourier expansion 287
7.2.3.3 Shift invariance 289
7.2.3.4 Discrete computation 290
7.2.3.5 Cumulative angular function 292
7.2.3.6 Elliptic Fourier descriptors 301
7.2.3.7 Invariance 305
7.3 Region descriptors 311
7.3.1 Basic region descriptors 311
7.3.2 Moments 315
7.3.2.1 Basic properties 315
7.3.2.2 Invariant moments 318
7.3.2.3 Zernike moments 320
7.3.2.4 Other moments 324
7.4 Further reading 325
7.5 References 326
8 Introduction to texture description, segmentation and classification 329
8.1 Overview 329
8.2 What is texture? 330
8.3 Texture description 332
8.3.1 Performance requirements 332
8.3.2 Structural approaches 332
8.3.3 Statistical approaches 335
8.3.4 Combination approaches 337
8.4 Classification 339
8.4.1 The k-nearest neighbour rule 339
8.4.2 Other classification approaches 343
8.5 Segmentation 343
8.6 Further reading 345
8.7 References 346
9 Appendix 1: Example worksheets 349
9.1 Example Mathcad worksheet for Chapter 3 349
9.2 Example Matlab worksheet for Chapter 4 352
10 Appendix 2: Camera geometry fundamentals 355
10.1 Image geometry 355
10.2 Perspective camera 355
10.3 Perspective camera model 357
10.3.1 Homogeneous coordinates and projective geometry 357
10.3.1.1 Representation of a line and duality 358
10.3.1.2 Ideal points 358
10.3.1.3 Transformations in the projective space 359
10.3.2 Perspective camera model analysis 360
10.3.3 Parameters of the perspective camera model 363
10.4 Affine camera 364
10.4.1 Affine camera model 365
10.4.2 Affine camera model and the perspective projection 366
10.4.3 Parameters of the affine camera model 368
10.5 Weak perspective model 369
10.6 Example of camera models 371
10.7 Discussion 379
10.8 References 380
11 Appendix 3: Least squares analysis 381
11.1 The least squares criterion 381
11.2 Curve fitting by least squares 382
12 Appendix 4: Principal components analysis 385
12.1 Introduction 385
12.2 Data 385
12.3 Covariance 386
12.4 Covariance matrix 388
12.5 Data transformation 389
12.6 Inverse transformation 390
12.7 Eigenproblem 391
12.8 Solving the eigenproblem 392
12.9 PCA method summary 392
12.10 Example 393
12.11 References 398
Index 399
Preface
Why did we write this book?
We will no doubt be asked many times: why on earth write a new book on computer vision?
Fair question: there are already many good books on computer vision in the bookshops, as you
will find referenced later, so why add to them? Part of the answer is that any textbook is a
snapshot of material that exists before it. Computer vision, the art of processing images stored
within a computer, has seen a considerable amount of research by highly qualified people and
the volume of research would appear even to have increased in recent years. This means that a
lot of new techniques have been developed, and many of the more recent approaches have yet
to migrate to textbooks.
But it is not just the new research: part of the speedy advance in computer vision technique
has left some areas covered only in scanty detail. By the nature of research, one cannot publish
material on technique that is seen more to fill historical gaps, rather than to advance knowledge.
This is again where a new text can contribute.
Finally, the technology itself continues to advance. This means that there is new hardware,
and there are new programming languages and new programming environments. In particular for
computer vision, the advance of technology means that computing power and memory are now
relatively cheap. They are certainly considerably cheaper than when computer vision was starting as
a research field. One of the authors here notes that the laptop that his portion of the book was
written on has more memory, is faster, and has bigger disk space and better graphics than the
computer that served the entire university of his student days. And he is not that old! One of
the more advantageous recent changes brought about by progress has been the development of
mathematical programming systems. These allow us to concentrate on mathematical technique
itself, rather than on implementation detail. There are several sophisticated flavours, of which
Mathcad and Matlab, the chosen vehicles here, are among the most popular. We have been using
these techniques in research and teaching, and we would argue that they have been of considerable benefit there. In research, they help us to develop technique more quickly and to evaluate
its final implementation. For teaching, the power of a modern laptop and a mathematical system
combine to show students, in lectures and in study, not only how techniques are implemented,
but also how and why they work with an explicit relation to conventional teaching material.
We wrote this book for these reasons. There is a host of material that we could have included
but chose to omit. Our apologies to other academics if it was your own, or your favourite,
technique. By virtue of the enormous breadth of the subject of computer vision, we restricted the
focus to feature extraction and image processing in computer vision, for this not only has been
the focus of our research, but is also where the attention of established textbooks, with some
exceptions, can be rather scanty. It is, however, one of the prime targets of applied computer
vision, so would benefit from better attention. We have aimed to clarify some of its origins
and development, while also exposing implementation using mathematical systems. As such,
we have written this text with our original aims in mind.
Why did we produce another edition?
There are many reasons why we have updated the book to provide this new edition. First,
despite its electronic submission, some of the first edition was retyped before production. This
introduced errors that we have now corrected. Next, the field continues to move forward: we
now include some techniques which were gaining appreciation when we first wrote the book,
or have been developed since. Some areas move more rapidly than others, and this is reflected
in the changes made. Also, there has been interim scholarship, especially in the form of new
texts, and we include these new ones as much as we can. Matlab and Mathcad are still the
computational media here, and there is a new demonstration site which uses Java. Finally, we
have maintained the original format. It is always tempting to change the format, in this case even
to reformat the text, but we have chosen not to do so. Apart from corrections and clarifications,
the main changes from the previous edition are:
• Chapter 1: updating of eye operation, camera technology and software, updating and extension of web material and literature
• Chapter 2: very little (this is standard material), except for an excellent example of aliasing
• Chapter 3: inclusion of anisotropic diffusion for image smoothing, the force field operator
and mathematical morphology
• Chapter 4: extension of frequency domain concepts and differentiation operators; inclusion
of phase congruency, modern curvature operators and the scale invariant feature transform
(SIFT)
• Chapter 5: emphasis of the practical attributes of feature extraction in occlusion and noise,
and some moving-feature techniques
• Chapter 6: inclusion of geometric active contours and level set methods, inclusion of skeletonization, extension of active shape models
• Chapter 7: extension of the material on moments, particularly Zernike moments, including
reconstruction from moments
• Chapter 8: clarification of some of the detail in feature-based recognition
• Appendices: these have been extended to cover camera models in greater detail, and principal
components analysis.
As already mentioned, there is a new Java-based demonstration site, at
http://www.ecs.soton.ac.uk/~msn/book/new_demo/, which has some of the techniques described herein and some
examples of how computer vision-based biometrics work. This webpage will continue to be
updated.
The book and its support
Each chapter of the book presents a particular package of information concerning feature
extraction in image processing and computer vision. Each package is developed from its origins
and later referenced to more recent material. Naturally, there is often theoretical development
before implementation (in Mathcad or Matlab). We have provided working implementations of
most of the major techniques we describe, and applied them to process a selection of imagery.
Although the focus of our work has been more in analysing medical imagery or in biometrics
(the science of recognizing people by behavioural or physiological characteristics, like face
recognition), the techniques are general and can migrate to other application domains.
You will find a host of further supporting information at the book’s website
(http://www.ecs.soton.ac.uk/~msn/book/). First, you will find the worksheets (the Matlab and
Mathcad implementations that support the text) so that you can study the techniques described
herein. There are also lecturing versions that have been arranged for display via an overhead
projector, with enlarged text and more interactive demonstration. The example questions (and,
eventually, their answers) are also there. The demonstration site is there too. The website will
be kept as up to date as possible, for it also contains links to other material such as websites
devoted to techniques and to applications, as well as to available software and online literature.
Finally, any errata will be reported there. It is our regret and our responsibility that these will
exist, but our inducement for their reporting concerns a pint of beer. If you find an error that
we do not know about (not typos such as spelling, grammar and layout) then use the mailto on
the website and we shall send you a pint of good English beer, free!
There is a certain amount of mathematics in this book. The target audience is third or fourth
year students in BSc/BEng/MEng courses in electrical or electronic engineering, software engineering and computer science, or in mathematics or physics, and this is the level of mathematical
analysis here. Computer vision can be thought of as a branch of applied mathematics, although
this does not really apply to some areas within its remit, but certainly applies to the material
herein. The mathematics mainly concerns calculus and geometry, although some of
it is rather more detailed than the constraints of a conventional lecture course might allow. Certainly, not all of the material here is covered in detail in undergraduate courses at Southampton.
The book starts with an overview of computer vision hardware, software and established
material, with reference to the most sophisticated vision system yet ‘developed’: the human
vision system. Although the precise details of the nature of processing that allows us to see
have yet to be determined, there is a considerable range of hardware and software that allow
us to give a computer system the capability to acquire, process and reason with imagery, the
function of ‘sight’. The first chapter also provides a comprehensive bibliography of material
on the subject, including not only textbooks, but also available software and other material. As
this will no doubt be subject to change, it might well be worth consulting the website for more
up-to-date information. The preferred journal references are those that are likely to be found
in local university libraries or on the web, IEEE Transactions in particular. These are often
subscribed to as they are relatively low cost, and are often of very high quality.
The next chapter concerns the basics of signal processing theory for use in computer vision.
It introduces the Fourier transform, which allows you to look at a signal in a new way, in terms
of its frequency content. It also allows us to work out the minimum size of a picture to conserve
information and to analyse the content in terms of frequency, and even helps to speed up some
of the later vision algorithms. Unfortunately, it does involve a few equations, but it is a new
way of looking at data and signals, and proves to be a rewarding topic of study in its own right.
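
To make the idea of frequency content concrete, here is a minimal sketch in Python (a choice made for this illustration; the book's own worksheets use Mathcad and Matlab). The function name `dft`, the 1/N scaling of the forward transform and the test signal are all assumptions of the sketch, and the direct summation is used for clarity rather than speed.

```python
import cmath
import math


def dft(samples):
    """Direct O(N^2) discrete Fourier transform of a 1-D sequence.

    The 1/N scaling on the forward transform is an assumption of this sketch;
    other conventions place the scaling on the inverse transform instead.
    """
    n = len(samples)
    return [
        sum(samples[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n)) / n
        for u in range(n)
    ]


# Hypothetical test signal: a cosine with 3 complete cycles across 16 samples.
N = 16
signal = [math.cos(2 * math.pi * 3 * x / N) for x in range(N)]

magnitudes = [abs(c) for c in dft(signal)]

# Bins whose magnitude stands clear of numerical noise.
peaks = [u for u, m in enumerate(magnitudes) if m > 0.25]
print(peaks)  # [3, 13]: the cosine's frequency and its mirror image
```

The two peaks show that the whole signal is summarized by a single frequency (bin 3) together with its mirror at N - 3, which is the sense in which the transform lets us look at data in terms of frequency content.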
We then start to look at basic image-processing techniques, where image points are mapped
into a new value first by considering a single point in an original image, and then by considering
groups of points. We see not only common operations to make a picture’s appearance better,
especially for human vision, but also how to reduce the effects of different types of commonly
encountered image noise. This is where the techniques are implemented as algorithms in Mathcad
and Matlab to show precisely how the equations work. We shall see some of the modern ways
to remove noise and thus clean images, and we shall also look at techniques which process an
image using notions of shape, rather than mapping processes.
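
The difference between mapping a single point and mapping a group of points can be sketched as follows. This is an illustrative Python fragment, not one of the book's Mathcad or Matlab worksheets; the function names `point_operator` and `median_filter_3x3`, the 0..255 grey-level range and the tiny test image are assumptions made for the example.

```python
from statistics import median


def point_operator(image, gain=1.0, offset=0):
    """Point operation: each output pixel depends only on the same input pixel.

    new = gain * old + offset, clipped to the assumed 0..255 grey-level range.
    """
    return [[max(0, min(255, round(gain * p + offset))) for p in row] for row in image]


def median_filter_3x3(image):
    """Group operation: each interior pixel becomes the median of its 3x3 neighbourhood.

    Border pixels are copied unchanged (one simple border policy among several).
    """
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            window = [image[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# Hypothetical 4x4 image: uniform background with one impulse-noise pixel.
image = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

brighter = point_operator(image, gain=1.0, offset=20)  # noise pixel stays at 255
cleaned = median_filter_3x3(image)                     # noise pixel becomes 10
```

The point operator brightens every pixel but cannot remove the outlier, because it never looks at the neighbours; the median filter removes it because the outlier is outvoted within its neighbourhood.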
The following chapter concerns low-level features, which are the techniques that describe
the content of an image, at the level of a whole image rather than in distinct regions of it. One
of the most important processes is edge detection. Essentially, this reduces an image to a form
of a caricaturist’s sketch, but without a caricaturist’s exaggerations. The major techniques are
presented in detail, together with descriptions of their implementation. Other image properties
we can derive include measures of curvature and measures of movement. These also are covered
in this chapter.
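
As a small illustration of first-order edge detection, the sketch below applies the two 3x3 Sobel templates to a synthetic step edge. It is written in Python for this illustration only and is not taken from the book's worksheets; the function name `sobel_magnitude`, the zero-valued border and the test image are assumptions.

```python
import math

# Sobel templates for horizontal and vertical intensity change.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the edge magnitude at each interior pixel; border pixels are left at 0.

    The templates are applied without flipping (correlation); for the magnitude
    this gives the same result as convolution.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


# Hypothetical 5x5 image containing a vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 100, 100] for _ in range(5)]

edges = sobel_magnitude(image)
print(edges[2])  # [0.0, 0.0, 400.0, 400.0, 0.0]
```

The response is zero in the flat region and strong on either side of the step, which is the sketch-like reduction of the image that edge detection aims for.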
These edges, the curvature or the motion need to be grouped in some way so that we can
find shapes in an image. Our first approach to shape extraction concerns analysing the match
of low-level information to a known template of a target shape. As this can be computationally
very cumbersome, we then progress to a technique that improves computational performance,
while maintaining optimal performance. The technique is known as the Hough transform,
and it has long been a popular target for researchers in computer vision who have sought to
clarify its basis, improve its speed, and increase its accuracy and robustness. Essentially, by
the Hough transform we estimate the parameters that govern a shape’s appearance, where the
shapes range from lines to ellipses and even to unknown shapes.
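
The voting idea behind the Hough transform for lines can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the book's implementation: it uses the polar form rho = x cos(theta) + y sin(theta), quantizes theta to whole degrees and rho to whole pixels, and the function name `hough_lines` and the test points are hypothetical.

```python
import math
from collections import Counter


def hough_lines(points, theta_step_deg=1):
    """Accumulate votes in (rho, theta) space for a set of edge points.

    Each point votes for every line that could pass through it; collinear
    points agree on one cell, which appears as a peak in the accumulator.
    """
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, theta_deg)] += 1
    return votes


# Hypothetical edge points lying on the horizontal line y = 3.
points = [(x, 3) for x in range(0, 50, 5)]

(best_rho, best_theta), count = hough_lines(points).most_common(1)[0]
print(best_rho, best_theta, count)  # 3 90 10
```

The peak at rho = 3, theta = 90 degrees recovers the parameters of the line, and all ten points contribute to it. The quantization of the accumulator is the main design choice: coarser cells are faster and more tolerant of noise, finer cells give more precise parameters.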
Some applications of shape extraction require rather more than the determination of the
parameters that control appearance: they also require the ability to deform or flex to match the image
template. For this reason, the chapter on shape extraction by matching is followed by one on
flexible shape analysis. This is a topic that has shown considerable progress of late, especially
with the introduction of snakes (active contours). The newer material is the formulation by level
set methods, and brings new power to shape-extraction techniques. These seek to match a shape
to an image by analysing local properties. Further, we shall see how we can describe a shape by
its skeleton, although with practical difficulty which can be alleviated by symmetry (though this
can be slow), and also how global constraints concerning the statistics of a shape’s appearance
can be used to guide final extraction.
Up to this point, we have not considered techniques that can be used to describe the shape
found in an image. We shall find that the two major approaches concern techniques that describe
a shape’s perimeter and those that describe its area. Some of the perimeter description techniques,
the Fourier descriptors, are even couched using Fourier transform theory, which allows analysis
of their frequency content. One of the major approaches to area description, statistical moments,
also has a form of access to frequency components, but is of a very different nature to the Fourier
analysis. One advantage is that insight into descriptive ability can be achieved by reconstruction,
which should get back to the original shape.
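
As a brief sketch of area description by statistical moments, the Python fragment below computes raw and central moments of a binary shape and checks that the central moments do not change when the shape is translated. The function names `raw_moment`, `central_moment` and `make_block`, and the test shapes, are assumptions made for this illustration and are not drawn from the book's worksheets.

```python
def raw_moment(image, p, q):
    """Raw moment m_pq: sum over all pixels of x^p * y^q * pixel value."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))


def central_moment(image, p, q):
    """Central moment mu_pq, measured relative to the shape's centre of mass."""
    m00 = raw_moment(image, 0, 0)
    cx = raw_moment(image, 1, 0) / m00
    cy = raw_moment(image, 0, 1) / m00
    return sum(((x - cx) ** p) * ((y - cy) ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))


def make_block(top, left, size=8):
    """Hypothetical test shape: a 2-row by 3-column block of ones in a size x size image."""
    return [[1 if top <= y < top + 2 and left <= x < left + 3 else 0
             for x in range(size)]
            for y in range(size)]


shape = make_block(1, 1)
shifted = make_block(4, 3)  # the same block moved to a new position

area = raw_moment(shape, 0, 0)            # 6 pixels
mu20 = central_moment(shape, 2, 0)        # spread along x
mu20_shifted = central_moment(shifted, 2, 0)
print(area, mu20, mu20_shifted)  # 6 4.0 4.0
```

Because the central moments are taken about the centre of mass, they describe the shape itself rather than where it happens to sit in the image.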
The final chapter describes texture analysis, before some introductory material on pattern
classification. Texture describes patterns with no known analytical description and has been the
target of considerable research in computer vision and image processing. It is used here more
as a vehicle for material that precedes it, such as the Fourier transform and area descriptions,
although references are provided for access to other generic material. There is also introductory
material on how to classify these patterns against known data, but again this is a window on a
much larger area, to which appropriate pointers are given.
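
Classification against known data can be illustrated with the k-nearest neighbour rule. The sketch below is in Python and is only an illustration: the function name `knn_classify`, the Euclidean distance measure, the two-element feature vectors and the class labels are all hypothetical.

```python
import math
from collections import Counter


def knn_classify(sample, training, k=3):
    """Label a sample by majority vote among its k nearest training points.

    `training` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical texture measurements (two features per sample) with known classes.
training = [
    ((1.0, 1.0), "smooth"),
    ((1.2, 0.9), "smooth"),
    ((0.8, 1.1), "smooth"),
    ((5.0, 5.0), "rough"),
    ((5.2, 4.8), "rough"),
    ((4.9, 5.1), "rough"),
]

label_a = knn_classify((1.1, 1.0), training)
label_b = knn_classify((4.5, 5.0), training)
print(label_a, label_b)  # smooth rough
```

Choosing an odd k for a two-class problem avoids tied votes; larger values of k make the decision less sensitive to isolated training points.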
The appendices include a printout of abbreviated versions of the Mathcad and Matlab worksheets. The other appendices include material that is germane to the text, such as camera
models and coordinate geometry, the method of least squares and a topic known as principal
components analysis. These are aimed to be short introductions, and are appendices since they
are germane to much of the material. Other related, especially online, material is referenced
throughout the text.