
Feature Extraction and Image Processing

Dedication

We would like to dedicate this book to our parents:

To Gloria and Joaquin Aguado, and to Brenda and the late Ian Nixon.

Feature Extraction and Image Processing

Second edition

Mark S. Nixon

Alberto S. Aguado

Amsterdam • Boston • Heidelberg • London • New York • Oxford

Paris • San Diego • San Francisco • Singapore • Sydney • Tokyo

Academic Press is an imprint of Elsevier


Linacre House, Jordan Hill, Oxford OX2 8DP, UK

84 Theobald’s Road, London WC1X 8RR, UK

No part of this publication may be reproduced, stored in a retrieval system

or transmitted in any form or by any means electronic, mechanical, photocopying,

recording or otherwise without the prior written permission of the publisher

Permissions may be sought directly from Elsevier’s Science & Technology Rights

Department in Oxford, UK: phone (+44) (0) 1865 843830; fax (+44) (0) 1865 853333;

email: [email protected]. Alternatively you can submit your request online by

visiting the Elsevier web site at http://elsevier.com/locate/permissions, and selecting

Obtaining permission to use Elsevier material

Notice

No responsibility is assumed by the publisher for any injury and/or damage to persons

or property as a matter of products liability, negligence or otherwise, or from any use

or operation of any methods, products, instructions or ideas contained in the material

herein. Because of rapid advances in the medical sciences, in particular, independent

verification of diagnoses and drug dosages should be made

British Library Cataloguing in Publication Data

A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data

A catalog record for this book is available from the Library of Congress

ISBN: 978-0-12372-538-7

For information on all Academic Press publications

visit our web site at books.elsevier.com

Printed and bound in Hungary

08 09 10 10 9 8 7 6 5 4 3 2 1

Working together to grow

libraries in developing countries

www.elsevier.com | www.bookaid.org | www.sabre.org

First edition 2002

Reprinted 2004, 2005

Second edition 2009

Copyright © 2009 Elsevier Ltd. All rights reserved

Contents

Preface xi

1 Introduction 1

1.1 Overview 1

1.2 Human and computer vision 1

1.3 The human vision system 3

1.3.1 The eye 4

1.3.2 The neural system 6

1.3.3 Processing 7

1.4 Computer vision systems 9

1.4.1 Cameras 10

1.4.2 Computer interfaces 12

1.4.3 Processing an image 14

1.5 Mathematical systems 15

1.5.1 Mathematical tools 16

1.5.2 Hello Mathcad, hello images! 16

1.5.3 Hello Matlab! 21

1.6 Associated literature 24

1.6.1 Journals and magazines 24

1.6.2 Textbooks 25

1.6.3 The web 28

1.7 Conclusions 29

1.8 References 29

2 Images, sampling and frequency domain processing 33

2.1 Overview 33

2.2 Image formation 34

2.3 The Fourier transform 37

2.4 The sampling criterion 43

2.5 The discrete Fourier transform 47

2.5.1 One-dimensional transform 47

2.5.2 Two-dimensional transform 49

2.6 Other properties of the Fourier transform 54

2.6.1 Shift invariance 54

2.6.2 Rotation 56

2.6.3 Frequency scaling 56

2.6.4 Superposition (linearity) 57


2.7 Transforms other than Fourier 58

2.7.1 Discrete cosine transform 58

2.7.2 Discrete Hartley transform 59

2.7.3 Introductory wavelets: the Gabor wavelet 61

2.7.4 Other transforms 63

2.8 Applications using frequency domain properties 64

2.9 Further reading 65

2.10 References 66

3 Basic image processing operations 69

3.1 Overview 69

3.2 Histograms 70

3.3 Point operators 71

3.3.1 Basic point operations 71

3.3.2 Histogram normalization 74

3.3.3 Histogram equalization 75

3.3.4 Thresholding 77

3.4 Group operations 81

3.4.1 Template convolution 81

3.4.2 Averaging operator 84

3.4.3 On different template size 87

3.4.4 Gaussian averaging operator 88

3.5 Other statistical operators 90

3.5.1 More on averaging 90

3.5.2 Median filter 91

3.5.3 Mode filter 94

3.5.4 Anisotropic diffusion 96

3.5.5 Force field transform 101

3.5.6 Comparison of statistical operators 102

3.6 Mathematical morphology 103

3.6.1 Morphological operators 104

3.6.2 Grey-level morphology 107

3.6.3 Grey-level erosion and dilation 108

3.6.4 Minkowski operators 109

3.7 Further reading 112

3.8 References 113

4 Low-level feature extraction (including edge detection) 115

4.1 Overview 115

4.2 First order edge detection operators 117

4.2.1 Basic operators 117

4.2.2 Analysis of the basic operators 119

4.2.3 Prewitt edge detection operator 121

4.2.4 Sobel edge detection operator 123

4.2.5 Canny edge detection operator 129


4.3 Second order edge detection operators 137

4.3.1 Motivation 137

4.3.2 Basic operators: the Laplacian 137

4.3.3 Marr–Hildreth operator 139

4.4 Other edge detection operators 144

4.5 Comparison of edge detection operators 145

4.6 Further reading on edge detection 146

4.7 Phase congruency 147

4.8 Localized feature extraction 152

4.8.1 Detecting image curvature (corner extraction) 153

4.8.1.1 Definition of curvature 153

4.8.1.2 Computing differences in edge direction 154

4.8.1.3 Measuring curvature by changes in intensity (differentiation) 156

4.8.1.4 Moravec and Harris detectors 159

4.8.1.5 Further reading on curvature 163

4.8.2 Modern approaches: region/patch analysis 163

4.8.2.1 Scale invariant feature transform 163

4.8.2.2 Saliency 166

4.8.2.3 Other techniques and performance issues 167

4.9 Describing image motion 167

4.9.1 Area-based approach 168

4.9.2 Differential approach 171

4.9.3 Further reading on optical flow 177

4.10 Conclusions 178

4.11 References 178

5 Feature extraction by shape matching 183

5.1 Overview 183

5.2 Thresholding and subtraction 184

5.3 Template matching 186

5.3.1 Definition 186

5.3.2 Fourier transform implementation 193

5.3.3 Discussion of template matching 196

5.4 Hough transform 196

5.4.1 Overview 196

5.4.2 Lines 197

5.4.3 Hough transform for circles 203

5.4.4 Hough transform for ellipses 207

5.4.5 Parameter space decomposition 210

5.4.5.1 Parameter space reduction for lines 210

5.4.5.2 Parameter space reduction for circles 212

5.4.5.3 Parameter space reduction for ellipses 217

5.5 Generalized Hough transform 221

5.5.1 Formal definition of the GHT 221

5.5.2 Polar definition 223


5.5.3 The GHT technique 224

5.5.4 Invariant GHT 228

5.6 Other extensions to the Hough transform 235

5.7 Further reading 236

5.8 References 237

6 Flexible shape extraction (snakes and other techniques) 241

6.1 Overview 241

6.2 Deformable templates 242

6.3 Active contours (snakes) 244

6.3.1 Basics 244

6.3.2 The greedy algorithm for snakes 246

6.3.3 Complete (Kass) snake implementation 252

6.3.4 Other snake approaches 257

6.3.5 Further snake developments 257

6.3.6 Geometric active contours 261

6.4 Shape skeletonization 266

6.4.1 Distance transforms 266

6.4.2 Symmetry 268

6.5 Flexible shape models: active shape and active appearance 272

6.6 Further reading 275

6.7 References 276

7 Object description 281

7.1 Overview 281

7.2 Boundary descriptions 282

7.2.1 Boundary and region 282

7.2.2 Chain codes 283

7.2.3 Fourier descriptors 285

7.2.3.1 Basis of Fourier descriptors 286

7.2.3.2 Fourier expansion 287

7.2.3.3 Shift invariance 289

7.2.3.4 Discrete computation 290

7.2.3.5 Cumulative angular function 292

7.2.3.6 Elliptic Fourier descriptors 301

7.2.3.7 Invariance 305

7.3 Region descriptors 311

7.3.1 Basic region descriptors 311

7.3.2 Moments 315

7.3.2.1 Basic properties 315

7.3.2.2 Invariant moments 318

7.3.2.3 Zernike moments 320

7.3.2.4 Other moments 324

7.4 Further reading 325

7.5 References 326


8 Introduction to texture description, segmentation and classification 329

8.1 Overview 329

8.2 What is texture? 330

8.3 Texture description 332

8.3.1 Performance requirements 332

8.3.2 Structural approaches 332

8.3.3 Statistical approaches 335

8.3.4 Combination approaches 337

8.4 Classification 339

8.4.1 The k-nearest neighbour rule 339

8.4.2 Other classification approaches 343

8.5 Segmentation 343

8.6 Further reading 345

8.7 References 346

9 Appendix 1: Example worksheets 349

9.1 Example Mathcad worksheet for Chapter 3 349

9.2 Example Matlab worksheet for Chapter 4 352

10 Appendix 2: Camera geometry fundamentals 355

10.1 Image geometry 355

10.2 Perspective camera 355

10.3 Perspective camera model 357

10.3.1 Homogeneous coordinates and projective geometry 357

10.3.1.1 Representation of a line and duality 358

10.3.1.2 Ideal points 358

10.3.1.3 Transformations in the projective space 359

10.3.2 Perspective camera model analysis 360

10.3.3 Parameters of the perspective camera model 363

10.4 Affine camera 364

10.4.1 Affine camera model 365

10.4.2 Affine camera model and the perspective projection 366

10.4.3 Parameters of the affine camera model 368

10.5 Weak perspective model 369

10.6 Example of camera models 371

10.7 Discussion 379

10.8 References 380

11 Appendix 3: Least squares analysis 381

11.1 The least squares criterion 381

11.2 Curve fitting by least squares 382


12 Appendix 4: Principal components analysis 385

12.1 Introduction 385

12.2 Data 385

12.3 Covariance 386

12.4 Covariance matrix 388

12.5 Data transformation 389

12.6 Inverse transformation 390

12.7 Eigenproblem 391

12.8 Solving the eigenproblem 392

12.9 PCA method summary 392

12.10 Example 393

12.11 References 398

Index 399


Preface

Why did we write this book?

We will no doubt be asked many times: why on earth write a new book on computer vision?

Fair question: there are already many good books on computer vision in the bookshops, as you

will find referenced later, so why add to them? Part of the answer is that any textbook is a

snapshot of material that exists before it. Computer vision, the art of processing images stored

within a computer, has seen a considerable amount of research by highly qualified people and

the volume of research would appear even to have increased in recent years. This means that a

lot of new techniques have been developed, and many of the more recent approaches have yet

to migrate to textbooks.

But it is not just the new research: part of the speedy advance in computer vision technique

has left some areas covered only in scanty detail. By the nature of research, one cannot publish

material on technique that is seen more to fill historical gaps, rather than to advance knowledge.

This is again where a new text can contribute.

Finally, the technology itself continues to advance. This means that there is new hardware,

and there are new programming languages and new programming environments. In particular for

computer vision, the advance of technology means that computing power and memory are now

relatively cheap. It is certainly considerably cheaper than when computer vision was starting as

a research field. One of the authors here notes that the laptop that his portion of the book was

written on has more memory, is faster, and has bigger disk space and better graphics than the

computer that served the entire university of his student days. And he is not that old! One of

the more advantageous recent changes brought about by progress has been the development of

mathematical programming systems. These allow us to concentrate on mathematical technique

itself, rather than on implementation detail. There are several sophisticated flavours, of which

Mathcad and Matlab, the chosen vehicles here, are among the most popular. We have been using

these techniques in research and teaching, and we would argue that they have been of considerable benefit there. In research, they help us to develop technique more quickly and to evaluate

its final implementation. For teaching, the power of a modern laptop and a mathematical system

combines to show students, in lectures and in study, not only how techniques are implemented,

but also how and why they work with an explicit relation to conventional teaching material.

We wrote this book for these reasons. There is a host of material that we could have included

but chose to omit. Our apologies to other academics if it was your own, or your favourite,

technique. By virtue of the enormous breadth of the subject of computer vision, we restricted the

focus to feature extraction and image processing in computer vision, for this not only has been

the focus of our research, but is also where the attention of established textbooks, with some

exceptions, can be rather scanty. It is, however, one of the prime targets of applied computer

vision, so would benefit from better attention. We have aimed to clarify some of its origins

and development, while also exposing implementation using mathematical systems. As such,

we have written this text with our original aims in mind.


Why did we produce another edition?

There are many reasons why we have updated the book to provide this new edition. First,

despite its electronic submission, some of the first edition was retyped before production. This

introduced errors that we have now corrected. Next, the field continues to move forward: we

now include some techniques which were gaining appreciation when we first wrote the book,

or have been developed since. Some areas move more rapidly than others, and this is reflected

in the changes made. Also, there has been interim scholarship, especially in the form of new

texts, and we include these new ones as much as we can. Matlab and Mathcad are still the

computational media here, and there is a new demonstration site which uses Java. Finally, we

have maintained the original format. It is always tempting to change the format, in this case even

to reformat the text, but we have chosen not to do so. Apart from corrections and clarifications,

the main changes from the previous edition are:

• Chapter 1: updating of eye operation, camera technology and software, updating and extension of web material and literature

• Chapter 2: very little (this is standard material), except for an excellent example of aliasing

• Chapter 3: inclusion of anisotropic diffusion for image smoothing, the force field operator

and mathematical morphology

• Chapter 4: extension of frequency domain concepts and differentiation operators; inclusion

of phase congruency, modern curvature operators and the scale invariant feature transform

(SIFT)

• Chapter 5: emphasis of the practical attributes of feature extraction in occlusion and noise,

and some moving-feature techniques

• Chapter 6: inclusion of geometric active contours and level set methods, inclusion of skeletonization, extension of active shape models

• Chapter 7: extension of the material on moments, particularly Zernike moments, including

reconstruction from moments

• Chapter 8: clarification of some of the detail in feature-based recognition

• Appendices: these have been extended to cover camera models in greater detail, and principal

components analysis.

As already mentioned, there is a new Java-based demonstration site, at http://www.ecs.soton.ac.uk/~msn/book/new_demo/, which has some of the techniques described herein and some

examples of how computer vision-based biometrics work. This webpage will continue to be

updated.

The book and its support

Each chapter of the book presents a particular package of information concerning feature

extraction in image processing and computer vision. Each package is developed from its origins

and later referenced to more recent material. Naturally, there is often theoretical development

before implementation (in Mathcad or Matlab). We have provided working implementations of

most of the major techniques we describe, and applied them to process a selection of imagery.

Although the focus of our work has been more in analysing medical imagery or in biometrics


(the science of recognizing people by behavioural or physiological characteristics, like face

recognition), the techniques are general and can migrate to other application domains.

You will find a host of further supporting information at the book’s website

(http://www.ecs.soton.ac.uk/~msn/book/). First, you will find the worksheets (the Matlab and

Mathcad implementations that support the text) so that you can study the techniques described

herein. There are also lecturing versions that have been arranged for display via an overhead

projector, with enlarged text and more interactive demonstration. The example questions (and,

eventually, their answers) are also there. The demonstration site is there too. The website will

be kept as up to date as possible, for it also contains links to other material such as websites

devoted to techniques and to applications, as well as to available software and online literature.

Finally, any errata will be reported there. It is our regret and our responsibility that these will

exist, but our inducement for their reporting concerns a pint of beer. If you find an error that

we do not know about (not typos such as spelling, grammar and layout) then use the mailto on

the website and we shall send you a pint of good English beer, free!

There is a certain amount of mathematics in this book. The target audience is third or fourth

year students in BSc/BEng/MEng courses in electrical or electronic engineering, software engineering and computer science, or in mathematics or physics, and this is the level of mathematical

analysis here. Computer vision can be thought of as a branch of applied mathematics, although

this does not really apply to some areas within its remit, but certainly applies to the material

herein. The mathematics essentially concerns mainly calculus and geometry, although some of

it is rather more detailed than the constraints of a conventional lecture course might allow. Certainly, not all of the material here is covered in detail in undergraduate courses at Southampton.

The book starts with an overview of computer vision hardware, software and established

material, with reference to the most sophisticated vision system yet ‘developed’: the human

vision system. Although the precise details of the nature of processing that allows us to see

have yet to be determined, there is a considerable range of hardware and software that allow

us to give a computer system the capability to acquire, process and reason with imagery, the

function of ‘sight’. The first chapter also provides a comprehensive bibliography of material

on the subject, including not only textbooks, but also available software and other material. As

this will no doubt be subject to change, it might well be worth consulting the website for more

up-to-date information. The preferred journal references are those that are likely to be found

in local university libraries or on the web, IEEE Transactions in particular. These are often

subscribed to as they are relatively low cost, and are often of very high quality.

The next chapter concerns the basics of signal processing theory for use in computer vision.

It introduces the Fourier transform, which allows you to look at a signal in a new way, in terms

of its frequency content. It also allows us to work out the minimum size of a picture to conserve

information and to analyse the content in terms of frequency, and even helps to speed up some

of the later vision algorithms. Unfortunately, it does involve a few equations, but it is a new

way of looking at data and signals, and proves to be a rewarding topic of study in its own right.
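The frequency-content idea can be illustrated in a few lines. The book's worksheets use Mathcad and Matlab; the following is a hypothetical pure-Python sketch of the one-dimensional discrete Fourier transform, showing how a sampled cosine concentrates all of its energy at a single pair of frequency indices.

```python
import math
import cmath

def dft(signal):
    """Discrete Fourier transform: project the signal onto complex
    exponentials, one per frequency index u."""
    N = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * u * x / N)
                for x in range(N))
            for u in range(N)]

# One cycle of a cosine sampled at 8 points...
signal = [math.cos(2 * math.pi * x / 8) for x in range(8)]
magnitudes = [round(abs(c), 6) for c in dft(signal)]
# ...places all its energy at frequency indices 1 and N - 1;
# every other magnitude is zero.
```

The direct sum above costs O(N²); the fast Fourier transform met later reduces this to O(N log N), which is part of why frequency-domain implementations can speed up vision algorithms.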

We then start to look at basic image-processing techniques, where image points are mapped

into a new value first by considering a single point in an original image, and then by considering

groups of points. We see not only common operations to make a picture’s appearance better,

especially for human vision, but also how to reduce the effects of different types of commonly

encountered image noise. This is where the techniques are implemented as algorithms in Mathcad

and Matlab to show precisely how the equations work. We shall see some of the modern ways

to remove noise and thus clean images, and we shall also look at techniques which process an

image using notions of shape, rather than mapping processes.
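As a flavour of the group operations just mentioned, here is a hypothetical pure-Python sketch of the median filter (the book's own implementations are in Mathcad and Matlab). The median of a neighbourhood rejects impulsive "salt" noise that an averaging operator would merely smear.

```python
import statistics

def median_filter(image, size=3):
    """Replace each pixel by the median of its size-by-size
    neighbourhood; border pixels are left unchanged for simplicity."""
    h, w = len(image), len(image[0])
    r = size // 2
    out = [row[:] for row in image]
    for y in range(r, h - r):
        for x in range(r, w - r):
            window = [image[y + dy][x + dx]
                      for dy in range(-r, r + 1)
                      for dx in range(-r, r + 1)]
            out[y][x] = statistics.median(window)
    return out

# A flat image of 10s with a single salt-noise spike...
image = [[10] * 5 for _ in range(5)]
image[2][2] = 255
filtered = median_filter(image)
# ...the spike vanishes: the median of eight 10s and one 255 is 10.
```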


The following chapter concerns low-level features, which are the techniques that describe

the content of an image, at the level of a whole image rather than in distinct regions of it. One

of the most important processes is edge detection. Essentially, this reduces an image to a form

of a caricaturist’s sketch, but without a caricaturist’s exaggerations. The major techniques are

presented in detail, together with descriptions of their implementation. Other image properties

we can derive include measures of curvature and measures of movement. These also are covered

in this chapter.
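The caricaturist's sketch can itself be sketched. The following hypothetical pure-Python illustration applies the Sobel templates of Chapter 4 (the book's worksheets are in Mathcad and Matlab), combining horizontal and vertical first-order differences into an edge-magnitude image.

```python
def sobel_magnitude(image):
    """First-order edge detection: convolve with the Sobel templates
    and combine the horizontal and vertical responses."""
    gx_t = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
    gy_t = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(gx_t[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_t[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag

# A vertical step edge: dark left half, bright right half...
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = sobel_magnitude(image)
# ...gives a strong response astride the step and none in the
# flat regions on either side.
```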

These edges, the curvature or the motion need to be grouped in some way so that we can

find shapes in an image. Our first approach to shape extraction concerns analysing the match

of low-level information to a known template of a target shape. As this can be computationally

very cumbersome, we then progress to a technique that improves computational performance,

while maintaining an optimal performance. The technique is known as the Hough transform,

and it has long been a popular target for researchers in computer vision who have sought to

clarify its basis, improve its speed, and increase its accuracy and robustness. Essentially, by

the Hough transform we estimate the parameters that govern a shape’s appearance, where the

shapes range from lines to ellipses and even to unknown shapes.
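For the simplest case, lines, the parameter estimation works by voting: each edge point votes for every line that could pass through it, using the polar form rho = x·cos(theta) + y·sin(theta), and the accumulator peak gives the line's parameters. A hypothetical pure-Python sketch (the book's implementations are in Mathcad and Matlab):

```python
import math

def hough_lines(points, width, height, n_theta=180):
    """Hough transform for lines: accumulate votes in (rho, theta)
    space and return the best-supported line."""
    diag = int(math.hypot(width, height)) + 1
    # accumulator indexed by [theta][rho + diag], so rho may be negative
    acc = [[0] * (2 * diag) for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[t][rho + diag] += 1
    # the accumulator peak identifies the line's parameters
    votes, t, rho = max(((acc[t][r], t, r - diag)
                         for t in range(n_theta)
                         for r in range(2 * diag)),
                        key=lambda cell: cell[0])
    return votes, math.pi * t / n_theta, rho

# Ten collinear points on the vertical line x = 4...
points = [(4, y) for y in range(10)]
votes, theta, rho = hough_lines(points, 16, 16)
# ...peak at theta = 0, rho = 4, with one vote per point.
```

The robustness to occlusion and noise mentioned above follows directly: missing points merely lower the peak, and stray points vote incoherently.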

Some applications of shape extraction require the determination of rather more than the

parameters that control appearance, but require the ability to deform or flex to match the image

template. For this reason, the chapter on shape extraction by matching is followed by one on

flexible shape analysis. This is a topic that has shown considerable progress of late, especially

with the introduction of snakes (active contours). The newer material is the formulation by level

set methods, and brings new power to shape-extraction techniques. These seek to match a shape

to an image by analysing local properties. Further, we shall see how we can describe a shape by

its skeleton, although with practical difficulty which can be alleviated by symmetry (though this

can be slow), and also how global constraints concerning the statistics of a shape’s appearance

can be used to guide final extraction.

Up to this point, we have not considered techniques that can be used to describe the shape

found in an image. We shall find that the two major approaches concern techniques that describe

a shape’s perimeter and those that describe its area. Some of the perimeter description techniques,

the Fourier descriptors, are even couched using Fourier transform theory, which allows analysis

of their frequency content. One of the major approaches to area description, statistical moments,

also has a form of access to frequency components, but is of a very different nature to the Fourier

analysis. One advantage is that insight into descriptive ability can be achieved by reconstruction,

which should get back to the original shape.
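The area-description idea can be made concrete with the lowest-order moments. This hypothetical pure-Python sketch (the book's worksheets are in Mathcad and Matlab) computes moments of a binary region and shows that central moments, taken about the centroid, are unchanged by translation.

```python
def moment(image, p, q):
    """Two-dimensional moment m_pq: sum over pixels of x^p * y^q * value."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

def central_moment(image, p, q):
    """Central moment mu_pq, computed about the centroid so the
    description does not depend on the shape's position."""
    m00 = moment(image, 0, 0)
    xc = moment(image, 1, 0) / m00
    yc = moment(image, 0, 1) / m00
    return sum(((x - xc) ** p) * ((y - yc) ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))

# The same 2x2 binary square placed at two different positions...
a = [[0] * 6 for _ in range(6)]
b = [[0] * 6 for _ in range(6)]
for i in (1, 2):
    for j in (1, 2):
        a[i][j] = 1
for i in (3, 4):
    for j in (3, 4):
        b[i][j] = 1
# ...has identical central moments: translation leaves them unchanged.
```

The invariant and Zernike moments of Chapter 7 build on exactly this construction, normalizing further for scale and rotation.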

The final chapter describes texture analysis, before some introductory material on pattern

classification. Texture describes patterns with no known analytical description and has been the

target of considerable research in computer vision and image processing. It is used here more

as a vehicle for material that precedes it, such as the Fourier transform and area descriptions,

although references are provided for access to other generic material. There is also introductory

material on how to classify these patterns against known data, but again this is a window on a

much larger area, to which appropriate pointers are given.
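The k-nearest neighbour rule of Chapter 8 is simple enough to sketch here. This is a hypothetical pure-Python illustration (the book's implementations are in Mathcad and Matlab), with invented two-dimensional texture features for two made-up classes: a sample takes the majority label among its k closest training vectors.

```python
import math
from collections import Counter

def knn_classify(sample, training, k=3):
    """k-nearest neighbour rule: label a sample by majority vote
    among the k closest training feature vectors."""
    neighbours = sorted(training,
                        key=lambda pair: math.dist(sample, pair[0]))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Invented 2D texture features for two illustrative classes
training = [((0.10, 0.20), 'grass'), ((0.20, 0.10), 'grass'),
            ((0.15, 0.15), 'grass'),
            ((0.90, 0.80), 'brick'), ((0.80, 0.90), 'brick'),
            ((0.85, 0.85), 'brick')]
label = knn_classify((0.20, 0.20), training)
# ...a sample near the first cluster is labelled 'grass'.
```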

The appendices include a printout of abbreviated versions of the Mathcad and Matlab worksheets. The other appendices include material that is germane to the text, such as camera

models and coordinate geometry, the method of least squares and a topic known as principal

components analysis. These are aimed to be short introductions, and are appendices since they

are germane to much of the material. Other related, especially online, material is referenced

throughout the text.

