Data Mining and Medical Knowledge Management: Cases and Applications

Petr Berka

University of Economics, Prague, Czech Republic

Jan Rauch

University of Economics, Prague, Czech Republic

Djamel Abdelkader Zighed

University of Lumiere Lyon 2, France

Hershey • New York

Medical Information Science Reference

Director of Editorial Content: Kristin Klinger

Managing Editor: Jamie Snavely

Assistant Managing Editor: Carole Coulson

Typesetter: Sean Woznicki

Cover Design: Lisa Tosheff

Printed at: Yurchak Printing Inc.

Published in the United States of America by

Information Science Reference (an imprint of IGI Global)

701 E. Chocolate Avenue, Suite 200

Hershey PA 17033

Tel: 717-533-8845

Fax: 717-533-8661

E-mail: [email protected]

Web site: http://www.igi-global.com/reference

and in the United Kingdom by

Information Science Reference (an imprint of IGI Global)

3 Henrietta Street

Covent Garden

London WC2E 8LU

Tel: 44 20 7240 0856

Fax: 44 20 7379 0609

Web site: http://www.eurospanbookstore.com

Copyright © 2009 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.

Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.

Library of Congress Cataloging-in-Publication Data

Data mining and medical knowledge management : cases and applications / Petr Berka, Jan Rauch, and Djamel Abdelkader Zighed, editors.

p. ; cm.

Includes bibliographical references and index.

Summary: "This book presents 20 case studies on applications of various modern data mining methods in several important areas of medicine, covering classical data mining methods, elaborated approaches related to mining in EEG and ECG data, and methods related to mining in genetic data"--Provided by publisher.

ISBN 978-1-60566-218-3 (hardcover)

1. Medicine--Data processing--Case studies. 2. Data mining--Case studies. I. Berka, Petr. II. Rauch, Jan. III. Zighed, Djamel A., 1955-

[DNLM: 1. Medical Informatics--methods--Case Reports. 2. Computational Biology--methods--Case Reports. 3. Information Storage and Retrieval--methods--Case Reports. 4. Risk Assessment--Case Reports. W 26.5 D2314 2009]

R858.D33 2009

610.0285--dc22

2008028366

British Cataloguing in Publication Data

A Cataloguing in Publication record for this book is available from the British Library.

All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the authors, but not necessarily of the publisher.

If a library purchased a print copy of this publication, please go to http://www.igi-global.com/agreement for information on activating the library's complimentary electronic access to this publication.

Editorial Advisory Board

Riccardo Bellazzi, University of Pavia, Italy

Radim Jiroušek, Academy of Sciences, Prague, Czech Republic

Katharina Morik, University of Dortmund, Germany

Ján Paralič, Technical University, Košice, Slovak Republic

Luis Torgo, LIAAD-INESC Porto LA, Portugal

Blaž Župan, University of Ljubljana, Slovenia

List of Reviewers

Riccardo Bellazzi, University of Pavia, Italy

Petr Berka, University of Economics, Prague, Czech Republic

Bruno Crémilleux, University Caen, France

Peter Eklund, Umeå University, Umeå, Sweden

Radim Jiroušek, Academy of Sciences, Prague, Czech Republic

Jiří Kléma, Czech Technical University, Prague, Czech Republic

Mila Kwiatkowska, Thompson Rivers University, Kamloops, Canada

Martin Labský, University of Economics, Prague, Czech Republic

Lenka Lhotská, Czech Technical University, Prague, Czech Republic

Ján Paralič, Technical University, Košice, Slovak Republic

Vincent Pisetta, University Lyon 2, France

Simon Marcellin, University Lyon 2, France

Jan Rauch, University of Economics, Prague, Czech Republic

Marisa Sánchez, National University, Bahía Blanca, Argentina

Ahmed-El Sayed, University Lyon 2, France

Olga Štěpánková, Czech Technical University, Prague, Czech Republic

Vojtěch Svátek, University of Economics, Prague, Czech Republic

Arnošt Veselý, Czech University of Life Sciences, Prague, Czech Republic

Djamel Zighed, University Lyon 2, France

Table of Contents

Foreword

Preface

Acknowledgment

Section I

Theoretical Aspects

Chapter I

Data, Information and Knowledge

Jana Zvárová, Institute of Computer Science of the Academy of Sciences of the Czech Republic v.v.i., Czech Republic; Center of Biomedical Informatics, Czech Republic

Arnošt Veselý, Institute of Computer Science of the Academy of Sciences of the Czech Republic v.v.i., Czech Republic; Czech University of Life Sciences, Czech Republic

Igor Vajda, Institutes of Computer Science and Information Theory and Automation of the Academy of Sciences of the Czech Republic v.v.i., Czech Republic

Chapter II

Ontologies in the Health Field

Michel Simonet, Laboratoire TIMC-IMAG, Institut de l’Ingénierie et de l’Information de Santé, France

Radja Messai, Laboratoire TIMC-IMAG, Institut de l’Ingénierie et de l’Information de Santé, France

Gayo Diallo, Laboratoire TIMC-IMAG, Institut de l’Ingénierie et de l’Information de Santé, France

Ana Simonet, Laboratoire TIMC-IMAG, Institut de l’Ingénierie et de l’Information de Santé, France

Chapter III

Cost-Sensitive Learning in Medicine

Alberto Freitas, University of Porto, Portugal; CINTESIS, Portugal

Pavel Brazdil, LIAAD - INESC Porto L.A., Portugal; University of Porto, Portugal

Altamiro Costa-Pereira, University of Porto, Portugal; CINTESIS, Portugal

Chapter IV

Classification and Prediction with Neural Networks

Arnošt Veselý, Czech University of Life Sciences, Czech Republic

Chapter V

Preprocessing Perceptrons and Multivariate Decision Limits

Patrik Eklund, Umeå University, Sweden

Lena Kallin Westin, Umeå University, Sweden

Section II

General Applications

Chapter VI

Image Registration for Biomedical Information Integration

Xiu Ying Wang, BMIT Research Group, The University of Sydney, Australia

Dagan Feng, BMIT Research Group, The University of Sydney, Australia; Hong Kong Polytechnic University, Hong Kong

Chapter VII

ECG Processing

Lenka Lhotská, Czech Technical University in Prague, Czech Republic

Václav Chudáček, Czech Technical University in Prague, Czech Republic

Michal Huptych, Czech Technical University in Prague, Czech Republic

Chapter VIII

EEG Data Mining Using PCA

Lenka Lhotská, Czech Technical University in Prague, Czech Republic

Vladimír Krajča, Faculty Hospital Na Bulovce, Czech Republic

Jitka Mohylová, Technical University Ostrava, Czech Republic

Svojmil Petránek, Faculty Hospital Na Bulovce, Czech Republic

Václav Gerla, Czech Technical University in Prague, Czech Republic

Chapter IX

Generating and Verifying Risk Prediction Models Using Data Mining

Darryl N. Davis, University of Hull, UK

Thuy T.T. Nguyen, University of Hull, UK

Chapter X

Management of Medical Website Quality Labels via Web Mining

Vangelis Karkaletsis, National Center of Scientific Research “Demokritos”, Greece

Konstantinos Stamatakis, National Center of Scientific Research “Demokritos”, Greece

Pythagoras Karampiperis, National Center of Scientific Research “Demokritos”, Greece

Martin Labský, University of Economics, Prague, Czech Republic

Marek Růžička, University of Economics, Prague, Czech Republic

Vojtěch Svátek, University of Economics, Prague, Czech Republic

Enrique Amigó Cabrera, ETSI Informática, UNED, Spain

Matti Pöllä, Helsinki University of Technology, Finland

Miquel Angel Mayer, Medical Association of Barcelona (COMB), Spain

Dagmar Villarroel Gonzales, Agency for Quality in Medicine (AquMed), Germany

Chapter XI

Two Case-Based Systems for Explaining Exceptions in Medicine

Rainer Schmidt, University of Rostock, Germany

Section III

Specific Cases

Chapter XII

Discovering Knowledge from Local Patterns in SAGE Data

Bruno Crémilleux, Université de Caen, France

Arnaud Soulet, Université François Rabelais de Tours, France

Jiří Kléma, Czech Technical University in Prague, Czech Republic

Céline Hébert, Université de Caen, France

Olivier Gandrillon, Université de Lyon, France

Chapter XIII

Gene Expression Mining Guided by Background Knowledge

Jiří Kléma, Czech Technical University in Prague, Czech Republic

Filip Železný, Czech Technical University in Prague, Czech Republic

Igor Trajkovski, Jožef Stefan Institute, Slovenia

Filip Karel, Czech Technical University in Prague, Czech Republic

Bruno Crémilleux, Université de Caen, France

Jakub Tolar, University of Minnesota, USA

Chapter XIV

Mining Tinnitus Database for Knowledge

Pamela L. Thompson, University of North Carolina at Charlotte, USA

Xin Zhang, University of North Carolina at Pembroke, USA

Wenxin Jiang, University of North Carolina at Charlotte, USA

Zbigniew W. Ras, University of North Carolina at Charlotte, USA

Pawel Jastreboff, Emory University School of Medicine, USA

Chapter XV

Gaussian-Stacking Multiclassifiers for Human Embryo Selection

Dinora A. Morales, University of the Basque Country, Spain

Endika Bengoetxea, University of the Basque Country, Spain

Pedro Larrañaga, Universidad Politécnica de Madrid, Spain

Chapter XVI

Mining Tuberculosis Data

Marisa A. Sánchez, Universidad Nacional del Sur, Argentina

Sonia Uremovich, Universidad Nacional del Sur, Argentina

Pablo Acrogliano, Hospital Interzonal Dr. José Penna, Argentina

Chapter XVII

Knowledge-Based Induction of Clinical Prediction Rules

Mila Kwiatkowska, Thompson Rivers University, Canada

M. Stella Atkins, Simon Fraser University, Canada

Les Matthews, Thompson Rivers University, Canada

Najib T. Ayas, University of British Columbia, Canada

C. Frank Ryan, University of British Columbia, Canada

Chapter XVIII

Data Mining in Atherosclerosis Risk Factor Data

Petr Berka, University of Economics, Prague, Czech Republic; Academy of Sciences of the Czech Republic, Prague, Czech Republic

Jan Rauch, University of Economics, Prague, Czech Republic; Academy of Sciences of the Czech Republic, Prague, Czech Republic

Marie Tomečková, Academy of Sciences of the Czech Republic, Prague, Czech Republic

Compilation of References

About the Contributors

Index

Detailed Table of Contents

Foreword

Preface

Acknowledgment

Section I

Theoretical Aspects

This section provides a theoretical and methodological background for the remaining parts of the book. It defines and explains basic notions of data mining and knowledge management, and discusses some general methods.

Chapter I

Data, Information and Knowledge

Jana Zvárová, Institute of Computer Science of the Academy of Sciences of the Czech Republic v.v.i., Czech Republic; Center of Biomedical Informatics, Czech Republic

Arnošt Veselý, Institute of Computer Science of the Academy of Sciences of the Czech Republic v.v.i., Czech Republic; Czech University of Life Sciences, Czech Republic

Igor Vajda, Institutes of Computer Science and Information Theory and Automation of the Academy of Sciences of the Czech Republic v.v.i., Czech Republic

This chapter introduces the basic concepts of medical informatics: data, information, and knowledge. It shows how these concepts are interrelated and can be used for decision support in medicine. All discussed approaches are illustrated on one simple medical example.

Chapter II

Ontologies in the Health Field

Michel Simonet, Laboratoire TIMC-IMAG, Institut de l’Ingénierie et de l’Information de Santé, France

Radja Messai, Laboratoire TIMC-IMAG, Institut de l’Ingénierie et de l’Information de Santé, France

Gayo Diallo, Laboratoire TIMC-IMAG, Institut de l’Ingénierie et de l’Information de Santé, France

Ana Simonet, Laboratoire TIMC-IMAG, Institut de l’Ingénierie et de l’Information de Santé, France

This chapter introduces the basic notions of ontologies, presents a survey of their use in medicine, and explores some related issues: knowledge bases, terminology, information retrieval. It also addresses the issues of ontology design, ontology representation, and the possible interaction between data mining and ontologies.

Chapter III

Cost-Sensitive Learning in Medicine

Alberto Freitas, University of Porto, Portugal; CINTESIS, Portugal

Pavel Brazdil, LIAAD - INESC Porto L.A., Portugal; University of Porto, Portugal

Altamiro Costa-Pereira, University of Porto, Portugal; CINTESIS, Portugal

Health managers and clinicians often need models that try to minimize several types of costs associated with healthcare, including attribute costs (e.g. the cost of a specific diagnostic test) and misclassification costs (e.g. the cost of a false negative test). This chapter presents some concepts related to cost-sensitive learning and cost-sensitive classification in medicine and reviews research in this area.

Chapter IV

Classification and Prediction with Neural Networks

Arnošt Veselý, Czech University of Life Sciences, Czech Republic

This chapter describes the theoretical background of artificial neural networks (architectures, methods of learning) and shows how these networks can be used in the medical domain to solve various classification and regression problems.

Chapter V

Preprocessing Perceptrons and Multivariate Decision Limits

Patrik Eklund, Umeå University, Sweden

Lena Kallin Westin, Umeå University, Sweden

This chapter introduces classification networks composed of preprocessing layers and classification networks, and compares them with “classical” multilayer perceptrons on three medical case studies.

Section II

General Applications

This section presents work that is general in the sense of a variety of methods or variety of problems described in each of the chapters.

Chapter VI

Image Registration for Biomedical Information Integration

Xiu Ying Wang, BMIT Research Group, The University of Sydney, Australia

Dagan Feng, BMIT Research Group, The University of Sydney, Australia; Hong Kong Polytechnic University, Hong Kong

This chapter introduces biomedical image registration and fusion, an effective mechanism to assist medical knowledge discovery by integrating and simultaneously representing relevant information from diverse imaging resources. It covers fundamental knowledge and major methodologies of biomedical image registration, and major applications of image registration in biomedicine.

Chapter VII

ECG Processing

Lenka Lhotská, Czech Technical University in Prague, Czech Republic

Václav Chudáček, Czech Technical University in Prague, Czech Republic

Michal Huptych, Czech Technical University in Prague, Czech Republic

This chapter describes methods for preprocessing, analysis, feature extraction, visualization, and classification of electrocardiogram (ECG) signals. First, preprocessing methods mainly based on the discrete wavelet transform are introduced. Then classification methods such as fuzzy rule-based decision trees and neural networks are presented. Two examples, visualization and feature extraction from Body Surface Potential Mapping (BSPM) signals and classification of Holter ECGs, illustrate how these methods are used.

Chapter VIII

EEG Data Mining Using PCA

Lenka Lhotská, Czech Technical University in Prague, Czech Republic

Vladimír Krajča, Faculty Hospital Na Bulovce, Czech Republic

Jitka Mohylová, Technical University Ostrava, Czech Republic

Svojmil Petránek, Faculty Hospital Na Bulovce, Czech Republic

Václav Gerla, Czech Technical University in Prague, Czech Republic

This chapter deals with the application of principal components analysis (PCA) to the field of data mining in electroencephalogram (EEG) processing. Possible applications of this approach include separation of different signal components for feature extraction in the field of EEG signal processing, adaptive segmentation, epileptic spike detection, and long-term EEG monitoring evaluation of patients in a coma.

Chapter IX

Generating and Verifying Risk Prediction Models Using Data Mining

Darryl N. Davis, University of Hull, UK

Thuy T.T. Nguyen, University of Hull, UK

In this chapter, existing clinical risk prediction models are examined and matched to the patient data to which they may be applied using classification and data mining techniques, such as neural nets. Novel risk prediction models are derived using unsupervised cluster analysis algorithms. All existing and derived models are verified as to their usefulness in medical decision support on the basis of their effectiveness on patient data from two UK sites.

Chapter X

Management of Medical Website Quality Labels via Web Mining

Vangelis Karkaletsis, National Center of Scientific Research “Demokritos”, Greece

Konstantinos Stamatakis, National Center of Scientific Research “Demokritos”, Greece

Pythagoras Karampiperis, National Center of Scientific Research “Demokritos”, Greece

Martin Labský, University of Economics, Prague, Czech Republic

Marek Růžička, University of Economics, Prague, Czech Republic

Vojtěch Svátek, University of Economics, Prague, Czech Republic

Enrique Amigó Cabrera, ETSI Informática, UNED, Spain

Matti Pöllä, Helsinki University of Technology, Finland

Miquel Angel Mayer, Medical Association of Barcelona (COMB), Spain

Dagmar Villarroel Gonzales, Agency for Quality in Medicine (AquMed), Germany

This chapter deals with the problem of quality assessment of medical Web sites. The so-called “quality labeling” process can benefit from the use of Web mining and information extraction techniques, in combination with flexible methods of Web-based information management developed within the Semantic Web initiative.

Chapter XI

Two Case-Based Systems for Explaining Exceptions in Medicine

Rainer Schmidt, University of Rostock, Germany

In medicine, doctors are often confronted with exceptions, both in medical practice and in medical research. One suitable way to deal with exceptions is to use case-based systems. This chapter presents two such systems. The first one is a knowledge-based system for therapy support. The second one is designed for medical studies or research. It helps to explain cases that contradict a theoretical hypothesis.

Section III

Specific Cases

This part shows the results of several case studies of (mostly) data mining applied to various specific medical problems. The problems covered by this part range from discovery of biologically interpretable knowledge from gene expression data, through human embryo selection for the purpose of human in-vitro fertilization treatments, to diagnosis of various diseases based on machine learning techniques.

Chapter XII

Discovering Knowledge from Local Patterns in SAGE Data

Bruno Crémilleux, Université de Caen, France

Arnaud Soulet, Université François Rabelais de Tours, France

Jiří Kléma, Czech Technical University in Prague, Czech Republic

Céline Hébert, Université de Caen, France

Olivier Gandrillon, Université de Lyon, France

Current gene data analysis is often based on global approaches such as clustering. An alternative way is to utilize local pattern mining techniques for global modeling and knowledge discovery. This chapter proposes three data mining methods to deal with the use of local patterns by highlighting the most promising ones or summarizing them. From the case study of the SAGE gene expression data, it is shown that this approach allows generating new biological hypotheses with clinical applications.

Chapter XIII

Gene Expression Mining Guided by Background Knowledge

Jiří Kléma, Czech Technical University in Prague, Czech Republic

Filip Železný, Czech Technical University in Prague, Czech Republic

Igor Trajkovski, Jožef Stefan Institute, Slovenia

Filip Karel, Czech Technical University in Prague, Czech Republic

Bruno Crémilleux, Université de Caen, France

Jakub Tolar, University of Minnesota, USA

This chapter points out the role of genomic background knowledge in gene expression data mining. Its application is demonstrated in several tasks such as relational descriptive analysis, constraint-based knowledge discovery, feature selection and construction, or quantitative association rule mining.

Chapter XIV

Mining Tinnitus Database for Knowledge

Pamela L. Thompson, University of North Carolina at Charlotte, USA

Xin Zhang, University of North Carolina at Pembroke, USA

Wenxin Jiang, University of North Carolina at Charlotte, USA

Zbigniew W. Ras, University of North Carolina at Charlotte, USA

Pawel Jastreboff, Emory University School of Medicine, USA

This chapter describes the process used to mine a database containing data related to patient visits during Tinnitus Retraining Therapy. The presented research focused on analysis of existing data, along with automating the discovery of new and useful features in order to improve classification and understanding of tinnitus diagnosis.

Chapter XV

Gaussian-Stacking Multiclassifiers for Human Embryo Selection

Dinora A. Morales, University of the Basque Country, Spain

Endika Bengoetxea, University of the Basque Country, Spain

Pedro Larrañaga, Universidad Politécnica de Madrid, Spain

This chapter describes a new multi-classification system using Gaussian networks to combine the outputs (probability distributions) of standard machine learning classification algorithms. This multi-classification technique has been applied to a complex real medical problem: the selection of the most promising embryo-batch for human in-vitro fertilization treatments.

Chapter XVI

Mining Tuberculosis Data

Marisa A. Sánchez, Universidad Nacional del Sur, Argentina

Sonia Uremovich, Universidad Nacional del Sur, Argentina

Pablo Acrogliano, Hospital Interzonal Dr. José Penna, Argentina

This chapter reviews current policies of tuberculosis control programs for the diagnosis of tuberculosis. A data mining project that uses WHO’s Direct Observation of Therapy data to analyze the relationship among different variables and the tuberculosis diagnostic category registered for each patient is then presented.

Chapter XVII

Knowledge-Based Induction of Clinical Prediction Rules

Mila Kwiatkowska, Thompson Rivers University, Canada

M. Stella Atkins, Simon Fraser University, Canada

Les Matthews, Thompson Rivers University, Canada

Najib T. Ayas, University of British Columbia, Canada

C. Frank Ryan, University of British Columbia, Canada

This chapter describes how to integrate medical knowledge with purely inductive (data-driven) methods for the creation of clinical prediction rules. To address the complexity of the domain knowledge, the authors have introduced a semio-fuzzy framework, which has its theoretical foundations in semiotics and fuzzy logic. This integrative framework has been applied to the creation of clinical prediction rules for the diagnosis of obstructive sleep apnea, a serious and under-diagnosed respiratory disorder.

Chapter XVIII

Data Mining in Atherosclerosis Risk Factor Data

Petr Berka, University of Economics, Prague, Czech Republic; Academy of Sciences of the Czech Republic, Prague, Czech Republic

Jan Rauch, University of Economics, Prague, Czech Republic; Academy of Sciences of the Czech Republic, Prague, Czech Republic

Marie Tomečková, Academy of Sciences of the Czech Republic, Prague, Czech Republic

This chapter describes the goals, current results, and further plans of a long-term activity concerning the application of data mining and machine learning methods to a complex medical data set. The analyzed data set concerns a longitudinal study of atherosclerosis risk factors.

Compilation of References

About the Contributors

Index


Foreword

Current research directions are looking at Data Mining (DM) and Knowledge Management (KM) as complementary and interrelated fields, aimed at supporting, with algorithms and tools, the lifecycle of knowledge, including its discovery, formalization, retrieval, reuse, and update. While DM focuses on the extraction of patterns, information, and ultimately knowledge from data (Giudici, 2003; Fayyad et al., 1996; Bellazzi, Zupan, 2008), KM deals with eliciting, representing, and storing explicit knowledge, as well as keeping and externalizing tacit knowledge (Abidi, 2001; Van der Spek, Spijkervet, 1997). Although DM and KM have stemmed from different cultural backgrounds and their methods and tools are different, too, it is now clear that they are dealing with the same fundamental issues, and that they must be combined to effectively support humans in decision making.

The capacity of DM to analyze data and to extract models, which may be meaningfully interpreted and transformed into knowledge, is a key feature for a KM system. Moreover, DM can be a very useful instrument to transform the tacit knowledge contained in transactional data into explicit knowledge, by making experts’ behavior and decision-making activities emerge.

On the other hand, DM is greatly empowered by KM. The available, or background, knowledge (BK) is exploited to drive data gathering and experimental planning, and to structure the databases and data warehouses. BK is used to properly select the data, choose the data mining strategies, improve the data mining algorithms, and finally evaluate the data mining results (Bellazzi, Zupan, 2008; Bellazzi, Zupan, 2008). The output of the data analysis process is an update of the domain knowledge itself, which may lead to new experiments and new data gathering (see Figure 1).

If the interaction and integration of DM and KM is important in all application areas, in medical applications it is essential (Cios, Moore, 2002). Data analysis in medicine is typically part of a complex reasoning process which largely depends on BK. Diagnosis, therapy, monitoring, and molecular research are always guided by the existing knowledge of the problem domain, on the population of patients or on the specific patient under consideration. Since medicine is a safety critical context (Fox, Das, 2000),

[Figure 1 diagram. Labels: Background Knowledge; Experimental design; Database design; Data extraction; Case-base definition; Data Mining; Patterns interpretation]

Figure 1. Role of the background knowledge in the data mining process
