Data Warehousing

Copyright © 2006, New Age International (P) Ltd., Publishers

Published by New Age International (P) Ltd., Publishers

All rights reserved.

No part of this ebook may be reproduced in any form, by photostat, microfilm, xerography, or any other means, or incorporated into any information retrieval system, electronic or mechanical, without the written permission of the publisher.

All inquiries should be emailed to [email protected]

PUBLISHING FOR ONE WORLD

NEW AGE INTERNATIONAL (P) LIMITED, PUBLISHERS

4835/24, Ansari Road, Daryaganj, New Delhi - 110002

Visit us at www.newagepublishers.com

ISBN (13) : 978-81-224-2705-9

Dedicated

To

Beloved Friends

PREFACE

This book is intended for Information Technology (IT) professionals who have been hearing about data warehousing technologies or have been tasked with evaluating, learning or implementing them. It also aims to provide the fundamental techniques of KDD and Data Mining, and to address issues in the practical use of mining tools.

Far from being just a passing fad, data warehousing technology has grown much in scale and reputation in the past few years, as evidenced by the increasing number of products, vendors, organizations, and yes, even books, devoted to the subject. Enterprises that have successfully implemented data warehouses find them strategic and often wonder how they ever managed to survive without them. Knowledge Discovery and Data Mining (KDD) has likewise emerged as a rapidly growing interdisciplinary field that merges databases, statistics, machine learning and related areas in order to extract valuable information and knowledge from large volumes of data.

Volume-I is intended for IT professionals who have been tasked with planning, managing, designing, implementing, supporting, maintaining and analyzing the organization’s data warehouse.

The first section introduces the Enterprise Architecture and Data Warehouse concepts, which form the basis of the reasons for writing this book.

The second section focuses on three of the key People in any data warehousing initiative: the Project Sponsor, the CIO, and the Project Manager. This section is devoted to addressing the primary concerns of these individuals.

The third section presents a Process for planning and implementing a data warehouse and provides guidelines that will prove extremely helpful for both first-time and experienced warehouse developers.

The fourth section focuses on the Technology aspect of data warehousing. It lends order to the dizzying array of technology components that you may use to build your data warehouse.

The fifth section opens a window to the future of data warehousing.

The sixth section deals with On-Line Analytical Processing (OLAP), describing the different features to consider when selecting tools from different vendors.

Volume-II shows how to achieve success in understanding and exploiting large databases by uncovering valuable information hidden in data; how to learn what data has real meaning and what data simply takes up space; how to examine which data methods and tools are most effective for practical needs; and how to analyze and evaluate the results obtained.

S. NAGABHUSHANA

ACKNOWLEDGEMENTS

My sincere thanks to Prof. P. Rama Murthy, Principal, Intell Engineering College, Anantapur, for his able guidance and valuable suggestions. In fact, it was he who drew my attention to the writing of this book. I am grateful to Smt. G. Hampamma, Lecturer in English, Intell Engineering College, Anantapur, and her whole family for their constant support and assistance while I was writing the book. Prof. Jeffrey D. Ullman, Department of Computer Science, Stanford University, U.S.A., deserves my special thanks for providing all the necessary resources. I am also thankful to Mr. R. Venkat, Senior Technical Associate at Virtusa, Hyderabad, for going through the script and encouraging me.

Last but not least, I thank Mr. Saumya Gupta, Managing Director, New Age International (P) Limited, Publishers, New Delhi, for his interest in the publication of the book.

CONTENTS

Preface

Acknowledgements

VOLUME I: DATA WAREHOUSING IMPLEMENTATION AND OLAP

PART I : INTRODUCTION

Chapter 1. The Enterprise IT Architecture

1.1 The Past: Evolution of Enterprise Architectures

1.2 The Present: The IT Professional’s Responsibility

1.3 Business Perspective

1.4 Technology Perspective

1.5 Architecture Migration Scenarios

1.6 Migration Strategy: How do We Move Forward?

Chapter 2. Data Warehouse Concepts

2.1 Gradual Changes in Computing Focus

2.2 Data Warehouse Characteristics and Definition

2.3 The Dynamic, Ad Hoc Report

2.4 The Purposes of a Data Warehouse

2.5 Data Marts

2.6 Operational Data Stores

2.7 Data Warehouse Cost-Benefit Analysis / Return on Investment

PART II : PEOPLE

Chapter 3. The Project Sponsor

3.1 How does a Data Warehouse Affect Decision-Making Processes?

3.2 How does a Data Warehouse Improve Financial Processes? Marketing? Operations?

3.3 When is a Data Warehouse Project Justified?

3.4 What Expenses are Involved?

3.5 What are the Risks?

3.6 Risk-Mitigating Approaches

3.7 Is Organization Ready for a Data Warehouse?

3.8 How the Results are Measured?

Chapter 4. The CIO

4.1 How is the Data Warehouse Supported?

4.2 How Does Data Warehouse Evolve?

4.3 Who should be Involved in a Data Warehouse Project?

4.4 What is the Team Structure Like?

4.5 What New Skills will People Need?

4.6 How Does Data Warehousing Fit into IT Architecture?

4.7 How Many Vendors are Needed to Talk to?

4.8 What should be Looked for in a Data Warehouse Vendor?

4.9 How Does Data Warehousing Affect Existing Systems?

4.10 Data Warehousing and its Impact on Other Enterprise Initiatives

4.11 When is a Data Warehouse not Appropriate?

4.12 How to Manage or Control a Data Warehouse Initiative?

Chapter 5. The Project Manager

5.1 How to Roll Out a Data Warehouse Initiative?

5.2 How Important is the Hardware Platform?

5.3 What are the Technologies Involved?

5.4 Are the Relational Databases Still Used for Data Warehousing?

5.5 How Long Does a Data Warehousing Project Last?

5.6 How is a Data Warehouse Different from Other IT Projects?

5.7 What are the Critical Success Factors of a Data Warehousing Project?

PART III : PROCESS

Chapter 6. Warehousing Strategy

6.1 Strategy Components

6.2 Determine Organizational Context

6.3 Conduct Preliminary Survey of Requirements

6.4 Conduct Preliminary Source System Audit

6.5 Identify External Data Sources (If Applicable)

6.6 Define Warehouse Rollouts (Phased Implementation)

6.7 Define Preliminary Data Warehouse Architecture

6.8 Evaluate Development and Production Environment and Tools

Chapter 7. Warehouse Management and Support Processes

7.1 Define Issue Tracking and Resolution Process

7.2 Perform Capacity Planning

7.3 Define Warehouse Purging Rules

7.4 Define Security Management

7.5 Define Backup and Recovery Strategy

7.6 Set Up Collection of Warehouse Usage Statistics

Chapter 8. Data Warehouse Planning

8.1 Assemble and Orient Team

8.2 Conduct Decisional Requirements Analysis

8.3 Conduct Decisional Source System Audit

8.4 Design Logical and Physical Warehouse Schema

8.5 Produce Source-to-Target Field Mapping

8.6 Select Development and Production Environment and Tools

8.7 Create Prototype for this Rollout

8.8 Create Implementation Plan of this Rollout

8.9 Warehouse Planning Tips and Caveats

Chapter 9. Data Warehouse Implementation

9.1 Acquire and Set Up Development Environment

9.2 Obtain Copies of Operational Tables

9.3 Finalize Physical Warehouse Schema Design

9.4 Build or Configure Extraction and Transformation Subsystems

9.5 Build or Configure Data Quality Subsystem

9.6 Build Warehouse Load Subsystem

9.7 Set Up Warehouse Metadata

9.8 Set Up Data Access and Retrieval Tools

9.9 Perform the Production Warehouse Load

9.10 Conduct User Training

9.11 Conduct User Testing and Acceptance

PART IV : TECHNOLOGY

Chapter 10. Hardware and Operating Systems

10.1 Parallel Hardware Technology

10.2 The Data Partitioning Issue

10.3 Hardware Selection Criteria

Chapter 11. Warehousing Software

11.1 Middleware and Connectivity Tools

11.2 Extraction Tools

11.3 Transformation Tools

11.4 Data Quality Tools

11.5 Data Loaders

11.6 Database Management Systems

11.7 Metadata Repository

11.8 Data Access and Retrieval Tools

11.9 Data Modeling Tools

11.10 Warehouse Management Tools

11.11 Source Systems

Chapter 12. Warehouse Schema Design

12.1 OLTP Systems Use Normalized Data Structures

12.2 Dimensional Modeling for Decisional Systems

12.3 Star Schema

12.4 Dimensional Hierarchies and Hierarchical Drilling

12.5 The Granularity of the Fact Table
