
Christopher Adamson

Mastering Data Warehouse Aggregates

Solutions for Star Schema Performance

01_777099 ffirs.qxp 6/2/06 3:42 PM Page iii

Mastering Data Warehouse Aggregates: Solutions for Star Schema Performance

Published by

Wiley Publishing, Inc.

10475 Crosspoint Boulevard

Indianapolis, IN 46256

www.wiley.com

Copyright © 2006 by Wiley Publishing, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN-13: 978-0-471-77709-0

ISBN-10: 0-471-77709-9

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

1MA/SQ/QW/QW/IN

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware that Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services or to obtain technical support, please contact our Customer Care Department within the U.S. at (800) 762-2974, outside the U.S. at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data

Adamson, Christopher, 1967–

Mastering data warehouse aggregates: solutions for star schema performance / Christopher Adamson.

p. cm.

Includes index.

ISBN-13: 978-0-471-77709-0 (pbk.)

ISBN-10: 0-471-77709-9 (pbk.)

1. Data warehousing. I. Title.

QA76.9.D37A333 2006

005.74—dc22

2006011219

Trademarks: Wiley, the Wiley logo, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

For Wayne H. Adamson

1929–2003

Through those whose lives you touched,

your spirit of love endures.

About the Author

Christopher Adamson is a data warehousing consultant and founder of Oakton Software LLC. An expert in star schema design, he has managed and executed data warehouse implementations in a variety of industries. His customers have included Fortune 500 companies, large and small businesses, government agencies, and data warehousing tool vendors. Mr. Adamson also teaches dimensional modeling and is a co-author of Data Warehouse Design Solutions (also from Wiley). He can be contacted through his website, www.ChrisAdamson.net.

Credits

Executive Editor
Robert Elliott

Development Editor
Brian Herrmann

Technical Editor
Jim Hadley

Copy Editor
Nancy Rapoport

Editorial Manager
Mary Beth Wakefield

Production Manager
Tim Tate

Vice President and Executive Group Publisher
Richard Swadley

Vice President and Executive Publisher
Joseph B. Wikert

Project Coordinator
Michael Kruzil

Graphics and Production Specialists
Jennifer Click
Denny Hager
Stephanie D. Jumper
Heather Ryan

Quality Control Technicians
John Greenough
Brian H. Walls

Proofreading and Indexing
Techbooks

Contents

Foreword
Acknowledgments
Introduction

Chapter 1 Fundamentals of Aggregates
Star Schema Basics
Operational Systems and the Data Warehouse
Operational Systems
Data Warehouse Systems
Facts and Dimensions
The Star Schema
Dimension Tables and Surrogate Keys
Fact Tables and Grain
Using the Star Schema
Multiple Stars and Conformance
Data Warehouse Architecture
Invisible Aggregates
Improving Performance
The Base Schema and the Aggregate Schema
The Aggregate Navigator
Principles of Aggregation
Providing the Same Results
The Same Facts and Dimension Attributes as the Base Schema
Other Types of Summarization
Pre-Joined Aggregates
Derived Tables
Tables with New Facts
Summary

Chapter 2 Choosing Aggregates
What Is a Potential Aggregate?
Aggregate Fact Tables: A Question of Grain
Aggregate Dimensions Must Conform
Pre-Joined Aggregates Have Grain Too
Enumerating Potential Aggregates
Identifying Potentially Useful Aggregates
Drawing on Initial Design
Design Decisions
Listening to Users
Where Subject Areas Meet
The Conformance Bus
Aggregates for Drilling Across
Query Patterns of an Existing System
Analyzing Reports for Potential Aggregates
Choosing Which Reports to Analyze
Assessing the Value of Potential Aggregates
Number of Aggregates
Presence of an Aggregate Navigator
Space Consumed by Aggregate Tables
How Many Rows Are Summarized
Examining the Number of Rows Summarized
The Cardinality Trap and Sparsity
Who Will Benefit from the Aggregate
Summary

Chapter 3 Designing Aggregates
The Base Schema
Identification of Grain
When Grain Is Forgotten
Grain and Aggregates
Conformance Bus
Rollup Dimensions
Aggregation Points
Natural Keys
Source Mapping
Slow Change Processing
Hierarchies
Housekeeping Columns
Design Principles for the Aggregate Schema
A Separate Star for Each Aggregation
Single Schema and the Level Field
Drawbacks to the Single Schema Approach
Advantages of Separate Tables
Pre-Joined Aggregates
Naming Conventions
Naming the Attributes
Naming Aggregate Tables
Aggregate Dimension Design
Attributes of Aggregate Dimensions
Sourcing Aggregate Dimensions
Shared Dimensions
Aggregate Fact Table Design
Aggregate Facts: Names and Data Types
No New Facts, Including Counts
Degenerate Dimensions
Audit Dimension
Sourcing Aggregate Fact Tables
Pre-Joined Aggregate Design
Documenting the Aggregate Schema
Identify Schema Families
Identify Dimensional Conformance
Documenting Aggregate Dimension Tables
Documenting Aggregate Fact Tables
Pre-Joined Aggregates
Materialized Views and Materialized Query Tables
Summary

Chapter 4 Using Aggregates
Which Tables to Use?
The Schema Design
Relative Size
Aggregate Portfolio and Availability
Requirements for the Aggregate Navigator
Why an Aggregate Navigator?
Two Views and Query Rewrite
Dynamic Availability
Multiple Front Ends
Multiple Back Ends
Evaluating Aggregate Navigators
Front-End Aggregate Navigators
Approach
Pros and Cons
