Contents
Overview
Introducing Data Mining
Training a Data Mining Model
Building a Data Mining Model with OLAP Data
Browsing the Dependency Network
Lab A: Creating a Decision Tree with Relational Data
Review
Module 17: Introduction to Data Mining
BETA MATERIALS FOR MICROSOFT CERTIFIED TRAINER PREPARATION PURPOSES ONLY
Information in this document is subject to change without notice. The names of companies,
products, people, characters, and/or data mentioned herein are fictitious and are in no way intended
to represent any real individual, company, product, or event, unless otherwise noted. Complying
with all applicable copyright laws is the responsibility of the user. No part of this document may
be reproduced or transmitted in any form or by any means, electronic or mechanical, for any
purpose, without the express written permission of Microsoft Corporation. If, however, your only
means of access is electronic, permission to print one copy is hereby granted.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.
© 2000 Microsoft Corporation. All rights reserved.
Microsoft, BackOffice, MS-DOS, Windows, Windows NT, <plus other appropriate product
names or titles. Replace this example list with list of trademarks provided by copy editor.
Microsoft is listed first, followed by all other Microsoft trademarks in alphabetical order. > are
either registered trademarks or trademarks of Microsoft Corporation in the U.S.A. and/or other
countries.
<This is where mention of specific, contractually obligated to, third party trademarks, which are
added by the Copy Editor>
Other product and company names mentioned herein may be the trademarks of their respective
owners.
Instructor Notes
This module introduces students to data mining and explains how to build and
browse data mining models by using Microsoft® SQL Server™ 2000 Analysis
Services. Students will learn fundamental data mining terminology, concepts,
techniques, and algorithms.
This is an overview module that focuses on the use of built-in Analysis
Manager wizards. It is not intended to provide in-depth knowledge of data
mining.
After completing this module, students will be able to:
• Describe data mining characteristics, applications, and modeling techniques.
• Describe the process of training a model.
• Use the online analytical processing (OLAP) Mining Model Wizard to edit,
process, and explore the decision trees.
• Analyze relational data relationships in the dependency network browser.
• Describe the steps required to build a clustering model by using OLAP data.
Materials and Preparation
This section lists the required materials and preparation tasks that you need to
teach this module.
Required Materials
To teach this module, you need Microsoft PowerPoint® file 2074A_17.ppt.
Preparation Tasks
To prepare for this module, you should:
• Read all the materials for this module.
• Read the instructor notes and margin notes.
• Practice combining the lecture with the demonstrations.
• Complete the lab.
• Review the Trainer Preparation presentation for this module on the Trainer
Materials compact disc.
• Review any relevant white papers that are located on the Trainer Materials
compact disc.
Presentation: 40 Minutes
Lab: 20 Minutes
Demonstration: Determining Why Students Attend College
The following demonstration procedures provide information that will not fit
in the margin notes or is not appropriate for student notes.
In this demonstration, you will create a data mining model by using a decision
tree with relational data. Specifically, you will create a decision tree that
determines why students attend college.
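As a rough illustration of what a decision tree algorithm does when it picks an attribute to split on, the sketch below scores attributes by information gain, a common textbook criterion. It is not a description of how Microsoft Decision Trees scores splits internally. The column names (ParentEncouragement, HighIncome, PlansCollege) and the four cases are hypothetical and are not taken from the College Plans table.

```python
from collections import Counter
from math import log2

# Hypothetical toy cases for illustration only; NOT data from College Plans.
CASES = [
    {"ParentEncouragement": "yes", "HighIncome": "yes", "PlansCollege": "yes"},
    {"ParentEncouragement": "yes", "HighIncome": "no",  "PlansCollege": "yes"},
    {"ParentEncouragement": "no",  "HighIncome": "yes", "PlansCollege": "no"},
    {"ParentEncouragement": "no",  "HighIncome": "no",  "PlansCollege": "no"},
]


def entropy(cases, target):
    """Shannon entropy (in bits) of the target column over the given cases."""
    counts = Counter(case[target] for case in cases)
    total = len(cases)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(cases, attribute, target):
    """Reduction in target entropy obtained by splitting on one attribute."""
    total = len(cases)
    remainder = 0.0
    for value in {case[attribute] for case in cases}:
        subset = [case for case in cases if case[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(cases, target) - remainder


def best_split(cases, attributes, target):
    """Return the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(cases, a, target))


if __name__ == "__main__":
    attrs = ["ParentEncouragement", "HighIncome"]
    print(best_split(CASES, attrs, "PlansCollege"))
```

In this toy set, splitting on ParentEncouragement separates the cases perfectly, so it becomes the first split. A real model repeats the same choice recursively on each resulting subset.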
You will create a new OLAP database with a data source connecting to the
Module 17 relational database.
To create an OLAP database
1. In Analysis Manager, expand the Analysis Servers folder, right-click your
local server, and then click New Database.
2. Enter Module 17 as the database name, and then click OK.
3. Expand the Module 17 database, right-click the Data Sources folder, and
then click New Data Source.
4. On the Provider tab of the Data Link Properties dialog box, click
Microsoft OLE DB Provider for SQL Server. Click Next.
5. On the Connection tab, in step 1, type localhost as the server name.
6. In step 2, click Use Windows NT Integrated security.
7. In step 3, select Module 17 from the list of databases, and then click OK.
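The Data Link Properties choices above correspond to the fields of an OLE DB connection string. The sketch below assembles such a string for the same settings. The helper name build_connection_string is hypothetical; the keywords follow the usual SQLOLEDB provider conventions, and the exact string that Analysis Manager stores may differ.

```python
def build_connection_string(server, database, integrated_security=True):
    """Assemble an OLE DB connection string from the Data Link choices.

    Hypothetical helper for illustration; not part of Analysis Services.
    """
    parts = [
        ("Provider", "SQLOLEDB"),         # Microsoft OLE DB Provider for SQL Server
        ("Data Source", server),          # step 1: server name
        ("Initial Catalog", database),    # step 3: database on the server
    ]
    if integrated_security:
        # step 2: Use Windows NT Integrated security
        parts.append(("Integrated Security", "SSPI"))
    return ";".join(f"{key}={value}" for key, value in parts) + ";"


if __name__ == "__main__":
    print(build_connection_string("localhost", "Module 17"))
```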
To create the data mining model
In this procedure, you will create the data mining model by selecting the
source type, case table, data mining technique, and key column.
1. In the Module 17 database, right-click the Mining Models folder, and then
click New Mining Model.
2. At the welcome page, click Next.
3. From the Select source type step of the Mining Model Wizard, click
Relational data, and then click Next.
Point out that either relational tables or OLAP cubes can be used as source
data. For this model, you are accessing relational data.
4. From the Select case tables step, in the Available tables list, click College
Plans, and then click Next.
5. From the Select data mining technique step, in the Technique list, click
Microsoft Decision Trees, and then click Next.
Two algorithms ship with Analysis Services: Microsoft Decision Trees and
Microsoft Clustering. Use the Decision Trees algorithm for this
demonstration.
6. From the Select the key column step, in the Case key column list, click
StudentID, and then click Next.
Demonstration: 10 Minutes