Big Data Analytics: Optimization and Randomization

Tianbao Yang†, Qihang Lin♮, Rong Jin∗

Tutorial@SIGKDD 2015

Sydney, Australia

†Department of Computer Science, The University of Iowa, IA, USA

♮Department of Management Sciences, The University of Iowa, IA, USA

∗Department of Computer Science and Engineering, Michigan State University, MI, USA

Institute of Data Science and Technologies at Alibaba Group, Seattle, USA

August 10, 2015

Yang, Lin, Jin Tutorial for KDD’15 August 10, 2015 1 / 234

URL

http://www.cs.uiowa.edu/~tyng/kdd15-tutorial.pdf

Some Claims

No

This tutorial is not an exhaustive literature survey

It is not a survey on different machine learning/data mining algorithms

Yes

It is about how to efficiently solve machine learning/data mining (formulated as optimization) problems for big data

Outline

Part I: Basics

Part II: Optimization

Part III: Randomization

Big Data Analytics: Optimization and Randomization

Part I: Basics

Basics Introduction

Outline

1 Basics

Introduction

Notations and Definitions

Three Steps for Machine Learning

Data, Model, Optimization

[Figure: distance to optimal objective versus iterations, for convergence rates 0.5^T, 1/T^2, and 1/T]
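To get a feel for the three rates named in the figure's legend (0.5^T, 1/T^2, 1/T), the sketch below counts how many iterations each rate needs before the distance to the optimal objective drops below a tolerance. The helper name `iterations_to_reach` and the tolerance 1e-3 are illustrative assumptions, not taken from the slides.

```python
# Rates read off the figure legend; T is the iteration count.
RATES = {
    "0.5^T": lambda T: 0.5 ** T,       # linear (geometric) convergence
    "1/T^2": lambda T: 1.0 / T ** 2,   # sublinear, accelerated
    "1/T": lambda T: 1.0 / T,          # sublinear
}


def iterations_to_reach(rate, eps, max_iter=10**7):
    """Return the smallest T >= 1 with rate(T) <= eps (hypothetical helper)."""
    for T in range(1, max_iter + 1):
        if rate(T) <= eps:
            return T
    raise ValueError("eps not reached within max_iter")


# Iterations needed by each rate to reach a tolerance of 1e-3.
for name, rate in RATES.items():
    print(name, iterations_to_reach(rate, 1e-3))
```

The geometric rate reaches the tolerance in a handful of iterations, while the 1/T rate needs a number of iterations on the order of 1/eps.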

Big Data Challenge

Big Data

Big Data Challenge

Big Model

60 million parameters

Learning as Optimization

Ridge Regression Problem:

$$\min_{w \in \mathbb{R}^d} \; \underbrace{\frac{1}{n}\sum_{i=1}^{n} (y_i - w^\top x_i)^2}_{\text{Empirical Loss}} + \underbrace{\frac{\lambda}{2}\|w\|_2^2}_{\text{Regularization}}$$

$x_i \in \mathbb{R}^d$: $d$-dimensional feature vector

$y_i \in \mathbb{R}$: target variable

$w \in \mathbb{R}^d$: model parameters

$n$: number of data points
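As a concrete illustration of the ridge regression objective, here is a minimal NumPy sketch that evaluates the objective and solves it in closed form. Setting the gradient (2/n) X^T (Xw - y) + λw to zero gives the linear system ((2/n) X^T X + λI) w = (2/n) X^T y. The function names, the synthetic data, and the value λ = 0.1 are assumptions made for the example; the slides do not prescribe a solver here.

```python
import numpy as np


def ridge_objective(w, X, y, lam):
    """(1/n) * sum_i (y_i - w^T x_i)^2 + (lam/2) * ||w||_2^2."""
    n = X.shape[0]
    residual = y - X @ w
    return residual @ residual / n + 0.5 * lam * (w @ w)


def ridge_gradient(w, X, y, lam):
    """Gradient of the objective above with respect to w."""
    n = X.shape[0]
    return (2.0 / n) * (X.T @ (X @ w - y)) + lam * w


def ridge_closed_form(X, y, lam):
    """Solve ((2/n) X^T X + lam I) w = (2/n) X^T y for w."""
    n, d = X.shape
    A = (2.0 / n) * (X.T @ X) + lam * np.eye(d)
    b = (2.0 / n) * (X.T @ y)
    return np.linalg.solve(A, b)


# Synthetic data (illustrative only): n = 50 points, d = 5 features.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
w_true = np.array([1.0, -2.0, 0.0, 0.5, 3.0])
y = X @ w_true + 0.1 * rng.standard_normal(50)

w_hat = ridge_closed_form(X, y, lam=0.1)
print("objective at w_hat:", ridge_objective(w_hat, X, y, 0.1))
print("gradient norm at w_hat:", np.linalg.norm(ridge_gradient(w_hat, X, y, 0.1)))
```

The closed form costs roughly O(nd^2 + d^3) time, which is one reason iterative methods become attractive when n or d is large.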

Learning as Optimization

Classification Problems:

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n}\sum_{i=1}^{n} \ell(y_i w^\top x_i) + \frac{\lambda}{2}\|w\|_2^2$$

$y_i \in \{+1, -1\}$: label

Loss function $\ell(z)$, where $z = y w^\top x$:

1. SVMs: (squared) hinge loss $\ell(z) = \max(0, 1 - z)^p$, where $p = 1, 2$

2. Logistic Regression: $\ell(z) = \log(1 + \exp(-z))$
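The two loss functions above can be written as functions of the margin z = y w^T x. The following is a minimal NumPy sketch; the function names are hypothetical, and the logistic loss uses `np.logaddexp` for numerical stability rather than evaluating log(1 + exp(-z)) literally.

```python
import numpy as np


def hinge_loss(z, p=1):
    """(Squared) hinge loss max(0, 1 - z)^p for p in {1, 2}."""
    if p not in (1, 2):
        raise ValueError("p must be 1 (hinge) or 2 (squared hinge)")
    return np.maximum(0.0, 1.0 - z) ** p


def logistic_loss(z):
    """log(1 + exp(-z)), computed as logaddexp(0, -z) to avoid overflow."""
    return np.logaddexp(0.0, -z)


def classification_objective(w, X, y, lam, loss):
    """(1/n) * sum_i loss(y_i * w^T x_i) + (lam/2) * ||w||_2^2."""
    z = y * (X @ w)  # margins, one per data point
    return loss(z).mean() + 0.5 * lam * (w @ w)


margins = np.array([2.0, 0.0, -1.0])
print(hinge_loss(margins))        # hinge, p = 1
print(hinge_loss(margins, p=2))   # squared hinge, p = 2
print(logistic_loss(margins))
```

Both losses penalize small or negative margins; the hinge loss is exactly zero once z >= 1, while the logistic loss is positive everywhere and decays smoothly.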

Learning as Optimization

Feature Selection:

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n}\sum_{i=1}^{n} \ell(w^\top x_i, y_i) + \lambda\|w\|_1$$

$\ell_1$ regularization: $\|w\|_1 = \sum_{i=1}^{d} |w_i|$

$\lambda$ controls sparsity level
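To see how λ controls sparsity, consider a simplified special case that is an assumption of this example, not the general problem on the slide: minimizing (1/2)||w - v||_2^2 + λ||w||_1 for a fixed vector v. Its minimizer is the elementwise soft-thresholding of v, so every coordinate with |v_i| <= λ becomes exactly zero. The vector `v` and the λ values below are made up for illustration.

```python
import numpy as np


def l1_norm(w):
    """||w||_1 = sum_i |w_i|."""
    return np.abs(w).sum()


def soft_threshold(v, lam):
    """Elementwise minimizer of 0.5 * (w - v)^2 + lam * |w|."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


v = np.array([3.0, -0.5, 0.2, -2.0, 0.05])
for lam in (0.0, 0.1, 1.0, 2.5):
    w = soft_threshold(v, lam)
    # A larger lam zeroes out more coordinates.
    print(f"lam={lam}: nonzeros={np.count_nonzero(w)}, l1_norm={l1_norm(w):.2f}")
```

As λ grows, more coordinates are set exactly to zero, which is the sense in which ℓ1 regularization performs feature selection.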

Learning as Optimization

Feature Selection using Elastic Net:

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n}\sum_{i=1}^{n} \ell(w^\top x_i, y_i) + \lambda\left(\|w\|_1 + \gamma\|w\|_2^2\right)$$

Elastic net regularizer, more robust than the $\ell_1$ regularizer
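The elastic net regularizer λ(||w||_1 + γ||w||_2^2) can be evaluated directly. The sketch below is illustrative only: the function name and the sample values of w, λ, and γ are assumptions. Note that other libraries may parameterize the elastic net differently, so the constants here follow the slide's formula and nothing else.

```python
import numpy as np


def elastic_net_penalty(w, lam, gamma):
    """lam * (||w||_1 + gamma * ||w||_2^2), following the slide's formula."""
    return lam * (np.abs(w).sum() + gamma * (w @ w))


w = np.array([3.0, -4.0])          # ||w||_1 = 7, ||w||_2^2 = 25
print(elastic_net_penalty(w, lam=0.5, gamma=0.1))  # 0.5 * (7 + 0.1 * 25)
print(elastic_net_penalty(w, lam=0.5, gamma=0.0))  # gamma = 0 gives plain l1
```

Setting γ = 0 recovers the ℓ1 penalty from the previous slide, so γ measures how much squared ℓ2 regularization is mixed in.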
