Data Analysis Machine Learning and Applications Episode 3 Part 8 doc

692 Palumbo et al.

In the early 1980s, Tanaka proposed the first fuzzy linear regression model, building on fuzzy set theory and possibility theory (Tanaka et al., 1980). The functional relation between dependent and independent variables is represented as a fuzzy linear function whose parameters are given by fuzzy numbers. This first Fuzzy Possibilistic Regression (FPR) uses the following fuzzy linear model with crisp input and fuzzy parameters:

\[
\tilde{y}_n = \tilde{\beta}_0 + \tilde{\beta}_1 x_{n1} + \dots + \tilde{\beta}_p x_{np} + \dots + \tilde{\beta}_P x_{nP} \tag{4}
\]

where the parameters are symmetric triangular fuzzy numbers denoted by $\tilde{\beta}_p = (c_p; w_p)_L$, with $c_p$ and $w_p$ as the center and the spread, respectively.
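To make the notation concrete, the following minimal sketch evaluates a fuzzy linear model with crisp inputs. It assumes the standard triangular membership function for a symmetric fuzzy number with center c and spread w, and the usual result that the output has center equal to the sum of c_p x_p and spread equal to the sum of w_p |x_p|. The function names and the numeric coefficients are illustrative assumptions, not taken from the paper.

```python
def membership(value, center, spread):
    """Membership degree of `value` in a symmetric triangular fuzzy number.

    Assumes the standard triangular shape: 1 at the center, decreasing
    linearly to 0 at center +/- spread.
    """
    if spread == 0:
        return 1.0 if value == center else 0.0
    return max(0.0, 1.0 - abs(value - center) / spread)


def fuzzy_predict(coeffs, x):
    """Center and spread of the fuzzy output for crisp inputs.

    coeffs: list of (c_p, w_p) pairs for p = 0..P (intercept first).
    x:      crisp inputs x_1..x_P; x_0 = 1 is prepended here.
    """
    xs = [1.0] + [float(v) for v in x]
    center = sum(c * v for (c, _), v in zip(coeffs, xs))
    spread = sum(w * abs(v) for (_, w), v in zip(coeffs, xs))
    return center, spread


# Hypothetical coefficients: intercept (1.0; 0.5) and one slope (2.0; 0.25).
coeffs = [(1.0, 0.5), (2.0, 0.25)]
center, spread = fuzzy_predict(coeffs, [-2.0])
print(center, spread)
print(membership(-2.5, center, spread))
```

Note that the spread uses the absolute value of each input, so a negative regressor widens the fuzzy output rather than narrowing it.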

Unlike statistical regression, the deviations between the data and the linear model are assumed to depend on the vagueness of the parameters and not on measurement errors. The basic idea of Tanaka's approach is to minimize the uncertainty of the estimates by minimizing the total spread of the fuzzy coefficients. Spread minimization must be pursued under the constraint that the whole data set is included, at a degree of belief $\alpha$ ($0 < \alpha < 1$) defined by the decision maker. The estimation problem is solved via a mathematical programming approach, where the objective function minimizes the spread parameters and the constraints guarantee that the observed data fall inside the fuzzy interval:

\[
\text{minimize} \quad \sum_{n=1}^{N} \sum_{p=0}^{P} w_p \, |x_{np}| \tag{5}
\]

subject to the following constraints:

\[
\begin{aligned}
\Big[ c_0 + \sum_{p=1}^{P} c_p x_{np} \Big] + (1-\alpha) \Big[ w_0 + \sum_{p=1}^{P} w_p |x_{np}| \Big] &\ge y_n \\
\Big[ c_0 + \sum_{p=1}^{P} c_p x_{np} \Big] - (1-\alpha) \Big[ w_0 + \sum_{p=1}^{P} w_p |x_{np}| \Big] &\le y_n
\end{aligned}
\]

where $x_{n0} = 1$ ($n = 1,\dots,N$), $w_p \ge 0$ and $c_p \in \mathbb{R}$ ($p = 0,\dots,P$).
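Problem (5) is a linear program in the centers and spreads. As a minimal sketch, the code below writes it in the generic form "minimize cost·z subject to A_ub·z ≤ b_ub", with z = [c_0..c_P, w_0..w_P]. The function names, the variable ordering and the toy data are assumptions made for illustration, and `alpha` stands for the degree of belief. No particular solver is assumed: the returned arrays could be passed to any LP routine that accepts this inequality form, with the bounds w_p ≥ 0 supplied separately.

```python
def build_fpr_lp(X, y, alpha):
    """Objective and inequality constraints of Tanaka's FPR as plain lists.

    X: crisp inputs, one row per observation, WITHOUT the intercept column.
    y: crisp responses.
    Variables are ordered z = [c_0..c_P, w_0..w_P].
    """
    rows = [[1.0] + [float(v) for v in x] for x in X]  # prepend x_n0 = 1
    k = len(rows[0])                                   # P + 1 coefficients
    # Objective: sum_n sum_p w_p |x_np|; the centers carry zero cost.
    cost = [0.0] * k + [sum(abs(r[p]) for r in rows) for p in range(k)]
    A_ub, b_ub = [], []
    for r, yn in zip(rows, y):
        half = [(1.0 - alpha) * abs(v) for v in r]
        # Upper bound >= y_n  <=>  -c.x - (1 - alpha) w.|x| <= -y_n
        A_ub.append([-v for v in r] + [-h for h in half])
        b_ub.append(-float(yn))
        # Lower bound <= y_n  <=>   c.x - (1 - alpha) w.|x| <=  y_n
        A_ub.append(list(r) + [-h for h in half])
        b_ub.append(float(yn))
    return cost, A_ub, b_ub


def is_feasible(z, A_ub, b_ub, tol=1e-9):
    """Check non-negative spreads and every inclusion constraint."""
    k = len(z) // 2
    if any(w < -tol for w in z[k:]):
        return False
    return all(
        sum(a * v for a, v in zip(row, z)) <= b + tol
        for row, b in zip(A_ub, b_ub)
    )


# Toy data (hypothetical): one regressor, three observations.
X, y = [[1], [2], [3]], [2, 3, 5]
cost, A_ub, b_ub = build_fpr_lp(X, y, alpha=0.5)
print(is_feasible([0.0, 1.5, 1.0, 0.0], A_ub, b_ub))  # spread wide enough
print(is_feasible([0.0, 1.5, 0.5, 0.0], A_ub, b_ub))  # spread too narrow
```

Each observation contributes two rows, one per side of the fuzzy interval, so N observations give 2N inequality constraints.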

2.2 The F-PLSPM algorithm

The F-PLSPM follows the component-based approach SEM-PLS, also known as PLS Path Modeling (PLS-PM) (Tenenhaus et al., 2005). The reason is that fuzzy regression and PLS path modeling share several characteristics: both are soft modeling, data-oriented approaches.

Specifically, fuzzy regression joins PLS-PM in its final step, allowing for a fuzzy structural model (see Figure 1) while the measurement model remains crisp. This connection implies a two-stage estimation procedure:

• stage 1: latent variables are estimated according to the PLS-PM estimation procedure (Wold, 1982);

Fuzzy PLS Path Modeling 693

Fig. 1. Fuzzy path model representation

• stage 2: FPR is performed on the estimated latent variables, so that the following fuzzy structural model is obtained:

\[
\xi_h = \tilde{\beta}_{h0} + \sum_{h'} \tilde{\beta}_{hh'} \, \xi_{h'} \tag{6}
\]

where $\tilde{\beta}_{hh'}$ refers to the generic fuzzy path coefficient, $\xi_h$ and $\xi_{h'}$ are adjacent latent variables, and $h, h' \in [1,\dots,H]$ vary according to the model complexity.

It is worth noting that the structural model obtained from this procedure differs from the traditional structural model. Here the path coefficients are fuzzy numbers and there is no error term, as a natural consequence of FPR. In the analysis of a statistical model one should always, in one way or another, take into account the goodness of fit, above all when comparing different models. The proposal is therefore to use FPR. Estimating fuzzy parameters, instead of single-valued (crisp) parameters, captures both the structural and the residual information. This ability to embed the residual in the model via fuzzy parameters (Tanaka and Guo, 1999) makes it possible to evaluate the differences between assessors (panel performance) as well as the reproducibility of each assessor (assessor performance) (Romano and Palumbo, 2006b).
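One way to see how the spreads absorb the residual information is a deliberately restricted case. Suppose, purely as an illustrative assumption, that the centers are fixed in advance and that only the intercept spread w_0 may be non-zero. The inclusion constraints of the FPR then reduce to (1 − alpha)·w_0 ≥ |y_n − ŷ_n| for every n, where `alpha` is the degree of belief, so the smallest admissible w_0 is the largest absolute residual divided by (1 − alpha). This is not the full FPR optimum, only a sketch of the mechanism; the function name and the toy data are hypothetical.

```python
def intercept_spread_for_fixed_centers(centers, X, y, alpha):
    """Smallest intercept spread w_0 covering all data at level alpha.

    Assumes the centers [c_0, c_1, ..., c_P] are given and that every
    other spread w_p (p >= 1) is zero. X has no intercept column.
    """
    c0, slopes = centers[0], centers[1:]
    residuals = [
        yn - (c0 + sum(c * v for c, v in zip(slopes, x)))
        for x, yn in zip(X, y)
    ]
    return max(abs(r) for r in residuals) / (1.0 - alpha)


# Toy data (hypothetical): the line 1.5 * x misses two points by 0.5.
X, y = [[1], [2], [3]], [2, 3, 5]
print(intercept_spread_for_fixed_centers([0.0, 1.5], X, y, alpha=0.5))
```

A larger degree of belief inflates the spread for the same residuals, and a poorer fit (larger residuals) does the same. In this sense the width of the fuzzy coefficients plays the role that the error term plays in a crisp model.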

3 Application

The data set comes from the sensory profiling of 14 cheese samples by a panel of 12 assessors on the basis of 12 attributes in two replicates.
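The layout of such a design can be sketched as follows. The code enumerates every assessor × sample × replicate combination and averages the 12 attribute scores per sample over assessors and replicates, which yields a samples × attributes table. The scores are generated by a hypothetical placeholder function, because the real measurements are not reproduced here. All names are assumptions for illustration.

```python
from collections import defaultdict
from itertools import product

ASSESSORS, SAMPLES, REPLICATES, ATTRIBUTES = 12, 14, 2, 12


def placeholder_score(assessor, sample, replicate, attribute):
    # Hypothetical stand-in for a sensory score; NOT data from the study.
    return float((assessor + 2 * sample + replicate + attribute) % 10)


def build_records():
    """One record per (assessor, sample, replicate) with 12 scores each."""
    return [
        (a, s, r, [placeholder_score(a, s, r, j) for j in range(ATTRIBUTES)])
        for a, s, r in product(
            range(ASSESSORS), range(SAMPLES), range(REPLICATES)
        )
    ]


def average_by_sample(records):
    """Average each attribute over assessors and replicates, per sample."""
    sums = defaultdict(lambda: [0.0] * ATTRIBUTES)
    counts = defaultdict(int)
    for _, sample, _, scores in records:
        for j, value in enumerate(scores):
            sums[sample][j] += value
        counts[sample] += 1
    return {s: [v / counts[s] for v in sums[s]] for s in sums}


records = build_records()
table = average_by_sample(records)
print(len(records), len(table), len(table[0]))  # prints "336 14 12"
```

Each row of the averaged table summarizes 24 evaluations (12 assessors × 2 replicates), so the assessor-level variability is no longer visible in it.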

The final data matrix consists of 336 rows (12 assessors × 14 samples × 2 replicates) and 12 columns (attributes: intensity odour, acidic odour, sun odour, rancid odour, intensity flavour, acidic flavour, sweet flavour, salty flavour, bitter flavour, sun flavour, metallic flavour, rancid flavour). Two blocks of variables describe the latent variables odour and flavour. First, the hierarchical PLS model proposed by Tenenhaus and Vinzi (2005) will be used to estimate a global model after averaging over the assessors and the replicates (see Figure 2), thus collapsing the data structure into a two-way table (samples × attributes). Then fuzzy PLS path modeling will
