Data Analysis Machine Learning and Applications Episode 3 Part 8 doc
692 Palumbo et al.
In the early 1980s, Tanaka proposed the first fuzzy linear regression model, building
on fuzzy set theory and possibility theory (Tanaka et al., 1980). The functional
relation between dependent and independent variables is represented as a fuzzy linear
function whose parameters are given by fuzzy numbers. This first
Fuzzy Possibilistic Regression (FPR) uses the following fuzzy linear model with
crisp input and fuzzy parameters:
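\[
\tilde{y}_n = \tilde{\beta}_0 + \tilde{\beta}_1 x_{n1} + \dots + \tilde{\beta}_p x_{np} + \dots + \tilde{\beta}_P x_{nP} \tag{4}
\]
where the parameters are symmetric triangular fuzzy numbers denoted by $\tilde{\beta}_p = (c_p; w_p)_L$, with $c_p$ and $w_p$ as the center and the spread, respectively.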
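To make the notation concrete, the following minimal sketch represents a symmetric triangular fuzzy number by its center and spread and evaluates the fuzzy output of model (4) for crisp inputs. The class name `SymTriFuzzy` and the function `fuzzy_linear_output` are illustrative names, not taken from the paper. The sketch assumes the usual membership function of a symmetric triangular number, $\max(0, 1 - |x - c|/w)$, and uses the fact that the output is again symmetric triangular, with center $\sum_p c_p x_{np}$ and spread $\sum_p w_p |x_{np}|$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymTriFuzzy:
    """Symmetric triangular fuzzy number (c; w)_L: center c, spread w >= 0."""
    c: float
    w: float

    def membership(self, x: float) -> float:
        # A zero spread degenerates to a crisp number.
        if self.w == 0:
            return 1.0 if x == self.c else 0.0
        return max(0.0, 1.0 - abs(x - self.c) / self.w)


def fuzzy_linear_output(params, x):
    """Fuzzy output of model (4) for crisp inputs x_1..x_P.

    params holds the fuzzy coefficients for p = 0..P; x_0 = 1 is prepended.
    """
    xs = [1.0] + list(x)
    center = sum(b.c * xi for b, xi in zip(params, xs))
    spread = sum(b.w * abs(xi) for b, xi in zip(params, xs))
    return SymTriFuzzy(center, spread)


# Illustrative coefficients (made-up values): intercept (1; 0.5), slope (2; 0.25).
y_tilde = fuzzy_linear_output([SymTriFuzzy(1.0, 0.5), SymTriFuzzy(2.0, 0.25)], [-2.0])
```

The absolute value in the spread is what keeps the output spread non-negative when an input is negative.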
Unlike statistical regression, the deviations between the data and the linear model
are assumed to depend on the vagueness of the parameters and not on measurement
errors. The basic idea of Tanaka’s approach is to minimize the uncertainty of the
estimates by minimizing the total spread of the fuzzy coefficients. Spread minimization must be pursued under the constraint that the whole given data
set is included, to a degree of belief $\alpha$ ($0 < \alpha < 1$) defined by the decision maker.
The estimation problem is solved via a mathematical programming approach, where
the objective function minimizes the spread parameters, and the constraints
guarantee that the observed data fall inside the fuzzy interval:
\[
\text{minimize} \quad \sum_{n=1}^{N} \sum_{p=0}^{P} w_p |x_{np}| \tag{5}
\]
subject to the following constraints:
\[
\begin{aligned}
\Big[ c_0 + \sum_{p=1}^{P} c_p x_{np} \Big] + (1-\alpha) \Big[ w_0 + \sum_{p=1}^{P} w_p |x_{np}| \Big] &\ge y_n \\
\Big[ c_0 + \sum_{p=1}^{P} c_p x_{np} \Big] - (1-\alpha) \Big[ w_0 + \sum_{p=1}^{P} w_p |x_{np}| \Big] &\le y_n
\end{aligned}
\]
where $x_{n0} = 1$ ($n = 1,\dots,N$), $w_p \ge 0$ and $c_p \in \mathbb{R}$ ($p = 1,\dots,P$).
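The program above is linear in the centers and spreads, so it can be handed to any LP solver. The following is a minimal sketch, not the authors' implementation: it assumes NumPy and SciPy are available and uses `scipy.optimize.linprog`. The function name `fit_fpr` and the toy data are illustrative. The decision vector stacks the centers $c_0,\dots,c_P$ and the spreads $w_0,\dots,w_P$, with the non-negativity bound applied to every spread, including $w_0$.

```python
import numpy as np
from scipy.optimize import linprog


def fit_fpr(X, y, alpha=0.5):
    """Sketch of Tanaka-style fuzzy possibilistic regression as an LP.

    Returns (centers, spreads), each of length P + 1 (intercept first).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n_obs = X.shape[0]
    X1 = np.hstack([np.ones((n_obs, 1)), X])  # prepend x_n0 = 1
    A = np.abs(X1)
    K = X1.shape[1]                           # P + 1 coefficients
    k = 1.0 - alpha

    # Objective (5): centers cost nothing, spread w_p costs sum_n |x_np|.
    cost = np.concatenate([np.zeros(K), A.sum(axis=0)])

    # Inclusion constraints rewritten in the form A_ub @ z <= b_ub.
    A_ub = np.vstack([
        np.hstack([-X1, -k * A]),  # center + (1 - alpha) * spread >= y_n
        np.hstack([X1, -k * A]),   # center - (1 - alpha) * spread <= y_n
    ])
    b_ub = np.concatenate([-y, y])

    # Centers are free, spreads are non-negative.
    bounds = [(None, None)] * K + [(0, None)] * K
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:K], res.x[K:]


# Toy data (made-up values) for a single regressor.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
centers, spreads = fit_fpr(x, y, alpha=0.5)
```

For each observation, the interval center $\pm (1-\alpha)$ spread is the $\alpha$-cut of the triangular fuzzy output, so the two constraint blocks require every $y_n$ to have membership of at least $\alpha$ in its own fuzzy prediction.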
2.2 The F-PLSPM algorithm
The F-PLSPM follows the component-based approach SEM-PLS, also known as PLS Path Modeling (PLS-PM) (Tenenhaus et al., 2005). The reason is that
fuzzy regression and PLS path modeling share several characteristics: both are
soft modeling, data-oriented approaches.
Specifically, fuzzy regression joins PLS-PM in its final step, allowing for a fuzzy
structural model (see Figure 1) while the measurement model remains crisp. This connection
implies a two-stage estimation procedure:
• stage 1: latent variables are estimated according to the PLS-PM estimation procedure (Wold, 1982);
Fuzzy PLS Path Modeling 693
Fig. 1. Fuzzy path model representation
• stage 2: FPR is performed on the estimated latent variables, so that the following
fuzzy structural model is obtained:
\[
\xi_h = \tilde{\beta}_{h0} + \sum_{h'} \tilde{\beta}_{hh'} \xi_{h'} \tag{6}
\]
where $\tilde{\beta}_{hh'}$ refers to the generic fuzzy path coefficient, $\xi_h$ and $\xi_{h'}$ are adjacent
latent variables and $h, h' \in [1,\dots,H]$ vary according to the model complexity.
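The two-stage logic can be sketched as follows. This is only an illustration under strong assumptions. Stage 1 is replaced by a crude stand-in (the standardized mean of each block's standardized indicators), which is not the PLS-PM algorithm. The data are random placeholders. The single path from an odour score to a flavour score is a hypothetical structure chosen for the example. The names `block_score` and `fuzzy_path` are illustrative, and the LP is solved with `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def block_score(block):
    """Stand-in for stage 1 (NOT PLS-PM): standardized mean of standardized indicators."""
    z = (block - block.mean(axis=0)) / block.std(axis=0)
    s = z.mean(axis=1)
    return (s - s.mean()) / s.std()


def fuzzy_path(xi_pred, xi_resp, alpha=0.5):
    """Stage 2: FPR of one latent score on its predecessor scores.

    Returns (centers, spreads) of the fuzzy path coefficients, intercept first.
    """
    n_obs = xi_resp.shape[0]
    X1 = np.column_stack([np.ones(n_obs), xi_pred])
    A = np.abs(X1)
    K = X1.shape[1]
    k = 1.0 - alpha
    cost = np.concatenate([np.zeros(K), A.sum(axis=0)])
    A_ub = np.vstack([np.hstack([-X1, -k * A]), np.hstack([X1, -k * A])])
    b_ub = np.concatenate([-xi_resp, xi_resp])
    bounds = [(None, None)] * K + [(0, None)] * K  # free centers, spreads >= 0
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:K], res.x[K:]


# Random placeholder blocks: 14 rows, 4 odour and 8 flavour indicators.
rng = np.random.default_rng(0)
odour_block = rng.normal(size=(14, 4))
flavour_block = (0.8 * odour_block.mean(axis=1, keepdims=True)
                 + rng.normal(scale=0.3, size=(14, 8)))

xi_odour = block_score(odour_block)      # stage 1 stand-in
xi_flavour = block_score(flavour_block)  # stage 1 stand-in
centers, spreads = fuzzy_path(xi_odour, xi_flavour, alpha=0.5)  # stage 2
```

The point of the sketch is the division of labour: the latent scores are crisp numbers by the time FPR is applied, and all of the imprecision is carried by the spreads of the fitted path coefficients.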
It is worth noting that the structural model resulting from this procedure differs from
the traditional structural model. Here the path coefficients are fuzzy numbers and there is no error term, as a natural consequence of FPR. In the analysis of
a statistical model one should always, in one way or another, take into account the
goodness of fit, above all when comparing different models. The proposal is therefore to use
FPR. Estimating fuzzy parameters, instead of single-valued (crisp) parameters, makes it possible to capture both the structural and the residual information. Embedding the residual in the model via fuzzy parameters (Tanaka and Guo,
1999) makes it possible to evaluate the differences between assessors (panel performance) as
well as the reproducibility of each assessor (assessor performance) (Romano and
Palumbo, 2006b).
3 Application
The data set comes from sensory profiling of 14 cheese samples by a panel of 12
assessors on the basis of twelve attributes in two replicates.
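As a rough illustration of this design, the sketch below builds a placeholder four-way array (assessors × samples × replicates × attributes), stacks it into a long matrix with one row per assessor, sample and replicate, and averages over assessors and replicates to obtain a samples × attributes table. The values are random placeholders, not the cheese data, and the 1 to 9 scoring range is an assumption made only for the example.

```python
import numpy as np

n_assessors, n_samples, n_replicates, n_attributes = 12, 14, 2, 12

# Placeholder scores; the real data are not reproduced here.
rng = np.random.default_rng(0)
scores = rng.uniform(
    1.0, 9.0, size=(n_assessors, n_samples, n_replicates, n_attributes)
)

# One row per (assessor, sample, replicate) combination, one column per attribute.
long_matrix = scores.reshape(-1, n_attributes)

# Average over assessors (axis 0) and replicates (axis 2): samples x attributes.
sample_means = scores.mean(axis=(0, 2))

print(long_matrix.shape, sample_means.shape)
```

Keeping the four-way array alongside the stacked matrix makes it easy to switch between the averaged view and assessor-level views of the same scores.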
The final data matrix consists of 336 rows (12 assessors × 14 samples × 2 replicates) and 12 columns (attributes: intensity odour, acidic odour, sun odour, rancid
odour, intensity flavour, acidic flavour, sweet flavour, salty flavour, bitter flavour, sun
flavour, metallic flavour, rancid flavour). Two blocks of variables describe the latent
variables odour and flavour. First, the hierarchical PLS model proposed by Tenenhaus and Vinzi (2005) will be used to estimate a global model after averaging over
the assessors and the replicates (see Figure 2), thus collapsing the data structure
into a two-way table (samples × attributes). Then fuzzy PLS path modeling will