Data Analysis Machine Learning and Applications Episode 1 Part 6 docx
152 Kurt Hornik and Walter Böhm

Table 2. Formation of a third class in the Euclidean consensus partitions for the Gordon-Vichi macroeconomic ensemble as a function of the weight ratio w between 3- and 2-class partitions in the ensemble.

w      Countries in the third class
1.5    India
2.0    India, Sudan
3.0    India, Sudan
4.5    India, Sudan, Bolivia, Indonesia
10.0   India, Sudan, Bolivia, Indonesia
12.5   India, Sudan, Bolivia, Indonesia, Egypt
∞      India, Sudan, Bolivia, Indonesia, Egypt

these, 85 female undergraduates at Rutgers University were asked to sort 15 English terms into classes “on the basis of some aspect of meaning”. There are at least three “axes” for classification: gender, generation, and direct versus indirect lineage. The Euclidean consensus partitions with Q = 3 classes put grandparents and grandchildren in one class and all indirect kin into another one. For Q = 4, {brother, sister} are separated from {father, mother, daughter, son}. Table 3 shows the memberships for a soft Euclidean consensus partition for Q = 5 based on 1000 replications of the AO algorithm.

Table 3. Memberships for the 5-class soft Euclidean consensus partition for the Rosenberg-Kim kinship terms data.

Term            1      2      3      4      5
grandfather     0.000  0.024  0.012  0.965  0.000
grandmother     0.005  0.134  0.016  0.840  0.005
granddaughter   0.113  0.242  0.054  0.466  0.125
grandson        0.134  0.111  0.052  0.581  0.122
brother         0.612  0.282  0.024  0.082  0.000
sister          0.579  0.391  0.026  0.002  0.002
father          0.099  0.546  0.122  0.158  0.075
mother          0.089  0.654  0.136  0.054  0.066
daughter        0.000  1.000  0.000  0.000  0.000
son             0.031  0.842  0.007  0.113  0.007
nephew          0.012  0.047  0.424  0.071  0.447
niece           0.000  0.129  0.435  0.000  0.435
cousin          0.080  0.056  0.656  0.033  0.174
aunt            0.000  0.071  0.929  0.000  0.000
uncle           0.000  0.000  0.882  0.071  0.047

Figure 1 shows the classes and margins for the 5-class solution. The memberships of ‘niece’ are tied between columns 3 and 5, and the margin of ‘nephew’ is very small (0.02), which suggests the 4-class solution as the optimal Euclidean consensus representation of the ensemble.
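The classes and margins discussed above can be recomputed directly from the memberships in Table 3. The following is a minimal Python sketch, not the authors' code (the paper itself works with the R package clue). The names `memberships`, `classes_and_margin` and `summary` are hypothetical, and the tie tolerance `tol` is an assumption made for the illustration. The margin is taken as the difference between the largest and second largest membership value of a term.

```python
# Memberships transcribed from Table 3; list positions correspond to classes 1..5.
memberships = {
    "grandfather":   [0.000, 0.024, 0.012, 0.965, 0.000],
    "grandmother":   [0.005, 0.134, 0.016, 0.840, 0.005],
    "granddaughter": [0.113, 0.242, 0.054, 0.466, 0.125],
    "grandson":      [0.134, 0.111, 0.052, 0.581, 0.122],
    "brother":       [0.612, 0.282, 0.024, 0.082, 0.000],
    "sister":        [0.579, 0.391, 0.026, 0.002, 0.002],
    "father":        [0.099, 0.546, 0.122, 0.158, 0.075],
    "mother":        [0.089, 0.654, 0.136, 0.054, 0.066],
    "daughter":      [0.000, 1.000, 0.000, 0.000, 0.000],
    "son":           [0.031, 0.842, 0.007, 0.113, 0.007],
    "nephew":        [0.012, 0.047, 0.424, 0.071, 0.447],
    "niece":         [0.000, 0.129, 0.435, 0.000, 0.435],
    "cousin":        [0.080, 0.056, 0.656, 0.033, 0.174],
    "aunt":          [0.000, 0.071, 0.929, 0.000, 0.000],
    "uncle":         [0.000, 0.000, 0.882, 0.071, 0.047],
}


def classes_and_margin(row, tol=1e-9):
    """Return (class ids attaining the maximum membership, margin) for one term.

    The margin is the largest membership minus the second largest one.
    `tol` (an assumption of this sketch) decides when two values count as tied.
    """
    ranked = sorted(row, reverse=True)
    margin = ranked[0] - ranked[1]
    top = [k + 1 for k, u in enumerate(row) if ranked[0] - u <= tol]
    return top, margin


summary = {term: classes_and_margin(row) for term, row in memberships.items()}

for term, (top, margin) in summary.items():
    label = "/".join(str(k) for k in top)
    print(f"{term:<14} class {label:<4} margin {margin:.3f}")
```

With these values, ‘niece’ attains its maximum in classes 3 and 5 with a margin of 0, and ‘nephew’ has a margin of 0.447 - 0.424 = 0.023, in line with the rounded 0.02 quoted in the text.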

Hard and Soft Euclidean Consensus Partitions 153

[Figure 1: margin plot for the 15 kinship terms. Horizontal axis: margin, from 0.0 to 1.0. Class ids by term: grandfather, grandmother, granddaughter, grandson: 4; brother, sister: 1; father, mother, daughter, son: 2; nephew: 5; niece: 3/5; cousin, aunt, uncle: 3.]

Fig. 1. Classes (indicated by plot symbol and class id) and margins (differences between the largest and second largest membership values) for the 5-class soft Euclidean consensus partition for the Rosenberg-Kim kinship terms data.

Quite interestingly, none of these consensus partitions split according to gender, even though there are such partitions in the data. To take the natural heterogeneity in the data into account, one could try to partition them (perform clusterwise aggregation, Gaul and Schader (1988)), resulting in meta-partitions (Gordon and Vichi (1998)) of the underlying objects. Function cl_pclust in package clue provides an AO heuristic for soft prototype-based partitioning of classifications, which in particular makes it possible to obtain soft or hard meta-partitions with soft or hard Euclidean consensus partitions as prototypes.

References

BARTHÉLEMY, J.P. and MONJARDET, B. (1981): The median procedure in cluster analysis and social choice theory. Mathematical Social Sciences, 1, 235–267.

BARTHÉLEMY, J.P. and MONJARDET, B. (1988): The median procedure in data analysis: new results and open problems. In: H. H. Bock, editor, Classification and related methods of data analysis. North-Holland, Amsterdam, 309–316.

BOORMAN, S. A. and ARABIE, P. (1972): Structural measures and the method of sorting. In: R. N. Shepard, A. K. Romney and S. B. Nerlove, editors, Multidimensional Scaling: Theory and Applications in the Behavioral Sciences, 1: Theory. Seminar Press, New York, 225–249.

CHARON, I., DENOEUD, L., GUENOCHE, A. and HUDRY, O. (2006): Maximum transfer distance between partitions. Journal of Classification, 23(1), 103–121.

DAY, W. H. E. (1981): The complexity of computing metric distances between partitions. Mathematical Social Sciences, 1, 269–287.

DIMITRIADOU, E., WEINGESSEL, A. and HORNIK, K. (2002): A combination scheme for fuzzy clustering. International Journal of Pattern Recognition and Artificial Intelligence, 16(7), 901–912.

GAUL, W. and SCHADER, M. (1988): Clusterwise aggregation of relations. Applied Stochastic Models and Data Analysis, 4, 273–282.
