A Pattern Based Data Mining Approach 333
2. Science converges. Concepts in one area of science are applicable in another area.
Patterns support these processes. This potential is comparable to the promises of
Systems Theory.
3. The decision for a specific algorithm can be postponed to later stages. A solution
path as a whole is sketched through patterns, and algorithms need only be
filled in immediately prior to processing. Using different algorithms in places
will not invalidate the solution path, which creates “late binding” at the algorithm
level.
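The late binding described in item 3 can be illustrated with a minimal Python sketch. It is not taken from the authors' software: the slot names ("normalize", "classify"), the toy algorithms and the function run_path are all hypothetical and chosen only for illustration. The solution path names pattern slots, and a concrete algorithm is looked up for each slot only at the moment the step is processed.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical types: a solution path is a list of pattern slot names,
# and an algorithm is any function from a list of numbers to a list of numbers.
Algorithm = Callable[[Sequence[float]], List[float]]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    return [(v - lo) / span for v in values]


def center(values: Sequence[float]) -> List[float]:
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def threshold_at_half(values: Sequence[float]) -> List[float]:
    """Label values 1.0 if they are at least 0.5, else 0.0."""
    return [1.0 if v >= 0.5 else 0.0 for v in values]


def threshold_at_zero(values: Sequence[float]) -> List[float]:
    """Label values 1.0 if they are non-negative, else 0.0."""
    return [1.0 if v >= 0.0 else 0.0 for v in values]


def run_path(path: Sequence[str],
             bindings: Dict[str, Algorithm],
             data: Sequence[float]) -> List[float]:
    """Walk the solution path, binding each slot to an algorithm just in time."""
    result = list(data)
    for slot in path:
        algorithm = bindings[slot]  # late binding: resolved right before processing
        result = algorithm(result)
    return result


# The solution path is fixed once; only the bindings change.
path = ["normalize", "classify"]
data = [2.0, 4.0, 6.0, 8.0]

result_a = run_path(path, {"normalize": min_max_scale,
                           "classify": threshold_at_half}, data)
result_b = run_path(path, {"normalize": center,
                           "classify": threshold_at_zero}, data)
```

Swapping the bindings leaves the path untouched, which is the sense in which using different algorithms does not invalidate the solution path.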
Current Data Mining applications occasionally provide the user with first traces
of pattern based DM. Figure 5 shows the example of Bagging of Classifiers within
the TANAGRA project and its graphical user interface (Rakotomalala (2004)). Bagging cannot be described with a pure data flow paradigm; rather, a nesting of a classifier pattern within the bagging pattern is needed. This nested structure is then
pipelined with pre- and postprocessing patterns.
Fig. 5. Screenshot of Tanagra Software
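The nesting of a classifier pattern inside the bagging pattern can be sketched as follows. This is a hedged illustration in plain Python, not TANAGRA code: the names nearest_mean_learner, bagging_learner, preprocess and postprocess, the toy data and the scaling factor are all assumptions. The bagging pattern takes any learner as an argument and is itself a learner, so the nested structure can be placed between preprocessing and postprocessing steps like any single classifier.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[float, str]                      # (feature value, class label)
Classifier = Callable[[float], str]             # maps a feature value to a label
Learner = Callable[[Sequence[Sample]], Classifier]


def nearest_mean_learner(train: Sequence[Sample]) -> Classifier:
    """Classifier pattern (toy): predict the label whose class mean is closest."""
    sums: Dict[str, Tuple[float, int]] = {}
    for x, label in train:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    means = {label: total / count for label, (total, count) in sums.items()}

    def classify(x: float) -> str:
        return min(means, key=lambda label: abs(x - means[label]))

    return classify


def bagging_learner(base_learner: Learner, n_models: int, seed: int = 0) -> Learner:
    """Bagging pattern: wraps any learner and returns a learner again."""

    def learn(train: Sequence[Sample]) -> Classifier:
        rng = random.Random(seed)
        train_list = list(train)
        models: List[Classifier] = []
        for _ in range(n_models):
            # Bootstrap sample: draw len(train) items with replacement.
            bootstrap = [rng.choice(train_list) for _ in train_list]
            models.append(base_learner(bootstrap))  # nested classifier pattern

        def classify(x: float) -> str:
            votes = Counter(model(x) for model in models)
            return votes.most_common(1)[0][0]       # majority vote

        return classify

    return learn


def preprocess(x: float) -> float:
    """Preprocessing pattern (assumed): rescale the raw feature."""
    return x / 10.0


def postprocess(label: str) -> str:
    """Postprocessing pattern (assumed): format the label for reporting."""
    return label.upper()


# Pipeline: preprocessing -> bagging(classifier) -> postprocessing.
train = [(1.0, "low"), (2.0, "low"), (3.0, "low"),
         (8.0, "high"), (9.0, "high"), (10.0, "high")]
scaled = [(preprocess(x), label) for x, label in train]
model = bagging_learner(nearest_mean_learner, n_models=15)(scaled)
predictions = [postprocess(model(preprocess(x))) for x in (1.5, 9.5)]
```

A pure data flow would pass data from one box to the next; here the base learner is passed into the bagging pattern as a parameter instead, which is why a nested description is needed.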
Further steps in our project are to
• collect a list of patterns that are useful across the whole knowledge discovery and data mining process (the list will be open-ended).
• integrate these patterns into data mining software to help users design ad-hoc
algorithms, choose an existing one, or get guidance in the data mining
process.
• develop a software prototype with our patterns and run experiments
with users to find out how it works and what the benefits are.
334 Boris Delibašić, Kathrin Kirchner and Johannes Ruhland
References
ALEXANDER, C. (1979): The Timeless Way of Building, Oxford University Press.
ALEXANDER, C. (2002a): The Nature of Order Book 1: The Phenomenon of Life, The Center
for Environmental Structure, Berkeley, California.
ALEXANDER, C. (2002b): The Nature of Order Book 2: The Process of Creating Life, The
Center for Environmental Structure, Berkeley, California.
CHAPMAN, P., CLINTON, J., KERBER, R., KHABAZA, T., REINARTZ, T., SHEARER,
C. and WIRTH, R. (2000): CRISP-DM 1.0. Step-by-step data mining guide, www.crisp-dm.org.
COPLIEN, J.O. (1996): Software Patterns, SIGS Books & Multimedia.
COPLIEN, J.O. and ZHAO, L. (2005): Toward a General Formal Foundation of Design -
Symmetry and Broken Symmetry, Brussels: VUB Press.
ECKERT, C. and CLARKSON, J. (2005): Design Process Improvement: a review of current
practice, Springer Verlag London.
FAYYAD, U.M., PIATETSKY-SHAPIRO, G. and UTHURUSAMY, R. (Eds.) (1996): Advances in Knowledge Discovery and Data Mining, MIT Press.
GAMMA, E., HELM, R., JOHNSON, R. and VLISSIDES, J. (1995): Design Patterns. Elements of Reusable Object-Oriented Software, Addison-Wesley.
HIPPNER, H., MERZENICH, M. and STOLZ, C. (2002): Data Mining: Einsatzpotentiale und
Anwendungspraxis in deutschen Unternehmen, In: WILDE, K.D.: Data Mining Studie,
absatzwirtschaft.
RAKOTOMALALA, R. (2004): Tanagra – A free data mining software for research and education, www.eric.univ-lyon2.fr/~rico/tanagra/.
WITTEN, I.H. and FRANK, E. (2005): Data Mining: Practical machine learning tools and
techniques, Morgan Kaufmann, San Francisco.