Data Mining - Clustering and Association

This course is part of

Pathway in Introduction to Data Mining


This course introduces the basic concepts and methods of data mining, with specific reference to clustering and association rules. We present the concept and purposes of cluster analysis, together with its main components. Partitioning, hierarchical, density-based, and graph-based clustering methods are described. Particular attention is devoted to cluster validity measures and clustering validation. The last part of the course introduces association rule discovery. The concepts of association rule, frequent itemset, support, and confidence are defined. We also give a brief description of the Apriori algorithm for frequent itemset generation and introduce the concepts of maximal and closed frequent itemsets. Finally, different criteria for evaluating the quality of association patterns are introduced.
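
As an informal illustration of support, confidence, and Apriori-style frequent itemset generation, here is a minimal Python sketch. It is not taken from the course material, which uses KNIME workflows rather than hand-written code. The toy transactions and the function names `support_count`, `support`, `confidence`, and `frequent_itemsets` are assumptions made up for this example, and the level-wise search is a simplified version of the idea behind Apriori, not an optimized implementation.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent:
    count(antecedent and consequent together) / count(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support_count(both, transactions) / support_count(antecedent, transactions)


def frequent_itemsets(transactions, min_support):
    """Simplified level-wise search: build size-k candidates from frequent
    (k-1)-itemsets and discard any candidate with an infrequent subset."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support([i], transactions) >= min_support]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: all (k-1)-subsets must be frequent, then check support.
        current = [
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and support(c, transactions) >= min_support
        ]
    return frequent
```

On this toy data, `support({"diapers", "beer"}, transactions)` is 3/5 and `confidence({"diapers"}, {"beer"}, transactions)` is 3/4, since three of the four transactions containing diapers also contain beer. Calling `frequent_itemsets(transactions, 0.6)` returns a dictionary mapping each frequent itemset to its support.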

Attendance and Certificates

Attendance Fee
FREE!
Certificate of Participation Cost
FREE!

Category

Computer Science, Data Management and Analysis

Training Hours

40

Level

Beginner

Course Method

Tutoring

Language

English

Duration

4 weeks

Type

Online

Course Status

Soft Tutoring

Enrollment Opens

Apr 21, 2016

Course Opens

Sep 14, 2016

Tutoring Starts

Oct 3, 2016

Tutoring Ends

Nov 14, 2016

Soft Tutoring

Nov 15, 2016

Course Closes

Not set

By the end of this course, you will be able to develop a data mining workflow for solving a clustering problem and for extracting potentially interesting association rules. You will be able to choose an appropriate proximity measure and to select the "optimal" clustering model, bearing in mind that what counts as optimal depends on the validity criterion adopted. You will learn all this using the KNIME open source platform, which integrates the power and expressiveness of Weka, R, and Java.

Basic knowledge of probability and statistics. Basic knowledge of R programming.

  1. Pang-Ning Tan, Michael Steinbach and Vipin Kumar (2006). Introduction to Data Mining. Addison-Wesley.
  2. Guojun Gan, Chaoqun Ma and Jianhong Wu (2007). Data Clustering: Theory, Algorithms, and Applications. SIAM.
  3. Rui Xu and Donald C. Wunsch II (2009). Clustering. Wiley.

The course spans four weeks, each requiring 8 to 10 hours of work and comprising 3 to 5 lectures. Each lecture consists of a methodology video, a software usage video, and a practice session.

You must complete all practice sessions associated with the lectures and upload the corresponding KNIME workflows to the course platform.


FABIO STELLA

Department of Informatics, Systems and Communication

PAOLA CHIESA

Department of Informatics, Systems and Communication