Text Mining

This course is part of

Introduction to Data Mining pathway


Natural language text is everywhere: social networks, business, finance, medicine and biology are just a few of its many sources. However, computers are not well suited to processing natural language text. Indeed, Data Mining methods and algorithms, which operate on structured data, cannot be applied directly to unstructured data for knowledge extraction. The Text Mining course, the last one in the Introduction to Data Mining pathway, introduces methods and tools for knowledge extraction from natural language text. The course assumes you are familiar with the methods and models presented in the previous two courses, namely Data Mining: Classification and Data Mining: Clustering and Association. The course shows how Text Mining makes it possible to formulate and solve problems in Business Intelligence, Finance, Recommendation, Medicine, Biomedicine, Social Networks and Intelligence Gathering, to mention just a few. In particular, the course introduces methods, models and algorithms for natural language text preprocessing, text categorization, text clustering, topic modeling and information extraction.

Attendance and Certificates

Attendance Fee
FREE!
Cost of the Certificate of Participation
FREE!

Category

Computer Science, Data Management and Analysis

Training Hours

35

Level

Beginner

Course Method

Tutoring

Language

English

Duration

4 weeks

Type

Online

Course Status

Soft Tutoring

Enrollment Opens

Mar 31, 2017

Course Opens

Apr 21, 2017

Tutoring Starts

Apr 21, 2017

Tutoring Ends

May 29, 2017

Soft Tutoring

May 30, 2017

Course Closes

Not set
By the end of this course, you will be able to develop, validate and apply Text Mining workflows for the automatic classification and organization of natural language text. Furthermore, you will learn to develop workflows that extract entities (persons, organizations, locations, genes, drugs, etc.) from natural language text and discover the relationships among them. You will also learn how to access natural language text from many sources, such as RSS feeds, Web pages, YouTube, Twitter, PubMed, and PDF and txt files.

The course is self-contained, and the hands-on lectures use the KNIME open source software platform, which integrates the power and expressiveness of Weka, R, Java and Python.

Prerequisites: basic knowledge of probability and statistics; basic knowledge of R programming; the courses Data Mining: Classification and Data Mining: Clustering and Association.
  1. Sholom M. Weiss, Nitin Indurkhya and Tong Zhang (2010). Fundamentals of Predictive Text Mining, Springer. 
  2. Marie-Francine Moens (2006). Information Extraction: Algorithms and Prospects in a Retrieval Context, Springer.
The course spans four weeks. Each week requires 6 to 8 hours of work and consists of 4 to 8 video lectures. Each video lecture consists of a methodology video, a software usage video and a practice session.
You must complete all the practice sessions associated with the lectures and upload the corresponding KNIME workflows to the course platform.

FABIO STELLA

Department of Informatics, Systems and Communication

PAOLA CHIESA

Department of Informatics, Systems and Communication

DANIELE BELLANI

Department of Informatics, Systems and Communication

ALESSANDRO BREGOLI

Department of Informatics, Systems and Communication