Monday 4 February 2013

Advanced Data Mining


Data Mining

Instructor: Pedro Domingos
Textbook: Tom Mitchell, Machine Learning, McGraw-Hill
Download Slides from here

Topics

Lecture Notes
Introduction; Inductive learning, Instance-based learning [pdf] [pptx]
Decision trees; Empirical evaluation [pdf]
Bayesian Learning [pdf]
Rule Induction [pdf]
Neural networks [pdf]
Genetic algorithms, model ensembles [pdf]
SVMs; Learning theory [pdf]
Clustering [pdf]
Association rules [pdf]
Memorial Day - No Class





Topics, with slides (PPT, PDF) and notes (Postscript, PDF) where available:

  • Overview of Data Mining: slides [PPT] [PDF]; notes [Postscript] [PDF]
  • Association-Rules, A-Priori Algorithm: slides [PPT] [PDF]; notes [Postscript] [PDF]
  • Other Frequent-Pair Algorithms: slides [PPT] [PDF]
  • Correlated Items: slides [PPT] [PDF]; notes [Postscript] [PDF]
  • Query Flocks: notes [Postscript] [PDF]
  • PageRank, Hubs-and-Authorities: slides [PPT] [PDF]; notes [Postscript] [PDF]
  • Web Mining: notes [Postscript] [PDF]
  • Stream Mining, Part I: slides [PPT] [PDF]
  • Stream Mining, Part II: slides [PPT] [PDF]
  • Stream Mining, Part III: slides [PPT] [PDF]
  • Clustering, Part I: slides [PPT] [PDF]; notes [Postscript] [PDF]
  • Clustering, Part II: slides [PPT] [PDF]; notes [Postscript] [PDF]
  • Clustering, Part III (Stream Clustering): slides [PPT] [PDF]
  • Matching Sequences: notes [Postscript] [PDF]
  • Mining Event Sequences: notes [Postscript] [PDF]




Professor David Mease

Lecture Powerpoint Slides and Videos:

Lecture 8 = Thursday 10/8: Video 1, Video 2, Video 3

Lecture 7 = Thursday 10/1: Video 1, Video 2, Video 3

Lecture 6 = Thursday 9/24: Video 1, Video 2, Video 3

Lecture 5 = Saturday 9/19: Video 1, Video 2, Video 3

Lecture 4 = Thursday 9/10: Video 1, Video 2, Video 3

Lecture 3 = Thursday 9/3: Video 1, Video 2, Video 3

Lecture 2 = Saturday 8/29: Video 1, Video 2, Video 3

Lecture 1 = Thursday 8/27: Video 1, Video 2, Video 3






Course Title: Data Mining

Instructor: Padhraic Smyth

Download slides here:

Introduction to Data Mining:

  • Introduction to Data Mining [PPT] [PDF]
  • Measurement and Data [PPT] [PDF]
  • Exploratory Data Analysis and Visualization [PPT] [PDF]
Basic Principles of Data Mining

Slides

Text Mining

  • Text classification [PPT] [PDF]
  • Text mining and topic models [PPT] [PDF]
  • Notes on graphical models [PPT] [PDF]

Recommender Systems

  • Recommender systems [PPT] [PDF]
  • Netflix case study [PPT] [PDF]

Web Data Analysis

  • Web link analysis [PPT] [PDF]
  • Web usage mining [PPT] [PDF]

Time Series Analysis and Anomaly Detection




Course Overview:
The course will cover the following material:
a) Fundamentals: data mining concepts and functions, data pre-processing, data reduction, mining association rules in large databases, classification and prediction techniques, clustering analysis algorithms, data mining languages, data mining applications, and new trends.
b) Advanced knowledge discovery in semi-structured and unstructured data repositories, with emphasis on emerging computational intelligence paradigms such as soft computing and artificial life. Applications will be covered under special themes: advanced transactional data mining, Web mining, text mining, bioinformatics, and other scientific and engineering applications.
Textbook:

Data Mining: Concepts and Techniques, 1st or 2nd Ed., Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2003 or 2006. ISBN 1-55860-901-6
Book Web site: http://www-faculty.cs.uiuc.edu/~hanj/bk2/index.html


Course Outline Get the PDF version of the Course Syllabus
Introduction Get Slides
1 What Motivated Data Mining? Why Is It Important?
2 So, What Is Data Mining?
3 Data Mining--On What Kind of Data?
4 Data Mining Functionalities—What Kinds of Patterns Can Be Mined?
5 Are All of the Patterns Interesting?
6 Classification of Data Mining Systems
7 Data Mining Task Primitives
8 Integration of a Data Mining System with a Database or Data Warehouse System
9 Major Issues in Data Mining
10 Data Mining Applications
11 Data Mining System Products and Research Prototypes
12 Social Impacts of Data Mining
Data Preprocessing Get Slides Get Math Pages File
1 Why Preprocess the Data?
2 Descriptive Data Summarization
3 Data Cleaning
4 Data Integration and Transformation
5 Data Reduction
6 Data Discretization and Concept Hierarchy Generation
7 Feature Selection Techniques
Mining Frequent Patterns and Associations Get Slides
1 Basic Concepts and a Road Map
2 Efficient and Scalable Frequent Itemset Mining Methods
3 Mining Various Kinds of Association Rules
4 Using WEKA software for finding Association Rules
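
To make the frequent itemset topics above concrete, here is a minimal Python sketch of level-wise (Apriori-style) itemset counting. It is an illustration only, not taken from the course slides or from WEKA; the function name `frequent_itemsets`, its parameters, and the sample baskets are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: count} for every itemset seen in at least min_support baskets.

    Hypothetical helper for illustration; works level by level, Apriori-style.
    """
    baskets = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    item_counts = Counter(item for b in baskets for item in b)
    current = {frozenset([i]): n for i, n in item_counts.items() if n >= min_support}
    result = dict(current)

    size = 2
    while current and size <= max_size:
        # Build candidates of the next size from items that are still frequent,
        # keeping only those whose every (size - 1)-subset is frequent.
        items = sorted(set().union(*current))
        candidates = [
            frozenset(c)
            for c in combinations(items, size)
            if all(frozenset(s) in current for s in combinations(c, size - 1))
        ]
        # Count how many baskets contain each candidate.
        counts = Counter(c for b in baskets for c in candidates if c <= b)
        current = {c: n for c, n in counts.items() if n >= min_support}
        result.update(current)
        size += 1
    return result


# Made-up market-basket data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
result = frequent_itemsets(baskets, min_support=3)
```

With a minimum support of 3 baskets, every single item is frequent, but {bread, milk} is the only frequent pair, so no triple is ever generated as a candidate.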
Classification and Prediction Get Slides
1 What Is Classification? What Is Prediction?
2 Issues Regarding Classification and Prediction
3 Classification by Decision Tree Induction Get More Slides
4 Bayesian Classification Get Slides
5 Rule-Based Classification Get Slides
6 Prediction
7 Accuracy and Error Measures
8 Evaluating the Accuracy of a Classifier or Predictor
9 Using WEKA software for data Classification
10 Using Oracle Data Mining Get Slides
Classification Using Lazy Learning Techniques Get Slides
1 Tasks of concept learning and classification
2 Features of lazy learning
3 Similarity measures
4 Calculate and explain similarity values
5 Formulate lazy learning tasks
6 Lazy learning algorithms (instance-based learning and kNN learning)
7 Apply the lazy learning algorithms to learning tasks (classification task)
8 Advantages and disadvantages of lazy learning algorithms
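
The lazy learning items above (a similarity measure, then kNN applied to a classification task) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's own material: the names `euclidean_distance` and `knn_classify`, the choice of Euclidean distance, and the toy training points are all made up for this example.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Distance between two equal-length numeric vectors (smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train, query, k=3):
    """Label a query point by majority vote among its k nearest training points.

    train is a list of (feature_vector, label) pairs. Nothing is learned up
    front; all the work happens at query time, which is what makes it "lazy".
    """
    neighbours = sorted(train, key=lambda pair: euclidean_distance(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy training set with two well-separated classes (illustrative data only).
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]
near_a = knn_classify(train, (1.1, 1.0), k=3)
near_b = knn_classify(train, (5.1, 5.0), k=3)
```

The sketch also shows the main trade-off of lazy learners: there is no training cost, but every query scans the whole training set.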
Classification using Soft-Computing Get Slides
1 Introduction to Soft Computing
2 Introduction to Rough Set Theory
3 Reduct Computation Techniques
4 Classification using Rough Set Theory
5 Using Rosetta Tool for Reduct computation and data Classification
6 Major Issues in Rough Set Theory for Data Mining
7 Fuzzy Set and Data Mining Get Slides
Cluster Analysis Get Slides Get More Slides
1 What Is Cluster Analysis?
2 Types of Data in Cluster Analysis
3 A Categorization of Major Clustering Methods
Mining Spatial, Multimedia, Text, and Web Data Get Slides
1 Spatial Data Mining
2 Multimedia Data Mining
3 Text Mining Get Slides
4 Mining the World Wide Web Get Slides
Applications and Trends in Data Mining
1 Data Mining Applications
2 Data Mining System Products and Research Prototypes
3 Additional Themes on Data Mining
4 Social Impacts of Data Mining
5
Data Mining Methodologies Get Slides
Data Warehouse and OLAP Technology: An Overview Get Slides
1 What Is a Data Warehouse?
2 A Multidimensional Data Model
3 Data Warehouse Architecture
4 Data Warehouse Implementation
5 From Data Warehousing to Data Mining

Required Software
WEKA is open-source software for machine learning and data mining, issued under the GNU General Public License.
Download the software from:
http://www.cs.waikato.ac.nz/ml/weka/
Rosetta is software for data reduction and classification based on the concepts of Rough Set Theory.
Download the software from: http://rosetta.lcb.uu.se/general/
See the Software page for other resources (software and datasets).

