Data Warehousing and Mining

UNIT –I:
Introduction: What Motivated Data Mining? Why Is It Important? Data Mining: On What Kind of Data? Data Mining Functionalities: What Kinds of Patterns Can Be Mined? Are All of the Patterns Interesting? Classification of Data Mining Systems, Data Mining Task Primitives, Integration of a Data Mining System with a Database or Data Warehouse System, Major Issues in Data Mining.


UNIT –II:
Data Pre-Processing: Why Pre-Process the Data? Descriptive Data Summarization, Data Cleaning, Data Integration and Transformation, Data Reduction, Data Discretization and Concept Hierarchy Generation. 


UNIT –III:
Data Warehouse and OLAP Technology: An Overview: What Is a Data Warehouse? A Multidimensional Data Model, Data Warehouse Architecture, Data Warehouse Implementation, From Data Warehousing to Data Mining.

UNIT –IV:
Classification: Basic Concepts, General Approach to Solving a Classification Problem, Decision Tree Induction: Working of a Decision Tree, Building a Decision Tree, Methods for Expressing Attribute Test Conditions, Measures for Selecting the Best Split, Algorithm for Decision Tree Induction.
Model Overfitting: Due to Presence of Noise, Due to Lack of Representative Samples. Evaluating the Performance of a Classifier: Holdout Method, Random Subsampling, Cross-Validation, Bootstrap.


UNIT –V:
Association Analysis: Basic Concepts and Algorithms: Introduction, Frequent Itemset Generation, Rule Generation, Compact Representation of Frequent Itemsets, FP-Growth Algorithm.


UNIT –VI:
Cluster Analysis: Basic Concepts and Algorithms: What Is Cluster Analysis? Different Types of Clustering, Different Types of Clusters. K-means: The Basic K-means Algorithm, K-means: Additional Issues, Bisecting K-means, K-means and Different Types of Clusters, Strengths and Weaknesses, K-means as an Optimization Problem. Agglomerative Hierarchical Clustering: Basic Agglomerative Hierarchical Clustering Algorithm, Specific Techniques. DBSCAN: Traditional Density: Center-Based Approach, The DBSCAN Algorithm, Strengths and Weaknesses.
