Data mining tasks are generally divided into two categories: predictive and descriptive.
Predictive tasks
Predict the value of a particular attribute based on the values of other attributes.
The attribute to be predicted is called the target or dependent variable.
The attributes used for making the prediction are called explanatory or independent variables.
Descriptive tasks
Derive patterns (correlations, trends, clusters, trajectories, and anomalies) that summarize the underlying relationships in the data.
These tasks are exploratory in nature.
They require postprocessing techniques to validate and explain the results.
Core data mining tasks
Predictive Modeling
Association Analysis
Cluster Analysis
Anomaly Detection
Predictive Modeling
The task of building a model for the target variable as a function of the explanatory variables.
There are two types of predictive modeling tasks:
Regression
used for continuous target variables
Classification
used for discrete/categorical target variables
Example:
Predicting whether a patient has a particular disease (a classification task)
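The difference between regression and classification can be sketched with two very small models. This is only an illustration under stated assumptions: the data values, the helper names fit_line and nearest_neighbor_label, and the choice of a least-squares line and a 1-nearest-neighbor rule are all hypothetical and not taken from the notes.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (one explanatory variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def nearest_neighbor_label(train, query):
    """Return the label of the training record closest to the query (1-NN)."""
    closest = min(
        train,
        key=lambda row: sum((p - q) ** 2 for p, q in zip(row[0], query)),
    )
    return closest[1]


# Regression: the target variable is continuous (made-up data, y = 2x + 1).
xs = [1, 2, 3, 4]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
predicted = slope * 5 + intercept  # 11.0

# Classification: the target variable is categorical.
# Each record is ((body temperature, heart rate), label); values are invented.
patients = [
    ((36.6, 70), "healthy"),
    ((39.2, 110), "flu"),
    ((36.9, 65), "healthy"),
    ((38.8, 105), "flu"),
]
diagnosis = nearest_neighbor_label(patients, (39.0, 100))  # "flu"
```

In both cases the model maps explanatory variables to a target variable; only the type of the target (a number versus a category) changes.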
Association Analysis
Used to discover patterns that describe strongly associated features in the data.
Example:
Identifying products that are frequently bought together
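One simple way to look for products bought together is to count how often each pair of items appears in the same transaction. The sketch below assumes a handful of invented shopping baskets and a hypothetical minimum support of 0.6; the function names pair_support and confidence are illustrative, and practical association analysis algorithms are considerably more elaborate.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one shopping basket.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def pair_support(transactions):
    """Fraction of transactions that contain each pair of items."""
    counts = Counter()
    for items in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: count / n for pair, count in counts.items()}


def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, the fraction that
    also contain the consequent."""
    with_antecedent = [t for t in transactions if antecedent in t]
    return sum(consequent in t for t in with_antecedent) / len(with_antecedent)


support = pair_support(baskets)
# Keep only pairs that occur in at least 60% of the baskets (assumed threshold).
frequent_pairs = {pair: s for pair, s in support.items() if s >= 0.6}
bread_to_milk = confidence(baskets, "bread", "milk")  # 0.75
```

With this data only the pair ("bread", "milk") reaches the threshold, since it appears in 3 of the 5 baskets.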
Cluster Analysis
Used to identify groups of closely related observations.
Example:
Grouping articles by related topic
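As an illustration of grouping closely related observations, here is a minimal k-means sketch. It is one clustering technique among many and is not prescribed by the notes. The 2-D points stand in for hypothetical article feature vectors, and the initial centroids are chosen by hand so the run is deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Basic k-means: alternate between assigning points to the nearest
    centroid and recomputing each centroid as the mean of its members."""
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids


# Invented feature vectors, e.g. two topic scores per article.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
initial = [(1.0, 1.0), (8.0, 8.0)]  # hand-picked starting centroids
labels, centroids = kmeans(points, initial)
# labels is [0, 0, 0, 1, 1, 1]: the first three points form one group,
# the last three the other.
```

Unlike predictive modeling, no target variable is supplied here; the groups are derived from the similarity of the observations alone.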
Anomaly Detection
The task of identifying observations whose characteristics are significantly different from the rest of the data.
A good anomaly detector must have a high detection rate and a low false alarm rate.
Example:
Detecting credit card fraud
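The detection rate and false alarm rate can be made concrete with a small sketch. It assumes a simple z-score rule as the detector (one possible choice, not the method of the notes), a hypothetical threshold of 2.0, and invented transaction amounts with known fraud labels. Detection rate is taken as the fraction of true anomalies that are flagged, and false alarm rate as the fraction of normal observations that are flagged.

```python
import statistics


def zscore_flags(values, threshold=2.0):
    """Flag values whose distance from the mean exceeds `threshold`
    population standard deviations (the threshold is an assumed setting)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [abs(v - mean) / std > threshold for v in values]


def detection_and_false_alarm_rate(flagged, actual):
    """Return (detection rate, false alarm rate).
    Assumes the data contains at least one anomaly and one normal record."""
    tp = sum(f and a for f, a in zip(flagged, actual))
    fn = sum((not f) and a for f, a in zip(flagged, actual))
    fp = sum(f and (not a) for f, a in zip(flagged, actual))
    tn = sum((not f) and (not a) for f, a in zip(flagged, actual))
    return tp / (tp + fn), fp / (fp + tn)


# Made-up card transaction amounts; only the 500 is labeled as fraud.
amounts = [20, 25, 22, 30, 27, 24, 500, 23, 26, 21]
is_fraud = [False] * 6 + [True] + [False] * 3

flagged = zscore_flags(amounts)
detection_rate, false_alarm_rate = detection_and_false_alarm_rate(flagged, is_fraud)
# On this toy data the detector flags only the 500 transaction,
# so detection_rate is 1.0 and false_alarm_rate is 0.0.
```

On real data the two rates usually trade off against each other: lowering the threshold catches more fraud but also raises more false alarms.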