Data Mining


Data mining is the process of analyzing large datasets to discern trends and patterns, using algorithms and statistical methods to help answer business questions. These techniques are applied in areas such as fraud detection, risk management, medical diagnosis, marketing, and cybersecurity.

Five of the most common data mining techniques are:

  • Classification Analysis
  • Association Rule Learning
  • Anomaly or Outlier Detection
  • Clustering Analysis
  • Regression Analysis

Classification Analysis

Classification analysis assigns the items in a collection of data to predefined categories so that they can be analyzed more accurately. An algorithm learns from examples whose categories are already known, and the result can be used to answer a question, support a decision, or predict behavior. A common use is in email applications, which classify incoming messages as spam or not spam.
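
As a rough illustration of the spam example, the sketch below trains a tiny naive Bayes classifier in plain Python. Naive Bayes is only one of many classification algorithms, and the training messages, the labels "spam" and "ham", and the function names are all made up for this example.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = {}          # label -> Counter of words seen under that label
    label_counts = Counter()  # label -> number of training messages
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocabulary.add(word)
    return word_counts, label_counts, vocabulary

def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the prior probability of the label.
        score = math.log(label_counts[label] / total_messages)
        words_in_label = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace (add-one) smoothing avoids log(0) for unseen words.
            numerator = word_counts[label][word] + 1
            denominator = words_in_label + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data, invented for illustration only.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

model = train_naive_bayes(TRAINING)
print(classify("free prize now", model))
print(classify("team meeting monday", model))
```

With this toy data the first message is labeled spam and the second ham. A real filter would need far more training data and more careful text preprocessing.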

Association Rule Learning

Association rule learning finds correlations and co-occurrences among the items in a dataset, such as products that are frequently bought together. It can also surface patterns between seemingly unrelated data sources or repositories. In machine learning applications, the resulting rules help a system learn new associations.
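
One common way to score an association rule is by its support (how often the items appear together) and its confidence (how often the consequent appears when the antecedent does). The sketch below computes both for item pairs only. The shopping baskets, the thresholds, and the function name `pair_rules` are assumptions chosen for illustration, not part of any particular library.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), together in pair_counts.items():
        support = together / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = together / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules

# Hypothetical shopping baskets, invented for illustration only.
BASKETS = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "milk"],
    ["bread", "butter", "jam"],
]

rules = pair_rules(BASKETS)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

On these baskets only bread and butter pass both thresholds: they appear together in 3 of 5 baskets (support 0.60), and each appears in 4 baskets, giving a confidence of 0.75 in both directions. Algorithms such as Apriori extend the same idea to larger item sets.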

Anomaly or Outlier Detection

Outlier detection is the process of identifying data points that deviate dramatically from the normal trend of a dataset. Such a point is considered an outlier or anomaly. Depending on the goal, outliers are either excluded so that they do not skew later analysis, or examined closely because the out-of-the-ordinary case is exactly what you are looking for.
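
A minimal sketch of one way to flag outliers follows, using the modified z-score based on the median absolute deviation (MAD). This is just one of several possible rules. The cutoff of 3.5 is a commonly quoted convention rather than a fixed standard, and the sample readings and the function name `find_outliers` are invented for illustration.

```python
import statistics

def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that extreme values
    # barely influence, unlike the standard deviation.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no measurable spread, so this rule cannot score points
    outliers = []
    for v in values:
        # 0.6745 rescales the MAD to be comparable to a standard deviation
        # for normally distributed data.
        modified_z = 0.6745 * (v - median) / mad
        if abs(modified_z) > threshold:
            outliers.append(v)
    return outliers

# Hypothetical sensor readings, invented for illustration only.
readings = [10, 12, 11, 13, 12, 11, 95, 12]
outliers = find_outliers(readings)
print(outliers)  # [95]
```

Here the median is 12 and the MAD is 1, so the reading 95 scores about 56 and is flagged, while every other reading scores below 1.5.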

Clustering Analysis

Clustering analysis sorts data points into groups so that the points within a cluster are similar to each other and dissimilar to the points in other clusters. It seeks to find structure within the dataset without relying on predefined categories. Its results can be used for tasks such as customer profiling.
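
To make the idea concrete, here is a small sketch of k-means, one widely used clustering algorithm. It assumes the number of clusters `k` is known in advance and naively uses the first `k` points as starting centroids; practical implementations pick starting points more carefully. The sample points and the function name are made up for this example.

```python
import math

def k_means(points, k, max_iterations=100):
    """Cluster points (tuples of numbers) into k groups.

    Returns (labels, centroids), where labels[i] is the cluster of points[i].
    """
    centroids = [points[i] for i in range(k)]  # naive start: first k points
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # centroids stopped moving, so the clustering is stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D data with two visibly separate groups.
POINTS = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]

labels, centroids = k_means(POINTS, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Even though both starting centroids fall in the same group, the algorithm separates the three points near the origin from the three points near (8.5, 9) within a few iterations.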

Regression Analysis

Regression analysis identifies and analyzes the relationships between variables. It estimates how one or more independent variables affect a dependent variable, which helps reveal trends and patterns. It is most often used for making predictions and forecasting future trends.
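
The simplest case, a straight line fitted to one independent variable by ordinary least squares, can be sketched as follows. The data (advertising spend against sales, in arbitrary units) and the function names are hypothetical and serve only to show the calculation.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Estimate the dependent variable for a new value of x."""
    return slope * x + intercept

# Hypothetical data: advertising spend (x) and sales (y), arbitrary units.
spend = [1, 2, 3, 4, 5]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(spend, sales)
forecast = predict(6, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast={forecast:.2f}")
# slope=1.99, intercept=0.05, forecast=11.99
```

The fitted slope says that each additional unit of spend is associated with roughly two additional units of sales in this made-up data, and the forecast extends that trend to a spend of 6.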

Source: Precisely: Data Mining Techniques