What is Data Mining?
Data mining is the process of analyzing large datasets to identify patterns that yield actionable information. Modern data mining relies on computational processing and draws its methodology from several disciplines, including machine learning, artificial intelligence, and statistics. Machine learning automates much of the pattern detection, while statistics supplies the methodology for analysis.
Although these disciplines focus on analysis, database systems are just as central to data mining. They are the foundation for knowledge discovery because they hold the data from which actionable information is drawn. In essence, all of these disciplines work in tandem to derive powerful insights from data.
Types of Data Mining Patterns
- Dependency Modeling: Finding relationships between variables in your datasets. Retailers use dependency modeling to analyze consumer behavior such as purchasing habits. For example, customers might buy certain groups of products together on specific days, which lets a retailer adjust product positioning to increase revenue.
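- Statistical Classification: Assigning new data to known, predefined categories based on what has been learned from previously labeled examples. Businesses frequently use it to categorize consumer behavior or incoming data, for example labeling email as spam or legitimate.
- Cluster Analysis: Grouping similar data points into clusters so that members of one cluster resemble each other more than they resemble members of other clusters. Unlike classification, the groups are not known in advance; they are discovered from the data itself, which makes clustering useful for spotting new trends and customer segments.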
- Anomaly Detection: Identifying unusual items in data, such as outliers, changes, and deviations that do not conform to an expected pattern, so they can be investigated further. This is frequently used in the finance industry to detect fraud.
- Forecasting: A form of predictive analytics that uses historical data and statistical models to anticipate future patterns and trends in your data.
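
To make two of these pattern types concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a production method: the function names `pair_support` and `zscore_outliers`, the toy baskets, and the transaction amounts are all hypothetical. The first function approximates dependency modeling by measuring how often pairs of products appear in the same basket; the second approximates anomaly detection with a simple z-score rule, which is only one of many possible techniques.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, stdev


def pair_support(baskets):
    """Return the fraction of baskets in which each pair of items appears together."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicate items and gives every pair a stable order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def zscore_outliers(values, threshold=3.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical shopping baskets (dependency modeling)
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]
support = pair_support(baskets)
print(support[("bread", "butter")])  # 0.75: bought together in 3 of 4 baskets

# Hypothetical transaction amounts (anomaly detection)
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]
print(zscore_outliers(amounts, threshold=2.0))  # [500]
```

The anomaly example passes a threshold of 2.0 rather than the default of 3.0 because a sample of only ten values cannot produce a z-score much above 2.8. Real datasets are larger and usually call for more robust methods than a plain z-score.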