Data Mining

Aatish Sai
2 min read · Jan 23, 2017

Data mining is the computational process of discovering patterns in large data sets. It involves methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

"Every two days now, we create as much information as we did from the dawn of civilization up until 2003." - Eric Schmidt

Considering the amount of data we generate on a day-to-day basis, we are in a data-rich situation. We gather far more of it than we can digest. Every transaction or interaction leaves a data signature that someone, somewhere, is capturing and storing. The sheer scale of this data has far exceeded human sense-making capabilities. At this scale, patterns are often too subtle and relationships too complex to notice. They are also multi-dimensional, which makes them difficult to observe just by looking at the data. Data mining automates part of this process in order to detect interpretable patterns.

The ultimate goal of data mining is prediction. Predictive data mining is the most common type of data mining and the one with the most direct business applications. The process of data mining consists of three stages: initial exploration; model building or pattern identification, with validation/verification; and deployment.
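The three stages above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the customer records, the "churned" label, and the midpoint-threshold model are all hypothetical and chosen for readability, not taken from any real data set or library.

```python
from statistics import mean

# Hypothetical toy data: (monthly_spend, visits_per_month, churned)
records = [
    (20.0, 1, True), (25.0, 2, True), (30.0, 1, True), (22.0, 2, True),
    (80.0, 8, False), (95.0, 9, False), (70.0, 7, False), (88.0, 10, False),
]

def explore(rows):
    """Stage 1, exploration: per-class feature means show what separates the classes."""
    summary = {}
    for label in (True, False):
        group = [r for r in rows if r[2] == label]
        summary[label] = (mean(r[0] for r in group), mean(r[1] for r in group))
    return summary

def build_model(rows):
    """Stage 2a, model building: a spend threshold midway between the class means."""
    class_means = explore(rows)
    return (class_means[True][0] + class_means[False][0]) / 2

def predict(threshold, spend):
    """Stage 3, deployment: the scoring function applied to new cases."""
    return spend < threshold

def validate(threshold, rows):
    """Stage 2b, validation: accuracy on held-out rows the model has not seen."""
    hits = sum(predict(threshold, r[0]) == r[2] for r in rows)
    return hits / len(rows)

# Hold out one record from each class for validation.
train = records[0:3] + records[4:7]
holdout = [records[3], records[7]]

threshold = build_model(train)
accuracy = validate(threshold, holdout)
print(f"threshold={threshold:.1f}, holdout accuracy={accuracy:.2f}")
```

In practice each stage is far more involved (larger samples, several candidate models, proper cross-validation), but the division of labor is the same: look at the data, fit and check a model, then apply it to new cases.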

In conclusion, data mining can grant immense inferential power. If an algorithm can classify a case into a known category based on limited data, it becomes possible to estimate a wide range of other information about that case from the properties of all the other cases in that category.
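As a rough illustration of that idea, the sketch below classifies a new case from two observed features and then estimates a third, unobserved one from the category it lands in. The categories, the numbers, and the nearest-centroid rule are assumptions made for this example; a real system would at least scale the features and use far more data.

```python
from statistics import mean

# Hypothetical known cases: category -> list of (age, income_k, yearly_purchases)
known = {
    "student": [(20, 12, 5), (22, 15, 7), (21, 10, 6)],
    "professional": [(35, 60, 20), (40, 75, 24), (38, 66, 22)],
}

def centroid(cases):
    """Column-wise mean of a category's cases."""
    return tuple(mean(column) for column in zip(*cases))

centroids = {name: centroid(cases) for name, cases in known.items()}

def classify(age, income_k):
    """Pick the category whose centroid is nearest on the two observed features."""
    def squared_distance(c):
        return (c[0] - age) ** 2 + (c[1] - income_k) ** 2
    return min(centroids, key=lambda name: squared_distance(centroids[name]))

def estimate_purchases(age, income_k):
    """Estimate the unobserved feature as the mean of the assigned category."""
    category = classify(age, income_k)
    return category, centroids[category][2]

# Only age and income are known for the new case; purchases are inferred.
category, purchases = estimate_purchases(37, 70)
print(category, purchases)
```

The estimate is only as good as the category: it describes the typical member of the group, not the individual case.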

References:
Wikipedia

Aatish Sai

Engineering Manager | Software | Machine Learning | AWS Certified