Introduction

 

Owing to the rapid growth of the software industry, a large number of
databases are produced on a daily basis. As a result, software engineering (SE)
becomes more complex day by day, and because SE data is growing on a huge
scale, it becomes difficult for developers to extract useful information from
it. SE deals with the specification, architectural design, development,
deployment, and maintenance of computer-based systems.

 

Software repositories are produced continually, and these repositories need to
be managed. SE faces several challenges, including requirements gathering,
system integration and development, maintenance, pattern discovery, error
detection, reliability, and the complexity of software development. There are
three types of data in SE: (1) sequence, (2) graph, and (3) text.
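
As a rough illustration of these three data types, the sketch below shows one possible representation of each in Python, with a small operation that a mining technique might start from. The concrete examples (an API call trace as sequence data, a call graph as graph data, a bug report as text data) and all names in the code are assumptions chosen for illustration, not taken from a specific SE repository.

```python
import re

# Sequence data: an ordered trace of API calls (hypothetical example).
call_trace = ["open", "read", "read", "close"]

# Graph data: a call graph stored as an adjacency mapping (hypothetical functions).
call_graph = {
    "main": ["parse", "report"],
    "parse": ["read_file"],
    "report": [],
    "read_file": [],
}

# Text data: a free-form bug report (hypothetical wording).
bug_report = "Crash when parsing empty file; parser returns null pointer."


def bigrams(sequence):
    """Return consecutive pairs, a common starting point for sequence mining."""
    return list(zip(sequence, sequence[1:]))


def reachable(graph, start):
    """Return every node reachable from start using a depth-first traversal."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def tokens(text):
    """Split text into lowercase words, a common first step in text mining."""
    return re.findall(r"[a-z]+", text.lower())
```

Each helper only prepares the data; an actual mining technique would then be applied to the pairs, the reachable set, or the tokens.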

 

 

Data mining is the computing process of discovering patterns in large data
sets, involving methods at the intersection of machine learning, statistics,
and database systems. It is an essential process in which intelligent methods
are applied to extract data patterns [2]. With data mining techniques, valuable
information can be obtained from large data sets.

 

Data mining helps software engineers find the causes of software failures,
errors in software, and the interactions and relationships between different
classes. It also helps identify the patterns used in program source code.
Researchers and practitioners use data mining results to find problems in the
current system and to help produce a high-quality product within a manageable
budget and time period.

 

Data mining techniques are helpful for solving problems in all three
categories of SE data.

The data mining process consists of seven steps: data integration, data
cleaning, data selection, data transformation, data mining, pattern evaluation,
and knowledge presentation [3]. Techniques that can be applied to SE data
include generalization, characterization, classification, clustering,
association, decision tree or rule induction, and frequent pattern mining [4].
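
The seven steps can be sketched as a small pipeline. The example below is a minimal illustration, assuming toy commit records from two hypothetical sources and using simple co-change counting as a stand-in for the data mining step. The data, function names, and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: commit records from two sources (files changed together).
SOURCE_A = [
    {"id": 1, "files": ["parser.c", "lexer.c"]},
    {"id": 2, "files": ["parser.c", "lexer.c", "README"]},
    {"id": 3, "files": None},  # corrupt record
]
SOURCE_B = [
    {"id": 4, "files": ["lexer.c", "parser.c"]},
    {"id": 5, "files": ["ui.c"]},
    {"id": 5, "files": ["ui.c"]},  # duplicate record
]


def integrate(*sources):
    """Step 1, data integration: merge records from several sources."""
    return [rec for src in sources for rec in src]


def clean(records):
    """Step 2, data cleaning: drop corrupt and duplicate records."""
    seen, out = set(), []
    for rec in records:
        if rec["files"] and rec["id"] not in seen:
            seen.add(rec["id"])
            out.append(rec)
    return out


def select(records):
    """Step 3, data selection: keep only the C source files of each commit."""
    return [[f for f in rec["files"] if f.endswith(".c")] for rec in records]


def transform(file_lists):
    """Step 4, data transformation: turn each commit into a sorted item tuple."""
    return [tuple(sorted(set(files))) for files in file_lists if files]


def mine(transactions):
    """Step 5, data mining: count how often each pair of files changes together."""
    counts = Counter()
    for items in transactions:
        counts.update(combinations(items, 2))
    return counts


def evaluate(counts, min_support=2):
    """Step 6, pattern evaluation: keep pairs that meet a minimum support."""
    return {pair: n for pair, n in counts.items() if n >= min_support}


def present(patterns):
    """Step 7, knowledge presentation: render patterns as readable lines."""
    return [
        f"{a} and {b} changed together in {n} commits"
        for (a, b), n in sorted(patterns.items())
    ]


patterns = evaluate(mine(transform(select(clean(integrate(SOURCE_A, SOURCE_B))))))
report = present(patterns)
```

Keeping each step as a separate function mirrors the order of the steps listed above and makes it easy to swap in a different mining technique at step 5 without touching the rest.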

 

Different mining techniques are available, and they can be applied to
different types of data. The mining techniques are classified as
classification, clustering, and associated attributes.

 

Figure 1: Data Mining Process

 

      

The objective of this review is to understand the concept of data mining as
used in software engineering and the application of its techniques to the
different types of data available in SE.