INTRODUCTION

 

Structured data vs unstructured data: structured data consists of
clearly defined data types whose pattern makes them easily searchable,
while unstructured data, “everything else”, consists of data which is
not easily searchable, such as social media postings.

Unstructured data versus structured data does not signify any genuine
conflict between the two. Customers select one or the other not on the
basis of their data structure, but on the basis of the applications that
use them: relational databases for structured data, and most other types
of application for unstructured data.

However, there is a growing tension between the ease of analysis on
structured data and the more difficult analysis of unstructured data.
Structured data analytics is a mature process and technology.
Unstructured data analytics is a nascent industry with a great deal of
new investment in R&D, but it is not yet a mature technology. The
structured versus unstructured data issue inside companies is deciding
whether they ought to invest in analytics for unstructured data, and
whether it is possible to aggregate the two into better business
intelligence.

 

What is structured data?

Structured data depends upon the creation of a data model, which defines
the types of business data that will be recorded and how they will be
stored and processed. The model also specifies which fields are stored
and the data type of each field (numeric, textual, name, address, etc.),
along with any restrictions on data input. The benefit of structured
data is that it can be easily stored, processed and analysed. Structured
data is often managed using Structured Query Language (SQL), a
programming language created for managing and querying data.
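
The idea of a data model with typed fields and input restrictions can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the customer table, its columns and the sample rows are hypothetical and do not come from any particular system.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# The data model: every field has a declared type and, where needed,
# a restriction on the values that may be entered (hypothetical schema).
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT    NOT NULL,
        address     TEXT,
        age         INTEGER CHECK (age >= 0)
    )
    """
)

conn.executemany(
    "INSERT INTO customer (name, address, age) VALUES (?, ?, ?)",
    [("Ann", "12 Hill Road", 41), ("Bob", "7 Lake Street", 29)],
)

# Because the structure is known in advance, a search is a single query.
over_30 = conn.execute(
    "SELECT name FROM customer WHERE age > 30 ORDER BY name"
).fetchall()

# Input that violates a restriction is refused by the database.
try:
    conn.execute("INSERT INTO customer (name, age) VALUES (?, ?)", ("Cy", -5))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Here over_30 holds only the row for Ann, and the insert with a negative age is refused, which is the kind of restriction on data input described above.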

 

What is unstructured data?

 

Unstructured data is not arranged in a fixed, predefined way; it is data
which has no fixed data model.

 

1.     Unstructured data can't be stored in a table without
preprocessing.

2.     Examples: social media content (tweets, blogs, posts, etc.), call
centre data, email, surveys with open questions.
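
As a rough sketch of what such preprocessing might look like, the snippet below turns one free-text social media post into a fixed set of columns that could then be stored in a table. The function name to_row, the chosen columns and the sample post are all assumptions made for illustration.

```python
import re


def to_row(post: str) -> dict:
    """Turn one free-text post into a fixed set of columns."""
    return {
        "hashtags": re.findall(r"#(\w+)", post),   # words after '#'
        "mentions": re.findall(r"@(\w+)", post),   # words after '@'
        "word_count": len(post.split()),           # whitespace-separated tokens
    }


# Hypothetical post used only to exercise the function.
row = to_row("Loving my new car from @AcmeMotors! #roadtrip #happy")
```

The free text itself is unchanged; the preprocessing step only extracts the parts that fit a predefined structure.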

 

Unstructured data is strongly influenced by the three V’s:-

Volume:- Unstructured data usually requires more storage than structured
data.

Variety:- Unstructured data is generated by previously untapped data
sources, which can reveal personal information about customers.

Velocity:- Unstructured data is growing at a faster pace than structured
data.

The figure below represents the three V’s:-

 

 

 

Figure 1 (Source: infodiagram.com)

 

 

How prevalent is unstructured data?

Most business data is unstructured data. It grows much faster than
structured data.

 

1.     More data storage is required for pictures and videos, which are
also called “Rich Content”.

2.     Data produced by objects that were formerly not connected, such
as watches, cars and robots, is a major contributor to the growth of
data. Unstructured data sources are becoming the predominant source of
customer insights.

3.     Structured data, when combined with unstructured data sources,
helps to obtain a more complete picture of what customers need and want.

4.     Unstructured data is more subjective: structured data tends to
provide answers to “what” questions, while unstructured data usually
provides the answer to “why” questions.

 

 

The world of computing has grown from a small, relatively
unsophisticated world in the mid-1960s to an environment of enormous
size and sophistication. Everything from the daily life of people to
national economic productivity has been profoundly and positively
affected by the growth in the use of the computer. This growth can be
measured in two ways:- structured systems and unstructured systems.

 

 

 

 

DIFFERENCE BETWEEN STRUCTURED AND UNSTRUCTURED DATA

STRUCTURED DATA

1.     Structured systems are those systems where the activity of
processing data and output is predetermined and highly organised.

2.     Structured systems are designed, built and operated by the IT
department.

3.     ATM transactions and manufacturing inventory control systems are
all forms of structured systems.

4.     The rules in structured systems are quite complex.

UNSTRUCTURED DATA

1.     By contrast, unstructured systems are those systems which have
very little form or structure.

2.     Unstructured systems include email, reports, contracts, and other
communications.

3.     A person who performs a communications activity in an
unstructured system has wide latitude to structure the message in
whatever form is desired.

4.     The rules of unstructured systems are fewer and less complex.

 

Figure 2:- Great benefits can be achieved from bridging the gap between
structured and unstructured systems

 

Structured and unstructured data systems have grown in parallel but
separately. As a result, each has its own environment, and the two
differ from each other in ways such as:-

1.     Structural

2.     Organisational

3.     Functional and technical

 

There could be a huge number of possibilities if the two kinds of system
were connected in an effective way. New types of system could be built
as enhancements to existing systems. Even greater benefits could be
achieved if all the technical, structural, functional and organisational
barriers were removed.

 

A NEW PERSPECTIVE OF DATA

Business intelligence faces certain limitations because it is based
entirely on numbers. The most distinctive and necessary way to narrow
the gap between structured and unstructured data is to merge text and
numeric data, which can lead to better information and insight than was
previously attainable.

There are numerous ways in which the merger of numeric and textual data
can be used to produce more innovative results. One example is to create
an unstructured contact file, which gives access to every communication
the customer has previously had with the organisation, including letters
and emails. This file holds useful details such as the communication
itself, the date of contact, the person who was contacted, the nature of
the contact and more.

 

USES FOR THE UNSTRUCTURED CONTACT FILE

One of the most powerful uses of the customer contact file is to
supplement a CRM system in order to create a broad view of the customer,
which makes it possible to accomplish these important objectives:-

1.     Cross selling:- If one understands a lot about the customer in
one arena, chances to sell to the same customer in another arena will
materialize.

2.     Prospecting:- The better one knows and understands a customer,
the better one can qualify a sales prospect list.

3.     Anticipation:- By understanding more about the customer, we can
anticipate and meet future needs.

One of the essential tenets of CRM is that it is substantially easier to
sell to an established customer than to win a new one. This long-term
relationship is built on integrated knowledge about the customer,
including:

·       Age
 

·       Occupation
 

·       Net worth
 

·       Marital status
 

·       Education
 

·       Children
 

·       Income
 

·       Address
 

The idea behind creating the 360 degree view of the customer is to bring
together data from a wide range of places in order to integrate it and
achieve a truly cohesive and comprehensive view of the customer.

 

 

Figure 3

 

 

However, there are challenges to integrating all this data, such as:

1.     Finding the data in the first place

2.     Maintaining data held in different technologies

3.     Merging the gathered data

4.     Keeping the customer’s profile up to date

5.     Managing the volume of collected data

 

Unstructured contact file

CUSTOMER ID
 

·       name

·       age

·       gender

·       address

·       phone

·       occupation

·       Income

 

By itself, the data gathered as part of this process is valuable.
However, to create a true 360 degree view of the customer, one should
enhance this data with the rich vein of unstructured customer
communications data. Only then will you have the complete view. Rather
than simply knowing odd facts about the customer, the organization can
know what the customer has been saying and what communications have
occurred. In order to achieve the 360 degree view of the customer, many
different kinds of data are integrated together.

 

 

Figure 4

 

 

 

 

BUILDING THE UNSTRUCTURED CONTACT FILE

 

 

There are various ways to build an unstructured contact file. Using
email as an example, the easiest and most common way is to index the
unstructured contact file and leave the emails where they are originally
located. With this technique, an index entry is created for every
communication, containing a few items such as:-

 

• Communication date

• To whom the communication is directed

• Customer’s name and identification

• Email’s location

 

Whenever a corporation wants to find out whether there has been any
communication, the index is used. If a communication appears relevant,
the corporation can look up the storage location of the email and read
the email. Alternatively, the actual email can be stored with the index,
so that no further search is required. This approach requires more
system resources, but it does reduce the work required to find a
specific email.
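
The index-only approach described above can be sketched as follows. The field names mirror the four items listed earlier; the IndexEntry class, the find_communications helper, the customer identifiers and the file paths are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IndexEntry:
    """One index record per communication; the email itself stays put."""
    communication_date: date
    directed_to: str       # to whom the communication is directed
    customer_id: str
    customer_name: str
    email_location: str    # where the original email is stored


# Hypothetical index contents.
index = [
    IndexEntry(date(2017, 5, 9), "billing", "C001", "Ann Lee", "mail/2017/0509-03.eml"),
    IndexEntry(date(2017, 3, 2), "support", "C001", "Ann Lee", "mail/2017/0302-17.eml"),
    IndexEntry(date(2017, 4, 1), "sales", "C002", "Bob Ray", "mail/2017/0401-08.eml"),
]


def find_communications(entries, customer_id):
    """Return the index entries for one customer, oldest first."""
    matches = [e for e in entries if e.customer_id == customer_id]
    return sorted(matches, key=lambda e: e.communication_date)


# Look up the index first, then follow the stored location to the email.
locations = [e.email_location for e in find_communications(index, "C001")]
```

Storing the full email text in each entry instead of only its location would correspond to the alternative approach: more storage used, but no second lookup.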

 

 

 

 

USES OF UNSTRUCTURED CONTENT IN OTHER
APPLICATIONS

 

Another important use of unstructured data is in litigation support. For
instance, suppose a company is sued by someone. The first thing the
company should know is what contact it has had with that person: with
whom he or she was working, and with whom he or she communicated. In
this case, the ability to view unstructured data is invaluable.

 

Another use is to mix structured and unstructured data to enhance
business intelligence and reports. While it is through reports and
business intelligence that applications convey their findings to the end
user, reports and business intelligence have a major limitation because
they depend primarily on structured systems for their data. Structured
applications are good at:

1.     Creating summaries

2.     Breaking summary data down into different categories

3.     Creating drill-downs

4.     Creating drill-across views
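
A minimal sketch of the first three capabilities, using plain Python rather than any particular business intelligence product. The sales rows, the summarize helper and the column names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical structured records.
sales = [
    {"region": "North", "product": "Sedan", "amount": 200},
    {"region": "North", "product": "Truck", "amount": 300},
    {"region": "South", "product": "Sedan", "amount": 150},
]


def summarize(rows, *keys):
    """Total the amount column, grouped by the given key columns."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["amount"]
    return dict(totals)


# 1. A summary over everything (no grouping keys).
grand_total = summarize(sales)

# 2. The same summary broken down into categories.
by_region = summarize(sales, "region")

# 3. A drill-down: take one category and break it down further.
north_rows = [r for r in sales if r["region"] == "North"]
north_by_product = summarize(north_rows, "product")
```

All three steps work only because every record has the same named, numeric fields, which is exactly what free text lacks.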

 

Figure 5

How Semi-Structured Data Fits with Structured
and Unstructured Data

Semi-structured data keeps internal tags and markings that identify
separate data elements, which enables information grouping and
hierarchies. Both documents and databases can be semi-structured. This
type of data only represents around 5-10% of the
structured/semi-structured/unstructured data pie, but it has important
business use cases.

Email is a very common example of a semi-structured data type. Although
more advanced analysis tools are needed for thread tracking,
near-dedupe, and concept searching, email’s native metadata enables
classification and keyword searching without any additional tools.
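
To illustrate how email’s native metadata supports grouping and keyword searching, the sketch below uses Python’s standard email package on three made-up messages. The addresses, subjects and bodies are hypothetical.

```python
from collections import defaultdict
from email import message_from_string

# Hypothetical raw messages: structured headers, then a free-text body.
raw_messages = [
    "From: ann@example.com\nTo: sales@example.com\n"
    "Subject: Invoice query\n\nHello, I have a question about my invoice.",
    "From: bob@example.com\nTo: sales@example.com\n"
    "Subject: Delivery date\n\nWhen will my order arrive?",
    "From: ann@example.com\nTo: sales@example.com\n"
    "Subject: Re: Invoice query\n\nThanks, that answers it.",
]

# Grouping: the From header classifies messages without reading the body.
by_sender = defaultdict(list)
for raw in raw_messages:
    msg = message_from_string(raw)
    by_sender[msg["From"]].append(msg["Subject"])

# Keyword searching: a plain match on the Subject header.
invoice_subjects = [
    subject
    for subjects in by_sender.values()
    for subject in subjects
    if "invoice" in subject.lower()
]
```

The headers behave like structured fields, while the message bodies remain unstructured text that would need the more advanced tools mentioned above.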

 

Semi-structured Data examples:-

·       Markup language XML

XML is a semi-structured document language. It is a set of document
encoding rules that defines a human- and machine-readable format. Its
value is that its tag-driven structure is highly flexible, and coders
can adapt it to universalize data structure, storage, and transport on
the Web.

 

·       Open standard JSON

JSON is another semi-structured data interchange format. JavaScript is
implicit in the name, but other C-like programming languages recognize
it. Its structure consists of name/value pairs (e.g. an object) and an
ordered value list (e.g. an array). Since the structure is
interchangeable among languages, JSON excels at transmitting data
between web applications and servers.

 

·       NoSQL

Semi-structured data is also an important element of many NoSQL
databases. NoSQL databases differ from relational databases because they
do not separate the organization (schema) from the data. This makes
NoSQL a better choice for storing information that does not easily fit
into the record and table format, for example text of varying lengths.
It also allows for easier data exchange between databases. Some newer
NoSQL databases, such as MongoDB and Couchbase, also incorporate
semi-structured documents by natively storing them in the JSON format.
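
The XML and JSON formats described above can be sketched together using only Python’s standard library. The customer document, its tag names and its values are hypothetical, chosen to show tags marking out separate data elements and the same content expressed as name/value pairs and an ordered list.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML document: tags and attributes mark each data element.
xml_text = """
<customer id="C001">
  <name>Ann Lee</name>
  <contact type="email">ann@example.com</contact>
  <contact type="phone">555-0100</contact>
</customer>
""".strip()

root = ET.fromstring(xml_text)

# The same content as name/value pairs (an object) and an ordered
# value list (an array).
record = {
    "id": root.get("id"),
    "name": root.findtext("name"),
    "contacts": [
        {"type": c.get("type"), "value": c.text}
        for c in root.findall("contact")
    ],
}

# Serialise to JSON text for transport, then read it back.
json_text = json.dumps(record, indent=2)
round_trip = json.loads(json_text)
```

A document of this shape, with a variable number of contact entries, is the kind of content that fits a document store more naturally than a fixed record and table layout.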

 

 

 

Structured vs Unstructured Data: Next Generation
Tools are Game Changers

New tools are available to analyze unstructured data. Most of these
tools rely on machine learning. Structured data analytics may also use
machine learning, but the huge volume and the many different kinds of
unstructured data require it. Unstructured data analytics with
machine-learning intelligence enables organizations to:-

 

• Analyze digital communications for compliance:-

Failed compliance can cost organizations a great deal of money in lost
business and other costs. Pattern recognition and email threading
analysis software searches enormous amounts of email and chat data for
potential noncompliance. A recent case is Volkswagen, which might have
avoided huge fines and reputational hits by using analytics to monitor
communications for suspicious messages.

 

• Track high-volume customer conversations in social media:-

Text analytics and sentiment analysis let analysts review the positive
and negative results of marketing campaigns, or even identify online
threats. This level of analytics is far more sophisticated than simple
keyword search, which can only report basics such as how often posters
mentioned the company name during a new campaign. New analytics also
include context: was the mention positive or negative? Were posters
responding to each other? What was the tone of responses to official
announcements? The automotive industry, for instance, is heavily
involved in analyzing social media, since car buyers frequently turn to
other posters to gauge their car buying experience. Analysts use a mix
of text mining and sentiment analysis to track auto-related customer
posts on social media sites such as Twitter.

 

• Gain new marketing intelligence:-

Machine-learning analytics tools quickly work through enormous amounts
of documents to analyze customer behaviour. A major magazine publisher
applied text mining to a very large number of articles, analyzing each
separate publication by the popularity of major subtopics. They then
extended the analytics across all their content properties to see which
overall topics got the most attention by customer demographic. The
analytics ran across a huge number of pieces of content over all
publications, and cross-referenced hot topic results by segment. The
result was a rich education in which topics were most interesting to
particular customers, and which marketing messages resonated most
strongly with them.
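
The contrast drawn above between simple keyword search and sentiment analysis of social media posts can be sketched as follows. This is a deliberately naive word-list approach, not the machine-learning tooling the text refers to; the word lists, the brand name and the sample posts are all assumptions made for illustration.

```python
import re

# Tiny hand-made word lists; real tools learn these signals from data.
POSITIVE = {"love", "great", "smooth"}
NEGATIVE = {"hate", "broken", "terrible"}


def count_mentions(posts, brand):
    """Keyword search: how many posts mention the brand at all."""
    return sum(brand.lower() in post.lower() for post in posts)


def sentiment(post):
    """Label a post by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", post.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical posts about a hypothetical brand.
posts = [
    "I love my new Acme sedan, such a smooth ride",
    "Acme dealership was terrible and the radio is broken",
    "Picked up the Acme today",
]

mention_total = count_mentions(posts, "Acme")
labels = [sentiment(post) for post in posts]
```

Keyword search reports only that the brand was mentioned in every post, while even this crude scoring separates the favourable post from the unfavourable one.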