Data mining 101, it begins!

As a brief introduction to this series, we will start off with just a few small examples of who is using this technique and why.

When people hear the term “data mining,” they picture computer geeks hunched over their keyboards, getting excited about a bunch of formulas and mixing them together. For the record, that is only a very small part of this amazingly large market.

Let’s start with journalists: people who are always looking to combine information from various sources, scrub it, and be the first to report the next big thing.

Then there are the marketing agencies, scouring the information the web provides each day. Did you ever wonder why the Facebook ads on the right of your profile are kind of related to what you like? Perhaps you thought it was pretty cool, or scary, that Google searches always show advertisements focused on what you searched for and on your area? And what about those web pages you visit where the advertisement comes from the same company you looked at yesterday and found pretty interesting?

Ordinary businesses use it as well, for everything from new product research and consumer acceptance of products to brand protection. It is critical for companies to have this information, and to have it as quickly as possible.

So it is interesting to see just a few of the areas that depend on data mining. One of the biggest challenges today is that the people who need it have difficulty doing it.

Recently I have seen reports and blogs stating that there will soon be over 190,000 unfilled positions in the US alone. These are jobs at the top companies of the world; a small or average-sized organization could never afford such a specialist (does supply and demand ring a bell in relation to their salaries?). So what is being done for the “normal” people of the world who cannot afford to hire a data specialist?

I guess I will stop there for today, before I write a novel-length blog post. Next I will talk about the types of data that are being mined, from historical to real-time and structured to unstructured, and about the combinations that are still lacking - such as real-time and unstructured, which is the most interesting and valuable of all.