Business Intelligence and Data Mining

Business intelligence (BI) is very much in fashion these days. Many companies devote a lot of energy to implementing and maintaining their BI systems, whether that means a data warehouse or an operational report center. The opportunities are plentiful and the resources abundant. There seem to be only benefits, but is it really worthwhile? Is it cost-effective and productive to put so much effort into it, or is it a luxury that only big businesses can afford? There’s no one-size-fits-all answer, just as no two companies have identical needs. Still, I have to say that having a better informed view of each of the customer’s business lines, available when the need for analysis arises, boosts confidence and provides extra assurance when making short-, medium- and long-term decisions, and that’s a very good thing.

90% of the data in the world today have been created in the last two years alone.

Source: IBM (2013)

Let’s start at the beginning: What is business intelligence? It’s a set of principles and methods for extracting useful information and knowledge from data. Whether it’s used for forecasting, exploring or investigating, the focus remains the same: data. That data can be extracted and displayed in various ways:

  • graphic representations
  • summary and/or detailed reports
  • interactive displays (e.g. dynamic forms)
  • etc.

Now let’s talk about a particularly timely and well-known aspect: data mining. Often used for exploration, it consists of extracting knowledge from data, usually a lot of data, by finding meaningful structures or patterns based on predefined criteria. It involves three steps: collecting the data, analyzing it and presenting it to the customer. Doing that means understanding the need, understanding the data, preparing the data and applying the appropriate techniques. The process is summarized below:

[Figure: Data mining process]

“Data mining will become much more important and companies will throw away nothing about their customers because it will be so valuable. If you’re not doing this, you’re out of business.”

– Dr. Arno Penzias, winner of a Nobel Prize in physics

Each step is divided into sub-steps, which can complicate a seemingly simple solution. As in any field of technology, BI presents challenges. What comes to mind here is how to combine historical data with current data, how to reconcile incompatible data types, the expertise required and the technology choices to be made based on performance. Systems often have to run day and night to have the data ready for D-day. Also worth noting is that, in data mining, the user’s focus is not on generating formulas but on how well the algorithm performs against their objectives.

Data preparation is the technical side of the process and is done in four stages: consolidation, cleanup, transformation and reduction. Its input is real-world data and its output is a store of prepared data. For the technical portion of this article, there are several types of data to consider: for example, categorical data (nominal and ordinal) and numerical data (interval and ratio).

Categorical data

  • Often come from the conversion of a numerical variable (nominal)
    • e.g. conversion of age into age group
  • Add the concept of order to the possible values (ordinal)
    • e.g. credit rating (low, medium, high)
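
As a small illustration of the two examples above, here is a minimal Python sketch that converts an age into an age group and encodes a credit rating so that its order is preserved. The bin edges, labels and function names are assumptions made up for this example; real projects would pick them from the business need.

```python
from bisect import bisect_right

# Hypothetical bin edges and labels, chosen only for illustration.
AGE_EDGES = [18, 35, 65]
AGE_LABELS = ["under 18", "18-34", "35-64", "65+"]

# Ordinal scale from the credit rating example: the order matters.
CREDIT_ORDER = {"low": 0, "medium": 1, "high": 2}


def age_group(age):
    """Map a numerical age to a categorical age group."""
    # bisect_right counts how many edges are <= age, which is the bin index.
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


def credit_rank(rating):
    """Encode a credit rating as an integer that preserves its order."""
    return CREDIT_ORDER[rating.lower()]
```

With this encoding, comparisons such as credit_rank("high") > credit_rank("medium") behave as expected, which would not be true if the ratings were compared as plain strings.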

Numerical data

  • Variables used to measure a certain quantity
    • e.g. age, number of children, family income, etc.

Other data

  • pictures, audio, text, etc.
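
To make the four data preparation stages mentioned earlier (consolidation, cleanup, transformation and reduction) a little more concrete, here is a rough Python sketch. It is only one possible reading of those stages: the sample records, field names, cleanup rules and the 35-year threshold are all hypothetical.

```python
def consolidate(*sources):
    """Consolidation: merge records from several sources into one list."""
    return [dict(row) for source in sources for row in source]


def clean(rows):
    """Cleanup: drop rows with a missing age and duplicate customers."""
    seen, out = set(), []
    for row in rows:
        if row.get("age") is None or row["customer_id"] in seen:
            continue
        seen.add(row["customer_id"])
        out.append(row)
    return out


def transform(rows):
    """Transformation: derive a categorical age group from the age."""
    for row in rows:
        row["age_group"] = "under 35" if row["age"] < 35 else "35 and over"
    return rows


def reduce_columns(rows, keep):
    """Reduction: keep only the attributes needed for the analysis."""
    return [{key: row[key] for key in keep} for row in rows]


# Hypothetical sample data standing in for the real data sources.
crm = [{"customer_id": 1, "age": 28, "income": 52000},
       {"customer_id": 2, "age": None, "income": 61000}]
web = [{"customer_id": 1, "age": 28, "income": 52000},
       {"customer_id": 3, "age": 47, "income": 75000}]

prepared = reduce_columns(
    transform(clean(consolidate(crm, web))),
    keep=("customer_id", "age_group", "income"),
)
```

Here customer 2 is dropped because the age is missing and the second copy of customer 1 is dropped as a duplicate, so prepared ends up with two rows holding only the three kept attributes.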

In conclusion, I should stress that data mining provides an especially attractive competitive edge. This technique, already widely used in high-growth industries and fast-moving business domains, can now be applied to most spheres of human activity, such as scientific research and public safety. The knowledge it uncovers lets companies improve and speed up decision making, resulting in original, cost-effective business solutions. It drives better informed decisions about consumer behavior, for example. Data mining lets you get to know your customers better so you can find out what they really want. Knowledge is power.

In the next few weeks, we’ll be going into further detail about the many facets of business intelligence and the various techniques it relies on. I’m thinking especially of data warehousing, decision support systems and operational report centers. Fortunately, I feel it’s an inexhaustible subject, and I hope you think so too.

Until then, thanks for your attention and good data mining!