Business Intelligence

I first started thinking about the need for every company to have a BI infrastructure when I moved from a large multinational that used SAP extensively to a smaller ‘Mom and Pop’ business, where I work in Digital Marketing. Company B was able to see Booking Sales data through a ‘Read Only’ setting; however, they were unable to export the data and run pivot tables against it.

They could only see what time a booking came in and its retail value, and any analysis was carried out manually (which completely blew my mind). As a new person on the team (who was reading Tim Ferriss’s The 4-Hour Workweek), I had an unbiased eye on how we could improve the business. I suggested that at present we were only receiving Data, and that little valuable Information could be gleaned from it. Ideally, I would like to integrate our Online Booking system with a CRM system and our Digital Marketing Analytics so that we could have a one-stop shop for our information requirements.

Business Intelligence

Business Intelligence is often misused as a catch-all term, but what exactly is it?

Mukhles Zaman defines business intelligence as: “…neither a product nor a system. It is an umbrella term that combines architectures, applications, and databases. It enables the real-time, interactive access, analysis, and manipulation of information, which provides the business community with easy access to business data”.


BI analyzes historical data generated through transactions or other kinds of business activity, and helps businesses by analyzing their past and present situation and performance. In summary, Business Intelligence is the collective information garnered from your customers, competitors, external business partners such as suppliers, and internal operations, gathered in order to make data-based decisions.

BI as a Data refinery turning data into actions and business value

BI has a symbiotic relationship with big data. Both technologies developed at the end of the 1980s, when Business Intelligence took over from OLAP Reporting. While Data Warehouses provide a one-stop shop for data-driven insights, Business Intelligence extracts this data and presents it in a more usable format. The beauty of BI is that it can extrapolate information from many different formats:

  • Excel reporting, which delivers tabular information and includes new tools such as Power Query, Power Pivot, Power View and Power Map, along with the Data Mining add-ins, to do BI.
  • Dashboards, which offer a combination of tabular and visual information on screen. The idea is to get an overview and then perform a deep dive.
  • Increasing use of map-based visualisation, e.g. Fusion Tables, where information is presented in a geographical format.


Information is made available through reports and dashboards, which turn raw data into information that can be easily accessed. Enterprises now collect data at a more granular level, e.g. the number of tweets per day, and therefore generate more results. Business Intelligence caters for different types of users in the analytical environment: Decision Makers, Operational Users, and Analysts and Specialists. Operational Users are the largest group; they are the front-line staff who need fast access to information. They have the lowest access clearance, which may only involve access to look-up screens. The Analysts and Specialists are a step up and support the Decision Makers in the organisation; they have access to detailed data across the business, including ad hoc query tools. At the top of the access pyramid are the Decision Makers, including Business Unit Leaders, who need fast access to KPIs and point-and-click reports based on their interests.

Word Cloud "Business Intelligence"

At its kernel, BI allows companies to draw critical insights from Data to gain a competitive or knowledge advantage by:

  • Empowering people to make data-driven decisions through the delivery of meaningful analysis and reporting.
  • Simplifying collaboration and the sharing of information by providing a single source of information, which benefits all aspects of the business, including operations and finance.
  • Allowing the end customer to gain insights and to understand and analyze business performance and opportunities on a deeper level, which may throw up some surprises (i.e. the customer that you devote most time to may not be the most profitable).
  • Sharing information efficiently and effectively with people across your organization. BI applies a slice-and-dice mechanism so that information is not inefficiently replicated across the business.
  • Providing a solution with the scalability and flexibility to grow and change as your organization does.

In order for a BI project to succeed, the following must be in place:

Business Intelligence Plan

 

  • The Business Users and the BI technical team should work closely together. The primary stakeholders from different levels of the business must come together to scope the project and to track its progress. Business Intelligence requires cross-functional data: e.g. accurately measuring the ratio of customer profitability to employee incentives, by business unit and by region, would require customer information, financial and business performance information, and employee information, so involvement from the IT, Sales, HR and Supply Chain functions would be required.
  • It should enjoy a high level of C-level support. Change management is a skill in itself and many people are naturally resistant to the idea of change. Change must be adopted top down and a ‘scrum leader’ appointed to break any impasses in the project. Every group believes its project is the most important, so prioritizing the BI initiative is vital to keep the project on track. (Top tip: repeat ‘Out of scope’ ad nauseam. It might be worthwhile to compile a thesaurus of different ways of saying ‘Excellent idea, however it is out of scope of this project!’) BI initiatives should be prioritized based on Return on Investment, Strategic Importance and Ease of Execution.
  • Appropriate resources should be in place to support the project. BI projects represent a huge commitment in terms of money and time, so it is vital that adequate resources are dedicated to bringing the project across the line.
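To make the cross-functional example in the first point a little more concrete, here is a minimal sketch in R of how a customer profitability to employee incentives ratio could be worked out by business unit and region. The two data frames, their column names and every figure in them are made up purely for illustration; in a real BI project these extracts would come from the finance and HR systems.

```r
# Hypothetical extract from the finance/customer side (all values are made up)
customers <- data.frame(
  business_unit   = c("Retail", "Retail", "Online", "Online"),
  region          = c("East", "West", "East", "West"),
  customer_profit = c(120000, 90000, 60000, 75000)
)

# Hypothetical extract from the HR side (all values are made up)
incentives <- data.frame(
  business_unit       = c("Retail", "Retail", "Online", "Online"),
  region              = c("East", "West", "East", "West"),
  employee_incentives = c(12000, 15000, 5000, 7500)
)

# Join the two sources on the keys they share
combined <- merge(customers, incentives, by = c("business_unit", "region"))

# Ratio of customer profitability to employee incentives
combined$profit_per_incentive <-
  combined$customer_profit / combined$employee_incentives

combined
```

The point of the sketch is simply that neither extract can answer the question on its own, which is why several departments need to be involved.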

I found the topic of BI fascinating, and I believe that critical thinking, problem solving and the desire to succeed are among the most important attributes that any organisation can have. Deployed correctly, Business Intelligence will save money by decreasing costs and will increase sales, which is at the kernel of all things business related.

 

 

Big Data and Tesco

Talk more about Big Data

Tesco was a pioneer in teaming up with the data analysts Dunnhumby in 1995 to launch the Tesco Clubcard in order to turn the tide on their diminishing market share. This moved away from their historic Green Shield Stamps offline promotional deal, which rewarded purchases but did not collect customer information. In 1993 they introduced a card that incorporated a magnetic strip that recorded the customer’s details: name, address, telephone number, email address, occupation, gender and family status. Tesco are now also able to acquire data on the type of internet browser and computer operating system you are using, what you eat, your shopping habits and what products you buy on a consistent basis, as well as data on payment methods and social networking information.

TESCO CLUB CARD

In 2011, Tesco CEO Sir Terry Leahy attributed the success of Tesco.com to the data garnered from their loyalty card. Tesco receives detailed data on two thirds of its purchases. They are able to analyse data on 10 million customers, hundreds of millions of shopping baskets a year and billions of rows of item-level data. The Clubcard scheme operates in the UK, Ireland, the Czech Republic, Hungary and Poland; the UK market had in excess of 15 million members as of 2010. According to Andy Ruckley, Tesco runs a Teradata data warehouse with a capacity of 100 in order to be able to process and keep 5 years of sales data at product and store level. Dunnhumby utilizes Tableau data visualisation to analyse the 3,000 stores in the UK with 1,500 lines of produce.


Even customers who have opted out of having a loyalty card, or who are not using it on the day, can have their purchases tracked if they pay by credit card. An anonymised card would only need an opt-in if Tesco planned to use this information to communicate with the customer. Data is also gathered by data harvesting on Social Media, identifying interests, demographics etc.

The Clubcard is designed to help retain loyal customers and to market to them based on their historical purchasing preferences. It helps Tesco to build up a data-driven demographic profile, gain insights and engage with loyal customers, and to tailor the promotions delivered to customers in the same manner that Amazon has mastered so well. The redemption rates of Tesco vouchers are the best within the industry. Tesco produce a variety of different versions of their lifestyle magazine, based on information gathered from individual shopping cards, that match content and coupons to various life stages and orientations. They have developed a dozen or so core lifestyle classifications, but that results in millions of variations in content and coupons. These personalized magazines of course drive conversion substantially.

Tesco Clubcard TV: a streaming site that serves remarketing ads based on the offline information in Clubcard receipts. The contents of the trolley you purchased earlier can be served back to you later in a remarketing ad.

  • Gain Insights on Customer Behaviour
    • Identify preferences and tailor offers to the customer dependent on their historic buying patterns
  • Apply Predictive Weather Analysis
    • Minimising out of stocks

Unlocking Customer Behaviour and Insights through data driven

You’ll find it’s worth your time: according to MGI, a retailer using Big Data to its current potential could increase its operating margin by up to 60%.

Lord Ian MacLaurin famously told Clubcard’s developers that they knew more about Tesco’s customers after just three months than he did after 30 years.

Every time a Clubcard is used, a record of the store shopped in, the products purchased and the price paid is stored against the Clubcard account. Applicants are asked to provide personal details such as name, address and children. Tesco have stated that this is to help them pick vouchers that are relevant to the holder and also to monitor trends to help product availability. Information is stored for 2 years.

It is a simple concept: for every €1 spent, the customer gets €0.01 back. Vouchers are sent to customers on a quarterly basis, so that repeat purchases are recognised and customers are rewarded for shopping.
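The arithmetic behind the Clubcard reward (€0.01 back for every €1 spent, paid out as a quarterly voucher) can be sketched in a couple of lines of R. The function name clubcard_voucher and the monthly spend figures are my own illustration, not anything Tesco publishes.

```r
# Voucher value for one quarter, assuming a flat reward of 0.01 per 1 spent
clubcard_voucher <- function(monthly_spend, reward_rate = 0.01) {
  sum(monthly_spend) * reward_rate
}

# Hypothetical spend for the three months of a quarter (made-up figures)
quarter_spend <- c(250, 310, 180)

clubcard_voucher(quarter_spend)
```

A shopper spending €740 over the quarter would therefore receive a voucher worth about €7.40, which shows how modest the cost of the reward is compared with the value of the data collected.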

 

Tesco profitability and big data

Tesco’s pioneering use of Business Intelligence.

Tesco’s Supply Chain Management team uses a statistical model that predicts the impact of weather on customer buying behaviour. By comparing sales records with historical weather data, they are able to adjust stock levels based on the weather forecast. Data analysts use Matlab to perform the complex analysis, modelling buying patterns so that Tesco can stock better, cut waste and target discounts.

This predictive weather analysis, which compared sales records with historical weather, reduced out-of-stocks by a factor of 4. Duncan Apthorp, Supply Chain Programme Development Manager, estimates that this saved Tesco 16 million in the first year.
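Tesco’s actual weather model is not described in any detail, so the following is only a rough sketch of the general idea, using a simple linear regression in R (rather than Matlab) as a stand-in. The sales history is synthetic data generated inside the script, and the column names temperature_c and units_sold are my own assumptions.

```r
set.seed(42)

# Synthetic sales history: 60 days of temperature and units sold (made up)
history <- data.frame(temperature_c = runif(60, min = 5, max = 30))
history$units_sold <- 200 + 12 * history$temperature_c + rnorm(60, sd = 15)

# Fit a simple linear model of sales against temperature
model <- lm(units_sold ~ temperature_c, data = history)

# Use the weather forecast to estimate how much stock is needed
forecast <- data.frame(temperature_c = c(12, 25))
predicted_units <- predict(model, newdata = forecast)

predicted_units
```

A store manager could then order more stock for the warmer day and less for the cooler one, which is the behaviour the out-of-stock and waste savings depend on.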

Over 25% of Tesco’s stock is perishable, e.g. fruit, veg and dairy, so it was vital that they implemented a robust intelligence infrastructure. It reduced waste in stores and reduced stockpiling.

http://www.tableau.com/learn/stories/tesco-heats-sales-through-data-insight

 

Weather impact on retail

Tesco has also taken 70 million refrigerator-related data points coming off its units and fed them into a dedicated data warehouse to optimise performance.

Critical analysis

Loyalty cards don’t foster loyalty, as consumers are promiscuous about shopping around, hence the prevalence of different cards in our wallets. What they do give retailers is incredible insight that allows them to tailor their offers. Customers exchange information in return for offers that are relevant to them, and the rate of opt-out increases if they receive irrelevant offers. As with the Catholic Church, consumers have very few real opt-out options to ensure that their data isn’t used.

The author Michael Schrage noted: “Tesco’s decline presents a clear and unambiguous warning that even rich and data-rich loyalty programs and analytics capabilities can’t stave off the competitive advantage of slightly lower prices and a simpler shopping experience.”

Insights garnered from Analysing Big Data and mining the data  ECL

The role of Big Data in eCommerce: Tesco’s ability to gather Data, turn it into Information and gain Knowledge that leads to informed actions. Tesco is the 3rd largest retailer and translates massive amounts of customer data into sales. Tesco is able to mine this data to tailor its product offerings and communications with its customers.

 

Practical R

Laura’s Data Fix

My previous post dealt with R Programming from a theoretical stance, so I decided to follow up the theory with some practical evaluation of R.

When I first started to work on R, I found that there was a complete lack of ‘real beginners’ guides’ which delivered a step-by-step guide to implementing R. I found ‘R Studio’ more user friendly than the standard ‘R Console’, so I would recommend that new users install ‘R Studio’.

As with all R programmes, the initial step is to install the packages you need and then call them up from the library, e.g.

Step 1, install ggplot2:

install.packages("ggplot2", dependencies = TRUE)

Next up, load the library for ggplot2 as follows: library(ggplot2). Once the package is installed you can ‘comment out’ the install line:

#install.packages("ggplot2", dependencies = TRUE)

Create a variable from the Lunch/Dinner data frame as follows:
dat <- data.frame(time = factor(c("Lunch","Dinner"), levels=c("Lunch","Dinner")), total_bill = c(14.89, 17.23))

# The following command maps the time of day to different fill colours:
ggplot(data=dat, aes(x=time, y=total_bill, fill=time)) + geom_bar(stat="identity")

Map the time of day to different fill colours

# In order to add a title, narrower bars and a fill colour, and to change the axis labels, run the following command:
ggplot(data=dat, aes(x=time, y=total_bill, fill=time)) + geom_bar(colour="black", fill="#DD8888", width=.8, stat="identity") + guides(fill="none") +
xlab("Time of day") + ylab("Total bill") + ggtitle("Average bill for 2 people")

Average Bill for 2 people

 

# head() shows the top of the data, so you can see what it looks like
# tips is the data set of restaurant bills and tips; it is assumed here to come from the reshape2 package
head(tips)
# Count the number of bills recorded on each day
ggplot(data=tips, aes(x=day)) + geom_bar()

# It can be interesting to compare the female and male groups in order to see how much they spent for lunch and dinner. In order to compare the 2 groups accurately, it was necessary to group the data by female and male and by lunch and dinner time.

dat1 <- data.frame(sex = factor(c("Female","Female","Male","Male")),
time = factor(c("Lunch","Dinner","Lunch","Dinner"), levels=c("Lunch","Dinner")),
total_bill = c(13.53, 16.81, 16.24, 17.42))

R GRAPH COMPARING MALE, FEMALE SPENDS OVER LUNCH AND DINNER

Top Tip: Remember to run dat1 on its own, which echoes the data frame so that you can see what the data looks like:
dat1

# Stacked bar graph, time on x-axis, colour fill grouped by sex
ggplot(data=dat1, aes(x=time, y=total_bill, fill=sex)) + geom_bar(stat="identity")

# Bar graph, time on x-axis, colour fill grouped by sex; use position_dodge() to place the bars side by side
ggplot(data=dat1, aes(x=time, y=total_bill, fill=sex)) + geom_bar(stat="identity", position=position_dodge())

# The same graph with a black outline around each bar
ggplot(data=dat1, aes(x=time, y=total_bill, fill=sex)) + geom_bar(stat="identity", position=position_dodge(), colour="black")

Tips per day

 

Things that helped me (when using my own data): being able to check the structure of the data.
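For anyone wondering what checking the structure looks like in practice, here is a small sketch using base R only. It rebuilds the male/female lunch and dinner data frame from this post so that it runs on its own; str(), head(), summary() and levels() are all standard R functions.

```r
# The lunch/dinner data frame used earlier in this post
dat1 <- data.frame(
  sex        = factor(c("Female", "Female", "Male", "Male")),
  time       = factor(c("Lunch", "Dinner", "Lunch", "Dinner"),
                      levels = c("Lunch", "Dinner")),
  total_bill = c(13.53, 16.81, 16.24, 17.42)
)

str(dat1)                  # column names, types and a preview of the values
head(dat1, 2)              # the first two rows
summary(dat1$total_bill)   # min, quartiles, mean and max of the bills
levels(dat1$time)          # the order in which the factor levels will be plotted
```

Running str() before plotting is a quick way to catch a column that has been read in as text when you expected numbers, or a factor whose levels are in the wrong order.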


 

R Programming

Allowing non-programmers to dig deep into Data Science.

Arrr arrr me hearties, Captain Phillips was not here to save me; however, instead of walking the plank I received my R Programming Badge. Thankfully I never received a Jim’ll Fix It badge, so I promise to cherish my R Badge.

Proud owner of an R Certification Badge !

R first appeared in August 1993 as a statistical programming language developed for and by statisticians. It takes its name from its 2 creators, Robert Gentleman and Ross Ihaka. It retains its object-oriented heritage from its genesis in John Chambers’ ‘S’ language. R is maintained by 19 developers and is reviewed regularly by internationally renowned statisticians and computational scientists.

In addition to R’s conferences around the globe, R boasts a very active online community. It has continued its strong tradition of academic and entrepreneurial collaboration, and the popularity of R has spread from statisticians to other fields, including Biosciences and Humanities, Finance, Genetics, High-Performance Computing, Machine Learning, Medical Imaging, Social Sciences and Spatial Statistics. Matt Adams from Code School asserts that “Any new research in the field probably has an accompanying R package to go with it from the get-go. So in this respect, R stays at the cutting edge”. Confidence in R was demonstrated by the US Food and Drug Administration’s approval of R to interpret clinical data, including genomic data, in 2008.

Popularity of R Programmers

Part of R’s huge popularity stems from the fact that it is an Open Source programming language, meaning the wide community of developers can review the source code, improve on it, fix any bugs and add new features without having to wait on the original developers. R offers developers the opportunity to build their own tools to analyse data. Wikipedia asserts that, as of August 2015, there were in excess of 7,000 user-created packages available via CRAN (the Comprehensive R Archive Network: the collection of sites which carry identical material, consisting of the R distributions, the contributed extensions, documentation for R, and binaries), Omegahat, GitHub and other repositories, and this number is rising daily. It has been said that “If a statistical technique exists, odds are there is an R package for it”.

R’s dexterity and agility are well documented in relation to its compatibility with other tools and operating systems. Its cross-platform support covers various operating systems including GNU/Linux, Macintosh and Microsoft Windows, running on both 32 and 64 bit processors. Data can be imported from CSV, SAS and SPSS, or directly from Microsoft Excel, Microsoft Access, Oracle, MySQL and SQLite. It can also produce graphics output in PDF, JPG, PNG and SVG formats, and table output for LaTeX and HTML.
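As a small taste of the import and export side, here is a sketch that uses only base R: it writes a tiny data frame out to CSV, reads it back in, and saves a bar chart as a PDF. The file locations are temporary files created by tempfile(), and the data is the lunch and dinner example from my practical R post; other formats such as SAS, SPSS or Excel need additional packages that are not shown here.

```r
# A tiny data frame to round-trip through a CSV file
bills <- data.frame(time = c("Lunch", "Dinner"), total_bill = c(14.89, 17.23))

# Export to CSV and import it again
csv_path <- tempfile(fileext = ".csv")
write.csv(bills, csv_path, row.names = FALSE)
imported <- read.csv(csv_path)

# Produce graphics output as a PDF file
pdf_path <- tempfile(fileext = ".pdf")
pdf(pdf_path)
barplot(imported$total_bill,
        names.arg = as.character(imported$time),
        ylab = "Total bill")
invisible(dev.off())  # close the PDF device so the file is written

file.exists(pdf_path)
```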

 

Disadvantages of using R: the only potential fly in the ointment is security. Security is its biggest downfall, as R is based on older technology which did not have security to the forefront. R cannot be used for Web or Internet development, which renders it impossible to use as a backend server to perform calculations. It is advisable to use a 64 bit operating system when data mining, as R has the potential to use all available memory because memory management is not prioritized.

Technically Challenged Granny

Some personal observations when learning how to programme using R.

  • From learning the ‘Red Cross Code’, we are programmed to think of red as a danger colour, or in this case as denoting an error. I found it very disconcerting for the commands to show up in red even when there wasn’t an error.
  • I love being able to record (and explain) code using # comments and find this functionality very user friendly.
  • As I am a complete novice with coding, I found R to be extremely challenging and the learning curve to be vertigo inducing.
  • I very much appreciate the extra tutorials from my tutor and the gurus at Stack Overflow. I found the existing documentation to be sparse and virtually impenetrable to the non-statistician.

 

Why Learn R?

To summarize, R is a leading tool for statistics, data analysis and machine learning. It is more than a statistical package; it’s a programming language, so you can create your own objects, functions and packages. Its huge following can be somewhat attributed to the fact that it is Open Source, essentially free, and able to operate on multiple platforms and alongside multiple languages.
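To show what creating your own functions and objects can look like, here is a minimal sketch. The function average_bill and the ‘meal’ class are names I have invented for illustration; the class uses R’s simple S3 system, where a print method is just a function called print.<class name>.

```r
# A user-defined function: the average of a set of bills
average_bill <- function(bills) {
  sum(bills) / length(bills)
}

# A user-defined object: a small S3 class called "meal" (hypothetical)
new_meal <- function(time, total_bill) {
  structure(list(time = time, total_bill = total_bill), class = "meal")
}

# A print method so that meal objects display in a friendly way
print.meal <- function(x, ...) {
  cat("Meal at", x$time, "costing", x$total_bill, "\n")
  invisible(x)
}

lunch <- new_meal("Lunch", 14.89)
print(lunch)

average_bill(c(14.89, 17.23))
```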

 

Fusion Table

2011 Census Fusion Map – Ta DA !

 

The challenge: merge the 2011 Census with a KML file to prepare a heatmap based on the 2011 Census. Initially I thought this would be an easy exercise; however, the conFusion tables struck in earnest! Check out my step-by-step blog below, which will show you how to recreate this map in a few short steps.

A database is only as good as the data it includes, so Step 1 must always be to analyse the data to ensure consistency. The 2011 Census data had to be scrubbed to remove any anomalies, e.g. the geographic file shows county boundaries whereas the Census subdivides larger counties into County and City. Indeed, in the case of Dublin the following areas have been identified: Fingal, Dun Laoghaire Rathdown etc.

 CSO Census Data

Step 2: import the KML file into a Fusion Table. At this stage additional scrubbing was needed to cleanse the data (while Clare is an exceptionally beautiful county, it doesn’t need to show up 3 times on the map!). I manually amended the descriptions to show the correct titles (Clare, Carlow and Cavan were all having identity issues). Step 3 was merging the ‘cleansed file’ with the Census data showing the 26 counties in the Republic of Ireland. I will happily update this heatmap to show all 32 counties once Ireland becomes unified!
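I did the scrubbing and merging by hand in Fusion Tables, but the same clean-then-merge idea can be sketched in R. Everything in the example is illustrative: the data frames are tiny stand-ins, the population figures are placeholders rather than real Census values, and the list of Dublin sub-areas is my assumption about how the Census splits the county.

```r
# Stand-in for the Census file (population values are placeholders, not real)
census <- data.frame(
  area       = c("Fingal", "Dublin City", "Clare", "Carlow"),
  population = c(10, 20, 5, 3),
  stringsAsFactors = FALSE
)

# Stand-in for the boundary file, with Clare accidentally duplicated
boundaries <- data.frame(
  county = c("Dublin", "Clare", "Carlow", "Clare"),
  stringsAsFactors = FALSE
)

# Map the Census sub-areas onto the county name used in the boundary file
dublin_areas <- c("Fingal", "Dublin City", "South Dublin",
                  "Dun Laoghaire Rathdown")
census$county <- ifelse(census$area %in% dublin_areas, "Dublin", census$area)

# One row per county on both sides, then merge on the county name
county_totals <- aggregate(population ~ county, data = census, FUN = sum)
boundaries    <- unique(boundaries)
heatmap_data  <- merge(boundaries, county_totals, by = "county")

heatmap_data
```

The key design choice is to get both files down to exactly one row per county before merging, otherwise duplicated or mismatched names end up as duplicated or missing areas on the map.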

There were 42,854 more females than males in the State in April 2011, resulting in an overall sex ratio of 98.1 males for every 100 females. This is a reversal of the situation in 2006, when the sex ratio was 100.1. It can be assumed that this is a result of the Recession, which impacted ‘traditional male roles’ such as construction and may have driven male emigration at a higher rate.
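The sex ratio is simply the number of males per 100 females, and it is easy to check in R. The male and female totals below are not quoted elsewhere in this post; I am assuming the published Census 2011 totals of 2,272,699 males and 2,315,553 females, which reproduce the 42,854 difference.

```r
# Assumed Census 2011 totals for the State (not quoted in the post itself)
males   <- 2272699
females <- 2315553

# How many more females than males
females - males

# Sex ratio: males per 100 females, rounded to one decimal place
round(100 * males / females, 1)
```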

Data Visualisation! A heatmap is a super easy way of visualising data. Cities & their surrounding areas have a higher .

Additional analysis of  the 2011 Census may be found here