Can I stand on your shoulders?

Sometimes we bump into problems that we can’t solve on our own.  When this happens you can walk away defeated or think of it as an opportunity to learn something new and gain an understanding of your limits.

There are many ways to learn but the one I find most effective is learning to solve a specific problem by engaging with the community.


Don’t wait for the data scientists

As a data and system literate business partner I’m always interested in applying technology to solve business problems.

In recent months I've been evaluating business-friendly Data Mining apps to open the door to the next level of business analytics.

I'm interested in this for two main reasons.

Big Data

Big Data will swing the analytical pendulum towards more use cases that involve a high level of processing and modelling. With BI we often have to aggregate transactions to optimise OLAP processing, which to date has limited the practicality of analysing transactional data. With Big Data we'll finally get the opportunity to consume transaction-level data and unstructured data on a more frequent basis.

No need for a PhD

Data Mining is within our grasp. The industry standard framework for Data Mining projects is called CRISP-DM.

Business Understanding

Data Mining begins with business understanding. This is where the objective is studied and requirements gathered from the perspective of the business. Data literate business people are close to the questions that can be answered by Data Mining.

Data Preparation

Data Preparation is akin to self-service data integration. With tools like Power Query you don’t have to be a developer to prepare data. Although a BI developer certainly comes in useful if they are available to you.

Modelling

This is not a big part of the process; most of the effort actually goes into the preceding stages. We don't have to understand the modelling techniques at PhD level. Do you understand regression? Know your R²? How about confidence intervals?
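As a quick refresher, R² (the coefficient of determination) is simply 1 minus the ratio of the residual sum of squares to the total sum of squares – in other words, the proportion of variation in the outcome that the model explains, with values closer to 1 indicating a better fit. That's the sort of level we're talking about.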

There is some learning involved, but you may be surprised at how little is actually required. In my experience, with a base level of knowledge it's better to get started and learn as you go.

The software and learning material is finally becoming more business friendly. Don’t bite off more than you can chew initially – start with something you feel confident about delivering.

Evaluation

This is where we measure the quality of the model and check the overall robustness.  Does it cover all pertinent issues?   At this time we decide if the model is good enough to deploy.


Deployment

Deployment can take many different forms depending on the objective. Think about a model that predicts the customers that are likely to churn.  The model also tells us which factors are most likely to lead to the customer churning.

In this deployment type outcomes are fed back into the operational process/system to reduce the likelihood of churn taking place. Think about alerts to trigger actions or retention campaigns. It’s here that you’re likely to hit the IT buffers but this really depends on the systems you have and the IT resources at your disposal.

Conclusion

With the right tools and access to data, data-literate business professionals can produce insight that adds value in ways traditional BI doesn't. Don't wait around for the Data Scientists.

Additional reading

Data Science for Business book


Filtering data with MDX

One of the key activities when analysing data is filtering. This is especially true when the data source is a data mart.

This post is about filtering with MDX in preparation for reporting with SSRS.

The query editor within SSRS is okay for basic queries but for complex situations we have to write the query by hand.

In this post I’ll give an example of filtering a set of products based on part of the product name.

The 3 functions we need to do this are:

Except() : takes two sets and removes the members of the second set from the first, returning the members that are left.

Filter() : takes a set and returns only the members for which a logical expression is true.

Instr() : returns the position of the first occurrence of one string within another, or 0 if the string isn't found.

Before we get into the filter let’s have a look at the initial query.

SELECT { [Measures].[Reseller Sales Amount] } ON COLUMNS,
NON EMPTY { [Product].[Product].CHILDREN } ON ROWS
FROM [Adventure Works]


In this scenario we want to remove products with the words Front, Classic, or Full somewhere within the product name.

We can write the first filter on its own to check we're on the right track.

EXCEPT(
{[Product].[Product].CHILDREN},
FILTER([Product].[Product].MEMBERS,
Instr(1,[Product].[Product].CurrentMember.Name,"Front")>=1)
)

In the code above we’re taking the product children and removing members that are returned from the Filter function.

The Filter function is filtering the product members based on the occurrence of “Front”.

Once we’re satisfied that this works we can add the other filters.  I don’t know if there is a better way of doing this other than to nest the EXCEPT functions.

EXCEPT(
EXCEPT(
EXCEPT(
{[Product].[Product].CHILDREN},
FILTER([Product].[Product].members, Instr(1,[Product].[Product].CurrentMember.Name,"Front")>=1)),
FILTER([Product].[Product].members, Instr(1,[Product].[Product].CurrentMember.Name,"Classic")>=1)),
FILTER([Product].[Product].members, Instr(1,[Product].[Product].CurrentMember.Name,"Full")>=1))
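If the nesting feels clumsy, an alternative is a single FILTER that keeps only the members where none of the words appear, since InStr returns 0 when the search string isn't found. This is a sketch I haven't benchmarked against the cube:

FILTER(
{[Product].[Product].CHILDREN},
InStr(1,[Product].[Product].CurrentMember.Name,"Front") = 0 AND
InStr(1,[Product].[Product].CurrentMember.Name,"Classic") = 0 AND
InStr(1,[Product].[Product].CurrentMember.Name,"Full") = 0)

Both forms should return the same set; the nested EXCEPT version has the advantage of letting you test each exclusion one at a time, so I'll stick with it below.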

The complete query is displayed below.

SELECT { [Measures].[Reseller Sales Amount] } ON COLUMNS,
NON EMPTY
EXCEPT(
EXCEPT(
EXCEPT(
{[Product].[Product].CHILDREN},
FILTER([Product].[Product].members, Instr(1,[Product].[Product].CurrentMember.Name,"Front")>=1)),
FILTER([Product].[Product].members, Instr(1,[Product].[Product].CurrentMember.Name,"Classic")>=1)),
FILTER([Product].[Product].members, Instr(1,[Product].[Product].CurrentMember.Name,"Full")>=1))
ON ROWS
FROM [Adventure Works]


I prefer this type of filter because any new products that match the criteria will be filtered out automatically.

When we explicitly reference products in MDX we often have to amend the query when new business entities are created. It’s best to avoid this.

Of course, use this filter with care and be sure you aren’t inadvertently filtering more members than required.


The Data Revolution

The last 9 months have been incredible.   The ‘Circle of Interest’ tool that Jerry Hopkins introduced me to is yielding amazing results.

I'm enjoying the new training career branch that sprouted from my yearning to practise public speaking. I've connected with and met some really interesting people, ranging from artists to scientists.

The self-service data revolution is increasing in pace and the cloud is giving me new ideas to take the business forward.


I finally found a way to enable the business to start the ‘Beyond Budgeting’ journey.  More on this later.


I've been itching to write about a data conference I attended in September, but I've been so busy it's been hard to find the time to work on the blog.

SQL Saturday is a community-organised event stemming from the Professional Association for SQL Server (PASS). Events take place all over the world.


At SQL Saturday you can get face time with MVPs and other experts at the top of their field and learn about the latest innovations in SQL Server and the cloud. I refer to it as 'knowledge soup'.

Lyndsey and I were on a tour of Britain – we hit Cambridge, the cycle city, as the SQL community arrived in town.

I wish the Holiday Inn started breakfast earlier than 7:30 at the weekend. The first session was at 8am, which made for a mad dash to Queens' College.

I arrived in time for the start of Jen Stirrup’s talk “From Big Data to secret sauce: power your organisation with PowerBI”.

I enjoyed seeing how Big Data can be processed in a self-service way.  Jen as usual delivered a top class talk.  There is no use case in my current business for big data, unless I’m missing a trick.

Next up was Advanced Data Visualisation with Reporting Services from Jen.  I was interested to learn that SSRS can consume DAX queries.  Lots of food for thought.

After a period of networking I joined the audience to see Carmel Gunn’s talk “The Irish Economic Crisis Visualised”.

It was this talk that got my mind buzzing the most – Carmel sourced a range of economic data available in the run-up to the economic crisis and modelled it to create visualisations that told an interesting data story, showing facts that, had they been spotted at the time, might have helped avoid the crisis.

A new spark of interest was formed for me – Open Data. Closer to home, my mission to connect and shape business data to create new insights was validated. I firmly believe there are stories in business data waiting to be discovered.

Next up was a light hearted, entertaining talk by Jen (that girl really does like to talk!)  “Now that’s what I call the worst Visualisation ever”.

I’ve started to talk about colour perception and ink spread back at the office and I’ve banished red from the MI charts.

After lunch with the Cambridge students it was Niko Neugebauer’s talk, “DataExplorer – the reason you might not need SQL Server.”

This was all about self-service Data Integration.  I had previously used Data Explorer (renamed to Power Query) to connect to AdaptivePlans web service as a means to produce on-premise reports.

It was really interesting to see the direction that Data Integration is going.  The data revolution is in full swing.

I enjoyed Niko's talk. It was full of passion yet retained a level of professionalism throughout.

My brain was filling up at this point.  I went to see John Martin talk about “Data Driven Branding for SSRS”.

This was an interesting technical talk about how John used expressions to hold report meta-data in a database and then reference it in reports. The idea is that when Marketing change the font or an image, a single change in the database refreshes hundreds of reports. I liked this solution, although I'm not sure I'll be making use of it at the current time. I did make a note to share it with IT and file it in my 'someday' folder.

Next up I made my way over to a classroom where I joined a couple of other speakers to give a lightning talk on "BI Self Service Musings". This was much harder than I imagined. I remember thinking that 10 minutes is a long time. In all future talks I will value every minute far more.

The Power Pivot add-in crashed with the bug that requires a log off/on.  No time for that – I opened the video that I had pre-recorded to cover such eventualities.

To conclude the event there was a raffle to give out prizes donated by the sponsors.   Many of the people drawn out first had left early.  It pays to stick around.

I remember thinking that Buck Woody (SQL/Cloud evangelist from Microsoft) would make a good comedian.

All in all it was a great data conference and one that I recommend for anyone with the slightest interest in data or SQL Server.

My special thanks go out to the organisers who gave up their time to facilitate this event, especially Mark Broadbent, the event lead – that guy has patience. And I must not forget the rest of the organising team:

Lorraine Broadbent – Project Manager
John Martin
Kevin Chant
Alicia Horner
Peter Baddeley

http://www.sqlsaturday.com/228/eventhome.aspx

http://www.sqlsaturday.com

http://www.sqlpass.org/


Quench your thirst for knowledge

Like many analysts I’ve always had it in my mind that if I only had some new data to fill in a ‘data jigsaw’ then I’d be able to add much more value to the business.

Frequently this information isn't held in the cubes. That's not down to BI failures; rather, the business operates at a faster cadence than typical BI projects.

The ‘If I only had this’ questions arise some months after the last iteration of BI.  The last thing the business wants to do at this point is start another project.

What do we do?  Historically the answer has been to log a change request that will be picked up in 12 months.

With the advent of self-service BI we can answer this question differently.

Using Power Pivot we can consume data held in the cubes and mash it up with disparate data that hasn't yet been incorporated into the existing BI model. This small capability addresses the 'information thirst' that we see in dynamic businesses.

I would even go so far as to say some types of requirement can be addressed more easily. For instance, ask your database specialist for a count of new customers along with the sales those customers have made. This is a common business requirement and happens to be one of the most expensive database queries, yet we can answer it with ease using modern in-memory tools.

I’ll be honest with you.   BI Self Service isn’t as easy as some people are making out.

It's probable you'll have to build your skills in the following areas:

Data Modelling
Learning a new query language

I recognised this trend 3 years ago and decided back then to start the journey, as the potential value was too great to ignore.

With these skills I created this new information in less than 30 minutes.


I created the data model below from the Adventure Works sample database.

[Data model diagram]

And used this expression:

=CALCULATE(COUNTROWS(DimCustomer),
FILTER(DimCustomer,
CALCULATE(COUNTROWS(Sales)) > 0 &&
CALCULATE(COUNTROWS(Sales),
FILTER(ALL(DimDate), DimDate[FullDateAlternateKey] < MIN(DimDate[FullDateAlternateKey]))) = 0))

It looks complicated, but all it's doing is counting the customers who have made a sale in the current context while filtering out those who had already made a sale before the period, thereby giving us the new customers for whatever context is active.

It can be hard to produce these queries as you have to think about things differently. It takes time and practice, but I can confidently tell anyone who is interested: "If you're capable of writing a VLOOKUP function in Excel then you have the aptitude to learn this new world of analytics."

I’m providing a training day for the people at the start of this journey on the 17th September at Media City near Manchester.

You can find out about the training day at leehbi.eventbrite.co.uk

Come and join me to discover the new way of tapping into data to produce amazing insights.


Business Analytics Training with Power Pivot


Power View external data refresh (kind of!)

I recently subscribed to Office 365 ProPlus to learn about Power View and the associated SharePoint Online benefits of publishing self-service BI dashboards.


Power View reports can be accessed via Excel Services online, which makes for easy publishing of dynamic dashboards. A company with SharePoint Enterprise on-premises already has access to Power View, and with SQL Server you have access to the in-memory Analysis Services model. All you need is SharePoint and SQL Server support.

Many companies on the Microsoft BI platform are not SharePoint Enterprise customers, and these companies can find it difficult to make the business case for Enterprise if they will only use the BI features.

With SharePoint Online, can we benefit from the BI publishing features simply by subscribing to the service? If so, we could sidestep the problem mentioned above.

I quickly realised that with the current release of SharePoint Online only internal data models (held within the Excel workbook) can be updated. It's not directly possible to refresh models that are linked to on-premises databases. This is very limiting, as the majority of data still resides in on-premises databases. I do wonder how Microsoft think sometimes.

One workaround is to download the workbook to the local file system, open it in the Excel client application, refresh the data and then upload it back to the SharePoint document library. Of course, with a stack of files this is the last thing you want to be doing.

What if we could automate this?

Using Visual Studio it’s possible to drive Excel with a short piece of code.

When coming up with this solution I quickly hit a wall, as I'm no SharePoint expert. I wondered: if I mapped the SharePoint document library to a drive letter in Windows Explorer, could I sidestep the complexity of SharePoint?

Did it work?  I’m very happy to say yes.

Here's the complete code I used to write a simple refresh utility:

using System;
using Microsoft.Office.Interop.Excel;

namespace ExcelRefresh
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length == 0)
            {
                Console.WriteLine("Usage: ExcelRefresh <path to workbook>");
                return;
            }

            string filePath = args[0];
            Update(filePath);
        }

        private static void Update(string excelfile)
        {
            Console.WriteLine("Refresh is starting...");

            // Start a hidden instance of Excel via COM interop.
            Application excel = new Application();

            try
            {
                // Open the workbook, refresh every data connection and save it back in place.
                Workbook wb = excel.Workbooks.Open(excelfile);
                Console.WriteLine("File open.");
                Console.WriteLine("Refresh starting.");
                wb.RefreshAll();
                Console.WriteLine("Refresh complete.");
                Console.WriteLine("Saving the workbook to Office365.");
                wb.Save();
                Console.WriteLine("Operation Complete.");
                wb.Close();
            }
            finally
            {
                // Always shut Excel down, even if the refresh fails.
                excel.Quit();
            }
        }
    }
}

When this app is built you just need to pass in a path to the Excel file you want refreshed.

c:\ExcelRefresh z:\\SalesByModel.xlsx

Notice I escaped the \.   Don’t forget to map the document library to a local drive letter before running.

I must give thanks to Jamie Tarquino who blogged about this here.

Also thanks to Chris Webb & Jen Stirrup for bringing Power View and SharePoint Online to my attention. Seriously, if you ever need top-class support for Microsoft BI, look these folks up. They are prolific speakers and knowledge sharers at the top of their community.

You can find a copy of the app I built here. It wouldn't be too much of a stretch to add this to Task Scheduler so that it runs at the times you set.
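If you want to take that step, a scheduled task along the lines of schtasks /create /tn "ExcelRefresh" /tr "c:\ExcelRefresh.exe z:\SalesByModel.xlsx" /sc daily /st 06:30 (adjust the paths and time to suit your setup) would run the refresh every morning before anyone opens the dashboards.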

This process enables you to use the online SharePoint service, which doesn't require lots of IT investment, and then publish self-service reports and dashboards to your stakeholders. One issue is that the data model resides in the Excel workbook; to limit this risk you can prevent the workbook from being downloaded by your users simply by changing the document permissions.

If this proves successful you may find you've built the support to deploy SharePoint on-premises, and thus benefit from tighter data integration directly with SQL Server. If not, what have you risked? A couple of Office subscriptions and a bit of your time.

What do you think about this approach? As I mentioned above, it's a workaround and not without problems, but it may just help you overcome one of the challenges of deploying any BI system: getting the business support.

Lee


Travelling Salesperson and Quantum bits

In all my analytical travels over the past several years I've yet to come across a Data Scientist in the flesh. These are the mythical figures who can supposedly turn data into oil.


When you think about the breadth of skills this role requires it’s not surprising these folks are rare in business.

There are two ways to close the analytical gap: develop the human resources and develop the technology.

We already have technology that is making analytics easier and I believe we’ll continue to see improvements in this space.

In this post I'll demo how we can take a classic data-scientist-type problem and solve it by producing a model in trusty old Excel.


Give me the master keys

For the past several years I’ve been a keen proponent of the OLAP approach to deliver Business Intelligence.


We have different models to work from but I can sum it up in terms of:

  • Extract
  • Transform
  • Load
  • Query

By pulling the data from the core system we can be assured that we won’t impact it beyond the initial extraction (which is often scheduled off-peak).

The next phase, transforming the data, is sometimes essential: the data in source systems can be difficult to query directly, and the complexity can be reduced in the transformation phase. At this point the data resides in a Data Warehouse (DW).

The transformation can include turning some of the DW into a number of multi-dimensional cubes.

We can use the cube to carry out our own analysis without having to involve IT. Queries are fast and reliable. All good so far? Well, yes and no.

If you get this far it’s likely you’ll soon have multiple cubes.  Don’t get me wrong, I use cubes every day in my analytic role.  They are very powerful.

The difficulty comes when we need to join 2 cubes.   Cubes are often designed along functional lines e.g Sales, Finance, Procurement, Marketing.

The value of the data can be increased on a huge scale if the data within the functional areas is linked.   Think about some of these basic questions:

  • How does the marketing effort affect Sales?   How effective are our campaigns?
  • Why is the stock so high?
  • Why are the lead-times increasing?
  • Which products will our customers likely buy next?  Which customers will churn?
  • What’s eating free cash-flow, how will it impact on EBITDA next quarter?

To answer these questions we often have to link the data.   It’s not easy to link cubes due to their multi-dimensional aspects.  Instead we have to query the DW or worst case, the source system.

When I talk to BI developers about this conundrum they say the cube needs to be remodelled. Sometimes this is correct, but many times the cube shouldn't be changed at all. The next worst thing to having no cube at all is having a cube that is difficult to use.

With the latest in-memory tools from QlikView, JasperSoft, Microsoft and many others it’s possible for non IT workers to write cross functional queries on the DW that are just as fast as those queries run against cubes.

To really benefit from these cutting edge tools we need the Master Keys.

We use keys to join the data.   Think of a key as the look-up value in VLOOKUP.

I once read that the customer is king in business, which naturally means lots of analysis is centred around the customer. I find it odd that in many organisations the customer exists in many different parts of the business, each with a record that differs from the others.

We see the customer in the Sales cube. In the Marketing cube we see Leads. Some of these Leads will become Customers, but as the Customer isn't present in the Marketing cube it's almost impossible to link the two objects. We go to the DW and find the customer key isn't present in the table, which means we have to infer a fuzzy key using the address and possibly the name. Far from ideal.

What we really need is a Master Customer Key that represents a single view of the customer and can be shared across the different areas.
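To make this concrete (the key value is hypothetical): if the CRM lead record and the sales ledger both carried the same Master Customer Key – say 10452 – then lead-to-sale conversion analysis becomes a simple join on that key, rather than a fuzzy match on name and address.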


At the time source systems are designed it's often the case that no one is thinking about analytics like this, or about master sources. Data modellers need to be asking questions such as:

  • How do leads relate to customers?
  • Is a lead created for an existing customer?
  • How does Procurement relate to Finance?
  • How does the Customer relate to Finance?

When you have the Master Keys your analytical capability will be boosted.

Until then have a read of my post on Fuzzy Lookups.

As always, please feel free to comment or share your experience.


Performance management levers with CRM

Why CRM on a performance management blog, I hear you say! Well, think about the performance management of a sales force. We frequently find this area of the business has few solid facts available other than those captured on phones or scribbled down in a paper diary.


CRM (Customer Relationship Management) is a wide subject. It can be summed up as the software and processes that allow the sales and business development teams to manage their activity, such as selling to customers, managing leads and managing relationships. CRM software can be integrated with existing back-office systems to avoid re-processing data, thereby increasing efficiency.

From my perspective the benefit of CRM comes from the facts it provides: we can analyse leads, conversion rates, targets and sales fulfilment. These facts can be used to evaluate performance and aid decision making.

CRM & Planning is like bread & butter

Financial pros have a habit of talking to the business in our financial language. For instance, we ask for the £ Revenue forecast. In some companies £ Revenue can be a difficult figure to calculate due to recognition rules or inter-company sharing.

Would it be better to talk the language the Sales team understands? Ask for the Conversion rates, Opportunity levels and their Temperature or other revenue drivers that are applicable. It’s harder for managers to hide behind speculative or biased forecasts when the £ amount is represented along with the revenue drivers.

With CRM it's possible to back-calculate the £ Actual to the originating measures. You can use this calculation to build business rules that connect the CRM measures with the Planning system. Each time you need to update the forecast, just refresh the data from the CRM system. No need to open Excel.
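As an illustration (the numbers are made up): £500K of stage 4 opportunities at a 20% conversion rate back-calculates to £100K of forecast revenue, and because the drivers sit in CRM the planning system can recalculate that figure automatically whenever the pipeline changes.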

Variance Analysis

Let's move on to the task all financial pros perform at some point in their lives – Variance Analysis.

Have you ever felt that you don't have the full picture? Time after time I've seen a variance explanation that describes the movement (the what) but doesn't tell the user the reason (the why): "10% up on last month due to higher sales activity."

Would it be better to say:

“10% up on last month due to increase of £100K opportunities at stage 4 with a 2% increase in conversion rates. Looks to be a 1 off as stage 1 opps are static.”

Biggest winners are the sales team

CRM can have a huge impact on the sales team. If CRM is delivered in the right way the sales people will spend less time on administration and more time selling. Leads can be managed in smart ways that maximise the chance of conversion. Contacts can be searched, called and managed, information can be shared, collaboration is easier, and business rules can replace the intervention of team leaders that is often required to close deals.

Challenges

The challenge with CRM depends on your sales force make-up. The requisite skills for selling don't always include technical or analytical aspects. This doesn't bode well for CRM, as it can result in resistance to change when users don't see the benefits up front.

Often the sales team believe that their sales performance (and bonus!) will drop if they adopt the CRM system. Under these conditions CRM is still possible, but more time needs to be spent capturing requirements and involving users early on. Identify evangelical users!

This brings me on to a second challenge: active user participation. In the CRM context this means taking sales people out of service for a while. If you have 100 sales people this isn't an issue, but if you have a team of 30 it can be a problem.

Biggest Success Factors

One of the biggest success factors for CRM is user adoption. Based on this, the system delivered should be aimed at the sales teams and not the office. The user interface (UI) is one of the most important aspects, followed closely by the quality of data. With modern CRM packages it's possible to cover all user requirements without ending up with a complex user interface.

CRM Packages

There are hundreds of different CRM packages. You've probably heard of the most popular one, Salesforce, but there are many others. My experience is with Microsoft Dynamics CRM. From an analytical perspective I like Dynamics as I can easily get at the data using tools I'm already familiar with, such as SSRS, SQL Server and Excel. The other benefit of Dynamics is that it can be purchased as SaaS (software as a service) or deployed on-premises, the latter being more suitable for large deployments with more complex requirements.

Here are some links to CRM sites if you want to learn more about CRM.

Feel free to ask any questions or comment on your experiences.
