Get Out of the Spreadsheet Abyss

When an organization turns a blind eye to the proliferation of spreadsheet-based processes, it exposes itself to substantial risk. Have you ever encountered (or enabled) the following scenario?

  • Enterprise reporting is lacking, so a “power user” (i.e., an analyst) is conscripted into cobbling together an ad-hoc, spreadsheet-based process to address a management request;
  • The power user exports data from various business applications and then manipulates the output, typically with macros and formulas;
  • This initial spreadsheet is manually integrated with business unit data from various other unofficial spreadsheets, further distancing the data from the source business application;
  • Multiple tabs are added, charts are generated and data is pivoted (all manually);
  • Management finds value in the report and elevates it to a repeatable process;
  • The original request increases in complexity over time as it requires more manipulations, calculations and rogue data sources to meet management needs;
  • Management doubles down with a new request, and the process repeats;
  • IT proper is NEVER consulted on any of the requests.

The business unit is now supporting a “spreadmart”. The term is considered derogatory in data circles.

“A spreadmart (spreadsheet data mart) is a business data analysis system running on spreadsheets or other desktop databases that is created and maintained by individuals or groups to perform the tasks normally done by a data mart or data warehouse. Typically a spreadmart is created by individuals at different times using different data sources and rules for defining metrics in an organization, creating a fractured view of the enterprise.” [1]

Although the initial intentions of these requests may be reasonable, the business never bothers to approach IT to propose building out a proper data store. Additionally, the conscripted analysts are unhappy with their additional manual responsibilities. Spreadsheet wrangling and manual integration activities shift precious time away from more value-added pursuits such as data analysis and formulating recommendations.

From management’s perspective, why should they pay IT to build out an officially sanctioned solution that will deliver the same information that an internal team of analysts can provide? After all, the spreadmart is responsive (changes can be made quickly) and it’s inexpensive (as opposed to new investments in IT). Eventually, the manual processes are baked into the job description and new hires are enlisted to expand and maintain this system. The business sinks deeper and deeper into the spreadsheet abyss.

The short-term rewards of the spreadmart are generally not worth the longer-term risks.

Risks:

“It’s not an enterprise tool. The error rates in spreadsheets are huge. Excel will dutifully average the wrong data right down the line. There’s no protection around that.” [2]

The spreadmart can also be classified as a “data shadow” system, to borrow a term from Rick Sherman’s Business Intelligence Guidebook. Here are the problems associated with “data shadow” systems, as paraphrased from that book [3]:

  • Productivity is severely diminished as analysts spend their time creating and maintaining an assortment of manual data processes;
    • I would add that team morale suffers as well;
  • Business units have daggers drawn as they try to reconcile and validate whose numbers are “more correct”;
    • As a result of a new silo, the organization has compounded its data governance issues;
  • Data errors can (and will) occur as a result of manual querying, integrating and calculating;
  • Data sources can change without notice and the data shadow process is not on IT’s radar for source change notifications;
  • Embedded business logic becomes stagnant in various complex macros or code modules because they are hidden or simply not understood by inheritors;
  • The solution doesn’t scale with increasing data volume or number of users;
  • No audit trail exists to ensure control and compliance;
    • “It is often ironic that a finance group can pass an audit because the IT processes it uses are auditable, but the data shadow systems that they use to make decisions are not, and are ignored in an internal audit”;
  • Process and technical documentation does not exist, which makes the solution hard to update.

Additionally, these processes are not backed up with any regularity, multiple versions may exist on multiple users’ desktops, and anyone can change the embedded business logic. The bottom line is that the business is potentially making decisions based on erroneous data, which can have serious financial and reputational impacts.

“F1F9 estimated that 88 percent of all spreadsheets have errors in them, while 50 percent of spreadsheets used by large companies have material defects. The company said the mistakes are not just costly in terms of time and money – but also lead to damaged reputations, lost jobs and disrupted careers.” [4]

Mitigation:

There is nothing wrong with the business responding to an emerging issue by requesting a one-time ad-hoc solution. The highest risks emerge when the ad-hoc process is systematized, a number of repeatable ad-hoc processes proliferate unchecked, and IT is never involved in any discussions.

IT proper is highly effective when it is allowed to merge, integrate and validate data. Business unit analysts and spreadsheets should be out of the collection and integration game for repeatable management reporting. Analysts should focus on analysis, trending and interpretation. Too often analysts get tossed into productivity traps involving hours of cutting, pasting and linking to someone else’s spreadsheet for data integration in order to meet management demands.

When IT is brought into the discussion, it must not point fingers but rather understand why the shadow system was established in the first place. Likewise, the business unit should not point fingers at IT for being unresponsive or limited by budget constraints. Once the peace treaty has been established, IT should analyze and reverse-engineer the cobbled-together integration processes and data sources (which admittedly is a time-consuming exercise) and deliver more controlled and scalable processes.

The new data integration processes should culminate in loading data to a business-specific, validated, central data mart. The central mart doesn’t try to impose an unfamiliar tool upon the business but rather automates integration activities and references more “trustworthy” data sources. Analysts can still use spreadsheets to access the data, but they are no longer expected to act as manual aggregators using a sub-standard ETL tool.
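To make the idea of a validated, automated load a little more concrete, here is a minimal sketch in Python. Everything specific in it is my own assumption for illustration: the sales_mart table, the region/period/revenue columns, the load_export function name, and the use of SQLite as a stand-in for whatever database IT actually runs. The point is only that each row is checked before it reaches the mart, and that rejected rows are reported rather than silently averaged.

```python
import csv
import io
import sqlite3

# Hypothetical column names for an exported business-application file.
REQUIRED = ("region", "period", "revenue")


def load_export(conn, csv_text):
    """Validate rows from a CSV export and load the clean ones into the mart.

    Returns a list of (line_number, reason) pairs for every rejected row.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_mart ("
        "region TEXT NOT NULL, period TEXT NOT NULL, revenue REAL NOT NULL, "
        "PRIMARY KEY (region, period))"
    )
    rejected = []
    clean = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 is the header, so data rows start at line 2
    # (assumes one record per line).
    for line_no, row in enumerate(reader, start=2):
        missing = [c for c in REQUIRED if not (row.get(c) or "").strip()]
        if missing:
            rejected.append((line_no, "missing " + ", ".join(missing)))
            continue
        try:
            revenue = float(row["revenue"])
        except ValueError:
            rejected.append((line_no, "revenue is not numeric"))
            continue
        clean.append((row["region"].strip(), row["period"].strip(), revenue))
    # Load all clean rows in one transaction; reruns overwrite the same keys.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO sales_mart VALUES (?, ?, ?)", clean
        )
    return rejected
```

Calling load_export(sqlite3.connect(":memory:"), text) on an export that contains a blank region or a revenue of "n/a" would load the good rows and hand back the bad ones with their line numbers, which is exactly the kind of check a hand-built spreadsheet process tends to skip.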

“Go with the flow. Some users will never give up their spreadsheets, regardless of how robust an analytic environment you provide. Let them keep their spreadsheets, but configure them as front ends to the data warehouse. This way, they can use their spreadsheets to access corporate-approved data, metrics and reports. If they insist on creating new reports, provide an incentive for them to upload their reports to the data warehouse instead of distributing them via e-mail.” [5]
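One way to read the advice above is that the spreadsheet becomes a consumer of governed data instead of the place where integration happens. The sketch below is a rough illustration under my own assumptions: a hypothetical sales_mart table in SQLite standing in for the warehouse, and a hypothetical export_revenue_by_region function that writes a CSV file any spreadsheet can open. In practice the spreadsheet might instead connect to the warehouse directly; this is just the simplest form of the idea.

```python
import csv
import io
import sqlite3


def export_revenue_by_region(conn, out):
    """Write revenue totals per region as CSV to a file-like object."""
    cursor = conn.execute(
        "SELECT region, SUM(revenue) AS total_revenue "
        "FROM sales_mart GROUP BY region ORDER BY region"
    )
    writer = csv.writer(out, lineterminator="\n")
    # Header row comes from the query's column names.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


def demo():
    """Build a tiny in-memory mart (made-up data) and return the CSV text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_mart (region TEXT, period TEXT, revenue REAL)"
    )
    conn.executemany(
        "INSERT INTO sales_mart VALUES (?, ?, ?)",
        [
            ("East", "2024-01", 100.0),
            ("East", "2024-02", 200.0),
            ("West", "2024-01", 50.0),
        ],
    )
    buf = io.StringIO()
    export_revenue_by_region(conn, buf)
    return buf.getvalue()
```

The aggregation logic lives in one query that IT can version and audit, while the analyst still gets a familiar file to chart and pivot.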

Have I ever had to “get out” of a situation where data governance was lacking and burned-out, morale-depleted analysts spent all of their time collecting and integrating spreadsheets to maintain an inefficient spreadmart?

I’ll never tell!

References:

[1] https://en.wikipedia.org/wiki/Spreadmart

[2] http://ww2.cfo.com/analytics/2012/01/imagine-theres-no-excel/

[3] Sherman, R. (2015). Business intelligence guidebook: from data integration to analytics.

[4] http://www.cnbc.com/id/100923538

[5] https://www.information-management.com/news/the-rise-and-fall-of-spreadmarts