
Lifetime validity of data (data aging) in the AI era

Impact on the success of data-driven initiatives.

Abstract

Organizations frequently discuss the importance of data quality and its impact on business value. Even the most sophisticated analytical models falter with outdated and unreliable data, resulting in misleading recommendations, inaccurate forecasts, suboptimal business decisions, and wasted resources.

In today’s data-driven world, organizations face information overload, often storing vast amounts of data without considering its diminishing relevance. While some clients recognize this “information overload” and exercise caution regarding what they capture, others maintain the status quo, leading to increased costs, flawed insights, low customer satisfaction, and poor performance.

Organizations must understand that the value of data is not static; it evolves and degrades over time. This understanding is crucial for accurate analysis and effective decision-making. In fact, one dimension of quality is timeliness, which translates to the lifetime value of data or data aging. This article explores the concept of ‘data aging’ and its implications for the success of data-driven initiatives.

The four dimensions of data

To calculate the lifetime validity of data, one must understand the four dimensions of data, commonly referred to as the 4Vs: Volume (Vo), Velocity (Ve), Variety (Va), and Veracity (Vr). The first three (Volume, Velocity, and Variety) are straightforward.

  • Volume (Vo): The sheer amount/quantity of data from various sources, e.g., transactions, logs.
  • Velocity (Ve): The speed at which data is generated and processed, also known as the rate of data flow, e.g., real-time, batch.
  • Variety (Va): The diverse forms/types of data, e.g., structured, semi-structured, and unstructured.
  • Veracity (Vr): The reliability and trustworthiness of data, e.g., accuracy, consistency, conformity.

Let’s focus on the fourth V, Veracity (Vr), which encompasses the accuracy and truthfulness aspects of data. Veracity is a function of four components, plus time, that directly influence the insights and Business Value (Bv) generated.
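The equation itself appears as an image in the original post and does not reproduce here. A plausible reconstruction, assuming the simple multiplicative form implied by the component descriptions that follow (with t representing elapsed time), is:

V_r = \frac{D_q \times D_{va} \times D_d}{D_{vo} \times t}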

This equation represents a more traditional view and emphasizes the fundamental aspects of data veracity: data quality, data value, data density, data volatility, and the impact of time. It is suitable for situations where the dataset is small and data volume, velocity, and variety are relatively stable or not significant factors. In short, the focus is on the intrinsic quality and reliability of the data.

The components explained:

  1. Quality of Data (Dq): A normalized quantitative score, derived from a comprehensive data profiling process, serves as a measure of data quality (Dq). This score encapsulates the 4Cs: completeness, correctness, clarity, and consistency (a simple scoring sketch follows this list).
  2. Data Volatility (Dvo): Refers to the duration for which the data or dataset remains relevant, and quantifies the spread and variability of data points, extending beyond mere temporal change. While some define volatility narrowly as the rate at which data changes[1], this definition emphasizes the overall fluctuation of the data, for example, shifting customer preferences. A numerical scale, such as 1 to 10, can be used to represent the spectrum from low to high volatility.
  3. Data Value (Dva): Represents the actionable insights, cost savings, or value of derived knowledge obtained through analytical modeling, such as correlation and regression. In essence, it answers the question, “What is the practical significance of this data analysis?” A numerical scale, such as 1 to 10, can be used to represent the range from low to high data value.
  4. Quality of Data Density (Dd): Measures the concentration of valuable, complete, and relevant information within a dataset. It emphasizes the presence of meaningful data, rather than sheer volume. For example, a dataset with numerous entries but missing essential fields exhibits low data density quality. This assessment is determined through a combination of data profiling and subject matter expert (SME) evaluation.
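As a rough illustration of how the normalized Dq score in item 1 might be computed, here is a minimal Python sketch; the 4C weights, field names, and example values are hypothetical and not prescribed by the article.

```python
# Minimal sketch: combining 4C profiling results into a normalized Dq score.
# The weights and input values below are hypothetical examples.

def data_quality_score(profile, weights=None):
    """Return a normalized Dq in [0, 1] from 4C scores that are each in [0, 1]."""
    weights = weights or {"completeness": 0.25, "correctness": 0.25,
                          "clarity": 0.25, "consistency": 0.25}
    total_weight = sum(weights.values())
    return sum(profile[c] * w for c, w in weights.items()) / total_weight

# Example profiling output for a customer dataset (illustrative values only)
dq = data_quality_score({"completeness": 0.92, "correctness": 0.88,
                         "clarity": 0.75, "consistency": 0.81})
print(f"Dq = {dq:.2f}")  # -> Dq = 0.84
```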

Computing the lifetime value using Vr

All the above components are time-dependent, and any quantity that depends on time has an associated lifetime of validity. Hence, the value of data either remains constant (for a period) or degrades over time, depending on the type of data. Now, let us integrate the 3Vs (Volume, Velocity, and Variety) into this equation (Vr).
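As with the earlier formula, the extended equation is an image in the original post. Based on the numerator/denominator explanation below, a plausible reconstruction is:

V_r = \frac{D_q \times D_{va} \times D_d}{D_{vo} \times V_o \times V_e \times V_a \times t}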

To briefly explain, data quality, value, and density are in the numerator because high values for these components improve data reliability. The other components negatively impact trustworthiness as their values grow and are therefore in the denominator. To tailor the equation to specific use cases, weight coefficients can be incorporated to reflect the relative importance of each factor; these weights should be adjusted based on the unique context or requirements of the analysis. Generally, a lower overall score indicates that the data is aged, exhibits reduced stability, and/or possesses diminished reliability. Such aged data can still be valuable in scenarios where historical trends and patterns hold greater significance than contemporary data, such as retrospective studies or long-term trend analyses.
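To make the weighted form concrete, here is a minimal Python sketch under the assumptions above; applying the weights as exponents, the 1-to-10 component scales, and the example values are illustrative choices rather than part of the original equation.

```python
# Minimal sketch of the weighted veracity / data-aging score described above.
# Weights are applied as exponents so individual factors can be emphasized or muted.
# All scales, weights, and example values are hypothetical.

def veracity_score(dq, dva, dd, dvo, vo, ve, va, t, weights=None):
    """Higher dq/dva/dd raise the score; higher dvo/vo/ve/va/t lower it."""
    w = {"dq": 1, "dva": 1, "dd": 1, "dvo": 1, "vo": 1, "ve": 1, "va": 1, "t": 1}
    if weights:
        w.update(weights)
    numerator = (dq ** w["dq"]) * (dva ** w["dva"]) * (dd ** w["dd"])
    denominator = ((dvo ** w["dvo"]) * (vo ** w["vo"]) * (ve ** w["ve"])
                   * (va ** w["va"]) * (max(t, 1) ** w["t"]))
    return numerator / denominator

# Example: a customer-preference dataset scored on 1-10 scales, captured 24 months ago;
# the time penalty is dampened with a fractional weight.
score = veracity_score(dq=8, dva=7, dd=6, dvo=9, vo=5, ve=4, va=3, t=24,
                       weights={"t": 0.5})
print(f"Data-aging (veracity) score: {score:.3f}")  # lower values suggest aged, less reliable data
```

In practice, the raw component scores would come from data profiling and SME assessment, and the weights would be tuned per use case, as noted above.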

Real-world examples

Consider customer purchasing behavior data. Companies utilize segmentation and personalization based on customer lifecycle stages for targeted marketing. As individuals transition through life stages, their purchasing patterns evolve. Consequently, relying on data from a specific historical point—such as during a period of job searching, financial dependence, or early adulthood—to predict purchasing behavior during a later stage of financial independence, high-income employment, family life, or mid-adulthood is likely to produce inaccurate results.

Similarly, credit rating information demonstrates the impact of data aging. Financial institutions typically prioritize a customer’s recent credit history for risk assessment. A credit rating from an individual’s early adulthood is irrelevant for risk calculations in their mid-40s. These examples underscore the principle of data aging and its implications for analytical accuracy.

Strategies for mitigating the effects of data aging

  • Data Governance: Establishing clear data retention and data quality standards.
  • Data Versioning (by customer stages): Tracking changes to data over time to understand its evolution.
  • AI Infusion: Utilizing AI at every stage of the data lifecycle to identify and address data anomalies, inconsistencies and data decay.

Conclusion

The truth is, data isn’t static. It’s a living, breathing entity that changes over time. Recognizing and adapting to these changes is what separates effective data strategies from those that quickly become obsolete. If you found this post insightful, please comment below! In a future post, I will explore the impact of other components like data gravity and data visualization on business value. Let me know if that’s something you’d like to see!

Reference:

  • Gartner, “The Importance of Data Quality in a Data-Driven World” (2023)
  • Forbes, “Data Decay: Why Your Data Isn’t as Good as You Think It Is” (2022)
  • McKinsey & Company, “The Age of Analytics: Competing in a Data-Driven World” (2023)
  • Deloitte Insights, “Data Valuation: Understanding the Value of Your Data Assets” (2022)
  • Equation created using https://www.imatheq.com/imatheq/com/imatheq/math-equation-editor.html


[1] In mathematics, the “rate of change of data” is typically represented as a derivative: a precise value showing how one variable changes in relation to another (e.g., how temperature changes with time). The “rate at which data changes”, by contrast, emphasizes the speed or pace of data variation over time.

Divestiture Framework – Data Perspective

Introduction

The selling of assets, divisions, or subsidiaries to another corporation or individual(s) is termed a divestiture. According to a divestiture survey conducted by Deloitte, “the top reason for divesting a business unit or segment is that it is not considered core to the company’s business strategy”, with respondents citing the need to shed non-core assets or financing needs as their top reason for divesting an asset. In some cases, divestiture is done to de-risk the parent company from a high-potential but risky business or product line. Economic turnaround and a wall of capital also drive the demand for divestitures.

Divestitures have some unique characteristics that distinguish them from other M&A transactions and spin-offs, for example, the need to separate (i.e., disentangle) the business and technology assets of the unit being sold from those of the seller before the sale is executed. Performing the disentanglement under tighter time constraints, i.e., before the close of the transaction (unlike in an acquisition scenario), adds to the complexity.

The critical aspect of the entire process is data disposition. Even though similar technologies may have been deployed on the buyer and seller sides, the handover can end up painful if a formal process is not adopted right from the due-diligence phase. This is because, in the case of a divestiture, the process is not as simple as ‘lift-shift and operate’. A handful of frameworks available in the market detail the overall divestiture process; nevertheless, the core component, “data”, is touched upon only at the surface level and not expanded enough to throw light on the true complexities involved.

What does the trend indicate?

Divestitures and carve-outs are very common in Life Sciences, Retail, and Manufacturing.

If we observe the economic movements and the divestiture trend over the past decade, it is clear that economic conditions have a direct correlation with, and a significant impact on, divestiture activity. Organizations therefore have to proactively assess their assets at least annually to understand which ones are potential candidates for divestiture and prepare accordingly. This way, when the time is right, the organization will be well prepared for the transition services agreement (TSA) phase.

The bottom line

Overall planning is a critical success factor; however, a process that lacks sufficient planning around the “data” component can result in surprises at various points during the course of the divestiture and can even break the deal. At the end of the day, shareholders and top management will look at the data to judge whether the deal was successful.

Faster due diligence, quicker integration, and visible tracking of key metrics and milestones from the start are what one looks for. According to industry experts, having a proactive approach in place has helped sellers increase the valuation of the deal.


The Divestiture Model – Buyer and Seller Perspective

Broadly, the Divestiture model has three components, Core, Buyer, and Seller, plus an overarching Governance layer:

  1. Core component: Handles all overall due-diligence activities related to data, such as identifying data owners and stewards, the data disposition strategy, value creation opportunities (VCO), and enterprise-level data integration with core applications at both the buyer and seller ends.
  2. Seller component: Focuses on seller-side data activities such as the data inventory, business metadata documentation, data lineage/dependency, business processes, data flow/process flow diagrams, the level of integration with enterprise apps, the business impact on existing processes, and resource movement (technology and people).
  3. Buyer component: Focuses on buyer-side data activities such as data mapping, data integration, data quality, technology, capacity planning, and business process alignment.
  4. Governance: The entire process is governed by a 360-degree data/information governance framework to maintain the privacy, security, regulatory compliance, and integrity aspects of data between the two organizations.

Divestiture model – The Core

Addressing the “data” component:

Selling Organization

Only a few sellers understand that getting a deal signed and closed isn't always the end. From a pre-divestiture perspective, the organization should have a well-defined process for possible carve-outs, a good data inventory with documented business metadata, documented business processes around the non-performing assets, and a clear data lineage and impact document. Armed with this information, the selling organization can enter any kind of TSA comfortably and answer most of the questions the buyer will raise during due diligence.

From a post-divestiture perspective, the selling organization needs to assess which technologies and processes must be tweaked or decoupled to achieve the company's post-divestiture strategy. A plan is needed to minimize the impact of operational dependencies on existing systems and processes, and on enterprise applications like ERP, once the divested unit's data stops coming in. If this analysis is not done thoroughly and well in advance, it can have a crippling effect on the entire organization. A typical mistake the selling organization makes is looking only at the cost savings from aligning and rationalizing infrastructure while missing the intricate coupling the data has at the enterprise level.

Having a divestiture strategy with data as the core of the framework can address a host of issues for the selling organization and speed up the pace of transactions.

Buying Organization

There are two potential scenarios for the buying organization: either it already has the product line or business unit and is looking to enhance its market position, or it is extending itself into a new line of business with no prior hands-on experience. In the former case, the complexities can primarily be attributed to migrating and merging data between the two organizations. Questions arise such as: what data to keep or pull, what technology to use, what data requires cleansing, how similar the processes are, how to plan capacity to house the new data, what tweaks existing reports will require, and what new reports need to be created to show shareholders the benefit of the acquisition.

The pre-divestiture stage addresses most of the questions raised above, and based on these parameters a data disposition strategy is drawn up. During the divestiture stage, when data disposition is actually happening, new reports, scorecards, and dashboards are built to ensure complete visibility across the organization at every stage of the divestiture process.

In the latter case, where the organization is extending itself into a new line of business, the questions are different: should a lift-and-shift strategy be adopted, should only the key data be brought in, or should it be a start from a clean slate? There is no single correct answer, as it depends on the quality of the processes, technology, and data coming from the selling organization.

Divestiture Data Framework

The Divestiture Data Framework was designed to highlight the importance of the core component, “data”.

[Figure: Divestiture Data Framework]

One of the key outputs of this framework is a customized technology and data roadmap. The roadmap contains recommendations and details on the data and technology complexities that need to be addressed before, during, and after the divestiture to ensure a higher success rate for both the selling and buying organizations.