Using a Privacy-First Mindset to Transform Data into Intelligence
The Data Economy is currently a $230B+ market and is expected to grow to $400B in the next four years. However, 80 percent of current spend is still limited to advertising use cases. This ratio is expected to reverse in the next 3–5 years as the spread of 5G and artificial intelligence (AI) creates even more opportunity and demand for enterprises to strategically leverage data as a competitive advantage.
With the rise of commoditized big data solutions over the past decade, enterprises and independent software vendors have enjoyed a boom in data and data services to help their businesses. Today, a robust data ecosystem has emerged in which data supply from one industry vertical can be transformed and cast into useful intelligence for other, unrelated industry verticals.
This piece examines how this practice requires a solid understanding of the characteristics of the underlying data resource, how that data is transformed into useful intelligence, and how these processes are carried out with the utmost care to represent the interests of the Data Subject.
The Road to Intelligence Starts with (Good) Data
Recent research shows that 70 percent of companies report zero to minimal impact from AI initiatives. Additionally, data challenges are routinely cited as the primary reason why 87 percent of data science projects fail to even make it into production. The typical data science team is overwhelmed by admin-heavy tasks associated with converting data into usable intelligence entities.
Current AI efforts are stymied by the lack of “application-ready” intelligence. Only the largest tech companies have the infrastructure, tools, and skills to process and transform raw data signals into ready-to-use intelligence, or have large enough first-party databases, or both (e.g., the walled gardens of Facebook, Google, Amazon). Solving the core data challenges of identifying, sourcing and processing data therefore represents a significant unmet need across industries and use cases, and a corresponding market opportunity for enabling AI.
The traditional DIKW (data, information, knowledge, wisdom) pyramid still holds value to anyone familiar with data. The most distinctive and untapped data sources are by definition the most difficult sources to harness. While web and mobile app data is ubiquitous, the ability to combine data sources, identify truly compelling insights and transform those insights into wisdom and actionable takeaways requires iterative, applied data science in cooperation with expert industry resources.
Most enterprises don’t have the resources to manage an enterprise data lake, ingest and sanitize multiple data feeds, and ultimately produce intelligence applicable to their business need. Indeed, this seems like an awfully high bar to set for an enterprise that simply wants to find the right answers to its business questions. Enterprises that embarked on their data enablement journeys in the 2010s have learned that where they once thought they “need the data,” they now know they really “need the answers.”
Additionally, regardless of the industry vertical, patterns have emerged with respect to the types of questions asked (and the answers provided) by these intelligence sets. For instance, the retail and commercial real-estate industries share similar motives in analyzing routes travelled in relation to relevant points of interest (POI). In retail, the interest is in analyzing observed, dimensioned traffic in and around storefronts to drive consumer insights. Likewise, both the buy side and the sell side in commercial real estate want to see the same information relative to a given set of properties as a form of location insight.
It makes business sense to normalize and simplify the analysis of this intelligence into a common function, or set of related functions, and expose these through a lightweight API or equivalent. Furnishing an integration point to a standardized “always on” intelligence repository represents a generational evolution in the delivery of the intelligence-as-a-service value proposition.
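To make the idea of a "common function" concrete, here is a minimal sketch of one such function: counting distinct anonymized devices observed near each point of interest. Everything named here (`Poi`, `poi_traffic`, the 300 m default radius, the tuple layout of the input points) is a hypothetical choice for illustration, not a description of any particular product's API.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass(frozen=True)
class Poi:
    """Hypothetical point of interest: an identifier plus a coordinate."""
    poi_id: str
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def poi_traffic(points, pois, radius_m=300.0):
    """Count distinct devices seen within radius_m of each POI.

    points: iterable of (anonymized_device_id, lat, lon) tuples.
    The radius is an assumed default, not a recommended value.
    """
    seen = {poi.poi_id: set() for poi in pois}
    for device_id, lat, lon in points:
        for poi in pois:
            if haversine_m(lat, lon, poi.lat, poi.lon) <= radius_m:
                seen[poi.poi_id].add(device_id)
    return {poi_id: len(devices) for poi_id, devices in seen.items()}
```

The same function would serve a retailer asking about storefront traffic and a real-estate analyst asking about a set of properties; only the list of POIs changes. A production service would likely use a spatial index rather than this brute-force loop.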
Creating a unified view of accurate consumer intelligence derived from the privacy-first processing of signals from wireless carriers, OEMs and mobile apps is at the heart of this value proposition. In the end, the goal is to produce a privacy-first, holistic intelligence set across both physical and digital dimensions.
Leveraging Telco-Based Data with Trust
Consider the traditional wireless telco operation: the macro network generates hundreds of terabytes of signal data every day. Mobile devices are constantly radiating when they are performing a teleservice or data operation. Idle mobile device radios will also “pilot” so that base stations know where devices are and can arbitrate among themselves on how to best complete an inbound call to the device. These radio events are logged at each base station, all of which maintain synchronized clocks. The telco has metadata describing the base stations, including tower height, number of antennas and their orientations, beam width and azimuth. From these inputs, high volumes of multilaterated location scores can be computed with varying degrees of precision, often 100m to 300m. It’s not uncommon to see a macro network produce an average of 600 of these network-sourced location (NSL) scores per subscriber-day.
Additionally, as wireless devices interact with internet services, summary events of these interactions are logged within the network. The majority of these events occur over a secure transport, but even these will include a timestamp, source and destination addresses and ports, and numbers of bytes exchanged. In cases where clear protocols are in use, headers and request/response details may also be logged. These packet layer data (PLD) records are another high-volume, high-velocity signal source describing subscriber behavior.
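As a rough illustration of what such a summary event might contain, the sketch below models a PLD record using only the fields named above. The class name, field names and the split of byte counts into upstream and downstream are assumptions; actual telco log schemas will differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class PldRecord:
    """Hypothetical packet layer data (PLD) summary event."""
    timestamp: datetime
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    bytes_up: int
    bytes_down: int
    # Only populated when a cleartext protocol exposes header details.
    header_host: Optional[str] = None


def bytes_by_destination(records):
    """Total bytes exchanged per (destination address, port) pair."""
    totals = defaultdict(int)
    for record in records:
        key = (record.dst_addr, record.dst_port)
        totals[key] += record.bytes_up + record.bytes_down
    return dict(totals)
```

Even with encrypted traffic, an aggregate like `bytes_by_destination` can be computed, because it relies only on the timestamp, address, port and byte-count fields that are logged regardless of transport security.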
When combined with demographic and behavioral characteristics gleaned from billing, CRM and marketing database systems, a very rich and entirely unique panel of data emerges. This data is differentiated from other app and device centric sources by the breadth of the signal – the combination of physical, digital and demographic dimensions – and the frequency of the signals collected.
But there are challenges to realizing value from this data. As with any raw signal source, intelligent analysis must be conducted in order to produce usable data by-products and intelligence that can bring actual applied value.
First and foremost, however, the privacy interests of the subscriber must be represented, and subscriber choice must be applied in any data processing workflow.
Privacy is King
It goes without saying that the application of consumer data in any capacity requires a great deal of trust from all parties involved. To gain consumer adoption and respect, companies should give consumers tangible control over the information they share, respect data privacy selections, offer easily understood policies and protect personal information going forward.
In fact, recent research on consumer privacy issues, legislation and best practices at firms around the world concludes that the successful handling of client data comes down to three Ts: trust, transparency, and type of data.
Unfortunately, this is not always the case. Businesses often implement processes that make consumer control confusing or, at worst, intentionally obstructive, treating opt-in as a “check the box” exercise. It is unfortunate that, in an ever-growing ecosystem with a paramount focus on data privacy, businesses will jeopardize consumer trust to gain value from customer data.
What would meeting the standards of trust and transparency with respect to telco signal and mobile data look like? First, it is critical to ensure that all consumer data has appropriate consent from the consumer, or data subject. This means that users should be provided choice and transparency with opportunities to:
- Electively determine how their data are and are not used
- Check and change their elections at various points in time and through multiple convenient touchpoints
- Operate with the understanding that they are never opted in for third-party marketing by default
The subscriber election state must then be combined with internal telco policies to compile an exclusion list which is used to completely remove signal from data processing workflows. This ensures that any signal that should not be processed (e.g. that of government subscribers, corporate liable subscribers and individually liable opt-outs) is removed at the source, all within the enterprise telco boundary.
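The exclusion step described above can be sketched as follows. The account-type labels, the `Subscriber` fields and the event layout are hypothetical; the only point illustrated is that subscriber elections and internal policy are merged into one exclusion list, which is applied before any downstream processing sees the signal.

```python
from dataclasses import dataclass

# Assumed policy: account types the telco excludes regardless of election.
EXCLUDED_ACCOUNT_TYPES = {"government", "corporate_liable"}


@dataclass(frozen=True)
class Subscriber:
    """Hypothetical subscriber record held inside the telco boundary."""
    subscriber_id: str
    account_type: str  # e.g. "individual", "government", "corporate_liable"
    opted_out: bool    # the subscriber's own election


def build_exclusion_list(subscribers):
    """Combine subscriber elections with internal policy exclusions."""
    return {
        s.subscriber_id
        for s in subscribers
        if s.opted_out or s.account_type in EXCLUDED_ACCOUNT_TYPES
    }


def filter_signal(events, exclusion_list):
    """Drop every event belonging to an excluded subscriber, at the source."""
    return [e for e in events if e["subscriber_id"] not in exclusion_list]
```

Building the list as a set keeps the membership check cheap, which matters when the filter runs over hundreds of events per subscriber per day.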
While the telco data has now been filtered for exclusion, it is still critical to further remove any personally identifiable information, and just as importantly, to anonymize the remaining data. The most robust platforms will implement further anonymization of post-processed and filtered mobile and telco data to ensure users can never be re-identified.
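The article does not specify which anonymization techniques are used, so the following is only one possible sketch: direct identifiers are dropped, the subscriber ID is replaced with a keyed-hash token, and records falling into groups smaller than a threshold `k` are suppressed. The field names, the key handling and the value of `k` are all assumptions. Keyed hashing by itself is pseudonymization rather than full anonymization, which is why it is paired here with small-group suppression; a real platform would need stronger and more carefully reviewed controls.

```python
import hashlib
import hmac
from collections import Counter

# Assumed names of direct identifiers to drop from every record.
PII_FIELDS = ("name", "msisdn", "email")


def pseudonymize(subscriber_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed HMAC-SHA256 token."""
    digest = hmac.new(secret_key, subscriber_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def scrub_record(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and tokenize the subscriber ID."""
    cleaned = {k: v for k, v in record.items() if k not in PII_FIELDS}
    cleaned["subscriber_id"] = pseudonymize(record["subscriber_id"], secret_key)
    return cleaned


def suppress_small_groups(records, group_key, k=5):
    """Keep only records whose group contains at least k records."""
    counts = Counter(r[group_key] for r in records)
    return [r for r in records if counts[r[group_key]] >= k]
```

In this sketch the secret key would stay inside the telco boundary, so that tokens cannot be recomputed from known identifiers by anyone downstream.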
Thus, the “3 Ts of handling data” need to be rigorously upheld throughout the entire process, from facilitating subscriber election through data collection, filtering and anonymization, all the way through the derivation of intelligence. Using telco signal data requires more than a commitment to adhering to industry requirements; it also requires proactively instituting practices that empower users while constantly adapting to new security and privacy practices.
Privacy is a Concern
The Data Economy is large, and the limitations of traditional data sources have led to challenges in recognizing and applying the value of AI. While current market players, including data aggregators and mobile data providers, have experience in sourcing signal from app and other on-device sources, many of them are struggling because raw telco signal requires significant investment and expertise to meet rigorous privacy and compliance standards. Then there is the investment and challenge of creating usable data by-products that can bring actual applied value.
In today’s climate, privacy is a key concern for all companies with high-value data assets, especially telcos. By combining the practice of AI to create higher-value intelligence entities that serve the market with a systemic framework that ensures only anonymized and filtered data are employed, organizations will be able to focus with confidence on the problems they want to solve through the application of trusted intelligence.