The Mirage That is 1st Party Data

By Chris Williams

Marketers are being told to base their media and attribution strategies on their 1st party data. It is neither the best solution nor even possible. The marketer ends up working with a mishmash of modelled and deterministic data, with no clear view of which is which, and yet must bear all the privacy overhead costs.

Brand growth is simply finding new customers, while the current shiny object is the 1st party data strategy. The two are incompatible. A 1st party data strategy is actually a 2nd party data strategy, and much of that data is modelled. Marketers don't have much data on their prospective customers but others do, so they match. Compromises ensue. Match rates vary and aren't always high, and there are volume incentives to get some kind of match, which means lookalikes or modelled data. But even when match rates are good, that doesn't mean every attribute attached to the ID is deterministic - much of it is modelled. No one can collect all the desired attributes against all their records, so they have to fill in the gaps somehow, and the financial incentives to do so are there.

On top of all this matching and modelling, there is a privacy and tech overhead that has to be paid for. The marketer has fewer working media dollars because they have to manage privacy: the privacy of their customers, whose 1st party data they matched with 2nd party data providers, who in turn modelled lookalikes and added attributes to the matched ID. It's the worst of both worlds: the cost of privacy management and the cost of modelling.

It gets more challenging. The marketer has matched with one provider, let's say a retail media network (RMN), but neither the advertiser nor the RMN has data on all the customers in the category. The advertiser then matches with multiple RMNs to cover the full category, but now they have created two more problems: sample bias and de-duplication. Each RMN matches with a different portion of the advertiser's 1st party data, so each one's lookalike modelling has its own sample bias and each of their projections to the category looks different. Plus, none of the RMNs are sharing data with each other, so no one has a picture of the de-duplication across them all.
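The two problems can be made concrete with a toy sketch. Everything in it is an assumption for illustration: the customer IDs, the three RMN names and the records each one knows are invented, and real matching happens inside clean rooms rather than on plain sets. It only shows why per-RMN counts cannot simply be added up.

```python
# Hypothetical toy data: all IDs and RMN names are invented for illustration.
advertiser_ids = set(range(1, 11))  # 10 first-party customer records

# Assumed: the customer records each RMN happens to know about.
rmn_known = {
    "rmn_a": {1, 2, 3, 4, 5, 6},
    "rmn_b": {4, 5, 6, 7, 8},
    "rmn_c": {6, 8, 9},
}


def match(advertiser, known):
    """Return the advertiser records that one RMN is able to match."""
    return advertiser & known


matches = {name: match(advertiser_ids, known) for name, known in rmn_known.items()}

# Each RMN seeds its lookalike model from a different slice of the file,
# which is where the sample bias comes from.
match_rates = {name: len(m) / len(advertiser_ids) for name, m in matches.items()}

# What the advertiser can see: one count per RMN, which can only be summed.
summed_reach = sum(len(m) for m in matches.values())

# What no one can see without shared data: the de-duplicated union.
unique_reach = len(set().union(*matches.values()))

duplication = summed_reach - unique_reach

print(match_rates)   # {'rmn_a': 0.6, 'rmn_b': 0.5, 'rmn_c': 0.3}
print(summed_reach)  # 14
print(unique_reach)  # 9
print(duplication)   # 5
```

In this toy case the three reports add up to 14 matched customers when only 9 distinct customers were matched. The union is computable here only because the sketch holds all three match sets in one place, which is exactly what the advertiser does not have.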

Worse, the media plan is now biased towards any medium that offers a 1st party data match, instead of being planned for effectiveness and efficiency. Radio? Forget it. Why would anyone propose a medium that doesn't use the shiny data clean room and all this 1st party data we invested in collecting? Plus, anyone who has looked at the mechanics of the Turbo Confabulator realizes that where there is mystery there is margin.

Lastly, the marketer's challenge didn't just pop out of nowhere. Before the marketing plan there was a business plan, and in that plan there are sales projections, regional distribution plans, and market share and repeat purchase objectives, all of which were developed by data scientists using modelling against top-down category populations. Those audience definitions were good enough to get approval for product development, so why not use them in marketing and media? More and more of this business data science uses synthetic national populations to support business decisions. The synthetic population is nothing but a bunch of Virtual IDs (VIDs) with thousands of attributes attached to each VID. Whether it's product usage or media usage - same thing - same data - same ID. The VID is far more capable of spanning business and media operations than the compromises being made in 1st/2nd party data matches in clean rooms.
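As a rough sketch of what "attributes attached to each VID" could look like in practice, the snippet below builds a tiny synthetic population and reads a media question off the same records that carry product usage. The attribute names, the probabilities and the helper functions are all hypothetical, and the attributes are drawn independently at random purely for illustration. A real synthetic population would be calibrated to category data and would carry thousands of attributes, not three.

```python
import random


def build_synthetic_population(size, seed=0):
    """Build a toy population of Virtual IDs with a few invented attributes."""
    rng = random.Random(seed)  # seeded so the toy population is reproducible
    population = {}
    for i in range(size):
        vid = f"VID{i:05d}"
        population[vid] = {
            # Assumed attributes and probabilities, for illustration only.
            "region": rng.choice(["east", "west"]),
            "buys_category": rng.random() < 0.30,
            "listens_radio": rng.random() < 0.50,
        }
    return population


def reach_among(population, audience, medium):
    """Share of VIDs in the audience that also have the medium attribute."""
    members = [attrs for attrs in population.values() if attrs[audience]]
    if not members:
        return 0.0
    return sum(attrs[medium] for attrs in members) / len(members)


population = build_synthetic_population(1000)

# Product usage and media usage sit on the same ID, so no match is needed.
radio_reach = reach_among(population, "buys_category", "listens_radio")
print(f"Radio reach among category buyers: {radio_reach:.1%}")
```

Because both attributes live on the same VID, a medium such as radio can be evaluated against the business plan's audience definition without a clean room match, which is the point the paragraph above is making.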

Chris Williams

Chris Williams has 30 years of experience in digital media. Previously, he was President of IAB Canada and Vice President, Digital at the Association of Canadian Advertisers. He is now with a startup that offers live market mix models combined with consumer insight tools…