Of the 21 experts to whom I sent my June 28 article, 13 participated in an e-mail discussion, which the ARF has summarized at the link below ("ARF discussion"). Most of the 13 agree with the statement, "The evidence that a duplication estimation technique closely agrees with an empirical canonical set is essential and needs regular validation."
In the June 28 article we offered evidence that random probability methods do not accurately reproduce audience duplication patterns as measured empirically by best practices. Nielsen ONE Ads represented the empirical approach, and what used to be called Sainsbury -- the assumption of random probability duplication -- represented the general modeling approach.
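For readers unfamiliar with the Sainsbury assumption, it can be stated in a few lines: duplication between two platforms is taken to be the simple product of their individual reach probabilities, as if exposure on one platform told you nothing about exposure on the other. A minimal sketch, with hypothetical reach figures:

```python
def sainsbury_duplication(reach_a, reach_b):
    """Duplicated audience under independence: P(A and B) = P(A) * P(B)."""
    return reach_a * reach_b

def sainsbury_combined_reach(reach_a, reach_b):
    """Combined reach under independence: P(A or B), by inclusion-exclusion."""
    return reach_a + reach_b - reach_a * reach_b

# Hypothetical example: TV reaches 60% of the target, digital reaches 40%.
dup = sainsbury_duplication(0.60, 0.40)          # 0.24 -> 24% exposed to both
combined = sainsbury_combined_reach(0.60, 0.40)  # 0.76 -> 76% reached by either
print(dup, combined)
```

The empirical finding of the June 28 article is that real cross-media duplication departs from this product rule, which is why a method that reproduces it only "by independence" lacks fidelity.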
Many people discussed the article with me after publication, and the exchange grew into a single e-mail thread involving more than twenty leading technical minds in the media research industry. One of the participants, Advertising Research Foundation (ARF) Chief Research Officer Paul Donato, found the ad hoc collaboration exhilarating and decided to publish it. Here is the link: ARF discussion.
The discussion exemplified what I had described as "inclusive industry collaboration" in my Hall of Fame acceptance speech at Cynopsis last month.
The thread also shows that most media researchers still prefer measurement to any form of modeling, with an openness to modeling only where a specific walled garden's refusal to be measured leaves no other option.
The discussion also revealed, to many who did not already know it, that walled gardens can be measured ... but only if they want to be. They do so by integrating with a chosen measurer, either by direct match or through a clean room that preserves all the empirical relationships without risk to privacy. Nielsen has been the walled gardens' typical integration choice so far.
Although the Virtual IDs approach embedded in the World Federation of Advertisers (WFA)/Association of National Advertisers (ANA) Cross-Platform Audience Measurement Initiative is not identical to Sainsbury, I opined in the article that both methods would tend to lack fidelity to the empirical duplication patterns. A number of my colleagues wanted to know why I said that.
Here was my underlying reasoning, which did not appear in the article (a good thing, too, because if I had included my thinking to that degree, the thread discussion might never have happened):
One Virtual ID is assigned to replace a given ID on one platform and to replace another ID on another platform. The likelihood that both actually represent the same human being is low: one of them may have been exposed to a brand's TV campaign while the other may not have been. When reach/frequency is put together across all media measured (TV and digital are the starting point), the arbitrariness of the VID assignments will have a randomizing effect, with some VIDs found to have been exposed to both TV and digital, and others found to have been unexposed, or exposed to TV or to digital but not both. The assignments, however, have a good chance of being inaccurate.
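This randomizing effect can be illustrated with a toy simulation (all numbers hypothetical, not drawn from any actual VID implementation): the simulated population has positively correlated TV and digital exposure, but linking the two platforms' exposure records by arbitrary ID assignment destroys that within-person correlation and pushes the estimated duplication back toward the Sainsbury independence product.

```python
import random

random.seed(0)
N = 100_000
# Hypothetical marginals: TV reaches 60%, digital reaches 40%.
# Ground truth has positive correlation: digital exposure is more likely
# among people already exposed to TV (rates chosen so marginals still hold:
# 0.60 * 0.55 + 0.40 * 0.175 = 0.40).
truth = []
for _ in range(N):
    tv = random.random() < 0.60
    p_digital = 0.55 if tv else 0.175
    truth.append((tv, random.random() < p_digital))

true_dup = sum(tv and dg for tv, dg in truth) / N

# VID-style linkage: each platform's exposure records are matched to IDs
# arbitrarily, which severs the within-person correlation.
tv_flags = [tv for tv, _ in truth]
dg_flags = [dg for _, dg in truth]
random.shuffle(dg_flags)  # arbitrary cross-platform assignment
vid_dup = sum(t and d for t, d in zip(tv_flags, dg_flags)) / N

print(f"true duplication      ~ {true_dup:.3f}")  # ~0.33, well above random
print(f"VID-style estimate    ~ {vid_dup:.3f}")   # ~0.24 = 0.60 * 0.40
```

The shuffled estimate lands near the independence prediction of 24% even though the true duplication is about 33%, which is the sense in which arbitrary VID assignment "tends toward random probability."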
In the first "sketch," many of these VIDs are likely to have been misclassified. Some VIDs said to have been unreached by a platform might actually have been exposed to it, and many said to have been reached might actually not have been. The WFA/ANA model (originally developed by Google and Facebook) expects those errors: the idea is then to make iterative comparisons with a probability sample panel that is taken as the truth standard, with machine learning forcing the model into maximal agreement with that truth set.
That might work if the data were rigidly stable from week to week, which they have been shown to be anything but. Otherwise the trailing correction factors are outdated by the time they are applied.
More importantly, there are only so many things that can be controlled by weighting or by changing simulated exposure statuses. TRA found that an iterative algorithm could control for about 100 vectors; much above that, the model would not converge, and we could wind up with even greater differences than we started with. And TRA was not trying to simulate audience duplication patterns among thousands of media vehicles across hundreds of popular target audiences; we were merely trying to weight and project set-top-box data to Universe Estimates (UEs) by demographics within 210 unduplicated markets in the U.S. We found a geodemo set of 110 vectors that optimized the results for that situation, and we were never more than two percentage points off any UE.
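The kind of iterative control described above is commonly done with iterative proportional fitting ("raking"): weights are repeatedly scaled so that each control vector's totals match its universe estimates. TRA's actual algorithm is not detailed here; this is only a generic sketch with hypothetical toy data, to show the mechanics and why adding many more control vectors makes convergence harder.

```python
def rake(weights, memberships, targets, iterations=50):
    """Iterative proportional fitting over several control vectors.

    weights:     initial weight per household
    memberships: per control vector, {cell: [household indices]}
    targets:     per control vector, {cell: target (universe estimate)}
    """
    for _ in range(iterations):
        for dim, cells in enumerate(memberships):
            for cell, idxs in cells.items():
                current = sum(weights[i] for i in idxs)
                if current > 0:
                    factor = targets[dim][cell] / current
                    for i in idxs:
                        weights[i] *= factor
    return weights

# Hypothetical toy data: 4 households, two control vectors (age group, market).
weights = [1.0, 1.0, 1.0, 1.0]
memberships = [
    {"18-34": [0, 1], "35+": [2, 3]},   # age groups
    {"NY": [0, 2], "LA": [1, 3]},       # markets
]
targets = [
    {"18-34": 3.0, "35+": 5.0},  # universe estimates by age
    {"NY": 4.0, "LA": 4.0},      # universe estimates by market
]
weights = rake(weights, memberships, targets)
print(weights)  # -> [1.5, 1.5, 2.5, 2.5]; every margin now matches its UE
```

With two consistent control vectors this converges immediately; pile on hundreds of interacting vectors and the successive rescalings can fight each other, which is the non-convergence TRA ran into.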
The foregoing explanation of how I formed my opinion about the probable inaccuracy of the VID approach was the reasoning behind my statement in the June 28 article:
"Whereas Project Blueprint found that duplication among media are definitely not random, the WFA/ANA blueprint allows for a concept of Virtual IDs which will tend to produce results that are very similar to random probability."
Former Nielsen Chief Research Officer Mainak Mazumdar, now Chief Research and Analytics Officer at Fox, provided what was to me the most solid answer in the discussion. He reported that Nielsen, in testing VIDs, found them to be 90% accurate for duplication between two platforms, but that accuracy declined as platforms were added, dropping to 75% with a third platform and down to 50% after that.
This made me feel that my instinct of lumping VIDs with Sainsbury was somewhat vindicated. Sainsbury, of course, came from an era before machine learning (ML) existed. VIDs, born into an age where ML does exist, have a far better chance of success than Sainsbury, Agostini, and Metheringham, three famous reach/frequency/duplication heuristics of the last century.
Former ARF Chief Research Officer Joel Rubinson wondered in the thread whether modeling was not inevitable because of the walled gardens, and former Chief Research Officer of Nielsen NCS Leslie Wood responded that Nielsen empirically measures many walled gardens by full integration, either by direct match or via clean room, but 100% empirically in either case. The WFA/ANA would be wise to alter their blueprint to place a higher value on empirical integration of datasets by direct match or clean room, allowing modeling only where a particular walled garden is unwilling to integrate with any measurer.
The modeling which the WFA/ANA ought to recommend should be a method that has demonstrated, across a sufficient series of diverse cases, that it approximates reality in terms of complex duplication patterns among a host of platforms and target audiences. Rubinson points out that while he respects all models, the Dirac equations used in the Google/Facebook/WFA/ANA method are not the best choice; in the thread he reports that Dirac is specifically known not to preserve internal relationships. Rubinson has devised a different mathematical model which he argues will outperform the Dirac-based modeling in the cases where modeling is the only option the industry is left with for specific walled gardens.
Meanwhile, it appears (thanks to Jim Spaeth and to Michael Vinson) that the original formulation from Google did start out assuming "independence" (random probability), but its developers have since learned that a process for bringing in empirical guardrails is necessary to achieve the degree of accuracy desired. So there may be new data on VIDs that we can bring to light in upcoming articles, data which could demonstrate an acceptable degree of agreement with empirical benchmarks.
In this short space I have only lightly touched on the richness of a discussion that arose naturally among some of the finest minds in the business. Our industry has no shortage of fine minds; that talent may not be as well represented in the day-to-day, hand-to-hand work as it is in this ARF compilation of the spontaneous e-mail thread. You will recognize many famous names in our field.
Posted at MediaVillage through the Thought Leadership self-publishing platform.
The opinions expressed here are the author's views and do not necessarily represent the views of MediaVillage.org/MyersBizNet.