Sharper resolution: using machine learning to measure audiences in high definition

By matching device-level data with the strength of machine learning, Kantar Media can make full use of panels to understand who is watching content at census level. This unlocks the full power of people-based measurement, writes Emiliano Cancellieri.

When RTL Netherlands partnered with Kantar Media for a Proof of Concept, the broadcaster was not lacking in data. With eight television channels, a well-used streaming service and a strong digital footprint, RTL was capturing signals from around five million unique devices across websites and catch-up platforms.

The problem was resolution. RTL, like many organisations with access to large volumes of first-party data, could see what was being watched, but not by whom. The data was tied to devices rather than individuals, tracking activity without providing a meaningful view of the person behind the screen.

This is a common gap in media measurement today. The increasing fragmentation of viewing habits across platforms has made it more difficult to produce reliable, person-level insights. Many datasets are passive, logged automatically by apps or set-top boxes. While they are useful for understanding patterns, they rarely tell the full story.

Kantar Media’s methodology is designed to bridge this divide. Rather than relying on devices or households as proxies for people, it integrates behavioural data with insights from verified audience panels. The approach is grounded in established measurement principles, enhanced with machine learning models to improve scale and accuracy.

At its core is the panel: a group of real individuals whose demographics and media consumption habits are carefully verified and regularly updated. The panel acts as a reference point, a baseline against which broader data signals can be interpreted. While limited in reach, panels provide a degree of reliability that raw behavioural data cannot match.

To extend this reliability across larger datasets, Kantar Media cross-references device-level viewing logs with panel data. This allows it to identify patterns in content consumption, time of use and other behavioural markers. Algorithms are then applied to infer demographic and lifestyle attributes for each device. These models draw on decision-tree-based techniques such as XGBoost, LightGBM and random forests, and produce probabilistic profiles for each viewer.

The process is particularly useful when data is incomplete or anonymised. For example, if a tablet is regularly used to stream cooking content in the early evening, and similar behaviour is observed among a particular demographic in the panel, the system assigns a likely profile to that device. These profiles may include age range, gender, household composition or interests such as health, gaming or food shopping.

This was the case with RTL. By matching its own device-level data with Kantar Media’s panel, RTL was able to enhance its understanding of individual users, moving from anonymous identifiers to attributes such as age, gender and family structure. This provided a more detailed view of the audience, enabling better targeting and content decisions.
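To make the tablet example concrete, the sketch below shows the general idea of turning panel behaviour into a likely profile for an unknown device. It is an illustration only: the panel records, the age bands and the function names are all hypothetical, and simple conditional frequencies stand in for the tree-based models mentioned earlier, which are not reproduced here.

```python
from collections import Counter, defaultdict

# Hypothetical panel records: (genre, daypart, verified age band).
# Real panels hold far richer, verified data; these values are made up.
PANEL = [
    ("cooking", "early_evening", "35-54"),
    ("cooking", "early_evening", "35-54"),
    ("cooking", "early_evening", "55+"),
    ("cooking", "early_evening", "18-34"),
    ("gaming", "late_night", "18-34"),
    ("gaming", "late_night", "18-34"),
    ("news", "early_evening", "55+"),
]


def build_profiles(panel):
    """Count verified age bands for each (genre, daypart) behaviour."""
    counts = defaultdict(Counter)
    for genre, daypart, age_band in panel:
        counts[(genre, daypart)][age_band] += 1
    return counts


def infer_profile(counts, genre, daypart):
    """Return P(age band | behaviour) as a dict, or {} if the behaviour is unseen."""
    seen = counts.get((genre, daypart))
    if not seen:
        return {}
    total = sum(seen.values())
    return {band: n / total for band, n in seen.items()}


counts = build_profiles(PANEL)

# An anonymous tablet that streams cooking content in the early evening.
tablet_profile = infer_profile(counts, "cooking", "early_evening")
print(tablet_profile)
```

The output is a probability for each age band rather than a single label, which is the sense in which such profiles are probabilistic. A production model would weigh many behavioural signals at once instead of a single genre and daypart pair.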

A shift in how audiences are understood

Kantar Media also applies machine learning with a corrective function. Where broadcasters or brands already hold claimed demographic data (submitted, for instance, through user registration forms), that data may be inaccurate. Kantar Media applies what we refer to as “accuracy matrices” to adjust for common reporting errors, using panel benchmarks to recalibrate the data.
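The article does not spell out how an accuracy matrix is built, so the following is one plausible reading rather than Kantar Media’s actual method. It assumes that panel members have both a claimed and a verified age band, estimates how often each claimed band turns out to be each verified band, and uses those rates to recalibrate claimed counts from a registration database. All figures and names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical panel pairs: (claimed age band, verified age band).
PANEL_PAIRS = [
    ("18-34", "18-34"), ("18-34", "18-34"), ("18-34", "18-34"), ("18-34", "35-54"),
    ("35-54", "35-54"), ("35-54", "35-54"), ("35-54", "35-54"), ("35-54", "55+"),
    ("55+", "55+"), ("55+", "55+"),
]


def accuracy_matrix(pairs):
    """Estimate P(verified band | claimed band) from panel pairs."""
    counts = defaultdict(Counter)
    for claimed, verified in pairs:
        counts[claimed][verified] += 1
    return {
        claimed: {v: n / sum(row.values()) for v, n in row.items()}
        for claimed, row in counts.items()
    }


def recalibrate(claimed_counts, matrix):
    """Redistribute claimed counts across verified bands using the matrix.

    Assumes every claimed band also appears in the panel pairs.
    """
    corrected = Counter()
    for claimed, n in claimed_counts.items():
        for verified, p in matrix[claimed].items():
            corrected[verified] += n * p
    return dict(corrected)


matrix = accuracy_matrix(PANEL_PAIRS)

# Hypothetical claimed ages from a registration form.
registered = {"18-34": 400, "35-54": 400, "55+": 200}
corrected = recalibrate(registered, matrix)
print(corrected)
```

In this toy data a quarter of users in the two younger bands turn out to be older than they claimed, so the recalibrated totals shift towards the older bands while the overall user count stays the same.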

These techniques will never replace panels, given the unique function panels serve, nor do they imply that probabilistic modelling is a substitute for verified data. The value lies in combining both: maintaining the rigour of representative measurement while extending its reach using scalable data science. The result is a dataset that is both more comprehensive and more actionable.

For marketers, this represents a shift in how audiences can be understood and reached. Rather than relying on broad categories or device proxies, campaigns can be based on more precise profiles. This supports better measurement of reach and frequency and enables more consistent targeting across platforms. For broadcasters, this not only boosts their commercial potential, but could also enable them to refine content strategies.

As the media environment continues to evolve, this kind of person-level insight is likely to become more central to how broadcasters, advertisers and platforms operate, shifting the focus from data collection to meaningful interpretation.

By focusing on the individual rather than the device, and by anchoring behavioural modelling in established measurement standards, Kantar Media’s approach offers a route to meeting that challenge. It reflects a wider industry shift towards data that is not only technically sophisticated but also actionable and fit for purpose.

Emiliano Cancellieri is a Data Science Manager at Kantar Media.

