
Unlocking the power of geographic data

Professor Paul Longley explains how the Geographic Data Service is transforming research by connecting smart data to real places and people

Smart data in a fragmented world

More data is collected about us today than at any point in human history. Ever-increasing proportions are smart data, collected and retained as a by-product of human interactions with smartphones and other digital devices. This brings great potential to research because of the intricate detail with which human activities and behaviours are recorded.  

More traditional sources of data collection, such as censuses and social surveys, provide rich information about people’s beliefs, attributes and values but are not designed to collect behavioural transactional data that might, for example, indicate problem gambling or financial vulnerability.

Smart data have their limitations. No single smart data provider has the complete picture of how we buy and use goods and services. Some platforms know a lot about certain groups but barely anything about others. And very few can connect what we do online with who we really are and how specific aspects of our behaviour relate to how we live our lives. 

When different bits of our digital lives are scattered across platforms, representing a complete human self presents a kind of “digital identity puzzle” that can be as hard to put together again as Humpty Dumpty. Pieces of the puzzle are hidden in different smart data silos but, unless put together, smart data alone can’t answer many of our important research questions. 

The challenges of data aggregation

A still more fundamental challenge lies in how personal data are processed for research and analysis. The General Data Protection Regulation (GDPR) is essential for safeguarding privacy, but overly cautious interpretations by data owners can lead to “one size fits all” data aggregation solutions that limit the usefulness of data for public benefit.

Geographic detail—crucial for understanding neighbourhood-level patterns—is often sacrificed first in implementing disclosure control measures to protect privacy. For example, while aggregating property addresses into postcode sectors (areas typically containing 3,000 properties) may very well protect privacy, it is very unlikely that all properties and their residents share similar characteristics, which limits the value of analysis using these aggregations. 

Some data consolidators, such as telecoms companies, take even broader approaches that bulldoze natural neighbourhood boundaries in favour of somewhat arbitrary geographic units, like 350-metre hexagonal grid tiles. These practices provide easy and well-intentioned privacy protection, but ride roughshod over fine-grained social patterns and neighbourhood structures. Privacy provision does not need to aggregate away the defining geographical characteristics of communities that are intrinsic to the value of smart data for research in the national interest.

Geographic Data Service: a new vision for research data

The Geographic Data Service (GeoDS) represents a fundamental shift in how research data are accessed, managed, and used. Rather than accepting the limitations of pre-aggregated datasets, GeoDS champions individual-level data, which are always suitably de-identified and safeguarded to maintain privacy. This approach preserves geographic precision and context. 

GeoDS actively advocates for researchers, challenging the notion that they should simply accept whatever aggregated data providers deem “convenient”. Instead, GeoDS works directly with data providers, taking full responsibility for handling personal data using ISO27001-accredited secure settings.  

How it works

Working within strict privacy frameworks and only making available carefully de-identified data, GeoDS creates research datasets more attuned to researcher needs by connecting individual characteristics with property-level information. This approach addresses what researchers call the “ecological fallacy” – when we assume things about individuals based on averaging 3,000 or so of their neighbours (as in the case of postcode sectors). For example, high earners live cheek-by-jowl with those of much more limited means in many parts of the UK’s cities, and our ability to detect or monitor gentrification would be dulled by aggregation of income data to postcode sectors.
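The arithmetic behind the ecological fallacy can be illustrated with a minimal sketch. The incomes, neighbourhood names and function names below are all invented for illustration and are not GeoDS data or methods; the point is only that a single sector-wide average can describe nobody who actually lives there.

```python
from statistics import mean

# Hypothetical property-level incomes (GBP thousands) for two neighbourhoods
# that fall inside the same postcode sector. All values are invented.
incomes = {
    "neighbourhood_a": [95, 110, 120, 105],  # higher earners
    "neighbourhood_b": [18, 22, 20, 24],     # much more limited means
}


def sector_mean(groups):
    """Average over every property in the sector, ignoring neighbourhoods."""
    all_values = [value for values in groups.values() for value in values]
    return mean(all_values)


def neighbourhood_means(groups):
    """Average within each neighbourhood separately."""
    return {name: mean(values) for name, values in groups.items()}


sector_average = sector_mean(incomes)
local_averages = neighbourhood_means(incomes)

print(f"Sector average: {sector_average}")
for name, value in local_averages.items():
    print(f"{name}: {value}")
```

In this toy case the sector average sits between the two groups and matches neither, which is the distortion that property-level data are intended to avoid.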

So GeoDS enables researchers to understand social patterns and processes more accurately, cognisant of the distortions that aggregation might bring. By starting with property-level precision, the service can create more reliable neighbourhood profiles for research purposes. 

Many GeoDS datasets are built from data that are linked at the individual level – including household composition, property values, energy performance, and residence history. When specific individual characteristics (like age, gender or ethnicity) aren’t directly available, they’re modelled at the individual level and validated against established aggregate sources such as censuses. 
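One simple way to picture validation against aggregate sources is to count modelled individuals by area and category and compare those counts with published totals. The sketch below assumes hypothetical area codes, age bands and counts; it is not the published GeoDS validation procedure, which is described in the peer-reviewed papers.

```python
from collections import Counter

# Hypothetical modelled records: (area_code, modelled_age_band).
modelled = [
    ("A1", "18-34"), ("A1", "18-34"), ("A1", "35-64"),
    ("A2", "35-64"), ("A2", "65+"), ("A2", "65+"),
]

# Hypothetical census counts for the same areas and age bands.
census = {
    ("A1", "18-34"): 2, ("A1", "35-64"): 1,
    ("A2", "35-64"): 2, ("A2", "65+"): 1,
}


def compare_to_census(records, reference):
    """Return modelled count minus reference count for every cell."""
    modelled_counts = Counter(records)
    cells = set(modelled_counts) | set(reference)
    return {
        cell: modelled_counts.get(cell, 0) - reference.get(cell, 0)
        for cell in cells
    }


differences = compare_to_census(modelled, census)
# Keep only the cells where the model and the census disagree.
mismatches = {cell: diff for cell, diff in differences.items() if diff != 0}

for cell, diff in sorted(mismatches.items()):
    print(cell, diff)
```

Cells with a non-zero difference flag where the modelled attributes drift from the reference source and would need recalibration or documentation.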

This approach is rigorously validated, with methods and results published in peer-reviewed journals. The data are updated annually as new smart and statistical data become available, and importantly, any gaps or limitations are transparently documented in the metadata. 

By combining individual-level precision with geographic context, GeoDS provides researchers with data that more accurately reflect the sometimes complex social patterning of real communities, enabling more reliable insights and improved evidence-based policies. 

Making complex data accessible

Creating and maintaining data infrastructure from sources acquired at the individual level requires significant effort and expertise. We negotiate with data providers, validate data quality, customise datasets to research needs, and link disparate data sources. This is followed by ongoing curation, documentation, and regular updates.

GeoDS uses its comprehensive individual-level data infrastructure to develop derivative “research-ready data” that are designed to meet a full range of research purposes. These data products are developed in response to researcher needs and are made available through streamlined access procedures for registered users. 

All this delivers one major benefit: the capacity to provide researchers with carefully aggregated data that maintain maximum analytical value while adhering to privacy requirements and to the complex mix of legal terms governing access to multiple datasets, as set out in data licensing agreements negotiated for the entire academic sector.

Dataset creation processes are transparently documented through peer-reviewed research papers and detailed standardised metadata, ensuring both scientific rigour and accessibility. This approach has proven highly valuable to the research community: in 2024 alone, our team provided over 11,000 data downloads to more than 7,000 unique users.

Practical applications: gambling research

Our work on observed gambling behaviour demonstrates how this approach extends what is possible using conventional survey instruments alone. Linking georeferenced, anonymised individual customer records to neighbourhood attributes has enabled GeoDS to build a UK-wide classification of observed gambling behaviour at neighbourhood scale, rather than relying on regional summaries of what people report in surveys.

The new smart data approach reveals small-area variations in gambling participation and observed behaviour that traditional surveys cannot capture. By connecting de-identified transaction data with geographic context, researchers can identify community-level patterns that could inform proportionate interventions that are more targeted and effective. 

The gambling research exemplifies how GeoDS and smart data push the frontiers of social science methodology. GeoDS research-ready data products enable researchers at all career stages and throughout the UK to develop deeper insights into complex social phenomena at scales previously impossible to achieve – all while maintaining the highest standards of data protection and ethical research practice. 

This represents just one example of how careful, responsible use of geographic data can transform our understanding of social issues and inform evidence-based solutions that benefit communities. 

Please visit the GeoDS website to explore the Great Britain Gambling Behaviours Classification dataset (GB2C: Local Authority District Geography). A Lower layer Super Output Area version is available upon successful application to GeoDS.

Professor Paul Longley is Director of the Geographic Data Service – discover more about our data services.
