Building TED: Unlocking the power of education data securely

We’ve been working on a groundbreaking initiative designed to provide researchers with secure, anonymised access to UK school data at scale.

Photo by Kaffeebart / Unsplash

At the National Institute of Teaching, we’ve been working on a groundbreaking initiative: the Teacher Education Data (TED) platform. TED is designed to give researchers secure, anonymised access to school data at scale, something that has not previously been possible in the UK.

Why TED Matters

Schools and multi-academy trusts (MATs) generate huge amounts of information every day, from attendance and attainment to wellbeing and support. The challenge has always been how to bring this data together in a way that is secure, scalable, and genuinely useful for research. That question became the starting point for TED. Before we could automate anything, we first had to prove it could work in practice. This is where the PiloTED project began: we manually extracted and prepared data to test whether accurate, research-ready outputs could really be delivered.

The Build Journey

Throughout the PiloTED project, we manually extracted data from our founding MATs (FMATs). This was deliberately labour-intensive: it let us see exactly how data is entered and stored at both MAT and school level, surface local variations, and learn what it would take to produce accurate, research-ready datasets. Working closely with the TIDE research team, we then manually cleaned and formatted these datasets so they could be used immediately for analysis. This groundwork was essential as it validated the concept and gave us the blueprint for automation.

In parallel, we engineered TED to do the same work automatically and at scale. TED now runs in a fully segregated Microsoft Azure environment, with a custom API designed to extract data from our FMATs securely and repeatably. The past year has focused on designing, building, and testing these foundations so that high-quality, anonymised data can flow reliably into research workflows.

Milestones from the last year 

  • Robust security from the ground up: TED operates within a segregated digital tenancy and has been independently audited by cyber-security specialists, confirming high standards of resilience and protection.
  • Transforming raw data into research-ready insights: We’ve successfully ingested real FMAT data and are testing the use of ehrQL (the query language behind OpenSAFELY) to extract anonymised, structured datasets that researchers can code against directly.
  • Hands-on researcher testing: Pilot users from the TIDE team have already run ehrQL coding sessions with TED data, demonstrating that the system is both functional and research-ready.
  • Collaboration: Our partners at the Bennett Institute for Applied Data Science (University of Oxford) have built a modified version of ehrQL that understands schools data, so researchers can extract data in a uniform way and reuse each other’s code.
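
To give a feel for what "anonymised, structured" can mean in practice, here is a minimal Python sketch of one common building block: replacing a direct identifier with a keyed hash and dropping other identifying fields before a row reaches researchers. This is an illustration only, not TED's actual pipeline or the ehrQL API. The field names (pupil_id, name, date_of_birth, attendance_pct), the function names and the key handling are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical field names: the real TED schema is not described in this post.
DIRECT_IDENTIFIERS = {"pupil_id", "name", "date_of_birth"}


def pseudonymise_id(pupil_id: str, secret_key: bytes) -> str:
    """Return a stable keyed hash (HMAC-SHA256, hex) of an identifier."""
    return hmac.new(secret_key, pupil_id.encode("utf-8"), hashlib.sha256).hexdigest()


def to_research_row(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and attach a pseudonymous key in their place."""
    row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    row["pupil_key"] = pseudonymise_id(record["pupil_id"], secret_key)
    return row


if __name__ == "__main__":
    # Assumption: the key is held outside the research environment.
    key = b"example-key-not-for-real-use"
    raw = {
        "pupil_id": "A123",
        "name": "Example Pupil",
        "date_of_birth": "2012-05-01",
        "school": "S01",
        "attendance_pct": 94.5,
    }
    print(to_research_row(raw, key))
```

Because the hash is keyed and stable, the same pupil maps to the same pupil_key across extracts, so records can still be linked without exposing the original identifier. A keyed hash on its own is pseudonymisation rather than full anonymisation, so a real platform would combine it with the governance and output controls described in this post.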

What’s Next 

With real data now flowing into TED, our focus is on strengthening governance to match the gold standards set by OpenSAFELY. Over the coming months, we’ll be updating governance procedures to ensure that every process is transparent, rigorous, and trustworthy. On the technical side, we’ll be testing TED further and bringing in more elements of OpenSAFELY that are currently being developed by the Bennett Institute, to ensure that outputs are fully reproducible as the platform evolves.

Looking Ahead 

TED is more than just a platform: it’s a step-change in how education research can be conducted. By combining secure infrastructure, rigorous governance, and collaborative research design, TED will enable new insights that can improve outcomes for children and schools across the country. A key part of this future will be the OpenSAFELY Schools components being developed with our partners at Oxford, ensuring that every analysis is transparent, consistent, and trustworthy as the platform scales.