University of Cambridge
Annual Review 2025

Transforming how LLMs are evaluated

As large language models (LLMs) become increasingly central to business, research and public life, a critical challenge has emerged.

Existing benchmarks offer limited insight into what these systems can truly do, yet they continue to shape investment decisions, deployment choices and regulatory judgements.

Cambridge spinout Trismik is addressing this issue by applying proven psychometric methods, originally developed for measuring human intelligence, to the evaluation of AI systems. By combining these methods with adaptive testing that adjusts question difficulty in real time, Trismik offers a more rigorous and informative approach to assessing model capabilities, moving beyond narrow accuracy scores towards richer evidence of strengths and limitations. If successfully translated into practice, Trismik’s approach could fundamentally change how we think about model capabilities.
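To make the idea of adaptive testing concrete, the sketch below shows one textbook way it can work. It is an illustration only and does not describe Trismik’s actual implementation. It assumes a Rasch (one-parameter) item response model, a small question bank with known difficulties, and a hypothetical `answer_fn` callback that reports whether the model under test answered a given question correctly. After each answer, the ability estimate is updated and the next question is chosen to match it.

```python
import math


def p_correct(ability, difficulty):
    """Rasch (1PL) model: probability that a test-taker answers correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_ability(responses):
    """Posterior-mean (EAP) ability on a grid, with a standard normal prior.

    `responses` is a list of (difficulty, correct) pairs.
    """
    grid = [g / 10.0 for g in range(-40, 41)]  # abilities from -4.0 to 4.0
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for difficulty, correct in responses:
            p = p_correct(theta, difficulty)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


def adaptive_test(item_bank, answer_fn, n_items):
    """Run a short adaptive test.

    item_bank: dict mapping question id -> difficulty (assumed pre-calibrated).
    answer_fn: hypothetical callback, question id -> True if answered correctly.
    """
    remaining = dict(item_bank)
    responses = []
    ability = 0.0  # start from an average ability
    for _ in range(min(n_items, len(remaining))):
        # Under the Rasch model, the most informative question is the one
        # whose difficulty is closest to the current ability estimate.
        item_id = min(remaining, key=lambda i: abs(remaining[i] - ability))
        difficulty = remaining.pop(item_id)
        responses.append((difficulty, answer_fn(item_id)))
        ability = estimate_ability(responses)
    return ability, responses
```

Because each question is chosen where it is most informative, a test of this kind can reach a stable estimate with far fewer questions than running every item in a fixed benchmark, and the result is an ability score on a common scale rather than a raw percentage.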

The idea originated in Professor Nigel Collier’s lab in the Department of Theoretical and Applied Linguistics, where work on AI language understanding and safety exposed the limitations of current evaluation methods.

With early support from Cambridge Enterprise in 2022, the team was able to explore the commercial potential of the research, and the idea gained pace as it moved from lab to venture. The team joined our Ideas Incubator in 2023 and received proof-of-concept backing from the Arts and Humanities Research Council Impact Acceleration Account. A £97k Technology Investment Fund award in 2024 was a critical turning point: this translational funding enabled Trismik to turn its early prototype into a functioning software product built for scale and to run trials with enterprise partners in telecoms, media and other risk-sensitive markets for AI adoption.

By 2024, a commercial team was forming around this promising technology, with CEO Rebekka Mikkola joining Professor Collier to participate in the Founders at the University of Cambridge START 2.0 accelerator programme. Marco Basaldella, himself a Cambridge alumnus, brought technical capacity as CTO and third co-founder. In September 2025, Trismik launched its LLM Experimentation Platform, offering beta access to users and delivering evaluation results up to 180x faster than classical testing.

Since spinning out, Trismik has raised £2.2 million in pre-seed funding, with participation from Cambridge Enterprise, and is working with early adopter partners to set a new standard in safe, smart LLM performance testing.