Sparse Autoencoders Can Learn Graded Latents for Relational Composition
Theo Farrell, Patrick Leask, Noura Al Moubayed
Mechanistic Interpretability Workshop at ICML 2026
View on OpenReview
AI Safety Research and Field-building
MSci Natural Sciences (Computer Science & Philosophy) at Durham University. Founder and co-organiser of Durham AI Safety Initiative. I’ve got experience in foundational mechanistic interpretability research, but I’m increasingly interested in, and concerned about, the risks from open-weight models (as outlined in this paper). Starting summer 2026 I will be in London for the LASR Labs fellowship. Outside of AI Safety I play bass guitar 🎸 and dabble with piano and viola 🎹🎻!
Theo Farrell, Patrick Leask, Noura Al Moubayed
Mechanistic Interpretability Workshop at ICML 2026
View on OpenReview
Theo Farrell, Patrick Leask, Noura Al Moubayed
39th Conference on Neural Information Processing Systems (NeurIPS 2025) Workshop: ResponsibleFM
View on OpenReview
Manon Kempermann, Sai Suresh Macharla Vasu, Mahalakshmi Raveenthiran, Theo Farrell, Ingmar Weber
Second Conference of the International Association for Safe and Ethical Artificial Intelligence (IASEAI'26)
View preprint on arXiv
How reliable are current interpretability methods? Recent work on CoT monitoring and SAEs (co-presented with Toby Pullan)
I founded DAISI in my second year of university after being inspired by a three-person reading group at the university’s Effective Altruism society. Supported by the Pathfinder fellowship, the group helps to funnel top-university talent into AI Safety.
Reinforcement learning implementation for training bipedal agents to walk
IASEAI 2026
Challenges of Evaluating LLM Safety for User Welfare — Code & Dataset
Algorithm and protocol implementation and analysis
Data preprocessing, cleaning, and statistical analysis
Implementation of data compression algorithms including Huffman coding, LZW, and run-length encoding
Organisation website with information on DAISI initiatives and research
First major programming project — a calculator application for learning fundamentals
d/acc Hackathon @ Apart Research
Research project investigating watermarking LLM code output
Hackathon project converting sign language to text using computer vision and machine learning (DurHack 2023)
Deep learning pipeline enhancing human character visuals in game videos using movie reference footage. Integrates YOLO detection, GCN pose classification, and patch-level CUT.
Computer vision and image processing techniques
Comparison of K-Nearest Neighbours and Logistic Regression classifiers on various datasets
Implementation and comparison of metaheuristic algorithms (genetic algorithm and ant colony optimisation) for the Travelling Salesman Problem
Exploration of bio-inspired and nature-based computational algorithms including cuckoo search and negative selection
ResponsibleFM @ NeurIPS 2025
Implementation of research on relative-magnitude relational composition in attention-only transformers
Interactive browser simulators for bio-inspired computing models, featuring active membranes (P-systems) and connectivity-preserving robot swarm shape transformations.
NLP project implementing transformer-based models for detecting argumentative stances in text