Joel Niklaus is a Machine Learning Engineer at Hugging Face working on synthetic pretraining data. He also serves as an advisor and angel investor to various AI companies. Previously, Joel was a Research Scientist at Harvey, specializing in large language model systems for legal applications. Before that, he was an AI Resident at (Google) X, where he trained multi-billion parameter models on hundreds of TPUs and achieved state-of-the-art results on the LegalBench evaluation dataset. He also conducted research at Thomson Reuters Labs on efficient domain-specific pretraining approaches.

Joel conducted research on LLMs at Stanford University under the supervision of Prof. Dan Ho and Prof. Percy Liang, and has led research projects for the Swiss Federal Supreme Court. With extensive experience in pretraining and fine-tuning LLMs across various compute environments, his research focuses on dataset curation for multilingual legal language models. His datasets have established the foundation for legal NLP in Switzerland. Joel has contributed to open-source projects including lighteval and Marin. His research has been published at leading NLP and machine learning conferences, covered by Anthropic and Swiss National Radio & Television, and honored with an Outstanding Paper Award at ACL. He holds a PhD in Natural Language Processing, a Master’s in Data Science, and a Bachelor’s in Computer Science, all from the University of Bern.

He has lectured at the University of Bern and the Bern University of Applied Sciences, delivering continuing education courses in natural language processing. Previously, he taught computer science at several Swiss high schools. He has also delivered corporate courses and talks.


I am always happy to advise motivated young researchers; don't hesitate to reach out!