Who am I

I'm a Research Engineer at 567 Labs working on synthetic data generation and evaluations for Large Language Models. I maintain open-source libraries like Instructor and indomee.

Follow my newsletter for the latest updates on blog articles, the resources I liked most, and other random thoughts on taming your LLMs.

Latest Articles

Here are some articles I've written recently that might be of interest:

  1. Write Stupid Evals: Start simple with evals and build up complexity gradually. The best evaluation isn't the most sophisticated one - it's the one you'll actually use consistently.

  2. Are your eval improvements just pure chance?: A guide to statistical analysis for LLM evals, using bootstrapping and t-tests to check whether improvements are statistically significant or just random noise.

  3. Synthetic Data is not a Free Lunch: Hard-earned lessons from generating millions of synthetic data points and why validation matters more than volume. Success requires careful thought and systematic validation.

  4. You're probably not doing experiments right: Three key factors that make the biggest difference in LLM experiments: being clear about what you're varying, investing in infrastructure, and doing sensitivity analysis.

  5. Simplify your LLM Evals: A practical guide to writing binary evals for subjective tasks.