Index

April 19, 2024
in LLMs
5 min read

Writing scripts that scale

Writing good scripts for machine learning is an art. I struggled with writing them for a long time because of how different it was to my experience working with full-stack frameworks such as React or FastAPI.

There were four main issues that I struggled with:

My job has a high probability of failing without any reason
My data might not fit into memory for no reason
Running a single job takes days or more
Optimizing hyper-parameters is genuinely difficult

April 17, 2024
in Python, Advice
8 min read

Everything I've learnt about writing good Python code

In the past 6 months, I've 10xed the amount of Python code I've written. In this article, I'll show you a few easy, actionable tips to write better and more maintainable code. I've been lucky enough to have Jason (@jxnlco on Twitter) review a good chunk of my code, and I've found that these few things have made a massive difference in my code quality.

using the @classmethod decorator
learn the stdlib
write simpler functions
being a bit lazier - earn the abstraction
decouple your implementation
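
To make the first tip concrete, here's a minimal sketch of one common use of @classmethod: an alternate constructor that keeps parsing logic next to the class it builds. The TrainingConfig class, its fields and the from_dict name are hypothetical and only there for illustration; they aren't taken from the article itself.

```python
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    """Hypothetical config object, used only for illustration."""

    learning_rate: float
    batch_size: int

    @classmethod
    def from_dict(cls, raw: dict) -> "TrainingConfig":
        # Alternate constructor: callers pass raw (string) values and the
        # class itself decides how to convert them into typed fields.
        return cls(
            learning_rate=float(raw["learning_rate"]),
            batch_size=int(raw["batch_size"]),
        )


# Build a config from untyped input, e.g. values read from a CLI or a JSON file.
config = TrainingConfig.from_dict({"learning_rate": "0.001", "batch_size": "32"})
print(config)
```

Because from_dict receives cls rather than a hard-coded class name, a subclass that calls it gets an instance of the subclass back, which is the main reason to reach for @classmethod over a plain helper function here.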

January 20, 2024
in Advice
6 min read

Learning with Adult Responsibilities

Introduction

Over the past 6 months, I've been trying to learn more about AI and LLMs. ChatGPT had me hooked when I tried it for the first time. Over the course of this period, I've been chatting to more people, shitposting on Twitter and working to learn as much as I can in my spare time.

That amounts to roughly 10-20 hours a week, since I don't have much of a social life, or about 400-500 hours in total since I started exploring this space, so take my experience with a grain of salt. I'm relatively new and you're probably 2-3 months behind me at most, much less if you do it full time.

I've had some people reach out to me for advice on what to do and I figured I'd write a longer blog post so that I could refer to it myself and consolidate some of my ramblings.

December 20, 2023
in RAG, UI Generation
14 min read

GPT-React

Introduction

The full code for this is available here for reference.

A while ago, I saw a demo video of Vercel's V0 and was blown away by what it could produce. It could take in user prompts, feedback and iteratively generate new and improved UI code using the popular @shadcn/ui library.

This was soon followed by the open-v0 project by raidendotai. Since I didn't have access to v0 via Vercel, I figured I would clone the project and try to figure out how it worked.

One eventful Friday evening later, I ended up putting together a small prototype which uses context-aware RAG and pydantic to generate valid NextJS code based on a user prompt, which you can see below.

The GIF renders pretty slowly for some reason, so if you want to see the original clip, you can check it out here.

September 28, 2023
in LLMs, RWKV
11 min read

A guide to RWKV V3

Introduction

RWKV is an alternative to the transformer architecture. It's open source and has its own paper over here. I found out about it some time back in a paper club and thought I'd write a short article about it with what I had learnt.

Here are some other resources which you might find useful about RWKVs

RWKV by Picocreator: This is a markdown file that was used by one of the contributors, Picocreator, to give a short presentation on the RWKV architecture.
RWKV in 100 lines: This covers the implementation of RWKV in 100 lines of code. Much of this article is based on the content there; I try to extend it and provide my own intuition for some proofs. I've also attached a colab notebook for you if you want to play with the code.

September 25, 2023
in LLMs, Evals
20 min read

Reinventing Gandalf

Introduction

The code for the challenge can be found here

A while ago, a company called Lakera released a challenge called Gandalf on Hacker News which took the LLM community by storm. The premise was simple: get an LLM that they had built to reveal a password. This wasn't an easy task and many people spent days trying to crack it.

Some time after their challenge had been released, they were kind enough to release both the solution AND a rough overview of how the challenge was developed. You can check it out here. Inspired by this, I figured I'd try to reproduce it to some degree on my own in a challenge I called The Chinese Wall, built with Peter Mekhaeil for our company's annual coding competition. We will be releasing the code shortly.

Participants were asked to try and extract a password from an LLM that we provided. We also provided a Discord bot trained on the challenge documentation, which participants could use to ask questions.

Here's a quick snapshot of it in action

The model uses OpenAI's GPT-3.5 under the hood with the instructor library for function calls.

July 1, 2023
in LLMs, Instructor
10 min read

Introduction

As usual, you can find the code for this specific article here

If you've ever used Google Maps, you've definitely struggled to decide where to go to eat. The UI ... frankly sucks beyond belief for an application that has all the data and compute that it has.

May 1, 2023
in LLMs, Whisper
4 min read

Whispers In The Background

Introduction

I recently ran into two problems when trying to generate transcripts for audio files using Whisper when working with NextJS:

They were taking too long and the request would time out
The files were too large and I couldn't send them through the request body of an API route