The Best-Performing Public-Facing LLM Is Now Free
My Deep Dive into Claude 3.5 Sonnet
A few days ago, I embarked on a thrilling experiment. As a software engineer and AI enthusiast, I was eager to test the latest AI model, Claude 3.5 Sonnet, released by Anthropic. I'd heard whispers of its capabilities and how it supposedly outperformed even its larger sibling, Claude 3 Opus. Could it really be true? I had to find out.
The Power of Claude 3.5 Sonnet
What I discovered was nothing short of astonishing. Claude 3.5 Sonnet, despite its smaller size, consistently outperformed not only Claude 3 Opus but also other leading models, including GPT-4o and Llama 3 400B. It aced a range of benchmarks, from MMLU to zero-shot chain-of-thought evaluations, stumbling only slightly on the math benchmark.
Putting the Model to the Test
I started with a simple task: writing a Python script to print the numbers 1 to 100. Claude 3.5 Sonnet passed with flying colors, producing a correct script effortlessly. Next, I threw it a curveball: write the game Snake in Python. Again, the model delivered, producing a working game that even had quirky features like wall-phasing and an on-screen score (a comparable sketch is shown below). I was genuinely impressed by its ability to grasp and execute complex instructions.
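To give a sense of what that second test involves, here is a minimal sketch of the kind of Snake game the model produced, assuming pygame is installed (`pip install pygame`). This is not the exact code Claude 3.5 Sonnet generated, just an illustrative example with the same two quirks I mentioned: wall-phasing (the snake wraps around the edges) and an on-screen score.

```python
# A minimal Snake sketch, assuming pygame; illustrative only, not the
# model's verbatim output.
import random
import pygame

CELL, GRID_W, GRID_H = 20, 30, 20
WIDTH, HEIGHT = CELL * GRID_W, CELL * GRID_H

pygame.init()
screen = pygame.display.set_mode((WIDTH, HEIGHT))
pygame.display.set_caption("Snake")
clock = pygame.time.Clock()
font = pygame.font.SysFont(None, 28)

snake = [(GRID_W // 2, GRID_H // 2)]
direction = (1, 0)
food = (random.randrange(GRID_W), random.randrange(GRID_H))
score = 0
running = True

while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.KEYDOWN:
            keys = {pygame.K_UP: (0, -1), pygame.K_DOWN: (0, 1),
                    pygame.K_LEFT: (-1, 0), pygame.K_RIGHT: (1, 0)}
            if event.key in keys:
                direction = keys[event.key]

    # Wall-phasing: the head wraps around the screen edges instead of dying.
    head = ((snake[0][0] + direction[0]) % GRID_W,
            (snake[0][1] + direction[1]) % GRID_H)
    if head in snake:          # colliding with yourself ends the game
        running = False
    snake.insert(0, head)

    if head == food:           # eat the food, grow, and bump the score
        score += 1
        food = (random.randrange(GRID_W), random.randrange(GRID_H))
    else:
        snake.pop()            # no food eaten, so the tail follows the head

    screen.fill((0, 0, 0))
    for x, y in snake:
        pygame.draw.rect(screen, (0, 200, 0), (x * CELL, y * CELL, CELL, CELL))
    pygame.draw.rect(screen, (200, 0, 0),
                     (food[0] * CELL, food[1] * CELL, CELL, CELL))
    # On-screen score, the other feature the generated game included.
    screen.blit(font.render(f"Score: {score}", True, (255, 255, 255)), (10, 10))
    pygame.display.flip()
    clock.tick(10)

pygame.quit()
```

Even a toy version like this touches event handling, game state, collision logic, and rendering, which is why it makes a handy smoke test for code generation.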