The Best-Performing Public-Facing LLM Is Free Now

My Deep Dive into Claude 3.5 Sonnet

Dhananjay Trivedi
2 min read · Jun 26, 2024

A few days ago, I embarked on a thrilling experiment. As a software engineer and AI enthusiast, I was eager to test the latest AI model, Claude 3.5 Sonnet, released by Anthropic. I’d heard whispers of its capabilities, how it supposedly outperformed even its larger sibling, Claude 3 Opus. Could it really be true? I had to find out.

The Power of Claude 3.5 Sonnet

What I discovered was nothing short of astonishing. Claude 3.5 Sonnet, despite its smaller size, consistently outperformed not only Claude 3 Opus but also other models on the market, including GPT-4o and Llama 400B. It aced various benchmarks, from MMLU to zero-shot chain of thought, stumbling only slightly on the math benchmark.

Putting the Model to the Test

I started with a simple task: writing a Python script to output numbers 1 to 100. Claude 3.5 Sonnet passed with flying colors, producing the correct script effortlessly. Next, I threw it a curveball: write the game Snake in Python. Again, the model delivered, producing a working game that even had quirky features like wall-phasing and an on-screen score. I was genuinely impressed by its ability to grasp and execute complex instructions.
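
To make the two tasks concrete, here is a minimal sketch of what they involve. It is not the code Claude 3.5 Sonnet produced, only an illustration. The function names are hypothetical, and I am assuming "wall-phasing" means the snake leaves one edge of the grid and re-enters from the opposite edge.

```python
def numbers_1_to_100():
    """Return the integers 1 through 100, inclusive."""
    return list(range(1, 101))


def step_with_wall_phasing(head, direction, width, height):
    """Move the snake's head one cell on a width x height grid.

    Assumption: "wall-phasing" is wrap-around movement, so the modulo
    sends a head that crosses one edge back in from the opposite edge.
    """
    x, y = head
    dx, dy = direction
    return ((x + dx) % width, (y + dy) % height)


if __name__ == "__main__":
    # Task 1: output the numbers 1 to 100, one per line.
    for n in numbers_1_to_100():
        print(n)

    # Task 2 (one piece of it): head at the right edge of a 20x15 grid,
    # moving right, reappears in column 0 of the same row.
    print(step_with_wall_phasing((19, 7), (1, 0), 20, 15))
```

A full Snake game also needs input handling, food placement, collision checks, and score rendering, which is why the second prompt is a much better test than the first.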
