Notes on ForecastBench

ForecastBench continuously evaluates LLMs on a set of forecasting questions that is automatically generated and kept up to date.

August 15, 2025 · 2 min

Lessons on Context Engineering

Practical guidelines on context engineering, such as keeping the context append-only, using response prefill to remove or force tools, and setting up restorable compression strategies.

August 9, 2025 · 4 min

Sandbox MCP: Enable LLMs to Run ANY Code Safely

A Model Context Protocol (MCP) server that lets LLMs run code safely in isolated Docker containers.

April 25, 2025 · 6 min

AI and APIs

What do the recent advancements in generative AI mean for APIs?

May 21, 2023 · 4 min

ChatGPT Explained by ChatGPT

A conversation with ChatGPT about ChatGPT. Who are you?

December 21, 2022 · 9 min