Episode 517

517: Plan First, Think Less: Save Tokens, Improve Code

About this Episode

Episode 517 starts with a light chat about AI avatars and new text‑to‑speech deepfakes before diving into LLM “thinking” modes: what baked‑in planning actually does, why it multiplies token costs, and when it helps or hurts. James and Frank give concrete dev advice: try low‑thinking settings, use big models for creative planning and then smaller ones to execute, leverage harnesses and system prompts, and be aware that quantized local models often do better without thinking.

Machine transcription available on http://mergeconflict.fm
