Mental Models

This mental model helps me think about how to interact effectively with LLMs. It is not an explanation of how an LLM actually works.

I think of an LLM as a very high-IQ, very low-EQ technical hire on their first day. It’s always their first day. Worse, they can remember only what they knew before joining the company, plus the last 15 minutes.

  1. As with people management, assume they would do the right thing with appropriate context.
  2. Give them that context: everything useful that they can digest in a short period of time. This is context engineering.
  3. This mental model does not change with AGENTS.md / rules, tools, agentic workflows, etc. Those things are just ways to mitigate the limited size of working memory / the need to relearn.
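
As a rough illustration of point 2, here is a minimal sketch of fitting the most useful context into a limited budget. Everything in it is an assumption made for illustration: the `pack_context` name, the numeric priorities, and the character budget, which is only a crude stand-in for a model's real token limit.

```python
def pack_context(snippets, budget_chars):
    """Pick the most useful snippets that fit into a fixed budget.

    snippets: list of (priority, text) pairs; a higher priority means
    more useful. Both the priorities and the character budget are
    hypothetical stand-ins, not how any real model measures context.
    """
    chosen = []
    used = 0
    # Consider the most useful snippets first.
    for priority, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        # Skip anything that would overflow the budget; smaller,
        # lower-priority snippets may still fit afterwards.
        if used + len(text) <= budget_chars:
            chosen.append(text)
            used += len(text)
    # Separators are not counted against the budget in this sketch.
    return "\n\n".join(chosen)


context = pack_context(
    [
        (3, "Coding conventions for this repo"),
        (1, "Full history of the project since 2015"),
        (2, "The failing test output"),
    ],
    budget_chars=60,
)
```

With a budget of 60 characters, the conventions and the test output fit and the project history is left out. A real setup would count tokens and would probably rank snippets with more care than a hand-assigned number.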

As a practical checklist:

  1. What are all the things the model needs to succeed?
  2. Which of these are likely to be baked into the core model training?
  3. Which of these are easily accessible on the internet?
  4. Which of these can I provide easily in a prompt?
  5. Which of these can I give access to through other tools / MCP?
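
The checklist can be read as a triage pass over everything the model needs. The sketch below is one hypothetical way to write that down. The `Need` fields, the `Source` labels, and the sample needs are all made up for illustration, and the yes/no answers still have to come from your own judgment.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    TRAINING = "likely baked into core model training"
    INTERNET = "easily accessible on the internet"
    PROMPT = "easy to provide in a prompt"
    TOOL = "reachable through other tools / MCP"
    GAP = "not covered yet"


@dataclass
class Need:
    """One thing the model needs to succeed (checklist question 1)."""
    description: str
    in_training: bool = False
    on_internet: bool = False
    fits_in_prompt: bool = False
    via_tool: bool = False


def triage(need: Need) -> Source:
    # Walk checklist questions 2 to 5 in order; the first "yes" wins.
    if need.in_training:
        return Source.TRAINING
    if need.on_internet:
        return Source.INTERNET
    if need.fits_in_prompt:
        return Source.PROMPT
    if need.via_tool:
        return Source.TOOL
    return Source.GAP


def plan(needs):
    # Group need descriptions by where the model can get them.
    grouped = {source: [] for source in Source}
    for need in needs:
        grouped[triage(need)].append(need.description)
    return grouped


needs = [
    Need("General Python syntax", in_training=True),
    Need("Our deployment runbook", fits_in_prompt=True),
    Need("Current ticket status", via_tool=True),
    Need("Unwritten team conventions"),
]
result = plan(needs)
```

Anything that lands under `Source.GAP` is context the model has no way to obtain yet, so that is where to spend effort first.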