README is More Useful Than You Think

Written by Lyoneel in artificial-intelligence

Exploring the impact of READMEs on AI agents and their behavior.

Introduction

I’ve come up with what seems like a dumb and obvious idea, but it’s worth exploring:

  • What if I wrote contextual READMEs?
  • If I don’t tell AI agents anything, what will they do: read the README or just ignore it?
  • Is there a way to debug this or at least measure the tendency?
  • Will different agents behave consistently?
  • Do I need to instruct agents to read the README?

Testing Setup

Tools Used

  • Windsurf IDE v1.9544.1001+next.f2e3d6b002
  • Extension version: 1.48.2
  • OS: Linux

Additional Details

  • No files were open in the IDE when the prompts were executed.
  • The exact same prompt was run ten times.
  • Same project.
  • In a newly created directory.
  • No git history or .git folder.
  • No user or project configuration (Windsurf).
  • No Windsurf memories or any other customizations (workflows, skills, rules, etc.).

Experimental Rationale

I assume that READMEs are standard in projects and that AI agents are trained to look for them. The tests provide observations rather than definitive answers, and they focus on trends in AI agent behavior.

I considered a sample size of 10 attempts per test enough to identify trends: when a small test shows 100% consistency, that already indicates a strong pattern without needing a larger run.
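As a rough check on how much ten runs can tell us, the sketch below computes a one-sided lower bound on the true success rate when every run succeeds. It relies on the standard binomial argument (if the true rate is p, the chance of n straight successes is p^n). The function name and the 95% confidence level are my own illustrative choices, not part of the original experiment.

```python
def lower_bound_all_successes(n: int, confidence: float = 0.95) -> float:
    """Lower bound on the true success rate p after n successes in n runs.

    If the true rate were p, the chance of n successes in a row is p**n.
    The bound is the smallest p for which that chance is still at least
    1 - confidence, i.e. p = (1 - confidence) ** (1 / n).
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    # Compare how the bound tightens as the number of clean runs grows.
    for runs in (10, 30, 100):
        bound = lower_bound_all_successes(runs)
        print(f"{runs} clean runs: true rate is at least {bound:.3f}")
```

With ten clean runs the bound comes out at about 0.74, which fits the framing here: ten runs are enough to show a strong tendency, not to prove that a behavior always happens.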

These experiments highlight the untapped potential of READMEs in enhancing AI-assisted development. READMEs serve as sources of contextual information, providing fine-grained details on files, content, structures, and subdirectories. Root READMEs can aggregate data from sub-READMEs with high fidelity.

The Haiku model was chosen over more advanced ones like Sonnet or Opus to test baseline performance; success here suggests stronger models would perform better.

Findings

Lookup Logic

The idea was to ask the model how it would analyze a folder, and to have it list the logical steps it would take to gather information from that folder.

Prompt used:

if I ask you to analyze this folder:

<REDACTED>/subfolder1

tell me your logical steps to gather information from this folder,
make me a list of bullets to explain your thinking step by step,
respond in chat, do not modify any files.

Results

A response is marked “E” if it is explicit about reading the README, and “I” if reading it is only implied. Interpretation: higher counts of “E” (explicit) and “I” (implicit) indicate better awareness and use of README files by the AI models.
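The labeling can be approximated with a small script. This is only a sketch: the original responses were labeled by hand and the exact criteria are not published, so the keyword patterns, the extra “N” (neither) label, and the sample responses below are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical heuristics, not the criteria used in the original test.
# "E": the response names the README directly.
EXPLICIT_PATTERN = re.compile(r"readme", re.IGNORECASE)
# "I": the response only hints at it (documentation, markdown files, ...).
IMPLICIT_PATTERN = re.compile(
    r"\b(?:documentation|docs?|markdown)\b|\.md\b", re.IGNORECASE
)


def label_response(text: str) -> str:
    """Return "E" (explicit), "I" (implicit) or "N" (neither)."""
    if EXPLICIT_PATTERN.search(text):
        return "E"
    if IMPLICIT_PATTERN.search(text):
        return "I"
    return "N"


def tally(responses: list[str]) -> Counter:
    """Count how many responses fall under each label."""
    return Counter(label_response(response) for response in responses)


if __name__ == "__main__":
    # Made-up responses, for illustration only.
    samples = [
        "First I would read README.md to understand the folder.",
        "I would open any docs or markdown notes I find.",
        "I would list the files and inspect the source code.",
    ]
    print(dict(tally(samples)))
```

A keyword match is cruder than reading each answer, so it is best treated as a first pass before a manual review.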

Specific Data from README

The idea was: ask the model to find specific data from a README file, without stating that it should look for a README file.

The test consists of a file called config.yaml in which a key called confidence is defined. The README in the subfolder (not in the root) explains what that value means in config.yaml.
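A fixture like the one described could be recreated along these lines. This is a minimal sketch under several assumptions: the folder name, the value of confidence, and the README wording are invented for illustration, and I assume config.yaml sits next to the README in the subfolder, which the description does not state.

```python
import tempfile
from pathlib import Path

# Hypothetical README text; the original file content is not published.
README_TEXT = (
    "# Subfolder notes\n\n"
    "`confidence` in config.yaml is the minimum score (0 to 1) a result\n"
    "needs before it is accepted.\n"
)


def build_fixture(root: Path, subfolder_name: str = "subfolder1") -> Path:
    """Create config.yaml plus a README in a subfolder; root gets no README."""
    subfolder = root / subfolder_name
    subfolder.mkdir(parents=True, exist_ok=True)
    # Assumed value; only the key name comes from the article.
    (subfolder / "config.yaml").write_text("confidence: 0.8\n", encoding="utf-8")
    (subfolder / "README.md").write_text(README_TEXT, encoding="utf-8")
    return subfolder


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = build_fixture(Path(tmp))
        for path in sorted(created.iterdir()):
            print(path.name)
```

Keeping the README out of the root matters for this test: the agent has to explore the subfolder to find the definition instead of stumbling on it at the top level.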

This test, in my opinion, shows how well a model is designed to be an IDE assistant and how well it can understand the context before trying to answer from existing knowledge. I chose the word “confidence” because it is used in many contexts, so the model is probably tempted to answer from its own knowledge.

Could you tell me what confidence is?

And then explain me the logic you use to get the answer,
with brief bullets, step by step.

Interpretation: higher “T” (right) counts show a better ability to extract specific information from READMEs without direct instruction, and an awareness that the model is being invoked from an IDE.

The wrong answers (shown in red) were mostly philosophical answers or dictionary definitions.

README.md is Good, but wait…!

Don’t put a README everywhere; that creates a lot of noise. I suggest using one only when it adds value.

Do:

  • complex or non-self-explanatory configurations, datasets, or variants: when it is not clear what the values mean, or when the format (JSON, for example) does not allow comments, a README is a good place to explain them.

  • examples/ or recipes/ folders: each concrete example becomes self-contained and discoverable, which dramatically increases the value of the examples directory for newcomers and contributors.

  • plugin/extension/adapter folders: the README documents integration points, configuration schema, compatibility matrix, known limitations, and example usage within the host system; users often arrive here directly from documentation links or searches.

  • assets folders: only if the assets are complex or have specific licensing requirements.

  • feature/domain module folders (src/features/, src/domains/, src/modules/): common in medium/large codebases; the README gives a quick overview of responsibilities, main entities, important files, cross-cutting concerns, or invariants, and reduces the time spent understanding a module when jumping into the codebase.

Don’t:

  • tests/spec folders at package level: tests are normally self-explanatory.

  • very small/utility single-file folders: any explanation belongs in the parent folder’s README to avoid fragmentation.
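The do/don’t guidance above can be turned into a quick audit script. This is a sketch, not a rule: the two sets of folder names are my own hypothetical shortlist based on the lists above, and matching on folder names alone will miss cases such as complex configurations or large assets.

```python
import tempfile
from pathlib import Path

# Hypothetical shortlists derived from the do/don't lists above.
README_WORTHY = {
    "examples", "recipes", "plugins", "extensions",
    "adapters", "features", "domains", "modules",
}
README_NOISY = {"test", "tests", "spec", "specs"}


def audit_readmes(root: Path) -> tuple[list[Path], list[Path]]:
    """Return (folders that may deserve a README, folders where one may be noise)."""
    missing: list[Path] = []
    noisy: list[Path] = []
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        has_readme = any(
            child.is_file() and child.name.lower().startswith("readme")
            for child in directory.iterdir()
        )
        name = directory.name.lower()
        if name in README_WORTHY and not has_readme:
            missing.append(directory)
        elif name in README_NOISY and has_readme:
            noisy.append(directory)
    return missing, noisy


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "examples").mkdir()
        (root / "tests").mkdir()
        (root / "tests" / "README.md").write_text("how to run\n", encoding="utf-8")
        missing, noisy = audit_readmes(root)
        print("consider adding:", [p.name for p in missing])
        print("consider removing:", [p.name for p in noisy])
```

The output is a list of candidates to review by hand; whether a README adds value in a given folder remains a judgment call.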

Response prompts are lost - I’m dumb

I did one of the dumbest things of my life: I deleted all the stored prompts I had, in the stupidest way possible. I was preparing a GitHub project for them. I ran git clone but never entered the key password, so it sat there waiting. I copied the files into that folder, and then I canceled the clone at the password prompt (don’t ask why), halfway through. Git removed the folder with all its contents, including the prompts I had left there. TestDisk is not an option because it is a btrfs filesystem on a remote computer. Hours wasted. Lesson learned.


Thanks for reading!

Namaste.