Three Months of Developing with an AI Coding Agent

by Marc

I have been using an AI coding agent as my primary development tool for three months. Not a chatbot I paste code into, but an agent with filesystem access, shell execution, and persistent memory that works inside my terminal. It reads my codebase, runs my tests, pushes my commits, deploys my services, and remembers what it learned yesterday. This post is about the technical patterns I have developed to make that work well — the context files, the memory system, the tool integrations, and the compounding feedback loop that makes it better over time.

CLAUDE.md: The Context Backbone

The single most important file in my setup is CLAUDE.md, a markdown file in my home directory that the agent reads at the start of every session. It is the agent’s operating manual: it describes the projects, the tools, the conventions, the gotchas, and the preferences. Without it, the agent starts every session fresh with zero context. With it, the agent knows my codebase, my workflows, and my quirks from the first message.

Here is what goes in it, with real examples from my setup:

Tool Documentation

Every custom CLI tool gets a section with usage examples. The agent can then use these tools directly via its shell access, without me having to explain how they work each time.

## BigQuery CLI

```bash
bigquery.sh query "SQL" [project] [--location=EU]
# project shortcuts: raw, dev, prod (default), data
bigquery.sh schema dataset.table [project]
bigquery.sh tables dataset [project]
bigquery.sh search "pattern" [project]
```

## Jira Tickets

```bash
jira-ticket.sh create "Title" "Description" [--assign USER] [--type TYPE]
jira-ticket.sh TICKET-123 comment "Your comment text"
jira-ticket.sh TICKET-123 status inprogress
jira-ticket.sh board [filter]  # todo, inprogress, marc, orphans
```

## dbt Cloud

```bash
dbt-cloud.sh trigger JOB_ID    # trigger a production job
dbt-cloud.sh status RUN_ID     # check run status
dbt-cloud.sh jobs              # list all jobs
```

## CRM CLI

```bash
crm-tool.sh deal 5329           # get deal details
crm-tool.sh search "Company"    # search across records
crm-tool.sh fields deal         # show custom field definitions
```

When I say “create a Jira ticket for this bug and assign it to Marcus”, the agent already knows the command syntax, the user aliases, and the default conventions (assign to me unless told otherwise, use the Bug issue type, set status to TODO).

Project Configuration

Database connection details, schema mappings, deployment targets, environment variables — everything the agent needs to work with the codebase without asking me clarifying questions.

## Data Warehouse

| dbt folder         | Production schema         |
|--------------------|---------------------------|
| 07_normalized      | dwh                       |
| 08_analytics       | dwh_analytics_bizlogic    |
| 10_analytics       | dwh_analytics             |
| 14_reporting       | dwh_reporting             |

**CRITICAL - DO NOT GUESS COLUMN NAMES:**
1. Before writing ANY query, ALWAYS look up column names in the schema files
2. Use `grep "table_name," schema_file.csv` to find columns
3. If a column doesn't exist in the schema file, READ the actual source code
4. Never try variations of column names hoping one works

## Deprecated Fields

| Field                      | Reason                                      |
|----------------------------|---------------------------------------------|
| bridge_status_calculated   | Use bridge_status_transactional instead      |
| dr_accepted_bid_* fields   | Use bid_accepted_bid_* from int_bids instead |
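The column lookup rule in the excerpt above can be sketched in Python. This is a minimal sketch, not the actual tooling: the source only shows the `grep "table_name,"` pattern, so the `table_name,column_name` file layout, the sample rows, and the helper names `columns_for` and `missing_columns` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical schema export: one "table_name,column_name" row per column.
# The real schema file layout is not shown in the post.
SAMPLE_SCHEMA = """\
table_name,column_name
orders,order_id
orders,created_at
customers,customer_id
"""


def columns_for(table: str, schema_csv: str) -> list:
    """Return the documented column names for a table, in file order."""
    reader = csv.DictReader(io.StringIO(schema_csv))
    return [row["column_name"] for row in reader if row["table_name"] == table]


def missing_columns(table: str, wanted: list, schema_csv: str) -> list:
    """Return the requested columns that the schema file does not list.

    A non-empty result means: stop and read the source, do not guess.
    """
    known = set(columns_for(table, schema_csv))
    return [name for name in wanted if name not in known]
```

Checking a query's columns with `missing_columns` before running it mirrors rule 4: an unknown name is reported once instead of being retried in variations.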

User Preferences

## User Preferences

- Do NOT ask for confirmation -- just act. Only pause for truly destructive operations.
- Do NOT open links in browser unless explicitly asked.
- Default branch: master (not main).
- Always rebase on master before pushing/creating PRs.
- Always audit generated code for CSRF, rate limiting, input validation.
- Never merge PRs without explicit user approval.
- After fixing code review comments, always reply to the comment explaining the fix.

The Memory System

The agent’s context resets between sessions. It does not remember that yesterday it discovered a bug in the CI pipeline, or that last week it learned the BI tool’s AQL syntax requires a specific escaping pattern. So I built a file-based memory system.

A MEMORY.md file acts as the index. It links to topic-specific memory files — one for each major area of the work. At the start of every session, the agent reads all of them.

## Topic Files Index

| File                  | Contents                                         |
|-----------------------|--------------------------------------------------|
| bi-tool.md            | BI tool learnings, query syntax, filter gotchas   |
| slack-bot.md          | Bot: deploy, test, endpoints, secrets             |
| crm-sync.md           | Custom CRM sync: endpoints, tables, comparison    |
| tools-and-infra.md    | CLI tools, cloud schedulers, SA consolidation     |
| trading-bot.md        | Paper trading: strategies, Pi deploy, yfinance    |
| git-and-repos.md      | Branch naming, protected branches, code reviews   |

## Critical Rules (always loaded)

- Only one dbt Cloud run at a time. Cancel running job only if new covers same scope.
- No business logic after the PBS layer. Reporting only reshapes data.
- Never use {%- or -%} in dbt models -- causes comment/select merge errors.
- dbt Project Evaluator PK Tests required for layers 07, 10, 11, 13, 14.

The topic files are living documents. When the agent discovers something new — a workaround, a gotcha, a new endpoint — it appends to the relevant topic file automatically. The instruction to do this is in CLAUDE.md itself: “Whenever you solve a non-trivial problem or discover something useful, immediately append an entry to the log file without asking.” This creates a compounding knowledge base that grows with every session.
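The append step described above could look roughly like the following. It is a sketch under assumptions: the post does not show the entry format or the directory layout, so the dated-bullet format, the `memory_dir` location, and the function name `append_learning` are hypothetical.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_learning(memory_dir: Path, topic_file: str, note: str,
                    today: Optional[date] = None) -> Path:
    """Append a dated bullet to a topic file, creating the file if needed."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / topic_file
    stamp = (today or date.today()).isoformat()
    # Append mode never rewrites earlier entries, so the file only grows.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"- {stamp}: {note.strip()}\n")
    return path
```

Appending rather than rewriting keeps earlier entries untouched, which is what lets the knowledge base accumulate from session to session.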

Feedback That Persists

The memory system also stores feedback. When I correct the agent — “don’t amend commits, create new ones” or “always check smoke tests after deploying” or “never merge PRs without my approval” — those corrections are saved as individual feedback files. They load automatically in every future session. The agent never makes the same mistake twice, even months later.

# Example feedback files (one per lesson):

# feedback_deploy_commit.md
Always commit all changes before running deploy.sh.

# feedback_security_first.md
Never ship web apps without CSRF, secure cookies, rate limiting, input validation.

# feedback_dont_auto_merge.md
Never merge PRs without explicit user approval.

# feedback_cr_comments.md
Always reply to code review comments when fixing them -- don't just push silently.

# feedback_dbt_run_layers.md
Run the normalized (07) job before bizlogic (08) when adding columns that flow downstream.
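How the feedback files end up in every session is handled by the agent itself, but the idea can be illustrated with a small loader. This is a sketch only: the `feedback_*.md` naming comes from the examples above, while the directory argument, the output format, and the name `load_feedback` are assumptions.

```python
from pathlib import Path


def load_feedback(memory_dir: Path) -> str:
    """Concatenate every feedback_*.md file into one context block."""
    sections = []
    # Sorted order keeps the assembled context stable between sessions.
    for path in sorted(memory_dir.glob("feedback_*.md")):
        body = path.read_text(encoding="utf-8").strip()
        if body:
            sections.append(f"# {path.name}\n{body}")
    return "\n\n".join(sections)
```

Keeping one lesson per file means a correction can be added, reworded, or retired without touching any other rule.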

MCP: Connecting External Services

The Model Context Protocol (MCP) lets the agent connect to external services through a standardised interface. I use it to connect to our BI tool’s semantic layer. The agent can list datasets, explore schemas, generate queries, and execute them — all through tool calls that the MCP server handles, including authentication.

# .mcp.json — MCP server configuration
{
  "mcpServers": {
    "analytics": {
      "url": "https://mcp.analytics-tool.io/mcp",
      "headers": {
        "X-API-Key": "..."
      }
    }
  }
}

# The agent can then use tools like:
# - list_datasets: see available datasets
# - fetch_dataset("main_dataset"): get schema details
# - execute_query: run analytics queries
# - search_dashboards: find dashboards by keyword
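For readers curious what a tool call looks like underneath, MCP messages are JSON-RPC 2.0 requests, and a tool is invoked with the `tools/call` method. The sketch below only builds the request body; it does not cover the transport, session setup, or authentication. The tool name `fetch_dataset` comes from the list above, but the argument key `dataset` is an assumption, since the post does not show the tool's input schema.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(payload)


# Hypothetical argument name; the real tool's input schema is not shown.
example = tool_call_request(1, "fetch_dataset", {"dataset": "main_dataset"})
```

In practice the agent's MCP client builds and sends these messages, so nothing like this needs to be written by hand.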

Real Example: The WordPress Logo Fix

A family member’s blog had broken AMP logos. I asked the agent to fix it. Here is what it did, without any hand-holding:

  1. SSH’d to the web server (it knew the IP and SSH config from the memory file)
  2. Found the theme’s PHP files that controlled logo output
  3. Discovered the issue: the theme had a hardcoded capability check that stripped <img> tags in certain contexts
  4. Edited the PHP file to fix the capability check
  5. Cleared the PHP opcache (knew from prior sessions that opcache would serve the old version)
  6. Cleared the page cache (W3 Total Cache)
  7. Verified the fix by fetching the page and checking the HTML output

Total time: about 3 minutes. Without the agent’s accumulated context (server address, SSH config, cache layer knowledge), this would have been a 30-minute debugging session with multiple trips to documentation.

Real Example: Bulk Jira Processing

For a project that required updating 50+ Jira tickets with structured comments, I pointed the agent at the board filter and said “scan all in-progress tickets, generate a summary comment for each based on the PR status and test results.” It listed the tickets with the Jira CLI tool, checked the PR status of each with the GitHub CLI, ran the test suites, and posted structured comments, all in a single session. 50 tickets processed, 50 comments posted, zero manual intervention.
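The loop the agent ran can be approximated as below. This is a sketch, not a record of what the agent executed: the `jira-ticket.sh TICKET-123 comment "..."` syntax is taken from the tool documentation earlier in the post, while the comment wording, the `results` structure, and the function names are hypothetical. It defaults to a dry run so nothing is posted by accident.

```python
import subprocess


def summary_comment(pr_state: str, tests_passed: bool) -> str:
    """Format one structured comment from PR state and test outcome."""
    tests = "passing" if tests_passed else "failing"
    return f"PR status: {pr_state}. Test suite: {tests}."


def comment_command(ticket: str, text: str) -> list:
    """Build the CLI call as an argument list (no shell quoting needed)."""
    return ["jira-ticket.sh", ticket, "comment", text]


def post_comments(results: dict, dry_run: bool = True) -> list:
    """Build, and optionally run, one comment command per ticket.

    `results` maps a ticket key to a (pr_state, tests_passed) tuple.
    """
    commands = []
    for ticket, (pr_state, tests_passed) in sorted(results.items()):
        cmd = comment_command(ticket, summary_comment(pr_state, tests_passed))
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop on the first failed post
    return commands
```

Returning the command list makes the batch reviewable before anything is written to Jira.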

Productivity in Numbers

  • 50 PRs merged in one week during a bot training sprint (normally I’d manage 8-10)
  • 11 security smoke tests written and integrated into the CI pipeline in one afternoon
  • Custom CRM sync service: built, tested, validated, and deployed in 3 days (would have been 2 weeks manually)
  • 8 trading strategies backtested across 195K candles with 48 parameter variants each — the backtester itself was written by the agent in one session

The Security Habit

One of the persistent feedback rules in my setup is: “Always audit generated code for CSRF, secure cookies, rate limiting, input validation.” Every time the agent writes a web endpoint, it automatically adds timing-safe comparisons for secrets, rate limiting headers, input validation, and CORS restrictions. This is not because the agent inherently thinks about security — it is because the feedback file tells it to. Without that rule, it would happily deploy an unauthenticated endpoint with no input validation. The feedback system turns one correction into a permanent habit.
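As an illustration of the timing-safe comparison mentioned above, Python's standard library provides `hmac.compare_digest`. The wrapper below is a minimal sketch; the post does not show the generated endpoint code, so the function name `secret_matches` and the choice of Python are assumptions.

```python
import hmac


def secret_matches(provided: str, expected: str) -> bool:
    """Compare two secrets without leaking where they first differ.

    A plain == can return early on the first mismatching character;
    hmac.compare_digest is designed to avoid that timing signal.
    """
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

An endpoint would call this on the incoming API key or webhook token before doing any other work, and reject the request when it returns `False`.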

What I Have Learned

  • Context files beat prompt engineering. Instead of crafting clever prompts, document your project thoroughly. The agent performs dramatically better with a well-maintained CLAUDE.md than with clever one-shot instructions.
  • Memory compounds. Every session adds to the knowledge base. After three months, the agent knows things about my codebase that I have forgotten.
  • Feedback persists. Correcting the agent once is enough. The feedback file ensures it never repeats the mistake, even in a new session weeks later.
  • CLI tools are the interface. The agent works best with well-documented CLI tools that it can compose. A good --help message and a CLAUDE.md section for each tool go a long way.
  • Security requires explicit rules. LLMs optimise for “does it work”, not “is it safe.” Persistent security feedback rules are non-negotiable.
  • The agent is a multiplier, not a replacement. It does not make good architectural decisions. It does not understand business context intuitively. But it executes known patterns 10x faster than I can type, and it never forgets a deployment step or a test command.

Three months in, I cannot imagine going back to a workflow without it. Not because it writes better code than I do — it does not. But because it handles the 80% of development work that is mechanical: looking up column names, formatting SQL, running tests, creating tickets, deploying services, updating documentation. That leaves me free for the 20% that actually requires thinking.
