Walkthrough
Five Claude Finance Agents, Walked Through Step by Step
We ran the five agent categories that matter at seed and Series A on a fictitious 22-person SaaS company. Concrete prompts, sample outputs, and the catch with each one. Built to show you how to actually do this, not just which tool to pick.
Start with the closeThe setup
Meet Tidepool, our test company
Tidepool is fictitious. We made it up. But the company looks like roughly half the seed-to-Series-A SaaS startups Headroom works with, so the walkthroughs that follow will map cleanly onto a real company that resembles it.
Maya wears the finance hat by default. She has been running the month-end close with help from her bookkeeper for the last six months. The Series A close meant the back office got more complicated, and she wants to take less of it on her plate. This is exactly the kind of company the new Claude finance agents were built for. Let's run them.
Before the agents, Maya's month-end looked like this: half a Saturday on the close itself, two hours on Sunday rebuilding the budget-vs-actual spreadsheet, another half day Monday on the board pack the week before each meeting, and a constant low-grade chase across QuickBooks, Mercury, Ramp, and Stripe to figure out where the money actually went. About 30 hours a month, all of it in her head, none of it documented. After the agents, the same work runs in roughly 6 hours and lives in five Claude projects anyone on the team can open. That is what we are walking through.
The aim of this guide: show you how to actually do this, not just which tool to pick.
Walkthrough 1
The month-end close
The pick: Anthropic's Month-End Closer, running inside Claude with QuickBooks and Mercury connected. Numeric or Rillet if you want a packaged tool with a UI. Custom Claude project if you want full control. Skip FloQast and BlackLine; they are for companies five times your size.
What you need before you start
Claude with the finance plugin marketplace enabled (any paid plan works). QuickBooks Online connected. Mercury connected. Ramp connected. Stripe is optional but nice if you want the agent to see invoiced revenue.
If you are starting cold, the connector setup takes about 20 minutes. If you've connected before, it's 5.
Here's how Maya runs Tidepool's April 2026 close
- Open Claude. Install the Month-End Closer skill from the finance plugin marketplace. It shows up as an option in your skill list right after you install.
- Tell it the period and the company. Maya types a starter prompt:
Prompt to Claude
I'm closing April 2026 for Tidepool, our 22-person SaaS company. Walk me through the close. Start with the trial balance check and the bank reconciliation. Flag anything that needs my sign-off before you keep going.
- The agent reads QBO and Mercury and surfaces what it sees. Note: it does not change anything in your books yet. It just reports.
What the agent comes back with
- Bank reconciliation: clean. Ending Mercury balance $7,204,318 matches the QBO cash account.
- Trial balance: balances. No GL errors.
- Roll-forwards needed: deferred revenue, accrued expenses.
- Items needing sign-off: 3 transactions over $1,000 sitting in "Uncategorized Expense." 1 vendor invoice booked to "Software" that looks like it should be "Hosting." 1 duplicate-looking Notion charge (two $487 entries, same day, same vendor).
- Work through the sign-off list. The agent suggests categorizations for each item but waits for Maya to confirm. The three uncategorized expenses turn out to be a conference fee, an Airtable upgrade, and a security audit invoice. She confirms each. The "Software" → "Hosting" reclassification she agrees to. The Notion duplicate is a real billing error, so she flags it to reach out to Notion for a refund.
- Run the roll-forwards. Tidepool collects annual contracts upfront and recognizes the revenue monthly. The agent computes April's piece of that:
Deferred revenue roll-forward, April 2026
Beginning balance: $1,499,000
New billings recognized: $0 (Tidepool didn't invoice new annual deals in April; growth came from contract expansions on existing customers)
Revenue recognized this month: $158,000
Ending balance: $1,341,000Draft journal entry: Dr. Deferred Revenue $158,000 / Cr. Revenue $158,000. Ready for your review.
- Accrued expenses. Rent, software billing in arrears, payroll accruals. The agent applies the same accrual pattern as prior months and asks Maya to confirm any new items. There's one: the security audit invoice that came in on May 2 but covers April work. Maya confirms the accrual.
- The close package. The agent produces a one-page summary: P&L, balance sheet, ending cash, runway calculation. Maya reviews. Everything ties. She signs off.
April 2026 close summary (one-page version)
Revenue: $158,000 (recognized from deferred). YTD $612,000.
Operating expenses: $312,000. Top categories: Salaries $198,000, Hosting + AI inference $48,000, Sales & marketing $42,000.
Net loss: ($154,000) for the month. YTD ($528,000).
Cash: $7,204,318 ending. Beginning cash $7,358,000. Net change: $(154,000) operations, ~$0 financing/investing.
Runway: 23.4 months at current burn ($307k 3-month trailing average).
Signed-off items: 6 items. All sign-offs logged in the agent's audit trail for next month's review.
Total time: about 90 minutes the first run. About 45 minutes by the third cycle, once Maya's added her own customizations to the agent's defaults (Tidepool-specific accrual list, vendor mappings, the way she likes the summary formatted). She used to spend a full Saturday on this. Now it fits into a Tuesday morning between standups.
Also: by default, the agent drafts journal entries and waits for you to post them in QBO. Don't turn on auto-posting until you've run at least three closes and trust the output. The control gate exists for a reason.
Walkthrough 2
Reconciliation exception review
The pick: Mercury and Ramp's native QuickBooks feeds handle 80 percent of reconciliation. A custom Claude project handles the rest. Skip dedicated reconciliation platforms (Vic.ai, Stampli, Tipalti) at this stage; they're built for companies with five times your AP volume.
What you need before you start
The feeds set up correctly. Open QBO, go to Banking, confirm Mercury and Ramp are pulling daily. If they aren't, fix that first. It's a 10-minute task and it's the prerequisite for everything else.
Build a custom Claude project for exceptions
- Open a new Claude project. Name it something like "Tidepool Recon Exceptions."
- Connect QuickBooks. The project will pull transaction data on demand without you re-uploading.
- Save a reusable prompt as the project's default instruction:
Saved prompt for the project
You're helping Tidepool review reconciliation exceptions for the period I specify. Pull from QBO the transactions that are: - Uncategorized - Over $1,000 with no class or customer assigned - Possibly duplicate (same amount, same vendor, within 7 days of each other) - Booked to "Suspense" or "Ask Accountant" For each one, suggest the right category and flag any that look like errors. Don't make changes to QBO. Just produce the review list with your reasoning.
- Run it for the period. Maya types "Run the recon exceptions for April 2026." The agent pulls the period transactions and returns the list.
Exception list, April 2026
- Uncategorized (4): Stripe payout reversal $1,250 (looks like a refund, recommend "Refunds"); AWS invoice $890 (recommend "Hosting"); a $1,400 wire to "FastSpring" (recommend "Software, Payment Processing"); a $245 Ramp transaction at "Sweetgreen" with no employee assigned (recommend "Meals, Office").
- Possible duplicate (1): Two Notion subscription charges of $487 on April 3. Look identical. Recommend reviewing with Notion.
- Over $1,000 with no class assigned (2): $3,500 to Mercury Insurance (recommend class "G&A"); $1,800 to a contractor named Pixel & Pine (recommend class "Sales & Marketing" and assign to the Q2 brand refresh project).
- Review and act. Maya scans the list. Most categorizations are obvious. She confirms five and pushes back on one (the Sweetgreen charge was actually a client lunch, not office meals; she reclassifies it). The Notion duplicate is real, Maya emails Notion for a refund.
- Apply the categorizations in QBO. The agent doesn't push them automatically. Maya (or her bookkeeper) does it. Ten minutes of work.
- Ask a follow-up to catch pattern-level issues. The exception agent only flags what QBO already considers a problem. Sometimes the real issue is upstream. Maya asks the agent:
Maya's follow-up
One more pass. Look at the vendor list in QBO and tell me if any vendors with different names look like they might be the same company. Group anything that looks like a duplicate vendor record.
Vendor cleanup suggestionsMaya says yes to the first two, holds the third. The cleanup takes 5 minutes.- Vercel and V0 by Vercel: two vendor records, same EIN on invoices. Recommend merging into one Vercel record.
- Anthropic, PBC and Anthropic: two vendor records, same address. Recommend merging.
- Stripe and Stripe Inc: same vendor. Three of the Stripe Inc invoices look like they should be in the parent vendor for cleaner reporting.
Want me to draft the merge in QuickBooks (you would still need to approve)?
Walkthrough 3
Variance vs. budget
The pick: Custom Claude project pulling QBO actuals against a Google Sheets budget. Skip Mosaic, Cube, and Pigment until you cross $3M ARR and have a finance lead who'll use the platform daily.
What you need before you start
April actuals (the close should be done from Walkthrough 1) and a budget. Tidepool's budget lives in Google Sheets with line items that match the QuickBooks chart of accounts. If your budget is in a different format, do the alignment work once. It takes an hour and pays back forever.
Build the variance project
- Open a new Claude project. Name it "Tidepool Monthly Variance."
- Connect QuickBooks and the budget Sheet. Both stay live; the project pulls fresh data each time you run it.
- Save the project prompt:
Saved prompt for the project
Compare Tidepool's actuals from QuickBooks against the budget in the connected Sheet for the period I name. Produce: 1. A line-item variance table: actual, budget, dollar variance, percent variance. 2. Flag any line where the absolute variance is over 10 percent OR over $5,000 (whichever is larger). 3. For each flagged line, draft one sentence of commentary using the underlying transaction detail. 4. Tag each flagged variance as "one-time" or "trend" based on the prior 3 months. Be specific. If a vendor or customer drives the variance, name them.
- Run it for the period. "Run the April 2026 variance for Tidepool." The agent returns:
Variance highlights, April 2026 (flagged items only)
- Hosting + AI inference: $48,000 actual vs. $39,000 budget. Over by 23%. Driver: Cinder Robotics (new customer, signed March) is running heavy embedding workloads. Their usage added ~$8,000 to April inference costs. Trend, likely persistent until pricing covers it.
- T&E: $4,100 actual vs. $6,300 budget. Under by 35%. Driver: Maya canceled the Bay Area product conference trip. One-time.
- Sales & Marketing: $42,000 actual vs. $50,000 budget. Under by 16%. Driver: open SDR seat unfilled all month. One-time, will reverse next month if hired.
- Customer Support headcount: $18,000 actual vs. $14,000 budget. Over by 29%. Driver: new CS hire started mid-March; April is the first full month. Trend, ongoing at the new run rate.
Other lines: Within threshold. No commentary needed.
- Review and decide. Maya reads through. The hosting variance needs a conversation with Cinder about an updated pricing tier; she sends Slack to the customer success lead. The CS headcount overage was planned; no action needed but she makes a note to bump that line in next month's budget. T&E and S&M variances are expected and one-time.
- Ask for a multi-month trend on the flagged item. The hosting + AI inference variance is a trend, not a one-time blip, so Maya wants to see the picture across the last six months before pricing the conversation with Cinder.
Maya's follow-up
Show me Tidepool's hosting + AI inference spend month by month for the last 6 months. Split by vendor. Flag the months where Cinder Robotics drove the most usage. Plot it.
Hosting + AI inference, Nov 2025 to Apr 2026Now Maya has the case she needs for the customer success lead to take to Cinder.- Nov 2025: $22,000 (AWS $14k, OpenAI $5k, Anthropic $3k)
- Dec 2025: $26,000 (AWS $16k, OpenAI $6k, Anthropic $4k)
- Jan 2026: $31,000 (AWS $19k, OpenAI $7k, Anthropic $5k)
- Feb 2026: $34,000 (AWS $20k, OpenAI $8k, Anthropic $6k)
- Mar 2026: $39,000 (AWS $22k, OpenAI $10k, Anthropic $7k). Cinder Robotics onboarded mid-March; their workloads added ~$4k in second half of month.
- Apr 2026: $48,000 (AWS $25k, OpenAI $13k, Anthropic $10k). Cinder workloads contributed ~$8k of the $9k month-over-month increase.
Run-rate projection: at current Cinder usage, May trends to $54k, June to $60k. A pricing conversation should land in the next 2-3 weeks. Chart attached.
- Save the output. The variance summary becomes the source material for the board pack in Walkthrough 4.
Total time: 15 minutes for the standard run, plus another 10 if Maya asks a trend follow-up like she did this month. The setup itself is about 2 hours the first time, and you never have to redo it.
Walkthrough 4
Board pack assembly
The pick: Custom Claude project that assembles 80 percent of the board pack from connected sources. The founder writes the strategic commentary on top. Skip dedicated board prep tools at our stage; the work is too bespoke.
What you need before you start
Tidepool's board pack is a 12-page Google Doc that goes to the board monthly. Same structure every cycle: P&L, balance sheet, cash and runway, KPI page, variance highlights, founder commentary, and asks. You need: the template (last month's pack with prior commentary stripped out), QBO, Mercury, Ramp, Stripe, and a KPI tracking Sheet.
Build the board pack project
- Open a new Claude project. Name it "Tidepool Board Pack."
- Upload the template. Last month's pack, prior commentary deleted. The agent will use this for structure and formatting cues.
- Connect the data sources. QBO, Mercury, Ramp, Stripe, and the KPI Sheet.
- Save the project prompt:
Saved prompt for the project
Assemble Tidepool's board pack for the month I specify, using the connected sources and the template I uploaded. Produce: 1. P&L summary with month actuals and YTD comparison 2. Balance sheet 3. Cash position + runway calculation (use 3-month trailing burn) 4. KPI page: ARR, NDR, new logos, churn, CAC, runway 5. Variance highlights pulled from the Tidepool Monthly Variance project output 6. Draft commentary scaffolding with placeholders for the founder DO NOT write the strategic commentary sections. Leave [CEO COMMENTARY: topic] markers for me to fill in. Match the template's formatting and order.
- Run it for the period. "Assemble Tidepool's April 2026 board pack." The agent pulls from each connected source and returns a draft:
Sample of what the draft pack looks like
Page 1, Cover: Tidepool, April 2026 Board Pack, prepared for [Board Members].
Page 2, Financial Summary: P&L for April with actuals, budget, variance. YTD comparison. Brief callouts on the four flagged variance items.
Page 3, Balance Sheet: ending balance sheet, comparison to prior month.
Page 4, Cash & Runway: $7.2M ending cash, $312k average monthly burn, 23 months of runway at current rate.
Page 5, KPIs: ARR $1.8M (+12% MoM), NDR 117%, 3 new logos in April, 0 churn, blended CAC $14,200.
Page 6, Variance Detail: full table from the variance project.
Page 7, CEO Commentary: [CEO COMMENTARY: state of business] [CEO COMMENTARY: customer wins this month] [CEO COMMENTARY: what's been hard]
Page 8, Asks: [CEO COMMENTARY: 2-3 specific asks for the board this period]
- Maya writes the strategic commentary. Three placeholders to fill: state of business, customer wins, what's been hard. Plus the asks page. This part is hers, always. The board reads the commentary much more carefully than they read the numbers; the agent shouldn't be writing it.
- Calibrate the KPI definitions, once. The first time Maya runs the board pack agent, the NDR number comes back wrong: 124% instead of her usual 117%. Why: Tidepool excludes trial customers from the NDR base, and Claude's off-the-shelf logic included them. She tells the agent:
One-time calibrationThe agent updates the project context. On every subsequent month, NDR comes back at the right number. This kind of calibration is a 20-minute up-front cost the first cycle; after that, the agent's KPI math matches yours.
For Tidepool, NDR should be calculated on a trailing-12-month basis, excluding any customer who was in trial status at any point during the period. Remember this for future pack assemblies.
- Format, review, send. Maya does a final read-through, exports to PDF, and emails the board.
Total time for Maya: about 90 minutes start to finish. That's down from 6 to 8 hours when she was doing the assembly manually.
Walkthrough 5
Ad-hoc analysis
The pick: Claude with code execution on exported financial data. Skip Hex and Mode unless you have two or more data-fluent operators on the team.
What you need before you start
Nothing. No project setup. Just open Claude and start.
Maya wants to know what drove Q1 ARR growth
- Export the data. From Stripe (or QBO, depending on where your revenue lives), export the last 12 months of revenue and customer detail as a CSV. Takes about 30 seconds.
- Open Claude. Drag in the CSV.
- Ask the question.
Maya's prompt
Here's Tidepool's Stripe revenue export for the last 12 months. For Q1 2026 (January through March): 1. Which customers added the most ARR (new + expansion)? 2. Which customers churned, and how much ARR did we lose? 3. Plot cumulative net new ARR over the quarter. Show me the code you used.
- Claude runs the analysis. Returns a table, a chart, and the Python code that produced them.
What comes back
Top ARR contributors, Q1 2026:
- Cinder Robotics, new logo in March, $48k ARR
- Glade Health, expansion from $24k to $52k in February
- Pollinate, new logo in January, $36k ARR
- Northwood Studios, expansion from $18k to $32k in February
Churn, Q1 2026:
- Marlowe Labs, churned January at $15k ARR (acquired by competitor)
- Fewest Studio, churned March at $12k ARR (went out of business)
Net new ARR for Q1: $174,000. Cumulative chart attached. Code visible below.
- Follow up. Maya asks in the same conversation: "What's the average sales cycle for the deals that closed in Q1?" Claude pulls the deal-stage dates from the export and computes it: average 47 days from first touch to close. She follows up again: "Compare that to last year." Claude returns the comparison.
Total time for the whole analysis: about 15 minutes. The same exercise in a spreadsheet would take an hour and lose the chart.
The pattern
You're not buying a tool. You're building a prompt library.
After running these walkthroughs once or twice, the pattern emerges. The value isn't in any single agent. It's in the library of prompts and project setups that work together.
Maya's full setup at Tidepool, after a few months:
- 1Anthropic Month-End Closer (installed, used as-is)
- 2Tidepool Recon Exceptions (custom Claude project)
- 3Tidepool Monthly Variance (custom Claude project)
- 4Tidepool Board Pack (custom Claude project)
- 5Ad-hoc analyses (Claude with code execution, no project)
Five things, all running in Claude, total monthly subscription cost under $300, total time on finance work roughly 6 hours per month vs. the 30 hours she was spending before. That's the math.
What Maya's month actually looks like now
Concretely, here's how the agents map to her calendar:
- Business day 2-3 of each month: Month-End Closer runs the close. About 45 minutes once the agent is tuned. Bookkeeper handles the categorization push to QBO.
- Business day 3: Recon Exceptions project flags anything the agent missed. 10 minutes.
- Business day 4: Monthly Variance project runs. 15 to 25 minutes depending on whether Maya wants a trend follow-up.
- Business day 5-7 (board months): Board Pack project assembles the draft. Maya writes the strategic commentary. About 90 minutes total.
- Anytime: Ad-hoc analyses. Claude with code execution, no project, no setup. 10 to 20 minutes per question.
Roughly 5 to 7 hours of finance work a month, predictably scheduled, with every output saved in a place she can show her bookkeeper or her future controller. Compare that to 30 hours scattered across nights and weekends in spreadsheets only she understood.
Where this guide could be wrong
Two honest caveats. First, Anthropic's finance agents are days old as of this writing. The edge cases are still being shaken out, and the right answer in November 2026 may not be the same as today. Re-evaluate the stack every six months. Second, we have a clear preference for building with Claude over buying packaged platforms at this stage, but we have seen real companies where Numeric, Rillet, or Mosaic was the right call from day one because the founder had a finance lead who needed a UI from the start. If that describes you, buy. The right answer is whichever one your specific team will actually use.
Variations
What changes if your company isn't Tidepool
Tidepool is a generic US-only SaaS company at Series A. About half the founders we work with look like that; the other half have something specific that shifts the playbook. Three of the more common variations and what changes in each:
Variation 1
You're an AI-native SaaS with heavy compute spend
The variance walkthrough matters more, and the chart of accounts has to be different. Split "Hosting" into three accounts: AI inference (OpenAI, Anthropic, Bedrock, Vertex), training compute (CoreWeave, Lambda, GPU instances), and infrastructure hosting (everything else). If you lump them together, the variance commentary will be vague no matter how good the agent is.
Run the variance bi-weekly, not monthly. Compute spend can swing burn by 20 to 40 percent month over month, and the lead time on a customer-pricing conversation is too long to wait until close week to spot the trend. The agent runs fine on a 2-week cadence; the prompt is the same.
In the board pack, add a one-page "compute economics" section: gross margin by customer, inference cost per active user, and the trend line on each. Most investors at this stage are watching this number with hawk eyes. The agent will assemble the page from the same connected sources once you've told it the structure once.
Variation 2
You have a UK or Canadian entity
The close walkthrough runs twice, once per entity, and then you consolidate. Anthropic's Month-End Closer is single-entity oriented as of this writing. Run it on each entity independently, then have a Claude project do the consolidation: read both QBO files, produce a consolidated P&L and balance sheet, flag intercompany discrepancies. Setup is 2 hours; the consolidation runs in 10 minutes each cycle.
Currency conversion is the trickiest part. Use the period-end FX rate for balance sheet items and the average rate for P&L items, applied consistently. Tell the agent the conventions once and it sticks.
Sales tax becomes Anrok for US plus a separate tool (Stripe Tax or Quaderno) for UK VAT. Two reconciliation flows, both fed into the close summary. The agent handles both if you connect both data sources.
Variation 3
You're pre-revenue or under $1M ARR
You don't need most of this yet. Run the Month-End Closer to learn the workflow. Skip the variance project (you don't have enough budget complexity to justify it). Skip the board pack project (your board pack at this stage is a Notion doc, not a 12-page pack). Keep the analysis workflow; ad-hoc Claude with code execution is useful at every stage.
The full agent stack becomes worth its setup time at the moment you have a real budget that needs to be defended. That usually arrives somewhere between $1M and $3M ARR, often around the time you close your Series A. Below that, the agents add overhead that exceeds the value they create.
Avoid these
A short list of what to skip
- Buying a $2,000-per-month FP&A platform if you don't have a finance person who'll use it daily. The tool gathers dust; you'll cancel in six months.
- Letting Claude push journal entries directly to QBO without human review. Not yet. The control gate exists for a reason.
- Trying to automate your strategic board commentary. Your investors want your voice, not the agent's. They'll know the difference.
- Using the Month-End Closer for tax filings. Different category, different rules. Use a CPA.
- Skipping the QuickBooks cleanup before deploying agents. Garbage in, garbage out, very fast.
- Buying FloQast or BlackLine pre-Series-B. They're built for companies five times your size. The right answer is "not yet."
Founder FAQ
The questions we hear
How do I start, like today?
Open Claude. Install the Month-End Closer skill from the finance plugin marketplace. Connect QuickBooks. Run it on last month's close just to see what comes out. First 30 minutes.
What if I don't use QuickBooks?
Xero works too. NetSuite isn't covered yet by the Anthropic skill. If you're on NetSuite, you're likely at a stage where you have a finance team that's evaluating Numeric or FloQast anyway.
Do I need a Claude Enterprise plan?
For one founder running this for one company, Claude Pro or Claude Team is fine. Enterprise is right when you have multiple people across the team using it and you want stronger data controls.
Will Claude make up numbers?
The Anthropic Month-End Closer is designed not to. It surfaces what's reconciled and flags what isn't. For custom Claude projects, connect to real data sources and ask the agent to cite where each number came from. Fabrication is much rarer than with the previous generation of finance AI tools.
What if the agent gets something wrong?
Update the prompt. The mistakes compound into the project definition over time. The first month or two is calibration; by month three the agent is tuned to your specific business.
How long until I trust this?
After three months and three closes, you'll know. Most founders we work with go from checking everything the agent does to reviewing only the exceptions between month one and month three.
Should I buy Mosaic or Cube instead?
Build with Claude until $3M ARR or until you have a finance lead who'll operate a platform daily. Mosaic if that lead lives in dashboards. Cube if they live in spreadsheets. Below that revenue threshold, the platforms gather dust regardless of which one you pick.
Keep reading
More on how we work
Get in touch
Tell us where things stand
If you just raised and want help deploying Claude finance agents on your back office without buying $40,000 a year of platforms you don't need yet, email hello@headroom.llc or use the form on the home page. We'll follow up within a couple of days.