Getting started with AIWatcher.

One dashboard for AI cost and behavior across your developers’ local AI tools and your product’s AI agents.

Path A

Your Local

Local-only CLI for Claude Code, Codex, and Cursor. Zero code changes, no account, no cloud upload.


Path B

Your Apps

One-line SDK wrapper for your product's AI calls. Python and JavaScript supported.


Path C

Use Claude Code

Paste a prompt into Claude Code and have it instrument your existing codebase end-to-end — routes, lib helpers, cron jobs.


Use one, two, or all three.

Before you start

1. Sign in at ai-watcher-pi.vercel.app
2. For Your Local: skip sign-in entirely. pip install aiwatcher-cli runs fully offline, with no account or key needed
3. For Your Apps: go to Your Apps → New App and copy your API key


Path A — Your Local (zero code changes)

Reads local session files for Claude Code, Codex CLI, and Cursor. Cline and Windsurf installs are detected today; full session scanning for those two is on the roadmap.

Requirements: Python 3.9+. macOS, Linux, and Windows.

Install

bash

pip install aiwatcher-cli

Run

bash

aiwatcher today

No account, no API key, no cloud upload — everything runs and stays on your machine. Run aiwatcher ui for a local-only dashboard at http://127.0.0.1:8765 (or the next free port).

What it reads

Tool	Source path
Claude Code	~/.claude/projects/
Codex CLI	~/.codex/state_5.sqlite
Cursor	~/Library/Application Support/Cursor/logs/ (macOS)
Cline	Detected only — full session scan coming soon
Windsurf	Detected only — full session scan coming soon

AIWatcher Local never reads prompt content or source code — only session metadata, token counts, model names, and timing.
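
As a rough illustration of what "detected" means for the table above, the sketch below checks whether each listed source path exists under a home directory. It is not the CLI's actual detection logic: `SOURCES` and `detect_tools` are hypothetical names, and only the three concrete paths from the table are used.

```python
from pathlib import Path
from typing import Dict

# Paths copied from the table above, relative to the home directory.
# The Cursor entry is the macOS-style location given in the table.
SOURCES: Dict[str, str] = {
    "Claude Code": ".claude/projects",
    "Codex CLI": ".codex/state_5.sqlite",
    "Cursor": "Library/Application Support/Cursor/logs",
}


def detect_tools(home: Path) -> Dict[str, bool]:
    """Return {tool name: True if its source path exists under `home`}."""
    return {tool: (home / rel).exists() for tool, rel in SOURCES.items()}


if __name__ == "__main__":
    for tool, found in detect_tools(Path.home()).items():
        print(f"{tool}: {'found' if found else 'not found'}")
```

The check only looks at whether a path exists; it never opens the files, which matches the metadata-only stance described above.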


Path B — Your Apps (SDK)

Wraps your product’s AI calls and sends telemetry to AIWatcher. Each wrapped call creates a session in Your Apps with cost, token counts, and a full event timeline.

Pick your stack

Works with any Python backend: FastAPI, Flask, Django, AWS Lambda, scripts, or any other runtime.

Install

bash

# Local install (PyPI publish coming soon)
pip install -e /path/to/ai-watcher/sdk

Environment variables

bash

export AIWATCHER_API_KEY=aw_live_...
export AIWATCHER_API_URL=https://ai-watcher-pi.vercel.app

Single AI call

python

import openai

from ai_watcher import track_llm

# doc_text and current_user come from your own application code.

result = track_llm(
    'classify-document',                          # action name
    lambda: openai.chat.completions.create(       # your existing call
        model='gpt-4o',
        messages=[{'role': 'user', 'content': doc_text}]
    ),
    {
        'human_id': current_user.email,           # who triggered this
        'agent_name': 'doc-classifier',           # your agent name
        'model': 'gpt-4o',                        # model being called
        'session_name': 'Classify: invoice',      # shown in dashboard
        'input': {'doc_type': 'invoice'}          # metadata, no PII
    }
)
# result is unchanged — track_llm returns exactly what your fn returns
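
The pass-through behavior noted in the comment above can be pictured with a small stand-in. This is not the SDK's implementation: `track_call` and the in-memory `events` list are hypothetical, and a real tracker would send the record to AIWatcher instead of appending it to a list.

```python
import time
from typing import Any, Callable, Dict, List, TypeVar

T = TypeVar("T")

# Hypothetical in-memory sink; the real SDK sends telemetry to AIWatcher instead.
events: List[Dict[str, Any]] = []


def track_call(action: str, fn: Callable[[], T], opts: Dict[str, Any]) -> T:
    """Run fn, record metadata about the call, and return fn's result untouched."""
    started = time.perf_counter()
    status = "error"
    try:
        result = fn()
        status = "ok"
        return result
    finally:
        # Recorded whether fn returned or raised; exceptions still propagate.
        events.append({
            "action": action,
            "status": status,
            "latency_ms": (time.perf_counter() - started) * 1000,
            **opts,
        })


reply = track_call("classify-document", lambda: {"label": "invoice"}, {"model": "gpt-4o"})
```

Because the wrapper returns whatever `fn` returns, existing code that reads the response keeps working without changes.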

Chained calls (one session, multiple model calls)

python

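from ai_watcher import track_chain

# openai and anthropic below are your existing, already-configured clients;
# preview and customer_id come from your own application code.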

results = track_chain(
    steps=[
        {
            'action': 'extract',
            'model': 'gpt-4o',
            'input': {'pages': 3},
            'fn': lambda: openai.chat.completions.create(...)
        },
        {
            'action': 'classify',
            'model': 'claude-sonnet-4-20250514',
            'input': {'text_preview': preview[:100]},
            'fn': lambda: anthropic.messages.create(...)
        }
    ],
    opts={
        'human_id': customer_id,
        'agent_name': 'shipping-pipeline',
        'session_name': 'Process Shipping Document',
        'framework': 'python'
    }
)
# results is a list matching the order of steps
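
To show the "one session, ordered results" shape in isolation, here is a minimal sketch with stubbed steps. `run_chain` is a hypothetical stand-in, not the SDK's `track_chain`; it assumes each step carries `action`, `model`, and `fn` keys as in the example above.

```python
import uuid
from typing import Any, Dict, List, Tuple


def run_chain(
    steps: List[Dict[str, Any]], opts: Dict[str, Any]
) -> Tuple[List[Any], Dict[str, Any]]:
    """Run each step's fn in order and group one event per step under a single session."""
    session: Dict[str, Any] = {"session_id": str(uuid.uuid4()), **opts, "events": []}
    results: List[Any] = []
    for step in steps:
        results.append(step["fn"]())  # results keep the order of steps
        session["events"].append({"action": step["action"], "model": step["model"]})
    return results, session


results, session = run_chain(
    steps=[
        {"action": "extract", "model": "gpt-4o", "fn": lambda: "extracted text"},
        {"action": "classify", "model": "claude-sonnet-4-20250514", "fn": lambda: "invoice"},
    ],
    opts={"agent_name": "shipping-pipeline"},
)
```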

AWS Lambda

Same pattern. Add env vars to your Lambda configuration:

text

AIWATCHER_API_KEY  = aw_live_...
AIWATCHER_API_URL  = https://ai-watcher-pi.vercel.app

python

import openai

from ai_watcher import track_llm

def handler(event, context):
    return track_llm(
        'classify-document',
        lambda: openai.chat.completions.create(
            model='gpt-4o',
            messages=[{'role': 'user', 'content': event['text']}]
        ),
        {
            'human_id': event.get('customer_id', 'unknown'),
            'agent_name': 'doc-classifier',
            'model': 'gpt-4o',
            'framework': 'aws-lambda',
            'session_name': f"Classify: {event.get('doc_type', 'document')}",
            'input': {'doc_type': event.get('doc_type')}
        }
    )

track_llm uses only the Python standard library (urllib). No extra dependencies are added to your Lambda bundle.
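
For a sense of what a standard-library-only telemetry call can look like, the sketch below builds a JSON POST with `urllib.request` without sending it. The `/api/events` path, the bearer-token header, and `build_event_request` itself are assumptions for illustration; the SDK's real endpoint and payload are not documented here.

```python
import json
import urllib.request
from typing import Any, Dict


def build_event_request(
    api_url: str, api_key: str, event: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying one telemetry event."""
    return urllib.request.Request(
        url=api_url.rstrip("/") + "/api/events",  # hypothetical path
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_event_request(
    "https://ai-watcher-pi.vercel.app",
    "aw_live_example",
    {"action": "classify-document", "model": "gpt-4o"},
)
# Sending it would be: urllib.request.urlopen(req, timeout=5)
```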


Path C — Use Claude Code to instrument your app

If you have an existing codebase with AI calls scattered across routes, shared lib helpers, and cron jobs, paste the prompt below into Claude Code. It walks through discovery, asks you architectural questions at the right moments, wraps every call site with the right pattern, and verifies the integration end-to-end.

We dogfooded this prompt on a real Next.js sales-engagement app — 17 routes, 15 lib helpers, 9 cron jobs — and refined it from every rough edge we hit. It’s the same prompt we used internally.

Works best in interactive mode so it can ask the architectural-decision questions in real time. Also works with Cursor, Cline, or Aider if you coach it through the phases manually.

Instrument my app with AIWatcher

I want you to integrate AIWatcher into this codebase so every AI call (LLM + tool) becomes a tracked event in the AIWatcher dashboard. Work in phases. Pause for my input at every checkpoint — do not skip ahead, batch decisions, or pick defaults silently.

This is a real integration on a working app, not a demo. Bias toward "make it correct and visible in the dashboard," not "minimize lines of code changed."

What you're about to build (the mental model)

One AIWatcher session = one logical workflow. Usually that's one HTTP request, one cron tick, or one background job. A session has a start, an end, and a SHA-256-verified chain of events in between.
Events are AI calls made inside a session. Each anthropic.messages.create, openai.chat.completions.create, exa.searchAndContents, etc. is one event. The session collects them.
Each session is tagged with agent_name and productContext. agent_name discriminates workflows in the dashboard's /agents view. productContext carries the surface metadata (which feature, which entity, which user action) that powers cost-attribution slicing.
Sessions are created via the SDK's withSession() helper that I'll have you build in Phase 3. It opens a session on entry, closes it on exit (verifying the chain), and ensures every wrapped AI call inside lands on that session.
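
For illustration only, the "SHA-256-verified chain" idea can be sketched in Python as a hash chain in which each event's hash covers the previous hash plus the event payload. The event fields, the all-zero starting value, and the function names are assumptions; the SDK's actual chain format is not specified here.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed starting value for an empty chain


def link_hash(prev_hash: str, payload: Dict[str, Any]) -> str:
    """Hash the previous link's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: List[Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Append an event whose hash depends on everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "hash": link_hash(prev, payload)})


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every link; an edited or reordered event breaks the chain."""
    prev = GENESIS
    for link in chain:
        if link["hash"] != link_hash(prev, link["payload"]):
            return False
        prev = link["hash"]
    return True


chain: List[Dict[str, Any]] = []
append_event(chain, {"action": "run-company-research-search", "tokens": 0})
append_event(chain, {"action": "run-company-research-synthesize", "tokens": 412})
```

Verification at session close is what a field like chain_valid would report under this model.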

Phase 1 — Discovery (read-only, no edits)

Map every AI call site in the codebase before touching anything. Use the commands below — they're tuned to avoid common false positives.

Always exclude these paths from every grep

node_modules/, .next/, dist/, build/, out/, .git/
docs/, **/getting-started/, **/examples/
**/__tests__/, *.test.ts, *.test.tsx, *.spec.ts, *.spec.tsx
Anything under app/aiwatcher/** or app/agentwatch/** (these are usually in-app docs pages with sample code that looks like runtime AI calls)

1.1 — AI SDK imports

text

grep -rn -E "from ['\"]@?(anthropic-ai/sdk|openai|ai|@ai-sdk/|@google/generative-ai|cohere-ai|replicate|groq-sdk|mistralai)" \
  --include="*.ts" --include="*.tsx" --include="*.js"

1.2 — LLM call shapes (catches inline `fetch()` cases too)

text

grep -rn -E "messages\.create|messages\.stream|chat\.completions\.create|generateText|streamText|streamObject|generateObject|embed\(|createEmbedding" \
  --include="*.ts" --include="*.tsx"

1.3 — External tool calls worth tracking

Ask me which paid/slow third-party APIs to include. Starting set:

text

grep -rn -E "from ['\"](exa-js|@pinecone|@qdrant|@weaviate|chromadb|serpapi|apify|@unipile)" \
  --include="*.ts" --include="*.tsx"

1.4 — Streaming detection

For every file from 1.1/1.2, mark Streaming ✅ if any of these match:

text

grep -nE "new ReadableStream|messages\.stream|streamText|streamObject|text/event-stream" <file>

1.5 — Trace lib helper callers (critical for coverage)

For each AI-using file under lib/**, find every file that imports it:

text

# For lib/foo/bar.ts:
grep -rn -E "from ['\"](@/|\.\./)lib/foo/bar" --include="*.ts" --include="*.tsx"

For each helper, group the callers into buckets:

Routes (app/api/**/route.ts)
Cron (app/api/cron/** or files referenced in vercel.json)
Other libs (lib/**) — recurse, trace their callers too
Scripts (scripts/**) — ask me: one-off backfill or scheduled?
Server actions ('use server' files)

A helper called by cron is a high-priority wrap point.

1.6 — Server actions and cron

text

grep -rln -E "^['\"]use server['\"]" --include="*.ts" --include="*.tsx"
cat vercel.json 2>/dev/null    # cron schedules
ls app/api/cron/ 2>/dev/null

If ls app/api/cron/ shows more directories than vercel.json declares schedules for, ask me which of the unscheduled ones are still active (might be HTTP-only endpoints or stale dead code).
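
For illustration only, the comparison described above can be sketched in Python. It assumes vercel.json uses Vercel's "crons" list of path/schedule entries and that each cron route lives at /api/cron/<directory>; `unscheduled_crons` is a hypothetical helper, not part of any tool.

```python
import json
from typing import Iterable, List


def unscheduled_crons(vercel_json_text: str, cron_dirs: Iterable[str]) -> List[str]:
    """Return cron directory names with no matching path in vercel.json's "crons" list."""
    config = json.loads(vercel_json_text)
    scheduled = {entry["path"].rstrip("/") for entry in config.get("crons", [])}
    return sorted(d for d in cron_dirs if f"/api/cron/{d}" not in scheduled)


vercel_json = '{"crons": [{"path": "/api/cron/daily-report", "schedule": "0 5 * * *"}]}'
missing = unscheduled_crons(vercel_json, ["daily-report", "stale-sync"])
```

Anything left in `missing` is a directory to ask about before wrapping.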

1.7 — Baseline of pre-existing TS errors

Before any instrumentation, capture the baseline so we can attribute any new errors to our work:

text

# grep -c already prints 0 when nothing matches; "|| true" only clears its non-zero exit status
tsc --noEmit 2>&1 | grep -cE "error TS" || true

Remember this number. We'll compare against it after each phase.

1.8 — Report back

Build this table and stop:

File	AI signal	Location (route / lib / cron / script)	Streaming?	Callers (if lib)

Then ask me:

"I found X route handlers, Y lib helpers, and Z cron routes that do AI work. Pre-existing TS baseline is N errors. Should I proceed to Phase 2?"

Do not write code yet.

Phase 2 — Architectural decisions (I'll answer)

Once I've reviewed the discovery table from 1.8, ask me these four questions explicitly. Don't pick defaults.

Q1 — Wrap-point for individual LLM call wraps

"All routes/crons that do AI work will get a withSession() wrap to establish the session boundary — that part isn't optional and applies uniformly. The architectural choice is where to put the actual aw.trackLLM wraps on each individual anthropic.messages.create call:
(a) At the call site, inside lib helpers (recommended): Wrap each anthropic.messages.create inside the lib helper that contains it. Calls from any caller — routes, crons, scripts, other libs — get tracked automatically. Looping helpers get per-iteration accuracy.
(b) Only at the route handler: Wrap aw.trackLLM around the lib-helper call from the route. Works only if the helper contains a single Claude call. Breaks for looping helpers, multi-call helpers, and any cron-driven work that flows through a lib.
Recommend (a). Confirm?"

Q2 — `agent_name` per workflow

"Each workflow needs a stable agent_name (becomes a row in the dashboard's /agents page). The convention: kebab-case derived from the route path or helper function. Examples for your codebase:
Route	Suggested `agent_name`
(list per their actual routes)
Approve or adjust?"

Q3 — `productContext` fields

"Each session can carry productContext for cost-attribution slicing: feature, screen, route, user_action, entity_type, entity_id, workflow_id, customer_id, plan. I'll populate feature, route, user_action, and entity_type/entity_id (where derivable from URL params or request body) by default.
Anything else specific to your product I should plumb? Common ones to think about:
customer_id if you're multi-tenant
plan if you have subscription tiers
workflow_id if you have multi-step pipelines worth correlating"

Q4 — Cron handling

"Cron jobs have no logged-in user. The SDK falls back to human_id: 'anonymous' automatically — sessions are still distinguishable by agent_name. Acceptable, or do you want a fixed service identity like 'cron@<app-name>'?"
(Service identity requires a small SDK config tweak; 'anonymous' works out of the box.)

Wait for my answers. Don't proceed until I respond.

Phase 3 — Setup (one config file + env vars)

3.1 — Install the SDK

bash

pnpm add aiwatcher    # or npm install / yarn add

3.2 — Create `lib/aiwatcher.ts`

typescript

import { AIWatcher, type ProductContext } from 'aiwatcher'
import { getCurrentUser } from './current-user'   // adapt to your auth (Clerk, NextAuth, etc.)

export const aw = new AIWatcher({
  apiKey: process.env.AIWATCHER_API_KEY ?? '',
  apiUrl: process.env.AIWATCHER_API_URL ?? 'https://ai-watcher-pi.vercel.app',
  appName: '<your-app-name>',
  framework: 'nextjs',         // or 'express' | 'fastapi' | etc.
  sessionScope: 'persistent',  // 'persistent' lets one workflow own one session w/ multiple events
  getUserId: async () => (await getCurrentUser())?.id ?? null,
})

/**
 * Wrap a non-streaming route handler body. Opens a session at entry,
 * closes it (and verifies the chain) at exit.
 *
 * For streaming routes, call aw.startSession() / aw.endSession() manually —
 * see Phase 4 streaming pattern.
 */
export async function withSession<T>(
  fn: () => Promise<T>,
  opts?: { agentName?: string; productContext?: ProductContext },
): Promise<T> {
  await aw.startSession(opts?.agentName, opts?.productContext)
  try {
    return await fn()
  } finally {
    await aw.endSession()
  }
}

3.3 — Environment variables

Add to .env.local (and your hosting platform's project settings):

text

AIWATCHER_API_KEY=aw_live_...
AIWATCHER_API_URL=https://ai-watcher-pi.vercel.app

(Get the API key from the AIWatcher dashboard at ai-watcher-pi.vercel.app → Apps → New App.)

3.4 — Verify the setup compiles

bash

tsc --noEmit | grep -cE "error TS"

Should equal the baseline from Phase 1.7. Stop and confirm with me before proceeding to wrap call sites.

Phase 4 — Wrap each call site

Apply these patterns based on the file's type. The naming and structural rules below were derived from instrumenting real apps — apply them mechanically.

4.1 — Non-streaming route handlers

Wrap the entire handler body in withSession(). Move URL param access (await params) outside the wrap so entity_id is in scope.

typescript

export async function POST(req: NextRequest, { params }: Params) {
  const { id } = await params                  // ← outside, so entity_id is in scope below

  return withSession(async () => {
    const user = await getCurrentUser()
    if (!user) return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })

    const result = await aw.trackLLM(
      'classify-document',
      () => anthropic.messages.create({ model, messages, max_tokens: 1024 }),
      { model, input: { doc_type: 'invoice' } },  // metadata only — never the raw prompt
    )

    return NextResponse.json(result)
  }, {
    agentName: 'doc-classifier',
    productContext: {
      feature: 'document-pipeline',
      route: '/api/documents/[id]/classify',
      user_action: 'classify-document',
      entity_type: 'document',
      entity_id: id,
    },
  })
}

4.2 — Streaming route handlers

withSession() cannot be used because the handler returns the Response before the stream is consumed. Manage the lifecycle manually around the stream:

typescript

export async function POST(req: NextRequest) {
  // Auth + env checks stay BEFORE startSession — unauthorized requests shouldn't create sessions.
  if (!process.env.ANTHROPIC_API_KEY) {
    return new Response(JSON.stringify({ error: 'Missing key' }), { status: 500 })
  }

  await aw.startSession('sequence-generator', {
    feature: 'sequence-generation',
    route: '/api/sequences/generate',
    user_action: 'generate-sequence',
  })

  const stream = new ReadableStream({
    async start(controller) {
      try {
        const claudeStream = aw.trackStream(
          'sequences-generate',
          () => anthropic.messages.stream({ model, messages }),
          { model, input: { goal: goal.slice(0, 150) } },
        )
        for await (const chunk of claudeStream) {
          /* emit deltas to client */
        }
      } catch (err) {
        /* error handling */
      } finally {
        // CRITICAL: endSession BEFORE controller.close().
        // If close() throws (e.g. on early-return double-close paths),
        // endSession() still ran — session won't leak.
        await aw.endSession()
        controller.close()
      }
    },
  })

  return new Response(stream, { headers: { 'Content-Type': 'text/plain; charset=utf-8' } })
}

4.3 — Lib helpers (shared AI logic)

Wrap each anthropic.messages.create (or other LLM/tool call) inside the helper. The wrap fires inside whatever session the caller opened — no need for the helper to manage lifecycle.

typescript

// lib/foo/synthesize.ts
import { aw } from '@/lib/aiwatcher'
import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY ?? '' })

export async function synthesizeAccount(opts: { signals: string[] }) {
  const response = await aw.trackLLM(
    'synthesize-account',
    () => anthropic.messages.create({ model: 'claude-haiku-4-5', max_tokens: 400, /* ... */ }),
    { model: 'claude-haiku-4-5' },
  )
  return parse(response)
}

Naming rule for action names: kebab-case of the helper function name. When a single helper performs multiple distinct sub-tasks (different prompts or shapes), give each its own sub-action suffix:

Pattern	Example
1 Claude call in helper	`synthesizeAccount` → `'synthesize-account'`
2+ Claude calls, same task (loop, retry)	`condenseIfNeeded` retry → still `'condense-if-needed'`
2+ Claude calls, distinct tasks	`runCompanyResearch` → `'run-company-research-search'` (Exa) + `'run-company-research-synthesize'` (Claude)
Tool calls (Exa, etc.)	`aw.track('exa-search', () => exa.searchAndContents(...))`
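
For illustration only, the naming rule in the table can be expressed as a small Python function. `to_action_name` is a hypothetical helper and assumes camelCase helper names, as in the examples above.

```python
import re


def to_action_name(helper_name: str, sub_action: str = "") -> str:
    """Convert a camelCase helper name to kebab-case, optionally adding a sub-action suffix."""
    # Insert a hyphen at each lowercase-or-digit to uppercase boundary, then lowercase.
    kebab = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", helper_name).lower()
    return f"{kebab}-{sub_action}" if sub_action else kebab


single = to_action_name("synthesizeAccount")
search_step = to_action_name("runCompanyResearch", "search")
```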

4.4 — Cron jobs

Wrap the handler body in withSession(), with the auth check before the wrap. human_id will be 'anonymous' automatically — that's fine for cron.

typescript

export async function GET(req: NextRequest) {
  const authHeader = req.headers.get('authorization')
  if (process.env.CRON_SECRET && authHeader !== `Bearer ${process.env.CRON_SECRET}`) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
  }

  return withSession(async () => {
    // existing cron body — any aw.trackLLM/track calls inside (including in lib helpers
    // the cron calls) will land on this session
  }, {
    agentName: 'daily-report-cron',
    productContext: {
      feature: 'reports-cron',
      route: '/api/cron/daily-report',
      user_action: 'scheduled-run',
    },
  })
}

4.5 — Special TypeScript case: cast-preserved method calls

Some codebases call SDK methods through a type cast. Wrap the cast call inline so that `this` stays bound to the parent object; don't extract the method to a variable:

typescript

// ✅ Works — anthropic.messages.create stays as a property access, `this` preserved
const response = await aw.trackLLM(
  'action',
  () => (anthropic.messages.create as unknown as Fn)({ ...args }),
  { model },
)

// ❌ Breaks — extracting to a variable loses `this` binding
const create = anthropic.messages.create  // Now `this` is undefined when called

4.6 — Pace yourself

After wrapping each major group (routes, lib helpers, cron), run:

bash

tsc --noEmit | grep -cE "error TS"

It should match the baseline. If new errors appear in files you wrapped, stop and ask me — most often it's an indentation issue from the wrap, not a real type error.

Phase 5 — Verify in the dashboard

Trigger one action per pattern and check the dashboard. The expected state:

One session per request, not one per LLM call (the main effect of sessionScope: 'persistent')
chain_valid: true — proves endSession() ran and the SHA-256 chain closed
Real cost > $0 and tokens > 0 on LLM events. Cost = 0 on a streaming route usually means the SDK is too old — check aiwatcher@^0.2.2 is installed.
agent_name discriminates per workflow — open the /agents page; multiple distinct rows, not one row for the whole app.
productContext is populated — session detail shows feature, route, user_action, and entity_id where derivable.
Empty sessions for early-return paths are expected, not a bug — if a wrapped route handler returns 401 or 404 before doing AI work, the session exists but has 0 events. That's correct.

Common pitfalls (the things real apps hit)

🟥 The `fetch()`-between-your-own-routes gotcha (THE big one)

If your scheduler/webhook/cron route does await fetch('/api/something-else') to trigger work on another route in your own app, you'll get two separate sessions — one for the caller (empty), one for the worker (has all the events).

This is how serverless HTTP works — every fetch is a brand-new function invocation with its own session. The SDK doesn't propagate session IDs across HTTP boundaries.

Detect it:

text

grep -rn "fetch.*api/" --include="*.ts" | grep -v node_modules

Three options:

Option	What	Effort
(a) Accept it	Two sessions per workflow. Manually correlate by timestamp. Watch out: scheduler sessions show $0, which is correct but counter-intuitive.	Zero
(b) Correlate via `workflow_id`	Generate a `workflow_id` in the scheduler, pass it as a query param to the worker, set both sessions' `productContext.workflow_id` to it. Now the dashboard can group them.	~5 lines
(c) Refactor to a shared `lib/` helper	Extract the AI work into a lib helper. Both scheduler and worker route call the lib directly (no fetch). Everything lands on one session.	Real refactor

The workflow_id pattern:

typescript

// Scheduler (cron, webhook, etc.):
const workflowId = crypto.randomUUID()
await withSession(async () => {
  await fetch(`${url}?workflow_id=${workflowId}`, { /* ... */ })
}, {
  agentName: 'daily-report-cron',
  productContext: { workflow_id: workflowId, /* ... */ },
})

// Worker route:
export async function POST(req: NextRequest) {
  const workflowId = new URL(req.url).searchParams.get('workflow_id') ?? undefined
  return withSession(async () => { /* AI work */ }, {
    agentName: 'report-generator',
    productContext: { workflow_id: workflowId, /* ... */ },
  })
}

🟥 Don't wrap routes that don't do AI

Wrapping a CRUD endpoint or a debug route in withSession() produces an empty session every time it's called. The dashboard fills up with noise. Only wrap routes that reach an aw.track* call, either directly (anthropic.messages.create inline) or indirectly (calls a wrapped lib helper).

🟥 Streaming routes need `endSession` BEFORE `controller.close()`

If you reverse the order, you can leak sessions on early-return double-close paths. Always:

typescript

} finally {
  await aw.endSession()     // First
  controller.close()         // Then
}

🟥 Looping helpers produce dense timelines

A helper that loops (e.g. one Claude call per slot per prospect) will fire N events per invocation. A session running such a helper could have 50+ events. This is correct — each event has accurate per-call token data. Customers may complain about timeline density; that's a dashboard UX issue, not a wrap issue.

🟥 Pre-existing TS errors don't get worse from wrapping

If your codebase has 42 TS errors before wrapping, it should still have 42 after, because guards like if (!x) return keep narrowing types as long as they stay at the top of the wrapped closure. Don't fix pre-existing errors in the wrap PR; put those in separate PRs.

🟥 `anthropic` instance scope varies — wrap doesn't care

Some helpers have const anthropic = new Anthropic(...) at module scope. Some inside the function. Some receive it as a parameter. The wrap just uses whatever's in scope:

typescript

const response = await aw.trackLLM('action', () => anthropic.messages.create(/* ... */))

No refactor needed.

🟥 Don't extract API calls to variables outside the `trackLLM` arrow

Same this-binding issue as the cast case. Always inline:

typescript

// ❌ Don't:
const create = anthropic.messages.create
await aw.trackLLM('action', () => create(args))   // `this` is undefined

// ✅ Do:
await aw.trackLLM('action', () => anthropic.messages.create(args))

Known SDK limitations (current as of `aiwatcher@0.2.2`)

These are real constraints to be aware of — workarounds shown:

Limitation	Workaround
`session_name` not exposed in JS SDK (Python has it — shows a human-readable label in the dashboard)	None today. Future SDK release. Use `productContext.feature` + `entity_id` for context in the meantime.
`outcomeContext` is set at call time, but outcomes happen later (accepted/sent/discarded by the user)	The only outcome reliably knowable at call time is `failed: true` (in a catch block). For post-hoc outcomes (accept, edit, send), wait for SDK enhancement that exposes `attachOutcome(sessionId, ctx)`.
Same-user concurrent requests share a session (`getSession` keys by `userId`)	Acceptable for most apps. Truly concurrent same-user requests (one user opens two browser tabs and triggers two AI actions simultaneously) will interleave events on one session. Fix requires AsyncLocalStorage in the SDK.
No cross-request session correlation (the `fetch()` gotcha above)	Use `workflow_id` in `productContext` to correlate manually.
`humanId` override on `startSession` not exposed (would let cron use `'cron@<app>'` instead of `'anonymous'`)	Acceptable — `'anonymous'` is correct for cron. Identify cron sessions by `agent_name` + `user_action: 'scheduled-run'`.
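
For illustration only: the same-user concurrency limitation comes down to where the "current session" is stored. The sketch below is a Python analog of the AsyncLocalStorage fix named in the table, using contextvars, which plays a similar role for asyncio tasks. `current_session`, `track`, and `handle_request` are hypothetical names, not SDK APIs.

```python
import asyncio
import contextvars
from typing import Dict, List

# Each asyncio task gets its own copy of the context, so this value is per-request.
current_session: contextvars.ContextVar[str] = contextvars.ContextVar("current_session")

events: Dict[str, List[str]] = {}


async def track(action: str) -> None:
    """Record an event on whichever session the calling task opened."""
    await asyncio.sleep(0)  # yield so the two requests interleave
    events.setdefault(current_session.get(), []).append(action)


async def handle_request(session_id: str, actions: List[str]) -> None:
    current_session.set(session_id)  # visible only inside this task's context
    for action in actions:
        await track(action)


async def main() -> None:
    # Two concurrent requests from the same user, e.g. two browser tabs.
    await asyncio.gather(
        handle_request("tab-1", ["draft", "polish"]),
        handle_request("tab-2", ["summarize"]),
    )


asyncio.run(main())
```

Keying the session by task context instead of by user ID is what keeps the two timelines apart in this sketch.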

Phase summary checklist

Before declaring done:

Phase 1 discovery table reviewed with me
Phase 2 decisions all answered
lib/aiwatcher.ts created, env vars set
Every route handler that does AI work is wrapped (non-streaming → withSession(), streaming → manual startSession/endSession)
Every shared lib AI call is wrapped with aw.trackLLM or aw.track
Every cron route that does AI work is wrapped
tsc --noEmit error count matches the Phase 1.7 baseline
One non-streaming action triggered, session shows correct agent_name, chain_valid: true, real cost/tokens, productContext populated
One streaming action triggered, same checks pass
/agents view shows multiple discriminated workflows, not a single app-name row
fetch()-between-routes patterns identified and one of the three mitigations applied
All known SDK limitations are explicitly accepted (or mitigation plumbed)

Verify it's working

After your first instrumented call, open the dashboard:

Your Local → Sessions — appears within 30 seconds of collector start
Your Apps → [your app] → Sessions — appears within seconds of first call

Each session shows the human ID, session name, model and token counts, estimated cost, and a full event timeline with input metadata.

What is and isn't captured

Captured

Session metadata: human_id, agent_name, framework, model
Token counts per call
Estimated cost at current rates
Latency per call
Input metadata you pass
Output summary you pass

Not captured

Raw prompt text (unless explicitly passed)
Raw model output (unless explicitly passed)
Anything from your database or user records
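
One way to stay on the "metadata, not content" side of this list is to derive a small description of the input and pass that instead of the text itself. The sketch below is a hypothetical helper, not part of the SDK; the field names are illustrative.

```python
from typing import Any, Dict


def describe_document(doc_text: str, doc_type: str) -> Dict[str, Any]:
    """Build input metadata that describes a document without including its text."""
    return {
        "doc_type": doc_type,
        "char_count": len(doc_text),
        "line_count": doc_text.count("\n") + 1,
    }


sample_text = "Invoice #1042\nTotal due: $310.00"
metadata = describe_document(sample_text, "invoice")
```

The resulting dictionary can be passed as the 'input' value in the options shown in Path B.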

Supported models and cost rates

Model	Provider	Pricing (per 1M tokens in / out)
claude-sonnet-4-20250514	Anthropic	$3.00 / $15.00
claude-haiku-4-5-20251001	Anthropic	$0.80 / $4.00
claude-opus-4-20250514	Anthropic	$5.00 / $25.00
gpt-4o	OpenAI	$2.50 / $10.00
gpt-4o-mini	OpenAI	$0.15 / $0.60
llama3 via Ollama	Local	$0

Codex CLI and Claude Code sessions are subscription-billed. Token usage is shown in the dashboard but no dollar cost is applied. Configure this under Settings → AI Provider Billing.
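
As a worked example of how an estimate follows from the table for API-billed calls, the sketch below multiplies token counts by the per-million rates listed above. `RATES` and `estimate_cost` are hypothetical names; the dashboard's own calculation may differ, for instance in rounding.

```python
from typing import Dict, Tuple

# (input, output) USD per 1M tokens, copied from the table above.
RATES: Dict[str, Tuple[float, float]] = {
    "claude-sonnet-4-20250514": (3.00, 15.00),
    "claude-haiku-4-5-20251001": (0.80, 4.00),
    "claude-opus-4-20250514": (5.00, 25.00),
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one call from its token counts."""
    rate_in, rate_out = RATES[model]
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000


# 1,200 input tokens and 300 output tokens on gpt-4o:
# (1200 * 2.50 + 300 * 10.00) / 1,000,000 = 0.006 USD
cost = estimate_cost("gpt-4o", 1_200, 300)
```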

Questions

Contact the team directly:

danny.lo@fincastai.io
Or open an issue at github.com/ai-watcher/aiwatcher-local

Get on the early access list.

We’re working with a small number of design partners in 2026.

We reply within 48 hours.
