# CLAUDE.md — Chronis AI Agents

> This file is read automatically by Claude Code on launch. It is the source of truth for how this project must be built. Follow it on every task. When in doubt, consult `docs/PRD.md` (what to build) and `docs/design-system.md` (how it must look and behave).

## What this project is
Chronis AI is a self-hosted, multi-tenant AI agent platform. A business deploys specialised AI agents (Sales, Marketing, PR, Retention, Deliverability, Lead Gen) that prospect, draft, and act — but never take an external action without human approval via Slack. First business: Mailblaze (email marketing platform). First agent: Sales Agent.

## Tech stack (do not substitute without asking)
- Laravel 11, PHP 8.3
- Livewire 3 + Alpine.js + Tailwind CSS (frontend)
- PostgreSQL 16, Redis 7
- Laravel Horizon (queues), Laravel Scheduler (cron)
- Docker Compose (bare metal, self-hosted)
- OpenRouter (LLM routing — NOT a direct Anthropic SDK)
- Slack (approvals), Pipedrive (CRM), Apollo (prospecting), Mailblaze (send), Gmail/IMAP (reply detection)

## Non-negotiable architecture rules
These come from a reviewed spec. Violating them creates real bugs. Respect them on every task.

1. **Skills are TYPED and run as a PIPELINE — never one mega-prompt.**
   Four skill types: `context` (injects info, no LLM call), `evaluator` (scores/judges, own cheap LLM call, can short-circuit the pipeline), `transformer` (rewrites content, own LLM call, output must be inspectable), `tool` (calls external API, no LLM call). Each transformer/evaluator stage writes its own `agent_tasks` row.

2. **Every external write is IDEMPOTENT.**
   Before sending an email or creating a Pipedrive lead, check the `idempotency_keys` table and skip the write if the key is already recorded. Queue jobs retry, so double-sends and duplicate leads are unacceptable. One `agent_runs` row per execution carries the run-level idempotency key.

3. **Follow-up cadence is DATA, not code.**
   Cadence lives in `sequences` / `sequence_steps`. Never hardcode "day 4, day 9" logic in PHP. Steps carry `delay_days`, `channel`, `requires_approval`, `stop_on_reply`.

4. **Reply detection uses Gmail/IMAP, NOT Mailblaze.**
   Mailblaze only sends and reports opens/clicks/bounces — it does not parse inbound replies. Verified against their API docs. Do not build reply detection on Mailblaze.

5. **Budgets are in USD, checked atomically BEFORE each LLM call.**
   Require `spend_today_usd + estimated_call_cost <= token_budget_daily_usd`, and check and record the spend under a lock so parallel jobs can't overshoot. Warn at 80%, block at 100%.

6. **POPIA/GDPR compliance is required.**
   Prospects are real people. Honour opt-out, log lawful basis, respect `data_retention_days`, include unsubscribe paths. Soft-deletes + scheduled purge.

7. **Approvals expire.** Set `expires_at` (~72h). Expired approvals auto-cancel and notify — never silently send late or vanish.

8. **Secrets never live raw in DB rows or images.** Use `integration_secrets` (AES-256-GCM, key rotation). Inject via `.env` / Docker secrets.
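
Rule 1 can be pictured with a minimal sketch in plain PHP (no Laravel). Every name here (`SkillType`, `StageResult`, `runPipeline`, the skill names) is hypothetical, LLM and API calls are stubbed with closures, and an array stands in for `agent_tasks` rows. It only illustrates the shape: stages run in order, an evaluator may halt the run, and each evaluator/transformer stage records its own row.

```php
<?php
declare(strict_types=1);

// Hypothetical names throughout; LLM and API calls are stubbed with closures.
enum SkillType: string
{
    case Context = 'context';
    case Evaluator = 'evaluator';
    case Transformer = 'transformer';
    case Tool = 'tool';
}

final class StageResult
{
    public function __construct(
        public readonly array $payload,
        public readonly bool $halt = false, // only honoured for evaluators
    ) {}
}

/**
 * Runs stages in order. Each stage is [name, SkillType, callable(array): StageResult].
 * Returns the final payload plus the rows that would be written to `agent_tasks`.
 */
function runPipeline(array $stages, array $payload): array
{
    $taskRows = [];
    foreach ($stages as [$name, $type, $run]) {
        $result = $run($payload);
        $payload = $result->payload;

        // Only evaluator and transformer stages get their own task row.
        if ($type === SkillType::Evaluator || $type === SkillType::Transformer) {
            $taskRows[] = ['skill' => $name, 'type' => $type->value, 'output' => $payload];
        }
        if ($type === SkillType::Evaluator && $result->halt) {
            break; // short-circuit: later stages never run
        }
    }
    return ['payload' => $payload, 'task_rows' => $taskRows];
}

$stages = [
    ['company_profile', SkillType::Context, fn (array $p) => new StageResult($p + ['company' => 'Mailblaze'])],
    // Stub evaluator: pretends the prospect scored too low and halts the run.
    ['icp_score', SkillType::Evaluator, fn (array $p) => new StageResult($p + ['score' => 0.2], halt: true)],
    ['draft_email', SkillType::Transformer, fn (array $p) => new StageResult($p + ['draft' => '...'])],
];

$out = runPipeline($stages, ['prospect' => 'a@example.com']);
```

In this run the evaluator halts the pipeline, so `draft_email` never executes and only one task row is recorded.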

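The threshold arithmetic in rule 5 can be sketched as below. This is plain PHP under stated assumptions: amounts are held as integer micro-USD to avoid float rounding, the 80% warning is evaluated against the projected spend (the spec may intend current spend instead), and `BudgetDecision` and `checkBudget` are hypothetical names. The lock itself is not shown; in the real service the check and the spend update would need to happen inside the same lock or transaction.

```php
<?php
declare(strict_types=1);

enum BudgetDecision: string
{
    case Allow = 'allow';
    case Warn = 'warn';   // call proceeds, but an alert should fire
    case Block = 'block'; // call must not be made
}

/**
 * All amounts are integer micro-USD (1 USD = 1_000_000).
 * Assumed to be called while holding the per-company budget lock (not shown).
 */
function checkBudget(int $spendToday, int $estimatedCost, int $dailyBudget): BudgetDecision
{
    $projected = $spendToday + $estimatedCost;

    // Rule: spend_today_usd + estimated_call_cost <= token_budget_daily_usd
    if ($projected > $dailyBudget) {
        return BudgetDecision::Block;
    }
    // 80% threshold in integer maths: projected / budget >= 4 / 5
    if ($projected * 5 >= $dailyBudget * 4) {
        return BudgetDecision::Warn;
    }
    return BudgetDecision::Allow;
}

$budget = 10_000_000; // 10.00 USD

$low  = checkBudget(2_000_000, 50_000, $budget); // projected 2.05 USD
$near = checkBudget(7_990_000, 50_000, $budget); // projected 8.04 USD, past 80%
$over = checkBudget(9_980_000, 50_000, $budget); // projected 10.03 USD, over budget
```
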
## Model strategy (verified OpenRouter slugs, May 2026)
- `anthropic/claude-opus-4.7` — heavy reasoning, edge-case escalation
- `anthropic/claude-sonnet-4.6` — default drafting/research
- `anthropic/claude-haiku-4.5` — high-volume cheap work (scoring, classification)

Tier per skill, not per agent. Pin slugs in production; floating aliases (`anthropic/claude-sonnet`) only in dev.
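
One possible shape for "tier per skill" is a lookup from skill name to tier to pinned slug. The sketch below is illustrative only: the constant names, the tier labels, the skill names, and `modelForSkill` are all assumptions; only the three slugs come from the list above.

```php
<?php
declare(strict_types=1);

// Pinned slugs from the model strategy above, keyed by a hypothetical tier label.
const MODEL_TIERS = [
    'heavy'   => 'anthropic/claude-opus-4.7',
    'default' => 'anthropic/claude-sonnet-4.6',
    'cheap'   => 'anthropic/claude-haiku-4.5',
];

// Hypothetical skill names mapped to a tier. The tier is chosen per skill,
// so two skills used by the same agent can resolve to different models.
const SKILL_TIERS = [
    'icp_score'        => 'cheap',
    'draft_email'      => 'default',
    'edge_case_review' => 'heavy',
];

function modelForSkill(string $skill): string
{
    // Unknown skills fall back to the default drafting tier (an assumption).
    $tier = SKILL_TIERS[$skill] ?? 'default';
    return MODEL_TIERS[$tier];
}
```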

## Design & UX
All UI must follow `docs/design-system.md`: flat, calm, operational; two font weights (400/500); sentence case; semantic status colours (never colour alone); WCAG 2.2 AA; mandatory dark mode. Build shared Blade components (`<x-badge>`, `<x-agent-card>`, `<x-approval-item>`) so consistency is structural, not copy-paste. The approval flow is the hero surface — fast, scannable, unambiguous about what will send.

## Build order (follow this sequence)
1. Docker Compose + Laravel skeleton + multi-company auth + global scopes
2. All migrations (see `docs/PRD.md` §4 — including agent_runs, sequences, idempotency_keys, integration_secrets)
3. OpenRouter service + usage/cost capture + atomic budget guard
4. Skills engine (typed: context/evaluator/transformer/tool)
5. Agent engine (pipeline executor + agent_runs)
6. Approval workflow (queue jobs + Slack Block Kit + signature verify + expiry)
7. Idempotency layer
8. Memory manager + retention purge
9. Event/webhook system (dedup_hash, dispatcher)
10. Sales agent (ICP, sequences data, Apollo + Pipedrive + Mailblaze + Gmail/IMAP)
11. UI (dashboard, config, approvals, sequences, observability) per design-system.md
12. Playground, audit, notifications, dead-letter/needs-attention surface
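
For the "signature verify" part of step 6, Slack signs each request with HMAC-SHA256 over the string `v0:{timestamp}:{raw body}` and sends the result in the `X-Slack-Signature` header alongside `X-Slack-Request-Timestamp`. A minimal sketch in plain PHP follows; the function name, parameter names, and the 300-second tolerance are assumptions, and in the app this would likely live in middleware that reads the raw request body.

```php
<?php
declare(strict_types=1);

/**
 * Checks a Slack request signature (v0 scheme).
 * $timestamp is the X-Slack-Request-Timestamp header value,
 * $signature is the X-Slack-Signature header value,
 * $rawBody is the unparsed request body.
 */
function slackSignatureIsValid(
    string $signingSecret,
    string $timestamp,
    string $rawBody,
    string $signature,
    int $now,
    int $maxSkewSeconds = 300,
): bool {
    // Reject non-numeric or stale timestamps to limit replay attacks.
    if (!ctype_digit($timestamp) || abs($now - (int) $timestamp) > $maxSkewSeconds) {
        return false;
    }
    $expected = 'v0=' . hash_hmac('sha256', "v0:{$timestamp}:{$rawBody}", $signingSecret);

    // hash_equals compares in constant time.
    return hash_equals($expected, $signature);
}
```

Passing `$now` in explicitly, rather than calling `time()` inside, keeps the check easy to cover in a feature test.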

## Conventions
- Service classes in `app/Services/{Agent,LLM,Skills}`. Keep them fully implemented — no stubs or TODO placeholders.
- Every company-owned model gets a global scope for `company_id`.
- Jobs are idempotent and safe to retry.
- Write a feature test alongside each service (Pest preferred).
- Commit in small, logical units with clear messages.
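
The "idempotent and safe to retry" convention, together with architecture rule 2, can be sketched as follows. This is plain PHP with an in-memory stand-in for the `idempotency_keys` table; `IdempotencyStore`, `idempotencyKey`, `sendOnce`, and the key format are hypothetical. In the real app the claim would more likely be an insert against a unique index, so that two workers cannot both win.

```php
<?php
declare(strict_types=1);

// In-memory stand-in for the `idempotency_keys` table (hypothetical class).
final class IdempotencyStore
{
    /** @var array<string, true> */
    private array $claimed = [];

    // Returns true only for the first caller of a given key.
    public function claim(string $key): bool
    {
        if (isset($this->claimed[$key])) {
            return false;
        }
        $this->claimed[$key] = true;
        return true;
    }
}

// Hypothetical key derivation: the same run, action, and target always hash to the same key.
function idempotencyKey(string $runId, string $action, string $target): string
{
    return hash('sha256', implode('|', [$runId, $action, $target]));
}

// Performs the external write at most once per key; returns whether it ran.
function sendOnce(IdempotencyStore $store, string $key, callable $send): bool
{
    if (!$store->claim($key)) {
        return false; // an earlier attempt already claimed this write
    }
    $send();
    return true;
}
```

A retried job that calls `sendOnce` with the same key becomes a no-op. What should happen when the write fails after the key is claimed (release the key, or track a status column) is left open here and should follow the PRD.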

## How to work in this repo
- Read `docs/PRD.md` and `docs/design-system.md` before starting a new area.
- When you finish a build-order step, summarise what changed and what's next.
- If a spec detail is ambiguous, ask rather than guess — especially on the architecture rules above.
