DevAgent — Architecture & Usage Guide

Autonomous ticket-to-fix pipeline. Receives Jira tickets via webhook, classifies them, investigates the codebase, and posts structured triage reports back to Jira. Runs as a systemd service on Hetzner (136.243.36.27) as the devagent user.

1. System Overview

DevAgent is an autonomous development system that processes support tickets end-to-end. It uses a two-phase architecture: Phase A classifies the ticket in ~6 seconds, and Phase B performs a deep codebase investigation in ~4 minutes. The system spawns Claude Code CLI as subprocesses, uses Mem0 (Qdrant + Ollama) for a learning loop, and posts structured reports back to Jira.

Component	Technology
Runtime	Node.js 20+, Express
AI Engine	Claude Code CLI (spawned as subprocesses via `claude --print`)
Memory	Mem0 — Qdrant (vector store at :6333) + Ollama (bge-m3 embeddings at :11434)
Ticket System	Jira REST API (webhooks in, comments out)
Subagents	Claude Code agent definitions (`.md` files): triage, dev-planner, dev-executor
Host	Hetzner AX41 (136.243.36.27), systemd service, `devagent` user

2. Architecture Diagram

Current two-phase pipeline (solid) and planned future phases (dashed).

Current Pipeline (live): two-phase triage

Jira Webhook (POST /webhooks/jira) → Express (port 9200) → Phase A: Classify (~6s, max-turns 3) → Mem0 Search (prior context) → Phase B: Investigate (~4 min, max-turns 25) → Jira Comment (REST API) → Mem0 Store (learning loop)

Planned Fix Pipeline: post-approval automation

Phase B Output (triage report) → Approval Gate (Slack approve/reject) → DevPlanner (subagent) → DevExecutor (subagent) → Pull Request (GitHub)

Shared Infrastructure

Component	Details
Qdrant	localhost:6333, vector store
Ollama	localhost:11434, bge-m3 embeddings
Jira REST API	Comments, transitions
Claude Code CLI	Spawned as subprocess

3. Source Files

Located at /root/repos/datastudios-dev-agent/ on Hetzner.

File	Purpose
`src/index.js`	Express app and orchestration. Entry point with `classifyTicket()` → `investigateTicket()` → `handleTicket()` flow. Also contains test endpoints (`/test/classify`, `/test/triage`, `/health`).
`src/triage-ai.js`	Prompt builders: `buildClassifyPrompt()` (Phase A — slim client registry, returns JSON), `buildInvestigationPrompt()` (Phase B — full prompt with Mem0 context, data assets, migration awareness), `buildTriagePrompt()` (legacy single-phase).
`src/core/session-manager.js`	Spawns `claude` CLI as subprocesses via `child_process.spawn()`. Manages session state in `/tmp/triage-agent/sessions/`. 30-minute timeout. Handles Phase A (`claude --print --max-turns 3`) and Phase B (`claude --print --agent triage --max-turns 25`).
`src/core/client-registry.js`	Loads and caches `config/clients.json`. Provides client lookup by Jira project key, email domain, or client ID.
`src/mem0-client.js`	Searches Qdrant directly using Ollama bge-m3 embeddings. `searchMemories()` queries with `user_id: devagent-{client_id}` filter. `buildStoreInstruction()` appends learning-loop instructions to triage prompt.
`src/jira-updater.js`	Posts ADF-formatted comments and transitions issues via Jira REST API. Used after Phase B to post the investigation report.
`src/webhook-listener.js`	Express router for Jira webhooks at `POST /webhooks/jira`. Validates shared secret header, extracts ticket data from webhook payload.
`config/clients.json`	Client registry — maps clients to repos, functions, data assets, external APIs. See Section 5 for schema details.
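
The shared-secret check listed for `src/webhook-listener.js` could look roughly like the sketch below. It is illustrative only, not the project's actual code: the header name `x-webhook-secret`, the function names `secretsMatch` and `requireWebhookSecret`, and the 401 response shape are all assumptions.

```javascript
const crypto = require('node:crypto');

// Compare two secrets without leaking timing information.
// Hashing first gives both buffers the same length, which
// crypto.timingSafeEqual requires.
function secretsMatch(provided, expected) {
  if (typeof provided !== 'string' || typeof expected !== 'string') return false;
  const a = crypto.createHash('sha256').update(provided).digest();
  const b = crypto.createHash('sha256').update(expected).digest();
  return crypto.timingSafeEqual(a, b);
}

// Express-style middleware factory (hypothetical name).
// Assumption: the secret arrives in an `x-webhook-secret` header.
function requireWebhookSecret(expectedSecret, headerName = 'x-webhook-secret') {
  return (req, res, next) => {
    if (secretsMatch(req.headers[headerName], expectedSecret)) return next();
    res.status(401).json({ error: 'invalid webhook secret' });
  };
}
```

A router would mount this ahead of the `POST /webhooks/jira` handler, so payload extraction only runs for authenticated requests.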

4. Subagent Definitions

Located at /root/.claude/agents/ on Hetzner (symlinked from datastudios-ops/agents/).

triage.md (Phase B)

Read-only investigator. Explores the target repo, reads CLAUDE.md, examines code, and produces a structured triage report. Has a Data Investigation Protocol for pipeline issues (reads dbt models, checks OData sources, verifies field population).

Model: claude-sonnet-4-6
Max turns: 25
Tools: Read, Glob, Grep, Bash, Agent
Output: Structured report: TRIAGE, ROOT CAUSE, AFFECTED FILES, PROPOSED FIX, COMPLEXITY, CONFIDENCE

dev-planner.md (Planned)

Read-only planner. Produces implementation plans with affected files, ordered steps, downstream impact analysis, and test requirements. Has a Data Enhancement Protocol for pipeline schema changes.

Model: claude-sonnet-4-6
Tools: Read, Glob, Grep, Bash, Agent, WebFetch
Output: Implementation plan with files, steps, impact, tests

dev-executor.md (Planned)

Write-capable executor. Creates feature branches, implements changes, runs tests, and commits. Uses isolation: worktree for safe, isolated execution.

Model: claude-sonnet-4-6
Tools: Read, Write, Edit, Glob, Grep, Bash, Agent
Isolation: Git worktree (creates feature branch)

5. Client Registry

config/clients.json maps clients to repos, functions, data assets, and external APIs. Phase A uses a slim version (no functions metadata) for fast classification. Phase B gets the full registry.

{
  "clients": [
    {
      "client_id": "hmr-designs",
      "display_name": "HMR Designs",
      "jira_project": "HD", // ①
      "email_domains": ["hmrdesigns.com"],
      "repos": [
        {
          "name": "hmr-aws-lambda-functions",
          "path": "/root/repos/hmr-aws-...",
          "tech_stack": ["Python", "AWS Lambda"],
          "key_paths": { "lambdas": "src/lambdas/", "shared": "src/shared/" },
          "functions": [...] // ②
        }
      ],
      "external_apis": { // ③
        "nutshell": {
          "base_url": "https://app.nutshell.com/...",
          "auth_type": "basic",
          "investigation_guide": "..."
        }
      },
      "data_assets": { ... } // ④
    }
  ]
}

① Jira Project Key

Used to route incoming webhooks to the correct client. Phase A matches the ticket's project key against this field.

② Functions Array

For multi-function repos (e.g., 20 Lambdas in one repo). Each function has name, path, description, and example_requests. Stripped from the slim registry to keep Phase A fast.

③ External APIs

API credentials and investigation guides for external systems (e.g., Nutshell CRM). Includes auth type, methods, and step-by-step investigation playbooks.

④ Data Assets

Source systems, database schemas, and key tables. Used for data-type tickets. Includes legacy flags and migration targets.
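
To make the slim registry concrete, the sketch below strips the `functions` array from every repo while leaving the rest of the registry intact. The function name `toSlimRegistry` is hypothetical, and whether the real slim version drops any other fields is not stated here, so this removes only `functions`.

```javascript
// Return a copy of the registry without per-repo `functions` metadata.
// The input object is not mutated.
function toSlimRegistry(registry) {
  return {
    clients: registry.clients.map((client) => ({
      ...client,
      // Rest-destructuring drops `functions` and keeps every other repo key.
      repos: (client.repos || []).map(({ functions, ...repo }) => repo),
    })),
  };
}
```

Phase A would receive `JSON.stringify(toSlimRegistry(registry))` in its prompt, while Phase B keeps the full object.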

Currently Configured Clients

Client	Jira Key	Repos
datastudios	DAT	lead-generator, article-writer, contactos
creme-collective	CC	creme-analytics (legacy), creme-report-automation, creme-elt
hmr-designs	HD	hmr-aws-lambda-functions (20 Lambdas), hmr-nutshell-integration (8 Lambdas)
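
Routing a ticket to one of these clients can be sketched as a lookup by Jira project key with a fallback to the reporter's email domain. The names `projectKeyOf` and `findClient`, and the precedence (project key first, then email domain), are assumptions for illustration; the field names `jira_project` and `email_domains` come from the registry schema above.

```javascript
// Extract the Jira project key from a ticket ID such as "HD-92".
function projectKeyOf(ticketId) {
  return String(ticketId).split('-')[0].toUpperCase();
}

// Find the client for a ticket. Returns null when nothing matches.
// Assumption: a project-key match wins over an email-domain match.
function findClient(registry, { ticketId, reporterEmail } = {}) {
  const key = ticketId ? projectKeyOf(ticketId) : null;
  const domain = reporterEmail ? reporterEmail.split('@').pop().toLowerCase() : null;
  return (
    registry.clients.find((c) => key && c.jira_project === key) ||
    registry.clients.find((c) => domain && (c.email_domains || []).includes(domain)) ||
    null
  );
}
```

For example, `findClient(registry, { ticketId: 'HD-92' })` would resolve to the `hmr-designs` entry, given the registry shown in this section.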

6. How It Works

End-to-end flow from ticket creation to Jira comment.

Ticket Created

Jira webhook fires to POST /webhooks/jira. The webhook listener validates the shared secret header and extracts ticket ID, summary, description, and reporter email from the payload.

Phase A: Classify (~6 seconds)

Runs claude --print --max-turns 3 with a slim client registry (no functions metadata). Returns JSON: {client_id, repo, issue_type, severity, summary}. The issue_type determines which investigation protocol Phase B uses (bug, data, feature, infra).

Mem0 Search

Embeds the ticket summary via Ollama bge-m3 and searches Qdrant for prior context using a user_id: devagent-{client_id} filter. If relevant memories exist, they are injected into the Phase B prompt as additional context.

Phase B: Investigate (~2–4 minutes)

Runs claude --print --agent triage --max-turns 25 with cwd set to the target repo path from Phase A. The triage agent reads CLAUDE.md, explores code with Glob/Grep/Read, and produces a structured report with: triage summary, root cause analysis, affected files, proposed fix, complexity rating, and confidence level. For data-type issues, follows the Data Investigation Protocol (dbt models, OData sources, field population checks).

Post to Jira

Adds the investigation report as an ADF-formatted comment on the original ticket. The comment includes all sections of the structured report (typically 3,000–5,000 characters).

Learning Loop

The triage agent's prompt includes instructions to store findings in Mem0 for future reference. Key findings (root causes, patterns, recurring issues) are embedded and stored in Qdrant for retrieval on subsequent tickets.
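
The flow above can be sketched as a single orchestration function. This is a simplified illustration under stated assumptions, not the code in `src/index.js`: the name `runPipelineSketch`, the injected `deps` object, and every function signature are hypothetical. The learning loop has no separate step here because, as described above, the store instructions travel inside the Phase B prompt.

```javascript
// Run one ticket through classify -> memory search -> investigate -> comment.
// Collaborators are injected so the flow can be exercised with stubs.
async function runPipelineSketch(ticket, deps) {
  const { classify, searchMemories, investigate, postComment } = deps;

  // Phase A: fast classification, yields client_id, repo, issue_type, etc.
  const classification = await classify(ticket);

  // Mem0 search, scoped to the client via the user_id convention.
  const userId = `devagent-${classification.client_id}`;
  const memories = await searchMemories(ticket.summary, userId);

  // Phase B: deep investigation with prior context injected.
  const report = await investigate(ticket, classification, memories);

  // Post the structured report back to the original ticket.
  await postComment(ticket.ticketId, report);

  return { classification, memoriesUsed: memories.length, report };
}
```

Injecting the collaborators keeps each stage replaceable, which is convenient when testing the sequence without spawning the CLI or calling Jira.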

7. Infrastructure

All components run on the Hetzner AX41 server.

Component	Details
Service	`systemctl status devagent` — runs as `devagent` user (not root)
Port	9200
Repo path	`/root/repos/datastudios-dev-agent/`
Session files	`/tmp/triage-agent/sessions/`
Claude auth	`/home/devagent/.claude/.credentials.json` (OAuth, refreshed every 4h via cron)
Token refresh	`/root/scripts/refresh-claude-token.sh` — cron every 4 hours. OAuth endpoint: `https://platform.claude.com/v1/oauth/token`
AWS access	`/home/devagent/.aws/` (hmr-designs profile for CloudWatch)
Mem0 — Qdrant	`localhost:6333`
Mem0 — Ollama	`localhost:11434` (bge-m3 embedding model)
Agent definitions	`/root/.claude/agents/` (symlinked from `datastudios-ops/agents/`)
Logs	`journalctl -u devagent -f`

8. Test Endpoints

Use these endpoints to test classification and triage without triggering real Jira webhooks.

Health Check

curl http://localhost:9200/health

Returns JSON with status, uptime, and version.

Classify Only

curl -X POST \
  http://localhost:9200/test/classify \
  -H 'Content-Type: application/json' \
  -d '{"ticketId":"HD-92", "summary":"Nutshell not converting", "description":"..."}'

Returns Phase A classification JSON. ~6 seconds.
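
A caller may want to validate the classification before handing it to Phase B. The sketch below checks for the five fields and the four issue types named in this guide. The function name `parseClassification` is hypothetical, and the step that tolerates extra text around the JSON is an assumption about model output, not documented behavior.

```javascript
const ISSUE_TYPES = ['bug', 'data', 'feature', 'infra'];
const REQUIRED_FIELDS = ['client_id', 'repo', 'issue_type', 'severity', 'summary'];

// Parse and validate Phase A output. Throws on malformed results.
function parseClassification(rawOutput) {
  // Assumption: the JSON may be wrapped in surrounding text, so take the
  // span from the first "{" to the last "}".
  const start = rawOutput.indexOf('{');
  const end = rawOutput.lastIndexOf('}');
  if (start === -1 || end <= start) throw new Error('no JSON object in output');

  const parsed = JSON.parse(rawOutput.slice(start, end + 1));
  for (const field of REQUIRED_FIELDS) {
    if (!(field in parsed)) throw new Error(`missing field: ${field}`);
  }
  if (!ISSUE_TYPES.includes(parsed.issue_type)) {
    throw new Error(`unknown issue_type: ${parsed.issue_type}`);
  }
  return parsed;
}
```

Failing loudly here avoids starting a multi-minute Phase B run with an unusable repo or issue type.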

Full Triage

curl -X POST \ http://localhost:9200/test/triage \ -H 'Content-Type: application/json' \ -d '{"ticketId":"HD-92", "summary":"Nutshell not converting", "description":"...", "reporter":"user@hmrdesigns.com"}'

Async — runs Phase A + B, posts result to Jira. Check Jira for the triage comment.

9. Known Gaps & Next Steps

Roadmap with Jira ticket references. Items are ordered by priority.

Item	Status	Ticket
Two-phase triage (classify → investigate)	Done	DAT-38
Subagent definitions (triage, dev-planner, dev-executor)	Done	DAT-46
HMR test case (Nutshell event creation — HD-92)	Done	HD-92
Creme test case (data investigation — DSS-2)	Done	DSS-2
Architecture & usage guide	In Progress	DAT-47
Repo rename (triage-agent → dev-agent)	To Do	DAT-48
Approval Gate (Slack approve/reject before fix)	To Do	DAT-43
Fix on approval (dev-executor integration)	To Do	DAT-44
HMR COB hours test case	To Do	DAT-42
Meeting-prep DevAgent handoff (Step D9)	To Do	DAT-45
Mem0 historical context (seed with past investigations)	To Do	DAT-41
Email → Jira pipeline (support@datastudios.ai auto-creates tickets)	Not Filed	—
Live data access (AWS CLI, external APIs) from triage agent	Not Filed	—

10. Lessons Learned

Hard-won findings from building and testing DevAgent. Read these before making changes.

Non-root is mandatory

Claude Code blocks --dangerously-skip-permissions when running as root. The service must run as a non-root user (devagent). This required creating the user, copying Claude auth + AWS creds, and updating the systemd unit file.

OAuth tokens expire

The Claude OAuth token expires periodically. A cron job runs /root/scripts/refresh-claude-token.sh every 4 hours to refresh it. The token is stored at /home/devagent/.claude/.credentials.json.

Session file ownership

Session files in /tmp/triage-agent/sessions/ created by a previous user (e.g., root) cause EACCES errors after switching to the devagent user. Clean with rm -rf /tmp/triage-agent after user changes.

Phase A must be fast

Phase A classification must complete in <10 seconds. Use a slim client registry (no functions metadata) to keep the prompt small. Current measured time: ~6 seconds with max-turns: 3.

Phase B max-turns: 25

The original max-turns: 10 was too low for deep investigation: the agent ran out of turns before it had fully explored the codebase. 25 is the sweet spot, completing in ~2–4 minutes.
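
The two turn budgets translate into CLI arguments roughly as follows. The flags (`--print`, `--agent triage`, `--max-turns`) and the 30-minute timeout are the ones this guide lists; the names `PHASES`, `buildClaudeArgs`, and `runClaude` are hypothetical, and passing the prompt on stdin is an assumption about how the session manager feeds the CLI.

```javascript
const { spawn } = require('node:child_process');

// Per-phase settings taken from this guide.
const PHASES = {
  classify: { maxTurns: 3 },
  investigate: { maxTurns: 25, agent: 'triage' },
};

// Build the argument list for one phase.
function buildClaudeArgs(phase) {
  const cfg = PHASES[phase];
  if (!cfg) throw new Error(`unknown phase: ${phase}`);
  const args = ['--print'];
  if (cfg.agent) args.push('--agent', cfg.agent);
  args.push('--max-turns', String(cfg.maxTurns));
  return args;
}

// Spawn the CLI and collect stdout.
// Assumption: the prompt is written to stdin.
function runClaude(phase, prompt, { cwd, timeoutMs = 30 * 60 * 1000 } = {}) {
  return new Promise((resolve, reject) => {
    const child = spawn('claude', buildClaudeArgs(phase), { cwd });
    let stdout = '';
    const timer = setTimeout(() => child.kill('SIGTERM'), timeoutMs);
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stdin.on('error', () => {}); // ignore EPIPE if the process dies early
    child.on('error', (err) => { clearTimeout(timer); reject(err); });
    child.on('close', (code) => {
      clearTimeout(timer);
      if (code === 0) resolve(stdout);
      else reject(new Error(`claude exited with code ${code}`));
    });
    child.stdin.end(prompt);
  });
}
```

Keeping the budgets in one table makes it easy to retune a phase without touching the spawn logic.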

Code ≠ live data

The triage agent found a real code bug (validation gap) but missed the primary issue (API instability) because it couldn’t query CloudWatch or the Nutshell API. Live data access is the biggest remaining gap.

Multiple investigation layers

In the HD-92 test, three independent investigations (DevAgent, CloudWatch analysis, developer review) each found real issues at different layers. DevAgent is complementary to human debugging, not a replacement.

Nutshell API reference

An LLM-friendly API reference at docs/NUTSHELL_API_REFERENCE.md in the hmr-nutshell-integration repo enables the triage agent to understand external API interactions. Consider adding similar references for other external APIs.

11. Mem0 Memory Lifecycle

How DevAgent stores, retrieves, and expires memories across triage runs.

Memory Types

Type	Metadata	Expiration	Example
Permanent learning	`type: "permanent_learning"`	Never	Nutshell API intermittently returns non-JSON responses — check CloudWatch first
Known bug	`type: "known_bug"`, `jira_ticket: "HD-93"`	When Jira ticket is Done/Closed	`start_end_valid` validation fails silently when Event End Date is not set

user_id Convention

Client	user_id
HMR Designs	`devagent-hmr-designs`
Creme Collective	`devagent-creme-collective`
DataStudios	`devagent-datastudios`
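
The convention above feeds directly into the Qdrant filter. The sketch below builds the two HTTP requests a search needs: an Ollama embedding call and a Qdrant point search scoped by `user_id`. The collection name `mem0`, the top-level `user_id` payload key, the result limit, and all function names are assumptions; the endpoints are the standard Ollama `/api/embeddings` and Qdrant `/points/search` routes.

```javascript
const OLLAMA_URL = 'http://localhost:11434';
const QDRANT_URL = 'http://localhost:6333';
const COLLECTION = 'mem0'; // assumption: actual collection name may differ

// Apply the user_id convention for a client.
function memoryUserId(clientId) {
  return `devagent-${clientId}`;
}

// Request that turns text into a bge-m3 embedding.
function buildEmbeddingRequest(text) {
  return { url: `${OLLAMA_URL}/api/embeddings`, body: { model: 'bge-m3', prompt: text } };
}

// Request that searches one client's memories by vector similarity.
function buildSearchRequest(vector, clientId, limit = 5) {
  return {
    url: `${QDRANT_URL}/collections/${COLLECTION}/points/search`,
    body: {
      vector,
      limit,
      with_payload: true,
      filter: { must: [{ key: 'user_id', match: { value: memoryUserId(clientId) } }] },
    },
  };
}

// Send both requests with the global fetch available in Node.js 20.
async function searchMemoriesSketch(summary, clientId) {
  const post = async ({ url, body }) => {
    const res = await fetch(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`${url} returned ${res.status}`);
    return res.json();
  };
  const { embedding } = await post(buildEmbeddingRequest(summary));
  const { result } = await post(buildSearchRequest(embedding, clientId));
  return result; // array of { id, score, payload }
}
```

Because the filter is applied server-side, memories from one client never reach another client's Phase B prompt.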

Retrieval Rules (Ticket-Linked Memories)

Step 1: Search Mem0

After Phase A classification, searchMemories() queries Qdrant with the ticket summary and user_id: devagent-{client_id}.

Step 2: Check metadata

For each returned memory, read metadata.type. If permanent_learning, include it directly. If known_bug, proceed to step 3.

Step 3: Verify Jira status

Read metadata.jira_ticket and check its status via Jira REST API. If the ticket is Done or Closed, skip the memory or flag it as resolved. If Open or In Progress, inject it into the Phase B prompt normally.
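
The retrieval rules above can be sketched as a filter over the search results. The function name `selectMemories` and the injected `getTicketStatus` callback are hypothetical. This version skips resolved bugs rather than flagging them, and it drops memories with an unrecognized type, which is an assumption.

```javascript
const RESOLVED_STATUSES = new Set(['Done', 'Closed']);

// Decide which memories are injected into the Phase B prompt.
// `getTicketStatus(ticketKey)` is expected to resolve to a Jira status name.
async function selectMemories(memories, getTicketStatus) {
  const selected = [];
  for (const memory of memories) {
    const meta = memory.metadata || {};

    // Permanent learnings are always included.
    if (meta.type === 'permanent_learning') {
      selected.push(memory);
      continue;
    }

    // Known bugs are included only while their ticket is unresolved.
    if (meta.type === 'known_bug' && meta.jira_ticket) {
      const status = await getTicketStatus(meta.jira_ticket);
      if (!RESOLVED_STATUSES.has(status)) selected.push(memory);
    }
  }
  return selected;
}
```

Caching the status lookups per ticket key would avoid repeated Jira calls when several memories reference the same ticket.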

Future: Automated Cleanup (Option 4)

When a Jira ticket transitions to Done, a webhook can trigger automatic Mem0 cleanup — updating or deleting memories tagged with that ticket ID. This automates the retrieval-time check above and removes stale context proactively. Not yet implemented.
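
Since this cleanup is not yet implemented, the following is only a sketch of one possible shape. It detects a transition to Done from a Jira webhook changelog and builds a Qdrant delete-by-filter request. The function names, the `mem0` collection name, and the assumption that `type` and `jira_ticket` are top-level payload keys in Qdrant are all hypothetical.

```javascript
const QDRANT_URL = 'http://localhost:6333';

// Return the issue key if this webhook payload moved the issue to Done,
// otherwise null. Jira changelog items carry `field` and `toString`.
function resolvedTicketKey(payload) {
  const items = (payload.changelog && payload.changelog.items) || [];
  const movedToDone = items.some(
    (item) => item.field === 'status' && item.toString === 'Done'
  );
  return movedToDone && payload.issue ? payload.issue.key : null;
}

// Build a Qdrant request that deletes known-bug memories for one ticket.
// Assumption: metadata fields are stored as top-level payload keys.
function buildCleanupRequest(ticketKey, collection = 'mem0') {
  return {
    url: `${QDRANT_URL}/collections/${collection}/points/delete`,
    body: {
      filter: {
        must: [
          { key: 'type', match: { value: 'known_bug' } },
          { key: 'jira_ticket', match: { value: ticketKey } },
        ],
      },
    },
  };
}
```

Restricting the filter to `known_bug` keeps permanent learnings in place even if they happen to carry a ticket reference.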