Portfolio · Case Study

Diego S.
AI Engineering · Agentic Systems · Voice AI · Multi-Tenant SaaS
Education
Florida International University
BS · Computer Science
Florida International University
MBA · Marketing & E-Commerce
Pennsylvania State University
Master of Applied Statistics
Available now
Live · app.contactos.ai
Expert-Vetted
Top Rated
100% Job Success
Case Study
Multi-Channel AI Customer Service Platform
SaaS Product
AI Customer Service
4
Channels in production
< 1s
End-to-end voice response
2
AI processing runtimes
Project overview
Key project facts
Platform
Multi-Tenant AI Platform (SaaS)
Industry
Customer Service · Operations
Engagement type
Product Build
Status
In production · app.contactos.ai
Scope
Voice · WhatsApp · Web Chat · Campaigns
Technical stack
Python · FastAPI · LangGraph · Pipecat · Claude Haiku 4.5 · Anthropic · Twilio · Deepgram · ElevenLabs · MongoDB · PostgreSQL · Redis · n8n · Stripe · React · TypeScript · Docker
Engagement summary

A production multi-tenant AI customer service platform supporting four inbound and outbound channels: voice (Twilio + Deepgram + ElevenLabs), WhatsApp, web chat, and outbound campaigns. Text conversations are handled by a LangGraph ReAct agent with real-time function calling against live client APIs. Voice calls are processed through a dedicated Pipecat streaming pipeline (Silero VAD, Deepgram Nova-2 STT, Claude Haiku 4.5, and ElevenLabs TTS) that delivers sub-second end-to-end responses.

Multi-tenancy is achieved through a TenantRuntime Registry: each tenant is an isolated directory with its own config, tools, and formatters, so onboarding a new client is a configuration operation with no changes to shared application code.

An embedded n8n control plane lets each tenant provision and trigger automation workflows from the platform. Templates are instantiated programmatically through the n8n REST API and fired by app and call events.

< 1s
Voice Response Latency
End-to-end streaming: VAD + STT + LLM + TTS
4
Channels in Production
Voice · WhatsApp · Web Chat · Outbound Campaigns
2
AI Processing Runtimes
LangGraph ReAct (text channels) · Pipecat (real-time voice)
Technical Design · AI Customer Service Platform
System Architecture
Multi-Tenant AI Platform
Channels
WhatsApp
Meta Cloud API · Webhook
Web Chat
JS Embed · WebSocket
Twilio Voice
WS · 8kHz mu-law
Outbound Campaigns
Celery Dispatcher · n8n
Gateway
FastAPI · uvicorn
Production · HTTPS
Traefik · TLS
Let's Encrypt
JWT Auth
passlib · python-jose
Channel Router
core/channels/router.py
Webhook Handlers
Meta · Twilio · Internal
LangGraph ReAct Agent
chatbot_node
LLM + tool binding
tool_node
10 function tools
handoff_node
human escalation
MongoDB Checkpointer · TenantRuntime Registry · Session Manager · Anthropic Claude · LangSmith tracing
Pipecat Voice Pipeline
Silero VAD
voice activity detection
Deepgram STT
Nova-2 · Spanish (es)
Claude Haiku 4.5
function calling
ElevenLabs TTS
flash_v2_5 · streaming
External APIs
Tenant API
per-tenant endpoints
n8n Control Plane
workflow provisioning · REST
Google Maps
travel time
Pinecone
address normalization
LangSmith
trace + eval
Sentry
error tracking
Data
MongoDB
conversations · messages · sessions
PostgreSQL
users · tenants
Redis
cache · Celery broker
React Dashboard
WebSocket · Analytics · KPIs · Broadcasts
Technical Approach

Inbound messages from the text channels are normalized through a single Channel Router, then dispatched to a shared LangGraph ReAct agent. Voice calls bypass text routing entirely and enter a dedicated real-time Pipecat pipeline for sub-second STT → LLM → TTS latency.
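
As a rough illustration of the normalization step, the sketch below maps channel-specific payloads onto one message shape before the agent sees them. The InboundMessage dataclass, the normalize function, and the web chat payload fields are assumptions made for this example, not the contents of core/channels/router.py. The WhatsApp branch follows the nesting of a Meta Cloud API text-message webhook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundMessage:
    """Channel-agnostic message handed to the agent (hypothetical shape)."""
    tenant_id: str
    channel: str
    sender: str
    text: str


def _from_whatsapp(tenant_id: str, payload: dict) -> InboundMessage:
    # Meta Cloud API nests the message under entry -> changes -> value -> messages.
    msg = payload["entry"][0]["changes"][0]["value"]["messages"][0]
    return InboundMessage(tenant_id, "whatsapp", msg["from"], msg["text"]["body"])


def _from_web_chat(tenant_id: str, payload: dict) -> InboundMessage:
    # Assumed payload for the JS embed: {"session_id": ..., "text": ...}.
    return InboundMessage(tenant_id, "web_chat", payload["session_id"], payload["text"])


# One normalizer per text channel; adding a channel means adding an entry here.
NORMALIZERS = {"whatsapp": _from_whatsapp, "web_chat": _from_web_chat}


def normalize(channel: str, tenant_id: str, payload: dict) -> InboundMessage:
    """Convert a raw channel payload into the shared message shape."""
    try:
        handler = NORMALIZERS[channel]
    except KeyError:
        raise ValueError(f"unsupported channel: {channel}") from None
    return handler(tenant_id, payload)
```

Keeping the per-channel parsing behind one function means the agent only ever depends on the shared shape, which is one plausible way to let a single agent serve several channels.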

Key Architectural Decisions
LangGraph ReAct — agent loops until tool results satisfy the request or escalation is triggered; MongoDB checkpointer persists state across turns.
TenantRuntime Registry — each tenant is a directory with its own config, tools, and formatters. Adding a client is a new directory, not a code change.
Pipecat for voice — streaming pipeline keeps audio processing out of the main FastAPI event loop, enabling concurrent voice + messaging without blocking.
n8n control plane — per-tenant workflow templates are provisioned via the n8n REST API and dispatched through a universal webhook; call events can trigger workflows.
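
The directory-per-tenant idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's registry: the config.json file name, the "tools" key, and the TenantRuntime fields are hypothetical, and loading of per-tenant tool modules and formatters is left out.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class TenantRuntime:
    """Everything the shared code needs to serve one tenant (assumed fields)."""
    tenant_id: str
    config: dict
    tool_names: list = field(default_factory=list)


class TenantRuntimeRegistry:
    """Discovers tenants by scanning a root directory, one subdirectory each."""

    def __init__(self, root):
        self._root = Path(root)
        self._runtimes = {}

    def load_all(self) -> None:
        for tenant_dir in sorted(p for p in self._root.iterdir() if p.is_dir()):
            config_path = tenant_dir / "config.json"  # assumed file name
            if not config_path.exists():
                continue  # a directory without a config is not a tenant
            config = json.loads(config_path.read_text(encoding="utf-8"))
            self._runtimes[tenant_dir.name] = TenantRuntime(
                tenant_id=tenant_dir.name,
                config=config,
                tool_names=list(config.get("tools", [])),
            )

    def tenant_ids(self) -> list:
        return sorted(self._runtimes)

    def get(self, tenant_id: str) -> TenantRuntime:
        try:
            return self._runtimes[tenant_id]
        except KeyError:
            raise LookupError(f"unknown tenant: {tenant_id}") from None
```

With this layout, onboarding amounts to creating a new subdirectory and calling load_all again; the shared code path is unchanged.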
Voice Pipeline — Real-Time Call Flow
Pipecat · FastAPI WebSocket
1
Inbound Channel
Twilio WebSocket
TwiML webhook → WebSocket upgrade → TwilioFrameSerializer (mu-law ↔ PCM)
8kHz · mu-law
~30ms
2
Voice Activity Detection
Silero VAD
Filters silence frames before STT — prevents spurious transcriptions and reduces API cost
ONNX model
~15ms
3
Speech-to-Text
Deepgram Nova-2
Streaming transcription · language: es (Spanish) · interim results discarded, final passed to aggregator
Nova-2 · streaming
~180ms
4
LLM — Function Calling
Claude Haiku 4.5
Lowest-latency Claude model · tools: validate_document, confirm_services, end_call · streaming
claude-haiku-4-5
~350ms
Tool calls (if triggered)
validate_document → Tenant API · confirm_services → Tenant API · end_call → EndFrame
5
Text-to-Speech
ElevenLabs
Model: eleven_flash_v2_5 · stability: 0.7 · similarity_boost: 0.75 · speed: 0.85 · streaming audio chunks
flash_v2_5
~200ms
6
Outbound Audio
Twilio Transport Output
PCM → mu-law re-encode → WebSocket → Twilio → caller's phone
8kHz · mu-law
~800ms total
Why This Stack

Each component was chosen for streaming latency, not batch throughput. Deepgram Nova-2 returns partials in real time. Claude Haiku 4.5 is Anthropic's fastest inference model. ElevenLabs flash_v2_5 streams audio chunks before synthesis is complete — the caller hears the first word before the full response is generated.

Key Decisions
Silero VAD first: eliminates silence and background noise before STT, cutting Deepgram cost and preventing empty LLM calls.
Claude Haiku over GPT-4o-mini: lower time to first token (TTFT) in real-world voice loops, and consistent with the Anthropic-native backend for shared observability in LangSmith.
Function calling in voice: validate_document and confirm_services call the tenant's live API mid-conversation without breaking the audio stream.
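
To make the function-calling decision concrete, here is a minimal sketch of a tool definition and a dispatcher. The dictionary layout (name, description, input_schema) and the tool_use / tool_result block shapes follow Anthropic's tool-use format. The document_number parameter, the fake_validate_document stub, and the run_tool helper are assumptions for illustration; the real tools call the tenant's live API.

```python
import json

# Tool definition in the shape the Anthropic Messages API expects.
VALIDATE_DOCUMENT_TOOL = {
    "name": "validate_document",
    "description": "Check a caller's document number against the tenant API.",
    "input_schema": {
        "type": "object",
        "properties": {"document_number": {"type": "string"}},  # assumed parameter
        "required": ["document_number"],
    },
}


def fake_validate_document(document_number: str) -> dict:
    # Stand-in for the HTTP request to the tenant's live API.
    return {"valid": document_number.isdigit(), "document_number": document_number}


TOOL_HANDLERS = {"validate_document": fake_validate_document}


def run_tool(tool_use: dict) -> dict:
    """Execute one tool_use block and return the matching tool_result block."""
    handler = TOOL_HANDLERS.get(tool_use["name"])
    if handler is None:
        content, is_error = f"unknown tool: {tool_use['name']}", True
    else:
        content, is_error = json.dumps(handler(**tool_use["input"])), False
    return {
        "type": "tool_result",
        "tool_use_id": tool_use["id"],
        "content": content,
        "is_error": is_error,
    }
```

The tool_result block is sent back to the model in the next turn, so the model can phrase the spoken answer from the API response.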
End-to-End Latency

VAD + STT + LLM + TTS: ~800ms typical · <1.2s with tool call
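
The headline figure can be checked against the per-stage numbers listed in the call flow. The short script below sums them. The stage keys are labels chosen for this example, and the headroom line assumes the tool round trip is the only extra cost on a tool-calling turn, which is a simplification.

```python
# Typical per-stage latencies in milliseconds, copied from the call flow above.
STAGE_LATENCY_MS = {
    "twilio_inbound": 30,
    "silero_vad": 15,
    "deepgram_stt": 180,
    "claude_haiku_llm": 350,
    "elevenlabs_tts": 200,
}

TOOL_CALL_CEILING_MS = 1200  # the "<1.2s with tool call" figure

total_ms = sum(STAGE_LATENCY_MS.values())
slowest_stage = max(STAGE_LATENCY_MS, key=STAGE_LATENCY_MS.get)
tool_headroom_ms = TOOL_CALL_CEILING_MS - total_ms

print(f"pipeline total: {total_ms} ms")          # 775 ms, reported as ~800ms
print(f"slowest stage:  {slowest_stage}")        # claude_haiku_llm
print(f"tool headroom:  {tool_headroom_ms} ms")  # 425 ms
```

The LLM stage accounts for close to half of the budget, which is consistent with the emphasis on choosing a low-latency model.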

01 / 09 Multi-Channel AI Customer Service · Deliverable
Agent Workspace & Escalation

The agent's command center for live conversations — every channel in one place, with one-click human take-over from the AI.

What this screen does
Active conversation list
All in-flight conversations filtered by Active / Inactive and by ownership (All, Mine, Unassigned, AI).
Live thread with channel context
Customer messages, AI responses, and tool outputs interleaved; the channel (Web, WhatsApp, voice) shown right in the header.
Take over from AI
One-click "Return to AI," Reassign, or End conversation actions on the contact panel.
Identity + state at a glance
Channel, state (Transferred / Active), assigned agent, and start time pinned to the right.
React · TypeScript · FastAPI · LangGraph · Anthropic Claude · Pipecat
Anonymized representative UI
02 / 09 Multi-Channel AI Customer Service · Deliverable
Outbound Campaign Broadcasts

Mass outbound WhatsApp campaigns via Meta-approved templates, with per-send delivery tracking and audit trail.

What this screen does
Template-driven sends
Every campaign uses a Meta-approved WhatsApp template (reminder, appointment, follow-up, location, etc.).
Per-broadcast delivery stats
Recipients, sent, failed, and mode (Test or Real) tracked per send.
One-click new broadcast
Compose, target a recipient list, preview, and send from a single action.
React · TypeScript · FastAPI · LangGraph · Anthropic Claude · Pipecat
Anonymized representative UI
03 / 09 Multi-Channel AI Customer Service · Deliverable
Conversation Analytics

Operational health of the AI customer service platform — resolution, intent mix, agent load, and sentiment in one view.

What this screen does
Headline KPIs
AI resolution rate, total conversations, pending/closed counts, active sessions, and CSAT.
Resolution + handoff reasons
Donut of AI resolution, with the ranked reasons a conversation handed off to a human.
Top intents
The five most-asked topics across the period.
Human agent leaderboard
Per-agent close counts and online status alongside an AI-vs-human sentiment split.
React · TypeScript · FastAPI · LangGraph · Anthropic Claude · Pipecat
Anonymized representative UI
04 / 09 Multi-Channel AI Customer Service · Deliverable
AI Agent Configuration

Per-tenant configuration of the AI agent — model, behavior, tools, greeting, and system prompt — without touching code.

What this screen does
Model + sampling config
Pick the model (e.g. claude-haiku-4-5), temperature, and session expiry per tenant.
Tool registry
Bind the function-calling tools the agent can use (validate_document, confirm_operator, confirm_services, cancel_services).
Greeting & system prompt
Editable first message and full system prompt, applied per tenant.
React · TypeScript · FastAPI · LangGraph · Anthropic Claude · Pipecat
Anonymized representative UI
05 / 09 Multi-Channel AI Customer Service · Deliverable
Voice Call Logs & Recordings

Every inbound and outbound voice call in one searchable log — with one-click playback of the recording and the full conversation transcript.

What this screen does
Filterable call log
Inbound & outbound calls by date, agent, campaign, direction, and outcome.
Recording playback
Waveform audio player to replay any call inline.
Full transcript + tool calls
Turn-by-turn AI/caller transcript with the live-API tool calls the agent made mid-call.
Voice pipeline at a glance
Per-call view of the Twilio → Deepgram → Claude Haiku → ElevenLabs stack.
Pipecat · Twilio · Deepgram · Claude Haiku · ElevenLabs · React
Anonymized representative UI
06 / 09 Multi-Channel AI Customer Service · Deliverable
Voice Agent Configuration

Configure a voice AI agent end to end — telephony, voice, language, greeting, prompt, and the AI flow it runs — without touching code.

What this screen does
Telephony + voice stack
Twilio telephony, Deepgram Nova-2 STT, ElevenLabs TTS wired per agent.
Voice & model selection
Pick TTS provider/model, Voice ID, and language.
Greeting & prompt
Initial message plus a system prompt or reusable prompt template.
Routing & availability
Assign a phone number, link an AI Flow, set inbound/outbound and availability.
FastAPI · Twilio · Deepgram · ElevenLabs · React
Anonymized representative UI
07 / 09 Multi-Channel AI Customer Service · Deliverable
Phone Numbers · Buy & Manage

Provision telephony numbers in-app and assign them to agents — the layer that connects the voice agents to the phone network.

What this screen does
Search & buy
Find numbers by country and area code and provision instantly.
Purchased numbers
Manage every owned number in one list.
Assign to agents
Route each number to the right inbound/outbound agent.
Twilio-backed
Numbers provisioned and billed through Twilio.
Twilio · FastAPI · React
Anonymized representative UI
08 / 09 Multi-Channel AI Customer Service · Deliverable
Workflows · n8n Control Plane

Automation around the conversation — built on an embedded n8n control plane that connects the agents to hundreds of downstream apps.

What this screen does
n8n templates
Start from prebuilt templates — post-call summary → CRM, lead handoff → Slack, reminders.
500+ integrations
n8n connects to hundreds of apps, including CRMs, calendars, and ticketing systems.
My Workflows
Per-tenant workflows with triggers, execution counts, and on/off control.
Universal webhook dispatch
A single entry point fired by app + call events to run any workflow.
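
One way to picture the universal webhook dispatch is a lookup from (tenant, event type) to the workflows that should run. The sketch below is illustrative only: the Workflow fields, the event names, and the webhook paths are hypothetical, and the HTTP call to n8n is left out so the selection logic stays visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    tenant_id: str
    name: str
    trigger: str        # event type that fires the workflow, e.g. "call.ended"
    webhook_path: str   # path of the workflow's webhook trigger (assumed)
    active: bool = True


def select_workflows(workflows, tenant_id: str, event_type: str) -> list:
    """Return the tenant's active workflows that listen for this event."""
    return [
        w for w in workflows
        if w.tenant_id == tenant_id and w.trigger == event_type and w.active
    ]


def build_dispatch(workflows, tenant_id: str, event_type: str, data: dict) -> list:
    """Pair each matching workflow's webhook path with the payload to send."""
    body = {"tenant_id": tenant_id, "event": event_type, "data": data}
    return [(w.webhook_path, body) for w in select_workflows(workflows, tenant_id, event_type)]
```

Filtering on tenant_id before anything else keeps one tenant's events from firing another tenant's workflows, and the active flag mirrors the on/off control described for My Workflows.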
n8n · FastAPI · Webhooks · React
Anonymized representative UI
09 / 09 Multi-Channel AI Customer Service · Deliverable
Billing & Plans

Subscription and usage management — current plan, tiers, and Stripe-backed payment, so each tenant's voice and messaging usage is metered and billed.

What this screen does
Plans & tiers
Current plan and upgrade tiers per tenant.
Usage metering
Voice/messaging usage tracked against the plan.
Call-minute top-ups
Add capacity at $0.10/min via Stripe Checkout.
Stripe-provisioned
Subscriptions provisioned via webhook on checkout.session.completed.
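
A minimal sketch of handling the checkout.session.completed event for a call-minute top-up is shown below. It works on an already-parsed event dictionary and skips signature verification; a real endpoint would first verify the Stripe-Signature header, for example with stripe.Webhook.construct_event. The tenant_id metadata key, the in-memory balances dictionary, and the use of this event for top-ups (rather than only subscriptions) are assumptions. The amount is taken to be in US cents.

```python
PRICE_PER_MINUTE_CENTS = 10  # $0.10 per call minute, as shown on the billing screen


def apply_checkout_event(event: dict, balances: dict, seen_event_ids: set) -> bool:
    """Credit purchased call minutes once per paid Checkout session.

    Returns True when the tenant balance was updated.
    """
    if event.get("type") != "checkout.session.completed":
        return False  # other event types are ignored in this sketch
    if event["id"] in seen_event_ids:
        return False  # Stripe can redeliver events, so skip ones already applied
    session = event["data"]["object"]
    if session.get("payment_status") != "paid":
        return False
    tenant_id = session["metadata"]["tenant_id"]  # assumed metadata key
    minutes = session["amount_total"] // PRICE_PER_MINUTE_CENTS  # amount_total is in cents
    balances[tenant_id] = balances.get(tenant_id, 0) + minutes
    seen_event_ids.add(event["id"])
    return True
```

Tracking event IDs makes the handler idempotent, which matters because webhook deliveries can be retried.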
Stripe · FastAPI · React
Anonymized representative UI