ποΈ Architecture
π€ Voice-powered AI assistant for SIP phone systems
ROBO CODED β This documentation was made with AI and may not be 100% sane. But the code does work! π
flowchart LR
subgraph Caller
Phone[π± SIP Phone]
end
subgraph Agent["π€ SIP AI Agent"]
SIP[SIP Client]
Audio[Audio Pipeline]
Tools[Tool Manager]
API[REST API]
end
subgraph Services
LLM[π§ LLM Server<br/>OpenAI / vLLM / Ollama]
Speaches[π€ Speaches<br/>STT + TTS]
end
subgraph Integrations
HA[π Home Assistant]
N8N[π n8n]
Webhook[π Webhooks]
end
Phone <-->|SIP/RTP| SIP
SIP <--> Audio
Audio <-->|Whisper| Speaches
Audio <-->|Kokoro| Speaches
Audio <--> Tools
Tools <-->|OpenAI API| LLM
API <--> Tools
HA -->|HTTP| API
N8N -->|HTTP| API
Webhook -->|HTTP| API
Component Overview:
| Component | Description |
|---|---|
| π± SIP Phone | Any SIP-compatible phone or softphone |
| π€ SIP AI Agent | Core application handling calls and conversations |
| π§ LLM Server | Language model for understanding and responses |
| π€ Speaches | Unified STT (Whisper) and TTS (Kokoro) server |
| π Integrations | External systems that trigger calls via API |
π‘ Use Cases
| Use Case | Example |
|---|---|
| β²οΈ Timers & Reminders | "Set a timer for 10 minutes" |
| π Callbacks | "Call me back in an hour" |
| π€οΈ Weather Briefings | Scheduled morning weather calls |
| π Appointment Reminders | Outbound calls with confirmation |
| π¨ Alerts & Notifications | Webhook-triggered phone calls |
| π Smart Home | Voice control via phone |
π§ Recommended Models
Quick reference for GPU-specific configurations. See Configuration for full details.
| GPU | VRAM | Recommended LLM | STT Model |
|---|---|---|---|
| H100 / A100 | 80GB | meta-llama/Llama-3.1-70B-Instruct | faster-whisper-large-v3 |
| DGX Spark | 128GB | meta-llama/Llama-3.1-70B-Instruct | faster-whisper-large-v3 |
| RTX 5090 | 32GB | Qwen/Qwen2.5-32B-Instruct | faster-whisper-large-v3 |
| RTX 4090 | 24GB | Qwen/Qwen2.5-14B-Instruct | faster-whisper-large-v3 |
| RTX 3090 | 24GB | meta-llama/Llama-3.1-8B-Instruct | faster-whisper-medium |
| RTX 4080 | 16GB | meta-llama/Llama-3.1-8B-Instruct | faster-whisper-medium |
| RTX 3080 | 10GB | Qwen/Qwen2.5-7B-Instruct | faster-whisper-small |
π Documentation
- π Getting Started β Installation & setup
- ποΈ Architecture β Installation & setup
- βοΈ Configuration β Environment variables
- π API Reference β REST API endpoints
- π§ Built-in Tools β Available capabilities
- π Creating Plugins β Add custom tools
- π Examples β Integration patterns
π¦ Quick Install
# Clone the repository
git clone https://github.com/your-org/sip-agent.git
cd sip-agent
# Configure environment
cp .env.example .env
nano .env
# Start with Docker Compose
docker compose up -d
# Verify it's running
curl http://localhost:8080/healthExpected output:
{
"status": "healthy",
"sip_registered": true,
"active_calls": 0
}π Support
- π Documentation
- π Issue Tracker
- π¬ Discussions
Updated about 1 month ago
