πŸ—οΈ Architecture

πŸ€– Voice-powered AI assistant for SIP phone systems

πŸ€–

ROBO CODED β€” This documentation was made with AI and may not be 100% sane. But the code does work! πŸŽ‰

flowchart LR
    subgraph Caller
        Phone[πŸ“± SIP Phone]
    end
    
    subgraph Agent["πŸ€– SIP AI Agent"]
        SIP[SIP Client]
        Audio[Audio Pipeline]
        Tools[Tool Manager]
        API[REST API]
    end
    
    subgraph Services
        LLM[🧠 LLM Server<br/>OpenAI / vLLM / Ollama]
        Speaches[🎀 Speaches<br/>STT + TTS]
    end
    
    subgraph Integrations
        HA[🏠 Home Assistant]
        N8N[πŸ”„ n8n]
        Webhook[πŸ”— Webhooks]
    end
    
    Phone <-->|SIP/RTP| SIP
    SIP <--> Audio
    Audio <-->|Whisper| Speaches
    Audio <-->|Kokoro| Speaches
    Audio <--> Tools
    Tools <-->|OpenAI API| LLM
    
    API <--> Tools
    HA -->|HTTP| API
    N8N -->|HTTP| API
    Webhook -->|HTTP| API

Component Overview:

ComponentDescription
πŸ“± SIP PhoneAny SIP-compatible phone or softphone
πŸ€– SIP AI AgentCore application handling calls and conversations
🧠 LLM ServerLanguage model for understanding and responses
🎀 SpeachesUnified STT (Whisper) and TTS (Kokoro) server
πŸ”— IntegrationsExternal systems that trigger calls via API

πŸ’‘ Use Cases

Use CaseExample
⏲️ Timers & Reminders"Set a timer for 10 minutes"
πŸ“ž Callbacks"Call me back in an hour"
🌀️ Weather BriefingsScheduled morning weather calls
πŸ“… Appointment RemindersOutbound calls with confirmation
🚨 Alerts & NotificationsWebhook-triggered phone calls
🏠 Smart HomeVoice control via phone

🧠 Recommended Models

Quick reference for GPU-specific configurations. See Configuration for full details.

GPUVRAMRecommended LLMSTT Model
H100 / A10080GBmeta-llama/Llama-3.1-70B-Instructfaster-whisper-large-v3
DGX Spark128GBmeta-llama/Llama-3.1-70B-Instructfaster-whisper-large-v3
RTX 509032GBQwen/Qwen2.5-32B-Instructfaster-whisper-large-v3
RTX 409024GBQwen/Qwen2.5-14B-Instructfaster-whisper-large-v3
RTX 309024GBmeta-llama/Llama-3.1-8B-Instructfaster-whisper-medium
RTX 408016GBmeta-llama/Llama-3.1-8B-Instructfaster-whisper-medium
RTX 308010GBQwen/Qwen2.5-7B-Instructfaster-whisper-small

πŸ“š Documentation

  1. πŸš€ Getting Started β€” Installation & setup
  2. πŸ—οΈ Architecture β€” Installation & setup
  3. βš™οΈ Configuration β€” Environment variables
  4. 🌐 API Reference β€” REST API endpoints
  5. πŸ”§ Built-in Tools β€” Available capabilities
  6. πŸ”Œ Creating Plugins β€” Add custom tools
  7. πŸ“– Examples β€” Integration patterns

πŸ“¦ Quick Install

# Clone the repository
git clone https://github.com/your-org/sip-agent.git
cd sip-agent

# Configure environment
cp .env.example .env
nano .env

# Start with Docker Compose
docker compose up -d

# Verify it's running
curl http://localhost:8080/health

Expected output:

{
  "status": "healthy",
  "sip_registered": true,
  "active_calls": 0
}

πŸ†˜ Support