Added

🎉 v0.1.0


🤖 SIP Enabled AI Agent

🤖 ROBO CODED: This release was made with AI and may not be 100% sane. But the code does work! 🎉

Release Date: November 30, 2025
License: AGPL-3.0
Platform: NVIDIA DGX Spark (Grace Blackwell GB10)


🚀 Overview

The first public release of SIP AI Assistant, a voice-powered AI assistant that answers phone calls, understands natural language, and performs actions through an extensible plugin system.

Built for the NVIDIA DGX Spark with 128GB unified memory, this system runs entirely on local infrastructure with no cloud dependencies for voice processing or LLM inference.


✨ Highlights

┌─────────────────────────────────────────────────────────────┐
│ 📞 SIP AI Assistant v0.1.0                                  │
├─────────────────────────────────────────────────────────────┤
│ ✅ Full SIP/RTP voice call handling                         │
│ ✅ Real-time STT via Whisper (Speaches)                     │
│ ✅ Natural TTS via Kokoro (Speaches)                        │
│ ✅ LLM integration (vLLM, OpenAI, Ollama)                   │
│ ✅ 10 built-in tools                                        │
│ ✅ Plugin system for custom tools                           │
│ ✅ REST API for outbound calls                              │
│ ✅ Scheduled & recurring calls                              │
│ ✅ Customizable phrases                                     │
│ ✅ Full observability stack                                 │
└─────────────────────────────────────────────────────────────┘

🔧 Built-in Tools

| Tool | Description |
|------|-------------|
| 🌤️ WEATHER | Current weather from Tempest station |
| ⏲️ SET_TIMER | Countdown timers with voice alerts |
| 📞 CALLBACK | Schedule callbacks to any number |
| 📴 HANGUP | End calls gracefully |
| 📋 STATUS | Check pending timers/callbacks |
| ❌ CANCEL | Cancel scheduled tasks |
| 🕐 DATETIME | Current date and time |
| 🧮 CALC | Math calculations |
| 😄 JOKE | Random jokes (general, tech, dad) |
| 🦜 SIMON_SAYS | Echo back verbatim |

🌐 REST API

| Endpoint | Method | Description |
|----------|--------|-------------|
| /health | GET | Health check |
| /call | POST | Initiate outbound call |
| /call/{id} | GET | Get call status |
| /call/{id} | DELETE | Hang up call |
| /tools | GET | List available tools |
| /tools/{name}/call | POST | Execute tool via call |
| /schedule | POST | Schedule a call |
| /schedule | GET | List scheduled calls |
| /schedule/{id} | DELETE | Cancel scheduled call |

Scheduled Calls

# One-time call
curl -X POST http://localhost:8080/schedule \
  -H "Content-Type: application/json" \
  -d '{"extension": "1001", "tool": "WEATHER", "at_time": "07:00"}'

# Recurring daily
curl -X POST http://localhost:8080/schedule \
  -H "Content-Type: application/json" \
  -d '{"extension": "1001", "tool": "WEATHER", "at_time": "07:00", "recurring": "daily"}'

# Weekdays only
curl -X POST http://localhost:8080/schedule \
  -H "Content-Type: application/json" \
  -d '{"extension": "1001", "message": "Stand up time!", "at_time": "09:00", "recurring": "weekdays"}'
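
The same requests can be built from Python. The sketch below uses only the standard library and assumes the payload fields shown in the curl examples above (`extension`, `tool`, `message`, `at_time`, `recurring`) and a server that accepts JSON. The helper name `build_schedule_request` and its client-side checks are illustrative and not part of the project.

```python
import json
import re
import urllib.request

# Values taken from the curl examples above; the server may accept others.
KNOWN_RECURRENCES = {"daily", "weekdays"}
TIME_PATTERN = re.compile(r"^([01]\d|2[0-3]):[0-5]\d$")  # 24-hour HH:MM


def build_schedule_request(base_url, extension, at_time, tool=None,
                           message=None, recurring=None):
    """Build (but do not send) a POST /schedule request.

    Hypothetical helper: the validation below is a client-side
    convenience, not a description of what the server enforces.
    """
    if not TIME_PATTERN.match(at_time):
        raise ValueError(f"at_time must be HH:MM, got {at_time!r}")
    if tool is None and message is None:
        # Assumption: a scheduled call needs a tool or a message to speak.
        raise ValueError("provide a tool or a message")
    if recurring is not None and recurring not in KNOWN_RECURRENCES:
        raise ValueError(f"unknown recurrence {recurring!r}")

    payload = {"extension": extension, "at_time": at_time}
    if tool is not None:
        payload["tool"] = tool
    if message is not None:
        payload["message"] = message
    if recurring is not None:
        payload["recurring"] = recurring

    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/schedule",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_schedule_request(
        "http://localhost:8080", "1001", "07:00",
        tool="WEATHER", recurring="daily",
    )
    # Sending needs a running agent; uncomment to actually schedule.
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
    print(req.get_method(), req.full_url, req.data.decode())
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running agent.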

🗣️ Customizable Phrases

Configure the assistant's personality via environment variables or JSON:

Environment Variables:

PHRASES_GREETINGS=["Hello!","Hi there!","Hey!"]
PHRASES_GOODBYES=["Goodbye!","Take care!"]
PHRASES_ACKNOWLEDGMENTS=["Okay.","Got it.","Sure."]
PHRASES_THINKING=["One moment.","Let me check."]
PHRASES_ERRORS=["Sorry, I didn't catch that."]
PHRASES_FOLLOWUPS=["Anything else?"]

JSON File (data/phrases.json):

{
  "greetings": ["Beep boop! What do you want, human?"],
  "goodbyes": ["Bye, meatbag!"],
  "errors": ["My audio sensors must be malfunctioning."]
}
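
To illustrate how the two sources might be combined, here is a minimal sketch. It assumes that environment variables take precedence over the JSON file and that each `PHRASES_*` variable holds a JSON array, as in the examples above. The precedence order and the names `load_phrases` and `pick` are assumptions for illustration, not the project's actual loader.

```python
import json
import os
import random
from pathlib import Path

# Category names come from the PHRASES_* variables above.
CATEGORIES = ("greetings", "goodbyes", "acknowledgments",
              "thinking", "errors", "followups")


def load_phrases(json_path="data/phrases.json", env=None):
    """Merge phrases from a JSON file and PHRASES_* environment variables.

    Assumed precedence (not confirmed by the release notes):
    environment variable > JSON file > empty list.
    """
    env = os.environ if env is None else env
    phrases = {name: [] for name in CATEGORIES}

    # Layer 1: the JSON file, if it exists.
    path = Path(json_path)
    if path.is_file():
        file_data = json.loads(path.read_text(encoding="utf-8"))
        for name in CATEGORIES:
            if name in file_data:
                phrases[name] = list(file_data[name])

    # Layer 2: environment variables override the file per category.
    for name in CATEGORIES:
        variable = f"PHRASES_{name.upper()}"
        raw = env.get(variable)
        if raw:
            value = json.loads(raw)
            if not isinstance(value, list):
                raise ValueError(f"{variable} must be a JSON array")
            phrases[name] = [str(item) for item in value]

    return phrases


def pick(phrases, category, rng=random):
    """Return a random phrase, or an empty string if none are configured."""
    options = phrases.get(category, [])
    return rng.choice(options) if options else ""
```

For example, `pick(load_phrases(), "greetings")` would return one of the configured greetings at random.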

🔌 Plugin System

Create custom tools with Python:

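from tool_plugins import BaseTool, ToolResult, ToolStatus


class MyTool(BaseTool):
    """Example custom tool that echoes its input back to the caller."""

    name = "MY_TOOL"
    description = "Does something cool"

    # Parameter schema: one required string named "input"
    parameters = {
        "input": {"type": "string", "required": True}
    }

    async def execute(self, params):
        # params carries the values declared in `parameters`
        return ToolResult(
            status=ToolStatus.SUCCESS,
            message=f"You said: {params['input']}"
        )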
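
A tool written this way can be exercised without placing a call. The sketch below is one way to do that, assuming `execute` is the only method a tool needs and that the tool class can be instantiated without arguments. When `tool_plugins` is not importable it falls back to stand-in classes that mirror only the names used in the example above; they are not the project's real classes, and `ShoutTool` and `run_tool` are hypothetical.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum

try:
    from tool_plugins import BaseTool, ToolResult, ToolStatus
except ImportError:
    # Stand-ins so the sketch runs outside the project. They mirror only
    # the names used in the example above and are NOT the real classes.
    class ToolStatus(Enum):
        SUCCESS = "success"

    @dataclass
    class ToolResult:
        status: ToolStatus
        message: str

    class BaseTool:
        name = ""
        description = ""
        parameters = {}


class ShoutTool(BaseTool):
    """Hypothetical example tool: repeats the input in upper case."""

    name = "SHOUT"
    description = "Repeats the given text in upper case"
    parameters = {"input": {"type": "string", "required": True}}

    async def execute(self, params):
        return ToolResult(
            status=ToolStatus.SUCCESS,
            message=params["input"].upper(),
        )


def run_tool(tool, params):
    """Run a tool's async execute() from synchronous code, e.g. a test."""
    return asyncio.run(tool.execute(params))


if __name__ == "__main__":
    result = run_tool(ShoutTool(), {"input": "hello"})
    print(result.status, result.message)
```

Because `execute` is a coroutine, `asyncio.run` is enough to drive it from a plain script or a unit test.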

📊 Observability

| Feature | Technology |
|---------|------------|
| 📈 Metrics | Prometheus |
| 🔍 Tracing | OpenTelemetry / Tempo |
| 📝 Logging | Structured JSON |
| 📊 Dashboards | Grafana |

Key Metrics:

  • sip_agent_calls_total: Total calls handled
  • sip_agent_call_duration_seconds: Call duration histogram
  • sip_agent_tool_calls_total: Tool invocations by name
  • sip_agent_stt_latency_seconds: Speech-to-text latency
  • sip_agent_tts_latency_seconds: Text-to-speech latency
  • sip_agent_llm_latency_seconds: LLM response latency
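
As a rough illustration of reading these metrics, the sketch below computes a mean latency from a Prometheus histogram, which exposes `<metric>_sum` and `<metric>_count` series alongside its buckets. It assumes the latency metrics are histograms without extra labels and that the agent serves the standard text exposition format. The sample text and its numbers are made up for the example, and `parse_samples` and `mean_latency` are hypothetical helpers.

```python
def parse_samples(text):
    """Parse Prometheus text exposition lines into {series: value}.

    Simplified on purpose: skips comments and assumes lines carry no
    timestamps and no spaces inside label values.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def mean_latency(samples, metric):
    """Mean of a histogram: <metric>_sum / <metric>_count, or None if empty."""
    count = samples.get(f"{metric}_count", 0.0)
    if count == 0:
        return None
    return samples[f"{metric}_sum"] / count


# Made-up sample in the text exposition format, for illustration only.
SAMPLE = """\
# TYPE sip_agent_stt_latency_seconds histogram
sip_agent_stt_latency_seconds_bucket{le="0.5"} 3
sip_agent_stt_latency_seconds_bucket{le="+Inf"} 4
sip_agent_stt_latency_seconds_sum 2.0
sip_agent_stt_latency_seconds_count 4
"""

if __name__ == "__main__":
    parsed = parse_samples(SAMPLE)
    print(mean_latency(parsed, "sip_agent_stt_latency_seconds"))
```

In a Grafana dashboard the same ratio would normally be computed in PromQL over a time window rather than in application code.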

🏗️ Architecture

flowchart LR
    subgraph Caller
        Phone[📱 SIP Phone]
    end

    subgraph Agent["🤖 SIP AI Agent"]
        SIP[SIP Client]
        Audio[Audio Pipeline]
        Tools[Tool Manager]
        API[REST API]
    end

    subgraph Services
        LLM[🧠 LLM Server]
        Speaches[🎤 Speaches]
    end
    
    Phone <-->|SIP/RTP| SIP
    SIP <--> Audio
    Audio <--> Speaches
    Audio <--> Tools
    Tools <--> LLM
    API <--> Tools

🖥️ System Requirements

Recommended: NVIDIA DGX Spark

┌─────────────────────────────────────────────────────────────┐
│ 🟢 NVIDIA DGX Spark                                         │
├─────────────────────────────────────────────────────────────┤
│ 🧠 Grace Blackwell GB10 Superchip                           │
│ 💾 128GB Unified Memory                                     │
│ ⚡ 1 PFLOP AI Performance                                   │
└─────────────────────────────────────────────────────────────┘

Minimum Requirements

| Component | Requirement |
|-----------|-------------|
| CPU | 8+ cores |
| RAM | 32GB |
| GPU | NVIDIA with 16GB+ VRAM |
| Storage | 50GB SSD |
| Network | Gigabit Ethernet |

Software Dependencies

| Dependency | Version |
|------------|---------|
| Python | 3.11+ |
| Docker | 24.0+ |
| Docker Compose | 2.20+ |
| Speaches | Latest |

📦 Installation

# Clone repository
git clone https://github.com/your-org/sip-agent.git
cd sip-agent

# Configure
cp sip-agent/.env.example sip-agent/.env
nano sip-agent/.env

# Start
docker compose up -d

# Verify
curl http://localhost:8080/health
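
Containers can take a while to come up, so a single `curl` may fail right after `docker compose up -d`. The sketch below polls the health endpoint with retries. It assumes an HTTP 200 from `/health` means the agent is ready; the response body is not inspected because its shape is not documented here. The names `http_ok` and `wait_for_health` and the default retry values are illustrative.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if a GET to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status.
        return False


def wait_for_health(url="http://localhost:8080/health", attempts=30,
                    delay=2.0, probe=http_ok, sleep=time.sleep):
    """Poll the health endpoint until it responds or attempts run out.

    Returns the 1-based attempt number that succeeded, or None.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


if __name__ == "__main__":
    attempt = wait_for_health(attempts=5, delay=1.0)
    print("healthy" if attempt else "not reachable")
```

The `probe` and `sleep` parameters exist so the retry logic can be tested without a network or real delays.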

📚 Documentation

Full documentation is available at sip-agent.readme.io.

| Document | Description |
|----------|-------------|
| Overview | Architecture and features |
| Getting Started | Installation guide |
| Configuration | Environment variables |
| API Reference | REST API endpoints |
| Tools | Built-in tools |
| Plugins | Custom tool development |
| Examples | Integration patterns |

⚠️ Known Limitations

  • WebSocket realtime STT mode is experimental (use STT_MODE=batch for stability)
  • Maximum concurrent calls limited by LLM server capacity
  • Weather tool requires Tempest station (or customize for other APIs)
  • Some TTS voices may struggle with unusual words or names

🔜 Roadmap

| Feature | Status |
|---------|--------|
| 🎵 Music on hold | Planned |
| 📞 Call transfer | Planned |
| 🗓️ Calendar integration | Planned |
| 🔍 Web search tool | Planned |
| 🏠 Home Assistant native | Planned |
| 📱 SMS notifications | Planned |
| 🌍 Multi-language support | Planned |

🙏 Acknowledgments


📜 License

SPDX-License-Identifier: AGPL-3.0-or-later

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published
by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

📞 Support

| Resource | Link |
|----------|------|
| 📖 Docs | sip-agent.readme.io |
| 🐛 Issues | GitHub Issues |
| 💬 Discussions | GitHub Discussions |

SIP AI Assistant v0.1.0

Made with ❤️ and 🤖

Now go make some calls! 📞