
voice-echo

Talk to Claude. Over the phone.

A voice interface that connects phone calls to the Claude Code CLI. Call in, or let Claude call you -- with full context on why it's reaching out. Integrates with n8n, so any automation workflow can trigger an AI-initiated call.


Rust · Real-time audio · Open source · Self-hosted

How it works

End-to-end voice pipeline

A phone call comes in through Twilio. The audio is transcribed, Claude reasons about the text, and the reply is spoken back -- all in real time over a WebSocket.

📞 Call (Twilio) → 🎙 VAD (energy-based) → 📝 STT (Groq Whisper) → 🧠 Claude Code CLI → 🔈 TTS (Inworld) → 📞 Response (mu-law audio)

Features

Built for real usage

Not a demo. A working voice pipeline with the pieces you need for reliable phone conversations with Claude.

⏱

Voice Activity Detection

Energy-based VAD with a configurable silence threshold and RMS energy floor. It ends your turn when you stop talking, not every time you pause.

vad.rs
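
To make the idea concrete, here is a minimal sketch of energy-based end-of-speech detection. It is not the code in vad.rs: the `Vad` type, its field names, and the frame-count form of the silence threshold are assumptions for illustration only.

```rust
/// Root-mean-square energy of one frame of 16-bit PCM samples.
fn rms(frame: &[i16]) -> f64 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f64 = frame.iter().map(|&s| (s as f64) * (s as f64)).sum();
    (sum_sq / frame.len() as f64).sqrt()
}

/// Hypothetical detector; names and semantics are illustrative, not taken from vad.rs.
struct Vad {
    /// Frames whose RMS falls below this floor count as silence.
    energy_floor: f64,
    /// Consecutive silent frames required before an utterance is considered finished.
    silence_frames_needed: usize,
    silent_run: usize,
    in_speech: bool,
}

impl Vad {
    fn new(energy_floor: f64, silence_frames_needed: usize) -> Self {
        Vad { energy_floor, silence_frames_needed, silent_run: 0, in_speech: false }
    }

    /// Feed one frame. Returns true exactly when an utterance has just ended.
    fn push(&mut self, frame: &[i16]) -> bool {
        if rms(frame) >= self.energy_floor {
            // Speech (or a resumed sentence after a short pause) resets the silence counter.
            self.in_speech = true;
            self.silent_run = 0;
            return false;
        }
        if !self.in_speech {
            // Silence before any speech is ignored.
            return false;
        }
        self.silent_run += 1;
        if self.silent_run >= self.silence_frames_needed {
            self.in_speech = false;
            self.silent_run = 0;
            return true;
        }
        false
    }
}
```

With 20 ms frames, a `silence_frames_needed` of 50 would correspond to about one second of silence; a pause shorter than that resets the counter as soon as speech resumes.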

🚫

Whisper Hallucination Filter

Filters common Whisper artifacts from silence and background noise. No phantom transcriptions hitting Claude.

stt.rs
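
A filter of this kind can be as simple as normalizing the transcript and comparing it against a list of known artifacts. The sketch below assumes that approach; the phrase list is illustrative only, and the real list and matching rules in stt.rs may differ.

```rust
/// Illustrative phrases only; the project's actual list lives in stt.rs and may differ.
const ARTIFACTS: &[&str] = &["thank you", "thanks for watching", "you", "bye"];

/// Strip leading/trailing whitespace and punctuation, then lowercase.
fn normalize(text: &str) -> String {
    text.trim_matches(|c: char| !c.is_alphanumeric())
        .to_lowercase()
}

/// True when the whole transcript is empty or matches a known artifact.
/// Longer sentences that merely contain an artifact phrase are kept.
fn is_hallucination(transcript: &str) -> bool {
    let norm = normalize(transcript);
    norm.is_empty() || ARTIFACTS.iter().any(|&a| a == norm)
}
```

Matching the whole normalized transcript, rather than searching for substrings, keeps a genuine "Thank you for calling back" from being dropped.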

🗨

Multi-turn Conversations

Session memory per call with configurable timeout. Claude remembers context across turns within the same conversation.

session_timeout_secs
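
One way to picture per-call memory with a timeout is a map from call identifier to conversation history, pruned by last activity. This is a sketch under assumptions: only `session_timeout_secs` comes from the page; `SessionStore`, `Session`, and keying by a call ID string are hypothetical.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical per-call state: the turns so far and the time of the last one.
struct Session {
    turns: Vec<String>,
    last_active: Instant,
}

/// Hypothetical store keyed by a call identifier.
struct SessionStore {
    timeout: Duration,
    sessions: HashMap<String, Session>,
}

impl SessionStore {
    fn new(session_timeout_secs: u64) -> Self {
        SessionStore {
            timeout: Duration::from_secs(session_timeout_secs),
            sessions: HashMap::new(),
        }
    }

    /// Record one turn and return how many turns the session now holds.
    /// `now` is passed in so the expiry logic is easy to test.
    fn record_turn(&mut self, call_id: &str, text: &str, now: Instant) -> usize {
        // Drop every session that has been idle for the timeout or longer.
        let timeout = self.timeout;
        self.sessions
            .retain(|_, s| now.duration_since(s.last_active) < timeout);

        let session = self
            .sessions
            .entry(call_id.to_string())
            .or_insert_with(|| Session { turns: Vec::new(), last_active: now });
        session.turns.push(text.to_string());
        session.last_active = now;
        session.turns.len()
    }
}
```

A turn that arrives after the timeout starts a fresh session, so stale context from an earlier call is not carried over.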

📡

Outbound Call API

POST to /api/call with a phone number and context. Claude knows why it's calling before the conversation starts.

POST /api/call
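
As a rough illustration of what such a request could carry, the sketch below builds a JSON body with the standard library only. The field names `to` and `context` are assumptions, not the documented schema; check the README for the actual payload.

```rust
/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a request body for POST /api/call.
/// The keys "to" and "context" are hypothetical placeholders.
fn call_request_body(to: &str, context: &str) -> String {
    format!(
        r#"{{"to":"{}","context":"{}"}}"#,
        json_escape(to),
        json_escape(context)
    )
}
```

The context string is where an automation would explain the reason for the call, for example "Disk usage on db-1 is at 95%".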

🔌

n8n Bridge

Orchestrator routes triggers from any n8n workflow to the call-human module. Alerts, cron jobs, or events can initiate AI calls with full context.

n8n integration

⚡

Real-time Audio Pipeline

Mu-law codec, streaming WebSocket, direct PCM conversion. No intermediate servers -- Twilio talks straight to Axum.

audio.rs
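
For readers unfamiliar with mu-law, the following sketch shows the standard G.711 conversion between 8-bit mu-law bytes and 16-bit linear PCM. It illustrates the codec in general and is not a copy of audio.rs; the function names are assumptions.

```rust
/// Decode one G.711 mu-law byte into a 16-bit linear PCM sample.
fn mulaw_decode(byte: u8) -> i16 {
    // Mu-law bytes are stored bit-inverted.
    let u = !byte;
    let sign = u & 0x80;
    let exponent = (u >> 4) & 0x07;
    let mantissa = (u & 0x0F) as i16;
    // Rebuild the biased magnitude, then remove the bias of 0x84.
    let magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
    if sign != 0 { -magnitude } else { magnitude }
}

/// Encode one 16-bit linear PCM sample as a G.711 mu-law byte.
fn mulaw_encode(sample: i16) -> u8 {
    const BIAS: i32 = 0x84;
    const CLIP: i32 = 32635;

    let mut s = sample as i32;
    let sign: u8 = if s < 0 {
        s = -s;
        0x80
    } else {
        0
    };
    if s > CLIP {
        s = CLIP;
    }
    s += BIAS;

    // The exponent is the position of the highest set bit, counted from bit 7.
    let mut exponent: u8 = 7;
    let mut mask = 0x4000;
    while exponent > 0 && (s & mask) == 0 {
        exponent -= 1;
        mask >>= 1;
    }
    let mantissa = ((s >> (exponent + 3)) & 0x0F) as u8;
    !(sign | (exponent << 4) | mantissa)
}
```

Each byte expands to one sample, so decoding a buffer is `bytes.iter().map(|&b| mulaw_decode(b))`. Encoding is lossy: a sample generally decodes to a nearby quantization level rather than to its exact original value.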

💬

Configurable Greeting

Custom TTS greeting when a call connects. Set it in config.toml -- default is "Hello, this is Echo".

config.toml

Tech stack

What's under the hood

Minimal dependencies, no runtime bloat. One compiled binary handles everything.

🦀 Rust (core runtime)
⚙ Axum (http + websocket)
📞 Twilio (telephony)
🎙 Groq Whisper (speech-to-text)
🔈 Inworld (text-to-speech)
🧠 Claude Code CLI (reasoning engine)
🔌 n8n (workflow automation)

Quick start

Up and running in minutes

Clone, configure, deploy. You'll need a Twilio number and API keys for Groq and Inworld.

Clone and build

git clone https://github.com/dnacenta/voice-echo.git
cd voice-echo
cargo build --release

Configure

mkdir -p ~/.voice-echo
cp config.example.toml ~/.voice-echo/config.toml
cp .env.example ~/.voice-echo/.env
# Edit .env with your API keys
# Edit config.toml with your Twilio number

Deploy and run

# Copy the binary, set up nginx + systemd
# See the full guide in the README
sudo cp target/release/voice-echo /usr/local/bin/
sudo systemctl enable --now voice-echo

Point Twilio to your server

# In Twilio Console, set voice webhook to:
POST https://your-server.example.com/twilio/voice
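
When Twilio calls a voice webhook, it expects TwiML in the response. For a WebSocket audio pipeline, that typically means a `<Connect><Stream>` instruction pointing at a `wss://` endpoint. The sketch below shows what such a response could look like; whether voice-echo returns exactly this markup is an assumption, and the `/twilio/media` path is a hypothetical placeholder.

```rust
/// Escape a string for use inside an XML attribute value.
fn xml_escape_attr(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

/// Build a TwiML document that asks Twilio to open a bidirectional
/// media stream to `ws_url`. The URL used by the real server is not
/// stated on this page; pass in whatever your deployment exposes.
fn stream_twiml(ws_url: &str) -> String {
    format!(
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\
         <Response><Connect><Stream url=\"{}\"/></Connect></Response>",
        xml_escape_attr(ws_url)
    )
}
```

Calling `stream_twiml("wss://your-server.example.com/twilio/media")` yields a single-line XML document that a webhook handler would return with a `text/xml` content type.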