🎵 TTS Server - Local Text-to-Speech Service

Version 0.2.0 | High-performance local TTS server powered by Kokoro-82M ONNX model

License: MIT Rust

📖 Overview

A lightweight, fast text-to-speech server designed for the MyDictionary Chrome extension. It provides 54 high-quality voices, downloads its models automatically, and caches generated audio on disk. The macOS version runs as a background menubar application.

✨ Features

  • 🎤 54 Premium Voices - British/American English, male/female options
  • ⚡ Lightning Fast - Rust-powered, sub-second synthesis
  • 💾 Smart Caching - SHA256-based file caching with TTL, stored in ~/Library/Application Support/tts-server/
  • 🔄 Auto Download - Models download automatically on first run
  • 🌐 REST API - Simple HTTP endpoints for easy integration
  • 🎯 Browser Compatible - 16-bit PCM WAV output
  • 🖥️ macOS Menubar App - Runs silently in the background with a menubar icon for quick access and control.
  • 🔒 Single Instance - Prevents multiple instances from running concurrently.
  • 🪵 Detailed Logging - Logs are written to ~/Library/Application Support/tts-server/logs/
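
To illustrate what "16-bit PCM WAV output" means in practice, here is a minimal sketch of a mono WAV encoder using only the Rust standard library. It is not the project's wav_encoder.rs: the function name encode_wav_pcm16 and the 24000 Hz sample rate used in main are assumptions made for this example.

```rust
/// Convert f32 samples in [-1.0, 1.0] into a mono 16-bit PCM WAV byte buffer.
/// Hypothetical helper for illustration; not taken from the project source.
fn encode_wav_pcm16(samples: &[f32], sample_rate: u32) -> Vec<u8> {
    let data_len = (samples.len() * 2) as u32; // 2 bytes per sample
    let mut out = Vec::with_capacity(44 + samples.len() * 2);

    // RIFF container header
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes());
    out.extend_from_slice(b"WAVE");

    // "fmt " chunk describing the audio layout
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    out.extend_from_slice(&1u16.to_le_bytes()); // audio format: 1 = PCM
    out.extend_from_slice(&1u16.to_le_bytes()); // channels: mono
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&(sample_rate * 2).to_le_bytes()); // byte rate
    out.extend_from_slice(&2u16.to_le_bytes()); // block align
    out.extend_from_slice(&16u16.to_le_bytes()); // bits per sample

    // "data" chunk with the samples as little-endian i16
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for &sample in samples {
        let value = (sample.clamp(-1.0, 1.0) * i16::MAX as f32) as i16;
        out.extend_from_slice(&value.to_le_bytes());
    }
    out
}

fn main() {
    // 24000 Hz is an assumed sample rate for this example.
    let wav = encode_wav_pcm16(&[0.0, 0.5, -0.5], 24_000);
    println!("encoded {} bytes", wav.len());
}
```

Browsers can play this layout directly because it is plain uncompressed PCM with a standard 44-byte header.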

🚀 Quick Start

Option 1: Download Pre-built Binary (Recommended)

macOS (Apple Silicon & Intel)

The macOS version is now a self-contained .app bundle that runs as a background menubar application.

# 1. Download the latest TTS Server.app from the releases page:
#    (e.g., https://github.com/jhfnetboy/Candle-local-AI-Server/releases/download/v0.2.0/TTS_Server.app.zip)

# 2. Extract the downloaded archive (if it's a .zip or .tar.gz)
#    (Example for .zip):
#    unzip TTS_Server.app.zip

# 3. Move (or drag) "TTS Server.app" into your /Applications folder.
mv "TTS Server.app" /Applications/

# 4. Install espeak-ng (required for phonemization)
brew install espeak-ng

# 5. Launch the application
#    You can double-click it from your /Applications folder, or run:
open /Applications/TTS\ Server.app

The application will:

  • Run silently in the background with an icon in your macOS menubar (top-right).
  • Start the server on http://localhost:9527.
  • Download models automatically on first run (~310MB ONNX model, ~50MB voice data). They are stored in ~/Library/Application Support/tts-server/checkpoints/ and ~/Library/Application Support/tts-server/data/.
  • Create a cache directory for audio files in ~/Library/Application Support/tts-server/cache/audio/.
  • Generate detailed logs in ~/Library/Application Support/tts-server/logs/.

Menubar Icon Usage:

  • Left-click on the icon to show "Open UI" and "Quit" options.
  • "Open UI" will open http://localhost:9527 in your default browser.
  • "Quit" will gracefully shut down the server.

Common issues:

  • If you see "cannot be opened because it is from an unidentified developer"
    • Right-click TTS Server.app in the /Applications folder and choose "Open". macOS may ask you to confirm; click "Open" again. This is usually only needed once.
  • If you see "espeak-ng: command not found"
    • Install it: brew install espeak-ng

Windows (x64)

⚠️ A Windows build will be released in a future version (expected after v0.2.0).

Only macOS is currently supported. Windows users can build from source.


Option 2: Build from Source

Prerequisites:

# Clone the repository
git clone https://github.com/jhfnetboy/Candle-local-AI-Server.git
cd Candle-local-AI-Server

# Install espeak-ng
# macOS:
brew install espeak-ng
# Ubuntu:
sudo apt-get install espeak-ng
# Windows:
choco install espeak-ng

# Build release version (for macOS, this generates a .app bundle).
# "cargo bundle" comes from the cargo-bundle plugin: cargo install cargo-bundle
cargo bundle --release

# For macOS, move the generated .app to Applications and launch:
mv target/release/bundle/osx/TTS\ Server.app /Applications/
open /Applications/TTS\ Server.app

# For Linux/Windows, build and run the raw binary (if you don't need a UI)
# cargo build --release
# ./target/release/tts-server

🔗 Integration with MyDictionary Extension

Step 1: Start TTS Server

# Make sure the server is running (e.g., double-click TTS Server.app or run from terminal)
# You should see the menubar icon if on macOS.

# You can check server health via:
curl http://localhost:9527/health

Step 2: Install MyDictionary Extension

  1. Download MyDictionary extension from Chrome Web Store or build from source
  2. The extension will automatically detect the local TTS server
  3. Open extension settings → TTS Voice Settings
  4. You'll see a green "✅ Connected" indicator if the server is running

Step 3: Select Your Voice

  1. Go to TTS Voice Settings (Extension popup → Settings → Voice Settings)
  2. Choose from 54 voices:
    • 🇬🇧 British English: George, Daniel, Alice, Emma... (Recommended for learning)
    • 🇺🇸 American English: Michael, Nova, Sarah...
  3. Click Save Settings

Step 4: Enjoy!

Select any text on a webpage and click the 🔊 TTS button in the sidebar.


📡 API Reference

Endpoints

GET / - Server Info

curl http://localhost:9527/

Response:

{
  "success": true,
  "data": {
    "name": "TTS Server",
    "version": "0.2.0",
    "status": "running",
    "framework": "Candle"
  }
}

GET /health - Health Check

curl http://localhost:9527/health

POST /synthesize - Text to Speech

Request:

curl -X POST http://localhost:9527/synthesize \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, world!",
    "voice": "bm_george",
    "format": "wav"
  }'

Parameters:

  • text (required): Text to synthesize
  • voice (optional): Voice ID (default: bm_george)
  • format (optional): Output format, currently only wav (reserved for future mp3/ogg support)

Response:

{
  "file_id": "51f91581302698db",
  "url": "http://localhost:9527/audio/51f91581302698db.wav",
  "cached": false
}

GET /audio/:filename - Get Audio File

curl http://localhost:9527/audio/51f91581302698db.wav --output output.wav
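
For callers that prefer Rust over curl, the POST /synthesize request can be sketched with the standard library alone. This is a minimal illustration and not code from the project: the helper names json_escape and build_synthesize_request are hypothetical, and a real client would more likely use an HTTP crate and a JSON library.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Escape characters that must not appear raw inside a JSON string.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a raw HTTP/1.1 POST request for the /synthesize endpoint.
/// Hypothetical helper; the field names follow the request shown above.
fn build_synthesize_request(text: &str, voice: &str) -> String {
    let body = format!(
        "{{\"text\":\"{}\",\"voice\":\"{}\",\"format\":\"wav\"}}",
        json_escape(text),
        json_escape(voice)
    );
    format!(
        "POST /synthesize HTTP/1.1\r\nHost: localhost:9527\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(), // Content-Length is counted in bytes
        body
    )
}

fn main() {
    let request = build_synthesize_request("Hello, world!", "bm_george");
    match TcpStream::connect("127.0.0.1:9527") {
        Ok(mut stream) => {
            stream.write_all(request.as_bytes()).expect("failed to send request");
            let mut response = String::new();
            stream.read_to_string(&mut response).expect("failed to read response");
            // The JSON body at the end of the response carries the "url" to fetch.
            println!("{response}");
        }
        Err(err) => eprintln!("TTS server not reachable on port 9527: {err}"),
    }
}
```

The "url" field of the JSON response can then be downloaded the same way as in the GET /audio/:filename example.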

Voice List

See VOICE_API.md for complete list of 54 available voices.

Recommended voices for English learning:

  • bm_george - British male, clear and standard
  • bm_daniel - British male, accurate pronunciation
  • af_nova - American female, recommended
  • am_michael - American male, standard

🛠️ Configuration

Port Configuration

By default, the server runs on port 9527. To change:

Edit src/main.rs:

let addr = SocketAddr::from(([0, 0, 0, 0], 9527));  // Change port here

Then rebuild:

cargo build --release
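
If editing the source for every port change is inconvenient, one option is to read the port from an environment variable with 9527 as the fallback. The sketch below shows the idea; note that TTS_PORT and resolve_port are hypothetical names and the project does not read such a variable today.

```rust
use std::net::SocketAddr;

const DEFAULT_PORT: u16 = 9527;

/// Parse an optional port override, falling back to the default
/// when the value is missing, not a number, out of range, or zero.
fn resolve_port(raw: Option<&str>) -> u16 {
    raw.and_then(|value| value.trim().parse::<u16>().ok())
        .filter(|port| *port != 0)
        .unwrap_or(DEFAULT_PORT)
}

fn main() {
    // TTS_PORT is an assumed variable name used only for this example.
    let raw = std::env::var("TTS_PORT").ok();
    let addr = SocketAddr::from(([0, 0, 0, 0], resolve_port(raw.as_deref())));
    println!("would bind to {addr}");
}
```

Binding to 0.0.0.0, as the original line does, exposes the server on all network interfaces; 127.0.0.1 would restrict it to the local machine.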

Cache Configuration

  • Location: ~/Library/Application Support/tts-server/cache/audio/
  • TTL: 1 hour (3600 seconds)
  • Format: SHA256-based file IDs

To change cache settings, edit src/main.rs:

AudioCache::new("cache/audio", 3600)  // Change TTL (seconds)
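
The TTL value decides when a cached audio file is considered stale. The project's cache.rs is not shown here, so the following is only a sketch of how such an expiry check could work, using a hypothetical is_expired function that compares a file's modification time against the TTL.

```rust
use std::time::{Duration, SystemTime};

/// Returns true when a file written at `modified` is older than `ttl` at time `now`.
/// Hypothetical helper; not taken from the project's cache implementation.
fn is_expired(modified: SystemTime, now: SystemTime, ttl: Duration) -> bool {
    match now.duration_since(modified) {
        Ok(age) => age > ttl,
        // A modification time in the future (clock skew) is treated as fresh.
        Err(_) => false,
    }
}

fn main() {
    let ttl = Duration::from_secs(3600); // 1 hour, matching the default above
    let now = SystemTime::now();
    let written = now - Duration::from_secs(2 * 3600);
    println!("expired: {}", is_expired(written, now, ttl));
}
```

In a real cache the modification time would typically come from std::fs::metadata(path)?.modified().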

🐛 Troubleshooting

Problem: Server won't start

Solution 1: Check if port 9527 is already in use

# macOS/Linux:
lsof -i :9527

# Windows:
netstat -ano | findstr :9527

Solution 2: Check espeak-ng installation

espeak-ng --version

If not installed, see Quick Start for installation instructions.

Problem: Extension shows "Disconnected"

  1. Make sure the TTS server is running: http://localhost:9527/health
  2. Check browser console for CORS errors
  3. Restart the server and reload the extension

Problem: "Model not found" error

The models should download automatically on first run. They will be stored in ~/Library/Application Support/tts-server/checkpoints/ and ~/Library/Application Support/tts-server/data/. If download fails:

# Manual download (you might need to provide the full path to download_models.sh inside the .app bundle)
# For example, if TTS Server.app is in /Applications:
/Applications/TTS\ Server.app/Contents/Resources/download_models.sh

Problem: Windows - "espeak-ng not found"

⚠️ A Windows build will be released in a future version (expected after v0.2.0).

Windows users can currently build from source or wait for the official Windows release.


🏗️ Project Structure

tts-server/
├── src/
│   ├── main.rs           # HTTP server & routes
│   ├── tts_engine.rs     # Kokoro ONNX inference
│   ├── cache.rs          # File caching system
│   ├── vocab.rs          # Tokenization
│   └── wav_encoder.rs    # WAV audio encoding
├── checkpoints/          # ONNX models (auto-downloaded to Application Support)
├── data/voices/          # 54 voice embeddings (auto-downloaded to Application Support)
├── Cargo.toml            # Rust dependencies
├── Info.plist.in         # macOS app bundle configuration
└── README.md             # This file

🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments


📞 Support


Made with ❤️ by Jason

About

A local AI model runner/server for different models on Hugging Face. Developed for MyDictionary, but free for all to use.
