Add High-Level Python API with Automatic Voice Loading #159
Add easy-to-use Python inference API with one-line synthesis, automatic default voice loading, and comprehensive documentation.
Key features:
- synthesize_speech() one-line function
- Automatic default voice loading (7 voices included)
- Iterator support for LLM integration
- Complete documentation and examples
@microsoft-github-policy-service agree

Hey team,

The two Markdown sections are quite verbose. Could you provide a more concise guide that focuses on the essentials? It would also be helpful to include a minimal, clear example. Is there any further room to simplify the code?

Sure, let me look into this.

@YaoyaoChang can you review? I made changes to the docs.

@YaoyaoChang can you check my implementation?

I’ve been busy these days and will take care of it as soon as I’m available.
New Features
High-Level API (vibevoice/inference.py)
- synthesize_speech(): One-line function for text-to-speech synthesis
- list_default_voices(): Helper to list available voice presets
- VibeVoiceStreamingTTS: High-level TTS class with streaming support
  - demo/voices/streaming_model/en-Mike_man.pt, falls back to first available
- AudioPlayer: Audio playback with speaker selection
Automatic Voice Loading
- en-Davis_man, en-Frank_man, en-Grace_woman, in-Samuel_man
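The fallback described above (prefer en-Mike_man.pt, otherwise use the first available preset) could be sketched roughly as follows. This is not the PR's actual code: `pick_default_voice` is a hypothetical helper, and treating "first available" as "first in alphabetical order" is an assumption.

```python
from pathlib import Path
from typing import Optional

# Preferred preset named in the PR description.
PREFERRED_VOICE = "en-Mike_man"


def pick_default_voice(voice_dir: Path, preferred: str = PREFERRED_VOICE) -> Optional[Path]:
    """Hypothetical helper: choose a default voice preset file.

    Returns the preferred preset if it exists in voice_dir, otherwise the
    first .pt file in alphabetical order (an assumption about what
    "first available" means), or None if there are no presets at all.
    """
    candidates = sorted(voice_dir.glob("*.pt"))
    for path in candidates:
        if path.stem == preferred:
            return path
    return candidates[0] if candidates else None
```

Called with `Path("demo/voices/streaming_model")`, a helper like this would return the en-Mike_man.pt preset when it is present and some other preset otherwise, so callers never have to name a voice explicitly.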
Module Exports (vibevoice/__init__.py)
📊 Changes Summary
Lines of Code
Impact
🎯 Key Features Being Added
1. One-Line Synthesis
2. Automatic Voice Loading
3. LLM Integration
4. Complete Documentation
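For the LLM integration item, one plausible use of iterator support is to group streamed LLM tokens into sentence-sized chunks before handing them to the TTS engine. The sketch below only illustrates that grouping step. `sentence_chunks` is a hypothetical name, not part of this PR, and splitting on `.`, `!`, and `?` is a simplifying assumption.

```python
from typing import Iterable, Iterator

# Characters treated as sentence boundaries (a simplifying assumption).
SENTENCE_END = (".", "!", "?")


def sentence_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Hypothetical helper: buffer streamed tokens and yield whole sentences."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Emit the buffer as soon as it ends with sentence punctuation.
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    # Flush any trailing text that never reached a sentence boundary.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded chunk could then be passed to whatever synthesis call the API exposes. That call is left out here because its signature is not given in the text.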