Pre-release walkthrough. Run on a clean machine (or fresh conda env) before tagging v0.1.0.
- git clone into a fresh directory
- No .conda/, no __pycache__/, no output/ left over
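The leftover-directory check can also be scripted. The following is a minimal sketch, not part of the project: the helper name `find_leftovers` is hypothetical, and it assumes `.conda/` and `output/` only matter at the repository root while `__pycache__/` may appear at any depth.

```python
from pathlib import Path


def find_leftovers(root):
    """Return leftover build/runtime directories found under root (hypothetical helper)."""
    root = Path(root)
    # Assumption: .conda/ and output/ are only expected at the top level.
    found = [root / name for name in (".conda", "output") if (root / name).is_dir()]
    # __pycache__/ can be created next to any imported module, so search recursively.
    found.extend(p for p in root.rglob("__pycache__") if p.is_dir())
    return sorted(found)


if __name__ == "__main__":
    leftovers = find_leftovers(".")
    for path in leftovers:
        print(f"leftover: {path}")
    raise SystemExit(1 if leftovers else 0)
```

Run from the root of the fresh clone, an exit status of 0 means none of the three directories are present.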
- conda env create --file environment.yml succeeds
- conda activate text2audiobook succeeds
- python --version reports 3.11
- ffmpeg -version resolves (system PATH OK)
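The interpreter and ffmpeg checks above could be bundled into one script run inside the activated environment. This is a sketch under stated assumptions: `check_environment` is a hypothetical name, and the `version` and `which` parameters exist only so the function can be exercised without touching the real system.

```python
import shutil
import sys


def check_environment(required=(3, 11), tools=("ffmpeg",), version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means the checks pass."""
    # Default to the running interpreter's major.minor version.
    version = tuple(sys.version_info[:2]) if version is None else tuple(version[:2])
    problems = []
    if version != tuple(required):
        problems.append(
            f"Python {required[0]}.{required[1]} expected, found {version[0]}.{version[1]}"
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```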
- python -m pytest tests -q reports ≥ 266 passed, 1 skipped in < 5s
- Zero DeprecationWarning, zero SyntaxWarning in the run
- Skipped test is tests/test_openai_smoke.py (opt-in)
- python main.py opens the window with title "Text to Speech Converter"
- Provider dropdown shows OpenAI, Ollama, Kokoro (3 entries; no VibeVoice)
- Voice dropdown shows the 6 OpenAI voices by default
- Refresh Models → status shows "Loaded N OpenAI models" (or "OpenAI discovery failed (...) -- using fallback list" if API down; both are acceptable)
- Pick a short .txt file (1-2 paragraphs), name "smoke", click Start
- Window stays draggable during synthesis
- Status label visibly ticks: Reading input → Preparing text → Converting N chunk(s) → Merging audio → Conversion completed
- Provider/Quality/Model/Voice dropdowns are GREYED OUT during synthesis
- Success dialog appears with output path
- output/smoke.mp3 plays back as the input text in the chosen voice
- output/smoke_chunk_positions.txt lists chunk boundaries
- Clear all fields, click Start → dialog reads "Please provide: Input File, Output File Name, Model"
- Fast-double-click Start during an in-flight conversion → no second worker spawns (silent rejection per audit S1)
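For reference when testing the double-click item, one common way to get "no second worker spawns" is a non-blocking lock around the Start handler. The sketch below illustrates that technique only; the class name `ConversionGuard` is hypothetical and the application's actual mechanism may differ.

```python
import threading


class ConversionGuard:
    """Allows at most one in-flight conversion (illustrative, not the app's code)."""

    def __init__(self):
        self._lock = threading.Lock()

    def try_start(self) -> bool:
        """Return True if a worker may start, False if one is already running."""
        # A non-blocking acquire fails immediately instead of queueing a second click.
        return self._lock.acquire(blocking=False)

    def finish(self) -> None:
        """Call when the worker ends (success or failure) to allow the next Start."""
        self._lock.release()


guard = ConversionGuard()
first_click = guard.try_start()   # worker would be spawned here
second_click = guard.try_start()  # rejected silently while the first is in flight
guard.finish()
```

With this pattern the second click returns immediately without raising or showing a dialog, which matches the "silent rejection" behaviour the checklist expects.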
- Switch Provider → Ollama
- Refresh Models →
  - if running: status shows "Loaded N Ollama models" with TTS-only filter (no llama3/mistral in dropdown)
  - if not running: status shows "No Ollama models available -- connection refused (...) -- Is ollama serve running?" + warning dialog
- Pick a "supported" Ollama model (matching bark|kokoro|tts|speech), click Start → error dialog mentions "use the Kokoro provider" (Phase 6 documented limitation)
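To predict which entries should survive the TTS-only filter, the pattern quoted above can be applied to a model list by hand. This sketch assumes the pattern is matched anywhere in the model name and case-insensitively; both are assumptions, and the model names in the example are made up for illustration.

```python
import re

# Pattern taken from the checklist; IGNORECASE is an assumption, not confirmed behaviour.
TTS_PATTERN = re.compile(r"bark|kokoro|tts|speech", re.IGNORECASE)


def filter_tts_models(names):
    """Keep only model names that look like TTS models, preserving order."""
    return [name for name in names if TTS_PATTERN.search(name)]


# Hypothetical model names, for illustration only.
example = ["llama3", "mistral", "kokoro-82m", "bark-small", "my-TTS-model"]
kept = filter_tts_models(example)
```

Here `kept` retains the last three names and drops `llama3` and `mistral`, which is what the dropdown should show when the filter works.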
- Switch Provider → Kokoro; voice dropdown repopulates with af_heart, am_michael, etc. (20 voices)
- If kokoro lib missing: Start → error dialog reads "kokoro package not importable. Install: pip install kokoro soundfile huggingface_hub"
- If espeak-ng missing: Start → error dialog reads "espeak-ng not found on PATH. Install from https://github.com/espeak-ng/espeak-ng/releases ..."
- When both ready: pick same text file, click Start
- First run downloads pinned model revision to ~/.cache/huggingface (~500 MB; ~30-60s on broadband)
- Subsequent runs skip the download
- Output MP3 plays back the text in the chosen Kokoro voice (American English)
- OPENAI_SMOKE_TEST=1 OPENAI_API_KEY=sk-... python -m pytest tests/test_openai_smoke.py -v
- Reports PASSED in < 5 seconds
- Cost on OpenAI dashboard < $0.01
- python combine_and_convert.py opens the second window
- Pick 2+ MP3s + a background image + output folder + name; click Start
- Output .mp4 plays with image + concatenated audio
- Validation: clear all fields → error "Please provide: MP3 Files, Background Image, Output Folder, Output File Name"
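Both windows report missing fields in the same "Please provide: ..." format, listing labels in form order. A minimal sketch of how such a message could be assembled is shown below so the expected strings can be compared exactly. The function name `missing_fields_message` is hypothetical and this is not the application's code.

```python
def _is_blank(value):
    """Treat None, empty/whitespace strings, and empty collections as missing."""
    if isinstance(value, str):
        return not value.strip()
    return not value


def missing_fields_message(fields):
    """Return the validation message for a {label: value} mapping, or None if complete."""
    # Dicts preserve insertion order, so labels appear in the order the form lists them.
    missing = [label for label, value in fields.items() if _is_blank(value)]
    if not missing:
        return None
    return "Please provide: " + ", ".join(missing)


main_window = missing_fields_message(
    {"Input File": "", "Output File Name": " ", "Model": None}
)
combine_window = missing_fields_message(
    {"MP3 Files": [], "Background Image": "", "Output Folder": "", "Output File Name": ""}
)
```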
- git ls-files | grep key.txt returns nothing
- pytest tests/test_repo_hygiene.py passes (enforces key.txt not tracked)
- If OpenAI auth fails in step 4, logs show ***REDACTED*** not the raw key value
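When verifying the redaction item, it helps to know what a correctly scrubbed log line looks like. The sketch below shows one way to mask a known secret using the standard `logging.Filter` hook. The names `redact` and `RedactingFilter` are hypothetical, and the project's own redaction may be implemented differently.

```python
import logging

REDACTED = "***REDACTED***"


def redact(text: str, secret: str) -> str:
    """Replace every occurrence of secret in text with the placeholder."""
    if not secret:  # an empty secret would match between every character
        return text
    return text.replace(secret, REDACTED)


class RedactingFilter(logging.Filter):
    """Logging filter that masks one known secret in every record it sees."""

    def __init__(self, secret: str) -> None:
        super().__init__()
        self._secret = secret

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message first so a secret passed as a %-argument is caught too.
        record.msg = redact(record.getMessage(), self._secret)
        record.args = None
        return True  # keep the record, only its text is changed
```

Attaching such a filter to a handler with `handler.addFilter(RedactingFilter(api_key))` would scrub records before they are written. A dummy key is enough to exercise it; no real credential is needed.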
- All boxes checked OR documented exception in release notes
- Tag v0.1.0
- Push