Download the PHP package b7s/fluentvox without Composer
On this page you can find all versions of the php package b7s/fluentvox. It is possible to download/install these versions without Composer. Possible dependencies are resolved automatically.
Package fluentvox
Short Description Fluent PHP wrapper for Chatterbox TTS - State-of-the-art text-to-speech with voice cloning
License MIT
Information about the package fluentvox
This standalone, developer-friendly library exposes Resemble AI’s Chatterbox TTS through a fluent PHP API. It is built for production use and handles everything from model management to cross-platform compatibility on Linux, macOS, and Windows.
Whether you're crafting immersive voice experiences, automating audio generation, or building AI‑powered products, this wrapper gives you a clean, modern, and expressive toolkit that makes advanced TTS feel natural.
✨ Features
- 🎯 Fluent API - Laravel-inspired chainable interface
- 🌍 Cross-Platform - Linux, macOS (including Apple Silicon), and Windows support
- 📦 Automatic Management - Downloads and manages models automatically
- 🎭 Voice Cloning - Clone any voice from a reference audio file
- 🌐 Multilingual - Support for 23+ languages
- ⚡ GPU Acceleration - CUDA (NVIDIA), MPS (Apple Silicon), or CPU
- 🔒 Type-Safe - Full PHP 8.3+ type hints / PHPStan level 6
- 🛠️ Complete CLI - Command-line tools for installation and generation
- 🧪 Tests - Complete test suite with Pest PHP 4
Easy to use
📦 Installation
Install Dependencies
After installing the package, install the Python dependencies with `vendor/bin/fluentvox install`.
This will:
- Verify Python 3.10+ is installed
- Install Chatterbox TTS package
- Detect GPU acceleration (CUDA/MPS)
Using Python Virtual Environments (venv)
FluentVox automatically detects Python virtual environments. It will prioritize:
- Active venv (via the `VIRTUAL_ENV` environment variable)
- Local venv directories (`.venv`, `venv`, `venv311`, etc.) in the current directory and its parent directories
- System Python (fallback)
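The priority order above can be sketched as a small lookup. This is an illustration only, not FluentVox's actual detection code: the function name `findVenvPython` and its parameters are hypothetical, and it checks only the conventional interpreter location inside a venv (`bin/python` on Unix-like systems, `Scripts\python.exe` on Windows).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of FluentVox): returns the first venv
 * interpreter found, or null so the caller can fall back to system Python.
 */
function findVenvPython(?string $activeVenv, string $startDir, array $venvNames = ['.venv', 'venv']): ?string
{
    $sep = DIRECTORY_SEPARATOR;
    $interpreter = PHP_OS_FAMILY === 'Windows'
        ? 'Scripts' . $sep . 'python.exe'
        : 'bin' . $sep . 'python';

    // 1. An active venv, as advertised by the VIRTUAL_ENV variable.
    if ($activeVenv !== null && $activeVenv !== '') {
        $candidate = rtrim($activeVenv, '/\\') . $sep . $interpreter;
        if (is_file($candidate)) {
            return $candidate;
        }
    }

    // 2. Local venv directories, starting at $startDir and walking upward.
    $dir = $startDir;
    while (true) {
        foreach ($venvNames as $name) {
            $candidate = rtrim($dir, '/\\') . $sep . $name . $sep . $interpreter;
            if (is_file($candidate)) {
                return $candidate;
            }
        }
        $parent = dirname($dir);
        if ($parent === $dir) {
            break; // reached the filesystem root
        }
        $dir = $parent;
    }

    // 3. Nothing found: the caller falls back to system Python.
    return null;
}
```

A caller might use it as `findVenvPython(getenv('VIRTUAL_ENV') ?: null, getcwd() ?: '.')` and fall back to a system interpreter when the result is `null`.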
Recommended workflow:
Manual configuration:
If FluentVox doesn't detect your venv automatically, specify the Python path in fluentvox-config.php:
Check Installation
For the fastest generation, use a dedicated NVIDIA GPU:
- Minimum recommended: RTX 3060 12 GB (roughly 6x faster than CPU)
- Ideal: RTX 4060 / 4060 Ti
- Top: RTX 4070+
🚀 Quick Start
Basic Usage
Voice Cloning
Multilingual
Expressive Speech
Custom Sample Rate
🛠️ CLI Commands
install - Install Dependencies
doctor - Diagnose Installation
The doctor command checks your installation and shows:
- Platform compatibility
- Python and dependency status
- Model availability
- Default model detection: Identifies your configured default model and suggests downloading it if not available
models - Manage Models
generate - Generate Speech
Note: If you don't specify --model, the command will use the default_model from your fluentvox-config.php file.
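The fallback described in the note can be pictured as follows. This is a minimal sketch, not the command's real implementation: `resolveModel` is a hypothetical name, and it assumes the configuration is available as a plain PHP array containing a `default_model` key. The model names are the three listed under Available Models.

```php
<?php
declare(strict_types=1);

// Model names taken from the Available Models table.
const KNOWN_MODELS = ['chatterbox', 'chatterbox-turbo', 'chatterbox-multilingual'];

/**
 * Hypothetical helper: an explicit --model value wins; otherwise the
 * configured default_model is used. Anything else is rejected.
 */
function resolveModel(?string $cliModel, array $config): string
{
    $model = $cliModel ?? ($config['default_model'] ?? null);

    if (!is_string($model) || !in_array($model, KNOWN_MODELS, true)) {
        throw new InvalidArgumentException('No valid model given via --model or default_model.');
    }

    return $model;
}
```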
📖 API Reference
Model Selection
Choose which Chatterbox model to use based on your needs. Each model has different capabilities, performance characteristics, and language support.
Voice Cloning
Clone any voice by providing a reference audio sample. The model will mimic the speaker's voice characteristics, tone, and speaking style in the generated speech.
Language (Multilingual Model)
Specify the target language when using the multilingual model. The model will generate speech with native pronunciation and intonation for the selected language.
Supported Languages: Arabic, Danish, German, Greek, English, Spanish, Finnish, French, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Swahili, Turkish, Chinese
Expression Controls
Control the emotional intensity and expressiveness of the generated speech. Higher values produce more dramatic, animated voices while lower values create calmer, more subdued speech.
Pace/CFG Controls
Adjust the rhythm and speed of speech delivery. CFG (Classifier-Free Guidance) weight controls how closely the model follows the text pacing. Lower values create slower, more deliberate speech while higher values speed up delivery.
Randomness Controls
Control the variability and creativity in speech generation. Temperature affects how predictable vs. varied the output is. Seeds allow you to reproduce exact results.
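To build intuition for these two controls, here is a generic illustration of temperature-scaled sampling with a fixed seed. It is not Chatterbox's internal code and calls nothing from FluentVox: `softmaxWithTemperature` and `sampleIndex` are hypothetical names, and the logits are made-up numbers.

```php
<?php
declare(strict_types=1);

/** Converts raw scores into probabilities; lower temperature sharpens them. */
function softmaxWithTemperature(array $logits, float $temperature): array
{
    if ($logits === [] || $temperature <= 0.0) {
        throw new InvalidArgumentException('Need at least one logit and a positive temperature.');
    }

    $scaled = array_map(fn (float $l): float => $l / $temperature, $logits);
    $max = max($scaled); // subtract the maximum for numerical stability
    $exp = array_map(fn (float $s): float => exp($s - $max), $scaled);
    $sum = array_sum($exp);

    return array_map(fn (float $e): float => $e / $sum, $exp);
}

/** Draws one index from the distribution; the same seed gives the same draw. */
function sampleIndex(array $probabilities, int $seed): int
{
    mt_srand($seed);
    $r = mt_rand() / mt_getrandmax(); // uniform value in [0, 1]

    $cumulative = 0.0;
    foreach (array_values($probabilities) as $index => $p) {
        $cumulative += $p;
        if ($r <= $cumulative) {
            return $index;
        }
    }

    return count($probabilities) - 1; // guard against rounding error
}
```

With the same logits, a low temperature concentrates probability on the top choice (more predictable output), a high temperature flattens the distribution (more varied output), and repeating a seed repeats the draw.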
Audio Processing
Configure how reference audio is processed for voice cloning. VAD (Voice Activity Detection) can remove silence and background noise from reference clips.
Device Selection
Choose which hardware to use for audio generation. GPU acceleration (CUDA/MPS) is significantly faster than CPU but requires compatible hardware.
Output Configuration
Configure where and how the generated audio is saved, with options for timeouts and progress monitoring.
Sample Rate Notes:
- The model's native rate is 24 kHz (24000 Hz)
- If you specify a different rate, the audio is resampled automatically
- Common rates: 16000 (telephony), 24000 (native), 44100 (CD), 48000 (professional)
- Higher rates produce larger files; upsampling beyond the native 24 kHz cannot add detail the model did not generate, but it can match what a playback or editing pipeline expects
- Resampling adds minimal processing time
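The size trade-off is simple arithmetic. The sketch below assumes uncompressed 16-bit mono PCM, which this document does not confirm as the output format; `pcmBytes` and `resampleRatio` are hypothetical helper names.

```php
<?php
declare(strict_types=1);

// Native rate stated in the notes above.
const NATIVE_SAMPLE_RATE = 24000;

/** Size of raw PCM audio in bytes (assumption: 16-bit mono unless overridden). */
function pcmBytes(float $seconds, int $sampleRate, int $bitsPerSample = 16, int $channels = 1): int
{
    return (int) round($seconds * $sampleRate * ($bitsPerSample / 8) * $channels);
}

/** Factor by which the sample count changes when resampling from the native rate. */
function resampleRatio(int $targetRate, int $nativeRate = NATIVE_SAMPLE_RATE): float
{
    return $targetRate / $nativeRate;
}

// Ten seconds of audio at each common rate.
foreach ([16000, 24000, 44100, 48000] as $rate) {
    printf("%5d Hz: %7d bytes (x%.2f samples)\n", $rate, pcmBytes(10.0, $rate), resampleRatio($rate));
}
```

Under these assumptions, ten seconds at 24000 Hz is 480000 bytes, and 48000 Hz doubles that without adding information.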
Presets
Pre-configured combinations of settings optimized for common use cases. These presets adjust expression, pace, and temperature for specific scenarios.
Execution
Generate the audio file or retrieve raw audio data for further processing.
Audio Conversion
Convert generated audio to different formats using FFmpeg. All conversion methods generate the audio first, then convert it. The format is automatically detected from the file extension.
Supported Formats:
- MP3: Universal compatibility, good compression (bitrate: 64-320 kbps)
- M4A/AAC: Apple devices, slightly better quality than MP3 at the same bitrate (bitrate: 64-256 kbps)
- OGG Vorbis: Web streaming, open format (quality: 0-10)
- Opus: Best compression for voice, modern browsers (bitrate: 32-256 kbps)
- FLAC: Lossless compression, archival quality
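Detecting the format from the file extension and checking a bitrate against the ranges above might look like this. It is a sketch under stated assumptions, not FluentVox's conversion code: the function names are hypothetical, and the extension-to-format map is a guess at typical extensions.

```php
<?php
declare(strict_types=1);

// Assumption: typical file extensions for each format listed above.
const FORMAT_BY_EXTENSION = [
    'mp3'  => 'mp3',
    'm4a'  => 'aac',
    'aac'  => 'aac',
    'ogg'  => 'vorbis',
    'opus' => 'opus',
    'flac' => 'flac',
];

// Bitrate ranges in kbps, copied from the Supported Formats list.
const BITRATE_RANGE_KBPS = [
    'mp3'  => [64, 320],
    'aac'  => [64, 256],
    'opus' => [32, 256],
];

/** Maps an output path to a format name based on its extension. */
function detectFormat(string $outputPath): string
{
    $extension = strtolower(pathinfo($outputPath, PATHINFO_EXTENSION));

    if (!array_key_exists($extension, FORMAT_BY_EXTENSION)) {
        throw new InvalidArgumentException("Unsupported extension: '{$extension}'");
    }

    return FORMAT_BY_EXTENSION[$extension];
}

/** True when the format is bitrate-controlled and $kbps is within its range. */
function isBitrateAllowed(string $format, int $kbps): bool
{
    if (!array_key_exists($format, BITRATE_RANGE_KBPS)) {
        return false; // Vorbis uses a quality scale and FLAC is lossless
    }

    [$min, $max] = BITRATE_RANGE_KBPS[$format];

    return $kbps >= $min && $kbps <= $max;
}
```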
Result Object
The GenerationResult object contains information about the generated audio and metadata about the generation process.
Static Helpers
Utility methods for system checks, installation, model management, and audio conversion without creating a FluentVox instance.
⚙️ Configuration
Create a fluentvox-config.php file in your project root:
Configuration File Location:
The configuration file is searched in the following order:
- Explicit path (if provided programmatically)
- Project root (where `composer.json` is located)
- Current working directory
- Package root (fallback)
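The search order above amounts to "first existing file wins". The sketch below illustrates that idea only; `findProjectRoot` and `locateConfig` are hypothetical names, and falling through when an explicit path does not exist is an assumption rather than documented behavior.

```php
<?php
declare(strict_types=1);

/** Walks up from $startDir until a directory containing composer.json is found. */
function findProjectRoot(string $startDir): ?string
{
    $dir = $startDir;
    while (true) {
        if (is_file($dir . DIRECTORY_SEPARATOR . 'composer.json')) {
            return $dir;
        }
        $parent = dirname($dir);
        if ($parent === $dir) {
            return null; // reached the filesystem root
        }
        $dir = $parent;
    }
}

/** Returns the first configuration file found in the documented order, or null. */
function locateConfig(
    ?string $explicitPath,
    string $cwd,
    string $packageRoot,
    string $fileName = 'fluentvox-config.php'
): ?string {
    // 1. Explicit path (assumption: ignored when the file does not exist).
    if ($explicitPath !== null && is_file($explicitPath)) {
        return $explicitPath;
    }

    // 2-4. Project root, current working directory, package root.
    $directories = array_filter([findProjectRoot($cwd), $cwd, $packageRoot]);
    foreach ($directories as $dir) {
        $candidate = $dir . DIRECTORY_SEPARATOR . $fileName;
        if (is_file($candidate)) {
            return $candidate;
        }
    }

    return null;
}
```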
Default Model:
The default_model setting is automatically used by:
- The CLI `generate` command, when `--model` is not specified
- `FluentVox::make()` instances
- The `doctor` command, which detects and highlights the default model status
To download the default model, run:
📊 Available Models
| Model | Size | Languages | Features | Best For |
|---|---|---|---|---|
| `chatterbox` | 500M | English | CFG & exaggeration tuning | General TTS with creative controls |
| `chatterbox-turbo` | 350M | English | Paralinguistic tags `[laugh]`, `[cough]` | Voice agents, low latency |
| `chatterbox-multilingual` | 500M | 23+ | Zero-shot cloning, multiple languages | Global applications |
📋 Requirements
- PHP 8.3+
- Composer 2+
- Python 3.10+ (system Python or virtual environment)
- PyTorch (auto-installed)
- Chatterbox TTS (auto-installed)
- FFmpeg (for audio conversion)
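Minimum-version requirements like these are easy to check incorrectly, because a plain comparison treats 3.9 as newer than 3.10. The sketch below uses PHP's built-in `version_compare`, which compares each component numerically; `meetsMinimum` is a hypothetical helper, not part of FluentVox, and the version strings are example inputs.

```php
<?php
declare(strict_types=1);

/**
 * Extracts the first dotted version number from a string such as
 * "Python 3.10.12" and checks it against a minimum version.
 */
function meetsMinimum(string $versionOutput, string $minimum): bool
{
    if (preg_match('/\d+(?:\.\d+)+/', $versionOutput, $match) !== 1) {
        return false; // no recognizable version number
    }

    return version_compare($match[0], $minimum, '>=');
}

// The running PHP interpreter against the documented requirement.
var_dump(meetsMinimum(PHP_VERSION, '8.3'));
```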
Note: Using a Python virtual environment (venv) is highly recommended to avoid conflicts with system packages. FluentVox automatically detects and uses venv when available.
Note: Chatterbox TTS has its own system requirements and dependencies. If you encounter installation issues, run `vendor/bin/fluentvox doctor` to diagnose problems, or `vendor/bin/fluentvox install --verbose` for detailed installation logs. Ensure all Python dependencies are properly installed before use.
GPU Acceleration (Optional)
- NVIDIA GPU: CUDA 11.8+ (Linux/Windows)
- Apple Silicon: MPS (macOS M1/M2/M3)
🌐 Platform Support
| Platform | Architecture | GPU Support |
|---|---|---|
| Linux | x86_64, arm64 | CUDA |
| macOS | x86_64, arm64 (Apple Silicon) | MPS |
| Windows | x86_64 | CUDA |
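The table can be read as a mapping from operating system to GPU backend. Here is a minimal sketch of that mapping using PHP's `PHP_OS_FAMILY` values; `gpuBackendFor` is a hypothetical name, and a non-null result only means the backend is supported on that platform, not that suitable hardware or drivers are present.

```php
<?php
declare(strict_types=1);

/**
 * Maps a PHP_OS_FAMILY value to the GPU backend listed in the table above.
 * Returns null for platforms the table does not cover (CPU only).
 */
function gpuBackendFor(string $osFamily): ?string
{
    return match ($osFamily) {
        'Linux', 'Windows' => 'cuda',
        'Darwin'           => 'mps',
        default            => null,
    };
}

// Backend candidate for the machine running this script.
echo gpuBackendFor(PHP_OS_FAMILY) ?? 'cpu', PHP_EOL;
```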
Running Tests
📄 License
MIT License - see LICENSE file.
🙏 Credits
- Resemble AI - Chatterbox TTS model
- b7s/whisper-php - Inspiration for API design
- b7s/YtPilot - Inspiration for fluent API