This gem mirrors the public interface of elevenlabs-python and provides a Ruby-native client with the same resource tree, streaming helpers, and request semantics. The client is generated directly from the upstream SDK, so keeping pace with new endpoints only requires re-running the included extraction script.
This gem is published to GitHub Packages (not RubyGems.org). Add the GitHub Packages source to your Gemfile:
source "https://rubygems.pkg.github.com/architecture" do
gem "elevenlabs", "0.4.0"
end
GitHub Packages requires authentication. Create a personal access token with the read:packages scope and add it to ~/.gem/credentials:
---
:github: Bearer <YOUR_TOKEN>
Then run bundle install.
Note: The gem install elevenlabs command shown on the GitHub Packages page points to RubyGems.org and will not work. This gem is only available via GitHub Packages.
Bundler can pull the gem straight from the git repository. This works for public repos without any token setup:
# Pin to a release tag (recommended for production)
gem "elevenlabs", git: "https://github.com/architecture/elevenlabs-ruby", tag: "v0.4.0"
# Or track the latest main branch
gem "elevenlabs", git: "https://github.com/architecture/elevenlabs-ruby", branch: "main"
Then run bundle install.
require "elevenlabs"
client = ElevenLabs::Client.new(api_key: ENV.fetch("ELEVENLABS_API_KEY"))
# Mirrors client.history.list(...) from Python
history = client.history.list(page_size: 5)
history["history"].each do |item|
puts "#{item["voice_name"]}: #{item["text"]}"
end
Every namespace from the Python SDK shows up at the same path:
client.voices.get_all
client.text_to_speech.convert("voice_id", text: "Hello!")
client.conversational_ai.agents.list
client.workspace.invites.create(email: "teammate@example.com")
Optional parameters default to the ElevenLabs::OMIT sentinel. Pass nil to send an explicit null, or skip the argument entirely to leave it out of the payload.
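To make the nil-versus-omitted distinction concrete, here is a minimal, self-contained sketch of the sentinel pattern. The OMIT constant and the build_payload helper below are stand-ins written for illustration; they are not the gem's actual implementation of ElevenLabs::OMIT, which may differ.

```ruby
require "json"

# Stand-in sentinel (assumption): a unique object that can never collide with a real value.
OMIT = Object.new.freeze

# Hypothetical helper: drops omitted keys but keeps explicit nils.
def build_payload(text:, voice_settings: OMIT, seed: OMIT)
  { "text" => text, "voice_settings" => voice_settings, "seed" => seed }
    .reject { |_key, value| value.equal?(OMIT) }
end

omitted = build_payload(text: "Hello")
explicit_null = build_payload(text: "Hello", seed: nil)

puts JSON.generate(omitted)        # {"text":"Hello"}
puts JSON.generate(explicit_null)  # {"text":"Hello","seed":null}
```

Comparing with equal? (object identity) rather than == keeps a legitimate nil or false value from being mistaken for "not provided".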
Every top-level namespace described in lib/elevenlabs/spec.json is available under client. That includes:
audio_isolation, audio_native, conversational_ai, dubbing, environment_variables, forced_alignment, history, models, music, pronunciation_dictionaries, samples, service_accounts, speech_to_speech, speech_to_text, studio, text_to_dialogue, text_to_sound_effects, text_to_speech, text_to_voice, tokens, usage, user, voices, webhooks, workspace.
Below are example snippets that demonstrate each namespace. Substitute IDs and payloads with real values from your account.
# audio_isolation
clean = client.audio_isolation.convert(audio: ElevenLabs::Upload.from_path("noisy.wav"))
# audio_native
client.audio_native.projects.list
# conversational_ai
client.conversational_ai.agents.list(page_size: 10)
# environment_variables
client.environment_variables.list(page_size: 10)
client.environment_variables.create(request: { "label" => "API_KEY", "type" => "secret", "values" => { "production" => "sk-123" } })
# dubbing
client.dubbing.transcript.create(
project_id: "proj_123",
source_language: "en",
target_languages: ["es"]
)
# forced_alignment
client.forced_alignment.jobs.create(audio: ElevenLabs::Upload.from_path("clip.wav"))
# history
client.history.list(page_size: 20)
# models
client.models.list
# music
client.music.composition_plan.create(prompt: "lofi chill beats")
client.music.upload(file: ElevenLabs::Upload.from_path("track.mp3"))
# pronunciation_dictionaries
client.pronunciation_dictionaries.list
client.pronunciation_dictionaries.rules.set(
pronunciation_dictionary_id: "dict_123",
rules: [{ "type" => "phoneme", "string_to_replace" => "ElevenLabs", "phoneme" => "ɛlɛvənlæbz", "alphabet" => "ipa" }]
)
# samples
client.samples.list
# service_accounts
client.service_accounts.api_keys.list
# speech_to_speech
client.speech_to_speech.convert(
voice_id: "voice_123",
audio: ElevenLabs::Upload.from_path("input.wav")
)
# speech_to_text
client.speech_to_text.convert(model_id: "scribe_v1", file: ElevenLabs::Upload.from_path("meeting.mp3"))
# studio
client.studio.projects.list
# text_to_dialogue
client.text_to_dialogue.convert(
inputs: [
{ "voice_id" => "voice_a", "text" => "Hi!" },
{ "voice_id" => "voice_b", "text" => "Hey there!" }
]
)
# text_to_sound_effects
client.text_to_sound_effects.convert(text: "city street ambience")
# text_to_speech
client.text_to_speech.convert("voice_id", text: "Hello world")
# text_to_voice
client.text_to_voice.create(text: "Generate new voice preview")
# tokens
client.tokens.single_use.create(
voice_id: "voice_123",
usage_limit: 3
)
# usage
client.usage.get
# user
client.user.get
# voices
client.voices.get_all
# webhooks
client.webhooks.list
# workspace
client.workspace.members.list
client.workspace.auth_connections.list
client.workspace.auth_connections.create(request: { "type" => "oauth2", "label" => "My OAuth" })
Each namespace exposes the full set of nested resources (for example client.conversational_ai.knowledge_base.documents.create) exactly as defined in the Python SDK.
stream = client.text_to_speech.convert(
"pNInz6obpgDQGcFmaJgB",
text: "Welcome to ElevenLabs Ruby!",
output_format: "mp3_44100_128",
model_id: "eleven_monolingual_v1"
)
File.open("welcome.mp3", "wb") { |f| stream.each { |chunk| f.write(chunk) } }
stream = client.text_to_dialogue.convert(
inputs: [
{ "voice_id" => "voice_a", "text" => "Hello, how are you?" },
{ "voice_id" => "voice_b", "text" => "Doing great, thanks!" }
],
model_id: "eleven_dialogue_v1"
)
File.open("dialogue.wav", "wb") { |f| stream.each { |chunk| f.write(chunk) } }
upload = ElevenLabs::Upload.from_path("meeting.m4a", content_type: "audio/mp4a-latm")
transcript = client.speech_to_text.convert(
model_id: "scribe_v1",
file: upload,
language_code: "en",
diarize: true
)
puts transcript["text"]
agents = client.conversational_ai.agents.list(page_size: 10)
agents["agents"].each do |agent|
puts "#{agent["agent_id"]} => #{agent["name"]}"
end
client.conversational_ai.agents.link.create(
agent_id: agents["agents"].first["agent_id"],
workspace_group_id: "group_123"
)
client.workspace.invites.create(email: "teammate@example.com", role: "member")
Streaming endpoints return Ruby Enumerators, so you can write the same buffering logic you would use with the Python SDK:
stream = client.text_to_speech.convert("voice_id", text: "Streaming example")
File.open("hello.mp3", "wb") do |file|
stream.each { |chunk| file.write(chunk) }
end
Use request_options: { chunk_size: 4096 } to tweak streaming chunk sizes.
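For readers new to chunked Enumerators, the following sketch shows the general pattern without touching the network. The each_chunk helper is hypothetical and only imitates the shape of a streaming response; the gem's own streaming internals are not shown in this README and may work differently.

```ruby
require "stringio"

# Hypothetical helper: wraps any IO in an Enumerator that yields fixed-size chunks.
def each_chunk(io, chunk_size: 4096)
  Enumerator.new do |yielder|
    # IO#read(n) returns nil at end of input, which ends the loop.
    while (chunk = io.read(chunk_size))
      yielder << chunk
    end
  end
end

source = StringIO.new("a" * 10_000) # stands in for an audio response body
sink = StringIO.new                 # stands in for the output file

stream = each_chunk(source, chunk_size: 4096)
sizes = stream.map do |chunk|
  sink.write(chunk)
  chunk.bytesize
end

p sizes # [4096, 4096, 1808]
```

Because the Enumerator is lazy about reading, only one chunk is held in memory at a time, which is the reason to prefer this over reading the whole body before writing.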
Each call accepts a request_options: hash mirroring the Python SDK:
require "securerandom" # provides SecureRandom.uuid used below
client.history.list(
page_size: 25,
request_options: {
timeout_in_seconds: 30,
additional_headers: { "x-trace-id" => SecureRandom.uuid },
additional_query_parameters: { debug: true }
}
)
additional_body_parameters merge into JSON/form payloads, giving you a consistent escape hatch when the API adds fields before the SDK regenerates.
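As a rough mental model of how these options combine with a request, consider the sketch below. The apply_request_options function is hypothetical and written only for illustration; the option names come from this README, but the gem's actual merge order and internals are assumptions here.

```ruby
# Hypothetical merge step: layers per-call extras on top of the generated request parts.
def apply_request_options(headers:, query:, body:, request_options: {})
  {
    headers: headers.merge(request_options.fetch(:additional_headers, {})),
    query: query.merge(request_options.fetch(:additional_query_parameters, {})),
    body: body.merge(request_options.fetch(:additional_body_parameters, {}))
  }
end

request = apply_request_options(
  headers: { "accept" => "application/json" },
  query: { page_size: 25 },
  body: { "text" => "Hello" },
  request_options: {
    additional_headers: { "x-trace-id" => "abc123" },
    # "new_field" is a made-up name standing in for a field the SDK does not know yet.
    additional_body_parameters: { "new_field" => true }
  }
)

p request[:body] # {"text"=>"Hello", "new_field"=>true}
```

In this sketch the extras win on key collisions because Hash#merge prefers the argument's values; whether the gem resolves collisions the same way is not stated in this README.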
Use ElevenLabs::Upload to wrap local paths, strings, or IO objects for multipart endpoints:
upload = ElevenLabs::Upload.from_path("sample.wav", content_type: "audio/wav")
client.voices.pvc.samples.create("voice_id", files: [upload], remove_background_noise: true)
Uploads created from paths auto-close their file handles after the request finishes. For custom IO objects, pass auto_close: true so the SDK can close them:
io = File.open("clip.mp3", "rb")
upload = ElevenLabs::Upload.from_io(io, auto_close: true)
client.voices.ivc.samples.create("voice_id", files: [upload])
You can stub ElevenLabs::Upload.file_opener in tests to avoid touching the filesystem.
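The exact shape of ElevenLabs::Upload.file_opener is not documented here, so the sketch below uses a stand-in class, FakeUpload, and assumes the opener is a swappable callable that takes a path and returns an IO. Treat it as an illustration of the stubbing pattern rather than the gem's confirmed API.

```ruby
require "stringio"

# Stand-in for the gem's Upload class (assumption: file_opener is a callable taking a path).
class FakeUpload
  class << self
    attr_writer :file_opener

    # Default opener reads from disk; tests can replace it.
    def file_opener
      @file_opener ||= ->(path) { File.open(path, "rb") }
    end

    def from_path(path)
      new(file_opener.call(path))
    end
  end

  attr_reader :io

  def initialize(io)
    @io = io
  end
end

# In a test: swap the opener, exercise the code, then always restore the original.
original = FakeUpload.file_opener
FakeUpload.file_opener = ->(_path) { StringIO.new("fake audio bytes") }
begin
  upload = FakeUpload.from_path("does_not_exist.wav")
  contents = upload.io.read
ensure
  FakeUpload.file_opener = original
end

puts contents # fake audio bytes
```

Restoring the opener in an ensure block keeps one test's stub from leaking into the next.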
Non-success responses raise ElevenLabs::HTTPError with useful context:
begin
client.history.get("missing_id")
rescue ElevenLabs::HTTPError => e
warn "HTTP #{e.status}"
warn e.body.inspect
end
The gem includes scripts/extract_spec.py, which parses the upstream Python SDK and writes lib/elevenlabs/spec.json. Run the script after pulling the latest upstream changes:
python3 scripts/extract_spec.py
This keeps every endpoint, request shape, and child resource in sync without hand-editing Ruby code.
Run the full test suite using Rake:
rake test
Or run individual test files:
ruby -Ilib:test test/operation_serialization_test.rb
ruby -Ilib:test test/http_client_test.rb
ruby -Ilib:test test/utils_test.rb
ruby -Ilib:test test/upload_test.rb
ruby -Ilib:test test/errors_test.rb
ruby -Ilib:test test/client_test.rb
ruby -Ilib:test test/environment_test.rb
Test Coverage (146 tests, 394 assertions):
- operation_serialization_test.rb: tests request serialization for various operations
- operation_executor_test.rb: tests path building, query/body/file resolution, streaming dispatch, request_options forwarding
- http_client_test.rb: tests file upload handling, redirect following, streaming cleanup
- http_client_headers_test.rb: tests default headers, API key injection, JSON/form body prep, timeouts, response parsing, error handling
- resources_test.rb: tests spec loading, class generation, operation methods, child resource accessors, caching, deep nesting
- utils_test.rb: tests utility functions (deep_dup, assign_path, deep_compact, etc.)
- upload_test.rb: tests Upload helper methods for files, bytes, strings, and IO
- errors_test.rb: tests error classes and their attributes
- client_test.rb: tests client initialization, resource caching, all namespace accessors, custom environments
- environment_test.rb: tests environment URL resolution
Launch IRB with the project on the load path:
irb -Ilib -Itest
From there you can require "elevenlabs" or load specific test files to iterate interactively.
To generate and install a local build for testing:
gem build elevenlabs-ruby.gemspec
gem install ./elevenlabs-$(ruby -Ilib -e 'require "elevenlabs"; puts ElevenLabs::VERSION').gem
You can then require the gem from any project (or IRB) and point Gemfile entries to the local path if desired:
gem "elevenlabs", path: "/path/to/elevenlabs-ruby"
Updated lib/elevenlabs/spec.json by running the extraction script against elevenlabs-python v2.40.0 (commits #745 and #750, March 2026).
New Namespaces:
- environment_variables: manage ConvAI environment variables (list, create, get, update)
- workspace.auth_connections: manage workspace auth connections (list, create, delete)
- conversational_ai.knowledge_base.document: individual document operations (refresh, compute_rag_index)
New/Updated Parameters:
- speech_to_text.convert: added entity_redaction and entity_redaction_mode for PII redaction
- speech_to_text.convert: added keyterms for keyword boosting
Bug Fixes:
- Fixed build_body to handle empty-path assignments (operations where the request param IS the body, such as environment_variables.create and workspace.auth_connections.create)
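To illustrate what an empty-path assignment means, here is a hypothetical helper in the spirit of that fix. The name assign_path appears in the test list above, but its real signature and behavior are not shown in this README, so everything below is an assumption made for illustration.

```ruby
# Hypothetical helper: writes value into body at the given key path.
# An empty path means the value itself is the body, so it is merged at the top level.
def assign_path(body, path, value)
  return body.merge(value) if path.empty?

  *parents, leaf = path
  # Walk (and create) intermediate hashes, then set the final key.
  node = parents.reduce(body) { |hash, key| hash[key] ||= {} }
  node[leaf] = value
  body
end

whole_body = assign_path({}, [], { "label" => "API_KEY", "type" => "secret" })
nested = assign_path({}, ["voice_settings", "stability"], 0.5)

p whole_body # {"label"=>"API_KEY", "type"=>"secret"}
p nested     # {"voice_settings"=>{"stability"=>0.5}}
```

Without the early return, an empty path leaves no final key to assign to, which is the kind of case the fix above describes.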
New Types: Auth connections (OAuth2, JWT, basic, bearer, custom header, WhatsApp), environment variables, conditional AST operators, telephony direction, guardrail trigger actions, content threshold guardrails, LLM literal JSON schema properties, and more.
Test suite now at 146 runs, 394 assertions, 0 failures.
Updated lib/elevenlabs/spec.json by running the extraction script against elevenlabs-python v2.39.1 (commit 8303d37, SDK regeneration #744 — March 2026).
New Operations:
- workspace.groups.list: list all groups in the workspace
New/Updated Parameters:
- audio_native.update_content_from_url: added author and title optional params
- conversational_ai.batch_calls.create: added target_concurrency_limit for controlling simultaneous call dispatch
- conversational_ai.users.list: added branch_id filter and sort_by ordering
- conversational_ai.whatsapp_accounts.update: added enable_audio_message_response
- music.compose: added respect_sections_durations for stricter section timing
- speech_to_text.convert: added no_verbatim to strip filler words (scribe_v2)
- workspace.invites.create: added seat_type param
Removed Parameters:
- conversational_ai.agents.create/update: removed coaching_settings
Test suite now at 65 runs, 182 assertions, 0 failures.
Added serialization tests and README examples for the two new operations introduced in SDK #740:
- music.upload: multipart file upload test covering path, form field, and file entry
- pronunciation_dictionaries.rules.set: JSON body test covering path and rules payload
Test suite now at 57 runs, 144 assertions, 0 failures.
Updated lib/elevenlabs/spec.json by running the extraction script against the latest elevenlabs-python SDK (commit 78ed67e, SDK regeneration #740 — March 2026).
New Operations:
- music.upload: upload an audio file for use in music workflows
- pronunciation_dictionaries.rules.set: replace the full rules set on a pronunciation dictionary
New Ruby access patterns:
client.music.upload(file: ElevenLabs::Upload.from_path("track.mp3"))
client.pronunciation_dictionaries.rules.set(
pronunciation_dictionary_id: "dict_123",
rules: [{ "type" => "phoneme", "string_to_replace" => "ElevenLabs", "phoneme" => "ɛlɛvənlæbz", "alphabet" => "ipa" }]
)
New Types:
CheckServiceAvailabilityParams, CreateAssetParams, CreateClientAppointmentParams, CustomGuardrailsConfigInput, CustomGuardrailsConfigOutput, DeleteAssetParams, DeleteCalendarEventParams, GetClientAppointmentsParams, GuardrailExecutionMode, ListCalendarEventsParams, MusicUploadResponse, RequiredConstraint, RequiredConstraints, StudioAgentSettingsModel, StudioAgentToolSettingsModel, TelephonyCallConfig, UpdateAssetParams, UpdateCalendarEventParams, VoiceStatisticsResponseModel
The HTTP client now automatically follows 3xx redirects when the server returns a redirect response. This means requests that hit moved or redirected endpoints will transparently re-issue to the new location without any change to your calling code.
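The sketch below shows one common way to structure redirect following, using a fake transport so it runs without a network. The follow_redirects function, the max_redirects cap, and the [status, headers, body] response shape are all assumptions made for illustration; the gem's actual redirect handling (which 3xx codes it follows, and any limit) is not specified in this README.

```ruby
# Hypothetical redirect loop. fetch is any callable returning [status, headers, body].
def follow_redirects(url, fetch, max_redirects: 5)
  # Allows the initial request plus up to max_redirects follow-ups.
  max_redirects.downto(0) do
    status, headers, body = fetch.call(url)
    return [url, status, body] unless (300..399).cover?(status) && headers["location"]

    url = headers["location"]
  end
  raise "too many redirects"
end

# Fake transport standing in for real HTTP calls.
responses = {
  "https://api.example.com/old" => [301, { "location" => "https://api.example.com/new" }, ""],
  "https://api.example.com/new" => [200, {}, "ok"]
}
fetch = ->(url) { responses.fetch(url) }

final_url, status, body = follow_redirects("https://api.example.com/old", fetch)
puts "#{status} from #{final_url}: #{body}" # 200 from https://api.example.com/new: ok
```

Capping the number of hops guards against redirect loops, where two URLs point at each other indefinitely.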
Updated lib/elevenlabs/spec.json by running the extraction script against the latest elevenlabs-python SDK (commit f71bcd8, SDK regeneration #736 — March 2026). 6 new operations added.
New Operations:
- audio_native.update_content_from_url: update audio native content from a URL
- conversational_ai.conversations.files.create: upload files within a conversation context
- conversational_ai.conversations.files.delete: delete files from a conversation
- conversational_ai.conversations.messages.search: search conversation messages
- conversational_ai.conversations.messages.text_search: text search across conversation messages
- conversational_ai.llm.list: list available LLMs for conversational AI
New Ruby access patterns:
client.audio_native.update_content_from_url(project_id: "proj_123", url: "https://example.com/audio.mp3")
client.conversational_ai.conversations.files.create(conversation_id: "conv_123", file: upload)
client.conversational_ai.conversations.files.delete(conversation_id: "conv_123", file_id: "file_456")
client.conversational_ai.conversations.messages.search(conversation_id: "conv_123", query: "hello")
client.conversational_ai.conversations.messages.text_search(conversation_id: "conv_123", query: "hello")
client.conversational_ai.llm.list
Upstream changes also included:
- Coaching settings for agent create/patch operations
- New types: ClipAnimation, CoachingAgentSettings, FocusGuardrail, PromptInjectionGuardrail, ReferenceVideo, LlmInfoModel, ConstantSchemaOverride, DynamicVariableSchemaOverride, MessagesSearchResponse, ConversationHistoryTranscriptResponseModel, PrivacyConfigOutput, ProcedureRefResponseModel, WidgetConfig, GenerationSourceContext
- Renamed: AlignmentGuardrail → FocusGuardrail, PrivacyConfig → PrivacyConfigInput
Updated lib/elevenlabs/spec.json by running the extraction script against the latest elevenlabs-python SDK (commit 0b87e77, SDK regeneration #730 — February 16, 2026). This is a major update with many new endpoints.
New Namespaces:
- MCP server management (client.conversational_ai.mcp_servers + tool approvals, tool configs, tools listing)
- Agent branches (client.conversational_ai.agents.branches): list, create, get, update, merge
- Agent deployments (client.conversational_ai.agents.deployments)
- Agent drafts (client.conversational_ai.agents.drafts)
- Conversational AI tests (client.conversational_ai.tests) with invocations sub-resource
- Conversational AI tools (client.conversational_ai.tools)
- Twilio integration (client.conversational_ai.twilio): outbound calls and call registration
- SIP trunk (client.conversational_ai.sip_trunk): outbound calls
- Analytics (client.conversational_ai.analytics.live_count)
- Dashboard settings (client.conversational_ai.dashboard.settings)
- LLM usage (client.conversational_ai.llm_usage, client.conversational_ai.agents.llm_usage)
- Users listing (client.conversational_ai.users)
- Agent simulation (client.conversational_ai.agents.simulate_conversation, simulate_conversation_stream, run_tests)
- Professional voice cloning expanded (client.voices.pvc): samples, speakers, verification, captcha, waveform
Enhanced Features:
- Music: added compose_detailed, stream, separate_stems operations
- Text-to-dialogue: added stream_with_timestamps and convert_with_timestamps
- Dubbing: expanded resource operations (transcribe, translate, dub, render, segment/speaker management)
- Studio: added get_muted_tracks for projects
- Knowledge base: added rag_index_overview, per-document RAG index compute, chunk and summary retrieval
Added comprehensive test suite with 45 tests and 100 assertions covering:
- Utils module (deep_dup, assign_path, deep_compact, symbolize_keys, encode_path_segment)
- Upload helpers (from_bytes, from_string, from_io, from_path)
- Error classes (HTTPError attributes and inheritance)
- Client initialization and resource caching
- Environment URL resolution for all regions
- Created Rakefile for easy test execution (rake test)
All tests passing with 0 failures.
Updated .github/workflows/gem-push.yml to use Ruby 3.3 (from 2.6.x) to match the project's Ruby version requirements. Also enabled bundler caching for faster CI builds.
Updated lib/elevenlabs/spec.json by running the extraction script against the latest elevenlabs-python SDK (commit 23cb5ff). This update includes:
New Features:
- Agent summaries endpoint (client.conversational_ai.agents.summaries)
- WhatsApp integration (client.conversational_ai.whatsapp and client.conversational_ai.whatsapp_accounts)
- Batch calls functionality for conversational AI
- Workspace resources management (client.workspace.resources)
- Knowledge base improvements (dependent type filtering, source file URL retrieval)
- Dubbing transcripts management (client.dubbing.transcripts)
Enhanced Parameters:
- show_only_owned_agents filter for agent listing
- branch_id support for conversation workflows
- main_languages and conversation_initiation_source for conversations
- entity_detection capability for speech-to-text
- Custom SIP headers for phone number workflows
- Widget configuration and language presets
API Changes:
- Removed deprecated use_typesense parameter from knowledge base operations
- Updated output format enums to use consolidated allowed_output_formats
- Enhanced phone number transfer configuration with custom headers
- Improved permission types for workspace API keys
To update your local spec.json in the future, run:
cd tmp-elevenlabs-python && git pull origin main && cd .. && python3 scripts/extract_spec.py