feat: added local TTS support using Kokoro #401

ErikBjare · 2025-01-14T08:29:22Z

New SOTA local TTS model dropped, figured I should try integrating it and get Bob to speak using it.

Kokoro: https://huggingface.co/hexgrad/Kokoro-82M
Try it: https://huggingface.co/spaces/hexgrad/Kokoro-TTS

Closes #60

Important

Adds local TTS support using Kokoro, integrating a new TTS model and server into the system.

Behavior:
- Integrates Kokoro TTS model for local text-to-speech in gptme/chat.py.
- Adds TTS functionality to step() in gptme/chat.py to speak responses if TTS tool is available.
- Introduces _clean_content_for_speech() in gptme/chat.py to clean text for TTS.
Tools:
- New gptme/tools/tts.py file for TTS operations, including speak(), set_speed(), and set_volume() functions.
- Implements audio playback management with threading and queuing.
Server:
- Adds scripts/tts_server.py to run a TTS server using FastAPI and Kokoro model.
- Provides /tts and /health endpoints for TTS conversion and server health check.
Dependencies:
- Updates pyproject.toml to include sounddevice, scipy, and other TTS-related packages.
Misc:
- Excludes scripts/tts_server.py from certain Makefile operations.
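
For readers unfamiliar with why text needs cleaning before TTS, here is a minimal sketch of what a helper like `_clean_content_for_speech()` might do. It is not the code from this PR: the function name `clean_content_for_speech`, the `FENCE` constant, and the specific rules are assumptions chosen for illustration.

```python
import re

FENCE = "`" * 3  # markdown code fence marker, built this way to keep the example readable


def clean_content_for_speech(content: str) -> str:
    """Strip markup that reads badly aloud (hypothetical helper, not the PR's code)."""
    # Drop fenced code blocks entirely; reading code aloud is rarely useful.
    fenced = re.escape(FENCE) + r".*?" + re.escape(FENCE)
    content = re.sub(fenced, "", content, flags=re.DOTALL)
    # Keep the text of inline code but drop the backticks.
    content = re.sub(r"`([^`]*)`", r"\1", content)
    # Replace markdown links [text](url) with just the link text.
    content = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", content)
    # Remove heading markers at the start of lines, then bold/italic asterisks.
    content = re.sub(r"^#+\s*", "", content, flags=re.MULTILINE)
    content = re.sub(r"\*{1,3}", "", content)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", content).strip()
```

Underscores are deliberately left alone in this sketch so that identifiers such as `set_speed` are not mangled.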

This description was created by Ellipsis for 5c0359b. It will automatically update as commits are pushed.

ellipsis-dev

❌ Changes requested. Reviewed everything up to cd2f7ff in 1 minute and 34 seconds

More details

Looked at 378 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. gptme/tools/tts.py:155

Draft comment:
Consider adding a timeout to the requests.get call to handle cases where the TTS server is unreachable.
Reason this comment was not posted:
Marked as duplicate.

Workflow ID: wflow_7wCANXf1iIAyU5DO

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 5f67371 in 31 seconds

More details

Looked at 274 lines of code in 3 files
Skipped 1 files when reviewing.
Skipped posting 5 drafted comments based on config settings.

1. gptme/tools/tts.py:198

Draft comment:
Consider making the TTS server URL configurable instead of hardcoding it. This allows flexibility in server location or port.
Reason this comment was not posted:
Confidence changes required: 50%
The speak function in gptme/tools/tts.py uses a hardcoded URL for the TTS server. This could be made configurable to allow flexibility in server location or port.
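
One way the suggestion above could look, as a sketch only: read the server URL from an environment variable and fall back to a default. The variable name `GPTME_TTS_URL`, the default `http://localhost:8000`, and the helper `get_tts_url` are all hypothetical and not taken from the PR.

```python
import os
from typing import Mapping

# Assumed default for illustration; the PR's actual host and port are not shown here.
DEFAULT_TTS_URL = "http://localhost:8000"


def get_tts_url(env: Mapping[str, str] = os.environ) -> str:
    """Return the TTS server base URL, preferring an environment override."""
    # GPTME_TTS_URL is a hypothetical variable name.
    url = env.get("GPTME_TTS_URL", DEFAULT_TTS_URL)
    # Normalize so callers can append paths like "/tts" without doubling slashes.
    return url.rstrip("/")
```

Passing the mapping as a parameter keeps the function easy to test without touching the real environment.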

2. gptme/tools/tts.py:201

Draft comment:
Consider adding exception handling for network-related errors when making requests to the TTS server to prevent unhandled exceptions.
Reason this comment was not posted:
Confidence changes required: 50%
The speak function in gptme/tools/tts.py does not handle the case where the requests.get call fails due to network issues or server unavailability. This could lead to unhandled exceptions.
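
A hedged sketch of the error handling described above, combined with the timeout raised in the earlier draft comment. To stay dependency-free it uses the standard library's `urllib` rather than `requests`, which is what the PR uses. The function name `fetch_speech`, the `text` query parameter, and the injectable `opener` argument are assumptions for illustration.

```python
import urllib.parse
import urllib.request
from typing import Callable, Optional


def fetch_speech(
    base_url: str,
    text: str,
    timeout: float = 10.0,
    opener: Callable = urllib.request.urlopen,
) -> Optional[bytes]:
    """Request audio for `text`; return None instead of raising on network failure."""
    # The "text" parameter name is assumed, not confirmed by the PR.
    query = urllib.parse.urlencode({"text": text})
    try:
        # The timeout bounds how long an unresponsive server can block the caller.
        with opener(f"{base_url}/tts?{query}", timeout=timeout) as resp:
            return resp.read()
    except OSError as exc:
        # URLError, HTTPError, socket timeouts, and refused connections all derive from OSError.
        print(f"TTS server unavailable: {exc}")
        return None
```

With `requests`, the equivalent would be passing `timeout=` to `requests.get` and catching `requests.exceptions.RequestException`.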

3. gptme/tools/tts.py:211

Draft comment:
Ensure that the response content is a valid WAV file before attempting to read it with wavfile.read to avoid potential errors.
Reason this comment was not posted:
Confidence changes required: 50%
The speak function in gptme/tools/tts.py uses wavfile.read which might not handle all audio formats correctly. It assumes the response content is always a valid WAV file.
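
A minimal sketch of such a check, using the standard library's `wave` module instead of `scipy.io.wavfile`. The helper name `is_valid_wav` is hypothetical. Note the limitation: `wave` only accepts PCM-encoded files, so a float-encoded WAV that `wavfile.read` could load would be rejected here.

```python
import io
import wave


def is_valid_wav(data: bytes) -> bool:
    """Return True if `data` parses as a PCM WAV file with at least one frame."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            return wav.getnframes() > 0
    except (wave.Error, EOFError):
        # wave.Error covers bad headers; EOFError covers empty or truncated input.
        return False
```

A caller could run this on the response body and skip playback when it returns `False`.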

4. pyproject.toml:105

Draft comment:
The tts extra is correctly defined with necessary dependencies: sounddevice, scipy, and numpy.
Reason this comment was not posted:
Confidence changes required: 0%
The tts extra in pyproject.toml includes numpy, scipy, and sounddevice, which are necessary for TTS functionality. This is correctly set up.

5. scripts/tts_server.py:104

Draft comment:
Consider validating the speed parameter to ensure it falls within an acceptable range (e.g., 0.5 to 2.0) to prevent unexpected behavior.
Reason this comment was not posted:
Confidence changes required: 50%
The text_to_speech endpoint in scripts/tts_server.py uses a default speed of 1.0, which is reasonable. However, it should validate the speed parameter to ensure it falls within an acceptable range.
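
The range check suggested above could be sketched as follows. The bounds 0.5 and 2.0 come from the draft comment; the names `MIN_SPEED`, `MAX_SPEED`, and `validate_speed` are hypothetical.

```python
MIN_SPEED = 0.5  # lower bound suggested in the draft comment
MAX_SPEED = 2.0  # upper bound suggested in the draft comment


def validate_speed(speed: float) -> float:
    """Return `speed` unchanged if it is within bounds, otherwise raise ValueError."""
    # The chained comparison is False for NaN, so NaN is rejected as well.
    if not (MIN_SPEED <= speed <= MAX_SPEED):
        raise ValueError(
            f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}"
        )
    return speed
```

In a FastAPI endpoint the same bounds could likely be declared on the parameter itself (for example with `Query(ge=..., le=...)`), which would return a validation error to the client automatically.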

Workflow ID: wflow_CrMfSPYC0JQ63v0J

ellipsis-dev

👍 Looks good to me! Incremental review on 1e7ca05 in 37 seconds

More details

Looked at 36 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. scripts/tts_server.py:43

Draft comment:
The # fmt: skip comment is unnecessary here unless there's a specific reason to skip formatting. Consider removing it for cleaner code.
Reason this comment was not posted:
Confidence changes required: 20%
The import statements for generate and build_model are using # fmt: skip, which is unnecessary unless there is a specific formatting issue. It's better to remove it if not needed.

2. scripts/tts_server.py:80

Draft comment:
torch.load does not have a weights_only argument. This will cause a runtime error. Please remove or correct this argument.
Reason this comment was not posted:
Comment was on unchanged code.

Workflow ID: wflow_a4SJ8E7bmq8YjJLZ

ellipsis-dev

👍 Looks good to me! Incremental review on e006783 in 21 seconds

More details

Looked at 433 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. scripts/tts_server.py:113

Draft comment:
Consider adding more specific exception handling or logging the stack trace to help diagnose initialization issues.
Reason this comment was not posted:
Confidence changes required: 50%
The PR introduces a new feature for TTS using Kokoro, but there are some issues with exception handling and logging that need to be addressed.

2. gptme/tools/tts.py:309

Draft comment:
Consider logging the stack trace for exceptions to aid in debugging.
Reason this comment was not posted:
Confidence changes required: 50%
The PR introduces a new feature for TTS using Kokoro, but there are some issues with exception handling and logging that need to be addressed.

3. gptme/chat.py:252

Draft comment:
Ensure that the speak function handles exceptions gracefully to prevent crashes if TTS fails.
Reason this comment was not posted:
Confidence changes required: 50%
The PR introduces a new feature for TTS using Kokoro, but there are some issues with exception handling and logging that need to be addressed.
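
The three draft comments above point at the same pattern: catch the failure, log the stack trace, and let the chat loop continue. A minimal sketch, assuming a hypothetical wrapper `safe_speak` and logger name `tts_example` (neither is from the PR):

```python
import logging
from typing import Callable

logger = logging.getLogger("tts_example")


def safe_speak(speak_fn: Callable[[str], None], text: str) -> bool:
    """Call `speak_fn(text)`; on failure, log the traceback and return False."""
    try:
        speak_fn(text)
        return True
    except Exception:
        # logger.exception logs at ERROR level and appends the current traceback.
        logger.exception("TTS failed for text of length %d", len(text))
        return False
```

Returning a boolean instead of re-raising means a TTS failure never interrupts the conversation, while the traceback is still available in the logs.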

Workflow ID: wflow_6N9qV5e15GOYzIWR

ellipsis-dev

👍 Looks good to me! Incremental review on 066f38a in 47 seconds

More details

Looked at 249 lines of code in 4 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. gptme/tools/tts.py:314

Draft comment:
Consider adding a timeout to the requests.get call to prevent it from hanging indefinitely if the TTS server is unresponsive. Also, handle potential network errors more gracefully.
Reason this comment was not posted:
Comment was on unchanged code.

Workflow ID: wflow_wIA7W55QZ5HGXB8q

ellipsis-dev

👍 Looks good to me! Incremental review on 86adc9b in 16 seconds

More details

Looked at 14 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. gptme/tools/tts.py:20

Draft comment:
Good addition to handle OSError for missing PortAudio library.
Reason this comment was not posted:
Confidence changes required: 0%
The PR adds an OSError exception to handle cases where the PortAudio library is not found. This is a good addition for robustness.
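
For context, `sounddevice` raises `OSError` at import time when the PortAudio library cannot be found, so catching only `ImportError` is not enough for an optional dependency. A minimal sketch of the guarded import; the names `sd` and `has_audio` are illustrative and may differ from the PR:

```python
try:
    # Raises ImportError if the package is absent, OSError if PortAudio is missing.
    import sounddevice as sd

    has_audio = True
except (ImportError, OSError):
    sd = None
    has_audio = False


def audio_status() -> str:
    """Describe whether audio playback is available on this machine."""
    return "audio playback available" if has_audio else "audio playback unavailable"
```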

Workflow ID: wflow_sEtWqSUXOyrKX9BU

codecov-commenter · 2025-01-14T12:20:10Z

Codecov Report

Attention: Patch coverage is 19.43128% with 170 lines in your changes missing coverage. Please review.

Project coverage is 69.11%. Comparing base (9cbaa5d) to head (5c0359b).
Report is 5 commits behind head on master.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| gptme/tools/tts.py | 18.31% | 165 Missing ⚠️ |
| gptme/chat.py | 44.44% | 5 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #401      +/-   ##
==========================================
- Coverage   70.99%   69.11%   -1.88%     
==========================================
  Files          68       69       +1     
  Lines        5540     5750     +210     
==========================================
+ Hits         3933     3974      +41     
- Misses       1607     1776     +169

| Flag | Coverage Δ | |
|---|---|---|
| anthropic/claude-3-haiku-20240307 | `67.77% <19.43%> (-2.07%)` | ⬇️ |
| openai/gpt-4o-mini | `67.13% <19.43%> (-1.83%)` | ⬇️ |

Flags with carried forward coverage won't be shown.

ellipsis-dev

👍 Looks good to me! Incremental review on 7c8cc6b in 18 seconds

More details

Looked at 25 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. pyproject.toml:142

Draft comment:
Adding mypy overrides for ignoring missing imports for modules like numpy, scipy, sounddevice, and flask without explanation can hide potential issues. Ensure these are necessary and document the reason for each override.
Reason this comment was not posted:
Confidence changes required: 50%
The PR adds mypy overrides for several modules, but the description does not mention why these overrides are necessary. This could lead to potential issues if the imports are actually missing or if there are type issues that need to be addressed.

Workflow ID: wflow_OWLftHMvOuFaH98A

ellipsis-dev

👍 Looks good to me! Incremental review on 5c0359b in 21 seconds

More details

Looked at 16 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. gptme/chat.py:252

Draft comment:
The PR description mentions integrating the Kokoro TTS model, but the code changes do not reflect this integration. Ensure that the Kokoro model is properly integrated and used for TTS.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_e71i6UDlXbeAAzIp

ErikBjare · 2025-01-14T12:39:42Z

Works great!

feat: added local TTS support using Kokoro

cd2f7ff

ErikBjare commented Jan 14, 2025

ellipsis-dev bot reviewed Jan 14, 2025

fix(tts): added speed setting, interruption, and needed deps

5f67371

ellipsis-dev bot reviewed Jan 14, 2025

fix: fixed imports in tts_server.py

1e7ca05

ellipsis-dev bot reviewed Jan 14, 2025

ErikBjare added 2 commits January 14, 2025 12:11

fix: more fixes to tts tool, including better sentence splitting

e006783

fix: more fixes to tts, including correctly picking default device

066f38a

ellipsis-dev bot reviewed Jan 14, 2025

fix: fixed catching exception during import for tts

86adc9b

ellipsis-dev bot reviewed Jan 14, 2025

chore: added more mypy overrides for optional deps

7c8cc6b

ellipsis-dev bot reviewed Jan 14, 2025

fix: shorten code

5c0359b

ErikBjare merged commit 3e8e869 into master Jan 14, 2025
7 checks passed

0xbrayo mentioned this pull request Jan 24, 2025

Speech-to-Text Transcription #263

Open
