Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added local TTS support using Kokoro #401

Merged
merged 8 commits into from
Jan 14, 2025
Merged

Conversation

ErikBjare
Copy link
Owner

@ErikBjare ErikBjare commented Jan 14, 2025

New SOTA local TTS model dropped, figured I should try integrating it and get Bob to speak using it.

Closes #60


Important

Adds local TTS support using Kokoro, integrating a new TTS model and server into the system.

  • Behavior:
    • Integrates Kokoro TTS model for local text-to-speech in gptme/chat.py.
    • Adds TTS functionality to step() in gptme/chat.py to speak responses if TTS tool is available.
    • Introduces _clean_content_for_speech() in gptme/chat.py to clean text for TTS.
  • Tools:
    • New gptme/tools/tts.py file for TTS operations, including speak(), set_speed(), and set_volume() functions.
    • Implements audio playback management with threading and queuing.
  • Server:
    • Adds scripts/tts_server.py to run a TTS server using FastAPI and Kokoro model.
    • Provides /tts and /health endpoints for TTS conversion and server health check.
  • Dependencies:
    • Updates pyproject.toml to include sounddevice, scipy, and other TTS-related packages.
  • Misc:
    • Excludes scripts/tts_server.py from certain Makefile operations.

This description was created by Ellipsis for 5c0359b. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to cd2f7ff in 1 minute and 34 seconds

More details
  • Looked at 378 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. gptme/tools/tts.py:155
  • Draft comment:
    Consider adding a timeout to the requests.get call to handle cases where the TTS server is unreachable.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_7wCANXf1iIAyU5DO


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 5f67371 in 31 seconds

More details
  • Looked at 274 lines of code in 3 files
  • Skipped 1 files when reviewing.
  • Skipped posting 5 drafted comments based on config settings.
1. gptme/tools/tts.py:198
  • Draft comment:
    Consider making the TTS server URL configurable instead of hardcoding it. This allows flexibility in server location or port.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The speak function in gptme/tools/tts.py uses a hardcoded URL for the TTS server. This could be made configurable to allow flexibility in server location or port.
2. gptme/tools/tts.py:201
  • Draft comment:
    Consider adding exception handling for network-related errors when making requests to the TTS server to prevent unhandled exceptions.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The speak function in gptme/tools/tts.py does not handle the case where the requests.get call fails due to network issues or server unavailability. This could lead to unhandled exceptions.
3. gptme/tools/tts.py:211
  • Draft comment:
    Ensure that the response content is a valid WAV file before attempting to read it with wavfile.read to avoid potential errors.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The speak function in gptme/tools/tts.py uses wavfile.read which might not handle all audio formats correctly. It assumes the response content is always a valid WAV file.
4. pyproject.toml:105
  • Draft comment:
    The tts extra is correctly defined with necessary dependencies: sounddevice, scipy, and numpy.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The tts extra in pyproject.toml includes numpy, scipy, and sounddevice, which are necessary for TTS functionality. This is correctly set up.
5. scripts/tts_server.py:104
  • Draft comment:
    Consider validating the speed parameter to ensure it falls within an acceptable range (e.g., 0.5 to 2.0) to prevent unexpected behavior.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The text_to_speech endpoint in scripts/tts_server.py uses a default speed of 1.0, which is reasonable. However, it should validate the speed parameter to ensure it falls within an acceptable range.

Workflow ID: wflow_CrMfSPYC0JQ63v0J


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 1e7ca05 in 37 seconds

More details
  • Looked at 36 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. scripts/tts_server.py:43
  • Draft comment:
    The # fmt: skip comment is unnecessary here unless there's a specific reason to skip formatting. Consider removing it for cleaner code.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The import statements for generate and build_model are using # fmt: skip, which is unnecessary unless there is a specific formatting issue. It's better to remove it if not needed.
2. scripts/tts_server.py:80
  • Draft comment:
    torch.load does not have a weights_only argument. This will cause a runtime error. Please remove or correct this argument.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_a4SJ8E7bmq8YjJLZ


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on e006783 in 21 seconds

More details
  • Looked at 433 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. scripts/tts_server.py:113
  • Draft comment:
    Consider adding more specific exception handling or logging the stack trace to help diagnose initialization issues.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR introduces a new feature for TTS using Kokoro, but there are some issues with exception handling and logging that need to be addressed.
2. gptme/tools/tts.py:309
  • Draft comment:
    Consider logging the stack trace for exceptions to aid in debugging.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR introduces a new feature for TTS using Kokoro, but there are some issues with exception handling and logging that need to be addressed.
3. gptme/chat.py:252
  • Draft comment:
    Ensure that the speak function handles exceptions gracefully to prevent crashes if TTS fails.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR introduces a new feature for TTS using Kokoro, but there are some issues with exception handling and logging that need to be addressed.

Workflow ID: wflow_6N9qV5e15GOYzIWR


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 066f38a in 47 seconds

More details
  • Looked at 249 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. gptme/tools/tts.py:314
  • Draft comment:
    Consider adding a timeout to the requests.get call to prevent it from hanging indefinitely if the TTS server is unresponsive. Also, handle potential network errors more gracefully.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_wIA7W55QZ5HGXB8q


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 86adc9b in 16 seconds

More details
  • Looked at 14 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. gptme/tools/tts.py:20
  • Draft comment:
    Good addition to handle OSError for missing PortAudio library.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The PR adds an OSError exception to handle cases where the PortAudio library is not found. This is a good addition for robustness.

Workflow ID: wflow_sEtWqSUXOyrKX9BU


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@codecov-commenter
Copy link

codecov-commenter commented Jan 14, 2025

Codecov Report

Attention: Patch coverage is 19.43128% with 170 lines in your changes missing coverage. Please review.

Project coverage is 69.11%. Comparing base (9cbaa5d) to head (5c0359b).
Report is 5 commits behind head on master.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
gptme/tools/tts.py 18.31% 165 Missing ⚠️
gptme/chat.py 44.44% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #401      +/-   ##
==========================================
- Coverage   70.99%   69.11%   -1.88%     
==========================================
  Files          68       69       +1     
  Lines        5540     5750     +210     
==========================================
+ Hits         3933     3974      +41     
- Misses       1607     1776     +169     
Flag Coverage Δ
anthropic/claude-3-haiku-20240307 67.77% <19.43%> (-2.07%) ⬇️
openai/gpt-4o-mini 67.13% <19.43%> (-1.83%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 7c8cc6b in 18 seconds

More details
  • Looked at 25 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. pyproject.toml:142
  • Draft comment:
    Adding mypy overrides for ignoring missing imports for modules like numpy, scipy, sounddevice, and flask without explanation can hide potential issues. Ensure these are necessary and document the reason for each override.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR adds mypy overrides for several modules, but the description does not mention why these overrides are necessary. This could lead to potential issues if the imports are actually missing or if there are type issues that need to be addressed.

Workflow ID: wflow_OWLftHMvOuFaH98A


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 5c0359b in 21 seconds

More details
  • Looked at 16 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. gptme/chat.py:252
  • Draft comment:
    The PR description mentions integrating the Kokoro TTS model, but the code changes do not reflect this integration. Ensure that the Kokoro model is properly integrated and used for TTS.
  • Reason this comment was not posted:
    Comment did not seem useful.

Workflow ID: wflow_e71i6UDlXbeAAzIp


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@ErikBjare ErikBjare merged commit 3e8e869 into master Jan 14, 2025
7 checks passed
@ErikBjare
Copy link
Owner Author

Works great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add speech synthesis
2 participants