Skip to content

πŸŽ‡ Multi-lingual AI Assistant with gTTS & Gemini Pro πŸŽ† An end-to-end AI assistant using gTTS for multi-lingual text-to-speech and Gemini Pro API for smart responses. πŸŒπŸ’¬ Experience seamless voice interaction in various languages with continuous updates and improvements! πŸš€βœ¨

License

Notifications You must be signed in to change notification settings

shaheennabi/Multi-lingual-AI-Assistant-with-gTTS-and-Gemini-Pro

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

19 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸš€ Multi-lingual AI Assistant with gTTS and Gemini Pro πŸ€–πŸŒ

  • Caution: This is my mini_project here

Welcome to the Multi-lingual AI Assistantβ€”the future of voice-driven interaction powered by Gemini Pro and gTTS! This AI assistant brings the power of Google’s cutting-edge models to your fingertips, enabling seamless, real-time voice interactions across multiple languages. Speak your mind, and let the AI do the rest! 🌟

Whether you want to ask a question, get a recommendation, or just chat, this assistant is ready to assist you in multiple languages. It takes voice input, processes it using Gemini Pro, and responds with text-to-speech using gTTS. 🎧✨ Plus, you can download the speech output for offline access and share it anytime!

This isn't just a simple assistantβ€”it's an experience! πŸ’₯πŸ’₯


🚨 Key Features 🚨

  • 🌍 Multi-Language Support: Communicate in multiple languages with Gemini Pro’s robust capabilitiesβ€”whether you're in English, Spanish, French, or many others! The assistant speaks your language. πŸ’¬πŸŒ
  • 🎀 Voice Input: No typing needed! Use the microphone to speak to your assistant, and it will convert your speech into text using Speech Recognition. πŸ—£οΈπŸŽ™οΈ
  • πŸ”„ Text-to-Speech with gTTS: The assistant converts its generated responses back into speech using the Google Text-to-Speech (gTTS) API. Hear the assistant’s voice in your preferred language. πŸŽ§πŸ”Š
  • πŸ”₯ Downloadable Speech Output: After interacting with the assistant, get your generated speech as an audio file for offline use! πŸ’ΎπŸ“²
  • ✨ Streamlit UI: A stunning, easy-to-use web interface built with Streamlit to bring everything together in a beautiful package. Interact with the assistant effortlessly. 🎨πŸ–₯️

πŸ› οΈ Installation Guide πŸ› οΈ

Step 1: Create Your Conda Environment

Let's get your environment set up and ready to go! Open your terminal and run:

conda create --name multilingual-assistant python=3.9

Activate the env:

conda activate multilingual-assistant

Step 2: Install Dependencies

Now, install all the required dependencies using the following command:

pip install -r requirements.txt

Make sure you’ve got everything you need to make the magic happen!

Dependencies:

  • gTTS (Google Text-to-Speech): Converts the assistant’s responses into speech.
  • Gemini Pro: The language model behind all the intelligence.
  • Streamlit: For building the stunning web interface.
  • Speech Recognition: To convert your voice into text.

πŸš€ How to Use πŸš€

Step 1: Set Up API Keys for Gemini Pro

To interact with Gemini Pro, you'll need to set up API access. Head to Google Cloud, create a project, and enable Gemini Pro. Store your API key securely and configure it in your environment.

Step 2: Launch the Streamlit Application

Now, it's time to see the magic in action. Run the following command:

streamlit run app.py

This will start the Streamlit app and open the web interface in your browser.

Step 3: Interact with the Assistant

  1. Record Your Voice: Click the Record button to start speaking.
  2. AI Processing: The assistant will listen to your speech, convert it to text, and send it to Gemini Pro for processing.
  3. Listen to the Response: The assistant will convert the AI-generated text back into speech using gTTS and play it back to you.
  4. Download the Speech: After hearing the assistant’s response, click the download button to save the speech for offline use.

πŸ“‚ Project Structure

Here’s a look at the project structure:

Multi-lingual-AI-Assistant-with-gTTS-and-Gemini-Pro/
β”‚
β”œβ”€β”€ app.py                # Streamlit UI for interaction
β”œβ”€β”€ requirements.txt      # All the necessary dependencies
β”œβ”€β”€ src      
     |-----helper.py
└── README.md             # Project documentation (You’re looking at it right now!)

πŸ’‘ Technologies Used πŸ’‘

  • Gemini Pro: Google’s state-of-the-art language model for intelligent AI responses.
  • gTTS (Google Text-to-Speech): Converting text to natural-sounding speech using Google’s powerful TTS engine.
  • Streamlit: A super-fast, easy-to-use library for creating web apps with a focus on machine learning.
  • Speech Recognition: Capturing voice input and converting it to text.
  • Python 3.9: The Python version keeping everything running smoothly.

πŸ“œ License πŸ“œ

This project is licensed under the MIT License. Check the LICENSE file for more details.


πŸ™ Acknowledgments πŸ™

A big thank you to the following technologies that made this project possible:

  • Google Gemini Pro
  • gTTS
  • Streamlit
  • Speech Recognition

πŸŽ‰ Let’s Talk! πŸŽ‰

Ready to try it out? Clone the repository, install the dependencies, and fire up your assistant! πŸš€πŸ’¬ Let’s create something amazing together. ✨


🌟 Stars are Always Welcome! 🌟

If you love the project, ⭐ star ⭐ it and show some love! Also, feel free to contribute and make this assistant even smarter. πŸ’‘

About

πŸŽ‡ Multi-lingual AI Assistant with gTTS & Gemini Pro πŸŽ† An end-to-end AI assistant using gTTS for multi-lingual text-to-speech and Gemini Pro API for smart responses. πŸŒπŸ’¬ Experience seamless voice interaction in various languages with continuous updates and improvements! πŸš€βœ¨

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages