Imagineo Chat

title	emoji	colorFrom	colorTo	sdk	sdk_version	app_file	pinned	license	short_description
IMAGINEO CHAT	⚡	yellow	red	gradio	5.20.1	app.py	true	creativeml-openrail-m	compose chat multi-modal

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Imagineo Chat

Imagineo Chat is a versatile multimodal chatbot that integrates text, image, and video processing capabilities. It leverages state-of-the-art models like Gemma3-4B, Qwen2-VL-OCR, and Stable Diffusion XL to provide a seamless conversational experience. Whether you want to generate images, analyze videos, or have a text-based conversation, Imagineo Chat has you covered.

Features

Text Generation: Engage in natural language conversations with the chatbot.
Image Generation: Generate high-quality images using Stable Diffusion XL models (Lightning 5, Lightning 4, Turbo v3).
Multimodal Processing: Analyze and describe images using the Gemma3-4B model.
Video Analysis: Process and summarize video content using the Gemma3-4B model.
Text-to-Speech (TTS): Convert text responses into speech using Edge TTS.
OCR (Optical Character Recognition): Extract and interpret text from images using Qwen2-VL-OCR.

How to Use

Text and Chat

Simply type your query in the text box. You can also use special tags to trigger specific functionalities:

@gemma3-4b: Use this tag for multimodal queries involving images or text.
@gemma3-4b-video: Use this tag to analyze and describe video content.
@lightningv5: Generate images using the Stable Diffusion XL Lightning 5 model.
@lightningv4: Generate images using the Stable Diffusion XL Lightning 4 model.
@turbov3: Generate images using the Stable Diffusion XL Turbo v3 model.
@tts1 or @tts2: Convert the chatbot's response into speech using different voices.

Image and Video Upload

You can upload images or videos directly into the chat interface. The chatbot will process the media and provide relevant responses based on the query.

Examples

Here are some example queries you can try:

@gemma3-4b Explain the Image (Upload an image)
@gemma3-4b-video Explain what is happening in this video? (Upload a video)
@lightningv5 Chocolate dripping from a donut
@tts1 Who is Nikola Tesla, and why did he die?
@lightningv4 Cat holding a sign that says hello world
@turbov3 Anime illustration of a wiener schnitzel
@tts2 What causes rainbows to form?

Installation

To run Imagineo Chat locally, follow these steps:

Clone the Repository:

git clone https://github.com/your-username/imagineo-chat.git
cd imagineo-chat

Install Dependencies:
```
pip install -r requirements.txt
```
Run the Application:
```
python app.py
```
Access the Chat Interface: Open your browser and navigate to http://localhost:7860 to access the chat interface.

Models Used

Text Generation: prithivMLmods/FastThink-0.5B-Tiny
Multimodal (OCR): prithivMLmods/Qwen2-VL-OCR-2B-Instruct
Image Generation:
- Lightning 5: SG161222/RealVisXL_V5.0_Lightning
- Lightning 4: SG161222/RealVisXL_V4.0_Lightning
- Turbo v3: SG161222/RealVisXL_V3.0_Turbo
Multimodal (Text & Image): google/gemma-3-4b-it

Requirements

Python 3.8+
PyTorch
Transformers
Gradio
Edge TTS
OpenCV (for video processing)
Diffusers (for image generation)

Name	Name	Last commit message	Last commit date
Latest commit PRITHIVSAKTHIUR Update README.md Mar 13, 2025 8eab8f6 · Mar 13, 2025 History 6 Commits
examples	examples	Add files via upload	Mar 13, 2025
.gitattributes	.gitattributes	Add files via upload	Mar 13, 2025
README.md	README.md	Update README.md	Mar 13, 2025
app.py	app.py	Add files via upload	Mar 13, 2025
requirements.txt	requirements.txt	Add files via upload	Mar 13, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Imagineo Chat

Features

How to Use

Text and Chat

Image and Video Upload

Examples

Installation

Models Used

Requirements

About

Languages

PRITHIVSAKTHIUR/Imagineo-Chat-Video-Understanding

Folders and files

Latest commit

History

Repository files navigation

Imagineo Chat

Features

How to Use

Text and Chat

Image and Video Upload

Examples

Installation

Models Used

Requirements

About

Topics

Resources

Stars

Watchers

Forks

Languages