This repository contains a Jupyter notebook (`FastAPI-Llama-HuggingfaceHub-Collab.ipynb`) that demonstrates how to set up and run a FastAPI server with Llama 2 model integration on Google Colab's free T4 GPU.
## Features
- Sets up a FastAPI server with Llama 2 model integration
- Uses Google Colab's free GPU for model inference
- Creates a public URL for the API using ngrok
- Provides an example of how to make API requests to the server
## Contents
The notebook includes the following main sections:
1. Installation of dependencies
2. Setting up ngrok to create a public URL
3. Creating the FastAPI application (see the sketch after this list)
4. Starting the FastAPI server
5. Using ngrok to create a public URL for the server
6. Testing the API with example requests
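
For orientation, here is a condensed sketch of what those cells can look like. It assumes the model is served with `llama-cpp-python` using a quantized Llama 2 checkpoint downloaded from the Hugging Face Hub, and that the API exposes a single `/generate` endpoint; the notebook's actual model loading, route names, and request schema may differ.

```python
# Step 1 (Colab): install dependencies first, e.g.
# !pip install fastapi uvicorn nest-asyncio pyngrok huggingface_hub llama-cpp-python
import nest_asyncio
import uvicorn
from fastapi import FastAPI
from huggingface_hub import hf_hub_download
from llama_cpp import Llama
from pydantic import BaseModel
from pyngrok import ngrok

# Step 2: authenticate ngrok with the token from your ngrok dashboard.
ngrok.set_auth_token("YOUR_NGROK_AUTH_TOKEN")

# Download a quantized Llama 2 chat checkpoint (illustrative choice).
model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGUF",
    filename="llama-2-7b-chat.Q4_K_M.gguf",
)
llm = Llama(model_path=model_path, n_gpu_layers=-1, n_ctx=2048)

# Step 3: create the FastAPI application.
app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = 256

@app.post("/generate")
def generate(req: GenerateRequest):
    out = llm(req.prompt, max_tokens=req.max_tokens)
    return {"text": out["choices"][0]["text"]}

# Step 5: open the public tunnel before starting the server.
public_url = ngrok.connect(8000)
print("Public URL:", public_url)

# Step 4: nest_asyncio lets uvicorn start inside Colab's running event loop.
nest_asyncio.apply()
uvicorn.run(app, host="0.0.0.0", port=8000)
```

Note the ordering: the tunnel is opened before `uvicorn.run`, since that call blocks the notebook cell for as long as the server is up.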
## Usage
1. Open the `FastAPI-Llama-HuggingfaceHub-Collab.ipynb` notebook in Google Colab
2. Follow the instructions in the notebook to set up and run the server
3. Use the provided ngrok URL to make API requests to the Llama 2 model (see the example request below)
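
Assuming the `/generate` endpoint sketched above, a request from any machine could look like the following; the ngrok hostname is a placeholder for the URL the notebook prints when the tunnel opens.

```python
import requests

# Placeholder: replace with the public URL printed by ngrok in the notebook.
NGROK_URL = "https://your-tunnel-id.ngrok-free.app"

resp = requests.post(
    f"{NGROK_URL}/generate",
    json={"prompt": "Explain FastAPI in one sentence.", "max_tokens": 128},
    timeout=120,  # generation on a T4 can take a while
)
resp.raise_for_status()
print(resp.json()["text"])
```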
## Requirements
- Google Colab account (for free GPU access)
- ngrok account (free tier is sufficient)
## Note
Make sure to shut down the server and ngrok processes when you're done using the notebook to free up resources.
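
If the notebook manages the tunnel with `pyngrok`, cleanup can be as simple as the sketch below; stop the server itself by interrupting the cell running uvicorn.

```python
from pyngrok import ngrok

# Close all open tunnels and terminate the local ngrok process.
ngrok.kill()
```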
For more detailed instructions and code explanations, please refer to the comments within the notebook.