Make models_config YAML for easy editing (#247)
* Make models_config YAML for easy editing

* Make models_config configurable via env

* Remove redundant code

* Add a comment
chiragjn authored Jun 24, 2024
1 parent 5fb8d2b commit a000f10
Showing 10 changed files with 174 additions and 175 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -28,4 +28,4 @@ infinity/
volumes/
pgdata/
*.bak
models_config.json
models_config.yaml
24 changes: 19 additions & 5 deletions README.md
@@ -73,7 +73,7 @@ Cognita makes it really easy to customize and experiment everything about a RAG

1. Support for multiple document retrievers that use `Similarity Search`, `Query Decomposition`, `Document Reranking`, etc.
1. Support for SOTA OpenSource embeddings and reranking from `mixedbread-ai`
1. Support for using LLMs using `Ollama`
1. Support for using LLMs using `ollama`
1. Support for incremental indexing that ingests entire documents in batches (reduces compute burden), keeps track of already indexed documents and prevents re-indexing of those docs.

# :rocket: Quickstart: Running Cognita Locally
@@ -82,9 +82,23 @@ Cognita makes it really easy to customize and experiment everything about a RAG

Cognita and all of its services can be run using docker-compose. This is the recommended way to run Cognita locally. Install Docker and docker-compose for your system from: [Docker Compose](https://docs.docker.com/compose/install/)

You can run the following command to start the services:
### Configuring Model Providers

```docker
Before starting the services, we need to configure the model providers that will be used for embeddings and answer generation.

To start, copy `models_config.sample.yaml` to `models_config.yaml`:

```shell
cp models_config.sample.yaml models_config.yaml
```

By default, the config enables the local providers, which require the `infinity` and `ollama` servers to run embeddings and LLMs locally.
However, if you have an OpenAI API key, you can uncomment the `openai` provider in `models_config.yaml` and set `OPENAI_API_KEY` in `compose.env`.
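
For reference, a minimal `models_config.yaml` with the local providers enabled might look like the sketch below. The field names mirror the backend's `ModelProviderConfig` (`provider_name`, `api_format`, `base_url`, `api_key_env_var`, `llm_model_ids`, `embedding_model_ids`); the provider names are illustrative, the model IDs are the defaults from `compose.env`, and the base URLs may need an OpenAI-compatible suffix such as `/v1` depending on the server. `models_config.sample.yaml` remains the authoritative reference:

```yaml
model_providers:
  ## Local LLMs served by the ollama server (OLLAMA_URL in compose.env)
  - provider_name: local-ollama        # illustrative name
    api_format: openai
    base_url: http://ollama-server:11434
    api_key_env_var: ""
    llm_model_ids:
      - "qwen2:1.5b"
    embedding_model_ids: []

  ## Local embeddings served by the infinity server (INFINITY_URL in compose.env)
  - provider_name: local-infinity      # illustrative name
    api_format: openai
    base_url: http://infinity-server:7997
    api_key_env_var: ""
    llm_model_ids: []
    embedding_model_ids:
      - "mixedbread-ai/mxbai-embed-large-v1"

  ## Hosted OpenAI models; requires OPENAI_API_KEY in compose.env
  # - provider_name: openai
  #   api_format: openai
  #   api_key_env_var: OPENAI_API_KEY
  #   llm_model_ids:
  #     - "gpt-3.5-turbo"
  #   embedding_model_ids:
  #     - "text-embedding-ada-002"
```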


Now, you can run the following command to start the services:

```shell
docker-compose --env-file compose.env up
```

@@ -98,12 +112,12 @@ docker-compose --env-file compose.env up

To start additional services such as `ollama` and `infinity-server` you can run the following command:

```docker
```shell
docker-compose --env-file compose.env --profile ollama --profile infinity up
```

- This will start additional servers for `ollama` and `infinity-server`, which are used for LLMs (`ollama`) and for embeddings and reranking (`infinity-server`) respectively. You can access the `infinity-server` at `http://localhost:7997`.
- You can also host these services elsewhere and provide the respective `OLLAMA_URL` and `INFINITY_URL` in the `compose.env` file (see the sketch below).
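
A sketch of the corresponding `compose.env` overrides, assuming hypothetical external hosts (the variable names already exist in `compose.env`; only the hostnames are placeholders):

```shell
## Externally hosted ollama / infinity servers (placeholder hostnames)
OLLAMA_URL=http://my-ollama-host:11434
INFINITY_URL=http://my-infinity-host:7997
```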


## Developing in Cognita

39 changes: 0 additions & 39 deletions backend/modules/model_gateway/__init__.py
@@ -1,39 +0,0 @@
# When using locally, you can create your own models_config.json file before running docker-compose.
# The following code is used for tf-deployments
import json
import os

from backend.logger import logger
from backend.settings import settings

# Define the paths as constants
# Current file's directory
current_dir = os.path.dirname(__file__)

# Navigate up three levels to get to the project root
project_root = os.path.dirname(os.path.dirname(os.path.dirname(current_dir)))

MODELS_CONFIG_SAMPLE_PATH = os.path.join(project_root, "models_config.sample.json")
MODELS_CONFIG_PATH = os.path.join(project_root, "models_config.json")

logger.info(f"MODELS_CONFIG_SAMPLE_PATH: {MODELS_CONFIG_SAMPLE_PATH}")
logger.info(f"MODELS_CONFIG_PATH: {MODELS_CONFIG_PATH}")

if (
settings.TFY_API_KEY
and os.path.exists(MODELS_CONFIG_SAMPLE_PATH)
and not os.path.exists(MODELS_CONFIG_PATH)
):
logger.info(
"models_config.json not found. Creating models_config.json from models_config.sample.json"
)
data = {
"provider_name": "truefoundry",
"api_format": "openai",
"llm_model_ids": ["openai-main/gpt-4-turbo", "openai-main/gpt-3-5-turbo"],
"embedding_model_ids": ["openai-main/text-embedding-ada-002"],
"api_key_env_var": "TFY_API_KEY",
"base_url": settings.TFY_LLM_GATEWAY_URL,
}
with open(MODELS_CONFIG_PATH, "w") as f:
json.dump([data], f, indent=4)
88 changes: 46 additions & 42 deletions backend/modules/model_gateway/model_gateway.py
@@ -1,13 +1,14 @@
import json
import os
from typing import List

import yaml
from langchain.embeddings.base import Embeddings
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_openai import OpenAIEmbeddings
from langchain_openai.chat_models import ChatOpenAI

from backend.modules.model_gateway import MODELS_CONFIG_PATH
from backend.logger import logger
from backend.settings import settings
from backend.types import ModelConfig, ModelProviderConfig, ModelType


@@ -16,50 +17,53 @@ class ModelGateway:
model_name_to_provider_config = {}

def __init__(self):
with open(MODELS_CONFIG_PATH) as f:
data = json.load(f)
# parse the json data into a list of ModelProviderConfig objects
self.provider_configs = [
ModelProviderConfig.parse_obj(item) for item in data
]

# load llm models
self.llm_models: List[ModelConfig] = []
# load embedding models
self.embedding_models: List[ModelConfig] = []

for provider_config in self.provider_configs:
if provider_config.api_key_env_var and not os.environ.get(
provider_config.api_key_env_var
):
raise ValueError(
f"Environment variable {provider_config.api_key_env_var} not set. "
f"Cannot initialize the model gateway."
)

for model_id in provider_config.embedding_model_ids:
model_name = f"{provider_config.provider_name}/{model_id}"
self.model_name_to_provider_config[model_name] = provider_config
logger.info(f"Loading models config from {settings.MODELS_CONFIG_PATH}")
with open(settings.MODELS_CONFIG_PATH) as f:
data = yaml.safe_load(f)
print(data)
_providers = data.get("model_providers") or []
# parse the YAML data into a list of ModelProviderConfig objects
self.provider_configs = [
ModelProviderConfig.parse_obj(item) for item in _providers
]

# Register the model as an embedding model
self.embedding_models.append(
ModelConfig(
name=f"{provider_config.provider_name}/{model_id}",
type=ModelType.embedding,
)
# load llm models
self.llm_models: List[ModelConfig] = []
# load embedding models
self.embedding_models: List[ModelConfig] = []

for provider_config in self.provider_configs:
if provider_config.api_key_env_var and not os.environ.get(
provider_config.api_key_env_var
):
raise ValueError(
f"Environment variable {provider_config.api_key_env_var} not set. "
f"Cannot initialize the model gateway."
)

for model_id in provider_config.embedding_model_ids:
model_name = f"{provider_config.provider_name}/{model_id}"
self.model_name_to_provider_config[model_name] = provider_config

# Register the model as an embedding model
self.embedding_models.append(
ModelConfig(
name=f"{provider_config.provider_name}/{model_id}",
type=ModelType.embedding,
)
)

for model_id in provider_config.llm_model_ids:
model_name = f"{provider_config.provider_name}/{model_id}"
self.model_name_to_provider_config[model_name] = provider_config
for model_id in provider_config.llm_model_ids:
model_name = f"{provider_config.provider_name}/{model_id}"
self.model_name_to_provider_config[model_name] = provider_config

# Register the model as a llm model
self.llm_models.append(
ModelConfig(
name=f"{provider_config.provider_name}/{model_id}",
type=ModelType.chat,
)
# Register the model as a llm model
self.llm_models.append(
ModelConfig(
name=f"{provider_config.provider_name}/{model_id}",
type=ModelType.chat,
)
)

def get_embedding_models(self) -> List[ModelConfig]:
return self.embedding_models
@@ -100,7 +104,7 @@ def get_llm_from_model_config(
if not model_config.parameters:
model_config.parameters = {}
if not model_provider_config.api_key_env_var:
api_key = None
api_key = "EMPTY"
else:
api_key = os.environ.get(model_provider_config.api_key_env_var, "")
model_id = "/".join(model_config.name.split("/")[1:])
85 changes: 40 additions & 45 deletions backend/settings.py
@@ -1,9 +1,6 @@
import json
import os
from typing import Optional

import orjson
from pydantic import BaseSettings
from pydantic import BaseSettings, root_validator

from backend.types import MetadataStoreConfig, VectorDBConfig

@@ -13,49 +10,47 @@ class Settings(BaseSettings):
Settings class to hold all the environment variables
"""

LOG_LEVEL: str = "info"
MODELS_CONFIG_PATH: str
METADATA_STORE_CONFIG: MetadataStoreConfig
VECTOR_DB_CONFIG: VectorDBConfig
TFY_SERVICE_ROOT_PATH: Optional[str] = "/"
TFY_API_KEY: str
OPENAI_API_KEY: Optional[str]
TFY_HOST: Optional[str]
TFY_LLM_GATEWAY_URL: str
LOG_LEVEL = os.getenv("LOG_LEVEL", "info")
VECTOR_DB_CONFIG = os.getenv("VECTOR_DB_CONFIG", "")
METADATA_STORE_CONFIG = os.getenv("METADATA_STORE_CONFIG", "")
TFY_SERVICE_ROOT_PATH = os.getenv("TFY_SERVICE_ROOT_PATH", "")
JOB_FQN = os.getenv("JOB_FQN", "")
JOB_COMPONENT_NAME = os.getenv("JOB_COMPONENT_NAME", "")
TFY_API_KEY = os.getenv("TFY_API_KEY", "")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")
TFY_HOST = os.getenv("TFY_HOST", "")
TFY_LLM_GATEWAY_URL = os.getenv("TFY_LLM_GATEWAY_URL", "")

LOCAL: bool = os.getenv("LOCAL", False)
OLLAMA_URL: str = os.getenv("OLLAMA_URL", "http://localhost:11434")
EMBEDDING_SVC_URL: str = os.getenv("EMBEDDING_SVC_URL", "")
RERANKER_SVC_URL: str = os.getenv("RERANKER_SVC_URL", "")

if not VECTOR_DB_CONFIG:
raise ValueError("VECTOR_DB_CONFIG is not set")

if not METADATA_STORE_CONFIG:
raise ValueError("METADATA_STORE_CONFIG is not set")

if not TFY_LLM_GATEWAY_URL:
TFY_LLM_GATEWAY_URL = f"{TFY_HOST}/api/llm"

try:
VECTOR_DB_CONFIG = VectorDBConfig.parse_obj(orjson.loads(VECTOR_DB_CONFIG))
except Exception as e:
raise ValueError(f"VECTOR_DB_CONFIG is invalid: {e}")
try:
METADATA_STORE_CONFIG = MetadataStoreConfig.parse_obj(
orjson.loads(METADATA_STORE_CONFIG)
)
except Exception as e:
raise ValueError(f"METADATA_STORE_CONFIG is invalid: {e}")

RERANKER_SVC_URL: str = ""
LOCAL: bool = False

TFY_HOST: str = ""
TFY_API_KEY: str = ""
JOB_FQN: str = ""
JOB_COMPONENT_NAME: str = ""

LOG_LEVEL: str = "info"
TFY_SERVICE_ROOT_PATH: str = ""

# TODO: This will be removed in future releases
TFY_LLM_GATEWAY_URL: str = ""

@root_validator(pre=True)
def _validate_values(cls, values):
models_config_path = values.get("MODELS_CONFIG_PATH")
if not os.path.isabs(models_config_path):
this_dir = os.path.abspath(os.path.dirname(__file__))
root_dir = os.path.dirname(this_dir)
models_config_path = os.path.join(root_dir, models_config_path)

if not os.path.exists(models_config_path):
raise Exception(
f"{models_config_path} does not exist. "
f"You can copy models_config.sample.yaml to {models_config_path} to bootstrap config"
)

values["MODELS_CONFIG_PATH"] = models_config_path

tfy_host = values.get("TFY_HOST")
tfy_llm_gateway_url = values.get("TFY_LLM_GATEWAY_URL")
if tfy_host and not tfy_llm_gateway_url:
tfy_llm_gateway_url = f"{tfy_host.rstrip('/')}/api/llm"
values["TFY_LLM_GATEWAY_URL"] = tfy_llm_gateway_url

return values


settings = Settings()
9 changes: 4 additions & 5 deletions compose.env
@@ -5,21 +5,18 @@ POSTGRES_PORT=5432
POSTGRES_USER=postgres
POSTGRES_PASSWORD=test


## OLLAMA VARS
OLLAMA_MODEL=qwen2:1.5b


## INFINITY VARS
INFINITY_EMBEDDING_MODEL=mixedbread-ai/mxbai-embed-large-v1
INFINITY_RERANKING_MODEL=mixedbread-ai/mxbai-rerank-xsmall-v1


## COGNITA_BACKEND VARS
### Note: If you are changing `COGNITA_BACKEND_PORT`, please make sure to update `VITE_QA_FOUNDRY_URL` to match it. Frontend talks to backend via the host network
COGNITA_BACKEND_PORT=8000
OLLAMA_URL=http://ollama-server:11434
INFINITY_URL=http://infinity-server:7997
### `MODELS_CONFIG_PATH` is relative to cognita root dir
MODELS_CONFIG_PATH="./models_config.yaml"
METADATA_STORE_CONFIG='{"provider":"prisma"}'
VECTOR_DB_CONFIG='{"provider":"qdrant","url":"http://qdrant-server:6333", "config": {"grpc_port": 6334, "prefer_grpc": false}}'

@@ -31,6 +28,8 @@ VITE_DOCS_QA_STANDALONE_PATH=/
VITE_DOCS_QA_ENABLE_REDIRECT=false
VITE_DOCS_QA_MAX_UPLOAD_SIZE_MB=200

## OpenAI API Key
OPENAI_API_KEY=

## TFY VARS
TFY_API_KEY=
3 changes: 1 addition & 2 deletions docker-compose.yaml
@@ -131,11 +131,10 @@ services:
- DEBUG_MODE=true
- LOCAL=${LOCAL}
- LOG_LEVEL=DEBUG
- OLLAMA_URL=${OLLAMA_URL}
- EMBEDDING_SVC_URL=${INFINITY_URL}
- RERANKER_SVC_URL=${INFINITY_URL}
- METADATA_STORE_CONFIG=${METADATA_STORE_CONFIG}
- VECTOR_DB_CONFIG=${VECTOR_DB_CONFIG}
- OPENAI_API_KEY=${OPENAI_API_KEY}
- TFY_API_KEY=${TFY_API_KEY}
- TFY_HOST=${TFY_HOST}
- TFY_LLM_GATEWAY_URL=${TFY_LLM_GATEWAY_URL}
36 changes: 0 additions & 36 deletions models_config.sample.json

This file was deleted.

