Commit 368c7fe

feat: Added resource requirement calculations.
1 parent 600b162 commit 368c7fe

1 file changed: +40 -0 lines changed

docs/core/resources.md

@@ -0,0 +1,40 @@
# How Many Resources Do You Need to Run Chroma?

Chroma makes use of the following compute resources:

- RAM - Chroma stores the vector HNSW index in memory, which allows it to perform fast semantic searches.
- Disk - Chroma persists all data to disk. This includes the vector HNSW index, metadata index, system DB, and the
  write-ahead log (WAL).
- CPU - Chroma uses the CPU for indexing and searching vectors.

Here are some formulas and heuristics to help you estimate the resources you need to run Chroma.

## RAM

Once you select your embedding model, use the following formula to calculate the RAM required for the vector
HNSW index:

`number of vectors` * `dimensionality of vectors` * `4 bytes` = `RAM required`

- `number of vectors` - The number of vectors you plan to index. These are the documents in your Chroma
  collection (or chunks, in LlamaIndex or LangChain terminology).
- `dimensionality of vectors` - The dimensionality of the vectors output by your embedding model. For example,
  if you use the `sentence-transformers/paraphrase-MiniLM-L6-v2` model, the dimensionality of the vectors is 384.
- `4 bytes` - The size of each component of a vector. Chroma relies on the hnswlib implementation, which uses 32-bit
  floats.

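As a rough illustration, the formula above can be sketched in Python. The function name `estimate_hnsw_ram_bytes` and the one-million-vector collection size are assumptions made for this example; they are not part of Chroma's API. The result covers only the raw vector data described by the formula.

```python
def estimate_hnsw_ram_bytes(num_vectors: int, dimensions: int, bytes_per_component: int = 4) -> int:
    """Apply the formula: number of vectors * dimensionality * 4 bytes.

    Hypothetical helper for illustration only; not part of Chroma.
    """
    if num_vectors < 0 or dimensions < 0 or bytes_per_component <= 0:
        raise ValueError("inputs must be non-negative, with a positive component size")
    return num_vectors * dimensions * bytes_per_component


# Assumed workload: 1,000,000 vectors from a 384-dimensional embedding model.
ram_bytes = estimate_hnsw_ram_bytes(1_000_000, 384)
print(f"{ram_bytes} bytes = {ram_bytes / 1024**3:.2f} GiB")
# prints "1536000000 bytes = 1.43 GiB"
```

Treat the result as a lower bound for planning purposes, since the operating system and the rest of the application also need memory.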
## Disk

Disk storage requirements mainly depend on what metadata you store and the number of vectors you index. As a rule of
thumb, provision at least 2-4x the RAM required for the vector HNSW index.

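The 2-4x heuristic can be turned into a quick range estimate. The sketch below is illustrative only: the helper name `estimate_disk_range_bytes` and the example index size (1,000,000 vectors of 384 dimensions at 4 bytes each) are assumptions, and actual disk usage depends on your metadata and WAL growth.

```python
from typing import Tuple


def estimate_disk_range_bytes(index_ram_bytes: int, low_factor: int = 2, high_factor: int = 4) -> Tuple[int, int]:
    """Return the (low, high) disk estimate as multiples of the HNSW index RAM size.

    Hypothetical helper for illustration only; not part of Chroma.
    """
    if index_ram_bytes < 0:
        raise ValueError("index_ram_bytes must be non-negative")
    return index_ram_bytes * low_factor, index_ram_bytes * high_factor


# Assumed index size: 1,000,000 vectors * 384 dimensions * 4 bytes.
index_ram_bytes = 1_000_000 * 384 * 4
low, high = estimate_disk_range_bytes(index_ram_bytes)
print(f"Plan for at least {low / 1e9:.2f} GB to {high / 1e9:.2f} GB of disk")
# prints "Plan for at least 3.07 GB to 6.14 GB of disk"
```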
!!! note "WAL Cleanup"

    Chroma does not currently clean the WAL, so your sqlite3 metadata file will grow over time. In the meantime, feel
    free to use available tooling to periodically clean your WAL;
    see [chromadb-ops](https://github.com/amikos-tech/chromadb-ops) for more information.

## CPU

There are no hard CPU requirements, but it is recommended to allocate as much CPU as you can spare, since it directly
affects indexing and search speed.
