| 1 | +# How Many Resources Do You Need to Run Chroma |
| 2 | + |
| 3 | +Chroma makes use of the following compute resources: |
| 4 | + |
| 5 | +- RAM - Chroma stores the vector HNSW index in memory, which allows it to perform fast semantic searches. |
| 6 | +- Disk - Chroma persists all data to disk. This includes the vector HNSW index, metadata index, system DB, and the |
| 7 | + write-ahead log (WAL). |
| 8 | +- CPU - Chroma uses CPU for indexing and searching vectors. |
| 9 | + |
| 10 | +Here are some formulas and heuristics to help you estimate the resources you need to run Chroma. |
| 11 | + |
| 12 | +## RAM |
| 13 | + |
| 14 | +Once you have selected your embedding model, use the following formula to estimate the RAM required for the vector |
| 15 | +HNSW index: |
| 16 | + |
| 17 | +`number of vectors` * `dimensionality of vectors` * `4 bytes` = `RAM required (in bytes)` |
| 18 | + |
| 19 | +- `number of vectors` - This is the number of vectors you plan to index. These are the documents in your Chroma |
| 20 | + collection (or chunks if you use LlamaIndex or LangChain terminology). |
| 21 | +- `dimensionality of vectors` - This is the dimensionality of the vectors output by your embedding model. For example, |
| 22 | + if you use the `sentence-transformers/paraphrase-MiniLM-L6-v2` model, the dimensionality of the vectors is 384. |
| 23 | +- `4 bytes` - This is the size of each component of a vector. Chroma relies on an HNSW lib implementation that uses |
| 24 | + 32-bit floats. |
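
As a worked example of the formula above, the following minimal sketch computes the estimate for one million vectors of dimensionality 384. The function name `hnsw_index_ram_bytes` and the collection size are illustrative assumptions, not part of Chroma's API, and the result is only the rough estimate the formula gives.

```python
def hnsw_index_ram_bytes(num_vectors: int, dimensions: int, bytes_per_component: int = 4) -> int:
    """Estimate RAM for the vector data: vectors * dimensions * 4 bytes (32-bit floats)."""
    return num_vectors * dimensions * bytes_per_component


# Hypothetical collection: 1,000,000 chunks embedded with a 384-dimensional model.
ram_bytes = hnsw_index_ram_bytes(1_000_000, 384)
ram_gib = ram_bytes / 1024**3  # convert bytes to GiB

print(f"{ram_bytes:,} bytes (~{ram_gib:.2f} GiB)")  # 1,536,000,000 bytes (~1.43 GiB)
```

Doubling either the number of vectors or the dimensionality doubles the estimate, so the choice of embedding model matters as much as the size of the collection.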
| 25 | + |
| 26 | +## Disk |
| 27 | + |
| 28 | +Disk storage requirements depend mainly on the metadata you store and the number of vectors you index. As a |
| 29 | +heuristic, plan for at least 2-4x the RAM required for the vector HNSW index. |
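
To illustrate the disk heuristic, here is a small sketch that turns an index RAM estimate into a 2-4x disk range. The function name `disk_estimate_bytes` and the example sizes are assumptions for illustration. Actual usage depends on your metadata and on WAL growth.

```python
def disk_estimate_bytes(index_ram_bytes: int) -> tuple[int, int]:
    """Return the (2x, 4x) disk range for a given HNSW index RAM estimate."""
    return 2 * index_ram_bytes, 4 * index_ram_bytes


# Hypothetical collection: 1,000,000 vectors * 384 dimensions * 4 bytes.
index_ram = 1_000_000 * 384 * 4
low, high = disk_estimate_bytes(index_ram)

print(f"Plan for at least {low / 1e9:.1f}-{high / 1e9:.1f} GB of disk")
```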
| 30 | + |
| 31 | +!!! note "WAL Cleanup" |
| 32 | + |
| 33 | + Chroma does not currently clean the WAL, so your sqlite3 metadata file will grow over time. In the meantime, you |
| 34 | + can use available tooling to clean the WAL periodically; |
| 35 | + see [chromadb-ops](https://github.com/amikos-tech/chromadb-ops) for more information. |
| 36 | + |
| 37 | +## CPU |
| 38 | + |
| 39 | +There are no hard CPU requirements, but give Chroma as much CPU as you can spare, because it directly affects |
| 40 | +indexing and search speed. |