Note
DeepSeek-RS is a personal project implementing DeepSeek's architecture in Rust for learning and experimentation. This is not an official DeepSeek project.
DeepSeek-RS is a Rust-based deep-learning framework replicating DeepSeek’s architecture. It uses `tch-rs` (Rust bindings to libtorch) for tensor computation and supports large-scale transformer-based models.
Warning
DeepSeek-RS is not stable and is currently unfinished. Expect bugs, missing features, and incomplete functionality. Use at your own risk!
- Transformer-based architecture - Supports large-scale models (a minimal sketch follows this list).
- Optimized tensor computations - Uses `tch-rs` (bindings to libtorch).
- 1:1 Python conversion - Structure is directly mapped from DeepSeek’s Python implementation.
- Efficient inference - Uses GPU acceleration via `tch::Tensor`.
- Training pipeline - Custom model training with dataset integration.
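As a rough illustration of how these building blocks fit together, the sketch below assembles a transformer-style feed-forward sub-block from `tch-rs` primitives. It is a hedged example: the `feed_forward` helper, layer names, and dimensions are illustrative and not taken from the DeepSeek-RS codebase.

```rust
use tch::nn::{self, Module};
use tch::{Device, Kind, Tensor};

/// Illustrative only: a transformer-style feed-forward sub-block
/// (LayerNorm -> Linear -> ReLU -> Linear). The actual DeepSeek-RS
/// modules may be organised differently.
fn feed_forward(p: &nn::Path, d_model: i64, d_hidden: i64) -> impl Module {
    nn::seq()
        .add(nn::layer_norm(p / "ln", vec![d_model], Default::default()))
        .add(nn::linear(p / "fc1", d_model, d_hidden, Default::default()))
        .add_fn(|xs| xs.relu())
        .add(nn::linear(p / "fc2", d_hidden, d_model, Default::default()))
}

fn main() {
    let vs = nn::VarStore::new(Device::cuda_if_available());
    let block = feed_forward(&vs.root(), 768, 3072);
    // A single 768-dimensional "token" placed on the VarStore's device.
    let x = Tensor::randn(&[1, 768], (Kind::Float, vs.device()));
    let y = block.forward(&x);
    println!("output shape: {:?}", y.size()); // expected: [1, 768]
}
```

The same `nn::seq()` pattern appears again in the usage examples further down.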
- Rust (latest stable)
- Cargo package manager
- libtorch (must be installed separately; the `tch-rs` build locates it via the `LIBTORCH` environment variable)
```bash
git clone https://github.com/rustyspottedcatt/deepseek-rs.git
cd deepseek-rs
cargo build --release
```
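Once the build finishes, a quick sanity check can confirm that libtorch is linked and whether a CUDA device is visible. This is a throwaway probe, not part of the project itself:

```rust
use tch::{Device, Kind, Tensor};

fn main() {
    // Reports whether libtorch can see a CUDA device; Device::cuda_if_available
    // falls back to the CPU otherwise.
    println!("CUDA available: {}", tch::Cuda::is_available());
    let device = Device::cuda_if_available();
    // A trivial tensor operation confirms the libtorch link works end to end.
    let t = Tensor::ones(&[2, 2], (Kind::Float, device)) * 2;
    t.print();
}
```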
```rust
use tch::nn::{self, Module};
use tch::Device;

// Build a simple two-layer feed-forward model on the GPU if one is available.
let vs = nn::VarStore::new(Device::cuda_if_available());
let model = nn::seq()
    .add(nn::linear(vs.root() / "fc1", 768, 3072, Default::default()))
    .add_fn(|xs| xs.relu())
    .add(nn::linear(vs.root() / "fc2", 3072, 768, Default::default()));
println!("Model successfully loaded.");
```
```rust
use tch::Tensor;

// Run a forward pass on a random input batch of shape [1, 768].
let input_tensor = Tensor::randn(&[1, 768], (tch::Kind::Float, tch::Device::cuda_if_available()));
let output_tensor = model.forward(&input_tensor);
println!("Inference output: {:?}", output_tensor);
```
```rust
use tch::nn::{self, OptimizerConfig};
use tch::Tensor;

// Placeholder target; a real run would take targets from the training dataset.
let target_tensor = Tensor::randn(&[1, 768], (tch::Kind::Float, tch::Device::cuda_if_available()));
let mut opt = nn::Adam::default().build(&vs, 1e-3).unwrap();
let loss = output_tensor.mse_loss(&target_tensor, tch::Reduction::Mean);
// backward_step zeroes gradients, backpropagates, and applies the parameter update.
opt.backward_step(&loss);
println!("Training step completed.");
```
```toml
[dependencies]
tch = "0.19"
ndarray = "0.15"
serde = { version = "1.0", features = ["derive"] }
bincode = "1.3"
rayon = "1.7"
```
Distributed under the GNU AGPLv3 license.