Note
DeepSeek-RS is a personal project implementing DeepSeek's architecture in Rust for learning and experimentation. This is not an official DeepSeek project.
DeepSeek-RS is a Rust-based deep-learning framework replicating DeepSeek’s architecture. It uses `tch-rs` (Rust bindings to libtorch) for tensor computation and supports large-scale transformer-based models.
Warning
DeepSeek-RS is not stable and is currently unfinished. Expect bugs, missing features, and incomplete functionality. Use at your own risk!
- Transformer-based architecture - Supports large-scale models (a minimal sketch follows this list).
- Optimized tensor computations - Uses `tch-rs` (bindings to libtorch).
- 1:1 Python conversion - Structure is directly mapped from DeepSeek’s Python implementation.
- Efficient inference - Uses GPU acceleration via `tch::Tensor`.
- Training pipeline - Custom model training with dataset integration.
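As a rough illustration of how these building blocks fit together, the sketch below assembles a transformer-style feed-forward sub-block from `tch-rs` primitives. It is a hedged example: the `feed_forward` helper, layer names, and dimensions are illustrative and not taken from the DeepSeek-RS codebase.

```rust
use tch::nn::{self, Module};
use tch::{Device, Kind, Tensor};

/// Illustrative only: a transformer-style feed-forward sub-block
/// (LayerNorm -> Linear -> ReLU -> Linear). The actual DeepSeek-RS
/// modules may be organised differently.
fn feed_forward(p: &nn::Path, d_model: i64, d_hidden: i64) -> impl Module {
    nn::seq()
        .add(nn::layer_norm(p / "ln", vec![d_model], Default::default()))
        .add(nn::linear(p / "fc1", d_model, d_hidden, Default::default()))
        .add_fn(|xs| xs.relu())
        .add(nn::linear(p / "fc2", d_hidden, d_model, Default::default()))
}

fn main() {
    let vs = nn::VarStore::new(Device::cuda_if_available());
    let block = feed_forward(&vs.root(), 768, 3072);
    // A single 768-dimensional "token" placed on the VarStore's device.
    let x = Tensor::randn(&[1, 768], (Kind::Float, vs.device()));
    let y = block.forward(&x);
    println!("output shape: {:?}", y.size()); // expected: [1, 768]
}
```

The same `nn::seq()` pattern appears again in the usage examples further down.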
- Rust (latest stable)
- Cargo package manager
- libtorch (must be installed separately; the `tch-rs` build locates it via the `LIBTORCH` environment variable)
```bash
git clone https://github.com/rustyspottedcatt/deepseek-rs.git
cd deepseek-rs
cargo build --release
```
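Once the build finishes, a quick sanity check can confirm that libtorch is linked and whether a CUDA device is visible. This is a throwaway probe, not part of the project itself:

```rust
use tch::{Device, Kind, Tensor};

fn main() {
    // Reports whether libtorch can see a CUDA device; Device::cuda_if_available
    // falls back to the CPU otherwise.
    println!("CUDA available: {}", tch::Cuda::is_available());
    let device = Device::cuda_if_available();
    // A trivial tensor operation confirms the libtorch link works end to end.
    let t = Tensor::ones(&[2, 2], (Kind::Float, device)) * 2;
    t.print();
}
```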
```rust
use tch::nn::{self, Module};
use tch::Device;

// Build a simple two-layer feed-forward model on the GPU if one is available.
let vs = nn::VarStore::new(Device::cuda_if_available());
let model = nn::seq()
    .add(nn::linear(vs.root() / "fc1", 768, 3072, Default::default()))
    .add_fn(|xs| xs.relu())
    .add(nn::linear(vs.root() / "fc2", 3072, 768, Default::default()));
println!("Model successfully loaded.");
```
```rust
use tch::Tensor;

// Run a forward pass on a random input batch of shape [1, 768].
let input_tensor = Tensor::randn(&[1, 768], (tch::Kind::Float, tch::Device::cuda_if_available()));
let output_tensor = model.forward(&input_tensor);
println!("Inference output: {:?}", output_tensor);
```
```rust
use tch::nn::{self, OptimizerConfig};
use tch::Tensor;

// Placeholder target; a real run would take targets from the training dataset.
let target_tensor = Tensor::randn(&[1, 768], (tch::Kind::Float, tch::Device::cuda_if_available()));
let mut opt = nn::Adam::default().build(&vs, 1e-3).unwrap();
let loss = output_tensor.mse_loss(&target_tensor, tch::Reduction::Mean);
// backward_step zeroes gradients, backpropagates, and applies the parameter update.
opt.backward_step(&loss);
println!("Training step completed.");
```
```toml
[dependencies]
tch = "0.19"
ndarray = "0.15"
serde = { version = "1.0", features = ["derive"] }
bincode = "1.3"
rayon = "1.7"
```
Distributed under the GNU AGPLv3 license.