DeepSeek-RS is a personal project implementing DeepSeek's architecture in Rust for learning and experimentation. This is not an official DeepSeek project.

NEBYTE/deepseek-rs

DeepSeek-RS (Rust Implementation)

Note

DeepSeek-RS is a personal project implementing DeepSeek's architecture in Rust for learning and experimentation. This is not an official DeepSeek project.

DeepSeek-RS is a Rust deep-learning framework that replicates DeepSeek’s architecture. It uses tch-rs (Rust bindings to libtorch) for tensor computation and targets large-scale transformer-based models.


Features

Warning

DeepSeek-RS is not stable and is currently unfinished. Expect bugs, missing features, and incomplete functionality. Use at your own risk!

  • Transformer-based architecture - Supports large-scale models.
  • Optimized tensor computations - Uses tch-rs (bindings to libtorch).
  • 1:1 Python conversion - Structure is directly mapped from DeepSeek’s Python implementation.
  • Efficient inference - Uses GPU acceleration via tch::Tensor.
  • Training pipeline - Custom model training with dataset integration.

Installation

Prerequisites

  • Rust (latest stable)
  • Cargo package manager
  • Libtorch (must be installed separately)

Clone the Repository

git clone https://github.com/rustyspottedcatt/deepseek-rs.git
cd deepseek-rs

Build the Project

cargo build --release

Usage

Model Loading

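use tch::nn::{self, Module};
use tch::Device;

// Variable store that owns the model parameters (GPU if available, else CPU).
let vs = nn::VarStore::new(Device::cuda_if_available());

// Feed-forward block: 768 -> 3072 -> 768 with a ReLU in between.
// tch implements `/` on `&Path`, so the root path is borrowed before joining.
let model = nn::seq()
    .add(nn::linear(&vs.root() / "fc1", 768, 3072, Default::default()))
    .add_fn(|xs| xs.relu())
    .add(nn::linear(&vs.root() / "fc2", 3072, 768, Default::default()));

// Parameters are randomly initialized here; no weights are read from disk.
// To restore saved weights, `VarStore::load` can be called on a mutable `vs`.
println!("Model successfully constructed.");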
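
To make the shape of this block concrete, here is a minimal, dependency-free sketch of what a linear -> ReLU -> linear stack computes, written with plain `Vec<f32>` rather than `tch::Tensor`. The helper names (`linear`, `relu`, `feed_forward`) are hypothetical and are not part of DeepSeek-RS or tch; the weight layout (one row per output feature) is an assumption chosen to match the usual `[out_features, in_features]` convention.

```rust
/// Computes y = W·x + b, where `w` holds one row per output feature.
fn linear(x: &[f32], w: &[Vec<f32>], b: &[f32]) -> Vec<f32> {
    w.iter()
        .zip(b)
        .map(|(row, bias)| row.iter().zip(x).map(|(wi, xi)| wi * xi).sum::<f32>() + bias)
        .collect()
}

/// Element-wise ReLU: negative entries become zero.
fn relu(v: &[f32]) -> Vec<f32> {
    v.iter().map(|&a| a.max(0.0)).collect()
}

/// Two-layer block (illustrative only): linear -> ReLU -> linear.
fn feed_forward(
    x: &[f32],
    w1: &[Vec<f32>],
    b1: &[f32],
    w2: &[Vec<f32>],
    b2: &[f32],
) -> Vec<f32> {
    let hidden = relu(&linear(x, w1, b1));
    linear(&hidden, w2, b2)
}
```

In the README's model the input has 768 features and the hidden layer 3072, so `w1` would have 3072 rows of length 768 and `w2` would have 768 rows of length 3072. In practice the tch version should be preferred, since it runs on libtorch kernels and tracks gradients.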

Inference

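use tch::nn::Module; // brings `forward` into scope
use tch::{Kind, Tensor};

// Random input of shape [batch = 1, features = 768], placed on the model's device.
let input_tensor = Tensor::randn(&[1, 768], (Kind::Float, vs.device()));
let output_tensor = model.forward(&input_tensor);

println!("Inference output: {:?}", output_tensor);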

Training

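use tch::nn::{self, OptimizerConfig};
use tch::{Kind, Reduction, Tensor};

// Adam optimizer over every variable in `vs`, learning rate 1e-3.
let mut opt = nn::Adam::default().build(&vs, 1e-3).unwrap();

// Placeholder target matching the output shape; substitute real labels.
let target_tensor = Tensor::randn(&[1, 768], (Kind::Float, vs.device()));
let loss = output_tensor.mse_loss(&target_tensor, Reduction::Mean);

// Zero the gradients, backpropagate, and apply one optimizer update.
opt.backward_step(&loss);
println!("Training step completed.");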
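
As a rough illustration of what the loss in this step measures, the sketch below computes a mean-squared error and its gradient by hand on plain slices. It is not how tch or DeepSeek-RS implements training: the function names (`mse_loss`, `mse_grad`, `sgd_step`) are hypothetical, and plain gradient descent stands in for Adam to keep the arithmetic easy to follow.

```rust
/// Mean-squared error: the average of (prediction - target)^2 over all elements.
fn mse_loss(prediction: &[f32], target: &[f32]) -> f32 {
    assert_eq!(prediction.len(), target.len(), "shapes must match");
    let sum: f32 = prediction
        .iter()
        .zip(target)
        .map(|(p, t)| (p - t).powi(2))
        .sum();
    sum / prediction.len() as f32
}

/// Gradient of the mean-squared error with respect to each prediction:
/// d(loss)/d(p_i) = 2 * (p_i - t_i) / n.
fn mse_grad(prediction: &[f32], target: &[f32]) -> Vec<f32> {
    let n = prediction.len() as f32;
    prediction
        .iter()
        .zip(target)
        .map(|(p, t)| 2.0 * (p - t) / n)
        .collect()
}

/// One plain gradient-descent update (an assumption; the README uses Adam):
/// v_i <- v_i - lr * grad_i.
fn sgd_step(values: &mut [f32], grad: &[f32], lr: f32) {
    for (v, g) in values.iter_mut().zip(grad) {
        *v -= lr * g;
    }
}
```

With predictions `[1.0, 2.0, 3.0, 4.0]` and targets `[1.0, 0.0, 5.0, 4.0]`, the loss is 2.0; after one `sgd_step` on the predictions with `lr = 0.5` it drops to 1.125. A real training step updates the model parameters rather than the predictions, which is what `backward_step` does through automatic differentiation.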

Dependencies

[dependencies]
tch = "0.19"
ndarray = "0.15"
serde = { version = "1.0", features = ["derive"] }
bincode = "1.3"
rayon = "1.7"

License

Distributed under the GNU AGPLv3 license.
