Actions: huggingface/trl

Hugging Face Issue Labeler

437 workflow runs

[GPRO] vLLM getting stuck on client crash
Hugging Face Issue Labeler #437: Issue #3408 opened by daniel-dona
May 2, 2025 22:19 31s
GRPO online example not reproducible
Hugging Face Issue Labeler #436: Issue #3406 opened by JakobJBauer
May 2, 2025 17:18 33s
FP8 training
Hugging Face Issue Labeler #435: Issue #3399 opened by JosephChenHub
May 1, 2025 11:30 25s
The size of tensor a (2) must match the size of tensor b (4)
Hugging Face Issue Labeler #434: Issue #3392 opened by jinz2014
April 29, 2025 16:03 35s
Qwen3 training support
Hugging Face Issue Labeler #433: Issue #3387 opened by cuiyuhao1996
April 29, 2025 09:18 36s
Unable to change default data collator from SFTTrainer
Hugging Face Issue Labeler #432: Issue #3386 opened by GAD-cell
April 29, 2025 09:00 49s
GRPO Trainer cannot use rewards as best metric to save best model
Hugging Face Issue Labeler #431: Issue #3384 opened by lohzhunyewcs
April 29, 2025 03:27 18s
keep_end + max_length causes NaNs in trainer_state.json
Hugging Face Issue Labeler #430: Issue #3382 opened by jdebaer
April 28, 2025 22:22 20s
FSDP doesn't work with SFT,
Hugging Face Issue Labeler #429: Issue #3375 opened by NickNickGo
April 27, 2025 17:31 31s
DPO with QLoRA + FSDP
Hugging Face Issue Labeler #428: Issue #3371 opened by shon-otmazgin-wix
April 27, 2025 09:35 29s
April 27, 2025 05:40 34s
GPRO: use_liger_loss + zero3 error
Hugging Face Issue Labeler #426: Issue #3368 opened by paul777chen
April 27, 2025 04:16 29s
Device indexing error with trl vllm-serve + DDP
Hugging Face Issue Labeler #425: Issue #3363 opened by tchang1997
April 25, 2025 16:18 44s
RecursionError on IterativeSFTTrainer.step
Hugging Face Issue Labeler #424: Issue #3361 opened by pranav-gade
April 25, 2025 10:53 31s
Cannot init IterativeSFTTrainer without specifying processing_class
Hugging Face Issue Labeler #423: Issue #3360 opened by pranav-gade
April 25, 2025 10:31 27s
GRPO trainer cannot start with zero1 together with bf16
Hugging Face Issue Labeler #422: Issue #3359 opened by Tavish9
April 25, 2025 01:00 33s
MarkupError due to unescaped text while visualizing samples during an evaluation step
Hugging Face Issue Labeler #421: Issue #3353 opened by MayankAgarwal
April 24, 2025 13:44 38s
Keywords "remove_unused_columns=False" don't work when using the DPOConfig
Hugging Face Issue Labeler #420: Issue #3349 opened by Liesy
April 24, 2025 03:34 30s
GRPO trainer is not compatible with FSDP
Hugging Face Issue Labeler #419: Issue #3348 opened by Tavish9
April 24, 2025 01:52 34s
GRPO with VLLM got http.client.RemoteDisconnected: Remote end closed connection without response
Hugging Face Issue Labeler #418: Issue #3347 opened by Andcircle
April 23, 2025 21:01 40s
[GPG][new trainer] Add support to new GPG method
Hugging Face Issue Labeler #417: Issue #3345 opened by lerogo
April 23, 2025 08:00 39s
[SFT] formatting_func ignored.. with completion only loss
Hugging Face Issue Labeler #415: Issue #3343 opened by HERIUN
April 23, 2025 03:41 31s
gradient_accumulation_steps is not working
Hugging Face Issue Labeler #414: Issue #3342 opened by greatxue
April 23, 2025 01:32 32s
Loss becomes NaN when response is empty due to max_prompt_length >= max_length in SimPO
Hugging Face Issue Labeler #413: Issue #3339 opened by vigneshwaran
April 22, 2025 09:10 32s