Insights: unslothai/unsloth
Overview
4 Pull requests merged by 1 person
- Qwen3 inference fixes (#2436, merged Apr 30, 2025)
- Fixup qwen3 qk norm (#2427, merged Apr 29, 2025)
- Fixup qwen3 (#2423, merged Apr 29, 2025)
- [WIP] Initial support for Qwen3. Will update when the model is released (#2211, merged Apr 29, 2025)
1 Pull request opened by 1 person
- Added missing code of conduct (#2416, opened Apr 26, 2025)
33 Issues closed by 12 people
- Continued Pre-Training Notebook not working with unsloth/Llama-3.2-1B-bnb-4bit (#1210, closed Apr 29, 2025)
- Is there proper attention masking done when applying packing=true? (#1207, closed Apr 29, 2025)
- NameError: name 'Unpack' is not defined (#1181, closed Apr 29, 2025)
- Error - 'OutOfMemoryError: CUDA out of memory.' (#1214, closed Apr 29, 2025)
- Please add the model: EleutherAI/polyglot-ko-5.8b (#1209, closed Apr 29, 2025)
- [Bug] unsloth `ImportError` when using `triton==3.3.0` (#2403, closed Apr 29, 2025)
- [Feature] When will Dynamic Quants 2.0 be available for custom models? (#2421, closed Apr 29, 2025)
- ORPO trainer does not work after SFT (#1203, closed Apr 28, 2025)
- [Question] How to train the "pt" gemma model with a text corpus? Any sample unsloth notebook? (#2420, closed Apr 28, 2025)
- "FlashAttention only support fp16 and bf16 data type" error when using DoRA (#1154, closed Apr 27, 2025)
- Issue saving mistral-7b-instruct-v0.3-bnb-4bit to GGUF (#1197, closed Apr 27, 2025)
- [Feature] Implement fine-tuning for new gemma int4 models (#2412, closed Apr 27, 2025)
- [FEAT] Integration of SmolVLM (#2320, closed Apr 27, 2025)
- [Bug] FastLanguageModel.from_pretrained() secretly changes the model I really want to fine-tune (#2407, closed Apr 27, 2025)
- [Docs] How to understand the R1-2.51bits or V3-2.71bits (#2344, closed Apr 27, 2025)
- [Docs] Benchmark comparison between 2.71 and 4.5 bit? (#2345, closed Apr 27, 2025)
- Train_on_completions can't handle eval_datasets as a dictionary (#1192, closed Apr 26, 2025)
- [Bug] AttributeError: 'Mistral3ForConditionalGeneration' object has no attribute 'model' (#2415, closed Apr 26, 2025)
- How can I get the unsloth pro version? (#2414, closed Apr 26, 2025)
- Does unsloth support freeze tuning? (#1183, closed Apr 25, 2025)
- pip install --upgrade --no-cache-dir unsloth broke CUDA packages; inference slower (#1187, closed Apr 25, 2025)
- ModuleNotFoundError: Failed to import transformers.models.falcon_mamba.configuration_falcon_mamba (#1185, closed Apr 25, 2025)
- Can't import unsloth when the latest versions of both unsloth and transformers are installed (#1179, closed Apr 25, 2025)
- [Bug] PatchDPOTrainer - AttributeError: 'dict' object has no attribute 'logits' (#2406, closed Apr 25, 2025)
- [Bug] NameError: name 'tokenizer_call' is not defined (#2400, closed Apr 24, 2025)
- A bug in save.py (#1170, closed Apr 24, 2025)
- Fine-tuning without a GPU? (#1132, closed Apr 24, 2025)
- Gradient accumulation fix does change the max_steps value (#1163, closed Apr 24, 2025)
- [Question] Which notebook should I use for continued pretraining of an LLM with domain knowledge? (#2402, closed Apr 24, 2025)
- How to use this as the reference policy? (#1167, closed Apr 23, 2025)
22 Issues opened by 22 people
- [GRPO+LoRA] no attribute 'load_lora' (#2437, opened Apr 30, 2025)
- Multi-GPU training (#2435, opened Apr 30, 2025)
- [Bug] Error installing and then importing on any currently supported AWS SageMaker image (#2433, opened Apr 29, 2025)
- [Bug] PassManager::run failed when training on Google Colab (#2432, opened Apr 29, 2025)
- [Bug] RTX 5090 error (#2431, opened Apr 29, 2025)
- QWEN3 FINE-TUNING now in Unsloth! (#2428, opened Apr 29, 2025)
- Llama3_1_(3B)_GRPO_LoRA vLLM error (#2426, opened Apr 29, 2025)
- [Question] Does Unsloth Open support DDP? (#2425, opened Apr 29, 2025)
- GRPO training: repeated output after initial normal output (#2424, opened Apr 29, 2025)
- AttributeError: 'HybridCache' object has no attribute 'float' (#2419, opened Apr 27, 2025)
- [Bug] Loss not decreasing with Qwen 2.5 32B (#2417, opened Apr 26, 2025)
- [Question] Adding several PEFT adapters and ensuring unsloth takes them into account (#2411, opened Apr 25, 2025)
- [Feature] DIA TTS model fine-tuning support (#2410, opened Apr 25, 2025)
- [Feature] TRL 0.17 support (#2409, opened Apr 25, 2025)
- "RuntimeError: CUDA driver error: unknown error" when fine-tuning Llama-3.2-11B-Vision-Instruct (#2408, opened Apr 25, 2025)
- Unsloth models output gibberish on long inputs (#2405, opened Apr 24, 2025)
- [Question] Not seeing 2x speed when fine-tuning the Qwen2.5-VL model (#2404, opened Apr 24, 2025)
- Jetson fine-tuning: loading the model runs out of memory (#2401, opened Apr 24, 2025)
- [Bug] Impossible to convert Gemma3 4b into GGUF (#2399, opened Apr 23, 2025)
42 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- vLLM Windows CUDA support [tested] (#2158, commented Apr 28, 2025 • 1 new comment)
- Error saving GGUF of vision model (#1504, commented Apr 27, 2025)
- Error when loading llama3: `AttributeError: module 'transformers.models.bit.modeling_bit' has no attribute 'Linear'` (#2191, commented Apr 27, 2025)
- BackendCompilerFailed: backend='inductor' raised: SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats (#2230, commented Apr 28, 2025)
- [Bug] Issue while evaluating on GRPO (#2367, commented Apr 28, 2025)
- FakeTensor error in GRPO training (#2079, commented Apr 28, 2025)
- [Question] Cannot install specific releases from source? (#2368, commented Apr 28, 2025)
- Unable to load an unsloth-trained model saved to a local directory (#934, commented Apr 28, 2025)
- [Question] Fine-tune Gemma3 OOM (#2366, commented Apr 28, 2025)
- RuntimeError: Unsloth: Quantization failed! You might have to compile llama.cpp yourself, then run this again (#1781, commented Apr 29, 2025)
- Error while importing unsloth in Databricks (#1294, commented Apr 29, 2025)
- Add STT model for fine-tuning (#2394, commented Apr 29, 2025)
- Unsloth currently does not support multi-GPU setups in unsloth-2024.8 (#859, commented Apr 29, 2025)
- [Bug] unsloth_compiled_module_mamba2.py IndentationError: unexpected indent (#2347, commented Apr 29, 2025)
- [BUG] Image features and image tokens do not match on full fine-tuning (#2251, commented Apr 29, 2025)
- Maybe a bug? Cannot retrain fine-tuning output from gemma3-4b (#2304, commented Apr 29, 2025)
- 'unsloth/llava-v1.6-mistral-7b-hf' model inference ValueError: Image features and image tokens do not match: tokens: 1175, features 1176 (#2225, commented Apr 29, 2025)
- Unsloth on Mac (#685, commented Apr 30, 2025)
- CUDA error: out of memory in WSL with 24 GB VRAM while 2/3 was still unused (#1797, commented Apr 30, 2025)
- OOM on WSL, GRPOTrainer RuntimeError: CUDA driver error: out of memory (#1744, commented Apr 30, 2025)
- Added support for Apple Silicon (#1289, commented Apr 28, 2025)
- How to create or get the ollama modelFile of Unsloth tube square fine-tuning model? (#1823, commented Apr 23, 2025)
- Is there a training method for GRPO using Qwen2.5-VL-3B-Instruct? (#2324, commented Apr 23, 2025)
- [BUG] Gemma3 vision throws error: TypeError: 'int' object is not iterable (#2258, commented Apr 23, 2025)
- [RunPod] llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file (#1947, commented Apr 24, 2025)
- Gemma3ForCausalLM_forward() got multiple values for argument 'logits_to_keep' (#2167, commented Apr 24, 2025)
- Improve documentation on loading fine-tuned models with added tokens (#1863, commented Apr 24, 2025)
- [Feature] Qwen 2.5-Omni support? (#2325, commented Apr 24, 2025)
- [FEAT] Add Quantization-Aware Training (QAT) support (#2311, commented Apr 25, 2025)
- [Bug] AttributeError: 'LlamaForCausalLM' object has no attribute 'vllm_engine' (#2384, commented Apr 26, 2025)
- [Bug] Sharp increase in CUDA memory when using a customized trl.trainer? (#2397, commented Apr 26, 2025)
- [Bug] Encountered a RuntimeError: CUDA error: a memory access was encountered during training with Unsloth's GRPOTrainer (#2387, commented Apr 26, 2025)
- [Question] Unexpected warnings during unsloth setup (#2391, commented Apr 26, 2025)
- save_pretrained_merged issue (#2159, commented Apr 26, 2025)
- [Feature] Is it possible to support training microsoft/bitnet-b1.58-2B-4T? (#2390, commented Apr 26, 2025)
- Comprehensive report: 3-day installation struggle on Windows 10/WSL following all official methods (#2395, commented Apr 26, 2025)
- DAPO implementation (#2141, commented Apr 26, 2025)
- [Bug] Error loading model on Colab A100 (#2380, commented Apr 27, 2025)
- [Bug] TinyLlama fine-tune not learning (#2385, commented Apr 27, 2025)
- [Question] After fine-tuning, the model performed well, but after merging into GGUF (without quantization) its performance dropped. Why might this happen? (#2374, commented Apr 27, 2025)
- raise RuntimeError("mmap can only be used with files saved with " (#570, commented Apr 27, 2025)
- [Feature] Can support THUDM/GLM-Z1-9B-0414, thanks (#2398, commented Apr 27, 2025)