Insights: unslothai/unsloth
Overview
4 Pull requests merged by 1 person
- Qwen3 inference fixes (#2436, merged Apr 30, 2025)
- Fixup qwen3 qk norm (#2427, merged Apr 29, 2025)
- Fixup qwen3 (#2423, merged Apr 29, 2025)
- [WIP] Initial support for Qwen3. Will update when the model is released (#2211, merged Apr 29, 2025)
1 Pull request opened by 1 person
- Added missing code of conduct (#2416, opened Apr 26, 2025)
33 Issues closed by 12 people
- Continued Pre-Training Notebook not working with unsloth/Llama-3.2-1B-bnb-4bit (#1210, closed Apr 29, 2025)
- Is there proper attention masking done when applying packing=true? (#1207, closed Apr 29, 2025)
- NameError: name 'Unpack' is not defined (#1181, closed Apr 29, 2025)
- Error - 'OutOfMemoryError: CUDA out of memory.' (#1214, closed Apr 29, 2025)
- Please add the model: EleutherAI/polyglot-ko-5.8b (#1209, closed Apr 29, 2025)
- [Bug] unsloth `ImportError` when using `triton==3.3.0` (#2403, closed Apr 29, 2025)
- [Feature] When will Dynamic Quants 2.0 be available for custom models? (#2421, closed Apr 29, 2025)
- ORPO trainer does not work after SFT (#1203, closed Apr 28, 2025)
- [Question] How to train the "pt" gemma model with a text corpus? Any sample unsloth notebook? (#2420, closed Apr 28, 2025)
- "FlashAttention only support fp16 and bf16 data type" error when using DoRA (#1154, closed Apr 27, 2025)
- Issue saving mistral-7b-instruct-v0.3-bnb-4bit to GGUF (#1197, closed Apr 27, 2025)
- [Feature] Implement fine-tuning for new gemma int4 models (#2412, closed Apr 27, 2025)
- [FEAT] Integration of SmolVLM (#2320, closed Apr 27, 2025)
- [Bug] FastLanguageModel.from_pretrained() secretly changes the model I really want to fine-tune (#2407, closed Apr 27, 2025)
- [Docs] How to understand the R1-2.51bits or V3-2.71bits (#2344, closed Apr 27, 2025)
- [Docs] Benchmark comparison between 2.71 and 4.5 bit? (#2345, closed Apr 27, 2025)
- Train_on_completions can't handle eval_datasets as a dictionary (#1192, closed Apr 26, 2025)
- [Bug] AttributeError: 'Mistral3ForConditionalGeneration' object has no attribute 'model' (#2415, closed Apr 26, 2025)
- How can I get the unsloth pro version? (#2414, closed Apr 26, 2025)
- Does unsloth support freeze tuning? (#1183, closed Apr 25, 2025)
- pip install --upgrade --no-cache-dir unsloth broke CUDA packages; inference slower (#1187, closed Apr 25, 2025)
- ModuleNotFoundError: Failed to import transformers.models.falcon_mamba.configuration_falcon_mamba (#1185, closed Apr 25, 2025)
- Can't import unsloth when the latest versions of both unsloth and transformers are installed (#1179, closed Apr 25, 2025)
- [Bug] PatchDPOTrainer - AttributeError: 'dict' object has no attribute 'logits' (#2406, closed Apr 25, 2025)
- [Bug] NameError: name 'tokenizer_call' is not defined (#2400, closed Apr 24, 2025)
- A bug in save.py (#1170, closed Apr 24, 2025)
- Fine-tuning without a GPU? (#1132, closed Apr 24, 2025)
- Gradient accumulation fix does change the max_steps value (#1163, closed Apr 24, 2025)
- [Question] Which notebook should I use for continued pretraining of an LLM with domain knowledge? (#2402, closed Apr 24, 2025)
- How to use this as the reference policy? (#1167, closed Apr 23, 2025)
22 Issues opened by 22 people
- [GRPO+LoRA] no attribute 'load_lora' (#2437, opened Apr 30, 2025)
- Multi-GPU training (#2435, opened Apr 30, 2025)
- [Bug] Error installing and then importing on any currently supported AWS SageMaker image (#2433, opened Apr 29, 2025)
- [Bug] PassManager::run failed when training on Google Colab (#2432, opened Apr 29, 2025)
- [Bug] RTX 5090 error (#2431, opened Apr 29, 2025)
- QWEN3 FINE-TUNING now in Unsloth! (#2428, opened Apr 29, 2025)
- Llama3_1_(3B)_GRPO_LoRA vLLM error (#2426, opened Apr 29, 2025)
- [Question] Does Unsloth Open support DDP? (#2425, opened Apr 29, 2025)
- GRPO training: repeated output after initial normal output (#2424, opened Apr 29, 2025)
- AttributeError: 'HybridCache' object has no attribute 'float' (#2419, opened Apr 27, 2025)
- [Bug] Loss not decreasing with Qwen 2.5 32B (#2417, opened Apr 26, 2025)
- [Question] Adding several PEFT adapters and ensuring unsloth takes them into account (#2411, opened Apr 25, 2025)
- [Feature] DIA TTS model fine-tuning support (#2410, opened Apr 25, 2025)
- [Feature] TRL 0.17 support (#2409, opened Apr 25, 2025)
- "RuntimeError: CUDA driver error: unknown error" when fine-tuning Llama-3.2-11B-Vision-Instruct (#2408, opened Apr 25, 2025)
- Unsloth models output gibberish on long inputs (#2405, opened Apr 24, 2025)
- [Question] Not seeing 2x speed when fine-tuning the Qwen2.5-VL model (#2404, opened Apr 24, 2025)
- Jetson fine-tuning: loading the model runs out of memory (#2401, opened Apr 24, 2025)
- [Bug] Impossible to convert Gemma3 4b into GGUF (#2399, opened Apr 23, 2025)
42 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- vLLM Windows CUDA support [tested] (#2158, commented Apr 28, 2025 • 1 new comment)
- Error saving GGUF of vision model (#1504, commented Apr 27, 2025)
- Error when loading llama3: `AttributeError: module 'transformers.models.bit.modeling_bit' has no attribute 'Linear'` (#2191, commented Apr 27, 2025)
- BackendCompilerFailed: backend='inductor' raised: SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats (#2230, commented Apr 28, 2025)
- [Bug] Issue while evaluating on GRPO (#2367, commented Apr 28, 2025)
- FakeTensor error in GRPO training (#2079, commented Apr 28, 2025)
- [Question] Cannot install specific releases from source? (#2368, commented Apr 28, 2025)
- Unable to load an unsloth-trained model saved to a local directory (#934, commented Apr 28, 2025)
- [Question] Fine-tune Gemma3 OOM (#2366, commented Apr 28, 2025)
- RuntimeError: Unsloth: Quantization failed! You might have to compile llama.cpp yourself, then run this again (#1781, commented Apr 29, 2025)
- Error while importing unsloth in Databricks (#1294, commented Apr 29, 2025)
- Add STT model for fine-tuning (#2394, commented Apr 29, 2025)
- Unsloth currently does not support multi-GPU setups in unsloth-2024.8 (#859, commented Apr 29, 2025)
- [Bug] unsloth_compiled_module_mamba2.py IndentationError: unexpected indent (#2347, commented Apr 29, 2025)
- [BUG] Image features and image tokens do not match on full fine-tuning (#2251, commented Apr 29, 2025)
- Maybe a bug? Cannot retrain fine-tuning output from gemma3-4b (#2304, commented Apr 29, 2025)
- 'unsloth/llava-v1.6-mistral-7b-hf' model inference ValueError: Image features and image tokens do not match: tokens: 1175, features 1176 (#2225, commented Apr 29, 2025)
- Unsloth on Mac (#685, commented Apr 30, 2025)
- CUDA error: out of memory in WSL with 24 GB VRAM while 2/3 was still unused (#1797, commented Apr 30, 2025)
- OOM on WSL, GRPOTrainer RuntimeError: CUDA driver error: out of memory (#1744, commented Apr 30, 2025)
- Added support for Apple Silicon (#1289, commented Apr 28, 2025)
- How to create or get the ollama modelFile of Unsloth tube square fine-tuning model? (#1823, commented Apr 23, 2025)
- Is there a training method for GRPO using Qwen2.5-VL-3B-Instruct? (#2324, commented Apr 23, 2025)
- [BUG] Gemma3 vision throws error: TypeError: 'int' object is not iterable (#2258, commented Apr 23, 2025)
- [RunPod] llama_model_load: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file (#1947, commented Apr 24, 2025)
- Gemma3ForCausalLM_forward() got multiple values for argument 'logits_to_keep' (#2167, commented Apr 24, 2025)
- Improve documentation on loading fine-tuned models with added tokens (#1863, commented Apr 24, 2025)
- [Feature] Qwen 2.5-Omni support? (#2325, commented Apr 24, 2025)
- [FEAT] Add Quantization-Aware Training (QAT) support (#2311, commented Apr 25, 2025)
- [Bug] AttributeError: 'LlamaForCausalLM' object has no attribute 'vllm_engine' (#2384, commented Apr 26, 2025)
- [Bug] Sharp increase in CUDA memory when using a customized trl.trainer? (#2397, commented Apr 26, 2025)
- [Bug] Encountered a RuntimeError: CUDA error: a memory access was encountered during training with Unsloth's GRPOTrainer (#2387, commented Apr 26, 2025)
- [Question] Unexpected warnings during unsloth setup (#2391, commented Apr 26, 2025)
- save_pretrained_merged issue (#2159, commented Apr 26, 2025)
- [Feature] Is it possible to support training microsoft/bitnet-b1.58-2B-4T? (#2390, commented Apr 26, 2025)
- Comprehensive report: 3-day installation struggle on Windows 10/WSL following all official methods (#2395, commented Apr 26, 2025)
- DAPO implementation (#2141, commented Apr 26, 2025)
- [Bug] Error loading model on Colab A100 (#2380, commented Apr 27, 2025)
- [Bug] TinyLlama fine-tune not learning (#2385, commented Apr 27, 2025)
- [Question] After fine-tuning, the model performed well, but after merging into GGUF (without quantization) its performance dropped. Why might this happen? (#2374, commented Apr 27, 2025)
- raise RuntimeError("mmap can only be used with files saved with " (#570, commented Apr 27, 2025)
- [Feature] Can support THUDM/GLM-Z1-9B-0414, thanks (#2398, commented Apr 27, 2025)