[Question] LLaVA 1.5 7B model fine-tune -- pydantic #1765

Closed
yiwei-chenn opened this issue Nov 13, 2024 · 1 comment

yiwei-chenn commented Nov 13, 2024

Question

When I fine-tune the LLaVA 1.5 7B model with my own pre-trained MLP adapter, I use finetune_lora.sh like this:

deepspeed llava/train/train_mem.py \
    --lora_enable True --lora_r 128 --lora_alpha 256 --mm_projector_lr 2e-5 \
    --deepspeed ./scripts/zero3.json \
    --model_name_or_path lmsys/vicuna-7b-v1.5 \
    --version v1 \
    --data_path ./my_own.json \
    --image_folder ./llava-finetune \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --pretrain_mm_mlp_adapter ./checkpoints/llava-v1.5-7b-pretrain/mm_projector.bin \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length True \
    --bf16 True \
    --output_dir ./checkpoints/llava-v1.5-7b-lora \
    --num_train_epochs 1 \
    --per_device_train_batch_size 16 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 50000 \
    --save_total_limit 1 \
    --learning_rate 2e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to wandb

And I face the following problem:

[screenshot of the traceback]

This problem was caused by:

pydantic_core._pydantic_core.ValidationError: 1 validation error for DeepSpeedZeroConfig
stage3_prefetch_bucket_size
  Input should be a valid integer, got a number with a fractional part [type=int_from_float, input_value=15099494.4, input_type=float]

I used the same procedure to LoRA-fine-tune the LLaVA 1.5 13B version, and it did not hit the same problem.

Does anyone know how to solve that?

yiwei-chenn changed the title from "[Question] LLaVA 1.5 7B model fine-tune" to "[Question] LLaVA 1.5 7B model fine-tune -- pydantic" on Nov 13, 2024
yiwei-chenn (Author) commented

After checking deepspeedai/DeepSpeed#6525, the problem seems to be caused by a version mismatch between transformers and deepspeed.
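For context, and purely as my reading of the transformers DeepSpeed integration (so treat the exact formula as an assumption): when stage3_prefetch_bucket_size is left on "auto" in zero3.json, transformers appears to fill it with 0.9 * hidden_size * hidden_size, which is fractional for the 7B model but a whole number for the 13B model. A quick check in Python:

# Assumed heuristic behind the "auto" value (it matches the number in the traceback):
# stage3_prefetch_bucket_size = 0.9 * hidden_size ** 2
for name, hidden_size in [("7B", 4096), ("13B", 5120)]:
    bucket = 0.9 * hidden_size * hidden_size
    print(name, bucket, bucket.is_integer())
# 7B   15099494.4  False  -> fractional, rejected by the pydantic-validated DeepSpeedZeroConfig
# 13B  23592960.0  True   -> whole number, accepted, which is presumably why the 13B run is fine

Older deepspeed releases seem to accept (and truncate) the fractional value, while newer ones validate the field strictly and reject it.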

I solved the problem by downgrading deepspeed:

pip install deepspeed==0.14.5
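If you'd rather not downgrade, an alternative that should also work (untested sketch; the config path and hidden size are taken from the launch script above) is to replace the "auto" value in scripts/zero3.json with an explicit integer, e.g.:

# Untested sketch: pin stage3_prefetch_bucket_size to an integer so the
# pydantic-validated DeepSpeedZeroConfig never sees a fractional value.
# Assumes zero3.json already contains a "zero_optimization" section.
import json

cfg_path = "./scripts/zero3.json"   # config passed via --deepspeed above
hidden_size = 4096                  # Vicuna-7B hidden size

with open(cfg_path) as f:
    cfg = json.load(f)

# Same value the "auto" heuristic would compute, truncated to an int.
cfg["zero_optimization"]["stage3_prefetch_bucket_size"] = int(0.9 * hidden_size * hidden_size)

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)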
