optimum-habana
Fixes 'Tokenizer does not have padding token' introduced by #1444 for Llama3.1
#1457
Merged

MohitIntel commented 219 days ago (edited 218 days ago)

New error introduced by #1444 for Llama3.1:

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as pad_token (tokenizer.pad_token = tokenizer.eos_token e.g.) or add a new pad token via tokenizer.add_special_tokens({'pad_token': '[PAD]'})

This patch fixes the above error when running run_lora_clm.py to fine-tune Llama3.1 with the command from the language-modeling README:

```bash
PT_HPU_MAX_COMPOUND_OP_SIZE=10 \
python3 ../gaudi_spawn.py --use_deepspeed  --world_size 8  run_lora_clm.py \
  --model_name_or_path meta-llama/Llama-3.1-70B-Instruct \
  --deepspeed llama2_ds_zero3_config.json \
  --dataset_name tatsu-lab/alpaca \
  --bf16 True \
  --output_dir ./lora_out \
  --num_train_epochs 2 \
  --max_seq_len 2048 \
  --per_device_train_batch_size 10 \
  --per_device_eval_batch_size 1 \
  --gradient_checkpointing \
  --eval_strategy epoch \
  --eval_delay 2 \
  --save_strategy no \
  --learning_rate 0.0018 \
  --warmup_ratio 0.03 \
  --lr_scheduler_type "cosine" \
  --logging_steps 1 \
  --dataset_concatenation \
  --attn_softmax_bf16 True \
  --do_train \
  --do_eval \
  --use_habana \
  --use_lazy_mode \
  --pipelining_fwd_bwd \
  --throughput_warmup_steps 3 \
  --lora_rank 4 \
  --lora_target_modules "q_proj" "v_proj" "k_proj" "o_proj" \
  --validation_split_percentage 4 \
  --use_flash_attention True \
  --flash_attention_causal_mask True \
  --fp8 True
```
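For reference, the guard suggested by the error message looks like the following. This is a minimal sketch, not the actual patch: it assumes a standard transformers AutoTokenizer and gated access to the meta-llama checkpoint, and the PR presumably applies its fix inside run_lora_clm.py rather than in user code.

```python
from transformers import AutoTokenizer

# Llama 3.1 tokenizers ship without a pad token, so any call that pads a
# batch raises the ValueError quoted above.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-70B-Instruct")

# The guard the error message suggests: reuse the EOS token for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Padding now works, e.g. when building fixed-length fine-tuning batches.
batch = tokenizer(["short", "a longer example"], padding=True)
print(batch["input_ids"])
```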
MohitIntel added commit 4d7e619d: fix for ValueError: Asking to pad but the tokenizer does not have a p…
MohitIntel requested a review from ssarkar2 219 days ago
MohitIntel requested a review from bhargaveede 219 days ago
MohitIntel requested a review from vivekgoe 219 days ago
MohitIntel requested a review from mandy-li 219 days ago
MohitIntel requested a review from libinta 219 days ago
MohitIntel requested a review 219 days ago
MohitIntel requested a review from regisss 219 days ago
MohitIntel changed the base branch from main to v1.14-release 219 days ago
MohitIntel requested a review from schoi-habana 219 days ago
MohitIntel commented 219 days ago

@regisss, can we have this fix merged quickly? It is a blocker affecting Llama fine-tuning functionality on Gaudi.

yafshar commented on 2024-10-25 (conversation marked as resolved)
examples/language-modeling/README.md, lines 478-481:

```diff
     --world_size 8 --use_mpi run_lora_clm.py \
     --model_name_or_path meta-llama/Llama-2-7b-hf \
     --dataset_name tatsu-lab/alpaca \
-    --bf16 True \
```
yafshar commented 219 days ago

@MohitIntel, why did you remove this arg here?

MohitIntel commented 218 days ago (edited 218 days ago)

Mistakenly removed from this command; I meant to remove it from another one and will fix it.
After discussion with Libin, both the bf16 and fp8 flags are indeed needed in the same command.

jiminha commented 219 days ago

@MohitIntel I tried the command you posted on v1.14.0 and I don't see any error.
I also checked the Llama 70B model, and the pad id is defined:
https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/main/generation_config.json#L5

Where do you see this error?

yafshar commented 219 days ago (šŸ‘ 1)

@jiminha, try with llama3, not llama2.

jiminha commented 219 days ago

@libinta Please check this ticket to see whether we want to bring this into the point release as well. Fine-tuning is failing for Llama3.1 models because pad_token_id is missing.
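
To illustrate the difference the thread describes, a quick check of the published generation configs (a hedged sketch: it assumes transformers is installed and that you have access to the gated meta-llama repositories):

```python
from transformers import GenerationConfig

# Per the discussion above: Llama-2 checkpoints define pad_token_id in
# generation_config.json, while Llama-3.1 checkpoints leave it unset.
for name in ("meta-llama/Llama-2-70b-hf", "meta-llama/Llama-3.1-70B-Instruct"):
    cfg = GenerationConfig.from_pretrained(name)
    print(name, "pad_token_id:", cfg.pad_token_id)
```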

MohitIntel added commit 0a9b2f6b: Fix after review
MohitIntel added commit aade9ebb: Fixed space instead of tab
jiminha approved these changes on 2024-10-25
jiminha added the run-test label
yafshar approved these changes on 2024-10-28

yafshar commented 216 days ago

LGTM!

regisss approved these changes on 2024-10-29
regisss merged b6021b72 into v1.14-release 215 days ago
regisss deleted the mdeopujari/fix_run_lora_clm_llama branch 215 days ago
