Warmup gaudi backend #3172
clean cuda/rocm code in hpu backend, enable flat_hpu
201dc629
fix TP in pageattn
b7fea6fc
adjust block table in hpu to improve performance
5d365394
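
A minimal sketch of the block-table adjustment idea from the commit above: pad every sequence's block list to a common width so the table keeps a static shape, which HPU graph capture and lazy-mode compilation handle much better than ragged inputs. The helper name and the bucket width are illustrative, not the backend's actual API.

```python
import torch

# Hypothetical helper: pad each sequence's KV-cache block list to the same
# length so the block table tensor has a fixed (static) shape.
def build_block_table(block_lists, max_blocks_per_seq, pad_block=0):
    table = torch.full(
        (len(block_lists), max_blocks_per_seq), pad_block, dtype=torch.int32
    )
    for i, blocks in enumerate(block_lists):
        table[i, : len(blocks)] = torch.tensor(blocks, dtype=torch.int32)
    return table

# Three sequences holding different numbers of KV-cache blocks.
print(build_block_table([[3, 7], [1], [4, 5, 6]], max_blocks_per_seq=4))
```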
enable all the models, not tested yet
a07e7437
use tensor cache in hpu graph to avoid replay issue
6bbe24d9
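
A hypothetical illustration of the tensor-cache pattern referenced above: a captured graph replays against the same tensor storage it was recorded with, so fresh inputs must be copied in place into cached buffers rather than rebound to brand-new tensors the replay would never see. `TensorCache` and `feed` are made-up names for this sketch.

```python
import torch

class TensorCache:
    """Keeps one persistent buffer per input name across graph replays."""

    def __init__(self):
        self._buffers = {}

    def feed(self, name, value):
        buf = self._buffers.get(name)
        if buf is None or buf.shape != value.shape:
            buf = torch.empty_like(value)
            self._buffers[name] = buf
        buf.copy_(value)  # in-place update keeps the captured storage valid
        return buf

cache = TensorCache()
ids = cache.feed("input_ids", torch.tensor([1, 2, 3]))
ids2 = cache.feed("input_ids", torch.tensor([4, 5, 6]))
assert ids.data_ptr() == ids2.data_ptr()  # same storage reused across replays
```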
add moe support, fix qwen/mistral/mixtral crash
5cd1c93c
fix phimoe issue
073f7939
gpt_bigcode can also use pageattn
2cde30de
enable dbrx, remove some unused code
2074d051
Merge branch 'main' into gaudi_backend_pa
d5b78ba1
multi-modality initial PR
f95aa426
adjust warmup and enable vlm
36b6612f
fix incorrect output in qwen2 idefics if hpu graph is used
fdf0733f
remove unused quantization code and enable awq/gptq int4
9914ffe1
fix gptq issue
8d221b7b
enable fp8
69773767
warmup prefill
fd70ad70
add warmup_decode
ba7a131e
warmup decode
7900be5a
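
The warmup commits above suggest running dummy prefill and decode steps over a grid of bucketed shapes so every shape seen at serving time has already been compiled or captured before real traffic arrives. A rough sketch, with purely illustrative bucket values and a stand-in for the real forward pass:

```python
import itertools
import torch

BATCH_BUCKETS = [1, 2, 4, 8]          # assumed bucket grid, not the real config
PREFILL_SEQ_BUCKETS = [128, 256, 512]

def warmup(model_fn):
    # Prefill-shaped calls: one per (batch, seq_len) bucket combination.
    for bs, seq in itertools.product(BATCH_BUCKETS, PREFILL_SEQ_BUCKETS):
        model_fn(torch.zeros((bs, seq), dtype=torch.int64))
    # Decode-shaped calls: one new token per sequence, so seq_len is 1.
    for bs in BATCH_BUCKETS:
        model_fn(torch.zeros((bs, 1), dtype=torch.int64))

warmup(lambda x: x.sum())  # stand-in for the model's forward pass
```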
remove block_tables and prefill_cache_indices which will lead to dyna…
1508ee8d
Merge branch 'main' into gaudi_backend_pa
7914e980
fix comment
787dbe98
missing gptj change...
376e0507
fix some issues

f0e5faec
remove torch.where to fix incorrect output in hpu graph model
c55a8cae
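
For the torch.where removal above, a small self-contained example of equivalent arithmetic masking, which is one way to avoid torch.where in graph-captured code; the backend's actual replacement may differ.

```python
import torch

x = torch.tensor([1.0, -2.0, 3.0])
mask = x > 0

where_out = torch.where(mask, x, torch.zeros_like(x))
masked_out = x * mask.to(x.dtype)  # same result, no torch.where

assert torch.equal(where_out, masked_out)
```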
LLM warmup logic
9d85ac94
multi-modality warmup
705cc0b6
optimize code
a84da5b6
refine logging and fix some issues
85916875
fix warmup issue for mllama
29703dbd
pingpong optimization
cd900c3b
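
Assuming "pingpong" here means double-buffering (my reading of the commit message, not confirmed by the diff), a toy sketch: alternate between two preallocated buffers so staging the next step's inputs does not stall on the buffer the current step is still reading.

```python
import torch

# Two preallocated staging buffers, used alternately ("ping" and "pong").
buffers = [torch.empty(4, dtype=torch.int64), torch.empty(4, dtype=torch.int64)]

def run_steps(batches, step_fn):
    for i, batch in enumerate(batches):
        buf = buffers[i % 2]   # flip between the two buffers each step
        buf.copy_(batch)       # stage inputs into the currently idle buffer
        step_fn(buf)           # compute reads the freshly staged buffer

run_steps([torch.arange(4), torch.arange(4, 8)], lambda b: b.sum())
```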
Merge branch 'main' into gaudi_backend_pa
610dd200
match the latest vllm_extension ops
4cdc34ec
Merge branch 'gaudi_backend_pa' into warmup_gaudi_backend
4de8fb01
work with the latest vllm extension ops
a83e9fe0
remove block_scales which is not needed anymore
76cc1297
improve performance
ba049c9d
Merge branch 'main' into warmup_gaudi_backend
6b21985c
prefill bypass graph
5ec7f15d
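
A sketch of the "prefill bypass graph" dispatch above: prefill shapes vary widely, so they run eagerly, while decode, whose shape is fixed, goes through the captured-graph path. `graph_decode` stands in for whatever HPU-graph wrapper the backend actually uses.

```python
import torch

def forward(input_ids, is_prefill, eager_fn, graph_decode):
    if is_prefill:
        return eager_fn(input_ids)   # bypass the graph: plain eager execution
    return graph_decode(input_ids)   # decode replays the captured graph

out = forward(
    torch.zeros((2, 128), dtype=torch.int64),
    is_prefill=True,
    eager_fn=lambda x: x.sum(),
    graph_decode=lambda x: x.sum(),
)
```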
pingpong optimization issue fix
bf3987e2
Merge branch 'main' into warmup_gaudi_backend
01f17d52
regisss approved these changes on 2025-04-18
Narsil approved these changes on 2025-04-24
Narsil merged 37580294 into main 239 days ago
sywangyi deleted the warmup_gaudi_backend branch 205 days ago