DeepSpeed
[ROCm] Relax tolerances for FP8 unit test for fp16 and bf16 cases #7655
Merged
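The change described in the title follows a common testing pattern: comparison tolerances are chosen per dtype, with looser rtol/atol for the lower-precision fp16 and bf16 cases. The sketch below is not the DeepSpeed FP8 test touched by this PR; the test body, the stand-in round-trip operation, and the tolerance values are illustrative assumptions only.

```python
# A minimal, hypothetical sketch of per-dtype tolerance relaxation.
# NOT the DeepSpeed FP8 test changed in #7655; names and values are
# illustrative assumptions only.
import pytest
import torch

# Assumed tolerances; the values actually used in the PR may differ.
TOLERANCES = {
    torch.float16:  dict(rtol=1e-3, atol=1e-3),
    torch.bfloat16: dict(rtol=1e-2, atol=1e-2),  # bf16 stores only 7 mantissa bits
}

@pytest.mark.parametrize("dtype", [torch.float16, torch.bfloat16])
def test_low_precision_roundtrip(dtype):
    x = torch.randn(4096)
    # Stand-in for the operation under test: casting to the low-precision
    # dtype and back injects rounding error proportional to the dtype's
    # mantissa width, so the allowed error must depend on the dtype.
    y = x.to(dtype).to(torch.float32)
    assert torch.allclose(x, y, **TOLERANCES[dtype])
```

FP8 formats have even fewer mantissa bits, so a test that checks FP8-quantized results against an fp16 or bf16 baseline generally needs looser tolerances than the fp32 case; the ROCm-specific relaxation in this PR follows the same idea.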

rraminen requested a review from tjruwase 61 days ago
rraminen requested a review from loadams 61 days ago
rraminen requested a review from tohtana 61 days ago
jithunnair-amd commented on 2025-10-30
rraminen marked this pull request as draft 58 days ago
rraminen marked this pull request as ready for review 48 days ago
rraminen marked this pull request as draft 44 days ago
rraminen marked this pull request as ready for review 34 days ago
80e6f533  rraminen: Relax tolerance
160730e4  stas00: ALST/UlyssesSP: more intuitive API wrt variable seqlen (#7656)
b01045b7  rraminen: Fix misplaced overflow handling return in fused_optimizer.py (#7645)
a21a0747  therealnaveenkamal: [bug]: fixed comm_dtype in extra_large_param_to_reduce (#7660)
dd2e1474  stas00: UlyssesSP: TiledMLP doc - recomputes forward twice (#7664)
56ca87be  therealnaveenkamal: resolved a 0-dim tensor slicing bug from _get_state_without_padding (…
b3fa61f2  kunheek: Fix typo in pytorch-profiler.md documentation (#7652)
014ee5fc  sfc-gh-truwase: README refresh (#7668)
d324f97d  loadams: Update version.txt after release (#7675)
2d8d5238  stas00: [modal ci] fixes (#7676)
97301535  stas00: leaf modules: explain better (#7674)
24990451  stas00: disable nv-lightning-v100.yml cI (#7681)
08c6d1de  delock: allow seperate learning rate "muon_lr" and "adam_lr" for muon optimiz…
a0fde72a  stas00: see_mem_usage: make always work (#7688)
0387a0a1  stas00: make debug utils more resilient (#7690)
c69ab198  stas00: stage 1-2: don't pin memory if not configured (#7689)
7a94820a  stas00: modal ci: fix group concurrency (#7691)
f00b3887  Emrys-Merlin: Use pytorch utils to detect ninja (#7687)
767fe524  loadams: Update SECURITY.md to point to GitHub reporting rather than Microsoft…
e96064e3  delock: Add Qwen2.5 to AutoTP model list (#7696)
2972ef84  tohtana: Trust intel server for XPU tests (#7698)
a7ea3f6c  tohtana: PyTorch-compatible backward API (#7665)
rraminen force pushed from 1089fa4f to a7ea3f6c 27 days ago
rraminen requested a review from jomayeri 27 days ago
ae3ae053  sfc-gh-truwase: Merge branch 'master' into relax_tol_testFP8_ROCm
sfc-gh-truwase commented on 2025-12-02
72d4e997  sfc-gh-truwase: Apply suggestion from @sfc-gh-truwase
sfc-gh-truwase approved these changes on 2025-12-02
sfc-gh-truwase merged 28fbb808 into master 25 days ago
