Add diffllama #34083

weak-kajuma
weak-kajuma
qubvel added the New model label
weak-kajuma first adding diffllama
3bd9e34c
weak-kajuma add Diff Attention and other but still with errors
269055e1
weak-kajuma force pushed to 269055e1 1 year ago
weak-kajuma
ArthurZucker commented on 2024-10-15
weak-kajuma complete making attention Diff-Attention
dbbf0730
weak-kajuma fix some bugs which may be caused by transformer-cli while adding model
c4ea9dfc
weak-kajuma fix a bug caused by forgetting KV cache...
e072544a
weak-kajuma
weak-kajuma
bzantium commented on 2024-10-19
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
674d7a23
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
9eac636a
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
0e99dbd4
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
1e445c78
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
cca6a5c2
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
dd167af8
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
23099cb9
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
faac378e
bzantium commented on 2024-10-20
bzantium commented on 2024-10-20
bzantium commented on 2024-10-20
bzantium commented on 2024-10-20
weak-kajuma
weak-kajuma I found Attention missed implemented from paper still on e072544a3bfc…
53e13aa2
weak-kajuma re-implemented
63b018a2
weak-kajuma adding groupnorm
204bec87
weak-kajuma align with transformers code style
bce12e5f
weak-kajuma fix typo
44d8423c
weak-kajuma adding groupnorm
6dc6f81c
weak-kajuma change SdpaAttention to DiffSdpaAttention
48b38e87
weak-kajuma fix bug
997f561d
weak-kajuma
bzantium commented on 2024-10-20
bzantium commented on 2024-10-21
bzantium commented on 2024-10-21
weak-kajuma
weak-kajuma Update src/transformers/models/diffllama/modeling_diffllama.py
107bd3c3
weak-kajuma fix bugs of places of "GroupNorm with scale" and etc
26307d92
bzantium
weak-kajuma
weak-kajuma Revert "fix bugs of places of "GroupNorm with scale" and etc"
22aa1451
bzantium commented on 2024-10-22
bzantium commented on 2024-10-22
bzantium commented on 2024-10-22
weak-kajuma simplify multiple of attention (matmul) operations into one by repeat…
cc472bef
weak-kajuma simplify multiple of attention (matmul) operations into one by repeat…
e834129d
weak-kajuma simplify multiple of attention (matmul) operations into one by repeat…
e9d94e5a
weak-kajuma remove missed type
03529996
bzantium approved these changes on 2024-10-23
ArthurZucker
bzantium
bzantium
weak-kajuma add diffllama model_doc
843178ad
weak-kajuma apply make style/quality
71c8d124
bzantium
ArthurZucker
ArthurZucker
Cyrilvallez commented on 2024-10-30
weak-kajuma apply review comment about model
fea95faf
weak-kajuma apply review comment about test
b3f8dd5c
weak-kajuma place diffllama alphabetically on the src/transformers/__init__.py
50ce3532
weak-kajuma
weak-kajuma fix forgot code
6f253335
weak-kajuma Supports parameters that are not initialized with standard deviation …
dd2282e6
weak-kajuma add DiffLlamaConfig to CONFIG_CLASSES_TO_IGNORE_FOR_DOCSTRING_CHECKPO…
9e7a9c3e
weak-kajuma remove unused property of config
8c98d191
weak-kajuma add to supported model list
cbf217d8
weak-kajuma add to sdpa supported model list
c8739822
weak-kajuma
Cyrilvallez commented on 2024-11-05
Cyrilvallez
Cyrilvallez
weak-kajuma fix copyright, remove pretraining_tensor_parallel, and modify for ini…
b003a535
weak-kajuma remove unused import and etc.
37c7a88e
weak-kajuma empty commit
ba92d5c1
weak-kajuma empty commit
8cc823ee
weak-kajuma empty commit
d47631d6
weak-kajuma
ArthurZucker
ArthurZucker requested a review from ArthurZucker 1 year ago
effortprogrammer
weak-kajuma changed the title from "[WIP] Add diffllama" to "[Request Reviews]Add diffllama" 1 year ago
ArthurZucker commented on 2024-11-20
weak-kajuma changed the title from "[Request Reviews]Add diffllama" to "Add diffllama" 1 year ago
weak-kajuma apply modular transformers but with bugs
c6932de8
Cyrilvallez
effortprogrammer
weak-kajuma revert prev commit
48e16cf3
weak-kajuma create src/transformers/models/diffllama/modular_diffllama.py
a44f95d3
weak-kajuma run utils/modular_model_converter.py
c45aa59b
weak-kajuma empty commit
c5741eb0
weak-kajuma
Cyrilvallez commented on 2024-12-04
Cyrilvallez
weak-kajuma leaner modular diffllama
ea622ce1
weak-kajuma Merge branch 'huggingface:main' into add_diffllama
e30c2984
weak-kajuma remove more and more in modular_diffllama.py
3f85c228
weak-kajuma remove more and more in modular_diffllama.py
87d034da
weak-kajuma
effortprogrammer
ArthurZucker requested a review from Cyrilvallez 1 year ago
Cyrilvallez commented on 2024-12-10
weak-kajuma resolve missing docstring entries
4660c6e3
weak-kajuma force reset
b4ff5f3f
weak-kajuma force pushed to b4ff5f3f 1 year ago
weak-kajuma Merge branch 'huggingface:main' into add_diffllama
484a493f
weak-kajuma convert modular
0ce20233
ArthurZucker commented on 2024-12-23
weak-kajuma
Cyrilvallez approved these changes on 2025-01-07
ArthurZucker
ArthurZucker merged 96bf3d6c into main 1 year ago
ArthurZucker
