🚀🚀🚀 Transformers.js V3 🚀🚀🚀 #545
Merged

xenova merged 498 commits into main from v3

xenova · 1 year ago (edited 210 days ago)

In preparation for Transformers.js v3, I'm compiling a list of issues/features which will be fixed/included in the release.

  • WebGPU support (upgrade onnxruntime-web to 1.17.0).
  • Fix WASM backend for large models (onnxruntime-web → 1.17.0). Closes:
  • Deno support (upgrade sharp.js to 0.33.x). Closes:
  • CommonJS compatibility. Closes #152
  • Skip the local model check when running in-browser, unless explicitly set by the user. This is an issue experienced by many beginners: requests made to localhost redirect to an error page, but the dev server incorrectly returns status code 200. Closes
  • Improve unit test suite and allow local testing. Closes #491
  • Upgrade conversion script dependency versions (+fixes sentence-transformers conversions). Closes
  • Versioned documentation, so that users still on v2 will be able to access the correct documentation.
  • Consistency issues (see the sketch after this list):
    • topk -> top_k parameter.
    • Tensor transpose -> permute
  • Improve pipeline fallback errors
  • WebNN support
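
A minimal before/after sketch of the renames above (illustrative only; `Xenova/gpt2` is just an example model):

import { pipeline } from '@huggingface/transformers';

const generator = await pipeline('text-generation', 'Xenova/gpt2');

// v2: await generator('Hello', { topk: 5 });
const output = await generator('Hello', { top_k: 5 }); // v3 parameter name

// v2: someTensor.transpose(0, 2, 1);
// v3: someTensor.permute(0, 2, 1);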

Useful commands:

  1. Pack
    npm pack
    
  2. Publish dry-run
    npm publish --dry-run
    
  3. Publish dry-run w/ tag
    npm publish --dry-run --tag dev
    
  4. Bump alpha version
    npm version prerelease --preid=alpha -m "[version] Update to %s"

How to use WebGPU

First, install the development version:

npm install @huggingface/transformers

Then specify the device parameter when loading the model. Here's example code to get started. Please note that this is still a WORK IN PROGRESS, so the following usage may change before release.

import { pipeline } from '@huggingface/transformers';

// Create feature extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
    device: 'webgpu',
    dtype: 'fp32', // or 'fp16'
});

// Generate embeddings
const sentences = ['That is a happy person', 'That is a very happy person'];
const output = await extractor(sentences, { pooling: 'mean', normalize: true });
console.log(output.tolist());
HuggingFaceDocBuilderDev · 1 year ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

xenova marked this pull request as draft · 1 year ago
Huguet57 · 1 year ago

Hey! This is great. Is this already in alpha?

kishorekaruppusamy · 1 year ago

Team, is there any tentative time to release this v3 alpha?

jhpassion0621 · 1 year ago

I can't wait anymore :) Please update me when it will be released!

jhpassion0621 · 1 year ago (edited 1 year ago)

@xenova Can I test v3-alpha by using NPM? When I try to run, I get this issue.
[Screenshot: 2024-02-14 at 6:25:31 PM]

kishorekaruppusamy · 1 year ago

> @xenova Can I test v3-alpha by using NPM? When I try to run, I get this issue. [Screenshot: 2024-02-14 at 6:25:31 PM]

Use this commit to resolve the issue: https://github.com/kishorekaruppusamy/transformers.js/commit/7af8ef1e5c37f3052ed3a8e38938595702836f09

jhpassion0621 · 1 year ago

Thanks for your reply @kishorekaruppusamy. I tried with your branch and ran into other issues.
[Screenshot: 2024-02-15 at 3:59:28 PM]
Please give me your advice!

kishorekaruppusamy · 1 year ago (edited 1 year ago)

> Thanks for your reply @kishorekaruppusamy. I tried with your branch and ran into other issues. [Screenshot: 2024-02-15 at 3:59:28 PM] Please give me your advice!

Change this URL to the local dist directory inside your build: https://github.com/kishorekaruppusamy/transformers.js/blob/V3_BRANCH_WEBGPU_BUG_FIX/src/backends/onnx.js#L144
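
For anyone hitting the same problem, a minimal sketch of such an override, assuming the library's `env` object exposes onnxruntime-web's `wasmPaths` setting (the path is an example; adjust it to your build):

import { env } from '@huggingface/transformers';

// Serve the ONNX Runtime .wasm files from your local build output
// instead of the default remote URL.
env.backends.onnx.wasm.wasmPaths = '/dist/';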

felladrin commented on 2024-03-12
examples/webgpu-embedding-benchmark/main.js
// Proxy the WASM backend to prevent the UI from freezing
felladrin · 1 year ago

Suggested change (remove this line):
// Proxy the WASM backend to prevent the UI from freezing

Since `env.backends.onnx.wasm.proxy = true;` has been moved away from here (to src/backends/onnx.js), this comment line can also be removed.

xenova · 1 year ago

Good catch - thanks!
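
For context, the setting referred to above can still be enabled by users themselves; a minimal sketch (the default now lives in src/backends/onnx.js):

import { env } from '@huggingface/transformers';

// Proxy the WASM backend to a web worker so inference
// does not block (and freeze) the main UI thread.
env.backends.onnx.wasm.proxy = true;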

beaufortfrancois commented on 2024-03-14
src/utils/devices.js
/**
 * @typedef {'cpu'|'gpu'|'wasm'|'webgpu'|null} DeviceType
beaufortfrancois · 1 year ago

Out of curiosity, what is 'gpu'?

xenova · 1 year ago

It's meant to be a "catch-all" for the different ways the library can be used with GPU support (not just in the browser with WebGPU). The idea is that this simplifies documentation, as transformers.js will select the best execution provider for the environment. For example, DML/CUDA support in onnxruntime-node (see microsoft/onnxruntime#16050 (comment)).

Of course, this is still a work in progress, so it can definitely change!
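
As a rough illustration of that catch-all (behavior may still change, as noted; the model is just an example):

import { pipeline } from '@huggingface/transformers';

// 'gpu' lets the library pick the best GPU execution provider for the
// current environment (e.g. WebGPU in the browser, DML/CUDA in Node.js).
const classifier = await pipeline(
    'text-classification',
    'Xenova/distilbert-base-uncased-finetuned-sst-2-english',
    { device: 'gpu' },
);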

xenova Early dereferencing for performance boosts
0dba2661
xenova cleanup
5e4e20fb
xenova Move quantization logic to `quantize.py`
dd6af93f
xenova update deps
04af3d57
xenova Fix q4 quantization
91286517
xenova save q4 quantization
83cbb218
xenova Add decode ASR test
eb613441
xenova Do not process last chunk unnecessarily
cec24005
xenova fp16 disable_shape_infer if model is too large
c835b543
xenova Use `check_and_save_model` for saving fp16 model
45cd8d4d
xenova Reorder functions
88f3e441
xenova formatting
23440f00
xenova Remove debug log
b411e9fd
xenova Fix q8 quantization for models > 2GB
04a334a5
xenova correct attribute
cd1ea697
xenova Fix `TextGenerationPipeline`
a167f6e2
xenova Fix pauses in whisper word-level timestamps
ea732896
xenova Formatting
344af32a
xenova Sort added tokens by length to avoid early partial matches
c305c382
xenova Add new tokenizer test
d6f6fd47
xenova Only finish with newline if running in Node.js
1557b8d0
xenova Consider token timestamps when selecting longest common sequence
9ac7ceb4
xenova Create whisper word-level timestamps demo
79ed46ed
xenova cleanup
8da68866
xenova Fallback to WASM if WebGPU not supported
d709bd07
xenova Reload model for each quantization mode
9ef3a6d0
xenova Update conversion script requirements
9787b75a
xenova Separate IO and Quantization args
974f0862
xenova Use `const` where possible
d0428688
xenova Add `InterruptableStoppingCriteria`
1b4d2428
xenova `@xenova/transformers` -> `@huggingface/transformers`
31101c82
xenova Override semver version
e84322b5
xenova Add support for pyannote models
bd943340
xenova Update README.md
3dbc633b
xenova Add listed support for pyannote
858e55d1
xenova Add pyannote example code
8bf03494
xenova Support specifying `min_num_frames`
c52618cf
xenova Support simultaneous instantiation of multiple inference sessions
96f19b06
xenova Support broadcasting encoder outputs over decoder inputs
4ad43e21
xenova Fix test
c6aeb4be
fs-eire fix bundler config for latest ORT
6d3ea4bc
xenova Only check fp16 support for webgpu device
38a3bf6d
xenova Remove default chat templates
9df84c43
xenova Add support for gemma2
fc3d860f
xenova Add gemma2 generation test
939920d2
xenova Update gemma2 config mapping
5bb93a06
xenova Prioritize high-performance adapter when possible
72ec168f
xenova Set defaults for `tools` and `documents` in `apply_chat_template`
9068a531
xenova bump `@huggingface/jinja` -> 0.3.0
824538bc
xenova Add `apply_chat_template` default parameters unit test
836c0afe
xenova Merge branch 'v3' into @huggingface/transformers
487d8b20
xenova Add prettier
1f6e0e16
xenova prettier format config files
55494d18
xenova remove incorrect comment
5a68461b
xenova Merge branch 'pr/864' into @huggingface/transformers
437cb34e
xenova Update onnxruntime-web version
5a6c9267
xenova Update webpack.config.js
b19251b8
xenova Fix copy path
820c1e26
xenova Run `npm ci`
b0dab917
xenova Fix bundling
86b9b621
xenova Do not set `preferredOutputLocation` if we are proxying
222b94ed
xenova Merge branch 'v3' into @huggingface/transformers
b326cc94
xenova Update `@webgpu/types`
ca67092f
xenova Update SAM example
42076fda
xenova Use `??=` operator where possible
48d31424
xenova Fix commonjs usage
3b1a4fd9
xenova Mark `onnxruntime-node` and `sharp` as externals
9a73b5ed
xenova Move `externals` into config
9951aa5d
xenova Downgrade to onnxruntime 1.18.0
c04d37e6
xenova Finalize module/commonjs build
d32fe2bc
xenova Separate web and node builds
1530d509
xenova [version] Update to 3.0.0-alpha.1
b4df0e25
xenova Default to CDN-hosted .wasm files
ab59c516
xenova [version] Update to 3.0.0-alpha.2
866b2198
xenova bump versions
4a3398d1
xenova [version] Update to 3.0.0-alpha.3
8891a142
xenova Merge branch 'improve-conversion-script' into v3
a315933b
xenova Consolidate conversion and quantization script
12569b8f
xenova Downgrade `onnxconverter-common`
83f57181
xenova Link to types in exports
6fa5fa6c
xenova Update list of supported tasks
2f1b2105
xenova Fixed unit tests
27bc55d7
xenova Update imports
23d11500
xenova Bump versions to `3.0.0-alpha.4`
f9070dca
xenova [version] Update to 3.0.0-alpha.4
c3494e1b
xenova Fix "Default condition should be last one"
973fb0dc
xenova Bump versions
7376ecf9
xenova [version] Update to 3.0.0-alpha.5
0a04bc07
xenova Update next.js client-side demo
e4603cd9
ibelem Initial WebNN Support
ff1853ce
xenova Mark fs, path and url as external packages for node build
15574bcf
xenova Move content type map outside of `FileResponse` object
72828625
xenova Add GPU support for Node.js
22f7cede
xenova Bump versions
1e319a4c
xenova [version] Update to 3.0.0-alpha.6
d278891f
ibelem Fix conflicts
3fefa17a
xenova bump dependency versions
fa6cc70f
xenova Add support for device auto-detection
7fa53265
xenova Fix default device selection
4ec77c1a
xenova Merge branch 'pr/ibelem/890-1' into v3
5799e304
xenova Improve WebNN selection
5b2cac21
xenova Skip token callback if `skip_prompt` is set
ad23c50c
xenova Bump versions
5b84b62a
xenova [version] Update to 3.0.0-alpha.7
bcf6a86f
xenova bump versions
b97ed0d8
xenova [version] Update to 3.0.0-alpha.8
c5b70838
xenova bump versions
cbeefded
xenova [version] Update to 3.0.0-alpha.9
59600f24
xenova Add support for Sapiens
b2e025a0
xenova Update default ONNX env
8661d951
xenova Fix types
57db34db
xenova Topologically sort fp16 nodes
1b7f9789
xenova Add marian unit test
45d1526e
xenova Re-order imports
b903757c
xenova Fix `NoBadWordsLogitsProcessor`
633976f7
xenova Update package.json
24d8787e
xenova [jest] Disable coverage
9412ec46
xenova Bump versions
08e73881
xenova [version] Update to 3.0.0-alpha.10
d5a8f87a
xenova Improve node/web interoperability
7843ad07
xenova Fix scripts/requirements.txt
bf093aec
xenova Bump versions
9a5ee429
xenova [version] Update to 3.0.0-alpha.11
535cdfe5
xenova Add support for JAIS models (#906)
4e1acf04
xenova Add JAIS to README
488548d0
xenova Fix node/web interop (again)
13aed411
xenova Bump versions
7655f81c
xenova [version] Update to 3.0.0-alpha.12
1c7e2267
xenova Set `SapiensForNormalEstimation` to encoder-only
ab6b28b6
xenova Implement `sub` tensor operation
66c05d56
xenova Bump versions
31e8b2ae
xenova [version] Update to 3.0.0-alpha.13
bf3f7d5f
xenova Improve typing for `wrap` helper function
c0253561
xenova Update `preferredOutputLocation` type
7ebdaf21
xenova Make `wrap` type more generic
3b8ddcbc
xenova Re-use `segmentation_data`
a385c6e4
xenova Fix `min` type
537e9586
xenova Add support for Hiera models
bcb28b34
xenova Fix reused loop variable (closes #910)
d21c87cd
xenova Add logits processor test file
1d281f63
xenova Fix test imports
ba0427f4
xenova Bump versions
3bc3e86c
xenova [version] Update to 3.0.0-alpha.14
0518960d
xenova Add another `bad_words` logits processor test (closes #913)
552cdea6
xenova Add support for GroupViT
3422a8bc
xenova Add zero-shot-image-classification unit test
3599902a
xenova Add maskformer model definitions
5892ee81
xenova Support universal image segmentation in `image-segmentation` pipeline
c4dac775
xenova Add support for PVT models
f0c47bed
xenova Add `post_process_instance_segmentation` function template
d80d3a4c
xenova Add `library_name` option to convert.py
844099df
xenova Wrap onnxslim with try block
ba5d7252
xenova Use const where possible
b3691c81
xenova Use const where possible (again)
dcf117f2
xenova Create `MaskFormerFeatureExtractor`
9af026c5
xenova Add support for MaskFormer
0f8200c5
xenova Improve tool-use chat template detection
e278c5e9
xenova Add object detection pipeline unit test
83fa58f0
xenova Add support for ViTMSN and VitMAE
86d6da46
xenova Bump ORT versions
93b25fb2
xenova Create `get_chat_template` helper function
2f680ee7
xenova Fix CI
2f9b2ed9
xenova Run prettier on `tests/**`
deec3504
xenova move certain tests to utils subfolder
48fa226e
xenova marked this pull request as ready for review · 248 days ago
xenova Bump onnxruntime-web version
a10828f4
xenova Bump `onnxruntime==1.19.2` in scripts/requirements.txt
ba58ea24
xenova Merge branch 'main' into v3
4f17e954
xenova Merge branch 'main' into v3
c40a1512
xenova Sort `this.added_tokens` before creating regex (`.toSorted` is not av…
30315b21
xenova Rather make a copy of `this.added_tokens`
d7df5758
xenova Fix `.tokenize` with `fuse_unk=true`
a519379b
xenova Add blenderbot tokenizer tests
89ddccf5
xenova Add t5 tokenizer tests
36ad144b
xenova Add falcon tokenizer tests
4765dd63
xenova Run prettier
fd8b9a25
xenova Add ESM tokenizer tests
710816ef
xenova Run unit tests in parallel
0d3cd309
xenova Fix `fuse_unk` for tokenizers with `byte_fallback=true` but no byte f…
cc258c23
xenova Add llama tokenizer unit tests
4798755c
xenova Update emoji test string names
c6c3ae18
xenova Move whisper-specific unit tests to subfolder
79a74095
xenova Code formatting
1a388048
xenova Bump versions
dabe6ae3
xenova [version] Update to 3.0.0-alpha.15
54f1f214
xenova Add emoji tokenizer test cases for LlamaTokenizer
a912d796
xenova Attempt to fix encoder-decoder memory leak
969d10e1
xenova Remove unused code
072cbbce
xenova Fix BertNormalizer (strip `Mn` unicode characters)
14b4bd4a
xenova Handle ZERO WIDTH JOINER (U+200D) characters
67977718
xenova Add more spm normalization characters
f148afd6
xenova Add emoji unit tests for bert/t5
ca4b5b98
xenova [WebNN] Add support for specifying `free_dimension_overrides` in config
113c81ea
xenova Log warning if webnn is selected but `free_dimension_overrides` is not…
9005accf
xenova Fix unigram for multi-byte tokens
682c7d05
xenova Add gemma tokenizer tests
4a31e549
xenova Allow user to specify device and dtype in config.json
7a160655
xenova Update dependency versions
4c1d21ba
xenova Bump versions
3c6a95a0
xenova [version] Update to 3.0.0-alpha.16
ac391d24
xenova Add CLIP tokenizer unit tests
d30d3b7a
xenova Add more tokenizer tests
e089ef4c
xenova Bump onnxruntime-web version
2c9e271f
xenova Bump versions
ee1e32a2
xenova [version] Update to 3.0.0-alpha.17
f41e995b
xenova Add support for new `tokenizers>=0.2.0` BPE serialization format
9a42cf32
xenova Bump onnxruntime-web version
f534b352
xenova Bump versions
0c8b1af1
xenova [version] Update to 3.0.0-alpha.18
2ca41780
xenova Keep encoder outputs on GPU
a82e7ef0
xenova Update whisper-webgpu demo dependencies
c37a38cd
xenova Bump versions
e1c4fc69
xenova [version] Update to 3.0.0-alpha.19
fe51609a
kallebysantos Support loading ONNX APIs based on JS runtime (#947)
b5188664
xenova Allow specification of `use_external_data_format` in custom config
95c8cc55
xenova Update deberta unit tests
03eb77bf
xenova Update roberta tokenizer tests
c61a76ba
xenova Support inferring unigram tokenizer type
32d8df40
xenova Reuse tokenizer tests for original t5-small
6505abb1
xenova Remove redundant null coalesce
96192182
xenova Enable unit test coverage reports
52c4ce70
xenova Use `PROBLEMATIC_REGEX_MAP` for bloom tokenizer
12edaf08
xenova Improve tokenizer unit tests
5e7e82b9
xenova Update tokenizer unit tests
795a61a3
xenova Remove unused code
77ebe0de
xenova Add m2m_100 tokenizer unit tests
56eda3bd
xenova Add m2m translation pipeline unit test
2040ad5d
xenova Add support for Depth Pro models
8718c176
xenova Add whisper turbo alignment heads
a32efa3d
xenova Remove in-library list of supported models
8b0d330a
xenova Bump versions
cf3f5c34
xenova [version] Update to 3.0.0-alpha.20
86fe1753
BritishWerewolf Add function to map tensor data array.
1c78278b
xenova Merge branch 'main' into v3
a5e02100
BritishWerewolf Optimise loop to reduce calls to `this`
9f8fac09
xenova Merge branch 'pr/966' into v3
1c43e3f8
xenova Add back tensor map test
7a0f77c1
xenova Add support for granite models
da03a0a4
xenova Allow multiple optional configs to be passed (+ reduce code duplication)
37effa36
xenova Bump dependencies
f21b36e2
xenova Bump versions
d26a6633
xenova [version] Update to 3.0.0-alpha.21
c337c3bb
xenova Add support for per-dtype `kv_cache_dtype`
92d0dc69
xenova Add text streamer unit test
ea03bf54
xenova Bump ORT web version
27a033f6
xenova Bump versions
19277eaf
xenova [version] Update to 3.0.0-alpha.22
90a74905
xenova Update repo name to `@huggingface/transformers.js`
38773eab
xenova changed the title from [WIP] 🚀🚀🚀 Transformers.js V3 🚀🚀🚀 to 🚀🚀🚀 Transformers.js V3 🚀🚀🚀 · 210 days ago
xenova Update tested node versions
832b5b74
xenova Bump versions
b871c087
xenova [version] Update to 3.0.0
7a58d6e1
xenova merged 7ebd50ce into main · 210 days ago
