jax
816fab4e - [jax:benchmark] Add tracing benchmarks for some common operations.

Commit
164 days ago
[jax:benchmark] Add tracing benchmarks for some common operations.

- For now, `jnp.dot`, `jnp.concat`, `*`.
- We also include a simple test to sense check that tracing time scales linearly with number of equations (by no means a guarantee).
- Added no_cache variants of the benchmarks.

```
name                                                    cpu/op
test_pallas_mqa_splash_attention_trace                  30.73m ± ∞ ¹
test_pallas_mqa_splash_attention_trace_no_cache_clear   32.27µ ± ∞ ¹
test_pallas_mqa_splash_attention_lower                  36.84m ± ∞ ¹
test_pallas_mqa_splash_attention_lower_no_cache_clear   35.62µ ± ∞ ¹
test_jnp_dot_trace                                      10.04m ± ∞ ¹
test_jnp_dot_trace_no_cache_clear                       140.6µ ± ∞ ¹
test_jnp_concat_trace                                   16.41m ± ∞ ¹
test_jnp_concat_trace_no_cache_clear                    205.5µ ± ∞ ¹
test_num_multiply_eqns_trace/1                          16.02m ± ∞ ¹
test_num_multiply_eqns_trace/128                        28.06m ± ∞ ¹
test_num_multiply_eqns_trace/256                        39.44m ± ∞ ¹
test_num_multiply_eqns_trace/384                        51.62m ± ∞ ¹
test_num_multiply_eqns_trace/512                        62.91m ± ∞ ¹
test_num_multiply_eqns_trace/640                        75.83m ± ∞ ¹
test_num_multiply_eqns_trace/768                        90.09m ± ∞ ¹
test_num_multiply_eqns_trace/896                        96.68m ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/1           121.2µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/128         123.6µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/256         127.0µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/384         130.9µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/512         134.6µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/640         137.5µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/768         143.4µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/896         143.9µ ± ∞ ¹
geomean                                                 2.024m
¹ need >= 6 samples for confidence interval at level 0.95

name                                                    time/op
test_pallas_mqa_splash_attention_trace                  30.78m ± ∞ ¹
test_pallas_mqa_splash_attention_trace_no_cache_clear   32.30µ ± ∞ ¹
test_pallas_mqa_splash_attention_lower                  37.00m ± ∞ ¹
test_pallas_mqa_splash_attention_lower_no_cache_clear   35.66µ ± ∞ ¹
test_jnp_dot_trace                                      19.12m ± ∞ ¹
test_jnp_dot_trace_no_cache_clear                       140.7µ ± ∞ ¹
test_jnp_concat_trace                                   21.84m ± ∞ ¹
test_jnp_concat_trace_no_cache_clear                    205.6µ ± ∞ ¹
test_num_multiply_eqns_trace/1                          24.69m ± ∞ ¹
test_num_multiply_eqns_trace/128                        36.81m ± ∞ ¹
test_num_multiply_eqns_trace/256                        48.04m ± ∞ ¹
test_num_multiply_eqns_trace/384                        60.04m ± ∞ ¹
test_num_multiply_eqns_trace/512                        72.00m ± ∞ ¹
test_num_multiply_eqns_trace/640                        84.88m ± ∞ ¹
test_num_multiply_eqns_trace/768                        98.77m ± ∞ ¹
test_num_multiply_eqns_trace/896                        105.4m ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/1           121.2µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/128         123.7µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/256         127.1µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/384         131.0µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/512         134.8µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/640         137.5µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/768         143.5µ ± ∞ ¹
test_num_multiply_eqns_trace_no_cache_clear/896         144.1µ ± ∞ ¹
geomean                                                 2.239m
¹ need >= 6 samples for confidence interval at level 0.95
```

PiperOrigin-RevId: 783855325
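The linear-scaling sense check mentioned in the commit message can be illustrated offline with a least-squares fit over the reported `test_num_multiply_eqns_trace` time/op figures. This is a minimal sketch and not the check the benchmark itself performs: the helper `fit_line` is hypothetical, the `m` suffix is read as milliseconds (an assumption), and each data point is a single sample, so the fit is indicative at best.

```python
# Hypothetical offline check: fit time/op against the number of multiply
# equations using the figures reported in the commit message above.

# (number of equations, time/op) pairs for test_num_multiply_eqns_trace.
# The "m" suffix is assumed to mean milliseconds.
SAMPLES = [
    (1, 24.69), (128, 36.81), (256, 48.04), (384, 60.04),
    (512, 72.00), (640, 84.88), (768, 98.77), (896, 105.4),
]


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared correlation coefficient.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    slope, intercept, r_squared = fit_line(SAMPLES)
    print(f"per-equation cost: {slope:.4f} (assumed ms)")
    print(f"fixed overhead:    {intercept:.2f} (assumed ms)")
    print(f"R^2:               {r_squared:.4f}")
```

An R² close to 1 is consistent with roughly linear growth in tracing time, and the intercept approximates the fixed per-trace overhead. The `_no_cache_clear` variants grow far more slowly with equation count, which fits the expectation that a cache hit skips most of the tracing work.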