Fix equation in MatMulNBits op spec (#22253)
### Description
This PR fixes an equation in the MatMulNBits op spec. The old formula is
stated as
```
[CeilDiv((N * n_blocks_per_col + 1) * bits, 8)]
```
but it should be stated as
```
[N * CeilDiv(n_blocks_per_col * bits, 8)]
```
or as
```
[N * FloorDiv((n_blocks_per_col + 1) * bits, 8)]
```
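As a quick sanity check, the two proposed forms can be compared numerically. The sketch below is not code from the op spec: `ceil_div` and `floor_div` are hypothetical helpers that assume `CeilDiv` and `FloorDiv` mean integer division rounded up and down, and `N` is left out because it multiplies both forms equally. Under those assumptions the two forms agree for `bits = 4`; for other bit widths they can differ (for example `bits = 2` with `n_blocks_per_col = 1`).

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up (assumed meaning of CeilDiv)."""
    return -(-a // b)


def floor_div(a: int, b: int) -> int:
    """Integer division rounded down (assumed meaning of FloorDiv)."""
    return a // b


def per_col_ceil(n_blocks_per_col: int, bits: int) -> int:
    # Per-column size from the CeilDiv form of the new equation.
    return ceil_div(n_blocks_per_col * bits, 8)


def per_col_floor(n_blocks_per_col: int, bits: int) -> int:
    # Per-column size from the FloorDiv form of the new equation.
    return floor_div((n_blocks_per_col + 1) * bits, 8)


# Compare both forms over a range of block counts for 4-bit quantization.
agree_4bit = all(
    per_col_ceil(n, 4) == per_col_floor(n, 4) for n in range(1, 1025)
)
print("forms agree for bits = 4:", agree_4bit)
```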
### Motivation and Context
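For models such as ChatGLM where the column size is odd, the division
math can be off, so the computed size does not always match the actual
`zero_points` size.

For example, with the old equation, evaluated per column as
`N * CeilDiv((n_blocks_per_col + 1) * bits, 8)` with `bits = 4`, the
projections are calculated as follows. The down projection matches its
`zero_points` size, but the up projection comes out 13,696 too large.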
```
# Down projection
B = 4,096 x 107 x 64
zero_points = 221,184
N = 4,096
n_blocks_per_col = 107
4,096 * CeilDiv((107 + 1) * 4, 8) = 4,096 * CeilDiv(108 * 4, 8) = 4,096 * 54 = 221,184
# Up projection
B = 13,696 x 32 x 64
zero_points = 219,136
N = 13,696
n_blocks_per_col = 32
13,696 * CeilDiv((32 + 1) * 4, 8) = 13,696 * CeilDiv(33 * 4, 8) = 13,696 * 17 = 232,832
```
With the new equation, the projections are calculated as follows. Both
results match the `zero_points` sizes.
```
# Down projection
B = 4,096 x 107 x 64
zero_points = 221,184
N = 4,096
n_blocks_per_col = 107
4,096 * CeilDiv(107 * 4, 8) = 4,096 * 54 = 221,184
# Up projection
B = 13,696 x 32 x 64
zero_points = 219,136
N = 13,696
n_blocks_per_col = 32
13,696 * CeilDiv(32 * 4, 8) = 13,696 * 16 = 219,136
```
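The arithmetic above can be reproduced with a short script. This is a minimal sketch, not code from the PR: the function names are hypothetical, `bits = 4` is assumed from the worked numbers, and `old_size` uses the per-column form that the worked example evaluates.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up (assumed meaning of CeilDiv)."""
    return -(-a // b)


def old_size(n: int, n_blocks_per_col: int, bits: int = 4) -> int:
    # Old equation as evaluated in the worked example above.
    return n * ceil_div((n_blocks_per_col + 1) * bits, 8)


def new_size(n: int, n_blocks_per_col: int, bits: int = 4) -> int:
    # New equation: N * CeilDiv(n_blocks_per_col * bits, 8).
    return n * ceil_div(n_blocks_per_col * bits, 8)


# (N, n_blocks_per_col, actual zero_points size) taken from the examples.
projections = {
    "down": (4096, 107, 221184),
    "up": (13696, 32, 219136),
}

for name, (n, blocks, actual) in projections.items():
    old, new = old_size(n, blocks), new_size(n, blocks)
    print(f"{name}: old={old} new={new} actual={actual} "
          f"old_ok={old == actual} new_ok={new == actual}")
```

Only the up projection, where `n_blocks_per_col` is 32, separates the two equations in this example.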