fix: remove noops and speed up function `zoom_image` by 12% (#4164)
<!-- CODEFLASH_OPTIMIZATION:
{"function":"zoom_image","file":"unstructured/partition/utils/ocr_models/tesseract_ocr.py","speedup_pct":"12%","speedup_x":"0.12x","original_runtime":"18.1 milliseconds","best_runtime":"16.1 milliseconds","optimization_type":"memory","timestamp":"2025-12-19T03:24:39.274Z","version":"1.0"}
-->
#### 📄 12% (0.12x) speedup for ***`zoom_image` in `unstructured/partition/utils/ocr_models/tesseract_ocr.py`***
⏱️ Runtime : **`18.1 milliseconds`** **→** **`16.1 milliseconds`** (best
of `12` runs)
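
The headline figure can be reproduced from the two runtimes. The sketch below assumes the speedup is defined as (original - optimized) / optimized, which is consistent with the numbers quoted in this report; the helper name `speedup_pct` is made up for illustration.

```python
def speedup_pct(original: float, optimized: float) -> float:
    """Relative speedup in percent: (original - optimized) / optimized * 100.

    Assumed definition, chosen because it matches the figures in this report.
    Both runtimes must be expressed in the same unit.
    """
    if optimized <= 0:
        raise ValueError("optimized runtime must be positive")
    return (original - optimized) / optimized * 100.0


# Overall figure: 18.1 ms -> 16.1 ms
print(f"{speedup_pct(18.1, 16.1):.1f}%")  # 12.4%
# Existing unit test: 707 us -> 632 us
print(f"{speedup_pct(707, 632):.1f}%")  # 11.9%
```

A negative result means the optimized run was slower, which is how the "slower" annotations in the generated tests arise.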
#### 📝 Explanation and details
The optimization removes unnecessary morphological operations (dilation
followed by erosion) that were being performed with a 1x1 kernel. Since
a 1x1 kernel has no effect on the image during dilation and erosion
operations, these steps were pure computational overhead.
**Key changes:**
- Eliminated the creation of a 1x1 kernel (`np.ones((1, 1), np.uint8)`)
- Removed the `cv2.dilate()` and `cv2.erode()` calls that used this
ineffective kernel
- Added explanatory comments about why these operations were removed
**Why this leads to speedup:**
The line profiler shows that the morphological operations consumed 27.7%
of the total runtime (18.5% for dilation + 9.2% for erosion). A 1x1
kernel performs no actual morphological transformation - it's equivalent
to applying the identity operation. Removing these no-op calls
eliminates unnecessary OpenCV function overhead and memory operations.
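
To see why a 1x1 kernel is a no-op, here is a minimal pure-Python sketch of dilation and erosion on a small grayscale grid. It is not OpenCV's implementation: the helper names (`_morph`, `dilate`, `erode`) are made up for illustration, only odd square kernels are handled, and windows are simply clipped at the image border. Dilation takes the maximum over each pixel's window and erosion the minimum, so a window that contains only the pixel itself returns the input unchanged.

```python
from typing import Callable, List

Gray = List[List[int]]


def _morph(img: Gray, ksize: int, pick: Callable[[List[int]], int]) -> Gray:
    """Apply `pick` (max or min) over a ksize x ksize window around each pixel.

    Simplification: windows are clipped at the border instead of padded.
    """
    h, w = len(img), len(img[0])
    r = ksize // 2  # window radius; 0 for a 1x1 kernel
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def dilate(img: Gray, ksize: int) -> Gray:
    return _morph(img, ksize, max)


def erode(img: Gray, ksize: int) -> Gray:
    return _morph(img, ksize, min)


img = [
    [0, 0, 0, 0],
    [0, 255, 0, 0],
    [0, 0, 0, 0],
]

# 1x1 kernel: each window is the pixel itself, so both passes are identity.
print(erode(dilate(img, 1), 1) == img)  # True
# 3x3 kernel: the bright pixel spreads to its neighbours.
print(dilate(img, 3) == img)  # False
```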
**Performance impact based on function references:**
The `zoom_image` function is called during Tesseract OCR processing,
specifically in `get_layout_from_image()` when the text height falls
outside the optimal range. Removing the no-ops reduces OCR preprocessing
time, which matters because OCR is computationally intensive and may run
repeatedly in document processing pipelines.
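
As a rough illustration of that call pattern, the sketch below picks a zoom factor only when a measured text height falls outside a target range. Everything in it is hypothetical: the function name `choose_zoom`, the parameter names, and the threshold values are invented for this example and are not taken from `get_layout_from_image()`.

```python
def choose_zoom(
    text_height: float,
    min_height: float = 12.0,      # hypothetical lower bound, in pixels
    max_height: float = 100.0,     # hypothetical upper bound, in pixels
    optimum_height: float = 20.0,  # hypothetical target height, in pixels
) -> float:
    """Return a zoom factor; 1.0 means the image needs no rescaling."""
    if text_height <= 0:
        # No usable measurement: leave the image as it is.
        return 1.0
    if min_height <= text_height <= max_height:
        # Text is already in the acceptable range, so zooming is skipped.
        return 1.0
    # Scale so the text lands at the target height.
    return optimum_height / text_height


print(choose_zoom(10.0))   # 2.0 (text too small, upscale)
print(choose_zoom(50.0))   # 1.0 (in range, no zoom)
print(choose_zoom(200.0))  # 0.1 (text too large, downscale)
```

Under a scheme like this, the zoom step runs only for pages whose text is unusually small or large, so the saving applies to those pages rather than to every OCR call.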
**Test case analysis:**
Most generated test cases run 7-35% faster, with particularly strong
gains for:
- Identity zoom operations (up to 35.8% faster) - most common case where
zoom=1
- Upscaling operations (21-32% faster) - when OCR requires image
enlargement
- Large images (8-22% faster) - where the removed operations had more
overhead

A few cases, mostly downscaling, measured slower (0.153% to 22.9%), so
the gain is not uniform across inputs.
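
The per-test figures come from runtime annotations of the form `# 75.0μs -> 55.2μs (35.8% faster)` in the generated tests. The following sketch shows one way to parse such an annotation into microseconds so the ranges can be recomputed. The helper names are illustrative, and the unit table covers only the units that appear in this report plus nanoseconds and seconds.

```python
import re
from typing import Tuple

MU = "\u03bc"     # Greek small letter mu
MICRO = "\u00b5"  # micro sign, a common look-alike

_TO_MICROSECONDS = {"ns": 1e-3, "us": 1.0, "ms": 1e3, "s": 1e6}

# Matches "<number><unit> -> <number><unit>", e.g. "1.23ms -> 1.09ms".
_RUNTIME = re.compile(
    r"([\d.]+)\s*([a-z\u03bc\u00b5]+)\s*->\s*([\d.]+)\s*([a-z\u03bc\u00b5]+)"
)


def _to_us(value: str, unit: str) -> float:
    """Convert a value with its unit suffix to microseconds."""
    unit = unit.replace(MU, "u").replace(MICRO, "u")
    return float(value) * _TO_MICROSECONDS[unit]


def parse_runtimes(annotation: str) -> Tuple[float, float]:
    """Return (original, optimized) runtimes in microseconds."""
    match = _RUNTIME.search(annotation)
    if match is None:
        raise ValueError(f"no runtime pair found in {annotation!r}")
    v1, u1, v2, u2 = match.groups()
    return _to_us(v1, u1), _to_us(v2, u2)


orig, opt = parse_runtimes("# 41.8\u03bcs -> 54.2\u03bcs (22.9% slower)")
print(f"{(orig - opt) / opt * 100:+.1f}%")  # -22.9%
```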
The optimization maintains identical visual output since the removed
operations were mathematically no-ops, ensuring OCR accuracy is
preserved while reducing processing time.
✅ **Correctness verification report:**
| Test | Status |
| --------------------------- | ----------------- |
| ⚙️ Existing Unit Tests | ✅ **27 Passed** |
| 🌀 Generated Regression Tests | ✅ **38 Passed** |
| ⏪ Replay Tests | 🔘 **None Found** |
| 🔎 Concolic Coverage Tests | 🔘 **None Found** |
| 📊 Tests Coverage | 100.0% |
<details>
<summary>⚙️ Existing Unit Tests and Runtime</summary>
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|:---------------------------------------------------|:--------------|:---------------|:----------|
| `partition/pdf_image/test_ocr.py::test_zoom_image` | 707μs | 632μs | 11.9%✅ |
</details>
<details>
<summary>🌀 Generated Regression Tests and Runtime</summary>
```python
from __future__ import annotations

import numpy as np

# imports
from PIL import Image as PILImage

from unstructured.partition.utils.ocr_models.tesseract_ocr import zoom_image

# --------- UNIT TESTS ---------


# Helper function to create a simple RGB PIL image of given size and color
def make_image(size=(10, 10), color=(255, 0, 0)):
    img = PILImage.new("RGB", size, color)
    return img


# ---------------- BASIC TEST CASES ----------------


def test_zoom_identity():
    """Zoom factor 1 should return an image of the same size (but not necessarily the same object)."""
    img = make_image((20, 30), (123, 45, 67))
    codeflash_output = zoom_image(img, 1)
    out = codeflash_output  # 75.0μs -> 55.2μs (35.8% faster)
    # The pixel values may not be identical due to dilation/erosion, but should be very close
    diff = np.abs(np.array(out, dtype=int) - np.array(img, dtype=int))


def test_zoom_upscale():
    """Zoom factor >1 should increase image size proportionally."""
    img = make_image((10, 20), (0, 255, 0))
    codeflash_output = zoom_image(img, 2)
    out = codeflash_output  # 35.2μs -> 29.0μs (21.4% faster)
    # The output image should still be greenish
    arr = np.array(out)


def test_zoom_downscale():
    """Zoom factor <1 should decrease image size proportionally."""
    img = make_image((10, 10), (0, 0, 255))
    codeflash_output = zoom_image(img, 0.5)
    out = codeflash_output  # 25.3μs -> 21.6μs (17.1% faster)
    arr = np.array(out)


def test_zoom_non_integer_factor():
    """Non-integer zoom factors should produce correct output size."""
    img = make_image((8, 8), (100, 200, 50))
    codeflash_output = zoom_image(img, 1.5)
    out = codeflash_output  # 30.2μs -> 22.8μs (32.1% faster)


def test_zoom_no_side_effects():
    """The input image should not be modified."""
    img = make_image((5, 5), (10, 20, 30))
    img_before = np.array(img).copy()
    codeflash_output = zoom_image(img, 2)
    _ = codeflash_output  # 22.9μs -> 18.3μs (25.0% faster)


# ---------------- EDGE TEST CASES ----------------


def test_zoom_zero_factor():
    """Zoom factor 0 should be treated as 1 (no scaling)."""
    img = make_image((7, 13), (50, 100, 150))
    codeflash_output = zoom_image(img, 0)
    out = codeflash_output  # 24.6μs -> 20.0μs (23.2% faster)


def test_zoom_negative_factor():
    """Negative zoom factors should be treated as 1 (no scaling)."""
    img = make_image((12, 8), (200, 100, 50))
    codeflash_output = zoom_image(img, -2)
    out = codeflash_output  # 26.1μs -> 20.0μs (30.4% faster)


def test_zoom_large_factor_on_small_image():
    """Zooming a small image by a large factor should scale up."""
    img = make_image((2, 2), (42, 84, 126))
    codeflash_output = zoom_image(img, 10)
    out = codeflash_output  # 42.8μs -> 33.5μs (27.5% faster)


def test_zoom_non_rgb_image():
    """Function should work with grayscale images (converted to RGB)."""
    img = PILImage.new("L", (5, 5), 128)  # Grayscale
    img_rgb = img.convert("RGB")
    codeflash_output = zoom_image(img, 2)
    out = codeflash_output  # 31.0μs -> 25.7μs (20.8% faster)


def test_zoom_alpha_channel_image():
    """Function should ignore alpha channel and process as RGB."""
    img = PILImage.new("RGBA", (6, 6), (100, 150, 200, 128))
    img_rgb = img.convert("RGB")
    codeflash_output = zoom_image(img, 2)
    out = codeflash_output  # 28.0μs -> 24.9μs (12.6% faster)


def test_zoom_large_image_upscale():
    """Zooming a large image up should work and not crash."""
    img = make_image((500, 500), (10, 20, 30))
    codeflash_output = zoom_image(img, 1.5)
    out = codeflash_output  # 1.23ms -> 1.09ms (12.5% faster)
    # Check a corner pixel is still close to original color
    arr = np.array(out)


def test_zoom_large_image_downscale():
    """Zooming a large image down should work and not crash."""
    img = make_image((800, 600), (200, 100, 50))
    codeflash_output = zoom_image(img, 0.5)
    out = codeflash_output  # 942μs -> 923μs (2.03% faster)
    arr = np.array(out)


def test_zoom_maximum_allowed_size():
    """Test with the largest allowed image under 1000x1000."""
    img = make_image((999, 999), (1, 2, 3))
    codeflash_output = zoom_image(img, 1)
    out = codeflash_output  # 1.47ms -> 1.30ms (13.0% faster)
    arr = np.array(out)


def test_zoom_many_colors():
    """Test with an image with many colors (gradient)."""
    arr = np.zeros((100, 100, 3), dtype=np.uint8)
    for i in range(100):
        for j in range(100):
            arr[i, j] = [i * 2 % 256, j * 2 % 256, (i + j) % 256]
    img = PILImage.fromarray(arr)
    codeflash_output = zoom_image(img, 0.9)
    out = codeflash_output  # 112μs -> 97.0μs (16.3% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
```
```python
from __future__ import annotations

import numpy as np

# imports
from PIL import Image as PILImage

from unstructured.partition.utils.ocr_models.tesseract_ocr import zoom_image

# --- Helper functions for tests ---


def create_test_image(size=(10, 10), color=(255, 0, 0), mode="RGB"):
    """Create a plain color PIL image for testing."""
    return PILImage.new(mode, size, color)


# --- Unit tests ---

# 1. Basic Test Cases


def test_zoom_identity():
    """Test zoom=1 returns image of same size and content is similar."""
    img = create_test_image((10, 10), (123, 222, 111))
    codeflash_output = zoom_image(img, 1)
    result = codeflash_output  # 57.2μs -> 53.3μs (7.43% faster)
    # The content may not be pixel-perfect due to cv2 conversion, but should be close
    arr_orig = np.array(img)
    arr_result = np.array(result)


def test_zoom_double_size():
    """Test zoom=2 increases both dimensions by 2x."""
    img = create_test_image((10, 5), (10, 20, 30))
    codeflash_output = zoom_image(img, 2)
    result = codeflash_output  # 38.6μs -> 30.6μs (26.3% faster)


def test_zoom_half_size():
    """Test zoom=0.5 reduces both dimensions by half (rounded)."""
    img = create_test_image((10, 6), (200, 100, 50))
    codeflash_output = zoom_image(img, 0.5)
    result = codeflash_output  # 29.6μs -> 25.4μs (16.7% faster)


def test_zoom_arbitrary_factor():
    """Test zoom=1.7 scales image correctly."""
    img = create_test_image((10, 10), (0, 255, 0))
    codeflash_output = zoom_image(img, 1.7)
    result = codeflash_output  # 30.3μs -> 23.8μs (27.3% faster)
    expected_size = (int(round(10 * 1.7)), int(round(10 * 1.7)))


# 2. Edge Test Cases


def test_zoom_zero():
    """Test zoom=0 is treated as 1 (no scaling)."""
    img = create_test_image((8, 8), (50, 50, 50))
    codeflash_output = zoom_image(img, 0)
    result = codeflash_output  # 26.3μs -> 23.1μs (13.7% faster)
    arr_orig = np.array(img)
    arr_result = np.array(result)


def test_zoom_negative():
    """Test negative zoom is treated as 1 (no scaling)."""
    img = create_test_image((7, 9), (100, 200, 50))
    codeflash_output = zoom_image(img, -3)
    result = codeflash_output  # 24.4μs -> 20.4μs (19.6% faster)
    arr_orig = np.array(img)
    arr_result = np.array(result)


def test_zoom_minimal_size():
    """Test 1x1 image with zoom=2 and zoom=0.5."""
    img = create_test_image((1, 1), (0, 0, 0))
    codeflash_output = zoom_image(img, 2)
    result_up = codeflash_output
    codeflash_output = zoom_image(img, 0.5)
    result_down = codeflash_output


def test_zoom_non_rgb_image():
    """Test grayscale and RGBA images."""
    # Grayscale
    img_gray = PILImage.new("L", (10, 10), 128)
    # Convert to RGB for function compatibility
    img_gray_rgb = img_gray.convert("RGB")
    codeflash_output = zoom_image(img_gray_rgb, 2)
    result_gray = codeflash_output  # 41.8μs -> 54.2μs (22.9% slower)
    # RGBA
    img_rgba = PILImage.new("RGBA", (10, 10), (10, 20, 30, 40))
    img_rgba_rgb = img_rgba.convert("RGB")
    codeflash_output = zoom_image(img_rgba_rgb, 0.5)
    result_rgba = codeflash_output  # 22.4μs -> 19.7μs (13.8% faster)


def test_zoom_non_integer_zoom():
    """Test zoom with non-integer floats."""
    img = create_test_image((9, 7), (10, 20, 30))
    codeflash_output = zoom_image(img, 1.333)
    result = codeflash_output  # 26.9μs -> 24.6μs (9.32% faster)
    expected_size = (int(9 * 1.333), int(7 * 1.333))


def test_zoom_unusual_aspect_ratio():
    """Test tall and wide images."""
    img_tall = create_test_image((3, 100), (1, 2, 3))
    codeflash_output = zoom_image(img_tall, 0.5)
    result_tall = codeflash_output  # 31.7μs -> 32.0μs (0.911% slower)
    img_wide = create_test_image((100, 3), (4, 5, 6))
    codeflash_output = zoom_image(img_wide, 0.5)
    result_wide = codeflash_output  # 21.8μs -> 24.0μs (9.20% slower)


def test_zoom_large_zoom_factor():
    """Test very large zoom factor (e.g., 20x)."""
    img = create_test_image((2, 2), (255, 255, 255))
    codeflash_output = zoom_image(img, 20)
    result = codeflash_output  # 33.6μs -> 26.0μs (29.1% faster)


def test_zoom_extreme_color_values():
    """Test image with extreme color values (black/white)."""
    img_black = create_test_image((5, 5), (0, 0, 0))
    img_white = create_test_image((5, 5), (255, 255, 255))
    codeflash_output = zoom_image(img_black, 1)
    result_black = codeflash_output  # 23.6μs -> 21.3μs (10.8% faster)
    codeflash_output = zoom_image(img_white, 1)
    result_white = codeflash_output  # 17.5μs -> 14.9μs (17.9% faster)


# 3. Large Scale Test Cases


def test_zoom_large_image_no_scale():
    """Test zoom=1 on a large image."""
    img = create_test_image((500, 400), (100, 150, 200))
    codeflash_output = zoom_image(img, 1)
    result = codeflash_output  # 300μs -> 274μs (9.51% faster)
    arr_orig = np.array(img)
    arr_result = np.array(result)


def test_zoom_large_image_upscale():
    """Test zoom=2 on a large image."""
    img = create_test_image((200, 300), (10, 20, 30))
    codeflash_output = zoom_image(img, 2)
    result = codeflash_output  # 446μs -> 415μs (7.60% faster)


def test_zoom_large_image_downscale():
    """Test zoom=0.5 on a large image."""
    img = create_test_image((800, 600), (50, 60, 70))
    codeflash_output = zoom_image(img, 0.5)
    result = codeflash_output  # 934μs -> 945μs (1.19% slower)


def test_zoom_large_non_square():
    """Test large non-square image with zoom=1.5."""
    img = create_test_image((333, 777), (123, 45, 67))
    codeflash_output = zoom_image(img, 1.5)
    result = codeflash_output  # 1.51ms -> 1.24ms (21.9% faster)
    expected_size = (int(333 * 1.5), int(777 * 1.5))


def test_zoom_maximum_allowed_size():
    """Test image at upper bound of allowed size (1000x1000)."""
    img = create_test_image((1000, 1000), (222, 111, 0))
    codeflash_output = zoom_image(img, 1)
    result = codeflash_output  # 1.81ms -> 1.66ms (8.62% faster)
    # Downscale
    codeflash_output = zoom_image(img, 0.1)
    result_down = codeflash_output  # 870μs -> 871μs (0.153% slower)
    # Upscale (should not exceed 1000*2=2000, which is still reasonable)
    codeflash_output = zoom_image(img, 2)
    result_up = codeflash_output  # 6.98ms -> 5.98ms (16.7% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
```
</details>
To edit these changes, run
`git checkout codeflash/optimize-zoom_image-mjcb2smb` and push.

---------
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: qued <64741807+qued@users.noreply.github.com>