[Regression] New multithreaded reformat function slower by default than with just 1 thread used in PyAV < 17.0.0 #2200

@timminator

In the latest version, to_ndarray(format='rgb24') is significantly slower by default. Forcing threads=1 restores the original performance, in line with PyAV 16.1.0. The slowdown only seems to occur consistently on unique video frames, hence the setup code in the script below, which creates a fresh frame for every call. I expected a performance increase from this new update. :-)
@lgeiger I hope it's fine if I tag you here. You added this new functionality, which is really appreciated, and you might know what's going on here. :-)

Example script:

import timeit
import numpy as np

n = 500

setup_code = f"""
import av
frames = [av.VideoFrame(1920, 1080, 'yuv420p') for _ in range({n})]
frame_iter = iter(frames)
"""

stmt_code = "next(frame_iter).to_ndarray(format='rgb24')"
stmt_code_1thread = "next(frame_iter).to_ndarray(format='rgb24', threads=1)"

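# Run each benchmark 10 times. The setup code re-runs for every repeat,
# so each of the n timed calls converts a frame that has not been converted before.
times = timeit.repeat(stmt_code, setup=setup_code, repeat=10, number=n)
times_per_call = np.array(times) / n
times_1thread = timeit.repeat(stmt_code_1thread, setup=setup_code, repeat=10, number=n)
times_per_call_1thread = np.array(times_1thread) / n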

print(f"Multithreaded (new default): {times_per_call.mean() * 1e3:.4f} ± {times_per_call.std() * 1e3:.4f} ms")
print(f"Single threaded (PyAV < 17.0.0): {times_per_call_1thread.mean() * 1e3:.4f} ± {times_per_call_1thread.std() * 1e3:.4f} ms")

Result:

Multithreaded (new default): 2.5456 ± 0.0552 ms
Single threaded (PyAV < 17.0.0): 1.8747 ± 0.0305 ms
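
To put the gap in relative terms, here is a minimal sketch that uses only the two mean values reported above. The variable names are just for illustration, and the numbers come from this one machine, so the ratio may well differ elsewhere:

```python
# Mean per-call times copied from the result above, in milliseconds.
multithreaded_ms = 2.5456    # new default
single_threaded_ms = 1.8747  # threads=1

# Ratio of the default path to the single-threaded path, and the absolute gap.
slowdown = multithreaded_ms / single_threaded_ms
extra_ms = multithreaded_ms - single_threaded_ms

print(f"Default takes {slowdown:.2f}x the single-threaded time "
      f"(+{extra_ms:.4f} ms per frame)")
```

The difference is far larger than the reported standard deviations (0.0552 ms and 0.0305 ms), so it is unlikely to be measurement noise.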

Operating System: Windows 11
Python version: 3.12.8
