YUV->RGBA conversion: Special case the edge pixels, do the middle without index clamping #12

torokati44 · 2021-01-20T21:43:29Z

This PR splits the color space conversion of the edge pixels and "the rest", to allow fewer operations on the inside pixels that are the straightforward case.
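
To illustrate the idea, here is a minimal one-dimensional sketch, not the code from this PR. It assumes a row of chroma samples upsampled 2x with 3/4 and 1/4 weights; the function names (blend, upsample_clamped, upsample_split) and the exact weights are hypothetical. The first version clamps the neighbour index for every output sample, while the second handles the two edge samples separately so the interior loop needs no clamping.

```rust
/// Weighted average of the nearer and the farther chroma sample
/// (3/4 and 1/4, rounded). The weights are an assumption for this sketch.
fn blend(near: u8, far: u8) -> u8 {
    ((3 * near as u16 + far as u16 + 2) / 4) as u8
}

/// Straightforward version: the neighbour index is clamped for every sample.
fn upsample_clamped(chroma: &[u8]) -> Vec<u8> {
    let n = chroma.len();
    let mut out = Vec::with_capacity(n * 2);
    for x in 0..n * 2 {
        let near = x / 2;
        let far = if x % 2 == 0 {
            near.saturating_sub(1) // clamp at the left edge
        } else {
            (near + 1).min(n - 1) // clamp at the right edge
        };
        out.push(blend(chroma[near], chroma[far]));
    }
    out
}

/// Split version: the edges are special-cased, the interior is not clamped.
fn upsample_split(chroma: &[u8]) -> Vec<u8> {
    let n = chroma.len();
    if n == 0 {
        return Vec::new();
    }
    let mut out = Vec::with_capacity(n * 2);
    // Left edge: there is no neighbour to the left, so reuse the sample.
    out.push(blend(chroma[0], chroma[0]));
    // Interior: each adjacent pair yields two outputs, both neighbours exist.
    for pair in chroma.windows(2) {
        out.push(blend(pair[0], pair[1]));
        out.push(blend(pair[1], pair[0]));
    }
    // Right edge: there is no neighbour to the right, so reuse the sample.
    out.push(blend(chroma[n - 1], chroma[n - 1]));
    out
}
```

Both functions should produce identical output for any input; the split version only moves the boundary handling out of the hot loop.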

The effects of each included commit (plus two excluded ones) on the runtime of a particular command are:

These numbers are the output of time (in seconds) for the following command (run after compilation is done), with each sample averaged over three runs. The error bars extend one standard deviation in each direction.

time cargo run --package=exporter --release -- ../../Downloads/z0r-de_4145.swf --frames 1000

I also commented out the actual saving of the frames into files, so the effect on rendering itself is more directly measurable.

While the "utility functions" commit regresses a little bit, doing it is almost a necessity for the one after it, which is the one providing the significant gains.

Overall, these changes sped up the rendering by about 25%.

I also tried two more experiments (independently of each other), then discarded both because each regressed slightly:
The first one did the bilinear interpolation differently: on f32 numbers, in two steps (the usual way, in a rotated H-shape).
The second one simply omitted the .min() and .max() calls from clamp(), relying on the saturating property of the f32 to u8 cast instead.
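
As a sketch of what the second experiment relies on: in Rust (since 1.45), an `as` cast from f32 to u8 saturates at 0 and 255 and maps NaN to 0, so an explicit clamp before the cast gives the same result as the bare cast. The two function names below are hypothetical and are not the clamp() from this PR.

```rust
/// Explicit clamp to the u8 range before the cast.
fn clamp_explicit(v: f32) -> u8 {
    v.max(0.0).min(255.0) as u8
}

/// Relies only on the saturating float-to-int `as` cast.
fn clamp_cast_only(v: f32) -> u8 {
    v as u8
}
```

The two are interchangeable in terms of results, so which one is faster has to be measured, as was done above.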

I don't know if this is starting to stretch the "code simplicity/cleanliness" vs. "runtime performance" trade-off a little bit too far, but at least there is still no unsafe anywhere... :)

torokati44 added 3 commits January 20, 2021 21:02

video: Simplify clamp() used in the YUV to RGBA conversion

a8769d6

video/h263: Extract the YUV conversion step into utility functions

6cc0fb9

video/h263: Special case the edge pixels in the YUV -> RGBA conversio…

2589356

…n step This yielded an overall 20% faster decoding speed on the video I tested

torokati44 mentioned this pull request Jan 20, 2021

H.263 decoder + initial video support ruffle-rs/ruffle#2173

Merged

4 tasks

kmeisthax merged commit 2589356 into kmeisthax:video-h263 Jan 21, 2021
