From: Carl Worth
Date: Wed, 30 May 2007 20:45:23 +0000 (-0700)
Subject: Add corrected_rectangles blog entry
X-Git-Url: https://git.cworth.org/git?p=cworth.org;a=commitdiff_plain;h=0ba56140ea9fea1c4838e9e0a0ec13a480113f70

Add corrected_rectangles blog entry
---

diff --git a/src/exa/corrected_rectangles.mdwn b/src/exa/corrected_rectangles.mdwn
new file mode 100644
index 0000000..d57462f
--- /dev/null
+++ b/src/exa/corrected_rectangles.mdwn
@@ -0,0 +1,117 @@
+[[meta title="Correcting bugs in the rectangles test"]]
+
+[[tag cairo exa performance xorg]]
+
+Owen Taylor was kind enough to take a close look at my [[recent
+post|understanding_rectangles]] comparing the performance of EXA and
+NoAccel rectangle fills on an r100. He was also careful enough to
+notice that the results looked really fishy.
+
+Here are some of the problems he noted from looking at the graphs:
+
+1. The EXA line looks to have an impossibly large fill rate.
+
+2. The NoAccel line looks asymptotically linear rather than quadratic
+   as expected.
+
+3. No chart of numbers was provided to allow for any closer
+   examination.
+
+I went back to the code for my test case and did find a bug that
+explains some of the problems he saw. The random positioning of
+rectangles wasn't correctly accounting for their size to keep them
+within the visible portion of the window. So, as the rectangle gets
+larger the region that is likely to be clipped by the destination
+window also gets larger. And that explains the linear rather than
+quadratic growth.
+
+So here's a corrected version of the original graphs:
+
+[[rectangles-corrected-512.png]]
+
+And, again, a closer look at the small rectangles:
+
+[[rectangles-corrected-64.png]]
+
+And, this time I'll provide a chart of numbers as well:
+
+Time to render 10000 rectangles with XRenderFillRectangles
+
+| Rectangle size | NoAccel (ms) | EXA (ms) |
+|----------------|-------------:|---------:|
+| 1x1            |        1.456 |    2.356 |
+| 2x2            |        1.529 |    2.288 |
+| 4x4            |        1.884 |    2.352 |
+| 8x8            |        3.039 |    2.356 |
+| 16x16          |        3.255 |    2.357 |
+| 32x32          |        7.608 |    2.377 |
+| 64x64          |       26.479 |    2.430 |
+| 128x128        |      101.325 |    5.376 |
+| 256x256        |     1295.105 |   22.549 |
+| 512x512        |    15354.022 |   89.744 |
+
+So that addresses the second and third of Owen's issues. But what
+about that fill rate? First, how can I know my card's maximum fill
+rate? I'm told that the standard approach is to use `x11perf
+-rect500`. Let's see what that gives for NoAccel:
+
+    NoAccel $ x11perf -rect500
+    ...
+    900 reps @ 6.1247 msec ( 163.0/sec): 500x500 rectangle
+
+And then for EXA:
+
+    $ x11perf -rect500
+    3000 reps @ 1.9951 msec ( 501.0/sec): 500x500 rectangle
+
+So that shows fill rates of about 41M pixels/sec for NoAccel
+and about 125M pixels/sec for EXA, (`500*500*163 = 40750000`
+and `500*500*501 = 125250000`).
+
+Meanwhile, my results above for the 10000 512x512 rectangles give fill
+rates of 171M pixels/sec for NoAccel and 29210M pixels/sec for EXA,
+(`512*512*10000/15.354022 =~ 170733114` and `512*512*10000/.089744 =~
+29210197896`).
+
+So my test is reporting a NoAccel fill rate that is 4x faster than
+what x11perf reports, and an EXA fill rate that is 233x (!) faster
+than what x11perf reports. So, something is definitely still fishy
+here. A fill rate of close to 30 billion pixels/sec. from an old r100
+just cannot be possible, (as another datapoint, I just got a new Intel
+965 and with x11perf I measure a fill rate of 843 million
+pixels/sec. on it).
+
+So what could be happening here? It could be that my cairo-perf
+measurement framework is totally broken. It does at least seem to be
+returning consistent numbers from one run to the next, though. And the
+results do appear to have the correct trend as can be seen from these
+two graphs showing the measured fill rates:
+
+[[img fill-rates-cairo-perf.png]]
+
+[[img fill-rates-x11perf.png]]
+
+But again, notice from the Y-axis values of the cairo-perf plot that
+the numbers are just plain too large to be believed.
+
+I don't yet have a good answer for what could explain the difference
+here.
+I did notice that exaPolyFillRect converts the list of
+rectangles into a region which should prevent areas overlapped by
+multiple rectangles from being filled multiple times. For x11perf
+there is no overlap at 100x100 or smaller, but a lot of overlap at
+500x500. Similarly, the overlap gets more probable at larger sizes
+with the cairo-perf test. The existence of optimizations like that
+suggests that these tests might legitimately be able to report numbers
+larger than the actual fill rate of the video card.
+
+But that code should also be common whether calling
+XRenderFillRectangles like my cairo-perf test does, or XFillRectangles
+like the x11perf test does. So that optimization doesn't explain what
+I'm seeing here. (I also reran my cairo-perf test with
+XRenderFillRectangles changed to XFillRectangles and saw no
+difference.)
+
+Anybody have any ideas what might be going on here? Email me at
+ or the xorg list at ,
+([subscription required](http://lists.freedesktop.org/mailman/listinfo/xorg)
+of course).
diff --git a/src/exa/fill-rates-cairo-perf.png b/src/exa/fill-rates-cairo-perf.png
new file mode 100644
index 0000000..4f7f113
Binary files /dev/null and b/src/exa/fill-rates-cairo-perf.png differ
diff --git a/src/exa/fill-rates-x11perf.png b/src/exa/fill-rates-x11perf.png
new file mode 100644
index 0000000..9df8ab3
Binary files /dev/null and b/src/exa/fill-rates-x11perf.png differ
diff --git a/src/exa/rectangles-corrected-512.png b/src/exa/rectangles-corrected-512.png
new file mode 100644
index 0000000..a9d1332
Binary files /dev/null and b/src/exa/rectangles-corrected-512.png differ
diff --git a/src/exa/rectangles-corrected-64.png b/src/exa/rectangles-corrected-64.png
new file mode 100644
index 0000000..d762d79
Binary files /dev/null and b/src/exa/rectangles-corrected-64.png differ
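
As a cross-check on the fill-rate arithmetic quoted in the blog entry above, here is a minimal Python sketch that recomputes the four rates and the two ratios from the numbers in the entry. The helper name `fill_rate` and the variable names are hypothetical, chosen only for illustration; nothing here comes from cairo-perf or x11perf themselves.

```python
def fill_rate(width, height, count, seconds):
    """Pixels per second when `count` rectangles of width x height
    are drawn in `seconds` (hypothetical helper, for illustration)."""
    return width * height * count / seconds

# x11perf reports 500x500 rectangles per second, so count = rate, seconds = 1.
x11perf_noaccel = fill_rate(500, 500, 163.0, 1.0)
x11perf_exa = fill_rate(500, 500, 501.0, 1.0)

# cairo-perf table: 10000 rectangles of 512x512, timings given in ms.
cairo_noaccel = fill_rate(512, 512, 10000, 15354.022 / 1000.0)
cairo_exa = fill_rate(512, 512, 10000, 89.744 / 1000.0)

# How much faster the cairo-perf numbers are than the x11perf numbers.
noaccel_ratio = cairo_noaccel / x11perf_noaccel
exa_ratio = cairo_exa / x11perf_exa

print(f"x11perf    NoAccel: {x11perf_noaccel:,.0f} pixels/sec")
print(f"x11perf    EXA:     {x11perf_exa:,.0f} pixels/sec")
print(f"cairo-perf NoAccel: {cairo_noaccel:,.0f} pixels/sec")
print(f"cairo-perf EXA:     {cairo_exa:,.0f} pixels/sec")
print(f"ratios: NoAccel {noaccel_ratio:.1f}x, EXA {exa_ratio:.1f}x")
```

The ratios come out at roughly 4x for NoAccel and roughly 233x for EXA, matching the figures in the entry, so the arithmetic itself is not the source of the discrepancy.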