X-Git-Url: https://git.cworth.org/git?a=blobdiff_plain;f=src%2Fintel%2Fperformance_measurement.mdwn;h=66454c09efeba23027cb916f0866b8acbb3e7103;hb=2316b07ba6f4f3ac1768aee62e214686ee74f3a8;hp=ce5ffd1cb8941d0acb9b42142f86a779ec459d7c;hpb=04a50398d2be37dcd9b5f8ed42b72638b3ad861b;p=cworth.org

diff --git a/src/intel/performance_measurement.mdwn b/src/intel/performance_measurement.mdwn
index ce5ffd1..66454c0 100644
--- a/src/intel/performance_measurement.mdwn
+++ b/src/intel/performance_measurement.mdwn
@@ -5,14 +5,14 @@
 Trying to get a handle on 2D graphics rendering performance can be a
 difficult task. Obviously, people care about the performance of their
 2D applications. Nobody wants to wait for a web browser to scroll past
-tacky banner ads or for an email client to render a pageful of
+tacky banner ads or for an email client to render a screen full of
 spam. And it's easy for users to notice "my programs aren't rendering
 as fast with the latest drivers". But what developers need is a way to
 quantify exactly what that means, in order to track improvements and
 avoid regressions. And that measurement is the hard part. Or at least
 it always has been hard, until Chris Wilson's recent cairo-perf-trace.
 
-# Previous attempts at 2D benchmarking
+## Previous attempts at 2D benchmarking
 
 Various attempts at 2D-rendering benchmark suites have appeared and
 even become popular. Notable examples are x11perf and gtkperf. My
@@ -53,18 +53,18 @@ And yes, I myself have used and perhaps indirectly advocated for
 using things like x11perf in the past. I won't recommend it again in
 the future. See below for what I suggest instead.
 
-# What do the 3D folks do?
+## What do the 3D folks do?
 
 For 3D performance, everybody knows this lesson already. Nobody
 measures the performance of "draw the same triangles over and
-over". And when a program that does that (like glxgears) everybody
-laughs if someone tries to take its frames-per-second report
-seriously. In fact, the phrase "glxgears is not a benchmark" is a
-catchphrase among 3D developers. Instead, 3D measurement is made with
-"benchmark modes" in the 3D applications that people actually care
-about, (which as far as I can tell is just games for some reason). In
-the benchmark mode, a sample session of recorded input is replayed as
-quickly as possible and a performance measurement is reported.
+over". And if someone does, (by seriously quoting glxgear fps numbers,
+for example), then everybody gets a good laugh. In fact, the phrase
+"glxgears is not a benchmark" is a catchphrase among 3D
+developers. Instead, 3D measurement is made with "benchmark modes" in
+the 3D applications that people actually care about, (which as far as
+I can tell is just games for some reason). In the benchmark mode, a
+sample session of recorded input is replayed as quickly as possible
+and a performance measurement is reported.
 
 As a rule, our 2D applications don't have similar benchmark
 modes. (There are some exceptions such as the trender utility for
@@ -72,7 +72,7 @@ mozilla and special command-line options for the swfdec player.) And
 coding up application-specific benchmarking code for every interesting
 application isn't something that anyone is signing up to do right now.
 
-# Introducing cairo-perf-trace
+## Introducing cairo-perf-trace
 
 Over the past year or so, Chris "ickle" Wilson has been putting a lot
 of work into a debugging utility known as cairo-trace, (inspired by
@@ -134,7 +134,7 @@ with youtube), and traces of poppler, gnome-terminal, and
 evolution. Obviously, anyone should feel free to generate and propose
 new traces to contribute.
 
-# Putting cairo-perf-trace to use
+## Putting cairo-perf-trace to use
 
 In the few days that cairo-perf-traces has existed, we're already
 seeing great results from it. When Kristian Høgsberg recently proposed
@@ -184,7 +184,8 @@ Then, after my simple just-use-malloc patch I get:
 Here the xlib result has improved from 194 seconds to 81 seconds.
 That's a 2.4x improvement, and fast enough to now play the movie
 without skipping. It's very satisfying to validate performance
-patches with real-world application code like this. (Of course,
+patches with real-world application code like this. This commit is in
+the recent 2.7.99.901 or the Intel driver, by the way. (Of course,
 there's still a 1.8x slowdown of the xlib backend compared to the
 image backend, so there's still more to be fixed here.)