Collect timer/monitor results whenever there are >1000 outstanding

Previously, fips always waited for a frame boundary before collecting
timer and monitor results. Now, whenever more than a maximum number
(set to 1000 here) of monitors have been fired off without their
results being collected, fips will check for and collect the results
of all timers/monitors that have results available.
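
The policy described above can be sketched roughly as follows. This is
a minimal illustration and not the actual fips code: the names
tracker_t, query_t, issue_query, collect_available and frame_boundary
are all hypothetical, and the "ready" flag stands in for asking the
driver whether a result is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold, mirroring the 1000 used in this change. */
#define MAX_OUTSTANDING 1000
/* Hypothetical fixed capacity, only to keep the sketch simple. */
#define CAPACITY 4096

/* Stand-in for one in-flight timer/monitor query. In real code,
 * "ready" would be answered by the driver, not stored up front. */
typedef struct {
    bool ready;
} query_t;

typedef struct {
    query_t pending[CAPACITY];
    size_t count;      /* queries issued but not yet collected */
    size_t collected;  /* total results collected so far */
} tracker_t;

/* Collect every pending query whose result is available, keeping the
 * rest in their original order. Returns how many were collected. */
size_t collect_available(tracker_t *t)
{
    size_t kept = 0;
    size_t before = t->count;

    for (size_t i = 0; i < before; i++) {
        if (t->pending[i].ready)
            t->collected++;
        else
            t->pending[kept++] = t->pending[i];
    }
    t->count = kept;
    return before - kept;
}

/* Issue a query; collect early once too many are outstanding. */
void issue_query(tracker_t *t, bool ready)
{
    assert(t->count < CAPACITY);
    t->pending[t->count++].ready = ready;

    if (t->count > MAX_OUTSTANDING)
        collect_available(t);
}

/* The frame-boundary collection still happens as before. */
void frame_boundary(tracker_t *t)
{
    collect_available(t);
}
```

In this sketch, issuing 1000 ready queries collects nothing, and the
1001st triggers a collection of all of them. If no results are ever
available, the early check frees nothing and the sketch eventually
hits its capacity assertion.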

Here's some background on the debugging that led to this change:

With an apitrace collected from "DOTA 2" we ran into crashes, always
on the first frame of the game proper (after the opening menus,
etc.). This frame is unusually large (roughly half a million OpenGL
calls).

With that large frame, and the resulting large number of outstanding
queries waiting to be collected, we were running into a resource
limit, and Mesa's performance-monitor code was crashing on an
unexpectedly NULL bo->virtual pointer.

A little digging determined that a DRM map ioctl was failing because
the map_count resource in the kernel had grown past the configured
default limit (roughly 65530).
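
For anyone wanting to watch this resource while reproducing the crash,
here is a small sketch. It assumes the limit in question is the Linux
vm.max_map_count sysctl, and that counting lines in /proc/self/maps is
a fair approximation of the process's current mapping count. The
function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Count newline-terminated lines in a stream. Each line of
 * /proc/<pid>/maps describes one mapping. */
long count_lines(FILE *f)
{
    long n = 0;
    int c;

    assert(f != NULL);
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n')
            n++;
    }
    return n;
}

/* Approximate mapping count of the calling process, or -1 if
 * /proc/self/maps cannot be opened. */
long current_map_count(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    long n;

    if (f == NULL)
        return -1;
    n = count_lines(f);
    fclose(f);
    return n;
}

/* Configured per-process mapping limit, or -1 if it cannot be read.
 * Assumes the limit is exposed as the vm.max_map_count sysctl. */
long max_map_count(void)
{
    FILE *f = fopen("/proc/sys/vm/max_map_count", "r");
    long limit = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &limit) != 1)
        limit = -1;
    fclose(f);
    return limit;
}
```

Logging current_map_count() against max_map_count() every few thousand
GL calls would show whether the count really climbs toward the limit
during the large frame.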

After checking that neither fips nor Mesa was leaking any large
number of buffer objects (nor keeping many mapped), we decided to
attempt this more aggressive collection of results in fips.

As far as resource consumption in general goes, this does seem like a
reasonable thing to do. If we have hundreds of outstanding queries,
surely the oldest of them have completed, and we can free some
resources by collecting those.

On the other hand, it still seems wrong that the kernel is imposing
an arbitrary limit on how many outstanding queries an application
can have. The AMD_performance_monitor specification and
implementation are not intended to have any such limitation. So,
there's still some investigation to be done on what resource is
causing the kernel's map_count to grow so large and to see if
there's a bug there to be fixed.