Yakiimo3D

Mostly DirectX 11 Programming

DX11 Temporal AA Without Manual Blending

Introduction

I updated my simple Temporal AA demo to include an implementation of Temporal AA without blending. A lot of times, in technical 3D forums, you read about Temporal AA not showing up in screenshots. If you blend jittered frames like in my previous demo then Temporal AA is going to show up in screenshots. However, it’s also possible to implement Temporal AA without manual blending, and in that case, the AA is not going to appear in screenshots. Over the weekend, I implemented this other version of simple Temporal AA.

A CodePlex link to my program’s source code and binary is provided at the end of the article.

Relevant Links

1) http://en.wikipedia.org/wiki/Radeon_R420
The Radeon X700-X850 series apparently featured Temporal AA without blending. Don’t think it’s supported on newer graphics cards as my HD5750 with Catalyst 10.10 doesn’t show an option.

2) http://techreport.com/articles.x/6672/22
This Tech Report article has a nice explanation of the Radeon Temporal AA implementation. Multisampling patterns are alternated each frame. If vsync is enabled and the app can maintain 60fps, even without manual blending, the eyes will interpolate the jittered frames, giving an illusion of higher-sampled anti-aliasing. In the Radeon implementation, 2x MSAA using temporal AA will give an illusion of 4x AA.

3) http://forum.beyond3d.com/showthread.php?t=46241
Beyond3D’s “List of Rendering Resolutions + basics on hardware scaling, MSAA, framebuffers” thread mentions both Temporal AA with and without blending.

4) http://www.yakiimo3d.com/2010/09/28/dx11-perspective-matrix-jittering-temporal-aa/
My implementation of Temporal AA with blending. A simple implementation without reprojection, so you get ghosting/blurring.

Implementation Details

The implementation is the same as my last Temporal AA implementation except that I don’t manually blend the previous and current frame, but let the eye do the interpolation. I added a combobox for selecting the vsync interval. Only vsynced 60fps gives good results; non vsynced 60+fps, vsynced 30fps, vsynced 20fps all give pretty terrible edge flickering. With blending, the artifact is blurring/ghosting, without blending, the artifact seems to be edge flickering.
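As a rough illustration of why the vsync interval matters, the sketch below assumes a 60 Hz display and that frames are presented with something like IDXGISwapChain::Present(syncInterval, 0); the demo's actual present path is not shown in this post. PresentedFps and JitterCycleHz are hypothetical helper names. With two alternating jitter offsets, each offset is only on screen at half the presented frame rate.

```cpp
#include <cassert>

// Frame rate produced by presenting with a given vsync interval,
// e.g. Present(syncInterval, 0) on a display refreshing at refreshHz.
double PresentedFps(double refreshHz, unsigned syncInterval)
{
    assert(syncInterval >= 1); // interval 0 (no vsync) is not modeled here
    return refreshHz / syncInterval;
}

// With two alternating jitter offsets, a full even/odd cycle takes two frames.
double JitterCycleHz(double refreshHz, unsigned syncInterval)
{
    return PresentedFps(refreshHz, syncInterval) / 2.0;
}
```

On a 60 Hz display, intervals 1, 2 and 3 give 60, 30 and 20 fps, so the even/odd jitter cycle repeats at 30, 15 and 10 Hz. That is consistent with the flicker becoming obvious at the slower settings.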

Unlike the Radeon implementation, I’m not alternating MSAA patterns, but just alternating jittered scene renders, so the quality should be equivalent to 2x SSAA (like my previous blended Temporal AA implementation). In the demo, the Temporal AA without blending looks a little bit worse than Temporal AA with blending because I decreased the sub-pixel offset to 0.15 from 0.25. Even at a vsynced 60 fps setting, I can sometimes still see an occasional edge flicker.
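A minimal sketch of alternating the jitter per frame could look like the following. The Jitter struct, the JitterForFrame name and the diagonal plus/minus pattern are assumptions for illustration; the post only states the offset magnitude (0.15 here, 0.25 in the blended demo), not the exact pattern.

```cpp
#include <cstdint>

// Sub-pixel offset in pixels, applied to the projection for one frame.
struct Jitter { float dx; float dy; };

// Hypothetical helper: even frames shift by +offset, odd frames by -offset,
// along the pixel diagonal (the diagonal pattern is an assumption).
Jitter JitterForFrame(std::uint64_t frameIndex, float offsetPixels)
{
    const float s = (frameIndex % 2 == 0) ? offsetPixels : -offsetPixels;
    return Jitter{ s, s };
}
```

The returned dx and dy would play the role of pixdx and pixdy in a jittered projection matrix. No blend pass follows, so each frame is presented as rendered.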

Demo

Source Code & Binary
http://yakiimo3d.codeplex.com/releases/view/56973

I would recommend viewing the demo in full-screen mode as this seems to give a more stable frame rate, and less edge flickering. I also fixed my texture mipmap settings and set the anisotropic filter to 16x. Even at this setting, you can still see a tiny bit of reduction in texture aliasing (without noticeable texture blurring from the SSAA, I think) in some parts of the scene, but it’s not significant and I guess that’s not the main point of Temporal AA.

Sorry for the download size. My mesh model data, borrowed from the DirectX SDK samples, is pretty big. Like my previous Temporal AA implementation, DX11 isn’t necessary for this technique, but my hobby coding is now done using DX11, so that’s what the demo uses.

Final Words

While I was working on my last Temporal AA demo, another Japanese programmer told me that he saw very little texture aliasing reduction in his Temporal AA test, but I foolishly didn’t check my code for bugs. Hopefully, I didn’t make any stupid mistakes this time!

DirectX11 Tessellation Tutorial Presentation

http://xtunt.com/?p=11
A DirectX11 tessellation tutorial presentation given at the IX Brazilian Symposium on Computer Games and Digital Entertainment.

http://www.xtunt.com/samples
The presentation pdf said that source code will be available to download from the above url. (Not available yet as of 2010/11/21).

http://xtunt.com/?p=29
Looks like the source code has been released at a slightly different url. (2010/11/27 Update)

http://www.youtube.com/user/xxnunes
Gustavo’s youtube page has sample videos of DX11 tessellation implementations.

Learned about the above presentation from the author’s tweet on twitter. The presentation tutorial covers simple quad tessellation, simple triangle tessellation and parametric surface generation with the tessellator. Detailed and easy to understand explanations. Read through it quickly and seems like a great first tutorial for learning DirectX11 tessellation.

Temporal AA Texture Mipmap Settings

http://www.gamedev.net/community/forums/topic.asp?topic_id=583617

I posted my temporal AA sample as a gamedev.net IOTD, and I got a comment pointing out that regular texture filtering should be sufficient to remove the aliasing inside of textures. I checked my code and noticed that the MaxLod value for the sampler was not set up correctly. With the code corrected and mipmapping with trilinear filtering working properly, the texture aliasing was alleviated without temporal AA’s anti-aliasing. If anisotropic filtering is enabled, the blurring goes away too.
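To show why a bad MaxLOD can switch mipmapping off, here is a simplified model of mip level selection. It is only a sketch: real hardware derives the LOD from screen-space derivatives, and SelectMipLod is a hypothetical name. The post does not say what the incorrect value was, so a clamp of 0 is assumed here. In D3D11 the field is D3D11_SAMPLER_DESC::MaxLOD, and D3D11_FLOAT32_MAX is the usual value for "no upper limit".

```cpp
#include <algorithm>
#include <cmath>

// Simplified mip selection: lod = log2(texels covered by one pixel),
// then clamped to the sampler's [minLod, maxLod] range.
float SelectMipLod(float texelsPerPixel, float minLod, float maxLod)
{
    const float lod = std::log2(std::max(texelsPerPixel, 1.0f));
    return std::min(std::max(lod, minLod), maxLod);
}
```

With maxLod = 0, a surface where one pixel covers 8 texels still samples mip 0, which aliases. With a large maxLod the same surface picks mip 3.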


DX11 Perspective Matrix Jittering Temporal AA

Introduction

Temporal anti-aliasing algorithms are algorithms where you use samples from previously rendered frames in order to perform a temporal form of supersampling. There are different ways to implement temporal AA, and over the weekend, I wrote an implementation of the simple perspective matrix jittering temporal AA. This type of temporal AA is mentioned in the Beyond3D forums here (http://forum.beyond3d.com/showthread.php?t=46241).

A CodePlex link to my program’s source code and binary is provided at the end of the article.

Halo: Reach Temporal AA

http://www.eurogamer.net/articles/digitalfoundry-halo-reach-tech-analysis-article (AA is covered on Page 3.)
Digital Foundry recently published a tech analysis article on Halo: Reach. Based on frame capture analysis and the behavior of the anti-aliasing, they speculated that the anti-aliasing used in Halo: Reach is some kind of temporal anti-aliasing.

CryEngine3 Temporal AA

http://advances.realtimerendering.com/s2010/index.html
At SIGGRAPH 2010’s “Advances in Real-Time Rendering in 3D Graphics and Games” course, Crytek presented a talk “CryENGINE 3: Reaching the Speed of Light” in which they gave some details about their anti-aliasing implementation. Their anti-aliasing is a hybrid solution with 2 different AA algorithms applied to near and distant objects. For near objects, an edge-based post process AA is applied. For distant objects, temporal AA (the temporal reprojection with cache miss approach) is applied.

The below papers were given as references for temporal reprojection in Crytek’s talk.

Accelerating Real-Time Shading with Reverse Reprojection Caching (ACM SIGGRAPH Symposium on Graphics Hardware 2007)
http://www.cse.ust.hk/~psander/
http://www.cse.ust.hk/~psander/docs/reproj2.pdf (pdf)
and
Spatio-Temporal Upsampling on the GPU (I3D 2010)
http://www.mpi-inf.mpg.de/~rherzog/
http://www.mpi-inf.mpg.de/~rherzog/Papers/spatioTemporalUpsampling_preprintI3D2010.pdf (pdf)

Screen Captures


No AA

MSAA 2x

MSAA 4x

Temporal AA 2x

Comments On the Screen Captures

(2010/10/02 added comment)
http://www.yakiimo3d.com/2010/10/02/temporal-aa-texture-mipmap-settings/
Texture filtering was not properly set up when I took the above pics. If texture filtering is properly enabled, the aliasing inside the textures goes away. With mipmapping enabled, the effect of temporal AA on texture aliasing becomes minimal.

My implementation of temporal AA blends together 2 samples and should be roughly equivalent to a 2 sample SSAA in terms of quality. For comparison, I also implemented hardware MSAA 2x and hardware MSAA 4x. In the above screenshots, you can see that for long edges, MSAA 2x and temporal AA 2x are around the same quality. MSAA 4x results in much cleaner long edges than both temporal AA and MSAA 2x. If you look at the inside of the textures, you will notice that temporal AA 2x results in much cleaner straight lines and less overall aliasing compared to both MSAA 2x and MSAA 4x. Temporal AA is a form of supersampling and is able to reduce pixel shader aliasing on surfaces and not just jaggies on edges. However, for the above picture, if texture mipmapping had been properly enabled, the hardware texture filter would have taken care of the aliasing.

In my Japanese blog, I posted screenshots comparing the different AA algorithms’ results on a grid-patterned texture. Temporal AA works really well for this type of high frequency texture. While neither MSAA 2x nor MSAA 4x is able to reduce the aliasing much, temporal AA successfully reduces the moire pattern and the aliasing.
http://d.hatena.ne.jp/yakiimo02/20100926/1285523408


MSAA uses multiple depth samples, but only a single pixel shader sample, while temporal AA, like SSAA, uses multiple pixel shader samples. This means that temporal AA effectively operates at a higher resolution than the screen, which allows the algorithm to reduce aliasing on detailed polygon surfaces that require more info than a single pixel can hold.

The perspective matrix jittering temporal AA algorithm is cheap and works great on still pictures. However, it has problems once the camera starts moving (especially in sideways motions) because the difference between the previous frame’s pixels and the current frame’s pixels becomes large and as a result when the 2 frames are blended, the resulting screen is blurry with slight halo traces of the last frame.
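The ghosting is easy to reproduce on paper. The sketch below blends two 1D "frames" with equal weights; BlendFrames is a hypothetical name, and the 50/50 weighting is an assumption based on the 2-sample SSAA comparison rather than something read from the demo's shader.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Equal-weight blend of the previous and current frame, one value per pixel.
std::vector<float> BlendFrames(const std::vector<float>& prev,
                               const std::vector<float>& curr)
{
    assert(prev.size() == curr.size());
    std::vector<float> out(curr.size());
    for (std::size_t i = 0; i < curr.size(); ++i)
        out[i] = 0.5f * (prev[i] + curr[i]);
    return out;
}
```

If a bright pixel sits at index 1 in the previous frame and has moved to index 3 in the current frame, the blend leaves half-intensity values at both positions. The value left behind at index 1 is the halo trace of the last frame. When nothing moves, the two frames differ only by the sub-pixel jitter and the blend acts as anti-aliasing.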

Implementation Details

http://glprogramming.com/red/chapter10.html
Chapter 10 of the OpenGL Redbook explains the OpenGL accumulation buffer AA algorithm.
http://www.cse.msu.edu/~cse872/tutorial5.html
I found out that the Redbook covers the OpenGL accumulation buffer AA algorithm from this tutorial for Michigan State University’s CSE 872 Advanced Computer Graphics class.

Chapter 10 of the OpenGL Redbook has an explanation on using perspective matrix jittering in order to perform anti-aliasing through the accumulation buffer. In the Redbook example, they re-render the scene n-sample times, each time with the perspective matrix jittered by a small sub-pixel amount, and blend the result into the OpenGL accumulation buffer. This subpixel n-sample blending is equivalent to SSAA and gets you an anti-aliased image. The technique requires re-drawing the scene n-sample times within a single frame and is prohibitively costly for most real-time applications.

My temporal AA implementation is a 2-sample perspective matrix jittered AA with one sample taken from the previously rendered frame and one sample from the current frame. The implementation requires an extra rendertarget the same size as the framebuffer in order to store the previous frame’s color buffer (assuming the current frame’s rendertarget is already necessary and so free). Even frames and odd frames are rendered with different subpixel jitter values and are blended together in a fullscreen post process pass. As with the OpenGL accumulation buffer AA, the temporal blending of 2 subpixel jittered color buffers should give similar results to SSAA 2x. The only tricky part about the implementation for me was figuring out how to create a perspective projection matrix that jitters the rendered image by a screenspace subpixel amount. The function that performs this calculation is shown below.

// From temporalAA.cpp
/**
    Based on the OpenGL Red Book Chapter 10.
    http://glprogramming.com/red/chapter10.html
*/

D3DXMATRIX TemporalAA::JitteredFrustum(float left, float right, float bottom,
    float top, float fNear, float fFar, float pixdx,
    float pixdy, float eyedx, float eyedy, float focus, const CameraInfo& cameraInfo) const
{
    float xwsize, ywsize;
    float dx, dy;

    xwsize = right - left;
    ywsize = top - bottom;
    // translate the screen space jitter distances into near clipping plane distances
    dx = -(pixdx*xwsize/cameraInfo.fBackBufferW +
            eyedx*fNear/focus);
    dy = -(pixdy*ywsize/cameraInfo.fBackBufferH +
            eyedy*fNear/focus);

    D3DXMATRIX mPerspective;

    D3DXMatrixPerspectiveOffCenterLH( &mPerspective, left + dx, right + dx, bottom + dy, top + dy, fNear, fFar );

    return mPerspective;
}
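As a sanity check on the dx computation, the following sketch reproduces only the x mapping of an off-centre left-handed perspective projection (the same mapping D3DXMatrixPerspectiveOffCenterLH produces) in plain C++ and measures how far a point moves on screen. ProjectNdcX and JitterShiftInPixels are hypothetical helper names, and eyedx is assumed to be 0.

```cpp
#include <cassert>

// NDC x of a view-space point (x, z) under an off-centre LH perspective
// projection: ndc.x = (2 * zNear * x / z - (left + right)) / (right - left).
double ProjectNdcX(double left, double right, double zNear, double x, double z)
{
    assert(right != left && z > 0.0);
    return (2.0 * zNear * x / z - (left + right)) / (right - left);
}

// On-screen shift, in pixels, caused by a jitter of pixdx pixels,
// using the same dx formula as JitteredFrustum with eyedx = 0.
double JitterShiftInPixels(double left, double right, double zNear,
                           double backBufferW, double pixdx,
                           double x, double z)
{
    const double dx = -(pixdx * (right - left) / backBufferW);
    const double before = ProjectNdcX(left, right, zNear, x, z);
    const double after = ProjectNdcX(left + dx, right + dx, zNear, x, z);
    return (after - before) * backBufferW / 2.0; // NDC spans 2 units across the width
}
```

For a 1280 pixel wide back buffer and pixdx = 0.25, the shift comes out as a quarter of a pixel regardless of the point's depth. The frustum width (right - left) is unchanged by the jitter, so only the centre term moves.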

Perspective Matrix Jittering Temporal AA Demo

The background model used in the demo is the “powerplant” sdkmesh model included with the DirectX June 2010 SDK. It’s used in the DX11 Cascaded Shadowmaps demo.

In my demo, the frame sometimes jumps when you are moving the camera. I use the DXUT CFirstPersonCamera, and the same thing occurs in the Cascaded Shadowmaps demo, so it is not an artifact of temporal AA.

Source Code & Binary
http://yakiimo3d.codeplex.com/releases/view/53001

Conclusions

The perspective matrix jittering temporal AA is easy to implement, cheap and provides anti-aliasing quality similar to SSAA 2x. However, the current implementation suffers from distracting screen blurring once the camera starts moving. I haven’t really read up on temporal reprojection yet, but it might be interesting to implement AA using it and see how the results compare to this implementation.

DirectCompute Cloth Sample Included In Bullet Physics 2.77

http://bulletphysics.org/Bullet/phpBB3/viewtopic.php?t=5681
Just found out that the recently released Bullet Physics 2.77 contains OpenCL and DirectCompute hardware accelerated cloth simulation samples contributed by AMD.

http://channel9.msdn.com/Blogs/gclassy/DirectCompute-Lecture-Series-230-GPU-Accelerated-Physics
You can watch a video of ATI/AMD’s Lee Howes presenting on the DirectCompute cloth implementation on MSDN (linked in Erwin Coumans’s announcement post above). If you don’t want to watch the full video, note that the slides for the video are available for download as well.

I downloaded Bullet Physics 2.77 and compiled and ran the included DX11 DirectCompute cloth sample. On my HD5750, with the Release build, I get around 310-350 fps for the 5 cloth scene. As mentioned in the forum announcement, there is no collision detection yet, so the cloth sometimes penetrates itself, but overall it looks nice.

http://cedec.cesa.or.jp/2010/en/sessions/PG/C10_P0206.html
Apparently the AMD DirectCompute session at CEDEC 2010 (Japan’s GDC) talked about this integrated cloth simulation sample for Bullet. I went to a couple of CEDEC sessions this year, but I did not go to the above AMD session.