Kostas Anagnostou

kostasanagnostou.bsky.social

Lead Rendering Engineer at Playground Games working on Fable. Always open for graphics questions or mentoring people who want to get into the industry. I tweet about graphics mostly. Views my own. Blog: interplayoflight.wordpress.com

Joined October 2024

Kostas Anagnostou kostasanagnostou.bsky.social · Feb 5
Post exploring the evolution of SIMT in GPUs: "SIMD Started It, SIMT Improved It" blog.siggraph.org/2026/01/simd...
SIMD Started It, SIMT Improved It - ACM SIGGRAPH Blog

By blending thread abstractions with SIMD hardware, GPUs evolved into flexible processors for graphics, AI, and scientific computing.

blog.siggraph.org

Reposted by Kostas Anagnostou
Brian Karis handle.invalid · Feb 2
I'm finally writing up how Nanite Tessellation works. The first few blog posts are up. More will be coming. graphicrants.blogspot.com/2026/02/nani...
Nanite Tessellation

Nanite Tessellation, aka Nanite Dynamic Tessellation, aka Nanite Dynamic Displacement was the next major feature I worked on after Nanite it...

graphicrants.blogspot.com

Kostas Anagnostou kostasanagnostou.bsky.social · Jan 20
IBL Optimization Study II: Faster Irradiance technik90.blogspot.com/2026/01/ibl-...
IBL Optimization Study II: Faster Irradiance

Today, we're picking up right where we left off last post. We are looking at building our irradiance map. As a refresher, the irradiance map is...

technik90.blogspot.com

Kostas Anagnostou kostasanagnostou.bsky.social · Jan 12
Optimizing spatiotemporal variance-guided filtering for modern GPU architectures jcgt.org/published/00...
https://jcgt.org/published/0015/01/02/

jcgt.org

Kostas Anagnostou kostasanagnostou.bsky.social · Jan 11
Nice collection of graphics programming resources: cody-duncan.github.io/r-graphicspr...
r/GraphicsProgramming Wiki

cody-duncan.github.io

Reposted by Kostas Anagnostou
Martin Fuller martinfuller.bsky.social · Jan 6
Slides are now available for my GPC 2025 talk with @phammer.bsky.social on Variable Rate Compute Shaders in Doom The Dark Ages static.graphicsprogrammingconference.com/public/2025/...
https://static.graphicsprogrammingconference.com/public/2025/slides/variable-rate-compute-shaders-in-doom/Fuller-Hammer-variable-rate-compute-shaders-in-doom-the-dark-ages.pptx

static.graphicsprogrammingconference.com

Kostas Anagnostou kostasanagnostou.bsky.social · Jan 4
As expected, the size of the MLP does matter when running it on Tensor cores. A 3x32x32x1 MLP running on Cooperative Vectors is ~70x faster than my compute shader version for the same amount of inference. The Coop Vectors version is using fp16 but the speedup is impressive regardless.
- Kostas Anagnostou kostasanagnostou.bsky.social · Dec 26, 2025
  It's been an adventure but I finally managed to get Cooperative Vectors to use my tiny MLP (2 hidden layers, 3 nodes each) to infer sky vis at a specific pos. It isn't really faster than my compute shader version, the MLP is maybe too small to make good use of the tensor cores but cool regardless.
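
A minimal sketch of the two network sizes mentioned above, assuming "3x32x32x1" means 3 inputs, two hidden layers of 32 nodes and 1 output, and that the tiny MLP has 3 inputs, two hidden layers of 3 nodes and 1 output. The ReLU and sigmoid activations, the random weights and all function names are illustrative assumptions, not taken from the posts. It only compares the amount of work per inference on the CPU and says nothing about how Cooperative Vectors execute it.

```python
import math
import random


def make_mlp(layer_sizes, seed=0):
    """Build a fully connected network with random weights (illustrative only)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Run one inference: ReLU on hidden layers, sigmoid on the output (assumed)."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


def parameter_count(layers):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


tiny = make_mlp([3, 3, 3, 1])      # assumed shape of the Dec 26 network
larger = make_mlp([3, 32, 32, 1])  # assumed shape of the Jan 4 network

position = [0.1, 0.5, -0.3]        # hypothetical query position
visibility = forward(larger, position)[0]
print(parameter_count(tiny), parameter_count(larger), visibility)
```

Under these assumptions the tiny network has 28 parameters and the larger one has 1,217, which fits the post's guess that the tiny MLP was too small to keep the tensor cores busy.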

Reposted by Kostas Anagnostou
saschawillems saschawillems.bsky.social · Jan 3
My "How to Vulkan in 2026" @vulkan.org #Vulkan guide is now publicly available at www.howtovulkan.com I still consider it a preview, though I'm mostly happy with it and only plan on changing minor things and incorporating some feedback.
How to Vulkan in 2026 - How to Vulkan

How to write Vulkan graphics code in 2026

howtovulkan.com

Kostas Anagnostou kostasanagnostou.bsky.social · Jan 3
Interesting post, working through and improving the performance issues of an LLM-generated IBL implementation: technik90.blogspot.com/2025/12/ibl-...
IBL Optimization Study

This is the first post in a series dedicated to Image-Based Lighting. This is a very common technique in modern videogames used to implement...

technik90.blogspot.com

Kostas Anagnostou kostasanagnostou.bsky.social · Jan 2
"Improving Direct Lighting Material Occlusion - Part 1", discussing micro-occlusion for direct lighting, Naughty Dog's and Activision's approaches and alternatives irradiance.ca/posts/micros...
Improving Direct Lighting Material Occlusion - Part 1

Piqué fabric with micro-occlusion on the top, and micro-shadowing at the bottom. Both are using the same micro-occlusion map. Geometry virtualization and displacement systems like Nanite has allowed u...

irradiance.ca

Kostas Anagnostou kostasanagnostou.bsky.social · Jan 1
Wordle 1,657 5/6 🟨⬜⬜⬜🟩 ⬜🟩⬜⬜🟩 🟩🟩⬜⬜🟩 ⬜🟩🟨⬜🟩 🟩🟩🟩🟩🟩 A bit embarrassed that I couldn't find today's word sooner. 😊 Happy New Year everyone!

Kostas Anagnostou kostasanagnostou.bsky.social · Dec 31, 2025
Good read: "GPU Cache Hierarchy: Understanding L1, L2, and VRAM" charlesgrassi.dev/blog/gpu-cac...
GPU Cache Hierarchy: Understanding L1, L2, and VRAM

Why does one texture sample cost 4 cycles and another 500? GPU cache hierarchy explained—L1, L2, VRAM, and how to stop thrashing them.

charlesgrassi.dev

Reposted by Kostas Anagnostou
did:plc:yvrnuwqdk6ko4dsmjg3wmm3q handle.invalid · Dec 29, 2025
#SIGGRAPH2025 Advances in Real-Time Rendering in Games course talk recording of "FAST AS HELL: IDTECH8 GLOBAL ILLUMINATION" by @idsoftwaretiago.bsky.social from id Software is now online: youtu.be/VTrdeqMMMK0?... Enjoy!
SIGGRAPH 2025 Advances in Real-Time Rendering in Games: Fast as Hell: idTech8 Global Illumination

YouTube video by SIGGRAPH Advances in Real-Time Rendering

youtu.be

Kostas Anagnostou kostasanagnostou.bsky.social · Dec 27, 2025
Great collection of C/C++ compiler optimisations: xania.org/AoCO2025-arc...
AoCO2025 Archive — Matt Godbolt’s blog

xania.org

Kostas Anagnostou kostasanagnostou.bsky.social · Dec 26, 2025
It's been an adventure but I finally managed to get Cooperative Vectors to use my tiny MLP (2 hidden layers, 3 nodes each) to infer sky vis at a specific pos. It isn't really faster than my compute shader version, the MLP is maybe too small to make good use of the tensor cores but cool regardless.

Reposted by Kostas Anagnostou
Sebastian Aaltonen handle.invalid · Dec 16, 2025
My "No Graphics API" blog post is live! Please repost :) www.sebastianaaltonen.com/blog/no-grap... I spent 1.5 years doing this. Full rewrite last summer and another partial rewrite last month. As Hemingway said: "First draft of everything is always shit".
No Graphics API — Sebastian Aaltonen

Graphics APIs and shader languages have significantly increased in complexity over the past decade. It’s time to start discussing how to strip down the abstractions to simplify development, improve pe...

sebastianaaltonen.com

Kostas Anagnostou kostasanagnostou.bsky.social · Dec 11, 2025
"Microbenchmarking NVIDIA’s Blackwell Architecture: An in-depth Architectural Analysis", focusing on the tensor cores arxiv.org/pdf/2512.02189
https://arxiv.org/pdf/2512.02189

arxiv.org

Kostas Anagnostou kostasanagnostou.bsky.social · Dec 1, 2025
Great read: "Video Game Blurs (and how the best one works)" blog.frost.kiwi/dual-kawase/
Video Game Blurs (and how the best one works)

How to build realtime blurs on the GPU and how the best blur algorithm works - "Dual Kawase"

blog.frost.kiwi

Kostas Anagnostou kostasanagnostou.bsky.social · Nov 30, 2025
The z-buffer and depth testing (aka z-testing) have been the dominant way of hidden surface elimination for over 50 years, introduced but not implemented in W. Straßer's PhD thesis in 1974, and actually implemented in Ed Catmull's PhD thesis in the same year. 1/4

Kostas Anagnostou kostasanagnostou.bsky.social · Nov 23, 2025
The post on using spatial hashing with raytraced ambient occlusion attracted quite a bit of interest so I expanded it into a blog post to discuss how it works behind the scenes to both reduce the noise and its cost. interplayoflight.wordpress.com/2025/11/23/s...
Spatial hashing for raytraced ambient occlusion

Subdividing a 3D space into cells or voxels and using positional and/or directional information to directly index into it is a popular method to store and access local data, typically using 3D text…

interplayoflight.wordpress.com
- Kostas Anagnostou kostasanagnostou.bsky.social · Nov 11, 2025
  Did a quick and dirty implementation of a spatial hash structure to speed up RTAO: ray results are stored in cells indexed by pos/normal/cell size, and after storing a few rays, occlusion can be queried from the cell instead of raytracing it. 3x faster raytraced AO for that scene with no denoising.

Kostas Anagnostou kostasanagnostou.bsky.social · Nov 23, 2025
TIL that you can use an LLM to create LaTeX equations by pretty much describing them. My past, postgraduate self, who had to painstakingly create them by hand, would be jealous.

Kostas Anagnostou kostasanagnostou.bsky.social · Nov 14, 2025
"Get Started with Neural Shading" course videos: youtube.com/playlist?lis...
Get Started with Neural Shading - YouTube

In this course, we will provide an overview of neural shading and how to get started with it in your game or application.

youtube.com

Reposted by Kostas Anagnostou
did:plc:yvrnuwqdk6ko4dsmjg3wmm3q handle.invalid · Nov 11, 2025
#SIGGRAPH2025 Advances in Real-Time Rendering in Games course talk recording of "Stochastic Tile-Based Lighting in HypeHype" by Jarkko Lempiäinen from HypeHype is now online: www.youtube.com/watch?v=8O44...
SIGGRAPH 2025 Advances: STOCHASTIC TILE-BASED LIGHTING IN HYPEHYPE

YouTube video by SIGGRAPH Advances in Real-Time Rendering

youtube.com

Reposted by Kostas Anagnostou
did:plc:yvrnuwqdk6ko4dsmjg3wmm3q handle.invalid · Nov 11, 2025
#SIGGRAPH2025 Advances in Real-Time Rendering in Games course talk recording of "Strand-Based Hair And Fur Rendering In Indiana Jones and The Great Circle" by Sergei Kulikov from MachineGames is now online: youtu.be/jSE1XXBEK-w
SIGGRAPH 2025 Advances: STRAND-BASED HAIR AND FUR RENDERING IN INDIANA JONES AND THE GREAT CIRCLE

YouTube video by SIGGRAPH Advances in Real-Time Rendering

youtu.be

Kostas Anagnostou kostasanagnostou.bsky.social · Nov 11, 2025
Did a quick and dirty implementation of a spatial hash structure to speed up RTAO: ray results are stored in cells indexed by pos/normal/cell size, and after storing a few rays, occlusion can be queried from the cell instead of raytracing it. 3x faster raytraced AO for that scene with no denoising.
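
The idea in this post can be sketched on the CPU as follows. This is a rough illustration, not the engine's implementation: a Python dict stands in for the GPU hash table, the normal is quantised to its dominant axis and sign, and `MIN_RAYS`, `cell_key` and `AOCache` are hypothetical names and values chosen for the example.

```python
import math

MIN_RAYS = 8  # hypothetical: rays a cell needs before it is trusted


def cell_key(pos, normal, cell_size):
    """Quantise position to a grid cell and the normal to one of 6 directions."""
    cell = tuple(math.floor(c / cell_size) for c in pos)
    axis = max(range(3), key=lambda i: abs(normal[i]))
    sign = 1 if normal[axis] >= 0 else -1
    return cell + (axis, sign)


class AOCache:
    """Accumulates ray results per cell so later queries can skip raytracing."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = {}  # key -> [ray_count, hit_count]

    def store(self, pos, normal, hit):
        entry = self.cells.setdefault(cell_key(pos, normal, self.cell_size), [0, 0])
        entry[0] += 1
        entry[1] += 1 if hit else 0

    def query(self, pos, normal):
        """Return cached occlusion in [0, 1], or None if a ray must be traced."""
        entry = self.cells.get(cell_key(pos, normal, self.cell_size))
        if entry is None or entry[0] < MIN_RAYS:
            return None
        return entry[1] / entry[0]
```

A caller would try `query` first and only trace a ray, then `store` its result, when `query` returns `None`.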

Reposted by Kostas Anagnostou
Guillaume Boissé handle.invalid · Nov 5, 2025
New blog post! Behind the scenes of some of the techniques involved in making our last PC demo 💫 gboisse.github.io/posts/this-i...

Kostas Anagnostou kostasanagnostou.bsky.social · Nov 2, 2025
Finally got around to adding support for (hardware) VRS to the toy engine. Forcing a 2x2 shading rate and comparing GPU traces of the gbuffer pass (2nd trace has VRS on): the GPU is doing the same number of z-tests while doing about 64% less pixel shader work. 1/3

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 31, 2025
Happy Halloween!

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 25, 2025
Some information about Ghost of Yōtei's rendering systems blog.playstation.com/2025/10/23/g...
Ghost of Yōtei – tech deep dive

Sucker Punch delves into the tech that helped them bring Atsu’s engrossing journey to life.

blog.playstation.com

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 20, 2025
ReGIR - An advanced implementation for many-lights offline rendering tomclabault.github.io/blog/2025/re...
ReGIR - An advanced implementation for many-lights offline rendering | Tom Clabault

Tom Clabault's website

tomclabault.github.io

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 15, 2025
Good read! "Neural Super Sampling for Mobile": huggingface.co/Arm/neural-s...
https://huggingface.co/Arm/neural-super-sampling/blob/main/2025-neural-super-sampling.pdf

huggingface.co

Reposted by Kostas Anagnostou
Pablo Zurita handle.invalid · Oct 15, 2025
I finally found the time and energy to make a new blog and write a couple of posts. This time I wrote about PBR content and game development principles. Both posts are quite different so hopefully people find something interesting on either one of them. irradiance.ca/posts/
Posts

irradiance.ca

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 11, 2025
Interesting read on the future of game graphics, from 12 years ago: mcvuk.com/business-new...
Starry eyed: Where game graphics go next - MCV/DEVELOP

Develop examines the truth about photorealism in the next generation

mcvuk.com

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 9, 2025
Reading AMD GPU ISA rocm.blogs.amd.com/software-too...
Reading AMD GPU ISA

Reading AMDGCN ISA

rocm.blogs.amd.com

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 8, 2025
Quick experiment in Compiler Explorer to observe how VGPR allocation differs between wave32 and wave64 on RDNA: with wave32 it appears to be in batches of 8 while with wave64 it is in batches of 4 VGPRs. godbolt.org/z/W6ee8MhMx
Compiler Explorer - HLSL (RGA 2.6.2 (DXC trunk))

// The entry point and target profile are needed to compile this example: // -T ps_6_6 -E PSMain RWBuffer<float> output; cbuffer SomeData { float dataArray[10]; int index; } //Undefin...

godbolt.org
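
To make the observation above concrete, here is a small sketch of what allocation "in batches" means for a shader's register footprint. The granularities (8 for wave32, 4 for wave64) are the values observed in the post and may not hold for every RDNA generation; `allocated_vgprs` is a hypothetical helper, not part of any tool.

```python
# Granularities as observed in the post; treat them as assumptions.
VGPR_GRANULARITY = {32: 8, 64: 4}


def allocated_vgprs(requested, wave_size):
    """Round a shader's VGPR usage up to the allocation granularity."""
    granularity = VGPR_GRANULARITY[wave_size]
    return -(-requested // granularity) * granularity  # ceiling division


for wave_size in (32, 64):
    print(wave_size, allocated_vgprs(10, wave_size))
```

With these assumed granularities, a shader that needs 10 VGPRs would be allocated 16 in wave32 and 12 in wave64.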

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 8, 2025
Really good, maths-free, introduction to the Fourier transform from this year's Siggraph, recommended watch: dl.acm.org/doi/10.1145/...
Introduction to the Fourier Transform | Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Courses

dl.acm.org

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 6, 2025
Quick tip, add the "-fspv-target-env=vulkan1.1" command line argument to Compiler Explorer's RGA to get it to compile HLSL shaders with wave intrinsics: godbolt.org/z/PKoh5d51K
Compiler Explorer - HLSL (RGA 2.6.2 (DXC trunk))

// The entry point and target profile are needed to compile this example: // -T ps_6_6 -E PSMain struct PSInput { float4 position : SV_Position; float4 color : COLOR0; }; float4 PSMain(PS...

godbolt.org

Kostas Anagnostou kostasanagnostou.bsky.social · Oct 1, 2025
"Inside NVIDIA GPUs: Anatomy of high performance matmul kernels", includes a great intro to GPU architecture and PTX/SASS: www.aleksagordic.com/blog/matmul
Inside NVIDIA GPUs: Anatomy of high performance matmul kernels - Aleksa Gordić

From GPU architecture and PTX/SASS to warp-tiling and deep asynchronous tensor core pipelines.

aleksagordic.com

Reposted by Kostas Anagnostou
Arseny Kapoulkine zeux.io · Sep 30, 2025
New blog post! In "Billions of triangles in minutes" we'll walk through hierarchical cluster level of detail generation of, well, billions of triangles in minutes. Reposts welcome! zeux.io/2025/09/30/b...

Kostas Anagnostou kostasanagnostou.bsky.social · Sep 22, 2025
Nsight Graphics' GPU Trace/Trace Analysis often provides more low-level hardware information, in the form of tooltips and performance advice, than the documentation available online does. It is worth exploring a few captures to understand the GPU architecture better.

Kostas Anagnostou kostasanagnostou.bsky.social · Sep 21, 2025
Continuing on the topic of GPU utilisation and performance, as a practical example, I looked a bit deeper into the impact of vertex shader exports on the cost of a drawcall and wrote another blog post with some observations interplayoflight.wordpress.com/2025/09/21/t...
The performance impact of vertex shader exports

Following up on the previous post on GPU utilization and performance, and to provide a practical example, I expanded a bit on a topic discussed in brief: vertex shader exports and their impact on p…

interplayoflight.wordpress.com

Kostas Anagnostou kostasanagnostou.bsky.social · Sep 3, 2025
Brief introduction to a number of OIT techniques: "Advances in Order Independent Transparency for Real-Time & Virtual Production Workflows" www.youtube.com/watch?v=wXSJ...
Advances in Order Independent Transparency for Real-Time & Virtual Production Workflows - P. Kakkar

YouTube video by Academy Software Foundation

youtube.com

Kostas Anagnostou kostasanagnostou.bsky.social · Aug 29, 2025
New blog post discussing a few approaches to reducing bottlenecks and improving GPU utilisation and performance interplayoflight.wordpress.com/2025/08/29/g...
GPU utilisation and performance improvements

Drill deep into a GPU’s architecture and at its heart you will find a large number of SIMD units whose purpose is to read data, perform some vector or scalar ALU (VALU or SALU) operation on i…

interplayoflight.wordpress.com

Kostas Anagnostou kostasanagnostou.bsky.social · Aug 27, 2025
It is interesting that both RTGI presentations at Advances in R-T Rendering this year (both great reads!) introduce it as a solution to scalability and baking size/time issues, due to the size/dynamic nature of the world, more than a visual improvement advances.realtimerendering.com/s2025/index....
Advances in Real-Time Rendering in Games, SIGGRAPH 2025 - Celebrating 20 years!

advances.realtimerendering.com

Kostas Anagnostou kostasanagnostou.bsky.social · Aug 27, 2025
Detecting reads from uninitialised heap memory in C++ programs at runtime www.forwardscattering.org/post/71
Forward Scattering - The Weblog of Nicholas Chapman

First off, most malloc implementations in debug mode will fill the memory with a special byte pattern like 0xCDCDCDCD. Unfortunately this doesn't directly help you detect reads of such patterns.

forwardscattering.org

Kostas Anagnostou kostasanagnostou.bsky.social · Aug 22, 2025
Worth re-sharing this oldish but still great presentation as an example of how perf should be viewed holistically, maximising all GPU units' utilisation even if it means making a particular drawcall's execution slower to achieve this. s3.amazonaws.com/nd.images/re... www.youtube.com/watch?v=CvS6...
https://s3.amazonaws.com/nd.images/research/2020_siggraph/Low_Level_Optimizations_In_TLOU2.pptx

s3.amazonaws.com

Kostas Anagnostou kostasanagnostou.bsky.social · Aug 21, 2025
Intuitive Guide to Convolution betterexplained.com/articles/int...
Intuitive Guide to Convolution – BetterExplained

betterexplained.com

Kostas Anagnostou kostasanagnostou.bsky.social · Aug 20, 2025
Pointer Tagging in C++: The Art of Packing Bits Into a Pointer vectrx.substack.com/p/pointer-ta...
Pointer Tagging in C++: The Art of Packing Bits Into a Pointer

Using tagged pointers to save memory, speed up dynamic dispatch, and compact data structures

vectrx.substack.com

Kostas Anagnostou kostasanagnostou.bsky.social · Aug 19, 2025
Anno 1800: Frame Analysis blog.thomaspoulet.fr/posts/anno-1...
Anno 1800: Frame Analysis | Thomas Poulet

In depth analysis of the steps required to get a frame from Anno 1800 onto the screen.

blog.thomaspoulet.fr
