A small update with several fixes for GA102 and Big Navi from my current perspective:

- removed Position&Parameter cache for GA102 since Nvidia doesn't use extra caches, thx @pixeljetstream

- fixed drunk 160 CU image for Navi21, thx @jilin_zhang

- fixed wrong PCIe4.0 speed numbers, thx @gnyueh

- Thought about @TDevilfish's objection regarding the ROPs for N21 and talked with a knowledgeable person. I'm now convinced that AMD changed the ROP count per Render Backend from 4 ROPs to 8 ROPs.
128 ROPs total.
- Realized a stupid oversight on my side: rogame's configuration table from AMD's driver mentions 16 L2$ Tiles (Max Texture Channel Caches).
Until now AMD has used one L2$ tile per memory channel.
With 16 it couldn't be a 384-Bit Interface. https://twitter.com/_rogame/status/1289239501647171584
16 L2$ tiles fit to a 256-Bit GDDR6 Interface.
One Channel is 16-Bit wide, for a 256-Bit Interface you have 16 channels.
Another possibility is a 2048-Bit HBM2(e) Interface.
Each Stack has 8x 128-Bit Channels, so with two stacks one has 16 Memory Channels.
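
The channel arithmetic above can be checked with a short sketch. The helper name `channel_count` is hypothetical, the channel widths (16-Bit for GDDR6, 128-Bit for HBM2) are the ones quoted in this thread, and the one-tile-per-channel rule is an assumption carried over from earlier AMD designs, not a confirmed N21 spec.

```python
# Channel widths in bits, as quoted in the thread.
GDDR6_CHANNEL_BITS = 16
HBM2_CHANNEL_BITS = 128
L2_TILES = 16  # "Max Texture Channel Caches" from the driver table


def channel_count(bus_width_bits: int, channel_bits: int) -> int:
    """Number of memory channels for a given bus width (hypothetical helper)."""
    if bus_width_bits % channel_bits != 0:
        raise ValueError("bus width must be a multiple of the channel width")
    return bus_width_bits // channel_bits


candidates = {
    "256-Bit GDDR6": channel_count(256, GDDR6_CHANNEL_BITS),
    "384-Bit GDDR6": channel_count(384, GDDR6_CHANNEL_BITS),
    "2048-Bit HBM2": channel_count(2048, HBM2_CHANNEL_BITS),
}

# Assumption: one L2$ tile per memory channel, so only interfaces
# with exactly L2_TILES channels are consistent with the driver table.
matching = [name for name, channels in candidates.items() if channels == L2_TILES]
print(candidates)
print(matching)
```

Under these assumptions the 384-Bit option comes out at 24 channels and drops out, while 256-Bit GDDR6 and 2048-Bit HBM2 both land on 16.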
IIRC @coreteks mentioned that Add-in-Board (AIB) Partners are awaiting GPU+Memory from AMD. AIBs only get a bundle with HBM; GDDR6 is not distributed by AMD.
The open source driver situation is weird.
Parameters for the unified memory controller indicate a 2048-Bit HBM2 Interface, while other commits only mentioned the GDDR6 VRAM type, reading the exact configuration from the atom firmware (iirc it wasn't just mentioned in relation to an emulation mode).
That was a reason why I was more inclined to go the GDDR6 route for N21.
Now I went the HBM route.
With 4 Shader Engines and 80 CUs there are a lot more clients which work with the L2$.
I would say it's guaranteed that AMD doubled the capacity from 256KB per L2$ tile to 512KB -> 8MB L2$ in total with 16 tiles.
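
As a rough sanity check of that capacity figure, here is a minimal sketch. The function name `total_l2_mb` is hypothetical, the tile count comes from the driver table mentioned earlier, and the 512KB tile size is my estimate, not a confirmed number.

```python
L2_TILES = 16      # from the driver configuration table
OLD_TILE_KB = 256  # per-tile capacity used so far
NEW_TILE_KB = 512  # assumed doubled capacity (speculation)


def total_l2_mb(tiles: int, tile_kb: int) -> float:
    """Total L2$ capacity in MB for a given tile count and tile size."""
    return tiles * tile_kb / 1024


print(total_l2_mb(L2_TILES, OLD_TILE_KB))  # capacity without the doubling
print(total_l2_mb(L2_TILES, NEW_TILE_KB))  # capacity with the assumed doubling
```

With 16 tiles this gives 4MB at the old tile size and 8MB at the assumed new one.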
So far that's all.
I'm very much looking forward to Ampere reviews on September 14th, some already leaked (have not read them yet).
My feelings are mixed.
RT and FP32 perf seem great, but 8nm and the perf/watt improvements are lower than what I expected.

PS: The "die shot" annotations for GA102 are not meant to be taken fully seriously.
It's an artistic marketing picture from NV.
For Volta+Turing the high-level layout aligned well with real shots; potentially it does so for Ampere too, but that's up in the air right now, especially the finer details.
You can follow @Locuza_.