This removes s_load_dword latency for tess rings.
We need just 1 SGPR for the address if we use 64K alignment. The final asm
for recreating the descriptor is:
// s2 is (address >> 16)
s_mov_b32 s3, 0
s_lshl_b64 s[4:5], s[2:3], 16
s_mov_b32 s6, -1
s_mov_b32 s7, 0x27fac
v2: bitcast the descriptor type from v2i64 to v4i32
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
|
||
|---|---|---|
| .. | ||
| amd | ||
| compiler | ||
| egl | ||
| gallium | ||
| gbm | ||
| getopt | ||
| glx | ||
| gtest | ||
| hgl | ||
| intel | ||
| loader | ||
| mapi | ||
| mesa | ||
| util | ||
| vulkan | ||
| Makefile.am | ||
| SConscript | ||