NeroReflex/mesa - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
James Park	1351fcf3c3	amd: Fix warnings around variable sizes Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6162>	2021-04-23 10:37:22 +00:00
Timur Kristóf	74c467d988	aco: Mark VCC clobbered for iadd8 and iadd16 reductions on GFX6-7. On GFX6-7, the 8 and 16-bit integer add reductions use the 32-bit v_add instruction, which clobbers the VCC register. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10346>	2021-04-22 11:29:49 +00:00
Rhys Perry	776ba40115	aco: add and use Program::progress This is used when printing the program and to avoid updating register demand during post-RA liveness analysis. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Rhys Perry	2d36232e62	aco: allow SDWA sels smaller than the operand size p_extract_vector copy-propagation can create byte sels for v2b operands. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Rhys Perry	655ba1e3a9	aco: don't update register demand during RA validation It isn't intended to be accurate after RA, so num_waves can become zero, breaking the sgpr_limit calculation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Rhys Perry	0eaa5dfac0	aco: remove image parameter from get_sampler_desc() We can just check whether tex_instr is NULL instead. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Rhys Perry	3cbe9894f7	aco: set TRUNC_COORD=0 for nir_texop_tg4 Fixes black squares in Assassin's Creed: Valhalla and rendering of FidelityFX-CACAO demo. fossil-db (sienna cichlid): Totals from 3052 (2.09% of 146267) affected shaders: SpillSGPRs: 8437 -> 8646 (+2.48%) CodeSize: 30993832 -> 31116916 (+0.40%); split: -0.00%, +0.40% Instrs: 5869934 -> 5886783 (+0.29%); split: -0.00%, +0.29% Latency: 250330521 -> 250463770 (+0.05%); split: -0.00%, +0.05% InvThroughput: 59797617 -> 59814584 (+0.03%); split: -0.00%, +0.03% VClause: 92114 -> 92132 (+0.02%) SClause: 197373 -> 197338 (-0.02%); split: -0.02%, +0.01% Copies: 479482 -> 482394 (+0.61%); split: -0.01%, +0.61% Branches: 219629 -> 219635 (+0.00%) PreSGPRs: 248970 -> 249366 (+0.16%) fossil-db (polaris10): Totals from 3050 (2.06% of 147787) affected shaders: SGPRs: 282864 -> 282912 (+0.02%); split: -0.01%, +0.02% VGPRs: 242572 -> 242612 (+0.02%) SpillSGPRs: 10387 -> 10675 (+2.77%) CodeSize: 31872460 -> 31996128 (+0.39%) MaxWaves: 10924 -> 10925 (+0.01%) Instrs: 6222217 -> 6239072 (+0.27%) Latency: 317482545 -> 317773685 (+0.09%); split: -0.00%, +0.09% InvThroughput: 156149624 -> 156242072 (+0.06%); split: -0.00%, +0.06% VClause: 92295 -> 92254 (-0.04%); split: -0.05%, +0.01% SClause: 243342 -> 243321 (-0.01%); split: -0.01%, +0.00% Copies: 678902 -> 681700 (+0.41%); split: -0.00%, +0.41% Branches: 219698 -> 219703 (+0.00%) PreSGPRs: 244251 -> 244644 (+0.16%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `58f25098a0` ("radv: Use TRUNC_COORD on samplers") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3110 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Samuel Pitoiset	9434675d60	aco: fix opquantize2f16 on GFX6-7 Make sure to preserve signed zeroes. Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero on GFX6 (Pitcairn). Untested on GFX7. Fixes: `54a09545ec` ("aco: optimize a*0.0") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10319>	2021-04-19 16:33:37 +00:00
Marek Olšák	ec1ddb976a	amd/registers: rename IMG_FORMAT to GFX10_FORMAT to disambiguate the meaning Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261>	2021-04-17 02:37:49 +00:00
Marek Olšák	b878444c3a	amd: drop support for LLVM 10 It doesn't support RDNA 2. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>	2021-04-16 09:25:19 +00:00
Samuel Pitoiset	936b58378c	amd: drop support for LLVM 8 It doesn't support Navi1x and the removal enables this nice code cleanup. v2: rebase - mareko Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1) Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>	2021-04-16 09:25:19 +00:00
Michel Dänzer	d200f45875	Use explicit break instead of fall-through to break-only case clang generates a warning if there's no explicit break or fall-through annotation. The latter would be kind of silly in this case, and not robust against any future changes turning the fall-through invalid. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Michel Dänzer	2928c21eb7	Convert most remaining free-form fall-through comments to FALLTHROUGH One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is explicitly disabled. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Rhys Perry	5b8a4516e6	aco/ra: remove live-in temporary from live_out_per_block when moving it Otherwise, handle_loop_phis() might pass it to handle_live_in() and then we could have two phis for this variable. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `7c64623e94` ("aco/ra: refactor SSA repairing during register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>	2021-04-14 19:04:08 +00:00
Rhys Perry	11fde1247c	aco/ra: use original names when renaming loop carried phi operands Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `7c64623e94` ("aco/ra: refactor SSA repairing during register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>	2021-04-14 19:04:08 +00:00
Timur Kristóf	f3e004cb56	aco: Add a simple heuristic to decide early or late primitive export. Late export is theoretically better if used with LATE_ALLOC, but in practice, the early export has an advantage of lower register usage, therefore more concurrent waves. The idea of this commit is that "small" shaders benefit from early primitive export more, due to being able to launch much more waves. Let's consider a NIR shader "small" when it has only 1 block. This yields both better performance, and better stats, than always using late export. Fossil DB on Sienna: Totals from 12807 (8.76% of 146265) affected shaders: VGPRs: 609128 -> 620216 (+1.82%); split: -0.01%, +1.83% SpillSGPRs: 1458 -> 1538 (+5.49%) CodeSize: 37028204 -> 37019320 (-0.02%); split: -0.17%, +0.14% MaxWaves: 282902 -> 278516 (-1.55%) Instrs: 7163142 -> 7162925 (-0.00%); split: -0.18%, +0.18% VClause: 169285 -> 169547 (+0.15%); split: -1.15%, +1.30% SClause: 267373 -> 267151 (-0.08%); split: -0.24%, +0.16% Copies: 446442 -> 444567 (-0.42%); split: -2.68%, +2.26% Branches: 156245 -> 156195 (-0.03%); split: -0.30%, +0.26% PreSGPRs: 434701 -> 447396 (+2.92%) PreVGPRs: 527783 -> 540527 (+2.41%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	5dbab03a80	aco: Emit fewer branches for NGG VS/TES with late primitive export. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	af7d5f5b86	aco: Set block_kind_export_end in create_vs/fs_exports. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	2b312a4fd7	aco: Extract ngg_nogs_export_prim_id to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	231ef14b3d	aco: Use s_setprio 3 at the beginning of every VS and TES. The user-set priority of shaders matters very little, but we hope this might still help speed up VS input loads especially. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	4c86c7aa15	aco: Remove useless s_setprio near gs_alloc_req. We learned that the gs_alloc_req is not actually when the export space allocation happens. So it makes no sense to prioritize it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	75cd43741a	aco: Align NGG scratch size to 16 so a single ds_read can always read it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Timur Kristóf	c1346e5c22	aco: Optimize workgroup exclusive scan to better avoid bank conflicts. Previously, every wave had multiple active lanes read the LDS, and the data was processed by VALU DPP instructions. Now, only the first lane reads the LDS in order to avoid bank conflicts, and the results are processed by SALU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Daniel Schürmann	b6a28aaa8b	aco/cssa: don't create parallelcopies for constants and exec if we are able to spill these directly. Totals from 4913 (3.60% of 136546) affected shaders (Raven): SpillSGPRs: 16021 -> 15451 (-3.56%); split: -3.87%, +0.31% CodeSize: 58102020 -> 57371464 (-1.26%); split: -1.26%, +0.00% Instrs: 11411454 -> 11230105 (-1.59%); split: -1.59%, +0.00% Latency: 555706331 -> 550058635 (-1.02%); split: -1.07%, +0.05% InvThroughput: 273023354 -> 271854469 (-0.43%); split: -0.44%, +0.01% SClause: 385168 -> 385371 (+0.05%); split: -0.01%, +0.06% Copies: 1342084 -> 1175762 (-12.39%); split: -12.40%, +0.01% Branches: 392619 -> 378662 (-3.55%); split: -3.56%, +0.00% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	18ba93e673	aco/cssa: rewrite lower_to_cssa pass The previous pass was based on misconceptions and rounded up with bug fixes. The new pass is entirely rewritten and basically just one-to-one from the paper: "Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency" by B. Boissinot et al. It also incorporates the value-equality testing. The regressions are mainly due to creating parallelcopies for exec phis at loop headers (mitigated in the next commit). Totals from 4933 (3.61% of 136546) affected shaders (Raven): SpillSGPRs: 16249 -> 16527 (+1.71%); split: -0.28%, +1.99% SpillVGPRs: 1771 -> 1595 (-9.94%) CodeSize: 57544436 -> 58280304 (+1.28%); split: -0.00%, +1.28% Scratch: 176128 -> 179200 (+1.74%) Instrs: 11265783 -> 11445884 (+1.60%); split: -0.00%, +1.60% Latency: 552596156 -> 555880540 (+0.59%); split: -0.53%, +1.13% InvThroughput: 271431862 -> 273097423 (+0.61%); split: -0.18%, +0.79% VClause: 160240 -> 160241 (+0.00%); split: -0.02%, +0.02% SClause: 386863 -> 386685 (-0.05%); split: -0.07%, +0.02% Copies: 1180801 -> 1345633 (+13.96%); split: -0.02%, +13.98% Branches: 379129 -> 393052 (+3.67%); split: -0.01%, +3.69% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	9d73a4a412	aco: add new reindex_ssa() pass Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	d75c73e6a6	aco: fix kill flags on phi operands Fossil-db changes are likely due to how the CSSA pass works. Totals from 1782 (1.31% of 136546) affected shaders (Raven): CodeSize: 25333292 -> 25294020 (-0.16%); split: -0.16%, +0.00% Instrs: 4916059 -> 4908218 (-0.16%); split: -0.16%, +0.00% Latency: 282860167 -> 282707176 (-0.05%); split: -0.08%, +0.03% InvThroughput: 136487564 -> 136394958 (-0.07%); split: -0.12%, +0.05% VClause: 74791 -> 74795 (+0.01%) Copies: 542115 -> 534280 (-1.45%); split: -1.48%, +0.04% Branches: 168977 -> 168966 (-0.01%); split: -0.01%, +0.01% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	13e4fed01f	aco: lower p_spill with constants correctly Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	4a57787006	aco/spill: use correct next_use_distances at loop header To decide which variables to spill, we must use the distances at the beginning of the loop-header, and not the distances at the end of the loop-preheader. The difference is that the former includes phis which are viable to be spilled as opposed to the phi operands which would be reloaded by add_coupling_code(), ending up in potentially too high register pressure before the loop. Totals from 206 (0.15% of 136546) affected shaders (Raven): SpillSGPRs: 5154 -> 5000 (-2.99%) CodeSize: 3654072 -> 3647184 (-0.19%); split: -0.19%, +0.00% Instrs: 701482 -> 700526 (-0.14%); split: -0.14%, +0.00% Latency: 40988780 -> 40872506 (-0.28%); split: -0.29%, +0.00% InvThroughput: 20364560 -> 20306006 (-0.29%) SClause: 20192 -> 20198 (+0.03%) Copies: 77732 -> 77688 (-0.06%); split: -0.08%, +0.03% Branches: 24204 -> 24050 (-0.64%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	b56ea19111	aco/spill: refactor live-in registerDemand calculation This also fixes some hypothetical issue for loops without phis and for loops with higher register pressure at the end of the loop preheader. No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	282eacc3e0	aco/spill: refactor some more spill decision taking No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	dfb10e4f4b	aco/spill: don't count phis as variable access This increases the chance of evicting phis if these have longer next-use distances. Totals from 6 (0.00% of 146267) affected shaders (Navi10): CodeSize: 476992 -> 464388 (-2.64%) Instrs: 81785 -> 79952 (-2.24%) VClause: 2380 -> 2374 (-0.25%) Copies: 26836 -> 25131 (-6.35%) Branches: 2494 -> 2492 (-0.08%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	b2a6346df7	aco/spill: spill phi constants and exec directly to VGPR This lets us avoid some CSSA copies. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	99936d7142	aco/spill: reload spilled exec masks directly to exec This handles the case of exec = p_linear_phi %a, %b where %a or %b might have been spilled. By directly reloading these variables into the exec mask register, we can avoid additional CSSA parallelcopies. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	beb292343a	aco/spill: refactor spill decision taking No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Rhys Perry	d8f12fd421	aco: fix 16-bit f2{u8,i8} on GFX6/7 Not really tested. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>	2021-04-12 16:19:46 +00:00
Rhys Perry	d0e15b8c22	aco: fix 16-bit u2f32 This shouldn't sign-extend. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>	2021-04-12 16:19:46 +00:00
Samuel Pitoiset	1ad295ed6f	radv: allow to force VRS rates on GFX10.3 with RADV_FORCE_VRS This allows to force the VRS rates via RADV_FORCE_VRS, the supported values are 2x2, 1x2 and 2x1. This supports the primitive shading rate mode for non GUI elements. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7794>	2021-04-09 14:47:53 +02:00
Bas Nieuwenhuizen	580f1ac473	nir: Extract shader_info->cs.shared_size out of union. It is valid for all stages, just 0 for most of them. In particular mesh/task shaders might be using it. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10094>	2021-04-08 14:39:28 +00:00
Rhys Perry	961361cdc9	aco: ensure loops nested in a WQM loop are in WQM Fixes a potential empty exec mask in this situation: enter_wqm() loop { ... wqm code ... enter_exact() loop { ... no wqm code ... } } Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `f0074a6f05` ("aco: do not flag all blocks WQM to ensure we enter all nested loops in WQM") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4546 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10075>	2021-04-08 09:56:25 +00:00
Rhys Perry	835c5b7ebf	aco: fix integer tg4 workaround with unnormalized coordinates Same as LLVM from `2abf62d348`. fossil-db (GFX8): Totals from 15 (0.01% of 147787) affected shaders: VGPRs: 744 -> 748 (+0.54%) CodeSize: 100472 -> 100732 (+0.26%) Instrs: 19995 -> 20059 (+0.32%) Latency: 1001530 -> 1001859 (+0.03%) InvThroughput: 378508 -> 378747 (+0.06%) SClause: 676 -> 675 (-0.15%) Copies: 1655 -> 1654 (-0.06%) PreSGPRs: 735 -> 742 (+0.95%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10053>	2021-04-07 15:21:51 +00:00
Samuel Pitoiset	65bca137bd	aco: implement a workaround for the image load DCC hw bug on GFX10.3 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9919>	2021-04-05 08:54:55 +00:00
Samuel Pitoiset	3dfb453626	aco: fix get_sampler_desc() for image loads Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9919>	2021-04-05 08:54:55 +00:00
Samuel Pitoiset	8fa7aa16ce	radv: change RADV_FORCE_FAMILY to use family name instead of LLVM processor name gfx1030 doesn't allow us to specify e.g. dimgrey. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9974>	2021-04-05 06:53:55 +00:00
Rhys Perry	e76531ea7b	aco/tests: fix isel.sparse.clause for LLVM 12+ Seems disassembly of this instruction was fixed in LLVM 12. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4154 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9694>	2021-03-29 15:05:33 +00:00
Tony Wasserka	8557ac9a12	aco/isel: Add documentation for (u)int64->f16 conversion The upper 32 bits are truncated before converting, which still produces correct results since they never meaningfully contribute to the result. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	b5be03f39f	aco/isel: Fix large inputs being truncated in int32->f16 conversions The previous code produced incorrect results for inputs outside the range [INT16_MIN, INT16_MAX]. A problematic case is e.g. i2f16 32768, which previously would be converted to -32768.0 instead of returning the exactly representable floating point result. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	4ce8e422e3	aco/isel: Add documentation and asserts for convert_int This function has evolved to be a generic helper function used throughout the file, so having those assumptions written down explicitly and document unsupported edge cases should help prevent incorrect use. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	1e03796fa4	aco/isel: Don't request sign extension when truncating signed integers This doesn't change semantics but allows us to reject this potentially ambiguous configuration in convert_int in a later change. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
Tony Wasserka	3a2b055726	aco/isel: Fix i64/u64->float32 conversion for large inputs Previously, inputs such as 0x100000000 would have their upper 32-bits ignored despite being representable by 32-bit floats. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9597>	2021-03-26 14:39:23 +00:00
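
The two conversion fixes above (b5be03f39f and 3a2b055726) describe inputs being narrowed before the int-to-float conversion. The following is a minimal Python sketch of that arithmetic only, not ACO's instruction selection code: every function name here is hypothetical, and the "truncating" variants are an assumed model of the old behavior based solely on the examples in the commit messages (i2f16 32768 and the input 0x100000000).

```python
import struct


def wrap_signed(value: int, bits: int) -> int:
    """Keep the low `bits` bits and reinterpret them as two's complement."""
    low = value & ((1 << bits) - 1)
    return low - (1 << bits) if low >> (bits - 1) else low


def to_f16(value: float) -> float:
    """Round-trip through IEEE 754 binary16 (struct format 'e').

    struct raises OverflowError for magnitudes too large for binary16;
    the inputs used below are all exactly representable.
    """
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_f32(value: float) -> float:
    """Round-trip through IEEE 754 binary32 (struct format 'f')."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Assumed model of the old int32 -> f16 path: narrow to 16 bits first.
def i2f16_truncating(value: int) -> float:
    return to_f16(float(wrap_signed(value, 16)))


# Converting the full 32-bit value keeps exactly representable results.
def i2f16_direct(value: int) -> float:
    return to_f16(float(value))


# Assumed model of the old u64 -> f32 path: upper 32 bits ignored.
def u64_to_f32_low_only(value: int) -> float:
    return to_f32(float(value & 0xFFFFFFFF))


# Converting the full 64-bit value.
def u64_to_f32_full(value: int) -> float:
    return to_f32(float(value))


if __name__ == "__main__":
    print(i2f16_truncating(32768), i2f16_direct(32768))
    print(u64_to_f32_low_only(0x100000000), u64_to_f32_full(0x100000000))
```

Both 32768 (2^15) and 0x100000000 (2^32) are powers of two, so the target float formats hold them exactly; any wrong result in the sketch comes from the narrowing step rather than from rounding.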


1390 commits