NeroReflex/mesa - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Eric Anholt	eea6f21cbd	freedreno: Fix invalid read when a block has no instructions. We can't deref list_(first/last)_entries unless we know we have at least one. Instead, just use our IP we've been tracking as we go to set up the start ip, and fill in the end IP as we walk instructions. Fixes a complaint in valgrind on dEQP-GLES3.functional.transform_feedback.* which sometimes has an empty main (non-END) block when the VS inputs are just directly mapped to outputs without any ALU ops. Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-16 22:02:43 +00:00
Rob Clark	b4df115d3f	freedreno/a6xx: pre-calculate userconst stateobj size The AnTuTu "garden" benchmark overflows the fixed size constbuffer stateobject, so let's be more clever and calculate (a potentially slightly pessimistic) actual size. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-09-12 18:07:20 -07:00
Kristian H. Kristensen	30ab3e39fd	freedreno/a6xx: Implement primitive count queries on GPU The driver can't determine PIPE_QUERY_PRIMITIVES_GENERATED or PIPE_QUERY_PRIMITIVES_EMITTED once we support geometry or tessellation, since these stages add primitives at runtime. Use the WRITE_PRIMITIVE_COUNTS event to write back the primitive counts and implement a hw query for this. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-09-06 09:53:28 -07:00
Jonathan Marek	feea5986a9	freedreno/a2xx: formats update For render formats, update fd2_pipe2color to only work with HW supported render formats, and remove the format whitelist is_format_supported. This patch enables float render formats (which work). For vertex/texture formats, use a generic function which translates using the bitsize of the channels. Since we fake support for some vertex formats, check for these in is_format_supported to avoid enabling them as sampler formats. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-09-06 02:24:29 +00:00
Jonathan Marek	88ca73bcd0	freedreno/a2xx: implement polygon offset Fixes failures in the following deqp tests: dEQP-GLES2.functional.polygon_offset.* Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 02:24:29 +00:00
Vasily Khoruzhick	9367d2ca37	nir: allow specifying filter callback in lower_alu_to_scalar Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>	2019-09-06 01:51:28 +00:00
Rob Clark	9baa72b7fc	freedreno/ir3: allow copy propagation for relative This appears to work fine (with the additional constraint of keeping the indirect load in the same block that a0.x was loaded). We can probably lift this restriction on earlier gens after testing. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	d9ad6f54dc	freedreno/ir3: fix cp cmps.s opt Need to use ir3_instr_set_address(), otherwise the instruction might not get added to the indirects table. This becomes a problem when we turn on copy propagation for relative accesses, as check_instr() in the sched pass won't realize there is an indirect consumer of address register load that is ready to be scheduled. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	e59bfc820b	freedreno/ir3: assert that only single address An instruction can reference only a single address register value. Add an assert to catch bugs. Also, address value should also be local to the same block as the instruction. (The one spot where changing the instruction address is actually legit needs to clear the address first.) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	f94f22e87a	freedreno/ir3: fix mad copy propagation special case After the next patch enabling copy propagation for relative sources, we'll need to dereference the n'th src in valid_flags(), so we actually need to swap the sources before calling valid_flags(). But the logic was already a bit cumbersome, so move it into a helper function. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	1fd6a91d4a	freedreno/ir3: fix addr/pred spilling The live_values and use_count was not being properly updated. This starts triggering problems with the next patch, where we allow copy propagation for RELATIV access. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Rob Clark	50a91fbf87	freedreno/ir3: cleanup "partially const" ubo srcs Move the constant part of the indirect offset into nir intrinsic base. When we have multiple indirect accesses with different constant offsets, this lets other opt passes clean up things to use a single address register value. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-06 00:13:44 +00:00
Eric Engestrom	c4969b0a25	freedreno/drm-shim: fix mem leak Fixes: `494ecef6b4` ("freedreno: Add support for drm-shim.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-04 00:18:37 +01:00
Rob Clark	1ef459297c	freedreno/ir3: use uniform base When lowering from ubo, use the constant base field in the load_uniform instruction for the constant part of the offset. Doesn't change much for constant indexing, but this will help for indirect indexing because constant-folding can't completely clean up the result. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Rob Clark	305bcdf992	freedreno/drm: fix 64b iova shifts Should shift before splitting 64b iova into dwords Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-09-03 14:10:57 -07:00
Alyssa Rosenzweig	3f9dc97124	freedreno/ir3: Link directly to Sethi-Ullman paper Allow a direct link to the PDF itself from the authors themselves, rather than a paywall splash page. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Rob Clark <robdclark@chromium.org>	2019-08-30 15:50:22 -07:00
Rob Clark	6167a63839	freedreno/ir3: do better job of marking convergence points Fixes: dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_vertex dEQP-GLES3.functional.shaders.switch.switch_in_do_while_loop_dynamic_fragment Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-28 15:25:27 -07:00
Rob Clark	6af70aa2b4	freedreno/ir3: maintain predecessors/successors While resolving jumps to skip intermediate jumps from the structured CFG, maintain the successors and predecessors correctly. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-28 15:25:25 -07:00
Rob Clark	06bc4875ff	freedreno/ir3: convert block->predecessors to set Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-28 15:25:19 -07:00
Jason Ekstrand	951cf94521	nir: Add explicit signs to image min/max intrinsics This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-21 17:19:55 +00:00
Rob Clark	882d53d8e3	freedreno/ir3+a6xx: same VBO state for draw/binning Worth ~+20% on gl_driver2 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	4a188e4215	freedreno/ir3: track # of driver params To avoid emitting unneeded const state. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:11:26 -07:00
Rob Clark	5722149bf1	freedreno/ir3: drop unneeded ir3_ra() args Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-13 08:08:07 -07:00
Rhys Perry	da8ed68aca	nir: replace nir_move_load_const() with nir_opt_sink() This is mostly the same as nir_move_load_const() but can also move undef instructions, comparisons and some intrinsics (being careful with loops). v2: actually delete nir_move_load_const.c v3: fix nir_opt_sink() usage in freedreno v3: update Makefile.sources v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def v4: handle if uses v4: fix handling of nested loops v5: re-write adjust_block_for_loops v5: re-write setting of use_block for if uses Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 22:01:30 +00:00
Caio Marcelo de Oliveira Filho	5ed4e31c08	spirv: Drop lower_workgroup_access_to_offsets Intel drivers are not using this anymore, and turnip still doesn't have Compute Shaders, so it won't make a difference to stop using this option. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rob Clark <robdclark@chromium.org>	2019-08-10 22:15:35 -07:00
John Stultz	fcfa2d1447	mesa: freedreno: Android.registers.mk: Fix up register xml.h file generation The current Android.registers.mk file causes build failures that look like: FAILED: external/mesa3d/src/freedreno/Android.registers.mk:49: error: implicit rules are obsolete: out/target/product/linaro_db845c/gen/STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/%.xml.h Caused by the following Android build rule change: https://android.googlesource.com/platform/build/+/HEAD/Changes.md#implicit_rules I tried to replace this with something similar to the static pattern suggested in the URL above, but ended up getting all the xml.h files generated using only the first a2xx.xml source file. So I've fallen back to explicitly defining the make rules for each. Additionally, we needed to provide the proper LOCAL_EXPORT_C_INCLUDE_DIRS and add the defined static library to the components that depend on the register headers. Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-08-07 02:18:38 +00:00
John Stultz	96baf052b2	mesa: Add ir3/ir3_nir_imul.c generation to Android.mk With current master we're seeing build failures with AOSP: error: undefined symbol: ir3_nir_lower_imul This is due to the ir3_nir_imul.c file not being generated in the Android.mk files. This patch simply adds it to the Android build, after which things build and boot OK on db410c. Cc: Rob Clark <robdclark@chromium.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: Amit Pundir <amit.pundir@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Alistair Strachan <astrachan@google.com> Cc: Greg Hartman <ghartman@google.com> Cc: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: John Stultz <john.stultz@linaro.org>	2019-08-07 02:18:19 +00:00
Eric Engestrom	d2d85b950d	meson: replace libmesa_util with idep_mesautil This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually. Next commit will remove the ones that were only added for that reason. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-03 00:08:37 +00:00
Rob Clark	44f3c1cf01	freedreno: update registers Pull in some updates of VSC regs Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-02 10:24:14 -07:00
Rob Clark	e2bb3e84ab	freedreno/drm: convert ring_pool to child_pool Worth another couple percent at driver2 Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-02 10:24:14 -07:00
Rob Clark	9ac23794c9	freedreno/drm: remove idx_lock Since it ends up contended, it is a bit of a bottleneck for workloads with high driver overhead. Worth nearly +10% at gfxbench driver2. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-08-02 10:24:14 -07:00
Jonathan Marek	d8584c5cf2	freedreno: a2xx: implement texture tiling Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-08-02 15:58:22 +00:00
Rob Clark	73cc2dc084	freedreno/ir3: fix for array/reg store vs meta instructions fishgl.com has a shader which does roughly: foo = texture(...); if (bar) foo = texture(...); after lowering phi webs to regs we end up w/ a vec4 reg (array). But since it was not an indirect access, we try to skip the extra mov. This results that the per-component fanout (split) meta instructions store directly to the reg (array). Which doesn't work out in RA. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-07-29 15:15:31 -07:00
Eric Anholt	91986fbbdb	freedreno: Fix data race on making the shader's id. The value is only used for IR3_DBG_DISASM, but it cleans up the helgrind output. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-29 12:50:49 -07:00
Eric Anholt	6f0521b78c	freedreno: Take a lock around shader variant creation. Shaders are shared across contexts in gallium (part of making it so that you get shader compile at link time, for shader-db and to reduce compiles at draw time). So, we need to protect from variant creation for a shader from multiple threads at the same time. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-29 12:50:49 -07:00
Eric Anholt	6e3b220ad3	freedreno: Fix data races with allocating/freeing struct ir3. There is a single ir3_compiler in the screen, and each context may be compiling ir3 shaders, which call ir3_create. ralloc doesn't do any locking on its own, so eventually you can end up racing to break ralloc's linked lists. We really don't want struct ir3 to live as long as the compiler (maybe struct ir3_shader's lifetime, if anything), so you'd better be freeing it anyway. Fixes: `8fe2076243` ("freedreno/ir3: convert over to ralloc") Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-29 12:50:49 -07:00
Eric Engestrom	d2de5b6ba2	anv+tu+radv: delete unusable dev_icd.json As per previous commit, Meson doesn't support using uninstalled libs, they're simply not ready until `ninja install` is ran, so delete them. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> # for anv Reviewed-by: Eric Anholt <eric@anholt.net> # for tu Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> # for radv	2019-07-26 14:47:53 +00:00
Eric Anholt	494ecef6b4	freedreno: Add support for drm-shim. I'm using this for shader-db analysis on x86_64 systems. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-07-25 08:56:19 -07:00
Eric Anholt	0d8a4c67cf	freedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper. Cuts a bunch of boilerplate. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-18 11:28:56 -07:00
Eric Anholt	56f4ede73d	freedreno: Convert load_barycentric_at_sample to the NIR lowering helper. Cuts out a ton of boilerplate. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-18 11:28:56 -07:00
Eric Anholt	61098baf42	freedreno: Convert load_barycentric_at_offset to the NIR lowering helper. Cuts out a ton of boilerplate. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-07-18 11:28:56 -07:00
Kristian H. Kristensen	e03259974e	freedreno: Generate headers from xml files Reviewed-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Rob Clark <robdclark@gmail.com>	2019-07-10 22:05:02 +00:00
Eric Engestrom	1abae9e54a	tu: add exported symbols check Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2019-07-10 11:27:51 +00:00
Sagar Ghuge	456557a837	nir: Add lower_rotate flag and set to true in all drivers Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-01 10:14:22 -07:00
Rob Clark	02893fe73a	freedreno: update generated registers Corrects the a3xx texconst state for TILE_MODE. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-07-01 06:15:52 -07:00
Rob Clark	9753d7381c	freedreno/ir3: small cleanup `target` cannot be NULL here. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-06-28 13:02:59 -07:00
Rob Clark	016a9ab2f9	freedreno/ir3: fix missing (ss) in dummy bary.f case In case we need to insert a dummy bary.f for the (ei) flag, it also needs (ss) so we don't release varying storage to the next VS wave before the ldlv completed. Fixes random failures in: dEQP-GLES3.functional.transform_feedback.random.interleaved.lines.* Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2019-06-28 13:02:59 -07:00
Eric Anholt	5c4289dd4b	freedreno: Only upload the used part of UBO0 to the constant buffer. We were pessimistically uploading all of it in case of indirection, but we can just bump that when we encounter indirection. total constlen in shared programs: 2529623 -> 2485933 (-1.73%) Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-24 14:23:07 -07:00
Eric Anholt	852704976a	freedreno: Stop treating UBO 0 specially in UBO uploading. ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we need to upload (all of it, since it will lower indirect UBO 0 accesses from load_ubo back to indirection on the constant buffer). Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-24 14:23:07 -07:00
Daniel Schürmann	165b7f3a44	nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec. That is: the five least significant bits provide the values of 'bits' and 'offset' which is the case for all hardware currently supported by NIR and using the bfm/bfe instructions. This patch also changes the lowering of bitfield_insert/extract using shifts to not use bfm and removes the flag 'lower_bfm'. Tested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-24 18:42:20 +02:00
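
Note on 305bcdf992 ("freedreno/drm: fix 64b iova shifts"): the message says the shift has to be applied before the 64-bit iova is split into dwords. The sketch below is an illustration of why the order matters, not the Mesa code. The struct, the function names, and the convention that a negative shift means "shift right" are all assumptions made for the example. Shifting each 32-bit half separately loses the bits that should cross the dword boundary.

```c
#include <stdint.h>

/* Hypothetical pair of dwords as they would be written to a command stream. */
struct dword_pair {
    uint32_t lo;
    uint32_t hi;
};

/* Correct order: shift the full 64-bit address, then split it.
 * Assumed convention: negative shift = right shift, positive = left shift.
 * The magnitude of shift is assumed to be below 32. */
static struct dword_pair split_shift_first(uint64_t iova, int shift)
{
    uint64_t v = shift < 0 ? iova >> -shift : iova << shift;
    struct dword_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

/* Wrong order, kept for comparison: split first, then shift each half.
 * Bits that should move between the halves are dropped. */
static struct dword_pair split_then_shift(uint64_t iova, int shift)
{
    uint32_t lo = (uint32_t)iova;
    uint32_t hi = (uint32_t)(iova >> 32);
    struct dword_pair p;
    p.lo = shift < 0 ? lo >> -shift : lo << shift;
    p.hi = shift < 0 ? hi >> -shift : hi << shift;
    return p;
}
```

For an address of 0x123456780 and a right shift of 4, the first function yields a low dword of 0x12345678, while the second yields 0x02345678 because bit 32 never reaches the low half.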
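
Note on 165b7f3a44 ("nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec"): the message states that only the five least significant bits of 'bits' and 'offset' are used. A minimal C sketch of that masking rule follows. The function names are hypothetical, and the argument order and the handling of fields that run past bit 31 are assumptions modelled on the SM5 bitfield-extract description, not copied from NIR.

```c
#include <stdint.h>

/* Build a mask of `bits` ones starting at bit `offset`.
 * Only the low five bits of each operand are honoured. */
static uint32_t sketch_bfm(uint32_t bits, uint32_t offset)
{
    bits &= 0x1f;
    offset &= 0x1f;
    return ((1u << bits) - 1u) << offset;
}

/* Unsigned bitfield extract: `bits` bits of `base` starting at `offset`. */
static uint32_t sketch_ubfe(uint32_t base, uint32_t offset, uint32_t bits)
{
    offset &= 0x1f;
    bits &= 0x1f;
    if (bits == 0)
        return 0;
    if (offset + bits < 32)
        return (base << (32 - bits - offset)) >> (32 - bits);
    /* Field reaches bit 31: just shift it down. */
    return base >> offset;
}

/* Signed bitfield extract: same field, sign-extended from its top bit. */
static int32_t sketch_ibfe(int32_t base, uint32_t offset, uint32_t bits)
{
    offset &= 0x1f;
    bits &= 0x1f;
    if (bits == 0)
        return 0;
    /* Effective field width once the field is clipped at bit 31. */
    uint32_t width = (offset + bits < 32) ? bits : 32 - offset;
    uint32_t field = sketch_ubfe((uint32_t)base, offset, bits);
    uint32_t sign = 1u << (width - 1);
    /* Portable sign extension without shifting a negative value. */
    return (int32_t)((field ^ sign) - sign);
}
```

Under this sketch a 'bits' operand of 36 behaves like 4, and a 'bits' operand of 32 behaves like 0, which is the consequence of the five-bit rule that the commit message describes.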
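
Note on eea6f21cbd ("freedreno: Fix invalid read when a block has no instructions"): the fix described there avoids looking at the first or last list entry of a block and instead uses the running instruction pointer. The sketch below shows that idea with a plain instruction count per block rather than Mesa's list type. The struct, its fields, and the choice to leave end_ip equal to start_ip for an empty block are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical block: only an instruction count, no real instruction list. */
struct sketch_block {
    size_t num_instrs;
    unsigned start_ip;
    unsigned end_ip;
};

/* Assign start/end IPs from a running counter. Nothing here reads the
 * first or last instruction, so an empty block is handled safely.
 * Returns the total number of instructions numbered. */
static unsigned sketch_assign_ips(struct sketch_block *blocks, size_t num_blocks)
{
    unsigned ip = 0;
    for (size_t i = 0; i < num_blocks; i++) {
        struct sketch_block *b = &blocks[i];
        b->start_ip = ip;
        /* Assumption: an empty block keeps end_ip == start_ip. */
        b->end_ip = ip;
        for (size_t j = 0; j < b->num_instrs; j++)
            b->end_ip = ip++; /* filled in while walking instructions */
    }
    return ip;
}
```

With blocks of 2, 0 and 3 instructions, the empty middle block gets start_ip 2 without any dereference of a nonexistent entry, and the third block starts at the same IP.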
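
Note on 9367d2ca37 ("nir: allow specifying filter callback in lower_alu_to_scalar"): the message motivates a per-instruction filter with the Utgard PP case, where a vector select is supported as long as its condition is scalar. The following sketch shows the general shape of such a callback-driven pass. Every type and function name is hypothetical, and the convention that the callback returns true for instructions that should be lowered is an assumption, not a statement about the NIR API.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_op { SKETCH_OP_ADD, SKETCH_OP_SELECT };

/* Hypothetical ALU instruction, reduced to what the filter needs. */
struct sketch_alu {
    enum sketch_op op;
    unsigned num_components;
    bool cond_is_scalar; /* only meaningful for SKETCH_OP_SELECT */
};

/* Assumed convention: return true if the instruction should be scalarized. */
typedef bool (*sketch_filter_cb)(const struct sketch_alu *instr, const void *data);

/* Count how many vector instructions the pass would scalarize.
 * A NULL callback means "lower every vector instruction". */
static size_t sketch_count_lowered(const struct sketch_alu *instrs, size_t n,
                                   sketch_filter_cb cb, const void *data)
{
    size_t lowered = 0;
    for (size_t i = 0; i < n; i++) {
        if (instrs[i].num_components <= 1)
            continue; /* already scalar */
        if (cb && !cb(&instrs[i], data))
            continue; /* hardware can handle it as a vector */
        lowered++;
    }
    return lowered;
}

/* Example filter: keep vector selects whose condition is scalar. */
static bool sketch_lower_unless_scalar_cond_select(const struct sketch_alu *instr,
                                                   const void *data)
{
    (void)data;
    return !(instr->op == SKETCH_OP_SELECT && instr->cond_is_scalar);
}
```

Compared with a fixed set of opcodes, a callback can inspect operands, which is the extra flexibility the commit message asks for: the same opcode is lowered in one case and kept in another.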


365 commits