mesa-24.0.1

-----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEV1Ud4VuWj2NBwkj2jY4xr8MkKKYFAmXNKkcSHGVyaWNAZW5n
 ZXN0cm9tLmNoAAoJEI2OMa/DJCimfL4H/ipoFpuC6KzOvRXOlZZJVfvba2nvgElc
 Z3O8yFrlbi/t0ATZVM4OTf0XacUAxX5NP13hwjuFITN4O9ePy3+nX1V7ZxnOHeuv
 3G3H/kEkjDd+NP711aw6Ggf5XU7edsPXrZRkuMS0XVAIz3vCO40OrwIRUxT2fJHd
 mlV2XwbywunYP3kzDuwZ0FvtKC6Ov0K3Hz6fjtZP91VqZ9M+2/xy2ADR/FQPhy7N
 nMBvrOI+mc2d+z0pv/4DcevmMXxI+uCnA/fakK6nnJ8QkkhaDNix23Hh0+kKzROz
 cmBdpxDaRl0cOcVfsuogPvnqN63uvEVkbgaqduGmTZz2vKK1OH7PsHA=
 =U7rU
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEzgD2pEY1++nF3Ggr3Ztj+AXPXAMFAmXOFrwACgkQ3Ztj+AXP
 XANKQQ/7BqBv7HPilNIP+c5wbtkrmYuhgYXtdtd5TKJw7axi/Si3EmcfAKHwoEIw
 A6fDpZ7pZikkn6sKKynvzElIEkkh9SBSTwy8I3emtWTRW8umkPQH54voasAGLqYW
 j76eGyuk+Ux/6JQ8rJQMsykQNlynoeZ6R5UoHJ1x5svqS7+/hEj3MAjodhJWUMP6
 zmxBEykrOGtVFk/Gmb3p+g+YOMHB5Fh7WipgnuPbCidxwL2yXGDWtew4OQWt/TwR
 qgJg6XHLVHJ3LG32zCnYvu80AVnBWTNbmV0yUg+810JfOGf/FuefZBBgOBTopq49
 S3wBtBeYJRO4tsze5rx93Nxt/+QHic+s0ExLGro3OHU8Pl2cycdieZ8qij/sgnP6
 aJnTzNv+RX9+TsAJVfiqgphNiOf4HMVYL2ZrRSTWDZNYPLUEcQkrjeOlDT1LzDt1
 0zl28PFJsSScnIoSmdaPSz3oB2RFy3qo9JrQtjtCQSha5YmUZR0sko8lnCBKZNPt
 yyipYHRc6mh2XC3d4nBICT9gpAniaLetOe9enUBElV3k7C6jGyQyUEuL0LFfnUne
 QrhB3baiGKYrskdheXGR1lawZcV/DfutoVW+Ro1m5eYJJq120ZpcYn3aaHFd0zO6
 GZUuUaFt9KoNku2AUDuXnk9Qkrh3oKi4iPgsL9t24IpsLPGn1m4=
 =hJm5
 -----END PGP SIGNATURE-----

Merge tag 'mesa-24.0.1' into 24/neroreflex

mesa-24.0.1
Denis 2024-02-15 14:50:48 +01:00
commit f688dfd9cf
76 changed files with 10841 additions and 318 deletions

@ -463,7 +463,6 @@ debian-clang:
BUILDTYPE: debug
LLVM_VERSION: 15
UNWIND: "enabled"
GALLIUM_DUMP_CPU: "true"
C_ARGS: >
-Wno-error=constant-conversion
-Wno-error=enum-conversion

File diff suppressed because it is too large

@ -1 +1 @@
24.0.0
24.0.1

@ -3,6 +3,8 @@ Release Notes
The release notes summarize what's new or changed in each Mesa release.
- :doc:`24.0.1 release notes <relnotes/24.0.1>`
- :doc:`24.0.0 release notes <relnotes/24.0.0>`
- :doc:`23.3.3 release notes <relnotes/23.3.3>`
- :doc:`23.3.2 release notes <relnotes/23.3.2>`
- :doc:`23.3.1 release notes <relnotes/23.3.1>`
@ -407,6 +409,8 @@ The release notes summarize what's new or changed in each Mesa release.
:maxdepth: 1
:hidden:
24.0.1 <relnotes/24.0.1>
24.0.0 <relnotes/24.0.0>
23.3.3 <relnotes/23.3.3>
23.3.2 <relnotes/23.3.2>
23.3.1 <relnotes/23.3.1>

docs/relnotes/24.0.0.rst (new file, 4455 lines)

File diff suppressed because it is too large

docs/relnotes/24.0.1.rst (new file, 191 lines)

@ -0,0 +1,191 @@
Mesa 24.0.1 Release Notes / 2024-02-14
======================================

Mesa 24.0.1 is a bug fix release which fixes bugs found since the 24.0.0 release.

Mesa 24.0.1 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.

Mesa 24.0.1 implements the Vulkan 1.3 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.

SHA256 checksum
---------------

::

    TBD.

New features
------------

- None

Bug fixes
---------

- rusticl: clEnqueueFillBuffer (among others) fails on buffers created from GL object.
- [ADL] gpu hang on dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics
- Turnip spam on non-turnip devices
- Intermittent compiler failures when building valhall tests
- panfrost: graphical artifacts on T604 (T600)
- Dying Light native artifacts on Intel A770
- r300: Amnesia: The Dark Descent heavy corruption
- [ANV/DG2] Age of Empires IV fullscreen "banding" artefacts
- [mtl][anv] dEQP-VK.pipeline.monolithic.depth.format.d32_sfloat.compare_ops.* failures when run multithreaded
- [mtl][anv] flaky tests in pipeline.monolithic.extended_dynamic_state*stencil_state_face* series
- Broken colors/dual-source blending on PinePhone (Pro) since 23.1.0
- Regression between 23.0.4 and 23.1.0: texture glitches in osgEarth
- radeonsi unsynchronized flips/tearing with KMS DRM rendering on 780M

Changes
-------

Blisto (1):

- driconf: set vk_x11_strict_image_count for Atlas Fallen Vulkan

Boris Brezillon (2):

- panfrost: Pad compute jobs with zeros on v4
- pan/va: Add missing valhall_enums dep to valhall_disasm

Christian Duerr (1):

- panfrost: Fix dual-source blending

Connor Abbott (1):

- ir3/ra: Fix bug with collect source handling

Corentin Noël (1):

- zink: Only call reapply_color_write if EXT_color_write_enable is available

Danylo Piliaiev (1):

- tu: Do not print anything on systems without Adreno GPU

Dave Airlie (5):

- zink: use sparse residency for buffers.
- radv: fix correct padding on uvd
- radv: init decoder ip block earlier.
- radv/uvd: uvd kernel checks for full dpb allocation.
- radv: don't submit 0 length on UVD either.

David Heidelberg (1):

- meson: upgrade zlib wrap to 1.3.1

David Rosca (2):

- frontends/va: Fix updating AV1 rate control parameters
- radeonsi/vcn: Don't reinitialize encode session on bitrate/fps change

Eric Engestrom (7):

- docs: add release notes for 24.0.0
- docs: add sha256sum for 24.0.0
- .pick_status.json: Update to fa8e0ba3f739cb46cf7bb709903c0206f240c584
- vk/util: fix 'beta' check for physical device features
- vk/util: fix 'beta' check for physical device properties
- panfrost: fix UB caused by shifting signed int too far
- .pick_status.json: Update to 90eae30bcb84d54dc871ddbb8355f729cf8fa900

Friedrich Vock (2):

- radv/rt: Write inactive node data in ALWAYS_ACTIVE workaround
- radv,driconf: Enable active AS leaf workaround for Jedi Survivor

Georg Lehmann (3):

- aco/gfx11+: disable v_pk_fmac_f16_dpp
- aco: don't remove branches that skip v_writelane_b32
- aco/gfx11+: limit hard clauses to 32 instructions

José Roberto de Souza (2):

- iris: Fix return of iris_wait_syncobj()
- intel: Fix intel_get_mesh_urb_config()

Karol Herbst (3):

- nir/lower_cl_images: record image_buffers and msaa_images
- rusticl/mem: properly handle buffers
- rusticl/mem: support GL_TEXTURE_BUFFER

Kenneth Graunke (1):

- driconf: Advertise GL_EXT_shader_image_load_store on iris for SVP13

Konstantin Seurer (1):

- radv/sqtt: Handle ray tracing pipelines with no traversal shader

Lepton Wu (1):

- llvmpipe: Set "+64bit" for X86_64

Lionel Landwerlin (4):

- anv: don't unmap AUX ranges at BO delete
- intel/fs: rerun divergence prior to lowering non-uniform interpolate at sample
- anv: fix incorrect flushing on shader query copy
- anv: fix buffer marker cache flush issues on MTL

M Henning (1):

- nvk: Don't clobber vb0 after repeated blits

Mark Janes (1):

- hasvk: add missing linker arguments

Mike Blumenkrantz (2):

- mesa: plumb errors through to texture allocation
- nir/lower_io: fix handling for compact arrays with indirect derefs

Pavel Ondračka (1):

- r300: fix vs output register indexing

Pierre-Eric Pelloux-Prayer (1):

- egl/drm: flush before calling get_back_bo

Rhys Perry (1):

- aco: fix >8 byte linear vgpr copies

Rob Clark (1):

- freedreno: Fix MSAA z/s layout in GMEM

Samuel Pitoiset (2):

- radv: add a workaround for mipmaps and minLOD on GFX6-8
- radv/sqtt: fix describing queue submits for RGP

Sviatoslav Peleshko (2):

- anv,driconf: Add sampler coordinate precision workaround for AoE 4
- driconf: Apply dual color blending workaround to Dying Light

Tapani Pälli (1):

- anv: flush tile cache independent of format with HIZ-CCS flush

Timothy Arceri (2):

- glsl: don't tree graft globals
- Revert "ci: Enable GALLIUM_DUMP_CPU=true only in the clang job"
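
The release notes above say that the Vulkan version a driver reports comes from the apiVersion property of VkPhysicalDeviceProperties. As a minimal sketch of reading that value, the helper below unpacks a 32-bit version using the standard Vulkan bit layout (variant, major, minor, patch). The `api_version` struct and `decode_api_version` name are hypothetical and only for illustration; they are not part of Mesa or the Vulkan headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for this sketch. */
struct api_version {
   unsigned variant;
   unsigned major;
   unsigned minor;
   unsigned patch;
};

/* Unpack a Vulkan-style packed version number:
 * variant in bits 31-29, major in bits 28-22,
 * minor in bits 21-12, patch in bits 11-0. */
static void decode_api_version(uint32_t packed, struct api_version *out)
{
   assert(out != NULL);
   out->variant = packed >> 29;
   out->major = (packed >> 22) & 0x7Fu;
   out->minor = (packed >> 12) & 0x3FFu;
   out->patch = packed & 0xFFFu;
}
```

An application would normally use the VK_API_VERSION_MAJOR / VK_API_VERSION_MINOR / VK_API_VERSION_PATCH macros from the Vulkan headers instead; the sketch only shows what they compute.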

@ -1,24 +0,0 @@
New features
------------
VK_EXT_image_compression_control on RADV
VK_EXT_device_fault on RADV
OpenGL 3.3 on Asahi
Geometry shaders on Asahi
GL_ARB_texture_cube_map_array on Asahi
GL_ARB_clip_control on Asahi
GL_ARB_timer_query on Asahi
GL_EXT_disjoint_timer_query on Asahi
GL_ARB_base_instance on Asahi
OpenGL 4.6 (up from 4.2) on d3d12
VK_EXT_depth_clamp_zero_one on RADV
GL_ARB_shader_texture_image_samples on Asahi
GL_ARB_indirect_parameters on Asahi
GL_ARB_viewport_array on Asahi
GL_ARB_fragment_layer_viewport on Asahi
GL_ARB_cull_distance on Asahi
GL_ARB_transform_feedback_overflow_query on Asahi
VK_KHR_calibrated_timestamps on RADV
VK_KHR_vertex_attribute_divisor on RADV
VK_KHR_maintenance6 on RADV
VK_KHR_ray_tracing_position_fetch on RADV
EGL_EXT_query_reset_notification_strategy

@ -1,19 +0,0 @@
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_linear,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_linear,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.nearest_nearest,Fail

@ -8,25 +8,6 @@ dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.layer_copy_before_r
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.layer_copy_before_resolving.4_bit,Fail
dEQP-VK.api.copy_and_blit.dedicated_allocation.resolve_image.layer_copy_before_resolving.8_bit,Fail
dEQP-VK.pipeline.monolithic.timestamp.calibrated.calibration_test,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_linear,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_linear,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.image.sample_texture.128_bit_compressed_format_two_samplers,Fail
dEQP-VK.image.sample_texture.128_bit_compressed_format_two_samplers_cubemap,Fail
dEQP-VK.image.sample_texture.64_bit_compressed_format_two_samplers,Fail

@ -1,19 +0,0 @@
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_linear,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_linear,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.nearest_nearest,Fail

@ -1,22 +1,2 @@
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_linear,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.linear_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.texture.mipmap.2d.image_view_min_lod.base_level.nearest_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_linear,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.linear_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_linear_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_nearest,Fail
dEQP-VK.texture.mipmap.3d.image_view_min_lod.base_level.nearest_nearest_integer_texel_coord,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.linear_nearest,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.nearest_linear,Fail
dEQP-VK.texture.mipmap.cubemap.image_view_min_lod.base_level.nearest_nearest,Fail
# New CTS failures in 1.3.7.0.
dEQP-VK.api.version_check.unavailable_entry_points,Fail

@ -1,6 +0,0 @@
test_view_min_lod:4551:Test 20: Test failed: Got 0x00000000, expected 0xffffffff at (0, 0, 0).
test_view_min_lod:4551:Test 22: Test failed: Got 0x00000000, expected 0x0f0f0f0f at (0, 0, 0).
test_view_min_lod:4551:Test 46: Test failed: Got 0x0f0f0f0f, expected 0xffffffff at (0, 0, 0).
test_view_min_lod:4551:Test 47: Test failed: Got 0xffffffff, expected 0x0f0f0f0f at (0, 0, 0).
test_view_min_lod:4551:Test 49: Test failed: Got 0x0f0f0f0f, expected 0xffffffff at (0, 0, 0).
test_view_min_lod:4551:Test 50: Test failed: Got 0xffffffff, expected 0x0f0f0f0f at (0, 0, 0).

@ -248,6 +248,10 @@ get_type(Program* program, aco_ptr<Instruction>& instr)
void
form_hard_clauses(Program* program)
{
/* The ISA documentation says 63 is the maximum for GFX11/12, but according to
* LLVM there are HW bugs with more than 32 instructions.
*/
const unsigned max_clause_length = program->gfx_level >= GFX11 ? 32 : 63;
for (Block& block : program->blocks) {
unsigned num_instrs = 0;
aco_ptr<Instruction> current_instrs[63];
@ -261,7 +265,7 @@ form_hard_clauses(Program* program)
aco_ptr<Instruction>& instr = block.instructions[i];
clause_type type = get_type(program, instr);
if (type != current_type || num_instrs == 63 ||
if (type != current_type || num_instrs == max_clause_length ||
(num_instrs && !should_form_clause(current_instrs[0].get(), instr.get()))) {
emit_clause(bld, num_instrs, current_instrs);
num_instrs = 0;
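
The hunk above caps hard clauses at 32 instructions on GFX11 and newer, and keeps 63 elsewhere. The following is a simplified sketch of that flush-at-limit behaviour in plain C, assuming a run of instructions that all share one clause type. The `gfx_level` enum values and both function names are hypothetical stand-ins, and the sketch ignores the `should_form_clause` check the real pass performs.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's generation enum. */
enum gfx_level { GFX10 = 10, GFX11 = 11 };

/* Mirrors the limit chosen in the hunk: 32 on GFX11+, 63 otherwise. */
static unsigned max_clause_length(enum gfx_level level)
{
   return level >= GFX11 ? 32 : 63;
}

/* Count how many clauses a run of same-type instructions is split into
 * when the current clause is flushed as soon as it reaches the limit. */
static unsigned count_clauses(unsigned num_instrs, enum gfx_level level)
{
   const unsigned limit = max_clause_length(level);
   unsigned clauses = 0;
   unsigned current = 0;

   assert(limit > 0);
   for (unsigned i = 0; i < num_instrs; i++) {
      if (current == limit) {
         clauses++; /* flush the full clause */
         current = 0;
      }
      current++;
   }
   if (current)
      clauses++; /* flush whatever is left */
   return clauses;
}
```

With this model, a run of 63 memory instructions stays in one clause before GFX11 but is split into two (32 + 31) on GFX11.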

@ -414,6 +414,9 @@ can_use_DPP(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool dpp
instr->opcode == aco_opcode::v_dot2_f32_bf16;
}
if (instr->opcode == aco_opcode::v_pk_fmac_f16)
return gfx_level < GFX11;
/* there are more cases but those all take 64-bit inputs */
return instr->opcode != aco_opcode::v_madmk_f32 && instr->opcode != aco_opcode::v_madak_f32 &&
instr->opcode != aco_opcode::v_madmk_f16 && instr->opcode != aco_opcode::v_madak_f16 &&

@ -1820,17 +1820,15 @@ handle_operands(std::map<PhysReg, copy_operation>& copy_map, lower_context* ctx,
if (it->second.bytes > 8) {
assert(!it->second.op.isConstant());
assert(!it->second.def.regClass().is_subdword());
RegClass rc = RegClass(it->second.def.regClass().type(), it->second.def.size() - 2);
RegClass rc = it->second.def.regClass().resize(it->second.def.bytes() - 8);
Definition hi_def = Definition(PhysReg{it->first + 2}, rc);
rc = RegClass(it->second.op.regClass().type(), it->second.op.size() - 2);
rc = it->second.op.regClass().resize(it->second.op.bytes() - 8);
Operand hi_op = Operand(PhysReg{it->second.op.physReg() + 2}, rc);
copy_operation copy = {hi_op, hi_def, it->second.bytes - 8};
copy_map[hi_def.physReg()] = copy;
assert(it->second.op.physReg().byte() == 0 && it->second.def.physReg().byte() == 0);
it->second.op = Operand(it->second.op.physReg(),
it->second.op.regClass().type() == RegType::sgpr ? s2 : v2);
it->second.def = Definition(it->second.def.physReg(),
it->second.def.regClass().type() == RegType::sgpr ? s2 : v2);
it->second.op = Operand(it->second.op.physReg(), it->second.op.regClass().resize(8));
it->second.def = Definition(it->second.def.physReg(), it->second.def.regClass().resize(8));
it->second.bytes = 8;
}
@ -2927,6 +2925,11 @@ lower_to_hw_instr(Program* program)
} else if (inst->isSALU()) {
num_scalar++;
} else if (inst->isVALU() || inst->isVINTRP()) {
if (instr->opcode == aco_opcode::v_writelane_b32 ||
instr->opcode == aco_opcode::v_writelane_b32_e64) {
/* writelane ignores exec, writing inactive lanes results in UB. */
can_remove = false;
}
num_vector++;
/* VALU which writes SGPRs are always executed on GFX10+ */
if (ctx.program->gfx_level >= GFX10) {

@ -809,6 +809,32 @@ BEGIN_TEST(to_hw_instr.swap_linear_vgpr)
finish_to_hw_instr_test();
END_TEST
BEGIN_TEST(to_hw_instr.copy_linear_vgpr_v3)
if (!setup_cs(NULL, GFX10))
return;
PhysReg reg_v0{256};
PhysReg reg_v4{256 + 4};
RegClass v3_linear = v3.as_linear();
//>> p_unit_test 0
bld.pseudo(aco_opcode::p_unit_test, Operand::zero());
//! lv2: %0:v[0-1] = v_lshrrev_b64 0, %0:v[4-5]
//! s2: %0:exec, s1: %0:scc = s_not_b64 %0:exec
//! lv2: %0:v[0-1] = v_lshrrev_b64 0, %0:v[4-5]
//! s2: %0:exec, s1: %0:scc = s_not_b64 %0:exec
//! lv1: %0:v[2] = v_mov_b32 %0:v[6]
//! s2: %0:exec, s1: %0:scc = s_not_b64 %0:exec
//! lv1: %0:v[2] = v_mov_b32 %0:v[6]
//! s2: %0:exec, s1: %0:scc = s_not_b64 %0:exec
Instruction* instr = bld.pseudo(aco_opcode::p_parallelcopy, Definition(reg_v0, v3_linear),
Operand(reg_v4, v3_linear));
instr->pseudo().scratch_sgpr = m0;
finish_to_hw_instr_test();
END_TEST
BEGIN_TEST(to_hw_instr.pack2x16_constant)
PhysReg v0_lo{256};
PhysReg v0_hi{256};
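
The copy_linear_vgpr_v3 test above expects a 12-byte copy from v[4-6] to v[0-2] to be lowered as a 64-bit part followed by a 32-bit part. A rough sketch of that splitting rule in plain C is shown below: any copy larger than 8 bytes is cut into an 8-byte piece plus a remainder that starts two registers later. The `copy_piece` struct and `split_copy` function are hypothetical, and the loop is a simplification of the real pass, which re-queues the remainder in its copy map instead of iterating.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one piece of a lowered copy. */
struct copy_piece {
   unsigned first_reg; /* index of the first 32-bit register */
   unsigned bytes;     /* size of this piece in bytes */
};

/* Split a copy into pieces of at most 8 bytes. Each 8-byte piece covers
 * two 32-bit registers, so the remainder starts at first_reg + 2.
 * Returns the number of pieces written to out. */
static size_t split_copy(unsigned first_reg, unsigned bytes,
                         struct copy_piece *out, size_t capacity)
{
   size_t n = 0;

   assert(out != NULL && bytes > 0);
   while (bytes > 8) {
      assert(n < capacity);
      out[n++] = (struct copy_piece){first_reg, 8};
      first_reg += 2;
      bytes -= 8;
   }
   assert(n < capacity);
   out[n++] = (struct copy_piece){first_reg, bytes};
   return n;
}
```

Working in bytes rather than in whole dwords follows the direction of the fix in handle_operands, which now resizes the existing register class by byte count instead of rebuilding it from the register type and dword size.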

@ -460,6 +460,7 @@ TYPE(AccelerationStructureInstance, 8);
bool
build_triangle(inout radv_aabb bounds, VOID_REF dst_ptr, radv_bvh_geometry_data geom_data, uint32_t global_id)
{
bool is_valid = true;
triangle_indices indices = load_indices(geom_data.indices, geom_data.index_format, global_id);
triangle_vertices vertices = load_vertices(geom_data.data, indices, geom_data.vertex_format, geom_data.stride);
@ -469,7 +470,11 @@ build_triangle(inout radv_aabb bounds, VOID_REF dst_ptr, radv_bvh_geometry_data
* format does not have a NaN representation, then all triangles are considered active.
*/
if (isnan(vertices.vertex[0].x) || isnan(vertices.vertex[1].x) || isnan(vertices.vertex[2].x))
#if ALWAYS_ACTIVE
is_valid = false;
#else
return false;
#endif
if (geom_data.transform != NULL) {
mat4 transform = mat4(1.0);
@ -498,12 +503,13 @@ build_triangle(inout radv_aabb bounds, VOID_REF dst_ptr, radv_bvh_geometry_data
DEREF(node).geometry_id_and_flags = geom_data.geometry_id;
DEREF(node).id = 9;
return true;
return is_valid;
}
bool
build_aabb(inout radv_aabb bounds, VOID_REF src_ptr, VOID_REF dst_ptr, uint32_t geometry_id, uint32_t global_id)
{
bool is_valid = true;
REF(radv_bvh_aabb_node) node = REF(radv_bvh_aabb_node)(dst_ptr);
for (uint32_t vec = 0; vec < 2; vec++)
@ -520,12 +526,16 @@ build_aabb(inout radv_aabb bounds, VOID_REF src_ptr, VOID_REF dst_ptr, uint32_t
* NaN, and the first is not, the behavior is undefined.
*/
if (isnan(bounds.min.x))
#if ALWAYS_ACTIVE
is_valid = false;
#else
return false;
#endif
DEREF(node).primitive_id = global_id;
DEREF(node).geometry_id_and_flags = geometry_id;
return true;
return is_valid;
}
radv_aabb

@ -573,8 +573,8 @@ radv_describe_queue_present(struct radv_queue *queue, uint64_t cpu_timestamp, vo
}
static VkResult
radv_describe_queue_submit(struct radv_queue *queue, struct radv_cmd_buffer *cmd_buffer, uint64_t cpu_timestamp,
void *pre_gpu_timestamp_ptr, void *post_gpu_timestamp_ptr)
radv_describe_queue_submit(struct radv_queue *queue, struct radv_cmd_buffer *cmd_buffer, uint32_t cmdbuf_idx,
uint64_t cpu_timestamp, void *pre_gpu_timestamp_ptr, void *post_gpu_timestamp_ptr)
{
struct radv_device *device = queue->device;
struct rgp_queue_event_record *record;
@ -590,6 +590,7 @@ radv_describe_queue_submit(struct radv_queue *queue, struct radv_cmd_buffer *cmd
record->gpu_timestamps[0] = pre_gpu_timestamp_ptr;
record->gpu_timestamps[1] = post_gpu_timestamp_ptr;
record->queue_info_index = queue->vk.queue_family_index;
record->submit_sub_index = cmdbuf_idx;
radv_describe_queue_event(queue, record);
@ -841,7 +842,7 @@ sqtt_QueueSubmit2(VkQueue _queue, uint32_t submitCount, const VkSubmitInfo2 *pSu
};
RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, pCommandBufferInfo->commandBuffer);
radv_describe_queue_submit(queue, cmd_buffer, cpu_timestamp, gpu_timestamps_ptr[0], gpu_timestamps_ptr[1]);
radv_describe_queue_submit(queue, cmd_buffer, j, cpu_timestamp, gpu_timestamps_ptr[0], gpu_timestamps_ptr[1]);
}
sqtt_submit.commandBufferInfoCount = new_cmdbuf_count;
@ -1578,10 +1579,12 @@ radv_register_rt_pipeline(struct radv_device *device, struct radv_ray_tracing_pi
uint32_t idx = pipeline->stage_count;
/* Combined traversal shader */
result = radv_register_rt_stage(device, pipeline, idx++, max_any_hit_stack_size + max_intersection_stack_size,
pipeline->base.base.shaders[MESA_SHADER_INTERSECTION]);
if (result != VK_SUCCESS)
return result;
if (pipeline->base.base.shaders[MESA_SHADER_INTERSECTION]) {
result = radv_register_rt_stage(device, pipeline, idx++, max_any_hit_stack_size + max_intersection_stack_size,
pipeline->base.base.shaders[MESA_SHADER_INTERSECTION]);
if (result != VK_SUCCESS)
return result;
}
/* Prolog */
result = radv_register_rt_stage(device, pipeline, idx++, 0, pipeline->prolog);

@ -1075,6 +1075,11 @@ radv_image_create_layout(struct radv_device *device, struct radv_image_create_in
radv_video_get_profile_alignments(device->physical_device, profile_list, &width_align, &height_align);
image_info.width = align(image_info.width, width_align);
image_info.height = align(image_info.height, height_align);
if (radv_has_uvd(device->physical_device) && image->vk.usage & VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR) {
/* UVD and kernel demand a full DPB allocation. */
image_info.array_size = MIN2(16, image_info.array_size);
}
}
unsigned plane_count = radv_get_internal_plane_count(device->physical_device, image->vk.format);

@ -644,7 +644,7 @@ radv_image_view_make_descriptor(struct radv_image_view *iview, struct radv_devic
bool disable_compression, bool enable_compression, unsigned plane_id,
unsigned descriptor_plane_id, VkImageCreateFlags img_create_flags,
const struct ac_surf_nbc_view *nbc_view,
const VkImageViewSlicedCreateInfoEXT *sliced_3d)
const VkImageViewSlicedCreateInfoEXT *sliced_3d, bool force_zero_base_mip)
{
struct radv_image *image = iview->image;
struct radv_image_plane *plane = &image->planes[plane_id];
@ -652,7 +652,7 @@ radv_image_view_make_descriptor(struct radv_image_view *iview, struct radv_devic
unsigned first_layer = iview->vk.base_array_layer;
uint32_t blk_w;
union radv_descriptor *descriptor;
uint32_t hw_level = 0;
uint32_t hw_level = iview->vk.base_mip_level;
if (is_storage_image) {
descriptor = &iview->storage_descriptor;
@ -665,7 +665,6 @@ radv_image_view_make_descriptor(struct radv_image_view *iview, struct radv_devic
blk_w = plane->surface.blk_w / vk_format_get_blockwidth(plane->format) * vk_format_get_blockwidth(vk_format);
if (device->physical_device->rad_info.gfx_level >= GFX9) {
hw_level = iview->vk.base_mip_level;
if (nbc_view->valid) {
hw_level = nbc_view->level;
iview->extent.width = nbc_view->width;
@ -674,6 +673,9 @@ radv_image_view_make_descriptor(struct radv_image_view *iview, struct radv_devic
/* Clear the base array layer because addrlib adds it as part of the base addr offset. */
first_layer = 0;
}
} else {
if (force_zero_base_mip)
hw_level = 0;
}
radv_make_texture_descriptor(device, image, is_storage_image, iview->vk.view_type, vk_format, components, hw_level,
@ -690,7 +692,7 @@ radv_image_view_make_descriptor(struct radv_image_view *iview, struct radv_devic
if (is_stencil)
base_level_info = &plane->surface.u.legacy.zs.stencil_level[iview->vk.base_mip_level];
else
base_level_info = &plane->surface.u.legacy.level[iview->vk.base_mip_level];
base_level_info = &plane->surface.u.legacy.level[force_zero_base_mip ? iview->vk.base_mip_level : 0];
}
bool enable_write_compression = radv_image_use_dcc_image_stores(device, image);
@ -751,6 +753,12 @@ radv_image_view_init(struct radv_image_view *iview, struct radv_device *device,
bool from_client = extra_create_info && extra_create_info->from_client;
vk_image_view_init(&device->vk, &iview->vk, !from_client, pCreateInfo);
bool force_zero_base_mip = true;
if (device->physical_device->rad_info.gfx_level <= GFX8 && min_lod) {
/* Do not force the base level to zero to workaround a spurious bug with mipmaps and min LOD. */
force_zero_base_mip = false;
}
switch (image->vk.image_type) {
case VK_IMAGE_TYPE_1D:
case VK_IMAGE_TYPE_2D:
@ -799,7 +807,7 @@ radv_image_view_init(struct radv_image_view *iview, struct radv_device *device,
plane_count = 1;
}
if (device->physical_device->rad_info.gfx_level >= GFX9) {
if (!force_zero_base_mip || device->physical_device->rad_info.gfx_level >= GFX9) {
iview->extent = (VkExtent3D){
.width = image->vk.extent.width,
.height = image->vk.extent.height,
@ -889,10 +897,10 @@ radv_image_view_init(struct radv_image_view *iview, struct radv_device *device,
VkFormat format = vk_format_get_plane_format(iview->vk.view_format, i);
radv_image_view_make_descriptor(iview, device, format, &pCreateInfo->components, min_lod, false,
disable_compression, enable_compression, iview->plane_id + i, i, img_create_flags,
&iview->nbc_view, NULL);
&iview->nbc_view, NULL, force_zero_base_mip);
radv_image_view_make_descriptor(iview, device, format, &pCreateInfo->components, min_lod, true,
disable_compression, enable_compression, iview->plane_id + i, i, img_create_flags,
&iview->nbc_view, sliced_3d);
&iview->nbc_view, sliced_3d, force_zero_base_mip);
}
}

@ -2022,13 +2022,13 @@ radv_physical_device_try_create(struct radv_instance *instance, drmDevicePtr drm
if ((device->instance->debug_flags & RADV_DEBUG_INFO))
ac_print_gpu_info(&device->rad_info, stdout);
radv_init_physical_device_decoder(device);
radv_physical_device_init_queue_table(device);
/* We don't check the error code, but later check if it is initialized. */
ac_init_perfcounters(&device->rad_info, false, false, &device->ac_perfcounters);
radv_init_physical_device_decoder(device);
/* The WSI is structured as a layer on top of the driver, so this has
* to be the last part of initialization (at least until we get other
* semi-layers).

@ -1643,7 +1643,8 @@ radv_queue_submit_normal(struct radv_queue *queue, struct vk_queue_submit *submi
queue->device->ws->cs_unchain(cmd_buffer->cs);
if (!chainable || !queue->device->ws->cs_chain(chainable, cmd_buffer->cs, queue->state.uses_shadow_regs)) {
/* don't submit empty command buffers to the kernel. */
if (radv_queue_ring(queue) != AMD_IP_VCN_ENC || cmd_buffer->cs->cdw != 0)
if ((radv_queue_ring(queue) != AMD_IP_VCN_ENC && radv_queue_ring(queue) != AMD_IP_UVD) ||
cmd_buffer->cs->cdw != 0)
cs_array[num_submitted_cs++] = cmd_buffer->cs;
}

@ -1724,10 +1724,12 @@ radv_uvd_cmd_reset(struct radv_cmd_buffer *cmd_buffer)
if (vid->sessionctx.mem)
send_cmd(cmd_buffer, RDECODE_CMD_SESSION_CONTEXT_BUFFER, vid->sessionctx.mem->bo, vid->sessionctx.offset);
send_cmd(cmd_buffer, RDECODE_CMD_MSG_BUFFER, cmd_buffer->upload.upload_bo, out_offset);
/* pad out the IB to the 16 dword boundary - otherwise the fw seems to be unhappy */
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, 8);
for (unsigned i = 0; i < 8; i++)
radeon_emit(cmd_buffer->cs, 0x81ff);
int padsize = vid->sessionctx.mem ? 4 : 6;
radeon_check_space(cmd_buffer->device->ws, cmd_buffer->cs, padsize);
for (unsigned i = 0; i < padsize; i++)
radeon_emit(cmd_buffer->cs, PKT2_NOP_PAD);
}
VKAPI_ATTR void VKAPI_CALL

@ -447,7 +447,17 @@ radv_amdgpu_cs_finalize(struct radeon_cmdbuf *_cs)
*cs->ib_size_ptr |= cs->base.cdw;
} else {
/* Pad the CS with NOP packets. */
if (ip_type != AMDGPU_HW_IP_VCN_ENC) {
bool pad = true;
/* Don't pad on VCN encode/unified as no NOPs */
if (ip_type == AMDGPU_HW_IP_VCN_ENC)
pad = false;
/* Don't add padding to 0 length UVD due to kernel */
if (ip_type == AMDGPU_HW_IP_UVD && cs->base.cdw == 0)
pad = false;
if (pad) {
while (!cs->base.cdw || (cs->base.cdw & ib_pad_dw_mask))
radeon_emit_unchecked(&cs->base, nop_packet);
}
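
The padding decision in the hunk above can be modelled without any driver types. The sketch below assumes a command stream measured in dwords and an alignment mask such as 7 or 15. The `ip_type` enum and both function names are hypothetical. It returns the final stream length rather than emitting NOP packets.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware IP types in the hunk. */
enum ip_type { IP_GFX, IP_UVD, IP_VCN_ENC };

/* Mirrors the three cases in the hunk: never pad VCN encode, never pad
 * an empty UVD stream, pad everything else. */
static bool should_pad(enum ip_type ip, uint32_t cdw)
{
   if (ip == IP_VCN_ENC)
      return false;
   if (ip == IP_UVD && cdw == 0)
      return false;
   return true;
}

/* Length in dwords after padding. The loop condition matches the hunk:
 * keep adding NOP dwords while the stream is empty or misaligned. */
static uint32_t finalize_length(enum ip_type ip, uint32_t cdw,
                                uint32_t pad_dw_mask)
{
   if (!should_pad(ip, cdw))
      return cdw;
   while (cdw == 0 || (cdw & pad_dw_mask))
      cdw++;
   return cdw;
}
```

Note that an empty non-UVD stream is still padded up to one full alignment unit, which is why the UVD case needs its own early exit to stay at length zero.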

@ -39,6 +39,7 @@ ir_variable_refcount_visitor::ir_variable_refcount_visitor()
{
this->mem_ctx = ralloc_context(NULL);
this->ht = _mesa_pointer_hash_table_create(NULL);
this->global = true;
}
static void
@ -94,8 +95,10 @@ ir_visitor_status
ir_variable_refcount_visitor::visit(ir_variable *ir)
{
ir_variable_refcount_entry *entry = this->get_variable_entry(ir);
if (entry)
if (entry) {
entry->declaration = true;
entry->is_global = this->global;
}
return visit_continue;
}
@ -117,10 +120,14 @@ ir_variable_refcount_visitor::visit(ir_dereference_variable *ir)
ir_visitor_status
ir_variable_refcount_visitor::visit_enter(ir_function_signature *ir)
{
global = false;
/* We don't want to descend into the function parameters and
* dead-code eliminate them, so just accept the body here.
*/
visit_list_elements(this, &ir->body);
global = true;
return visit_continue_with_parent;
}


@ -62,6 +62,9 @@ public:
unsigned assigned_count;
bool declaration; /* If the variable had a decl in the instruction stream */
/** Is the variable a global */
bool is_global;
};
class ir_variable_refcount_visitor : public ir_hierarchical_visitor {
@ -86,6 +89,8 @@ public:
struct hash_table *ht;
void *mem_ctx;
bool global;
};
#endif /* GLSL_IR_VARIABLE_REFCOUNT_H */


@ -386,7 +386,8 @@ tree_grafting_basic_block(ir_instruction *bb_first,
if (!entry->declaration ||
entry->assigned_count != 1 ||
entry->referenced_count != 2)
entry->referenced_count != 2 ||
entry->is_global)
continue;
/* Found a possibly graftable assignment. Now, walk through the


@ -114,6 +114,9 @@ nir_lower_cl_images(nir_shader *shader, bool lower_image_derefs, bool lower_samp
ASSERTED int last_loc = -1;
int num_rd_images = 0, num_wr_images = 0;
BITSET_ZERO(shader->info.image_buffers);
BITSET_ZERO(shader->info.msaa_images);
nir_foreach_variable_with_modes(var, shader, nir_var_image | nir_var_uniform) {
if (!glsl_type_is_image(var->type) && !glsl_type_is_texture(var->type))
continue;
@ -128,6 +131,17 @@ nir_lower_cl_images(nir_shader *shader, bool lower_image_derefs, bool lower_samp
else
var->data.driver_location = num_wr_images++;
var->data.binding = var->data.driver_location;
switch (glsl_get_sampler_dim(var->type)) {
case GLSL_SAMPLER_DIM_BUF:
BITSET_SET(shader->info.image_buffers, var->data.binding);
break;
case GLSL_SAMPLER_DIM_MS:
BITSET_SET(shader->info.msaa_images, var->data.binding);
break;
default:
break;
}
}
shader->info.num_textures = num_rd_images;
BITSET_ZERO(shader->info.textures_used);


@ -216,7 +216,7 @@ get_io_offset(nir_builder *b, nir_deref_instr *deref,
p++;
}
if (path.path[0]->var->data.compact) {
if (path.path[0]->var->data.compact && nir_src_is_const((*p)->arr.index)) {
assert((*p)->deref_type == nir_deref_type_array);
assert(glsl_type_is_scalar((*p)->type));


@ -329,6 +329,13 @@ dri2_drm_swap_buffers(_EGLDisplay *disp, _EGLSurface *draw)
if (dri2_surf->color_buffers[i].age > 0)
dri2_surf->color_buffers[i].age++;
/* Flushing must be done before get_back_bo to make sure glthread's
* unmarshalling thread is idle otherwise it might concurrently
* call get_back_bo (eg: through dri2_drm_image_get_buffers).
*/
dri2_flush_drawable_for_swapbuffers(disp, draw);
dri2_dpy->flush->invalidate(dri2_surf->dri_drawable);
/* Make sure we have a back buffer in case we're swapping without
* ever rendering. */
if (get_back_bo(dri2_surf) < 0)
@ -338,9 +345,6 @@ dri2_drm_swap_buffers(_EGLDisplay *disp, _EGLSurface *draw)
dri2_surf->current->age = 1;
dri2_surf->back = NULL;
dri2_flush_drawable_for_swapbuffers(disp, draw);
dri2_dpy->flush->invalidate(dri2_surf->dri_drawable);
return EGL_TRUE;
}


@ -1,3 +1,20 @@
# should be fixed with kernel 6.8
spec@ext_external_objects@vk-depth-display@D32S8,Fail
spec@ext_external_objects@vk-image-overwrite@RGB 10 A2 UINT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGB 10 A2 UNORM optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGB 5 A1 UNORM optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 16 INT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 16 SFLOAT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 16 UINT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 32 INT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 32 UINT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 4 UNORM optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 8 INT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 8 SRGB optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 8 UINT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 8 UNORM optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-stencil-display@D32S8,Fail
KHR-GL46.gpu_shader_fp64.fp64.max_uniform_components,Fail
KHR-GL46.shader_image_load_store.basic-allFormats-store,Fail
KHR-GL46.shader_image_load_store.basic-allTargets-store,Fail


@ -1,3 +1,20 @@
# should be fixed with kernel 6.8
spec@ext_external_objects@vk-depth-display@D32S8,Fail
spec@ext_external_objects@vk-image-overwrite@RGB 10 A2 UINT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGB 10 A2 UNORM optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGB 5 A1 UNORM optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 16 INT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 16 SFLOAT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 16 UINT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 32 INT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 32 UINT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 4 UNORM optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 8 INT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 8 SRGB optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 8 UINT optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-image-overwrite@RGBA 8 UNORM optimal: Failed to create texture from GL memory object.,Fail
spec@ext_external_objects@vk-stencil-display@D32S8,Fail
KHR-GL46.gpu_shader_fp64.fp64.max_uniform_components,Fail
KHR-GL46.shader_image_load_store.basic-allFormats-store,Fail


@ -1715,8 +1715,14 @@ handle_collect(struct ra_ctx *ctx, struct ir3_instruction *instr)
struct ra_interval *interval = &ctx->intervals[src->def->name];
if (src->def->merge_set != dst_set || interval->is_killed)
/* We only need special handling if the source's interval overlaps with
* the destination's interval.
*/
if (src->def->interval_start >= instr->dsts[0]->interval_end ||
instr->dsts[0]->interval_start >= src->def->interval_end ||
interval->is_killed)
continue;
while (interval->interval.parent != NULL) {
interval = ir3_reg_interval_to_ra_interval(interval->interval.parent);
}
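
The new condition in the hunk above is an interval-overlap test written in its negated form. A minimal sketch, assuming the intervals are half-open (which the `>=` comparisons suggest); the `struct interval` type and function name are hypothetical and not part of ir3:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical half-open interval [start, end), mirroring the
 * interval_start / interval_end fields compared in the hunk. */
struct interval {
   unsigned start;
   unsigned end;
};

static bool
intervals_overlap(struct interval a, struct interval b)
{
   assert(a.start <= a.end && b.start <= b.end);
   /* Two intervals are disjoint when one begins at or after the point
    * where the other ends; anything else is an overlap. */
   return !(a.start >= b.end || b.start >= a.end);
}
```

Intervals that merely touch, such as [0, 4) and [4, 8), count as disjoint under this test, so the `continue` path in the hunk would be taken for them.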


@ -228,7 +228,7 @@ tu_physical_device_try_create(struct vk_instance *vk_instance,
#ifdef TU_HAS_VIRTIO
result = tu_knl_drm_virtio_load(instance, fd, version, &device);
#endif
} else {
} else if (TU_DEBUG(STARTUP)) {
result = vk_startup_errorf(instance, VK_ERROR_INCOMPATIBLE_DRIVER,
"device %s (%s) is not compatible with turnip",
path, version->name);


@ -416,6 +416,13 @@ lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
* so we do not use llvm::sys::getHostCPUFeatures to detect cpu features
* but using util_get_cpu_caps() instead.
*/
#if DETECT_ARCH_X86_64
/*
* Without this, on some "buggy" qemu cpu setup, LLVM could crash
* if LLVM detects the wrong CPU type.
*/
MAttrs.push_back("+64bit");
#endif
MAttrs.push_back(util_get_cpu_caps()->has_sse ? "+sse" : "-sse" );
MAttrs.push_back(util_get_cpu_caps()->has_sse2 ? "+sse2" : "-sse2" );
MAttrs.push_back(util_get_cpu_caps()->has_sse3 ? "+sse3" : "-sse3" );


@ -1,3 +1,13 @@
# from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25931
spec@arb_internalformat_query2@image_format_compatibility_type pname checks,Fail
spec@arb_internalformat_query2@image_format_compatibility_type pname checks@GL_IMAGE_FORMAT_COMPATIBILITY_TYPE,Fail
spec@arb_internalformat_query2@max dimensions related pname checks,Fail
spec@arb_internalformat_query2@max dimensions related pname checks@GL_MAX_COMBINED_DIMENSIONS,Fail
spec@arb_internalformat_query2@max dimensions related pname checks@GL_MAX_DEPTH,Fail
spec@arb_internalformat_query2@max dimensions related pname checks@GL_MAX_HEIGHT,Fail
spec@arb_internalformat_query2@max dimensions related pname checks@GL_MAX_LAYERS,Fail
spec@arb_internalformat_query2@max dimensions related pname checks@GL_MAX_WIDTH,Fail
spec@!opengl 1.0@gl-1.0-beginend-coverage,Fail
spec@!opengl 1.0@gl-1.0-beginend-coverage@glFlush,Fail
spec@!opengl 1.0@gl-1.0-blend-func,Fail
@ -43,8 +53,6 @@ spec@arb_seamless_cube_map@arb_seamless_cubemap,Fail
spec@arb_shader_atomic_counters@semantics,Fail
spec@arb_shader_atomic_counters@semantics@Tessellation control shader atomic built-in semantics,Fail
spec@arb_texture_cube_map_array@arb_texture_cube_map_array-sampler-cube-array-shadow,Fail
spec@arb_texture_multisample@arb_texture_multisample-dsa-texelfetch,Fail
spec@arb_texture_multisample@arb_texture_multisample-dsa-texelfetch@Texture type: GL_RGB9_E5,Fail
spec@arb_texture_rg@texwrap formats-int,Fail
spec@arb_texture_rg@texwrap formats-int offset,Fail
spec@arb_texture_rg@texwrap formats-int offset@GL_R16I,Fail


@ -480,9 +480,9 @@ gmem_key_init(struct fd_batch *batch, bool assume_zs, bool no_scis_opt)
if (has_zs || assume_zs) {
struct fd_resource *rsc = fd_resource(pfb->zsbuf->texture);
key->zsbuf_cpp[0] = rsc->layout.cpp;
key->zsbuf_cpp[0] = rsc->layout.cpp * pfb->samples;
if (rsc->stencil)
key->zsbuf_cpp[1] = rsc->stencil->layout.cpp;
key->zsbuf_cpp[1] = rsc->stencil->layout.cpp * pfb->samples;
/* If we clear z or s but not both, and we are using z24s8 (ie.
* !separate_stencil) then we need to restore the other, even if


@ -155,7 +155,7 @@ clear_stale_syncobjs(struct iris_batch *batch)
struct iris_batch_fence, i);
assert(fence->flags & IRIS_BATCH_FENCE_WAIT);
if (iris_wait_syncobj(bufmgr, *syncobj, 0))
if (iris_wait_syncobj(bufmgr, *syncobj, 0) == false)
continue;
/* This sync object has already passed, there's no need to continue
@ -225,7 +225,7 @@ iris_wait_syncobj(struct iris_bufmgr *bufmgr,
.count_handles = 1,
.timeout_nsec = timeout_nsec,
};
return intel_ioctl(fd, DRM_IOCTL_SYNCOBJ_WAIT, &args);
return intel_ioctl(fd, DRM_IOCTL_SYNCOBJ_WAIT, &args) == 0;
}
#define CSI "\e["
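
After the change above, `iris_wait_syncobj` reports success as `true` by comparing the ioctl result against 0, and the caller compares against `false`. The conversion can be sketched as follows; the function-pointer type and the two fake wait functions are hypothetical stand-ins, and no DRM call is involved:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an ioctl-style call: 0 on success,
 * non-zero (typically -1) on error or timeout. */
typedef int (*wait_fn)(void *arg);

/* Convert the errno-style result into "true means the wait succeeded". */
static bool
wait_succeeded(wait_fn fn, void *arg)
{
   assert(fn != NULL);
   return fn(arg) == 0;
}

/* Fake backends used only to exercise the wrapper. */
static int fake_wait_ok(void *arg) { (void)arg; return 0; }
static int fake_wait_timeout(void *arg) { (void)arg; return -1; }
```

Returning a `bool` from the wrapper keeps callers from having to remember that a zero return code is the success case.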


@ -300,6 +300,11 @@ GENX(jm_launch_grid)(struct panfrost_batch *batch,
cfg.textures = batch->textures[PIPE_SHADER_COMPUTE];
cfg.samplers = batch->samplers[PIPE_SHADER_COMPUTE];
}
#if PAN_ARCH == 4
pan_section_pack(t.cpu, COMPUTE_JOB, COMPUTE_PADDING, cfg)
;
#endif
#else
struct panfrost_context *ctx = batch->ctx;
struct panfrost_compiled_shader *cs = ctx->prog[PIPE_SHADER_COMPUTE];
@ -449,6 +454,11 @@ jm_emit_vertex_job(struct panfrost_batch *batch,
section = pan_section_ptr(job, COMPUTE_JOB, DRAW);
jm_emit_vertex_draw(batch, section);
#if PAN_ARCH == 4
pan_section_pack(job, COMPUTE_JOB, COMPUTE_PADDING, cfg)
;
#endif
}
#endif /* PAN_ARCH <= 7 */


@ -369,7 +369,7 @@ panfrost_create_shader_state(struct pipe_context *pctx,
if (nir->info.stage == MESA_SHADER_FRAGMENT &&
nir->info.outputs_written & BITFIELD_BIT(FRAG_RESULT_COLOR)) {
NIR_PASS_V(nir, nir_lower_fragcolor, 8);
NIR_PASS_V(nir, nir_lower_fragcolor, nir->info.fs.color_is_dual_source ? 1 : 8);
so->fragcolor_lowered = true;
}


@ -368,7 +368,6 @@ shaders@glsl-bug-110796,Fail
shaders@glsl-fs-bug25902,Fail
shaders@glsl-fwidth,Fail
shaders@glsl-lod-bias,Fail
shaders@glsl-orangebook-ch06-bump,Fail
shaders@glsl-uniform-interstage-limits@subdivide 5,Fail
shaders@glsl-uniform-interstage-limits@subdivide 5- statechanges,Fail


@ -158,13 +158,6 @@ static void set_vertex_inputs_outputs(struct r300_vertex_program_compiler * c)
}
}
/* Texture coordinates. */
for (i = 0; i < ATTR_TEXCOORD_COUNT; i++) {
if (outputs->texcoord[i] != ATTR_UNUSED) {
c->code->outputs[outputs->texcoord[i]] = reg++;
}
}
/* Generics. */
for (i = 0; i < ATTR_GENERIC_COUNT; i++) {
if (outputs->generic[i] != ATTR_UNUSED) {
@ -172,6 +165,13 @@ static void set_vertex_inputs_outputs(struct r300_vertex_program_compiler * c)
}
}
/* Texture coordinates. */
for (i = 0; i < ATTR_TEXCOORD_COUNT; i++) {
if (outputs->texcoord[i] != ATTR_UNUSED) {
c->code->outputs[outputs->texcoord[i]] = reg++;
}
}
/* Fog coordinates. */
if (outputs->fog != ATTR_UNUSED) {
c->code->outputs[outputs->fog] = reg++;


@ -1042,23 +1042,23 @@ static void radeon_enc_begin_frame(struct pipe_video_codec *encoder,
{
struct radeon_encoder *enc = (struct radeon_encoder *)encoder;
struct vl_video_buffer *vid_buf = (struct vl_video_buffer *)source;
bool need_rate_control = false;
enc->need_rate_control = false;
if (u_reduce_video_profile(enc->base.profile) == PIPE_VIDEO_FORMAT_MPEG4_AVC) {
struct pipe_h264_enc_picture_desc *pic = (struct pipe_h264_enc_picture_desc *)picture;
need_rate_control =
enc->need_rate_control =
(enc->enc_pic.rc_layer_init[0].target_bit_rate != pic->rate_ctrl[0].target_bitrate) ||
(enc->enc_pic.rc_layer_init[0].frame_rate_num != pic->rate_ctrl[0].frame_rate_num) ||
(enc->enc_pic.rc_layer_init[0].frame_rate_den != pic->rate_ctrl[0].frame_rate_den);
} else if (u_reduce_video_profile(picture->profile) == PIPE_VIDEO_FORMAT_HEVC) {
struct pipe_h265_enc_picture_desc *pic = (struct pipe_h265_enc_picture_desc *)picture;
need_rate_control =
enc->need_rate_control =
(enc->enc_pic.rc_layer_init[0].target_bit_rate != pic->rc.target_bitrate) ||
(enc->enc_pic.rc_layer_init[0].frame_rate_num != pic->rc.frame_rate_num) ||
(enc->enc_pic.rc_layer_init[0].frame_rate_den != pic->rc.frame_rate_den);
} else if (u_reduce_video_profile(picture->profile) == PIPE_VIDEO_FORMAT_AV1) {
struct pipe_av1_enc_picture_desc *pic = (struct pipe_av1_enc_picture_desc *)picture;
need_rate_control =
enc->need_rate_control =
(enc->enc_pic.rc_layer_init[0].target_bit_rate != pic->rc[0].target_bitrate) ||
(enc->enc_pic.rc_layer_init[0].frame_rate_num != pic->rc[0].frame_rate_num) ||
(enc->enc_pic.rc_layer_init[0].frame_rate_den != pic->rc[0].frame_rate_den);
@ -1113,23 +1113,22 @@ static void radeon_enc_begin_frame(struct pipe_video_codec *encoder,
enc->need_feedback = false;
if (!enc->stream_handle || need_rate_control) {
if (!enc->stream_handle) {
struct rvid_buffer fb;
if (!enc->stream_handle) {
enc->stream_handle = si_vid_alloc_stream_handle();
enc->si = CALLOC_STRUCT(rvid_buffer);
if (!enc->si ||
!enc->stream_handle ||
!si_vid_create_buffer(enc->screen, enc->si, 128 * 1024, PIPE_USAGE_STAGING)) {
RVID_ERR("Can't create session buffer.\n");
goto error;
}
enc->stream_handle = si_vid_alloc_stream_handle();
enc->si = CALLOC_STRUCT(rvid_buffer);
if (!enc->si ||
!enc->stream_handle ||
!si_vid_create_buffer(enc->screen, enc->si, 128 * 1024, PIPE_USAGE_STAGING)) {
RVID_ERR("Can't create session buffer.\n");
goto error;
}
si_vid_create_buffer(enc->screen, &fb, 4096, PIPE_USAGE_STAGING);
enc->fb = &fb;
enc->begin(enc);
flush(enc);
si_vid_destroy_buffer(&fb);
enc->need_rate_control = false;
}
return;


@ -262,6 +262,7 @@ struct radeon_encoder {
bool emulation_prevention;
bool need_feedback;
bool need_rate_control;
unsigned dpb_size;
unsigned roi_size;
rvcn_enc_picture_info_t dpb_info[RENCODE_MAX_NUM_RECONSTRUCTED_PICTURES];


@ -1392,11 +1392,22 @@ static void radeon_enc_headers_hevc(struct radeon_encoder *enc)
static void encode(struct radeon_encoder *enc)
{
unsigned i;
enc->before_encode(enc);
enc->session_info(enc);
enc->total_task_size = 0;
enc->task_info(enc, enc->need_feedback);
if (enc->need_rate_control) {
i = 0;
do {
enc->enc_pic.temporal_id = i;
enc->layer_select(enc);
enc->rc_layer_init(enc);
} while (++i < enc->enc_pic.num_temporal_layers);
}
enc->encode_headers(enc);
enc->ctx(enc);
enc->bitstream(enc);


@ -505,11 +505,22 @@ static void radeon_enc_ctx(struct radeon_encoder *enc)
}
static void encode(struct radeon_encoder *enc)
{
unsigned i;
enc->before_encode(enc);
enc->session_info(enc);
enc->total_task_size = 0;
enc->task_info(enc, enc->need_feedback);
if (enc->need_rate_control) {
i = 0;
do {
enc->enc_pic.temporal_id = i;
enc->layer_select(enc);
enc->rc_layer_init(enc);
} while (++i < enc->enc_pic.num_temporal_layers);
}
enc->encode_headers(enc);
enc->ctx(enc);
enc->bitstream(enc);


@ -5506,7 +5506,8 @@ zink_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags)
if (!is_copy_only && !is_compute_only) {
pipe_buffer_write_nooverlap(&ctx->base, ctx->dummy_vertex_buffer, 0, sizeof(data), data);
pipe_buffer_write_nooverlap(&ctx->base, ctx->dummy_xfb_buffer, 0, sizeof(data), data);
reapply_color_write(ctx);
if (screen->info.have_EXT_color_write_enable)
reapply_color_write(ctx);
/* set on startup just to avoid validation errors if a draw comes through without
* a tess shader later


@ -309,7 +309,7 @@ create_bci(struct zink_screen *screen, const struct pipe_resource *templ, unsign
bci.usage |= VK_BUFFER_USAGE_CONDITIONAL_RENDERING_BIT_EXT;
if (templ->flags & PIPE_RESOURCE_FLAG_SPARSE)
bci.flags |= VK_BUFFER_CREATE_SPARSE_BINDING_BIT;
bci.flags |= VK_BUFFER_CREATE_SPARSE_BINDING_BIT | VK_BUFFER_CREATE_SPARSE_RESIDENCY_BIT;
return bci;
}


@ -940,7 +940,7 @@ zink_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
return screen->info.feats.features.shaderCullDistance;
case PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE:
return screen->info.feats.features.sparseBinding ? ZINK_SPARSE_BUFFER_PAGE_SIZE : 0;
return screen->info.feats.features.sparseResidencyBuffer ? ZINK_SPARSE_BUFFER_PAGE_SIZE : 0;
/* Sparse texture */
case PIPE_CAP_MAX_SPARSE_TEXTURE_SIZE:


@ -310,7 +310,7 @@ impl GLCtxManager {
// CL_INVALID_GL_OBJECT if bufobj is not a GL buffer object or is a GL buffer
// object but does not have an existing data store or the size of the buffer is 0.
if target == GL_ARRAY_BUFFER && export_out.buf_size == 0 {
if [GL_ARRAY_BUFFER, GL_TEXTURE_BUFFER].contains(&target) && export_out.buf_size == 0 {
return Err(CL_INVALID_GL_OBJECT);
}
@ -326,6 +326,7 @@ pub struct GLMemProps {
pub height: u16,
pub depth: u16,
pub width: u32,
pub offset: u32,
pub array_size: u16,
pub pixel_size: u8,
pub stride: u32,
@ -349,7 +350,7 @@ pub struct GLExportManager {
impl GLExportManager {
pub fn get_gl_mem_props(&self) -> CLResult<GLMemProps> {
let pixel_size = if self.is_gl_buffer() {
0
1
} else {
format_from_gl(self.export_out.internal_format)
.ok_or(CL_OUT_OF_HOST_MEMORY)?
@ -361,6 +362,7 @@ impl GLExportManager {
let mut depth = self.export_out.depth as u16;
let mut width = self.export_out.width;
let mut array_size = 1;
let mut offset = 0;
// some fixups
match self.export_in.target {
@ -373,9 +375,10 @@ impl GLExportManager {
array_size = depth;
depth = 1;
}
GL_ARRAY_BUFFER => {
GL_ARRAY_BUFFER | GL_TEXTURE_BUFFER => {
array_size = 1;
width = self.export_out.buf_size as u32;
offset = self.export_out.buf_offset as u32;
height = 1;
depth = 1;
}
@ -389,6 +392,7 @@ impl GLExportManager {
height: height,
depth: depth,
width: width,
offset: offset,
array_size: array_size,
pixel_size: pixel_size,
stride: self.export_out.stride,
@ -531,6 +535,7 @@ pub fn target_from_gl(target: u32) -> CLResult<(u32, u32)> {
// internal format does not map to a supported OpenCL image format.
Ok(match target {
GL_ARRAY_BUFFER => (CL_MEM_OBJECT_BUFFER, CL_GL_OBJECT_BUFFER),
GL_TEXTURE_BUFFER => (CL_MEM_OBJECT_IMAGE1D_BUFFER, CL_GL_OBJECT_TEXTURE_BUFFER),
GL_RENDERBUFFER => (CL_MEM_OBJECT_IMAGE2D, CL_GL_OBJECT_RENDERBUFFER),
GL_TEXTURE_1D => (CL_MEM_OBJECT_IMAGE1D, CL_GL_OBJECT_TEXTURE1D),
GL_TEXTURE_1D_ARRAY => (CL_MEM_OBJECT_IMAGE1D_ARRAY, CL_GL_OBJECT_TEXTURE1D_ARRAY),


@ -501,6 +501,11 @@ impl Mem {
..Default::default()
};
// it's kinda not supported, but we want to know if anything actually hits this as it's
// certainly not tested by the CL CTS.
if mem_type != CL_MEM_OBJECT_BUFFER {
assert_eq!(gl_mem_props.offset, 0);
}
Ok(Arc::new(Self {
base: CLObjectBase::new(),
context: context,
@ -508,7 +513,7 @@ impl Mem {
mem_type: mem_type,
flags: flags,
size: gl_mem_props.size(),
offset: 0,
offset: gl_mem_props.offset as usize,
host_ptr: ptr::null_mut(),
image_format: image_format,
pipe_format: pipe_format,


@ -262,34 +262,41 @@ VAStatus vlVaHandleVAEncPictureParameterBufferTypeAV1(vlVaDriver *drv, vlVaConte
VAStatus vlVaHandleVAEncMiscParameterTypeRateControlAV1(vlVaContext *context, VAEncMiscParameterBuffer *misc)
{
unsigned temporal_id;
VAEncMiscParameterRateControl *rc = (VAEncMiscParameterRateControl *)misc->data;
struct pipe_av1_enc_rate_control *pipe_rc = NULL;
for (int i = 1; i < ARRAY_SIZE(context->desc.av1enc.rc); i++) {
pipe_rc = &context->desc.av1enc.rc[i];
pipe_rc->rate_ctrl_method = context->desc.av1enc.rc[0].rate_ctrl_method;
}
temporal_id = context->desc.av1enc.rc[0].rate_ctrl_method !=
PIPE_H2645_ENC_RATE_CONTROL_METHOD_DISABLE ?
rc->rc_flags.bits.temporal_id :
0;
for (int i = 0; i < ARRAY_SIZE(context->desc.av1enc.rc); i++)
{
pipe_rc = &context->desc.av1enc.rc[i];
if (context->desc.av1enc.seq.num_temporal_layers > 0 &&
temporal_id >= context->desc.av1enc.seq.num_temporal_layers)
return VA_STATUS_ERROR_INVALID_PARAMETER;
if (pipe_rc->rate_ctrl_method == PIPE_H2645_ENC_RATE_CONTROL_METHOD_CONSTANT)
pipe_rc->target_bitrate = pipe_rc->peak_bitrate;
else
pipe_rc->target_bitrate = pipe_rc->peak_bitrate * (rc->target_percentage / 100.0);
pipe_rc = &context->desc.av1enc.rc[temporal_id];
if (pipe_rc->target_bitrate < 2000000)
pipe_rc->vbv_buffer_size = MIN2((pipe_rc->target_bitrate * 2.75), 2000000);
else
pipe_rc->vbv_buffer_size = pipe_rc->target_bitrate;
if (pipe_rc->rate_ctrl_method == PIPE_H2645_ENC_RATE_CONTROL_METHOD_CONSTANT)
pipe_rc->target_bitrate = rc->bits_per_second;
else
pipe_rc->target_bitrate = rc->bits_per_second * (rc->target_percentage / 100.0);
pipe_rc->peak_bitrate = rc->bits_per_second;
if (pipe_rc->target_bitrate < 2000000)
pipe_rc->vbv_buffer_size = MIN2((pipe_rc->target_bitrate * 2.75), 2000000);
else
pipe_rc->vbv_buffer_size = pipe_rc->target_bitrate;
pipe_rc->fill_data_enable = !(rc->rc_flags.bits.disable_bit_stuffing);
pipe_rc->skip_frame_enable = 0;/* !(rc->rc_flags.bits.disable_frame_skip); */
pipe_rc->fill_data_enable = !(rc->rc_flags.bits.disable_bit_stuffing);
pipe_rc->skip_frame_enable = 0;/* !(rc->rc_flags.bits.disable_frame_skip); */
pipe_rc->max_qp = rc->max_qp;
pipe_rc->min_qp = rc->min_qp;
/* Distinguishes from the default params set for these values in other
functions and app specific params passed down */
pipe_rc->app_requested_qp_range = ((rc->max_qp > 0) || (rc->min_qp > 0));
if (pipe_rc->rate_ctrl_method == PIPE_H2645_ENC_RATE_CONTROL_METHOD_QUALITY_VARIABLE)
pipe_rc->vbr_quality_factor = rc->quality_factor;
}
if (pipe_rc->rate_ctrl_method == PIPE_H2645_ENC_RATE_CONTROL_METHOD_QUALITY_VARIABLE)
pipe_rc->vbr_quality_factor = rc->quality_factor;
return VA_STATUS_SUCCESS;
}


@ -315,8 +315,17 @@ intel_get_mesh_urb_config(const struct intel_device_info *devinfo,
* of entries, so we need to discount the space for constants for all of
* them. See 3DSTATE_URB_ALLOC_MESH and 3DSTATE_URB_ALLOC_TASK.
*/
const unsigned push_constant_kb = devinfo->mesh_max_constant_urb_size_kb;
unsigned push_constant_kb = devinfo->mesh_max_constant_urb_size_kb;
/* 3DSTATE_URB_ALLOC_MESH_BODY says
*
* MESH URB Starting Address SliceN
* This field specifies the offset (from the start of the URB memory
* in slices beyond Slice0) of the MESH URB allocation, specified in
* multiples of 8 KB.
*/
push_constant_kb = ALIGN(push_constant_kb, 8);
total_urb_kb -= push_constant_kb;
const unsigned total_urb_avail_mesh_task_kb = total_urb_kb;
/* TODO(mesh): Take push constant size as parameter instead of considering always
* the max? */
@ -338,55 +347,68 @@ intel_get_mesh_urb_config(const struct intel_device_info *devinfo,
if (task_urb_share_percentage >= 0) {
task_urb_share = task_urb_share_percentage / 100.0f;
} else {
task_urb_share = 1.0f * r.task_entry_size_64b /
(r.task_entry_size_64b + r.mesh_entry_size_64b);
task_urb_share = (float)r.task_entry_size_64b / (r.task_entry_size_64b + r.mesh_entry_size_64b);
}
}
const unsigned one_task_urb_kb = ALIGN(r.task_entry_size_64b * 64, 1024) / 1024;
unsigned task_urb_kb = MAX2(total_urb_kb * task_urb_share, one_task_urb_kb);
/* 3DSTATE_URB_ALLOC_MESH_BODY and 3DSTATE_URB_ALLOC_TASK_BODY says
*
* MESH Number of URB Entries must be divisible by 8 if the MESH/TASK URB
* Entry Allocation Size is less than 9 512-bit URB entries.
*/
const unsigned min_mesh_entries = r.mesh_entry_size_64b < 9 ? 8 : 1;
const unsigned min_task_entries = r.task_entry_size_64b < 9 ? 8 : 1;
const unsigned min_mesh_urb_kb = ALIGN(r.mesh_entry_size_64b * min_mesh_entries * 64, 1024) / 1024;
const unsigned min_task_urb_kb = ALIGN(r.task_entry_size_64b * min_task_entries * 64, 1024) / 1024;
total_urb_kb -= (min_mesh_urb_kb + min_task_urb_kb);
/* split the remaining urb_kbs */
unsigned task_urb_kb = total_urb_kb * task_urb_share;
unsigned mesh_urb_kb = total_urb_kb - task_urb_kb;
if (r.task_entry_size_64b > 0) {
/* sum minimum + split urb_kbs */
mesh_urb_kb += min_mesh_urb_kb;
/* 3DSTATE_URB_ALLOC_TASK_BODY says
* MESH Number of URB Entries SliceN
* This field specifies the offset (from the start of the URB memory
* in slices beyond Slice0) of the TASK URB allocation, specified in
* multiples of 8 KB.
*/
if ((total_urb_avail_mesh_task_kb - ALIGN(mesh_urb_kb, 8)) >= min_task_entries) {
mesh_urb_kb = ALIGN(mesh_urb_kb, 8);
} else {
mesh_urb_kb = ROUND_DOWN_TO(mesh_urb_kb, 8);
task_urb_kb = total_urb_kb - mesh_urb_kb;
}
/* TODO(mesh): Could we avoid allocating URB for Mesh if rasterization is
* disabled? */
unsigned next_address_8kb = DIV_ROUND_UP(push_constant_kb, 8);
r.mesh_entries = MIN2((mesh_urb_kb * 16) / r.mesh_entry_size_64b, 1548);
/* 3DSTATE_URB_ALLOC_MESH_BODY says
*
* MESH Number of URB Entries must be divisible by 8 if the MESH URB
* Entry Allocation Size is less than 9 512-bit URB entries.
*/
if (r.mesh_entry_size_64b < 9)
r.mesh_entries = ROUND_DOWN_TO(r.mesh_entries, 8);
unsigned next_address_8kb = push_constant_kb / 8;
assert(push_constant_kb % 8 == 0);
r.mesh_starting_address_8kb = next_address_8kb;
assert(mesh_urb_kb % 8 == 0);
next_address_8kb += mesh_urb_kb / 8;
r.mesh_entries = MIN2((mesh_urb_kb * 16) / r.mesh_entry_size_64b, 1548);
r.mesh_entries = r.mesh_entry_size_64b < 9 ? ROUND_DOWN_TO(r.mesh_entries, 8) : r.mesh_entries;
next_address_8kb += mesh_urb_kb / 8;
assert(mesh_urb_kb % 8 == 0);
r.task_starting_address_8kb = next_address_8kb;
task_urb_kb = total_urb_avail_mesh_task_kb - mesh_urb_kb;
if (r.task_entry_size_64b > 0) {
r.task_entries = MIN2((task_urb_kb * 16) / r.task_entry_size_64b, 1548);
/* 3DSTATE_URB_ALLOC_TASK_BODY says
*
* TASK Number of URB Entries must be divisible by 8 if the TASK URB
* Entry Allocation Size is less than 9 512-bit URB entries.
*/
if (r.task_entry_size_64b < 9)
r.task_entries = ROUND_DOWN_TO(r.task_entries, 8);
r.task_starting_address_8kb = next_address_8kb;
r.task_entries = r.task_entry_size_64b < 9 ? ROUND_DOWN_TO(r.task_entries, 8) : r.task_entries;
}
r.deref_block_size = r.mesh_entries > 32 ?
INTEL_URB_DEREF_BLOCK_SIZE_MESH :
INTEL_URB_DEREF_BLOCK_SIZE_PER_POLY;
assert(mesh_urb_kb + task_urb_kb <= total_urb_avail_mesh_task_kb);
assert(mesh_urb_kb >= min_mesh_urb_kb);
assert(task_urb_kb >= min_task_urb_kb);
return r;
}
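
The arithmetic behind the minimum-size and entry-count computations in the hunk above can be checked in isolation. The sketch below reimplements only those two formulas with hypothetical helper names; it is not the Mesa function and ignores the 8 KB address alignment and the task/mesh split.

```c
#include <assert.h>

/* Round v up or down to a multiple of a (a > 0). */
static unsigned
align_up(unsigned v, unsigned a)
{
   assert(a > 0);
   return (v + a - 1) / a * a;
}

static unsigned
round_down_to(unsigned v, unsigned a)
{
   assert(a > 0);
   return v / a * a;
}

/* Minimum URB space in KB for one stage: entries smaller than nine
 * 64-byte units must come in multiples of 8, so the minimum is 8 entries. */
static unsigned
min_urb_kb(unsigned entry_size_64b)
{
   unsigned min_entries = entry_size_64b < 9 ? 8 : 1;
   return align_up(entry_size_64b * min_entries * 64, 1024) / 1024;
}

/* Number of entries that fit in urb_kb, capped at 1548 as in the hunk. */
static unsigned
entries_for(unsigned urb_kb, unsigned entry_size_64b)
{
   assert(entry_size_64b > 0);
   unsigned n = urb_kb * 16 / entry_size_64b; /* 1 KB holds 16 units of 64 B */
   if (n > 1548)
      n = 1548;
   if (entry_size_64b < 9)
      n = round_down_to(n, 8);
   return n;
}
```

For example, an entry size of 8 units needs at least 4 KB (8 entries of 512 bytes), while an entry size of 9 units needs only 1 KB because a single entry is allowed.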


@ -1685,6 +1685,7 @@ brw_postprocess_nir(nir_shader *nir, const struct brw_compiler *compiler,
OPT(nir_opt_move, nir_move_comparisons);
OPT(nir_opt_dead_cf);
bool divergence_analysis_dirty = false;
NIR_PASS(_, nir, nir_convert_to_lcssa, true, true);
NIR_PASS_V(nir, nir_divergence_analysis);
@ -1710,11 +1711,19 @@ brw_postprocess_nir(nir_shader *nir, const struct brw_compiler *compiler,
if (OPT(nir_lower_int64))
brw_nir_optimize(nir, is_scalar, devinfo);
divergence_analysis_dirty = true;
}
/* Do this only after the last opt_gcm. GCM will undo this lowering. */
if (nir->info.stage == MESA_SHADER_FRAGMENT)
if (nir->info.stage == MESA_SHADER_FRAGMENT) {
if (divergence_analysis_dirty) {
NIR_PASS(_, nir, nir_convert_to_lcssa, true, true);
NIR_PASS_V(nir, nir_divergence_analysis);
}
OPT(brw_nir_lower_non_uniform_barycentric_at_sample);
}
/* Clean up LCSSA phis */
OPT(nir_opt_remove_phis);


@ -1898,12 +1898,6 @@ anv_device_release_bo(struct anv_device *device,
}
assert(bo->refcount == 0);
/* Unmap the entire BO. In the case that some addresses lacked an aux-map
* entry, the unmapping function will add table entries for them.
*/
if (anv_bo_allows_aux_map(device, bo))
intel_aux_map_unmap_range(device->aux_map_ctx, bo->offset, bo->size);
/* Memset the BO just in case. The refcount being zero should be enough to
* prevent someone from assuming the data is valid but it's safer to just
* stomp to zero just in case. We explicitly do this *before* we actually


@ -80,6 +80,7 @@ static const driOptionDescription anv_dri_options[] = {
DRI_CONF_ANV_ASSUME_FULL_SUBGROUPS(0)
DRI_CONF_ANV_DISABLE_FCV(false)
DRI_CONF_ANV_SAMPLE_MASK_OUT_OPENGL_BEHAVIOUR(false)
DRI_CONF_ANV_FORCE_FILTER_ADDR_ROUNDING(false)
DRI_CONF_ANV_FP64_WORKAROUND_ENABLED(false)
DRI_CONF_ANV_GENERATED_INDIRECT_THRESHOLD(4)
DRI_CONF_ANV_GENERATED_INDIRECT_RING_THRESHOLD(100)
@ -2465,6 +2466,8 @@ anv_init_dri_options(struct anv_instance *instance)
driQueryOptionb(&instance->dri_options, "limit_trig_input_range");
instance->sample_mask_out_opengl_behaviour =
driQueryOptionb(&instance->dri_options, "anv_sample_mask_out_opengl_behaviour");
instance->force_filter_addr_rounding =
driQueryOptionb(&instance->dri_options, "anv_force_filter_addr_rounding");
instance->lower_depth_range_rate =
driQueryOptionf(&instance->dri_options, "lower_depth_range_rate");
instance->no_16bit =


@ -1161,6 +1161,7 @@ struct anv_instance {
uint8_t assume_full_subgroups;
bool limit_trig_input_range;
bool sample_mask_out_opengl_behaviour;
bool force_filter_addr_rounding;
bool fp64_workaround_enabled;
float lower_depth_range_rate;
unsigned generated_indirect_threshold;
@ -3020,6 +3021,15 @@ enum anv_query_bits {
ANV_QUERY_WRITES_DATA_FLUSH = (1 << 3),
};
/* It's not clear why DG2 doesn't have issues with L3/CS coherency. But it's
* likely related to performance workaround 14015868140.
*
* For now we enable this only on DG2 and platform prior to Gfx12 where there
* is no tile cache.
*/
#define ANV_DEVINFO_HAS_COHERENT_L3_CS(devinfo) \
(intel_device_info_is_dg2(devinfo))
/* Things we need to flush before accessing query data using the command
* streamer.
*


@ -395,22 +395,18 @@ transition_depth_buffer(struct anv_cmd_buffer *cmd_buffer,
0, base_layer, layer_count, ISL_AUX_OP_AMBIGUATE);
}
#if GFX_VER == 12
/* Depth/Stencil writes by the render pipeline to D16 & S8 formats use a
* different pairing bit for the compression cache line. This means that
* there is potential for aliasing with the wrong cache if you use another
* format OR a piece of HW that does not use the same pairing. To avoid
* this, flush the tile cache as the compression data does not live in the
* color/depth cache.
/* Additional tile cache flush for MTL:
*
* https://gitlab.freedesktop.org/mesa/mesa/-/issues/10420
* https://gitlab.freedesktop.org/mesa/mesa/-/issues/10530
*/
if (image->planes[depth_plane].aux_usage == ISL_AUX_USAGE_HIZ_CCS &&
final_needs_depth && !initial_depth_valid &&
anv_image_format_is_d16_or_s8(image)) {
if (intel_device_info_is_mtl(cmd_buffer->device->info) &&
image->planes[depth_plane].aux_usage == ISL_AUX_USAGE_HIZ_CCS &&
final_needs_depth && !initial_depth_valid) {
anv_add_pending_pipe_bits(cmd_buffer,
ANV_PIPE_TILE_CACHE_FLUSH_BIT,
"D16 or S8 HIZ-CCS flush");
"HIZ-CCS flush");
}
#endif
}
/* Transitions a HiZ-enabled depth buffer from one layout to another. Unless
@ -467,17 +463,15 @@ transition_stencil_buffer(struct anv_cmd_buffer *cmd_buffer,
}
}
/* Depth/Stencil writes by the render pipeline to D16 & S8 formats use a
* different pairing bit for the compression cache line. This means that
* there is potential for aliasing with the wrong cache if you use another
* format OR a piece of HW that does not use the same pairing. To avoid
* this, flush the tile cache as the compression data does not live in the
* color/depth cache.
/* Additional tile cache flush for MTL:
*
* https://gitlab.freedesktop.org/mesa/mesa/-/issues/10420
* https://gitlab.freedesktop.org/mesa/mesa/-/issues/10530
*/
if (anv_image_format_is_d16_or_s8(image)) {
if (intel_device_info_is_mtl(cmd_buffer->device->info)) {
anv_add_pending_pipe_bits(cmd_buffer,
ANV_PIPE_TILE_CACHE_FLUSH_BIT,
"D16 or S8 HIZ-CCS flush");
"HIZ-CCS flush");
}
#endif
}
@ -4170,6 +4164,13 @@ mask_is_write(const VkAccessFlags2 access)
VK_ACCESS_2_OPTICAL_FLOW_WRITE_BIT_NV);
}
static inline bool
mask_is_transfer_write(const VkAccessFlags2 access)
{
return access & (VK_ACCESS_2_TRANSFER_WRITE_BIT |
VK_ACCESS_2_MEMORY_WRITE_BIT);
}
static void
cmd_buffer_barrier_video(struct anv_cmd_buffer *cmd_buffer,
const VkDependencyInfo *dep_info)
@ -4333,6 +4334,16 @@ cmd_buffer_barrier_blitter(struct anv_cmd_buffer *cmd_buffer,
#endif
}
static inline bool
cmd_buffer_has_pending_copy_query(struct anv_cmd_buffer *cmd_buffer)
{
/* Query copies are only written with dataport, so we only need to check
* that flag.
*/
return (cmd_buffer->state.queries.buffer_write_bits &
ANV_QUERY_WRITES_DATA_FLUSH) != 0;
}
static void
cmd_buffer_barrier(struct anv_cmd_buffer *cmd_buffer,
const VkDependencyInfo *dep_info,
@ -4358,6 +4369,7 @@ cmd_buffer_barrier(struct anv_cmd_buffer *cmd_buffer,
VkAccessFlags2 dst_flags = 0;
bool apply_sparse_flushes = false;
bool flush_query_copies = false;
for (uint32_t i = 0; i < dep_info->memoryBarrierCount; i++) {
src_flags |= dep_info->pMemoryBarriers[i].srcAccessMask;
@ -4373,6 +4385,11 @@ cmd_buffer_barrier(struct anv_cmd_buffer *cmd_buffer,
ANV_QUERY_COMPUTE_WRITES_PENDING_BITS;
}
if (stage_is_transfer(dep_info->pMemoryBarriers[i].srcStageMask) &&
mask_is_transfer_write(dep_info->pMemoryBarriers[i].srcAccessMask) &&
cmd_buffer_has_pending_copy_query(cmd_buffer))
flush_query_copies = true;
/* There's no way of knowing if this memory barrier is related to sparse
* buffers! This is pretty horrible.
*/
@ -4398,6 +4415,11 @@ cmd_buffer_barrier(struct anv_cmd_buffer *cmd_buffer,
ANV_QUERY_COMPUTE_WRITES_PENDING_BITS;
}
if (stage_is_transfer(buf_barrier->srcStageMask) &&
mask_is_transfer_write(buf_barrier->srcAccessMask) &&
cmd_buffer_has_pending_copy_query(cmd_buffer))
flush_query_copies = true;
if (anv_buffer_is_sparse(buffer) && mask_is_write(src_flags))
apply_sparse_flushes = true;
}
@ -4494,6 +4516,14 @@ cmd_buffer_barrier(struct anv_cmd_buffer *cmd_buffer,
if (apply_sparse_flushes)
bits |= ANV_PIPE_FLUSH_BITS;
/* Copies from query pools are executed with a shader writing through the
* dataport.
*/
if (flush_query_copies) {
bits |= (GFX_VER >= 12 ?
ANV_PIPE_HDC_PIPELINE_FLUSH_BIT : ANV_PIPE_DATA_CACHE_FLUSH_BIT);
}
if (dst_flags & VK_ACCESS_INDIRECT_COMMAND_READ_BIT)
genX(cmd_buffer_flush_generated_draws)(cmd_buffer);
@ -9133,10 +9163,8 @@ genX(CmdWriteBufferMarker2AMD)(VkCommandBuffer commandBuffer,
* cache flushes.
*/
enum anv_pipe_bits bits =
#if GFX_VERx10 < 125
ANV_PIPE_DATA_CACHE_FLUSH_BIT |
ANV_PIPE_TILE_CACHE_FLUSH_BIT |
#endif
(ANV_DEVINFO_HAS_COHERENT_L3_CS(cmd_buffer->device->info) ? 0 :
(ANV_PIPE_DATA_CACHE_FLUSH_BIT | ANV_PIPE_TILE_CACHE_FLUSH_BIT)) |
ANV_PIPE_END_OF_PIPE_SYNC_BIT;
trace_intel_begin_write_buffer_marker(&cmd_buffer->trace);


@ -1152,8 +1152,12 @@ VkResult genX(CreateSampler)(
const VkFilter mag_filter =
plane_has_chroma ? sampler->vk.ycbcr_conversion->state.chroma_filter :
pCreateInfo->magFilter;
const bool enable_min_filter_addr_rounding = min_filter != VK_FILTER_NEAREST;
const bool enable_mag_filter_addr_rounding = mag_filter != VK_FILTER_NEAREST;
const bool force_addr_rounding =
device->physical->instance->force_filter_addr_rounding;
const bool enable_min_filter_addr_rounding =
force_addr_rounding || min_filter != VK_FILTER_NEAREST;
const bool enable_mag_filter_addr_rounding =
force_addr_rounding || mag_filter != VK_FILTER_NEAREST;
/* From Broadwell PRM, SAMPLER_STATE:
* "Mip Mode Filter must be set to MIPFILTER_NONE for Planar YUV surfaces."
*/
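
The override added above reduces to a single boolean expression per filter: rounding is enabled when the driconf option forces it, or when the filter is anything other than NEAREST. A minimal sketch of that decision follows, assuming a hypothetical two-value `enum filter` in place of `VkFilter` and a hypothetical helper name `enable_addr_rounding`; neither appears in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for VkFilter; only the two values needed here. */
enum filter { FILTER_NEAREST = 0, FILTER_LINEAR = 1 };

/* Hypothetical helper mirroring the expression in the hunk:
 * rounding is on if forced, or if the filter is not NEAREST. */
static bool enable_addr_rounding(bool force_addr_rounding, enum filter f)
{
   assert(f == FILTER_NEAREST || f == FILTER_LINEAR);
   return force_addr_rounding || f != FILTER_NEAREST;
}
```

With the option left at its default, NEAREST sampling keeps rounding disabled as before; only an application entry that sets `anv_force_filter_addr_rounding` changes the result for NEAREST.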


@ -1812,11 +1812,11 @@ copy_query_results_with_shader(struct anv_cmd_buffer *cmd_buffer,
genX(emit_simple_shader_dispatch)(&state, query_count, push_data_state);
anv_add_pending_pipe_bits(cmd_buffer,
cmd_buffer->state.current_pipeline == GPGPU ?
ANV_QUERY_COMPUTE_WRITES_PENDING_BITS :
ANV_QUERY_RENDER_TARGET_WRITES_PENDING_BITS(device->info),
"after query copy results");
/* The query copy result shader is writing using the dataport, flush
* HDC/Data cache depending on the generation. Also stall at pixel
* scoreboard in case we're doing the copy with a fragment shader.
*/
cmd_buffer->state.queries.buffer_write_bits |= ANV_QUERY_WRITES_DATA_FLUSH;
trace_intel_end_query_copy_shader(&cmd_buffer->trace, query_count);
}


@ -199,7 +199,9 @@ libvulkan_intel_hasvk = shared_library(
],
c_args : anv_flags,
gnu_symbol_visibility : 'hidden',
link_args : [ld_args_build_id, ld_args_bsymbolic, ld_args_gc_sections],
link_args : [vulkan_icd_link_args, ld_args_build_id,
ld_args_bsymbolic, ld_args_gc_sections],
link_depends : vulkan_icd_link_depends,
install : true,
)


@ -7101,14 +7101,14 @@ texture_image_multisample(struct gl_context *ctx, GLuint dims,
if (!st_SetTextureStorageForMemoryObject(ctx, texObj,
memObj, 1, width,
height, depth,
offset)) {
offset, func)) {
_mesa_init_teximage_fields(ctx, texImage, 0, 0, 0, 0,
internalformat, texFormat);
}
} else {
if (!st_AllocTextureStorage(ctx, texObj, 1,
width, height, depth)) {
width, height, depth, func)) {
/* tidy up the texture image state. strictly speaking,
* we're allowed to just leave this in whatever state we
* like, but being tidy is good.


@ -439,7 +439,7 @@ texture_storage(struct gl_context *ctx, GLuint dims,
struct gl_memory_object *memObj, GLenum target,
GLsizei levels, GLenum internalformat, GLsizei width,
GLsizei height, GLsizei depth, GLuint64 offset, bool dsa,
bool no_error)
bool no_error, const char *func)
{
GLboolean sizeOK = GL_TRUE, dimensionsOK = GL_TRUE;
mesa_format texFormat;
@ -517,7 +517,7 @@ texture_storage(struct gl_context *ctx, GLuint dims,
if (!st_SetTextureStorageForMemoryObject(ctx, texObj, memObj,
levels,
width, height, depth,
offset)) {
offset, func)) {
clear_texture_fields(ctx, texObj);
return;
@ -525,7 +525,7 @@ texture_storage(struct gl_context *ctx, GLuint dims,
}
else {
if (!st_AllocTextureStorage(ctx, texObj, levels,
width, height, depth)) {
width, height, depth, func)) {
/* Reset the texture images' info to zeros.
* Strictly speaking, we probably don't have to do this since
* generating GL_OUT_OF_MEMORY can leave things in an undefined
@ -550,10 +550,10 @@ texture_storage_error(struct gl_context *ctx, GLuint dims,
struct gl_texture_object *texObj,
GLenum target, GLsizei levels,
GLenum internalformat, GLsizei width,
GLsizei height, GLsizei depth, bool dsa)
GLsizei height, GLsizei depth, bool dsa, const char *func)
{
texture_storage(ctx, dims, texObj, NULL, target, levels, internalformat,
width, height, depth, dsa, 0, false);
width, height, depth, dsa, 0, false, func);
}
@ -562,10 +562,10 @@ texture_storage_no_error(struct gl_context *ctx, GLuint dims,
struct gl_texture_object *texObj,
GLenum target, GLsizei levels,
GLenum internalformat, GLsizei width,
GLsizei height, GLsizei depth, bool dsa)
GLsizei height, GLsizei depth, bool dsa, const char *func)
{
texture_storage(ctx, dims, texObj, NULL, target, levels, internalformat,
width, height, depth, dsa, 0, true);
width, height, depth, dsa, 0, true, func);
}
@ -609,20 +609,20 @@ texstorage_error(GLuint dims, GLenum target, GLsizei levels,
return;
texture_storage_error(ctx, dims, texObj, target, levels,
internalformat, width, height, depth, false);
internalformat, width, height, depth, false, caller);
}
static void
texstorage_no_error(GLuint dims, GLenum target, GLsizei levels,
GLenum internalformat, GLsizei width, GLsizei height,
GLsizei depth)
GLsizei depth, const char *caller)
{
GET_CURRENT_CONTEXT(ctx);
struct gl_texture_object *texObj = _mesa_get_current_tex_object(ctx, target);
texture_storage_no_error(ctx, dims, texObj, target, levels,
internalformat, width, height, depth, false);
internalformat, width, height, depth, false, caller);
}
@ -666,20 +666,20 @@ texturestorage_error(GLuint dims, GLuint texture, GLsizei levels,
}
texture_storage_error(ctx, dims, texObj, texObj->Target,
levels, internalformat, width, height, depth, true);
levels, internalformat, width, height, depth, true, caller);
}
static void
texturestorage_no_error(GLuint dims, GLuint texture, GLsizei levels,
GLenum internalformat, GLsizei width, GLsizei height,
GLsizei depth)
GLsizei depth, const char *caller)
{
GET_CURRENT_CONTEXT(ctx);
struct gl_texture_object *texObj = _mesa_lookup_texture(ctx, texture);
texture_storage_no_error(ctx, dims, texObj, texObj->Target,
levels, internalformat, width, height, depth, true);
levels, internalformat, width, height, depth, true, caller);
}
@ -687,7 +687,8 @@ void GLAPIENTRY
_mesa_TexStorage1D_no_error(GLenum target, GLsizei levels,
GLenum internalformat, GLsizei width)
{
texstorage_no_error(1, target, levels, internalformat, width, 1, 1);
texstorage_no_error(1, target, levels, internalformat, width, 1, 1,
"glTexStorage1D");
}
@ -705,7 +706,8 @@ _mesa_TexStorage2D_no_error(GLenum target, GLsizei levels,
GLenum internalformat, GLsizei width,
GLsizei height)
{
texstorage_no_error(2, target, levels, internalformat, width, height, 1);
texstorage_no_error(2, target, levels, internalformat, width, height, 1,
"glTexStorage2D");
}
@ -723,7 +725,8 @@ _mesa_TexStorage3D_no_error(GLenum target, GLsizei levels,
GLenum internalformat, GLsizei width,
GLsizei height, GLsizei depth)
{
texstorage_no_error(3, target, levels, internalformat, width, height, depth);
texstorage_no_error(3, target, levels, internalformat, width, height, depth,
"glTexStorage3D");
}
@ -740,7 +743,8 @@ void GLAPIENTRY
_mesa_TextureStorage1D_no_error(GLuint texture, GLsizei levels,
GLenum internalformat, GLsizei width)
{
texturestorage_no_error(1, texture, levels, internalformat, width, 1, 1);
texturestorage_no_error(1, texture, levels, internalformat, width, 1, 1,
"glTextureStorage1D");
}
@ -758,7 +762,8 @@ _mesa_TextureStorage2D_no_error(GLuint texture, GLsizei levels,
GLenum internalformat,
GLsizei width, GLsizei height)
{
texturestorage_no_error(2, texture, levels, internalformat, width, height, 1);
texturestorage_no_error(2, texture, levels, internalformat, width, height, 1,
"glTextureStorage2D");
}
@ -778,7 +783,7 @@ _mesa_TextureStorage3D_no_error(GLuint texture, GLsizei levels,
GLsizei height, GLsizei depth)
{
texturestorage_no_error(3, texture, levels, internalformat, width, height,
depth);
depth, "glTextureStorage3D");
}
@ -854,5 +859,5 @@ _mesa_texture_storage_memory(struct gl_context *ctx, GLuint dims,
assert(memObj);
texture_storage(ctx, dims, texObj, memObj, target, levels, internalformat,
width, height, depth, offset, dsa, false);
width, height, depth, offset, dsa, false, "");
}


@ -3373,7 +3373,7 @@ st_texture_storage(struct gl_context *ctx,
GLsizei levels, GLsizei width,
GLsizei height, GLsizei depth,
struct gl_memory_object *memObj,
GLuint64 offset)
GLuint64 offset, const char *func)
{
const GLuint numFaces = _mesa_num_tex_faces(texObj->Target);
struct gl_texture_image *texImage = texObj->Image[0][0];
@ -3423,6 +3423,7 @@ st_texture_storage(struct gl_context *ctx,
}
if (!found) {
_mesa_error(st->ctx, GL_INVALID_OPERATION, "%s(format/samplecount not supported)", func);
return GL_FALSE;
}
}
@ -3459,8 +3460,10 @@ st_texture_storage(struct gl_context *ctx,
texObj->IsSparse);
}
if (!texObj->pt)
if (!texObj->pt) {
_mesa_error(st->ctx, GL_OUT_OF_MEMORY, "%s", func);
return GL_FALSE;
}
/* Set image resource pointers */
for (level = 0; level < levels; level++) {
@ -3493,11 +3496,12 @@ GLboolean
st_AllocTextureStorage(struct gl_context *ctx,
struct gl_texture_object *texObj,
GLsizei levels, GLsizei width,
GLsizei height, GLsizei depth)
GLsizei height, GLsizei depth,
const char *func)
{
return st_texture_storage(ctx, texObj, levels,
width, height, depth,
NULL, 0);
NULL, 0, func);
}
@ -3715,11 +3719,11 @@ st_SetTextureStorageForMemoryObject(struct gl_context *ctx,
struct gl_memory_object *memObj,
GLsizei levels, GLsizei width,
GLsizei height, GLsizei depth,
GLuint64 offset)
GLuint64 offset, const char *func)
{
return st_texture_storage(ctx, texObj, levels,
width, height, depth,
memObj, offset);
memObj, offset, func);
}
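
The hunks above thread a `func` string from each GL entry point down to the allocation helper so that the error message names the call the application actually made. A minimal sketch of that pattern follows. It assumes a hypothetical `report_error` sink in place of `_mesa_error` and hypothetical helper names throughout; it illustrates the call chain only, not Mesa's implementation.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error sink standing in for _mesa_error(). */
static char last_error[128];

static void report_error(const char *func, const char *reason)
{
   snprintf(last_error, sizeof(last_error), "%s(%s)", func, reason);
}

/* Innermost helper: reports failure under the caller-supplied name. */
static bool alloc_storage(bool format_supported, const char *func)
{
   if (!format_supported) {
      report_error(func, "format/samplecount not supported");
      return false;
   }
   return true;
}

/* Entry point: passes its own GL-style name down the call chain. */
static bool tex_storage_2d(bool format_supported)
{
   return alloc_storage(format_supported, "glTexStorage2D");
}
```

Because the name travels as a parameter, one shared helper can serve several entry points and still produce a message specific to each.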
GLboolean


@ -98,7 +98,8 @@ void st_CopyTexSubImage(struct gl_context *ctx, GLuint dims,
GLboolean st_AllocTextureStorage(struct gl_context *ctx,
struct gl_texture_object *texObj,
GLsizei levels, GLsizei width,
GLsizei height, GLsizei depth);
GLsizei height, GLsizei depth,
const char *func);
GLboolean st_TestProxyTexImage(struct gl_context *ctx, GLenum target,
GLuint numLevels, GLint level,
mesa_format format, GLuint numSamples,
@ -116,7 +117,8 @@ GLboolean st_SetTextureStorageForMemoryObject(struct gl_context *ctx,
struct gl_memory_object *memObj,
GLsizei levels, GLsizei width,
GLsizei height, GLsizei depth,
GLuint64 offset);
GLuint64 offset,
const char *func);
GLboolean st_GetSparseTextureVirtualPageSize(struct gl_context *ctx,
GLenum target, mesa_format format,


@ -2053,6 +2053,10 @@ void
nvk_cmd_bind_vertex_buffer(struct nvk_cmd_buffer *cmd, uint32_t vb_idx,
struct nvk_addr_range addr_range)
{
/* Used for meta save/restore */
if (vb_idx == 0)
cmd->state.gfx.vb0 = addr_range;
struct nv_push *p = nvk_cmd_buffer_push(cmd, 6);
P_MTHD(p, NV9097, SET_VERTEX_STREAM_A_LOCATION_A(vb_idx));
@ -2097,10 +2101,6 @@ nvk_CmdBindVertexBuffers2(VkCommandBuffer commandBuffer,
const struct nvk_addr_range addr_range =
nvk_buffer_addr_range(buffer, pOffsets[i], size);
/* Used for meta save/restore */
if (idx == 0)
cmd->state.gfx.vb0 = addr_range;
nvk_cmd_bind_vertex_buffer(cmd, idx, addr_range);
}
}


@ -69,6 +69,7 @@ if with_tests
c_args : [c_msvc_compat_args, no_override_init_args],
gnu_symbol_visibility : 'hidden',
include_directories : [inc_include, inc_src, inc_mesa],
dependencies: [idep_valhall_enums_h],
link_with : [libpanfrost_valhall_disasm],
),
suite : ['panfrost'],


@ -106,9 +106,9 @@ va_fuse_add_imm(bi_instr *I)
/* If the constant is negated, flip the sign bit */
if (I->src[s].neg) {
if (I->op == BI_OPCODE_FADD_IMM_F32)
I->index ^= (1 << 31);
I->index ^= (1u << 31);
else if (I->op == BI_OPCODE_FADD_IMM_V2F16)
I->index ^= (1 << 31) | (1 << 15);
I->index ^= (1u << 31) | (1u << 15);
else
unreachable("unexpected .neg");
}
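
The change from `1 << 31` to `1u << 31` matters because, with a 32-bit `int`, shifting a signed 1 into the sign bit is undefined behavior in C, while the unsigned shift is well defined. A minimal sketch of the two sign flips follows, assuming IEEE 754 bit layouts and hypothetical helper names (`flip_sign_f32`, `flip_sign_v2f16`, `f32_to_bits`) that do not appear in the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: flip the sign bit of an fp32 bit pattern. */
static uint32_t flip_sign_f32(uint32_t bits)
{
   return bits ^ (1u << 31);
}

/* Hypothetical helper: flip both sign bits of two packed fp16 values. */
static uint32_t flip_sign_v2f16(uint32_t bits)
{
   return bits ^ ((1u << 31) | (1u << 15));
}

/* Reinterpret a float as its raw 32-bit pattern without aliasing issues. */
static uint32_t f32_to_bits(float f)
{
   uint32_t u;
   assert(sizeof(u) == sizeof(f));
   memcpy(&u, &f, sizeof(u));
   return u;
}
```

For the packed fp16 case, bit 31 is the sign of the high half and bit 15 the sign of the low half, so one XOR negates both lanes.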


@ -992,12 +992,16 @@
<field name="Job Task Split" size="4" start="0:26" type="uint"/>
</struct>
<struct name="Compute Padding" size="2">
</struct>
<!-- Compute job also covers vertex and geometry operations -->
<aggregate name="Compute Job" align="64">
<section name="Header" offset="0" type="Job Header"/>
<section name="Invocation" offset="32" type="Invocation"/>
<section name="Parameters" offset="40" type="Compute Job Parameters"/>
<section name="Draw" offset="64" type="Draw"/>
<section name="Compute padding" offset="184" type="Compute Padding"/>
</aggregate>
<struct name="Primitive Size">


@ -106,6 +106,7 @@ pan_nir_lower_zs_store(nir_shader *nir)
stores[1] = intr;
writeout |= PAN_WRITEOUT_S;
} else if (sem.dual_source_blend_index) {
assert(!stores[2]); /* there should be only 1 source for dual blending */
stores[2] = intr;
writeout |= PAN_WRITEOUT_2;
}


@ -169,6 +169,7 @@ TODO: document the other workarounds.
<application name="Dying Light" executable="DyingLightGame">
<option name="allow_glsl_builtin_variable_redeclaration" value="true" />
<option name="dual_color_blend_by_location" value="true" />
</application>
<application name="Exanima" executable="Exanima.exe">
@ -939,6 +940,11 @@ TODO: document the other workarounds.
<option name="vk_dont_care_as_load" value="true" />
</application>
<!-- Atlas Fallen Vulkan crashes with vsync turned off on xwayland without this workaround. -->
<application name="Atlas Fallen" executable="AtlasFallen (VK).exe">
<option name="vk_x11_strict_image_count" value="true" />
</application>
<!-- Disable fp16 support for browsers, since there is too much
broken WebGL out there that uses the wrong precision.
Bonus workaround for Firefox bug #1845309. -->
@ -1116,6 +1122,11 @@ TODO: document the other workarounds.
<application name="Insurgency" executable="insurgency_linux">
<option name="force_gl_vendor" value="X.Org" />
</application>
<application name="SPECviewperf13" executable="viewperf">
<!-- creo-03 needs this to compile shaders; we don't support some corner cases -->
<option name="mesa_extension_override" value="+GL_EXT_shader_image_load_store" />
</application>
</device>
<device driver="crocus">
<application name="glmark2" executable="glmark2">
@ -1178,6 +1189,9 @@ TODO: document the other workarounds.
<application name="Armored Core 6" executable="armoredcore6.exe">
<option name="fake_sparse" value="true" />
</application>
<application name="Age of Empires IV" executable="RelicCardinal.exe">
<option name="anv_force_filter_addr_rounding" value="true" />
</application>
<!-- Needed to avoid XeSS code paths. -->
<application name="Marvel's Spider-Man Remastered" executable="Spider-Man.exe">
<option name="force_vk_vendor" value="-1" />


@ -159,6 +159,10 @@ Application bugs worked around in this file:
<option name="radv_ssbo_non_uniform" value="true" />
</application>
<application name="Star Wars: Jedi Survivor" executable="JediSurvivor.exe">
<option name="radv_force_active_accel_struct_leaves" value="true" />
</application>
<!-- OpenGL Game workarounds (zink) -->
<application name="Black Geyser: Couriers of Darkness" executable="BlackGeyser.x86_64">
<option name="radv_zero_vram" value="true" />


@ -732,6 +732,10 @@
DRI_CONF_OPT_B(anv_sample_mask_out_opengl_behaviour, def, \
"Ignore sample mask out when having single sampled target")
#define DRI_CONF_ANV_FORCE_FILTER_ADDR_ROUNDING(def) \
DRI_CONF_OPT_B(anv_force_filter_addr_rounding, def, \
"Force min/mag filter address rounding to be enabled even for NEAREST sampling")
#define DRI_CONF_ANV_MESH_CONV_PRIM_ATTRS_TO_VERT_ATTRS(def) \
DRI_CONF_OPT_E(anv_mesh_conv_prim_attrs_to_vert_attrs, def, -2, 2, \
"Apply workaround for gfx12.5 per-prim attribute corruption HW bug", \


@ -367,7 +367,7 @@ def get_feature_structs(doc, api, beta):
# Skip extensions with a define for now
guard = required[_type.attrib['name']].guard
if guard is not None and (guard != "VK_ENABLE_BETA_EXTENSIONS" or not beta):
if guard is not None and (guard != "VK_ENABLE_BETA_EXTENSIONS" or beta != "true"):
continue
# find Vulkan structure type


@ -223,7 +223,7 @@ def get_property_structs(doc, api, beta):
# Skip extensions with a define for now
guard = required[full_name].guard
if guard is not None and (guard != "VK_ENABLE_BETA_EXTENSIONS" or not beta):
if guard is not None and (guard != "VK_ENABLE_BETA_EXTENSIONS" or beta != "true"):
continue
# find Vulkan structure type


@ -1,13 +1,13 @@
[wrap-file]
directory = zlib-1.3
source_url = http://zlib.net/fossils/zlib-1.3.tar.gz
source_fallback_url = https://github.com/mesonbuild/wrapdb/releases/download/zlib_1.3-5/zlib-1.3.tar.gz
source_filename = zlib-1.3.tar.gz
source_hash = ff0ba4c292013dbc27530b3a81e1f9a813cd39de01ca5e0f8bf355702efa593e
patch_filename = zlib_1.3-5_patch.zip
patch_url = https://wrapdb.mesonbuild.com/v2/zlib_1.3-5/get_patch
patch_hash = 524fb7648b68474a00c0e2d27e0ca707fb4ba0eb6d510d4f18f29cf656ba2b8b
wrapdb_version = 1.3-5
directory = zlib-1.3.1
source_url = http://zlib.net/fossils/zlib-1.3.1.tar.gz
source_fallback_url = https://github.com/mesonbuild/wrapdb/releases/download/zlib_1.3.1-1/zlib-1.3.1.tar.gz
source_filename = zlib-1.3.1.tar.gz
source_hash = 9a93b2b7dfdac77ceba5a558a580e74667dd6fede4585b91eefb60f03b72df23
patch_filename = zlib_1.3.1-1_patch.zip
patch_url = https://wrapdb.mesonbuild.com/v2/zlib_1.3.1-1/get_patch
patch_hash = e79b98eb24a75392009cec6f99ca5cdca9881ff20bfa174e8b8926d5c7a47095
wrapdb_version = 1.3.1-1
[provide]
zlib = zlib_dep