Mesa 20.0.0 Release Notes / 2020-02-19
======================================

Mesa 20.0.0 is a new development release. People who are concerned with
stability and reliability should stick with a previous release or wait
for Mesa 20.0.1.

Mesa 20.0.0 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being
used. Some drivers don't support all the features required in OpenGL
4.6. OpenGL 4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each
driver.

Mesa 20.0.0 implements the Vulkan 1.2 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct depends
on the particular driver being used.

SHA256 checksum
---------------

::

   bb6db3e54b608d2536d4000b3de7dd3ae115fc114e8acbb5afff4b3bbed04b34  mesa-20.0.0.tar.xz

New features
------------

- OpenGL 4.6 on radeonsi.
- GL_ARB_gl_spirv on radeonsi.
- GL_ARB_spirv_extensions on radeonsi.
- GL_EXT_direct_state_access for compatibility profile.
- VK_AMD_device_coherent_memory on RADV.
- VK_AMD_mixed_attachment_samples on RADV.
- VK_AMD_shader_explicit_vertex_parameter on RADV.
- VK_AMD_shader_image_load_store_lod on RADV.
- VK_AMD_shader_fragment_mask on RADV.
- VK_EXT_subgroup_size_control on RADV/LLVM.
- VK_KHR_separate_depth_stencil_layouts on Intel, RADV.
- VK_KHR_shader_subgroup_extended_types on RADV.
- VK_KHR_swapchain_mutable_format on RADV.
- VK_KHR_shader_float_controls on RADV/ACO.
- GFX6 (Southern Islands) and GFX7 (Sea Islands) support on RADV/ACO.
- Wave32 support for GFX10 (Navi) on RADV/ACO.
- Compilation of Geometry Shaders on RADV/ACO.
- Vulkan 1.2 on Intel, RADV.
- GL_INTEL_shader_integer_functions2 and
  VK_INTEL_shader_integer_functions2 on Intel.

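The introduction notes that the Vulkan version a driver reports comes from the apiVersion property of VkPhysicalDeviceProperties. As a minimal sketch, assuming the standard Vulkan version packing used by the VK_MAKE_VERSION, VK_VERSION_MAJOR, VK_VERSION_MINOR and VK_VERSION_PATCH macros (major in bits 22 and up, minor in bits 12 to 21, patch in bits 0 to 11), the raw integer could be decoded as follows. The function names and the sample value are illustrative and are not part of Mesa or the Vulkan headers.

```python
from typing import Tuple


def make_vk_version(major: int, minor: int, patch: int) -> int:
    """Pack a version the way VK_MAKE_VERSION does (hypothetical helper name)."""
    return (major << 22) | (minor << 12) | patch


def decode_vk_api_version(api_version: int) -> Tuple[int, int, int]:
    """Split a packed apiVersion value into (major, minor, patch)."""
    major = api_version >> 22
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return major, minor, patch


def supports_vulkan(api_version: int, major: int, minor: int) -> bool:
    """Return True if the reported version is at least major.minor."""
    reported_major, reported_minor, _ = decode_vk_api_version(api_version)
    return (reported_major, reported_minor) >= (major, minor)


if __name__ == "__main__":
    # Illustrative value only; a real one would come from
    # vkGetPhysicalDeviceProperties on the driver in use.
    reported = make_vk_version(1, 2, 131)
    print(decode_vk_api_version(reported))
    print(supports_vulkan(reported, 1, 2))
```

An application that needs Vulkan 1.2 features would run a comparison of this kind against the value reported by the selected physical device, since a given driver may report a lower version than the one Mesa implements.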
Bug fixes
---------

- drisw crashes on calling NULL putImage on EGL surfaceless platform (pbuffer EGLSurface)
- [radeonsi][vaapi][bisected] invalid VASurfaceID when playing interlaced DVB stream in Kodi
- [RADV] GPU hangs while the cutscene plays in the game Assassin's Creed Origins
- ACO: The Elder Scrolls Online crashes on startup (Navi)
- Broken rendering of glxgears on S/390 architecture (64bit, BigEndian)
- aco: sun flickering with Assassins Creeds Origins
- !1896 broke ext_image_dma_buf_import piglit tests with radeonsi
- aco: wrong geometry with Assassins Creed Origins on GFX6
- valgrind errors since commit a8ec4082a41
- OSMesa osmesa_choose_format returns a format not supported by st_new_renderbuffer_fb
- Build error with VS on WIN
- Using EGL_KHR_surfaceless_context causes spurious "libEGL warning: FIXME: egl/x11 doesn't support front buffer rendering."
- !3460 broke texsubimage test with piglit on zink+anv
- The screen is black when using ACO
- [Regression] JavaFX unbounded VRAM+RAM usage
- radv: implement VK_AMD_shader_explicit_vertex_parameter
- Civilization VI crashes when loading game (AMD Vega Mobile)
- [radeonsi] X-Server crashes when trying to start Guild Wars 2 with the commits from !3421
- aco: implement GFX6 support
- Add support for VK_KHR_swapchain_mutable_format
- radv: The Surge 2 crashes in ac_get_elem_bits()
- [Regression] JavaFX unbounded VRAM+RAM usage
- Use the OpenCL dispatch defnitions from OpenCL_Headers
- [regression][ilk,g965,g45] various dEQP-GLES2.functional.shaders.\* failures
- aco: Dead Rising 4 crashes in lower_to_hw_instr() on GFX6-GFX7
- libvulkan_radeon.so crash with \`free(): double free detected in tcache 2\`
- Commit be08e6a causes crash in com.android.launcher3 (Launcher)
- anv: Regression causing issues for radv when there are no Intel devices
- Mesa no longer compiles with GCC 10
- [Navi/aco] Guild Wars 2 - ring gfx timeout with commit 3bca0af2
- [radv/aco] Regression is causing a soft crash in The Witcher 3
- [bisected] [radeonsi] GPU hangs/resets while playing interlaced content on Kodi with VAAPI
- [radeonsi] MSAA image not copied properly after image store through texture view
- T-Rex and Manhattan onscreen performance issue on Android
- VkSamplerCreateInfo compareEnable not respected
- VkSamplerCreateInfo compareEnable not respected
- Freedreno drm softpin driver implementation leaks memory
- [POLARIS10] VRAM leak involving glTexImage2D with non-NULL data argument
- [regression][bisected][ivb/byt] crucible test func.push-constants.basic.q0 causes gpu hang
- MR 3096 broke lots of piglit ext_framebuffer_object tests on Raven
- Rise of the Tomb Raider benchmark crash on Dell XPS 7390 2-in-1 w/ Iris Plus Graphics (Ice Lake 8x8 GT2)
- Raven Ridge (2400G): Resident Evil 2 crashes my machine
- Common practice of glGetActiveUniform leads to O(N²) behavior in Mesa
- Rocket League ingame artifacts
- [radv] SteamVR direct mode no longer works
- [ANV] unused create parameters not properly ignored
- [Bisected] Mesa fails to start alacritty with the wayland backend (AMD Vega).
- [iris] piglit test clip-distance-vs-gs-out fails due to VUE map mismatch between VS <-> GS stages
- [radv] SteamVR direct mode no longer works
- Blocky corruption in The Surge 2
- radeonsi: Floating point exception on R9 270 gpu for a set of traces
- [RADV] [Navi] LOD artifacting in Halo - The Master Chief Collection (Halo Reach)
- [CTS] dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.r32g32b32\_\* fail on GFX6-GFX8
- Vulkan: Please consider adding another sample count to sampledImageIntegerSampleCounts
- Navi10: Bitrate based encoding with VAAPI/RadeonSI unusable
- [RADV] create parameters not properly ignored
- [regression][bdw,gen9,hsw,icl][iris] gltcs failures on mesa=8172b1fa03f
- Bugs in RadeonSI VAAPI implementation
- [GFX10] Glitch rendering Custom Avatars in Beat Saber
- intel/fs: Check for 16-bit immediates in fs_visitor::lower_mul_dword_inst is too strict
- i965/iris: assert when destroy GL context with active query
- Visuals without alpha bits are not sRGB-capable
- swapchain throttling: wait for fence has 1ns timeout
- radeonsi: OpenGL app always produces page fault in gfxhub on Navi 10
- [regression] KHR-GLES31.core.geometry_shader.api.program_pipeline_vs_gs_capture fails for various drivers
- [CTS] dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point hangs on GFX10
- [RADV] SPIR-V warning when compiling shader using storage multisampled image array
- [RADV] The Dead Rising 4 is causing a GPU hang with LLVM backend
- macOS u_thread.h:156:4: error: implicit declaration of function 'pthread_getcpuclockid'
- [Wine / Vulkan] Doom 2016 Hangs on Main Menu
- NULL resource when playing VP9 video through VDPAU on RX 570
- radeonsi: mpv --vo=vaapi incorrect rendering on gfx9+
- [BSW/BDW] skia lcdblendmode & lcdoverlap test failure
- Create a way to prefer iris vs i965 via driconf
- [Bisected] i965: CS:GO crashes in emit_deref_copy_load_store with debug Mesa
- radv/aco Jedi Fallen Order hair rendering buggy
- Inaccurate information on https://www.mesa3d.org/repository.html about how to get git write access.
- [RADV] VK_KHR_timeline_semaphore balloons in runtime
- Shadow of Mordor has randomly dancing black shadows on Talion's face
- gen7 crucible failures func.push-constants.basic.q0 and func.shader-subgroup-vote.basic.q0
- GL_EXT_disjoint_timer_query failing with GL_INVALID_ENUM
- Unreal 4 Elemental and MatineeFightScene demos misrender
- gputest gimark has unwanted black liquorice flakes
- triangle strip clipping with GL_FIRST_VERTEX_CONVENTION causes wrong vertex's attribute to be broadcasted for flat interpolation
- [bisected][regression][g45,g965,ilk] piglit arb_fragment_program kil failures
- glcts crashes since the enablement of ARB_shading_language_include
- Android build broken
- ld.lld: error: duplicate symbol (mesa-19.3.0-rc1)
- Divinity: Original Sin Enhanced Edition(Native) crash on start
- HSW. Tropico 6 and SuperTuxKart have shadows flickering
- GL_EXT_disjoint_timer_query failing with GL_INVALID_ENUM
- glxgears segfaults on POWER / Xvnc
- [regression][bdw,gen9,icl][iris] piglit failures on mesa f9fd04aca15fd00889caa666ba38007268e67f5c
- Redundant builds of libmesa_classic and libmesa_gallium
- [IVB,BYT] [Regression] [Bisected] Core dump at launching arb_compute_shader/linker/bug-93840.shader_test
- Vulkan drivers need access to format utils of gallium
- Disabling lower_fragdata_array causes shader-db to crash for some drivers
- GL_EXT_disjoint_timer_query failing with GL_INVALID_ENUM
- Android build broken by commit 9020f51 "util/u_endian: Add error checks"
- radv secure compile feature breaks compilation of RADV on armhf EABI (19.3-rc1)
- radv_debug.c warnings when compiling on 32 bits : cast to pointer from integer of different size
- Meson: Mesa3D build failure with standalone Mingw-w64 multilib
- [regression][bisected] KHR46 VertexArrayAttribFormat has unexpectedly generated GL_INVALID_OPERATION
- textureSize(samplerExternalOES, int) missing in
desktop mesa 19.1.7 implementation - zink: implicly casting integers to pointers, warnings on 32-bit compile - Objects leaving trails in Firefox with antialias and preserveDrawingBuffer in three.js WebGLRednerer with mesa 19.2 Changes ------- Aaron Watry (1): - clover/llvm: fix build after llvm 10 commit 1dfede3122ee Adam Jackson (1): - drisw: Cache the depth of the X drawable Afonso Bordado (4): - pan/midgard: Optimize comparisions with similar operations - pan/midgard: Move midgard_is_branch_unit to helpers - pan/midgard: Optimize branches with inverted arguments - pan/midgard: Fix midgard_compile.h includes Alan Coopersmith (1): - intel/perf: adapt to platforms like Solaris without d_type in struct dirent Alejandro Piñeiro (4): - v3d: adds an extra MOV for any sig.ld\* - mesa/main/util: moving gallium u_mm to util, remove main/mm - nir/opt_peephole_select: remove unused variables - turnip: remove unused descriptor state dirty Alexander van der Grinten (1): - egl: Fix \_eglPointerIsDereferencable w/o mincore() Alexander von Gluck IV (1): - haiku/hgl: Fix build via header reordering Alyssa Rosenzweig (223): - pipe-loader: Build kmsro loader for with all kmsro targets - pan/midgard: Remove OP_IS_STORE_VARY - pan/midgard: Add a dummy source for loads - pan/midgard: Refactor swizzles - pan/midgard: Eliminate blank_alu_src - pan/midgard: Use fp32 blend shaders - pan/midgard: Validate tags when branching - pan/midgard: Fix quadword_count handling - pan/midgard: Compute bundle interference - pan/midgard: Add bizarre corner case - pan/midgard: offset_swizzle doesn't need dstsize - pan/midgard: Extend offset_swizzle to non-32-bit - pan/midgard: Extend swizzle packing for vec4/16-bit - pan/midgard: Extend default_phys_reg to !32-bit - panfrost/ci: Update T760 expectations - pan/midgard: Fix printing of half-registers in texture ops - pan/midgard: Disassemble half-steps correctly - pan/midgard: Pass shader stage to disassembler - pan/midgard: Switch base for vertex 
texturing on T720 - nir: Add load_output_u8_as_fp16_pan intrinsic - pan/midgard: Identify ld_color_buffer_u8_as_fp16\* - pan/midgard: Implement nir_intrinsic_load_output_u8_as_fp16_pan - pan/midgard: Pack load/store masks - panfrost: Select format-specific blending intrinsics - pan/midgard: Add blend shader selection bits for MRT - pan/midgard: Implement linearly-constrained register allocation - pan/midgard: Integrate LCRA - pan/midgard: Remove util/ra support - pan/midgard: Compute spill costs - pan/lcra: Use Chaitin's spilling heuristic - pan/midgard: Copypropagate vector creation - pan/midgard: Fix copypropagation for textures - pan/midgard: Generalize texture registers across GPUs - pan/midgard: Fix vertex texturing on early Midgard - pan/midgard: Use texture, not textureLod, on early Midgard - pan/midgard: Disassemble with old pipeline always on T720 - pan/midgard: Prioritize texture registers - pan/midgard: Expand 64-bit writemasks - pan/midgard: Implement i2i64 and u2u64 - pan/midgard: Fix mir_round_bytemask_down for !32b - pan/midgard: Pack 64-bit swizzles - pan/midgard: Use generic constant packing for 8/64-bit - pan/midgard: Implement non-aligned UBOs - pan/midgard: Expose more typesize helpers - pan/midgard: Fix masks/alignment for 64-bit loads - pan/midgard: Represent ld/st offset unpacked - pan/midgard: Use shader stage in mir_op_computes_derivative - panfrost: Stub out clover callbacks - panfrost: Pass kernel inputs as uniforms - panfrost: Disable tiling for GLOBAL resources - panfrost: Set PIPE_COMPUTE_CAP_ADDRESS_BITS to 64 - pan/midgard: Introduce quirks checks - panfrost: Add the lod_bias field - nir: Add load_sampler_lod_paramaters_pan intrinsic - pan/midgard: Implement load_sampler_lod_paramaters_pan - pan/midgard: Add LOD bias/clamp lowering - pan/midgard: Describe quirk MIDGARD_BROKEN_LOD - pan/midgard: Enable LOD lowering only on buggy chips - panfrost: Add lcra.c to Android.mk - pan/midgard: Use lower_tex_without_implicit_lod - panfrost: 
Add information about T720 tiling - panfrost: Implement pan_tiler for non-hierarchy GPUs - panfrost: Simplify draw_flags - pan/midgard: Splatter on fragment out - gitlab-ci: Remove non-default skips from Panfrost - panfrost: Remove blend shader hack - panfrost: Update SET_VALUE with information from igt - panfrost: Rename SET_VALUE to WRITE_VALUE - gallium/util: Support POLYGON in u_stream_outputs_for_vertices - pan/midgard: Move spilling code out of scheduler - pan/midgard: Split spill node selection/spilling - pan/midgard: Simplify spillability test - pan/midgard: Remove spill cost heuristic - pan/midgard: Move bounds checking into LCRA - pan/midgard: Remove consecutive_skip code - pan/midgard: Remove code marked "TODO: remove me" - pan/midgard: Dynamically allocate r26/27 for spills - pan/midgard: Use no_spill bitmask - pan/midgard: Don't use no_spill for memory spill src - pan/midgard: Force alignment for csel_v - pan/midgard: Don't try to free NULL in LCRA - pan/midgard: Simplify and fix vector copyprop - pan/midgard: Fix shift for TLS access - panfrost: Describe thread local storage sizing rules - panfrost: Rename unknown_address_0 -> scratchpad - panfrost: Split stack_shift nibble from unk0 - panfrost: Add routines to calculate stack size/shift - panfrost: Factor out panfrost_query_raw - panfrost: Query core count and thread tls alloc - panfrost: Route stack_size from compiler - panfrost: Emit SFBD/MFBD after a batch, instead of before - panfrost: Handle minor cppcheck issues - pan/midgard: Remove unused ld/st packing hepers - pan/midgard: Handle misc. 
cppcheck warnings - panfrost: Calculate maximum stack_size per batch - panfrost: Pass size to panfrost_batch_get_scratchpad - pandecode: Add cast - panfrost: Move nir_undef_to_zero to Midgard compiler - panfrost: Move property queries to \_encoder - panfrost: Add panfrost_model_name helper - panfrost: Report GPU name in es2_info - ci: Remove T760/T860 from CI temporarily - panfrost: Pass blend RT number through - pan/midgard: Add schedule barrier after fragment writeout - pan/midgard: Writeout per render target - pan/midgard: Fix liveness analysis with multiple epilogues - pan/midgard: Set r1.w magic - panfrost: Fix FBD issue - ci: Reinstate Panfrost CI - panfrost: Remove fbd_type enum - panfrost: Pack invocation_shifts manually instead of a bit field - panfrost: Remove asserts in panfrost_pack_work_groups_compute - panfrost: Simplify sampler upload condition - panfrost: Don't double-create scratchpad - panfrost: Add PAN_MESA_DEBUG=precompile for shader-db - panfrost: Let precompile imply shaderdb - panfrost: Handle empty shaders - pan/midgard: Use a reg temporary for mutiple writes - pan/midgard: Hoist temporary coordinate for cubemaps - pan/midgard: Set .shadow for shadow samplers - pan/midgard: Set Z to shadow comparator for 2D - pan/midgard: Add uniform/work heuristic - pan/midgard: Implement textureOffset for 2D textures - pan/midgard: Fix crash with txs - pan/midgard: Lower txd with lower_tex - panfrost: Decode shader types in pantrace shader-db - pan/decode: Skip COMPUTE in blobber-db - pan/decode: Prefix blobberdb with MESA_SHADER\_\* - pan/decode: Append 0:0 spills:fills to blobber-db - pan/midgard: Fix disassembler cycle/quadword counting - pan/midgard: Bounds check lcra_restrict_range - pan/midgard: Extend IS_VEC4_ONLY to arguments - pan/midgard: Clamp LOD register swizzle - pan/midgard: Expand swizzle for texelFetch - pan/midgard: Fix fallthrough from offset to comparator - pan/midgard: Do witchcraft on texture offsets - pan/midgard: Generalize temp 
coordinate to non-2D - pan/midgard: Implement shadow cubemaps - pan/midgard: Enable lower_(un)pack\_\* lowering - pan/midgard: Support loads from R11G11B10 in a blend shader - pan/midgard: Add mir_upper_override helper - pan/midgard: Compute destination override - panfrost: Rename pan_instancing.c -> pan_attributes.c - panfrost: Factor batch/resource out of instancing routines - panfrost: Move instancing routines to encoder/ - panfrost: Factor out panfrost_compute_magic_divisor - panfrost: Fix off-by-one in pan_invocation.c - pan/decode: Fix reference computation for invocations - panfrost: Slight cleanup of Gallium's pan_attribute.c - panfrost: Remove pan_shift_odd - pan/decode: Handle gl_VertexID/gl_InstanceID - panfrost: Unset vertex_id_zero_based - pan/midgard: Factor out emit_attr_read - pan/midgard: Lower gl_VertexID/gl_InstanceID to attributes - panfrost: Extend attribute_count for vertex builtins - panfrost: Route gl_VertexID through cmdstream - pan/midgard: Fix minor typo - panfrost: Remove MALI_SPECIAL_ATTRIBUTE_BASE defines - panfrost: Update information on fixed attributes/varyings - panfrost: Remove MALI_ATTR_INTERNAL - panfrost: Inline away MALI_NEGATIVE - panfrost: Implement remaining texture wrap modes - panfrost: Add pan_attributes.c to Android.mk - panfrost: Add missing #include in common header - panfrost: Remove mali_alt_func - panfrost; Update comment about work/uniform_count - panfrost: Remove 32-bit next_job path - glsl: Set .flat for gl_FrontFacing - pan/midgard: Promote tilebuffer reads to 32-bit - pan/midgard: Use type-appropriate st_vary - pan/midgard: Implement flat shading - panfrost: Identify glProvokingVertex flag - panfrost: Disable some CAPs we want lowered - panfrost: Implement integer varyings - panfrost: Remove MRT indirection in blend shaders - panfrost: Respect glPointSize() - pan/midgard: Convert fragment writeout to proper branches - pan/midgard: Remove prepacked_branch - panfrost: Handle RGB16F colour clear - panfrost: Pack 
MRT blend shaders into a single BO - pan/midgard: Fix memory corruption in constant combining - pan/midgard: Use better heuristic for shader termination - pan/midgard: Generalize IS_ALU and quadword_size - pan/midgard: Generate MRT writeout loops - pan/midgard: Remove old comment - pan/midgard: Identity ld_color_buffer as 32-bit - pan/midgard: Use upper ALU tags for MFBD writeout - panfrost: Texture from Z32F_S8 as R32F - panfrost: Support rendering to non-zero Z/S layers - panfrost: Implement sRGB blend shaders - panfrost: Cleanup tiling selection logic - panfrost: Report MSAA 4x supported for dEQP - panfrost: Handle PIPE_FORMAT_R10G10B10A2_USCALED - panfrost: Respect constant buffer_offset - panfrost: Adjust for mismatch between hardware/Gallium in arrays/cube - pan/midgard: Account for z/w flip in texelFetch - panfrost: Don't double-flip Z/W for 2D arrays - pan/midgard: Support indirect UBO offsets - panfrost: Fix linear depth textures - pan/midgard: Bytemasks should round up, not round down - panfrost: Identify un/pack colour opcodes - pan/midgard: Fix recursive csel scheduling - panfrost: Expose some functionality with dEQP flag - panfrost: Compile tiling routines with -O3 - panfrost,lima: De-Galliumize tiling routines - panfrost: Rework linear<--->tiled conversions - panfrost: Add pandecode entries for ASTC/ETC formats - panfrost: Fix crash in compute variant allocation - panfrost: Drop mysterious zero=0xFFFF field - panfrost: Don't use implicit mali_exception_status enum - pan/decode: Remove last_size - pan/midgard: Remove pack_color define - pan/decode: Remove SHORT_SLIDE indirection - panfrost: Fix 32-bit warning for \`indices\` - pan/decode: Drop MFBD compute shader stuff - pan/midgard: Record TEXTURE_OP_BARRIER - pan/midgard: Disassemble barrier instructions - pan/midgard: Validate barriers use a barrier tag - pan/midgard: Handle tag 0x4 as texture - pan/midgard: Remove float_bitcast - pan/midgard: Fix missing prefixes - pan/midgard: Don't crash with 
constants on unknown ops - pan/midgard: Use fprintf instead of printf for constants Andreas Baierl (14): - lima: Beautify stream dumps - lima: Parse VS and PLBU command stream while making a dump - lima/streamparser: Fix typo in vs semaphore parser - lima/streamparser: Add findings introduced with gl_PointSize - lima/parser: Some fixes and cleanups - lima/parser: Add RSW parsing - lima/parser: Add texture descriptor parser - lima: Rotate dump files after each finished pp frame - lima: Fix dump file creation - lima/parser: Fix rsw parser - lima/parser: Fix VS cmd stream parser - lima/parser: Make rsw alpha blend parsing more readable - lima: Add stencil support - lima: Fix alpha blending Andres Rodriguez (1): - vulkan/wsi: disable the hardware cursor Andrii Simiklit (5): - main: fix several 'may be used uninitialized' warnings - glsl: fix an incorrect max_array_access after optimization of ssbo/ubo - glsl: fix a binding points assignment for ssbo/ubo arrays - glsl/nir: do not change an element index to have correct block name - mesa/st: fix a memory leak in get_version Anthony Pesch (5): - util: import xxhash - util: move fnv1a hash implementation into its own header - util/hash_table: replace \_mesa_hash_data's fnv1a hash function with xxhash - util/hash_table: added hash functions for integer types - util/hash_table: update users to use new optimal integer hash functions Anuj Phogat (2): - intel: Add device info for 1x4x6 Jasper Lake - intel: Add pci-ids for Jasper Lake Arno Messiaen (5): - lima: fix stride in texture descriptor - lima: add layer_stride field to lima_resource struct - lima: introduce ppir_op_load_coords_reg to differentiate between loading texture coordinates straight from a varying vs loading them from a register - lima: add cubemap support - lima/ppir: add lod-bias support Bas Nieuwenhuizen (33): - radv: Fix timeout handling in syncobj wait. - radv: Remove \_mesa_locale_init/fini calls. - turnip: Remove \_mesa_locale_init/fini calls. 
- anv: Remove \_mesa_locale_init/fini calls. - radv: Fix disk_cache_get size argument. - radv: Close all unnecessary fds in secure compile. - radv: Do not change scratch settings while shaders are active. - radv: Allocate cmdbuffer space for buffer marker write. - radv: Enable VK_KHR_buffer_device_address. - amd/llvm: Refactor ac_build_scan. - radv: Unify max_descriptor_set_size. - radv: Fix timeline semaphore refcounting. - radv: Fix RGBX Android<->Vulkan format correspondence. - amd/common: Fix tcCompatible degradation on Stoney. - amd/common: Always use addrlib for HTILE tc-compat. - radv: Limit workgroup size to 1024. - radv: Expose all sample counts for integer formats as well. - amd/common: Handle alignment of 96-bit formats. - nir: Add clone/hash/serialize support for non-uniform tex instructions. - nir: print non-uniform tex fields. - amd/common: Always initialize gfx9 mipmap offset/pitch. - turnip: Use VK_NULL_HANDLE instead of NULL. - meson: Enable -Werror=int-conversion. - Revert "amd/common: Always initialize gfx9 mipmap offset/pitch." - radv: Only use the gfx mipmap level offset/pitch for linear textures. - spirv: Fix glsl type assert in spir2nir. - radv: Emit a BATCH_BREAK when changing pixel shaders or CB_TARGET_MASK. - radv: Use new scanout gfx9 metadata flag. - radv: Disable VK_EXT_sample_locations on GFX10. - radv: Remove syncobj_handle variable in header. - radv: Expose VK_KHR_swapchain_mutable_format. - radv: Allow DCC & TC-compat HTILE with VK_IMAGE_CREATE_EXTENDED_USAGE_BIT. - radv: Do not set SX DISABLE bits for RB+ with unused surfaces. 
Ben Crocker (1): - llvmpipe: use ppc64le/ppc64 Large code model for JIT-compiled shaders Bernd Kuhls (1): - util/os_socket: Include unistd.h to fix build error Boris Brezillon (21): - panfrost: MALI_DEPTH_TEST is actually MALI_DEPTH_WRITEMASK - panfrost: Destroy the upload manager allocated in panfrost_create_context() - panfrost: Release the ctx->pipe_framebuffer ref - panfrost: Move BO cache related fields to a sub-struct - panfrost: Try to evict unused BOs from the cache - gallium: Fix the ->set_damage_region() implementation - panfrost: Make sure we reset the damage region of RTs at flush time - panfrost: Remove unneeded phi nodes - panfrost/midgard: Fix swizzle for store instructions - panfrost/midgard: Print the actual source register for store operations - panfrost/midgard: Use a union to manipulate embedded constants - panfrost/midgard: Rework mir_adjust_constants() to make it type/size agnostic - panfrost/midgard: Make sure promote_fmov() only promotes 32-bit imovs - panfrost/midgard: Factorize f2f and u2u handling - panfrost/midgard: Add f2f64 support - panfrost/midgard: Fix mir_print_instruction() for branch instructions - panfrost/midgard: Add 64 bits float <-> int converters - panfrost/midgard: Add missing lowering passes for type/size conversion ops - panfrost/midgard: Add a condense_writemask() helper - panfrost/midgard: Prettify embedded constant prints - panfrost: Fix the damage box clamping logic Brian Ho (14): - turnip: Update tu_query_pool with turnip-specific fields - turnip: Implement vkCreateQueryPool for occlusion queries - turnip: Implement vkCmdBeginQuery for occlusion queries - turnip: Implement vkCmdEndQuery for occlusion queries - turnip: Update query availability on render pass end - turnip: Implement vkGetQueryPoolResults for occlusion queries - turnip: Implement vkCmdResetQueryPool - turnip: Implement vkCmdCopyQueryPoolResults for occlusion queries - anv: Properly fetch partial results in vkGetQueryPoolResults - anv: Handle 
unavailable queries in vkCmdCopyQueryPoolResults - turnip: Enable occlusionQueryPrecise - turnip: Free event->bo on vkDestroyEvent - turnip: Fix vkGetQueryPoolResults with available flag - turnip: Fix vkCmdCopyQueryPoolResults with available flag Brian Paul (4): - s/APIENTRY/GLAPIENTRY/ in teximage.c - nir: fix a couple signed/unsigned comparison warnings in nir_builder.h - Call shmget() with permission 0600 instead of 0777 - nir: no-op C99 \_Pragma() with MSVC C Stout (1): - util/vector: Fix u_vector_foreach when head rolls over Caio Marcelo de Oliveira Filho (24): - spirv: Don't leak GS initialization to other stages - glsl: Check earlier for MaxShaderStorageBlocks and MaxUniformBlocks - glsl: Check earlier for MaxTextureImageUnits and MaxImageUniforms - anv: Initialize depth_bounds_test_enable when not explicitly set - spirv: Consider the sampled_image case in wa_glslang_179 workaround - intel/fs: Lower 64-bit MOVs after lower_load_payload() - intel/fs: Fix lowering of dword multiplication by 16-bit constant - intel/vec4: Fix lowering of multiplication by 16-bit constant - anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT) - spirv: Implement SPV_KHR_non_semantic_info - panfrost: Fix Makefile.sources - anv: Drop unused function parameter - anv: Ignore some CreateInfo structs when rasterization is disabled - intel/fs: Only use SLM fence in compute shaders - spirv: Drop EXT for PhysicalStorageBuffer symbols - spirv: Handle PhysicalStorageBuffer in memory barriers - nir: Add missing nir_var_mem_global to various passes - intel/fs: Add FS_OPCODE_SCHEDULING_FENCE - intel/fs: Add workgroup_size() helper - intel/fs: Don't emit fence for shared memory if only one thread is used - intel/fs: Don't emit control barrier if only one thread is used - anv: Always initialize target_stencil_layout - intel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SAT - nir: Make nir_deref_path_init skip trivial casts Chris Wilson (1): - egl: Mention if swrast is being 
forced Christian Gmeiner (24): - drm-shim: fix EOF case - etnaviv: rs: upsampling is not supported - etnaviv: add drm-shim - etnaviv: drop not used config_out function param - etnaviv: use a more self-explanatory param name - etnaviv: handle 8 byte block in tiling - etnaviv: add support for extended pe formats - etnaviv: fix integer vertex formats - etnaviv: use NORMALIZE_SIGN_EXTEND - etnaviv: fix R10G10B10A2 vertex format entries - etnaviv: handle integer case for GENERIC_ATTRIB_SCALE - etnaviv: remove dead code - etnaviv: remove not used etna_bits_ones(..) - etnaviv: drop compiled_rs_state forward declaration - etnaviv: update resource status after flushing - gallium: add PIPE_CAP_MAX_VERTEX_BUFFERS - etnaviv: check if MSAA is supported - etnaviv: gc400 does not support any vertex sampler - etnaviv: use a better name for FE_VERTEX_STREAM_UNK14680 - etnaviv: move state based texture structs - etnaviv: move descriptor based texture structs - etnaviv: add deqp debug option - etnaviv: drop default state for PE_STENCIL_CONFIG_EXT2 - etnaviv: drm-shim: add GC400 Connor Abbott (19): - nir: Fix non-determinism in lower_global_vars_to_local - radv: Rename ac_arg_regfile - ac: Add a shared interface between radv, radeonsi, LLVM and ACO - ac/nir, radv, radeonsi: Switch to using ac_shader_args - radv: Move argument declaration out of nir_to_llvm - aco: Constify radv_nir_compiler_options in isel - aco: Use radv_shader_args in aco_compile_shader() - aco: Split vector arguments at the beginning - aco: Make num_workgroups and local_invocation_ids one argument each - radv: Replace supports_spill with explict_scratch_args - aco: Use common argument handling - aco: Make unused workgroup id's 0 - nir: Maintain the algebraic automaton's state as we work. 
- a6xx: Add more CP packets - freedreno: Use new macros for CP_WAIT_REG_MEM and CP_WAIT_MEM_GTE - freedreno: Fix CP_MEM_TO_REG flag definitions - freedreno: Document CP_COND_REG_EXEC more - freedreno: Document CP_UNK_A6XX_55 - freedreno: Document CP_INDIRECT_BUFFER_CHAIN Daniel Ogorchock (2): - panfrost: Fix panfrost_bo_access memory leak - panfrost: Fix headers and gpu_headers memory leak Daniel Schürmann (58): - aco: fix immediate offset for spills if scratch is used - aco: only use single-dword loads/stores for spilling - aco: fix accidential reordering of instructions when scheduling - aco: workaround Tonga/Iceland hardware bug - aco: fix invalid access on Pseudo_instructions - aco: preserve kill flag on moved operands during RA - aco: rematerialize s_movk instructions - aco: check if SALU instructions are predeceeded by exec when calculating WQM needs - aco: value number instructions using the execution mask - aco: use s_and_b64 exec to reduce uniform booleans to one bit - amd/llvm: Add Subgroup Scan functions for SI - radv: Enable Subgroup Arithmetic and Clustered for SI - aco: don't value-number instructions from within a loop with ones after the loop. - aco: don't split live-ranges of linear VGPRs - aco: fix a couple of value numbering issues - aco: refactor visit_store_fs_output() to use the Builder - aco: Initial GFX7 Support - aco: SI/CI - fix sampler aniso - aco: fix SMEM offsets for SI/CI - aco: implement nir_op_fquantize2f16 for SI/CI - aco: only use scalar loads for readonly buffers on SI/CI - aco: implement nir_op_isign on SI/CI - aco: move buffer_store data to VGPR if needed - aco: implement quad swizzles for SI/CI - aco: recognize SI/CI SMRD hazards - aco: fix disassembly of writelane instructions. 
- aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI - aco: implement 64bit VGPR shifts for SI/CI - aco: make 1/2*PI a literal constant on SI/CI - aco: implement 64bit i2b for SI /CI - aco: implement 64bit ine/ieq for SI/CI - aco: disable disassembly for SI/CI due to lack of support by LLVM - radv: only flush scalar cache for SSBO writes with ACO on GFX8+ - aco: flush denorms after fmin/fmax on pre-GFX9 - aco: don't use a scalar temporary for reductions on GFX10 - aco: implement (clustered) reductions for SI/CI - aco: implement inclusive_scan for SI/CI - aco: implement exclusive scan for SI/CI - radv: disable Youngblood app profile if ACO is used - aco: return to loop_active mask at continue_or_break blocks - radv: Enable ACO on GFX7 (Sea Islands) - aco: use soffset for MUBUF instructions on SI/CI - aco: improve readfirstlane after uniform ssbo loads on GFX7 - aco: propagate temporaries into expanded vectors - nir: fix printing of var_decl with more than 4 components. - aco: compact various Instruction classes - aco: compact aco::span to use uint16_t offset and size instead of pointer and size_t. - aco: fix unconditional demote_to_helper - aco: rework lower_to_cssa() - aco: handle phi affinities transitively through parallelcopies - aco: ignore parallelcopies to the same register on jump threading - aco: fix combine_salu_not_bitwise() when SCC is used - aco: reorder VMEM operands in ACO IR - aco: fix register allocation with multiple live-range splits - aco: simplify adjust_sample_index_using_fmask() & get_image_coords() - aco: simplify gathering of MIMG address components - docs: add new features for RADV/ACO. 
- aco: fix image_atomic_cmp_swap

Daniel Stone (2):

- Revert "st/dri: do FLUSH_VERTICES before calling flush_resource"
- Revert "gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES"

Danylo Piliaiev (12):

- intel/blorp: Fix usage of uninitialized memory in key hashing
- i965/program_cache: Lift restriction on shader key size
- intel/blorp: Fix usage of uninitialized memory in key hashing
- intel/fs: Do not lower large local arrays to scratch on gen7
- i965: Unify CC_STATE and BLEND_STATE atoms on Haswell as a workaround
- glsl: Add varyings to "zero-init of uninitialized vars" workaround
- drirc: Add glsl_zero_init workaround for GpuTest
- iris/query: Implement PIPE_QUERY_GPU_FINISHED
- iris: Fix value of out-of-bounds accesses for vertex attributes
- i965: Do not set front_buffer_dirty if there is no front buffer
- st/mesa: Handle the rest renderbuffer formats from OSMesa
- st/nir: Unify inputs_read/outputs_written before serializing NIR

Dave Airlie (74):

- nir/serialize: pack function has name and entry point into flags.
- nir/serialize: fix serializing functions with no implementations.
- spirv: don't store 0 to cs.ptr_size for non kernel stages.
- spirv: get the correct type for function returns.
- spirv/nir/opencl: handle some multiply instructions.
- nir: add 64-bit ufind_msb lowering support. (v2)
- nouveau: request ufind_msb64 lowering in the frontend.
- vtn/opencl: add clz support
- nir: fix deref offset builder
- llvmpipe: initial query buffer object support. (v2)
- docs: add llvmpipe to ARB_query_buffer_object.
- gallivm: split out the flow control ir to a common file.
- gallivm: nir->tgsi info convertor (v2)
- gallivm: add popcount intrinsic wrapper
- gallivm: add cttz wrapper
- gallivm: add selection for non-32 bit types
- gallivm: add nir->llvm translation (v2)
- draw: add nir info gathering and building support
- gallium: add nir lowering passes for the draw pipe stages. (v2)
- gallivm: add swizzle support where one channel isn't defined.
- llvmpipe: add initial nir support
- nir/samplers: don't zero samplers_used/txf.
- llvmpipe/images: handle undefined atomic without crashing
- gallivm/llvmpipe: add support for front facing in sysval.
- llvmpipe: enable texcoord semantics
- gallium/scons: fix graw-xlib build on OSX.
- llvmpipe: add queries disabled flag
- llvmpipe: disable occlusion queries when requested by state tracker
- draw: add support for collecting primitives generated outside streamout
- llvmpipe: enable support for primitives generated outside streamout
- aco: handle gfx7 int8/10 clamping on exports
- gallivm: add bitfield reverse and ufind_msb
- llvmpipe/nir: handle texcoord requirements
- gallivm: fix transpose for when first channel isn't created
- gallivm: fix perspective enable if usage_mask doesn't have 0 bit set
- gallivm/nir: cleanup code and call cmp wrapper
- gallivm/nir: copy compare ordering code from tgsi
- gallivm: add base instance sysval support
- gallivm/draw: add support for draw_id system value.
- gallivm: fixup base_vertex support
- llvmpipe: enable ARB_shader_draw_parameters.
- vtn: convert vload/store to single value loops
- vtn/opencl: add shuffle/shuffle support
- gallivm/nir: wrap idiv to avoid divide by 0 (v2)
- llvmpipe: switch to NIR by default
- nir: sanitize work group intrinsics to always be 32-bit.
- gallivm: add 64-bit const int creator.
- llvmpipe/gallivm: add kernel inputs
- gallivm: add support for 8-bit/16-bit integer builders
- gallivm: pick integer builders for alu instructions.
- gallivm/nir: allow 8/16-bit conversion and comparison.
- tgsi/mesa: handle KERNEL case
- gallivm/llvmpipe: add support for work dimension intrinsic.
- gallivm/llvmpipe: add support for block size intrinsic
- gallivm/llvmpipe: add support for global operations.
- llvmpipe: handle serialized nir as a shader type.
- llvmpipe: add support for compute shader params
- llvmpipe/nir: use nir_max_vec_components in more places
- gallivm: handle non-32 bit undefined
- llvmpipe: lower hadd/add_sat
- gallivm/nir: lower packing
- gallivm/nir: add vec8/16 support
- llvmpipe: add debug option to enable OpenCL support.
- gallivm: fixup const int64 builder.
- llvmpipe: enable ARB_shader_group_vote.
- gallium/util: add multi_draw_indirect to util_draw_indirect.
- llvmpipe: enable driver side multi draw indirect
- llvmpipe: add support for ARB_indirect_parameters.
- llvmpipe: add ARB_derivative_control support
- gallivm: fix gather component handling.
- llvmpipe: fix some integer instruction lowering.
- galllivm: fix gather offset casting
- gallivm: fix find lsb
- gallivm/nir: add missing break for isub.

David Heidelberg (1):

- .mailmap: use correct email address

David Stevens (1):

- virgl: support emulating planar image sampling

Denis Pauk (2):

- gallium/swr: Enable support bptc format.
- docs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe, swr

Dongwon Kim (3):

- gallium: enable INTEL_PERFORMANCE_QUERY
- iris: INTEL performance query implementation
- gallium: check all planes' pipe formats in case of multi-samplers

Drew Davenport (1):

- radeonsi: Clear uninitialized variable

Drew DeVault (1):

- st_get_external_sampler_key: improve error message

Duncan Hopkins (1):

- zink: make sure src image is transfer-src-optimal

Dylan Baker (69):

- Bump VERSION to 20.0.0-devel
- docs/new_features: Empty the feature list for the 20.0 cycle
- nir: correct use of identity check in python
- r200: use preprocessor for big vs little endian checks
- r100: Use preprocessor to select big vs little endian paths
- dri/osmesa: use preprocessor for selecting endian code paths
- util/u_endian: Use \_WIN32 instead of \_MSC_VER
- util/u_endian: set PIPE_ARCH_*_ENDIAN to 1
- mesa/main: replace uses of \_mesa_little_endian with preprocessor
- mesa/swrast: replace instances of \_mesa_little_endian with preprocessor
- mesa/main: delete now unused \_mesa_little_endian
- gallium/osmesa: Use PIPE_ARCH_*_ENDIAN instead of little_endian function
- util: rename PIPE_ARCH_*_ENDIAN to UTIL_ARCH_*_ENDIAN
- util/u_endian: Add error checks
- meson: Add dep_glvnd to egl deps when building with glvnd
- docs: add release notes for 19.2.3
- docs: add sha256 sum to 19.2.3 release notes
- docs: update calendar, add news item and link release notes for 19.2.2
- meson: gtest needs pthreads
- gallium/osmesa: Convert osmesa test to gtest
- osmesa/tests: Extend render test to cover other working cases
- util: Use ZSTD for shader cache if possible
- docs: Add release notes for 19.2.4
- docs: Add SHA256 sum for for 19.2.4
- docs: update calendar, add news item and link release notes for 19.2.4
- docs: Add relnotes for 19.2.5
- docs/relnotes/19.2.5: Add SHA256 sum
- docs: update calendar, add news item and link release notes for 19.2.5
- docs/release-calendar: Update for extended 19.3 rc period
- docs: Add release notes for 19.2.6
- docs: Add SHA256 sum for 19.2.6
- docs: update calendar, add news item and link release notes for 19.2.6
- gallium/auxiliary: Fix uses of gnu struct = {} extension
- meson: Add -Werror=gnu-empty-initializer to MSVC compat args
- docs: Add release notes for 19.2.7
- docs: Add SHA256 sums for 19.2.7
- docs: update calendar, add news item and link release notes for 19.2.7
- docs: Update mesa 19.3 release calendar
- meson/broadcom: libbroadcom_cle needs expat headers
- meson/broadcom: libbroadcom_cle also needs zlib
- docs: add release notes for 19.3.0
- docs/19.3.0: Add SHA256 sums
- docs: Update release notes, index, and calendar for 19.3.0
- dcos: add releanse notes for 19.3.1
- docs: Add release notes, update calendar, and add news for 19.3.1
- docs: add relnotes for 19.2.8
- docs/relnotes/19.2.8: Add SHA256 sum
- docs: Add release notes, news, and update calendar for 19.2.8
- docs: Add release notes for 19.3.2
- docs: add SHA256 sums for 19.3.2
- docs: Add release notes for 19.3.2, update calendar and home page
- docs: Update release calendar for 20.0
- docs: Add relnotes for 19.3.3 release
- docs: Add SHA 256 sums for 19.3.3
- docs: update news, calendar, and link release notes for 19.3.3
- VERSION: bump to 20.0.0-rc1
- bin/pick-ui: Add a new maintainer script for picking patches
- .pick_status.json: Update to 0d14f41625fa00187f690f283c1eb6a22e354a71
- .pick_status.json: Update to b550b7ef3b8d12f533b67b1a03159a127a3ff34a
- .pick_status.json: Update to 9afdcd64f2c96f3fcc1a28912987f2e8066aa995
- .pick_status.json: Update to 7eaf21cb6f67adbe0e79b80b4feb8c816a98a720
- VERSION: bump to 20.0-rc2
- .pick_status.json: Update to d8bae10bfe0f487dcaec721743cd51441bcc12f5
- .pick_status.json: Update to 689817c9dfde9a0852f2b2489cb0fa93ffbcb215
- .pick_status.json: Update to 23037627359e739c42b194dec54875aefbb9d00b
- VERSION: bump for 20.0.0-rc3
- .pick_status.json: Update to 2a98cf3b2ecea43cea148df7f77d2abadfd1c9db
- .pick_status.json: Update to 946eacbafb47c8b94d47e7c9d2a8b02fff5a22fa
- .pick_status.json: Update to bee5c9b0dc13dbae0ccf124124eaccebf7f2a435

Eduardo Lima Mitev (2):

- turnip: Remove failed command buffer from pool
- turnip: Fix issues in tu_compute_pipeline_create() that may lead to crash

Elie Tournier (4):

- Docs: remove duplicate meson docs for windows
- docs: fix ascii html representation
- nir/algebraic: i2f(f2i()) -> trunc()
- nir/algebraic: sqrt(x)*sqrt(x) -> fabs(x)

Emmanuel Gil Peyrot (1):

- intel/compiler: Return early if read() failed

Eric Anholt (102):

- ci: Make lava inherit the ccache setup of the .build script.
- ci: Switch over to an autoscaling GKE cluster for builds.
- Revert "ci: Switch over to an autoscaling GKE cluster for builds."
- mesa/st: Add mapping of MESA_FORMAT_RGB_SNORM16 to gallium.
- gallium: Add defines for FXT1 texture compression.
- gallium: Add some more channel orderings of packed formats.
- gallium: Add an equivalent of MESA_FORMAT_BGR_UNORM8.
- gallium: Add equivalents of packed MESA_FORMAT_*UINT formats.
- mesa: Stop defining a full separate format for RGBA_UINT8.
- mesa/st: Test round-tripping of all compressed formats.
- mesa: Prepare for the MESA_FORMAT\_\* enum to be sparse.
- mesa: Redefine MESA_FORMAT\_\* in terms of PIPE_FORMAT_*.
- mesa/st: Gut most of st_mesa_format_to_pipe_format().
- mesa/st: Make st_pipe_format_to_mesa_format an effective no-op.
- u_format: Fix swizzle of A1R5G5B5.
- ci: Use several debian buster packages instead of hand-building.
- ci: Make the skip list regexes match the full test name.
- ci: Use cts_runner for our dEQP runs.
- ci: Enable all of GLES3/3.1 testing for softpipe.
- ci: Remove old commented copy of freedreno artifacts.
- ci: Disable flappy blit tests on a630.
- ci: Expand the freedreno blit skip regex to cover more cases.
- util: Move gallium's PIPE_FORMAT utils to /util/format/
- mesa: Move compile of common Mesa core files to a static lib.
- mesa/st: Simplify st_choose_matching_format().
- mesa: Don't put sRGB formats in the array format table.
- mesa/st: Reuse st_choose_matching_format from st_choose_format().
- util: Add a mapping from VkFormat to PIPE_FORMAT.
- turnip: Drop the copy of the formats table.
- ci: Move freedreno's parallelism to the runner instead of gitlab-ci jobs.
- ci: Use a tag from the parallel-deqp-runner repo.
- nir: Add a scheduler pass to reduce maximum register pressure.
- nir: Refactor algebraic's block walk
- nir: Make algebraic backtrack and reprocess after a replacement.
- freedreno: Introduce a fd_resource_layer_stride() helper.
- freedreno: Introduce a fd_resource_tile_mode() helper.
- freedreno: Introduce a resource layout header.
- freedreno: Convert the slice struct to the new resource header.
- freedreno/a6xx: Log the tiling mode in resource layout debug.
- turnip: Disable timestamp queries for now.
- turnip: Fix unused variable warnings.
- turnip: Drop redefinition of VALIDREG now that it's in ir3.h.
- turnip: Reuse tu6_stage2opcode() more.
- turnip: Add basic SSBO support.
- turnip: Refactor the graphics pipeline create implementation.
- turnip: Add a helper function for getting tu_buffer iovas.
- turnip: Sanity check that we're adding valid BOs to the list.
- turnip: Move pipeline BO list adding to BindPipeline.
- turnip: Add support for compute shaders.
- ci: Disable egl_ext_device_drm tests in piglit.
- freedreno: Enable texture upload memory throttling.
- freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off.
- freedreno: Track the set of UBOs to be uploaded in UBO analysis.
- freedreno: Drop the extra offset field for mipmap slices.
- freedreno: Refactor the UBWC flags registers emission.
- freedreno: Move UBWC layout into a slices array like the non-UBWC slices.
- tu: Move our image layout into a freedreno_layout struct.
- freedreno: Move a6xx's setup_slices() to a shareable helper function.
- freedreno: Switch the 16-bit workaround to match what turnip does.
- tu: Move UBWC layout into fdl6_layout() and use that function.
- turnip: Lower usub_borrow.
- turnip: Drop unused variable.
- turnip: Add support for descriptor arrays.
- turnip: Fix support for immutable samplers.
- ci: Fix caselist results archiving after parallel-deqp-runner rename.
- mesa: Fix detection of invalidating both depth and stencil.
- mesa/st: Deduplicate the NIR uniform lowering code.
- mesa/st: Move the vec4 type size function into core GLSL types.
- mesa/prog: Reuse count_vec4_slots() from ir_to_mesa.
- mesa/st: Move the dword slot counting function to glsl_types as well.
- i965: Reuse the new core glsl_count_dword_slots().
- nir: Fix printing of ~0 .locations.
- turnip: Refactor linkage state setup.
- mesa: Make atomic lowering put atomics above SSBOs.
- gallium: Pack the atomic counters just above the SSBOs.
- nir: Drop the ssbo_offset to atomic lowering.
- compiler: Add a note about how num_ssbos works in the program info.
- freedreno: Stop scattered remapping of SSBOs/images to IBOs.
- radeonsi: Remove a bunch of default handling of pipe caps.
- r600: Remove a bunch of default handling of pipe caps.
- r300: Remove a bunch of default handling of pipe caps.
- radeonsi: Drop PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS.
- turnip: Fix some whitespace around binary operators.
- turnip: Refactor the intrinsic lowering.
- turnip: Add limited support for storage images.
- turnip: Disable UBWC on images used as storage images.
- turnip: Add support for non-zero (still constant) UBO buffer indices.
- turnip: Add support for uniform texel buffers.
- freedreno/ir3: Plumb the ir3_shader_variant into legalize.
- turnip: Add support for fine derivatives.
- turnip: Fix execution of secondary cmd bufs with nothing in primary.
- freedreno: Add some missing a6xx address declarations.
- freedreno: Fix OUT_REG() on address regs without a .bo supplied.
- turnip: Port krh's packing macros from freedreno to tu.
- turnip: Convert renderpass setup to the new register packing macros.
- turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros.
- vulkan/wsi: Fix compiler warning when no WSI platforms are enabled.
- iris: Silence warning about AUX_USAGE_MC.
- mesa/st: Fix compiler warnings from INTEL_shader_integer_functions.
- ci: Enable -Werror on the meson-i386 build.
- tu: Fix binning address setup after pack macros change.
- Revert "gallium: Fix big-endian addressing of non-bitmask array formats."
Eric Engestrom (58):

- meson: split out idep_xmlconfig_headers from idep_xmlconfig
- anv: add missing xmlconfig headers dependency
- radv: drop unnecessary xmlpool_options_h
- pipe-loader: drop unnecessary xmlpool_options_h
- loader: replace xmlpool_options_h with idep_xmlconfig_headers
- targets/omx: replace xmlpool_options_h with idep_xmlconfig_headers
- targets/va: replace xmlpool_options_h with idep_xmlconfig_headers
- targets/vdpau: replace xmlpool_options_h with idep_xmlconfig_headers
- targets/xa: replace xmlpool_options_h with idep_xmlconfig_headers
- targets/xvmc: replace xmlpool_options_h with idep_xmlconfig_headers
- dri: replace xmlpool_options_h with idep_xmlconfig_headers
- i915: replace xmlpool_options_h with idep_xmlconfig_headers
- nouveau: replace xmlpool_options_h with idep_xmlconfig_headers
- r200: replace xmlpool_options_h with idep_xmlconfig_headers
- radeon: replace xmlpool_options_h with idep_xmlconfig_headers
- meson: move idep_xmlconfig_headers to xmlpool/
- gitlab-ci: build a recent enough version of GLVND (ie. 1.2.0)
- meson: require glvnd 1.2.0
- meson: revert glvnd workaround
- meson: add variable to control the symbols checks
- meson: move the generic symbols check arguments to a common variable
- meson: add windows support to symbols checks
- meson: require \`nm\` again on Unix systems
- mesa/imports: let the build system detect strtok_r()
- egl: fix \_EGL_NATIVE_PLATFORM fallback
- egl: move #include of local headers out of Khronos headers
- gitlab-ci: build libdrm using meson instead of autotools
- gitlab-ci: auto-cancel CI runs when a newer commit is pushed to the same branch
- CL: sync C headers with Khronos
- CL: sync C++ headers with Khronos
- vulkan: delete typo'd header
- egl: use EGL_CAST() macro in eglmesaext.h
- anv: add missing "fall-through" annotation
- vk_util: drop duplicate formats in vk_format_map[]
- meson: drop duplicate \`lib\` prefix on libiris_gen\*
- meson: drop \`intel_\` prefix on imgui_core
- docs: reword a bit and list HTTPS before FTP
- intel: add mi_builder_test for gen12
- intel/compiler: add ASSERTED annotation to avoid "unused variable" warning
- intel/compiler: replace \`0\` pointer with \`NULL\`
- util/simple_mtx: don't set the canary when it can't be checked
- anv: drop unused #include
- travis: autodetect python version instead of hard-coding it
- util/format: remove left-over util_format_description_table declaration
- util/format: add PIPE_FORMAT_ASTC_*x*x*_SRGB to util_format_{srgb,linear}()
- util/format: add trivial srgb<->linear conversion test
- u_format: move format tests to util/tests/
- amd: fix empty-body issues
- nine: fix empty-body-issues
- meson: simplify install_megadrivers.py invocation
- mesa: avoid returning a value in a void function
- meson: use github URL for wraps instead of completely unreliable wrapdb
- egl: drop confusing mincore() error message
- llvmpipe: drop LLVM < 3.4 support
- util/atomic: fix return type of p_atomic_add_return() fallback
- util/os_socket: fix header unavailable on windows
- freedreno/perfcntrs: fix fd leak
- util/disk_cache: check for write() failure in the zstd path

Erico Nunes (17):

- lima: fix nir shader memory leak
- lima: fix bo submit memory leak
- lima/ppir: enable lower_fdph
- gallium/util: add alignment parameter to util_upload_index_buffer
- lima: allocate separate bo to store varyings
- lima: refactor indexed draw indices upload
- vc4: move the draw splitting routine to shared code
- lima: split draw calls on 64k vertices
- lima/ppir: fix lod bias src
- lima/ppir: remove assert on ppir_emit_tex unsupported feature
- lima: set shader caps to optimize control flow
- lima/ppir: remove orphan load node after cloning
- lima/ppir: implement full liveness analysis for regalloc
- lima/ppir: handle write to dead registers in ppir
- lima/ppir: fix ssa undef emit
- lima/ppir: split ppir_op_undef into undef and dummy again
- lima/ppir: fix src read mask swizzling

Erik Faye-Lund (82):

- zink: heap-allocate samplers objects
- zink: emit line-width when using polygon line-mode
- anv: remove incorrect polygonMode=point early-out
- zink: use actual format for render-pass
- zink: always allow mutating the format
- zink: do not advertize coherent mapping
- zink: disable fragment-shader texture-lod
- zink: transition resources before resolving
- zink: always allow sampling of images
- zink: use u_blitter when format-reinterpreting
- zink/spirv: drop temp-array for component-count
- zink/spirv: support loading bool constants
- zink/spirv: implement bany_fnequal[2-4]
- zink/spirv: implement bany_inequal[2-4]
- zink/spirv: implement ball_iequal[2-4]
- zink/spirv: implement ball_fequal[2-4]
- zink: do advertize integer support in shaders
- zink/spirv: add support for nir_op_flrp
- zink: correct depth-stencil format
- nir: patch up deref-vars when lowering clip-planes
- zink: always allow transfer to/from buffers
- zink: implement buffer-to-buffer copies
- zink: remove no-longer-needed hack
- zink: move format-checking to separate source
- zink: move filter-helper to separate helper-header
- zink: move blitting to separate source
- zink: move drawing separate source
- st/mesa: unmap pbo after updating cache
- zink: use true/false instead of TRUE/FALSE
- zink: reject invalid sample-counts
- zink: fix crash when restoring sampler-states
- zink: delete query rather than allocating a new one
- zink: do not try to destroy NULL-fence
- zink: handle calloc-failure
- zink: avoid NULL-deref
- zink: avoid NULL-deref
- zink: avoid NULL-deref
- zink: error-check right variable
- zink: silence coverity error
- zink: enable PIPE_CAP_MIXED_COLORBUFFER_FORMATS
- zink: implement nir_texop_txd
- zink: implement txf
- zink: implement some more trivial opcodes
- zink: simplify front-face type
- zink: factor out builtin-var creation
- zink: implement load_vertex_id
- zink: use nir_fmul_imm
- zink: remove unused code-path in lower_pos_write
- nir/zink: move clip_halfz-lowering to common code
- etnaviv: use nir_lower_clip_halfz instead of open-coding
- st/mesa: use uint-samplers for sampling stencil buffers
- zink: fixup initialization of operand_mask / num_extra_operands
- util: initialize float-array with float-literals
- st/wgl: eliminate implicit cast warning
- gallium: fix a warning
- mesa/st: use float literals
- docs: fix typo in html tag name
- docs: fix paragraphs
- docs: open paragraph before closing it
- docs: use code-tag instead of pre-tag
- docs: use code-tags instead of pre-tags
- docs: use code-tags instead of pre-tags
- docs: move paragraph closing tag
- docs: remove double-closed definition-list
- docs: do not double-close link tag
- docs: do not use definition-list for sub-topics
- docs: use figure/figcaption instead of tables
- docs: remove trailing header
- docs: remove leading spaces
- docs: remove trailing newlines
- docs: use [1] instead of asterisk for footnote
- docs: remove pointless, stray newline
- docs: fixup indentation
- zink: implement nir_texop_txs
- zink: support offset-variants of texturing
- zink: avoid incorrect vector-construction
- zink: store image-type per texture
- zink: support sampling non-float textures
- zink: support arrays of samplers
- zink: set compareEnable when setting compareOp
- st/mesa: use uint-result for sampling stencil buffers
- Revert "nir: Add a couple trivial abs optimizations"

Florian Will (1):

- radv/winsys: set IB flags prior to submit in the sysmem path

Francisco Jerez (26):

- glsl: Fix software 64-bit integer to 32-bit float conversions.
- intel/fs/gen11+: Handle ROR/ROL in lower_simd_width().
- intel/fs/gen8+: Fix r127 dst/src overlap RA workaround for EOT message payload.
- intel/fs: Fix nir_intrinsic_load_barycentric_at_sample for SIMD32.
- intel/fs/cse: Fix non-deterministic behavior due to inaccurate liveness calculation.
- intel/fs: Make implied_mrf_writes() an fs_inst method.
- intel/fs: Try to vectorize header setup in lower_load_payload().
- intel/fs: Generalize fs_reg::is_contiguous() to register files other than VGRF.
- intel/fs: Rework fs_inst::is_copy_payload() into multiple classification helpers.
- intel/fs: Extend copy propagation dataflow analysis to copies with FIXED_GRF source.
- intel/fs: Add partial support for copy-propagating FIXED_GRFs.
- intel/fs: Add support for copy-propagating a block of multiple FIXED_GRFs.
- intel/fs: Allow limited copy propagation of a LOAD_PAYLOAD into another.
- intel/fs/gen4-6: Allocate registers from aligned_pairs_class based on LINTERP use.
- intel/fs/gen6: Constrain barycentric source of LINTERP during bank conflict mitigation.
- intel/fs/gen6: Generalize aligned_pairs_class to SIMD16 aligned barycentrics.
- intel/fs/gen6: Use SEL instead of bashing thread payload for unlit centroid workaround.
- intel/fs: Split fetch_payload_reg() into separate helper for barycentrics.
- intel/fs: Introduce barycentric layout lowering pass.
- intel/fs: Switch to standard vector layout for barycentrics at optimization time.
- intel/fs/cse: Make HALT instruction act as CSE barrier.
- intel/fs/gen7: Fix fs_inst::flags_written() for SHADER_OPCODE_FIND_LIVE_CHANNEL.
- intel/fs: Add virtual instruction to load mask of live channels into flag register.
- intel/fs/gen12: Workaround unwanted SEND execution due to broken NoMask control flow.
- intel/fs/gen12: Fixup/simplify SWSB annotations of SIMD32 scratch writes.
- intel/fs/gen12: Workaround data coherency issues due to broken NoMask control flow.

Fritz Koenig (1):

- freedreno: reorder format check

Georg Lehmann (3):

- Correctly wait in the fragment stage until all semaphores are signaled
- Vulkan Overlay: Don't try to change the image layout to present twice
- Vulkan overlay: use the corresponding image index for each swapchain

Gert Wollny (12):

- r600: Disable eight bit three channel formats
- virgl: Increase the shader transfer buffer by doubling the size
- gallium/tgsi_from_mesa: Add 'extern "C"' to be able to include from C++
- nir: make nir_get_texture_size/lod available outside nir_lower_tex
- gallium: tgsi_from_mesa - handle VARYING_SLOT_FACE
- r600: Add functions to dump the shader info
- r600: Make it possible to include r600_asm.h in a C++ file
- r600/sb: Correct SB disassambler for better debugging
- r600: Fix maximum line width
- r600: Make SID and unsigned value
- r600: Delete vertex buffer only if there is actually a shader state
- mesa/st: glsl_to_nir: don't lower atomics to SSBOs if driver supports HW atomics

Guido Günther (2):

- etnaviv: drm: Don't miscalculate timeout
- freedreno/drm: Don't miscalculate timeout

Gurchetan Singh (11):

- drirc: set allow_higher_compat_version for Faster Than Light
- virgl/drm: update UAPI
- teximage: split out helper from EGLImageTargetTexture2DOES
- glapi / teximage: implement EGLImageTargetTexStorageEXT
- dri_util: add driImageFormatToSizedInternalGLFormat function
- i965: track if image is created by a dmabuf
- i965: refactor intel_image_target_texture_2d
- i965: support EXT_EGL_image_storage
- st/dri: track if image is created by a dmabuf
- st/mesa: refactor egl image binding a bit
- st/mesa: implement EGLImageTargetTexStorage

Hyunjun Ko (7):

- freedreno/ir3: cleanup by removing repeated code
- freedreno: support 16b for the sampler opcode
- freedreno/ir3: fix printing output registers of FS.
- freedreno/ir3: fixup when changing to mad.f16
- freedreno/ir3: enable half precision for pre-fs texture fetch
- turnip: fix invalid VK_ERROR_OUT_OF_POOL_MEMORY
- freedreno/ir3: put the conversion back for half const to the right place.

Iago Toral Quiroga (32):

- v3d: rename vertex shader key (num)_fs_inputs fields
- mesa/st: make sure we remove dead IO variables before handing NIR to backends
- glsl: add missing initialization of the location path field
- v3d: fix indirect BO allocation for uniforms
- v3d: actually root the first BO in a command list in the job
- v3d: add missing plumbing for VPM load instructions
- v3d: add debug assert
- v3d: enable debug options for geometry shader dumps
- v3d: remove unused variable
- v3d: add initial compiler plumbing for geometry shaders
- v3d: fix packet descriptions for geometry and tessellation shaders
- v3d: emit geometry shader state commands
- v3d: implement geometry shader instancing
- v3d: add 1-way SIMD packing definition
- v3d: compute appropriate VPM memory configuration for geometry shader workloads
- v3d: we always have at least one output segment
- v3d: add support for adjacency primitives
- v3d: don't try to render if shaders failed to compile
- v3d: predicate geometry shader outputs inside non-uniform control flow
- v3d: save geometry shader state for blitting
- v3d: support transform feedback with geometry shaders
- v3d: remove obsolete assertion
- v3d: do not limit new CL space allocations with branch to 4096 bytes
- v3d: support rendering to multi-layered framebuffers
- v3d: move layer rendering to a separate helper
- v3d: handle writes to gl_Layer from geometry shaders
- v3d: fix primitive queries for geometry shaders
- v3d: disable lowering of indirect inputs
- v3d: support precompiling geometry shaders
- v3d: expose OES_geometry_shader
- u_vbuf: don't try to delete NULL driver CSO
- v3d: fix bug when checking result of syncobj fence import

Ian Romanick (39):

- intel/compiler: Report the number of non-spill/fill SEND messages on vec4 too
- nir/algebraic: Add the ability to mark a replacement as exact
- nir/algebraic: Mark other comparison exact when removing a == a
- intel/fs: Disable conditional discard optimization on Gen4 and Gen5
- nir/range-analysis: Add pragmas to help loop unrolling
- nir/range_analysis: Make sure the table validation only occurs once
- nir/opt_peephole_select: Don't count some unary operations
- intel/compiler: Increase nir_opt_peephole_select threshold
- nir/algebraic: Simplify some Inf and NaN avoidance code
- nir/algebraic: Rearrange bcsel sequences generated by nir_opt_peephole_select
- intel/compiler: Fix 'comparison is always true' warning
- mesa: Silence 'left shift of negative value' warning in BPTC compression code
- mesa: Silence unused parameter warning
- anv: Fix error message format string
- mesa: Extension boilerplate for INTEL_shader_integer_functions2
- glsl: Add new expressions for INTEL_shader_integer_functions2
- glsl_types: Add function to get an unsigned base type from a signed type
- glsl: Add built-in functions for INTEL_shader_integer_functions2
- nir: Add new instructions for INTEL_shader_integer_functions2
- nir/algebraic: Add lowering for uabs_usub and uabs_isub
- nir/algebraic: Add lowering for 64-bit hadd and rhadd
- nir/algebraic: Add lowering for 64-bit usub_sat
- nir/algebraic: Add lowering for 64-bit uadd_sat
- nir/algebraic: Add lowering for 64-bit iadd_sat and isub_sat
- compiler: Translate GLSL IR to NIR for new INTEL_shader_integer_functions2 expressions
- intel/fs: Don't lower integer multiplies that don't need lowering
- intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops
- intel/fs: Implement support for NIR opcodes for INTEL_shader_integer_functions2
- nir/spirv: Translate SPIR-V to NIR for new INTEL_shader_integer_functions2 opcodes
- spirv: Silence a bunch of unused parameter warnings
- spirv: Add support for IntegerFunctions2INTEL capability
- i965: Enable INTEL_shader_integer_functions2 on Gen8+
- gallium: Add a cap bit for OpenCL-style extended integer functions
- gallium: Add a cap bit for integer multiplication between 32-bit and 16-bit
- iris: Enable INTEL_shader_integer_functions2
- anv: Enable SPV_INTEL_shader_integer_functions2 and VK_INTEL_shader_integer_functions2
- nir/algebraic: Optimize some 64-bit integer comparisons involving zero
- relnotes: Add GL_INTEL_shader_integer_functions2 and VK_INTEL_shader_integer_functions2
- intel/fs: Don't count integer instructions as being possibly coissue

Icecream95 (16):

- gallium/auxiliary: Reduce conversions in u_vbuf_get_minmax_index_mapped
- gallium/auxiliary: Handle count == 0 in u_vbuf_get_minmax_index_mapped
- panfrost: Add negative lod bias support
- panfrost: Compact the bo_access readers array
- panfrost: Dynamically allocate shader variants
- panfrost: Add ETC1/ETC2 texture formats
- panfrost: Add ASTC texture formats
- pan/midgard: Fix bundle dynarray leak
- pan/midgard: Fix a memory leak in the disassembler
- pan/midgard: Support disassembling to a file
- pan/bifrost: Support disassembling to a file
- pan/decode: Support dumping to a file
- pan/decode: Dump to a file
- pan/decode: Rotate trace files
- panfrost: Don't copy uniforms when the size is zero
- pan/midgard: Fix a liveness info leak

Icenowy Zheng (2):

- lima: support indexed draw with bias
- lima: fix lima_set_vertex_buffers()

Ilia Mirkin (7):

- gm107/ir: fix loading z offset for layered 3d image bindings
- nv50/ir: mark STORE destination inputs as used
- nv50,nvc0: fix destination coordinates of blit
- nvc0: add dummy reset status support
- gm107/ir: avoid combining geometry shader stores at 0x60
- nvc0: treat all draws without color0 broadcast as MRT
- nvc0: disable xfb's which don't have a stride

Italo Nicola (1):

- intel/compiler: remove old comment

Iván Briano (4):

- intel/compiler: Don't change hstride if not needed
- anv: Export filter_minmax support only when it's really supported
- anv: Export VK_KHR_buffer_device_address only when really supported
- anv: Enable Vulkan 1.2 support

James Xiong (3):

- iris: try to set the specified tiling when importing a dmabuf
- gallium: dmabuf support for yuv formats that are not natively supported
- gallium: let the pipe drivers decide the supported modifiers

Jan Vesely (2):

- clover: Initialize Asm Parsers
- clover: Use explicit conversion from llvm::StringRef to std::string

Jan Zielinski (8):

- gallium/swr: Fix depth values for blit scenario
- swr/rasterizer: Add tessellator implementation to the rasterizer
- gallium/swr: Fix Windows build
- gallium/gallivm/tgsi: enable tessellation shaders
- gallium/gallivm: enable linking lp_bld_printf function with C++ code
- gallium/swr: implementation of tessellation shaders compilation
- gallium/swr: fix tessellation state save/restore
- docs: Update SWR tessellation support

Jason Ekstrand (212):

- util: Add a util_sparse_array data structure
- anv: Move refcount to anv_bo
- anv: Use a util_sparse_array for the GEM handle -> BO map
- anv: Fix a relocation race condition
- anv: Stop storing the GEM handle in anv_reloc_list_add
- anv: Declare the bo in the anv_block_pool_foreach_bo loop
- anv: Inline anv_block_pool_get_bo
- anv: Replace ANV_BO_EXTERNAL with anv_bo::is_external
- anv: Handle state pool relocations using "wrapper" BOs
- anv: Fix a potential BO handle leak
- anv: Rework anv_block_pool_expand_range
- anv: Use anv_block_pool_foreach_bo in get_bo_from_pool
- anv: Rework the internal BO allocation API
- anv: Choose BO flags internally in anv_block_pool
- anv/tests: Zero-initialize instances
- anv/tests: Initialize the BO cache and device mutex
- anv: Allocate block pool BOs from the cache
- anv: Use the query_slot helper
in vkResetQueryPoolEXT - anv: Allocate query pool BOs from the cache - anv: Set more flags on descriptor pool buffers - anv: Allocate descriptor buffers from the BO cache - util: Add a free list structure for use with util_sparse_array - anv: Allocate batch and fence buffers from the cache - anv: Allocate scratch BOs from the cache - anv: Allocate misc BOs from the cache - anv: Drop anv_bo_init and anv_bo_init_new - anv: Add a device parameter to anv_execbuf_add_bo - anv: Set the batch allocator for compute pipelines - anv: Use a bitset for tracking residency - anv: Zero released anv_bo structs - anv: Use the new BO alloc API for Android - anv: Don't delete fragment shaders that write sample mask - anv: Don't claim the null RT as a valid color target - anv: Stop compacting render targets in the binding table - anv: Move the RT BTI flush workaround to begin_subpass - spirv: Remove the type from sampled_image - spirv: Add a vtn_decorate_pointer helper - spirv: Sort out the mess that is sampled image - nir/builder: Add a nir_extract_bits helper - nir: Add tests for nir_extract_bits - intel/nir: Use nir_extract_bits in lower_mem_access_bit_sizes - intel/fs: Add DWord scattered read/write opcodes - intel/fs: refactor surface header setup - intel/nir: Plumb devinfo through lower_mem_access_bit_sizes - intel/fs: Implement the new load/store_scratch intrinsics - intel/fs: Lower large local arrays to scratch - anv: Lock around fetching sync file FDs from semaphores - anv: Plumb timeline semaphore signal/wait values through from the API - spirv: Fix the MSVC build - anv/pipeline: Assume layout != NULL - genxml: Mark everything in genX_pack.h always_inline - anv: Input attachments are always single-plane - anv: Flatten descriptor bindings in anv_nir_apply_pipeline_layout - anv: Delete dead shader constant pushing code - anv: Stop bounds-checking pushed UBOs - anv: Pre-compute push ranges for graphics pipelines - intel/compiler: Add a flag to avoid compacting push constants - 
anv: Re-arrange push constant data a bit - anv: Rework push constant handling - anv: Use a switch statement for binding table setup - anv: More carefully dirty state in BindDescriptorSets - anv: More carefully dirty state in BindPipeline - anv: Use an anv_state for the next binding table - anv: Emit a NULL vertex for zero base_vertex/instance - nir: Validate that variables are in the right lists - iris: Re-enable param compaction - Revert "i965/fs: Merge CMP and SEL into CSEL on Gen8+" - vulkan/enum_to_str: Handle out-of-order aliases - anv/entrypoints: Better handle promoted extensions - vulkan: Update the XML and headers to 1.1.129 - anv: Push constants are relative to dynamic state on IVB - anv: Set up SBE_SWIZ properly for gl_Viewport - anv: Respect the always_flush_cache driconf option - iris: Stop setting up fake params - anv: Drop bo_flags from anv_bo_pool - anv: Add a has_softpin boolean - blorp: Pass the VB size to the VF cache workaround - anv: Always invalidate the VF cache in BeginCommandBuffer - anv: Apply cache flushes after setting index/draw VBs - anv: Use PIPE_CONTROL flushes to implement the gen8 VF cache WA - anv: Don't leak when set_tiling fails - util/atomic: Add a \_return variant of p_atomic_add - anv: Disallow allocating above heap sizes - anv: Stop tracking VMA allocations - anv: Set up VMA heaps independently from memory heaps - anv: Stop advertising two heaps just for the VF cache WA - anv: Add an explicit_address parameter to anv_device_alloc_bo - util/vma: Factor out the hole splitting part of util_vma_heap_alloc - util/vma: Add a function to allocate a particular address range - anv: Add allocator support for client-visible addresses - anv: Use a pNext loop in AllocateMemory - anv: Implement VK_KHR_buffer_device_address - util/atomic: Add p_atomic_add_return for the unlocked path - vulkan/wsi: Provide the implicitly synchronized BO to vkQueueSubmit - vulkan/wsi: Add a hooks for signaling semaphores and fences - anv: Always add in 
EXEC_OBJECT_WRITE when specified in extra_flags - anv: Use submit-time implicit sync instead of allocate-time - anv: Add a fence_reset_reset_temporary helper - anv: Use BO fences/semaphores for AcquireNextImage - anv: Return VK_ERROR_OUT_OF_DEVICE_MEMORY for too-large buffers - anv: Re-capture all batch and state buffers - anv: Re-emit all compute state on pipeline switch - ANV: Stop advertising smoothLines support on gen10+ - anv: Flush the queue on DeviceWaitIdle - anv: Unconditionally advertise Vulkan 1.1 - anv: Bump the advertised patch version to 129 - i965: Enable GL_EXT_gpu_shader4 on Gen6+ - anv: Properly advertise sampledImageIntegerSampleCounts - anv: Drop unneeded struct keywords - blorp: Stop whacking Z24 depth to BGRA8 - blorp: Allow reading with HiZ - i965/blorp: Don't resolve HiZ unless we're reinterpreting - intel/blorp: Use the source format when using blorp_copy with HiZ - anv: Allow HiZ in TRANSFER_SRC_OPTIMAL on Gen8-9 - i965: Allow HiZ for glCopyImageSubData sources - intel/nir: Add a memory barrier before barrier() - intel/disasm: Fix decoding of src0 of SENDS - genxml: Remove a non-existant HW bit - anv: Don't add dynamic state base address to push constants on Gen7 - anv: Flag descriptors dirty when gl_NumWorkgroups is used - anv: Re-use flush_descriptor_sets in flush_compute_state - intel/vec4: Support scoped_memory_barrier - nir: Handle more barriers in dead_write and copy_prop - nir: Handle barriers with more granularity in combine_stores - llmvpipe: No-op implement more barriers - nir: Add a new memory_barrier_tcs_patch intrinsic - spirv: Add a workaround for OpControlBarrier on old GLSLang - spirv: Add output memory semantics to OpControlBarrier in TCS - nir/glsl: Emit memory barriers as part of barrier() - intel/nir: Stop adding redundant barriers - nir: Rename nir_intrinsic_barrier to control_barrier - nir/lower_atomics_to_ssbo: Also lower barriers - anv: Drop an unused variable - intel/blorp: Fill out all the dwords of MI_ATOMIC - 
anv: Don't over-advertise descriptor indexing features - anv: Memset array properties - vulkan/wsi: Add a driconf option to force WSI to advertise BGRA8_UNORM first - vulkan: Update the XML and headers to 1.2.131 - turnip: Pretend to support Vulkan 1.2 - anv: Bump the patch version to 131 - anv,nir: Lower quad_broadcast with dynamic index in NIR - anv: Implement the new core version feature queries - anv: Implement the new core version property queries - relnotes: Add Vulkan 1.2 - anv: Drop some VK_IMAGE_TILING_OPTIMAL checks - anv: Support modifiers in GetImageFormatProperties2 - vulkan/wsi: Move the ImageCreateInfo higher up - vulkan/wsi: Use the interface from the real modifiers extension - vulkan/wsi: Filter modifiers with ImageFormatProperties - vulkan/wsi: Implement VK_KHR_swapchain_mutable_format - anv/blorp: Rename buffer image stride parameters - anv: Canonicalize buffer formats for image/buffer copies - anv: Add an anv_physical_device field to anv_device - anv: Take an anv_device in vk_errorf - anv: Take a device in anv_perf_warn - anv: Stop allocating WSI event fences off the instance - anv: Drop the instance pointer from anv_device - anv: Move the physical device dispatch table to anv_instance - anv: Drop separate chipset_id fields - anv: Re-arrange physical_device_init - anv: Allow enumerating multiple physical devices - anv/apply_pipeline_layout: Initialize the nir_builder before use - intel/blorp: resize src and dst surfaces separately - anv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolves - anv: Add a layout_to_aux_state helper - anv: Use isl_aux_state for HiZ resolves - anv: Add a usage parameter to anv_layout_to_aux_usage - anv: Allow HiZ in read-only depth layouts - anv: Improve BTI change cache flushing - intel/fs: Don't unnecessarily fall back to indirect sends on Gen12 - intel/disasm: Properly disassemble indirect SENDs - intel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s) - intel/isl: Add a hack for the Gen12 A0 texture 
buffer bug - anv: Rework the meaning of anv_image::planes[]::aux_usage - anv: Replace aux_surface.isl.size_B checks with aux_usage checks - intel/aux-map: Add some #defines - intel/aux-map: Factor out some useful helpers - anv: Delete a redundant calculation - isl: Add a helper for calculating subimage memory ranges - anv: Add another align_down helper - anv: Make AUX table invalidate a PIPE\_\* bit - anv: Make anv_vma_alloc/free a lot dumber - anv: Rework CCS memory handling on TGL-LP - intel/blorp: Add support for CCS_E copies with UNORM formats - intel/isl: Allow CCS_E on more formats - intel/genxml: Make SO_DECL::"Hole Flag" a Boolean - anv: Insert holes for non-existant XFB varyings - intel/blorp: Handle bit-casting UNORM and BGRA formats - anv: Replace one more aux_surface.isl.size_B check - intel/mi_builder: Force write completion on Gen12+ - anv: Set actual state pool sizes when we have softpin - anv: Re-use one old BT block in reset_batch_bo_chain - anv/block_pool: Ensure allocations have contiguous maps - anv: Rename a variable - genxml: Add a new 3DSTATE_SF field on gen12 - anv,iris: Set 3DSTATE_SF::DerefBlockSize to per-poly on Gen12+ - intel/genxml: Drop SLMEnable from L3CNTLREG on Gen11 - iris: Set SLMEnable based on the L3$ config - iris: Store the L3$ configs in the screen - iris: Use the URB size from the L3$ config - i965: Re-emit l3 state before BLORP executes - intel: Take a gen_l3_config in gen_get_urb_config - intel/blorp: Always emit URB config on Gen7+ - iris: Consolodate URB emit - anv: Emit URB setup earlier - intel/common: Return the block size from get_urb_config - intel/blorp: Plumb deref block size through to 3DSTATE_SF - anv: Plumb deref block size through to 3DSTATE_SF - iris: Plumb deref block size through to 3DSTATE_SF - anv: Always fill out the AUX table even if CCS is disabled - intel/fs: Write the address register with NoMask for MOV_INDIRECT - anv/blorp: Use the correct size for vkCmdCopyBufferToImage Jonathan Gray (4): - 
winsys/amdgpu: avoid double simple_mtx_unlock() - i965: update Makefile.sources for perf changes - util/futex: use futex syscall on OpenBSD - util/u_thread: don't restrict u_thread_get_time_nano() to \__linux_\_ Jonathan Marek (98): - freedreno: add Adreno 640 ID - freedreno/ir3: disable texture prefetch for 1d array textures - freedreno/registers: fix a6xx_2d_blit_cntl ROTATE - etnaviv: blt: use only for tiling, and add missing formats - etnaviv: separate PE and RS formats, use only RS only for tiling - etnaviv: blt: set TS dirty after clear - turnip: add display wsi - turnip: add x11 wsi - turnip: implement CmdClearColorImage/CmdClearDepthStencilImage - turnip: fix sRGB GMEM clear - util: add missing R8G8B8A8_SRGB format to vk_format_map - freedreno/regs: update UBWC related bits - turnip: implement UBWC - etnaviv: avoid using RS for 64bpp formats - etnaviv: implement 64bpp clear - etnaviv: blt: fix partial ZS clears with TS - etnaviv: support 3d/array/integer formats in texture descriptors - turnip: fix integer render targets - freedreno/registers: add missing MH perfcounter enum for a2xx - freedreno/perfcntrs: add a2xx MH counters - freedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds - freedreno/perfcntrs/fdperf: add missing a20x compatible - freedreno/perfcntrs/fdperf: add missing a2xx case in select_counter - turnip: fix display wsi fence timing out - turnip: don't skip unused attachments when setting up tiling config - turnip: implement CmdClearAttachments - turnip: don't set unused BLIT_DST_INFO bits for GMEM clear - turnip: MSAA resolve directly from GMEM - turnip: allow writes to draw_cs outside of render pass - turnip: add function to allocate aligned memory in a substream cs - turnip: improve emit_textures - turnip: implement border color - turnip: add hw binning - turnip: fix incorrectly failing assert - freedreno/ir3: add GLSL_SAMPLER_DIM_SUBPASS to tex_info - freedreno/registers: add a6xx texture format for stencil sampler - turnip: fix hw 
binning render area - turnip: fix tile layout logic - turnip: update tile_align_w/tile_align_h - turnip: set load_layer_id to zero - turnip: set FRAG_WRITES_SAMPMASK bit - turnip: fix VK_IMAGE_ASPECT_STENCIL_BIT image view - turnip: no 8x msaa on 128bpp formats - turnip: add dirty bit for push constants - turnip: subpass rework - turnip: CmdClearAttachments fixes - turnip: implement subpass input attachments - etnaviv: remove sRGB formats from format table - etnaviv: sRGB render target support - etnaviv: set output mode and saturate bits - etnaviv: update INT_FILTER choice for GLES3 formats - etnaviv: disable integer vertex formats on pre-HALTI2 hardware - etnaviv: remove swizzle from format table - etnaviv: add missing formats - etnaviv: add missing vs_needs_z_div handling to NIR backend - turnip: use single substream cs - turnip: use common blit path for buffer copy - turnip: don't require src image to be set for clear blits - turnip: implement CmdFillBuffer/CmdUpdateBuffer - freedreno/ir3: lower mul_2x32_64 - turnip: fix emit_textures for compute shaders - turnip: remove compute emit_border_color - turnip: fix emit_ibo - turnip: change emit_ibo to be like emit_textures - turnip: remove duplicate A6XX_SP_CS_CONFIG_NIBO - nir: add option to lower half packing opcodes - freedreno/ir3: lower pack/unpack ops - turnip: don't set LRZ enable at end of renderpass - freedreno/ir3: update prefetch input_offset when packing inlocs - turnip: add cache invalidate to fix input attachment cases - turnip: don't set SP_FS_CTRL_REG0_VARYING if only fragcoord is used - freedreno/ir3: fix vertex shader sysvals with pre_assign_inputs - freedreno/registers: document vertex/instance id offset bits - freedreno/ir3: support load_base_instance - turnip: emit base instance vs driver param - turnip: emit_compute_driver_params fixes - turnip: compute gmem offsets at renderpass creation time - turnip: implement secondary command buffers - nir: fix assign_io_var_locations for vertex inputs - 
turnip: minor warning fixes - util/format: add missing vulkan formats - turnip: disable B8G8R8 vertex formats - etnaviv: fix incorrectly failing vertex size assert - etnaviv: update headers from rnndb - etnaviv: HALTI2+ instanced draw - etnaviv: implement gl_VertexID/gl_InstanceID - etnaviv: remove unnecessary vertex_elements_state_create error checking - st/mesa: don't lower YUV when driver supports it natively - st/mesa: run st_nir_lower_tex_src_plane for lowered xyuv/ayuv - freedreno/ir3: allow inputs with the same location - turnip: remove tu_sort_variables_by_location - turnip: fix array/matrix varyings - turnip: hook up GetImageDrmFormatModifierPropertiesEXT - turnip: set linear tiling for scanout images - vulkan/wsi: remove unused image_get_modifier - turnip: simplify tu_physical_device_get_format_properties - etnaviv: implement UBOs - turnip: hook up cmdbuffer event set/wait Jordan Justen (7): - iris: Add IRIS_DIRTY_RENDER_BUFFER state flag - iris/gen11+: Move flush for render target change - iris: Allow max dynamic pool size of 2GB for gen12 - intel: Remove unused Tigerlake PCI ID - iris: Fix some indentation in iris_init_render_context - iris: Emit CS Stall before Instruction Cache flush for gen12 WA - anv: Emit CS Stall before Instruction Cache flush for gen12 WA Jose Maria Casanova Crespo (1): - v3d: Fix predication with atomic image operations Juan A. 
Suarez Romero (3): - nir/lower_double_ops: relax lower mod() - Revert "nir/lower_double_ops: relax lower mod()" - nir/spirv: skip unreachable blocks in Phi second pass Kai Wasserbäch (4): - nir: fix unused variable warning in nir_lower_vars_to_explicit_types - nir: fix unused variable warning in find_and_update_previous_uniform_storage - nir: fix unused function warning in src/compiler/nir/nir.c - intel/gen_decoder: Fix unused-but-set-variable warning Karol Herbst (14): - nv50/ir: fix crash in isUniform for undefined values - nir/validate: validate num_components on registers and intrinsics - nir/serialize: fix vec8 and vec16 - nir/tests: add serializer tests - nir/tests: MSVC build fix - spirv: handle UniformConstant for OpenCL kernels - clover/nir: treat UniformConstant as global memory - clover/nir: set spirv environment to OpenCL - clover/spirv: allow Int64 Atomics for supported devices - nir: handle nir_deref_type_ptr_as_array in rematerialize_deref_in_block - nv50/ir: implement global atomics and handle it for nir - nir/serialize: cast swizzle before shifting - aco: use NIR_MAX_VEC_COMPONENTS instead of 4 - nv50ir/nir: support vec8 and vec16 Kenneth Graunke (57): - iris: Fix "Force Zero RTA Index Enable" setting again - nir: Handle image arrays when setting variable data - Revert "intel/blorp: Fix usage of uninitialized memory in key hashing" - iris: Properly move edgeflag_out from output list to global list - iris: Wrap iris_fix_edge_flags in NIR_PASS - mesa: Handle GL_COLOR_INDEX in \_mesa_format_from_format_and_type(). - iris: Change keybox parenting - iris: Stop mutating the resource in get_rt_read_isl_surf(). - iris: Drop 'old_address' parameter from iris_rebind_buffer - iris: Create an "iris_surface_state" wrapper struct - iris: Maintain CPU-side SURFACE_STATE copies for views and surfaces. 
- iris: Update SURFACE_STATE addresses when setting sampler views
- iris: Disable VF cache partial address workaround on Gen11+
- driconf, glsl: Add a vs_position_always_invariant option
- drirc: Set vs_position_always_invariant for Shadow of Mordor on Intel
- st/mesa: Add GL_TDFX_texture_compression_FXT1 support
- iris: Map FXT1 texture formats
- meson: Add a "prefer_iris" build option
- main: Change u_mmAllocMem align2 from bytes (old API) to bits (new API)
- meson: Include iris in default gallium-drivers for x86/x86_64
- util: Detect use-after-destroy in simple_mtx
- intel/genxml: Add a partial TCCNTLREG definition
- iris: Enable Gen11 Color/Z write merging optimization
- anv: Enable Gen11 Color/Z write merging optimization
- intel/decoder: Make get_state_size take a full 64-bit address and a base
- iris: Create smaller program keys without legacy features
- iris: Default to X-tiling for scanout buffers without modifiers
- iris: Alphabetize source files after iris_perf.c was added
- drirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version
- iris: Make helper functions to turn iris shader keys into brw keys.
- iris: Fix shader recompile debug printing
- iris: Avoid replacing backing storage for buffers with no contents
- intel: Drop Gen11 WaBTPPrefetchDisable workaround
- st/nir: Optionally unify inputs_read/outputs_written when linking.
- iris: Set nir_shader_compiler_options::unify_interfaces.
- st/mesa: Allow ASTC5x5 fallbacks separately from other ASTC LDR formats.
- iris: Disable ASTC 5x5 support on Gen9 for now.
- iris: Delete remnants of the unimplemented ASTC 5x5 workaround
- iris: Allow HiZ for copy_region sources
- anv: Only enable EWA LOD algorithm when doing anisotropic filtering.
- Revert "nir: assert that nir_lower_tex runs after lowering derefs"
- i965: Simplify brw_get_renderer_string()
- iris: Simplify iris_get_renderer_string()
- intel: Use similar brand strings to the Windows drivers
- intel/compiler: Fix illegal mutation in get_nir_image_intrinsic_image
- iris: Fix export of fences that have already completed.
- st/mesa: Allocate full miplevels if MaxLevel is explicitly set
- iris: Drop some workarounds which are no longer necessary
- anv: Drop some workarounds that are no longer necessary
- intel: Fix aux map alignments on 32-bit builds.
- meson: Prefer 'iris' by default over 'i965'.
- loader: Check if the kernel driver is i915 before loading iris
- iris: Drop 'engine' from iris_batch.
- iris: Make iris_emit_default_l3_config pull devinfo from the batch
- iris: Support multiple chained batches.
- i965: Use brw_batch_references in tex_busy check
- loader: Fix leak of kernel driver name

Kristian Høgsberg (62):

- freedreno/registers: Fix typo
- freedreno/registers: Move SP_PRIMITIVE_CNTL and SP_VS_VPC_DST
- freedreno/registers: Add comments about primitive counters
- freedreno/a6xx: Fix primitive counters again
- freedreno/a6xx: Clear sysmem with CP_BLIT
- freedreno: Add nogmem debug option to force bypass rendering
- freedreno/a6xx: Fix layered texture type enum
- freedreno/a6x: Rename z/s formats
- freedreno/a6xx: Add register offset for STG/LDG
- freedreno/ir3: Emit link map as byte or dwords offsets as needed
- freedreno/ir3: Add load and store intrinsics for global io
- freedreno: Don't count primitives for patches
- freedreno/ir3: Add ir3 intrinsics for tessellation
- freedreno/ir3: Use imul24 in offset calculations
- freedreno/ir3: Add tessellation field to shader key
- freedreno/ir3: Extend geometry lowering pass to handle tessellation
- freedreno/ir3: Add new synchronization opcodes
- freedreno/ir3: End TES with chsh when using GS
- freedreno/ir3: Implement tess coord intrinsic
- freedreno/ir3: Implement TCS synchronization intrinsics
- freedreno/ir3: Setup inputs and outputs for tessellation stages
- freedreno/ir3: Don't assume binning shader is always VS
- freedreno/ir3: Pre-color TCS header and primitive ID inputs
- freedreno/ir3: Allocate const space for tessellation parameters
- freedreno/a6xx: Build the right draw command for tessellation
- freedreno/a6xx: Allocate and program tessellation buffer
- freedreno/a6xx: Emit constant parameters for tessellation stages
- freedreno/a6xx: Program state for tessellation stages
- freedreno: Use bypass rendering for tessellation
- freedreno/a6xx: Only set emit.hs/ds when we're drawing patches
- freedreno/blitter: Save tessellation state
- freedreno/a6xx: Only use merged regs and four quads for VS+FS
- freedreno/a6xx: Turn on tessellation shaders
- freedreno/ir3: Use regid() helper when setting up precolor regs
- freedreno/registers: Remove duplicate register definitions
- freedreno: New struct packing macros
- freedreno/registers: Add 64 bit address registers
- freedreno/a6xx: Drop stale include
- freedreno/a6xx: Include fd6_pack.h in a few files
- freedreno/a6xx: Convert emit_mrt() to OUT_REG()
- freedreno/a6xx: Convert emit_zs() to OUT_REG()
- freedreno/a6xx: Convert VSC pipe setup to OUT_REG()
- freedreno/a6xx: Convert gmem blits to OUT_REG()
- freedreno/a6xx: Convert some tile setup to OUT_REG()
- freedreno/a6xx: Silence warning for unused perf counters
- freedreno/a6xx: Document the CP_SET_DRAW_STATE enable bits
- freedreno/a6xx: Make DEBUG_BLIT_FALLBACK only dump fallbacks
- freedreno: Add debug flag for forcing linear layouts
- freedreno/a6xx: Program sampler swap based on resource tiling
- freedreno/a6xx: Pick blitter swap based on resource tiling
- freedreno/a6xx: Add fd_resource_swap() helper
- freedreno/a6xx: Use blitter for resolve blits
- freedreno/a6xx: RB6_R8G8B8 is actually 32 bit RGBX
- freedreno/a6xx: Use A6XX_SP_2D_SRC_FORMAT_MASK macro
- freedreno/a6xx: Handle srgb blits on the blitter
- freedreno/a6xx: Move handle_rgba_blit() up
- freedreno/a6xx: Rewrite compressed blits in a helper function
- freedreno/a6xx: Set up multisample sysmem MRTs correctly
- st/mesa: Lower vars to ssa and constant prop before gl_nir_lower_buffers
- ir3: Set up full/half register conflicts correctly
- iris: Advertise PIPE_CAP_NATIVE_FENCE_FD
- iris: Print warning and return \*out = NULL when fd to syncobj fails

Krzysztof Raszkowski (10):

- gallium/swr: Fix GS invocation issues - Fixed proper setting gl_InvocationID. - Fixed GS vertices output memory overflow.
- gallium/swr: Enable some ARB_gpu_shader5 extensions Enable / add to features.txt: - Enhanced textureGather. - Geometry shader instancing. - Geometry shader multiple streams.
- gallium/swr: Fix crash when use GL_TDFX_texture_compression_FXT1 format.
- gallivm: add TGSI bit arithmetic opcodes support
- gallium/swr: Fix glVertexPointer race condition.
- gallium/swr: Disable showing detected arch message.
- docs/GL4: update gallium/swr features
- gallium/swr: add option for static link
- gallium/swr: Fix gcc 4.8.5 compile error
- gallium/swr: simplify environmental variabled expansion code

Lasse Lopperi (1):

- freedreno/drm: Fix memory leak in softpin implementation

Laurent Carlier (1):

- egl: avoid local modifications for eglext.h Khronos standard header file

Leo Liu (1):

- ac: add missing Arcturus to the info of pc lines

Lepton Wu (2):

- gallium: dri2: Use index as plane number.
- android: mesa: Revert "android: mesa: revert "Enable asm unconditionally""

Lionel Landwerlin (60):

- intel/dev: set default num_eu_per_subslice on gen12
- intel/perf: add TGL support
- intel/perf: fix Android build
- mesa: check draw buffer completeness on glClearBufferfi/glClearBufferiv
- vulkan: bump headers/registry to 1.1.127
- anv: Properly handle host query reset of performance queries
- anv: implement VK_KHR_separate_depth_stencil_layouts
- mesa: check framebuffer completeness only after state update
- anv: invalidate file descriptor of semaphore sync fd at vkQueueSubmit
- anv: remove list items on batch fini
- anv: detach batch emission allocation from device
- anv: expose timeout helpers outside of anv_queue.c
- anv: move queue init/finish to anv_queue.c
- anv: allow NULL batch parameter to anv_queue_submit_simple_batch
- anv: prepare driver to report submission error through queues
- anv: refcount semaphores
- anv: prepare the driver for delayed submissions
- anv/wsi: signal the semaphore in the acquireNextImage
- anv: implement VK_KHR_timeline_semaphore
- intel/dev: flag the Elkhart Lake platform
- intel/perf: add EHL performance query support
- intel/perf: fix invalid hw_id in query results
- intel/perf: set read buffer len to 0 to identify empty buffer
- intel/perf: take into account that reports read can be fairly old
- intel/perf: simplify the processing of OA reports
- intel/perf: fix improper pointer access
- anv: fix missing gen12 handling
- anv: fix incorrect VMA alignment for CCS main surfaces
- anv: fix fence underlying primitive checks
- anv: fix assumptions about temporary fence payload
- intel/perf: drop batchbuffer flushing at query begin
- i965/iris: perf-queries: don't invalidate/flush 3d pipeline
- anv: constify pipeline layout in nir passes
- anv: drop unused parameter from apply layout pass
- vulkan/wsi: error out when image fence doesn't signal
- mesa: avoid triggering assert in implementation
- i965/iris/perf: factor out frequency register capture
- loader: fix close on uninitialized file descriptor value
- anv: don't close invalid syncfd semaphore
- anv: fix intel perf queries availability writes
- anv: set stencil layout for input attachments
- iris: Implement Gen12 workaround for non pipelined state
- anv: Implement Gen12 workaround for non pipelined state
- anv: only use VkSamplerCreateInfo::compareOp if enabled
- anv: fix pipeline switch back for non pipelined states
- genxml: add new Gen11+ PIPE_CONTROL field
- iris: handle new PIPE_CONTROL field
- iris: implement another workaround for non pipelined states
- anv: implement another workaround for non pipelined states
- intel/perf: expose timestamp begin for mdapi
- intel/perf: report query split for mdapi
- anv: enable VK_KHR_swapchain_mutable_format
- anv: don't report error with other vendor DRM devices
- anv: ensure prog params are initialized with 0s
- anv/iris: warn gen12 3DSTATE_HS restriction
- intel: Implement Gen12 workaround for array textures of size 1
- isl: drop CCS row pitch requirement for linear surfaces
- isl: add gen12 comment about CCS for linear tiling
- anv: implement gen9 post sync pipe control workaround
- anv: set MOCS on push constants

Luis Mendes (1):

- radv: fix radv secure compile feature breaks compilation on armhf EABI and aarch64

Marco Felsch (1):

- etnaviv: Fix assert when try to accumulate an invalid fd

Marek Olšák (245):

- glsl: encode/decode types using a union with bitfields for readability
- glsl: encode vector_elements and matrix_columns better
- glsl: encode explicit_stride for basic types better
- glsl: encode array types better
- glsl: encode struct/interface types better
- st/mesa: call nir_opt_access only once
- st/mesa: call nir_lower_flrp only once per shader
- compiler: make variable::data::binding unsigned
- nir: pack nir_variable::data::stream
- nir: pack nir_variable::data::xfb\_\*
- radeonsi: use IR SHA1 as the cache key for the in-memory shader cache
- radeonsi: don't keep compute shader IR after compilation
- radeonsi: keep serialized NIR instead of nir_shader in si_shader_selector
- nir: pack the rest of nir_variable::data
- nir/serialize: don't expand 16-bit variable state slots to 32 bits
- nir/serialize: store 32-bit object IDs instead of 64-bit
- nir/serialize: pack nir_variable flags
- mesa: expose SPIR-V extensions in the Compatibility profile too
- util: add blob_finish_get_buffer
- radeonsi/nir: call nir_serialize only once per shader
- radeonsi/nir: fix compute shader crash due to nir_binary == NULL
- glsl/linker: pass shader_info to analyze_clip_cull_usage directly
- compiler: pack shader_info from 160 bytes to 96 bytes
- st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them
- st/mesa: rename DEBUG_TGSI -> DEBUG_PRINT_IR
- st/mesa: remove \\n being only printed in debug builds after printed TGSI
- st/mesa: print TCS/TES/GS/CS TGSI in the right place & keep disk cache enabled
- st/mesa: add ST_DEBUG=nir to print NIR shaders
- st/mesa: remove unused TGSI-only debug printing functions
- gallium/noop: call finalize_nir
- radeonsi/nir: remove dead function temps
- radeonsi/nir: call nir_lower_flrp only once per shader
- radeonsi/nir: don't lower fma, instead, fuse fma
- mesa: enable glthread for 7 Days To Die
- st/mesa: rename delete_basic_variant -> delete_common_variant
- st/mesa: decrease the size of st_fp_variant_key from 48 to 40 bytes
- st/mesa: start deduplicating some program code
- st/mesa: initialize affected_states and uniform storage earlier in deserialize
- st/mesa: consolidate and simplify code flagging program::affected_states
- st/mesa: trivially merge st_vertex_program into st_common_program
- st/mesa: rename st_common_program to st_program
- st/mesa: cleanups after unification of st_vertex/common program
- st/mesa: rename occurences of stcp to stp to correspond to st_program
- st/mesa: more cleanups after unification of st_vertex/common_program
- st/mesa: subclass st_vertex_program for VP-specific members
- st/mesa: call nir_sweep in st_finalize_nir
- st/mesa: keep serialized NIR instead of nir_shader in st_program
- st/mesa: call nir_serialize only once per shader
- nir: move data.image.access to data.access
- nir/print: only print image.format for image variables
- glsl_to_nir: rename image_access to mem_access
- nir: move data.descriptor_set above data.index for better packing
- nir: don't use GLenum16 in nir.h
- ac: add radeon_info::num_rings and move ring_type to amd_family.h
- ac: fill num_rings for remaining IPs
- winsys/amdgpu: detect noop dependencies on the same ring correctly
- nir: strip as we serialize to remove the nir_shader_clone call
- nir/serialize: do ctx = {0} instead of manual initializations
- util/blob: add 8-bit and 16-bit reads and writes
- nir/serialize: pack instructions better
- nir/serialize: pack src better and limit the object count to 1M from 1G
- nir/serialize: don't serialize var->data for temporaries
- nir/serialize: deduplicate serialized var types by reusing the last unique one
- nir/serialize: try to store a diff in var data locations instead of var data
- nir/serialize: pack load_const with non-64-bit constants better
- nir/serialize: pack 1-component constants into 20 bits if possible
- nir/serialize: pack nir_intrinsic_instr::const_index[] better
- nir/serialize: try to pack two alu srcs into 1 uint32
- nir/serialize: don't store deref types if not needed
- nir/serialize: don't serialize mode for deref non-cast instructions
- nir/serialize: try to put deref->var index into the unused bits of the header
- nir/serialize: cleanup - fold nir_deref_type_var cases into switches
- nir/serialize: try to pack both deref array src into 32 bits
- nir/serialize: remove up to 3 consecutive equal ALU instruction headers
- nir/serialize: reuse the writemask field for 2 src X swizzles of SSA ALU
- nir/serialize: serialize swizzles for vec8 and vec16
- nir/serialize: serialize writemask for vec8 and vec16
- nir/serialize: don't serialize redundant nir_intrinsic_instr::num_components
- nir/serialize: use 3 unused bits in intrinsic for packed_const_indices
- nir/serialize: support any num_components for remaining instructions
- ac: set swizzled bit in cache policy as a hint not to merge loads/stores
- radeonsi: initialize the per-context compiler on demand
- radeonsi/nir: don't run si_nir_opts again if there is no change
- st/mesa: don't serialize all streamout state if there are no SO outputs
- st/mesa: don't use redundant stp->state.ir.nir
- st/mesa: don't call ProgramStringNotify in glsl_to_nir
- st/mesa: propagate gl_PatchVerticesIn from TCS to TES before linking for NIR
- st/mesa: simplify looping over linked shaders when linking NIR
- st/mesa: don't use \*\* in the st_nir_link_shaders signature
- st/mesa: add st_variant base class to simplify code for shader variants
- ac/nir: don't rely on data.patch for tess factors
- radeonsi/nir: implement subgroup system values for SPIR-V
- radeonsi: simplify the interface of get_dw_address_from_generic_indices
- radeonsi: simplify get_tcs_tes_buffer_address_from_generic_indices
- radeonsi/nir: validate is_patch because SPIR-V doesn't set it for tess factors
- radeonsi/nir: don't rely on data.patch for tess factors
- radeonsi/nir: fix location_frac handling for TCS outputs
- radeonsi/nir: support interface output types to fix SPIR-V xfb piglits
- radeonsi: enable SPIR-V and GL 4.6 for NIR
- util/driconfig: print ATTENTION if MESA_DEBUG=silent is not set
- radeonsi/gfx10: simplify some duplicated NGG GS code
- radeonsi/gfx10: fix the vertex order for triangle strips emitted by a GS
- llvmpipe: implement TEX_LZ and TXF_LZ opcodes
- gallivm: implement LOAD with CONSTBUF but don't enable it for llvmpipe
- st/mesa: support UBOs for Selection/Feedback/RasterPos
- st/mesa: save currently bound vertex samplers and sampler views in st_context
- st/mesa: support samplers for Selection/Feedback/RasterPos
- st/mesa: support SSBOs for
Selection/Feedback/RasterPos - st/mesa: support shader images for Selection/Feedback/RasterPos - st/mesa: use a separate VS variant for the draw module - st/mesa: remove st_vp_variant::num_inputs - st/mesa: remove struct st_vp_variant in favor of st_common_variant - st/mesa: don't generate VS TGSI if NIR is enabled - draw, st/mesa: generate TGSI for ffvp/ARB_vp if draw lacks LLVM - st/mesa: release the draw shader properly to fix driver crashes (iris) - st/dri: assume external consumers of back buffers can write to the buffers - radeonsi: enable NIR by default and document GL 4.6 support - radeonsi/gfx10: disable vertex grouping - radeonsi/gfx10: simplify the tess_turns_off_ngg condition - radeonsi: don't rely on CLEAR_STATE to set PA_SC_GENERIC_SCISSOR\_\* - ac: fix ac_get_i1_sgpr_mask for Wave32 - ac: fix the return value in cull_bbox when bbox culling is disabled - radeonsi: deduplicate ES and GS thread enablement code - radeonsi: disallow compute-based culling if polygon mode is enabled - radeonsi: set is_monolithic for VS prologs when the shader is really monolithic - radeonsi: don't wrap the VS prolog in if (ES thread) .. 
endif - radeonsi/gfx10: don't insert NGG streamout atomics if they are never used - radeonsi: allow generating VS prologs with 0 inputs - radeonsi: fix determining whether the VS prolog is needed - radeonsi: reset more fields in si_llvm_context_set_ir to fix reusing ctx - radeonsi/gfx10: fix ngg_get_ordered_id - amd/addrlib: update to the latest version - ac/surface: fix an assertion failure on gfx9 in CMASK computation - radeonsi/gfx10: don't declare any LDS for NGG if it's not used - radeonsi/gfx10: enable NGG passthrough for eligible shaders - radeonsi/gfx10: improve performance for TES using PrimID but not exporting it - Revert "u_vbuf: Regard non-constant vbufs with non-instance elements as free" - winsys/radeon: initialize pte_fragment_size - radeonsi: preserve the scanout flag for shared resources on gfx9 and gfx10 - radeonsi: ignore PIPE_BIND_SCANOUT for imported textures - radeonsi: remove the "display_dcc_offset == 0" assertion - radeonsi: rename SDMA debug flags - radeonsi: remove broken and unused SI SDMA image copy code - radeonsi: add AMD_DEBUG=nodmaclear for debugging - radeonsi: add AMD_DEBUG=nodmacopyimage for debugging - radeonsi: rename dma_cs -> sdma_cs - radeonsi: move SI and CIK+ SDMA code into 1 common function for cleanups - radeonsi: disable SDMA on gfx8 to fix corruption on RX 580 - radeonsi: remove TGSI - gallium: put u_vbuf_get_caps return values into u_vbuf_caps - gallium/cso_context: move non-vbuf vertex buffer and element code into helpers - gallium: bypass u_vbuf if it's not needed (no fallbacks and no user VBOs) - ac/gpu_info: always use distributed tessellation on gfx10 - radeonsi: fix monolithic pixel shaders with two-sided colors and SampleMaskIn - radeonsi: fix context roll tracking in si_emit_shader_vs - radeonsi: test polygon mode enablement accurately - radeonsi: determine accurately if line stippling is enabled for performance - radeonsi: clean up messy si_emit_rasterizer_prim_state - ac: unify build_sendmsg_gs_alloc_req - 
ac: unify primitive export code - ac/gpu_info: add pc_lines and use it in radeonsi - ac: add 128-bit bitcount - ac: add ac_build_s_endpgm - radeonsi/gfx9: force the micro tile mode for MSAA resolve correctly on gfx9 - radeonsi: rename desc_list_byte_size -> vb_desc_list_alloc_size - radeonsi: add si_context::num_vertex_elements - radeonsi: don't allow draw calls with uninitialized VS inputs - radeonsi: simplify si_set_vertex_buffers - ac,radeonsi: increase the maximum number of shader args and return values - radeonsi: put up to 5 VBO descriptors into user SGPRs - radeonsi: don't enable VBOs in user SGPRs if compute-based culling can be used - radeonsi: fix assertion and other failures in si_emit_graphics_shader_pointers - radeonsi: actually enable VBOs in user SGPRs - radeonsi: don't adjust depth and stencil PS output locations - radeonsi: rename DBG_NO_TGSI -> DBG_NO_NIR - radeonsi: remove TGSI from comments - radeonsi: rename si_shader_info -> si_shader_binary_info - radeonsi: fork tgsi_shader_info and tgsi_tessctrl_info - radeonsi: merge si_tessctrl_info into si_shader_info - radeonsi: clean up si_shader_info - radeonsi: rename si_compile_tgsi_main -> si_build_main_function - radeonsi: rename si_shader_create -> si_create_shader_variant for clarity - radeonsi: fold si_create_function into si_llvm_create_func - radeonsi: remove always constant ballot_mask_bits from si_llvm_context_init - radeonsi: move PS LLVM code into si_shader_llvm_ps.c - radeonsi: separate code computing info for small primitive culling - ac/cull: don't read Position.Z if it's not needed for culling - radeonsi: make si_insert_input\_\* functions non-static - radeonsi: move VS_STATE.LS_OUT_PATCH_SIZE a few bits higher to make space there - radeonsi/gfx10: separate code for getting edgeflags from the gs_invocation_id VGPR - radeonsi/gfx10: separate code for determining the number of vertices for NGG - radeonsi: fix si_build_wrapper_function for compute-based primitive culling - radeonsi: work 
around an LLVM crash when using llvm.amdgcn.icmp.i64.i1 - radeonsi: move si_insert_input\_\* functions - radeonsi: move tessellation shader code into si_shader_llvm_tess.c - radeonsi: remove llvm_type_is_64bit - radeonsi: move geometry shader code into si_shader_llvm_gs.c - radeonsi: move code for shader resources into si_shader_llvm_resources.c - radeonsi: remove useless #includes - radeonsi: merge si_compile_llvm and si_llvm_compile functions - gallium: add st_context_iface::flush_resource to call FLUSH_VERTICES - st/dri: do FLUSH_VERTICES before calling flush_resource - Revert "radeonsi: unbind image before compute clear" - radeonsi: clean up how internal compute dispatches are handled - radeonsi: don't invoke decompression inside internal launch_grid - radeonsi: fix doubles and int64 - radeonsi: turn an assertion into return in si_nir_store_output_tcs - ac: add prefix bitcount functions - ac: add ac_build_readlane without optimization barrier - radeonsi/gfx10: update comments and remove invalid TODOs - radeonsi/gfx10: correct VS PrimitiveID implementation for NGG - radeonsi/gfx10: move s_sendmsg gs_alloc_req to the beginning of shaders - radeonsi/gfx10: export primitives at the beginning of VS/TES - radeonsi/gfx10: merge main and pos/param export IF blocks into one if possible - radeonsi/gfx10: don't initialize VGPRs not used by NGG passthrough - radeonsi/gfx10: move GE_PC_ALLOC setting to shader states - radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups - ac: add helper ac_build_triangle_strip_indices_to_triangle - radeonsi/gfx10: rewrite late alloc computation - radeonsi/gfx10: enable GS fast launch for triangles and strips with NGG culling - radeonsi: use ctx->ac. 
for types and integer constants - radeonsi: move non-LLVM code out of si_shader_llvm.c - radeonsi: move VS shader code into si_shader_llvm_vs.c - radeonsi: move si_shader_llvm_build.c content into si_shader_llvm.c - radeonsi: minor cleanup in si_shader_internal.h - radeonsi: move si_nir_build_llvm into si_shader_llvm.c - radeonsi: fold si_shader_context_set_ir into si_build_main_function - radeonsi: move more LLVM functions into si_shader_llvm.c - radeonsi: make si_compile_llvm return bool - radeonsi: make si_compile_shader return bool - radeonsi: change prototypes of si_is_multi_part_shader & si_is_merged_shader - radeonsi: separate LLVM compilation from non-LLVM code - util/simple_mtx: add a missing include to get ASSERTED - gallium/util: add a cache of live shaders for shader CSO deduplication - radeonsi: use the live shader cache - radeonsi: restructure si_shader_cache_load_shader - radeonsi: print shader cache stats with AMD_DEBUG=cache_stats - radeonsi: expose shader cache stats to the HUD - radeonsi: make screen available to shader part compilation - radeonsi: fix a regression since the addition of si_shader_llvm_vs.c - Revert "winsys/amdgpu: Close KMS handles for other DRM file descriptions" - Revert "winsys/amdgpu: Re-use amdgpu_screen_winsys when possible" - radeonsi: don't report that multi-plane formats are supported - radeonsi: fix the DCC MSAA bug workaround - radeonsi: don't wait for shader compilation to finish when destroying a context Marek Vasut (5): - etnaviv: Replace bitwise OR with logical OR - etnaviv: tgsi: Fix gl_FrontFacing support - etnaviv: Report correct number of vertex buffers - etnaviv: Do not filter out PIPE_FORMAT_S8_UINT_Z24_UNORM on pre-HALTI2 - etnaviv: Destroy rsc->pending_ctx set in etna_resource_destroy() Mark Janes (3): - Revert "st/mesa: call nir_serialize only once per shader" - Revert "st/mesa: keep serialized NIR instead of nir_shader in st_program" - iris: separating out common perf code Markus Wick (3): - mapi/glapi: 
Generate sizeof() helpers instead of fixed sizes. - mesa/glthread: Implement ARB_multi_bind. - drirc: Enable glthread for dolphin/citra/yuzu. Martin Fuzzey (1): - etnaviv: update Android build files Mathias Fröhlich (1): - egl: Implement getImage/putImage on pbuffer swrast. Matt Turner (19): - intel/compiler: Use ARRAY_SIZE() - intel/compiler: Extract GEN\_\* macros into separate file - intel/compiler: Split has_64bit_types into float/int - intel/compiler: Don't disassemble align1 3-src operands on Gen < 10 - intel/compiler: Limit compaction unit tests to specific gens - intel/compiler: Add NF some more places - intel/compiler: Add a INVALID_{,HW_}REG_TYPE macros - intel/compiler: Split hw_type tables - intel/compiler: Handle invalid inputs to brw_reg_type_to_*() - intel/compiler: Handle invalid compacted immediates - intel/compiler: Factor out brw_validate_instruction() - intel/compiler: Validate some instruction word encodings - intel/compiler: Add unit tests for new EU validation checks - intel/compiler: Validate fuzzed instructions - intel/compiler: Test compaction on Gen <= 12 - gitlab-ci: Skip ext_timer_query/time-elapsed - intel/compiler: Move Gen4/5 rounding to visitor - util: Explain BITSET_FOREACH_SET params - util: Remove tmp argument from BITSET_FOREACH_SET macro Mauro Rossi (9): - android: aco: fix Lower to CSSA - android: radeonsi: fix build error due to wrong u_format.csv file path - android: util/format: fix include path list - android: radeonsi: fix build after vl refactoring (v2) - android: nir: add a load/store vectorization pass - android: util: Add a mapping from VkFormat to PIPE_FORMAT. 
- android: radv: fix vk_format_table.c generated source build - android: radeonsi,ac: fix building error due to ac changes - android: radv: build radv_shader_args.c Michel Dänzer (36): - gitlab-ci: Set arm job CCACHE_DIR properly - gitlab-ci: Use separate arm64 build/test docker images - gitlab-ci: Don't build libdrm for ARM - gitlab-ci: Use ninja -j4 for building dEQP - gitlab-ci: Move artifact preparation to separate script - gitlab-ci: Share dEQP build process between x86 & ARM test image scripts - gitlab-ci: Sort packages in debian-install.sh - gitlab-ci: Run piglit tests with llvmpipe - gitlab-ci: Use separate docker images for x86 build/test jobs - gitlab-ci: Delete install/bin from artifacts as well - gitlab-ci: Document that ci-templates refs must be in sync - gitlab-ci: Use functional container job names - gitlab-ci: Rename container install scripts to match job names (better) - gitlab-ci: Organize images using new REPO_SUFFIX templates feature - gitlab-ci: Directly use host-mapped directory for ccache - gitlab-ci: Stop reporting piglit test results via JUnit - gitlab-ci: Stop storing piglit test results as JUnit - gitlab-ci: Put HTML summary in artifacts for failed piglit jobs - gitlab-ci: Update to current ci-templates master - gitlab-ci: Run piglit glslparser & quick_shader tests separately - glsl/tests: Use splitlines() instead of strip() - gitlab-ci: Use the common run policy for LAVA jobs as well again - gitlab-ci: Overhaul job run policy - gitlab-ci: Don't exclude any piglit quick_shader tests - gitlab-ci: Test against LLVM / clang 9 on x86 - gitlab-ci: Stop using manual jobs for merge requests - gitlab-ci: Set GIT_STRATEGY to none for the dummy job - gitlab-ci: Use single if for manual job rules entry - winsys/amdgpu: Keep a list of amdgpu_screen_winsyses in amdgpu_winsys - winsys/amdgpu: Keep track of retrieved KMS handles using hash tables - winsys/amdgpu: Only re-export KMS handles for different DRM FDs - util: Add os_same_file_description 
helper - winsys/amdgpu: Re-use amdgpu_screen_winsys when possible - winsys/amdgpu: Close KMS handles for other DRM file descriptions - winsys/amdgpu: Re-use amdgpu_screen_winsys when possible - winsys/amdgpu: Close KMS handles for other DRM file descriptions Michel Zou (3): - Meson: Check for dladdr with MinGW - disk_cache_get_function_timestamp: check for dladdr - Meson: Add llvm>=9 modules Miguel Casas-Sanchez (1): - i965: Ensure that all 2101010 image imports can pass framebuffer completeness. Nanley Chery (3): - gallium/dri2: Fix creation of multi-planar modifier images - gallium: Store the image format in winsys_handle - iris: Fix import of multi-planar surfaces with modifiers Nataraj Deshpande (1): - egl/android: Restrict minimum triple buffering for android color_buffers Nathan Kidd (1): - llvmpipe: Check thread creation errors Neha Bhende (3): - st/mesa: release tgsi tokens for shader states - svga: fix size of format_conversion_table[] - svga: Use pipe_shader_state_from_tgsi to set shader state Neil Armstrong (3): - Add support for T820 CI Jobs - ci: Remove T820 from CI temporarily - gitlab-ci/lava: add pipeline information in the lava job name Neil Roberts (9): - nir/opcodes: Add a helper function to generate the comparison binops - nir/opcodes: Add a helper function to generate reduce opcodes - nir: Add a 16-bit bool type - nir: Add a 8-bit bool type - nir/lower_alu_to_scalar: Support lowering 8- and 16-bit reduce ops - freedreno/ir3: Support 16-bit comparison instructions - freedreno/ir3: Add implementation of nir_op_b16csel - freedreno/ir3: Implement f2b16 and i2b16 - freedreno/ir3: Enabling lowering 16-bit flrp Paul Cercueil (5): - kmsro: Extend to include ingenic-drm - u_vbuf: Mark vbufs incompatible if more were requested than HW supports - u_vbuf: Only create driver CSO if no incompatible elements - u_vbuf: Regard non-constant vbufs with non-instance elements as free - u_vbuf: Return true in u_vbuf_get_caps if nb of vbufs is below minimum Paul 
Gofman (1): - state_tracker: Handle texture view min level in st_generate_mipmap() Paulo Zanoni (2): - intel/compiler: remove the operand restriction for src1 on GLK - intel/compiler: fix nir_op_{i,u}*32 on ICL Peng Huang (1): - radeonsi: make si_fence_server_signal flush pipe without work Philipp Sieweck (1): - svga: check return value of define_query_vgpu{9,10} Pierre Moreau (4): - compiler/spirv: Fix uses of gnu struct = {} extension - include/CL: Update OpenCL headers to latest - clover: Use the dispatch table type from the OpenCL headers - clover/meson: Define OpenCL header macros Pierre-Eric Pelloux-Prayer (54): - radeonsi: tell the shader disk cache what IR is used - mesa: enable msaa in clear_with_quad if needed - mesa: pass vao as a function parameter - mesa: add EXT_dsa glVertexArray\* functions declarations - mesa: rework \_mesa_lookup_vao_err to allow usage from EXT_dsa - mesa: add vao/vbo lookup helper for EXT_dsa - mesa: add EXT_dsa glVertexArray\* functions implementation - mesa: add gl_vertex_array_object parameter to client state helpers - mesa: add EXT_dsa glEnableVertexArrayEXT / glDisableVertexArrayEXT - mesa: add EXT_dsa EnableVertexArrayAttribEXT / DisableVertexArrayAttribEXT - mesa: extract helper function from \_mesa_GetPointerv - mesa: add EXT_dsa glGetVertexArray\* 4 functions - mesa: fix call to \_mesa_lookup_vao_err - radeonsi: fix shader disk cache key - radeonsi: enable mesa_glthread for GfxBench - mesa: update features.txt to reflect EXT_dsa status - mesa: add ARB_framebuffer_no_attachments named functions - mesa: add ARB_vertex_attrib_64bit VertexArrayVertexAttribLOffsetEXT - mesa: add ARB_clear_buffer_object named functions - mesa: add ARB_gpu_shader_fp64 selector-less functions - mesa: add ARB_instanced_arrays EXT_dsa function - mesa: add ARB_texture_buffer_range glTextureBufferRangeEXT function - mesa: implement ARB_texture_storage_multisample + EXT_dsa functions - mesa: extend vertex_array_attrib_format to support EXT_dsa - mesa: 
add ARB_vertex_attrib_binding glVertexArray\* functions - mesa: add ARB_sparse_buffer NamedBufferPageCommitmentEXT function - mesa: enable EXT_direct_state_access - mesa: fix warning in 32 bits build - radeonsi: implement sdma for GFX9 - radeonsi: display cs blit count for AMD_DEBUG=testdma - radeonsi: use gfx9.surf_offset to compute texture offset - radeonsi: fix multi plane buffers creation - radeonsi: dcc dirty flag - st/mesa: add a notify_before_flush callback param to flush - st/dri: use st->flush callback to flush the backbuffer - radeonsi: disable dcc for 2x MSAA surface and bpe < 4 - gallium: refuse to create buffers larger than UINT32_MAX - radeon/vcn2: enable rate control for hevc encoding - radeonsi: check ctx->sdma_cs before using it - radeonsi: release saved resources in si_retile_dcc - radeonsi: release saved resources in si_compute_expand_fmask - radeonsi: release saved resources in si_compute_clear_render_target - radeonsi: release saved resources in si_compute_copy_image - radeonsi: release saved resources in si_compute_clear_12bytes_buffer - radeonsi: release saved resources in si_compute_do_clear_or_copy - radeonsi: fix fmask expand compute shader - radeonsi: make sure fmask expand is done if needed - radeonsi: unbind image before compute clear - radeonsi: drop the negation from fmask_is_not_identity - util: call bind_sampler_states before setting sampler_views - radeonsi: move AMD_DEBUG tests to AMD_TEST - docs: document AMD_DEBUG variable - radeonsi: stop using the VM_ALWAYS_VALID flag - radeonsi/ngg: add VGT_FLUSH when enabling fast launch Prodea Alexandru-Liviu (2): - Meson: Remove lib prefix from graw and osmesa when building with Mingw. Also remove version suffix from osmesa swrast on Windows. - Appveyor: Quickly fix meson build. As this required use of Python 3.8, mako module also had to be updated. 
Qiang Yu (3): - lima: sync lima_drm.h with kernel - lima: create heap buffer with new interface if available - lima: add noheap debug option Rafael Antognolli (23): - intel/isl: Add MOCS settings to isl_device. - anv: Use mocs settings from isl_dev. - iris: Use mocs from isl_dev. - intel: Add workaround for stencil state. - intel/genxml: Add 3DSTATE_CONSTANT_ALL packet. - intel/aubinator: Decode 3DSTATE_CONSTANT_ALL. - intel/blorp: Use 3DSTATE_CONSTANT_ALL to setup push constants. - iris: Rework push constants emitting code. - iris: Use 3DSTATE_CONSTANT_ALL when possible. - anv: Move gen8+ push constant packet workaround. - anv: Add get_push_range_address() helper. - anv: Move code for emitting push constants into its own function. - anv: Use 3DSTATE_CONSTANT_ALL when possible. - iris: Add restriction to 3DSTATE_CONSTANT\_ packets. - util/os_socket: Add socket related functions. - vulkan/overlay: Add a control socket. - vulkan/overlay: Add support for a control socket. - vulkan/overlay: Add a command to start capturing data to a file. - vulkan/overlay: Add basic overlay control script. - vulkan/overlay: Update docs. - iris: Implement WA for push constants. - utils/os_socket: Define ssize_t on windows. - intel: Load the driver even if I915_PARAM_REVISION is not found. 
Rhys Perry (131): - radv: adjust loop unrolling heuristics for int64 - aco: add Instruction::usesModifiers() and add more checks in the optimizer - radv: fix radv_nir_get_max_workgroup_size when nir=NULL - aco: use DPP instead of exec modification when lowering GFX10 shuffles - aco: fix shuffle with uniform operands - nir/divergence: improve DA of shuffle - aco: fix read_invocation with VGPR lane index - aco: don't propagate vgprs into v_readlane/v_writelane - aco: combine read_invocation and shuffle implementations - radv: enable FP16/FP64 denormals earlier and only for LLVM - aco: don't combine literals into v_cndmask_b32/v_subb/v_addc - aco: fix 64-bit fsign with 0 - aco: implement VK_KHR_shader_float_controls - aco: refactor reduction lowering helpers - aco: implement 64-bit integer reductions - radv/aco: enable VK_KHR_shader_subgroup_extended_types - nir: make nir_variable::{num_members,num_state_slots} a uint16_t - nir: add nir_variable::index and nir_index_vars - nir/large_constants: use nir_index_vars and nir_variable::index - docs: update features.txt for RADV - aco: improve waitcnt insertion around loops - aco: fix copy+paste error - aco: fix waitcnts for barriers at block ends - nir: add nir_num_variable_modes and nir_var_mem_push_const - radv: set alignment for load_ssbo/store_ssbo in meta shaders - nir: add a load/store vectorization pass - nir: add load/store vectorizer tests - aco: enable load/store vectorizer - aco: allow constant offsets for global/scratch instructions on GFX10 - aco: set dlc/glc correctly for image loads - aco: propagate p_wqm on an image_sample's coordinate p_create_vector - aco: fix i2i64 - aco: fix incorrect cast in parse_wait_instr() - aco: add v_nop inbetween exec write and VMEM/DS/FLAT - aco: improve WAR hazard workaround with >64bit stores - aco: fix GFX10 opcodes for some global/flat atomics - aco: fix assembly of FLAT/GLOBAL atomics - aco: fix SADDR with FLAT on GFX10 - aco: don't enable store_global for helper 
invocations - aco: improve FLAT/GLOBAL scheduling - aco: implement global atomics - ac/llvm: fix pointer type for global atomics - ac/llvm: improve sync scope for global atomics - radv: set writes_memory for global memory stores/atomics - aco: validate the CFG - aco: handle loop exit and IF merge phis with break/discard - aco: fix block_kind_discard s_andn2 definition to exec - nir/lower_io_to_vector: don't create arrays when not needed - nir/load_store_vectorize: fix combining stores with aliasing loads between - aco/wave32: fix comparison optimizations - aco: improve jump threading with wave32 - aco: fix vgpr alloc granule with wave32 - aco: limit register usage for large work groups - aco: set vm for pos0 exports on GFX10 - aco: fix imageSize()/textureSize() with large buffers on GFX8 - aco: fix uninitialized data in the binary - aco: handle VOP3 modifiers when combining a constant comparison's NaN test - aco: handle omod successors with the constant in the first operand - aco: check usesModifiers() when identifying a neg/abs - aco: better handle neg/abs of sgprs - aco: set exec_potentially_empty for demotes - aco: don't DCE atomics with return values - aco: disable add combining for ds_swizzle_b32 - aco: check if multiplication/clamp is live when applying output modifier - nir/divergence: handle load_primitive_id in GS - nir/lower_gs_intrinsics: add option for per-stream counts - aco: update IR validator - aco: apply literals to split mads - aco: combine two sgprs into a VALU if they're the same - aco: improve can_use_VOP3() - aco: rewrite literal combining - aco: rewrite apply_sgprs() - aco: add check_vop3_operands() - aco: be more careful with literals in combine_salu_{n2,lshl_add} - aco: follow through temporary when merging tests into constant comparisons - aco: allow applying two sgprs to an instruction - aco: allow an extra SGPR with multiple uses to be applied to VOP3 - aco: take advantage of GFX10's constant bus limit and VOP3 literals - aco: improve 
creation of v_madmk_f32/v_madak_f32 - aco: fix clamp optimization - aco: improve clamp optimization - aco: add min(-max(), ) and max(-min(), ) optimization - aco: don't move literal to reg when making an instruction VOP3 on GFX10 - aco: allow input modifiers on v_cndmask_b32 - aco: replace extract_vector with copies - aco: improve readfirstlane after uniform LDS loads - aco: add integer min/max to can_swap_operands - nir/sink,nir/move: move/sink load_per_vertex_input - nir/sink,nir/move: move/sink nir_op_mov - nir/algebraic: a & ~(a >> 31) -> imax(a, 0) - aco: fix stack buffer overflow in apply_sgprs() - aco: fix fall-through test in try_remove_simple_block() with back-edges - aco: fix operand kill flags when a temporary is used more than once - aco: fix off-by-one error when initializing sgpr_live_in - radv: move gs copy shader creation before other variants - aco: improve support for s_sendmsg - radv/aco,aco: implement GS on GFX9+ - aco: implement GS on GFX7-8 - radv/aco: allow ACO for GS - aco: explicitly mark end blocks for exports - aco: remove needs_instance_id - aco: implement GS copy shaders - radv/aco: use ACO for GS copy shaders - aco: use nir_move_copies - aco: fix WaR check for >64-bit FLAT/GLOBAL instructions - aco: fix operand to scc when selecting SGPR ufind_msb/ifind_msb - aco: always add sgprs to sgpr_ids when choosing literals - aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc - amd/common,radv: move vertex_format_table to ac_shader_util.{h,c} - aco: rework vertex fetching a bit - aco: skip unused channels at the start when fetching vertices - aco: handle unaligned vertex fetch on GFX10 - aco: value-number MUBUF instructions - aco: use MUBUF in some situations instead of splitting vertex fetches - aco: fix rebase error from GS copy shader support - aco: ensure predecessors' p_logical_end is in WQM when a p_phi is in WQM - aco: run p_wqm instructions in WQM - nir/algebraic: add patterns for a >> #b << #b - nir/algebraic: add some 
half packing optimizations - aco: fix target calculation when vgpr spilling introduces sgpr spilling - aco: don't consider loop header blocks branch blocks in add_coupling_code - aco: don't update demand in add_coupling_code() for loop headers - aco: only create parallelcopy to restore exec at loop exit if needed - aco: don't always add logical edges from continue_break blocks to headers - aco: error when block has no logical preds but VGPRs are live at the start - aco: set exec_potentially_empty after continues/breaks in nested IFs - aco: improve assertion at the end of spiller - aco: fill reg_demand with sensible information in add_coupling_code() - aco: parallelcopy exec mask before s_wqm - aco: fix exec mask consistency issues - aco: fix gfx10_wave64_bpermute Ricardo Garcia (1): - anv: Unify GetDeviceQueue and GetDeviceQueue2 Rob Clark (89): - freedreno/ir3: split pre-coloring to its own function - freedreno/ir3: use SSA flag on dest register too - freedreno/ir3: ir3_print tweaks - freedreno/ir3/ra: move regs_count==0 check - freedreno/ir3/ra: remove ir print after livein/out - freedreno/ir3: remove obsolete comment - freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION - freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION - freedreno/ir3: sync disasm changes from envytools - freedreno/ir3: also track # of nops for shader-db - freedreno: fix eglDupNativeFenceFD error - freedreno/ir3: fix valgrind complaint with STLW - freedreno/ir3: remove half-precision output - freedreno/ir3: rename fanin/fanout to collect/split - freedreno/ir3: remove impossible condition - freedreno/ir3: add input/output iterators - freedreno/ir3: show input/output wrmask's in disasm - freedreno/ir3: helper to print ir if debug enabled - freedreno/ir3: remove first-vertex sysval - freedreno/ir3: simplify creating sysval inputs - freedreno/ir3: re-work shader inputs/outputs - freedreno/ir3: only tex instructions have wrmask - freedreno/ir3: fix gpu hang with pre-fs-tex-fetch - freedreno/ir3: 
legalize cleanups - freedreno/ir3: remove unused parameter - freedreno/perfcntrs: small cleanup - freedreno/perfcntrs: remove gallium dependencies - freedreno/perfcntrs: move to shared location - freedreno/perfcntrs: add accessor to get per-gen tables - freedreno/perfctrs/a2xx: move CP to be first group - freedreno/perfcntrs/a6xx: remove RBBM counters - freedreno/perfcntrs: add fdperf - freedreno/perfctrs/fdperf: periodically restore counters - gitlab-ci: update deqp build so we can generate xml - gitlab-ci/deqp: preserve full list of unexpected results - gitlab-ci/deqp: preserve caselists for blocks with fails - gitlab-ci/deqp: detect and report flakes - gitlab-ci: bump arm test container - gitlab-ci/deqp: generate xml results for fails/flakes - gitlab-ci/deqp: generate junit results - gitlab-ci/freedreno/a6xx: remove most of the flakes - freedreno: use rsc->slice accessor everywhere - freedreno: switch to layout helper - gitlab-ci: disable junit results for deqp - freedreno/ir3: remove store_output lowered to store_shared_ir3 - freedreno/ir3: fix neverball assert in case of unused VS inputs - nir/lower_clip: Fix incorrect driver loc for clipdist outputs - freedreno/fdperf: use drmOpen() - freedreno/a6xx: disable LRZ when blending - freedreno/a5xx+a6xx: split LRZ layout to per-gen - freedreno/a6xx: fix LRZ layout - freedreno/a6xx: fix LRZ logic - freedreno/a6xx: enable LRZ by default - spirv: add OpLifetime\* - freedreno/ir3: add last-baryf shaderdb stat - freedreno/ir3: add scheduler traces - freedreno/ir3: add iterator macros - freedreno/a6xx: fix OUT_REG() vs growable cmdstream - nir+vtn: vec8+vec16 support - freedreno/ir3: fix flat shading again - nir: assert that nir_lower_tex runs after lowering derefs - mesa/st: lower samplers before nir_lower_tex - freedreno/ir3: rename instructions - gitlab-ci: fix missing caselist.css/xsl - freedreno/a6xx: limit scratch/debug markers to debug builds - freedreno/a6xx: cleanup rasterizer state - freedreno/a6xx: separate 
rast stateobj for prim restart - freedreno/a6xx: drop a few more per-draw registers - freedreno/a6xx: move dynamic program state to streaming stateobj - freedreno/a6xx: add PROG_FB_RAST stateobj - freedreno/drm: fix invalid-cmdstream-size with older kernels - freedreno: use PIPE_CAP_RGB_OVERRIDE_DST_ALPHA_BLEND - mesa/st: random whitespace cleanup - freedreno/a6xx: remove special handling based on MRT format - freedreno/a6xx: convert blend state to stateobj - freedreno: extract vsc pipe bo from GMEM state - freedreno: consolidate GMEM state - freedreno: constify fd_tile - freedreno: constify fd_vsc_pipe - freedreno/a6xx: constify gmem state - freedreno/a5xx: constify gmem state - freedreno/a4xx: constify gmem state - freedreno/a3xx: constify gmem state - freedreno/a2xx: constify gmem state - freedreno: get GMEM state from batch - freedreno: add gmem state cache - freedreno: add gmem_lock - freedreno: remove flush-queue - freedreno: allow ctx->batch to be NULL Robert Foss (5): - nir: Build nir_lower_point_size.c in libmesa_nir - android: Add panfrost support to build scripts - android: Fix u_format_table.c being generated twice - panfrost: Prefix schedule_program to prevent collision - android: Fix whitespace issue Rohan Garg (1): - gitlab-ci: Use lavacli from packages Roland Scheidegger (3): - gallium/scons: fix graw_gdi build - util/atomic: Fix p_atomic_add for unlocked and msvc paths - winsys/svga: use new ioctl for logging Roman Stratiienko (2): - Android: Fix build issue without LLVM - panfrost: Fix Android build Ross Zwisler (1): - intel: limit shader geometry on BDW GT1 Sagar Ghuge (1): - intel/compiler: Clear accumulator register before EOT Samuel Iglesias Gonsálvez (1): - main: fix coverity error in \_mesa_program_resource_find_name() Samuel Pitoiset (202): - radv: declare NGG scratch for VS or TES and only on GFX10 - radv: fix compute pipeline keys when optimizations are disabled - docs: document all RADV environment variables - radv: add a note about 
perftest/debug options - radv: fix 32-bit compiler warnings - nir: fix packing of nir_variable - radv/gfx10: enable wave32 for compute based on shader's wavesize - radv: hardcode the number of waves for the GFX6 LS-HS bug - radv: determine shaders wavesize at pipeline level - radv: rely on shader's wavesize when computing NGG info - radv: implement VK_EXT_subgroup_size_control - radv/gfx10: fix primitive indices orientation for NGG GS - ac: handle pointer types to LDS in ac_get_elem_bits() - gitlab-ci: build a specific libdrm version for ARM64 - gitlab-ci: build RADV on ARM64 - ac: fix build with recent LLVM - radv: remove useless RADV_DEBUG=unsafemath debug option - radv: make sure to not clear the ds attachment after resolves - ac: add radeon_info::has_l2_uncached - radv: implement VK_AMD_device_coherent_memory - spirv: fix lowering of OpGroupNonUniformAllEqual - ac: remove useless cast in ac_build_set_inactive() - ac: add 8-bit and 16-bit supports to ac_build_shuffle() - ac: add 8-bit and 16-bit supports to ac_build_readlane() - ac: add 8-bit and 16-bit supports to ac_build_set_inactive() - ac: add 8-bit and 16-bit supports to ac_build_dpp() - ac: add 8-bit and 16-bit supports to ac_build_swizzle() - ac: add 8-bit and 16-bit supports to get_reduction_identity() - ac: add 8-bit and 16-bit supports to ac_build_wwm() - ac: add 8-bit and 16-bit supports to ac_build_optimization_barrier() - ac: add 16-bit float support to ac_build_alu_op() - radv: advertise VK_KHR_shader_subgroup_extended_types on GFX8-GFX9 - radv: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7 - docs: add missing new features for RADV - pipe-loader: check that the pointer to driconf_xml isn't NULL - gitlab-ci: move building piglit into a separate script - gitlab-ci: fix ldd check for Vulkan drivers - gitlab-ci: add a job that only build things needed for testing - gitlab-ci: do not build with debugoptimized for meson-main - gitlab-ci: build swr in meson-main - gitlab-ci: build GLVND in 
meson-clang - gitlab-ci: remove now useless meson-swr-glvnd build job - gitlab-ci: reduce the number of scons build - radv: disable subgroup shuffle operations on GFX10 - ac/llvm: fix the local invocation index for wave32 - meson: only build imgui when needed - radv: set the image view aspect mask during subpass transitions - radv: set the image view aspect mask before resolves - radv: rework creation of decompress/resummarize meta pipelines - radv: create decompress pipelines for separate depth/stencil layouts - radv: select the depth decompress path based on the aspect mask - ac/llvm: fix warning in ac_build_canonicalize() - radv: fix reporting subgroup size with VK_KHR_pipeline_executable_properties - radv: fix enabling sample shading with SampleID/SamplePosition - radv/gfx10: fix implementation of exclusive scans - ac: add 8-bit and 16-bit supports to ac_build_permlane16() - radv: enable VK_KHR_shader_subgroup_extended_types on GFX10 - ac/llvm: convert src operands to pointers if necessary - radv: add more constants to avoid using magic numbers - radv,ac/nir: lower deref operations for shared memory - aco: drop useless lowering of deref operations for shared memory - ac/llvm: fix atomic var operations if source isn't a deref - radv: remove dead shader input/output variables - radv: simplify a check in radv_fixup_vertex_input_fetches() - radv/gfx10: fix the vertex order for triangle strips emitted by a GS - gitlab-ci: rename build-deqp.sh to build-deqp-gl.sh - gitlab-ci: add a gl suffix to the x86 test image and all test jobs - gitlab-ci: add a new job that builds a base test image for VK - gitlab-ci: build cts_runner in the x86 test image for VK - gitlab-ci: build dEQP VK 1.1.6 in the x86 test image for VK - gitlab-ci: add a new base test job for VK - gitlab-ci: allow to run dEQP Vulkan with DEQP_VER - gitlab-ci: configure the Vulkan ICD export with VK_DRIVER - gitlab-ci: build RADV in meson-testing - gitlab-ci: add a job that runs Vulkan CTS with RADV 
conditionally - radv: do not use VK_TRUE/VK_FALSE - radv: move emission of two PA_SC\_\* registers to the pipeline CS - radv: fix possibly wrong PA_SC_AA_CONFIG value for conservative rast - radv: synchronize after performing a separate depth/stencil fast clears - radv: do not init HTILE as compressed state when dst layout allows it - radv: initialize HTILE for separate depth/stencil aspects - radv: implement VK_KHR_separate_depth_stencil_layouts - gitlab-ci: set RADV_DEBUG=checkir for RADV test jobs - ac/nir: fix out-of-bound access when loading constants from global - radv: enable SpvCapabilityImageMSArray - radv: handle unaligned vertex fetches on GFX6/GFX10 - radv/gfx10: fix ngg_get_ordered_id - radv/gfx10: fix the out-of-bounds check for vertex descriptors - ac: declare an enum for the OOB select field on GFX10 - radv: init a default multisample state for the resolve FS path - radv: ignore pMultisampleState if rasterization is disabled - radv: ignore pTessellationState if the pipeline doesn't use tess - radv: ignore pDepthStencilState if rasterization is disabled - radv: tidy up radv_pipeline_init_blend_state() - radv: ignore pColorBlendState if rasterization is disabled - radv: rely on pipeline layout when creating push descriptors with template - radv: return the correct pitch for linear mipmaps on GFX10 - radv: record number of color/depth samples for each subpass - radv: implement VK_AMD_mixed_attachment_samples - ac/surface: use uint16_t for mipmap level pitches - radv: do not fill keys from fragment shader twice - spirv: add SpvCapabilityImageReadWriteLodAMD - spirv,nir: add new lod parameter to image_{load,store} intrinsics - amd/llvm: handle nir_intrinsic_image_deref_{load,store} with lod - aco: handle nir_intrinsic_image_deref_{load,store} with lod - radv: advertise VK_AMD_shader_image_load_store_lod - radv/gfx10: disable vertex grouping - radv/gfx10: determine if a pipeline is eligible for NGG passthrough - radv/gfx10: do not declare LDS for NGG if 
useless - radv/gfx10: add support for NGG passthrough mode - radv/gfx10: improve performance for TES using PrimID but not exporting it - radv: only use VkSamplerCreateInfo::compareOp if enabled - radv/gfx10: enable all CUs if NGG is never used - radv/gfx10: simplify some duplicated NGG GS code - vulkan/overlay: Fix for Vulkan 1.2 - radv: update VK_EXT_descriptor_indexing for Vulkan 1.2 - radv: update VK_EXT_host_query_reset for Vulkan 1.2 - radv: update VK_EXT_sampler_filter_minmax for Vulkan 1.2 - radv: update VK_EXT_scalar_block_layout for Vulkan 1.2 - radv: update VK_KHR_8bit_storage for Vulkan 1.2 - radv: update VK_KHR_buffer_device_address for Vulkan 1.2 - radv: update VK_KHR_create_renderpass2 for Vulkan 1.2 - radv: update VK_KHR_depth_stencil_resolve for Vulkan 1.2 - radv: update VK_KHR_draw_indirect_count for Vulkan 1.2 - radv: update VK_KHR_driver_properties for Vulkan 1.2 - radv: update VK_KHR_image_format_list for Vulkan 1.2 - radv: update VK_KHR_imageless_framebuffer for Vulkan 1.2 - radv: update VK_KHR_shader_atomic_int64 for Vulkan 1.2 - radv: update VK_KHR_shader_float16_int8 for Vulkan 1.2 - radv: update VK_KHR_shader_float_controls for Vulkan 1.2 - radv: update VK_KHR_shader_subgroup_extended_types for Vulkan 1.2 - radv: update VK_KHR_uniform_buffer_standard_layout for Vulkan 1.2 - radv: update VK_KHR_timeline_semaphore for Vulkan 1.2 - radv: implement Vulkan 1.1 features and properties - radv: implement Vulkan 1.2 features and properties - radv: enable Vulkan 1.2 - aco: fix emitting SMEM instructions with no operands on GFX6-GFX7 - aco: do not select 96-bit/128-bit variants for ds_read/ds_write on GFX6 - aco: do not combine additions of DS instructions on GFX6 - aco: implement stream output with vec3 on GFX6 - aco: fix emitting slc for MUBUF instructions on GFX6-GFX7 - aco: print assembly with CLRXdisasm for GFX6-GFX7 if found on the system - aco: fix constant folding of SMRD instructions on GFX6 - aco: do not use the vec3 variant for stores on 
GFX6 - aco: do not use the vec3 variant for loads on GFX6 - aco: add new addr64 bit to MUBUF instructions on GFX6-GFX7 - aco: implement nir_intrinsic_load_barycentric_at_sample on GFX6 - radv: fix double free corruption in radv_alloc_memory() - radv: add explicit external subpass dependencies to meta operations - radv: handle missing implicit subpass dependencies - spirv: add SpvCapabilityFragmentMaskAMD - nir: add two new texture ops for multisample fragment color/mask fetches - spirv: add support for SpvOpFragment{Mask}FetchAMD operations - nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch - ac/nir: add support for nir_texop_fragment_{mask}_fetch - aco: add support for nir_texop_fragment_{mask}_fetch - radv: advertise VK_AMD_shader_fragment_mask - aco: fix printing assembly with CLRXdisasm on GFX6 - aco: fix wrong IR in nir_intrinsic_load_barycentric_at_sample - aco: implement nir_intrinsic_store_global on GFX6 - aco: implement nir_intrinsic_load_global on GFX6 - aco: implement nir_intrinsic_global_atomic\_\* on GFX6 - aco: implement 64-bit nir_op_ftrunc on GFX6 - aco: implement 64-bit nir_op_fceil on GFX6 - aco: implement 64-bit nir_op_fround_even on GFX6 - aco: implement 64-bit nir_op_ffloor on GFX6 - aco: implement nir_op_f2i64/nir_op_f2u64 on GFX6 - ac/llvm: fix missing casts in ac_build_readlane() - aco: combine MRTZ (depth, stencil, sample mask) exports - aco: fix a hardware bug for MRTZ exports on GFX6 - aco: fix a hazard with v_interp\_\* and v_{read,readfirst}lane\_\* on GFX6 - aco: copy the literal offset of SMEM instructions to a temporary - radv: enable ACO support for GFX6 - radv: print NIR shaders after lowering FS inputs/outputs - radv: do not allow sparse resources with multi-planar formats - radv: enable VK_AMD_shader_fragment_mask on GFX6-GFX7 - compiler: add a new explicit interpolation mode - spirv: add support for SpvDecorationExplicitInterpAMD - compiler: add PERSP to the existing barycentric system values - compiler: add 
new SYSTEM_VALUE_BARYCENTRIC\_\* - spirv: add support for SpvBuiltInBaryCoord\* - nir: add nir_intrinsic_load_barycentric_model - nir: lower SYSTEM_VALUE_BARYCENTRIC\_\* to nir_load_barycentric() - nir: add nir_intrinsic_interp_deref_at_vertex - nir: lower interp_deref_at_vertex to load_input_vertex - spirv: implement SPV_AMD_shader_explicit_vertex_parameter - ac/llvm: implement VK_AMD_shader_explicit_vertex_parameter - aco: implement VK_AMD_shader_explicit_vertex_parameter - radv: gather which input PS variables use an explicit interpolation mode - radv: implement VK_AMD_shader_explicit_vertex_parameter - radv: bump conformance version to 1.2.0.0 - radv: remove the non conformant VK implementation warning on GFX10 - aco: fix VS input loads with MUBUF on GFX6 - radv/gfx10: add a separate flag for creating a GDS OA buffer - radv/gfx10: implement NGG GS queries - radv/gfx10: re-enable NGG GS - radv: refactor physical device properties - aco: fix MUBUF VS input loads when expanding vec3 to vec4 on GFX6 - aco: do not use ds_{read,write}2 on GFX6 - aco: fix waiting for scalar stores before "writing back" data on GFX8-GFX9 - aco: fix creating v_madak if v_mad_f32 has two sgpr literals - nir: do not use De Morgan's Law rules for flt and fge Samuel Thibault (3): - loader: #define PATH_MAX when undefined (eg. 
Hurd) - util: Do not fail to build on unknown pthread_setname_np - meson: Do not require libdrm for DRI2 on hurd Satyajit Sahu (1): - radeon/vcn: Handle crop parameters for encoder Sonny Jiang (1): - radeonsi: use compute shader for clear 12-byte buffer Stephan Gerhold (1): - kmsro: Add "mcde" entry point Tapani Pälli (33): - nir: fix couple of compile warnings - util/android: fix android build errors - Revert "egl: implement new functions from EGL_EXT_image_flush_external" - Revert "egl: handle EGL_IMAGE_EXTERNAL_FLUSH_EXT" - Revert "st/dri: add support for EGL_EXT_image_flush_external" - Revert "st/dri: assume external consumers of back buffers can write to the buffers" - Revert "dri_interface: add interface for EGL_EXT_image_flush_external" - mesa: allow bit queries for EXT_disjoint_timer_query - Revert "mesa: allow bit queries for EXT_disjoint_timer_query" - mesa: allow bit queries for EXT_disjoint_timer_query - gitlab-ci: update Piglit commit, update skips - mapi: add GetInteger64vEXT with EXT_disjoint_timer_query - glsl: handle max uniform limits with lower_const_arrays_to_uniforms - gitlab-ci: bump piglit checkout commit - glsl: additional interface redeclaration check for SSO programs - intel/compiler: add newline to limit_dispatch_width message - intel/compiler: force simd8 when dual src blending on gen8 - dri: add \__DRI_IMAGE_FORMAT_SXRGB8 - i965: expose MESA_FORMAT_B8G8R8X8_SRGB visual - mesa/st/i965: add a ProgramResourceHash for quicker resource lookup - mesa: create program resource hash in a single place - iris: set depth stall enabled when depth flush enabled on gen12 - anv: set depth stall enabled when depth flush enabled on gen12 - isl/gen12: add reminder comment about missing WA with 3D surfaces - anv: fix assert in GetImageDrmFormatModifierPropertiesEXT - anv: add assert for isl_mod_info in choose_isl_tiling_flags - anv: initialize clear_color_is_zero_one - egl/android: fix buffer_count for applications setting max count - anv/android: setup 
gralloc1 usage from gralloc0 usage manually - anv/android: make format_supported_with_usage static - intel/vec4: fix valgrind errors with vf_values array - glsl: fix a memory leak with resource_set - iris: fix aux buf map failure in 32bits app on Android Thomas Hellstrom (4): - winsys/svga: Enable transhuge pages for buffer objects - svga: Avoid discard DMA uploads - gallium/util: Increase the debug_flush map depth - svga: Fix banded DMA upload Thong Thai (8): - st/va: Convert interlaced NV12 to progressive - util/format: Add the P010 format used for 10-bit videos - gallium: Add PIPE_FORMAT_P010 support - st/va: Add support for P010, used for 10-bit videos - radeon: Use P010 for decoding of 10-bit videos - r600: Remove HEVC related code since HEVC is not supported - mesa: Prevent \_MaxLevel from being less than zero - Revert "st/va: Convert interlaced NV12 to progressive" Timothy Arceri (66): - glsl: just use NIR to lower outputs when driver can't read outputs - glsl: disable lower_fragdata_array() for NIR drivers - mesa: add ARB_shading_language_include stubs - glsl: add infrastructure for ARB_shading_language_include - mesa: add ARB_shading_language_include infrastructure to gl_shared_state - mesa: add helper to validate tokenise shader include path - mesa: add \_mesa_lookup_shader_include() helper - mesa: add copy_string() helper - mesa: add glNamedStringARB() support - mesa: implement glGetNamedStringARB() - mesa: make error checking optional in \_mesa_lookup_shader_include() - mesa: implement glIsNamedStringARB() - mesa: implement glGetNamedStringivARB() - mesa: split \_mesa_lookup_shader_include() in two - mesa: implement glDeleteNamedStringARB() - glsl: add ARB_shading_language_include support to #line - glsl: pass gl_context to glcpp_parser_create() - glsl: add preprocessor #include support - glsl: error if #include used while extension is disabled - glsl: add can_skip_compile() helper - glsl: delay compilation skip if shader contains an include - mesa: add 
support cursor support for relative path shader includes - mesa: add shader include lookup support for relative paths - mesa: implement glCompileShaderIncludeARB() - mesa: enable ARB_shading_language_include - gitlab-ci: bump piglit checkout commit - gitlab-ci: update for arb_shading_language_include - compiler: move build definition of pp_standalone_scaffolding.c - radv: add some infrastructure for fresh forks for each secure compile - radv: add a secure_compile_open_fifo_fds() helper - radv: create a fresh fork for each pipeline compile - docs: update source code repository documentation - glsl: move calculate_array_size_and_stride() to link_uniforms.cpp - glsl: don't set uniform block as used when its not - glsl: make use of active_shader_mask when building resource list - glsl/nir: iterate the system values list when adding varyings - docs: remove mailing list as way of submitting patches - glsl: move nir_remap_dual_slot_attributes() call out of glsl_to_nir() - glsl: copy the how_declared field when converting to nir - nir: add some fields to nir_variable_data - glsl: copy the new data fields when converting to nir - glsl: add support for named varyings in nir_build_program_resource_list() - glsl: add subroutine support to nir_build_program_resource_list() - st/glsl_to_nir: call gl_nir_lower_buffers() a little later - st/glsl_to_nir: use nir based program resource list builder - st/glsl_to_nir: fix SSO validation regression - glsl: rename gl_nir_link() to gl_nir_link_spirv() - glsl: add gl_nir_link_check_atomic_counter_resources() - glsl: add new gl_nir_link_glsl() helper - glsl: reorder link_and_validate_uniforms() calls - mesa: add new UseNIRGLSLLinker constant - glsl: use nir linker to link atomics - glsl: add check_image_resources() for the nir linker - glsl: use nir version of check_image_resources() for nir linker - glsl: move check_subroutine_resources() into the shared util code - glsl: call check_subroutine_resources() from the nir linker - glsl: move 
uniform resource checks into the common linker code - glsl: call uniform resource checks from the nir linker - glsl: move calculate_subroutine_compat() to shared linker code - glsl: call calculate_subroutine_compat() from the nir linker - glsl: fix potential bug in nir uniform linker - glsl: remove bogus assert in nir uniform linking - glsl: fix check for matrices in blocks when using nir uniform linker - glsl: count uniform components and storage better in nir linking - glsl_to_nir: update interface type properly - glsl: fix gl_nir_set_uniform_initializers() for image arrays Timur Kristóf (39): - ac: Handle invalid GFX10 format correctly in ac_get_tbuffer_format. - aco: Make sure not to mistakenly propagate 64-bit constants. - aco: Treat all booleans as per-lane. - aco: Optimize out trivial code from uniform bools. - aco: Fix operand of s_bcnt1_i32_b64 in emit_boolean_reduce. - aco: Remove superfluous argument from emit_boolean_logic. - aco: Remove lower_linear_bool_phi, it is not needed anymore. - aco: Optimize load_subgroup_id to one bit field extract instruction. - aco/wave32: Change uniform bool optimization to work with wave32. - aco/wave32: Replace hardcoded numbers in spiller with wave size. - aco/wave32: Introduce emit_mbcnt which takes wave size into account. - aco/wave32: Add wave size specific opcodes to aco_builder. - aco/wave32: Use lane mask regclass for exec/vcc. - aco/wave32: Fix load_local_invocation_index to support wave32. - aco/wave32: Use wave_size for barrier intrinsic. - aco/wave32: Allow setting the subgroup ballot size to 64-bit. - aco/wave32: Fix reductions. - aco: Fix uniform i2i64. - ac/llvm: Fix ac_build_reduce in wave32 mode. - aco/wave32: Set the definitions of v_cmp instructions to the lane mask. - aco: Implement 64-bit constant propagation. - aco: Allow optimizing vote_all and nir_op_iand. - aco: Don't skip combine_instruction when definitions[1] is used. - aco: Optimize out s_and with exec, when used on uniform bitwise values. 
- aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible. - nouveau/nvc0: add extern keyword to nvc0_miptree_vtbl. - intel/compiler: Fix array bounds warning on GCC 10. - radeon: Move si_get_pic_param to radeon_vce.c - r600: Move get_pic_param to radeon_vce.c - gallium: Fix a couple of multiple definition warnings. - radeon: Fix multiple definition error with radeon_debug - aco: Fix -Wstringop-overflow warnings in aco_span. - aco: Fix maybe-uninitialized warnings. - aco: Fix signedness compare warning. - aco: Make a better guess at which instructions need the VCC hint. - aco: Transform uniform bitwise instructions to 32-bit if possible. - aco/gfx10: Fix VcmpxExecWARHazard mitigation. - aco: Fix the meaning of is_atomic. - aco/optimizer: Don't combine uniform bool s_and to s_andn2. Tomasz Pyra (1): - gallium/swr: Fix arb_transform_feedback2 Tomeu Vizoso (38): - gitlab-ci: Disable lima jobs - gitlab-ci: Run only LAVA jobs in special-named branches - panfrost: Add checksum fields to SFBD descriptor - panfrost: Set 0x10 bit on mali_shader_meta.unknown2_4 on T720 - panfrost: Rework format encoding on SFBD - panfrost: Take into account texture layers in SFBD - panfrost: Decode blend shaders for SFBD - panfrost: Generate polygon list manually for SFBD - panfrost: Print the right zero field - panfrost: Pipe the GPU ID into compiler and disassembler - panfrost: Set depth and stencil for SFBD based on the format - panfrost: Multiply offset_units by 2 - panfrost: Make sure the shader descriptor is in sync with the GL state - gitlab-ci: Remove limit on kernel logging - panfrost: Just print tiler fields as-is for Tx20 - panfrost: Rework buffers in SFBD - gitlab-ci: Fix dir name for VK-GL-CTS sources - panfrost: Don't print the midgard_blend_rt structs on SFBD - panfrost: Add quirks system to cmdstream - panfrost: Simplify shader patching - panfrost: White list the Mali T720 - gitlab-ci: Test Panfrost on T720 GPUs - panfrost: Add PAN_MESA_DEBUG=sync - panfrost: 
Hold a reference to sampler views - pan/midgard: Remove undefined behavior - nir: Don't copy empty array - util: Don't access members of NULL pointers - panfrost: Don't lose bits! - st/mesa: Don't access members of NULL pointers - panfrost: Handle Z24_UNORM_S8_UINT as MALI_Z32_UNORM - panfrost: Increase PIPE_SHADER_CAP_MAX_OUTPUTS to 16 - panfrost: Dynamically allocate array of texture pointers - panfrost: Map with size of first layer for 3D textures - panfrost: Store internal format - gitlab-ci: Update kernel for LAVA to 5.5-rc1 plus fixes - gitlab-ci: Switch LAVA jobs to use shared dEQP runner - gitlab-ci: Upgrade kernel for LAVA jobs to v5.5-rc5 - gitlab-ci: Consolidate container and build stages for LAVA Urja Rannikko (4): - panfrost: free last_read/write tables in mir_create_dependency_graph - panfrost: free allocations in schedule_block - panfrost: add lcra_free() to free lcra state - panfrost: free spill cost table in mir_spill_register Vasily Khoruzhick (31): - lima: add debug prints for BO cache - lima: align size before trying to fetch BO from cache - lima: ignore flags while looking for BO in cache - lima: set dithering flag when necessary - lima: add support for gl_PointSize - lima: enable tiling - lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle() - lima: expose tiled format modifier in query_dmabuf_modifiers() - lima: use single BO for GP outputs - lima: drop suballocator - lima: fix allocation of GP outputs storage for indexed draw - lima: postpone PP stream generation - lima: don't reload and redraw tiles that were not updated - lima: fix PP stream terminator size - lima: use linear layout for shared buffers if modifier is not specified - lima: add debug flag to disable tiling - lima: drop support for R8G8B8 format - lima: fix PLBU_CMD_PRIMITIVE_SETUP command - lima: fix viewport clipping - lima: implement polygon offset - lima: fix PIPE_CAP\_\* to mark features that aren't supported yet - lima: add new findings to texture descriptor - 
lima: fix handling of reverse depth range - ci: lava: pass CI_NODE_INDEX and CI_NODE_TOTAL to lava jobs - ci: Re-enable CI for lima on mali450 - lima: implement invalidate_resource() - nir: don't emit ishl in \_nir_mul_imm() if backend doesn't support bitops - lima: use imul for calculations with intrinsic src - lima: ppir: don't delete root ld_tex nodes without successors in current block - lima: ppir: always create move and update ld_tex successors for all blocks - lima: disable early-z if fragment shader uses discard Vinson Lee (9): - swr: Fix build with llvm-10.0. - panfrost: Fix gnu-empty-initializer build errors. - scons: Bump C standard to gnu11 on macOS 10.15. - util/u_thread: Restrict u_thread_get_time_nano on macOS. - swr: Fix build with llvm-10.0. - swr: Fix build with llvm-10.0. - lima: Fix build with GCC 10. - swr: Fix GCC 4.9 checks. - panfrost: Remove unused anonymous enum variables. Wladimir J. van der Laan (2): - u_vbuf: add logic to use a limited number of vbufs - u_vbuf: use single vertex buffer if it's not possible to have multiple X512 (1): - util/u_thread: Fix build under Haiku Yevhenii Kolesnikov (5): - glsl: Enable textureSize for samplerExternalOES - meson: Fix linkage of libgallium_nine with libgalliumvl - meta: Cleanup function for DrawTex - main: allow external textures for BindImageTexture - meta: Add cleanup function for Bitmap Zebediah Figura (1): - Revert "draw: revert using correct order for prim decomposition." luc (1): - zink: confused compilation macro usage for zink in target helpers.