Commit graph

5401 commits

Author SHA1 Message Date
ameerj 0639244d85 renderer_opengl: Swizzle BGR textures on copy
OpenGL does not natively support BGR internal formats, which causes many BGR textures to render incorrectly, with Red and Blue channels swapped.

This commit aims to address this by swizzling the blue and red channels on texture copies when a BGR format is encountered.
2021-03-04 14:14:19 -05:00
bunnei b8b5891585
Merge pull request #5989 from ReinUsesLisp/cmdpool
vk_command_pool: Reduce the command pool size from 4096 to 4
2021-03-04 11:07:31 -08:00
bunnei 50ee9c46ab video_core: rasterizer_accelerated: Fix delta check ordering. 2021-03-02 17:48:02 -08:00
bunnei 6ab839462c video_core: rasterizer_accelerated: Improve error handling & fix implicit conversion. 2021-03-02 17:44:02 -08:00
bunnei 94da1e8a7e video_core: rasterizer_accelerated: Use a flat array instead of interval_map for cached pages.
- Uses a fixed 64MB for the cache instead of an ever growing map.
- Slightly faster by using atomics instead of a single mutex for access.
- Thanks for Rodrigo for the idea.
2021-03-02 16:57:53 -08:00
ReinUsesLisp 5ad62e7bfc buffer_cache: Heuristically decide to skip cache on uniform buffers
Some games benefit from skipping caches (Pokémon Sword), and others
don't (Animal Crossing: New Horizons). Add an heuristic to decide this
at runtime.

The cache hit ratio has to be ~98% or better to not skip the cache.
There are 16 frames of buffer.
2021-03-02 02:44:19 -03:00
ameerj 52e9d7fa49 gpu_thread: Remove Async NVDEC placeholders
This commit removes early placeholders for an implementation of async nvdec. With recent changes to the source code, the placeholders are no longer accurate, and can cause a nullptr dereference due to the nature of the cdma_pusher lifetime.
2021-02-28 22:03:00 -05:00
bunnei 55f556c53e
Merge pull request #5984 from jbeich/gcc-freebsd
common,video-core: unbreak GCC 11 build on FreeBSD 13
2021-02-27 14:15:00 -07:00
bunnei 09f7c355c6
Merge pull request #5953 from bunnei/memory-refactor-1
Kernel Rework: Memory updates and refactoring (Part 1)
2021-02-27 12:48:35 -07:00
Kelebek1 d31dbb1bc1 Implement glDepthRangeIndexeddNV 2021-02-24 22:26:53 +00:00
ReinUsesLisp aae399c1a8 vk_command_pool: Reduce the command pool size from 4096 to 4
This allows drivers to reuse memory more easily and preallocate less.
The optimal number has been measured booting Pokémon Sword.
2021-02-23 19:08:24 -03:00
Jan Beich 1841ca4b9b video_core: add missing header after 468bd9c1b0
src/video_core/shader_notify.cpp: In member function 'void VideoCore::ShaderNotify::MarkShaderComplete()':
src/video_core/shader_notify.cpp:33:10: error: 'unique_lock' is not a member of 'std'
   33 |     std::unique_lock lock{mutex};
      |          ^~~~~~~~~~~
src/video_core/shader_notify.cpp:6:1: note: 'std::unique_lock' is defined in header '<mutex>'; did you forget to '#include <mutex>'?
    5 | #include "video_core/shader_notify.h"
  +++ |+#include <mutex>
    6 |
src/video_core/shader_notify.cpp: In member function 'void VideoCore::ShaderNotify::MarkSharderBuilding()':
src/video_core/shader_notify.cpp:38:10: error: 'unique_lock' is not a member of 'std'
   38 |     std::unique_lock lock{mutex};
      |          ^~~~~~~~~~~
src/video_core/shader_notify.cpp:38:10: note: 'std::unique_lock' is defined in header '<mutex>'; did you forget to '#include <mutex>'?
2021-02-23 00:04:36 +00:00
bunnei 20245e660f
Merge pull request #5936 from Kelebek1/Offsets
Offsets for TexelFetch and TextureGather in Vulkan
2021-02-21 21:23:45 -07:00
Morph 1a5d4d7840 gl_disk_shader_cache: Log total shader entries count on game load 2021-02-20 11:08:19 -05:00
bunnei 728ee181eb
Merge pull request #5924 from ReinUsesLisp/inline-bindings
vk_update_descriptor: Inline and improve code for binding buffers
2021-02-19 12:27:10 -08:00
bunnei 93e20867b0 hle: kernel: Migrate PageHeap/PageTable to KPageHeap/KPageTable. 2021-02-18 16:16:25 -08:00
bunnei 9cae3e6e90
Merge pull request #4973 from ameerj/nvdec-opt
nvdec: Reuse allocated buffers and general cleanup
2021-02-18 15:12:07 -08:00
ReinUsesLisp 24d0cc3ab8 vk_rasterizer: Fix loading shader addresses twice
This was recently introduced on a wrongly rebased commit.
2021-02-15 21:34:13 -03:00
bunnei cffa6f4e62
Merge pull request #5923 from ReinUsesLisp/vk-dirty-pipeline
fixed_pipeline_cache: Use dirty flags to lazily update key
2021-02-15 13:17:27 -08:00
Kelebek1 9d8f793969 Review 1 2021-02-15 05:26:28 +00:00
Kelebek1 fb54c38631 Implement texture offset support for TexelFetch and TextureGather and add offsets for Tlds
Formatting
2021-02-15 00:36:37 +00:00
bunnei eae9f2e440 yuzu: Various frontend improvements to avoid crashes and improve experience on Linux. 2021-02-14 00:20:41 -08:00
ReinUsesLisp b8ffdbb167 vk_resource_pool: Load GPU tick once and compare with it
Other minor style improvements. Rename free_iterator to hint_iterator,
to describe better what it does.
2021-02-13 17:53:58 -03:00
ReinUsesLisp 21b40de318 vk_update_descriptor: Inline and improve code for binding buffers
Allow compilers with our settings inline hot code.
2021-02-13 17:46:24 -03:00
ReinUsesLisp 70353649d7 fixed_pipeline_cache: Use dirty flags to lazily update key
Use dirty flags to avoid building pipeline key from scratch on each draw
call. This saves a bit of unnecesary work on each draw call.
2021-02-13 17:44:47 -03:00
ameerj c7325c6a4c gl_texture_cache: Lazily create non-sRGB texture views for sRGB formats
This creates non-sRGB texture views for sRGB texture formats to allow for interfacing with these views in compute shaders using imageLoad and imageStore.

Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-02-13 13:27:50 -05:00
ameerj b675c44e49 rebase, fix name shadowing, more const 2021-02-13 13:07:56 -05:00
ameerj 3c37d66c28 Address PR feedback
Co-Authored-By: LC <712067+lioncash@users.noreply.github.com>
2021-02-13 13:07:56 -05:00
ameerj 09722cb4a7 streamline cdma_pusher/command_classes 2021-02-13 13:07:56 -05:00
ameerj 77564f987c streamline cdma_pusher/command_classes 2021-02-13 13:07:53 -05:00
ameerj ac265a72ce nvdec cleanup 2021-02-13 13:07:31 -05:00
Morph 83227ad981
Merge pull request #5919 from ReinUsesLisp/stream-buffer-tragic
gl_stream_buffer/vk_staging_buffer_pool: Fix size check
2021-02-13 21:25:45 +08:00
ReinUsesLisp dd9caf9aa0 vk_master_semaphore: Mark gpu_tick atomic operations with relaxed order 2021-02-13 05:57:28 -03:00
ReinUsesLisp 6171566296 vk_staging_buffer_pool: Inline tick tests
Load the current tick to a local variable, moving it out of an atomic
and allowing us to compare the value without going through a pointer
each time. This should make the loop more optimizable.
2021-02-13 05:14:11 -03:00
ReinUsesLisp 682d82faf3 gl_stream_buffer/vk_staging_buffer_pool: Fix size check
Fix a tragic off-by-one condition that causes Vulkan's stream buffer to
think it's always full, using fallback memory. The OpenGL was also
affected by this bug to a lesser extent.
2021-02-13 05:11:48 -03:00
LC 6f1ad6aa9f
Merge pull request #5916 from ameerj/maxwell-gl-unused
maxwell_to_gl: Remove unused code
2021-02-13 02:55:59 -05:00
ReinUsesLisp 757fd1e917 vulkan_device: Require VK_EXT_robustness2
We are already using robustness2 features without requiring it
explicitly, causing potential crashes on drivers without the extension.
Requiring this at boot allows better diagnostics for it and formalizes
our usage on the extension.
2021-02-13 03:31:50 -03:00
ReinUsesLisp 5b35b01070 video_core: Fix clang build issues 2021-02-13 02:26:47 -03:00
ReinUsesLisp 025fe458ae vk_staging_buffer_pool: Fix softlock when stream buffer overflows
There was still a code path that could wait on a timeline semaphore tick
that would never be signalled.

While we are at it, make use of more STL algorithms.
2021-02-13 02:18:38 -03:00
ReinUsesLisp 3a2eefb16c vk_buffer_cache: Add support for null index buffers
Games can bind a null index buffer (size=0) where all indices are
evaluated as zero. VK_EXT_robustness2 doesn't support this and all
drivers segfault when a null index buffer is passed to
vkCmdBindIndexBuffer.

Workaround this by creating a 4 byte buffer and filling it with zeroes.
If it's read out of bounds, robustness takes care of returning zeroes as
indices.
2021-02-13 02:18:38 -03:00
ReinUsesLisp 0b8b961442 buffer_cache: Add extra bytes to guest SSBOs
Bind extra bytes beyond the guest API's bound range.
This is due to some games like Astral Chain operating out of bounds.
Binding the whole map range would be technically correct, but games
have large maps that make this approach unaffordable for now.
2021-02-13 02:18:38 -03:00
ReinUsesLisp 93a69b6cc8 Merge branch 'bytes-to-map-end' into new-bufcache-wip 2021-02-13 02:18:35 -03:00
ReinUsesLisp 7402442442 vk_staging_buffer_pool: Get a staging buffer instead of waiting
Avoids waiting idle while the GPU finishes to do work, and fixes an
issue where we'd wait forever if a single command buffer (logic tick)
all the data.
2021-02-13 02:18:05 -03:00
ReinUsesLisp 0b631f22fc renderer_opengl: Remove interop
Remove unused interop code from the OpenGL backend.
2021-02-13 02:18:04 -03:00
ReinUsesLisp 3da87d3f12 gl_buffer_cache: Drop interop based parameter buffer workarounds
Sacrify runtime performance to avoid generating kernel exceptions on
Windows due to our abusive aliasing of interop buffer objects.
2021-02-13 02:17:24 -03:00
ReinUsesLisp 2b95c137ff buffer_cache: Heuristically detect stream buffers
Detect when a memory region has been joined several times and increase
the size of the created buffer on those instances. The buffer is assumed
to be a "stream buffer", increasing its size should stop us from
constantly recreating it and fragmenting memory.
2021-02-13 02:17:24 -03:00
ReinUsesLisp ec9354d6d9 buffer_cache: Split CreateBuffer in separate functions
Allow adding functionality to each function without making CreateBuffer
more complex.
2021-02-13 02:17:24 -03:00
ReinUsesLisp a02b4e1df6 buffer_cache: Skip cache on small uploads on Vulkan
Ports from OpenGL the optimization to skip small 3D uniform buffer
uploads. This will take advantage of the previously introduced stream
buffer.

Fixes instances where the staging buffer offset was being ignored.
2021-02-13 02:17:24 -03:00
ReinUsesLisp 35df1d1864 vk_staging_buffer_pool: Add stream buffer for small uploads
This uses a ring buffer similar to OpenGL's stream buffer for small
uploads. This stops us from allocating several small buffers, reducing
memory fragmentation and cache locality.

It uses dedicated allocations when possible.
2021-02-13 02:17:24 -03:00
ReinUsesLisp 8fd518ec40 vulkan_device: Enable robustBufferAccess
Fix regression on Pascal on Animal Crossing: New Horizons, fixing a
validation error.
2021-02-13 02:17:23 -03:00
ReinUsesLisp 82c2601555 video_core: Reimplement the buffer cache
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.

- Bindings are cached, allowing to skip work when the game changes few
  bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
  from the GPU to emulate constant buffers, instead GL_EXT_memory_object
  is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
  glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
  theory this should save one hash table resolve inside the driver
  compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
  that are not Nvidia's proprietary, due to their low performance on
  partial glBufferSubData calls synchronized with 3D rendering (that
  some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
  cache more bindings than before, skipping unnecesarry work.

This commit adds the necessary infrastructure to use Vulkan object from
OpenGL. Overall, it improves performance and fixes some bugs present on
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors, this are planned to be fixed in later
commits.
2021-02-13 02:17:22 -03:00
ReinUsesLisp a39d9c5194 vulkan_common: Expose interop and headless devices 2021-02-13 02:16:21 -03:00
ReinUsesLisp 47d5ec6cfc vulkan_common: Make interop extensions mandatory 2021-02-13 02:16:21 -03:00
ReinUsesLisp 40ed0cb920 vulkan_device: Enable robust buffers 2021-02-13 02:16:21 -03:00
ReinUsesLisp 1a987054c5 vulkan_device: Use designated initializers for features 2021-02-13 02:16:21 -03:00
ReinUsesLisp 79afdeaf08 vulkan_wrapper: Add memory barrier pipeline barrier helper 2021-02-13 02:16:21 -03:00
ReinUsesLisp 004a8d6a7a vulkan_device: Fix formatting of constants 2021-02-13 02:16:21 -03:00
ReinUsesLisp 16f97ded21 vulkan_wrapper: Add interop functions 2021-02-13 02:16:21 -03:00
ReinUsesLisp 9735c34f5d vulkan_instance: Initialize Vulkan instance in a separate thread
Workaround an issue on Nvidia where creating a Vulkan instance from an
active OpenGL thread disables threaded optimization on the driver.
This optimization is important to have good performance on Nvidia
OpenGL.
2021-02-13 02:16:21 -03:00
ReinUsesLisp dde19e7d75 vulkan_wrapper: Pull Windows symbols 2021-02-13 02:16:21 -03:00
ReinUsesLisp 75ccd9959c gpu: Report renderer errors with exceptions
Instead of using a two step initialization to report errors, initialize
the GPU renderer and rasterizer on the constructor and report errors
through std::runtime_error.
2021-02-13 02:16:19 -03:00
ReinUsesLisp 9d8ca6cc4a buffer_base: Add support for cached CPU writes
Some games usually write memory pages currently used by the GPU, causing
rendering issues (e.g. flashing geometry and shadows on Link's
Awakening). To workaround this issue, Guest CPU writes are delayed until
the command buffer finishes processing, but the pages are updated
immediately.

The overall behavior is:
- CPU writes are cached until they are flushed, they update the page
  state, but don't change the modification state. Cached writes stop
  pages from being flushed, in case games have meaningful data in it.
- Command processing writes (e.g. push constants) update the page state
  and are marked to the command processor as dirty. They don't remove
  the state of cached writes.
2021-02-13 02:15:29 -03:00
ameerj 069afcc633 maxwell_to_gl: Remove unused code
Removes unused declarations in maxwell_to_gl.h
2021-02-12 23:01:09 -05:00
bunnei 245d60bfff
Merge pull request #5900 from lioncash/unused-func
video_core: Remove unused functions and variables
2021-02-09 15:29:10 -08:00
Lioncash 10636d2494 gl_rasterizer: Remove unused variables
Resolves warnings on clang 12
2021-02-09 17:31:37 -05:00
Lioncash 783dc9e112 texture_cache/util: Remove unused functions
Silences a few warnings on clang 12.
2021-02-09 17:30:20 -05:00
Ameer J 26669d9e13
Merge pull request #5880 from lat9nq/ffmpeg-external
cmake: FFmpeg linking rework
2021-02-08 21:13:10 -05:00
Rodrigo Locatti 4c82c08897
Merge pull request #5888 from Morph1984/ogl-4.6
renderer_opengl: Update OpenGL backend version requirement to 4.6
2021-02-07 21:44:49 -03:00
Chloe Marcec c5f109bc50 video_core: Delete morton
moron.h & morton.cpp are not used anywhere and are just empty files
2021-02-08 10:20:21 +11:00
Morph 6e5cc977ad renderer_opengl: Update OpenGL backend version requirement to 4.6 2021-02-07 16:32:35 -05:00
lat9nq b7e6eca8b2 Address reviewer comments 2021-02-05 16:46:03 -05:00
lat9nq 1d19eac415 CMake: Port citra-emu/citra FindFFmpeg.cmake
Also renames related CMake variables to match both the Find*FFmpeg* and
variables defined within the file. Fixes odd errors produced by the old
FindFFmpeg.

Citra's FindFFmpeg is slightly modified here: adds Citra's copyright at
the beginning, renames FFmpeg_INCLUDES to FFmpeg_INCLUDE_DIR, disables a
few components in _FFmpeg_ALL_COMPONENTS, and adds the missing avutil
component to the comment above.
2021-02-05 15:39:19 -05:00
lat9nq 47401016bf CMake: Implement YUZU_USE_BUNDLED_FFMPEG
For Linux, instructs CMake to use the FFmpeg submodule in externals.
This is HEAVILY based on our usage of the late Unicorn.  Minimal change
to MSVC as it uses the yuzu-emu/ext-windows-bin. MinGW now targets the
same ext-windows-bin libraries as MSVC for FFmpeg. Adds FFMPEG_LIBRARIES
to WIN32 and simplifies video_core/CMakeLists.txt a bit.
2021-02-05 14:49:51 -05:00
lat9nq fc43eac82a video_core: host_shaders: Don't pass --quiet to glslangValidator if unavailable
Prevents CMake from calling `glslangValidator` with `--quiet` when it is
not available, i.e. on older downstream versions from Ubuntu.
2021-02-01 23:39:54 -05:00
bunnei 5861bacafd
Merge pull request #5795 from ReinUsesLisp/bytes-to-map-end
video_core/memory_manager: Add BytesToMapEnd
2021-01-29 22:56:29 -08:00
LC 16818e952c
Merge pull request #5836 from ReinUsesLisp/unaligned-constr-sched
vk_scheduler: Fix unaligned placement new expressions
2021-01-28 10:53:15 -05:00
ReinUsesLisp 9e88ad8da9 vk_scheduler: Fix unaligned placement new expressions
We were accidentaly creating an object in an unaligned memory address.
Fix this by manually aligning the offset.
2021-01-27 22:28:22 -03:00
bunnei 45b13c3037
Merge pull request #5786 from ReinUsesLisp/glsl-cbuf
gl_shader_decompiler: Fix constant buffer size calculation
2021-01-27 15:27:53 -08:00
Rodrigo Locatti ef6cc3aa1d
vulkan_device: Blacklist Intel from float16 math (#5798)
Astral Chain crashes Intel's SPIR-V compiler when using fp16.
Disable this while the vendor works on a fix.
2021-01-27 13:31:32 -08:00
bunnei 28b822fe38
Merge pull request #5778 from ReinUsesLisp/shader-dir
renderer_opengl: Avoid precompiled cache and force NV GL cache directory
2021-01-27 11:34:21 -08:00
bunnei 62766b1326
Merge pull request #5785 from ReinUsesLisp/buffer-dma
video_core/memory_manager: Flush destination buffer on CopyBlock
2021-01-24 22:57:00 -08:00
ReinUsesLisp 34c3ec2f8c Revert "Start of Integer flags implementation"
This reverts #4713. The implementation in that PR is not accurate.
It does not reflect the behavior seen in hardware.
2021-01-25 02:48:03 -03:00
ReinUsesLisp 9dc4a80b17 vk_graphics_pipeline: Fix narrowing conversion on MSVC 2021-01-24 21:41:29 -03:00
LC df0d8c45d2
Merge pull request #5807 from ReinUsesLisp/vc-warnings
video_core: Silence the remaining gcc warnings and enforce them
2021-01-24 17:36:43 -05:00
Rodrigo Locatti b769b1be26
Merge pull request #5363 from ReinUsesLisp/vk-image-usage
vk_texture_cache: Support image store on sRGB images with VkImageViewUsageCreateInfo
2021-01-24 18:44:51 -03:00
ReinUsesLisp 6b00443bc1 vk_texture_cache: Support image store on sRGB images with VkImageViewUsageCreateInfo
Vulkan 1.0 didn't support creating sRGB image views on an ABGR8 VkImage
with storage usage bits. VK_KHR_maintenance2 addressed this allowing to
reduce the usage bits on a VkImageView.

To allow image store on non-sRGB image views when the VkImage is created
with sRGB, always create VkImages without sRGB and add the sRGB format
on the view.
2021-01-24 18:16:43 -03:00
ReinUsesLisp 6a0143400f vulkan_device: Lift VK_EXT_extended_dynamic_state blacklist on RDNA
It seems to be safe to use this on new drivers.
2021-01-24 20:21:11 -03:00
ReinUsesLisp 748551dafb cmake: Enforce -Warray-bounds and -Wmissing-field-initializers globally 2021-01-24 17:31:29 -03:00
bunnei 19c14589d3
Merge pull request #5796 from ReinUsesLisp/vertex-a-bypass-vk
vk_pipeline_cache: Properly bypass VertexA shaders
2021-01-24 11:22:58 -08:00
ReinUsesLisp f81c783b5b host_shaders/cmake: Pass --quiet to glslang to keep it quiet
Silences noisy builds on toolchains.
2021-01-24 04:55:23 -03:00
ReinUsesLisp cc4335a9c6 video_core/cmake: Enforce -Warray-bounds and -Wmissing-field-initializers 2021-01-24 04:42:41 -03:00
ReinUsesLisp 1b76e7e890 video_core: Silence -Wmissing-field-initializers warnings 2021-01-24 04:32:19 -03:00
ReinUsesLisp 80a673a27f maxwell_3d: Silence array bounds warnings 2021-01-24 04:31:41 -03:00
ReinUsesLisp ad48259d7e maxwell_to_vk: Silence -Wextra warnings about using different enum types 2021-01-24 04:03:36 -03:00
Levi Behunin 9477d23d70
shader_ir: Fix comment typo 2021-01-23 13:16:37 -05:00
ReinUsesLisp 966896daad video_core/cmake: Properly generate fatal errors on Aftermath
Fix "message(ERROR ..." to "message(FATAL_ERROR ..." to properly stop
cmake when Nsight Aftermath can't be configured.
2021-01-23 04:15:30 -03:00
ReinUsesLisp 625a011888 nsight_aftermath_tracker: Fix build issues when enabled
Fixes a bunch of build errors when Nsight Aftermath is properly enabled.
2021-01-23 04:13:39 -03:00
ReinUsesLisp 37ef2ee595 vk_pipeline_cache: Properly bypass VertexA shaders
The VertexA stage is not yet implemented, but Vulkan is adding its
descriptors, causing a discrepancy in the pushed descriptors and the
template. This generally ends up in a driver side crash.

Bypass the VertexA stage for now.
2021-01-23 03:59:59 -03:00
bunnei 302a5f00e8
Merge pull request #4713 from behunin/int-flags
Start of Integer flags implementation
2021-01-22 21:57:14 -08:00
ReinUsesLisp bda177ef40 video_core/memory_manager: Add BytesToMapEnd
Track map address sizes in a flat ordered map and add a method to query
the number of bytes until the end of a map in a given address.
2021-01-22 18:31:12 -03:00
ReinUsesLisp 436457b6e7 gl_shader_decompiler: Fix constant buffer size calculation
The divide logic was wrong and can cause an uniform buffer size
overflow.
2021-01-21 19:47:41 -03:00
ReinUsesLisp b7febb5625 video_core/memory_manager: Remove unused CopyBlockUnsafe
This function was not being used.
2021-01-21 19:16:06 -03:00
ReinUsesLisp 0e9a6759f9 video_core/memory_manager: Flush destination buffer on CopyBlock
When we copy into a buffer, it might contain data modified from the GPU
on the same pages. Because of this, we have to flush the contents before
writing new data.

An alternative approach would be to write the data in place, but games
can also write data in other ways, invalidating our contents.

Fixes geometry in Zombie Panic in Wonderland DX.
2021-01-21 19:16:06 -03:00
ReinUsesLisp dd790abab0 video_core/memory_manager: Add GPU address based flush method
Allow flushing rasterizer contents based on a GPU address.
2021-01-21 19:16:05 -03:00
bunnei ffbde909c8
Merge pull request #5361 from ReinUsesLisp/vk-shader-comment
vk_shader_decompiler: Show comments as OpUndef with a type
2021-01-20 21:33:42 -08:00
ReinUsesLisp 51512d01d8 renderer_opengl: Avoid precompiled cache and force NV GL cache directory
Setting __GL_SHADER_DISK_CACHE_PATH we can force the cache directory to
be in yuzu's user directory to stop commonly distributed malware from
deleting our driver shader cache. And by setting
__GL_SHADER_DISK_CACHE_SKIP_CLEANUP we can have an unbounded shader
cache size.

This has only been implemented on Windows, mostly because previous tests
didn't seem to work on Linux.

Disable the precompiled cache on Nvidia's driver. There's no need to
hide information the driver already has in its own cache.
2021-01-21 00:41:03 -03:00
Rodrigo Locatti 2ef4591e58
Merge pull request #5746 from lioncash/sign-compare
texture_cache/util: Resolve -Wsign-compare warning
2021-01-18 03:49:58 -03:00
Rodrigo Locatti 132f2006af
Merge pull request #5745 from lioncash/documentation
video_core: Resolve -Wdocumentation warnings
2021-01-17 05:37:17 -03:00
Lioncash 5f4e7c77bd texture_cache/util: Resolve -Wsign-compare warning
Resolves a -Wsign-compare warning on Clang.
2021-01-17 02:47:48 -05:00
Lioncash 40acc2c079 video_core: Resolve -Wdocumentation warnings
Silences some -Wdocumentation warnings on Clang.
2021-01-17 02:44:21 -05:00
Lioncash c61b973968 vulkan_debug_callback: Add missing header guard
Prevents inclusion issues from occurring.
2021-01-17 02:39:24 -05:00
Rodrigo Locatti fd873fd369
Merge pull request #5262 from ReinUsesLisp/buffer-base
buffer_cache/buffer_base: Add a range tracking buffer container and tests
2021-01-16 19:48:26 -03:00
Rodrigo Locatti c17ee0da5d
Merge pull request #5297 from ReinUsesLisp/vulkan-allocator-common
vulkan_memory_allocator: Improvements to the memory allocator
2021-01-15 21:50:05 -03:00
ReinUsesLisp c3c7603076 vk_shader_decompiler: Show comments as OpUndef with a type
Silence the new validation layer error about SPIR-V not allowing OpUndef
on a OpTypeVoid, even when the SPIR-V spec doesn't say anything against
it.

They will be inserted as an undefined int to avoid SPIRV-Cross and
validation errors, but only when a debugging tool is attached.
2021-01-15 21:12:57 -03:00
LC 8be9e5b48b
Merge pull request #5358 from ReinUsesLisp/rename-insert-padding
common/common_funcs: Rename INSERT_UNION_PADDING_{BYTES,WORDS} to _NOINIT
2021-01-15 16:19:46 -05:00
ReinUsesLisp 3ff978aa4f common/common_funcs: Rename INSERT_UNION_PADDING_{BYTES,WORDS} to _NOINIT
INSERT_PADDING_BYTES_NOINIT is more descriptive of the underlying behavior.
2021-01-15 16:27:28 -03:00
ReinUsesLisp 301e2b5b7a vulkan_memory_allocator: Remove unnecesary 'device' memory from commits 2021-01-15 16:19:40 -03:00
ReinUsesLisp 432f045dba vk_texture_cache: Use Download memory types for texture flushes
Use the Download memory type where it matters.
2021-01-15 16:19:40 -03:00
ReinUsesLisp 8f22f5470c vulkan_memory_allocator: Add allocation support for download types
Implements the allocator logic to handle download memory types. This
will try to use HOST_CACHED_BIT when available.
2021-01-15 16:19:39 -03:00
ReinUsesLisp 72541af3bc vulkan_memory_allocator: Add "download" memory usage hint
Allow users of the allocator to hint memory usage for downloads. This
removes the non-descriptive boolean passed for "host visible" or not
host visible memory commits, and uses an enum to hint device local,
upload and download usages.
2021-01-15 16:19:39 -03:00
ReinUsesLisp fade63b58e vulkan_common: Move allocator to the common directory
Allow using the abstraction from the OpenGL backend.
2021-01-15 16:19:39 -03:00
ReinUsesLisp c2b550987b renderer_vulkan: Rename Vulkan memory manager to memory allocator
"Memory manager" collides with the guest GPU memory manager, and a
memory allocator sounds closer to what the abstraction aims to be.
2021-01-15 16:19:39 -03:00
ReinUsesLisp e996f1ad09 vk_memory_manager: Improve memory manager and its API
Fix a bug where the memory allocator could leave gaps between commits.
To fix this the allocation algorithm was reworked, although it's still
short in number of lines of code.

Rework the allocation API to self-contained movable objects instead of
naively using an unique_ptr to do the job for us. Remove the VK prefix.
2021-01-15 16:19:36 -03:00
LC 9754a8145c
Merge pull request #5357 from ReinUsesLisp/alignment-log2
common/alignment: Rename AlignBits to AlignUpLog2 and use constraints
2021-01-15 03:12:36 -05:00
Lioncash 8620de6b20 common/bit_util: Replace CLZ/CTZ operations with standardized ones
Makes for less code that we need to maintain.
2021-01-15 02:15:32 -05:00
ReinUsesLisp fe494a0ccd common/alignment: Rename AlignBits to AlignUpLog2
AlignUpLog2 describes what the function does better than AlignBits.
2021-01-15 04:13:33 -03:00
ReinUsesLisp cc2c3e447f video_core/cmake: Remove Werror flags already defined code-base wide
These flags are already defined in src/cmake.
2021-01-15 03:37:34 -03:00
LC 28e78d81b2
Merge pull request #5351 from ReinUsesLisp/vc-unused-functions
cmake: Enforce -Wunused-function code-base wise
2021-01-15 01:36:51 -05:00
Rodrigo Locatti 185388f341
Merge pull request #5350 from ReinUsesLisp/vk-init-warns
vulkan_common: Silence missing initializer warnings
2021-01-15 03:32:01 -03:00
LC 76b465f3ef
Merge pull request #5349 from ReinUsesLisp/anv-fix
vulkan_device: Enable shaderStorageImageMultisample conditionally
2021-01-15 01:17:00 -05:00
ReinUsesLisp 06e0506cb3 cmake: Enforce -Wunused-function code-base wide 2021-01-15 03:09:48 -03:00
ReinUsesLisp 71264ce9a7 video_core: Enforce -Wunused-function
Stops us from merging code with unused functions in the future.

If something is invoked behind conditionally evaluated code in
a way that the language can't see it (e.g. preprocessor macros), the
potentially unused function should use [[maybe_unused]].
2021-01-15 02:59:25 -03:00
ReinUsesLisp 3e03391a49 vk_buffer_cache: Remove unused function 2021-01-15 02:58:55 -03:00
ReinUsesLisp be8fd5490e vulkan_common: Silence missing initializer warnings
Silence warnings explicitly initializing all members on construction.
2021-01-15 02:55:11 -03:00
ReinUsesLisp ba2ea7eeac vulkan_device: Enable shaderStorageImageMultisample conditionally
Fix Vulkan initialization on ANV.
2021-01-15 02:47:05 -03:00
ReinUsesLisp 22be115eb2 astc: Increase integer encoded vector size
Invalid ASTC textures seem to write more bytes here, increase
the size to something that can't make us push out of bounds.
2021-01-15 02:24:36 -03:00
ReinUsesLisp 0ec71b78fb astc: Return zero on out of bound bits
Avoid out of bound reads on invalid ASTC textures.
Games can bind invalid textures that make us read or write out of bounds.
2021-01-15 02:24:36 -03:00
ReinUsesLisp d9a15a935b vulkan_device: Remove requirement on shaderStorageImageMultisample
yuzu doesn't currently emulate MS image stores. Requiring this makes no
sense for now. Fixes ANV not booting any games on Vulkan.
2021-01-13 06:21:33 -03:00
ReinUsesLisp a4bfae1b55 buffer_cache/buffer_base: Add a range tracking buffer container
It keeps track of the modified CPU and GPU ranges on a CPU page
granularity, notifying the given rasterizer about state changes
in the tracking behavior of the buffer.

Use a small vector optimization to store buffers smaller than 256 KiB
locally instead of using free store memory allocations.
2021-01-13 04:14:58 -03:00
bunnei de1a316369
Merge pull request #5311 from ReinUsesLisp/fence-wait
vk_fence_manager: Use timeline semaphores instead of spin waits
2021-01-12 21:00:05 -08:00
Levi 7a3c884e39 Merge remote-tracking branch 'upstream/master' into int-flags 2021-01-10 22:09:56 -07:00
bunnei 8eea7c1176
Merge pull request #5231 from ReinUsesLisp/dyn-bindings
renderer_vulkan/fixed_pipeline_state: Move enabled bindings to static state
2021-01-08 12:24:46 -08:00
ReinUsesLisp 154a7653f9 vk_fence_manager: Use timeline semaphores instead of spin waits
With timeline semaphores we can avoid creating objects. Instead of
creating an event, grab the current tick from the scheduler and flush
the current command buffer. When the fence has to be queried/waited, we
can do so against the master semaphore instead of spinning on an event.

If Vulkan supported NVN like events or fences, we could signal from the
command buffer and wait for that without splitting things in two
separate command buffers.
2021-01-08 02:47:28 -03:00
Ameer J 16392a23cc remove inaccurate reference
Co-authored-by: LC <mathew1800@gmail.com>
2021-01-07 14:33:45 -05:00
ameerj 06cef3355e fix for nvdec disabled, cleanup host1x 2021-01-07 14:33:45 -05:00
ameerj 2c27127d04 nvdec syncpt incorporation
laying the groundwork for async gpu, although this does not fully implement async nvdec operations
2021-01-07 14:33:45 -05:00
MerryMage 21199cb965 vulkan_library: Common::DynamicLibrary::Open is [[nodiscard]]
Ignore the return value on __APPLE__ systems as well
2021-01-07 17:37:47 +00:00
MerryMage aace20afc7 texture_cache: Replace PAGE_SHIFT with PAGE_BITS
PAGE_SHIFT is a #define in system headers that leaks into user code on some systems
2021-01-07 16:51:34 +00:00
Morph e8d40559d5
Merge pull request #5288 from ReinUsesLisp/workaround-garbage
gl_texture_cache: Avoid format views on Intel and AMD
2021-01-06 15:39:51 +08:00
bunnei 275b96a0e2
Merge pull request #5289 from ReinUsesLisp/vulkan-device
vulkan_common: Move device abstraction to the common directory and allow surfaceless devices
2021-01-05 17:44:56 -08:00
LC 2a6e6306d8
Merge pull request #5292 from ReinUsesLisp/empty-set
vk_rasterizer: Skip binding empty descriptor sets on compute
2021-01-04 21:32:57 -05:00
ReinUsesLisp 1ccf805367 vk_rasterizer: Skip binding empty descriptor sets on compute
Fixes unit tests where compute shaders had no descriptors in the set,
making Vulkan drivers crash when binding an empty set.
2021-01-04 17:56:39 -03:00
ReinUsesLisp ac1e4734c2 vulkan_device: Allow creating a device without surface 2021-01-04 02:22:22 -03:00
ReinUsesLisp d235cf3933 renderer_vulkan/nsight_aftermath_tracker: Move to vulkan_common 2021-01-04 02:22:22 -03:00
ReinUsesLisp 3753553b6a renderer_vulkan: Move device abstraction to vulkan_common 2021-01-04 02:22:22 -03:00
ReinUsesLisp 7d904fef2e gl_texture_cache: Avoid format views on Intel and AMD
Intel and AMD proprietary drivers are incapable of rendering to texture
views of different formats than the original texture. Avoid creating
these at a cache level. This will consume more memory, emulating them
with copies.
2021-01-04 02:06:40 -03:00
ReinUsesLisp 3a49c1a691 gl_texture_cache: Create base images with sRGB
This breaks accelerated decoders trying to imageStore into images with
sRGB. The decoders are currently disabled so this won't cause issues at
runtime.
2021-01-04 01:54:54 -03:00
ReinUsesLisp 974d731926 renderer_vulkan: Rename VKDevice to Device
The "VK" prefix predates the "Vulkan" namespace. It was carried around
the codebase for consistency. "VKDevice" currently is a bad alias with
"VkDevice" (only an upcase character of difference) that can cause
confusion. Rename all instances of it.
2021-01-03 17:51:48 -03:00
Rodrigo Locatti 7265e80c12
Merge pull request #5230 from ReinUsesLisp/vulkan-common
vulkan_common: Move reusable Vulkan abstractions to a separate directory
2021-01-03 17:38:29 -03:00
Morph a745d87971 general: Fix various spelling errors 2021-01-02 10:23:41 -05:00
bunnei 25d607f5f6
Merge pull request #5208 from bunnei/service-threads
Service threads
2020-12-30 22:06:05 -08:00
ReinUsesLisp cdbee27692 vulkan_instance: Allow different Vulkan versions and enforce 1.1
For listing the available physical devices we can use Vulkan 1.0.
Now that MoltenVK supports 1.1 we can require it for running games.

Add missing documentation.
2020-12-31 02:07:34 -03:00
ReinUsesLisp 7344a7c447 vk_device: Use an array to report lacking device limits
This makes easier to add and tune the required device limits.
2020-12-31 02:07:34 -03:00
ReinUsesLisp f687392e6f vk_device: Stop initialization when device is not suitable
VKDevice::IsSuitable was not being called. To address this issue, check
suitability before initialization and throw an exception if it fails.

By doing this, we can deduplicate some code on queue searches.
Previosuly we would first search if a present and graphics queue
existed, then on initialization we would search again to find the index.
2020-12-31 02:07:33 -03:00
ReinUsesLisp 53ea06dc17 renderer_vulkan: Remove two step initialization on VKDevice
The Vulkan device abstraction either initializes successfully on the
constructor or throws a Vulkan exception.
2020-12-31 02:07:33 -03:00
ReinUsesLisp 085adfea00 renderer_vulkan: Throw when enumerating devices fails
Report device enumeration errors with exceptions to be consistent with
other initialization related function calls. Reduces the amount of code
to maintain.
2020-12-31 02:07:33 -03:00
ReinUsesLisp 11f0f7598d renderer_vulkan: Initialize surface in separate file
Move surface initialization code to a separate file. It's unlikely to
use this code outside of Vulkan, but keeping platform-specific code
(Win32, Xlib, Wayland) in its own translation unit keeps things cleaner.
2020-12-31 02:07:33 -03:00
ReinUsesLisp dce8720780 renderer_vulkan: Catch and report exceptions
Move more Vulkan code to report errors with exceptions and report them
through a log before notifying it with an error boolean for backwards
compatibility. In the future we can replace the rasterizer two-step
initialization to always use exceptions.
2020-12-31 02:07:33 -03:00
ReinUsesLisp 47843b4f09 renderer_vulkan: Create debug callback on separate file and throw
Initialize debug callbacks (messenger) from a separate file. This allows
sharing code with different backends.

Change our Vulkan error handling to use exceptions instead of error
codes, simplifying the initialization process.
2020-12-31 02:07:33 -03:00
ReinUsesLisp 25f88d99ce renderer_vulkan: Move instance initialization to a separate file
Simplify Vulkan's backend initialization code by moving it to a separate
file, allowing us to initialize a Vulkan instance from different
backends.
2020-12-31 02:07:33 -03:00
ReinUsesLisp d1435009ed vulkan_common: Rename renderer_vulkan/wrapper.h to vulkan_common/vulkan_wrapper.h
Allows sharing Vulkan wrapper code between different rendering backends.
2020-12-31 02:07:14 -03:00
ReinUsesLisp d937421422 vulkan_common: Move dynamic library load to a separate file
Allows us to initialize a Vulkan dynamic library from different backends
without duplicating code.
2020-12-31 02:02:48 -03:00
Lioncash bcafef4b94 half_set: Resolve -Wmaybe-uninitialized warnings 2020-12-30 17:59:42 -05:00
Lioncash f0d9ab0717 maxwell_to_vk: Initialize usage variable in SurfaceFormat()
Silences a -Wmaybe-uninitialized warning
2020-12-30 13:25:03 -05:00
ReinUsesLisp 9764c13d6d video_core: Rewrite the texture cache
The current texture cache has several points that hurt maintainability
and performance. It's easy to break unrelated parts of the cache
when doing minor changes. The cache can easily forget valuable
information about the cached textures by CPU writes or simply by its
normal usage.The current texture cache has several points that hurt
maintainability and performance. It's easy to break unrelated parts
of the cache when doing minor changes. The cache can easily forget
valuable information about the cached textures by CPU writes or simply
by its normal usage.

This commit aims to address those issues.
2020-12-30 03:38:50 -03:00
ReinUsesLisp 9106ac1e6b video_core: Add a delayed destruction ring abstraction 2020-12-30 02:10:19 -03:00
ReinUsesLisp 21b18057f7 host_shaders: Add Vulkan assembler compute shaders 2020-12-30 02:03:50 -03:00
ReinUsesLisp 87ff58b1d7 host_shaders: Add helper to blit depth stencil fragment shader 2020-12-30 02:02:07 -03:00
ReinUsesLisp ae5725b709 host_shaders: Add texture color blit fragment shader 2020-12-30 02:00:48 -03:00
ReinUsesLisp 64fbf319f1 host_shaders: Add shaders to present to the swapchain 2020-12-30 01:59:12 -03:00
ReinUsesLisp 82b7daed9c host_shaders: Add shaders to convert between depth and color images 2020-12-30 01:48:44 -03:00
ReinUsesLisp dc81a90640 host_shaders: Add compute shader to copy BC4 as RG32UI to RGBA8 2020-12-30 01:47:08 -03:00
ReinUsesLisp 5169ce9fcd host_shaders: Add shader to render a full screen triangle 2020-12-30 01:44:09 -03:00
ReinUsesLisp 59c46f9de9 host_shaders: Add pitch linear upload compute shader 2020-12-30 01:41:42 -03:00
ReinUsesLisp 12d16248dd host_shaders: Add block linear upload compute shaders 2020-12-30 01:39:35 -03:00
ReinUsesLisp f20e18f60d host_shaders: Add copyright headers to OpenGL present shaders 2020-12-30 01:35:56 -03:00
ReinUsesLisp 95d156a150 video_core/host_shaders: Add support for prebuilt SPIR-V shaders
Add support for building SPIR-V shaders from GLSL and generating headers
to include the text of those same GLSL shaders to consume from OpenGL.
2020-12-30 01:29:07 -03:00
bunnei 954341763a gpu: gpu_thread: Ensure MicroProfile is shutdown on exit. 2020-12-28 21:33:34 -08:00
bunnei 4991620f89 video_core: gpu_thread: Do not wait when system is powered down. 2020-12-28 16:33:48 -08:00
bunnei 40571c073f video_core: gpu: Implement synchronous mode using threaded GPU. 2020-12-28 16:33:48 -08:00
bunnei 14c825bd1c video_core: gpu: Refactor out synchronous/asynchronous GPU implementations.
- We must always use a GPU thread now, even with synchronous GPU.
2020-12-28 16:33:48 -08:00
ReinUsesLisp 661483f313 renderer_vulkan/fixed_pipeline_state: Move enabled bindings to static state
Without using VK_EXT_robustness2, we can't consider the 'enabled' (not
null) vertex buffers as dynamic state, as this leads to invalid Vulkan
state. Move this to static state that is always hashed and compared in
the pipeline key.

The bits for enabled vertex buffers are moved into the attribute state
bitfield. This is not 'correct' as it's not an attribute state, but that
struct has bits to spare, and it's used in an array of 32 elements (the
exact same number of vertex buffer bindings).
2020-12-25 23:34:38 -03:00
Rodrigo Locatti 0dc4ab42cc
Merge pull request #5226 from ReinUsesLisp/c4715-vc
video_core: Enforce C4715 (not all control paths return a value)
2020-12-25 03:11:47 -03:00
ReinUsesLisp 1b9e08ab78 cmake: Always enable Vulkan
Removes the unnecesary burden of maintaining separate #ifdef paths and
allows us sharing generic Vulkan code across APIs.
2020-12-24 21:07:24 -03:00
ReinUsesLisp 1e191cc837 video_core: Enforce C4715 (not all control paths return a value)
Most of the time people write code that always returns a value,
terminates execution, throws an exception, or uses an unconventional
jump primitive.

This is not always true when we build without asserts on mainline builds.
To avoid introducing undefined behavior on our most used builds, enforce
this warning signalling an error and stopping the build from shipping.
2020-12-24 21:01:23 -03:00
ReinUsesLisp 5dbda22659 vk_shader_decompiler: Silence warning when compiling without asserts 2020-12-24 21:01:09 -03:00
bunnei 37bec068c2
Merge pull request #5157 from lioncash/array-dirty
maxwell_3d: Remove unused dirty_pointer array
2020-12-15 00:35:47 -08:00
bunnei d1a2b3fb18
Merge pull request #5162 from lioncash/copy-shader
gl_shader_decompiler: Elide unnecessary copies within DeclareConstantBuffers()
2020-12-10 00:11:11 -08:00
Rodrigo Locatti 3415890dd5
Merge pull request #5164 from lioncash/contains
video_core: Make use of ordered container contains() where applicable
2020-12-07 21:55:51 -03:00
Lioncash 09fa1d6a73 video_core: Make use of ordered container contains() where applicable
With C++20, we can use the more concise contains() member function
instead of comparing the result of the find() call with the end
iterator.
2020-12-07 16:30:39 -05:00
Lioncash 45c5b084fd ast: Improve string concat readability in operator()
Provides an in-place format string to make it more pleasant to read.
2020-12-07 16:15:28 -05:00
Lioncash edcbd47800 gl_shader_decompiler: Elide unnecessary copies within DeclareConstantBuffers()
Resolves a -Wrange-loop-analysis warning.
2020-12-07 14:01:52 -05:00
bunnei 5cd051eced
Merge pull request #5149 from comex/xx-map-interval
map_interval: Change field order to address uninitialized field warning
2020-12-07 10:14:02 -08:00
Rodrigo Locatti 12f3b13995
Merge pull request #5159 from lioncash/move-amend
shader_ir: std::move node within DeclareAmend()
2020-12-07 04:58:01 -03:00
Lioncash 5d2f18fbcd buffer_block: Mark interface as nodiscard where applicable
Prevents logic errors from occurring from unused values.
2020-12-07 01:53:40 -05:00
Lioncash 3954f14c6d buffer_block: Remove unnecessary includes
Reduces the amount of dependencies the header pulls in.
2020-12-07 01:52:16 -05:00
Lioncash 7234f436aa shader_ir: std::move node within DeclareAmend()
Same behavior, but elides an unnecessary atomic reference count
increment and decrement.
2020-12-07 00:51:03 -05:00
Lioncash 4c5f5c9bf3 video_core: Remove unnecessary enum class casting in logging messages
fmt now automatically prints the numeric value of an enum class member
by default, so we don't need to use casts any more.

Reduces the line noise a bit.
2020-12-07 00:41:50 -05:00
LC 23aabe85e6
Merge pull request #5152 from comex/xx-override
renderer_vulkan: Add missing `override` specifier
2020-12-07 00:07:17 -05:00
LC 69af6ada2f
Merge pull request #5136 from lioncash/video-shadow3
video_core: Resolve more variable shadowing scenarios pt.3
2020-12-07 00:06:53 -05:00
Lioncash 9e7a1f1351 maxwell_3d: Move member variables to end of class
Follows our established coding style.
2020-12-06 20:56:00 -05:00
Lioncash ce0712bf95 maxwell_3d: Resolve -Wdocumentation warning
Removes a documentation comment for a non-existent member.
2020-12-06 20:48:12 -05:00
Lioncash bcc5c4403a maxwell_3d: Remove unused dirty_pointer array
This is unused and removing it shrinks the structure by 3584 bytes.
2020-12-06 20:46:57 -05:00
comex eea5122d1b renderer_vulkan: Add missing override specifier 2020-12-06 18:38:52 -05:00
comex b8fbf6969c map_interval: Change field order to address uninitialized field warning
Clang complains about `new_chunk`'s constructor using the
then-uninitialized `first_chunk` (even though it's just to get a pointer
into it).
2020-12-06 18:37:23 -05:00
comex d637114c17 video_core: Adjust NUM macro to avoid Clang warning
The previous definition was:

    #define NUM(field_name) (sizeof(Maxwell3D::Regs::field_name) / sizeof(u32))

In cases where `field_name` happens to refer to an array, Clang thinks
`sizeof(an array value) / sizeof(a type)` is an instance of the idiom
where `sizeof` is used to compute an array length.  So it thinks the
type in the denominator ought to be the array element type, and warns if
it isn't, assuming this is a mistake.

In reality, `NUM` is not used to get array lengths at all, so there is no
mistake.  Silence the warning by applying Clang's suggested workaround
of parenthesizing the denominator.
2020-12-06 18:24:16 -05:00
comex a6e6cd5788 maxwell_dma: Rename RenderEnable::Mode::FALSE and TRUE to avoid name conflict
On Apple platforms, FALSE and TRUE are defined as macros by
<mach/boolean.h>, which is included by various system headers.

Note that there appear to be no actual users of the names to fix up.
2020-12-05 17:59:02 -05:00
Lioncash f95602f152 video_core: Resolve more variable shadowing scenarios pt.3
Cleans out the rest of the occurrences of variable shadowing and makes
any further occurrences of shadowing compiler errors.
2020-12-05 16:02:23 -05:00
Lioncash 414a87a4f4 video_core: Resolve more variable shadowing scenarios pt.2
Migrates the video core code closer to enabling variable shadowing
warnings as errors.

This primarily sorts out shadowing occurrences within the Vulkan code.
2020-12-05 06:39:35 -05:00
bunnei e6a896c4bd
Merge pull request #5124 from lioncash/video-shadow
video_core: Resolve more variable shadowing scenarios
2020-12-05 00:48:08 -08:00
bunnei 63419e144f
Merge pull request #5127 from FearlessTobi/port-5617
Port citra-emu/citra#5617: "Fix telemetry-related exit crash from use-after-free"
2020-12-04 21:57:40 -08:00
FearlessTobi 37d672bf08 Fix telemetry-related exit crash from use-after-free
Co-Authored-By: xperia64 <xperia64@users.noreply.github.com>
2020-12-05 02:42:50 +01:00
Lioncash 94af77aa7c codec: Remove deprecated usage of AVCodecContext::refcounted_frames
This was only necessary for use with the
avcodec_decode_video2/avcoded_decode_audio4 APIs which are also
deprecated.

Given we use avcodec_send_packet/avcodec_receive_frame, this isn't
necessary, this is even indicated directly within the FFmpeg API changes
document here on 2017-09-26:

https://github.com/FFmpeg/FFmpeg/blob/master/doc/APIchanges#L410

This prevents our code from breaking whenever we update to a newer
version of FFmpeg in the future if they ever decide to fully remove this
API member.
2020-12-04 16:23:13 -05:00
Lioncash 677a8b208d video_core: Resolve more variable shadowing scenarios
Resolves variable shadowing scenarios up to the end of the OpenGL code
to make it nicer to review. The rest will be resolved in a following
commit.
2020-12-04 16:19:09 -05:00
bunnei fad38ec6e8
Merge pull request #5064 from lioncash/node-shadow
node: Eliminate variable shadowing
2020-12-04 00:45:33 -08:00
Lioncash edd8208779 node: Mark member functions as [[nodiscard]] where applicable
Prevents logic bugs from accidentally ignoring the return value.
2020-12-03 16:03:34 -05:00
Lioncash 7cf34c3637 node: Eliminate variable shadowing 2020-12-03 15:59:38 -05:00
Lioncash cf9767c608 vp9/vic: Resolve pessimizing moves
Removes the usage of moves that don't result in behavior different from
a copy, or otherwise would prevent copy elision from occurring.
2020-12-03 12:33:07 -05:00
bunnei 9abb23cd27
Merge pull request #5002 from ameerj/nvdec-frameskip
nvdec: Queue and display all decoded frames, cleanup decoders
2020-12-02 15:55:15 -08:00
bunnei 7b4a213603
Merge pull request #5013 from ReinUsesLisp/vk-early-z
vk_shader_decompiler: Implement force early fragment tests
2020-11-30 11:11:07 -08:00
comex 4681e1ea9e codec: Fix pragma GCC diagnostic pop missing corresponding push 2020-11-26 16:35:42 -05:00
ReinUsesLisp 2ccf85a910 vk_shader_decompiler: Implement force early fragment tests
Force early fragment tests when the 3D method is enabled.
The established pipeline cache takes care of recompiling if needed.

This is implemented only on Vulkan to avoid invalidating the shader
cache on OpenGL.
2020-11-26 17:52:26 -03:00
ameerj 979b602738 Limit queue size to 10 frames
Workaround for ZLA, which seems to decode and queue twice as many frames as it displays.
2020-11-26 14:04:06 -05:00
bunnei 322349e8cc
Merge pull request #4975 from comex/invalid-syncpoint-id
nvdrv, video_core: Don't index out of bounds when given invalid syncpoint ID
2020-11-26 01:27:24 -08:00
ameerj c9e3abe206 Address PR feedback
remove some redundant moves, make deleter match naming guidelines.

Co-Authored-By: LC <712067+lioncash@users.noreply.github.com>
2020-11-26 00:18:26 -05:00
Rodrigo Locatti 0e15c68f54
Merge pull request #4976 from comex/poll-events
Overhaul EmuWindow::PollEvents to fix yuzu-cmd calling SDL_PollEvents off main thread
2020-11-25 20:44:53 -03:00
ameerj eab041866b Queue decoded frames, cleanup decoders 2020-11-25 17:10:44 -05:00
ameerj d52ee6d0a7 cleanup unneeded comments and newlines 2020-11-25 14:46:08 -05:00
ameerj e87670ee48 Refactor MaxwellToSpirvComparison. Use Common::BitCast
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2020-11-25 00:33:20 -05:00
ameerj 1dbf71ceb3 Address PR feedback from Rein 2020-11-24 22:46:45 -05:00
ameerj 9014861858 vulkan_renderer: Alpha Test Culling Implementation
Used by various textures in many titles, e.g.  SSBU menu.
2020-11-24 22:46:45 -05:00
comex e8b2fd21d8 nvdrv, video_core: Don't index out of bounds when given invalid syncpoint ID
- Use .at() instead of raw indexing when dealing with untrusted indices.

- For the special case of WaitFence with syncpoint id UINT32_MAX,
  instead of crashing, log an error and ignore.  This is what I get when
  running Super Mario Maker 2.
2020-11-24 12:59:41 -05:00
Rodrigo Locatti fbda5e9ec9
Merge pull request #3681 from lioncash/component
decoder/image: Fix incorrect G24R8 component sizes in GetComponentSize()
2020-11-24 04:38:03 -03:00
comex 994f497781 Overhaul EmuWindow::PollEvents to fix yuzu-cmd calling SDL_PollEvents off main thread
EmuWindow::PollEvents was called from the GPU thread (or the CPU thread
in sync-GPU mode) when swapping buffers.  It had three implementations:

- In GRenderWindow, it didn't actually poll events, just set a flag and
  emit a signal to indicate that a frame was displayed.

- In EmuWindow_SDL2_Hide, it did nothing.

- In EmuWindow_SDL2, it did call SDL_PollEvents, but this is wrong
  because SDL_PollEvents is supposed to be called on the thread that set
  up video - in this case, the main thread, which was sleeping in a
  busyloop (regardless of whether sync-GPU was enabled).  On macOS this
  causes a crash.

To fix this:

- Rename EmuWindow::PollEvents to OnFrameDisplayed, and give it a
  default implementation that does nothing.

- In EmuWindow_SDL2, do not override OnFrameDisplayed, but instead have
  the main thread call SDL_WaitEvent in a loop.
2020-11-23 17:58:49 -05:00
Morph e13a91fa9b
Merge pull request #4954 from lioncash/compare
gl_rasterizer: Make floating-point literal a float
2020-11-22 09:55:23 +08:00
bunnei 5502f39125
Merge pull request #4955 from lioncash/move3
async_shaders: std::move data within QueueVulkanShader()
2020-11-21 01:21:08 -08:00
LC d88baa746b
Merge pull request #4957 from ReinUsesLisp/alpha-test-rt
gl_rasterizer: Remove warning of untested alpha test
2020-11-20 21:19:06 -05:00
ReinUsesLisp acc14d233f gl_rasterizer: Remove warning of untested alpha test
Alpha test has been proven to only affect the first render target.
2020-11-20 23:17:40 -03:00
bunnei b00f4abe36
Merge pull request #4953 from lioncash/shader-shadow
shader_bytecode: Eliminate variable shadowing
2020-11-20 16:58:14 -08:00
Lioncash 01db5cf203 async_shaders: emplace threads into the worker thread vector
Same behavior, but constructs the threads in place instead of moving
them.
2020-11-20 04:46:56 -05:00
Lioncash ba3916fc67 async_shaders: Simplify implementation of GetCompletedWork()
This is equivalent to moving all the contents and then clearing the
vector. This avoids a redundant allocation.
2020-11-20 04:44:44 -05:00
Lioncash 3fcc98e11a async_shaders: Simplify moving data into the pending queue 2020-11-20 04:41:29 -05:00
Lioncash 5b441fa25d async_shaders: std::move data within QueueVulkanShader()
Same behavior, but avoids redundant copies.

While we're at it, we can simplify the pushing of the parameters into
the pending queue.
2020-11-20 04:38:18 -05:00
Lioncash 8469b76630 gl_rasterizer: Make floating-point literal a float
Gets rid of an unnecessary expansion from float to double.
2020-11-20 04:24:33 -05:00
Lioncash b7cd5d742e shader_bytecode: Make use of [[nodiscard]] where applicable
Ensures that all queried values are made use of.
2020-11-20 02:20:37 -05:00
Lioncash 56ecafc204 shader_bytecode: Eliminate variable shadowing 2020-11-20 02:13:45 -05:00
Rodrigo Locatti 1889b641d9
Merge pull request #4308 from ReinUsesLisp/maxwell-3d-funcs
maxwell_3d: Move code to separate functions and insert instead of push_back
2020-11-20 01:57:22 -03:00
Lioncash 70812ec57b rasterizer_interface: Make use of [[nodiscard]] where applicable 2020-11-17 07:19:13 -05:00
Lioncash a78021580d render_base: Make use of [[nodiscard]] where applicable 2020-11-17 07:19:12 -05:00
Lioncash b928fca114 gpu: Make use of [[nodiscard]] where applicable 2020-11-17 07:19:09 -05:00
ReinUsesLisp 622830f4e1 maxwell_3d: Use insert instead of loop push_back
This reduces the overhead of bounds checking on each element.
It won't reduce the cost of allocation because usually this vector's
capacity is usually large enough to hold whatever we push to it.
2020-11-11 19:52:19 -03:00
ReinUsesLisp 9ea8cffe35 maxwell_3d: Move code to separate functions
Deduplicate some code and put it in separate functions so it's easier to
understand and profile.
2020-11-11 19:52:19 -03:00
bunnei dc5396a466 video_core: dma_pusher: Remove integrity check on command lists.
- This seems to cause softlocks in Breath of the Wild.
2020-11-07 00:08:19 -08:00
bunnei 91a45834fd
Merge pull request #4891 from lioncash/clang2
General: Fix clang build
2020-11-06 10:33:13 -08:00
bunnei a111a9ae2c
Merge pull request #4854 from ReinUsesLisp/cube-array-shadow
shader: Partially implement texture cube array shadow
2020-11-05 16:25:00 -08:00
Lioncash 6f006d051e General: Fix clang build
Allows building on clang to work again
2020-11-05 10:07:16 -05:00
bunnei 087f52e872
Merge pull request #4858 from lioncash/initializer
General: Resolve a few missing initializer warnings
2020-11-04 12:10:10 -08:00
Chloe 6bbbbe8f85
Merge pull request #4869 from bunnei/improve-gpu-sync
Improvements to GPU synchronization & various refactoring
2020-11-04 18:36:55 +11:00
bunnei 4bfa411ddc
Merge pull request #4874 from lioncash/nodiscard2
nvdec: Make use of [[nodiscard]] where applicable
2020-11-03 16:34:07 -08:00
Lioncash 4f0f481f63 nvdec: Make use of [[nodiscard]] where applicable
Prevents bugs from occurring where the results of a function are
accidentally discarded
2020-11-02 02:45:15 -05:00
bunnei 1089d76736
Merge pull request #4865 from ameerj/async-threadcount
async_shaders: Increase Async worker thread count for >8 thread cpus
2020-11-01 01:54:01 -07:00
bunnei c6e1c46ac7 video_core: dma_pusher: Add support for integrity checks.
- Log corrupted command lists, rather than crash.
2020-11-01 01:52:38 -07:00
bunnei c64545d07a video_core: dma_pusher: Add support for prefetched command lists. 2020-11-01 01:52:38 -07:00
bunnei 6053b95552 video_core: gpu: Implement WaitFence and IncrementSyncPoint. 2020-11-01 01:52:37 -07:00
bunnei 98f68d06f1
Merge pull request #4853 from ReinUsesLisp/fcmp-imm
shader/arithmetic: Implement FCMP immediate + register variant
2020-10-31 01:25:02 -07:00
Lioncash 12eeffcb7c vp9: Be explicit with copy and move operators
It's deprecated in the language to autogenerate these if the destructor
for a type is specified, so we can explicitly specify how we want these
to be generated.
2020-10-29 22:57:35 -04:00
Lioncash 0d713cf8eb vp9: Mark functions with [[nodiscard]] where applicable
Prevents values from mistakenly being discarded in cases where it's a
bug to do so.
2020-10-29 22:57:32 -04:00
Lioncash badea3b301 vp9: Provide a default initializer for "hidden" member
The API of VP9 exposes a WasFrameHidden() function which accesses this
member. Given the constructor previously didn't initialize this member,
it's a potential vector for an uninitialized read.

Instead, we can initialize this to a deterministic value to prevent that
from occurring.
2020-10-29 22:35:55 -04:00
Lioncash f8543249f0 vp9: Make some member functions internally linked
These helper functions don't directly modify any member state and can be
hidden from view.
2020-10-29 22:34:46 -04:00
Lioncash 5553bd3ba2 General: Resolve a few missing initializer warnings
Resolves a few -Wmissing-initializer warnings.
2020-10-29 19:37:07 -04:00
bunnei ef29bf4515
Merge pull request #4837 from lioncash/nvdec-2
nvdec: Minor tidying up
2020-10-29 12:28:07 -07:00
ameerj 3620206136 async_shaders: Increase Async worker thread count for 8+ thread cpus
Adds 1 async worker thread for every 2 available threads above 8
2020-10-29 14:16:45 -04:00
bunnei c6d001c94f
Merge pull request #4838 from lioncash/syncmgr
sync_manager: Amend parameter order of calls to SyncptIncr constructor
2020-10-28 22:49:22 -07:00
bunnei 94eca09cf6 video_core: cdma_pusher: Add missing LOG_DEBUG field in ExecuteCommand. 2020-10-28 16:47:08 -07:00
ReinUsesLisp 657771bdcb shader: Partially implement texture cube array shadow
This implements texture cube arrays with shadow comparisons but doesn't
fix the asserts related to it.

Fixes out of bounds reads on swizzle constructors and makes them use
bounds checked ::at instead of the unsafe operator[].
2020-10-28 17:12:40 -03:00
ReinUsesLisp 44b552be71 shader/arithmetic: Implement FCMP immediate + register variant
Trivially add the encoding for this.
2020-10-28 17:05:41 -03:00
LC 978e7897a3
Merge pull request #4848 from ReinUsesLisp/type-limits
video_core: Enforce -Werror=type-limits
2020-10-28 03:16:10 -04:00
ReinUsesLisp 79da90cea8 video_core: Enforce -Wredundant-move and -Wpessimizing-move
Silence three warnings and make them errors to avoid introducing more in the future.
2020-10-28 02:44:50 -03:00
ReinUsesLisp 4a451e5849 video_core: Enforce -Werror=type-limits
Silences one warning and avoids introducing more in the future.
2020-10-28 02:37:47 -03:00
Lioncash 047e77e2f0 sync_manager: Amend parameter order of calls to SyncptIncr constructor
Corrects some cases where the arguments would be incorrectly swapped.
2020-10-27 03:22:57 -04:00
Lioncash cce14b4cd7 h264: Make WriteUe take a u32
Enforces the type of the desired value in calling code.
2020-10-27 03:21:53 -04:00
Lioncash 6291975731 vp9: std::move buffer within ComposeFrameHeader()
We can move the buffer here to avoid a heap reallocation
2020-10-27 02:27:31 -04:00
Lioncash 00decfbb07 vp9: Remove dead code 2020-10-27 02:26:17 -04:00
Lioncash 111802bbbb vp9: Join declarations with assignments 2020-10-27 02:26:03 -04:00
Lioncash 3b5d5fa86f vp9: Remove pessimizing moves
The move will already occur without std::move.
2020-10-27 02:21:40 -04:00
Lioncash dcc26c54a5 vp9: Resolve variable shadowing 2020-10-27 02:20:17 -04:00
Lioncash c04203b786 nvdec: Tidy up header includes
Prevents a few unnecessary inclusions.
2020-10-27 02:16:42 -04:00
ameerj eb67a45ca8 video_core: NVDEC Implementation
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library.

The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data.

To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the ffmpeg library.

Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header.

Async GPU is not properly implemented at the moment.

Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>
2020-10-26 23:07:36 -04:00
bunnei 3e46934442
Merge pull request #4706 from ReinUsesLisp/cmake-host-shaders
video_core: Fix instances where msbuild always regenerated host shaders
2020-10-23 10:01:16 -07:00
Lioncash 678d012c2c video_core: Conditially activate relevant compiler warnings
These compiler flags aren't shared with clang, so specifying these flags
unconditionally can lead to a bit of warning spam.

While we're in the area, we can also enable -Wunused-but-set-parameter
given this is almost always a bug.
2020-10-20 20:28:25 -04:00
ReinUsesLisp f21a189148 gl_arb_decompiler: Implement robust buffer operations
This emulates the behavior we get on GLSL with regular SSBOs with a
pointer + length pair. It aims to be consistent with the crashes we
might get.

Out of bounds stores are ignored. Atomics are ignored and return zero.
Reads return zero.
2020-10-20 03:34:32 -03:00
bunnei f1ead11df7
Merge pull request #4204 from ReinUsesLisp/vulkan-1.0
renderer_vulkan: Create and properly use Vulkan 1.0 instances when 1.1 is not available
2020-10-19 14:18:54 -07:00
bunnei 743fe1aea3
Merge pull request #4782 from ReinUsesLisp/remove-dyn-primitive
vk_graphics_pipeline: Manage primitive topology as fixed state
2020-10-17 22:14:17 -07:00
bunnei d47ac3ce09
Merge pull request #4772 from goldenx86/block-rdna
vk_device: Block VK_EXT_extended_dynamic_state for RDNA devices
2020-10-14 17:51:39 -07:00
ReinUsesLisp e4e0abc418 vk_graphics_pipeline: Manage primitive topology as fixed state
Vulkan has requirements for primitive topologies that don't play nicely
with yuzu's. Since it's only 4 bits, we can move it to fixed state
without changing the size of the pipeline key.

- Fixes a regression on recent Nvidia drivers on Fire Emblem: Three
  Houses.
2020-10-13 04:08:33 -03:00
bunnei 4c348f4069
Merge pull request #4766 from ReinUsesLisp/tmml-cube
shader/texture: Implement CUBE texture type for TMML and fix arrays
2020-10-12 12:53:57 -07:00
ReinUsesLisp e1600b0962 video_core: Enforce -Wclass-memaccess 2020-10-09 16:46:11 -03:00
LC 61b246a3a9
Merge pull request #4771 from ReinUsesLisp/warn-unused-var
video_core: Enforce -Wunused-variable and -Wunused-but-set-variable
2020-10-08 21:10:31 -04:00
goldenx86 0120e5b1d9 vk_device: Block VK_EXT_extended_dynamic_state for RDNA devices
RDNA devices seem to crash when using VK_EXT_extended_dynamic_state in
the latest 20.9.2 proprietary Windows drivers. As a workaround, for now
we block device names corresponding to current RDNA released products.
2020-10-08 21:27:49 -03:00
ReinUsesLisp dffaffaac1 shader/texture: Implement CUBE texture type for TMML and fix arrays
TMML takes an array argument that has no known meaning, this one appears
as the first component in gpr8 followed by s, t and r. Skip this
component when arrays are being used. Also implement CUBE texture types.

- Used by Pikmin 3: Deluxe Demo.
2020-10-07 23:17:46 -03:00
ReinUsesLisp cd3e959f23 renderer_vulkan/wrapper: Fix physical device sorting
The old code had a sort function that was invalid and it didn't work as
expected when the base vector had a different order (e.g. renderdoc was
attached).

This sorts devices as expected and fixes a debug assert on MSVC.
2020-10-07 17:13:22 -03:00
ReinUsesLisp 2a24b1c973 video_core: Enforce -Wunused-variable and -Wunused-but-set-variable 2020-10-02 21:19:35 -03:00
Matías Locatti d7843b8ef2
Remove ext_extended_dynamic_state blacklist
Latest AMD 20.9.2 driver fixed this, there's no reason to keep it blocked, as the previous stable signed driver release doesn't include the extension.
2020-09-30 03:13:38 -03:00
Rodrigo Locatti e5a1e0a76d
Merge pull request #4724 from lat9nq/fix-vulkan-nvidia-allocate-2
vk_stream_buffer: Fix initializing Vulkan with NVIDIA on Linux
2020-09-26 23:52:49 +00:00
bunnei 442096298e
Merge pull request #4703 from lioncash/desig7
shader/registry: Make use of designated initializers where applicable
2020-09-26 15:23:15 -07:00
lat9nq ca26fd0f42 vk_stream_buffer: Fix initializing Vulkan with NVIDIA on Linux
The previous fix only partially solved the issue, as only certain GPUs that needed 9 or less MiB subtracted would work (i.e. GTX 980 Ti, GT 730). This takes from DXVK's example to divide `heap_size` by 2 to determine `allocable_size`. Additionally tested on my Quadro K4200, which previously required setting it to 12 to boot.
2020-09-25 17:42:59 -04:00
Lioncash 940d85241b vk_command_pool: Move definition of Pool into the cpp file
Allows the implementation details to be changed without recompiling any
files that include this header.
2020-09-25 00:15:52 -04:00
Lioncash 4ed4bba305 vk_command_pool: Make use of override on destructor 2020-09-25 00:14:10 -04:00
Lioncash e0f2db4376 vk_command_pool: Add missing header guard 2020-09-25 00:12:45 -04:00
Levi Behunin bc69cc1511 More forgetting... duh 2020-09-24 22:12:13 -06:00
bunnei 2634e3c6eb
Merge pull request #4711 from lioncash/move5
arithmetic_integer_immediate: Make use of std::move where applicable
2020-09-24 21:02:42 -07:00
Levi Behunin 24c1bb3842 Forgot to apply suggestion here as well 2020-09-24 21:58:51 -06:00
Levi Behunin a19dc3bf00 Address Comments 2020-09-24 21:52:23 -06:00
Levi Behunin d53b79ff5c Start of Integer flags implementation 2020-09-24 16:40:06 -06:00
Lioncash e3a615a616 arithmetic_integer_immediate: Make use of std::move where applicable
Same behavior, minus any redundant atomic reference count increments and
decrements.
2020-09-24 13:28:45 -04:00
ReinUsesLisp 67af0323f0 video_core: Fix instances where msbuild always regenerated host shaders
When HEADER_GENERATOR was included in the DEPENDS section of custom
commands, msbuild assumed this was always modified. Changing this file
is not common so we can remove it from there.
2020-09-23 22:27:17 -03:00
bunnei d66b897a6d
Merge pull request #4674 from ReinUsesLisp/timeline-semaphores
renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphore
2020-09-23 18:24:27 -07:00
Lioncash 77532ebde3 shader/registry: Silence a -Wshadow warning 2020-09-23 15:10:25 -04:00
Lioncash cd6f4f7eed shader/registry: Remove unnecessary namespace qualifiers
Using statements already make these unnecessary.
2020-09-23 15:08:34 -04:00
Lioncash ffeb4ef83e shader/registry: Make use of designated initializers where applicable
Same behavior, less repetition.
2020-09-23 15:06:25 -04:00
Lioncash 0dc6967ff1 control_flow: emplace elements in place within TryQuery()
Places data structures where they'll eventually be moved to to avoid
needing to even move them in the first place.
2020-09-22 22:54:36 -04:00
Lioncash fcd0145eb5 control_flow: Make use of std::move in InsertBranch()
Avoids unnecessary atomic increments and decrements.
2020-09-22 22:48:09 -04:00
Lioncash ff45c39578 General: Make use of std::nullopt where applicable
Allows some implementations to avoid completely zeroing out the internal
buffer of the optional, and instead only set the validity byte within
the structure.

This also makes it consistent how we return empty optionals.
2020-09-22 17:32:33 -04:00
ReinUsesLisp 7003090187 renderer_opengl: Remove emulated mailbox presentation
Emulated mailbox presentation was causing performance issues on
Nvidia's OpenGL driver. Remove it.
2020-09-20 16:29:41 -03:00
ReinUsesLisp 4f5bbe56ba vk_query_cache: Hack counter destructor to avoid reserving queries
This is a hack to destroy all HostCounter instances before the base
class destructor is called. The query cache should be redesigned to have
a proper ownership model instead of using shared pointers.

For now, destroy the host counter hierarchy from the derived class
destructor.
2020-09-19 01:47:29 -03:00
ReinUsesLisp 58b0ae84b5 renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphore
This reworks how host<->device synchronization works on the Vulkan
backend. Instead of "protecting" resources with a fence and signalling
these as free when the fence is known to be signalled by the host GPU,
use timeline semaphores.

Vulkan timeline semaphores allow use to work on a subset of D3D12
fences. As far as we are concerned, timeline semaphores are a value set
by the host or the device that can be waited by either of them.

Taking advantange of this, we can have a monolithically increasing
atomic value for each submission to the graphics queue. Instead of
protecting resources with a fence, we simply store the current logical
tick (the atomic value stored in CPU memory). When we want to know if a
resource is free, it can be compared to the current GPU tick.

This greatly simplifies resource management code and the free status of
resources should have less false negatives.

To workaround bugs in validation layers, when these are attached there's
a thread waiting for timeline semaphores.
2020-09-19 01:46:37 -03:00
Lioncash 91bca9eb0b fermi_2d: Make use of designated initializers
Same behavior, less repetition. We can also ensure all members of Config
are initialized.
2020-09-18 13:55:21 -04:00
Rodrigo Locatti 31461589c5
Merge pull request #4672 from lioncash/narrowing
decoder/texture: Eliminate narrowing conversion in GetTldCode()
2020-09-17 21:17:54 +00:00
Lioncash 4944d48ee8 decode/image: Eliminate switch fallthrough in DecodeImage()
Fortunately this didn't result in any issues, given the block that code
was falling through to would immediately break.
2020-09-17 15:12:18 -04:00
Lioncash ffc66f089d decoder/texture: Eliminate narrowing conversion in GetTldCode()
The assignment was previously truncating a u64 value to a bool.
2020-09-17 15:04:17 -04:00
ReinUsesLisp eb914b6c50 video_core: Enforce -Werror=switch
This forces us to fix all -Wswitch warnings in video_core.
2020-09-16 17:48:01 -03:00
ReinUsesLisp 9e87193725 video_core: Remove all Core::System references in renderer
Now that the GPU is initialized when video backends are initialized,
it's no longer needed to query components once the game is running: it
can be done when yuzu is booting.

This allows us to pass components between constructors and in the
process remove all Core::System references in the video backend.
2020-09-06 05:28:48 -03:00
bunnei 94a25b75a0
Merge pull request #4611 from lioncash/xbyak2
externals: Update Xbyak to 5.96
2020-09-03 20:24:27 -04:00
bunnei 39319f09d8
Merge pull request #4575 from lioncash/async
async_shaders: Mark getters as const member functions
2020-09-03 11:34:30 -04:00
ReinUsesLisp c573920c01 vk_device: Fix driver id check on AMD for VK_EXT_extended_dynamic_state
'driver_id' can only be known on Vulkan 1.1 after creating a logical
device. Move the driver id check to disable
VK_EXT_extended_dynamic_state after the logical device is successfully
initialized.

The Vulkan device will have the extension enabled but it will not be
used.
2020-08-30 20:22:48 -03:00
Lioncash a5dcccfdd2 externals: Update Xbyak to 5.96
I made a request on the Xbyak issue tracker to allow some constructors
to be constexpr in order to avoid static constructors from needing to
execute for some of our register constants.

This request was implemented, so this updates Xbyak so that we can make
use of it.
2020-08-30 05:09:48 -04:00
ReinUsesLisp fe90c4fd7b vk_device: Blacklist AMD proprietary from VK_EXT_extended_dynamic_state
Vertex binding's <stride> is bugged on AMD's proprietary drivers when
using VK_EXT_extended_dynamic_state. Blacklist it for now while we
investigate how to report this issue to AMD.
2020-08-28 19:14:57 -03:00
bunnei 9864da7d43
Merge pull request #4524 from lioncash/memory-log
shader/memory: Amend UNIMPLEMENTED_IF_MSG without a message
2020-08-27 00:16:10 -04:00
bunnei 1bb8c27a70
Merge pull request #4569 from ReinUsesLisp/glsl-cmake
video_core/host_shaders: Add CMake integration for string shaders
2020-08-26 22:57:39 -04:00
bunnei 1e2a92918b
Merge pull request #4555 from ReinUsesLisp/fix-primitive-topology
vk_state_tracker: Fix primitive topology
2020-08-26 22:19:52 -04:00
Lioncash 7b50c48df7 memory_manager: Make use of [[nodiscard]] in the interface 2020-08-26 20:15:03 -04:00
Lioncash d12d59f62a memory_manager: Make operator+ const qualified
This doesn't modify member state, so it can be marked as const.
2020-08-26 20:11:58 -04:00
bunnei 902bf6d37d
Merge pull request #4574 from lioncash/const-fn
memory_manager: Mark IsGranularRange() as a const member function
2020-08-25 11:24:13 -04:00
bunnei bb752df736
Merge pull request #4542 from ReinUsesLisp/gpu-init-base
video_core: Initialize renderer with a GPU
2020-08-24 22:56:11 -04:00
Lioncash bafef3d1c9 async_shaders: Mark getters as const member functions
While we're at it, we can also mark them as nodiscard.
2020-08-24 01:15:50 -04:00
Lioncash 5bce81c3d6 memory_manager: Mark IsGranularRange() as a const member function
This doesn't modify internal member state, so it can be marked as const.
2020-08-24 00:37:57 -04:00
Lioncash bae4e6c2f5 gl_texture_cache: Take std::string by reference in DecorateViewName()
LabelGLObject takes a string_view, so we don't need to make copies of
the std::string.
2020-08-23 23:36:33 -04:00
Lioncash f3bb52c0a9 video_core/fence_manager: Remove unnecessary includes
Avoids pulling in unnecessary things that can cause rebuilds when they
aren't required.
2020-08-23 21:44:50 -04:00
ReinUsesLisp 91df2beee3 video_core/host_shaders: Add CMake integration for string shaders
Add the necessary CMake code to copy the contents in a string source
shader (GLSL or GLASM) to a header file then consumed by video_core
files.

This allows editting GLSL in its own files without having to maintain
them in source files.

For now, only OpenGL presentation shaders are moved, but we can add
GLASM presentation shaders and static SPIR-V generation through
glslangValidator in the future.
2020-08-23 21:37:20 -03:00
ReinUsesLisp 0eaf7e1daa gl_shader_util: Use std::string_view instead of star pointer
This allows us passing any type of string and hinting the length of the
string to the OpenGL driver.
2020-08-23 21:23:54 -03:00
ReinUsesLisp da53bcee60 video_core: Initialize renderer with a GPU
Add an extra step in GPU initialization to be able to initialize render
backends with a valid GPU instance.
2020-08-22 01:51:45 -03:00
bunnei baff9ffcac
Merge pull request #4521 from lioncash/optionalcache
gl_shader_disk_cache: Make use of std::nullopt where applicable
2020-08-21 23:56:55 -04:00
bunnei 53fbf8e206
Merge pull request #4523 from lioncash/self-assign
macro-interpreter: Resolve -Wself-assign-field warning
2020-08-21 18:25:53 -04:00
ReinUsesLisp aed6011d7c vk_state_tracker: Fix primitive topology
State track the current primitive topology with a regular comparison
instead of using dirty flags.

This fixes a bug in dirty flags for this particular state and it also
avoids unnecessary state changes as this property is stored in a
frequently changed bit field.
2020-08-20 23:07:30 -03:00
ReinUsesLisp c5a78f4480 vk_device: Use Vulkan 1.0 properly
Enable the required capabilities to use Vulkan 1.0 without validation
errors and disable those that are not compatible with it.
2020-08-20 16:55:22 -03:00
ReinUsesLisp 29a0ca2391 renderer_vulkan: Create a Vulkan 1.0 instance when 1.1 is not available
This commit doesn't make yuzu compatible with Vulkan 1.0 yet, it only
creates an 1.0 instance.
2020-08-20 16:55:22 -03:00
bunnei 3ea3de4ecd
Merge pull request #4546 from lioncash/telemetry
common/telemetry: Migrate namespace into the Common namespace
2020-08-20 14:29:13 -04:00
bunnei 2d2e235bcf
Merge pull request #4522 from lioncash/vulk-copy
vulkan/wrapper: Avoid unnecessary copy in EnumerateInstanceExtensionProperties()
2020-08-18 19:31:35 -04:00
Lioncash f6bb905182 common/telemetry: Migrate namespace into the Common namespace
Migrates the Telemetry namespace into the Common namespace to make the
code consistent with the rest of our common code.
2020-08-18 15:08:32 -04:00
bunnei 56c6a5def8
Merge pull request #4535 from lioncash/fileutil
common/fileutil: Convert namespace to Common::FS
2020-08-17 22:35:30 -04:00
David cbaf1bc711
Merge pull request #4443 from ameerj/vk-async-shaders
vulkan_renderer: Async shader/graphics pipeline compilation
2020-08-17 15:06:11 +10:00
David a91acd5365
Merge pull request #4520 from lioncash/pessimize
async_shaders: Resolve -Wpessimizing-move warning
2020-08-17 14:36:05 +10:00
ameerj fde8102a41 Remove unneeded newlines, optional Registry in shader params
Addressing feedback from Rodrigo
2020-08-16 16:33:21 -04:00
Ameer J f49ffdd648 Morph: Update worker allocation comment
Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com>
2020-08-16 12:02:22 -04:00
ameerj 1b829fbd7a move thread 1/4 count computation into allocate workers method 2020-08-16 12:02:22 -04:00
ameerj 31a76410e8 Address feedback, add shader compile notifier, update setting text 2020-08-16 12:02:22 -04:00
ameerj c02464f64e Vk Async Worker directly emplace in cache 2020-08-16 12:02:22 -04:00
ameerj 4539073ce1 Address feedback. Bruteforce delete duplicates 2020-08-16 12:02:22 -04:00
ameerj 6ac97405df Vk Async pipeline compilation 2020-08-16 12:02:22 -04:00
Lioncash c4ed791164 common/fileutil: Convert namespace to Common::FS
Migrates a remaining common file over to the Common namespace, making it
consistent with the rest of common files.

This also allows for high-traffic FS related code to alias the
filesystem function namespace as

namespace FS = Common::FS;

for more concise typing.
2020-08-16 06:52:40 -04:00
bunnei db96034ea4
Merge pull request #4528 from lioncash/discard
common: Make use of [[nodiscard]] where applicable
2020-08-16 01:47:54 -04:00
bunnei 404362e1b0
Merge pull request #4519 from lioncash/semi
maxwell_3d: Resolve -Wextra-semi warning
2020-08-16 00:55:15 -04:00
Lioncash 1ee060ca0d common/compression: Roll back std::span changes
Seems like all compilers don't support std::span yet.
2020-08-15 17:17:56 -04:00
bunnei feb243b08d
Merge pull request #4416 from lioncash/span
lz4_compression/zstd_compression: Make use of std::span in interfaces
2020-08-15 00:53:11 -04:00
bunnei 2dace90346
Merge pull request #4453 from ReinUsesLisp/block-to-linear
textures/decoders: Fix block linear to pitch copies
2020-08-14 19:52:12 -04:00
Lioncash dcc5562cd5 shader/memory: Amend UNIMPLEMENTED_IF_MSG without a message
We need to provide a message for this variant of the macro, so we can
simply log out the type being used.
2020-08-14 08:38:37 -04:00
Lioncash 34ec64233a macro-interpreter: Resolve -Wself-assign-field warning
This was assigning the field to itself, which is a no-op. The size
doesn't change between its initial assignment and this one, so this is a
safe change to make.
2020-08-14 08:26:50 -04:00
Lioncash 167d36ec3c vulkan/wrapper: Avoid unnecessary copy in EnumerateInstanceExtensionProperties()
Given this is implicitly creating a std::optional, we can move the
vector into it.
2020-08-14 08:23:49 -04:00
Lioncash c8135b3c18 gl_shader_disk_cache: Make use of std::nullopt where applicable
Allows the compiler to avoid unnecessarily zeroing out the internal
buffer of std::optional on some implementations.
2020-08-14 08:20:44 -04:00
Lioncash 6b13d08822 async_shaders: Resolve -Wpessimizing-move warning
Prevents pessimization of the move constructor (which thankfully didn't
actually happen in practice here, given std::thread isn't copyable).
2020-08-14 08:16:50 -04:00
Lioncash 83d8bf9af9 maxwell_3d: Resolve -Wextra-semi warning
Semicolons after a function definition aren't necessary.
2020-08-14 08:13:41 -04:00
bunnei a9de967fa3
Merge pull request #4514 from Morph1984/worker-alloc
gl_shader_cache: Use std::max() for determining num_workers
2020-08-13 17:06:57 -04:00
Lioncash b724a4d90c General: Tidy up clang-format warnings part 2 2020-08-13 14:19:08 -04:00
Morph e0ff98dd34 gl_shader_cache: Use std::max() for determining num_workers
Does not allocate more threads than available in the host system for boot-time shader compilation and always allocates at least 1 thread if hardware_concurrency() returns 0.
2020-08-12 09:23:34 -04:00
ReinUsesLisp f00641459e textures/decoders: Fix block linear to pitch copies
There were two issues with block linear copies. First the swizzling was
wrong and this commit reimplements them.

The other issue was that these copies are generally used to download
render targets from the GPU and yuzu was not downloading them from
host GPU memory unless the extreme GPU accuracy setting was selected.
This commit enables cached memory reads for all accuracy levels.

- Fixes level thumbnails in Super Mario Maker 2.
2020-08-10 20:45:03 -03:00
bunnei 5429ea0e69
Merge pull request #4389 from ogniK5377/redundant-format-type
video_core: Remove redundant pixel format type
2020-08-07 09:33:58 -04:00
bunnei f11628b9b7
Merge pull request #4430 from bunnei/new-gpu-vmm
hle: nvdrv: Rewrite of GPU memory management.
2020-08-04 18:44:26 -04:00
bunnei efd1b57d03
Merge pull request #4445 from Morph1984/async-threads
renderer_opengl: Use 1/4 of all threads for async shader compilation
2020-08-04 18:43:42 -04:00
bunnei 0ae267bf77
Merge pull request #4469 from lioncash/missing
vk_texture_cache: Silence -Wmissing-field-initializer warnings
2020-08-04 06:59:51 -07:00
Lioncash 06809ad7bc vulkan: Silence more -Wmissing-field-initializer warnings 2020-08-03 12:28:57 -04:00