Commit graph

3271 commits

Author SHA1 Message Date
Fernando Sahmkow 45b6d2d349 rasterizer_cache: Expose FlushObject to Child classes and allow redefining of Register and Unregister 2019-02-27 21:57:33 -04:00
bunnei f15e2dd881
Merge pull request #2163 from ReinUsesLisp/bitset-dirty
maxwell_3d: Use std::bitset to manage dirty flags
2019-02-27 20:50:08 -05:00
ReinUsesLisp 27ddbeb01c gl_rasterizer_cache: Create texture views for array discrepancies
When a texture is sampled in a shader with a different array mode than
the cached state, create a texture view and bind that to the shader
instead.
2019-02-27 14:41:06 -03:00
bunnei 66e023fba2
Merge pull request #2167 from lioncash/namespace
common: Move Quaternion, Rectangle, Vec2, Vec3, and Vec4 into the Common namespace
2019-02-27 11:19:53 -05:00
Lioncash 16ea93c11e vk_memory_manager: Reorder constructor initializer list in terms of member declaration order
Reorders members in the order that they would actually be initialized
in. Silences a -Wreorder warning.
2019-02-27 11:08:19 -05:00
Lioncash a6a783b3dc gl_rasterizer: Reorder constructor initializer list in terms of member declaration order
Orders the members in the order they would actually be initialized in.
Silences a -Wreorder warning.
2019-02-27 11:08:19 -05:00
Lioncash e7eff72e83 gl_shader_disk_cache: Remove #pragma once from cpp file
This is only necessary in headers. Silences a warning with clang.
2019-02-27 11:02:49 -05:00
Lioncash b9238edd0d common/math_util: Move contents into the Common namespace
These types are within the common library, so they should be within the
Common namespace.
2019-02-27 03:38:39 -05:00
ReinUsesLisp 0ad3c031f4 gl_rasterizer_cache: Move format conversion to its own file 2019-02-26 20:08:27 -03:00
ReinUsesLisp 0ccd490fcd decoders: Minor style changes 2019-02-26 20:08:27 -03:00
ReinUsesLisp d91e35a50a renderer_opengl: Update pixel format tracking 2019-02-26 03:47:16 -03:00
ReinUsesLisp 5219edd715 maxwell_3d: Use std::bitset to manage dirty flags 2019-02-26 03:01:48 -03:00
ReinUsesLisp 730eb1dad7 vk_stream_buffer: Remove copy code path 2019-02-26 02:09:43 -03:00
ReinUsesLisp 5ca63d0675 shader/decode: Remove extras from MetaTexture 2019-02-26 00:11:30 -03:00
ReinUsesLisp 48e6f77c03 shader/decode: Split memory and texture instructions decoding 2019-02-26 00:11:30 -03:00
Lioncash c1b2e35625 shader/track: Resolve variable shadowing warnings 2019-02-25 09:10:59 -05:00
bunnei c07987dfab
Merge pull request #2118 from FernandoS27/ipa-improve
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-24 23:04:22 -05:00
bunnei c4243c07cc
Merge pull request #2119 from FernandoS27/fix-copy
rasterizer_cache_gl: Only do fast layered copy on the same format.
2019-02-24 23:03:52 -05:00
bunnei 90c780e6f3
Merge pull request #2139 from degasus/dma_pusher
video_core/dma_pusher: The full list of headers at once.
2019-02-24 04:15:49 -05:00
ReinUsesLisp 33a0597603 vk_stream_buffer: Implement a stream buffer
This manages two kinds of streaming buffers: one for unified memory
models and one for dedicated GPUs. The first one skips the copy from the
staging buffer to the real buffer, since it creates an unified buffer.

This implementation waits for all fences to finish their operation
before "invalidating". This is suboptimal since it should allocate
another buffer or start searching from the beginning. There is room for
improvement here.

This could also handle AMD's "pinned" memory (a heap with 256 MiB) that
seems to be designed for buffer streaming.
2019-02-24 04:27:51 -03:00
ReinUsesLisp 281a8bf259 vk_resource_manager: Minor VKFenceWatch changes 2019-02-24 04:19:04 -03:00
bunnei f7090bacc5
Merge pull request #2146 from ReinUsesLisp/vulkan-scheduler
vk_scheduler: Implement a scheduler
2019-02-23 23:32:43 -05:00
bunnei d062991643
Merge pull request #2150 from ReinUsesLisp/fixup-layer-swizzle
gl_rasterizer_cache: Fixup parameter order in layered swizzle
2019-02-23 23:31:34 -05:00
ReinUsesLisp 92050c4d86 vk_memory_manager: Fixup commit interval allocation
VKMemoryCommitImpl was using as the end of its interval "begin + end".
That ended up wasting memory.
2019-02-24 01:04:41 -03:00
ReinUsesLisp abef11a540 gl_rasterizer_cache: Fixup parameter order in layered swizzle 2019-02-23 23:27:30 -03:00
ReinUsesLisp f546fb35ed vk_scheduler: Implement a scheduler
The scheduler abstracts command buffer and fence management with an
interface that's able to do OpenGL-like operations on Vulkan command
buffers.

It returns by value a command buffer and fence that have to be used for
subsequent operations until Flush or Finish is executed, after that the
current execution context (the pair of command buffers and fences) gets
invalidated a new one must be fetched. Thankfully validation layers will
quickly detect if this is skipped throwing an error due to modifications
to a sent command buffer.
2019-02-22 01:33:32 -03:00
bunnei 94b27bb8a5
Merge pull request #2138 from ReinUsesLisp/vulkan-memory-manager
vk_memory_manager: Implement memory manager
2019-02-21 22:26:54 -05:00
bunnei 9539c4203b
Merge pull request #2125 from ReinUsesLisp/fixup-glstate
gl_state: Synchronize gl_state even when state is disabled
2019-02-20 21:47:46 -05:00
bunnei ae437320c8
Merge pull request #2130 from lioncash/system_engine
video_core: Remove usages of System::GetInstance() within the engines
2019-02-20 21:24:56 -05:00
Markus Wick 6dd40976d0 video_core/dma_pusher: Simplyfy Step() logic.
As fetching command list headers and and the list of command headers is a fixed 1:1 relation now, they can be implemented within a single call.
This cleans up the Step() logic quite a bit.
2019-02-19 10:28:42 +01:00
Markus Wick 717394c980 video_core/dma_pusher: The full list of headers at once.
Fetching every u32 from memory leads to a big overhead. So let's fetch all of them as a block if possible.
This reduces the Memory::* calls by the dma_pusher by a factor of 10.
2019-02-19 09:58:38 +01:00
ReinUsesLisp b675c97cdd vk_memory_manager: Implement memory manager
A memory manager object handles the memory allocations for a device. It
allocates chunks of Vulkan memory objects and then suballocates.
2019-02-19 03:42:28 -03:00
bunnei 4bce08d497
Merge pull request #2122 from ReinUsesLisp/vulkan-resource-manager
vk_resource_manager: Implement fence and command buffer allocator
2019-02-18 21:05:28 -05:00
bunnei 4699fdca8f
Merge pull request #2127 from FearlessTobi/fix-screenshot-srgb
renderer_opengl: respect the sRGB colorspace for the screenshot feature
2019-02-16 15:36:00 -05:00
Lioncash a8fa5019b5 video_core: Remove usages of System::GetInstance() within the engines
Avoids the use of the global accessor in favor of explicitly making the
system a dependency within the interface.
2019-02-15 22:06:23 -05:00
James Rowe 99da6362c4
Merge pull request #2123 from lioncash/coretiming-global
core_timing: De-globalize core_timing facilities
2019-02-15 19:52:11 -07:00
Lioncash bd983414f6 core_timing: Convert core timing into a class
Gets rid of the largest set of mutable global state within the core.
This also paves a way for eliminating usages of GetInstance() on the
System class as a follow-up.

Note that no behavioral changes have been made, and this simply extracts
the functionality into a class. This also has the benefit of making
dependencies on the core timing functionality explicit within the
relevant interfaces.
2019-02-15 21:50:25 -05:00
fearlessTobi 9a56b99fa4 renderer_opengl: respect the sRGB colorspace for the screenshot feature
Previously, we were completely ignoring for screenshots whether the game uses RGB or sRGB.
This resulted in screenshot colors that looked off for some titles.
2019-02-15 21:27:29 +01:00
ReinUsesLisp 8dfc81239f gl_state: Synchronize gl_state even when state is disabled
There are some potential edge cases where gl_state may fail to track the
state if a related state changes while the toggle is disabled or it
didn't change. This addresses that.
2019-02-15 01:30:14 -03:00
bunnei 4327f430f1
Merge pull request #2112 from lioncash/shadowing
gl_rasterizer_cache: Get rid of variable shadowing
2019-02-14 21:45:20 -05:00
bunnei a8fc5d6edd
Merge pull request #2111 from ReinUsesLisp/fetch-fix
gl_shader_decompiler: Re-implement TLDS lod
2019-02-14 21:42:34 -05:00
ReinUsesLisp ae6c052ed9 vk_resource_manager: Implement a command buffer pool with VKFencedPool 2019-02-14 18:44:26 -03:00
ReinUsesLisp a2b6de7e9f vk_resource_manager: Add VKFencedPool interface
Handles a pool of resources protected by fences. Manages resource
overflow allocating more resources.

This class is intended to be used through inheritance.
2019-02-14 18:44:26 -03:00
ReinUsesLisp 0ffdd0a683 vk_resource_manager: Implement VKResourceManager and fence allocator
CommitFence iterates a pool of fences until one is found. If all fences
are being used at the same time, allocate more.
2019-02-14 18:44:26 -03:00
ReinUsesLisp aa0b6babda vk_resource_manager: Implement VKFenceWatch
A fence watch is used to keep track of the usage of a fence and protect
a resource or set of resources without having to inherit from their
handlers.
2019-02-14 18:44:26 -03:00
ReinUsesLisp 25c2fe1c6b vk_resource_manager: Implement VKFence
Fences take ownership of objects, protecting them from GPU-side or
driver-side concurrent access. They must be commited from the resource
manager. Their usage flow is: commit the fence from the resource
manager, protect resources with it and use them, send the fence to an
execution queue and Wait for it if needed and then call Release. Used
resources will automatically be signaled when they are free to be
reused.
2019-02-14 18:44:26 -03:00
ReinUsesLisp 33a4cebc22 vk_resource_manager: Add VKResource interface
VKResource is an interface that gets signaled by a fence when it is free
to be reused.
2019-02-14 18:36:15 -03:00
bunnei fcc3aa0bbf
Merge pull request #2113 from ReinUsesLisp/vulkan-base
vulkan: Add dependencies and device abstraction
2019-02-14 10:06:48 -05:00
Fernando Sahmkow 10682ad7e0 shader_decompiler: Improve Accuracy of Attribute Interpolation. 2019-02-14 03:25:07 -04:00
Fernando Sahmkow bb41683394 rasterizer_cache_gl: Only do fast layered copy on the same format. As
glCopyImageSubData does not support different formats.
2019-02-13 16:55:00 -04:00
bunnei cd542d5aac
Merge pull request #2099 from greggameplayer/BGRA8-Framebuffer-Real
Implement BGRA8 framebuffer format
2019-02-12 21:44:20 -05:00
ReinUsesLisp 8beca060d1 vk_device: Abstract device handling into a class
VKDevice contains all the data required to manage and initialize a
physical device. Its intention is to be passed across Vulkan objects to
query device-specific data (for example the logical device and the
dispatch loader).
2019-02-12 21:43:02 -03:00
Lioncash 86b55cb6df renderer_opengl: Remove reference to global system instance
We already store a reference to the system instance that the renderer is
created with, so we don't need to refer to the system instance via
Core::System::GetInstance()
2019-02-12 19:33:22 -05:00
bunnei 8135f4bfce
Merge pull request #2110 from lioncash/namespace
core_timing: Rename CoreTiming namespace to Core::Timing
2019-02-12 19:26:37 -05:00
bunnei c440ecfafe
Merge pull request #2104 from ReinUsesLisp/compute-assert
kepler_compute: Fixup assert and rename the engine
2019-02-12 19:24:34 -05:00
Lioncash 054e39647c gl_rasterizer_cache: Remove unnecessary newline 2019-02-12 16:56:19 -05:00
Lioncash e25c464c02 gl_rasterizer_cache: Get rid of variable shadowing
Avoids shadowing the members of the struct itself, which results in a
-Wshadow warning.
2019-02-12 16:46:39 -05:00
ReinUsesLisp 18fe910957 renderer_vulkan: Add declarations file
This file is intended to be included instead of vulkan/vulkan.hpp. It
includes declarations of unique handlers using a dynamic dispatcher
instead of a static one (which would require linking to a Vulkan
library).
2019-02-12 18:33:02 -03:00
ReinUsesLisp e60d4d70bc gl_shader_decompiler: Re-implement TLDS lod 2019-02-12 17:03:07 -03:00
Lioncash 48d9d66dc5 core_timing: Rename CoreTiming namespace to Core::Timing
Places all of the timing-related functionality under the existing Core
namespace to keep things consistent, rather than having the timing
utilities sitting in its own completely separate namespace.
2019-02-12 12:42:17 -05:00
bunnei 444231a83d
Merge pull request #2108 from FernandoS27/fix-cc
Fix incorrect value for CC bit in IADD
2019-02-12 10:39:03 -05:00
bunnei c1accfefde
Merge pull request #2109 from FernandoS27/fix-f2i
Corrected F2I None mode to RoundEven.
2019-02-12 10:20:29 -05:00
bunnei 27e5efd265
Merge pull request #2068 from ReinUsesLisp/shader-cleanup-textures
shader_ir: Clean texture management code
2019-02-12 10:20:15 -05:00
Fernando Sahmkow f5ec165e8c Corrected F2I None mode to RoundEven. 2019-02-11 18:46:45 -04:00
Fernando Sahmkow edd668047c Fix incorrect value for CC bit in IADD 2019-02-11 16:44:43 -04:00
ReinUsesLisp 1ddcd0e6f0 kepler_compute: Fixup assert and rename engines
When I originally added the compute assert I used the wrong
documentation. This addresses that.

The dispatch register was tested with homebrew against hardware and is
triggered by some games (e.g. Super Mario Odyssey). What exactly is
missing to get a valid program bound by this engine requires more
investigation.
2019-02-10 19:29:33 -03:00
greggameplayer a6a73d8892 Implement BGRA8 framebuffer format 2019-02-09 23:44:01 +01:00
bunnei 1d98027a0e
Merge pull request #1904 from bunnei/better-fermi-copy
gl_rasterizer: Implement a more accurate fermi 2D copy.
2019-02-08 23:32:24 -05:00
Fernando Sahmkow e543320129 Implement linear textures (#2089) 2019-02-08 18:28:01 -05:00
ReinUsesLisp e36e7ae74e gl_rasterizer_cache: Fixup texture view parameters
These parameters were declared as constants and passed to glTextureView
but then they were removed on a rabase. This addresses that mistake.
2019-02-08 18:32:58 -03:00
ReinUsesLisp 889c646ac0 shader_ir: Remove F4 prefix to texture operations
This was originally included because texture operations returned a vec4.
These operations now return a single float and the F4 prefix doesn't
mean anything.
2019-02-07 17:36:46 -03:00
ReinUsesLisp d62b0a9e29 shader_ir: Clean texture management code
Previous code relied on GLSL parameter order (something that's always
ill-formed on an IR design). This approach passes spatial coordiantes
through operation nodes and array and depth compare values in the the
texture metadata. It still contains an "extra" vector containing generic
nodes for bias and component index (for example) which is still a bit
ill-formed but it should be better than the previous approach.
2019-02-07 00:46:13 -03:00
bunnei f09d1dffd1
Merge pull request #2083 from ReinUsesLisp/shader-ir-cbuf-tracking
shader/track: Add a more permissive global memory tracking
2019-02-06 21:56:14 -05:00
bunnei 35e1118766 gl_rasterizer_cache: Mark surface copy destinations as modified. 2019-02-06 21:54:25 -05:00
bunnei dd1aab5446 gl_rasterizer: Implement a more accurate fermi 2D copy.
- This is a blit, use the blit registers.
2019-02-06 21:54:21 -05:00
Frederic L d0ac624403 gl_shader_disk_cache: Check LZ4 size limit
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-02-06 22:23:41 -03:00
Frederic L 9f0b247cf6 gl_shader_disk_cache: Consider compressed size zero as an error
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-02-06 22:23:41 -03:00
ReinUsesLisp e6a2245304 gl_shader_disk_cache: Use unordered containers 2019-02-06 22:23:41 -03:00
ReinUsesLisp e147ed4fc0 gl_shader_cache: Fixup GLSL unique identifiers 2019-02-06 22:23:40 -03:00
ReinUsesLisp eb73247433 gl_shader_cache: Link loading screen with disk shader cache load 2019-02-06 22:23:40 -03:00
ReinUsesLisp df0f31f44e gl_shader_cache: Set GL_PROGRAM_SEPARABLE to dumped shaders
i965 (and probably all mesa drivers) require GL_PROGRAM_SEPARABLE when using
glProgramBinary. This is probably required by the standard but it's ignored by
permisive proprietary drivers.
2019-02-06 22:23:40 -03:00
ReinUsesLisp 7fefec585c gl_shader_disk_cache: Pass core system as argument and guard against games without title ids 2019-02-06 22:23:40 -03:00
ReinUsesLisp 2bc6a699dc gl_shader_disk_cache: Guard reads and writes against failure 2019-02-06 22:23:40 -03:00
ReinUsesLisp 750abcc23d gl_shader_disk_cache: Address miscellaneous feedback 2019-02-06 22:23:40 -03:00
ReinUsesLisp 8ee3666a3c gl_shader_disk_cache: Pass return values returning instead of by parameters 2019-02-06 22:23:40 -03:00
ReinUsesLisp ed956569a4 gl_shader_disk_cache: Compress program binaries using LZ4 2019-02-06 22:23:39 -03:00
ReinUsesLisp f087639e4a gl_shader_disk_cache: Compress GLSL code using LZ4 2019-02-06 22:23:39 -03:00
ReinUsesLisp cfb20c4c9d gl_shader_disk_cache: Save GLSL and entries into the precompiled file 2019-02-06 22:23:39 -03:00
ReinUsesLisp e78da8dc1f settings: Hide shader cache behind a setting 2019-02-06 22:20:57 -03:00
ReinUsesLisp be4641c43f gl_shader_disk_cache: Invalidate shader cache changes with CMake hash 2019-02-06 22:20:57 -03:00
ReinUsesLisp a3703f5767 gl_shader_cache: Refactor to support disk shader cache 2019-02-06 22:20:57 -03:00
ReinUsesLisp 4039086226 gl_shader_disk_cache: Add transferable cache invalidation 2019-02-06 22:20:57 -03:00
ReinUsesLisp a1faed9950 gl_shader_disk_cache: Add precompiled load 2019-02-06 22:20:57 -03:00
ReinUsesLisp 57fb15d2a3 gl_shader_disk_cache: Add precompiled save 2019-02-06 22:20:57 -03:00
ReinUsesLisp 3435cd8d5e gl_shader_disk_cache: Add transferable load 2019-02-06 22:20:57 -03:00
ReinUsesLisp b1efceec89 gl_shader_disk_cache: Add transferable stores 2019-02-06 22:20:57 -03:00
ReinUsesLisp 98be5a4928 gl_shader_disk_cache: Add ShaderDiskCacheOpenGL class and helpers 2019-02-06 22:20:57 -03:00
ReinUsesLisp 145c3ac89e gl_shader_disk_cache: Add file and move BaseBindings declaration 2019-02-06 22:20:57 -03:00
ReinUsesLisp c2c5260fd7 gl_shader_decompiler: Remove name entries 2019-02-06 22:20:57 -03:00
ReinUsesLisp 8b11368671 gl_shader_util: Add parameter to handle retrievable programs 2019-02-06 22:20:57 -03:00
ReinUsesLisp 0ed5d728ca rasterizer_interface: Add disk cache entry for the rasterizer 2019-02-06 22:20:57 -03:00
ReinUsesLisp 049050856f shader_decode: Implement LDG and basic cbuf tracking 2019-02-06 22:20:57 -03:00
bunnei 10ab714fe0
Merge pull request #2042 from ReinUsesLisp/nouveau-tex
maxwell_3d: Allow texture handles with TIC id zero
2019-02-06 20:19:20 -05:00
bunnei 40ac058557
Merge pull request #2071 from ReinUsesLisp/dsa-texture
gl_rasterizer: Use DSA for textures and move swizzling to texture state
2019-02-06 20:17:59 -05:00
bunnei 7aa7d8f4ff
Merge pull request #2085 from ReinUsesLisp/cube-minus-one
video_core/texture: Fix BitField size for depth_minus_one
2019-02-05 17:15:26 -05:00
bunnei 72c70d6808
Merge pull request #2081 from ReinUsesLisp/lmem-64
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
ReinUsesLisp b5e685b297 video_core/texture: Fix BitField size for depth_minus_one 2019-02-05 04:32:06 -03:00
bunnei bb4549a73d
Merge pull request #2082 from FernandoS27/txq-stl
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Mat M a568cd805b
Update src/video_core/engines/shader_bytecode.h
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03 21:27:26 -04:00
Fernando Sahmkow 0306c50339 Fix TXQ not using the component mask. 2019-02-03 18:17:18 -04:00
ReinUsesLisp dfa7be5ddf shader_ir/memory: Add ST_L 64 and 128 bits stores 2019-02-03 19:08:10 -03:00
ReinUsesLisp 0d1d755086 shader/track: Search inside of conditional nodes
Some games search conditionally use global memory instructions. This
allows the heuristic to search inside conditional nodes for the source
constant buffer.
2019-02-03 17:21:20 -03:00
ReinUsesLisp 42b75e8be8 shader_ir: Rename BasicBlock to NodeBlock
It's not always used as a basic block. Rename it for consistency.
2019-02-03 17:21:20 -03:00
ReinUsesLisp 6a6fabea58 shader_ir: Pass decoded nodes as a whole instead of per basic blocks
Some games call LDG at the top of a basic block, making the tracking
heuristic to fail. This commit lets the heuristic the decoded nodes as a
whole instead of per basic blocks.

This may lead to some false positives but allows it the heuristic to
track cases it previously couldn't.
2019-02-03 17:21:20 -03:00
ReinUsesLisp 2bdbb90af7 video_core: Assert on invalid GPU to CPU address queries 2019-02-03 04:58:40 -03:00
ReinUsesLisp 04e68e9738 maxwell_3d: Allow sampler handles with TSC id zero 2019-02-03 04:58:40 -03:00
ReinUsesLisp 390721a561 maxwell_3d: Allow texture handles with TIC id zero
Also remove "enabled" field from Tegra::Texture::FullTextureInfo because
it would become unused.
2019-02-03 04:58:24 -03:00
ReinUsesLisp e01a9de35f memory_manager: Check for reserved page status 2019-02-03 04:58:24 -03:00
ReinUsesLisp f61c1ed246 shader_ir/memory: Add LD_L 128 bits loads 2019-02-03 00:35:34 -03:00
ReinUsesLisp 9feb68085d shader_bytecode: Rename BytesN enums to BitsN 2019-02-03 00:25:40 -03:00
ReinUsesLisp 0be835132c shader_ir/memory: Add LD_L 64 bits loads 2019-02-03 00:25:40 -03:00
bunnei eceab45dac
Merge pull request #2074 from ReinUsesLisp/shader-ir-unify-offset
shader_ir: Unify constant buffer offset values
2019-02-01 13:24:04 -05:00
bunnei 2d226ff8ac
Merge pull request #2067 from ReinUsesLisp/workaround-fb
gl_rasterizer: Workaround invalid zeta clears
2019-02-01 12:50:09 -05:00
ReinUsesLisp 26f8a700a7 rasterizer_interface: Remove unused AccelerateFill operation 2019-02-01 03:02:22 -03:00
ReinUsesLisp 13222f94c0 video_core: Remove unused Fill surface type 2019-02-01 02:57:47 -03:00
ReinUsesLisp 3e80b08944 gl_rasterizer_cache: Fixup test clause 2019-01-30 19:10:35 -03:00
Mat M 911587fb8d gl_rasterizer_cache: Guard clause swizzle testing
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-01-30 19:10:35 -03:00
ReinUsesLisp 220df45b7d gl_state: Remove texture target tracking 2019-01-30 19:10:35 -03:00
ReinUsesLisp 704744bb72 gl_rasterizer_cache: Move swizzling to textures instead of state 2019-01-30 19:10:35 -03:00
ReinUsesLisp 3bbaa98c78 gl_state: Use DSA and multi bind to update texture bindings 2019-01-30 19:10:11 -03:00
ReinUsesLisp 4b676e7786 gl_rasterizer: Use DSA for textures 2019-01-30 19:10:11 -03:00
Hexagon12 35480167b1
Merge pull request #2076 from lioncash/enc
video_core/dma_pusher: Silence C4828 warnings
2019-01-30 19:42:15 +02:00
Lioncash 0b594f3344 video_core/dma_pusher: Silence C4828 warnings
This was previously causing:

warning C4828: The file contains a character starting at offset 0xa33
that is illegal in the current source character set (codepage 65001).

warnings on Windows when compiling yuzu.
2019-01-30 12:36:31 -05:00
bunnei 92b18345a8
Merge pull request #1485 from FernandoS27/render-info
Add more info into textures' object labels
2019-01-30 12:35:56 -05:00
ReinUsesLisp 477d616f7d shader_ir: Unify constant buffer offset values
Constant buffer values on the shader IR were using different offsets if
the access direct or indirect. cbuf34 has a non-multiplied offset while
cbuf36 does. On shader decoding this commit multiplies it by four on
cbuf34 queries.
2019-01-30 02:45:50 -03:00
bunnei 3c3d9afd61
Merge pull request #2070 from ReinUsesLisp/cubearray-view
gl_shader_cache: Fix texture view for cubemaps as cubemap arrays
2019-01-29 22:27:08 -05:00
ReinUsesLisp 52c326c301 gl_shader_cache: Use explicit bindings 2019-01-30 00:11:02 -03:00
ReinUsesLisp 9f803299de gl_rasterizer: Implement global memory management 2019-01-30 00:00:15 -03:00
ReinUsesLisp 3b84e04af1 shader_decode: Implement LDG and basic cbuf tracking 2019-01-30 00:00:15 -03:00
Kevin ba38d91fe2 video_core/GPU Implemented the GPU PFIFO puller semaphore operations. (#1908)
* Implemented the puller semaphore operations.

* Nit: Fix 2 style issues

* Nit: Add Break to default case.

* Fix style.

* Update for comments. Added ReferenceCount method

* Forgot to remove GpuSmaphoreAddress union.

* Fix the clang-format issues.

* More clang formatting.

* two more white spaces for the Clang formatting.

* Move puller members into the regs union

* Updated to use Memory::WriteBlock instead of Memory::Write*

* Fix clang style issues

* White space clang error

* Removing unused funcitons and other pr comment

* Removing unused funcitons and other pr comment

* More union magic for setting regs value.

* union magic refcnt as well

*  Remove local var

* Set up the regs and regs_assert_positions up properly

* Fix clang error
2019-01-29 21:49:18 -05:00
ReinUsesLisp f58a6152fc gl_shader_cache: Fix texture view for cubemaps as cubemap arrays
Cubemaps are considered layered and to create a texture view the texture
mustn't be a layered texture, resulting in cubemaps being bound as
cubemap arrays. To fix this issue this commit introduces an extra
surface parameter called "is_array" and uses this to query for texture
view creation.

Now that texture views for cubemaps are actually being created, this
also fixes the number of layers created for the texture view (since they
have to be 6 to create a texture view of cubemaps).
2019-01-29 23:49:02 -03:00
ReinUsesLisp 07692230ca gl_rasterizer: Workaround invalid zeta clears
Some games (like Xenoblade Chronicles 2) clear both depth and stencil
buffers while there's a depth-only texture attached (e.g. D16 Unorm).
This commit reads the zeta format of the bound surface on
ConfigureFramebuffers and returns if depth and/or stencil attachments
were set. This is ignored on DrawArrays but on Clear it's used to just
clear those attachments, bypassing an OpenGL error.
2019-01-29 23:47:33 -03:00
Lioncash b2b98b2f44 shader/shader_ir: Amend three comment typos
Given we're in the area, these are three trivial typos that can be
corrected.
2019-01-28 07:52:04 -05:00
Lioncash 62e08c30b7 shader/shader_ir: Amend constructor initializer ordering for AbufNode
Orders the class members in the same order that they would actually be
initialized in. Gets rid of two compiler warnings.
2019-01-28 07:50:34 -05:00
Lioncash 3e1a9a45a6 shader/decode: Avoid a pessimizing std::move within DecodeRange()
std::moveing a local variable in a return statement has the potential to
prevent copy elision from occurring, so this can just be converted into
a regular return.
2019-01-28 07:43:23 -05:00
ReinUsesLisp fc6d46c374 video_core: Silent implicit conversion warning 2019-01-26 02:27:14 -03:00
bunnei 1f4ca1e841
Merge pull request #1927 from ReinUsesLisp/shader-ir
video_core: Replace gl_shader_decompiler with an IR based decompiler
2019-01-25 23:42:14 -05:00
bunnei 045b0b70b6 frontend: Refactor ScopeAcquireWindowContext out of renderer_opengl. 2019-01-23 19:19:23 -05:00
ReinUsesLisp 9a82dec74a maxwell_3d: Set rt_separate_frag_data to 1 by default
Commercial games assume that this value is 1 but they never set it. On
the other hand nouveau manually sets this register. On
ConfigureFramebuffers we were asserting for what we are actually
implementing (according to envytools).
2019-01-22 04:14:29 -03:00
James Rowe ea73ffe202 Rename step 1 and step 2 to be a little more descriptive 2019-01-20 18:40:25 -07:00
James Rowe e8bd6b1fcc QT: Upgrade the Loading Bar to look much better 2019-01-20 14:47:35 -07:00
bunnei 197d0d9d24
Merge pull request #2008 from ReinUsesLisp/dirty-framebuffers
gl_rasterizer_cache: Use dirty flags for framebuffers
2019-01-20 14:06:26 -05:00
bunnei cbf8bea9d5
Merge pull request #2002 from ReinUsesLisp/dsa-vao-buffer
gl_rasterizer: Use DSA for VAOs and buffers
2019-01-20 14:06:01 -05:00
ReinUsesLisp a1b1ea47ed gl_rasterizer: Silent unsafe mix warning 2019-01-18 03:25:28 -03:00
ReinUsesLisp a63d7c49fc shader_ir: Fixup clang build 2019-01-15 21:06:05 -03:00
ReinUsesLisp 1e40a4b343 gl_shader_decompiler: replace std::get<> with std::get_if<> for macOS compatibility 2019-01-15 17:54:53 -03:00
ReinUsesLisp 51de4e00a6 gl_shader_decompiler: Inline textureGather component 2019-01-15 17:54:53 -03:00
ReinUsesLisp 1c9c4eefeb shader_decode: Fixup XMAD 2019-01-15 17:54:53 -03:00
ReinUsesLisp 170c8212bb shader_ir: Pass to decoder functions basic block's code 2019-01-15 17:54:53 -03:00
ReinUsesLisp 2d6c064e66 shader_decode: Improve zero flag implementation 2019-01-15 17:54:53 -03:00
ReinUsesLisp d911740e5d shader_ir: Remove composite primitives and use temporals instead 2019-01-15 17:54:53 -03:00
ReinUsesLisp bb12f99b20 gl_shader_decompiler: Fixup AssignCompositeHalf 2019-01-15 17:54:53 -03:00
ReinUsesLisp 50195b1704 shader_decode: Use proper primitive names 2019-01-15 17:54:53 -03:00
ReinUsesLisp 2faad9bf23 shader_decode: Use BitfieldExtract instead of shift + and 2019-01-15 17:54:53 -03:00
ReinUsesLisp 52223313b1 shader_ir: Remove Ipa primitive 2019-01-15 17:54:53 -03:00
ReinUsesLisp d6b173d5fe gl_shader_decompiler: Use rasterizer's UBO size limit 2019-01-15 17:54:53 -03:00
ReinUsesLisp df74ff3c8b gl_shader_gen: Fixup code formatting 2019-01-15 17:54:53 -03:00
ReinUsesLisp af5d7e2c49 video_core: Rename glsl_decompiler to gl_shader_decompiler 2019-01-15 17:54:53 -03:00
ReinUsesLisp d9118d324a shader_ir: Remove RZ and use Register::ZeroIndex instead 2019-01-15 17:54:53 -03:00
ReinUsesLisp 5af82a8ed4 shader_decode: Implement TEXS.F16 2019-01-15 17:54:53 -03:00
ReinUsesLisp c68c13e1aa shader_decode: Fixup R2P 2019-01-15 17:54:53 -03:00
ReinUsesLisp 8b5588e776 glsl_decompiler: Fixup TLDS 2019-01-15 17:54:53 -03:00
ReinUsesLisp dbed6c6485 glsl_decompiler: Fixup geometry shaders 2019-01-15 17:54:53 -03:00
ReinUsesLisp ea78c78253 shader_decode: Fixup WriteLogicOperation zero comparison 2019-01-15 17:54:53 -03:00
ReinUsesLisp ab7f52b279 glsl_decompiler: Fixup permissive member function declarations 2019-01-15 17:54:53 -03:00
ReinUsesLisp 55a10d02e5 shader_decode: Fixup PSET 2019-01-15 17:54:53 -03:00
ReinUsesLisp a2e22b4359 shader_decode: Fixup clang-format 2019-01-15 17:54:53 -03:00
ReinUsesLisp e1fea1e0c5 video_core: Implement IR based geometry shaders 2019-01-15 17:54:53 -03:00
ReinUsesLisp a1b845b651 shader_decode: Implement VMAD and VSETP 2019-01-15 17:54:53 -03:00
ReinUsesLisp b11e0b94c7 shader_decode: Implement HSET2 2019-01-15 17:54:53 -03:00
ReinUsesLisp 2df55985b6 shader_decode: Rework HSETP2 2019-01-15 17:54:53 -03:00
ReinUsesLisp 8332482c24 shader_decode: Implement R2P 2019-01-15 17:54:53 -03:00
ReinUsesLisp 3f1136ac6f shader_decode: Implement CSETP 2019-01-15 17:54:52 -03:00
ReinUsesLisp 7e13e8bfcb shader_decode: Implement PSET 2019-01-15 17:54:52 -03:00
ReinUsesLisp dd91650aaf shader_decode: Implement HFMA2 2019-01-15 17:54:52 -03:00
ReinUsesLisp d6f76307fe glsl_decompiler: Remove HNegate inlining 2019-01-15 17:54:52 -03:00
ReinUsesLisp 027f443e69 shader_decode: Implement POPC 2019-01-15 17:54:52 -03:00
ReinUsesLisp 55e6786254 shader_decode: Implement TLDS (untested) 2019-01-15 17:54:52 -03:00
ReinUsesLisp ec98e4d842 shader_decode: Update TLD4 reflecting #1862 changes 2019-01-15 17:54:52 -03:00
ReinUsesLisp 03e088a4f4 shader_ir: Fixup TEX and TEXS and partially fix TLD4 decompiling 2019-01-15 17:54:52 -03:00
ReinUsesLisp 2d9136cec6 shader_decode: Fixup FSET 2019-01-15 17:54:52 -03:00
ReinUsesLisp af5c6e4ccb shader_decode: Implement IADD32I 2019-01-15 17:54:52 -03:00
ReinUsesLisp 4316eaf75c shader_decode: Fixup clang-format 2019-01-15 17:54:52 -03:00
ReinUsesLisp fc46ecddb3 video_core: Return safe values after an assert hits 2019-01-15 17:54:52 -03:00
ReinUsesLisp 148a6418ed shader_decode: Implement FFMA 2019-01-15 17:54:52 -03:00
ReinUsesLisp 21aff36459 video_core: Address feedback 2019-01-15 17:54:52 -03:00
ReinUsesLisp 59b34b1d76 shader_ir: Fixup file inclusions and clang-format 2019-01-15 17:54:52 -03:00
Mat M 57a900cc45 shader_ir: Move comment node string
Co-Authored-By: ReinUsesLisp <reinuseslisp@airmail.cc>
2019-01-15 17:54:52 -03:00
ReinUsesLisp d4fae3a699 shader_ir: Address feedback to avoid UB in bit casting 2019-01-15 17:54:52 -03:00
ReinUsesLisp 946c86f0bb shader_decode: Fixup clang-format 2019-01-15 17:54:52 -03:00
ReinUsesLisp c9cf899d18 shader_decode: Implement LEA 2019-01-15 17:54:52 -03:00
ReinUsesLisp 4fd06efeb9 shader_decode: Implement IADD3 2019-01-15 17:54:52 -03:00
ReinUsesLisp a40fd07516 shader_decode: Implement LOP3 2019-01-15 17:54:52 -03:00
ReinUsesLisp b184ca9089 shader_decode: Implement ST_L 2019-01-15 17:54:52 -03:00
ReinUsesLisp 8d42feb09b shader_decode: Implement LD_L 2019-01-15 17:54:52 -03:00
ReinUsesLisp 21f9e9da09 shader_decode: Implement HSETP2 2019-01-15 17:54:52 -03:00
ReinUsesLisp 68c99d2597 shader_decode: Implement HADD2 and HMUL2 2019-01-15 17:54:52 -03:00
ReinUsesLisp cf4a08d950 shader_decode: Implement HADD2_IMM and HMUL2_IMM 2019-01-15 17:54:52 -03:00
ReinUsesLisp 376a837511 shader_decode: Implement MOV_SYS 2019-01-15 17:54:52 -03:00
ReinUsesLisp 518a2bd206 shader_decode: Implement IMNMX 2019-01-15 17:54:52 -03:00
ReinUsesLisp 07944a2345 shader_decode: Implement F2F_C 2019-01-15 17:54:52 -03:00
ReinUsesLisp e8235c0215 shader_decode: Implement I2I 2019-01-15 17:54:52 -03:00
ReinUsesLisp 6ca31f544a shader_decode: Implement BRA internal flag 2019-01-15 17:54:52 -03:00
ReinUsesLisp 210620ff31 shader_decode: Implement ISCADD 2019-01-15 17:54:52 -03:00
ReinUsesLisp b0e7920838 shader_decode: Implement XMAD 2019-01-15 17:54:51 -03:00
ReinUsesLisp becfdb8638 shader_decode: Implement PBK and BRK 2019-01-15 17:54:51 -03:00
ReinUsesLisp 8f37531f8e shader_decode: Implement LOP 2019-01-15 17:54:51 -03:00
ReinUsesLisp 8486e7f8c8 shader_decode: Implement SEL 2019-01-15 17:54:51 -03:00
ReinUsesLisp ccb71bece9 shader_decode: Implement IADD 2019-01-15 17:54:51 -03:00
ReinUsesLisp faadae5814 shader_decode: Implement ISETP 2019-01-15 17:54:51 -03:00
ReinUsesLisp 80183de884 shader_decode: Implement BFI 2019-01-15 17:54:51 -03:00
ReinUsesLisp 078ba28e13 shader_decode: Implement ISET 2019-01-15 17:54:51 -03:00
ReinUsesLisp acdbbb8885 shader_decode: Implement LD_C 2019-01-15 17:54:51 -03:00
ReinUsesLisp d79c462af0 shader_decode: Implement SHL 2019-01-15 17:54:51 -03:00
ReinUsesLisp a2819c204f shader_decode: Implement SHR 2019-01-15 17:54:51 -03:00
ReinUsesLisp 39f1c6246a shader_decode: Implement LOP32I 2019-01-15 17:54:51 -03:00
ReinUsesLisp 501284a81a shader_decode: Implement BFE 2019-01-15 17:54:51 -03:00
ReinUsesLisp e444a6553f shader_decode: Implement FSET 2019-01-15 17:54:51 -03:00
ReinUsesLisp 3052eae25e shader_decode: Implement F2I 2019-01-15 17:54:51 -03:00
ReinUsesLisp 8abe5ba2c8 shader_decode: Implement I2F 2019-01-15 17:54:51 -03:00
ReinUsesLisp c849b5b320 shader_decode: Implement F2F 2019-01-15 17:54:50 -03:00
ReinUsesLisp 9118deb990 shader_decode: Stub DEPBAR 2019-01-15 17:54:50 -03:00
ReinUsesLisp 97f33f00cf shader_decode: Implement SSY and SYNC 2019-01-15 17:54:50 -03:00
ReinUsesLisp abdbafbc20 shader_decode: Implement PSETP 2019-01-15 17:54:50 -03:00
ReinUsesLisp 802c23b8a8 shader_decode: Implement TMML 2019-01-15 17:54:50 -03:00
ReinUsesLisp 2b90637f4b shader_decode: Implement TEX and TXQ 2019-01-15 17:54:50 -03:00
ReinUsesLisp 878672f371 shader_decode: Implement TEXS (F32) 2019-01-15 17:54:50 -03:00
ReinUsesLisp c703f0aee4 shader_decode: Implement FSETP 2019-01-15 17:54:50 -03:00
ReinUsesLisp 8215ae942c shader_decode: Partially implement BRA 2019-01-15 17:54:50 -03:00
ReinUsesLisp 4f95dc950e shader_decode: Implement IPA 2019-01-15 17:54:50 -03:00
ReinUsesLisp cacb934f21 shader_decode: Implement EXIT 2019-01-15 17:54:50 -03:00
ReinUsesLisp 0c049e0a21 shader_decode: Implement ST_A 2019-01-15 17:54:50 -03:00
ReinUsesLisp e3f1233ce1 shader_decode: Implement LD_A 2019-01-15 17:54:50 -03:00
ReinUsesLisp ea358bd4bf shader_decode: Implement FADD32I 2019-01-15 17:54:50 -03:00
ReinUsesLisp c9b2a1b051 shader_decode: Implement FMUL32_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp 2edee801ce shader_decode: Implement MOV32_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp 06cb910c6d shader_decode: Stub RRO_C, RRO_R and RRO_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp 5e6a0a08c1 shader_decode: Implement FMNMX_C, FMNMX_R and FMNMX_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp 964ddeeb90 shader_decode: Implement MUFU 2019-01-15 17:54:50 -03:00
ReinUsesLisp 4ccaa1402d shader_decode: Implement FADD_C, FADD_R and FADD_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp 7c192ec43f shader_decode: Implement FMUL_C, FMUL_R and FMUL_IMM 2019-01-15 17:54:50 -03:00
ReinUsesLisp 4c70d5b8eb shader_decode: Implement MOV_C and MOV_R 2019-01-15 17:54:50 -03:00
ReinUsesLisp a4f052f6b3 video_core: Replace gl_shader_decompiler 2019-01-15 17:54:50 -03:00
ReinUsesLisp 0c6fb456e0 glsl_decompiler: Implementation 2019-01-15 17:54:50 -03:00
ReinUsesLisp fbc67a0563 shader_ir: Add condition code helper 2019-01-15 17:54:50 -03:00
ReinUsesLisp a58abbcfc4 shader_ir: Add predicate combiner helper 2019-01-15 17:54:49 -03:00
ReinUsesLisp bf07272695 shader_ir: Add comparison helpers 2019-01-15 17:54:49 -03:00
ReinUsesLisp 60f044df56 shader_ir: Add half float helpers 2019-01-15 17:54:49 -03:00
ReinUsesLisp e3c55e31d7 shader_ir: Add integer helpers 2019-01-15 17:54:49 -03:00
ReinUsesLisp 833d0806f9 shader_ir: Add float helpers 2019-01-15 17:54:49 -03:00
ReinUsesLisp 6b9eea3fe5 shader_ir: Add setters 2019-01-15 17:54:49 -03:00
ReinUsesLisp 12a95ff453 shader_ir: Add local memory getters 2019-01-15 17:54:49 -03:00
ReinUsesLisp 2f87fd060d shader_ir: Add internal flag getters 2019-01-15 17:54:49 -03:00
ReinUsesLisp 15f431f0cb shader_ir: Add attribute getters 2019-01-15 17:54:49 -03:00
ReinUsesLisp 864e8f55cf shader_ir: Add constant buffer getters 2019-01-15 17:54:49 -03:00
ReinUsesLisp 5e639bfcf6 shader_ir: Add register getter 2019-01-15 17:54:49 -03:00
ReinUsesLisp 4aaa2192b9 shader_ir: Add immediate node constructors 2019-01-15 17:54:49 -03:00
ReinUsesLisp 15a0e1481d shader_ir: Initial implementation 2019-01-15 17:54:49 -03:00
ReinUsesLisp 294df41b86 shader_bytecode: Fixup encoding 2019-01-15 17:54:49 -03:00
ReinUsesLisp a0c8c16d07 shader_header: Make local memory size getter constant 2019-01-15 17:54:49 -03:00
ReinUsesLisp 877a978a22 gl_rasterizer: Workaround Intel VAO DSA bug
There is a bug on Intel's blob driver where it fails to properly build a
vertex array object if it's not bound even after creating it with
glCreateVertexArrays. This workaround binds it after creating it to
bypass the issue.
2019-01-09 02:40:19 -03:00
ReinUsesLisp 3121408a90 gl_global_cache: Add dummy global cache manager 2019-01-08 17:47:45 -03:00
ReinUsesLisp 19cf995225 gl_rasterizer: Skip framebuffer configuration if rendertargets have not been changed 2019-01-07 16:23:23 -03:00
bunnei 23ebd4920e
Merge pull request #1999 from ReinUsesLisp/dirty-shader
gl_shader_cache: Use dirty flags for shaders
2019-01-07 14:22:30 -05:00
ReinUsesLisp b683e41fca gl_rasterizer_cache: Use dirty flags for the depth buffer 2019-01-07 16:22:28 -03:00
ReinUsesLisp 179ee963db gl_rasterizer_cache: Use dirty flags for color buffers 2019-01-07 16:20:39 -03:00
ReinUsesLisp 0ab17ab406 gl_shader_cache: Use dirty flags for shaders 2019-01-07 16:13:12 -03:00
ReinUsesLisp 5933b3ea96 gl_stream_buffer: Use DSA for buffer management 2019-01-06 16:49:24 -03:00
ReinUsesLisp 35c095898b gl_rasterizer: Use DSA for vertex array objects 2019-01-06 16:49:24 -03:00
ReinUsesLisp ea4928393f gl_state: Drop uniform buffer state tracking 2019-01-06 00:28:01 -03:00
ReinUsesLisp fc8a8789da gl_rasterizer_cache: Use GL_STREAM_COPY for PBOs
Since the data is doing the path CPU -> GPU -> GPU copy is the most
approximate hint. Using GL_STREAM_DRAW generated a performance warning
on Nvidia's stack. Changing this hint removed the warning.
2019-01-05 02:27:55 -03:00
bunnei c91d2bac45
Merge pull request #1961 from ReinUsesLisp/tex-view-2d
gl_rasterizer_cache: Texture view if shader samples array but OGL is not
2019-01-02 17:51:32 -05:00
ReinUsesLisp 97fb6179b9 gl_rasterizer_cache: Texture view if shader samples array but OGL is not
When a shader samples a texture array but that texture in OpenGL is
created without layers, use a texture view to increase the texture
hierarchy. For example, instead of binding a GL_TEXTURE_2D bind a
GL_TEXTURE_2D_ARRAY view.
2018-12-29 23:49:12 -03:00
bunnei 2020ba06e1 gpu: Remove PixelFormat G8R8U and G8R8S, as they do not seem to exist.
- Fixes UI rendering issues in The Legend of Zelda: Breath of the Wild.
2018-12-28 15:36:45 -05:00
Rodolfo Bogado fbe900ba6d Add missing uintBitsToFloat to SetRegisterToHalfFloat 2018-12-27 14:39:10 -03:00
bunnei fa9acc26d9
Merge pull request #1892 from Tinob/master
Improve Zero flag implementation
2018-12-27 11:06:59 -05:00
Lioncash 67fa21e143 renderer_opengl: Correct forward declaration of FramebufferLayout
This is actually a struct, not a class, which can lead to compilation
warnings.
2018-12-26 17:32:32 -05:00
Rodolfo Bogado 33056dd833 Apply CC test to the final value to be stored in the register 2018-12-26 18:16:31 -03:00
David 8047873a66 Fixed shader linking error due to TLDS (#1934)
* Fixed shader linking error due to TLDS

coord should be coords

* Fix remaining coords
2018-12-26 15:55:39 -05:00
ReinUsesLisp aaa0e6c346 shader_bytecode: Fixup TEXS.F16 encoding 2018-12-26 01:35:44 -03:00
bunnei 9a22a94a51
Merge pull request #1886 from FearlessTobi/port-4164
Port citra-emu/citra#4164: "citra_qt, video_core: Screenshot functionality"
2018-12-23 14:36:51 -05:00
Rodolfo Bogado bbf8d6bf01 Includde saturation in the evaluation of the control code 2018-12-22 19:19:18 -03:00
Rodolfo Bogado 946777601b Handle RZ cases evaluating the expression instead of the register value. 2018-12-22 19:19:18 -03:00
Rodolfo Bogado 7e72b5e453 complete emulation of ZeroFlag 2018-12-22 19:19:18 -03:00
bunnei e75e8b9580
Merge pull request #1921 from ogniK5377/no-unit
Fixed uninitialized memory due to missing returns in canary
2018-12-21 14:12:54 -05:00
bunnei 42427b9c7a
Merge pull request #1920 from heapo/texture_format_selection
Texture format fixes for RGBA16UI for copies and R16U when used as depth
2018-12-21 13:46:17 -05:00
bunnei 3050f3a7ba
Merge pull request #1909 from heapo/shadow_sampling_fixes
Fix arrayed texture LOD selection and depth comparison ordering
2018-12-19 13:10:37 -05:00
David Marcec 20859802f0 hopefully fix clang format issue 2018-12-19 13:22:09 +11:00
David Marcec fdd649e2ef Fixed uninitialized memory due to missing returns in canary
Functions which are suppose to crash on non canary builds usually don't return anything which lead to uninitialized memory being used.
2018-12-19 12:52:32 +11:00
zhupengfei a2be49305d yuzu, video_core: Screenshot functionality
Allows capturing screenshot at the current internal resolution (native for software renderer), but a setting is available to capture it in other resolutions. The screenshot is saved to a single PNG in the current layout.
2018-12-18 22:54:41 +01:00
heapo 37280cf555 Texture format fixes: Flag RGBA16UI as GL_RGBA_INTEGER format, and interpret R16U as Z16 when depth_compare is enabled. 2018-12-18 11:34:51 -08:00
ReinUsesLisp ef061481c5 shader_bytecode: Fixup half float's operator B encoding 2018-12-18 04:28:50 -03:00
heapo 72599cc667 Implement postfactor multiplication/division for fmul instructions 2018-12-17 07:56:25 -08:00
heapo a6daed74f5 Fix arrayed shadow sampler array slice/depth comparison ordering, as well as invalid GLSL LOD selection. 2018-12-17 07:53:48 -08:00
bunnei e1f28afb98
Merge pull request #1893 from lioncash/warn
gl_shader_cache: Resolve truncation compiler warning
2018-12-11 20:47:10 -05:00
bunnei d63c883e66
Merge pull request #1888 from marcosvitali/glFrontFacing
gl_shader_decompiler: IPA fix FrontFacing.
2018-12-11 11:43:38 -05:00
Lioncash 4c2b94559b gl_shader_cache: Dehardcode constant in CalculateProgramSize()
This constant is related to the size of the instruction.
2018-12-10 23:47:20 -05:00
Lioncash 861bfdbf5d gl_shader_cache: Resolve truncation compiler warning
The previous code would cause a warning, as it was truncating size_t
(64-bit) to a u32 (32-bit) implicitly.
2018-12-10 23:44:18 -05:00
bunnei 5b5d0199fe
Merge pull request #1740 from FernandoS27/shader_props
Implemented Shader Unique Identifiers
2018-12-10 12:43:43 -05:00
Marcos Vitali 430e1f864b gl_shader_decompiler: IPA FrontFacing: the right value when is the front face is 0xFFFFFFFF. 2018-12-09 23:36:21 -03:00
Fernando Sahmkow d5d77848e6 Implemented a shader unique identifier. 2018-12-09 17:33:33 -04:00
FernandoS27 7b9c982d29 Add more info into textures' object labels 2018-12-09 17:22:29 -04:00
Marcos Vitali f4fa7ecb0e gl_shader_decompiler: TLDS/TLD4/TLD4S Reworked reflecting the source registers, bugs fixed and modularize. 2018-12-07 19:09:36 -03:00
bunnei 9390452195
Merge pull request #1824 from ReinUsesLisp/fbcache
gl_rasterizer: Implement a framebuffer cache
2018-12-06 11:56:59 -05:00
ReinUsesLisp 59a8df1b14 gl_shader_decompiler: Implement TEXS.F16 2018-12-05 02:06:34 -03:00
ReinUsesLisp 370980fdc3 gl_shader_decompiler: Fixup inverted if 2018-12-05 01:23:04 -03:00
heapo 7853e6b5d4 Improve msvc codegen for hot-path array LUTs
In some constexpr functions, msvc is building the LUT at runtime
(pushing each element onto the stack) out of an abundance of caution. Moving the
arrays into be file-scoped constexpr's avoids this and turns the functions into
simple look-ups as intended.
2018-12-04 17:13:07 -08:00
Marcos ab2108fb2a Rewrited TEX/TEXS (TEX Scalar). (#1826)
* Rewrited TEX/TEXS (TEX Scalar).

* Style fixes.

* Styles issues.
2018-12-04 12:24:35 -05:00
bunnei 7f6bc284e9
Merge pull request #1854 from Subv/old_command_processor
Don't try to route PFIFO methods (0-0x40) to the other engines.
2018-12-04 08:49:22 -05:00
Subv c4c19fa6c1 Removed unused file.
This is a leftover from #1792
2018-12-03 23:52:38 -05:00
Subv b873253da1 GPU: Don't try to route PFIFO methods (0-0x40) to the other engines. 2018-12-03 23:52:18 -05:00
bunnei 8a12daac8c
Merge pull request #1822 from ReinUsesLisp/glsl-scope
gl_shader_decompiler: Introduce a scoped object and style changes
2018-12-03 17:10:02 -05:00
bunnei 7ce17b2cf6
Merge pull request #1827 from ReinUsesLisp/clip-and-shader
gl_rasterizer: Enable clip distances when set in register and in shader
2018-12-01 23:51:47 -05:00
bunnei 80aa124b1d
Merge pull request #1825 from ReinUsesLisp/shader-pipeline-cache
gl_shader_manager: Update pipeline when programs have changed
2018-12-01 23:48:55 -05:00
bunnei a6805e58ce
Merge pull request #1795 from ReinUsesLisp/vc-cleanup
video_core: Minor style changes
2018-12-01 23:46:18 -05:00
bunnei 0e9be7be37
Merge pull request #1823 from bunnei/fix-surface-copy
gl_rasterizer_cache: Fix several surface copy issues.
2018-12-01 23:44:32 -05:00
Lioncash e88cdcc912 Fix debug build
A non-existent parameter was left in some formatting calls (the logging
macro for which only does anything meaningful on debug builds)
2018-12-01 02:11:42 -05:00
bunnei 0f43564d09 gl_rasterizer_cache: Update AccurateCopySurface to flush complete source surface.
- Fixes issues with Breath of the Wild with use_accurate_gpu_emulation setting.
2018-11-29 20:10:11 -05:00
ReinUsesLisp 2908d30274 gl_rasterizer: Enable clip distances when set in register and in shader 2018-11-29 16:58:20 -03:00
ReinUsesLisp 1a2bb596db gl_rasterizer: Implement a framebuffer cache 2018-11-29 16:34:46 -03:00
ReinUsesLisp e8620eaa9a gl_shader_manager: Update pipeline when programs have changed 2018-11-29 16:26:42 -03:00
bunnei 3d3cc35ee7 gl_rasterizer_cache: Remove BlitSurface and replace with more accurate copy.
- BlitSurface with different texture targets is inherently broken.
- When target is the same, we can just use FastCopySurface.
- Fixes rendering issues with Breath of the Wild.
2018-11-28 21:56:21 -05:00
ReinUsesLisp eb700afcf0 gl_shader_decompiler: Remove texture temporal in TLD4 2018-11-28 23:46:16 -03:00
ReinUsesLisp 8d58e5da71 gl_shader_decompiler: Flip negated if else statement 2018-11-28 23:46:16 -03:00
ReinUsesLisp f4abebd731 gl_shader_decompiler: Use GLSL scope on instructions unrelated to textures 2018-11-28 23:46:14 -03:00
ReinUsesLisp 78fc8f6b66 gl_shader_decompiler: Move texture code generation into lambdas 2018-11-28 23:45:53 -03:00
ReinUsesLisp ab13b628d0 gl_shader_decompiler: Clean up texture instructions 2018-11-28 23:45:53 -03:00
ReinUsesLisp 6a642022dd gl_shader_decompiler: Scope GLSL variables with a scoped object 2018-11-28 23:45:51 -03:00
ReinUsesLisp 037449668e gl_rasterizer: Signal UNIMPLEMENTED when rt_separate_frag_data is not zero 2018-11-28 21:26:22 -03:00
ReinUsesLisp 653d7a3f0d gl_rasterizer_cache: Use brackets for two-line single-expresion blocks 2018-11-28 21:18:14 -03:00
ReinUsesLisp 432a9872ed gl_rasterizer: Remove unused struct declarations 2018-11-28 21:18:13 -03:00
ReinUsesLisp 22c7c710b4 gl_rasterizer: Remove extension booleans 2018-11-28 21:18:13 -03:00
bunnei 5a9a84994a
Merge pull request #1808 from Tinob/master
Fix clip distance and viewport
2018-11-28 17:47:28 -05:00
bunnei 3fe8ab0d99
Merge pull request #1786 from Tinob/DepthClamp
Add Depth Clamp Support
2018-11-28 17:46:55 -05:00
bunnei 6f849887c9
Merge pull request #1792 from bunnei/dma-pusher
gpu: Rewrite GPU command list processing with DmaPusher class.
2018-11-28 10:12:37 -05:00
bunnei 881f5ad70f
Merge pull request #1735 from FernandoS27/tex-spacing
Texture decoder: Implemented Tile Width Spacing
2018-11-27 19:21:17 -05:00
bunnei ac74b71d75 dma_pushbuffer: Optimize to avoid loop and copy on Push. 2018-11-27 19:17:33 -05:00
bunnei c568f5cea7 gpu: Move command list profiling to DmaPusher::DispatchCalls. 2018-11-27 18:42:21 -05:00
ReinUsesLisp 2e9b90abad gl_shader_decompiler: Fixup clip distance index 2018-11-27 15:35:26 -03:00
Markus Wick 8747f5fc0d gl_rasterizer: Fixup for #1723.
On invalidating the streaming buffer, we need to reupload all vertex buffers.
But we don't need to reconfigure the vertex format.
This was a (silly) misstake in #1723.

Thanks at Rodrigo for discovering the issue.

Fun fact, as configuring the vertex format also invalidate the vertex buffer,
this misstake had no affect on the behavior.
2018-11-27 10:32:41 +01:00
bunnei abea6fa90c gpu: Rewrite GPU command list processing with DmaPusher class.
- More accurate impl., fixes Undertale (among other games).
2018-11-26 23:14:01 -05:00
Rodolfo Bogado 6710eb4892 remove viewport_transform_enabled as it seems to be inactive when valid transforms are used. 2018-11-27 00:04:33 -03:00
ReinUsesLisp 237c2026e9 morton: Fixup compiler warning 2018-11-26 23:22:57 -03:00
Rodolfo Bogado dfdbfa69e5 Implement depth clamp 2018-11-26 20:56:32 -03:00
Rodolfo Bogado 8e971f5062 Add support for Clip Distance enabled register 2018-11-26 20:45:21 -03:00
bunnei 1856d0ee8a
Merge pull request #1794 from Tinob/master
Add support for viewport_transfom_enable register
2018-11-26 18:34:09 -05:00
bunnei 67a154e23d
Merge pull request #1723 from degasus/dirty_flags
gl_rasterizer: Skip VB upload if the state is clean.
2018-11-26 18:33:22 -05:00
Marcos cb8d51e37e GPU States: Implement Polygon Offset. This is used in SMO all the time. (#1784)
* GPU States: Implement Polygon Offset. This is used in SMO all the time.

* Clang Format fixes.

* Initialize polygon_offset in the constructor.
2018-11-26 18:31:44 -05:00
bunnei 7684f4d0cf
Merge pull request #1713 from FernandoS27/bra-cc
Implemented BRA CC conditional and FSET CC Setting
2018-11-26 18:28:03 -05:00
bunnei a41943dc55
Merge pull request #1798 from ReinUsesLisp/y-direction
gl_shader_decompiler: Implement S2R's Y_DIRECTION
2018-11-26 18:25:42 -05:00
FernandoS27 ddfbe0b58d Implemented Tile Width Spacing 2018-11-26 09:05:12 -04:00
bunnei f9a211220c
Merge pull request #1763 from ReinUsesLisp/bfi
gl_shader_decompiler: Implement BFI_IMM_R
2018-11-25 23:04:57 -05:00
bunnei d7d1ab15b6
Merge pull request #1760 from ReinUsesLisp/r2p
gl_shader_decompiler: Implement R2P_IMM
2018-11-25 22:38:42 -05:00
bunnei 0394813401
Merge pull request #1782 from FernandoS27/dc
Fixed Coordinate Encodings in TEX and TEXS instructions
2018-11-25 22:36:25 -05:00
bunnei 8ce90a4f0b
Merge pull request #1783 from ReinUsesLisp/clip-distances
gl_shader_decompiler: Implement clip distances
2018-11-25 22:35:30 -05:00
bunnei ceb4bc22a4
Merge pull request #1796 from ReinUsesLisp/morton-move
video_core: Move morton functions out of gl_rasterizer_cache
2018-11-25 22:35:12 -05:00
Rodolfo Bogado 415e8383ba Limit the amount of viewports tested for state changes only to the usable ones 2018-11-25 12:18:29 -03:00
ReinUsesLisp 924e834b8f gl_shader_decompiler: Implement S2R's Y_DIRECTION 2018-11-25 04:37:29 -03:00
bunnei 7d544c1b9d
Merge pull request #1787 from bunnei/fix-gpu-mm
memory_manager: Do not allow 0 to be a valid GPUVAddr.
2018-11-24 23:45:00 -05:00
ReinUsesLisp 7ff2131cf9 morton: Style changes 2018-11-25 00:38:53 -03:00
ReinUsesLisp dad3a6718e video_core: Move morton functions to their own file 2018-11-25 00:37:18 -03:00
FernandoS27 8c797464a2 Fix Texture Overlapping 2018-11-24 17:26:42 -04:00
FernandoS27 33afff1870 Implemented BRA CC conditional and FSET CC Setting 2018-11-24 13:25:54 -04:00
Rodolfo Bogado 13f6a603c2 Add support for viewport_transfom_enable register 2018-11-24 13:17:48 -03:00
bunnei d01bf170c4
Merge pull request #1725 from FernandoS27/gl43
Update OpenGL's backend version from 3.3 to 4.3
2018-11-23 23:56:57 -05:00
bunnei e23543918b
Merge pull request #1785 from Tinob/master
Add support for clear_flags register
2018-11-23 23:55:56 -05:00
bunnei b6b78203cc
Merge pull request #1769 from ReinUsesLisp/cc
gl_shader_decompiler: Rename cc to condition code and name internal flags
2018-11-23 23:31:04 -05:00
Rodolfo Bogado 54c2a4cafc Add support for clear_flags register 2018-11-24 00:16:33 -03:00
FernandoS27 7668ef51d6 Fix TEXS Instruction encodings 2018-11-23 22:46:50 -04:00
FernandoS27 9c2127d5eb Fix one encoding in TEX Instruction 2018-11-23 22:46:49 -04:00
FernandoS27 487d805899 Corrected inputs indexing in TEX instruction 2018-11-23 22:46:48 -04:00
bunnei 69b3f98d3a
Merge pull request #1744 from degasus/shader_cache
shader_cache: Only lock covered instructions.
2018-11-23 21:09:36 -05:00
bunnei 0b1842294f memory_manager: Do not allow 0 to be a valid GPUVAddr.
- Fixes a bug with Undertale using 0 for a render target.
2018-11-23 12:58:55 -05:00
Hexagon12 3135dbc29c Added predicate comparison LessEqualWithNan (#1736)
* Added predicate comparison LessEqualWithNan

* oops

* Clang fix
2018-11-23 08:51:32 -08:00
bunnei c4b5319446
Merge pull request #1756 from ReinUsesLisp/fix-textures
gl_shader_decompiler: Fix register overwriting on texture calls
2018-11-23 08:49:37 -08:00
bunnei d77af9f8fd
Merge pull request #1766 from FernandoS27/fix-txq
Properly Implemented TXQ Instruction
2018-11-23 08:48:57 -08:00
ReinUsesLisp b3853403b7 gl_shader_decompiler: Implement clip distances 2018-11-23 02:14:43 -03:00
ReinUsesLisp c9ac23683b gl_shader_decompiler: Add a message for unimplemented cc generation 2018-11-22 16:12:27 -03:00
bunnei 50d2abaaa9
Merge pull request #1775 from bunnei/blend-eq
maxwell_3d: Implement alternate blend equations.
2018-11-22 08:44:05 -08:00
bunnei e633021532
Merge pull request #1764 from bunnei/macrointerpreter
macro_interpreter: Implement AddWithCarry and SubtractWithBorrow.
2018-11-22 08:43:29 -08:00
bunnei 033b46253e macro_interpreter: Implement AddWithCarry and SubtractWithBorrow.
- Used by Undertale.
2018-11-22 00:58:00 -05:00
bunnei 0e6a608245 maxwell_3d: Implement alternate blend equations.
- Used by Undertale.
2018-11-22 00:51:01 -05:00
bunnei b84f4cfb62
Merge pull request #1737 from FernandoS27/layer-copy
Implemented Fast Layered Copy
2018-11-21 21:39:16 -08:00
ReinUsesLisp 74eb16521f gl_shader_decompiler: Rename internal flag strings 2018-11-21 22:31:42 -03:00
ReinUsesLisp 8a5e6fce07 gl_shader_decompiler: Rename control codes to condition codes 2018-11-21 22:31:16 -03:00
ReinUsesLisp 864cbbaf4c gl_shader_decompiler: Fix register overwriting on texture calls 2018-11-21 21:21:19 -03:00
bunnei ec38b4e883
Merge pull request #1753 from FernandoS27/ufbtype
Use default values for unknown framebuffer pixel format
2018-11-21 14:15:27 -08:00
bunnei 61586e8794
Merge pull request #1752 from ReinUsesLisp/unimpl-decompiler
gl_shader_decompiler: Use UNIMPLEMENTED when applicable
2018-11-21 14:13:28 -08:00
FernandoS27 4a6a9b6622 Properly Implemented TXQ Instruction 2018-11-21 18:12:36 -04:00
ReinUsesLisp 642dfeda2a gl_shader_decompiler: Implement BFI_IMM_R 2018-11-21 16:12:30 -03:00
bunnei bb175ab430
Merge pull request #1754 from ReinUsesLisp/zero-register
gl_shader_decompiler: Remove UNREACHABLE when setting RZ
2018-11-21 08:06:29 -08:00
FernandoS27 0368260c99 Removed pre 4.3 ARB extensions 2018-11-21 11:43:17 -04:00
FernandoS27 0a9fedfac9 Use default values for unknown framebuffer pixel format 2018-11-21 07:33:34 -04:00
ReinUsesLisp d92afc7493 gl_shader_decompiler: Implement R2P_IMM 2018-11-21 04:56:00 -03:00
ReinUsesLisp 423a3ed2c8 gl_shader_decompiler: Remove UNREACHABLE when setting RZ 2018-11-20 22:23:10 -03:00
ReinUsesLisp bb893188eb gl_shader_decompiler: Use UNIMPLEMENTED instead of LOG+UNREACHABLE when applicable 2018-11-20 22:00:13 -03:00
bunnei 1a543723ab maxwell_3d: Initialize rasterizer color mask registers as enabled.
- Fixes rendering regression with Sonic Mania.
2018-11-20 19:58:06 -05:00
Markus Wick cfbae58b2b shader_cache: Only lock covered instructions. 2018-11-20 21:58:31 +01:00
FernandoS27 eb36463e03 Implemented Fast Layered Copy 2018-11-19 19:51:13 -04:00
bunnei f02b125ac8
Merge pull request #1717 from FreddyFunk/swizzle-gob
textures/decoders: Replace magic numbers
2018-11-18 20:13:00 -08:00
bunnei 6dc33fb812
Merge pull request #1693 from Tinob/master
Missing ogl states
2018-11-18 19:59:10 -08:00
Frederic L 11a1442229 Eliminated unnessessary memory allocation and copy (#1702) 2018-11-18 19:53:03 -08:00
ReinUsesLisp 29e7c76d66 gl_rasterizer: Remove default clip distance 2018-11-18 23:57:52 -03:00
Rodolfo Bogado 4d1a0a24cc drop support for non separate alpha as it seems to cause issues in some games 2018-11-18 03:44:48 -03:00
Rodolfo Bogado 81a9c5fe6f fix sampler configuration, thanks to Marcos for his investigation 2018-11-17 19:59:34 -03:00
Rodolfo Bogado b312cca756 small type fix 2018-11-17 19:59:34 -03:00
Rodolfo Bogado 5297495c87 small fix for alphaToOne bit location 2018-11-17 19:59:34 -03:00
Rodolfo Bogado e69eb3c760 fix for gcc compilation 2018-11-17 19:59:34 -03:00
Rodolfo Bogado 53b4a1af0f add AlphaToCoverage and AlphaToOne 2018-11-17 19:59:34 -03:00
Rodolfo Bogado 8ed7e1af2c add support for fragment_color_clamp 2018-11-17 19:59:33 -03:00
Rodolfo Bogado 02c22a3440 add missing MirrorOnceBorder support where supported 2018-11-17 19:59:33 -03:00
Rodolfo Bogado 1d60bb6544 set border color not depending on the wrap mode
only enable color mask for the first framebuffer id independent blending is disabled
2018-11-17 19:59:33 -03:00
Rodolfo Bogado 6a2aa6dbdb set default value for point size register 2018-11-17 19:59:33 -03:00
Rodolfo Bogado 1881e86c43 fix viewport and scissor behavior 2018-11-17 19:59:32 -03:00
Markus Wick 97f5c4ffd3 gl_rasterizer: Skip VB upload if the state is clean. 2018-11-17 14:28:54 +01:00
Frederic Laing 7a400e2191 textures/decoders: Replace magic numbers 2018-11-17 01:55:28 +01:00
bunnei 405dd03dae
Merge pull request #1700 from FreddyFunk/cleanup
gl_rasterizer_chache: Minor cleanup
2018-11-16 07:07:30 -08:00
bunnei 241646f423
Merge pull request #1701 from FreddyFunk/decoders
textures/decoders: Minor cleanup
2018-11-16 07:07:11 -08:00
bunnei 0b701751da
Merge pull request #1676 from lioncash/warn
gl_state: Amend compilation warnings
2018-11-16 07:00:03 -08:00
Frederic Laing 95d3965f31 textures/decoders: Minor cleanup 2018-11-15 21:04:17 +01:00
Frederic Laing 3844b5c0c5 gl_rasterizer_chache: Minor cleanup 2018-11-15 20:10:05 +01:00
bunnei 8a537a2021
Merge pull request #1637 from FernandoS27/cache
Improved GPU Caches lookup Speed
2018-11-14 19:07:52 -08:00
bunnei 3bd503d59c
Merge pull request #1662 from FreddyFunk/CopySurface-Optimization
gl_rasterizer_cache: CopySurface optimization
2018-11-13 18:58:12 -08:00
bunnei d2b2b05b6a
Merge pull request #1685 from lioncash/base
video_core/renderer_base: Remove GL include from the renderer base class files
2018-11-13 18:53:59 -08:00
Lioncash 4ed9ef15c4 video_core/renderer_base: Remove GL include from the renderer base class files
Keeps the base class source files implementation-agnostic.
2018-11-13 14:38:13 -05:00
Frederic L ab362aa7e5
gl_rasterizer: Minor cleanup
Minor code cleanup from unaddressed feedback in #1654
2018-11-13 14:07:23 +01:00
Lioncash 9a0fb7d9fb gl_state: Amend compilation warnings
Makes float -> integral conversions explicit via casts and also silences
a sign conversion warning.
2018-11-13 07:10:40 -05:00
bunnei 65bd03d74c
Merge pull request #1628 from greggameplayer/Texture2DArray
Implement SurfaceTarget Texture2DArray
2018-11-12 21:13:47 -08:00
greggameplayer c8b3f09876 Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB (#1666)
* Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB
( needed by Mario+Rabbids Kingdom Battle )

* Small placement correction
2018-11-12 18:34:54 -08:00
bunnei 2c6efda235
Merge pull request #1660 from Tinob/master
Map more missing opengl states
2018-11-11 19:58:16 -08:00
bunnei a264fac943
Merge pull request #1664 from FreddyFunk/cast2
gl_rasterizer: Fix compiler warnings
2018-11-11 12:18:29 -08:00
Rodolfo Bogado 72b1fae984 Use core extensions when available to set max anisotropic filtering level 2018-11-11 16:36:53 -03:00
Rodolfo Bogado 4e6c64bf8d Improve state management by splitting some of the states id separated function to avoid a full apply overhead 2018-11-11 16:36:53 -03:00
Rodolfo Bogado 4a6eff3b7b Try to fix problems with stencil test in some games, relax translation to opengl enums to avoid crashing and only generate logs of the errors. 2018-11-11 16:31:00 -03:00
Rodolfo Bogado e9610ec0dd set sampler max lod, min lod, lod bias and max anisotropy 2018-11-11 16:31:00 -03:00
FernandoS27 3088e36237 Improved GPU Caches lookup Speed 2018-11-11 12:53:25 -04:00
bunnei eaee73f95d
Merge pull request #1669 from ReinUsesLisp/fixup-gs
gl_shader_decompiler: Guard out of bound geometry shader input reads
2018-11-11 08:28:20 -08:00
bunnei c82bccab56
Merge pull request #1663 from lioncash/raster
rasterizer_cache: Remove reliance on the System singleton
2018-11-11 08:20:27 -08:00
bunnei 1916213311
Merge pull request #1648 from FernandoS27/texs-3-array
Implement 3 coordinate array in TEXS instruction
2018-11-11 08:18:27 -08:00
bunnei 8ea6261547
Merge pull request #1654 from degasus/dirty_flags
gl_rasterizer: Skip VAO binding if the state is clean.
2018-11-11 08:17:57 -08:00
ReinUsesLisp 8d4bb10d44 gl_shader_decompiler: Guard out of bound geometry shader input reads
Geometry shaders follow a pattern that results in out of bound reads.
This pattern is:
- VSETP to predicate
- Use that predicate to conditionally set a register a big number
- Use the register to access geometry shaders
At the time of writing this commit I don't know what's the intent of
this number. Some drivers argue about these out of bound reads. To avoid
this issue, input reads are guarded limiting reads to the highest
posible vertex input of the current topology (e.g. points to 1 and
triangles to 3).
2018-11-10 03:10:50 -03:00
Frederic Laing e2bf581e3a gl_rasterizer_cache: Remove unnecessary memory allocation and copy in CopySurface 2018-11-08 16:50:09 +01:00
Frederic Laing 1d36aec267 gl_rasterizer: Fix compiler warnings 2018-11-08 13:33:30 +01:00
Lioncash 9046f764bf rasterizer_cache: Remove reliance on the System singleton
Rather than have a transparent dependency, we can make it explicit in
the interface. This also gets rid of the need to put the core include in
a header.
2018-11-08 06:16:38 -05:00
Lioncash 9de523fd90 rasterizer_cache: Add missing virtual destructor to RasterizerCacheObject
Ensures that destruction will always do the right thing in any context.
2018-11-08 00:31:39 -05:00
Lioncash 29f082775b gl_resource_manager: Amend clang-format discrepancies
Fixes the buildbot.
2018-11-08 00:23:45 -05:00
FernandoS27 d347623d6f Correct issue where texturelod could not be applied to 2darrayshadow 2018-11-07 21:48:45 -04:00
FernandoS27 ad2f47b579 Implement 3 coordinate array in TEXS instruction 2018-11-07 17:04:30 -04:00
bunnei 81ff9e2473
Merge pull request #1630 from bunnei/fix-mapbufferex
memory_manager: Do not MapBufferEx over already in use memory.
2018-11-07 00:14:36 -08:00
bunnei 74bce4d68f
Merge pull request #1635 from Tinob/master
Implement multi-target viewports and blending
2018-11-07 00:11:49 -08:00
Markus Wick 359db6a673 gl_rasterizer: Skip VAO binding if the state is clean. 2018-11-06 22:31:33 +01:00
Markus Wick 0590dd2971 gl_rasterizer: Split VAO and VB setup functions. 2018-11-06 22:31:33 +01:00
greggameplayer d3b9599b2d
Merge branch 'master' into Texture2DArray 2018-11-06 19:05:57 +01:00
Markus Wick 2c87f10267 gl_rasterizer_cache: Add profiles for Copy and Blit.
They were missed, and Copy is very high in profile here. It doesn't block the GPU,
but it stalls the driver thread. So with our bad GL instructions, this might block quite a while.
2018-11-06 17:45:32 +01:00
Markus Wick 7e59e907ef gl_resource_manager: Profile creation and deletion. 2018-11-06 17:45:32 +01:00
Markus Wick 80e4dbdce7 gl_stream_buffer: Profile orphaning of stream buffer.
This serialize to the driver thread and so it may block for a while.
So if it is in the benchmark, we get noticed if it happens too often.
2018-11-06 17:45:32 +01:00
Markus Wick 54df9fe29e gl_resource_manager: Split implementations in .cpp file.
Those implementations are quite costly, so there is no need to inline them to the caller.
Ressource deletion is often a performance bug, so in this way, we support to add breakpoints to them.
2018-11-06 14:40:39 +01:00
bunnei cdb19e71fe
Merge pull request #1616 from FernandoS27/cube-array
Implement Cube Arrays
2018-11-05 15:28:48 -05:00
Rodolfo Bogado 19038db489 Add support to color mask to avoid issues in blending caused by wrong values in the alpha channel in some render targets. 2018-11-05 00:24:19 -03:00
Rodolfo Bogado 145ae36963 Implement multi-target viewports and blending 2018-11-04 20:49:48 -03:00
bunnei 38c1c500ab
Merge pull request #1625 from FernandoS27/astc
Implement ASTC Textures 5x5 and fix a bunch of ASTC texture problems
2018-11-04 18:47:26 -05:00
greggameplayer 9249fadb9e
correct syntax 2018-11-02 14:28:28 +01:00
greggameplayer cb8e4a4633
Merge branch 'master' into Texture2DArray 2018-11-02 14:26:32 +01:00
FernandoS27 60a184455c Fix ASTC Decompressor to support depth parameter 2018-11-01 19:22:12 -04:00
bunnei 4aa9779ae1 memory_manager: Do not MapBufferEx over already in use memory.
- This fixes rendering when changing areas in Super Mario Odyssey.
2018-11-01 18:57:59 -04:00
bunnei cc1fe93297
Merge pull request #1623 from Tinob/master
Improve OpenGL state handling
2018-11-01 15:53:33 -04:00
FernandoS27 aee93f98f9 Fix ASTC formats 2018-11-01 13:08:19 -04:00
FernandoS27 31930a3334 Implemented ASTC 5x5 2018-11-01 13:06:24 -04:00
FernandoS27 678c18aa5c Implement Cube Arrays 2018-11-01 11:56:19 -04:00
bunnei 9afcbba8e4
Merge pull request #1527 from FernandoS27/assert-flow
Assert Control Flow Instructions using Control Codes
2018-11-01 00:34:56 -04:00
bunnei de0ab806df maxwell_3d: Restructure macro upload to use a single macro code memory.
- Fixes an issue where macros could be skipped.
- Fixes rendering of distant objects in Super Mario Odyssey.
2018-10-31 23:29:21 -04:00
bunnei 86e70cf302
Merge pull request #1528 from FernandoS27/assert-control-codes
Assert Control Codes Generation on Shader Instructions
2018-10-31 22:34:18 -04:00
greggameplayer 9ae972ab4e Implement SurfaceTarget Texture2DArray
( needed by Mario+Rabbids Kingdom Battle )
2018-10-31 04:29:15 +01:00
Rodolfo Bogado aca218aea0 Improve OpenGL state handling 2018-10-30 21:19:04 -03:00
ReinUsesLisp 76754f5705 video_core: Move surface declarations out of gl_rasterizer_cache 2018-10-30 16:07:20 -03:00
FernandoS27 5bb80ab009 Assert Control Codes Generation 2018-10-30 13:37:55 -04:00
Frederic L 7a5eda5914 global: Use std::optional instead of boost::optional (#1578)
* get rid of boost::optional

* Remove optional references

* Use std::reference_wrapper for optional references

* Fix clang format

* Fix clang format part 2

* Adressed feedback

* Fix clang format and MacOS build
2018-10-30 00:03:25 -04:00
bunnei c5a849212f
Merge pull request #1580 from FernandoS27/mm-impl
Implemented Mipmaps
2018-10-29 22:34:00 -04:00
bunnei 0270906dbf
Merge pull request #1613 from ReinUsesLisp/gl-utils
video_core: Move OpenGL specific utils to its renderer
2018-10-29 13:22:14 -04:00
bunnei 5d7167dfca
Merge pull request #1610 from slashiee/dxt1-alpha
renderer_opengl: Enable alpha channel for DXT1 texture format
2018-10-28 21:29:43 -04:00
ReinUsesLisp 80cbd81276 video_core: Move OpenGL specific utils to its renderer 2018-10-28 22:22:30 -03:00
Rodolfo Bogado e8b565b239 renderer_opengl: Correct bpp value for ASTC_2D_8X5_SRGB 2018-10-28 20:52:57 -03:00
FernandoS27 3aa8b644a9 Assert Control Flow Instructions using Control Codes 2018-10-28 19:16:41 -04:00
FernandoS27 dde3094058 Fixed black textures, pixelation and we no longer require to auto-generate mipmaps 2018-10-28 19:00:49 -04:00
FernandoS27 f0e902a7d6 Fixed mipmap block autosizing algorithm 2018-10-28 19:00:05 -04:00
FernandoS27 87f8181405 Fixed Invalid Image size and Mipmap calculation 2018-10-28 19:00:04 -04:00
FernandoS27 f4432b5d0c Fixed Block Resizing algorithm and Clang Format 2018-10-28 19:00:03 -04:00
FernandoS27 258f0f5c31 Implement Mip Filter 2018-10-28 19:00:01 -04:00
FernandoS27 dc85e3bff1 Zero out memory region of recreated surface before flushing 2018-10-28 19:00:00 -04:00
FernandoS27 bbf3b2da0c Implement Mipmaps 2018-10-28 18:59:59 -04:00
Michael 635d1e5651 Enable alpha channel for DXT1 texture format 2018-10-28 14:11:04 -07:00
Tobias 351d5a2227
Correct bpp value for ASTC_2D_8X5 2018-10-28 19:49:10 +01:00
bunnei aa1cf608ed
Merge pull request #1601 from FernandoS27/shader-precision
Improved Shader accuracy on Vertex and Geometry Shaders.
2018-10-28 13:06:21 -04:00
FernandoS27 e5ca097e32 Refactor precise usage and add FMNMX, MUFU, FMUL32 and FADD332 2018-10-28 11:38:40 -04:00
Rodolfo Bogado 0287b2be6d Implement sRGB Support, including workarounds for nvidia driver issues and QT sRGB support 2018-10-28 01:13:55 -03:00
bunnei d63f5acb15
Merge pull request #1594 from FreddyFunk/static-cast
gl_rasterizer_cache: Fix compiler warning
2018-10-27 21:09:06 -04:00
FernandoS27 d8d557df86 Improved Shader accuracy on Vertex and Geometry Shaders with FFMA, FMUL and FADD 2018-10-27 20:09:26 -04:00
bunnei ed95ce6bb7
Merge pull request #1592 from bunnei/prim-restart
gl_rasterizer: Implement primitive restart.
2018-10-27 13:25:00 -04:00
FernandoS27 705300992e Implement Default Block Height for each format 2018-10-27 10:17:39 -04:00
Frederic Laing 0bf24d310e gl_rasterizer_cache: Fix compiler warning 2018-10-27 13:06:26 +02:00
bunnei 58444a0376 gl_rasterizer: Implement primitive restart. 2018-10-26 00:42:57 -04:00
bunnei d278f25bda
Merge pull request #1533 from FernandoS27/lmem
Implemented Shader Local Memory
2018-10-26 00:16:25 -04:00
bunnei 949d9a7136 maxwell_3d: Add code for initializing register defaults. 2018-10-25 23:42:39 -04:00
bunnei 8cea598158 gl_rasterizer: Implement depth range. 2018-10-25 21:53:24 -04:00
bunnei f7a173de6c
Merge pull request #1524 from FernandoS27/layers-fix
rasterizer: Fix Layered Textures Loading and Cubemaps
2018-10-25 00:29:18 -04:00
FernandoS27 ca142f35c0 Implemented LD_L and ST_L 2018-10-24 17:51:53 -04:00
FernandoS27 abefe29398 Implement Shader Local Memory 2018-10-24 17:50:43 -04:00
bunnei 69b35d7615
Merge pull request #1554 from FernandoS27/pointsize
Implement PointSize Output Attribute.
2018-10-24 17:38:38 -04:00
Lioncash 257b7bbfee
decoders: Remove unused variable within SwizzledData() 2018-10-23 23:51:13 -04:00
Lioncash a97cdb5eb4
maxwell_3d: Remove unused variable within ProcessQueryGet() 2018-10-23 23:50:16 -04:00
FernandoS27 ed8ca608a0 Implement PointSize 2018-10-23 15:08:00 -04:00
FernandoS27 e0ea2f5f6e Fixed Layered Textures Loading and Cubemaps 2018-10-23 14:27:36 -04:00
bunnei 5716496239
Merge pull request #1519 from ReinUsesLisp/vsetp
gl_shader_decompiler: Implement VSETP
2018-10-23 10:22:37 -04:00
bunnei 0f3d8c2574
Merge pull request #1539 from lioncash/dma
maxwell_dma: Silence compilation warnings
2018-10-23 10:22:12 -04:00
bunnei 75d807788c
Merge pull request #1470 from FernandoS27/alpha_testing
Implemented Alpha Test using Shader Emulation
2018-10-23 10:21:30 -04:00
ReinUsesLisp 7d6dca0d0a gl_shader_decompiler: Implement VSETP 2018-10-23 01:07:20 -03:00
ReinUsesLisp 5dfb43531c gl_shader_decompiler: Abstract VMAD into a video subset 2018-10-23 01:07:20 -03:00
bunnei 848a49112a
Merge pull request #1512 from ReinUsesLisp/brk
gl_shader_decompiler: Implement PBK and BRK
2018-10-23 00:01:38 -04:00
bunnei 496d155d7b
Merge pull request #1550 from FernandoS27/fmul32
Added Saturation to FMUL32I
2018-10-22 23:58:09 -04:00
bunnei 4cccfb4190
Merge pull request #1537 from lioncash/shader
gl_shader_decompiler: Minor changes
2018-10-22 22:49:49 -04:00
FernandoS27 259da93567 Added Saturation to FMUL32I 2018-10-22 20:22:15 -04:00
FernandoS27 8e1239fbc5 Assert that multiple render targets are not set while alpha testing 2018-10-22 15:35:45 -04:00
FernandoS27 59a004f915 Use standard UBO and fix/stylize the code 2018-10-22 15:07:33 -04:00
FernandoS27 17315cee16 Cache uniform locations and restructure the implementation 2018-10-22 15:07:32 -04:00
FernandoS27 bcb5b924fd Remove SyncAlphaTest and clang format 2018-10-22 15:07:31 -04:00
FernandoS27 7b39107e3a Added Alpha Func 2018-10-22 15:07:30 -04:00
FernandoS27 aa620c14af Implemented Alpha Testing 2018-10-22 15:07:30 -04:00
bunnei 1226a5706e
Merge pull request #1547 from FernandoS27/fix-fset
Fixed FSETP and FSET
2018-10-22 12:53:47 -04:00
FernandoS27 5c5b4e8e7d Fixed FSETP and FSET 2018-10-22 11:31:17 -04:00
FernandoS27 e2416bbd1f Fixed VAOs Float types only returning GL_FLOAT in cases that they had to return GL_HALF_FLOAT 2018-10-22 09:27:00 -04:00
Lioncash c1e5525fc6 engines/maxwell_*: Use nested namespace specifiers where applicable
These three source files are the only ones within the engines directory
that don't use nested namespaces. We may as well change these over to
keep things consistent.
2018-10-20 15:58:09 -04:00
Lioncash d53c73adaa maxwell_dma: Make variables const where applicable within HandleCopy()
These are never modified, so we can make that assumption explicit.
2018-10-20 15:56:01 -04:00
Lioncash dd1ee39426 maxwell_dma: Make FlushAndInvalidate's size parameter a u64
This prevents truncation warnings at the lambda's usage sites.
2018-10-20 15:54:45 -04:00
Lioncash 08e574eec4 maxwell_dma: Remove unused variables in HandleCopy()
These pointer variables are never used, so we can get rid of them.
2018-10-20 15:53:24 -04:00
Lioncash 8a86c8d48b gl_shader_decompiler: Allow std::move to function in SetPredicate
If the variable being moved is const, then std::move will always perform
a copy (since it can't actually move the data).
2018-10-20 14:25:15 -04:00
Lioncash 381baf783d gl_shader_decompiler: Get rid of variable shadowing warnings
A variable with the same name was previously declared in an outer scope.
2018-10-20 14:22:37 -04:00
Lioncash 61ef8af1e2 gl_shader_decompiler: Fix a few comment typos 2018-10-20 14:19:28 -04:00
ReinUsesLisp 3ec795d95e gl_shader_decompiler: Move position varying declaration back to gl_shader_gen
The intention of declaring them in gl_shader_decompiler was to be able
to use blocks to implement geometry shaders. But that wasn't needed in
the end and it caused issues when both vertex stages were being used,
resulting in a redeclaration of "position".
2018-10-20 02:19:30 -03:00
bunnei b1f8bff7db
Merge pull request #1501 from ReinUsesLisp/half-float
gl_shader_decompiler: Implement H* instructions
2018-10-19 23:47:19 -04:00
bunnei 7e665c2721 GPU: Improved implementation of maxwell DMA (Subv). 2018-10-18 22:41:53 -04:00
bunnei bcde71d4d9 decoders: Introduce functions for un/swizzling subrects. 2018-10-18 22:41:43 -04:00
bunnei a5d853a9f8 GPU: Invalidate destination address of kepler_memory writes. 2018-10-18 22:41:13 -04:00
bunnei 6b333d862b fermi_2d: Add support for more accurate surface copies. 2018-10-18 22:41:12 -04:00
bunnei 6acd8d166a
Merge pull request #1505 from FernandoS27/tex-3d
Implemented 3D Textures
2018-10-18 11:50:42 -04:00
ReinUsesLisp 41fb25349a gl_shader_decompiler: Implement PBK and BRK 2018-10-17 21:30:45 -03:00
bunnei 77e2d68df7
Merge pull request #1489 from FernandoS27/fix-tlds
shader_decompiler: Fix TLDS
2018-10-17 18:58:38 -04:00
FernandoS27 caaa9914fd Clang format and other fixes 2018-10-17 18:52:11 -04:00
FernandoS27 cb9fdc7a26 Implement Reinterpret Surface, to accurately blit 3D textures 2018-10-17 18:52:10 -04:00
FernandoS27 dbc34db6ce Implement GetInRange in the Rasterizer Cache 2018-10-17 18:52:10 -04:00
FernandoS27 fd9e2d0073 Implement 3D Textures 2018-10-17 18:52:08 -04:00
bunnei f912a82a8e
Merge pull request #1497 from bunnei/flush-framebuffers
Implement flushing in the rasterizer cache
2018-10-17 18:40:34 -04:00
bunnei 86dcf2942b
Merge pull request #1496 from FernandoS27/tex-array
Implement Arrays on Tex Instruction
2018-10-17 18:30:44 -04:00
bunnei 648b55c6b9 gl_rasterizer_cache: Remove unnecessary block_depth=1 on Flush. 2018-10-17 18:20:15 -04:00
bunnei 2a035a1f6f gl_rasterizer_cache: Remove unnecessary temporary buffer with unswizzle. 2018-10-17 18:19:35 -04:00
bunnei 43b9494a0f gl_rasterizer_cache: Use AccurateCopySurface for use_accurate_gpu_emulation. 2018-10-16 17:20:49 -04:00
bunnei ee7c2dbf5a config: Rename use_accurate_framebuffers -> use_accurate_gpu_emulation.
- This will be used as a catch-all for slow-but-accurate GPU emulation paths.
2018-10-16 17:02:29 -04:00
bunnei 91602de7f2 rasterizer_cache: Refactor to support in-order flushing. 2018-10-16 16:51:53 -04:00
bunnei 0e59291310 gl_rasterizer_cache: Refactor to only call GetRegionEnd on surface creation. 2018-10-16 11:31:02 -04:00
bunnei 949d7832fa gl_rasterizer_cache: Only flush when use_accurate_framebuffers is enabled. 2018-10-16 11:31:02 -04:00
bunnei 5f79ba04bd gl_rasterizer_cache: Separate guest and host surface size managment. 2018-10-16 11:31:01 -04:00
bunnei 58be4dff79 gl_rasterizer_cache: Rename GetGLBytesPerPixel to GetBytesPerPixel.
- This does not really have anything to do with OpenGL.
2018-10-16 11:31:01 -04:00
bunnei cf7b46c101 gl_rasterizer_cache: Remove unused FlushSurface method. 2018-10-16 11:31:01 -04:00
bunnei 3afdfd7bfa gl_rasterizer: Implement flushing. 2018-10-16 11:31:01 -04:00
bunnei b4e29ccb81 gl_rasterizer_cache: Remove usage of Memory::Read/Write functions.
- These cannot be used within the cache, as they change cache state.
2018-10-16 11:31:00 -04:00
bunnei 4e9683e9d5 gl_rasterizer_cache: Clamp cached surface size to mapped GPU region size. 2018-10-16 11:31:00 -04:00
bunnei 37575eae65 memory_manager: Add a method for querying the end of a mapped GPU region. 2018-10-16 11:31:00 -04:00
bunnei 0be7e82289 rasterizer_cache: Reintroduce method for flushing. 2018-10-16 11:31:00 -04:00
bunnei 9b929e934b gl_rasterizer_cache: Reintroduce code for handling swizzle and flush to guest RAM. 2018-10-16 11:30:59 -04:00
ReinUsesLisp 936c36a514 shader_bytecode: Add Control Code enum 0xf
Control Code 0xf means to unconditionally execute the instruction. This
value is passed to most BRA, EXIT and SYNC instructions (among others)
but this may not always be the case.
2018-10-15 15:36:47 -03:00
ReinUsesLisp b461342a84 gl_shader_decompiler: Fixup style inconsistencies 2018-10-15 15:35:26 -03:00
ReinUsesLisp 27916764b1 gl_rasterizer: Silence implicit cast warning in glBindBufferRange 2018-10-15 15:26:50 -03:00
ReinUsesLisp 6312eec5ef gl_shader_decompiler: Implement HSET2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp 4fc8ad67bf gl_shader_decompiler: Implement HSETP2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp 3d65aa4caf gl_shader_decompiler: Implement HFMA2 instructions 2018-10-15 02:55:51 -03:00
ReinUsesLisp d93cdc2750 gl_shader_decompiler: Implement HADD2_IMM and HMUL2_IMM 2018-10-15 02:07:16 -03:00
ReinUsesLisp d46e2a6e7a gl_shader_decompiler: Implement non-immediate HADD2 and HMUL2 instructions 2018-10-15 02:04:31 -03:00
ReinUsesLisp 08d751d882 gl_shader_decompiler: Setup base for half float unpacking and setting 2018-10-15 01:58:30 -03:00
bunnei 14286f70f0
Merge pull request #1488 from Hexagon12/astc-types
video_core: Added ASTC 5x4; 8x5 types
2018-10-14 14:44:24 -04:00
FernandoS27 1d6559fbd3 Implement Arrays on Tex Instruction 2018-10-14 13:31:02 -04:00
FernandoS27 d880b77698 Fix TLDS 2018-10-13 22:14:25 -04:00
FernandoS27 331ce2942c Shorten the implementation of 3D swizzle to only 3 functions 2018-10-13 20:58:00 -04:00
FernandoS27 1ff20d8538 Fix a Crash on Zelda BotW and Splatoon 2, and simplified LoadGLBuffer 2018-10-13 16:11:11 -04:00
FernandoS27 e0ca938b22 Propagate depth and depth_block on modules using decoders 2018-10-13 15:25:18 -04:00
FernandoS27 d4ae43f9c1 Remove old Swizzle algorithms and use 3d Swizzle 2018-10-13 15:25:17 -04:00
FernandoS27 4d959c6bdc Implement Precise 3D Swizzle 2018-10-13 15:25:16 -04:00
FernandoS27 736db284d2 Implement Fast 3D Swizzle 2018-10-13 15:25:15 -04:00
Hexagon12 cbf723896f Added ASTC 5x4; 8x5 2018-10-13 17:10:26 +03:00
FernandoS27 97b6405a17 Implemented helper function to correctly calculate a texture's size 2018-10-12 14:21:53 -04:00
ReinUsesLisp 17290a4416 gl_shader_decompiler: Implement VMAD 2018-10-11 04:15:10 -03:00
bunnei 6d82c4adf9
Merge pull request #1458 from FernandoS27/fix-render-target-block-settings
Fixed block height settings for RenderTargets and Depth Buffers
2018-10-10 21:24:07 -04:00
bunnei 03ec936ca0
Merge pull request #1460 from FernandoS27/scissor_test
Implemented Scissor Testing
2018-10-10 12:04:10 -04:00
bunnei ee1b204749
Merge pull request #1425 from ReinUsesLisp/geometry-shaders
gl_shader_decompiler: Implement geometry shaders
2018-10-10 11:51:29 -04:00
FernandoS27 5f4ee6f0c8 Add memory Layout to Render Targets and Depth Buffers 2018-10-09 22:28:19 -04:00
FernandoS27 af653906d0 Fixed block height settings for RenderTargets and Depth Buffers, and added block width and block depth 2018-10-09 21:14:32 -04:00
Lioncash 6e27c5d4d1 gl_shader_decompiler: Remove unused variables in TMML's implementation
Given "y" isn't always used, but "x" is, we can rearrange this to avoid
unused variable warnings by changing the names of op_a and op_b
2018-10-09 15:44:37 -04:00
FernandoS27 be97fc884d Implement Scissor Test 2018-10-08 21:36:23 -04:00
FernandoS27 30ff42b8cc Assert Scissor tests 2018-10-08 20:49:36 -04:00
ReinUsesLisp 7c2d6ef210 gl_shader_decompiler: Move position varying location from 15 to 0 and apply an offset 2018-10-07 17:36:00 -03:00
ReinUsesLisp ee4d538850 gl_shader_decompiler: Implement geometry shaders 2018-10-07 17:36:00 -03:00
ReinUsesLisp 4d0c682468 video_core: Allow LabelGLObject to use extra info on any object 2018-10-07 17:27:49 -03:00
bunnei 2c0b0ad50d
Merge pull request #1446 from bunnei/fast_fermi_copy
gl_rasterizer: Implement accelerated Fermi2D copies.
2018-10-06 23:18:52 -04:00
bunnei 1cc5e6e9bc
Merge pull request #1437 from FernandoS27/tex-mode2
Implemented Depth Compare, Shadow Samplers and Texture Processing Modes for TEXS and TLDS
2018-10-06 23:14:27 -04:00
ReinUsesLisp 0ecd181cca gl_rasterizer: Fixup undefined behaviour in SetupDraw 2018-10-06 23:22:48 -03:00
FernandoS27 752faff2bc Implemented Depth Compare and Shadow Samplers 2018-10-06 11:27:54 -04:00
bunnei 9aec85d39c fermi_2d: Implement simple copies with AccelerateSurfaceCopy. 2018-10-06 03:20:04 -04:00
bunnei 011cf77796 gl_rasterizer: Add rasterizer cache code to handle accerated fermi copies. 2018-10-06 03:20:04 -04:00
bunnei 749aef3dd0 gl_rasterizer_cache: Implement a simpler surface copy using glCopyImageSubData. 2018-10-06 03:20:04 -04:00
ReinUsesLisp 3e2380327a gl_rasterizer: Implement quads topology 2018-10-04 00:03:44 -03:00
FernandoS27 f664437ae8 Implemented Texture Processing Modes in TEXS and TLDS 2018-10-03 08:41:12 -04:00
ReinUsesLisp 1835911425 gl_rasterizer: Fixup unassigned point sizes 2018-10-01 01:18:24 -03:00
bunnei bc679c9b8c
Merge pull request #1330 from raven02/tlds
TLDS: Add 1D sampler
2018-09-30 21:53:38 -04:00
bunnei df3799a008 gl_rasterizer_cache: Fixes to how we do render to cubemap.
- Fixes issues with Splatoon 2.
2018-09-30 15:10:14 -04:00
bunnei 29782273ec gl_rasterizer_cache: Add check for array rendering to cubemap texture. 2018-09-30 14:31:58 -04:00
bunnei f543b43fd0 gl_rasterizer_cache: Implement render to cubemap. 2018-09-30 14:31:58 -04:00
bunnei 15cc729ebd gl_shader_decompiler: TEXS: Implement TextureType::TextureCube. 2018-09-30 14:31:58 -04:00
bunnei ed2e0e85c9 gl_rasterizer_cache: Add support for SurfaceTarget::TextureCubemap. 2018-09-30 14:31:58 -04:00
bunnei 871580dcd8 gl_rasterizer_cache: Implement LoadGLBuffer for Texture2DArray. 2018-09-30 14:31:57 -04:00
bunnei a9aa1db552 gl_rasterizer_cache: Update BlitTextures to support non-Texture2D ColorTexture surfaces. 2018-09-30 14:31:57 -04:00
bunnei 2e1cdde994 gl_rasterizer_cache: Track texture target and depth in the cache. 2018-09-30 14:31:57 -04:00
bunnei fefb003b23 gl_rasterizer_cache: Workaround for Texture2D -> Texture2DArray scenario. 2018-09-30 14:31:56 -04:00
bunnei ce452049d3 gl_rasterizer_cache: Keep track of surface 2D size separately from total size. 2018-09-30 14:31:56 -04:00
raven02 b92b4bbeaf Fix trailing whitespace 2018-09-30 23:51:10 +08:00
bunnei fe5962e073
Merge pull request #1411 from ReinUsesLisp/point-size
video_core: Implement point_size and add point state sync
2018-09-29 11:58:39 -04:00
bunnei 8d8366b602
Merge pull request #1406 from ReinUsesLisp/multibind-samplers
gl_state: Pack sampler bindings into a single ARB_multi_bind
2018-09-29 10:55:45 -04:00
ReinUsesLisp e3e51d3ddb video_core: Implement point_size and add point state sync 2018-09-28 02:13:29 -03:00
ReinUsesLisp b8f1506aa5 gl_state: Pack sampler bindings into a single ARB_multi_bind 2018-09-28 02:04:22 -03:00
bunnei 62048edc15
Merge pull request #1377 from FernandoS27/faster-swizzle
Improved Fast Swizzle and Legacy Swizzle
2018-09-27 10:00:04 -04:00
ReinUsesLisp ab65fde9f4 video_core: Add asserts for CS, TFB and alpha testing
Add asserts for compute shader dispatching, transform feedback being
enabled and alpha testing. These have in common that they'll probably break
rendering without logging.
2018-09-25 21:07:00 -03:00
David 9f3fc067bf Added glObjectLabels for renderdoc for textures and shader programs (#1384)
* Added glObjectLabels for renderdoc for textures and shader programs

* Changed hardcoded "Texture" name to reflect the texture type instead

* Removed string initialize
2018-09-23 17:55:41 -04:00
greggameplayer b91e2d55f3 correct BC6H 2018-09-23 19:17:22 +02:00
bunnei fb9f273e90
Merge pull request #1380 from lioncash/const
shader_bytecode: Make operator== and operator!= of IpaMode const qualified
2018-09-22 01:37:36 -04:00
bunnei 73eea61614
Merge pull request #1382 from lioncash/inc
gl_state: Remove unused type alias
2018-09-22 01:36:21 -04:00
Lioncash 90746c33c7 gl_state: Remove unused type alias
This isn't used anywhere within the header, so we can remove it, along
with the include that was previously necessary. This also uncovers an
indirect include in the cpp file for the assertion macros.
2018-09-21 18:22:43 -04:00
Lioncash a8f5fd787f shader_bytecode: Lay out the Ipa-related enums better
This is more consistent with the surrounding enums.
2018-09-21 16:17:31 -04:00
Lioncash 272517cf7e shader_bytecode: Make operator== and operator!= of IpaMode const qualified
These don't affect the state of the struct and can be const member
functions.
2018-09-21 16:17:27 -04:00
bunnei 4f186de069
Merge pull request #1379 from lioncash/bitwise
gl_stream_buffer: Fix use of bitwise OR instead of logical OR in Map()
2018-09-21 14:02:00 -04:00
FernandoS27 57b44200a2 Reverse stride align restriction on FastSwizzle due to lost performance 2018-09-21 12:09:59 -04:00
FernandoS27 d2dd1289bd Join both Swizzle methods within one interface function 2018-09-21 11:42:34 -04:00
FernandoS27 41c6c4593a Standarized Legacy Swizzle to look alike FastSwizzle and use a Swizzling Table instead 2018-09-21 11:34:54 -04:00
FernandoS27 f020319a45 Remove same output bpp restriction on FastSwizzle 2018-09-21 11:10:44 -04:00
FernandoS27 68aaa83836 Improved Legacy Swizzler to be better documented and work better 2018-09-21 10:57:12 -04:00
Lioncash ba02dd9ebc gl_stream_buffer: Fix use of bitwise OR instead of logical OR in Map()
This was very likely intended to be a logical OR based off the
conditioning and testing of inversion in one case.

Even if this was intentional, this is the kind of non-obvious thing one
should be clarifying with a comment.
2018-09-21 07:59:03 -04:00
Subv 9cd5c61fcf RasterizerGL: Use the correct framebuffer when clearing via the CLEAR_BUFFERS register.
Previously we were clearing the default backbuffer framebuffer.

Found thanks to a Piglit test :)
2018-09-20 22:31:53 -05:00
FernandoS27 bf2f2a715f Improved fast swizzle and removed restrictions to it 2018-09-20 23:06:53 -04:00
raven02 c8f9bbbf85
Merge branch 'master' into tlds 2018-09-19 19:53:11 +08:00
Markus Wick f465e4aaf2 gl_rasterizer: Fix StartAddress handling with indexed draw calls.
We uploaded the wrong data before. So the offset on the host GPU pointer may work for the first vertices, the last ones run out bounds.
Let's just offset the upload instead.
2018-09-19 09:22:30 +02:00
bunnei bd88d4108f
Merge pull request #1342 from lioncash/trunc
gl_shader_decompiler: Avoid truncation warnings within LD_A and ST_A code
2018-09-18 22:11:48 -04:00
bunnei 0284cbe7ec
Merge pull request #1279 from FernandoS27/csetp
shader_decompiler: Implemented (Partialy) Control Codes and CSETP
2018-09-18 22:10:48 -04:00
bunnei 6415f81bb8
Merge pull request #1299 from FernandoS27/texture-sanatize
shader_decompiler: Asserts for Texture Instructions
2018-09-18 22:10:09 -04:00
FernandoS27 567a5524b9 Implemented Internal Flags 2018-09-17 20:50:54 -04:00
Lioncash 9a8dbba1e5 gl_shader_decompiler: Avoid truncation warnings within LD_A and ST_A code
These are internally stored as u64 values, so using u32 here causes
truncation warnings. Instead, we can just use u64 and preserve the bit
width.
2018-09-17 19:25:55 -04:00
bunnei fafc80d72e
Merge pull request #1290 from FernandoS27/shader-header
Implemented (Partialy) Shader Header
2018-09-17 18:53:14 -04:00
FernandoS27 e4bb759c4b Implemented I2I.CC on the NEU control code, used by SMO 2018-09-17 17:42:46 -04:00
FernandoS27 e2ac8fb36d Implemented CSETP 2018-09-17 17:42:44 -04:00
FernandoS27 aac77bbd18 Implemented Control Codes 2018-09-17 17:42:43 -04:00
FernandoS27 31e52113b3 Added asserts for texture misc modes to texture instructions 2018-09-17 12:56:36 -04:00
FernandoS27 55a4756766 Added texture misc modes to texture instructions 2018-09-17 12:51:05 -04:00
bunnei a94b623dfb
Merge pull request #1311 from FernandoS27/fast-swizzle
Optimized Texture Swizzling
2018-09-17 12:39:34 -04:00
bunnei 27fe8159c5
Merge pull request #1316 from lioncash/shadow
gl_shader_decompiler: Get rid of variable shadowing within LEA instructions
2018-09-17 12:27:35 -04:00
raven02 b91f7d5d67 Add 1D sampler for TLDS - TexelFetch (Mario Rabbids) 2018-09-17 23:25:18 +08:00
bunnei 076add4ccd
Merge pull request #1326 from FearlessTobi/port-4182
Port #4182 from Citra: "Prefix all size_t with std::"
2018-09-17 09:51:47 -04:00
bunnei 3be048e50a
Merge pull request #1329 from raven02/bgr5a1u
Implement RenderTargetFormat::BGR5A1_UNORM
2018-09-17 09:49:00 -04:00
raven02 2845348608 Implement ASTC_2D_8X8 (Bayonetta 2) 2018-09-17 01:04:27 +08:00
bunnei ba480ea2fb
Merge pull request #1273 from Subv/ld_sizes
Shaders: Implemented multiple-word loads and stores to and from attribute memory.
2018-09-15 15:27:12 -04:00
bunnei daee15b058
Merge pull request #1271 from Subv/kepler_engine
GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
2018-09-15 13:27:07 -04:00
raven02 0019a36b41 Implement RenderTargetFormat::BGR5A1_UNORM (Pokken Tournament DX) 2018-09-16 00:21:42 +08:00
Subv c878a819d7 Shaders: Implemented multiple-word loads and stores to and from attribute memory.
This seems to be an optimization performed by nouveau.
2018-09-15 11:21:21 -05:00
fearlessTobi 63c2e32e20 Port #4182 from Citra: "Prefix all size_t with std::" 2018-09-15 15:21:06 +02:00
FernandoS27 f8e994354f Optimized Texture Swizzling 2018-09-14 12:45:49 -04:00
Lioncash ae128f0375 gl_shader_decompiler: Get rid of variable shadowing within LEA instructions
These variables are already defined within an outer scope.
2018-09-13 21:53:23 -04:00
ReinUsesLisp a42376dfad Use ARB_multi_bind for uniform buffers (#1287)
* gl_rasterizer: use ARB_multi_bind for uniform buffers

* address feedback
2018-09-12 20:27:43 -04:00
bunnei 4a43fb7e1d gl_rasterizer_cache: B5G6R5U should use GL_RGB8 as an internal format.
- Fixes a regression with Sonic Mania with ARB_texture_storage.
2018-09-12 18:09:31 -04:00
bunnei cc50857460
Merge pull request #1263 from FernandoS27/tex-mode
shader_decompiler:  Implemented (Partially) Texture Processing Modes
2018-09-12 16:03:34 -04:00
Subv bb5eb4f20a GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
This engine writes data from a FIFO register into the configured address.
2018-09-12 13:57:08 -05:00
FernandoS27 a99d9db32f Implemented Texture Processing Modes 2018-09-12 12:28:22 -04:00
bunnei 89825766ee
Merge pull request #1278 from tech4me/bg-color-fix
Port Citra #4047 & #4052: add change background color support
2018-09-11 23:13:11 -04:00
bunnei 522a11a11f
Merge pull request #1295 from bunnei/accurate-copies
gl_rasterizer_cache: Improve accuracy of caching and copies.
2018-09-11 23:12:15 -04:00
bunnei 4a9acc87f9
Merge pull request #1294 from degasus/optimizations
gl_rasterizer: Use ARB_texture_storage.
2018-09-11 23:11:36 -04:00
bunnei 7bb226f22d gl_rasterizer_cache: Always blit on recreate, regardless of format.
- Fixes several rendering issues with Super Mario Odyssey.
2018-09-11 22:54:46 -04:00
bunnei cdddd71d08 gl_shader_cache: Remove cache_width/cache_height.
- This was once an optimization, but we no longer need it with the cache reserve.
- This is also inaccurate.
2018-09-11 20:12:29 -04:00
Markus Wick 3e973bc4c6 gl_rasterizer: Use ARB_texture_storage.
It allows us to use texture views and it reduces the overhead within the GPU driver.

But it disallows us to reallocate the texture, but we don't do so anyways.

In the end, it is the new way to allocate textures, so there is no need to use the old way.
2018-09-11 22:18:46 +02:00
FernandoS27 5c676dc884 Implemented LEA and PSET 2018-09-11 12:50:52 -04:00
FernandoS27 3f0922715a Implemented encodings for LEA and PSET 2018-09-11 12:50:25 -04:00
FernandoS27 2b48cfd44b Replace old FragmentHeader for the new Header 2018-09-11 12:48:19 -04:00
FernandoS27 e926757c8f Implemented (Partialy) Shader Header 2018-09-11 12:34:27 -04:00
David Marcec 4c3bd33be2 Fixed renderdoc input/output textures not working due to render targets 2018-09-11 13:31:20 +10:00
bunnei d6e8e16a66
Merge pull request #1286 from bunnei/multi-clear
gl_rasterizer: Implement clear for non-zero render targets.
2018-09-10 20:30:14 -04:00
bunnei 12445b476d
Merge pull request #1285 from bunnei/depth-fix
gl_rasterizer_cache: Only use depth for applicable texture formats.
2018-09-10 20:28:40 -04:00
bunnei d884e805c5
Merge pull request #1284 from bunnei/bgra8_srgb
gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
2018-09-10 20:28:00 -04:00
Markus Wick c1b8cd9058 video_core: Refactor command_processor.
Inline the WriteReg helper as it is called ~20k times per frame.
2018-09-10 22:06:16 +02:00
Markus Wick 0cfb0bacb2 video_core: Move command buffer loop.
This moves the hot loop into video_core. This refactoring shall reduce the CPU overhead of calling ProcessCommandList.
2018-09-10 22:06:13 +02:00
Markus Wick c560043581 rasterizer: Drop unused handler.
This virtual function is called in a very hot spot, and it does nothing.

If this kind of feature is required, please be more specific and add callbacks
in the switch statement within Maxwell3D::WriteReg. There is no point in having
another switch statement within the rasterizer.
2018-09-10 22:03:10 +02:00
bunnei 4c0b1cc1ae gl_rasterizer_cache: Only use depth for applicable texture formats.
- Fixes an issue with Octopath Traveler leaving stale data here.
2018-09-10 00:50:38 -04:00
bunnei 035e6bd407 gl_rasterizer: Implement clear for non-zero render targets.
- Several misc. changes to ConfigureFramebuffers in support of this.
2018-09-10 00:41:20 -04:00
bunnei 1c34498368 gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
- Used by Octopath Traveler (with multiple render targets).
2018-09-10 00:37:52 -04:00
bunnei 49b15af054 gl_rasterizer: Implement multiple color attachments. 2018-09-09 22:48:28 -04:00
bunnei e58855c7a4
Merge pull request #1268 from FernandoS27/tmml
shader_decompiler: Implemented TMML
2018-09-09 21:39:39 -04:00
FernandoS27 00131e752d Implemented TMML 2018-09-09 20:46:31 -04:00
bunnei 223ddb2008
Merge pull request #1272 from Subv/dma_2d
GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
2018-09-09 19:53:17 -04:00
bunnei fcf81147e7
Merge pull request #1280 from zero334/improvements
video_core: fixed arithmetic overflow warnings & improved code style
2018-09-09 19:51:46 -04:00
FernandoS27 073a21ac0b Implemented TXQ dimension query type, used by SMO. 2018-09-09 11:59:01 -04:00
Patrick Elsässer 64e45b04e0 video_core: fixed arithmetic overflow warnings & improved code style
- Fixed all warnings, for renderer_opengl items, which were indicating a
possible incorrect behavior from integral promotion rules and types
larger than those in which arithmetic is typically performed.
- Added const for variables where possible and meaningful.
- Added constexpr where possible.
2018-09-09 17:51:43 +02:00
tech4me 3dcedb36b4 Port Citra #4047 & #4052: add change background color support 2018-09-08 17:00:21 -07:00
FernandoS27 82a313a14c Change name of TEXQ to TXQ, in order to match NVIDIA's naming 2018-09-08 18:08:57 -04:00
Subv fdb199290b GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
When not set, this tells the GPU to only use the X size when performing a DMA copy.
This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert.

This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
2018-09-08 16:02:16 -05:00
bunnei af074ee422
Merge pull request #1256 from bunnei/tex-target-support
Initial support for non-2D textures
2018-09-08 16:14:46 -04:00
bunnei 8b08cb925b gl_rasterizer: Use baseInstance instead of moving the buffer points.
This hopefully helps our cache not to redundant upload the vertex buffer.

# Conflicts:
#	src/video_core/renderer_opengl/gl_rasterizer.cpp
2018-09-08 04:05:56 -04:00
Patrick Elsässer a8974f0556 video_core: Arithmetic overflow warning fix for gl_rasterizer (#1262)
* video_core: Arithmetic overflow fix for gl_rasterizer

- Fixed warnings, which were indicating incorrect behavior from integral
promotion rules and types larger than those in which arithmetic is
typically performed.

- Added const for variables where possible and meaningful.

* Changed the casts from C to C++ style

Changed the C-style casts to C++ casts as proposed.
Took also care about signed / unsigned behaviour.
2018-09-08 02:59:59 -04:00
bunnei 23ae7cf9db gl_rasterizer_cache: Improve accuracy of RecreateSurface for non-2D textures. 2018-09-08 02:53:39 -04:00
bunnei fdd5c97a14 maxwell_3d: Remove assert that no longer applies. 2018-09-08 02:53:39 -04:00
bunnei f165a85398 gl_rasterizer_cache: Partially implement several non-2D texture types. 2018-09-08 02:53:38 -04:00
bunnei 0731383124 gl_shader_decompiler: Partially implement several non-2D texture types (Subv). 2018-09-08 02:53:38 -04:00
bunnei 05f6f59ffb gl_rasterizer: Implement texture wrap mode p. 2018-09-08 02:53:38 -04:00
bunnei ce8291f6c5 gl_rasterizer_cache: Track texture depth. 2018-09-08 02:53:38 -04:00
bunnei 9dccf7e1fa gl_rasterizer_cache: Remove impl. of FlushGLBuffer.
- Will not work for non-2d textures, and was not used anyways.
2018-09-08 02:53:37 -04:00
bunnei 030676b95d gl_rasterizer_cache: Keep track of texture type per surface. 2018-09-08 02:53:37 -04:00
bunnei a439f7b6e1 gl_rasterizer_cache: Remove unused DownloadGLTexture. 2018-09-08 02:53:37 -04:00
bunnei b56e5edafc gl_state: Keep track of texture target. 2018-09-08 02:53:37 -04:00
bunnei 9947c6ad59
Merge pull request #1252 from lioncash/header
video_core/CMakeLists: Add missing gl_buffer_cache.h
2018-09-06 19:19:43 -04:00
bunnei 9b50dca2bb
Merge pull request #1253 from lioncash/explicit
video_core/gl_buffer_cache: Minor tidying changes
2018-09-06 19:19:35 -04:00
bunnei 009a2cc9cc
Merge pull request #1255 from bunnei/minor-opt
gl_rasterizer: Call state.Apply only once on SetupShaders.
2018-09-06 19:19:16 -04:00
bunnei 820f646458 gl_rasterizer: Call state.Apply only once on SetupShaders. 2018-09-06 17:41:53 -04:00
bunnei 948f6c0738 gl_shader_decompiler: Implement saturate mode for IPA. 2018-09-06 17:40:03 -04:00
Lioncash ddcdbce067 gl_buffer_cache: Default initialize member variables
Ensures that the cache always has a deterministic initial state.
2018-09-06 15:07:15 -04:00
Lioncash 8d685a29bc gl_buffer_cache: Make GetHandle() a const member function
GetHandle() internally calls GetHandle() on the stream_buffer instance,
which is a const member function, so this can be made const as well.
2018-09-06 15:07:15 -04:00
Lioncash 14230fe2af gl_buffer_cache: Remove unnecessary includes 2018-09-06 15:05:52 -04:00
Lioncash 68296d9474 gl_buffer_cache: Make constructor explicit
Implicit conversions during construction isn't desirable here.
2018-09-06 14:54:49 -04:00
Lioncash 8f4e09ba07 video_core/CMakeLists: Add missing gl_buffer_cache.h
Without this, the header file won't show up by default within IDEs such
as Visual Studio.
2018-09-06 14:49:51 -04:00
Markus Wick a781042700 gl_shader_gen: Initialize position.
IMO the old code is fine, but nvidia raises shader compiler warnings.
Trivial fix through...
2018-09-06 13:37:50 +02:00
bunnei 77554ac773
Merge pull request #1243 from degasus/VAO_cache
gl_rasterizer: Implement a VAO cache.
2018-09-05 22:50:52 -04:00
bunnei 6f09c5b128
Merge pull request #1244 from FernandoS27/ipa
shader_decompiler: Implemented IPA Properly (Stage 1)
2018-09-05 21:20:40 -04:00
FernandoS27 e63b229f4a Implemented IPA Properly 2018-09-05 20:15:47 -04:00
Markus Wick 7f15306f78 gl_rasterizer: Skip TODO log.
This is called ~3k times per frame in SMO ingame.
My laptop spends ~3ms per frame on allocating and freeing this string.

Let's just stop printing this kind of redundant information.
2018-09-05 20:20:20 +02:00
Markus Wick d3ad9469a1 gl_rasterizer: Implement a VAO cache.
This patch caches VAO objects instead of re-emiting all pointers per draw call.
Configuring this pointers is known as a fast task, but it yields too many GL
calls. So for better performance, just bind the VAO instead of 16 pointers.
2018-09-05 18:46:35 +02:00
Markus Wick 50a806ea67 renderer_opengl: Implement a buffer cache.
The idea of this cache is to avoid redundant uploads. So we are going
to cache the uploaded buffers within the stream_buffer and just reuse
the old pointers.
The next step is to implement a VBO cache on GPU memory, but for now,
I want to check the overhead of the cache management. Fetching the
buffer over PCI-E should be quite fast.
2018-09-05 08:03:50 +02:00
Markus Wick 99a71580c4 gl_shader_cache: Use an u32 for the binding point cache.
The std::string generation with its malloc and free requirement
was a noticeable overhead. Also switch to an ordered_map to
avoid the std::hash call. As those maps usually have a size of
two elements, the lookup time shall not matter.
2018-09-04 21:04:41 +02:00
bunnei 9a07e9f805
Merge pull request #1237 from degasus/optimizations
Optimizations
2018-09-04 12:16:06 -04:00
bunnei 26e96d16d0
Merge pull request #1232 from lioncash/copy
gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
2018-09-04 11:52:25 -04:00
Markus Wick 2081ed7db2 command_processor: Use std::array for bound_engines.
subchannel is a 3 bit field. So there must not be more than 8 bound engines.
And using a hashmap for up to 8 values is a bit overpowered.
2018-09-04 14:10:05 +02:00
Markus Wick 10bc725944 Update microprofile scopes.
Blame the subsystems which deserve the blame :)

The updated list is not complete, just the ones I've spotted on random sampling the stack trace.
2018-09-04 11:04:26 +02:00
Lioncash 18a89931a9 gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
Using the getter function intended for external code here makes an
unnecessary copy of the already-accessible used_shaders vector.
2018-09-02 13:10:11 -04:00
bunnei 325f3e0693
Merge pull request #1213 from DarkLordZach/octopath-fs
filesystem/maxwell_3d: Various changes to boot Project Octopath Traveller
2018-09-02 10:49:18 -04:00
bunnei 89be49d2f3
Merge pull request #1215 from ogniK5377/texs-nodep-assert
Added assert for TEXS nodep
2018-09-02 10:48:27 -04:00
bunnei 177c45e97d
Merge pull request #1214 from ogniK5377/ipa-assert
Added better asserts to IPA, Renamed IPA modes to match mesa
2018-09-02 10:44:43 -04:00
bunnei 9c206fe94d
Merge pull request #1216 from ogniK5377/ffma-assert
Added FFMA asserts and missing fields
2018-09-02 10:44:13 -04:00
David Marcec 60754b4728 Removed saturate assert
Unneeded as we already implement it
2018-09-01 19:33:32 +10:00
David Marcec 2edab4e840 Removed saturate assert
Saturate already implemented
2018-09-01 19:29:20 +10:00
David Marcec 2bc6abb9a1 Changed tab5980_0 default from 0 -> 1 2018-09-01 19:15:03 +10:00
David Marcec 6f8ed9508d Added FMUL asserts 2018-09-01 19:05:10 +10:00
David Marcec b89fc407d7 Added FFMA asserts 2018-09-01 18:45:14 +10:00
David Marcec 948bc87a59 Added assert for TEXS nodep 2018-09-01 17:00:01 +10:00
David Marcec ad3dca7e62 Added better asserts to IPA, Renamed IPA modes to match mesa
IpaMode is changed to IpaInterpMode
IpaMode is suppose to be 2 bits not 3
Added IpaSampleMode
Added Saturate

Renamed modes based on
d27c791891/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp (L2530)
2018-09-01 16:34:27 +10:00
Zach Hilman f32e28c7b8 maxwell_3d: Use CoreTiming for query timestamp 2018-08-31 23:25:18 -04:00
Lioncash 4a587b81b2 core/core: Replace includes with forward declarations where applicable
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.

This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.

This should make turnaround for changes much faster for developers.
2018-08-31 16:30:14 -04:00
bunnei 7f7eb29323 gl_rasterizer_cache: Use accurate framebuffer setting for accurate copies. 2018-08-31 13:07:28 -04:00
bunnei 123c065086 gl_rasterizer_cache: Also use reserve cache for RecreateSurface. 2018-08-31 13:07:28 -04:00
bunnei 9bc71fcc5f rasterizer_cache: Use boost::interval_map for a more accurate cache. 2018-08-31 13:07:28 -04:00
bunnei d647d9550c gl_renderer: Cache textures, framebuffers, and shaders based on CPU address. 2018-08-31 13:07:27 -04:00
bunnei 16d65182f9 gl_rasterizer: Fix issues with the rasterizer cache.
- Use a single cached page map.
- Fix calculation of ending page.
2018-08-31 13:07:27 -04:00
greggameplayer 06578e89b2 Implement BC6H_UF16 & BC6H_SF16 (#1092)
* Implement BC6H_UF16 & BC6H_SF16
Require by ARMS

* correct coding style

* correct coding style part 2
2018-08-31 12:11:19 -04:00
bunnei f08d24e9c0
Merge pull request #1204 from lioncash/pimpl
core: Make the main System class use the PImpl idiom
2018-08-31 11:31:20 -04:00
bunnei 6683bf50b5
Merge pull request #1207 from degasus/hotfix
Report correct shader size.
2018-08-31 11:21:15 -04:00
Lioncash e2457418da core: Make the main System class use the PImpl idiom
core.h is kind of a massive header in terms what it includes within
itself. It includes VFS utilities, kernel headers, file_sys header,
ARM-related headers, etc. This means that changing anything in the
headers included by core.h essentially requires you to rebuild almost
all of core.

Instead, we can modify the System class to use the PImpl idiom, which
allows us to move all of those headers to the cpp file and forward
declare the bulk of the types that would otherwise be included, reducing
compile times. This change specifically only performs the PImpl portion.
2018-08-31 07:16:57 -04:00
Markus Wick 5be8b7a362 Report correct shader size.
Seems like this was an oversee in regards to 1fd979f50a
It changed GLShader::ProgramCode to a std::vector, so sizeof is wrong.
2018-08-31 09:56:37 +02:00
Hexagon12 d626bc8c62 Added predicate comparison GreaterEqualWithNan 2018-08-31 10:40:18 +03:00
Laku 915ab81ec2 gl_shader_decompiler: Implement POPC (#1203)
* Implement POPC

* implement invert
2018-08-30 21:32:58 -04:00
bunnei d6accf96ff
Merge pull request #1200 from bunnei/improve-ipa
gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.
2018-08-30 10:31:26 -04:00
tech4me a6dd577d02 Shaders: Implemented IADD3 2018-08-29 13:44:41 -04:00
bunnei b1ccd88434 gl_shader_decompiler: Improve IPA for Pass mode with Position attribute. 2018-08-29 00:37:29 -04:00
bunnei 4d7e1662c8
Merge pull request #1193 from lioncash/priv
gpu: Make memory_manager private
2018-08-28 12:28:57 -04:00
bunnei eb4f2d5596
Merge pull request #1192 from lioncash/unused
gl_rasterizer: Remove unused variables
2018-08-28 12:28:13 -04:00
Lioncash 2e7dc4cac9 gl_shader_cache: Remove unused program_code vector in GetShaderAddress()
Given std::vector is a type with a non-trivial destructor, this
variable cannot be optimized away by the compiler, even if unused.
Because of that, something that was intended to be fairly lightweight,
was actually allocating 32KB and deallocating it at the end of the
function.
2018-08-28 11:20:41 -04:00
Lioncash 45fb74d262 gpu: Make memory_manager private
Makes the class interface consistent and provides accessors for
obtaining a reference to the memory manager instance.

Given we also return references, this makes our more flimsy uses of
const apparent, given const doesn't propagate through pointers in the
way one would typically expect. This makes our mutable state more
apparent in some places.
2018-08-28 11:11:50 -04:00
Lioncash 6771a18c6c gl_rasterizer: Remove unused variables 2018-08-28 10:46:29 -04:00
bunnei b55d8111e6 renderer_opengl: Implement a new shader cache. 2018-08-27 18:26:46 -04:00
bunnei a0e1566dc5 gl_rasterizer_cache: Update to use RasterizerCache base class. 2018-08-27 18:26:46 -04:00
bunnei 382852418b video_core: Add RasterizerCache class for common cache management code. 2018-08-27 18:26:45 -04:00
bunnei 2f5ed3877c
Merge pull request #1169 from Lakumakkara/sel
shader_bytecode: fix SEL_IMM bitstring
2018-08-27 18:24:57 -04:00
bunnei f96ded9815
Merge pull request #1174 from lioncash/debug
debug_utils: Minor individual interface changes
2018-08-27 15:44:29 -04:00
bunnei be2f1eabd7
Merge pull request #1173 from lioncash/batch
maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
2018-08-25 10:59:54 -04:00
bunnei 23b86fd3ea
Merge pull request #1167 from lioncash/assert
gl_rasterizer: Correct assertion condition in SyncLogicOpState()
2018-08-25 10:50:59 -04:00
Lioncash c65713832c debug_utils: Remove unused includes
Quite a bit of these aren't necessary directly within the debug_utils
header and can be removed or included where actually necessary.
2018-08-24 20:49:14 -04:00
Lioncash 1e6a209649 debug_utils: Make BreakpointObserver class' constructor explicit
Avoids implicit conversions.
2018-08-24 20:49:14 -04:00
Lioncash b6425c0511 debug_utils: Initialize active_breakpoint member of DebugContext
Ensures that all class members are initialized.
2018-08-24 20:15:50 -04:00
Lioncash 20800f2df7 maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
The start and finish events should likely not be right after one another
like this, otherwise the batch will appear to complete immediately
2018-08-24 19:58:05 -04:00
Laku 36093a3e4d fix SEL_IMM bitstring 2018-08-24 07:18:12 +03:00
Lioncash 8fd9eb71b4 gl_rasterizer: Correct assertion condition in SyncLogicOpState()
Previously the assert would always be hit, since it was the equivalent
of: array == nullptr, which is never true.
2018-08-23 23:00:54 -04:00
tech4me ba2972bc64 Shaders: Added decodings for IADD3 instructions 2018-08-23 15:46:59 -04:00
bunnei 0dce6d7008
Merge pull request #1160 from bunnei/surface-reserve
gl_rasterizer_cache: Several improvements
2018-08-23 12:04:37 -04:00
bunnei d65f079cc1 gl_rasterizer_cache: Blit when possible on RecreateSurface. 2018-08-23 11:27:01 -04:00