Commit graph

2790 commits

Author SHA1 Message Date
Fernando Sahmkow da91e6e4b6 Apply Const correctness to SwizzleKepler and replace u32 for size_t on iterators. 2019-04-16 12:00:46 -04:00
Fernando Sahmkow 13d626fc21 Use ReadBlockUnsafe for fetyching DMA CommandLists 2019-04-16 11:22:34 -04:00
Fernando Sahmkow 06d1c5a991 Document unsafe versions and add BlockCopyUnsafe 2019-04-16 10:11:35 -04:00
Fernando Sahmkow 6fc562a9aa Use ReadBlockUnsafe for Shader Cache 2019-04-15 23:34:03 -04:00
Fernando Sahmkow ef381e6924 Use ReadBlockUnsafe on TIC and TSC reading
Use ReadBlockUnsafe on TIC and TSC reading as memory is never flushed
from host GPU there.
2019-04-15 23:10:24 -04:00
Fernando Sahmkow 367704aa82 GPU MemoryManager: Implement ReadBlockUnsafe and WriteBlockUnsafe 2019-04-15 23:01:35 -04:00
Fernando Sahmkow 3e96c367bd Use WriteBlock and ReadBlock. 2019-04-15 22:42:34 -04:00
Fernando Sahmkow bec28d692d Implement Block Linear copies in Kepler Memory. 2019-04-15 21:22:16 -04:00
ReinUsesLisp ef8245bed2 vk_shader_decompiler: Add missing operations 2019-04-15 21:32:57 -03:00
ReinUsesLisp f43995ec53 shader_ir/decode: Fix half float pre-operations and remove MetaHalfArithmetic
Operations done before the main half float operation (like HAdd) were
managing a packed value instead of the unpacked one. Adding an unpacked
operation allows us to drop the per-operand MetaHalfArithmetic entry,
simplifying the code overall.
2019-04-15 21:16:10 -03:00
ReinUsesLisp abcbcb1b2a gl_shader_decompiler: Fix MrgH0 decompilation
GLSL decompilation for HMergeH0 was wrong. This addresses that issue.
2019-04-15 21:16:10 -03:00
ReinUsesLisp 64613db605 shader_ir/decode: Implement half float saturation 2019-04-15 21:16:10 -03:00
ReinUsesLisp 90cbf89303 shader_ir/decode: Reduce severity of unimplemented half-float FTZ 2019-04-15 21:16:09 -03:00
ReinUsesLisp acf618afbc renderer_opengl: Implement half float NaN comparisons 2019-04-15 21:13:26 -03:00
ReinUsesLisp ae46ad48ed shader_ir: Avoid using static on heap-allocated objects
Using static here might be faster at runtime, but it adds a heap
allocation called before main.
2019-04-15 21:12:43 -03:00
Fernando Sahmkow aa471274d9 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
Fernando Sahmkow 8a099ac99f Correct Kepler Memory on Linear Pushes. 2019-04-15 14:51:36 -04:00
Fernando Sahmkow 773d955dfa Support compressed formats on linear textures. 2019-04-15 13:56:09 -04:00
Fernando Sahmkow bf561e4340 Correct Pitch in Fermi2D 2019-04-15 12:24:29 -04:00
ReinUsesLisp f15c59a164 gl_shader_decompiler: Use variable AOFFI on supported hardware 2019-04-14 05:13:19 -03:00
ReinUsesLisp 5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
bunnei c9454c8422
Merge pull request #2373 from FernandoS27/z32
Set Pixel Format to Z32 if its R32F and depth compare enabled, and Implement format ZF32_X24S8
2019-04-13 22:14:51 -04:00
bunnei ee2206a1b7
Merge pull request #2386 from ReinUsesLisp/shader-manager
gl_shader_manager: Move code to source file and minor clean up
2019-04-13 22:09:27 -04:00
Lioncash 6d0551196d video_core/gpu: Create threads separately from initialization
Like with CPU emulation, we generally don't want to fire off the threads
immediately after the relevant classes are initialized, we want to do
this after all necessary data is done loading first.

This splits the thread creation into its own interface member function
to allow controlling when these threads in particular get created.
2019-04-11 22:11:40 -04:00
bunnei ea80e2bc57
Merge pull request #2235 from ReinUsesLisp/spirv-decompiler
vk_shader_decompiler: Implement a SPIR-V decompiler
2019-04-11 21:54:23 -04:00
Fernando Sahmkow c9305959d3 gl_rasterizer_cache: Relax restrictions on FastCopySurface and FastLayeredCopySurface 2019-04-11 13:14:28 -04:00
bunnei 6951741a94
Merge pull request #2278 from ReinUsesLisp/vc-texture-cache
video_core: Implement API agnostic view based texture cache
2019-04-10 21:17:35 -04:00
bunnei 0371650bd7
Merge pull request #2372 from FernandoS27/fermi-fix
Correct Fermi Copy on Linear Textures.
2019-04-10 21:17:03 -04:00
ReinUsesLisp 93af663683 gl_shader_manager: Move code to source file and minor clean up 2019-04-10 19:29:15 -03:00
ReinUsesLisp 6df25e9c7b gl_rasterizer: Apply just the needed state on Clear 2019-04-10 18:13:15 -03:00
ReinUsesLisp 0032821864 gl_device: Implement interface and add uniform offset alignment 2019-04-10 15:56:12 -03:00
ReinUsesLisp 75d23a3679 vk_shader_decompiler: Implement flow primitives 2019-04-10 14:20:25 -03:00
ReinUsesLisp 58ad8dfac6 vk_shader_decompiler: Implement most common texture primitives 2019-04-10 14:20:25 -03:00
ReinUsesLisp 4667ed8e22 vk_shader_decompiler: Implement texture decompilation helper functions 2019-04-10 14:20:25 -03:00
ReinUsesLisp 676172e20d vk_shader_decompiler: Implement Assign and LogicalAssign 2019-04-10 14:20:25 -03:00
ReinUsesLisp d316d248ab vk_shader_decompiler: Implement non-OperationCode visits 2019-04-10 14:20:25 -03:00
ReinUsesLisp b758c861b0 vk_shader_decompiler: Implement OperationCode decompilation interface 2019-04-10 14:20:25 -03:00
ReinUsesLisp fec4eb9776 vk_shader_decompiler: Implement Visit 2019-04-10 14:20:25 -03:00
ReinUsesLisp ca51f99840 vk_shader_decompiler: Implement labels tree and flow 2019-04-10 14:20:25 -03:00
ReinUsesLisp 13aa664f3f vk_shader_decompiler: Implement declarations 2019-04-10 14:20:25 -03:00
ReinUsesLisp ad53b233c5 vk_shader_decompiler: Declare and stub interface for a SPIR-V decompiler 2019-04-10 14:20:25 -03:00
ReinUsesLisp 970d9e57c8 video_core: Add sirit as optional dependency with Vulkan
sirit is a runtime assembler for SPIR-V
2019-04-10 14:20:25 -03:00
bunnei 97648f4841
Merge pull request #2345 from ReinUsesLisp/multibind
gl_rasterizer: Use ARB_multi_bind to update buffers with a single call per drawcall
2019-04-10 11:23:19 -04:00
bunnei ed9dba89d3
Merge pull request #2375 from FernandoS27/fix-ldc
Remove unnecessary bounding in LD_C
2019-04-09 21:23:24 -04:00
Fernando Sahmkow c9f35d96be Remove bounding in LD_C 2019-04-09 20:02:11 -04:00
bunnei 2598433f9c
Merge pull request #2354 from lioncash/header
video_core/texures/texture: Remove unnecessary includes
2019-04-09 19:19:41 -04:00
bunnei 353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
bunnei bc7e149835
Merge pull request #2369 from FernandoS27/mip-align
gl_backend: Align Pixel Storage
2019-04-09 17:20:43 -04:00
Fernando Sahmkow cd91e98dab Correct Fermi Copy on Linear Textures. 2019-04-09 14:13:58 -04:00
Fernando Sahmkow 7c458311d3 Implement Texture Format ZF32_X24S8. 2019-04-09 12:33:46 -04:00
Fernando Sahmkow b0aa8ad736 Correct depth compare with color formats for R32F 2019-04-09 12:06:59 -04:00
Fernando Sahmkow 9f16833097 gl_backend: Align Pixel Storage
This commit makes sure GL reads on the correct pack size for the
respective texture buffer.
2019-04-08 17:16:02 -04:00
Fernando Sahmkow 5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
Fernando Sahmkow 16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
Fernando Sahmkow ef8be408d3 Adapt Bindless to work with AOFFI 2019-04-08 12:07:56 -04:00
Fernando Sahmkow 492040bd9c Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
Fernando Sahmkow 797e351bf8 Fix bad rebase 2019-04-08 11:35:22 -04:00
Fernando Sahmkow c60b0b8432 Fix TMML 2019-04-08 11:35:22 -04:00
Fernando Sahmkow a77e9a27b0 Simplify ConstBufferAccessor 2019-04-08 11:35:19 -04:00
Fernando Sahmkow fd4e994de3 Refactor GetTextureCode and GetTexCode to use an optional instead of optional parameters 2019-04-08 11:35:18 -04:00
Fernando Sahmkow 4841440382 Implement TXQ_B 2019-04-08 11:29:52 -04:00
Fernando Sahmkow 189bd1980c Implement TMML_B 2019-04-08 11:29:49 -04:00
Fernando Sahmkow ac3ba9a33e Corrections to TEX_B 2019-04-08 11:28:44 -04:00
Fernando Sahmkow 90d06acfed Fixes to Const Buffer Accessor and Formatting 2019-04-08 11:23:47 -04:00
Fernando Sahmkow 7af82ca022 Implement Bindless Handling on SetupTexture 2019-04-08 11:23:46 -04:00
Fernando Sahmkow fe392fff24 Unify both sampler types. 2019-04-08 11:23:45 -04:00
Fernando Sahmkow e28fd3d0a5 Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
Fernando Sahmkow c4ac05c82c Implement Const Buffer Accessor 2019-04-08 11:19:34 -04:00
bunnei f14328bf0a
Merge pull request #2300 from FernandoS27/null-shader
shader_cache: Permit a Null Shader in case of a bad host_ptr.
2019-04-07 17:58:27 -04:00
bunnei c2fee0e519
Merge pull request #2355 from ReinUsesLisp/sync-point
maxwell_3d: Reduce severity of ProcessSyncPoint
2019-04-07 17:56:11 -04:00
bunnei 8aaf418bd6
Merge pull request #2306 from ReinUsesLisp/aoffi
shader_ir: Implement AOFFI for TEX and TLD4
2019-04-07 17:52:30 -04:00
bunnei 6b18a1592f
Merge pull request #2321 from ReinUsesLisp/gl-state-rework
gl_state: Rework to enable individual applies
2019-04-07 17:50:07 -04:00
bunnei 21a4e7deea
Merge pull request #2098 from FreddyFunk/disk-cache-zstd
gl_shader_disk_cache: Use Zstandard for compression
2019-04-07 17:48:33 -04:00
bunnei 80162888e6
Merge pull request #2352 from bunnei/mem-manager-fixes
memory_manager: Improved implementation of read/write/copy block.
2019-04-07 17:44:59 -04:00
Fernando Sahmkow 021cd56bc9 Permit a Null Shader in case of a bad host_ptr. 2019-04-07 07:52:01 -04:00
ReinUsesLisp ddcb711ee8 maxwell_3d: Reduce severity of ProcessSyncPoint 2019-04-06 02:18:20 -03:00
Lioncash 89c106e31b video_core/textures/convert: Replace include with a forward declaration
Avoids dragging in a direct dependency in a header.
2019-04-06 00:14:36 -04:00
Lioncash fbf452ab0e video_core/texures/texture: Remove unnecessary includes
Nothing in this header relies on common_funcs or the memory manager.

This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06 00:03:35 -04:00
bunnei 864280fabc
Merge pull request #2317 from FernandoS27/sync
Implement SyncPoint Register in the GPU.
2019-04-05 23:50:54 -04:00
bunnei e3402d976d
Merge pull request #2346 from lioncash/header
video_core/engines: Remove unnecessary inclusions where applicable
2019-04-05 23:44:27 -04:00
bunnei 20be92d5e6 memory_manager: Improved implementation of read/write/copy block.
- Fixes graphical issues with Chocobo's Mystery Dungeon EVERY BUDDY!
- Fixes a crash with Mario Tennis Aces
2019-04-05 23:43:34 -04:00
bunnei 89b8801a97
Merge pull request #2350 from lioncash/vmem
video_core/memory_manager: Mark a few member functions with the const qualifier
2019-04-05 23:40:54 -04:00
bunnei 41890a84be
Merge pull request #2347 from lioncash/trunc
video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
2019-04-05 23:39:31 -04:00
bunnei 520e4e5d4b
Merge pull request #2327 from ReinUsesLisp/crash-safe-visit
gl_shader_decompiler: Return early when an operation is invalid
2019-04-05 23:36:18 -04:00
bunnei cb2209d06a
Merge pull request #2337 from lioncash/temporary
gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary()
2019-04-05 23:35:31 -04:00
Lioncash 00e7190e29 video_core/macro_interpreter: Remove assertion within FetchParameter()
We can just use .at(), which essentially does the same thing, but with
less code.
2019-04-05 22:56:58 -04:00
Lioncash 1efdb4897e video_core/macro_interpreter: Simplify GetRegister()
Given we already ensure nothing can set the zeroth register in
SetRegister(), we don't need to check if the index is zero and special
case it. We can just access the register normally, since it's already
going to be zero.

We can also replace the assertion with .at() to perform the equivalent
behavior inline as part of the API.
2019-04-05 22:55:13 -04:00
Lioncash c13fbe6a41 video_core/memory_manager: Make Read() a const qualified member function
Given this doesn't actually alter internal state, this can be made a
const member function.
2019-04-05 20:30:48 -04:00
Lioncash 76ef6e5c2b video_core/memory_manager: Make ReadBlock() a const qualifier member function
Now, since we have a const qualified variant of GetPointer(), we can put
it to use in ReadBlock() to retrieve the source pointer that is passed
into memcpy.

Now block reading may be done from a const context.
2019-04-05 20:28:44 -04:00
Lioncash 34510bcda8 video_core/memory_manager: Add a const qualified variant of GetPointer()
Allows retrieving read-only pointers from a const context externally.
2019-04-05 20:25:28 -04:00
Lioncash 085b388a7a video_core/memory_manager: Make FindFreeRegion() a const member function
This doesn't modify internal state, so it can be made a const member
function.
2019-04-05 20:22:55 -04:00
Lioncash 9dec087fca video_core/memory_manager: Make GpuToCpuAddress() a const member function
This doesn't modify any internal state, so it can be made a const member
function to allow its use in const contexts.
2019-04-05 20:18:29 -04:00
Fernando Sahmkow fc91e21206 Implement SyncPoint Register in the GPU. 2019-04-05 19:19:30 -04:00
Lioncash 30ce9b2b5c video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
Since c5d41fd812 callback parameters were
changed to use an s64 to represent late cycles instead of an int, so
this was causing a truncation warning to occur here. Changing it to s64
is sufficient to silence the warning.
2019-04-05 18:37:37 -04:00
Lioncash 22f02076c6 video_core/engines: Make memory manager members private
These aren't used externally by anything, so they can be made private
data members.
2019-04-05 18:26:43 -04:00
Lioncash 26223f8124 video_core/engines: Remove unnecessary inclusions where applicable
Replaces header inclusions with forward declarations where applicable
and also removes unused headers within the cpp file. This reduces a few
more dependencies on core/memory.h
2019-04-05 18:26:32 -04:00
ReinUsesLisp 34c3e2c786 renderer_opengl/utils: Skip empty binds 2019-04-05 19:19:49 -03:00
ReinUsesLisp b631c09e72 gl_rasterizer: Use ARB_multi_bind to update SSBOs 2019-04-05 19:18:43 -03:00
ReinUsesLisp 2d1f054c61 gl_rasterizer: Use ARB_multi_bind to update UBOs across stages 2019-04-05 19:10:46 -03:00
bunnei 66be5150d6
Merge pull request #2282 from bunnei/gpu-asynch-v2
gpu_thread: Improve synchronization by using CoreTiming.
2019-04-04 22:38:04 -04:00