bunnei
a98f14e9b0
Merge pull request #6722 from ReinUsesLisp/xmad-opts
...
shader: Fold integer FMA from Nvidia's pattern
2021-07-29 18:45:37 -07:00
ReinUsesLisp
8c9febe8f7
shader: Fold UnpackFloat2x16 and PackFloat2x16
...
Simplifies the code a bit when possible. These instructions should be
no-ops codegen wise.
2021-07-29 21:22:52 -03:00
ReinUsesLisp
1bb46b7d64
shader: Mark ConvertF16F32 and ConvertF32F16 as fp16 instructions
...
Fixes instances where fp16 types are not declared on SPIR-V but they are
used. This shouldn't happen on master, as it's been uncovered by an
additional optimization pass.
2021-07-27 21:33:05 -03:00
Lioncash
c27ddb44de
exception: Make constructors explicit
...
Ensures that exception construction is always explicit.
2021-07-27 04:15:14 -04:00
Lioncash
e490ddf327
exception: Make what() member function nodiscard
2021-07-27 04:14:32 -04:00
Lioncash
90f3678ada
exception: Narrow down specific header
...
We can use the <exception> header instead of pulling in all of the
exception-style classes.
2021-07-27 04:09:18 -04:00
Rodrigo Locatti
c0f99558fb
Merge pull request #6724 from lioncash/nodisc-shader
...
shader_recompiler: Remove unnecessary [[nodiscard]] instances
2021-07-26 16:35:21 -03:00
Rodrigo Locatti
de0b89792c
Merge pull request #6726 from lioncash/hguard
...
emit_spirv_instructions: Add missing header guard
2021-07-26 16:35:11 -03:00
Rodrigo Locatti
3d97f1e6cf
Merge pull request #6727 from lioncash/topology
...
emit_glasm: Fix LINESS_ADJACENCY typo in InputPrimitive()
2021-07-26 16:35:03 -03:00
Rodrigo Locatti
b2b3fcdccd
Merge pull request #6723 from lioncash/shader
...
object_pool: Add missing return in Chunk move assignment operator
2021-07-26 06:01:21 -03:00
Lioncash
3e7813e49d
emit_glasm: Fix LINESS_ADJACENCY typo in InputPrimitive()
...
This should be LINES_ADJACENCY
2021-07-26 04:44:56 -04:00
Lioncash
c2915d9f2f
emit_spirv_instructions: Add missing header guard
2021-07-26 04:28:35 -04:00
Lioncash
06ca911621
shader_recompiler: Remove unnecessary [[nodiscard]] instances
...
[[nodiscard]] doesn't do anything on functions with a void return type
and causes superfluous warnings.
2021-07-26 04:23:59 -04:00
Lioncash
0b67df1f7c
control_flow: Fix duplicate switch case in OpcodeToken
...
This previously duplicated the case of the PBK case above it.
2021-07-26 04:16:34 -04:00
Lioncash
89ad9df0e9
object_pool: Add missing return in Chunk move assignment operator
...
Prevents undefined behavior from occurring.
2021-07-26 04:01:05 -04:00
ReinUsesLisp
66a0cedba3
shader: Fold integer FMA from Nvidia's pattern
...
Fold shaders doing "a * b + c" on integers from the pattern generated by
Nvidia's GL compiler.
On a somewhat complex compute shader it reduces the code size by 16
instructions from 2 matches on Turing GPUs.
On Intel as extracted from KHR_pipeline_executable_properties:
Before the optimization:
```
Instruction Count: 2057
Basic Block Count: 45
Scratch Memory Size: 14752
Spill Count: 232
Fill Count: 261
SEND Count: 610
Cycle Count: 11325
```
After the optimization:
```
Instruction Count: 2046
Basic Block Count: 44
Scratch Memory Size: 13728
Spill Count: 219
Fill Count: 268
SEND Count: 604
Cycle Count: 11367
```
2021-07-26 04:58:02 -03:00
ReinUsesLisp
09fb41dc63
shader: Use TryInstRecursive on XMAD multiply folding
...
Simplify a bit the logic.
2021-07-26 04:15:27 -03:00
ReinUsesLisp
f6f0383b49
shader: Add TryInstRecursive utility to values
2021-07-26 01:31:05 -03:00
ReinUsesLisp
7f13104c17
shader: Support out of bound local memory reads and immediate writes
...
Support ignoring immediate out of bound writes. Writing dynamically out
of bounds is not yet supported (e.g. R0+0x4).
Reading out of bounds yields zero. This is supported checking for the
size from the IR; if the input is immediate, the optimization passes
will drop it.
2021-07-22 21:51:41 -04:00
ameerj
56478bc9ac
shader: Fix disabled attribute default values
2021-07-22 21:51:40 -04:00
ameerj
56c30dd9e0
glsl: Simplify FCMP emission
2021-07-22 21:51:40 -04:00
ameerj
79d2684261
glsl: Update TessellationControl gl_in
...
Adheres to GL_ARB_separate_shader_objects requirements
2021-07-22 21:51:40 -04:00
ameerj
fc7bed21b5
shader: Implement ISETP.X
2021-07-22 21:51:40 -04:00
ReinUsesLisp
bf2956d77a
shader: Avoid usage of C++20 ranges to build in clang
2021-07-22 21:51:40 -04:00
ameerj
94af0a00f6
glsl: Clamp shared mem size to GL_MAX_COMPUTE_SHARED_MEMORY_SIZE
2021-07-22 21:51:40 -04:00
lat9nq
49946cf780
shader_recompiler, video_core: Resolve clang errors
...
Silences the following warnings-turned-errors:
-Wsign-conversion
-Wunused-private-field
-Wbraced-scalar-init
-Wunused-variable
And some other errors
2021-07-22 21:51:40 -04:00
ReinUsesLisp
2235a51b5d
shader: Manually convert from array<u32> to bitset instead of using bit_cast
2021-07-22 21:51:40 -04:00
ameerj
41c6cb70f9
glsl: Fix tracking of info.uses_shadow_lod
2021-07-22 21:51:40 -04:00
ameerj
11f04f1022
shader: Ignore global memory ops on devices lacking int64 support
2021-07-22 21:51:40 -04:00
ameerj
57f222c56e
dual_vertex_pass: Clang format
2021-07-22 21:51:40 -04:00
ReinUsesLisp
8722668b3c
emit_spirv: Workaround VK_KHR_shader_float_controls on fp16 Nvidia
...
Fix regression on Fire Emblem: Three Houses when using native fp16.
2021-07-22 21:51:40 -04:00
lat9nq
2e5af95541
shader: GCC fmt 8.0.0 fixes
2021-07-22 21:51:40 -04:00
ameerj
b9069c7891
shader: Account for 33-bit IADD3 scenario
2021-07-22 21:51:40 -04:00
ReinUsesLisp
b21bf79bd2
shader: Only apply shift on register mode for IADD3
2021-07-22 21:51:39 -04:00
ReinUsesLisp
5643a909bc
shader: Fix disabled and unwritten attributes and varyings
2021-07-22 21:51:39 -04:00
ameerj
65daec8b75
glsl: Fix shared and local memory declarations
...
account for the fact that program.*memory_size is in units of bytes.
2021-07-22 21:51:39 -04:00
ameerj
8289eb108f
opengl: Implement LOP.CC
...
Used by MH:Rise
2021-07-22 21:51:39 -04:00
ReinUsesLisp
5b2b0634a1
spirv: Fix code emission when descriptor aliasing is unsupported
...
Fixes OpenGL.
2021-07-22 21:51:39 -04:00
ameerj
00fa09dc45
glsl: Declare local memory in main
2021-07-22 21:51:39 -04:00
ameerj
f7352411f0
glsl: Add passthrough geometry shader support
2021-07-22 21:51:39 -04:00
ReinUsesLisp
8612b5fec5
shader: Use std::bit_cast instead of Common::BitCast for passthrough
2021-07-22 21:51:39 -04:00
ReinUsesLisp
8a3427a4c8
glasm: Add passthrough geometry shader support
2021-07-22 21:51:39 -04:00
ReinUsesLisp
7dafa96ab5
shader: Rework varyings and implement passthrough geometry shaders
...
Put all varyings into a single std::bitset with helpers to access it.
Implement passthrough geometry shaders using host's.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
ecd6b4356b
shader: Only verify shader when graphics debugging is enabled
2021-07-22 21:51:39 -04:00
ReinUsesLisp
395bed3a0a
shader: Unify shader stage types
2021-07-22 21:51:39 -04:00
lat9nq
257d2aab74
lower_int64_to_int32: Add missing include
2021-07-22 21:51:39 -04:00
ReinUsesLisp
fb166b5ff4
shader: Emulate 64-bit integers when not supported
...
Useful for mobile and Intel Xe devices.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
d8d5501459
shader: Add int64 to int32 lowering pass
2021-07-22 21:51:39 -04:00
ReinUsesLisp
04ef2160f9
shader: Teach global memory base tracker to follow vectors
2021-07-22 21:51:39 -04:00
ReinUsesLisp
97e80dda55
shader: Add constant propagation to integer vectors
2021-07-22 21:51:39 -04:00
ameerj
27ca8a0e13
glsl: Better IAdd Overflow CC fix
...
This ensures the original operand values are not overwritten when being used in the overflow detection.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
4397053d5c
shader: Remove IAbs64
2021-07-22 21:51:39 -04:00
ameerj
bc6e399ae3
glsl: Fix IADD CC
2021-07-22 21:51:39 -04:00
ameerj
a7536825df
shader_recompiler: Fix IADD3 input partitioning
2021-07-22 21:51:39 -04:00
ReinUsesLisp
808ef97a08
shader: Move loop safety tests to code emission
2021-07-22 21:51:39 -04:00
ameerj
cbce9ddd4a
glsl: Remove frag color initialization
2021-07-22 21:51:39 -04:00
ameerj
3a2dd1b483
glasm: Implement SetAttribute ViewportMask
2021-07-22 21:51:39 -04:00
ameerj
1c648f176c
emit_glsl_special: Skip initialization of frag_color0
...
Fixes rendering in Devil May Cry without regressing Ori and the Blind Forest.
2021-07-22 21:51:38 -04:00
ReinUsesLisp
1d182fc0f5
shader: Calibrate loop safety threshold
2021-07-22 21:51:38 -04:00
Morph
cfbc85839d
glsl: Add missing ; in EmitSetSampleMask
...
Fixes shader compilation in Okami HD
2021-07-22 21:51:38 -04:00
ameerj
9e066dcb15
glsl: Fix output varying initialization when transform feedback is used
2021-07-22 21:51:38 -04:00
ameerj
a0365217f5
texture_pass: Fix is_read image qualification
...
Atomic operations are considered to have both read and write access. This was not being accounted for.
2021-07-22 21:51:38 -04:00
ReinUsesLisp
0cd08b3e72
shader: Align constant buffer sizes to 16 bytes
...
WAR for AMD reading zeroes on uniform buffers of size 2.
2021-07-22 21:51:38 -04:00
ReinUsesLisp
59fead3a47
spirv: Properly handle devices without int8 and int16
2021-07-22 21:51:38 -04:00
ReinUsesLisp
b5e78607ad
spirv: Handle small storage buffer loads on devices with no support
2021-07-22 21:51:38 -04:00
ameerj
ccbd24fe00
glsl: Fix cbuf component indexing bug falback
2021-07-22 21:51:38 -04:00
ReinUsesLisp
1091995f8e
shader: Simplify MergeDualVertexPrograms
2021-07-22 21:51:38 -04:00
ReinUsesLisp
374eeda1a3
shader: Properly manage attributes not written from previous stages
2021-07-22 21:51:38 -04:00
ReinUsesLisp
892b8aa2ad
glsl: Only declare fragment outputs on fragment shaders
2021-07-22 21:51:38 -04:00
ReinUsesLisp
0ffea97e2e
shader: Split profile and runtime info headers
2021-07-22 21:51:38 -04:00
ReinUsesLisp
cbbca26d18
shader: Add support for native 16-bit floats
2021-07-22 21:51:38 -04:00
ReinUsesLisp
376aa94819
shader: Rename maxwell/program.h to translate_program.h
2021-07-22 21:51:38 -04:00
ameerj
12ef06ba8b
glsl: Obey need_declared_frag_colors to declare and initialize all frag_color
...
Fixes Ori and the blind forest title screen
2021-07-22 21:51:38 -04:00
ameerj
d36f667bc0
glsl: Address rest of feedback
2021-07-22 21:51:38 -04:00
ameerj
c5dfa0b630
glsl: Move gl_Position/generic attribute initialization to EmitProlgue
2021-07-22 21:51:38 -04:00
ameerj
3b339fbbf6
glsl: Conditionally use fine/coarse derivatives based on device support
2021-07-22 21:51:38 -04:00
ameerj
6eea88d614
glsl: Cleanup/Address feedback
2021-07-22 21:51:38 -04:00
ameerj
ae4e452759
glsl: Add Shader_GLSL logging
2021-07-22 21:51:38 -04:00
ameerj
6c6a451d6a
glsl: Add LoopSafety instructions
2021-07-22 21:51:38 -04:00
ameerj
a0d0704aff
glsl: Conditionally add EXT_texture_shadow_lod
2021-07-22 21:51:38 -04:00
ameerj
5e7b2b9661
glsl: Add stubs for sparse queries and variable aoffi when not supported
2021-07-22 21:51:38 -04:00
ameerj
6aa1bf7b6f
glsl: Implement legacy varyings
2021-07-22 21:51:38 -04:00
ameerj
39c29664f9
glsl: Minor cleanup
2021-07-22 21:51:38 -04:00
ameerj
427a2596a1
glsl: Fix Cbuf getters for F32 type
2021-07-22 21:51:38 -04:00
ameerj
7c82f20b52
glsl: Add immediate index oob checking for Cbuf getters
2021-07-22 21:51:38 -04:00
ameerj
84c86e03cd
glsl: Refactor GetCbuf functions to reduce code duplication
2021-07-22 21:51:38 -04:00
ameerj
e81c73a874
glsl: Address more feedback. Implement indexed texture reads
2021-07-22 21:51:38 -04:00
ameerj
7d89a82a48
glsl: Remove Signed Integer variables
2021-07-22 21:51:38 -04:00
ameerj
4759db28d0
glsl: Address Rodrigo's feedback
2021-07-22 21:51:38 -04:00
ameerj
85399e119d
glsl: Reorganize backend code, remove unneeded [[maybe_unused]]
2021-07-22 21:51:37 -04:00
ameerj
e7c8f8911f
glsl: Implement SampleId and SetSampleMask
...
plus some minor refactoring of implementations
2021-07-22 21:51:37 -04:00
ameerj
d1a68f7997
glsl: Add gl_PerVertex in for GS
2021-07-22 21:51:37 -04:00
ameerj
a926695234
glsl: Use existing tracking for enabling EXT_shader_image_load_formatted
2021-07-22 21:51:37 -04:00
ameerj
14bd73db36
glsl: Enable early fragment tests
2021-07-22 21:51:37 -04:00
ameerj
3f31a547e0
glsl: Implement more attribute getters and setters
2021-07-22 21:51:37 -04:00
ameerj
8bb8bbf4ae
glsl: Implement fswzadd
...
and wip nv thread shuffle impl
2021-07-22 21:51:37 -04:00
ameerj
c542204113
glsl: Implement indexed attribute loads
2021-07-22 21:51:37 -04:00
ameerj
2a504b4765
glsl: Conditionally add GL_ARB_sparse_texture2
2021-07-22 21:51:37 -04:00
ameerj
fc0db612ab
glsl: Conditionally use GL_EXT_shader_image_load_formatted
...
Fix for SULD.D
2021-07-22 21:51:37 -04:00
ameerj
fb839061fb
glsl: Remove output generic indexing for geometry stage
2021-07-22 21:51:37 -04:00
ameerj
258106038e
glsl: Allow dynamic tracking of variable allocation
2021-07-22 21:51:37 -04:00
ameerj
465903468e
glsl: Implement barriers
2021-07-22 21:51:37 -04:00
ameerj
421847cf1e
glsl: Implement image atomics and set layer
...
along with some more cleanup/oversight fixes
2021-07-22 21:51:37 -04:00
ameerj
d41aef03c7
glsl: Fix image gather logic
2021-07-22 21:51:37 -04:00
ameerj
35e78d558d
glsl: Add cbuf access workaround for devices with component indexing bug
2021-07-22 21:51:37 -04:00
ameerj
747b8556a4
glsl: Use textureGrad fallback when EXT_texture_shadow_lod is unsupported
2021-07-22 21:51:37 -04:00
ameerj
d12f2b8ccf
emit_glsl_image: Use immediate offsets when possible
2021-07-22 21:51:37 -04:00
ameerj
0a0b0a73d8
glsl: Fix <32-bit SSBO writes
...
and more cleanup
2021-07-22 21:51:37 -04:00
ameerj
34fdb6471d
glsl: Cleanup and address feedback
2021-07-22 21:51:37 -04:00
ameerj
5355568a2d
glsl: Refactor Global memory functions
2021-07-22 21:51:37 -04:00
ameerj
a68fabf6d5
glsl: Increase NUM_VARS that can be allocated
...
needed for HW:AoC.
2021-07-22 21:51:37 -04:00
ameerj
8d8ce24f20
glsl: Implement Load/WriteGlobal
...
along with some other misc changes and fixes
2021-07-22 21:51:37 -04:00
ameerj
af9696059c
glsl: Implement Images
2021-07-22 21:51:37 -04:00
ameerj
6577a63d36
glsl: skip gl_ViewportIndex write if device does not support it
2021-07-22 21:51:37 -04:00
ameerj
f4799e8fa1
glsl: Implement transform feedback
2021-07-22 21:51:37 -04:00
ameerj
31147ffe69
glsl: Yet another gl_ViewportIndex fix attempt
2021-07-22 21:51:37 -04:00
ameerj
9f3970f837
glsl: Add gl_ViewportIndex out attribute
2021-07-22 21:51:37 -04:00
lat9nq
fc29de7d5b
emit_glsl_context_get_set: Remove unused function
2021-07-22 21:51:37 -04:00
ameerj
59576b82a8
glsl: Fix precise variable declaration
...
and add some more separation in the shader for better debugability when dumped
2021-07-22 21:51:37 -04:00
ameerj
8c684b3e23
glsl: Implement tessellation shaders
2021-07-22 21:51:37 -04:00
ameerj
c7d085b505
glsl: Implement ImageGradient and other texture function variants
2021-07-22 21:51:37 -04:00
ameerj
68d075d1e8
glsl: Fix atomic SSBO offsets
...
and implement misc getters
2021-07-22 21:51:37 -04:00
ameerj
19247ba4fa
glsl: Implement geometry shaders
2021-07-22 21:51:37 -04:00
ameerj
df53046d68
glsl: Use NotImplemented macro with function name output
2021-07-22 21:51:37 -04:00
ameerj
3a024b3026
glsl: Implement gl_ViewportIndex
...
SSBU now working
2021-07-22 21:51:37 -04:00
ameerj
b7561226ed
glsl: SHFL fix and prefer shift operations over divide in glsl shader
2021-07-22 21:51:37 -04:00
ameerj
e10366974e
glsl: Implement precise fp variable allocation
2021-07-22 21:51:37 -04:00
ameerj
14bfb4719a
HACK glsl: Write defaults to unused generic attributes
2021-07-22 21:51:37 -04:00
ameerj
4b5a4ea72e
glsl: Fix ssbo indexing and name shadowing between shader stages
2021-07-22 21:51:37 -04:00
ameerj
8ec0028e68
glsl: implement set clip distance
...
and missed a diff in emit_glsl relating to var alloc ref counting
2021-07-22 21:51:37 -04:00
ameerj
9f3ffb996b
glsl: Rework var alloc to not assign unused results
2021-07-22 21:51:37 -04:00
ameerj
1269a0cf8b
glsl: Rework variable allocator to allow for variable reuse
2021-07-22 21:51:37 -04:00
ameerj
9ccbd74991
glsl: Fix ATOM and implement ATOMS
2021-07-22 21:51:37 -04:00
ameerj
68ef3803bf
glsl: Use gl_SubGroupInvocationARB
2021-07-22 21:51:36 -04:00
ameerj
e35ffbbeb0
glsl: Implement VOTE for subgroup size potentially larger
2021-07-22 21:51:36 -04:00
ameerj
770b754afd
glsl: Implement VOTE
2021-07-22 21:51:36 -04:00
ameerj
181a4ffdc4
glsl: Implement ST{LS}
2021-07-22 21:51:36 -04:00
ameerj
57d354b02c
glsl: Implement more instructions used by SMO
2021-07-22 21:51:36 -04:00
ameerj
7df0815117
glsl: Implement more instructions used by SMO
2021-07-22 21:51:36 -04:00
ameerj
80eec85867
glsl: Fix GetAttribute return values
...
fixes font rendering issues as these were used to index into the ssbos
2021-07-22 21:51:36 -04:00
ameerj
1542f31e79
glsl: minor cleanup
2021-07-22 21:51:36 -04:00
ameerj
005eecffcd
glsl: Fix and implement rest of cbuf access
2021-07-22 21:51:36 -04:00
ameerj
3047eb6688
glsl: Implement TXQ and other misc changes
2021-07-22 21:51:36 -04:00
ameerj
5fd92780b2
glsl: TLD4 implementation
2021-07-22 21:51:36 -04:00
ameerj
697eacd095
glsl: Implement TLD instruction
2021-07-22 21:51:36 -04:00
ameerj
e4ba755705
glsl: Implement TEXS
2021-07-22 21:51:36 -04:00
ameerj
59a692e9ed
glsl: Cleanup texture functions
2021-07-22 21:51:36 -04:00
lat9nq
c9a25855bc
shader_recompiler: GCC fixes
2021-07-22 21:51:36 -04:00
ameerj
7619b7d427
glsl: Implement TEX depth functions
2021-07-22 21:51:36 -04:00
ameerj
55e0211a5e
glsl: Implement TEX ImageSample functions
2021-07-22 21:51:36 -04:00