Commit graph

337 commits

Author SHA1 Message Date
bunnei 076add4ccd
Merge pull request #1326 from FearlessTobi/port-4182
Port #4182 from Citra: "Prefix all size_t with std::"
2018-09-17 09:51:47 -04:00
bunnei ba480ea2fb
Merge pull request #1273 from Subv/ld_sizes
Shaders: Implemented multiple-word loads and stores to and from attribute memory.
2018-09-15 15:27:12 -04:00
bunnei daee15b058
Merge pull request #1271 from Subv/kepler_engine
GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
2018-09-15 13:27:07 -04:00
Subv c878a819d7 Shaders: Implemented multiple-word loads and stores to and from attribute memory.
This seems to be an optimization performed by nouveau.
2018-09-15 11:21:21 -05:00
fearlessTobi 63c2e32e20 Port #4182 from Citra: "Prefix all size_t with std::" 2018-09-15 15:21:06 +02:00
bunnei cc50857460
Merge pull request #1263 from FernandoS27/tex-mode
shader_decompiler:  Implemented (Partially) Texture Processing Modes
2018-09-12 16:03:34 -04:00
Subv bb5eb4f20a GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
This engine writes data from a FIFO register into the configured address.
2018-09-12 13:57:08 -05:00
FernandoS27 a99d9db32f Implemented Texture Processing Modes 2018-09-12 12:28:22 -04:00
FernandoS27 3f0922715a Implemented encodings for LEA and PSET 2018-09-11 12:50:25 -04:00
FernandoS27 2b48cfd44b Replace old FragmentHeader for the new Header 2018-09-11 12:48:19 -04:00
FernandoS27 e926757c8f Implemented (Partialy) Shader Header 2018-09-11 12:34:27 -04:00
Markus Wick c560043581 rasterizer: Drop unused handler.
This virtual function is called in a very hot spot, and it does nothing.

If this kind of feature is required, please be more specific and add callbacks
in the switch statement within Maxwell3D::WriteReg. There is no point in having
another switch statement within the rasterizer.
2018-09-10 22:03:10 +02:00
bunnei 49b15af054 gl_rasterizer: Implement multiple color attachments. 2018-09-09 22:48:28 -04:00
bunnei e58855c7a4
Merge pull request #1268 from FernandoS27/tmml
shader_decompiler: Implemented TMML
2018-09-09 21:39:39 -04:00
FernandoS27 00131e752d Implemented TMML 2018-09-09 20:46:31 -04:00
bunnei 223ddb2008
Merge pull request #1272 from Subv/dma_2d
GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
2018-09-09 19:53:17 -04:00
FernandoS27 073a21ac0b Implemented TXQ dimension query type, used by SMO. 2018-09-09 11:59:01 -04:00
FernandoS27 82a313a14c Change name of TEXQ to TXQ, in order to match NVIDIA's naming 2018-09-08 18:08:57 -04:00
Subv fdb199290b GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
When not set, this tells the GPU to only use the X size when performing a DMA copy.
This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert.

This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
2018-09-08 16:02:16 -05:00
bunnei fdd5c97a14 maxwell_3d: Remove assert that no longer applies. 2018-09-08 02:53:39 -04:00
bunnei 77554ac773
Merge pull request #1243 from degasus/VAO_cache
gl_rasterizer: Implement a VAO cache.
2018-09-05 22:50:52 -04:00
FernandoS27 e63b229f4a Implemented IPA Properly 2018-09-05 20:15:47 -04:00
Markus Wick d3ad9469a1 gl_rasterizer: Implement a VAO cache.
This patch caches VAO objects instead of re-emiting all pointers per draw call.
Configuring this pointers is known as a fast task, but it yields too many GL
calls. So for better performance, just bind the VAO instead of 16 pointers.
2018-09-05 18:46:35 +02:00
bunnei 325f3e0693
Merge pull request #1213 from DarkLordZach/octopath-fs
filesystem/maxwell_3d: Various changes to boot Project Octopath Traveller
2018-09-02 10:49:18 -04:00
bunnei 89be49d2f3
Merge pull request #1215 from ogniK5377/texs-nodep-assert
Added assert for TEXS nodep
2018-09-02 10:48:27 -04:00
bunnei 177c45e97d
Merge pull request #1214 from ogniK5377/ipa-assert
Added better asserts to IPA, Renamed IPA modes to match mesa
2018-09-02 10:44:43 -04:00
bunnei 9c206fe94d
Merge pull request #1216 from ogniK5377/ffma-assert
Added FFMA asserts and missing fields
2018-09-02 10:44:13 -04:00
David Marcec 60754b4728 Removed saturate assert
Unneeded as we already implement it
2018-09-01 19:33:32 +10:00
David Marcec 2edab4e840 Removed saturate assert
Saturate already implemented
2018-09-01 19:29:20 +10:00
David Marcec 6f8ed9508d Added FMUL asserts 2018-09-01 19:05:10 +10:00
David Marcec b89fc407d7 Added FFMA asserts 2018-09-01 18:45:14 +10:00
David Marcec 948bc87a59 Added assert for TEXS nodep 2018-09-01 17:00:01 +10:00
David Marcec ad3dca7e62 Added better asserts to IPA, Renamed IPA modes to match mesa
IpaMode is changed to IpaInterpMode
IpaMode is suppose to be 2 bits not 3
Added IpaSampleMode
Added Saturate

Renamed modes based on
d27c791891/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp (L2530)
2018-09-01 16:34:27 +10:00
Zach Hilman f32e28c7b8 maxwell_3d: Use CoreTiming for query timestamp 2018-08-31 23:25:18 -04:00
Lioncash 4a587b81b2 core/core: Replace includes with forward declarations where applicable
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.

This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.

This should make turnaround for changes much faster for developers.
2018-08-31 16:30:14 -04:00
Hexagon12 d626bc8c62 Added predicate comparison GreaterEqualWithNan 2018-08-31 10:40:18 +03:00
Laku 915ab81ec2 gl_shader_decompiler: Implement POPC (#1203)
* Implement POPC

* implement invert
2018-08-30 21:32:58 -04:00
bunnei d6accf96ff
Merge pull request #1200 from bunnei/improve-ipa
gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.
2018-08-30 10:31:26 -04:00
tech4me a6dd577d02 Shaders: Implemented IADD3 2018-08-29 13:44:41 -04:00
bunnei b1ccd88434 gl_shader_decompiler: Improve IPA for Pass mode with Position attribute. 2018-08-29 00:37:29 -04:00
bunnei 2f5ed3877c
Merge pull request #1169 from Lakumakkara/sel
shader_bytecode: fix SEL_IMM bitstring
2018-08-27 18:24:57 -04:00
bunnei be2f1eabd7
Merge pull request #1173 from lioncash/batch
maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
2018-08-25 10:59:54 -04:00
Lioncash 20800f2df7 maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
The start and finish events should likely not be right after one another
like this, otherwise the batch will appear to complete immediately
2018-08-24 19:58:05 -04:00
Laku 36093a3e4d fix SEL_IMM bitstring 2018-08-24 07:18:12 +03:00
tech4me ba2972bc64 Shaders: Added decodings for IADD3 instructions 2018-08-23 15:46:59 -04:00
bunnei 2a472ff54d maxwell_3d: Update to include additional stencil registers. 2018-08-23 11:08:47 -04:00
Laku 8e8326595f implement lop3 2018-08-22 10:09:44 +03:00
bunnei 125d7122ac
Merge pull request #1124 from Subv/logic_ops
GPU: Implemented logic ops.
2018-08-22 01:05:25 -04:00
Lioncash a0e2bd85a5 shader_bytecode: Parenthesize conditional expression within GetTextureType()
Resolves a -Wlogical-op-parentheses warning.
2018-08-21 15:08:35 -04:00
bunnei 2ae88feea7 shader_bytecode: Replace some UNIMPLEMENTED logs. 2018-08-20 21:53:49 -04:00
Subv 6bcdf37d4f GPU: Added registers for the logicop functionality. 2018-08-20 18:42:36 -05:00
bunnei 028d90eb79
Merge pull request #1104 from Subv/instanced_arrays
GLRasterizer: Implemented instanced vertex arrays.
2018-08-20 14:32:50 -04:00
bunnei b20ed93884
Merge pull request #1112 from Subv/sampler_types
Shaders: Use the correct shader type when sampling textures.
2018-08-20 14:30:45 -04:00
bunnei 51ddb130c5
Merge pull request #1089 from Subv/neg_bits
Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
2018-08-19 17:01:48 -04:00
Subv f7edbcd7a3 Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 14:00:12 -05:00
Subv 73b937b190 Shader: Added bitfields for the texture type of the various sampling instructions. 2018-08-19 12:57:51 -05:00
Subv 656758fd81 Shaders: Added decodings for TLD4 and TLD4S 2018-08-19 12:57:08 -05:00
bunnei 29d4f8c2dd
Merge pull request #1109 from Subv/ldg_decode
Shaders: Added decodings for  the LDG and STG instructions.
2018-08-19 13:31:19 -04:00
bunnei 9baf5de90c
Merge pull request #1108 from Subv/front_facing
Shaders: Implemented the gl_FrontFacing input attribute (attr 63).
2018-08-19 13:21:14 -04:00
Subv 1b92ae136f Shaders: Added decodings for the LDG and STG instructions. 2018-08-19 00:46:34 -05:00
Subv 731701a2d2 Shaders: Implemented the gl_FrontFacing input attribute (attr 63). 2018-08-19 00:14:34 -05:00
Subv e0f66c1fbf GLRasterizer: Implemented instanced vertex arrays.
Before each draw call, for every enabled vertex array configured as instanced, we take the current instance id and divide it by its configured divisor, then we multiply that by the corresponding stride and increment the start address by the resulting amount. This way we can simulate the vertex array being incremented once per instance without actually using OpenGL's instancing functions.
2018-08-18 14:42:26 -05:00
Subv 8335b2f115 Shader: Implemented the predicate and mode arguments of LOP.
The mode can be used to set the predicate to true depending on the result of the logic operation. In some cases, this means discarding the result (writing it to register 0xFF (Zero)).

This is used by Super Mario Odyssey.
2018-08-18 14:36:37 -05:00
Subv 2e95ba2e9c Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
We should definitely audit our shader generator for more errors like this.
2018-08-18 10:22:42 -05:00
David Marcec 63dff47e22 Added predcondition GreaterThanWithNan 2018-08-18 17:49:59 +10:00
Subv c5284efd4f Rasterizer: Implemented instanced rendering.
We keep track of the current instance and update an uniform in the shaders to let them know which instance they are.

Instanced vertex arrays are not yet implemented.
2018-08-14 22:25:07 -05:00
bunnei 534abf9d97 gl_shader_decompiler: Implement XMAD instruction. 2018-08-12 18:30:24 -04:00
bunnei f2c7b5dcd6
Merge pull request #1024 from Subv/blend_gl
GPU/Maxwell3D: Implemented an alternative set of blend factors.
2018-08-11 22:39:02 -04:00
Subv 969326bd58 GPU/Maxwell3D: Implemented an alternative set of blend factors.
These are used by nouveau and some games like SMO.
2018-08-11 20:57:16 -05:00
Subv 2dad1204e8 RasterizerGL: Ignore invalid/unset vertex attributes.
This should make the es2gears example not crash anymore.
2018-08-11 20:36:40 -05:00
bunnei 403dfd68fc
Merge pull request #1010 from bunnei/unk-vert-attrib-shader
gl_shader_decompiler: Improve handling of unknown input/output attributes.
2018-08-11 19:56:28 -04:00
bunnei 0b668d5ff3 gl_shader_decompiler: Improve handling of unknown input/output attributes. 2018-08-11 19:26:45 -04:00
bunnei 670a2c1f80
Merge pull request #1018 from Subv/ssy_sync
GPU/Shader: Implemented SSY and SYNC as a set_target/jump pair.
2018-08-11 19:10:02 -04:00
Subv c1ad973881 GPU/Shader: Don't predicate instructions that don't have a predicate field (SSY). 2018-08-11 16:00:14 -05:00
Lioncash b8c43b6080 video_core: Use variable template variants of type_traits interfaces where applicable 2018-08-09 20:45:48 -04:00
bunnei efe6b473c5 maxwell_3d: Ignore macros that have not been uploaded yet.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:25:37 -04:00
bunnei 25ba4d1b68
Merge pull request #982 from bunnei/stub-unk-63
gl_shader_decompiler: Stub input attribute Unknown_63.
2018-08-08 22:28:18 -04:00
bunnei cf917a5e93
Merge pull request #976 from bunnei/shader-imm
gl_shader_decompiler: Let OpenGL interpret floats.
2018-08-08 19:17:01 -04:00
bunnei 7f0d0a93f7 gl_shader_decompiler: Stub input attribute Unknown_63. 2018-08-08 02:35:59 -04:00
bunnei 57982df105 maxwell_3d: Use correct const buffer size and check bounds.
- Fixes mem corruption with Super Mario Odyssey and Pokkén Tournament DX.
2018-08-08 02:10:25 -04:00
bunnei e542356d0c gl_shader_decompiler: Let OpenGL interpret floats.
- Accuracy is lost in translation to string, e.g. with NaN.
- Needed for Super Mario Odyssey.
2018-08-08 01:45:23 -04:00
bunnei 904d7eaa94 maxwell_3d: Remove outdated assert. 2018-08-05 23:57:19 -04:00
Lioncash 6030c5ce41 video_core: Eliminate the g_renderer global variable
We move the initialization of the renderer to the core class, while
keeping the creation of it and any other specifics in video_core. This
way we can ensure that the renderer is initialized and doesn't give
unfettered access to the renderer. This also makes dependencies on types
more explicit.

For example, the GPU class doesn't need to depend on the
existence of a renderer, it only needs to care about whether or not it
has a rasterizer, but since it was accessing the global variable, it was
also making the renderer a part of its dependency chain. By adjusting
the interface, we can get rid of this dependency.
2018-08-04 02:36:57 -04:00
Subv 8f2c4191ab GPU: Remove the assert that required the CODE_ADDRESS to be 0.
Games usually just leave it at 0 but nouveau sets it to something else. This already works fine, the assert is useless.
2018-07-24 13:54:12 -05:00
bunnei 148a5bef7e shader_bytecode: Implement other TEXS masks. 2018-07-22 03:23:15 -04:00
bunnei c43eaa94f3 gl_shader_decompiler: Implement SEL instruction. 2018-07-22 00:37:12 -04:00
bunnei 5287991a36 maxwell_3d: Add depth buffer enable, width, and height registers. 2018-07-21 21:51:05 -04:00
Lioncash bb960c8cb4 video_core: Use nested namespaces where applicable
Compresses a few namespace specifiers to be more compact.
2018-07-20 18:23:54 -04:00
Lioncash 8b08f82dc7 maxwell_3d: Remove unused variable within GetStageTextures() 2018-07-19 22:38:28 -04:00
Subv 3d3b10adc7 GPU: Added register definitions for the stencil parameters. 2018-07-17 15:00:21 -05:00
bunnei 8aeff9cf8e gl_rasterizer: Fix check for if a shader stage is enabled. 2018-07-12 22:57:57 -04:00
bunnei 64b5e5d5d9
Merge pull request #655 from bunnei/pred-lt-nan
gl_shader_decompiler: Implement PredCondition::LessThanWithNan.
2018-07-12 18:59:15 -07:00
bunnei 49c0c081c4 gl_shader_decompiler: Implement PredCondition::LessThanWithNan. 2018-07-12 20:04:35 -04:00
bunnei 4757ffdcce gl_shader_decompiler: Use FlowCondition field in EXIT instruction. 2018-07-12 20:00:37 -04:00
Sebastian Valle 274d1fb0fc
Merge pull request #652 from Subv/fadd32i
GPU: Implement the FADD32I shader instruction.
2018-07-12 17:36:51 -05:00
bunnei 3ff21345b4
Merge pull request #651 from Subv/ffma_decode
GPU: Corrected the decoding of FFMA for immediate operands.
2018-07-12 12:42:58 -07:00
Subv c1ae841f47 GPU: Implement the FADD32I shader instruction. 2018-07-12 12:00:31 -05:00
Subv 0cad310e12 GPU: Corrected the decoding of FFMA for immediate operands. 2018-07-12 10:15:48 -05:00
bunnei 639346bcfb
Merge pull request #625 from Subv/imnmx
GPU: Implemented the IMNMX shader instruction.
2018-07-07 19:33:50 -07:00
bunnei 51bd76a5fd
Merge pull request #629 from Subv/depth_test
GPU: Allow using the old NV04 values for the depth test function.
2018-07-05 16:43:10 -04:00