Commit graph

255 commits

Author SHA1 Message Date
Subv eab7457c00 GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.
Like the MOV32I and FMUL32I instructions.
This fixes a potential crash when using these instructions.
2018-06-18 19:50:35 -05:00
bunnei afdd657d30 gl_shader_decompiler: Implement LOP instructions. 2018-06-17 15:27:48 -04:00
bunnei 5673ce39c7 gl_shader_decompiler: Refactor LOP32I instruction a bit in support of LOP. 2018-06-17 13:31:39 -04:00
bunnei d383043e07 gl_shader_decompiler: Implement integer size conversions for I2I/I2F/F2I. 2018-06-15 22:42:02 -04:00
bunnei 019d7208c8
Merge pull request #556 from Subv/dma_engine
GPU: Partially implemented the Maxwell DMA engine.
2018-06-12 14:25:17 -04:00
bunnei 2015a1b180
Merge pull request #558 from Subv/iadd32i
GPU: Implemented the iadd32i shader instruction.
2018-06-12 14:19:25 -04:00
Subv db0497b808 GPU: Implemented the iadd32i shader instruction. 2018-06-12 11:46:45 -05:00
Subv 987a170665 GPU: Partially implemented the Maxwell DMA engine.
Only tiled->linear and linear->tiled copies without an offset are supported for now. Queries and swizzled copies are not supported.
2018-06-12 11:27:36 -05:00
bunnei 5f3d6c85db gl_shader_decompiler: Implement saturate for float instructions. 2018-06-11 21:46:34 -04:00
Subv b366b885a1 GPU: Implement the iset family of shader instructions. 2018-06-09 16:19:13 -05:00
Subv 3cb753eeb1 GPU: Added decodings for the ISET family of instructions. 2018-06-09 15:56:50 -05:00
bunnei d81aaa3ed3
Merge pull request #550 from Subv/ssy
GPU: Stub the SSY shader instruction.
2018-06-09 00:42:53 -04:00
bunnei e2176dc7ce
Merge pull request #551 from bunnei/shr
gl_shader_decompiler: Implement SHR instruction.
2018-06-09 00:42:44 -04:00
bunnei 5440b9c634 gl_shader_decompiler: Implement SHR instruction. 2018-06-09 00:01:17 -04:00
Subv abec5f82e2 GPU: Stub the SSY shader instruction.
This instruction tells the GPU where the flow reconverges in a non-uniform control flow scenario; we can ignore it when generating GLSL code.
2018-06-08 22:46:10 -05:00
bunnei bbc4f369ed gl_shader_decompiler: Implement IADD instruction. 2018-06-08 23:25:22 -04:00
bunnei 79e9c2e237 gl_shader_decompiler: Add missing asserts for saturate_a instructions. 2018-06-08 23:24:10 -04:00
Subv c712dafaee GPU: Added registers for normal and independent blending. 2018-06-08 17:04:41 -05:00
bunnei 92209f905f gl_shader_decompiler: Implement BFE_IMM instruction. 2018-06-07 00:58:12 -04:00
bunnei 128aeba0f3 gl_shader_decompiler: F2F: Implement rounding modes. 2018-06-06 22:21:29 -04:00
bunnei 4b114e1b8a shader_bytecode: Add instruction decodings for BFE, IMNMX, and XMAD. 2018-06-06 19:47:34 -04:00
bunnei 0ff2929644
Merge pull request #534 from Subv/multitexturing
GPU: Implement sampling multiple textures in the generated glsl shaders.
2018-06-06 19:12:52 -04:00
bunnei 4669f15f8b gl_shader_decompiler: Implement LD_C instruction. 2018-06-06 18:09:06 -04:00
bunnei 6e386a334b gl_shader_decompiler: Refactor uniform handling to allow different decodings. 2018-06-06 17:57:15 -04:00
Subv dbfc39d214 GPU: Implement sampling multiple textures in the generated glsl shaders.
All tested games that use a single texture show no regression.

Only Texture2D textures are supported right now. Each shader gets its own "tex_fs/vs/gs" sampler array to maintain independent textures between shader stages; the textures themselves are reused if possible.
2018-06-06 12:58:16 -05:00
bunnei 5fb99e6a16
Merge pull request #516 from Subv/f2i_r
GPU: Implemented the F2I_R shader instruction.
2018-06-05 22:01:29 -04:00
bunnei 38eb33f150
Merge pull request #521 from Subv/bra
GPU: Corrected the branch targets for the shader bra instruction.
2018-06-05 10:09:35 -04:00
Subv e7dfcdde74 GPU: Corrected the branch targets for the shader bra instruction. 2018-06-04 22:56:28 -05:00
Subv 4b89348c00 GPU: Implemented the F2I_R shader instruction. 2018-06-04 22:06:50 -05:00
bunnei c23c30c76f gl_shader_decompiler: Implement SHL instruction. 2018-06-04 22:36:49 -04:00
Subv 23b1e6eded GPU: Implement the ISCADD shader instructions. 2018-06-04 20:17:41 -05:00
Subv 438a9b70cc GPU: Added decodings for the ISCADD instructions. 2018-06-04 20:17:39 -05:00
bunnei e8bfff7b4b
Merge pull request #514 from Subv/lop32i
GPU: Implemented the LOP32I instruction.
2018-06-04 20:48:15 -04:00
bunnei f564822e78
Merge pull request #510 from Subv/isetp
GPU: Implemented the ISETP_R and ISETP_C instructions
2018-06-04 20:47:11 -04:00
bunnei 37fd4e6d9b
Merge pull request #512 from Subv/fset
GPU: Corrected the FSET and I2F instructions.
2018-06-04 19:04:20 -04:00
bunnei cdd92dc692
Merge pull request #501 from Subv/shader_bra
GPU: Partially implemented the bra shader instruction
2018-06-04 18:31:07 -04:00
Subv 2933521a08 GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f. 2018-06-04 16:41:28 -05:00
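The two result encodings named in the commit above can be sketched as follows. This is a minimal illustration, not the emulator's code: the helper name `FsetResultBits` is hypothetical, and it assumes that a set bf bit selects the float result (1.0f) and that a failed comparison writes 0 in both modes.

```cpp
#include <cstdint>

// Hypothetical helper: bits written to the destination register by an
// FSET-style comparison.
// Assumptions: bf == true selects the float encoding, and a failed
// comparison writes 0 regardless of bf.
inline std::uint32_t FsetResultBits(bool condition, bool bf) {
    const std::uint32_t one_float_bits = 0x3F800000u; // IEEE-754 pattern of 1.0f
    const std::uint32_t all_ones = 0xFFFFFFFFu;       // integer "true"
    if (!condition) {
        return 0u;
    }
    return bf ? one_float_bits : all_ones;
}
```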
Subv 5d55403f94 GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
This is how nouveau calculates the viewport width and height. For some reason, some games set 0xFFFF in the VIEWPORT_HORIZ and VIEWPORT_VERT registers; maybe those names are misnomers and the registers actually refer to something else?
2018-06-04 16:36:54 -05:00
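One way to derive a viewport rectangle from scale and translate values is sketched below. It assumes the conventional viewport transform (window = ndc * scale + translate), in which scale is half the extent and translate is the center; the commit does not spell out the formula, and `ViewportRect` and `RectFromScaleTranslate` are hypothetical names.

```cpp
#include <cmath>

// Hypothetical result type for the sketch.
struct ViewportRect {
    float x;
    float y;
    float width;
    float height;
};

// Assumption: window = ndc * scale + translate, so |scale| is half the
// extent and translate is the center of the viewport. The absolute value
// handles a negative scale, which is how a flipped viewport is expressed.
inline ViewportRect RectFromScaleTranslate(float scale_x, float scale_y,
                                           float translate_x, float translate_y) {
    const float half_w = std::fabs(scale_x);
    const float half_h = std::fabs(scale_y);
    return {translate_x - half_w, translate_y - half_h, 2.0f * half_w, 2.0f * half_h};
}
```

Under these assumptions, a scale of (640, -360) with a translate of (640, 360) describes a 1280x720 viewport anchored at the origin.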
Subv 0c688b421c GPU: Implemented the LOP32I instruction. 2018-06-04 13:56:31 -05:00
Subv 7c181fd4f4 GPU: Implemented the ISETP_R and ISETP_C shader instructions. 2018-06-04 11:12:03 -05:00
Subv b481d8a00d GPU: Partially implemented the shader BRA instruction. 2018-06-03 22:26:36 -05:00
Subv 06c72b4fcf GPU: Added decoding for the BRA instruction. 2018-06-03 22:14:00 -05:00
bunnei ba117854f9
Merge pull request #500 from Subv/long_queries
GPU: Partial implementation of long GPU queries.
2018-06-03 21:24:50 -04:00
Subv d57333406d GPU: Partial implementation of long GPU queries.
Long queries write a 128-bit result value to memory, which consists of a 64-bit query value and a 64-bit timestamp.

In this implementation, only select=Zero of the Crop unit is implemented. This writes the query sequence as a 64-bit value and a 0u64 value for the timestamp, since we emulate an infinitely fast GPU.

This specific type was hwtested, but more rigorous tests should be performed in the future for the other types.
2018-06-03 19:17:31 -05:00
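The 128-bit layout described in the commit above can be sketched like this. The struct and helper names are hypothetical, and the field order (value first, timestamp second) is an assumption taken from the order in which the message lists them.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical layout of a long query result.
// Assumption: the query value precedes the timestamp in memory.
struct LongQueryResult {
    std::uint64_t value;     // query value (here: the query sequence)
    std::uint64_t timestamp; // always 0 in this sketch ("infinitely fast GPU")
};
static_assert(sizeof(LongQueryResult) == 16, "long query result must be 128 bits");

// Hypothetical helper: serialize the result into 16 bytes standing in for
// guest memory at the query address.
inline std::array<std::uint8_t, 16> WriteLongQuery(std::uint64_t sequence) {
    const LongQueryResult result{sequence, 0};
    std::array<std::uint8_t, 16> memory{};
    std::memcpy(memory.data(), &result, sizeof(result));
    return memory;
}
```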
bunnei 1efcba346a gl_shader_decompiler: Implement TEXS component mask. 2018-06-03 12:08:17 -04:00
bunnei bb9d39b8fe
Merge pull request #494 from bunnei/shader-tex
gl_shader_decompiler: Implement TEX, fixes for TEXS.
2018-06-03 12:05:38 -04:00
bunnei e54ea773fc gl_shader_decompiler: Implement RRO as a register move. 2018-06-03 11:14:31 -04:00
bunnei 888eb345c0 gl_shader_decompiler: Implement TEX instruction. 2018-05-31 23:36:45 -04:00
bunnei 4c727d0ba8 gl_shader_decompiler: Support multi-destination for TEXS. 2018-05-31 22:57:32 -04:00
bunnei 15086a22be
Merge pull request #489 from Subv/vertexid
Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.
2018-05-30 14:10:48 -04:00
Subv 99f12b05fa Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader. 2018-05-30 10:58:03 -05:00
bunnei 68937a662d gl_shader_decompiler: Partially implement F2F_R instruction. 2018-05-29 23:10:44 -04:00
bunnei ee53688ca7 shader_bytecode: Implement other variants of FMNMX. 2018-05-25 23:18:50 -04:00
bunnei 898f0fa029
Merge pull request #458 from Subv/fmnmx
Shaders: Implemented the FMNMX shader instruction.
2018-05-20 23:44:07 -04:00
Subv 8440cef223 Shaders: Implemented the FMNMX shader instruction. 2018-05-20 17:53:06 -05:00
Subv a056d5ad8c ShadersDecompiler: Added decoding for the PSETP instruction. 2018-05-19 11:41:14 -05:00
bunnei f41eb95e13 maxwell_3d: Reset vertex counts after drawing. 2018-04-29 16:23:31 -04:00
bunnei c7ce472eeb shader_bytecode: Add decoding for FMNMX instruction. 2018-04-29 16:05:17 -04:00
bunnei 6c464a2a4a
Merge pull request #416 from bunnei/shader-ints-p3
gl_shader_decompiler: Implement MOV32I, partially implement I2I, I2F
2018-04-29 12:56:16 -04:00
bunnei f87ea8fa8b fermi_2d: Fix surface copy block height. 2018-04-28 20:40:03 -04:00
bunnei 0c01c34eff gl_shader_decompiler: Partially implement I2I_R, and I2F_R. 2018-04-28 20:03:19 -04:00
bunnei f2dcb39049 shader_bytecode: Add decodings for i2i instructions. 2018-04-28 20:03:18 -04:00
bunnei a7b5ab4d9a gl_shader_decompiler: Implement MOV32_IMM instruction. 2018-04-28 20:03:18 -04:00
Lioncash 8475496630
general: Convert assertion macros over to be fmt-compatible 2018-04-27 10:04:02 -04:00
bunnei c9d7abe9c9 gl_shader_decompiler: Boilerplate for handling integer instructions. 2018-04-26 14:38:42 -04:00
bunnei f81b915fd8
Merge pull request #396 from Subv/shader_ops
Shaders: Implemented the FSET instruction.
2018-04-25 22:42:54 -04:00
Subv 20d86d8a36 GPU: Partially implemented the Fermi2D surface copy operation.
The hardware allows for some rather complicated operations to be performed on the data during the copy; this is not implemented.
Only same-format same-size raw copies are implemented for now.
2018-04-25 12:54:26 -05:00
Subv e9ad8e9185 Shaders: Added bit decodings for the I2I instruction. 2018-04-25 12:52:55 -05:00
Subv 378c881427 GPU: Added surface copy registers to Fermi2D 2018-04-25 11:55:29 -05:00
Subv b1109931b9 GPU: Added boilerplate code for the Fermi2D engine 2018-04-25 11:55:29 -05:00
Subv c16cfbbc6c GPU: Reduce the number of registers of Maxwell3D to 0xE00.
The rest are just macro shim registers.
2018-04-25 11:55:28 -05:00
Subv a994446b6e GPU: Move the Maxwell3D macro uploading code to the inside of the Maxwell3D processor.
It doesn't belong in the PFIFO handler.
2018-04-25 11:55:27 -05:00
Lioncash b7551e457b
video-core: Move logging macros over to new fmt-capable ones 2018-04-25 09:13:57 -04:00
Subv 0369ee7248 Shaders: Added decodings for the FSET instructions. 2018-04-24 22:42:54 -05:00
bunnei 239ac8abe2 memory_manager: Make GpuToCpuAddress return an optional. 2018-04-24 17:49:19 -04:00
bunnei 9e11a76e92 memory_manager: Use GPUVAddr, not PAddr, for GPU addresses. 2018-04-24 17:40:43 -04:00
bunnei e8c2bb24b2
Merge pull request #386 from Subv/gpu_query
GPU: Added asserts to our code for handling the QUERY_GET GPU command.
2018-04-24 16:13:51 -04:00
Subv f208953585 GPU: Added asserts to our code for handling the QUERY_GET GPU command.
This is based on research from nouveau. Many things are currently unknown and will require hwtests in the future.
This commit also stubs QueryMode::Write2 to do the same as Write. Nouveau code treats them interchangeably; it is currently unknown what the difference is.
2018-04-23 17:06:57 -05:00
Subv 9531a29283 GPU: Support multiple enabled vertex arrays.
The vertex arrays will be copied to the stream buffer one after the other, and the attributes will be set using the ARB_vertex_attrib_binding extension.

yuzu now thus requires OpenGL 4.3 or the ARB_vertex_attrib_binding extension.
2018-04-23 11:34:50 -05:00
bunnei e1630c4d43 shader_bytecode: Add several more instruction decodings. 2018-04-20 22:30:56 -04:00
bunnei 9f6d305eab shader_bytecode: Decode instructions based on bit strings. 2018-04-20 22:30:56 -04:00
Subv c3a8ea76f1 ShaderGen: Implemented predicated instruction execution.
Each predicated instruction will be wrapped in an `if (predicate) { instruction_body; }` in the GLSL, where `predicate` is one of the predicate boolean variables previously set by fsetp.
2018-04-20 21:09:33 -05:00
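The wrapping described in the commit above can be sketched as a small string helper. `WrapPredicated` is a hypothetical name, and the real decompiler presumably manages indentation and scopes that are not modeled here.

```cpp
#include <string>

// Hypothetical helper: wrap the GLSL emitted for one instruction in a
// predicate check, following the `if (predicate) { instruction_body; }`
// shape given in the commit message.
inline std::string WrapPredicated(const std::string& predicate, const std::string& body) {
    return "if (" + predicate + ") { " + body + " }";
}
```

For example, `WrapPredicated("p0", "r1 = r2;")` yields `if (p0) { r1 = r2; }`.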
Subv 0a5e01b710 ShaderGen: Implemented the fsetp instruction.
Predicate variables are now added to the generated shader code in the form of 'pX' where X is the predicate id.
These predicate variables are initialized to false on shader startup and are set via the fsetp instructions.

TODO:

* Not all the comparison types are implemented.
* Only the single-predicate version is implemented.
2018-04-20 21:09:33 -05:00
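The predicate declarations described in the commit above might be generated along these lines. `DeclarePredicates` is a hypothetical name, and the exact text the decompiler emits is an assumption; only the 'pX' naming and the initialization to false come from the message.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: emit one GLSL boolean per predicate, named pX and
// initialized to false, as the commit message describes.
inline std::string DeclarePredicates(std::size_t count) {
    std::string out;
    for (std::size_t i = 0; i < count; ++i) {
        out += "bool p" + std::to_string(i) + " = false;\n";
    }
    return out;
}
```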
Subv d03fc77475 ShaderGen: Register id 255 is special and is hardcoded to return 0 (SR_ZERO). 2018-04-20 14:57:40 -05:00
Subv fe84842137 ShaderGen: Implemented the fmul32i shader instruction. 2018-04-19 13:46:32 -05:00
bunnei ce4f159b1c
gl_shader_gen: Support vertical/horizontal viewport flipping. (#347)
* gl_shader_gen: Support vertical/horizontal viewport flipping.

* fixup! gl_shader_gen: Support vertical/horizontal viewport flipping.
2018-04-18 16:42:40 -04:00
Subv 48d4efbd69 GPU: Pitch textures are now supported, don't assert when encountering them. 2018-04-18 12:52:53 -05:00
bunnei c93ea96366
Merge pull request #346 from bunnei/misc-gpu-improvements
Misc gpu improvements
2018-04-17 22:17:07 -04:00
bunnei 71b4a3b9f6
Merge pull request #344 from bunnei/shader-decompiler-p2
Shader decompiler changes part 2
2018-04-17 22:10:53 -04:00
bunnei 4a8eb6745e maxwell3d: Allow Texture2DNoMipmap as Texture2D. 2018-04-17 21:39:15 -04:00
bunnei 531c25386e shader_bytecode: Make ctor's constexpr and explicit. 2018-04-17 21:27:07 -04:00
bunnei 174cba5c58 renderer_opengl: Implement BlendEquation and BlendFunc. 2018-04-17 18:11:48 -04:00
bunnei 5a28dce9eb gl_shader_decompiler: Implement FMUL/FADD/FFMA immediate instructions. 2018-04-17 16:36:42 -04:00
bunnei 8b4443c966 gl_shader_decompiler: Add support for TEXS instruction. 2018-04-17 16:36:38 -04:00
bunnei 1a1af3fda3 gl_rasterizer: Implement indexed vertex mode. 2018-04-16 21:10:15 -04:00
Subv ae58e46036 GPU: Added a function to determine whether a shader stage is enabled or not. 2018-04-14 22:54:23 -05:00
bunnei 1b41b875dc shaders: Add NumTextureSamplers const, remove unused #pragma. 2018-04-14 18:50:06 -04:00
bunnei e6224fec27 shaders: Address PR review feedback. 2018-04-14 16:01:41 -04:00
bunnei 0d408b965b shaders: Fix GCC and clang build issues. 2018-04-14 16:01:40 -04:00
bunnei 86135864da gl_shader_decompiler: Implement negate, abs, etc. and lots of cleanup. 2018-04-14 16:01:40 -04:00
bunnei 7639667562 shader_bytecode: Add FSETP and KIL to GetInfo. 2018-04-14 16:01:40 -04:00
bunnei 5a47832221 shader_bytecode: Add SubOp decoding. 2018-04-14 16:01:40 -04:00
bunnei 35aca0bf1f maxwell_3d: Make memory_manager public. 2018-04-13 23:48:27 -04:00
bunnei 33bb53571b maxwell_3d: Fix shader_config decodings. 2018-04-13 23:48:26 -04:00
bunnei 4e7e0f8112 shader_bytecode: Add initial module for shader decoding. 2018-04-13 23:48:19 -04:00
Subv dcc27d6dc1 GPU: Assert when finding a texture with a format type other than UNORM. 2018-04-06 20:44:46 -06:00
Subv 11b4ab9685 GPU: Use the MacroInterpreter class to execute the GPU macros instead of HLEing them. 2018-04-01 12:07:26 -05:00
Subv 1ec8d2123d GPU: Implemented a gpu macro interpreter.
The Ryujinx macro interpreter and envydis were used as reference.

Macros are programs that are uploaded by the games during boot and can later be called by writing to their method id in a GPU command buffer.
2018-04-01 12:07:26 -05:00
bunnei d30110348b gl_rasterizer: Add a SyncViewport method. 2018-03-26 21:17:04 -04:00
bunnei a6cab532f8 gl_rasterizer: Normalize vertex array data as appropriate. 2018-03-26 21:17:02 -04:00
bunnei 3754e0fdfd maxwell_3d: Use names that match envytools for VertexType. 2018-03-26 21:16:55 -04:00
bunnei 15925b8293 maxwell_3d: Add VertexAttribute struct and cleanup. 2018-03-26 21:16:54 -04:00
bunnei 33c0bf9dc5 Maxwell3D: Call AccelerateDrawBatch on DrawArrays. 2018-03-26 21:16:52 -04:00
bunnei ed2134784e gl_rasterizer: Implement AnalyzeVertexArray. 2018-03-26 21:16:52 -04:00
bunnei 94c70693f9 maxwell: Add RenderTargetFormat enum. 2018-03-26 21:16:49 -04:00
Subv 4697025b73 GPU: Load the sampler info (TSC) when retrieving active textures. 2018-03-26 15:46:49 -05:00
Subv 0ce52b1da2 GPU: Make the debug_context variable a member of the frontend instead of a global. 2018-03-24 23:35:06 -05:00
Subv 2c785bd06c GPU: Added a function to retrieve the active textures for a shader stage.
TODO: A shader may not use all of these textures at the same time, shader analysis should be performed to determine which textures are actually sampled.
2018-03-24 11:31:53 -05:00
Subv 1c31e2b3d2 GPU: Implement the Incoming/FinishedPrimitiveBatch debug breakpoints. 2018-03-24 11:31:50 -05:00
Subv 1ad97c75a0 GPU: Implement the MaxwellCommandLoaded/Processed debug breakpoints. 2018-03-24 11:31:50 -05:00
Subv 1b8d798835 GPU: Added a method to unswizzle a texture without decoding it.
Allow unswizzling of DXT1 textures.
2018-03-24 11:30:56 -05:00
Subv 71ebc3e90d GPU: Preliminary work for texture decoding. 2018-03-24 11:30:56 -05:00
Subv 9b9de30086 GPU: Added viewport registers to Maxwell3D's reg structure. 2018-03-24 01:22:19 -05:00
bunnei 3a6604e8fa maxwell_3d: Add some format decodings and string helper functions. 2018-03-22 19:47:28 -04:00
Subv c450d264eb GPU: Added vertex attribute format registers. 2018-03-21 09:26:47 -05:00
Subv ae28a52277 GPU: Added registers for the number of vertices to render. 2018-03-20 23:28:06 -05:00
Mat M f4700ccabf
Merge pull request #253 from Subv/rt_depth
GPU: Added registers for color and Z buffers.
2018-03-19 23:37:47 -04:00
Subv 7a27a11770 GPU: Added Z buffer registers to Maxwell3D's reg structure. 2018-03-19 16:55:33 -05:00
Subv 21d9519032 GPU: Added the render target (RT) registers to Maxwell3D's reg structure. 2018-03-19 16:46:29 -05:00
N00byKing 1d8b6ad13b Clang Fixes 2018-03-19 17:53:35 +01:00
N00byKing ef875d6a35 Clean Warnings (?) 2018-03-19 17:07:08 +01:00
Subv dcae0c9a4f GPU: Added the TSC registers to the Maxwell3D register structure. 2018-03-19 00:36:25 -05:00
Subv cff7b29bba GPU: Added the TIC registers to the Maxwell3D register structure. 2018-03-19 00:32:57 -05:00
Subv 03156d0c9a GPU: Implement macro 0xE1A BindTextureInfoBuffer in HLE.
This macro simply sets the current CB_ADDRESS to the texture buffer address for the input shader stage.
2018-03-18 19:03:40 -05:00
Subv 7b6868e908 GPU: Implement the BindStorageBuffer macro method in HLE.
This macro binds the SSBO Info Buffer as the current ConstBuffer.
This buffer is usually bound to c0 during shader execution.
Games seem to use this macro instead of directly writing the address for some reason.
2018-03-18 16:50:42 -05:00
Subv 85d820b1b4 GPU: Handle writes to the CB_DATA method.
Writing to this method will cause the written value to be stored in the currently-set ConstBuffer plus CB_POS.

This method is usually used to upload uniforms or other shader-visible data.
2018-03-18 15:23:24 -05:00
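A CB_DATA write as described in the commit above can be sketched as follows. All names are hypothetical, the byte vector merely stands in for GPU memory, and advancing CB_POS after each write is an assumption that the message does not state.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical state for the currently-set ConstBuffer.
struct ConstBufferState {
    std::uint64_t address = 0; // base of the buffer (an offset into `memory` here)
    std::uint32_t pos = 0;     // CB_POS, in bytes
};

// Hypothetical handler: store the written value at address + CB_POS.
inline void WriteCbData(std::vector<std::uint8_t>& memory, ConstBufferState& cb,
                        std::uint32_t value) {
    const std::uint64_t target = cb.address + cb.pos;
    if (target + sizeof(value) > memory.size()) {
        return; // out of range for this stand-in memory; ignore the write
    }
    std::memcpy(memory.data() + target, &value, sizeof(value));
    // Assumption: the position advances by one word so that consecutive
    // writes fill consecutive words of the buffer.
    cb.pos += 4;
}
```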
Subv aa586fa268 GPU: Store uploaded GPU macros and keep track of the number of method parameters. 2018-03-18 11:51:46 -05:00
Subv 7ac8657432 GPU: Macros are specific to the Maxwell3D engine, so handle them internally. 2018-03-18 11:51:45 -05:00
Subv ccb8da1512 GPU: Renamed ShaderType to ShaderStage as that is less confusing. 2018-03-17 18:32:57 -05:00
Subv 88698c156f GPU: Store shader constbuffer bindings in the GPU state. 2018-03-17 18:32:57 -05:00
Subv 66dae22790 GPU: Corrected some register offsets and removed superfluous macro registers. 2018-03-17 18:32:56 -05:00
Subv 1d9d9c16e8 GPU: Make the SetShader macro call do the same as the real macro's code.
It'll now set the CB_SIZE, CB_ADDRESS and CB_BIND registers when it's called.

Presumably this SetShader function is binding the constant shader uniforms to buffer 1 (c1[]).
2018-03-17 18:32:55 -05:00
Subv 579000e747 GPU: Corrected the parameter documentation for the SetShader macro call.
Register 0xE24 is actually a macro that sets some shader parameters in the register structure.

Macros are uploaded to the GPU at startup and have their own ISA; we'll probably write an interpreter for this in the future.
2018-03-17 13:55:42 -05:00
bunnei 516ef4f19f
Merge pull request #242 from Subv/set_shader
GPU: Handle the SetShader method call (0xE24) and store the shader config.
2018-03-17 00:34:17 -04:00
Subv f93d769a1c GPU: Handle the SetShader method call (0xE24) and store the shader config. 2018-03-16 22:51:06 -05:00
Subv d2888f7e90 GPU: Added the vertex array registers. 2018-03-16 22:47:45 -05:00
bunnei cd4e8a989c
Merge pull request #241 from Subv/gpu_method_call
GPU: Process command mode 5 (IncreaseOnce) differently from other commands
2018-03-16 22:28:22 -04:00
Subv 29feece4b8 GPU: Process command mode 5 (IncreaseOnce) differently from other commands.
Accumulate all arguments before calling the desired method.

Note: Maybe we should do the same for the NonIncreasing mode?
2018-03-16 20:32:44 -05:00
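The accumulate-then-call behaviour described in the commit above can be sketched with a small helper class. `MethodCallAccumulator` and `MethodHandler` are hypothetical names, and the sketch assumes the expected argument count is known up front, which the message does not specify.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical callback type: receives the method id and all of its arguments.
using MethodHandler =
    std::function<void(std::uint32_t method, const std::vector<std::uint32_t>& params)>;

// Hypothetical accumulator: collects arguments and invokes the method once,
// after the last argument has arrived.
class MethodCallAccumulator {
public:
    MethodCallAccumulator(std::uint32_t method, std::size_t arg_count, MethodHandler handler)
        : method_(method), arg_count_(arg_count), handler_(std::move(handler)) {}

    void Push(std::uint32_t argument) {
        params_.push_back(argument);
        if (params_.size() == arg_count_) {
            handler_(method_, params_);
            params_.clear();
        }
    }

private:
    std::uint32_t method_;
    std::size_t arg_count_;
    MethodHandler handler_;
    std::vector<std::uint32_t> params_;
};
```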
Subv bf310a41b8 GPU: Assert that we get a 0 CODE_ADDRESS register in the 3D engine.
Shader address calculation depends on this value to some extent; we do not currently know what it being 0 entails.
2018-03-16 19:24:41 -05:00
Subv cbec739e7b GPU: Added Maxwell registers for Shader Program control. 2018-03-16 19:23:11 -05:00