Commit graph

1476 commits

Author SHA1 Message Date
Subv 4c59105adf GPU: Implement offsetted rendering when using non-indexed drawing. 2018-07-02 11:23:36 -05:00
Subv fca3d1cc65 GPU: Fixed the index offset rendering, and implemented the base vertex functionality.
This fixes Stardew Valley.
2018-07-02 11:22:17 -05:00
Subv cc73bad293 GPU: Added register definitions for the vertex buffer base element. 2018-07-02 11:21:23 -05:00
bunnei 3d41fdfbba
Merge pull request #604 from Subv/invalid_textures
GPU: Ignore invalid and disabled textures when drawing.
2018-07-02 11:48:18 -04:00
Subv ca633a5a3c GPU: Directly copy the pixels when performing a same-layout DMA. 2018-07-02 09:46:33 -05:00
Subv 80c5e8ae99 GPU: Ignore disabled textures and textures with an invalid address. 2018-07-02 09:43:38 -05:00
Subv e9d147349b GPU: Allow GpuToCpuAddress to return boost::none for unmapped addresses. 2018-07-02 09:42:48 -05:00
bunnei 066d6184d4
Merge pull request #602 from Subv/mufu_subop
GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation.
2018-07-01 11:06:04 -04:00
bunnei b611d852db
Merge pull request #601 from Subv/rgba32_ui
GPU: Implement the RGBA32_UINT rendertarget format.
2018-07-01 03:22:38 -04:00
Subv f33e406ff2 GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation. 2018-06-30 14:48:25 -05:00
Subv c0e2d52758 GPU: Implemented the RGBA32_UINT rendertarget format. 2018-06-30 14:23:13 -05:00
Subv b11072d54a GLCache: Specify the component type along the texture type in the format tuple. 2018-06-30 14:08:51 -05:00
bunnei c96da97630 gl_shader_decompiler: Implement predicate NotEqualWithNan. 2018-06-30 03:01:25 -04:00
bunnei 50ef2beb58
Merge pull request #595 from bunnei/raster-cache
Rewrite the OpenGL rasterizer cache
2018-06-29 14:07:28 -04:00
bunnei c18425ef98 gl_rasterizer_cache: Only dereference color_surface/depth_surface if valid. 2018-06-29 13:08:08 -04:00
bunnei 7fa9177830
gl_shader_decompiler: Add a return path for unknown instructions. 2018-06-27 01:14:34 -04:00
bunnei 1dd754590f gl_rasterizer_cache: Implement caching for texture and framebuffer surfaces.
gl_rasterizer_cache: Improved cache management based on Citra's implementation.

gl_surface_cache: Add some docstrings.
2018-06-27 00:15:44 -04:00
bunnei 8af1ae46aa gl_rasterizer_cache: Various fixes for ASTC handling. 2018-06-27 00:08:04 -04:00
bunnei c7c379bd19 gl_rasterizer_cache: Use SurfaceParams as a key for surface caching. 2018-06-27 00:08:04 -04:00
bunnei 6a28a66832 maxwell_3d: Add a struct for RenderTargetConfig. 2018-06-27 00:08:04 -04:00
bunnei 3f9f047375 gl_rasterizer: Implement AccelerateDisplay to forward textures to framebuffers. 2018-06-27 00:08:03 -04:00
bunnei ff6785f3e8 gl_rasterizer_cache: Cache size_in_bytes as a const per surface. 2018-06-27 00:08:03 -04:00
bunnei 9f2f819bb6 gl_rasterizer_cache: Refactor to make SurfaceParams members const. 2018-06-27 00:08:03 -04:00
bunnei 5f57ab1b2a gl_rasterizer_cache: Remove Citra's rasterizer cache, always load/flush surfaces. 2018-06-27 00:08:03 -04:00
bunnei 10422f3c18 gl_rasterizer: Workaround for when exceeding max UBO size. 2018-06-26 23:07:34 -04:00
bunnei dfac394e60
Merge pull request #593 from bunnei/fix-swizzle
gl_state: Fix state management for texture swizzle.
2018-06-26 22:05:49 -04:00
bunnei 73de9bab1a
Merge pull request #592 from bunnei/cleanup-gl-state
gl_state: Remove unused state management from 3DS.
2018-06-26 22:05:03 -04:00
bunnei 8447d20a11 gl_state: Fix state management for texture swizzle. 2018-06-26 17:15:58 -04:00
bunnei 20b58bab9c gl_state: Remove unused state management from 3DS. 2018-06-26 17:09:25 -04:00
bunnei 41b3725d28 gl_rasterizer_cache: Fix inverted B5G6R5 format. 2018-06-26 17:07:36 -04:00
bunnei 36dedae842
Merge pull request #554 from Subv/constbuffer_ubo
Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.
2018-06-26 10:25:56 -04:00
mailwl ad39bab271 Fix crash at exit 2018-06-25 18:01:08 +03:00
Subv a3d82ef5d9 Build: Fixed some MSVC warnings in various parts of the code. 2018-06-20 11:39:10 -05:00
bunnei 7a0bb406d5
Merge pull request #574 from Subv/shader_abs_neg
GPU: Perform negation after absolute value in the float shader instructions.
2018-06-18 22:24:57 -04:00
Subv 38989bef43 GPU: Perform negation after absolute value in the float shader instructions. 2018-06-18 19:56:29 -05:00
Subv eab7457c00 GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.
Like the MOV32I and FMUL32I instructions.
This fixes a potential crash when using these instructions.
2018-06-18 19:50:35 -05:00
bunnei 0e13d9cb7b
Merge pull request #570 from bunnei/astc
gl_rasterizer: Implement texture format ASTC_2D_4X4.
2018-06-18 19:08:49 -04:00
bunnei ea080501fb
Merge pull request #571 from Armada651/loose-blend
gl_rasterizer: Get loose on independent blending.
2018-06-18 11:36:50 -04:00
Jules Blok 7c7f4a9be2 gl_rasterizer: Get loose on independent blending. 2018-06-18 09:27:06 +02:00
bunnei 61779fa072 gl_rasterizer: Implement texture format ASTC_2D_4X4. 2018-06-18 01:56:59 -04:00
bunnei fe906fff36 gl_rasterizer_cache: Loosen things up a bit. 2018-06-18 00:55:59 -04:00
bunnei afdd657d30 gl_shader_decompiler: Implement LOP instructions. 2018-06-17 15:27:48 -04:00
bunnei 5673ce39c7 gl_shader_decompiler: Refactor LOP32I instruction a bit in support of LOP. 2018-06-17 13:31:39 -04:00
bunnei d383043e07 gl_shader_decompiler: Implement integer size conversions for I2I/I2F/F2I. 2018-06-15 22:42:02 -04:00
bunnei fb5bd0920d
Merge pull request #564 from bunnei/lop32i_passb
gl_shader_decompiler: Implement LOP32I LogicOperation PassB.
2018-06-15 22:04:03 -04:00
bunnei 55c49d5bf4 gl_shader_gen: Set position.w to 1. 2018-06-15 20:47:04 -04:00
bunnei 61f9d9c4ab gl_shader_decompiler: Implement LOP32I LogicOperation PassB. 2018-06-15 20:43:33 -04:00
bunnei 019d7208c8
Merge pull request #556 from Subv/dma_engine
GPU: Partially implemented the Maxwell DMA engine.
2018-06-12 14:25:17 -04:00
bunnei 2015a1b180
Merge pull request #558 from Subv/iadd32i
GPU: Implemented the iadd32i shader instruction.
2018-06-12 14:19:25 -04:00
Subv db0497b808 GPU: Implemented the iadd32i shader instruction. 2018-06-12 11:46:45 -05:00
Subv 987a170665 GPU: Partially implemented the Maxwell DMA engine.
Only tiled->linear and linear->tiled copies that aren't offsetted are supported for now. Queries are not supported. Swizzled copies are not supported.
2018-06-12 11:27:36 -05:00
bunnei 5f3d6c85db gl_shader_decompiler: Implement saturate for float instructions. 2018-06-11 21:46:34 -04:00
Subv 004b1b3830 GPU: Convert the gl_InstanceId and gl_VertexID variables to floats when reading from them.
This corrects the invalid position values in some games when doing attribute-less rendering.
2018-06-10 13:50:19 -05:00
Subv 2a7653142d Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.
This should help a bit with GPU performance once we're GPU-bound.
2018-06-09 18:02:05 -05:00
Subv b366b885a1 GPU: Implement the iset family of shader instructions. 2018-06-09 16:19:13 -05:00
Subv 3cb753eeb1 GPU: Added decodings for the ISET family of instructions. 2018-06-09 15:56:50 -05:00
bunnei d81aaa3ed3
Merge pull request #550 from Subv/ssy
GPU: Stub the SSY shader instruction.
2018-06-09 00:42:53 -04:00
bunnei e2176dc7ce
Merge pull request #551 from bunnei/shr
gl_shader_decompiler: Implement SHR instruction.
2018-06-09 00:42:44 -04:00
bunnei 5440b9c634 gl_shader_decompiler: Implement SHR instruction. 2018-06-09 00:01:17 -04:00
Subv abec5f82e2 GPU: Stub the SSY shader instruction.
This instruction tells the GPU where the flow reconverges in a non-uniform control flow scenario, we can ignore this when generating GLSL code.
2018-06-08 22:46:10 -05:00
bunnei bbc4f369ed gl_shader_decompiler: Implement IADD instruction. 2018-06-08 23:25:22 -04:00
bunnei 79e9c2e237 gl_shader_decompiler: Add missing asserts for saturate_a instructions. 2018-06-08 23:24:10 -04:00
Subv c011b6f67e GPU: Synchronize the blend state on every draw call.
Only independent blending on render target 0 is implemented for now.

This fixes the elongated squids in Splatoon 2's boot screen.
2018-06-08 17:05:52 -05:00
Subv c712dafaee GPU: Added registers for normal and independent blending. 2018-06-08 17:04:41 -05:00
bunnei a931cf9e8b
Merge pull request #547 from Subv/compressed_alignment
GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.
2018-06-08 16:40:49 -04:00
Subv 8d9534d830 GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.
This fixes issues with retrieving non-block-aligned tiled compressed textures from the cache.
2018-06-08 12:27:19 -05:00
Subv 47dc5e0dab Rasterizer: Flush the written region when writing shader uniform data before copying it to the uniform buffers.
This fixes the flip_viewport uniform having invalid values when drawing.
2018-06-08 12:22:39 -05:00
bunnei ee318d4015
Merge pull request #543 from Subv/uniforms
GLRenderer: Write the shader stage configuration UBO data *before* copying it to the GPU.
2018-06-07 11:21:36 -04:00
Subv 86146ef819 GLRenderer: Write the shader stage configuration UBO data *before* copying it to the GPU.
This should fix the bug with the vs_config UBO being uninitialized during shader execution.
2018-06-07 08:33:23 -05:00
bunnei 0639e03055
Merge pull request #542 from bunnei/bfe_imm
gl_shader_decompiler: Implement BFE_IMM instruction.
2018-06-07 01:49:45 -04:00
bunnei 930487c7fb
Merge pull request #541 from Subv/blittextures
GLCache: Fixed copying compressed textures in the rasterizer cache.
2018-06-07 01:35:01 -04:00
bunnei 92209f905f gl_shader_decompiler: Implement BFE_IMM instruction. 2018-06-07 00:58:12 -04:00
Subv f22e090b86 GLCache: Use the full uncompressed size when blitting from one texture to another.
This avoids the problem of only copying a tiny piece of the textures when they are compressed.
2018-06-06 23:26:36 -05:00
Subv 218a08df93 GLCache: Simplify the logic to copy from one texture to another in BlitTextures.
We now use glCopyImageSubData, this should avoid errors with trying to attach a compressed texture as a framebuffer's color attachment and then blitting to it.

Maybe in the future we can change this to glCopyTextureSubImage which only requires GL_ARB_direct_state_access.
2018-06-06 23:25:24 -05:00
bunnei 128aeba0f3 gl_shader_decompiler: F2F: Implement rounding modes. 2018-06-06 22:21:29 -04:00
bunnei 03f877919d
Merge pull request #537 from bunnei/misc-shader
gl_shader_decompiler: Additional decodings, remove unused stuff from TEX
2018-06-06 21:44:37 -04:00
bunnei 37f50c8773
Merge pull request #535 from Subv/gpu_swizzle
GPU: Support changing the texture swizzles for Maxwell textures.
2018-06-06 21:39:47 -04:00
bunnei 00c830405b gl_shader_decompiler: Remove some attribute stuff that has nothing to do with TEX/TEXS. 2018-06-06 19:47:41 -04:00
bunnei 4b114e1b8a shader_bytecode: Add instruction decodings for BFE, IMNMX, and XMAD. 2018-06-06 19:47:34 -04:00
bunnei 0a49c46353 gl_shader_decompiler: Implement ISETP_IMM instruction. 2018-06-06 19:45:58 -04:00
Subv 47629c89a8 GPU: Support changing the texture swizzles for Maxwell textures. 2018-06-06 18:36:15 -05:00
Subv 89e81a9be2 GLState: Support changing the GL_TEXTURE_SWIZZLE parameter of each texture unit. 2018-06-06 18:36:13 -05:00
bunnei 0ff2929644
Merge pull request #534 from Subv/multitexturing
GPU: Implement sampling multiple textures in the generated glsl shaders.
2018-06-06 19:12:52 -04:00
bunnei 4669f15f8b gl_shader_decompiler: Implement LD_C instruction. 2018-06-06 18:09:06 -04:00
bunnei 4112aa68a6 gl_shader_gen: Add uniform handling for indirect const buffer access. 2018-06-06 18:09:05 -04:00
bunnei 6e386a334b gl_shader_decompiler: Refactor uniform handling to allow different decodings. 2018-06-06 17:57:15 -04:00
Subv dbfc39d214 GPU: Implement sampling multiple textures in the generated glsl shaders.
All tested games that use a single texture show no regression.

Only Texture2D textures are supported right now, each shader gets its own "tex_fs/vs/gs" sampler array to maintain independent textures between shader stages, the textures themselves are reused if possible.
2018-06-06 12:58:16 -05:00
Sebastian Valle ce026332a5
Merge pull request #531 from bunnei/fix-shl
gl_shader_decompiler: Fix un/signed mismatch with SHL.
2018-06-06 08:28:42 -05:00
Sebastian Valle fa220dd709
Merge pull request #530 from bunnei/wrap-mirror
maxwell_to_gl: Implement WrapMode Mirror.
2018-06-06 08:28:27 -05:00
bunnei 9a85277d83
Merge pull request #527 from Subv/rgba32f_texcopy
GPU: Allow the usage of RGBA32_FLOAT and RGBA16_FLOAT in the texture copy engine.
2018-06-06 00:24:13 -04:00
bunnei 05dc93399b
Merge pull request #528 from Subv/rg11b10f
GPU: Implemented the R11FG11FB10F texture and rendertarget formats.
2018-06-06 00:22:54 -04:00
bunnei 566f97b580 gl_shader_decompiler: Fix un/signed mismatch with SHL. 2018-06-05 23:58:06 -04:00
bunnei bf0543af23 maxwell_to_gl: Implement WrapMode Mirror. 2018-06-05 23:56:45 -04:00
Subv adf47cd59a GPU: Allow the usage of RGBA16_FLOAT in the texture copy engine. 2018-06-05 22:01:20 -05:00
Subv c531a92eda GPU: Implemented the R11FG11FB10F texture and rendertarget formats. 2018-06-05 21:57:16 -05:00
Subv 14afc704d4 GPU: Fixed the compression factor for RGBA16F textures.
They're not compressed.
2018-06-05 21:55:17 -05:00
Subv 8d70d1ea45 GPU: Allow the usage of RGBA32_FLOAT in the texture copy engine. 2018-06-05 21:07:40 -05:00
bunnei 5fb99e6a16
Merge pull request #516 from Subv/f2i_r
GPU: Implemented the F2I_R shader instruction.
2018-06-05 22:01:29 -04:00
bunnei 38eb33f150
Merge pull request #521 from Subv/bra
GPU: Corrected the branch targets for the shader bra instruction.
2018-06-05 10:09:35 -04:00
bunnei b54a72afc0
Merge pull request #520 from bunnei/shader-shl
gl_shader_decompiler: Implement SHL instruction.
2018-06-05 10:08:42 -04:00
Subv e7dfcdde74 GPU: Corrected the branch targets for the shader bra instruction. 2018-06-04 22:56:28 -05:00
Subv 4b89348c00 GPU: Implemented the F2I_R shader instruction. 2018-06-04 22:06:50 -05:00
bunnei 8c99dd055c
Merge pull request #518 from Subv/incomplete_shaders
GPU: Implemented predicated exit instructions in the shader programs.
2018-06-04 22:43:46 -04:00
bunnei 799e632ccb gl_shader_decompiler: Fix typo with ISCADD instruction. 2018-06-04 22:41:10 -04:00
bunnei c23c30c76f gl_shader_decompiler: Implement SHL instruction. 2018-06-04 22:36:49 -04:00
bunnei 6ea1576513 gl_shader_decompiler: Implement PredCondition::NotEqual. 2018-06-04 22:00:47 -04:00
Subv 23b1e6eded GPU: Implement the ISCADD shader instructions. 2018-06-04 20:17:41 -05:00
Subv 438a9b70cc GPU: Added decodings for the ISCADD instructions. 2018-06-04 20:17:39 -05:00
bunnei e8bfff7b4b
Merge pull request #514 from Subv/lop32i
GPU: Implemented the LOP32I instruction.
2018-06-04 20:48:15 -04:00
bunnei f564822e78
Merge pull request #510 from Subv/isetp
GPU: Implemented the ISETP_R and ISETP_C instructions
2018-06-04 20:47:11 -04:00
Subv 6cf6fa2842 GPU: Implement predicated exit instructions in the shader programs. 2018-06-04 19:18:11 -05:00
Subv d27279092f GPU: Take into account predicated exits when performing shader control flow analysis. 2018-06-04 19:14:23 -05:00
bunnei 37fd4e6d9b
Merge pull request #512 from Subv/fset
GPU: Corrected the FSET and I2F instructions.
2018-06-04 19:04:20 -04:00
bunnei cdd92dc692
Merge pull request #501 from Subv/shader_bra
GPU: Partially implemented the bra shader instruction
2018-06-04 18:31:07 -04:00
bunnei 38d25a4cb2
Merge pull request #515 from Subv/viewport_fix
GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
2018-06-04 18:11:36 -04:00
Subv 2933521a08 GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f. 2018-06-04 16:41:28 -05:00
Subv f6679ce422 GPU: Corrected the I2F_R implementation. 2018-06-04 16:41:27 -05:00
Subv 5d55403f94 GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
This is how nouveau calculates the viewport width and height. For some reason some games set 0xFFFF in the VIEWPORT_HORIZ and VIEWPORT_VERT registers, maybe those are a misnomer and actually refer to something else?
2018-06-04 16:36:54 -05:00
Subv 0c688b421c GPU: Implemented the LOP32I instruction. 2018-06-04 13:56:31 -05:00
Subv cb47abecc6 GLCache: Corrected a mismatch between storing compressed sizes and verifying the uncompressed alignment in GetSurface. 2018-06-04 13:01:53 -05:00
Subv 90cddf1996 GPU: Use explicit types when retrieving the uniform values for fsetp/fset and isetp instead of the type of an invalid output register. 2018-06-04 11:22:26 -05:00
Subv 7c181fd4f4 GPU: Implemented the ISETP_R and ISETP_C shader instructions. 2018-06-04 11:12:03 -05:00
Subv b481d8a00d GPU: Partially implemented the shader BRA instruction. 2018-06-03 22:26:36 -05:00
Subv 06c72b4fcf GPU: Added decoding for the BRA instruction. 2018-06-03 22:14:00 -05:00
bunnei ba117854f9
Merge pull request #500 from Subv/long_queries
GPU: Partial implementation of long GPU queries.
2018-06-03 21:24:50 -04:00
Subv d57333406d GPU: Partial implementation of long GPU queries.
Long queries write a 128-bit result value to memory, which consists of a 64 bit query value and a 64 bit timestamp.

In this implementation, only select=Zero of the Crop unit is implemented, this writes the query sequence as a 64 bit value, and a 0u64 value for the timestamp, since we emulate an infinitely fast GPU.

This specific type was hwtested, but more rigorous tests should be performed in the future for the other types.
2018-06-03 19:17:31 -05:00
bunnei 1efcba346a gl_shader_decompiler: Implement TEXS component mask. 2018-06-03 12:08:17 -04:00
bunnei bb9d39b8fe
Merge pull request #494 from bunnei/shader-tex
gl_shader_decompiler: Implement TEX, fixes for TEXS.
2018-06-03 12:05:38 -04:00
bunnei 27c0f9e02d
Merge pull request #495 from bunnei/improve-rro
gl_shader_decompiler: Implement RRO as a register move.
2018-06-03 12:05:26 -04:00
bunnei e54ea773fc gl_shader_decompiler: Implement RRO as a register move. 2018-06-03 11:14:31 -04:00
Subv 99f9d47d16 GPU: Implemented the DXN1 (BC4) texture format. 2018-06-02 13:17:09 -05:00
bunnei 888eb345c0 gl_shader_decompiler: Implement TEX instruction. 2018-05-31 23:36:45 -04:00
bunnei 4c727d0ba8 gl_shader_decompiler: Support multi-destination for TEXS. 2018-05-31 22:57:32 -04:00
bunnei 49309b5848 gl_rasterizer_cache: Assert that component type is UNorm or format is RGBA16F. 2018-05-30 22:50:41 -04:00
bunnei ca5a4a704b gl_rasterizer_cache: Implement PixelFormat RGBA16F. 2018-05-30 22:24:07 -04:00
bunnei 15086a22be
Merge pull request #489 from Subv/vertexid
Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.
2018-05-30 14:10:48 -04:00
Subv 99f12b05fa Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader. 2018-05-30 10:58:03 -05:00
Sebastian Valle 8df011a57f
Merge pull request #483 from bunnei/sonic
Several GPU fixes to boot Sonic Mania
2018-05-30 07:31:46 -05:00
bunnei 6fcc7e9c36 gl_shader_decompiler: F2F_R instruction: Implement abs. 2018-05-29 23:52:54 -04:00
bunnei 68937a662d gl_shader_decompiler: Partially implement F2F_R instruction. 2018-05-29 23:10:44 -04:00
Subv 734106dcb9 GPU: Implemented the R8 texture format (0x1D) 2018-05-29 21:49:37 -05:00
bunnei 0d843eaba6 gl_rasterize_cache: Invert order of tex format RGB565. 2018-05-29 22:16:18 -04:00
greggameplayer 220d4672df add all the known TextureFormat (#474) 2018-05-28 19:26:17 -04:00
bunnei d809f65827
Merge pull request #472 from bunnei/greater-equal
gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual.
2018-05-27 12:14:30 -04:00
bunnei 7f155ba713
Merge pull request #476 from Subv/a1bgr5
GPU: Implemented the A1B5G5R5 texture format (0x14)
2018-05-27 12:14:08 -04:00
Subv 7ddc872b52 GPU: Implemented the A1B5G5R5 texture format (0x14) 2018-05-27 09:02:05 -05:00
bunnei c23ce3365d gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual. 2018-05-25 23:21:29 -04:00
bunnei ee53688ca7 shader_bytecode: Implement other variants of FMNMX. 2018-05-25 23:18:50 -04:00
bunnei aee356bd10
Merge pull request #468 from Subv/compound_preds
Shader: Implemented compound predicates in the fset and fsetp instructions
2018-05-25 22:28:47 -04:00
Subv e2cdf54177 Shader: Implemented compound predicates in fset.
You can specify a predicate in the fset instruction:

Result = ((Value1 Comp Value2) OP P0) ? 1.0 : 0.0;
2018-05-24 17:39:59 -05:00