Commit graph

504 commits

Author SHA1 Message Date
bunnei 4b5141954e gl_shader_gen: Add additional function documentation. 2015-10-21 21:53:17 -04:00
bunnei 2a0a86f629 gl_shader_util: Cleanup header file + add docstring. 2015-10-21 21:53:16 -04:00
bunnei a74774257e gl_shader_gen: Various cleanups + moved TEV stage generation to its own function. 2015-10-21 21:53:16 -04:00
bunnei c86b9d4242 renderer_opengl: Refactor shader generation/caching to be more organized + various cleanups. 2015-10-21 21:53:14 -04:00
bunnei 3c057bd3d8 gl_rasterizer: Move logic for creating ShaderCacheKey to a static function. 2015-10-21 21:53:05 -04:00
bunnei b02a533d94 gl_shader_util: Use vec3 constants for AppendColorCombiner. 2015-10-21 21:51:24 -04:00
bunnei 37b0aa5af7 gl_rasterizer: Fix typo in uploading TEV const color uniforms. 2015-10-21 21:51:24 -04:00
bunnei 82f3e6dc69 gl_shader_util: Fix precision bug with alpha testing.
- Alpha testing is not done with float32 precision; this makes the HW renderer match the SW renderer.
2015-10-21 21:51:23 -04:00
Subv e3f4233cef Initial implementation of fragment shader generation with caching. 2015-10-21 21:51:23 -04:00
Emmanuel Gil Peyrot 14af5919ba CitraQt, SkyEye, Loader, VideoCore: Remove newlines in LOG_* calls.
The LOG_* function itself already appends one.
2015-10-09 22:14:56 +01:00
Rohit Nirmal 32391cffdd Silence -Wsign-compare warnings. 2015-10-06 22:16:15 -05:00
Martin Lindhe bafb7afba2 fix some xcode 7.0 warnings 2015-09-29 23:11:09 +02:00
Lioncash 751fbfdcc3 general: Silence some warnings when using clang 2015-09-16 08:51:53 -04:00
Lioncash aec28ed91e video_core: Reorganize headers 2015-09-11 07:31:15 -04:00
Lioncash 1fa772393b video_core: Remove unnecessary includes from headers 2015-09-11 00:10:03 -04:00
bunnei a008b28659 Merge pull request #1133 from lioncash/emplace-back
gl_rasterizer: Replace push_back calls with emplace_back in AddTriangle
2015-09-10 15:07:06 -04:00
bunnei 0d5604fdcb Merge pull request #1136 from lioncash/proto
renderer_opengl: Remove unimplemented function declaration
2015-09-10 11:29:33 -04:00
Lioncash 8a3428f16c renderer_opengl: Remove unimplemented function declaration 2015-09-10 10:45:44 -04:00
Lioncash 526eb33d1e video_core: Remove unused variables 2015-09-10 10:26:21 -04:00
Lioncash 7b72b71605 gl_rasterizer: Replace push_back calls with emplace_back in AddTriangle 2015-09-10 00:20:30 -04:00
aroulin 1484a23530 Shader JIT: Use SCALE constant from emitter 2015-09-07 16:50:28 +02:00
aroulin 87e3b9ffc0 Shader: Fix size_t to int casts of register offsets 2015-09-07 16:50:28 +02:00
Yuri Kunde Schlesner b044c047c4 OpenGL: Use Sampler Objects to decouple sampler config from textures
Fixes #978
2015-09-03 15:09:51 -03:00
Yuri Kunde Schlesner 466e608c19 OpenGL: Remove ugly and endian-unsafe color pointer casts 2015-09-03 15:09:51 -03:00
Yuri Kunde Schlesner ec28f037e6 OpenGL: Add support for Sampler Objects to state tracker 2015-09-03 15:09:50 -03:00
Yuri Kunde Schlesner cc19a76656 Merge pull request #1087 from yuriks/opengl-glad
Replace the previous OpenGL loader with a glad-generated 3.3 one
2015-09-03 15:07:01 -03:00
bunnei 918ca40c68 Merge pull request #1088 from aroulin/x64-emitter-abi-call
x64: Proper stack alignment in shader JIT function calls
2015-09-02 08:46:58 -04:00
aroulin ba998b85a1 video_core: Fix format specifiers warnings 2015-09-02 08:20:00 +02:00
aroulin 179ad35c2e x64: Proper stack alignment in shader JIT function calls
Import Dolphin stack handling and register saving routines
Also removes the x86 parts from abi files
2015-09-01 23:39:52 +02:00
Tony Wasserka 071510b367 Merge pull request #1092 from Subv/vertex_offset
Pica: Add the vertex_offset register to the Pica registers map.
2015-08-31 18:17:59 +02:00
Subv 58a04c0776 Pica: Added the primitive_restart register (0x25f) to the registers map. 2015-08-31 09:14:18 -05:00
Subv 149ea561a6 Pica: Add the vertex_offset register to the Pica registers map. 2015-08-31 07:02:30 -05:00
aroulin 84959be150 Shader JIT: Fix SGE/SGEI NaN behavior
SGE was incorrectly emulated w.r.t. NaN behavior because the CMPSS SSE
instruction was used with the NLT predicate.
2015-08-31 08:16:15 +02:00
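Note on the SGE/SGEI fix above: a "not less than" comparison and a "greater or equal" comparison agree on all ordered inputs and differ only when an operand is NaN. The sketch below shows that difference in plain C++. The function names are hypothetical and this is not the JIT code itself; it only illustrates why the choice of predicate matters.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar model of "set on greater or equal".
// Variant A mirrors a "not less than" predicate: true when the operands
// are unordered (either one is NaN), because (a < b) is false for NaN.
float SgeNotLessThan(float a, float b) {
    return !(a < b) ? 1.0f : 0.0f;
}

// Variant B is a direct ">=" test: false when either operand is NaN.
float SgeGreaterEqual(float a, float b) {
    return (a >= b) ? 1.0f : 0.0f;
}
```

For ordinary values both variants return the same result; with a NaN operand variant A yields 1.0 and variant B yields 0.0.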
bunnei e77dc4e9d2 Merge pull request #1059 from Subv/vertex_offset
GPU: Implemented register 0x22A PICA_REG_DRAW_VERTEX_OFFSET
2015-08-30 17:12:33 -04:00
Subv 12a11472f1 GPU: Implemented register 0x22A.
This is the equivalent of the "first" parameter in glDrawArrays: it tells the GPU the vertex index at which to start rendering.

Register 0x22A doesn't affect indexed rendering.
2015-08-30 15:46:22 -05:00
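Note on register 0x22A above: the described behavior can be sketched as follows. This is a minimal illustration, assuming a draw that simply visits vertices in order. The helper names are hypothetical and do not come from the emulator source.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: vertex indices visited by a non-indexed draw.
// vertex_offset plays the role of the "first" parameter of glDrawArrays.
std::vector<std::uint32_t> NonIndexedVertexOrder(std::uint32_t vertex_offset,
                                                 std::uint32_t num_vertices) {
    std::vector<std::uint32_t> order;
    order.reserve(num_vertices);
    for (std::uint32_t i = 0; i < num_vertices; ++i) {
        order.push_back(vertex_offset + i);
    }
    return order;
}

// Hypothetical helper: an indexed draw uses the index buffer as-is,
// so the offset has no effect, as the commit message states.
std::vector<std::uint32_t> IndexedVertexOrder(const std::vector<std::uint32_t>& indices,
                                              std::uint32_t /*vertex_offset*/) {
    return indices;
}
```

With an offset of 3 and a count of 4, the non-indexed draw visits vertices 3, 4, 5 and 6.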
Yuri Kunde Schlesner a1a5570e97 Replace the previous OpenGL loader with a glad-generated 3.3 one
The main advantage of switching to glad from glLoadGen is that, apart
from being actively maintained, it supports a customizable entrypoint
loader function, which makes it possible to also support OpenGL ES.
2015-08-30 08:45:56 -03:00
bunnei 58e9f78844 Merge pull request #1049 from Subv/stencil
Rasterizer: Corrected the stencil implementation.
2015-08-29 20:06:25 -04:00
Yuri Kunde Schlesner c5a4025b65 Merge pull request #1065 from yuriks/shader-fp
Shader FP compliance fixes
2015-08-27 16:34:13 -07:00
bunnei f3cef178e3 gl_rasterizer_cache: Detect and ignore unnecessary texture flushes. 2015-08-27 19:07:53 -04:00
aroulin f52d8c1a9b Shader JIT: Fix float to integer rounding in MOVA
MOVA converts new address register values from floats to integers using truncation.
2015-08-27 15:26:41 +02:00
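Note on the MOVA fix above: truncation and round-to-nearest differ for any value with a fractional part of 0.5 or more. A small sketch, with hypothetical function names, assuming inputs that fit in an int (converting an out-of-range float is undefined behavior in C++):

```cpp
#include <cmath>

// Truncation toward zero, the behavior the commit message describes.
int ConvertTruncate(float value) {
    return static_cast<int>(value);
}

// Round-to-nearest, shown only for contrast.
int ConvertRoundNearest(float value) {
    return static_cast<int>(std::lround(value));
}
```

For example, 1.9 truncates to 1 but rounds to 2, and -1.9 truncates to -1 but rounds to -2.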
archshift dd0e1061ef Shader JIT: ifdef out reference to ifdef'd out shader_map
shader_map was only defined on x86 architectures, but was cleared on shutdown
with no ifdef protection. Ifdef this out so non-x86 architectures can be built.
2015-08-26 22:28:19 +00:00
Yuri Kunde Schlesner 0fcabd2b11 Integrate the MicroProfile profiling library
This brings goodies such as a configurable user interface and
multi-threaded timeline view.
2015-08-24 22:16:28 -03:00
bunnei afd45d1d7f Merge pull request #1063 from Subv/hw_renderer_debug_fb
HWRenderer: Only reload the framebuffer from gpu memory if the hw renderer is in use during a breakpoint
2015-08-24 13:02:44 -04:00
Subv 583d777b1a HWRenderer: Added a workaround for the Intel Windows driver bug that causes glTexSubImage2D to not change the stencil buffer.
Reported here https://communities.intel.com/message/324464
2015-08-24 11:28:28 -05:00
Yuri Kunde Schlesner eff10959de fixup! Shaders: Fix multiplications between 0.0 and inf 2015-08-24 02:10:11 -03:00
Yuri Kunde Schlesner d8ef20c856 Shader JIT: Tiny micro-optimization in DPH 2015-08-24 01:48:37 -03:00
Yuri Kunde Schlesner 630a850d4d Shaders: Fix multiplications between 0.0 and inf
The PICA200 semantics for multiplication are such that when multiplying
inf by exactly 0.0, the result is 0.0, instead of NaN, as defined by
IEEE. This is relied upon by games.

Fixes #1024 (missing OoT interface items)
2015-08-24 01:48:15 -03:00
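Note on the multiplication fix above: the rule can be sketched in scalar C++ as below. The function name is hypothetical, and returning positive zero is an assumption, since the commit message does not say which sign the result carries. The sketch relies on the fact that the product of two non-NaN floats is NaN only in the 0 times infinity case.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar model of the described multiply:
// 0.0 * inf yields 0.0 instead of the IEEE result NaN.
float PicaMul(float a, float b) {
    const float result = a * b;
    // If neither input is NaN but the product is, the inputs must have
    // been zero and infinity (in either order).
    if (std::isnan(result) && !std::isnan(a) && !std::isnan(b)) {
        return 0.0f;  // assumption: sign of zero is not specified by the source
    }
    return result;
}
```

All other inputs, including NaN operands, follow the ordinary IEEE result in this sketch.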
Yuri Kunde Schlesner 082b74fa24 Shaders: Explicitly conform to PICA semantics in MAX/MIN 2015-08-24 01:46:58 -03:00
Yuri Kunde Schlesner 76247170df Shader JIT: Add name to second scratch register (XMM4) 2015-08-24 01:46:10 -03:00
Lioncash fa5076eb9b shader_jit: Replace two MDisp usages with MatR 2015-08-24 00:39:50 -04:00
Yuri Kunde Schlesner 455147ee95 Shader JIT: Fix CMP NaN behavior to match hardware 2015-08-24 01:29:40 -03:00
bunnei 83c214f6d8 Merge pull request #1062 from aroulin/shader-rcp-rsq
Shader: RCP and RSQ computes only the 1st component
2015-08-23 17:56:35 -04:00
Subv d1b9383d86 HWRenderer: Only reload the framebuffer from gpu memory if the hw renderer is in use during a breakpoint. 2015-08-23 15:26:17 -05:00
aroulin 03c5cfead4 Shader: Use std::sqrt for float instead of sqrt 2015-08-23 22:03:07 +02:00
aroulin fa552f11ef Shader: RCP and RSQ computes only the 1st component 2015-08-23 22:01:17 +02:00
aroulin 2f1514b904 Shader: implement DPH/DPHI in JIT 2015-08-22 11:09:53 +02:00
aroulin 2e7cf2f6cf Shader: implement DPH/DPHI in interpreter
Tests revealed that the component with w=1 is
SRC1 and not SRC2; this is now fixed on 3dbrew.
2015-08-22 11:09:53 +02:00
Subv 0c7da9b815 HWRasterizer: Implemented stencil ops 6 and 7. 2015-08-21 11:05:56 -05:00
Subv 7c1f84a92b SWRasterizer: Implemented stencil ops 6 and 7.
IncrementWrap and DecrementWrap, verified with hwtests.
2015-08-21 11:01:42 -05:00
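Note on IncrementWrap and DecrementWrap above: unlike saturating increment and decrement, the wrapping variants roll over at the ends of the value range. A minimal sketch, assuming an 8-bit stencil value (the bit width is an assumption, and the function names are hypothetical):

```cpp
#include <cstdint>

// Wrapping increment: 255 rolls over to 0.
std::uint8_t StencilIncrementWrap(std::uint8_t value) {
    return static_cast<std::uint8_t>(value + 1);
}

// Wrapping decrement: 0 rolls over to 255.
std::uint8_t StencilDecrementWrap(std::uint8_t value) {
    return static_cast<std::uint8_t>(value - 1);
}

// Saturating increment, shown only for contrast: clamps at 255.
std::uint8_t StencilIncrementSaturate(std::uint8_t value) {
    return value == 255 ? value : static_cast<std::uint8_t>(value + 1);
}
```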
Subv e43eb130d4 HWRasterizer: Implemented stencil op 1 (GL_ZERO) 2015-08-21 10:59:49 -05:00
Subv fef1462371 SWRasterizer: Implemented stencil action 1 (GL_ZERO).
Verified with hwtests.
2015-08-21 10:35:25 -05:00
Subv b3e530d005 SWRasterizer: Removed a todo. Verified with hwtests. 2015-08-21 10:09:15 -05:00
Subv 8e6336d96b SWRenderer: The stencil depth_pass action is executed even if depth testing is disabled.
The HW renderer already did this.
2015-08-21 09:48:43 -05:00
Subv e74825e3d0 Rasterizer: Abstract duplicated stencil code into a lambda. 2015-08-21 09:45:36 -05:00
Subv 46f660a789 GLRasterizer: Implemented stencil testing in the hw renderer. 2015-08-20 10:11:09 -05:00
Subv 186873420f GPU/Rasterizer: Corrected the stencil implementation.
Verified the behavior with hardware tests.
2015-08-20 10:10:35 -05:00
aroulin f3e8f42718 Shader: implement SGE, SGEI and SLT in JIT 2015-08-19 14:29:39 +02:00
aroulin 863730f6a7 Shader: implement SGE, SGEI in interpreter 2015-08-19 14:29:39 +02:00
bunnei 3c5ff418ca Merge pull request #1047 from aroulin/shader-ex2-lg2
Shader: Save caller-saved registers in JIT before a CALL
2015-08-18 22:02:25 -04:00
aroulin 2f9eb98f03 Shader: Save caller-saved registers in JIT before a CALL 2015-08-19 03:40:07 +02:00
bunnei 026379ed55 Merge pull request #1037 from aroulin/shader-ex2-lg2
Shader: Implement EX2 and LG2 in interpreter/JIT
2015-08-18 19:42:32 -04:00
bunnei 1f18c9f8dd Merge pull request #1034 from yuriks/rg8-textures
videocore: Added RG8 texture support
2015-08-16 22:17:12 -04:00
aroulin 7d3a6016d6 Shader: implement EX2 and LG2 in JIT 2015-08-17 01:12:34 +02:00
LittleWhite 9d6748fa94 Fix Linux GCC 4.9 build (complaining about undeclared memset) 2015-08-16 17:21:08 +02:00
aroulin 638e47c04d Shader: implement EX2 and LG2 in interpreter 2015-08-16 15:54:30 +02:00
Tony Wasserka 96820ae42a Build fix for Debug configurations. 2015-08-16 15:14:54 +02:00
Tony Wasserka f5144e6c10 Merge pull request #997 from Lectem/cmdlist_full_debug
citra-qt: Improve pica command list widget (add mask, fix some issues)
2015-08-16 13:34:45 +02:00
Tony Wasserka 33ba604fd9 Introduce a shader tracer to allow inspection of input/output values for each processed instruction. 2015-08-16 14:12:11 +02:00
Tony Wasserka 2e3601f415 Pica/DebugUtils: Include uniform information into shader dumps. 2015-08-16 13:22:01 +02:00
Tony Wasserka 4cb302c8ae citra-qt: Improve shader debugger.
Now supports dumping the current shader and recognizes a larger number of output semantics.
2015-08-16 13:22:00 +02:00
Patrick Martin 5b65d95310 videocore: Added RG8 texture support 2015-08-16 02:21:50 -03:00
bunnei db97090cad Shader: Use a POD struct for registers. 2015-08-15 18:03:27 -04:00
bunnei b39c053785 Rename ARCHITECTURE_X64 definition to ARCHITECTURE_x86_64. 2015-08-15 18:03:27 -04:00
bunnei 0ee00861f6 Common: Cleanup CPU capability detection code. 2015-08-15 18:03:26 -04:00
bunnei a1942238f5 Common: Move cpu_detect to x64 directory. 2015-08-15 18:03:26 -04:00
bunnei bd7e691f78 x64: Refactor to remove fake interfaces and general cleanups. 2015-08-15 18:03:25 -04:00
bunnei cfb354f11f JIT: Support negative address offsets. 2015-08-15 18:01:22 -04:00
bunnei 094ae6fadb Shader: Initial implementation of x86_x64 JIT compiler for Pica vertex shaders.
- Config: Add an option for selecting to use shader JIT or interpreter.
- Qt: Add a menu option for enabling/disabling the shader JIT.
2015-08-15 18:01:07 -04:00
bunnei d67e2f78b7 Common: Added MurmurHash3 hash function for general-purpose use. 2015-08-15 17:33:46 -04:00
bunnei 3f69c2039d Shader: Define a common interface for running vertex shader programs. 2015-08-15 17:33:44 -04:00
bunnei 18527b9e21 Shader: Move shader code to its own subdirectory, "shader". 2015-08-15 17:33:42 -04:00
bunnei 642b9b5030 GPU: Refactor "VertexShader" namespace to "Shader".
- Also renames "vertex_shader.*" to "shader_interpreter.*"
2015-08-15 17:33:41 -04:00
bunnei 35f3360663 Merge pull request #893 from linkmauve/remove-uint._t-int._t
Replace standard uint*_t and int*_t with CommonTypes’ u* and s* types
2015-08-11 17:55:24 -04:00
Emmanuel Gil Peyrot 5115d0177e ARM Core, Video Core, CitraQt, Citrace: Use CommonTypes types instead of the standard u?int*_t types. 2015-08-11 22:38:44 +01:00
Yuri Kunde Schlesner 254582aa35 OpenGL: Fix state tracking in situations with reused object handles
If an OpenGL object is created, bound to a binding using the state
tracker, and then destroyed, a newly created object can be assigned the
same numeric handle by OpenGL. However, even though it is a new object,
and thus needs to be bound to the binding again, the state tracker
compared the current and previous handles and concluded that no change
needed to be made, leading to failure to bind objects in certain cases.

This manifested as broken text in VVVVVV, which this commit fixes along
with similar texturing problems in other games.
2015-08-06 00:59:37 -03:00
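Note on the state tracking fix above: the failure mode can be reproduced without any OpenGL calls. The sketch below is a simplified model, not the emulator's state tracker. The type, its members and the NotifyDeleted hook are hypothetical, and resetting the cached handle on deletion is one possible fix rather than necessarily the one the commit uses.

```cpp
#include <cstdint>

using Handle = std::uint32_t;

// Simplified model of a redundant-bind filter for one binding point.
struct BindingTracker {
    Handle cached = 0;     // handle the tracker believes is currently bound
    int driver_binds = 0;  // stand-in for real calls such as glBindTexture

    void Bind(Handle handle) {
        if (handle == cached) {
            return;  // looks redundant, so the driver call is skipped
        }
        cached = handle;
        ++driver_binds;
    }

    // Hypothetical hook, called when an object is destroyed. Without it,
    // a new object that reuses the same numeric handle would be skipped
    // by Bind() even though it was never actually bound.
    void NotifyDeleted(Handle handle) {
        if (cached == handle) {
            cached = 0;
        }
    }
};
```

In this model, binding handle 5, deleting it, and binding a new object that also received handle 5 results in two driver binds. If NotifyDeleted is never called, the second bind is silently dropped.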
Yuri Kunde Schlesner ff68db61bc OpenGL: Remove redundant texture.enable_2d field from OpenGLState
All uses of this field where it's false can just set the texture id to 0
instead.
2015-08-05 22:55:22 -03:00
Yuri Kunde Schlesner a96502edd3 Videocore: Implement simple vertex caching
This gives a ~2/3 reduction in the number of vertices that need to be
processed through the vertex loaders and the vertex shader, yielding a
good speedup.
2015-08-04 23:41:47 -03:00
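Note on vertex caching above: the idea is that an indexed draw often references the same vertex several times, so the shader output can be reused instead of recomputed. The sketch below is a generic memoization by index, assuming a per-draw cache. All names are hypothetical, the "shader" is a placeholder, and the real cache may be organized differently.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

struct OutputVertex {
    float x;
};

struct VertexCacheDemo {
    int shader_runs = 0;  // counts how often the expensive path is taken

    // Placeholder for vertex loading plus vertex shader execution.
    OutputVertex RunShader(std::uint16_t index) {
        ++shader_runs;
        return OutputVertex{index * 2.0f};
    }

    // Processes an index list, running the shader once per distinct index.
    std::vector<OutputVertex> Draw(const std::vector<std::uint16_t>& indices) {
        std::unordered_map<std::uint16_t, OutputVertex> cache;
        std::vector<OutputVertex> out;
        out.reserve(indices.size());
        for (std::uint16_t index : indices) {
            auto it = cache.find(index);
            if (it == cache.end()) {
                it = cache.emplace(index, RunShader(index)).first;
            }
            out.push_back(it->second);
        }
        return out;
    }
};
```

For two triangles sharing an edge (indices 0, 1, 2, 2, 1, 3), six vertices are emitted but the shader runs only four times.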
bunnei bb7eb5c574 Merge pull request #1006 from yuriks/fb-commit-profile
OpenGL: Add a profiler category measuring framebuffer readback
2015-07-30 10:39:38 -04:00
bunnei 31c1bb901b Merge pull request #963 from yuriks/gpu-fixes
Misc. GPU vertex loading fixes
2015-07-29 16:45:17 -04:00
Yuri Kunde Schlesner 428154da45 OpenGL: Add a profiler category measuring framebuffer readback 2015-07-28 17:37:46 -03:00