Fernando Sahmkow
90e5694230
VideoCore/Engines: Refactor Engines CallMethod.
2020-04-27 21:47:58 -04:00
ReinUsesLisp
bb1ed66d99
maxwell_3d: Fix depth clamping register
...
Using deko3d as reference:
4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)
We were using bits 3 and 4 to determine depth clamping, but these have
the same value whether clamping is enabled or disabled:
state->depthClampEnable ? 0x101A : 0x181D
The same happens on Nvidia's OpenGL driver, where they do something like
this (default capabilities, GL 4.5 compatibility):
(state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c
There's always a difference between the first bits in this register, but
bit 11 is consistently clear when depth clamping is enabled and set when
it is disabled, on both deko3d/NVN and OpenGL. This commit changes yuzu's
behaviour to use bit 11 to determine depth clamping.
- Fixes depth issues on Super Mario Odyssey's intro.
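The register values quoted above can be checked mechanically. The following is a minimal sketch, not yuzu's actual code: the helper names are hypothetical, and the only inputs are the four constants from the commit message.

```cpp
#include <cstdint>

// Hypothetical helper (not yuzu's real register decoding): depth clamping is
// read from bit 11 of the raw register value, as the commit message describes.
constexpr bool IsDepthClampEnabled(std::uint32_t reg) {
    // Bit 11 is clear when clamping is enabled and set when it is disabled.
    return (reg & (1u << 11)) == 0;
}

// Extracts bits 3 and 4, which the old code relied on.
constexpr std::uint32_t Bits3And4(std::uint32_t reg) {
    return (reg >> 3) & 0x3u;
}

static_assert(IsDepthClampEnabled(0x101A), "deko3d, clamp enabled");
static_assert(!IsDepthClampEnabled(0x181D), "deko3d, clamp disabled");
static_assert(IsDepthClampEnabled(0x201A), "Nvidia OpenGL, clamp enabled");
static_assert(!IsDepthClampEnabled(0x281C), "Nvidia OpenGL, clamp disabled");
// Bits 3 and 4 do not change between the two states, so they carry no information.
static_assert(Bits3And4(0x101A) == Bits3And4(0x181D), "bits 3-4 identical");
static_assert(Bits3And4(0x201A) == Bits3And4(0x281C), "bits 3-4 identical");
```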
2020-04-27 20:50:14 -03:00
bunnei
6c7d8073be
Merge pull request #3742 from FernandoS27/command-list
...
Optimize GPU Command Lists and Introduce Fast GPU Time Option
2020-04-27 00:18:46 -04:00
Rodrigo Locatti
7e38dd580f
Merge pull request #3753 from ReinUsesLisp/ac-vulkan
...
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-26 01:55:43 -03:00
ReinUsesLisp
c788f9c0bd
shader/arithmetic_integer: Implement IADD.X
...
IADD.X takes the carry flag and adds it to the result. This is generally
used to emulate 64-bit operations with 32-bit registers.
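As an illustration of the 64-bit emulation mentioned above, here is a minimal sketch in plain C++. The `AddWithCarry` helper and `AddResult` type are hypothetical models of a 32-bit add with carry-in and carry-out. They are not shader IR code from yuzu.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit addition that reports its carry-out.
struct AddResult {
    std::uint32_t value;
    bool carry;
};

// Adds a, b and an optional carry-in, returning the low 32 bits and the carry-out.
inline AddResult AddWithCarry(std::uint32_t a, std::uint32_t b, bool carry_in) {
    const std::uint64_t wide = std::uint64_t{a} + b + (carry_in ? 1u : 0u);
    return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// 64-bit addition built from two 32-bit additions: the second one consumes the
// carry produced by the first, which is the role IADD.X plays.
inline std::uint64_t Add64(std::uint64_t x, std::uint64_t y) {
    const AddResult lo = AddWithCarry(static_cast<std::uint32_t>(x),
                                      static_cast<std::uint32_t>(y), false);
    const AddResult hi = AddWithCarry(static_cast<std::uint32_t>(x >> 32),
                                      static_cast<std::uint32_t>(y >> 32), lo.carry);
    return (std::uint64_t{hi.value} << 32) | lo.value;
}
```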
2020-04-25 22:56:11 -03:00
bunnei
4e37825dab
Merge pull request #3734 from ReinUsesLisp/half-float-mods
...
decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
2020-04-25 00:41:43 -04:00
Markus Wick
e717a1df20
Fix -Wdeprecated-copy warning.
2020-04-24 09:33:04 +02:00
ReinUsesLisp
dbaebd8582
decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
...
The encoding for negation and absolute value was wrong.
Extracting is now done manually. Similar instructions having different
encodings is the rule, not the exception. To keep sanity and readability
I preferred to extract the desired bit manually.
This is implemented against nxas:
8dbc389957/table.h (L68)
That is itself tested against nvdisasm (Nvidia's official disassembler).
2020-04-23 18:29:38 -03:00
Fernando Sahmkow
5c9feaebb6
Clang Format.
2020-04-23 08:52:58 -04:00
Fernando Sahmkow
18a88d19dc
Maxwell3D: Process Macros on MultiMethod.
2020-04-23 08:52:56 -04:00
Fernando Sahmkow
3fedcc2f6e
DMAPusher: Propagate multimethod writes into the engines.
2020-04-23 08:52:55 -04:00
bunnei
2409fedacf
Merge pull request #3697 from lioncash/declarations
...
CMakeLists: Enable -Wmissing-declarations on Linux builds
2020-04-23 02:18:52 -04:00
Fernando Sahmkow
1b3be8a8f8
MaxwellDMA: Correct copying on accuracy level.
2020-04-22 11:36:25 -04:00
Fernando Sahmkow
b7bc3c2549
FenceManager: Manage syncpoints and rename fences to semaphores.
2020-04-22 11:36:16 -04:00
Fernando Sahmkow
4adfc9bb08
Rasterizer: Document SignalFence & ReleaseFences and setup skeletons on Vulkan.
2020-04-22 11:36:14 -04:00
Fernando Sahmkow
a081a7c855
GPU: Fix rebase errors.
2020-04-22 11:36:13 -04:00
Fernando Sahmkow
487379c593
OpenGL: Implement Fencing backend.
2020-04-22 11:36:10 -04:00
Fernando Sahmkow
339d0d9d6c
GPU: Delay Fences.
2020-04-22 11:36:08 -04:00
Fernando Sahmkow
da8f17715d
GPU: Refactor synchronization on Async GPU
2020-04-22 11:36:06 -04:00
Fernando Sahmkow
084ceb925a
UI: Replace accurate GPU option for GPU Accuracy Level
2020-04-22 11:36:04 -04:00
ReinUsesLisp
0bbae63300
gl_rasterizer: Fix buffers without size
...
On NVN buffers can be enabled but have no size. According to deko3d and
the behavior we see in Animal Crossing: New Horizons these buffers get
the special address of 0x1000 and limit themselves to 0xfff.
Implement buffers without a size by binding a null buffer to OpenGL
without a size.
1d1930beea/source/maxwell/gpu_3d_vbo.cpp (L62-L63)
2020-04-21 19:55:44 -03:00
Rodrigo Locatti
f293b15611
Merge pull request #3718 from ReinUsesLisp/better-pipeline-state
...
fixed_pipeline_state: Pack structure, use memcmp and CityHash on it
2020-04-21 18:17:58 -03:00
bunnei
d3e0cefa60
Merge pull request #3695 from ReinUsesLisp/default-attributes
...
maxwell_3d: Initialize format attributes constant as one
2020-04-20 21:40:18 -04:00
ReinUsesLisp
ab6704f20c
fixed_pipeline_state: Pack attribute state
...
Reduce FixedPipelineState's size from 1384 to 664 bytes
2020-04-18 19:21:19 -03:00
Lioncash
e2d8be1ca2
General: Resolve warnings related to missing declarations
2020-04-16 23:43:34 -04:00
ReinUsesLisp
238c6016f9
maxwell_3d: Initialize format attributes constant as one
...
nouveau expects this to be true but it doesn't set it.
2020-04-16 21:15:07 -03:00
Lioncash
1c340c6efa
CMakeLists: Specify -Wextra on linux builds
...
Allows reporting more cases where logic errors may exist, such as
implicit fallthrough cases, etc.
We currently ignore unused parameters, since we currently have many
cases where this is intentional (virtual interfaces).
While we're at it, we can also tidy up any existing code that causes
warnings. This uncovered a few bugs as well.
2020-04-15 21:33:46 -04:00
Fernando Sahmkow
e33196d4e7
Merge pull request #3612 from ReinUsesLisp/red
...
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 15:03:49 -04:00
Mat M
64b5985f0a
Merge pull request #3662 from ReinUsesLisp/constant-attrs
...
gl_rasterizer: Implement constant vertex attributes
2020-04-15 11:54:50 -04:00
ReinUsesLisp
fefe7f18f9
shader/arithmetic: Add FCMP_CR variant
...
Adds another variant of FCMP.
2020-04-14 19:11:04 -03:00
ReinUsesLisp
6dfcabc800
gl_rasterizer: Implement constant vertex attributes
...
Credits go to gdkchan from Ryujinx for finding that constant attributes
are used in retail games.
2020-04-14 17:58:53 -03:00
ReinUsesLisp
76615b9f34
gl_rasterizer: Implement line widths and smooth lines
...
Implements "legacy" features from OpenGL present on hardware such as
smooth lines and line width.
2020-04-13 01:30:34 -03:00
Fernando Sahmkow
3d91dbb21d
Merge pull request #3578 from ReinUsesLisp/vmnmx
...
shader/video: Partially implement VMNMX
2020-04-12 10:44:03 -04:00
ReinUsesLisp
76f178ba6e
shader/video: Partially implement VMNMX
...
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.
VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), having in mind the input sign. This result can then be
saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.
This instruction allows a more flexible implementation of operations
such as GCN's min3 and max3 instructions.
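The two-stage flow described above can be sketched as follows. This is a simplified model under stated assumptions: signed 32-bit inputs only, no saturation, and only the min/max second pass. `VideoMinMax`, `SecondOp`, `Max3` and `Min3` are hypothetical names, not identifiers from the decoder.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical selector for the second stage. Merges and accumulations,
// which the commit message also mentions, are left out of this sketch.
enum class SecondOp { Min, Max };

// Stage one: min or max of Ra and Rb. Stage two: another min/max pass with Rc.
inline std::int32_t VideoMinMax(std::int32_t ra, std::int32_t rb, std::int32_t rc,
                                bool is_max, SecondOp op) {
    const std::int32_t first = is_max ? std::max(ra, rb) : std::min(ra, rb);
    return op == SecondOp::Max ? std::max(first, rc) : std::min(first, rc);
}

// max3 and min3 fall out as special cases of the same two-stage operation.
inline std::int32_t Max3(std::int32_t a, std::int32_t b, std::int32_t c) {
    return VideoMinMax(a, b, c, true, SecondOp::Max);
}

inline std::int32_t Min3(std::int32_t a, std::int32_t b, std::int32_t c) {
    return VideoMinMax(a, b, c, false, SecondOp::Min);
}
```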
2020-04-12 00:34:42 -03:00
ReinUsesLisp
a7baf6fee4
video_core: Add MSAA registers in 3D engine and TIC
...
This adds the registers used for multisampling. It doesn't implement
anything for now.
2020-04-12 00:21:27 -03:00
bunnei
b96fd0bd0e
Merge pull request #3601 from ReinUsesLisp/some-shader-encodings
...
video_core/shader: Add some instruction and S2R encodings
2020-04-09 00:17:39 -04:00
ReinUsesLisp
3185245845
shader/memory: Implement RED.E.ADD
...
Implements a reduction operation. It's an atomic operation that doesn't
return a value.
This commit introduces another primitive because some shading languages
might have a primitive for reduction operations.
2020-04-06 02:24:47 -03:00
ReinUsesLisp
8b719e9e1d
shader_bytecode: Rename MOV_SYS to S2R
2020-04-04 03:37:51 -03:00
ReinUsesLisp
9d15feb892
shader_bytecode: Add encoding for BAR
2020-04-04 03:36:21 -03:00
ReinUsesLisp
c02a2dc24a
shader_bytecode: Add encoding for VOTE.VTG
2020-04-04 03:28:11 -03:00
ReinUsesLisp
2339fe199f
shader_decompiler: Remove FragCoord.w hack and change IPA implementation
...
Credits go to gdkchan and Ryujinx. The pull request used for this can
be found here: https://github.com/Ryujinx/Ryujinx/pull/1082
yuzu was already using the header for interpolation, but it was missing
the FragCoord.w multiplication described in the linked pull request.
This commit finally removes the FragCoord.w == 1.0f hack from the shader
decompiler.
While we are at it, this commit renames some enumerations to match
Nvidia's documentation (linked below) and fixes component declaration
order in the shader program header (z and w were swapped).
https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
2020-04-01 21:48:55 -03:00
namkazy
c8f6d9effd
shader_decode: merge GlobalAtomicOp to AtomicOp
2020-03-30 18:47:00 +07:00
ReinUsesLisp
08470d261d
shader_bytecode: Fix I2I_IMM encoding
2020-03-28 18:49:07 -03:00
ReinUsesLisp
cedbe925cd
engines/const_buffer_engine_interface: Store image format type
...
This information is required to properly implement SULD.B. It might also
be handy for all image operations, since it would allow us to implement
them on devices that require the image format to be specified (on
desktop, this would be AMD on OpenGL and Intel on OpenGL and Vulkan).
2020-03-27 00:36:22 -03:00
bunnei
e6aff11057
Merge pull request #3520 from ReinUsesLisp/legacy-varyings
...
gl_shader_decompiler: Implement legacy varyings
2020-03-25 19:27:51 -04:00
namkazy
fc37672f26
apply replay logic to all writes. remove replay from MacroInterpreter::Send (@fincs)
2020-03-22 22:25:44 +07:00
namkazy
f66743cd0c
maxwell_3d: change declaration order
2020-03-22 13:41:16 +07:00
namkazy
d4e93cf38c
maxwell_3d: init shadow_state
2020-03-22 13:35:11 +07:00
namkazy
22f4268c2f
maxwell_3d: this seem more correct.
2020-03-22 12:02:54 +07:00
namkazy
7051dc1902
maxwell_3d: update comments for shadow ram usage
2020-03-22 11:35:26 +07:00
Nguyen Dac Nam
63c2635e6f
maxwell_3d: track shadow ram ctrl and hw reg value
2020-03-22 10:53:41 +07:00
Nguyen Dac Nam
dbfbe352e0
maxwell_3d: implement MME shadow RAM
2020-03-22 10:53:35 +07:00
ReinUsesLisp
9f46066bda
kepler_compute: Remove unused variables
2020-03-18 20:03:19 -03:00
Rodrigo Locatti
ddafc99776
Merge pull request #3502 from namkazt/patch-3
...
shader_decode: Reimplement BFE instructions
2020-03-15 21:23:04 -03:00
ReinUsesLisp
6442e02c5d
shader/shader_ir: Track usage in input attribute and of legacy varyings
2020-03-15 21:01:52 -03:00
ReinUsesLisp
afebdda203
maxwell_3d: Add padding words to XFB entries
...
Use INSERT_UNION_PADDING_WORDS instead of alignas to ensure a size
requirement.
2020-03-13 18:33:05 -03:00
ReinUsesLisp
8e9f23f393
gl_rasterizer: Implement transform feedback bindings
2020-03-13 18:33:04 -03:00
Rodrigo Locatti
244fe13219
Merge branch 'master' into shader-purge
2020-03-13 16:44:06 -03:00
Nguyen Dac Nam
93547cac68
shader_bytecode: update BFE instructions struct.
2020-03-13 12:52:16 +07:00
ReinUsesLisp
e4bc3c3342
gl_rasterizer: Implement polygon modes and fill rectangles
2020-03-09 20:39:58 -03:00
ReinUsesLisp
eb5861e0a2
engines/maxwell_3d: Add TFB registers and store them in shader registry
2020-03-09 18:40:53 -03:00
ReinUsesLisp
978172530e
const_buffer_engine_interface: Store component types
...
This is required for Vulkan. Sampling integer textures with float
handles is illegal.
2020-03-09 18:40:53 -03:00
ReinUsesLisp
042256c6bb
state_tracker: Remove type traits with named structures
2020-02-28 17:56:43 -03:00
ReinUsesLisp
15cadc3948
maxwell_3d: Use two tables instead of three for dirty flags
2020-02-28 17:56:42 -03:00
ReinUsesLisp
9b08698a0c
maxwell_3d: Change write dirty flags to a bitset
2020-02-28 17:56:42 -03:00
ReinUsesLisp
9e74e6988b
maxwell_3d: Flatten cull and front face registers
2020-02-28 17:56:41 -03:00
ReinUsesLisp
eed789d0d1
video_core: Reintroduce dirty flags infrastructure
2020-02-28 17:56:41 -03:00
ReinUsesLisp
1eee891f6e
gl_state: Remove clip distances tracking
2020-02-28 17:26:26 -03:00
ReinUsesLisp
d3e433a380
gl_state: Remove viewport and depth range tracking
2020-02-28 17:25:18 -03:00
ReinUsesLisp
96ac3d518a
gl_rasterizer: Remove dirty flags
2020-02-28 16:39:27 -03:00
bunnei
e22ad52cdb
Merge pull request #3425 from ReinUsesLisp/layered-framebuffer
...
texture_cache: Implement layered framebuffer attachments
2020-02-24 10:14:50 -05:00
bunnei
b2bc7682b4
Merge pull request #3414 from ReinUsesLisp/maxwell-3d-draw
...
maxwell_3d: Unify draw methods
2020-02-19 16:13:50 -05:00
Fernando Sahmkow
93acfbd3a5
Merge pull request #3409 from ReinUsesLisp/host-queries
...
query_cache: Implement a query cache and query 21 (samples passed)
2020-02-18 11:31:06 -04:00
ReinUsesLisp
6a0220b2e1
texture_cache: Implement layered framebuffer attachments
...
Layered framebuffer attachments are a feature that allows applications to
attach layered textures to a single attachment. What layer the
fragments are written to is decided from the shader using gl_Layer.
2020-02-16 04:19:32 -03:00
ReinUsesLisp
91aa58e410
maxwell_3d: Unify draw methods
...
Pass instanced state of a draw invocation as an argument instead of
having two separate virtual methods.
2020-02-14 18:09:40 -03:00
ReinUsesLisp
73d2d3342d
gl_query_cache: Optimize query cache
...
Use a custom cache instead of relying on a ranged cache.
2020-02-14 17:38:27 -03:00
ReinUsesLisp
aae8c180cb
gl_query_cache: Implement host queries using a deferred cache
...
Instead of waiting immediately for executed commands, defer the query
until the guest CPU reads it. This way we get closer to what the guest
program is doing.
To achieve this we have to build a dependency queue, because host APIs
(like OpenGL and Vulkan) use ranged queries instead of counters like
NVN.
Waiting for queries implicitly uses fences and this requires a command
being queued, otherwise the driver will lock waiting until a timeout. To
fix this when there are no commands queued, we explicitly call glFlush.
2020-02-14 17:33:13 -03:00
ReinUsesLisp
2b58652f08
maxwell_3d: Slow implementation of passed samples (query 21)
...
Implements GL_SAMPLES_PASSED by waiting immediately for queries.
2020-02-14 17:27:17 -03:00
bunnei
63a59b9935
Merge pull request #3379 from ReinUsesLisp/cbuf-offset
...
shader/decode: Fix constant buffer offsets
2020-02-14 13:22:53 -05:00
bunnei
3563af2364
Merge pull request #3395 from FernandoS27/queries
...
GPU: Refactor queries implementation and correct GPU Clock.
2020-02-13 20:18:26 -05:00
Fernando Sahmkow
d6ed31b9fa
GPU: Address Feedback.
2020-02-13 18:16:07 -04:00
bunnei
37f1cf8cbd
Merge pull request #3376 from ReinUsesLisp/point-sprite
...
gl_rasterizer: Implement GL_POINT_SPRITE
2020-02-11 08:26:07 -05:00
Fernando Sahmkow
8e9a4944db
GPU: Implement GPU Clock correctly.
2020-02-10 10:44:54 -04:00
Fernando Sahmkow
0cb3bcfbb7
Maxwell3D: Correct query reporting.
2020-02-10 10:41:43 -04:00
bunnei
84ea9c2b42
Merge pull request #3372 from ReinUsesLisp/fix-back-stencil
...
maxwell_3d: Fix stencil back mask
2020-02-09 22:29:28 -05:00
bunnei
90df4b8e2b
Merge pull request #3369 from ReinUsesLisp/shf
...
shader/shift: Implement SHF
2020-02-07 22:06:57 -05:00
ReinUsesLisp
bf9a822b87
shader/decode: Fix constant buffer offsets
...
Some instances were using cbuf34.offset instead of cbuf34.GetOffset().
This returned an invalid offset. Address those instances and rename
offset to "shifted_offset" to avoid future bugs.
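The class of bug described above can be illustrated with a small stand-in structure. This is a sketch only: `ConstBufferField` is hypothetical, and the scale factor of 4 (offsets stored in 32-bit words) is an assumption made for the example.

```cpp
#include <cstdint>

// Hypothetical stand-in for the cbuf34 encoding. The raw field is named so
// that reading it directly looks wrong at the call site.
struct ConstBufferField {
    // Raw encoded value. Assumed here to be counted in 32-bit words.
    std::uint32_t shifted_offset;

    // Byte offset that callers should use instead of the raw field.
    std::uint32_t GetOffset() const {
        return shifted_offset * 4;
    }
};
```

With a raw value of 3, `GetOffset()` yields 12 under this assumption, while reading the field directly would yield 3.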
2020-02-05 12:19:09 -03:00
bunnei
08c508b1c4
Merge pull request #3357 from ReinUsesLisp/bfi-rc
...
shader/bfi: Implement register-constant buffer variant
2020-02-04 15:14:13 -05:00
ReinUsesLisp
7da52673d0
gl_rasterizer: Implement GL_POINT_SPRITE
...
OpenGL core defaults to GL_POINT_SPRITE, meanwhile on OpenGL
compatibility we have to explicitly enable it. This fixes
gl_PointCoord's behaviour.
2020-02-04 15:19:45 -03:00
bunnei
bf21aacc74
Merge pull request #3356 from ReinUsesLisp/fcmp
...
shader/arithmetic: Implement FCMP
2020-02-04 11:36:59 -05:00
ReinUsesLisp
4eed744277
maxwell_3d: Fix stencil back mask
2020-02-02 17:50:46 -03:00
bunnei
b5bbe7e752
Merge pull request #3282 from FernandoS27/indexed-samplers
...
Partially implement Indexed samplers in general and specific code in GLSL
2020-02-01 20:41:40 -05:00
ReinUsesLisp
017474c3f8
shader/shift: Implement SHF_LEFT_{IMM,R}
...
Shifts a pair of registers to the left and returns the high register.
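A minimal sketch of the operation described above, in plain C++. `FunnelShiftLeft` is a hypothetical name, and masking the shift amount to 63 is an assumption for this example rather than a statement about the instruction's modes.

```cpp
#include <cstdint>

// Treats (high:low) as one 64-bit value, shifts it left and returns the
// upper 32 bits of the result.
inline std::uint32_t FunnelShiftLeft(std::uint32_t low, std::uint32_t high,
                                     std::uint32_t shift) {
    const std::uint64_t pair = (std::uint64_t{high} << 32) | low;
    // Assumption: the shift amount is reduced to the range [0, 63].
    return static_cast<std::uint32_t>((pair << (shift & 63u)) >> 32);
}
```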
2020-02-01 21:19:44 -03:00
ReinUsesLisp
137a8aa55c
shader/bfi: Implement register-constant buffer variant
...
It's the same as the variant that was implemented, but it takes the
operands from another source.
2020-01-27 01:20:38 -03:00
ReinUsesLisp
e3fc3459c8
shader/arithmetic: Implement FCMP
...
Compares the third operand with zero, then selects between the first and
second.
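The selection described above can be sketched as follows. The names are hypothetical, only three comparison conditions are modelled, and the choice that a true comparison selects the first operand is an assumption of this sketch.

```cpp
// Hypothetical subset of comparison conditions.
enum class Condition { LessThan, Equal, GreaterThan };

// Compares a value against zero using the requested condition.
inline bool CompareWithZero(float value, Condition cond) {
    switch (cond) {
    case Condition::LessThan:
        return value < 0.0f;
    case Condition::Equal:
        return value == 0.0f;
    case Condition::GreaterThan:
        return value > 0.0f;
    }
    return false;
}

// Selects between the first and second operand based on the third one.
// Assumption: a true comparison picks the first operand.
inline float Fcmp(float a, float b, float c, Condition cond) {
    return CompareWithZero(c, cond) ? a : b;
}
```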
2020-01-27 01:15:44 -03:00
ReinUsesLisp
d95d4ac843
shader/memory: Implement ATOM.ADD
...
ATOM operates atomically on global memory. For now only add ATOM.ADD
since that's what was found in commercial games.
This asserts for ATOM.ADD.S32 (handling the others as unimplemented),
although ATOM.ADD.U32 shouldn't be any different.
This change forces us to change the default type on SPIR-V storage
buffers from float to uint. We could also alias the buffers, but it's
simpler for now to just use uint. While we are at it, abstract the code
to avoid repetition.
2020-01-26 01:54:24 -03:00
Fernando Sahmkow
b97608ca64
Shader_IR: Allow constant access of guest driver.
2020-01-24 16:43:30 -04:00
Fernando Sahmkow
c921e496eb
GPU: Implement guest driver profile and deduce texture handler sizes.
2020-01-24 16:43:29 -04:00
bunnei
5a077c95ce
Merge pull request #3322 from ReinUsesLisp/vk-front-face
...
vk_graphics_pipeline: Set front facing properly
2020-01-19 23:22:34 -05:00
ReinUsesLisp
94915d4ea1
vk_graphics_pipeline: Set front facing properly
...
Front face was being forced to a certain value when cull face is
disabled. Set a default value on initialization and drop the forcefully
set front facing value with culling disabled.
2020-01-18 18:50:47 -03:00
bunnei
9bf4850f74
Merge pull request #3305 from ReinUsesLisp/point-size-program
...
gl_state: Implement PROGRAM_POINT_SIZE
2020-01-18 01:56:32 -05:00
ReinUsesLisp
63ba41a26d
shader/memory: Implement ATOMS.ADD.U32
2020-01-16 17:30:55 -03:00
Lioncash
9e874898f5
maxwell_3d: Make dirty_pointers private
...
This isn't used outside of the class itself, so we can make it private
for the time being.
2020-01-16 04:07:15 -05:00
ReinUsesLisp
c375d735e6
gl_state: Implement PROGRAM_POINT_SIZE
...
For gl_PointSize to have effect we have to activate
GL_PROGRAM_POINT_SIZE.
2020-01-15 16:14:17 -03:00
ReinUsesLisp
0d6d8129c4
yuzu: Remove Maxwell debugger
...
This was carried from Citra and wasn't really used on yuzu. It also adds
some runtime overhead. This commit removes it from yuzu's codebase.
2020-01-02 23:09:44 -03:00
bunnei
028b2718ed
Merge pull request #3239 from ReinUsesLisp/p2r
...
shader/p2r: Implement P2R Pr
2019-12-31 20:37:16 -05:00
bunnei
8a76f816a4
Merge pull request #3228 from ReinUsesLisp/ptp
...
shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S
2019-12-26 21:43:44 -05:00
Fernando Sahmkow
5619d24377
Merge pull request #3244 from ReinUsesLisp/vk-fps
...
fixed_pipeline_state: Define structure and loaders
2019-12-25 14:31:29 -04:00
bunnei
4af569ee47
Merge pull request #3236 from ReinUsesLisp/rasterize-enable
...
gl_rasterizer: Implement RASTERIZE_ENABLE
2019-12-24 22:54:10 -05:00
ReinUsesLisp
5770418fb3
maxwell_3d: Add depth bounds registers
2019-12-22 22:55:06 -03:00
ReinUsesLisp
cf27b59493
shader/r2p: Refactor R2P to support P2R
2019-12-20 17:55:42 -03:00
ReinUsesLisp
da0aa4da6b
gl_rasterizer: Implement RASTERIZE_ENABLE
...
RASTERIZE_ENABLE is the opposite of GL_RASTERIZER_DISCARD. Implement it
naturally using this.
NVN games expect rasterize to be enabled by default, reflect that in our
initial GPU state.
2019-12-18 19:28:23 -03:00
ReinUsesLisp
8b26b4228b
shader_bytecode: Fix TLD4S encoding
2019-12-17 23:32:10 -03:00
ReinUsesLisp
e09c1fbc1f
shader/texture: Implement TLD4.PTP
2019-12-16 04:09:24 -03:00
Fernando Sahmkow
af89723fa3
Shader_Ir: Correct TLD4S encoding and implement f16 flag.
2019-12-11 19:53:17 -04:00
bunnei
1a66cde175
Merge pull request #3210 from ReinUsesLisp/memory-barrier
...
shader: Implement MEMBAR.GL
2019-12-11 14:24:39 -05:00
Fernando Sahmkow
7ffb672f61
Maxwell3D: Implement Depth Mode.
...
This commit finishes adding depth mode that was reverted before due to
other unresolved issues.
2019-12-10 19:51:46 -04:00
ReinUsesLisp
425a254fa2
shader: Implement MEMBAR.GL
...
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10 16:45:03 -03:00
ReinUsesLisp
6233b1db08
shader_ir/memory: Implement patch stores
2019-12-09 23:25:21 -03:00
ReinUsesLisp
36651f215a
maxwell_3d: Add tessellation tess level registers
2019-12-06 22:08:22 -03:00
ReinUsesLisp
707bf41c6f
maxwell_3d: Add tessellation mode register
2019-12-06 22:07:31 -03:00
ReinUsesLisp
d2b50c5ebd
maxwell_3d: Add patch vertices register
2019-12-06 22:06:53 -03:00
ReinUsesLisp
74f515e8b6
shader_bytecode: Remove corrupted character
2019-12-06 20:31:56 -03:00
bunnei
e36814d6d5
Merge pull request #3109 from FernandoS27/new-instr
...
Implement FLO & TXD Instructions on GPU Shaders
2019-12-06 18:18:16 -05:00
bunnei
b03242067d
Merge pull request #3098 from ReinUsesLisp/shader-invalidations
...
gl_shader_cache: Miscellaneous changes to shaders
2019-11-24 19:36:30 -05:00
bunnei
b7031b2b9d
Merge pull request #3105 from ReinUsesLisp/fix-stencil-reg
...
maxwell_3d: Fix stencil_back_func_mask offset
2019-11-24 13:53:23 -05:00
ReinUsesLisp
180417c514
gl_shader_cache: Remove dynamic BaseBinding specialization
2019-11-22 21:28:49 -03:00
ReinUsesLisp
c8a48aacc0
video_core: Unify ProgramType and ShaderStage into ShaderType
2019-11-22 21:28:48 -03:00
ReinUsesLisp
287ae2b9e8
gl_shader_cache: Specialize local memory size for compute shaders
...
Local memory size in compute shaders was stubbed with an arbitrary size.
This commit specializes local memory size from guest GPU parameters.
2019-11-22 21:28:48 -03:00
ReinUsesLisp
4f5d8e4342
gl_shader_cache: Specialize shader workgroup
...
Drop the usage of ARB_compute_variable_group_size and specialize compute
shaders instead. This permits compute to run on AMD and Intel
proprietary drivers.
2019-11-22 21:28:47 -03:00
ReinUsesLisp
80eacdf89b
texture_cache: Use a table instead of switch for texture formats
...
Use a large flat array to look up texture formats. This allows us to
properly implement formats with different component types. It should
also be faster.
2019-11-14 20:57:10 -03:00
Fernando Sahmkow
cd0f5dfc17
Shader_IR: Implement TXD instruction.
2019-11-14 11:15:27 -04:00
Fernando Sahmkow
f3d1b370aa
Shader_IR: Implement FLO instruction.
2019-11-14 11:15:27 -04:00
Fernando Sahmkow
95137a04e1
Shader_Bytecode: Add encodings for FLO, SHF and TXD
2019-11-14 11:15:26 -04:00
Fernando Sahmkow
b6f6733131
Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles
...
shader: Implement FSWZADD and reimplement SHFL
2019-11-14 10:27:27 -04:00
ReinUsesLisp
7990220df7
maxwell_3d: Fix stencil_back_func_mask offset
...
stencil_back_func_mask and stencil_back_mask were misplaced. This commit
addresses that issue.
2019-11-13 16:35:17 -03:00
ReinUsesLisp
096f339a2a
video_core: Silence implicit conversion warnings
2019-11-08 22:48:50 +00:00
ReinUsesLisp
56e237d1f9
shader_ir/warp: Implement FSWZADD
2019-11-07 20:08:41 -03:00
bunnei
21e07df7b7
Merge pull request #2914 from FernandoS27/fermi-fix
...
Fermi2D: limit blit area to only available area
2019-11-05 20:45:24 -05:00
bunnei
1bdae0fe29
common_func: Use std::array for INSERT_PADDING_* macros.
...
- Zero initialization here is useful for determinism.
2019-11-03 22:22:41 -05:00
Rodrigo Locatti
658489ebf7
Merge pull request #3050 from FernandoS27/fix-tld4
...
shader_ir: Fix TLD4 and add bindless variant
2019-10-30 18:37:17 +00:00
Fernando Sahmkow
9293c3a0f2
Shader_IR: Fix TLD4 and add Bindless Variant.
...
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
ReinUsesLisp
fa31e5b868
maxwell_3d/kepler_compute: Remove unused arguments in GetTexture
2019-10-28 00:23:42 -03:00
ReinUsesLisp
538ddd220e
video_core/textures: Remove unused index entry in FullTextureInfo
2019-10-28 00:14:38 -03:00
ReinUsesLisp
961fe4d19b
maxwell_3d: Remove unused method GetStageTextures
2019-10-28 00:14:29 -03:00
ReinUsesLisp
3e469cecc1
maxwell_3d: Silence implicit conversion warnings
...
While we are at it, unify types for dirty reg pointers.
2019-10-27 15:22:17 -03:00
Fernando Sahmkow
be856a38d6
Shader_IR: Address Feedback.
2019-10-26 15:38:30 -04:00
Fernando Sahmkow
e3afd6595a
Shader_IR: Clang format
2019-10-25 09:01:32 -04:00
ReinUsesLisp
ec85648af3
gl_shader_disk_cache: Store and load fast BRX
2019-10-25 09:01:31 -04:00
Fernando Sahmkow
33fcec3502
Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it
2019-10-25 09:01:30 -04:00
Fernando Sahmkow
1a58f45d76
VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders.
2019-10-25 09:01:29 -04:00
Lioncash
7fdf991097
shader_bytecode: Make Matcher constexpr capable
...
Greatly shrinks the amount of generated code for GetDecodeTable().
Collapses an assembly output of 9000+ lines down to ~3621 with Clang,
and 6513 down to ~2616 with GCC, given it's now allowed to construct all
the entries as a sequence of constant data.
2019-10-24 01:10:10 -04:00
ReinUsesLisp
e3107788e6
maxwell_3d: Reduce FlushMMEInlineDraw logging to Trace
2019-10-20 03:43:17 -03:00
Fernando Sahmkow
c0eb1aecfd
Fermi2D: Use a different formula for delimiting blit areas.
2019-10-17 18:21:01 -04:00
Fernando Sahmkow
57a46c69f1
Fermi2D: limit blit area to only available area
...
Normally OpenGL does not care if the areas exceed the texture regions but
other backends such as Vulkan do care about the limits of these areas.
This PR crops the areas of the blit so that they don't surpass the
limits of the textures. This should help Vulkan and faulty OpenGL
drivers.
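The general idea of cropping a blit area can be sketched as below. This is a deliberately simplified illustration: `Rect` and `CropToExtent` are hypothetical, and the sketch clamps a single rectangle to the texture extent without adjusting the other side of the blit, which the actual change may handle differently.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical rectangle type with exclusive right and bottom edges.
struct Rect {
    std::uint32_t left;
    std::uint32_t top;
    std::uint32_t right;
    std::uint32_t bottom;
};

// Clamps every edge so the rectangle never reaches past the texture.
inline Rect CropToExtent(Rect rect, std::uint32_t width, std::uint32_t height) {
    rect.left = std::min(rect.left, width);
    rect.right = std::min(rect.right, width);
    rect.top = std::min(rect.top, height);
    rect.bottom = std::min(rect.bottom, height);
    return rect;
}
```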
2019-10-17 10:38:44 -04:00
Lioncash
c9c75f9587
maxwell_3d: Silence truncation warnings
...
A trivial warning caused by using u32 instead of size_t for the argument
types.
2019-10-15 17:51:35 -04:00
ReinUsesLisp
fe7f20e659
maxwell_3d: Add dirty flags for depth bounds values
...
This is useful in Vulkan where we want to update depth bounds without
caring if it's enabled or disabled through vkCmdSetDepthBounds.
2019-10-05 04:07:47 +00:00
bunnei
376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
...
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
David
9d69206cd0
Merge pull request #2870 from FernandoS27/multi-draw
...
Implement a MME Draw commands Inliner and correct host instance drawing
2019-09-22 23:13:02 +10:00
Fernando Sahmkow
68f5aff64f
Maxwell3D: Corrections and refactors to MME instance refactor
2019-09-22 07:23:13 -04:00
FearlessTobi
01fc969a5f
Fix clang-format
2019-09-22 02:21:56 +02:00
FearlessTobi
366e900376
fermi_2d: Lower surface copy log severity to DEBUG
2019-09-22 02:18:57 +02:00
Rodrigo Locatti
9286976948
Merge pull request #2878 from FernandoS27/icmp
...
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
ReinUsesLisp
44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
...
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp
675f23aedc
shader/image: Implement SULD and remove irrelevant code
...
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
ReinUsesLisp
4de0f1e1c8
shader_bytecode: Add SULD encoding
2019-09-21 17:31:46 -03:00
Fernando Sahmkow
527b841c15
Shader_IR: ICMP corrections and fixes
2019-09-21 14:28:03 -04:00
David Marcec
01a4afee42
Mark DrawArrays as LOG_TRACE
...
There's no reason to clog logs with DrawArray.
2019-09-21 15:43:58 +10:00
Fernando Sahmkow
4b81d19a1a
Shader_IR: Implement ICMP.
2019-09-19 20:56:29 -04:00
Fernando Sahmkow
7761e44d18
Rasterizer: Refactor and simplify DrawBatch Interface.
2019-09-19 11:41:33 -04:00
Fernando Sahmkow
7606da5611
VideoCore: Corrections to the MME Inliner and removal of hacky instance management.
2019-09-19 11:41:29 -04:00
Fernando Sahmkow
ba02d564f8
Video Core: initial Implementation of InstanceDraw Packaging
2019-09-19 11:41:27 -04:00
ReinUsesLisp
0526bf1895
shader_ir/warp: Implement SHFL
2019-09-17 17:44:07 -03:00
Fernando Sahmkow
393cc3ef2f
Merge pull request #2851 from ReinUsesLisp/srgb
...
renderer_opengl: Fix sRGB blits
2019-09-15 10:38:10 -04:00
Fernando Sahmkow
b8b1747704
Merge pull request #2824 from ReinUsesLisp/mme
...
Revert "Revert #2466" and stub FirmwareCall 4
2019-09-15 06:17:04 -04:00
Rodrigo Locatti
193bfefce4
maxwell_3d: Update firmware 4 call stub commentary
2019-09-14 22:51:18 -03:00
ReinUsesLisp
36abf67e79
shader/image: Implement SUATOM and fix SUST
2019-09-10 20:22:31 -03:00
ReinUsesLisp
78574746bd
renderer_opengl: Fix sRGB blits
...
Removes the sRGB hack that tracked whether a frame used an sRGB render
target at least once in order to blit the final texture as sRGB. Instead,
apply sRGB if the presented image is sRGB.
Also enable sRGB by default on Maxwell3D registers as some games seem to
assume this.
2019-09-10 19:31:42 -03:00
bunnei
34b2c60f95
Merge pull request #2823 from ReinUsesLisp/shr-clamp
...
shader/shift: Implement SHR wrapped and clamped variants
2019-09-10 11:56:17 -04:00
bunnei
c7ec7bc1f5
Merge pull request #2810 from ReinUsesLisp/mme-opt
...
maxwell_3d: Avoid moving macro_params
2019-09-10 11:55:45 -04:00
ReinUsesLisp
6170337001
gl_rasterizer: Implement image bindings
2019-09-05 20:35:51 -03:00
ReinUsesLisp
3a450c1395
kepler_compute: Implement texture queries
2019-09-05 20:35:51 -03:00
ReinUsesLisp
5f309b88db
Revert "Revert #2466" and stub FirmwareCall 4
2019-09-04 01:55:45 -03:00
ReinUsesLisp
77ef4fa907
shader/shift: Implement SHR wrapped and clamped variants
...
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
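The difference between the two variants can be sketched in plain C++, where shifting a 32-bit value by 32 or more is likewise undefined. The function names are hypothetical, and clamping the amount to 31 is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstdint>

// Wrapped variant: only the low five bits of the shift amount are used.
inline std::uint32_t ShiftRightWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & 31u);
}

// Clamped variant: oversized shift amounts saturate instead of wrapping.
// Assumption: the amount is clamped to 31.
inline std::uint32_t ShiftRightClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> std::min(shift, 31u);
}
```

For a shift amount of 33, the wrapped variant shifts by 1 and the clamped variant shifts by 31, so the two give different results on the same input.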
2019-09-04 01:55:24 -03:00
ReinUsesLisp
701dedcfad
maxwell_3d: Avoid moving macro_params
2019-09-04 01:55:01 -03:00
bunnei
81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
...
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei
d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
...
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
bunnei
137d165672
Merge pull request #2826 from ReinUsesLisp/macro-binding
...
maxwell_3d: Fix macro binding cursor
2019-09-03 22:32:42 -04:00
bunnei
50b5bb44a0
Merge pull request #2765 from FernandoS27/dma-fix
...
MaxwellDMA: Fixes, corrections and relaxations.
2019-09-01 13:13:05 -04:00
ReinUsesLisp
52a41f482f
maxwell_3d: Fix macro binding cursor
2019-09-01 05:01:11 -03:00
Rodrigo Locatti
4d4f9cc104
video_core: Silence miscellaneous warnings (#2820)
...
* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silence -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silence unused variable warnings
* buffer_cache: Silence -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
ReinUsesLisp
e3534700d7
shader_ir/conversion: Split int and float selector and implement F2F H1
2019-08-28 16:09:33 -03:00
ReinUsesLisp
b13fbc25b8
shader_ir/conversion: Implement F2I F16 Ra.H1
2019-08-27 23:40:40 -03:00
ReinUsesLisp
6207751b00
float_set_predicate: Add missing negation bit for the second operand
2019-08-27 21:57:43 -03:00
ReinUsesLisp
4e35177e23
shader_ir: Implement VOTE
...
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.
To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
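The stub mapping listed above amounts to evaluating each vote over a single thread. A minimal sketch in C++, with hypothetical function names standing in for the emitted expressions:

```cpp
// With a simulated warp size of one, the only participating thread is the
// current one, so each vote collapses to a trivial expression.

// "Any thread has value set" is just this thread's value.
constexpr bool StubAnyThread(bool value) {
    return value;
}

// "All threads have value set" is also just this thread's value.
constexpr bool StubAllThreads(bool value) {
    return value;
}

// A single thread always agrees with itself.
constexpr bool StubAllThreadsEqual(bool) {
    return true;
}

static_assert(StubAnyThread(true) && !StubAnyThread(false), "any -> value");
static_assert(StubAllThreads(true) && !StubAllThreads(false), "all -> value");
static_assert(StubAllThreadsEqual(true) && StubAllThreadsEqual(false), "equal -> true");
```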
ballotARB, also known as "uint64_t(activeThreadsNV())", emits
VOTE.ANY Rd, PT, PT;
on nouveau's compiler. This doesn't exactly match Nvidia's code
VOTE.ALL Rd, PT, PT;
Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
bunnei
cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
...
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp
2ff8044806
shader_ir: Implement NOP
2019-08-04 03:02:55 -03:00
bunnei
52f54c728d
Merge pull request #2592 from FernandoS27/sync1
...
Implement GPU Synchronization Mechanisms & Correct NVFlinger
2019-07-26 14:26:44 -04:00
Fernando Sahmkow
a452ff983d
MaxwellDMA: Fixes, corrections and relaxations.
...
This commit fixes offsets on linear -> tiled copies, corrects z pos
for tiled -> linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
bunnei
31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
...
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00