artemist/yuzu

Author	SHA1	Message	Date
Rodrigo Locatti	9286976948	Merge pull request #2878 from FernandoS27/icmp shader_ir: Implement ICMP	2019-09-21 18:06:07 -03:00
Fernando Sahmkow	527b841c15	Shader_IR: ICMP corrections and fixes	2019-09-21 14:28:03 -04:00
bunnei	88d857499b	Merge pull request #2855 from ReinUsesLisp/shfl shader_ir/warp: Implement SHFL for Nvidia devices	2019-09-20 17:10:42 -04:00
Fernando Sahmkow	4b81d19a1a	Shader_IR: Implement ICMP.	2019-09-19 20:56:29 -04:00
bunnei	b31880dc5e	Merge pull request #2784 from ReinUsesLisp/smem shader_ir: Implement shared memory	2019-09-18 16:26:05 -04:00
ReinUsesLisp	0526bf1895	shader_ir/warp: Implement SHFL	2019-09-17 17:44:07 -03:00
ReinUsesLisp	36abf67e79	shader/image: Implement SUATOM and fix SUST	2019-09-10 20:22:31 -03:00
bunnei	34b2c60f95	Merge pull request #2823 from ReinUsesLisp/shr-clamp shader/shift: Implement SHR wrapped and clamped variants	2019-09-10 11:56:17 -04:00
ReinUsesLisp	1f43e5296f	gl_shader_decompiler: Keep track of written images and mark them as modified	2019-09-05 23:26:05 -03:00
ReinUsesLisp	3a450c1395	kepler_compute: Implement texture queries	2019-09-05 20:35:51 -03:00
ReinUsesLisp	4de04eba39	shader_ir: Implement LD_S Loads from shared memory.	2019-09-05 01:38:37 -03:00
ReinUsesLisp	f17415d431	shader_ir: Implement ST_S This instruction writes to a memory buffer shared with threads within the same work group. It is known as "shared" memory in GLSL.	2019-09-05 01:38:37 -03:00
ReinUsesLisp	77ef4fa907	shader/shift: Implement SHR wrapped and clamped variants Nvidia defaults to wrapped shifts, but this is undefined behaviour on OpenGL's spec. Explicitly mask/clamp according to what the guest shader requires.	2019-09-04 01:55:24 -03:00
ReinUsesLisp	dfae2d141a	half_set_predicate: Fix predicate assignments	2019-09-04 01:54:23 -03:00
bunnei	81fbc5370d	Merge pull request #2812 from ReinUsesLisp/f2i-selector shader_ir/conversion: Implement F2I and F2F F16 selector	2019-09-03 22:35:33 -04:00
bunnei	d4f33b822b	Merge pull request #2811 from ReinUsesLisp/fsetp-fix float_set_predicate: Add missing negation bit for the second operand	2019-09-03 22:34:34 -04:00
Rodrigo Locatti	4d4f9cc104	video_core: Silent miscellaneous warnings (#2820) * texture_cache/surface_params: Remove unused local variable * rasterizer_interface: Add missing documentation commentary * maxwell_dma: Remove unused rasterizer reference * video_core/gpu: Sort member declaration order to silent -Wreorder warning * fermi_2d: Remove unused MemoryManager reference * video_core: Silent unused variable warnings * buffer_cache: Silent -Wreorder warnings * kepler_memory: Remove unused MemoryManager reference * gl_texture_cache: Add missing override * buffer_cache: Add missing include * shader/decode: Remove unused variables	2019-08-30 14:08:00 -04:00
bunnei	f8cc5668f8	Merge pull request #2758 from ReinUsesLisp/packed-tid shader/decode: Implement S2R Tic	2019-08-29 12:58:43 -04:00
ReinUsesLisp	e3534700d7	shader_ir/conversion: Split int and float selector and implement F2F H1	2019-08-28 16:09:33 -03:00
ReinUsesLisp	b13fbc25b8	shader_ir/conversion: Implement F2I F16 Ra.H1	2019-08-27 23:40:40 -03:00
ReinUsesLisp	6207751b00	float_set_predicate: Add missing negation bit for the second operand	2019-08-27 21:57:43 -03:00
ReinUsesLisp	4e35177e23	shader_ir: Implement VOTE Implement VOTE using Nvidia's intrinsics. Documentation about these can be found here https://developer.nvidia.com/reading-between-threads-shader-intrinsics Instead of using portable ARB instructions I opted to use Nvidia intrinsics because these are the closest we have to how Tegra X1 hardware renders. To stub VOTE on non-Nvidia drivers (including nouveau) this commit simulates a GPU with a warp size of one, returning what is meaningful for the instruction being emulated: * anyThreadNV(value) -> value * allThreadsNV(value) -> value * allThreadsEqualNV(value) -> true ballotARB, also known as "uint64_t(activeThreadsNV())", emits VOTE.ANY Rd, PT, PT; on nouveau's compiler. This doesn't match exactly to Nvidia's code VOTE.ALL Rd, PT, PT; Which is emulated with activeThreadsNV() by this commit. In theory this shouldn't really matter since .ANY, .ALL and .EQ affect the predicates (set to PT on those cases) and not the registers.	2019-08-21 14:50:38 -03:00
bunnei	dfdd20142e	Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix half_set_predicate: Fix HSETP2_C constant buffer offset	2019-08-21 10:29:17 -04:00
bunnei	cedc1aab4a	Merge pull request #2753 from FernandoS27/float-convert Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.	2019-08-21 10:27:57 -04:00
bunnei	ca61e298b3	Merge pull request #2778 from ReinUsesLisp/nop shader_ir: Implement NOP	2019-08-18 08:51:34 -04:00
ReinUsesLisp	2ff8044806	shader_ir: Implement NOP	2019-08-04 03:02:55 -03:00
ReinUsesLisp	ec0da3ef64	half_set_predicate: Fix HSETP2_C constant buffer offset	2019-08-04 02:50:55 -03:00
ReinUsesLisp	77f1a676a1	decode/half_set_predicate: Fix predicates	2019-07-26 00:12:38 -03:00
bunnei	b0ff3179ef	Merge pull request #2739 from lioncash/cflow video_core/control_flow: Minor changes/warning cleanup	2019-07-25 13:04:56 -04:00
bunnei	4d26550f5f	Merge pull request #2737 from FernandoS27/track-fix Shader_Ir: Correct tracking to track from right to left	2019-07-25 12:41:52 -04:00
bunnei	31e8a61527	Merge pull request #2743 from FernandoS27/surpress-assert Downgrade and suppress a series of GPU asserts and debug messages.	2019-07-25 12:34:36 -04:00
ReinUsesLisp	104641db07	shader/decode: Implement S2R Tic	2019-07-22 16:16:10 -03:00
Fernando Sahmkow	11f4e739bd	Shader_Ir: Implement F16 Variants of F2F, F2I, I2F. This commit takes care of implementing the F16 Variants of the conversion instructions and makes sure conversions are done.	2019-07-20 17:38:25 -04:00
Fernando Sahmkow	1158777737	Shader_Ir: Change Debug Asserts for Log Warnings	2019-07-19 22:15:34 -04:00
ReinUsesLisp	45c162444d	shader/half_set_predicate: Fix HSETP2 implementation	2019-07-19 22:21:22 -03:00
ReinUsesLisp	6c4985edc9	shader/half_set_predicate: Implement missing HSETP2 variants	2019-07-19 22:20:47 -03:00
Lioncash	c1c89411da	video_core/control_flow: Provide operator!= for types with operator== Provides operational symmetry for the respective structures.	2019-07-18 21:03:31 -04:00
Lioncash	1780e0e3d0	video_core/control_flow: Prevent sign conversion in TryGetBlock() The return value is a u32, not an s32, so this would result in an implicit signedness conversion.	2019-07-18 21:03:31 -04:00
Lioncash	a162a844d2	video_core/control_flow: Remove unnecessary BlockStack copy constructor This is the default behavior of the copy constructor, so it doesn't need to be specified. While we're at it we can make the other non-default constructor explicit.	2019-07-18 21:03:30 -04:00
Lioncash	56bc11d952	video_core/control_flow: Use std::move where applicable Results in less work being done where avoidable.	2019-07-18 21:03:30 -04:00
Lioncash	e7b39f47f8	video_core/control_flow: Use the prefix variant of operator++ for iterators Same thing, but potentially allows a standard library implementation to pick a more efficient codepath.	2019-07-18 21:03:30 -04:00
Lioncash	6885e7e7ec	video_core/control_flow: Use empty() member function for checking emptiness It's what it's there for.	2019-07-18 21:03:30 -04:00
Lioncash	45fa12a05c	video_core: Resolve -Wreorder warnings Ensures that the constructor members are always initialized in the order that they're declared in.	2019-07-18 21:03:30 -04:00
Lioncash	47df844338	video_core/control_flow: Make program_size for ScanFlow() a std::size_t Prevents a truncation warning from occurring with MSVC. Also the internal data structures already treat it as a size_t, so this is just a discrepancy in the interface.	2019-07-18 21:03:29 -04:00
Lioncash	3df9558593	video_core/control_flow: Place all internally linked types/functions within an anonymous namespace Previously, quite a few functions were being linked with external linkage.	2019-07-18 21:03:29 -04:00
Lioncash	1109db86b7	video_core/shader/decode: Prevent sign-conversion warnings Makes it explicit that the conversions here are intentional.	2019-07-18 21:03:29 -04:00
bunnei	63bda67a34	Merge pull request #2738 from lioncash/shader-ir shader-ir: Minor cleanup-related changes	2019-07-18 13:52:01 -04:00
Fernando Sahmkow	5a06e33859	Shader_Ir: correct clang format	2019-07-18 10:09:26 -04:00
Fernando Sahmkow	0b65e9335e	Shader_Ir: Downgrade precision and rounding asserts to debug asserts. This commit reduces the severity of asserts for FP precision and rounding, as these are well known and have little to no consequence for the GPU's accuracy.	2019-07-18 08:17:19 -04:00
Fernando Sahmkow	223a535f3f	Merge pull request #2740 from lioncash/bra shader/decode/other: Correct branch indirect argument within BRA handling	2019-07-17 14:25:08 -04:00
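
The VOTE commit (4e35177e23) describes its fallback for non-Nvidia drivers as simulating a GPU with a warp size of one. A minimal sketch of that mapping follows, written as plain host-side C++ rather than the GLSL the decompiler would emit. The function names `VoteAny`, `VoteAll` and `VoteAllEqual` are hypothetical and do not come from the yuzu source; only the three return rules are taken from the commit message.

```cpp
// Hypothetical model of the warp-size-one VOTE fallback described in
// commit 4e35177e23. Names are illustrative, not yuzu's own.

// anyThreadNV(value) -> value: with one thread, "any" is that thread's value.
constexpr bool VoteAny(bool value) {
    return value;
}

// allThreadsNV(value) -> value: with one thread, "all" is that thread's value.
constexpr bool VoteAll(bool value) {
    return value;
}

// allThreadsEqualNV(value) -> true: a single thread always agrees with itself.
constexpr bool VoteAllEqual(bool /*value*/) {
    return true;
}

// Compile-time checks of the three rules listed in the commit message.
static_assert(VoteAny(true) && !VoteAny(false), "any must pass the value through");
static_assert(VoteAll(true) && !VoteAll(false), "all must pass the value through");
static_assert(VoteAllEqual(true) && VoteAllEqual(false), "equal must always hold");
```

With a single simulated thread, "any" and "all" collapse to the same answer, which is why both stubs simply return their argument.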


433 commits
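
Commit 77ef4fa907 notes that Nvidia hardware defaults to wrapped shifts, that OpenGL leaves out-of-range shifts undefined, and that the shift amount is therefore masked or clamped explicitly. The sketch below illustrates that distinction for a 32-bit logical right shift. It rests on assumptions: that "wrapped" means masking the shift amount to its low five bits, and that "clamped" means saturating it at 31. The helper names `ShrWrapped` and `ShrClamped` are hypothetical, and the commit's actual IR may differ (for example in how it handles signed operands).

```cpp
#include <cstdint>

// Largest shift amount that is well defined for a 32-bit operand.
constexpr std::uint32_t kMaxShift = 31;

// Wrapped variant (assumed semantics): only the low five bits of the shift
// amount are used, so a shift of 33 behaves like a shift of 1.
constexpr std::uint32_t ShrWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & kMaxShift);
}

// Clamped variant (assumed semantics): shift amounts above 31 saturate at 31
// instead of wrapping around.
constexpr std::uint32_t ShrClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift < kMaxShift ? shift : kMaxShift);
}

// In-range shifts agree between the two variants.
static_assert(ShrWrapped(0xF0u, 4u) == ShrClamped(0xF0u, 4u), "in-range shifts match");
```

Either way, the shift amount that reaches the host `>>` operator stays within 0 to 31, so the result no longer depends on driver-specific behaviour.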