franklinblanco/yuzu

Author	SHA1	Message	Date
bunnei	86345c126a	Merge pull request #3978 from ReinUsesLisp/write-rz shader_decompiler: Visit source nodes even when they assign to RZ	2020-05-25 21:31:33 -04:00
bunnei	1adabdac7f	Merge pull request #3905 from FernandoS27/vulkan-fix Correct a series of crashes and instructions on Async GPU and Vulkan Pipeline	2020-05-24 15:23:38 -04:00
bunnei	325e7eed3c	Merge pull request #3964 from ReinUsesLisp/arb-integration renderer_opengl: Add assembly program code paths	2020-05-24 00:34:12 -04:00
bunnei	487dd05170	Merge pull request #3979 from ReinUsesLisp/thread-group shader/other: Implement thread comparisons (NV_shader_thread_group)	2020-05-24 00:33:06 -04:00
ReinUsesLisp	e2b67a868b	shader/other: Implement thread comparisons (NV_shader_thread_group) Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stubbing them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached url for more documentation about these flags. https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt	2020-05-21 23:18:37 -03:00
ReinUsesLisp	ed4e324991	shader_decompiler: Visit source nodes even when they assign to RZ Some operations like atomicMin were ignored because their return values were being stored to RZ. These operations have a side effect and it was being ignored.	2020-05-21 23:16:03 -03:00
ReinUsesLisp	891236124c	buffer_cache: Use boost::intrusive::set for caching Instead of using boost::icl::interval_map for caching, use boost::intrusive::set. interval_map is intended as a container where the keys can overlap with one another; we don't need this for caching buffers and a std::set-like data structure that allows us to search with lower_bound is enough.	2020-05-21 16:44:00 -03:00
ReinUsesLisp	420cc13248	renderer_opengl: Add assembly program code paths Add code required to use OpenGL assembly programs based on NV_gpu_program5. Decompilation for ARB programs is intended to be added in a follow up commit. This does not include ARB decompilation and it's not in a usable state. The intention behind assembly programs is to reduce shader stutter significantly on drivers supporting NV_gpu_program5 (and other required extensions). Currently only Nvidia's proprietary driver supports these extensions. Add a UI option hidden for now to avoid people enabling this option accidentally. This code path has some limitations that OpenGL compatibility doesn't have: - NV_shader_storage_buffer_object is limited to 16 entries for a single OpenGL context state (I don't know if this is an intended limitation, a specification issue or I am missing something). Currently causes issues on The Legend of Zelda: Link's Awakening. - NV_parameter_buffer_object can't bind buffers using an offset different to zero. The used workaround is to copy to a temporary buffer (this doesn't happen often so it's not an issue). On the other hand, it has the following advantages: - Shaders build a lot faster. - We have control over how floating point rounding is done over individual instructions (SPIR-V on Vulkan can't do this). - Operations on shared memory can be unsigned and signed. - Transform feedbacks are dynamic state (not yet implemented). - Parameter buffers (uniform buffers) are per stage, matching NVN and hardware's behavior. - The API to bind and create assembly programs makes sense, unlike ARB_separate_shader_objects.	2020-05-19 18:00:04 -03:00
bunnei	b1a1bd12ca	Merge pull request #3899 from ReinUsesLisp/float-comparisons shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL	2020-05-13 09:51:14 -04:00
ReinUsesLisp	8b329ddcc9	gl_shader_decompiler: Properly emulate NaN behaviour on NE "Not equal" operators on GLSL seem to behave as unordered when we expect an ordered comparison. Manually emulate this checking for LGE values (numbers, not-NaNs).	2020-05-10 02:59:33 -03:00
Fernando Sahmkow	0a4be73b9b	VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation.	2020-05-09 19:25:29 -04:00
Rodrigo Locatti	7e376af8fc	Merge pull request #3839 from Morph1984/r8g8ui texture: Implement R8G8UI	2020-05-09 05:28:55 -03:00
ReinUsesLisp	4e57f9d5cf	shader_ir: Separate floating-point comparisons in ordered and unordered This allows us to use native SPIR-V instructions without having to manually check for NaN.	2020-05-09 04:55:15 -03:00
ReinUsesLisp	f813cd3ff7	gl_rasterizer: Implement viewport swizzles with NV_viewport_swizzle	2020-05-04 17:51:30 -03:00
bunnei	2aff0b4733	Merge pull request #3808 from ReinUsesLisp/wait-for-idle {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers	2020-05-03 02:43:18 -04:00
bunnei	e6b4311178	Merge pull request #3693 from ReinUsesLisp/clean-samplers shader/texture: Support multiple unknown sampler properties	2020-05-02 00:45:41 -04:00
Morph	7909860d16	texture: Implement R8G8UI - Used by The Walking Dead: The Final Season	2020-04-30 13:19:36 -04:00
bunnei	bf3f030a0d	Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp maxwell_3d: Fix depth clamping register	2020-04-30 13:07:31 -04:00
bunnei	c7b5a87c90	Merge pull request #3799 from ReinUsesLisp/iadd-cc shader: Implement P2R CC, IADD Rd.CC and IADD.X	2020-04-30 12:56:36 -04:00
bunnei	da2b8295e1	Merge pull request #3805 from ReinUsesLisp/preserve-contents texture_cache: Reintroduce preserve_contents accurately	2020-04-30 12:56:19 -04:00
bunnei	72b73d22ab	Merge pull request #3784 from ReinUsesLisp/shader-memory-util shader/memory_util: Deduplicate code	2020-04-28 12:05:50 -04:00
ReinUsesLisp	fe931ac976	{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers Drop MemoryBarrier from the buffer cache and use Maxwell3D's register WaitForIdle. To implement this on OpenGL we just call glMemoryBarrier with the necessary bits. Vulkan lacks this synchronization primitive, so we set an event and immediately wait for it. This is not a pretty solution, but it's what Vulkan can do without submitting the current command buffer to the queue (which ends up being more expensive on the CPU).	2020-04-28 02:18:12 -03:00
ReinUsesLisp	bb1ed66d99	maxwell_3d: Fix depth clamping register Using deko3d as reference: `4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)` We were using bits 3 and 4 to determine depth clamping, but these are the same both enabled and disabled: state->depthClampEnable ? 0x101A : 0x181D The same happens on Nvidia's OpenGL driver, where they do something like this (default capabilities, GL 4.5 compatibility): (state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c There's always a difference between the first bits in this register, but bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This commit changes yuzu's behaviour to use bit 11 to determine depth clamping. - Fixes depth issues on Super Mario Odyssey's intro.	2020-04-27 20:50:14 -03:00
ReinUsesLisp	8da16cf9fb	texture_cache: Reintroduce preserve_contents accurately This reverts commit `94b0e2e5da`. preserve_contents proved to be a meaningful optimization. This commit reintroduces it but properly implemented on OpenGL. We have to make sure the clear removes all the previous contents of the image. It's not currently implemented on Vulkan because we can do smart things there that are better introduced in a separate commit.	2020-04-26 19:53:02 -03:00
Rodrigo Locatti	7e38dd580f	Merge pull request #3753 from ReinUsesLisp/ac-vulkan {gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers	2020-04-26 01:55:43 -03:00
ReinUsesLisp	ddd82ef42b	shader/memory_util: Deduplicate code Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as well as shader decoder code. While we are at it, fix a bug in gl_shader_cache where compute shaders had a start offset of a stage shader.	2020-04-26 01:38:51 -03:00
ReinUsesLisp	255197e643	shader/arithmetic_integer: Implement CC for IADD	2020-04-25 22:55:26 -03:00
ReinUsesLisp	72deb773fd	shader_ir: Turn classes into data structures	2020-04-23 18:00:06 -03:00
Fernando Sahmkow	c043ac4f13	GL_Fence_Manager: use GL_TIMEOUT_IGNORED instead of a loop	2020-04-22 20:34:32 -04:00
Fernando Sahmkow	39e5b72948	Async GPU: Correct flushing behavior to be similar to old async GPU behavior.	2020-04-22 11:36:26 -04:00
Fernando Sahmkow	644588fd88	ShaderCache/PipelineCache: Cache null shaders.	2020-04-22 11:36:25 -04:00
Fernando Sahmkow	f616dc0b59	Address Feedback.	2020-04-22 11:36:24 -04:00
Fernando Sahmkow	ec2f3e48e1	Fix GCC error.	2020-04-22 11:36:23 -04:00
Fernando Sahmkow	0649f05900	QueryCache: Implement Async Flushes.	2020-04-22 11:36:18 -04:00
Fernando Sahmkow	131b342130	OpenGL: Guarantee writes to Buffers.	2020-04-22 11:36:18 -04:00
Fernando Sahmkow	1fb516cd97	GPU: Implement Flush Requests for Async mode.	2020-04-22 11:36:17 -04:00
Fernando Sahmkow	b7bc3c2549	FenceManager: Manage syncpoints and rename fences to semaphores.	2020-04-22 11:36:16 -04:00
Fernando Sahmkow	b10db7e4a5	FenceManager: Implement async buffer cache flushes on High settings	2020-04-22 11:36:15 -04:00
Fernando Sahmkow	a081a7c855	GPU: Fix rebase errors.	2020-04-22 11:36:13 -04:00
Fernando Sahmkow	e84eb64e51	Rasterizer: Disable fence managing in synchronous gpu.	2020-04-22 11:36:12 -04:00
Fernando Sahmkow	165ae823f5	ThreadManager: Sync async reads on accurate gpu.	2020-04-22 11:36:12 -04:00
Fernando Sahmkow	1f345ebe3a	GPU: Implement a Fence Manager.	2020-04-22 11:36:10 -04:00
Fernando Sahmkow	487379c593	OpenGL: Implement Fencing backend.	2020-04-22 11:36:10 -04:00
Fernando Sahmkow	8b1eb44b3e	BufferCache: Implement OnCPUWrite and SyncGuestHost	2020-04-22 11:36:07 -04:00
Fernando Sahmkow	da8f17715d	GPU: Refactor synchronization on Async GPU	2020-04-22 11:36:06 -04:00
Fernando Sahmkow	084ceb925a	UI: Replace accurate GPU option for GPU Accuracy Level	2020-04-22 11:36:04 -04:00
bunnei	d64290884a	Merge pull request #3714 from lioncash/copies gl_shader_decompiler: Avoid copies where applicable	2020-04-21 20:16:02 -04:00
ReinUsesLisp	0bbae63300	gl_rasterizer: Fix buffers without size On NVN buffers can be enabled but have no size. According to deko3d and the behavior we see in Animal Crossing: New Horizons these buffers get the special address of 0x1000 and limit themselves to 0xfff. Implement buffers without a size by binding a null buffer to OpenGL without a size. `1d1930beea/source/maxwell/gpu_3d_vbo.cpp (L62-L63)`	2020-04-21 19:55:44 -03:00
Mat M	5305806071	Merge pull request #3716 from bunnei/fix-another-impl-fallthrough video_core: gl_shader_decompiler: Fix implicit fallthrough errors.	2020-04-18 15:17:52 -04:00
bunnei	03726fb7f5	video_core: gl_shader_decompiler: Fix implicit fallthrough errors.	2020-04-18 15:15:21 -04:00
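
The reasoning in commit bb1ed66d99 (maxwell_3d: Fix depth clamping register) can be checked with a little bit arithmetic. The following is a minimal sketch in Python that uses only the four register values quoted in that commit message. The helper name `depth_clamp_enabled`, the dictionary keys, and the mask constants are hypothetical and are not taken from yuzu's source.

```python
# Register values quoted in the commit message, keyed by (source, clamp enabled).
# The key names are illustrative labels, not identifiers from yuzu.
SAMPLES = {
    ("deko3d", True): 0x101A,
    ("deko3d", False): 0x181D,
    ("nvidia_gl", True): 0x201A,
    ("nvidia_gl", False): 0x281C,
}

BITS_3_AND_4 = (1 << 3) | (1 << 4)  # bits the old code looked at
BIT_11 = 1 << 11                    # bit the commit switches to


def depth_clamp_enabled(register: int) -> bool:
    """Hypothetical helper: a cleared bit 11 is read as 'depth clamp enabled'."""
    return (register & BIT_11) == 0


for (source, enabled), value in SAMPLES.items():
    # Show the old bits next to the bit 11 interpretation for each sample.
    print(source, enabled, hex(value & BITS_3_AND_4), depth_clamp_enabled(value))
```

In all four samples bits 3 and 4 carry the same value whether clamping is on or off, while bit 11 is cleared only when clamping is enabled, which matches the observation the commit message describes.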

2270 Commits