Commit Graph

3594 Commits

Author SHA1 Message Date
Fernando Sahmkow ba02d564f8 Video Core: initial Implementation of InstanceDraw Packaging 2019-09-19 11:41:27 -04:00
bunnei b31880dc5e
Merge pull request #2784 from ReinUsesLisp/smem
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00
ReinUsesLisp 0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp 2dd6411753 maxwell_to_gl: Fix mipmap filtering
OpenGL texture filters follow GL_<texture_filter>_MIPMAP_<mipmap_filter>
but we were using them in the opposite way.
2019-09-17 03:32:24 -03:00
ReinUsesLisp af809b491e gl_rasterizer: Remove unused code paths from ConfigureFramebuffers 2019-09-17 02:50:42 -03:00
Fernando Sahmkow 393cc3ef2f
Merge pull request #2851 from ReinUsesLisp/srgb
renderer_opengl: Fix sRGB blits
2019-09-15 10:38:10 -04:00
Fernando Sahmkow b8b1747704
Merge pull request #2824 from ReinUsesLisp/mme
Revert "Revert #2466" and stub FirmwareCall 4
2019-09-15 06:17:04 -04:00
Rodrigo Locatti 193bfefce4
maxwell_3d: Update firmware 4 call stub commentary 2019-09-14 22:51:18 -03:00
Fernando Sahmkow daae327e86
Merge pull request #2857 from ReinUsesLisp/surface-srgb
video_core/surface: Add function to detect sRGB surfaces
2019-09-14 03:53:21 -04:00
Fernando Sahmkow 18fac59050
Merge pull request #2858 from ReinUsesLisp/vk-device
vk_device: Add miscellaneous features and minor style changes
2019-09-14 03:52:06 -04:00
ReinUsesLisp 01d96e1136 vk_device: Add miscellaneous features and minor style changes
* Increase minimum Vulkan requirements
* Require VK_EXT_vertex_attribute_divisor
* Require depthClamp, samplerAnisotropy and largePoints features
* Search and expose VK_KHR_uniform_buffer_standard_layout
* Search and expose VK_EXT_index_type_uint8
* Search and expose native float16 arithmetics
* Track current driver with VK_KHR_driver_properties
* Query and expose SSBO alignment
* Query more image formats
* Improve logging overall
* Minor style changes
* Minor rephrasing of commentaries
2019-09-13 02:10:07 -03:00
ReinUsesLisp 99e23bd0fd video_core/surface: Add function to detect sRGB surfaces
This is required for proper conversion to RGBA8_UNORM or RGBA8_SRGB
surfaces when a backend can target both native and converted ASTC.
2019-09-13 00:27:04 -03:00
ReinUsesLisp 6b997c8f7f renderer_opengl: Fix rebase mistake 2019-09-11 00:09:37 -03:00
ReinUsesLisp 36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
Fernando Sahmkow e60d281a01 gl_rasterizer: Correct sRGB Fix regression 2019-09-10 19:31:42 -03:00
ReinUsesLisp 78574746bd renderer_opengl: Fix sRGB blits
Removes the sRGB hack of tracking if a frame used an sRGB rendertarget
to apply at least once to blit the final texture as sRGB. Instead of
doing this apply sRGB if the presented image has sRGB.

Also enable sRGB by default on Maxwell3D registers as some games seem to
assume this.
2019-09-10 19:31:42 -03:00
bunnei 34b2c60f95
Merge pull request #2823 from ReinUsesLisp/shr-clamp
shader/shift: Implement SHR wrapped and clamped variants
2019-09-10 11:56:17 -04:00
bunnei c7ec7bc1f5
Merge pull request #2810 from ReinUsesLisp/mme-opt
maxwell_3d: Avoid moving macro_params
2019-09-10 11:55:45 -04:00
ReinUsesLisp 17a9b0178d gl_shader_decompiler: Avoid writing output attribute when unimplemented 2019-09-06 15:02:12 -03:00
ReinUsesLisp 1f43e5296f gl_shader_decompiler: Keep track of written images and mark them as modified 2019-09-05 23:26:05 -03:00
ReinUsesLisp 7228e22098 texture_cache: Minor changes 2019-09-05 23:25:15 -03:00
ReinUsesLisp 322d0200c8 gl_rasterizer: Apply textures and images state 2019-09-05 20:35:51 -03:00
ReinUsesLisp 80ec2feee8 gl_rasterizer: Add samplers to compute dispatches 2019-09-05 20:35:51 -03:00
ReinUsesLisp 954fc02fdd gl_rasterizer: Minor code changes 2019-09-05 20:35:51 -03:00
ReinUsesLisp 04cdecb7a1 gl_state: Split textures and samplers into two arrays 2019-09-05 20:35:51 -03:00
ReinUsesLisp 6170337001 gl_rasterizer: Implement image bindings 2019-09-05 20:35:51 -03:00
ReinUsesLisp 5edf24b510 gl_state: Add support for glBindImageTextures 2019-09-05 20:35:51 -03:00
ReinUsesLisp 2424eefad2 texture_cache: Pass TIC to texture cache 2019-09-05 20:35:51 -03:00
ReinUsesLisp 3a450c1395 kepler_compute: Implement texture queries 2019-09-05 20:35:51 -03:00
ReinUsesLisp 2e5b5c2358 gl_rasterizer: Split SetupTextures 2019-09-05 20:35:51 -03:00
Fernando Sahmkow 4ee9949639
Merge pull request #2804 from ReinUsesLisp/remove-gs-special
gl_shader_cache: Remove special casing for geometry shaders
2019-09-05 16:03:46 -04:00
bunnei 03badbdd9b
Merge pull request #2833 from ReinUsesLisp/fix-stencil
gl_rasterizer: Fix stencil testing
2019-09-05 15:27:31 -04:00
ReinUsesLisp 0f7b813d65 gl_shader_decompiler: Implement shared memory 2019-09-05 01:40:24 -03:00
ReinUsesLisp 4de04eba39 shader_ir: Implement LD_S
Loads from shared memory.
2019-09-05 01:38:37 -03:00
ReinUsesLisp f17415d431 shader_ir: Implement ST_S
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
2019-09-05 01:38:37 -03:00
David d34fa7c4fa
Merge pull request #2802 from ReinUsesLisp/hsetp2-pred
half_set_predicate: Fix HSETP2 predicate assignments
2019-09-05 12:26:39 +10:00
ReinUsesLisp 6177cbdbe1 gl_shader_decompiler: Fixup slow path 2019-09-04 15:03:51 -03:00
ReinUsesLisp 7bbc98cfc3 gl_rasterizer: Fix stencil testing
* Fix stencil dirty flags tracking when stencil is disabled
* Attach stencil on clears (previously it only attached depth)
* Attach stencil on drawing regardless of stencil testing being enabled
2019-09-04 01:59:09 -03:00
ReinUsesLisp 5f309b88db Revert "Revert #2466" and stub FirmwareCall 4 2019-09-04 01:55:45 -03:00
ReinUsesLisp 77ef4fa907 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
ReinUsesLisp 701dedcfad maxwell_3d: Avoid moving macro_params 2019-09-04 01:55:01 -03:00
ReinUsesLisp 42e1bb6d46 gl_shader_cache: Remove special casing for geometry shaders
Now that ProgramVariants holds the primitive topology we no longer need
to keep track of individual geometry shaders topologies.
2019-09-04 01:54:43 -03:00
ReinUsesLisp dfae2d141a half_set_predicate: Fix predicate assignments 2019-09-04 01:54:23 -03:00
ReinUsesLisp 9cf52d027d gl_device: Disable precise in fragment shaders on bugged drivers 2019-09-04 01:54:00 -03:00
ReinUsesLisp 03276e7490 gl_shader_decompiler: Fixup AMD's slow path type 2019-09-04 01:54:00 -03:00
ReinUsesLisp 6c449793b8 gl_shader_decompiler: Rework GLSL decompiler type system
GLSL decompiler type system was broken. We converted all return values
to float except for some cases where returning we couldn't and
implicitly broke the rule of returning floats (e.g. for bools or bool
pairs).

Instead of doing this introduce class Expression that knows what type a
return value has and when a consumer wants to use the string it asks for
it with a required type, emitting a runtime error if types are
incompatible.

This has the disadvantage that there's more C++ code, but we can emit
better GLSL code that's easier to read.
2019-09-04 01:54:00 -03:00
bunnei 19af91434e
Merge pull request #2793 from ReinUsesLisp/bgr565
renderer_opengl: Implement RGB565 framebuffer format
2019-09-03 22:36:32 -04:00
bunnei 81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
bunnei 137d165672
Merge pull request #2826 from ReinUsesLisp/macro-binding
maxwell_3d: Fix macro binding cursor
2019-09-03 22:32:42 -04:00
bunnei 50b5bb44a0
Merge pull request #2765 from FernandoS27/dma-fix
MaxwellDMA: Fixes, corrections and relaxations.
2019-09-01 13:13:05 -04:00
ReinUsesLisp 52a41f482f maxwell_3d: Fix macro binding cursor 2019-09-01 05:01:11 -03:00
Rodrigo Locatti 4d4f9cc104 video_core: Silent miscellaneous warnings (#2820)
* texture_cache/surface_params: Remove unused local variable

* rasterizer_interface: Add missing documentation commentary

* maxwell_dma: Remove unused rasterizer reference

* video_core/gpu: Sort member declaration order to silent -Wreorder warning

* fermi_2d: Remove unused MemoryManager reference

* video_core: Silent unused variable warnings

* buffer_cache: Silent -Wreorder warnings

* kepler_memory: Remove unused MemoryManager reference

* gl_texture_cache: Add missing override

* buffer_cache: Add missing include

* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
ReinUsesLisp 878adee0a3 gl_buffer_cache: Add missing include
RasterizerInterface was considered an incomplete object by clang.
2019-08-29 22:02:52 +00:00
bunnei a67c4e6e02
Merge pull request #2742 from ReinUsesLisp/fix-texture-buffers
gl_texture_cache: Miscellaneous texture buffer fixes
2019-08-29 15:59:17 -04:00
bunnei e424615839
Merge pull request #2783 from FernandoS27/new-buffer-cache
Implement a New LLE Buffer Cache
2019-08-29 13:07:01 -04:00
bunnei f8cc5668f8
Merge pull request #2758 from ReinUsesLisp/packed-tid
shader/decode: Implement S2R Tic
2019-08-29 12:58:43 -04:00
ReinUsesLisp e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
ReinUsesLisp b13fbc25b8 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
ReinUsesLisp 6207751b00 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
ReinUsesLisp 4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
Fernando Sahmkow 83ec2091c1 Buffer Cache: Adress Feedback. 2019-08-21 12:14:27 -04:00
Fernando Sahmkow 6ce2c85047 Buffer_Cache: Implement flushing. 2019-08-21 12:14:26 -04:00
Fernando Sahmkow de8ff8a1c6 Buffer_Cache: Implement barriers. 2019-08-21 12:14:25 -04:00
Fernando Sahmkow 286f4c446a Buffer_Cache: Optimize and track written areas. 2019-08-21 12:14:25 -04:00
Fernando Sahmkow 5f4b746a1e BufferCache: Rework mapping caching. 2019-08-21 12:14:24 -04:00
Fernando Sahmkow 86d8563314 Buffer_Cache: Fixes and optimizations. 2019-08-21 12:14:23 -04:00
Fernando Sahmkow 862bec001b Video_Core: Implement a new Buffer Cache 2019-08-21 12:14:22 -04:00
bunnei d654b3d82e
Merge pull request #2769 from FernandoS27/commands-flush
GPU: Flush commands on every dma pusher step.
2019-08-21 10:29:56 -04:00
bunnei dfdd20142e
Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-21 10:29:17 -04:00
bunnei cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp 80702aa88f renderer_opengl: Implement RGB565 framebuffer format 2019-08-21 02:28:31 -03:00
ReinUsesLisp 9cdf5c6c31 renderer_opengl: Use block linear swizzling for CPU framebuffers 2019-08-21 02:17:14 -03:00
ReinUsesLisp 8ad7268c75 renderer_opengl: Use VideoCore pixel format 2019-08-21 02:16:40 -03:00
ReinUsesLisp 9a76e94b3d gpu: Change optional<reference_wrapper<T>> to T* for FramebufferConfig 2019-08-21 01:55:25 -03:00
bunnei ca61e298b3
Merge pull request #2778 from ReinUsesLisp/nop
shader_ir: Implement NOP
2019-08-18 08:51:34 -04:00
bunnei 87bbefe55f
Merge pull request #2768 from ReinUsesLisp/hsetp2-fix
decode/half_set_predicate: Fix predicates
2019-08-18 08:50:54 -04:00
ReinUsesLisp 2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
ReinUsesLisp ec0da3ef64 half_set_predicate: Fix HSETP2_C constant buffer offset 2019-08-04 02:50:55 -03:00
Fernando Sahmkow e52c895559 GPU: Flush commands on every dma pusher step.
This commit ensures that the host gpu is constantly fed with commands to
work with, while the guest gpu keeps producing the rest of the commands.
This reduces syncing time between host and guest gpu.
2019-07-26 16:54:22 -04:00
bunnei 52f54c728d
Merge pull request #2592 from FernandoS27/sync1
Implement GPU Synchronization Mechanisms & Correct NVFlinger
2019-07-26 14:26:44 -04:00
ReinUsesLisp 77f1a676a1 decode/half_set_predicate: Fix predicates 2019-07-26 00:12:38 -03:00
Fernando Sahmkow a452ff983d MaxwellDMA: Fixes, corrections and relaxations.
This commit fixes offsets on Linear -> Tiled copies, corrects z pos
fortiled->linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
bunnei b0ff3179ef
Merge pull request #2739 from lioncash/cflow
video_core/control_flow: Minor changes/warning cleanup
2019-07-25 13:04:56 -04:00
bunnei 4d26550f5f
Merge pull request #2737 from FernandoS27/track-fix
Shader_Ir: Correct tracking to track from right to left
2019-07-25 12:41:52 -04:00
bunnei 31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
bunnei 9be9600bdc
Merge pull request #2704 from FernandoS27/conditional
maxwell3d: Implement Conditional Rendering
2019-07-24 17:07:57 -04:00
ReinUsesLisp 104641db07 shader/decode: Implement S2R Tic 2019-07-22 16:16:10 -03:00
bunnei f601f25bcc
Merge pull request #2734 from ReinUsesLisp/compute-shaders
gl_rasterizer: Implement compute shaders
2019-07-22 11:12:55 -04:00
bunnei 27e10e0442
Merge pull request #2735 from FernandoS27/pipeline-rework
Rework Dirty Flags in GPU Pipeline, Optimize CBData and Redo Clearing mechanism
2019-07-21 00:59:52 -04:00
Fernando Sahmkow 11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow 7a35178ee2 Maxwell3D: Reorganize and address feedback 2019-07-20 10:18:35 -04:00
Fernando Sahmkow 1158777737 Shader_Ir: Change Debug Asserts for Log Warnings 2019-07-19 22:15:34 -04:00
ReinUsesLisp 45c162444d shader/half_set_predicate: Fix HSETP2 implementation 2019-07-19 22:21:22 -03:00
ReinUsesLisp 6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
Lioncash c1c89411da video_core/control_flow: Provide operator!= for types with operator==
Provides operational symmetry for the respective structures.
2019-07-18 21:03:31 -04:00
Lioncash 1780e0e3d0 video_core/control_flow: Prevent sign conversion in TryGetBlock()
The return value is a u32, not an s32, so this would result in an
implicit signedness conversion.
2019-07-18 21:03:31 -04:00
Lioncash a162a844d2 video_core/control_flow: Remove unnecessary BlockStack copy constructor
This is the default behavior of the copy constructor, so it doesn't need
to be specified.

While we're at it we can make the other non-default constructor
explicit.
2019-07-18 21:03:30 -04:00
Lioncash 56bc11d952 video_core/control_flow: Use std::move where applicable
Results in less work being done where avoidable.
2019-07-18 21:03:30 -04:00
Lioncash e7b39f47f8 video_core/control_flow: Use the prefix variant of operator++ for iterators
Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath.
2019-07-18 21:03:30 -04:00
Lioncash 6885e7e7ec video_core/control_flow: Use empty() member function for checking emptiness
It's what it's there for.
2019-07-18 21:03:30 -04:00
Lioncash 45fa12a05c video_core: Resolve -Wreorder warnings
Ensures that the constructor members are always initialized in the order
that they're declared in.
2019-07-18 21:03:30 -04:00
Lioncash 47df844338 video_core/control_flow: Make program_size for ScanFlow() a std::size_t
Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface.
2019-07-18 21:03:29 -04:00
Lioncash 3df9558593 video_core/control_flow: Place all internally linked types/functions within an anonymous namespace
Previously, quite a few functions were being linked with external
linkage.
2019-07-18 21:03:29 -04:00
Lioncash 1109db86b7 video_core/shader/decode: Prevent sign-conversion warnings
Makes it explicit that the conversions here are intentional.
2019-07-18 21:03:29 -04:00
bunnei 63bda67a34
Merge pull request #2738 from lioncash/shader-ir
shader-ir: Minor cleanup-related changes
2019-07-18 13:52:01 -04:00
Fernando Sahmkow 5a06e33859 Shader_Ir: correct clang format 2019-07-18 10:09:26 -04:00
Fernando Sahmkow 43f57d668c GPU: Add missing puller methods.
This adds some missing puller methods. We don't assert them as these are 
nop operations for us.
2019-07-18 08:54:42 -04:00
Fernando Sahmkow 3a3fee5abf MaxwellDMA/KeplerCopy: Downgrade DMA log message to Trace.
This log was just to know which games used DMA. It's no longer 
important.
2019-07-18 08:31:38 -04:00
Fernando Sahmkow d3b71ff80d Gl_Texture_Cache: Remove assert on component type in GetFormatTuple
Textures can have different components types in different orders. This 
assert was completely inprecise and the effectiveness of such is better 
handled by case and within the texture cache.
2019-07-18 08:20:31 -04:00
Fernando Sahmkow 0b65e9335e Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
This commit reduces the sevirity of asserts for FP precision and 
rounding as this are well known and have little to no consequences in 
gpu's accuracy.
2019-07-18 08:17:19 -04:00
ReinUsesLisp 74632c76ce gl_shader_decompiler: Rename bufferImage to imageBuffer
The online OpenGL documentation is wrong. The type definition is
imageBuffer.
2019-07-18 01:16:44 -03:00
ReinUsesLisp 87909d327f gl_shader_cache: Fix newline on buffer preprocessor definitions 2019-07-18 01:16:15 -03:00
ReinUsesLisp e7bdf8b22a textures: Fix texture buffer size calculation 2019-07-18 01:07:08 -03:00
ReinUsesLisp 84027f4808 gl_texture_cache: Do not set texture parameters to buffers 2019-07-18 01:06:26 -03:00
ReinUsesLisp 73b2dc6d4f gl_texture_cache: Add missing break in CreateTexture 2019-07-18 01:04:18 -03:00
Fernando Sahmkow 4be61013a1 GL_State: Feedback and fixes 2019-07-17 17:29:56 -04:00
Fernando Sahmkow 5ad889f6fd Maxwell3D: Address Feedback 2019-07-17 17:29:55 -04:00
Fernando Sahmkow 7826f0afd9 Texture_Cache: Rebase Fixes 2019-07-17 17:29:54 -04:00
Fernando Sahmkow 8cdbfe69b1 GL_Rasterizer: Corrections to Clearing. 2019-07-17 17:29:54 -04:00
Fernando Sahmkow 0ff4a5fa39 Maxwell3D: Correct marking dirtiness on CB upload 2019-07-17 17:29:53 -04:00
Fernando Sahmkow fec32fed18 GL_Rasterizer: Rework RenderTarget/DepthBuffer clearing 2019-07-17 17:29:52 -04:00
Fernando Sahmkow a081dea8ab Maxwell3D: Implement State Dirty Flags. 2019-07-17 17:29:51 -04:00
Fernando Sahmkow 0d3db58657 Maxwell3D: Rework CBData Upload 2019-07-17 17:29:50 -04:00
Fernando Sahmkow f2e7b29c14 Maxwell3D: Rework the dirty system to be more consistant and scaleable 2019-07-17 17:29:49 -04:00
Fernando Sahmkow e42bcf2314 maxwell3d: Implement Conditional Rendering
Conditional Rendering takes care of conditionaly clearing or drawing
depending on a set of queries. This PR implements the query checks to
stablish if things can be rendered or not.
2019-07-17 17:13:19 -04:00
Fernando Sahmkow 223a535f3f
Merge pull request #2740 from lioncash/bra
shader/decode/other: Correct branch indirect argument within BRA handling
2019-07-17 14:25:08 -04:00
Lioncash bebbdc2067 shader_ir: std::move Node instance where applicable
These are std::shared_ptr instances underneath the hood, which means
copying them isn't as cheap as a regular pointer. Particularly so on
weakly-ordered systems.

This avoids atomic reference count increments and decrements where they
aren't necessary for the core set of operations.
2019-07-16 19:49:23 -04:00
Lioncash 60926ac16b shader_ir: Rename Get/SetTemporal to Get/SetTemporary
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
2019-07-16 19:47:43 -04:00
Lioncash 44d87ff641 shader_ir: Remove unused includes
Removes unnecessary header dependencies.
2019-07-16 19:47:42 -04:00
Fernando Sahmkow d614193e49 Shader_Ir: Correct tracking to track from right to left 2019-07-16 15:06:59 -04:00
Fernando Sahmkow b56e7f870a
Merge pull request #2565 from ReinUsesLisp/track-indirect
shader/track: Track indirect buffers
2019-07-16 14:58:35 -04:00
Lioncash e2d7dda166 shader/decode/other: Correct branch indirect argument within BRA handling
This appears to have been a copy/paste error introduced within
8a6fc529a9
2019-07-16 12:20:45 -04:00
ReinUsesLisp 2a4044a858 gl_shader_cache: Fix clang-format issues 2019-07-15 20:33:51 -03:00
ReinUsesLisp 6b0d017675 gl_shader_decompiler: Stub local memory size 2019-07-15 17:38:25 -03:00
ReinUsesLisp 56bca83bde gl_shader_cache: Address review commentaries 2019-07-15 17:38:25 -03:00
ReinUsesLisp bbecd13697 gl_shader_cache: Address CI issues 2019-07-15 17:38:25 -03:00
ReinUsesLisp 725ba6cf63 gl_rasterizer: Implement compute shaders 2019-07-15 17:38:25 -03:00
Fernando Sahmkow 1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
bunnei b77a1ed67a
Merge pull request #2705 from FernandoS27/tex-cache-fixes
GPU: Fixes to Texture Cache and Include Microprofiles for GL State/BufferCopy/Macro Interpreter
2019-07-14 22:44:36 -04:00
ReinUsesLisp afa8096df5 shader: Allow tracking of indirect buffers without variable offset
While changing this code, simplify tracking code to allow returning
the base address node, this way callers don't have to manually rebuild
it on each invocation.
2019-07-14 22:36:44 -03:00
bunnei 3477b92289
Merge pull request #2675 from ReinUsesLisp/opengl-buffer-cache
buffer_cache: Implement a generic buffer cache and its OpenGL backend
2019-07-14 19:03:43 -04:00
Fernando Sahmkow 2ac7472d3f Texture_Cache: Address Feedback 2019-07-14 17:42:39 -04:00
Fernando Sahmkow 0f54b541f4 Texture_Cache: Remove some unprecise fallback case and clang format 2019-07-14 12:00:32 -04:00
Fernando Sahmkow 5818959e54 Texture_Cache: Force Framebuffer reset if an active render target is unregistered. 2019-07-14 12:00:31 -04:00
Fernando Sahmkow 913b7a6872 GPU: Add a microprofile for macro interpreter 2019-07-14 12:00:30 -04:00
Fernando Sahmkow a9943222f2 GL_State: Add a microprofile timer to OpenGL state. 2019-07-14 12:00:30 -04:00
Fernando Sahmkow 5c1e1a148e Gl_Texture_Cache: Measure Buffer Copy Times 2019-07-14 12:00:29 -04:00
Fernando Sahmkow 5d31bab69a Texture_Cache: Correct Linear Structural Match. 2019-07-14 12:00:28 -04:00
Fernando Sahmkow 4882c058fd
Merge pull request #2690 from SciresM/physmem_fixes
Implement MapPhysicalMemory/UnmapPhysicalMemory
2019-07-14 09:16:46 -04:00
Fernando Sahmkow 0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
bunnei bb67091c77
Merge pull request #2609 from FernandoS27/new-scan
Implement a New Shader Scanner, Decompile Flow Stack and implement BRX BRA.CC
2019-07-11 17:36:23 -04:00
ReinUsesLisp 0eb0c24269 gl_shader_decompiler: Fix gl_PointSize redeclaration 2019-07-11 16:10:59 -03:00
ReinUsesLisp aca40de224 gl_shader_decompiler: Fix conditional usage of GL_ARB_shader_viewport_layer_array 2019-07-11 04:27:00 -03:00
bunnei fd066ffbce
Merge pull request #2697 from lioncash/doc
gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
2019-07-10 16:38:09 -04:00
bunnei 7fb7054bc8
Merge pull request #2686 from ReinUsesLisp/vk-scheduler
vk_scheduler: Drop execution context in favor of views
2019-07-10 16:35:48 -04:00
bunnei 206ec29f17
Merge pull request #2691 from lioncash/override
video_core: Add missing override specifiers
2019-07-10 16:25:43 -04:00
Fernando Sahmkow f2549739d1 shader_ir: Add comments on missing instruction.
Also shows Nvidia's address space on comments.
2019-07-09 17:15:45 -04:00
Michael Scire a1845d1dd3 prefer system reference over global accessor 2019-07-09 08:11:35 -07:00
Fernando Sahmkow 2de7649311 shader_ir: limit explorastion to best known program size. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow e7c6045a03 control_flow: Correct block breaking algorithm. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow dc4a93594c control_flow: Assert shaders bigger than limit. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow e7a88f0ab3 control_flow: Address feedback. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow 34357b110c shader_ir: Correct parsing of scheduling instructions and correct sizing 2019-07-09 08:14:41 -04:00
Fernando Sahmkow cfb3db1a32 shader_ir: Correct max sizing 2019-07-09 08:14:40 -04:00
Fernando Sahmkow d45fed3030 shader_ir: Remove unnecessary constructors and use optional for ScanFlow result 2019-07-09 08:14:40 -04:00
Fernando Sahmkow 01b21ee1e8 shader_ir: Corrections, documenting and asserting control_flow 2019-07-09 08:14:39 -04:00
Fernando Sahmkow d5533b440c shader_ir: Unify blocks in decompiled shaders. 2019-07-09 08:14:39 -04:00
Fernando Sahmkow 926b80102f shader_ir: Decompile Flow Stack 2019-07-09 08:14:38 -04:00
Fernando Sahmkow 459fce3a8f shader_ir: propagate shader size to the IR 2019-07-09 08:14:37 -04:00
Fernando Sahmkow 8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
Fernando Sahmkow c218ae4b02 shader_ir: Remove the old scanner. 2019-07-09 08:14:36 -04:00
Fernando Sahmkow 8af6e6a052 shader_ir: Implement a new shader scanner 2019-07-09 08:14:36 -04:00
Lioncash c04785c928 gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
must_reconfigure isn't a parameter for this function any more, so it can
be replaced with current_state.

While we're at it, we can make the parameters of the declaration match
the same name as the ones in the definition.
2019-07-09 02:08:15 -04:00
Michael Scire 697206092e Prevent merging of device mapped memory blocks.
This sets the DeviceMapped attribute for GPU-mapped memory blocks,
and prevents merging device mapped blocks. This prevents memory
mapped from the gpu from having its backing address changed by
block coalesce.
2019-07-08 22:52:05 -07:00
ReinUsesLisp c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
Tobias be020f7621
Delete decode_integer_set.cpp 2019-07-07 21:40:33 +02:00
ReinUsesLisp d0966b9f7c shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
Lioncash cbdd6cd1c0 vk_sampler_cache: Remove unused includes
These are no longer used within this header, so they can be removed.
2019-07-07 13:40:36 -04:00
Lioncash 4b27680639 video_core: Add missing override specifiers 2019-07-07 13:38:39 -04:00
ReinUsesLisp 86a874a2fc vk_scheduler: Drop execution context in favor of views
Instead of passing by copy an execution context through out the whole
Vulkan call hierarchy, use a command buffer view and fence view
approach.

This internally dereferences the command buffer or fence forcing the
user to be unable to use an outdated version of it on normal usage.
It is still possible to keep store an outdated if it is casted to
VKFence& or vk::CommandBuffer.

While changing this file, add an extra parameter for Flush and Finish to
allow releasing the fence from this calls.
2019-07-07 03:30:22 -03:00
ReinUsesLisp 79a23ca5f0 buffer_cache: Avoid [[nodiscard]] to make clang-format happy 2019-07-06 01:17:05 -03:00
ReinUsesLisp 83050c9495 buffer_cache: Try to fix MinGW build 2019-07-06 01:14:05 -03:00
ReinUsesLisp f7691ebe57 gl_rasterizer: Fix nullptr dereference on disabled buffers 2019-07-06 00:37:56 -03:00
ReinUsesLisp 7ecf64257a gl_rasterizer: Minor style changes 2019-07-06 00:37:55 -03:00
ReinUsesLisp 9cdc576f60 gl_rasterizer: Fix vertex and index data invalidations 2019-07-06 00:37:55 -03:00
ReinUsesLisp 1fa21fa192 gl_buffer_cache: Implement with generic buffer cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp 32c0212b24 buffer_cache: Implement a generic buffer cache
Implements a templated class with a similar approach to our current
generic texture cache. It is designed to be compatible with Vulkan and
OpenGL,
2019-07-06 00:37:55 -03:00
ReinUsesLisp 2bcae41a73 gl_buffer_cache: Remove global system getters 2019-07-06 00:37:55 -03:00
ReinUsesLisp 02ab844934 gl_device: Query SSBO alignment 2019-07-06 00:37:55 -03:00
ReinUsesLisp d14fbfb9b5 gl_buffer_cache: Implement flushing 2019-07-06 00:37:55 -03:00
ReinUsesLisp 345f852bdb gl_rasterizer: Drop gl_global_cache in favor of gl_buffer_cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp 8155b12d3d gl_buffer_cache: Rework to support internalized buffers 2019-07-06 00:37:55 -03:00
ReinUsesLisp f8ba72d491 gl_buffer_cache: Store in CachedBufferEntry the used buffer handle 2019-07-06 00:37:55 -03:00
ReinUsesLisp b54fb8fc4c gl_buffer_cache: Return used buffer from Upload function 2019-07-06 00:37:55 -03:00
ReinUsesLisp a6d2f52fc3 gl_rasterizer: Add some commentaries 2019-07-06 00:37:55 -03:00
ReinUsesLisp 2b9d4088ec gl_rasterizer: Make DrawParameters rasterizer instance const 2019-07-06 00:37:55 -03:00
ReinUsesLisp 2e39c20da5 gl_rasterizer: Move index buffer uploading to its own method 2019-07-06 00:37:55 -03:00
Fernando Sahmkow d20ede40b1 NVServices: Styling, define constructors as explicit and corrections 2019-07-05 15:49:32 -04:00
Fernando Sahmkow b391e5f638 NVFlinger: Correct GCC compile error 2019-07-05 15:49:31 -04:00
Fernando Sahmkow 0335a25d1f NVServices: Make NVEvents Automatic according to documentation. 2019-07-05 15:49:29 -04:00
Fernando Sahmkow 7d1b974bca GPU: Correct Interrupts to interrupt on syncpt/value instead of event, mirroring hardware 2019-07-05 15:49:26 -04:00
Fernando Sahmkow f2e026a1d8 gpu_asynch: Simplify synchronization to a simpler consumer->producer scheme. 2019-07-05 15:49:20 -04:00
Fernando Sahmkow 0706d633bf nv_host_ctrl: Make Sync GPU variant always return synced result. 2019-07-05 15:49:20 -04:00
Fernando Sahmkow 600dddf88d Async GPU: do invalidate as synced operation
Async GPU: Always invalidate synced.
2019-07-05 15:49:19 -04:00
Fernando Sahmkow c13433aee4 Gpu: use an std mutex instead of a spin_lock to guard syncpoints 2019-07-05 15:49:18 -04:00
Fernando Sahmkow eef55f493b Gpu: Mark areas as protected. 2019-07-05 15:49:16 -04:00
Fernando Sahmkow a45643cb3b nv_services: Stub CtrlEventSignal 2019-07-05 15:49:15 -04:00
Fernando Sahmkow 8942047d41 Gpu: Implement Hardware Interrupt Manager and manage GPU interrupts 2019-07-05 15:49:14 -04:00
Fernando Sahmkow 82b829625b video_core: Implement GPU side Syncpoints 2019-07-05 15:49:11 -04:00
Zach Hilman 772c86a260
Merge pull request #2601 from FernandoS27/texture_cache
Implement a new Texture Cache
2019-07-05 13:39:13 -04:00
Fernando Sahmkow 3b9d89839d texture_cache: Address Feedback 2019-07-05 09:46:53 -04:00
Fernando Sahmkow 30b176f92b texture_cache: Correct Texture Buffer Uploading 2019-07-04 19:38:19 -04:00
Zach Hilman ad50cd7df9 gl_shader_cache: Make CachedShader constructor private
Fixes missing review comments introduced.
2019-07-03 20:39:46 -04:00
Zach Hilman da5a537029
Merge pull request #2563 from ReinUsesLisp/shader-initializers
gl_shader_cache: Use static constructors for CachedShader initialization
2019-07-03 20:20:05 -04:00
Fernando Sahmkow 4705d1b523 rasterizer_cache: Protect inherited caches from submission level 2019-07-01 04:32:01 -04:00
ReinUsesLisp 6e1db6b703 texture_cache: Pack sibling queries inside a method 2019-06-29 20:47:46 -03:00
ReinUsesLisp 8eae66907e texture_cache: Use std::vector reservation for sampled_textures 2019-06-29 20:10:31 -03:00
ReinUsesLisp f6f1a8f26a texture_cache: Style changes 2019-06-29 19:52:37 -03:00
ReinUsesLisp dd9ace502b texture_cache: Use std::array for siblings_table 2019-06-29 18:54:13 -03:00
ReinUsesLisp 3f3c3ca5f9 texture_cache: Address feedback 2019-06-29 17:29:39 -03:00
Fernando Sahmkow 223ca80753 texture_cache: Correct variable naming. 2019-06-25 19:35:08 -04:00
Fernando Sahmkow 5aeabd9a17 gl_texture_cache: Correct asserts 2019-06-25 19:26:59 -04:00
Fernando Sahmkow 88bc39374f texture_cache: Corrections, documentation and asserts 2019-06-25 18:36:19 -04:00
Fernando Sahmkow c0abc7124d surface_params: Corrections, asserts and documentation. 2019-06-25 18:03:25 -04:00
Fernando Sahmkow fb234560b0 copy_params: use constexpr for constructor 2019-06-25 17:42:50 -04:00
Fernando Sahmkow 18d24fbdd0 gl_texture_cache: Corrections and fixes 2019-06-25 17:40:08 -04:00
Fernando Sahmkow 36665ce0b2 gl_resource_manager: Correct MakeStreamCopy 2019-06-25 17:32:04 -04:00
Fernando Sahmkow 58c8a44e7a texture_cache: Query MemoryManager from the system 2019-06-25 17:26:00 -04:00
ReinUsesLisp 7565389700 texture_cache: Include "core/core.h" 2019-06-24 02:15:57 -03:00
ReinUsesLisp e723441e37 gl_texture_cache: Explicitly add indirect include 2019-06-24 02:13:55 -03:00
ReinUsesLisp 34841a41c3 texture_cache/surface_view: Address feedback 2019-06-24 02:09:56 -03:00
ReinUsesLisp 0837290992 texture_cache/surface_base: Address feedback 2019-06-24 02:08:52 -03:00
ReinUsesLisp 75de730e28 video_core/surface: Address feedback 2019-06-24 02:07:11 -03:00
ReinUsesLisp 10a83653ee decode/texture: Address feedback 2019-06-24 02:05:05 -03:00
ReinUsesLisp 4504302abc renderer_opengl/utils: Remove unused includes and unused forward declaration 2019-06-24 02:03:37 -03:00
ReinUsesLisp 4b2ff1e00e gl_texture_cache: Address some feedback 2019-06-24 02:01:44 -03:00
ReinUsesLisp 0b6df52109 gl_shader_disk_cache: Address feedback 2019-06-24 01:59:32 -03:00
ReinUsesLisp b8b05a484a gl_shader_decompiler: Address feedback 2019-06-24 01:56:38 -03:00
ReinUsesLisp 4d63f97945 shader_bytecode: Include missing <array> 2019-06-24 01:51:02 -03:00
bunnei a9f3c54871
Merge pull request #2579 from ReinUsesLisp/fix-aoffi-test
gl_device: Fix TestVariableAoffi test
2019-06-21 15:28:55 -04:00
Fernando Sahmkow d1812316e1 texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
Fernando Sahmkow 51ba60b27e shader_cache: Correct versioning and size calculation. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 97c8c9f49a texture_cache: Eliminate linear textures fallthrough 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 6acdae0e4c texture_cache: Correct format R16U as sibling 2019-06-20 21:38:34 -03:00
Fernando Sahmkow d7587842eb texture_cache: Implement texception detection and texture barriers. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 198a0395bb texture_cache: Corrections to buffers and shadow formats use. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow fed773a86c texture_cache: Implement Irregular Views in surfaces 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 082740d34d surface: Correct format S8Z24 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 03d489dcf5 texture_cache: Initialize all siblings to invalid pixel format. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 9422cf7c10 gl_texture_cache: Use Stream Buffers instead of Persistant for Buffer Copies. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow fac3706253 gl_texture_cache: Correct Image Blit 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 7232a1ed16 decoders: correct block calculation 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 3dd7643214 texture_cache: Use siblings textures on Rebuild and fix possible error on blitting 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 4db28f72f6 texture_cache: Remove old rasterizer cache 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 2d83553ea7 texture_cache: Implement siblings texture formats. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow cb728797b0 fermi2d: Correct Origin Mode 2019-06-20 21:38:34 -03:00
Fernando Sahmkow a56f687793 texture_cache: correct texture buffer on surface params 2019-06-20 21:38:34 -03:00
Fernando Sahmkow b01f9c8a70 texture_cache: eliminate accelerated depth->color/color->depth copies due to driver instability. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 561ce29c98 texture_cache: correct mutex locks 2019-06-20 21:38:34 -03:00
Fernando Sahmkow b7de31ac97 shader_ir: Fix image copy rebase issues 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 6f69f06873 texture_cache: Don't Image Copy if component types differ 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 9f755218a1 texture_cache: move some large methods to cpp files 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 3809041c24 texture_cache: Optimize GetSurface and use references on functions that don't change a surface. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow 60bf761afb texture_cache: Implement Buffer Copy and detect Turing GPUs Image Copies 2019-06-20 21:38:33 -03:00
Fernando Sahmkow 228f516bb4 texture_cache uncompress-compress is untopological.
This makes conflicts between non compress and compress textures to be 
auto recycled. It also limits the amount of mipmaps a texture can have 
if it goes above it's limit.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow 9251354152 texture_cache: Correct copying between compressed and uncompressed formats 2019-06-20 21:38:33 -03:00
Fernando Sahmkow 0966665fc2 texture_cache: Only load on recycle with accurate GPU.
Testing so far has proven this to be quite safe as texture memory read 
added a 2-5ms load to the current cache.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow ea1525dab1 Fix rebase errors 2019-06-20 21:38:33 -03:00
Fernando Sahmkow bdf9faab33 texture_cache: Handle uncontinuous surfaces. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow e60ed2bb3e texture_cache: return null surface on invalid address 2019-06-20 21:38:33 -03:00
Fernando Sahmkow fcac55d5bf texture_cache: Add checks for texture buffers. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow 175aa343ff texture_cache: Fermi2D reform and implement View Mirage
This also does some fixes on compressed textures reinterpret and on the
Fermi2D engine in general.
2019-06-20 21:38:33 -03:00
ReinUsesLisp 1bf4154e7d gl_shader_decompiler: Implement image binding settings 2019-06-20 21:38:33 -03:00
ReinUsesLisp 9097301d92 shader: Implement bindless images 2019-06-20 21:38:33 -03:00
ReinUsesLisp 06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp 007ffbef1c gl_rasterizer: Track texture buffer usage 2019-06-20 21:38:33 -03:00
ReinUsesLisp 58c0d37422 video_core: Make ARB_buffer_storage a required extension 2019-06-20 21:36:12 -03:00
ReinUsesLisp 07f7ce1da2 gl_rasterizer_cache: Use texture buffers to emulate texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp b8c75a845b maxwell_3d: Partially implement texture buffers as 1D textures 2019-06-20 21:36:12 -03:00
ReinUsesLisp 6c81c8f5b7 gl_shader_decompiler: Allow 1D textures to be texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp 4e81fc8296 shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow d267948a73 texture_cache: loose TryReconstructSurface when accurate GPU is not on.
Also corrects some asserts.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow 6162cb922e texture_cache: Document the most important methods. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 4530511ee4 texture_cache: Try to Reconstruct Surface on bigger than overlap.
This fixes clouds in SMO Cap Kingdom and lens on Cloud Kingdom.
Also moved accurate_gpu setting check to Pick Strategy
2019-06-20 21:36:12 -03:00
Fernando Sahmkow a79831d9d0 texture_cache: Implement Guard mechanism 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 7731a0e2d1 texture_cache: General Fixes
Fixed ASTC mipmaps loading
Fixed alignment on openGL upload/download
Fixed Block Height Calculation
Removed unalign_height
2019-06-20 21:36:12 -03:00
ReinUsesLisp c2ed348bdd surface_params: Ensure pitch is always written to avoid surface leaks 2019-06-20 21:36:12 -03:00
ReinUsesLisp 9098905dd1 gl_framebuffer_cache: Use a hashed struct to cache framebuffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow d65a4af895 texture_cache return invalid buffer on deactivated color_mask 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 6bd034eae9 engine_upload: Addapt to new Texture Cache 2019-06-20 21:36:12 -03:00
ReinUsesLisp 2131f71573 surface_params: Optimize CreateForTexture
Instead of using Common::AlignUp, use Common::AlignBits to align the
texture compression factor.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow 41b4674458 gl_texture_cache: Make main views be proxy textures instead of a full view. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 07cc7e0c12 texture_cache: Add ASync Protections 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 1bbc9debfb Remove Framebuffer reconfiguration and restrict rendertarget protection 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 5192521dc3 texture_cache: Implement GPU Dirty Flags 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 94f2be5473 texture_cache: Optimize GetMipBlockHeight and GetMipBlockDepth 2019-06-20 21:36:12 -03:00
Fernando Sahmkow a4a58be2d4 texture_cache: Implement L1_Inner_cache 2019-06-20 21:36:12 -03:00
ReinUsesLisp 345e73f2fe video_core: Use un-shifted block sizes to avoid integer divisions
Instead of storing all block width, height and depths in their shifted
form:

block_width = 1U << block_shift;

Store them like they are provided by the emulated hardware (their
block_shift form). This way we can avoid doing the costly
Common::AlignUp operation to align texture sizes and drop CPU integer
divisions with bitwise logic (defined in Common::AlignBits).
2019-06-20 21:36:12 -03:00
ReinUsesLisp 28d7c2f5a5 texture_cache: Change internal cache from lists to vectors 2019-06-20 21:36:12 -03:00
Fernando Sahmkow b347543e83 Reduce amount of size calculations. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 4e2071b6d9 texture_cache: Correct premature texceptions
Due to our current infrastructure, it is possible for a mipmap to be set 
on as a render target before a texception of that mipmap's superset be 
set afterwards. This is problematic as we rely on texture views to set 
up texceptions and protecting render targets targets for 3D texture 
rendering.

One simple solution is to configure framebuffers after texture setup but 
this brings other problems. This solution, forces a reconfiguration of 
the framebuffers after such event happens.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow ba677ccb5a texture_cache: Implement guest flushing 2019-06-20 21:36:12 -03:00
Fernando Sahmkow de0b1cb2b2 Fixes to mipmap's process and reconstruct process 2019-06-20 21:36:12 -03:00
ReinUsesLisp e0002599ac surface_base: Add parenthesis to EmplaceOverview's predicate 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 324e470879 Texture Cache: Implement Blitting and Fermi Copies 2019-06-20 21:36:12 -03:00
ReinUsesLisp 549fd18ac4 surface_view: Add constructor for ViewParams 2019-06-20 21:36:12 -03:00
ReinUsesLisp 16e8625a30 surface_base: Split BreakDown into layered and non-layered variants 2019-06-20 21:36:12 -03:00
ReinUsesLisp 2b30000a1e surface_base: Silence truncation warnings and minor renames and reordering 2019-06-20 21:36:12 -03:00
ReinUsesLisp 03d10ea3b4 copy_params: Use constructor instead of C-like initialization 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 1af4414861 Correct Mipmaps View method in Texture Cache 2019-06-20 21:36:12 -03:00
Fernando Sahmkow d86f9cd709 Change texture_cache chaching from GPUAddr to CacheAddr
This also reverses the changes to make invalidation and flushing through
the GPU address.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow b711cdce78 Corrections to Structural Matching
The texture will now be reconstructed if the width only matches on GoB 
alignment.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow bc930754cc Implement Texture Cache V2 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 3d471e732d Correct Surface Base and Views for new Texture Cache 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 3b26206dbd Add OGLTextureView 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 6b0695b3cd Deglobalize Memory Manager on texture cahe and Implement Invalidation and Flushing using GPUVAddr 2019-06-20 21:36:11 -03:00
ReinUsesLisp 6c410104f4 texture_cache: Remove execution context copies from the texture cache
This is done to simplify the OpenGL implementation, it is needed for
Vulkan.
2019-06-20 21:36:11 -03:00
ReinUsesLisp fa59a7b4d8 gl_texture_cache: Implement fermi copies 2019-06-20 21:36:11 -03:00
ReinUsesLisp 1b4503c571 texture_cache: Split texture cache into different files 2019-06-20 21:36:11 -03:00
ReinUsesLisp 5f3aacdc37 texture_cache: Move staging buffer into a generic implementation 2019-06-20 21:36:11 -03:00
ReinUsesLisp 2787a0c287 texture_cache: Flush 3D textures in the order they are drawn 2019-06-20 21:36:11 -03:00
ReinUsesLisp 4b396f375c gl_texture_cache: Minor changes 2019-06-20 21:36:11 -03:00
ReinUsesLisp 0cefb7bcb4 gl_texture_cache: Add copy from multiple overlaps into a single surface 2019-06-20 21:36:11 -03:00
ReinUsesLisp 84139586c9 gl_texture_cache: Attach surface textures instead of views 2019-06-20 21:36:11 -03:00
ReinUsesLisp fb94871791 gl_texture_cache: Add fast copy path 2019-06-20 21:36:11 -03:00
ReinUsesLisp bab21e8cb3 gl_texture_cache: Initial implementation 2019-06-20 21:36:11 -03:00
bunnei c28694d907
Merge pull request #2591 from lioncash/record
core: Remove unused CiTrace source files
2019-06-19 22:28:26 -04:00
Lioncash 61d2498f00 core: Remove unused CiTrace source files
These source files have been unused for the entire lifecycle of the
project. They're a hold-over from Citra and only add to the build time
of the project, so they can be removed.

There's also likely no way this would ever work in yuzu in its current
form without revamping quite a bit of it, given how different the GPU on
the Switch is compared to the 3DS.
2019-06-18 16:57:59 -04:00
bunnei c7b5c245e1
Merge pull request #2562 from ReinUsesLisp/split-cbuf-upload
video_core/engines: Move ConstBufferInfo out of Maxwell3D
2019-06-17 22:35:04 -04:00
Zach Hilman c0e7b91145
Merge pull request #2538 from ReinUsesLisp/ssy-pbk
shader: Split SSY and PBK stack
2019-06-15 20:30:13 -04:00
ReinUsesLisp ee81fb94cd gl_device: Fix TestVariableAoffi test
This test is intended to be invalid GLSL, but it was being invalid in
two points instead of one. The intention is to use a non-immediate
parameter in a textureOffset like function.

The problem is that this shader was being compiled as a separable
shader object and the text was writting to gl_Position without a
redeclaration, being invalid GLSL.

Address that issue by using a user-defined output attribute.
2019-06-11 23:02:50 -03:00
bunnei f981efdf8d
Merge pull request #2572 from FernandoS27/gpu-mem
GPUVM: Correct GPU VM virtual address space
2019-06-11 21:09:57 -04:00
Fernando Sahmkow f79823fda7 GPUVM: Correct GPU VM virtual address space 2019-06-09 17:47:15 -04:00
ReinUsesLisp 528c15051c kepler_compute: Use std::array for cbuf info 2019-06-07 20:36:22 -03:00
ReinUsesLisp 17d5fb6d06 kepler_compute: Fix block_dim_x encoding 2019-06-07 20:35:46 -03:00
ReinUsesLisp 4ec8a3df08 gl_shader_cache: Use static constructors for CachedShader initialization 2019-06-07 20:20:22 -03:00
ReinUsesLisp 5669ff3cbd gl_rasterizer: Remove unused parameters in descriptor uploads 2019-06-07 19:52:16 -03:00
ReinUsesLisp 2f2a61887a video_core/engines: Move ConstBufferInfo out of Maxwell3D 2019-06-07 19:47:15 -03:00
Zach Hilman de33ad25f5
Merge pull request #2514 from ReinUsesLisp/opengl-compat
video_core: Drop OpenGL core in favor of OpenGL compatibility
2019-06-07 17:23:25 -04:00
ReinUsesLisp fe8e6618f2 shader: Split SSY and PBK stack
Hardware testing revealed that SSY and PBK push to a different stack,
allowing code like this:

        SSY label1;
        PBK label2;
        SYNC;
label1: PBK;
label2: EXIT;
2019-06-07 02:18:27 -03:00
ReinUsesLisp 769a50661a shader/node: Minor changes
Reflect std::shared_ptr nature of Node on initializers and remove
constant members in nodes.

Add some commentaries.
2019-06-06 20:03:33 -03:00
ReinUsesLisp e1b3be7ced shader: Move Node declarations out of the shader IR header
Analysis passes do not have a good reason to depend on shader_ir.h to
work on top of nodes. This splits node-related declarations to their own
file and leaves the IR in shader_ir.h
2019-06-06 20:02:37 -03:00
ReinUsesLisp bf4dfb3ad4 shader: Use shared_ptr to store nodes and move initialization to file
Instead of having a vector of unique_ptr stored in a vector and
returning star pointers to this, use shared_ptr. While changing
initialization code, move it to a separate file when possible.

This is a first step to allow code analysis and node generation beyond
the ShaderIR class.
2019-06-05 20:41:52 -03:00
bunnei a20ba09bfd
Merge pull request #2520 from ReinUsesLisp/vulkan-refresh
vk_device,vk_shader_decompiler: Miscellaneous changes
2019-06-05 18:10:00 -04:00
bunnei 55c5029171
Merge pull request #2540 from ReinUsesLisp/remove-guest-position
gl_shader_decompiler: Remove guest "position" varying
2019-06-05 18:07:23 -04:00
bunnei 0bcc305797
Merge pull request #2512 from ReinUsesLisp/comp-indexing
gl_shader_decompiler: Pessimize uniform buffer access on AMD's prorpietary driver
2019-06-05 18:02:30 -04:00
Zach Hilman 81e09bb121
Merge pull request #2545 from lioncash/timing
core/core_timing_util: Use std::chrono types for specifying time units
2019-06-05 15:52:37 -04:00
Zach Hilman dd4fe0dab1
Merge pull request #2534 from ReinUsesLisp/shader-cleanup
gl_shader_cache: Minor style changes
2019-06-05 15:28:34 -04:00
Lioncash 42f5fd0ab3 core/core_timing_util: Use std::chrono types for specifying time units
Makes the interface more type-safe and consistent in terms of return
values.
2019-06-04 20:31:24 -04:00
Fernando Sahmkow a32c52b1d8 shader_bytecode: Mark EXIT as flow instruction 2019-06-04 12:18:35 -04:00
ReinUsesLisp 0935c2d97b gl_shader_decompiler: Remove guest "position" varying
"position" was being written but not read anywhere besides geometry
shaders, where it had the same value as gl_Position.

This commit replaces "position" with gl_Position, reducing the
complexity of our code and the emitted GLSL code.
2019-06-03 01:01:34 -03:00
ReinUsesLisp e72b9044a0 gl_shader_cache: Store a system class and drop global accessors 2019-05-30 14:01:40 -03:00
ReinUsesLisp ad321564ed gl_shader_cache: Add commentaries explaining the intention in shaders creation 2019-05-30 13:58:38 -03:00
ReinUsesLisp 838b6d2ff8 gl_shader_cache: Flip if condition in GetStageProgram to reduce indentation 2019-05-30 13:56:03 -03:00
ReinUsesLisp 6ac4490751 gl_buffer_cache: Remove unused ReserveMemory method 2019-05-30 13:21:01 -03:00
ReinUsesLisp a89cc0bafc maxwell_to_gl: Use GL_CLAMP to emulate Clamp wrap mode 2019-05-30 13:21:01 -03:00
ReinUsesLisp b76df62c00 gl_rasterizer: Move alpha testing to the OpenGL pipeline
Removes the alpha testing code from each fragment shader invocation.
2019-05-30 13:21:01 -03:00
ReinUsesLisp df509486c4 gl_rasterizer: Use GL_QUADS to emulate quads rendering 2019-05-30 13:21:01 -03:00
bunnei e3608578e4
Merge pull request #2446 from ReinUsesLisp/tid
shader: Implement S2R Tid{XYZ} and CtaId{XYZ}
2019-05-29 12:21:17 -04:00
ReinUsesLisp 21c0b4dec8 gl_device: Add commentary to AOFFI unit test source code
The intention behind this commit is to hint someone inspecting an
apitrace dump to ignore this ill-formed GLSL code.
2019-05-27 00:55:57 -03:00
ReinUsesLisp 84928e6d67 gl_shader_gen: Always declare extensions after the version declaration
This addresses a bug on geometry shaders where code was being written
before all #extension declarations were done. Ref to #2523
2019-05-27 00:51:35 -03:00
ReinUsesLisp f424b46036 vk_device: Let formats array type be deduced 2019-05-26 03:09:06 -03:00
ReinUsesLisp a4c5e3e339 vk_shader_decompiler: Misc fixes
Fix missing OpSelectionMerge instruction. This caused devices loses on
most hardware, Intel didn't care.

Fix [-1;1] -> [0;1] depth conversions.

Conditionally use VK_EXT_scalar_block_layout. This allows us to use
non-std140 layouts on UBOs.

Update external Vulkan headers.
2019-05-26 01:48:04 -03:00
ReinUsesLisp dec3c981d0 vk_device: Enable features when available and misc changes
Keeps track of native ASTC support, VK_EXT_scalar_block_layout
availability and SSBO range.

Check for independentBlend and vertexPipelineStorageAndAtomics as a
required feature. Always enable it.

Use vk::to_string format to log Vulkan enums.

Style changes.
2019-05-26 01:41:34 -03:00
Lioncash 5a4564bd8e renderer_opengl/utils: Use a std::string_view with LabelGLObject()
Uses a std::string_view instead of a std::string, given the pointed to
string isn't modified and is only used in a formatting operation.

This is nice because a few usages directly supply a string literal to
the function, allowing these usages to otherwise not heap allocate,
unlike the std::string overloads.

While we're at it, we can combine the address formatting into a single
formatting call.
2019-05-24 23:50:10 -04:00
bunnei 68c9c9222d
Merge pull request #2358 from ReinUsesLisp/parallel-shader
gl_shader_cache: Use shared contexts to build shaders in parallel at boot
2019-05-24 22:42:08 -04:00
bunnei 1a2d90ab09
Merge pull request #2485 from ReinUsesLisp/generic-memory
shader/memory: Implement generic memory stores and loads (ST and LD)
2019-05-24 18:24:26 -04:00
ReinUsesLisp d8827b07b5 gl_shader_decompiler: Use an if based cbuf indexing for broken drivers
The following code is broken on AMD's proprietary GLSL compiler:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value = values[idx & 3];
```

It index the wrong components, to fix this the following pessimized code
is emitted when that bug is present:
```glsl
uint idx = ...;
vec4 values = ...;
float some_value;
if ((idx & 3) == 0) some_value = values.x;
if ((idx & 3) == 1) some_value = values.y;
if ((idx & 3) == 2) some_value = values.z;
if ((idx & 3) == 3) some_value = values.w;
```
2019-05-24 02:47:56 -03:00
ReinUsesLisp 46177901b8 gl_device: Add test to detect broken component indexing
Component indexing on AMD's proprietary driver is broken. This commit adds
a test to detect when we are on a driver that can't successfully manage
component indexing.

It dispatches a dummy draw with just one vertex shader that writes to an
indexed SSBO from the GPU with data sent through uniforms, it then reads
that data from the CPU and compares the expected output.
2019-05-24 02:47:56 -03:00
Lioncash b6dcb1ae4d shader/shader_ir: Make Comment() take a std::string by value
This allows for forming comment nodes without making unnecessary copies
of the std::string instance.

e.g. previously:

Comment(fmt::format("Base address is c[0x{:x}][0x{:x}]",
        cbuf->GetIndex(), cbuf_offset));

Would result in a copy of the string being created, as CommentNode()
takes a std::string by value (a const ref passed to a value parameter
results in a copy).

Now, only one instance of the string is ever moved around. (fmt::format
returns a std::string, and since it's returned from a function by value,
this is a prvalue (which can be treated like an rvalue), so it's moved
into Comment's string parameter), we then move it into the CommentNode
constructor, which then moves the string into its member variable).
2019-05-23 03:01:55 -03:00
Lioncash 228e58d0a5 shader/decode/*: Add missing newline to files lacking them
Keeps the shader code file endings consistent.
2019-05-23 02:55:52 -03:00
Lioncash 87b4c1ac5e shader/decode/*: Eliminate indirect inclusions
Amends cases where we were using things that were indirectly being
satisfied through other headers. This way, if those headers change and
eliminate dependencies on other headers in the future, we don't have
cascading compilation errors.
2019-05-23 02:55:52 -03:00
Lioncash 195b54602f shader/decode/memory: Remove left in debug pragma 2019-05-22 17:08:50 -04:00
Lioncash de23847184 renderer_opengl/gl_shader_decompiler: Remove redundant name specification in format string
This accidentally slipped through a rebase.
2019-05-21 09:47:21 -04:00
ReinUsesLisp 69215b5a55 gl_shader_cache: Fix clang strict standard build issues 2019-05-20 22:46:05 -03:00
ReinUsesLisp c03b8c4c19 gl_shader_cache: Use shared contexts to build shaders in parallel 2019-05-20 22:45:55 -03:00
ReinUsesLisp 75e7b45d69 shader/memory: Implement ST (generic memory) 2019-05-20 22:41:53 -03:00
ReinUsesLisp f78ef617b6 shader/memory: Implement LD (generic memory) 2019-05-20 22:38:59 -03:00
bunnei 9a17b20896
Merge pull request #2494 from lioncash/shader-text
gl_shader_decompiler: Add AddLine() overloads with single function that forwards to libfmt
2019-05-20 20:42:40 -04:00
ReinUsesLisp 9c3461604c shader: Implement S2R Tid{XYZ} and CtaId{XYZ} 2019-05-20 16:36:49 -03:00
ReinUsesLisp ada79fa8ad gl_shader_decompiler: Make GetSwizzle constexpr 2019-05-20 16:36:48 -03:00
Lioncash 58a0c13e34 gl_shader_decompiler: Tidy up minor remaining cases of unnecessary std::string concatenation 2019-05-20 14:14:48 -04:00
Lioncash 6fb29764d6 gl_shader_decompiler: Replace individual overloads with the fmt-based one
Gets rid of the need to special-case brace handling depending on the
overload used, and makes it consistent across the board with how fmt
handles them.

Strings with compile-time deducible strings are directly forwarded to
std::string's constructor, so we don't need to worry about the
performance difference here, as it'll be identical.
2019-05-20 14:14:48 -04:00
Lioncash 784d2b6c3d gl_shader_decompiler: Utilize fmt overload of AddLine() where applicable 2019-05-20 14:14:44 -04:00
Fernando Sahmkow 911fafb967 Revert #2466
This reverts a tested behavior on delay slots not exiting if the exit 
flag is set. Currently new tests are required in order to ensure this 
behavior.
2019-05-19 16:04:44 -04:00
Lioncash 91ec251c4a gl_shader_decompiler: Add AddLine() overload that forwards to fmt
In a lot of places throughout the decompiler, string concatenation via
operator+ is used quite heavily. This is usually fine, when not heavily
used, but when used extensively, can be a problem. operator+ creates an
entirely new heap allocated temporary string and given we perform
expressions like:

std::string thing = a + b + c + d;

this ends up with a lot of unnecessary temporary strings being created
and discarded, which kind of thrashes the heap more than we need to.
Given we utilize fmt in some AddLine calls, we can make this a part of
the ShaderWriter's API. We can make an overload that simply acts as a
passthrough to fmt.

This way, whenever things need to be appended to a string, the operation
can be done via a single string formatting operation instead of
discarding numerous temporary strings. This also has the benefit of
making the strings themselves look nicer and makes it easier to spot
errors in them.
2019-05-19 14:12:20 -04:00
bunnei d49efbfb4a
Merge pull request #2441 from ReinUsesLisp/al2p
shader: Implement AL2P and ALD.PHYS
2019-05-19 14:02:58 -04:00
Hexagon12 b94b08fa6f
Merge pull request #2491 from FernandoS27/dma-fix
Dma_pusher: ASSERT on empty command_list
2019-05-19 16:27:15 +01:00
Hexagon12 f8b1e53369
Merge pull request #2452 from FernandoS27/raster-cache-fix
Correct possible error on Rasterizer Caches
2019-05-19 16:00:44 +01:00
Hexagon12 2aebbe9bf9
Merge pull request #2497 from lioncash/shader-ir
shader/shader_ir: Minor changes
2019-05-19 15:51:06 +01:00
Hexagon12 fadf66993c
Merge pull request #2495 from lioncash/cache
gl_shader_disk_cache: Minor cleanup
2019-05-19 15:50:23 +01:00
Fernando Sahmkow 9e98100c94 Dma_pusher: ASSERT on empty command_list
This is a measure to avoid crashes on command list reading as an empty
command_list is considered a NOP.
2019-05-19 10:48:31 -04:00
Hexagon12 18cdbdafa2
Merge pull request #2467 from lioncash/move
video_core/gpu_thread: Remove redundant copy constructor for CommandDataContainer
2019-05-19 15:20:37 +01:00
Hexagon12 9175bffbdb
Merge pull request #2466 from yuzu-emu/mme-exit-delay-slot
GPU/MMEInterpreter: Ignore the 'exit' flag when it's executed inside a delay slot.
2019-05-19 15:14:41 +01:00
Hexagon12 ac3775e6ae
Merge pull request #2468 from lioncash/deduction
yuzu: Remove explicit types from locks where applicable
2019-05-19 15:05:56 +01:00
Hexagon12 b54bd3f018
Merge pull request #2472 from FernandoS27/tic
maxwell_3d: reduce severity of different component formats assert.
2019-05-19 15:04:47 +01:00
Hexagon12 3bd5f01240
Merge pull request #2469 from lioncash/copyable
video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
2019-05-19 15:02:17 +01:00
Sebastian Valle a6ed792ac4
Merge pull request #2470 from lioncash/ranged-for
video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
2019-05-19 09:01:19 -05:00
Hexagon12 4452195d41
Merge pull request #2480 from ReinUsesLisp/fix-quads
gl_rasterizer: Pass the right number of array quad vertices count
2019-05-19 14:58:49 +01:00