Commit Graph

5303 Commits

Author SHA1 Message Date
Mat M c4001225f6
Merge pull request #3631 from ReinUsesLisp/more-astc
texture/astc: More small ASTC optimizations
2020-04-13 10:17:32 -04:00
Mat M 7b62212461
Merge pull request #3619 from ReinUsesLisp/i2i
shader/conversion: Implement I2I sign extension, saturation and selection
2020-04-13 10:17:07 -04:00
Mat M 3351e1e94f
Merge pull request #3627 from ReinUsesLisp/layered-view
gl_texture_cache: Attach view instead of base texture for layered attchments
2020-04-13 10:16:18 -04:00
Mat M d37d899431
Merge pull request #3646 from ReinUsesLisp/fix-glsl-turing
gl_shader_decompiler: Improve generated code in HMergeH*
2020-04-13 10:15:12 -04:00
Mat M 47036859eb
Merge pull request #3633 from ReinUsesLisp/clean-texdec
shader/texture: Remove type mismatches management from shader decoder
2020-04-13 10:13:05 -04:00
ReinUsesLisp 76615b9f34 gl_rasterizer: Implement line widths and smooth lines
Implements "legacy" features from OpenGL present on hardware such as
smooth lines and line width.
2020-04-13 01:30:34 -03:00
ReinUsesLisp 05cf270836 gl_shader_decompiler: Implement merges with bitfieldInsert
This also fixes Turing issues but it avoids doing more bitcasts. This
should improve the generated code while also avoiding more points where
compilers can flush floats.
2020-04-12 22:39:59 -03:00
Fernando Sahmkow 3d91dbb21d
Merge pull request #3578 from ReinUsesLisp/vmnmx
shader/video: Partially implement VMNMX
2020-04-12 10:44:03 -04:00
ReinUsesLisp 75eb953575 gl_shader_decompiler: Improve generated code in HMergeH*
Avoiding bitwise expressions, this fixes Turing issues in shaders using
half float merges that affected several games.
2020-04-12 05:06:55 -03:00
ReinUsesLisp 76f178ba6e shader/video: Partially implement VMNMX
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.

VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), having in mind the input sign. This result can then be
saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.

This instruction allows to implement with a more flexible approach GCN's
min3 and max3 instructions (for instance).
2020-04-12 00:34:42 -03:00
ReinUsesLisp a7baf6fee4 video_core: Add MSAA registers in 3D engine and TIC
This adds the registers used for multisampling. It doesn't implement
anything for now.
2020-04-12 00:21:27 -03:00
ReinUsesLisp 94b0e2e5da texture_cache: Remove preserve_contents
preserve_contents was always true. We can't assume we don't have to
preserve clears because scissored and color masked clears exist.

This removes preserve_contents and assumes it as true at all times.
2020-04-11 01:51:02 -03:00
ReinUsesLisp 2905142f47 renderer_vulkan: Drop Vulkan-Hpp 2020-04-10 22:49:02 -03:00
bunnei 51c6688e21
Merge pull request #3594 from ReinUsesLisp/vk-instance
yuzu: Drop SDL2 and Qt frontend Vulkan requirements
2020-04-10 20:06:55 -04:00
ReinUsesLisp a87b16da9a shader/texture: Remove type mismatches management from shader decoder
Since commit e22816a5bb we handle type mismatches from the CPU.
We don't need to hack our shader decoder due to game bugs anymore.

Removed in this commit.
2020-04-10 00:57:32 -03:00
Fernando Sahmkow 7182ef31c9
Merge pull request #3622 from ReinUsesLisp/srgb-texture-border
video_core/texture: Use a LUT to convert sRGB texture borders
2020-04-09 18:01:48 -04:00
ReinUsesLisp 6bf5d2b011 astc: Hard code bit depth changes to 8 and use fast replicate 2020-04-09 18:37:12 -03:00
Rodrigo Locatti 36f607217f
Merge pull request #3610 from FernandoS27/gpu-caches
Refactor all the GPU Caches to use VAddr for cache addressing
2020-04-09 17:59:21 -03:00
ReinUsesLisp bd2c1ab8a0 astc: Use boost's static_vector to avoid heap allocations 2020-04-09 05:27:57 -03:00
ReinUsesLisp 5de130beea astc: Implement a fast precompiled alternative for Replicate 2020-04-09 03:58:25 -03:00
ReinUsesLisp 6b4d4473be astc: Move Replicate to a constexpr LUT when possible 2020-04-09 03:35:07 -03:00
ReinUsesLisp d22a689250 astc: Make InputBitStream constexpr 2020-04-09 02:54:05 -03:00
ReinUsesLisp 0efc230381 astc: OutputBitStream style changes and make it constexpr 2020-04-09 02:37:51 -03:00
bunnei b96fd0bd0e
Merge pull request #3601 from ReinUsesLisp/some-shader-encodings
video_core/shader: Add some instruction and S2R encodings
2020-04-09 00:17:39 -04:00
ReinUsesLisp 6c8f9f40d7 gl_texture_cache: Attach view instead of base texture for layered attachments
This way we are not ignoring the base layer of the current texture.
2020-04-08 22:20:25 -03:00
Fernando Sahmkow 7cd6daf115 VkRasterizer: Eliminate Legacy code. 2020-04-08 18:59:09 -04:00
Fernando Sahmkow 1c18dc6577 Memory: Correct GCC errors. 2020-04-08 18:09:16 -04:00
Fernando Sahmkow 913f42a3a7 Memory: Address Feedback. 2020-04-08 13:40:46 -04:00
Fernando Sahmkow e00d992848 GPUMemoryManager: Improve safety of memory reads. 2020-04-08 12:08:06 -04:00
ReinUsesLisp a209d464f9 video_core/textures: Move GetMaxAnisotropy to cpp file 2020-04-07 20:47:31 -03:00
ReinUsesLisp d7db088180 video_core/texture: Use a LUT to convert sRGB texture borders
This is a reversed look up table extracted from
https://gist.github.com/rygorous/2203834#file-gistfile1-cpp-L41-L62

that is used in
04d4e9e587/source/maxwell/tsc_generate.cpp (L38)

Games usually bind 0xFD expecting a float texture border of 1.0f.
The conversion previous to this commit was multiplying the uint8 sRGB
texture border color by 255. This is close to 1.0f but when that
difference matters, some graphical glitches appear.

This look up table is manually changed in the edges, clamping towards
0.0f and 1.0f.

While we are at it, move this logic to its own translation unit.
2020-04-07 20:38:14 -03:00
bunnei f316911248
Merge pull request #3599 from ReinUsesLisp/revert-3499
Revert "Merge pull request #3499 from ReinUsesLisp/depth-2d-array"
2020-04-07 16:51:41 -04:00
ReinUsesLisp bf1d66b7c0 yuzu: Drop SDL2 and Qt frontend Vulkan requirements
Create Vulkan instances and surfaces from the Vulkan backend.
2020-04-07 16:32:19 -03:00
Rodrigo Locatti 487f9ba525
Merge pull request #3489 from namkazt/patch-2
shader: implement SULD.D bits32/64
2020-04-07 16:21:09 -03:00
Nguyen Dac Nam 935648ffa9
address nit. 2020-04-07 18:29:30 +07:00
ReinUsesLisp bc1b4b85b0 renderer_vulkan: Query device names from the backend 2020-04-07 02:23:23 -03:00
ReinUsesLisp da706cad25 shader/conversion: Implement I2I sign extension, saturation and selection
Reimplements I2I adding sign extension, saturation (clamp source value
to the destination), selection and destination sizes that are not 32
bits wide.

It doesn't implement CC yet.
2020-04-07 02:19:44 -03:00
Nguyen Dac Nam bf1174c114
Apply suggestions from code review
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2020-04-07 07:55:49 +07:00
Fernando Sahmkow f9d5718c4b Clang Format. 2020-04-06 09:23:08 -04:00
Fernando Sahmkow ea535d9470 Shader/Pipeline Cache: Use VAddr instead of physical memory for addressing. 2020-04-06 09:23:07 -04:00
Fernando Sahmkow 3dd5c07454 Query Cache: Use VAddr instead of physical memory for adressing. 2020-04-06 09:23:07 -04:00
Fernando Sahmkow 7fcd0fee6d Buffer Cache: Use vAddr instead of physical memory. 2020-04-06 09:23:06 -04:00
Fernando Sahmkow 6ee316cb8f Texture Cache: Use vAddr instead of physical memory for caching. 2020-04-06 09:23:05 -04:00
Fernando Sahmkow 9c0f40a1f5 GPU: Setup Flush/Invalidate to use VAddr instead of CacheAddr 2020-04-06 09:21:46 -04:00
Fernando Sahmkow 588a20be3f
Merge pull request #3513 from ReinUsesLisp/native-astc
video_core: Use native ASTC when available
2020-04-06 09:21:11 -04:00
namkazy 2c98e14d13 shader_decode: SULD.D using std::pair instead of out parameter 2020-04-06 13:46:55 +07:00
namkazy 9efa51311f shader_decode: SULD.D avoid duplicate code block. 2020-04-06 13:34:06 +07:00
namkazy 7f5696513f shader_decode: SULD.D fix conversion error. 2020-04-06 13:26:58 +07:00
namkazy 2906372ba1 shader_decode: SULD.D implement bits64 and reverse shader ir init method to removed shader stage. 2020-04-06 13:09:19 +07:00
ReinUsesLisp 3185245845 shader/memory: Implement RED.E.ADD
Implements a reduction operation. It's an atomic operation that doesn't
return a value.

This commit introduces another primitive because some shading languages
might have a primitive for reduction operations.
2020-04-06 02:24:47 -03:00
ReinUsesLisp fd0a2b5151 shader/memory: Add "using std::move" 2020-04-06 02:18:14 -03:00
ReinUsesLisp 79970c9174 shader/memory: Minor fixes in ATOM 2020-04-06 00:54:22 -03:00
Fernando Sahmkow 69277de29d
Merge pull request #3592 from ReinUsesLisp/ipa
shader_decompiler: Remove FragCoord.w hack and change IPA implementation
2020-04-05 19:29:40 -04:00
Fernando Sahmkow 1633fbf99a
Merge pull request #3589 from ReinUsesLisp/fix-clears
gl_rasterizer: Mark cleared textures as dirty
2020-04-05 19:29:26 -04:00
namkazy 730f9b55b3 silent warning (conversion error) 2020-04-05 16:02:07 +07:00
namkazy 9f6ebccf06 shader_decode: SULD.D -> SINT actually same as UNORM. 2020-04-05 15:18:42 +07:00
namkazy 6f2b7087c2 shader_decode: SULD.D fix decode SNORM component 2020-04-05 14:46:43 +07:00
namkazy 69657ff19c clang-format 2020-04-05 12:57:50 +07:00
namkazy 24cc64c5b3 shader_decode: get sampler descriptor from registry. 2020-04-05 12:54:48 +07:00
namkazy acd3f0ab37 tweaking. 2020-04-05 10:31:32 +07:00
Nguyen Dac Nam 8370188b3c clang-format 2020-04-05 10:31:31 +07:00
namkazy 3e3afa9be6 cleanup unuse params 2020-04-05 10:31:31 +07:00
namkazy 5cd5857000 cleanup debug code. 2020-04-05 10:31:30 +07:00
namkazy 658112783d reimplement get component type, uncomment mistaken code 2020-04-05 10:31:30 +07:00
namkazy 3ad06e9b2b remove disable optimize 2020-04-05 10:31:30 +07:00
namkazy f24c2e1103 [wip] reimplement SULD.D 2020-04-05 10:31:29 +07:00
namkazy 58bcb86af5 add shader stage when init shader ir 2020-04-05 10:31:29 +07:00
Nguyen Dac Nam 2cefdd92bd clang-fix 2020-04-05 10:31:28 +07:00
Nguyen Dac Nam 1f3d142875 shader: image - import PredCondition 2020-04-05 10:31:27 +07:00
Nguyen Dac Nam 08db60392d shader: SULD.D bits32 implement more complexer method. 2020-04-05 10:31:27 +07:00
Nguyen Dac Nam ed1d8beb13 shader: SULD.D import StoreType 2020-04-05 10:31:26 +07:00
Nguyen Dac Nam 6d235b8631 shader: implement SULD.D bits32 2020-04-05 10:31:26 +07:00
ReinUsesLisp 60106531b4 shader/other: Add error message for some S2R registers 2020-04-04 03:46:07 -03:00
ReinUsesLisp 8b719e9e1d shader_bytecode: Rename MOV_SYS to S2R 2020-04-04 03:37:51 -03:00
ReinUsesLisp 9d15feb892 shader_bytecode: Add encoding for BAR 2020-04-04 03:36:21 -03:00
ReinUsesLisp 16ae98dbb3 shader_ir: Add error message for EXIT.FCSM_TR 2020-04-04 03:34:08 -03:00
ReinUsesLisp c02a2dc24a shader_bytecode: Add encoding for VOTE.VTG 2020-04-04 03:28:11 -03:00
ReinUsesLisp 80c4fee4ec Revert "Merge pull request #3499 from ReinUsesLisp/depth-2d-array"
This reverts commit 41905ee467, reversing
changes made to 35145bd529.

It causes regressions in several games.
2020-04-04 00:02:26 -03:00
ReinUsesLisp e1bd89e1c2 shader/memory: Silence no return value warning
Silences a warning about control paths not all returning a value.
2020-04-02 03:34:27 -03:00
Rodrigo Locatti 825a6e2615
Merge pull request #3552 from jroweboy/single-context
Refactor Context management (Fixes renderdoc on opengl issues)
2020-04-02 01:38:25 -03:00
ReinUsesLisp 2339fe199f shader_decompiler: Remove FragCoord.w hack and change IPA implementation
Credits go to gdkchan and Ryujinx. The pull request used for this can
be found here: https://github.com/Ryujinx/Ryujinx/pull/1082

yuzu was already using the header for interpolation, but it was missing
the FragCoord.w multiplication described in the linked pull request.
This commit finally removes the FragCoord.w == 1.0f hack from the shader
decompiler.

While we are at it, this commit renames some enumerations to match
Nvidia's documentation (linked below) and fixes component declaration
order in the shader program header (z and w were swapped).

https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
2020-04-01 21:48:55 -03:00
ReinUsesLisp dd1232755b gl_texture_cache: Fix software ASTC fallback 2020-04-01 01:44:15 -03:00
ReinUsesLisp 2f0da10dc3 vk_device: Add missing ASTC queries 2020-04-01 01:14:04 -03:00
ReinUsesLisp b6571ca9f0 video_core: Use native ASTC when available 2020-04-01 01:14:04 -03:00
ReinUsesLisp 16270dcfe4 gl_device: Detect if ASTC is reported and expose it 2020-04-01 01:14:04 -03:00
Rodrigo Locatti baf91c920c
Merge pull request #3591 from ReinUsesLisp/vk-wrapper-part2
renderer_vulkan/wrapper: Add a Vulkan wrapper (part 2 of 2)
2020-03-31 22:14:26 -03:00
ReinUsesLisp f22f6b72c3 renderer_vulkan/wrapper: Add vkEnumerateInstanceExtensionProperties wrapper 2020-03-31 21:32:08 -03:00
ReinUsesLisp 27dd542c60 renderer_vulkan/wrapper: Add command buffer handle 2020-03-31 21:32:08 -03:00
ReinUsesLisp 5c90d060d8 renderer_vulkan/wrapper: Add physical device handle 2020-03-31 21:32:08 -03:00
ReinUsesLisp 0eb37de98f renderer_vulkan/wrapper: Add device handle 2020-03-31 21:32:08 -03:00
ReinUsesLisp 11774308d3 renderer_vulkan/wrapper: Add swapchain handle 2020-03-31 21:32:07 -03:00
ReinUsesLisp 7fe52ef77f renderer_vulkan/wrapper: Add fence handle 2020-03-31 21:32:07 -03:00
ReinUsesLisp 3a63ae0658 renderer_vulkan/wrapper: Add device memory handle 2020-03-31 21:32:07 -03:00
ReinUsesLisp 397f53dea1 renderer_vulkan/wrapper: Add pool handles 2020-03-31 21:32:07 -03:00
ReinUsesLisp affee77b70 renderer_vulkan/wrapper: Add buffer and image handles 2020-03-31 21:32:07 -03:00
ReinUsesLisp d85ca0ab33 renderer_vulkan/wrapper: Add queue handle 2020-03-31 21:32:07 -03:00
ReinUsesLisp 151ddcf419 renderer_vulkan/wrapper: Add instance handle 2020-03-31 21:32:07 -03:00
Fernando Sahmkow b03c0536ce
Merge pull request #3561 from ReinUsesLisp/f2f-conversion
shader/conversion: Fix F2F rounding operations with different sizes
2020-03-31 14:45:02 -04:00
Fernando Sahmkow 5b95a01463
Merge pull request #3577 from ReinUsesLisp/lea
shader/lea: Fix LEA implementation
2020-03-31 14:36:07 -04:00
ReinUsesLisp 1c5e2b60a7 gl_rasterizer: Mark cleared textures as dirty
Fixes a potential edge case where cleared textures read from the CPU
were not flushed.
2020-03-31 05:51:56 -03:00
Rodrigo Locatti c19425ed69
Merge pull request #3506 from namkazt/patch-9
shader_decode: Implement partial ATOM/ATOMS instr
2020-03-31 00:56:28 -03:00
Nguyen Dac Nam 238c35b2c9
clang-format 2020-03-31 08:08:06 +07:00
Nguyen Dac Nam defb9642da
shader_decode: fix by suggestion 2020-03-31 08:02:44 +07:00
Rodrigo Locatti 69728e8ad5
Merge pull request #3566 from ReinUsesLisp/vk-wrapper-part1
renderer_vulkan/wrapper: Add a Vulkan wrapper (part 1 of 2)
2020-03-30 21:57:36 -03:00
bunnei 4c72190a06
Merge pull request #3560 from ReinUsesLisp/fix-stencil
gl_rasterizer: Synchronize stencil testing on clears
2020-03-30 17:03:07 -04:00
namkazy cb0a4151f8 clang-format 2020-03-30 20:46:21 +07:00
namkazy c2665ec9c2 gl_decompiler: min/max op not implement yet 2020-03-30 18:48:22 +07:00
namkazy 4f7bea403a shader_decode: ATOM/ATOMS: add function to avoid code repetition 2020-03-30 18:47:50 +07:00
namkazy c8f6d9effd shader_decode: merge GlobalAtomicOp to AtomicOp 2020-03-30 18:47:00 +07:00
Nguyen Dac Nam 972485ff18 shader_decode: implement ATOM operation for S32 and U32 2020-03-30 17:44:48 +07:00
namkazy 93cac0d294 clang-format 2020-03-30 17:44:48 +07:00
Nguyen Dac Nam 3dc09a6250 shader_decode: implement ATOMS instr partial. 2020-03-30 17:44:46 +07:00
Nguyen Dac Nam a2cc80b605 vk_decompiler: add atomic op and handler function. 2020-03-30 17:44:45 +07:00
Nguyen Dac Nam 552f0ff267 gl_decompiler: add atomic op 2020-03-30 17:44:45 +07:00
Nguyen Dac Nam 2c780db5b9 shader: node - update correct comment 2020-03-30 17:44:44 +07:00
Nguyen Dac Nam c119473c40 shader_decode: add Atomic op for common usage 2020-03-30 17:44:44 +07:00
ReinUsesLisp 08470d261d shader_bytecode: Fix I2I_IMM encoding 2020-03-28 18:49:07 -03:00
ReinUsesLisp b6c9fba81c renderer_vulkan/wrapper: Address feedback 2020-03-28 04:09:02 -03:00
ReinUsesLisp 5300a918c6 shader/lea: Simplify generated LEA code 2020-03-28 03:55:04 -03:00
ReinUsesLisp 523a709bf1 shader/lea: Fix op_a and op_b usages
They were swapped.
2020-03-27 18:37:20 -03:00
ReinUsesLisp 796b3319e6 shader/lea: Remove const and use move when possible 2020-03-27 18:36:38 -03:00
Fernando Sahmkow 7a2f60df26
Merge pull request #3565 from ReinUsesLisp/image-format
engines/const_buffer_engine_interface: Store image format and types
2020-03-27 14:08:54 -04:00
ReinUsesLisp 2694552b7f renderer_vulkan/wrapper: Add owning handles 2020-03-27 03:21:04 -03:00
ReinUsesLisp 7413b30923 renderer_vulkan/wrapper: Add pool allocations owning templated class 2020-03-27 03:21:04 -03:00
ReinUsesLisp d8d392b39a renderer_vulkan/wrapper: Add owning handle templated class 2020-03-27 03:21:04 -03:00
ReinUsesLisp 60f351084a renderer_vulkan/wrapper: Add destroy and free overload set 2020-03-27 03:21:04 -03:00
ReinUsesLisp a9e4528d10 renderer_vulkan/wrapper: Add dispatch table and loaders 2020-03-27 03:21:04 -03:00
ReinUsesLisp 3f0b7673f0 renderer_vulkan/wrapper: Add exception class 2020-03-27 03:21:04 -03:00
ReinUsesLisp f5cee0e885 renderer_vulkan/wrapper: Add ToString function for VkResult 2020-03-27 03:21:03 -03:00
ReinUsesLisp 92c8d783b3 renderer_vulkan/wrapper: Add Vulakn wrapper and a span helper
The intention behind a Vulkan wrapper is to drop Vulkan-Hpp.

The issues with Vulkan-Hpp are:
- Regular breaks of the API.
- Copy constructors that do the same as the aggregates (fixed recently)
- External dynamic dispatch that is hard to remove
- Alias KHR handles with non-KHR handles making it impossible to use
smart handles on Vulkan 1.0 instances with extensions that were included
on Vulkan 1.1.
- Dynamic dispatchers silently change size depending on preprocessor
definitions. Different files will have different dispatch definitions,
generating all kinds of hard to debug memory issues.

In other words, Vulkan-Hpp is not "production ready" for our needs and
this wrapper aims to replace it without losing RAII and exception
safety.
2020-03-27 03:13:18 -03:00
ReinUsesLisp cedbe925cd engines/const_buffer_engine_interface: Store image format type
This information is required to properly implement SULD.B. It might also
be handy for all image operations, since it would allow us to implement
them on devices that require the image format to be specified (on
desktop, this would be AMD on OpenGL and Intel on OpenGL and Vulkan).
2020-03-27 00:36:22 -03:00
Dan 744b207d92 maxwell_to_vk: implement signedscaled vertex formats 2020-03-27 00:14:19 +01:00
James Rowe cf9c94d401 Address review and fix broken yuzu-tester build 2020-03-25 23:32:42 -06:00
ReinUsesLisp 46791c464a shader/conversion: Fix F2F rounding operations with different sizes
Rounding operations only matter when the conversion size of source and
destination is the same, i.e. .F16.F16, .F32.F32 and .F64.F64.

When there is a mismatch (.F16.F32), these bits are used for IEEE
rounding, we don't emulate this because GLSL and SPIR-V don't support
configuring it per operation.
2020-03-26 01:58:49 -03:00
ReinUsesLisp 7617e88fb2 gl_rasterizer: Update stencil test regardless of it being disabled 2020-03-26 01:08:14 -03:00
ReinUsesLisp c310cef615 gl_rasterizer: Synchronize stencil testing on clears 2020-03-26 00:51:47 -03:00
bunnei 23c7dda710
Merge pull request #3544 from makigumo/myfork/patch-2
xmad: fix clang build error
2020-03-25 19:29:16 -04:00
bunnei e6aff11057
Merge pull request #3520 from ReinUsesLisp/legacy-varyings
gl_shader_decompiler: Implement legacy varyings
2020-03-25 19:27:51 -04:00
James Rowe 282adfc70b Frontend/GPU: Refactor context management
Changes the GraphicsContext to be managed by the GPU core. This
eliminates the need for the frontends to fool around with tricky
MakeCurrent/DoneCurrent calls that are dependent on the settings (such
as async gpu option).

This also refactors out the need to use QWidget::fromWindowContainer as
that caused issues with focus and input handling. Now we use a regular
QWidget and just access the native windowHandle() directly.

Another change is removing the debug tool setting in FrameMailbox.
Instead of trying to block the frontend until a new frame is ready, the
core will now take over presentation and draw directly to the window if
the renderer detects that its hooked by NSight or RenderDoc

Lastly, since it was in the way, I removed ScopeAcquireWindowContext and
replaced it with a simple subclass in GraphicsContext that achieves the
same result
2020-03-24 21:03:42 -06:00
Fernando Sahmkow 497f593525
Merge pull request #3543 from ReinUsesLisp/gl-depth-range
gl_rasterizer: Use transformed viewport for depth ranges
2020-03-23 12:00:21 -04:00
makigumo 5a5c6d4ed8 xmad: fix clang build error 2020-03-23 00:09:31 +01:00
namkazy fc37672f26 apply replay logic to all writes. remove replay from MacroInterpreter::Send (@fincs) 2020-03-22 22:25:44 +07:00
namkazy f66743cd0c maxwell_3d: change declaration order 2020-03-22 13:41:16 +07:00
namkazy d4e93cf38c maxwell_3d: init shadow_state 2020-03-22 13:35:11 +07:00
ReinUsesLisp bdcedc8506 gl_rasterizer: Use transformed viewport for depth ranges
Implement depth ranges using the transformed viewport instead of the
generic one. This matches the current Vulkan implementation but doesn't
support negative depth ranges. An update to glad is required for this.
2020-03-22 03:26:07 -03:00
namkazy 22f4268c2f maxwell_3d: this seem more correct. 2020-03-22 12:02:54 +07:00
namkazy 7051dc1902 maxwell_3d: update comments for shadow ram usage 2020-03-22 11:35:26 +07:00
Nguyen Dac Nam 01af036c1f marco_interpreter: write hw value when shadow ram requested 2020-03-22 10:53:41 +07:00
Nguyen Dac Nam 63c2635e6f maxwell_3d: track shadow ram ctrl and hw reg value 2020-03-22 10:53:41 +07:00
Nguyen Dac Nam dbfbe352e0 maxwell_3d: implement MME shadow RAM 2020-03-22 10:53:35 +07:00
bunnei bdddbe2daa
Merge pull request #3505 from namkazt/patch-8
shader_decode: implement XMAD mode CSfu
2020-03-19 17:41:01 -04:00
ReinUsesLisp 38c1e77f01 vk_texture_cache: Silence misc warnings 2020-03-18 20:03:19 -03:00
ReinUsesLisp b6b2e31e5e vk_staging_buffer_pool: Silence unused constant warning 2020-03-18 20:03:19 -03:00
ReinUsesLisp fc51ece7bf vk_rasterizer: Remove unused variable 2020-03-18 20:03:19 -03:00
ReinUsesLisp 98d85cdc20 vk_pipeline_cache: Remove unused variable 2020-03-18 20:03:19 -03:00
ReinUsesLisp dab450ec46 maxwell_to_vk: Sielence -Wswitch warning 2020-03-18 20:03:19 -03:00
ReinUsesLisp 351816ac38 gl_shader_decompiler: Remove deprecated function and its usages 2020-03-18 20:03:19 -03:00
ReinUsesLisp acf328a71f gl_rasterizer: Silence misc warnings 2020-03-18 20:03:19 -03:00
ReinUsesLisp 9f46066bda kepler_compute: Remove unused variables 2020-03-18 20:03:19 -03:00
ReinUsesLisp 664fa4ea06 astc: Fix clang build issues 2020-03-18 04:30:25 -03:00
ReinUsesLisp f5658a9fda gl_shader_decompiler: Don't redeclare gl_VertexID and gl_InstanceID 2020-03-18 01:28:41 -03:00
Mat M edb9cccb36
Merge pull request #3510 from FernandoS27/dirty-write
DirtyFlags: relax need to set render_targets as dirty
2020-03-17 17:29:22 -04:00
Mat M f54d2d3114
Merge pull request #3509 from ReinUsesLisp/astc-opts
astc: General changes and optimizations
2020-03-17 17:28:49 -04:00
Mat M d787856621
Merge pull request #3518 from ReinUsesLisp/scissor-clears
vk_rasterizer: Implement scissor clears and layered clears
2020-03-17 17:27:15 -04:00
Mat M 9fdfd58f9f
Merge pull request #3519 from ReinUsesLisp/int-formats
maxwell_to_vk: Implement RG32 and RGB32 integer vertex formats
2020-03-17 17:26:16 -04:00
bunnei 1c45c8086e
Merge pull request #3498 from ReinUsesLisp/texel-fetch-glsl
gl_shader_decompiler: Add layer component to texelFetch
2020-03-17 10:53:38 -04:00
ReinUsesLisp 53d673a7d3 renderer_opengl: Move some logic to an anonymous namespace 2020-03-16 04:03:34 -03:00
ReinUsesLisp 311d2fc768 renderer_opengl: Detect Nvidia Nsight as a debugging tool
Use getenv to detect Nsight.
2020-03-16 03:59:08 -03:00
Rodrigo Locatti b16c8e0e8d
Merge pull request #3515 from ReinUsesLisp/vertex-vk-assert
vk_rasterizer: Fix vertex range assert
2020-03-15 21:26:54 -03:00
Rodrigo Locatti 7cc46a6faa
Merge pull request #3501 from ReinUsesLisp/rgba16-snorm
video_core: Implement RGBA16_SNORM
2020-03-15 21:24:53 -03:00
Rodrigo Locatti ddafc99776
Merge pull request #3502 from namkazt/patch-3
shader_decode: Reimplement BFE instructions
2020-03-15 21:23:04 -03:00
Rodrigo Locatti d64edf21bb
Merge pull request #3503 from makigumo/patch-2
maxwell_to_vk: add vertex format eA2B10G10R10UnormPack32
2020-03-15 21:21:38 -03:00
ReinUsesLisp 5afc397d52 gl_shader_decompiler: Implement legacy varyings
Legacy varyings are special attributes carried over in hardware from
the OpenGL 1 and OpenGL 2 days. These were generally used instead of the
generic attributes we use today. They are deprecated or removed from
most APIs, but Nvidia still ships them in hardware.

To implement these, this commit maps them 1:1 to OpenGL compatibility.
2020-03-15 21:03:59 -03:00
ReinUsesLisp 6442e02c5d shader/shader_ir: Track usage in input attribute and of legacy varyings 2020-03-15 21:01:52 -03:00
ReinUsesLisp 8e6e55d6f8 shader/shader_ir: Fix clip distance usage stores 2020-03-15 20:53:14 -03:00
ReinUsesLisp 464bd5fad7 shader/shader_ir: Change declare output attribute to a switch 2020-03-15 20:49:35 -03:00
Rodrigo Locatti 86b1f15d9a
Merge pull request #3512 from bunnei/fix-renderdoc
renderer_opengl: Keep frames synchronized when using a GPU debugger.
2020-03-15 19:28:43 -03:00
ReinUsesLisp 52acb7f9a0 maxwell_to_vk: Implement RG32 and RGB32 integer vertex formats 2020-03-15 18:51:49 -03:00
ReinUsesLisp 71cc772988 vk_rasterizer: Implement layered clears 2020-03-15 18:37:19 -03:00
makigumo f91046bf8d
vk_shader_decompiler: fix linux build 2020-03-15 18:00:14 +01:00
ReinUsesLisp a7131af7d6 vk_rasterizer: Fix vertex range assert
End can be equal to start in CalculateVertexArraysSize. This is quite
common when the vertex size is zero.
2020-03-15 04:04:17 -03:00
ReinUsesLisp 8baf98e439 vk_rasterizer: Reimplement clears with vkCmdClearAttachments 2020-03-15 03:40:41 -03:00
bunnei c5afe93dcc renderer_opengl: Keep presentation frames in lock-step when GPU debugging.
- Fixes renderdoc with OpenGL renderer.
2020-03-14 17:45:01 -04:00
bunnei 4373fa8042 gl_device: Add option to check GL_EXT_debug_tool. 2020-03-14 17:39:29 -04:00
bunnei 4dfd5c84ea
Merge pull request #3508 from FernandoS27/page-table
PageTable: move backing addresses to a children class as the CPU page table does not need them.
2020-03-14 16:50:27 -04:00
Fernando Sahmkow 380fc8d2e1 DirtyFlags: relax need to set render_targets as dirty
The texture cache already takes care of setting a render target to dirty 
when invalidated.
2020-03-14 11:47:33 -04:00
Fernando Sahmkow c51dbf8038
Merge pull request #3500 from ReinUsesLisp/incompatible-types
texture_cache: Report incompatible textures as black
2020-03-14 09:49:05 -04:00
Fernando Sahmkow 41905ee467
Merge pull request #3499 from ReinUsesLisp/depth-2d-array
texture_cache/surface_params: Force depth=1 on 2D textures
2020-03-14 09:48:39 -04:00
Fernando Sahmkow 27cbb75e7c PageTable: move backing addresses to a children class as the CPU page table does not need them.
This PR aims to reduce the memory usage in the CPU page table by moving
GPU specific parameters into a child class. This saves 1Gb of Memory for
most games.
2020-03-14 09:43:57 -04:00
ReinUsesLisp 42cb8f1124 astc: Fix typos from search and replace 2020-03-14 01:05:20 -03:00
ReinUsesLisp 9b8fb3c756 astc: Minor changes to InputBitStream 2020-03-14 00:45:54 -03:00
ReinUsesLisp d71d7d917e astc: Pass val in Replicate by copy 2020-03-14 00:13:58 -03:00
ReinUsesLisp 134f3ff9b4 astc: Call std::vector:reserve on decodedClolorValues to avoid reallocating 2020-03-14 00:09:56 -03:00
Nguyen Dac Nam 3287b1247d
clang-format 2020-03-14 10:07:40 +07:00
Nguyen Dac Nam 240d45830d
nit 2020-03-14 09:57:24 +07:00
ReinUsesLisp 3377b78ea7 astc: Call std::vector::reserve on texelWeightValues to avoid reallocating 2020-03-13 23:52:51 -03:00
ReinUsesLisp 801fd04f75 astc: Create a LUT at compile time for encoding values 2020-03-13 23:40:02 -03:00
ReinUsesLisp e183820956 astc: Make IntegerEncodedValue a trivial structure 2020-03-13 22:49:28 -03:00
ReinUsesLisp 70a31eda62 astc: Make IntegerEncodedValue constructor constexpr 2020-03-13 22:36:45 -03:00
ReinUsesLisp 5ed377b989 astc: Make IntegerEncodedValue trivially copyable 2020-03-13 22:30:31 -03:00
ReinUsesLisp e7d97605e8 astc: Rename C types to common_types 2020-03-13 22:28:51 -03:00
ReinUsesLisp 835a3d09c6 astc: Move Popcnt to an anonymous namespace and make it constexpr 2020-03-13 22:26:48 -03:00
ReinUsesLisp 731a9a322e astc: Use common types instead of stdint.h integer types 2020-03-13 22:22:27 -03:00
ReinUsesLisp d3dc4e399c astc: Use 'enum class' instead of 'enum' for EIntegerEncoding 2020-03-13 22:20:12 -03:00
ReinUsesLisp 69c7a01f88 vk/gl_shader_decompiler: Silence assertion on compute 2020-03-13 18:33:05 -03:00
ReinUsesLisp 62560f1e63 vk_shader_decompiler: Fix default varying regression 2020-03-13 18:33:05 -03:00
ReinUsesLisp afebdda203 maxwell_3d: Add padding words to XFB entries
Use INSERT_UNION_PADDING_WORDS instead of alignas to ensure a size
requirement.
2020-03-13 18:33:05 -03:00
ReinUsesLisp 4bc4851d45 gl_shader_decompiler: Fix implicit conversion errors 2020-03-13 18:33:05 -03:00
Rodrigo Locatti 47459f6a36 vk_shader_decompiler: Fix implicit type conversion
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-03-13 18:33:05 -03:00
ReinUsesLisp 2fae1e6205 vk_rasterizer: Implement transform feedback binding zero 2020-03-13 18:33:05 -03:00
ReinUsesLisp b67360c0f8 vk_shader_decompiler: Add XFB decorations to generic varyings 2020-03-13 18:33:05 -03:00
ReinUsesLisp 8d5bdcb17b vk_device: Enable VK_EXT_transform_feedback when available 2020-03-13 18:33:05 -03:00
ReinUsesLisp c320702092 vk_device: Shrink formatless capability name size 2020-03-13 18:33:05 -03:00
ReinUsesLisp ae6189d7c2 shader/transform_feedback: Expose buffer stride 2020-03-13 18:33:05 -03:00
ReinUsesLisp 7acebd7eb6 vk_shader_decompiler: Use registry for specialization 2020-03-13 18:33:05 -03:00
ReinUsesLisp 8e9f23f393 gl_rasterizer: Implement transform feedback bindings 2020-03-13 18:33:04 -03:00
ReinUsesLisp 4d711dface gl_shader_decompiler: Decorate output attributes with XFB layout
We sometimes have to slice attributes in different parts. This is needed
for example in instances where the game feedbacks 3 components but
writes 4 from the shader (something that is possible with
GL_NV_transform_feedback).
2020-03-13 18:33:04 -03:00
ReinUsesLisp 3dcaa84ba4 shader/transform_feedback: Add host API friendly TFB builder 2020-03-13 18:33:04 -03:00
Rodrigo Locatti 244fe13219
Merge branch 'master' into shader-purge 2020-03-13 16:44:06 -03:00
bunnei b30b1f741d
Merge pull request #3491 from ReinUsesLisp/polygon-modes
gl_rasterizer: Implement polygon modes and fill rectangles
2020-03-13 10:08:57 -04:00
Nguyen Dac Nam 829f424618
nit & remove some optional param 2020-03-13 20:47:38 +07:00
Nguyen Dac Nam a166217480
shader_decode: implement XMAD mode CSfu 2020-03-13 19:01:49 +07:00
makigumo 753bc2026f
fix formatting 2020-03-13 11:37:24 +01:00
makigumo 54681909be
maxwell_to_vk: add vertex format eA2B10G10R10UnormPack32 2020-03-13 11:26:13 +01:00
Nguyen Dac Nam 00607fe1e0
clang-format 2020-03-13 15:38:57 +07:00
Nguyen Dac Nam 325977c0c6
Apply suggestions from code review
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-03-13 15:35:15 +07:00
Nguyen Dac Nam 70ff82f72d
shader_decode: BFE add ref of reverse parallel method. 2020-03-13 14:20:18 +07:00
Nguyen Dac Nam 96a4abe12d
shader_decode: implement BREV on BFE
Implement reverse parallel follow: https://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
2020-03-13 14:13:31 +07:00
Nguyen Dac Nam 93547cac68
shader_bytecode: update BFE instructions struct. 2020-03-13 12:52:16 +07:00
Nguyen Dac Nam 911c56ccef
node_helper: add IBitfieldExtract case 2020-03-13 12:50:32 +07:00
Nguyen Dac Nam 465ba30d08
shader_decode: Reimplement BFE instructions 2020-03-13 12:48:01 +07:00
ReinUsesLisp e24197bb3f gl_shader_decompiler: Initialize gl_Position on vertex shaders 2020-03-12 23:31:06 -03:00
Fernando Sahmkow 00e9ba0603
Merge pull request #3483 from namkazt/patch-1
vk_rasterizer: fix mistype on SetupGraphicsImages
2020-03-12 22:10:48 -04:00
Fernando Sahmkow f159a12820
Merge pull request #3480 from ReinUsesLisp/vk-disabled-ubo
vk_rasterizer: Support disabled uniform buffers
2020-03-12 22:09:49 -04:00
ReinUsesLisp 3a10016e38 gl_shader_decompiler: Add missing {} on smem GLSL emission 2020-03-12 21:50:37 -03:00
ReinUsesLisp 4dcca90ef4 video_core: Implement RGBA16_SNORM
Implement RGBA16_SNORM with the current API. Nothing special here.
2020-03-12 21:42:33 -03:00
ReinUsesLisp e22816a5bb texture_cache: Report incompatible textures as black
Some games bind incompatible texture types to certain types.
For example Astral Chain binds a 2D texture with 1 layer (non-array) to
a cubemap slot (that's how it's used in the shader). After testing this
in hardware, the expected "undefined behavior" is to report all pixels
as black.

We already have a path for reporting black textures in the texture
cache. When textures types are incompatible, this commit binds these
kind of textures. This is done on the API agnostic texture cache so no
extra code has to be inserted on OpenGL or Vulkan.

As a side effect, this fixes invalidations of ASTC textures on Astral
Chain. This happened because yuzu detected a cube texture and forced
6 faces, generating a texture larger than what the TIC reported.
2020-03-12 18:22:05 -03:00
ReinUsesLisp daae6a323b texture_cache/surface_params: Force depth=1 on 2D textures
Sometimes games will sample a 2D array TIC with a 2D access in the
shader. This causes bad interactions with the rest of the texture cache.
To emulate what the game wants to do, force a depth=1 on 2D textures
(not 2D arrays) and let the texture cache handle the rest.
2020-03-12 18:11:42 -03:00
ReinUsesLisp 38fe070d78 gl_shader_decompiler: Add layer component to texelFetch
TexelFetch was not emitting the array component generating invalid GLSL.
2020-03-12 18:10:29 -03:00
ReinUsesLisp 825d629565 gl_shader_decompiler: Fix regression in render target declarations
A previous commit introduced a way to declare as few render targets as
possible. Turns out this introduced a regression in some games.
2020-03-12 05:01:20 -03:00
ReinUsesLisp 8357908099 gl_shader_manager: Fix interaction between graphics and compute
After a compute shader was set to the pipeline, no graphics shader was
invoked again. To address this use glUseProgram to bind compute shaders
(without state tracking) and call glUseProgram(0) when transitioning out
of it back to the graphics pipeline.
2020-03-11 01:04:52 -03:00
ReinUsesLisp e4bc3c3342 gl_rasterizer: Implement polygon modes and fill rectangles 2020-03-09 20:39:58 -03:00
ReinUsesLisp eb5861e0a2 engines/maxwell_3d: Add TFB registers and store them in shader registry 2020-03-09 18:40:53 -03:00
ReinUsesLisp b1acb4f73f shader/registry: Address feedback 2020-03-09 18:40:53 -03:00
ReinUsesLisp b1061afed9 gl_shader_decompiler: Add identifier to decompiled code 2020-03-09 18:40:53 -03:00
ReinUsesLisp e612242977 gl_shader_decompiler: Roll back to GLSL core 430
RenderDoc won't build shaders if we use GLSL compatibility.
2020-03-09 18:40:53 -03:00
ReinUsesLisp 978172530e const_buffer_engine_interface: Store component types
This is required for Vulkan. Sampling integer textures with float
handles is illegal.
2020-03-09 18:40:53 -03:00
ReinUsesLisp 120f688272 yuzu/loading_screen: Remove unused shader progress mode 2020-03-09 18:40:53 -03:00
ReinUsesLisp e1932351a9 gl_shader_cache: Reduce registry consistency to debug assert
Registry consistency is something that practically can't happen and it
has a measurable runtime cost. Reduce it to a DEBUG_ASSERT.
2020-03-09 18:40:07 -03:00
ReinUsesLisp 66a8a3e887 shader/registry: Cache tessellation state 2020-03-09 18:40:07 -03:00
ReinUsesLisp 0528be5c92 shader/registry: Store graphics and compute metadata
Store information GLSL forces us to provide but it's dynamic state in
hardware (workgroup sizes, primitive topology, shared memory size).
2020-03-09 18:40:07 -03:00
ReinUsesLisp e8efd5a901 video_core: Rename "const buffer locker" to "registry" 2020-03-09 18:40:06 -03:00
ReinUsesLisp bd8b9bbcee gl_shader_cache: Rework shader cache and remove post-specializations
Instead of pre-specializing shaders and then post-specializing them,
drop the later and only "specialize" the shader while decoding it.
2020-03-09 18:40:06 -03:00
Rodrigo Locatti 22e825a3bc
Merge pull request #3301 from ReinUsesLisp/state-tracker
video_core: Remove gl_state and use a state tracker based on dirty flags
2020-03-09 18:34:37 -03:00
ReinUsesLisp 1aa75b1081 textures: Fix anisotropy hack
Previous code could generate an anisotropy value way higher than x16.
2020-03-08 15:59:38 -03:00
bunnei 84e9f9f395
Merge pull request #3452 from Morph1984/anisotropic-filtering
frontend/Graphics: Add "Advanced" graphics tab and experimental Anisotropic Filtering support
2020-03-07 22:28:35 -05:00
Nguyen Dac Nam 16cfbb068c
vk_reasterizer: fix mistype on SetupGraphicsImages
This should use Maxwell3D engine. Fixed some GPU error on Kirby and maybe other games.
2020-03-08 10:06:59 +07:00
bunnei 662feb8c1c
Merge pull request #3481 from ReinUsesLisp/abgr5-storage
maxwell_to_vk: Remove Storage capability for A1B5G5R5U
2020-03-07 19:51:33 -05:00
ReinUsesLisp e4f9ce0379 vk_rasterizer: Support disabled uniform buffers 2020-03-06 18:47:51 -03:00
ReinUsesLisp aa6fe3f1aa maxwell_to_vk: Remove Storage capability for A1B5G5R5U 2020-03-06 18:47:27 -03:00
bunnei 49eff536d0
Merge pull request #3463 from ReinUsesLisp/vk-toctou
vk_swapchain: Silence TOCTOU race condition
2020-03-05 19:38:42 -05:00
bunnei 0361aa1915
Merge pull request #3451 from ReinUsesLisp/indexed-textures
vk_shader_decompiler: Implement indexed textures
2020-03-05 11:42:46 -05:00
bunnei fa1d625eed
Merge pull request #3469 from namkazt/patch-1
shader_decode: Fix LD, LDG when track constant buffer
2020-03-04 23:10:01 -05:00
bunnei 67e7186d79
Merge pull request #3455 from ReinUsesLisp/attr-scaled
video_core: Implement more scaled attribute formats
2020-03-03 22:46:20 -05:00
Nguyen Dac Nam 85a4222a8c
nit: move comment to right place. 2020-02-29 13:50:10 +07:00
ReinUsesLisp 735c003a70 video_core/dirty_flags: Address feedback 2020-02-28 17:56:43 -03:00
ReinUsesLisp ef7f6eb67d renderer_opengl: Fix edge-case where alpha testing might cull presentation 2020-02-28 17:56:43 -03:00
ReinUsesLisp a6a350ddc3 gl_texture_cache: Remove blending disable on blits
Blending doesn't affect blits. Rasterizer discard does, update the
commentaries.
2020-02-28 17:56:43 -03:00
ReinUsesLisp 887d5288ef gl_rasterizer: Don't disable blending on clears
Blending doesn't affect clears.
2020-02-28 17:56:43 -03:00
ReinUsesLisp ac204754d4 dirty_flags: Deduplicate code between OpenGL and Vulkan 2020-02-28 17:56:43 -03:00
ReinUsesLisp 6669b359a3 vk_rasterizer: Pass Maxwell registers to dynamic updates 2020-02-28 17:56:43 -03:00
ReinUsesLisp 042256c6bb state_tracker: Remove type traits with named structures 2020-02-28 17:56:43 -03:00
ReinUsesLisp 6ac3eb4d87 vk_state_tracker: Implement dirty flags for stencil properties 2020-02-28 17:56:43 -03:00
ReinUsesLisp f9df2c6bcd vk_state_tracker: Implement dirty flags for depth bounds 2020-02-28 17:56:43 -03:00
ReinUsesLisp cd0e28c9ec vk_state_tracker: Implement dirty flags for blend constants 2020-02-28 17:56:43 -03:00
ReinUsesLisp a33870996b vk_state_tracker: Implement dirty flags for depth bias 2020-02-28 17:56:43 -03:00
ReinUsesLisp 42f1874965 vk_state_tracker: Implement dirty flags for scissors 2020-02-28 17:56:43 -03:00
ReinUsesLisp 1bd95a314f vk_state_tracker: Initial implementation
Add support for render targets and viewports.
2020-02-28 17:56:43 -03:00
ReinUsesLisp b1498d2c54 gl_rasterizer: Remove num vertex buffers magic number 2020-02-28 17:56:43 -03:00
ReinUsesLisp 62437943a7 gl_rasterizer: Only apply polygon offset clamp if enabled 2020-02-28 17:56:43 -03:00
ReinUsesLisp 2eeea90713 gl_state_tracker: Implement dirty flags for depth clamp enabling 2020-02-28 17:56:43 -03:00
ReinUsesLisp 3ce66776ec gl_rasterizer: Disable scissor 0 when scissor is not used on clear 2020-02-28 17:56:43 -03:00
ReinUsesLisp 35bb9239ca gl_rasterizer: Notify depth mask changes on clear 2020-02-28 17:56:43 -03:00
ReinUsesLisp 98c8948b23 gl_rasterizer: Minor sort changes to clearing 2020-02-28 17:56:42 -03:00
ReinUsesLisp 15cadc3948 maxwell_3d: Use two tables instead of three for dirty flags 2020-02-28 17:56:42 -03:00
ReinUsesLisp a5bfc0d045 gl_state_tracker: Track state of index buffers 2020-02-28 17:56:42 -03:00
ReinUsesLisp a42a6e1a2c gl_state_tracker: Implement dirty flags for clip control 2020-02-28 17:56:42 -03:00
ReinUsesLisp 4f8d152b18 gl_state_tracker: Implement dirty flags for point sizes 2020-02-28 17:56:42 -03:00
ReinUsesLisp 231601763c gl_state_tracker: Implement dirty flags for fragment color clamp 2020-02-28 17:56:42 -03:00
ReinUsesLisp bf1a1d989f gl_state_tracker: Implement dirty flags for logic op 2020-02-28 17:56:42 -03:00
ReinUsesLisp 13afd0e5b0 gl_state_tracker: Implement dirty flags for sRGB 2020-02-28 17:56:42 -03:00
ReinUsesLisp d8f5c45051 gl_state_tracker: Implement dirty flags for rasterize enable 2020-02-28 17:56:42 -03:00
ReinUsesLisp b727d99441 gl_state_tracker: Implement dirty flags for multisample 2020-02-28 17:56:42 -03:00
ReinUsesLisp 3c22bd92d8 gl_state_tracker: Implement dirty flags for alpha testing 2020-02-28 17:56:42 -03:00
ReinUsesLisp 9e46953580 gl_state_tracker: Implement dirty flags for polygon offsets 2020-02-28 17:56:42 -03:00
ReinUsesLisp 46a1888e02 gl_state_tracker: Implement dirty flags for primitive restart 2020-02-28 17:56:42 -03:00
ReinUsesLisp 37536d7a49 gl_state_tracker: Implement dirty flags for stencil testing 2020-02-28 17:56:42 -03:00
ReinUsesLisp 40a2c57df5 gl_state_tracker: Implement depth dirty flags 2020-02-28 17:56:42 -03:00
ReinUsesLisp b910a83a47 gl_state_tracker: Implement dirty flags for front face and culling 2020-02-28 17:56:42 -03:00
ReinUsesLisp b01dd7d1c8 gl_state_tracker: Implement dirty flags for blending 2020-02-28 17:56:42 -03:00
ReinUsesLisp f7ec078592 gl_state_tracker: Implement dirty flags for clip distances and shaders 2020-02-28 17:56:42 -03:00
ReinUsesLisp 758ad3f75d gl_state_tracker: Add dirty flags for buffers and divisors 2020-02-28 17:56:42 -03:00
ReinUsesLisp 9b08698a0c maxwell_3d: Change write dirty flags to a bitset 2020-02-28 17:56:42 -03:00
ReinUsesLisp 69ad6279e4 gl_state_tracker: Implement dirty flags for vertex formats 2020-02-28 17:56:42 -03:00
ReinUsesLisp 6530144ccb gl_state_tracker: Implement dirty flags for color masks 2020-02-28 17:56:42 -03:00
ReinUsesLisp ba6f390448 gl_state_tracker: Implement dirty flags for scissors 2020-02-28 17:56:42 -03:00
ReinUsesLisp 7f52efdf61 gl_state_tracker: Implement dirty flags for viewports 2020-02-28 17:56:41 -03:00
ReinUsesLisp dacf83ac02 renderer_opengl: Reintroduce dirty flags for render targets 2020-02-28 17:56:41 -03:00
ReinUsesLisp 9e74e6988b maxwell_3d: Flatten cull and front face registers 2020-02-28 17:56:41 -03:00
ReinUsesLisp eed789d0d1 video_core: Reintroduce dirty flags infrastructure 2020-02-28 17:56:41 -03:00
ReinUsesLisp b92dfcd7f2 gl_state: Remove completely 2020-02-28 17:56:35 -03:00
ReinUsesLisp 1c4bf9cbfa gl_state: Remove program tracking 2020-02-28 17:52:14 -03:00
ReinUsesLisp 5ccb07933a gl_state: Remove framebuffer tracking 2020-02-28 17:52:10 -03:00
ReinUsesLisp 17a7fa751b gl_state: Remove image tracking 2020-02-28 17:36:40 -03:00
ReinUsesLisp 9677db03da gl_state: Remove texture and sampler tracking 2020-02-28 17:35:58 -03:00
ReinUsesLisp 1bc0da3dea gl_state: Remove blend state tracking 2020-02-28 17:34:43 -03:00
ReinUsesLisp 7d9a5e9e30 gl_state: Remove stencil test tracking 2020-02-28 17:32:05 -03:00
ReinUsesLisp 07a954e67f gl_state: Remove clip control tracking 2020-02-28 17:31:57 -03:00
ReinUsesLisp 1eee891f6e gl_state: Remove clip distances tracking 2020-02-28 17:26:26 -03:00
ReinUsesLisp e8125af8dd gl_state: Remove rasterizer disable tracking 2020-02-28 17:25:28 -03:00
ReinUsesLisp d3e433a380 gl_state: Remove viewport and depth range tracking 2020-02-28 17:25:18 -03:00
ReinUsesLisp 7c16b3551b gl_state: Remove scissor test tracking 2020-02-28 17:00:23 -03:00
ReinUsesLisp 0914c70b7f gl_state: Remove color mask tracking 2020-02-28 16:59:17 -03:00
ReinUsesLisp 2392b548be gl_state: Remove clamp framebuffer color tracking
This commit doesn't reset it for screen draws because clamping doesn't
change anything there.
2020-02-28 16:58:30 -03:00
ReinUsesLisp f92236976b gl_state: Remove multisample tracking 2020-02-28 16:57:47 -03:00
ReinUsesLisp 04d1134191 gl_state: Remove framebuffer sRGB tracking 2020-02-28 16:55:23 -03:00
ReinUsesLisp d5ab0358b6 gl_state: Remove VAO cache and tracking 2020-02-28 16:54:37 -03:00
ReinUsesLisp 2a662fea36 gl_state: Remove depth clamp tracking 2020-02-28 16:53:35 -03:00
ReinUsesLisp e1a16a52fa gl_state: Remove depth tracking 2020-02-28 16:52:46 -03:00
ReinUsesLisp 0f343d32c4 gl_state: Remove primitive restart tracking 2020-02-28 16:51:45 -03:00
ReinUsesLisp 42708c762e gl_state: Remove logic op tracker 2020-02-28 16:51:23 -03:00
ReinUsesLisp 915d73f3b8 gl_state: Remove blend color tracking 2020-02-28 16:50:58 -03:00
ReinUsesLisp a0321b984f gl_state: Remove polygon offset tracking 2020-02-28 16:49:20 -03:00
ReinUsesLisp f646321dd0 gl_state: Remove alpha test tracking 2020-02-28 16:48:57 -03:00
ReinUsesLisp c8f5f54a44 gl_state: Remove cull mode tracking 2020-02-28 16:48:23 -03:00
ReinUsesLisp 925521da5f gl_state: Remove front face tracking 2020-02-28 16:47:59 -03:00
ReinUsesLisp d2d5554296 gl_state: Remove point size tracking 2020-02-28 16:39:44 -03:00
ReinUsesLisp b95f064b51 gl_rasterizer: Add oglEnablei helper 2020-02-28 16:39:44 -03:00
ReinUsesLisp 1698143a1d gl_rasterizer: Add OpenGL enable/disable helper 2020-02-28 16:39:44 -03:00
ReinUsesLisp 96ac3d518a gl_rasterizer: Remove dirty flags 2020-02-28 16:39:27 -03:00
bunnei 5056d23d0d renderer_opengl: Fix SRGB presentation frame tracking.
- Fixes SRGB in Super Smash Bros. Ultimate.
2020-02-28 01:13:38 -05:00
Nguyen Dac Nam 6c0c2dfabc shader_decode: Fix LD, LDG when track constant buffer 2020-02-28 13:11:19 +07:00
Morph 7ee6065178 Create an "Advanced" tab in the graphics configuration tab and add anisotropic filtering levels. 2020-02-27 21:34:00 -05:00
bunnei 969357af1a
Merge pull request #3430 from bunnei/split-presenter
Port citra-emu/citra#4940: "Split Presentation thread from Render thread"
2020-02-27 19:51:55 -05:00
bunnei ebbfe73557 renderer_opengl: Reduce swap chain size to 3. 2020-02-27 19:50:17 -05:00
Nguyen Dac Nam db2f547434
shader: FMUL switch to using LUT (#3441)
* shader: add FmulPostFactor LUT table

* shader: FMUL apply LUT

* Update src/video_core/engines/shader_bytecode.h

Co-Authored-By: Mat M. <mathew1800@gmail.com>

* nit: mistype

* clang-format & add missing import

* shader: remove post factor LUT.

* shader: move post factor LUT to function and fix incorrect order.

* clang-format

* shader: FMUL: add static to post factor LUT

* nit: typo

Co-authored-by: Mat M. <mathew1800@gmail.com>
2020-02-27 11:14:25 -05:00
bunnei a17214baea renderer_opengl: Use more concise lock syntax. 2020-02-26 18:35:35 -05:00
bunnei aef159354c renderer_opengl: Move Frame/FrameMailbox to OpenGL namespace. 2020-02-26 18:28:50 -05:00
ReinUsesLisp 0aaa69e4d7 vk_swapchain: Silence TOCTOU race condition
It's possible that the window is resized from the moment we ask for its
size to the moment a swapchain is created, causing validation issues.

To workaround this Vulkan issue request the capabilities again just
before creating the swapchain, making the race condition less likely.
2020-02-26 17:07:18 -03:00
bunnei 1f57f679a4
Merge pull request #3440 from namkazt/patch-6
shader: implement LOP3 fast replace for old function
2020-02-26 10:24:35 -05:00
bunnei 795893a9a5 renderer_opengl: Create gl_framebuffer_data if empty. 2020-02-25 21:23:02 -05:00
bunnei e25297536f frontend: qt: bootmanager: Vulkan: Restore support for VK backend. 2020-02-25 21:23:01 -05:00
bunnei 667f026c95 core: frontend: Refactor scope_acquire_window_context to scope_acquire_context. 2020-02-25 21:23:00 -05:00
bunnei dc672ca4b3 renderer_opengl: Add texture mailbox support for presenter thread. 2020-02-25 21:22:59 -05:00
bunnei add2c38b73 renderer_opengl: Add OGLRenderbuffer to resource/state management. 2020-02-25 21:22:58 -05:00
Mat M 45ac1c62c6
Merge pull request #3461 from ReinUsesLisp/r32i-rt
video_core/surface: Add R32_SINT render target format
2020-02-25 17:47:14 -05:00
Mat M 00e3eab9c1
Merge pull request #3460 from ReinUsesLisp/unused-format-getter
video_core/gpu: Remove unused functions
2020-02-25 17:46:07 -05:00
ReinUsesLisp 466ce715e4 video_core/surface: Add R32_SINT render target format 2020-02-25 17:19:34 -03:00
ReinUsesLisp 3c648e3e2d video_core/gpu: Remove unused functions 2020-02-25 16:53:47 -03:00
bunnei 78ab2e0474
Merge pull request #3417 from ReinUsesLisp/r32i
texture: Implement R32I
2020-02-25 14:08:45 -05:00
bunnei e22ad52cdb
Merge pull request #3425 from ReinUsesLisp/layered-framebuffer
texture_cache: Implement layered framebuffer attachments
2020-02-24 10:14:50 -05:00
ReinUsesLisp 1e9213632a vk_shader_decompiler: Implement indexed textures
Implement accessing textures through an index. It uses the same
interface as OpenGL, the main difference is that Vulkan bindings are
forced to be arrayed (the binding index doesn't change for stacked
textures in SPIR-V).
2020-02-24 01:26:07 -03:00
ReinUsesLisp 1dda77d392 shader: Simplify indexed sampler usages 2020-02-24 01:26:07 -03:00
ReinUsesLisp e2dd59e341 video_core: Implement more scaler attribute formats
While changing this, fix assert in vk_shader_decompiler. We now know
scaled formats are expected to be float in shaders attributes.
2020-02-24 00:27:37 -03:00
bunnei 2b4cdb73b6
Merge pull request #3424 from ReinUsesLisp/spirv-layer
vk_shader_decompiler: Implement Layer output attribute
2020-02-22 23:45:16 -05:00
bunnei 754aac331f
Merge pull request #3422 from ReinUsesLisp/buffer-flush
surface_base: Implement texture buffer flushes
2020-02-22 23:09:50 -05:00
ReinUsesLisp 7dc488a375 shader/texture: Fix illegal 3D texture assert
Fix typo in the illegal 3D texture assert logic. We care about catching
arrayed 3D textures or 3D shadow textures, not regular 3D textures.
2020-02-21 15:57:27 -03:00
Rodrigo Locatti 4a6a1aeab4
Merge pull request #3433 from namkazt/patch-1
renderer_vulkan: Add the rest of case for TryConvertBorderColor
2020-02-21 15:56:09 -03:00
Rodrigo Locatti ef27b4b7b5
Merge pull request #3434 from namkazt/patch-2
vk_shader: Implement ImageLoad
2020-02-21 15:55:05 -03:00
Rodrigo Locatti 6b2719c0bb
Merge pull request #3435 from namkazt/patch-3
vulkan: add DXT23_SRGB
2020-02-21 15:48:19 -03:00
bunnei dc7ebc2d01
Merge pull request #3423 from ReinUsesLisp/no-match-3d
texture_cache: Avoid matches in 3D textures
2020-02-21 12:16:51 -05:00
Nguyen Dac Nam 10d8afb302
nit: add const to where it need. 2020-02-21 21:16:45 +07:00
Nguyen Dac Nam 1956a34ee5
shader: implement LOP3 fast replace for old function
ref: https://devtalk.nvidia.com/default/topic/1070081/cuda-programming-and-performance/reverse-lut-for-lop3-lut/
2020-02-21 19:08:07 +07:00
Nguyen Dac Nam c0c4da27d9
vk_device: remove left over from other branch 2020-02-21 08:56:18 +07:00
bunnei fe8e5d8ae4
Merge pull request #3438 from bunnei/gpu-mem-manager-fix
video_core: memory_manager: Flush/invalidate asynchronously when possible.
2020-02-20 20:04:05 -05:00
Nguyen Dac Nam ecf275887b
clang-format 2020-02-20 09:39:30 +07:00
Nguyen Dac Nam fbbad95845
shader_decompiler: only add StorageImageReadWithoutFormat when available 2020-02-20 09:28:13 +07:00
bunnei bf0c929d4c
Merge pull request #3415 from ReinUsesLisp/texture-code
shader/texture: Allow 2D shadow arrays and simplify code
2020-02-19 20:06:14 -05:00
bunnei d65fa7d65c video_core: memory_manager: Flush/invalidate asynchronously on Unmap.
- Minor perf improvement.
2020-02-19 20:03:52 -05:00
bunnei b2bc7682b4
Merge pull request #3414 from ReinUsesLisp/maxwell-3d-draw
maxwell_3d: Unify draw methods
2020-02-19 16:13:50 -05:00
bunnei c8261a1a57
Merge pull request #3411 from ReinUsesLisp/specific-funcs
gl_rasterizer: Use the least generic OpenGL draw function possible
2020-02-19 15:37:41 -05:00
Nguyen Dac Nam 88cb05e6e7
shader_decompiler: add check in case of device not support ShaderStorageImageReadWithoutFormat 2020-02-19 12:57:22 +07:00
Nguyen Dac Nam e61c7e9310
vk_device: setup shaderStorageImageReadWithoutFormat 2020-02-19 12:56:36 +07:00
Nguyen Dac Nam 47106ab152
vk_device: add check for shaderStorageImageReadWithoutFormat 2020-02-19 12:55:56 +07:00
Nguyen Dac Nam 1b6308727c
shader_conversion: I2F : add Assert for case src_size is Short 2020-02-19 11:40:35 +07:00
Nguyen Dac Nam a2c2c5768f
fix warning 2020-02-19 11:10:26 +07:00
Nguyen Dac Nam a8508f2bc0
clang-format fix 2020-02-19 11:02:59 +07:00
Nguyen Dac Nam 556f3a6e9a
shader_conversion: add conversion I2F for Short 2020-02-19 10:54:37 +07:00
bunnei e545c2322c
Merge pull request #3410 from ReinUsesLisp/vk-draw-index
vk_shader_decompiler: Fix vertex id and instance id
2020-02-18 22:37:33 -05:00
Nguyen Dac Nam 2ef8af93aa
vk_shader: add Capability StorageImageReadWithoutFormat 2020-02-19 10:16:51 +07:00
Nguyen Dac Nam f6f0762e81 vk_shader: Implement function ImageLoad (Used by Kirby Start Allies)
Please enter the commit message for your changes. Lines starting
2020-02-19 08:39:01 +07:00
Nguyen Dac Nam ec206f7f95
fixups mistake auto commit. 2020-02-19 01:24:32 +07:00
Nguyen Dac Nam eaf60ca5d8
Update code structure
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-02-19 01:23:08 +07:00
Fernando Sahmkow 93acfbd3a5
Merge pull request #3409 from ReinUsesLisp/host-queries
query_cache: Implement a query cache and query 21 (samples passed)
2020-02-18 11:31:06 -04:00
Nguyen Dac Nam 9295966d26
add vertex UnsignedInt size RGBA 2020-02-18 21:52:51 +07:00
Nguyen Dac Nam 9fc42fffd9
add eBc2SrgbBlock to formats 2020-02-18 21:44:09 +07:00
Nguyen Dac Nam 493f0ad904
vulkan: add DXT23_SRGB 2020-02-18 21:39:50 +07:00
Nguyen Dac Nam ba84f0988f
renderer_vulkan: Add the rest of case for TryConvertBorderColor 2020-02-18 16:52:54 +07:00
ReinUsesLisp 6a0220b2e1 texture_cache: Implement layered framebuffer attachments
Layered framebuffer attachments is a feature that allows applications to
write attach layered textures to a single attachment. What layer the
fragments are written to is decided from the shader using gl_Layer.
2020-02-16 04:19:32 -03:00
ReinUsesLisp 1caf3f11c8 vk_shader_decompiler: Implement Layer output attribute
SPIR-V's Layer is GLSL's gl_Layer. It lets the application choose from a
shader stage (vertex, tessellation or geometry) which framebuffer layer
write the output fragments to.
2020-02-16 04:17:37 -03:00
ReinUsesLisp bfda5ff3f6 texture_cache: Avoid matches in 3D textures
Code before this commit was trying to match 3D textures with another
target. Fix that.
2020-02-16 04:15:42 -03:00
ReinUsesLisp fd62bdf377 surface_base: Implement texture buffer flushes
Implement downloads to guest memory from texture buffers on the generic
cache and OpenGL.
2020-02-16 04:13:27 -03:00
bunnei 0f70f68fb3
Revert "video_core: memory_manager: Use GPU interface for cache functions." 2020-02-15 17:47:15 -05:00
ReinUsesLisp 14c2a4a2ec texture: Implement R32I 2020-02-15 16:26:50 -03:00
ReinUsesLisp 6910ade146 shader/texture: Allow 2D shadow arrays and simplify code
Shadow sampler 2D arrays are supported on OpenGL, so there's no reason
to forbid these. Enable textureLod usage on these.

Minor style changes.
2020-02-15 02:36:28 -03:00
ReinUsesLisp 91aa58e410 maxwell_3d: Unify draw methods
Pass instanced state of a draw invocation as an argument instead of
having two separate virtual methods.
2020-02-14 18:09:40 -03:00
ReinUsesLisp 6d3a046caa query_cache: Address feedback 2020-02-14 17:38:27 -03:00
ReinUsesLisp 54a00ee4cf query_cache: Fix ambiguity in CacheAddr getter 2020-02-14 17:38:27 -03:00
ReinUsesLisp cc0694559f query_cache: Add a recursive mutex for concurrent usage 2020-02-14 17:38:27 -03:00
ReinUsesLisp bcd348f238 vk_query_cache: Implement generic query cache on Vulkan 2020-02-14 17:38:27 -03:00
ReinUsesLisp c31382ced5 query_cache: Abstract OpenGL implementation
Abstract the current OpenGL implementation into the VideoCommon
namespace and reimplement it on top of that. Doing this avoids repeating
code and logic in the Vulkan implementation.
2020-02-14 17:38:27 -03:00
ReinUsesLisp 73d2d3342d gl_query_cache: Optimize query cache
Use a custom cache instead of relying on a ranged cache.
2020-02-14 17:38:27 -03:00
ReinUsesLisp aae8c180cb gl_query_cache: Implement host queries using a deferred cache
Instead of waiting immediately for executed commands, defer the query
until the guest CPU reads it. This way we get closer to what the guest
program is doing.

To archive this we have to build a dependency queue, because host APIs
(like OpenGL and Vulkan) use ranged queries instead of counters like
NVN.

Waiting for queries implicitly uses fences and this requires a command
being queued, otherwise the driver will lock waiting until a timeout. To
fix this when there are no commands queued, we explicitly call glFlush.
2020-02-14 17:33:13 -03:00
ReinUsesLisp ef9920e164 gl_rasterizer: Sort method declarations 2020-02-14 17:27:17 -03:00
ReinUsesLisp fe1238be7a gl_rasterizer: Add queued commands counter
Keep track of the queued OpenGL commands that can signal a fence if
waited on. As a side effect, we avoid calls to glFlush when no commands
are queued.
2020-02-14 17:27:17 -03:00
ReinUsesLisp 2b58652f08 maxwell_3d: Slow implementation of passed samples (query 21)
Implements GL_SAMPLES_PASSED by waiting immediately for queries.
2020-02-14 17:27:17 -03:00
bunnei 63a59b9935
Merge pull request #3379 from ReinUsesLisp/cbuf-offset
shader/decode: Fix constant buffer offsets
2020-02-14 13:22:53 -05:00
ReinUsesLisp 3217400dd1 gl_resource_manager: Add managed query class 2020-02-13 22:25:55 -03:00
bunnei 3563af2364
Merge pull request #3395 from FernandoS27/queries
GPU: Refactor queries implementation and correct GPU Clock.
2020-02-13 20:18:26 -05:00
ReinUsesLisp 336a4f8e99 gl_rasterizer: Use the least generic OpenGL draw function possible
This may help some implementations.
2020-02-13 21:55:21 -03:00
ReinUsesLisp cbea8c74de vk_shader_decompiler: Fix vertex id and instance id
Vulkan's VertexIndex and InstanceIndex don't match with hardware. This
is because Nvidia implements gl_VertexID and gl_InstanceID. The math
that relates these is:

gl_VertexIndex = gl_BaseVertex + gl_VertexID
gl_InstanceIndex = gl_InstanceIndex + gl_InstanceID

To emulate it using what Vulkan's SPIR-V offers (the *Index variants)
this commit substracts gl_Base* from gl_*Index to obtain the OpenGL and
hardware's equivalent.
2020-02-13 20:25:28 -03:00
Fernando Sahmkow d6ed31b9fa GPU: Address Feedback. 2020-02-13 18:16:07 -04:00
bunnei 37f1cf8cbd
Merge pull request #3376 from ReinUsesLisp/point-sprite
gl_rasterizer: Implement GL_POINT_SPRITE
2020-02-11 08:26:07 -05:00
Fernando Sahmkow 8e9a4944db GPU: Implement GPU Clock correctly. 2020-02-10 10:44:54 -04:00
Fernando Sahmkow 0cb3bcfbb7 Maxwell3D: Correct query reporting. 2020-02-10 10:41:43 -04:00
bunnei 84ea9c2b42 Merge pull request #3372 from ReinUsesLisp/fix-back-stencil
maxwell_3d: Fix stencil back mask
2020-02-09 22:29:28 -05:00
bunnei e210835dd0
Merge pull request #3387 from bunnei/gpu-mpscqueue
gpu_thread: Use MPSCQueue for GPU commands.
2020-02-08 21:15:48 -05:00
bunnei b5c13ee0eb gpu_thread: Use MPSCQueue for GPU commands.
- Necessary for multiple service threads.
2020-02-07 23:01:23 -05:00
bunnei 7cacb08cdf video_core: memory_manager: Use GPU interface for cache functions. 2020-02-07 22:59:35 -05:00
bunnei 90bda66028
Merge pull request #3378 from ReinUsesLisp/uscaled
maxwell_to_gl: Implement R8G8_USCALED
2020-02-07 22:55:52 -05:00
bunnei 90df4b8e2b
Merge pull request #3369 from ReinUsesLisp/shf
shader/shift: Implement SHF
2020-02-07 22:06:57 -05:00
bunnei 09d766d357
Merge pull request #3362 from ReinUsesLisp/fix-instanced
gl_rasterizer: Fix instanced draw arrays
2020-02-06 21:39:59 -05:00
ReinUsesLisp bf9a822b87 shader/decode: Fix constant buffer offsets
Some instances were using cbuf34.offset instead of cbuf34.GetOffset().
This returned the an invalid offset. Address those instances and rename
offset to "shifted_offset" to avoid future bugs.
2020-02-05 12:19:09 -03:00
ReinUsesLisp 8bb9eef97b maxwell_to_gl: Implement R8G8_USCALED 2020-02-04 21:32:36 -03:00
ReinUsesLisp c81c361e82 maxwell_to_gl: Reduce unimplemented formats to LOG_ERROR 2020-02-04 21:32:08 -03:00
ReinUsesLisp 0eb36c90f4 vk_rasterizer: Use noexcept variants of std::bitset
Removes bounds checking from "texceptions" instances.
2020-02-04 18:04:24 -03:00
bunnei 08c508b1c4
Merge pull request #3357 from ReinUsesLisp/bfi-rc
shader/bfi: Implement register-constant buffer variant
2020-02-04 15:14:13 -05:00
ReinUsesLisp 7da52673d0 gl_rasterizer: Implement GL_POINT_SPRITE
OpenGL core defaults to GL_POINT_SPRITE, meanwhile on OpenGL
compatibility we have to explicitly enable it. This fixes
gl_PointCoord's behaviour.
2020-02-04 15:19:45 -03:00
bunnei bf21aacc74
Merge pull request #3356 from ReinUsesLisp/fcmp
shader/arithmetic: Implement FCMP
2020-02-04 11:36:59 -05:00
bunnei c31ec00d67
Merge pull request #3337 from ReinUsesLisp/vulkan-staged
yuzu: Implement Vulkan frontend
2020-02-03 16:56:25 -05:00
ReinUsesLisp 4eed744277 maxwell_3d: Fix stencil back mask 2020-02-02 17:50:46 -03:00
ReinUsesLisp 223a89a19f shader: Remove curly braces initializers on shared pointers 2020-02-01 22:52:10 -03:00
bunnei b5bbe7e752
Merge pull request #3282 from FernandoS27/indexed-samplers
Partially implement Indexed samplers in general and specific code in GLSL
2020-02-01 20:41:40 -05:00
ReinUsesLisp 729ca120e3 shader/shift: Implement SHIFT_RIGHT_{IMM,R}
Shifts a pair of registers to the right and returns the low register.
2020-02-01 21:20:02 -03:00
ReinUsesLisp 017474c3f8 shader/shift: Implement SHF_LEFT_{IMM,R}
Shifts a pair of registers to the left and returns the high register.
2020-02-01 21:19:44 -03:00
bunnei c593e45dbd
Merge pull request #3347 from ReinUsesLisp/local-mem
shader/memory: Implement LDL.S16, LDS.S16, STL.S16 and STS.S16
2020-01-30 10:59:52 -05:00
ReinUsesLisp b69321650e gl_rasterizer: Fix instanced draw arrays
glDrawArrays was being used when the draw had a base instance specified.
This commit removes the draw parameters abstraction and fixes the
mentioned issue.
2020-01-30 02:22:00 -03:00
bunnei 2db7adc42a
Merge pull request #3350 from ReinUsesLisp/atom
shader/memory: Implement ATOM.ADD
2020-01-29 16:49:54 -05:00
ReinUsesLisp f92cbc5501 yuzu: Implement Vulkan frontend
Adds a Qt and SDL2 frontend for Vulkan. It also finishes the missing
bits on Vulkan initialization.
2020-01-29 17:53:11 -03:00
ReinUsesLisp 788d57d723 settings: Add settings for graphics backend 2020-01-29 17:53:11 -03:00
ReinUsesLisp 9f0162e4b5 shader/other: Fix skips for SYNC and BRK 2020-01-29 17:53:11 -03:00
ReinUsesLisp 270177f38a shader/other: Stub S2R LaneId 2020-01-29 17:53:11 -03:00
ReinUsesLisp b35449c85d buffer_cache: Delay buffer destructions
Delay buffer destruction some extra frames to avoid destroying buffers
that are still being used from older frames. This happens on Nvidia's
driver with mailbox.
2020-01-29 17:53:11 -03:00
bunnei b11aeced18
Merge pull request #3355 from ReinUsesLisp/break-down
texture_cache/surface_base: Fix layered break down
2020-01-29 12:29:56 -05:00
bunnei 91f79225e7
Merge pull request #3358 from ReinUsesLisp/implicit-texture-cache
gl_texture_cache: Silence implicit sign cast warnings
2020-01-29 11:23:50 -05:00
bunnei c457e47297
Merge pull request #3359 from ReinUsesLisp/assert-point-size
gl_shader_decompiler: Remove UNIMPLEMENTED for gl_PointSize
2020-01-28 15:19:51 -05:00
ReinUsesLisp 8178fe8960 gl_shader_decompiler: Remove UNIMPLEMENTED for gl_PointSize
This was implemented by a previous commit and it's no longer required.
2020-01-28 16:32:30 -03:00
ReinUsesLisp abae795986 gl_texture_cache: Silence implicit sign cast warnings 2020-01-27 20:59:11 -03:00
ReinUsesLisp 137a8aa55c shader/bfi: Implement register-constant buffer variant
It's the same as the variant that was implemented, but it takes the
operands from another source.
2020-01-27 01:20:38 -03:00
ReinUsesLisp e3fc3459c8 shader/arithmetic: Implement FCMP
Compares the third operand with zero, then selects between the first and
second.
2020-01-27 01:15:44 -03:00
ReinUsesLisp f55f6ff9bb texture_cache/surface_base: Fix layered break down
Layered break downs was passing "layer" as a "depth" parameter. This
commit addresses that.
2020-01-26 21:48:07 -03:00
ReinUsesLisp d17dfa6104 gl_texture_cache: Properly implement depth/stencil sampling
This addresses the long standing issue of compatibility vs. core
profiles on OpenGL, properly implementing depth vs. stencil sampling
depending on the texture swizzle.
2020-01-26 21:44:08 -03:00
ReinUsesLisp d95d4ac843 shader/memory: Implement ATOM.ADD
ATOM operates atomically on global memory. For now only add ATOM.ADD
since that's what was found in commercial games.

This asserts for ATOM.ADD.S32 (handling the others as unimplemented),
although ATOM.ADD.U32 shouldn't be any different.

This change forces us to change the default type on SPIR-V storage
buffers from float to uint. We could also alias the buffers, but it's
simpler for now to just use uint. While we are at it, abstract the code
to avoid repetition.
2020-01-26 01:54:24 -03:00
Fernando Sahmkow bb8eb15d39 Shader_IR: Address feedback. 2020-01-25 09:04:59 -04:00
ReinUsesLisp d26e74f0a3 shader/memory: Implement STL.S16 and STS.S16 2020-01-25 03:16:10 -03:00
ReinUsesLisp 9a2cdf8520 shader/memory: Implement unaligned LDL.S16 and LDS.S16 2020-01-25 03:16:10 -03:00
ReinUsesLisp 531f25a037 shader/memory: Move unaligned load/store to functions 2020-01-25 03:16:10 -03:00
ReinUsesLisp 96638f57c9 shader/memory: Implement LDL.S16 and LDS.S16 2020-01-25 03:15:55 -03:00
bunnei dfd998216c
Merge pull request #3344 from ReinUsesLisp/vk-botw
vk_shader_decompiler: Disable default values on unwritten render targets
2020-01-24 17:31:55 -05:00
Fernando Sahmkow 806f569143 Shader_IR: Change name of TrackSampler function so it does not confuse with the type. 2020-01-24 16:44:48 -04:00
Fernando Sahmkow 3919b7b8a9 Shader_IR: Corrections, styling and extras. 2020-01-24 16:44:48 -04:00
Fernando Sahmkow 37b8504faa Shader_IR: Correct Custom Variable assignment. 2020-01-24 16:44:47 -04:00
Fernando Sahmkow 7c530e0666 Shader_IR: Propagate bindless index into the GL compiler. 2020-01-24 16:44:47 -04:00
Fernando Sahmkow 3c34678627 Shader_IR: Implement Injectable Custom Variables to the IR. 2020-01-24 16:43:31 -04:00
Fernando Sahmkow 2b02f29a2d GL Backend: Introduce indexed samplers into the GL backend 2020-01-24 16:43:31 -04:00
Fernando Sahmkow 037ea431ce Shader_IR: deduce size of indexed samplers 2020-01-24 16:43:31 -04:00
Fernando Sahmkow f4603d23c5 Shader_IR: Setup Indexed Samplers on the IR 2020-01-24 16:43:30 -04:00
Fernando Sahmkow 603c861532 Shader_IR: Implement initial code for tracking indexed samplers. 2020-01-24 16:43:30 -04:00
Fernando Sahmkow 64496f2456 Shader_IR: Address Feedback 2020-01-24 16:43:30 -04:00
Fernando Sahmkow b97608ca64 Shader_IR: Allow constant access of guest driver. 2020-01-24 16:43:30 -04:00
Fernando Sahmkow dc5cfa8d28 Shader_IR: Address Feedback 2020-01-24 16:43:29 -04:00
Fernando Sahmkow 74aa7de5e3 Guest_driver: Correct compiling errors in GCC. 2020-01-24 16:43:29 -04:00
Fernando Sahmkow 1e4b6bef6f Shader_IR: Store Bound buffer on Shader Usage 2020-01-24 16:43:29 -04:00
Fernando Sahmkow c921e496eb GPU: Implement guest driver profile and deduce texture handler sizes. 2020-01-24 16:43:29 -04:00
bunnei a104b985a8
Merge pull request #3273 from FernandoS27/txd-array
Shader_IR: Implement TXD Array.
2020-01-24 14:02:40 -05:00
ReinUsesLisp 1690f1adba vk_shader_decompiler: Disable default values on unwritten render targets
Some games like The Legend of Zelda: Breath of the Wild assign
render targets without writing them from the fragment shader. This
generates Vulkan validation errors, so silence these I previously
introduced a commit to set "vec4(0, 0, 0, 1)" for these attachments. The
problem is that this is not what games expect. This commit reverts that
change.
2020-01-24 01:16:21 -03:00
ReinUsesLisp 3ce28342a2 gl_shader_cache: Disable fastmath on Nvidia 2020-01-21 19:08:08 -03:00
Fernando Sahmkow 79e0991d9b
Merge pull request #3330 from ReinUsesLisp/vk-blit-screen
vk_blit_screen: Initial implementation
2020-01-20 22:32:16 -04:00
ReinUsesLisp a665581684 vk_blit_screen: Address feedback 2020-01-20 18:43:11 -03:00
bunnei 69b44392a7
Merge pull request #3328 from ReinUsesLisp/vulkan-atoms
vk_shader_decompiler: Implement UAtomicAdd (ATOMS) on SPIR-V
2020-01-20 00:01:52 -05:00
bunnei 5a077c95ce
Merge pull request #3322 from ReinUsesLisp/vk-front-face
vk_graphics_pipeline: Set front facing properly
2020-01-19 23:22:34 -05:00
ReinUsesLisp f5dfe68a94 vk_blit_screen: Initial implementation
This abstraction takes care of presenting accelerated and
non-accelerated or "framebuffer" images to the Vulkan swapchain.
2020-01-19 21:12:43 -03:00
bunnei 41373d212e
Merge pull request #3313 from ReinUsesLisp/vk-rasterizer
vk_rasterizer: Implement Vulkan's rasterizer
2020-01-19 18:09:01 -05:00
ReinUsesLisp b2c976ad0e vk_shader_decompiler: Implement UAtomicAdd (ATOMS) on SPIR-V
Also updates sirit to include atomic instructions.
2020-01-19 16:40:31 -03:00
Fernando Sahmkow 51c8aea979
Merge pull request #3317 from ReinUsesLisp/gl-decomp-cc-decomp
gl_shader_decompiler: Fix decompilation of condition codes
2020-01-18 19:56:55 -04:00
ReinUsesLisp d110a371bb gl_state: Use bool instead of GLboolean
This fixes template resolution considering GLboolean an integer instead
of a bool.
2020-01-18 19:10:34 -03:00
ReinUsesLisp 94915d4ea1 vk_graphics_pipeline: Set front facing properly
Front face was being forced to a certain value when cull face is
disabled. Set a default value on initialization and drop the forcefully
set front facing value with culling disabled.
2020-01-18 18:50:47 -03:00
bunnei 9bf4850f74
Merge pull request #3305 from ReinUsesLisp/point-size-program
gl_state: Implement PROGRAM_POINT_SIZE
2020-01-18 01:56:32 -05:00
bunnei 15163edaaa
Merge pull request #3312 from ReinUsesLisp/atoms-u32
shader/memory: Implement ATOMS.ADD.U32
2020-01-18 00:54:07 -05:00
ReinUsesLisp 09b1d762d7 vk_rasterizer: Address feedback 2020-01-17 21:40:01 -03:00
ReinUsesLisp f34e519da3 gl_shader_decompiler: Fix decompilation of condition codes
Use Visit instead of reimplementing it. Fixes unimplemented negations
for condition codes.
2020-01-17 21:23:01 -03:00
bunnei 48863afb65
Merge pull request #3306 from ReinUsesLisp/gl-texture
gl_texture_cache: Minor fixes and style changes
2020-01-17 15:44:02 -05:00
bunnei 657b3a366e
Merge pull request #3311 from ReinUsesLisp/z32fx24s8
format_lookup_table: Fix ZF32_X24S8 component types
2020-01-17 08:22:32 -05:00
ReinUsesLisp fe5356d223 vk_rasterizer: Implement Vulkan's rasterizer
This abstraction is Vulkan's equivalent to OpenGL's rasterizer. It takes
care of joining all parts of the backend and rendering accordingly on
demand.
2020-01-16 23:05:15 -03:00
ReinUsesLisp 38e789c761 renderer_vulkan: Add header as placeholder 2020-01-16 22:54:15 -03:00
bunnei e041f33569
Merge pull request #3300 from ReinUsesLisp/vk-texture-cache
vk_texture_cache: Implement generic texture cache on Vulkan
2020-01-16 19:19:26 -05:00
ReinUsesLisp f09cd52980 vk_texture_cache: Address feedback 2020-01-16 18:23:10 -03:00
ReinUsesLisp 63ba41a26d shader/memory: Implement ATOMS.ADD.U32 2020-01-16 17:30:55 -03:00
ReinUsesLisp 0caab54b5d format_lookup_table: Fix ZF32_X24S8 component types
Component types for ZF32_X24S8 were using UNORM. Drivers will set FLOAT,
UINT, UNORM, UNORM; causing a format mismatch. This commit addresses
that.
2020-01-16 17:29:13 -03:00
Rodrigo Locatti 82e1285c1e
vk_texture_cache: Fix typo in commentary
Co-Authored-By: MysticExile <30736337+MysticExile@users.noreply.github.com>
2020-01-16 16:59:46 -03:00
bunnei 30faf6a964
Merge pull request #3308 from lioncash/private
maxwell_3d: Make dirty_pointers private
2020-01-16 13:26:35 -05:00
bunnei d23869811d
Merge pull request #3304 from lioncash/fwd-decl
renderer_opengl/utils: Forward declare private structs
2020-01-16 11:21:18 -05:00
Lioncash 9e874898f5 maxwell_3d: Make dirty_pointers private
This isn't used outside of the class itself, so we can make it private
for the time being.
2020-01-16 04:07:15 -05:00
ReinUsesLisp c375d735e6 gl_state: Implement PROGRAM_POINT_SIZE
For gl_PointSize to have effect we have to activate
GL_PROGRAM_POINT_SIZE.
2020-01-15 16:14:17 -03:00
Lioncash 7af56dfa76 renderer_opengl/utils: Remove unused header inclusions
Nothing from these headers are used, so they can be removed.
2020-01-15 06:31:23 -05:00
Lioncash 06d30fbcca renderer_opengl/utils: Forward declare private structs
Keeps the definitions hidden and allows changes to the structs without
needing to recompile all users of classes containing said structs.
2020-01-15 06:30:01 -05:00
ReinUsesLisp 66a1c777c9 gl_texture_cache: Use local variables to simplify DownloadTexture 2020-01-14 17:39:48 -03:00
ReinUsesLisp cdb00546f0 gl_texture_cache: Fix format for RGBX16F 2020-01-14 17:38:33 -03:00
ReinUsesLisp 2d09467f6f gl_texture_cache: Use Snorm internal format for RG8S 2020-01-14 17:37:58 -03:00
ReinUsesLisp 02624c35ec gl_texture_cache: Use Snorm internal format for ABGR8S 2020-01-14 17:37:23 -03:00
Rodrigo Locatti 64cd46579b
Merge pull request #3303 from lioncash/reorder
control_flow: Silence -Wreorder warning for CFGRebuildState
2020-01-14 16:15:18 -03:00
Lioncash a1eee1749e control_flow: Silence -Wreorder warning for CFGRebuildState
Organizes the initializer list in the same order that the variables
would actually be initialized in.
2020-01-14 13:28:48 -05:00
Lioncash f10ea944e0 gl_shader_cache: Remove unused STAGE_RESERVED_UBOS constant
Given this isn't used, this can be removed entirely.
2020-01-14 13:16:52 -05:00
Lioncash 4cd5ad90f3 gl_shader_cache: std::move entries in CachedShader constructor
Avoids several reallocations of std::vector instances where applicable.
2020-01-14 13:14:16 -05:00
Lioncash 15a6840e7a gl_shader_cache: Remove unused entries variable in BuildShader()
Eliminates a few unnecessary constructions of std::vectors.
2020-01-14 13:11:49 -05:00
bunnei 55f95e7f26
Merge pull request #3287 from ReinUsesLisp/ldg-stg-16
shader_ir/memory: Implement u16 and u8 for STG and LDG
2020-01-14 09:57:08 -05:00
bunnei 15788ffcde
Merge pull request #3288 from ReinUsesLisp/uncurse-aoffi
shader_ir/texture: Simplify AOFFI code
2020-01-13 23:52:12 -05:00
bunnei 6985eea519
Merge pull request #3290 from ReinUsesLisp/gl-clamp
maxwell_to_vk: Implement GL_CLAMP hacking Nvidia's driver
2020-01-13 19:16:06 -05:00
ReinUsesLisp 09e17fbb0f vk_texture_cache: Implement generic texture cache on Vulkan
It currently ignores PBO linearizations since these should be dropped as
soon as possible on OpenGL.
2020-01-13 20:37:50 -03:00
ReinUsesLisp 2b2712fa95 texture_cache/surface_params: Make GetNumLayers public 2020-01-13 20:35:43 -03:00
Rodrigo Locatti b1138e5ea1
vk_compute_pass: Address feedback
Comment hardcoded SPIR-V modules.
2020-01-10 22:46:34 -03:00
ReinUsesLisp 3d46709b7f maxwell_to_vk: Implement GL_CLAMP hacking Nvidia's driver
Nvidia's driver defaults invalid enumerations to GL_CLAMP. Vulkan
doesn't expose GL_CLAMP through its API, but we can hack it on Nvidia's
driver using the internal driver defaults.
2020-01-10 17:12:50 -03:00
ReinUsesLisp 13021b534c shader_ir/texture: Simplify AOFFI code 2020-01-09 03:50:37 -03:00
ReinUsesLisp e2a2a556b9 shader_ir/memory: Implement u16 and u8 for STG and LDG
Using the same technique we used for u8 on LDG, implement u16.

In the case of STG, load memory and insert the value we want to set
into it with bitfieldInsert. Then set that value.
2020-01-09 02:12:29 -03:00
ReinUsesLisp 908e085d02 vk_compute_pass: Add compute passes to emulate missing Vulkan features
This currently only supports quad arrays and u8 indices.

In the future we can remove quad arrays with a table written from the
CPU, but this was used to bootstrap the other passes helpers and it
was left in the code.

The blob code is generated from the "shaders/" directory. Read the
instructions there to know how to generate the SPIR-V.
2020-01-08 19:24:26 -03:00
ReinUsesLisp 82a64da077 vk_shader_util: Add helper to build SPIR-V shaders 2020-01-08 19:22:20 -03:00
ReinUsesLisp 6888d776ff vk_pipeline_cache: Initial implementation
Given a pipeline key, this cache returns a pipeline abstraction (for
graphics or compute).
2020-01-06 22:02:26 -03:00
ReinUsesLisp 2effdeb924 vk_graphics_pipeline: Initial implementation
This abstractio represents the state of the 3D engine at a given draw.
Instead of changing individual bits of the pipeline how it's done in
APIs like D3D11, OpenGL and NVN; on Vulkan we are forced to put
everything together into a single, immutable object.

It takes advantage of the few dynamic states Vulkan offers.
2020-01-06 22:02:26 -03:00
ReinUsesLisp dc96a59fa0 vk_compute_pipeline: Initial implementation
This abstraction represents a Vulkan compute pipeline.
2020-01-06 22:02:26 -03:00
ReinUsesLisp b392a5986e vk_pipeline_cache: Add file and define descriptor update template filler
This function allows us to share code between compute and graphics
pipelines compilation.
2020-01-06 22:02:26 -03:00
ReinUsesLisp 3142f1b597 fixed_pipeline_state: Add depth clamp 2020-01-06 22:02:26 -03:00
ReinUsesLisp 9c548146ca vk_rasterizer: Add placeholder 2020-01-06 22:02:26 -03:00
bunnei 5be00cba15
Merge pull request #3276 from ReinUsesLisp/pipeline-reqs
vk_update_descriptor/vk_renderpass_cache: Add pipeline cache dependencies
2020-01-06 17:03:34 -05:00
ReinUsesLisp 5aeff9aff5 vk_renderpass_cache: Initial implementation
The renderpass cache is used to avoid creating renderpasses on each
draw. The hashed structure is not currently optimized.
2020-01-06 18:28:32 -03:00
ReinUsesLisp 322d6a0311 vk_update_descriptor: Initial implementation
The update descriptor is used to store in flat memory a large chunk of
staging data used to update descriptor sets through templates. It
provides a push interface to easily insert descriptors following the
current pipeline. The order used in the descriptor update template has
to be implicitly followed. We can catch bugs here using validation
layers.
2020-01-06 18:28:32 -03:00
ReinUsesLisp 5b01f80a12 vk_stream_buffer/vk_buffer_cache: Avoid halting and use generic cache
The stream buffer before this commit once it was full (no more bytes to
write before looping) waiting for all previous operations to finish.
This was a temporary solution and had a noticeable performance penalty
in performance (from what a profiler showed).

To avoid this mark with fences usages of the stream buffer and once it
loops wait for them to be signaled. On average this will never wait.
Each fence knows where its usage finishes, resulting in a non-paged
stream buffer.

On the other side, the buffer cache is reimplemented using the generic
buffer cache. It makes use of the staging buffer pool and the new
stream buffer.
2020-01-06 18:13:41 -03:00
ReinUsesLisp ceb851b590 vk_memory_manager: Misc changes
* Allocate memory in discrete exponentially increasing chunks until the
128 MiB threshold. Allocations larger thant that increase linearly by
256 MiB (depending on the required size). This allows to use small
allocations for small resources.

* Move memory maps to a RAII abstraction. To optimize for debugging
tools (like RenderDoc) users will map/unmap on usage. If this ever
becomes a noticeable overhead (from my profiling it doesn't) we can
transparently move to persistent memory maps without harming the API,
getting optimal performance for both gameplay and debugging.

* Improve messages on exceptional situations.

* Fix typos "requeriments" -> "requirements".

* Small style changes.
2020-01-06 18:13:41 -03:00
ReinUsesLisp 85bb6a6f08 vk_buffer_cache: Temporarily remove buffer cache
This is intended for a follow up commit to avoid circular dependencies.
2020-01-06 17:58:46 -03:00
bunnei 89fc75d769
Merge pull request #3257 from degasus/no_busy_loops
video_core: Block in WaitFence.
2020-01-06 00:09:57 -05:00
Fernando Sahmkow 56e450a3f7
Merge pull request #3264 from ReinUsesLisp/vk-descriptor-pool
vk_descriptor_pool: Initial implementation
2020-01-05 15:54:41 -04:00
bunnei cd0a7dfdbc
Merge pull request #3258 from FernandoS27/shader-amend
Shader_IR: add the ability to amend code in the shader ir.
2020-01-04 14:05:17 -05:00
Fernando Sahmkow 3dd6b55851 Shader_IR: Address Feedback 2020-01-04 14:40:57 -04:00
Fernando Sahmkow a1667a7b46 Shader_IR: Implement TXD Array.
This commit extends the compilation of TXD to support array samplers on
TXD.
2020-01-04 13:28:02 -04:00
Rodrigo Locatti 6e347d8d1b
Update src/video_core/renderer_vulkan/vk_descriptor_pool.cpp
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-01-03 17:34:30 -03:00
ReinUsesLisp 0d6d8129c4 yuzu: Remove Maxwell debugger
This was carried from Citra and wasn't really used on yuzu. It also adds
some runtime overhead. This commit removes it from yuzu's codebase.
2020-01-02 23:09:44 -03:00
bunnei ae0e481677
Merge pull request #3243 from ReinUsesLisp/topologies
maxwell_to_gl: Implement missing primitive topologies
2020-01-01 20:33:33 -05:00
ReinUsesLisp 1fe7df4517 vk_descriptor_pool: Initial implementation
Create a large descriptor pool where we allocate all our descriptors
from. It has to be wide enough to support any pipeline, hence its large
numbers.

If the descritor pool is filled, we allocate more memory at that moment.
This way we can take advantage of permissive drivers like Nvidia's that
allocate more descriptors than what the spec requires.
2020-01-01 16:44:06 -03:00
bunnei 028b2718ed
Merge pull request #3239 from ReinUsesLisp/p2r
shader/p2r: Implement P2R Pr
2019-12-31 20:37:16 -05:00
Fernando Sahmkow b3371ed09e Shader_IR: add the ability to amend code in the shader ir.
This commit introduces a mechanism by which shader IR code can be
amended and extended. This useful for track algorithms where certain
information can derived from before the track such as indexes to array
samplers.
2019-12-30 15:31:48 -04:00
Fernando Sahmkow 7bd447355f
Merge pull request #3248 from ReinUsesLisp/vk-image
vk_image: Add an image object abstraction
2019-12-30 14:25:14 -04:00
Rodrigo Locatti 4cbb363d3f
vk_image: Avoid unnecesary equals 2019-12-30 13:28:23 -03:00
Fernando Sahmkow 287d5921cf
Merge pull request #3249 from ReinUsesLisp/vk-staging-buffer-pool
vk_staging_buffer_pool: Add a staging pool for temporary operations
2019-12-30 12:25:59 -04:00
Markus Wick cb9dd01ffd video_core: Block in WaitFence.
This function is called rarely and blocks quite often for a long time.
So don't waste power and let the CPU sleep.

This might also increase the performance as the other cores might be allowed to clock higher.
2019-12-30 13:04:53 +01:00
Rodrigo Locatti f2c61bbe13
vk_staging_buffer_pool: Initialize last epoch to zero 2019-12-29 19:19:43 -03:00
Fernando Sahmkow f846e3d6d0
Merge pull request #3250 from ReinUsesLisp/empty-fragment
gl_rasterizer: Allow rendering without fragment shader
2019-12-28 14:33:53 -04:00
bunnei 8a76f816a4
Merge pull request #3228 from ReinUsesLisp/ptp
shader/texture: Implement AOFFI and PTP for TLD4 and TLD4S
2019-12-26 21:43:44 -05:00
ReinUsesLisp 5b989f189f
gl_rasterizer: Allow rendering without fragment shader
Rendering without a fragment shader is usually used in depth-only
passes.
2019-12-26 16:38:49 -03:00
ReinUsesLisp 3813af2f3c
vk_staging_buffer_pool: Add a staging pool for temporary operations
The job of this abstraction is to provide staging buffers for temporary
operations. Think of image uploads or buffer uploads to device memory.

It automatically deletes unused buffers.
2019-12-25 18:12:17 -03:00
ReinUsesLisp c83bf7cd1e
vk_image: Add an image object abstraction
This object's job is to contain an image and manage its transitions.
Since Nvidia hardware doesn't know what a transition is but Vulkan
requires them anyway, we have to state track image subresources
individually.

To avoid the overhead of tracking each subresource in images with many
subresources (think of cubemap arrays with several mipmaps), this commit
tracks when subresources have diverged. As long as this doesn't happen
we can check the state of the first subresource (that will be shared
with all subresources) and update accordingly.

Image transitions are deferred to the scheduler command buffer.
2019-12-25 18:00:16 -03:00
Fernando Sahmkow 5619d24377
Merge pull request #3244 from ReinUsesLisp/vk-fps
fixed_pipeline_state: Define structure and loaders
2019-12-25 14:31:29 -04:00
bunnei 4af569ee47
Merge pull request #3236 from ReinUsesLisp/rasterize-enable
gl_rasterizer: Implement RASTERIZE_ENABLE
2019-12-24 22:54:10 -05:00
ReinUsesLisp b9e3f5eb36
fixed_pipeline_state: Define symetric operator!= and mark as noexcept
Marks as noexcept Hash, operator== and operator!= for consistency.
2019-12-24 18:24:08 -03:00
ReinUsesLisp 4a3026b16b
fixed_pipeline_state: Define structure and loaders
The intention behind this hasheable structure is to describe the state
of fixed function pipeline state that gets compiled to a single graphics
pipeline state object. This is all dynamic state in OpenGL but Vulkan
wants it in an immutable state, even if hardware can edit it freely.

In this commit the structure is defined in an optimized state (it uses
booleans, has paddings and many data entries that can be packed to
single integers). This is intentional as an initial implementation that
is easier to debug, implement and review. It will be optimized in later
stages, or it might change if Vulkan gets more dynamic states.
2019-12-22 22:59:11 -03:00
ReinUsesLisp 5770418fb3
maxwell_3d: Add depth bounds registers 2019-12-22 22:55:06 -03:00
ReinUsesLisp 91d35559e5
maxwell_to_gl: Implement missing primitive topologies
Many of these topologies are exclusively available in OpenGL.
2019-12-22 22:33:01 -03:00
bunnei e976d0e924
Merge pull request #3241 from ReinUsesLisp/gl-shader-cache
gl_shader_cache: Style changes
2019-12-22 16:23:46 -05:00
bunnei 1e76655f83
Merge pull request #3238 from ReinUsesLisp/vk-resource-manager
vk_resource_manager: Catch device losses and other changes
2019-12-22 15:57:16 -05:00
bunnei 0f3ac9cfeb
Merge pull request #3203 from FernandoS27/tex-cache-fixes
Texture Cache: Add HLE methods for building 3D textures
2019-12-22 14:25:13 -05:00
Fernando Sahmkow 3dc585d011
Merge pull request #3237 from ReinUsesLisp/vk-shader-decompiler
vk_shader_decompiler: Misc changes
2019-12-22 12:36:56 -04:00
Fernando Sahmkow 218ee18417 Texture Cache: Improve documentation 2019-12-22 12:29:23 -04:00
Fernando Sahmkow a3916588b6 Texture Cache: Address Feedback 2019-12-22 12:24:34 -04:00
Fernando Sahmkow 51c9e98677 Texture Cache: Add HLE methods for building 3D textures within the GPU in certain scenarios.
This commit adds a series of HLE methods for handling 3D textures in
general. This helps games that generate 3D textures on every frame and
may reduce loading times for certain games.
2019-12-22 12:24:34 -04:00
Fernando Sahmkow aea978e037
Merge pull request #3230 from ReinUsesLisp/vk-emu-shaders
renderer_vulkan/shader: Add helper GLSL shaders
2019-12-22 11:23:09 -04:00
Fernando Sahmkow 27efcc15e9
Merge pull request #3240 from ReinUsesLisp/decomp-cond-code
vk_shader_decompiler: Use Visit instead of reimplementing it
2019-12-22 11:20:55 -04:00
bunnei 16dcfacbfc
Merge pull request #3235 from ReinUsesLisp/ldg-u8
shader/memory: Implement LDG.U8 and unaligned U8 loads
2019-12-21 22:50:28 -05:00
ReinUsesLisp 1e16023d60
gl_shader_cache: Update commentary for shared memory
Remove false commentary. Not dividing by 4 the size of shared memory is
not a hack; it describes the number of integers, not bytes.

While we are at it sort the generated code to put preprocessor lines on
the top.
2019-12-20 22:51:21 -03:00
ReinUsesLisp 486c6a5316
gl_shader_cache: Remove unused entry in GetPrimitiveDescription 2019-12-20 22:49:30 -03:00
ReinUsesLisp af93909c9c
vk_shader_decompiler: Use Visit instead of reimplementing it
ExprCondCode visit implements the generic Visit. Use this instead of
that one.

As an intended side effect this fixes unwritten memory usages in cases
when a negation of a condition code is used.
2019-12-20 21:36:25 -03:00
ReinUsesLisp 38d3a48873
shader/p2r: Implement P2R Pr
P2R dumps predicate or condition codes state to a register. This is
useful for unit testing.
2019-12-20 18:02:41 -03:00
ReinUsesLisp cf27b59493
shader/r2p: Refactor P2R to support P2R 2019-12-20 17:55:42 -03:00
bunnei 7be65c6a68
Merge pull request #3234 from ReinUsesLisp/i2f-u8-selector
shader/conversion: Implement byte selector in I2F
2019-12-19 22:36:26 -05:00
bunnei 6d55b14cc0
Merge pull request #3233 from ReinUsesLisp/mismatch-sizes
shader/texture: Properly shrink unused entries in size mismatches
2019-12-19 20:40:27 -05:00
ReinUsesLisp e41da22c8d
vk_resource_manager: Add entry to VKFence to test its usage 2019-12-19 16:31:34 -03:00
ReinUsesLisp ec983a2451
vk_reosurce_manager: Add assert for releasing fences
Notify the programmer when a request to release a fence is invalid
because the fence is already free.
2019-12-19 16:31:34 -03:00
ReinUsesLisp 6ddffa010a
vk_resource_manager: Implement VKFenceWatch move constructor
This allows us to put VKFenceWatch inside a std::vector without storing
it in heap. On move we have to signal the fences where the new protected
resource is, adding some overhead.
2019-12-19 16:31:34 -03:00
ReinUsesLisp 54747d60bc
vk_device: Add entry to catch device losses
VK_NV_device_diagnostic_checkpoints allows us to push data to a Vulkan
queue and then query it even after a device loss. This allows us to push
the current pipeline object and see what was the call that killed the
device.
2019-12-19 16:31:33 -03:00
ReinUsesLisp 2a63b3bdb9
vk_shader_decompiler: Fix full decompilation
When full decompilation was enabled, labels were not being inserted and
instructions were misused. Fix these bugs.
2019-12-19 16:24:45 -03:00
ReinUsesLisp de918ebeb0
vk_shader_decompiler: Skip NDC correction when it is native
Avoid changing gl_Position when the NDC used by the game is [0, 1]
(Vulkan's native).
2019-12-19 16:24:45 -03:00
ReinUsesLisp 485c21eac3
vk_shader_decompiler: Normalize output fragment attachments
Some games write from fragment shaders to an unexistant framebuffer
attachment or they don't write to one when it exists in the framebuffer.
Fix this by skipping writes or adding zeroes.
2019-12-19 16:24:45 -03:00
bunnei 1eb4a95d2b
Merge pull request #3232 from ReinUsesLisp/gl-decompiler-images
gl_shader_decompiler: Add missing DeclareImages
2019-12-19 11:32:47 -05:00
bunnei 253aa52351
Merge pull request #3231 from ReinUsesLisp/tld4s-encoding
shader_bytecode: Fix TLD4S encoding
2019-12-19 11:32:25 -05:00
ReinUsesLisp f4a25f854c
vk_device: Add query for RGBA8Uint 2019-12-19 02:08:29 -03:00
ReinUsesLisp abb33d4aec
vk_shader_decompiler: Update sirit and implement Texture AOFFI 2019-12-19 01:42:13 -03:00
bunnei d53cf05513
Merge pull request #3221 from ReinUsesLisp/vk-scheduler
vk_scheduler: Delegate commands to a worker thread and state track
2019-12-18 22:04:08 -05:00
ReinUsesLisp da0aa4da6b
gl_rasterizer: Implement RASTERIZE_ENABLE
RASTERIZE_ENABLE is the opposite of GL_RASTERIZER_DISCARD. Implement it
naturally using this.

NVN games expect rasterize to be enabled by default, reflect that in our
initial GPU state.
2019-12-18 19:28:23 -03:00
ReinUsesLisp ae8d4b6c0c
shader/memory: Implement LDG.U8 and unaligned U8 loads
LDG can load single bytes instead of full integers or packs of integers.
These have the advantage of loading bytes that are not aligned to 4
bytes.

To emulate these this commit gets the byte being referenced (by doing
"address & 3" and then using that to extract the byte from the loaded
integer:

result = bitfieldExtract(loaded_integer, (address % 4) * 8, 8)
2019-12-18 01:21:46 -03:00
ReinUsesLisp a7d6bd1ef1
shader/conversion: Implement byte selector in I2F
I2F's byte selector is used to choose what bytes to convert to float.
e.g. if the input is 0xaabbccdd and the selector is ".B3" it will
convert 0xaa. The default (when it's not shown in nvdisasm) is ".B0", in
that example the default would convert 0xdd to float.
2019-12-18 00:41:22 -03:00
ReinUsesLisp 15a753b9a5
shader/texture: Properly shrink unused entries in size mismatches
When a image format mismatches we were inserting zeroes to the texture
itself. This was not handling cases were the mismatch uses less
coordinates than the guest shader code. Address that by resizing the
vector.
2019-12-17 23:38:10 -03:00
ReinUsesLisp e438079b50
gl_shader_decompiler: Add missing DeclareImages 2019-12-17 23:34:15 -03:00
ReinUsesLisp 8b26b4228b
shader_bytecode: Fix TLD4S encoding 2019-12-17 23:32:10 -03:00
ReinUsesLisp b52297767e
renderer_vulkan/shader: Add helper GLSL shaders
These shaders are used to specify code that is not dynamically generated
in the Vulkan backend. Instead of packing it inside the build system,
it's manually built and copied to the C++ file to avoid adding
unnecessary build time dependencies.

quad_array should be dropped in the future since it can be emulated with
a memory pool generated from the CPU.
2019-12-16 17:59:08 -03:00
bunnei 65b1b05e05
Merge pull request #3182 from ReinUsesLisp/renderer-opengl
renderer_opengl: Miscellaneous clean ups
2019-12-16 13:01:04 -05:00
ReinUsesLisp e09c1fbc1f
shader/texture: Implement TLD4.PTP 2019-12-16 04:09:24 -03:00
ReinUsesLisp 844e4a297b
shader/texture: Enable arrayed TLD4 2019-12-16 02:37:21 -03:00
ReinUsesLisp a87c85eba2
gl_shader_decompiler: Rename "sepparate" to "separate" 2019-12-16 02:12:51 -03:00
ReinUsesLisp 3d2c44848b
shader/texture: Implement AOFFI for TLD4S 2019-12-16 02:06:42 -03:00
ReinUsesLisp 3d9fff82c0
shader/texture: Remove unnecesary parenthesis 2019-12-16 01:52:33 -03:00
Rodrigo Locatti eac075692b
Merge pull request #3219 from FernandoS27/fix-bindless
Corrections and fixes to TLD4S & bindless samplers failing
2019-12-16 01:26:11 -03:00
bunnei 3d51153611
Merge pull request #3222 from ReinUsesLisp/maxwell-to-vk
maxwell_to_vk: Use VK_EXT_index_type_uint8 and misc changes
2019-12-14 22:30:12 -05:00
bunnei 035ec7d9de
Merge pull request #3213 from ReinUsesLisp/intel-mesa
gl_device: Enable compute shaders for Intel Mesa drivers
2019-12-14 16:04:31 -05:00
bunnei 2b650543c6
Merge pull request #3212 from ReinUsesLisp/fix-smem-lmem
gl_shader_cache: Add missing new-line on emitted GLSL
2019-12-13 21:35:29 -05:00
ReinUsesLisp e3ea583893
maxwell_to_vk: Improve image format table and add more formats
A1B5G5R5 uses A1R5G5B5. This is flipped with image view swizzles;
flushing is still not properly implemented on Vulkan for this particular
format.
2019-12-13 03:12:29 -03:00
ReinUsesLisp f27b21077d
maxwell_to_vk: Implement more vertex formats 2019-12-13 03:12:28 -03:00
ReinUsesLisp 8db8631d81
maxwell_to_vk: Implement more primitive topologies
Add an extra argument to query device capabilities in the future. The
intention behind this is to use native quads, quad strips, line loops
and polygons if these are released for Vulkan.
2019-12-13 03:12:28 -03:00
ReinUsesLisp 15513f0801
maxwell_to_vk: Approach GL_CLAMP closer to the GL spec
The OpenGL spec defines GL_CLAMP's formula similarly to CLAMP_TO_EDGE
and CLAMP_TO_BORDER depending on the filter mode used. It doesn't
exactly behave like this, but it's the closest we can get with what
Vulkan offers without emulating it by injecting shader code.
2019-12-13 03:12:28 -03:00
ReinUsesLisp f845df8651
maxwell_to_vk: Use VK_EXT_index_type_uint8 when available 2019-12-13 02:37:23 -03:00
ReinUsesLisp 2df9a2dcaf
vk_scheduler: Delegate commands to a worker thread and state track
Introduce a worker thread approach for delegating Vulkan work derived
from dxvk's approach. https://github.com/doitsujin/dxvk

Now that the scheduler is what handles all Vulkan work related to
command streaming, store state tracking in itself. This way we can know
when to reupload Vulkan dynamic state to the queue (since this one is
invalidated between command buffers unlike NVN). We can also store the
renderpass state and graphics pipeline bound to avoid redundant binds
and renderpass begins/ends.
2019-12-13 02:24:48 -03:00
bunnei 8fc49a83b6
Merge pull request #3217 from jhol/fix-boost-include
Added missing include
2019-12-11 22:21:24 -05:00
Fernando Sahmkow c0ee0aa1a8 Shader_IR: Correct TLD4S Depth Compare. 2019-12-11 19:53:17 -04:00
Fernando Sahmkow af89723fa3 Shader_Ir: Correct TLD4S encoding and implement f16 flag. 2019-12-11 19:53:17 -04:00
Fernando Sahmkow 84a158c977 Gl_Shader_compiler: Correct Depth Compare for Texture Gather operations. 2019-12-11 19:53:16 -04:00
Fernando Sahmkow 271a3264f3 Shader_Ir: default failed tracks on bindless samplers to null values. 2019-12-11 19:53:16 -04:00
Fernando Sahmkow 1d2ba3cc97 Gl_Rasterizer: Skip Tesselation Control and Eval stages as they are un implemented.
This commit ensures the OGL backend does not execute tesselation shader 
stages as they are currently unimplemented.
2019-12-11 15:41:26 -04:00
bunnei 1a66cde175
Merge pull request #3210 from ReinUsesLisp/memory-barrier
shader: Implement MEMBAR.GL
2019-12-11 14:24:39 -05:00
Joel Holdsworth e9faa1617c Added missing include 2019-12-11 18:11:49 +00:00
ReinUsesLisp f564eaebed
gl_device: Enable compute shaders for Intel Mesa drivers
Previously we naively checked for "Intel" in GL_VENDOR, but this
includes both Intel's proprietary driver and the mesa driver. Re-enable
compute shaders for mesa.
2019-12-11 00:00:30 -03:00
ReinUsesLisp 48e16c4c49
gl_shader_cache: Add missing new-line on emitted GLSL
Add missing new-line. This caused shaders using local memory and shared
memory to inject a preprocessor GLSL line after an expression (resulting
in invalid code).

It looked like this:
shared uint smem[8];#define LOCAL_MEMORY_SIZE 16

It should look like this (addressed by this commit):
shared uint smem[8];
\#define LOCAL_MEMORY_SIZE 16
2019-12-10 23:52:51 -03:00
Fernando Sahmkow 7ffb672f61 Maxwell3D: Implement Depth Mode.
This commit finishes adding depth mode that was reverted before due to
other unresolved issues.
2019-12-10 19:51:46 -04:00
ReinUsesLisp 425a254fa2
shader: Implement MEMBAR.GL
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10 16:45:03 -03:00
ReinUsesLisp 233ed96a5c
vk_shader_decompiler: Fix build issues on old gcc versions 2019-12-10 01:55:38 -03:00
ReinUsesLisp d30cf51d7d
vk_shader_decompiler: Reduce YNegate's severity 2019-12-09 23:52:28 -03:00
ReinUsesLisp 0b5b93053d
shader_ir/other: Implement S2R InvocationId 2019-12-09 23:52:28 -03:00
ReinUsesLisp ecbfa416f0
vk_shader_decompiler: Misc changes
Update Sirit and its usage in vk_shader_decompiler. Highlights:
- Implement tessellation shaders
- Implement geometry shaders
- Implement some missing features
- Use native half float instructions when available.
2019-12-09 23:51:57 -03:00
ReinUsesLisp 9ad6327fbd
shader: Keep track of shaders using warp instructions 2019-12-09 23:40:41 -03:00
ReinUsesLisp 6233b1db08
shader_ir/memory: Implement patch stores 2019-12-09 23:25:21 -03:00
ReinUsesLisp 19ce0d4f1a
vk_device: Misc changes
- Setup more features and requirements.
- Improve logging for missing features.
- Collect telemetry parameters.
- Add queries for more image formats.
- Query push constants limits.
- Optionally enable some extensions.
2019-12-09 01:04:48 -03:00
bunnei faf5ae6a50
Merge pull request #3198 from ReinUsesLisp/tessellation-maxwell
maxwell_3d: Add tessellation state entries
2019-12-08 22:28:25 -05:00
ReinUsesLisp 7ea362e134
externals: Update Vulkan-Headers 2019-12-08 22:08:19 -03:00
ReinUsesLisp f632d00eb1
vk_swapchain: Add support for swapping sRGB
We don't know until the game is running if it's using an sRGB color
space or not. Add support for hot-swapping swapchain surface formats.
2019-12-06 22:42:08 -03:00
ReinUsesLisp 36651f215a
maxwell_3d: Add tessellation tess level registers 2019-12-06 22:08:22 -03:00
ReinUsesLisp 707bf41c6f
maxwell_3d: Add tessellation mode register 2019-12-06 22:07:31 -03:00
ReinUsesLisp d2b50c5ebd
maxwell_3d: Add patch vertices register 2019-12-06 22:06:53 -03:00
ReinUsesLisp 74f515e8b6
shader_bytecode: Remove corrupted character 2019-12-06 20:31:56 -03:00
bunnei e36814d6d5
Merge pull request #3109 from FernandoS27/new-instr
Implement FLO & TXD Instructions on GPU Shaders
2019-12-06 18:18:16 -05:00
bunnei 3c1b6b5723
Merge pull request #2987 from FernandoS27/texture-invalid
Texture_Cache: Redo invalid Surfaces handling.
2019-12-02 12:07:05 -05:00
bunnei 930b7c18a6
Merge pull request #3184 from ReinUsesLisp/framebuffer-cache
gl_framebuffer_cache: Optimize framebuffer cache management
2019-11-30 18:46:40 -05:00
ReinUsesLisp ff64c3951a
texture_cache/surface_base: Fix out of bounds texture views
Some texture views were being created out of bounds (with more layers or
mipmaps than what the original texture has). This is because of a
miscalculation in mipmap bounding. end_layer and end_mipmap are out of
bounds (e.g. layer 6 in a cubemap), there's no need to add one more
there.

Fixes OpenGL errors and Vulkan crashes on Splatoon 2.
2019-11-29 16:51:14 -03:00
ReinUsesLisp fb6cf12a17
gl_framebuffer_cache: Optimize framebuffer key
Pack color attachment enumerations into a single u32. To determine the
number of buffers, the highest color attachment with a shared pointer
that doesn't point to null is used.
2019-11-28 23:02:20 -03:00
ReinUsesLisp c34da106ed
gl_rasterizer: Re-enable framebuffer cache for clear buffers 2019-11-28 23:02:20 -03:00
ReinUsesLisp e6a0a30334
renderer_opengl: Make ScreenRectVertex's constructor constexpr 2019-11-28 20:36:02 -03:00
ReinUsesLisp dee7844443
renderer_opengl: Remove C casts 2019-11-28 20:28:27 -03:00
ReinUsesLisp 3a44faff11
renderer_opengl: Use explicit binding for presentation shaders 2019-11-28 20:25:56 -03:00
ReinUsesLisp 75cc501d52
renderer_opengl: Drop macros for message decorations 2019-11-28 20:15:25 -03:00
ReinUsesLisp 056f049b26
renderer_opengl: Move static definitions to anonymous namespace 2019-11-28 20:14:40 -03:00
ReinUsesLisp 4589582eaf
renderer_opengl: Move commentaries to header file 2019-11-28 20:11:03 -03:00
bunnei e3ee017e91
Merge pull request #3169 from lioncash/memory
core/memory: Deglobalize memory management code
2019-11-28 11:43:17 -05:00
Rodrigo Locatti 913d0bb269
Merge pull request #3174 from lioncash/optional
video_core/gpu_thread: Tidy up SwapBuffers()
2019-11-27 20:35:31 -03:00
Lioncash aed6d8bef5 video_core/gpu_thread: Tidy up SwapBuffers()
We can just use std::nullopt and std::make_optional to make this a
little bit less noisy.
2019-11-27 17:46:11 -05:00
Lioncash 9403979c22 video_core/const_buffer_locker: Make use of std::tie in HasEqualKeys()
Tidies it up a little bit visually.
2019-11-27 05:53:43 -05:00
Lioncash 930e311526 video_core/const_buffer_locker: Remove unused includes 2019-11-27 05:51:13 -05:00
Lioncash 9341ca7979 video_core/const_buffer_locker: Remove #pragma once from cpp file
Silences a compiler warning.
2019-11-27 05:50:51 -05:00
Lioncash 849581075a core/memory: Migrate over RasterizerMarkRegionCached() to the Memory class
This is only used within the accelerated rasterizer in two places, so
this is also a very trivial migration.
2019-11-26 21:55:38 -05:00
Lioncash 3f08e8d8d4 core/memory: Migrate over GetPointer()
With all of the interfaces ready for migration, it's trivial to migrate
over GetPointer().
2019-11-26 21:55:38 -05:00
Lioncash 536fc7f0ea core: Prepare various classes for memory read/write migration
Amends a few interfaces to be able to handle the migration over to the
new Memory class by passing the class by reference as a function
parameter where necessary.

Notably, within the filesystem services, this eliminates two ReadBlock()
calls by using the helper functions of HLERequestContext to do that for
us.
2019-11-26 21:55:37 -05:00
bunnei 6df6caaf5f
Merge pull request #3143 from ReinUsesLisp/indexing-bug
gl_device: Deduce indexing bug from device instead of heuristic
2019-11-26 21:53:12 -05:00
ReinUsesLisp ef4446cb11
gl_shader_decompiler: Fix casts from fp32 to f16
Casts from f32 to f16 zeroes the higher half of the target register.
2019-11-25 22:22:33 -03:00
ReinUsesLisp 410d44ce05
gl_device: Deduce indexing bug from device instead of heuristic
The heuristic to detect AMD's driver was not working properly since it
also included Intel. Instead of using heuristics to detect it, compare
the GL_VENDOR string.
2019-11-25 16:15:22 -03:00
bunnei 2899c93818
Merge pull request #3158 from ReinUsesLisp/srgb-blit
gl_texture_cache: Apply sRGB on blits
2019-11-24 20:47:13 -05:00
bunnei 33a6b45a6c
Merge pull request #3155 from bunnei/fix-asynch-gpu-wait
gpu_thread: Don't spin wait if there are no GPU commands.
2019-11-24 20:19:25 -05:00
bunnei b03242067d
Merge pull request #3098 from ReinUsesLisp/shader-invalidations
gl_shader_cache: Miscellaneous changes to shaders
2019-11-24 19:36:30 -05:00
ReinUsesLisp 74fff717aa
gl_texture_cache: Apply sRGB on blits
glBlitFramebuffer keeps in mind GL_FRAMEBUFFER_SRGB's state. Enable this
depending on the target surface pixel format.
2019-11-24 18:13:33 -03:00
bunnei b7031b2b9d
Merge pull request #3105 from ReinUsesLisp/fix-stencil-reg
maxwell_3d: Fix stencil_back_func_mask offset
2019-11-24 13:53:23 -05:00
bunnei e81e0036b4
Merge pull request #3145 from ReinUsesLisp/buffer-cache-init
buffer_cache: Remove brace initialized for objects with default constructor
2019-11-24 02:55:02 -05:00
bunnei 9ec84fc592 gpu_thread: Don't spin wait if there are no GPU commands. 2019-11-23 15:17:28 -05:00
bunnei 4ed183ee42
Merge pull request #3141 from ReinUsesLisp/gl-position
gl_shader_gen: Apply default value to gl_Position
2019-11-23 13:23:46 -05:00
ReinUsesLisp dc2e83fa31
gl_device: Reserve base bindings on limited devices
SSBOs and other resources are limited per pipeline on Intel and AMD.
Heuristically reserve resources per stage having in mind the reported
OpenGL limits.
2019-11-22 21:28:50 -03:00
ReinUsesLisp e3d7334be9
gl_state: Skip null texture binds
glBindTextureUnit doesn't support null textures. Skip binding these.
2019-11-22 21:28:50 -03:00
ReinUsesLisp 919ac2c4d3
gl_rasterizer: Disable compute shaders on Intel
Intel's proprietary driver enters in a corrupt state when compute
shaders are executed. For now, disable these.
2019-11-22 21:28:50 -03:00
ReinUsesLisp 894ad74b87
gl_shader_cache: Hack shared memory size
The current shared memory size seems to be smaller than what the game
actually uses. This makes Nvidia's driver consistently blow up; in the
case of FE3H it made it explode on Qt's SwapBuffers while SDL2 worked
just fine. For now keep this hack since it's still progress over the
previous hardcoded shared memory size.
2019-11-22 21:28:49 -03:00
ReinUsesLisp e35b9597ef
gl_shader_decompiler: Normalize image bindings 2019-11-22 21:28:49 -03:00
ReinUsesLisp 36d9b409fc
gl_shader_decompiler: Normalize cbuf bindings
Stage and compute shaders were using a different binding counter.
Normalize these.
2019-11-22 21:28:49 -03:00
ReinUsesLisp f936b86c7c
gl_rasterizer: Add missing cbuf counter reset on compute 2019-11-22 21:28:49 -03:00
ReinUsesLisp 180417c514
gl_shader_cache: Remove dynamic BaseBinding specialization 2019-11-22 21:28:49 -03:00
ReinUsesLisp c8a48aacc0
video_core: Unify ProgramType and ShaderStage into ShaderType 2019-11-22 21:28:48 -03:00
ReinUsesLisp 0f23359a44
gl_rasterizer: Bind graphics images to draw commands
Images were not being bound to draw invocations because these would
require a cache invalidation.
2019-11-22 21:28:48 -03:00
ReinUsesLisp 287ae2b9e8
gl_shader_cache: Specialize local memory size for compute shaders
Local memory size in compute shaders was stubbed with an arbitary size.
This commit specializes local memory size from guest GPU parameters.
2019-11-22 21:28:48 -03:00
ReinUsesLisp dbeb523879
gl_shader_cache: Specialize shared memory size
Shared memory was being declared with an undefined size. Specialize from
guest GPU parameters the compute shader's shared memory size.
2019-11-22 21:28:47 -03:00
ReinUsesLisp 4f5d8e4342
gl_shader_cache: Specialize shader workgroup
Drop the usage of ARB_compute_variable_group_size and specialize compute
shaders instead. This permits compute to run on AMD and Intel
proprietary drivers.
2019-11-22 21:28:47 -03:00
ReinUsesLisp dc9961f341
shader/texture: Handle TLDS texture type mismatches
Some games like "Fire Emblem: Three Houses" bind 2D textures to offsets
used by instructions of 1D textures. To handle the discrepancy this
commit uses the the texture type from the binding and modifies the
emitted code IR to build a valid backend expression.

E.g.: Bound texture is 2D and instruction is 1D, the emitted IR samples
a 2D texture in the coordinate ivec2(X, 0).
2019-11-22 21:28:47 -03:00
ReinUsesLisp 32c1bc6a67
shader/texture: Deduce texture buffers from locker
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
ReinUsesLisp 73aaf365e7
buffer_cache: Remove brace initialized for objects with default constructor 2019-11-20 16:00:40 -03:00
Fernando Sahmkow cc81c0ce64 Texture_Cache: Redo invalid Surfaces handling.
This commit aims to redo the full setup of invalid textures and
guarantee correct behavior across backends in the case of finding one by
using black dummy textures that match the target of the expected
texture.
2019-11-20 14:59:35 -04:00
ReinUsesLisp 24f4198cee
shader/other: Reduce DEPBAR log severity
While DEPBAR is stubbed it doesn't change anything from our end. Shading
languages handle what this instruction does implicitly. We are not
getting anything out fo this log except noise.
2019-11-19 21:26:40 -03:00
ReinUsesLisp bc10714dcf
gl_shader_gen: Apply default value to gl_Position
Nvidia has sane default output values for varyings, but the other
vendors don't apply these. To properly emulate this we would have to
analyze the shader header. For the time being, apply the same default
Nvidia applies so we get the same behaviour on non-Nvidia drivers.
2019-11-19 20:32:01 -03:00
bunnei b0819e2ffb
Merge pull request #3086 from ReinUsesLisp/format-lookups
texture_cache: Use a flat table instead of switch for texture format lookups
2019-11-19 18:29:17 -05:00
Fernando Sahmkow c8473f399e Shader_IR: Address Feedback 2019-11-18 07:34:34 -04:00
bunnei a8295d2c53
Merge pull request #3047 from ReinUsesLisp/clip-control
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
2019-11-15 12:09:19 -05:00
ReinUsesLisp 4681381a34
format_lookup_table: Address feedback
format_lookup_table: Drop bitfields

format_lookup_table: Use std::array for definition table

format_lookup_table: Include <limits> instead of <numeric>
2019-11-14 20:57:30 -03:00
ReinUsesLisp 80eacdf89b
texture_cache: Use a table instead of switch for texture formats
Use a large flat array to look up texture formats. This allows us to
properly implement formats with different component types. It should
also be faster.
2019-11-14 20:57:10 -03:00
ReinUsesLisp 48a1687f51
texture_cache: Drop abstracted ComponentType
Abstracted ComponentType was not being used in a meaningful way.
This commit drops its usage.

There is one place where it was being used to test compatibility between
two cached surfaces, but this one is implied in the pixel format.
Removing the component type test doesn't change the behaviour.
2019-11-14 18:21:42 -03:00
greggameplayer c6bc13d0aa correct the implementation of RGBA16UI 2019-11-14 21:37:39 +01:00
Fernando Sahmkow cd0f5dfc17 Shader_IR: Implement TXD instruction. 2019-11-14 11:15:27 -04:00
Fernando Sahmkow f3d1b370aa Shader_IR: Implement FLO instruction. 2019-11-14 11:15:27 -04:00
Fernando Sahmkow 95137a04e1 Shader_Bytecode: Add encodings for FLO, SHF and TXD 2019-11-14 11:15:26 -04:00
Fernando Sahmkow b6f6733131
Merge pull request #3081 from ReinUsesLisp/fswzadd-shuffles
shader: Implement FSWZADD and reimplement SHFL
2019-11-14 10:27:27 -04:00
ReinUsesLisp 7990220df7
maxwell_3d: Fix stencil_back_func_mask offset
stencil_back_func_mask and stencil_back_mask were misplaced. This commit
addresses that issue.
2019-11-13 16:35:17 -03:00
Rodrigo Locatti cf770a68a5
Merge pull request #3084 from ReinUsesLisp/cast-warnings
video_core: Treat implicit conversions as errors
2019-11-13 02:16:22 -03:00
Rodrigo Locatti fb9418798d
video_core: Enable sign conversion warnings
Enable sign conversion warnings but don't treat them as errors.
2019-11-11 18:00:37 -03:00
bunnei 0fc596de6e
Merge pull request #3082 from ReinUsesLisp/fix-lockers
gl_shader_cache: Fix locker constructors
2019-11-09 13:58:36 -05:00
ReinUsesLisp 18c1cb68fd video_core: Treat implicit conversions as errors 2019-11-08 22:49:39 +00:00
ReinUsesLisp 096f339a2a video_core: Silence implicit conversion warnings 2019-11-08 22:48:50 +00:00
bunnei a056d8de16
Merge pull request #3080 from FernandoS27/glsl-fix
GLSLDecompiler: Correct Texture Gather Offset.
2019-11-08 15:56:29 -05:00
ReinUsesLisp bfa973a62b
gl_shader_cache: Fix locker constructors
Properly pass engine when a shader is being constructed from memory.
2019-11-07 20:43:31 -03:00
ReinUsesLisp 3ab0514698
gl_shader_cache: Enable extensions only when available
Silence GLSL compilation warnings.
2019-11-07 20:08:42 -03:00
ReinUsesLisp cd66395944
gl_shader_decompiler: Add safe fallbacks when ARB_shader_ballot is not available 2019-11-07 20:08:42 -03:00
ReinUsesLisp 56e237d1f9
shader_ir/warp: Implement FSWZADD 2019-11-07 20:08:41 -03:00
ReinUsesLisp 08b2b1080a
gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics 2019-11-07 20:08:41 -03:00
Fernando Sahmkow 3d7c284e0f GLSLDecompiler: Correct Texture Gather Offset.
This commit corrects the argument ordering in textureGatherOffset.
2019-11-07 11:43:56 -04:00
bunnei b6ae48966d
Merge pull request #3032 from ReinUsesLisp/simplify-control-flow-brx
shader/control_flow: Abstract repeated code chunks in BRX tracking
2019-11-07 01:30:01 -05:00
Morph 0e8a3bf3e5 buffer_cache: Add missing includes (#3079)
`boost::make_iterator_range` is available when `boost/range/iterator_range.hpp` is included.
Also include `boost/icl/interval_map.hpp` and `boost/icl/interval_set.hpp`.
2019-11-07 06:25:53 +00:00
bunnei 344d15f61e
Merge pull request #3070 from ReinUsesLisp/shader-warnings
shader_ir: Reduce severity of warnings
2019-11-07 00:47:24 -05:00
ReinUsesLisp e9d2fad984
gl_rasterizer: Remove front facing hack 2019-11-07 01:52:18 -03:00
ReinUsesLisp f1facaeaef
gl_shader_decompiler: Fix typo "y_negate"->"y_direction" 2019-11-07 01:52:18 -03:00
ReinUsesLisp e2ea0c3e11
gl_shader_manager: Remove unused variable in SetFromRegs 2019-11-07 01:52:18 -03:00
ReinUsesLisp f019817f8f
gl_rasterizer: Emulate viewport flipping with ARB_clip_control
Emulates negative y viewports with ARB_clip_control. This allows us to
more easily emulated pipelines with tessellation and/or geometry shader
stages. It also avoids corrupting games with transform feedbacks and
negative viewports (gl_Position.y was being modified).
2019-11-07 01:52:18 -03:00
Rodrigo Locatti ff5a0f370c
shader/control_flow: Specify constness on caller lambdas
Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>

Update src/video_core/shader/control_flow.cpp

Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-11-07 01:44:09 -03:00
ReinUsesLisp 7b069252f8
shader/control_flow: Use callable template instead of std::function 2019-11-07 01:44:08 -03:00
ReinUsesLisp 46c3047283
shader/control_flow: Abstract repeated code chunks in BRX tracking
Remove copied and pasted for cycles into a common templated function.
2019-11-07 01:44:08 -03:00
ReinUsesLisp ae7dfa93be
shader/control_flow: Silence Intellisense cast warnings 2019-11-07 01:44:08 -03:00
ReinUsesLisp deb1b54eed
shader/control_flow: Remove brace initializer in std containers
These containers have a default constructor.
2019-11-07 01:44:08 -03:00
ReinUsesLisp 39c66abd91
shader/decode: Reduce severity of arithmetic rounding warnings 2019-11-07 01:43:38 -03:00
ReinUsesLisp c4374d0d41
shader/arithmetic: Reduce RRO stub severity 2019-11-07 01:43:38 -03:00
ReinUsesLisp 35d40b74b3
shader/texture: Remove NODEP warnings
These warnings don't offer meaningful information while decoding
shaders. Remove them.
2019-11-07 01:43:38 -03:00
bunnei 468576284d
Merge pull request #3057 from ReinUsesLisp/buffer-sub-data
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
2019-11-06 10:08:55 -05:00
Rodrigo Locatti 654b77d2ec
Merge pull request #3039 from ReinUsesLisp/cleanup-samplers
shader/node: Unpack bindless texture encoding
2019-11-06 04:54:11 +00:00
bunnei 21e07df7b7
Merge pull request #2914 from FernandoS27/fermi-fix
Fermi2D: limit blit area to only available area
2019-11-05 20:45:24 -05:00
bunnei 1bdae0fe29 common_func: Use std::array for INSERT_PADDING_* macros.
- Zero initialization here is useful for determinism.
2019-11-03 22:22:41 -05:00
ReinUsesLisp 442a1cc021
gl_rasterizer: Re-enable stream buffer memory due to global memory
Global memory is still using the stream buffer when it shouldn't. As a
temporary fix re-enable the stream buffer on compute.
2019-11-02 13:19:19 -03:00
ReinUsesLisp 76ca2a5f82
gl_rasterizer: Upload constant buffers with glNamedBufferSubData
Nvidia's OpenGL driver maps gl(Named)BufferSubData with some requirements
to a fast. This path has an extra memcpy but updates the buffer without
orphaning or waiting for previous calls. It can be seen as a better
model for "push constants" that can upload a whole UBO instead of 256
bytes.

This path has some requirements established here:
http://on-demand.gputechconf.com/gtc/2014/presentations/S4379-opengl-44-scene-rendering-techniques.pdf#page=24

Instead of using the stream buffer, this commits moves constant buffers
uploads to calls of glNamedBufferSubData and from my testing it brings a
performance improvement. This is disabled when the vendor is not Nvidia
since it brings performance regressions.
2019-11-02 05:05:34 -03:00
Fernando Sahmkow 23cabc98db Shader_IR: Fix regression on TLD4
Originally on the last commit I thought TLD4 acted the same as TLD4S and 
didn't have a mask. It actually does have a component mask. This commit 
corrects that.
2019-10-30 21:14:57 -04:00
Rodrigo Locatti 658489ebf7
Merge pull request #3050 from FernandoS27/fix-tld4
shader_ir: Fix TLD4 and add bindless variant
2019-10-30 18:37:17 +00:00
Fernando Sahmkow 9293c3a0f2 Shader_IR: Fix TLD4 and add Bindless Variant.
This commit fixes an issue where not all 4 results of tld4 were being
written, the color component was defaulted to red, among other things.
It also implements the bindless variant.
2019-10-30 12:02:03 -04:00
bunnei 2382bbe3ac
Merge pull request #3046 from ReinUsesLisp/clean-gl-state
gl_state: Miscellaneous clean up
2019-10-29 22:50:04 -04:00
bunnei b5138f3c35
Merge pull request #3035 from ReinUsesLisp/rasterizer-accelerated
rasterizer_accelerated: Add intermediary for GPU rasterizers
2019-10-29 22:06:41 -04:00
Rodrigo Locatti 3d0cde6a75
gl_state: Use std::array::fill instead of std::fill
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2019-10-30 01:30:31 +00:00
ReinUsesLisp ce20ed8e4e
gl_state: Move dirty checks to individual apply calls instead of Apply
This requires removing constness from some methods, but for consistency
it's removed in all methods.
2019-10-29 21:27:25 -03:00
ReinUsesLisp 3c6557c235
gl_state: Remove ApplyDefaultState
OpenGL has defaults values we can trust. Remove these.
2019-10-29 21:27:25 -03:00
ReinUsesLisp d3651b0b82
gl_state: Change SetDefaultViewports to use default constructor 2019-10-29 21:27:24 -03:00
ReinUsesLisp c7698d0bc8
gl_state: Minor style changes 2019-10-29 21:27:24 -03:00
ReinUsesLisp a14d202ac2
gl_state: Remove unused Citra TextureUnits 2019-10-29 21:27:24 -03:00
ReinUsesLisp 28fece8e9b
gl_state: Move initializers from constructor to class declaration 2019-10-29 21:27:23 -03:00
ReinUsesLisp a993df1ee2
shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
Rodrigo Locatti 2ec5b55ee3
Merge pull request #3004 from ReinUsesLisp/maxwell3d-cleanup
maxwell_3d: Remove unused entries
2019-10-29 23:46:33 +00:00
Rodrigo Locatti c5d9589942
Merge pull request #3037 from FernandoS27/new-formats
video_core: Implement texture format E5B9G9R9_SHAREDEXP.
2019-10-28 01:36:58 -03:00
ReinUsesLisp fa31e5b868
maxwell_3d/kepler_compute: Remove unused arguments in GetTexture 2019-10-28 00:23:42 -03:00
ReinUsesLisp 538ddd220e
video_core/textures: Remove unused index entry in FullTextureInfo 2019-10-28 00:14:38 -03:00
ReinUsesLisp 961fe4d19b
maxwell_3d: Remove unused method GetStageTextures 2019-10-28 00:14:29 -03:00
Fernando Sahmkow 3f9262195b Video_Core: Implement texture format E5B9G9R9_SHAREDEXP.
This commit implements the E5B9G9R9 Texture format into the general 
system and OpenGL backend.
2019-10-27 16:44:09 -04:00
bunnei 6909b2f0f9
Merge pull request #3034 from ReinUsesLisp/w4244-maxwell3d
maxwell_3d: Silence implicit conversion warnings
2019-10-27 15:08:59 -04:00
ReinUsesLisp 3e469cecc1
maxwell_3d: Silence implicit conversion warnings
While we are at it, unify types for dirty reg pointers.
2019-10-27 15:22:17 -03:00
ReinUsesLisp bd2aff3e26
rasterizer_accelerated: Add intermediary for GPU rasterizers
Add an intermediary class that implements common functions across GPU
accelerated rasterizers. This avoids code repetition on different
backends.
2019-10-27 03:40:08 -03:00
ReinUsesLisp a5aa1bb174
astc: Silence implicit conversion warnings 2019-10-27 03:04:50 -03:00
Rodrigo Locatti 26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
Implement Fast BRX, fix TXQ and addapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow be856a38d6 Shader_IR: Address Feedback. 2019-10-26 15:38:30 -04:00
Rodrigo Locatti a0d79085c4
Merge pull request #3027 from lioncash/lookup
shader_ir: Use std::array with std::pair instead of std::unordered_map
2019-10-26 05:49:15 -03:00
Rodrigo Locatti d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
Fernando Sahmkow e3afd6595a Shader_IR: Clang format 2019-10-25 09:01:32 -04:00
ReinUsesLisp 78f3e8a757 gl_shader_cache: Implement locker variants invalidation 2019-10-25 09:01:32 -04:00
ReinUsesLisp ec85648af3 gl_shader_disk_cache: Store and load fast BRX 2019-10-25 09:01:31 -04:00
ReinUsesLisp fa2c297f3e const_buffer_locker: Minor style changes 2019-10-25 09:01:31 -04:00
ReinUsesLisp 7b81ba4d8a gl_shader_decompiler: Move entries to a separate function 2019-10-25 09:01:31 -04:00
Fernando Sahmkow 1244f2d368 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:31 -04:00
Fernando Sahmkow a05120ec0b Shader_IR: Correct typo in Consistent method. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow 33fcec3502 Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it 2019-10-25 09:01:30 -04:00
Fernando Sahmkow 8909f52166 Shader_IR: Implement Fast BRX and allow multi-branches in the CFG. 2019-10-25 09:01:30 -04:00
Fernando Sahmkow acd6441134 Shader_Cache: setup connection of ConstBufferLocker 2019-10-25 09:01:29 -04:00
Fernando Sahmkow 1a58f45d76 VideoCore: Unify const buffer accessing along engines and provide ConstBufferLocker class to shaders. 2019-10-25 09:01:29 -04:00
Fernando Sahmkow 2ef696c85a Shader_IR: Implement BRX tracking. 2019-10-25 09:01:29 -04:00
Rodrigo Locatti 5062728669
Merge pull request #3028 from lioncash/constexpr
shader_bytecode: Make Matcher constexpr capable
2019-10-24 15:10:40 -03:00
Lioncash 7fdf991097 shader_bytecode: Make Matcher constexpr capable
Greatly shrinks the amount of generated code for GetDecodeTable().

Collapses an assembly output of 9000+ lines down to ~3621 with Clang,
and 6513 down to ~2616 with GCC, given it's now allowed to construct all
the entries as a sequence of constant data.
2019-10-24 01:10:10 -04:00
Lioncash 382717172e shader_ir: Use std::array with pair instead of unordered_map
Given the overall size of the maps are very small, we can use arrays of
pairs here instead of always heap allocating a new map every time the
functions are called. Given the small size of the maps, the difference
in container lookups are negligible, especially given the entries are
already sorted.
2019-10-24 00:25:38 -04:00
Lioncash 1f5401c89c video_core/shader: Resolve instances of variable shadowing
Silences a few -Wshadow warnings.
2019-10-23 23:00:31 -04:00
Fernando Sahmkow c4a0aa9207
Merge pull request #2995 from ReinUsesLisp/ignore-gmem
shader_ir/memory: Ignore global memory when tracking fails
2019-10-22 13:22:43 -04:00
Fernando Sahmkow 7ecf9f7228
Merge pull request #2983 from lioncash/fallthrough
gl_shader_decompiler/vk_shader_decompiler: Resolve implicit fallthrough cases
2019-10-22 13:16:46 -04:00
Fernando Sahmkow 1509d2ffbd Shader_Ir: Fix TLD4S from using a component mask.
TLD4S always outputs 4 values, the previous code checked a component 
mask and omitted those values that weren't part of it. This commit 
corrects that and makes sure all 4 values are set.
2019-10-22 10:59:07 -04:00
ReinUsesLisp 1ea07954fb shader_ir/memory: Ignore global memory when tracking fails
Ignore global memory operations instead of invoking undefined behaviour
when constant buffer tracking fails and we are blasting through asserts,
ignore the operation.

In the case of LDG this means filling the destination registers with
zeroes; for STG this means ignore the instruction as a whole.

The default behaviour is still to abort execution on failure.
2019-10-22 02:49:17 -03:00
ReinUsesLisp e3107788e6
maxwell_3d: Reduce FlushMMEInlineDraw logging to Trace 2019-10-20 03:43:17 -03:00
Rodrigo Locatti dc5eedef71
Merge pull request #2994 from lioncash/fmt
video_core/shader/ast: Minor changes to ASTPrinter
2019-10-18 01:05:25 -03:00
Lioncash 074b38b7a9 video_core/shader/ast: Make ShowCurrentState() and SanityCheck() const member functions
These can also trivially be made const member functions, with the
addition of a few consts.
2019-10-17 20:59:48 -04:00
Lioncash 222f4b45eb video_core/shader/ast: Make ASTManager::Print a const member function
Given all visiting functions never modify the nodes, we can trivially
make this a const member function.
2019-10-17 20:56:39 -04:00
Rodrigo Locatti fd922ddb01
Merge pull request #2993 from lioncash/vulkan-expr
vk_shader_decompiler: Mark operator() function parameters as const references
2019-10-17 21:46:49 -03:00
Lioncash 7831e86c34 video_core/shader/ast: Make ExprPrinter members private
This member already has an accessor, so there's no need for it to be
public.
2019-10-17 20:39:36 -04:00
Lioncash a2eccbf075 video_core/shader/ast: Make Indent() return a string_view
The returned string is simply a substring of our constexpr tabs
string_view, so we can just use a string_view here as well, since the
original string_view is guaranteed to always exist.

Now the function is fully non-allocating.
2019-10-17 20:29:00 -04:00
Lioncash 15d177a6ac video_core/shader/ast: Make Indent() private
It's never used outside of this class, so we can narrow its scope down.
2019-10-17 20:26:13 -04:00
Lioncash 7f6a8a33d4 video_core/shader/ast: Rename Ident() to Indent()
This can be confusing, given "ident" is generally used as a shorthand
for "identifier".
2019-10-17 20:26:13 -04:00
Lioncash 081530686c video_core/shader/ast: Make use of fmt where applicable
Makes a few strings nicer to read and also eliminates a bit of string
churn with operator+.
2019-10-17 20:26:10 -04:00
Lioncash c6bec9aa10 vk_shader_decompiler: Mark operator() function parameters as const references
These parameters aren't actually modified in any way, so they can be
made const references.
2019-10-17 19:44:00 -04:00
Rodrigo Locatti 219fdcb9d9
Merge pull request #2966 from FernandoS27/astc-formats
Implement a series of ASTC formats and R4G4B4A4 format
2019-10-17 19:24:11 -03:00
Rodrigo Locatti a21b88ef8f
Merge pull request #2979 from lioncash/macro
video_core/macro_interpreter: Make definitions of most private enums/unions hidden
2019-10-17 19:21:09 -03:00
Fernando Sahmkow c0eb1aecfd Fermi2D: Use a different formula for delimiting blit areas. 2019-10-17 18:21:01 -04:00
Lioncash 125caf5d6e video_core/macro_interpreter: Make definitions of most private enums/unions hidden
This allows the implementation of these types to change without
requiring a rebuild of everything that includes the macro interpreter
header.
2019-10-17 17:55:46 -04:00
bunnei 9fe8072c67
Merge pull request #2980 from lioncash/warn
maxwell_3d: Silence truncation warnings
2019-10-17 14:02:16 -04:00
Fernando Sahmkow 57a46c69f1 Fermi2D: limit blit area to only available area
Normaly OpenGL does not care if the areas exceed the texture regions but
other backends such as Vulkan do care about the limits of this areas.
This PR crops the areas of the blit in order that they don't surpass the
limits of the textures. This should help Vulkan and faulty OpenGL
drivers
2019-10-17 10:38:44 -04:00
Rodrigo Locatti 60c602e4e7
Merge pull request #2978 from lioncash/doxygen
video_core/texture_cache: Amend Doxygen references
2019-10-16 22:09:40 -03:00
Rodrigo Locatti e00b529a89
Merge pull request #2982 from lioncash/surface
texture_cache: Avoid unnecessary surface copies within PickStrategy() and TryReconstructSurface()
2019-10-16 19:43:32 -03:00
bunnei ef9b31783d
Merge pull request #2912 from FernandoS27/async-fixes
General fixes to Async GPU
2019-10-16 10:34:48 -04:00
Rodrigo Locatti 60315060b1
Merge pull request #2984 from lioncash/fallthrough2
video_core/surface: Add missing break in PixelFormatFromTextureFormat()
2019-10-15 23:08:34 -03:00
Lioncash cf9e13c255 video_core/surface: Add missing break in PixelFormatFromTextureFormat()
Prevents fallthrough into the following case.
2019-10-15 21:53:15 -04:00
Rodrigo Locatti 14f3cebcd4
Merge pull request #2981 from lioncash/copy
gl_shader_decompiler: Minor cleanup-related changes
2019-10-15 21:07:25 -03:00
Lioncash 6947bf8e44 vk_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()
This would previously result in NeverExecute and UnusedIndex being
treated as regular predicates.
2019-10-15 19:40:58 -04:00
Lioncash b42a74ff2c gl_shader_decompiler: Resolve fallthrough within ExprDecompiler's ExprCondCode operator()
This would previously result in NeverExecute and UnusedIndex being
treated as regular predicates.
2019-10-15 19:38:55 -04:00
Lioncash a24e8bf9cf texture_cache: Avoid unnecessary surface copies within PickStrategy() and TryReconstructSurface()
We can take these by const reference and avoid making unnecessary
copies, preventing some atomic reference count increments and
decrements.
2019-10-15 19:31:33 -04:00
Lioncash 77b4916b33 control_flow: Silence truncation warnings
This can be trivially fixed by making the input size a size_t.
CFGRebuildState's constructor parameter is already a std::size_t, so
this just makes the size type fully conform with it.
2019-10-15 19:10:28 -04:00
Lioncash 4f16ce9294 gl_shader_decompiler: Make ExprDecompiler's GetResult() a const member function
This is only ever used to read, but not write, the resulting string, so
we can enforce this by making it a const member function.
2019-10-15 19:02:59 -04:00
Lioncash 67df3f7742 gl_shader_decompiler: Use a std::string_view with GetDeclarationWithSuffix()
This allows the function to be completely non-allocating for inputs of
all sizes (i.e. there's no heap cost for an input to convert to a
std::string_view).
2019-10-15 19:00:48 -04:00
Lioncash 04a1161354 gl_shader_decompiler: Fold flow_var constant into GetFlowVariable()
This is only ever used within this function, so we can narrow it's scope
down.
2019-10-15 18:58:36 -04:00
Lioncash 2f2ab9b5bc gl_shader_decompiler: Mark ASTDecompiler/ExprDecompiler parameters as const references where applicable
These member functions don't actually modify the input parameter, so we
can make this explicit with the use of const.
2019-10-15 18:57:02 -04:00
Lioncash b8a62adcf1 gl_shader_decompiler: Pass by reference to GenerateTextureArgument()
Avoids an unnecessary atomic reference count increment and decrement.
2019-10-15 18:29:37 -04:00
Lioncash d1d7ce74d2 gl_shader_decompiler: Use std::holds_alternative within GenerateTexture()
This only ever queries if the type exists within the variant, but
doesn't actually do anything with the return value. We can just use
std::holds_alternative for this use case.
2019-10-15 18:25:48 -04:00
Lioncash 67658dd6e8 shader/node: std::move Meta instance within OperationNode constructor
Allows usages of the constructor to avoid an unnecessary copy.
2019-10-15 18:21:59 -04:00
Lioncash 9760795bfb gl_shader_decompiler: Avoid unnecessary copies of MetaImage
MetaImage contains a std::vector, so copying here could result in
unnecessary reallocations. Given the operation lives throughout the
entire scope, this is safe to do.
2019-10-15 18:14:55 -04:00
Lioncash c9c75f9587 maxwell_3d: Silence truncation warnings
A trivial warning caused by not using size_t as the argument types
instead of u32.
2019-10-15 17:51:35 -04:00
bunnei 2299950de1
Merge pull request #2972 from lioncash/system
{bcat, gpu, nvflinger}: Remove trivial usages of the global system accessor
2019-10-15 17:49:12 -04:00
Lioncash b25b94400e video_core/gpu: Remove use of the global system accessor
We can just make use of the reference member variable instead of
accessing the global system instance.
2019-10-15 16:39:30 -04:00
Lioncash 524eb15513 video_core/texture_cache: Amend Doxygen references
Amends the doxygen comments so that they properly resolve. While we're
at it, we can correct some typos and fix up some of the comments'
formatting in order to make them slightly nicer to read.
2019-10-15 15:40:00 -04:00
Lioncash ac4dbd3b25 common: Rename binary_find.h to algorithm.h
Makes the header more general for other potential algorithms in the
future. While we're at it, include a missing <functional> include to
satisfy the use of std::less.
2019-10-15 15:24:50 -04:00
Fernando Sahmkow cfc2f30dc4 AsyncGpu: Address Feedback 2019-10-11 13:41:15 -04:00
bunnei 2ba273e49e
Merge pull request #2928 from ReinUsesLisp/dirty-depth-bounds
maxwell_3d: Add dirty flags for depth bounds values
2019-10-09 15:44:30 -04:00
bunnei 6b5e50d20e
Merge pull request #2927 from ReinUsesLisp/polygon-offset-units
gl_rasterizer: Fix polygon offset units
2019-10-09 15:38:52 -04:00
Fernando Sahmkow f32a49d3d8 Surfaces: Implement R4G4B4A4U format. 2019-10-09 12:57:02 -04:00
Fernando Sahmkow b9ddb517b1 Surfaces: Implement ASTC 6x6 10x10 12x12 8x6 6x5 2019-10-09 12:44:31 -04:00
ReinUsesLisp 3d0f357307
shader/half_set_predicate: Fix HSETP2 for constant buffers
HSETP2 when used with a constant buffer parses the second operand type
as F32. This is not configurable.
2019-10-07 14:49:47 -03:00
ReinUsesLisp 632c9e4ee3
shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG 2019-10-07 14:48:58 -03:00
ReinUsesLisp 58b597c5ec
gl_shader_disk_cache: Properly ignore existing cache
Previously old entries where appended to the file even if the shader
cache was ignored at boot. Address that issue.
2019-10-06 18:00:20 -03:00
Lioncash f883cd4f0e video_core/control_flow: Eliminate variable shadowing warnings 2019-10-05 09:14:27 -04:00
Lioncash 25702b6256 video_core/control_flow: Eliminate pessimizing moves
These can inhibit the ability of a compiler to perform RVO.
2019-10-05 09:14:27 -04:00
Lioncash d82b181d44 video_core/ast: Unindent most of IsFullyDecompiled() by one level 2019-10-05 09:14:27 -04:00
Lioncash 6c41d1cd7e video_core/ast: Make ShowCurrentState() take a string_view instead of std::string
Allows the function to be non-allocating in terms of the output string.
2019-10-05 09:14:27 -04:00
Lioncash 3c54edae24 video_core/ast: Eliminate variable shadowing warnings 2019-10-05 09:14:26 -04:00
Lioncash 5a0a9c7449 video_core/ast: Replace std::string with a constexpr std::string_view
Same behavior, but without the need to heap allocate
2019-10-05 09:14:26 -04:00
Lioncash 3a20d9734f video_core/ast: Default the move constructor and assignment operator
This is behaviorally equivalent and also fixes a bug where some members
weren't being moved over.
2019-10-05 09:14:26 -04:00
Lioncash 43503a69bf video_core/{ast, expr}: Organize forward declaration
Keeps them alphabetically sorted for readability.
2019-10-05 09:14:26 -04:00
Lioncash 50ad745585 video_core/expr: Supply operator!= along with operator==
Provides logical symmetry to the interface.
2019-10-05 09:14:26 -04:00
Lioncash 8eb1398f8d video_core/{ast, expr}: Use std::move where applicable
Avoids unnecessary atomic reference count increments and decrements.
2019-10-05 09:14:23 -04:00
Lioncash 8e0c80f269 video_core/ast: Supply const accessors for data where applicable
Provides const equivalents of data accessors for use within const
contexts.
2019-10-05 08:22:03 -04:00
David 3728bbc22a
Merge pull request #2888 from FernandoS27/decompiler2
Shader_IR: Implement a full control flow decompiler for the shader IR.
2019-10-05 21:52:20 +10:00
ReinUsesLisp fe7f20e659 maxwell_3d: Add dirty flags for depth bounds values
This is useful in Vulkan where we want to update depth bounds without
caring if it's enabled or disabled through vkCmdSetDepthBounds.
2019-10-05 04:07:47 +00:00
Fernando Sahmkow 538f5880ff GL_Renderer: Remove lefting snippet. 2019-10-04 19:59:55 -04:00
Fernando Sahmkow 9f2719d1a4 Gl_Rasterizer: Protect CPU Memory mapping from multiple threads. 2019-10-04 19:59:53 -04:00
Fernando Sahmkow 3f104464de Core: Wait for GPU to be idle before shutting down. 2019-10-04 19:59:53 -04:00
Fernando Sahmkow ffc2ce89a0 Nvdrv: Do framelimiting only in the CPU Thread 2019-10-04 19:59:50 -04:00
Fernando Sahmkow 5b5e60ffec GPU_Async: Correct fences, display events and more.
This commit uses guest fences on vSync event instead of an articial fake 
fence we had.
It also corrects to keep signaling display events while loading the game 
as the OS is suppose to send buffers to vSync during that time.
2019-10-04 19:59:48 -04:00
Fernando Sahmkow ab47a660c8 Texture_Cache: Blit Deduction corrections and simplifications. 2019-10-04 18:53:47 -04:00
Fernando Sahmkow 2036504a82 TextureCache: Add the ability to deduce if two textures are depth on blit. 2019-10-04 18:53:46 -04:00
Fernando Sahmkow e6eae4b815 Shader_ir: Address feedback 2019-10-04 18:52:57 -04:00
Fernando Sahmkow 3c09d9abe6 Shader_Ir: Address Feedback and clang format. 2019-10-04 18:52:57 -04:00
Fernando Sahmkow 507a9c6a40 vk_shader_decompiler: Correct Branches inside conditionals. 2019-10-04 18:52:56 -04:00
Fernando Sahmkow 000ad558dd vk_shader_decompiler: Clean code and be const correct. 2019-10-04 18:52:55 -04:00
Fernando Sahmkow 7c756baa77 Shader_IR: clean up AST handling and add documentation. 2019-10-04 18:52:55 -04:00
Fernando Sahmkow 5ea740beb5 Shader_IR: Correct OutwardMoves for Ifs 2019-10-04 18:52:54 -04:00
Fernando Sahmkow 100a4bd988 vk_shader_compiler: Don't enclose branches with if(true) to avoid crashing AMD 2019-10-04 18:52:54 -04:00
Fernando Sahmkow 189a50bc2a gl_shader_decompiler: Refactor and address feedback. 2019-10-04 18:52:53 -04:00
Fernando Sahmkow b3c46d6948 Shader_IR: corrections and clang-format 2019-10-04 18:52:53 -04:00
Fernando Sahmkow 466cd52ad4 vk_shader_compiler: Correct SPIR-V AST Decompiling 2019-10-04 18:52:52 -04:00
Fernando Sahmkow 2e9a810423 Shader_IR: allow else derivation to be optional. 2019-10-04 18:52:52 -04:00
Fernando Sahmkow ca9901867e vk_shader_compiler: Implement the decompiler in SPIR-V 2019-10-04 18:52:51 -04:00
Fernando Sahmkow 0366c18d87 Shader_IR: mark labels as unused for partial decompile. 2019-10-04 18:52:51 -04:00
Fernando Sahmkow 47e4f6a52c Shader_Ir: Refactor Decompilation process and allow multiple decompilation modes. 2019-10-04 18:52:50 -04:00
Fernando Sahmkow 38fc995f6c gl_shader_decompiler: Implement AST decompiling 2019-10-04 18:52:50 -04:00
Fernando Sahmkow 6fdd501113 shader_ir: Declare Manager and pass it to appropiate programs. 2019-10-04 18:52:49 -04:00
Fernando Sahmkow 8be6e1c522 shader_ir: Corrections to outward movements and misc stuffs 2019-10-04 18:52:48 -04:00
Fernando Sahmkow 4fde66e609 shader_ir: Add basic goto elimination 2019-10-04 18:52:48 -04:00
Fernando Sahmkow c17953978b shader_ir: Initial Decompile Setup 2019-10-04 18:52:47 -04:00
ReinUsesLisp 69c806feb6
gl_rasterizer: Fix polygon offset units
For some reason hardware divides polygon offset units by two. This is
visible since drivers multiply the application requested polygon offset
by two.
2019-10-01 02:00:23 -03:00
ReinUsesLisp f926230ab1
gl_shader_decompiler: Add tailing return for HUnpack2 2019-09-24 01:03:59 -03:00
ReinUsesLisp 25bfaffdff
gl_shader_decompiler: Fix clang build issues 2019-09-24 01:03:27 -03:00
bunnei 376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
David 9d69206cd0
Merge pull request #2870 from FernandoS27/multi-draw
Implement a MME Draw commands Inliner and correct host instance drawing
2019-09-22 23:13:02 +10:00
Fernando Sahmkow 822ca65d69
Merge pull request #2891 from FearlessTobi/rod-tex
video_core: Implement RGBX16F and lower Surface Copy log severity
2019-09-22 09:11:28 -04:00
David 3bfba23362
Merge pull request #2867 from ReinUsesLisp/configure-framebuffers-clean
gl_rasterizer: Remove unused code paths from ConfigureFramebuffers
2019-09-22 23:10:07 +10:00
Fernando Sahmkow 68f5aff64f Maxwell3D: Corrections and refactors to MME instance refactor 2019-09-22 07:23:13 -04:00
FearlessTobi 01fc969a5f Fix clang-format 2019-09-22 02:21:56 +02:00
FearlessTobi 366e900376 fermi_2d: Lower surface copy log severity to DEBUG 2019-09-22 02:18:57 +02:00
FearlessTobi 55d272efe6 video_core: Implement RGBX16F PixelFormat 2019-09-22 02:16:44 +02:00
Rodrigo Locatti 9286976948
Merge pull request #2878 from FernandoS27/icmp
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
ReinUsesLisp 44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp 675f23aedc
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
ReinUsesLisp 4de0f1e1c8
shader_bytecode: Add SULD encoding 2019-09-21 17:31:46 -03:00
Fernando Sahmkow 527b841c15 Shader_IR: ICMP corrections and fixes 2019-09-21 14:28:03 -04:00
David 9ad42fb0cf
Merge pull request #2868 from ReinUsesLisp/fix-mipmaps
maxwell_to_gl: Fix mipmap filtering
2019-09-21 19:57:09 +10:00
David Marcec 01a4afee42 Mark DrawArrays as LOG_TRACE
There's no reason to clog logs with DrawArray.
2019-09-21 15:43:58 +10:00
bunnei bbe82d62b0
Merge pull request #2846 from ReinUsesLisp/fixup-viewport-index
gl_shader_decompiler: Avoid writing output attribute when unimplemented
2019-09-20 17:11:20 -04:00
bunnei 88d857499b
Merge pull request #2855 from ReinUsesLisp/shfl
shader_ir/warp: Implement SHFL for Nvidia devices
2019-09-20 17:10:42 -04:00
Fernando Sahmkow 433e764bb0 Rasterizer: Correct introduced bug where a conditional render wouldn't stop a draw call from executing 2019-09-20 15:44:28 -04:00
Fernando Sahmkow 4b81d19a1a Shader_IR: Implement ICMP. 2019-09-19 20:56:29 -04:00
Fernando Sahmkow 7761e44d18 Rasterizer: Refactor and simplify DrawBatch Interface. 2019-09-19 11:41:33 -04:00
Fernando Sahmkow d2ea592ddb Rasterizer: Address Feedback and conscerns. 2019-09-19 11:41:32 -04:00
Fernando Sahmkow c17655ce74 Rasterizer: Refactor draw calls, remove deadcode and clean up. 2019-09-19 11:41:31 -04:00
Fernando Sahmkow 7606da5611 VideoCore: Corrections to the MME Inliner and removal of hacky instance management. 2019-09-19 11:41:29 -04:00
Fernando Sahmkow ba02d564f8 Video Core: initial Implementation of InstanceDraw Packaging 2019-09-19 11:41:27 -04:00
bunnei b31880dc5e
Merge pull request #2784 from ReinUsesLisp/smem
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00
ReinUsesLisp 0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp 2dd6411753 maxwell_to_gl: Fix mipmap filtering
OpenGL texture filters follow GL_<texture_filter>_MIPMAP_<mipmap_filter>
but we were using them in the opposite way.
2019-09-17 03:32:24 -03:00
ReinUsesLisp af809b491e gl_rasterizer: Remove unused code paths from ConfigureFramebuffers 2019-09-17 02:50:42 -03:00
Fernando Sahmkow 393cc3ef2f
Merge pull request #2851 from ReinUsesLisp/srgb
renderer_opengl: Fix sRGB blits
2019-09-15 10:38:10 -04:00
Fernando Sahmkow b8b1747704
Merge pull request #2824 from ReinUsesLisp/mme
Revert "Revert #2466" and stub FirmwareCall 4
2019-09-15 06:17:04 -04:00
Rodrigo Locatti 193bfefce4
maxwell_3d: Update firmware 4 call stub commentary 2019-09-14 22:51:18 -03:00
Fernando Sahmkow daae327e86
Merge pull request #2857 from ReinUsesLisp/surface-srgb
video_core/surface: Add function to detect sRGB surfaces
2019-09-14 03:53:21 -04:00
Fernando Sahmkow 18fac59050
Merge pull request #2858 from ReinUsesLisp/vk-device
vk_device: Add miscellaneous features and minor style changes
2019-09-14 03:52:06 -04:00
ReinUsesLisp 01d96e1136 vk_device: Add miscellaneous features and minor style changes
* Increase minimum Vulkan requirements
* Require VK_EXT_vertex_attribute_divisor
* Require depthClamp, samplerAnisotropy and largePoints features
* Search and expose VK_KHR_uniform_buffer_standard_layout
* Search and expose VK_EXT_index_type_uint8
* Search and expose native float16 arithmetics
* Track current driver with VK_KHR_driver_properties
* Query and expose SSBO alignment
* Query more image formats
* Improve logging overall
* Minor style changes
* Minor rephrasing of commentaries
2019-09-13 02:10:07 -03:00
ReinUsesLisp 99e23bd0fd video_core/surface: Add function to detect sRGB surfaces
This is required for proper conversion to RGBA8_UNORM or RGBA8_SRGB
surfaces when a backend can target both native and converted ASTC.
2019-09-13 00:27:04 -03:00
ReinUsesLisp 6b997c8f7f renderer_opengl: Fix rebase mistake 2019-09-11 00:09:37 -03:00
ReinUsesLisp 36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
Fernando Sahmkow e60d281a01 gl_rasterizer: Correct sRGB Fix regression 2019-09-10 19:31:42 -03:00
ReinUsesLisp 78574746bd renderer_opengl: Fix sRGB blits
Removes the sRGB hack of tracking if a frame used an sRGB rendertarget
to apply at least once to blit the final texture as sRGB. Instead of
doing this apply sRGB if the presented image has sRGB.

Also enable sRGB by default on Maxwell3D registers as some games seem to
assume this.
2019-09-10 19:31:42 -03:00
bunnei 34b2c60f95
Merge pull request #2823 from ReinUsesLisp/shr-clamp
shader/shift: Implement SHR wrapped and clamped variants
2019-09-10 11:56:17 -04:00
bunnei c7ec7bc1f5
Merge pull request #2810 from ReinUsesLisp/mme-opt
maxwell_3d: Avoid moving macro_params
2019-09-10 11:55:45 -04:00
ReinUsesLisp 17a9b0178d gl_shader_decompiler: Avoid writing output attribute when unimplemented 2019-09-06 15:02:12 -03:00
ReinUsesLisp 1f43e5296f gl_shader_decompiler: Keep track of written images and mark them as modified 2019-09-05 23:26:05 -03:00
ReinUsesLisp 7228e22098 texture_cache: Minor changes 2019-09-05 23:25:15 -03:00
ReinUsesLisp 322d0200c8 gl_rasterizer: Apply textures and images state 2019-09-05 20:35:51 -03:00
ReinUsesLisp 80ec2feee8 gl_rasterizer: Add samplers to compute dispatches 2019-09-05 20:35:51 -03:00
ReinUsesLisp 954fc02fdd gl_rasterizer: Minor code changes 2019-09-05 20:35:51 -03:00
ReinUsesLisp 04cdecb7a1 gl_state: Split textures and samplers into two arrays 2019-09-05 20:35:51 -03:00
ReinUsesLisp 6170337001 gl_rasterizer: Implement image bindings 2019-09-05 20:35:51 -03:00
ReinUsesLisp 5edf24b510 gl_state: Add support for glBindImageTextures 2019-09-05 20:35:51 -03:00
ReinUsesLisp 2424eefad2 texture_cache: Pass TIC to texture cache 2019-09-05 20:35:51 -03:00
ReinUsesLisp 3a450c1395 kepler_compute: Implement texture queries 2019-09-05 20:35:51 -03:00
ReinUsesLisp 2e5b5c2358 gl_rasterizer: Split SetupTextures 2019-09-05 20:35:51 -03:00
Fernando Sahmkow 4ee9949639
Merge pull request #2804 from ReinUsesLisp/remove-gs-special
gl_shader_cache: Remove special casing for geometry shaders
2019-09-05 16:03:46 -04:00
bunnei 03badbdd9b
Merge pull request #2833 from ReinUsesLisp/fix-stencil
gl_rasterizer: Fix stencil testing
2019-09-05 15:27:31 -04:00
ReinUsesLisp 0f7b813d65 gl_shader_decompiler: Implement shared memory 2019-09-05 01:40:24 -03:00
ReinUsesLisp 4de04eba39 shader_ir: Implement LD_S
Loads from shared memory.
2019-09-05 01:38:37 -03:00
ReinUsesLisp f17415d431 shader_ir: Implement ST_S
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
2019-09-05 01:38:37 -03:00
David d34fa7c4fa
Merge pull request #2802 from ReinUsesLisp/hsetp2-pred
half_set_predicate: Fix HSETP2 predicate assignments
2019-09-05 12:26:39 +10:00
ReinUsesLisp 6177cbdbe1 gl_shader_decompiler: Fixup slow path 2019-09-04 15:03:51 -03:00
ReinUsesLisp 7bbc98cfc3 gl_rasterizer: Fix stencil testing
* Fix stencil dirty flags tracking when stencil is disabled
* Attach stencil on clears (previously it only attached depth)
* Attach stencil on drawing regardless of stencil testing being enabled
2019-09-04 01:59:09 -03:00
ReinUsesLisp 5f309b88db Revert "Revert #2466" and stub FirmwareCall 4 2019-09-04 01:55:45 -03:00
ReinUsesLisp 77ef4fa907 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
ReinUsesLisp 701dedcfad maxwell_3d: Avoid moving macro_params 2019-09-04 01:55:01 -03:00
ReinUsesLisp 42e1bb6d46 gl_shader_cache: Remove special casing for geometry shaders
Now that ProgramVariants holds the primitive topology we no longer need
to keep track of individual geometry shaders topologies.
2019-09-04 01:54:43 -03:00
ReinUsesLisp dfae2d141a half_set_predicate: Fix predicate assignments 2019-09-04 01:54:23 -03:00
ReinUsesLisp 9cf52d027d gl_device: Disable precise in fragment shaders on bugged drivers 2019-09-04 01:54:00 -03:00
ReinUsesLisp 03276e7490 gl_shader_decompiler: Fixup AMD's slow path type 2019-09-04 01:54:00 -03:00
ReinUsesLisp 6c449793b8 gl_shader_decompiler: Rework GLSL decompiler type system
GLSL decompiler type system was broken. We converted all return values
to float except for some cases where returning we couldn't and
implicitly broke the rule of returning floats (e.g. for bools or bool
pairs).

Instead of doing this introduce class Expression that knows what type a
return value has and when a consumer wants to use the string it asks for
it with a required type, emitting a runtime error if types are
incompatible.

This has the disadvantage that there's more C++ code, but we can emit
better GLSL code that's easier to read.
2019-09-04 01:54:00 -03:00
bunnei 19af91434e
Merge pull request #2793 from ReinUsesLisp/bgr565
renderer_opengl: Implement RGB565 framebuffer format
2019-09-03 22:36:32 -04:00
bunnei 81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
bunnei 137d165672
Merge pull request #2826 from ReinUsesLisp/macro-binding
maxwell_3d: Fix macro binding cursor
2019-09-03 22:32:42 -04:00
bunnei 50b5bb44a0
Merge pull request #2765 from FernandoS27/dma-fix
MaxwellDMA: Fixes, corrections and relaxations.
2019-09-01 13:13:05 -04:00
ReinUsesLisp 52a41f482f maxwell_3d: Fix macro binding cursor 2019-09-01 05:01:11 -03:00
Rodrigo Locatti 4d4f9cc104 video_core: Silent miscellaneous warnings (#2820)
* texture_cache/surface_params: Remove unused local variable

* rasterizer_interface: Add missing documentation commentary

* maxwell_dma: Remove unused rasterizer reference

* video_core/gpu: Sort member declaration order to silent -Wreorder warning

* fermi_2d: Remove unused MemoryManager reference

* video_core: Silent unused variable warnings

* buffer_cache: Silent -Wreorder warnings

* kepler_memory: Remove unused MemoryManager reference

* gl_texture_cache: Add missing override

* buffer_cache: Add missing include

* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
ReinUsesLisp 878adee0a3 gl_buffer_cache: Add missing include
RasterizerInterface was considered an incomplete object by clang.
2019-08-29 22:02:52 +00:00
bunnei a67c4e6e02
Merge pull request #2742 from ReinUsesLisp/fix-texture-buffers
gl_texture_cache: Miscellaneous texture buffer fixes
2019-08-29 15:59:17 -04:00
bunnei e424615839
Merge pull request #2783 from FernandoS27/new-buffer-cache
Implement a New LLE Buffer Cache
2019-08-29 13:07:01 -04:00
bunnei f8cc5668f8
Merge pull request #2758 from ReinUsesLisp/packed-tid
shader/decode: Implement S2R Tic
2019-08-29 12:58:43 -04:00
ReinUsesLisp e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
ReinUsesLisp b13fbc25b8 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
ReinUsesLisp 6207751b00 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
ReinUsesLisp 4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
Fernando Sahmkow 83ec2091c1 Buffer Cache: Adress Feedback. 2019-08-21 12:14:27 -04:00
Fernando Sahmkow 6ce2c85047 Buffer_Cache: Implement flushing. 2019-08-21 12:14:26 -04:00
Fernando Sahmkow de8ff8a1c6 Buffer_Cache: Implement barriers. 2019-08-21 12:14:25 -04:00
Fernando Sahmkow 286f4c446a Buffer_Cache: Optimize and track written areas. 2019-08-21 12:14:25 -04:00
Fernando Sahmkow 5f4b746a1e BufferCache: Rework mapping caching. 2019-08-21 12:14:24 -04:00
Fernando Sahmkow 86d8563314 Buffer_Cache: Fixes and optimizations. 2019-08-21 12:14:23 -04:00
Fernando Sahmkow 862bec001b Video_Core: Implement a new Buffer Cache 2019-08-21 12:14:22 -04:00
bunnei d654b3d82e
Merge pull request #2769 from FernandoS27/commands-flush
GPU: Flush commands on every dma pusher step.
2019-08-21 10:29:56 -04:00
bunnei dfdd20142e
Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-21 10:29:17 -04:00
bunnei cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp 80702aa88f renderer_opengl: Implement RGB565 framebuffer format 2019-08-21 02:28:31 -03:00
ReinUsesLisp 9cdf5c6c31 renderer_opengl: Use block linear swizzling for CPU framebuffers 2019-08-21 02:17:14 -03:00
ReinUsesLisp 8ad7268c75 renderer_opengl: Use VideoCore pixel format 2019-08-21 02:16:40 -03:00
ReinUsesLisp 9a76e94b3d gpu: Change optional<reference_wrapper<T>> to T* for FramebufferConfig 2019-08-21 01:55:25 -03:00
bunnei ca61e298b3
Merge pull request #2778 from ReinUsesLisp/nop
shader_ir: Implement NOP
2019-08-18 08:51:34 -04:00
bunnei 87bbefe55f
Merge pull request #2768 from ReinUsesLisp/hsetp2-fix
decode/half_set_predicate: Fix predicates
2019-08-18 08:50:54 -04:00
ReinUsesLisp 2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
ReinUsesLisp ec0da3ef64 half_set_predicate: Fix HSETP2_C constant buffer offset 2019-08-04 02:50:55 -03:00
Fernando Sahmkow e52c895559 GPU: Flush commands on every dma pusher step.
This commit ensures that the host gpu is constantly fed with commands to
work with, while the guest gpu keeps producing the rest of the commands.
This reduces syncing time between host and guest gpu.
2019-07-26 16:54:22 -04:00
bunnei 52f54c728d
Merge pull request #2592 from FernandoS27/sync1
Implement GPU Synchronization Mechanisms & Correct NVFlinger
2019-07-26 14:26:44 -04:00
ReinUsesLisp 77f1a676a1 decode/half_set_predicate: Fix predicates 2019-07-26 00:12:38 -03:00
Fernando Sahmkow a452ff983d MaxwellDMA: Fixes, corrections and relaxations.
This commit fixes offsets on Linear -> Tiled copies, corrects z pos
fortiled->linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
bunnei b0ff3179ef
Merge pull request #2739 from lioncash/cflow
video_core/control_flow: Minor changes/warning cleanup
2019-07-25 13:04:56 -04:00
bunnei 4d26550f5f
Merge pull request #2737 from FernandoS27/track-fix
Shader_Ir: Correct tracking to track from right to left
2019-07-25 12:41:52 -04:00
bunnei 31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
bunnei 9be9600bdc
Merge pull request #2704 from FernandoS27/conditional
maxwell3d: Implement Conditional Rendering
2019-07-24 17:07:57 -04:00
ReinUsesLisp 104641db07 shader/decode: Implement S2R Tic 2019-07-22 16:16:10 -03:00
bunnei f601f25bcc
Merge pull request #2734 from ReinUsesLisp/compute-shaders
gl_rasterizer: Implement compute shaders
2019-07-22 11:12:55 -04:00
bunnei 27e10e0442
Merge pull request #2735 from FernandoS27/pipeline-rework
Rework Dirty Flags in GPU Pipeline, Optimize CBData and Redo Clearing mechanism
2019-07-21 00:59:52 -04:00
Fernando Sahmkow 11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow 7a35178ee2 Maxwell3D: Reorganize and address feedback 2019-07-20 10:18:35 -04:00
Fernando Sahmkow 1158777737 Shader_Ir: Change Debug Asserts for Log Warnings 2019-07-19 22:15:34 -04:00
ReinUsesLisp 45c162444d shader/half_set_predicate: Fix HSETP2 implementation 2019-07-19 22:21:22 -03:00
ReinUsesLisp 6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
Lioncash c1c89411da video_core/control_flow: Provide operator!= for types with operator==
Provides operational symmetry for the respective structures.
2019-07-18 21:03:31 -04:00
Lioncash 1780e0e3d0 video_core/control_flow: Prevent sign conversion in TryGetBlock()
The return value is a u32, not an s32, so this would result in an
implicit signedness conversion.
2019-07-18 21:03:31 -04:00
Lioncash a162a844d2 video_core/control_flow: Remove unnecessary BlockStack copy constructor
This is the default behavior of the copy constructor, so it doesn't need
to be specified.

While we're at it we can make the other non-default constructor
explicit.
2019-07-18 21:03:30 -04:00
Lioncash 56bc11d952 video_core/control_flow: Use std::move where applicable
Results in less work being done where avoidable.
2019-07-18 21:03:30 -04:00
Lioncash e7b39f47f8 video_core/control_flow: Use the prefix variant of operator++ for iterators
Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath.
2019-07-18 21:03:30 -04:00
Lioncash 6885e7e7ec video_core/control_flow: Use empty() member function for checking emptiness
It's what it's there for.
2019-07-18 21:03:30 -04:00
Lioncash 45fa12a05c video_core: Resolve -Wreorder warnings
Ensures that the constructor members are always initialized in the order
that they're declared in.
2019-07-18 21:03:30 -04:00
Lioncash 47df844338 video_core/control_flow: Make program_size for ScanFlow() a std::size_t
Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface.
2019-07-18 21:03:29 -04:00
Lioncash 3df9558593 video_core/control_flow: Place all internally linked types/functions within an anonymous namespace
Previously, quite a few functions were being linked with external
linkage.
2019-07-18 21:03:29 -04:00
Lioncash 1109db86b7 video_core/shader/decode: Prevent sign-conversion warnings
Makes it explicit that the conversions here are intentional.
2019-07-18 21:03:29 -04:00
bunnei 63bda67a34
Merge pull request #2738 from lioncash/shader-ir
shader-ir: Minor cleanup-related changes
2019-07-18 13:52:01 -04:00
Fernando Sahmkow 5a06e33859 Shader_Ir: correct clang format 2019-07-18 10:09:26 -04:00
Fernando Sahmkow 43f57d668c GPU: Add missing puller methods.
This adds some missing puller methods. We don't assert them as these are 
nop operations for us.
2019-07-18 08:54:42 -04:00
Fernando Sahmkow 3a3fee5abf MaxwellDMA/KeplerCopy: Downgrade DMA log message to Trace.
This log was just to know which games used DMA. It's no longer 
important.
2019-07-18 08:31:38 -04:00
Fernando Sahmkow d3b71ff80d Gl_Texture_Cache: Remove assert on component type in GetFormatTuple
Textures can have different components types in different orders. This 
assert was completely inprecise and the effectiveness of such is better 
handled by case and within the texture cache.
2019-07-18 08:20:31 -04:00
Fernando Sahmkow 0b65e9335e Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
This commit reduces the sevirity of asserts for FP precision and 
rounding as this are well known and have little to no consequences in 
gpu's accuracy.
2019-07-18 08:17:19 -04:00
ReinUsesLisp 74632c76ce gl_shader_decompiler: Rename bufferImage to imageBuffer
The online OpenGL documentation is wrong. The type definition is
imageBuffer.
2019-07-18 01:16:44 -03:00
ReinUsesLisp 87909d327f gl_shader_cache: Fix newline on buffer preprocessor definitions 2019-07-18 01:16:15 -03:00
ReinUsesLisp e7bdf8b22a textures: Fix texture buffer size calculation 2019-07-18 01:07:08 -03:00
ReinUsesLisp 84027f4808 gl_texture_cache: Do not set texture parameters to buffers 2019-07-18 01:06:26 -03:00
ReinUsesLisp 73b2dc6d4f gl_texture_cache: Add missing break in CreateTexture 2019-07-18 01:04:18 -03:00
Fernando Sahmkow 4be61013a1 GL_State: Feedback and fixes 2019-07-17 17:29:56 -04:00
Fernando Sahmkow 5ad889f6fd Maxwell3D: Address Feedback 2019-07-17 17:29:55 -04:00
Fernando Sahmkow 7826f0afd9 Texture_Cache: Rebase Fixes 2019-07-17 17:29:54 -04:00
Fernando Sahmkow 8cdbfe69b1 GL_Rasterizer: Corrections to Clearing. 2019-07-17 17:29:54 -04:00
Fernando Sahmkow 0ff4a5fa39 Maxwell3D: Correct marking dirtiness on CB upload 2019-07-17 17:29:53 -04:00
Fernando Sahmkow fec32fed18 GL_Rasterizer: Rework RenderTarget/DepthBuffer clearing 2019-07-17 17:29:52 -04:00
Fernando Sahmkow a081dea8ab Maxwell3D: Implement State Dirty Flags. 2019-07-17 17:29:51 -04:00
Fernando Sahmkow 0d3db58657 Maxwell3D: Rework CBData Upload 2019-07-17 17:29:50 -04:00
Fernando Sahmkow f2e7b29c14 Maxwell3D: Rework the dirty system to be more consistant and scaleable 2019-07-17 17:29:49 -04:00
Fernando Sahmkow e42bcf2314 maxwell3d: Implement Conditional Rendering
Conditional Rendering takes care of conditionaly clearing or drawing
depending on a set of queries. This PR implements the query checks to
stablish if things can be rendered or not.
2019-07-17 17:13:19 -04:00
Fernando Sahmkow 223a535f3f
Merge pull request #2740 from lioncash/bra
shader/decode/other: Correct branch indirect argument within BRA handling
2019-07-17 14:25:08 -04:00
Lioncash bebbdc2067 shader_ir: std::move Node instance where applicable
These are std::shared_ptr instances underneath the hood, which means
copying them isn't as cheap as a regular pointer. Particularly so on
weakly-ordered systems.

This avoids atomic reference count increments and decrements where they
aren't necessary for the core set of operations.
2019-07-16 19:49:23 -04:00
Lioncash 60926ac16b shader_ir: Rename Get/SetTemporal to Get/SetTemporary
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
2019-07-16 19:47:43 -04:00
Lioncash 44d87ff641 shader_ir: Remove unused includes
Removes unnecessary header dependencies.
2019-07-16 19:47:42 -04:00
Fernando Sahmkow d614193e49 Shader_Ir: Correct tracking to track from right to left 2019-07-16 15:06:59 -04:00
Fernando Sahmkow b56e7f870a
Merge pull request #2565 from ReinUsesLisp/track-indirect
shader/track: Track indirect buffers
2019-07-16 14:58:35 -04:00
Lioncash e2d7dda166 shader/decode/other: Correct branch indirect argument within BRA handling
This appears to have been a copy/paste error introduced within
8a6fc529a9
2019-07-16 12:20:45 -04:00
ReinUsesLisp 2a4044a858 gl_shader_cache: Fix clang-format issues 2019-07-15 20:33:51 -03:00
ReinUsesLisp 6b0d017675 gl_shader_decompiler: Stub local memory size 2019-07-15 17:38:25 -03:00
ReinUsesLisp 56bca83bde gl_shader_cache: Address review commentaries 2019-07-15 17:38:25 -03:00
ReinUsesLisp bbecd13697 gl_shader_cache: Address CI issues 2019-07-15 17:38:25 -03:00
ReinUsesLisp 725ba6cf63 gl_rasterizer: Implement compute shaders 2019-07-15 17:38:25 -03:00
Fernando Sahmkow 1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
bunnei b77a1ed67a
Merge pull request #2705 from FernandoS27/tex-cache-fixes
GPU: Fixes to Texture Cache and Include Microprofiles for GL State/BufferCopy/Macro Interpreter
2019-07-14 22:44:36 -04:00
ReinUsesLisp afa8096df5 shader: Allow tracking of indirect buffers without variable offset
While changing this code, simplify tracking code to allow returning
the base address node, this way callers don't have to manually rebuild
it on each invocation.
2019-07-14 22:36:44 -03:00
bunnei 3477b92289
Merge pull request #2675 from ReinUsesLisp/opengl-buffer-cache
buffer_cache: Implement a generic buffer cache and its OpenGL backend
2019-07-14 19:03:43 -04:00
Fernando Sahmkow 2ac7472d3f Texture_Cache: Address Feedback 2019-07-14 17:42:39 -04:00
Fernando Sahmkow 0f54b541f4 Texture_Cache: Remove some unprecise fallback case and clang format 2019-07-14 12:00:32 -04:00
Fernando Sahmkow 5818959e54 Texture_Cache: Force Framebuffer reset if an active render target is unregistered. 2019-07-14 12:00:31 -04:00
Fernando Sahmkow 913b7a6872 GPU: Add a microprofile for macro interpreter 2019-07-14 12:00:30 -04:00
Fernando Sahmkow a9943222f2 GL_State: Add a microprofile timer to OpenGL state. 2019-07-14 12:00:30 -04:00
Fernando Sahmkow 5c1e1a148e Gl_Texture_Cache: Measure Buffer Copy Times 2019-07-14 12:00:29 -04:00
Fernando Sahmkow 5d31bab69a Texture_Cache: Correct Linear Structural Match. 2019-07-14 12:00:28 -04:00
Fernando Sahmkow 4882c058fd
Merge pull request #2690 from SciresM/physmem_fixes
Implement MapPhysicalMemory/UnmapPhysicalMemory
2019-07-14 09:16:46 -04:00
Fernando Sahmkow 0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
bunnei bb67091c77
Merge pull request #2609 from FernandoS27/new-scan
Implement a New Shader Scanner, Decompile Flow Stack and implement BRX BRA.CC
2019-07-11 17:36:23 -04:00
ReinUsesLisp 0eb0c24269 gl_shader_decompiler: Fix gl_PointSize redeclaration 2019-07-11 16:10:59 -03:00
ReinUsesLisp aca40de224 gl_shader_decompiler: Fix conditional usage of GL_ARB_shader_viewport_layer_array 2019-07-11 04:27:00 -03:00
bunnei fd066ffbce
Merge pull request #2697 from lioncash/doc
gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
2019-07-10 16:38:09 -04:00
bunnei 7fb7054bc8
Merge pull request #2686 from ReinUsesLisp/vk-scheduler
vk_scheduler: Drop execution context in favor of views
2019-07-10 16:35:48 -04:00
bunnei 206ec29f17
Merge pull request #2691 from lioncash/override
video_core: Add missing override specifiers
2019-07-10 16:25:43 -04:00
Fernando Sahmkow f2549739d1 shader_ir: Add comments on missing instruction.
Also shows Nvidia's address space on comments.
2019-07-09 17:15:45 -04:00
Michael Scire a1845d1dd3 prefer system reference over global accessor 2019-07-09 08:11:35 -07:00
Fernando Sahmkow 2de7649311 shader_ir: limit explorastion to best known program size. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow e7c6045a03 control_flow: Correct block breaking algorithm. 2019-07-09 08:14:43 -04:00
Fernando Sahmkow dc4a93594c control_flow: Assert shaders bigger than limit. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow e7a88f0ab3 control_flow: Address feedback. 2019-07-09 08:14:42 -04:00
Fernando Sahmkow 34357b110c shader_ir: Correct parsing of scheduling instructions and correct sizing 2019-07-09 08:14:41 -04:00
Fernando Sahmkow cfb3db1a32 shader_ir: Correct max sizing 2019-07-09 08:14:40 -04:00
Fernando Sahmkow d45fed3030 shader_ir: Remove unnecessary constructors and use optional for ScanFlow result 2019-07-09 08:14:40 -04:00
Fernando Sahmkow 01b21ee1e8 shader_ir: Corrections, documenting and asserting control_flow 2019-07-09 08:14:39 -04:00
Fernando Sahmkow d5533b440c shader_ir: Unify blocks in decompiled shaders. 2019-07-09 08:14:39 -04:00
Fernando Sahmkow 926b80102f shader_ir: Decompile Flow Stack 2019-07-09 08:14:38 -04:00
Fernando Sahmkow 459fce3a8f shader_ir: propagate shader size to the IR 2019-07-09 08:14:37 -04:00
Fernando Sahmkow 8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
Fernando Sahmkow c218ae4b02 shader_ir: Remove the old scanner. 2019-07-09 08:14:36 -04:00
Fernando Sahmkow 8af6e6a052 shader_ir: Implement a new shader scanner 2019-07-09 08:14:36 -04:00
Lioncash c04785c928 gl_rasterizer: Amend documentation comment for ConfigureFramebuffers()
must_reconfigure isn't a parameter for this function any more, so it can
be replaced with current_state.

While we're at it, we can make the parameters of the declaration match
the same name as the ones in the definition.
2019-07-09 02:08:15 -04:00
Michael Scire 697206092e Prevent merging of device mapped memory blocks.
This sets the DeviceMapped attribute for GPU-mapped memory blocks,
and prevents merging device mapped blocks. This prevents memory
mapped from the gpu from having its backing address changed by
block coalesce.
2019-07-08 22:52:05 -07:00
ReinUsesLisp c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
Tobias be020f7621
Delete decode_integer_set.cpp 2019-07-07 21:40:33 +02:00
ReinUsesLisp d0966b9f7c shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
Lioncash cbdd6cd1c0 vk_sampler_cache: Remove unused includes
These are no longer used within this header, so they can be removed.
2019-07-07 13:40:36 -04:00
Lioncash 4b27680639 video_core: Add missing override specifiers 2019-07-07 13:38:39 -04:00
ReinUsesLisp 86a874a2fc vk_scheduler: Drop execution context in favor of views
Instead of passing by copy an execution context through out the whole
Vulkan call hierarchy, use a command buffer view and fence view
approach.

This internally dereferences the command buffer or fence forcing the
user to be unable to use an outdated version of it on normal usage.
It is still possible to keep store an outdated if it is casted to
VKFence& or vk::CommandBuffer.

While changing this file, add an extra parameter for Flush and Finish to
allow releasing the fence from this calls.
2019-07-07 03:30:22 -03:00
ReinUsesLisp 79a23ca5f0 buffer_cache: Avoid [[nodiscard]] to make clang-format happy 2019-07-06 01:17:05 -03:00
ReinUsesLisp 83050c9495 buffer_cache: Try to fix MinGW build 2019-07-06 01:14:05 -03:00
ReinUsesLisp f7691ebe57 gl_rasterizer: Fix nullptr dereference on disabled buffers 2019-07-06 00:37:56 -03:00
ReinUsesLisp 7ecf64257a gl_rasterizer: Minor style changes 2019-07-06 00:37:55 -03:00
ReinUsesLisp 9cdc576f60 gl_rasterizer: Fix vertex and index data invalidations 2019-07-06 00:37:55 -03:00
ReinUsesLisp 1fa21fa192 gl_buffer_cache: Implement with generic buffer cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp 32c0212b24 buffer_cache: Implement a generic buffer cache
Implements a templated class with a similar approach to our current
generic texture cache. It is designed to be compatible with Vulkan and
OpenGL,
2019-07-06 00:37:55 -03:00
ReinUsesLisp 2bcae41a73 gl_buffer_cache: Remove global system getters 2019-07-06 00:37:55 -03:00
ReinUsesLisp 02ab844934 gl_device: Query SSBO alignment 2019-07-06 00:37:55 -03:00
ReinUsesLisp d14fbfb9b5 gl_buffer_cache: Implement flushing 2019-07-06 00:37:55 -03:00
ReinUsesLisp 345f852bdb gl_rasterizer: Drop gl_global_cache in favor of gl_buffer_cache 2019-07-06 00:37:55 -03:00
ReinUsesLisp 8155b12d3d gl_buffer_cache: Rework to support internalized buffers 2019-07-06 00:37:55 -03:00
ReinUsesLisp f8ba72d491 gl_buffer_cache: Store in CachedBufferEntry the used buffer handle 2019-07-06 00:37:55 -03:00
ReinUsesLisp b54fb8fc4c gl_buffer_cache: Return used buffer from Upload function 2019-07-06 00:37:55 -03:00
ReinUsesLisp a6d2f52fc3 gl_rasterizer: Add some commentaries 2019-07-06 00:37:55 -03:00
ReinUsesLisp 2b9d4088ec gl_rasterizer: Make DrawParameters rasterizer instance const 2019-07-06 00:37:55 -03:00
ReinUsesLisp 2e39c20da5 gl_rasterizer: Move index buffer uploading to its own method 2019-07-06 00:37:55 -03:00
Fernando Sahmkow d20ede40b1 NVServices: Styling, define constructors as explicit and corrections 2019-07-05 15:49:32 -04:00
Fernando Sahmkow b391e5f638 NVFlinger: Correct GCC compile error 2019-07-05 15:49:31 -04:00
Fernando Sahmkow 0335a25d1f NVServices: Make NVEvents Automatic according to documentation. 2019-07-05 15:49:29 -04:00
Fernando Sahmkow 7d1b974bca GPU: Correct Interrupts to interrupt on syncpt/value instead of event, mirroring hardware 2019-07-05 15:49:26 -04:00
Fernando Sahmkow f2e026a1d8 gpu_asynch: Simplify synchronization to a simpler consumer->producer scheme. 2019-07-05 15:49:20 -04:00
Fernando Sahmkow 0706d633bf nv_host_ctrl: Make Sync GPU variant always return synced result. 2019-07-05 15:49:20 -04:00
Fernando Sahmkow 600dddf88d Async GPU: do invalidate as synced operation
Async GPU: Always invalidate synced.
2019-07-05 15:49:19 -04:00
Fernando Sahmkow c13433aee4 Gpu: use an std mutex instead of a spin_lock to guard syncpoints 2019-07-05 15:49:18 -04:00
Fernando Sahmkow eef55f493b Gpu: Mark areas as protected. 2019-07-05 15:49:16 -04:00
Fernando Sahmkow a45643cb3b nv_services: Stub CtrlEventSignal 2019-07-05 15:49:15 -04:00
Fernando Sahmkow 8942047d41 Gpu: Implement Hardware Interrupt Manager and manage GPU interrupts 2019-07-05 15:49:14 -04:00
Fernando Sahmkow 82b829625b video_core: Implement GPU side Syncpoints 2019-07-05 15:49:11 -04:00
Zach Hilman 772c86a260
Merge pull request #2601 from FernandoS27/texture_cache
Implement a new Texture Cache
2019-07-05 13:39:13 -04:00
Fernando Sahmkow 3b9d89839d texture_cache: Address Feedback 2019-07-05 09:46:53 -04:00
Fernando Sahmkow 30b176f92b texture_cache: Correct Texture Buffer Uploading 2019-07-04 19:38:19 -04:00
Zach Hilman ad50cd7df9 gl_shader_cache: Make CachedShader constructor private
Fixes missing review comments introduced.
2019-07-03 20:39:46 -04:00
Zach Hilman da5a537029
Merge pull request #2563 from ReinUsesLisp/shader-initializers
gl_shader_cache: Use static constructors for CachedShader initialization
2019-07-03 20:20:05 -04:00
Fernando Sahmkow 4705d1b523 rasterizer_cache: Protect inherited caches from submission level 2019-07-01 04:32:01 -04:00
ReinUsesLisp 6e1db6b703 texture_cache: Pack sibling queries inside a method 2019-06-29 20:47:46 -03:00
ReinUsesLisp 8eae66907e texture_cache: Use std::vector reservation for sampled_textures 2019-06-29 20:10:31 -03:00
ReinUsesLisp f6f1a8f26a texture_cache: Style changes 2019-06-29 19:52:37 -03:00
ReinUsesLisp dd9ace502b texture_cache: Use std::array for siblings_table 2019-06-29 18:54:13 -03:00
ReinUsesLisp 3f3c3ca5f9 texture_cache: Address feedback 2019-06-29 17:29:39 -03:00
Fernando Sahmkow 223ca80753 texture_cache: Correct variable naming. 2019-06-25 19:35:08 -04:00
Fernando Sahmkow 5aeabd9a17 gl_texture_cache: Correct asserts 2019-06-25 19:26:59 -04:00
Fernando Sahmkow 88bc39374f texture_cache: Corrections, documentation and asserts 2019-06-25 18:36:19 -04:00
Fernando Sahmkow c0abc7124d surface_params: Corrections, asserts and documentation. 2019-06-25 18:03:25 -04:00
Fernando Sahmkow fb234560b0 copy_params: use constexpr for constructor 2019-06-25 17:42:50 -04:00
Fernando Sahmkow 18d24fbdd0 gl_texture_cache: Corrections and fixes 2019-06-25 17:40:08 -04:00
Fernando Sahmkow 36665ce0b2 gl_resource_manager: Correct MakeStreamCopy 2019-06-25 17:32:04 -04:00
Fernando Sahmkow 58c8a44e7a texture_cache: Query MemoryManager from the system 2019-06-25 17:26:00 -04:00
ReinUsesLisp 7565389700 texture_cache: Include "core/core.h" 2019-06-24 02:15:57 -03:00
ReinUsesLisp e723441e37 gl_texture_cache: Explicitly add indirect include 2019-06-24 02:13:55 -03:00
ReinUsesLisp 34841a41c3 texture_cache/surface_view: Address feedback 2019-06-24 02:09:56 -03:00
ReinUsesLisp 0837290992 texture_cache/surface_base: Address feedback 2019-06-24 02:08:52 -03:00
ReinUsesLisp 75de730e28 video_core/surface: Address feedback 2019-06-24 02:07:11 -03:00
ReinUsesLisp 10a83653ee decode/texture: Address feedback 2019-06-24 02:05:05 -03:00
ReinUsesLisp 4504302abc renderer_opengl/utils: Remove unused includes and unused forward declaration 2019-06-24 02:03:37 -03:00
ReinUsesLisp 4b2ff1e00e gl_texture_cache: Address some feedback 2019-06-24 02:01:44 -03:00
ReinUsesLisp 0b6df52109 gl_shader_disk_cache: Address feedback 2019-06-24 01:59:32 -03:00
ReinUsesLisp b8b05a484a gl_shader_decompiler: Address feedback 2019-06-24 01:56:38 -03:00
ReinUsesLisp 4d63f97945 shader_bytecode: Include missing <array> 2019-06-24 01:51:02 -03:00
bunnei a9f3c54871
Merge pull request #2579 from ReinUsesLisp/fix-aoffi-test
gl_device: Fix TestVariableAoffi test
2019-06-21 15:28:55 -04:00
Fernando Sahmkow d1812316e1 texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
Fernando Sahmkow 51ba60b27e shader_cache: Correct versioning and size calculation. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 97c8c9f49a texture_cache: Eliminate linear textures fallthrough 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 6acdae0e4c texture_cache: Correct format R16U as sibling 2019-06-20 21:38:34 -03:00
Fernando Sahmkow d7587842eb texture_cache: Implement texception detection and texture barriers. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 198a0395bb texture_cache: Corrections to buffers and shadow formats use. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow fed773a86c texture_cache: Implement Irregular Views in surfaces 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 082740d34d surface: Correct format S8Z24 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 03d489dcf5 texture_cache: Initialize all siblings to invalid pixel format. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 9422cf7c10 gl_texture_cache: Use Stream Buffers instead of Persistant for Buffer Copies. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow fac3706253 gl_texture_cache: Correct Image Blit 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 7232a1ed16 decoders: correct block calculation 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 3dd7643214 texture_cache: Use siblings textures on Rebuild and fix possible error on blitting 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 4db28f72f6 texture_cache: Remove old rasterizer cache 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 2d83553ea7 texture_cache: Implement siblings texture formats. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow cb728797b0 fermi2d: Correct Origin Mode 2019-06-20 21:38:34 -03:00
Fernando Sahmkow a56f687793 texture_cache: correct texture buffer on surface params 2019-06-20 21:38:34 -03:00
Fernando Sahmkow b01f9c8a70 texture_cache: eliminate accelerated depth->color/color->depth copies due to driver instability. 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 561ce29c98 texture_cache: correct mutex locks 2019-06-20 21:38:34 -03:00
Fernando Sahmkow b7de31ac97 shader_ir: Fix image copy rebase issues 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 6f69f06873 texture_cache: Don't Image Copy if component types differ 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 9f755218a1 texture_cache: move some large methods to cpp files 2019-06-20 21:38:34 -03:00
Fernando Sahmkow 3809041c24 texture_cache: Optimize GetSurface and use references on functions that don't change a surface. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow 60bf761afb texture_cache: Implement Buffer Copy and detect Turing GPUs Image Copies 2019-06-20 21:38:33 -03:00
Fernando Sahmkow 228f516bb4 texture_cache uncompress-compress is untopological.
This makes conflicts between non compress and compress textures to be 
auto recycled. It also limits the amount of mipmaps a texture can have 
if it goes above it's limit.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow 9251354152 texture_cache: Correct copying between compressed and uncompressed formats 2019-06-20 21:38:33 -03:00
Fernando Sahmkow 0966665fc2 texture_cache: Only load on recycle with accurate GPU.
Testing so far has proven this to be quite safe as texture memory read 
added a 2-5ms load to the current cache.
2019-06-20 21:38:33 -03:00
Fernando Sahmkow ea1525dab1 Fix rebase errors 2019-06-20 21:38:33 -03:00
Fernando Sahmkow bdf9faab33 texture_cache: Handle uncontinuous surfaces. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow e60ed2bb3e texture_cache: return null surface on invalid address 2019-06-20 21:38:33 -03:00
Fernando Sahmkow fcac55d5bf texture_cache: Add checks for texture buffers. 2019-06-20 21:38:33 -03:00
Fernando Sahmkow 175aa343ff texture_cache: Fermi2D reform and implement View Mirage
This also does some fixes on compressed textures reinterpret and on the
Fermi2D engine in general.
2019-06-20 21:38:33 -03:00
ReinUsesLisp 1bf4154e7d gl_shader_decompiler: Implement image binding settings 2019-06-20 21:38:33 -03:00
ReinUsesLisp 9097301d92 shader: Implement bindless images 2019-06-20 21:38:33 -03:00
ReinUsesLisp 06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp 007ffbef1c gl_rasterizer: Track texture buffer usage 2019-06-20 21:38:33 -03:00
ReinUsesLisp 58c0d37422 video_core: Make ARB_buffer_storage a required extension 2019-06-20 21:36:12 -03:00
ReinUsesLisp 07f7ce1da2 gl_rasterizer_cache: Use texture buffers to emulate texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp b8c75a845b maxwell_3d: Partially implement texture buffers as 1D textures 2019-06-20 21:36:12 -03:00
ReinUsesLisp 6c81c8f5b7 gl_shader_decompiler: Allow 1D textures to be texture buffers 2019-06-20 21:36:12 -03:00
ReinUsesLisp 4e81fc8296 shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow d267948a73 texture_cache: loose TryReconstructSurface when accurate GPU is not on.
Also corrects some asserts.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow 6162cb922e texture_cache: Document the most important methods. 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 4530511ee4 texture_cache: Try to Reconstruct Surface on bigger than overlap.
This fixes clouds in SMO Cap Kingdom and lens on Cloud Kingdom.
Also moved accurate_gpu setting check to Pick Strategy
2019-06-20 21:36:12 -03:00
Fernando Sahmkow a79831d9d0 texture_cache: Implement Guard mechanism 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 7731a0e2d1 texture_cache: General Fixes
Fixed ASTC mipmaps loading
Fixed alignment on openGL upload/download
Fixed Block Height Calculation
Removed unalign_height
2019-06-20 21:36:12 -03:00
ReinUsesLisp c2ed348bdd surface_params: Ensure pitch is always written to avoid surface leaks 2019-06-20 21:36:12 -03:00
ReinUsesLisp 9098905dd1 gl_framebuffer_cache: Use a hashed struct to cache framebuffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow d65a4af895 texture_cache return invalid buffer on deactivated color_mask 2019-06-20 21:36:12 -03:00
Fernando Sahmkow 6bd034eae9 engine_upload: Addapt to new Texture Cache 2019-06-20 21:36:12 -03:00