Commit Graph

1615 Commits

Author SHA1 Message Date
Subv d27279092f GPU: Take into account predicated exits when performing shader control flow analysis. 2018-06-04 19:14:23 -05:00
bunnei 37fd4e6d9b
Merge pull request #512 from Subv/fset
GPU: Corrected the FSET and I2F instructions.
2018-06-04 19:04:20 -04:00
bunnei cdd92dc692
Merge pull request #501 from Subv/shader_bra
GPU: Partially implemented the bra shader instruction
2018-06-04 18:31:07 -04:00
bunnei 38d25a4cb2
Merge pull request #515 from Subv/viewport_fix
GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
2018-06-04 18:11:36 -04:00
Subv 2933521a08 GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f. 2018-06-04 16:41:28 -05:00
Subv f6679ce422 GPU: Corrected the I2F_R implementation. 2018-06-04 16:41:27 -05:00
Subv 5d55403f94 GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
This is how nouveau calculates the viewport width and height. For some reason some games set 0xFFFF in the VIEWPORT_HORIZ and VIEWPORT_VERT registers, maybe those are a misnomer and actually refer to something else?
2018-06-04 16:36:54 -05:00
Subv 0c688b421c GPU: Implemented the LOP32I instruction. 2018-06-04 13:56:31 -05:00
Subv cb47abecc6 GLCache: Corrected a mismatch between storing compressed sizes and verifying the uncompressed alignment in GetSurface. 2018-06-04 13:01:53 -05:00
Subv 90cddf1996 GPU: Use explicit types when retrieving the uniform values for fsetp/fset and isetp instead of the type of an invalid output register. 2018-06-04 11:22:26 -05:00
Subv 7c181fd4f4 GPU: Implemented the ISETP_R and ISETP_C shader instructions. 2018-06-04 11:12:03 -05:00
Subv b481d8a00d GPU: Partially implemented the shader BRA instruction. 2018-06-03 22:26:36 -05:00
Subv 06c72b4fcf GPU: Added decoding for the BRA instruction. 2018-06-03 22:14:00 -05:00
bunnei ba117854f9
Merge pull request #500 from Subv/long_queries
GPU: Partial implementation of long GPU queries.
2018-06-03 21:24:50 -04:00
Subv d57333406d GPU: Partial implementation of long GPU queries.
Long queries write a 128-bit result value to memory, which consists of a 64 bit query value and a 64 bit timestamp.

In this implementation, only select=Zero of the Crop unit is implemented, this writes the query sequence as a 64 bit value, and a 0u64 value for the timestamp, since we emulate an infinitely fast GPU.

This specific type was hwtested, but more rigorous tests should be performed in the future for the other types.
2018-06-03 19:17:31 -05:00
bunnei 1efcba346a gl_shader_decompiler: Implement TEXS component mask. 2018-06-03 12:08:17 -04:00
bunnei bb9d39b8fe
Merge pull request #494 from bunnei/shader-tex
gl_shader_decompiler: Implement TEX, fixes for TEXS.
2018-06-03 12:05:38 -04:00
bunnei 27c0f9e02d
Merge pull request #495 from bunnei/improve-rro
gl_shader_decompiler: Implement RRO as a register move.
2018-06-03 12:05:26 -04:00
bunnei e54ea773fc gl_shader_decompiler: Implement RRO as a register move. 2018-06-03 11:14:31 -04:00
Subv 99f9d47d16 GPU: Implemented the DXN1 (BC4) texture format. 2018-06-02 13:17:09 -05:00
bunnei 888eb345c0 gl_shader_decompiler: Implement TEX instruction. 2018-05-31 23:36:45 -04:00
bunnei 4c727d0ba8 gl_shader_decompiler: Support multi-destination for TEXS. 2018-05-31 22:57:32 -04:00
bunnei 49309b5848 gl_rasterizer_cache: Assert that component type is UNorm or format is RGBA16F. 2018-05-30 22:50:41 -04:00
bunnei ca5a4a704b gl_rasterizer_cache: Implement PixelFormat RGBA16F. 2018-05-30 22:24:07 -04:00
bunnei 15086a22be
Merge pull request #489 from Subv/vertexid
Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.
2018-05-30 14:10:48 -04:00
Subv 99f12b05fa Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader. 2018-05-30 10:58:03 -05:00
Sebastian Valle 8df011a57f
Merge pull request #483 from bunnei/sonic
Several GPU fixes to boot Sonic Mania
2018-05-30 07:31:46 -05:00
bunnei 6fcc7e9c36 gl_shader_decompiler: F2F_R instruction: Implement abs. 2018-05-29 23:52:54 -04:00
bunnei 68937a662d gl_shader_decompiler: Partially implement F2F_R instruction. 2018-05-29 23:10:44 -04:00
Subv 734106dcb9 GPU: Implemented the R8 texture format (0x1D) 2018-05-29 21:49:37 -05:00
bunnei 0d843eaba6 gl_rasterize_cache: Invert order of tex format RGB565. 2018-05-29 22:16:18 -04:00
greggameplayer 220d4672df add all the known TextureFormat (#474) 2018-05-28 19:26:17 -04:00
bunnei d809f65827
Merge pull request #472 from bunnei/greater-equal
gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual.
2018-05-27 12:14:30 -04:00
bunnei 7f155ba713
Merge pull request #476 from Subv/a1bgr5
GPU: Implemented the A1B5G5R5 texture format (0x14)
2018-05-27 12:14:08 -04:00
Subv 7ddc872b52 GPU: Implemented the A1B5G5R5 texture format (0x14) 2018-05-27 09:02:05 -05:00
bunnei c23ce3365d gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual. 2018-05-25 23:21:29 -04:00
bunnei ee53688ca7 shader_bytecode: Implement other variants of FMNMX. 2018-05-25 23:18:50 -04:00
bunnei aee356bd10
Merge pull request #468 from Subv/compound_preds
Shader: Implemented compound predicates in the fset and fsetp instructions
2018-05-25 22:28:47 -04:00
Subv e2cdf54177 Shader: Implemented compound predicates in fset.
You can specify a predicate in the fset instruction:

Result = ((Value1 Comp Value2) OP P0) ? 1.0 : 0.0;
2018-05-24 17:39:59 -05:00
Subv e2db7a83f6 GPU: Allow command lists to rebind a channel to another engine in the middle of the command list. 2018-05-24 17:32:46 -05:00
Subv 126270d963 Shader: Implemented compound predicates in fsetp.
You can specify three predicates in an fsetp instruction:

P1 = (Value1 Comp Value2) OP P0;
P2 = !(Value1 Comp Value2) OP P0;
2018-05-24 17:22:36 -05:00
bunnei 58857b9f46
Merge pull request #456 from Subv/unmap_buffer
Implemented nvhost-as-gpu's UnmapBuffer and nvmap's Free ioctls.
2018-05-20 23:54:50 -04:00
bunnei 898f0fa029
Merge pull request #458 from Subv/fmnmx
Shaders: Implemented the FMNMX shader instruction.
2018-05-20 23:44:07 -04:00
Sebastian Valle 6486544e09
Merge pull request #452 from Subv/psetp
ShadersDecompiler: Added decoding for the PSETP instruction.
2018-05-20 20:00:55 -05:00
Sebastian Valle 2dbfcd32d7
Merge pull request #451 from Subv/gl_array_size
GLRenderer: Remove unused vertex buffer and increase the size of the stream buffer to 128 MB.
2018-05-20 20:00:40 -05:00
Subv 8440cef223 Shaders: Implemented the FMNMX shader instruction. 2018-05-20 17:53:06 -05:00
Subv 72b5c448cf GPU: Implemented nvhost-as-gpu's UnmapBuffer ioctl.
It removes a mapping previously created with the MapBufferEx ioctl.
2018-05-20 14:25:56 -05:00
Subv a056d5ad8c ShadersDecompiler: Added decoding for the PSETP instruction. 2018-05-19 11:41:14 -05:00
Subv 98b143c2d6 GLRenderer: Remove unused hw_vao_enabled_attributes variable. 2018-05-19 11:36:38 -05:00
Subv 370ab5df9b GLRenderer: Remove unused vertex buffer and increase the size of the stream buffer to 128 MB.
The stream buffer is where all the vertex data is copied, some games require this to be much bigger than the 4 MB we used to have.
2018-05-19 11:36:09 -05:00
Subv 21959ddfef GLRenderer: Log the shader source code when program linking fails. 2018-05-19 11:19:34 -05:00
Lioncash 7c9644646f
general: Make formatting of logged hex values more straightforward
This makes the formatting expectations more obvious (e.g. any zero padding specified
is padding that's entirely dedicated to the value being printed, not any pretty-printing
that also gets tacked on).
2018-05-02 09:49:36 -04:00
bunnei 225ff1130f
Merge pull request #422 from bunnei/shader-mov
Shader instructions MOV_C, MOV_R, and several minor GPU things
2018-04-29 21:47:42 -04:00
bunnei f41eb95e13 maxwell_3d: Reset vertex counts after drawing. 2018-04-29 16:23:31 -04:00
bunnei 08b8fcbe6d gl_shader_decompiler: Implement MOV_R. 2018-04-29 16:05:18 -04:00
bunnei 316327f487 maxwell_to_gl: Implement type SignedNorm, Size_8_8_8_8. 2018-04-29 16:05:17 -04:00
bunnei c7ce472eeb shader_bytecode: Add decoding for FMNMX instruction. 2018-04-29 16:05:17 -04:00
Subv da32c648bf Shaders: Implemented predicate condition 3 (LessEqual) in the fset and fsetp instructions. 2018-04-29 12:49:41 -05:00
bunnei a71346cd7c gl_shader_decompiler: Implement MOV_C. 2018-04-29 13:13:13 -04:00
bunnei 6c464a2a4a
Merge pull request #416 from bunnei/shader-ints-p3
gl_shader_decompiler: Implement MOV32I, partially implement I2I, I2F
2018-04-29 12:56:16 -04:00
bunnei f87ea8fa8b fermi_2d: Fix surface copy block height. 2018-04-28 20:40:03 -04:00
bunnei 0c01c34eff gl_shader_decompiler: Partially implement I2I_R, and I2F_R. 2018-04-28 20:03:19 -04:00
bunnei e73927cfc2 gl_shader_decompiler: More cleanups, etc. with how we handle register types. 2018-04-28 20:03:19 -04:00
bunnei c691fa4074 GLSLRegister: Simplify register declarations, etc. 2018-04-28 20:03:19 -04:00
bunnei f2dcb39049 shader_bytecode: Add decodings for i2i instructions. 2018-04-28 20:03:18 -04:00
bunnei a7b5ab4d9a gl_shader_decompiler: Implement MOV32_IMM instruction. 2018-04-28 20:03:18 -04:00
bunnei 6b365f7703
Merge pull request #408 from bunnei/shader-ints-p2
gl_shader_decompiler: Add GLSLRegisterManager class to track register state.
2018-04-27 16:06:09 -04:00
Lioncash 16198f979e
renderer_opengl: Replace usages of LOG_GENERIC with fmt-capable equivalents 2018-04-27 12:09:35 -04:00
bunnei e6242ab5e6 gl_shader_decompiler: Add GLSLRegisterManager class to track register state. 2018-04-27 11:49:26 -04:00
Lioncash 8475496630
general: Convert assertion macros over to be fmt-compatible 2018-04-27 10:04:02 -04:00
bunnei c9d7abe9c9 gl_shader_decompiler: Boilerplate for handling integer instructions. 2018-04-26 14:38:42 -04:00
bunnei 37fa9a15cd gl_shader_decompiler: Move color output to EXIT instruction. 2018-04-26 14:38:41 -04:00
bunnei f81b915fd8
Merge pull request #396 from Subv/shader_ops
Shaders: Implemented the FSET instruction.
2018-04-25 22:42:54 -04:00
Subv 20d86d8a36 GPU: Partially implemented the Fermi2D surface copy operation.
The hardware allows for some rather complicated operations to be performed on the data during the copy, this is not implemented.
Only same-format same-size raw copies are implemented for now.
2018-04-25 12:54:26 -05:00
Subv e9ad8e9185 Shaders: Added bit decodings for the I2I instruction. 2018-04-25 12:52:55 -05:00
Subv 1740aa5444 Shaders: Implemented the FSET instruction.
This instruction is similar to the FSETP instruction, but it doesn't set a predicate, it sets the destination register to 1.0 if the condition holds, and 0 otherwise.
2018-04-25 12:52:32 -05:00
Subv 1dd4861d38 GPU: Make the Textures::CopySwizzledData function accessible from the outside of the file. 2018-04-25 11:55:30 -05:00
Subv a6da2b93c1 GPU: Added a function to retrieve the bytes per pixel of the render target formats. 2018-04-25 11:55:29 -05:00
Subv 378c881427 GPU: Added surface copy registers to Fermi2D 2018-04-25 11:55:29 -05:00
Subv b1109931b9 GPU: Added boilerplate code for the Fermi2D engine 2018-04-25 11:55:29 -05:00
Subv c16cfbbc6c GPU: Reduce the number of registers of Maxwell3D to 0xE00.
The rest are just macro shim registers.
2018-04-25 11:55:28 -05:00
Subv a994446b6e GPU: Move the Maxwell3D macro uploading code to the inside of the Maxwell3D processor.
It doesn't belong in the PFIFO handler.
2018-04-25 11:55:27 -05:00
Subv e2f2a49d2d GPU: Corrected the upper bound of the PFIFO method ids in the command processor. 2018-04-25 11:53:54 -05:00
Lioncash b7551e457b
video-core: Move logging macros over to new fmt-capable ones 2018-04-25 09:13:57 -04:00
Subv 0369ee7248 Shaders: Added decodings for the FSET instructions. 2018-04-24 22:42:54 -05:00
bunnei c30cd898fc renderer_opengl: Use correct byte order for framebuffer pixel format ABGR8. 2018-04-24 22:31:46 -04:00
bunnei f1a4a004fb gl_rasterizer_cache: Use CHAR_BIT for bpp conversions instead of 8. 2018-04-24 22:31:46 -04:00
bunnei 0a023cfb4f gl_rasterizer_cache: Use GPU PAGE_BITS/SIZE, not CPU. 2018-04-24 22:31:46 -04:00
bunnei 9022d926eb gl_rasterizer_cache: Use new logger. 2018-04-24 22:31:46 -04:00
bunnei fbb3cd110c gl_rasterizer_cache: Add a function for finding framebuffer GPU address. 2018-04-24 22:31:46 -04:00
bunnei bc0f1896fc gl_rasterizer_cache: Handle compressed texture sizes. 2018-04-24 22:31:46 -04:00
bunnei 4415e00181 gl_rasterizer_cache: Update to be based on GPU addresses, not CPU addresses. 2018-04-24 22:31:45 -04:00
bunnei 10c6d89119 memory_manager: Add implement CpuToGpuAddress. 2018-04-24 17:49:20 -04:00
bunnei 239ac8abe2 memory_manager: Make GpuToCpuAddress return an optional. 2018-04-24 17:49:19 -04:00
bunnei 9e11a76e92 memory_manager: Use GPUVAdddr, not PAddr, for GPU addresses. 2018-04-24 17:40:43 -04:00
bunnei e8c2bb24b2
Merge pull request #386 from Subv/gpu_query
GPU: Added asserts to our code for handling the QUERY_GET GPU command.
2018-04-24 16:13:51 -04:00
Lioncash d1b23b2b51
renderer_opengl: Silence a -Wdangling-else warning in DrawScreenTriangles() 2018-04-24 11:13:08 -04:00
bunnei 07dc0bbf3e
Merge pull request #379 from Subv/multi_buffers
GPU: Support multiple enabled vertex arrays.
2018-04-24 01:09:02 -04:00
Subv f208953585 GPU: Added asserts to our code for handling the QUERY_GET GPU command.
This is based on research from nouveau. Many things are currently unknown and will require hwtests in the future.
This commit also stubs QueryMode::Write2 to do the same as Write. Nouveau code treats them interchangeably, it is currently unknown what the difference is.
2018-04-23 17:06:57 -05:00
bunnei 3967f9c6ef
Merge pull request #383 from Subv/gpu_mmu
GPU: Make the GPU virtual memory manager use 16 page bits and 10 pagetable bits.
2018-04-23 14:00:52 -04:00
Subv 9531a29283 GPU: Support multiple enabled vertex arrays.
The vertex arrays will be copied to the stream buffer one after the other, and the attributes will be set using the ARB_vertex_attrib_binding extension.

yuzu now thus requires OpenGL 4.3 or the ARB_vertex_attrib_binding extension.
2018-04-23 11:34:50 -05:00
Subv f823c1d599 GPU: Make the GPU virtual memory manager use 16 page bits and 10 page table bits.
Also removed some dead code and added memory map consistency asserts.
2018-04-23 10:57:12 -05:00
Subv 010227e149 GPU: Implement the RGB10_A2 RenderTarget format, it will use the same format as the A2BGR10 texture format. 2018-04-23 10:50:28 -05:00
Subv c079cf4eec GPU: Implement the A2BGR10 texture format. 2018-04-21 17:32:25 -05:00
bunnei f8764bb5d3
Merge pull request #376 from bunnei/shader-decoder
Shader opcode decoding
2018-04-21 00:04:51 -04:00
bunnei f8a037ead4
Merge pull request #375 from lioncash/header
opengl: Remove unnecessary header inclusions
2018-04-20 23:08:47 -04:00
bunnei d08fd7e86d gl_shader_decompiler: Skip RRO instruction. 2018-04-20 22:30:56 -04:00
bunnei 8b28dc55e6 gl_shader_decompiler: Cleanup error logging. 2018-04-20 22:30:56 -04:00
bunnei e1630c4d43 shader_bytecode: Add several more instruction decodings. 2018-04-20 22:30:56 -04:00
bunnei 9f6d305eab shader_bytecode: Decode instructions based on bit strings. 2018-04-20 22:30:56 -04:00
bunnei 8ac3a3f45e
Merge pull request #369 from Subv/shader_instr2
ShaderGen: Implemented fsetp/kil and predicated instruction execution.
2018-04-20 22:29:39 -04:00
bunnei 634d9ee18b
Merge pull request #374 from lioncash/noexcept
gl_resource_manager: Add missing noexcept specifiers to move constructors and assignment operators
2018-04-20 22:28:47 -04:00
Subv 17a0ef1e1e ShaderGen: Implemented the KIL instruction, which is equivalent to 'discard'. 2018-04-20 21:09:34 -05:00
Subv c3a8ea76f1 ShaderGen: Implemented predicated instruction execution.
Each predicated instruction will be wrapped in an `if (predicate) { instruction_body; }` in the GLSL, where `predicate` is one of the predicate boolean variables previously set by fsetp.
2018-04-20 21:09:33 -05:00
Subv 0a5e01b710 ShaderGen: Implemented the fsetp instruction.
Predicate variables are now added to the generated shader code in the form of 'pX' where X is the predicate id.
These predicate variables are initialized to false on shader startup and are set via the fsetp instructions.

TODO:

* Not all the comparison types are implemented.
* Only the single-predicate version is implemented.
2018-04-20 21:09:33 -05:00
Lioncash eafdcc1b8a opengl: Remove unnecessary header inclusions 2018-04-20 20:19:37 -04:00
Lioncash ab71997b2c gl_resource_manager: Add missing noexcept specifiers to move constructors and assignment operators
Standard library containers may use std::move_if_noexcept to perform
move operations. If a move cannot be performed under these
circumstances, then a copy is attempted. Given we only intend for these
types to be move-only this can be somewhat problematic. By defining
these to be noexcept we prevent cases where copies may be attempted.
2018-04-20 20:04:00 -04:00
Lioncash 7db0b8d74f gl_rasterizer_cache: Make MatchFlags an enum class
Prevents implicit conversions and scope pollution.
2018-04-20 19:50:05 -04:00
Subv d03fc77475 ShaderGen: Register id 255 is special and is hardcoded to return 0 (SR_ZERO). 2018-04-20 14:57:40 -05:00
Subv 2e0a9f66a0 ShaderGen: Ignore the 'sched' instruction when generating shaders.
The 'sched' instruction has a very convoluted encoding, but fortunately it seems to only appear on a fixed interval (once every 4 instructions).
2018-04-20 14:57:40 -05:00
bunnei 326b044c19
Merge pull request #367 from lioncash/clamp
math_util: Remove the Clamp() function
2018-04-20 14:18:03 -04:00
Lioncash fae2dd0344
math_util: Remove the Clamp() function
C++17 adds clamp() to the standard library, so we can remove ours in
favor of it.
2018-04-20 10:14:13 -04:00
bunnei 701dd649e6
Merge pull request #363 from lioncash/array-size
common_funcs: Remove ARRAY_SIZE macro
2018-04-20 09:43:02 -04:00
Lioncash d9e316e353 common_funcs: Remove ARRAY_SIZE macro
C++17 has non-member size() which we can just call where necessary.
2018-04-19 22:36:52 -04:00
Lioncash 3841ec4200 renderer_opengl: Add missing header guards 2018-04-19 21:13:59 -04:00
bunnei 17ad56c1dc
Merge pull request #356 from lioncash/shader
glsl_shader_decompiler: Minor API changes to ShaderWriter
2018-04-19 21:09:25 -04:00
Lioncash e3b6f6c016 glsl_shader_decompiler: Use std::string_view instead of std::string for AddLine()
This function doesn't need to take ownership of the string data being
given to it, considering all we do is append the characters to the
internal string instance.

Instead, use a string view to simply reference the string data without
any potential heap allocation.

Now anything that is a raw const char* won't need to be converted to a
std::string before appending.
2018-04-19 20:12:58 -04:00
Lioncash 412b31ad72 glsl_shader_decompiler: Add AddNewLine() function to ShaderWriter
Avoids constructing a std::string just to append a newline character
2018-04-19 20:09:27 -04:00
Lioncash aa26baa3db glsl_shader_decompiler: Add char overload for ShaderWriter's AddLine()
Avoids constructing a std::string just to append a character.
2018-04-19 20:04:09 -04:00
Lioncash 4ef392906b glsl_shader_decompiler: Append indentation without constructing a separate std::string
The interface of std::string already lets us append N copies of a
character to an existing string.
2018-04-19 19:59:25 -04:00
Subv fe84842137 ShaderGen: Implemented the fmul32i shader instruction. 2018-04-19 13:46:32 -05:00
Subv 5367935d35 ShaderGen: Fixed a case where the TEXS instruction would use the same registers for the input and the output.
It will now save the coords before writing the outputs in a subscope.
2018-04-19 13:33:17 -05:00
Subv 057170928c GPU: Add support for the DXT23 and DXT45 compressed texture formats. 2018-04-18 20:48:53 -05:00
bunnei 60e6e8953e
Merge pull request #351 from Subv/tex_formats
GPU: Implemented the B5G6R5 format.
2018-04-18 20:20:51 -04:00
Subv 2985056340 GPU: Implemented the B5G6R5 format. 2018-04-18 18:16:45 -05:00
bunnei ce4f159b1c
gl_shader_gen: Support vertical/horizontal viewport flipping. (#347)
* gl_shader_gen: Support vertical/horizontal viewport flipping.

* fixup! gl_shader_gen: Support vertical/horizontal viewport flipping.
2018-04-18 16:42:40 -04:00
Subv 43d98ca8fe GLCache: Added boilerplate code to make supporting configurable texture component types.
For now only the UNORM type is supported.
2018-04-18 14:17:28 -05:00
Subv 5b3fab6766 GLCache: Unify texture and framebuffer formats when converting to OpenGL. 2018-04-18 14:17:28 -05:00
Subv b2c1672e10 GPU: Texture format 8 and framebuffer format 0xD5 are actually ABGR8. 2018-04-18 14:17:27 -05:00
Subv 48d4efbd69 GPU: Pitch textures are now supported, don't assert when encountering them. 2018-04-18 12:52:53 -05:00
Subv a3e82e8e1f GLCache: Take into account the texture's block height when caching and unswizzling. 2018-04-18 12:52:53 -05:00
Subv ac09b5a2e9 GLCache: Added a function to convert cached PixelFormats back to texture formats.
TODO: The way we handle cached formats must change, framebuffer and texture formats are too different to keep them in the same place.
2018-04-18 12:52:52 -05:00
Subv 6b63aaa5b4 GPU: Allow using a configurable block height when unswizzling textures. 2018-04-18 12:52:51 -05:00
Subv db5f2bfa7e GPU/TIC: Added the pitch and block height fields to the TIC structure. 2018-04-18 11:38:39 -05:00
bunnei c93ea96366
Merge pull request #346 from bunnei/misc-gpu-improvements
Misc gpu improvements
2018-04-17 22:17:07 -04:00
bunnei 71b4a3b9f6
Merge pull request #344 from bunnei/shader-decompiler-p2
Shader decompiler changes part 2
2018-04-17 22:10:53 -04:00
bunnei 7222d9a4c3 gl_rasterizer_cache: Add missing LOG statements. 2018-04-17 21:44:36 -04:00
bunnei 9df8e924fb texture: Add missing formats. 2018-04-17 21:41:36 -04:00
bunnei 3ed8a1cac7 gpu: Add several framebuffer formats to RenderTargetFormat. 2018-04-17 21:40:38 -04:00
bunnei 4a8eb6745e maxwell3d: Allow Texture2DNoMipmap as Texture2D. 2018-04-17 21:39:15 -04:00
bunnei 531c25386e shader_bytecode: Make ctor's constexpr and explicit. 2018-04-17 21:27:07 -04:00
bunnei 174cba5c58 renderer_opengl: Implement BlendEquation and BlendFunc. 2018-04-17 18:11:48 -04:00
bunnei 1f6fe062ca gl_shader_decompiler: Fix warnings with MarkAsUsed. 2018-04-17 16:36:44 -04:00
bunnei ed542a7309 gl_shader_decompiler: Cleanup logging, updating to NGLOG_*. 2018-04-17 16:36:44 -04:00
bunnei ef2d5ab0c1 gl_shader_decompiler: Implement several MUFU subops and abs_d. 2018-04-17 16:36:43 -04:00
bunnei 59f4ff4659 gl_shader_decompiler: Fix swizzle in GetRegister. 2018-04-17 16:36:42 -04:00
bunnei 5a28dce9eb gl_shader_decompiler: Implement FMUL/FADD/FFMA immediate instructions. 2018-04-17 16:36:42 -04:00
bunnei 8d4899d6ea gl_shader_decompiler: Allow vertex position to be used in fragment shader. 2018-04-17 16:36:40 -04:00
bunnei 95144cc39c gl_shader_decompiler: Implement IPA instruction. 2018-04-17 16:36:39 -04:00
bunnei 8b4443c966 gl_shader_decompiler: Add support for TEXS instruction. 2018-04-17 16:36:38 -04:00
bunnei 5ba71369ac gl_shader_decompiler: Use fragment output color for GPR 0-3. 2018-04-17 15:25:54 -04:00
bunnei 5d529698c9 gl_shader_decompiler: Partially implement MUFU. 2018-04-17 15:25:54 -04:00
bunnei 2b082e2710
Merge pull request #343 from Subv/tex_wrap_4
GPU: Implement some wrap modes
2018-04-17 12:25:24 -04:00
Subv 636ad34707 MaxwellToGL: Implemented tex wrap mode 1 (Wrap, GL_REPEAT). 2018-04-17 10:17:18 -05:00
Subv 7fc516cc1a MaxwellToGL: Added a TODO and partial implementation of maxwell wrap mode 4 (Clamp, GL_CLAMP).
This clamp mode was removed from OpenGL as of 3.1, we can emulate it by using GL_CLAMP_TO_BORDER to get the border color of the texture, and then manually sampling the edge to mix them in the fragment shader.
2018-04-17 10:16:50 -05:00
bunnei 77bdc49343 gl_rendering: Use NGLOG* for changed code. 2018-04-16 21:23:28 -04:00
bunnei 1a1af3fda3 gl_rasterizer: Implement indexed vertex mode. 2018-04-16 21:10:15 -04:00
Subv 477aab5960 GPU: Use the same buffer names in the generated GLSL and the buffer uploading code. 2018-04-15 15:02:50 -05:00
Subv 14ac40436e GPU: Don't use explicit binding points when uploading the constbuffers to opengl.
The bindpoints will now be dynamically calculated based on the number of buffers used by the previous shader stage.
2018-04-15 14:14:57 -05:00
Subv e128e90350 GPU: Don't use GetPointer when uploading the constbuffer data to the GPU. 2018-04-15 11:18:09 -05:00
Subv 7da47da66e GPU: Use the buffer hints from the shader decompiler to upload only the necessary const buffers for each shader stage. 2018-04-15 11:15:54 -05:00
bunnei 73d9c494ea shaders: Expose hints about used const buffers. 2018-04-15 11:50:10 -04:00
Subv c9b511da08 GPU: Upload the entirety of each constbuffer for each shader stage as SSBOs.
We're going to need the shader generator to give us a mapping of the actual used const buffers to properly bind them to the shader.
2018-04-14 23:02:05 -05:00
Subv 1957640ea2 GPU: Allow configuring ssbos in the opengl state manager. 2018-04-14 22:54:23 -05:00
Subv ae58e46036 GPU: Added a function to determine whether a shader stage is enabled or not. 2018-04-14 22:54:23 -05:00
bunnei 1b41b875dc shaders: Add NumTextureSamplers const, remove unused #pragma. 2018-04-14 18:50:06 -04:00
bunnei e6224fec27 shaders: Address PR review feedback. 2018-04-14 16:01:41 -04:00
bunnei eabeedf6af gl_shader_decompiler: Cleanup log statements. 2018-04-14 16:01:41 -04:00
bunnei 0d408b965b shaders: Fix GCC and clang build issues. 2018-04-14 16:01:40 -04:00
bunnei 86135864da gl_shader_decompiler: Implement negate, abs, etc. and lots of cleanup. 2018-04-14 16:01:40 -04:00
bunnei 7639667562 shader_bytecode: Add FSETP and KIL to GetInfo. 2018-04-14 16:01:40 -04:00
bunnei 5a47832221 shader_bytecode: Add SubOp decoding. 2018-04-14 16:01:40 -04:00
bunnei 50023bdae7 gl_shader_decompiler: Add shader stage hint. 2018-04-14 16:01:39 -04:00
bunnei a992aac5eb renderer_opengl: Fix Morton copy byteswap, etc. 2018-04-14 16:01:39 -04:00
bunnei 0ca8fce9d0 gl_shader_manager: Implement SetShaderSamplerBindings. 2018-04-13 23:48:30 -04:00
bunnei beddc8afd2 gl_rasterizer: Generate shaders and upload uniforms. 2018-04-13 23:48:29 -04:00
bunnei 85d77a3d24 gl_shader_decompiler: Basic impl. for very simple vertex shaders.
- Tested with Puyo Puyo Tetris and Cave Story+
2018-04-13 23:48:28 -04:00
bunnei 51f37f5061 gl_shader_manager: Cleanup and consolidate uniform handling. 2018-04-13 23:48:28 -04:00
bunnei 35aca0bf1f maxwell_3d: Make memory_manager public. 2018-04-13 23:48:27 -04:00
bunnei 33bb53571b maxwell_3d: Fix shader_config decodings. 2018-04-13 23:48:26 -04:00
bunnei 5617831d5f gl_rasterizer: Use shader program manager, remove test shader. 2018-04-13 23:48:26 -04:00
bunnei 459826a705 renderer_opengl: Add gl_shader_manager class. 2018-04-13 23:48:25 -04:00
bunnei 8aa21a03b3 maxwell_to_gl: Add a few types, etc. 2018-04-13 23:48:24 -04:00
bunnei 10953495c1 gl_shader_gen: Add hashable setup/config structs. 2018-04-13 23:48:23 -04:00
bunnei 2fcbb35ad2 gl_shader_util: Add missing includes. 2018-04-13 23:48:23 -04:00
bunnei da1114ca59 renderer_opengl: Use OGLProgram instead of OGLShader. 2018-04-13 23:48:21 -04:00
bunnei 4f2b2d0bc5 gl_shader_util: Grab latest upstream. 2018-04-13 23:48:21 -04:00
bunnei dbfd106ba0 gl_resource_manager: Grab latest upstream. 2018-04-13 23:48:20 -04:00
bunnei ed7e597b44 gl_shader_decompiler: Add skeleton code from Citra for shader analysis. 2018-04-13 23:48:20 -04:00
bunnei 4e7e0f8112 shader_bytecode: Add initial module for shader decoding. 2018-04-13 23:48:19 -04:00
James Rowe 0b855f1c21 Fix clang format issues 2018-04-06 22:00:48 -06:00
Subv dcc27d6dc1 GPU: Assert when finding a texture with a format type other than UNORM. 2018-04-06 20:44:46 -06:00
Subv b0ca330e14 GL: Set up the textures used for each draw call.
Each Maxwell shader stage can have an arbitrary number of textures, but we're limited to a certain number in OpenGL. We try to only use the minimum amount of host textures by not keeping a 1:1 relation between guest texture ids and host texture ids, ie, guest texture id 8 can be host texture id 0 if it's the only texture used in the guest shader program.
This mapping will have to be passed to the shader decompiler so it can rewrite the texture accesses.
2018-04-06 20:44:46 -06:00
Subv cb3183212d GL: Bind the textures to the shaders used for drawing. 2018-04-06 20:44:46 -06:00
Subv 65faeb9b2a GLCache: Specialize the MortonCopy function for the DXT1 texture format.
It will now use the UnswizzleTexture function instead of the MortonCopyPixels128, which doesn't seem to work for textures.
2018-04-06 20:44:46 -06:00
Subv b258403f0d GLCache: Implemented GetTextureSurface. 2018-04-06 20:44:45 -06:00
Subv 65ea52394b GLCache: Support uploading compressed textures to the GPU.
Compressed texture formats like DXT1, DXT2, DXT3, etc will use this to ease the load on the CPU.
2018-04-06 20:44:45 -06:00
Subv 73eaef9c05 GL: Remove remaining references to 3DS-specific pixel formats 2018-04-06 20:44:42 -06:00
Subv b305646c44 RasterizerCache: Remove 3DS-specific pixel formats.
We're only left with RGB8 and DXT1 for now. More will be added as they are needed.
2018-04-06 20:40:24 -06:00
Subv c28ed85875 GL: Create the sampler objects when starting up the GL rasterizer. 2018-04-06 20:40:24 -06:00
Subv ca96b04a0c GL: Ported the SamplerInfo struct from citra. 2018-04-06 20:40:24 -06:00
Subv 0171ec606b GL: Rename PicaTexture to MaxwellTexture. 2018-04-06 20:40:24 -06:00
Subv f73a280eeb GL: Added functions to convert Maxwell tex filters and wrap modes to OpenGL. 2018-04-06 20:40:23 -06:00
Subv ad1810e895 Textures: Added a helper function to know if a texture is blocklinear or pitch. 2018-04-06 20:40:23 -06:00
N00byKing d1d7582a5b
rasterizer_interface.h: Update from citra to yuzu 2018-04-04 23:07:58 +02:00
N00byKing 27dbbd8227
gl_rasterizer_cache.cpp: Update from citra to yuzu 2018-04-04 23:05:10 +02:00
N00byKing cfc28e0c1a
gl_rasterizer_cache.h: Update from citra to yuzu 2018-04-04 23:04:24 +02:00
N00byKing ca17f581f5
renderer_opengl.h: Update from citra to yuzu 2018-04-04 23:03:02 +02:00
Subv 11b4ab9685 GPU: Use the MacroInterpreter class to execute the GPU macros instead of HLEing them. 2018-04-01 12:07:26 -05:00
Subv 1ec8d2123d GPU: Implemented a gpu macro interpreter.
The Ryujinx macro interpreter and envydis were used as reference.

Macros are programs that are uploaded by the games during boot and can later be called by writing to their method id in a GPU command buffer.
2018-04-01 12:07:26 -05:00
bunnei 5e343edc9e renderer_opengl: Use better naming for DrawScreens and DrawSingleScreen. 2018-03-26 21:17:07 -04:00
bunnei c33abac275 gl_rasterizer: Move code to bind framebuffer surfaces before draw to its own function. 2018-03-26 21:17:05 -04:00
bunnei d30110348b gl_rasterizer: Add a SyncViewport method. 2018-03-26 21:17:04 -04:00
bunnei 67bc2f5ecd gl_rasterizer: Move PrimitiveTopology check to MaxwellToGL. 2018-03-26 21:17:03 -04:00
bunnei 666d53299c graphics_surface: Fix merge conflicts. 2018-03-26 21:17:03 -04:00
bunnei ac19e3d061 gl_rasterizer: Use ReadBlock instead of GetPointer for SetupVertexArray. 2018-03-26 21:17:02 -04:00
bunnei a6cab532f8 gl_rasterizer: Normalize vertex array data as appropriate. 2018-03-26 21:17:02 -04:00
bunnei 527ce12ce4 maxwel_to_gl: Fix string formatting in log statements. 2018-03-26 21:17:01 -04:00
bunnei d89bfec5f5 rasterizer: Rename DrawTriangles to DrawArrays. 2018-03-26 21:17:00 -04:00
bunnei 1bfc0dc2db gl_rasterizer: Use passthrough shader for SetupVertexShader. 2018-03-26 21:17:00 -04:00
bunnei 0a5832798a renderer_opengl: Logging, etc. cleanup. 2018-03-26 21:16:59 -04:00
bunnei 7504df52fc renderer_opengl: Remove framebuffer RasterizerFlushVirtualRegion hack. 2018-03-26 21:16:58 -04:00
bunnei c1ccbf332f gl_rasterizer_cache: Implement UpdatePagesCachedCount. 2018-03-26 21:16:58 -04:00
bunnei c2dbdefedf gl_rasterizer: Implement SetupVertexArray. 2018-03-26 21:16:56 -04:00
bunnei cd8bb6ea9b gl_rasterizer_cache: Fix an ASSERT_MSG. 2018-03-26 21:16:56 -04:00
bunnei 4369af6b7e maxwell_to_gl: Add module and function for decoding VertexType. 2018-03-26 21:16:55 -04:00
bunnei 3754e0fdfd maxwell_3d: Use names that match envytools for VertexType. 2018-03-26 21:16:55 -04:00
bunnei 15925b8293 maxwell_3d: Add VertexAttribute struct and cleanup. 2018-03-26 21:16:54 -04:00
bunnei 0ee38e1363 gl_rasterizer: Use 32 texture units instead of 3. 2018-03-26 21:16:53 -04:00
bunnei 0162a2d5cb gl_rasterizer: Implement DrawTriangles. 2018-03-26 21:16:53 -04:00
bunnei 33c0bf9dc5 Maxwell3D: Call AccelerateDrawBatch on DrawArrays. 2018-03-26 21:16:52 -04:00
bunnei ed2134784e gl_rasterizer: Implement AnalyzeVertexArray. 2018-03-26 21:16:52 -04:00
bunnei 8041d72a1f gl_rasterizer_cache: MortonCopy Switch-style. 2018-03-26 21:16:51 -04:00
bunnei 170ac3f9ee gl_rasterizer_cache: Implement GetFramebufferSurfaces. 2018-03-26 21:16:51 -04:00
bunnei 94c70693f9 maxwell: Add RenderTargetFormat enum. 2018-03-26 21:16:49 -04:00
bunnei 1a9df83535 renderer_opengl: Only draw the screen if a framebuffer is specified. 2018-03-26 21:16:49 -04:00
Subv 4697025b73 GPU: Load the sampler info (TSC) when retrieving active textures. 2018-03-26 15:46:49 -05:00
Subv 56e2013c1f GPU: Added the TSC structure. It contains information about the sampler. 2018-03-26 15:45:05 -05:00
Subv 6afe9e0105 GPU: Added more fields to the TIC structure. 2018-03-26 15:44:20 -05:00
Subv 0ce52b1da2 GPU: Make the debug_context variable a member of the frontend instead of a global. 2018-03-24 23:35:06 -05:00
Subv 2c785bd06c GPU: Added a function to retrieve the active textures for a shader stage.
TODO: A shader may not use all of these textures at the same time, shader analysis should be performed to determine which textures are actually sampled.
2018-03-24 11:31:53 -05:00
Subv 39e60cfeb1 Frontend: Updated the surface view debug widget to work with Maxwell surfaces. 2018-03-24 11:31:53 -05:00
Subv 1c31e2b3d2 GPU: Implement the Incoming/FinishedPrimitiveBatch debug breakpoints. 2018-03-24 11:31:50 -05:00
Subv 1ad97c75a0 GPU: Implement the MaxwellCommandLoaded/Processed debug breakpoints. 2018-03-24 11:31:50 -05:00
Subv 77fd0d47e7 Frontend: Ported the GPU breakpoints and surface viewer widgets from citra. 2018-03-24 11:31:49 -05:00
Subv 1b8d798835 GPU: Added a method to unswizzle a texture without decoding it.
Allow unswizzling of DXT1 textures.
2018-03-24 11:30:56 -05:00
Subv 71ebc3e90d GPU: Preliminary work for texture decoding. 2018-03-24 11:30:56 -05:00
Subv 9b9de30086 GPU: Added viewport registers to Maxwell3D's reg structure. 2018-03-24 01:22:19 -05:00
bunnei d561e4acc8 gl_rasterizer: Fake render in green, because it's cooler. 2018-03-23 22:27:53 -04:00
bunnei 4ed54738fc gl_rasterizer: Log warning instead of sync'ing unimplemented funcs. 2018-03-23 22:24:16 -04:00
bunnei b7da9d5a54 gl_rasterizer_cache: Add missing include for vm_manager. 2018-03-23 16:54:20 -04:00
bunnei 0f8401906b renderer_opengl: Only invalidate the framebuffer region, not flush. 2018-03-23 15:52:14 -04:00
bunnei 054393917e renderer_opengl: Fixes for properly flushing & rendering the framebuffer. 2018-03-23 15:49:04 -04:00
bunnei b36b627d4d RasterizerCacheOpenGL: FlushAll should flush full memory region. 2018-03-23 15:25:16 -04:00
bunnei 11047d7fd5 rasterizer: Flush and invalidate regions should be 64-bit. 2018-03-23 15:01:45 -04:00
bunnei cdf541fb5b renderer_opengl: Add framebuffer_transform_flags member variable. 2018-03-23 14:59:14 -04:00
bunnei ec4e1a3685 renderer_opengl: Better handling of framebuffer transform flags. 2018-03-23 14:58:27 -04:00
bunnei c2c55e0811 renderer_opengl: Use accelerated framebuffer load with LoadFBToScreenInfo. 2018-03-22 23:28:37 -04:00
bunnei a0b1235f82 gl_rasterizer: Implement AccelerateDisplay method from Citra. 2018-03-22 23:06:54 -04:00
bunnei f61b9f7338 LoadGLBuffer: Use bytes_per_pixel, not bits. 2018-03-22 23:01:57 -04:00
bunnei 6ced80bb47 gl_rasterizer_cache: LoadGLBuffer should do a morton copy. 2018-03-22 22:54:04 -04:00
bunnei 740310113b video_core: Move MortonCopyPixels128 to utils header. 2018-03-22 22:52:40 -04:00
bunnei 8a250de987 video_core: Remove usage of PAddr and replace with VAddr. 2018-03-22 21:13:46 -04:00
bunnei bfe45774f1 video_core: Move FramebufferInfo to FramebufferConfig in GPU. 2018-03-22 21:04:30 -04:00
bunnei c6362543d4 gl_rasterizer: Replace a bunch of UNIMPLEMENTED with ASSERT. 2018-03-22 20:19:34 -04:00
bunnei f707c2dac4 gl_rasterizer: Add a simple passthrough shader in lieu of shader generation. 2018-03-22 20:00:41 -04:00
bunnei 7c3a263839 gpu: Expose Maxwell3D engine. 2018-03-22 19:48:20 -04:00
bunnei 3a6604e8fa maxwell_3d: Add some format decodings and string helper functions. 2018-03-22 19:47:28 -04:00
bunnei 656de23d93 renderer: Create rasterizer and cleanup. 2018-03-22 19:46:37 -04:00
Subv c450d264eb GPU: Added vertex attribute format registers. 2018-03-21 09:26:47 -05:00
Subv ae28a52277 GPU: Added registers for the number of vertices to render. 2018-03-20 23:28:06 -05:00
bunnei 0b3ab30762
Merge pull request #254 from bunnei/port-citra-renderer
Port Citra OpenGL rasterizer code
2018-03-20 21:37:43 -04:00
bunnei 6e3222363c renderer_gl: Port boilerplate rasterizer code over from Citra. 2018-03-20 00:07:32 -04:00
bunnei 9c468e0c55 gl_shader_util: Sync latest version with Citra. 2018-03-20 00:07:31 -04:00
bunnei d7b1ebe4a8 renderer_gl: Port over gl_shader_gen module from Citra. 2018-03-20 00:07:30 -04:00
Mat M f4700ccabf
Merge pull request #253 from Subv/rt_depth
GPU: Added registers for color and Z buffers.
2018-03-19 23:37:47 -04:00
bunnei 4bdb46e4c2 renderer_gl: Port over gl_shader_decompiler module from Citra. 2018-03-19 23:14:03 -04:00
bunnei a3e10b1a72 renderer_gl: Port over gl_rasterizer_cache module from Citra. 2018-03-19 23:14:03 -04:00
bunnei db0cfb8e8b gl_resource_manager: Sync latest version with Citra. 2018-03-19 23:14:02 -04:00
bunnei 0e4b9cdde4 renderer_gl: Port over gl_stream_buffer module from Citra. 2018-03-19 23:14:02 -04:00
bunnei 6a0902e56d gl_state: Sync latest version with Citra. 2018-03-19 23:13:49 -04:00
Subv 7a27a11770 GPU: Added Z buffer registers to Maxwell3D's reg structure. 2018-03-19 16:55:33 -05:00
Subv 21d9519032 GPU: Added the render target (RT) registers to Maxwell3D's reg structure. 2018-03-19 16:46:29 -05:00
N00byKing 1d8b6ad13b Clang Fixes 2018-03-19 17:53:35 +01:00
N00byKing ef875d6a35 Clean Warnings (?) 2018-03-19 17:07:08 +01:00
Subv dcae0c9a4f GPU: Added the TSC registers to the Maxwell3D register structure. 2018-03-19 00:36:25 -05:00
Subv cff7b29bba GPU: Added the TIC registers to the Maxwell3D register structure. 2018-03-19 00:32:57 -05:00
Subv 03156d0c9a GPU: Implement macro 0xE1A BindTextureInfoBuffer in HLE.
This macro simply sets the current CB_ADDRESS to the texture buffer address for the input shader stage.
2018-03-18 19:03:40 -05:00
Subv 7b6868e908 GPU: Implement the BindStorageBuffer macro method in HLE.
This macro binds the SSBO Info Buffer as the current ConstBuffer.
This buffer is usually bound to c0 during shader execution.
Games seem to use this macro instead of directly writing the address for some reason.
2018-03-18 16:50:42 -05:00
Subv 85d820b1b4 GPU: Handle writes to the CB_DATA method.
Writing to this method will cause the written value to be stored in the currently-set ConstBuffer plus CB_POS.

This method is usually used to upload uniforms or other shader-visible data.
2018-03-18 15:23:24 -05:00
Subv a64b936cbe GPU: Move the GPU's class constructor and destructors to a cpp file.
This should reduce recompile times when editing the Maxwell3D register structure.
2018-03-18 15:23:24 -05:00
Subv aa586fa268 GPU: Store uploaded GPU macros and keep track of the number of method parameters. 2018-03-18 11:51:46 -05:00
Subv 7ac8657432 GPU: Macros are specific to the Maxwell3D engine, so handle them internally. 2018-03-18 11:51:45 -05:00
Subv ccb8da1512 GPU: Renamed ShaderType to ShaderStage as that is less confusing. 2018-03-17 18:32:57 -05:00
Subv 88698c156f GPU: Store shader constbuffer bindings in the GPU state. 2018-03-17 18:32:57 -05:00
Subv 66dae22790 GPU: Corrected some register offsets and removed superfluous macro registers. 2018-03-17 18:32:56 -05:00
Subv 1d9d9c16e8 GPU: Make the SetShader macro call do the same as the real macro's code.
It'll now set the CB_SIZE, CB_ADDRESS and CB_BIND registers when it's called.

Presumably this SetShader function is binding the constant shader uniforms to buffer 1 (c1[]).
2018-03-17 18:32:55 -05:00
Subv 579000e747 GPU: Corrected the parameter documentation for the SetShader macro call.
Register 0xE24 is actually a macro that sets some shader parameters in the register structure.

Macros are uploaded to the GPU at startup and have their own ISA, we'll probably write an interpreter for this in the future.
2018-03-17 13:55:42 -05:00
bunnei 516ef4f19f
Merge pull request #242 from Subv/set_shader
GPU: Handle the SetShader method call (0xE24) and store the shader config.
2018-03-17 00:34:17 -04:00
Subv f93d769a1c GPU: Handle the SetShader method call (0xE24) and store the shader config. 2018-03-16 22:51:06 -05:00
Subv d2888f7e90 GPU: Added the vertex array registers. 2018-03-16 22:47:45 -05:00
bunnei cd4e8a989c
Merge pull request #241 from Subv/gpu_method_call
GPU: Process command mode 5 (IncreaseOnce) differently from other commands
2018-03-16 22:28:22 -04:00
Subv 29feece4b8 GPU: Process command mode 5 (IncreaseOnce) differently from other commands.
Accumulate all arguments before calling the desired method.

Note: Maybe we should do the same for the NonIncreasing mode?
2018-03-16 20:32:44 -05:00
Subv bf310a41b8 GPU: Assert that we get a 0 CODE_ADDRESS register in the 3D engine.
Shader address calculation depends on this value to some extent, we do not currently know what it being 0 entails.
2018-03-16 19:24:41 -05:00
Subv cbec739e7b GPU: Added Maxwell registers for Shader Program control. 2018-03-16 19:23:11 -05:00
Subv 5fb4c718cc GPU: Intercept writes to the VERTEX_END_GL register.
This is the register that gets written after a game calls DrawArrays().

We should collect all GPU state and draw using our graphics API here.
2018-03-04 19:14:04 -05:00
Lioncash 490d0e36a0
maxwell_3d: Make constructor explicit 2018-02-13 23:47:51 -05:00
bunnei af8ae770ef
Merge pull request #187 from Subv/maxwell3d_query
GPU: Partially implemented the QUERY_* registers in the Maxwell3D engine.
2018-02-13 23:25:07 -05:00
bunnei be5ba4d952
Merge pull request #178 from Subv/command_buffers
GPU: Added a command processor to decode the GPU pushbuffers and forward the commands to their respective engines
2018-02-12 13:51:52 -05:00
Subv ac61a7d1e6 GPU: Partially implemented the QUERY_* registers in the Maxwell3D engine.
Only QueryMode::Write is supported at the moment.
2018-02-12 12:34:41 -05:00
Subv 6cddf9d88e Make a GPU class in VideoCore to contain the GPU state.
Also moved the GPU MemoryManager class to video_core since it makes more sense for it to be there.
2018-02-11 23:44:12 -05:00
Subv e01a8f2187 GPU: Added a command processor to decode the GPU pushbuffers and forward the commands to their respective engines. 2018-02-11 22:42:48 -05:00
bunnei deadcb39c2 renderer_opengl: Support framebuffer flip vertical. 2018-02-11 21:03:55 -05:00
MerryMage 738f91a57d memory: Replace all memory hooking with Special regions 2018-01-27 15:16:39 +00:00
James Rowe 096be16636 Format: Run the new clang format on everything 2018-01-20 16:45:11 -07:00
Lioncash e710a1b989 CMakeLists: Derive the source directory grouping from targets themselves
Removes the need to store to separate SRC and HEADER variables, and then
construct the target in most cases.
2018-01-17 21:51:43 -05:00
MerryMage e35644c005 clang-format 2018-01-16 18:05:21 +00:00
bunnei 92801b1c34 renderer_gl: Clear screen to black before rendering framebuffer. 2018-01-15 00:20:19 -05:00
bunnei ebd613c2cc renderer: Render previous frame when no new one is available. 2018-01-14 23:54:56 -05:00
MerryMage e86bdb1601 Fix build on macOS and linux 2018-01-13 22:38:52 +00:00
James Rowe 389979018c Remove gpu debugger and get yuzu qt to compile 2018-01-12 19:11:04 -07:00
James Rowe 1d28b2e142 Remove references to PICA and rasterizers in video_core 2018-01-12 19:11:03 -07:00
bunnei 11adef4843 renderer_opengl: Fix LOG_TRACE in LoadFBToScreenInfo. 2018-01-11 22:32:44 -05:00
bunnei ee4691297f renderer_opengl: Support rendering Switch framebuffer. 2018-01-10 23:28:59 -05:00
bunnei 236d463c52 render_base: Add a struct describing framebuffer metadata. 2018-01-10 23:28:56 -05:00
bunnei 866e66dc31 renderer_opengl: Add MortonCopyPixels function for Switch framebuffer. 2018-01-10 23:28:53 -05:00
bunnei 9e2ad45c98 renderer_opengl: Update DrawScreens for Switch. 2018-01-10 23:28:49 -05:00
bunnei 93480b10ef core/video_core: Fix a bunch of u64 -> u32 warnings. 2018-01-01 15:40:35 -05:00
bunnei 960a1416de hle: Initial implementation of NX service framework and IPC. 2017-10-14 22:18:42 -04:00
Huw Pascoe b3b34a1e76 Extracted the attribute setup and draw commands into their own functions 2017-10-04 01:08:29 +01:00
Huw Pascoe a13ab958cb Fixed type conversion ambiguity 2017-09-30 09:34:35 +01:00
Subv a321bce378 Disable unary operator- on Math::Vec2/Vec3/Vec4 for unsigned types.
It is unlikely we will ever use this without first doing a Cast to a signed type.
Fixes 9 "unary minus operator applied to unsigned type, result still unsigned" warnings on MSVC2017.3
2017-09-27 09:06:41 -05:00
B3n30 dc6a365337 Merge pull request #2951 from huwpascoe/perf-4
Optimized Morton
2017-09-25 08:28:55 +02:00
Huw Pascoe 903906da3b Optimized Float<M,E> multiplication
Before:

ucomiss xmm1, xmm1
jp      .L9
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm0, xmm2
setp    al
cmovne  eax, edx
test    al, al
jne     .L9
.L3:
movaps  xmm0, xmm2
ret
.L9:
ucomiss xmm0, xmm0
jp      .L10
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm1, xmm2
setp    al
cmovne  eax, edx
test    al, al
je      .L3

After:

movaps  xmm2, xmm1
mulss   xmm2, xmm0
ucomiss xmm2, xmm2
jnp     .L3
ucomiss xmm1, xmm0
jnp     .L11
.L3:
movaps  xmm0, xmm2
ret
.L11:
pxor    xmm2, xmm2
jmp     .L3
2017-09-25 00:54:02 +01:00
Huw Pascoe 876aa82c29 Optimized Morton 2017-09-24 22:27:14 +01:00
James Rowe 93930a966f Merge pull request #2921 from jroweboy/batch-fix-2
GPU: Add draw for immediate and batch modes
2017-09-24 07:57:16 -06:00
James Rowe 19d41dcc6e Remove pipeline.gpu_mode and fix minor issues 2017-09-23 09:28:20 -06:00
Yuri Kunde Schlesner a7758b0b36 Merge pull request #2928 from huwpascoe/master
Fixed framebuffer warning
2017-09-22 04:06:38 +02:00
Huw Pascoe a234e4c200 Improved performance of FromAttributeBuffer
Ternary operator is optimized by the compiler
whereas std::min() is meant to return a value.

I've noticed a 5%-10% emulation speed increase.
2017-09-17 15:56:36 +01:00
Huw Pascoe 6a110ac5f5 Fixed framebuffer warning 2017-09-17 11:57:06 +01:00
Yuri Kunde Schlesner 699c920991 Merge pull request #2900 from wwylele/clip-2
PICA: implement custom clip plane
2017-09-16 10:23:00 +02:00
James Rowe ad0b57f407 GPU: Add draw for immediate and batch modes
PR #1461 introduced a regression where some games would change configuration
even while in the poorly named "drawing" mode, which broke the heuristic
citra was using to determine when to draw the batch. This change adds
back in a draw call for batching, and also adds in a draw call in
immediate mode each time it adds a triangle.
2017-09-11 09:21:43 -06:00
bunnei 11baa40d75 Merge pull request #2865 from wwylele/gs++
PICA: implemented geometry shader
2017-09-07 23:02:59 -04:00
bunnei ff4941fb3a Merge pull request #2914 from wwylele/fresnel-fix
pica/lighting: only apply Fresnel factor for the last light
2017-09-05 10:00:49 -04:00
wwylele 12fbc8c8df pica/lighting: only apply Fresnel factor for the last light 2017-09-03 08:22:03 +03:00
wwylele e2c41a5891 video_core: report telemetry for gas mode 2017-08-31 12:54:17 +03:00
bunnei f0e461bf6f Merge pull request #2891 from wwylele/sw-bump
SwRasterizer/Lighting: implement bump mapping
2017-08-30 21:07:30 -04:00
Weiyi Wang 647f017c6d Merge pull request #2892 from Subv/warnings2
Warnings: Fixed a few missing-return warnings in video_core.
2017-08-28 03:21:51 -05:00
Subv da88f3b8f0 Warnings: Fixed a few missing-return warnings in video_core. 2017-08-26 11:58:22 -05:00
wwylele 417cb45e3f SwRasterizer/Clipper: flip the sign convention to match PICA and OpenGL 2017-08-25 07:26:45 +03:00
wwylele addbcd5784 gl_rasterizer: implement custom clip plane 2017-08-25 07:26:45 +03:00
wwylele ea51a3af26 SwRasterizer: implement custom clip plane 2017-08-24 15:34:27 +03:00
wwylele 17c6104d2a gl_rasterizer/lighting: more accurate CP formula 2017-08-22 09:34:44 +03:00
wwylele b5aa570354 SwRasterizer/Lighting: implement LUT input CP 2017-08-22 09:34:44 +03:00
wwylele 3e478ca131 SwRasterizer/Lighting: implement bump mapping 2017-08-22 09:34:44 +03:00
wwylele 63b6e802cd swrasterizer: remove invalid TODO
This function is called in clipping, before the pespective divide, and is not used in later rasterization. Thus it doesn't need perspective correction.
2017-08-21 08:03:07 +03:00
wwylele 72b26ac32f swrasterizer/clipper: remove tested TODO
hwtested. Current implementation is the correct behavior
2017-08-21 08:03:07 +03:00
wwylele 5a4af616c6 gl_shader_gen: simplify and clarify the depth transformation between vertex shader and fragment shader 2017-08-21 08:03:07 +03:00
wwylele 1eca380886 gl_rasterizer: add clipping plane z<=0 defined in PICA 2017-08-21 08:03:07 +03:00
Yuri Kunde Schlesner 46d1ca768d Merge pull request #2872 from wwylele/sw-geo-factor
SwRasterizer/Lighting: implement geometric factor
2017-08-20 17:49:42 -07:00
James Rowe 8afa81ac1b Merge pull request #2871 from wwylele/sw-spotlight
SwRasterizer/Lighting: implement spot light
2017-08-19 20:10:24 -06:00
wwylele 0f35755572 pica/command_processor: build geometry pipeline and run geometry shader
The geometry pipeline manages data transfer between VS, GS and primitive assembler. It has known four modes:
 - no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
 - GS mode 0: sends VS output to GS input registers, and sends GS output to primitive assembler
 - GS mode 1: sends VS output to GS uniform registers, and sends GS output to primitive assembler. It also takes an index from the index buffer at the beginning of each primitive for determine the primitive size.
 - GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.
hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class for supporting both drawing mode.
In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfigure is triggered in the first attribute input.
In the normal drawing mode with index buffer, the vertex cache is a little bit modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).
Both geometry pipeline and GS unit rely on states preservation across drawing call, so they are put into the global state. In the future, the other three vertex shader units should be also placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
2017-08-19 10:13:20 +03:00
wwylele 8285ca4ad8 pica/shader/jit: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele 36981a5aa6 pica/primitive_assembly: Handle winding for GS primitive
hwtest shows that, although GS always emit a group of three vertices as one primitive, it still respects to the topology type, as if the three vertices are input into the primitive assembler independently and sequentially. It is also shown that the winding flag in SETEMIT only takes effect for Shader topology type, which is believed to be the actual difference between List and Shader (hence removed the TODO). However, only Shader topology type is observed in official games when GS is in use, so the other mode seems to be just unintended usage.
2017-08-19 10:13:20 +03:00
wwylele bb63ae3052 correct constness 2017-08-19 10:13:20 +03:00
wwylele 28128348f2 pica/shader/interpreter: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele 46c6973d2b pica/shader: extend UnitState for GS
Among four shader units in pica, a special unit can be configured to run both VS and GS program. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it standard layout type for JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits
2017-08-19 10:13:20 +03:00
wwylele 686fb3e78c gl_shader_gen: don't call SampleTexture when bump map is not used 2017-08-11 18:35:00 +03:00
wwylele 945f9a1b04 SwRasterizer/Lighting: implement spot light 2017-08-11 01:19:10 +03:00
wwylele 14ee32c46a SwRasterizer/Lighting: implement geometric factor 2017-08-11 01:18:43 +03:00
wwylele 5d9d42f0d0 SwRasterizer/Lighting: use make_tuple instead of constructor
implicit tuple constructor is a c++17 thing, which is not supported by some not-so-old libraries. Play safe for now
2017-08-10 12:19:58 +03:00
wwylele db309b2423 pica/regs: layout geometry shader configuration regs
All the register meanings are derived from ctrulib (3dbrew is outdated for most of them)
2017-08-10 01:53:08 +03:00
Weiyi Wang 792dee47a7 Merge pull request #2822 from wwylele/sw_lighting-2
Implement fragment lighting in the sw renderer (take 2)
2017-08-09 18:54:29 +03:00
wwylele baa24f4ea9 pica: upload shared shader code to both unit 2017-08-07 10:30:05 +03:00
wwylele 2252a63f80 SwRasterizer/Lighting: shorten file name 2017-08-03 13:51:22 +03:00
wwylele eda28266fb SwRasterizer/Lighting: move to its own file 2017-08-02 22:20:40 +03:00
wwylele 48b4105871 SwRasterizer/Lighting: reduce confusion 2017-08-02 22:07:15 +03:00
wwylele c59ed47608 SwRasterizer/Lighting: move quaternion normalization to the caller 2017-08-02 22:05:53 +03:00
wwylele c89f804a01 pica/shader_interpreter: fix off-by-one in LOOP 2017-07-27 13:48:27 +03:00
Sebastian Valle c6a2e519ef Merge pull request #2816 from wwylele/proctex-lutlutlut
gl_rasterizer: use texture buffer for proctex LUT
2017-07-22 23:03:48 -05:00
Sebastian Valle e646bd902d Merge pull request #2834 from wwylele/depth-enable-fix
gl_rasterizer_cache: fix using_depth_fb
2017-07-22 23:02:59 -05:00
bunnei df8b9863f9 telemetry: Log performance, configuration, and system data. 2017-07-17 21:32:28 -04:00
wwylele 4feff63ffa SwRasterizer/Lighting: dist atten lut input need to be clamp 2017-07-11 22:19:00 +03:00
wwylele 56e5425e59 SwRasterizer/Lighting: unify float suffix 2017-07-11 22:15:35 +03:00
wwylele e415558a4f SwRasterizer/Lighting: get rid of nested return 2017-07-11 22:15:35 +03:00
wwylele c6d1472513 SwRasterizer/Lighting: refactor GetLutValue into a function.
merging similar pattern. Also makes the code more similar to the gl one
2017-07-11 22:15:35 +03:00
wwylele f13cf506e0 SwRasterizer: only interpolate quat and view when lighting is enabled 2017-07-11 21:35:57 +03:00
wwylele efc655aec0 SwRasterizer/Lighting: pass lighting state as parameter 2017-07-11 20:06:26 +03:00
Subv 9906feefbd SwRasterizer/Lighting: Move the clamp highlight calculation to the end of the per-light loop body. 2017-07-11 19:39:15 +03:00
Subv 7526af5e52 SwRasterizer/Lighting: Move the lighting enable check outside the ComputeFragmentsColors function. 2017-07-11 19:39:15 +03:00