4e6848d A32/ir_emitter: Bugfix: ExceptionRaised was producing incorrect PC
41ba9fd value: Move ImmediateToU64() to be a part of Value's interface
c6a6271 reg_alloc: Emit AVX instructions where able
aedd32a abi: Emit AVX instructions where able
f2d9337 a64_exclusive_monitor: Loosen memory ordering requirements
7ca709d travis: Make macOS builds use Xcode 10
14dd45e Fix VShift terminology
88554c4 emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS16
ab4e316 emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS64
0ea84f3 emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftS32
c77a2c5 emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftU16()
e9441fd emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU64()
0e9c33c emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU32()
8f85274 emit_x64_vector: SSSE3 variant of EmitVectorCountLeadingZeros8()
be05e75 Merge pull request #397 from VelocityRa/dec-shift-fix
bc328fc decoders: Cast to correctly-sized type before shifting
9c3d2d1 a64_emit_x64: Lowercase PAGE_SIZE
f538d29 emit_x64_vector_floating_point: SSE4.1 implementation of EmitFPVectorToFixed
1603a6e emit_x64_vector_floating_point: EmitFPVectorRoundInt: Use FCODE
2e1ccaf emit_x64_vector: AVX implementation for EmitVectorCountLeadingZeros8
555bfda emit_x64_vector: SSE implementation of EmitVectorCountLeadingZeros16
71c2589 externals: Update Xbyak to 5.73
1ec1b2f Squashed 'externals/xbyak/' changes from 1de435ed..42462ef9
Now that we have all of the rearranging and proper structure sizes in
place, it's fairly trivial to implement svcGetThreadContext(). In the
64-bit case we can more or less just write out the context as is, minus
some minor value sanitizing. In the 32-bit case we'll need to clear out
the registers that wouldn't normally be accessible from a 32-bit
AArch32 exectuable (or process).
This will be necessary for the implementation of svcGetThreadContext(),
as the kernel checks whether or not the process that owns the thread
that has it context being retrieved is a 64-bit or 32-bit process.
If the process is 32-bit, then the upper 15 general-purpose registers
and upper 16 vector registers are cleared to zero (as AArch32 only has
15 GPRs and 16 128-bit vector registers. not 31 general-purpose
registers and 32 128-bit vector registers like AArch64).
Makes the public interface consistent in terms of how accesses are done
on a process object. It also makes it slightly nicer to reason about the
logic of the process class, as we don't want to expose everything to
external code.
Internally within the kernel, it also includes a member variable for the
floating-point status register, and TPIDR, so we should do the same here to match
it.
While we're at it, also fix up the size of the struct and add a static
assertion to ensure it always stays the correct size.
A process should never require being reference counted in this
situation. If the handle to a process is freed before this function is
called, it's definitely a bug with our lifetime management, so we can
put the requirement in place for the API that the process must be a
valid instance.
boost::static_pointer_cast for boost::intrusive_ptr (what SharedPtr is),
takes its parameter by const reference. Given that, it means that this
std::move doesn't actually do anything other than obscure what the
function's actual behavior is, so we can remove this. To clarify, this
would only do something if the parameter was either taking its argument
by value, by non-const ref, or by rvalue-reference.
Add asserts for compute shader dispatching, transform feedback being
enabled and alpha testing. These have in common that they'll probably break
rendering without logging.
The std::vector instances are already initially allocated with all
entries having these values, there's no need to loop through and fill
them with it again when they aren't modified.
auto x = 0;
auto-deduces x to be an int. This is undesirable when working with
unsigned values. It also causes sign conversion warnings. Instead, we
can make it a proper unsigned value with the correct width that the
following expressions operate on.
Ternary operators have a lower precedence than arithmetic operators, so
what was actually occurring here is "return (out + full) ? x : y" which most
definitely isn't intended, given we calculate out recursively above. We
were essentially doing a lot of work for nothing.