Uses arithmetic that can be identified more trivially by compilers for
optimizations. e.g. Rather than shifting the halves of the value and
then swapping and combining them, we can swap them in place.
e.g. for the original swap32 code on x86-64, clang 8.0 would generate:
mov ecx, edi
rol cx, 8
shl ecx, 16
shr edi, 16
rol di, 8
movzx eax, di
or eax, ecx
ret
while GCC 8.3 would generate the ideal:
mov eax, edi
bswap eax
ret
now both generate the same optimal output.
MSVC used to generate the following with the old code:
mov eax, ecx
rol cx, 8
shr eax, 16
rol ax, 8
movzx ecx, cx
movzx eax, ax
shl ecx, 16
or eax, ecx
ret 0
Now MSVC also generates a similar, but equally optimal result as clang/GCC:
bswap ecx
mov eax, ecx
ret 0
====
In the swap64 case, for the original code, clang 8.0 would generate:
mov eax, edi
bswap eax
shl rax, 32
shr rdi, 32
bswap edi
or rax, rdi
ret
(almost there, but still missing the mark)
while, again, GCC 8.3 would generate the more ideal:
mov rax, rdi
bswap rax
ret
now clang also generates the optimal sequence for this fallback as well.
This is a case where MSVC unfortunately falls short, despite the new
code, this one still generates a doozy of an output.
mov r8, rcx
mov r9, rcx
mov rax, 71776119061217280
mov rdx, r8
and r9, rax
and edx, 65280
mov rax, rcx
shr rax, 16
or r9, rax
mov rax, rcx
shr r9, 16
mov rcx, 280375465082880
and rax, rcx
mov rcx, 1095216660480
or r9, rax
mov rax, r8
and rax, rcx
shr r9, 16
or r9, rax
mov rcx, r8
mov rax, r8
shr r9, 8
shl rax, 16
and ecx, 16711680
or rdx, rax
mov eax, -16777216
and rax, r8
shl rdx, 16
or rdx, rcx
shl rdx, 16
or rax, rdx
shl rax, 8
or rax, r9
ret 0
which is pretty unfortunate.
Allows the compiler to inform when the result of a swap function is
being ignored (which is 100% a bug in all usage scenarios). We also mark
them noexcept to allow other functions using them to be able to be
marked as noexcept and play nicely with things that potentially inspect
"nothrowability".
Including every OS' own built-in byte swapping functions is kind of
undesirable, since it adds yet another build path to ensure compilation
succeeds on.
Given we only support clang, GCC, and MSVC for the time being, we can
utilize their built-in functions directly instead of going through the
OS's API functions.
This shrinks the overall code down to just
if (msvc)
use msvc's functions
else if (clang or gcc)
use clang/gcc's builtins
else
use the slow path
The template type here is actually a forwarding reference, not an rvalue
reference in this case, so it's more appropriate to use std::forward to
preserve the value category of the type being moved.
Makes the return type consistently uniform (like the intrinsics we're
wrapping). This also conveniently silences a truncation warning within
the kernel multi_level_queue.
Since C++17, the introduction of deduction guides for locking facilities
means that we no longer need to hardcode the mutex type into the locks
themselves, making it easier to switch mutex types, should it ever be
necessary in the future.
Many of these functions are carried over from Dolphin (where they aren't
used anymore). Given these have no use (and we really shouldn't be
screwing around with OS-specific thread scheduler handling from the
emulator, these can be removed.
The function for setting the thread name is left, however, since it can
have debugging utility usages.
When #2247 was created, thread_queue_list.h was the only user of
boost-related code, however #2252 moved the page table struct into
common, which makes use of Boost.ICL, so we need to add the dependency
to the common library's link interface again.
We really don't need to pull in several headers of boost related
machinery just to perform the erase-remove idiom (particularly with
C++20 around the corner, which adds universal container std::erase and
std::erase_if, which we can just use instead).
With this, we don't need to link in anything boost-related into common.
This makes the class much more flexible and doesn't make performing
copies with classes that contain a bitfield member a pain.
Given BitField instances are only intended to be used within unions, the
fact the full storage value would be copied isn't a big concern (only
sizeof(union_type) would be copied anyways).
While we're at it, provide defaulted move constructors for consistency.
Moves local global state into the Impl class itself and initializes it
at the creation of the instance instead of in the function.
This makes it nicer for weakly-ordered architectures, given the
CreateEntry() class won't need to have atomic loads executed for each
individual call to the CreateEntry class.
Makes it consistent with the regular standard containers in terms of
size representation. This also gets rid of dependence on our own
type aliases, removing the need for an include.
The necessity of this parameter is dubious at best, and in 2019 probably
offers completely negligible savings as opposed to just leaving this
enabled. This removes it and simplifies the overall interface.
This is compromise for swap type being used in union. A union has deleted default constructor if it has at least one variant member with non-trivial default constructor, and no variant member of T has a default member initializer. In the use case of Bitfield, all variant members will be the swap type on endianness mismatch, which would all have non-trivial default constructor if default value is specified, and non of them can have member initializer
Original reason:
As Windows multi-byte character codec is unspecified while we always assume std::string uses UTF-8 in our code base, this can output gibberish when the string contains non-ASCII characters. ::OutputDebugStringW combined with Common::UTF8ToUTF16W is preferred here.
While admirable as a means to ensure immutability, this has the
unfortunate downside of making the class non-movable. std::move cannot
actually perform a move operation if the provided operand has const data
members (std::move acts as an operation to "slide" resources out of an
object instance). Given Barrier contains move-only types such as
std::mutex, this can lead to confusing error messages if an object ever
contained a Barrier instance and said object was attempted to be moved.
This is also unused and superceded by standard functionality. The
standard library provides std::this_thread::sleep_for(), which provides
a much more flexible interface, as different time units can be used with
it.
This is an old function that's no longer necessary. C++11 introduced
proper threading support to the language and a thread ID can be
retrieved via std::this_thread::get_id() if it's ever needed.
This is an analog of BitSet from Dolphin that was introduced to allow
iterating over a set of bits. Given it's currently unused, and given
that std::bitset exists, we can remove this. If it's ever needed in the
future it can be brought back.
Xbyak is currently entirely unused. Rather than carting it along, remove
it and get rid of a dependency. If it's ever needed in the future, then
it can be re-added (and likely be more up to date at that point in
time).
Currently, there's no way to specify if an assertion should
conditionally occur due to unimplemented behavior. This is useful when
something is only partially implemented (e.g. due to ongoing RE work).
In particular, this would be useful within the graphics code.
The rationale behind this is it allows a dev to disable unimplemented
feature assertions (which can occur in an unrelated work area), while
still enabling regular assertions, which act as behavior guards for
conditions or states which must not occur. Previously, the only way a
dev could temporarily disable asserts, was to disable the regular
assertion macros, which has the downside of also disabling, well, the
regular assertions which hold more sanitizing value, as opposed to
unimplemented feature assertions.
Currently, this was only performing a logging call, which doesn't
actually invoke any assertion behavior. This is unlike
UNIMPLEMENTED_MSG, which *does* assert.
This makes the expected behavior uniform across both macros.
Storing signed type causes the following behaviour: extractValue can do overflow/negative left shift. Now it only relies on two implementation-defined behaviours (which are almost always defined as we want): unsigned->signed conversion and signed right shift
An old function from Dolphin. This is also unused, and pretty inflexible
when it comes to printing out different data types (for example, one
might not want to print out an array of u8s but a different type
instead. Given we use fmt, there's no need to keep this implementation
of the function around.
This is an unused hold-over from Dolphin that was primarily used to
parse values out of the .ini files. Given we already have libraries that
do this for us, we don't need to keep this around.
Everything from here is completely unused and also written with the
notion of supporting 32-bit architecture variants in mind. Given the
Switch itself is on a 64-bit architecture, we won't be supporting 32-bit
architectures. If we need specific allocation functions in the future,
it's likely more worthwhile to new functions for that purpose.
Like with TelemetryJson, we can make the implementation details private
and avoid the need to expose httplib to external libraries that need to
use the Client class.
* Added a context menu on the buttons including Clear & Restore Default
* Allow clearing (unsetting) inputs. Added a Clear All button
* Allow restoring a single input to default (instead of all)
operator+ for std::string creates an entirely new string, which is kind
of unnecessary here if we just want to append a null terminator to the
existing one.
Reduces the total amount of potential allocations that need to be done
in the logging path.
First of all they are foundamentally broken. As our convention is that std::string is always UTF-8, these functions assume that the multi-byte character version of TString (std::string) from windows is also in UTF-8, which is almost always wrong. We are not going to build multi-byte character build, and even if we do, this dirty work should be handled by frontend framework early.
We always use unicode internally. Any dirty work of conversion with other codec should be handled by frontend framework (Qt). Further more, ShiftJIS/CP1252 are not special (they are not code set used by 3ds, or any guest/host dependencies we have), so there is no reason to specifically include them
* Stubbed IRS
Currently we have no ideal way of implementing IRS. For the time being we should have the functions stubbed until we come up with a way to emulate IRS properly.
* Added IRS to logging backend
* Forward declared shared memory for irs
MSVC 19.11 (A.K.A. VS 15.3)'s C++ standard library implements P0154R1
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0154r1.html)
which defines two new constants within the <new> header, std::hardware_destructive_interference_size
and std::hardware_constructive_interference_size.
std::hardware_destructive_interference_size defines the minimum
recommended offset between two concurrently-accessed objects to avoid
performance degradation due to contention introduced by the
implementation (with the lower-bound being at least alignof(max_align_t)).
In other words, the minimum offset between objects necessary to avoid
false-sharing.
std::hardware_constructive_interference_size on the other hand defines
the maximum recommended size of contiguous memory occupied by two
objects accessed wth temporal locality by concurrent threads (also
defined to be at least alignof(max_align_t)). In other words the maximum
size to promote true-sharing.
So we can simply use this facility to determine the ideal alignment
size. Unfortunately, only MSVC supports this right now, so we need to
enclose it within an ifdef for the time being.
Multi-line doc comments still need the '<' after the ///, otherwise it's
treated as a regular comment and makes the original doc comment broken
in viewers, IDEs, etc. While we're at it, also fix some typos in the
comments.
The previous form of initializing done here is a C-ism, an empty set of
braces is sufficient for initializing (and doesn't potentially cause
missing brace warnings, given the first member of the struct is a COORD
struct).
Previously core itself was the library containing the code to gather
common information (build info, CPU info, and OS info), however all of
this isn't core-dependent and can be moved to the common code and use
the common interfaces. We can then just call those functions from the
core instead.
This will allow replacing our CPU detection with Xbyak's which has
better detection facilities than ours. It also keeps more
architecture-dependent code in common instead of core.
These currently aren't used and contain commented out source code that
corresponds to Dolphin's JIT. Given our CPU code is organized quite
differently, we shouldn't be keeping this around (at the moment it just
adds to compile times marginally).
The filter is returned via const reference, so this was making a
pointless copy of the entire filter every time a message was being
pushed into the logger instance.
Note: according to cppreference it is necessary to convert char to unsigned char when using std::tolower and std::toupper, otherwise the behaviour would be undefined.
There's no need to perform the resize separately here, since the
constructor allows presizing the buffer.
Also move the empty string check before the construction of the string
to make the early out more straightforward.
This is equivalent to doing:
push_back(std::string(""));
which is likely not to cause issues, assuming a decent std::string
implementation with small-string optimizations implemented in its
design, however it's still a little unnecessary to copy that buffer
regardless. Instead, we can use emplace_back() to directly construct the
empty string within the std::vector instance, eliminating any possible
overhead from the copy.
We can just use the variant of std::string's replace() function that can
replace an occurrence with N copies of the same character, eliminating
the need to allocate a std::string containing a buffer of spaces.
We can just leverage std::unique_ptr to automatically close these for us
in error cases instead of jumping to the end of the function to call
fclose on them.
Instead of using an unsigned int as a parameter and expecting a user to
always pass in the correct values, we can just convert the enum into an
enum class and use that type as the parameter type instead, which makes
the interface more type safe.
We also get rid of the bookkeeping "NUM_" element in the enum by just
using an unordered map. This function is generally low-frequency in
terms of calls (and I'd hope so, considering otherwise would mean we're
slamming the disk with IO all the time) so I'd consider this acceptable
in this case.
This avoids a redundant std::string construction if a key doesn't exist
in the map already.
e.g.
data[key] requires constructing a new default instance of the value in
the map (but this is wasteful, since we're already setting something
into the map over top of it).
Allows avoiding constructing std::string instances, since this only
reads an arbitrary sequence of characters.
We can also make ParseFilterRule() internal, since it doesn't depend on
any private instance state of Filter
These can just use a view to a string since its only comparing against
two names in both cases for matches. This avoids constructing
std::string instances where they aren't necessary.
These are unused and essentially don't provide much benefit either. If
we ever need rotation functions, these can be introduced in a way that
they don't sit in a common_* header and require a bunch of ifdefing to
simply be available
Android and macOS have supported thread_local for quite a while, but
most importantly is that we don't even really need it. Instead of using
a thread-local buffer, we can just return a non-static buffer as a
std::string, avoiding the need for that quality entirely.
These operators don't modify internal class state, so they can be made
const member functions. While we're at it, drop the unnecessary inline
keywords. Member functions that are defined in the class declaration are
already inline by default.
This provides the equivalent behavior, but without as much boilerplate.
While we're at it, explicitly default the move constructor, since we
have a move-assignment operator defined.