Within the kernel, shared memory and transfer memory facilities exist as
completely different kernel objects. They also have different validity
checking as well. Therefore, we shouldn't be treating the two as the
same kind of memory.
They also differ in terms of their behavioral aspect as well. Shared
memory is intended for sharing memory between processes, while transfer
memory is intended to be for transferring memory to other processes.
This breaks out the handling for transfer memory into its own class and
treats it as its own kernel object. This is also important when we
consider resource limits as well. Particularly because transfer memory
is limited by the resource limit value set for it.
While we currently don't handle resource limit testing against objects
yet (but we do allow setting them), this will make implementing that
behavior much easier in the future, as we don't need to distinguish
between shared memory and transfer memory allocations in the same place.
The previous code had some minor issues with it, really not a big deal,
but amending it is basically 'free', so I figured, "why not?".
With the standard container maps, when:
map[key] = thing;
is done, this can cause potentially undesirable behavior in certain
scenarios. In particular, if there's no value associated with the key,
then the map constructs a default initialized instance of the value
type.
In this case, since it's a std::shared_ptr (as a type alias) that is
the value type, this will construct a std::shared_pointer, and then
assign over it (with objects that are quite large, or actively heap
allocate this can be extremely undesirable).
We also make the function take the region by value, as we can avoid a
copy (and by extension with std::shared_ptr, a copy causes an atomic
reference count increment), in certain scenarios when ownership isn't a
concern (i.e. when ReserveGlobalRegion is called with an rvalue
reference, then no copy at all occurs). So, it's more-or-less a "free"
gain without many downsides.
With this, all kernel objects finally have all of their data members
behind an interface, making it nicer to reason about interactions with
other code (as external code no longer has the freedom to totally alter
internals and potentially messing up invariants).
After doing a little more reading up on the Opus codec, it turns out
that the multistream API that is part of libopus can handle regular
packets. Regular packets are just a degenerate case of multistream Opus
packets, and all that's necessary is to pass the number of streams as 1
and provide a basic channel mapping, then everything works fine for
that case.
This allows us to get rid of the need to use both APIs in the future
when implementing multistream variants in a follow-up PR, greatly
simplifying the code that needs to be written.
Previously this was required, as BitField wasn't trivially copyable.
BitField has since been made trivially copyable, so now this isn't
required anymore.
Relocates the error code to where it's most related, similar to how all
the other error codes are. Previously we were including a non-generic
error in the main result code header.
These can just be passed regularly, now that we use fmt instead of our
old logging system.
While we're at it, make the parameters to MakeFunctionString
std::string_views.
Instead of holding a reference that will get invalidated by
dma_pushbuffer.pop(), hold it as a copy. This doesn't have any
performance cost since CommandListHeader is 8 bytes long.
There's no real need to use a shared lifetime here, since we don't
actually expose them to anything else. This is also kind of an
unnecessary use of the heap given the objects themselves are so small;
small enough, in fact that changing over to optionals actually reduces
the overall size of the HLERequestContext struct (818 bytes to 808
bytes).
Now that we have the address arbiter extracted to its own class, we can
fix an innaccuracy with the kernel. Said inaccuracy being that there
isn't only one address arbiter. Each process instance contains its own
AddressArbiter instance in the actual kernel.
This fixes that and gets rid of another long-standing issue that could
arise when attempting to create more than one process.
Similar to how WaitForAddress was isolated to its own function, we can
also move the necessary conditional checking into the address arbiter
class itself, allowing us to hide the implementation details of it from
public use.
Rather than let the service call itself work out which function is the
proper one to call, we can make that a behavior of the arbiter itself,
so we don't need to directly expose those implementation details.
This makes the class much more flexible and doesn't make performing
copies with classes that contain a bitfield member a pain.
Given BitField instances are only intended to be used within unions, the
fact the full storage value would be copied isn't a big concern (only
sizeof(union_type) would be copied anyways).
While we're at it, provide defaulted move constructors for consistency.
Because of the recent separation of GPU functionality into sync/async
variants, we need to mark the destructor virtual to provide proper
destruction behavior, given we use the base class within the System
class.
Prior to this, it was undefined behavior whether or not the destructor
in the derived classes would ever execute.
This will be utilized by more than just that class in the future. This
also renames it from OpusHeader to OpusPacketHeader to be more specific
about what kind of header it is.
We already have the thread instance that was created under the current
process, so we can just pass the handle table of it along to retrieve
the owner of the mutex.
Removes a few unnecessary dependencies on core-related machinery, such
as the core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.
This also uncovered some indirect dependencies within other source
files. This also fixes those.
Places all error codes in an easily includable header.
This also corrects the unsupported error code (I accidentally used the
hex value when I meant to use the decimal one).
Places all of the functions for address arbiter operation into a class.
This will be necessary for future deglobalizing efforts related to both
the memory and system itself.
Removes a few inclusion dependencies from the headers or replaces
existing ones with ones that don't indirectly include the required
headers.
This allows removing an inclusion of core/memory.h, meaning that if the
memory header is ever changed in the future, it won't result in
rebuilding the entirety of the HLE services (as the IPC headers are used
quite ubiquitously throughout the HLE service implementations).
Avoids directly relying on the global system instance and instead makes
an arbitrary system instance an explicit dependency on construction.
This also allows removing dependencies on some global accessor functions
as well.
Given we already pass in a reference to the kernel that the shared
memory instance is created under, we can just use that to check the
current process, rather than using the global accessor functions.
This allows removing direct dependency on the system instance entirely.
In these cases the system object is nearby, and in the other, the
long-form of accessing the telemetry instance is already used, so we can
get rid of the use of the global accessor.
We already pass a reference to the system object to the constructor of the renderer,
so we can just use that instead of using the global accessor functions.
Reduces the potential amount of rebuilding necessary if any headers
change. In particular, we were including a header from the core library
when we don't even link the core library to the web_service library, so
this also gets rid of an indirect dependency.
Moves local global state into the Impl class itself and initializes it
at the creation of the instance instead of in the function.
This makes it nicer for weakly-ordered architectures, given the
CreateEntry() class won't need to have atomic loads executed for each
individual call to the CreateEntry class.
Any SDL invocation can call the even callback on the same thread, which can call GetSDLJoystickBySDLID and eventually cause double lock on joystick_map_mutex. To avoid this, lock guard should be placed as closer as possible to the object accessing code, so that any SDL invocation is with the mutex unlocked
Changes the interface as well to remove any unique methods that
frontends needed to call such as StartJoystickEventHandler by
conditionally starting the polling thread only if the frontend hasn't
started it already. Additionally, moves all global state into a single
SDLState class in order to guarantee that the destructors are called in
the proper order
MSVC does not seem to like using constexpr values in a lambda that were declared outside of it.
Previously on MSVC build the hotkeys to inc-/decrease the speed limit were not working correctly because in the lambda the SPEED_LIMIT_STEP had garbage values.
After googling around a bit I found: https://github.com/codeplaysoftware/computecpp-sdk/issues/95 which seems to be a similar issue.
Trying the suggested fix to make the variable static constexpr also fixes the bug here.
The comment already invalidates itself: neither MMIO nor rasterizer cache belongsHLE kernel state. This mutex has a too large scope if MMIO or cache is included, which is prone to dead lock when multiple thread acquires these resource at the same time. If necessary, each MMIO component or rasterizer should have their own lock.
This currently has the same behavior as the regular
OpenAudioRenderer API function, so we can just move the code within
OpenAudioRenderer to an internal function that both service functions
call.