This gives us significantly more control over where in the
initialization process we start execution of the main process.
Previously we were running the main process before the CPU or GPU
threads were initialized (not good). This amends execution to start
after all of our threads are properly set up.
Now that we have dependencies on the initialization order, we can move
the creation of the main process to a more sensible area: where we
actually load in the executable data.
This allows localizing the creation and loading of the process in one
location, making the initialization of the process much nicer to trace.
Like with CPU emulation, we generally don't want to fire off the threads
immediately after the relevant classes are initialized, we want to do
this after all necessary data is done loading first.
This splits the thread creation into its own interface member function
to allow controlling when these threads in particular get created.
Our initialization process is a little wonky than one would expect when
it comes to code flow. We initialize the CPU last, as opposed to
hardware, where the CPU obviously needs to be first, otherwise nothing
else would work, and we have code that adds checks to get around this.
For example, in the page table setting code, we check to see if the
system is turned on before we even notify the CPU instances of a page
table switch. This results in dead code (at the moment), because the
only time a page table switch will occur is when the system is *not*
running, preventing the emulated CPU instances from being notified of a
page table switch in a convenient manner (technically the code path
could be taken, but we don't emulate the process creation svc handlers
yet).
This moves the threads creation into its own member function of the core
manager and restores a little order (and predictability) to our
initialization process.
Previously, in the multi-threaded cases, we'd kick off several threads
before even the main kernel process was created and ready to execute (gross!).
Now the initialization process is like so:
Initialization:
1. Timers
2. CPU
3. Kernel
4. Filesystem stuff (kind of gross, but can be amended trivially)
5. Applet stuff (ditto in terms of being kind of gross)
6. Main process (will be moved into the loading step in a following
change)
7. Telemetry (this should be initialized last in the future).
8. Services (4 and 5 should ideally be alongside this).
9. GDB (gross. Uses namespace scope state. Needs to be refactored into a
class or booted altogether).
10. Renderer
11. GPU (will also have its threads created in a separate step in a
following change).
Which... isn't *ideal* per-se, however getting rid of the wonky
intertwining of CPU state initialization out of this mix gets rid of
most of the footguns when it comes to our initialization process.
Now that we have the address arbiter extracted to its own class, we can
fix an innaccuracy with the kernel. Said inaccuracy being that there
isn't only one address arbiter. Each process instance contains its own
AddressArbiter instance in the actual kernel.
This fixes that and gets rid of another long-standing issue that could
arise when attempting to create more than one process.
Gets rid of the largest set of mutable global state within the core.
This also paves a way for eliminating usages of GetInstance() on the
System class as a follow-up.
Note that no behavioral changes have been made, and this simply extracts
the functionality into a class. This also has the benefit of making
dependencies on the core timing functionality explicit within the
relevant interfaces.
Places all of the timing-related functionality under the existing Core
namespace to keep things consistent, rather than having the timing
utilities sitting in its own completely separate namespace.
This is a function that definitely doesn't always have a non-modifying
behavior across all implementations, so this should be made non-const.
This gets rid of the need to mark data members as mutable to work around
the fact mutating data members needs to occur.
Keeps the CPU-specific behavior from being spread throughout the main
System class. This will also act as the home to contain member functions
that perform operations on all cores. The reason for this being that the
following pattern is sort of prevalent throughout sections of the
codebase:
If clearing the instruction cache for all 4 cores is necessary:
Core::System::GetInstance().ArmInterface(0).ClearInstructionCache();
Core::System::GetInstance().ArmInterface(1).ClearInstructionCache();
Core::System::GetInstance().ArmInterface(2).ClearInstructionCache();
Core::System::GetInstance().ArmInterface(3).ClearInstructionCache();
This is kind of... well, silly to copy around whenever it's needed.
especially when it can be reduced down to a single line.
This change also puts the basics in place to begin "ungrafting" all of the
forwarding member functions from the System class that are used to
access CPU state or invoke CPU-specific behavior. As such, this change
itself makes no changes to the direct external interface of System. This
will be covered by another changeset.
* get rid of boost::optional
* Remove optional references
* Use std::reference_wrapper for optional references
* Fix clang format
* Fix clang format part 2
* Adressed feedback
* Fix clang format and MacOS build
Many of the Current<Thing> getters (as well as a few others) were
missing const qualified variants, which makes it a pain to retrieve
certain things from const qualified references to System.
There's no need for shared ownership here, as the only owning class
instance of those Cpu instances is the System class itself. We can also
make the thread_to_cpu map use regular pointers instead of shared_ptrs,
given that the Cpu instances will always outlive the cases where they're
used with that map.
Like the barrier, this is owned entirely by the System and will always
outlive the encompassing state, so shared ownership semantics aren't
necessary here.
This will always outlive the Cpu instances, since it's destroyed after
we destroy the Cpu instances on shutdown, so there's no need for shared
ownership semantics here.
Neither of these functions alter the ownership of the provided pointer,
so we can simply make the parameters a reference rather than a direct
shared pointer alias. This way we also disallow passing incorrect memory values like
nullptr.
There's no real need to use a shared pointer in these cases, and only
makes object management more fragile in terms of how easy it would be to
introduce cycles. Instead, just do the simple thing of using a regular
pointer. Much of this is just a hold-over from citra anyways.
It also doesn't make sense from a behavioral point of view for a
process' thread to prolong the lifetime of the process itself (the
process is supposed to own the thread, not the other way around).
A process should never require being reference counted in this
situation. If the handle to a process is freed before this function is
called, it's definitely a bug with our lifetime management, so we can
put the requirement in place for the API that the process must be a
valid instance.
Given these are only added to the class to allow those functions to
access the private constructor, it's a better approach to just make them
static functions in the interface, to make the dependency explicit.
Given we now have the kernel as a class, it doesn't make sense to keep
the current process pointer within the System class, as processes are
related to the kernel.
This also gets rid of a subtle case where memory wouldn't be freed on
core shutdown, as the current_process pointer would never be reset,
causing the pointed to contents to continue to live.