Among four shader units in pica, a special unit can be configured to run both VS and GS program. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it standard layout type for JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits
One of the later commits will enable writing to GS regs.
It turns out that on startup, most games will write 4096 GS program words.
The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages:
```
HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024
```
New constants have been introduced to represent these limits.
The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX) (12 bit = [0; 4095]).
Corrects a few issues with regards to Doxygen documentation, for example:
- Incorrect parameter referencing.
- Missing @param tags.
- Typos in @param tags.
and a few minor other issues.
The resulting pointer wasn't written to unless the index was verified as
valid, but that's still UB and triggered debug checks in MSVC.
Reported by garrettboast on IRC
This also fixes a long-standing but neverthless harmless memory
corruption bug, whech the padding of the OutputVertex struct would get
corrupted by unused attributes.
This doesn't belong in LoadInputVertex because it also happens for
non-VS invocations. Since it's not used by the JIT it seems adequate to
initialize it in the interpreter which is the only thing that cares
about them.