Commit graph

21 commits

Author SHA1 Message Date
Fabian 37c3d1f83c Generate direct control flow, using wasm blocks and loops 2020-12-31 19:14:33 -06:00
Fabian 9853bdb868 Merge 16-bit and 32-bit 0f tables (saves 200kB on v86.wasm) 2020-12-31 19:14:31 -06:00
Fabian 11dfeb1ee7 Fix minor descrepancy between analyser and jit 2020-12-31 19:14:31 -06:00
Fabian a26eb43719 Fix: Inhibit interrupts for one instruction after STI (fixes ReactOS) 2020-12-31 19:14:30 -06:00
Fabian acb8ad5423 Avoid console.assert (doesn't throw) 2020-12-31 19:14:30 -06:00
Fabian a73988a817 Make loop, loopz, loopnz and jcxz custom generated 2020-12-31 19:14:30 -06:00
Fabian c207400922 Fix Rust warnings 2020-12-31 19:14:29 -06:00
Fabian 4bb3c14e57 Comment 2020-12-31 19:14:28 -06:00
Fabian 0263764a5c Remove unused unguarded_register property during analysis 2020-12-31 19:14:28 -06:00
Fabian a8308b988d Store registers in locals
This changes registers to be temporarily stored in wasm locals, across
each complete wasm module. Registers are moved from memory to locals
upon entering the wasm module and moved from locals to memory upon
leaving. Additionally, calls to functions that modify registers are
wrapped between moving registers to memory before and moving back to
locals after. This affects:

1. All non-custom instructions
2. safe_{read,write}_slow, since it may page fault (the slow path of all memory accesses)
3. task_switch_test* and trigger_ud
4. All block boundaries
5. The fallback functions of gen_safe_read_write (read-modify-write memory accesses)

The performance benefits are currently mostly eaten up by 1. and 4. (if
one calculates the total number of read/writes to registers in memory,
they are higher after this patch, as each instructions of typ 1. or 4.
requires moving all 8 register twice). This can be improved later by the
relatively mechanical work of making instructions custom (not
necessarily full code generation, only the part of the instruction where
registers are accessed). Multi-page wasm module generation will
significantly reduce the number of type 4. instructions.

Due to 2., the overall code size has significantly increased. This case
(the slow path of memory access) is often generated but rarely executed.
These moves can be removed in a later patch by a different scheme for
safe_{read,write}_slow, which has been left out of this patch for
simplicity of reviewing.

This also simplifies our code generation for storing registers, as

    instructions_body.const_i32(register_offset);
    // some computations ...
    instruction_body.store_i32();

turns into:

    // some computations ...
    write_register(register_index);

I.e., a prefix is not necessary anymore as locals are indexed directly.

Further patches will allow getting rid of some temporary locals, as
registers now can be used directly.
2020-08-30 19:37:15 -05:00
Fabian 440b67eda5 Support for gen_safe_write128 and code generation for MOVAPS/MOVDQA (0F29/660F7F) 2020-08-30 19:37:15 -05:00
Fabian ec059a9f27 Codegen for fpu instructions (D8 group) 2020-08-30 19:37:15 -05:00
Fabian a5cbf53da5 Fix jit in presence of new page fault handling
Makes the following a block boundary:

- push
- Any non-custom instruction that uses modrm encoding
- Any sse/fpu instruction

This commit affects performance negatively. In order to fix this, the
above instructions need to be implemented using custom code generators
for the memory access.
2020-08-30 19:29:53 -05:00
Fabian 1253b72906 Generate prefix handling for string instructions 2020-08-30 19:29:13 -05:00
Fabian 3a8d644d75 Port jit to Rust
The following files and functions were ported:
- jit.c
- codegen.c
- _jit functions in instructions*.c and misc_instr.c
- generate_{analyzer,jit}.js (produces Rust code)
- jit_* from cpu.c

And the following data structures:
- hot_code_addresses
- wasm_table_index_free_list
- entry_points
- jit_cache_array
- page_first_jit_cache_entry

Other miscellaneous changes:
- Page is an abstract type
- Addresses, locals and bitflags are unsigned
- Make the number of entry points a growable type
- Avoid use of global state wherever possible
- Delete string packing
- Make CachedStateFlags abstract
- Make AnalysisType product type
- Make BasicBlockType product type
- Restore opcode assertion
- Set opt-level=2 in debug mode (for test performance)
- Delete JIT_ALWAYS instrumentation (now possible via api)
- Refactor generate_analyzer.js
- Refactor generate_jit.js
2020-08-30 19:29:13 -05:00
Fabian 2609041e64 Don't fail when invalid instructions are analyzed/jitted 2020-08-30 19:27:07 -05:00
Fabian cba5491fc4 Multiple jit block entry points
- introduce multiple entry points per compiled wasm module, by passing
  the initial state to the generated function.
- continue analysing and compiling after instructions that change eip, but
  will eventually return to the next instruction, in particular CALLs
  (and generate an entry point for the following instruction)

This commit is incomplete in the sense that the container will crash
after some time of execution, as wasm table indices are never freed
2020-08-30 19:27:02 -05:00
Fabian ded423b1c5 x86 table: Add remaining 0f instructions, simplify gen scripts 2020-07-21 20:10:14 -05:00
Fabian a9b5f153a8 Move around and add some assertions 2020-07-21 20:10:14 -05:00
Fabian 65bf2e350d Don't generate duplicate flags checks
Also simplify the generated analyzer code by just finding the jump
offset, not the computed jump target
2020-07-21 20:10:14 -05:00
Fabian f8349af093 New block analysis, generation of state machine with multiple basic blocks
This commit consists of three components:

1. A new generated x86-parser that analyses instructions. For now, it
   only detects the control flow of an instruction: Whether it is a
   (conditional) jump, a normal instruction or a basic block boundary
2. A new function, jit_find_basic_blocks, that finds and connects basic
   blocks using 1. It loosely finds all basic blocks making up a function,
   i.e. it doesn't follow call or return instructions (but it does follow
   all near jumps). Different from our previous analysis, it also finds
   basic blocks in the strict sense that no basic block contains a jump
   into the middle of another basic block
3. A new code-generating function, jit_generate, that takes the output
   of 2 as input. It generates a state machine:
   - Each basic block becomes a case block in a switch-table
   - Each basic block ends with setting a state variable for the following basic block
   - The switch-table is inside a while(true) loop, which is terminated
     by return statements in basic blocks which are leaves

Additionally:
- Block linking has been removed as it is (mostly) obsoleted by these
  changes. It may later be reactived for call instructions
- The code generator API has been extended to generate the code for the state machine
- The iterations of the state machine are limited in order to avoid
  infinite loops that can't be interrupted
2020-07-21 20:10:14 -05:00