This commit contains the final changes required for porting all C code
to Rust and from emscripten to llvm:
- tools/wasm-patch-indirect-function-table.js: A script that rewrites
the wasm generated by llvm to remove the table limit
- tools/rust-lld-wrapper: A wrapper around rust-lld that removes
arguments forced by rustc that break compilation for us
- src/rust/cpu2/Makefile: A monstrosity to postprocess c2rust's output
- gen/generate_interpreter.js: Ported to produce Rust instead of C
- src/rust/*: A few functions and macros to connect the old Rust code
and the new Rust code
- src/*.js: Removes the loading of the old emscripten wasm module and
adapts imports and exports from emscripten to llvm
If the mountpoint already exists, we silently make its children
inaccessible, which may not be what was expected or intended.
Create a new forwarder inode upon mounting.
The testing "framework" code is slowly turning into spaghetti due to the
asynchronous nature of the triggers. Using async functions would help
clarify the program flow if we decide to address this issue.
The following files and functions were ported:
- jit.c
- codegen.c
- _jit functions in instructions*.c and misc_instr.c
- generate_{analyzer,jit}.js (produces Rust code)
- jit_* from cpu.c
And the following data structures:
- hot_code_addresses
- wasm_table_index_free_list
- entry_points
- jit_cache_array
- page_first_jit_cache_entry
Other miscellaneous changes:
- Page is an abstract type
- Addresses, locals and bitflags are unsigned
- Make the set of entry points growable
- Avoid use of global state wherever possible
- Delete string packing
- Make CachedStateFlags abstract
- Make AnalysisType a product type
- Make BasicBlockType a product type
- Restore opcode assertion
- Set opt-level=2 in debug mode (for test performance)
- Delete JIT_ALWAYS instrumentation (now possible via api)
- Refactor generate_analyzer.js
- Refactor generate_jit.js
- Use this data structure to delete cached code immediately when page is
written, not later when wasm index is reused
- Remove "dirty page" data structure
- Simplify cycle_internal: no entries can be found dirty, since they are
  removed immediately after being overwritten
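The immediate-invalidation scheme above can be sketched roughly as follows. This is an illustrative Rust sketch, not the actual implementation: the `Page` and `WasmTableIndex` types and the `HashMap`-based page index are stand-ins for the real data structures.

```rust
use std::collections::HashMap;

// Stand-in types; the real code uses an abstract Page type and a
// wasm table index free list.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct Page(u32);
type WasmTableIndex = u16;

#[derive(Default)]
struct JitState {
    // For each page, the wasm table indices of code generated from it
    entries_by_page: HashMap<Page, Vec<WasmTableIndex>>,
    wasm_table_index_free_list: Vec<WasmTableIndex>,
}

impl JitState {
    fn register(&mut self, page: Page, index: WasmTableIndex) {
        self.entries_by_page.entry(page).or_default().push(index);
    }

    // Called from the page-write handler: drop all cached code for the
    // page right away instead of marking it dirty for later cleanup.
    fn on_page_written(&mut self, page: Page) {
        if let Some(indices) = self.entries_by_page.remove(&page) {
            for index in indices {
                self.wasm_table_index_free_list.push(index);
            }
        }
    }
}

fn main() {
    let mut jit = JitState::default();
    jit.register(Page(7), 3);
    jit.on_page_written(Page(7));
    assert_eq!(jit.wasm_table_index_free_list, vec![3]);
    assert!(jit.entries_by_page.get(&Page(7)).is_none());
    println!("ok");
}
```

Because entries are freed at write time, the main loop never encounters a dirty entry, which is what allows the cycle_internal simplification.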
This commit consists of three components:
1. A new generated x86-parser that analyses instructions. For now, it
only detects the control flow of an instruction: Whether it is a
(conditional) jump, a normal instruction or a basic block boundary
2. A new function, jit_find_basic_blocks, that finds and connects basic
blocks using 1. It loosely finds all basic blocks making up a function,
i.e. it doesn't follow call or return instructions (but it does follow
all near jumps). Unlike our previous analysis, it also finds
basic blocks in the strict sense that no basic block contains a jump
into the middle of another basic block
3. A new code-generating function, jit_generate, that takes the output
of 2 as input. It generates a state machine:
- Each basic block becomes a case block in a switch-table
- Each basic block ends with setting a state variable for the following basic block
- The switch-table is inside a while(true) loop, which is terminated
by return statements in basic blocks which are leaves
Additionally:
- Block linking has been removed as it is (mostly) obsoleted by these
  changes. It may later be reactivated for call instructions
- The code generator API has been extended to generate the code for the state machine
- The iterations of the state machine are limited in order to avoid
infinite loops that can't be interrupted
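The loop-and-switch shape described above can be sketched in Rust. The real output is wasm emitted by the code generator; the block bodies, state numbering and iteration limit here are made up for illustration.

```rust
// Hand-written sketch of the state machine that jit_generate emits:
// a switch-table inside an (iteration-limited) endless loop, where
// each basic block sets the state variable for its successor.
fn run_fragment() -> u32 {
    const MAX_ITERATIONS: u32 = 10_000; // guard against uninterruptible loops
    let mut state = 0u32; // index of the next basic block
    let mut acc = 0u32;   // stand-in for CPU state
    for _ in 0..MAX_ITERATIONS {
        match state {
            0 => {
                // basic block 0: falls through to block 1
                acc += 1;
                state = 1;
            }
            1 => {
                // basic block 1: conditional jump back to block 0, or exit
                if acc < 3 {
                    state = 0;
                } else {
                    return acc; // leaf block: terminates the loop
                }
            }
            _ => unreachable!(),
        }
    }
    acc // iteration limit reached; control returns to the caller
}

fn main() {
    assert_eq!(run_fragment(), 3);
    println!("ok");
}
```

A leaf basic block returns out of the loop directly; all other blocks only update the state variable, so control flow between blocks never nests.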
According to Fabian: safe_write128 is called surprisingly often, as it is used by
Linux to fill the frame buffer. We can do two optimisations here: Add write128
to avoid one in_mapped_range check. Add mmap_write128 (taking 4 32-bit integers)
to avoid several switches from wasm to js and lookups of
memory_map_write32[aligned_addr]
(we can assume that writes don't cross a 16-byte boundary)
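The two proposed fast paths could look roughly like this. Only the names write128 and mmap_write128 come from the note above; the toy memory model, the range check and the dispatch counter are invented for the sketch.

```rust
// Toy model of the proposed fast path. Addresses at or past the end of
// RAM count as "mapped" (device) memory in this sketch.
struct Memory {
    mem: Vec<u8>,
    mmio_writes: u32, // counts device dispatches in this toy model
}

impl Memory {
    fn in_mapped_range(&self, addr: usize) -> bool {
        addr >= self.mem.len() // stand-in for the real mapped-range check
    }

    // One range check for the whole 16-byte store instead of four
    // separate safe_write32 calls (valid because writes are assumed not
    // to cross a 16-byte boundary).
    fn write128(&mut self, addr: usize, parts: [u32; 4]) {
        if self.in_mapped_range(addr) {
            self.mmap_write128(addr, parts);
        } else {
            for (i, part) in parts.iter().enumerate() {
                let offset = addr + 4 * i;
                self.mem[offset..offset + 4].copy_from_slice(&part.to_le_bytes());
            }
        }
    }

    // One device dispatch for the whole 16-byte store, instead of four
    // memory_map_write32[aligned_addr] lookups and wasm-to-js switches.
    fn mmap_write128(&mut self, _addr: usize, _parts: [u32; 4]) {
        self.mmio_writes += 1;
    }
}

fn main() {
    let mut m = Memory { mem: vec![0; 32], mmio_writes: 0 };
    m.write128(16, [1, 2, 3, 4]); // RAM path: single range check
    assert_eq!(m.mem[16], 1);
    assert_eq!(m.mem[20], 2);
    m.write128(32, [5, 6, 7, 8]); // mapped path: single dispatch
    assert_eq!(m.mmio_writes, 1);
    println!("ok");
}
```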
- Don't pass the old entry to jit_generate, instead look it up in
create_cache_entry (negligible performance overhead)
- Don't call the generated code immediately; this is a prerequisite for
  asynchronous compilation. This also required disabling the
  self-modifying code assertion
- Don't pass the start address to codegen_finalize,
  instead set it in jit_generate
- Moved all helper functions to coverage.js
- Refactor individual cov_*[func_id] objects to coverage[func_id].*
- Write coverage data to its own directory (./build/coverage/coverage_data*)
- Enable/disable coverage logging in do_many_cycles to account for exceptions
- Better naming
- Minor stylistic refactoring
Notes:
- The coverage_dump_loop doesn't help much, but it keeps memory usage
  somewhat lower for longer-running tests (eg. booting Linux) - Linux still
  slows down too much and can't calibrate its time to boot
- Dumping the incoming data without structuring by file results in very large
files (~150-300 MB for the nasm tests). Structuring by fn_id/fn_name slows
execution down, but allows for more manageable coverage data
- Buffering data in memory and writing to disk synchronously is faster than,
  or about as fast as, async IO or buffering combined with async IO.