Commit graph

296 commits

Author SHA1 Message Date
Fabian a9dac09ceb Custom codegen for xchg (91-98) 2020-12-31 19:14:28 -06:00
Fabian 091b2324d9 Custom codegen for 8C 2020-12-31 19:14:28 -06:00
Fabian 93ffd50969 Custom codegen for more sse move aliases (660F29/660F6F/F30F7F) 2020-08-30 19:37:15 -05:00
Fabian 75e5c2a56f Codegen for 8-bit shifts (D0/D2) 2020-08-30 19:37:15 -05:00
Fabian 815e5d338e Codegen more fpu instructions and run their tests (D9_6, DA_5) 2020-08-30 19:37:15 -05:00
Fabian fdd1dc377d Custom codegen for xadd (0FC1) 2020-08-30 19:37:15 -05:00
Fabian 874818866a Codegen for mul32 + custom mul/imul (F7_[45]) 2020-08-30 19:37:15 -05:00
Fabian 7024207fa4 Codegen for inc/dec (group 40-4F) 2020-08-30 19:37:15 -05:00
Fabian a8308b988d Store registers in locals
This changes registers to be temporarily stored in wasm locals, across
each complete wasm module. Registers are moved from memory to locals
upon entering the wasm module and moved from locals to memory upon
leaving. Additionally, calls to functions that modify registers are
wrapped between moving registers to memory before and moving back to
locals after. This affects:

1. All non-custom instructions
2. safe_{read,write}_slow, since it may page fault (the slow path of all memory accesses)
3. task_switch_test* and trigger_ud
4. All block boundaries
5. The fallback functions of gen_safe_read_write (read-modify-write memory accesses)

The performance benefits are currently mostly eaten up by 1. and 4. (if
one calculates the total number of reads and writes to registers in
memory, it is higher after this patch, as each instruction of type 1. or
4. requires moving all 8 registers twice). This can be improved later by
the relatively mechanical work of making instructions custom (not
necessarily full code generation, only the part of the instruction where
registers are accessed). Multi-page wasm module generation will
significantly reduce the number of type 4. instructions.

Due to 2., the overall code size has significantly increased. This case
(the slow path of memory access) is often generated but rarely executed.
These moves can be removed in a later patch by a different scheme for
safe_{read,write}_slow, which has been left out of this patch for
simplicity of reviewing.

This also simplifies our code generation for storing registers, as

    instruction_body.const_i32(register_offset);
    // some computations ...
    instruction_body.store_i32();

turns into:

    // some computations ...
    write_register(register_index);

I.e., a prefix is not necessary anymore as locals are indexed directly.

Further patches will allow getting rid of some temporary locals, as
registers now can be used directly.
2020-08-30 19:37:15 -05:00
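The register scheme described in the commit above can be modelled in a few lines. This is a minimal sketch, not the project's code generator: `RegCpu`, `Locals`, `call_with_spill` and the access counter are hypothetical names introduced for illustration, and plain arrays stand in for wasm memory and wasm locals.

```rust
/// Hypothetical model: the 8 general-purpose registers as stored in memory.
pub struct RegCpu {
    pub reg_mem: [i32; 8],
    /// Counts register reads and writes that go through memory.
    pub mem_accesses: usize,
}

/// Hypothetical model of the wasm locals that hold the registers
/// while execution stays inside one wasm module.
pub struct Locals {
    pub regs: [i32; 8],
}

impl RegCpu {
    pub fn new() -> RegCpu {
        RegCpu { reg_mem: [0; 8], mem_accesses: 0 }
    }

    /// Entering the module: move all registers from memory to locals.
    pub fn load_locals(&mut self) -> Locals {
        self.mem_accesses += 8;
        Locals { regs: self.reg_mem }
    }

    /// Leaving the module: move all registers from locals to memory.
    pub fn store_locals(&mut self, locals: &Locals) {
        self.mem_accesses += 8;
        self.reg_mem = locals.regs;
    }
}

/// Wraps a call to a function that may modify registers in memory
/// (for example a non-custom instruction): spill before, reload after.
pub fn call_with_spill<F: FnOnce(&mut RegCpu)>(cpu: &mut RegCpu, locals: &mut Locals, f: F) {
    cpu.store_locals(locals);
    f(&mut *cpu);
    *locals = cpu.load_locals();
}

/// A custom instruction writes a local directly, with no memory traffic.
pub fn write_register(locals: &mut Locals, register_index: usize, value: i32) {
    locals.regs[register_index] = value;
}
```

In this model every wrapped call costs 16 register moves, which matches the commit's remark that non-custom instructions and block boundaries currently eat up most of the benefit.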
Fabian f5540d9edf Use pop16_reg_jit for pop esp 2020-08-30 19:37:15 -05:00
Fabian 0c42ea0d1f Custom code generation for leave (C9) 2020-08-30 19:37:15 -05:00
Fabian 837e6ff362 Custom code generation for ret imm (C2) 2020-08-30 19:37:15 -05:00
Fabian c0f1d2a487 Custom code generation for arith al/ax/eax, imm (group [0123][45CD], A8/A9) 2020-08-30 19:37:15 -05:00
Fabian c9163c2df5 Custom code generation for mov reg, imm (B0-BF) 2020-08-30 19:37:15 -05:00
Fabian 2837ccd06b Support for gen_safe_read128 and code generation for MOVDQU (F30F6F) 2020-08-30 19:37:15 -05:00
Fabian dca6be2d94 Also generate nop for prefetch instruction 2020-08-30 19:37:15 -05:00
Fabian 440b67eda5 Support for gen_safe_write128 and code generation for MOVAPS/MOVDQA (0F29/660F7F) 2020-08-30 19:37:15 -05:00
Fabian e2ab5eabdd Code generation for missing memory operations (8-bit shifts, shrd, shld, xadd) 2020-08-30 19:37:15 -05:00
Fabian ec846b34d9 Codegen for fpu instructions (misc instructions) (D9_[14], DB_5, DD_5, DF_4) 2020-08-30 19:37:15 -05:00
Fabian 1eab44746b Codegen for fpu instructions (fldcw/fstcw) (D9_5, D9_7) 2020-08-30 19:37:15 -05:00
Fabian fdce557820 Codegen for fpu instructions (memory stores: fst/fstp/fist/fistp) (D9_[23], DB_[23], DD_[23], DF_[237]) 2020-08-30 19:37:15 -05:00
Fabian 7c99bdae74 Codegen for fpu instructions (memory loads: fld, fild) (D9_0, DB_0, DD_0, DF_5) 2020-08-30 19:37:15 -05:00
Fabian c452c357dd Codegen for fpu instructions (DE group) 2020-08-30 19:37:15 -05:00
Fabian 21caefbffd Codegen for fpu instructions (DC group) 2020-08-30 19:37:15 -05:00
Fabian ec059a9f27 Codegen for fpu instructions (D8 group) 2020-08-30 19:37:15 -05:00
Fabian 05296b0586 Enable fpu instructions in nasm tests 2020-08-30 19:37:15 -05:00
Fabian 0798a0b40e Don't create unnecessary entry points
This commit prevents creation of entry points for jumps within the same
page. In interpreted mode, execution is continued on these kinds of
jumps.

Since this prevents the old hotness detection from working efficiently,
hotness detection has also been changed to work based on instruction
counters, and is as such more precise (longer basic blocks are compiled
earlier).

This also breaks the old loop safety detection mechanism and causes
Linux to sometimes loop forever on "calibrating delay loop", so
JIT_ALWAYS_USE_LOOP_SAFETY has been set to 1.
2020-08-30 19:29:54 -05:00
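A rough sketch of hotness detection driven by instruction counters, as described in the commit above. The type `HotnessTracker`, the method `record` and the threshold value are assumptions made for illustration; the commit does not state how the counters are stored or which threshold is used.

```rust
use std::collections::HashMap;

/// Hypothetical threshold; the real value is not given in the commit message.
pub const HOT_THRESHOLD: u32 = 1000;

/// Hypothetical tracker: one instruction counter per entry address.
pub struct HotnessTracker {
    counters: HashMap<u32, u32>,
}

impl HotnessTracker {
    pub fn new() -> HotnessTracker {
        HotnessTracker { counters: HashMap::new() }
    }

    /// Records that `instruction_count` instructions were interpreted
    /// starting at `address`. Returns true once the address becomes hot,
    /// at which point its counter is reset.
    pub fn record(&mut self, address: u32, instruction_count: u32) -> bool {
        let counter = self.counters.entry(address).or_insert(0);
        *counter = counter.saturating_add(instruction_count);
        let hot = *counter >= HOT_THRESHOLD;
        if hot {
            self.counters.remove(&address);
        }
        hot
    }
}
```

Because the counter grows by the number of instructions executed rather than by one per visit, a block of 100 instructions reaches the threshold in a tenth of the visits a block of 10 instructions needs.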
Fabian 5eaece7743 jit memory moves with immediate address (A0/A1/A2/A3) 2020-08-30 19:29:54 -05:00
Fabian 8de547455e jit memory access for imul 2020-08-30 19:29:54 -05:00
Fabian 6a2cd6419d jit memory access for 8-bit read-modify-write operations with immediate 2020-08-30 19:29:54 -05:00
Fabian 2635ed71b4 jit memory access for 8-bit read-modify-write operations 2020-08-30 19:29:54 -05:00
Awal Garg 54151e2306 jit 0x0FBF 2020-08-30 19:29:54 -05:00
Awal Garg 0377e95c42 jit 0x0FB7 2020-08-30 19:29:54 -05:00
Fabian 98d69c0bef Mark unimplemented instructions as block boundaries 2020-08-30 19:29:54 -05:00
Fabian 46f9bc9d00 Remove non-faulting property of instructions (all instructions are non-faulting) 2020-08-30 19:29:54 -05:00
Fabian d63c956a89 sse: Implement 0F5A/0F5B/CVTT?[SPD][SDQ]2[SPD][SDQ] (#57) 2020-08-30 19:29:54 -05:00
Fabian 8ab707dbc2 sse: Implement 0FE6/CVTPD2DQ/CVTTPD2DQ/CVTDQ2PD (#57) 2020-08-30 19:29:54 -05:00
Fabian 3ea0089878 sse: Implement 0F2C/0F2D/CVTT[PS][SD]2[SP]I (#57) 2020-08-30 19:29:54 -05:00
Fabian 9665dbf994 sse: Implement 0F2E/0F2F/u?comis[sd] (#57) 2020-08-30 19:29:54 -05:00
Fabian 8dc066f73d sse: Expand sse3 instruction 2020-08-30 19:29:54 -05:00
Fabian cc507db69b sse: Implement 0FC6/shufp[sd] (#57) 2020-08-30 19:29:54 -05:00
Fabian 9e902eb1dc sse: Implement 0F52/rcpps (#57) 2020-08-30 19:29:54 -05:00
Fabian 5dd26ead30 Generate code for memory instructions (0F4*, 0F9*, 0FAF: cmovcc, setcc, imul) 2020-08-30 19:29:54 -05:00
Fabian de01a4b265 Generate code for memory instructions (F6/F7/FF_{0,1}: test/inc/dec) 2020-08-30 19:29:54 -05:00
Fabian fa50294b47 Generate code for read-modify-write instructions (C1/D1/D3: Shifts and rotates) 2020-08-30 19:29:54 -05:00
Fabian 3706bcac12 Use jit for read-modify-write arithmetic instructions 2020-08-30 19:29:54 -05:00
Fabian cfb9cd8abe Partial custom implementation for arithmetic instructions with read-memory 2020-08-30 19:29:54 -05:00
Fabian 9de2b926a7 Custom implementations for test instruction (only wrapper) 2020-08-30 19:29:54 -05:00
Fabian 9164e0a48f Custom implementation for 'mov r/m, imm' 2020-08-30 19:29:54 -05:00
Fabian 415e345e54 C6/C7 don't need to be marked as block boundary 2020-08-30 19:29:54 -05:00
Fabian b0eff6b951 Implement 8-bit memory accesses 2020-08-30 19:29:54 -05:00
Fabian c36a179a5e Remove block_boundary from push 2020-08-30 19:29:53 -05:00
Fabian a5cbf53da5 Fix jit in presence of new page fault handling
Makes the following a block boundary:

- push
- Any non-custom instruction that uses modrm encoding
- Any sse/fpu instruction

This commit affects performance negatively. In order to fix this, the
above instructions need to be implemented using custom code generators
for the memory access.
2020-08-30 19:29:53 -05:00
Fabian a88420910d Handle pagefaults without JS exceptions
This commit makes the return type of most basic memory access primitives
Result, where the Err(()) case means that a page fault happened, that
the instruction should be aborted and that execution should continue at
the page fault handler.

The following primitives have a Result return type: safe_{read,write}*,
translate_address_*, read_imm*, writable_or_pagefault, get_phys_eip,
modrm_resolve, push*, pop*.

Any instruction needs to handle the page fault cases and abort
execution appropriately. The return_on_pagefault! macro has been
provided to get the same behaviour as the previously used JS exceptions
(local to the function).

Calls from JavaScript abort on a pagefault, except for
writable_or_pagefault, which returns a boolean. JS needs to check
before calling any function that may pagefault.

This commit does not yet pervasively apply return_on_pagefault!; this
will be added in the next commit.

Jitted code does not yet properly handle the new form of page faults;
this will be added in a later commit.
2020-08-30 19:29:53 -05:00
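The Result-based page fault handling from the commit above could look roughly like this. It is a sketch under assumptions: `ToyCpu`, the single mapped page and `instr_mov_al_mem` are hypothetical, and the macro body is only a stand-in for the `return_on_pagefault!` macro named in the commit message, whose real definition is not shown there.

```rust
/// Toy address space (assumption): only addresses below PAGE_SIZE are mapped.
const PAGE_SIZE: u32 = 0x1000;

/// Hypothetical CPU state for the sketch.
pub struct ToyCpu {
    pub memory: Vec<u8>,
    pub page_fault_address: Option<u32>,
    pub eax: i32,
}

/// Err(()) means a page fault happened and the instruction must be aborted.
pub fn translate_address_read(cpu: &mut ToyCpu, address: u32) -> Result<u32, ()> {
    if address < PAGE_SIZE {
        Ok(address)
    } else {
        // Record the fault; the caller unwinds by returning Err(()).
        cpu.page_fault_address = Some(address);
        Err(())
    }
}

/// A memory access primitive propagates the fault with `?`.
pub fn safe_read8(cpu: &mut ToyCpu, address: u32) -> Result<i32, ()> {
    let phys = translate_address_read(cpu, address)?;
    Ok(cpu.memory[phys as usize] as i32)
}

/// Illustrative stand-in for the macro named in the commit message:
/// unwraps Ok, and returns from the enclosing function on Err(()).
macro_rules! return_on_pagefault {
    ($expr:expr) => {
        match $expr {
            Ok(value) => value,
            Err(()) => return,
        }
    };
}

/// An instruction handler that returns nothing aborts early on a fault,
/// leaving the register untouched.
pub fn instr_mov_al_mem(cpu: &mut ToyCpu, address: u32) {
    let value = return_on_pagefault!(safe_read8(cpu, address));
    cpu.eax = value;
}
```

The early return is local to the handler, which mirrors the behaviour the commit attributes to the previously used JS exceptions.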
Fabian 33acb48fb9 Implement cvtsd2si (#57) 2020-08-30 19:29:53 -05:00
Fabian 6fa702c8aa Implement {min,max,div}{p,s}{s,d} sse instructions (#57) 2020-08-30 19:29:53 -05:00
Fabian c10bbca85e Add sqrt{p,s}{d,s} instructions (#57) 2020-08-30 19:29:53 -05:00
Fabian 70ae4b720a Remove use of raising cpu exceptions for trigger_ud 2020-08-30 19:29:53 -05:00
Fabian 7e574dde52 Implement some floating point sse1/sse2 instructions (#57) 2020-08-30 19:29:53 -05:00
Fabian 9f2c78efb4 Add missing sse3 instruction and add note on others 2020-08-30 19:29:53 -05:00
Fabian 49961ade7c Remove hintable nops that were refitted for mpx instructions 2020-08-30 19:29:53 -05:00
Fabian e0aabb2937 Mark hintable nops as non-faulting 2020-08-30 19:29:53 -05:00
Fabian bdef74eced Generate code for task_switch_test{,_mmx}, use non-raising exceptions 2020-08-30 19:29:53 -05:00
Fabian 02a7bbb8f7 Implement hintable nops 2020-08-30 19:29:53 -05:00
Fabian f43ab3387a Remove use of cpu exceptions for trigger_gp for instructions 2020-08-30 19:29:53 -05:00
Fabian 5e82bc0e00 Remove use of cpu exceptions for trigger_ss (partially including switch_seg) 2020-08-30 19:29:53 -05:00
Fabian 4ee7da8f83 Remove use of cpu exceptions for divisions 2020-08-30 19:29:53 -05:00
Awal Garg b3e415cf9f jit inline 0xC3 2020-08-30 19:29:53 -05:00
Awal Garg c2c5e4f35c jit inline 0xC7
The generated Rust code no longer calls read_imm* functions for custom
instructions in the memory variant branches when both an immediate
value and a modrm byte are used
2020-08-30 19:29:53 -05:00
Awal Garg 4d622c165e jit inline nop instructions 2020-08-30 19:29:53 -05:00
Fabian 1253b72906 Generate prefix handling for string instructions 2020-08-30 19:29:13 -05:00
Fabian 3a8d644d75 Port jit to Rust
The following files and functions were ported:
- jit.c
- codegen.c
- _jit functions in instructions*.c and misc_instr.c
- generate_{analyzer,jit}.js (produces Rust code)
- jit_* from cpu.c

And the following data structures:
- hot_code_addresses
- wasm_table_index_free_list
- entry_points
- jit_cache_array
- page_first_jit_cache_entry

Other miscellaneous changes:
- Page is an abstract type
- Addresses, locals and bitflags are unsigned
- Make the number of entry points a growable type
- Avoid use of global state wherever possible
- Delete string packing
- Make CachedStateFlags abstract
- Make AnalysisType product type
- Make BasicBlockType product type
- Restore opcode assertion
- Set opt-level=2 in debug mode (for test performance)
- Delete JIT_ALWAYS instrumentation (now possible via api)
- Refactor generate_analyzer.js
- Refactor generate_jit.js
2020-08-30 19:29:13 -05:00
Fabian 9b2b3250df Fix 8-bit jumps in 16-bit mode 2020-08-30 19:27:07 -05:00
Fabian 5995414f87 JIT: Follow call instructions 2020-08-30 19:27:07 -05:00
Fabian ad7fa728b5 Annotate some instructions 2020-08-30 19:27:07 -05:00
Fabian 8466ca205e Mark ud2 instruction as block boundary 2020-08-30 19:27:07 -05:00
Fabian cba5491fc4 Multiple jit block entry points
- introduce multiple entry points per compiled wasm module, by passing
  the initial state to the generated function.
- continue analysing and compiling after instructions that change eip, but
  will eventually return to the next instruction, in particular CALLs
  (and generate an entry point for the following instruction)

This commit is incomplete in the sense that the container will crash
after some time of execution, as wasm table indices are never freed
2020-08-30 19:27:02 -05:00
Amaan Cheval 2128f07796 jit: Inline 0x89 and 0x8b opcodes' reg variants 2020-07-21 20:10:14 -05:00
Fabian 39d8d17031 Make 8f custom, simplify generate_jit by removing handling of requires_prefix_call 2020-07-21 20:10:14 -05:00
Fabian ded423b1c5 x86 table: Add remaining 0f instructions, simplify gen scripts 2020-07-21 20:10:14 -05:00
Fabian a9b5f153a8 Move around and add some assertions 2020-07-21 20:10:14 -05:00
Fabian f8349af093 New block analysis, generation of state machine with multiple basic blocks
This commit consists of three components:

1. A new generated x86-parser that analyses instructions. For now, it
   only detects the control flow of an instruction: Whether it is a
   (conditional) jump, a normal instruction or a basic block boundary
2. A new function, jit_find_basic_blocks, that finds and connects basic
   blocks using 1. It loosely finds all basic blocks making up a function,
   i.e. it doesn't follow call or return instructions (but it does follow
   all near jumps). Different from our previous analysis, it also finds
   basic blocks in the strict sense that no basic block contains a jump
   into the middle of another basic block
3. A new code-generating function, jit_generate, that takes the output
   of 2 as input. It generates a state machine:
   - Each basic block becomes a case block in a switch-table
   - Each basic block ends with setting a state variable for the following basic block
   - The switch-table is inside a while(true) loop, which is terminated
     by return statements in basic blocks which are leaves

Additionally:
- Block linking has been removed as it is (mostly) obsoleted by these
  changes. It may later be reactivated for call instructions
- The code generator API has been extended to generate the code for the state machine
- The iterations of the state machine are limited in order to avoid
  infinite loops that can't be interrupted
2020-07-21 20:10:14 -05:00
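The state machine described in the commit above (a switch inside a loop, a state variable set at the end of each basic block, and a cap on iterations) can be sketched as follows. The names `Next`, `run_state_machine`, `countdown` and `done` are hypothetical; the real generator emits wasm rather than calling Rust function pointers.

```rust
/// What a basic block decides when it finishes (hypothetical type).
pub enum Next {
    /// Continue with the basic block at this index.
    Block(usize),
    /// Leaf block: leave the generated function.
    Return,
}

/// Runs basic blocks as a state machine and returns the number of
/// blocks executed. `max_iterations` bounds the loop so that it can
/// always be interrupted.
pub fn run_state_machine<S>(
    blocks: &[fn(&mut S) -> Next],
    state: &mut S,
    initial_block: usize,
    max_iterations: u32,
) -> u32 {
    let mut current = initial_block; // the state variable
    let mut iterations = 0;
    loop {
        if iterations >= max_iterations {
            return iterations; // limit reached: hand control back
        }
        iterations += 1;
        // Each basic block is one case of the dispatch.
        match blocks[current](&mut *state) {
            Next::Block(index) => current = index,
            Next::Return => return iterations,
        }
    }
}

/// Example block 0: decrement and loop back to itself while positive.
pub fn countdown(n: &mut i32) -> Next {
    *n -= 1;
    if *n > 0 { Next::Block(0) } else { Next::Block(1) }
}

/// Example block 1: a leaf that terminates the loop.
pub fn done(_: &mut i32) -> Next {
    Next::Return
}
```

Passing `initial_block` as a parameter also hints at how a single compiled module can expose several entry points: the caller chooses the starting state.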
Fabian 0d5ca58354 Minor: Use freeze on instruction objects 2020-07-21 20:10:14 -05:00
Amaan Cheval 41c8241d5e x86_table: Mark state-altering instructions as JIT block boundaries
These instructions, if included within a compiled JIT block, may alter the
state_flags of a block entry (such as whether flat segmentation is used or not),
which may invalidate the block that is running - this caused bugs in OpenBSD
because of a block like this being compiled:

0xF81F2: 8E DB                mov ds, bx
0xF81F4: 8E D3                mov ss, bx
0xF81F6: 66 8B 26 B8 F5       mov esp, dword ptr [0xf5b8] <--
0xF81FB: 66 89 36 B8 F5       mov dword ptr [0xf5b8], esi <--

The memory accesses implicitly use DS. If we include flat-segmentation as a
flag within state_flags and optimize calls to get_seg based on it, this behavior
would cause issues (and did, in OpenBSD).

By marking these instructions as block boundaries, we remediate that issue.
2020-07-21 20:10:14 -05:00
Amaan Cheval 4d87bebee9 gen: s/jump/block_boundary/ 2020-07-21 20:10:14 -05:00
Amaan Cheval 1f0e7c3ce0 fpu: Have opcode 0xDF use fixed_g instruction functions 2020-07-21 20:10:14 -05:00
Amaan Cheval fca80793b8 fpu: Have opcode 0xDE use fixed_g instruction functions 2020-07-21 20:10:14 -05:00
Amaan Cheval 4910777084 fpu: Have opcode 0xDD use fixed_g instruction functions 2020-07-21 20:10:14 -05:00
Amaan Cheval 33c2b72553 fpu: Have opcode 0xDC use fixed_g instruction functions 2020-07-21 20:10:14 -05:00
Amaan Cheval c3a856e944 fpu: Have opcode 0xDB use fixed_g instruction functions 2020-07-21 20:10:14 -05:00
Amaan Cheval 0f23cf2745 fpu: Have opcode 0xDA use fixed_g instruction functions 2020-07-21 20:10:14 -05:00
Amaan Cheval df86637cb8 fpu: Have opcode 0xD9 use fixed_g 2020-07-21 20:10:14 -05:00
Amaan Cheval 173be47658 fpu: Have opcode 0xD8 use fixed_g and the regular instruction decoder 2020-07-21 20:10:14 -05:00
Awal Garg 54a43ab437 improve segment prefix handling and custom code generation for lea 2020-07-21 20:10:14 -05:00
Amaan Cheval 843527ac04 Apply stack_size_32 cache optimization to push r/m 2020-07-21 20:10:14 -05:00
Amaan Cheval 3ffb2ac35f Apply stack_size_32 cache optimization to push imm 2020-07-21 20:10:14 -05:00
Amaan Cheval 3512d34314 Optimize push/pop JIT instructions to not check stack_size_32
We generate a version of the push/pop instruction with the stack_size_32 fixed,
since the state tends not to change much. If it does change, state_flags won't
match the output of pack_current_state_flags and the cache entry will therefore
be invalidated.
2020-07-21 20:10:14 -05:00
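A small sketch of the invalidation check that the stack_size_32 optimization relies on. Only `state_flags`, `pack_current_state_flags` and `stack_size_32` are names from the commit messages; the bit layout, `CpuState`, `CacheEntry` and `is_usable` are assumptions for illustration.

```rust
/// Hypothetical bit assignments; the real layout is not given in the log.
pub const FLAG_STACK_SIZE_32: u32 = 1 << 0;
pub const FLAG_FLAT_SEGMENTATION: u32 = 1 << 1;

/// Hypothetical subset of the CPU state that compiled code depends on.
pub struct CpuState {
    pub stack_size_32: bool,
    pub flat_segmentation: bool,
}

/// Packs the state that compiled code was specialised on into one word.
pub fn pack_current_state_flags(cpu: &CpuState) -> u32 {
    let stack = if cpu.stack_size_32 { FLAG_STACK_SIZE_32 } else { 0 };
    let flat = if cpu.flat_segmentation { FLAG_FLAT_SEGMENTATION } else { 0 };
    stack | flat
}

/// Hypothetical cache entry: code compiled for one address and one state.
pub struct CacheEntry {
    pub start_addr: u32,
    pub state_flags: u32,
}

/// An entry may only run if the state it was compiled for still holds.
pub fn is_usable(entry: &CacheEntry, cpu: &CpuState, addr: u32) -> bool {
    entry.start_addr == addr && entry.state_flags == pack_current_state_flags(cpu)
}
```

Since the check happens on entry to compiled code, an instruction that changes one of these flags in the middle of a block would go unnoticed, which is why state-altering instructions are treated as block boundaries.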
Fabian f5938caa5a Link blocks for conditional jumps 2020-07-21 20:10:14 -05:00
Amaan Cheval bbe7d3d1d1 Add nonfaulting property to instructions in x86_table
See:
https://gist.github.com/AmaanC/faff7066d16f1dee4bbbd6b73a72d831

From the geek32[1] sheet, the criteria used for nonfaulting instructions were:

```
groups = ['arith', 'logical', 'conver', 'datamov', 'datamov arith', 'shftrot', 'flgctrl'];

// Excluded because they may trigger faults, so the optimization can't apply to them
excluded_opcodes = [
    // May trigger_ud
    '0x8C',
    // switch_seg may fault
    '0x8E',
    // mov to/from seg:offset (memory accesses)
    '0xA0',
    '0xA1',
    '0xA2',
    '0xA3',
    // Unimplemented in v86
    '0xD50A',
    '0x0F38F0',
    '0x0F38F1',
];
// Keywords that indicate a group/instruction which may fault
excluded_row_words = ['fpu', 'simd', 'mmx', 'sse', 'vmx', 'XLAT', 'DIV', 'AMX', 'AAM', 'CLI', 'STI', 'CMPXCHG8B'];
```

[1] http://ref.x86asm.net/geek32.html#x0F90
2020-07-21 20:10:14 -05:00
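The criteria listed in the commit above could be applied with a filter like the one below. The three lists are copied from the commit message; the function `is_nonfaulting` and the matching rules (exact group match, exact opcode string match, case-sensitive substring match on the row text) are assumptions, since the linked script is not reproduced here.

```rust
/// Instruction groups considered for the nonfaulting property.
const GROUPS: [&str; 7] = [
    "arith", "logical", "conver", "datamov", "datamov arith", "shftrot", "flgctrl",
];

/// Opcodes excluded because they may fault or are unimplemented.
const EXCLUDED_OPCODES: [&str; 9] = [
    "0x8C", "0x8E", "0xA0", "0xA1", "0xA2", "0xA3", "0xD50A", "0x0F38F0", "0x0F38F1",
];

/// Keywords that indicate a group or instruction which may fault.
const EXCLUDED_ROW_WORDS: [&str; 12] = [
    "fpu", "simd", "mmx", "sse", "vmx", "XLAT", "DIV", "AMX", "AAM", "CLI", "STI",
    "CMPXCHG8B",
];

/// Hypothetical predicate: returns true if a row of the reference sheet
/// qualifies as nonfaulting under the three lists above.
pub fn is_nonfaulting(group: &str, opcode: &str, row_text: &str) -> bool {
    GROUPS.contains(&group)
        && !EXCLUDED_OPCODES.contains(&opcode)
        && !EXCLUDED_ROW_WORDS.iter().any(|word| row_text.contains(*word))
}
```

For example, a row in the "arith" group whose text contains "DIV" is rejected by the keyword list even though its group is allowed.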
Fabian f53aba84b5 Linking compiled blocks 2020-07-21 20:10:13 -05:00
Fabian cf687afe37 Mark hlt as changing eip
Not strictly necessary, as it escapes execution using
MAGIC_CPU_EXCEPTION, but a good idea in case it changes later
2020-07-21 20:10:13 -05:00
Fabian 0b32b05deb Address review 2020-07-21 20:10:13 -05:00
Fabian 96c6da294c Fix popf handling for jit 2020-07-21 20:10:13 -05:00
Fabian 815c7a33bf Fix STI handling for jit 2020-07-21 20:10:13 -05:00
Fabian e19e71386b Revert "Replace prefix_call with custom_resolve_modrm"
This reverts commit c7c42065ac4e8cdc2f8653b36a32d1df9cb26a2e.
2020-07-21 20:10:13 -05:00
Fabian 96ad9f80a1 Annotate instruction table with jumping instructions 2020-07-21 20:10:13 -05:00
Amaan Cheval b3a4a30a9f Implement pandnp{s,d}, xmm/m128 2020-07-21 20:10:13 -05:00
Amaan Cheval 91c6e08864 Implement orp{s,d}, xmm/m128 2020-07-21 20:10:13 -05:00
Amaan Cheval 70749c6aff Minor: Update comment for maskmovq and maskmovdqu 2020-07-21 20:10:12 -05:00
Amaan Cheval e4b3032266 Implement paddd xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval bb58b8be45 Implement paddw xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval d109faefb8 Implement paddb xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 6d84b62bc0 Implement psubq xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 0100369eaf Implement psubq mm, mm/m64 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 4949ffddd2 Implement psubw xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval e8cf6ebdc3 Implement psubb xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval e7abcdae8c Update maskmovdqu's entry in x86 encoding table 2020-07-21 20:10:12 -05:00
Amaan Cheval 6d609d2b42 Implement maskmovq mm, mm 2020-07-21 20:10:12 -05:00
Amaan Cheval c2801e6ef8 Implement psadbw xmm, xmm/m128 2020-07-21 20:10:12 -05:00
Amaan Cheval 2f949ef93f Implement psadbw mm, mm/m64 2020-07-21 20:10:12 -05:00
Amaan Cheval 7b34662717 Implement pmaddwd xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval b33c136a25 Implement pmuludq xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval cfe26d5e33 Implement pmuludq mm, mm/m64 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 2ccf7fbb93 Implement pslld xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval d002913329 Implement psllw xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 9b9c73a706 Implement pmaxsw xmm, xmm/m128 2020-07-21 20:10:12 -05:00
Amaan Cheval 52b43b00bc Implement pmaxsw mm, mm/m64 2020-07-21 20:10:12 -05:00
Amaan Cheval ccd57dfcd0 Implement paddsw xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval e71be08f7a Implement paddsb xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 96ad95fcd8 Implement pminsw xmm, xmm/m128 2020-07-21 20:10:12 -05:00
Amaan Cheval 08158d69e2 Implement pminsw mm, mm/m64 2020-07-21 20:10:12 -05:00
Amaan Cheval 69001dbd8a Implement psubsw xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval df20bd5742 Implement psubsb xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval a4029538f6 Implement movntq m64, mm 2020-07-21 20:10:12 -05:00
Amaan Cheval ef289aab7a Implement pmulhw xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 91b0b9a41e Implement pmulhuw mm, mm/m64 2020-07-21 20:10:12 -05:00
Amaan Cheval 5a26ae8bb0 Implement pavgw xmm, xmm/m128 2020-07-21 20:10:12 -05:00
Amaan Cheval da697b72c2 Implement pavgw mm, mm/m64 2020-07-21 20:10:12 -05:00
Amaan Cheval de2e118f91 Implement psrad xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 33adb1a7f6 Implement psraw xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 58fe73f3de Implement pavgb xmm, xmm/m128 2020-07-21 20:10:12 -05:00
Amaan Cheval 16c0f9ce5e Implement pavgb mm, mm/m64 2020-07-21 20:10:12 -05:00
Amaan Cheval c025717e46 Implement pandn xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 41fac79092 Implement pmaxub mm, mm/m64 2020-07-21 20:10:12 -05:00
Amaan Cheval 4218f36b13 Implement pand xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 61f6c96717 Implement pminub mm, mm/m64 2020-07-21 20:10:12 -05:00
Amaan Cheval d7259a5d13 Implement psubusw xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 6e147c048c Implement psubusb xmm, xmm/m128 (sse2) 2020-07-21 20:10:12 -05:00
Amaan Cheval 9de9e9da00 Implement pmovmskb r, mm 2020-07-21 20:10:12 -05:00
Amaan Cheval 5cde440520 Implement movdq2q mm, xmm (sse2) 2020-07-21 20:10:12 -05:00