# ES Module Lexer

[![Build Status][travis-image]][travis-url]

A JS module syntax lexer used in [es-module-shims](https://github.com/guybedford/es-module-shims).

Outputs the list of exports and locations of import specifiers, including dynamic import and import meta handling.

A very small single JS file (4KiB gzipped) that includes inlined WebAssembly for very fast source analysis of ECMAScript module syntax only.

For an example of the performance, Angular 1 (720KiB) is fully parsed in 5ms, in comparison to the fastest JS parser, Acorn, which takes over 100ms.

_Comprehensively handles the JS language grammar while remaining small and fast: ~10ms per MB of JS cold and ~5ms per MB of JS warm. [See benchmarks](#benchmarks) for more info._

### Usage

```
npm install es-module-lexer
```

For use in CommonJS:

```js
const { init, parse } = require('es-module-lexer');

(async () => {
  // either await init, or call parse asynchronously
  // this is necessary for the WebAssembly boot
  await init;

  const [imports, exports] = parse('export var p = 5');
  exports[0] === 'p';
})();
```

An ES module version is also available:

```js
import { init, parse } from 'es-module-lexer';

(async () => {
  await init;

  const source = `
    import { name } from 'mod';
    import json from './json.json' assert { type: 'json' }
    export var p = 5;
    export function q () {

    };

    // Comments provided to demonstrate edge cases
    import /*comment!*/ ('asdf', { assert: { type: 'json' }});
    import /*comment!*/.meta.asdf;
  `;

  const [imports, exports] = parse(source, 'optional-sourcename');

  // Returns "mod"
  imports[0].n
  source.substring(imports[0].s, imports[0].e);
  // "s" = start
  // "e" = end

  // Returns "import { name } from 'mod'"
  source.substring(imports[0].ss, imports[0].se);
  // "ss" = statement start
  // "se" = statement end

  // Returns "{ type: 'json' }"
  source.substring(imports[1].a, imports[1].se);
  // "a" = assert

  // Returns "p,q"
  exports.toString();

  // Dynamic imports are indicated by imports[2].d > -1
  // In this case the "d" index is the start of the dynamic import
  // Returns true
  imports[2].d > -1;

  // Returns "asdf"
  imports[2].n
  // Returns "'asdf'"
  source.substring(imports[2].s, imports[2].e);
  // Returns "import /*comment!*/ ("
  source.substring(imports[2].d, imports[2].s);
  // Returns "import /*comment!*/ ('asdf', { assert: { type: 'json' }})"
  source.substring(imports[2].d, imports[2].se + 1);
  // Returns "{ assert: { type: 'json' }}"
  source.substring(imports[2].a, imports[2].e);
  // ss is the same as d
  // as, ae not used for dynamic imports

  // import.meta is the fourth import record, indicated by imports[3].d === -2
  // Returns true
  imports[3].d === -2;
  // Returns "import /*comment!*/.meta"
  source.substring(imports[3].s, imports[3].e);
})();
```

### Escape Sequences

To handle escape sequences in specifier strings, the `.n` field of imported specifiers will be provided where possible.

For dynamic import expressions, this field will be empty if the specifier is not a valid JS string.

### Facade Detection

Facade modules that only use import / export syntax can be detected via the third return value:

```js
const [,, facade] = parse(`
  export * from 'external';
  import * as ns from 'external2';
  export { a as b } from 'external3';
  export { ns };
`);

facade === true;
```

### Environment Support

Node.js 10+, and [all browsers with WebAssembly support](https://caniuse.com/#feat=wasm).

### Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.

### Limitations

The lexing approach is designed to deal with the full language grammar including RegEx / division operator ambiguity through backtracking and paren / brace tracking.

The only limitation to the reduced parser is that the "exports" list may not correctly gather all export identifiers in the following edge cases:

```js
// Only "a" is detected as an export, "q" isn't
export var a = 'asdf', q = z;

// "b" is not detected as an export
export var { a: b } = asdf;
```

The above cases are handled gracefully in that the lexer continues without error; it just does not detect the export names noted above.

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results:

```
Module load time
> 7ms
Cold Run, All Samples
test/samples/*.js (3057 KiB)
> 33ms

Warm Runs (average of 25 runs)
test/samples/angular.js (719 KiB)
> 4.08ms
test/samples/angular.min.js (188 KiB)
> 2.08ms
test/samples/d3.js (491 KiB)
> 4.72ms
test/samples/d3.min.js (274 KiB)
> 3ms
test/samples/magic-string.js (34 KiB)
> 0.04ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (902 KiB)
> 8.16ms
test/samples/rollup.min.js (429 KiB)
> 4.28ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3057 KiB)
> 25.68ms
```

### Building

To build, download the WASI SDK from https://github.com/WebAssembly/wasi-sdk/releases.

The Makefile assumes the existence of "wasi-sdk-11.0" and "wabt" (optional) as sibling folders to this project.

The build through the Makefile is then run via `make lib/lexer.wasm`, which can also be triggered via `npm run build-wasm` to create `dist/lexer.js`.

On Windows it may be preferable to use the Linux subsystem.

After the WebAssembly build, the CJS build can be triggered via `npm run build`.

### License

MIT

[travis-url]: https://travis-ci.org/guybedford/es-module-lexer
[travis-image]: https://travis-ci.org/guybedford/es-module-lexer.svg?branch=master
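
### Example: Rewriting Specifiers

The `s` and `e` offsets shown in the Usage example lend themselves to in-place specifier rewriting. The following is a minimal sketch and not part of es-module-lexer: `rewriteSpecifiers` and the `resolve` callback are hypothetical names, and the import record is written by hand to mimic the shape that `parse` returns, so the snippet runs without the library. It assumes static imports carry `d === -1`, as the Usage comments imply, and that records arrive in source order.

```js
// Hypothetical helper (not part of es-module-lexer): replaces each static
// import specifier with the result of resolve(), using the "s" / "e" offsets.
function rewriteSpecifiers (source, imports, resolve) {
  let out = '';
  let last = 0;
  for (const imp of imports) {
    // Assumption: d === -1 marks a static import; dynamic imports (d > -1)
    // and import.meta (d === -2) are left untouched in this sketch.
    if (imp.d !== -1) continue;
    // Copy everything up to the specifier, then the rewritten specifier.
    out += source.slice(last, imp.s) + resolve(source.slice(imp.s, imp.e));
    last = imp.e;
  }
  // Copy the remainder of the source after the last rewritten specifier.
  return out + source.slice(last);
}

const source = "import { name } from 'mod';\nexport var p = 5;";

// Hand-written stand-in for the first element of parse(source)[0]:
// "s" and "e" bound the specifier text without its quotes.
const s = source.indexOf('mod');
const imports = [{ s, e: s + 'mod'.length, d: -1 }];

const rewritten = rewriteSpecifiers(source, imports, n => `/lib/${n}.js`);
console.log(rewritten);
```

In real use the `imports` array would come from `parse(source)` after `await init`, and `resolve` would apply whatever mapping the host needs.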