TS compile API - Madinah

Background#

When working on our SDK build pipeline, we often defined aliases for developer convenience, so that long relative paths could be written as short aliased paths.
For example:

{
  // ...
  "baseUrl":"src",
  "paths": {
    "@/package": ["./index"],
    "@/package/*": ["./*"],
  }
}

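With this config the alias always points at the src directory. It cuts down on relative-path imports, keeps the code tidy, and makes it easier to reorganize the directory structure. When a deeply nested file under src needs something near the top level, it can simply write: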

import Components from "@/package/ui/header"

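This gives a friendly DX during development. But once the code is packaged as an SDK, the host environment that consumes the SDK will not necessarily configure the same aliases, and if it does not, the aliased files cannot be found. So when we build the TS we need to compile alias imports back into relative paths, which keeps the output compatible with every consumer. Doing that requires some understanding of the TS compilation pipeline.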

About the TS compiler#

The TS compilation pipeline#

SourceCode ~~ Scanner ~~> Token stream
Token stream ~~ Parser ~~> AST (abstract syntax tree)
AST ~~ Binder ~~> Symbols
AST + Symbols ~~ Checker ~~> Type validation
AST + Checker ~~ Emitter ~~> JavaScript code

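Scanner#

The scanner is the first stage of the TS (TypeScript) compiler, also known as the lexer. It converts the character stream of a source file into a sequence of tokens.
It works as follows:

Read the character stream: the scanner reads the source file one character at a time.
Recognize tokens: following a set of predefined lexical rules, the scanner groups characters into tokens such as identifiers, keywords, operators, and literals. Conceptually this is a finite automaton (or a set of regular expressions) matching character sequences.
Produce tokens: once a complete token has been recognized, the scanner reports its kind and value to the next stage of the compiler.
Handle special cases: the scanner also deals with comments, string literals, escape sequences, and similar cases.

For example, consider the following TypeScript snippet: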

let age: number = 25;

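The scanner reads it character by character and produces the following tokens:

let keyword
age identifier
: colon (punctuation)
number keyword
= equals sign (operator)
25 numeric literal
; semicolon (separator)

Tokens are produced in source order, and the scanner repeats this process until every character in the source file has been consumed. This stage only extracts tokens; it performs no syntactic or semantic analysis.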
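To check the token table above against the real scanner, here is a minimal sketch that scans the same statement and prints each token kind next to its text. The helper name `scanTokens` is made up for this example. Note that `ts.SyntaxKind` contains aliased members, so `=` is reported as `FirstAssignment` and `25` as `FirstLiteralToken`.

```typescript
import * as ts from "typescript";

// Hypothetical helper: collect [kind name, token text] pairs for a snippet.
function scanTokens(text: string): Array<[string, string]> {
  const scanner = ts.createScanner(ts.ScriptTarget.Latest, /*skipTrivia*/ true);
  scanner.setText(text);

  const tokens: Array<[string, string]> = [];
  let token = scanner.scan();
  while (token !== ts.SyntaxKind.EndOfFileToken) {
    // getTokenText() returns the source text of the token just scanned
    tokens.push([ts.SyntaxKind[token], scanner.getTokenText()]);
    token = scanner.scan();
  }
  return tokens;
}

const tokens = scanTokens("let age: number = 25;");
for (const [kind, text] of tokens) {
  console.log(`${kind}\t${text}`);
}
```

Because `skipTrivia` is true, whitespace never shows up as a token; passing false would also yield `WhitespaceTrivia` entries.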

import * as ts from "typescript";

// TypeScript has a singleton scanner
const scanner = ts.createScanner(ts.ScriptTarget.Latest, /*skipTrivia*/ true);

// That is initialized using a function `initializeState` similar to
function initializeState(text: string) {
  scanner.setText(text);
  scanner.setOnError((message: ts.DiagnosticMessage, length: number) => {
    console.error(message);
  });
  scanner.setScriptTarget(ts.ScriptTarget.ES5);
  scanner.setLanguageVariant(ts.LanguageVariant.Standard);
}

// Sample usage
initializeState(`
var foo = 123;
`.trim());

// Start the scanning
let token = scanner.scan();
while (token !== ts.SyntaxKind.EndOfFileToken) {
  // SyntaxKind is a numeric enum; indexing it yields the kind's name
  console.log(ts.SyntaxKind[token]);
  token = scanner.scan();
}

output

VarKeyword
Identifier
FirstAssignment
FirstLiteralToken
SemicolonToken

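Parser#

The TS (TypeScript) parser turns TypeScript code into an abstract syntax tree (AST). The tree is what later stages such as static analysis, type checking, and compilation operate on.
The parser builds the tree by analysing the lexical and syntactic structure of the source. Lexical analysis breaks the source into tokens such as keywords, identifiers, operators, and literals. Syntactic analysis organizes those tokens into a tree and checks that the code is syntactically valid.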

import * as ts from "typescript";

function printAllChildren(node: ts.Node, depth = 0) {
  console.log(new Array(depth + 1).join('----'), ts.SyntaxKind[node.kind], node.pos, node.end);
  depth++;
  node.getChildren().forEach(c=> printAllChildren(c, depth));
}

var sourceCode = `
var foo = 123;
`.trim();

var sourceFile = ts.createSourceFile('foo.ts', sourceCode, ts.ScriptTarget.ES5, true);
printAllChildren(sourceFile);

output

SourceFile 0 14
---- SyntaxList 0 14
-------- VariableStatement 0 14
------------ VariableDeclarationList 0 13
---------------- VarKeyword 0 3
---------------- SyntaxList 3 13
-------------------- VariableDeclaration 3 13
------------------------ Identifier 3 7
------------------------ FirstAssignment 7 9
------------------------ FirstLiteralToken 9 13
------------ SemicolonToken 13 14
---- EndOfFileToken 14 14
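The dump above lists every child, tokens included. When only the AST nodes matter, `ts.forEachChild` together with the `ts.isXyz` guards is usually enough. The following sketch, which uses an invented two-line input, collects the name, type annotation, and initializer of every variable declaration.

```typescript
import * as ts from "typescript";

const sourceFile = ts.createSourceFile(
  "foo.ts",
  "var foo = 123;\nlet age: number = 25;",
  ts.ScriptTarget.ES5,
  /*setParentNodes*/ true
);

// One "name | type | initializer" entry per variable declaration.
const declarations: string[] = [];

function visit(node: ts.Node): void {
  if (ts.isVariableDeclaration(node)) {
    const name = node.name.getText(sourceFile);
    const type = node.type ? node.type.getText(sourceFile) : "(none)";
    const init = node.initializer ? node.initializer.getText(sourceFile) : "(none)";
    declarations.push(`${name} | ${type} | ${init}`);
  }
  // forEachChild only visits AST nodes, unlike getChildren(), which also
  // yields tokens such as VarKeyword and SemicolonToken.
  ts.forEachChild(node, visit);
}

visit(sourceFile);
console.log(declarations);
```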

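Binder#

A typical JavaScript transpiler pipeline looks roughly like this:

SourceCode ~~Scanner~~> Tokens ~~Parser~~> AST ~~Emitter~~> JavaScript

For TS, however, this pipeline is missing a key step: TypeScript's semantic system. To help the checker perform type checking, the binder connects the various parts of the source code into a coherent type system that the checker can use. The binder's main responsibility is to create Symbols.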

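[Figure: a simplified view of the binder]

[Figure: a closer look at the Symbol structure]

The pos and end of a Symbol's declarations can be used to tell which scope a reference belongs to, and therefore to tell same-named references apart.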
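The following sketch makes the point about symbols and scopes concrete. It builds a small in-memory program (the file name `scope.ts`, its contents, and the stripped-down compiler host are assumptions made for this demo) and asks the checker which Symbol each `value` identifier is bound to.

```typescript
import * as ts from "typescript";

const fileName = "scope.ts";
const source = [
  "const value = 1;",
  "function f() {",
  "  const value = 2;",
  "  return value;",
  "}",
].join("\n");

const sourceFile = ts.createSourceFile(fileName, source, ts.ScriptTarget.ES2015, true);

// Minimal in-memory host (demo assumption): it serves only the file above,
// and noLib keeps the default lib files from being loaded.
const host: ts.CompilerHost = {
  getSourceFile: (name) => (name === fileName ? sourceFile : undefined),
  getDefaultLibFileName: () => "lib.d.ts",
  writeFile: () => {},
  getCurrentDirectory: () => "",
  getCanonicalFileName: (name) => name,
  useCaseSensitiveFileNames: () => true,
  getNewLine: () => "\n",
  fileExists: (name) => name === fileName,
  readFile: (name) => (name === fileName ? source : undefined),
};
const program = ts.createProgram([fileName], { noLib: true, noResolve: true }, host);
const checker = program.getTypeChecker();

// Gather every identifier spelled `value`, in source order.
const identifiers: ts.Identifier[] = [];
function collect(node: ts.Node): void {
  if (ts.isIdentifier(node) && node.text === "value") identifiers.push(node);
  ts.forEachChild(node, collect);
}
collect(sourceFile);

const symbols = identifiers.map((id) => checker.getSymbolAtLocation(id));
const [outerDecl, innerDecl, innerUse] = symbols;

console.log("outer and inner share a symbol:", outerDecl === innerDecl);
console.log("inner use resolves to inner declaration:", innerDecl === innerUse);
for (const symbol of symbols) {
  const declaration = symbol?.declarations?.[0];
  console.log(symbol?.name, declaration?.pos, declaration?.end);
}
```

The two declarations share a name but get distinct Symbols, and the `value` inside `return` resolves to the inner one. The pos and end printed for each declaration show where that Symbol was declared.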

Checker#

The checker combines the AST with the Symbols produced by the binder to perform type inference, type checking, and so on.
Code example:

import * as ts from "typescript";
import path from 'path'

// Create a TypeScript program
const program = ts.createProgram({
  rootNames: [path.join(__dirname, './check.ts')], // paths of all files to check
  options: {
    ...ts.getDefaultCompilerOptions(),
    baseUrl: '.'
  }, // compiler options
});

// Collect the syntactic and semantic diagnostics reported before emit
const diagnostics = ts.getPreEmitDiagnostics(program)

// Print the error messages
diagnostics.forEach((diagnostic) => {
  console.log(
    `Error: ${ts.flattenDiagnosticMessageText(diagnostic.messageText, "\n")}`
  );
});

check.ts

const a:string  = 1
console.log(a)

const b = ({)

output

Error: Type 'number' is not assignable to type 'string'.
Error: Property assignment expected.
Error: '}' expected.

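Emitter#

emitter.ts: the TS -> JavaScript emitter
declarationEmitter.ts: the emitter that creates declaration files (.d.ts) for TypeScript source files (.ts)

The emit stage calls the Printer to turn the AST into text. The name fits well: it prints the AST as text.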

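import * as ts from 'typescript';

// Build the statement `const video: string = "conference";` from factory calls
function makeNode() {
  return ts.factory.createVariableStatement(
    /*modifiers*/ undefined,
    ts.factory.createVariableDeclarationList(
      [
        ts.factory.createVariableDeclaration(
          ts.factory.createIdentifier('video'),
          /*exclamationToken*/ undefined,
          ts.factory.createKeywordTypeNode(ts.SyntaxKind.StringKeyword),
          ts.factory.createStringLiteral('conference'),
        ),
      ],
      ts.NodeFlags.Const,
    ),
  );
}

// printNode expects a SourceFile for context; an empty one is enough
// when every node is synthesized
const sourceFile = ts.createSourceFile('temp.ts', '', ts.ScriptTarget.Latest);
const printer = ts.createPrinter();
const result = printer.printNode(ts.EmitHint.Unspecified, makeNode(), sourceFile);
console.log(result); // const video: string = "conference";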
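To watch the emitter work on real source rather than on a hand-built node, `ts.transpileModule` is a convenient entry point: it parses, transforms, and emits a single file without type checking. A minimal sketch, using the `let age` statement from the scanner section as input:

```typescript
import * as ts from "typescript";

const source = "let age: number = 25;";

// transpileModule runs parse -> transform -> emit for one file and skips
// type checking, so it is a quick way to look at the emitter's output.
const result = ts.transpileModule(source, {
  compilerOptions: { target: ts.ScriptTarget.ES2015 },
});

// The type annotation is gone; only JavaScript is left.
console.log(result.outputText);
```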

Transformers#

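The sections above walked through how TS compiles code. TS also exposes something like "lifecycle hooks" that let us plug our own steps into the compilation:

before: runs transformers before TypeScript's own transforms (the code has not been compiled yet)
after: runs transformers after TypeScript's own transforms (the code has been compiled)
afterDeclarations: runs transformers after the declaration step (this is where type definitions can be transformed)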
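The difference between before and after is easy to observe. The sketch below registers the same probe in both hooks through `ts.transpileModule` and records whether a variable declaration still carries a type annotation. The names `probe`, `hasTypeAnnotation`, and `seen` are made up for this example, and afterDeclarations is left out because this call emits no declaration file.

```typescript
import * as ts from "typescript";

// Reports whether any variable declaration under `root` has a type annotation.
function hasTypeAnnotation(root: ts.Node): boolean {
  let found = false;
  const visit = (node: ts.Node): void => {
    if (ts.isVariableDeclaration(node) && node.type !== undefined) found = true;
    ts.forEachChild(node, visit);
  };
  visit(root);
  return found;
}

const seen: Record<string, boolean> = {};

// Hypothetical probe: inspects the tree for a stage and returns it unchanged.
const probe =
  (stage: string): ts.TransformerFactory<ts.SourceFile> =>
  () =>
  (sourceFile) => {
    seen[stage] = hasTypeAnnotation(sourceFile);
    return sourceFile;
  };

ts.transpileModule("let age: number = 25;", {
  compilerOptions: { target: ts.ScriptTarget.ES2015 },
  transformers: { before: [probe("before")], after: [probe("after")] },
});

// `before` still sees TypeScript syntax; by `after` the annotation is stripped.
console.log(seen);
```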

API#

visiting#

ts.visitNode(node, visitor): visits the root node
ts.visitEachChild(node, visitor, context): visits the children of a node
ts.isXyz(node): checks the kind of a node, for example ts.isVariableDeclaration(node)

Nodes#

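ts.factory.createXyz: creates and returns a new node, for example ts.factory.createIdentifier('world') (the top-level ts.createXyz functions have been deprecated since TypeScript 4.0)
ts.factory.updateXyz: updates a node, for example ts.factory.updateVariableDeclaration()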
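Put together, these calls are enough for a complete transformation. The sketch below is a made-up example (the transformer `renameFoo` and the input are not from the original project): it renames every identifier `foo` to `bar` with `ts.transform` and prints the result.

```typescript
import * as ts from "typescript";

// Hypothetical transformer: rename every identifier `foo` to `bar`.
const renameFoo: ts.TransformerFactory<ts.SourceFile> = (context) => {
  const visitor = (node: ts.Node): ts.Node => {
    if (ts.isIdentifier(node) && node.text === "foo") {
      // create* builds a fresh node that replaces the visited one
      return context.factory.createIdentifier("bar");
    }
    // visitEachChild rebuilds `node` only if one of its children changed
    return ts.visitEachChild(node, visitor, context);
  };
  // visitNode applies the visitor to the root; the guard narrows the result type
  return (sourceFile) => ts.visitNode(sourceFile, visitor, ts.isSourceFile);
};

const sourceFile = ts.createSourceFile(
  "foo.ts",
  "var foo = 123;\nconsole.log(foo);",
  ts.ScriptTarget.ES2015,
  /*setParentNodes*/ true
);

const result = ts.transform(sourceFile, [renameFoo]);
const output = ts.createPrinter().printFile(result.transformed[0]);
result.dispose();

console.log(output);
```

`ts.transform` runs only the transformers passed to it, so nothing is compiled here: the output is still the original code with the identifier renamed.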

Writing a transformer#

import path from 'path';
import * as ts from 'typescript';

const transformer =
  (_program: ts.Program): ts.TransformerFactory<ts.SourceFile> =>
  (context) => {
    return (sourceFile) => {
      const visitor = (node: ts.Node): ts.Node => {
        // Replace the identifiers `babel` and `typescript` with string literals
        if (ts.isIdentifier(node)) {
          switch (node.text) {
            case 'babel':
              return ts.factory.createStringLiteral('babel-transformer');
            case 'typescript':
              return ts.factory.createStringLiteral('typescript-transformer');
          }
        }
        // Everything else: keep descending into the children
        return ts.visitEachChild(node, visitor, context);
      };

      return ts.visitNode(sourceFile, visitor, ts.isSourceFile);
    };
  };

const program = ts.createProgram([path.join(__dirname, './02.ts')], {
  baseUrl: '.',
  target: ts.ScriptTarget.ESNext,
  module: ts.ModuleKind.ESNext,
  declaration: true,
  declarationMap: true,
  jsx: ts.JsxEmit.React,
  moduleResolution: ts.ModuleResolutionKind.NodeJs,
  skipLibCheck: true,
  allowSyntheticDefaultImports: true,
  outDir: path.join(__dirname, '../dist/transform'),
});

// The fifth argument of emit() takes the custom transformers
const res = program.emit(undefined, undefined, undefined, undefined, {
  after: [transformer(program)],
});
console.log(res);

More code examples: https://github.com/itsdouges/typescript-transformer-handbook/tree/master/example-transformers

A real-world application#

import path from 'path';
import ts from 'typescript';

// Rewrites an aliased import path into a path relative to the importing file.
// `baseUrl` is the directory that the `paths` targets are relative to.
export function replaceAlias(
  fileName: string,
  importPath: string,
  paths?: Record<string, string[]>,
  baseUrl = '.'
) {
  if (!paths) return importPath;

  for (const [pattern, targets] of Object.entries(paths)) {
    const target = targets[0];
    if (!target) continue;

    let resolved: string | undefined;
    if (pattern.endsWith('*')) {
      // Wildcard alias: "@/package/*" matches any import under "@/package/"
      const prefix = pattern.slice(0, -1);
      if (importPath.startsWith(prefix)) {
        resolved = target.replace(/\*$/, '') + importPath.slice(prefix.length);
      }
    } else if (importPath === pattern) {
      // Exact alias: "@/package" only matches itself
      resolved = target;
    }
    if (resolved === undefined) continue;

    // Resolve against baseUrl first: the targets are relative to it, not to the cwd
    const relativePath = path
      .relative(path.dirname(fileName), path.resolve(baseUrl, resolved))
      .split(path.sep)
      .join('/');
    return relativePath.startsWith('.') ? relativePath : `./${relativePath}`;
  }

  return importPath;
}

// The node fields and factory signatures used below assume TypeScript 4.8 or
// later; newer releases name assertClause / assertions `attributes` instead.
export default function (_program?: ts.Program | null, _pluginOptions = {}) {
  return ((ctx) => {
    const { factory } = ctx;
    const compilerOptions = ctx.getCompilerOptions();

    return (sourceFile: ts.Bundle | ts.SourceFile) => {
      const { fileName } = sourceFile.getSourceFile();

      // Returns the rewritten node, or null when `node` needs no rewrite
      function traverseVisitor(node: ts.Node): ts.Node | null {
        let importValue: string | null = null;
        if (ts.isCallExpression(node)) {
          // require('...') and dynamic import('...')
          const { expression } = node;
          if (node.arguments.length === 0) return null;
          const arg = node.arguments[0];
          if (!ts.isStringLiteral(arg)) return null;
          // Compare the identifier text instead of calling getText(),
          // which cannot be used in the after step
          const isRequire =
            ts.isIdentifier(expression) && expression.text === 'require';
          if (!isRequire && expression.kind !== ts.SyntaxKind.ImportKeyword)
            return null;
          importValue = arg.text;
          // import, export
        } else if (
          ts.isImportDeclaration(node) ||
          ts.isExportDeclaration(node)
        ) {
          if (
            !node.moduleSpecifier ||
            !ts.isStringLiteral(node.moduleSpecifier)
          )
            return null;
          importValue = node.moduleSpecifier.text;
        } else if (
          // import('...') used as a type
          ts.isImportTypeNode(node) &&
          ts.isLiteralTypeNode(node.argument) &&
          ts.isStringLiteral(node.argument.literal)
        ) {
          importValue = node.argument.literal.text;
        } else if (ts.isModuleDeclaration(node)) {
          // declare module '...'
          if (!ts.isStringLiteral(node.name)) return null;
          importValue = node.name.text;
        } else {
          return null;
        }

        const newImport = replaceAlias(
          fileName,
          importValue,
          compilerOptions.paths,
          compilerOptions.baseUrl
        );

        if (!newImport || newImport === importValue) return null;

        const newSpec = factory.createStringLiteral(newImport);

        let newNode: ts.Node | null = null;

        if (ts.isImportTypeNode(node))
          newNode = factory.updateImportTypeNode(
            node,
            factory.createLiteralTypeNode(newSpec),
            node.assertions,
            node.qualifier,
            node.typeArguments,
            node.isTypeOf
          );

        if (ts.isImportDeclaration(node))
          newNode = factory.updateImportDeclaration(
            node,
            node.modifiers,
            node.importClause,
            newSpec,
            node.assertClause
          );

        if (ts.isExportDeclaration(node))
          newNode = factory.updateExportDeclaration(
            node,
            node.modifiers,
            node.isTypeOnly,
            node.exportClause,
            newSpec,
            node.assertClause
          );

        if (ts.isCallExpression(node))
          newNode = factory.updateCallExpression(
            node,
            node.expression,
            node.typeArguments,
            // Keep any arguments after the module path
            [newSpec, ...node.arguments.slice(1)]
          );

        if (ts.isModuleDeclaration(node))
          newNode = factory.updateModuleDeclaration(
            node,
            node.modifiers,
            newSpec,
            node.body
          );

        return newNode;
      }

      // Rewrite the node if it is an import-like node, otherwise keep descending
      function visitor(node: ts.Node): ts.Node {
        return traverseVisitor(node) || ts.visitEachChild(node, visitor, ctx);
      }
      return ts.visitNode(sourceFile, visitor);
    };
  }) as ts.TransformerFactory<ts.Bundle | ts.SourceFile>;
}
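To see the effect without setting up a full build, the idea can be reduced to a few lines. The sketch below is a simplified stand-in for the transformer above, not the transformer itself: it assumes a single wildcard alias, a baseUrl of `/repo/src`, and a made-up file path, and it uses `ts.transform` plus the printer so that no emit step is involved. The names `toRelative` and `rewriteImports` are hypothetical.

```typescript
import * as path from "path";
import * as ts from "typescript";

// Assumed layout for the demo: baseUrl is /repo/src and "@/package/*" maps to "./*".
const baseUrl = "/repo/src";
const aliasPrefix = "@/package/";

// Hypothetical, simplified stand-in for replaceAlias: handles one wildcard alias.
function toRelative(fileName: string, specifier: string): string {
  if (!specifier.startsWith(aliasPrefix)) return specifier;
  const target = path.posix.join(baseUrl, specifier.slice(aliasPrefix.length));
  const relative = path.posix.relative(path.posix.dirname(fileName), target);
  return relative.startsWith(".") ? relative : `./${relative}`;
}

const rewriteImports: ts.TransformerFactory<ts.SourceFile> = (context) => (sourceFile) => {
  const visitor = (node: ts.Node): ts.Node => {
    // Only string literals used as an import's module specifier are rewritten.
    if (ts.isStringLiteral(node) && ts.isImportDeclaration(node.parent)) {
      return context.factory.createStringLiteral(toRelative(sourceFile.fileName, node.text));
    }
    return ts.visitEachChild(node, visitor, context);
  };
  return ts.visitEachChild(sourceFile, visitor, context);
};

const input = ts.createSourceFile(
  "/repo/src/pages/home/index.ts",
  'import Components from "@/package/ui/header";\nimport fs from "fs";',
  ts.ScriptTarget.ES2015,
  /*setParentNodes*/ true
);

const result = ts.transform(input, [rewriteImports]);
const output = ts.createPrinter().printFile(result.transformed[0]);
result.dispose();

// The aliased import becomes relative to the importing file; "fs" is untouched.
console.log(output);
```

In a real build the transformer would be registered for both the JavaScript output and the declaration output (the afterDeclarations hook), since the `.d.ts` files contain the same aliased imports.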

References#

https://www.youtube.com/watch?v=BU0pzqyF0nw
https://github.com/basarat/typescript-book
https://github.com/itsdouges/typescript-transformer-handbook
https://github.com/LeDDGroup/typescript-transform-paths
https://github.com/nonara/ts-patch
https://github.com/LeDDGroup/typescript-transform-paths/blob/v1.0.0/src/index.ts
https://github.com/microsoft/TypeScript-Compiler-Notes