.NET 8 Advanced Internals: How the JIT Splits IL into BasicBlocks

Preface

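Before the JIT starts compiling a method to machine code, it has to do two things. First, it records the method's parameters, local variables, and call instructions in memory, and uses the types of those parameters, locals, and calls to work out how much stack space to reserve. Second, it splits the IL into BasicBlocks. This article looks at the second step.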

Overview

Let's walk through an example:
C# Code

static void Main()
{
    int[] array = new int[10_000_000];
    for (int i = 0; i < 1_000_000; i++)
    {
        Test(array);
    }
}

IL Code

.method private hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       35 (0x23)
  .maxstack  2
  .locals init (int32[] V_0,
           int32 V_1)
  IL_0000:  ldc.i4     0x989680
  IL_0005:  newarr     [System.Runtime]System.Int32
  IL_000a:  stloc.0
  IL_000b:  ldc.i4.0
  IL_000c:  stloc.1
  IL_000d:  br.s       IL_001a
  IL_000f:  ldloc.0
  IL_0010:  call       bool Program::Test(int32[])
  IL_0015:  pop
  IL_0016:  ldloc.1
  IL_0017:  ldc.i4.1
  IL_0018:  add
  IL_0019:  stloc.1
  IL_001a:  ldloc.1
  IL_001b:  ldc.i4     0xf4240
  IL_0020:  blt.s      IL_000f
  IL_0022:  ret
} // end of method Program::Main

Notice that this IL contains two jumps:

  IL_000d:  br.s       IL_001a
  IL_0020:  blt.s      IL_000f

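The JIT uses these two jumps to split the IL into four BasicBlocks. A new block starts at the method entry, at every jump target, and at the instruction that follows a jump: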

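The BasicBlock (BB) ranges are as follows. Each range runs from its start offset up to, but not including, its end offset.
BB01: IL_0000 to IL_000F, that is, from the first IL instruction up to the instruction that follows the first jump. The other blocks follow the same pattern.
BB02: IL_000F to IL_001A.
BB03: IL_001A to IL_0022.
BB04: IL_0022 to IL_0023.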
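The splitting rule described above can be sketched in a few lines of C#. This is only an illustration under simplifying assumptions, not the JIT's actual implementation: the `Branch` record and the `BlockSplitter.Split` method are hypothetical names introduced here, the branch list is written by hand instead of being decoded from IL, and cases a real JIT has to deal with (switch tables, exception-handling regions, and so on) are ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a branch instruction (not a JIT type):
// its IL offset, its encoded size in bytes, and the offset it jumps to.
public readonly record struct Branch(int Offset, int Size, int Target);

public static class BlockSplitter
{
    // Returns half-open [Start, End) IL ranges, one per basic block.
    public static List<(int Start, int End)> Split(int codeSize, IEnumerable<Branch> branches)
    {
        // A "leader" is the first instruction of a block: the method entry,
        // every branch target, and every instruction that follows a branch.
        var leaders = new SortedSet<int> { 0 };
        foreach (var b in branches)
        {
            leaders.Add(b.Target);
            int next = b.Offset + b.Size;
            if (next < codeSize)
                leaders.Add(next);
        }

        // Each block runs from one leader up to the next leader (or the end of the method).
        var starts = leaders.ToList();
        var blocks = new List<(int Start, int End)>();
        for (int i = 0; i < starts.Count; i++)
        {
            int end = i + 1 < starts.Count ? starts[i + 1] : codeSize;
            blocks.Add((starts[i], end));
        }
        return blocks;
    }
}

public static class Program
{
    public static void Main()
    {
        // The two jumps from the IL above; br.s and blt.s are both 2 bytes long.
        var branches = new[]
        {
            new Branch(0x0D, 2, 0x1A), // IL_000d: br.s  IL_001a
            new Branch(0x20, 2, 0x0F), // IL_0020: blt.s IL_000f
        };

        var blocks = BlockSplitter.Split(0x23, branches); // code size 35 (0x23)
        for (int i = 0; i < blocks.Count; i++)
            Console.WriteLine($"BB{i + 1:D2} [{blocks[i].Start:X3}..{blocks[i].End:X3})");
    }
}
```

For the example method this prints the four ranges BB01 [000..00F), BB02 [00F..01A), BB03 [01A..022) and BB04 [022..023). A sorted set is used so that the leaders come out in IL order and duplicates (for example, a jump target that also follows another jump) collapse automatically.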

The IL therefore falls into the following segments:

Segment 1 (BB01):
  IL_0000:  ldc.i4     0x989680
  IL_0005:  newarr     [System.Runtime]System.Int32
  IL_000a:  stloc.0
  IL_000b:  ldc.i4.0
  IL_000c:  stloc.1
  IL_000d:  br.s       IL_001a
Segment 2 (BB02):
  IL_000f:  ldloc.0
  IL_0010:  call       bool Program::Test(int32[])
  IL_0015:  pop
  IL_0016:  ldloc.1
  IL_0017:  ldc.i4.1
  IL_0018:  add
  IL_0019:  stloc.1
Segment 3 (BB03):
  IL_001a:  ldloc.1
  IL_001b:  ldc.i4     0xf4240
  IL_0020:  blt.s      IL_000f
Segment 4 (BB04):
  IL_0022:  ret

The BasicBlocks that the JIT finally produces look like this in its dump:

***** BB01
STMT00000 ( 0x000[E-] ... 0x00A )
               [000004] -ACXG------                         *  ASG       ref
               [000003] D------N---                         +--*  LCL_VAR   ref    V00 loc0
               [000002] --CXG------                         \--*  CALL help ref    CORINFO_HELP_NEWARR_1_VC
               [000001] H---------- arg0                       +--*  CNS_INT(h) long   0x7ff8c283f050 class
               [000000] ----------- arg1                       \--*  CNS_INT   long   0x989680

***** BB01
STMT00001 ( 0x00B[E-] ... 0x00C )
               [000007] -A---------                         *  ASG       int
               [000006] D------N---                         +--*  LCL_VAR   int    V01 loc1
               [000005] -----------                         \--*  CNS_INT   int    0

------------ BB02 [00F..01A), preds={BB03} succs={BB03}

***** BB02
STMT00003 ( 0x00F[E-] ... 0x015 )
               [000013] --C-G------                         *  CALL      int    Program:Test(int[]):bool
               [000012] ----------- arg0                    \--*  LCL_VAR   ref    V00 loc0

***** BB02
STMT00004 ( 0x016[E-] ... 0x019 )
               [000018] -A---------                         *  ASG       int
               [000017] D------N---                         +--*  LCL_VAR   int    V01 loc1
               [000016] -----------                         \--*  ADD       int
               [000014] -----------                            +--*  LCL_VAR   int    V01 loc1
               [000015] -----------                            \--*  CNS_INT   int    1

------------ BB03 [01A..022) -> BB02 (cond), preds={BB01,BB02} succs={BB04,BB02}

***** BB03
STMT00002 ( 0x01A[E-] ... 0x020 )
               [000011] -----------                         *  JTRUE     void
               [000010] -----------                         \--*  LT        int
               [000008] -----------                            +--*  LCL_VAR   int    V01 loc1
               [000009] -----------                            \--*  CNS_INT   int    0xF4240

------------ BB04 [022..023) (return), preds={BB03} succs={}

***** BB04
STMT00005 ( 0x022[E-] ... 0x022 )
               [000019] -----------                         *  RETURN    void
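The preds={...} and succs={...} annotations in the dump can be reproduced from the block ranges and from how each block ends. The sketch below is a rough illustration, not the JIT's own data structures: `EndKind`, `Block`, and `FlowGraph` are hypothetical names, and the way each block ends is filled in by hand from the IL (BB01 ends in an unconditional br.s, BB02 falls through, BB03 ends in a conditional blt.s, BB04 returns).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical description of how a block ends (not a JIT type).
public enum EndKind { FallThrough, Always, Cond, Return }

// A block covers the half-open IL range [Start, End); Target is only used for jumps.
public sealed record Block(int Start, int End, EndKind Kind, int Target = -1);

public static class FlowGraph
{
    // For each block index, returns the indices of the blocks control can reach next.
    public static List<int>[] Successors(IReadOnlyList<Block> blocks)
    {
        // Map a block's start offset back to its index so jump targets can be resolved.
        var indexOf = new Dictionary<int, int>();
        for (int i = 0; i < blocks.Count; i++)
            indexOf[blocks[i].Start] = i;

        var succs = new List<int>[blocks.Count];
        for (int i = 0; i < blocks.Count; i++)
        {
            var b = blocks[i];
            var s = new List<int>();
            // Fall-through and conditional blocks can continue into the next block.
            bool fallsThrough = b.Kind is EndKind.FallThrough or EndKind.Cond;
            if (fallsThrough && i + 1 < blocks.Count)
                s.Add(i + 1);
            // Unconditional and conditional jumps can continue at the jump target.
            if (b.Kind is EndKind.Always or EndKind.Cond)
                s.Add(indexOf[b.Target]);
            succs[i] = s;
        }
        return succs;
    }

    // Predecessors are the successor edges read in the opposite direction.
    public static List<int>[] Predecessors(List<int>[] succs)
    {
        var preds = new List<int>[succs.Length];
        for (int i = 0; i < succs.Length; i++)
            preds[i] = new List<int>();
        for (int i = 0; i < succs.Length; i++)
            foreach (int s in succs[i])
                preds[s].Add(i);
        return preds;
    }
}

public static class Program
{
    private static string Names(IEnumerable<int> indices) =>
        string.Join(",", indices.Select(j => $"BB{j + 1:D2}"));

    public static void Main()
    {
        var blocks = new[]
        {
            new Block(0x00, 0x0F, EndKind.Always, 0x1A), // ends with br.s  IL_001a
            new Block(0x0F, 0x1A, EndKind.FallThrough),  // runs on into BB03
            new Block(0x1A, 0x22, EndKind.Cond, 0x0F),   // ends with blt.s IL_000f
            new Block(0x22, 0x23, EndKind.Return),       // ends with ret
        };

        var succs = FlowGraph.Successors(blocks);
        var preds = FlowGraph.Predecessors(succs);
        for (int i = 0; i < blocks.Length; i++)
            Console.WriteLine($"BB{i + 1:D2} preds=" + "{" + Names(preds[i]) + "}"
                              + " succs=" + "{" + Names(succs[i]) + "}");
    }
}
```

With the four blocks of the example, the BB03 line comes out as preds={BB01,BB02} succs={BB04,BB02}, matching the header in the dump: the loop body (BB02) and the entry block (BB01) both flow into the loop condition (BB03), which either exits to BB04 or jumps back to BB02.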

From this point on, all of the JIT's work operates on BasicBlocks.