Post by "KevinShan", 08-16-2007, 17:18
-----------------------------------------------------
Introduction
Microsoft Intermediate Language (MSIL) is a language used as the output of a number of compilers (C#, VB.NET, and so forth). The ILDasm (Intermediate Language Disassembler) program that ships with the .NET Framework SDK (FrameworkSDK\Bin\ildasm.exe) lets the user see MSIL code in human-readable format. With this utility, we can open any .NET executable file (EXE or DLL) and see its MSIL code.
The ILAsm program (Intermediate Language Assembler) generates an executable file from the MSIL language. We can find this program in the WINNT\Microsoft.NET\Framework\vn.nn.nn directory.
Any Visual C++ programmer starting with .NET development is interested in what happens at the low level of the .NET Framework. Learning MSIL lets a user understand things that are hidden from a programmer working in C# or VB.NET, and that knowledge gives a .NET programmer more power. We never need to write programs in MSIL directly, but in some difficult cases it is very useful to open the MSIL code in ILDasm and see how things are done.
An MSIL reference in DOC format is available to a .NET developer and may be found in the Framework SDK directory:
FrameworkSDK\Tool Developers Guide\docs\Partition II Metadata.doc (Metadata Definition and Semantics). In this file, I found a description of all MSIL directives such as .entrypoint, .locals, and so on.
FrameworkSDK\Tool Developers Guide\docs\Partition III CIL.doc (CIL Instruction Set) contains a full list of the MSIL commands.
In my work I also used an ILDasm tutorial from MSDN and an excellent article in the May 2001 issue of MSDN Magazine: "ILDASM is Your New Best Friend" by John Robbins.
I think the best way to learn a language is to write some programs in it. That is why I decided to make several small MSIL programs. Actually, I didn't write this code; the C# compiler generated it. I made some minor changes and added a lot of notes describing how MSIL works.
Reading the sample projects attached to this article may help a .NET programmer understand Intermediate Language and easily read MSIL code when this is necessary.
General Information
All operations in MSIL are executed on the stack. When a function is called, its parameters and local variables are allocated on the stack. Starting from this stack state, the function code may push values onto the stack, perform operations on them, and pop values from the stack.
Both MSIL commands and functions are executed in three steps:
1. Push the command operands or function parameters onto the stack.
2. Execute the MSIL command or call the function. The command or function pops its operands (parameters) from the stack and pushes the result (return value) onto the stack.
3. Read the result from the stack.
Steps 1 and 3 are optional. For example, a void function doesn't push a return value onto the stack.
The stack contains objects of value types and references to objects of reference type. Reference type objects are kept in the heap.
MSIL commands used to push values onto the stack are called ld... (load). Commands used to pop values from the stack are called st... (store), because values are stored in variables. Therefore, we will call the push operation loading and the pop operation storing.
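The three steps can be seen in just a few instructions. The following is a minimal sketch rather than one of the attached sample projects: the assembly name StackDemo and the local name sum are made up for illustration, and the .locals, stloc, and ldloc syntax it relies on is covered in the XequalN section.

```il
.assembly extern mscorlib {}
.assembly StackDemo {}          // hypothetical assembly name

.method static public void main() cil managed
{
    .entrypoint
    .maxstack 2
    .locals init ([0] int32 sum)

    ldc.i4.2        // step 1: push the first operand      stack: 2
    ldc.i4.3        // step 1: push the second operand     stack: 2, 3
    add             // step 2: pops 2 and 3, pushes 5      stack: 5
    stloc.0         // step 3: pop the result into sum     stack: (empty)

    ldloc.0         // push sum as the function parameter  stack: 5
    // WriteLine pops its parameter; it is void, so nothing is pushed back
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
}
```

Assembled with ILAsm and run, this should print 5. The stack is empty again at ret, which a void function requires.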
Sample Projects
The code attached to this article contains a number of console applications written in MSIL. To build them, ensure that the ILAsm program is available through the PATH. Each project is set up as a Visual Studio solution. The source IL file may be opened in the VS text editor. The build command runs the ILAsm program, which generates an exe file in the project directory. The run command executes this file. At the end of each program, I added these lines, written here in C#:

    Console.WriteLine("Press Enter to continue");
    Console.Read();

This is done so that the program output can be seen when the program is run from Windows Explorer.
Here's a list of the included projects:
PrintString: prints a string to the console.
XequalN: assigns a value to an int variable and prints it to the console.
Operations: reads two numbers from the console; performs the operations +, -, and *; and shows the results.
Array1: allocates an int array, assigns values to its elements, and prints the elements and the array length.
Compare: reads two numbers and prints the minimum.
Array2: fills array elements in a loop and prints some elements.
Unsafe: uses unsafe pointers to access array elements.
PInvoke: calls the Win32 API.
Classes: works with classes.
Exception: handles exceptions.
I suggest that you read these projects in the same order as they are described here. In the projects' descriptions given below, I explain each new MSIL command used in the program and show some code fragments.
PrintString Program
PrintString is the MSIL Hello, World application.
MSIL directives used in the code are as follows:
.entrypoint: defines the application entry point (the function called by the .NET Runtime when the program starts).
.maxstack: defines the maximum stack depth used by the function code. The C# compiler always sets the exact value for each function. In the sample project, I set this value to 8.
MSIL commands are as follows:
ldstr string: loads a string constant onto the stack.
call function(parameters): calls a static function. Parameters for the function should be loaded onto the stack before this call.
pop: pops a value from the stack. Used when we don't need to store the value in a variable.
ret: returns from a function.
Calling a static function is simple. We push the function parameters onto the stack, call the function, and read the return value from the stack (if the function is not void). Console.WriteLine is an example of such a function.
Here is the code:

    .assembly extern mscorlib {}
    .assembly PrintString {}
    /*
        Console.WriteLine("Hello, World");
    */
    .method static public void main() cil managed
    {
        .entrypoint             // this function is the application
                                // entry point
        .maxstack 8
        // *****************************************************
        // Console.WriteLine("Hello, World");
        // *****************************************************
        ldstr "Hello, World"    // load string onto stack
        // Call static System.Console.WriteLine function
        // (function pops string from the stack)
        call void [mscorlib]System.Console::WriteLine(string)
        // *****************************************************
        ldstr "Press Enter to continue"
        call void [mscorlib]System.Console::WriteLine(string)
        // Call the System.Console.Read function
        call int32 [mscorlib]System.Console::Read()
        // The pop instruction removes the top element from the stack
        // (here, the number returned by the Read() function)
        pop
        // *****************************************************
        ret
    }
XequalN Program
The program assigns a value to the integer variable and prints it to the console window.
Commands:
ldc.i4.n: loads a 32-bit constant (n from 0 to 8) onto the stack.
stloc.n: stores a value from the stack to local variable number n (n from 0 to 3).
ldloc.n: loads local variable number n (n from 0 to 3) onto the stack.
Code:

    .assembly extern mscorlib {}
    .assembly XequalN {}
    // int x;
    // x = 7;
    // Console.WriteLine(x);
    .method static public void main() cil managed
    {
        .entrypoint
        .maxstack 8
        .locals init ([0] int32 x)  // allocate local variable
        // *****************************************************
        // x = 7;
        // *****************************************************
        ldc.i4.7                    // load constant onto stack
        stloc.0                     // store value from stack to var. 0
        // *****************************************************
        // Console.WriteLine(x);
        // *****************************************************
        ldloc.0                     // load var. 0 onto stack
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
    }
Operations Program
The program reads two numbers from the console, makes simple math operations with them, and shows the result.
Commands:
add: adds two values. The operands should be loaded onto the stack before this command. The command pops the operands and pushes the result onto the stack.
sub: subtracts the second value from the first.
mul: multiplies two values.
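The sketch below shows how these three commands might be used. It is an illustration written for this description, not the listing from the attached project; the assembly name OperationsSketch is hypothetical, and the prompts and the final "Press Enter" lines are left out to keep it short.

```il
.assembly extern mscorlib {}
.assembly OperationsSketch {}   // hypothetical assembly name

// Assumed C# equivalent:
//   int x = Int32.Parse(Console.ReadLine());
//   int y = Int32.Parse(Console.ReadLine());
//   Console.WriteLine(x + y);
//   Console.WriteLine(x - y);
//   Console.WriteLine(x * y);
.method static public void main() cil managed
{
    .entrypoint
    .maxstack 2
    .locals init ([0] int32 x,
                  [1] int32 y)

    call string [mscorlib]System.Console::ReadLine()
    call int32 [mscorlib]System.Int32::Parse(string)
    stloc.0                 // x = parsed value

    call string [mscorlib]System.Console::ReadLine()
    call int32 [mscorlib]System.Int32::Parse(string)
    stloc.1                 // y = parsed value

    ldloc.0                 // stack: x
    ldloc.1                 // stack: x, y
    add                     // stack: x + y
    call void [mscorlib]System.Console::WriteLine(int32)

    ldloc.0
    ldloc.1
    sub                     // stack: x - y (first value minus second)
    call void [mscorlib]System.Console::WriteLine(int32)

    ldloc.0
    ldloc.1
    mul                     // stack: x * y
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
}
```

Each arithmetic command pops two values and pushes one, so the stack never holds more than two items and .maxstack 2 is enough.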
Array1 Program
The program allocates the int array, assigns values to its elements, and then prints the elements and array length.
Commands:
newarr type: creates an array of type elements. The array size should be loaded onto the stack before this command. The command loads a reference to the array onto the stack.
stelem.i4: assigns a value to an array element. The value has type Int32. The array reference, index, and value should be loaded onto the stack before this command.
ldelema type: loads the address of an array element onto the stack. The array reference and index should be loaded onto the stack before this command. The address is used to call a non-static class function (see later).
ldlen: loads the length of an array onto the stack. The array reference should be loaded onto the stack before this command.
ldloca.s variable: loads the address of the variable onto the stack.
ldc.i4.s value: loads an Int32 constant onto the stack. The operand is a signed 8-bit value (-128 to 127), so this form is used for small constants, such as those above 8, that have no dedicated ldc.i4.n command.
conv.i4: converts the value on the stack to Int32.
call instance function(arguments): calls a non-static class function. Before a call to a non-static function, we need to load onto the stack the address of the class object (passed as the hidden first parameter, as in C++) and then the function arguments. In this sample, the object address is loaded using the ldelema and ldloca commands.
In some code fragments in this sample, I wrote the stack state in the notes, starting after the last local variable. In this sample, we also see a variable generated by the compiler. This variable is used to make the call to the non-static class function.
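As an illustration of the commands listed above, here is a minimal sketch. It is not the original Array1 listing, and it has no compiler-generated variable; the assembly name Array1Sketch and the array contents are made up for the example.

```il
.assembly extern mscorlib {}
.assembly Array1Sketch {}       // hypothetical assembly name

// Assumed C# equivalent:
//   int[] a = new int[2];
//   a[0] = 10;
//   a[1] = 20;
//   Console.WriteLine(a[1].ToString());
//   Console.WriteLine(a.Length);
.method static public void main() cil managed
{
    .entrypoint
    .maxstack 3
    .locals init ([0] int32[] a)

    ldc.i4.2                            // array size
    newarr [mscorlib]System.Int32       // stack: array reference
    stloc.0                             // a = new int[2]

    ldloc.0                             // array reference
    ldc.i4.0                            // index
    ldc.i4.s 10                         // value
    stelem.i4                           // a[0] = 10

    ldloc.0
    ldc.i4.1
    ldc.i4.s 20
    stelem.i4                           // a[1] = 20

    ldloc.0                             // array reference
    ldc.i4.1                            // index
    ldelema [mscorlib]System.Int32      // stack: address of a[1]
    // the address is the hidden first parameter of the instance call
    call instance string [mscorlib]System.Int32::ToString()
    call void [mscorlib]System.Console::WriteLine(string)

    ldloc.0
    ldlen                               // stack: length (native unsigned int)
    conv.i4                             // convert the length to Int32
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
}
```

With these values the program should print 20 and then 2.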
Compare Program
The program reads two numbers and prints their minimum.
Commands:
bge.s label: goes to label if value 1 is greater than or equal to value 2. Values 1 and 2 should be loaded onto the stack before this command.
br.s label: goes to label.
box value type: converts a value type to an Object and loads the Object's reference onto the stack.
Boxing in this program is caused by the C# line:

    Console.WriteLine("{0:d}", z);

Writing the line this way:

    Console.WriteLine(z.ToString());

doesn't cause boxing.
Code:

    .assembly extern mscorlib {}
    .assembly Compare {}
    /*
        int x, y, z;
        string s;
        Console.WriteLine("Enter x:");
        s = Console.ReadLine();
        x = Int32.Parse(s);
        Console.WriteLine("Enter y:");
        s = Console.ReadLine();
        y = Int32.Parse(s);
        if ( x < y )
            z = x;
        else
            z = y;
        Console.WriteLine("{0:d}", z);
    */
    .method static public void main() cil managed
    {
        .entrypoint
        .maxstack 8
        .locals init ([0] int32 x,
                      [1] int32 y,
                      [2] int32 z,
                      [3] string s)
        // *****************************************************
        // Console.WriteLine("Enter x:");
        // *****************************************************
        ldstr "Enter x:"            // load string onto stack
        call void [mscorlib]System.Console::WriteLine(string)
        // *****************************************************
        // s = Console.ReadLine();
        // *****************************************************
        call string [mscorlib]System.Console::ReadLine()
        stloc.3                     // store to var. 3
        // *****************************************************
        // x = Int32.Parse(s);
        // *****************************************************
        ldloc.3                     // load var. 3 onto stack
        call int32 [mscorlib]System.Int32::Parse(string)
        stloc.0                     // store to var. 0
        // The same operations for y ...
        // *****************************************************
        // branch
        // if ( x >= y ) goto L_GR;
        // *****************************************************
        ldloc.0                     // load x onto stack (value 1)
        ldloc.1                     // load y onto stack (value 2)
        bge.s L_GR                  // goto L_GR if value 1 is greater
                                    // than or equal to value 2
        // *****************************************************
        // z = x
        // *****************************************************
        ldloc.0                     // load variable 0 onto stack
        stloc.2                     // store to variable 2
        br.s L_CONTINUE             // goto L_CONTINUE
    L_GR:
        // *****************************************************
        // z = y
        // *****************************************************
        ldloc.1                     // load variable 1 onto stack
        stloc.2                     // store to variable 2
    L_CONTINUE:
        // *****************************************************
        // Console.WriteLine("{0:d}", z);
        // NOTE: this line causes boxing.
        // *****************************************************
        ldstr "{0:d}"               // load string onto stack
        ldloc.2                     // load variable 2 onto stack (z)
        box [mscorlib]System.Int32  // convert Int32 to Object
        call void [mscorlib]System.Console::WriteLine(string, object)
        ret
    }
Array2 Program
The program fills an array in a loop and prints its elements. This time, we add the static function ShowNumber(int), which is called from main.
Commands:
blt.s label: goes to label if value 1 is less than value 2. Values 1 and 2 should be loaded onto the stack before this command.
ldelem.i4: loads an array element onto the stack. A reference to the array and the index should be loaded onto the stack before this command.
ldarga.s argument: loads the address of the function argument onto the stack.
We can see in this program that the for loop is implemented in MSIL using labels.
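A for loop built from labels might look like the following sketch. It is an illustration, not the original Array2 listing: the assembly name, the label names L_BODY and L_TEST, the array contents, and the body of ShowNumber are all assumptions.

```il
.assembly extern mscorlib {}
.assembly Array2Sketch {}       // hypothetical assembly name

// Assumed C#: static void ShowNumber(int n) { Console.WriteLine(n.ToString()); }
.method static public void ShowNumber(int32 n) cil managed
{
    .maxstack 1
    ldarga.s n              // address of argument n (hidden parameter of ToString)
    call instance string [mscorlib]System.Int32::ToString()
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}

// Assumed C#:
//   int[] a = new int[5];
//   for (int i = 0; i < 5; i++) a[i] = i * 2;
//   ShowNumber(a[3]);
.method static public void main() cil managed
{
    .entrypoint
    .maxstack 4
    .locals init ([0] int32[] a,
                  [1] int32 i)

    ldc.i4.5
    newarr [mscorlib]System.Int32
    stloc.0                 // a = new int[5]

    ldc.i4.0
    stloc.1                 // i = 0
    br.s L_TEST             // test the loop condition first

L_BODY:
    ldloc.0                 // array reference
    ldloc.1                 // index i
    ldloc.1
    ldc.i4.2
    mul                     // value i * 2
    stelem.i4               // a[i] = i * 2

    ldloc.1
    ldc.i4.1
    add
    stloc.1                 // i++

L_TEST:
    ldloc.1                 // value 1: i
    ldc.i4.5                // value 2: 5
    blt.s L_BODY            // repeat the body while i < 5

    ldloc.0
    ldc.i4.3
    ldelem.i4               // stack: a[3]
    call void ShowNumber(int32)
    ret
}
```

Placing the condition after the body and jumping to it first keeps a single conditional branch per iteration. With these values, ShowNumber should print 6.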
Unsafe Program
The program fills and prints the int array using an unsafe pointer.
In this program, we see the new, unsafe types: int32* and int32&. The pinned keyword, used with a local variable, prevents GC from moving the object pointed to by the variable.
Commands:
dup: duplicates the value on top of the stack.
stind.i4: stores an Int32 value at an address. The address and value should be loaded onto the stack before this command.
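The sketch below shows one way the pinned keyword, the pointer types, dup, and stind.i4 can fit together. It is an assumption-laden illustration and not the original Unsafe listing: the names are invented, and the code is unverifiable, so it is expected to run only where unverifiable code is permitted.

```il
.assembly extern mscorlib {}
.assembly UnsafeSketch {}       // hypothetical assembly name

// Assumed C# equivalent:
//   int[] a = new int[2];
//   fixed (int* q = a) { *q = 7; *(q + 1) = 8; }
//   Console.WriteLine(a[0]);
//   Console.WriteLine(a[1]);
.method static public void main() cil managed
{
    .entrypoint
    .maxstack 2
    .locals init ([0] int32[] a,
                  [1] int32& pinned p,  // pinning: GC will not move the array
                  [2] int32* q)         // unmanaged pointer

    ldc.i4.2
    newarr [mscorlib]System.Int32
    stloc.0

    ldloc.0
    ldc.i4.0
    ldelema [mscorlib]System.Int32      // managed pointer to a[0]
    stloc.1                             // pin the array through p

    ldloc.1
    conv.i                              // convert to an unmanaged pointer
    dup                                 // stack: ptr, ptr
    stloc.2                             // q = ptr        stack: ptr
    ldc.i4.7
    stind.i4                            // *q = 7         stack: (empty)

    ldloc.2
    ldc.i4.4                            // size of one Int32 in bytes
    add                                 // address of the next element
    ldc.i4.8
    stind.i4                            // *(q + 1) = 8

    ldc.i4.0
    conv.u
    stloc.1                             // clear p to unpin the array

    ldloc.0
    ldc.i4.0
    ldelem.i4
    call void [mscorlib]System.Console::WriteLine(int32)
    ldloc.0
    ldc.i4.1
    ldelem.i4
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
}
```

The array stays pinned only while p holds its address, which is why the sketch clears p before reading the elements back through the normal array commands.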
PInvoke Program
The program shows the computer name using the Win32 API functions GetComputerName and MessageBox. API declarations in MSIL look like this:

    .method public hidebysig static pinvokeimpl("kernel32.dll" autochar winapi)
            int32 GetComputerName(
                class [mscorlib]System.Text.StringBuilder marshal(lptstr) buffer,
                int32& size) cil managed preservesig
    {
    }
    .method public hidebysig static pinvokeimpl("User32.dll" autochar winapi)
            int32 MessageBox(native int hWnd,
                             string marshal(lptstr) lpText,
                             string marshal(lptstr) lpCaption,
                             int32 uType) cil managed preservesig
    {
    }
They are called by the same rules as any other functions.
Classes Program
In previous programs, we called the class functions from the static function main. In this program, we will see how to write classes. The program contains two classes: Class1, with function main; and SampleClass, created in main.
Directive:
.field: defines a class member. Used with keywords such as public, private, static, and so forth.
Commands:
stsfld static field: replaces the value of the static field with the value from the stack.
ldfld field: loads a non-static class field onto the stack. The address of the class instance should be loaded onto the stack before this command.
ldarg.n: loads argument number n onto the stack. In a non-static class function, argument 0 is a hidden argument and points to the this instance.
newobj constructor: creates a new instance of a class using the constructor. Constructor parameters should be loaded onto the stack before this call. A reference to the created instance is loaded onto the stack.
callvirt instance function: calls a late-bound method on an object.
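A minimal sketch of two such classes follows. It is not the original Classes listing: the field names m_n and s_count and the method GetValue are invented for illustration, and the constructor uses stfld (store to a non-static field), a command that is not in the list above.

```il
.assembly extern mscorlib {}
.assembly ClassesSketch {}      // hypothetical assembly name

.class public auto ansi SampleClass extends [mscorlib]System.Object
{
    .field private int32 m_n                // hypothetical instance field
    .field public static int32 s_count      // hypothetical static field

    .method public hidebysig specialname rtspecialname
            instance void .ctor(int32 n) cil managed
    {
        .maxstack 2
        ldarg.0                 // this
        call instance void [mscorlib]System.Object::.ctor()
        ldarg.0                 // this
        ldarg.1                 // n
        stfld int32 SampleClass::m_n        // this.m_n = n
        ret
    }

    .method public hidebysig instance int32 GetValue() cil managed
    {
        .maxstack 1
        ldarg.0                 // hidden argument 0: this
        ldfld int32 SampleClass::m_n
        ret
    }
}

.class public auto ansi Class1 extends [mscorlib]System.Object
{
    .method public hidebysig static void main() cil managed
    {
        .entrypoint
        .maxstack 1
        .locals init ([0] class SampleClass c)

        ldc.i4.1
        stsfld int32 SampleClass::s_count   // SampleClass.s_count = 1

        ldc.i4.7                            // constructor parameter
        newobj instance void SampleClass::.ctor(int32)
        stloc.0                             // c = new SampleClass(7)

        ldloc.0                             // instance reference
        callvirt instance int32 SampleClass::GetValue()
        call void [mscorlib]System.Console::WriteLine(int32)
        ret
    }
}
```

Run as written, the sketch should print 7. GetValue is not virtual, but callvirt may still be used to call it, which also adds a null check on the instance.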
Exception Program
The program divides two numbers, catching a divide-by-zero exception. The try/catch block in MSIL looks like it does in C#.
Command:
leave.s label: leaves a protected block such as try or catch.
Code:

    .assembly extern mscorlib {}
    .assembly Exception {}
    /*
        int x, y, z;
        string s;
        Console.WriteLine("Enter x:");
        s = Console.ReadLine();
        x = Int32.Parse(s);
        Console.WriteLine("Enter y:");
        s = Console.ReadLine();
        y = Int32.Parse(s);
        try
        {
            z = x / y;
            Console.WriteLine(z.ToString());
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
    */
    .method static public void main() cil managed
    {
        .entrypoint
        .maxstack 8
        .locals init ([0] int32 x,
                      [1] int32 y,
                      [2] int32 z,
                      [3] string s,
                      [4] class [mscorlib]System.Exception e)
        // Enter x, y ...
        .try
        {
            // *************************************************
            // z = x / y;
            // *************************************************
            ldloc.0                 // load var. 0
            ldloc.1                 // load var. 1
            div                     // divide
            stloc.2                 // store in var. 2
            // *************************************************
            // Console.WriteLine(z.ToString());
            // *************************************************
            ldloca.s z              // load address of z
            call instance string [mscorlib]System.Int32::ToString()
            call void [mscorlib]System.Console::WriteLine(string)
            leave.s END_TRY_CATCH   // exit try block
        }
        catch [mscorlib]System.Exception
        {
            stloc.s e               // store the thrown exception, which
                                    // is on the stack, in e
            // *************************************************
            // Console.WriteLine(e.Message);
            // *************************************************
            ldloc.s e               // load e
            callvirt instance string [mscorlib]System.Exception::get_Message()
            call void [mscorlib]System.Console::WriteLine(string)
            leave.s END_TRY_CATCH   // exit catch block
        }
    END_TRY_CATCH:
        ret
    }