Tuesday, February 18, 2020

C# internals : How does the stack work (call stack)

I propose we look at what happens behind the simple lines that initialize objects, call methods, and pass parameters. And, of course, we will put this information to practical use: we will read the stack of the calling method.

Disclaimer


Before proceeding, I strongly recommend reading the first post about StructLayout; it contains an example that is used in this article.

All the low-level code shown here was produced in debug mode, because that shows the conceptual basis. JIT optimization is a big separate topic that will not be covered here.

I would also like to warn you that this article does not contain material that should be used in real projects.

First — theory


Any code eventually becomes a sequence of machine instructions. The most readable representation of them is assembly language, whose instructions correspond directly to one (or several) machine instructions.

Before turning to a simple example, let's get acquainted with the stack. The stack is primarily a chunk of memory that is used, as a rule, to store various kinds of data (usually data that can be called temporary). It is also worth remembering that the stack grows towards smaller addresses: the later an object is placed on the stack, the lower its address.

Now let's take a look at the following piece of C# code and the assembly generated for it (I have omitted some of the calls that are inherent in debug mode).

C#:
public class StubClass 
{
    public static int StubMethod(int fromEcx, int fromEdx, int fromStack) 
    {
        int local = 5;
        return local + fromEcx + fromEdx + fromStack;
    }
    
    public static void CallingMethod()
    {
        int local1 = 7, local2 = 8, local3 = 9;
        int result = StubMethod(local1, local2, local3);
    }
}

Asm:
StubClass.StubMethod(Int32, Int32, Int32)
    1: push ebp
    2: mov ebp, esp
    3: sub esp, 0x10
    4: mov [ebp-0x4], ecx
    5: mov [ebp-0x8], edx
    6: xor edx, edx
    7: mov [ebp-0xc], edx
    8: xor edx, edx
    9: mov [ebp-0x10], edx
    10: nop
    11: mov dword [ebp-0xc], 0x5
    12: mov eax, [ebp-0xc]
    13: add eax, [ebp-0x4]
    14: add eax, [ebp-0x8]
    15: add eax, [ebp+0x8]
    16: mov [ebp-0x10], eax
    17: mov eax, [ebp-0x10]
    18: mov esp, ebp
    19: pop ebp
    20: ret 0x4

StubClass.CallingMethod()
    1: push ebp
    2: mov ebp, esp
    3: sub esp, 0x14
    4: xor eax, eax
    5: mov [ebp-0x14], eax
    6: xor edx, edx
    7: mov [ebp-0xc], edx
    8: xor edx, edx
    9: mov [ebp-0x8], edx
    10: xor edx, edx
    11: mov [ebp-0x4], edx
    12: xor edx, edx
    13: mov [ebp-0x10], edx
    14: nop
    15: mov dword [ebp-0x4], 0x7
    16: mov dword [ebp-0x8], 0x8
    17: mov dword [ebp-0xc], 0x9
    18: push dword [ebp-0xc]
    19: mov ecx, [ebp-0x4]
    20: mov edx, [ebp-0x8]
    21: call StubClass.StubMethod(Int32, Int32, Int32)
    22: mov [ebp-0x14], eax
    23: mov eax, [ebp-0x14]
    24: mov [ebp-0x10], eax
    25: nop
    26: mov esp, ebp
    27: pop ebp
    28: ret

The first thing to notice is the EBP and ESP registers and the operations on them.

A common misconception among my friends is that the EBP register is somehow related to the pointer to the top of the stack. I must say that it is not.

The ESP register is responsible for pointing to the top of the stack. Accordingly, with each PUSH instruction (which puts a value on top of the stack) the value of ESP is decremented (the stack grows towards smaller addresses), and with each POP instruction it is incremented. The CALL instruction also pushes the return address onto the stack and thereby decrements ESP. In fact, ESP changes not only when these instructions are executed (for example, when an interrupt occurs, the same thing happens as with the CALL instruction).
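To make the direction of growth concrete, here is a minimal sketch of a toy stack in C#. It is only a model: the `ToyStack` class, its dictionary-based memory and the starting address 0x1000 are all assumptions made for illustration, not anything the CLR exposes.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a 32-bit stack: memory is a dictionary keyed by address,
// and Esp always holds the address of the most recently pushed value.
public class ToyStack
{
    private readonly Dictionary<uint, uint> memory = new Dictionary<uint, uint>();

    public uint Esp { get; private set; }

    public ToyStack(uint initialEsp)
    {
        Esp = initialEsp;
    }

    // PUSH: move ESP down by 4 bytes, then store the value there.
    public void Push(uint value)
    {
        Esp -= 4;
        memory[Esp] = value;
    }

    // POP: read the value at ESP, then move ESP up by 4 bytes.
    public uint Pop()
    {
        uint value = memory[Esp];
        Esp += 4;
        return value;
    }
}

public static class ToyStackDemo
{
    public static void Main()
    {
        var stack = new ToyStack(0x1000);   // hypothetical starting address
        stack.Push(7);                      // ESP: 0x1000 -> 0x0FFC
        stack.Push(8);                      // ESP: 0x0FFC -> 0x0FF8
        Console.WriteLine($"ESP after two pushes: 0x{stack.Esp:X}");
        Console.WriteLine($"Popped: {stack.Pop()}, ESP: 0x{stack.Esp:X}");
    }
}
```

The value pushed last ends up at the lowest address, and popping it moves ESP back up by 4.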

Let's consider StubMethod().

In the first line, the content of the EBP register is saved (it is pushed onto the stack). Before returning from the function, this value will be restored.

The second line stores the current address of the top of the stack (the value of ESP is copied into EBP). Next, we move the top of the stack by as many bytes as we need to store local variables and parameters (third line). This is something like memory allocation for all local needs: the stack frame. From now on, the EBP register is the reference point in the context of the current call, and addressing is based on its value.

All of the above is called the function prologue.

After that, variables on the stack are accessed via the stored EBP register, which points to the place where the variables of this method begin. Next comes the initialization of local variables.

Fastcall reminder: .NET uses the fastcall calling convention.
The calling convention governs the location and the order of the parameters passed to a function.
The first and second parameters are passed via the ECX and EDX registers, respectively; the subsequent parameters are passed via the stack. (This is for 32-bit systems, as always. On 64-bit systems, four parameters are passed through registers: RCX, RDX, R8 and R9.)

For non-static methods, the first parameter is implicit and contains the address of the instance on which the method is called (the this reference).
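The rule above can be sketched as a small helper that says where each parameter would travel on 32-bit. The `FastcallSketch` class and its `Locate` method are hypothetical names, and the sketch assumes every parameter fits into a 4-byte register (ints and references); larger value types and floating-point arguments follow extra rules that are not modelled here.

```csharp
using System;
using System.Collections.Generic;

public static class FastcallSketch
{
    // Returns, for each parameter in declaration order, where the 32-bit
    // convention described above would place it. Assumes every parameter
    // is register-sized (an assumption of this sketch).
    public static List<string> Locate(string[] parameterNames, bool isInstanceMethod)
    {
        var all = new List<string>();
        if (isInstanceMethod)
        {
            all.Add("this");            // implicit first parameter
        }
        all.AddRange(parameterNames);

        var result = new List<string>();
        for (int i = 0; i < all.Count; i++)
        {
            string location = i == 0 ? "ECX" : i == 1 ? "EDX" : "stack";
            result.Add($"{all[i]} -> {location}");
        }
        return result;
    }

    public static void Main()
    {
        // Static StubMethod(int fromEcx, int fromEdx, int fromStack) from the listing.
        foreach (string line in Locate(new[] { "fromEcx", "fromEdx", "fromStack" }, false))
        {
            Console.WriteLine(line);
        }
    }
}
```

For an instance method, `this` takes ECX, so only one explicit parameter still fits into a register.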

In lines 4 and 5, the parameters that were passed through registers (the first two) are stored on the stack.

Next, the stack space reserved for local variables (the stack frame) is zeroed, and then the local variables are initialized.

It is worth mentioning that the result of a function is returned in the EAX register.

In lines 12-16, the required values are added together. Note line 15: it reads a value at an address above the start of our frame, that is, from the stack of the previous method. Before the call, the caller pushed a parameter onto the top of the stack, and here we read it. The result of the addition is taken from EAX and stored on the stack. Since this is the return value of StubMethod(), it is then loaded back into EAX. Of course, such absurd instruction sequences are inherent only in debug mode, but they show exactly what our code looks like without the smart optimizer that does the lion's share of the work.

In lines 18 and 19, the pointer to the top of the stack (its value at method entry) and the previous EBP (that of the calling method) are restored. The last line returns from the function. I will say more about the value 0x4 a bit later.

Such a sequence of commands is called a function epilogue.

Now let's take a look at CallingMethod(). Let's go straight to line 18. Here we put the third parameter on top of the stack. Note that we do this with the PUSH instruction, so the ESP value is decremented. The other two parameters are put into registers (fastcall). Next comes the call to StubMethod(). Now let's recall the RET 0x4 instruction. A natural question is: what is 0x4? As mentioned above, we pushed a parameter of the called function onto the stack, but after the call we no longer need it. The 0x4 indicates how many bytes must be removed from the stack when the function returns. Since there was one 4-byte parameter, 4 bytes are removed.

Here is a rough image of the stack:
Thus, if we look at what lies on the stack right after the method is entered, the first thing we see is the EBP value that was pushed onto the stack (in fact, this happened in the first line of the current method). The next thing is the return address. It determines where execution resumes after our function finishes (it is used by RET). Right after these fields we see the parameters of the current function (starting from the third; the first two are passed through registers). And behind them hides the stack of the calling method!

The first and second fields mentioned above (the saved EBP and the return address) explain the +0x8 offset used when we access parameters.
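The whole sequence (push the argument, CALL, prologue, read [ebp+0x8], epilogue, RET 0x4) can be replayed in a toy simulation. This is only a sketch: `FrameSketch`, the dictionary standing in for memory, the placeholder return address and the starting values of ESP and EBP are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class FrameSketch
{
    // Simulates the caller pushing one stack argument, CALL, the callee's
    // prologue, a read of [ebp+0x8], the epilogue and RET 0x4.
    // Returns the value the callee read and the final ESP.
    public static (uint argumentRead, uint finalEsp) Simulate(uint initialEsp, uint callerEbp, uint argument)
    {
        var memory = new Dictionary<uint, uint>();
        uint esp = initialEsp;
        uint ebp = callerEbp;
        const uint returnAddress = 0xDEADBEEF;    // placeholder value

        // Caller: push dword [third argument]
        esp -= 4; memory[esp] = argument;
        // Caller: call StubMethod (pushes the return address)
        esp -= 4; memory[esp] = returnAddress;

        // Callee prologue: push ebp / mov ebp, esp / sub esp, 0x10
        esp -= 4; memory[esp] = ebp;
        ebp = esp;
        esp -= 0x10;

        // Callee body: [ebp+0x0] is the saved EBP, [ebp+0x4] the return
        // address, so the first stack argument sits at [ebp+0x8].
        uint argumentRead = memory[ebp + 0x8];

        // Callee epilogue: mov esp, ebp / pop ebp
        esp = ebp;
        ebp = memory[esp]; esp += 4;

        // ret 0x4: pop the return address, then drop 4 more bytes of arguments.
        esp += 4;
        esp += 0x4;

        return (argumentRead, esp);
    }

    public static void Main()
    {
        var (value, esp) = Simulate(0x1000, 0x2000, 9);
        Console.WriteLine($"[ebp+0x8] = {value}, ESP restored: {esp == 0x1000}");
    }
}
```

After RET 0x4 the simulated ESP is back where it was before the caller pushed the argument, which is exactly the cleanup the 0x4 operand is responsible for.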

Accordingly, the parameters must be on top of the stack in a strictly defined order before the function call. Therefore, before calling the method, each parameter is pushed onto the stack.
But what if the caller does not push them, and the function reads them anyway?

Small example


So, all the facts above gave me an overwhelming desire to read the stack of the method that calls my method. The thought that the cherished data I wanted so much lies just one position beyond the third argument (the one closest to the stack of the calling method) did not let me sleep.

Thus, to read the stack of the calling method, I need to climb a little further than the parameters.

When referring to parameters, the calculation of the address of a particular parameter is based only on the fact that the caller has pushed them all onto the stack.

But the implicit passing of a parameter through EDX (see the previous post if you are interested) made me think that in some cases we can outsmart the compiler.

The tool I used to do this is called StructLayoutAttribute (all its features are described in the first post). //One day I will learn a bit more than only this attribute, I promise

We use the same favorite method with overlapped reference types.

At the same time, if the overlapped methods have a different number of parameters, the compiler does not push the required ones onto the stack (if only because it does not know which ones are needed).
However, the method that is actually called (the one at the same offset in a different type) reads from positive offsets relative to its frame, that is, from the places where it expects to find its parameters.

But nobody passed any parameters, so the method starts reading the stack of the calling method. And the address of the object (the one whose Id field is used in WriteLine()) sits exactly where the third parameter is expected.
using System;
using System.Runtime.InteropServices;

namespace Magic
{
    public class StubClass
    {
        public StubClass(int id)
        {
            Id = id;
        }

        public int Id;
    }

    [StructLayout(LayoutKind.Explicit)]
    public class CustomStructWithLayout
    {
        [FieldOffset(0)]
        public Test1 Test1;
        [FieldOffset(0)]
        public Test2 Test2;
    }
    public class Test1
    {
        public virtual void Useless(int skipFastcall1, int skipFastcall2, StubClass adressOnStack)
        {
            adressOnStack.Id = 189;
        }
    }
    public class Test2
    {
        public virtual int Useless()
        {
            return 888;
        }
    }

    class Program
    {
        static void Main()
        {
            Test2 objectWithLayout = new CustomStructWithLayout
            {
                Test2 = new Test2(),
                Test1 = new Test1()
            }.Test2;
            StubClass adressOnStack = new StubClass(3);
            objectWithLayout.Useless();
            Console.WriteLine($"MAGIC - {adressOnStack.Id}"); // MAGIC - 189
        }
    }
}

I will not show the assembly code here; everything is fairly clear there, but if there are any questions, I will try to answer them in the comments.

I understand perfectly that this example cannot be used in practice, but in my opinion it is very useful for understanding how things work in general.

Wednesday, February 12, 2020

Reading instance address from another method stack

Hello. This time we continue to make fun of the normal method call. I propose calling a method that takes parameters without passing any. We will also try to convert a reference to a number (its address) without using pointers or unsafe code.

Disclaimer


Before proceeding, I strongly recommend that you read the previous post about StructLayout. Here I will use some features that were described there.

I would also like to warn you that this article does not contain material that should be used in real projects.

Some initial information


Before we start practicing, let's recall how C# code is converted into assembly code.
Let us examine a simple example.
public class Helper 
{
    public virtual void Foo(int param)
    {
    }
}

public class Program 
{
    public void Main() 
    {
        Helper helper = new Helper();
        var param = 5;
        helper.Foo(param);
    }
}

This code does not contain anything difficult, but the instructions generated by the JIT contain several key points. I propose to look at only a small fragment of the generated code. In my examples I will use assembly code for 32-bit machines.
1: mov dword [ebp-0x8], 0x5
2: mov ecx, [ebp-0xc]
3: mov edx, [ebp-0x8]
4: mov eax, [ecx]
5: mov eax, [eax+0x28]
6: call dword [eax+0x10]

In this small example you can observe fastcall, a calling convention that uses registers to pass parameters: the first two parameters, from left to right, go into the ecx and edx registers, and the remaining parameters are passed through the stack from right to left. For non-static methods, the first (implicit) parameter is the address of the instance on which the method is called.

In our case the first parameter is the address of the instance and the second one is our int value.

So in the first line we see the local variable being set to 5; there is nothing interesting here.
In the second line, we copy the address of the Helper instance into the ecx register. At this address lies the pointer to the method table.
In the third line, the local variable (5) is copied into the edx register.
In the fourth line, the method table address is copied into the eax register.
The fifth line loads the value stored 40 bytes (0x28) past the start of the method table; it leads to the place where the method addresses begin. (The method table holds various other information before that point, for example the address of the base class method table, the EEClass address, various flags including the garbage collector flag, and so on.) Thus, eax now points to the start of the list of method addresses.
Note: In .NET Core the layout of the method table was changed. Now there is a field (at the 32/64 bit offset for 32-bit and 64-bit systems respectively) that contains the address of the start of the method list.
In the sixth line, the method is called through the entry at offset 16 from that start, that is, the fifth one in the method table. Why is our only method the fifth? Recall that object has 4 virtual methods (ToString(), Equals(), GetHashCode() and Finalize()), which every class therefore has as well.
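The arithmetic behind "offset 16 means the fifth method" can be sketched as follows. `VTableSketch` and its string array are hypothetical stand-ins for the real slot list; the order of the four inherited methods is simply the one given in the text above, and the pointer size of 4 assumes a 32-bit process.

```csharp
using System;

public static class VTableSketch
{
    // On a 32-bit process each slot holds a 4-byte code address,
    // so a byte offset into the slot list maps to an index like this.
    public static int SlotIndex(int byteOffset, int pointerSize = 4)
    {
        return byteOffset / pointerSize;
    }

    // A stand-in for the slot list of Helper's method table: the four
    // virtual methods inherited from object, followed by Helper.Foo.
    public static readonly string[] HelperSlots =
    {
        "Object.ToString",
        "Object.Equals",
        "Object.GetHashCode",
        "Object.Finalize",
        "Helper.Foo"
    };

    public static void Main()
    {
        int index = SlotIndex(0x10);    // offset from "call dword [eax+0x10]"
        Console.WriteLine($"slot {index}: {HelperSlots[index]}");
    }
}
```

Index 4 is the fifth entry, which is why the first virtual method a class declares itself lands right after the four it inherits from object.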

Goto Practice;


Practice:
It's time for a small demonstration. I suggest this small template (very similar to the one from the previous article).
    [StructLayout(LayoutKind.Explicit)]
    public class CustomStructWithLayout
    {
        [FieldOffset(0)]
        public Test1 Test1;

        [FieldOffset(0)]
        public Test2 Test2;
    }

    public class Test1
    {
        public virtual int Useless(int param)
        {
            Console.WriteLine(param);
            return param;
        }
    }

    public class Test2
    {
        public virtual int Useless()
        {
            return 888;
        }
    }

    public class Stub
    {
        public void Foo(int stub) { }
    }

And let's use it like this:
    class Program
    {
        static void Main(string[] args)
        {
            Test2 fake = new CustomStructWithLayout
            {
                Test2 = new Test2(),
                Test1 = new Test1()
            }.Test2;
            Stub bar = new Stub();
            int param = 55555;
            bar.Foo(param);
            fake.Useless();
            Console.Read();
        }
    }

As you might guess from the previous article, the Useless(int param) method of type Test1 will be called.

But what will be displayed? The attentive reader, I believe, has already answered this question. «55555» is displayed on the console.

But let's still look at the generated code fragments.
     mov ecx, [ebp-0x20]
     mov edx, [ebp-0x10]
     cmp [ecx], ecx
     call Stub.Foo(Int32)
     mov ecx, [ebp-0x1c]
     mov eax, [ecx]
     mov eax, [eax+0x28]
     call dword [eax+0x10]

I think you recognize the virtual method call pattern; it starts right after the Stub.Foo(Int32) call. As expected, ecx is filled with the address of the instance on which the method is called. But since the compiler thinks that we are calling a method of type Test2, which has no parameters, nothing is written to edx. However, there was another method call just before, and there edx was used to pass a parameter. And of course there are no instructions that clear edx. So, as the console output shows, the previous edx value was used.
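The effect can be imitated with a toy register file. This is only a model under stated assumptions: `ToyRegisters`, `StaleEdxSketch` and the two instance addresses are hypothetical, and the model assumes the first callee leaves EDX untouched, which holds for the empty Foo here but is not something a callee is obliged to do.

```csharp
using System;

public class ToyRegisters
{
    public uint Ecx;
    public uint Edx;
}

public static class StaleEdxSketch
{
    // Models Test1.Useless(int param): the callee trusts EDX to hold its argument.
    private static uint UselessWithParam(ToyRegisters regs)
    {
        return regs.Edx;
    }

    public static uint Run()
    {
        var regs = new ToyRegisters();

        // bar.Foo(param): the caller loads 'this' into ECX and 55555 into EDX.
        regs.Ecx = 0x1000;      // hypothetical address of 'bar'
        regs.Edx = 55555;
        // ... Foo runs; in this toy model it leaves EDX untouched ...

        // fake.Useless(): the caller believes there are no parameters,
        // so it only loads ECX and never writes EDX.
        regs.Ecx = 0x2000;      // hypothetical address of 'fake'
        return UselessWithParam(regs);
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

The second "call" never sets EDX, so the callee reads whatever the previous call left there.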

There is another interesting nuance. I deliberately used a value type. I suggest replacing the parameter type of the Foo method of the Stub type with any reference type, for example string, while leaving the parameter type of the Useless() method unchanged. Below you can see the result on my machine with some clarifying information: WinDBG and Calculator :)
The output window displays the address of the reference type in decimal notation.

Summary


We refreshed our knowledge of calling methods with the fastcall convention and immediately used the wonderful edx register to pass a parameter to 2 methods at once. We also disregarded all types and, knowing that everything is just bytes, easily obtained the address of an object without using pointers or unsafe code. Next I plan to use the obtained address for even more inapplicable purposes!

Thanks for your attention!

P.S. C# code can be found here

Tuesday, February 11, 2020

Fast call of private methods in C#

Hi. I would like to show you an example of using StructLayout for something more interesting than the usual examples with bytes, ints, and other primitive types, where everything is quite obvious.
Before proceeding to a lightning-fast violation of encapsulation, it is worth briefly recalling what StructLayout is. Strictly speaking, it is StructLayoutAttribute, an attribute that allows you to create structures and classes similar to a union in C++. This attribute lets you take control of the placement of class members in memory (using offsets). Accordingly, it is placed on the class.

Usually, if a class has 2 fields, we expect them to be arranged sequentially, that is they will be independent of each other (do not overlap). However, StructLayout allows you to specify that the location of the fields will be set not by the environment, but by the user. To specify the offset of the fields explicitly we should use the parameter LayoutKind.Explicit.

To indicate at which offset from the beginning of the class / structure (hereinafter just «class») we want to place a field, we need to put the FieldOffset attribute on it. This attribute takes as a parameter the number of bytes, that is, the offset from the beginning of the class. It is impossible to pass a negative value, so that the pointers to the method table or the sync block index cannot be spoiled. So it is going to be a little more complicated.
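Before moving on to reference types, here is a minimal sketch of the ordinary, well-behaved use of these attributes: a union of two primitive fields. The names `IntOrFloat` and `UnionSketch` are made up for this illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Both fields start at offset 0, so they share the same four bytes.
[StructLayout(LayoutKind.Explicit)]
public struct IntOrFloat
{
    [FieldOffset(0)]
    public int AsInt;

    [FieldOffset(0)]
    public float AsFloat;
}

public static class UnionSketch
{
    public static int BitsOf(float value)
    {
        var u = new IntOrFloat();
        u.AsFloat = value;      // write through one field...
        return u.AsInt;         // ...read the same bytes through the other
    }

    public static void Main()
    {
        Console.WriteLine($"0x{BitsOf(1.0f):X8}");
    }
}
```

Writing a float and reading it back as an int yields the raw bit pattern of the float, the same kind of reinterpretation that the rest of this post applies to object references.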

Let's start writing code with a simple example. Create the following class:

public class CustomClass
{
    public override string ToString()
    {
        return "CUSTOM";
    }

    public virtual object Field { get; } = new object();
}

Next, we use the above described mechanism for explicitly specifying field offsets.

[StructLayout(LayoutKind.Explicit)]
public class CustomStructWithLayout
{
    [FieldOffset(0)]
    public string Str;

    [FieldOffset(0)]
    public CustomClass SomeInstance;
}

For now, I'll postpone the explanations and use the written class as follows:

class Program
{
    static void Main(string[] args)
    {
        CustomStructWithLayout instance = new CustomStructWithLayout();
        instance.SomeInstance = new CustomClass();
        instance.Str = "4564";
        Console.WriteLine(instance.SomeInstance.GetType()); //System.String
        Console.WriteLine(instance.SomeInstance.ToString()); //4564
        Console.Read();
    }
}

Calling the GetType() method returns System.String, and the ToString() method misbehaves and gives us the string «4564».

Brain teaser: what will be displayed when the virtual property of CustomClass is read?

As you have already guessed, we created a CustomStructWithLayout in which both references are initially null, then initialized the field of our type, and then assigned a string to the Str field. As a result, the CustomClass reference does not point to a CustomClass object; it points to a System.String object (including its method table and sync block index). But the compiler still sees the field as having the type of our class.

For proof, here is a small clipping from WinDbg:

Here you can see several unusual things.

  • In the CustomStructWithLayout object the fields have different method table addresses (as expected), but the addresses of the objects are the same.
  • The second is that both fields are located at offset 4. I think most will understand, but just in case I will explain: the reference to the method table is located directly at the object's address. The fields begin at an offset of 4 bytes (on 32-bit), and the sync block index is located at an offset of -4. Thus, both fields are at the same offset.

Now that you have figured out what is happening, you can try using offsets to call what should not have been called.

For this, I repeated the structure of the string class in one of my classes. But I repeated only the beginning, since the string class is quite large and I am very lazy.

Note: consts and statics are not required, just for fun.

public class CustomClassLikeString
{
    public const int FakeAlignConst = 3;
    public const int FakeCharPtrAlignConst = 3;
    public static readonly object FakeStringEmpty;
    public char FakeFirstChar;
    public int FakeLength = 3;
    public const int FakeTrimBoth = 3;
    public const int FakeTrimHead = 3;
    public const int FakeTrimTail = 3;

    public CustomClassLikeString(){}
    public CustomClassLikeString(int a){}
    public CustomClassLikeString(byte a){}
    public CustomClassLikeString(short a){}
    public CustomClassLikeString(string a){}
    public CustomClassLikeString(uint a){}
    public CustomClassLikeString(ushort a){}
    public CustomClassLikeString(long a){ }

    public void Stub1() { }

    public virtual int CompareTo(object value)
    {
        return 800;
    }

    public virtual int CompareTo(string value)
    {
        return 801;
    }
}

Well, the structure with the layout will be changed a bit.

[StructLayout(LayoutKind.Explicit)]
public class CustomStructWithLayout
{
    [FieldOffset(0)]
    public string Str;

    [FieldOffset(0)]
    public CustomClassLikeString SomeInstance;
}

Then, when FakeLength is read or the CompareTo() method is called, the corresponding member of string is used (in this case), because these class members have the same offset relative to the address of the object itself.

Getting to the first private method of string that I could use would have taken too long, so I stopped at a public one. But the field is private, so everything is honest. By the way, the methods are made virtual to protect against any optimizations that would interfere (for example, inlining), and also so that the method is called through its offset in the method table.

So, performance. It is clear that the direct competitor in calling things that should not be called and in violating encapsulation is reflection. I think it is also clear that we are faster than it, because we do not analyze metadata. Exact values:

Here is a long piece of code showing how I measured performance (in case someone needs it):

[ClrJob]
[RPlotExporter, RankColumn]
[InProcessAttribute]
public class Benchmarking
{
    private CustomStructWithLayout instance;
    private string str;

    [GlobalSetup]
    public void Setup()
    {
        instance = new CustomStructWithLayout();
        instance.SomeInstance = new CustomClassLikeString();
        instance.Str = "4564";
        str = "4564";
    }

    [Benchmark]
    public int StructLayoutField()
    {
        return instance.SomeInstance.FakeLength;
    }

    [Benchmark]
    public int ReflectionField()
    {
        return (int)typeof(string).GetField("m_stringLength", BindingFlags.Instance | BindingFlags.NonPublic).GetValue(str);
    }

    [Benchmark]
    public int StructLayoutMethod()
    {
        return instance.SomeInstance.CompareTo("4564");
    }

    [Benchmark]
    public int ReflectionMethod()
    {
        return (int)typeof(string).GetMethod("CompareTo", new[] { typeof(string) }).Invoke(str, new[] { "4564" });
    }
}

class Program
{
    static void Main(string[] args)
    {
        var summary = BenchmarkRunner.Run<Benchmarking>();
    }
}
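For readers who have not used the reflection side of this comparison, here is a minimal sketch of reading a private field through metadata. The `Vault` class and its `secret` field are hypothetical and stand in for string and its private length field; the helper name `ReadPrivateInt` is also made up.

```csharp
using System;
using System.Reflection;

// Hypothetical class with a private field, standing in for string.
public class Vault
{
    private int secret = 42;

    public bool Check(int guess)
    {
        return guess == secret;
    }
}

public static class ReflectionSketch
{
    // Looks the field up by name in the type's metadata, then reads it
    // from the given instance. The lookup is the part that costs time.
    public static int ReadPrivateInt(object target, string fieldName)
    {
        FieldInfo field = target.GetType().GetField(
            fieldName, BindingFlags.Instance | BindingFlags.NonPublic);
        if (field == null)
        {
            throw new ArgumentException($"No private instance field '{fieldName}'.");
        }
        return (int)field.GetValue(target);
    }

    public static void Main()
    {
        Console.WriteLine(ReadPrivateInt(new Vault(), "secret"));
    }
}
```

Like the benchmark above, this sketch repeats the metadata lookup on every call; caching the FieldInfo would be a fairer baseline if only the read itself is of interest.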

Monday, February 3, 2020

Hello, World!

Hi, this is just a test post with no useful data.