#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2013
    Posts
    5
    Rep Power
    0

    1-line simple C function is not so simple


    Hi,

    I need to understand and reproduce (in another language)
    the logic of the following function (C code),
    and I don't really follow what it is doing.

    double __thiscall sub_1(int this) {

    return * (double *) (this + 12);

    }

    It compiles OK, but crashes when I run the .exe file.

    I'm not strong with C at all, and cannot work out
    what this set of operators is actually doing:
    * (double *)
    It doesn't look like dereferencing, because there are no pointers declared.

    Anyway, could someone tell me what the output of the function will be
    for input this = 5, and why?
    Last edited by King Arthur; December 12th, 2013 at 06:25 AM. Reason: syntax error
  2. #2
  3. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    It's interpreting that int as a pointer (via the cast to double* ) and then interpreting the contents of that memory address as a double value. If your target language does not support pointers, then good luck translating this.

    If you are running this under a modern operating system, such as Windows or Linux, then you are undoubtedly trying to access a memory location that your program is not allowed to access, in which case the OS protects itself from your malware program by terminating it with extreme prejudice. Hence the crash.

    You need to pass to it a valid address (albeit as an integer). But if a double value wasn't actually stored at that address plus 12, then you would get garbage.
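A minimal sketch of the point above, assuming a buffer we own stands in for the real object. The names `read_double_at_12` and `demo_valid_address` are hypothetical, and `uintptr_t` replaces the decompiler's `int` because a plain `int` cannot hold a pointer on 64-bit targets. It shows that the function only works when the integer really is the address of memory that holds a double 12 bytes in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the decompiled sub_1: treats `base` as an
   address and reads the 8 bytes at offset 12 as a double. */
double read_double_at_12(uintptr_t base)
{
    double value;
    assert(base != 0);  /* a null "address" is never valid */
    /* memcpy avoids alignment and aliasing problems of *(double *)(base + 12) */
    memcpy(&value, (const void *)(base + 12), sizeof value);
    return value;
}

/* Builds a valid 20-byte object, stores a double at offset 12,
   then reads it back through the integer-as-address interface. */
double demo_valid_address(void)
{
    unsigned char object[20] = {0};
    double stored = 3.5;
    memcpy(object + 12, &stored, sizeof stored);
    return read_double_at_12((uintptr_t)object);
}
```

Calling `read_double_at_12(5)` instead would try to read address 17, which an ordinary process does not own, so the operating system would be expected to terminate the program.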
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2013
    Posts
    5
    Rep Power
    0
    Originally Posted by dwise1_aol
    You need to pass to it a valid address (albeit as an integer)
    Sorry, I don't understand this part.
    My target language is Java, and it doesn't support pointers.

    What value will the function return?
  6. #4
  7. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    Originally Posted by dwise1
    If your target language does not support pointers, then good luck translating this.
    Like I said.

    If you pass it a valid address, it will return a double value which would be its interpretation of whatever happens to be in memory at the location it calculates (the address you pass it plus 12). If that location does not contain a double (refer to IEEE 754 for the required format), then it will misinterpret whatever bit pattern just happens to be in the eight bytes there (again, refer to IEEE 754).

    I suspect that there is code elsewhere in the program that writes data there, but that is just a guess. Also, it doesn't make any sense to me for the program to be keeping that address as an int, unless it requires the user to enter in an arbitrary address. Again, that is just a guess in the light of insufficient information being available.
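To illustrate the "misinterpreting whatever bit pattern happens to be there" point, here is a small sketch that reinterprets eight bytes as a double and back. The helper names `bits_to_double` and `double_to_bits` are made up for this example, and it assumes `double` is the 8-byte IEEE 754 binary64 format, which is the case on mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterprets a 64-bit pattern as a double without converting the value. */
double bits_to_double(uint64_t bits)
{
    double value;
    assert(sizeof value == sizeof bits);  /* assumes an 8-byte double */
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* The reverse direction: exposes the raw bit pattern of a double. */
uint64_t double_to_bits(double value)
{
    uint64_t bits;
    assert(sizeof value == sizeof bits);
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

For example, the pattern `0x4014000000000000` reads back as 5.0, whereas the integer 5 stored in those eight bytes reads back as a tiny subnormal number, not 5.0. In Java, `Double.longBitsToDouble` and `Double.doubleToRawLongBits` perform the same reinterpretation.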
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2013
    Posts
    5
    Rep Power
    0
    OK, I was trying to simplify things,
    but people would like to see the whole picture.

    The whole picture (if this would help) as follows:

    I'm trying to decompile a .dll file, and I have done so
    with the XRay decompiler, which produced some pseudo-C code.
    I need to understand and reproduce the code of one particular function
    from this dll, but because it is not one of the exported ones,
    I don't even know its original name and thus cannot call it (and test it directly) from the dll.

    This is the source code as the decompiler produced it:

    Code:
    double __thiscall sub_10001BD0(int this)
    {
      int v1; // esi@1
      char v2; // zf@9
      int v3; // edx@11
      int v4; // edx@12
      double v5; // st6@15
      double v6; // st6@17
      int v7; // edx@18
      char v8; // c0@18
      double v9; // st6@19
      double v10; // st6@23
      int v11; // eax@25
      double v12; // st6@25
      double v13; // st5@31
      double v14; // st7@34
      double v15; // st7@35
      double v16; // st5@38
      double v17; // st5@40
      double v18; // st7@44
    
      v1 = *(_DWORD *)(this + 24072);
      if ( v1 != 2 || *(_DWORD *)(this + 24088) > *(_DWORD *)(this + 23800) )
      {
        *(_BYTE *)(this + 24094) = 0;
        *(_BYTE *)(this + 24095) = 0;
      }
      if ( v1 == 2 )
      {
        if ( *(_BYTE *)(this + 24094) == 1 )
          *(_BYTE *)(this + 24095) = 1;
        if ( v1 == 2 )
          *(_BYTE *)(this + 24094) = 1;
      }
      v2 = *(_BYTE *)(this + 24095) == 0;
      *(_DWORD *)(this + 23800) = *(_DWORD *)(this + 24088);
      if ( v2 )
      {
        if ( *(_DWORD *)(this + 24056) != *(_DWORD *)(this + 24060) )
        {
          v3 = *(_DWORD *)(this + 8840);
          if ( v1 == 2 )
          {
            v4 = v3 - 1;
            if ( v4 < 1 )
              v4 = 129;
            if ( *(double *)(this + 8 * v4 + 3120) <= 0.0 )
              v5 = 0.0;
            else
              v5 = *(double *)(this + 23664) - *(double *)(this + 8 * v4 + 3120);
            v6 = v5 * 0.66 + *(double *)(this + 8864);
            *(_QWORD *)(this + 8856) = *(_QWORD *)&v6;
            *(_QWORD *)(this + 8864) = *(_QWORD *)&v6;
          }
          else
          {
            v8 = 0.0 < *(double *)(this + 8 * v3 + 3120);
            v7 = this + 8 * v3 + 3120;
            if ( v8 )
              v9 = *(double *)(this + 23664) - *(double *)v7;
            else
              v9 = 0.0;
            *(double *)(this + 8856) = v9 * 0.66 + *(double *)(this + 8864);
          }
        }
        if ( v1 == 2 )
          v10 = *(double *)(this + 8 * *(_DWORD *)(this + 8840) + 3120);
        else
          v10 = *(double *)(this + 23744);
        v11 = *(_DWORD *)(this + 24064);
        v12 = v10 - *(double *)(this + 8856);
        if ( (v11 != 1 || *(_DWORD *)(this + 24068) > 30) && (v11 || *(_DWORD *)(this + 24068) > 2160) )
        {
          v16 = *(double *)(this + 23744);
          if ( 0.0 == *(double *)(this + 8872) )
          {
            *(_QWORD *)(this + 8872) = *(_QWORD *)&v16;
            *(double *)(this + 8880) = *(double *)(this + 23744);
          }
          else
          {
            v17 = (v16 - *(double *)(this + 8880)) * 0.3333333333333333 + *(double *)(this + 8880);
            *(_QWORD *)(this + 8872) = *(_QWORD *)&v17;
            if ( v1 == 2 )
              *(_QWORD *)(this + 8880) = *(_QWORD *)&v17;
          }
          if ( 0.0 == *(double *)(this + 8888) )
          {
            *(double *)(this + 8888) = *(double *)(this + 8872);
            *(double *)(this + 8896) = *(double *)(this + 8872);
          }
          else
          {
            v18 = 0.3333333333333333 * (*(double *)(this + 8872) - *(double *)(this + 8896)) + *(double *)(this + 8896);
            *(_QWORD *)(this + 8888) = *(_QWORD *)&v18;
            if ( v1 == 2 )
              *(_QWORD *)(this + 8896) = *(_QWORD *)&v18;
          }
          v14 = *(double *)(this + 8888);
        }
        else
        {
          if ( 0.0 == *(double *)(this + 8872) )
          {
            *(_QWORD *)(this + 8872) = *(_QWORD *)&v12;
            *(_QWORD *)(this + 8880) = *(_QWORD *)&v12;
          }
          else
          {
            v13 = (v12 - *(double *)(this + 8880)) * 0.3333333333333333 + *(double *)(this + 8880);
            *(_QWORD *)(this + 8872) = *(_QWORD *)&v13;
            if ( v1 == 2 )
              *(_QWORD *)(this + 8880) = *(_QWORD *)&v13;
          }
          if ( 0.0 == *(double *)(this + 8888) )
          {
            *(double *)(this + 8888) = *(double *)(this + 8872);
            *(double *)(this + 8896) = *(double *)(this + 8872);
            v14 = *(double *)(this + 8888) + *(double *)(this + 8856);
          }
          else
          {
            v15 = 0.3333333333333333 * (*(double *)(this + 8872) - *(double *)(this + 8896)) + *(double *)(this + 8896);
            *(_QWORD *)(this + 8888) = *(_QWORD *)&v15;
            if ( v1 == 2 )
            {
              *(_QWORD *)(this + 8896) = *(_QWORD *)&v15;
              v14 = *(double *)(this + 8888) + *(double *)(this + 8856);
            }
            else
            {
              v14 = *(double *)(this + 8888) + *(double *)(this + 8856);
            }
          }
        }
        if ( v1 == 2 )
        {
          *(double *)(this + 12192) = *(double *)(this + 12184);
          *(double *)(this + 12184) = *(double *)(this + 12176);
          *(double *)(this + 12176) = *(double *)(this + 12168);
          *(double *)(this + 12168) = *(double *)(this + 12160);
          *(double *)(this + 12160) = *(double *)(this + 12152);
          *(double *)(this + 12152) = *(double *)(this + 12144);
          *(double *)(this + 12144) = *(double *)(this + 12136);
        }
        *(_QWORD *)(this + 12136) = *(_QWORD *)&v14;
      }
      return *(double *)(this + 12136);
    }

    It is probably not the original source code of the function,
    but that's all I have.

    Is it possible from this information to find out what the
    output of this function will be if the input is this = 5,
    and why?
    From my point of view, only the last line of the code
    (the actual return statement) matters, and the rest is just
    obfuscating rubbish, but I'm not a C expert and most probably
    I'm missing something.

    Anyway, whether I compile the short version, sub_1(), or
    the full (long) one, it doesn't return anything for me,
    so I'm troubleshooting it.


    And what if there is nothing more in the dll and in memory?
    Only main, which calls this sub_1() and passes the value 5 to it.
  10. #6
  11. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    OK then. Review how to use a DLL -- it's been several years since I last worked with one at that level. As I recall, in order to call a function, you get a pointer to it. In order to access a global variable, you need to use a pointer. So then the way that function works is an artifact of how a DLL works. That's why when you get a DLL you also get a header file and a static library, which takes care of all the DLL linkage for you. I'm willing to bet that the equivalent normal C code wouldn't involve any pointers.

    Study up on low-level programming with a DLL and the decompiled code should start to make sense. The only resource that I am familiar with is the one I used, a 15-year-old book written for Visual C++ 6 which is at work so I won't see it again until Monday. You should be able to find something on-line via your favorite search engine.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2013
    Posts
    5
    Rep Power
    0
    OK, thanks for your input.

    Simple question: what is the following line of code
    doing, and why?

    Code:
     *(double *)(this + 12192) = *(double *)(this + 12184);
  14. #8
  15. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,839
    Rep Power
    480
    I'd say it copies 8 bytes of data from one place to another. I don't know why this should be part of the algorithm. Which is to say I don't know much about the algorithm.
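One plausible reading, offered as a guess and not as the original source: in the full listing this copy is the first of seven that move the doubles at offsets 12136 through 12184 up by one slot, after which the new result is stored at 12136. That pattern looks like pushing a value onto an eight-entry history buffer. The sketch below uses the hypothetical names `HISTORY_LEN` and `push_history`, and shifts unconditionally, whereas the decompiled code only shifts when `v1 == 2`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HISTORY_LEN 8  /* offsets 12136..12192 in steps of 8 bytes */

/* Shifts every entry one slot towards the end (dropping the oldest)
   and stores the newest value in slot 0. */
void push_history(double history[HISTORY_LEN], double newest)
{
    assert(history != NULL);
    /* memmove is required because source and destination overlap */
    memmove(&history[1], &history[0], (HISTORY_LEN - 1) * sizeof history[0]);
    history[0] = newest;
}
```

Under this reading, `history[0]` corresponds to `this + 12136`, the value the function returns, and `history[7]` to `this + 12192`.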
    [code]Code tags[/code] are essential for python code and Makefiles!
  16. #9
  17. Contributed User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jun 2005
    Posts
    4,379
    Rep Power
    1871
    Given that the single parameter is called this, and that you decompiled a DLL, it seems likely that you decompiled a C++ function back into C.

    Recall that the first hidden parameter of any call to a C++ class member function is the 'this' parameter pointing to an instance of the class.
    Code:
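    #include <iostream>

    class foo {
       private:
          int x = 0;       // initialised so that func() returns a defined value
          float y = 0.0f;
       public:
          int func() {
              return x;    // compiled as: return this->x;
          }
    };
    
    int main ( ) {
        foo me;
        std::cout << me.func();   // at the machine level: func(&me)
    }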
    At the machine level, func is going to get a pointer to me.

    Knowing this, all your 'this + nnnn' calculations are really the member variables of an instance of whatever class it is you decompiled.

    Given the size of the offsets, the class seems to maintain quite a lot of internal state, which will affect the return result.
    I somehow doubt you'll be able to replicate the functionality of one single member function without the context of the rest of the class.
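To make the "this + nnnn is really a member variable" point concrete, here is a small sketch in C. The struct `filter_state` and its fields are invented for illustration and are not the layout of the decompiled class. It shows that reading a double at a byte offset from the object pointer is the same operation as reading a named member.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object layout; the real class is unknown. */
struct filter_state {
    int    mode;
    int    counter;
    double smoothed;
};

/* What the decompiler shows: a raw read at (object address + offset). */
double get_smoothed_by_offset(const struct filter_state *self)
{
    double value;
    assert(self != NULL);
    memcpy(&value,
           (const unsigned char *)self + offsetof(struct filter_state, smoothed),
           sizeof value);
    return value;
}

/* What the original source most likely said: a plain member access. */
double get_smoothed_by_name(const struct filter_state *self)
{
    assert(self != NULL);
    return self->smoothed;
}
```

Both functions return the same value for any valid object. In a Java port, each distinct offset would become a field (or an array element, for the `this + 8 * i + 3120` accesses) of one class, and the function would become a method of that class.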

    > Only main - which is calling this sub_1() and passing value 5 to it.
    Calling it with a literal numeric constant makes no sense at all.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2013
    Posts
    5
    Rep Power
    0
    OK, thanks, that gives me an idea.

    Yes, it was a C++ dll,
    now decompiled to C code.
    I was naive enough to think that this C code
    is a close representation of the original C++ code,
    and that if I managed to understand and adjust it
    and then compile it, I'd get a copy of the dll
    with my customized functionality.

    Now I see that this ugly C code is not even close
    to the original C++ code,
    and it is impossible to get anything useful from it.
  20. #11
  21. Contributing User

    Join Date
    Aug 2003
    Location
    UK
    Posts
    5,109
    Rep Power
    1802
    Originally Posted by King Arthur
    Simple question - what the following line of code is
    doing and why?

    Code:
     *(double *)(this + 12192) = *(double *)(this + 12184);
    Compilation is a one-way, non-reversible process; any attempt at decompilation will not produce anything like the original code, and it may not even generate code that would work if recompiled, as you have found out. One problem is that it is not always possible to infer the intended source-level data type from the machine code.

    So the answer to "what does it do?" is: something that probably only makes sense in very specific circumstances. And the answer to "why?" is: because at some abstract level it is mimicking what the machine code disassembly did, not what the original source code did.

    To see what you are up against, create or take some compilable C source of reasonable complexity (functions, pointers, various data types, arrays, structures perhaps), compile it without debug information, then use your decompiler on it and see what you get and whether you can relate it in any way to the source code. You'll find that it is worse than round-tripping Google Translate through all its supported languages!

    Given the inherent problems and frankly experimental nature of decompilers, any attempt to then translate that decompilation to a different high-level language, especially a VM language, is probably doomed.

    A better approach would be to take the documentation of the API presented by the DLL in question and reimplement it from scratch.
