• by AlbertoGP on 1/4/2023, 7:19:12 AM

    Yes, a few days ago there was a story on HN that went mostly ignored: “G-3PO: A Protocol Droid for Ghidra” https://medium.com/tenable-techblog/g-3po-a-protocol-droid-f...

    https://news.ycombinator.com/item?id=34161642

    That’s a script for the reverse-engineering tool Ghidra that sends Ghidra’s decompiled code to GPT-3 and asks it to write a plain-English explanation of what a piece of code does.

    The article is quite detailed and describes both its capabilities and its limitations. That G-3PO script is open source, MIT license: https://github.com/tenable/ghidra_tools/tree/main/g3po

    There was also another HN story about what at first sight looks like an alternative implementation of the same idea: “GptHidra – Ghidra plugin that asks OpenAI Chat GPT to explain functions”

    https://news.ycombinator.com/item?id=34165291

    This one is more recent and lacks a detailed write-up like the one mentioned above. The script is smaller and seems to have fewer features.

    I suggest checking both of them.

  • by shagie on 1/4/2023, 4:21:59 AM

    So, I grabbed a one-liner (I didn't want to use too many tokens) from the IOCCC and tossed it into https://beta.openai.com/playground/p/default-explain-code?mo...

    It is the 1987 korn entry, which won "best one liner".

    https://www.ioccc.org/years.html

            main() { printf(&unix["\021%six\012\0"],(unix)["have"]+"fun"-0x60);}
    
        """
        Here's what the above C code is doing: (GPT output follows)
        1. The first line is a comment.
        2. The second line is the main function.
        3. The third line is a printf statement.
        4. The fourth line is the end of the main function.
    
        The printf statement is the most interesting part.
    
        The first argument to printf is a string.
        The string is made up of two parts.
        The first part is "\021%six\012\0".
        The second part is "(unix)["have"]+"fun"-0x60".
    
        The first part of the string, "\021%six\012\0", is just a normal string.
        The second part of the string, "(unix)["have"]+"fun"-0x60", is a little more complicated.
    
        The second part starts with "(unix)".
        This is an array.
        The array is called "unix".
    
        The rest of the second part, "["have"]+"fun"-0x60", is an expression.
        The expression starts with "["have"]".
        This is the "have" element of the unix array.
    
        The rest of the expression is "+"fun"-0x60".
        This is the string "fun" with 0x60 subtracted from it.
    
        Putting it all together, the string is made up of the following:
        1. The string "\021%six\012\0".
        2. The "have" element of the unix array.
        3. The string "fun" with 0x60 subtracted from it.
    
        The result is the string "have fun".
    
        The 0x21 in the first part of the string is a ! character.
        The 0x0a in the first part of the string is a newline character.
    
        Putting it all together, the string is:
        "!x
        "
    
        The ! character tells printf to print the string in reverse.
        The x character tells printf to print the string in hexadecimal.
        The newline character tells printf to start a new line.
    
        The result is:
        $   ./a.out
        0x6e696168
    
        The 0x6e696168 is "niah" in hexadecimal.
        This is "have" in reverse.
    
        The result is:
        $   ./a.out
        0x6e696168
        have
    
    Even this takes a lot of output tokens.