• by nahco314 on 2/14/2025, 5:46:47 PM

    I don't know exactly how OpenAI does it, but I think it's probably something like this: the OpenAI server that generates the responses reads and executes the "code the LLM wants to execute" embedded in the LLM's output, then passes the execution results back as internal conversation history. As a concrete method, the code to be executed could be enclosed in tags such as <run-code>, or it may be handled via metadata at a lower layer.
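    The loop described above could be sketched roughly like this. Note that the <run-code> tag, the message format, and the whole mechanism are my guesses, not anything OpenAI has documented; a real server would also sandbox execution rather than call exec directly:

```python
import io
import re
import contextlib

# Hypothetical delimiter -- the actual tag/metadata OpenAI uses is unknown.
RUN_CODE = re.compile(r"<run-code>(.*?)</run-code>", re.DOTALL)

def execute_tagged_code(llm_output: str) -> list[dict]:
    """Extract <run-code> blocks from the model output, run each one,
    and return tool-result messages to append to the conversation history."""
    results = []
    for code in RUN_CODE.findall(llm_output):
        buf = io.StringIO()
        try:
            with contextlib.redirect_stdout(buf):
                exec(code, {})  # a production server would sandbox this
            results.append({"role": "tool", "content": buf.getvalue()})
        except Exception as e:
            results.append({"role": "tool", "content": f"error: {e}"})
    return results

history = execute_tagged_code("<run-code>print(2 + 3)</run-code>")
print(history)  # the captured stdout ("5\n") goes back to the model
```

    In this picture, the returned messages are appended to the hidden conversation and the model is prompted again, so the final answer can quote real execution output.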

    However, even though GPT-4o is supposed to have a code execution function, it is possible that the code is not actually being executed and that the LLM is simply generating output that looks correct (but is not).