Executing LLM-Generated Code with Flame
In the era of AI-powered development, Large Language Models (LLMs) are increasingly being used to generate code snippets and scripts. However, executing this generated code safely and efficiently can be challenging. This blog post demonstrates how to use Flame - a distributed system for elastic workloads - to securely execute code generated by LLMs.
Overview
Flame provides a robust platform for running AI-generated scripts with several key benefits:
- Security: Each execution runs in an isolated environment
- Scalability: Distributed execution across multiple nodes
- Flexibility: Support for multiple programming languages
- Reliability: Built-in error handling and resource management
The Example: AI-Generated Python Scripts
Let’s explore how to use Flame to execute Python code generated by an LLM. The complete example is available in examples/agents/scripts/main.py.
Prerequisites
Before running the example, ensure you have:
- A Flame cluster running (see Quick Start Guide)
- Python 3.7+ with the required dependencies
- An API key for your preferred LLM provider (we’ll use DeepSeek in this example)
Step-by-Step Implementation
1. Define the Execution Tool
First, we define a tool that the LLM can use to execute scripts. To keep the demo simple, we do not implement a run_script function ourselves; Flame handles the actual execution. In addition, Flame also supports input, a byte array that the script can read via stdin to receive more information.
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_script",
            "description": "Run a script, the user should supply a script name and parameters",
            "parameters": {
                "type": "object",
                "properties": {
                    "language": {
                        "type": "string",
                        "description": "The language of the script, e.g. python",
                    },
                    "code": {
                        "type": "string",
                        "description": "The code of the script to run, e.g. print('Hello, world!')",
                    },
                },
                "required": ["language", "code"],
            },
        },
    },
]
```
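For reference, here is a hypothetical example of the arguments string the model returns when it calls this tool (the JSON later found in tool.function.arguments); the exact code the model generates will of course vary:

```python
# Hypothetical tool-call arguments conforming to the run_script schema above.
# The model fills in "language" and "code"; later in the example this JSON
# string is UTF-8 encoded and passed straight to Flame.
example_arguments = '{"language": "python", "code": "print(sum(range(1, 101)))"}'
```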
2. Set Up the LLM Client
Configure your LLM client (in this case, DeepSeek):
```python
import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv("DEEPSEEK_API_KEY"), base_url="https://api.deepseek.com")

def send_messages(messages):
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        tools=tools,
    )
    return response.choices[0].message
```
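As a quick sanity check (a minimal sketch, assuming DEEPSEEK_API_KEY is set), note that a reply carries either plain text in message.content or a run_script request in message.tool_calls:

```python
# Minimal sketch: a reply contains either text or a tool call.
reply = send_messages([{"role": "user", "content": "Say hello"}])
if reply.tool_calls:
    print("Tool call:", reply.tool_calls[0].function.name)
else:
    print("Text:", reply.content)
```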
3. Create a Flame Session
Initialize a Flame session for executing the generated code as follows. flmexec is a built-in Flame application for executing scripts; it currently supports Python and Shell.
```python
import flame

async def main():
    session = await flame.create_session("flmexec")
```
4. Generate and Execute Code
The workflow involves four main steps:
- Generate Code: Ask the LLM to create a script
- Request Execution: Ask the LLM to run the generated code
- Execute with Flame: Use Flame to safely execute the code
- Process Results: Let the LLM interpret the execution results
```python
# Step 1: Generate code
messages = [{"role": "user", "content": "Provide a Python snippet that computes and prints the sum of integers from 1 to 100."}]
message = send_messages(messages)
messages.append(message)
print(f"Model>\t {message.content}")

# Step 2: Request execution
message = {"role": "user", "content": "run this code"}
messages.append(message)
message = send_messages(messages)
tool = message.tool_calls[0]
messages.append(message)

# Step 3: Execute with Flame
input = tool.function.arguments.encode("utf-8")
task = await session.invoke(input)

# Step 4: Process results
messages.append({"role": "tool", "tool_call_id": tool.id, "content": task.output.decode("utf-8")})
message = send_messages(messages)
print(f"Model>\t {message.content}")
```
Running the Example
To run the example:
```bash
# Set your API key
export DEEPSEEK_API_KEY="your-api-key-here"

# Run the example
python3 ./examples/agents/scripts/main.py
```
Expected Output
When you run the example, you’ll see output similar to:
````text
Model> Here's a Python snippet that computes and prints the sum of integers from 1 to 100:

```python
# Calculate the sum of integers from 1 to 100
total = sum(range(1, 101))
print("The sum of integers from 1 to 100 is:", total)
```

Would you like me to run this script for you?
Model> The sum of integers from 1 to 100 is **5050**. Let me know if you'd like to explore anything else!
````
Advanced Use Cases
Batch Code Generation
You can extend this pattern to generate and execute multiple scripts in parallel. The scripts run in the same session for the agent, which makes them easier to debug and evaluate, and Flame executes them in parallel when needed; a concrete sketch follows the snippet below.
```python
# Generate multiple scripts
scripts = [
    "Calculate fibonacci numbers up to 20",
    "Generate a random password",
    "Parse and analyze a JSON file",
]

# Execute them concurrently within the same Flame session for the agent.
```
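A minimal sketch of this pattern, assuming each snippet is already available as code (for example, produced by the LLM) and reusing the payload format from the main example; run_scripts_concurrently is a hypothetical helper:

```python
import asyncio
import json

async def run_scripts_concurrently(session, snippets):
    # Build one payload per snippet, mirroring the run_script arguments
    # that flmexec accepts in the main example.
    payloads = [
        json.dumps({"language": "python", "code": code}).encode("utf-8")
        for code in snippets
    ]
    # Invoke all scripts in parallel within the same Flame session.
    tasks = await asyncio.gather(*(session.invoke(p) for p in payloads))
    return [t.output.decode("utf-8") for t in tasks]
```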
Interactive Code Execution
Create an interactive environment where users can iteratively refine generated code. Because the session stays open, there is no startup overhead between runs, which gives the agent better performance; a sketch follows the snippet below.
```python
# Allow users to modify generated code
user_feedback = "Can you optimize this for better performance?"

# Re-generate and re-execute with improvements
```
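A minimal sketch of such a loop, built on the same primitives as the main example and meant to run inside main() after the session is created; it assumes the model answers run requests with a run_script tool call:

```python
# Hypothetical interactive loop: refine and re-run code in one session.
while True:
    feedback = input("You> ")
    if not feedback:
        break
    messages.append({"role": "user", "content": feedback})
    message = send_messages(messages)
    messages.append(message)
    if message.tool_calls:
        tool = message.tool_calls[0]
        task = await session.invoke(tool.function.arguments.encode("utf-8"))
        messages.append({
            "role": "tool",
            "tool_call_id": tool.id,
            "content": task.output.decode("utf-8"),
        })
        message = send_messages(messages)
        messages.append(message)
    print(f"Model>\t {message.content}")
```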
Code Validation
Because flmexec also supports input during execution, we can use Flame to test generated code against various inputs and edge cases; a sketch follows the snippet below.
```python
# Test generated functions with different inputs
test_cases = [1, 10, 100, 1000]

# Execute each test case in isolation
```
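A minimal sketch of this idea, embedding each test value into the generated code before execution; validate_function and compute are hypothetical names, and passing values via flmexec's stdin input is an alternative (see the Flame documentation for the exact payload format):

```python
import json

async def validate_function(session, fn_source, test_cases):
    # Hypothetical validator: append a driver line that calls the generated
    # function (assumed here to be named compute) with each test value, then
    # run every case as its own isolated Flame task.
    results = []
    for case in test_cases:
        code = f"{fn_source}\nprint(compute({case!r}))"
        payload = json.dumps({"language": "python", "code": code}).encode("utf-8")
        task = await session.invoke(payload)
        results.append(task.output.decode("utf-8").strip())
    return results
```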
Conclusion
Flame provides a powerful and secure platform for executing AI-generated code. By combining the creativity of LLMs with Flame’s robust execution environment, you can build sophisticated AI-powered applications that can generate, test, and execute code safely and efficiently.
The example demonstrated in this blog post shows just the beginning of what’s possible. You can extend this pattern to build more complex AI agents, automated testing systems, or interactive coding assistants.
To get started with your own implementations, check out the complete example in examples/agents/scripts/main.py and explore the Flame documentation for more advanced features and capabilities.