Executing LLM-Generated Code with Flame
In the era of AI-powered development, Large Language Models (LLMs) are increasingly being used to generate code snippets and scripts. However, executing this generated code safely and efficiently can be challenging. This blog post demonstrates how to use Flame - a distributed system for elastic workloads - to securely execute code generated by LLMs.
Overview
Flame provides a robust platform for running AI-generated scripts with several key benefits:
- Security: Each execution runs in an isolated environment
- Scalability: Distributed execution across multiple nodes
- Flexibility: Support for multiple programming languages
- Reliability: Built-in error handling and resource management
The Example: AI-Generated Python Scripts
Let’s explore how to use Flame to execute Python code generated by an LLM. The complete example is available in examples/agents/scripts/main.py.
Prerequisites
Before running the example, ensure you have:
- A Flame cluster running (see Quick Start Guide)
- Python 3.7+ with the required dependencies
- An API key for your preferred LLM provider (we’ll use DeepSeek in this example)
Step-by-Step Implementation
1. Define the Execution Tool
First, we define a tool that the LLM can use to execute scripts. To keep the demo simple, we do not implement a run_script function ourselves; Flame handles the actual execution. In addition, Flame also supports input, a byte array that the script can read via stdin to receive more information.
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_script",
            "description": "Run a script, the user should supply a script name and parameters",
            "parameters": {
                "type": "object",
                "properties": {
                    "language": {
                        "type": "string",
                        "description": "The language of the script, e.g. python",
                    },
                    "code": {
                        "type": "string",
                        "description": "The code of the script to run, e.g. print('Hello, world!')",
                    },
                },
                "required": ["language", "code"],
            },
        },
    },
]
```
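For reference, here is a hypothetical example of the arguments string the model returns when it calls this tool (the JSON later found in tool.function.arguments); the exact code the model generates will of course vary:

```python
# Hypothetical tool-call arguments conforming to the run_script schema above.
# The model fills in "language" and "code"; later in the example this JSON
# string is UTF-8 encoded and passed straight to Flame.
example_arguments = '{"language": "python", "code": "print(sum(range(1, 101)))"}'
```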
2. Set Up the LLM Client
Configure your LLM client (in this case, DeepSeek):
```python
import os

from openai import OpenAI

client = OpenAI(api_key=os.getenv("DEEPSEEK_API_KEY"), base_url="https://api.deepseek.com")

def send_messages(messages):
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=messages,
        tools=tools,
    )
    return response.choices[0].message
```
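As a quick sanity check (a minimal sketch, assuming DEEPSEEK_API_KEY is set), note that a reply carries either plain text in message.content or a run_script request in message.tool_calls:

```python
# Minimal sketch: a reply contains either text or a tool call.
reply = send_messages([{"role": "user", "content": "Say hello"}])
if reply.tool_calls:
    print("Tool call:", reply.tool_calls[0].function.name)
else:
    print("Text:", reply.content)
```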
3. Create a Flame Session
Initialize a Flame session for executing the generated code as follows. flmexec is a built-in Flame application for executing scripts; it currently supports Python and Shell.
```python
import flame

async def main():
    session = await flame.create_session("flmexec")
```
4. Generate and Execute Code
The workflow involves four main steps:
- Generate Code: Ask the LLM to create a script
- Request Execution: Ask the LLM to run the generated code
- Execute with Flame: Use Flame to safely execute the code
- Process Results: Let the LLM interpret the execution results
```python
# Step 1: Generate code
messages = [{"role": "user", "content": "Provide a Python snippet that computes and prints the sum of integers from 1 to 100."}]
message = send_messages(messages)
messages.append(message)
print(f"Model>\t {message.content}")

# Step 2: Request execution
message = {"role": "user", "content": "run this code"}
messages.append(message)
message = send_messages(messages)
tool = message.tool_calls[0]
messages.append(message)

# Step 3: Execute with Flame
input = tool.function.arguments.encode("utf-8")
task = await session.invoke(input)

# Step 4: Process results
messages.append({"role": "tool", "tool_call_id": tool.id, "content": task.output.decode("utf-8")})
message = send_messages(messages)
print(f"Model>\t {message.content}")
```
Running the Example
To run the example:
```bash
# Set your API key
export DEEPSEEK_API_KEY="your-api-key-here"

# Run the example
python3 ./examples/agents/scripts/main.py
```
Expected Output
When you run the example, you’ll see output similar to:
````text
Model> Here's a Python snippet that computes and prints the sum of integers from 1 to 100:

```python
# Calculate the sum of integers from 1 to 100
total = sum(range(1, 101))
print("The sum of integers from 1 to 100 is:", total)
```

Would you like me to run this script for you?
Model> The sum of integers from 1 to 100 is **5050**. Let me know if you'd like to explore anything else!
````
Advanced Use Cases
Batch Code Generation
You can extend this pattern to generate and execute multiple scripts in parallel. The scripts run in the same session for the agent, which makes them easier to debug and evaluate, and Flame executes them in parallel when needed; a concrete sketch follows the snippet below.
```python
# Generate multiple scripts
scripts = [
    "Calculate fibonacci numbers up to 20",
    "Generate a random password",
    "Parse and analyze a JSON file",
]

# Execute them concurrently within the same Flame session for the agent.
```
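A minimal sketch of this pattern, assuming each snippet is already available as code (for example, produced by the LLM) and reusing the payload format from the main example; run_scripts_concurrently is a hypothetical helper:

```python
import asyncio
import json

async def run_scripts_concurrently(session, snippets):
    # Build one payload per snippet, mirroring the run_script arguments
    # that flmexec accepts in the main example.
    payloads = [
        json.dumps({"language": "python", "code": code}).encode("utf-8")
        for code in snippets
    ]
    # Invoke all scripts in parallel within the same Flame session.
    tasks = await asyncio.gather(*(session.invoke(p) for p in payloads))
    return [t.output.decode("utf-8") for t in tasks]
```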
Interactive Code Execution
Create an interactive environment where users can iteratively refine generated code. Because the session stays open, there is no startup overhead between runs, which gives the agent better performance; a sketch follows the snippet below.
```python
# Allow users to modify generated code
user_feedback = "Can you optimize this for better performance?"

# Re-generate and re-execute with improvements
```
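A minimal sketch of such a loop, built on the same primitives as the main example and meant to run inside main() after the session is created; it assumes the model answers run requests with a run_script tool call:

```python
# Hypothetical interactive loop: refine and re-run code in one session.
while True:
    feedback = input("You> ")
    if not feedback:
        break
    messages.append({"role": "user", "content": feedback})
    message = send_messages(messages)
    messages.append(message)
    if message.tool_calls:
        tool = message.tool_calls[0]
        task = await session.invoke(tool.function.arguments.encode("utf-8"))
        messages.append({
            "role": "tool",
            "tool_call_id": tool.id,
            "content": task.output.decode("utf-8"),
        })
        message = send_messages(messages)
        messages.append(message)
    print(f"Model>\t {message.content}")
```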
Code Validation
Because flmexec also supports input during execution, we can use Flame to test generated code against various inputs and edge cases; a sketch follows the snippet below.
```python
# Test generated functions with different inputs
test_cases = [1, 10, 100, 1000]

# Execute each test case in isolation
```
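A minimal sketch of this idea, embedding each test value into the generated code before execution; validate_function and compute are hypothetical names, and passing values via flmexec's stdin input is an alternative (see the Flame documentation for the exact payload format):

```python
import json

async def validate_function(session, fn_source, test_cases):
    # Hypothetical validator: append a driver line that calls the generated
    # function (assumed here to be named compute) with each test value, then
    # run every case as its own isolated Flame task.
    results = []
    for case in test_cases:
        code = f"{fn_source}\nprint(compute({case!r}))"
        payload = json.dumps({"language": "python", "code": code}).encode("utf-8")
        task = await session.invoke(payload)
        results.append(task.output.decode("utf-8").strip())
    return results
```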
Conclusion
Flame provides a powerful and secure platform for executing AI-generated code. By combining the creativity of LLMs with Flame’s robust execution environment, you can build sophisticated AI-powered applications that can generate, test, and execute code safely and efficiently.
The example demonstrated in this blog post shows just the beginning of what’s possible. You can extend this pattern to build more complex AI agents, automated testing systems, or interactive coding assistants.
To get started with your own implementations, check out the complete example in examples/agents/scripts/main.py and explore the Flame documentation for more advanced features and capabilities.