AI Agent

The AI agent is the core of Aether. It runs inside each workspace and, driven by natural-language prompts, can write code, execute commands, navigate a browser, provision infrastructure, and manage databases.

Model Configuration (BYOK)

Aether uses a Bring Your Own Key model. You connect your own AI provider credentials, and Aether routes requests accordingly.

Adding API Keys

Go to Settings > API Keys and add your provider key. The key is encrypted and stored securely.
You need at least one API key configured before the agent can process tasks. The agent uses the key matching the provider configured for your account.
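The key-matching rule above can be illustrated with a minimal sketch. This is not Aether's implementation: `ProviderKey`, `select_key`, and the provider identifiers are hypothetical names chosen for illustration, and the secret is held in plain text only to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderKey:
    provider: str  # hypothetical provider identifier, e.g. "provider-a"
    secret: str    # encrypted at rest in practice; plain here for illustration


def select_key(keys, account_provider):
    """Return the stored key whose provider matches the account's provider."""
    if not keys:
        # Mirrors the rule that at least one key must be configured.
        raise LookupError("no API key configured; add one under Settings > API Keys")
    for key in keys:
        if key.provider == account_provider:
            return key
    raise LookupError(f"no API key for provider {account_provider!r}")
```

Raising an error when nothing matches, rather than falling back to another provider's key, is a design choice of this sketch; the documentation does not say how Aether handles that case.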

Agent Tools

The agent has access to several categories of tools, each designed for a specific type of interaction.

Code Tools

Standard development operations provided through the SDK:
Tool            Description
File read       Read file contents by path
File write      Create or overwrite files
File edit       Make targeted edits to existing files
Bash execution  Run shell commands in the workspace
Glob            Find files by pattern
Grep            Search file contents with regex
Web search      Search the web for documentation or references
Web fetch       Retrieve content from a URL
Todo tracking   Manage task checklists during work
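To make the Glob and Grep entries concrete, here is a rough Python equivalent of what such tools typically do. It is a sketch built on the standard library only; `glob_files` and `grep_files` are hypothetical names, not the SDK's actual interface.

```python
import re
from pathlib import Path


def glob_files(root, pattern):
    """Find files under root matching a glob pattern such as '**/*.py'."""
    return sorted(p for p in Path(root).glob(pattern) if p.is_file())


def grep_files(paths, regex):
    """Return (path, line_number, line) for every line matching the regex."""
    compiled = re.compile(regex)
    hits = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if compiled.search(line):
                hits.append((Path(path), number, line))
    return hits
```

Chaining the two, for example `grep_files(glob_files(".", "**/*.py"), r"TODO")`, mirrors the common pattern of narrowing by file name first and then searching contents.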

Browser Tools (Playwright)

The agent controls a headless browser for testing and interacting with web applications:
Tool                Description
browser_navigate    Navigate to a URL
browser_screenshot  Capture a screenshot of the current page
browser_snapshot    Take an accessibility snapshot (structured DOM)
browser_click       Click an element on the page
browser_type        Type text into an input field
browser_fill        Fill a form field directly
browser_wait_for    Wait for an element, text, or condition
browser_get_text    Extract text content from an element
browser_evaluate    Execute JavaScript on the page
browser_press_key   Press a keyboard key
browser_go_back     Navigate back in browser history
browser_close       Close the browser
The browser runs inside the workspace VM, so it can reach localhost URLs served by your dev server. This is how the agent tests UI changes in real time.
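As an illustration of how these tools might be combined to check a UI change, the sketch below describes a login test as a list of tool calls. The tool names come from the table above; everything else is assumed: the argument names (`url`, `selector`, `value`, `text`), the port `3000`, the selectors, and the `validate_plan` helper are hypothetical and not part of any documented Aether API.

```python
# Tool names as listed in the Browser Tools table.
BROWSER_TOOLS = {
    "browser_navigate", "browser_screenshot", "browser_snapshot",
    "browser_click", "browser_type", "browser_fill", "browser_wait_for",
    "browser_get_text", "browser_evaluate", "browser_press_key",
    "browser_go_back", "browser_close",
}


def validate_plan(plan):
    """Raise ValueError if any step names a tool that is not in the table."""
    for index, (tool, _args) in enumerate(plan):
        if tool not in BROWSER_TOOLS:
            raise ValueError(f"step {index}: unknown tool {tool!r}")
    return plan


# Hypothetical sequence: open a dev-server page, submit a form, capture the result.
login_check = validate_plan([
    ("browser_navigate", {"url": "http://localhost:3000/login"}),  # assumed port
    ("browser_fill", {"selector": "#email", "value": "dev@example.com"}),
    ("browser_click", {"selector": "button[type=submit]"}),
    ("browser_wait_for", {"text": "Dashboard"}),
    ("browser_screenshot", {}),
])
```

The localhost URL works in this scenario only because the browser and the dev server share the workspace VM, as noted above.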

How the Agent Runs

  1. You create a task with a prompt (via web chat or CLI)
  2. The workspace service starts the agent runtime
  3. The agent streams responses over WebSocket: text, tool calls, and thinking steps
  4. Tool invocations execute inside the workspace (file operations, bash commands, browser actions)
  5. Conversation history is persisted to the database for context continuity
  6. The agent continues until the task is complete, it needs your input, or you abort
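The steps above can be sketched as a small event loop. This is a simplified model under stated assumptions, not Aether's runtime: the event shapes (`"text"`, `"tool_call"`, `"thinking"`, `"complete"`, `"input_request"`), the `StopReason` enum, and `run_task` are hypothetical, an in-memory list stands in for the database, and a plain iterable stands in for the WebSocket stream.

```python
from enum import Enum


class StopReason(Enum):
    COMPLETE = "complete"
    NEEDS_INPUT = "needs_input"
    ABORTED = "aborted"


def run_task(events, history, abort_requested=lambda: False):
    """Consume streamed events until one of the three stop conditions holds."""
    for event in events:
        if abort_requested():
            return StopReason.ABORTED
        history.append(event)  # stand-in for persisting conversation history
        kind = event["type"]
        if kind == "complete":
            return StopReason.COMPLETE
        if kind == "input_request":
            return StopReason.NEEDS_INPUT
        # "text", "tool_call", and "thinking" events simply continue the loop.
    raise RuntimeError("stream ended without a terminal event")
```

Checking for an abort before recording each event means an aborted task leaves no partial entry for the event that was in flight; whether the real service behaves this way is not stated in the documentation.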