Agents Overview - LiteAgent Documentation

Available Agents

LiteAgent supports 8 agents: seven web automation agents and a manual Human baseline for comparison. Each has its own capabilities and use cases.

BrowserUse

Open-source AI-powered browser automation with GPT-4 and Claude support

DoBrowser

Chrome extension-based automation with real browser environment

MultiOn

Proprietary multi-modal models with browser extension support

Agent E

Enterprise-grade automation with Anthropic Claude

Skyvern

Computer vision-based automation without DOM dependency

WebArena

Research-focused agent for academic evaluation

VisualWebArena

Visual understanding agent for complex interfaces

Human

Manual baseline for comparison testing

Quick Comparison

Feature	BrowserUse	DoBrowser	MultiOn	Agent E	Skyvern	WebArena	VisualWebArena
API Required	OpenAI/Anthropic	DoBrowser	MultiOn	Anthropic	Skyvern	OpenAI	OpenAI
Vision Support	✅	✅	✅	✅	✅	❌	✅
Setup Complexity	Low	Medium	Low	Low	High	Medium	Medium
Dark Pattern Detection	High	Medium	High	Very High	High	Low	Medium
Average Speed	Fast	Medium	Fast	Medium	Slow	Fast	Medium
Cost	$$	$	$$$	$$$	$$	$$	$$

Getting Started

Choose Your Agent

Select an agent based on your requirements:

For general testing: BrowserUse
For enterprise: Agent E
For vision tasks: Skyvern
For research: WebArena

Configure API Keys

Add required API keys to collector/.env:

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
MULTION_API_KEY=...
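
Which key an agent needs follows from the "API Required" row of the Quick Comparison table. The following is a minimal sketch of a pre-flight check, assuming the keys are exposed as environment variables under the names shown above. The `REQUIRED_KEYS` mapping, the `has_required_key` helper, and the agent names other than `browseruse` and `agente` are hypothetical and not part of LiteAgent.

```python
import os

# Hypothetical mapping derived from the "API Required" row of the comparison
# table. Each entry lists alternative keys; any one of them is enough.
# DoBrowser and Skyvern are omitted because this guide does not name their
# environment variables.
REQUIRED_KEYS = {
    "browseruse": ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"],
    "agente": ["ANTHROPIC_API_KEY"],
    "multion": ["MULTION_API_KEY"],
    "webarena": ["OPENAI_API_KEY"],
    "visualwebarena": ["OPENAI_API_KEY"],
}


def has_required_key(agent, env=None):
    """Return True if at least one accepted key for `agent` is set and non-empty."""
    env = os.environ if env is None else env
    return any(env.get(name) for name in REQUIRED_KEYS[agent])
```

Calling `has_required_key("browseruse")` before starting a run catches a missing or empty key early, rather than partway through a test.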

Run First Test

# Using Docker
docker compose up browseruse

# Or locally
./run.sh browseruse

Agent Categories

AI-Powered Agents

These agents use large language models for decision-making:

BrowserUse: GPT-4/Claude with browser control
Agent E: Claude-based with hierarchical planning
MultiOn: Proprietary multi-modal models

Vision-Based Agents

Agents that primarily rely on visual understanding:

Skyvern: Pure computer vision approach
VisualWebArena: Combined vision and text understanding

Extension-Based Agents

Agents that work through browser extensions:

DoBrowser: Chrome extension integration
MultiOn: Browser extension support

Research Agents

Academic and benchmark agents:

WebArena: Standard benchmark agent
VisualWebArena: Visual benchmark agent

Performance Benchmarks

Based on testing across 100+ scenarios:

Success Rates

Agent E:          92% ████████████████████
MultiOn:          88% █████████████████
BrowserUse:       85% ████████████████
DoBrowser:        75% ███████████████
VisualWebArena:   73% ██████████████
Skyvern:          70% ██████████████
WebArena:         68% █████████████

Dark Pattern Detection

Agent E:          95% ███████████████████
Skyvern:          82% ████████████████
MultiOn:          80% ████████████████
BrowserUse:       78% ███████████████
DoBrowser:        65% █████████████
VisualWebArena:   55% ███████████
WebArena:         40% ████████

Average Task Completion Time

MultiOn:          40s ████
BrowserUse:       45s █████
DoBrowser:        60s ██████
VisualWebArena:   75s ████████
Agent E:          90s █████████
Skyvern:         120s ████████████
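
The three charts can be combined to shortlist agents against several thresholds at once. The sketch below copies the figures listed above into a dictionary; the `BENCHMARKS` name and the `shortlist` helper are illustrative only and not part of LiteAgent.

```python
# Figures copied from the charts above:
# (success rate %, dark pattern detection %, average completion time in s).
# WebArena has no completion time listed, so it is recorded as None.
BENCHMARKS = {
    "Agent E": (92, 95, 90),
    "MultiOn": (88, 80, 40),
    "BrowserUse": (85, 78, 45),
    "DoBrowser": (75, 65, 60),
    "VisualWebArena": (73, 55, 75),
    "Skyvern": (70, 82, 120),
    "WebArena": (68, 40, None),
}


def shortlist(min_success, min_detection, max_seconds=None):
    """Return agent names meeting the thresholds, best success rate first."""
    names = []
    for name, (success, detection, seconds) in BENCHMARKS.items():
        if success < min_success or detection < min_detection:
            continue
        # When a time limit is given, agents without a measured time are skipped.
        if max_seconds is not None and (seconds is None or seconds > max_seconds):
            continue
        names.append(name)
    return sorted(names, key=lambda n: BENCHMARKS[n][0], reverse=True)
```

For example, `shortlist(80, 75, max_seconds=60)` keeps only the agents that are at least 80% successful, detect at least 75% of dark patterns, and finish within a minute on average.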

Selection Criteria

When to Use Each Agent

Task Complexity

Simple Tasks (click, navigate, read):

BrowserUse ⭐⭐⭐
WebArena ⭐⭐⭐
MultiOn ⭐⭐

Medium Tasks (forms, multi-step):

MultiOn ⭐⭐⭐
BrowserUse ⭐⭐⭐
Agent E ⭐⭐

Complex Tasks (reasoning, decisions):

Agent E ⭐⭐⭐
MultiOn ⭐⭐
BrowserUse ⭐⭐
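
The star ratings above can be expressed as a small lookup. This is a sketch only: the `RATINGS` table copies the ratings listed above, and the `recommend` helper is a hypothetical name, not a LiteAgent function.

```python
# Star ratings copied from the lists above (3 = best fit).
RATINGS = {
    "simple": {"BrowserUse": 3, "WebArena": 3, "MultiOn": 2},
    "medium": {"MultiOn": 3, "BrowserUse": 3, "Agent E": 2},
    "complex": {"Agent E": 3, "MultiOn": 2, "BrowserUse": 2},
}


def recommend(complexity):
    """Return the agents with the highest rating for a complexity level."""
    ratings = RATINGS[complexity]
    best = max(ratings.values())
    # Dictionaries preserve insertion order, so ties keep the order listed above.
    return [name for name, stars in ratings.items() if stars == best]
```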

Common Setup Steps

1. Environment Variables

All agents require environment configuration:

cp collector/.env.example collector/.env
# Edit with your API keys

2. Docker Setup

Each agent has a dedicated Dockerfile:

docker build -f Dockerfile.browseruse -t browseruse-agent .

3. Local Setup

Install agent-specific dependencies:

pip install -r requirements.txt
playwright install  # For browser automation

Troubleshooting Guide

Agent won't start

Check that the API keys are valid
Verify that dependencies are installed
Check Docker/Python version compatibility

Agent times out

Increase timeout: --timeout 300
Check network connectivity
Verify target site is accessible

Poor success rates

Verify prompt clarity
Check for site changes
Review agent logs for errors

API rate limits

Add delays between tasks
Use multiple API keys
Consider batch processing
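
Adding delays between tasks can be sketched as a retry loop with exponential backoff. This assumes that the provider client raises a distinguishable error when rate limited; `RateLimitError` below is a hypothetical stand-in for that error, and `run_with_backoff` is not a LiteAgent function.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a provider client raises when rate limited."""


def run_with_backoff(task, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `task()`; on RateLimitError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return task()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so that the waiting behavior can be tested without real delays.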

Advanced Configuration

Custom Agent Parameters

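# In web_automation_factory.py
agent_config = {
    "browseruse": {
        # Vision-capable OpenAI model; confirm it is still offered before use
        "model": "gpt-4-vision-preview",
        # Sampling temperature; lower values give more repeatable runs
        "temperature": 0.7,
        "max_retries": 3
    },
    # Settings for Agent E
    "agente": {
        "planning_enabled": True,
        "screenshot_interval": 2,
        "verbose_logging": True
    }
}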
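
One way to vary these parameters per run without editing the defaults is to merge overrides into a copy. The sketch below assumes the dictionary layout shown above; `AGENT_CONFIG` and `resolve_config` are hypothetical names, and the real factory may handle configuration differently.

```python
import copy

# Defaults copied from the example above.
AGENT_CONFIG = {
    "browseruse": {
        "model": "gpt-4-vision-preview",
        "temperature": 0.7,
        "max_retries": 3,
    },
    "agente": {
        "planning_enabled": True,
        "screenshot_interval": 2,
        "verbose_logging": True,
    },
}


def resolve_config(agent, overrides=None):
    """Return a copy of the agent's defaults with `overrides` applied.

    Unknown keys are rejected so that a typo fails loudly instead of being ignored.
    """
    config = copy.deepcopy(AGENT_CONFIG[agent])
    for key, value in (overrides or {}).items():
        if key not in config:
            raise KeyError(f"unknown option for {agent}: {key}")
        config[key] = value
    return config
```

For example, `resolve_config("browseruse", {"temperature": 0.0})` returns a configuration for a more repeatable run while leaving the shared defaults untouched.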

Parallel Execution

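# docker-compose.yml
services:
  browseruse:
    deploy:
      replicas: 5  # Run five containers of this service
    environment:
      # Interpolated once from the shell or .env file, so every replica
      # receives the same WORKER_ID value.
      - WORKER_ID=${WORKER_ID}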
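
If each worker should process a distinct share of the tasks, the worker ID can select a round-robin slice. This is a sketch under two assumptions the guide does not state: that `WORKER_ID` is a zero-based integer, and that a companion `WORKER_COUNT` variable (hypothetical) holds the number of workers. Both helper functions are illustrative names.

```python
import os


def tasks_for_worker(tasks, worker_id, worker_count):
    """Return the round-robin slice of `tasks` assigned to one worker."""
    if not 0 <= worker_id < worker_count:
        raise ValueError("worker_id must be in [0, worker_count)")
    return tasks[worker_id::worker_count]


def worker_from_env(env=None):
    """Read (worker_id, worker_count) from the environment.

    WORKER_COUNT is an assumed variable; it defaults to a single worker.
    """
    env = os.environ if env is None else env
    return int(env.get("WORKER_ID", "0")), int(env.get("WORKER_COUNT", "1"))
```

With five workers and ten tasks, worker 1 receives the tasks at indices 1 and 6, and every task is assigned to exactly one worker.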

Contributing New Agents

To add support for a new agent:

1. Create an agent class extending WebAutomationBase
2. Implement the required methods
3. Add the class to the factory
4. Create a Dockerfile
5. Add documentation
6. Submit a pull request

See API Reference for implementation details.
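
The first three steps might look roughly like the sketch below. The interface of the real `WebAutomationBase` is not shown in this guide, so the class here is a stand-in with a single hypothetical `run_task` method, and the registry, `register_agent`, and `create_agent` are assumed names rather than LiteAgent's actual factory. Consult the API Reference for the real required methods.

```python
from abc import ABC, abstractmethod


class WebAutomationBase(ABC):
    """Stand-in for LiteAgent's base class; the real interface may differ."""

    @abstractmethod
    def run_task(self, url, prompt):
        """Hypothetical method: carry out `prompt` on `url` and return a result dict."""


AGENT_REGISTRY = {}  # Hypothetical factory table: agent name -> agent class.


def register_agent(name):
    """Class decorator that adds an agent class to the registry."""
    def decorator(cls):
        AGENT_REGISTRY[name] = cls
        return cls
    return decorator


@register_agent("echo")
class EchoAgent(WebAutomationBase):
    """Toy agent that performs no browsing and only reports its inputs."""

    def run_task(self, url, prompt):
        return {"url": url, "prompt": prompt, "success": True}


def create_agent(name):
    """Instantiate a registered agent by name."""
    return AGENT_REGISTRY[name]()
```

Registering through a decorator keeps the factory table next to each class definition, so adding an agent does not require editing a central list.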

Next Steps

BrowserUse Setup

Detailed BrowserUse configuration guide

Running Tests

Start testing with your chosen agent

Evaluation Metrics

Understanding agent performance metrics
