Available Agents
LiteAgent supports 8 different web automation agents, each with unique capabilities and use cases.
BrowserUse
Open-source AI-powered browser automation with GPT-4 and Claude support
DoBrowser
Chrome extension-based automation with real browser environment
Agent E
Enterprise-grade automation with Anthropic Claude
Skyvern
Computer vision-based automation without DOM dependency
WebArena
Research-focused agent for academic evaluation
VisualWebArena
Visual understanding agent for complex interfaces
Human
Manual baseline for comparison testing
Quick Comparison
| Feature | BrowserUse | DoBrowser | MultiOn | Agent E | Skyvern | WebArena | VisualWebArena |
|---|---|---|---|---|---|---|---|
| API Required | OpenAI/Anthropic | DoBrowser | MultiOn | Anthropic | Skyvern | OpenAI | OpenAI |
| Vision Support | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ |
| Setup Complexity | Low | Medium | Low | Low | High | Medium | Medium |
| Dark Pattern Detection | High | Medium | High | Very High | High | Low | Medium |
| Average Speed | Fast | Medium | Fast | Medium | Slow | Fast | Medium |
| Cost | $$ | $ | $$$ | $$$ | $$ | $$ | $$ |
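The "API Required" row can be turned into a small preflight check before a run. The following is a minimal sketch, not part of LiteAgent: the environment variable names are assumptions (check `collector/.env` for the real ones), and "OpenAI/Anthropic" is read here as "either provider is enough".

```python
import os

# Providers per agent, copied from the "API Required" row of the table.
REQUIRED_PROVIDERS = {
    "BrowserUse": ["OpenAI", "Anthropic"],  # assumed: either one is sufficient
    "DoBrowser": ["DoBrowser"],
    "MultiOn": ["MultiOn"],
    "Agent E": ["Anthropic"],
    "Skyvern": ["Skyvern"],
    "WebArena": ["OpenAI"],
    "VisualWebArena": ["OpenAI"],
}

# Hypothetical variable names, used only for illustration.
ENV_VAR_FOR = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "DoBrowser": "DOBROWSER_API_KEY",
    "MultiOn": "MULTION_API_KEY",
    "Skyvern": "SKYVERN_API_KEY",
}


def has_credentials(agent, env=None):
    """Return True if at least one accepted provider key is set and non-empty."""
    env = os.environ if env is None else env
    providers = REQUIRED_PROVIDERS[agent]
    return any(env.get(ENV_VAR_FOR[provider]) for provider in providers)
```

Passing a plain dictionary as `env` makes the check easy to try without touching the real environment.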
Getting Started
1. Choose Your Agent
Select an agent based on your requirements:
- For general testing: BrowserUse
- For enterprise: Agent E
- For vision tasks: Skyvern
- For research: WebArena
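The recommendations above amount to a simple lookup. As an illustrative sketch only (the function and dictionary names are made up for this example and are not part of LiteAgent), it could be encoded like this:

```python
# Mapping taken directly from the recommendation list above.
RECOMMENDED_AGENT = {
    "general testing": "BrowserUse",
    "enterprise": "Agent E",
    "vision tasks": "Skyvern",
    "research": "WebArena",
}


def recommend_agent(requirement):
    """Return the suggested agent name for a requirement, ignoring case."""
    key = requirement.strip().lower()
    if key not in RECOMMENDED_AGENT:
        known = ", ".join(sorted(RECOMMENDED_AGENT))
        raise ValueError(f"Unknown requirement {requirement!r}; expected one of: {known}")
    return RECOMMENDED_AGENT[key]
```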
2. Configure API Keys
Add the required API keys to collector/.env.
3. Run First Test
Agent Categories
AI-Powered Agents
These agents use large language models for decision-making:
- BrowserUse: GPT-4/Claude with browser control
- Agent E: Claude-based with hierarchical planning
- MultiOn: Proprietary multi-modal models
Vision-Based Agents
Agents that primarily rely on visual understanding:
- Skyvern: Pure computer vision approach
- VisualWebArena: Combined vision and text understanding
Extension-Based Agents
Agents that work through browser extensions:
- DoBrowser: Chrome extension integration
- MultiOn: Browser extension support
Research Agents
Academic and benchmark agents:
- WebArena: Standard benchmark agent
- VisualWebArena: Visual benchmark agent
Performance Benchmarks
Based on testing across 100+ scenarios:
Success Rates
Dark Pattern Detection
Average Task Completion Time
Selection Criteria
When to Use Each Agent
- Task Complexity
- Interface Type
- Requirements
Simple Tasks (click, navigate, read):
- BrowserUse ⭐⭐⭐
- WebArena ⭐⭐⭐
- MultiOn ⭐⭐
- MultiOn ⭐⭐⭐
- BrowserUse ⭐⭐⭐
- Agent E ⭐⭐
- Agent E ⭐⭐⭐
- MultiOn ⭐⭐
- BrowserUse ⭐⭐
Common Setup Steps
1. Environment Variables
All agents require environment configuration.
2. Docker Setup
Each agent has a dedicated Dockerfile.
3. Local Setup
Install agent-specific dependencies.
Troubleshooting Guide
Agent won't start
- Check API keys are valid
- Verify dependencies installed
- Check Docker/Python version compatibility
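The version and dependency checks above can be scripted. This is a rough sketch under stated assumptions: the minimum Python version is a placeholder parameter (the real requirement is not given here), and Docker is only checked for presence on `PATH`, not for a specific version.

```python
import shutil
import sys


def startup_problems(min_python=(3, 10), need_docker=True):
    """Return a list of human-readable problems; an empty list means no issue found.

    min_python is an assumed placeholder, not a documented LiteAgent requirement.
    """
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {wanted}+ expected, found {found}")
    if need_docker and shutil.which("docker") is None:
        problems.append("docker executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in startup_problems():
        print("Problem:", problem)
```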
Agent times out
- Increase timeout: --timeout 300
- Check network connectivity
- Verify target site is accessible
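To rule out connectivity problems before blaming the agent, a quick reachability probe can help. The sketch below uses only the Python standard library; the function name is illustrative, and "reachable" here simply means the server sent any HTTP response, including an error status.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=10.0):
    """Return True if the server at url answers an HTTP HEAD request."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status (e.g. 404 or 405).
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return False
```

Some sites reject `HEAD` requests or block scripted clients, so a `False` result is a hint to investigate rather than proof that the site is down.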
Poor success rates
- Verify prompt clarity
- Check for site changes
- Review agent logs for errors
API rate limits
- Add delays between tasks
- Use multiple API keys
- Consider batch processing
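The first two suggestions (delays between tasks and multiple API keys) can be combined in one loop. This is a minimal sketch, assuming a caller-supplied `run_task(task, api_key)` function; that function and the other names here are hypothetical, not LiteAgent APIs.

```python
import itertools
import time


def run_with_throttle(tasks, api_keys, run_task, delay_seconds=2.0, sleep=time.sleep):
    """Run tasks sequentially, rotating API keys and pausing between tasks.

    run_task is a hypothetical callable taking (task, api_key).
    sleep is injectable so the pause can be stubbed out in tests.
    """
    if not api_keys:
        raise ValueError("at least one API key is required")
    keys = itertools.cycle(api_keys)  # round-robin over the available keys
    results = []
    for index, task in enumerate(tasks):
        if index > 0:
            sleep(delay_seconds)  # no pause before the first task
        results.append(run_task(task, next(keys)))
    return results
```

A fixed delay is the simplest option; if the provider returns explicit rate-limit errors, retrying with increasing delays is a common alternative.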
Advanced Configuration
Custom Agent Parameters
Parallel Execution
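One generic way to run several agents side by side is a thread pool from the Python standard library. The sketch below is an assumption about how this could look, not LiteAgent's own mechanism: `run_agent` is a hypothetical callable that takes an agent name and returns its result.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(agent_names, run_agent, max_workers=4):
    """Run run_agent(name) for each agent concurrently and return {name: result}.

    run_agent is a hypothetical callable supplied by the caller.
    """
    names = list(agent_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(run_agent, names))
    return dict(zip(names, results))
```

Threads suit this workload when each run mostly waits on network or browser I/O; keep `max_workers` low enough to stay within API rate limits.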
Contributing New Agents
To add support for a new agent:
- Create agent class extending WebAutomationBase
- Implement required methods
- Add to factory pattern
- Create Dockerfile
- Add documentation
- Submit pull request
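The first three steps can be sketched as follows. The real `WebAutomationBase` interface is not shown in this page, so the example defines a stand-in base class with a single assumed method, `run`; the registry and function names are likewise hypothetical and only illustrate the factory pattern.

```python
from abc import ABC, abstractmethod


class WebAutomationBase(ABC):
    """Stand-in for LiteAgent's base class; the real required methods may differ."""

    @abstractmethod
    def run(self, task: str) -> str:
        """Execute a task description and return a result summary."""


class EchoAgent(WebAutomationBase):
    """Toy agent that only echoes the task, to show the required override."""

    def run(self, task: str) -> str:
        return f"echo: {task}"


# Hypothetical factory: a registry mapping agent names to classes.
AGENT_REGISTRY = {}


def register_agent(name, agent_class):
    """Add an agent class to the factory, rejecting unrelated types."""
    if not issubclass(agent_class, WebAutomationBase):
        raise TypeError(f"{agent_class.__name__} must extend WebAutomationBase")
    AGENT_REGISTRY[name] = agent_class


def create_agent(name):
    """Instantiate a registered agent by name."""
    return AGENT_REGISTRY[name]()


register_agent("echo", EchoAgent)
```

With this shape, `create_agent("echo").run("open page")` builds the agent through the factory and calls its overridden method.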
Next Steps
BrowserUse Setup
Detailed BrowserUse configuration guide
Running Tests
Start testing with your chosen agent
Evaluation Metrics
Understanding agent performance metrics