API Reference

Complete technical reference for all TABS API endpoints. Each endpoint entry documents its parameters, its response format, and an example request.

Base URL and Authentication

Base URL: https://api.tabstack.ai

Authentication: Include your API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY
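As a minimal sketch of the header in use, the Python snippet below builds an authenticated request object with the standard library and does not send it. The helper name build_request is hypothetical and not part of any official client.

```python
import urllib.request

BASE_URL = "https://api.tabstack.ai"  # base URL from this reference


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the Bearer token."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


req = build_request("/fetch?url=https%3A%2F%2Fexample.com", "YOUR_API_KEY")
print(req.full_url)
print(req.get_header("Authorization"))
```

Sending is left to the caller, for example with urllib.request.urlopen(req).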

Common Error Responses

All endpoints return errors in a consistent format:

{
  "error": "Error description"
}

HTTP Status Codes:

400 - Bad Request (invalid parameters)
401 - Unauthorized (invalid API key)
404 - Not Found (URL not accessible)
429 - Rate Limit Exceeded
500 - Internal Server Error
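One way to surface these errors in client code is to convert the status code and body into an exception. The sketch below assumes the body has the {"error": ...} shape shown above and falls back to the documented status meaning when it does not. TabsApiError and error_from_response are hypothetical names.

```python
import json

# Status descriptions copied from the list above.
STATUS_MEANINGS = {
    400: "Bad Request (invalid parameters)",
    401: "Unauthorized (invalid API key)",
    404: "Not Found (URL not accessible)",
    429: "Rate Limit Exceeded",
    500: "Internal Server Error",
}


class TabsApiError(Exception):
    """Hypothetical exception type; not part of any official SDK."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def error_from_response(status: int, body: str) -> TabsApiError:
    """Turn a non-2xx status and its raw body into a TabsApiError."""
    try:
        parsed = json.loads(body)
        message = parsed["error"] if isinstance(parsed, dict) else None
    except (ValueError, KeyError):
        message = None
    if not isinstance(message, str):
        # Body was not the expected shape: use the documented meaning instead.
        message = STATUS_MEANINGS.get(status, "Unknown error")
    return TabsApiError(status, message)


err = error_from_response(429, '{"error": "rate limit exceeded"}')
print(err)
```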

Endpoints

GET /fetch

Retrieve raw HTML content from any URL with intelligent fetching strategies.

Parameters

Parameter	Type	Required	Description
`url`	string	Yes	Target URL to fetch

Example Request

curl -X GET "https://api.tabstack.ai/fetch?url=https://example.com/article" \
  -H "Authorization: Bearer YOUR_API_KEY"

Example Response

{
  "url": "https://example.com/article",
  "statusCode": 200,
  "body": "<!DOCTYPE html><html><head><title>Article Title</title>...",
  "headers": {
    "content-type": "text/html; charset=UTF-8",
    "server": "nginx/1.18.0",
    "last-modified": "Mon, 15 Jan 2024 10:30:00 GMT"
  }
}

Response Fields

Field	Type	Description
`url`	string	Originally requested URL
`statusCode`	integer	HTTP response status code
`body`	string	Raw HTML content
`headers`	object	HTTP response headers
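The curl example passes the target URL unencoded, which works for simple URLs. When the target has its own query string, an unencoded "&" would be read as a separator between /fetch parameters, so percent-encoding the url value is the safer habit. A small sketch (build_fetch_url is a hypothetical helper):

```python
from urllib.parse import urlencode


def build_fetch_url(target_url: str) -> str:
    """Return a /fetch URL with the target percent-encoded as one parameter."""
    return "https://api.tabstack.ai/fetch?" + urlencode({"url": target_url})


# The target's own "?" and "&" are encoded, so they stay inside the url value.
print(build_fetch_url("https://example.com/search?q=tabs&page=2"))
```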

GET /markdown

Convert HTML content to clean, structured markdown suited to LLM processing.

Parameters

Parameter	Type	Required	Description
`url`	string	Yes	Target URL to convert
`metadata`	boolean	No	Include metadata extraction (default: false)
`nocache`	boolean	No	Skip cached content (default: false)

Example Request

curl -X GET "https://api.tabstack.ai/markdown?url=https://blog.example.com/post&metadata=true" \
  -H "Authorization: Bearer YOUR_API_KEY"

Example Response

{
  "url": "https://blog.example.com/post",
  "content": "# Article Title\n\nThis is the main content converted to clean markdown...\n\n## Section Header\n\nMore content here with [links](https://example.com) preserved.",
  "metadata": {
    "title": "Article Title",
    "description": "Article description from meta tags",
    "author": "Jane Smith",
    "publishedTime": "2024-01-15T10:30:00Z",
    "keywords": ["technology", "AI", "development"],
    "image": "https://blog.example.com/image.jpg",
    "canonicalUrl": "https://blog.example.com/post"
  }
}

Response Fields

Field	Type	Description
`url`	string	Source URL
`content`	string	Clean markdown content
`metadata`	object	Extracted metadata (if requested)

Metadata Fields

Field	Type	Description
`title`	string	Page title
`description`	string	Meta description
`author`	string	Article author
`publishedTime`	string	Publication date (ISO 8601)
`keywords`	array	Keywords and tags
`image`	string	Featured image URL
`canonicalUrl`	string	Canonical URL
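The response can be mapped onto typed objects on the client side. The sketch below treats the whole metadata object as optional, since it is only returned when requested, and treats each metadata field as optional too. That second point is an assumption: this reference does not say which fields may be missing. The class and function names are hypothetical.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class Metadata:
    # Attribute names mirror the JSON keys so the dict can be unpacked directly.
    title: Optional[str] = None
    description: Optional[str] = None
    author: Optional[str] = None
    publishedTime: Optional[str] = None
    keywords: list = field(default_factory=list)
    image: Optional[str] = None
    canonicalUrl: Optional[str] = None


@dataclass
class MarkdownResult:
    url: str
    content: str
    metadata: Optional[Metadata] = None


def parse_markdown_response(body: str) -> MarkdownResult:
    """Parse a /markdown response body into a MarkdownResult."""
    raw = json.loads(body)
    meta = raw.get("metadata")
    known = {f.name for f in fields(Metadata)}
    return MarkdownResult(
        url=raw["url"],
        content=raw["content"],
        # Ignore keys this sketch does not know about instead of failing.
        metadata=Metadata(**{k: v for k, v in meta.items() if k in known}) if meta else None,
    )


result = parse_markdown_response('{"url": "https://blog.example.com/post", "content": "# Title"}')
print(result.url, result.metadata)
```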

GET /schema

Generate JSON schemas by analyzing web page structure with AI.

Parameters

Parameter	Type	Required	Description
`url`	string	Yes	Target URL to analyze
`instructions`	string	No	Custom analysis instructions
`nocache`	boolean	No	Skip cached schema (default: false)

Example Request

curl -X GET "https://api.tabstack.ai/schema?url=https://store.example.com/product/123&instructions=Focus%20on%20product%20information%20and%20pricing" \
  -H "Authorization: Bearer YOUR_API_KEY"

Example Response

{
  "type": "object",
  "properties": {
    "product_name": {
      "type": "string",
      "description": "Product title"
    },
    "price": {
      "type": "object",
      "properties": {
        "current_price": {"type": "number"},
        "original_price": {"type": "number"},
        "currency": {"type": "string"}
      }
    },
    "availability": {
      "type": "string",
      "description": "Stock status"
    },
    "reviews": {
      "type": "object",
      "properties": {
        "average_rating": {"type": "number"},
        "review_count": {"type": "integer"}
      }
    },
    "features": {
      "type": "array",
      "items": {"type": "string"},
      "description": "Product features list"
    }
  },
  "required": ["product_name", "price", "availability"]
}

Schema Generation Tips

Use the instructions parameter to focus on specific data types
Generated schemas follow the JSON Schema specification
Schemas can be used directly with the /json endpoint
More specific instructions yield better schemas
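The example response above lists only three of its five properties under required and sets no additionalProperties, while the /json schema rules ask for every property to be required and for additionalProperties to be false. This reference does not say whether the API reconciles the two itself, so the sketch below shows one possible client-side normalizing pass. The function name is hypothetical.

```python
import copy


def normalize_for_json_endpoint(schema: dict) -> dict:
    """Return a copy in which every object requires all of its properties
    and forbids additional ones. The input schema is left unchanged."""
    result = copy.deepcopy(schema)
    _normalize_in_place(result)
    return result


def _normalize_in_place(node: dict) -> None:
    if node.get("type") == "object":
        props = node.get("properties", {})
        node["required"] = list(props)
        node["additionalProperties"] = False
        for child in props.values():
            _normalize_in_place(child)
    elif node.get("type") == "array" and isinstance(node.get("items"), dict):
        # Array items may themselves be objects that need the same treatment.
        _normalize_in_place(node["items"])


generated = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "price": {
            "type": "object",
            "properties": {
                "current_price": {"type": "number"},
                "currency": {"type": "string"},
            },
        },
        "features": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["product_name", "price"],
}
normalized = normalize_for_json_endpoint(generated)
print(normalized["required"])
```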

POST /json

Extract structured data from web pages using custom JSON schemas.

Request Body

Field	Type	Required	Description
`url`	string	Yes	Target URL
`json_schema`	object	Yes	JSON schema for extraction
`nocache`	boolean	No	Skip cached data (default: false)

Schema Requirements

The json_schema parameter must follow OpenAI Structured Outputs specifications:

Required Rules

Root level must be an object: { "type": "object", ... }
All properties must be listed as required, e.g. "required": ["title", "author", "date"]
All objects must forbid additional properties: "additionalProperties": false
Only simple types are allowed: string, number, boolean, array, object
No format specifiers (e.g., format: "uri", format: "date")

Best Practices

Add description fields to each property
Arrays should include maxItems (≤40 recommended)
Keep nesting depth under 5 levels
Nested objects also need required and additionalProperties: false
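These rules can be checked locally before a request is sent. The following is a rough sketch of such a check, not the validation the API itself performs. It reports rule violations and best-practice deviations in one list, and it counts the root object as nesting level 1, which is an assumption about how depth is measured.

```python
ALLOWED_TYPES = {"string", "number", "boolean", "array", "object"}
MAX_DEPTH = 5    # "keep nesting depth under 5 levels"
MAX_ITEMS = 40   # recommended upper bound for maxItems


def check_schema(schema: dict) -> list:
    """Return a list of human-readable findings; an empty list means no issues."""
    problems = []
    if schema.get("type") != "object":
        problems.append("$: root must have type 'object'")
    _check_node(schema, "$", 1, problems)
    return problems


def _check_node(node: dict, path: str, depth: int, problems: list) -> None:
    node_type = node.get("type")
    if node_type not in ALLOWED_TYPES:
        problems.append(f"{path}: type {node_type!r} is not allowed")
    if "format" in node:
        problems.append(f"{path}: format specifiers are not allowed")
    if depth >= MAX_DEPTH:
        problems.append(f"{path}: nesting depth should stay under {MAX_DEPTH}")
    if node_type == "object":
        props = node.get("properties", {})
        if set(node.get("required", [])) != set(props):
            problems.append(f"{path}: every property must be listed in 'required'")
        if node.get("additionalProperties") is not False:
            problems.append(f"{path}: 'additionalProperties' must be false")
        for name, child in props.items():
            _check_node(child, f"{path}.{name}", depth + 1, problems)
    elif node_type == "array":
        if node.get("maxItems", MAX_ITEMS + 1) > MAX_ITEMS:
            problems.append(f"{path}: arrays should set maxItems <= {MAX_ITEMS}")
        items = node.get("items")
        if isinstance(items, dict):
            _check_node(items, f"{path}[]", depth + 1, problems)


bad_schema = {
    "type": "object",
    "properties": {
        "link": {"type": "string", "format": "uri"},
        "count": {"type": "integer"},
    },
    "required": ["link"],
}
for finding in check_schema(bad_schema):
    print(finding)
```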

Example Schema

{
  "url": "https://example.com/article",
  "json_schema": {
    "type": "object",
    "properties": {
      "title": {
        "type": "string",
        "description": "Article title"
      },
      "author": {
        "type": "string",
        "description": "Author name"
      },
      "tags": {
        "type": "array",
        "items": { "type": "string" },
        "maxItems": 20,
        "description": "Article tags"
      }
    },
    "required": ["title", "author", "tags"],
    "additionalProperties": false
  }
}

Example Request

curl -X POST "https://api.tabstack.ai/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/article",
    "json_schema": {
      "type": "object",
      "properties": {
        "title": {
          "type": "string",
          "description": "Article title"
        },
        "author": {
          "type": "string",
          "description": "Author name"
        },
        "tags": {
          "type": "array",
          "items": { "type": "string" },
          "maxItems": 20,
          "description": "Article tags"
        }
      },
      "required": ["title", "author", "tags"],
      "additionalProperties": false
    }
  }'

Example Response

{
  "title": "Breaking: Major Technology Breakthrough Announced",
  "author": "Sarah Johnson",
  "tags": ["technology", "AI", "research", "breakthrough"]
}

Schema Design Best Practices

Use descriptive property names
List every property in required, as the rules above demand
Use appropriate data types (string, number, boolean, array, object)
Provide descriptions for complex properties
Generate a draft schema with the /schema endpoint first, then refine it

POST /transform

Transform and process web content using AI with custom instructions and schemas.

Request Body

Field	Type	Required	Description
`url`	string	Yes	Target URL
`instructions`	string	Yes	Processing instructions
`json_schema`	object	No	Output format schema
`nocache`	boolean	No	Skip cached results (default: false)

Example Request

curl -X POST "https://api.tabstack.ai/transform" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://research.example.com/paper",
    "instructions": "Analyze this research paper and extract key findings, methodology, and implications for practical applications.",
    "json_schema": {
      "type": "object",
      "properties": {
        "research_summary": {"type": "string"},
        "key_findings": {
          "type": "array",
          "items": {"type": "string"}
        },
        "methodology": {"type": "string"},
        "practical_applications": {
          "type": "array",
          "items": {"type": "string"}
        },
        "limitations": {"type": "string"},
        "future_research": {"type": "string"}
      },
      "required": ["research_summary", "key_findings", "methodology"]
    }
  }'

Example Response

{
  "research_summary": "This study investigates the effectiveness of transformer models in multilingual text classification tasks across 12 languages.",
  "key_findings": [
    "Transformer models outperform traditional approaches by 15-20% across all tested languages",
    "Performance degrades significantly for low-resource languages with <10k training examples",
    "Cross-lingual transfer learning improves results by up to 25% for related language families"
  ],
  "methodology": "The researchers used a dataset of 500k labeled examples across 12 languages, employing BERT and XLM-R models with fine-tuning approaches.",
  "practical_applications": [
    "Multilingual customer support systems",
    "Cross-border content moderation",
    "International market sentiment analysis"
  ],
  "limitations": "The study was limited to text classification tasks and did not explore generative capabilities.",
  "future_research": "Future work should investigate multilingual generation tasks and the development of more efficient cross-lingual architectures."
}
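The same request can be assembled in Python using only the standard library. The sketch below builds the POST request without sending it and omits the optional fields unless they are set. build_transform_request is a hypothetical helper name, and the instruction text is a placeholder.

```python
import json
import urllib.request


def build_transform_request(api_key, url, instructions, json_schema=None, nocache=False):
    """Build (but do not send) a POST /transform request with a JSON body."""
    body = {"url": url, "instructions": instructions}
    if json_schema is not None:
        body["json_schema"] = json_schema
    if nocache:
        body["nocache"] = True
    return urllib.request.Request(
        "https://api.tabstack.ai/transform",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_transform_request(
    "YOUR_API_KEY",
    "https://research.example.com/paper",
    "Summarize the key findings in three bullet points.",
)
print(request.get_method(), request.full_url)
```

Sending is left to the caller, for example with urllib.request.urlopen(request).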

Instruction Writing Tips

Be specific about desired output format
Include examples when helpful
Specify analysis depth and focus areas
Use clear, actionable language
Test instructions iteratively for best results

POST /automate

Execute AI-powered web automation tasks using natural language. This endpoint always streams responses using Server-Sent Events (SSE).

Request Body

Field	Type	Required	Description
`task`	string	Yes	Natural language task description
`url`	string	No	Starting URL for the task
`data`	object	No	JSON data for form filling or context
`guardrails`	string	No	Safety constraints for execution
`maxIterations`	number	No	Maximum task iterations (1-100, default: 50)
`maxValidationAttempts`	number	No	Maximum validation attempts (1-10, default: 3)
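Checking the documented ranges on the client avoids a round trip for requests that would be rejected. The sketch below builds the request body from the fields in the table above. The function name and its snake_case argument names are hypothetical; only the JSON keys come from this reference.

```python
def build_automate_payload(task, url=None, data=None, guardrails=None,
                           max_iterations=None, max_validation_attempts=None):
    """Build the /automate request body, enforcing the documented ranges."""
    if not task or not task.strip():
        raise ValueError("task is required")
    payload = {"task": task}
    if url is not None:
        payload["url"] = url
    if data is not None:
        payload["data"] = data
    if guardrails is not None:
        payload["guardrails"] = guardrails
    if max_iterations is not None:
        if not 1 <= max_iterations <= 100:
            raise ValueError("maxIterations must be between 1 and 100")
        payload["maxIterations"] = max_iterations
    if max_validation_attempts is not None:
        if not 1 <= max_validation_attempts <= 10:
            raise ValueError("maxValidationAttempts must be between 1 and 10")
        payload["maxValidationAttempts"] = max_validation_attempts
    return payload


payload = build_automate_payload(
    "find product information and pricing",
    url="https://example-store.com/product/123",
    max_iterations=75,
)
print(payload)
```

Optional fields that are left unset are omitted, so the server-side defaults (50 iterations, 3 validation attempts) apply.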

Use Cases

Web scraping and data extraction
Form filling and interaction
Navigation and information gathering
Multi-step web workflows
Content analysis from web pages

Example Request

curl -X POST "https://api.tabstack.ai/automate" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "task": "find product information and pricing",
    "url": "https://example-store.com/product/123",
    "guardrails": "extract only product details, don'\''t add to cart"
  }'

Streaming Response Format

The /automate endpoint always returns a streaming response using Server-Sent Events (SSE). The response content type is text/event-stream.

Event Format:

event: <event_type>
data: <JSON_data>

Each event consists of an event: line naming the event type, followed by a data: line carrying a JSON payload. Events are separated by empty lines.
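The format can be parsed independently of the HTTP layer. The sketch below turns an iterable of text lines into (event type, payload) pairs. It follows general SSE conventions beyond what this reference states (comment lines starting with ":", a default event type of "message", and multi-line data joined with newlines), so treat those parts as assumptions.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_type, data) pairs from SSE-formatted text lines."""
    event_type = "message"  # SSE default when no event: line is present
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                yield event_type, json.loads("\n".join(data_lines))
            event_type, data_lines = "message", []
        elif line.startswith(":"):
            continue  # SSE comment line, ignored
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    if data_lines:
        # Flush a final event that was not followed by a blank line.
        yield event_type, json.loads("\n".join(data_lines))


sample = 'event: start\ndata: {"task": "demo"}\n\nevent: done\ndata: {}\n'
events = list(parse_sse(sample.splitlines()))
print(events)
```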

Event Types

Task Events:

start - Task initialization
task:setup - Task configuration
task:started - Task execution begins
task:completed - Task finished successfully
task:aborted - Task was terminated
task:validated - Task completion validation
task:validation_error - Validation failed

Agent Events:

agent:processing - Agent thinking/planning
agent:status - Status updates and plans
agent:step - Processing step iterations
agent:action - Actions being performed
agent:reasoned - Agent reasoning output
agent:extracted - Data extraction results
agent:waiting - Agent waiting for operations

Browser Events:

browser:navigated - Page navigation events
browser:action_started - Browser action initiated
browser:action_completed - Browser action finished
browser:screenshot_captured - Screenshot taken

System Events:

system:debug_compression - Debug compression info
system:debug_message - Debug messages

Stream Control:

complete - End of stream with results
done - Stream termination
error - Error occurred

Example Streaming Response

event: start
data: {"task": "what is the temperature in Tokyo?", "url": "https://weather.com"}

event: agent:processing
data: {"operation": "Creating task plan", "hasScreenshot": false}

event: agent:status
data: {"message": "Task plan created", "plan": "Navigate to weather site and search for Tokyo temperature"}

event: browser:navigated
data: {"title": "Weather.com", "url": "https://weather.com"}

event: task:started
data: {"task": "what is the temperature in Tokyo?", "url": "https://weather.com"}

event: agent:step
data: {"iterationId": "abc123", "currentIteration": 0}

event: agent:action
data: {"action": "fill_and_enter", "ref": "search", "value": "Tokyo"}

event: browser:action_completed
data: {"success": true, "action": "fill_and_enter"}

event: agent:extracted
data: {"extractedData": "Temperature: 23°C (73°F)"}

event: task:completed
data: {"success": true, "finalAnswer": "Current temperature in Tokyo is 23°C (73°F)"}

event: complete
data: {"success": true, "result": {"finalAnswer": "Current temperature in Tokyo is 23°C (73°F)"}}

event: done
data: {}
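Once events are parsed into (event type, payload) pairs, the final result can be pulled from the complete event. The sketch below assumes the payload shapes shown in the example stream above. The payload of an error event is not documented here, so it is passed through as-is in the exception message.

```python
def final_result(events):
    """Return the 'result' of the complete event, or None if the stream
    ended without one. Raises RuntimeError on an error event."""
    for event_type, data in events:
        if event_type == "error":
            raise RuntimeError(f"automation failed: {data}")
        if event_type == "complete":
            return data.get("result")
    return None


stream = [
    ("start", {"task": "what is the temperature in Tokyo?"}),
    ("agent:extracted", {"extractedData": "Temperature: 23°C (73°F)"}),
    ("complete", {"success": True, "result": {"finalAnswer": "Current temperature in Tokyo is 23°C (73°F)"}}),
    ("done", {}),
]
answer = final_result(stream)
print(answer["finalAnswer"])
```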

Handling Streaming Responses

Python Example:

import json

import requests


def stream_automate(api_key, task, url=None, guardrails=None):
    """Send a task to /automate and print each SSE event as it arrives."""
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    }

    # Only include optional fields that were provided.
    payload = {'task': task}
    if url:
        payload['url'] = url
    if guardrails:
        payload['guardrails'] = guardrails

    response = requests.post(
        'https://api.tabstack.ai/automate',
        headers=headers,
        json=payload,
        stream=True
    )
    # Error responses (400, 401, ...) are plain JSON, not an event stream.
    response.raise_for_status()

    event_type = None  # set by the most recent "event:" line
    for line in response.iter_lines():
        if not line:
            continue  # blank lines separate events
        line = line.decode('utf-8')
        if line.startswith('event:'):
            event_type = line[6:].strip()
        elif line.startswith('data:'):
            data = json.loads(line[5:].strip())
            print(f"[{event_type}] {data}")

# Usage
stream_automate(
    api_key='YOUR_API_KEY',
    task='find the weather in Tokyo',
    url='https://weather.com',
    guardrails='browse only, no purchases'
)

JavaScript/Node.js Example:

async function streamAutomate(apiKey, task, options = {}) {
    const response = await fetch('https://api.tabstack.ai/automate', {
        method: 'POST',
        headers: {
            'Authorization': `Bearer ${apiKey}`,
            'Content-Type': 'application/json'
        },
        body: JSON.stringify({
            task,
            ...options
        })
    });

    // Error responses (400, 401, ...) are plain JSON, not an event stream.
    if (!response.ok) {
        throw new Error(`Request failed with status ${response.status}`);
    }

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';
    let currentEvent = '';

    while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        // Chunks can end mid-line, so keep the trailing partial line in the buffer.
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop() || '';

        for (const line of lines) {
            if (line.startsWith('event:')) {
                currentEvent = line.substring(6).trim();
            } else if (line.startsWith('data:')) {
                const data = JSON.parse(line.substring(5).trim());
                console.log(`[${currentEvent}]`, data);
            }
        }
    }
}

// Usage
streamAutomate('YOUR_API_KEY', 'find the weather in Tokyo', {
    url: 'https://weather.com',
    guardrails: 'browse only, no purchases'
});

Additional Examples

Form Filling:

{
  "task": "submit the contact form with my information",
  "url": "https://company.com/contact",
  "data": {
    "name": "John Doe",
    "email": "john@example.com",
    "message": "Interested in your services"
  }
}

Complex Navigation:

{
  "task": "search for flights and compare prices",
  "url": "https://kayak.com",
  "data": {
    "from": "NYC",
    "to": "LAX",
    "date": "2024-12-25"
  },
  "guardrails": "search and compare only, don't book anything",
  "maxIterations": 75
}

Error Responses

400 Bad Request:

{
  "error": "task is required"
}

401 Unauthorized:

{
  "error": "Unauthorized - Invalid token"
}

500 Internal Server Error:

{
  "error": "failed to call automate server"
}

503 Service Unavailable:

{
  "error": "automate service not available"
}

Best Practices

Be specific with tasks: Clearly describe what you want the automation to accomplish
Use guardrails: Set safety constraints to prevent unintended actions
Handle streaming events: Process events in real-time to track progress
Set appropriate limits: Adjust maxIterations based on task complexity
Provide context data: Include relevant data for form filling or search queries
Monitor for errors: Watch for error events and handle them gracefully

Error Handling

Common Error Scenarios

URL Not Accessible (404)

{
  "error": "failed to fetch url"
}

Invalid JSON Schema (400)

{
  "error": "invalid json schema format"
}

Rate Limit Exceeded (429)

{
  "error": "rate limit exceeded"
}

Authentication Failed (401)

{
  "error": "unauthorized"
}

Retry Logic Recommendations

Implement exponential backoff for rate limit errors
Retry failed requests up to 3 times
Handle timeout errors gracefully
Cache successful responses when appropriate
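The first two recommendations can be combined into a small wrapper. This is a sketch under assumptions: 429 comes from this reference, while also retrying 500 and 503 is a choice made here, and the one-second base delay and 10% jitter are arbitrary. The send callable is a hypothetical stand-in for whatever performs the HTTP call and returns (status, body).

```python
import random
import time

RETRYABLE = {429, 500, 503}  # 500 and 503 are an assumption of this sketch


def call_with_retry(send, max_retries=3, base_delay=1.0, jitter=0.1, sleep=time.sleep):
    """Call send() and retry retryable statuses with exponential backoff.

    Makes at most max_retries + 1 attempts and returns the last (status, body).
    """
    attempt = 0
    while True:
        status, body = send()
        if status not in RETRYABLE or attempt == max_retries:
            return status, body
        delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
        # A little random jitter keeps many clients from retrying in lockstep.
        sleep(delay + random.uniform(0, jitter * delay))
        attempt += 1


# Demo with canned responses and a fake sleep that records the delays.
canned = iter([(429, "rate limit exceeded"), (500, "server error"), (200, "ok")])
delays = []
outcome = call_with_retry(lambda: next(canned), jitter=0, sleep=delays.append)
print(outcome, delays)
```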

Error Monitoring

Monitor these metrics for production systems:

Error rate by endpoint
Response time percentiles
Rate limit hit frequency
Failed URL patterns

Performance Optimization

Caching Strategy

Use caching (nocache=false) for stable content
Skip cache (nocache=true) for frequently updated pages
Implement client-side caching for repeated requests
Monitor cache hit rates for optimization
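Client-side caching and hit-rate monitoring can be as simple as a dictionary with expiry times. The sketch below is an in-memory illustration only. TtlCache is a hypothetical class, and the five-minute default lifetime is arbitrary.

```python
import time


class TtlCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = fetch()
        self._entries[key] = (now, value)
        return value

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = TtlCache(ttl_seconds=60)
key = ("markdown", "https://blog.example.com/post")
first = cache.get_or_fetch(key, lambda: "# Article Title")   # miss: calls fetch
second = cache.get_or_fetch(key, lambda: "# Article Title")  # hit: served from cache
print(cache.hits, cache.misses, cache.hit_rate)
```

Keying on both endpoint and URL keeps results from different endpoints for the same page apart.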

Request Batching

For processing multiple URLs:

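import asyncio
import aiohttp

async def batch_process_urls(api_key, urls, endpoint='markdown'):
    """Process all URLs concurrently through one of the GET endpoints."""
    async with aiohttp.ClientSession() as session:
        # One coroutine per URL, all sharing the same session.
        tasks = [process_single_url(session, api_key, url, endpoint)
                 for url in urls]

        # return_exceptions=True returns failures as results
        # instead of raising the first one.
        results = await asyncio.gather(*tasks, return_exceptions=True)
        return results

async def process_single_url(session, api_key, url, endpoint):
    """GET a single URL through the given endpoint and return the parsed JSON."""
    headers = {"Authorization": f"Bearer {api_key}"}
    params = {"url": url}

    async with session.get(f"https://api.tabstack.ai/{endpoint}",
                          headers=headers, params=params) as response:
        response.raise_for_status()  # surface 4xx/5xx responses as exceptions
        return await response.json()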
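The batching code above starts every request at once, which can run into rate limits (429) for long URL lists. One common remedy is to cap the number of requests in flight with a semaphore. The sketch below shows that pattern with a fake worker in place of the real HTTP call. gather_bounded and the limit of 3 are illustrative choices, not API requirements.

```python
import asyncio


async def gather_bounded(items, worker, limit=5):
    """Run worker(item) for every item with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run(item):
        async with semaphore:
            return await worker(item)

    # Failures are returned in place as exception objects, as in the code above.
    return await asyncio.gather(*(run(item) for item in items), return_exceptions=True)


async def _demo():
    active = 0
    peak = 0

    async def fake_fetch(url):
        # Stand-in for a real request; records how many run concurrently.
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)
        active -= 1
        return f"processed {url}"

    urls = [f"https://example.com/{i}" for i in range(10)]
    results = await gather_bounded(urls, fake_fetch, limit=3)
    return results, peak


results, peak = asyncio.run(_demo())
print(len(results), peak)
```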

Monitoring and Alerting

Track these metrics for production use:

Response Times: Monitor 95th percentile latency
Error Rates: Track 4xx and 5xx responses
Rate Limit Usage: Watch for usage approaching your limits
Cache Hit Rates: Optimize caching strategy
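Two of these metrics are easy to compute from raw request logs. The sketch below uses the nearest-rank definition of a percentile, which is one of several in common use, and counts any status of 400 or above as an error. Both function names are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of
    the samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def error_rate(status_codes):
    """Fraction of responses with a 4xx or 5xx status."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if code >= 400)
    return errors / len(status_codes)


latencies_ms = [120, 80, 95, 300, 110, 105, 90, 100, 85, 2000]
statuses = [200, 200, 429, 500]
print(percentile(latencies_ms, 95), error_rate(statuses))
```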

For more examples and advanced usage patterns, see our Research Assistant Example.
