Vespa provides a comprehensive set of HTTP APIs for interacting with your application. These APIs enable document management, search queries, and application deployment operations.

Available APIs

Vespa exposes the following main HTTP APIs:

Document v1 API

CRUD operations for documents

Search API

Query and retrieve documents

Deploy API

Deploy and manage applications

Base URL Structure

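All Vespa HTTP APIs follow a consistent URL structure:
http://<host>:<port>/<api-path>
On a local self-hosted instance with default ports (8080 for the container, 19071 for the config server), the endpoints are:
  • Document API: http://localhost:8080/document/v1/
  • Search API: http://localhost:8080/search/
  • Deploy API: http://localhost:19071/application/v2/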
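As an illustration of the URL structure, the following minimal sketch assembles request URLs from the base paths listed here. The helper names `document_url` and `search_url` are hypothetical and not part of any Vespa client library, and the hosts and ports assume the local defaults shown above.

```python
from urllib.parse import quote, urlencode

# Base URLs copied from the list above; adjust host and port for your deployment.
API_BASES = {
    "document": "http://localhost:8080/document/v1/",
    "search": "http://localhost:8080/search/",
    "deploy": "http://localhost:19071/application/v2/",
}


def document_url(namespace, doctype, docid, **params):
    """Build a Document v1 URL for one document (hypothetical helper)."""
    segments = (namespace, doctype, "docid", docid)
    path = "/".join(quote(str(s), safe="") for s in segments)
    url = API_BASES["document"] + path
    return url + ("?" + urlencode(params) if params else "")


def search_url(yql, **params):
    """Build a Search API URL with a URL-encoded yql parameter (hypothetical helper)."""
    return API_BASES["search"] + "?" + urlencode({"yql": yql, **params})


print(document_url("mynamespace", "music", 1, timeout="30s"))
```

Encoding each path segment and query value separately keeps document IDs and YQL strings with special characters from corrupting the URL.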

Authentication

Vespa supports multiple authentication mechanisms depending on your deployment:

Vespa Cloud

For Vespa Cloud deployments, all API requests require authentication using:
  • mTLS (Mutual TLS): Certificate-based authentication for production environments
  • API Keys: Token-based authentication for development and CI/CD
curl -H "Authorization: Bearer <api-key>" \
  "https://<endpoint>.vespa-app.cloud/search/?yql=..."
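The two mechanisms can be sketched in Python with the standard library alone. This is a sketch under assumptions: `bearer_request` and `mtls_context` are hypothetical helper names, the hostname in the usage comment is a placeholder, and the certificate and key paths must point at files issued for your own application.

```python
import ssl
import urllib.request


def bearer_request(url, api_key):
    """Attach the key as a Bearer token, mirroring the curl example above."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def mtls_context(cert_file, key_file):
    """Create a TLS context that presents a client certificate (mTLS)."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


# Usage (not executed here; requires a reachable endpoint and real credentials):
# request = bearer_request("https://example.vespa-app.cloud/search/?yql=...", "<api-key>")
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

For mTLS, the context returned by `mtls_context` would be passed as `urlopen(request, context=...)` in place of the Authorization header.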

Self-Hosted Vespa

For self-hosted deployments:
  • Authentication is optional and configured per container
  • Custom authentication filters can be integrated
  • Token-based authentication is possible via custom handlers

Request Format

Content Types

Vespa APIs support the following content types:
  • JSON (default): application/json
  • CBOR: application/cbor (a compact binary encoding of the JSON data model, typically cheaper to serialize and parse)

Common Headers

  • Content-Type (string, default: "application/json"): The MIME type of the request body
  • Accept (string, default: "application/json"): The desired response format, either application/json or application/cbor
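A request that sets both headers explicitly might look like the sketch below. It assumes the Document v1 convention of wrapping document fields in a top-level "fields" object; the helper name `json_post` and the sample field are illustrative only.

```python
import json
import urllib.request


def json_post(url, fields):
    """Build a POST request with a JSON body and explicit content headers."""
    body = json.dumps({"fields": fields}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",  # format of the request body
            "Accept": "application/json",        # format wanted in the response
        },
    )


request = json_post(
    "http://localhost:8080/document/v1/mynamespace/music/docid/1",
    {"title": "Example title"},  # hypothetical field
)
print(request.get_method(), request.full_url)
```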

Response Format

Document API responses follow a consistent JSON structure. A successful document operation returns, for example:
{
  "pathId": "/document/v1/namespace/doctype/docid/1",
  "id": "id:namespace:doctype::1",
  "message": "Success"
}

Error Responses

Error responses use the same structure, with the message field describing the failure:
{
  "pathId": "/document/v1/namespace/doctype/docid/1",
  "message": "Document not found"
}
  • pathId (string): The request path that generated the response
  • message (string): Human-readable message describing the result or error
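A client can read these fields defensively, since the error example above omits "id". The following sketch parses a response body into a small record; the `DocumentResult` type and `parse_result` function are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentResult:
    path_id: Optional[str]
    doc_id: Optional[str]
    message: Optional[str]


def parse_result(body: str) -> DocumentResult:
    """Extract pathId, id and message, tolerating fields that are absent."""
    data = json.loads(body)
    return DocumentResult(data.get("pathId"), data.get("id"), data.get("message"))


error_body = """{
  "pathId": "/document/v1/namespace/doctype/docid/1",
  "message": "Document not found"
}"""
result = parse_result(error_body)
print(result.message)
```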

HTTP Status Codes

Vespa uses standard HTTP status codes:
Status Code                  Description
200 OK                       Request succeeded
201 Created                  Document created successfully
400 Bad Request              Invalid request format or parameters
404 Not Found                Document or resource not found
412 Precondition Failed      Test-and-set condition failed
429 Too Many Requests        Rate limit exceeded or system overload
500 Internal Server Error    Server-side error
504 Gateway Timeout          Request timeout exceeded
507 Insufficient Storage     No storage space available
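Client code often needs to map these codes to a handling decision. The sketch below groups them by class and flags which ones to retry. Treating only 429 and 504 as retryable is an assumption made for this example, not a rule stated by Vespa, and `classify` is a hypothetical helper.

```python
# Status codes and descriptions as listed in the table above.
STATUS_DESCRIPTIONS = {
    200: "Request succeeded",
    201: "Document created successfully",
    400: "Invalid request format or parameters",
    404: "Document or resource not found",
    412: "Test-and-set condition failed",
    429: "Rate limit exceeded or system overload",
    500: "Server-side error",
    504: "Request timeout exceeded",
    507: "No storage space available",
}

# Assumption: only overload and timeout responses are retried unchanged.
RETRYABLE = {429, 504}


def classify(status):
    """Return (category, should_retry) for an HTTP status code."""
    if 200 <= status < 300:
        category = "success"
    elif 400 <= status < 500:
        category = "client_error"
    elif 500 <= status < 600:
        category = "server_error"
    else:
        category = "other"
    return category, status in RETRYABLE


print(classify(429), STATUS_DESCRIPTIONS[429])
```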

Timeouts

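All APIs accept a timeout parameter. The default below is the one that applies to the Document v1 request shown in the example; other APIs, such as the Search API, may use a different default.
  • timeout (string, default: "180s"): Request timeout, given as a number of seconds or with a unit suffix (e.g., 5s, 1000ms)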
curl "http://localhost:8080/document/v1/mynamespace/music/docid/1?timeout=30s"
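When the same value also drives a client-side socket timeout, it helps to convert the string to seconds. The following is a minimal sketch that handles only the two suffixes mentioned above (s and ms) plus bare numbers; `timeout_seconds` is a hypothetical helper, and any other suffix is rejected rather than guessed.

```python
import re

# A number, optionally followed by "ms" or "s"; a bare number means seconds.
_TIMEOUT_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)(ms|s)?$")


def timeout_seconds(value):
    """Convert a timeout string such as '30s', '1000ms' or '5' to seconds."""
    match = _TIMEOUT_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported timeout value: {value!r}")
    number = float(match.group(1))
    unit = match.group(2) or "s"
    return number / 1000 if unit == "ms" else number


print(timeout_seconds("30s"))
```

A client would typically set its own socket timeout somewhat above this value so that the server-side timeout response arrives before the connection is abandoned.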

Tracing

Enable request tracing for debugging:
  • tracelevel (integer, default: "0"): Trace level from 0 (off) to 9 (maximum detail)
curl "http://localhost:8080/search/?yql=...&tracelevel=5"
The trace information is included in the response:
{
  "trace": {
    "children": [
      {
        "message": "Invoking chain 'default'"
      }
    ]
  }
}
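Because the trace is a tree of nested "children" arrays, reading it usually means walking it recursively. The sketch below collects every "message" in depth-first order. It assumes only the shape shown in the example above; real traces may carry additional fields, and `trace_messages` is a hypothetical helper.

```python
def trace_messages(node):
    """Yield every 'message' value in a trace tree, depth first."""
    if "message" in node:
        yield node["message"]
    for child in node.get("children", []):
        yield from trace_messages(child)


# The response fragment from the example above.
response = {
    "trace": {
        "children": [
            {"message": "Invoking chain 'default'"}
        ]
    }
}

for message in trace_messages(response["trace"]):
    print(message)
```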

Rate Limiting

Vespa implements automatic rate limiting to protect system stability:
  • Document API: Queue-based throttling with configurable limits
  • Search API: Thread pool saturation monitoring
  • Deploy API: Per-tenant rate limiting
When rate limits are exceeded, the API returns HTTP 429 (Too Many Requests).

Performance Considerations

Best Practices

  1. Batch Operations: Use visitor operations for bulk reads instead of individual GET requests
  2. Connection Reuse: Use HTTP keep-alive and connection pooling
  3. CBOR Format: Use CBOR for better performance with large responses
  4. Timeout Configuration: Set appropriate timeouts based on operation complexity
  5. Compression: Enable HTTP compression for large payloads
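The first practice, bulk reads through visiting, follows a paging pattern: request a page, process its documents, and pass the returned continuation token to the next request. The sketch below assumes each page is a JSON object with a "documents" list and an optional "continuation" token; verify this shape against the Document v1 reference before relying on it. The `fetch_page` callable is a hypothetical stand-in for the actual HTTP request.

```python
def visit_all(fetch_page):
    """Yield documents from successive pages until no continuation remains.

    fetch_page(continuation) must return one decoded page as a dict.
    """
    continuation = None
    while True:
        page = fetch_page(continuation)
        yield from page.get("documents", [])
        continuation = page.get("continuation")
        if not continuation:
            break


# Demonstration with two fake pages keyed by continuation token.
_pages = {
    None: {"documents": [{"id": "id:namespace:doctype::1"}], "continuation": "next"},
    "next": {"documents": [{"id": "id:namespace:doctype::2"}]},
}
documents = list(visit_all(lambda token: _pages[token]))
print(len(documents))
```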

Example: Connection Pooling

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Reuse one session so connections are kept alive and pooled across requests.
session = requests.Session()

# Retry failed requests up to 3 times with a short exponential backoff.
retries = Retry(total=3, backoff_factor=0.1)
adapter = HTTPAdapter(max_retries=retries, pool_connections=10, pool_maxsize=20)
session.mount('http://', adapter)
session.mount('https://', adapter)

response = session.get('http://localhost:8080/search/?yql=...')

Monitoring and Metrics

Vespa exposes API metrics through:
  • Prometheus metrics: Available at /prometheus/v1/values
  • JSON metrics: Available at /state/v1/metrics
Key metrics to monitor:
  • http.status.2xx: Successful requests
  • http.status.4xx: Client errors
  • http.status.5xx: Server errors
  • http.request.latency: Request latency percentiles
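Once the status counters have been read from a metrics endpoint, a simple derived signal is the share of each status class. The sketch below works on a plain dictionary of counts and leaves out how those counts are extracted from the endpoint payload. The sample numbers are made up for illustration, and `status_shares` is a hypothetical helper.

```python
def status_shares(counts):
    """Return each status class as a fraction of all counted requests."""
    keys = ("http.status.2xx", "http.status.4xx", "http.status.5xx")
    total = sum(counts.get(key, 0) for key in keys)
    if total == 0:
        return {key: 0.0 for key in keys}
    return {key: counts.get(key, 0) / total for key in keys}


# Made-up sample counts, for illustration only.
sample = {"http.status.2xx": 90, "http.status.4xx": 8, "http.status.5xx": 2}
shares = status_shares(sample)
print(shares["http.status.5xx"])
```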

Next Steps

Document v1 API

Learn about document CRUD operations

Search API

Explore query capabilities

Deploy API

Deploy your applications

Query Language

Learn YQL query syntax