API Reference

The PDF Processor API is served at https://api.docsights.ai/v1. All requests require the X-Api-Key header.

Overview

Item          Value
Base URL      https://api.docsights.ai/v1
Auth          X-Api-Key: YOUR_API_KEY (header)
Content type  application/json
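
As a minimal sketch of how the base URL, auth header, and content type fit together, the following builds an unsent request with Python's standard library. The helper name build_request is hypothetical and not part of the API; only the URL and header names come from the table above.

```python
import json
import urllib.request

BASE_URL = "https://api.docsights.ai/v1"


def build_request(method, path, api_key, body=None):
    """Return an unsent urllib Request carrying the documented headers.

    build_request is an illustrative helper, not part of the API.
    """
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("X-Api-Key", api_key)
    if data is not None:
        # Only requests with a JSON body need the content type.
        req.add_header("Content-Type", "application/json")
    return req


req = build_request("GET", "/documents", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the headers easy to inspect before any network call is made.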

Documents and Jobs

Add a new document

POST /documents

Submit a document for extraction. The request body must include a document object with url and name. Optional fields: schema (inference schema), documentWebhook, jobWebhook.

Request body: DocumentRequest

  • document (required): { url: string (uri), name: string }
  • schema (optional): InferenceSchema — nested fields with dataType (string | number | boolean | object | array), optional request, fields (for object), items (for array)
  • documentWebhook (optional): URI for document completion notification
  • jobWebhook (optional): URI for job completion notification

Response: 202 Accepted (DocumentResponse)

  • document: { documentId (uuid), jobId? (uuid), status: "accepted" }
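
The request shape above could be assembled as in the sketch below. The function name document_request, the example URL, and the schema field names (vendor, lineItems, and so on) are made up for illustration. The sketch also assumes that fields maps names to nested schema fields and that items holds a single schema field; the page does not spell either out.

```python
import json


def document_request(url, name, schema=None, document_webhook=None, job_webhook=None):
    """Assemble a DocumentRequest body, omitting optional keys that are unset."""
    if not url or not name:
        raise ValueError("document.url and document.name are required")
    body = {"document": {"url": url, "name": name}}
    if schema is not None:
        body["schema"] = schema
    if document_webhook is not None:
        body["documentWebhook"] = document_webhook
    if job_webhook is not None:
        body["jobWebhook"] = job_webhook
    return body


# Hypothetical schema: the field names are invented for this example.
schema = {
    "vendor": {"dataType": "string"},
    "lineItems": {
        "dataType": "array",
        "items": {
            "dataType": "object",
            "fields": {
                "description": {"dataType": "string"},
                "amount": {"dataType": "number"},
            },
        },
    },
}

payload = document_request("https://example.com/invoice.pdf", "invoice.pdf", schema=schema)
print(json.dumps(payload, indent=2))
```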

Get all documents

GET /documents

List documents with optional pagination and time range.

Query parameters:

  • cursor (optional): base64 cursor for pagination
  • from (optional): unix timestamp (start)
  • to (optional): unix timestamp (end)

Response: 200 (DocumentListResponse)

  • documents: array of { name, id (uuid), status } — status one of: ACCEPTED, PROCESSING, ERROR, FINISHED, BILLED
  • nextCursor: string or null
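
One way to walk the full listing is to follow nextCursor until it comes back null. The sketch below assumes that a null nextCursor marks the last page, which the page implies but does not state. The fetch_page callable is supplied by the caller and is expected to perform the authenticated GET and return the parsed JSON; both helper names are hypothetical.

```python
from urllib.parse import urlencode


def list_documents_path(cursor=None, start=None, end=None):
    """Build the GET /documents path with only the parameters that are set."""
    params = {}
    if cursor is not None:
        params["cursor"] = cursor
    if start is not None:
        params["from"] = int(start)
    if end is not None:
        params["to"] = int(end)
    query = urlencode(params)  # percent-encodes base64 characters such as "="
    return "/documents" + ("?" + query if query else "")


def iter_documents(fetch_page, start=None, end=None):
    """Yield every document entry, following nextCursor across pages.

    fetch_page(path) must return a parsed DocumentListResponse dict.
    Assumption: a null or missing nextCursor means there are no more pages.
    """
    cursor = None
    while True:
        page = fetch_page(list_documents_path(cursor, start, end))
        yield from page["documents"]
        cursor = page.get("nextCursor")
        if not cursor:
            break
```

Passing the fetch function in keeps the pagination logic independent of any particular HTTP client.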

Get document by ID

GET /documents/{id}

Get details for a single document. id is a UUID path parameter.

Query parameters:

  • include-markdown (optional, default false): include markdown in response
  • include-doctags (optional, default false): include doctags in response

Response: 200 (DocumentDetailResponse)

  • documentId, status, billing (extractionCost, totalQueryCost), jobs (array of jobId and status), and optionally markdown and doctags
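
A small sketch of building the detail path with the two optional flags. The helper name document_detail_path is hypothetical, and serializing the booleans as the string "true" is an assumption; the page only says the parameters default to false.

```python
import uuid
from urllib.parse import urlencode


def document_detail_path(document_id, include_markdown=False, include_doctags=False):
    """Build the GET /documents/{id} path, adding flags only when enabled."""
    doc_id = uuid.UUID(str(document_id))  # raises ValueError if not a valid UUID
    params = {}
    if include_markdown:
        params["include-markdown"] = "true"  # assumed boolean encoding
    if include_doctags:
        params["include-doctags"] = "true"  # assumed boolean encoding
    query = urlencode(params)
    return f"/documents/{doc_id}" + (f"?{query}" if query else "")
```

Validating the UUID locally catches malformed identifiers before a request is sent.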

Add a new job

POST /documents/{id}/jobs

Run an inference job on an existing document. id is the document UUID.

Request body: JobRequest

  • schema (required): InferenceSchema
  • jobWebhook (optional): URI for job completion

Response: 202 Accepted (JobAcceptedResponse)

  • documentId, jobId, status: "ACCEPTED"

Get job by ID

GET /documents/{id}/jobs/{jobId}

Get job status and extracted output. id and jobId are UUID path parameters.

Response: 200 (JobDetailResponse)

  • jobId, status, requestSchema, outputSchema (extracted data per requestSchema)
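
When no jobWebhook is configured, a client can poll this endpoint until the job settles. The sketch below treats ERROR, FINISHED, and BILLED as terminal statuses, which is an assumption; the page lists the status values but not their order. The names wait_for_job and job_path are hypothetical, and fetch_json is a caller-supplied function that performs the authenticated GET and returns the parsed JSON.

```python
import time

# Assumption: these statuses mean the job will not change any further.
TERMINAL_STATUSES = {"ERROR", "FINISHED", "BILLED"}


def job_path(document_id, job_id):
    """Path for GET /documents/{id}/jobs/{jobId}."""
    return f"/documents/{document_id}/jobs/{job_id}"


def wait_for_job(fetch_json, document_id, job_id,
                 interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll the job until it reaches a terminal status or attempts run out."""
    path = job_path(document_id, job_id)
    for attempt in range(max_attempts):
        job = fetch_json(path)
        # Compare case-insensitively in case status casing varies.
        if str(job["status"]).upper() in TERMINAL_STATUSES:
            return job
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays.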

Schema reference

  • InferenceSchemaField: dataType (required), optional request, fields (for object), items (for array). Recursive for nested structures.
  • InferenceSchema: object whose values are InferenceSchemaField objects.
  • Document and job status values: ACCEPTED, PROCESSING, ERROR, FINISHED, BILLED.
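
Because InferenceSchemaField is recursive, a schema can be checked locally before submission. The following is a rough sketch, not the service's own validation. It assumes that fields maps names to nested fields and that items is a single nested field; the function names are hypothetical.

```python
DATA_TYPES = {"string", "number", "boolean", "object", "array"}


def validate_field(field, path):
    """Return a list of problems found in one InferenceSchemaField."""
    if not isinstance(field, dict):
        return [f"{path}: expected an object"]
    problems = []
    data_type = field.get("dataType")
    if not isinstance(data_type, str) or data_type not in DATA_TYPES:
        problems.append(f"{path}: dataType must be one of {sorted(DATA_TYPES)}")
    fields = field.get("fields")
    if data_type == "object" and isinstance(fields, dict):
        # Assumption: "fields" maps child names to nested schema fields.
        for name, child in fields.items():
            problems += validate_field(child, f"{path}.{name}")
    if data_type == "array" and "items" in field:
        # Assumption: "items" is a single nested schema field.
        problems += validate_field(field["items"], f"{path}[]")
    return problems


def validate_schema(schema):
    """Validate every top-level entry of an InferenceSchema."""
    problems = []
    for name, field in schema.items():
        problems += validate_field(field, name)
    return problems
```

Each problem string is prefixed with the path to the offending field, for example lines[].qty for a field nested inside an array of objects.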

For a machine-readable spec, use the OpenAPI definition at api.docsights.ai or the api-spec.yaml in the repo to generate clients.
