Extract

The Extract module pulls precise, nested properties out of unstructured documents. You provide a standard JSON Schema, and the AI engine locates the matching values, formats them, and returns a JSON object that conforms to the schema, turning unstructured documents into structured data records.

 ┌───────────────┐                                 ┌─────────────────────────┐
 │ Raw Document  │      ┌───────────────────┐      │ Extracted JSON Output   │
 │ (Invoice PDF) │ ───► │  AI Engine        │ ───► │ {                       │
 │               │      │  + User Schema    │      │   "totalAmount": 450,   │
 │               │      └───────────────────┘      │   "invoiceNumber": ...  │
 └───────────────┘                                 │ }                       │
                                                   └─────────────────────────┘

Guaranteed Schema Adherence: The output always matches your provided JSON Schema.
Complex Data Extraction: Easily extract nested arrays (like invoice line items) or deep hierarchical objects.
Contextual Instructions: Append custom instructions to guide the AI's logic for specific fields.
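
As an illustration of the nested case, the sketch below writes a JSON Schema for an invoice with line items as a Python dictionary. The `lineItems` field and its sub-fields are hypothetical names chosen for this example; only `invoiceNumber` and `totalAmount` appear elsewhere on this page.

```python
import json

# Hypothetical schema: "lineItems" and its sub-fields are illustrative names,
# not fields defined by the Extract module.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoiceNumber": {"type": "string"},
        "totalAmount": {"type": "number"},
        "lineItems": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                    "unitPrice": {"type": "number"},
                },
                "required": ["description", "quantity", "unitPrice"],
            },
        },
    },
    "required": ["invoiceNumber", "totalAmount", "lineItems"],
}

# Serialize to the JSON text that would be sent as the "jsonSchema" parameter.
schema_json = json.dumps(INVOICE_SCHEMA, indent=2)
```

Marking `lineItems` as required is a choice made for this example; drop it from `required` if some documents legitimately have no line items.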

🏗️ Configuration

The Extract module is defined within the extractionConfig block.

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `jsonSchema` | `object` | Yes | A valid JSON Schema (Draft 7+) defining the desired output structure. |
| `llmId` | `UUID` | No | Reference to a pre-saved LLM configuration ID to use for extraction. |
| `llmConfig` | `object` | No | Inline LLM configuration (overrides `llmId` or workspace defaults). |
| `systemPromptAppend` | `string` | No | Extra instructions for the AI (e.g., "Only extract values in USD"). |
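
One way to assemble these parameters is sketched below. The helper `build_extraction_config` is a hypothetical name, not part of any SDK, and the UUID is a placeholder. Because the table states that `llmConfig` overrides `llmId`, the sketch sends only `llmConfig` when both are supplied; that is a convenience chosen here, not a documented requirement.

```python
from typing import Any, Optional


def build_extraction_config(
    json_schema: dict[str, Any],
    llm_id: Optional[str] = None,
    llm_config: Optional[dict[str, Any]] = None,
    system_prompt_append: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble an extractionConfig dict, omitting optional keys left unset."""
    config: dict[str, Any] = {"jsonSchema": json_schema}
    if llm_config is not None:
        # Inline llmConfig takes precedence, so llmId is left out in that case.
        config["llmConfig"] = llm_config
    elif llm_id is not None:
        config["llmId"] = llm_id
    if system_prompt_append:
        config["systemPromptAppend"] = system_prompt_append
    return config


config = build_extraction_config(
    {"type": "object", "properties": {"totalAmount": {"type": "number"}}},
    llm_id="00000000-0000-0000-0000-000000000000",  # placeholder UUID
    system_prompt_append="Only extract values in USD",
)
```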

🛠️ Extraction Operations

Extraction is typically performed as part of a stateless process run. For a detailed technical reference of every field and parameter, see the API Process Reference.

Execute an Extraction Run

To extract structured data, submit a process request to your workspace with an extractionConfig.

curl -X POST "https://api.axelered.com/v1/w/{workspace_id}/process" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "extractionConfig": {
        "jsonSchema": {
          "type": "object",
          "properties": {
            "invoiceNumber": { "type": "string" },
            "totalAmount": { "type": "number" }
          },
          "required": ["invoiceNumber", "totalAmount"]
        }
      }
    },
    "files": [
      {
        "url": "https://example.com/invoice.pdf"
      }
    ]
  }'
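
The same request can be built from Python with only the standard library. This is a minimal sketch mirroring the curl command above: `build_process_request` is a hypothetical helper name and `my-workspace` is a placeholder workspace ID. The request is constructed but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.axelered.com/v1"


def build_process_request(workspace_id, api_key, json_schema, file_url):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "config": {"extractionConfig": {"jsonSchema": json_schema}},
        "files": [{"url": file_url}],
    }
    return urllib.request.Request(
        f"{API_BASE}/w/{workspace_id}/process",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


schema = {
    "type": "object",
    "properties": {
        "invoiceNumber": {"type": "string"},
        "totalAmount": {"type": "number"},
    },
    "required": ["invoiceNumber", "totalAmount"],
}
request = build_process_request(
    "my-workspace",  # placeholder workspace ID
    "YOUR_API_KEY",
    schema,
    "https://example.com/invoice.pdf",
)
# To send it, pass the object to urllib.request.urlopen(request).
```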

Once the run reaches a completed state, the structured result is available via the extractionUrl returned in the task's documents list.
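
A rough sketch of collecting those URLs from a task response follows. The response shape is an assumption: only `extractionUrl` is named on this page, while the `status` and `documents` keys and the `"completed"` value are guesses to be checked against the API Process Reference.

```python
from typing import Any


def completed_extraction_urls(task: dict[str, Any]) -> list[str]:
    """Return each document's extractionUrl once the task is completed.

    Assumed shape: {"status": "...", "documents": [{"extractionUrl": "..."}]}.
    Only "extractionUrl" is named in the documentation; the rest is a guess.
    """
    if task.get("status") != "completed":
        return []
    return [
        doc["extractionUrl"]
        for doc in task.get("documents", [])
        if doc.get("extractionUrl")
    ]


# Example task payloads with made-up values.
pending = {"status": "running", "documents": []}
done = {
    "status": "completed",
    "documents": [{"extractionUrl": "https://example.com/extraction/1"}],
}
urls = completed_extraction_urls(done)
```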

Read & List Results

To retrieve extracted data across one or multiple documents, use the following endpoints:

List Extractions: Retrieve all extracted JSON objects for a complete run.
Read Document Extraction: Directly access the extraction result for a specific document.
