ParseResponse

Response from document parsing operations.
duration
float
required
The duration of the parse request in seconds.
job_id
string
required
Unique identifier for the parse job.
result
Result
required
The response from the document processing service: either a Full Result or a URL Result. Because of HTTPS response size limits, large responses are returned as a presigned URL instead of inline.
usage
ParseUsage
required
Usage information for the parse operation.
pdf_url
string
The storage URL of the converted PDF file.
The link to the studio pipeline for the document.

Result Types

Full Result

chunks
List[Chunk]
required
Array of extracted chunks from the document.
type
literal
required
Always set to “full”.
custom
object
Custom metadata fields.
ocr
OCRResult
OCR data including lines and words with bounding boxes and confidence scores.

URL Result

result_id
string
required
Unique identifier for the result.
type
literal
required
Always set to “url”.
url
string
required
Presigned URL to download the full result.
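
Because `result` may arrive inline or as a presigned URL, callers usually need to branch on `type`. The following is a minimal sketch, not the official client: it assumes the response has already been decoded into a Python dict and that the presigned URL serves the Full Result as JSON (neither is stated above). `resolve_result`, `_http_get`, and the `fetch` parameter are hypothetical names used only for illustration.

```python
import json
import urllib.request
from typing import Any, Callable, Dict


def _http_get(url: str) -> bytes:
    """Default fetcher: a plain GET against the presigned URL."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def resolve_result(
    result: Dict[str, Any],
    fetch: Callable[[str], bytes] = _http_get,
) -> Dict[str, Any]:
    """Return a Full Result, downloading it first if a URL Result was returned."""
    if result["type"] == "full":
        return result
    if result["type"] == "url":
        # Assumption: the presigned URL returns the Full Result as a JSON body.
        return json.loads(fetch(result["url"]))
    raise ValueError(f"unexpected result type: {result['type']!r}")


# Usage with a stubbed fetcher, so nothing touches the network.
def stub_fetch(url: str) -> bytes:
    return json.dumps({"type": "full", "chunks": []}).encode()


url_result = {"type": "url", "result_id": "abc", "url": "https://example.invalid/r"}
full = resolve_result(url_result, fetch=stub_fetch)
print(full["type"])  # full
```

Passing the fetcher as a parameter keeps the branching logic testable without a live presigned URL.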

Chunk Structure

blocks
List[Block]
required
Array of blocks within the chunk.
content
string
required
The content of the chunk extracted from the document.
embed
string
required
Chunk content optimized for embedding and retrieval.
enriched
string
The enriched content of the chunk extracted from the document.
enrichment_success
boolean
Whether the enrichment was successful.
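
A chunk carries up to three text variants (`content`, `embed`, and the optional `enriched`), so an indexing step has to choose one. The sketch below shows one possible policy, assuming chunks are plain dicts: default to the required `embed` field, and use `enriched` only when the caller opts in and `enrichment_success` is true. `retrieval_text` and `prefer_enriched` are hypothetical names, and the policy is an illustration rather than a documented recommendation.

```python
from typing import Any, Dict


def retrieval_text(chunk: Dict[str, Any], prefer_enriched: bool = False) -> str:
    """Pick the text to index for one chunk.

    `embed` is required, so it is the default. `enriched` is optional and is
    used only when requested and when enrichment reportedly succeeded.
    """
    if prefer_enriched and chunk.get("enrichment_success") and chunk.get("enriched"):
        return chunk["enriched"]
    return chunk["embed"]


chunk = {
    "blocks": [],
    "content": "Revenue | 2023\n10 | 12",
    "embed": "Table: revenue by year. 2023: 12.",
    "enriched": "Revenue grew to 12 in 2023.",
    "enrichment_success": False,
}
# Enrichment failed, so this falls back to the `embed` text.
print(retrieval_text(chunk, prefer_enriched=True))
```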

Block Structure

bbox
BoundingBox
required
The bounding box of the block extracted from the document.
content
string
required
The content of the block extracted from the document.
type
enum
required
The type of block. One of: Header, Footer, Title, Section Header, Page Number, List Item, Figure, Table, Key Value, Text, Comment, Signature.
chart_data
List[string]
(Experimental) Links to the chart data JSON for figure blocks processed by the chart agent.
confidence
string
The confidence level for the block (“low” or “high”), taking into account factors such as OCR and table structure.
extra
Dict[string, object]
Extra metadata fields for the block. Fields like ‘is_chart’ will only appear when set to True.
granular_confidence
GranularConfidence
Granular confidence scores for the block. Available when numeric confidence scores are enabled.
image_url
string
(Experimental) The URL of the image associated with the block.
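
Blocks are nested inside chunks, and `confidence` is optional, so a review pass has to walk both levels and tolerate a missing field. A minimal sketch, assuming a Full Result's `chunks` decoded into plain dicts; `iter_blocks` and `low_confidence` are hypothetical helper names, and the empty `bbox` dicts are placeholders because the BoundingBox fields are not described in this section.

```python
from typing import Any, Dict, Iterator, List, Optional

Block = Dict[str, Any]


def iter_blocks(chunks: List[Dict[str, Any]]) -> Iterator[Block]:
    """Yield every block from every chunk, in document order."""
    for chunk in chunks:
        yield from chunk["blocks"]


def low_confidence(chunks: List[Dict[str, Any]],
                   block_type: Optional[str] = None) -> List[Block]:
    """Blocks whose optional `confidence` is "low", optionally of one type."""
    return [
        block for block in iter_blocks(chunks)
        if block.get("confidence") == "low"
        and (block_type is None or block["type"] == block_type)
    ]


chunks = [
    {"content": "...", "embed": "...", "blocks": [
        {"type": "Title", "content": "Q3 Report", "bbox": {}, "confidence": "high"},
        {"type": "Table", "content": "a | b", "bbox": {}, "confidence": "low"},
    ]},
    {"content": "...", "embed": "...", "blocks": [
        {"type": "Text", "content": "Notes", "bbox": {}},  # no confidence field
    ]},
]
print([b["content"] for b in low_confidence(chunks, "Table")])  # ['a | b']
```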

ExtractResponse

Response from structured data extraction operations.
result
List[object]
required
The extracted response in your provided schema, as a list of dictionaries. If disable_chunking is True (the default), the list has length one.
usage
ExtractUsage
required
Usage information for the extract operation.
citations
List[object]
The citations corresponding to the extracted response.
job_id
string
Unique identifier for the extract job.
The link to the studio pipeline for the document.

V3ExtractResponse

Response from V3 extraction operations.
result
Union[List[object], object]
required
The extracted response in your provided schema. This is a list of dictionaries. If disable_chunking is True (default), then it will be a list of length one.
usage
ExtractUsage
required
Usage information for the extract operation.
job_id
string
Unique identifier for the extract job.
The link to the studio pipeline for the document.
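
The declared type of `result` here is `Union[List[object], object]`, so downstream code may receive either a list or a single object. One way to handle both shapes uniformly is to normalize to a list first. This is a sketch under the assumption that the response is a decoded dict and that a non-list value is a single record; `as_records` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Union

Record = Dict[str, Any]


def as_records(result: Union[List[Record], Record]) -> List[Record]:
    """Normalize a V3 extract `result` to a list of records."""
    return result if isinstance(result, list) else [result]


print(as_records({"total": 42}))    # [{'total': 42}]
print(as_records([{"total": 42}]))  # [{'total': 42}]
```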

SplitResponse

Response from document splitting operations.
result
SplitResult
required
The split result containing section mappings and splits.
usage
ParseUsage
required
Usage information for the split operation.

SplitResult Structure

splits
List[Split]
required
Array of document splits.
section_mapping
Dict[string, List[int]]
Mapping of section names to page numbers.

Split Structure

name
string
required
Name of the split section.
pages
List[int]
required
Page numbers included in this split.
conf
enum
Confidence level of the split (“high” or “low”).
partitions
List[Partition]
Sub-partitions within this split.
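
Each split lists its own pages, which is convenient per section but awkward when the question is "which sections cover page N?". The sketch below inverts the `splits` list into a page-to-names mapping. It assumes splits are plain dicts and that a page may appear in more than one split; `splits_by_page` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Any, Dict, List


def splits_by_page(splits: List[Dict[str, Any]]) -> Dict[int, List[str]]:
    """Map each page number to the names of the splits that include it."""
    by_page: Dict[int, List[str]] = defaultdict(list)
    for split in splits:
        for page in split["pages"]:
            by_page[page].append(split["name"])
    return dict(sorted(by_page.items()))


splits = [
    {"name": "Invoice", "pages": [1, 2], "conf": "high"},
    {"name": "Terms", "pages": [2, 3], "conf": "low"},
]
print(splits_by_page(splits))
# {1: ['Invoice'], 2: ['Invoice', 'Terms'], 3: ['Terms']}
```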

EditResponse

Response from document editing operations.
document_url
string
required
Presigned URL to download the edited document.
form_schema
List[FormSchema]
Form schema for PDF forms. List of widgets with their types, descriptions, and bounding boxes.
usage
ParseUsage
Usage information for the edit operation, including number of pages and credits charged.

FormSchema Structure

bbox
BoundingBox
required
Bounding box coordinates of the widget.
description
string
required
Description of the widget extracted from the document.
type
enum
required
Type of the form widget. One of: text, checkbox, radio, dropdown, barcode.
fill
boolean
If True (default), the system will attempt to fill this widget. If False, the widget will be created but intentionally left unfilled.
value
string
If provided, this value will be used directly instead of attempting to intelligently determine the field value.
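
The `fill` and `value` fields interact: `fill` defaults to True, and an explicit `value` bypasses automatic filling. The sketch below classifies widgets accordingly. It assumes widgets are plain dicts and, as an assumption not confirmed above, that `fill: False` takes precedence over a supplied `value`. `fill_plan` and its return labels are hypothetical, and the empty `bbox` dicts are placeholders.

```python
from typing import Any, Dict


def fill_plan(widget: Dict[str, Any]) -> str:
    """Describe how a widget would be handled, based on `fill` and `value`."""
    if not widget.get("fill", True):   # `fill` defaults to True when absent
        return "leave_unfilled"
    if widget.get("value") is not None:
        return "use_value"             # explicit value is used directly
    return "infer"                     # value is determined automatically


form_schema = [
    {"type": "text", "description": "Full name", "bbox": {}, "value": "Ada Lovelace"},
    {"type": "checkbox", "description": "Subscribe to newsletter", "bbox": {}, "fill": False},
    {"type": "text", "description": "Date of birth", "bbox": {}},
]
print([fill_plan(w) for w in form_schema])
# ['use_value', 'leave_unfilled', 'infer']
```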

PipelineResponse

Response from pipeline operations combining multiple processing steps.
job_id
string
required
Unique identifier for the pipeline job.
result
PipelineResult
required
Combined results from all pipeline steps.
usage
ParseUsage
required
Total usage information for the pipeline operation.

PipelineResult Structure

extract
Union[List[ExtractResult], ExtractResponse, V3ExtractResponse]
Extract operation results. Can be a list of results (for Extract -> Split pipelines) or a single result.
parse
Union[ParseResponse, List[ParseResponse]]
Parse operation results. Can be a single response or list of responses.
split
SplitResponse
Split operation results.
edit
EditResponse
Edit operation results.
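
Since `parse` can be a single ParseResponse or a list of them, and every step field is optional, code that consumes a pipeline result benefits from normalizing before iterating. A minimal sketch, assuming decoded dicts; `inline_chunks` is a hypothetical helper, and skipping URL Results (rather than downloading them) is a simplification for the example.

```python
from typing import Any, Dict, List


def inline_chunks(pipeline_result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect chunks from every inline (type "full") parse result."""
    parse = pipeline_result.get("parse")
    if parse is None:
        return []
    responses = parse if isinstance(parse, list) else [parse]
    chunks: List[Dict[str, Any]] = []
    for response in responses:
        result = response["result"]
        if result["type"] == "full":
            chunks.extend(result["chunks"])
        # URL Results are skipped here; they would need to be downloaded first.
    return chunks


pipeline_result = {
    "parse": [
        {"job_id": "a", "result": {"type": "full", "chunks": [{"content": "one"}]}},
        {"job_id": "b", "result": {"type": "url", "result_id": "r", "url": "https://example.invalid/r"}},
    ],
}
print(len(inline_chunks(pipeline_result)))  # 1
```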