Job Management - Reducto Python SDK

Overview

The Reducto SDK lets you track asynchronous document processing jobs. You can retrieve an individual job by ID, list your jobs with cursor-based pagination, and cancel jobs that are still pending or processing.

Get a Specific Job

Retrieve the status and results of a specific job using its job ID:

from reducto import Reducto

client = Reducto()

# Get job by ID
job = client.job.get(job_id="job_abc123")

print(f"Job status: {job.status}")
if job.status == "completed":
    print(f"Result: {job.result}")

Job Response

The job response includes:

job_id: Unique identifier for the job
status: Current status (pending, processing, completed, failed, cancelled)
result: The processing result (when status is completed)
error: Error details (when status is failed)
created_at: Job creation timestamp
completed_at: Job completion timestamp (when applicable)
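
As an illustration of how these fields relate to each status, the sketch below formats a one-line summary of a job. It assumes the response exposes the fields listed above as attributes. `describe_job` is a hypothetical helper, not part of the SDK, and the `SimpleNamespace` object only stands in for a real response.

```python
from types import SimpleNamespace


def describe_job(job):
    """Return a one-line summary of a job response object.

    Assumes the object exposes the documented fields as attributes.
    """
    if job.status == "completed":
        return f"{job.job_id}: completed at {job.completed_at}"
    if job.status == "failed":
        return f"{job.job_id}: failed ({job.error})"
    # pending, processing, or cancelled: no result or error to report
    return f"{job.job_id}: {job.status} (created {job.created_at})"


# Stand-in object for illustration; real responses come from client.job.get(...)
example = SimpleNamespace(
    job_id="job_abc123",
    status="failed",
    result=None,
    error="unsupported file type",
    created_at="2024-01-01T00:00:00Z",
    completed_at=None,
)
print(describe_job(example))
```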

Get All Jobs

Retrieve a list of all your jobs with pagination support:

from reducto import Reducto

client = Reducto()

# Get first page of jobs
response = client.job.get_all(
    limit=100  # Max 500, defaults to 100
)

for job in response.jobs:
    print(f"Job {job.job_id}: {job.status}")

# Check if there are more results
if response.next_cursor:
    print(f"Next page cursor: {response.next_cursor}")

Pagination

Use the cursor parameter to fetch subsequent pages:

# Get first page
response = client.job.get_all(limit=100)

# Get next page using cursor from previous response
if response.next_cursor:
    next_page = client.job.get_all(
        cursor=response.next_cursor,
        limit=100
    )

Exclude Configurations

Reduce response size by excluding configuration details:

response = client.job.get_all(
    limit=100,
    exclude_configs=True  # Omit raw_config from response
)

Parameters

limit (int, default: 100): Maximum number of jobs to return per page (max 500).
cursor (str): Cursor for pagination. Use the next_cursor from the previous response to fetch the next page.
exclude_configs (bool, default: False): Exclude raw_config from the response to reduce its size.

Cancel a Job

Cancel a running or pending job:

from reducto import Reducto

client = Reducto()

# Cancel a job
client.job.cancel(job_id="job_abc123")

print("Job cancelled successfully")

You can only cancel jobs that are in pending or processing status. Completed, failed, or already cancelled jobs cannot be cancelled.
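
One way to respect this restriction is to check the status before sending the cancel request. The following is a minimal sketch, assuming `client` is a `Reducto` instance and that status values are the lowercase strings used on this page. `cancel_if_active` and `CANCELLABLE_STATUSES` are hypothetical names, not part of the SDK.

```python
CANCELLABLE_STATUSES = {"pending", "processing"}


def cancel_if_active(client, job_id):
    """Cancel the job only if it is still pending or processing.

    Returns True if a cancel request was sent, False otherwise.
    """
    job = client.job.get(job_id=job_id)
    if job.status not in CANCELLABLE_STATUSES:
        return False
    client.job.cancel(job_id=job_id)
    return True
```

A job can still finish between the status check and the cancel call, so the cancel request may fail even after the check passes. Callers should be prepared to handle that error.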

Async Job Management

All job operations work with the async client:

import asyncio
from reducto import AsyncReducto

client = AsyncReducto()

async def main():
    # Get a job
    job = await client.job.get(job_id="job_abc123")
    print(f"Status: {job.status}")
    
    # Get all jobs
    response = await client.job.get_all(limit=50)
    for job in response.jobs:
        print(f"Job {job.job_id}: {job.status}")
    
    # Cancel a job
    await client.job.cancel(job_id="job_abc123")

asyncio.run(main())
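
A common reason to use the async client is to check several jobs at once. The sketch below fetches a batch of jobs concurrently with `asyncio.gather`. It assumes `client` is an `AsyncReducto` instance whose `job.get` is awaitable, as in the example above. `fetch_statuses` is a hypothetical helper.

```python
import asyncio


async def fetch_statuses(client, job_ids):
    """Fetch several jobs concurrently and map each job ID to its status."""
    jobs = await asyncio.gather(
        *(client.job.get(job_id=job_id) for job_id in job_ids)
    )
    # gather preserves input order, so results line up with job_ids
    return {job_id: job.status for job_id, job in zip(job_ids, jobs)}
```

For example, `await fetch_statuses(client, ["job_abc123", "job_def456"])` would return a dictionary keyed by job ID. The second ID here is a placeholder.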

Complete Example: Job Polling

Here’s a complete example of submitting a job and polling for completion:

import time
from reducto import Reducto

client = Reducto()

# Submit an async job (run_job returns immediately with a job ID)
response = client.parse.run_job(
    input="https://pdfobject.com/pdf/sample.pdf",
)

job_id = response.job_id
print(f"Submitted job: {job_id}")

# Poll for completion
while True:
    job = client.job.get(job_id=job_id)
    
    if job.status == "completed":
        print("Job completed!")
        print(f"Result: {job.result}")
        break
    elif job.status == "failed":
        print(f"Job failed: {job.error}")
        break
    elif job.status == "cancelled":
        print("Job was cancelled")
        break
    else:
        print(f"Job status: {job.status}")
        time.sleep(5)  # Wait 5 seconds before checking again
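
The loop above runs until the job reaches a final state, which may never happen if a job stalls. A variant with an upper time bound could look like the sketch below. `wait_for_job` is a hypothetical helper, and the `timeout`, `interval`, and `sleep` parameters are illustrative choices rather than SDK options.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(client, job_id, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Poll until the job reaches a terminal status or the timeout elapses.

    Raises TimeoutError if the job is still running once `timeout` seconds
    have passed. `sleep` is injectable so the loop can be tested without waiting.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = client.job.get(job_id=job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} is still {job.status} after {timeout}s")
        sleep(interval)
```

The caller then inspects `job.status` on the returned object to distinguish success from failure or cancellation.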

Job Lifecycle

Jobs progress through the following states:

Pending: Job is queued and waiting to start
Processing: Job is currently being processed
Completed: Job finished successfully
Failed: Job encountered an error
Cancelled: Job was cancelled by user request
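
The states above can be captured in a small enumeration so that status checks do not rely on scattered string literals. This is a sketch that assumes the API returns the lowercase status strings shown under Job Response. `JobStatus`, `TERMINAL`, and `is_terminal` are hypothetical names, not SDK types.

```python
from enum import Enum


class JobStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# States after which a job no longer changes
TERMINAL = {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.CANCELLED}


def is_terminal(status):
    """Return True if a status string names a final state.

    Raises ValueError for a status that is not one of the five listed states.
    """
    return JobStatus(status) in TERMINAL
```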

Best Practices

Use Webhooks for Production

Instead of polling for job completion, configure webhooks to receive notifications when jobs complete. See Webhooks for details.

Implement Exponential Backoff

When polling for job status, use exponential backoff to reduce API calls:

import time

delay = 1
max_delay = 60

while True:
    job = client.job.get(job_id=job_id)
    if job.status in ["completed", "failed", "cancelled"]:
        break
    time.sleep(delay)
    delay = min(delay * 2, max_delay)
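
When many workers poll at once, adding random jitter to each delay helps keep them from hitting the API in lockstep. The generator below is one possible sketch of backoff with optional jitter. It is not part of the SDK, and every parameter name is an illustrative assumption.

```python
import random


def backoff_delays(initial=1.0, factor=2.0, max_delay=60.0, jitter=0.0, rng=random):
    """Yield an endless series of delays that grow geometrically up to max_delay.

    With jitter > 0, each delay is scaled by a random factor in
    [1 - jitter, 1 + jitter] and then capped at max_delay.
    """
    delay = initial
    while True:
        scale = 1.0 + rng.uniform(-jitter, jitter) if jitter else 1.0
        yield min(delay * scale, max_delay)
        delay = min(delay * factor, max_delay)
```

In a polling loop, you would create the generator once and call `time.sleep(next(delays))` after each non-terminal status check.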

Clean Up Old Jobs

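Periodically review old completed jobs and process or archive their results so that your job list stays manageable. In this snippet, is_old is a helper you define yourself, not part of the SDK: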

response = client.job.get_all(limit=500)
for job in response.jobs:
    if job.status == "completed" and is_old(job.completed_at):
        # Process or archive the result
        pass
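
The snippet above relies on an `is_old` helper that the SDK does not provide. One possible implementation is sketched below. It assumes `completed_at` arrives either as a `datetime` or as an ISO 8601 string, which should be checked against the actual response type. The 30-day threshold is an arbitrary example value.

```python
from datetime import datetime, timedelta, timezone


def is_old(completed_at, max_age=timedelta(days=30), now=None):
    """Return True if completed_at is more than max_age in the past.

    Accepts a datetime or an ISO 8601 string; naive values are treated as UTC.
    """
    if completed_at is None:
        return False
    if isinstance(completed_at, str):
        # Python < 3.11 does not parse a trailing "Z", so normalise it first
        completed_at = datetime.fromisoformat(completed_at.replace("Z", "+00:00"))
    if completed_at.tzinfo is None:
        completed_at = completed_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - completed_at > max_age
```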

Handle Pagination

When retrieving all jobs, always handle pagination to ensure you process all results:

cursor = None
all_jobs = []

while True:
    response = client.job.get_all(cursor=cursor, limit=500)
    all_jobs.extend(response.jobs)
    
    if not response.next_cursor:
        break
    cursor = response.next_cursor
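
If the full list does not need to be held in memory, the same loop can be written as a generator that requests further pages only when the caller asks for more jobs. This is a sketch built on the `get_all` parameters documented above. `iter_jobs` is a hypothetical helper name.

```python
def iter_jobs(client, page_size=500, exclude_configs=False):
    """Yield jobs one at a time, requesting further pages only as needed."""
    cursor = None
    while True:
        kwargs = {"limit": page_size, "exclude_configs": exclude_configs}
        if cursor is not None:
            # Only pass a cursor after the first page
            kwargs["cursor"] = cursor
        response = client.job.get_all(**kwargs)
        yield from response.jobs
        if not response.next_cursor:
            return
        cursor = response.next_cursor
```

For example, `for job in iter_jobs(client, exclude_configs=True): ...` processes each job as it arrives, and stopping the loop early avoids fetching the remaining pages.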

Related

Webhooks - Configure webhooks for job completion notifications
Async Support - Learn about async patterns in the SDK
Error Handling - Handle job failures gracefully
