Textract

CloudMock emulates Amazon Textract. It supports synchronous and asynchronous text detection and document analysis, asynchronous expense analysis, and resource tagging.

| Operation | Status | Notes |
|---|---|---|
| DetectDocumentText | Supported | Synchronous text detection |
| AnalyzeDocument | Supported | Synchronous document analysis |
| StartDocumentTextDetection | Supported | Starts async text detection |
| GetDocumentTextDetection | Supported | Returns async text detection results |
| StartDocumentAnalysis | Supported | Starts async document analysis |
| GetDocumentAnalysis | Supported | Returns async analysis results |
| StartExpenseAnalysis | Supported | Starts async expense analysis |
| GetExpenseAnalysis | Supported | Returns async expense results |
| TagResource | Supported | Adds tags to a resource |
| UntagResource | Supported | Removes tags from a resource |
| ListTagsForResource | Supported | Lists tags for a resource |
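
The table lists two synchronous operations, DetectDocumentText and AnalyzeDocument. A minimal Python sketch of calling them through boto3 might look like the following. The helper names `detect_text_sync` and `analyze_document_sync` are hypothetical, and it is an assumption that CloudMock accepts inline `Bytes` input the way the real Textract API does. `client` is any boto3 Textract client pointed at the CloudMock endpoint.

```python
from typing import Any, Dict, List


def detect_text_sync(client: Any, document_bytes: bytes) -> List[Dict[str, Any]]:
    """Call DetectDocumentText with inline bytes and return the Blocks list."""
    response = client.detect_document_text(Document={"Bytes": document_bytes})
    # Default to an empty list in case the response carries no Blocks key.
    return response.get("Blocks", [])


def analyze_document_sync(client: Any, document_bytes: bytes) -> List[Dict[str, Any]]:
    """Call AnalyzeDocument for tables and forms and return the Blocks list."""
    response = client.analyze_document(
        Document={"Bytes": document_bytes},
        FeatureTypes=["TABLES", "FORMS"],  # required by the Textract API
    )
    return response.get("Blocks", [])


# Usage (assumes boto3 is installed and CloudMock listens on localhost:4566):
#   import boto3
#   client = boto3.client("textract", endpoint_url="http://localhost:4566",
#                         region_name="us-east-1",
#                         aws_access_key_id="test", aws_secret_access_key="test")
#   with open("page.png", "rb") as f:   # "page.png" is a placeholder file name
#       blocks = detect_text_sync(client, f.read())
```

Passing the client in as a parameter keeps the helpers independent of how the endpoint is configured, so the same code can target CloudMock or AWS.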
// AWS SDK for JavaScript v3
import { TextractClient, StartDocumentTextDetectionCommand, GetDocumentTextDetectionCommand } from '@aws-sdk/client-textract';

// Point the client at the local CloudMock endpoint; the credentials are placeholders.
const client = new TextractClient({
  endpoint: 'http://localhost:4566',
  region: 'us-east-1',
  credentials: { accessKeyId: 'test', secretAccessKey: 'test' },
});

// Start an asynchronous text detection job for a document in S3.
const { JobId } = await client.send(new StartDocumentTextDetectionCommand({
  DocumentLocation: { S3Object: { Bucket: 'my-bucket', Name: 'document.pdf' } },
}));

// Fetch the job result by its JobId and print the status.
const result = await client.send(new GetDocumentTextDetectionCommand({ JobId }));
console.log(result.JobStatus);
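
import boto3

# Point the client at the local CloudMock endpoint; the credentials are placeholders.
client = boto3.client('textract',
    endpoint_url='http://localhost:4566',
    region_name='us-east-1',
    aws_access_key_id='test',
    aws_secret_access_key='test')

# Start an asynchronous text detection job for a document in S3.
response = client.start_document_text_detection(
    DocumentLocation={'S3Object': {'Bucket': 'my-bucket', 'Name': 'document.pdf'}})

# Fetch the job result by its JobId and print the status.
result = client.get_document_text_detection(JobId=response['JobId'])
print(result['JobStatus'])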
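
The examples above read the job status once, which is enough here because CloudMock completes async jobs immediately. Code that should also run against real Textract usually polls until the job leaves `IN_PROGRESS`. A hedged sketch of that loop follows; `wait_for_text_detection` and its `delay` and `max_attempts` parameters are illustrative names, not part of boto3 or CloudMock.

```python
import time
from typing import Any, Dict


def wait_for_text_detection(
    client: Any,
    job_id: str,
    delay: float = 1.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Poll GetDocumentTextDetection until the job is no longer IN_PROGRESS."""
    for _ in range(max_attempts):
        result = client.get_document_text_detection(JobId=job_id)
        # Any status other than IN_PROGRESS is terminal
        # (SUCCEEDED, FAILED, or PARTIAL_SUCCESS).
        if result["JobStatus"] != "IN_PROGRESS":
            return result
        time.sleep(delay)
    raise TimeoutError(f"Job {job_id} still IN_PROGRESS after {max_attempts} attempts")
```

Against CloudMock the first call should already return a finished job, so the loop exits without sleeping.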
cloudmock.yml
services:
  textract:
    enabled: true
  • No actual OCR or document analysis is performed
  • Responses contain stub or empty block data
  • Async jobs complete immediately
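
Because responses carry stub or empty block data, code that reads `Blocks` should tolerate a missing or empty list. A small sketch, assuming responses follow the Textract shape (entries in `Blocks` with `BlockType` and `Text` fields); `extract_lines` is a hypothetical helper, not a CloudMock or boto3 function.

```python
from typing import Any, Dict, List


def extract_lines(response: Dict[str, Any]) -> List[str]:
    """Return the text of every LINE block, or an empty list if there are none."""
    lines = []
    # A stubbed response may omit Blocks entirely, so fall back to an empty list.
    for block in response.get("Blocks", []):
        if block.get("BlockType") == "LINE" and "Text" in block:
            lines.append(block["Text"])
    return lines
```

Tests written against CloudMock can then assert on the call flow and job status rather than on recognized text, since no OCR is performed.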