
Amazon Textract
Amazon Textract is a document OCR and structured-extraction service. It recognizes forms, tables, and signatures, going well beyond the plain text that traditional OCR returns.
About this API
Textract's key advantage over traditional OCR is structured output. Plain OCR turns images into plain text; Textract additionally tells you which row/column of which table the text sits in, which key maps to which value, and which lines form a paragraph. That makes invoice, form, and contract analysis tractable without elaborate post-processing.
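As a rough illustration of the key-to-value mapping described above, the sketch below pairs keys with values in a FORMS-style response. It assumes the usual layout of Textract form output: KEY_VALUE_SET blocks tagged KEY or VALUE, with the key pointing at its value through a VALUE relationship and both pointing at WORD blocks through CHILD relationships. The function names and the sample response are made up for illustration; real responses carry many more fields.

```python
def block_text(block, blocks_by_id):
    """Join the text of the WORD blocks that are CHILD-linked to a block."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_key_values(response):
    """Return {key text: value text} for every KEY block in the response."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    pairs = {}
    for block in response["Blocks"]:
        if block["BlockType"] != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        value_parts = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    value_parts.append(block_text(blocks_by_id[value_id], blocks_by_id))
        pairs[block_text(block, blocks_by_id)] = " ".join(value_parts)
    return pairs


# Hand-written sample shaped like a FORMS response (illustrative only).
SAMPLE_RESPONSE = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                           {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Invoice"},
        {"Id": "w2", "BlockType": "WORD", "Text": "No:"},
        {"Id": "w3", "BlockType": "WORD", "Text": "12345"},
    ]
}

print(extract_key_values(SAMPLE_RESPONSE))
```

Checkbox-style values (selection elements) are ignored here to keep the sketch short.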
It comes in several modes. DetectDocumentText is plain OCR. AnalyzeDocument adds FORMS (key/value), TABLES, and SIGNATURES. AnalyzeExpense handles invoices and receipts. AnalyzeID handles IDs, passports, and driver's licenses. DetectDocumentText and AnalyzeDocument return a block tree, and developers walk the tree to reassemble business structure (e.g. stitch table cells into a 2D array). AnalyzeExpense and AnalyzeID instead return field-oriented results (expense summary fields and line items, identity document fields).
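Stitching table cells into a 2D array might look like the sketch below. It assumes TABLE blocks link to CELL blocks through CHILD relationships, and that each CELL carries 1-based RowIndex and ColumnIndex values plus CHILD links to its WORD blocks. Merged cells and row/column spans are not handled, and the helper names and sample data are illustrative.

```python
def cell_text(cell, blocks_by_id):
    """Concatenate the WORD children of a CELL block."""
    words = []
    for rel in cell.get("Relationships", []):
        if rel["Type"] == "CHILD":
            for word_id in rel["Ids"]:
                word = blocks_by_id[word_id]
                if word["BlockType"] == "WORD":
                    words.append(word["Text"])
    return " ".join(words)


def tables_to_grids(response):
    """Return one list-of-rows grid per TABLE block in the response."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    grids = []
    for block in response["Blocks"]:
        if block["BlockType"] != "TABLE":
            continue
        cells = [
            blocks_by_id[cell_id]
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for cell_id in rel["Ids"]
            if blocks_by_id[cell_id]["BlockType"] == "CELL"
        ]
        if not cells:
            grids.append([])
            continue
        n_rows = max(c["RowIndex"] for c in cells)
        n_cols = max(c["ColumnIndex"] for c in cells)
        grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
        for c in cells:
            # Textract indices are 1-based; Python lists are 0-based.
            grid[c["RowIndex"] - 1][c["ColumnIndex"] - 1] = cell_text(c, blocks_by_id)
        grids.append(grid)
    return grids


# Hand-written 2x2 table shaped like a TABLES response (illustrative only).
SAMPLE_TABLE_RESPONSE = {
    "Blocks": [
        {"Id": "t1", "BlockType": "TABLE",
         "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2", "c3", "c4"]}]},
        {"Id": "c1", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1,
         "Relationships": [{"Type": "CHILD", "Ids": ["w1"]}]},
        {"Id": "c2", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 2,
         "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
        {"Id": "c3", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 1,
         "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
        {"Id": "c4", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 2,
         "Relationships": [{"Type": "CHILD", "Ids": ["w4"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Item"},
        {"Id": "w2", "BlockType": "WORD", "Text": "Price"},
        {"Id": "w3", "BlockType": "WORD", "Text": "Pen"},
        {"Id": "w4", "BlockType": "WORD", "Text": "1.20"},
    ]
}

for row in tables_to_grids(SAMPLE_TABLE_RESPONSE)[0]:
    print(row)
```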
Asian-language support including Chinese is limited; Textract serves English-document workflows best. For high-volume Chinese document processing, evaluate Chinese-cloud OCR services in parallel.
What you can build
- Automate invoice and receipt entry
- Extract structured data from PDF tables
- Bulk-parse resumes
- Pull info from KYC documents
Strengths & limitations
Strengths
- Tables preserved with row/column structure
- Forms parsed as key/value pairs
- Dedicated Invoices / Receipts analysis modes
Limitations
- Handwriting accuracy lags printed text
- Limited support for Asian languages including Chinese
- Per-page pricing — high-volume document processing is expensive
Getting started
Call Textract.analyzeDocument (sync, small docs) or startDocumentAnalysis (async, large PDFs). Results come back as a list of blocks (PAGE, LINE, WORD, KEY_VALUE_SET, TABLE, CELL, and others) that reference each other by Id, so together they form a tree.
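A minimal sketch of the synchronous call using the Python SDK (boto3). The `analyze_document` call with `Document={"Bytes": ...}` and `FeatureTypes` is the SDK's own method; the wrapper functions, the default region, and any file name you pass in are assumptions for illustration. The client is passed in as a parameter so the helper can be exercised with a stub and no AWS credentials.

```python
from collections import Counter


def make_client(region_name="us-east-1"):
    """Create a Textract client. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported lazily so the rest of the sketch runs without it
    return boto3.client("textract", region_name=region_name)


def analyze_image_bytes(client, image_bytes, features=("FORMS", "TABLES")):
    """Run synchronous AnalyzeDocument on raw image bytes and return the Blocks list."""
    response = client.analyze_document(
        Document={"Bytes": image_bytes},
        FeatureTypes=list(features),
    )
    return response["Blocks"]


def count_block_types(blocks):
    """Summarize a response by counting blocks per BlockType."""
    return Counter(block["BlockType"] for block in blocks)
```

With credentials in place, something like `count_block_types(analyze_image_bytes(make_client(), open(path, "rb").read()))` gives a quick overview of what Textract found on a page, where `path` points at a local image of your choosing.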
FAQ
How is it priced?
Per page. Plain OCR runs roughly $1.50 per 1,000 pages; FORMS/TABLES mode runs roughly $15 to $50 per 1,000 pages. Specialized modes such as AnalyzeExpense cost more than plain OCR.
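To get a feel for what per-page pricing means at volume, here is a back-of-the-envelope estimate. The rates are the approximate figures quoted in this FAQ, not current AWS list prices, and the dictionary keys and function name are invented for this sketch.

```python
# Approximate USD per 1,000 pages, taken from the FAQ answer above.
# These are assumptions for estimation only; check current AWS pricing.
APPROX_USD_PER_1000_PAGES = {
    "plain_ocr": 1.50,
    "analyze_low": 15.00,   # lower end quoted for FORMS/TABLES
    "analyze_high": 50.00,  # upper end quoted for FORMS/TABLES
}


def estimate_cost_usd(pages, mode):
    """Estimate cost as pages / 1000 * rate for the given mode."""
    return pages / 1000 * APPROX_USD_PER_1000_PAGES[mode]


for mode in APPROX_USD_PER_1000_PAGES:
    print(mode, estimate_cost_usd(20_000, mode))
```

For 20,000 pages this works out to about $30 for plain OCR and between about $300 and $1,000 for FORMS/TABLES at the quoted rates.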
Does it work for Chinese?
Limited. Printed Chinese is sometimes recognized but accuracy drops for complex layouts and tables compared with English.
How do I handle large PDFs?
Start an async job with startDocumentAnalysis and have it publish completion to an SNS topic, then fetch the output with GetDocumentAnalysis. Results paginate: keep passing NextToken until the response no longer includes one.
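The async flow could be sketched as below with a boto3-style client. `start_document_analysis` and `get_document_analysis` are the SDK methods; the wrapper names, bucket, and key are placeholders. For brevity this sketch polls for completion instead of waiting on the SNS notification, which is a simplification rather than the recommended production setup.

```python
import time


def start_analysis(client, bucket, key, features=("TABLES", "FORMS"),
                   sns_topic_arn=None, role_arn=None):
    """Start an async analysis job for a document stored in S3 and return its JobId."""
    kwargs = {
        "DocumentLocation": {"S3Object": {"Bucket": bucket, "Name": key}},
        "FeatureTypes": list(features),
    }
    if sns_topic_arn and role_arn:
        # Optional: have Textract publish job completion to an SNS topic.
        kwargs["NotificationChannel"] = {"SNSTopicArn": sns_topic_arn, "RoleArn": role_arn}
    return client.start_document_analysis(**kwargs)["JobId"]


def fetch_all_blocks(client, job_id, poll_seconds=5.0, sleep=time.sleep):
    """Wait for the job to finish, then follow NextToken to collect every block."""
    while True:
        page = client.get_document_analysis(JobId=job_id)
        if page["JobStatus"] != "IN_PROGRESS":
            break
        sleep(poll_seconds)
    if page["JobStatus"] != "SUCCEEDED":
        # Treats FAILED and PARTIAL_SUCCESS alike; adjust if partial output is acceptable.
        raise RuntimeError(f"Textract job {job_id} ended with status {page['JobStatus']}")

    blocks = list(page.get("Blocks", []))
    token = page.get("NextToken")
    while token:
        page = client.get_document_analysis(JobId=job_id, NextToken=token)
        blocks.extend(page.get("Blocks", []))
        token = page.get("NextToken")
    return blocks
```

Passing the client and the `sleep` function as parameters keeps the helpers testable with a stub and no real AWS calls.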
Technical details
- Auth type: api_key
- Pricing: paid
- Protocols: REST
- SDKs: python, javascript, java, go, ruby, csharp
- Response time: 42 ms
- Last health check: 6/26/2026, 6:22:16 AM
Endpoints
Parsed from the OpenAPI spec. Showing 12 of 13 non-deprecated endpoints.
- /#X-Amz-Target=Textract.AnalyzeDocument
- /#X-Amz-Target=Textract.AnalyzeExpense
- /#X-Amz-Target=Textract.AnalyzeID
- /#X-Amz-Target=Textract.DetectDocumentText
- /#X-Amz-Target=Textract.GetDocumentAnalysis
- /#X-Amz-Target=Textract.GetDocumentTextDetection
- /#X-Amz-Target=Textract.GetExpenseAnalysis
- /#X-Amz-Target=Textract.GetLendingAnalysis
- /#X-Amz-Target=Textract.GetLendingAnalysisSummary
- /#X-Amz-Target=Textract.StartDocumentAnalysis
- /#X-Amz-Target=Textract.StartDocumentTextDetection
- /#X-Amz-Target=Textract.StartExpenseAnalysis
1 more endpoint not shown. See the OpenAPI spec for the full list.
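Every operation listed here shares one URL and is selected by the X-Amz-Target header. The sketch below shows roughly what an unsigned request looks like, assuming the regional endpoint pattern `textract.<region>.amazonaws.com` and the AWS JSON 1.1 content type. The function name is hypothetical, and a real request also needs a Signature Version 4 Authorization header, which the SDKs add for you.

```python
import json


def build_unsigned_request(operation, payload, region="us-east-1"):
    """Describe the HTTP request for a Textract operation, without SigV4 signing."""
    return {
        "method": "POST",
        "url": f"https://textract.{region}.amazonaws.com/",
        "headers": {
            "Content-Type": "application/x-amz-json-1.1",
            # The operation is chosen by this header, not by the URL path.
            "X-Amz-Target": f"Textract.{operation}",
        },
        "body": json.dumps(payload),
    }


request = build_unsigned_request(
    "DetectDocumentText",
    {"Document": {"S3Object": {"Bucket": "example-bucket", "Name": "scan.png"}}},
)
print(request["headers"]["X-Amz-Target"])
```

The bucket and object names in the usage lines are placeholders.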