Process documents with layout parser
Layout parser extracts document content elements like text, tables, and lists, and creates context-aware chunks that facilitate information retrieval in generative AI and discovery applications.
Layout parser features
Parse document layouts. You can input HTML or PDF files to layout parser to identify content elements such as text blocks, tables, and lists, and structural elements such as titles and headings. These elements define the organization and hierarchy of a document and provide additional context for information retrieval and discovery.
Chunk documents. Layout parser can break documents up into chunks that retain contextual information about the layout hierarchy of the original document. Answer-generating LLMs can use chunks to improve relevance and decrease computational load.
Taking a document's layout into account during chunking improves semantic coherence and reduces noise in the content when it's used for retrieval and LLM generation. All text in a chunk comes from the same layout entity, such as a heading, subheading, or list.
Gemini layout parser. Preview. The Gemini layout parser gives better layout quality on table recognition, reading order, and text recognition of PDF files. You can enable the feature by default by selecting layout parser processor version pretrained-layout-parser-v1.4-2025-08-25, pretrained-layout-parser-v1.5-2025-08-25, or pretrained-layout-parser-v1.5-pro-2025-08-25 for your processor.
Parse images and tables as annotations. Preview. Layout parser can identify if there are images or tables in parsed documents. When found, they're annotated as a descriptive block of text with the information depicted in the image and table.
Limitations
The following limitations apply:
- Online processing:
  - Input file size maximum of 20 MB for all file types
  - Maximum of 15 pages per PDF file
- Batch processing:
  - Maximum single file size of 1 GB for PDF files
  - Maximum of 500 pages per PDF file
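The limits above can be checked before a request is sent. The following is a minimal sketch, not part of the Document AI API: the names `Limits`, `LIMITS`, and `check_limits` are hypothetical, and it assumes that MB and GB are binary units (1 MB = 1024 × 1024 bytes), which this document does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional

MB = 1024 * 1024  # Assumption: binary megabytes; the document does not say.
GB = 1024 * MB


@dataclass(frozen=True)
class Limits:
    max_bytes: int
    max_pdf_pages: int
    size_limit_pdf_only: bool  # The batch size limit is stated for PDF files only.


# Values copied from the limitations list above.
LIMITS = {
    "online": Limits(max_bytes=20 * MB, max_pdf_pages=15, size_limit_pdf_only=False),
    "batch": Limits(max_bytes=1 * GB, max_pdf_pages=500, size_limit_pdf_only=True),
}


def check_limits(
    mode: str,
    mime_type: str,
    size_bytes: int,
    pdf_pages: Optional[int] = None,
) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    limits = LIMITS[mode]
    is_pdf = mime_type == "application/pdf"
    problems: List[str] = []

    # The size limit applies to every file type online, and to PDFs in batch.
    if (is_pdf or not limits.size_limit_pdf_only) and size_bytes > limits.max_bytes:
        problems.append(
            f"{mode}: file is {size_bytes} bytes, limit is {limits.max_bytes}"
        )

    # Page limits are stated for PDF files only.
    if is_pdf and pdf_pages is not None and pdf_pages > limits.max_pdf_pages:
        problems.append(
            f"{mode}: PDF has {pdf_pages} pages, limit is {limits.max_pdf_pages}"
        )
    return problems
```

A caller could fall back to batch processing whenever `check_limits("online", ...)` returns a non-empty list.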
Layout detection per file type
The following table lists the elements that layout parser can detect per document file type.
| File type | MIME type | Detected elements | Limitations |
|---|---|---|---|
| HTML | text/html | paragraph, table, list, title, heading, page header, page footer | Parsing relies heavily on HTML tags, so CSS-based formatting might not be captured. |
| PDF | application/pdf | paragraph, table, title, heading, page header, page footer | Tables spanning multiple pages might be split into two tables. |
| DOCX | application/vnd.openxmlformats-officedocument.wordprocessingml.document | paragraph, tables across multiple pages, list, title, heading elements | Nested tables are not supported. |
| PPTX | application/vnd.openxmlformats-officedocument.presentationml.presentation | paragraph, table, list, title, heading elements | For headings to be identified accurately, they should be marked as such within the PowerPoint file. Nested tables and hidden slides are not supported. |
| XLSX | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | tables within Excel spreadsheets, supporting INT, FLOAT, and STRING values | Multiple table detection is not supported. Hidden sheets, rows, or columns might also impact detection. Files with up to 5 million cells can be processed. |
| XLSM | application/vnd.ms-excel.sheet.macroenabled.12 | spreadsheet with macro enabled, supporting INT, FLOAT, and STRING values | Multiple table detection is not supported. Hidden sheets, rows, or columns might also impact detection. |
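When preparing requests, the MIME type usually has to be derived from the file name. The following is a minimal sketch that maps file extensions to the MIME types in the table above. The mapping name `LAYOUT_PARSER_MIME_TYPES` and the helper `mime_type_for` are hypothetical, and the choice of extensions is an assumption based on the file type column.

```python
from pathlib import Path

# MIME types copied from the table above; the extensions are assumed from the
# "File type" column.
LAYOUT_PARSER_MIME_TYPES = {
    ".html": "text/html",
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".xlsm": "application/vnd.ms-excel.sheet.macroenabled.12",
}


def mime_type_for(filename: str) -> str:
    """Return the MIME type for a file name, or raise ValueError if unlisted."""
    suffix = Path(filename).suffix.lower()
    try:
        return LAYOUT_PARSER_MIME_TYPES[suffix]
    except KeyError:
        raise ValueError(f"File type {suffix!r} is not listed for layout parser") from None
```

Raising on unknown extensions, rather than guessing, keeps unlisted file types from reaching the processor.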
Processor versions
The following models are available for layout parser. To change model versions, see Manage processor versions.
To make a quota increase request (QIR) for the default processor quota, follow the steps in Manage your quota.
| Model version | Description | Release channel | Release date |
|---|---|---|---|
| pretrained-layout-parser-v1.0-2024-06-03 | General availability version for document layout analysis. This is the default pretrained processor version. | Stable | June 3, 2024 |
| pretrained-layout-parser-v1.5-2025-08-25 | Preview version powered by Gemini 2.5 Flash LLM for better layout analysis on PDF files. Recommended for those who want to experiment with new versions. If it's used for non-PDF files, it has the same behavior as the stable pretrained-layout-parser-v1.0-2024-06-03. | Release Candidate | August 25, 2025 |
| pretrained-layout-parser-v1.5-pro-2025-08-25 | Preview version powered by Gemini 2.5 Pro LLM for better layout analysis on PDF files. v1.5-pro has higher latency than v1.5. If it's used for non-PDF files, it has the same behavior as the stable v1.0. | Release Candidate | August 25, 2025 |
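A request can also target one of the versions above directly instead of the processor's default version. The following is a minimal sketch that builds the process URL. The helper name `process_url` is hypothetical, and the `/processorVersions/VERSION:process` path segment does not appear in this document; it is an assumption based on the Document AI REST API and should be checked against the API reference.

```python
from typing import Optional


def process_url(
    project_id: str,
    location: str,
    processor_id: str,
    version: Optional[str] = None,
) -> str:
    """Build the process endpoint URL, optionally pinned to a processor version."""
    url = (
        f"https://{location}-documentai.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}/processors/{processor_id}"
    )
    if version:
        # Assumption: a version is addressed as a sub-resource of the processor.
        url += f"/processorVersions/{version}"
    return url + ":process"
```

Without a `version` argument, the result matches the URL shown in the REST instructions in this document.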
Before you begin
To turn on layout parser, follow these steps:
Create a layout parser by following the instructions in Creating and managing processors. The processor type name is LAYOUT_PARSER_PROCESSOR.
Enable layout parser by following the instructions in Enable a processor.
Send an online process request with layout parser
Input documents to layout parser to parse and chunk.
Follow the instructions for batch processing requests in Send a processing request.
Configure fields in ProcessOptions.layoutConfig in ProcessDocumentRequest.
REST
Before using any of the request data, make the following replacements:
- LOCATION: your processor's location, for example:
  - us - United States
  - eu - European Union
- PROJECT_ID: Your Google Cloud project ID.
- PROCESSOR_ID: the ID of your custom processor.
- MIME_TYPE: Layout parser supports application/pdf and text/html.
- DOCUMENT: The content to be split into chunks. Layout parser accepts raw PDF or HTML documents, or parsed documents that were output by the layout parser.
- CHUNK_SIZE: Optional. The chunk size, in tokens, to use when splitting documents.
- INCLUDE_ANCESTOR_HEADINGS: Optional. Boolean. Whether or not to include ancestor headings when splitting documents.
HTTP method and URL:
POST https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process
Request JSON body:
Sample for inputting raw documents such as PDF or HTML:

{
  "rawDocument": {
    "mimeType": "MIME_TYPE",
    "content": "DOCUMENT"
  },
  "processOptions": {
    "layoutConfig": {
      "chunkingConfig": {
        "chunkSize": CHUNK_SIZE,
        "includeAncestorHeadings": INCLUDE_ANCESTOR_HEADINGS
      }
    }
  }
}

To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command. The Authorization header assumes that the gcloud CLI is installed and authenticated.

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process"

PowerShell
Save the request body in a file named request.json, and execute the following command. The Authorization header assumes that the gcloud CLI is installed and authenticated.

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
  -Method POST `
  -Headers $headers `
  -ContentType: "application/json; charset=utf-8" `
  -InFile request.json `
  -Uri "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:process" | Select-Object -Expand Content

The response includes the processed document with layout and chunking information as Document.documentLayout and Document.chunkedDocument.

Python
For more information, see the Document AI Python API reference documentation.
To authenticate to Document AI, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
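As a minimal sketch, the JSON request body described in the REST instructions can be assembled with the Python standard library alone. The helper name `build_process_request` is hypothetical and is not part of the Document AI client library. The sketch assumes that the `content` field carries the document bytes as a base64 string, as is usual for byte fields in JSON request bodies, and that `chunkSize` is sent as a number and `includeAncestorHeadings` as a boolean.

```python
import base64
from typing import Any, Dict, Optional


def build_process_request(
    content: bytes,
    mime_type: str,
    chunk_size: Optional[int] = None,
    include_ancestor_headings: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a process request body with optional layout chunking settings."""
    body: Dict[str, Any] = {
        "rawDocument": {
            "mimeType": mime_type,
            # Assumption: raw bytes are sent base64-encoded in the JSON body.
            "content": base64.b64encode(content).decode("ascii"),
        }
    }

    # Both chunking fields are optional, so include only the ones provided.
    chunking: Dict[str, Any] = {}
    if chunk_size is not None:
        chunking["chunkSize"] = chunk_size
    if include_ancestor_headings is not None:
        chunking["includeAncestorHeadings"] = include_ancestor_headings
    if chunking:
        body["processOptions"] = {"layoutConfig": {"chunkingConfig": chunking}}
    return body
```

Serializing the result with `json.dump` into request.json yields a file that can be sent with the curl or PowerShell commands in this document.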
Batch process documents with layout parser
Use the following procedure to parse and chunk multiple documents in a single request.
Input documents to layout parser to parse and chunk.
Follow the instructions for batch processing requests in Send a processing request.
Configure fields in ProcessOptions.layoutConfig when making a batchProcess request.
Input
The following example JSON configures ProcessOptions.layoutConfig.

"processOptions": {
  "layoutConfig": {
    "chunkingConfig": {
      "chunkSize": CHUNK_SIZE,
      "includeAncestorHeadings": INCLUDE_ANCESTOR_HEADINGS_BOOLEAN
    }
  }
}

Replace the following:
- CHUNK_SIZE: The maximum chunk size, in number of tokens, to use when splitting documents.
- INCLUDE_ANCESTOR_HEADINGS_BOOLEAN: Whether to include ancestor headings when splitting documents. Ancestor headings are the parents of subheadings in the original document. They can provide a chunk with additional context about its position in the original document. Up to two levels of headings can be included with a chunk.
What's next
- Review the processors list.
- Create a custom classifier.
- Use Enterprise Document OCR to detect and extract text.
- Review Send a batch process documents request to learn how to handle responses.