🌟 Introduction
YomiToku is a Document AI engine specialized in Japanese document image analysis. It provides full OCR (optical character recognition) and layout analysis capabilities, enabling the recognition, extraction, and conversion of text and diagrams from images.
- 🤖 Equipped with four AI models trained on Japanese datasets: text detection, text recognition, layout analysis, and table structure recognition. All models are independently trained and optimized for Japanese documents, delivering high-precision inference.
- 🇯🇵 Each model is specifically trained for Japanese document images, supporting the recognition of over 7,000 Japanese characters, including vertical text and other layout structures unique to Japanese documents. (It also supports English documents.)
- 📈 By leveraging layout analysis, table structure parsing, and reading order estimation, it extracts information while preserving the semantic structure of the document layout.
- 📄 Supports a variety of output formats, including HTML, Markdown, JSON, and CSV. It can also extract diagrams and images contained in the documents.
- ⚡ Operates efficiently in GPU environments, enabling fast document transcription and analysis. It requires less than 8 GB of VRAM, eliminating the need for high-end GPUs.
🙋 Contact
If you have any questions, please contact us at support@mlism.com.
Index
Basic Usage
- Installation: Installation instructions
- FAQ: Frequently asked questions
CLI Usage
- Document Analyzer: How to use the CLI
- Extractor: How to use the Extractor
- Schema Generation Prompt: Schema generation prompt
Python API
- Document Analyzer Python API: How to use the DocumentAnalyzer API
- Table Semantic Parser Python API: How to use the TableSemanticParser
- Module Output: Output schema definitions for each module
- Model Config: Model configuration settings
Code Reference
Inputs
- load_image: How to load image files
- load_pdf: How to load PDF files
Modules
Outputs
Utilities
- create_searchable_pdf: Create searchable PDF files
- table_to_csv: Convert table data to CSV
Error Codes
- Error Codes: List of error codes
- Error Codes List: Detailed error codes
Sample Code
- Use Rotate Detection: How to use the rotation detection module
- Table Extraction: Extract table data (TableSemanticParser)
- Searchable PDF: Create searchable PDF files
- Get Query Count Information: Retrieve processed page count information
Server
- Overview: How to use the REST API server
Operations
- Monitoring: Logging and monitoring
- Release Note: Release notes