OCR
Bases: BaseJob
A class for performing Optical Character Recognition (OCR) on images.
This class integrates text detection and text recognition components to extract text from images. It supports customization through configurations and allows asynchronous processing of images.
Arguments:
| Name | Type | Description | Default |
|---|---|---|---|
| `configs` | `dict` | A dictionary of configurations to override the default settings. | `{}` |
| `device` | `str` | The device to use for computation, e.g., "cuda" or "cpu". Defaults to "cuda". | `'cuda'` |
| `visualize` | `bool` | Whether to enable visualization during OCR processing. Defaults to False. | `False` |
| `license_key` | `str` | The license key for using specific features or services. Defaults to None. | `None` |
| `secret_key` | `str` | The secret key for authentication with external services. Defaults to None. | `None` |
| `device_token` | `str` | The device token for authentication with external services. Defaults to None. | `None` |
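The `configs` argument overrides default settings. As a minimal sketch of what such an override can look like, the helper below merges a user dictionary into nested defaults. The function `merge_configs` and the keys under `DEFAULTS` are hypothetical and chosen only for illustration; this section does not list the real configuration keys, and the library's own merge logic may differ.

```python
from copy import deepcopy
from typing import Any


def merge_configs(defaults: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `defaults` with `overrides` applied recursively."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing wholesale.
            merged[key] = merge_configs(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace or extend the defaults.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults; the real configuration keys are not shown in this section.
DEFAULTS = {
    "detector": {"device": "cuda", "visualize": False},
    "recognizer": {"device": "cuda", "visualize": False},
}

# Override a single nested value and leave everything else untouched.
configs = {"detector": {"device": "cpu"}}
effective = merge_configs(DEFAULTS, configs)
```

Merging recursively means a caller only has to name the settings that change; sibling keys keep their default values, and `DEFAULTS` itself is never mutated.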
Attributes:
| Name | Type | Description |
|---|---|---|
| `detector` | `TextDetector` | An instance of the text detection module used to detect text regions in images. |
| `recognizer` | `TextRecognizer` | An instance of the text recognition module used to recognize text content from detected regions. |
Source code in src/yomitoku/ocr.py
__call__(img)
Perform OCR on the given image.
This method is a synchronous wrapper for the run method, allowing direct
invocation of the OCR process.
Arguments:
| Name | Type | Description | Default |
|---|---|---|---|
| `img` | `ndarray` | The input image in BGR format (as loaded by OpenCV). | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `tuple` | `tuple[OCRSchema, ndarray \| None]` | A tuple containing an `OCRSchema` result and an `ndarray`, or `None` in its place. |
Source code in src/yomitoku/ocr.py
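The description of `__call__` as a synchronous wrapper for `run` can be illustrated with a small stand-in class. This is a sketch of the general pattern only: `SketchOCR` is hypothetical, its result dictionary is a placeholder, and the actual implementation in src/yomitoku/ocr.py may drive the coroutine differently.

```python
import asyncio
from typing import Any, Optional


class SketchOCR:
    """Hypothetical stand-in used to show the wrapper pattern, not the real class."""

    def __init__(self, visualize: bool = False) -> None:
        self.visualize = visualize

    async def run(self, img: Any) -> tuple[dict[str, Any], Optional[Any]]:
        # Placeholder for text detection followed by text recognition.
        await asyncio.sleep(0)
        result = {"words": []}
        # Return a visualization only when it was requested, otherwise None.
        vis = img if self.visualize else None
        return result, vis

    def __call__(self, img: Any) -> tuple[dict[str, Any], Optional[Any]]:
        # Run the coroutine to completion so callers do not need an event loop.
        return asyncio.run(self.run(img))


# Synchronous call site: no `await` required.
result, vis = SketchOCR()("fake-image")
```

Note that `asyncio.run` raises `RuntimeError` when called from inside an already running event loop, so code that is itself asynchronous would `await` the `run` coroutine directly instead of going through a wrapper like this one.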
run(img) async
Perform OCR on the given image asynchronously.
This method detects text regions in the image and recognizes the text content from those regions. It also supports visualization of the OCR process.
Arguments:
| Name | Type | Description | Default |
|---|---|---|---|
| `img` | `ndarray` | The input image in BGR format. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `tuple` | `tuple[dict[str, Any], ndarray]` | A tuple containing a `dict[str, Any]` result and an `ndarray`. |
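Because `run` is a coroutine, several images can in principle be awaited together. The sketch below shows that pattern with `asyncio.gather`, using a hypothetical `fake_run` in place of the real method. Whether a single OCR instance can safely process images concurrently is not stated in this section, so treat this as an illustration of the calling convention rather than a performance recommendation.

```python
import asyncio
from typing import Any


async def fake_run(img: str) -> tuple[dict[str, Any], None]:
    # Hypothetical stand-in for `await ocr.run(img)`; returns (result, visualization).
    await asyncio.sleep(0.01)
    return {"source": img}, None


async def process_all(images: list[str]) -> list[dict[str, Any]]:
    # gather returns results in the same order as the awaitables passed in.
    pairs = await asyncio.gather(*(fake_run(img) for img in images))
    # Keep only the result half of each (result, visualization) tuple.
    return [result for result, _vis in pairs]


results = asyncio.run(process_all(["page1", "page2", "page3"]))
```

Unpacking each returned tuple at the call site keeps the result and the optional visualization separate, which mirrors the two-element return value documented above.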