OCR Engine
Turqoa's OCR engine is a purpose-built optical character recognition system optimized for port and terminal environments. It handles three distinct recognition tasks (license plates, container codes, and seal numbers), each with a specialized model trained on millions of real-world port images.
Capabilities
License Plate Recognition
The plate OCR model recognizes national and regional license plate formats from over 120 countries. It handles:
- Standard passenger and commercial vehicle plates
- Multi-line plate formats (common in European and Middle Eastern regions)
- Temporary and dealer plates
- Dirty, damaged, or partially obscured plates
- Plates captured under infrared illumination
Container Code Recognition
The container code model is trained specifically on ISO 6346 container markings. It extracts:
- Owner code — 3 letters identifying the container operator
- Equipment category — 1 letter (U, J, or Z)
- Serial number — 6 digits
- Check digit — 1 digit, validated automatically against ISO 6346
The model also reads:
- Container size and type codes
- Maximum gross weight markings
- Tare weight markings
- CSC plate data (when visible)
Seal Recognition
The seal OCR model reads alphanumeric codes from high-security bolt seals, cable seals, and e-seals:
- ISO 17712 compliant seal formats
- Carrier-specific seal numbering
- QR and barcode seals (via separate barcode decoder)
Supported Formats
Container Codes (ISO 6346)
| Component | Format | Example |
|---|---|---|
| Owner code | 3 uppercase letters | MSC |
| Category | 1 letter (U/J/Z) | U |
| Serial number | 6 digits | 123456 |
| Check digit | 1 digit | 6 |
| Full code | Combined | MSCU1234566 |
Note: The engine automatically validates the check digit using the ISO 6346 algorithm. Mismatched check digits are flagged and the transaction confidence is reduced.
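The check-digit rule mentioned in the note can be sketched as follows. This is a minimal illustration of the public ISO 6346 calculation, not Turqoa's internal implementation; the function names `compute_check_digit` and `parse_container_code` are hypothetical, and the sample code `CSQU3054383` is a commonly cited valid identifier rather than one taken from this page.

```python
import re


def _letter_values():
    """Build the ISO 6346 letter table: A=10 upward, skipping multiples of 11."""
    values, n = {}, 10
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if n % 11 == 0:  # 11, 22 and 33 are skipped by the standard
            n += 1
        values[ch] = n
        n += 1
    return values


LETTER_VALUES = _letter_values()

# Owner code (3 letters), category (U/J/Z), serial (6 digits), check digit.
CODE_PATTERN = re.compile(r"^([A-Z]{3})([UJZ])(\d{6})(\d)$")


def compute_check_digit(first_ten: str) -> int:
    """Compute the check digit for the first ten characters of a code."""
    total = 0
    for position, ch in enumerate(first_ten):
        value = LETTER_VALUES[ch] if ch.isalpha() else int(ch)
        total += value * (2 ** position)  # weights 1, 2, 4, ..., 512
    return total % 11 % 10  # a remainder of 10 maps to 0


def parse_container_code(code: str):
    """Split a code into its components and validate the check digit.

    Returns None when the text does not match the ISO 6346 layout.
    """
    match = CODE_PATTERN.match(code.replace(" ", "").upper())
    if match is None:
        return None
    owner, category, serial, check = match.groups()
    expected = compute_check_digit(owner + category + serial)
    return {
        "owner": owner,
        "category": category,
        "serial": serial,
        "check_digit": int(check),
        "check_digit_valid": expected == int(check),
    }


print(parse_container_code("CSQU3054383"))
```

A read whose `check_digit_valid` is false is the kind of result the note describes as flagged; how much the transaction confidence is reduced is not specified here.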
Regional Plate Formats
Turqoa ships with recognition profiles for all major port regions:
| Region | Formats | Notes |
|---|---|---|
| North America | US state plates, Canadian provincial, Mexican federal | Including temporary tags |
| Europe | EU standard, UK post-Brexit, Turkish | Multi-line support |
| Middle East | UAE, Saudi, Oman, Bahrain, Qatar, Kuwait | Arabic + Latin dual-script plates |
| East Asia | Chinese provincial, Japanese, Korean | CJK character support |
| Southeast Asia | Malaysian, Singaporean, Thai, Indonesian | Variable formats |
| Africa | South African, Egyptian, Moroccan, Nigerian | Regional variants |
Configuration
# ~/.turqoa/sites/my-terminal/ocr.yaml
ocr:
  container:
    model: turqoa-container-v4
    confidence_threshold: 0.90
    check_digit_validation: true
    retry_on_low_confidence: true
    max_retries: 2
  plate:
    model: turqoa-plate-v3
    confidence_threshold: 0.85
    regions:
      - middle_east
      - europe
    multi_line: true
    infrared_mode: auto
  seal:
    model: turqoa-seal-v2
    confidence_threshold: 0.80
    barcode_fallback: true
    formats:
      - iso_17712
      - carrier_specific
Confidence Thresholds
Confidence thresholds control when OCR results are considered reliable enough for automated processing. Setting these correctly is critical for balancing automation rate against accuracy.
| Threshold | Effect |
|---|---|
| Too high (> 0.98) | Many transactions routed to manual review; low automation rate |
| Recommended (0.90–0.95) | Good balance of automation and accuracy |
| Too low (< 0.80) | Higher automation rate but increased risk of misreads |
Tune thresholds using historical data:
turqoa ocr analyze --site "my-terminal" --period 7d
This produces a report showing accuracy at different threshold levels, allowing data-driven tuning.
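The report format of `turqoa ocr analyze` is not shown here, but the underlying trade-off can be sketched in a few lines. The example below assumes a hypothetical list of historical reads, each a `(confidence, was_correct)` pair, and computes the automation rate and the accuracy of accepted reads at several candidate thresholds. The function name and sample data are illustrative only.

```python
def sweep_thresholds(reads, thresholds):
    """For each threshold, return (threshold, automation_rate, accuracy).

    reads: iterable of (confidence, was_correct) pairs from past transactions.
    automation_rate: share of reads at or above the threshold.
    accuracy: share of those accepted reads that were correct (None if none accepted).
    """
    reads = list(reads)
    rows = []
    for threshold in thresholds:
        accepted = [ok for conf, ok in reads if conf >= threshold]
        automation_rate = len(accepted) / len(reads) if reads else 0.0
        accuracy = sum(accepted) / len(accepted) if accepted else None
        rows.append((threshold, automation_rate, accuracy))
    return rows


# Hypothetical history: (confidence, was the read correct?)
sample = [
    (0.99, True), (0.97, True), (0.96, True), (0.93, True),
    (0.91, False), (0.88, True), (0.82, False), (0.79, False),
]

for threshold, rate, accuracy in sweep_thresholds(sample, [0.80, 0.90, 0.95]):
    shown = "n/a" if accuracy is None else f"{accuracy:.2f}"
    print(f"threshold={threshold:.2f} automation={rate:.2f} accuracy={shown}")
```

Raising the threshold lowers the automation rate and raises the accuracy of what is automated, which is the balance the table above describes.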
Multi-Language Support
The OCR engine natively supports multiple scripts:
- Latin — English, Spanish, Portuguese, French, German, Turkish
- Arabic — Standard Arabic, Farsi numerals
- CJK — Simplified Chinese, Traditional Chinese, Japanese, Korean
- Cyrillic — Russian, Ukrainian
- Devanagari — Hindi
For plates with dual-script formats (common in the Middle East), the engine reads both scripts and cross-validates:
ocr:
  plate:
    dual_script:
      enabled: true
      primary: arabic
      secondary: latin
      cross_validate: true
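How the engine cross-validates the two scripts is not documented here. As a rough sketch of the idea, the example below compares only the numeric part of the two reads, converting Arabic-Indic and Persian digits to ASCII with Python's standard `unicodedata` module. The function names are hypothetical, and letter-to-letter mapping, which varies by country, is deliberately left out.

```python
import unicodedata


def digits_of(text: str) -> str:
    """Return the decimal digits found in text as ASCII, in logical order."""
    out = []
    for ch in text:
        value = unicodedata.digit(ch, None)  # None for non-digit characters
        if value is not None:
            out.append(str(value))
    return "".join(out)


def cross_validate(primary_read: str, secondary_read: str) -> bool:
    """True when both reads contain the same non-empty digit sequence."""
    primary_digits = digits_of(primary_read)
    secondary_digits = digits_of(secondary_read)
    return bool(primary_digits) and primary_digits == secondary_digits


# "\u0661\u0662\u0663\u0664" is 1234 written with Arabic-Indic digits.
arabic_read = "\u0661\u0662\u0663\u0664"
latin_read = "1234 ABC"
print(cross_validate(arabic_read, latin_read))
```

A mismatch between the two digit sequences would be a natural signal to lower confidence or trigger a re-capture, though the actual behavior depends on the engine.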
Performance Tuning
Image Quality
OCR accuracy is directly tied to image quality. Key factors:
- Resolution — Minimum 150 pixels across the plate or container code
- Exposure — Avoid overexposed or underexposed captures
- Focus — The character region must be in sharp focus
- Angle — Maximum 30-degree skew from perpendicular
Processing Speed
| Model | Average Latency | GPU Memory |
|---|---|---|
| Container OCR v4 | 45 ms | 1.2 GB |
| Plate OCR v3 | 30 ms | 0.8 GB |
| Seal OCR v2 | 55 ms | 1.0 GB |
To optimize throughput on multi-lane installations:
ocr:
  processing:
    batch_size: 4              # Process multiple images per GPU call
    worker_threads: 2          # Parallel processing threads per model
    gpu_memory_fraction: 0.8   # Reserve 80% of GPU for OCR
    preload_models: true       # Load models at startup, not on first request
Warning: Setting gpu_memory_fraction too high may starve the damage detection model of GPU resources. Ensure total GPU memory allocation across all models does not exceed 90% of available VRAM.
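The 90% rule in the warning is simple arithmetic and can be checked before deployment. The sketch below assumes each GPU consumer is assigned a fraction of VRAM; the `damage_detection` share is a hypothetical value, since its configuration is not covered in this section.

```python
def check_gpu_budget(fractions, limit=0.90):
    """Return (total_fraction, within_limit) for a mapping of consumer -> VRAM fraction."""
    total = sum(fractions.values())
    # Small tolerance so that sums such as 0.8 + 0.1 are not rejected by float rounding.
    return total, total <= limit + 1e-9


# ocr matches gpu_memory_fraction above; damage_detection is an assumed value.
allocations = {"ocr": 0.8, "damage_detection": 0.15}

total, within_limit = check_gpu_budget(allocations)
print(f"total={total:.2f} within_limit={within_limit}")
```

With these example values the total is 0.95, so either the OCR fraction or the damage detection share would need to come down to stay under the limit.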
Retry Logic
When a read falls below the confidence threshold, the engine can request a re-capture:
ocr:
  retry:
    enabled: true
    max_attempts: 3
    delay_ms: 500
    escalate_on_failure: true   # Route to manual review after all retries fail
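The control flow implied by these settings can be sketched as a small loop. This is an illustration under stated assumptions, not the engine's code: `capture_and_read` is a hypothetical callable that triggers a capture and returns `(text, confidence)`, and `max_attempts` is taken to mean the total number of captures, including the first.

```python
import time


def read_with_retry(capture_and_read, threshold, max_attempts=3,
                    delay_ms=500, escalate_on_failure=True, sleep=time.sleep):
    """Re-capture until a read meets the threshold or attempts run out.

    Returns a dict with the accepted read, or the best failed read and a
    status of "manual_review" (escalated) or "rejected".
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    best = None
    for attempt in range(1, max_attempts + 1):
        text, confidence = capture_and_read()
        if best is None or confidence > best[1]:
            best = (text, confidence)  # keep the strongest read seen so far
        if confidence >= threshold:
            return {"text": text, "confidence": confidence,
                    "attempts": attempt, "status": "accepted"}
        if attempt < max_attempts:
            sleep(delay_ms / 1000)  # wait before requesting a re-capture

    status = "manual_review" if escalate_on_failure else "rejected"
    return {"text": best[0], "confidence": best[1],
            "attempts": max_attempts, "status": status}


# Simulated captures: the third read clears a 0.90 threshold.
simulated = iter([("MSCU12345", 0.62), ("MSCU123456", 0.81), ("CSQU3054383", 0.96)])
result = read_with_retry(lambda: next(simulated), threshold=0.90,
                         sleep=lambda seconds: None)
print(result)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; in production the default `time.sleep` would apply the configured `delay_ms`.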