Building OCR, document parsing, and PDF-to-Markdown workflows for search, RAG, and knowledge systems.
I work on practical document AI pipelines:
- OCR and document parsing for scanned and born-digital PDFs
- PDF-to-Markdown conversion with layout, table, formula, and reading-order recovery
- benchmark tooling for comparing document parsing models under unified scoring
- computer vision projects that produce usable outputs, not just demos
- Structure-preserving PDF parsing for downstream LLM workflows
- Benchmarking document parsing models on OmniDocBench and MDPBench
- Building local, reproducible pipelines for Markdown-first document recovery
Benchmark and deployment toolkit for document parsing models, with unified outputs, official-rule evaluation, and reproducible lite/full benchmark workflows.
Tech: Python, PaddleOCR, Benchmarking, Markdown, OmniDocBench, MDPBench
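To illustrate what "unified scoring" can look like, here is a minimal sketch that scores several models' Markdown outputs against one reference with a single metric. It uses normalized edit distance, a common text metric in document parsing evaluation. The official OmniDocBench and MDPBench rules are not reproduced here, and the function names (`edit_distance`, `normalized_edit_distance`, `score_models`) are hypothetical, not taken from the toolkit.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, ref: str) -> float:
    """Edit distance divided by the longer length; 0.0 means identical."""
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 0.0
    return edit_distance(pred, ref) / longest


def score_models(outputs: dict[str, str], reference: str) -> dict[str, float]:
    """Apply the same metric to every model's output (illustrative only)."""
    return {name: normalized_edit_distance(text, reference)
            for name, text in outputs.items()}
```

Because every model goes through the same function, the scores are directly comparable; lower is better under this metric.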
A PP-StructureV3-based PDF-to-Markdown project focused on preserving titles, tables, formulas, images, and reading order from complex documents.
Tech: Python, PPStructureV3, OCR, PDF, Markdown
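As a rough illustration of reading-order recovery, the sketch below sorts layout blocks top to bottom, orders blocks on the same row left to right, and emits Markdown. This is a simplified single-column heuristic, not the method PP-StructureV3 uses. The `Block` type, the `row_tolerance` parameter, and both function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str   # hypothetical labels, e.g. "title" or "text"
    text: str
    x: float    # left edge of the block
    y: float    # top edge of the block


def reading_order(blocks: list[Block], row_tolerance: float = 10.0) -> list[Block]:
    """Group blocks into rows by vertical position, then read each row left to right."""
    rows: list[list[Block]] = []
    for block in sorted(blocks, key=lambda b: b.y):
        # A block joins the current row if it starts close to that row's first block.
        if rows and abs(block.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(block)
        else:
            rows.append([block])
    return [b for row in rows for b in sorted(row, key=lambda b: b.x)]


def to_markdown(blocks: list[Block]) -> str:
    """Render ordered blocks as Markdown; titles become level-1 headings."""
    parts = []
    for block in reading_order(blocks):
        parts.append(f"# {block.text}" if block.kind == "title" else block.text)
    return "\n\n".join(parts)
```

Multi-column pages, tables, and formulas need more than a coordinate sort, which is why a layout-aware parser is used in practice.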
A practical computer vision project for garbage recognition and classification based on YOLOv5.
Tech: Python, YOLOv5, OpenCV, Computer Vision
Across these projects, I prioritize:
- outputs that can be used directly in RAG and knowledge workflows
- reproducible local deployment, not just benchmark screenshots
- structured results instead of plain text dumps
- engineering tradeoffs: quality, speed, deployment cost, and controllability
If you are working on OCR, document parsing, PDF processing, or Markdown-oriented document recovery, feel free to connect on GitHub.
