PrayerQX

Building OCR, document parsing, and PDF-to-Markdown workflows for search, RAG, and knowledge systems.

About

I work on practical document AI pipelines:

OCR and document parsing for scanned and born-digital PDFs
PDF to Markdown conversion with layout, table, formula, and reading-order recovery
benchmark tooling for comparing document parsing models under unified scoring
computer vision projects that aim for usable outputs, not just demos

Current Focus

Structure-preserving PDF parsing for downstream LLM workflows
Benchmarking document parsing models on OmniDocBench and MDPBench
Building local, reproducible pipelines for Markdown-first document recovery

Featured Projects

doc-parsing-benchmark

Benchmark and deployment toolkit for document parsing models, with unified outputs, official-rule evaluation, and reproducible lite/full benchmark workflows.

Tech: Python, PaddleOCR, Benchmarking, Markdown, OmniDocBench, MDPBench
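
To illustrate what scoring several models under one rule can look like, here is a minimal sketch that assumes normalized edit distance against a reference Markdown string as the metric. The function names (`edit_distance`, `normalized_edit_distance`, `score_models`) are hypothetical and are not taken from the repository; the actual toolkit may use different metrics and interfaces.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with two rolling rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner loop over the shorter string
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: str, ref: str) -> float:
    """Edit distance divided by the longer length; 0.0 means identical."""
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 0.0
    return edit_distance(pred, ref) / longest


def score_models(outputs: dict[str, str], reference: str) -> dict[str, float]:
    """Apply the same metric to every model's output (hypothetical helper)."""
    return {name: normalized_edit_distance(text, reference)
            for name, text in outputs.items()}
```

Calling `score_models({"model_a": md_a, "model_b": md_b}, reference_md)` would return one score per model, where lower is better. Keeping the metric in a single function is what makes the comparison uniform across models.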

PPStructureV3-PDF-to-Markdown

A PP-StructureV3-based PDF to Markdown project focused on preserving titles, tables, formulas, images, and reading order from complex documents.

Tech: Python, PPStructureV3, OCR, PDF, Markdown

yolov5-garbage-classification

A practical computer vision project for garbage recognition and classification based on YOLOv5.

Tech: Python, YOLOv5, OpenCV, Computer Vision

What I Care About

outputs that can be used directly in RAG and knowledge workflows
reproducible local deployment, not just benchmark screenshots
structured results instead of plain text dumps
engineering tradeoffs: quality, speed, deployment cost, and controllability

Contact

If you are working on OCR, document parsing, PDF processing, or Markdown-oriented document recovery, feel free to connect on GitHub.
