Official PyTorch Implementation of "ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement"
Hao Liu¹, Ruping Cao¹, Kun Wang¹, Zhiran Li¹, Fan Liu², Yupeng Hu¹, Liqiang Nie³
¹ Shandong University, ² Southeast University, ³ Harbin Institute of Technology (Shenzhen)
This repository is the official implementation of ChartLens, our champion solution for DataMFM Challenge Track 2: Chart Understanding at CVPR 2026.
Given a chart image, the challenge requires two complementary outputs:
- chart2csv_predictions.jsonl: structured CSV tables recovered from chart images.
- chart2summary_predictions.jsonl: faithful natural-language summaries grounded in chart text and numeric facts.
ChartLens treats chart understanding as a verification-guided correction problem instead of a one-shot generation task. The final system combines Granite-Vision-4.1-4B LoRA adaptation, Structure-Aware CSV Verification and Correction (SAVC), OCR-assisted text retention checking, and Text-Retention-Guided Summary Refinement (TRSR).
This repo includes:
- training code for ChartNet-based LoRA adaptation
- inference code for Granite Vision and LoRA checkpoints
- CSV verification and correction scripts
- OCR-based summary retention evaluation and repair scripts
- final prediction files for the DataMFM chart understanding track

News:
- [2026/06] Released README, code, and final prediction files.

Highlights:
- Dual-branch chart understanding framework. ChartLens separately optimizes structured data recovery and factual summary refinement.
- SAVC for chart-to-CSV correction. The CSV branch verifies structure, completeness, and numerical consistency before editing unreliable table entries.
- TRSR for faithful summaries. The summary branch uses OCR-extracted chart text to detect low-retention summaries and repair missing titles, legends, annotations, sources, and numeric evidence.
- Strong challenge performance. ChartLens ranks first in DataMFM Challenge Track 2, with the highest Overall score (69.10) among the submitted solutions listed in the results table.
Our method consists of three main stages:
- Initial output construction. Granite-Vision-4.1-4B is adapted with LoRA on ChartNet samples to strengthen chart-to-CSV extraction. The LoRA-generated CSV and base summary are used as initial predictions.
- Structure-Aware CSV Verification and Correction (SAVC). SAVC checks the candidate CSV against the chart image along three axes: structural consistency, content completeness, and numerical accuracy. It preserves reliable entries and repairs headers, categories, legends, and values only when necessary.
- Text-Retention-Guided Summary Refinement (TRSR). TRSR extracts chart text with PaddleOCR, computes text retention, and refines only low-coverage summaries using chart image evidence, OCR references, and the original summary.
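
The retention check in the TRSR stage can be illustrated with a minimal sketch. It assumes retention is the fraction of OCR tokens that reappear in the summary, matched case-insensitively; the actual definition in code/ocr.py may differ. The function names `tokenize`, `text_retention`, and `needs_refinement` are hypothetical, and the default of 0.8 mirrors the `--threshold 0.8` value used in the OCR command in this README.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric tokens.

    Decimal numbers such as 12.5 are kept as a single token.
    """
    return set(re.findall(r"[a-z0-9]+(?:\.[0-9]+)?", text.lower()))


def text_retention(ocr_texts: list[str], summary: str) -> float:
    """Fraction of distinct OCR tokens that also occur in the summary.

    Assumption: token-level overlap. The repository may use another measure.
    """
    ocr_tokens: set[str] = set()
    for text in ocr_texts:
        ocr_tokens |= tokenize(text)
    if not ocr_tokens:
        # Nothing was recognized on the chart, so nothing can be missing.
        return 1.0
    return len(ocr_tokens & tokenize(summary)) / len(ocr_tokens)


def needs_refinement(ocr_texts: list[str], summary: str, threshold: float = 0.8) -> bool:
    """Flag a summary for repair when its retention falls below the threshold."""
    return text_retention(ocr_texts, summary) < threshold


if __name__ == "__main__":
    ocr = ["Sales 2020", "Revenue"]
    summary = "Sales rose in 2020."
    print(text_retention(ocr, summary), needs_refinement(ocr, summary))
```

Only summaries flagged this way would be passed on for repair, which matches the "refines only low-coverage summaries" behavior described above.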
| Team / Method | CSV Numeric F1 | CSV Structural Score | Summary ROUGE-L | Summary Numeric Fact F1 | Overall |
|---|---|---|---|---|---|
| hskl18 | 47.29 | 37.77 | 20.17 | 51.71 | 39.23 |
| SFD | 65.37 | 68.55 | 14.51 | 36.26 | 46.17 |
| HHHHHHHHHH | 67.66 | 66.81 | 35.65 | 70.76 | 60.22 |
| Wind Rain Tower | 76.22 | 73.52 | 29.47 | 62.27 | 60.37 |
| ytttttt | 73.43 | 71.57 | 45.69 | 73.48 | 66.04 |
| anmspro | 75.26 | 74.65 | 44.81 | 73.73 | 67.11 |
| acceed | 76.28 | 73.63 | 45.23 | 73.69 | 67.21 |
| durgasandeep | 76.03 | 74.78 | 45.30 | 73.95 | 67.52 |
| Zhiheng | 76.55 | 74.82 | 45.23 | 73.69 | 67.57 |
| ChartLens (Ours) | 80.62 | 75.66 | 45.57 | 74.55 | 69.10 |
| Setting | CSV Numeric F1 | CSV Structural Score | Summary ROUGE-L | Summary Numeric Fact F1 | Overall |
|---|---|---|---|---|---|
| Direct output | 79.13 | 75.94 | 44.96 | 73.19 | 68.30 |
| Correction without OCR | 80.62 | 75.66 | 45.09 | 72.23 | 68.40 |
| Correction with OCR | 80.62 | 75.66 | 45.57 | 74.55 | 69.10 |
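
The Overall column in both tables is consistent with the unweighted mean of the four metric columns. This is an observation from the reported numbers, not a scoring rule stated in this README, so treat the sketch below as a sanity check under that assumption. The `overall` helper is a hypothetical name.

```python
from statistics import mean

# Ablation rows copied from the table above:
# (CSV Numeric F1, CSV Structural Score, Summary ROUGE-L, Summary Numeric Fact F1)
ABLATION = {
    "Direct output": (79.13, 75.94, 44.96, 73.19),
    "Correction without OCR": (80.62, 75.66, 45.09, 72.23),
    "Correction with OCR": (80.62, 75.66, 45.57, 74.55),
}

# Overall values as reported in the same table.
REPORTED = {
    "Direct output": 68.30,
    "Correction without OCR": 68.40,
    "Correction with OCR": 69.10,
}


def overall(scores: tuple[float, ...]) -> float:
    """Unweighted mean of the four metrics (assumed aggregation)."""
    return mean(scores)


if __name__ == "__main__":
    for setting, scores in ABLATION.items():
        print(f"{setting}: computed {overall(scores):.3f}, reported {REPORTED[setting]:.2f}")
```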
Final prediction files are provided under result/.
git clone https://github.com/iLearnLab/CVPRW26-ChartLens.git
cd CVPRW26-ChartLens

conda create -n chartlens python=3.10 -y
conda activate chartlens
pip install -r requirements.txt

GPU execution is expected for Granite inference and LoRA training. The inference and training scripts expose --gpu_id and set CUDA_VISIBLE_DEVICES before importing torch.
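
The ordering matters because CUDA reads CUDA_VISIBLE_DEVICES when it initializes, so setting the variable before torch is imported is the safe pattern. The sketch below shows that pattern under assumptions: `select_gpu` and `main` are hypothetical names, and the repository scripts may parse their arguments differently.

```python
import argparse
import os


def select_gpu(argv=None) -> str:
    """Parse --gpu_id and export it as CUDA_VISIBLE_DEVICES.

    Unknown arguments are ignored so that the full argument parser can
    run later, after the heavy imports.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--gpu_id", default="0")
    args, _unknown = parser.parse_known_args(argv)
    os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu_id)
    return str(args.gpu_id)


def main(argv=None) -> None:
    gpu_id = select_gpu(argv)
    # CUDA-aware libraries are imported only after the variable is set.
    import torch

    print(f"requested GPU {gpu_id}; visible devices: {torch.cuda.device_count()}")
```

Calling `main(["--gpu_id", "1"])` restricts the process to physical GPU 1, which torch then exposes as device index 0.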
ChartLens/
├── code/
│   ├── build_chartnet_sft.py           # Build ChartNet SFT data
│   ├── calibrate_baseline_with_ai.py   # SAVC CSV verification and correction
│   ├── infer_chartnet_granite.py       # Base Granite Vision inference
│   ├── infer_granite_with_lora.py      # Granite Vision + LoRA inference
│   ├── load_chartnet_500.py            # Load compact ChartNet supervision set
│   ├── ocr.py                          # OCR text retention evaluation
│   ├── repair_summary.py               # TRSR summary repair
│   └── train_lora_chartnet.py          # LoRA training
├── result/
│   ├── real/
│   │   ├── chart2csv_predictions.jsonl
│   │   └── chart2summary_predictions.jsonl
│   └── synthetic/
│       ├── chart2csv_predictions.jsonl
│       └── chart2summary_predictions.jsonl
├── CVPR_DataMFM_ChartLens.pdf
├── method.png
├── requirements.txt
└── README.md

Checkpoints are available via Hugging Face and Google Drive.
We use the datasets and splits released by the DataMFM Challenge. The chart understanding track uses two splits, real and synthetic, and requires JSONL predictions for chart-to-CSV and chart-to-summary.
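
Before submitting, chart-to-CSV prediction files can be sanity-checked with a short script. The sketch below is not part of the repository: `check_csv_line` and `check_file` are hypothetical helpers, and the keys `imagename` and `predicted_csv` are taken from the example prediction line shown in the submission section of this README.

```python
import csv
import io
import json


def check_csv_line(line: str) -> list[str]:
    """Return the problems found in one chart2csv prediction line."""
    problems = []
    record = json.loads(line)
    for key in ("imagename", "predicted_csv"):
        if key not in record:
            problems.append(f"missing key: {key}")
    # Parse the embedded CSV text and compare the width of every row.
    rows = list(csv.reader(io.StringIO(record.get("predicted_csv", ""))))
    if not rows:
        problems.append("empty CSV")
    elif len({len(row) for row in rows}) != 1:
        problems.append("rows have inconsistent column counts")
    return problems


def check_file(path: str) -> dict[int, list[str]]:
    """Map 1-based line numbers to problems for every faulty line in a JSONL file."""
    report = {}
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if line.strip():
                problems = check_csv_line(line)
                if problems:
                    report[number] = problems
    return report
```

An empty report from `check_file` means every line parsed as JSON, carried both keys, and contained a rectangular table. It says nothing about whether the values are correct.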
To prepare ChartNet SFT data for LoRA training:
python code/load_chartnet_500.py \
--out_dir Fine-tuning/Dataset/raw \
--num_samples 500
python code/build_chartnet_sft.py \
--gt_path Fine-tuning/Dataset/raw/gt.jsonl \
--image_dir Fine-tuning/Dataset/raw/images \
--out_dir Fine-tuning/Dataset/sft \
--csv_repeat 2 \
--summary_repeat 1

If you already have ChartNet ground-truth files and chart images prepared, start from code/build_chartnet_sft.py and point --gt_path and --image_dir to your local paths.
python code/infer_granite_with_lora.py \
--image_root /path/to/data \
--out_root /path/to/output \
--model_path /path/to/granite-vision-4.1-4b \
--lora_path Fine-tuning/FT/model/granite_chartnet_lora_bs2 \
--gpu_id 0 \
--splits real synthetic

Use code/infer_chartnet_granite.py for base Granite Vision inference without a LoRA adapter.
export OPENAI_API_KEY="..."
python code/calibrate_baseline_with_ai.py \
--split all \
--baseline_root /path/to/baseline_predictions \
--image_root /path/to/data \
--output_root /path/to/savc_output \
--base_url "https://your-openai-compatible-endpoint" \
--model gemini-3.5-flash \
--threshold 85

--baseline_root should contain split directories such as real/ and synthetic/, each with chart2csv_predictions.jsonl and chart2summary_predictions.jsonl.
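
A quick layout check can catch a misplaced file before an API-backed correction run starts. This is a minimal sketch that assumes the directory layout described above; `missing_baseline_files` is a hypothetical helper and not part of the repository.

```python
from pathlib import Path

# File names expected inside every split directory, as described above.
EXPECTED_FILES = ("chart2csv_predictions.jsonl", "chart2summary_predictions.jsonl")


def missing_baseline_files(baseline_root, splits=("real", "synthetic")) -> list[Path]:
    """List the expected prediction files that are absent under baseline_root."""
    root = Path(baseline_root)
    return [
        root / split / name
        for split in splits
        for name in EXPECTED_FILES
        if not (root / split / name).is_file()
    ]


if __name__ == "__main__":
    for path in missing_baseline_files("/path/to/baseline_predictions"):
        print(f"missing: {path}")
```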
python code/ocr.py \
--real_images /path/to/data/real/images \
--synthetic_images /path/to/data/synthetic/images \
--real_summary /path/to/baseline/real/chart2summary_predictions.jsonl \
--synthetic_summary /path/to/baseline/synthetic/chart2summary_predictions.jsonl \
--output_dir /path/to/ocr_text_copy_coverage \
--threshold 0.8
export AIGCBEST_API_KEY="..."
python code/repair_summary.py \
--split all \
--workers 20 \
--ocr_eval_root /path/to/ocr_text_copy_coverage \
--output_root /path/to/trsr_output

Train the LoRA adapter on the prepared ChartNet SFT data:
python code/train_lora_chartnet.py \
--model_path /path/to/granite-vision-4.1-4b \
--train_jsonl Fine-tuning/Dataset/sft/train.jsonl \
--val_jsonl Fine-tuning/Dataset/sft/val.jsonl \
--output_dir Fine-tuning/FT/model/granite_chartnet_lora_bs2 \
--gpu_id 0 \
--epochs 2 \
--batch_size 1 \
--grad_accum 8

For DataMFM Track 2, organize the final predictions as:
submission.zip
├── real/
│   ├── chart2csv_predictions.jsonl
│   └── chart2summary_predictions.jsonl
└── synthetic/
    ├── chart2csv_predictions.jsonl
    └── chart2summary_predictions.jsonl

Each CSV prediction line:
{"imagename": "example.png", "predicted_csv": "Header A,Header B\nA,1\nB,2"}

Each summary prediction line:
{"imagename": "example.png", "predicted_summary": "One paragraph summary grounded in the chart."}

If you find this project useful for your research, please consider citing:
@article{liu2026chartlens,
title={ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement},
author={Liu, Hao and Cao, Ruping and Wang, Kun and Li, Zhiran and Liu, Fan and Hu, Yupeng and Nie, Liqiang},
journal={arXiv preprint arXiv:2606.10640},
year={2026}
}

This project is released under the Apache 2.0 License.
If you have any questions, feel free to contact:
- Hao Liu: liuh90210@gmail.com
- Ruping Cao: caoruping657@gmail.com
