iLearn-Lab/CVPRW26-ChartLens

🚀 ChartLens @ CVPR 2026 DataMFM Chart Understanding Challenge

Official PyTorch Implementation of "ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement"


Hao Liu¹, Ruping Cao¹, Kun Wang¹, Zhiran Li¹, Fan Liu², Yupeng Hu¹, Liqiang Nie³

¹ Shandong University, ² Southeast University, ³ Harbin Institute of Technology (Shenzhen)


📌 Introduction

This repository is the official implementation of ChartLens, our champion solution for DataMFM Challenge Track 2: Chart Understanding at CVPR 2026.

Given a chart image, the challenge requires two complementary outputs:

  • chart2csv_predictions.jsonl: structured CSV tables recovered from chart images.
  • chart2summary_predictions.jsonl: faithful natural-language summaries grounded in chart text and numeric facts.

ChartLens treats chart understanding as a verification-guided correction problem instead of a one-shot generation task. The final system combines Granite-Vision-4.1-4B LoRA adaptation, Structure-Aware CSV Verification and Correction (SAVC), OCR-assisted text retention checking, and Text-Retention-Guided Summary Refinement (TRSR).

This repo includes:

  • training code for ChartNet-based LoRA adaptation
  • inference code for Granite Vision and LoRA checkpoints
  • CSV verification and correction scripts
  • OCR-based summary retention evaluation and repair scripts
  • final prediction files for the DataMFM chart understanding track

📰 News

  • [2026/06] Released README, code, and final prediction files.

✨ Highlights

  • Dual-branch chart understanding framework. ChartLens separately optimizes structured data recovery and factual summary refinement.
  • SAVC for chart-to-CSV correction. The CSV branch verifies structure, completeness, and numerical consistency before editing unreliable table entries.
  • TRSR for faithful summaries. The summary branch uses OCR-extracted chart text to detect low-retention summaries and repair missing titles, legends, annotations, sources, and numeric evidence.
  • Strong challenge performance. ChartLens ranks first in DataMFM Challenge Track 2 and achieves the best overall score among submitted solutions.

🧠 Method Overview

Figure: ChartLens pipeline.

Our method consists of three main stages:

  1. Initial output construction. Granite-Vision-4.1-4B is adapted with LoRA on ChartNet samples to strengthen chart-to-CSV extraction. The LoRA-generated CSV and base summary are used as initial predictions.
  2. Structure-Aware CSV Verification and Correction (SAVC). SAVC checks the candidate CSV against the chart image for structural consistency, content completeness, and numerical accuracy. It preserves reliable entries and repairs headers, categories, legends, and values only where a check fails.
  3. Text-Retention-Guided Summary Refinement (TRSR). TRSR extracts chart text with PaddleOCR, computes text retention, and refines only low-coverage summaries using chart image evidence, OCR references, and the original summary.
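The structural-consistency check in stage 2 can be illustrated with a small rule-based validator. This is a minimal sketch, not the logic in code/calibrate_baseline_with_ai.py: the function name `check_csv_structure` and its three rules (no blank header cell, uniform column count, at least one numeric cell) are assumptions chosen for illustration.

```python
import csv
import io


def check_csv_structure(predicted_csv: str) -> list[str]:
    """Return a list of structural problems found in a predicted CSV string.

    Hypothetical helper: the rules below are illustrative assumptions,
    not the checks used by the repository's SAVC script.
    """
    rows = [r for r in csv.reader(io.StringIO(predicted_csv)) if r]  # skip empty lines
    if not rows:
        return ["empty table"]

    problems = []
    header = rows[0]
    if any(not cell.strip() for cell in header):
        problems.append("blank header cell")

    # Every data row should have as many cells as the header.
    for i, row in enumerate(rows[1:], start=1):
        if len(row) != len(header):
            problems.append(f"row {i} has {len(row)} cells, expected {len(header)}")

    def is_number(cell: str) -> bool:
        """True if the cell parses as a float after dropping ',' and '%'."""
        try:
            float(cell.replace(",", "").replace("%", ""))
            return True
        except ValueError:
            return False

    # A chart table normally carries at least one numeric value.
    if not any(is_number(c) for row in rows[1:] for c in row):
        problems.append("no numeric cells")
    return problems


print(check_csv_structure("Header A,Header B\nA,1\nB,2"))      # []
print(check_csv_structure("Header A,Header B\nA,1,extra\nB"))  # two row-length problems
```

A table that passes such checks would be kept as is, while a failing one would be a candidate for correction against the chart image.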

📊 Results

Main Quantitative Results

| Team / Method | CSV Numeric F1 | CSV Structural Score | Summary ROUGE-L | Summary Numeric Fact F1 | Overall |
|---|---|---|---|---|---|
| hskl18 | 47.29 | 37.77 | 20.17 | 51.71 | 39.23 |
| SFD | 65.37 | 68.55 | 14.51 | 36.26 | 46.17 |
| HHHHHHHHHH | 67.66 | 66.81 | 35.65 | 70.76 | 60.22 |
| Wind Rain Tower | 76.22 | 73.52 | 29.47 | 62.27 | 60.37 |
| ytttttt | 73.43 | 71.57 | 45.69 | 73.48 | 66.04 |
| anmspro | 75.26 | 74.65 | 44.81 | 73.73 | 67.11 |
| acceed | 76.28 | 73.63 | 45.23 | 73.69 | 67.21 |
| durgasandeep | 76.03 | 74.78 | 45.30 | 73.95 | 67.52 |
| Zhiheng | 76.55 | 74.82 | 45.23 | 73.69 | 67.57 |
| ChartLens (Ours) | 80.62 | 75.66 | 45.57 | 74.55 | 69.10 |
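
As a quick consistency check on the table above, the Overall column agrees with the unweighted mean of the four metric columns to within rounding. This is an observation from the reported numbers, not a scoring rule stated by the challenge. The snippet recomputes it for three rows.

```python
# Values copied from the results table, in column order:
# (CSV Numeric F1, CSV Structural Score, Summary ROUGE-L,
#  Summary Numeric Fact F1, Overall)
rows = {
    "hskl18": (47.29, 37.77, 20.17, 51.71, 39.23),
    "Zhiheng": (76.55, 74.82, 45.23, 73.69, 67.57),
    "ChartLens (Ours)": (80.62, 75.66, 45.57, 74.55, 69.10),
}


def mean_of_metrics(values):
    """Unweighted mean of the per-task metrics (assumed aggregation)."""
    return sum(values) / len(values)


for name, (*metrics, overall) in rows.items():
    recomputed = mean_of_metrics(metrics)
    # Allow 0.01 of slack for rounding in the reported figures.
    matches = abs(recomputed - overall) <= 0.01
    print(f"{name}: recomputed={recomputed:.3f} reported={overall:.2f} match={matches}")
```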

Ablation Study

| Setting | CSV Numeric F1 | CSV Structural Score | Summary ROUGE-L | Summary Numeric Fact F1 | Overall |
|---|---|---|---|---|---|
| Direct output | 79.13 | 75.94 | 44.96 | 73.19 | 68.30 |
| Correction without OCR | 80.62 | 75.66 | 45.09 | 72.23 | 68.40 |
| Correction with OCR | 80.62 | 75.66 | 45.57 | 74.55 | 69.10 |

Final prediction files are provided under result/.


⚙️ Installation

1. Clone the repository

git clone https://github.com/iLearn-Lab/CVPRW26-ChartLens.git
cd CVPRW26-ChartLens

2. Create environment

conda create -n chartlens python=3.10 -y
conda activate chartlens
pip install -r requirements.txt

GPU execution is expected for Granite inference and LoRA training. The inference and training scripts expose --gpu_id and set CUDA_VISIBLE_DEVICES before importing torch.
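
The ordering matters because CUDA reads CUDA_VISIBLE_DEVICES when it is first initialized, so the variable should be in the environment before torch is imported. A minimal sketch of the pattern follows. The function name `select_gpu` is hypothetical, and the repository scripts may structure this differently.

```python
import argparse
import os


def select_gpu(argv=None) -> str:
    """Parse --gpu_id and expose only that device to CUDA.

    Hypothetical helper illustrating the pattern; not copied from the repo.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu_id", type=str, default="0")
    args, _ = parser.parse_known_args(argv)
    # Must run before the first `import torch` in this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = args.gpu_id
    return args.gpu_id


gpu = select_gpu(["--gpu_id", "0"])
print(f"CUDA_VISIBLE_DEVICES={os.environ['CUDA_VISIBLE_DEVICES']}")
# From this point on it is safe to `import torch`; inside the process the
# selected physical GPU will appear as device index 0.
```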


🗂️ Repository Structure

ChartLens/
├── code/
│   ├── build_chartnet_sft.py          # Build ChartNet SFT data
│   ├── calibrate_baseline_with_ai.py  # SAVC CSV verification and correction
│   ├── infer_chartnet_granite.py      # Base Granite Vision inference
│   ├── infer_granite_with_lora.py     # Granite Vision + LoRA inference
│   ├── load_chartnet_500.py           # Load compact ChartNet supervision set
│   ├── ocr.py                         # OCR text retention evaluation
│   ├── repair_summary.py              # TRSR summary repair
│   └── train_lora_chartnet.py         # LoRA training
├── result/
│   ├── real/
│   │   ├── chart2csv_predictions.jsonl
│   │   └── chart2summary_predictions.jsonl
│   └── synthetic/
│       ├── chart2csv_predictions.jsonl
│       └── chart2summary_predictions.jsonl
├── CVPR_DataMFM_ChartLens.pdf
├── method.png
├── requirements.txt
└── README.md

💾 Checkpoints

Checkpoints are hosted on Hugging Face and Google Drive.


📂 Data Preparation

We use the datasets and splits released by the DataMFM Challenge. The chart understanding track uses two splits, real and synthetic, and requires JSONL predictions for chart-to-CSV and chart-to-summary.

To prepare ChartNet SFT data for LoRA training:

python code/load_chartnet_500.py \
  --out_dir Fine-tuning/Dataset/raw \
  --num_samples 500

python code/build_chartnet_sft.py \
  --gt_path Fine-tuning/Dataset/raw/gt.jsonl \
  --image_dir Fine-tuning/Dataset/raw/images \
  --out_dir Fine-tuning/Dataset/sft \
  --csv_repeat 2 \
  --summary_repeat 1

If you already have ChartNet ground-truth files and chart images prepared, start from code/build_chartnet_sft.py and point --gt_path and --image_dir to your local paths.


⚑ Inference

Granite Vision + LoRA Inference

python code/infer_granite_with_lora.py \
  --image_root /path/to/data \
  --out_root /path/to/output \
  --model_path /path/to/granite-vision-4.1-4b \
  --lora_path Fine-tuning/FT/model/granite_chartnet_lora_bs2 \
  --gpu_id 0 \
  --splits real synthetic

Use code/infer_chartnet_granite.py for base Granite Vision inference without a LoRA adapter.

SAVC CSV Correction

export OPENAI_API_KEY="..."

python code/calibrate_baseline_with_ai.py \
  --split all \
  --baseline_root /path/to/baseline_predictions \
  --image_root /path/to/data \
  --output_root /path/to/savc_output \
  --base_url "https://your-openai-compatible-endpoint" \
  --model gemini-3.5-flash \
  --threshold 85

--baseline_root should contain split directories such as real/ and synthetic/, each with chart2csv_predictions.jsonl and chart2summary_predictions.jsonl.
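
Before launching the correction run, it can help to confirm that the baseline directory has this layout. The following is a minimal sketch; the helper name `missing_prediction_files` is hypothetical and not part of the repository.

```python
import tempfile
from pathlib import Path

EXPECTED_FILES = ("chart2csv_predictions.jsonl", "chart2summary_predictions.jsonl")


def missing_prediction_files(baseline_root, splits=("real", "synthetic")):
    """List expected prediction files that are absent under baseline_root."""
    root = Path(baseline_root)
    return [
        str(Path(split) / name)
        for split in splits
        for name in EXPECTED_FILES
        if not (root / split / name).is_file()
    ]


# Demo on a throwaway directory where only real/ is complete.
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_FILES:
        path = Path(tmp) / "real" / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("")
    print(missing_prediction_files(tmp))  # lists the two synthetic/ files
```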

TRSR Summary Refinement

python code/ocr.py \
  --real_images /path/to/data/real/images \
  --synthetic_images /path/to/data/synthetic/images \
  --real_summary /path/to/baseline/real/chart2summary_predictions.jsonl \
  --synthetic_summary /path/to/baseline/synthetic/chart2summary_predictions.jsonl \
  --output_dir /path/to/ocr_text_copy_coverage \
  --threshold 0.8

export AIGCBEST_API_KEY="..."

python code/repair_summary.py \
  --split all \
  --workers 20 \
  --ocr_eval_root /path/to/ocr_text_copy_coverage \
  --output_root /path/to/trsr_output
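
To make the retention idea concrete, the sketch below scores a summary by the share of OCR tokens it reuses and flags it for repair when the score falls below a threshold. The metric definition, the tokenizer, and the function names are assumptions for illustration; code/ocr.py may compute coverage differently, and the default of 0.8 simply mirrors the --threshold value in the command above.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase and split into word and number tokens, keeping decimals such as 45.6."""
    return re.findall(r"\d+(?:\.\d+)?|[a-z]+", text.lower())


def text_retention(ocr_texts: list[str], summary: str) -> float:
    """Fraction of distinct OCR tokens that also occur in the summary.

    Assumed definition for illustration only.
    """
    ocr_tokens = {tok for text in ocr_texts for tok in tokenize(text)}
    if not ocr_tokens:
        return 1.0  # nothing to retain, so nothing is missing
    summary_tokens = set(tokenize(summary))
    return len(ocr_tokens & summary_tokens) / len(ocr_tokens)


def needs_refinement(ocr_texts: list[str], summary: str, threshold: float = 0.8) -> bool:
    """Flag summaries whose retention falls below the threshold."""
    return text_retention(ocr_texts, summary) < threshold


ocr = ["Annual Revenue", "2023", "45.6"]   # hypothetical OCR strings
summary = "Revenue reached 45.6 in 2023."  # hypothetical model summary
print(text_retention(ocr, summary), needs_refinement(ocr, summary))  # 0.75 True
```

Here the summary drops the word "Annual" from the chart title, so 3 of 4 OCR tokens are retained and the summary would be sent for repair.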

🚀 Training

Train the LoRA adapter on the prepared ChartNet SFT data:

python code/train_lora_chartnet.py \
  --model_path /path/to/granite-vision-4.1-4b \
  --train_jsonl Fine-tuning/Dataset/sft/train.jsonl \
  --val_jsonl Fine-tuning/Dataset/sft/val.jsonl \
  --output_dir Fine-tuning/FT/model/granite_chartnet_lora_bs2 \
  --gpu_id 0 \
  --epochs 2 \
  --batch_size 1 \
  --grad_accum 8
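
With --batch_size 1 and --grad_accum 8, each optimizer step aggregates gradients from 8 samples, assuming a single GPU and standard gradient accumulation. The arithmetic is sketched below. The helper names and the sample count are hypothetical, and trainers differ in how they treat a final partial accumulation window.

```python
import math


def effective_batch_size(batch_size: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Samples contributing to one optimizer step under gradient accumulation."""
    return batch_size * grad_accum * num_gpus


def optimizer_steps(num_samples: int, epochs: int, batch_size: int, grad_accum: int) -> int:
    """Optimizer steps over training, counting a final partial window as one step."""
    micro_batches = math.ceil(num_samples / batch_size)
    return epochs * math.ceil(micro_batches / grad_accum)


print(effective_batch_size(batch_size=1, grad_accum=8))  # 8
# 1000 is a hypothetical training-set size, not a figure from the repository.
print(optimizer_steps(num_samples=1000, epochs=2, batch_size=1, grad_accum=8))  # 250
```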

📦 Submission Format

For DataMFM Track 2, organize the final predictions as:

submission.zip
├── real/
│   ├── chart2csv_predictions.jsonl
│   └── chart2summary_predictions.jsonl
└── synthetic/
    ├── chart2csv_predictions.jsonl
    └── chart2summary_predictions.jsonl
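
One way to produce an archive with this layout is Python's standard zipfile module. The sketch below is an illustration, not a script shipped with the repository: `build_submission` and `demo` are hypothetical names, and the demo writes empty placeholder files to a temporary directory.

```python
import tempfile
import zipfile
from pathlib import Path

SPLITS = ("real", "synthetic")
FILES = ("chart2csv_predictions.jsonl", "chart2summary_predictions.jsonl")


def build_submission(pred_root, zip_path):
    """Zip <pred_root>/<split>/<file>, keeping split folders at the archive root."""
    pred_root = Path(pred_root)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for split in SPLITS:
            for name in FILES:
                src = pred_root / split / name
                if not src.is_file():
                    raise FileNotFoundError(f"missing prediction file: {src}")
                zf.write(src, arcname=f"{split}/{name}")
    return Path(zip_path)


def demo():
    """Create empty placeholder predictions and return the archive's member names."""
    with tempfile.TemporaryDirectory() as tmp:
        for split in SPLITS:
            for name in FILES:
                path = Path(tmp) / "preds" / split / name
                path.parent.mkdir(parents=True, exist_ok=True)
                path.write_text("")
        archive = build_submission(Path(tmp) / "preds", Path(tmp) / "submission.zip")
        with zipfile.ZipFile(archive) as zf:
            return sorted(zf.namelist())


print(demo())
```

Passing an explicit `arcname` keeps the archive paths independent of where the predictions live on disk.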

Each CSV prediction line:

{"imagename": "example.png", "predicted_csv": "Header A,Header B\nA,1\nB,2"}

Each summary prediction line:

{"imagename": "example.png", "predicted_summary": "One paragraph summary grounded in the chart."}
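
Because a CSV table spans several rows, its newlines have to be escaped so that each prediction stays on one line of the JSONL file. Serializing with `json.dumps` handles this automatically, as the sketch below shows. The helper `to_jsonl_line` is a hypothetical name used only for illustration.

```python
import json


def to_jsonl_line(imagename: str, key: str, value: str) -> str:
    """Serialize one prediction as a single JSON line."""
    if key not in ("predicted_csv", "predicted_summary"):
        raise ValueError(f"unexpected key: {key}")
    # json.dumps escapes embedded newlines, so a multi-row CSV stays on one line.
    return json.dumps({"imagename": imagename, key: value}, ensure_ascii=False)


csv_text = "Header A,Header B\nA,1\nB,2"
line = to_jsonl_line("example.png", "predicted_csv", csv_text)
print(line)

# Round trip: the physical line has no raw newline, and parsing restores the table.
assert "\n" not in line
assert json.loads(line)["predicted_csv"] == csv_text
```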

📚 Citation

If you find this project useful for your research, please consider citing:

@article{liu2026chartlens,
  title={ChartLens: A Dual-Branch Framework for Chart Data Correction and Factual Summary Refinement},
  author={Liu, Hao and Cao, Ruping and Wang, Kun and Li, Zhiran and Liu, Fan and Hu, Yupeng and Nie, Liqiang},
  journal={arXiv preprint arXiv:2606.10640},
  year={2026}
}

📄 License

This project is released under the Apache 2.0 License.


📬 Contact

If you have any questions, feel free to contact:

  • Hao Liu: liuh90210@gmail.com
  • Ruping Cao: caoruping657@gmail.com
