Freedisch/yeye

yeye

Adapting Large Language Models for Emergency Dispatch in Togo Using Local Infrastructure Data

yeye is a framework for constructing geospatially grounded instruction-tuning datasets from real infrastructure data and using them to fine-tune LLMs for emergency dispatch decision support in low-resource settings.

Submitted to Deep Learning Indaba 2026 — AI for Social Impact and Sustainable Systems track.

⚠️ Research prototype only. yeye must not be deployed in any operational emergency dispatch context. See Limitations.


Overview

Understaffed emergency call centers in countries like Togo cannot guarantee a trained human operator for every incoming call. Off-the-shelf LLMs like Mistral-7B have no knowledge of Togo's health facilities, road networks, or resource constraints — this information is too sparsely documented online for models to acquire from pretraining alone.

yeye addresses this by:

  1. Extracting infrastructure data from OpenStreetMap and UN OCHA Humanitarian Data Exchange
  2. Programmatically generating 1,000 instruction-tuning pairs spanning 12 emergency types across 15 Togo cities
  3. Fine-tuning Mistral-7B via LoRA (QLoRA 4-bit) on a single T4 GPU
  4. Evaluating on a blind test set where all infrastructure context is withheld from prompts

Results

| Metric | Base Mistral-7B | yeye | Delta |
|---|---|---|---|
| Combined score | 73.8% | 79.6% | +5.8pp |
| Facility grounding | 67.5% | 88.5% | +21.0pp |
| Exact facility recall | 38.0% | 87.0% | +49.0pp |
| Win rate | 34/100 | 60/100 | McNemar p=0.007 |

The headline result: exact facility name recall improves from 38% to 87% under blind evaluation — a 49 percentage-point gain (95% CI: +37.4 to +60.6 percentage points).
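The README does not say how the confidence interval was computed. As a sanity check, the sketch below assumes an unpaired Wald interval for the difference of two proportions with n=100 per model (the blind test set size); the function name is hypothetical. Under that assumption the arithmetic reproduces the reported bounds.

```python
import math

def wald_diff_ci(p_base, p_tuned, n_base, n_tuned, z=1.96):
    """Wald interval for the difference of two independent proportions.

    Assumption: the README does not name the interval method; this is one
    candidate that happens to match the reported numbers.
    """
    diff = p_tuned - p_base
    se = math.sqrt(
        p_base * (1 - p_base) / n_base + p_tuned * (1 - p_tuned) / n_tuned
    )
    return diff - z * se, diff + z * se

# Exact facility recall: 38% (base) vs 87% (yeye), 100 test prompts each.
low, high = wald_diff_ci(0.38, 0.87, 100, 100)
print(f"{low * 100:+.1f} to {high * 100:+.1f} pp")  # +37.4 to +60.6 pp
```

Because both models answer the same 100 prompts, a paired interval would also be defensible and may give slightly different bounds.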

[Figure: Geodispatch figures]

Dataset

The infrastructure dataset contains 1,903 facilities extracted from Togo:

| Category | Count | Named |
|---|---|---|
| Hospitals and clinics | 695 | 648 (93%) |
| Fuel stations | 589 | 446 (76%) |
| Pharmacies | 382 | 340 (89%) |
| Police stations | 222 | 176 (79%) |
| Fire stations | 15 | 13 (87%) |

Data sources:

  • OpenStreetMap via Overpass API (bounding box: 6.08°N–11.15°N, 0.15°W–1.80°E)
  • UN OCHA Humanitarian Data Exchange (validated health facility dataset)
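The OSM extraction could be expressed as a single Overpass QL query over the bounding box above. This is a minimal sketch, not the notebook's actual query: the amenity tag list and the helper name `build_overpass_query` are assumptions chosen to cover the five facility categories.

```python
# Bounding box from the README, in Overpass order: south, west, north, east.
TOGO_BBOX = (6.08, -0.15, 11.15, 1.80)

# Assumed OSM amenity tags for the five facility categories (not confirmed
# by the README).
AMENITIES = ["hospital", "clinic", "fuel", "pharmacy", "police", "fire_station"]

def build_overpass_query(bbox=TOGO_BBOX, amenities=AMENITIES, timeout=180):
    """Return an Overpass QL query string for the given amenities and bbox."""
    south, west, north, east = bbox
    box = f"{south},{west},{north},{east}"
    pattern = "|".join(amenities)
    return (
        f"[out:json][timeout:{timeout}];\n"
        # nwr = nodes, ways, and relations; regex match on the amenity tag.
        f'nwr["amenity"~"^({pattern})$"]({box});\n'
        # "center" gives ways and relations a single representative point.
        "out tags center;"
    )

query = build_overpass_query()
print(query)
```

The resulting string can be sent as the `data` parameter of a POST request to an Overpass endpoint; the sketch stops short of the network call so it runs offline.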

Deduplication: OSM and HDX records of the same category that lie within 50 m of each other are merged.
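A minimal sketch of the 50 m merge rule, assuming great-circle (haversine) distance and records stored as dicts with `category`, `lat`, `lon`, and `name` keys. The record layout, the function names, and the choice to keep the OSM record as the base are all assumptions; the sample coordinates and names are made up.

```python
import math

EARTH_RADIUS_M = 6_371_000

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def merge_sources(osm, hdx, radius_m=50.0):
    """Merge HDX records into OSM records of the same category within radius_m."""
    merged = [dict(rec, sources=["osm"]) for rec in osm]
    for rec in hdx:
        # First already-kept record with the same category inside the radius.
        match = next(
            (m for m in merged
             if m["category"] == rec["category"]
             and haversine_m(m["lat"], m["lon"], rec["lat"], rec["lon"]) <= radius_m),
            None,
        )
        if match is None:
            merged.append(dict(rec, sources=["hdx"]))
        else:
            match["sources"].append("hdx")
            if not match.get("name"):  # fill a missing name from the other source
                match["name"] = rec.get("name")
    return merged

# Hypothetical records: the first HDX entry is about 22 m from the OSM entry.
osm = [{"category": "hospital", "lat": 6.1300, "lon": 1.2200, "name": None}]
hdx = [
    {"category": "hospital", "lat": 6.1302, "lon": 1.2200, "name": "Example Hospital"},
    {"category": "pharmacy", "lat": 6.1300, "lon": 1.2200, "name": "Example Pharmacy"},
    {"category": "hospital", "lat": 6.1400, "lon": 1.2200, "name": "Far Hospital"},
]
facilities = merge_sources(osm, hdx)
```

The pairwise scan is quadratic, which is acceptable for a dataset of about 1,900 facilities; a spatial index would be the natural upgrade for larger extracts.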

Pipeline

OpenStreetMap ──┐
                ├──→ Pair Generator ──→ LoRA Fine-Tuning ──→ Blind Evaluation ──→ Results
UN OCHA HDX  ───┤     1,000 pairs        r=16, α=32          Context withheld
                │     800/100/100         QLoRA 4-bit          n=100
Scenario Params ┘     12 types            T4 GPU
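The pair-generation and split stages of the diagram can be sketched as follows. This is an illustration under assumptions, not the notebook's generator: the prompt wording, the `blind` flag, the reading of 800/100/100 as train/validation/test, and the sample facility are all hypothetical.

```python
import random

def make_pair(emergency, city, facility, blind):
    """Build one instruction/response pair; blind=True withholds facility context."""
    question = f"Emergency: {emergency} reported in {city}. Which facility should respond?"
    context = "" if blind else f"\nNearby facility: {facility['name']} ({facility['category']})"
    return {
        "instruction": question + context,
        "response": f"Dispatch to {facility['name']}.",
    }

def split_pairs(pairs, sizes=(800, 100, 100), seed=0):
    """Shuffle deterministically and cut into three disjoint subsets."""
    if sum(sizes) != len(pairs):
        raise ValueError("split sizes must add up to the number of pairs")
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_train, n_val, _ = sizes
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )

# Hypothetical facility used only to exercise the functions.
facility = {"name": "Example Hospital", "category": "hospital"}
pairs = [make_pair("gas leak", "Lomé", facility, blind=False) for _ in range(1000)]
train, val, test = split_pairs(pairs)
blind_prompt = make_pair("gas leak", "Lomé", facility, blind=True)["instruction"]
```

With `blind=True` the facility name appears only in the reference response, which is what makes exact facility recall a test of what the model has stored rather than what it can copy from the prompt.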

Fine-Tuning Configuration

  • Base model: Mistral-7B-Instruct-v0.2
  • Method: LoRA with QLoRA 4-bit (NF4 quantization)
  • LoRA targets: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Rank: r=16, α=32, dropout=0.05
  • Trainable parameters: ~0.4% of full model
  • Optimizer: AdamW, lr=2×10⁻⁴, cosine decay, 5% warmup
  • Hardware: Single NVIDIA T4 GPU (Google Colab Pro)
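The settings above map onto `peft` and `transformers` roughly as shown below. This is a sketch rather than the notebook's code: `bias`, `task_type`, the float16 compute dtype (chosen because the T4 lacks bfloat16 support), and `optim="adamw_torch"` are assumptions not stated in this README. Library imports sit inside `build_configs` so the values can be inspected without a GPU environment.

```python
# Values taken from the configuration list, keyed by the library argument names.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "bias": "none",            # assumption: not stated in the README
    "task_type": "CAUSAL_LM",  # assumption: standard for decoder-only models
}

QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
}

# Keyword arguments accepted by transformers.TrainingArguments.
TRAINING_KWARGS = {
    "learning_rate": 2e-4,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.05,
    "optim": "adamw_torch",    # assumption: README only says "AdamW"
}

def build_configs():
    """Instantiate the quantization and LoRA configs (needs torch, peft, transformers)."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    quant = BitsAndBytesConfig(bnb_4bit_compute_dtype=torch.float16, **QUANT_KWARGS)
    lora = LoraConfig(**LORA_KWARGS)
    return quant, lora
```

With r=16 and α=32 the LoRA update is scaled by α/r = 2.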

Evaluation

Scoring uses three weighted components:

  • Facility grounding (50%): Exact name match from 1,903-entry dataset → 1.0; city/location token only → 0.5; no reference → 0.0
  • Distance/ETA plausibility (30%): Stated values are checked against a ±50% tolerance around ground truth
  • Rubric completeness (20%): Severity acknowledgment, dispatch action, constraint reporting
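The three components combine into a weighted sum. The sketch below shows one way to implement them; the case-insensitive substring matching, the binary pass/fail plausibility score, the equal weighting of rubric checks, and every function name are assumptions, since the README specifies only the weights and thresholds.

```python
WEIGHTS = {"grounding": 0.5, "plausibility": 0.3, "rubric": 0.2}

def grounding_score(response, facility_names, location_tokens):
    """1.0 for an exact facility name, 0.5 for a city/location token, else 0.0."""
    text = response.lower()
    if any(name.lower() in text for name in facility_names):
        return 1.0
    if any(token.lower() in text for token in location_tokens):
        return 0.5
    return 0.0

def plausibility_score(stated, truth, tolerance=0.5):
    """1.0 if the stated distance/ETA is within ±tolerance of truth (assumed binary)."""
    if stated is None:
        return 0.0
    return 1.0 if abs(stated - truth) <= tolerance * truth else 0.0

def rubric_score(checks):
    """Fraction of rubric items satisfied (assumed equal weighting)."""
    return sum(checks.values()) / len(checks)

def combined_score(grounding, plausibility, rubric):
    """Weighted sum of the three components, in [0, 1]."""
    return (
        WEIGHTS["grounding"] * grounding
        + WEIGHTS["plausibility"] * plausibility
        + WEIGHTS["rubric"] * rubric
    )

# Hypothetical response and reference values.
response = "High severity. Dispatch an ambulance to Example Hospital, about 12 km away."
g = grounding_score(response, ["Example Hospital"], ["Lomé"])
p = plausibility_score(stated=12.0, truth=10.0)
r = rubric_score({"severity": True, "dispatch_action": True, "constraints": False})
score = combined_score(g, p, r)
```

In practice the facility name list would be the 1,903-entry dataset, and the stated distance or ETA would first have to be parsed out of the model's free-text response.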

Statistical significance tested via McNemar's test on win/loss counts.
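As a worked check, the sketch below treats the 34 base-model wins and 60 yeye wins as the discordant pairs (ties excluded) and applies the chi-square form of McNemar's test without continuity correction. The README does not state which variant was used, so that choice is an assumption; it does reproduce the reported p-value.

```python
import math

def mcnemar_chi2(b, c):
    """McNemar's chi-square test (1 dof, no continuity correction).

    b and c are the two discordant counts. For one degree of freedom the
    chi-square survival function is erfc(sqrt(x / 2)).
    """
    chi2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

chi2, p_value = mcnemar_chi2(34, 60)
print(f"chi2={chi2:.2f}, p={p_value:.3f}")  # chi2=7.19, p=0.007
```

Applying a continuity correction or the exact binomial form would give a somewhat larger p-value, so the variant is worth stating alongside the result.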

Quickstart

Requirements

  • Python 3.10+
  • Google Colab Pro (T4 or A100 GPU) or equivalent
  • Hugging Face account with Mistral-7B access

Setup

pip install transformers peft bitsandbytes datasets accelerate

Run

Open yeye.ipynb in Google Colab and run all cells sequentially. The notebook handles:

  1. Infrastructure data extraction from OSM/HDX
  2. Instruction-tuning pair generation
  3. LoRA fine-tuning on Mistral-7B
  4. Blind evaluation and scoring
  5. Statistical testing and visualization

Limitations

  • Evaluation circularity: Training and test sets are generated by the same program — the model may have learned generator-specific patterns rather than geographic knowledge.
  • No human evaluation: Scoring dimensions are automated proxies; correspondence to expert dispatcher judgment is unknown.
  • Synthetic data only: No real Togo emergency call transcripts were used.
  • Language gap: All prompts are in English; operational dispatch in Togo uses French, Ewe, Kabiyè, and other local languages.
  • Resource-type confusion: The model sometimes dispatches the wrong resource type (e.g., police to a gas leak) — a safety-critical failure mode.
  • No RAG baseline: Retrieval-augmented generation was not compared.
  • Single country, single model: Generalizability is assumed but not demonstrated.

Data Sources and Licensing

| Source | License |
|---|---|
| OpenStreetMap | ODbL (Open Database License) |
| UN OCHA HDX | CC BY (Creative Commons Attribution) |

No personal data, emergency call recordings, or protected health information was used.

Citation

@inproceedings{yeye2026,
  title     = {yeye: Adapting Large Language Models for Emergency Dispatch
               in Togo Using Local Infrastructure Data},
  author    = {[Author]},
  booktitle = {Deep Learning Indaba 2026},
  year      = {2026}
}

License

This project is released for research purposes. See LICENSE for details.
