Benchmark and supporting code for BugStone
- ./data/ground_truth.json
  Ground truth mapping from security coding rules to commits.
- ./data/ground_truth_human_rules.json
  Ground truth mapping from human-written security coding rules to commits.
- ./prompts/prompts.json
  Prompt configurations used for evaluation.
- Clone the Linux kernel repository:
git clone https://github.com/torvalds/linux.git
- Run the evaluation script with options:
  run_eval.py [OPTIONS]

  [OPTIONS]:
    -h, --help
        Show help message and exit.
    -c PROMPT_CONFIG, --prompt_config PROMPT_CONFIG
        Prompt type: Basic, Patch, Rule, HuRule, Rule+Patch, or RuleNOCoT.
    -m MODEL_NAME, --model_name MODEL_NAME
        LLM model to use.
    -u BASE_URL, --base_url BASE_URL
        Service endpoint URL.
    -k API_KEY, --api_key API_KEY
        API key for model access.
    -l LINUX_DIR, --linux_dir LINUX_DIR
        Path to the cloned Linux kernel repository.
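
To evaluate every prompt type in one pass, the options above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper `build_command`, the model name, the endpoint URL, and the `./linux` path are hypothetical placeholders, and it assumes `run_eval.py` sits in the current directory and is run with the Python interpreter.

```python
import shlex
import sys

# Prompt types as listed for --prompt_config above.
PROMPT_CONFIGS = ["Basic", "Patch", "Rule", "HuRule", "Rule+Patch", "RuleNOCoT"]


def build_command(prompt_config, model_name, base_url, api_key, linux_dir):
    """Return the argv list for one run_eval.py invocation (hypothetical helper)."""
    if prompt_config not in PROMPT_CONFIGS:
        raise ValueError(f"unknown prompt config: {prompt_config}")
    return [
        sys.executable, "run_eval.py",
        "--prompt_config", prompt_config,
        "--model_name", model_name,
        "--base_url", base_url,
        "--api_key", api_key,
        "--linux_dir", linux_dir,
    ]


if __name__ == "__main__":
    for config in PROMPT_CONFIGS:
        cmd = build_command(
            config,
            model_name="my-model",                # assumption: placeholder model name
            base_url="http://localhost:8000/v1",  # assumption: placeholder endpoint
            api_key="YOUR_API_KEY",               # assumption: replace with a real key
            linux_dir="./linux",                  # assumption: path of the cloned kernel
        )
        # Print the command; pass cmd to subprocess.run(cmd, check=True) to execute it.
        print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems with values such as `Rule+Patch` or URLs.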