BugStone-Bench

Benchmark and supporting code for BugStone.

Ground Truth Data

./data/ground_truth.json
Ground truth mapping from security coding rules to commits.
./data/ground_truth_human_rules.json
Ground truth mapping from human-written security coding rules to commits.
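The layout of these JSON files is not described here, so a quick way to get oriented is to inspect them without assuming a schema. The following is a minimal sketch: `summarize_json` is a hypothetical helper (not part of this repository) that only reports the top-level type, the number of entries, and a few sample keys.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Return the top-level type, size, and a few sample keys of a JSON file.

    Hypothetical helper for illustration; it makes no assumption about
    the schema used by the ground truth files.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    summary = {"type": type(data).__name__, "size": None, "sample_keys": []}
    if isinstance(data, (dict, list)):
        summary["size"] = len(data)
    if isinstance(data, dict):
        # Show at most three keys so large files stay readable.
        summary["sample_keys"] = list(data)[:3]
    return summary


if __name__ == "__main__":
    for name in ("ground_truth.json", "ground_truth_human_rules.json"):
        path = Path("data") / name
        if path.exists():
            print(name, summarize_json(path))
        else:
            print(name, "not found; run this from the repository root")
```

Run from the repository root, this prints one summary line per file and leaves the files untouched.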

Prompts

./prompts/prompts.json
Prompt configurations used for evaluation.

Running Evaluation

Clone the Linux kernel repository:

git clone https://github.com/torvalds/linux.git
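Before starting a long evaluation run, it can help to confirm that the path you plan to pass as the kernel directory really is a kernel checkout. The sketch below is a heuristic only: `looks_like_kernel_checkout` is a hypothetical helper, and it assumes that the presence of `.git` together with the top-level `Makefile` and `MAINTAINERS` files is a good enough signal.

```python
from pathlib import Path


def looks_like_kernel_checkout(linux_dir):
    """Heuristic: a git checkout containing the kernel's top-level files.

    Hypothetical helper for illustration; it does not verify the
    checked-out revision or the completeness of the history.
    """
    root = Path(linux_dir)
    return (
        (root / ".git").exists()
        and (root / "Makefile").is_file()
        and (root / "MAINTAINERS").is_file()
    )


if __name__ == "__main__":
    # "linux" is the default directory name created by the clone command.
    print(looks_like_kernel_checkout("linux"))
```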

Run the evaluation script with options:

python run_eval.py [OPTIONS]

[OPTIONS]:
 -h, --help
     Show help message and exit.
 -c PROMPT_CONFIG, --prompt_config PROMPT_CONFIG
     Prompt type: Basic, Patch, Rule, HuRule, Rule+Patch, or RuleNOCoT.
 -m MODEL_NAME, --model_name MODEL_NAME
     Name of the LLM to use.
 -u BASE_URL, --base_url BASE_URL
     Service endpoint URL.
 -k API_KEY, --api_key API_KEY
     API key for model access.
 -l LINUX_DIR, --linux_dir LINUX_DIR
     Path to the cloned Linux kernel repository.
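To show how the options above fit together, here is a minimal sketch that assembles a full invocation from Python. `build_eval_command` is a hypothetical wrapper, not part of the repository. The model name, endpoint URL, and the `BUGSTONE_API_KEY` environment variable are placeholders chosen for illustration, and the sketch assumes that the prompt type is case-sensitive and that all five options are supplied together.

```python
import os
import shlex
import sys

# Prompt types as listed for --prompt_config.
PROMPT_CONFIGS = ("Basic", "Patch", "Rule", "HuRule", "Rule+Patch", "RuleNOCoT")


def build_eval_command(prompt_config, model_name, base_url, api_key, linux_dir):
    """Build the argument list for run_eval.py (hypothetical wrapper)."""
    if prompt_config not in PROMPT_CONFIGS:
        raise ValueError(
            f"unknown prompt type {prompt_config!r}; expected one of {PROMPT_CONFIGS}"
        )
    return [
        sys.executable, "run_eval.py",
        "--prompt_config", prompt_config,
        "--model_name", model_name,
        "--base_url", base_url,
        "--api_key", api_key,
        "--linux_dir", linux_dir,
    ]


if __name__ == "__main__":
    cmd = build_eval_command(
        prompt_config="Rule",
        model_name="example-model",            # placeholder model name
        base_url="http://localhost:8000/v1",   # placeholder endpoint
        api_key=os.environ.get("BUGSTONE_API_KEY", "YOUR_API_KEY"),  # placeholder variable
        linux_dir="./linux",
    )
    # Print the command for review; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Keeping the API key in an environment variable avoids storing it in scripts or version control; note that it is still passed on the command line and printed here, so it can appear in shell history or process listings.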
