USCDataScience/phishing.usc.edu

Phishing Data Visualization Site

Static USC Data Science site for the DSCI 550 Spring 2021 web data visualization assignment on phishing and fraudulent email analysis. The site collects student visualization projects, supporting datasets, and course source material in a GitHub Pages-ready layout modeled after ufo.usc.edu.

Published site:

Preview

cd /Users/mattmann/git/phishing.usc.edu
python3 -m http.server 8008

Then open http://localhost:8008/ in a browser.

Contents

  • index.html - phishing-themed landing page modeled after ufo.usc.edu
  • html/d3-examples.html - Spring 2021 team gallery
  • teams/team_4/ - Team 4 visualizations covering phishing email frequency, attacker geography, urgency over time, word clouds, and language-style patterns
  • teams/team_8/ - Team 8 / Team Banana D3 visualizations and ImageSpace screenshots covering send-time signatures, social-engineering keywords, attacker titles, unemployment context, and email-similarity distributions
  • assets/source-docs/ - submitted report, assignment PDF, and source phishing ARFF dataset
  • images/phishing-cybersecurity-banner.png - generated cybersecurity/phishing hero image
  • teams/team_8/d3/data/ - Team 8 visualization data, including raw similarity CSVs tracked with Git LFS

Teams

Team 4 contributed a collection of phishing email visualizations originally misplaced in the ufo.usc.edu material. Their work includes a calendar-style frequency view, map-based attacker-location exploration, stacked urgency/time analysis, word-cloud views, and a sunburst view of language styles.

Team 8 / Team Banana:

  • Madeleine Thompson
  • Sarah Pursley
  • Katie Chak
  • Amber Yu
  • Claudia Winarko

Their project combines D3 charts and ImageSpace screenshots to explore fraudulent email timing, social-engineering language, attacker title vocabulary, contextual country-level unemployment data, and pairwise similarity across phishing messages.

Data Notes

The published histogram uses teams/team_8/d3/data/similarity_histogram_bins.csv so the page loads quickly in the browser. The original raw Team 8 similarity files are preserved for reproducibility and tracked with Git LFS because they are too large for normal GitHub blobs:

  • teams/team_8/d3/data/cosine.csv
  • teams/team_8/d3/data/edit.csv
  • teams/team_8/d3/data/jaccard.csv
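
As a rough illustration of how a small binned file can stand in for the large raw similarity files, the sketch below counts similarity scores into equal-width bins and writes one row per bin. It is not the script Team 8 used. The function names, the output column names (bin_start, bin_end, count), the bin count, and the assumption that scores lie in [0, 1] are all hypothetical, and the actual layout of similarity_histogram_bins.csv may differ.

```python
import csv


def bin_similarities(values, num_bins=20, lo=0.0, hi=1.0):
    """Count values into num_bins equal-width bins covering [lo, hi].

    Assumption: scores fall in [lo, hi]; anything outside is skipped.
    """
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        if v < lo or v > hi:
            continue  # out-of-range score, ignored in this sketch
        # A score equal to hi belongs to the last bin, not a new one.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


def bins_to_rows(counts, lo=0.0, hi=1.0):
    """Turn bin counts into (bin_start, bin_end, count) tuples."""
    width = (hi - lo) / len(counts)
    return [(lo + i * width, lo + (i + 1) * width, c)
            for i, c in enumerate(counts)]


def write_bins(counts, path, lo=0.0, hi=1.0):
    """Write the binned histogram to a CSV file (hypothetical column names)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["bin_start", "bin_end", "count"])
        writer.writerows(bins_to_rows(counts, lo, hi))
```

The browser then only has to load one row per bin rather than one row per message pair, which is why the binned file loads quickly while the raw files need Git LFS.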

To clone the repository together with the full raw data, install Git LFS before cloning:

git lfs install
git clone https://github.com/USCDataScience/phishing.usc.edu.git
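
If the repository was cloned without Git LFS set up, the raw CSVs are checked out as small text pointer files instead of the real data. A minimal sketch for detecting that case is shown below. It relies on the Git LFS pointer format, whose first line is "version https://git-lfs.github.com/spec/v1". The helper name is_lfs_pointer is hypothetical and not part of this repository.

```python
# First line of every Git LFS pointer file, per the Git LFS pointer spec.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at path looks like a Git LFS pointer stub."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

For example, if is_lfs_pointer("teams/team_8/d3/data/cosine.csv") returns True in a fresh clone, the real file has not been downloaded yet; running git lfs pull inside the repository fetches it.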

Source Material

Original local sources:

  • /Users/mattmann/Desktop/STUFF/USC/Teaching/DSCI550-Spring2021/TEAM_BANANA_DSCI550_HW_DATAVIS
  • /Users/mattmann/git/ufo.usc.edu/teams/team_4
