Static USC Data Science site for the DSCI 550 Spring 2021 web data visualization assignment on phishing and fraudulent email analysis. The site collects student visualization projects, supporting datasets, and course source material in a GitHub Pages-ready layout modeled after ufo.usc.edu.
Published site:
cd /Users/mattmann/git/phishing.usc.edu
python3 -m http.server 8008
Then open:
- http://localhost:8008/
- http://localhost:8008/html/d3-examples.html
- http://localhost:8008/teams/team_4/team_4.html
- http://localhost:8008/teams/team_8/team_8.html
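Before starting the server, it can help to confirm that the pages behind these URLs exist in the checkout. The following is a minimal sketch, assuming it is run from the repository root and that `/` resolves to `index.html`; the `missing_pages` helper is hypothetical and not part of the repository.

```python
from pathlib import Path

# Paths taken from the preview URLs above; "/" is assumed to map to index.html.
EXPECTED_PAGES = [
    "index.html",
    "html/d3-examples.html",
    "teams/team_4/team_4.html",
    "teams/team_8/team_8.html",
]


def missing_pages(root="."):
    """Return the expected pages that are not regular files under root."""
    root = Path(root)
    return [page for page in EXPECTED_PAGES if not (root / page).is_file()]


if __name__ == "__main__":
    missing = missing_pages()
    if missing:
        print("Missing pages:")
        for page in missing:
            print("  " + page)
    else:
        print("All expected pages are present.")
```

An empty result means every listed URL should resolve once `python3 -m http.server 8008` is running from the same directory.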
- index.html - phishing-themed landing page modeled after ufo.usc.edu
- html/d3-examples.html - Spring 2021 team gallery
- teams/team_4/ - Team 4 visualizations covering phishing email frequency, attacker geography, urgency over time, word clouds, and language-style patterns
- teams/team_8/ - Team 8 / Team Banana D3 visualizations and ImageSpace screenshots covering send-time signatures, social-engineering keywords, attacker titles, unemployment context, and email-similarity distributions
- assets/source-docs/ - submitted report, assignment PDF, and source phishing ARFF dataset
- images/phishing-cybersecurity-banner.png - generated cybersecurity/phishing hero image
- teams/team_8/d3/data/ - Team 8 visualization data, including raw similarity CSVs tracked with Git LFS
Team 4 contributed a collection of phishing email visualizations originally misplaced in the ufo.usc.edu material. Their work includes a calendar-style frequency view, map-based attacker-location exploration, stacked urgency/time analysis, word-cloud views, and a sunburst view of language styles.
Team 8 / Team Banana:
- Madeleine Thompson
- Sarah Pursley
- Katie Chak
- Amber Yu
- Claudia Winarko
Their project combines D3 charts and ImageSpace screenshots to explore fraudulent email timing, social-engineering language, attacker title vocabulary, contextual country-level unemployment data, and pairwise similarity across phishing messages.
The published histogram uses teams/team_8/d3/data/similarity_histogram_bins.csv so the page loads quickly in the browser. The original raw Team 8 similarity files are preserved for reproducibility and tracked with Git LFS because they are too large for normal GitHub blobs:
- teams/team_8/d3/data/cosine.csv
- teams/team_8/d3/data/edit.csv
- teams/team_8/d3/data/jaccard.csv
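The idea of reducing a large raw similarity file to a small table of histogram bins can be sketched as follows. This is an illustration only, not the script that produced similarity_histogram_bins.csv. It assumes each raw CSV has a header row and a numeric score column, that scores fall in the range 0 to 1 (edit distances may need a different range), and it uses hypothetical names throughout: the `similarity` column, the output header `bin_start,bin_end,count`, and the helper functions themselves.

```python
import csv


def bin_similarities(values, n_bins=20, lo=0.0, hi=1.0):
    """Count values into n_bins equal-width bins over [lo, hi].

    Returns a list of (bin_start, bin_end, count) tuples.
    Values outside [lo, hi] are skipped.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < lo or v > hi:
            continue
        # A value equal to hi is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, c)
            for i, c in enumerate(counts)]


def read_scores(path, column="similarity"):
    """Stream one numeric column from a CSV without loading the whole file.

    The column name is an assumption; adjust it to the real header.
    """
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield float(row[column])


def write_bins(bins, out_path):
    """Write the bins as a small CSV that a browser can load quickly."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["bin_start", "bin_end", "count"])  # hypothetical header
        writer.writerows(bins)
```

A call such as `write_bins(bin_similarities(read_scores("teams/team_8/d3/data/cosine.csv")), "bins.csv")` would then produce a file with one row per bin, regardless of how many rows the raw file holds. Streaming the rows keeps memory use flat for large inputs.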
To clone the full data:
git lfs install
git clone https://github.com/USCDataScience/phishing.usc.edu.git
Original local sources:
- /Users/mattmann/Desktop/STUFF/USC/Teaching/DSCI550-Spring2021/TEAM_BANANA_DSCI550_HW_DATAVIS
- /Users/mattmann/git/ufo.usc.edu/teams/team_4
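Returning to the clone step: if the repository is cloned without Git LFS set up, the raw similarity CSVs may arrive as small text pointer files rather than the actual data. The sketch below checks for that case by looking for the first line of the Git LFS pointer format. The `is_lfs_pointer` helper is hypothetical, and the script assumes it is run from the repository root.

```python
from pathlib import Path

# Git LFS pointer files begin with this version line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

# Raw similarity files listed earlier in this README.
RAW_FILES = [
    "teams/team_8/d3/data/cosine.csv",
    "teams/team_8/d3/data/edit.csv",
    "teams/team_8/d3/data/jaccard.csv",
]


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer version line."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


if __name__ == "__main__":
    for name in RAW_FILES:
        path = Path(name)
        if not path.is_file():
            print(f"{name}: not found")
        elif is_lfs_pointer(path):
            print(f"{name}: LFS pointer only")
        else:
            print(f"{name}: data present")
```

If any file is reported as a pointer, running `git lfs pull` inside the clone should fetch the real contents.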