Fiftyville (CS50)

SQL Investigation

Description:

An SQL-based investigation project, centered on solving the CS50 Duck theft in the fictional town of Fiftyville. Utilizing meticulous SQL queries to analyze the ‘fiftyville.db’ database, it unravels the mystery by piecing together information from multiple tables, identifying the thief, and revealing their escape plan and accomplice.

Objectives:

  1. To determine the identity of the CS50 Duck thief.
  2. To trace the city the thief fled to post-theft.
  3. To expose the accomplice involved in the heist.

Detail:

SQL Query Analysis

  1. Crime Scene Investigation:
    • Started with identifying the crime scene report from ‘crime_scene_reports’, pinpointing the exact date, location, and time of the theft.
    • SELECT *
      FROM crime_scene_reports
      WHERE year = 2021 AND month = 7 AND day = 28 AND street = 'Humphrey Street';
      
  2. Witness Interviews:
    • Extracted relevant witness interviews from the ‘interviews’ table. These interviews provided critical clues, such as the thief getting into a car and making a phone call.
  3. ATM Transaction Analysis:
    • Identified individuals who made ATM withdrawals on the day of the crime at a specific location (‘Leggett Street’) using a join between ‘people’, ‘bank_accounts’, and ‘atm_transactions’ tables. This narrowed down the list of potential suspects.
    • SELECT name
      FROM people
      JOIN bank_accounts ON bank_accounts.person_id = people.id
      JOIN atm_transactions ON atm_transactions.account_number = bank_accounts.account_number
      WHERE year = 2021 AND month = 7 AND day = 28 AND atm_location = 'Leggett Street' AND transaction_type = 'withdraw';
      
  4. Phone Call Duration Scrutiny:
    • Further narrowed down suspects by cross-referencing those who made brief phone calls (under 60 seconds) on the same day, implying coordination related to the crime.
  5. License Plate and Security Logs Check:
    • Retrieved the license plates of the remaining suspects and cross-referenced them with bakery security logs to determine who left the bakery’s vicinity soon after the theft.
  6. Final Suspect Identification:
    • Examined phone calls made by the final suspects and checked their presence on the earliest flight out of town the next day. This was achieved by joining ‘phone_calls’, ‘people’, ‘passengers’, and ‘flights’ tables.
    • SELECT name
      FROM people
      JOIN bakery_security_logs ON bakery_security_logs.license_plate = people.license_plate
      WHERE bakery_security_logs.hour = 10 AND bakery_security_logs.minute >= 15 AND bakery_security_logs.minute <= 25;
      
  7. Escape Route and Accomplice Identification:
    • Determined the flight destination and identified the accomplice based on the phone numbers called by the suspects.

SQL Techniques

  • Nested Queries and Joins: Extensively used nested queries and joins across multiple tables to correlate data from different sources, essential for solving complex real-world data analysis problems.
  • Date and Time Filtering: Applied precise filtering based on date and time to pinpoint activities relevant to the crime timeline.
  • Ordering and Limiting Results: Used ordering and limiting results to identify key information, such as the earliest flight out of town.
  • Combining Multiple Data Points: Integrated information from various aspects of the suspects’ activities (ATM withdrawals, phone calls, car movements) to build a comprehensive profile leading to the identification of the culprit.
Explore details

Project Status:

Completed

Data sources:

Problem Set 7 from Harvard’s CS50x 2023.

Full Overview