✅ Python Program to Find Email Distribution by Hour (mbox-short.txt)
10.2 Write a program to read through the mbox-short.txt and figure out the distribution by hour of the day for each of the messages. You can pull the hour out from the 'From ' line by finding the time and then splitting the string a second time using a colon.
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Once you have accumulated the counts for each hour, print out the counts, sorted by hour as shown below.
Python example showing how to analyze email timestamps and calculate hourly distribution using dictionaries.
📘 Problem Statement
Write a Python program to read through the mbox-short.txt file and determine the distribution of emails by hour of the day.
Each email message contains a line starting with "From " (the word From followed by a space) that includes a timestamp like:
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Your task is to:
- Extract the hour from the time (09:14:16)
- Count how many emails were sent during each hour
- Display results sorted by hour
🧠 Logic Behind the Program
- Open the mailbox file.
- Read each line.
- Select only lines starting with "From " (with space).
- Extract the time field.
- Split the time using : to get the hour.
- Store counts using a dictionary.
- Sort and print results.
💻 Optimized Python Program
name = input("Enter file: ")
if len(name) < 1:
    name = "mbox-short.txt"  # default file when the user just presses Enter
handle = open(name)
counts = {}
for line in handle:
    # Only the "From " lines (with a trailing space) carry the timestamp
    if line.startswith("From "):
        words = line.split()
        time = words[5]            # e.g. "09:14:16"
        hour = time.split(":")[0]  # e.g. "09"
        counts[hour] = counts.get(hour, 0) + 1
handle.close()
# Sort by hour
result = sorted(counts.items())
for hour, count in result:
    print(hour, count)
📊 Example Output
04 3
06 1
07 1
09 2
10 3
11 6
14 1
15 2
16 4
17 2
18 1
19 1
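As a variation, the same logic can be wrapped in a function so it can be tried without the file. This is only a sketch: the function name count_by_hour, the length check, and the example.com addresses in the sample are assumptions added for illustration and are not part of the original exercise.

```python
def count_by_hour(lines):
    """Return a dict mapping hour strings (e.g. "09") to message counts."""
    counts = {}
    for line in lines:
        # Keep only the "From " lines, which carry the timestamp.
        if not line.startswith("From "):
            continue
        words = line.split()
        if len(words) < 6:  # assumed guard: skip lines with no time field
            continue
        hour = words[5].split(":")[0]
        counts[hour] = counts.get(hour, 0) + 1
    return counts


# Small in-memory sample; only the first line comes from the exercise text,
# the others are made up for illustration.
sample = [
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008",
    "From: stephen.marquard@uct.ac.za",
    "From someone@example.com Sat Jan 5 09:30:00 2008",
    "From other@example.com Fri Jan 4 16:10:39 2008",
]

for hour, count in sorted(count_by_hour(sample).items()):
    print(hour, count)
```

Because the function accepts any iterable of lines, an open file handle can be passed to it in place of the sample list.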
🔍 Code Explanation
✔ Dictionary Counting
counts.get(hour, 0) + 1
- If hour exists → increase count
- If not → start from 0
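A minimal sketch of the counting pattern on its own, using a made-up list of hours rather than the real file. The comparison with collections.Counter is an aside and is not used in the program above.

```python
from collections import Counter

# Hypothetical list of hours, used only to illustrate the pattern.
hours = ["09", "11", "09", "16", "11", "11"]

counts = {}
for hour in hours:
    # get() returns the current count, or 0 if the hour has not been seen yet.
    counts[hour] = counts.get(hour, 0) + 1

print(counts)  # {'09': 2, '11': 3, '16': 1}

# Counter from the standard library produces the same tallies in one call.
print(Counter(hours) == counts)  # True
```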
✔ Extracting Hour
time.split(":")[0]
Converts:
09:14:16 → 09
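To see why index 5 holds the time, this short sketch traces the sample line from the problem statement through both splits. The variable name time_field is illustrative.

```python
# Sample "From " line taken from the problem statement.
line = "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008"

# split() with no arguments splits on any run of whitespace,
# so extra spaces between fields do not create empty entries.
words = line.split()
print(words)
# ['From', 'stephen.marquard@uct.ac.za', 'Sat', 'Jan', '5', '09:14:16', '2008']

time_field = words[5]            # '09:14:16'
hour = time_field.split(":")[0]  # '09'
print(hour)
```

The hour stays a string ("09", not 9), which is what the program uses as the dictionary key.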
✔ Sorting Output
sorted(counts.items())
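Sorts the (hour, count) pairs by hour. The hours are strings, but because they are zero-padded to two digits, alphabetical order matches numeric order.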
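A small sketch of how the sort behaves, using a few counts from the example output in shuffled order. The unpadded strings at the end are a made-up contrast case showing what would happen without the leading zero.

```python
# A few counts from the example output, deliberately out of order.
counts = {"16": 4, "04": 3, "11": 6, "09": 2}

# items() yields (hour, count) tuples; tuples compare by their first element
# first, so the result is ordered by the hour string.
by_hour = sorted(counts.items())
print(by_hour)  # [('04', 3), ('09', 2), ('11', 6), ('16', 4)]

# Without zero padding, string order and numeric order differ.
unpadded = ["9", "10", "4"]
print(sorted(unpadded))           # ['10', '4', '9']
print(sorted(unpadded, key=int))  # ['4', '9', '10']
```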
⚠️ Common Mistakes (Avoid These)
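❌ Matching "From:" lines instead of "From " lines (only the "From " lines contain the timestamp)
❌ Forgetting the space after From, so that "From:" lines are matched as well
❌ Naming a variable list (this shadows Python's built-in list type)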
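The first two mistakes can be seen in a short sketch. The second line in the sample is an illustrative "From:" header of the usual form, not copied from the file.

```python
lines = [
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008",
    "From: stephen.marquard@uct.ac.za",  # illustrative header line
]

# "From " (with the trailing space) matches only the line with the timestamp.
with_space = [ln for ln in lines if ln.startswith("From ")]
# "From" (no space) also matches the "From:" line, which has no time field.
without_space = [ln for ln in lines if ln.startswith("From")]

print(len(with_space), len(without_space))  # 1 2

# Indexing words[5] on a "From:" line fails because it has only two fields.
try:
    lines[1].split()[5]
    failed = False
except IndexError:
    failed = True
print(failed)  # True
```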