Overview
This guide shows how to build a system resource logger that records CPU, memory, disk usage, and top processes on your Ubuntu server. The resulting log is useful for troubleshooting backup failures and identifying resource bottlenecks.
1. Creating the Logger Script
Step 1: Create the monitoring script
sudo nano /usr/local/bin/server_monitor.sh
Step 2: Add the script content
#!/bin/bash
# Configuration
LOG_FILE="/var/log/server_monitor.log"
MAX_LOG_SIZE="100M"
DATE=$(date '+%Y-%m-%d %H:%M:%S')
# Create log entry
{
echo "=== SYSTEM MONITOR: $DATE ==="
# System Load and Uptime
echo "UPTIME & LOAD:"
uptime
# CPU Usage
echo "CPU USAGE:"
top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1 | sed 's/us,//' | xargs -I {} echo "CPU: {}%"
# Memory Usage
echo "MEMORY USAGE:"
free -h | grep -E "(Mem:|Swap:)"
# Disk Usage
echo "DISK USAGE:"
df -h / | tail -1 | awk '{print "Root: " $5 " used (" $3 "/" $2 ")"}'
# Top 3 CPU consuming processes
echo "TOP CPU PROCESSES:"
ps aux --sort=-%cpu | head -4 | tail -3 | awk '{printf "%-10s %s%% %s\n", $1, $3, $11}'
# Top 3 Memory consuming processes
echo "TOP MEMORY PROCESSES:"
ps aux --sort=-%mem | head -4 | tail -3 | awk '{printf "%-10s %s%% %s\n", $1, $4, $11}'
# MongoDB and Node.js specific monitoring
echo "MONGODB & NODE.JS PROCESSES:"
# awk always exits 0, so test for empty output instead of relying on ||
MONITORED=$(ps aux | grep -E "(mongo|node)" | grep -v grep | awk '{printf "%-10s CPU:%s%% MEM:%s%% %s\n", $1, $3, $4, $11}')
echo "${MONITORED:-No MongoDB/Node.js processes found}"
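# Listening sockets on the MongoDB, Node.js, and HTTPS ports (if needed for backup monitoring)
# ss replaces netstat, which recent Ubuntu releases do not install by default;
# the -l flag lists listening sockets, not established connections
echo "NETWORK CONNECTIONS:"
echo "Listening sockets on ports 27017/3000/443: $(ss -tuln | grep -cE ':(27017|3000|443)[[:space:]]')"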
echo "----------------------------------------"
echo ""
} >> "$LOG_FILE"
# Rotate log if it gets too large
if [ -f "$LOG_FILE" ]; then
    # GNU stat (Ubuntu) prints the file size in bytes with -c %s
    LOG_SIZE=$(stat -c %s "$LOG_FILE")
    # Convert MAX_LOG_SIZE (e.g. "100M") to bytes: 100M = 104857600
    MAX_SIZE_BYTES=$(numfmt --from=iec "$MAX_LOG_SIZE")
    if [ "$LOG_SIZE" -gt "$MAX_SIZE_BYTES" ]; then
        mv "$LOG_FILE" "${LOG_FILE}.old"
        echo "Log rotated at $(date)" > "$LOG_FILE"
    fi
fi
Step 3: Make the script executable
sudo chmod +x /usr/local/bin/server_monitor.sh
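The CPU line in the script logs the first value of top's summary line, which is user-space time (`us`) only. If total busy time is more useful, one option is to compute 100 minus the idle percentage. The sketch below assumes the comma-separated `%Cpu(s): ... id, ...` summary format printed by recent versions of top; `cpu_busy_percent` is a hypothetical helper name, not part of the script above, and the sample values are illustrative.

```shell
# Hypothetical helper: derive total busy CPU from a top summary line.
# Assumes a comma-separated line that contains an "NN.N id" field.
cpu_busy_percent() {
    # $1: the "%Cpu(s): ..." line, e.g. from: LC_ALL=C top -bn1 | grep "Cpu(s)"
    printf '%s\n' "$1" | awk -F'[:,]' '{
        for (i = 2; i <= NF; i++)
            if ($i ~ /id/) { printf "%.1f\n", 100 - $i; exit }
    }'
}

# Example with a captured sample line (values are illustrative only)
sample='%Cpu(s):  1.2 us,  0.5 sy,  0.0 ni, 98.0 id,  0.2 wa,  0.0 hi,  0.1 si,  0.0 st'
echo "CPU: $(cpu_busy_percent "$sample")%"   # prints "CPU: 2.0%"
```

Running top under `LC_ALL=C` keeps the decimal separator a period, which the comma-based field split relies on.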
2. Setting Up Automated Logging
Option A: Every 5 minutes (24/7 monitoring)
sudo crontab -e
Add this line (root's crontab is used because the script writes to /var/log):
*/5 * * * * /usr/local/bin/server_monitor.sh
Option B: Only during backup hours (6:25 AM - 7:10 AM)
sudo crontab -e
Add these lines:
25-55/5 6 * * * /usr/local/bin/server_monitor.sh
0-10/5 7 * * * /usr/local/bin/server_monitor.sh
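A range with a step, such as `0-10/5`, fires at the first value of the range and then at every step up to the last value. To check which minutes a schedule covers before installing it, a small helper can expand the minute field. This is a sketch only: `expand_minutes` is a hypothetical name, and it handles `*`, a single value, a range, and an optional step, but not comma-separated lists.

```shell
# Hypothetical helper: list the minutes matched by one cron minute field.
# Handles "*", "N", "A-B", and an optional "/step"; comma lists are not handled.
expand_minutes() {
    field=$1
    step=1
    case $field in
        */*) step=${field#*/}; field=${field%/*} ;;
    esac
    case $field in
        \*)  lo=0; hi=59 ;;
        *-*) lo=${field%-*}; hi=${field#*-} ;;
        *)   lo=$field; hi=$field ;;
    esac
    seq "$lo" "$step" "$hi" | paste -sd' ' -
}

expand_minutes '0-10/5'   # prints: 0 5 10
expand_minutes '*/5'      # prints: 0 5 10 15 20 25 30 35 40 45 50 55
```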
Option C: Custom schedule for specific problems
# Every hour during business hours
0 9-17 * * 1-5 /usr/local/bin/server_monitor.sh
# Every 10 minutes during peak times
*/10 6-8,17-19 * * * /usr/local/bin/server_monitor.sh
3. Log Storage and Management
Log Location and Format
- File: /var/log/server_monitor.log
- Format: Timestamped entries with clear sections
- Size: Approximately 2-5 KB per entry
- Daily entries: 288 entries (every 5 minutes) = ~0.6-1.4 MB/day
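The daily figure follows from the run interval and the per-entry size: 1440 minutes per day divided by the interval gives the number of entries, multiplied by the size of one entry. A minimal sketch of that arithmetic, using a hypothetical `daily_log_kb` helper and taking the 2-5 KB entry size above as the assumption:

```shell
# Hypothetical helper: estimate daily log growth in KB.
# $1: minutes between runs, $2: assumed size of one entry in KB
daily_log_kb() {
    entries_per_day=$((1440 / $1))      # 1440 minutes in a day
    echo $((entries_per_day * $2))
}

daily_log_kb 5 2    # 288 entries at 2 KB each: prints 576
daily_log_kb 5 5    # 288 entries at 5 KB each: prints 1440
```

Measuring a real entry (for example, the log size after one manual run) gives a better input than the assumed range.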
Automatic Log Rotation
Create log rotation configuration:
sudo nano /etc/logrotate.d/server_monitor
Add this content:
/var/log/server_monitor.log {
daily
rotate 30
compress
delaycompress
missingok
notifempty
create 644 root root
postrotate
# Optional: restart any services if needed
endscript
}
Manual Log Management
# View recent logs
tail -f /var/log/server_monitor.log
# View logs from specific time
grep "2025-08-02 07:" /var/log/server_monitor.log
# Check log file size
ls -lh /var/log/server_monitor.log
# Clean old rotated logs manually (files in /var/log are owned by root)
sudo find /var/log -name "server_monitor.log.*" -mtime +30 -delete
4. Log Retention Policy
Default Retention: 30 Days
- Active log: /var/log/server_monitor.log
- Rotated logs: /var/log/server_monitor.log.1, /var/log/server_monitor.log.2.gz, etc. (delaycompress leaves the newest rotated file uncompressed)
- Total retention: 30 days of rotated logs
Disk Space Usage
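| Frequency | Daily Size | Monthly Size (uncompressed) | 30-Day Retention (on disk) |
|---|---|---|---|
| Every 5 min | ~0.6-1.4 MB | ~17-43 MB | Under ~43 MB (rotated files are compressed) |
| Every 10 min | ~0.3-0.7 MB | ~9-22 MB | Under ~22 MB (rotated files are compressed) |
| Hourly | ~50-120 KB | ~1.4-3.6 MB | Under ~3.6 MB (rotated files are compressed) |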
Custom Retention Settings
To change the retention period, edit /etc/logrotate.d/server_monitor and set a single rotate value. With daily rotation, the count equals the number of days kept:
# Keep logs for 7 days
rotate 7
# Keep logs for 90 days
rotate 90
# Keep logs for 1 year
rotate 365
5. Troubleshooting Specific Problems
For Backup Failures (like your 7 AM issue)
# Check logs around backup time
grep -A 10 -B 5 "2025-08-02 07:0[0-9]" /var/log/server_monitor.log
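# Look for high CPU usage (80% and above; logged values have decimals, e.g. "CPU: 85.3%")
grep -E '^CPU: ([8-9][0-9]|100)(\.[0-9]+)?%' /var/log/server_monitor.log
# Check memory pressure (free -h logs sizes, not percentages, so list used/total per entry)
awk '/^=== SYSTEM MONITOR: / {ts = $4 " " $5} /^Mem:/ {print ts, "used", $3, "of", $2, "(available " $7 ")"}' /var/log/server_monitor.log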
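Context flags such as `-A 10 -B 5` return a fixed number of lines, which can cut an entry in half. An alternative sketch is to print whole entries whose header timestamp falls inside a time window. It assumes the `=== SYSTEM MONITOR: <date> <time> ===` header written by the script; `entries_between` is a hypothetical helper name.

```shell
# Hypothetical helper: print every log entry whose header timestamp lies
# within the given window (inclusive). Timestamps compare correctly as
# strings because they are zero-padded "YYYY-MM-DD HH:MM:SS".
entries_between() {
    # $1: log file, $2: window start, $3: window end
    awk -v from="$2" -v until="$3" '
        /^=== SYSTEM MONITOR: / {
            ts = $4 " " $5
            show = (ts >= from && ts <= until)
        }
        show
    ' "$1"
}

# Example call (dates are illustrative):
#   entries_between /var/log/server_monitor.log "2025-08-02 06:55:00" "2025-08-02 07:10:00"
```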
For Performance Issues
# Find peak usage times (top 10 CPU readings with their timestamps)
awk '/^=== SYSTEM MONITOR: / {ts = $4 " " $5} /^CPU: / {print $2 + 0, ts}' /var/log/server_monitor.log | sort -nr | head -10
# Monitor specific processes
grep -A 5 "MONGODB & NODE.JS" /var/log/server_monitor.log
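Individual readings can be noisy, so averaging them per hour may make recurring load patterns easier to spot. The sketch below assumes the `CPU: N%` lines and entry headers written by the script; `cpu_hourly_avg` is a hypothetical helper name.

```shell
# Hypothetical helper: average the logged "CPU: N%" values per hour.
cpu_hourly_avg() {
    # $1: log file in the format written by server_monitor.sh
    awk '
        /^=== SYSTEM MONITOR: / { hour = $4 " " substr($5, 1, 2) ":00" }
        /^CPU: /                { sum[hour] += $2; count[hour]++ }  # "12.5%" is read as 12.5
        END { for (h in sum) printf "%s %.1f\n", h, sum[h] / count[h] }
    ' "$1" | sort
}

# Example call:
#   cpu_hourly_avg /var/log/server_monitor.log
```

Each output line has the form `<date> <hour>:00 <average>`, sorted chronologically.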
Real-time Monitoring During Issues
# Watch logs in real-time
tail -f /var/log/server_monitor.log
# Filter for specific problems (CPU readings, memory lines, and alerts)
tail -f /var/log/server_monitor.log | grep --line-buffered -E "(^CPU:|^Mem:|ALERT)"
6. Advanced Usage
Create Alerts for High Usage
Add to your monitoring script:
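# Alert thresholds (percent)
CPU_THRESHOLD=80
MEMORY_THRESHOLD=85
DISK_THRESHOLD=90
# Current user CPU as a whole number ([ -gt ] cannot compare decimals such as 85.3)
CPU_USAGE=$(top -bn1 | awk '/Cpu\(s\)/ {print int($2)}')
# Check and alert (add email/notification logic)
if [ "${CPU_USAGE:-0}" -gt "$CPU_THRESHOLD" ]; then
    echo "ALERT: High CPU usage detected: ${CPU_USAGE}%" >> "$LOG_FILE"
fi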
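The memory and disk thresholds can be checked in the same way once each reading is reduced to a whole-number percentage. The following is one possible sketch, not the original script's method: the three function names are hypothetical, and the parsers assume the column layout of `free` (total, used, ...) and `df -P` (capacity in the fifth column).

```shell
# Hypothetical helpers, not part of the original script.
# Each parser reads command output on stdin so it can be tried on sample text.

# Used memory as a whole-number percentage, from the output of free
mem_percent_from_free() {
    awk '/^Mem:/ { print int($3 * 100 / $2) }'
}

# Used disk space as a whole number, from the output of df -P / (second line)
disk_percent_from_df() {
    awk 'NR == 2 { print int($5) }'
}

# Print an alert line when an integer percentage exceeds its threshold
check_threshold() {
    # $1: label, $2: current percent, $3: threshold percent
    if [ "$2" -gt "$3" ]; then
        echo "ALERT: High $1 usage detected: $2%"
    fi
}

# Possible use inside the monitoring script:
#   check_threshold memory "$(free | mem_percent_from_free)" "$MEMORY_THRESHOLD" >> "$LOG_FILE"
#   check_threshold disk "$(df -P / | disk_percent_from_df)" "$DISK_THRESHOLD" >> "$LOG_FILE"
```

Keeping the parsing separate from the comparison makes each piece easy to try against pasted sample output before wiring it into cron.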
Integration with Backup Scripts
Modify your backup script to log before/after:
# At start of backup
/usr/local/bin/server_monitor.sh
echo "BACKUP START: $(date)" >> /var/log/server_monitor.log
# Your backup commands here
# At end of backup
echo "BACKUP END: $(date)" >> /var/log/server_monitor.log
/usr/local/bin/server_monitor.sh
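If the backup is a single command, the start and end markers can be written by a small wrapper that also records the exit status, which makes a failed run visible in the log. This is a sketch under that assumption; `run_with_markers` is a hypothetical name and the backup path in the usage comment is a placeholder.

```shell
# Hypothetical wrapper: log BACKUP START/END markers around any command
# and record the command's exit status in the END marker.
run_with_markers() {
    # $1: log file; remaining arguments: the command to run
    log=$1
    shift
    echo "BACKUP START: $(date '+%Y-%m-%d %H:%M:%S')" >> "$log"
    "$@"
    status=$?
    echo "BACKUP END: $(date '+%Y-%m-%d %H:%M:%S') (exit status $status)" >> "$log"
    return "$status"
}

# Example call (placeholder path for your own backup script):
#   run_with_markers /var/log/server_monitor.log /path/to/your_backup.sh
```

The wrapper returns the wrapped command's exit status, so cron or a calling script can still react to a failed backup.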
7. Maintenance Commands
Check Logger Status
# Verify the cron job is installed (sudo lists root's crontab)
sudo crontab -l | grep server_monitor
# Check last execution
ls -la /var/log/server_monitor.log
# Test script manually
sudo /usr/local/bin/server_monitor.sh
Clean Up Resources
# Force a rotation now (this also prunes rotated logs beyond the retention count)
sudo logrotate -f /etc/logrotate.d/server_monitor
# Check disk usage
du -sh /var/log/server_monitor.log*
# Emergency cleanup (delete rotated logs older than 7 days)
sudo find /var/log -name "server_monitor.log.*" -mtime +7 -delete
This logger will help you identify exactly what’s causing your 7 AM backup failures by providing detailed resource usage data around that time period.
