# Email Reports Automation - Deployment Guide

Guide for deploying the Email Reports Automation System from a Windows development machine to a Linux production server via cPanel.

## Table of Contents

1. [Deployment Overview](#deployment-overview)
2. [Pre-Deployment Checklist](#pre-deployment-checklist)
3. [Server Requirements](#server-requirements)
4. [Deployment Steps](#deployment-steps)
5. [Post-Deployment Testing](#post-deployment-testing)
6. [Scheduling Automation](#scheduling-automation)
7. [Maintenance & Updates](#maintenance--updates)
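8. [Troubleshooting](#troubleshooting)
9. [Cross-Platform Compatibility Notes](#cross-platform-compatibility-notes)
10. [Backup & Disaster Recovery](#backup--disaster-recovery)
11. [Performance Optimization](#performance-optimization)
12. [Security Considerations](#security-considerations)
13. [Support](#support)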

---

## Deployment Overview

### Architecture

- **Development**: Windows 10/11 PC (`c:\Apps\Email Reports`)
- **Production**: Linux server with cPanel (shared hosting or VPS)
- **Deployment Method**: Manual file upload via cPanel File Manager or FTP/SFTP
- **Execution**: SSH command line or cPanel cron jobs

### Deployment Strategy

1. Develop and test locally on Windows
2. Upload files to Linux server via cPanel
3. Configure environment for Linux paths
4. Test manually via SSH
5. Set up monthly cron job for automation

---

## Pre-Deployment Checklist

### Local Testing (Windows)

Before deploying, verify everything works on Windows:

- [ ] All 30 PDFs processed successfully
- [ ] All clients matched correctly in database
- [ ] Emails generated without errors
- [ ] Gmail API authentication working
- [ ] Google Sheets approval workflow functional
- [ ] Gmail drafts created successfully
- [ ] Unit tests passing (`python run_tests.py`)
- [ ] Integration tests passing
- [ ] Documentation up to date

### Gather Credentials

Collect the following files and information:

- [ ] `credentials.json` (Gmail API OAuth client)
- [ ] `token.json` (Gmail API authorization token)
- [ ] `.env` file with production configuration
- [ ] `data/clients.csv` with all 30 clients
- [ ] cPanel login credentials (username, password, URL)
- [ ] Server Python version (verify 3.8+)
- [ ] Server SSH access (if available)

---

## Server Requirements

### Minimum Requirements

- **Operating System**: Linux (CentOS, Ubuntu, Debian)
- **Python**: 3.8 or higher
- **Disk Space**: 2GB minimum (for PDFs, logs, virtual environment)
- **RAM**: 512MB minimum (1GB recommended)
- **cPanel**: Access to File Manager, Terminal (SSH), Cron Jobs
- **Internet**: Outbound HTTPS access for Gmail API (port 443)

### Verify Server Python Version

Via cPanel Terminal or SSH:

```bash
python3 --version
# Should show: Python 3.8.x or higher
```

If Python 3.8+ is not available, contact your hosting provider.
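
The same check can be scripted so that a deployment stops early on an unsupported interpreter. This is a minimal sketch; `meets_minimum` is a hypothetical helper name, not part of the project:

```python
import sys

MINIMUM = (3, 8)  # minimum version stated in this guide


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```

Run it on the server with `python3`, using whatever file name you saved it under.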

---

## Deployment Steps

### Step 1: Prepare Files for Upload

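On Windows, create the deployment package (the commands below run as written in PowerShell):

```powershell
cd "c:\Apps\Email Reports"

# Create deployment folder
mkdir deploy

# Copy application files (exclude development files)
xcopy src deploy\src\ /E /I
xcopy templates deploy\templates\ /E /I
xcopy tests deploy\tests\ /E /I

# xcopy /E also copies __pycache__ folders; remove them from the package
Get-ChildItem deploy -Recurse -Directory -Filter __pycache__ | Remove-Item -Recurse -Force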
copy main.py deploy\
copy requirements.txt deploy\
copy .env deploy\
copy credentials.json deploy\
copy token.json deploy\

# Copy data (client database only, not PDFs)
mkdir deploy\data
copy data\clients.csv deploy\data\

# Create empty directories for logs and PDFs
mkdir deploy\logs
mkdir deploy\data\pdfs
mkdir deploy\data\email_previews
mkdir deploy\data\archive
```

**What NOT to upload:**
- `venv/` (will recreate on server)
- `__pycache__/` (will regenerate)
- `logs/*.log` (old log files)
- `data/pdfs/*.pdf` (test PDFs)
- `.git/` (version control files)
- `test_*.py` (root-level test scripts, not in tests/)
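
The exclusion list above can also be applied programmatically, for example to build a single zip for upload. The sketch below is one possible approach and not the project's own tooling; the names `should_exclude` and `build_package` are illustrative, and skipping the `deploy/` staging folder is an added assumption:

```python
import fnmatch
import zipfile
from pathlib import Path

# Patterns mirror the "What NOT to upload" list above.
# "deploy" is an extra assumption: the staging folder created in Step 1.
EXCLUDED_DIRS = {"venv", "__pycache__", ".git", "deploy"}
EXCLUDED_FILES = ["logs/*.log", "data/pdfs/*.pdf"]
ROOT_ONLY_FILES = ["test_*.py"]  # root-level scripts only; tests/ is kept


def should_exclude(rel_path):
    """Return True if a project-relative path (with / separators) is left out."""
    parts = rel_path.split("/")
    if any(part in EXCLUDED_DIRS for part in parts[:-1]):
        return True
    if any(fnmatch.fnmatchcase(rel_path, pat) for pat in EXCLUDED_FILES):
        return True
    return len(parts) == 1 and any(
        fnmatch.fnmatchcase(rel_path, pat) for pat in ROOT_ONLY_FILES
    )


def build_package(project_root, zip_path):
    """Zip every file that is not excluded and return the archived names."""
    project_root = Path(project_root).resolve()
    zip_path = Path(zip_path).resolve()
    # Collect files first so the archive being written is never picked up.
    files = [p for p in project_root.rglob("*") if p.is_file() and p != zip_path]
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(files):
            rel = path.relative_to(project_root).as_posix()
            if not should_exclude(rel):
                archive.write(path, rel)
                added.append(rel)
    return added
```

Note that a zip built this way contains files only, so empty directories such as `logs/` and `data/pdfs/` still need to be created on the server.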

### Step 2: Upload Files to Server

#### Option A: cPanel File Manager (Recommended for First-Time)

1. Log in to cPanel (`https://yourdomain.com:2083`)
2. Navigate to **File Manager**
3. Go to `/home/username/` (avoid `/home/username/public_html/`, which is served over the web)
4. Create directory: `email_reports/`
5. Open `email_reports/` folder
6. Click **Upload**
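7. Select all files from the `deploy/` folder (if your File Manager cannot upload folders, zip `deploy/` first, upload the archive, then use **Extract**)
8. Wait for the upload to complete (may take 5-10 minutes)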

#### Option B: FTP/SFTP (Faster for Large Transfers)

Using FileZilla, WinSCP, or similar:

1. Connect to server:
   - Host: `ftp.yourdomain.com` or `yourdomain.com`
   - Port: 22 (SFTP, preferred because the connection is encrypted) or 21 (FTP)
   - Username: cPanel username
   - Password: cPanel password
2. Navigate to `/home/username/`
3. Create directory: `email_reports/`
4. Upload all files from `deploy/` folder

### Step 3: Set File Permissions

Via cPanel File Manager or SSH:

```bash
# Navigate to project directory
cd /home/username/email_reports

# Set directory permissions (755 = rwxr-xr-x)
find . -type d -exec chmod 755 {} \;

# Set file permissions (644 = rw-r--r--), skipping venv/ so its scripts stay executable if this is re-run later
find . -type f -not -path "./venv/*" -exec chmod 644 {} \;

# Make Python files executable
chmod +x main.py
chmod +x src/*.py

# Secure sensitive files (600 = rw-------)
chmod 600 .env
chmod 600 credentials.json
chmod 600 token.json
chmod 600 data/clients.csv

# Verify permissions
ls -la
```
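
As a follow-up to `ls -la`, the sensitive files can be audited automatically. The sketch below is a hedged example, not part of the project: `loose_permissions` is a hypothetical helper that reports any of the four files above that are still accessible to group or others:

```python
import stat
from pathlib import Path

# Files this guide locks down to 600 (owner read/write only).
SENSITIVE = [".env", "credentials.json", "token.json", "data/clients.csv"]


def loose_permissions(project_root, names=SENSITIVE):
    """Return (name, mode) pairs for sensitive files group/others can access."""
    findings = []
    for name in names:
        path = Path(project_root) / name
        if not path.is_file():
            continue  # a missing file is a separate problem; skip it here
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            findings.append((name, oct(mode)))
    return findings
```

An empty list from `loose_permissions("/home/username/email_reports")` means all four files are owner-only.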

### Step 4: Update .env for Linux Paths

Edit `.env` file on server (via cPanel File Manager or SSH text editor):

**Change Windows paths to Linux paths:**

```ini
# Before (Windows):
PROJECT_ROOT=c:/Apps/Email Reports
DATA_DIR=c:/Apps/Email Reports/data
LOGS_DIR=c:/Apps/Email Reports/logs

# After (Linux):
PROJECT_ROOT=/home/username/email_reports
DATA_DIR=/home/username/email_reports/data
LOGS_DIR=/home/username/email_reports/logs
```

**Update all path variables:**
```ini
CLIENT_DATABASE_PATH=/home/username/email_reports/data/clients.csv
GMAIL_CREDENTIALS_PATH=/home/username/email_reports/credentials.json
GMAIL_TOKEN_PATH=/home/username/email_reports/token.json
TEMPLATE_PATH=/home/username/email_reports/templates/email_template.html
```
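
A leftover Windows path in `.env` is easy to miss. The following sketch flags values that still look like Windows paths. It uses a deliberately simple `KEY=VALUE` parser rather than `python-dotenv`, and the function names are assumptions made for illustration:

```python
import re

# A drive letter followed by a slash, or any backslash, suggests a Windows path.
WINDOWS_PATH = re.compile(r"^[A-Za-z]:[\\/]|\\")


def parse_env(text):
    """Parse simple KEY=VALUE lines; comments and blank lines are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def windows_style_keys(values):
    """Return the keys whose values still look like Windows paths."""
    return sorted(key for key, value in values.items() if WINDOWS_PATH.search(value))
```

For example, `windows_style_keys(parse_env(open(".env").read()))` should return an empty list once every path has been converted. Values that legitimately contain backslashes would be flagged too, so treat the result as a hint.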

### Step 5: Create Virtual Environment on Server

Via SSH or cPanel Terminal:

```bash
cd /home/username/email_reports

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate

# Upgrade pip
pip install --upgrade pip

# Install dependencies
pip install -r requirements.txt

# Verify installation
pip list
```

**Expected packages:**
- google-api-python-client
- google-auth-oauthlib
- gspread
- pdfplumber
- Jinja2
- RapidFuzz
- premailer
- python-dotenv
- pytest (for testing)

### Step 6: Test Gmail API Authentication

```bash
cd /home/username/email_reports
source venv/bin/activate

# Test OAuth authentication
python3 -c "from src.gmail_reader import GmailReader; r = GmailReader(); print('✓ Gmail API authenticated')"
```

**Expected output:**
```
✓ Gmail API authenticated
```

**If authentication fails:**
- Verify `credentials.json` uploaded correctly
- Verify `token.json` uploaded correctly
- Check file permissions (both should be readable)
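- Ensure `token.json` was generated on Windows and contains a valid refresh token (the server has no browser to complete the OAuth consent flow, so it can only refresh an existing token)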
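
The last check can be done without calling the Gmail API. The sketch below assumes `token.json` was written by google-auth's `Credentials.to_json()`, which stores fields such as `refresh_token`, `client_id`, and `client_secret`; if the project saves the token in another format, adjust the field names. `token_problems` is a hypothetical helper:

```python
import json
from pathlib import Path

# Assumed field names, as written by google-auth's Credentials.to_json().
REQUIRED_FIELDS = ("refresh_token", "client_id", "client_secret")


def token_problems(token_path):
    """Return a list of human-readable problems found in a token file."""
    path = Path(token_path)
    if not path.is_file():
        return [f"{path} does not exist"]
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]
    return [
        f"missing or empty field: {field}"
        for field in REQUIRED_FIELDS
        if not data.get(field)
    ]
```

An empty list means the file is structurally sound; it does not prove that Google still accepts the refresh token.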

---

## Post-Deployment Testing

### Test 1: Manual Workflow Test

Run the full workflow manually to verify all components work:

```bash
cd /home/username/email_reports
source venv/bin/activate

# Process the PDFs already on the server (if you uploaded test PDFs)
python3 main.py --process-pdfs

# Or run full workflow
python3 main.py --full
```

### Test 2: Verify Logging

```bash
# Check if logs are being written
ls -lh logs/

# View latest log
tail -n 50 logs/email_reports_$(date +%Y-%m-%d).log
```

### Test 3: Verify Client Database

```bash
python3 -c "from src.client_database import ClientDatabase; db = ClientDatabase('data/clients.csv'); print(f'Loaded {len(db.clients)} clients')"
```

Expected: `Loaded 30 clients`

### Test 4: Test PDF Extraction

```bash
# Extract PDFs from Gmail (if PDFs are available)
python3 main.py --extract-from-gmail

# Check PDFs downloaded
ls -lh data/pdfs/
```

---

## Scheduling Automation

### Set Up Monthly Cron Job

Automate monthly report processing using cPanel cron jobs.

#### Via cPanel Cron Jobs Interface

1. Log in to cPanel
2. Navigate to **Advanced** → **Cron Jobs**
3. Add new cron job:

**Frequency:** Monthly (1st day of month, 9:00 AM)

```
0 9 1 * * cd /home/username/email_reports && . venv/bin/activate && python3 main.py --full >> logs/cron_$(date +\%Y\%m\%d).log 2>&1
```

**Breakdown:**
- `0 9 1 * *` - Run at 9:00 AM (server time) on the 1st day of every month
- `cd /home/username/email_reports` - Navigate to project directory
- `. venv/bin/activate` - Activate virtual environment (cron runs commands with `/bin/sh` by default, which may not provide the bash-only `source` builtin, so the portable `.` form is used)
- `python3 main.py --full` - Run full workflow
- `>> logs/cron_$(date +\%Y\%m\%d).log 2>&1` - Append stdout and stderr to a dated log file (`%` must be escaped as `\%` inside a crontab)
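
To sanity-check the schedule, the sketch below computes when `0 9 1 * *` next fires after a given moment. It is a simplified model that handles only a fixed day, hour, and minute (not general cron syntax), and `next_monthly_run` is a hypothetical helper:

```python
from datetime import datetime


def next_monthly_run(now, day=1, hour=9, minute=0):
    """Return the next datetime strictly after `now` matching `minute hour day * *`.

    Assumes `day` exists in every month (1 to 28), which holds for day=1.
    """
    candidate = now.replace(day=day, hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # This month's slot has passed: move to the same slot next month.
        if now.month == 12:
            candidate = candidate.replace(year=now.year + 1, month=1)
        else:
            candidate = candidate.replace(month=now.month + 1)
    return candidate
```

Cron evaluates the schedule in the server's local time zone, so pass a `datetime` in that zone when comparing against your own clock.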

#### Via SSH (crontab)

```bash
# Edit crontab
crontab -e

# Add this line:
0 9 1 * * cd /home/username/email_reports && . venv/bin/activate && python3 main.py --full >> logs/cron_$(date +\%Y\%m\%d).log 2>&1

# Save and exit
# Verify cron job added:
crontab -l
```

#### Cron Job Email Notifications

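To receive an email notification when the cron job runs, set `MAILTO`. Cron only mails output that the job writes to stdout or stderr, so a job that redirects everything to a log file sends no mail unless you remove the redirect: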

```bash
# Add MAILTO at top of crontab
MAILTO=youremail@example.com

# Then add cron job as above
0 9 1 * * cd /home/username/email_reports && ...
```

#### Cron Job Log Monitoring

View cron job execution logs:

```bash
# View the most recent cron log (files are named for the day the job ran)
tail -n 50 "$(ls -t logs/cron_*.log | head -n 1)"

# View all cron logs
ls -lh logs/cron_*
```
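
The same lookup can be scripted, for instance to scan the newest log for failures after each run. This is a sketch under assumptions: the `ERROR` and `Traceback` markers are guesses about what the application logs, and both function names are hypothetical:

```python
from pathlib import Path


def latest_cron_log(log_dir):
    """Return the newest cron_YYYYMMDD.log in `log_dir`, or None if none exist."""
    # The date in the file name sorts chronologically as plain text.
    logs = sorted(Path(log_dir).glob("cron_*.log"), key=lambda p: p.name)
    return logs[-1] if logs else None


def error_lines(log_path, markers=("ERROR", "Traceback")):
    """Return the lines containing any marker (the marker strings are assumptions)."""
    text = Path(log_path).read_text(encoding="utf-8", errors="replace")
    return [line for line in text.splitlines() if any(m in line for m in markers)]
```

Calling `error_lines(latest_cron_log("logs"))` after a run gives a quick pass/fail signal, provided at least one cron log exists.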

---

## Maintenance & Updates

### Updating Code on Server

When you make changes to code on Windows:

1. Test changes locally first
2. Upload modified files to server (via cPanel or FTP)
3. Wait for any run in progress to finish (each cron run starts a fresh process, so there is no service to restart)
4. Test on server

**Quick update via SCP (if SSH available):**
```bash
# From Windows (using WSL or Git Bash)
scp -r src/ username@yourdomain.com:/home/username/email_reports/
```

### Updating Client Database

Option 1: Edit directly on server via cPanel File Manager

Option 2: Upload updated CSV from Windows:

```bash
# Via cPanel File Manager: Upload data/clients.csv
# Or via FTP: Upload to /home/username/email_reports/data/clients.csv
```

Option 3: Via SSH (if available):

```bash
# Edit on server
cd /home/username/email_reports/data
nano clients.csv
# Make changes, save (Ctrl+O), exit (Ctrl+X)
```

### Updating Dependencies

If you add new Python packages:

```bash
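# On Windows: Update requirements.txt
# (review the result: pip freeze can list Windows-only packages, such as pywin32, that will not install on Linux)
pip freeze > requirements.txt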

# Upload requirements.txt to server
# Then on server:
cd /home/username/email_reports
source venv/bin/activate
pip install -r requirements.txt --upgrade
```

### Rotating Logs

Logs can grow large over time. Set up log rotation:

```bash
# Create log rotation script
nano /home/username/email_reports/rotate_logs.sh
```

Add:
```bash
#!/bin/bash
# Stop if the directory is missing, so find never runs somewhere unexpected
cd /home/username/email_reports/logs || exit 1
# Archive logs older than 90 days
find . -name "*.log" -mtime +90 -exec gzip {} \;
# Delete archived logs older than 1 year
find . -name "*.log.gz" -mtime +365 -delete
```

Make executable and add to crontab:
```bash
chmod +x /home/username/email_reports/rotate_logs.sh

# Add to crontab (run monthly)
crontab -e
# Add: 0 0 1 * * /home/username/email_reports/rotate_logs.sh
```
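
If you prefer to keep rotation in Python, the sketch below mirrors the shell script's two rules. It is an approximation rather than a drop-in replacement: ages are compared in seconds instead of `find`'s whole-day rounding, and `rotate_logs` is a hypothetical name:

```python
import gzip
import shutil
import time
from pathlib import Path

DAY = 86400  # seconds in a day


def rotate_logs(log_dir, compress_after_days=90, delete_after_days=365, now=None):
    """Gzip old *.log files, delete old *.log.gz files, and report both lists."""
    now = time.time() if now is None else now
    compressed, deleted = [], []
    # Delete first, so archives created below are not considered in this pass.
    for path in sorted(Path(log_dir).glob("*.log.gz")):
        if now - path.stat().st_mtime > delete_after_days * DAY:
            path.unlink()
            deleted.append(path.name)
    for path in sorted(Path(log_dir).glob("*.log")):
        if now - path.stat().st_mtime > compress_after_days * DAY:
            target = path.with_name(path.name + ".gz")
            with path.open("rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            shutil.copystat(path, target)  # keep the original mtime, as gzip does
            path.unlink()
            compressed.append(path.name)
    return compressed, deleted
```

The `now` parameter exists so the function can be tested with fixed timestamps; in normal use, call `rotate_logs("/home/username/email_reports/logs")`.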

---

## Troubleshooting

### "Python command not found"

**Problem:** `python` or `python3` not available

**Solution:**
```bash
# Check available Python versions
which python
which python3
ls /usr/bin/python*

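# Create the virtual environment with the full path of the interpreter you found
/usr/bin/python3.8 -m venv venv

# Use the venv's interpreter by full path in the cron job
# (a bare /usr/bin/python3.8 would not see the packages installed in venv/)
/home/username/email_reports/venv/bin/python main.py --full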
```

### "ModuleNotFoundError" when running via cron

**Problem:** Virtual environment not activated in cron job

**Solution:** Ensure the cron job activates the venv. Use `.` rather than `source`, because cron's default `/bin/sh` may not support `source`:
```bash
cd /home/username/email_reports && . venv/bin/activate && python3 main.py --full
```

### "Permission denied" errors

**Problem:** Incorrect file permissions

**Solution:**
```bash
# Make directories writable
chmod 755 data/ logs/ data/pdfs/ data/email_previews/

# Make Python files executable
chmod +x main.py
chmod +x src/*.py
```

### "Gmail API authentication failed"

**Problem:** token.json not valid on server

**Solution:**
```bash
# Re-generate token.json on Windows
python test_oauth.py

# Upload new token.json to server
# Verify permissions
chmod 600 token.json
```

### "Client database not found"

**Problem:** Incorrect path in .env file

**Solution:**
```bash
# Verify file exists
ls -la /home/username/email_reports/data/clients.csv

# Update .env with correct absolute path
nano .env
# CLIENT_DATABASE_PATH=/home/username/email_reports/data/clients.csv
```

### Cron job not running

**Problem:** Cron job syntax error or environment issues

**Solution:**
```bash
# Verify cron job syntax
crontab -l

# Test command manually
cd /home/username/email_reports && source venv/bin/activate && python3 main.py --full

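# Check system cron logs (usually needs root; the path varies by distribution)
grep CRON /var/log/syslog   # Debian/Ubuntu
grep CRON /var/log/cron     # CentOS/RHEL
# On shared hosting, check cPanel cron email notifications instead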
```

### Out of disk space

**Problem:** PDFs and logs filling up disk

**Solution:**
```bash
# Check disk usage
df -h
du -sh /home/username/email_reports/*

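# Work from the project directory so the relative paths below resolve
cd /home/username/email_reports

# Clean up archived PDFs (this deletes them permanently)
rm -f data/archive/*.pdf

# Compress logs older than 30 days (leaves the log currently being written alone)
find logs/ -name "*.log" -mtime +30 -exec gzip {} \;

# Delete logs older than 1 year
find logs/ -name "*.log*" -mtime +365 -delete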
```
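
Where `du` is unavailable, or you want the numbers inside a script, a rough equivalent can be written with the standard library. The sketch below sums file sizes, which differs slightly from `du` (it counts bytes, not disk blocks); the function names are illustrative:

```python
import shutil
from pathlib import Path


def dir_size(path):
    """Total size in bytes of all regular files below `path` (symlinks skipped)."""
    return sum(
        p.stat().st_size
        for p in Path(path).rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def usage_report(project_root):
    """Map each top-level entry to its size in bytes, largest first."""
    sizes = {}
    for entry in Path(project_root).iterdir():
        if entry.is_symlink():
            continue
        sizes[entry.name] = dir_size(entry) if entry.is_dir() else entry.stat().st_size
    return dict(sorted(sizes.items(), key=lambda item: item[1], reverse=True))


def free_bytes(path):
    """Free space on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free
```

On shared hosting the account quota may be lower than the free space reported for the filesystem, so cPanel's own disk usage figure remains the authority.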

---

## Cross-Platform Compatibility Notes

### Path Separators

- **Windows:** `c:\Apps\Email Reports\data\clients.csv` (backslashes)
- **Linux:** `/home/username/email_reports/data/clients.csv` (forward slashes)
- **Solution:** Use `pathlib.Path` in code (already implemented)
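
To illustrate why `pathlib` removes the separator problem, the sketch below joins the same components under both path flavors. It uses the `Pure*` classes so it runs on any operating system; `clients_csv` is a hypothetical helper, not a function from the project:

```python
from pathlib import PurePosixPath, PureWindowsPath


def clients_csv(project_root):
    """Join components with `/`; the path class supplies the right separator."""
    return project_root / "data" / "clients.csv"


windows_path = clients_csv(PureWindowsPath("c:/Apps/Email Reports"))
linux_path = clients_csv(PurePosixPath("/home/username/email_reports"))

print(windows_path)  # c:\Apps\Email Reports\data\clients.csv
print(linux_path)    # /home/username/email_reports/data/clients.csv
```

In application code, plain `Path` picks the flavor of the machine it runs on, so the same join works unchanged on both systems.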

### Line Endings

- **Windows:** CRLF (`\r\n`)
- **Linux:** LF (`\n`)
- **Solution:** Use `dos2unix` if needed:
  ```bash
  dos2unix main.py
  dos2unix src/*.py
  ```
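
Some shared hosts do not ship `dos2unix`. A minimal Python substitute is sketched below; it converts only CRLF pairs and leaves lone `\r` bytes untouched, and the function names are illustrative:

```python
from pathlib import Path


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF; lone CR or LF bytes are left alone."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path) -> bool:
    """Rewrite `path` with LF endings and return True if the file changed."""
    path = Path(path)
    original = path.read_bytes()
    converted = to_lf(original)
    if converted != original:
        path.write_bytes(converted)
    return converted != original
```

Apply it to text files only (for example `main.py`, `src/*.py`, `.env`, and shell scripts); running it over PDFs or other binary files would corrupt them.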

### File Permissions

- Windows doesn't have Unix file permissions
- Linux requires proper chmod settings
- **Solution:** Set permissions after upload (Step 3 above)

---

## Backup & Disaster Recovery

### Automated Backups

Create backup script on server:

```bash
nano /home/username/backup_email_reports.sh
```

Add:
```bash
#!/bin/bash
BACKUP_DIR=/home/username/backups/email_reports
DATE=$(date +%Y%m%d_%H%M%S)

# Create backup directory (owner-only, because the archives contain credentials)
mkdir -p "$BACKUP_DIR"
chmod 700 "$BACKUP_DIR"

# Backup critical files
# (tar strips the leading / from member names, so restored paths start with home/)
tar -czf "$BACKUP_DIR/email_reports_$DATE.tar.gz" \
  /home/username/email_reports/data/clients.csv \
  /home/username/email_reports/.env \
  /home/username/email_reports/credentials.json \
  /home/username/email_reports/token.json

# Keep only last 30 days of backups
find "$BACKUP_DIR" -name "email_reports_*.tar.gz" -mtime +30 -delete

echo "Backup created: $BACKUP_DIR/email_reports_$DATE.tar.gz"
```

Make executable and schedule:
```bash
chmod +x /home/username/backup_email_reports.sh

# Add to crontab (daily at 2 AM)
crontab -e
# Add: 0 2 * * * /home/username/backup_email_reports.sh
```

### Restoring from Backup

```bash
# List available backups
ls -lh /home/username/backups/email_reports/

# Extract backup
cd /tmp
tar -xzf /home/username/backups/email_reports/email_reports_20250105_020000.tar.gz

# Restore files
cp -f home/username/email_reports/data/clients.csv /home/username/email_reports/data/
cp -f home/username/email_reports/.env /home/username/email_reports/
# etc.
```
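
Before restoring, it can help to confirm that an archive actually holds all four files. The sketch below checks member names with the standard `tarfile` module; the expected names assume the backup script above (tar stores them without the leading `/`), and `missing_from_backup` is a hypothetical helper:

```python
import tarfile

# Member names as stored by the backup script (leading "/" stripped by tar).
EXPECTED = [
    "home/username/email_reports/data/clients.csv",
    "home/username/email_reports/.env",
    "home/username/email_reports/credentials.json",
    "home/username/email_reports/token.json",
]


def missing_from_backup(archive_path, expected=EXPECTED):
    """Return the expected member names that the .tar.gz archive lacks."""
    with tarfile.open(archive_path, "r:gz") as archive:
        present = {name.lstrip("/") for name in archive.getnames()}
    return [name for name in expected if name not in present]
```

An empty list means the archive is complete. The check reads only the archive index, so no credentials are extracted to disk.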

---

## Performance Optimization

### For Shared Hosting

If running on shared hosting with limited resources:

1. **Process PDFs in smaller batches:**
   ```bash
   # Instead of processing all 30 at once
   python3 main.py --process-pdfs --limit 10
   ```

2. **Use nice and ionice for lower CPU and I/O priority:**
   ```bash
   nice -n 19 ionice -c3 python3 main.py --full
   ```

3. **Schedule during off-peak hours:**
   ```
   # Run at 2 AM instead of 9 AM
   0 2 1 * * cd /home/username/email_reports && ...
   ```
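
For reference, batching amounts to slicing the work list into fixed-size groups. The sketch below shows the idea only; how `main.py` implements `--limit` is not stated in this guide, and `batches` is a hypothetical helper:

```python
def batches(items, size=10):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

With 30 PDFs and `size=10` this yields three groups of ten, so each run holds fewer files in memory at once.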

---

## Security Considerations

### Protecting Sensitive Files

```bash
# Restrict access to sensitive files
chmod 600 credentials.json
chmod 600 token.json
chmod 600 .env
chmod 600 data/clients.csv

# Ensure only owner can access project directory
chmod 700 /home/username/email_reports
```

### Hiding from Web Access

If email_reports is in public_html:

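Create `.htaccess` in the `email_reports/` directory:
```apache
# Deny web access to entire directory (Apache 2.4+)
Require all denied
# On Apache 2.2, use this line instead:
# Deny from all
```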

Or, preferably, move the project outside public_html:
```bash
mv /home/username/public_html/email_reports /home/username/email_reports
```

---

## Support

For deployment assistance:
1. Check server logs: `tail -f logs/email_reports_$(date +%Y-%m-%d).log`
2. Review cron logs: `tail -f logs/cron_*`
3. Test components individually (see Post-Deployment Testing)
4. Contact hosting provider for server-specific issues

---

**Email Reports Automation System - Deployment Guide**
*Version 1.0 - January 2025*
