Email Sync: The Delta Sync Engine
Syncing 10,000 emails in 2 seconds instead of 5 minutes
Introduction
The task is to manage and sync 10,000 emails in the customer's Outlook inbox. Every time the customer wants to check for new messages, should the application download all 10,000 emails again? That would take minutes, waste bandwidth, and hammer both the server and Microsoft's API.
The naive approach—"fetch everything every time"—doesn't scale.
The solution? Delta Sync.
Delta sync is an incremental synchronization strategy where we only fetch what changed since the last check. Microsoft Graph API provides a powerful delta query mechanism that returns:
- New emails that arrived
- Updated emails that were read, flagged, or modified
- Deleted emails that were removed or moved
This article explores how the AI Email Coach implements delta sync to efficiently mirror a user's Outlook inbox. I'll dive deep into:
- The Delta Sync Concept: How Microsoft's
@odata.deltaLinkworks - Backend Architecture: Token refresh, delta token management, and atomic commits
- Data Consistency: Ensuring the local database perfectly reflects the remote state
- Performance Optimization: Bulk operations and bandwidth efficiency
- Error Handling: What happens when things go wrong
Part 1: System Overview
The email sync flow involves three parties working together to keep the local database synchronized with Microsoft Outlook:
The Three Parties
-
User (Frontend)
- Triggers sync by clicking "Sync" button on an account card
- Receives feedback (success message, sync statistics)
- Expects fast, reliable synchronization
-
Backend (FastAPI Server)
- Orchestrates the entire sync process
- Manages delta tokens (checkpoints)
- Transforms Microsoft's data format into the database schema
- Ensures data consistency through atomic transactions
-
Microsoft Graph API
- Provides the delta query endpoint
- Returns incremental changes since last sync
- Handles pagination for large result sets
- Issues new delta tokens after each successful sync
The Contract Between Parties
Frontend ↔ Backend:
- Request:
POST /api/emails/sync_outlook/{account_id}with JWT authentication - Response: Sync statistics (inserted, updated, deleted counts)
- Expectation: Idempotent operation (safe to call multiple times)
Backend ↔ Microsoft:
- Request:
GET /me/mailFolders/inbox/messages/deltawith optional delta token - Response: Array of changes + pagination links + new delta token
- Expectation: Incremental updates only (not full dataset)
Backend ↔ Database:
- Write: Upsert emails, delete removed messages, update delta token
- Expectation: All-or-nothing atomic commit (transaction boundary)
The Sync Flow Step-by-Step
1. User clicks "Sync" on account card
↓
2. Frontend sends POST /sync_outlook/{account_id} with JWT
↓
3. Backend verifies user owns this account
↓
4. Backend refreshes access token (using encrypted refresh token)
↓
5. Backend loads delta token from database
│
├─ First sync? → No delta token exists → Full sync
└─ Subsequent sync? → Delta token exists → Incremental sync
↓
6. Backend calls Microsoft Graph API
│
├─ Pagination loop: Fetch all pages of changes
└─ Collect @odata.deltaLink for next sync
↓
7. Backend processes changes
│
├─ Deletions: Collect IDs for bulk delete
├─ Upserts: Insert new or update existing emails
└─ Link each email to email_account_id
↓
8. Backend commits transaction atomically
│
├─ Save all email changes
├─ Execute bulk delete
└─ Update delta token (checkpoint)
↓
9. Backend returns statistics to frontend
↓
10. Frontend shows success message
Key Design Principles
This sync implementation is built on four core principles:
-
Incremental Updates Only
- Never re-download unchanged data
-
Atomic Consistency
- All database changes happen in one transaction
- If sync fails halfway, database remains unchanged
- Delta token only saved if ALL changes succeed
-
Account Isolation
- Each account has its own delta token
- Syncs are independent (one account's failure doesn't affect others)
- Supports multi-account users seamlessly
-
Idempotent Operations
- Safe to call sync multiple times
- Duplicate calls won't create duplicate emails
message_iduniqueness constraint prevents duplicates
What Makes Delta Sync "Magic"?
The brilliance of delta sync lies in the delta token—a special URL that Microsoft Graph returns after each sync. This token is a checkpoint that encodes:
- Which folder is being synced (Inbox, Sent Items, etc.)
- The exact point in time of the last sync
- A cryptographic signature to prevent tampering
When the backend sends this token back to Microsoft in the next sync request, Microsoft's servers know exactly where we left off and return only the changes since then.
Example:
First Sync (No delta token):
Request: GET /messages/delta
Response: 5,000 emails + deltaLink: "...?$skiptoken=abc123"
Second Sync (With delta token):
Request: GET /messages/delta?$skiptoken=abc123
Response: 3 new emails + deltaLink: "...?$skiptoken=xyz789"
This is the foundation that makes efficient email synchronization possible.
In the next sections, I'll dive into how this is implemented, starting with the delta sync concept and moving through the complete backend architecture.