back to home

December 02 2025

Email Sync: The Delta Sync Engine

Syncing 10,000 emails in 2 seconds instead of 5 minutes


Introduction

The task is to manage and sync 10,000 emails in the customer's Outlook inbox. Every time the customer wants to check for new messages, should the application download all 10,000 emails again? That would take minutes, waste bandwidth, and hammer both the server and Microsoft's API.

The naive approach—"fetch everything every time"—doesn't scale.

The solution? Delta Sync.

Delta sync is an incremental synchronization strategy where we only fetch what changed since the last check. Microsoft Graph API provides a powerful delta query mechanism that returns:

This article explores how the AI Email Coach implements delta sync to efficiently mirror a user's Outlook inbox. I'll dive deep into:


Part 1: System Overview

The email sync flow involves three parties working together to keep the local database synchronized with Microsoft Outlook:

The Three Parties

  1. User (Frontend)

    • Triggers sync by clicking "Sync" button on an account card
    • Receives feedback (success message, sync statistics)
    • Expects fast, reliable synchronization
  2. Backend (FastAPI Server)

    • Orchestrates the entire sync process
    • Manages delta tokens (checkpoints)
    • Transforms Microsoft's data format into the database schema
    • Ensures data consistency through atomic transactions
  3. Microsoft Graph API

    • Provides the delta query endpoint
    • Returns incremental changes since last sync
    • Handles pagination for large result sets
    • Issues new delta tokens after each successful sync

The Contract Between Parties

Frontend ↔ Backend:

Backend ↔ Microsoft:

Backend ↔ Database:


The Sync Flow Step-by-Step

1. User clicks "Sync" on account card
   ↓
2. Frontend sends POST /sync_outlook/{account_id} with JWT
   ↓
3. Backend verifies user owns this account
   ↓
4. Backend refreshes access token (using encrypted refresh token)
   ↓
5. Backend loads delta token from database
   │
   ├─ First sync? → No delta token exists → Full sync
   └─ Subsequent sync? → Delta token exists → Incremental sync
   ↓
6. Backend calls Microsoft Graph API
   │
   ├─ Pagination loop: Fetch all pages of changes
   └─ Collect @odata.deltaLink for next sync
   ↓
7. Backend processes changes
   │
   ├─ Deletions: Collect IDs for bulk delete
   ├─ Upserts: Insert new or update existing emails
   └─ Link each email to email_account_id
   ↓
8. Backend commits transaction atomically
   │
   ├─ Save all email changes
   ├─ Execute bulk delete
   └─ Update delta token (checkpoint)
   ↓
9. Backend returns statistics to frontend
   ↓
10. Frontend shows success message

Key Design Principles

This sync implementation is built on four core principles:

  1. Incremental Updates Only

    • Never re-download unchanged data
  2. Atomic Consistency

    • All database changes happen in one transaction
    • If sync fails halfway, database remains unchanged
    • Delta token only saved if ALL changes succeed
  3. Account Isolation

    • Each account has its own delta token
    • Syncs are independent (one account's failure doesn't affect others)
    • Supports multi-account users seamlessly
  4. Idempotent Operations

    • Safe to call sync multiple times
    • Duplicate calls won't create duplicate emails
    • message_id uniqueness constraint prevents duplicates

What Makes Delta Sync "Magic"?

The brilliance of delta sync lies in the delta token—a special URL that Microsoft Graph returns after each sync. This token is a checkpoint that encodes:

When the backend sends this token back to Microsoft in the next sync request, Microsoft's servers know exactly where we left off and return only the changes since then.

Example:

First Sync (No delta token):
  Request:  GET /messages/delta
  Response: 5,000 emails + deltaLink: "...?$skiptoken=abc123"

Second Sync (With delta token):
  Request:  GET /messages/delta?$skiptoken=abc123
  Response: 3 new emails + deltaLink: "...?$skiptoken=xyz789"

This is the foundation that makes efficient email synchronization possible.


In the next sections, I'll dive into how this is implemented, starting with the delta sync concept and moving through the complete backend architecture.