Memory Mapped Files & Binary Protocols
┌───────────────────────────────────────────────┐
│ 📄 data.bin                                   │
│                                               │
│ ┌───────────────────────────────────────────┐ │
│ │ H e l l o   W o r l d !                   │ │
│ │                                           │ │
│ │ 0 1 1 0 1 0 1 1 1 0 0 1 1 1 0 0 1 0 1     │ │
│ │ 1 0 1 0 1 1 0 0 0 1 1 1 0 0 1 0 1 1 0     │ │
│ │ 0 1 0 1 0 1 1 1 1 0 1 0 1 1 0 1 0 0 1     │ │
│ │                                           │ │
│ │ [ more data blocks and binary content ]   │ │
│ │ [ structured records and metadata ]       │ │
│ │ [ event streams and time series data ]    │ │
│ └───────────────────────────────────────────┘ │
│                                               │
│ 📊 Size: 1,024 KB          📅 Modified: Today │
│ 🔒 Permissions: rw      🚀 Memory Mapped: Yes │
└───────────────────────────────────────────────┘
Just bytes, but fast!
What is a Memory Mapped File?
Memory mapping creates a bridge between your file and memory:
- Direct mapping: File contents appear as regular memory in your process
- Zero-copy access: No intermediate buffers or copying
- OS managed: The kernel handles loading/storing pages transparently
- Simple interface: Treat file data like a byte array
Traditional I/O: File → Buffer → Process Memory
Memory Mapped: File ←→ Process Memory (direct)
How It Works Under the Hood
 Virtual Memory               Physical Memory                 File System
┌──────────────┐              ┌──────────────┐              ┌──────────────┐
│ 0x1000-0x2000│─────────────▶│ Page Frame A │◀─────────────│ data.bin     │
│    (4KB)     │              │    (4KB)     │              │ Block 0-4KB  │
│              │              │              │              │              │
│ 0x2000-0x3000│─────────────▶│ Page Frame B │◀─────────────│ data.bin     │
│    (4KB)     │              │    (4KB)     │              │ Block 4-8KB  │
│              │              │              │              │              │
│ 0x3000-0x4000│─────────────▶│ Page Frame C │◀─────────────│ data.bin     │
│    (4KB)     │              │    (4KB)     │              │ Block 8-12KB │
└──────────────┘              └──────────────┘              └──────────────┘
       ↑                             ↑                             ↑
  Program View                  Physical RAM              Persistent Storage
Key insight: The OS maps virtual memory addresses directly to file blocks, eliminating the need for explicit read/write operations.
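Because the OS works in page-sized units, a mapping always covers whole pages. A quick sketch of the arithmetic (pagesFor is a hypothetical helper, not from the original):

package main

import (
	"fmt"
	"os"
)

// pagesFor reports how many OS pages are needed to cover fileSize bytes.
func pagesFor(fileSize int64) int64 {
	pageSize := int64(os.Getpagesize()) // typically 4096 bytes
	return (fileSize + pageSize - 1) / pageSize
}

func main() {
	fmt.Println(pagesFor(10_000)) // 3: a 10,000-byte file needs three 4KB pages
}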
Performance: Why It’s Fast
System Call Elimination
Traditional I/O: read() → kernel → user space (1000ns+)
Memory Mapped: ptr[i] → CPU cache → register (1-10ns)
The Magic Ingredients
- Demand Paging: Only load data when you access it
- OS Prefetching: Kernel reads ahead for sequential patterns (and accepts explicit hints, as sketched after this list)
- CPU Cache Friendly: Sequential access leverages L1/L2 cache
- Shared Pages: Multiple processes can share the same physical memory
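That prefetching behavior can also be requested explicitly. A hedged, Linux-specific sketch (syscall.Madvise wraps madvise(2); the call and its constants vary by platform):

import "syscall"

// adviseSequential tells the kernel the mapping will be read front-to-back,
// allowing it to prefetch pages more aggressively.
func adviseSequential(mapped []byte) error {
	return syscall.Madvise(mapped, syscall.MADV_SEQUENTIAL)
}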
Code Comparison
Traditional I/O (System Call Heavy)
// Every operation crosses the user→kernel boundary
file, _ := os.OpenFile("data.txt", os.O_RDWR, 0644) // must be writable for the Write below
buf := make([]byte, 1024)
_, _ = file.Read(buf)       // System call #1
file.Write([]byte("hello")) // System call #2
Memory Mapped (Direct Memory Access)
mmap, err := syscall.Mmap(
	int(file.Fd()), // file descriptor
	0,              // offset
	int(fileSize),  // size to map
	syscall.PROT_READ|syscall.PROT_WRITE,
	syscall.MAP_SHARED,
)
// Now these are just memory operations!
data := mmap[100] // Direct read (no syscall)
mmap[200] = 42    // Direct write (no syscall)
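Two details the fragment above glosses over: check err before touching the slice, and release the mapping when you are done.

if err != nil {
	log.Fatal(err)
}
defer syscall.Munmap(mmap) // the mapping stays alive until explicitly unmapped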
Live Example: File Modification
Initial state:
File on disk: [H][e][l][l][o][ ][W][o][r][l][d]
Memory mapped: [H][e][l][l][o][ ][W][o][r][l][d]
Address: 0x1000 ... 0x100A (11 bytes)
Program execution:
mapped[6] = 'G' // Change 'W' to 'G'
mapped[8] = '!' // Change 'r' to '!'
Immediately after:
Memory mapped: [H][e][l][l][o][ ][G][o][!][l][d]
File on disk: [H][e][l][l][o][ ][W][o][r][l][d] ← Not synced yet
After sync/close:
File on disk: [H][e][l][l][o][ ][G][o][!][l][d] ← Now matches
Result: "Hello World" → "Hello Go!ld"
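Putting the whole sequence together, here is a minimal end-to-end sketch (assuming a Unix-like OS, the golang.org/x/sys/unix package, and an existing data.txt containing "Hello World"):

package main

import (
	"log"
	"os"

	"golang.org/x/sys/unix"
)

func main() {
	f, err := os.OpenFile("data.txt", os.O_RDWR, 0)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		log.Fatal(err)
	}

	mapped, err := unix.Mmap(int(f.Fd()), 0, int(info.Size()),
		unix.PROT_READ|unix.PROT_WRITE, unix.MAP_SHARED)
	if err != nil {
		log.Fatal(err)
	}
	defer unix.Munmap(mapped)

	mapped[6] = 'G' // 'W' → 'G': a plain memory write, no syscall
	mapped[8] = '!' // 'r' → '!'

	// Flush dirty pages to disk now; otherwise the kernel syncs them
	// at its own pace (or at unmap/exit).
	if err := unix.Msync(mapped, unix.MS_SYNC); err != nil {
		log.Fatal(err)
	}
}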
Binary Protocols: The Perfect Partner
Why Binary Beats Text
JSON: {"timestamp": 1640995200000, "data": "hello"} // 45 bytes
Binary: [8 bytes timestamp][4 bytes size][5 bytes data] // 17 bytes
// 2.6x smaller!
Benefits: Smaller size, faster parsing, type safety, precise alignment
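As a sketch of the 17-byte layout above (the little-endian byte order is an assumption, not specified by the source):

package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

func main() {
	var buf bytes.Buffer
	payload := []byte("hello")
	binary.Write(&buf, binary.LittleEndian, int64(1640995200000)) // 8-byte timestamp
	binary.Write(&buf, binary.LittleEndian, uint32(len(payload))) // 4-byte size
	buf.Write(payload)                                            // 5-byte data
	fmt.Println(buf.Len())                                        // 17
}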
Event Stream Protocol Design
File Structure
// Stream Header (24 bytes) - written once
type StreamHeader struct {
	WriteOffset    int64 // Current write position
	EventCount     int64 // Total events stored
	StartTimestamp int64 // Stream creation time
}

// Event Entry (16 bytes + data)
type EventEntry struct {
	Timestamp int64  // Unix nanoseconds
	Size      uint32 // Data length
	Checksum  uint32 // CRC32 validation
	// Variable-length data follows
}
Sequential Layout Strategy
┌─────────────┬─────────────┬─────────────┬─────────────┐
│ Stream │ Event 1 │ Event 2 │ Event 3 │
│ Header │ [16B header]│ [16B header]│ [16B header]│
│ (24 bytes) │ [data] │ [data] │ [data] │
└─────────────┴─────────────┴─────────────┴─────────────┘
Why sequential? Maximum space utilization + cache-friendly access patterns
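Combining the two structs, a hedged sketch of the append path (appendEvent is a hypothetical helper; little-endian layout is assumed, and bounds checks, locking, and the StreamHeader update are omitted):

import (
	"encoding/binary"
	"hash/crc32"
)

// appendEvent writes one 16-byte event header plus its payload at
// writeOffset inside the mapped region and returns the next free offset.
func appendEvent(mapped []byte, writeOffset int64, ts int64, data []byte) int64 {
	binary.LittleEndian.PutUint64(mapped[writeOffset:], uint64(ts))                  // Timestamp
	binary.LittleEndian.PutUint32(mapped[writeOffset+8:], uint32(len(data)))         // Size
	binary.LittleEndian.PutUint32(mapped[writeOffset+12:], crc32.ChecksumIEEE(data)) // Checksum
	copy(mapped[writeOffset+16:], data) // variable-length data follows the header
	return writeOffset + 16 + int64(len(data))
}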
Real-World Performance Results
Test: 10,000 messages × 1KB each = 10MB total
Method               | Throughput (msgs/sec) | Throughput (MB/s) | vs Baseline
---------------------|-----------------------|-------------------|------------
Zig Memory Mapped    | 5,649,347             | 5,777             | 54.4x ⚡
Memory Mapped        | 3,984,986             | 4,081             | 38.4x
Buffered Sequential  | 1,041,608             | 1,067             | 10.0x
Channel Synchronized | 726,526               | 744               | 7.0x
Standard Sequential  | 103,865               | 106               | 1.0x
Key insight: Memory mapping delivers a 38x performance improvement over standard I/O!
When to Use Memory Mapped Files
✅ Great For
- Log processing: Scan large files without loading into RAM
- Database systems: Efficient random access to data pages
- Inter-process communication: Share data between processes
- Data pipelines: High-throughput sequential processing
- Working with files larger than available RAM
- Durable state: changes persist to the backing file once flushed (via msync) or unmapped
❌ Avoid When
- Small, infrequent I/O operations
- Mobile/embedded systems with limited virtual memory
Production Considerations
Key Limitations
- Crash safety: No atomic operations, corruption possible during crashes
- Error handling: Memory access errors become SIGBUS/SIGSEGV signals
- Memory pressure: Large mappings compete with the page cache and other processes for physical memory
- Platform differences: Windows vs Unix behavior varies
- Not thread-safe for writes: concurrent write operations must be synchronized to avoid data corruption (see the sketch below)
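On that last point, a minimal sketch of serializing writers with a mutex (the MappedLog type is hypothetical):

import "sync"

// MappedLog guards a mapped region so only one goroutine mutates it at a time.
type MappedLog struct {
	mu     sync.Mutex
	mapped []byte
	offset int64
}

func (l *MappedLog) Append(b []byte) {
	l.mu.Lock()
	defer l.mu.Unlock()
	copy(l.mapped[l.offset:], b) // bounds checking omitted for brevity
	l.offset += int64(len(b))
}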
Key Takeaways
The Performance Story
- System calls are expensive - eliminate them for hot paths
- Sequential access wins - leverage CPU cache and OS prefetching
- Binary protocols provide significant advantages over text formats
- Memory mapping scales - handle TB files on GB RAM systems