Engineering Deep Dive

Tech Under the Hood

Last Roll is built for speed. Here's how GPU compositing, hardware-accelerated computer vision, and zero-dependency binary parsing come together to deliver instant response — even on 45-megapixel RAW files.

01 — Core Animation · Metal

GPU-Accelerated Rendering

CALayer · CATransform3D · Trilinear Interpolation · Metal

The photo viewer isn't built on SwiftUI or AppKit views in the traditional sense. It is built directly on CALayer — Apple's compositing engine, which sits on top of Metal, the same low-level GPU API used by games and professional 3D software on the Mac.

Each image is uploaded to the GPU as a texture exactly once. Zoom and pan are not software operations — they are 3D matrix transforms (CATransform3D) applied directly to the hardware layer. Zooming from 100% to 200% on a 45-megapixel file is instantaneous because the CPU never touches a single pixel; the GPU composites the layers autonomously at 60 or 120 fps.

Pipeline: Image file (NEF / JPEG) → CPU, one-time decode (CGImage) → GPU texture in VRAM (CALayer) → GPU transform, zero CPU (CATransform3D) → Display at 60 / 120 fps

Zoom and pan are expressed as a CATransform3D — a 4×4 homogeneous matrix applied entirely by the GPU compositor:

⎡ sx   0   0   0 ⎤
⎢  0  sy   0   0 ⎥
⎢  0   0   1   0 ⎥
⎣ tx  ty   0   1 ⎦
sx, sy — scale factors (zoom level)

tx, ty — translation offsets (pan)

The GPU applies this transform at every frame with no CPU involvement. Pinching from 100% to 400% on a 45 MP file produces the same frame time as panning a thumbnail.
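As a concrete sketch, the matrix above maps directly onto CATransform3D's fields — m11/m22 for scale, m41/m42 for translation. The helper name here is hypothetical:

```swift
import QuartzCore

// Hypothetical helper: build the zoom/pan transform by setting
// the 4×4 matrix entries shown above directly.
func zoomPanTransform(scale: CGFloat, pan: CGPoint) -> CATransform3D {
    var t = CATransform3DIdentity
    t.m11 = scale   // sx — horizontal zoom
    t.m22 = scale   // sy — vertical zoom
    t.m41 = pan.x   // tx — horizontal pan
    t.m42 = pan.y   // ty — vertical pan
    return t
}
```

Assigning the result to `layer.transform` is all the CPU ever does per gesture update — the compositor applies it every frame.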
The interpolation filter is set to trilinear — the same algorithm used in professional 3D rendering engines. It samples neighboring pixels and pre-computed downscaled versions of the image (mipmaps), preventing aliasing artifacts at any zoom level.
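A minimal sketch of opting a layer into this filtering mode (the `photo` CGImage is assumed to exist, decoded once on the CPU):

```swift
import QuartzCore

// Sketch: the decoded image becomes GPU texture contents, sampled
// with trilinear (mipmapped) filtering when scaled down.
let layer = CALayer()
layer.contents = photo                   // assumed CGImage
layer.minificationFilter = .trilinear    // mipmap sampling — no aliasing zoomed out
layer.magnificationFilter = .linear      // bilinear when zoomed past 100%
```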
02 — Accelerate Framework

Focus Peaking

vDSP · vImage · SIMD · AVX2 · Computer Vision

Focus peaking is probably the most technically sophisticated feature in the app. Press D and the following sequence runs in milliseconds — start to finish, on a full-resolution file.

Step 1 — Grayscale (vDSP_vfltu8) → Step 2 — Laplacian 3×3 (vImageConvolve) → Step 3 — Threshold (vDSP_meanv) → Step 4 — Dilation (vImageConvolve) → Step 5 — Composite (CALayer)

01 Grayscale Conversion

The color image is rendered into a byte buffer via CGContext. Each 8-bit pixel is converted to a single float with vDSP_vfltu8 — a vectorized function that processes many pixels per SIMD instruction instead of looping one at a time — then normalized to the 0.0–1.0 range.
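A self-contained sketch of this step — array contents are illustrative; the real buffer comes from the CGContext render:

```swift
import Accelerate

// Step 1 sketch: widen 8-bit pixels to Float, then normalize to 0...1.
let pixels: [UInt8] = [0, 128, 255]
var floats = [Float](repeating: 0, count: pixels.count)
pixels.withUnsafeBufferPointer { src in
    vDSP_vfltu8(src.baseAddress!, 1, &floats, 1, vDSP_Length(pixels.count))
}
var scale: Float = 1.0 / 255.0
vDSP_vsmul(floats, 1, &scale, &floats, 1, vDSP_Length(floats.count))
// floats is now approximately [0.0, 0.502, 1.0]
```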

02 Laplacian 3×3 Convolution

The core of the algorithm. A 3×3 kernel slides over every pixel and computes the second derivative of the luminance signal. In plain terms: it detects where brightness changes sharply — and sharp transitions correspond to in-focus edges.

 0  −1   0
−1   4  −1
 0  −1   0
The Laplacian kernel. The center pixel is amplified by 4, while the four cardinal neighbors are subtracted. High output = sharp local contrast = in-focus area.

Executed via vImageConvolve_PlanarF, which compiles down to AVX2 vector instructions on Intel and NEON on Apple Silicon — up to eight float pixels processed per vector instruction.

This operation on a 24-megapixel image would take tens of seconds in a naive per-pixel loop in interpreted Python. Here it runs in single-digit milliseconds.
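Under the stated API, Step 2 could be sketched like this on a tiny illustrative buffer (the real code runs over the full-resolution grayscale plane):

```swift
import Accelerate

// Step 2 sketch: 3×3 Laplacian over a planar Float buffer.
let width = 4, height = 4
var src = [Float](repeating: 0, count: width * height)
src[5] = 1                                   // one sharp pixel at (1, 1)
var dst = [Float](repeating: 0, count: width * height)
let laplacian: [Float] = [ 0, -1,  0,
                          -1,  4, -1,
                           0, -1,  0]
let rowBytes = width * MemoryLayout<Float>.stride
src.withUnsafeMutableBufferPointer { s in
    dst.withUnsafeMutableBufferPointer { d in
        var srcBuf = vImage_Buffer(data: s.baseAddress!,
                                   height: vImagePixelCount(height),
                                   width: vImagePixelCount(width),
                                   rowBytes: rowBytes)
        var dstBuf = vImage_Buffer(data: d.baseAddress!,
                                   height: vImagePixelCount(height),
                                   width: vImagePixelCount(width),
                                   rowBytes: rowBytes)
        _ = vImageConvolve_PlanarF(&srcBuf, &dstBuf, nil, 0, 0,
                                   laplacian, 3, 3, 0,
                                   vImage_Flags(kvImageEdgeExtend))
    }
}
// The sharp pixel's response is amplified; its neighbors go negative.
```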

03 Adaptive Threshold

Instead of a fixed cutoff — which would produce different results on bright vs. dark images — the threshold is derived from the statistical properties of the Laplacian map itself:

threshold = μ + k × σ

Where μ is the mean and σ the standard deviation (computed with vDSP_meanv and vDSP_vsq), and k is the user-controlled sensitivity slider (0.3 → 3.0). This is the same statistical approach used in industrial machine vision systems.
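A sketch of the statistic, here using vDSP_measqv for the mean of squares — one of several equivalent routes to σ:

```swift
import Accelerate

// Step 3 sketch: adaptive threshold μ + k·σ over the Laplacian response.
func adaptiveThreshold(_ response: [Float], k: Float) -> Float {
    var mean: Float = 0
    vDSP_meanv(response, 1, &mean, vDSP_Length(response.count))
    var meanOfSquares: Float = 0
    vDSP_measqv(response, 1, &meanOfSquares, vDSP_Length(response.count))
    // σ = √(E[x²] − μ²)
    let sigma = max(meanOfSquares - mean * mean, 0).squareRoot()
    return mean + k * sigma
}
```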

04 Morphological Dilation

Raw detected edges are one pixel thin — effectively invisible at screen resolution. A second convolution pass with a diamond-shaped kernel expands them outward, producing the visible colored outline around sharp zones. Without this step, the overlay would appear as scattered single pixels rather than clean borders.
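The dilation pass reuses the same convolution machinery with a cross-shaped kernel — a sketch of one plausible kernel (the app's exact shape and size are not documented here):

```swift
// Step 4 sketch: diamond (cross) kernel — any hit in the 4-neighborhood
// turns the pixel on, thickening one-pixel edges into visible outlines.
let dilationKernel: [Float] = [0, 1, 0,
                               1, 1, 1,
                               0, 1, 0]
// Applied with vImageConvolve_PlanarF, then clamped back to a binary mask.
```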

05 Core Animation Compositing

The result is an RGBA image with pre-multiplied alpha (so the overlay blends correctly during compositing), drawn as a separate CALayer on top of the photo. Changing the overlay color or sensitivity only recomputes the affected buffer — no other part of the interface is redrawn.
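A sketch of this final step — `peakingMask` and `photoLayer` are assumed to exist:

```swift
import QuartzCore

// Step 5 sketch: the mask becomes its own layer above the photo,
// so recoloring it never forces the photo layer to redraw.
let overlay = CALayer()
overlay.contents = peakingMask        // assumed CGImage, premultiplied RGBA
overlay.frame = photoLayer.bounds
photoLayer.addSublayer(overlay)
```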

The entire 5-step pipeline completes in 10–50 milliseconds even on full-resolution images — fast enough that it can run synchronously without any perceptible stall in the UI.
03 — Binary Parsing · Nikon Only

Reading AF Data from RAW

TIFF Container · EXIF IFD · MakerNote · NEF

Nikon's RAW format (.NEF) is a TIFF container — the same format used for professional images since 1986. Autofocus data is hidden inside the MakerNote, a proprietary block that Nikon embeds inside EXIF tag 0x927C.

Last Roll uses no external libraries for this. The parser is written entirely in Swift and works directly on raw file bytes, navigating the nested binary structure layer by layer:

Location     Size   Description
0x0000       2 B    Byte order marker — 0x4949 = little-endian (Intel) · 0x4D4D = big-endian (Motorola)
0x0002       2 B    TIFF magic number 0x002A — confirms TIFF container
0x0004       4 B    IFD0 offset — entry point to the Image File Directory chain
IFD0         var    Navigate IFD entries → locate EXIF IFD pointer at tag 0x8769
EXIF IFD     var    Locate MakerNote at tag 0x927C — Nikon proprietary data block
MakerNote    var    Sub-TIFF header with its own independent byte order — parsed separately
MN + offset  74 B   AF Info 2 block at tag 0x00B7 — AF area width, height, X and Y coordinates at fixed byte offsets
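The first three rows of the table can be sketched as a dependency-free Swift function. The helper name is hypothetical; the real parser continues down the IFD chain from the returned offset:

```swift
// Sketch: parse the 8-byte TIFF header at the start of a NEF file.
func readTIFFHeader(_ bytes: [UInt8]) -> (littleEndian: Bool, ifd0Offset: UInt32)? {
    guard bytes.count >= 8 else { return nil }
    let little: Bool
    switch (bytes[0], bytes[1]) {
    case (0x49, 0x49): little = true     // "II" — Intel byte order
    case (0x4D, 0x4D): little = false    // "MM" — Motorola byte order
    default: return nil
    }
    // Read 16-/32-bit values honoring the declared byte order.
    func u16(_ o: Int) -> UInt16 {
        little ? UInt16(bytes[o]) | UInt16(bytes[o + 1]) << 8
               : UInt16(bytes[o]) << 8 | UInt16(bytes[o + 1])
    }
    func u32(_ o: Int) -> UInt32 {
        little ? UInt32(bytes[o]) | UInt32(bytes[o + 1]) << 8
                 | UInt32(bytes[o + 2]) << 16 | UInt32(bytes[o + 3]) << 24
               : UInt32(bytes[o]) << 24 | UInt32(bytes[o + 1]) << 16
                 | UInt32(bytes[o + 2]) << 8 | UInt32(bytes[o + 3])
    }
    guard u16(2) == 0x002A else { return nil }   // TIFF magic number
    return (little, u32(4))                      // offset of IFD0
}
```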

The result is a rectangle in the original image's pixel coordinate space. It's then scaled to screen coordinates and drawn as a vector overlay — a CGPath with rounded corners and a center crosshair — directly on top of the photo.
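A sketch of the vector overlay — the rectangle value here is illustrative, standing in for the AF area already scaled to screen coordinates:

```swift
import QuartzCore

// Sketch: rounded AF rectangle with a center crosshair, drawn as vectors.
let afRect = CGRect(x: 200, y: 140, width: 120, height: 80)  // assumed, screen space
let path = CGMutablePath()
path.addRoundedRect(in: afRect, cornerWidth: 4, cornerHeight: 4)
path.move(to: CGPoint(x: afRect.midX - 6, y: afRect.midY))
path.addLine(to: CGPoint(x: afRect.midX + 6, y: afRect.midY))
path.move(to: CGPoint(x: afRect.midX, y: afRect.midY - 6))
path.addLine(to: CGPoint(x: afRect.midX, y: afRect.midY + 6))

let shape = CAShapeLayer()
shape.path = path
shape.strokeColor = CGColor(red: 1, green: 0.2, blue: 0.2, alpha: 1)
shape.fillColor = nil        // outline only — the photo stays visible inside
shape.lineWidth = 2
```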

The file is never opened with an image decoder. Only raw bytes are read. This makes AF point extraction possible even before any pixel decoding begins — and keeps the implementation free of any external dependencies.
04 — RAW Loading Strategy

JPEG Extraction from RAW

Embedded Preview · Byte-level Scanning · Zero Dependencies

RAW files don't contain only raw sensor data. Every modern RAW file embeds at least one full-resolution JPEG preview — the same image the camera displays on its rear LCD. Last Roll exploits this to load photos at maximum speed.

The loader scans the file's byte stream looking for the standard JPEG boundary markers:

// JPEG boundary markers (universal across all camera brands)
let SOI: [UInt8] = [0xFF, 0xD8, 0xFF]   // Start Of Image
let EOI: [UInt8] = [0xFF, 0xD9]         // End Of Image

// Collect all embedded JPEG blocks, return the largest
let blocks = findAllJPEGBlocks(in: rawBytes)
return blocks.max(by: { $0.count < $1.count })

All embedded JPEG blocks are located and the largest one is returned — which corresponds to the full-resolution preview. This approach is significantly faster than full RAW decoding and requires no external libraries.
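The scan itself could look like this — findAllJPEGBlocks is an illustrative reconstruction of the function named above, not the shipping implementation:

```swift
// Sketch: collect every SOI…EOI byte range in the stream.
func findAllJPEGBlocks(in bytes: [UInt8]) -> [[UInt8]] {
    var blocks: [[UInt8]] = []
    var start: Int? = nil
    var i = 0
    while i + 1 < bytes.count {
        if start == nil, i + 2 < bytes.count,
           bytes[i] == 0xFF, bytes[i + 1] == 0xD8, bytes[i + 2] == 0xFF {
            start = i          // SOI: a JPEG stream begins here
            i += 3
        } else if let s = start, bytes[i] == 0xFF, bytes[i + 1] == 0xD9 {
            blocks.append(Array(bytes[s...(i + 1)]))   // EOI: close the block
            start = nil
            i += 2
        } else {
            i += 1
        }
    }
    return blocks
}
```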

This strategy works across all RAW formats that embed JPEG previews — NEF, CR2, CR3, ARW, RAF — because the SOI/EOI markers are standardized and the embedded previews are stored as contiguous JPEG streams.
05 — Data Layer

Session Persistence

JSON · ISO 8601 · Atomic Writes · Backwards Compatibility

Sessions are serialized as JSON with dates in ISO 8601 format (2026-02-14T09:00:00Z) — the international standard for date interchange, ensuring sessions are read correctly across time zones and system locales.
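Foundation's built-in strategy covers this directly — a minimal sketch:

```swift
import Foundation

// Sketch: ISO 8601 dates for session JSON, symmetric on encode and decode.
let encoder = JSONEncoder()
encoder.dateEncodingStrategy = .iso8601
let decoder = JSONDecoder()
decoder.dateDecodingStrategy = .iso8601

let stamp = Date(timeIntervalSince1970: 0)
let json = try encoder.encode([stamp])            // ["1970-01-01T00:00:00Z"]
let restored = try decoder.decode([Date].self, from: json)
// restored[0] == stamp, regardless of locale or time zone
```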

// Atomic write: write to temp → rename to final path in one OS op
try data.write(to: url, options: .atomic)

// Backwards-compatible decode: missing fields → default values, not errors
let notes = try container.decodeIfPresent(String.self, forKey: .notes) ?? ""

Atomic Writes

The file is first written to a temporary location, then renamed to the target path in a single OS-level operation — a rename the filesystem guarantees to be atomic. If the app crashes mid-save, the previous file is untouched. Culling data is never lost to a partial write.

Backwards Compatibility

Sessions created by older app versions — before the notes field existed — decode cleanly. The custom decoder uses decodeIfPresent so missing JSON keys produce a default empty string rather than a parse error. No migration scripts, no versioning flags.
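A self-contained sketch of the pattern — the Session shape here is illustrative, not the app's real model:

```swift
import Foundation

// Sketch: old JSON without "notes" decodes with a default instead of failing.
struct Session: Decodable {
    let name: String
    let notes: String

    enum CodingKeys: String, CodingKey { case name, notes }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        name  = try c.decode(String.self, forKey: .name)
        notes = try c.decodeIfPresent(String.self, forKey: .notes) ?? ""  // missing key → ""
    }
}

let legacyJSON = #"{"name":"Roll 12"}"#.data(using: .utf8)!   // pre-notes app version
let session = try JSONDecoder().decode(Session.self, from: legacyJSON)
// session.notes == ""
```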