#51815: build: consume Electron-generated PGO profiles in release builds
Description of Change
Wires Electron release builds to consume Electron-generated PGO profiles instead of Chrome's published ones. This is the consumption side of the PGO pipeline (generation side: #51812 — the two PRs are independent and can merge in either order).
The problem
Release builds apply Chrome's published PGO profile (chrome_pgo_phase = 2). PGO profiles match functions by symbol name + control-flow hash, so every function that differs from Chrome (all of Node.js, patched Chromium files, V8 built with Node's flags, and the Electron shell) gets no optimization guidance and is laid out as cold. The same applies to V8's builtins profile: Chrome's profile is rejected for Electron's promise/async builtins (RunMicrotasks, AsyncFunctionAwait, FulfillPromise, …) because Node's integration changes their codegen.
This isn't theoretical. The shipping Electron 44 has a 2.1× regression in crypto.randomBytes vs Electron 42 (829K → 390K ops/s) caused by exactly this: a BoringSSL patch changed function hashes, so Chrome's profile silently stopped covering them. The Electron profile recovers it completely (839K ops/s).
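To make the matching rule concrete, here is a toy model of it. This is a minimal sketch, not LLVM's actual profile reader: the profile is modeled as a dictionary keyed by (symbol name, control-flow hash), and the `lookup_counts` helper, the function names, the hashes, and the counts are all hypothetical.

```python
from typing import Optional

# Toy profile: (symbol name, control-flow hash) -> execution count.
# Every name, hash, and count here is made up for illustration.
ProfileKey = tuple[str, int]

chrome_profile: dict[ProfileKey, int] = {
    ("example_crypto_fn", 0x1A2B): 500_000,
    ("example_heap_fn", 0x3C4D): 900_000,
}


def lookup_counts(profile: dict[ProfileKey, int], name: str, cfg_hash: int) -> Optional[int]:
    """Return the recorded count only if both the name and the hash match."""
    return profile.get((name, cfg_hash))


# Unpatched function: same name and same hash, so the guidance applies.
unpatched = lookup_counts(chrome_profile, "example_crypto_fn", 0x1A2B)

# Patched function: same name, but the patch changed its control flow and
# therefore its hash. The lookup misses and the function gets no guidance.
patched = lookup_counts(chrome_profile, "example_crypto_fn", 0x9F9F)

print(unpatched, patched)
```

The second lookup returns `None` even though the symbol name is unchanged, which is the silent loss of coverage described above.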
How it works
    build/pgo_profiles/<target>.pgo.txt
        state files name the profile per target (only these change on updates)
            |
            v  gclient hooks (DEPS) run script/pgo/download-profiles.py
    build/pgo_profiles/electron-<target>-<version>.profdata
        downloaded under their versioned names
            |
            v  Chromium's standard chrome_pgo_phase = 2 machinery
    -fprofile-use + every other flag upstream maintains (warning suppressions,
    ext-TSP block layout, future additions), applied to Electron's profiles unchanged
        a small Chromium patch teaches its profile resolution to read Electron's
        state files, including per-arch Linux profiles which upstream does not have
- Chromium's PGO configuration stays authoritative: the patch only redirects which profile the standard chrome_pgo_phase = 2 flow resolves; all compiler/linker flags come from upstream's config and track it automatically (this addresses the cflags-drift concern raised in review: the parallel-config approach had in fact already drifted, missing -enable-ext-tsp-block-placement)
- A single release.gn import works on every platform/arch: chrome_pgo_phase already defaults to 2 for official builds, so it only wires the V8 builtins profile
- Chrome's published profiles are no longer downloaded (checkout_pgo_profiles = False); an explicitly set pgo_data_path still takes precedence as an escape hatch back to them
- 32-bit targets (win-x86, linux-arm) get the C++ profile but keep no builtins PGO (Electron only generates a 64-bit builtins profile)
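The resolution order described above (explicit pgo_data_path first, otherwise the profile named by the target's state file) can be sketched in a few lines. This is an illustration only: the real logic lives in Chromium's GN configuration and the patch, not in Python. The `resolve_profile` helper, the one-line state file format, and the target and version strings are all assumptions.

```python
from pathlib import Path
from typing import Optional
import tempfile


def resolve_profile(profile_dir: Path, target: str, pgo_data_path: Optional[str] = None) -> Path:
    """Pick the profile to build with (hypothetical helper).

    An explicitly set pgo_data_path wins; otherwise the profile named in
    <target>.pgo.txt is used. The one-line state file format is an assumption.
    """
    if pgo_data_path:
        return Path(pgo_data_path)
    state_file = profile_dir / f"{target}.pgo.txt"
    profile_name = state_file.read_text(encoding="utf-8").strip()
    return profile_dir / profile_name


# Demo in a temporary directory with a made-up target and version.
with tempfile.TemporaryDirectory() as tmp:
    profile_dir = Path(tmp)
    (profile_dir / "linux-x64.pgo.txt").write_text(
        "electron-linux-x64-0000.profdata\n", encoding="utf-8"
    )

    default = resolve_profile(profile_dir, "linux-x64")
    override = resolve_profile(profile_dir, "linux-x64", pgo_data_path="/tmp/chrome.profdata")

    print(default.name)
    print(override)
```

Keeping the state file as the only thing that changes on a profile update means the versioned .profdata name never has to be hard-coded in build configuration.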
Benchmark results
37 statistically significant improvements, 0 regressions, 1 tie across 38 app-operation benchmarks — overall geomean +19.5% vs the shipping nightly. Full data below.
All benchmark records
1. Headline: shipping nightly vs this work (conclusive, statistically controlled)
Methodology: official v44.0.0-nightly.20260529 vs an identical-source build with Electron PGO profiles + ThinLTO¹, Linux x64, 38 benchmarks × 5 interleaved rounds per build (A,B,A,B,… to cancel thermal/cache drift), idle machine, Welch's t-test at p < 0.05.
| Area | Geomean | Largest wins |
|---|---|---|
| contextBridge | +28.1% | deep nested objects +54.6%, small objects +49.6%, arrays +48.3%, callbacks +40.9% |
| Networking | +18.8% | POST echo +39.5%, response.json() +36.6%, fetch 1KB +28.9%, XHR +27.7% |
| IPC | +10.9% | structured clone (Map/Set/Date) +16.1%, 1MB payloads +14.7%, 5MB throughput +13.0% |
¹ The comparison build also includes ThinLTO --lto-O2 (#51669/#51809), since that is the configuration releases will ship with once both land. For PGO's isolated contribution see section 2.
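For readers who want to reproduce the significance check in the methodology, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom with the standard library only. The sample values are made up and are not taken from the benchmark runs; the `welch_t` helper is illustrative. Turning t and df into a p-value needs a Student's t distribution, for example via SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance


def welch_t(a: list[float], b: list[float]) -> tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    # Per-sample variance of the mean: s^2 / n, using the sample variance.
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(b) - mean(a)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up ops/s samples for one benchmark, collected in interleaved order
# (A, B, A, B, ...) so slow thermal or cache drift hits both builds about equally.
interleaved = [("A", 1000.0), ("B", 1250.0), ("A", 1010.0), ("B", 1240.0),
               ("A", 990.0), ("B", 1260.0), ("A", 1005.0), ("B", 1255.0),
               ("A", 995.0), ("B", 1245.0)]
baseline = [v for build, v in interleaved if build == "A"]
candidate = [v for build, v in interleaved if build == "B"]

t, df = welch_t(baseline, candidate)
print(f"t = {t:.1f}, df = {df:.1f}")
```

Welch's variant is the natural choice here because it does not assume the two builds have equal run-to-run variance.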
Full 38-test table
| Test | Nightly | With profiles | Δ | Significant |
|---|---|---|---|---|
| bridge: void call (no args) | 1,823,713 | 2,292,331 | +25.7% | ✅ |
| bridge: echo number | 1,449,493 | 1,880,806 | +29.8% | ✅ |
| bridge: echo 1KB string | 1,446,386 | 1,884,338 | +30.3% | ✅ |
| bridge: echo small object | 369,786 | 553,153 | +49.6% | ✅ |
| bridge: echo deep nested object | 103,109 | 159,361 | +54.6% | ✅ |
| bridge: echo 500-elem array | 1,133 | 1,681 | +48.3% | ✅ |
| bridge: echo 64KB typed array | 15,362 | 15,966 | +3.9% | ✅ |
| bridge: echo 1MB typed array | 637 | 650 | +2.0% | tie (memcpy-bound) |
| bridge: transform object | 342,173 | 506,416 | +48.0% | ✅ |
| bridge: callback round-trip | 144,697 | 203,861 | +40.9% | ✅ |
| bridge: exposed property access | 7,419,663 | 8,031,797 | +8.3% | ✅ |
| bridge: async function (promise) | 159,229 | 202,957 | +27.5% | ✅ |
| bridge: invoke round-trip (small) | 10,603 | 11,920 | +12.4% | ✅ |
| ipc: invoke small object | 11,529 | 12,373 | +7.3% | ✅ |
| ipc: invoke 10KB JSON object | 3,070 | 3,478 | +13.3% | ✅ |
| ipc: invoke 64KB typed array | 2,276 | 2,592 | +13.9% | ✅ |
| ipc: invoke 1MB typed array | 210 | 241 | +14.7% | ✅ |
| ipc: invoke structured clone (Map/Set/Date) | 4,702 | 5,462 | +16.1% | ✅ |
| ipc: invoke 5MB throughput (MB/s) | 223 | 252 | +13.0% | ✅ |
| ipc: invoke 20MB throughput (MB/s) | 227 | 254 | +11.7% | ✅ |
| ipc: one-way send throughput (msgs/s) | 23,392 | 23,747 | +1.5% | ✅ |
| ipc: webContents.send round-trip | 12,250 | 13,646 | +11.4% | ✅ |
| ipc: MessagePort round-trip | 14,033 | 15,239 | +8.6% | ✅ |
| ipc: executeJavaScript round-trip | 11,795 | 12,844 | +8.9% | ✅ |
| net: fetch 1KB round-trip | 761 | 981 | +28.9% | ✅ |
| net: fetch 64KB round-trip | 735 | 907 | +23.4% | ✅ |
| net: fetch 1MB round-trip | 279 | 317 | +13.7% | ✅ |
| net: fetch 8MB throughput (MB/s) | 524 | 554 | +5.9% | ✅ |
| net: 4x parallel 64KB fetches | 237 | 288 | +21.4% | ✅ |
| net: POST echo 2KB | 821 | 1,146 | +39.5% | ✅ |
| net: POST echo 256KB | 287 | 347 | +21.1% | ✅ |
| net: fetch + response.json() (10KB) | 727 | 993 | +36.6% | ✅ |
| net: fetch + response.text() (64KB) | 665 | 840 | +26.3% | ✅ |
| net: XHR 64KB round-trip | 691 | 883 | +27.7% | ✅ |
| net: WebSocket echo 1KB | 9,955 | 10,643 | +6.9% | ✅ |
| net: WebSocket echo 64KB | 1,574 | 1,685 | +7.1% | ✅ |
| net: node https.get 64KB | 4,003 | 4,355 | +8.8% | ✅ |
| net: node https.get 1MB | 358 | 372 | +4.0% | ✅ |
All values ops/s unless noted. The single non-win (1MB typed array marshaling) is dominated by raw memcpy, which PGO cannot accelerate.
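As a cross-check on the per-area summary, the geomean can be recomputed from the Δ column. The sketch below does this for the 11 "ipc:" rows, assuming the geomean is taken over the speedup ratios (1 + Δ); that assumption and the `geomean_gain` helper are not stated in the PR. Because the inputs are already rounded to one decimal, the result should land close to, not exactly on, the reported +10.9%.

```python
import math

# Δ column for the 11 "ipc:" rows of the table above, in percent.
ipc_deltas = [7.3, 13.3, 13.9, 14.7, 16.1, 13.0, 11.7, 1.5, 11.4, 8.6, 8.9]


def geomean_gain(deltas_percent: list[float]) -> float:
    """Geometric mean of the speedup ratios, expressed as a percent gain."""
    log_sum = sum(math.log(1 + d / 100) for d in deltas_percent)
    return (math.exp(log_sum / len(deltas_percent)) - 1) * 100


ipc_geomean = geomean_gain(ipc_deltas)
print(f"IPC geomean: +{ipc_geomean:.1f}%")
```

A geometric mean is used instead of an arithmetic one so that a single large win (or a near-zero result such as the one-way send row) does not dominate the summary.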
2. PGO's isolated contribution (Chrome profile vs Electron profile, same build otherwise)
| Benchmark | Improvement |
|---|---|
| Speedometer 3.1 (Linux x64) | +9.5% |
| crypto.randomBytes | +118% (regression recovery: 390K → 839K ops/s, matching Electron 42's 829K) |
| Startup → app ready (macOS M1) | −13% (45.2ms → 39.2ms) |
3. The training-coverage story (why profiles must cover app workloads)
A profile trained only on browser benchmarks pessimizes code those benchmarks never run (PGO marks uncovered functions cold). Measured cost on a benchmark-only profile, and recovery after adding Electron-specific training (main-process Node.js, contextBridge/IPC marshaling, networking over TLS — see #51812):
| Path | Benchmark-only profile | Enriched profile |
|---|---|---|
| Node Buffer ops (macOS arm64) | −63% vs unprofiled | recovered |
| contextBridge calls | baseline | +23–27% further |
| Large-payload IPC | baseline | +24% further |
| Geomean of 22 app operations | baseline | +8% further |
4. V8 builtins profile coverage
| Builtins | Chrome's published profile | Electron's profile |
|---|---|---|
| Promise/async builtins (RunMicrotasks, AsyncFunctionAwait, FulfillPromise, PromiseConstructor, …) | Rejected (hash mismatch from Node's codegen flags) | Covered (113–306 block hints each) |
| All other builtins | Covered | Covered |
5. Cumulative Speedometer 3.1 progression (Linux x64, containerized; same source, same V8)
| Configuration | Score | Step |
|---|---|---|
| Shipping configuration (Chrome profile, default LTO) | 22.08 | baseline |
| + ThinLTO --lto-O2 (#51669 / #51809) | 24.99 | +13.2% |
| + Electron C++ PGO profile (this PR) | 27.37 | +9.5% |
| + Electron V8 builtins profile (this PR) | ~27.8 | +1.6% |
With the full optimization stack, Electron is as fast as — and in some cases faster than — Chrome on the same workloads.
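The Step column in the progression table is relative to the previous row, so the steps compound and do not simply add. A short sketch that recomputes each step and the cumulative gain from the scores in the table (the last score is approximate in the source, so its step and the cumulative figure are approximate too):

```python
# Speedometer 3.1 scores from the progression table (the last one is approximate).
scores = [
    ("Shipping configuration", 22.08),
    ("+ ThinLTO", 24.99),
    ("+ Electron C++ PGO profile", 27.37),
    ("+ Electron V8 builtins profile", 27.8),
]

step_gains = []
for (_, prev), (label, cur) in zip(scores, scores[1:]):
    # Each step is measured against the row directly above it.
    gain = (cur / prev - 1) * 100
    step_gains.append(gain)
    print(f"{label}: +{gain:.1f}%")

# The cumulative gain compares the last row with the baseline row.
cumulative = (scores[-1][1] / scores[0][1] - 1) * 100
print(f"Cumulative: +{cumulative:.1f}%")
```

The cumulative figure comes out slightly above the plain sum of the three steps, which is what multiplicative stacking of independent optimizations looks like.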
6. Real-hardware corroboration (macOS, Apple Silicon)
ThinLTO-only numbers from #51669/#51809 testing — included to show container results translate to real hardware (PGO stacks on top of these):
| Hardware | Speedometer 3.1 | Gain |
|---|---|---|
| M1 | 26.8 → 31.7 | +18.3% |
| M5 | 56.6 → 65.5 | +15.7% |
Relationship to other PRs
- #51812 — generates and publishes the profiles this PR consumes (independent; either merge order works — the profiles referenced by this PR's state files are already published)
- #51669 / #51809 — ThinLTO link-time optimization (independent; benefits multiply)
Checklist
- I have built and tested this change
- I have filled out the PR description
- I have reviewed and verified the changes
- npm test passes
- PR release notes describe the change in a way relevant to app developers
Release Notes
Notes: Improved runtime performance.
Backports
Semver Impact
Semantic Versioning helps users understand the impact of updates:
- Major (X.y.z): Breaking changes that may require code modifications
- Minor (x.Y.z): New features that maintain backward compatibility
- Patch (x.y.Z): Bug fixes that don't change the API
- None: Changes that don't affect user-facing parts of Electron