PHP RFC: SIMD-Accelerated CRC via crc-fast for ext/hash
- Version: 1
- Date: 2026-02-17
- Author: Michael Wallner mike@php.net, Don MacAskill don@smugmug.com
- Status: Under Discussion
- First Published: 2026-02-17
- Implementation: https://github.com/php/php-src/pull/20513
Introduction
This RFC proposes adding optional support for the crc-fast library to PHP's ext/hash extension, enabling SIMD-accelerated CRC-32 and CRC-64 computation. When enabled via a new --with-crc-fast configure flag, the integration delivers up to 233× higher CRC throughput on ARM (aarch64) and ~4× on x86_64. It also adds support for CRC-64/NVME, a checksum algorithm PHP cannot currently compute natively.
PHP's current CRC implementation in ext/hash has two significant limitations:
1. No hardware acceleration on aarch64
On x86_64, PHP uses hardware CRC instructions for crc32c, achieving reasonable throughput (~27 GiB/s on Intel Sapphire Rapids). However, on aarch64, an increasingly popular server architecture (AWS Graviton, Apple Silicon, Ampere Altra), PHP falls back to table-based software computation, yielding only ~0.4 GiB/s. That is up to 233× slower than what the hardware can deliver (see Performance).
2. No CRC-64/NVME support
CRC-64/NVME (also known as CRC-64/ROCKSOFT or CRC-64/WE) is the checksum algorithm:
- Recommended by AWS S3 as an integrity check for object uploads and multipart transfers (as of 2024)
- Used in the NVMe specification for data integrity
- Used in the Linux kernel
- Supported natively in AWS SDKs (including the Rust SDK via aws-smithy-checksums)
PHP applications interacting with AWS S3 that need CRC-64/NVME checksums must currently rely on workarounds. There is no way to compute this checksum using hash() or any other built-in PHP function.
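To illustrate what such a workaround looks like today, here is a minimal sketch of a bitwise CRC-64/NVME routine in userland PHP. The function name `crc64nvme_reference` is hypothetical, the code assumes a 64-bit PHP build, and the big-endian hex output format is an assumption rather than something this RFC specifies. The parameters (reflected polynomial 0x9A6C9329AC4BC9B5, init and xorout all ones) are the commonly published ones for this CRC.

```php
<?php
declare(strict_types=1);

/**
 * Bitwise reference CRC-64/NVME (reflected; init and xorout are all ones).
 * Hypothetical helper for illustration only; requires 64-bit PHP.
 */
function crc64nvme_reference(string $data): string
{
    // Reflected polynomial 0x9A6C9329AC4BC9B5, assembled from two halves
    // because the full literal would overflow PHP's signed 64-bit integer.
    $poly = (0x9A6C9329 << 32) | 0xAC4BC9B5;

    $crc = -1; // all 64 bits set (init = 0xFFFFFFFFFFFFFFFF)
    $len = strlen($data);
    for ($i = 0; $i < $len; $i++) {
        $crc ^= ord($data[$i]);
        for ($bit = 0; $bit < 8; $bit++) {
            $lsb = $crc & 1;
            // PHP's >> is an arithmetic shift, so clear the sign bit
            // afterwards to emulate a logical right shift.
            $crc = ($crc >> 1) & PHP_INT_MAX;
            if ($lsb === 1) {
                $crc ^= $poly;
            }
        }
    }

    $crc = ~$crc; // xorout = 0xFFFFFFFFFFFFFFFF
    return bin2hex(pack('J', $crc)); // 16 hex digits, big-endian
}

// The published check value for this input is ae8b14860a799888.
echo crc64nvme_reference('123456789'), PHP_EOL;
```

A loop like this processes one bit at a time in the interpreter, so it is only practical for small inputs or for cross-checking another implementation.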
Real-world impact
These limitations affect PHP applications that:
- Process large volumes of data with integrity checks (media platforms, file storage services, backup systems)
- Interact with AWS S3's checksum verification features
- Run on ARM-based cloud infrastructure (Graviton, which AWS promotes for cost savings)
- Use CRC checksums in hot paths (e.g., deduplication, content-addressable storage)
Proposal
Add optional crc-fast library support to ext/hash via a new configure flag:
./configure --with-crc-fast
When enabled, the following changes take effect:
Accelerated existing algorithms
All existing CRC-32 hash algorithms registered in ext/hash gain SIMD-accelerated computation transparently. The affected algorithms and their hash() names are:
| hash() name | Algorithm | Notes |
|---|---|---|
| crc32 | CRC-32/BZIP2 (byte-reversed) | PHP's idiosyncratic “crc32”; see Legacy Behavior |
| crc32b | CRC-32/ISO-HDLC | Standard CRC-32 |
| crc32c | CRC-32/ISCSI | Castagnoli, already HW-accelerated on x86_64 |
All accelerated variants produce identical output to the current implementations. This is a pure performance optimization with no behavioral change.
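One way to spot-check the "identical output" claim on any given build is to compare against the published check values for the ASCII input "123456789". The sketch below uses only the existing hash() API; the helper name `crc_check_value_mismatches` is hypothetical, and the table covers just the two algorithms whose output matches the standard check values directly.

```php
<?php
declare(strict_types=1);

/**
 * Compare hash() output against published CRC check values.
 * Hypothetical helper; returns the names of algorithms that do not match.
 *
 * @return list<string>
 */
function crc_check_value_mismatches(): array
{
    // Check values for the input "123456789".
    $expected = [
        'crc32b' => 'cbf43926', // CRC-32/ISO-HDLC
        'crc32c' => 'e3069283', // CRC-32/ISCSI
    ];

    $mismatches = [];
    foreach ($expected as $algo => $hex) {
        if (hash($algo, '123456789') !== $hex) {
            $mismatches[] = $algo;
        }
    }
    return $mismatches;
}

$bad = crc_check_value_mismatches();
echo $bad === [] ? 'all check values match' : 'mismatch: ' . implode(', ', $bad), PHP_EOL;
```

Running the same script on builds with and without --with-crc-fast should print the same line if the acceleration is transparent.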
New CRC-64 algorithms
The following CRC-64 variants are registered, covering all widely-used CRC-64 algorithms:
| hash() name | Algorithm |
|---|---|
| crc64nvme | CRC-64/NVME |
| crc64ecma | CRC-64/ECMA-182 |
| crc64iso | CRC-64/GO-ISO |
| crc64xz | CRC-64/XZ |
| crc64redis | CRC-64/REDIS |
| crc64ms | CRC-64/MS |
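Because the new names plug into the existing hash() machinery, they would presumably work with the incremental API as well. The sketch below shows a streaming checksum helper; `checksum_stream` is a hypothetical name, and the usage line passes 'crc32b' so that it runs on a stock build. On a build with --with-crc-fast, one would pass 'crc64nvme' (or another name from the table) instead.

```php
<?php
declare(strict_types=1);

/**
 * Checksum a stream incrementally with any registered hash() algorithm.
 * Hypothetical helper; hash_init() raises an error for unknown names.
 *
 * @param resource $stream
 */
function checksum_stream($stream, string $algo): string
{
    $ctx = hash_init($algo);
    hash_update_stream($ctx, $stream); // consumes the stream until EOF
    return hash_final($ctx);           // lowercase hex digest
}

// In-memory stream standing in for a large file or upload body.
$stream = fopen('php://memory', 'w+b');
fwrite($stream, '123456789');
rewind($stream);

// Replace 'crc32b' with 'crc64nvme' on a build that registers it.
echo checksum_stream($stream, 'crc32b'), PHP_EOL;
fclose($stream);
```

Streaming keeps memory use flat regardless of input size, which matters for the multi-gigabyte objects the motivation section describes.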
No change when disabled
When --with-crc-fast is not specified or libcrc_fast is not installed, PHP builds and behaves exactly as it does today. The new CRC-64 algorithms are not available, and CRC-32 uses the existing implementations.
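Since availability depends on how PHP was built, portable code would need to detect the new algorithms at runtime. A minimal sketch using the existing hash_algos() function follows; the constant and function names are hypothetical, and falling back to crc32c is just one possible policy.

```php
<?php
declare(strict_types=1);

// Algorithm names proposed by this RFC (hypothetical constant name).
const PROPOSED_CRC64_ALGOS = [
    'crc64nvme', 'crc64ecma', 'crc64iso', 'crc64xz', 'crc64redis', 'crc64ms',
];

/**
 * List the proposed CRC-64 algorithms that this build actually registers.
 *
 * @return list<string>
 */
function available_crc64_algos(): array
{
    return array_values(array_intersect(PROPOSED_CRC64_ALGOS, hash_algos()));
}

/** Prefer CRC-64/NVME when present, otherwise fall back to crc32c. */
function pick_checksum_algo(): string
{
    return in_array('crc64nvme', hash_algos(), true) ? 'crc64nvme' : 'crc32c';
}

printf(
    "CRC-64 algorithms available: %s\nSelected: %s\n",
    implode(', ', available_crc64_algos()) ?: '(none)',
    pick_checksum_algo()
);
```

On a build without --with-crc-fast this reports no CRC-64 algorithms and selects crc32c, which matches the "no change when disabled" behavior described above.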
Technical Details
The crc-fast library
crc-fast is a Rust library that provides:
- SIMD-accelerated CRC computation using PCLMULQDQ/VPCLMULQDQ on x86_64 and PMULL on aarch64
- 8-at-a-time folding of the Intel PCLMULQDQ paper's algorithm (vs. the paper's 4-at-a-time)
- Runtime CPU feature detection with automatic fallback to table-based software computation
- C-compatible shared library (libcrc_fast.so/libcrc_fast.dylib) via a stable C ABI
- Support for all known CRC-32 and CRC-64 variants, including non-reflected variants
- Memory safety validated with Miri, fuzz-tested with libFuzzer
The library is:
- Licensed under Apache-2.0 OR MIT (compatible with PHP's license)
- Published on crates.io with 6.4M+ downloads
- Used by the AWS SDK for Rust (aws-smithy-checksums)
- Running in production at SmugMug and Flickr on Linux (aarch64 + x86_64)
Performance
Benchmarks comparing PHP's current ext/hash CRC-32/ISO-HDLC (crc32b) throughput against crc-fast, measured on 1 GiB input:
aarch64
| Platform | PHP (current) | With crc-fast | Speedup |
|---|---|---|---|
| Apple M3 Ultra | ~0.4 GiB/s | ~99.6 GiB/s | 233× |
| AWS Graviton4 | ~0.4 GiB/s | ~56.7 GiB/s | 141× |
| AWS Graviton3 | ~0.4 GiB/s | ~27.2 GiB/s | 68× |
x86_64
| Platform | PHP (current) | With crc-fast | Speedup |
|---|---|---|---|
| Intel Sapphire Rapids | ~27.0 GiB/s | ~108.3 GiB/s | 4× |
| AMD EPYC Genoa (Zen4) | ~13.6 GiB/s | ~53.7 GiB/s | 4× |
CRC-64/NVME (new, no PHP baseline exists)
| Platform | Throughput |
|---|---|
| Apple M3 Ultra | ~70.0 GiB/s |
| Intel Sapphire Rapids | ~54.6 GiB/s |
| AWS Graviton4 | ~36.1 GiB/s |
The aarch64 improvement is the most dramatic because PHP currently has no hardware-accelerated CRC path on ARM. The x86_64 improvements come from the 8-at-a-time folding approach and AVX-512 VPCLMULQDQ usage on supporting processors.
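Readers who want a rough figure for their own hardware could use a micro-benchmark along these lines. This is a sketch only and not the methodology behind the tables above: the function name `measure_throughput_gib`, the 16 MiB buffer, and the round count are all assumptions chosen to keep the run short.

```php
<?php
declare(strict_types=1);

/**
 * Rough single-threaded throughput of a hash() algorithm in GiB/s.
 * Hypothetical helper; results vary with CPU, build flags, and system load.
 */
function measure_throughput_gib(string $algo, int $bytes = 16 * 1024 * 1024, int $rounds = 8): float
{
    $buffer = random_bytes($bytes);
    hash($algo, $buffer); // warm-up pass, not timed

    $start = hrtime(true); // monotonic clock, nanoseconds
    for ($i = 0; $i < $rounds; $i++) {
        hash($algo, $buffer);
    }
    // Guard against a zero interval on very small inputs.
    $seconds = max((hrtime(true) - $start) / 1e9, 1e-9);

    return ($bytes * $rounds) / (1024 ** 3) / $seconds;
}

foreach (['crc32b', 'crc32c'] as $algo) {
    printf("%-7s %.2f GiB/s\n", $algo, measure_throughput_gib($algo));
}
```

Comparing the output of the same script on builds with and without --with-crc-fast gives a local estimate of the speedup.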
Legacy Behavior
PHP's hash('crc32', ...) computes CRC-32/BZIP2 with byte-reversed output, which differs from the CRC-32 implementation in most other languages and from PHP's own crc32() function. This is a long-standing quirk documented in the PHP manual.
The crc-fast library accounts for this via a dedicated CRC_32_PHP algorithm constant that exactly reproduces PHP's existing behavior. No output changes occur for any existing algorithm.
The following equivalences are preserved:
// These continue to produce identical results with or without crc-fast:
hash('crc32', $data);  // CRC-32/BZIP2, byte-reversed (PHP's quirk)
hash('crc32b', $data); // CRC-32/ISO-HDLC (standard CRC-32)
hash('crc32c', $data); // CRC-32/ISCSI (Castagnoli)
crc32($data);          // CRC-32/ISO-HDLC (native function, unaffected)
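The relationships described in this section can be checked in userland. The sketch below assumes 64-bit PHP and uses a hypothetical bitwise helper, `crc32_bzip2_reference`, to compute CRC-32/BZIP2 (non-reflected, polynomial 0x04C11DB7). If the description above holds, both comparisons should report true.

```php
<?php
declare(strict_types=1);

/**
 * Bitwise CRC-32/BZIP2 (non-reflected; init and xorout are 0xFFFFFFFF).
 * Hypothetical helper for illustration only; requires 64-bit PHP.
 */
function crc32_bzip2_reference(string $data): int
{
    $crc = 0xFFFFFFFF;
    $len = strlen($data);
    for ($i = 0; $i < $len; $i++) {
        $crc ^= ord($data[$i]) << 24; // feed the byte into the top of the register
        for ($bit = 0; $bit < 8; $bit++) {
            $carry = $crc & 0x80000000;
            $crc = ($crc << 1) & 0xFFFFFFFF; // keep the register at 32 bits
            if ($carry !== 0) {
                $crc ^= 0x04C11DB7;
            }
        }
    }
    return $crc ^ 0xFFFFFFFF;
}

$data = '123456789';

// crc32() returns an integer; hash('crc32b') is the same value as hex.
var_dump(sprintf('%08x', crc32($data)) === hash('crc32b', $data));

// hash('crc32') is described as CRC-32/BZIP2 with its four bytes reversed.
$reversed = bin2hex(strrev(pack('N', crc32_bzip2_reference($data))));
var_dump($reversed === hash('crc32', $data));
```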
Backward Incompatible Changes
None. The feature is entirely opt-in via --with-crc-fast. When disabled, PHP is unchanged. When enabled, existing CRC-32 algorithms produce identical output at higher throughput, and new CRC-64 algorithm names are additive.
Proposed PHP Version
PHP 8.6 (next minor release targeting the master branch).
RFC Impact
To SAPIs
None. The change is internal to ext/hash.
To Existing Extensions
None. The ext/hash public API is unchanged. Extensions that register custom hash algorithms are unaffected.
To Opcache
None.
New Constants
None proposed. Algorithm selection uses string names via the existing hash() API.
php.ini Defaults
No new INI settings. The feature is controlled entirely at compile time via --with-crc-fast.
Proposed Voting Choices
Primary Vote requiring a 2/3 majority to accept the RFC:
Patches and Tests
- Implementation: https://github.com/php/php-src/pull/20513
- Standalone PHP extension (reference): https://github.com/awesomized/crc-fast-php-ext
- Underlying library: https://github.com/awesomized/crc-fast-rust
References
- PHP internals mailing list discussion (November 2025)
- crc-fast Rust crate (6.4M+ downloads)
- PHP PR #3913: CRC-32C addition in PHP 7.4 (precedent for adding CRC variants without an RFC)
Rejected Features
Changelog