====== PHP RFC: Add MariaDB-specific features to mysqlnd and mysqli =====
* Version: 0.4
* Date: 2026-04-13
* Author: Georg Richter, georg@php.net
* Status: Under discussion
===== Introduction =====
The PHP MySQL ecosystem is currently centered around mysqlnd, mysqli, and PDO_MySQL, which together provide a unified and efficient interface for interacting with MySQL-compatible database servers. MariaDB originated as a fork of MySQL and continues to share a large portion of its protocol, behavior, and feature set. Because of this substantial overlap, maintaining a completely separate PHP extension for MariaDB would introduce unnecessary duplication, fragmentation, and long-term maintenance cost.
At the same time, MySQL and MariaDB have diverged over the years. MariaDB has introduced a number of server-side features that are not available in MySQL, many of which provide significant performance and observability improvements. Currently, these MariaDB-specific capabilities are either inaccessible or only partially usable from PHP, despite being available through the same wire protocol and client library functions.
This RFC proposes to extend mysqlnd and mysqli to optionally expose MariaDB-specific functionality when connected to a MariaDB server, without affecting behavior when used with MySQL. The goal is not to fork or specialize the extensions, but to enhance them in a way that preserves compatibility while allowing PHP applications to benefit from MariaDB’s additional capabilities.
The MariaDB features targeted by this proposal include:
* **Capability Detection** – Enabling accurate identification of MariaDB servers and their specific feature sets.
* **Bulk operations** – Enabling significantly faster ''INSERT'', ''UPDATE'', ''REPLACE'', and ''DELETE'' statements via the new ''execute_many()'' method. Benchmarks (see Performance Analysis) demonstrate that by amortizing network round-trip costs and reducing driver-level overhead, performance gains can exceed 100x in remote or resource-constrained scenarios (e.g., Raspberry Pi via Ethernet) and remain substantial (up to 9x) even on high-performance ''unix_socket'' connections. This feature allows PHP applications to achieve throughput parity with native bulk-loading tools while maintaining the security of the prepared statement protocol.
* **Progress indication for long-running operations** – Improving transparency and enabling better application-level monitoring and user feedback for heavy maintenance tasks.
Exposing these features through existing PHP database extensions allows developers to take advantage of MariaDB’s strengths while continuing to rely on familiar MySQL-oriented APIs. This approach aligns with PHP’s long-standing emphasis on pragmatism, performance, and backward compatibility, while preparing the extensions to handle protocol divergence and feature evolution in MariaDB.
The following internal improvements are related to this work but will be submitted as separate Pull Requests to simplify the RFC process:
* MariaDB Authentication Plugins (e.g., parsec).
* Extended Metadata (mapping types like ''LONGTEXT+JSON'' to ''MYSQL_TYPE_JSON'').
* Prepared statement metadata caching.
These PR-bound features require the Capability Detection logic defined in this RFC to function correctly.
===== Proposal =====
Notes:
* The concepts apply to mysqlnd internally and are exposed through mysqli. PDO_MySQL is currently out of scope for visible API changes in this RFC: This is because the current PDO interface lacks a native method for high-performance bulk execution or iterable-based parameter binding. While mysqli can be extended with execute_many(), a similar addition to PDO would require a separate, cross-driver RFC to modify the core PDO base class. By implementing the logic in mysqlnd now, we provide the necessary foundation for that future PDO expansion.
==== Detecting MySQL or MariaDB ====
A fundamental requirement for exposing MariaDB-specific features in PHP extensions is reliably detecting whether the connected server is MariaDB or MySQL. This cannot be done solely by checking the server_version string. While standard MySQL and MariaDB installations report the server type and version in server_version, many derived products—including cloud offerings, packaged distributions, and proxies may report different names, modified version numbers, or other inconsistencies. Therefore, relying on server_version alone is unreliable for feature detection.
A robust approach is to determine server type through capability flags exchanged between the client and server during the connection handshake. These flags indicate which features the server supports and which extensions the client can enable.
One key difference between MySQL and MariaDB is the handling of the long password capability:
MySQL servers always indicate the CLIENT_LONG_PASSWORD capability.
MariaDB historically used the same capability name, but later versions renamed it to CLIENT_MYSQL and removed it entirely.
By examining these flags, the client can detect MariaDB servers accurately and safely. Once a MariaDB server is recognized, the client can both read and advertise MariaDB-specific capabilities, enabling access to enhanced features.
To fully support modern MariaDB features, PHP extensions must handle 64-bit capability flags. These flags extend the feature set beyond what can fit in traditional 32-bit integers. For legacy 32-bit clients and to maintain backward compatibility, it is necessary to expose these capabilities in both server_capabilities and ext_server_capabilities. This dual exposure ensures that extensions can access all advertised features safely, regardless of the client architecture.
MySQL has also introduced a forward-looking mechanism for capability expansion via the CLIENT_CAPABILITY_EXTENSION flag. This flag reserves space for extending the traditional 32-bit capability field to 64 bits. As both MySQL and MariaDB move toward extended capability sets, it becomes important to clearly separate vendor-specific extensions.
To avoid ambiguity and potential conflicts between MySQL and MariaDB capability flags in the extended range, PHP should introduce distinct namespaces for these flags. For example, MariaDB-specific extended capabilities could be exposed using a prefix such as MYSQLI_MARIADB_*, ensuring that they remain clearly distinguishable from MySQL-defined flags and any future extensions
This capability-based detection allows PHP extensions to:
* Enable MariaDB-specific features such as bulk operations, extended metadata, progress reporting, and mariadb_stmt_execute_direct.
* Support MariaDB authentication plugins like parsec.
* Maintain full backward compatibility with MySQL servers.
Example:
server_capabilities & MYSQLI_CLIENT_MYSQL) {
printf("Connected to MySQL server\n");
} else {
printf("Connected to Mariadb server\n");
printf("Bulk operations are ");
if ($link->extended_server_capabilities & MYSQLI_MARIADB_CLIENT_STMT_BULK_OPERATIONS) {
printf("supported\n");
} else {
printf("not supported\n");
}
}
$link->close();
?>
==== Bulk Execution ====
MariaDB provides mechanisms to improve the performance of repeated or multi-row statements through bulk execution. The most common scenario is INSERT statements. In many cases, a standard single-row INSERT or REPLACE can be rewritten using the multi-value syntax:
INSERT INTO table (col1, col2) VALUES (?, ?), (?, ?), ..., (?, ?)
This approach can significantly reduce network round-trips and server parsing overhead. However, it has some limitations:
* Not all INSERT or REPLACE statements can be safely rewritten using multi-value syntax (e.g., INSERT ... SELECT or statements with complex ON DUPLICATE KEY UPDATE clauses).
* DELETE and UPDATE statements cannot generally use multi-value or IN(...) optimizations when additional conditions exist. Even when IN(...) could work, it only applies to simple statements with a single column in the WHERE clause. For complex queries, each statement must be executed individually.
The MySQL protocol already anticipated support for bulk operations when prepared statements were introduced in MySQL 4.1. Prepared statement headers include a 4-byte iteration count, intended to allow repeated execution of a statement. This field, however, was never used by MySQL.
MariaDB addressed this limitation in version 10.2 by introducing a new command, COM_STMT_BULK_EXECUTE, which implements an enhanced protocol for executing bulk operations. This protocol provides several advantages:
* Efficient execution of multiple statement iterations in a single network round-trip.
* Support for indicator variables, allowing the client to specify NULL or default values for individual parameters within the bulk operation.
* Significant reduction in driver and server-side overhead, which can result in performance gains ranging from 2.5x on high-end local hardware to over 100x in remote network scenarios (depending on network latency and CPU processing power).
From a PHP perspective, exposing bulk execution through mysqli or PDO would allow applications to take advantage of these performance gains without requiring changes to existing SQL logic, other than optionally preparing statements for bulk execution.
=== Supporting indicator variables ===
To fully leverage MariaDB's bulk protocol, this RFC introduces the mysqli_indicator. Indicators allow the client to specify per-row behavior for individual parameters without changing the SQL statement or the bound types.
Available Indicators:
* mysqli_indicator::None
* mysqli_indicator::Null: Explicitly set column value to NULL (as an alternative also NULL can be used)
* mysqli_indicator::Default: Instructs the server to use the column's defined DEFAULT value.
* mysqli_indicator::Ignore: Tells the server to "ignore" this column for the current iteration (equivalent to not including the column in a specific INSERT or UPDATE row).
=== Proposed syntax ===
mysqli_stmt::execute_many
Exposes the bulk execution capability to PHP. It functions similarly to bind_param() but operates on an iterable collection of data. This method supports two distinct operation modes: Implicit Autodetection and Strict-Typed Optimization.
Syntax:
Object oriented style:
public mysqli_stmt::execute_many(iterable $data, ?iterable $control = null, ?string $types = null): bool
Procedural style:
public mysqli_stmt_execute_many(mysqli_stmt $stmt, iterable $data, ?iterable $control = null, ?string $types = null): bool
Parameters:
* $data: An iterable (array or Generator) containing the source values.
* $control (Optional): An iterable used as a transformation template or mask. It allows for non-destructive data overrides without modifying the source $data.
* Global Mode: If $control contains a single row, that row is applied as a mask to every iteration of $data.
* Streaming Mode: If $control contains multiple rows, it is iterated in parallel with $data.
* $types (Optional): A string specifying the binary types for the corresponding data columns:
* 1: 8-bit integer
* 2: 16-bit integer
* 4: 32-bit integer
* 8/i: 64-bit integer
* d: double
* s: string/blob
* If omitted: The driver defaults to the "All-Strings" path where all scalar values are implicitly cast to strings via server-side implicit casting.
**Return Values:**
Returns ''true'' on success or ''false'' on failure.
**Execution Behavior and Atomicity**
To guarantee atomicity across all protocols and transactional engines, it is highly recommended to wrap execute_many() in an explicit transaction. This ensures an "all-or-nothing" execution regardless of whether the driver is using native bulk protocols or the fallback emulation.
begin_transaction();
try {
$stmt->execute_many($data);
$mysqli->commit();
} catch (mysqli_sql_exception $e) {
$mysqli->rollback();
throw $e;
}
?>
**Notes:**
* By explicitly defining the integer width, mysqlnd can optimize the internal buffer allocation and minimize the data size sent over the wire. For datasets with millions of rows, using 1 (TINYINT) instead of the standard 64-bit integer can result in an 87.5% reduction in memory usage for that specific column's buffer during the transfer.
* On non-transactional engines like MyISAM, explicit transactions have no effect. In these cases, a failure during a bulk operation will result in partial data insertion (rows processed before the error will remain in the table).
Example:
query("DROP TABLE IF EXISTS t1");
$link->query("CREATE TABLE t1 (
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(50),
score DOUBLE DEFAULT 20.0
)");
/* 1. Simple Bulk Insert (Autodetection Mode) */
$rows = [
["Jennifer", 20.34],
["Marcus", 21.9],
["Zak", 20.91]
];
$stmt = $link->prepare("INSERT INTO t1 (name, score) VALUES (?, ?)");
$stmt->execute_many($rows);
/* 2. Bulk Update using $control for DEFAULT values /
/ We want to reset specific users to the table default score */
$update_data = [
[25.5, 1], // Update ID 1 to 25.5
[0.0, 2] // Placeholder for ID 2 (will be overridden by control)
];
/* Control mask: Row 0 col 0 uses data. Row 1 col 0 uses DEFAULT */
$control = [
[mysqli_indicator::None, mysqli_indicator::None],
[mysqli_indicator::Default, mysqli_indicator::None]
];
$stmt_upd = $link->prepare("UPDATE t1 SET score = ? WHERE id = ?");
/* Using named parameters to pass control while skipping types */
$stmt_upd->execute_many(data: $update_data, control: $control);
printf("%d rows updated\n", $stmt_upd->affected_rows);
/* 3. Streaming with explicit types for performance */
function user_generator() {
yield ["Rasmus", 99.9];
yield ["Derick", 98.5];
}
$stmt_ins = $link->prepare("INSERT INTO t1 (name, score) VALUES (?, ?)");
$stmt_ins->execute_many(data: user_generator(), types: "sd");
$link->close();
?>
=== Streaming Support: A Secure Alternative to LOAD DATA LOCAL INFILE ===
In many production environments, the ''LOAD DATA LOCAL INFILE'' command is disabled (''local_infile=0'') to mitigate security risks, such as unauthorized local file reads by a compromised client. While this secures the server, it often leaves developers with the slow alternative of executing thousands of individual ''INSERT'' statements.
''execute_many()'' provides a high-performance, secure alternative that operates over the standard Prepared Statement protocol. By accepting an **iterable**, it allows for "Streaming" data directly into the database.
** Key Advantages:**
* **Memory Efficiency**: By using a PHP **Generator**, only one row of data resides in PHP's memory at a time. This allows for the processing of files with a constant, minimal memory footprint.
* **Format Versatility**: While ''LOAD DATA'' is generally limited to flat text files (CSV/TSV), ''execute_many()'' can stream data from any source. This includes:
* **XML/JSON**: Parsing large documents via ''XMLReader'' or a JSON machine.
* **Remote APIs**: Fetching and inserting records from a REST or SOAP endpoint in chunks.
* **Binary Streams**: Processing custom data formats or encrypted streams.
* **Programmable Filtering**: Because the data is processed in a PHP loop before being "yielded" to the driver, developers can perform complex validation, data cleaning, or conditional skipping natively in PHP.
The following snippet demonstrates how a Generator acts as a programmable filter, replacing the need for row-level "Ignore" indicators:
prepare("INSERT INTO users (id, name) VALUES (?, ?)");
$stmt->execute_many(csv_streamer('large_data_export.csv'));
?>
==== Progress Indication ====
Long-running SQL statements such as large ALTER TABLE, CREATE INDEX or complex UPDATE operations can take a significant amount of time to complete. Traditionally, clients have no insight into the progress of such operations and can only wait for completion or timeout.
MariaDB provides server-side progress reporting, allowing the server to send progress information to the client while a statement is executing. This information includes:
* Current progress value
* Maximum progress value
* Stage or operation context
Progress information is sent asynchronously during statement execution and does not alter the semantics of the SQL statement itself.
Key characteristics:
* Progress messages are sent out-of-band while a statement is running.
* They do not interrupt or terminate the executing statement.
* Clients may ignore progress messages without affecting correctness.
* Progress reporting is optional and enabled only when both client and server support it.
From a PHP perspective, exposing this feature allows applications to:
* Provide feedback to users for long-running operations
* Implement progress bars or logging for batch jobs
* Detect stalled operations earlier
* Improve observability without polling or additional queries
Example pseudo-code:
options(MYSQLI_OPT_PROGRESS_CALLBACK, 'callback_progress');
$mysqli->query("ALTER TABLE big_table ADD INDEX idx_col (col)");
?>
==== PDO Considerations (Out of Scope) ====
PDO_MySQL currently does not support array binding or bulk execution semantics required to expose MariaDB-specific bulk operations in a clean and consistent way.
This RFC focuses on enhancing mysqlnd and mysqli first. Future RFCs may extend PDO_MySQL once the underlying mysqlnd capabilities are available and an appropriate PDO API design is agreed upon.
===== Implementation / Proof-of-Concept =====
A proof-of-concept is available on [[https://github.com/9EOR9/php-src/tree/8.6-mariadb|github]]
The implementation modifies:
mysqlnd
* to support COM_STMT_BULK_EXECUTE
* to detect MariaDB server
* to cache metadata (prepared statements) (PR)
* to support extended field types (PR)
mysqli
* to expose execute_many()
* Indicator handling in parameter binding
* Protocol-level result processing for bulk operations
* Metadata caching (PR)
* Support for extended metadata (PR)
The patch is intended for PHP 8.6 and requires MariaDB Server ≥ 10.2 for full functionality.
Fallback behavior for MySQL Server remains unchanged and continues to use repeated COM_STMT_EXECUTE.
===== Performance Analysis: execute_many() vs. sequential execute() =====
It is impossible to say that ''execute_many()'' is "n-times" faster than a sequential ''execute()'' loop as a universal rule. The performance delta is highly dependent on two primary factors: hardware processing power and network latency.
For this analysis, two distinct hardware environments were used:
* Localhost (High-Performance): Server and Client on an Intel i7-13700HX (16C/24T) with Samsung PCIe 4.0 NVMe storage. Connection via loopback, representing near-zero latency.
* Remote Client (Resource-Constrained): A Raspberry Pi 4 (Cortex-A72, 4GB RAM) booting from USB Mass Storage. Connected via Gigabit Ethernet to the 10.1 server.
==== Comparative Measurements ====
^ Operation Type ^ Environment ^ execute_many() (rows/sec) ^ execute() Loop (rows/sec) ^ Improvement Factor ^
| INSERT | Remote (10.23 → 10.1) | 588,859.83 | 5,555.30 | 106.0x |
| UPDATE | Remote (10.23 → 10.1) | 153,462.00 | 4,027.00 | 38.1x |
| DELETE | Remote (10.23 → 10.1) | 207,079.00 | 3,719.00 | 55.6x |
| INSERT | Localhost (10.1) | 740,103.14 | 83,165.25 | 8.9x |
| UPDATE | Localhost (10.1) | 182,057.00 | 73,294.00 | 2.5x |
| DELETE | Localhost (10.1) | 213,617.00 | 74,263.00 | 2.9x |
==== Analysis of Performance Factors ====
The measurements reveal how these two variables determine the "efficiency gap."
=== 1. Network Latency (The Round-Trip Penalty) ===
The most dramatic performance gains are observed in the remote Ethernet environment. While ''execute_many()'' performance remains remarkably high even on the Raspberry Pi, the sequential loop suffers a catastrophic penalty over the network.
Remote Insert: 5,555 rows/sec
Localhost Insert: 83,165 rows/sec
In a loop, the application is IO-bound, forced to wait for a network acknowledgment for every single row. ''execute_many()'' mitigates this by batching thousands of rows into fewer network packets, effectively negating the Round-Trip Time (RTT) bottleneck.
=== 2. Hardware and Protocol Overhead ===
The 106x improvement on the Raspberry Pi demonstrates that batching is not just about the network, but also about reducing CPU interrupts and context switching on low-power hardware.
Even on localhost with a high-end i7-13700HX and NVMe storage, ''execute_many()'' provides a ~9x speedup for INSERTS.
This proves that reducing the driver-level overhead of repeated function calls and internal buffer management provides a significant performance floor, allowing the application to finally saturate modern PCIe 4.0 storage throughput.
==== Conclusion ====
For remote clients or IoT hardware like the Raspberry Pi, ''execute_many()'' is a transformative optimization, offering over 100x speedup for data ingestion. The data confirms that as network latency increases, the necessity of batching becomes the single most important factor in database scalability.
===== Visibility and Impacts =====
MariaDB-specific features can be categorized based on whether they require changes to the PHP API or are handled entirely internally by mysqlnd or the PHP extensions.
==== Visible Features ====
These features require new methods, functions, or options in PHP to expose MariaDB-specific functionality to developers:
| Feature | PHP Extension | Description | Required API Change |
| Server capabilities | mysqli | Additional connection properties | New: $link->server_capabilities and $link->ext_server_capabilities|
| Bulk Execution | mysqli | Execute multiple sets of parameters efficiently in one round-trip | $stmt->execute_many() |
| Progress Indication | mysqli | Report progress of long-running operations | Additional parameter MYSQLI_OPT_PROGRESS_CALLBACK for mysqli_options |
Impact: Applications can call these new methods directly to leverage MariaDB-specific performance or monitoring features. Backward compatibility with MySQL is maintained because these methods can be no-ops when connected to a MySQL server.
==== Invisible Features ====
These features will be submitted as separate PRs e:
| Feature | PHP Extension | Description | Notes |
| Prepared Statement Metadata Caching | mysqlnd | Avoids resending metadata for repeated executes | Transparent to application code |
| Authentication Plugins | mysqlnd | Supports MariaDB authentication plugins (e.g., parsec) | Applications use standard connection functions |
| Extended Metadata / Type Granularity | mysqli | maps several types to MYSQL_TYPE_*, e.g. JSON and VECTOR | Transparent to application code |
===== Impacts =====
Backward Compatibility: All invisible features are fully backward-compatible with MySQL. Visible features are MariaDB-specific but can be safely ignored or no-oped when connected to a MySQL server.
=== MySQL Connectivity and Emulation ===
To maintain a unified API across the mysqli ecosystem, execute_many() includes a transparent fallback for servers that do not support the MariaDB COM_STMT_BULK_EXECUTE protocol (e.g., standard MySQL).
* Execution Logic: On non-supported connections, mysqlnd emulates the behavior by iterating through the dataset and issuing individual COM_STMT_EXECUTE commands. This ensures that code written for execute_many() remains portable.
* Atomicity Impact:
* * Native: Atomic at the protocol level (the whole packet succeeds or fails).
* * Emulated: Non-atomic in autocommit mode (each row is a separate transaction). Users requiring consistent atomic behavior across platforms should utilize explicit transactions (begin_transaction()).
* Indicator constraints: The ::Default and ::Ignore cases will trigger a mysqli_sql_exception on MySQL connections. This is a deliberate design choice to avoid the significant performance penalty and complexity of a driver-side SQL parser required to "guess" server-side column defaults.
===== Backward Incompatible Changes =====
None
===== RFC Impact =====
==== SAPIs ====
There is no impact on existing SAPIs. The changes are confined to the ''mysqlnd'' driver and the ''mysqli'' extension.
==== Frameworks and Applications ====
Backward Compatibility: Existing applications using ''mysqli'' or ''PDO_MySQL'' will see no change in behavior when connected to MySQL or older MariaDB servers.
New Capabilities: Frameworks (like Doctrine) can implement high-speed bulk inserts by detecting the ''execute_many()'' method, providing a standardized path for MariaDB-specific optimizations that previously required raw SQL hacks or ''LOAD DATA'' workarounds.
==== Backward Incompatible Changes ====
None. This RFC adds new methods and internal optimizations. It does not modify the signature of existing functions or change default connection behaviors.
==== Performance Impact ====
Positive: Significant performance gains for bulk operations (up to 100x speedup in high-latency environments).
Minimal Overhead: The server detection logic occurs only once during the connection handshake, introducing negligible overhead.
==== Memory Consumption ====
execute_many(): When used with a Generator, memory consumption constant, as only the current row is buffered. This is significantly more efficient than manual batching in PHP, which often involves building large arrays or long SQL strings.
==== Security ====
execute_many() improves security posture by providing a high-performance alternative to ''LOAD DATA LOCAL INFILE'', allowing administrators to keep ''local_infile'' disabled without sacrificing ingestion speed.
===== Open Issues =====
===== Future Scope =====
==== Unified PDO Bulk Interface ====
While this RFC focuses on the mysqli extension and the underlying mysqlnd protocol logic, the binary foundation is designed for forward-compatibility. A future PDO Bulk RFC could leverage this implementation to provide a unified PDOStatement::execute_many() interface. This would allow developers to write driver-agnostic bulk logic while benefiting from MariaDB’s high-performance binary protocol under the hood.
===== Implementation =====
===== References =====
* [[https://mariadb.com/docs/server/reference/clientserver-protocol|MariaDB Client/Server Protocol]]
* [[https://mariadb.com/docs/server/reference/clientserver-protocol/mariadb-protocol-differences-with-mysql| MariaDB Protocol Differences with MySQL]]
* [[https://github.com/9EOR9/php-mariadb-rfc|Benchmarks and results]]
* [[https://github.com/9EOR9/php-src/tree/8.6-mariadb|Proof of concept]]
===== Rejected Features =====
===== Changelog =====
* 2026-04-07: Updated benchmark results
* 2026-04-07: removed performance improvement by execute_direct (this doesn't seem to work with php streams yet)
* 2026-04-10: Removed invisible features - they will be implemented and submitted by PR
* 2026-04-10: Moved indicator variables from mysqlnd to mysqli
* 2026-04-11: Added future scope section (PDO )
* 2026-04-14: Updated execute_many signature with $control parameter and named parameter support.
* 2026-04-15: Added note about atomicity in execute_many()