PHP RFC: Add data encoding API

Version: 1.5
Date: 2025-11-05
Author: Ignace Nyamagana Butera, nyamsprod@gmail.com
Status: Under Discussion
First Published at: https://wiki.php.net/rfc/data_encoding_api

Introduction

To improve interoperability between PHP and other programming languages, and to simplify data encoding in PHP, we propose adding native support for encoding and decoding data using:

the family of RFC 4648 algorithms (Base16, Base32, and Base64)
the family of Base58 algorithms
the family of Base85 algorithms

Currently, PHP only supports a limited subset of RFC 4648. With this RFC, we aim to provide full compliance with the standard and introduce missing yet popular encoding algorithms for developers.

Downsides of the current approach

PHP provides partial support for Base64 via the base64_encode and base64_decode functions but they do not provide:

support for base64 URL alphabet and specific settings
support for base64 IMAP alphabet and specific settings
support for padding character removal during encoding
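
Until such support exists, these behaviours are typically recreated in user-land on top of the existing functions. The following is a minimal sketch of that workaround for the URL-safe alphabet; the helper names base64_url_encode and base64_url_decode are hypothetical and are not part of PHP or of this RFC.

```php
<?php

// Hypothetical user-land helpers; the names are illustrative only.

/** Encode with the RFC 4648 URL-safe alphabet and drop the "=" padding. */
function base64_url_encode(string $data): string
{
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

/** Decode URL-safe input after restoring the standard alphabet and padding. */
function base64_url_decode(string $encoded): string
{
    $standard = strtr($encoded, '-_', '+/');
    $standard .= str_repeat('=', (4 - strlen($standard) % 4) % 4);

    $decoded = base64_decode($standard, true);
    if ($decoded === false) {
        throw new InvalidArgumentException('Invalid Base64 URL-safe input.');
    }

    return $decoded;
}

echo base64_url_encode("\xFF\xFF"), PHP_EOL;     // __8
echo bin2hex(base64_url_decode('__8')), PHP_EOL; // ffff
```

Every project that needs this ends up carrying its own copy of such helpers, which is the duplication the proposal aims to remove.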

PHP provides partial support for Base16 via the bin2hex and hex2bin functions but they do not provide:

support for strict decoding mechanism
support for strict encoding (PHP uses lowercased letters whereas the RFC recommends using uppercased letters)
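
The gap can be illustrated with the existing functions. In the sketch below, strict_upper_hex_decode is a hypothetical user-land helper (not part of PHP or of this RFC) that layers an uppercase-only check on top of hex2bin().

```php
<?php

/**
 * Hypothetical helper: accept only 0-9 and A-F, in complete pairs,
 * before delegating to the lenient hex2bin().
 */
function strict_upper_hex_decode(string $encoded): string
{
    if (strlen($encoded) % 2 !== 0 || preg_match('/\A[0-9A-F]*\z/', $encoded) !== 1) {
        throw new InvalidArgumentException('Input is not uppercase Base16.');
    }

    return (string) hex2bin($encoded);
}

var_dump(hex2bin('48656c6c6f') === hex2bin('48656C6C6F')); // bool(true): both casings are accepted
echo strtoupper(bin2hex('Hello')), PHP_EOL;                  // 48656C6C6F
echo strict_upper_hex_decode('48656C6C6F'), PHP_EOL;         // Hello
```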

PHP currently lacks native support for the Base32, Base58 and Base85 encoding and decoding algorithms.

In addition to the absence of these algorithms, the ecosystem suffers from a fragmented landscape of user-land packages, many of which claim compliance with one algorithm or another without clearly specifying which variant is being referred to or implemented. This lack of a consistent reference becomes problematic when applications rely on a specific algorithm for processing incoming or outgoing data, and it makes working with data encoding in PHP more complex than necessary.

PHP currently offers Base58 encoding and decoding via a PECL extension. This proposal seeks to integrate these functions directly into the PHP core, with additional support for the Flickr variant of Base58.

Although Base58 is not defined in a dedicated IETF RFC, it has seen widespread adoption in production systems, most notably in Bitcoin and other cryptocurrencies for encoding addresses and keys, as well as in platforms like Flickr for generating compact, URL-safe identifiers. Compared to Base64, Base58 yields slightly longer output, but it avoids visually ambiguous characters (such as 0, O, I, and l) and is inherently safe for use in URLs without additional encoding. The Base58 algorithm is simple, deterministic, and has remained stable and well-understood in the software ecosystem for over a decade.

Base85 (Ascii85) is a compact, widely used binary-to-text encoding. Because it encodes four bytes as five characters, it is more space-efficient than Base64 for binary-heavy data such as checksums, digital signatures, or compact serialization formats. Given PHP’s growing role in systems integration and tooling (e.g., Composer, code profilers, Git interactions), native Base85 support would make such use cases cleaner and faster.

Last but not least, PHP lacks support for constant-time encoding and decoding, which can be critical for security-related processes.

The goal of this RFC is to propose adding the encoding and decoding functionalities defined in RFC 4648 to the PHP standard library as well as Base58 and Base85. It also introduces a native, constant-time implementation to address security concerns in data encoding. Once adopted, this feature will simplify data encoding in PHP, enhance interoperability with other programming languages, and strengthen security within the PHP ecosystem.

Proposal

A new, always available Encoding namespace will be added to the standard library. The namespace will contain classes and functions for encoding and decoding string or byte sequences.

For this purpose, the following internal classes and functions are added:

namespace Encoding {
    class EncodingError extends \Error
    {
    }
 
    class EncodingException extends \Exception
    {
    }
 
    class UnableToDecodeException extends EncodingException
    {
    }
 
    class UnableToEncodeException extends EncodingException
    {
    }
 
    enum Base16
    {
        case Upper;
        case Lower;
    }
 
    enum Base32
    {
        case Ascii;
        case Hex;
        case Crockford;
        case Z;
    }
 
    enum Base58
    {
        case Bitcoin;
        case Flickr;
    }
 
    enum Base64
    {
        case Standard;
        case UrlSafe;
        case Imap;
    }
 
    enum Base85
    {
        case Adobe;
        case Z85;
        case Git;
    }
 
    enum PaddingMode
    {
        case VariantControlled;
        case StripPadding;
        case PreservePadding;
    }
 
    enum DecodingMode
    {
        case Forgiving;
        case Strict;
    }
 
    enum TimingMode
    {
        case Variable;
        case Constant;
    }
}

The following Base16 functions are added:

namespace Encoding {
    /**
     * @throws UnableToEncodeException
     */
    function base16_encode(
        string $data,
        Base16 $variant = Base16::Upper,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
 
    /**
     * @throws UnableToDecodeException
     */
    function base16_decode(
        string $data,
        Base16 $variant = Base16::Upper,
        DecodingMode $decodingMode = DecodingMode::Strict,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
}

The following Base32 functions are added:

namespace Encoding {
    /**
     * @throws UnableToEncodeException
     */
    function base32_encode(
        string $data,
        Base32 $variant = Base32::Ascii,
        PaddingMode $paddingMode = PaddingMode::VariantControlled,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
 
    /**
     * @throws UnableToDecodeException
     */
    function base32_decode(
        string $data,
        Base32 $variant = Base32::Ascii,
        DecodingMode $decodingMode = DecodingMode::Strict,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
}

The following Base58 functions are added:

namespace Encoding {
    /**
     * @throws UnableToEncodeException
     */
    function base58_encode(
        string $data,
        Base58 $variant = Base58::Bitcoin,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
 
    /**
     * @throws UnableToDecodeException
     */
    function base58_decode(
        string $data,
        Base58 $variant = Base58::Bitcoin,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
}

The following Base64 functions are added:

namespace Encoding {
    /**
     * @throws UnableToEncodeException
     */
    function base64_encode(
        string $data,
        Base64 $variant = Base64::Standard,
        PaddingMode $paddingMode = PaddingMode::VariantControlled,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
 
    /**
     * @throws UnableToDecodeException
     */
    function base64_decode(
        string $data,
        Base64 $variant = Base64::Standard,
        DecodingMode $decodingMode = DecodingMode::Strict,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
}

The following Base85 functions are added:

namespace Encoding {
    /**
     * @throws UnableToEncodeException
     */
    function base85_encode(
        string $data,
        Base85 $variant,
        PaddingMode $paddingMode = PaddingMode::VariantControlled,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
 
    /**
     * @throws UnableToDecodeException
     */
    function base85_decode(
        string $data,
        Base85 $variant,
        TimingMode $timingMode = TimingMode::Variable,
    ): string;
}

API Design

The RFC chooses to use a functions-based API instead of a class-based API for the following reasons:

most PHP scripts use encoding in a one-off fashion, and using a class-based API would feel overly complicated for a quick encode or decode operation
using functions emphasises that encoding/decoding operations have no internal state or side effects
creating a class-based API on top of a function-based API, in user-land, is trivial

The RFC chooses to use enum-based options rather than boolean or arbitrary string values to improve readability, static analysis and developer experience when using the API.

The general signature semantic chosen for each algorithm is the following:

For encoding:

/**
 * @throws UnableToEncodeException
 */
function algo_encode(string $data, Enum ...$options): string;

For decoding:

/**
 * @throws UnableToDecodeException
 */
function algo_decode(string $data, Enum ...$options): string;

where:

algo is the name of the underlying encoding algorithm.
$options is a list of options, represented by Enum instances, which MAY be encoding specific.

When decoding fails, an UnableToDecodeException is thrown. In non-strict mode, some tolerance toward the encoded string is allowed, but decoding can still throw an UnableToDecodeException if the string remains invalid after the tolerant adjustments have been applied. Similarly, if a string cannot be encoded, an UnableToEncodeException is thrown. The EncodingError class is added for completeness.

Parameters

String Parameters

$data : the string to encode or decode;

Options

Variant support

Base encodings support a range of alphabets and extra configurations that can collectively be referred to as variants. The following enums are introduced to help developers choose the correct variant to use.

Base16 Variants

<?php
 
enum Base16
{
   case Upper;
   case Lower;
}

Base16 does not define multiple alphabets, but it can be encoded using either uppercase or lowercase letters.

The default variant is Base16::Upper because RFC 4648 defines the Base16 alphabet using uppercase letters.

Base32 Variants

<?php
 
enum Base32
{
   case Ascii;
   case Hex;
   case Crockford;
   case Z;
}

Base32 supports multiple variants, and we provide the most common ones out of the box:

Ascii : the RFC 4648 Standard variant (case sensitive)
Hex : the RFC 4648 Hexadecimal variant (case sensitive)
Crockford: the Douglas Crockford Base32 variant (case insensitive)
Z: the Z-base-32 variant (case sensitive)

The default variant is Base32::Ascii because RFC 4648 defines the Base32 alphabet using uppercase letters.
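
To make the difference between the variants concrete, here is a minimal user-land sketch in which only the alphabet changes from one variant to the next. The constant BASE32_ALPHABETS and the function base32_encode_sketch are hypothetical names, the variant is selected with a plain string instead of the proposed enum, and padding is left to the caller; this is not the proposed implementation.

```php
<?php

// Alphabets of the four variants (hypothetical constant name).
const BASE32_ALPHABETS = [
    'Ascii'     => 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
    'Hex'       => '0123456789ABCDEFGHIJKLMNOPQRSTUV',
    'Crockford' => '0123456789ABCDEFGHJKMNPQRSTVWXYZ',
    'Z'         => 'ybndrfg8ejkmcpqxot1uwisza345h769',
];

function base32_encode_sketch(string $data, string $variant = 'Ascii', bool $pad = true): string
{
    $alphabet = BASE32_ALPHABETS[$variant];
    $output = '';
    $buffer = 0; // bit accumulator
    $bits = 0;   // number of pending bits in the accumulator

    for ($i = 0, $length = strlen($data); $i < $length; $i++) {
        $buffer = ($buffer << 8) | ord($data[$i]);
        $bits += 8;
        while ($bits >= 5) {
            $bits -= 5;
            $output .= $alphabet[($buffer >> $bits) & 0x1F];
        }
        $buffer &= (1 << $bits) - 1; // keep only the pending bits
    }

    if ($bits > 0) {
        // Left-align the remaining bits in a final 5-bit group.
        $output .= $alphabet[($buffer << (5 - $bits)) & 0x1F];
    }

    if ($pad) {
        $output .= str_repeat('=', (8 - strlen($output) % 8) % 8);
    }

    return $output;
}

echo base32_encode_sketch('foobar'), PHP_EOL;                          // MZXW6YTBOI======
echo base32_encode_sketch('foobar', 'Hex'), PHP_EOL;                   // CPNMUOJ1E8======
echo base32_encode_sketch('Hello World', 'Crockford', false), PHP_EOL; // 91JPRV3F41BPYWKCCG
```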

Base58 Variants

<?php
 
enum Base58
{
   case Bitcoin;
   case Flickr;
}

Base58 supports multiple variants, and we provide the most common ones out of the box:

Bitcoin : the base58 Bitcoin variant (case sensitive)
Flickr : the Flickr variant (case sensitive)

The default variant is Base58::Bitcoin as it is the most used Base58 variant. Of note, the only difference between the Bitcoin and the Flickr variants is the order of the characters in the alphabet.
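
The sketch below illustrates that the two variants share one algorithm and differ only in their alphabet. BASE58_ALPHABETS and base58_encode_sketch are hypothetical user-land names, the variant is chosen with a plain string instead of the proposed enum, and the code makes no attempt at constant-time behaviour.

```php
<?php

// Alphabets of the two variants (hypothetical constant name).
const BASE58_ALPHABETS = [
    'Bitcoin' => '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz',
    'Flickr'  => '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ',
];

function base58_encode_sketch(string $data, string $variant = 'Bitcoin'): string
{
    $alphabet = BASE58_ALPHABETS[$variant];
    $length = strlen($data);

    // Every leading zero byte maps to the first character of the alphabet.
    $zeros = 0;
    while ($zeros < $length && $data[$zeros] === "\x00") {
        $zeros++;
    }

    // Base58 digits of the remaining bytes, least significant digit first.
    $digits = [];
    for ($i = $zeros; $i < $length; $i++) {
        $carry = ord($data[$i]);
        for ($j = 0, $count = count($digits); $j < $count; $j++) {
            $carry += $digits[$j] * 256;
            $digits[$j] = $carry % 58;
            $carry = intdiv($carry, 58);
        }
        while ($carry > 0) {
            $digits[] = $carry % 58;
            $carry = intdiv($carry, 58);
        }
    }

    $output = str_repeat($alphabet[0], $zeros);
    for ($j = count($digits) - 1; $j >= 0; $j--) {
        $output .= $alphabet[$digits[$j]];
    }

    return $output;
}

echo base58_encode_sketch(hex2bin('0000287fb4cd')), PHP_EOL;           // 11233QC4
echo base58_encode_sketch(hex2bin('0000287fb4cd'), 'Flickr'), PHP_EOL; // 11233pc4
```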

Base64 Variant

<?php
 
enum Base64
{
   case Standard;
   case UrlSafe;
   case Imap;
}

Base64 supports multiple variants, and we provide the most common ones out of the box. All Base64 variants are case-sensitive.

Standard: the RFC 4648 Standard variant
UrlSafe: the RFC 4648 URL and Filename Safe variant
Imap: the RFC 3501 IMAP variant

The default variant is Base64::Standard, the standard (non URL-safe) Base64 alphabet defined in RFC 4648.

Base85 Variant

<?php
 
enum Base85
{
   case Adobe;
   case Z85;
   case Git;
}

Base85 supports multiple variants, and we provide the most common ones out of the box. All Base85 variants are case-sensitive.

Adobe: the original implementation used by Adobe in PDF
Z85: the variant specified by the ZeroMQ Z85 RFC
Git: the Git variant

No default variant is chosen. The variant must always be passed explicitly when calling the functions.
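
As an illustration of how far the variants diverge, the following sketch encodes the same bytes with the Adobe (Ascii85) and Z85 alphabets. The names base85_encode_sketch and BASE85_Z85_ALPHABET are hypothetical. The sketch assumes a 64-bit PHP build, only accepts input whose length is a multiple of four bytes, and deliberately leaves out the Adobe <~ ~> delimiters and the "z" shortcut for all-zero groups.

```php
<?php

// Z85 alphabet as published by ZeroMQ (hypothetical constant name).
const BASE85_Z85_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#';

function base85_encode_sketch(string $data, bool $z85 = false): string
{
    if ($data === '') {
        return '';
    }
    if (strlen($data) % 4 !== 0) {
        throw new InvalidArgumentException('This sketch only handles lengths that are a multiple of 4.');
    }

    $output = '';
    foreach (str_split($data, 4) as $chunk) {
        // Read the group as a 32-bit big-endian unsigned integer.
        $value = unpack('N', $chunk)[1];
        $group = '';
        for ($i = 0; $i < 5; $i++) {
            $digit = $value % 85;
            $value = intdiv($value, 85);
            // Adobe maps digit d to chr(d + 33); Z85 uses its own table.
            $group = ($z85 ? BASE85_Z85_ALPHABET[$digit] : chr($digit + 33)) . $group;
        }
        $output .= $group;
    }

    return $output;
}

echo base85_encode_sketch('Hello world!'), PHP_EOL;                      // 87cURD]j7BEbo80
echo base85_encode_sketch('Hello world!', true), PHP_EOL;                // nm=QNzY<mxA+]nf
echo base85_encode_sketch(hex2bin('864fd26fb559f75b'), true), PHP_EOL;   // HelloWorld
```

Because the same input yields unrelated strings depending on the alphabet, requiring callers to name the variant avoids silently picking the wrong one.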

Padding presence during encoding

<?php
 
enum PaddingMode
{
  case VariantControlled;
  case StripPadding;
  case PreservePadding;
}

Base32, Base64 and Base85 use a padding character. The padding character has a technical role. It ensures that the encoded output represents complete blocks of data and allows the decoder to reconstruct the original binary input unambiguously. But to improve readability or interoperability, some variants have chosen to not include them in the result of their encoding process. This option MUST tell the encoding mechanism if the padding character needs to be present or not at the end of the encoding process, when applicable.

Values

VariantControlled — Padding is included or omitted according to the rules defined by the selected variant.
StripPadding — Padding characters are removed from the encoded output.
PreservePadding — Padding characters are retained in the encoded output.

Rules

If the selected variant does not support padding and PaddingMode::PreservePadding is specified, a ValueError MUST be thrown.
If the selected variant requires padding and PaddingMode::StripPadding is specified, a ValueError MUST be thrown.

The default padding mode is PaddingMode::VariantControlled, indicating that the padding character is added only when mandated by the chosen variant.
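
The values and rules above can be summarised in a small decision function. This is a sketch under stated assumptions: should_emit_padding is a hypothetical helper, and the three booleans describing the variant (whether it supports padding, pads by default, or requires padding) are modelling choices of this example, not part of the proposed API.

```php
<?php

enum PaddingMode
{
    case VariantControlled;
    case StripPadding;
    case PreservePadding;
}

/**
 * Hypothetical helper: decides whether padding is emitted for a variant.
 */
function should_emit_padding(
    PaddingMode $mode,
    bool $variantSupportsPadding,
    bool $variantPadsByDefault,
    bool $variantRequiresPadding = false,
): bool {
    if ($mode === PaddingMode::PreservePadding && !$variantSupportsPadding) {
        throw new ValueError('The selected variant does not support padding.');
    }
    if ($mode === PaddingMode::StripPadding && $variantRequiresPadding) {
        throw new ValueError('The selected variant requires padding.');
    }

    return match ($mode) {
        PaddingMode::VariantControlled => $variantPadsByDefault,
        PaddingMode::PreservePadding => true,
        PaddingMode::StripPadding => false,
    };
}

// A variant that pads by default but tolerates stripping, such as Base32::Ascii.
var_dump(should_emit_padding(PaddingMode::VariantControlled, true, true)); // bool(true)
var_dump(should_emit_padding(PaddingMode::StripPadding, true, true));      // bool(false)
```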

Decoding Mode

<?php
 
enum DecodingMode
{
  case Forgiving;
  case Strict;
}

For decoding functions that accept a $decodingMode argument, you can specify how decoding is performed. By default, the $decodingMode is set to DecodingMode::Strict, meaning the algorithm strictly follows the rules defined by the RFC. Alternatively, you can set $decodingMode to DecodingMode::Forgiving. In this mode, several adjustments are applied to the $data string before the actual decoding process begins:

When applicable, the $data string is converted into the correct character casing.
When applicable, the padding length is corrected to allow correct decoding.

Independent of the mode:

The alphabet is treated as a sequence of byte values without any special treatment for multi-byte UTF-8.
The following characters are ignored during the decoding process: \r, \t, \n and the space character.
The $data string should be protected against the presence of NULL bytes.

Although the forgiving decoding mode is available, it is intentionally restricted to account for the security considerations outlined in section 12 of RFC 4648.

The default decoding mode is DecodingMode::Strict.
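
The adjustments described above can be pictured as a pre-processing step that runs before the actual decoding. The sketch below does this for the RFC 4648 Base32 alphabet; normalize_base32_input is a hypothetical helper name, a boolean stands in for the proposed DecodingMode enum, and rejecting NULL bytes with an exception is only one possible reading of the protection mentioned above.

```php
<?php

/**
 * Hypothetical pre-processing step for a Base32 (RFC 4648 alphabet) decoder.
 */
function normalize_base32_input(string $data, bool $forgiving): string
{
    if (str_contains($data, "\0")) {
        throw new InvalidArgumentException('NULL bytes are not allowed.');
    }

    // Whitespace is ignored in both modes.
    $data = str_replace(["\r", "\t", "\n", ' '], '', $data);

    if ($forgiving) {
        // Fix the casing, then rebuild the padding to a multiple of 8 characters.
        $data = strtoupper(rtrim($data, '='));
        $data .= str_repeat('=', (8 - strlen($data) % 8) % 8);
    }

    return $data;
}

echo normalize_base32_input("jbswy3dp ebLW64TM\nMQ", true), PHP_EOL; // JBSWY3DPEBLW64TMMQ======
echo normalize_base32_input("JBSW Y3DP", false), PHP_EOL;             // JBSWY3DP
```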

Timing generation mode

<?php
 
enum TimingMode
{
   case Variable;
   case Constant;
}

In some cases, for security reasons, you may prefer to use a more secure algorithm to prevent information leakage during the encoding or decoding process. Since different algorithms can have varying processing times, an optional enum is proposed to allow developers to opt into a more secure approach. For now, a constant-time generation algorithm is provided alongside the standard implementation, which does not protect against timing attacks. Depending on the implementation, this option may not be available for all encoding algorithms.

The default timing mode is TimingMode::Variable.
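
To give an idea of what a constant-time variant looks like, the sketch below encodes Base16 without any table lookup or branch that depends on the data: each nibble is turned into its character with arithmetic only. The function name base16_encode_constant_time is hypothetical, and user-land PHP cannot strictly guarantee constant-time execution, so this only illustrates the technique a native implementation could use.

```php
<?php

/**
 * Hypothetical constant-time style uppercase Base16 encoder.
 */
function base16_encode_constant_time(string $data): string
{
    $output = '';
    for ($i = 0, $length = strlen($data); $i < $length; $i++) {
        $byte = ord($data[$i]);
        $high = $byte >> 4;
        $low = $byte & 0x0F;

        // 55 + n gives 'A'..'F' for n in 10..15. For n in 0..9, (n - 10) >> 8
        // is -1, so the mask evaluates to -7 and the result lands on '0'..'9'.
        $output .= chr(55 + $high + ((($high - 10) >> 8) & ~6));
        $output .= chr(55 + $low + ((($low - 10) >> 8) & ~6));
    }

    return $output;
}

echo base16_encode_constant_time('Hello world!'), PHP_EOL; // 48656C6C6F20776F726C6421
```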

Error conditions

ValueError thrown when:

The combination of options is invalid.
The input or output parameters are of incorrect type.
The input length violates variant constraints.

An UnableToEncodeException is thrown during encoding when the operation fails due to malformed or corrupted data. An UnableToDecodeException is thrown during decoding when the operation fails due to malformed or corrupted data.

Usage examples

The examples below demonstrate the interaction between input data and the available options for each encoding and decoding function. The TimingMode option is omitted, as it does not influence the outcome of encoding or decoding operations.

Base16 Encoding

<?php
 
use Encoding\Base16;
use Encoding\DecodingMode;
 
use function Encoding\base16_encode;
use function Encoding\base16_decode;
 
$data = 'Hello world!';
$encodedUpper = "48656C6C6F20776F726C6421"; // using uppercase characters
$encodedLower = "48656c6c6f20776f726c6421"; // using lowercase characters
$encodedUpperWithSpaces = "48 65\n6C\t6C\r6F 20 77 6F 72 6C 64 21";
$encodedWithSpaces = "48 65\n6C\t6C\r6F 20 77 6f 72 6c 64 21";
 
echo base16_encode($data);
// returns "48656C6C6F20776F726C6421" the letters are uppercased by default
 
echo base16_encode($data, variant: Base16::Lower);
// returns "48656c6c6f20776f726c6421" with lowercased letters
 
echo base16_decode($encodedUpper);
// returns 'Hello world!'
 
echo base16_decode($encodedUpperWithSpaces);
// returns 'Hello world!'
 
echo base16_decode($encodedLower);
// throws an UnableToDecodeException exception
// by default value is expected with uppercased letters
 
 
echo base16_decode($encodedWithSpaces);
// throws an UnableToDecodeException exception
// by default value is expected with uppercased letters
// the example contains lowercased letters (the spaces do not affect decoding)
 
 
echo base16_decode($encodedLower, variant: Base16::Lower);
// 'Hello world!' the selected variant is in accordance with the data.
 
echo base16_decode($encodedLower, variant: Base16::Upper, decodingMode: DecodingMode::Forgiving);
// 'Hello world!' the decoding is case-insensitive.
 
echo base16_decode($encodedWithSpaces, variant: Base16::Upper, decodingMode: DecodingMode::Forgiving);
// 'Hello world!' the decoding is case-insensitive and is not affected by the whitespaces.

By default, the encoding process conforms to the rules defined in RFC 4648. The base16_encode() function produces an encoded string consisting exclusively of uppercase letters. To obtain a result equivalent to PHP’s bin2hex() function, the variant Base16::Lower MUST be used.

During decoding, the base16_decode() function expects input containing uppercase letters. If the input includes lowercase characters, an UnableToDecodeException WILL be raised. To permit case-insensitive decoding, the DecodingMode::Forgiving mode MAY be used.

As for all the encoding algorithms, whitespaces should not be taken into account during decoding.

Base32 Encoding

<?php
 
use Encoding\Base32;
use Encoding\DecodingMode;
use Encoding\PaddingMode;
 
use function Encoding\base32_encode;
use function Encoding\base32_decode;
 
$data = 'Hello World';
$encodedAscii = "JBSWY3DPEBLW64TMMQ======";
$encodedCrockford = "91JPRV3F41BPYWKCCG";
 
echo base32_encode($data);
// returns "JBSWY3DPEBLW64TMMQ======"
 
echo base32_encode($data, variant: Base32::Ascii);
// returns "JBSWY3DPEBLW64TMMQ======"
 
echo base32_encode($data, paddingMode: PaddingMode::StripPadding);
// returns "JBSWY3DPEBLW64TMMQ"
 
echo base32_encode($data, variant: Base32::Crockford);
// returns "91JPRV3F41BPYWKCCG"
 
echo base32_encode($data, variant: Base32::Crockford, paddingMode: PaddingMode::PreservePadding);
// throws a ValueError: the variant does not support padding
 
echo base32_decode($encodedAscii);
// returns 'Hello World'
 
echo base32_decode("JBSWY3DPEBLW64TMMQ");
// throws an UnableToDecodeException exception: the padding characters are missing
 
echo base32_decode("JBSWY3DPEBLW64TMMQ", decodingMode: DecodingMode::Forgiving);
// returns 'Hello World'
 
echo base32_decode($encodedAscii, variant: Base32::Crockford);
// throws an UnableToDecodeException exception if the encoded string contains
// invalid characters; may return a meaningless string if the characters are
// all supported but the data was encoded with a different variant.
 
echo base32_decode($encodedCrockford, variant: Base32::Crockford);
// returns 'Hello World'

By default, the encoding process conforms to the rules defined in RFC 4648. The base32_encode() function produces an encoded string using the Base32::Ascii variant. If padding is not required, it MAY be omitted by specifying an appropriate value from the PaddingMode enumeration, as illustrated in the third example.

When a variant that does not permit padding, such as Base32::Crockford, is combined with PaddingMode::PreservePadding, the function WILL raise a ValueError. This indicates an invalid combination of options and MUST be corrected by the caller.

During decoding, the Base32::Ascii variant is applied by default in strict mode. In this mode, the function REQUIRES the input data to include the expected padding. If the padding is missing, as shown in the second decoding example, the operation WILL fail. To allow decoding of unpadded input, the DecodingMode::Forgiving mode MAY be used, which enables padding-tolerant behaviour.

If an input string is decoded using an incorrect variant and contains unsupported alphabet characters, an UnableToDecodeException WILL be raised. If the input consists solely of valid characters but belongs to a different alphabet, the result MAY be meaningless, but no exception WILL be raised.

Base58 Encoding

<?php
 
use Encoding\Base58;
use Encoding\DecodingMode;
 
use function Encoding\base58_encode;
use function Encoding\base58_decode;
 
$data = 'Hello world!';
$encodedBitcoin = "72k1xXWG59fYdzSNoA";
$encodedFlickr = "Z7Pznk19XTTzBtx";
 
echo base58_encode($data);
// returns "72k1xXWG59fYdzSNoA" default to Bitcoin variant
 
echo base58_encode($data, variant: Base58::Bitcoin);
// returns "72k1xXWG59fYdzSNoA" the variant is explicitly specified
 
echo base58_encode($data, variant: Base58::Flickr);
// returns "Z7Pznk19XTTzBtx" the flickr variant
 
echo base58_decode($encodedBitcoin);
// returns 'Hello world!'
 
echo base58_decode($encodedFlickr);
// if the string only contains characters supported by the Bitcoin variant,
// a meaningless string is returned; otherwise an UnableToDecodeException
// exception is thrown
 
echo base58_decode($encodedFlickr, variant: Base58::Flickr);
// returns 'Hello world!'

By default, the encoding process conforms to the rules defined for the Bitcoin Base58 variant. The base58_encode() function produces an encoded string using the Base58::Bitcoin variant.

The same variant is applied by default during decoding. Because the Base58 algorithm is case-sensitive, and does not use any padding, there is no forgiving mode; decoding is always performed strictly.

Base64 Encoding

<?php
 
use Encoding\Base64;
use Encoding\PaddingMode;
use Encoding\DecodingMode;
 
use function Encoding\base64_encode;
use function Encoding\base64_decode;
 
$data = 'This is an encoded string';
 
echo base64_encode($data);
// "VGhpcyBpcyBhbiBlbmNvZGVkIHN0cmluZw=="
 
echo base64_encode($data, paddingMode: PaddingMode::StripPadding);
// "VGhpcyBpcyBhbiBlbmNvZGVkIHN0cmluZw"
 
echo base64_decode("VGhpcyBpcyBhbiBlbmNvZGVkIHN0cmluZw");
// throws an UnableToDecodeException exception
// by default the Base64::Standard variant is used
// and expects padding characters when applicable
 
echo base64_decode("VGhpcyBpcyBhbiBlbmNvZGVkIHN0cmluZw", decodingMode: DecodingMode::Forgiving);
// returns 'This is an encoded string'
// the Forgiving mode allows decoding in the absence of padding characters.
 
$data = chr(0xFF) . chr(0xFF);
echo base64_encode($data); // "//8="
echo base64_encode($data, variant: Base64::UrlSafe); // "__8"
echo base64_encode($data, paddingMode: PaddingMode::StripPadding); // "//8"

By default, the encoding process conforms to the rules defined in RFC 4648. The base64_encode() function produces an encoded string using the Base64::Standard variant, which expects the use of the padding character =. If padding is not required, it MAY be omitted by specifying an appropriate value from the PaddingMode enumeration, as illustrated in the second example.

During decoding, the Base64::Standard variant is applied by default in strict mode. In this mode, the function REQUIRES the input data to include the expected padding. If the padding is missing, as shown in the first decoding example, the operation WILL fail. To allow decoding of unpadded input, the DecodingMode::Forgiving mode MAY be used, which enables padding-tolerant behaviour.

If an input string is decoded using an incorrect variant and contains unsupported alphabet characters, an UnableToDecodeException WILL be raised. If the input consists solely of valid characters but belongs to a different alphabet, the result MAY be meaningless, but no exception WILL be raised.

Base85 Encoding

<?php
 
use Encoding\Base85;
use Encoding\PaddingMode;
use Encoding\DecodingMode;
 
use function Encoding\base85_encode;
use function Encoding\base85_decode;
 
$data = 'Hello world!';
$encodedAdobe = "<~87cURD]j7BEbo80~>";
$encodedAdobeNoPadding = "87cURD]j7BEbo80";
$encodedZ85 = "nm=QNzY<mxA+]nf";
 
echo base85_encode($data, variant: Base85::Adobe);
// "<~87cURD]j7BEbo80~>"
 
echo base85_encode($data, variant: Base85::Adobe, paddingMode: PaddingMode::StripPadding);
// "87cURD]j7BEbo80"
 
echo base85_encode($data, variant: Base85::Z85, paddingMode: PaddingMode::PreservePadding);
// throws a ValueError: the Z85 variant does not support padding
 
echo base85_decode("87cURD]j7BEbo80", variant: Base85::Adobe);
// throws an UnableToDecodeException exception
// the Base85::Adobe variant is used
// and expects padding characters by default
 
echo base85_decode("87cURD]j7BEbo80", variant: Base85::Adobe, decodingMode: DecodingMode::Forgiving);
// returns 'Hello world!'
// the Forgiving mode allows decoding in absence of padding string.
 
echo base85_decode($encodedZ85, variant: Base85::Adobe);
// if the string only contains characters supported by the Adobe variant,
// a meaningless string is returned; otherwise an UnableToDecodeException
// exception is thrown
 
echo base85_decode($encodedZ85, variant: Base85::Z85);
// returns 'Hello world!'

The variant should always be explicitly set. When the Base85::Adobe variant is used, the encoding process conforms to the rules defined in the Adobe Ascii85 specification. The base85_encode() function produces an encoded string wrapped in the padding sequences <~ and ~>. If padding is not required, it MAY be omitted by specifying an appropriate value from the PaddingMode enumeration, as illustrated in the second example.

When a variant that does not permit padding, such as Base85::Z85, is combined with an incompatible PaddingMode value, the function WILL raise a ValueError. This indicates an invalid combination of options and MUST be corrected by the caller.

During decoding, the variant should always be explicitly set. The chosen variant is applied in strict mode by default. In this mode, the function REQUIRES the input data to include the expected padding when the variant mandates it. The Base85::Adobe variant requires such padding characters by default; this is not the case, for instance, for the Base85::Git variant. If the padding is missing, as shown in the first decoding example, the operation WILL fail. To allow decoding of unpadded input, the DecodingMode::Forgiving mode MAY be used, which enables padding-tolerant behaviour.

Migration path

Due to the widespread use of the current API, this RFC proposes a gradual migration path to help users transition to the new API. However, the full deprecation and removal of the current functions—base64_encode, base64_decode, hex2bin, and bin2hex—will be handled separately through the traditional RFC deprecation process, which occurs before each PHP version release. This ensures users have sufficient time to adopt the new API.

Base16 functions

bin2hex

The bin2hex function encodes a string using the Base16 algorithm, but it defaults to a lowercase alphabet, which contradicts the recommendation in RFC 4648. To migrate a bin2hex call to the new API while preserving current behaviour use:

<?php
 
$data = 'Hello world!';
 
//before
echo bin2hex($data);
//after
echo Encoding\base16_encode($data, variant: Encoding\Base16::Lower);

hex2bin

The hex2bin function is lenient and accepts both lowercase and uppercase input. To migrate a hex2bin call to the new API while preserving current behaviour use:

<?php
 
$data = "6578616d706c65206865782064617461";
 
//before
echo hex2bin($data);
//after
echo Encoding\base16_decode($data, decodingMode: Encoding\DecodingMode::Forgiving);

Base64 functions

base64_encode

This function already follows the standard Base64 encoding algorithm. Migrating is straightforward:

$data = 'This is an encoded string';
//before
echo base64_encode($data);
//after
echo Encoding\base64_encode($data);

base64_decode

Migrating base64_decode is more complex. The function is lenient, accepting non-alphabet characters and misplaced padding:

base64_decode('dG9===0bw??'); // returns 'toto'

The proposed API enforces RFC 4648, Section 12. This includes rejecting invalid characters and padding in non-terminal positions for security reasons:

Encoding\base64_decode('dG90bw??', decodingMode: Encoding\DecodingMode::Forgiving);  // throws: the input contains characters outside the alphabet
Encoding\base64_decode('dG9===0bw', decodingMode: Encoding\DecodingMode::Forgiving); // throws: the padding character is used in a non-terminal position
Encoding\base64_decode('dG90bw', decodingMode: Encoding\DecodingMode::Forgiving);    // returns 'toto'

This stricter behavior provides a safer migration path, as rejecting previously accepted input exposes potential security vulnerabilities in the original implementation.
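
Before migrating, it can help to find out which stored or incoming values rely on the legacy leniency. The sketch below uses only the existing base64_decode() and base64_encode() functions; is_canonical_base64 is a hypothetical helper name, and treating "re-encodes to the same string" as the acceptance test is an assumption of this example rather than the rule applied by the proposed API.

```php
<?php

/**
 * Hypothetical audit helper: true only when the input strictly decodes
 * and re-encodes to exactly the same padded Base64 string.
 */
function is_canonical_base64(string $encoded): bool
{
    $decoded = base64_decode($encoded, true);

    return $decoded !== false && base64_encode($decoded) === $encoded;
}

var_dump(base64_decode('dG90bw??'));          // string(4) "toto": invalid characters are silently discarded
var_dump(base64_decode('dG90bw??', true));    // bool(false)
var_dump(is_canonical_base64('dG90bw=='));    // bool(true)
var_dump(is_canonical_base64('dG9===0bw??')); // bool(false)
```

Values flagged by such a check are the ones most likely to need cleaning, or the Forgiving mode, once the new functions are adopted.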

In other Languages

Go

In its standard library, Go supports all RFC 4648 algorithms as well as the Ascii85 format.

Python

Python has updated its encoding support and now supports all RFC 4648 algorithms as well as the Ascii85 format. Python also has extensive support for many Base85 variants.

JavaScript/NodeJS

Supports neither Base32 nor Base85 natively.

C#

Only natively supports Base64 (not Base64 URL).

Java

Only natively supports Base64.

Backward Incompatible Changes

The namespace Encoding is now reserved.

Proposed PHP Version(s)

The next minor PHP version (PHP 8.6).

RFC Impact

To SAPIs

None.

To Existing Extensions

None.

To Opcache

None.

Implementation

Tim Düsterhus has volunteered to do the implementation, but will check whether or not a constant time implementation is possible for all combinations of options.

After the project is implemented, this section should contain

the version(s) it was merged into
a link to the git commit(s)
a link to the PHP manual entry for the feature
a link to the language specification section (if any)

Future Scope

The current functions for Base64 and Base16 can be deprecated at some distant point of time
Add Base64 support to PHP convert.base64-encode and convert.base64-decode stream filters

Vote

Yes or no vote, 2/3 required to pass.

Add the new encoding API described above?

References

RFC4648: https://datatracker.ietf.org/doc/html/rfc4648
Douglas Crockford Base32: https://www.crockford.com/base32.html
Z-Base32: https://philzimmermann.com/docs/human-oriented-base-32-encoding.txt
IMAP Base64: https://datatracker.ietf.org/doc/html/rfc3501#section-5.1.3
Base58 Bitcoin: https://bitcoinwiki.org/wiki/base58

RFC Discussion thread: https://news-web.php.net/php.internals/127716

Changelog

2025/11/25: Base85 decoding/encoding functions require explicit variant input
2025/11/05: Update migration path from legacy to new encoding API
2025/11/05: Rewrite the example section to showcase usage examples
2025/11/05: Add for completeness EncodingError and UnableToEncodeException
2025/10/16: Add base85 into the proposed API
2025/07/03: Add migration path from legacy to new encoding API
2025/07/01: Shorten Variant names
2025/07/01: Add base58 into the proposed API
2025/06/19: First draft
