rfc:enum_v2

PHP RFC: Enumeration type (alternative proposal)

Note: this is a counterproposal to this RFC

Introduction

Traditionally, PHP has used independent constants to represent related magic numbers. I propose to add a concept well known from many other languages, enumeration type.

Consider the following perfectly valid code:

preg_split($foo, $bar, LC_ALL * PHP_MAJOR_VERSION);

What will this call produce? I have no idea, either:)

It would be so much better to have a more foolproof and type-safe version, like:

preg_split($foo, $bar, Split::NoEmpty);

Proposal

Basics

A simple enum:

enum Letters {
    a,          // Enum constants start with 0 by default
    b,          // 1, always previous value +1 if not specified explicitly
    c = 10,     // Values can be set explicitly
    d,          // 11
    e = 1,      // Constants with duplicate values are allowed
    f = 2 * 2,  // Can use the same expressions as class constants
    g = f,      // Can use other constants too
    h = g + 10, // And in expressions too
    H,          // Valid, constant names are case sensitive
    i,          // Optional comma after the last constant is permitted
}

A binary enum is used to represent a set of values:

binary enum FileMode {
    Read = 1,
    Write = 2,
    Execute = 1 << 2,
    ReadWrite = Read | Write,
}
 
$foo = FileMode::Read | FileMode::Execute;

Both enum types can extend other enums:

enum A { foo = 1 }
 
enum B extends A { bar = 2 }
 
$x = B::foo;

Overriding constants from base enums is not allowed:

enum C extends A {
    foo = 3 // CompileError
}

Constants must be int and thus fit into zend_long:

enum foo {
    bar = 2 ** 100, // CompileError
    baz = 1.5       // CompileError
}
 
$x = 'this is a string';
$y = (foo)$x; // TypeError

Type coercion and casts

Enums are implicitly coercible to bool and string:

function f(FileMode $mode) {
    if ($mode) {
        echo "mode: $mode";
    }
}
 
f(FileMode::Read); // Outputs "mode: 1"

Enum types can be explicitly cast to each other and int:

$foo = (FileMode)123;
$bar = SomeEnum::Const;
$foo = (FileMode)$bar;

Conversion from other types is not checked, thus enums can hold values not covered by their constants. Enum::isKnownValue() will return `false` while Enum::toHumanReadableString() will return a numeric string instead of constant name(s).

Enum operations

Enums are immutable and don't support arithmetic operations:

$foo = FileMode::Read;
$foo = FileMode::Read + 1; // CompileError
$foo += 1;                 // TypeError
$bar = $foo + 1;           // TypeError

However, binary enums support bitwise operations:

$foo = FileMode::Read | FileMode::Execute;
$foo |= FileMode::Write;
$foo &= ~FileMode::Read;

Enum usage

Concrete enum types can be used as typehints:

function open(string $filename, FileMode $mode)

However, not the enum keyword itself:

function open(string $filename, enum $mode) // CompileError

When the type is clear from typehints, enum name can be omitted:

open('foo.txt', Read | Write)

is equivalent to:

open('foo.txt', FileMode::Read | FileMode::Write)

Same for switch statements:

function f(Letters $x) {
    switch ($x) {
    case a:
        // ...
    case b:
        // ...
    }
}

Internal representation

Internally, enums are classes and enum constants are public class constants. This makes them the fourth OOP-ey type in PHP, along with class, interface and trait. They can be autoloaded just like the former types. All enums inherit from this base class (here is PHP pseudocode):

final // In the sense that userspace can't explicitly extend it
class Enum {
    private int $value;
    private function __construct(); // It shouldn't be possible to create enums like this: $foo = new Enum();
    public function isBinary(): bool;
    public function __toString(): string {
        return (string)(int)$this->value;
    }
 
    // Whether the current value is represented by one of this enum's constants
    // or their combination for binary enums
    public function isKnownValue(): bool;
 
    // Returns a human readable representation of this enum's value
    // e.g. (FileMode::Read | FileMode::Write)->toHumanReadableString() would return 'Read | Write'
    // For unrecognized values, returns a decimal (simple enums) or hexadecimal (binary enums) string.
    public function toHumanReadableString(): string;
 
    public static function parse(string $enum) : ?WhateverConcreteEnumTypeIsExtendingThis;
}

Conventions used in this document

Currently, PascalCase is used in enums due to author's experience with C#. While I believe that this convention is nice and makes enums conveniently distinct from PHP conventions that use camelCase for properties and UNDERSCORED_UPPERCASE for constants, I'm not attached to it. The recommended convention for use in language documentation and, subsequently, the PHP core will be determined during community discussion or voted for during the voting phase.

Same applies to the Enum class name.

Backwards Incompatible Changes

enum and binary will become reserved keywords. Class name 'Enum' (or whatever we decide to call it) will become unavailable.

Proposed PHP Version(s)

PHP 8.1?

Open Issues

Make sure there are no open issues when the vote starts!

  • Naming conventions
  • Base class name(s)
  • Type coercion details?

Unaffected PHP Functionality

List existing areas/features of PHP that will not be changed by the RFC.

This helps avoid any ambiguity, shows that you have thought deeply about the RFC's impact, and helps reduces mail list noise.

Future Scope

After this RFC is implemented, enums may be used for new features or factored into existing ones.

Proposed Voting Choices

  • Accept this RFC (yes / no)?
  • What should be enum base class fully qualified name (\Enum / \PHP\Enum / something else )?
  • What enum constant naming convention should be used (PascalCase / camelCase / UPPER_UNDERSCORED)?

Patches and Tests

Links to any external patches and tests go here.

If there is no patch, make it clear who will create a patch, or whether a volunteer to help with implementation is needed.

Make it clear if the patch is intended to be the final patch, or is just a prototype.

For changes affecting the core language, you should also provide a patch for the language specification.

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged into
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature
  4. a link to the language specification section (if any)

References

Links to external references, discussions or RFCs * https://wiki.php.net/rfc/enum - old proposal that wanted to introduce

Rejected Features

Keep this updated with features that were discussed on the mail lists.

rfc/enum_v2.txt · Last modified: 2020/05/16 18:07 by maxsem