====== PHP RFC: Records ======

  * Version: 0.9
  * Date: 2024-07-19
  * Author: Robert Landers
  * Status: Under Discussion
  * First Published at: http://wiki.php.net/rfc/records

===== Introduction =====

This RFC proposes the introduction of ''%%record%%'' objects, which are immutable classes with [[https://en.wikipedia.org/wiki/Value_semantics|value semantics]].

==== Value objects ====

Value objects are immutable objects that represent a value. They wrap a technical value, such as an integer or a string, to give it additional semantic context. For example, a ''%%Point%%'' object with ''%%x%%'' and ''%%y%%'' properties can represent a point in a 2D space, and an ''%%ExpirationDate%%'' can represent a date when something expires. This prevents developers from accidentally using the wrong value in the wrong context.

Consider this example where a function accepts an integer as a user ID, and the ID is accidentally set to a nonsensical value:

<code php>
function updateUserRole(int $userId, string $role): void {
    // ...
}

$user = getUser(/*...*/);
$uid = $user->id;
// ...
$uid = 5; // accidentally sets uid to an unrelated integer
// ...
updateUserRole($uid, 'admin'); // accidentally passes a nonsensical value for uid
</code>

Currently, the only solution to this is to use a **class**, but this requires significant boilerplate code. Further, **readonly classes** have many edge cases and are rather unwieldy.

=== The solution ===

Like arrays, strings, and other values, **record** objects are strongly equal (''%%===%%'') to each other if they contain the same values.

Let’s take a look at an updated example using a ''%%record%%'' type for ''%%UserId%%''. With this definition, passing an ''%%int%%'' to ''%%updateUserRole%%'' throws an error:

<code php>
record UserId(int $id);

function updateUserRole(UserId $userId, string $role): void {
    // ...
}

$user = getUser(/*...*/);
$uid = $user->id; // $uid is a UserId object
// ...
$uid = 5;
// ...
updateUserRole($uid, 'admin'); // This will throw a TypeError
</code>

Now, if ''%%$uid%%'' is accidentally set to an integer, the call to ''%%updateUserRole%%'' will throw a ''%%TypeError%%'' because the function expects a ''%%UserId%%'' object instead of a plain integer.

===== Proposal =====

This RFC proposes the introduction of a ''%%record%%'' keyword in PHP to define immutable value objects. These objects will allow properties to be initialized concisely and will provide built-in methods for common operations such as modifying properties, performing equality checks, and using a function-like instantiation syntax. Records can implement interfaces and use traits but can’t extend other records or classes; composition is allowed, however.

==== Syntax and semantics ====

=== Definition ===

A **record** is defined by the keyword ''%%record%%'', followed by the name of its type (e.g., ''%%UserId%%''), and then a list of one or more typed parameters (e.g., ''%%int $id%%'') that become properties of the record. A parameter may carry a ''%%private%%'' or ''%%public%%'' modifier and is ''%%public%%'' when none is specified. This parameter list is referred to as the "inline constructor."

A **record** may optionally implement an interface using the ''%%implements%%'' keyword, which may optionally be followed by a record body enclosed in curly braces ''%%{}%%''.

A **record** may not extend another record or class.

A **record** may contain a traditional constructor with zero arguments to perform further initialization.

A **record** body may contain property hooks, methods, and use traits.

A **record** body may also declare properties whose values are only mutable during a constructor call. At any other time, the property is immutable.

A **record** body may also contain static methods and properties, which behave identically to static methods and properties in classes. They may be accessed using the ''%%::%%'' operator.
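For comparison, the closest approximation available today is probably a ''%%final readonly%%'' class with constructor promotion. The sketch below is not part of this proposal: it assumes PHP 8.2 or later, and the class name ''%%UserIdClass%%'' is hypothetical. It shows what such a class already provides (immutability) and what it lacks (value semantics under ''%%===%%'').

```php
<?php
declare(strict_types=1);

// Hypothetical userland approximation of `record UserId(int $id);`.
// Assumes PHP 8.2+ (readonly classes); not part of the RFC.
final readonly class UserIdClass
{
    public function __construct(public int $id) {}
}

$a = new UserIdClass(1);
$b = new UserIdClass(1);

var_dump($a == $b);  // bool(true): loose comparison compares property values
var_dump($a === $b); // bool(false): identity compares instances, unlike a record

try {
    $a->id = 2; // readonly properties cannot be modified after construction
} catch (Error $e) {
    echo get_class($e), "\n"; // Error
}
```

Under the proposed semantics, the second comparison would instead be ''%%true%%'' for two records holding the same values.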
As an example, the following code defines a **record** named ''%%Pigment%%'' to represent a color, ''%%StockPaint%%'' to represent paint colors in stock, and ''%%PaintBucket%%'' to represent a collection of stock paints mixed together. The actual behavior isn’t important; the example only illustrates the syntax and semantics of records.

<code php>
namespace Paint;

// Define a record with several primary color properties
record Pigment(int $red, int $yellow, int $blue) {

    // property hooks are allowed
    public string $hexValue {
        get => sprintf("#%02x%02x%02x", $this->red, $this->yellow, $this->blue);
    }

    // methods are allowed
    public function mix(Pigment $other, float $amount): Pigment {
        // call the generated with(), because with() is overridden below
        return parent::with(
            red: $this->red * (1 - $amount) + $other->red * $amount,
            yellow: $this->yellow * (1 - $amount) + $other->yellow * $amount,
            blue: $this->blue * (1 - $amount) + $other->blue * $amount
        );
    }

    // all properties are mutable in constructors
    public function __construct() {
        $this->red = max(0, min(255, $this->red));
        $this->yellow = max(0, min(255, $this->yellow));
        $this->blue = max(0, min(255, $this->blue));
    }

    public function with() {
        // prevent the creation of a new Pigment from an existing pigment
        throw new \LogicException("Cannot create a new Pigment from an existing pigment");
    }
}

// simple records do not need to define a body
record StockPaint(Pigment $color, float $volume);

record PaintBucket(StockPaint ...$constituents) {
    public function mixIn(StockPaint $paint): PaintBucket {
        return $this->with(...[...$this->constituents, $paint]);
    }

    public function color(): Pigment {
        return array_reduce(
            $this->constituents,
            fn($color, $paint) => $color->mix($paint->color, $paint->volume),
            &Pigment(0, 0, 0)
        );
    }
}
</code>

=== Usage ===

A record may be used much like a class. Because the two behave very similarly, migrating from one implementation to the other is straightforward:

<code php>
$gray = $bucket->mixIn($blackPaint)->mixIn($whitePaint);
</code>

Records are instantiated in a function format, with ''%%&%%'' prepended.
This provides visual feedback that a record is being created instead of a function call.

<code php>
$black = &Pigment(0, 0, 0);
$white = &Pigment(255, 255, 255);
$blackPaint = &StockPaint($black, 1);
$whitePaint = &StockPaint($white, 1);
$bucket = &PaintBucket();

$gray = $bucket->mixIn($blackPaint)->mixIn($whitePaint);
$grey = $bucket->mixIn($blackPaint)->mixIn($whitePaint);

assert($gray === $grey); // true
</code>

=== Optional parameters and default values ===

A ''%%record%%'' can also be defined with optional parameters that are set if omitted during instantiation.

One or more properties defined in the inline constructor may have a default value declared using the same syntax and rules as any other default parameter in methods/functions. If a property has a default value, it is optional when instantiating the record, and PHP will assign the default value to the property if omitted.

<code php>
record Rectangle(int $x, int $y = 10);
var_dump(&Rectangle(10)); // output a record with x: 10 and y: 10
</code>

=== Auto-generated with method ===

To make records more useful, the RFC proposes generating a ''%%with%%'' method for each record. This method allows for partial updates to the properties, creating a new instance of the record with the specified properties updated.

== How the with method works ==

**Named arguments**

The ''%%with%%'' method accepts only named arguments defined in the inline constructor. Properties not defined in the inline constructor can’t be updated by this method.

**Variadic arguments**

Variadic arguments from the inline constructor don’t require named arguments in the ''%%with%%'' method. However, mixing named and variadic arguments in the same ''%%with%%'' method call is not allowed by PHP syntax.
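To make the named-argument rule concrete, here is a rough userland emulation of a ''%%with%%'' method in current PHP. It is only a sketch under stated assumptions, not the generated method itself: it assumes PHP 8.2 or later, and the class ''%%Vec2%%'' is hypothetical. It relies on two existing language features: a variadic parameter collects named arguments under string keys, and unpacking a string-keyed array passes its entries as named arguments.

```php
<?php
declare(strict_types=1);

// Hypothetical userland emulation of the generated with() method.
// Assumes PHP 8.2+; the class name Vec2 is illustrative only.
final readonly class Vec2
{
    public function __construct(public int $x, public int $y) {}

    public function with(mixed ...$changes): static
    {
        // Positional arguments arrive under integer keys; reject them so
        // that only named arguments are accepted.
        foreach (array_keys($changes) as $name) {
            if (is_int($name)) {
                throw new InvalidArgumentException('with() requires named arguments');
            }
        }

        // Current values first, then the overrides. Unpacking the merged,
        // string-keyed array passes each entry as a named argument, so an
        // unknown name is rejected by the constructor call itself.
        return new static(...array_merge(get_object_vars($this), $changes));
    }
}

$p = new Vec2(1, 2);
$q = $p->with(y: 5);

var_dump($q->x, $q->y); // int(1) int(5)
var_dump($p->y);        // int(2): the original instance is unchanged
```

Unlike a record, this emulation always allocates a new object, so ''%%$p->with(y: 2) === $p%%'' would be ''%%false%%'' here.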
Using named arguments:

<code php>
record UserId(int $id) {
    public string $serialNumber;

    public function __construct() {
        $this->serialNumber = "U{$this->id}";
    }
}

$userId = &UserId(1);
$otherId = $userId->with(2); // Fails: Named arguments must be used
$otherId = $userId->with(serialNumber: "U2"); // Error: serialNumber is not defined in the inline constructor
$otherId = $userId->with(id: 2); // Success: id is updated
</code>

Using variadic arguments:

<code php>
record Vector(int $dimensions, int ...$values);

$vector = &Vector(3, 1, 2, 3);
$vector = $vector->with(dimensions: 4); // Success: dimensions is updated
$vector = $vector->with(dimensions: 4, 1, 2, 3, 4); // Error: mixing named arguments with variadic arguments is not allowed by PHP syntax
$vector = $vector->with(dimensions: 4)->with(1, 2, 3, 4); // Success: First update dimensions, then values
</code>

== Custom with method ==

A developer may define their own ''%%with%%'' method if they choose, and reference the generated ''%%with%%'' method using ''%%parent::with()%%''. This allows a developer to define policies or constraints on how data can change from instance to instance.

<code php>
record Planet(string $name, int $population) {
    // create a with method that only accepts population updates
    public function with(int $population): Planet {
        return parent::with(population: $population);
    }
}

$pluto = &Planet("Pluto", 0);
// we made it!
$pluto = $pluto->with(population: 1);
// and then we changed the name
$mickey = $pluto->with(name: "Mickey"); // Error: no named argument for population
</code>

=== Constructors ===

A **record** has two types of constructors: the inline constructor and the traditional constructor.

The inline constructor is always required and must define at least one parameter. The traditional constructor is optional and can be used for further initialization logic, but must not accept any arguments.
When a traditional constructor exists and is called, the properties are already initialized to the values from the inline constructor and are mutable until the end of the method, at which point they become immutable.

<code php>
// Inline constructor defining two properties
record User(string $name, string $emailAddress) {
    public string $id;

    // Traditional constructor
    public function __construct() {
        if (!is_valid_email($this->emailAddress)) {
            throw new InvalidArgumentException("Invalid email address");
        }

        $this->id = hash('sha256', $this->emailAddress);
        $this->name = ucwords($this->name);
        // all properties are now immutable
    }
}
</code>

==== Implementing Interfaces ====

A **record** can implement interfaces and use traits, but it cannot extend other records or classes:

<code php>
interface Vehicle {}

interface Car extends Vehicle {
    public function drive(): void;
}

interface SpaceShip extends Vehicle {
    public function launch(): void;
}

record FancyCar(string $model) implements Car {
    public function drive(): void {
        echo "Driving a Fancy Car {$this->model}";
    }
}

record SpaceCar(string $model) implements Car, SpaceShip {
    public function drive(): void {
        echo "Driving a Space Car {$this->model}";
    }

    public function launch(): void {
        echo "Launching a Space Car {$this->model}";
    }
}

record Submarine(string $model) implements Vehicle {
    use Submersible;
}

record TowTruck(string $model, private Car $towing) implements Car {
    use Towable;
}
</code>

==== Mental models and how it works ====

From the perspective of a developer, declaring a record declares an object with the same name. The developer can consider the record function (the inline constructor) as a factory function that creates a new object or retrieves an existing object from an array.
For example, this would be a valid mental model for a Point record:

<code php>
record Point(int $x, int $y) {
    public float $magnitude;

    public function __construct() {
        $this->magnitude = sqrt($this->x ** 2 + $this->y ** 2);
    }

    public function add(Point $point): Point {
        return &Point($this->x + $point->x, $this->y + $point->y);
    }

    public function dot(Point $point): int {
        return $this->x * $point->x + $this->y * $point->y;
    }
}

// similar to declaring the following function and class

// used during construction to allow mutability
class Point_Implementation {
    public int $x;
    public int $y;
    public float $magnitude;

    public function __construct() {
        $this->magnitude = sqrt($this->x ** 2 + $this->y ** 2);
    }

    public function with(...$parameters) {
        // validity checks omitted for brevity
        // keyed by name so that named arguments override the current values
        $parameters = array_merge(['x' => $this->x, 'y' => $this->y], $parameters);
        return Point(...$parameters);
    }

    public function add(Point $point): Point {
        return Point($this->x + $point->x, $this->y + $point->y);
    }

    public function dot(Point $point): int {
        return $this->x * $point->x + $this->y * $point->y;
    }
}

// used to enforce immutability but has nearly the same implementation
readonly class Point {
    public float $magnitude;

    public function __construct(public int $x, public int $y) {}

    public function with(...$parameters): self {
        // validity checks omitted for brevity
        $parameters = array_merge(['x' => $this->x, 'y' => $this->y], $parameters);
        return Point(...$parameters);
    }

    public function add(Point $point): Point {
        return Point($this->x + $point->x, $this->y + $point->y);
    }

    public function dot(Point $point): int {
        return $this->x * $point->x + $this->y * $point->y;
    }
}

function Point(int $x, int $y): Point {
    static $points = [];

    // build a mutable instance and run the traditional constructor
    $reflector = new \ReflectionClass(Point_Implementation::class);
    $mutablePoint = $reflector->newInstanceWithoutConstructor();
    $mutablePoint->x = $x;
    $mutablePoint->y = $y;
    $mutablePoint->__construct();

    // hash_object() is a hypothetical helper that hashes the property values
    $key = hash_object($mutablePoint);
    if ($points[$key] ?? null) {
        // return an existing point
        return $points[$key];
    }

    // copy properties to an immutable Point and return it
    $point = new Point($mutablePoint->x, $mutablePoint->y);
    $point->magnitude = $mutablePoint->magnitude;
    return $points[$key] = $point;
}
</code>

In reality, this is quite different from how it works in the engine, but it provides a mental model of the behavior a developer should expect.

==== Performance considerations ====

To ensure that records are both performant and memory-efficient, the RFC proposes leveraging PHP’s copy-on-write (COW) semantics (similar to arrays) and interning values. Unlike interned strings, the garbage collector will be allowed to clean up these interned records when they’re no longer referenced.

<code php>
$point1 = &Point(3, 4);
$point2 = $point1; // No data duplication, $point2 references the same data as $point1
$point3 = &Point(3, 4); // No data duplication, it is pointing to the same memory as $point1

$point4 = $point1->with(x: 5); // Data duplication occurs here, creating a new instance
$point5 = &Point(5, 4); // No data duplication, it is pointing to the same memory as $point4
</code>

=== Cloning and with() ===

Calling ''%%clone%%'' on a ''%%record%%'' results in the same record object being returned. Because a record is a "value" object, cloning it is the same thing as saying ''%%clone 3%%'': you expect to get back a ''%%3%%''.

If ''%%->with()%%'' is called with no arguments, a warning will be emitted, as this is most likely a mistake.

==== Serialization and deserialization ====

Records are fully serializable and deserializable, even when nested.
<code php>
record Single(string $value);
record Multiple(string $value1, string $value2);

echo $single = serialize(&Single('value')); // Outputs: "O:6:"Single":1:{s:5:"value";s:5:"value";}"
echo $multiple = serialize(&Multiple('value1', 'value2')); // Outputs: "O:8:"Multiple":1:{s:6:"values";a:2:{i:0;s:6:"value1";i:1;s:6:"value2";}}"

var_dump(unserialize($single) === &Single('value')); // Outputs: bool(true)
var_dump(unserialize($multiple) === &Multiple('value1', 'value2')); // Outputs: bool(true)
</code>

If a record contains objects or values that are unserializable, the record will not be serializable.

==== Equality ====

A ''%%record%%'' is always strongly equal (''%%===%%'') to another record with the same value in the properties, much like an ''%%array%%'' is strongly equal to another array containing the same elements. For all intents and purposes, ''%%$recordA === $recordB%%'' is the same as ''%%$recordA == $recordB%%''.

Comparison operations will behave exactly as they do for classes, which is currently undefined.

=== Non-trivial values ===

For non-trivial values (e.g., objects, closures, resources, etc.), the ''%%===%%'' operator will return ''%%true%%'' if the two operands reference the same instances.

For example, if two different ''%%DateTime%%'' objects represent the exact same date and are each stored in a record, the records will not be considered equal:

<code php>
$date1 = new DateTime('2024-07-19');
$date2 = new DateTime('2024-07-19');

record Date(DateTime $date);

$dateRecord1 = &Date($date1);
$dateRecord2 = &Date($date2);

var_dump($dateRecord1 === $dateRecord2); // Outputs: bool(false)
</code>

However, this can be worked around by being a bit creative (see: mental model) as only the values passed in the constructor are compared:

<code php>
record Date(string $date) {
    public DateTime $datetime;

    public function __construct() {
        $this->datetime = new DateTime($this->date);
    }
}

$date1 = &Date('2024-07-19');
$date2 = &Date('2024-07-19');

echo $date1->datetime === $date2->datetime ? 'true' : 'false'; // Outputs: true
</code>

==== Type hinting ====

A ''%%\Record%%'' interface will be added to the engine to allow type hinting for records. All records implement this interface.

<code php>
function doSomething(\Record $record): void {
    // ...
}
</code>

The only method on the interface is ''%%with%%'', which is a variadic method that accepts named arguments and returns ''%%self%%''.

==== Reflection ====

A new reflection class will be added to support records: ''%%ReflectionRecord%%'', which will inherit from ''%%ReflectionClass%%'' and add a few additional methods:

  * ''%%ReflectionRecord::finalizeRecord(object $instance): Record%%'': Finalizes a record under construction, making it immutable.
  * ''%%ReflectionRecord::isRecord(mixed $object): bool%%'': Returns ''%%true%%'' if the object is a record, and ''%%false%%'' otherwise.
  * ''%%ReflectionRecord::getInlineConstructor(): ReflectionFunction%%'': Returns the inline constructor of the record as ''%%ReflectionFunction%%''.
  * ''%%ReflectionRecord::getTraditionalConstructor(): ReflectionMethod%%'': Returns the traditional constructor of the record as ''%%ReflectionMethod%%''.
  * ''%%ReflectionRecord::makeMutable(Record $instance): object%%'': Returns a new record instance with the properties mutable.
  * ''%%ReflectionRecord::isMutable(Record $instance): bool%%'': Returns ''%%true%%'' if the record is mutable, and ''%%false%%'' otherwise.

Using ''%%ReflectionRecord%%'' will allow developers to inspect records, their properties, and methods, as well as create new instances for testing or custom deserialization.

Attempting to use ''%%ReflectionClass%%'' or ''%%ReflectionFunction%%'' on a record will throw a ''%%ReflectionException%%''.

=== finalizeRecord() ===

The ''%%finalizeRecord()%%'' method is used to make a record immutable and look up its value in the internal cache, returning an instance that represents the finalized record.
Calling ''%%finalizeRecord()%%'' on a record that has already been finalized will return the same instance. Attempting to finalize a regular object will throw a ''%%ReflectionException%%''.

=== isRecord() ===

The ''%%isRecord()%%'' method is used to determine if an object is a record. It returns ''%%true%%'' if the object is a finalized record.

=== getInlineConstructor() ===

The ''%%getInlineConstructor()%%'' method is used to get the inline constructor of a record as a ''%%ReflectionFunction%%''. This can be used to inspect inlined properties and their types.

Invoking the ''%%invoke()%%'' method on the ''%%ReflectionFunction%%'' will create a finalized record.

=== getTraditionalConstructor() ===

The ''%%getTraditionalConstructor()%%'' method is used to get the traditional constructor of a record as a ''%%ReflectionMethod%%''. This can be useful to inspect the constructor for further initialization.

Invoking the ''%%invoke()%%'' method on the ''%%ReflectionMethod%%'' on a finalized record will throw an exception.

=== makeMutable() ===

The ''%%makeMutable()%%'' method is used to create a new instance of a record with mutable properties. The returned instance doesn’t provide any value semantics and should only be used for testing purposes or when there is no other option.

A mutable record can be finalized again using ''%%finalizeRecord()%%''. A mutable record will not be considered a record by ''%%isRecord()%%'' or implement the ''%%\Record%%'' interface. It is a regular object with the same properties and methods as the record. For example, ''%%var_dump()%%'' will output ''%%object%%'' instead of ''%%record%%''.

=== isMutable() ===

The ''%%isMutable()%%'' method is used to determine if a record has been made mutable via ''%%makeMutable()%%'' or otherwise not yet finalized.

=== Custom deserialization example ===

In cases where custom deserialization is required, a developer can use ''%%ReflectionRecord%%'' to manually construct a new instance of a record.
<code php>
record Seconds(int $seconds);

$example = &Seconds(5);

$reflector = new ReflectionRecord(Seconds::class);
$expiration = $reflector->newInstanceWithoutConstructor(); // this is a mutable object
$expiration->seconds = 5;
assert($example !== $expiration); // true
$expiration = $reflector->finalizeRecord($expiration);
assert($example === $expiration); // true
</code>

==== var_dump ====

When passed an instance of a record, the ''%%var_dump()%%'' function will produce the same output as for an equivalent object (one having the same properties), except that the prefix text "object" is replaced with the text "record".

<code>
record(Point)#1 (2) {
  ["x"]=>
  int(1)
  ["y"]=>
  int(2)
}
</code>

==== Considerations for implementations ====

A ''%%record%%'' cannot share its name with an existing ''%%record%%'', ''%%class%%'', ''%%interface%%'', ''%%trait%%'', or ''%%function%%'', just like a class.

==== Autoloading ====

Records will be autoloaded in the same way as classes.

==== New Functions ====

  * ''%%record_exists%%'' will return ''%%true%%'' if a record exists and ''%%false%%'' otherwise. It has the same signature as ''%%class_exists%%''.

===== Backward Incompatible Changes =====

To avoid conflicts with existing code, the ''%%record%%'' keyword will be handled similarly to ''%%enum%%'' to prevent backward compatibility issues.

Since ''%%&%%'' is currently a syntax error when prefixed on a function call, it will be used to denote a record instantiation.

===== Proposed PHP Version(s) =====

PHP 8.5

===== RFC Impact =====

==== To SAPIs ====

N/A

==== To Existing Extensions ====

N/A

==== To Opcache ====

Unknown.

==== New Constants ====

None

==== php.ini Defaults ====

None

===== Open Issues =====

  * Distill how CoW works, exactly.
  * Address conflict with ''%%&%%'' syntax: https://3v4l.org/CE5rt

===== Unaffected PHP Functionality =====

None.

===== Future Scope =====

  * Records for "record-like" types, such as DateTime, DateInterval, and others.
  * Short definition syntax for classes

===== Proposed Voting Choices =====

2/3 majority.

===== Patches and Tests =====

TBD

===== Implementation =====

To be completed during a later phase of discussion.

===== References =====

  * [[https://en.wikipedia.org/wiki/Value_semantics|Value semantics]]

===== Rejected Features =====

TBD