rfc:data-classes
rfc:data-classes [2024/04/18 13:04] – ilutov | rfc:data-classes [2024/04/22 12:51] (current) – ilutov | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== PHP RFC: Data classes | + | Data classes |
- | + | ||
- | * Date: 2024-04-16 | + | |
- | * Author: Ilija Tovilo, tovilo.ilija@gmail.com | + | |
- | * Status: Draft | + | |
- | * Target Version: PHP 8.x | + | |
- | * Implementation: | + | |
- | + | ||
- | ===== Proposal ===== | + | |
- | + | ||
- | This RFC proposes | + | |
- | + | ||
- | < | + | |
- | data class Position { | + | |
- | public function __construct( | + | |
- | public $x, | + | |
- | public $y, | + | |
- | ) {} | + | |
- | } | + | |
- | + | ||
- | $p1 = new Position(1, 2); | + | |
- | $p2 = $p1; | + | |
- | $p2-> | + | |
- | + | ||
- | var_dump($p1 === $p2); // false | + | |
- | + | ||
- | $p2->x--; | + | |
- | var_dump($p1 === $p2); // true | + | |
- | </ | + | |
- | + | ||
- | ====== Data transfer objects ====== | + | |
- | + | ||
- | ===== The problem ===== | + | |
- | + | ||
- | Classes are commonly used to model data in PHP. Such classes have many names (data transfer objects, plain old PHP objects, | + |
- | + | ||
- | Using classes for data comes with one significant downside: Objects are passed by reference, rather than by value. When dealing with mutable data, this makes it very easy to shoot yourself in the foot by exposing mutations to places that don't expect to see them. | + | |
- | + | ||
- | Consider the following example: | + | |
- | + | ||
- | <code php> | + | |
- | class Position { | + | |
- | public function __construct( | + | |
- | public $x, | + | |
- | public $y, | + | |
- | ) {} | + | |
- | } | + | |
- | + | ||
- | function createShapes() { | + | |
- | // Use same position for both shapes | + | |
- | $pos = new Position(10, | + | |
- | $circle = new Circle(position: | + | |
- | $square = new Square(position: | + | |
- | return [$circle, $square]; | + | |
- | } | + | |
- | + | ||
- | $shapes = createShapes(); | + | |
- | + | ||
- | function applyGravity($shapes) { | + |
- | foreach ($shapes as $shape) { | + |
- | /* We're not physicists. :P */ | + |
- | $shape-> | + |
- | } | + |
- | } | + |
- | + |
- | applyGravity($shapes); | + |
- | + | ||
- | foreach ($shapes as $shape) { | + | |
- | var_dump($shape-> | + | |
- | } | + | |
- | // Position(10, | + | |
- | </ | + | |
- | + | ||
- | Since both shapes are created with the same position, '' | + | |
- | + | ||
- | What's the solution? '' | + | |
- | + | ||
- | ===== The solution ===== | + | |
- | + | ||
- | Like arrays, strings and other value types, data classes are // | + | |
- | + | ||
- | With this description, | + | |
- | + | ||
- | <code php> | + | |
- | data class Position { ... } | + | |
- | + | ||
- | function createShapes() { | + | |
- | // Use same position for both shapes | + | |
- | $pos = new Position(10, | + | |
- | $circle = new Circle(position: | + | |
- | $square = new Square(position: | + | |
- | return [$circle, $square]; | + | |
- | } | + | |
- | </ | + | |
- | + | ||
- | // | + | |
- | + | ||
- | ====== Growable data structures ====== | + | |
- | + | ||
- | ===== The problem ===== | + | |
- | + | ||
- | The same problem exists, and is in fact greatly exacerbated, | + | |
- | + | ||
- | <code php> | + | |
- | // Pseudo-code for an internal class | + | |
- | class List { | + | |
- | public $storage = < | + | |
- | + | ||
- | public function append($element) { | + | |
- | $clone = clone $this; // including storage | + | |
- | $clone-> | + | |
- | return $clone; | + | |
- | } | + | |
- | } | + | |
- | + | ||
- | // Userland | + | |
- | $list = new List(); | + | |
- | for ($i = 0; $i < 1000; $i++) { | + | |
- | $list = $list-> | + | |
- | } | + | |
- | </ | + | |
- | + | ||
- | Not only will this loop copy the list object on every iteration, it will also copy the list's entire storage each time. With this approach, the time complexity of a single insert becomes O(n). For m inserts, it becomes O(m*n), which is catastrophic. Looking at the code above, it becomes evident that '' | + |
- | + | ||
- | And when it is shared, we only need a single copy, rather than a copy for each insertion. | + | |
- | + | ||
- | <code php> | + | |
- | function appendAndPrint($list) { | + | |
- | $list = $list-> | + | |
- | $list = $list-> | + | |
- | var_dump($list); | + | |
- | } | + | |
- | + | ||
- | $list = new List(); | + | |
- | $list = $list-> | + | |
- | appendAndPrint($list); | + | |
- | var_dump($list); | + | |
- | </ | + | |
- | + | ||
- | ===== The solution ===== | + | |
- | + | ||
- | As a reminder, data classes are // | + | |
- | + | ||
- | <code php> | + | |
- | function appendAndPrint($list) { | + | |
- | $list-> | + | |
- | $list-> | + | |
- | var_dump($list); | + | |
- | } | + | |
- | + | ||
- | $list = new List(); | + | |
- | $list-> | + | |
- | appendAndPrint($list); | + | |
- | var_dump($list); | + | |
- | </ | + | |
- | + | ||
- | Mind the '' | + | |
- | + | ||
- | One of the primary motivations for this RFC is to make it possible to introduce internal data structures, such as lists (e.g. Vector from php-ds), as a faster and stricter alternative to arrays, without introducing many of the pitfalls that some other languages suffer from by making them reference types. | + |
- | + | ||
- | ===== CoW 🐄 ===== | + | |
- | + | ||
- | But wait, this sounds familiar. | + | |
- | + | ||
- | < | + | |
- | What's the solution? '' | + | |
- | + | ||
- | < | + | |
- | </ | + | |
- | + | ||
- | You may assume that data classes come with the same slowdown as creating a copy for each usage of a data class. However, data classes have a cool trick up their sleeves: Copy-on-write, | + | |
- | + | ||
- | <code php> | + | |
- | function printValue($value) { | + |
- | var_dump($value); | + |
- | } | + |
- | + |
- | function appendAndPrint($value) { | + |
- | $value[] = ' | + |
- | var_dump($value); | + |
- | } | + |
- | + |
- | printValue([' | + |
- | appendAndPrint([' | + |
- | + |
- | $array = [' | + |
- | printValue($array); | + |
- | appendAndPrint($array); | + |
- | </ | + | |
- | + | ||
- | //Note:// This code ignores the fact that array literals are constant, for simplicity. | + | |
- | + | ||
- | With the rules described above, the only line performing potential copies is '' | + | |
- | + | ||
- | This is already how arrays work today. Data classes follow the exact same principle. | + | |
- | + | ||
- | <code php> | + | |
- | function printValue($value) { | + |
- | var_dump($value); | + |
- | } | + |
- | + |
- | function modifyAndPrint($value) { | + |
- | $value-> | + |
- | var_dump($value); | + |
- | } | + |
- | + |
- | printValue(new Position(1, 2)); | + |
- | modifyAndPrint(new Position(1, 2)); | + |
- | + |
- | $pos = new Position(1, 2); | + |
- | printValue($pos); | + |
- | modifyAndPrint($pos); | + |
- | </ | + | |
- | + | ||
- | Only one implicit copy happens, namely in '' | + | |
- | + | ||
- | ===== Equality/ | + | |
- | + | ||
- | TODO | + | |
- | + | ||
- | ===== Method calls ===== | + | |
- | + | ||
- | TODO | + | |
- | + | ||
- | ===== Reflection ===== | + | |
- | + | ||
- | TODO | + | |
- | + | ||
- | ===== Future scope ===== | + | |
- | + | ||
- | - Hashing for '' | + | |
- | + | ||
- | ===== Vote ===== | + | |
- | + | ||
- | Voting starts xxxx-xx-xx and ends xxxx-xx-xx. | + | |
- | + | ||
- | As this is a language change, a 2/3 majority is required. | + | |
- | + | ||
- | <doodle title=" | + | |
- | * Yes | + | |
- | * No | + | |
- | </ | + |
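The contrast the RFC draws between objects and arrays can be observed in PHP as it exists today. The following is a minimal sketch and not part of the RFC text: the `Position` class mirrors the RFC's example, but the `int` property types and the variable names are assumptions made for illustration. The explicit `clone` is shown only as one workaround available in current PHP. It does not use the proposed `data class` syntax, which is not available in any released PHP version.

```php
<?php
declare(strict_types=1);

// Ordinary class: a variable holds a handle to the object, so
// assigning it to another variable shares the same instance.
final class Position
{
    public function __construct(
        public int $x,
        public int $y,
    ) {}
}

$p1 = new Position(1, 2);
$p2 = $p1;          // copies the handle, not the object
$p2->x++;           // the mutation is visible through $p1 as well
var_dump($p1->x);   // int(2)

// An explicit clone creates an independent (shallow) copy.
$p3 = clone $p1;
$p3->x++;
var_dump($p1->x);   // int(2), $p1 is unaffected
var_dump($p3->x);   // int(3)

// Arrays already have value semantics: assignment behaves like a copy,
// and the engine defers the actual copy until one side is written to.
$a1 = ['x' => 1, 'y' => 2];
$a2 = $a1;
$a2['x']++;         // the write separates $a2 from $a1
var_dump($a1['x']); // int(1)
var_dump($a2['x']); // int(2)
```

The array half of the sketch shows the behaviour that the RFC proposes to extend to data classes: assignment looks like a copy to the programmer, while the copy itself is made only when a shared value is modified.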
rfc/data-classes.1713445486.txt.gz · Last modified: 2024/04/18 13:04 by ilutov