
Request for Comments: PHP6 rethink

Introduction

Any time we break backwards compatibility, we should stop to think about what other improvements we could make to the language while we are at it. Sometimes drastic changes are needed. The jump from PHP 4 to PHP 5 was huge: it included a more modern object model and many other improvements. This RFC is an idea for the jump from PHP 5 to PHP 6 and how we might improve upon the object model that we have. Since classes define new types, I will also discuss types in general.

What is good about types and objects in PHP?

This is not a comprehensive list.

  • The object model includes inheritance, polymorphism and abstraction. In concept it works pretty much like any modern OOP language. You have interfaces which you can implement, and classes which you can extend. Nothing particularly special, but that's not a bad thing.
  • Dynamic typing and type juggling. During development it really helps to not have to specify types everywhere. Type juggling between numbers and strings is also useful.
  • Type-hints. Though ill-named, type-hints bring some of the usefulness of static types to PHP. They really cut down on validation code.

What is bad about types and objects in PHP?

  • Functions and methods cannot declare return types. Because PHP doesn't support declaring return types, people started using docblocks to document return values. IDEs help mitigate the cost of writing and maintaining them, but they don't fix the problem.
  • Variables cannot be typed. Dynamic typing helps alleviate development headaches because you don't have to worry about types. Once the project matures, though, not having declared types becomes a grievance. Class properties get the same treatment as function return values: people use docblocks to alleviate the problem.
  • Type hints do not work for primitive (scalar) values. If some property of a class needs to be an integer, you can't simply type-hint it with `int`. To do it properly you should use `filter_var` with `FILTER_VALIDATE_INT` and throw an exception if the value isn't an integer. This is much clunkier and takes significantly more work than a type-hint.
  • Implementing best-practice visibility is verbose. Generally class members should not be publicly visible and should be accessed through methods. Besides preventing foreign code from messing with the class's behavior, this also future-proofs your code because later on you can add logic when you set and retrieve class members. It adds a lot of boilerplate code, though. IDEs can generate the code, but it's still there.
  • Writing 'function' over and over again in a class gets annoying. It could be inferred from context that a method is being defined.
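The manual scalar checks mentioned in the list above can be sketched as two small predicates. This is only an illustration: the helper names `isIntLike` and `isStringLike` are hypothetical and not part of PHP or of this RFC. They wrap the existing `filter_var`, `is_bool` and `is_scalar` functions.

```php
<?php
// Hypothetical helpers (names invented for this sketch).

// True when $value is an integer or a string holding one.
function isIntLike($value) {
    // filter_var() returns the converted int on success and false on
    // failure, so compare with !== false (0 is a valid integer).
    return filter_var($value, FILTER_VALIDATE_INT) !== false;
}

// True for scalars that can sensibly be used as a string.
function isStringLike($value) {
    return !is_bool($value) && is_scalar($value);
}

var_dump(isIntLike(42));        // bool(true)
var_dump(isIntLike('42'));      // bool(true)
var_dump(isIntLike('4.2'));     // bool(false)
var_dump(isIntLike('abc'));     // bool(false)

var_dump(isStringLike('Ann'));  // bool(true)
var_dump(isStringLike(true));   // bool(false)
var_dump(isStringLike(array())); // bool(false)

// The validated value is only an int if you keep filter_var()'s result.
var_dump(filter_var('42', FILTER_VALIDATE_INT)); // int(42)
```

Note that the string '42' passes the check, so code that validates and then stores the original argument may still end up holding a string rather than an integer. This is one more detail a declared `int` type would take care of.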

An example: What is wrong/inconvenient

Consider the following class that holds two properties, id and name. The class follows fairly conventional practices: its properties are not public, and it provides read/write access through get and set methods in order to future-proof the code.

class Person {
    /**
     * @var int
     */
    protected $id;
 
    /**
     * @var string
     */
    protected $name;
 
    /**
     * @param int $id
     * @param string $name
     */
    function __construct($id, $name) {
        $this->id = $id;
        $this->name = $name;
    }
 
    /**
     * @return int
     */
    function getId() {
        return $this->id;
    }
 
    /**
     * @return string
     */
    function getName() {
        return $this->name;
    }
 
    /**
     * @param int $id
     */
    function setId($id) {
        $this->id = $id;
    }
 
    /**
     * @param string $name
     */
    function setName($name) {
        $this->name = $name;
    }
 
}

There is nothing in this code that prevents people from specifying incorrect types. If catching these errors is important enough, you have to introduce type checking with something like this:

class Person {
    // properties
 
    /**
     * @param int $id
     * @param string $name
     * @throws InvalidArgumentException
     */
    function __construct($id, $name) {
        $this->setId($id);
        $this->setName($name);
    }
 
    // getters
 
    /**
     * @param int $id
     * @throws InvalidArgumentException
     */
    function setId($id) {
        if (filter_var($id, FILTER_VALIDATE_INT) === FALSE) {
            throw new InvalidArgumentException();
        }
        $this->id = $id;
    }
 
    /**
     * @param string $name
     * @throws InvalidArgumentException
     */
    function setName($name) {
        if (is_bool($name) || !is_scalar($name)) {
            throw new InvalidArgumentException();
        }
        $this->name = $name;
    }
 
}

That's not even the end of it, though. What if our class really needs to have an interface? We end up with something like this:

interface Person {
 
    /**
     * @return int
     */
    function getId();
 
    /**
     * @return string
     */
    function getName();
 
}
 
class PersonImpl implements Person {
    /**
     * @var int
     */
    protected $id;
 
    /**
     * @var string
     */
    protected $name;
 
    /**
     * @param int $id
     * @param string $name
     */
    function __construct($id, $name) {
        $this->id = $id;
        $this->name = $name;
    }
 
    /**
     * @return int
     */
    function getId() {
        return $this->id;
    }
 
    /**
     * @return string
     */
    function getName() {
        return $this->name;
    }
 
    /**
     * @param int $id
     * @throws InvalidArgumentException
     */
    function setId($id) {
        if (filter_var($id, FILTER_VALIDATE_INT) === FALSE) {
            throw new InvalidArgumentException();
        }
        $this->id = $id;
    }
 
    /**
     * @param string $name
     * @throws InvalidArgumentException
     */
    function setName($name) {
        if (is_bool($name) || !is_scalar($name)) {
            throw new InvalidArgumentException();
        }
        $this->name = $name;
    }
 
}

Note that I've left off public visibility modifiers so there is less to look at, but it would not be abnormal to see them in the wild.

An example: How things might be done

I feel like we can improve this by changing how types and more specifically objects work. By adding static types as an option (and supporting them with type-hints), creating automatic getters and setters, and reducing the need for some keywords, we can cut out most of that code. An example is worth a thousand words:

interface Person {
 
    int get id();
 
    string get name();
 
}
 
class PersonImpl implements Person {
 
    int $id; // creates get and set methods for id
 
    string $name;  // creates get and set methods for name
 
    __construct(int $id, string $name) {
        $this->id = $id; // calls the set id method
        $this->name = $name; // calls the set name method
    }
 
}

The class definition would be exactly equivalent to this PHP 6 syntax:

class PersonImpl implements Person {
 
    private int $__id;
 
    private string $__name;
 
    public __construct(int $id, string $name) {
        $this->id = $id; // calls the set id method
        $this->name = $name; // calls the set name method
    }
 
    public int get id() {
        return $this->__id;
    }
 
    public string get name() {
        return $this->__name;
    }
 
    public void set id(int $id) {
        $this->__id = $id;
    }
 
    public void set name(string $name) {
        $this->__name = $name;
    }
 
}

Notes:

  • Get and set methods are generated for public and protected variables, but get and set methods are not generated for private variables.
  • A get method can have a different visibility than the set method in the same class, but you cannot reduce visibility when extending a class or implementing an interface.
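The visibility rules in these notes mirror what current PHP already enforces for ordinary methods. A minimal sketch using conventional getter and setter methods follows; the class names `Account` and `AuditedAccount` are hypothetical and chosen only for illustration.

```php
<?php
// Read access is public, write access is protected.
class Account {
    private $balance = 0;

    public function getBalance() {
        return $this->balance;
    }

    protected function setBalance($balance) {
        $this->balance = $balance;
    }

    public function deposit($amount) {
        $this->setBalance($this->getBalance() + $amount);
    }
}

class AuditedAccount extends Account {
    // Widening visibility (protected to public) is allowed when overriding.
    public function setBalance($balance) {
        parent::setBalance($balance);
    }

    // Redeclaring getBalance() as protected would be a fatal error,
    // because an override may not reduce visibility.
}

$account = new AuditedAccount();
$account->deposit(50);
$account->setBalance(75);
echo $account->getBalance(), "\n"; // prints 75
```

Under the proposal, the same asymmetry would presumably be written on the generated get and set methods instead of on hand-written ones.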

FAQ

Q: Why do we want the syntax `$object->property;` to trigger a get method and `$object->property = 'prop';` to trigger a set method?

A:

  1. Implementation details can be changed later. What if you need to change what happens when you set a variable later on? I'll use the real-world example of upper-casing the string (often done to make things case-insensitive). If you use simple assignment you can't do this without changing the calling code:
    class Object {
        var $property;
    }
    $object = new Object();
    $object->property = 'name';
     
    // Later on I need to upper-case the value when it is set, so I modify the class:
    class Object {
        private $property;
     
        function getProperty() {
            return $this->property;
        }
     
        function setProperty($property) {
            $this->property = strtoupper($property);
        }
    }
     
    // code like the following will now break:
    $object->property = 'name';

    You could use `__get` and `__set` magic, but they are really a poor man's get and set implementation. They are also very hard to build tools for.

  2. It is clear that you are accessing or modifying an object property when you use the traditional syntax. When you use the old-style getter and setter methods, it's not as clear.
    $object = new Object();
    $object->property = 'foo'; //obviously accessing a property
    $object->setProperty('foo'); // argue about clarity if you want, but it's less obvious to me
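For comparison, here is a rough sketch of how the property syntax can be routed to getter and setter methods today with the magic methods PHP already provides. It is an emulation under stated assumptions, not the mechanism this RFC proposes: the trait `AccessorEmulation` and the class `Label` are hypothetical names, and traits require PHP 5.4 or later.

```php
<?php
// Hypothetical helper trait: forwards property reads and writes to
// conventional getX()/setX() methods when the property is inaccessible.
trait AccessorEmulation {
    public function __get($name) {
        $getter = 'get' . ucfirst($name);
        if (method_exists($this, $getter)) {
            return $this->$getter();
        }
        throw new LogicException("No getter for property '$name'");
    }

    public function __set($name, $value) {
        $setter = 'set' . ucfirst($name);
        if (method_exists($this, $setter)) {
            $this->$setter($value);
            return;
        }
        throw new LogicException("No setter for property '$name'");
    }
}

class Label {
    use AccessorEmulation;

    private $text;

    public function getText() {
        return $this->text;
    }

    public function setText($text) {
        // The upper-casing rule lives in one place; callers are unchanged.
        $this->text = strtoupper($text);
    }
}

$label = new Label();
$label->text = 'name';   // $text is private here, so __set() calls setText()
echo $label->text, "\n"; // __get() calls getText(); prints NAME
```

The calling code keeps the plain property syntax, but the lookup happens by string name at run time, which is part of why tools have a hard time with this approach.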

Changelog

rfc/php6-rethink.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1