PHP RFC: Structural Type Hinting
- Version: 0.2
- Date: 2013-06-25
- Author: Anthony Ferrara ircmaxell@php.net
- Status: Withdrawn
- First Published at: http://wiki.php.net/rfc/protocol_type_hinting
Introduction
Currently PHP uses an enforced interface style, where the argument object to a function or method must implement the respective interface. This RFC discusses inverting the dependency to allow for run-time resolving of interfaces. This is very similar to how GO implements interfaces.
In short, rather than checking the type-tree as traditional type-hinting works, this only checks to see that the public APIs match (that if you added “implements $interface” to the class definition, if it would pass)...
What Are Go-Style Interfaces?
Go-Style interfaces (called Structural Typing by this RFC) are basically interface type hinting which are resolved by looking at the structure of the object being passed rather than the type information. The class never implements the interface, but instead provides a compatible API. The receiver (the method receiving the argument) can choose to enforce the requirement or not.
Why Go-Style Interfaces?
The basic premise is that it makes two areas significantly easier to manage:
Cross-Project Dependencies
When you have cross-project dependencies. Currently, both packages must declare a dependency to a third package for the interface. A good example of this is the PSR-3 logger interface. Currently, the PSR-3 interface must be included by every project that declares a logger which you want to conform to the interface. This results in pulling in project-level dependencies even in cases where you don't need them.
Implementing Structural Typing would allow the Class maintainers to just build their classes without additional dependencies, but the receivers (consumers) of those objects to still have some form of type-safety.
Decoupling Of Classes
Right now, there's now way to type-hint on the PDO class while still allowing for decorators or other objects that also satisfy the same interface. Using Structural Typing that's completely possible. This should greatly simplify the building of Mock objects as well as generators.
- class_decoupling.php
<?php public function foo(<PDO> $db) { $db->query('foo'); } ?>
This further strengthens the safety offered by type-hints, while decoupling the class further from the interface and classes of the hint.
Proposal
I propose adding a special syntax to the current type-hinting paradigm to allow for indicating a non-strict *instanceof* which would resolve the dependency as a Structural Type.
Proposed Syntax
The current proposed syntax is basically wrapping a traditional type hint (foo(Logger $logger)) with `<>` to indicate the argument should be treated as a Structural Type Hint.
- duck_typing.php
<?php interface Logger { public function log($argument); } class Bar { public static function foo(<Logger> $logger) { $logger->log('foo'); } } class FileLogger { public function log($message) { file_put_contents('somelogfile', $message); } } Bar::foo(new FileLogger); ?>
Proposed Behavior
Any valid class-style identifier can be used as the “structure name”. That means that both classes and interfaces are supported as identifiers. Then the passed in object is checked to ensure that every method on the “structure type” matches in signature and flags to the passed object. If any do not match, the object is rejected and an E_RECOVERABLE_ERROR is raised.
- duck_typing.php
<?php interface Logger { public function log($argument); } class Bar { public static function foo(<Logger> $logger) { $logger->log('foo'); } } class FileLogger { public function log($message) { file_put_contents('somelogfile', $message); } } class StringLogger implements Logger { public function log($message) {} } class StaticLogger { public static function log($message) { /* blah */ } } class OtherLogger { public static function log($message, $bar) { /* blah */ } } Bar::foo(new FileLogger); // Good! Bar::foo(new StringLogger); // Good! Bar::foo(new StaticLogger); // Bad! STATIC does not match! Bar::foo(new OtherLogger); // Bad! Arg count does not match! ?>
Use-Cases
Flexible Middleware
Right now there's a project called Stack. The premise is to provide middlewares for Symfony's HttpKernel. In practice these middlewares are nothing more than decorators for the HttpKernel. Let's show the HttpKernel Interface:
- HttpKernelInterface.php
<?php namespace Symfony\Component\HttpKernel; use Symfony\Component\HttpFoundation\Request; interface HttpKernelInterface { const MASTER_REQUEST = 1; const SUB_REQUEST = 2; public function handle(Request $request, $type = self::MASTER_REQUEST, $catch = true); } ?>
Really, there isn't anything Symfony specific there. The only specific part is the Request class, which is quite big.
So right now, middleware has to be coupled on the HttpKernel and as such, on the Request class and a whole lot of other parts of Symfony...
Reducing Coupling
Let's imagine you're creating a firewall middleware to limit requests to a specific IP (or allow for all except a certain IP). As it stands today, you need a large chunk of Symfony to do so.
But with Structure Typing, you can create two new interfaces:
- RequestGetClientIp.php
<?php interface RequestGetClientIp { public function getClientIp() ?>
- HttpKernelClientIp.php
<?php interface HttpKernelInterfaceForClientIp { const MASTER_REQUEST = 1; const SUB_REQUEST = 2; public function handle(<RequestGetClientIp> $request, $type = self::MASTER_REQUEST, $catch = true); } ?>
Now, my middleware becomes:
- firewall.php
<?php class Firewall implements HttpKernelInterfaceForClientIp { protected $parent = null; public function __construct(<HttpKernelInterfaceForClientIp> $parent) { $this->parent = $parent; } public function handle(<RequestGetClientIp> $request, $type = self::MASTER_REQUEST, $catch = true) { if ($request->getClientIp() === '127.0.0.1') { return $this->parent->handle($request); } throw new Exception('Not Authorized'); } } ?>
The cool thing is that I'm effectively decoupled from Symfony here. If ZendFramework changed their Http\Client to use the same basic API, you could re-use the middleware on both, without needing a cross-project dependency between Symfony and Zend (and thereby loading the interfaces on every request.
Standards Based Interface Declarations
Currently, the PSR-FIG Group group is starting to publish interfaces for standardized components. At present, this requires that each project that either provides a “standard implementation” or uses a “standard implementation” must declare a dependency on this third project.
This raises a significant issue, because it causes a triangular dependency which requires some external effort to resolve. This means that you need some sort of tool to resolve that dependency for you, or you both sides copy/paste the implementation into their tree, and must “register an autoloader” for that dependency, and the first one to do so will win. Either way, it's not a straight forward solution.
For example, take the PSR-3 LoggerInterface
- LoggerInterface.php
<?php namespace psr\log; interface LoggerInterface { public function emergency($message, array $context = array()); public function alert($message, array $context = array()); public function critical($message, array $context = array()); public function error($message, array $context = array()); public function warning($message, array $context = array()); public function notice($message, array $context = array()); public function info($message, array $context = array()); public function debug($message, array $context = array()); public function log($level, $message, array $context = array()); } ?>
Reducing Coupling
Using Structural Typing, we can solve this triangular dependency by only requiring the interface when we need it (at the receiver) as opposed to at the producer:
We can also narrow the requirement for an acceptable logger based on our application's needs. So we can redefine:
- Psr3LogWarningAndError.php
<?php interface Psr3LogWarningAndError { public function error($message, array $context = array()); public function warning($message, array $context = array()); } ?>
- MyCode.php
<?php class MyClass { protected $logger; public function __construct(<Psr3LogWarningAndError> $logger) { $this->logger = $logger; } public function doSomething($foo) { if (!$foo) { $this->logger->warning("Foo!!!", [$foo]); } } } ?>
So now, our code can depend on the narrower interface that we actually need, while allowing all PSR-3 compatible implementations to pass. But it also will allow us to replace polymorphically the logger with a non-PSR-3 logger (because it doesn't implement other parts of the interface, for example), but fulfills our entire need.
The key here is that it inverts the dependency on who gets to define what the needed units of functionality are. It allows the receiving code to define the scope of required functionality instead of the sending code.
It also solves the triangular dependency problem since the sender never needs to explicitly require the dependency. That can be left for an off-line check (or a test), reducing the amount of and need for dependency resolving tools for the purpose of common interfaces...
Trait Typing
Currently, traits do not allow for specifying of typing information. And this is a good thing (it is by design).
However, there are many times where we may wish to infer that the functionality presented by a trait is present in an object. An example would be Zend Framework's ProvidesEvents Trait. It basically looks like this:
- ProvidesEvents.php
<?php namespace Zend\EventManager; trait ProvidesEvents { protected $events; public function setEventManager(EventManagerInterface $events) { /* snip */ } public function getEventManager() { /* snip */ } } ?>
As the current system stands, classes that use the trait need to also implement a separate interface to get the message of the behavior across to the receiver that it supports events.
With Structural Type Hinting, we can instead hint against the trait directly, which would require a class with the same public API that the trait provides, irrespective of if it is actually using the trait or not.
- RequiresEvents.php
<?php function triggerEvent(<Zend\EventManager\ProvidesEvents> $target, $eventName) { $target->getEventManager()->trigger($eventName, $target); } ?>
If the object uses the trait, it will always resolve. But it also gives other classes which implement the same public API the ability to resolve the trait.
The Place For Current Interfaces
Why not just get rid of current interfaces and change their behavior to Structural typing (besides the MASSIVE BC break)?
In practice there are two reasons (times) that you would use an interface:
1. To provide typing information about a domain object (or a value object). This is where the typing actually means something specific to the application.
An example would be a User object in an application. You may want to have a UserInterface which the User class implements, because the interface actually implies that the object *belongs* to the domain. It's providing *type* information about the object.
2. To provide functionality information about a object. This where the interface really just describes the functionality of the object.
An example would be a service which encodes a password. There's no special typing information needed. The interface simply provides a semantic way of identifying the API of the service. So it's not really providing *type*, but more capability.
With this new proposal, Type information would still be implemented via traditional interfaces. But capability information would use Structural Typing.
So there is very much a place for both to live side-by-side in the same language.
Backward Incompatible Changes
Considering there is no addition to the reserved word table, and this only adds new branches to the compiler, there are no BC breaks.
Effects On Opcode Caches
The current implementation would have no effect on Op-Code caching.
Proposed PHP Version(s)
Proposed for PHP 5.NEXT
SAPIs Impacted
No SAPI impact.
Impact to Existing Extensions
There shouldn't be any Extension impact, as no APIs are changed. The only potential impact would be for extensions which are pre-processing the op-array prior to compiling, where the new operand type IS_PROTOCOL is used to signify the type-hint at the compiler level.
New Constants
None
php.ini Defaults
None
Open Issues
Raising Errors
Currently, only a E_RECOVERABLE_ERROR is raised saying that the passed object does not look like the structure type. We may want to raise an E_NOTICE or E_WARNING as well to say WHAT did not match (was a method missing? Was the signature different? etc).
Syntax
I considered implementing a new “type” of interface for this (declaring a new reserved word “structure”). However after thinking about it, I felt that it was necessary to extend this concept to classes as well as traditional interfaces.
If the <Foo> $foo syntax is not acceptable, there are a few alternatives that would likely work:
- @Foo $foo
- %Foo $foo
- *Foo $foo
- ~Foo $foo
Personally, I think the <Foo> $foo syntax is the clearest, but it may be too close to Generics for comfort...
Performance
It's worth noting that since this is a separate branch, the current performance of normal type-hints remains uneffected.
Currently, instanceof short-circuiting and caching of structure checks has been implemented.
While performance is an apparent concern, the benchmarks indicate performance at works on-par with existing type hints (and when called multiple times can be faster)...
Here are the results of the benchmarks:
- benchmark.php
<?php interface Foo { public function foo(); } class Bar { public function foo() {} } class Baz implements Foo { public function foo() {} } function benchmark($func, $times, $arg) { $s = microtime(true); for ($i = 0; $i < $times; $i++) { $func($arg); } $e = microtime(true); return $e - $s; } $times = 1000000; $interface = benchmark(function(Foo $foo) {}, $times, new Baz); echo "Interface in $interface seconds, " . ($interface / $times) . " seconds per run\n"; $structural = benchmark(function(<Foo> $foo) {}, $times, new Bar); echo "Structural in $structural seconds, " . ($structural / $times) . " seconds per run\n"; $native = benchmark(function($foo) {}, $times, new Bar); echo "Native in $native seconds, " . ($native / $times) . " seconds per run\n"; ?>
When Run Once
When run once (with $times = 1):
Interface in 1.5974044799805E-5 seconds, 1.5974044799805E-5 seconds per run
Structural in 1.4066696166992E-5 seconds, 1.4066696166992E-5 seconds per run
Native in 6.9141387939453E-6 seconds, 6.9141387939453E-6 seconds per run
The margin of error for the test is approximately the same difference as between Interface and Structural. This means that the performance for a single run is about constant.
When Run Many Times
When run with $times = 1000000;
Interface in 0.50202393531799 seconds, 5.0202393531799E-7 seconds per run
Structural in 0.48089909553528 seconds, 4.8089909553528E-7 seconds per run
Native in 0.3850359916687 seconds, 3.850359916687E-7 seconds per run
In this case, the margin of error was less than the difference, meaning that the Structural approach is slightly more performant at runtime than the interface based approach.
Unaffected PHP Functionality
Any not using the new syntax.
Future Scope
Patches and Tests
I have created a proof-of-concept patch (needs a little bit of refactoring for the official change), but is functional with some basic tests:
Proof-Of-Concept Branch: https://github.com/ircmaxell/php-src/tree/protocol_proof_of_concept
Diff From Current Master: https://github.com/ircmaxell/php-src/compare/protocol_proof_of_concept
References
ChangeLog
* 0.1 - Initial Draft * 0.2 - Rename to Structural Typing, add benchmark results