rfc:autodefine

RFC __autodefine

Introduction

This proposal proposes to introduce the ability for automagically defining missing definitions at run time. The reader should have a reasonable level of PHP knowledge and computer languages when reading this proposal. Note: Whenever __autodefine is used, also spl_autodefine is meant.

Current situation

PHP currently supports defining missing classes by the __autoload function. This solution mimics the class loader concept from Java. Once defined, undefined class functions (methods) can be emulated by using __call and __callStatic. It can be argued that __autoload is enough for the future if the future is object-oriented.

Proposition

While very useful in its own right __autoload is limited: only classes can be loaded dynamically. However, there is still a lot of code that is either mixed functional and object-oriented or only functional. Excluding that code from such a solution makes life very hard on everyone maintaining mixed code bases. It is understandable that by denying this function to non-object oriented code an incentive exists for maintainers and developers to switch to object-oriented programming. Although understandable it is not acceptable, for the following reasons:

  1. There are enough people maintaining and writing code in a perfect manner that do not grasp object orientation
  2. Any large code base should have the possibility to slowly migrate and not be coerced into an object oriented way of working
  3. Even if this coercion would work, it would not guarantee the solution is object-oriented, only that it is coded in classes

So, it can be stated that a number of language constructs are missing the option to autoload.

From a conceptual point, tokens in a sequence of statements refer to definitions. The implementation of almost all run time environments demands that a definition exists before first referenced. __autoload changed this rule by allowing classes to be defined when the execution of a statement hits on an undefined class. This behaviour should be generalized.

Generally, it can be said that PHP needs a mechanism for automagically defining undefined elements. The proposed name for that function is __autodefine. The parameters to this function would be $name and $type. The type parameter would refer to all the existing elements that are definable. Probably the maximal set of language constructs that could be supported are: T_ARRAY, T_CLASS, T_CLASS_C, T_CONST, T_FUNCTION T_FUNC_C, T_INCLUDE, T_INCLUDE_ONCE, T_INTERFACE, T_METHOD_C, T_NAMESPACE, T_NS_C, T_REQUIRE, T_REQUIRE_ONCE, T_USE, T_VAR, T_VARIABLE (Source: http://nl2.php.net/manual/en/tokens.php). Namespaces will be part of the name as is the case in call_user_func.

The relation with the current implementation is that __autoload equals __autodefine( $name, T_CLASS ) with the exception of the optional file_extensions parameter. __autoload need not be changed and __autodefine can live alongside __autoload. Author believes __autoload and its related spl_* functions should be marked 'deprecated'.

Minimal solution proposal

The author believes the minimal solution should support at least the types T_FUNCTION, T_CLASS, T_INTERFACE and include (T_INCLUDE, T_INCLUDE_ONCE, T_REQUIRE, T_REQUIRE_ONCE). At the very minimal only the __autodefine is needed. However, looking at the SPL it is probably best to also implement the same set of spl_* functions that currently exist for autoload support.

Function prototype

 __autodefine( $name, $type )
 $name is the string of the missing definition

 $type is the integer identification of the type and defined by constants.
$type constant $name format $name value examples
T_FUNCTION [namespace][class name][::|->]function name a\b\foo, System::boot, SomeClass->getWidth
T_CLASS [namespace]class name Image, ns\Image
T_INTERFACE [namespace]class name Image, ns\Image
T_INCLUDE file name somefile, ../somefile, \includes\Somefile
T_INCLUDE_ONCE file name somefile, ../somefile, \includes\Somefile
T_REQUIRE file name somefile, ../somefile, \includes\Somefile
T_REQUIRE_ONCE file name somefile, ../somefile, \includes\Somefile

Code examples

 foo(b);            // __autodefine( 'foo', T_FUNCTION )
 
 p = new PHP();     // __autodefine( 'PHP', T_CLASS )
 
 include 'piece'    // __autodefine( 'piece', T_INCLUDE )
 
 ons\foo();         // __autodefine( 'ons\foo', T_FUNCTION )
 
 p->im();           // __autodefine( 'PHP->im', T_FUNCTION )
 
 p->cm();           // __autodefine( 'PHP::cm', T_FUNCTION )

Advantages

A whole new array of possibilities opens up for managing code, both at run time and both at design time (development). Code is no longer bound to file containers and file systems. No more dependencies on the include path. Small pieces of code can exist on their own in any place (e.g. zip file). At run time only the code pieces that are needed for execution are retrieved and defined. No longer parsing of complete files when only 5 lines of code will be executed. A parse error in a file not relevant to the piece of code that will be executed will not prevent execution anymore. By splitting code up, developers can work side by side on the same code base, every developer on a set of code pieces, much smaller than the files now and thus reducing (locking) conflicts. Code pieces can be tested standalone and accepted. Progress is measurable on code piece level. Basically, a lot of metafunctions are suddenly possible because splitting up code in smaller pieces has become a workable solution. Code could be for example managed in a wiki using pages to represent code pieces, automatically getting revisions etc. If you put the code in a Wiki, documentation can be kept apart from the code (less parsing during execution). If needed the code pieces can be assembled into any number of files for delivery. An extreme solution is assembling all the code pieces into a zip file and having a run time implementation of __autodefine that picks definitions from a zip file. While exactly the same code is available during development in a wiki. Large codebases can be split up over time into more manageable pieces giving the developers renewed control over their application. The __autodefine function will enable all sorts of new ways of managing code and work processes. Security can be enhanced enormously as it is not common anymore where definitions will come from and how they are decrypted. Even if someone gets access to the source code (outside the web root), it might be scrambled. Also insight into the system can be obtained by observing definition patterns that can be gathered during __autodefine processing. Etcetera...

Disadvantages

Execution may be slower (may because maybe less code needs to be parsed and loaded). However, this can be countered with different implementations for __autodefine at production time and development time. The author also believes that others will find ways to counter this disadvantage by using caching or keeping definitions in memory inbetween executions.

Changelog

2010-10-07 Initial version

rfc/autodefine.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1