Request for Comments: New File Type for Pure-Code PHP Scripts

Abstract

This RFC proposes the creation of two new file types, .phpp and .phpf. Files of both types would be parsed entirely as PHP code; that is, all code would be executed without <?php and ?> tags. No raw HTML output (i.e. content following a ?> tag) would be allowed in these scripts, though echo/print and other output statements would continue to work without any restriction. In addition, a .phpp file would recursively disallow any ?> content in the scripts that it includes, resulting in a stack that contains only PHP code. A .phpf file would not have this restriction.

Background

PHP as a Templating Language

PHP was originally created to operate as server-side code embedded within a block of HTML with the intention of allowing dynamic, conditional output to be directly and easily inserted into HTML pages. This is a fundamental characteristic of PHP that remains to this day, and is arguably one of the primary reasons for its popularity and success.

Modern Frameworks and Standards

The web has evolved considerably since the late 1990s when most people's idea of a “website” consisted of an “About Me” page with a photo, lots of text, some links; and, of course, a guestbook. Today, websites are largely more functional and practical in nature, replacing much of what used to be done by client-side software executed by the operating system.

As the web has evolved, so too have the various methods for developing in it. PHP now largely takes the form of classes and code-only libraries that are hooked from the UI layer of the modern website. The widely-accepted MVC (Model-View-Controller) standard mandates that the UI layer (“view”) should hook to a bridge layer (“controller”), which then calls on complex pure-code classes/functions (“model”) and funnels the sanitized results back to the view layer. As such, the model layer is inherently forbidden to have raw HTML (i.e. “view”) code included anywhere in its stack, since the very purpose of MVC is to segregate these roles so that designers and programmers can simultaneously work on a project without having to be skilled in both or worry about frequent collisions and merge conflicts between the two. In a way, this could be thought of as an evolved form of templating.

However, many frameworks utilize more “tangled” inclusion stacks where the model and view are not necessarily separated throughout. For these instances, it would be beneficial to have code-only scripts that are allowed to include scripts that may contain raw HTML as well.

The Tags

Over the years, people have occasionally called for one of a few fundamental changes to how PHP scripts are executed, typically involving the removal of the <?php and/or ?> tags or changing it so that PHP files default to PHP mode instead of HTML when parsing. Such changes would be devastatingly bad, as they would quite literally cause every single PHP script ever written to break. Even for a major version release, such a BC break would be utterly insane.

This RFC does NOT, in any way, change the way in which .php scripts are parsed, period! There will be no BC breaks.

The Webserver

The webserver must be configured so that PHP will be called to execute a file with a given extension (.php), otherwise the code would just be displayed as text.

Current Behavior

The <?php tag, contained within one of these files, tells the PHP interpreter to, in essence, “switch to PHP mode” and start parsing the data as PHP code. When the ?> tag is reached, the interpreter “switches back” and passes whatever follows straight through as output (typically HTML). If no tags are given, the file data is treated as HTML until a <?php tag is reached.
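The mode switching described above can be observed with PHP's own tokenizer. The following is a minimal sketch rather than part of the proposal: the helper name tokenNames() and the sample source are made up for illustration, while token_get_all() and token_name() are existing functions from the tokenizer extension.

```php
<?php
/**
 * List the token names the PHP lexer produces for a source string.
 * tokenNames() is a hypothetical helper, not an existing PHP function.
 */
function tokenNames(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // Multi-character tokens are arrays [id, text, line];
        // single-character tokens such as ";" are plain strings.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

// Raw HTML, then a PHP block, then raw HTML again.
$source = "<p>Hello</p>\n<?php echo 1; ?>\n<p>Bye</p>";
print_r(tokenNames($source));
```

Everything outside the tags is reported as T_INLINE_HTML, and the tags themselves appear as T_OPEN_TAG and T_CLOSE_TAG.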

The Problem

Pure-code PHP files (i.e. PHP scripts that have <?php at the top and never use the ?> tag; thus never “switching to HTML mode”) are extremely common these days, and that trend will likely continue even further into the foreseeable future. At some point, a reasonable observer begins to ask whether it is appropriate for these scripts to be “defaulting” to HTML output, overridden by a <?php tag at the very top. Another reasonable observer could then counter, “Well, aside from BC breakage, there will always be PHP scripts that contain raw HTML code; to remove that feature from PHP would be to remove one of its most popular aspects.”

The first observer might then counter that having to include a <?php tag at the top of every file, while admittedly very low-impact, is tedious and sometimes even forgotten (though not for long if the developer bothers to test his/her own code). Also, sometimes a developer might inadvertently add line breaks or other whitespace before the initial <?php tag at the top of the file, thus potentially causing rendering issues that might be difficult to track down if spotted later on. The ability to include a class or other pure-code file that the developer can be 100% sure will not contain any such raw output could prove to be tremendously helpful, particularly for someone trying to code in an MVC or other segregated framework standard.
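The stray-whitespace problem mentioned above can be detected mechanically. This is only a sketch under the assumption that the tokenizer extension is available; leadingOutput() is a hypothetical helper name, not something PHP provides.

```php
<?php
/**
 * Return the raw text a script would emit before its first opening tag.
 * If the source has no opening tag at all, the whole source is raw output.
 * leadingOutput() is a hypothetical helper used for illustration only.
 */
function leadingOutput(string $source): string
{
    $tokens = token_get_all($source);
    // Anything ahead of the first opening tag arrives as one T_INLINE_HTML token.
    if ($tokens !== [] && is_array($tokens[0]) && $tokens[0][0] === T_INLINE_HTML) {
        return $tokens[0][1];
    }
    return '';
}

// A blank line slipped in above the opening tag.
var_dump(leadingOutput("\n<?php class Foo {}"));
```

A non-empty result means the file sends output as soon as it is included, which is exactly the kind of surprise a pure-code file type would rule out.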

The Solution

It is my view that both of these hypothetical observers are correct. The only logical conclusion, then, is that PHP has naturally branched into two primary uses, both of which are equally valid yet substantively distinct. Changing how existing .php files are parsed would not be a viable option. Therefore, the reasonable solution is to create a new, second file type designed for the “pure code” scripts mentioned above. These files would have the extension “.phpp”, which stands for “PHP Pure”.

A .phpp file would behave exactly like a .php file, except for the following:

  • Instead of defaulting to HTML, everything within the file is parsed as if it were inside a <?php block.
  • If a ?> tag is reached within one of these files, an E_RECOVERABLE_ERROR will be raised and the tag, as well as any raw HTML code that follows it, will be ignored.
  • If a <?php tag is reached within one of these files, an E_NOTICE will be raised.
  • Any .phpp scripts included by a .phpp file must also adhere to these rules and contain only pure PHP code.
  • A regular .php script cannot be included from a .phpp script. An E_WARNING will be raised for include and an E_RECOVERABLE_ERROR will be raised for require; in both instances, the included file will be ignored.
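As a rough illustration of the closing-tag rule in the list above, the check below finds the places where a tagless source would drop out of PHP mode. It is a userland approximation, not the proposed engine change: findCloseTags() is a hypothetical name, and the sketch assumes the source contains no opening tag, so one is prepended before the source is handed to the existing token_get_all() function.

```php
<?php
/**
 * Return the line numbers on which a tagless source contains a closing tag.
 * findCloseTags() is a hypothetical helper; the proposal itself would raise
 * E_RECOVERABLE_ERROR inside the engine instead.
 */
function findCloseTags(string $taglessSource): array
{
    $lines = [];
    // Prepending the opening tag on the same line keeps line numbers unchanged.
    foreach (token_get_all('<?php ' . $taglessSource) as $token) {
        if (is_array($token) && $token[0] === T_CLOSE_TAG) {
            $lines[] = $token[2]; // index 2 holds the token's line number
        }
    }
    return $lines;
}

// A closing tag inside a string literal is not a violation; a real one is.
var_dump(findCloseTags('echo "?>";'));
var_dump(findCloseTags("echo 1;\n?>\n<b>raw</b>"));
```

Because the tokenizer does the work, a closing tag that merely appears inside a quoted string is not reported.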

However, there are frameworks and libraries that don't follow a segregated architecture but would nonetheless benefit from being able to use PHP-only files. We thus have a situation where there are viable use-cases for both having a pure PHP stack and having a pure PHP file with an impure stack. The logical solution, then, is to create another file type to accommodate this. Pure PHP files that may include impure scripts would have the extension “.phpf”, which stands for “PHP Framework”, because existing frameworks will be the primary use-case for this type.

A .phpf file would behave exactly like a .phpp file, except that regular .php scripts can be included and are allowed to contain raw ?> content. If a .phpp file is included, then the standard .phpp rules would still apply for that script's inclusion stack.
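The one inclusion case that this RFC spells out as an error (a .phpp script pulling in a regular .php script) can be modelled in a few lines. This is a sketch only: inclusionErrorLevel() and its extension-string parameters are hypothetical, and every combination that the text does not describe as an error, including a .phpp script including a .phpf script, is assumed here to be allowed.

```php
<?php
/**
 * Return the error level raised when $parentExt includes $childExt,
 * or null when the inclusion is allowed.
 * Hypothetical helper; only the .phpp -> .php case is stated in the RFC,
 * all other combinations are ASSUMED to be allowed.
 */
function inclusionErrorLevel(string $parentExt, string $childExt, string $keyword): ?int
{
    if ($parentExt !== 'phpp' || $childExt !== 'php') {
        return null;
    }
    // require and require_once fail harder than include and include_once.
    return strpos($keyword, 'require') === 0 ? E_RECOVERABLE_ERROR : E_WARNING;
}

var_dump(inclusionErrorLevel('phpp', 'php', 'include') === E_WARNING);
var_dump(inclusionErrorLevel('phpf', 'php', 'require') === null);
```

In both error cases the RFC says the included file is ignored; the sketch only reports the level and leaves that behaviour to the caller.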

Inclusion of Mixed Code

The .phpp file type is generally intended for use with applications that have been designed from the ground-up with MVC code segregation. If you're working within a less certain framework, it is recommended that you use .phpf instead.

The following flow chart shows an example of both the right way and the wrong way to use a .phpp stack:

[Flow chart not reproduced here.]

Naming Conventions

Obviously, PHP does not rely on file extensions themselves to determine how to parse a script. This RFC will NOT change that! The .phpp and .phpf extensions mentioned in this RFC are merely the recommended naming convention.

Script Inclusion

If you're including a .phpf or .phpp script from within a PHP script, we'll need a way to tell PHP what type of file we're including. Instead of creating new keywords, the most sensible approach will be to add an optional bit-flag argument to include and require. The following syntax is proposed:

require[(] $script_filename, $script_type = PHP_SCRIPT_TYPE_NORMAL [)];

Possible values for $script_type:

PHP_SCRIPT_TYPE_NORMAL (0x01)

  • If the included file contains PHP code, parse it.

PHP_SCRIPT_TYPE_TAGLESS (0x02)

  • Code is assumed to be PHP throughout the script. The <?php tag throws E_NOTICE and the ?> tag throws E_RECOVERABLE_ERROR.

PHP_SCRIPT_TYPE_STACK (0x04)

  • The $script_type is applied to all child scripts of the one being included.
  • Question : Would anyone see value in adding an override constant that, while not recommended, allows the developer to apply a different $script_type somewhere deeper in the stack? Personally this doesn't sound useful to me, but I'd be willing to put it in if enough of you wanted it.

PHP_SCRIPT_TYPE_CODE_FILE (PHP_SCRIPT_TYPE_NORMAL | PHP_SCRIPT_TYPE_TAGLESS)

  • The entire script is assumed to be PHP code and is parsed accordingly.

PHP_SCRIPT_TYPE_CODE_STACK (PHP_SCRIPT_TYPE_NORMAL | PHP_SCRIPT_TYPE_TAGLESS | PHP_SCRIPT_TYPE_STACK)

  • The entire script and all its child scripts (i.e. its “stack”) are assumed to be PHP code and parsed accordingly.
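The flag arithmetic can be checked in plain PHP today. The constants below are the ones proposed in this RFC and do not exist in PHP, so the sketch declares them itself. It assumes the composite values are meant to combine the individual flags with bitwise OR, since a bitwise AND of distinct single-bit flags is always zero. effectiveScriptType() is a hypothetical helper showing one reading of how the STACK flag would propagate, with no override deeper in the stack.

```php
<?php
// Proposed constants from this RFC, declared here because PHP does not define them.
const PHP_SCRIPT_TYPE_NORMAL  = 0x01;
const PHP_SCRIPT_TYPE_TAGLESS = 0x02;
const PHP_SCRIPT_TYPE_STACK   = 0x04;

// Composite types: ASSUMED to be bitwise OR combinations of the flags above.
const PHP_SCRIPT_TYPE_CODE_FILE  = PHP_SCRIPT_TYPE_NORMAL | PHP_SCRIPT_TYPE_TAGLESS;
const PHP_SCRIPT_TYPE_CODE_STACK = PHP_SCRIPT_TYPE_CODE_FILE | PHP_SCRIPT_TYPE_STACK;

/**
 * Decide which type applies to a child script (hypothetical helper).
 * If the parent carries the STACK flag, its type is forced onto the child;
 * otherwise the type requested at the include site is used.
 */
function effectiveScriptType(int $parentType, int $requestedType): int
{
    return ($parentType & PHP_SCRIPT_TYPE_STACK) ? $parentType : $requestedType;
}

printf("CODE_FILE  = 0x%02X\n", PHP_SCRIPT_TYPE_CODE_FILE);
printf("CODE_STACK = 0x%02X\n", PHP_SCRIPT_TYPE_CODE_STACK);
```

Under this reading, a script loaded as PHP_SCRIPT_TYPE_CODE_STACK passes that type down to everything it includes, while a script loaded as PHP_SCRIPT_TYPE_CODE_FILE lets each include site choose its own type.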

Using "as" Instead of Comma?

It has been suggested that, instead of “require $filename, $constant” we should use “require $filename as $constant”. Further discussion on this topic is needed.

Webserver Execution

Ideally, .phpf and .phpp files should be executable directly from a browser request in addition to being loaded through include/require. This will probably require the creation of two new webserver handlers.

However, script inclusion is the more important priority. Direct webserver execution can be implemented at a later time if need be.

Final Thoughts

I targeted this for PHP 6 since, even though it doesn't pose any BC-break issues, it is a fundamental feature addition to the core. Unlike some similar RFCs floating around about this topic, this does not add any new keywords or other complex/confusing elements to the language and it doesn't run the risk of causing any existing code to break. Instead, we're simply creating a new file type designed to allow PHP to keep pace with modern, emerging web development standards while simultaneously preserving PHP's fundamental identity as an embedded language.

With this new feature, developers will be able to adopt best-practice architectural models with far greater ease. Companies that develop and maintain PHP implementations will be better able to hire designers with little-to-no programming ability and vice versa. Developers who are accustomed to non-embedded languages might also take a second look at PHP and come to realize just how awesome this language truly is. Either way, we're getting something good without having to give up anything (time notwithstanding), so this feature should prove to be a win-win no matter what.

Vote

Voting has not yet been initiated. Please check back later.

Changelog

Version 1.20 : Language clarifications. Added .phpf script type.

Version 1.10 : Status => Under Discussion.

Version 1.00 : Initial proposal.

rfc/phpp.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1