ABANDONED
I have abandoned this proposal. I have come to feel it changes the spirit of PHP too much, offering too little gain for the degree of unhappiness it inspires and the potential for confusion it creates. I am leaving it here for historical purposes. -Tom Boutell
Request for Comments: Source Files Without Opening Tag
- Version: 1.1.1
- Date: 2012-04-16
- Author: Thomas Boutell tom@punkave.com
- Status: Under Discussion
- First Published at: http://wiki.php.net/rfc/source_files_without_opening_tag
- Other formats: none yet
This RFC proposes a way to support source code files without <?php at the top.
Introduction
The purpose of this RFC is to provide a way to support source files that do not begin with <?php while maintaining full backwards compatibility with files that do.
Why is this desirable?
In modern framework development, and in larger projects generally, it is often considered good practice to implement PHP classes in files that contain only PHP code. If methods of such classes need HTML templating, they get it by requiring a separate template file. In such “pure code” files, typing <?php at the top is:
1. Error-prone in a subtle and hard-to-debug way: if any whitespace is introduced before <?php, the code still runs, but that whitespace is sent as output, so your XHTML doctype fails to be recognized, your header() calls fail, etc. Since you may not use these features in every situation, the bug is often not spotted until an inopportune time.
2. Tedious. There is a small but real frustration involved in this redundancy. Small but real frustrations can contribute to long-term disenchantment with a programming language.
However these same projects and frameworks may advocate the use of “raw HTML” in PHP files intended as templates for rendering pages, forms and the like. This is a longstanding feature of PHP (indeed the original feature of PHP). Support for it should be maintained, and may perhaps be improved in future to address PHP's current limitations as a templating language. The two modes should not be mutually exclusive as this makes it impossible for code to interoperate. This proposal aims not to close any doors in this regard.
Related RFC
Proposal
Part 1: Enhance the include, include_once, require and require_once keywords
These keywords will be enhanced with a second, optional parameter introduced by the “AS” keyword.
The first parameter (the URL/filename to the file to be included) does not change.
The second parameter is a combination of integer flags, combined in the usual way with the OR operator (|).
If this second parameter is absent, the four keywords behave exactly as they do now.
When the second parameter is present, it may be a bitwise OR of zero or more of the following constants which add to (but never subtract from) the existing behavior of each keyword:
If INCLUDE_PURE_CODE is present, the parser begins reading the included file as if the <?php tag had already been encountered, and any occurrence of the ?> and <?php tokens later in that file is a fatal error. This rule does NOT extend to other files included and/or required later. Files required in INCLUDE_PURE_CODE mode can still require template files that do contain <?php and ?>.
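The rule above can be approximated in today's PHP with the tokenizer extension. The following is only a sketch, not part of the proposal: `is_pure_code()` is a hypothetical helper name, and it assumes that "pure code" means no tag or inline-HTML tokens appear once the parser has started in PHP mode.

```php
<?php
// Hypothetical helper (not an existing PHP function): reports whether
// $source would be acceptable as a "pure code" file under the rule above.
function is_pure_code(string $source): bool
{
    // Prepend an opening tag so the tokenizer starts in PHP mode, as the
    // proposal says the parser would for such a file.
    $tokens = token_get_all("<?php\n" . $source);
    array_shift($tokens); // drop the opening tag we added ourselves

    $forbidden = [T_OPEN_TAG, T_OPEN_TAG_WITH_ECHO, T_CLOSE_TAG, T_INLINE_HTML];
    foreach ($tokens as $token) {
        // Tokens are single-character strings or [id, text, line] arrays.
        if (is_array($token) && in_array($token[0], $forbidden, true)) {
            return false;
        }
    }
    return true;
}

var_dump(is_pure_code('class Foo {}'));          // bool(true)
var_dump(is_pure_code('echo 1; ?> <p>hi</p>'));  // bool(false)
```

Because the check works on tokens, a closing tag that appears inside a string literal is not rejected, which matches the wording "tokens" in the rule.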
If INCLUDE_ONCE is present or the include_once or require_once keyword was used, the file is not included if it has already been included once (like the normal behavior of include_once and require_once). Note that the use of either of the _once keywords implicitly turns on this bit regardless.
If INCLUDE_ERROR_ON_FAILURE is present, or the require or require_once keyword was used, an E_COMPILE_ERROR fatal error is generated if the file cannot be included (exactly like a failure of the require keyword). Otherwise an E_WARNING is generated, as is normal for the include keyword with no second parameter. Note that the use of either of the require keywords implicitly turns on this bit regardless.
Examples:
// Absolutely no change to existing behavior
require 'filename.php';

// Load filename.phpp. This file must consist purely of source code, no <?php or ?> tokens needed or permitted
require 'filename.phpp' AS INCLUDE_PURE_CODE;

// Behaves just like include_once
include 'filename.php' AS INCLUDE_ONCE;

// Behaves just like require
include 'template.php' AS INCLUDE_ERROR_ON_FAILURE;

// Combine them all: includes only once, with a fatal error on failure, parsing in "code mode"
include 'filename.phpp' AS INCLUDE_PURE_CODE | INCLUDE_ONCE | INCLUDE_ERROR_ON_FAILURE;

// Exactly the same as previous example
require_once 'filename.phpp' AS INCLUDE_PURE_CODE;
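The way each keyword implicitly switches on some bits can be modeled in ordinary PHP. This is a sketch under stated assumptions: the RFC names the three constants but does not assign them values, so the numbers below are illustrative, and `effective_flags()` is a hypothetical helper rather than anything the proposal defines.

```php
<?php
// Illustrative values only: the RFC does not assign numbers to these.
const INCLUDE_PURE_CODE        = 1;
const INCLUDE_ONCE             = 2;
const INCLUDE_ERROR_ON_FAILURE = 4;

// Hypothetical helper: the flag set in effect for a keyword plus an
// optional AS expression, following the "add to, never subtract" rule.
function effective_flags(string $keyword, int $flags = 0): int
{
    $implied = [
        'include'      => 0,
        'include_once' => INCLUDE_ONCE,
        'require'      => INCLUDE_ERROR_ON_FAILURE,
        'require_once' => INCLUDE_ONCE | INCLUDE_ERROR_ON_FAILURE,
    ];
    if (!array_key_exists($keyword, $implied)) {
        throw new InvalidArgumentException("Unknown keyword: $keyword");
    }
    // The keyword's implied bits are OR-ed in, so they can never be removed.
    return $implied[$keyword] | $flags;
}

// The last two examples above resolve to the same flag set.
$all = INCLUDE_PURE_CODE | INCLUDE_ONCE | INCLUDE_ERROR_ON_FAILURE;
var_dump(effective_flags('include', $all) === effective_flags('require_once', INCLUDE_PURE_CODE)); // bool(true)
```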
Part 2: Filename Convention
Although this proposal gives implementers flexibility in when and where they use the INCLUDE_PURE_CODE bit, it is still desirable in most cases to have a commonly recognized convention to distinguish files that should be read starting in “PHP mode” from legacy and template files that should be read starting in “HTML mode.” The following convention is proposed for environments in which file extensions are a relevant and useful concept:
- Files that should be read starting in HTML mode should have a .php extension, for backwards compatibility.
- Files that should be read starting in PHP mode should have a .phpp extension (short for “Pure PHP”).
However, enforcement of this convention is NOT proposed. The choice to apply INCLUDE_PURE_CODE is made entirely by the programmer (typically the author of a class file autoloader).
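As a rough illustration of how an autoloader author might apply this convention, the sketch below resolves a class name to a file and reports which mode the file expects. The function name `resolve_class_file()`, the namespace-to-directory mapping, and the order in which extensions are tried are all assumptions made for the example. The include itself is left out because the AS syntax does not exist in any released PHP.

```php
<?php
// Hypothetical resolver an autoloader might use. It only decides which file
// to load and in which mode; the include itself is left to the caller.
function resolve_class_file(string $class, string $baseDir): ?array
{
    // Assumed mapping: namespace separators become directory separators.
    $relative = str_replace('\\', DIRECTORY_SEPARATOR, ltrim($class, '\\'));
    $stem = rtrim($baseDir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $relative;

    // Try the proposed .phpp extension first, then fall back to legacy .php.
    foreach (['.phpp' => true, '.php' => false] as $extension => $pureCode) {
        if (is_file($stem . $extension)) {
            return ['path' => $stem . $extension, 'pure_code' => $pureCode];
        }
    }
    return null; // nothing found; let the next autoloader try
}

// Under the proposal, the caller would then write something like:
//   require $file['path'] AS INCLUDE_PURE_CODE;   (when pure_code is true)
//   require $file['path'];                        (otherwise)
```

A library that ships only .phpp files could skip the fallback entirely, which avoids the second file-system check.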
Anticipated And Previously Raised Questions
(Thanks to those who raised and responded to some of these questions already on the internals list. I am summarizing in many cases.)
“Does this break my existing code?”
No. Code that never uses the new AS parameter will not be affected in any way. The proposal allows autoloaders to load files the old-fashioned way and to recognize when to do so by a simple common convention or by other local conventions as appropriate.
“Isn't the INCLUDE_PURE_CODE flag even more work than typing <?php?”
Typically, projects that will benefit from this flag also have autoloaders to load classes implicitly when they are first used. So INCLUDE_PURE_CODE would be typed once in the autoloader, not many times everywhere.
“Won't this slow down the autoloader?”
Not really. Even in a worst-case scenario where stat() calls are slow and the autoloader performs no caching even in a production environment, the autoloader will often be able to assume that only .phpp files are expected because that is the convention of the library or framework from which they came, so it won't be necessary to stat() first for .phpp and then check for .php as well. It is also possible to prewarm autoloader caches as part of deployment.
“Won't this break if you try to use the code with an older version of PHP?”
Of course. A choice to use this feature implies a choice to support only the supporting version of PHP or newer. But it'll break cleanly with a clear error message, just like code that tries to use traits or other newer features.
“Why does the proposal forbid the use of ?> to get back to HTML mode in a .phpp file?”
The first version of the proposal did permit this as a compromise. However, it did not please anyone. Those who want to write “pure PHP” class files are not interested in switching from code to markup in the middle of a method and are still able to include regular template PHP files as needed, following good MVC separation practices.
“Why not introduce a new keyword rather than enhancing four keywords?”
A new keyword was proposed and did not go over well. Enhancing the existing keywords, allowing their existing behavior to automatically switch on some of the flag bits, turns out to be both more elegant and more familiar.
“Why three flags instead of one? Aren't the other two redundant?”
While the INCLUDE_ONCE and INCLUDE_ERROR_ON_FAILURE flags are technically redundant, if a developer chooses to start with the include keyword they can decide which of the flags to apply at runtime, which was not possible before without a series of if/else clauses.
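For comparison, this is roughly what the "series of if/else clauses" looks like in current PHP when the once and fatal-on-failure choices are only known at runtime. The wrapper name `load_file()` and its boolean parameters are hypothetical, chosen only for this sketch.

```php
<?php
// Hypothetical wrapper showing the branching that current PHP requires when
// "once" and "fatal on failure" are only known at runtime.
function load_file(string $path, bool $once, bool $fatal)
{
    if ($once && $fatal) {
        return require_once $path;
    } elseif ($once) {
        return include_once $path;
    } elseif ($fatal) {
        return require $path;
    }
    return include $path;
}

// Under the proposal, the four branches would reduce to a single statement
// with a flag value computed at runtime (not valid in any released PHP):
//   include $path AS $flags;
```

Note that a file loaded through such a wrapper runs in the wrapper's function scope, which is one more reason a language-level option may be preferable.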
“Why bitwise flags instead of an associative array of options?”
Bitwise flags are faster and also provide built-in error checking: use of a constant not defined by a particular version of PHP will generate a notice. Require statements are something PHP executes quite often, so generating unnecessary arrays and testing array values is an unnecessary performance hit.
“Why is the AS keyword necessary? Why not a comma?”
Since these keywords are language constructs, not functions, and their parameters are not enclosed in parentheses, the meaning is ambiguous with a comma. Right now foo(include 'baz', INCLUDE_ONCE) would pass two separate values to the foo function. Changing this in the grammar would be problematic. The use of the AS keyword removes the ambiguity.
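The ambiguity can be observed in current PHP. The sketch below uses a hypothetical `foo()` that only counts its arguments, a temporary file standing in for 'baz', and the literal 1 in place of the proposed INCLUDE_ONCE constant, which does not exist today.

```php
<?php
// Hypothetical function that simply reports how many arguments it received.
function foo(...$args): int
{
    return count($args);
}

// A throwaway file to include, so the example is self-contained.
$file = tempnam(sys_get_temp_dir(), 'rfc');
file_put_contents($file, "<?php return 'baz';");

// The comma ends the include expression and starts a second argument.
$count = foo(include $file, 1);
unlink($file);

var_dump($count); // int(2)
```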
Changelog
- 2012-05-06 Thomas Boutell: formally abandoned by original proponent.
- 2012-04-09 Yasuo Ohgaki: Added related RFC.
- 2012-04-10 Thomas Boutell: removed misleading word “Option” from parts 1 and 2, which are not meant to be mutually exclusive (see the original text).
- 2012-04-10 Thomas Boutell: version 1.1. Replaced require_path with enhancements to the standard include/require family of keywords. Replaced an array of options with a bitwise OR of options. Forbade the use of ?> entirely in pure PHP files (without restricting it at all in other PHP files).
- 2012-04-16 Thomas Boutell: added Nikita Popov's “AS” keyword as a workaround for the fact that a comma can't be introduced between the two parameters without creating an ambiguity in the grammar.