ABANDONED

I have abandoned this proposal. I have come to feel it changes the spirit of PHP too much, offering too little gain for the degree of unhappiness it inspires and the potential for confusion it creates. I am leaving it here for historical purposes. -Tom Boutell

Request for Comments: Source Files Without Opening Tag

This RFC proposes a way to support source code files without <?php at the top.

Introduction

The purpose of this RFC is to provide a way to support source files that do not begin with <?php while maintaining full backwards compatibility with files that do.

Why is this desirable?

In modern framework development, and in larger projects generally, it is often considered good practice to implement PHP classes in files that contain only PHP code. If methods of such classes need HTML templating, they get it by requiring a separate template file. In such “pure code” files, typing <?php at the top is:

1. Error-prone in a subtle and hard-to-debug way: if any whitespace is introduced before <?php, the code still runs, but the whitespace is sent as output, so your XHTML doctype fails to be recognized, your header() calls fail, etc. Since you may not use these features in every situation, the bug is often not spotted until an inopportune time.

2. Tedious. There is a small but real frustration involved in this redundancy. Small but real frustrations can contribute to long-term disenchantment with a programming language.
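The first point can be reproduced with a small script. This is a minimal sketch, not part of the proposal: the temporary file and the class name StrayWhitespaceDemo are made up for illustration, and output buffering is used only to make the leaked byte visible.

```php
<?php
// Hypothetical class file saved with a stray blank line before its opening tag.
$path = tempnam(sys_get_temp_dir(), 'rfc');
file_put_contents($path, "\n<?php\nclass StrayWhitespaceDemo {}\n");

// Capture whatever the include sends to the output stream.
ob_start();
include $path;
$leaked = ob_get_clean();
unlink($path);

// The class loads without complaint, yet one byte of output has escaped.
// In a web request, unbuffered output like this goes out before any later header() call.
var_dump(class_exists('StrayWhitespaceDemo', false));
var_dump($leaked === "\n");
```

Nothing in the include itself reports a problem, which is why the bug tends to surface only when a header or doctype is involved.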

However, these same projects and frameworks may advocate the use of “raw HTML” in PHP files intended as templates for rendering pages, forms and the like. This is a longstanding feature of PHP (indeed the original feature of PHP). Support for it should be maintained, and may be improved in the future to address PHP's current limitations as a templating language. The two modes should not be mutually exclusive, as that would make it impossible for code to interoperate. This proposal aims not to close any doors in this regard.

Proposal

Part 1: Enhance the include, include_once, require and require_once keywords

These keywords will be enhanced with a second, optional parameter introduced by the “AS” keyword.

The first parameter (the URL/filename to the file to be included) does not change.

The second parameter is a set of integer flags, combined in the usual way with the bitwise OR operator (|).

If this second parameter is absent, the four keywords behave exactly as they do now.

When the second parameter is present, it may be a bitwise OR of zero or more of the following constants which add to (but never subtract from) the existing behavior of each keyword:

If INCLUDE_PURE_CODE is present, the parser begins reading the included file as if the <?php tag had already been encountered, and any occurrence of the ?> or <?php token later in that file is a fatal error. This rule does NOT extend to other files included and/or required later. Files required in INCLUDE_PURE_CODE mode can still require template files that do contain <?php and ?>.

If INCLUDE_ONCE is present or the include_once or require_once keyword was used, the file is not included if it has already been included once (like the normal behavior of include_once and require_once). Note that the use of either of the _once keywords implicitly turns on this bit regardless.

If INCLUDE_ERROR_ON_FAILURE is present, or the require or require_once keyword was used, an E_COMPILE_ERROR fatal error is generated if the file cannot be included (exactly like a failure of the require keyword). Otherwise an E_WARNING is generated, as is normal for the include keyword with no second parameter. Note that the use of either require or require_once implicitly turns on this bit regardless.
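The way each keyword implicitly switches on some bits can be sketched in userland PHP. Everything here is an assumption made for illustration: the function effective_include_flags() is hypothetical, the flag values are arbitrary, and the constants carry an RFC_ prefix because PHP keywords are case-insensitive, so a bare INCLUDE_ONCE would be read as the include_once keyword by current PHP.

```php
<?php
// Hypothetical stand-ins for the proposed constants; the values are arbitrary.
const RFC_INCLUDE_PURE_CODE        = 1;
const RFC_INCLUDE_ONCE             = 2;
const RFC_INCLUDE_ERROR_ON_FAILURE = 4;

/**
 * Hypothetical helper: returns the effective flag set for a keyword,
 * adding the bits that the keyword switches on implicitly.
 */
function effective_include_flags(string $keyword, int $flags = 0): int
{
    $implicit = [
        'include'      => 0,
        'include_once' => RFC_INCLUDE_ONCE,
        'require'      => RFC_INCLUDE_ERROR_ON_FAILURE,
        'require_once' => RFC_INCLUDE_ONCE | RFC_INCLUDE_ERROR_ON_FAILURE,
    ];
    // Flags only ever add behavior, so the two sets are OR-ed together.
    return $implicit[$keyword] | $flags;
}

// require_once with the pure code flag ends up with all three bits set,
// the same as include with all three flags spelled out.
$a = effective_include_flags('require_once', RFC_INCLUDE_PURE_CODE);
$b = effective_include_flags(
    'include',
    RFC_INCLUDE_PURE_CODE | RFC_INCLUDE_ONCE | RFC_INCLUDE_ERROR_ON_FAILURE
);
var_dump($a === $b);
```

Because the implicit bits are OR-ed in, a flag passed explicitly can never weaken what the keyword already guarantees.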

Examples:

// Absolutely no change to existing behavior
require 'filename.php';

// Load filename.phpp. This file must consist purely of source code, no <?php or ?> tokens needed or permitted
require 'filename.phpp' AS INCLUDE_PURE_CODE;
 
// Behaves just like include_once
include 'filename.php' AS INCLUDE_ONCE;
 
// Behaves just like require
include 'template.php' AS INCLUDE_ERROR_ON_FAILURE;
 
// Combine them all: includes only once, with a fatal error on failure, parsing in "code mode"
include 'filename.phpp' AS INCLUDE_PURE_CODE | INCLUDE_ONCE | INCLUDE_ERROR_ON_FAILURE;
 
// Exactly the same as previous example
require_once 'filename.phpp' AS INCLUDE_PURE_CODE;

Part 2: Filename Convention

Although this proposal gives implementers flexibility in when and where they use the INCLUDE_PURE_CODE bit, it is still desirable in most cases to have a commonly recognized convention to distinguish files that should be read starting in “PHP mode” from legacy and template files that should be read starting in “HTML mode.” The following convention is proposed for environments in which file extensions are a relevant and useful concept:

  • Files that should be read starting in HTML mode should have a .php extension, for backwards compatibility.
  • Files that should be read starting in PHP mode should have a .phpp extension (short for “Pure PHP”).

However, enforcement of this convention is NOT proposed. The choice to apply INCLUDE_PURE_CODE is made entirely by the programmer (typically the author of a class file autoloader).

Anticipated And Previously Raised Questions

(Thanks to those who raised and responded to some of these questions already on the internals list. I am summarizing in many cases.)

“Does this break my existing code?”

No. Code that never uses the new AS parameter will not be affected in any way. The proposal allows autoloaders to load files the old-fashioned way and to recognize when to do so by a simple common convention or by other local conventions as appropriate.

“Isn't the INCLUDE_PURE_CODE flag even more work than typing <?php?”

Typically projects that will benefit from this flag also have autoloaders to load classes implicitly when they are first used. So INCLUDE_PURE_CODE would be typed once in the autoloader, not many times everywhere.
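A rough sketch of such an autoloader follows. Since the proposed syntax does not exist, the pure code mode is emulated here with eval(), which already starts parsing in PHP mode. The names require_pure_code(), make_autoloader() and PureGreeter are hypothetical, and the emulation is only an approximation of the proposed behavior.

```php
<?php
/**
 * Hypothetical userland stand-in for "require $file AS INCLUDE_PURE_CODE".
 * eval() already starts parsing in PHP mode, so no opening tag is needed.
 */
function require_pure_code(string $file): void
{
    $code = file_get_contents($file);
    if ($code === false) {
        throw new RuntimeException("Cannot read $file");
    }
    // Tokenize as if the opening tag had been seen; any further tag is rejected.
    $tokens = token_get_all("<?php\n" . $code);
    $tags = [T_OPEN_TAG, T_OPEN_TAG_WITH_ECHO, T_CLOSE_TAG];
    foreach (array_slice($tokens, 1) as $token) {
        if (is_array($token) && in_array($token[0], $tags, true)) {
            throw new RuntimeException("Tags are not allowed in pure code file $file");
        }
    }
    eval($code);
}

// Hypothetical autoloader: the mode is chosen once here, not in every class file.
function make_autoloader(string $dir): Closure
{
    return function (string $class) use ($dir): void {
        $base = $dir . '/' . $class;
        if (is_file("$base.phpp")) {
            require_pure_code("$base.phpp");   // starts in PHP mode
        } elseif (is_file("$base.php")) {
            require "$base.php";               // starts in HTML mode, as today
        }
    };
}

// Usage: one pure code class file, written without an opening tag.
$dir = sys_get_temp_dir() . '/rfc_autoload_' . uniqid();
mkdir($dir);
file_put_contents(
    "$dir/PureGreeter.phpp",
    "class PureGreeter { public function hi() { return 'hi'; } }\n"
);
spl_autoload_register(make_autoloader($dir));

echo (new PureGreeter())->hi(), "\n";

unlink("$dir/PureGreeter.phpp");
rmdir($dir);
```

An autoloader for a library that ships only .phpp files could drop the second is_file() check entirely.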

“Won't this slow down the autoloader?”

Not really. Consider a worst case in which stat() calls are slow and the autoloader performs no caching, even in production. The autoloader can often still assume that only .phpp files are expected, because that is the convention of the library or framework the classes came from, so it need not stat() for .phpp and then check for .php as well. Autoloader caches can also be prewarmed as part of deployment.

“Won't this break if you try to use the code with an older version of PHP?”

Of course. A choice to use this feature implies a choice to support only versions of PHP that include it. But it'll break cleanly with a clear error message, just like code that tries to use traits or other newer features.

“Why does the proposal forbid the use of ?> to get back to HTML mode in a .phpp file?”

The first version of the proposal did permit this as a compromise. However it did not please anyone. Those who want to write “pure PHP” class files are not interested in switching from code to markup in the middle of a method and are still able to include regular template PHP files as needed, following good MVC separation practices.

“Why not introduce a new keyword rather than enhancing four keywords?”

A new keyword was proposed and did not go over well. Enhancing the existing keywords, allowing their existing behavior to automatically switch on some of the flag bits, turns out to be both more elegant and more familiar.

“Why three flags instead of one? Aren't the other two redundant?”

While the INCLUDE_ONCE and INCLUDE_ERROR_ON_FAILURE flags are technically redundant, a developer who starts with the include keyword can decide at runtime which of the flags to apply, which was not possible before without a series of if/else clauses.
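For comparison, this is roughly what the if/else dispatch looks like in current PHP when both choices are only known at runtime. The helper name load_file() and the temporary file are made up for illustration. Under the proposal, the body would presumably collapse to a single include with a computed flag value.

```php
<?php
// Hypothetical helper: picks one of the four keywords from two runtime booleans.
function load_file(string $file, bool $once, bool $fatal)
{
    if ($once && $fatal) {
        return require_once $file;
    } elseif ($once) {
        return include_once $file;
    } elseif ($fatal) {
        return require $file;
    }
    return include $file;
}

// Usage with a hypothetical file that returns a value.
$path = tempnam(sys_get_temp_dir(), 'rfc');
file_put_contents($path, "<?php return 42;\n");

$first  = load_file($path, true, true);   // value returned by the file
$second = load_file($path, true, false);  // already included, so include_once yields true
unlink($path);

var_dump($first, $second);
```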

“Why bitwise flags instead of an associative array of options?”

Bitwise flags are faster and also provide built-in error checking: use of a constant not defined by a particular version of PHP will generate a notice. Require statements are something PHP executes quite often, so generating unnecessary arrays and testing array values is an unnecessary performance hit.

“Why is the AS keyword necessary? Why not a comma?”

Since these keywords are language constructs, not functions, and their parameters are not enclosed in parentheses, the meaning is ambiguous with a comma. Right now foo(include 'baz', INCLUDE_ONCE) would pass two separate values to the foo function. Changing this in the grammar would be problematic. The use of the AS keyword removes the ambiguity.
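The ambiguity can be observed in current PHP. In this small sketch the function foo() and the temporary file are stand-ins for the names used above, and a plain string takes the place of the flag.

```php
<?php
// Hypothetical file that returns a value, so that include evaluates to it.
$path = tempnam(sys_get_temp_dir(), 'rfc');
file_put_contents($path, "<?php return 'from baz';\n");

// Hypothetical function that reports what it was given.
function foo(...$args): array
{
    return $args;
}

// The comma ends the include expression: foo() receives two arguments.
$args = foo(include $path, 'second');
unlink($path);

var_dump(count($args));
var_dump($args[0]);
```

A comma-separated second parameter would therefore change the meaning of calls like this one, which is the conflict the AS keyword avoids.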

Changelog

  • 2012-05-06 Thomas Boutell: formally abandoned by original proponent.
  • 2012-04-09 Yasuo Ohgaki: Added related RFC.
  • 2012-04-10 Thomas Boutell: removed misleading word “Option” from parts 1 and 2, which are not meant to be mutually exclusive (see the original text).
  • 2012-04-10 Thomas Boutell: version 1.1. Replaced require_path with enhancements to the standard include/require family of keywords. Replaced an array of options with a bitwise OR of options. Forbade the use of ?> entirely in pure PHP files (without restricting it at all in other PHP files).
  • 2012-04-16 Thomas Boutell: added Nikita Popov's “AS” keyword as a workaround for the fact that a comma can't be introduced between the two parameters without creating an ambiguity in the grammar.
rfc/source_files_without_opening_tag.txt · Last modified: 2017/09/22 13:28 by 127.0.0.1