Summary
PHP has been known to be more vulnerable to script inclusion attacks than other languages. PHP is made for the Web, and its embedding feature is valuable for easier web application development. However, embedded scripts are a serious security risk because they can execute any PHP code in any file. Today's web applications should have strong security.
This RFC proposes making PHP open tags optional, a template mode switch, and a script-only include feature.
Proposal
Open Issue
Add a flag that controls the embed (template) feature of PHP
Flag to control embed (template) mode for directly called scripts (e.g. http://some/foo.php or php bar.php). Directly executed scripts are scripts accessed directly by a browser or executed from the CLI binary.
NOTE: A PHP script that has “<?php” or the like at the top works regardless of template_mode.
1) php.ini
template_mode = On ; On by default. PHP_INI_SYSTEM.
; On: "<?php" or the like is required for directly executed scripts.
; Off: An open tag is not required for directly executed scripts.
2) Environment variable
PHP_TEMPLATE_MODE=1 # 0 to disable. No default.
3) CLI
php -x foo.php # template mode. "<?php" or the like is required. DEFAULT
php -X foo.php # non-template mode. "<?php" or the like is not required.
Introduce functions (language constructs) that include program-only scripts
4) New functions to include program-only scripts.
script() - Includes a script-only file. Other than that, it behaves like include().
script_once() - Includes a script-only file. Other than that, it behaves like include_once().
These are not affected by template_mode at all. They always run in script-only mode (template_mode=off). “<?php” or the like is allowed only at the top of a script.
Existing functions include/require program and template scripts
5) include()/require() behavior does not change.
include()/include_once()/require()/require_once() do not change behavior.
These are not affected by template_mode at all. They always run in embedded mode (template_mode=on).
Behaviors
6) Template mode behavior
When template_mode=off
Only directly executed PHP scripts are affected.
Allow open tags (<?php, <?, <%) only at the top of the script for compatibility, and issue a parse error when they are used in other places.
Complete removal of the <script language=“PHP”> open tag.
Ignore close tags (?> and %>) completely. Raising an error would be preferable, but they are ignored for better compatibility, i.e. there are many scripts that have ?> at the end even though they are program-only scripts.
Future Scope
script()/script_once() allow “<?php” or the like at the top of a script. This is only for an easier transition and may be removed in PHP 7 or later. The same may apply to directly called scripts. Security-sensitive information should not be written into scripts directly. A good deployment tool and code should be able to use environment variables for it.
Using environment variables is difficult for self-contained applications unless there is a standard deployment tool. Creating a general-purpose deployment tool is hard, since there are many web servers to support and configuration differs even for the same web server. If there is no standard deployment tool, we may keep allowing “<?php” or the like at the top of scripts.
Compatibility
Fully compatible with current code, i.e. include()/require() work as they do now regardless of template mode. No compatibility issue at all.
Adopting
Adopting this RFC could take only a few lines of change. (Excluding the script()/script_once() open issue.)
Introduce script()/script_once() for explicit script inclusion, i.e. script()/script_once() always execute a script, with no embedding mode at all.
New code can be fully compatible with old systems, i.e. users may write a script() function that wraps include().
Possible issues
New code that omits PHP open tags may be disclosed. For maximum security, users may put “<?php” at the beginning of PHP scripts that contain sensitive data. (e.g. password, API key, etc. A simple configuration/code error is obvious.)
Benefits and Tips
PHP can be as secure as other languages against local file inclusion (LFI).
LFI becomes difficult. LFI attacks almost always result in syntax errors instead of attack code execution and/or massive information disclosure.
Embedded mode is weak against LFI. This is a characteristic of embedded languages. Non-embedded mode makes PHP much stronger against LFI than it is now.
People do make mistakes when everything is embedded by default. Some recent LFI issues.
Transition is very easy and compatible both forward and backward, unlike the related RFC.
Writing “<?php” at the top of code is always highly recommended for scripts that contain security-sensitive information. It is also good for people who like the embedded nature of PHP.
People who don't want to write script open tags don't have to, at their own risk.
Details
Example Usage
Mode control and compatibility
NOTE: This example assumes that template_mode=on/off affects the behavior of include()/require(). This is an open issue of this RFC.
Modern frameworks have a single bootstrap script and a template rendering function.
In the bootstrap script
ini_set('template_mode', 'off'); // Ignored by older PHP
In the template rendering function
function render_template($template, $template_vars) {
  ini_set('template_mode', 'on'); // Ignored by older PHP
  // include() takes a single argument, so expose the variables to the template scope first.
  extract($template_vars, EXTR_SKIP); // EXTR_SKIP keeps $template from being overwritten
  include($template); // Or use any other method to render.
  ini_set('template_mode', 'off'); // Ignored by older PHP
}
For better security, program-only scripts should use script()/script_once(), as they do not allow embedded mode. To be compatible with older PHP, users have to define their own script()/script_once().
if (PHP_VERSION_ID < 50600) {
  // Users must call script()/script_once() with parentheses for compatibility.
  function script($php_script) {
    // Does not inherit the caller's scope, but that is not an issue for most program-only scripts.
    // Optional: the tokenizer may be used to check whether $php_script is script only.
    return include($php_script);
  }
  function script_once($php_script) {
    return include_once($php_script);
  }
}
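The optional tokenizer check mentioned in the comment above could look roughly like the following. This is a minimal sketch, assuming the tokenizer extension is loaded and PHP 7 or later; is_script_only() is a hypothetical name and is not part of this RFC. It only approximates the proposed script-only rule with today's tokenizer, so a file must still start with an open tag to be recognized as PHP at all.

```php
<?php
/**
 * Hypothetical helper: returns true when $source looks like a script-only
 * file under today's tokenizer: one open tag at the very top, no inline
 * HTML, and at most one trailing close tag.
 */
function is_script_only(string $source): bool
{
    $tokens = token_get_all($source);
    // The very first token must be a regular open tag.
    if ($tokens === [] || !is_array($tokens[0]) || $tokens[0][0] !== T_OPEN_TAG) {
        return false;
    }
    $last = count($tokens) - 1;
    foreach ($tokens as $i => $token) {
        if (!is_array($token)) {
            continue; // single-character tokens such as ';'
        }
        $id = $token[0];
        // A second open tag means the file switches modes somewhere.
        if ($i > 0 && ($id === T_OPEN_TAG || $id === T_OPEN_TAG_WITH_ECHO)) {
            return false;
        }
        // Any text outside PHP tags is template content.
        if ($id === T_INLINE_HTML) {
            return false;
        }
        // A close tag is tolerated only as the final token.
        if ($id === T_CLOSE_TAG && $i !== $last) {
            return false;
        }
    }
    return true;
}
```

A user-defined script() could call is_script_only(file_get_contents($php_script)) before include() and refuse files that fail the check.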
Patch
Vote
VOTE: 2014/2/16 - 2014/2/22
Most LFI can be prevented by using script-only program inclusion.
Directly called scripts cannot use script()/script_once(). Remove the inconsistency between directly executed and indirectly executed scripts.
Thank you for voting.
Implementation
After the project is implemented, this section should contain
the version(s) it was merged to
a link to the git commit(s)
a link to the PHP manual entry for the feature
Discussions
Motivation of this RFC
The motivation for this RFC is:
“File Includes” is a fatal security breach.
The reason PHP is less secure against “File Include” than other languages is “Mandatory embed mode”.
A non-mandatory embed mode gives users an option for better security.
With this RFC, PHP could be as safe as other scripting languages
with respect to file includes. This RFC is fully compatible with current
code. Writing backward compatible code takes as few as 3 lines.
Most security measures are not perfect solutions but
mitigations, just like canaries and DEP. I suppose people who are
concerned with security understand the value of these protections.
Are there any good reasons not to have a non-mandatory embedded
mode as an additional security measure? Why not make it harder
for attackers to exploit?
In short, I'm really annoyed to hear “PHP is less secure than
Ruby/Perl/Python/etc”.
This must be noted: even though this RFC emphasizes security, it
is not about introducing a new security measure. It is about
changing an embedded-only language into a non-embedded/embedded language.
A side effect of this change is better security against LFI.
Why is this better than now?
Framework authors
Framework developers will have effective protection against suicide code like “include $_REQUEST['var'];” in user scripts.
If a framework adopts template_mode=off, the suicide code above will raise syntax errors for LFI code execution and information disclosure attack patterns.
It could take as few as 3 lines to adopt template_mode=off.
Framework developers are not required to rewrite include/require statements.
Frameworks can remain compatible with older PHP.
Fully compatible with current code. (template_mode=on)
Mostly compatible with current code. (template_mode=off)
Programmers
Programmers don't have to do or know anything if their framework adopts template_mode=off.
No new syntax or parameters.
All files are treated as PHP script-only files. (template_mode=off)
Programmers can easily determine which file will be executed as a PHP script.
Embedded script attacks become impossible.
An open tag is allowed only at the beginning of a file.
No close tag is allowed with template_mode=off.
Note: this does not hold if an attacker can inject at the beginning of a file or construct valid syntax.
Including non-PHP files will almost always result in syntax errors.
include('/etc/passwd'), include('config/secret.yml'), include('uploaded_image/attack.jpg') and so on.
Current PHP just dumps them to attackers.
Programmers can confidently locate uploaded data files (images/documents/etc.) under the docroot for better performance.
The only loser here is the attacker. If you can think of a drawback that exceeds these benefits, please let me know. yohgaki@ohgaki.net
What are the compatibility issues?
No new syntax or parameters. Only the template_mode ini entry.
None when template_mode=on. (The default)
None for script-only files that begin with an open tag (<?php, <?, <%), regardless of template_mode.
When template_mode=off, include()/require() execute files as scripts. For embedded pages, users are supposed to control template_mode.
For framework developers, adopting template_mode should be a simple task: template_mode=off in the bootstrap, and template_mode=on then off in the render method. It would take as few as 3 lines of changes.
Framework users don't even have to know about the change and still get the benefits if their framework supports this.
What protections should be applied against LFI?
Data File Validation by tokenizer
To find out whether a file contains PHP code, the tokenizer may be used.
$tokens = token_get_all(file_get_contents($file)); // token_get_all() takes source text, not a file name
if (has_php_token($tokens)) { // there is no has_php_token(), implement your own function.
  echo 'File has PHP token';
}
Note that this method can produce false positives. The complete solution is to adopt template_mode=off and make sure that all files have headers/data formats that cause a syntax error under LFI.
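One possible implementation of the has_php_token() helper mentioned above is sketched here. It is an illustration and not part of PHP or of this RFC; it assumes the tokenizer extension is loaded and PHP 7 or later, and it simply reports whether the token stream contains any open tag. Whether “<?” counts as an open tag depends on the short_open_tag setting of the running PHP.

```php
<?php
/**
 * Hypothetical helper: returns true when the token list produced by
 * token_get_all() contains at least one PHP open tag.
 */
function has_php_token(array $tokens): bool
{
    foreach ($tokens as $token) {
        // Multi-character tokens are arrays: [id, text, line].
        if (is_array($token)
            && in_array($token[0], [T_OPEN_TAG, T_OPEN_TAG_WITH_ECHO], true)) {
            return true;
        }
    }
    return false;
}
```

Because the check fires on any byte sequence that the tokenizer reads as an open tag, binary files that merely happen to contain such bytes are flagged as well, which is the false-positive case noted above.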
Check List
Common protections
ALWAYS VALIDATE USER INPUTS to check whether the input is reasonable or not. (Not only for LFI, but for general security.)
Validate that parameters contain only valid characters. (i.e. no “../”. Use a white list of characters, not a black list.)
Validate that parameters are not too long. (i.e. file names should not be overly long, such as “valid/path/../../../../../../../etc/passwd”.)
For example: if (strlen($filename) !== 32 || strspn($filename, '0123456789abcdefABCDEF') !== 32) die('Invalid file name: not an MD5 sum');
Avoid regular expressions as much as possible.
Never use ereg() to validate.
Always add file extension manually for include()/require()/include_once()/require_once().
Use null byte protection! Null byte protection requires PHP 5.3.4 or later.
For example: include($module_name.'.php'); This ensures that only “.php” files can be read by include() with recent PHP.
If you are using PHP 5.3.3 or earlier, null byte protection is not available. You MUST make sure strings do not contain null bytes.
Add a safe prefix for include()/require()/include_once()/require_once().
Validate data files.
It would not help much without this RFC, but it should be done anyway.
Check MIME types, at least with a MIME extension.
Check that data files have proper headers.
For free format data files, add your own header.
If you are using Unicode Normalization, you must know what you are doing.
Use the open_basedir ini option to reduce damage when there is LFI.
Do not disclose your system information.
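The white list, length limit, safe prefix and manually added extension from the list above can be combined in one small function. This is a minimal sketch, assuming PHP 7.1 or later; resolve_module_path(), the 32-character limit and the allowed character set are illustrative choices, not something this RFC prescribes.

```php
<?php
/**
 * Hypothetical helper: maps a user-supplied module name to a file under a
 * fixed directory, or returns null when the name is not acceptable.
 */
function resolve_module_path(string $name, string $baseDir): ?string
{
    // White list: lower-case letters, digits and underscore only.
    $allowed = 'abcdefghijklmnopqrstuvwxyz0123456789_';
    $len = strlen($name);
    // Reject empty names, overly long names, and any character outside the
    // white list (this also rejects "../", "/" and null bytes).
    if ($len === 0 || $len > 32 || strspn($name, $allowed) !== $len) {
        return null;
    }
    // Safe prefix plus manually appended extension.
    return rtrim($baseDir, '/') . '/' . $name . '.php';
}
```

A caller would include the returned path only when it is not null, for example `$path = resolve_module_path($_GET['module'] ?? '', __DIR__ . '/modules');` followed by a null check before include.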
Be protective
Enable all errors.
Write code that raises no errors/exceptions during normal execution.
Stop script execution for all errors/exceptions that are unexpected.
Log and notify the admin when an unexpected error occurs.
Give the least permissions to web server processes.
For Linux, use SELinux.
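The first items of this list can be wired into a bootstrap file along these lines. This is a minimal sketch, assuming PHP 7.1 or later; the log message format and the 500 response are illustrative choices, and notifying an admin is left to whatever consumes the error log.

```php
<?php
// Enable all errors.
error_reporting(E_ALL);

// Turn every warning or notice into an exception so it cannot pass unnoticed.
set_error_handler(function (int $severity, string $message, string $file, int $line): bool {
    throw new ErrorException($message, 0, $severity, $file, $line);
});

// Anything that nobody caught is unexpected: log it and stop the script.
set_exception_handler(function (Throwable $e): void {
    error_log(sprintf(
        'Unexpected error: %s in %s:%d',
        $e->getMessage(),
        $e->getFile(),
        $e->getLine()
    ));
    if (!headers_sent()) {
        http_response_code(500); // do not show details to the client
    }
    exit(1);
});
```

With these handlers installed, a failed include() of a validated path raises an ErrorException instead of continuing with a warning.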
A lot of effort will be needed to adopt template_mode?!
No. Users may stick to the current behavior. The current behavior (template_mode=on) is the default.
Even if users decide to adopt template_mode=off, the required changes are as few as 3 lines. For modern frameworks, it would be a simple task.
Framework users do not have to do or learn anything about the change as long as they follow their framework's practices.
template_mode is similar to allow_url_include
allow_url_include:
template_mode:
template_mode=off is a safeguard
This is not a direct security measure, but a safeguard for risk mitigation.
The mitigation factor is large if programmers follow the guidelines.
Think of it as canary protection for stack smashing or DEP for heap overflow.
Since LFI is critical, additional protections are preferred whenever possible.
template_mode=off could be a security issue
Security measure or accident?
Script-only mode causes syntax errors for LFI attacks, but that is an accident.
Many security measures rely on accidental protections.
ASLR is one form of security by accident: it works because the attacker's guess does not.
Background
Compared to other languages such as Ruby/Perl/Python/Lua/etc., current PHP is known to be weak against script injection and information disclosure. This proposal aims to reduce the LFI (Local File Include) risk to the level of other scripting languages by making embedded scripting mode optional.
PHP has been an embedded language since it was born. To execute PHP code, a PHP tag is required; it is mandatory. Mandatory embed mode was reasonable 10 years ago, but it has become less reasonable with recent framework adoption. PHP has a security issue known as LFI (Local File Include)/RFI (Remote File Include). The risk of LFI/RFI was reduced by the null byte check in the fopen wrapper and the allow_url_include php.ini option. However, a risk remains that cannot be ignored.
LFI/RFI is a fatal security error. Currently, LFI prevention is up to programmers, and a fatal LFI may happen anywhere in a PHP script. A systematic approach that reduces LFI risk as much as possible is preferable. This RFC provides additional protection against LFI through an optional embedded mode.
Mandatory Embed mode = Less Security
There are countermeasures: allow_url_include=off for RFI, and the null byte check in the fopen wrapper and open_basedir for LFI. However, injecting PHP code into a file is easy because of the mandatory embedded script nature. A simple vulnerability in include()/require() could result in a fatal code execution and/or information disclosure incident.
PHP script inclusion detection may be done for data files. However, it is not a simple task.
<?, <% should be checked even in binaries. It is possible to have “<?” or “<%” in a binary. One should check whether it contains a valid (executable) PHP script or not.
<script language=“php”> should be checked. “<script language=” is common in HTML files. Checking “<script language=” is not a simple task.
The most reliable method is to use token_get_all(), but the tokenizer is an optional module.
Almost no applications check these, and it would be impossible to make developers do proper validation.
Nothing can be done for files that are not controlled by PHP application.
Data files may be protected by injecting “<?php die()?>” into them.
Data before <?php die() ?> may be disclosed.
It is possible to inject <?php die() ?> into some data files like /etc/passwd, but it is not feasible.
Injecting <?php die() ?> into all data files is impossible.
Note: With template_mode=off, you don't have to inject die() into data files to protect against LFI disclosure/execution.
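For data files that the application itself writes, the die() guard described above can be sketched as follows. This is an illustration under stated assumptions, not part of this RFC: write_guarded(), read_guarded() and GUARD_HEADER are hypothetical names, PHP 7.1 or later is assumed, and the guard is placed at the very top because data before it may be disclosed.

```php
<?php
// Guard line placed at the very top so nothing precedes it.
const GUARD_HEADER = "<?php die(); ?>\n";

/** Hypothetical helper: stores $data behind the guard line. */
function write_guarded(string $path, string $data): void
{
    file_put_contents($path, GUARD_HEADER . $data, LOCK_EX);
}

/**
 * Hypothetical helper: returns the stored data without the guard line,
 * or null when the file is unreadable or the guard is missing.
 */
function read_guarded(string $path): ?string
{
    $raw = file_get_contents($path);
    if ($raw === false || strncmp($raw, GUARD_HEADER, strlen(GUARD_HEADER)) !== 0) {
        return null;
    }
    return substr($raw, strlen(GUARD_HEADER));
}
```

If such a file is reached through a vulnerable include(), execution stops at die() before any of the data is output or executed. It does nothing for files the application does not write itself.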
LFI is used for information disclosure. Disabling embedded (template) mode is a simple protection against LFI.
PHP script injection examples WITHOUT file uploads
A common mistake by novice PHP programmers is writing code like
include $_GET['name'];
Because PHP is an embedded language and the nature of an embedded language cannot be changed, this kind of code raises fatal security issues. An attacker may use any uploaded file that lacks proper validation to execute PHP code. An attacker may also read any file the process has permission to read.
Note: Please keep in mind that this RFC aims to reduce LFI risk even for developers without full knowledge of LFI. For example, not many people realize that LFI may be combined with SQL injection, and that null byte protection does not help at all in this case.
Session Data
Many applications store user inputs in the $_SESSION array and use files as session storage. If an attacker can inject a PHP script into the $_SESSION array, all they have to do is find a vulnerable include()/require(). The session save_path can be guessed easily.
An attacker may
Find LFI
Find a way to inject PHP script in $_SESSION
Guess session save_path
Take over target server
Cached Data
Modern applications cache data in files for better performance. Since the cache is used only by the application, programmers do not care whether it contains PHP code or not. An attacker may follow steps similar to those for session data exploitation to execute an attack script.
Email
Some applications save email messages. These messages may be used for LFI.
Log Files
Errors and accesses are usually logged. These logs may be used for LFI.
SQL injection
SQL injection is known to be capable of executing arbitrary code by itself with certain configurations. With PHP's LFI, SQL injection may be used for arbitrary code execution, since many SQL database servers support writing data to a file.
Find SQL injection
Find means to inject PHP script into SQL data
Save SQL data into file
Use LFI to execute the code
Since the attacker controls the full file name, null byte protection does not work at all.
When LFI and SQL injection can both be abused, an attacker may steal data without blind SQL injection. The attacker may simply dump data into a file and use LFI to steal it.
Find SQL injection
Save SQL data into file
Use LFI to steal data
It is much simpler than blind SQL injection. It may also make it possible to bypass certain IPS/IDS.
All developers know how include('/etc/passwd') or include('.htaccess') works.
include $_REQUEST['var'];
The above code may be used to read sensitive data such as /etc/passwd, .htaccess, .htpasswd or any other files attackers may be interested in.
Note: open_basedir is not set by default.
Changelog
2012-04-01 Moriyoshi Koizumi: Initial.
2012-04-09 Yasuo Ohgaki: Version 1.1. Made RFC a serious RFC. Added security issue discussion for embedded mode. Proposal for php.ini and CLI options.
2012-04-10 Yasuo Ohgaki: Version 1.2. More examples.
2012-04-11 Yasuo Ohgaki: Version 1.3. Reorganized sections and sentences.
2012-04-12 Yasuo Ohgaki: Version 1.4. Simplified summary. Reorganized sections and sentences.
2012-04-16 Yasuo Ohgaki: Version 1.4. Added ENV var switch and script()/script_once().