Optional PHP tags by php.ini and CLI options

Summary

PHP has been known to be more vulnerable to script inclusion attacks than other languages. PHP is made for the Web, and its embedding feature is valuable for easier web application development. However, embedded scripts are a serious security risk, because PHP can execute any PHP code found in any file. PHP is made for web applications, and today's web applications need strong security.

This RFC proposes making PHP open tags optional, adding a template mode switch, and adding a script-only include feature.

Proposal

Open Issue

  • We may change include()/require() behavior according to template_mode, i.e. include()/require() would behave like script() when template_mode=off. template_mode could also be PHP_INI_ALL. With this, adopting this RFC could take as few as 3 lines of PHP code.
  • We may only introduce script()/script_once(), even if it seems a little odd, i.e. a directly called script would always require “<?php”, while script() would not.

Add a flag that controls the embed (template) feature of PHP

Flag to control embed (template) mode for directly executed scripts, i.e. scripts accessed directly by a browser (e.g. http://some/foo.php) or executed from the CLI binary (e.g. php bar.php).

NOTE: A PHP script that has “<?php” or the like at the top works regardless of template_mode.

1) php.ini

template_mode = On   ; On by default. PHP_INI_SYSTEM.
                     ; On: "<?php" or the like is required for directly executed scripts.
                     ; Off: Open tag is not required for directly executed scripts.

2) Environment variable

PHP_TEMPLATE_MODE=1  # 0 to disable. No default.

3) CLI

php -x foo.php  # template mode. "<?php" or like is required. DEFAULT
php -X foo.php  # non-template mode. "<?php" or like is not required.

Introduce functions (language constructs) that include program-only scripts

4) New functions to include program-only scripts.

script() - Includes a script-only file. Other than that, it behaves like include().
script_once() - Includes a script-only file. Other than that, it behaves like include_once().

These are not affected by template_mode at all. They always run in script-only mode (template_mode=off). “<?php” or the like is only allowed at the top of a script.

Existing functions include/require both program and template scripts

5) include()/require() behavior does not change.

include()/include_once()/require()/require_once() do not change behavior.

These are not affected by template_mode at all. They always run in embedded mode (template_mode=on).

Behaviors

6) Template mode behavior

  • When template_mode=off
    • Only directly executed PHP scripts are affected.
    • Allow open tags (<?php, <?, <%) only at the top of the script for compatibility, and issue a parse error when they are used in other places.
    • Complete removal of the <script language=“PHP”> open tag.
    • Ignore close tags (?> and %>) completely. Raising an error would be preferable, but they are ignored for better compatibility, since many scripts have ?> at the end even when they are program-only scripts.
  • When template_mode=on
    • Exactly the same as now.

Future Scope

  • script()/script_once() allow “<?php” or the like at the top of a script. This is only for an easier transition. It may be removed in PHP 7 or later, and the removal may also apply to directly called scripts. Security-sensitive information should not be written into scripts directly. Good deployment tools and code should be able to use environment variables for this.
  • Using environment variables is difficult for self-contained applications unless there is a standard deployment tool. Creating a general-purpose deployment tool is hard, since there are many web servers to support and configuration differs even when the web server is the same. If there is no standard deployment tool, we may keep allowing “<?php” or the like at the top of scripts.

Compatibility

  • Fully compatible with current code, i.e. include()/require() work as they do now regardless of template mode. No compatibility issue at all.
  • Adopting this RFC could take as few as 3 lines of change. (Excluding the script()/script_once() Open Issue)
  • Introduce script()/script_once() for explicit script inclusion, i.e. script()/script_once() always execute a file as a script, with no embedding mode at all.
  • New code can be fully compatible with OLD systems, i.e. users may write a script() function that wraps include().

Possible issues

  • New code that omits PHP open tags may be disclosed. For maximum security, users may put “<?php” at the beginning of PHP scripts that contain sensitive data (e.g. passwords, API keys, etc.). A simple configuration or code error is then obvious.

Benefits and Tips

Details

Example Usage

Mode control and compatibility

NOTE: This example assumes template_mode=on/off affects the behavior of include()/require(). This is an open issue of this RFC.

Modern frameworks have a single bootstrap script and a template rendering function.

In bootstrap script

ini_set('template_mode', 'off'); // Older PHP ignores

In template rendering function

  function render_template($template, $template_vars) {
    ini_set('template_mode', 'on'); // Older PHP ignores
    extract($template_vars, EXTR_SKIP); // include takes a single argument, so expose the variables to the template first.
    include $template; // Or use any other method to render.
    ini_set('template_mode', 'off'); // Older PHP ignores
  }

For better security, program-only scripts should be loaded with script()/script_once(), as these do not allow embedded mode. To be compatible with older PHP, users have to define their own script()/script_once().

  if (PHP_VERSION_ID < 50600) {
    // Users have to call script/script_once with () for compatibility
    function script($php_script) {
      // Does not inherit the caller's scope, but that is not an issue for most program-only scripts.
      // Optional: May use the tokenizer to check whether $php_script is script only or not
      return include($php_script);
    }
    function script_once($php_script) {
      return include_once($php_script);
    }
  }
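
The optional tokenizer check mentioned in the comment above could look roughly like the following. This is only a sketch under assumptions: is_script_only() and script_checked() are hypothetical names that are not part of this RFC or its patch, the tokenizer extension must be loaded, and the type declarations need PHP 7.0 or later. It approximates script-only mode in user land by requiring one open tag at the top and no inline HTML.

```php
<?php
// Hypothetical helper: true when $source looks like a program-only script,
// i.e. it starts with "<?php", has no second open tag and no inline HTML.
function is_script_only(string $source): bool
{
    $tokens = token_get_all($source);
    if ($tokens === [] || !is_array($tokens[0]) || $tokens[0][0] !== T_OPEN_TAG) {
        return false; // The file must begin with an open tag.
    }
    $open_tags = 0;
    foreach ($tokens as $token) {
        if (!is_array($token)) {
            continue; // Single-character tokens such as ";" are plain strings.
        }
        if ($token[0] === T_INLINE_HTML) {
            return false; // Anything outside PHP tags means embedded (template) content.
        }
        if ($token[0] === T_OPEN_TAG || $token[0] === T_OPEN_TAG_WITH_ECHO) {
            $open_tags++;
        }
    }
    return $open_tags === 1;
}

// Hypothetical wrapper: include a file only when it passes the check above.
function script_checked(string $path)
{
    $source = file_get_contents($path);
    if ($source === false || !is_script_only($source)) {
        throw new RuntimeException("Refusing to include non script-only file: $path");
    }
    return include $path;
}
```

Note that the file is read twice (once for the check, once by include), so this sketch does not protect against a file that changes between the two reads.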

Patch

This is a PoC patch that only controls template mode. It has no script()/script_once() implementation.

https://gist.github.com/2266652

Vote

VOTE: 2014/2/16 - 2014/2/22

Most LFI can be prevented by using script-only program inclusion.

Introduce script()/script_once()
Real name Yes No
Final result: 0 0
This poll has been closed.

Directly called scripts cannot use script()/script_once(). Remove the inconsistency between directly executed scripts and indirectly executed scripts.

Allow omitting the script open tag for direct script execution
Real name Yes No
Final result: 0 0
This poll has been closed.

Thank you for voting.

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged to
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature

Discussions

Motivation of this RFC

The motivation for this RFC is

  1. “File Include” is a fatal security breach.
  2. The reason PHP is less secure against “File Include” than other languages is “Mandatory embed mode”.
  3. Non-mandatory embed mode gives users the option of better security.

With this RFC, PHP could be as safe as other scripting languages with respect to file includes. This RFC is fully compatible with current code. Writing backward compatible code takes as few as 3 lines.

Most security measures are not perfect solutions but mitigations, just like canaries and DEP. I suppose people who are concerned with security understand the value of these protections.

Is there any good reason not to have non-mandatory embedded mode as an additional security measure? Why not make it harder for attackers to exploit?

In short, I'm really annoyed to hear “PHP is less secure than Ruby/Perl/Python/etc”.

This must be noted: even though this RFC emphasizes security, it is not about introducing a new security measure. This RFC is about changing an embedded-only language into a non-embedded/embedded language. A side effect of this change is better security against LFI.

Why is this better than now?

  • Framework authors
    • Framework developers will have effective protection against suicide code like “include $_REQUEST['var'];” in user scripts.
    • If a framework adopts template_mode=off, the suicide code above will raise syntax errors for LFI code execution and information disclosure attack patterns.
      • Note: If the file is valid PHP code, code execution is still possible. However, the risk is the same as or lower than with Ruby/Perl/Python/etc.
    • It could take as few as 3 lines to adopt template_mode=off.
    • Framework developers are not required to rewrite include/require statements.
    • Frameworks can stay compatible with older PHP.
    • Fully compatible with current code. (template_mode=on)
    • Mostly compatible with current code. (template_mode=off)
      • <?php, <?, <% at the beginning of a file are allowed regardless of template_mode for maximum compatibility.
  • Programmers
    • Programmers don't have to do or know anything if their framework adopts template_mode=off.
    • No new syntax or parameters.
    • All files are treated as PHP script-only files. (template_mode=off)
      • Programmers can easily determine which files will be executed as PHP scripts.
        • Getting MIME types would be enough.
      • Embedding script attacks become impossible.
        • An open tag is only allowed at the beginning of a file.
        • No close tag is allowed with template_mode=off.
        • Note: if an attacker can inject at the beginning of a file or construct valid syntax, this is not the case.
      • Including non-PHP files will almost always result in syntax errors.
        • include('/etc/passwd'), include('config/secret.yml'), include('uploaded_image/attack.jpg') and so on.
        • Current PHP just dumps them to attackers.
      • Programmers can have confidence when locating uploaded data files (images/documents/etc) under the docroot for better performance.
        • LFI can be done without uploaded files, but abuse of uploaded files is a common attack pattern.
  • Attackers
    • When attackers try to exploit LFI with template_mode=off,
      • the attack file must be a valid PHP script.
      • making a valid attack file is very difficult, so in most cases
        • the attacker cannot execute scripts.
        • the attacker cannot steal data/information.
  • Auditors, Managers and Decision makers
    • The amount of code that has to be audited with extreme care will be reduced.
    • Even if there is a vulnerable piece of code, it becomes very difficult to exploit it.
    • Decision makers who worry about infamous PHP LFI will no longer be against adopting PHP.
  • PHP Developers
    • PHP's infamous LFI reputation can be removed.
      • PHP becomes as strong as, or stronger than, other scripting languages against LFI.
      • If you are curious about this reputation, search CVE for the number of LFI entries for PHP and for others.

The only loser here is the attacker. If you can think of a drawback that outweighs these benefits, please let me know. yohgaki@ohgaki.net

What are the compatibility issues?

  • No new syntax or parameters. Only the template_mode ini entry.
  • None when template_mode=on. (The default)
  • None for script-only files that begin with an open tag (<?php, <?, <%), regardless of template_mode.
  • When template_mode=off, include()/require() execute files as scripts. For embedded pages, users are supposed to control template_mode.
  • For framework developers, it should be a simple task to adopt template_mode: template_mode=off in the bootstrap, and template_mode=on and off in the render method. It would take as few as 3 lines of changes.
  • Framework users don't even have to know about the change and still get the benefits if their framework supports this.

What protection should be done for LFI?

Data File Validation by tokenizer

To find out if a file contains PHP code, the tokenizer may be used.

$token = token_get_all(file_get_contents($file)); // token_get_all() takes source code, not a file name.
if (has_php_token($token)) { // there is no has_php_token(), implement your own function.
   echo 'File has PHP token';
}

Note that this method can produce false positives. The complete solution is to adopt template_mode=off and make sure that all files have headers/data formats which cause syntax errors with LFI.
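
One possible user-land implementation of the has_php_token() placeholder used above is sketched here. It is an assumption, not part of this RFC: it treats any open tag token as "has PHP", it requires the tokenizer extension, and whether “<?” counts as an open tag depends on the short_open_tag setting. The type declarations need PHP 7.0 or later.

```php
<?php
// Hypothetical implementation of has_php_token(): takes the array returned
// by token_get_all() and reports whether it contains any PHP open tag.
function has_php_token(array $tokens): bool
{
    foreach ($tokens as $token) {
        // Multi-character tokens are arrays of [id, text, line]; others are strings.
        if (is_array($token)
            && ($token[0] === T_OPEN_TAG || $token[0] === T_OPEN_TAG_WITH_ECHO)) {
            return true;
        }
    }
    return false;
}

// Usage with an in-memory string instead of a file.
$upload = "GIF89a<?php echo 'injected'; ?>";
if (has_php_token(token_get_all($upload))) {
    echo "File has PHP token\n";
}
```

Binary data that happens to contain an open tag sequence is reported as well, which is the false positive case noted above.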

Check List

  • Common protections
    • ALWAYS VALIDATE USER INPUTS to check whether each input is a reasonable one or not. (Not only for LFI, but for general security)
      • Validate that parameters contain only valid chars. (i.e. no “../”. Use a white list of chars, not a black list.)
      • Validate that parameters are not too long. (i.e. file names should not be too long, such as “valid/path/../../../../../../../etc/passwd”)
      • For example: if (strlen($filename) != 32 || strspn($filename, '0123456789ABCDEF') != 32) die('Invalid file name as MD5 sum');
      • Avoid regular expressions as much as possible.
      • Never use ereg() to validate.
        • It is not binary safe and it is line oriented, so it can't be used for validation.
    • Always add the file extension manually for include()/require()/include_once()/require_once().
      • Use null byte protection! Null byte protection requires 5.3.4 or later.
      • For example: include($module_name.'.php'); This ensures that only “.php” files can be read by include() with recent PHP.
      • If you are using PHP 5.3.3 or earlier, null byte protection is not available. You MUST make sure strings do not contain null bytes.
    • Add safe prefix for include()/require()/include_once()/require_once().
      • This protects against absolute path attack.
      • For example: include('/my/base/path/'.$module_name.'.php');
    • Validate data file.
      • It would not help much without this RFC, but it should be done anyway.
      • Check MIME types by mime extension at least.
      • Check data file has proper headers.
      • For free format data files, add your own header.
    • If you are using Unicode Normalization, you must know what you are doing.
      • Always normalize first, then validate.
    • Use the open_basedir ini option to reduce damage when there is LFI.
    • Do not disclose your system information.
      • Do not write error that contains path or anything that may help attackers.
    • Be protective
      • Enable all errors.
      • Write code that will raise no errors/exceptions for normal execution.
      • Stop script execution for all errors/exceptions that are unexpected.
      • Log and notify admin when unexpected error occurred.
      • Give least permission to web server processes.
      • For Linux, use SELinux.
  • When template_mode=on for all scripts, it's the same as now.
    • For images/audio/video, check that the meta information is correct.
    • For image files, converting to another format or adding a watermark may help.
      • An attacker may use the color table to inject PHP code.
      • The color table does not change when the size is changed.
      • An attacker may still use the color table after conversion.
      • Beware that converting back to the original format may restore the color table.
    • For all files to be protected, insert <?php die() ?> where it is possible.
      • It should be inserted as close to the beginning of the file as possible.
      • The location to insert at depends on the data file, but if the data file has a reserved area, insert <?php die()?> into that area.
    • The previous requirement is almost impossible to enforce, since PHP may read any file it has permission for. (e.g. inserting <?php die()?> in /etc/passwd is possible, but it is obviously not feasible.)
    • Consider template_mode=off, since embedded mode is dangerous by its nature.
  • When template_mode=off in php.ini or the bootstrap script,
    • Enable template_mode only when it is absolutely required.
    • If there are free format data files, add some header so that LFI causes a syntax error.
    • Note: When template_mode=off, even if the programmer dropped all protections, LFI would result in syntax errors in most cases.
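
Several of the common protections above (white list of chars, length limit, fixed prefix, manually added extension) can be combined in one helper. The sketch below is only an illustration: safe_module_path() is a hypothetical name, the 64 byte limit and the base path are arbitrary assumptions, and the nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: builds an include path from a module name, or
// returns null when the name fails validation.
function safe_module_path(string $module_name, string $base_dir = '/my/base/path/'): ?string
{
    // White list of allowed chars: no "/", no ".", no null bytes.
    $allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_';
    $length = strlen($module_name);
    if ($length === 0 || $length > 64) {
        return null; // Empty or suspiciously long name.
    }
    if (strspn($module_name, $allowed) !== $length) {
        return null; // The name contains a char outside the white list.
    }
    // The safe prefix blocks absolute path attacks; the extension is added manually.
    return $base_dir . $module_name . '.php';
}

$path = safe_module_path('user_profile');
if ($path === null) {
    die('Invalid module name');
}
// include $path; // The caller includes the file only after validation.
```

Comparing strspn() against the full string length, rather than against a fixed number, ensures that trailing characters after a valid prefix are rejected too.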

Will a lot of effort be needed to adopt template_mode?

  • No. Users may stick to the current behavior. The current behavior (template_mode=on) is the default.
  • Even if users decide to adopt template_mode=off, the changes required are as few as 3 lines. For modern frameworks, it would be a simple task.
  • Framework users do not have to do or learn anything as long as they follow their framework's practices.

template_mode is similar to allow_url_include

  • allow_url_include:
    • enable only when url include is needed.
    • prevents RFI which results in critical security incident.
  • template_mode:
    • enable only when template mode is needed.
    • prevents LFI which results in critical security incident.
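
As a rough illustration of treating these switches as safeguards, a bootstrap script could report include-related settings that weaken protection. This is a sketch under assumptions: audit_include_settings() is a hypothetical helper, it only looks at allow_url_include and open_basedir (both existing ini entries), and template_mode is left out because it exists only with this RFC's patch.

```php
<?php
// Hypothetical helper: returns warnings for risky include-related ini values.
// $ini maps ini entry names to their values as returned by ini_get().
function audit_include_settings(array $ini): array
{
    $warnings = [];
    // Boolean ini values may come back as "1", "On", "" and so on.
    if (filter_var($ini['allow_url_include'] ?? false, FILTER_VALIDATE_BOOLEAN)) {
        $warnings[] = 'allow_url_include is enabled: RFI is possible.';
    }
    // An empty open_basedir means file access is not restricted.
    if ((string) ($ini['open_basedir'] ?? '') === '') {
        $warnings[] = 'open_basedir is not set: LFI can reach any readable file.';
    }
    return $warnings;
}

$warnings = audit_include_settings([
    'allow_url_include' => ini_get('allow_url_include'),
    'open_basedir'      => ini_get('open_basedir'),
]);
foreach ($warnings as $warning) {
    error_log($warning);
}
```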

template_mode=off is a safeguard

  • This is not a direct security measure, but a safeguard for risk mitigation.
    • It is a side effect of making PHP a non-mandatory embedded language.
  • The mitigation factor is large if programmers follow the guidelines.
  • Think of it as canary protection against stack smashing or DEP against heap overflow.
  • Since LFI is critical, additional protections are preferred whenever possible.

template_mode=off could be a security issue

  • When template_mode=off, PHP scripts may be exposed.
    • Files that contain the open tag “<?php” or the like at the beginning are treated as PHP scripts regardless of template_mode.
    • Template files/legacy code that do not have “<?php” or the like at the top may be exposed without proper handling. This will mostly result in syntax errors, since close tags “?>” are not allowed. However, this might still be an issue.
      • PHP scripts that are not meant to be accessed by users should be located outside the docroot. This is good practice even without this RFC.
      • PHP users are used to such cases(?) since short tags have been optional for a long time.

Security measure or accident?

  • Script-only mode causes syntax errors for LFI attacks, but it's an accident.
  • Many security measures rely on accidental protections.
  • ASLR is one kind of security by accident: it makes the attacker's guess not work.

Background

Compared to other languages such as Ruby/Perl/Python/Lua/etc, current PHP is known to be weak against script injection/information disclosure. This proposal aims to reduce the LFI (Local File Include) risk to the level of other scripting languages by making embedded scripting mode optional.

PHP has been an embedded language since it was born. To execute PHP code, a PHP tag is required, and the PHP tag is mandatory. Mandatory embed mode was reasonable 10 years ago, but it has become less reasonable with recent framework adoption. PHP has a security issue known as LFI (Local File Include)/RFI (Remote File Include). The risk of LFI/RFI was reduced by the null byte check in the fopen wrapper and the allow_url_include php.ini option. However, there is a remaining risk that cannot be ignored.

LFI/RFI is a fatal security error. Currently, LFI prevention is up to programmers, and a fatal LFI may happen anywhere in a PHP script. It is preferable to have a systematic approach that reduces LFI risk as much as possible. This RFC provides additional protection against LFI through optional embedded mode.

Mandatory Embed mode = Less Security

There are countermeasures: allow_url_include=off for RFI, and the null byte check in the fopen wrapper and open_basedir for LFI. However, injecting PHP code into a file is easy because of the mandatory embedded script nature. A simple vulnerability in include()/require() could result in a fatal code execution and/or fatal information disclosure security incident.

Detection of PHP script inclusion may be done for data files. However, it's not a simple task.

  • <?, <% should be checked even if the file is binary. It's possible to have “<?” or “<%” in a binary. One should check whether it contains a valid (executable) PHP script or not.
  • <script language=“php”> should be checked. “<script language=” is common in HTML files. Checking “<script language=” is not a simple task.
  • The most reliable method is to use token_get_all(), but the tokenizer is an optional module.
  • Almost no applications check these, and it would be impossible to make developers do proper validation.
  • Nothing can be done for files that are not controlled by the PHP application.

Data files may be protected by injecting “<?php die()?>” into them.

  • Data before <?php die() ?> may be disclosed.
  • It is possible to inject <?php die() ?> into some data files like /etc/passwd, but it is not feasible.
  • Injecting <?php die() ?> into all data files is impossible.
  • Note: With template_mode=off, you don't have to inject die() into data files to protect against LFI disclosure/execution.

LFI is also used for information disclosure. Disabling embedded (template) mode is a simple protection against LFI.

PHP script injection examples WITHOUT file uploads

A common novice PHP programmer mistake is writing code like

include $_GET['name'];

Because PHP is an embedded language and the nature of an embedded language cannot be changed, this kind of code raises fatal security issues. An attacker may use any uploaded file that was not properly validated to execute PHP code. An attacker may also read any file that the PHP process has permission to read.

Note: Please keep in mind that this RFC aims to reduce LFI risk even when developers lack full knowledge of LFI. For example, not many people realize that LFI may be combined with SQL injection, and that null byte protection does not help at all in this case.

Session Data

Many applications store user inputs in the $_SESSION array and use files as session storage. If an attacker can inject a PHP script into the $_SESSION array, all they have to do is find a vulnerable include()/require(). The session save_path can be guessed easily.

Attacker may

  1. Find LFI
  2. Find a way to inject PHP script in $_SESSION
  3. Guess session save_path
  4. Take over target server
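
The reason session files are usable for step 2 is that string values are stored verbatim. The sketch below illustrates this with serialize(), on the assumption that the file-based session handler stores values in a serialize()-like format. The variable names are hypothetical.

```php
<?php
// A user-supplied value that an application might store in $_SESSION.
$session = ['comment' => '<?php echo "injected"; ?>'];

// Strings are written out byte for byte, so the open tag survives on disk.
$stored = serialize($session);
$contains_open_tag = strpos($stored, '<?php') !== false;

echo $stored, "\n";
echo $contains_open_tag ? "Open tag survives serialization\n" : "No open tag found\n";
```

With mandatory embedded mode, including such a file executes the stored code. With script-only inclusion, the serialized data in front of the tag would be parsed as PHP source and would most likely fail with a syntax error.
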

Cached Data

Modern applications cache data in files for better performance. Since the cache is only used by the application, programmers do not care whether it contains PHP code or not. An attacker may follow steps similar to those for session data exploitation to execute an attack script.

Email

Some applications save email messages. These messages may be used for LFI.

Log Files

Errors and accesses are usually logged. These logs may be used for LFI.

SQL injection

SQL injection is known to be able to execute arbitrary code by itself with certain configurations. With PHP's LFI, SQL injection may be used for arbitrary code execution, since many SQL database servers support writing data to a file.

  1. Find SQL injection
  2. Find means to inject PHP script into SQL data
  3. Save SQL data into file
  4. Use LFI to execute the code

Since the attacker has control of the full file name, null byte protection does not work at all.

When both LFI and SQL injection can be abused, an attacker may steal data without blind SQL injection. The attacker may simply dump data into a file and use LFI to steal it.

  1. Find SQL injection
  2. Save SQL data into file
  3. Use LFI to steal data

It's much simpler than blind SQL injection. It may also be possible to bypass certain IPS/IDS.

Information disclosure example

All developers know how include('/etc/passwd') or include('.htaccess') works.

include $_REQUEST['var'];

The above code may be used to read sensitive data such as /etc/passwd, .htaccess, .htpasswd or any other files attackers may be interested in.

Note: open_basedir will not be set by default.
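
For contrast with “include $_REQUEST['var'];”, a white list lookup keeps the request value out of the file name entirely. This is a sketch with assumed names: resolve_page(), the page keys and the paths are all hypothetical, and the type declarations need PHP 7.0 or later.

```php
<?php
// Hypothetical helper: maps a requested page name to a known script path,
// falling back to a default when the name is not on the white list.
function resolve_page(string $requested, array $pages, string $default): string
{
    return $pages[$requested] ?? $default;
}

// Only these files can ever be included, whatever the request contains.
$pages = [
    'home'  => '/my/base/path/home.php',
    'about' => '/my/base/path/about.php',
];

$requested = isset($_REQUEST['var']) && is_string($_REQUEST['var']) ? $_REQUEST['var'] : 'home';
$script = resolve_page($requested, $pages, $pages['home']);
// include $script; // The path comes from the white list, never from user input.
```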

Changelog

  1. 2012-04-01 Moriyoshi Koizumi: Initial.
  2. 2012-04-09 Yasuo Ohgaki: Version 1.1. Made RFC a serious RFC. Added security issue discussion for embedded mode. Proposal for php.ini and CLI options.
  3. 2012-04-10 Yasuo Ohgaki: Version 1.2. More examples.
  4. 2012-04-11 Yasuo Ohgaki: Version 1.3. Reorganized sections and sentences.
  5. 2012-04-12 Yasuo Ohgaki: Version 1.4. Simplified summary. Reorganized sections and sentences.
  6. 2012-04-16 Yasuo Ohgaki: Version 1.4. Added ENV var switch and script()/script_once().
rfc/nophptags.txt · Last modified: 2018/06/18 10:18 by cmb