rfc:readable_var_representation
Differences
This shows you the differences between two versions of the page.
rfc:readable_var_representation [2021/01/23 20:01] – tandre | rfc:readable_var_representation [2021/02/06 00:30] – Start voting tandre | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== PHP RFC: var_representation() : readable alternative to var_export() ====== | ====== PHP RFC: var_representation() : readable alternative to var_export() ====== | ||
- | * Version: 0.2 | + | * Version: 0.3 |
* Date: 2021-01-22 | * Date: 2021-01-22 | ||
* Author: Tyson Andre, tandre@php.net | * Author: Tyson Andre, tandre@php.net | ||
- | * Status: | + | * Status: |
* First Published at: http:// | * First Published at: http:// | ||
* Implementation: | * Implementation: | ||
Line 20: | Line 20: | ||
- Unconditionally return a string instead of printing to standard output. | - Unconditionally return a string instead of printing to standard output. | ||
- Use '' | - Use '' | ||
+ | - Escape control characters including tabs, newlines, etc., unlike var_export()/ | ||
- Change the way indentation is done for arrays/ | - Change the way indentation is done for arrays/ | ||
- Render lists as ''" | - Render lists as ''" | ||
- Always render empty lists on a single line instead of two lines. | - Always render empty lists on a single line instead of two lines. | ||
- Prepend '' | - Prepend '' | ||
- | - Support the bit flag '' | + | - Support the bit flag '' |
<code php> | <code php> | ||
Line 48: | Line 49: | ||
php > echo var_representation([]); | php > echo var_representation([]); | ||
[] | [] | ||
+ | // lines are indented by a multiple of 2, similar to var_export but not exactly the same | ||
+ | php > echo var_representation([(object) [' | ||
+ | [ | ||
+ | (object) [ | ||
+ | ' | ||
+ | ' | ||
+ | 1.0, | ||
+ | ], | ||
+ | ' | ||
+ | 2, | ||
+ | ]), | ||
+ | ], | ||
+ | ' | ||
+ | ], | ||
+ | ] | ||
</ | </ | ||
Line 61: | Line 77: | ||
' | ' | ||
] | ] | ||
+ | |||
+ | |||
</ | </ | ||
Line 84: | Line 102: | ||
=== Encoding binary data === | === Encoding binary data === | ||
This does a better job at encoding binary data in a form that is easy to edit. | This does a better job at encoding binary data in a form that is easy to edit. | ||
- | var_export() | + | var_export() |
- | even control characters such as tabs, vertical tabs, backspaces, carriage returns, etc. | + | not even control characters such as tabs, vertical tabs, backspaces, carriage returns, newlines, etc. |
<code php> | <code php> | ||
php > echo var_representation(" | php > echo var_representation(" | ||
" | " | ||
+ | // var_export gives no visual indication that there is a carriage return before that newline | ||
php > var_export(" | php > var_export(" | ||
'' | '' | ||
' . " | ' . " | ||
+ | // Attempting to print control characters to your terminal with var_export may cause unexpected side effects | ||
+ | // and unescaped control characters are unreadable | ||
+ | php > var_export(implode('', | ||
+ | '' | ||
+ | |||
+ | |||
+ | hp > // (first character and closing ' was hidden by those control characters) | ||
+ | php > echo var_representation(implode('', | ||
+ | " | ||
+ | |||
// Bytes \x80 and above are passed through with no modification or encoding checks. | // Bytes \x80 and above are passed through with no modification or encoding checks. | ||
Line 185: | Line 214: | ||
* You are generating a snippet of code to '' | * You are generating a snippet of code to '' | ||
* The output is occasionally or frequently read by humans (e.g. CLI or web app output, a REPL, unit test output, etc.). | * The output is occasionally or frequently read by humans (e.g. CLI or web app output, a REPL, unit test output, etc.). | ||
- | * The output contains control characters such as newlines, tabs, '' | + | * The output contains control characters such as newlines, tabs, '' |
* You want to unambiguously see control characters in the raw output regardless of how likely they are (e.g. dumping php ini settings, debugging mysterious test failures, etc) | * You want to unambiguously see control characters in the raw output regardless of how likely they are (e.g. dumping php ini settings, debugging mysterious test failures, etc) | ||
* You are writing unit tests for applications supporting PHP 8.1+ (or a var_representation polyfill) that test the exact string representation of the output (e.g. phpt tests of php-src and PECL extensions) - see the section [[# | * You are writing unit tests for applications supporting PHP 8.1+ (or a var_representation polyfill) that test the exact string representation of the output (e.g. phpt tests of php-src and PECL extensions) - see the section [[# | ||
Line 194: | Line 223: | ||
This flag may be useful when any of the following apply: | This flag may be useful when any of the following apply: | ||
- | * You are writing or modifying tests of exact variable representation and want to write the equivalent of '' | + | * You are writing or modifying tests of exact variable representation and want to write the equivalent of |
+ | <code php> | ||
+ | $this->assertSame(" | ||
+ | // instead of the much longer and harder to type | ||
+ | $this->assertSame(" | ||
+ | </ | ||
* You are generating human-readable output and expect the output to be a small object/ | * You are generating human-readable output and expect the output to be a small object/ | ||
* You want the output to be as short as possible while still being somewhat human readable, e.g. sending an extremely long array representation over the network, or are saving it to a file/ | * You want the output to be as short as possible while still being somewhat human readable, e.g. sending an extremely long array representation over the network, or are saving it to a file/ | ||
Line 313: | Line 347: | ||
Adding more flags here would increase the scope of the rfc and complexity of implementing the change and for reviewing/ | Adding more flags here would increase the scope of the rfc and complexity of implementing the change and for reviewing/ | ||
+ | |||
+ | === Supporting an indent option === | ||
+ | |||
+ | This was left out since I felt it would increase the scope of the RFC too much. | ||
+ | |||
+ | If an '' | ||
+ | |||
+ | The fact that embedded newlines are now no longer emitted as parts of strings makes it easier to efficiently convert the indentation to spaces or tabs using '' | ||
+ | |||
+ | <code php> | ||
+ | php > echo var_representation([[[' | ||
+ | [ | ||
+ | [ | ||
+ | [ | ||
+ | ' | ||
+ | ], | ||
+ | ], | ||
+ | ] | ||
+ | php > echo preg_replace('/ | ||
+ | [ | ||
+ | [ | ||
+ | [ | ||
+ | ' | ||
+ | ], | ||
+ | ], | ||
+ | ] | ||
+ | </ | ||
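
The re-indentation idea above can be sketched as follows. This is a minimal illustration, not part of the proposal: it assumes the output is indented by a multiple of two spaces (as stated earlier in this RFC) and that strings never contain raw newlines, so every line start is indentation. The helper name ''reindent()'' and the sample string are hypothetical.

<code php>
<?php
// Hypothetical helper (not from the RFC): convert leading two-space indentation
// into another indentation unit, e.g. a tab or four spaces.
function reindent(string $text, string $unit = "\t"): string
{
    return preg_replace_callback(
        '/^(?:  )+/m', // one or more pairs of spaces at the start of each line
        static fn(array $m): string => str_repeat($unit, intdiv(strlen($m[0]), 2)),
        $text
    );
}

// Illustrative sample shaped like a nested array representation (assumed, not real output).
$sample = "[\n  [\n    [\n      'key' => 'value',\n    ],\n  ],\n]";
echo reindent($sample, '    '), "\n";
</code>

Because the replacement only looks at line starts, it is only safe when newlines inside strings are escaped rather than emitted raw, which is the property the paragraph above relies on.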
==== Adding magic methods such as __toRepresentation() to PHP ==== | ==== Adding magic methods such as __toRepresentation() to PHP ==== | ||
Line 330: | Line 392: | ||
It may be useful to override this string representation through additional flags, callbacks, or other mechanisms. | It may be useful to override this string representation through additional flags, callbacks, or other mechanisms. | ||
However, I don't know if there' | However, I don't know if there' | ||
+ | |||
+ | ==== Emitting code comments in result about references/ | ||
+ | |||
+ | Adding a comment such as ''/ | ||
+ | |||
+ | (Or ''/ | ||
===== Discussion ===== | ===== Discussion ===== | ||
Line 395: | Line 463: | ||
</ | </ | ||
- | I believe that the improvements of var_representation make adding a new function worth it. | + | I believe that the improvements of var_representation make adding a new function worth it. See the section [[# |
As mentioned earlier, a lot of existing php code depends on the exact default output of var_export() (e.g. unit tests of php-src itself and otherwise), which was introduced in php 4.2 and predates namespaces and short arrays. | As mentioned earlier, a lot of existing php code depends on the exact default output of var_export() (e.g. unit tests of php-src itself and otherwise), which was introduced in php 4.2 and predates namespaces and short arrays. | ||
Line 402: | Line 470: | ||
The last time '' | The last time '' | ||
- | ===== Proposed Voting Choices | + | ===== Vote ===== |
- | Yes/No, requiring 2/3 majority. | + | This is a Yes/ |
+ | |||
+ | <doodle title=" | ||
+ | * Yes | ||
+ | * No | ||
+ | </ | ||
===== References ===== | ===== References ===== | ||
Line 411: | Line 484: | ||
* https:// | * https:// | ||
* https:// | * https:// | ||
+ | |||
+ | ===== Appendix ===== | ||
+ | ==== Comparison of string encoding with other languages ==== | ||
+ | |||
+ | See https:// | ||
+ | < | ||
+ | ASCII is the American Standard Code for Information Interchange. | ||
+ | It is a 7-bit code (with 128 characters). Many 8-bit codes contain | ||
+ | ASCII as their lower half. The international counterpart of | ||
+ | ASCII is known as ISO 646-IRV. | ||
+ | </ | ||
+ | |||
+ | If there are any control characters (in the ranges \x00-\x1f and \x7f), '' | ||
+ | If there are no control characters, strings are represented the way '' | ||
+ | |||
+ | <code php> | ||
+ | php > echo var_representation(implode('', | ||
+ | " | ||
+ | php > echo var_representation(implode('', | ||
+ | " !\"# | ||
+ | </ | ||
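
The two-branch rule described above (a double-quoted form only when control characters are present) can be sketched in userland PHP. This is a rough approximation under stated assumptions, not the RFC's implementation: the function name ''sketch_string_repr()'' is hypothetical, and the choice of ''\n'', ''\r'', ''\t'' plus two-digit ''\x'' escapes for the remaining control characters is an assumption, since the exact escapes are cut off in this diff.

<code php>
<?php
// Hypothetical sketch (not the RFC's code) of the string-encoding rule.
function sketch_string_repr(string $s): string
{
    // No control characters (\x00-\x1f, \x7f): fall back to var_export()'s single-quoted form.
    if (!preg_match('/[\x00-\x1f\x7f]/', $s)) {
        return var_export($s, true);
    }
    // Otherwise build a double-quoted literal. Bytes \x80 and above are left untouched.
    $escaped = preg_replace_callback(
        '/[\x00-\x1f\x7f"$\\\\]/',
        static function (array $m): string {
            switch ($m[0]) {
                case "\n": return '\n';
                case "\r": return '\r';
                case "\t": return '\t';
                // Characters that are special inside double quotes get a backslash.
                case '"': case '$': case '\\': return '\\' . $m[0];
                // Assumed fallback: two-digit hex escape for other control characters.
                default: return sprintf('\x%02x', ord($m[0]));
            }
        },
        $s
    );
    return '"' . $escaped . '"';
}

echo sketch_string_repr("plain text"), "\n";
echo sketch_string_repr("line one\r\nline two"), "\n";
</code>

The result of either branch is a PHP string literal that evaluates back to the original bytes, which is the property the comparison with other languages in this appendix is about.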
+ | |||
+ | Python appears to have the same inner representation with shorter representations only for '' | ||
+ | |||
+ | <code python> | ||
+ | # \x00-\x1f | ||
+ | print(repr('' | ||
+ | ' | ||
+ | # \x20-\x7f | ||
+ | print(repr('' | ||
+ | ' !"# | ||
+ | </ | ||
+ | |||
+ | |||
+ | JSON escapes a wider range of control characters, but the format does not require escaping the delete character (\x7f), | ||
+ | |||
+ | <code javascript> | ||
+ | > console.log(JSON.stringify(" | ||
+ | " | ||
+ | > console.log(JSON.stringify(" | ||
+ | " !\"# | ||
+ | </ | ||
+ | |||
+ | Ruby has additional shorter escapes for '' | ||
+ | |||
+ | <code ruby> | ||
+ | puts(" | ||
+ | " | ||
+ | puts(" !\"# | ||
+ | " !\" | ||
+ | </ | ||
===== Rejected Features ===== | ===== Rejected Features ===== | ||
Line 509: | Line 633: | ||
- This may be much slower and end users may not expect that - a lot of small stream writes with dynamic C function calls would be something I'd expect to take much longer than converting to a string then writing to the stream. | - This may be much slower and end users may not expect that - a lot of small stream writes with dynamic C function calls would be something I'd expect to take much longer than converting to a string then writing to the stream. | ||
- Adding even more ways to dump to a stream/ | - Adding even more ways to dump to a stream/ | ||
+ | |||
+ | ==== Changing var_dump ==== | ||
+ | |||
+ | var_dump is a function whose goals I consider incompatible with those of var_representation(). | ||
+ | If an exact representation of reference cycles, identical objects, and circular object data is needed, the code snippet '' | ||
+ | |||
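
The snippet referred to above is cut off in this diff, so the following is only a guess at the general approach: a minimal sketch that builds an ''unserialize(...)'' expression from ''serialize()'' output using the existing ''var_export()''. It shows that object identity survives the round trip; the variable names are illustrative.

<code php>
<?php
// Assumed approach (the original snippet is truncated): wrap serialize() output
// in an unserialize() call so that shared objects stay shared when evaluated.
$shared = new stdClass();
$shared->name = 'shared';
$value = [$shared, $shared]; // the same object appears twice

$snippet = 'unserialize(' . var_export(serialize($value), true) . ')';
echo $snippet, "\n";

// Evaluating the generated expression restores both slots as one object.
$copy = eval("return $snippet;");
var_dump($copy[0] === $copy[1]); // bool(true)
</code>

A plain array/object literal cannot express that two elements are the same instance, which is why a serialized form is needed for this use case.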
+ | In particular, var_dump() dumps object ids, indicates objects that are identical to each other, shows recursion, and shows the presence of references. It also redundantly annotates values with their types, and generates output for types that cannot be evaluated (e.g. '' | ||
+ | |||
+ | Adding a comment such as ''/ | ||
- | ==== Changelog ==== | + | https:// |
- | * 0.2: Add the section "When would a user use var_representation?" | + | ===== Changelog ===== |
+ | * 0.2: Add the section "When would a user use var_representation?" | ||
+ | * 0.3: Add more examples, add discussion section on indent |
rfc/readable_var_representation.txt · Last modified: 2021/02/19 15:19 by tandre