rfc:flexible_heredoc_nowdoc_syntaxes

rfc:flexible_heredoc_nowdoc_syntaxes [2017/09/16 16:26] – created tpunt
rfc:flexible_heredoc_nowdoc_syntaxes [2017/11/15 20:04] tpunt
Line 3: Line 3:
   * Date: 2017-09-16
   * Author: Thomas Punt, tpunt@php.net
-  * Status: Draft
+  * Status: Accepted
   * First Published at: https://wiki.php.net/rfc/flexible_heredoc_nowdoc_syntaxes
  
Line 39: Line 39:
 // no indentation
 echo <<<END
-   a
-  b
- c
+      a
+     b
+    c
 END;
 /*
-   a
-  b
- c
+      a
+     b
+    c
 */
 
-// 1 space of indentation
+// 4 spaces of indentation
 echo <<<END
-   a
-  b
- c
- END;
+      a
+     b
+    c
+    END;
 /*
   a
  b
-c
-*/
-
-// 2 spaces of indentation
-echo <<<END
-   a
-  b
- c
-  END;
-/*
- a
-b
-c
-*/
-
-// 3 (or more) spaces of indentation
-echo <<<END
-   a
-  b
- c
-   END;
-/*
-a
-b
 c
 */
 </code>
  
-Tabs are supported as well. If tabs and spaces are intermixed (for whatever reason...), then each space and each tab is considered as 1 indentation. So if the closing marker is indented by 1 tab, and the heredoc/nowdoc body is indented by spaces, then regardless of the closing marked *looking* further indented, only 1 bit of whitespace will still be stripped from each line:
+If the closing marker is indented further than any lines of the body, then a ''ParseError'' will be thrown:
 <code php>
-// 1 tab indentation
 echo <<<END
-   a
-  b
- c
- END;
-/*
   a
  b
 c
-*/
+ END;
+
+// Parse error: Invalid body indentation level (expecting an indentation at least 5) in %s on line %d
 </code>
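As a rough illustration of the rule added above, the sketch below feeds two snippets to ''eval()'' so that the parse failure can be caught at runtime. It assumes PHP 7.3 or later (the release in which these changes shipped). The helper name ''parses()'' and both snippet strings are hypothetical and not taken from the RFC.

<code php>
<?php
// Hypothetical helper: compiles a snippet with eval() and reports whether it parses.
function parses(string $code): bool
{
    try {
        eval($code);
        return true;
    } catch (ParseError $e) {
        // Print whatever message the engine produced; the exact wording is not assumed here.
        echo get_class($e), ': ', $e->getMessage(), "\n";
        return false;
    }
}

// Body lines are indented by 2, 1 and 0 spaces; the closing marker by 5 spaces.
$tooDeep = "return <<<END\n  a\n b\nc\n     END;\n";

// Same body, but the closing marker is not indented further than any body line.
$ok = "return <<<END\n  a\n b\nc\nEND;\n";

var_dump(parses($tooDeep));
var_dump(parses($ok));
</code>

Going through ''eval()'' is only a convenience for the demonstration: a parse error in the file itself could not be caught by code in that same file.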
  
-Moral of the story: don't mix tabs and spaces...
+Tabs are supported as well; however, tabs and spaces **must not** be intermixed in the indentation of the closing marker or of the body (up to the closing marker). In each of the following cases, a ''ParseError'' will be thrown:
+<code php>
+// different indentation for body (spaces) and ending marker (tabs)
+
+ echo <<<END
+
+ END;
+
+
+// mixing spaces and tabs in body
+
+    echo <<<END
+    a
+     END;
+
+
+// mixing spaces and tabs in ending marker
+
+ echo <<<END
+   a
+ END;
+
+</code>
+
+These whitespace constraints have been included because mixing tabs and spaces for indentation is harmful to legibility.
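The tab and space distinction is hard to see in rendered text, so the following sketch spells the whitespace out with ''\t'' escapes and compiles each case through ''eval()''. It assumes PHP 7.3 or later. The snippet strings, the labels and the ''$results'' array are hypothetical and chosen only for this illustration.

<code php>
<?php
// Each snippet is a complete statement; "\t" is a tab, a literal space is a space.
$snippets = [
    'tabs only'                         => "return <<<END\n\t\ta\n\tEND;\n",
    'spaces in body, tab before marker' => "return <<<END\n  a\n\tEND;\n",
    'space and tab before marker'       => "return <<<END\n \ta\n \tEND;\n",
];

$results = [];
foreach ($snippets as $label => $code) {
    try {
        // On success, record the resulting string in an exported, quoted form.
        $results[$label] = var_export(eval($code), true);
    } catch (ParseError $e) {
        $results[$label] = 'ParseError';
    }
    echo $label, ' => ', $results[$label], "\n";
}
</code>

Only the first snippet, which is indented consistently with tabs, is expected to compile; one tab is stripped from its body line, leaving a single leading tab.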
+
+Ultimately, the purpose of stripping leading whitespace is to allow for the body of the heredoc and nowdoc to be indented to the same level as the surrounding code, without causing unnecessary (and perhaps undesirable) whitespace to prepend each line of the body text. Without this, developers may choose to de-indent the body text to prevent leading whitespace, which leads us back to the current situation of having indentation levels of code ruined by these syntaxes.
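A minimal sketch of that goal, assuming PHP 7.3 or later: the heredoc below sits at the indentation level of the function body, and only the indentation beyond the closing marker survives in the result. The function name ''indentedList()'' is hypothetical.

<code php>
<?php
// Hypothetical function, used only to give the heredoc some surrounding indentation.
function indentedList(): string
{
    // The closing marker is indented by 8 spaces, so 8 spaces are stripped from
    // every body line; "a" keeps 2 leading spaces, "b" keeps 1, "c" keeps none.
    return <<<END
          a
         b
        c
        END;
}

var_dump(indentedList() === "  a\n b\nc");
</code>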
  
 ==== Closing Marker New Line ====
 
-Removing the closing marker requirement will change code from:
+Currently, in order to terminate a heredoc or nowdoc, a new line **must** be used after the closing marker. Removing this requirement will change code from:
 <code php>
 stringManipulator(<<<END
Line 137: Line 136:
 </code>
 
-This change was actually brought up in a previous RFC ([[rfc:heredoc-scanner-loosening|PHP RFC: Loosening heredoc/nowdoc scanner]]). One of the big gotchas that it mentioned, however, was that if the ending marker was found at the start of a line, then regardless of whether it was apart of another word, it would still be considered as the ending marker. For example, the following would not work:
+This change was actually brought up in a previous RFC ([[rfc:heredoc-scanner-loosening|PHP RFC: Loosening heredoc/nowdoc scanner]]). One of the big gotchas that it mentioned, however, was that if the ending marker was found at the start of a line, then regardless of whether it was a part of another word, it would still be considered as the ending marker. For example, the following would not work (due to ''ENDING'' containing ''END''):
 <code php>
 $values = [<<<END
Line 151: Line 150:
 The implementation I am proposing avoids this problem by checking whether the found marker is followed by further characters and, if so, whether the marker together with those characters forms a valid identifier. This means that the terminating marker string will only be considered as such if it is matched exactly as a standalone, valid symbol (that is also found at the start of the line). This enables the above snippet to work.
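A short sketch of the matching rule, assuming PHP 7.3 or later: ''ENDING'' at the start of a line continues into a longer valid identifier, so it stays part of the body, while the standalone ''END'' that follows terminates the heredoc even though a comma comes straight after it. The second array element is a placeholder.

<code php>
<?php
// "ENDING" is not treated as the closing marker because "END" continues into a
// longer valid identifier. The standalone "END" on the next line closes the heredoc.
$values = [<<<END
a
b
ENDING
END, 'd e f'];

var_dump(count($values));
var_dump($values[0] === "a\nb\nENDING");
</code>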
  
-Something such as the following will still not work, however:
+Examples such as the following will still not work, however:
 <code php>
 $values = [<<<END
Line 160: Line 159:
 /*
 Parse error: syntax error, unexpected 'ING' (T_STRING), expecting ']' in %s on line %d
+*/
+
+echo <<<END
+END{$var}
+END;
+/*
+Parse error: syntax error, unexpected '$var' (T_VARIABLE) in %s on line %d
 */
 </code>
 
-(Notice the space after the first ''END''.) There is not a great deal that can be done about this. So the simple rule is: **do not choose a marker that appears in the body of the text** (though it would specifically have to occur at the start of a line in the text to cause problems).
+There is not a great deal that can be done about this. So the simple rule is: **do not choose a marker that appears in the body of the text** (though it would specifically have to occur at the start of a line in the text to cause problems).
 
 ===== Backward Incompatible Changes =====
Line 170: Line 176:
  
   * the colliding marker begins at the start of a line in the text
-  * the colliding marker can be seen as standalone
+  * the colliding marker can be seen as a standalone, valid symbol name
 
-The changes proposed in this RFC therefore comes down to whether you believe developers are responsible enough to choose non-colliding markers. I firmly believe that since we give developers the power to choose their own markers, then they should be responsible enough to choose markers that do not collide with the inner multiline text
+The changes proposed by this RFC therefore come down to whether you believe developers are responsible enough to choose non-colliding markers. I firmly believe that since we give developers the power to choose their own markers, they should be responsible enough to choose markers that do not collide with the inner multiline text.
-
-Therefore, I believe the tradeoff of making the heredoc and nowdoc syntaxes more flexible in return for requiring developers to actually choose good marker names is a tradeoff worth making.
 
 So to quickly reiterate, the changes proposed by this RFC will enable code such as the following:
Line 181: Line 185:
 {
     stringManipulator(<<<END
-       a
-      b
-     c
+   a
+  b
+ c
 END
 );
Line 210: Line 214:
 ===== Proposed Voting Choices =====
 
-There will be two votes, both requiring a 2/3 majority. The first will be regarding whether the closing marker should be able to be indented. The second will be whether the closing marker should remove the new line requirement. These votes are orthogonal to one-another (it doesn't matter if one fails).
+There will be two votes, both requiring a 2/3 majority. The first will decide whether the closing marker can be indented. The second will decide whether the new line requirement after the closing marker should be removed. These votes are orthogonal to one another (if one fails and the other passes, then the other still passes).
+
+Voting starts on 2017.11.01 and ends on 2017.11.15.
+
+<doodle title="Allow for the closing marker to be indented and for the leading whitespace to be stripped?" auth="tpunt" voteType="single" closed="true">
+   * Yes
+   * No
+</doodle>
+''''
+<doodle title="Remove the trailing new line requirement from the closing marker?" auth="tpunt" voteType="single" closed="true">
+   * Yes
+   * No
+</doodle>
 
 ===== Patches and Tests =====
rfc/flexible_heredoc_nowdoc_syntaxes.txt · Last modified: 2018/04/13 19:59 by nikic