rfc:engine_warnings

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
rfc:engine_warnings [2019/08/28 08:52] nikicrfc:engine_warnings [2020/08/03 12:41] (current) nikic
Line 2: Line 2:
   * Date: 2019-08-27   * Date: 2019-08-27
   * Author: Nikita Popov <nikic@php.net>   * Author: Nikita Popov <nikic@php.net>
-  * Status: Draft+  * Status: Implemented
   * Target Version: PHP 8.0   * Target Version: PHP 8.0
  
 ===== Introduction ===== ===== Introduction =====
  
-While newly introduced error conditions in the engine typically use ''Error'' exceptions, we have many old error conditions that use inappropriate severity levels for historical reasons. For example, accessing an undefined variable, while being a very severe programming error, only generates a notice. This RFC proposed to reevalute existing error conditions and reclassify their severity level as appropriate.+While newly introduced error conditions in the engine typically use ''Error'' exceptions, we have many old error conditions that use inappropriate severity levels for historical reasons. For example, accessing an undefined variable, while being a very severe programming error, only generates a notice. This RFC proposes to reevaluate existing error conditions and reclassify their severity level as appropriate.
  
 ===== Proposal ===== ===== Proposal =====
 +
 +==== General Guidelines ====
 +
 +As we don't have any existing rules on the matter, here are some general guidelines I try to follow in the following reclassification:
 +
 +  * Error exceptions should be the baseline for error conditions that indicate a programming error.
 +  * If there is an expectation that a certain error condition is commonly intentionally suppressed, especially in legacy code, an exception should not be used.
 +  * If the error condition is data-dependent, it may be preferable not to use an exception.
 +  * For error conditions that have known false positives, a notice should be used.
 +  * Avoid promoting from notice directly to Error exception. I'm only proposing this for the case of undefined variables, because it is so severely misclassified right now.
  
 ==== Proposed Classification ==== ==== Proposed Classification ====
Line 15: Line 25:
 The following table contains a list of errors with notice or warning severity generated in the engine, excluding warnings that are generated by functions which have an optimized opcode implementation. The following table contains a list of errors with notice or warning severity generated in the engine, excluding warnings that are generated by functions which have an optimized opcode implementation.
  
-The table shows both the current error level, as well as the proposed level. A rationale for the proposed change (or non-change) is provided below group of errors.+The table shows both the current error level, as well as the proposed level. A rationale for the proposed change (or non-change) is provided below each group of errors
 + 
 +The "undefined variable", "undefined array index" and "division by zero" error conditions are discussed separately below, because they are more controversial.
  
 ^ Message ^ Current Level ^ Proposed Level ^ ^ Message ^ Current Level ^ Proposed Level ^
-| Undefined variable: %s | Notice | Error exception | 
-| **Rationale:** In most cases, an undefined variable is a severe programming error. The current low classification of this error is likely a legacy from the Dark Ages of register_globals, where it was more normal to have variables that are not always available. ||| 
 | Attempt to increment/decrement property '%s' of non-object | Warning | Error exception | | Attempt to increment/decrement property '%s' of non-object | Warning | Error exception |
 | Attempt to modify property '%s' of non-object | Warning | Error exception | | Attempt to modify property '%s' of non-object | Warning | Error exception |
 | Attempt to assign property '%s' of non-object | Warning | Error exception | | Attempt to assign property '%s' of non-object | Warning | Error exception |
 | Creating default object from empty value | Warning | Error exception | | Creating default object from empty value | Warning | Error exception |
-| **Rationale:** These are all related errors that are generated when a property is accessed on a non-object inside a write context. If the non-object is "truthy" a warning is generated and the operation is ignored, if it is "falsy" an empty stdClass object is created instead. While auto-vivification is a core part of the language for arrays, the same is not the case for objects, and creating a property on a non-object is almost certainly a programming error rather than an intentional action. |||+| **Rationale:** These errors are generated when a property is accessed on a non-object inside a write context. If the non-object is "truthy" a warning is generated and the operation is ignored, if it is "falsy" an empty stdClass object is created instead. While auto-vivification is a core part of the language for arrays, the same is not the case for objects, and creating a property on a non-object is almost certainly a programming error rather than an intentional action. |||
 | Trying to get property '%s' of non-object | Notice | Warning | | Trying to get property '%s' of non-object | Notice | Warning |
 | Undefined property: %s::$%s | Notice | Warning | | Undefined property: %s::$%s | Notice | Warning |
-| **Rationale:** The first warning is for the same case as above, but for read contexts. The suggested classification is based on two facts: FirstPHP is generally somewhat lenient about reading of "missing" data, and there is likely significant code around there using this factso changing this into an exception would be ill-advisedSecond, apart from specific cases like JSON data in object representationavailable object properties are generally known and fixed and accessing a non-existing property is a severe programming error, similar to accessing an undefinded variableAs suchthe current notice classification is too low. |||+| **Rationale:** The first warning is for the same case as above, but for read contexts. This is classified as a warningbecause it usually indicates a programming error (in modern codeall non-magic properties tend to be known and fixed). Howeverobject properties can also be dynamic (e.g. JSON in object form)in which case accessing an undefined property may be less severe issueGenerallyPHP is somewhat lenient with read accesses to "missing" data. |||
 | Cannot add element to the array as the next element is already occupied | Warning | Error exception | | Cannot add element to the array as the next element is already occupied | Warning | Error exception |
 | **Rationale:** This error condition occurs when trying to push to an array for which the ''PHP_INT_MAX'' key is already used. This error condition practically never occurs outside of specially crafted code, and implies data loss if it does. As such, it is changed into an exception. ||| | **Rationale:** This error condition occurs when trying to push to an array for which the ''PHP_INT_MAX'' key is already used. This error condition practically never occurs outside of specially crafted code, and implies data loss if it does. As such, it is changed into an exception. |||
Line 49: Line 59:
 | A non well formed numeric value encountered | Notice | (Notice) | | A non well formed numeric value encountered | Notice | (Notice) |
 | **Rationale:** The difference between these two warnings is whether a string is completely non-numeric, or whether it has a numeric prefix. This is a runtime issue based on the specific string value involved in an operation, which may be user-controlled. For this reason we don't promote to exceptions. ||| | **Rationale:** The difference between these two warnings is whether a string is completely non-numeric, or whether it has a numeric prefix. This is a runtime issue based on the specific string value involved in an operation, which may be user-controlled. For this reason we don't promote to exceptions. |||
-| Accessing static property %s::$%s as non static | Notice | ??? +| Accessing static property %s::$%s as non static | Notice | (Notice) 
-| **Rationale:** This notice is somewhat confusing in what it does: It is thrown when accessing ''%%$obj->staticProp%%'' but does **not** actually read the static property. Instead it will fall back to using the dynamic property named ''staticProp''. There is more inconsistency in this area, in that accessing a protected static property on the object will generate an Error exception, even though it would not actually access that property. I'm not sure what to do here, both removing the notice completely and promoting it to an Error exception would be viable options to me. |||+| **Rationale:** This notice is somewhat confusing in what it does: It is thrown when accessing ''%%$obj->staticProp%%'' but does **not** actually read the static property. Instead it will fall back to using the dynamic property named ''staticProp''. There is more inconsistency in this area, in that accessing a protected static property on the object will generate an Error exception, even though it would not actually access that property. I'm not sure what to do here, but would be inclined to just leave it alone. |||
 | Array to string conversion | Notice | Warning | | Array to string conversion | Notice | Warning |
-| **Rationale:** This is generally a bug (and the "Array" string you get is meaningless), but in many cases also not a particularly severe one. ||| +| **Rationale:** This is generally a bug (and the "Array" string you get is meaningless), but in many cases also not a particularly severe one. Since [[rfc:tostring_exceptions|string conversion exceptions]] are supported now, we could also promote this to an Error exception, and I'generally open to that. |||
-| Undefined offset%d Notice | (Notice) | +
-| Undefined index: %s | Notice | (Notice) | +
-| **Rationale:** I believe that especially in legacy code it is pretty common to ignore undefined index diagnostics rather than using explicit isset checks. I'keeping it as a notice for that reason, but a warning might be preferable. |||+
 | Resource ID#%d used as offset, casting to integer (%d) | Notice | Warning | | Resource ID#%d used as offset, casting to integer (%d) | Notice | Warning |
-| **Rationale:** This is a meaningful operation, but should be performed with an explicit integer cast. Without that indication of intent, this is more likely a bug than not. |||+| **Rationale:** This is in principle a meaningful operation, but exotic enough that intent should be indicated with an explicit integer cast. |||
 | String offset cast occurred | Notice | Warning | | String offset cast occurred | Notice | Warning |
 | Illegal string offset '%s' | Warning | (Warning) | | Illegal string offset '%s' | Warning | (Warning) |
Line 63: Line 70:
 | Uninitialized string offset: %d | Notice | Warning | | Uninitialized string offset: %d | Notice | Warning |
 | Illegal string offset: %d | Warning | (Warning) | | Illegal string offset: %d | Warning | (Warning) |
-| **Rationale:** The former is used when reading an out-of-bounds string offset, the latter when writing to an out-of-bounds //negative// string offset (for positive offsets, the string is extended instead). The notice is promoted to a warning, as this is generally an algorithmic bug, and string offset access if far rarer than array offset access. |||+| **Rationale:** The former is used when reading an out-of-bounds string offset, the latter when writing to an out-of-bounds //negative// string offset (for positive offsets, the string is extended instead). In line with undefined index/property, we consistently generate a warning here. |||
 | Cannot assign an empty string to a string offset | Warning | Error exception | | Cannot assign an empty string to a string offset | Warning | Error exception |
-| **Rationale:** This indicates some kind of logic error. As this is a rare operation in the first place, throw an Error seems preferable here. ||| +| **Rationale:** This operation is not meaningful and indicates some kind of logic error. ||| 
-| Only variables should be passed by reference | Notice | ??? +| Only variables should be passed by reference | Notice | (Notice) 
-| Only variable references should be returned by reference | Notice | ??? +| Only variable references should be returned by reference | Notice | (Notice) 
-| Only variable references should be yielded by reference | Notice | ??? +| Only variable references should be yielded by reference | Notice | (Notice) 
-| Only variables should be assigned by reference | Notice | ??? +| Only variables should be assigned by reference | Notice | (Notice) 
-| Attempting to set reference to non referenceable value | Notice | ??? | +| Attempting to set reference to non referenceable value | Notice | (Notice) | 
-| Parameter %d to %s%s%s() expected to be a reference, value given | Warning | ??? +| Cannot pass by-reference argument %d of %s%s%s() by unpacking a Traversable, passing by-value instead | Warning | (Warning) 
-| Cannot pass by-reference argument %d of %s%s%s() by unpacking a Traversable, passing by-value instead | Warning | ??? +| **Rationale:** The use of values where a reference is expected is currently somewhat inconsistent, with everything from compiler errors, Error exceptions, warnings and notices being possible depending on the specific case. Passing a non-variable to a reference argument is often a programming error, because it will not be possible to modify the passed value and the reference cannot serve its purpose. However, this is complicated by optional reference arguments or return values that are optionally references. In both cases the warning may be a false positiveIt'not really clear what to do here, so I'm retaining the current classification for now. ||| 
-| **Rationale:** The use of values where a reference is expected is currently wildly inconsistent, with everything from compiler errors, Error exceptions, warnings and notices being possible depending on the specific case. Generally passing a non-variable to a reference argument is a programming error, because it will not be possible to modify the passed value and the reference cannot serve it'purpose. However, this is complicated by optional reference arguments (or return values for that matter), where it might be desirable to allow passing/returning null literalI'not sure what the best classification here is. |||+ 
 +==== Undefined variable ==== 
 + 
 +In most cases, accessing an undefined variable is a severe programming error. The current low classification is a legacy from the Dark Ages of PHP, where features like register_globals made conditionally defined variables more typical, and code quality standards were lower. 
 + 
 +Ideally, undefined variables should be compile errors, but as the dynamic nature of PHP precludes a reliable compile-time analysis, this RFC proposes to generate an Error exception instead. 
 + 
 +However, throwing an exception may complicate the upgrading of legacy code that currently suppresses the generation of notices wholesale, as the issue can no longer be ignored. Some people have even suggested that the use of undefined variables is a legitimate coding style choice. 
 + 
 +For this reason, a separate vote will decide whether we should throw an Error exception, generate a warning or keep the current notice. 
 + 
 +==== Undefined array index ==== 
 + 
 +Similarly to undefined variables or an undefined object properties, reading an undefined array index/key would generally be considered a programming error in modern PHP code. However, while variables and object properties are predominantly statically known (i.e., when variable variables and dynamic object properties are not used), the same is not true to array keys, which tend to be dynamic. 
 + 
 +Some languages, such as JavaScript, do not consider accesses to undefined array keys to be an error condition at all, and allow such an operation to be performed silently. While it is not predominant in the PHP world, some people subscribe to such a coding style also for PHP code, and as such would prefer undefined array key access to remain an easily suppressible notice.  
 + 
 +A separate vote will decide whether to elevate undefined array offset/index conditions to a warning, or leave them as notices.
  
 ==== Division by zero ==== ==== Division by zero ====
  
-Division by zero currently has somewhat inconsistent behavior. The ''%'' operator throws a ''DivisionByZeroError''. However, the ''/'' throws a "Division by zero" warning and returns one of +Inf, -Inf or NaN, following IEEE 754 semconds.+Division by zero currently has somewhat inconsistent behavior. The ''%'' operator throws a ''DivisionByZeroError''. However, the ''/'' throws a "Division by zero" warning and returns one of +Inf, -Inf or NaN, following IEEE 754 semantics.
  
-The rationale for this behavioral discrepancy is as follows: The ''%'' operator operators on integers. As such the conjugated operation is actually not ''/'', but rather ''intdiv()'', which //does// throw ''DivisionByZeroError''. An integer operation shouldn't return a floating point number, so throwing an Error exception is the only choice here.+The rationale for this behavioral discrepancy is as follows: The ''%'' operator works on integers. As such the conjugated operation is actually not ''/'', but rather ''intdiv()'', which //does// throw ''DivisionByZeroError''. An integer operation shouldn't return a floating point number, so throwing an Error exception is the only choice here.
  
-Not throwing an Error exception for division by zero using ''/'' is motivated by the fact that such division do have a well-defined result under IEEE 754. In some areas of application (such as numerics) it may be useful to not treat division by zero as an error condition at all, though such applications are unusual for PHP. Similarly, Bob Weinand argued that for reporting code that makes heavy use of divisions, it may be preferable to have the ability to suppress this error condition.+Not throwing an Error exception for division by zero using ''/'' is motivated by the fact that such division does have a well-defined result under IEEE 754. In some areas of application (such as numerics) it may be useful to not treat division by zero as an error condition at all, though such applications are unusual for PHP. Similarly, Bob Weinand argued that for reporting code that makes heavy use of divisions, it may be preferable to have the ability to suppress this error condition.
  
 On the other hand, the current behavior, and especially the discrepancy with ''%'' is quite unexpected, and many people expect that a division by zero error will in fact generate a DivisionByZeroError (duh). This discussion has already come up multiple times on the internals list and in pull requests. On the other hand, the current behavior, and especially the discrepancy with ''%'' is quite unexpected, and many people expect that a division by zero error will in fact generate a DivisionByZeroError (duh). This discussion has already come up multiple times on the internals list and in pull requests.
Line 89: Line 113:
 ===== Backward Incompatible Changes ===== ===== Backward Incompatible Changes =====
  
-Conversion of noticed to warnings is fairly harmless, because both continue execution after the diagnostic has been generated. Conversion to exceptions implies that the current control flow will be aborted.+Conversion of notices to warnings is fairly harmless, because both continue execution after the diagnostic has been generated. Conversion to exceptions implies that the current control flow will be aborted.
  
 This may impact code that makes very liberal use of the error suppression operator ''@'' or disables error reporting wholesale. The proposal does try to avoid changing notices that are more likely to be suppressed into exceptions. This may impact code that makes very liberal use of the error suppression operator ''@'' or disables error reporting wholesale. The proposal does try to avoid changing notices that are more likely to be suppressed into exceptions.
 +
 +If desired, an error handler can be provided that filters out the error conditions that will be turned into an exception, so that projects can focus on addressing them prior to an upgrade.
  
 ===== Vote ===== ===== Vote =====
  
-The main portion of this RFC will be voted as a single proposalHowever, I may split off specific warnings into separate votes, if I think they are controversial (such as the division by zero case).+All the following votes are **independent**Each requires a 2/3 majority and may pass/fail independent of other votes. Voting closes 2019-09-26. 
 + 
 +As the "undefined variable" vote is a 3-way voteacceptance is determined as follows: "Error exception" is accepted if it has 2/3 majority. Otherwise, "Warning" is accepted if the first two options together have 2/3 majority. Otherwise, "Keep Notice" applies. 
 + 
 +<doodle title="Change undefined variable severity to?" auth="nikic" voteType="single" closed="true"> 
 +   * Error exception 
 +   * Warning 
 +   * Keep Notice 
 +</doodle> 
 +
 +<doodle title="Change undefined array index severity to?" auth="nikic" voteType="single" closed="true"> 
 +   * Warning 
 +   * Keep Notice 
 +</doodle> 
 +
 +<doodle title="Change division by zero severity to?" auth="nikic" voteType="single" closed="true"> 
 +   * DivisionByZeroError exception 
 +   * Keep Warning 
 +</doodle> 
 +. 
 +<doodle title="Accept remaining classifications shown in the table above?" auth="nikic" voteType="single" closed="true"> 
 +   * Yes 
 +   * No 
 +</doodle> 
 + 
 +===== Changelog =====
  
 +  * 2019-09-12: Split out undefined index/offset into a separate section.
 +  * 2019-09-10: Keep current classification for reference errors.
 +  * 2019-08-28: Split off the "undefined variable" case into a separate vote, as it was a major point of contention on-list.
rfc/engine_warnings.1566982359.txt.gz · Last modified: 2019/08/28 08:52 by nikic