


Request for Comments: Namespace resolution

This RFC discusses the way identifiers inside a namespace are resolved.

Introduction

Generally, in namespaces we support fully qualified names. However, what happens if a name that is not fully qualified is used and is not defined inside the current namespace? Should this cause a fatal error, or should an attempt be made to resolve the name in the global namespace?
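To make the question concrete, here is a minimal sketch of the name forms involved. The namespace `Acme\Mail` and the functions `normalize()` and `demo()` are hypothetical and exist only for illustration. The sketch deliberately uses only forms whose resolution is not in dispute.

```php
<?php
namespace Acme\Mail;

// Hypothetical namespaced function, used only for illustration.
function normalize($address)
{
    // Fully qualified calls: unambiguous references to the global functions.
    // The open question of this RFC is what an unqualified trim($address)
    // should do here, given that Acme\Mail\trim() is not defined.
    return \strtolower(\trim($address));
}

function demo()
{
    return [
        normalize(' A@Example.org '),            // unqualified, defined in Acme\Mail
        \Acme\Mail\normalize(' A@Example.org '), // fully qualified, same function
    ];
}
```

Both entries returned by `demo()` come from the same function. The only case left undecided is an unqualified name that has no definition in the current namespace.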

Why do we need RFCs?

Obviously it's important that we make a conscious decision on these questions. Depending on how we approach this, users might unintentionally trigger autoload(), call functions in the global namespace they did not expect, or run into trouble when trying to migrate existing code to namespaces.

Possible approaches

Fallback to the global namespace

In this scenario, when an identifier does not resolve inside the current namespace, an attempt would be made to find the identifier in the global namespace.

When referencing global identifiers in namespaces, it is probably a reasonable assumption that the bulk of those references will be function calls. This is because currently most functionality in PHP is provided via functions. Also, once an instance of a class has been made, one does not have to reference the identifier in common usage (obviously there will still be cases, like instanceof/type hint checks). In the past, people often created classes for the sole reason of being able to sort of “namespace” their functions. Given that we now have real namespaces, class usage as a namespace replacement is no longer necessary. Still another possible assumption, which is considerably more dangerous to make, would be that most code that uses namespaces will mostly use classes inside the current namespace.

One noteworthy aspect here is that for classes we have autoload(). If non fully qualified identifiers can be used to reference global identifiers, “lazy” programmers can skip fully qualifying identifiers even if they have the full intention of referencing a global identifier. With autoload() this could trigger expensive operations, which are essentially useless. For functions, however, we do not have autoload() capabilities. This brings the advantage that falling back to the global namespace does not run the performance risk of autoload(). So a fallback would be much less expensive, but obviously there would still be overhead for not making intentional references to the global namespace fully qualified.

At the same time, the ability to automatically fall back to the global namespace makes it possible to overload global identifiers inside a namespace without having to modify existing code inside that namespace. This, however, can also be considered dangerously similar to the ambiguity issues we solved by changing the namespace separator. Static code analysis becomes more difficult, which is always the cost of overloading.

Furthermore, users need to be aware that if they are overloading internal identifiers, they have to make sure that the relevant code is loaded. For classes there is the autoload() approach, which would ensure that the class to overload is loaded on demand if necessary. Users who do not use autoload(), or who are overloading functions (and constants), run the risk of their code behaving differently in not so obvious ways if they do not always load all files defining the relevant functions (and constants) for this namespace.

One approach to make it at least noticeable when a fallback into the global namespace occurs would be to throw, for example, an E_NOTICE. This would obviously discourage users from using the fallback for overloading, but it would ensure that people migrating legacy code, or people who have not yet fully understood namespaces, would be able to find out where they are losing performance.

Another approach to reduce some of the issues is to simply remove functions (and constants) from namespaces.

As a result of the above notes, we might decide to go with one of a few different options, depending on how one weighs these aspects:
  1. only for functions/constants
  2. only for classes
  3. only for internal identifiers
  4. for everything

Only for functions/constants

Assumption: Most people will use global functions and namespaced classes.

By throwing an E_NOTICE when a fallback occurs, the performance issues become more manageable, but it would reduce the feasibility of overloading. Also note that if functions (and constants) were removed from namespaces, then most disadvantages would disappear, as functions (and constants) would always directly reference the global namespace.

Advantages
  1. Does not require fully qualified names for functions (and constants)
  2. No performance “bomb” with autoload()
  3. Ability to overload global functions (and constants)
Disadvantages
  1. Overloading global identifiers requires ensuring that all relevant files are loaded or unexpected behavior might occur
  2. There is still overhead for the fallback
  3. Classes still need fully qualified names
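The overloading trade-off for functions can be sketched as follows. The namespace `Acme\Text` and all function names in it are hypothetical. The sketch assumes an engine that falls back to the global namespace for unqualified function calls, which is how PHP releases that ship namespaces behave for functions, so it should run as written there.

```php
<?php
namespace Acme\Text;

// Hypothetical overload: shadows the built-in strlen() for unqualified
// calls made inside this namespace.
function strlen($value)
{
    return 'overloaded:' . \strlen($value);
}

function viaUnqualifiedName($value)
{
    // The current namespace is searched first, so this reaches Acme\Text\strlen().
    return strlen($value);
}

function viaQualifiedName($value)
{
    // The leading backslash always selects the built-in, overload or not.
    return \strlen($value);
}
```

If the file defining `Acme\Text\strlen()` were not loaded, `viaUnqualifiedName()` would silently use the built-in instead. This is the "relevant files must be loaded" disadvantage listed above.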

Only for classes

Assumption: People want to overload global classes

By throwing an E_NOTICE when a fallback occurs, the performance issues become more manageable, but it would reduce the feasibility of overloading.
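A rough userland emulation of a class fallback with a notice might look like the sketch below. The names `Acme\Loader`, `registerLoggingAutoloader()` and `resolveClass()` are hypothetical. `spl_autoload_register()` stands in for autoload(), and `E_USER_NOTICE` stands in for the E_NOTICE the engine would raise. It is an illustration of the cost, not a proposal for how the engine should implement it.

```php
<?php
namespace Acme\Loader;

// Hypothetical autoloader that only records which class names it is asked for.
// A real autoloader would usually hit the filesystem on every such request.
function registerLoggingAutoloader(\ArrayObject $log)
{
    \spl_autoload_register(function ($class) use ($log) {
        $log->append($class);
    });
}

// Hypothetical emulation of "current namespace first, then global namespace".
function resolveClass($name, $namespace)
{
    $local = $namespace . '\\' . $name;
    if (\class_exists($local)) {   // runs the autoloader when $local is undefined
        return $local;
    }
    if (\class_exists($name)) {
        // Make the fallback visible, as suggested in the text above.
        \trigger_error("Class $name resolved via global fallback", \E_USER_NOTICE);
        return $name;
    }
    return null;
}
```

Resolving `ArrayObject` from within `Acme\Loader` first asks the autoloader for `Acme\Loader\ArrayObject`, which is the wasted work referred to as the performance bomb, and only then falls back to the global class and raises the notice.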

Advantages
  1. Does not require fully qualified names for classes
  2. Ability to overload global classes
Disadvantages
  1. Functions (and constants) still need fully qualified names
  2. Possible performance bomb with autoload()

Only for internal identifiers

Assumption: People will leave the global namespace to PHP and namespace their own code.

Advantages
  1. Does not require fully qualified names for all internal identifiers
  2. Internal identifiers work the same inside and outside of namespaces (though overloading would still be possible)
Disadvantages
  1. Less clear rule, as it is not possible to tell from just reading the calling code whether something is internal or not
  2. Defining a function in userland code to emulate functionality from a newer PHP version will not enable fallbacks
  3. Higher performance overhead (???)

For everything

Assumption: People want to easily migrate their existing code, and beginners should not have to know (as much about) whether they are coding inside a namespace or not.

By throwing an E_NOTICE when a fallback occurs, the performance issues become more manageable, but it would reduce the feasibility of overloading. Also note that if functions (and constants) were removed from namespaces, then some of the disadvantages would disappear, as functions (and constants) would always directly reference the global namespace.

Advantages
  1. Does not require fully qualified names for all global identifiers
  2. Simple rule, everything falls back
  3. Ability to overload any kind of global identifier
Disadvantages
  1. There is overhead for the fallback
  2. Additionally there is a possible performance bomb with autoload()
  3. Overloading global identifiers requires ensuring that all relevant files are loaded or unexpected behavior might occur
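The four fallback options can be compared side by side with a small model. This is a sketch under stated assumptions and not engine code: the namespace `ResolutionSketch`, the functions `exampleSymbols()` and `resolve()`, the policy names and the sample symbol table are all hypothetical.

```php
<?php
namespace ResolutionSketch;

// Hypothetical symbol table: kind => [fully qualified name => is it internal?].
function exampleSymbols()
{
    return [
        'function' => ['strlen' => true, 'legacy_helper' => false, 'App\render' => false],
        'class'    => ['ArrayObject' => true, 'LegacyModel' => false],
    ];
}

// Returns the fully qualified name an unqualified $name would resolve to
// under the given fallback policy, or null if resolution fails.
function resolve(array $symbols, $policy, $kind, $name, $namespace)
{
    $local = $namespace . '\\' . $name;
    if (isset($symbols[$kind][$local])) {
        return $local;               // defined in the current namespace
    }
    if (!isset($symbols[$kind][$name])) {
        return null;                 // not defined globally either
    }
    $isInternal = $symbols[$kind][$name];
    switch ($policy) {
        case 'functions_constants':
            $allowed = $kind !== 'class';
            break;
        case 'classes':
            $allowed = $kind === 'class';
            break;
        case 'internal':
            $allowed = $isInternal;
            break;
        case 'everything':
            $allowed = true;
            break;
        default:                     // fully qualified names required everywhere
            $allowed = false;
    }
    return $allowed ? $name : null;
}
```

For example, under the `internal` policy the userland function `legacy_helper` does not fall back while `strlen` does. This is the "less clear rule" disadvantage noted above, since the calling code looks the same in both cases.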

Require fully qualified names everywhere

Assumption: People are willing to spend more time on updating their legacy code that they migrate to namespaces and adapt their coding when working within namespaces.

Advantages
  1. No risk of people relying on a fallback that gives the same result as a fully qualified name but with more overhead
Disadvantages
  1. Requires fully qualified names for all global identifiers
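Under this approach, namespaced code would qualify every global function, constant and class it touches. A minimal sketch, in which the namespace `Acme\Report` and the function `summarize()` are hypothetical:

```php
<?php
namespace Acme\Report;

function summarize(array $rows)
{
    // Every global identifier carries a leading backslash:
    // the class, both functions and the constant.
    $list = new \ArrayObject($rows);
    return \sprintf('%d rows', \count($list)) . \PHP_EOL;
}
```

The code is more verbose, but each reference states exactly which identifier is meant, and no lookup in the current namespace is needed first.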


rfc/namespaceresolution.1225363867.txt.gz · Last modified: 2017/09/22 13:28 (external edit)