
PHP RFC: Deprecate utf8_decode() and utf8_encode()

Introduction

The purpose of this RFC is described in bug #60429:

The purpose of the functions utf8_encode and utf8_decode is time and again misunderstood, and they have probably caused more encoding-related problems than they have solved. The biggest reason for this is their naming. Their purpose is to *convert* the encoding of a string from ISO-8859-1 to UTF-8 (utf8_encode) or from UTF-8 back to ISO-8859-1 (utf8_decode), yet they are named in a way that suggests some other magical function that is necessary to work with UTF-8 text. Users looking for “UTF-8 support” in their app quickly find these functions due to their naming and use them without understanding what they do, often testing only with ASCII text, which appears to work fine at first sight.
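The failure mode described above can be sketched as follows. This is an illustration only: the helper name convert_like_utf8_encode() is hypothetical, and it uses mb_convert_encoding() (assuming the mbstring extension is loaded) as a stand-in for utf8_encode(), treating every input byte as an ISO-8859-1 character. ASCII input passes through unchanged, while text that is already UTF-8 comes out double-encoded.

```php
<?php
// Hypothetical stand-in for utf8_encode(): interpret every input byte
// as an ISO-8859-1 character and re-encode it as UTF-8.
function convert_like_utf8_encode(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$ascii = 'hello';
$utf8  = "caf\u{E9}"; // "café", already UTF-8: the é is the two bytes C3 A9

$asciiResult   = convert_like_utf8_encode($ascii);
$doubleEncoded = convert_like_utf8_encode($utf8);

// ASCII-only input is unchanged, so a test with ASCII text looks fine.
var_dump($asciiResult === $ascii);          // bool(true)

// Each byte of the é is converted separately, giving four bytes instead of two.
echo bin2hex($utf8), "\n";                  // 636166c3a9
echo bin2hex($doubleEncoded), "\n";         // 636166c383c2a9
```

Displayed as UTF-8, the second string reads "cafÃ©", which is the typical symptom of applying the conversion to text that never was ISO-8859-1.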

Why is ISO-8859-1 presumed to be the default encoding when converting to UTF-8, and why do these functions therefore occupy such a prominent spot in the namespace? There's simply no good reason for it.

The same functionality is available through iconv and mb_convert_encoding. Therefore I suggest slowly deprecating utf8_encode and utf8_decode to clear up a recurring confusion and to consolidate features into the existing, much more versatile iconv and mb_ functions.

Proposal

  • Document the deprecation of utf8_decode() and utf8_encode() now.

Use of the utf8_*() functions is deprecated in favor of generic encoding conversion functions. Use the mbstring, iconv, or intl extension to convert character encodings.

Backward Incompatible Changes

Programs that use utf8_decode() and utf8_encode() to convert between ISO-8859-1 and UTF-8 should switch to another available encoding conversion function, e.g. mb_convert_encoding().
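A minimal migration sketch, assuming the mbstring and iconv extensions are available. The wrapper names latin1_to_utf8() and utf8_to_latin1() are hypothetical and not part of this RFC; only mb_convert_encoding() and iconv() are existing PHP functions. Handling of invalid input or of characters that ISO-8859-1 cannot represent may differ from the utf8_*() functions, so such cases should be checked separately.

```php
<?php
// Hypothetical wrapper replacing utf8_encode(): ISO-8859-1 in, UTF-8 out.
function latin1_to_utf8(string $latin1): string
{
    return mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
}

// Hypothetical wrapper replacing utf8_decode(): UTF-8 in, ISO-8859-1 out.
function utf8_to_latin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

$latin1 = "caf\xE9";                    // "café" in ISO-8859-1 (é = E9)
$utf8   = latin1_to_utf8($latin1);      // é becomes the two bytes C3 A9
$back   = utf8_to_latin1($utf8);        // round trip back to ISO-8859-1

// The same conversion with iconv; the argument order is (from, to, string).
$viaIconv = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo bin2hex($utf8), "\n";              // 636166c3a9
var_dump($back === $latin1);            // bool(true)
var_dump($viaIconv === $utf8);          // bool(true)
```

Naming both the source and the target encoding at the call site makes the conversion explicit, which is what the utf8_*() names hide.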

Proposed PHP Version(s)

  • None

RFC Impact

To SAPIs

None

To Existing Extensions

XML module

To Opcache

None

New Constants

None

php.ini Defaults

No change.

Open Issues

None

Unaffected PHP Functionality

Other XML module features except utf8_decode() and utf8_encode() are unaffected.

Future Scope

* Remove or alias functions in the future.

Proposed Voting Choices

State whether this project requires a 2/3

Remove utf8_decode() and utf8_encode() function
Real name Yes No
Final result: 0 0
This poll has been closed.

Vote starts: 2016/09/10

Vote ends: 2016/09/20 23:59:59 UTC

Patches and Tests

* N/A

Implementation

  1. the version(s) it was merged to
  2. a link to the git commit(s)
  3. a link to the PHP manual entry for the feature

References

Rejected Features

Keep this updated with features that were discussed on the mail lists.

rfc/remove_utf_8_decode_encode.txt · Last modified: 2017/09/22 13:28 (external edit)