PHP: ideas:usercomments

Revamping user comments at php.net

Possible mentors: Philip Olson, Sherif Ramadan

This task involves rethinking how we handle user comments at php.net. Currently comments exist only within the PHP manual, but the work could extend to other areas such as pecl.php.net.

The first task involves defining what a user note is (or should be). Is an example that happens to use strlen() a fine user note for the strlen() documentation? Are examples even appropriate user notes? Is the current moderation system working? Many other questions will arise.

Progress

There has been some discussion.

a few user notes concerns (October 22, 2008)
user comments of the future (March 27, 2009)
explanation of the current flow (April 06, 2010)
user notes proposal (April 08, 2010)
user notes: code snippets (April 10, 2010)
concerns via reddit (May 01, 2011)
user notes: moderation (May 02, 2011)
About note maintenance; when to delete notes (Feb 24, 2012)
Several others no doubt

Status

Open

Baby steps: step A, moderation flags

This involves a simple moderation flag option for user notes. Users can flag notes as “bad”, which prompts a moderator to look into it. Users do not need to log in, but anyone who is not logged in must answer a reCAPTCHA or similar.

How it will probably work

User clicks on the “flag” button near an individual note
A popup opens
  1. Checks if the user is logged in
  2. If not logged in, prompts for a reCAPTCHA or similar human check
The window then offers several options, including:
  1. A button to submit the flag with no details (quick flag)
  2. Optionally, fields for details such as name, email, a textarea, and/or a select for the flag reason (e.g., Spam)
Flagged notes generate an email to a list and are also listed within our web admin panel (email only is fine for now)
  1. People can subscribe to receive these emails? (future)
Moderator takes action, which includes the current options (delete/reject/modify) and perhaps a new unflag option
If a note flag is accepted (e.g., the note is deleted), should we notify the user via email? Or notify the list via email?
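
The flow above could be sketched roughly as follows. This is only an illustration in Python, not the php.net implementation: the `FlagRequest` fields, the `handle_flag` function, and the list address are all hypothetical names made up for the example, and the captcha check is reduced to a boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlagRequest:
    """One click on the "flag" button (all field names are hypothetical)."""
    note_id: int
    logged_in: bool
    captcha_passed: bool = False   # only consulted for anonymous users
    reason: Optional[str] = None   # e.g. "spam"; None means a quick flag
    details: str = ""


def handle_flag(req: FlagRequest) -> dict:
    """Return the email destined for the moderation list, or raise if the flag is refused."""
    # Anonymous users must pass the human check before a flag is accepted.
    if not req.logged_in and not req.captcha_passed:
        raise PermissionError("anonymous flags require a passed captcha")
    subject = f"note {req.note_id} flagged: {req.reason or 'no reason given'}"
    # The address is a placeholder, not a real php.net list.
    return {"to": "moderation-list@example.org", "subject": subject, "body": req.details}
```

A quick flag from a logged-in user is simply `handle_flag(FlagRequest(note_id=107588, logged_in=True))`; whether the moderator's later decision is mailed back to the flagging user is left open here, as it is in the list above.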

Notes about the current system

The current system was written in the 1990s, and its files are spread across several places throughout the PHP project. Also, various note moderation statistics are listed here.

Files

Note archive (note actions are emailed here):
http://news.php.net/php.notes

Revamping Tactics

The current user comments system has a number of fundamental problems that limit both its use and its usefulness. This section describes many of those problems and the tactical approach to weeding them out and setting the new system on a more manageable course.

Here is an incomplete list of some of the issues I'll be attempting to tackle first.

The question of what data is stored, and how that data can later be used, needs to be addressed head-on. The current schema in the master database allows for a number of useful options that would advance this system to the next step (such as votes and ratings). However, the schema makes these particular pieces of information difficult to work with. For example, in its existing form it cannot easily store individual votes or ratings on a per-user basis. That would require a second table of individual rows that can be indexed and joined to the note table to collect such information in a more fine-grained manner.
The ability for users to flag certain comments is key to the moderation step. There are far fewer moderators with access to the moderation system than there are users. However, any information collected from the user needs to be useful to an automated system as well as to a human. Cutting back on tedious human moderation is vital in fighting spam and other abusive behavior. This means the user should not be able to simply flag a comment without offering an actual reason. A flag with no stated reason leaves a human to work out why the comment should be moderated or removed. If we can narrow the information down so that a human can make a quick and informed decision, we can expedite the moderation process.
Presenting user-submitted comments in a more meaningful and useful manner can also play an important role in allowing, encouraging, and enabling the community to decide collectively what is considered more or less valuable. Currently the system only presents comments in descending order by date, which keeps the most recent comments at the top. While this is a reasonable default, it is not the best one. Improvements may include using multiple criteria for ordering and presenting comments, such as rating, votes, and dates, with optional sort direction (descending/ascending).
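
To make the "second table" idea from the first item concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`note`, `note_vote`, `voter`, and so on) are assumptions for illustration only; they are not the actual schema of the master database.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE note (
        id   INTEGER PRIMARY KEY,
        sect TEXT NOT NULL,      -- manual page the note belongs to
        body TEXT NOT NULL
    );
    -- One row per individual vote, so votes can be indexed and joined to notes.
    CREATE TABLE note_vote (
        id      INTEGER PRIMARY KEY,
        note_id INTEGER NOT NULL REFERENCES note(id),
        voter   TEXT NOT NULL,   -- user id or IP address (assumption)
        vote    INTEGER NOT NULL CHECK (vote IN (-1, 1)),
        ts      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE INDEX note_vote_note_idx ON note_vote (note_id);
""")

conn.execute("INSERT INTO note (id, sect, body) VALUES (1, 'function.strlen', 'example note')")
conn.executemany(
    "INSERT INTO note_vote (note_id, voter, vote) VALUES (?, ?, ?)",
    [(1, "alice", 1), (1, "bob", 1), (1, "carol", -1)],
)


def vote_totals(note_id):
    """Return (up, down) counts for one note by joining it to its individual votes."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(v.vote = 1), 0), COALESCE(SUM(v.vote = -1), 0)
        FROM note n LEFT JOIN note_vote v ON v.note_id = n.id
        WHERE n.id = ?
        """,
        (note_id,),
    ).fetchone()
    return row[0], row[1]
```

Keeping one row per vote is what makes per-user questions answerable later (who voted, when, and how), which a pair of counter columns on the note row cannot do.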

New User Notes Features

Since starting work on improving usernotes, I've found good ways to implement some of the proposed features and improve on the existing system without introducing backwards-compatibility breaks. None of the new features has gone into production on php.net yet, but they will hopefully be introduced once the implementation details have been worked out and more thoroughly tested. To get an idea of the new features, you can try the sandboxed demo at http://php.sheriframadan.com/

Here are the features I've managed to implement so far...

Flag Notes
- This feature allows users to flag a note for a specific reason. Possible reasons include spam, abusive or profane language, bad or destructive code, notes posted in the wrong section, notes that should be bug reports, duplicate notes, authors who would like to have their notes removed, or some other unlisted reason. The feature helps moderators with notes karma spot notes that they've missed and that other users have identified as needing moderation for one or more reasons. It requires a bit of integration with the existing user-notes.php backend so that svn accounts can search flagged notes and review them for any necessary action.
Voting
- One of the things voiced by other users, and listed in the Progress section of this wiki page, is the ability to up-vote or down-vote a note based on its popularity among the community. This feature has been integrated in usernotes and allows the votes to propagate through the existing notes system to the mirrors.
- This feature also has a slight drawback, though one that is still outweighed by the shortcomings of the existing usernotes: it does not fully prevent spam or misuse. The existing implementation relies on a central database served by master to identify each vote request and limit the number of votes per note, per day, per IP address. This prevents anyone at the same IP address from placing more than one vote for the same note on any given day. It does not prevent abuse through a coordinated botnet or DoS attack, although rate limiting requests should not be too difficult. This limitation is a consequence of how usernotes is built today, and it affects everyone.
- The next step will be to implement an authentication mechanism whereby the existing mirrors can still use some form of authentication without each having to retain or access a central database, and without requiring users to register via php.net itself. One suggestion has been to use OAuth 2.0.
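
The per-note, per-day, per-IP limit described above can be illustrated with a small in-memory sketch. The `VoteLimiter` class is a hypothetical name; the real implementation keeps this state in the central database on master rather than in a Python set.

```python
from datetime import date


class VoteLimiter:
    """Allow at most one vote per (note, IP address, calendar day)."""

    def __init__(self):
        self._seen = set()  # (note_id, ip, day) triples already counted

    def try_vote(self, note_id: int, ip: str, day: date) -> bool:
        """Record the vote and return True, or return False if it is a repeat."""
        key = (note_id, ip, day)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

As the text says, this only stops repeat votes from a single address; a botnet spread over many addresses passes the check every time, which is why some form of authentication is the next step.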
Sorting
- The existing user notes were always sorted by date of submission in descending order. This made the usefulness of older notes degrade slowly over time as more and more notes for the same page were added and notes of a lesser quality were not removed.
- The new feature allows the notes to be sorted by a rating built from +/- 1 votes and grouped by date in descending order. The rating is the number of +1 votes (up votes) divided by the total number of votes (up votes plus down votes), expressed as a percentage from 0 to 100%.
- Notes that have no votes at all are ranked above notes with ratings of 30% or lower. Notes with ratings greater than 30% are ranked above notes with no votes at all.
- There is a slight exception to the rule for notes that have fewer than 4 votes. This way notes cannot drop suddenly to the bottom of the page without a fair chance (trying to account for error in small sample sizes).
- Another exception to the rule is notes submitted on today's date. They're always left at the top for that day to give them a fair chance of exposure as well and hopefully collect a good rating sample.
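
Here is one possible reading of the sorting rules above, sketched in Python. The text does not spell out exactly where notes with fewer than 4 votes land, so the sketch assumes that such a note with a low rating is placed alongside unvoted notes; that placement, the tie-breaking order within a tier, and all names (`rating`, `sort_key`, `sort_notes`) are assumptions, not the deployed algorithm.

```python
from datetime import date

MIN_VOTES = 4        # below this, a low rating is not trusted (from the text)
LOW_RATING = 30.0    # percent; at or below this a note ranks under unvoted notes


def rating(up: int, down: int):
    """Percentage of up votes among all votes, or None when there are no votes."""
    total = up + down
    return None if total == 0 else 100.0 * up / total


def sort_key(note: dict, today: date):
    r = rating(note["up"], note["down"])
    total = note["up"] + note["down"]
    if note["date"] == today:
        tier = 0    # today's notes stay on top
    elif r is not None and r > LOW_RATING:
        tier = 1    # rated above 30%
    elif r is None or total < MIN_VOTES:
        tier = 2    # unvoted, or too few votes to be pushed down (assumed reading)
    else:
        tier = 3    # rated 30% or lower with enough votes
    # Within a tier: higher rating first, then newer first (assumed tie-breaking).
    return (tier, -(r or 0.0), -note["date"].toordinal())


def sort_notes(notes, today: date):
    """Return the notes in display order for the given day."""
    return sorted(notes, key=lambda n: sort_key(n, today))
```

Each note here is a plain dict with `up`, `down`, and `date` keys. For example, a note with 3 up votes and 1 down vote has `rating(3, 1)` of 75.0 and sorts above any unvoted note that was not submitted today.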
Minor Improvements
- Some additional minor improvements have been introduced to allow usernotes to bring a more positive experience to the user.
- For example, visiting a direct link to one of the notes, such as http://php.net/strstr#107588, causes the anchor to direct the UA to the part of the page that contains this note ID. With some added JavaScript the page will now highlight the note in an effort to focus the user's attention on the specific note they were looking for (or possibly were being directed to).
- User notes now also provide a timestamp in relative date format (e.g. “1 month ago”, “2 years ago”, “4 days ago”, etc.) instead of a formatted date timestamp (e.g. “2012-09-05”).
- Usernotes now also uses JavaScript to make voting on and flagging user notes easier. The user does not have to leave the page to do so as long as they have JavaScript enabled. The functionality also degrades gracefully if JavaScript is not enabled on the client side. The next step will be to make adding notes possible without leaving the page.
- Also, you can now see how many notes have been added to each page at the top of the notes div (giving you some indication of the number of notes per section without scrolling or counting the notes manually). Might be useful to some people.
- More improved UX is also in the works :)
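
As an illustration of the relative date format mentioned above, here is a rough sketch in Python. The `relative_time` function and its unit sizes (a month taken as 30 days, a year as 365) are simplifying assumptions; the demo's actual formatting code may differ.

```python
from datetime import datetime


def relative_time(then: datetime, now: datetime) -> str:
    """Format the gap between two moments as e.g. "4 days ago"."""
    seconds = int((now - then).total_seconds())
    # Largest unit first; month and year lengths are approximations.
    units = [
        ("year", 365 * 86400),
        ("month", 30 * 86400),
        ("day", 86400),
        ("hour", 3600),
        ("minute", 60),
    ]
    for name, size in units:
        count = seconds // size
        if count >= 1:
            return f"{count} {name}{'s' if count != 1 else ''} ago"
    return "just now"
```

With these approximations, a note dated 2012-09-05 viewed on 2012-09-09 is shown as "4 days ago".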

Some of the current challenges stem from the decentralized manner in which the usernotes system is implemented: it is distributed to the mirrors without requiring each mirror to retain its own database and without a centralized database that integrates with all of the mirrors. Since the entire notes system is primarily file-based and remotely synchronized, there's no easy solution to authentication. The best proposed solution so far is OAuth 2.0, which can work, but might still require some centralized database in order to prevent potential misuse such as the confused deputy problem.