As a follow-up to my comments at the Yale Law School Information Society Project Symposium on Reputation Economies in Cyberspace, I wanted to talk a bit about digital reputations. By this I mean some way of attaching numeric data and supporting information to a person, an object, a collection of them, or the relationships among them, to indicate degrees of quality or preference.
Here are some examples:
- The number of stars given to a product sold at Amazon.com, together with the individual ratings and the comments.
- The percentage of sales of a product sold at Amazon.com that are not returned by the buyer.
- The number of “yes” answers to the question “Was this review valuable to you?” associated with a review of a product sold on Amazon.com.
- An electronic version of your college transcript, together with your grade point average.
- Your scientific reputation based on the number of papers you have published, the reputation of your co-authors, and the journals in which you have published them.
- Your job performance evaluation, on some numeric scale.
- The number of friends you have on Facebook or connections you have on LinkedIn.
- The number of hits you have on a website.
- Your Google PageRank.
A question that was posed at the symposium was whether reputations are portable. As many speakers mentioned, reputations must be taken in context. Transcripts from two universities that differ significantly in quality are not comparable unless you normalize for both the quality difference and any differences in grading schemes. Note that the “quality of a university” is itself a digital reputation.
Your reputation on one social networking site is not instantly portable to another because the uses of the sites may vary and the people participating may differ. Yesterday I read about specific social networks for wine lovers. Your reputation there might not, and probably should not, translate to a reputation on LinkedIn, which is more business-oriented.
Digital reputations are computed or inferred from basic data and metadata. The number of formulae for computing reputations is unbounded: if you come up with a weighted expression, I can always fiddle with the weights to get something else. The trick is to come up with a scheme that works well for its intended use. For example, if you use some reputation system to judge potential hires, you can check its scores against their workplace evaluations a year after they join. If necessary, you can refine or even replace the formula.
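To make the weight-fiddling point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the signal names, the candidate data, the two weight sets, and the later evaluations are all hypothetical, and a weighted average is only one of the unbounded number of formulae mentioned above.

```python
def weighted_reputation(signals, weights):
    """Weighted average of signals that are already normalized to [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * signals[name] for name in weights) / total_weight


def rank(candidates, weights):
    """Candidate names ordered from highest to lowest reputation."""
    return sorted(
        candidates,
        key=lambda name: weighted_reputation(candidates[name], weights),
        reverse=True,
    )


def mean_abs_error(candidates, weights, observed):
    """Average gap between the formula's score and a later observed outcome."""
    gaps = [
        abs(weighted_reputation(candidates[name], weights) - observed[name])
        for name in observed
    ]
    return sum(gaps) / len(gaps)


# Hypothetical data: each signal is scaled to [0, 1] before it is combined.
candidates = {
    "alice": {"transcript": 0.9, "publications": 0.4, "references": 0.7},
    "bob": {"transcript": 0.6, "publications": 0.9, "references": 0.6},
}

# Two hypothetical weightings of the very same data.
transcript_heavy = {"transcript": 3, "publications": 1, "references": 1}
publication_heavy = {"transcript": 1, "publications": 3, "references": 1}

# Hypothetical workplace evaluations a year after hiring, scaled to [0, 1].
later_evaluations = {"alice": 0.8, "bob": 0.6}

for label, weights in [("transcript-heavy", transcript_heavy),
                       ("publication-heavy", publication_heavy)]:
    print(label, rank(candidates, weights),
          round(mean_abs_error(candidates, weights, later_evaluations), 3))
```

With this made-up data the two weightings rank the same two people in opposite orders, which is the "fiddle with the weights" problem in miniature. Comparing each formula's scores against the later evaluations is one simple way to decide which weighting to keep, refine, or replace.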
It’s easy to come up with reputation schemes that are circular in nature. If I compute my reputation as the average of the reputations of my friends, and they do the same for themselves, we have a circularity: nothing outside the loop pins down anyone's actual value. In a similar way, I presume, Google excludes self-links from its page rankings.
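A small sketch shows why the circular scheme is a problem. The friend graph, the names, and the "external" signal are all hypothetical. The anchoring step, which blends in a signal that does not depend on friends, is one common way to break such a loop; it is an assumption of this sketch, not something proposed above and not a description of how Google computes anything.

```python
def iterate(friends, external, damping, start, rounds=200):
    """Repeatedly recompute every reputation from the previous round's values.

    damping = 1.0 is the purely circular scheme: my reputation is just the
    average of my friends' reputations. damping < 1.0 blends in an external
    signal that does not depend on anyone's friends.
    """
    rep = dict(start)
    for _ in range(rounds):
        rep = {
            person: (1 - damping) * external[person]
            + damping * sum(rep[f] for f in friends[person]) / len(friends[person])
            for person in friends
        }
    return rep


# Hypothetical network: three people who are all friends with each other.
friends = {
    "ann": ["bea", "cal"],
    "bea": ["ann", "cal"],
    "cal": ["ann", "bea"],
}

# Hypothetical independent signal, for example a peer-reviewed score in [0, 1].
external = {"ann": 0.9, "bea": 0.5, "cal": 0.1}

low_start = {person: 0.2 for person in friends}
high_start = {person: 0.9 for person in friends}

# Purely circular: whatever we start with is what we get back.
print(iterate(friends, external, damping=1.0, start=low_start))
print(iterate(friends, external, damping=1.0, start=high_start))

# Anchored: both starting points settle on the same values.
print(iterate(friends, external, damping=0.5, start=low_start))
print(iterate(friends, external, damping=0.5, start=high_start))
```

In the purely circular case every uniform assignment satisfies the definition, so the formula tells you nothing you did not put in. Once part of each score comes from outside the loop, the starting guess stops mattering.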
No one number coming out of a reputation formula will ever say everything about a person or object. Temper the quantitative approach with qualitative ones to ensure that what you are measuring matches what you are otherwise observing.
Since potentially anything about a person or object could be part of a reputation computation, we need some sort of common representation, some standards for this data. In fact, we probably have multiple ways of doing this right now. Depending on whom you ask, you can easily find several standards for representing a person, especially if you call that person a “patient,” a “guest,” a “passenger,” or a “customer.”
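As a rough illustration of what a common representation buys you, the sketch below maps two differently shaped records onto one shared type. The record layouts, field names, and source labels are invented for the example and do not correspond to any real standard for patients or customers.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """A hypothetical shared shape that reputation code could rely on."""
    full_name: str
    source: str  # which system the record came from
    attributes: dict = field(default_factory=dict)


def from_patient(record):
    """Adapter for a made-up hospital record layout."""
    return Person(
        full_name=f"{record['given']} {record['family']}",
        source="hospital",
        attributes={"patient_id": record["mrn"]},
    )


def from_customer(record):
    """Adapter for a made-up online shop record layout."""
    return Person(
        full_name=record["name"],
        source="shop",
        attributes={"customer_id": record["cust_no"]},
    )


patient = from_patient({"given": "Dana", "family": "Lee", "mrn": "P-001"})
customer = from_customer({"name": "Dana Lee", "cust_no": 4417})

# Both records now expose the same fields, whatever they were called upstream.
print(patient.full_name == customer.full_name)
```

Matching on a name alone is far too weak for real use. The point is only that code downstream of the adapters can work with one shape instead of one shape per industry.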
I know some work has been done to unify this data through inferencing engines, and I suspect that the Semantic Web may find broader use since so much of the data coming out of e-commerce and social networks is highly tagged. Thus “creeping semanticism,” as my colleague David Fallside called it, may combine nicely with more brute force methods.
What is likely to be more important than specific standards for the data will be standards for how the privacy of that information is handled. When it comes to privacy, there can be no vagueness or ambiguity. We’re seeing some market experiments right now with the privacy of information in social networking sites. This is waking consumers up to the possible problems of allowing unfettered data about themselves to become publicly known.