SIGIR09: Telling Experts from Spammers: Expertise Ranking in Folksonomies
From our friends in Southampton (correction: and Hasso-Plattner), a study of how to differentiate experts (who really know how to tag stuff) from spammers (who want to tag their own stuff, but try to acquire credibility by copying tags others have used). They exploit the fact that the people who tag first are obviously not copying. They compared their classifier to some obvious baselines, such as assigning expertise to those with the most tags. Evaluating their classifier was tricky because there isn’t a ground-truth data set. So they used a simulation, inserting a variety of different simulated experts and spammers into the tag stream of Delicious, and checking how their classifier deals with them. Their classifier won.
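To make the intuition concrete, here is a toy sketch of my own, not the authors' classifier. It contrasts the "most tags wins" baseline with a score that credits a user once for every person who later tags the same document. The event data, the user names, and the function names `count_baseline` and `early_tagger_score` are all made up for illustration, and the scoring rule is only an assumed stand-in for "tagging first means you are not copying".

```python
from collections import defaultdict

# Hypothetical tag stream for a single tag: (timestamp, user, document).
events = [
    (1, "alice", "doc1"),
    (2, "bob", "doc1"),
    (3, "spammer", "doc1"),
    (1, "alice", "doc2"),
    (4, "spammer", "doc2"),
    (5, "spammer", "doc3"),
    (6, "spammer", "doc4"),
]

def count_baseline(events):
    """Baseline: expertise equals the number of times a user applied the tag."""
    scores = defaultdict(int)
    for _, user, _ in events:
        scores[user] += 1
    return dict(scores)

def early_tagger_score(events):
    """Assumed time-aware score: one point per later tagger of the same document."""
    by_doc = defaultdict(list)
    for ts, user, doc in events:
        by_doc[doc].append((ts, user))

    scores = defaultdict(int)
    for taggers in by_doc.values():
        taggers.sort()  # earliest tagger first
        n = len(taggers)
        for rank, (_, user) in enumerate(taggers):
            # Everyone ranked after this user tagged the document later.
            scores[user] += n - 1 - rank
    return dict(scores)

print(count_baseline(events))
print(early_tagger_score(events))
```

On this made-up stream the count baseline ranks the spammer first (four tags), while the time-aware score gives the spammer nothing, because nobody ever followed them onto a document.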
Of course, you can only draw limited confidence from this kind of simulation. Their simulated users fit their model of the world (spammers tag late), so of course a tool designed around that model will do well on their simulated users. I wonder, would it have been that hard to just do manual labeling of expertise on some real Delicious users? This would obviously give more trustworthy results than simulations. Indeed, they found by manual examination that the top 50 users of the tag “mortgage” were spammers. However, they say that the problem was finding a good ground truth for experts. But that suggests it would still be possible to evaluate how well the classifier separates spammers from non-spammers, even if you can’t evaluate how well it identifies experts.
Hello David, I’m happy to hear you liked our talk! Thanks for having us there and the good feedback we got.
Your summary of our ranking approach is very accurate, particularly in pointing out that the tricky part is the evaluation, as is often the case. At SIGIR we’ve been discussing with some very nice folks from Yahoo how we could collaborate on this part. After all, we’re avid Delicious users and would love to see this work out in practice for the benefit of the Delicious community. Our motivation actually came from the lack of good features in the current state of Delicious for discovering new and interesting information.
Best wishes,
Michael
PS: Just for the record, only Albert Au Yeung is from Uni Southampton in the UK; for me it’s the Hasso-Plattner-Institute in Germany.