<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Haystack Blog &#187; Collective Intelligence</title>
	<atom:link href="http://groups.csail.mit.edu/haystack/blog/category/collective-intelligence/feed/" rel="self" type="application/rss+xml" />
	<link>http://groups.csail.mit.edu/haystack/blog</link>
	<description>MIT CSAIL Research</description>
	<lastBuildDate>Tue, 24 Nov 2009 04:05:39 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Introducing &#8220;Eyebrowse&#8221; &#8211; Track and share your web browsing in real time</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2009/08/28/introducing-eyebrowse-track-and-share-your-web-browsing-in-real-time/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2009/08/28/introducing-eyebrowse-track-and-share-your-web-browsing-in-real-time/#comments</comments>
		<pubDate>Fri, 28 Aug 2009 06:55:05 +0000</pubDate>
		<dc:creator>Max Van Kleek</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Social Computing]]></category>
		<category><![CDATA[Web Architectures]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=450</guid>
		<description><![CDATA[We&#8217;ve launched a service for letting people share, in real time, what pages they&#8217;re looking at on the web.  Our system, eyebrowse, lets you choose exactly which sites you want to share your viewing patterns for, and eyebrowse does the rest &#8212; producing statistical visualisations of your web browsing habits over time, compared to [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve launched a service for letting people share, in real time, what pages they&#8217;re looking at on the web.  Our system, eyebrowse, lets you choose exactly which sites you want to share your viewing patterns for, and eyebrowse does the rest &#8212; producing statistical visualisations of your web browsing habits over time, compared to your friends and the world.  It is available here:</p>
<p><strong><a href="http://eyebrowse.csail.mit.edu">http://eyebrowse.csail.mit.edu</a></strong></p>
<p>It currently requires Firefox/Iceweasel and works on all major platforms.  All data that is collected is <strong>public</strong> and available to <strong>anyone</strong> who wants it (we do not hoard or claim to own any of your data. We like Twitter&#8217;s model.)  We will soon provide a nice interface with daily tarballs of the database in RDF, XML and CSV.</p>
<p><strong>Why would you want to share your web trails?</strong></p>
<p>1. For Science!  It&#8217;s not fair that certain Search Engine Companies can do web trail research because they have access to massive repositories of data.  There should be public corpora for IR researchers around the world.  And these should be OPEN.</p>
<p>2. For your friends!  You look at lots of cool stuff on the web every day.  You might not think of explicitly sharing every single thing you read.  Eyebrowse is lightweight enough that you just have to tell it once per site you want to share.  I&#8217;ve already discovered tons of weird things that my friends are looking at that they would not have bothered to share explicitly.</p>
<p>3. To understand your own browsing habits.  How many times do you read ACM/IEEE every day? I bet you don&#8217;t know. Now you can get quantitative statistics and visualise long-term journal revisitation patterns &#8211; and other things.</p>
<p><strong>Will it violate my privacy?</strong></p>
<p>1. We give you control.  You have to tell eyebrowse explicitly what you want to share on a site-by-site (host) basis. You can take things off the whitelist at any time.  You can also go back and delete things that it has logged in the past, all through our web interface.  It also respects Private Browsing (aka pornmode) and will not log any data while that mode is active.</p>
<p>2. It fosters contemplation/awareness: We are also trying to raise awareness of what OTHERS (e.g. Google Analytics) are collecting about you as you surf the web, by showing you what you can learn about yourself by selectively publishing your own data feeds.</p>
<p>By letting people selectively publish web trails in an open, non-invasive way, we are hoping to foster a discussion of how we can use our web browsing behavior to build more adaptive and effective interfaces that <strong>respect people&#8217;s privacy</strong>.</p>
<p>Feedback is appreciated.  Please email us directly at eyebrowse@csail.mit.edu.</p>
<p>Oh and eyebrowse is free and open source software, licensed under the MIT License.  The source is available as part of the list-it codebase here: <a href="http://code.google.com/p/list-it">http://code.google.com/p/list-it</a></p>
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2009/08/28/introducing-eyebrowse-track-and-share-your-web-browsing-in-real-time/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Talk: Community-based ontology development alignment and evaluation</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2009/07/27/talk-community-based-ontology-development-alignment-and-evaluation/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2009/07/27/talk-community-based-ontology-development-alignment-and-evaluation/#comments</comments>
		<pubDate>Mon, 27 Jul 2009 19:23:10 +0000</pubDate>
		<dc:creator>David Karger</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[CSAIL]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=410</guid>
		<description><![CDATA[Natasha Noy gave a talk at CSAIL with the above title.  She works with a large medical bioinformatics group at Stanford.  The bioinformatics community in general couldn&#8217;t care less about cool computer science but is one of the few groups that have heavily adopted formal ontologies as a way to get their work done.  [...]]]></description>
			<content:encoded><![CDATA[<p>Natasha Noy gave a talk at CSAIL with the above title.  She works with a <a href="http://protege.stanford.edu/">large medical bioinformatics group</a> at Stanford.  The bioinformatics community in general couldn&#8217;t care less about cool computer science but is one of the few groups that have heavily adopted formal ontologies as a way to get their work done.  They have tons of data partitioned over many silos.   Biologists have adopted ontologies to provide canonical representations of scientific knowledge, or to annotate data to let others make use of it.  Often, it is not the authors who do it, but curators or automatic tools.</p>
<p>There are now hundreds of ontologies with tens of thousands of terms.  However, it has always been a &#8220;cottage industry&#8221;&#8212;various groups develop their own ontologies, then publish them for use by others.  Is there a way to open the development of the ontology up to the community?  The community might be just a few people or thousands.</p>
<p>As an example, the gene ontology (28K terms) has 3 full-time curators.  People from the community submit requests to an issue tracker to get new terms etc.  A new version is released daily.  In contrast, the NCI thesaurus (for cancer) has 20 full-time editors with 1 lead editor who runs everything, and a slow cycle of &#8220;releases&#8221; with less community input.  Others work like typical open source projects with 20-30 team members involved in active discussions.</p>
<p>Natasha&#8217;s group builds on Protege, a very old open source ontology editor that is now one of the most popular, with 120,000 registered users.  It has a very open plugin architecture with dozens of plugins for visualization, import, export, NLP, and lots of unknowns.  They&#8217;ve been working to augment Protege with support for collaboration.  It works in a distributed fashion (desktop and web clients).  It supports simultaneous editing, but also annotation, discussion, proposals and voting in the context of the ontology.  There are many types of annotations&#8212;questions, comments, proposals&#8212;on any elements of the ontology&#8212;classes, properties, instances.   While the tool handles most types of structured data, it is focused on taxonomic hierarchies where stuff gets inherited down the hierarchy.</p>
<p>They investigated use of their tool for several tasks.  One is ontology evaluation&#8212;finding existing ontologies that might be useful for you.  One source of information for this is author-contributed metadata about the ontologies&#8212;domain, key classes and concepts, who the developer is, etc. Another is automatic tools that compute quality metrics, and another is annotations by other users of the ontologies.</p>
<p>This last is important because some ontology metrics are subjective&#8212;a feature that is &#8220;good&#8221; in one setting can be awful in another.  An example might be a high level of axiomatization.  This is important for inference, but creates clutter if you just want description.  There&#8217;s also the problem of crosscutting taxonomies&#8212;you might have two different ways of describing the same domain that form a &#8220;matrix&#8221; of non-overlapping hierarchies.  To address this sort of subjectivity, they allow users to record evaluations of ontologies.</p>
<p>These tools can be explored at their <a title="Bioportal web site" href="http://bioportal.bioontology.org/">bioportal web site</a> where they have a large library of biomedical ontologies.  On that site, users can describe their ontology-based projects, and list/review the ontologies they are using.  Reviewers give general reviews, usage information, problems encountered, coverage of the key terms, major gaps, and issues with specific elements of the ontology.  This site aims to make ontology evaluation/creation a truly democratic process.  This is controversial&#8212;some argue that ontologies need a more rigorous editorial process (mirroring a current debate about open vs. traditional journal publication).</p>
<p>Another big task is mapping: connecting two ontologies by asserting that terms in two different ontologies &#8220;match&#8221;.  They aren&#8217;t trying to find mappings, but want to enable others to upload the mappings they have found.  Mappings can be created manually or uploaded in bulk (if computed by someone&#8217;s tool).  Mappings are themselves metadata, which can be annotated and discussed just like other data in the ontology.</p>
<p>Of course a big question is whether people will use these tools.  Right now, many users are asking for these features and reporting lots of bugs&#8212;good signs of demand.</p>
<p>A lot of questions have now arisen that need some serious user studies&#8212;what are the dynamics of the social networks that form around collaborative ontologies?  What are the different types of users/editors?  What produces the most discussion/controversy?  Do these tools help or hinder collaboration?</p>
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2009/07/27/talk-community-based-ontology-development-alignment-and-evaluation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SIGIR09: Telling Experts from Spammers: Expertise Ranking in Folksonomies</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2009/07/22/sigir09-telling-experts-from-spammers-expertise-ranking-in-folksonomies/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2009/07/22/sigir09-telling-experts-from-spammers-expertise-ranking-in-folksonomies/#comments</comments>
		<pubDate>Wed, 22 Jul 2009 20:58:30 +0000</pubDate>
		<dc:creator>David Karger</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Publication]]></category>
		<category><![CDATA[SIGIR]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Social Computing]]></category>
		<category><![CDATA[CSAIL]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=386</guid>
		<description><![CDATA[From our friends in Southampton (correction: and Hasso-Plattner), a study of how to differentiate experts (who really know how to tag stuff) from spammers (who want to tag their own stuff, but try to acquire credibility by copying tags others have used).   They try to exploit the difference that the people who tag first are [...]]]></description>
			<content:encoded><![CDATA[<p>From our friends in Southampton (correction: and Hasso-Plattner), a study of how to differentiate experts (who really know how to tag stuff) from spammers (who want to tag their own stuff, but try to acquire credibility by copying tags others have used).   They try to exploit the difference that the people who tag first are obviously not copying.  They compared their classifier to some obvious baselines, such as assigning expertise to those with the most tags.  Evaluating their classifier was tricky because there isn&#8217;t a ground-truth data set.   So they used a simulation, inserting a variety of different simulated experts and spammers into the tag stream of delicious, and checking how their classifier deals with them. Their classifier won.</p>
<p>Of course, you can only draw limited confidence from this kind of simulation.  Their simulated users fit their model of the world (spammers labeled late) so of course a tool designed to their model will do well on their simulated users.  I wonder, would it have been that hard to just do manual labeling of expertise on some real delicious users?  This would obviously give more trustworthy results than simulations.   Indeed, they found by manual examination that the top 50 users of the tag &#8220;mortgage&#8221; were spammers.  However, they say that the problem was finding a good ground truth for experts.   But that suggests it would still be possible to evaluate differentiation of spammers from non-spammers, even if you can&#8217;t evaluate differentiation of experts.</p>
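<p>To make the &#8220;people who tag first are not copying&#8221; idea concrete, here is a minimal sketch. It is not the paper&#8217;s algorithm: the scoring rule, the function names and the sample events are all assumptions made up for illustration. Each user gets credit for a (document, tag) pair equal to the number of users who applied the same tag to the same document later, and this is compared with the &#8220;most tags&#8221; baseline.</p>

```python
from collections import defaultdict


def early_tagger_scores(events):
    """Hypothetical scoring rule: credit users whose tags others later repeat.

    events: iterable of (timestamp, user, item, tag) tuples.
    """
    by_pair = defaultdict(list)
    for timestamp, user, item, tag in events:
        by_pair[(item, tag)].append((timestamp, user))

    scores = defaultdict(int)
    for taggers in by_pair.values():
        taggers.sort()  # earliest tagger first
        total = len(taggers)
        for position, (_, user) in enumerate(taggers):
            # Credit = number of users who applied this tag afterwards.
            scores[user] += total - position - 1
    return dict(scores)


def tag_count_scores(events):
    """Baseline from the post: expertise = number of tags applied."""
    counts = defaultdict(int)
    for _, user, _, _ in events:
        counts[user] += 1
    return dict(counts)


# Made-up tag stream: "spam" copies a popular tag, then tags its own pages.
events = [
    (1, "alice", "doc1", "python"),
    (2, "bob", "doc1", "python"),
    (3, "spam", "doc1", "python"),
    (4, "spam", "doc2", "python"),
    (5, "spam", "doc3", "python"),
]
print(early_tagger_scores(events))
print(tag_count_scores(events))
```

<p>On this toy data the baseline ranks the heavy tagger highest, while the early-tagger score ranks them last. As noted above, that only shows the rule does well on data built to fit its own assumptions.</p>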
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2009/07/22/sigir09-telling-experts-from-spammers-expertise-ranking-in-folksonomies/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SIGIR09: The Wisdom of the Few: A Collaborative Filtering Approach Based on Expert Opinions from the Web</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2009/07/22/sigir09-the-wisdom-of-the-few-a-collaborative-filtering-approach-based-on-expert-opinions-from-the-web/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2009/07/22/sigir09-the-wisdom-of-the-few-a-collaborative-filtering-approach-based-on-expert-opinions-from-the-web/#comments</comments>
		<pubDate>Wed, 22 Jul 2009 18:45:21 +0000</pubDate>
		<dc:creator>David Karger</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Publication]]></category>
		<category><![CDATA[SIGIR]]></category>
		<category><![CDATA[CSAIL]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=383</guid>
		<description><![CDATA[Xavier  Amatriain of Telefonica Research presented work on collaborative filtering.  Usually you do collaborative filtering by finding the other users &#8220;similar&#8221; to your subject and combining their recommendations.  This paper argued/demonstrated that sometimes you are better off figuring out who the experts are and only paying attention to their opinions.  You might just create [...]]]></description>
			<content:encoded><![CDATA[<p>Xavier  Amatriain of Telefonica Research presented work on collaborative filtering.  Usually you do collaborative filtering by finding the other users &#8220;similar&#8221; to your subject and combining their recommendations.  This paper argued/demonstrated that sometimes you are better off figuring out who the experts are and only paying attention to their opinions.  You might just create non-personalized recommendations from them, or you might personalize by finding the best _experts_ to recommend for a user.  They experimented by exploring movie recommendation using the Netflix challenge mass ratings versus using the (expert) critics&#8217; recommendations on Rotten Tomatoes.  They found expert recommendations often worked better.</p>
<p>I asked about past work on, e.g., semi-supervised learning, which suggests various approaches to combining small amounts of high-quality data (experts) with large amounts of messier data (mass user ratings).  It suggests, for example, some sort of weighted combination of expert and mass user opinion.  They know this could help a lot, but don&#8217;t have a general approach to separating the experts from everyone else in a large mass of recommendations (Rotten Tomatoes did it for them).</p>
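<p>As a rough illustration of the weighted combination mentioned above (not something from the paper), here is a minimal sketch. The function name, the fixed weight and the sample ratings are assumptions; a real system would presumably learn the weight from data.</p>

```python
def blended_prediction(expert_ratings, crowd_ratings, expert_weight=0.7):
    """Blend the mean expert rating with the mean crowd rating for one item.

    expert_weight is a hypothetical tuning knob between 0 and 1.
    Returns None when nobody has rated the item.
    """
    if not expert_ratings and not crowd_ratings:
        return None
    if not expert_ratings:
        return sum(crowd_ratings) / len(crowd_ratings)
    if not crowd_ratings:
        return sum(expert_ratings) / len(expert_ratings)
    expert_mean = sum(expert_ratings) / len(expert_ratings)
    crowd_mean = sum(crowd_ratings) / len(crowd_ratings)
    return expert_weight * expert_mean + (1 - expert_weight) * crowd_mean


# Made-up ratings for a single movie on a 1-5 scale.
critics = [4.0, 5.0]
users = [2.0, 3.0, 4.0]
print(blended_prediction(critics, users, expert_weight=0.5))
```

<p>With equal weights the prediction sits halfway between the critics&#8217; mean and the users&#8217; mean. Moving the weight toward 1 approaches the experts-only setting the paper argues for.</p>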
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2009/07/22/sigir09-the-wisdom-of-the-few-a-collaborative-filtering-approach-based-on-expert-opinions-from-the-web/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interacting with Temporal Data @CHI09</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2009/04/17/interacting-with-temporal-data-chi09/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2009/04/17/interacting-with-temporal-data-chi09/#comments</comments>
		<pubDate>Fri, 17 Apr 2009 07:01:34 +0000</pubDate>
		<dc:creator>Max Van Kleek</dc:creator>
				<category><![CDATA[CHI]]></category>
		<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[chi]]></category>
		<category><![CDATA[temporal data]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=310</guid>
		<description><![CDATA[This year Wendy Mackay, Aurélien Tabard and I held a workshop for examining interaction challenges surrounding time, in particular time as a component of temporal data sets.  Our interest in this topic was brought about by the observation that low-cost storage, cheap sensing technologies, the Web and high speed networking have started to bring us [...]]]></description>
			<content:encoded><![CDATA[<p>This year Wendy Mackay, Aurélien Tabard and I held a workshop for examining interaction challenges surrounding time, in particular time as a component of temporal data sets.  Our interest in this topic was brought about by the observation that low-cost storage, cheap sensing technologies, the Web and high speed networking have started to bring us vast quantities of rich temporal data &#8212; whether it is in &#8220;traditional&#8221; forms (such as audio or video), or &#8220;new&#8221; forms such as rich activity logs of people, places and things.  The availability of these volumes of new data presents new opportunities but also poses interaction challenges that we wished to start to identify and address.  From our CfP:</p>
<blockquote><p><em>Is time just another attribute of data? Or is it something more? Time brings meaning to data, especially data about the real world. Time is also essential for understanding human activity and an essential element of design processes. Sometimes we address time explicitly, sometimes implicitly. It structures how people interact with computers, but is also a measurable effect of that interaction. The goal of this workshop is to explore human-computer interaction from a temporal perspective.</em></p></blockquote>
<p>We were pleased that our workshop drew 35 participants with a variety of interests and backgrounds &#8212; from architects and interaction designers to data mining analysts, doctors, ubicomp researchers, and of course HCI researchers.  As can be seen in the <a title="Interacting with Temporal Data Workshop proceedings" href="http://temporal.csail.mit.edu/exhibit">workshop proceedings</a>, our participants were interested in a number of different types of temporal data:</p>
<ul>
<li>Media: audio + video capture, manipulation, editing, sharing</li>
<li>Personal health</li>
<li>Personal information management</li>
<li>Life logging (Personal activity data recording + Reflection)</li>
<li>Air traffic control</li>
<li>Financial data analysis</li>
<li>Sensor networks</li>
<li>Environmental impact monitoring</li>
<li>Product research</li>
<li>Software engineering</li>
</ul>
<p>Despite the diversity, several common themes emerged.</p>
<p>The first was empowerment: the idea that accurate, low-cost-to-capture rich records of people&#8217;s everyday activities could thoroughly change the way we live.  Participants highlighted several creative examples of how such records of our lives could help us &#8212; in personal, social and work contexts.  For example, getting accurate records of one&#8217;s daily routines (such as exercise and diet) could let people identify ways to live healthier [a la Thomas Goertz's Decision Tree].  Or, to enable the hacking of social dynamics: for example, to analyze in situ or post-hoc repeated patterns of conflict in interactions with particular individuals so as to be able to better understand sources of stress related to collaboration.  Or, simply helping the user more easily retrieve and manage their personal information in an activity-centric manner that complements human episodic memory.</p>
<p>The essential challenge was the question of how to give individuals (just-plain-folks, end-users) access to this rich data about themselves in a way that they could easily analyze, understand, manage and use.  One participant commented that such information was &#8220;turning citizens into intelligence analysts &#8212; about their own lives&#8221;.  Intelligence analysts, of course, have extensive training in how to look at data; end-users don&#8217;t.</p>
<p>Another was the question of accountability, access, protection and privacy: we have never previously had access to accurate records about any aspects of our lives.  Once we have these records, what sort of implications will this have on our interactions with others? (e.g., ineffable records of where people were, how long they were there, what they did)  How will this affect the process of scientific discovery and the way scientists work with one another?  How will we control or grant others access to these records in a way that provides individuals privacy?  If individuals are employees/members of organizations, who &#8220;owns&#8221; the data about an individual&#8217;s activities at work, and what rights does the individual have to access their own activity records? Finally, after an individual departs (passes away or leaves an organization), how should such data be handled or retired? Who has rights to a deceased individual&#8217;s life log?</p>
<p>Other themes and topics of discussion included: the need for interfaces to help reconcile subjective/emotional memories of the past with &#8220;cold, hard lifelogs&#8221;; implicit versus explicit representations of time (e.g., different ways of portraying dynamic processes); and explanation facilities for time-dynamic pattern recognition.</p>
<p>Based on the strong interest from our workshop participants, we have decided to start a discussion group / online watering hole for us to further discuss some of the issues surrounding interaction with temporal data.  We welcome anyone interested (not only workshop participants) to join and post their thoughts, questions, projects and ideas:</p>
<ul>
<li><a title="Google Group on Temporal Data" href="http://group.google.com/group/temporal-data">Temporal-Data @ groups.google.com</a> &#8211; Temporal Data Google Group</li>
</ul>
<p>With this Google group we wish to continue our consolidated, cross-application domain discussion of interaction issues, in the hope of taming the complexities of our data-rich environments.</p>
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2009/04/17/interacting-with-temporal-data-chi09/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Case for a Collaborative Query Management System</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2009/01/06/a-case-for-a-collaborative-query-management-system/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2009/01/06/a-case-for-a-collaborative-query-management-system/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 20:14:49 +0000</pubDate>
		<dc:creator>David Karger</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Publication]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=231</guid>
		<description><![CDATA[This is a CIDR presentation by Nodira Khoussainova of University of Washington arguing for a collaborative repository of complex SQL database queries.  Sounds like they want co-scripter for SQL.
There&#8217;s a problem of hunting through all the queries to find the one you want.  They want effective search and browsing, and also assistance in composing new [...]]]></description>
			<content:encoded><![CDATA[<p>This is a CIDR presentation by Nodira Khoussainova of University of Washington arguing for a collaborative repository of complex SQL database queries.  Sounds like they want co-scripter for SQL.</p>
<p>There&#8217;s a problem of hunting through all the queries to find the one you want.  They want effective search and browsing, and also assistance in composing new queries.    There are challenges:</p>
<ul>
<li>queries are not just strings, but complex objects with inputs, outputs, and semantics.  2 similar queries can have very different outputs, and 2 different queries can return the same results</li>
<li>typical search problem: need to avoid giving too many matches</li>
<li>efficient algorithms (this is a database conference after all)</li>
</ul>
<p>An application they have in mind is scientific data management.  There&#8217;s tons of data and lots of (shared) data analysis with complex queries that are frequently evolving.</p>
<p>Consider the scenario of a novice user trying to create a query, given a large repository of past queries by others. They&#8217;ll try to find a perfect match but will probably need to take something close and then modify it.  There must be a metaquery language for describing the kind of query you want.  Since that query was probably built over time, there may be many versions that evolved, and it can be useful to see all the different versions and find the best ones for their use.   It will be useful to explain to the user how these versions are related, e.g. this refines that.  One needs to watch out for the metaquery being more complicated to construct than the query they want to find.  One approach is &#8220;partial query&#8221;&#8212;for the user to build as much of the query as they can, then look for other queries that are similar.</p>
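<p>The talk did not say how &#8220;similar&#8221; would be computed, so the following is only a sketch under a strong simplifying assumption: queries are compared as bags of word tokens using Jaccard similarity. The function names and sample queries are made up. It treats queries as strings, so it ignores the inputs, outputs and semantics that the first challenge above says really matter.</p>

```python
import re


def tokens(sql):
    """Lower-cased set of identifier-like tokens in a SQL string."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0


def rank_similar(partial_query, repository):
    """Return the stored queries sorted from most to least similar."""
    partial = tokens(partial_query)
    return sorted(
        repository,
        key=lambda stored: jaccard(partial, tokens(stored)),
        reverse=True,
    )


# Hypothetical repository of past queries.
repository = [
    "SELECT id FROM users",
    "SELECT name FROM sensors",
    "SELECT name, temp FROM sensors WHERE temp = 30",
]
for query in rank_similar("SELECT name FROM sensors WHERE", repository):
    print(query)
```

<p>A fuller version would presumably compare parsed query structure or result sets rather than raw tokens.</p>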
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2009/01/06/a-case-for-a-collaborative-query-management-system/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Courserank: a socially-networked course selection system for Stanford</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2009/01/05/courserank-a-socially-networked-course-selection-system-for-stanford/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2009/01/05/courserank-a-socially-networked-course-selection-system-for-stanford/#comments</comments>
		<pubDate>Mon, 05 Jan 2009 19:08:21 +0000</pubDate>
		<dc:creator>David Karger</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Publication]]></category>
		<category><![CDATA[Social Computing]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=218</guid>
		<description><![CDATA[Georgia Koutrika et al. from Stanford built a system called &#8220;Courserank&#8221;&#8212;a place to evaluate courses.   It has been used by 98% of freshmen at Stanford.  In addition to just picking courses, it offers a number of interesting social features.  Its planner page lets students enter their entire plan for all years in school, classes and [...]]]></description>
			<content:encoded><![CDATA[<p>Georgia Koutrika et al. from Stanford built a system called &#8220;Courserank&#8221;&#8212;a place to evaluate courses.   It has been used by 98% of freshmen at Stanford.  In addition to just picking courses, it offers a number of interesting social features.  Its planner page lets students enter their entire plan for all years in school: classes and when they plan to take them.   Then after the term they enter star ratings and grades.   There&#8217;s also a tab showing which students plan to take which courses in which term, so you can take courses with students you know.   A Requirements feature tells you what you still need to take.    There&#8217;s the opportunity to enter reviews and have discussions about various classes.   Students actually voluntarily enter their grades so you can find out grade distributions in the class.</p>
<p>Courserank is a social site.  But some things set it apart.  It&#8217;s not open, but has a well defined closed community&#8212;only those with a Stanford ID can participate.   It&#8217;s not flat&#8212;it has well defined distinct constituencies (undergrads, grad students, faculty, the school).  It has special purpose tools like the course planner that are highly domain specific.   It makes a hybrid of user data and &#8220;official&#8221; data.   These make it a &#8220;special purpose social site&#8221;.</p>
<p>They wanted grade distributions for courses, but didn&#8217;t have access to official records.  So they relied on students entering their own grades.  They compared the results to some official data: they follow each other <em>very well</em>. People are honest, unlike on big social sites.  Perhaps because this is a closed community?  Or because the tool is actually helpful to them so they are motivated to &#8220;give back&#8221;?</p>
<p>They then switched to a different speaker who spent time talking about recommendation systems. They want to support general recommendation.  What courses should I take based on my background?  What major should I pick based on my performance?  What is the best semester to take AI?  Current recommendation systems are not flexible or extensible enough to allow these kinds of questions.  They&#8217;re trying to create a formal model of recommender questions, a kind of extension of database queries with a &#8220;recommend&#8221; operator.</p>
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2009/01/05/courserank-a-socially-networked-course-selection-system-for-stanford/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google launches SearchWiki</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2008/11/25/google-launches-searchwiki/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2008/11/25/google-launches-searchwiki/#comments</comments>
		<pubDate>Tue, 25 Nov 2008 16:17:19 +0000</pubDate>
		<dc:creator>Sacha Zyto</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Social Computing]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=150</guid>
		<description><![CDATA[Did you notice the little icons near your Google search results this morning (at least if you were logged in to your Google account when you did the search)? They are SearchWiki icons. If you like a result, you can promote it, so that it will always show up in first position when you [...]]]></description>
			<content:encoded><![CDATA[<p>Did you notice the little icons near your Google search results this morning (at least if you were logged in to your Google account when you did the search)? They are SearchWiki icons. If you like a result, you can promote it, so that it will always show up in first position when you do that search again.</p>
<p>In the same vein, you can ban search results so that they never appear again, add URLs that you think would be relevant matches for that query (potential for social indexing here&#8230;), and add comments, public or private.</p>
<p>This reminds me a lot of Jaime&#8217;s Re:Search project (see http://people.csail.mit.edu/teevan/work/publications/ ).</p>
<p>About the ability to add comments, I wonder if there will be pressure to remove that feature, since it enables anyone to post anything about any website similarly to ThirdVoice, which was discontinued a few years ago: http://www.wired.com/techbiz/media/news/2001/04/42803.</p>
<p>However, I tried to add a sample public comment on my CSAIL webpage, but it didn&#8217;t show up in Google search once I had logged out (or when logged in from another account). I don&#8217;t know if there&#8217;s a delay before other people can see public comments, or if I&#8217;m missing something about how public Google comments work: (blog) comments are welcome!</p>
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2008/11/25/google-launches-searchwiki/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Friendsourcing</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2008/11/24/friendsourcing/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2008/11/24/friendsourcing/#comments</comments>
		<pubDate>Mon, 24 Nov 2008 21:27:09 +0000</pubDate>
		<dc:creator>Katrina Panovich</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[Social Computing]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=146</guid>
		<description><![CDATA[In a recent bout of interest in ‘personal information management,’ I’ve been thinking about and talking to people about the way we stay organized and on top of things.  Some people like GTD, others use google gcal/gmail/etc, others use post-its, and a whole host of people don’t record this information at all.  Instead, they do [...]]]></description>
			<content:encoded><![CDATA[<p>In a recent bout of interest in ‘personal information management,’ I’ve been thinking about and talking to people about the way we stay organized and on top of things.  Some people like GTD, others use google gcal/gmail/etc, others use post-its, and a whole host of people don’t <em>record</em> this information at all.  Instead, they do what I am calling <strong>friendsourcing</strong>.  (Until a better name comes along, at least.)</p>
<p>We’re familiar with outsourcing &#8211; sending our tasks or whatever to totally outside people.  We’re familiar, too, with crowdsourcing &#8211; asking the lazyweb/lazytwitter/lazytumblr/lazy___ world to answer questions for ourselves.  In either case, these methods are generally going to people who we trust marginally, perhaps only because that’s their job (outsourcing) or because we hope that correct answers will bubble to the top (crowdsourcing).</p>
<p>I’m finding an increasing number of people who do this friendsourcing thing &#8211; that is, they delegate this organization and other data to remember to trusted friends.  One unnamed advisor, for instance, freely admits that he doesn’t record things or rely on himself to remember information &#8211; he tells his grad students, and among those (presumably) trusted people, someone will remember and remind him.  A friend of mine has also said that he doesn’t write down activities and relies on reminders, especially from people (in this case, me) who he has determined generally <em>do</em> know what’s going on.</p>
<p>I think the thing that makes this really different is (1) that these are generally scheduling or organizational items and (2) that these people have found that relying on others is more effective, for them, than trying to remember information on their own.  It would be interesting to explore these ideas further.  For now, though, this is just a semi-structured thought I had a little over two months ago <a href="http://katrina.tumblr.com/post/49852332/friendsourcing">here</a> (note: this is my personal &#8211; entirely not professional &#8211; blog).</p>
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2008/11/24/friendsourcing/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Tag relatedness in social bookmarking systems</title>
		<link>http://groups.csail.mit.edu/haystack/blog/2008/11/24/tag-relatedness-in-social-bookmarking-systems/</link>
		<comments>http://groups.csail.mit.edu/haystack/blog/2008/11/24/tag-relatedness-in-social-bookmarking-systems/#comments</comments>
		<pubDate>Mon, 24 Nov 2008 16:52:48 +0000</pubDate>
		<dc:creator>David Karger</dc:creator>
				<category><![CDATA[Collective Intelligence]]></category>
		<category><![CDATA[ISWC]]></category>
		<category><![CDATA[Publication]]></category>
		<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Social Computing]]></category>

		<guid isPermaLink="false">http://groups.csail.mit.edu/haystack/blog/?p=61</guid>
		<description><![CDATA[This post about ISWC got lost in the drafts pile, so I&#8217;m publishing it a bit late.
The folks from Tagora spoke about the relationships between different tags that get applied to the same document, and the tags that different people apply to the same documents.  They looked at the ternary relation (user, resource, tag) and [...]]]></description>
			<content:encoded><![CDATA[<p>This post about ISWC got lost in the drafts pile, so I&#8217;m publishing it a bit late.</p>
<p>The folks from Tagora spoke about the relationships between different tags that get applied to the same document, and the tags that different people apply to the same documents.  They looked at the ternary relation (user, resource, tag) and tried to infer measures of similarity.  What tags are &#8220;related&#8221; to others, or to other pages?  What measures capture useful notions of relatedness?   One simple example is co-occurrence: how often two tags were assigned to the same resource.  But there are more sophisticated measures that are better.  Someone else tried using a PageRank-style algorithm where you start with high weight on one tag and see what other tags receive high weight from it (PageRank in an undirected graph is kind of overkill; mathematically it reduces to just looking at the degree of nodes).  They look at more context: two tags are similar if they appear with the same <em>sets</em> of tags.  They also look at grounding tags in WordNet to find semantic matches, measuring relatedness by distance in WordNet, for example.</p>
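<p>To make the first two measures concrete, here is a minimal Python sketch of co-occurrence counting and of a context-based similarity. It is my own illustration, not the speakers&#8217; implementation: the sample triples, the function names, and the choice of cosine similarity over co-occurrence vectors are all assumptions.</p>

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical (user, resource, tag) assignments, invented for illustration.
ASSIGNMENTS = [
    ("alice", "page1", "python"),
    ("alice", "page1", "code"),
    ("bob", "page1", "programming"),
    ("bob", "page2", "programming"),
    ("bob", "page2", "code"),
    ("carol", "page3", "python"),
    ("carol", "page3", "snake"),
]


def cooccurrence(assignments):
    """Count, for each unordered tag pair, the resources carrying both tags."""
    tags_by_resource = defaultdict(set)
    for _user, resource, tag in assignments:
        tags_by_resource[resource].add(tag)
    counts = Counter()
    for tags in tags_by_resource.values():
        # Sorting gives every pair a canonical (a, b) order.
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return counts


def context_vector(tag, counts):
    """Map every other tag to its co-occurrence count with `tag`."""
    vec = {}
    for (a, b), n in counts.items():
        if a == tag:
            vec[b] = n
        elif b == tag:
            vec[a] = n
    return vec


def context_similarity(t1, t2, counts):
    """Cosine similarity between two tags' co-occurrence vectors (an assumed measure)."""
    v1 = context_vector(t1, counts)
    v2 = context_vector(t2, counts)
    dot = sum(n * v2.get(t, 0) for t, n in v1.items())
    norm = sqrt(sum(n * n for n in v1.values())) * sqrt(sum(n * n for n in v2.values()))
    return dot / norm if norm else 0.0


counts = cooccurrence(ASSIGNMENTS)
print(counts[("code", "programming")])
print(context_similarity("code", "snake", counts))
```

<p>In this toy data, &#8220;code&#8221; and &#8220;snake&#8221; are never assigned to the same page, so their co-occurrence count is zero, yet the context measure still relates them because both appear alongside &#8220;python&#8221;.</p>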
]]></content:encoded>
			<wfw:commentRss>http://groups.csail.mit.edu/haystack/blog/2008/11/24/tag-relatedness-in-social-bookmarking-systems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
