Building a Social Data Commons

Inspired by Ted’s vision of what he’d like to see happen to data.gov, I decided to set out my own hopes for it. Ted’s desires for data.gov are all ones that I agree would make the data more accessible. I would now like to discuss what else I might want in a world where [...]

Spreadsheets vs. Relational Databases: Bridging the Gap

For non-programmers, spreadsheets are usually the option of choice when it comes to keeping track of non-trivial amounts of structured data. This is seen in all kinds of settings ranging from the business world to public administration and academic research. Spreadsheets, however, can only capture one kind of data structure: separate tabular [...]

In Defense of a Semantic Web Wild West

A month ago Stefano Mazzocchi published an interesting article on data reconciliation (detecting when two identifiers refer to the same item, and merging them) where he advocated a more centralized “a priori” approach (trying to keep the identifiers merged at the beginning).  I posted a response arguing the value of a more anarchic “a posteriori” [...]

Talk: Community-based ontology development alignment and evaluation

Natasha Noy gave a talk at CSAIL with the above title.  She works with a large medical bioinformatics group at Stanford.  The bioinformatics community in general couldn’t care less about cool computer science but is one of the few groups that have heavily adopted formal ontologies as a way to get their work done.  [...]

Interacting with Temporal Data @CHI09

This year Wendy Mackay, Aurélien Tabard and I held a workshop for examining interaction challenges surrounding time, in particular time as a component of temporal data sets.  Our interest in this topic was brought about by the observation that low-cost storage, cheap sensing technologies, the Web and high speed networking have started to bring us [...]

Making the Case for Raw Data

Tim Berners-Lee’s recent TED talk on Linked Data has inspired quite a few people to ask what exactly linked data is, how it differs from data on the semantic web, and how realistic it is to assume universal and unique addressability of data items. A world with linked data would be a world with richer, [...]

What’s Wrong with SQL?

A lot of things, Mike Stonebraker might say, but I have something rather fundamental in mind.
Suppose I’m developing some sort of academic course management system. Chances are I’ll want to display to the user a list of course offerings and their associated course codes, readings from the syllabus, meeting times etc. Maybe something like this:

Now [...]

Building a content management system just by drawing the web forms

This is a nice talk by Kian Win Ong of UCSD called “Do It Yourself custom forms-driven workflow applications.”   They’re looking at all the work people invest in building special-purpose content management systems that really offer users little more than “CRUD” (create, read, update, delete) interactions for certain specialized kinds of content.
The basic approach is [...]

A Case for a Collaborative Query Management System

This is a CIDR presentation by Nodira Khoussainova of University of Washington arguing for a collaborative repository of complex SQL database queries.  Sounds like they want CoScripter for SQL.
There’s a problem of hunting through all the queries to find the one you want.  They want effective search and browsing, and also assistance in composing new [...]

The Role of Schema Matching in Large Enterprises

A CIDR presentation by Ken Smith from Mitre on the use of the “match” operation that pairs properties of two different schemas.  It’s used to merge data from two different sources.  He’s arguing that there are tons of uses of schema matching that precede the actual merging of data.
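
To make the “match” operation concrete, here is a minimal sketch in Python. It is not the technique from the talk: it assumes the crudest possible matcher, one that pairs properties purely by how similar their names look. The function names, the example property names and the 0.6 threshold are all made up for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a property name and drop punctuation such as underscores."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_schemas(left, right, threshold=0.6):
    """Pair each property in `left` with its most similar property in `right`.

    Similarity is plain string similarity of the normalized names, an
    assumption made for this sketch. Pairs scoring below `threshold`
    (a hypothetical cutoff) are left unmatched.
    """
    pairs = []
    for l_name in left:
        best, best_score = None, 0.0
        for r_name in right:
            score = SequenceMatcher(
                None, normalize(l_name), normalize(r_name)
            ).ratio()
            if score > best_score:
                best, best_score = r_name, score
        if best is not None and best_score >= threshold:
            pairs.append((l_name, best, round(best_score, 2)))
    return pairs


if __name__ == "__main__":
    # Two hypothetical schemas describing the same kind of record.
    schema_a = ["first_name", "zip"]
    schema_b = ["FirstName", "postal_code"]
    for pair in match_schemas(schema_a, schema_b):
        print(pair)
```

Note that “zip” and “postal_code” go unmatched here, since their names share almost nothing. A name-only matcher misses exactly the pairs where a person (or the data itself) would have to weigh in.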

When you are trying to decide whether [...]