Let's back up for a moment and look at our problem from one level
higher. What we would like to accomplish, in general terms, is a system
that helps users remember and organize information. We do not want
to force the user to think in the constrained grammar rules
we set for our system. Rather, we would like our system to act within
the rules and models a human would apply to finding information.
The tool we would like to provide to users is one that is flexible and
robust enough to handle the user's internal methods of recall. This
means that it must support a diverse set of queries in which users can
express the same hints that occur in their own minds. We call this tool
hybrid-search.
This thesis was proposed within the context of the Haystack
project [23], as a means to extend the functionality of this
adaptive personal information repository system. Details of
Haystack's core implementation, which has made this thesis possible,
are discussed in detail in later chapters.
Figure: The Hybrid-Search Query Model
The Haystack architecture provides the means to rapidly integrate a
variety of information systems into a user's functional Haystack.
From the query perspective, it is possible through Haystack to build a
``multiplexer'' that selects the information systems that directly match
the user's needs, together with a method of combining their results. From the
indexing side, we can build tools to centrally maintain a data
repository. This functionality will allow for the hybrid-search
mechanism we are aiming for. The Hybrid-Search Query Model figure reflects the
goal of breaking up a query into different components, each processed
by the information system best suited to it.
This revised model splits the information system into three parts:
$I_1 + F_1$ (the IR system), $I_2 + F_2$ (the database), and
$I_3 + F_3$ (the hypertext system). Additionally, the model
introduces two new functions:
- A multiplexing function,
$M(q_f) \rightarrow q_{f1}, q_{f2}, q_{f3}$, which routes different
parts of the original query to the information system(s) best suited
to deal with them.
- A combination function,
$S(r_{21}, r_{22}, r_{23}) \rightarrow r_2$, which merges the
results of the different filter functions into one result set.
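To make the roles of the multiplexing function M and the combination function S more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names `QueryPart`, `multiplex`, and `combine`, the subsystem labels, and in particular the choice of set intersection as the merge policy, which the model above does not prescribe. It is not Haystack's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class QueryPart:
    """One component of a user query (hypothetical type for this sketch)."""
    kind: str  # assumed labels: "ir", "db", or "hypertext"
    text: str


def multiplex(query: List[QueryPart]) -> Dict[str, List[QueryPart]]:
    """Plays the role of M: route each query part to the subsystem named by its kind."""
    routed: Dict[str, List[QueryPart]] = {"ir": [], "db": [], "hypertext": []}
    for part in query:
        routed[part.kind].append(part)
    return routed


def combine(result_sets: List[Set[str]]) -> Set[str]:
    """Plays the role of S: merge per-subsystem results into one result set.

    Intersection is an assumed policy; a real system might rank or weight instead.
    """
    if not result_sets:
        return set()
    merged = set(result_sets[0])
    for results in result_sets[1:]:
        merged &= results
    return merged
```

In this sketch a subsystem that receives no query parts would simply be left out of the list passed to `combine`, so that it does not constrain the final result.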
If it is not yet entirely clear what type of query a user could issue
to such a system, some examples of hybrid-search queries (which
independent information systems would not be able to answer) may
help:
- Which documents do I have about four legged animals that I
wrote last year? The first part of the query, ``about four
legged animals,'' is an unstructured query (we are assuming that we
didn't pre-categorize the documents in the collection). The second
part, which limits the set of documents to a certain time period, is
a structured (database) query.
- Which email did I send to David in regard to his request
for clarification on Lore? This is very similar to the first
query. The first part limits us to emails that I sent to David, again a
structured query. The second part, which may require some semantic
understanding on the part of the information system, is answerable
by an unstructured (information retrieval) system.
- Which web pages did I visit after looking at the MIT
homepage last week? The first part of the query, ``web pages
...after ...MIT homepage,'' is an associative
search. The second part of the
query constrains the results to ``last week,'' and is a database
type query.
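As a rough illustration of the first example query, the following Python sketch answers ``documents about four legged animals that I wrote last year'' over a tiny in-memory collection. The documents, field names, and functions are invented for this example, and plain keyword matching stands in for a real information retrieval system, which would do considerably more.

```python
from datetime import date

# Hypothetical collection; contents and field names are invented for illustration.
DOCS = [
    {"id": "d1", "written": date(1997, 5, 2),
     "text": "Notes on dogs, cats and other four legged animals"},
    {"id": "d2", "written": date(1995, 1, 9),
     "text": "Four legged animals of the savanna"},
    {"id": "d3", "written": date(1997, 8, 30),
     "text": "Budget spreadsheet summary"},
]


def unstructured_filter(docs, terms):
    """Stand-in for the IR system: keep documents whose text contains every term."""
    return {d["id"] for d in docs
            if all(term in d["text"].lower() for term in terms)}


def structured_filter(docs, year):
    """Stand-in for the database: exact match on the year the document was written."""
    return {d["id"] for d in docs if d["written"].year == year}


def hybrid_query(docs, terms, year):
    """Answer both parts of the query and keep only documents satisfying both."""
    return unstructured_filter(docs, terms) & structured_filter(docs, year)


matches = hybrid_query(DOCS, ["four", "legged", "animals"], 1997)
print(sorted(matches))
```

Neither filter alone gives the intended answer here: the keyword filter also returns the 1995 document, and the date filter also returns the unrelated budget document.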
Copyright 1998, Eytan Adar (eytan@alum.mit.edu)