The Haystack architecture can be described in two distinct parts: a
Haystack Data Model (HDM) and a Haystack Service Model (HSM). The
data model is the means by which we represent the user's information
space, and the services (for the most part) append to or process the
data in some fashion. This chapter discusses the building blocks of
the Haystack data model. Details such as persistent storage and
object creation are touched upon only briefly here; they are
discussed fully in Appendix .
As described in Chapter , the abstract representation of
the Haystack data model (HDM) is a directed graph with first-class
edges and vertices. Vertices and edges are typed, and the typing
information conveys the semantics of the structure. For
example, we can have a bale.HaystackDocument
node attached to a
needle.Location.URL
node. The
needle.Location.URL has some content associated with it
(let's say http://web.mit.edu/). The implication then is that the URL
of this particular bale.HaystackDocument is
http://web.mit.edu/. Nodes in Haystack are called
Straws. All Straws are created with a unique
identifier associated with them (see HaystackID below). The
way to think about this model is in terms of associations. As
discussed previously, we are attempting to model the same type of
associations that correspond naturally to the document state model.
With the HDM it is then possible to represent both the metadata
associations between objects (e.g. the URL of a document, the author
of a thesis, the date the picture was deleted), as well as the
associations between documents (for example, all the documents that I
cited in this paper). Figure
illustrates some of these
relationships.
Figure: A sample Straw structure
The HDM sample structure illustrates the use of the three main Straw components: bales, ties, and needles. Recall that the Straw model described in the previous chapter depended on three types of ties connecting nodes. In the Java implementation of Straws, these tie types are created implicitly by the objects they connect. For example, to bind Straws into one document cluster (i.e. by intra-document ties), a bale.HaystackDocument object is created, and all Straws associated with that document cluster are attached to it. Inter-document ties are created simply by attaching two bale.HaystackDocuments to each other. To create term ties, a Needle containing text is attached to the Bale. To review: the Straw is the fundamental building block of the HDM, Bales are collections of Straws, Ties connect two Straws, and Needles have content.
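The relationships above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the Haystack implementation: the class StrawSketch, the methods attach and follow, the numeric id standing in for a HaystackID, and the tie type names tie.Location and tie.Reference are all hypothetical. Only the type names bale.HaystackDocument and needle.Location.URL are taken from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: these classes are NOT the ones in haystack.object.
class StrawSketch {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    /** A typed node with a unique identifier (a stand-in for HaystackID). */
    static class Straw {
        final long id = NEXT_ID.incrementAndGet();
        final String type;
        final List<Tie> outgoing = new ArrayList<>(); // ties that start at this Straw

        Straw(String type) { this.type = type; }

        /** Creates a typed, directed Tie from this Straw to the target. */
        Tie attach(String tieType, Straw target) {
            Tie tie = new Tie(tieType, this, target);
            outgoing.add(tie);
            return tie;
        }

        /** Returns the Straws reached over outgoing ties of the given type. */
        List<Straw> follow(String tieType) {
            List<Straw> found = new ArrayList<>();
            for (Tie tie : outgoing) {
                if (tie.type.equals(tieType)) found.add(tie.to);
            }
            return found;
        }
    }

    /** A Tie is itself a Straw that connects two Straws. */
    static class Tie extends Straw {
        final Straw from;
        final Straw to;
        Tie(String type, Straw from, Straw to) {
            super(type);
            this.from = from;
            this.to = to;
        }
    }

    /** A Needle is a Straw that has content. */
    static class Needle extends Straw {
        final String content;
        Needle(String type, String content) { super(type); this.content = content; }
    }

    /** A Bale is a Straw used to collect other Straws. */
    static class Bale extends Straw {
        Bale(String type) { super(type); }
    }

    public static void main(String[] args) {
        Bale doc = new Bale("bale.HaystackDocument");
        Needle url = new Needle("needle.Location.URL", "http://web.mit.edu/");
        doc.attach("tie.Location", url);    // hypothetical tie type name

        Bale cited = new Bale("bale.HaystackDocument");
        doc.attach("tie.Reference", cited); // hypothetical inter-document tie

        for (Straw s : doc.follow("tie.Location")) {
            System.out.println(s.type + " = " + ((Needle) s).content);
        }
    }
}
```

In this sketch the tie type is supplied by the caller. In Haystack, as noted above, the tie types are created implicitly by the objects being connected.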
As stated previously, ties (implemented as Ties) are also
first-class objects. Specifically, Ties are an extension of
Straws with a number of extra features (discussed in
section ). Because Ties are
first class, we can apply metadata and relations to links and not just
nodes. The tie.Filetype in Figure
was created
by a service, specifically FileTypeGuessService. To record
this fact explicitly, a tie.Creator is attached to the
tie.Filetype object, and the creator's name is in turn attached to that tie.Creator.
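The following sketch shows why first-class ties make this possible: because a Tie is itself a Straw, another Tie can start at it. It is a minimal illustration, not the Haystack code. The class TieMetadataSketch, the helper creatorOf, the node types needle.Filetype and needle.Text, and the content "Postscript" are assumptions made for the example. The names tie.Filetype, tie.Creator, and FileTypeGuessService come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; class and method names are hypothetical.
class TieMetadataSketch {
    static class Straw {
        final String type;
        final String content; // null for Straws that carry no content
        final List<Tie> outgoing = new ArrayList<>();
        Straw(String type, String content) {
            this.type = type;
            this.content = content;
        }
    }

    /** A Tie is a Straw, so other Ties may start at it. */
    static class Tie extends Straw {
        final Straw to;
        Tie(String type, Straw from, Straw to) {
            super(type, null);
            this.to = to;
            from.outgoing.add(this);
        }
    }

    /** Returns the content reached over the first tie.Creator, or null if none. */
    static String creatorOf(Straw straw) {
        for (Tie tie : straw.outgoing) {
            if (tie.type.equals("tie.Creator")) return tie.to.content;
        }
        return null;
    }

    public static void main(String[] args) {
        Straw doc = new Straw("bale.HaystackDocument", null);
        Straw fileType = new Straw("needle.Filetype", "Postscript"); // hypothetical
        Tie typeTie = new Tie("tie.Filetype", doc, fileType);

        // The metadata hangs off the tie itself, not off the document.
        Straw service = new Straw("needle.Text", "FileTypeGuessService"); // hypothetical type
        new Tie("tie.Creator", typeTie, service);

        System.out.println(creatorOf(typeTie));
    }
}
```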
It is important to note that all objects are derived from a primary interface. A Java interface contains the method signatures of all functions that must be defined by any class that implements that interface. The primary HDM interface in Haystack is known as haystack.object.Weavable. All Haystack data objects are defined within the Java package haystack.object. Other defined interfaces include TieWeavable, NeedleWeavable, and BaleWeavable. This chapter discusses each of these interfaces in turn, as well as their basic implementations. We also discuss the model by which new nodes are added to the HDM.
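To make the role of an interface concrete, the sketch below shows a class being forced to define the methods an interface declares. The interface names Weavable and NeedleWeavable are borrowed from the text, but every method signature here (getType, getContent), the assumption that NeedleWeavable extends Weavable, and the class UrlNeedle are hypothetical. The actual interfaces are the ones discussed in the rest of this chapter.

```java
// Illustrative sketch only. The interface names come from the text; every
// method signature below is hypothetical and is not the haystack.object API.
class WeavableSketch {
    interface Weavable {
        String getType();
    }

    // Assumed here to extend the primary interface.
    interface NeedleWeavable extends Weavable {
        String getContent();
    }

    /** An implementing class must define every method the interface declares. */
    static class UrlNeedle implements NeedleWeavable {
        private final String url;
        UrlNeedle(String url) { this.url = url; }
        @Override public String getType() { return "needle.Location.URL"; }
        @Override public String getContent() { return url; }
    }

    /** Code written against the interface works with any implementation. */
    static String describe(Weavable w) {
        String text = w.getType();
        if (w instanceof NeedleWeavable) {
            text += " -> " + ((NeedleWeavable) w).getContent();
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(describe(new UrlNeedle("http://web.mit.edu/")));
    }
}
```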