Title: In-Network Aggregation of Holistic Functions Speaker: Fabian Kuhn Place: 32-G575 Time: 4-5:15pm Date: Monday, April 14, 2008 Joint meeting with A&C colloquium Abstract: In today's computing environments, data is no longer stored at one central server. Often, data is distributed across many heterogeneous locations that are connected through a network. Nevertheless, there need to be efficient methods to query large data sets. Basic aggregation queries such as computing the maximum, the sum, or the average of a set of values can be answered quickly by a simple convergecast on a spanning tree of the network where the data is aggregated from the leaves towards the root of the tree. The time needed for this computation is proportional to the depth of the spanning tree and thus, if the spanning tree is chosen appropriately, proportional to the diameter D of the network. Aggregation functions that can be decomposed into functions over subsets of the set of all values and that can then be computed using a single convergecast are often called distributive functions. Functions that cannot be decomposed in such a way are known as non-distributive or holistic functions. Typical important examples of holistic functions are the median of a set of values or the mode (the most frequent element) of a set of elements. In the talk, I will show a simple randomized distributed algorithm that computes the median (and more generally the element with a given rank) in time O(D*log_D(n)). We will see that this is optimal, i.e., computing the median requires time Omega(D*log_D(n)). Further, I will present the basic ideas of a distributed algorithm that computes the mode in time O(D+F_2/m^2*log n). Here, m is the frequency of the mode and F_2 is a parameter (called the second frequency moment) of the frequency distribution. Generalizing a known communication complexity result, it is possible to prove a lower bound that is weaker but depends on the frequency distribution in a similar way. This is joint work with Thomas Locher and Roger Wattenhofer (median) and with Thomas Locher and Stefan Schmid (mode).