utilMDE
Class RandomSelector<T>

Object
  extended by RandomSelector<T>

public class RandomSelector<T>
extends Object

RandomSelector selects k elements uniformly at random from an iteration over n elements using only O(k) space instead of O(n) space. For example, selecting 1 element from a FileStream containing 1000 elements will take O(1) space. The class takes as input the number k during initialization and then can accept() any number of Objects in the future. At any point in time, getValues() will either return k randomly selected elements from the elements previous accepted or if accept() was called fewer than k times, will return all elements previously accepted.

The random selection is independent between every constructed instance of RandomSelector objects, but for the same instance, multiple calls to getValues() are not independent. Making two calls to consecutive getValues() without an accept() in between will return two new Lists containing the same elements.

A second mode allows for a fixed probability of randomly keeping each item as opposed to a fixed number of samples.

SPECFIELDS:
current_values : Set : The values chosen based on the Objects observed
number_observed : int : The number of Objects observed
number_to_take : int : The number of elements to choose ('k' above)
keep_probability: double : The percentage of elements to keep
selector_mode : {FIXED,PERCENT} : either fixed amount of samples or fixed percent.

Example use:
// randomly selects 100 lines of text from a file

  List selectedLines = null;
  try {
     BufferedReader br = new BufferedReader
       (new FileReader ("myfile.txt"));
     RandomSelector selector = new RandomSelector (100);
     while (br.ready()) {
       selector.accept (br.readLine());
     }
     selectedLines = selector.getValues();
   }
   catch (IOException e2) { e2.printStackTrace(); }
 


Constructor Summary
RandomSelector(double keep_probability, Random r)
           
RandomSelector(int num_elts)
           
RandomSelector(int num_elts, Random r)
           
 
Method Summary
 void accept(T next)
          When in fixed sample mode, increments the number of observed elements i by 1, then with probability k / i, the Object 'next' will be added to the currently selected values 'current_values' where k is equal to 'number_to_take'.
 List<T> getValues()
          Returns current_values, modifies none.
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RandomSelector

public RandomSelector(int num_elts)
Parameters:
num_elts - The number of elements intended to be selected from the input elements Sets 'number_to_take' = num_elts

RandomSelector

public RandomSelector(int num_elts,
                      Random r)
Parameters:
num_elts - The number of elements intended to be selected from the input elements.
r - The seed to give for random number generation. Sets 'number_to_take' = num_elts

RandomSelector

public RandomSelector(double keep_probability,
                      Random r)
Parameters:
keep_probability - The probability that each element is selected from the oncoming Iteration.
r - The seed to give for random number generation.
Method Detail

accept

public void accept(T next)

When in fixed sample mode, increments the number of observed elements i by 1, then with probability k / i, the Object 'next' will be added to the currently selected values 'current_values' where k is equal to 'number_to_take'. If the size of current_values exceeds number_to_take, then one of the existing elements in current_values will be removed at random.

When in probability mode, adds next to 'current_values' with probability equal to 'keep_probability'.


getValues

public List<T> getValues()
Returns current_values, modifies none.