-------------------------------------------------------------------- Descartes: A Dynamically Adaptive Compiler and Run-Time System using Continual Profile-Driven Program Multi-Specialization -------------------------------------------------------------------- M.I.T. Doctoral Dissertation Proposal Michael R. Blair [ziggy@ai.mit.edu] Abstract -------- Continually profiling and adapting multiple specializations of a program as they are run on actual inputs can provide greater performance improvement at lower cost than traditional mainstream compile-time code optimization techniques. I propose to demonstrate this claim by integrating a traditional optimizing compiler with a code profiler then augmenting the run-time system to use these profiling hooks for adaptive code optimization, using a cost-benefit investment strategy to govern the optimization efforts. Thus, focus for the optimizations is provided by profiling while throttling is provided by the cost-benefit investment model. Why do I expect this to out-perform traditional compile-time optimization techniques? Traditional techniques for improving program performance rely on information deducible at compile time. Such information is static. It is used to improve the performance of a procedure on its entire domain of possible inputs. Recently developed compiler techniques have yielded further improvements by specializing multiple instances of a program for specific distinct inputs based on textually apparent invocations of the program from within other programs. This too uses only static information. The current state of the art in performance enhancement suggests that gathering information from sample runs of a program (a.k.a. ``profiling'') can provide still more valuable information. Such strategies gather dynamic information about a program's performance on a suite of typical inputs, re-compiling the program in light of this previously unavailable dynamic information to further improve the expected average performance of the program by improving its performance on the chosen input suite. Note that the profiling is done once and thereafter the re-compiled code is frozen. Unfortunately, improving a program's _average_ performance on _typical_ inputs does not ensure _optimal_ performance on any _specific_ input. Worse, it is often difficult to ascertain what constitutes a typical input if the program has a large and characteristically diverse input domain. For instance: What constitutes a ``typical'' circuit for a circuit simulator? What constitutes a ``typical'' pedigree for a genetics analysis program? What constitutes a ``typical'' equation for a symbolic algebra program? Indeed, what constitutes a ``typical'' program for an interpreter or operating system or compiler? This thesis explores the conjecture that continually profiling and adapting multiple specializations of a program as they are run on actual inputs can provide greater performance improvement at lower cost than the above-mentioned mainstream primarily-static techniques. The interesting challenges to demonstrating this thesis lie in deciding what code to re-compile, how to do it, to what degree and at what cost. The remainder of this proposal addresses these challenges.