The approach presented in this paper resolves the conflict between the desire to improve the performance of presentation conversion routines and the desire to keep the code size of these routines under control. This conflict is well documented in the research literature.
Performance problems due to presentation conversion routines have been reported many times from different areas of computer science research ((Huitema & Doghrie, 1989), (Clark & Tennenhouse, 1990), (Thekkath & Levy, 1993)). Code size problems have been reported when stub compilers are used to develop practical systems, e.g. an X.500 directory service (Sample & Neufeld, 1993) or an operating system ((Jones, 1985), (Kessler, 1994)).
As a result, stub compilers often offer compiler switches to choose between different code generation strategies ((Zahn et al., 1990), (Corbin, 1990), (Huitema, 1991), (Sample, 1993), (Kessler, 1994)). The switch only decides which of the alternatives is chosen (interpreted, procedure-driven or inlined); the same alternative is then used to generate code for all types in the interface specification. The result is either efficient but bulky code, or compact but inefficient code. In contrast, the stub compiler presented in this paper provides a switch to specify the maximally allowable code size. The compiler then tries to generate the most efficient code possible under the given size constraint.
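The contrast between the interpreted and inlined strategies can be illustrated with a minimal sketch. The type descriptor, field layout and function names below are hypothetical and chosen only for illustration; real stub compilers operate on full interface specifications such as ASN.1 or XDR.

```python
import struct

# Hypothetical record type: two 32-bit integers followed by a 16-bit
# integer, all big-endian. The descriptor drives the interpreted encoder.
DESCRIPTOR = [">i", ">i", ">h"]

def encode_interpreted(values, descriptor=DESCRIPTOR):
    """Interpreted strategy: one generic loop shared by all types.
    Compact (the loop exists once), but every field pays for
    descriptor dispatch at run time."""
    out = b""
    for fmt, value in zip(descriptor, values):
        out += struct.pack(fmt, value)
    return out

def encode_inlined(values):
    """Inlined strategy: the field handling is expanded into
    straight-line code for this one type. Fast, but the stub
    compiler must emit such a routine for every type."""
    a, b, c = values
    return struct.pack(">iih", a, b, c)
```

Both routines produce identical wire encodings; they differ only in how the speed/size trade-off is resolved, which is precisely the choice the compiler switches above make globally for all types.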
USC (O'Malley et al., 1994) implements an optimization method based on inlining that significantly improves presentation conversion speed for TCP/IP packets. Because it relies on inlining, however, the method cannot be applied to more complex, application-oriented interface definitions without incurring a significant code size overhead. An optimization stage that follows the principles presented in this paper can alleviate this problem, and thus make the USC optimization method usable for protocols other than TCP/IP.
Designing an optimization stage for size/speed trade-off problems is difficult, even at today's advanced state of the art in compiler construction (Bacon, Graham & Sharp, 1994). Most standard compiler optimizations either decrease the size of the code (elimination of induction variables or of dead code) or have no significant effect on the size (register allocation, instruction scheduling). (Pittman, 1987) presents the general idea of using a hybrid of interpreted and procedure-driven code to solve size/speed trade-off problems. In contrast to the work presented in this paper, however, the trade-off is not made automatically, but requires intervention by the programmer. (Scheifler, 1977) shows that the size/speed trade-off for inlining procedure calls is equivalent to a Knapsack problem. We extend this result to the trade-off between interpreted and procedure-driven code.
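The Knapsack formulation can be sketched as follows. Each conversion routine is a candidate item whose "weight" is the code growth incurred by choosing the faster alternative and whose "value" is the frequency-weighted time saved; the budget is the allowable code size. The routine names and numbers below are hypothetical, and the standard 0/1 knapsack dynamic program stands in for whatever solver an actual compiler would use.

```python
def choose_inlining(routines, size_budget):
    """0/1 knapsack dynamic program.

    routines: list of (name, extra_size, time_saved) tuples, where
      extra_size is the code growth from choosing the fast alternative
      and time_saved is the frequency-weighted speedup.
    Returns (total_time_saved, frozenset of routine names to speed up)
    achievable within size_budget.
    """
    # best[s] = (best total saving within size s, chosen routines)
    best = [(0, frozenset())] * (size_budget + 1)
    for name, size, saving in routines:
        # Iterate sizes downward so each routine is used at most once.
        for s in range(size_budget, size - 1, -1):
            candidate = best[s - size][0] + saving
            if candidate > best[s][0]:
                best[s] = (candidate, best[s - size][1] | {name})
    return best[size_budget]
```

For example, with hypothetical routines `("Integer", 40, 9)`, `("String", 60, 12)`, `("Record", 100, 20)` and a budget of 120 size units, the solver prefers speeding up Integer and String (total saving 21) over Record alone (saving 20).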
The optimizer presented in this paper relies on execution frequency information. The conventional approach for gathering this information is to collect execution trace data (gprof (Graham et al., 1983), (Pettis & Hansen, 1990), (McFarling, 1991)). However, this approach makes code optimization very time-consuming for the application programmer. The effort usually becomes prohibitive for distributed programs, since two or more program modules must be run in parallel to collect trace data. An alternative is to derive execution frequency information by having the compiler analyze the structure of the source code, e.g. by heuristic branch prediction (Ball & Larus, 1993). This approach requires no human intervention, and is thus well suited for optimizing distributed programs. The work presented in this paper develops heuristic branch prediction rules for interface specifications.
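To make the idea concrete, here is a minimal sketch of structural frequency estimation, not the paper's actual prediction rules. It assumes two illustrative heuristics: all alternatives of a CHOICE are equally likely, and an OPTIONAL field is present half of the time. The nested-tuple representation of a specification is likewise invented for this example.

```python
def estimate_frequencies(spec, parent_freq=1.0, out=None):
    """Estimate relative execution frequencies of leaf fields from
    the structure of an interface specification.

    spec is a nested tuple (hypothetical representation):
      ("LEAF", name)
      ("SEQUENCE", [(name, subspec, is_optional), ...])
      ("CHOICE",   [(name, subspec), ...])
    Returns a dict mapping leaf names to estimated frequencies.
    """
    if out is None:
        out = {}
    kind = spec[0]
    if kind == "LEAF":
        out[spec[1]] = out.get(spec[1], 0.0) + parent_freq
    elif kind == "SEQUENCE":
        for _name, sub, optional in spec[1]:
            # Heuristic: an OPTIONAL field is present half of the time.
            freq = parent_freq * (0.5 if optional else 1.0)
            estimate_frequencies(sub, freq, out)
    elif kind == "CHOICE":
        # Heuristic: all alternatives of a CHOICE are equally likely.
        for _name, sub in spec[1]:
            estimate_frequencies(sub, parent_freq / len(spec[1]), out)
    return out
```

The resulting frequency estimates are exactly the inputs a Knapsack-based optimization stage needs, without any trace collection by the programmer.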
Earlier results of this work ((Hoschka & Huitema, 1993), (Hoschka & Huitema, 1994)) have been applied to solve a related size/speed trade-off problem in the area of automatic code generation of protocol automata in HIPPCO ((Castelluccia & Hoschka, 1995), (Castelluccia et al., 1997)). The work presented in this paper extends these results by describing how the size/speed trade-off can be resolved by a mapping onto a Knapsack problem. In contrast to (Castelluccia et al., 1997), we do not rely on manual annotation for deriving execution frequencies, but use an automatic approach based on heuristic branch prediction.