From f3706612212b433bf6598cc767ef90a4013e0e19 Mon Sep 17 00:00:00 2001
From: Jean Yang
Date: Thu, 4 Dec 2008 18:08:07 -0500
Subject: [PATCH] small changes

---
 writeup/fp_writeup.tex | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/writeup/fp_writeup.tex b/writeup/fp_writeup.tex
index df09923..bf37259 100644
--- a/writeup/fp_writeup.tex
+++ b/writeup/fp_writeup.tex
@@ -40,14 +40,14 @@ We discuss our experimental setup in Section~\ref{setup}, our implementation of
 \section{Experimental setup}
 \label{setup}
-Though Java and C++ both have sufficiently ``interesting'' libraries and a large code base, we favored Java over C++ because 1) Java code is more portable, since it compiles to bytecode rather than machine code and is dynamically linked, 2) there are code copying issues involved with C++ templates, and 3) for hisorical reasons, there is more open source Java code that uses Java standard libraries than there is open source C++ code that uses the STL~\footnote{Many open source software in C++ uses its own version of abstract data structure libraries because 1) the libraries were not standardized across compilers and 2) people have efficiency issues with using the STL.}. In this section we describe how we implemented our profiler and constructed traces.
+Though Java and C++ both have sufficiently ``interesting'' libraries and a large code base, we favored Java over C++ because 1) Java code is more portable, since it compiles to bytecode rather than machine code and is dynamically linked, 2) there are code copying issues involved with C++ templates, and 3) for historical reasons, there is more open source Java code that uses Java standard libraries than there is open source C++ code that uses the STL~\footnote{Much open source C++ software uses its own abstract data structure libraries because 1) the libraries were not standardized across compilers and 2) developers have efficiency concerns about using the STL.}. Here we describe how we implemented our profiler and constructed traces.
 \subsection{Instrumenting Java code to produce traces}
 We constructed runtime profiles of the code we run in order to construct the method traces (rather than adding logging code to the standard libraries) because 1) this method is more general, giving us the freedom of being able to construct learning data from any Java \texttt{jar} file and 2) this method gave us the freedom to choose our set of libraries to classify after seeing which ones were being used in the code. We did this by writing a Java profiler that inserts logging instructions in the Java bytecode whenever there is a call to a standard library method of interest. Because Java's standard libraries are loaded in the bootstrap classloader rather than in the system loader, however, this causes problems for directly instrumenting the standard library classes. Because of this, we instrument other classes loaded in the system classloader and, from code in those classes, log calls to standard library objects~\footnote{Because of these unforeseen issues involving Java's idiosyncracies, this process required far more work than initially anticipated. We did, however, emerge victorious: we can now construct a profile of any Java JAR executable file.}.
-Our method for getting traces uses the Java compiler's \texttt{javaagent} support for attaching a profiler to a piece of code. We constructed our traces by modifying the source code from the Java Interactive Profiler~\cite{JIP}, a Java profiler written in Java and built on top of the ASM bytecode reengineering library~\cite{ASM}. Our program inserts bytecode instructions into the profiled code that then call our profiler functions to calls to standard library functions. We initially tracked sequences of calls to given libraries, but we realized that this information is not as useful as logging calls to specific instances of classes. To do this, we added instrumentation code that inspects the runtime stack to get at the object we are profiling. We take the hash value of each object in order to distinguish calls to different instances of the same class.
+Our method for getting traces uses the JVM's \texttt{-javaagent} support for attaching a profiler to a piece of code. We constructed our traces by modifying the source code of the Java Interactive Profiler~\cite{JIP}, a Java profiler written in Java and built on top of the ASM bytecode manipulation library~\cite{ASM}. Our program inserts bytecode instructions into the profiled code that call our profiler functions to log calls to standard library methods. We initially tracked sequences of calls to given libraries, but from analyzing preliminary results we realized that this information is not as useful as logging calls to specific instances of classes. To log calls per instance, we added instrumentation code that inspects the runtime stack to get at the object we are profiling. (This unfortunately introduced significant overhead: for methods with many arguments, we had to pop the arguments, store them in memory, examine the object reference, and then load the arguments and push them back onto the stack.) We take the hash value of each object in order to distinguish calls to different instances of the same class.
 \subsection{Building traces}
 We initially downloaded an assortment of programs, many of which had interactive interfaces, but we quickly realized that we would not be able to generate enough learning data by interacting with these programs. We chose instead to focus on programs that 1) could take lots of inputs and therefore could provide enough data, 2) used enough common libraries with other programs we found, and 3) did not load libraries that cause our profiler to crash. We were also limited by the fact that the profiler introduced significant overhead.
-- 
2.11.4.GIT