Thesis: fixing typo

diff --git a/thesis/master-thesis-erlenkr.tex b/thesis/master-thesis-erlenkr.tex

index 37f9f042ac127c6dfd81d23ec6acc4d5512eea58..ced191c40ec08a1927ac8324266d224110872c8e 100644
--- a/thesis/master-thesis-erlenkr.tex
+++ b/thesis/master-thesis-erlenkr.tex
@@ -64,6 +64,54 @@
  
  \bibliography{bibliography/master-thesis-erlenkr-bibliography}
  
+  % UML comment in TikZ:
+  % ref: https://tex.stackexchange.com/questions/103688/folded-paper-shape-tikz
+\makeatletter
+\pgfdeclareshape{umlcomment}{
+  \inheritsavedanchors[from=rectangle] % this is nearly a rectangle
+  \inheritanchorborder[from=rectangle]
+  \inheritanchor[from=rectangle]{center}
+  \inheritanchor[from=rectangle]{north}
+  \inheritanchor[from=rectangle]{south}
+  \inheritanchor[from=rectangle]{west}
+  \inheritanchor[from=rectangle]{east}
+  % ... and possibly more
+  \backgroundpath{% this is new
+  % store lower left in xa/ya and upper right in xb/yb
+  \southwest \pgf@xa=\pgf@x \pgf@ya=\pgf@y
+  \northeast \pgf@xb=\pgf@x \pgf@yb=\pgf@y
+  % compute corner of ``flipped page''
+  \pgf@xc=\pgf@xb \advance\pgf@xc by-10pt % this should be a parameter
+  \pgf@yc=\pgf@yb \advance\pgf@yc by-10pt
+  % construct main path
+  \pgfpathmoveto{\pgfpoint{\pgf@xa}{\pgf@ya}}
+  \pgfpathlineto{\pgfpoint{\pgf@xa}{\pgf@yb}}
+  \pgfpathlineto{\pgfpoint{\pgf@xc}{\pgf@yb}}
+  \pgfpathlineto{\pgfpoint{\pgf@xb}{\pgf@yc}}
+  \pgfpathlineto{\pgfpoint{\pgf@xb}{\pgf@ya}}
+  \pgfpathclose
+  % add little corner
+  \pgfpathmoveto{\pgfpoint{\pgf@xc}{\pgf@yb}}
+  \pgfpathlineto{\pgfpoint{\pgf@xc}{\pgf@yc}}
+  \pgfpathlineto{\pgfpoint{\pgf@xb}{\pgf@yc}}
+  \pgfpathlineto{\pgfpoint{\pgf@xc}{\pgf@yc}}
+  }
+}
+\makeatother
+
+\tikzstyle{comment}=[%
+  draw,
+  drop shadow,
+  fill=white,
+  align=center,
+  shape=document,
+  minimum width=20mm,
+  minimum height=10mm,
+  shape=umlcomment,
+  inner sep=2ex,
+  font=\ttfamily,
+]
+
  \begin{document}
  \ififorside
  \frontmatter{}
@@ -1296,19 +1344,21 @@ The analysis and precondition checking is done by the
  First is check whether the selection is a valid selection or not, with respect 
  to statement boundaries and that it actually contains any selections. Then it 
  checks the legality of both extracting the selection and also moving it to 
-another class. If the selection is approved as legal, it is analyzed to find the 
-presumably best target to move the extracted method to.
+another class. This checking is performed by a range of checkers 
+\see{checkers}.  If the selection is approved as legal, it is analyzed to find 
+the presumably best target to move the extracted method to.
  
  For finding the best suitable target the analyzer is using a 
  \typewithref{no.uio.ifi.refaktor.analyze.collectors}{PrefixesCollector} that 
  collects all the possible candidates for the refactoring. All the non-candidates 
  is found by an 
  \typewithref{no.uio.ifi.refaktor.analyze.collectors}{UnfixesCollector} that 
-collects all the targets that will give some kind of error if used. All prefixes 
-(and unfixes) are represented by a 
+collects all the targets that will give some kind of error if used.  (For 
+details about the property collectors, see \myref{propertyCollectors}.) All 
+prefixes (and unfixes) are represented by a 
  \typewithref{no.uio.ifi.refaktor.extractors}{Prefix}, and they are collected 
  into sets of prefixes. The safe prefixes is found by subtracting from the set of 
-candidate prefixes the prefixes that is enclosing any of the unfixes. A prefix 
+candidate prefixes the prefixes that is enclosing any of the unfixes.  A prefix 
  is enclosing an unfix if the unfix is in the set of its sub-prefixes.  As an 
  example, \texttt{``a.b''} is enclosing \texttt{``a''}, as is \texttt{``a''}. The 
  safe prefixes is unified in a \type{PrefixSet}. If a prefix has only one 
@@ -1379,121 +1429,6 @@ is provided by the parameter information object.
  \subsection{Finding the IMethod}\label{postExtractExecution}
  \todoin{Rename section. Write.}
  
-\subsection{Property collectors}
-The prefixes and unfixes are found by property 
-collectors\typeref{no.uio.ifi.refaktor.extractors.collectors.PropertyCollector}.  
-A property collector follows the visitor pattern\citing{designPatterns} and is 
-of the \typewithref{org.eclipse.jdt.core.dom}{ASTVisitor} type.  An 
-\type{ASTVisitor} visits nodes in an abstract syntax tree that forms the Java 
-document object model. The tree consists of nodes of type 
-\typewithref{org.eclipse.jdt.core.do}{ASTNode}.
-
-\subsubsection{The PrefixesCollector}
-The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{PrefixesCollector} 
-finds prefixes that makes up tha basis for calculating move targets for the 
-Extract and Move Method refactoring. It visits expression 
-statements\typeref{org.eclipse.jdt.core.dom.ExpressionStatement} and creates 
-prefixes from its expressions in the case of method invocations. The prefixes 
-found is registered with a prefix set, together with all its sub-prefixes.
-\todo{Rewrite in the case of changes to the way prefixes are found}
-
-\subsubsection{The UnfixesCollector}\label{unfixes}
-The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{UnfixesCollector} 
-finds unfixes within a selection. That is prefixes that cannot be used as a 
-basis for finding a move target in a refactoring.
-
-An unfix can be a name that is assigned to within a selection. The reason that 
-this cannot be allowed, is that the result would be an assignment to the 
-\type{this} keyword, which is not valid in Java \see{eclipse_bug_420726}.
-
-Prefixes that originates from variable declarations within the same selection 
-are also considered unfixes. This is because when a method is moved, it needs to 
-be called through a variable. If this variable is also within the method that is 
-to be moved, this obviously cannot be done.
-
-Also considered as unfixes are variable references that are of types that is not 
-suitable for moving a methods to. This can be either because it is not 
-physically possible to move the method to the desired class or that it will 
-cause compilation errors by doing so.
-
-If the type binding for a name is not resolved it is considered and unfix. The 
-same applies to types that is only found in compiled code, so they have no 
-underlying source that is accessible to us. (E.g. the \type{java.lang.String} 
-class.)
-
-Interfaces types are not suitable as targets. This is simply because interfaces 
-in java cannot contain methods with bodies. (This thesis does not deal with 
-features of Java versions later than Java 7. Java 8 has interfaces with default 
-implementations of methods.) Neither are local types allowed. This accounts for 
-both local and anonymous classes. Anonymous classes are effectively the same as 
-interface types with respect to unfixes. Local classes could in theory be used 
-as targets, but this is not possible due to limitations of the implementation of 
-the Extract and Move Method refactoring. The problem is that the refactoring is 
-done in two steps, so the intermediate state between the two refactorings would 
-not be legal Java code. In the case of local classes, the problem is that, in 
-the intermediate step, a selection referencing a local class would need to take 
-the local class as a parameter if it were to be extracted to a new method. This 
-new method would need to live in the scope of the declaring class of the 
-originating method. The local class would then not be in the scope of the 
-extracted method, thus bringing the source code into an illegal state. One could 
-imagine that the method was extracted and moved in one operation, without an 
-intermediate state. Then it would make sense to include variables with types of 
-local classes in the set of legal targets, since the local classes would then be 
-in the scopes of the method calls. If this makes any difference for software 
-metrics that measure coupling would be a different discussion.
-
-\begin{listing}
-\begin{multicols}{2}
-\begin{minted}[]{java}
-// Before
-void declaresLocalClass() {
-  class LocalClass {
-    void foo() {}
-    void bar() {}
-  }
-
-  LocalClass inst =
-    new LocalClass();
-  inst.foo();
-  inst.bar();
-}
-\end{minted}
-
-\columnbreak
-
-\begin{minted}[]{java}
-// After Extract Method
-void declaresLocalClass() {
-  class LocalClass {
-    void foo() {}
-    void bar() {}
-  }
-
-  LocalClass inst =
-    new LocalClass();
-  fooBar(inst);
-}
-
-// Intermediate step
-void fooBar(LocalClass inst) {
-  inst.foo();
-  inst.bar();
-}
-\end{minted}
-\end{multicols}
-\caption{When Extract and Move Method tries to use a variable with a local type 
-as the move target, an intermediate step is taken that is not allowed. Here: 
-\type{LocalClass} is not in the scope of \method{fooBar} in its intermediate 
-location.}
-\label{lst:extractMethod_LocalClass}
-\end{listing}
-
-The last class of names that are considered unfixes is names used in null tests.  
-These are tests that reads like this: if \texttt{<name>} equals \var{null} then 
-do something. If allowing variables used in those kinds of expressions as 
-targets for moving methods, we would end up with code containing boolean 
-expressions like \texttt{this == null}, which would not be meaningful, since 
-\var{this} would never be \var{null}.
  
  \subsection{The Prefix Class}
  This class exists mainly for holding data about a prefix, such as the expression 
@@ -1726,7 +1661,69 @@ from how source manipulation programs would do it, except for some properties of
  code that is analyzed in the parser, and that they may be differing in what 
  kinds of properties they analyze.  Thus the process of translation source code 
  into a structure that is suitable for analyzing, can be seen as a kind of 
-interrupted compilation process.
+interrupted compilation process \see{fig:interruptedCompilationProcess}.
+
+\begin{figure}[h]
+  \centering
+  \tikzset{
+    base/.style={anchor=north, align=center, rectangle, minimum height=1.4cm},
+    basewithshadow/.style={base, drop shadow, fill=white},
+    outlined/.style={basewithshadow, draw, rounded corners, minimum 
+    width=0.4cm},
+    primary/.style={outlined, font=\bfseries},
+    dashedbox/.style={outlined, dashed},
+    arrowpath/.style={black, align=center, font=\small},
+    processarrow/.style={arrowpath, ->, >=angle 90, shorten >=1pt},
+  }
+  \begin{tikzpicture}[node distance=1.3cm and 3cm, scale=1, every 
+    node/.style={transform shape}]
+    \node[base](AuxNode1){\small source code};
+    \node[primary, right=of AuxNode1, xshift=-2.5cm](Scanner){Scanner};
+    \node[primary, right=of Scanner, xshift=0.5cm](Parser){Parser};
+    \node[dashedbox, below=of Parser](SemanticAnalyzer){Semantic\\Analyzer};
+    \node[dashedbox, left=of SemanticAnalyzer](SourceCodeOptimizer){Source 
+    Code\\Optimizer};
+    \node[dashedbox, below=of SourceCodeOptimizer
+    ](CodeGenerator){Code\\Generator};
+    \node[dashedbox, right=of CodeGenerator](TargetCodeOptimizer){Target 
+    Code\\Optimizer};
+    \node[base, right=of TargetCodeOptimizer](AuxNode2){};
+
+    \draw[processarrow](AuxNode1) -- (Scanner);
+
+    \path[arrowpath] (Scanner) -- node [sloped](tokens){tokens}(Parser);
+    \draw[processarrow](Scanner) -- (tokens) -- (Parser);
+
+    \path[arrowpath] (Parser) -- node (syntax){syntax 
+    tree}(SemanticAnalyzer);
+    \draw[processarrow](Parser) -- (syntax) -- (SemanticAnalyzer);
+
+    \path[arrowpath] (SemanticAnalyzer) -- node 
+    [sloped](annotated){annotated\\tree}(SourceCodeOptimizer);
+    \draw[processarrow, dashed](SemanticAnalyzer) -- (annotated) -- 
+    (SourceCodeOptimizer);
+
+    \path[arrowpath] (SourceCodeOptimizer) -- node 
+    (intermediate){intermediate code}(CodeGenerator);
+    \draw[processarrow, dashed](SourceCodeOptimizer) -- (intermediate) --
+    (CodeGenerator);
+
+    \path[arrowpath] (CodeGenerator) -- node [sloped](target1){target 
+    code}(TargetCodeOptimizer);
+    \draw[processarrow, dashed](CodeGenerator) -- (target1) --
+    (TargetCodeOptimizer);
+
+    \path[arrowpath](TargetCodeOptimizer) -- node [sloped](target2){target 
+    code}(AuxNode2);
+    \draw[processarrow, dashed](TargetCodeOptimizer) -- (target2) -- (AuxNode2);
+  \end{tikzpicture}
+  \caption{Interrupted compilation process. {\footnotesize (Full compilation 
+    process from \emph{Compiler construction: principles and practice} by 
+    Kenneth C.  Louden\citing{louden1997}.)}}
+  \label{fig:interruptedCompilationProcess}
+\end{figure}
+
+\todoin{Refine \myref{fig:interruptedCompilationProcess}.}
  
  The process starts with a \emph{scanner}, or lexer. The job of the scanner is to 
  read the source code and divide it into tokens for the parser. Therefore, it is 
@@ -1769,10 +1766,15 @@ bindings for variables and variable bindings for names.
  
  \begin{figure}[h]
    \centering
-  \begin{tikzpicture}[scale=0.7]
+  \begin{tikzpicture}[scale=0.8]
    \tikzset{level distance=40pt}
+  \tikzset{sibling distance=5pt}
+  \tikzstyle{thescale}=[scale=0.8]
+  \tikzset{every tree node/.style={align=center}}
    \tikzset{edge from parent/.append style={thick}}
-  \tikzset{every internal node/.style={ellipse,draw,fill=lightgray}}
+  \tikzstyle{inode}=[rectangle,rounded corners,draw,fill=lightgray,drop 
+  shadow,align=center]
+  \tikzset{every internal node/.style={inode}}
    \tikzset{every leaf node/.style={draw=none,fill=none}}
  
    \Tree [.\type{InfixExpression} [.\type{InfixExpression}
@@ -1819,6 +1821,9 @@ However, the expression \type{Name} is a little special, since it is both used
  as an operand in compound expressions, as well as for names in type declarations 
  and such.
  
+There is an overview of some of the structure of an Eclipse AST in 
+\myref{fig:astEclipse}.
+
  \begin{figure}[h]
    \centering
    \begin{tikzpicture}[scale=0.8]
@@ -1832,28 +1837,34 @@ and such.
    \tikzset{every internal node/.style={inode}}
    \tikzset{every leaf node/.style={draw=none,fill=none}}
  
-  \Tree [.\type{CompilationUnit} [.\type{[ PackageDeclaration ]} ]
-    [.\type{\{ ImportDeclaration \}*} ]
+  \Tree [.\type{CompilationUnit} [.\type{[ PackageDeclaration ]} [.\type{Name} ] 
+  [.\type{\{ Annotation \}*} ] ]
+  [.\type{\{ ImportDeclaration \}*} [.\type{Name} ] ]
      [.\type{\{ AbstractTypeDeclaration \}+} [.\node(site){\type{\{ 
-    BodyDeclaration \}*}}; ] ]
+    BodyDeclaration \}*}}; ] [.\type{SimpleName} ] ]
    ]
-  \begin{scope}[shift={(0,-5)}]
+  \begin{scope}[shift={(0.5,-6)}]
      \node[inode,thescale](root){\type{MethodDeclaration}};
-    \node[inode,thescale](modifiers) at (5,-5){\type{\{ IExtendedModifier \}*} 
+    \node[inode,thescale](modifiers) at (4.5,-5){\type{\{ IExtendedModifier \}*} 
      \\ {\footnotesize (Of type \type{Modifier} or \type{Annotation})}};
-    \node[inode,thescale](typeParameters) at (-6,-3){\type{\{ TypeParameter 
+    \node[inode,thescale](typeParameters) at (-6,-3.5){\type{\{ TypeParameter 
      \}*}};
-    \node[inode,thescale](parameters) at (-6,-5){\type{\{ 
+    \node[inode,thescale](parameters) at (-5,-5){\type{\{ 
      SingleVariableDeclaration \}*} \\ {\footnotesize (Parameters)}};
-    \node[inode,thescale](exceptions) at (6,-3){\type{\{ Name \}*} \\ 
+    \node[inode,thescale](exceptions) at (5,-3){\type{\{ Name \}*} \\ 
      {\footnotesize (Exceptions)}};
-    \node[inode,thescale](return) at (-7,-1){\type{Type} \\ {\footnotesize 
+    \node[inode,thescale](return) at (-6.5,-2){\type{Type} \\ {\footnotesize 
      (Return type)}};
-    \node[inode,thescale](body) at (0,-5){\type{[ Block ]} \\ {\footnotesize 
-    (Body)}};
+    \begin{scope}[shift={(0,-5)}]
+      \Tree [.\node(body){\type{[ Block ]} \\ {\footnotesize (Body)}};
+      [.\type{\{ Statement \}*} [.\type{\{ Expression \}*} ]
+        [.\type{\{ Statement \}*} [.\type{\ldots} ]]
+      ]
+      ]
+    \end{scope}
    \end{scope}
-  \draw[->,>=triangle 45,shorten >=1pt](root.100)..controls +(north:1) and 
-  +(west:5)..(site);
+  \draw[->,>=triangle 90,shorten >=1pt](root.east)..controls +(east:2) and 
+  +(south:1)..(site.south);
  
    \draw (root.south) -- (modifiers);
    \draw (root.south) -- (typeParameters);
@@ -1866,20 +1877,383 @@ and such.
    \caption{The format of the abstract syntax tree in Eclipse.}
    \label{fig:astEclipse}
  \end{figure}
+\todoin{Add more to the AST format tree? \myref{fig:astEclipse}}
+
+\section{The ASTVisitor}\label{astVisitor}
+So far, the only thing that has been addressed is how the data that is going to 
+be the basis for our analysis is structured. Another aspect of it is how we are 
+going to traverse the AST to gather the information we need, so we can draw 
+conclusions about the properties we are analysing. It is of course possible to 
+start at the top of the tree, and manually search through its nodes for the ones 
+we are looking for, but that is a bit inconvenient. To be able to efficiently 
+utilize such an approach, we would need to make our own framework for traversing 
+the tree and visiting only the types of nodes we are after. Luckily, this 
+functionality is already provided in Eclipse, by its 
+\typewithref{org.eclipse.jdt.core.dom}{ASTVisitor}.
+
+The Eclipse AST, together with its \type{ASTVisitor}, follows the \emph{Visitor} 
+pattern\citing{designPatterns}. The intent of this design pattern is to 
+facilitate extending the functionality of classes without touching the classes 
+themselves.
+
+Let us say that there is a class hierarchy of \emph{Elements}. These elements 
+all have a method \method{accept(Visitor visitor)}. In its simplest form, the 
+\method{accept} method just calls the \method{visit} method of the visitor with 
+itself as an argument, like this: \code{visitor.visit(this)}.  For the visitors 
+to be able to extend the functionality of all the classes in the elements 
+hierarchy, each \type{Visitor} must have one visit method for each concrete 
+class in the hierarchy. Say the hierarchy consists of the concrete classes 
+\type{ConcreteElementA} and \type{ConcreteElementB}. Then each visitor must have 
+the (possibly empty) methods \method{visit(ConcreteElementA element)} and 
+\method{visit(ConcreteElementB element)}. This scenario is depicted in 
+\myref{fig:visitorPattern}.
+
+\begin{figure}[h]
+  \centering
+  \tikzstyle{abstract}=[rectangle, draw=black, fill=white, drop shadow, text 
+  centered, anchor=north, text=black, text width=6cm, every one node 
+part/.style={align=center, font=\bfseries\itshape}]
+  \tikzstyle{concrete}=[rectangle, draw=black, fill=white, drop shadow, text 
+  centered, anchor=north, text=black, text width=6cm]
+  \tikzstyle{inheritarrow}=[->, >=open triangle 90, thick]
+  \tikzstyle{commentarrow}=[->, >=angle 90, dashed]
+  \tikzstyle{line}=[-, thick]
+  \tikzset{every one node part/.style={align=center, font=\bfseries}}
+  \tikzset{every second node part/.style={align=center, font=\ttfamily}}
+        
+  \begin{tikzpicture}[node distance=1cm, scale=0.8, every node/.style={transform 
+    shape}]
+    \node (Element) [abstract, rectangle split, rectangle split parts=2]
+        {
+          \nodepart{one}{Element}
+          \nodepart{second}{+accept(visitor: Visitor)}
+        };
+    \node (AuxNode01) [text width=0, minimum height=2cm, below=of Element] {};
+    \node (ConcreteElementA) [concrete, rectangle split, rectangle split 
+    parts=2, left=of AuxNode01]
+        {
+          \nodepart{one}{ConcreteElementA}
+          \nodepart{second}{+accept(visitor: Visitor)}
+        };
+    \node (ConcreteElementB) [concrete, rectangle split, rectangle split 
+    parts=2, right=of AuxNode01]
+        {
+          \nodepart{one}{ConcreteElementB}
+          \nodepart{second}{+accept(visitor: Visitor)}
+        };
+
+    \node[comment, below=of ConcreteElementA] (CommentA) {visitor.visit(this)};
+
+    \node[comment, below=of ConcreteElementB] (CommentB) {visitor.visit(this)};
+
+    \node (AuxNodeX) [text width=0, minimum height=1cm, below=of AuxNode01] {};
+
+    \node (Visitor) [abstract, rectangle split, rectangle split parts=2, 
+    below=of AuxNodeX]
+        {
+          \nodepart{one}{Visitor}
+          \nodepart{second}{+visit(ConcreteElementA)\\+visit(ConcreteElementB)}
+        };
+    \node (AuxNode02) [text width=0, minimum height=2cm, below=of Visitor] {};
+    \node (ConcreteVisitor1) [concrete, rectangle split, rectangle split 
+    parts=2, left=of AuxNode02]
+        {
+          \nodepart{one}{ConcreteVisitor1}
+          \nodepart{second}{+visit(ConcreteElementA)\\+visit(ConcreteElementB)}
+        };
+    \node (ConcreteVisitor2) [concrete, rectangle split, rectangle split 
+    parts=2, right=of AuxNode02]
+        {
+          \nodepart{one}{ConcreteVisitor2}
+          \nodepart{second}{+visit(ConcreteElementA)\\+visit(ConcreteElementB)}
+        };
+
+    
+    \draw[inheritarrow] (ConcreteElementA.north) -- ++(0,0.7) -| 
+    (Element.south);
+    \draw[line] (ConcreteElementA.north) -- ++(0,0.7) -| 
+    (ConcreteElementB.north);
+
+    \draw[inheritarrow] (ConcreteVisitor1.north) -- ++(0,0.7) -| 
+    (Visitor.south);
+    \draw[line] (ConcreteVisitor1.north) -- ++(0,0.7) -| 
+    (ConcreteVisitor2.north);
+
+    \draw[commentarrow] (CommentA.north) -- (ConcreteElementA.south);
+    \draw[commentarrow] (CommentB.north) -- (ConcreteElementB.south);
+
+    
+  \end{tikzpicture}
+  \caption{The Visitor Pattern.}
+  \label{fig:visitorPattern}
+\end{figure}
+
+The use of the visitor pattern can be appropriate when the hierarchy of elements 
+is mostly stable, but the family of operations over its elements is constantly 
+growing. This is clearly the case for the Eclipse AST, since the hierarchy of 
+type \type{ASTNode} is very stable, but the functionality of its elements is 
+extended every time someone needs to operate on the AST. Another aspect of the 
+Eclipse implementation is that it is a public API, and the visitor pattern is an 
+easy way to provide access to the nodes in the tree.
+
+The version of the visitor pattern implemented for the AST nodes in Eclipse also 
+provides an elegant way to traverse the tree. It does so by following the 
+convention that every node in the tree first lets the visitor visit itself, 
+before it also makes all its children accept the visitor. The children are only 
+visited if the visit method of their parent returns \var{true}. This pattern 
+then makes for a prefix traversal of the AST. If postfix traversal is desired, 
+the visitors also have \method{endVisit} methods for each node type, that are 
+called after the \method{visit} method for a node and any visits to its children. 
+In addition to these visit methods, there are also the methods 
+\method{preVisit(ASTNode)}, \method{postVisit(ASTNode)} and 
+\method{preVisit2(ASTNode)}. The \method{preVisit} method is called before the 
+type-specific \method{visit} method. The \method{postVisit} method is called 
+after the type-specific \method{endVisit}. The type-specific \method{visit} is 
+only called if \method{preVisit2} returns \var{true}. Overriding 
+\method{preVisit2} also alters the behavior of \method{preVisit}, since the 
+default implementation of \method{preVisit2} is responsible for calling it.
+
+An example of a trivial \type{ASTVisitor} is shown in 
+\myref{lst:astVisitorExample}.
+
+\begin{listing}
+\begin{minted}{java}
+public class CollectNamesVisitor extends ASTVisitor {
+    Collection<Name> names = new LinkedList<Name>();
+
+    @Override
+    public boolean visit(QualifiedName node) {
+        names.add(node);
+        return false;
+    }
+
+    @Override
+    public boolean visit(SimpleName node) {
+        names.add(node);
+        return true;
+    }
+} 
+\end{minted}
+\caption{An \type{ASTVisitor} that visits all the names in a subtree and adds 
+them to a collection, except those names that are children of any 
+\type{QualifiedName}.}
+\label{lst:astVisitorExample}
+\end{listing}
+
+\section{Property collectors}\label{propertyCollectors}
+The prefixes and unfixes are found by property 
+collectors\typeref{no.uio.ifi.refaktor.extractors.collectors.PropertyCollector}.  
+A property collector is of the \type{ASTVisitor} type, and thus visits nodes of 
+type \type{ASTNode} of the abstract syntax tree \see{astVisitor}.
+
+\subsection{The PrefixesCollector}
+The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{PrefixesCollector} 
+finds prefixes that make up the basis for calculating move targets for the 
+Extract and Move Method refactoring. It visits expression 
+statements\typeref{org.eclipse.jdt.core.dom.ExpressionStatement} and creates 
+prefixes from their expressions in the case of method invocations. The prefixes 
+found are registered with a prefix set, together with all their sub-prefixes.
+
+\subsection{The UnfixesCollector}\label{unfixes}
+The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{UnfixesCollector} 
+finds unfixes within a selection, that is, prefixes that cannot be used as a 
+basis for finding a move target in a refactoring.
+
+An unfix can be a name that is assigned to within a selection. The reason that 
+this cannot be allowed, is that the result would be an assignment to the 
+\type{this} keyword, which is not valid in Java \see{eclipse_bug_420726}.
+
+Prefixes that originate from variable declarations within the same selection 
+are also considered unfixes. This is because when a method is moved, it needs to 
+be called through a variable. If this variable is also within the method that is 
+to be moved, this obviously cannot be done.
+
+Also considered as unfixes are variable references that are of types that are 
+not suitable for moving a method to. This can be either because it is not 
+physically possible to move the method to the desired class or that it will 
+cause compilation errors by doing so.
+
+If the type binding for a name is not resolved it is considered an unfix. The 
+same applies to types that are only found in compiled code, so they have no 
+underlying source that is accessible to us. (E.g. the \type{java.lang.String} 
+class.)
+
+Interface types are not suitable as targets. This is simply because interfaces 
+in Java cannot contain methods with bodies. (This thesis does not deal with 
+features of Java versions later than Java 7. Java 8 has interfaces with default 
+implementations of methods.) Neither are local types allowed. This accounts for 
+both local and anonymous classes. Anonymous classes are effectively the same as 
+interface types with respect to unfixes. Local classes could in theory be used 
+as targets, but this is not possible due to limitations of the implementation of 
+the Extract and Move Method refactoring. The problem is that the refactoring is 
+done in two steps, so the intermediate state between the two refactorings would 
+not be legal Java code. In the case of local classes, the problem is that, in 
+the intermediate step, a selection referencing a local class would need to take 
+the local class as a parameter if it were to be extracted to a new method. This 
+new method would need to live in the scope of the declaring class of the 
+originating method. The local class would then not be in the scope of the 
+extracted method, thus bringing the source code into an illegal state. One could 
+imagine that the method was extracted and moved in one operation, without an 
+intermediate state. Then it would make sense to include variables with types of 
+local classes in the set of legal targets, since the local classes would then be 
+in the scopes of the method calls. Whether this makes any difference for 
+software metrics that measure coupling would be a different discussion.
+
+\begin{listing}
+\begin{multicols}{2}
+\begin{minted}[]{java}
+// Before
+void declaresLocalClass() {
+  class LocalClass {
+    void foo() {}
+    void bar() {}
+  }
+
+  LocalClass inst =
+    new LocalClass();
+  inst.foo();
+  inst.bar();
+}
+\end{minted}
+
+\columnbreak
+
+\begin{minted}[]{java}
+// After Extract Method
+void declaresLocalClass() {
+  class LocalClass {
+    void foo() {}
+    void bar() {}
+  }
  
+  LocalClass inst =
+    new LocalClass();
+  fooBar(inst);
+}
  
-\section{Illegal selections}
+// Intermediate step
+void fooBar(LocalClass inst) {
+  inst.foo();
+  inst.bar();
+}
+\end{minted}
+\end{multicols}
+\caption{When Extract and Move Method tries to use a variable with a local type 
+as the move target, an intermediate step is taken that is not allowed. Here: 
+\type{LocalClass} is not in the scope of \method{fooBar} in its intermediate 
+location.}
+\label{lst:extractMethod_LocalClass}
+\end{listing}
  
-\subsection{Not all branches end in return}
+The last class of names that are considered unfixes is names used in null tests.  
+These are tests that read like this: if \texttt{<name>} equals \var{null} then 
+do something. If we allowed variables used in those kinds of expressions as 
+targets for moving methods, we would end up with code containing boolean 
+expressions like \texttt{this == null}, which would not be meaningful, since 
+\var{this} would never be \var{null}.
  
-\subsection{Ambiguous return statement}
-This problem occurs when there is either more than one assignment to a local 
-variable that is used outside of the selection, or there is only one, but there 
-are also return statements in the selection.
  
-\todoin{Explain why we do not need to consider variables assigned inside 
-local/anonymous classes. (The referenced variables need to be final and so 
-on\ldots)}
+\subsection{The ContainsReturnStatementCollector}
+The 
+\typewithref{no.uio.ifi.refaktor.analyze.collectors}{ContainsReturnStatementCollector} 
+is a very simple property collector. It only visits the return statements within 
+a selection, and can report whether it encountered a return statement or not.
+
+\subsection{The LastStatementCollector}
+The \typewithref{no.uio.ifi.refaktor.analyze.collectors}{LastStatementCollector} 
+collects the last statement of a selection. It does so by only visiting the top 
+level statements of the selection, and comparing the textual end offset of each 
+encountered statement with the end offset of the previous statement found.
+
+\section{Checkers}\label{checkers}
+The checkers are a range of classes that check that selections comply with 
+certain criteria. If a 
+\typewithref{no.uio.ifi.refaktor.analyze.analyzers}{Checker} fails, it throws a 
+\type{CheckerException}. The checkers are managed by the 
+\type{LegalStatementsChecker}, which does not, in fact, implement the 
+\type{Checker} interface. It does, however, run all the checkers registered with 
+it, and reports that all statements are considered legal if no 
+\type{CheckerException} is thrown. Many of the checkers either extend the 
+\type{PropertyCollector} or utilize one or more property collectors to verify 
+some criteria. The checkers registered with the \type{LegalStatementsChecker} 
+are described next. They are run in the order presented below.
+
+\subsection{The EnclosingInstanceReferenceChecker}
+The purpose of this checker is to verify that the names in a selection are not 
+referencing any enclosing instances. This is for making sure that all references 
+are legal in a method that is to be moved. Theoretically, some situations could 
+be easily solved by passing a reference to the referenced class with the moved 
+method (e.g. when calling public methods), but the dependency on the 
+\type{MoveInstanceMethodProcessor} prevents this.
+
+The 
+\typewithref{no.uio.ifi.refaktor.analyze.analyzers}{EnclosingInstanceReferenceChecker} 
+is a modified version of the 
+\typewithref{org.eclipse.jdt.internal.corext.refactoring.structure.MoveInstanceMethodProcessor}{EnclosingInstanceReferenceFinder} 
+from the \type{MoveInstanceMethodProcessor}. Wherever the 
+\type{EnclosingInstanceReferenceFinder} would create a fatal error status, the 
+checker throws a \type{CheckerException}.
+
+It works by first finding all of the enclosing types of a selection. Thereafter 
+it visits all its simple names to check that they are not references to 
+variables or methods declared in any of the enclosing types. In addition the 
+checker visits \var{this}-expressions to verify that no such expression is 
+qualified with any name.
+
+\subsection{The ReturnStatementsChecker}\label{returnStatementsChecker}
+\todoin{Write\ldots/change implementation/use control flow graph?}
+
+\subsection{The AmbiguousReturnValueChecker}
+This checker verifies that there are no \emph{ambiguous return statements} in a 
+selection. The problem with ambiguous return statements arises when a selection 
+is chosen to be extracted into a new method, but it needs to return more than 
+one value from that method.  This problem occurs in two situations.  The first 
+situation arises when there is more than one local variable that is both 
+assigned to within a selection and also referenced after the selection. The 
+other situation occurs when there is only one such assignment, but there is also 
+one or more return statements in the selection.
+
+First the checker needs to collect some data. Those data are the binding keys 
+for all simple names that are assigned to within the selection, including 
+variable declarations, but excluding fields. The checker also collects whether 
+there exists a return statement in the selection or not. No further checks of 
+return statements are needed, since, at this point, the selection is already 
+checked for illegal return statements \see{returnStatementsChecker}.
+
+After the binding keys of the assignees are collected, the checker searches the 
+part of the enclosing method that is after the selection for references whose 
+binding keys are among the collected keys. If more than one unique referral 
+is found, or only one referral is found, but the selection also contains a 
+return statement, we have a situation with an ambiguous return value, and an 
+exception is thrown.
+
+%\todoin{Explain why we do not need to consider variables assigned inside 
+%local/anonymous classes. (The referenced variables need to be final and so 
+%on\ldots)}
+
+\subsection{The IllegalStatementsChecker}
+This checker is designed to check for illegal statements.
+
+Any use of the \var{super} keyword is prohibited, since its meaning is altered 
+when moving a method to another class.
+
+For a \emph{break} statement, there are two situations to consider: A break 
+statement with or without a label. If the break statement has a label, it is 
+checked that the whole of the labeled statement is inside the selection. Since a 
+label does not have any binding information, we have to search upwards in the 
+AST to find the \type{LabeledStatement} that corresponds to the label from the 
+break statement, and check that it is contained in the selection. If the break 
+statement does not have a label attached to it, it is checked that its innermost 
+enclosing loop or switch statement also is inside the selection.
+
+The situation for a \emph{continue} statement is the same as for a break 
+statement, except that it is not allowed inside switch statements.
+
+Regarding \emph{assignments}, two types of assignments are allowed: Assignment 
+to a non-final variable and assignment to an array access. All other assignments 
+are regarded as illegal.
+
+\todoin{Finish\ldots}
  
  \chapter{Eclipse Bugs Found}
  \todoin{Add other things and change headline?}