Thesis: adding reference to Java Model table

[ifi-stolz-refaktor.git] / thesis / master-thesis-erlenkr.tex
diff --git a/thesis/master-thesis-erlenkr.tex b/thesis/master-thesis-erlenkr.tex

index 865bd2d7dc0178af90ad97212a0fc463d2d87036..267ff52e647355f05800b99fad8d126ed30c6533 100644
--- a/thesis/master-thesis-erlenkr.tex
+++ b/thesis/master-thesis-erlenkr.tex
@@ -2,29 +2,61 @@
  \usepackage{import}
  \usepackage[utf8]{inputenc}
  \usepackage[T1]{fontenc,url}
+\usepackage{lmodern} % using Latin Modern to be able to use bold typewriter font
  \urlstyle{sf}
-\usepackage{babel,textcomp,csquotes,ifimasterforside,varioref,graphicx}
-\usepackage[style=numeric-comp]{biblatex}
-%\usepackage[backend=biber,style=numeric-comp]{biblatex}
+\usepackage{listings}
+\usepackage{tabularx}
+\usepackage{tikz}
+\usepackage{tikz-qtree}
+\usetikzlibrary{shapes,snakes,trees}
+\usepackage{babel,textcomp,csquotes,ifimasterforside,varioref}
+\usepackage[hidelinks]{hyperref}
+\usepackage{cleveref}
+\usepackage[style=numeric-comp,backend=bibtex]{biblatex}
  \usepackage{amsthm}
-\usepackage{todonotes}
+\usepackage{graphicx}
+% use 'disable' before printing:
+\usepackage[]{todonotes}
+\usepackage{xspace}
+\usepackage{he-she}
+\usepackage{verbatim}
+\usepackage{minted}
+\usepackage{multicol}
+\usemintedstyle{bw}
  \usepackage{perpage} %the perpage package
  \MakePerPage{footnote} %the perpage package command
  
-\theoremstyle{plain}
+\theoremstyle{definition}
  \newtheorem*{wordDef}{Definition}
  
+\graphicspath{ {./figures/} }
+
+\newcommand{\citing}[1]{~\cite{#1}}
+\newcommand{\myref}[1]{\cref{#1} on \cpageref{#1}}
+
  \newcommand{\definition}[1]{\begin{wordDef}#1\end{wordDef}}
-\newcommand{\see}[1]{(see \ref{#1})}
+\newcommand{\see}[1]{(see \myref{#1})}
+\newcommand{\See}[1]{(See \myref{#1}.)}
  \newcommand{\explanation}[3]{\noindent\textbf{\textit{#1}}\\*\emph{When:} 
  #2\\*\emph{How:} #3\\*[-7px]}
-\newcommand{\type}[1]{\texttt{#1}}
+
+%\newcommand{\type}[1]{\lstinline{#1}}
+\newcommand{\code}[1]{\texttt{\textbf{#1}}}
+\newcommand{\type}[1]{\code{#1}}
  \newcommand{\typeref}[1]{\footnote{\type{#1}}}
  \newcommand{\typewithref}[2]{\type{#2}\typeref{#1.#2}}
  \newcommand{\method}[1]{\type{#1}}
  \newcommand{\methodref}[2]{\footnote{\type{#1}\method{\##2()}}}
  \newcommand{\methodwithref}[2]{\method{#2}\footnote{\type{#1}\method{\##2()}}}
+\newcommand{\var}[1]{\type{#1}}
+
+\newcommand{\refactoring}[1]{\emph{#1}}
+\newcommand{\ExtractMethod}{\refactoring{Extract Method}\xspace}
+\newcommand{\MoveMethod}{\refactoring{Move Method}\xspace}
+\newcommand{\ExtractAndMoveMethod}{\refactoring{Extract and Move Method}\xspace}
  
+\newcommand\todoin[2][]{\todo[inline, caption={2do}, #1]{
+\begin{minipage}{\textwidth-4pt}#2\end{minipage}}}
  
  \title{Refactoring}
  \subtitle{An essay}
@@ -38,7 +70,9 @@
  
  
  \chapter*{Abstract}
-Empty document.
+\todoin{\textbf{Remove all todos (including list) before delivery/printing!!!  
+Can be done by removing ``draft'' from documentclass.}}
+\todoin{Write abstract}
  
  \tableofcontents{}
  \listoffigures{}
@@ -46,129 +80,455 @@ Empty document.
  
  \chapter*{Preface}
  
-\mainmatter
+The discussions in this report must be seen in the context of object-oriented 
+programming languages, and Java in particular, since that is the language in 
+which most of the examples will be given. Although the techniques discussed 
+may be applicable to languages from other paradigms, such languages will not be 
+the subject of this report.
  
-\chapter{Introduction}
+\mainmatter
  
-\section{What is Refactoring?}
+\chapter{What is Refactoring?}
  
-This question is best answered dividing the answer into two parts.  First 
-defining the concept of a refactoring, then discuss what the discipline of 
-refactoring is all about. And to make it clear already from the beginning: The 
-discussions in this report must be seen in the context of object oriented 
-programming languages. It may be obvious, but much of the material will not make 
-much sense otherwise, although some of the techniques may be applicable to 
-sequential \todo{sequential?} languages, then possibly in other forms.
+This question is best answered by first defining the concept of a 
+\emph{refactoring}, what it is to \emph{refactor}, and then discussing what 
+aspects of programming make people want to refactor their code.
  
-\subsection{Defining refactoring}
-Martin Fowler, in his masterpiece on refactoring \cite{refactoring}, defines a 
+\section{Defining refactoring}
+Martin Fowler, in his classic book on refactoring\citing{refactoring}, defines a 
  refactoring like this:
+
  \begin{quote}
-  \emph{Refactoring} (noun): a change made to the \todo{what does he mean by 
-  internal?} internal structure of software to make it easier to understand and 
-  cheaper to modify without changing its observable 
-  behavior.~\cite{refactoring} % page 53
+  \emph{Refactoring} (noun): a change made to the internal 
+  structure\footnote{The structure observable by the programmer.} of software to 
+  make it easier to understand and cheaper to modify without changing its 
+  observable behavior.~\cite[p.~53]{refactoring}
  \end{quote}
-This definition gives additional meaning to the word \emph{refactoring}, beyond 
-its \todo{original?} original meaning. Fowler is mixing the \emph{motivation} 
-behind refactoring into his definition. Instead it could be made clean, only 
-considering the mechanical and behavioral aspects of refactoring. That is to 
-factor the program again, putting it together in a different way than before, 
-while preserving the behavior of the program. An alternative definition could 
-then be: 
-
-\definition{A refactoring is a transformation
+
+\noindent This definition assigns additional meaning to the word 
+\emph{refactoring}, beyond the composition of the prefix \emph{re-}, usually 
+meaning something like ``again'' or ``anew'', and the word \emph{factoring}, 
+which can mean isolating the \emph{factors} of something. Here a \emph{factor} 
+would be close to the mathematical definition of something that divides a 
+quantity without leaving a remainder. Fowler is mixing the \emph{motivation} 
+behind refactoring into his definition. Instead, the definition could be 
+narrowed to consider only the \emph{mechanical} and \emph{behavioral} aspects 
+of refactoring. That is, to factor the program again, putting it together in a 
+different way than before, while preserving the behavior of the program. An 
+alternative definition could then be: 
+
+\definition{A \emph{refactoring} is a transformation
  done to a program without altering its external behavior.}
  
-So a refactoring primarily changes how the \emph{code} of a program is perceived 
-by the \emph{programmer}, and not the behavior experienced by any user of the 
-program. Although the logical meaning is preserved, such changes could 
-potentially alter the program's behavior when it comes to performance gain or 
-penalties. So any logic depending on the performance of a program could make the 
-program behave differently after a refactoring.
+From this we can conclude that a refactoring primarily changes how the 
+\emph{code} of a program is perceived by the \emph{programmer}, and not the 
+\emph{behavior} experienced by any user of the program. Although the logical 
+meaning is preserved, such changes could potentially alter the program's 
+behavior when it comes to performance gains or penalties. So any logic depending 
+on the performance of a program could make the program behave differently after 
+a refactoring.
  
  In the extreme case one could argue that such a thing as \emph{software 
-obfuscation} is to refactor. If we where to define it as a refactoring, it could 
-be defined as a composite refactoring \see{intro_composite}, consisting of, for 
-instance, a series of rename refactorings. (But it could of course be much more 
-complex, and the mechanics of it would not exactly be carved in stone.) To 
-perform some serious obfuscation one would also take advantage of techniques not 
-found among established refactorings, such as removing whitespace. This might 
-not even generate a different syntax tree for languages not sensitive to 
-whitespace, placing it in the gray area of what transformations is to be 
-considered refactorings.
-
-Finally, to \emph{refactor} is (quoting Martin Fowler)
+obfuscation} is refactoring. Software obfuscation means making source code 
+harder to read and analyze, while preserving its semantics. It could be done by 
+composing many, more or less randomly chosen, refactorings. The question then 
+arises whether it can be called a \emph{composite refactoring} 
+\see{compositeRefactorings} or not. The answer is not obvious. First, there is 
+no way to describe \emph{the} mechanics of software obfuscation, because there 
+are infinitely many ways to do it. Second, \emph{obfuscation} can be thought 
+of as \emph{one operation}: Either the code is obfuscated, or it is not. Third, 
+it makes no sense to call software obfuscation \emph{a} refactoring, since it 
+holds different meanings for different people. The last point is important, 
+since one of the motivations behind defining different refactorings is to build 
+up a vocabulary for software professionals to reason about and discuss programs, 
+similar to the motivation behind design patterns\citing{designPatterns}.  So for 
+describing \emph{software obfuscation}, it might be more appropriate to define 
+what you do when performing it rather than precisely defining its mechanics in 
+terms of other refactorings.
+
+\section{The etymology of ``refactoring''}
+It is a little difficult to pinpoint the exact origin of the word 
+``refactoring'', as it seems to have evolved as part of a colloquial 
+terminology, more than a scientific term. There is no authoritative source for a 
+formal definition of it. 
+
+According to Martin Fowler\citing{etymology-refactoring}, there may also be more 
+than one origin of the word. The most well-known source, when it comes to the 
+origin of \emph{refactoring}, is the Smalltalk\footnote{\emph{Smalltalk}, 
+object-oriented, dynamically typed, reflective programming language. See 
+\url{http://www.smalltalk.org}} community and their famous \emph{Refactoring 
+Browser}\footnote{\url{http://st-www.cs.illinois.edu/users/brant/Refactory/RefactoringBrowser.html}} 
+described in the article \emph{A Refactoring Tool for 
+Smalltalk}\citing{refactoringBrowser1997}, published in 1997.  
+Allegedly\citing{etymology-refactoring}, the metaphor of factoring programs was 
+also present in the Forth\footnote{\emph{Forth} -- stack-based, extensible 
+programming language, without type-checking. See \url{http://www.forth.org}} 
+community, and the word ``refactoring'' is mentioned in a book by Leo Brodie, 
+called \emph{Thinking Forth}\citing{brodie1984}, first published in 
+1984\footnote{\emph{Thinking Forth} was first published in 1984 by the 
+\emph{Forth Interest Group}.  Then it was reprinted in 1994 with minor 
+typographical corrections, before it was transcribed into an electronic edition 
+typeset in \LaTeX\ and published under a Creative Commons licence in 2004. The 
+edition cited here is the 2004 edition, but the content should essentially be as 
+in 1984.}. The exact word is printed in only one place~\cite[p.~232]{brodie1984}, 
+but the term \emph{factoring} is prominent in the book, which also contains a 
+whole chapter dedicated to (re)factoring, and how to keep the (Forth) code clean 
+and maintainable.
+
  \begin{quote}
-  \ldots to restructure software by applying a series of refactorings without 
-  changing its observable behavior.~\cite{refactoring} % page 54, definition
+  \ldots good factoring technique is perhaps the most important skill for a 
+  Forth programmer.~\cite[p.~172]{brodie1984}
  \end{quote}
  
-% subsection with the history of refactoring?
+\noindent Brodie also expresses what \emph{factoring} means to him:
  
-\subsection{Motivation} % better headline?
-To get a grasp of what refactoring is all about, we can answer this question: 
-\emph{Why do people refactor?} Possible answers could include: ``To remove 
-duplication'' or ``to break up long methods''.  Practitioners of the art of 
-Design Patterns~\cite{dp} could say that they do it to introduce a long-needed 
-pattern to their program's design.  So it's safe to say that peoples' intentions 
-are to make their programs \emph{better} in some sense. But what aspects of the 
-programs are becoming improved?
+\begin{quote}
+  Factoring means organizing code into useful fragments. To make a fragment 
+  useful, you often must separate reusable parts from non-reusable parts. The  
+  reusable parts become new definitions. The non-reusable parts become arguments 
+  or parameters to the definitions.~\cite[p.~172]{brodie1984}
+\end{quote}
+
+Fowler claims that the usage of the word \emph{refactoring} did not pass between 
+the \emph{Forth} and \emph{Smalltalk} communities, but that it emerged 
+independently in each of the communities.
+
+\section{Motivation -- Why people refactor}
+There are many reasons why people want to refactor their programs. They can for 
+instance do it to remove duplication, break up long methods or to introduce 
+design patterns\citing{designPatterns} into their software systems. The shared 
+trait of all these is that people's intentions are to make their programs 
+\emph{better}, in some sense. But what aspects of their programs are becoming 
+improved?
  
  As already mentioned, people often refactor to get rid of duplication. Moving 
-identical or similar code into methods, and maybe pushing those up or down in 
-their hierarchies. Making template methods for overlapping algorithms 
-\todo{better?: functionality} and so on.  It's all about gathering what belongs 
-together and putting it all in one place.  And the result? The code is easier to 
-maintain. When removing the implicit coupling between the code snippets, the 
+identical or similar code into methods, and maybe pushing methods up or down in 
+their class hierarchies. Making template methods for overlapping 
+algorithms/functionality and so on. It is all about gathering what belongs 
+together and putting it all in one place. The resulting code is then easier to 
+maintain. When removing the implicit coupling\footnote{When duplicating code, 
+the code might not be coupled in other ways than that it is supposed to 
+represent the same functionality. So if this functionality is going to change, 
+it might need to change in more than one place, thus creating an implicit 
+coupling between the multiple pieces of code.} between code snippets, the 
  location of a bug is limited to only one place, and new functionality need only 
-to be added this one place, instead of a number of places people might not even 
-remember.
-
-The same people find out that their program contains a lot of long and 
-hard-to-grasp methods. Then what do they do? They begin dividing their methods 
-into smaller ones, using the \emph{Extract Method} 
-refactoring~\cite{refactoring}.  Then they may discover something about their 
-program that they weren't aware of before; revealing bugs they didn't know about 
-or couldn't find due to the complex structure of their program. \todo{Proof?} 
-Making the methods smaller and giving good names to the new ones clarifies the 
-algorithms and enhances the \emph{understandability} of the program. This makes 
-simple refactoring an excellent method for exploring unknown program code, or 
-code that you had forgotten that you wrote!
-
-The word \emph{simple} came up in the last section. In fact, most basic 
-refactorings are simple. The true power of them are revealed first when they are 
-combined into larger --- higher level --- refactorings, called \emph{composite 
-refactorings} \see{intro_composite}. Often the goal of such a series of 
-refactorings is a design pattern. Thus the \emph{design} can be evolved 
-throughout the lifetime of a program, opposed to designing up-front.  It's all 
-about being structured and taking small steps to improve the design.
-
-Many refactorings are aimed at lowering the coupling between different classes 
-and different layers of logic. Say for instance that the coupling between the 
-user interface and the business logic of a program is lowered. Then the business 
-logic of the program could much easier be the target of automated tests, 
-increasing the productivity in the software development process. It would also 
-be much easier to distribute the different parts of the program if they were 
-decoupled.
+to be added to this one place, instead of a number of places people might not 
+even remember.
+
+A problem you often encounter when programming is that a program contains a lot 
+of long and hard-to-grasp methods. It can then help to break the methods into 
+smaller ones, using the \ExtractMethod refactoring\citing{refactoring}. Then you 
+may discover something about a program that you were not aware of before; 
+revealing bugs you did not know about or could not find due to the complex 
+structure of your program. \todo{Proof?} Making the methods smaller and giving 
+good names to the new ones clarifies the algorithms and enhances the 
+\emph{understandability} of the program \see{magic_number_seven}. This makes 
+refactoring an excellent method for exploring unknown program code, or code that 
+you had forgotten that you wrote.
+
+Most primitive refactorings are simple. Their true power is only revealed when 
+they are combined into larger --- higher level --- refactorings, called 
+\emph{composite refactorings} \see{compositeRefactorings}. Often the goal of 
+such a series of refactorings is a design pattern. Thus the \emph{design} can be 
+evolved throughout the lifetime of a program, as opposed to designing up-front.  
+It is all about being structured and taking small steps to improve a program's 
+design.
+
+Many software design patterns are aimed at lowering the coupling between 
+different classes and different layers of logic. One of the most famous is 
+perhaps the \emph{Model-View-Controller}\citing{designPatterns} pattern. It is 
+aimed at lowering the coupling between the user interface and the business logic 
+and data representation of a program. This also has the added benefit that the 
+business logic can much more easily be the target of automated tests, increasing 
+the productivity in the software development process.  Refactoring is an 
+important tool on the way to something greater.
  
  Another effect of refactoring is that with the increased separation of concerns 
-coming out of many refactorings, the \emph{performance} is improved.  When 
-profiling programs, the problem parts are narrowed down to smaller parts of the 
-code, which are easier to tune, and optimization can be performed only where 
+coming out of many refactorings, the \emph{performance} can be improved. When 
+profiling programs, the problematic parts are narrowed down to smaller parts of 
+the code, which are easier to tune, and optimization can be performed only where 
  needed and in a more effective way.
  
+Last, but not least, and probably the best reason to refactor, is to 
+\emph{facilitate a program change}. If one has managed to keep 
+one's code clean and tidy, and the code is not bloated with design patterns that 
+are not ever going to be needed, then some refactoring might be needed to 
+introduce a design pattern that is appropriate for the change that is going to 
+happen.
+
  Refactoring program code --- with a goal in mind --- can give the code itself 
  more value. That is in the form of robustness to bugs, understandability and 
-maintainability. With the first as an obvious advantage, but with the following 
-two being also very important in software development. By incorporating 
-refactoring in the development process, bugs are found faster, new functionality 
-is added more easily and code is easier to understand by the next person exposed 
-to it, which might as well be the person who wrote it. So, refactoring can also 
-add to the monetary value of a business, by increased productivity of the 
-development process in the long run.  Where this last point also should open 
-the eyes of some nearsighted managers who seldom see beyond the next milestone.
+maintainability. Having robust code is an obvious advantage, but 
+understandability and maintainability are both very important aspects of 
+software development. By incorporating refactoring in the development process, 
+bugs are found faster, new functionality is added more easily and code is easier 
+to understand by the next person exposed to it, which might as well be the 
+person who wrote it. The consequence of this is that refactoring can increase 
+the average productivity of the development process, and thus also add to the 
+monetary value of a business in the long run. The perspective on productivity 
+and money should also be able to open the eyes of the many nearsighted managers 
+who seldom see beyond the next milestone.
+
+\section{The magical number seven}\label{magic_number_seven}
+The article \emph{The magical number seven, plus or minus two: some limits on 
+our capacity for processing information}\citing{miller1956} by George A.  
+Miller, was published in the journal \emph{Psychological Review} in 1956.  It 
+presents evidence that the number of objects a human being can hold in working 
+memory is roughly seven, plus or minus two. This number varies a bit depending 
+on the nature and complexity of the objects, but is according to Miller 
+``\ldots never changing so much as to be unrecognizable.''
+
+Miller's article culminates in the section called \emph{Recoding}, a term he 
+borrows from communication theory. The central result in this section is that by 
+recoding information, the amount of information that a human can process at a 
+time is increased. By \emph{recoding}, Miller means to group 
+objects together in chunks and give each chunk a new name that it can be 
+remembered by. By organizing objects into patterns of ever growing depth, one 
+can memorize and process a much larger amount of data than if it were to be 
+represented as its basic pieces. This grouping and renaming is analogous to how 
+many refactorings work, by grouping pieces of code and giving them a new name.  
+Examples are the fundamental \ExtractMethod and \refactoring{Extract Class} 
+refactorings\citing{refactoring}.
+
+\begin{quote}
+  \ldots recoding is an extremely powerful weapon for increasing the amount of 
+  information that we can deal with.~\cite[p.~95]{miller1956}
+\end{quote}
+
+An example from the article addresses the problem of memorizing a sequence of 
+binary digits. Let us say we have the following sequence\footnote{The example 
+  presented here is slightly modified (and shortened) from what is presented in 
+  the original article\citing{miller1956}, but it is essentially the same.} of 
+16 binary digits: ``1010001001110011''. Most of us will have a hard time 
+memorizing this sequence by only reading it once or twice. Imagine if we instead 
+translate it to this sequence: ``A273''. If you have a background in computer 
+science, it will be obvious that the latter sequence is the first sequence 
+recoded to be represented by digits in base 16. Most people should be able to 
+memorize this last sequence by only looking at it once.
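+
+\noindent As a sketch of the recoding step, assuming the usual 
+binary-to-hexadecimal conversion, the sequence can be split into groups of four 
+bits, each of which is then read as a single base-16 digit:
+\[
+  \underbrace{1010}_{\mathrm{A}}\;
+  \underbrace{0010}_{2}\;
+  \underbrace{0111}_{7}\;
+  \underbrace{0011}_{3}
+\]
+Sixteen symbols are thereby reduced to four chunks, which is well within the 
+limit of seven, plus or minus two.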
+
+Another result from the Miller article is that when the amount of information a 
+human must interpret increases, it is crucial that the translation from one code 
+to another is almost automatic for the subject to be able to remember the 
+translation, before \heshe is presented with new information to recode.  Thus 
+learning and understanding how to best organize certain kinds of data is 
+essential to efficiently handle that kind of data in the future. This is much 
+like when humans learn to read. First they must learn how to recognize letters.  
+Then they can learn distinct words, and later read sequences of words that form 
+whole sentences. Eventually, most of them will be able to read whole books and 
+briefly retell the important parts of their content. This suggests that the use 
+of design patterns\citing{designPatterns} is a good idea when reasoning about 
+computer programs. With extensive use of design patterns when creating complex 
+program structures, one does not always have to read whole classes of code to 
+comprehend how they function; it may be sufficient to only see the name of a 
+class to almost fully understand its responsibilities.
+
+\begin{quote}
+  Our language is tremendously useful for repackaging material into a few chunks 
+  rich in information.~\cite[p.~95]{miller1956}
+\end{quote}
+
+Without further evidence, these results at least indicate that refactoring 
+source code into smaller units with higher cohesion and, when needed, 
+introducing appropriate design patterns, should aid in the cause of creating 
+computer programs that are easier to maintain and have code that is more easily 
+(and better) understood.
+
+\section{Notable contributions to the refactoring literature}
+\todoin{Update with more contributions}
+
+\begin{description}
+  \item[1992] William F. Opdyke submits his doctoral dissertation called 
+    \emph{Refactoring Object-Oriented Frameworks}\citing{opdyke1992}. This 
+    work defines a set of refactorings, that are behavior preserving given that 
+    their preconditions are met. The dissertation is focused on the automation 
+    of refactorings.
+  \item[1999] Martin Fowler et al.: \emph{Refactoring: Improving the Design of 
+    Existing Code}\citing{refactoring}. This is maybe the most influential text 
+    on refactoring. It bears similarities to Opdyke's thesis\citing{opdyke1992} 
+    in the way that it provides a catalog of refactorings. But Fowler's book is 
+    more about the craft of refactoring, as he focuses on establishing a 
+    vocabulary for refactoring, together with the mechanics of different 
+    refactorings and when to perform them. His methodology is also founded on 
+  the principles of test-driven development.
+  \item[2005] Joshua Kerievsky: \emph{Refactoring to 
+    Patterns}\citing{kerievsky2005}. This book is heavily influenced by Fowler's 
+    \emph{Refactoring}\citing{refactoring} and the ``Gang of Four'' \emph{Design 
+    Patterns}\citing{designPatterns}. It is building on the refactoring 
+    catalogue from Fowler's book, but is trying to bridge the gap between 
+    \emph{refactoring} and \emph{design patterns} by providing a series of 
+    higher-level composite refactorings that make code evolve toward or away 
+    from certain design patterns. The book is trying to build up the reader's 
+    intuition around \emph{why} one would want to use a particular design 
+    pattern, and not just \emph{how}. The book is encouraging evolutionary 
+    design.  \See{relationToDesignPatterns}
+\end{description}
+
+\section{Tool support (for Java)}\label{toolSupport}
+This section will briefly compare the refactoring support of the three IDEs 
+\emph{Eclipse}\footnote{\url{http://www.eclipse.org/}}, \emph{IntelliJ 
+IDEA}\footnote{The IDE under comparison is the \emph{Community Edition}, 
+\url{http://www.jetbrains.com/idea/}} and 
+\emph{NetBeans}\footnote{\url{https://netbeans.org/}}. These are the most 
+popular Java IDEs\citing{javaReport2011}.
+
+All three IDEs provide support for the most useful refactorings, like the 
+different extract, move and rename refactorings. In fact, Java-targeted IDEs are 
+known for their good refactoring support, so this did not come as a big 
+surprise.
+
+The IDEs seem to have excellent support for the \ExtractMethod refactoring, so 
+at least they have all passed the first refactoring 
+rubicon\citing{fowlerRubicon2001,secondRubicon2012}.
+
+Regarding the \MoveMethod refactoring, the \emph{Eclipse} and \emph{IntelliJ} 
+IDEs do the job in very similar manners. In most situations they both do a 
+satisfactory job by producing the expected outcome. But they do nothing to check 
+that the result does not break the semantics of the program \see{correctness}.
+The \emph{NetBeans} IDE implements this refactoring in a somewhat 
+unsophisticated way. For starters, its default destination for the move is 
+itself, although it refuses to perform the refactoring if chosen. But the worst 
+part is that when moving the method \method{f} of the class \type{C} to the class 
+\type{X}, it will break the code. The result is shown in 
+\myref{lst:moveMethod_NetBeans}.
+
+\begin{listing}
+\begin{multicols}{2}
+\begin{minted}[samepage]{java}
+public class C {
+    private X x;
+    ...
+    public void f() {
+        x.m();
+        x.n();
+    }
+}
+\end{minted}
+
+\columnbreak
+
+\begin{minted}[samepage]{java}
+public class X {
+    ...
+    public void f(C c) {
+        c.x.m();
+        c.x.n();
+    }
+}
+\end{minted}
+\end{multicols}
+\caption{Moving method \method{f} from \type{C} to \type{X}.}
+\label{lst:moveMethod_NetBeans}
+\end{listing}
+
+NetBeans will try to make code that calls the methods \method{m} and \method{n} 
+of \type{X} by accessing them through \var{c.x}, where \var{c} is a parameter of 
+type \type{C} that is added to the method \method{f} when it is moved. (This is 
+seldom the desired outcome of this refactoring, but ironically, this ``feature'' 
+keeps NetBeans from breaking the code in the example from \myref{correctness}.) 
+If \var{c.x} for some reason is inaccessible to \type{X}, as in this case, the 
+refactoring breaks the code, and it will not compile. NetBeans presents a 
+preview of the refactoring outcome, but the preview does not catch it if the IDE 
+is about to break the program. 
+
+The IDEs under investigation seem to have fairly good support for primitive 
+refactorings, but what about more complex ones, such as the \refactoring{Extract 
+Class}\citing{refactoring}? The \refactoring{Extract Class} refactoring works by 
+creating a class, then moving members to that class and accessing them from 
+the old class via a reference to the new class. \emph{IntelliJ} handles this in 
+a fairly good manner, although, in the case of private methods, it leaves unused 
+methods behind. These are methods that delegate to a field with the type of the 
+new class, but are not used anywhere. \emph{Eclipse} has added (or withdrawn) 
+its own quirk to the Extract Class refactoring, and only allows for 
+\emph{fields} to be moved to a new class, \emph{not methods}. This makes it 
+effectively only extracting a data structure, and calling it 
+\refactoring{Extract Class} is a little misleading.  One would often be better 
+off with textual cut and paste than using the Extract Class refactoring in 
+Eclipse. When it comes to \emph{NetBeans}, it does not even seem to have made an 
+attempt at providing this refactoring. (Well, it probably has, but it does not 
+show in the IDE.) 
+
+\todoin{Visual Studio (C++/C\#), Smalltalk refactoring browser?,
+second refactoring rubicon?}
+
+\section{The relation to design patterns}\label{relationToDesignPatterns}
+
+\emph{Refactoring} and \emph{design patterns} have at least one thing in common: 
+they are both promoted by advocates of \emph{clean code}\citing{cleanCode} as 
+fundamental tools on the road to more maintainable and extendable source code.
+
+\begin{quote}
+  Design patterns help you determine how to reorganize a design, and they can 
+  reduce the amount of refactoring you need to do 
+  later.~\cite[p.~353]{designPatterns}
+\end{quote}
+
+Although sometimes associated with 
+over-engineering\citing{kerievsky2005,refactoring}, design patterns are in 
+general assumed to be good for maintainability of source code.  That may be 
+because many of them are designed to support the \emph{open/closed principle} of 
+object-oriented programming. The principle was first formulated by Bertrand 
+Meyer, the creator of the Eiffel programming language, like this: ``Modules 
+should be both open and closed.''\citing{meyer1988} It has been popularized, 
+with this as a common version: 
+
+\begin{quote}
+  Software entities (classes, modules, functions, etc.) should be open for 
+  extension, but closed for modification.\footnote{See 
+    \url{http://c2.com/cgi/wiki?OpenClosedPrinciple} or  
+    \url{https://en.wikipedia.org/wiki/Open/closed_principle}}
+\end{quote} 
+
+Maintainability is often thought of as the ability to introduce new 
+functionality without having to change too much of the old code. When 
+refactoring, the motivation is often to facilitate adding new functionality. It 
+is about factoring the old code in a way that lets the new functionality 
+benefit from the functionality already residing in a software system, 
+without having to copy old code into new. Then, the next time someone adds new 
+functionality, it is less likely that the old code has to change. Assuming that 
+a design pattern is the best way to get rid of duplication and assist in 
+implementing new functionality, it is reasonable to conclude that a design 
+pattern often is the target of a series of refactorings. Having a repertoire of 
+design patterns can also help in knowing when and how to refactor a program to 
+make it reflect certain desired characteristics.
+
+\begin{quote}
+  There is a natural relation between patterns and refactorings. Patterns are 
+  where you want to be; refactorings are ways to get there from somewhere 
+  else.~\cite[p.~107]{refactoring}
+\end{quote}
+
+This quote is wise in many contexts, but it is not always appropriate to say 
+``Patterns are where you want to be\ldots''. \emph{Sometimes}, patterns are 
+where you want to be, but only because it will benefit your design. It is not 
+true that one should always try to incorporate as many design patterns as 
+possible into a program. It is not like they have intrinsic value. They only add 
+value to a system when they support its design. Otherwise, the use of design 
+patterns may only lead to a program that is more complex than necessary.
+
+\begin{quote}
+  The overuse of patterns tends to result from being patterns happy. We are 
+  \emph{patterns happy} when we become so enamored of patterns that we simply 
+  must use them in our code.~\cite[p.~24]{kerievsky2005}
+\end{quote}
  
+This can easily happen when relying largely on up-front design. Then it is 
+natural, in the very beginning, to try to build in all the flexibility that one 
+believes will be necessary throughout the lifetime of a software system.  
+According to Joshua Kerievsky ``That sounds reasonable --- if you happen to be 
+psychic.''~\cite[p.~1]{kerievsky2005} He is advocating what he believes is a 
+better approach: To let software continually evolve. To start with a simple 
+design that meets today's needs, and tackle future needs by refactoring to 
+satisfy them. He believes that this is a more economic approach than investing 
+time and money into a design that inevitably is going to change. By relying on 
+continuously refactoring a system, its design can be made simpler without 
+sacrificing flexibility. To be able to fully rely on this approach, it is of 
+utmost importance to have a reliable suite of tests to lean on. \See{testing} This 
+makes the design process more natural and less characterized by difficult 
+decisions that have to be made before proceeding in the process, and that are 
+going to define a project for all of its unforeseeable future.
+
+\begin{comment}
  
  \section{Classification of refactorings} 
  % only interesting refactorings
@@ -177,7 +537,7 @@ the eyes of some nearsighted managers who seldom see beyond the next milestone.
  
  \subsection{Structural refactorings}
  
-\subsubsection{Basic refactorings}
+\subsubsection{Primitive refactorings}
  
  % Composing Methods
  \explanation{Extract Method}{You have a code fragment that can be grouped 
@@ -372,13 +732,14 @@ the same. Then you can pull them up.}
  \explanation{Substitute Algorithm}{You want to replace an algorithm with one 
  that is clearer.}{Replace the body of the method with the new algorithm.}
  
+\end{comment}
  
  \section{The impact on software quality}
  
-\subsection{What is meant by quality?}
+\subsection{What is software quality?}
  The term \emph{software quality} has many meanings. It all depends on the 
  context we put it in. If we look at it with the eyes of a software developer, it 
-usually mean that the software is easily maintainable and testable, or in other 
+usually means that the software is easily maintainable and testable, or in other 
  words, that it is \emph{well designed}. This often correlates with the 
  management scale, where \emph{keeping the schedule} and \emph{customer 
  satisfaction} is at the center. From the customers point of view, in addition to 
@@ -389,46 +750,285 @@ of the scope of this document.)
  
  \subsection{The impact on performance}
  \begin{quote}
-  Refactoring certainly will make software go more slowly, but it also makes the 
-  software more amenable to performance tuning.~\cite{refactoring} % page 69
+  Refactoring certainly will make software go more slowly\footnote{With today's 
+  compiler optimization techniques and performance tuning of e.g. the Java 
+virtual machine, the penalties of object creation and method calls are 
+debatable.}, but it also makes the software more amenable to performance 
+tuning.~\cite[p.~69]{refactoring}
  \end{quote}
-There is a common belief that refactoring compromises performance, due to 
-increased degree of indirection and that polymorphism is slower than 
+
+\noindent There is a common belief that refactoring compromises performance, due 
+to increased degree of indirection and that polymorphism is slower than 
  conditionals.
  
-In a survey, Demeyer~\cite{demeyer2002} disproves this view in the case of 
-polymorphism. He is doing an experiment on, what he calls, ``Transform Self Type 
+In a survey, Demeyer\citing{demeyer2002} disproves this view in the case of 
+polymorphism. He did an experiment on, what he calls, ``Transform Self Type 
  Checks'' where you introduce a new polymorphic method and a new class hierarchy 
  to get rid of a class' type checking of a ``type attribute``. He uses this kind 
  of transformation to represent other ways of replacing conditionals with 
  polymorphism as well. The experiment is performed on the C++ programming 
-language and with three different compilers and platforms. \todo{But is the 
-result better?} Demeyer concludes that, with compiler optimization turned on, 
-polymorphism beats middle to large sized if-statements and does as well as 
-case-statements.  (In accordance with his hypothesis, due to similarities 
-between the way C++ handles polymorphism and case-statements.)
+language and with three different compilers and platforms. Demeyer concludes 
+that, with compiler optimization turned on, polymorphism beats middle to large 
+sized if-statements and does as well as case-statements.  (In accordance with 
+his hypothesis, due to similarities between the way C++ handles polymorphism and 
+case-statements.)
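+
+As a minimal sketch of what such a transformation looks like, consider the 
+following Java example. It is only an illustration: Demeyer's experiment was 
+performed on C++, and the names \type{Shape}, \type{PolyShape}, \type{Circle} 
+and \type{Square} are made up for the occasion. The class \type{Shape} checks 
+its own type attribute, while the \type{PolyShape} hierarchy replaces the 
+check with a polymorphic method:
+
+\begin{minted}{java}
+// Hypothetical illustration; not taken from Demeyer's paper.
+public class SelfTypeCheckDemo {
+    // Before: the class inspects its own "type attribute".
+    static class Shape {
+        static final int CIRCLE = 0, SQUARE = 1;
+        final int kind;
+        final double size;
+
+        Shape(int kind, double size) {
+            this.kind = kind;
+            this.size = size;
+        }
+
+        double area() {
+            switch (kind) {
+                case CIRCLE: return Math.PI * size * size;
+                case SQUARE: return size * size;
+                default: throw new IllegalStateException("unknown kind");
+            }
+        }
+    }
+
+    // After: one subtype per kind, and a polymorphic method.
+    interface PolyShape {
+        double area();
+    }
+
+    static class Circle implements PolyShape {
+        final double radius;
+        Circle(double radius) { this.radius = radius; }
+        public double area() { return Math.PI * radius * radius; }
+    }
+
+    static class Square implements PolyShape {
+        final double side;
+        Square(double side) { this.side = side; }
+        public double area() { return side * side; }
+    }
+
+    public static void main(String[] args) {
+        // Both variants compute the same area for the same input.
+        System.out.println(new Shape(Shape.SQUARE, 2).area());
+        System.out.println(new Square(2).area());
+    }
+}
+\end{minted}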
+
  \begin{quote}
    The interesting thing about performance is that if you analyze most programs, 
-  you find that they waste most of their time in a small fraction of the code.  
-  ~\cite{refactoring}
+  you find that they waste most of their time in a small fraction of the 
+  code.~\cite[p.~70]{refactoring}
  \end{quote}
-So, although an increased amount of method calls could potentially slow down 
-programs, one should avoid premature optimization and sacrificing good design, 
-leaving the performance tuning until after profiling the software and having 
-isolated the actual problem areas.
-
  
+\noindent So, although an increased number of method calls could potentially 
+slow down programs, one should avoid premature optimization and sacrificing good 
+design, leaving the performance tuning until after profiling\footnote{For an 
+  example of a Java profiler, check out VisualVM: 
+  \url{http://visualvm.java.net/}} the software and having isolated the actual 
+  problem areas.
+
+\section{Composite refactorings}\label{compositeRefactorings}
+\todo{motivation, examples, manual vs automated?, what about refactoring in a 
+very large code base?}
+Generally, when thinking about refactoring, at the mechanical level, there are 
+essentially two kinds of refactorings. There are the \emph{primitive} 
+refactorings, and the \emph{composite} refactorings. 
+
+\definition{A \emph{primitive refactoring} is a refactoring that cannot be 
+expressed in terms of other refactorings.}
+
+\noindent Examples are the \refactoring{Pull Up Field} and \refactoring{Pull Up 
+Method} refactorings\citing{refactoring}, that move members up in their class 
+hierarchies.
+
+\definition{A \emph{composite refactoring} is a refactoring that can be 
+expressed in terms of two or more other refactorings.}
+
+\noindent An example of a composite refactoring is the \refactoring{Extract 
+Superclass} refactoring\citing{refactoring}. In its simplest form, it is composed 
+of the previously described primitive refactorings, in addition to the 
+\refactoring{Pull Up Constructor Body} refactoring\citing{refactoring}.  It works 
+by creating an abstract superclass that the target class(es) inherits from, then 
+by applying \refactoring{Pull Up Field}, \refactoring{Pull Up Method} and 
+\refactoring{Pull Up Constructor Body} on the members that are to be members of 
+the new superclass. For an overview of the \refactoring{Extract Superclass} 
+refactoring, see \myref{fig:extractSuperclass}.
+
+\begin{figure}[h]
+  \centering
+  \includegraphics[angle=270,width=\linewidth]{extractSuperclassItalic.pdf}
+  \caption{The Extract Superclass refactoring}
+  \label{fig:extractSuperclass}
+\end{figure}
+
+\section{Manual vs. automated refactorings}
+Refactoring is something every programmer does, even if \heshe does not know 
+the term \emph{refactoring}. Every refinement of source code that does not alter 
+the program's behavior is a refactoring. For small refactorings, such as 
+\ExtractMethod, executing it manually is a manageable task, but is still prone 
+to errors. Getting it right the first time is not easy, considering the method 
+signature and all the other aspects of the refactoring that have to be in place.  
+
+Take for instance the renaming of classes, methods and fields. For complex 
+programs these refactorings are almost impossible to get right.  Attacking them 
+with textual search and replace, or even regular expressions, will fall short on 
+these tasks. Then it is crucial to have proper tool support that can perform 
+them automatically: tools that can parse source code and thus have semantic 
+knowledge about which occurrences of which names belong to what construct in the 
+program. To even try to perform one of these complex tasks manually, one 
+would have to be very confident in the existing test suite \see{testing}.
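+
+The weakness of the textual approach can be sketched with a small Java 
+program. The helper \method{naiveRename} and the classes \type{Item} and 
+\type{Stats} in the source string are hypothetical and exist only for this 
+illustration. The intent is to rename the field \var{count} of \type{Item} 
+only:
+
+\begin{minted}{java}
+// Hypothetical illustration of renaming by textual search and replace.
+public class TextualRenameDemo {
+    /** Replaces every textual occurrence of the old name. */
+    static String naiveRename(String source, String oldName,
+                              String newName) {
+        return source.replace(oldName, newName);
+    }
+
+    public static void main(String[] args) {
+        String source =
+            "class Item { int count; }\n"
+          + "class Stats { int count; int counter; String s = \"count\"; }";
+        // Intent: rename only the field Item.count to size.
+        System.out.println(naiveRename(source, "count", "size"));
+    }
+}
+\end{minted}
+
+The replacement also hits the unrelated field in \type{Stats}, the 
+identifier \var{counter} and the string literal. A regular expression with 
+word boundaries would spare \var{counter}, but it still cannot tell the two 
+\var{count} fields apart, since that requires knowing which declaration each 
+occurrence binds to.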
+
+\section{Correctness of refactorings}\label{correctness}
+For automated refactorings to be truly useful, they must show a high degree of 
+behavior preservation. This last sentence might seem obvious, but there are 
+examples of refactorings in existing tools that break programs. I will now 
+present an example of an \ExtractMethod refactoring followed by a \MoveMethod 
+refactoring that breaks a program in both the \emph{Eclipse} and \emph{IntelliJ} 
+IDEs\footnote{The NetBeans IDE handles this particular situation without 
+  altering the program's behavior, mainly because its Move Method refactoring 
+  implementation is a bit rancid in other ways \see{toolSupport}.}. The 
+  following piece of code shows the target for the composed refactoring:
+
+\begin{minted}[linenos,samepage]{java}
+public class C {
+    public X x = new X();
+
+    public void f() {
+        x.m(this);
+        x.n();
+    }
+}
+\end{minted}
+
+\noindent The next piece of code shows the destination of the refactoring. Note 
+that the method \method{m(C c)} of class \type{X} assigns to the field \var{x} 
+of the argument \var{c} that has type \type{C}:
+
+\begin{minted}[samepage]{java}
+public class X {
+    public void m(C c) {
+        c.x = new X();
+    }
+    public void n() {}
+}
+\end{minted}
+
+The refactoring sequence works by extracting lines 5 and 6 from the original 
+class \type{C} into a method \method{f} with the statements from those lines as 
+its method body. The method is then moved to the class \type{X}. The result is 
+shown in the following two pieces of code:
+
+\begin{minted}[linenos,samepage]{java}
+public class C {
+    public X x = new X();
+
+    public void f() {
+        x.f(this);
+    }
+}
+\end{minted}
+
+\begin{minted}[linenos,samepage]{java}
+public class X {
+    public void m(C c) {
+        c.x = new X();
+    }
+    public void n() {}
+    public void f(C c) {
+        m(c);
+        n();
+    }
+}
+\end{minted}
+
+After the refactoring, the method \method{f} of class \type{C} is calling the 
+method \method{f} of class \type{X}, and the program now behaves differently than 
+before. (See line 5 of the version of class \type{C} after the refactoring.) 
+Before the refactoring, the methods \method{m} and \method{n} of class \type{X} 
+are called on different object instances (see lines 5 and 6 of the original class 
+\type{C}).  After, they are called on the same object, and the statement on line 
+3 of class \type{X} (the version after the refactoring) no longer has any 
+  effect in our example.
+
+The bug introduced in the previous example is of such a nature\footnote{Caused 
+  by aliasing. See \url{https://en.wikipedia.org/wiki/Aliasing_(computing)}} 
+  that it is very difficult to spot if the refactored code is not covered by 
+  tests.  It does not generate compilation errors, and will thus only result in 
+  a runtime error or corrupted data, which might be hard to detect.
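+
+The difference can be made observable with a small, self-contained sketch. 
+It is not the original example: the flag \var{nCalled} and the method names 
+\method{before} and \method{after} are additions made purely for 
+illustration, and adding such a flag is itself an intrusive change to the 
+code under study. The two methods hold the body of \method{f} before and 
+after the refactoring, respectively:
+
+\begin{minted}{java}
+// Instrumented variant of the example; nCalled is a hypothetical addition.
+public class AliasingDemo {
+    static class X {
+        boolean nCalled = false;
+
+        void m(C c) { c.x = new X(); }
+        void n() { nCalled = true; }
+
+        // Result of Extract Method followed by Move Method.
+        void f(C c) {
+            m(c);
+            n();
+        }
+    }
+
+    static class C {
+        X x = new X();
+
+        // Body of f before the refactoring.
+        void before() {
+            x.m(this);
+            x.n();
+        }
+
+        // Body of f after the refactoring.
+        void after() {
+            x.f(this);
+        }
+    }
+
+    public static void main(String[] args) {
+        C c1 = new C();
+        X old1 = c1.x;
+        c1.before();
+        // n() was called on the object that m() assigned to c1.x.
+        System.out.println(old1.nCalled + " " + c1.x.nCalled);
+
+        C c2 = new C();
+        X old2 = c2.x;
+        c2.after();
+        // n() was called on the old object; the new one is untouched.
+        System.out.println(old2.nCalled + " " + c2.x.nCalled);
+    }
+}
+\end{minted}
+
+Running the sketch prints \code{false true} for the original code and 
+\code{true false} for the refactored code.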
+
+\section{Refactoring and the importance of testing}\label{testing}
+\begin{quote}
+  If you want to refactor, the essential precondition is having solid 
+  tests.\citing{refactoring}
+\end{quote}
  
-\section{Correctness of refactorings} 
-% Volker's example?
-
-\section{Composite refactorings} \label{intro_composite}
-% motivation, example(s)
-% manual vs automated?
-% what about refactoring in a very large code base?
+When refactoring, there are roughly three classes of errors that can be made.  
+The first class of errors are the ones that make the code unable to compile.  
+These \emph{compile-time} errors are of the nicer kind. They flash up at the 
+moment they are made (at least when using an IDE), and are usually easy to fix.  
+The second class are the \emph{runtime} errors. Although they take a bit longer 
+to surface, they usually manifest after some time in an illegal argument 
+exception, null pointer exception or similar during the program execution.  
+These kinds of errors are a bit harder to handle, but at least they will show, 
+eventually. Then there are the \emph{behavior-changing} errors. These errors are 
+of the worst kind. They do not show up during compilation and they do not turn 
+on a blinking red light during runtime either. The program can seem to work 
+perfectly fine with them in play, but the business logic can be damaged in ways 
+that will only show up over time.
+
+For discovering runtime errors and behavior changes when refactoring, it is 
+essential to have good test coverage. Testing in this context means writing 
+automated tests. Manual testing may have its uses, but when refactoring, it is 
+automated unit testing that dominates. For discovering behavior changes it is 
+especially important to have tests that cover potential problems, since these 
+kinds of errors do not reveal themselves.
+
+Unit testing is not a way to \emph{prove} that a program is correct, but it is a 
+way to make you confident that it \emph{probably} works as desired.  In the 
+context of test driven development (commonly known as TDD), the tests are even a 
+way to define how the program is \emph{supposed} to work.  It is then, by 
+definition, working if the tests are passing.  
+
+If the test coverage for a code base is perfect, then it should, theoretically, 
+be risk-free to perform refactorings on it. This is why automated tests and 
+refactoring are such a great match.
+
+\subsection{Testing the code from the correctness section}
+The worst thing that can happen when refactoring is to introduce changes to the 
+behavior of a program, as in the example in \myref{correctness}. This example 
+may be trivial, but the essence is clear. The only problem with the example is 
+that it is not clear how to create automated tests for it, without changing it 
+in intrusive ways.
+
+Unit tests, as they are known from the different xUnit frameworks around, are 
+only suitable to test the \emph{result} of isolated operations. They cannot 
+easily (if at all) observe the \emph{history} of a program.
+
+
+\todoin{Write \ldots}
+
+Assuming a sequential (non-concurrent) program:
+
+\begin{minted}{java}
+tracematch (C c, X x) {
+  sym m before:
+    call(* X.m(C)) && args(c) && cflow(within(C));
+  sym n before:
+    call(* X.n()) && target(x) && cflow(within(C));
+  sym setCx after:
+    set(C.x) && target(c) && !cflow(m);
+
+  m n
+
+  { assert x == c.x; }
+}
+\end{minted}
+
+%\begin{minted}{java}
+%tracematch (X x1, X x2) {
+%  sym m before:
+%    call(* X.m(C)) && target(x1);
+%  sym n before:
+%    call(* X.n()) && target(x2);
+%  sym setX after:
+%    set(C.x) && !cflow(m) && !cflow(n);
+%
+%  m n
+%
+%  { assert x1 != x2; }
+%}
+%\end{minted}
+
+\section{The project}
+The aim of this project will be to investigate the relationship between a 
+composite refactoring composed of the \ExtractMethod and \MoveMethod 
+refactorings, and its impact on one or more software metrics.
+
+The composition of \ExtractMethod and \MoveMethod springs naturally out of the 
+need to move procedures closer to the data they manipulate. This composed 
+refactoring is not well described in the literature, but it is implemented in at 
+least one tool called 
+\emph{CodeRush}\footnote{\url{https://help.devexpress.com/\#CodeRush/CustomDocument3519}}, 
+that is an extension for \emph{MS Visual 
+Studio}\footnote{\url{http://www.visualstudio.com/}}. In CodeRush it is called 
+\emph{Extract Method to 
+Type}\footnote{\url{https://help.devexpress.com/\#CodeRush/CustomDocument6710}}, 
+but I choose to call it \ExtractAndMoveMethod, since I feel it better 
+communicates which primitive refactorings it is composed of. 
+
+For the metrics, I will at least measure the \emph{Coupling between object 
+classes} (CBO) metric that is described by Chidamber and Kemerer in their 
+article \emph{A Metrics Suite for Object Oriented 
+Design}\citing{metricsSuite1994}.
+
+The project will then consist in implementing the \ExtractAndMoveMethod 
+refactoring, as well as executing it over a larger code base. Then the effect of 
+the change must be measured by calculating the chosen software metrics both 
+before and after the execution. To be able to execute the refactoring 
+automatically I have to make it analyze code to determine the best selections to 
+extract into new methods.
  
  \section{Software metrics}
-
+\todoin{Is this the appropriate place to have this section?}
  
  %\part{The project}
  %\chapter{Planning the project}
@@ -436,13 +1036,71 @@ isolated the actual problem areas.
  %\chapter{Results}                   
  
  
-\chapter{Refactorings in Eclipse JDT: Design and 
-Shortcomings}\label{ch:jdt_refactorings}
+
+\chapter{\ldots}
+\todoin{write}
+\section{The problem statement}
+\section{Choosing the target language}
+Choosing which programming language to use as the target for manipulation is not 
+a very difficult task. The language has to be an object-oriented programming 
+language, and it must have existing tool support for refactoring. The 
+\emph{Java} programming language\footnote{\url{https://www.java.com/}} is the 
+dominating language when it comes to examples in the literature of refactoring, 
+and is thus a natural choice. Java is perhaps currently the most influential 
+programming language in the world, with its \emph{Java Virtual Machine} that 
+runs on all of the most popular architectures and also supports\footnote{They 
+compile to Java bytecode.} dozens of other programming languages, with 
+\emph{Scala}, \emph{Clojure} and \emph{Groovy} as the most prominent ones. Java 
+is currently the language that every other programming language is compared 
+against. It is also the primary language of the author of this thesis.
+
+\section{Choosing the tools}
+When choosing a tool for manipulating Java, there are certain criteria that 
+have to be met. First of all, the tool should have some existing refactoring 
+support that this thesis can build upon. Secondly it should provide some kind of 
+framework for parsing and analyzing Java source code. Third, it should itself be 
+open source. This is both because of the need to be able to browse the code for 
+the existing refactorings that are contained in the tool, and also because open 
+source projects hold value in themselves. Another important aspect to consider 
+is that open source projects of a certain size usually have large communities of 
+people connected to them, who are committed to answering questions regarding the 
+use and misuse of the products, which to a large degree are made by the community 
+itself.
+
+There is a certain class of tools that meet these criteria, namely the class of 
+\emph{IDEs}\footnote{\emph{Integrated Development Environment}}. These are 
+programs that are meant to support the whole production cycle of a computer 
+program, and the most popular IDEs that support Java generally have quite good 
+refactoring support.
+
+The main contenders for this thesis are the \emph{Eclipse IDE}, with the 
+\emph{Java development tools} (JDT), the \emph{IntelliJ IDEA Community Edition} 
+and the \emph{NetBeans IDE}. \See{toolSupport} Eclipse and NetBeans are both 
+free, open source and community driven, while the IntelliJ IDEA has an open 
+sourced community edition that is free of charge, but also offers an 
+\emph{Ultimate Edition} with an extended set of features, at additional cost.  
+All three IDEs support adding plugins to extend their functionality and tools 
+that can be used to parse and analyze Java source code. But one of the IDEs 
+stands out as a favorite, and that is the \emph{Eclipse IDE}. This is the most 
+popular\citing{javaReport2011} among them and seems to be the de facto standard IDE 
+for Java development regardless of platform.
+
+
+\chapter{Refactorings in Eclipse JDT: Design, Shortcomings and Wishful 
+Thinking}\label{ch:jdt_refactorings}
+
+This chapter will deal with some of the design behind refactoring support in 
+Eclipse, and the JDT in particular. It is followed by a section about 
+shortcomings of the refactoring API in terms of composition of refactorings. The 
+chapter will be concluded with a section describing some of the ways the 
+implementation of refactorings in the JDT could have worked to facilitate 
+composition of refactorings.
  
  \section{Design}
  The refactoring world of Eclipse can in general be separated into two parts: The 
-language independent part and the the part written for a specific programming 
-language -- the language that is the target of the supported refactorings. 
+language independent part and the part written for a specific programming 
+language -- the language that is the target of the supported refactorings.  
+\todo{What about the language specific part?}
  
  \subsection{The Language Toolkit}
  The Language Toolkit, or LTK for short, is the framework that is used to 
@@ -463,40 +1121,413 @@ and
  \methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkFinalConditions}), 
  in addition to the 
  \methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{createChange} 
-method that creates and returns an instance of the \type{Change} class that is 
-responsible for performing the actual workspace transformations.
-\todo{Write something about processor-based refactorings?}
+method that creates and returns an instance of the \type{Change} class.
+
+If the refactoring is to let others participate in it when it is 
+executed, the refactoring has to be a processor-based 
+refactoring\typeref{org.eclipse.ltk.core.refactoring.participants.ProcessorBasedRefactoring}.  
+It then delegates to its given 
+\typewithref{org.eclipse.ltk.core.refactoring.participants}{RefactoringProcessor} 
+for condition checking and change creation.
  
  \subsubsection{The Change Class}
-\todo{\ldots}
+This class is the base class for objects that are responsible for performing the 
+actual workspace transformations in a refactoring. The main responsibilities for 
+its subclasses are to implement the 
+\methodwithref{org.eclipse.ltk.core.refactoring.Change}{perform} and 
+\methodwithref{org.eclipse.ltk.core.refactoring.Change}{isValid} methods. The 
+\method{isValid} method verifies that the change object is valid and thus can be 
+executed by calling its \method{perform} method. The \method{perform} method 
+performs the desired change and returns an undo change that can be executed to 
+reverse the effect of the transformation done by its originating change object. 
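+
+The contract between \method{isValid} and \method{perform} can be sketched 
+with a deliberately simplified model. The interface \type{SimpleChange} and 
+the class \type{ReplaceText} below are hypothetical stand-ins and not part of 
+the LTK. The real methods additionally take a progress monitor, and 
+\method{isValid} reports a refactoring status rather than a boolean:
+
+\begin{minted}{java}
+// Simplified, hypothetical model of the Change contract; not the LTK API.
+public class UndoChangeDemo {
+    interface SimpleChange {
+        /** Tells whether the change can still be applied. */
+        boolean isValid();
+
+        /** Applies the change and returns the change that undoes it. */
+        SimpleChange perform();
+    }
+
+    /** Replaces the whole content of a text buffer. */
+    static class ReplaceText implements SimpleChange {
+        private final StringBuilder buffer;
+        private final String expected;
+        private final String replacement;
+
+        ReplaceText(StringBuilder buffer, String expected,
+                    String replacement) {
+            this.buffer = buffer;
+            this.expected = expected;
+            this.replacement = replacement;
+        }
+
+        public boolean isValid() {
+            // Valid only if the buffer still holds the expected text.
+            return buffer.toString().equals(expected);
+        }
+
+        public SimpleChange perform() {
+            if (!isValid()) {
+                throw new IllegalStateException("buffer has changed");
+            }
+            buffer.setLength(0);
+            buffer.append(replacement);
+            // The undo change is the same edit in reverse.
+            return new ReplaceText(buffer, replacement, expected);
+        }
+    }
+
+    public static void main(String[] args) {
+        StringBuilder source = new StringBuilder("int a;");
+        SimpleChange undo =
+            new ReplaceText(source, "int a;", "int b;").perform();
+        System.out.println(source); // the changed text
+        undo.perform();
+        System.out.println(source); // the original text again
+    }
+}
+\end{minted}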
+
+\subsubsection{Executing a Refactoring}\label{executing_refactoring}
+The life cycle of a refactoring generally follows two steps after creation: 
+condition checking and change creation. By letting the refactoring object be 
+handled by a 
+\typewithref{org.eclipse.ltk.core.refactoring}{CheckConditionsOperation} that
+in turn is handled by a 
+\typewithref{org.eclipse.ltk.core.refactoring}{CreateChangeOperation}, it is 
+assured that the change creation process is managed in a proper manner.
+
+The actual execution of a change object has to follow a detailed life cycle.  
+This life cycle is honored if the \type{CreateChangeOperation} is handled by a 
+\typewithref{org.eclipse.ltk.core.refactoring}{PerformChangeOperation}. If also 
+an undo manager\typeref{org.eclipse.ltk.core.refactoring.IUndoManager} is set 
+for the \type{PerformChangeOperation}, the undo change is added into the undo 
+history.
  
  \section{Shortcomings}
+This section is introduced naturally with a conclusion: The JDT refactoring 
+implementation does not facilitate composition of refactorings. 
+\todo{refine}This section will try to explain why, and also identify other 
+shortcomings of both the usability and the readability of the JDT refactoring 
+source code.
+
+I will begin at the end and work my way toward the composition part of this 
+section.
+
+\subsection{Absence of Generics in Eclipse Source Code}
+This section is not only concerning the JDT refactoring API, but also large 
+quantities of the Eclipse source code. The code shows a striking absence of the 
+Java language feature of generics. It is hard to read a class' interface when 
+methods return objects or take parameters of raw types such as \type{List} or 
+\type{Map}. This sometimes results in having to read a lot of source code to 
+understand what is going on, instead of relying on the available interfaces. In 
+addition, it results in a lot of ugly code, making the use of typecasting more 
+of a rule than an exception.
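+
+The effect on readability can be illustrated with a small sketch. The code 
+is not taken from the Eclipse sources, and the method names are made up. It 
+only contrasts a raw \type{List} with a parameterized one:
+
+\begin{minted}{java}
+import java.util.ArrayList;
+import java.util.List;
+
+// Hypothetical illustration; not code from the Eclipse sources.
+public class RawTypesDemo {
+    // Raw type: the signature says nothing about the element type.
+    @SuppressWarnings({"rawtypes", "unchecked"})
+    static List rawNames() {
+        List names = new ArrayList();
+        names.add("ExtractMethod");
+        return names;
+    }
+
+    // Parameterized type: the element type is part of the interface.
+    static List<String> typedNames() {
+        List<String> names = new ArrayList<>();
+        names.add("MoveMethod");
+        return names;
+    }
+
+    public static void main(String[] args) {
+        // The caller must know the element type and cast to it.
+        String first = (String) rawNames().get(0);
+        // No cast needed; the compiler checks the element type.
+        String second = typedNames().get(0);
+        System.out.println(first + " " + second);
+    }
+}
+\end{minted}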
+
+\subsection{Composite Refactorings Will Not Appear as Atomic Actions}
+
+\subsubsection{Missing Flexibility from JDT Refactorings}
+The JDT refactorings are not made with composition of refactorings in mind. When 
+a JDT refactoring is executed, it assumes that all conditions for it to be 
+applied successfully can be found by reading source files that have been 
+persisted to disk. They can only operate on the actual source material, and not 
+(in-memory) copies thereof. This constitutes a major disadvantage when trying to 
+compose refactorings, since if an exception occurs in the middle of a sequence of 
+refactorings, it can leave the project in a state where the composite 
+refactoring was executed only partly. It makes it hard to discard the changes 
+done without monitoring and consulting the undo manager, an approach that is not 
+bulletproof.
+
+\subsubsection{Broken Undo History}
+When designing a composed refactoring that is to be performed as a sequence of 
+refactorings, you would like it to appear as a single change to the workspace.  
+This implies that you would also like to be able to undo all the changes done by 
+the refactoring in a single step. This is not the way it appears when a sequence 
+of JDT refactorings is executed. It leaves the undo history filled up with 
+individual undo actions corresponding to every single JDT refactoring in the 
+sequence. This problem is not trivial to handle in Eclipse.  
+\See{hacking_undo_history}
+
+\section{Wishful Thinking}
+\todoin{???}
  
  \chapter{Composite Refactorings in Eclipse}
  
  \section{A Simple Ad Hoc Model}
-As pointed out in chapter \ref{ch:jdt_refactorings}, the Eclipse JDT refactoring 
-model is not very well suited for making composite refactorings. Therefore a 
-simple model using changer objects (of type \type{RefaktorChanger}) is used as 
-an abstraction layer on top of the existing Eclipse refactorings.
+As pointed out in \myref{ch:jdt_refactorings}, the Eclipse JDT refactoring model 
+is not very well suited for making composite refactorings. Therefore a simple 
+model using changer objects (of type \type{RefaktorChanger}) is used as an 
+abstraction layer on top of the existing Eclipse refactorings, instead of 
+extending the \typewithref{org.eclipse.ltk.core.refactoring}{Refactoring} class.  
+
+The use of an additional abstraction layer is a deliberate choice. It is due to 
+the problem of creating a composite 
+\typewithref{org.eclipse.ltk.core.refactoring}{Change} that can handle text 
+changes that interfere with each other. Thus, a \type{RefaktorChanger} may, or 
+may not, take advantage of one or more existing refactorings, but it is always 
+intended to make a change to the workspace.
+
+\subsection{A typical \type{RefaktorChanger}}
+The typical refaktor changer class has two responsibilities, checking 
+preconditions and executing the requested changes. This is not too different 
+from the responsibilities of an LTK refactoring, with the distinction that a 
+refaktor changer also executes the change, while an LTK refactoring is only 
+responsible for creating the object that can later be used to do the job.
+
+Checking of preconditions is typically done by an 
+\typewithref{no.uio.ifi.refaktor.analyze.analyzers}{Analyzer}. If the 
+preconditions validate, the upcoming changes are executed by an 
+\typewithref{no.uio.ifi.refaktor.change.executors}{Executor}.
  
  \section{The Extract and Move Method Refactoring}
-The Extract and Move Method Refactoring is implemented mainly using these 
-classes:
-\begin{itemize}
-  \item \type{ExtractAndMoveMethodChanger}
-  \item \type{ExtractAndMoveMethodPrefixesExtractor}
-  \item \type{Prefix}
-  \item \type{PrefixSet}
-\end{itemize}
+%The Extract and Move Method Refactoring is implemented mainly using these 
+%classes:
+%\begin{itemize}
+%  \item \type{ExtractAndMoveMethodChanger}
+%  \item \type{ExtractAndMoveMethodPrefixesExtractor}
+%  \item \type{Prefix}
+%  \item \type{PrefixSet}
+%\end{itemize}
+
+\subsection{The Building Blocks}
+This is a composite refactoring, and hence is built up using several primitive 
+refactorings. These basic building blocks are, as its name implies, the 
+\ExtractMethod refactoring\citing{refactoring} and the \MoveMethod 
+refactoring\citing{refactoring}. In Eclipse, the implementations of these 
+refactorings are found in the classes 
+\typewithref{org.eclipse.jdt.internal.corext.refactoring.code}{ExtractMethodRefactoring} 
+and 
+\typewithref{org.eclipse.jdt.internal.corext.refactoring.structure}{MoveInstanceMethodProcessor}, 
+where the last class is designed to be used together with the processor-based 
+\typewithref{org.eclipse.ltk.core.refactoring.participants}{MoveRefactoring}.
+
+\subsubsection{The ExtractMethodRefactoring Class}
+This class is quite simple in its use. The only parameters it requires for 
+construction are a compilation 
+unit\typeref{org.eclipse.jdt.core.ICompilationUnit}, the offset into the source 
+code where the extraction shall start, and the length of the source to be 
+extracted. Then you have to set the method name for the new method together with 
+its visibility and some less interesting parameters.
+
+\subsubsection{The MoveInstanceMethodProcessor Class}
+For the Move Method, the processor requires somewhat more advanced input than 
+the class for the Extract Method. For construction it requires a method 
+handle\typeref{org.eclipse.jdt.core.IMethod} for the method that is to be moved. 
+Then the target for the move has to be supplied as the variable binding from a 
+chosen variable declaration. In addition to this, one has to set some 
+parameters regarding setters/getters, as well as delegation.
+
+To make a working refactoring from the processor, one has to create a 
+\type{MoveRefactoring} with it.
  
  \subsection{The ExtractAndMoveMethodChanger Class}
-\subsection{The ExtractAndMoveMethodPrefixesExtractor Class}
+
+The \typewithref{no.uio.ifi.refaktor.changers}{ExtractAndMoveMethodChanger} 
+class is a subclass of the class 
+\typewithref{no.uio.ifi.refaktor.changers}{RefaktorChanger}. It is responsible 
+for analyzing and finding the best target for, and also executing, a composition 
+of the Extract Method and Move Method refactorings. This particular changer is 
+the one of my changers that is closest to being a true LTK refactoring. It can 
+be reworked to be one if the problems with overlapping changes are resolved. The 
+changer requires a text selection and the name of the new method, or else a 
+method name will be generated. The selection has to be of the type
+\typewithref{no.uio.ifi.refaktor.utils}{CompilationUnitTextSelection}. This 
+class is a custom extension to 
+\typewithref{org.eclipse.jface.text}{TextSelection}, which in addition to the 
+basic offset, length and similar methods, also carries an instance of the 
+underlying compilation unit handle for the selection.
+
+\subsubsection{The \type{ExtractAndMoveMethodAnalyzer}}
+The analysis and precondition checking is done by the 
+\typewithref{no.uio.ifi.refaktor.analyze.analyzers}{ExtractAnd\-MoveMethodAnalyzer}.  
+First it checks whether the selection is valid with respect to statement 
+boundaries, and that it actually contains anything. Then it 
+checks the legality of both extracting the selection and also moving it to 
+another class. If the selection is approved as legal, it is analyzed to find the 
+presumably best target to move the extracted method to.
+
+To find the most suitable target, the analyzer uses a 
+\typewithref{no.uio.ifi.refaktor.analyze.collectors}{PrefixesCollector} that 
+collects all the possible candidates for the refactoring. All the non-candidates 
+are found by an 
+\typewithref{no.uio.ifi.refaktor.analyze.collectors}{UnfixesCollector} that 
+collects all the targets that will give some kind of error if used. All prefixes 
+(and unfixes) are represented by a 
+\typewithref{no.uio.ifi.refaktor.extractors}{Prefix}, and they are collected 
+into sets of prefixes. The safe prefixes are found by subtracting from the set of 
+candidate prefixes the prefixes that enclose any of the unfixes. A prefix 
+encloses an unfix if the unfix is in the set of its sub-prefixes.  As an 
+example, \texttt{``a.b''} encloses \texttt{``a''}, as does \texttt{``a''}. The 
+safe prefixes are unified in a \type{PrefixSet}. If a prefix has only one 
+occurrence, and is a simple expression, it is considered unsuitable as a move 
+target. This occurs in statements such as \texttt{``a.foo()''}. For such 
+statements it makes no sense to extract and move them. It would only generate an 
+extra method and a call to it. 
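+
+The computation of safe prefixes described above can be sketched as follows. 
+This is only an illustration under simplifying assumptions: prefixes are 
+modeled as dot-separated strings rather than as instances of \type{Prefix}, 
+and the class \type{SafePrefixSketch} with its methods is hypothetical and not 
+part of the implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: prefixes are plain dot-separated strings here,
// not the Prefix objects used by the actual analyzer.
final class SafePrefixSketch {

    // The sub-prefixes of "a.b.c" are "a", "a.b" and "a.b.c".
    static List<String> subPrefixes(String prefix) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String segment : prefix.split("\\.")) {
            if (current.length() > 0) {
                current.append('.');
            }
            current.append(segment);
            result.add(current.toString());
        }
        return result;
    }

    // A prefix encloses an unfix if the unfix is among its sub-prefixes.
    static boolean encloses(String prefix, String unfix) {
        return subPrefixes(prefix).contains(unfix);
    }

    // Safe prefixes are the candidates that enclose none of the unfixes.
    static Set<String> safePrefixes(Set<String> candidates, Set<String> unfixes) {
        Set<String> safe = new LinkedHashSet<>();
        for (String candidate : candidates) {
            boolean enclosesUnfix = false;
            for (String unfix : unfixes) {
                if (encloses(candidate, unfix)) {
                    enclosesUnfix = true;
                    break;
                }
            }
            if (!enclosesUnfix) {
                safe.add(candidate);
            }
        }
        return safe;
    }

    public static void main(String[] args) {
        Set<String> candidates =
                new LinkedHashSet<>(Arrays.asList("a", "a.b", "c", "c.d"));
        Set<String> unfixes = new LinkedHashSet<>(Arrays.asList("a"));
        System.out.println(safePrefixes(candidates, unfixes));
    }
}
```

+With the unfix \texttt{``a''}, both \texttt{``a''} and \texttt{``a.b''} are 
+removed from the candidates in this sketch, while \texttt{``c''} and 
+\texttt{``c.d''} remain.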
+
+\todoin{Clean up sections/subsections.}
+
+\subsubsection{The \type{ExtractAndMoveMethodExecutor}}
+If the analysis finds a possible target for the composite refactoring, it is 
+executed by an 
+\typewithref{no.uio.ifi.refaktor.change.executors}{ExtractAndMoveMethodExecutor}.  
+It is composed of the two executors known as 
+\typewithref{no.uio.ifi.refaktor.change.executors}{ExtractMethodRefactoringExecutor} 
+and 
+\typewithref{no.uio.ifi.refaktor.change.executors}{MoveMethodRefactoringExecutor}.  
+The \type{ExtractAndMoveMethodExecutor} is responsible for gluing the two 
+together by feeding the \type{MoveMethod\-RefactoringExecutor} with the 
+resources needed after executing the extract method refactoring.  
+\See{postExtractExecution}
+
+\subsubsection{The \type{ExtractMethodRefactoringExecutor}}
+This executor is responsible for creating and executing an instance of the 
+\type{ExtractMethodRefactoring} class. It is also responsible for collecting 
+some post execution resources that can be used to find the method handle for the 
+extracted method, as well as information about its parameters, including the 
+variable they originated from.
+
+\subsubsection{The \type{MoveMethodRefactoringExecutor}}
+This executor is responsible for creating and executing an instance of the 
+\type{MoveRefactoring}. The move refactoring is a processor-based refactoring, 
+and for the Move Method refactoring it is the \type{MoveInstanceMethodProcessor} 
+that is used.
+
+The handle for the method to be moved is found on the basis of the information 
+gathered after the execution of the Extract Method refactoring. The only 
+information the \type{ExtractMethodRefactoring} shares after its execution, 
+regarding finding the method handle, is the textual representation of the new 
+method signature. Therefore it must be parsed, and the strings for the types of 
+the parameters must be found and translated to a form that can be used to look up 
+the method handle from its type handle. They have to be in the unresolved 
+form.\todo{Elaborate?} The name for the type is found from the original 
+selection, since an extracted method must end up in the same type as the 
+originating method.
+
+When analyzing a selection prior to performing the Extract Method refactoring, a 
+target is chosen. It has to be a variable binding, so it is either a field or a 
+local variable/parameter. If the target is a field, it can be used with the 
+\type{MoveInstanceMethodProcessor} as it is, since the extracted method still is 
+in its scope. But if the target is local to the originating method, the target 
+that is to be used for the processor must be among its parameters. Thus the 
+target must be found among the extracted method's parameters. This is done by 
+finding the parameter information object that corresponds to the parameter that 
+was declared on the basis of the original target's variable when the method was 
+extracted. (The extracted method must take one such parameter for each local 
+variable that is declared outside the selection that is extracted.) To match the 
+original target with the correct parameter information object, the key for the 
+information object is compared to the key from the original target's binding.  
+The source code must then be parsed to find the method declaration for the 
+extracted method. The new target must be found by searching through the 
+parameters of the declaration and choosing the one that has the same type as the 
+old binding from the parameter information object, as well as the same name that 
+is provided by the parameter information object.
+
+
+\subsection{Finding the IMethod}\label{postExtractExecution}
+\todoin{Rename section. Write.}
+
+\subsection{Property collectors}
+The prefixes and unfixes are found by property 
+collectors\typeref{no.uio.ifi.refaktor.extractors.collectors.PropertyCollector}.  
+A property collector follows the visitor pattern\citing{designPatterns} and is 
+of the \typewithref{org.eclipse.jdt.core.dom}{ASTVisitor} type.  An 
+\type{ASTVisitor} visits nodes in an abstract syntax tree that forms the Java 
+document object model. The tree consists of nodes of type 
+\typewithref{org.eclipse.jdt.core.dom}{ASTNode}.
+
+\subsubsection{The PrefixesCollector}
+The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{PrefixesCollector} 
+finds prefixes that make up the basis for calculating move targets for the 
+Extract and Move Method refactoring. It visits expression 
+statements\typeref{org.eclipse.jdt.core.dom.ExpressionStatement} and creates 
+prefixes from their expressions in the case of method invocations. The prefixes 
+found are registered with a prefix set, together with all their sub-prefixes.
+\todo{Rewrite in the case of changes to the way prefixes are found}
+
+\subsubsection{The UnfixesCollector}\label{unfixes}
+The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{UnfixesCollector} 
+finds unfixes within a selection. That is, prefixes that cannot be used as a 
+basis for finding a move target in a refactoring.
+
+An unfix can be a name that is assigned to within a selection. The reason this 
+cannot be allowed is that the result would be an assignment to the 
+\type{this} keyword, which is not valid in Java \see{eclipse_bug_420726}.
+
+Prefixes that originate from variable declarations within the same selection 
+are also considered unfixes. This is because when a method is moved, it needs to 
+be called through a variable. If this variable is also within the method that is 
+to be moved, this obviously cannot be done.
+
+Also considered as unfixes are variable references that are of types that are not 
+suitable for moving a method to. This can be either because it is not 
+physically possible to move the method to the desired class or that it will 
+cause compilation errors by doing so.
+
+If the type binding for a name is not resolved it is considered an unfix. The 
+same applies to types that are only found in compiled code, so they have no 
+underlying source that is accessible to us. (E.g. the \type{java.lang.String} 
+class.)
+
+Interface types are not suitable as targets. This is simply because interfaces 
+in Java cannot contain methods with bodies. (This thesis does not deal with 
+features of Java versions later than Java 7. Java 8 has interfaces with default 
+implementations of methods.) Neither are local types allowed. This applies to 
+both local and anonymous classes. Anonymous classes are effectively the same as 
+interface types with respect to unfixes. Local classes could in theory be used 
+as targets, but this is not possible due to limitations of the implementation of 
+the Extract and Move Method refactoring. The problem is that the refactoring is 
+done in two steps, so the intermediate state between the two refactorings would 
+not be legal Java code. In the case of local classes, the problem is that, in 
+the intermediate step, a selection referencing a local class would need to take 
+the local class as a parameter if it were to be extracted to a new method. This 
+new method would need to live in the scope of the declaring class of the 
+originating method. The local class would then not be in the scope of the 
+extracted method, thus bringing the source code into an illegal state. One could 
+imagine that the method was extracted and moved in one operation, without an 
+intermediate state. Then it would make sense to include variables with types of 
+local classes in the set of legal targets, since the local classes would then be 
+in the scopes of the method calls. Whether this makes any difference for software 
+metrics that measure coupling would be a different discussion.
+
+\begin{listing}
+\begin{multicols}{2}
+\begin{minted}[]{java}
+// Before
+void declaresLocalClass() {
+  class LocalClass {
+    void foo() {}
+    void bar() {}
+  }
+
+  LocalClass inst =
+    new LocalClass();
+  inst.foo();
+  inst.bar();
+}
+\end{minted}
+
+\columnbreak
+
+\begin{minted}[]{java}
+// After Extract Method
+void declaresLocalClass() {
+  class LocalClass {
+    void foo() {}
+    void bar() {}
+  }
+
+  LocalClass inst =
+    new LocalClass();
+  fooBar(inst);
+}
+
+// Intermediate step
+void fooBar(LocalClass inst) {
+  inst.foo();
+  inst.bar();
+}
+\end{minted}
+\end{multicols}
+\caption{When Extract and Move Method tries to use a variable with a local type 
+as the move target, an intermediate step is taken that is not allowed. Here: 
+\type{LocalClass} is not in the scope of \method{fooBar} in its intermediate 
+location.}
+\label{lst:extractMethod_LocalClass}
+\end{listing}
+
+The last class of names that are considered unfixes is names used in null tests.  
+These are tests that read like this: if \texttt{<name>} equals \var{null} then 
+do something. If we allowed variables used in those kinds of expressions as 
+targets for moving methods, we would end up with code containing boolean 
+expressions like \texttt{this == null}, which would not be meaningful, since 
+\var{this} would never be \var{null}.
+
  \subsection{The Prefix Class}
-\subsection{The PrefixSet Class}
+This class exists mainly for holding data about a prefix, such as the expression 
+that the prefix represents and the occurrence count of the prefix within a 
+selection. In addition to this, it has some functionality such as calculating 
+its sub-prefixes and intersecting it with another prefix. The definition of the 
+intersection between two prefixes is a prefix representing the longest common 
+expression between the two.
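+
+One possible reading of the intersection is sketched below. It assumes that 
+``longest common expression'' means the longest common sequence of leading 
+segments, and it models prefixes as dot-separated strings. The class 
+\type{PrefixIntersectionSketch} and its \method{intersect} method are 
+hypothetical and only meant as an illustration.

```java
// Hypothetical illustration: a prefix is a dot-separated string, and the
// intersection is assumed to be the longest common run of leading segments.
final class PrefixIntersectionSketch {

    static String intersect(String first, String second) {
        String[] a = first.split("\\.");
        String[] b = second.split("\\.");
        int limit = Math.min(a.length, b.length);
        StringBuilder common = new StringBuilder();
        // Stop at the first segment where the two prefixes differ.
        for (int i = 0; i < limit && a[i].equals(b[i]); i++) {
            if (i > 0) {
                common.append('.');
            }
            common.append(a[i]);
        }
        return common.toString();
    }

    public static void main(String[] args) {
        System.out.println(intersect("a.b.c", "a.b.d"));
    }
}
```

+Under this reading, intersecting \texttt{``a.b.c''} with \texttt{``a.b.d''} 
+gives \texttt{``a.b''}, and two prefixes with different first segments have 
+an empty intersection.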
  
-\subsection{Hacking the Refactoring Undo History}
-\todo{Where to put this section?}
+\subsection{The PrefixSet Class}
+A prefix set holds elements of type \type{Prefix}. It is implemented with the 
+help of a \typewithref{java.util}{HashMap} and contains some typical set 
+operations, but it does not implement the \typewithref{java.util}{Set} 
+interface, since the prefix set does not need all of the functionality a 
+\type{Set} requires to be implemented. In addition, it needs some other 
+functionality not found in the \type{Set} interface. So due to the relatively 
+limited use of prefix sets, and that it almost always needs to be referenced as 
+such, and not a \type{Set<Prefix>}, it remains as an ad hoc solution to a 
+concrete problem.
+
+There are two ways of adding prefixes to a \type{PrefixSet}. The first is through 
+its \method{add} method. This works like one would expect from a set. It adds 
+the prefix to the set if it does not already contain the prefix. The other way 
+is to \emph{register} the prefix with the set. When registering a prefix, if the 
+set does not contain the prefix, it is just added. If the set contains the 
+prefix, its count gets incremented. This is how the occurrence count is handled.
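+
+The difference between adding and registering can be sketched with a map from 
+prefix to occurrence count. The class \type{CountingPrefixSetSketch} below is 
+hypothetical, uses strings in place of \type{Prefix} objects, and assumes that 
+a newly added prefix starts with a count of one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a counting set backed by a HashMap.
// Assumption: a prefix that is added for the first time gets the count 1.
final class CountingPrefixSetSketch {

    private final Map<String, Integer> counts = new HashMap<>();

    // Behaves like Set.add: a prefix that is already present is left untouched.
    void add(String prefix) {
        counts.putIfAbsent(prefix, 1);
    }

    // Adds the prefix if it is missing, otherwise increments its count.
    void register(String prefix) {
        counts.merge(prefix, 1, Integer::sum);
    }

    boolean contains(String prefix) {
        return counts.containsKey(prefix);
    }

    // Returns 0 for prefixes that are not in the set.
    int count(String prefix) {
        return counts.getOrDefault(prefix, 0);
    }

    public static void main(String[] args) {
        CountingPrefixSetSketch set = new CountingPrefixSetSketch();
        set.add("a");
        set.add("a");      // no effect, already present
        set.register("a"); // count is incremented
        System.out.println(set.count("a"));
    }
}
```

+In this sketch, adding the same prefix twice leaves its count unchanged, while 
+each registration of an existing prefix increments it.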
+
+The prefix set also computes the set of prefixes that do not enclose any 
+prefixes of another set. This is a kind of set difference operation, restricted 
+to enclosing prefixes.
+
+\subsection{Hacking the Refactoring Undo 
+History}\label{hacking_undo_history}
+\todoin{Where to put this section?}
  
  As an attempt to make multiple subsequent changes to the workspace appear as a 
  single action (i.e. make the undo changes appear as such), I tried to alter 
@@ -509,8 +1540,8 @@ Eclipse refactorings, and then add them to a composite
  change\typeref{org.eclipse.ltk.core.refactoring.CompositeChange} that could be 
  added back to the manager. The interface of the undo manager does not offer a 
  way to remove/pop the last added undo change, so a possible solution could be to 
-decorate \cite{dp} the undo manager, to intercept and collect the undo changes 
-before delegating to the \method{addUndo} 
+decorate\citing{designPatterns} the undo manager, to intercept and collect the 
+undo changes before delegating to the \method{addUndo} 
  method\methodref{org.eclipse.ltk.core.refactoring.IUndoManager}{addUndo} of the 
  manager. Instead of giving it the intended undo change, a null change could be 
  given to prevent it from making any changes if run. Then one could let the 
@@ -549,6 +1580,394 @@ design of the refactoring undo management is partly to be blamed for this, as it
  it is to complex to be easily manipulated.
  
  
+
+
+\chapter{Analyzing Source Code in Eclipse}
+
+\section{The Java model}
+The Java model of Eclipse is its internal representation of a Java project. It 
+is light-weight, and has only limited possibilities for manipulating source 
+code. It is typically used as a basis for the Package Explorer in Eclipse.
+
+The elements of the Java model are only handles to the underlying elements. This 
+means that the underlying element of a handle does not need to actually exist.  
+Hence the user of a handle must always check that it exists by calling the 
+\method{exists} method of the handle.
+
+The handles with descriptions are listed in \myref{tab:javaModelTable}.
+
+\begin{table}[h]
+  \centering
+
+  \newcolumntype{L}[1]{>{\hsize=#1\hsize\raggedright\arraybackslash}X}%
+  % sum must equal number of columns (3)
+  \begin{tabularx}{\textwidth}{| L{0.7} | L{1.1} | L{1.2} |} 
+    \hline
+    \textbf{Project Element} & \textbf{Java Model element} & 
+    \textbf{Description} \\
+    \hline
+    Java project & \type{IJavaProject} & The Java project which contains all other objects. \\
+    \hline
+    Source folder /\linebreak[2] binary folder /\linebreak[3] external library & 
+    \type{IPackageFragmentRoot} & Holds source or binary files, can be a folder 
+    or a library (zip / jar file). \\
+    \hline
+    Each package & \type{IPackageFragment} & Each package is below the 
+    \type{IPackageFragmentRoot}. Sub-packages are not leaves of the package; 
+    they are listed directly under \type{IPackageFragmentRoot}. \\
+    \hline
+    Java Source file & \type{ICompilationUnit} & The Source file is always below 
+    the package node. \\
+    \hline
+    Types /\linebreak[2] Fields /\linebreak[3] Methods & \type{IType} / 
+    \linebreak[0]
+    \type{IField} /\linebreak[3] \type{IMethod} & Types, fields and methods. \\
+    \hline
+  \end{tabularx}
+  \caption{The elements of the Java Model. {\footnotesize Taken from 
+    \url{http://www.vogella.com/tutorials/EclipseJDT/article.html}}}
+  \label{tab:javaModelTable}
+\end{table}
+
+The hierarchy of the Java Model is shown in \myref{fig:javaModel}.
+
+\begin{figure}[h]
+  \centering
+  \begin{tikzpicture}[%
+  grow via three points={one child at (0,-0.7) and
+  two children at (0,-0.7) and (0,-1.4)},
+  edge from parent path={(\tikzparentnode.south west)+(0.5,0) |- 
+  (\tikzchildnode.west)}]
+  \tikzstyle{every node}=[draw=black,thick,anchor=west]
+  \tikzstyle{selected}=[draw=red,fill=red!30]
+  \tikzstyle{optional}=[dashed,fill=gray!50]
+  \node {\type{IJavaProject}}
+    child { node {\type{IPackageFragmentRoot}}
+      child { node {\type{IPackageFragment}}
+        child { node {\type{ICompilationUnit}}
+          child { node {\type{IType}}
+            child { node {\type{\{ IType \}*}}
+              child { node {\type{\ldots}}}
+            }
+            child [missing] {}
+            child { node {\type{\{ IField \}*}}}
+            child { node {\type{IMethod}}
+              child { node {\type{\{ IType \}*}}
+                child { node {\type{\ldots}}}
+              }
+            }
+            child [missing] {}
+            child [missing] {}
+            child { node {\type{\{ IMethod \}*}}}
+          }
+          child [missing] {}
+          child [missing] {}
+          child [missing] {}
+          child [missing] {}
+          child [missing] {}
+          child [missing] {}
+          child [missing] {}
+          child { node {\type{\{ IType \}*}}}
+        }
+        child [missing] {}
+        child [missing] {}
+        child [missing] {}
+        child [missing] {}
+        child [missing] {}
+        child [missing] {}
+        child [missing] {}
+        child [missing] {}
+        child [missing] {}
+        child { node {\type{\{ ICompilationUnit \}*}}}
+      }
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child [missing] {}
+      child { node {\type{\{ IPackageFragment \}*}}}
+    }
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child [missing] {}
+    child { node {\type{\{ IPackageFragmentRoot \}*}}}
+    ;
+  \end{tikzpicture}
+  \caption{The Java model of Eclipse. ``\type{\{ SomeElement \}*}'' means 
+  \type{SomeElement} zero or more times. For recursive structures, 
+  ``\type{\ldots}'' is used.}
+  \label{fig:javaModel}
+\end{figure}
+
+\section{The Abstract Syntax Tree}
+Eclipse follows the common paradigm of using an abstract syntax tree for 
+source code analysis and manipulation.
+
+When parsing program source code into something that can be used as a foundation 
+for analysis, the start of the process follows the same steps as in a compiler.  
+This is natural, because the way a compiler analyzes code is no different 
+from how source manipulation programs would do it, except for some properties of 
+the code that are analyzed in the parser, and that they may differ in what 
+kinds of properties they analyze.  Thus the process of translating source code 
+into a structure that is suitable for analysis can be seen as a kind of 
+interrupted compilation process.
+
+The process starts with a \emph{scanner}, or lexer. The job of the scanner is to 
+read the source code and divide it into tokens for the parser. Therefore, it is 
+also sometimes called a tokenizer. A token is a logical unit, defined in the 
+language specification, consisting of one or more consecutive characters.  In 
+the Java language the tokens can for instance be the \var{this} keyword, a curly 
+bracket \var{\{} or a \var{nameToken}. A token is recognized by the scanner on the 
+basis of something equivalent to a regular expression. This part of the process 
+is often implemented with the use of a finite automaton. In fact, it is common to 
+specify the tokens as regular expressions, which in turn are translated into a 
+finite automaton lexer. This process can be automated.
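+
+As a rough illustration of a scanner that recognizes tokens from regular 
+expressions, consider the sketch below. The class \type{ScannerSketch} and its 
+token kinds are hypothetical and cover only a tiny, made-up subset of Java. It 
+is not how the Eclipse scanner is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a scanner; the token kinds are a tiny, made-up
// subset of the Java language.
final class ScannerSketch {

    // One alternative per token kind, each given as a regular expression.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<keyword>this\\b)"
            + "|(?<name>[A-Za-z_][A-Za-z0-9_]*)"
            + "|(?<number>[0-9]+)"
            + "|(?<symbol>[{}()+*.;=])");

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(source);
        int position = 0;
        while (position < source.length()) {
            if (Character.isWhitespace(source.charAt(position))) {
                position++; // whitespace only separates tokens
                continue;
            }
            // Try to match a token that starts exactly at the current position.
            matcher.region(position, source.length());
            if (!matcher.lookingAt()) {
                throw new IllegalArgumentException(
                        "Unexpected character at offset " + position);
            }
            tokens.add(kindOf(matcher) + ":" + matcher.group());
            position = matcher.end();
        }
        return tokens;
    }

    private static String kindOf(Matcher matcher) {
        if (matcher.group("keyword") != null) return "KEYWORD";
        if (matcher.group("name") != null) return "NAME";
        if (matcher.group("number") != null) return "NUMBER";
        return "SYMBOL";
    }

    public static void main(String[] args) {
        System.out.println(tokenize("this.count = 42;"));
    }
}
```

+Note that the keyword alternative is tried before the name alternative, so 
+that \var{this} is not reported as an ordinary name, while a name that merely 
+starts with the same letters is still scanned as a name.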
+
+The program component used to translate a stream of tokens into something 
+meaningful is called a parser. A parser is fed tokens from the scanner and 
+performs an analysis of the structure of a program. It verifies that the syntax 
+is correct according to the grammar rules of a language, which are usually 
+specified in a context-free grammar, and often in a variant of the 
+\emph{Backus--Naur 
+Form}\footnote{\url{https://en.wikipedia.org/wiki/Backus-Naur\_Form}}. The 
+result coming from the parser is in the form of an \emph{Abstract Syntax Tree}, 
+AST for short. It is called \emph{abstract}, because the structure does not 
+contain all of the tokens produced by the scanner. It only contains logical 
+constructs, and because it forms a tree, all kinds of parentheses and brackets 
+are implicit in the structure. It is this AST that is used when performing the 
+semantic analysis of the code.
+
+As an example we can think of the expression \code{(5 + 7) * 2}. The root of 
+this tree would in Eclipse be an \type{InfixExpression} with the operator
+\var{TIMES}, and a left operand that is also an \type{InfixExpression} with the 
+operator \var{PLUS}. The left operand \type{InfixExpression}, has in turn a left 
+operand of type \type{NumberLiteral} with the value \var{``5''} and a right 
+operand \type{NumberLiteral} with the value \var{``7''}.  The root will have a 
+right operand of type \type{NumberLiteral} and value \var{``2''}. The AST for 
+this expression is illustrated in \myref{fig:astInfixExpression}.
+
+Contrary to the Java Model, an abstract syntax tree is a heavy-weight 
+representation of source code. It contains information about properties like type 
+bindings for variables and variable bindings for names. 
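+
+The tree for the expression \code{(5 + 7) * 2} can also be built by hand with 
+a few simplified node classes. The classes in the sketch below are 
+hypothetical. They only borrow the names \type{InfixExpression} and 
+\type{NumberLiteral} from Eclipse and are not the classes of 
+\type{org.eclipse.jdt.core.dom}.

```java
// Hypothetical, heavily simplified node types that only borrow their names
// from the Eclipse AST; they are not the org.eclipse.jdt.core.dom classes.
final class AstSketch {

    abstract static class Expression {
        abstract int evaluate();
        abstract String toSource();
    }

    static final class NumberLiteral extends Expression {
        private final String token;

        NumberLiteral(String token) {
            this.token = token;
        }

        @Override
        int evaluate() {
            return Integer.parseInt(token);
        }

        @Override
        String toSource() {
            return token;
        }
    }

    enum Operator { PLUS, TIMES }

    static final class InfixExpression extends Expression {
        private final Expression left;
        private final Operator operator;
        private final Expression right;

        InfixExpression(Expression left, Operator operator, Expression right) {
            this.left = left;
            this.operator = operator;
            this.right = right;
        }

        @Override
        int evaluate() {
            int l = left.evaluate();
            int r = right.evaluate();
            return operator == Operator.PLUS ? l + r : l * r;
        }

        // The tree stores no parentheses; they are derived from its shape.
        @Override
        String toSource() {
            String symbol = operator == Operator.PLUS ? "+" : "*";
            return "(" + left.toSource() + " " + symbol + " " + right.toSource() + ")";
        }
    }

    // Builds the tree for (5 + 7) * 2.
    static Expression example() {
        Expression sum = new InfixExpression(
                new NumberLiteral("5"), Operator.PLUS, new NumberLiteral("7"));
        return new InfixExpression(sum, Operator.TIMES, new NumberLiteral("2"));
    }

    public static void main(String[] args) {
        Expression root = example();
        System.out.println(root.toSource() + " = " + root.evaluate());
    }
}
```

+The sketch stores no parentheses in the tree. The grouping of \code{5 + 7} is 
+expressed solely by the sum being the left operand of the multiplication.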
+
+
+\begin{figure}[h]
+  \centering
+  \begin{tikzpicture}[scale=0.7]
+  \tikzset{level distance=40pt}
+  \tikzset{edge from parent/.append style={thick}}
+  \tikzset{every internal node/.style={ellipse,draw,fill=lightgray}}
+  \tikzset{every leaf node/.style={draw=none,fill=none}}
+
+  \Tree [.\type{InfixExpression} [.\type{InfixExpression}
+    [.\type{NumberLiteral} \var{``5''} ]  [.\type{Operator} \var{PLUS} ] 
+    [.\type{NumberLiteral} \var{``7''} ] ]
+  [.\type{Operator} \var{TIMES} ]
+    [.\type{NumberLiteral} \var{``2''} ] 
+  ]
+  \end{tikzpicture}
+  \caption{The abstract syntax tree for the expression \code{(5 + 7) * 2}.}
+  \label{fig:astInfixExpression}
+\end{figure}
+
+\subsection{The AST in Eclipse}
+In Eclipse, every node in the AST is an instance of a subclass of the abstract 
+superclass \typewithref{org.eclipse.jdt.core.dom}{ASTNode}. Every \type{ASTNode}, 
+among a lot of other things, provides information about its position and length 
+in the source code, as well as a reference to its parent and to the root of the 
+tree.
+
+The root of the AST is always of type \type{CompilationUnit}. It is not the same 
+as an instance of an \type{ICompilationUnit}, which is the compilation unit 
+handle of the Java model. The children of a \type{CompilationUnit} are an 
+optional \type{PackageDeclaration}, zero or more nodes of type 
+\type{ImportDeclaration} and all its top-level type declarations, which have node 
+type \type{AbstractTypeDeclaration}.
+
+An \type{AbstractType\-Declaration} can be one of the types 
+\type{AnnotationType\-Declaration}, \type{Enum\-Declaration} or 
+\type{Type\-Declaration}. The children of an \type{AbstractType\-Declaration} 
+must be a subtype of a \type{BodyDeclaration}. These subtypes are: 
+\type{AnnotationTypeMember\-Declaration}, \type{EnumConstant\-Declaration}, 
+\type{Field\-Declaration}, \type{Initializer} and \type{Method\-Declaration}.
+
+Of the body declarations, the \type{Method\-Declaration} is the most interesting 
+one. Its children include lists of modifiers, type parameters, parameters and 
+exceptions. It has a return type node and a body node. The body, if present, is 
+of type \type{Block}. A \type{Block} is itself a \type{Statement}, and its 
+children are a list of \type{Statement} nodes.
+
+There are too many subtypes of the abstract type \type{Statement} to list, but 
+there exists a subtype of \type{Statement} for every statement type of Java, as 
+one would expect. This also applies to the abstract type \type{Expression}.  
+However, the expression \type{Name} is a little special, since it is used both 
+as an operand in compound expressions and for names in type declarations 
+and such.
+
+\begin{figure}[h]
+  \centering
+  \begin{tikzpicture}[scale=0.6]
+  \tikzset{level distance=40pt}
+  \tikzset{edge from parent/.append style={thick}}
+  \tikzset{every tree node/.style={align=center}}
+  \tikzset{every internal node/.style={ellipse,draw,fill=lightgray}}
+  \tikzset{every leaf node/.style={draw=none,fill=none}}
+
+  \Tree [.\type{CompilationUnit} [.\type{[ PackageDeclaration ]} ]
+    [.\type{\{ ImportDeclaration \}*} ]
+    [.\type{\{ AbstractTypeDeclaration \}+} ]
+  ]
+  \end{tikzpicture}
+  \caption{The format of the abstract syntax tree in Eclipse.}
+  \label{fig:astEclipse}
+\end{figure}
+
+
+\section{Illegal selections}
+
+\subsection{Not all branches end in return}
+
+\subsection{Ambiguous return statement}
+This problem occurs when there is either more than one assignment to a local 
+variable that is used outside of the selection, or there is only one, but there 
+are also return statements in the selection.
+
+\todoin{Explain why we do not need to consider variables assigned inside 
+local/anonymous classes. (The referenced variables need to be final and so 
+on\ldots)}
+
+\chapter{Eclipse Bugs Found}
+\todoin{Add other things and change headline?}
+
+\section{Eclipse bug 420726: Code is broken when moving a method that is 
+assigning to the parameter that is also the move 
+destination}\label{eclipse_bug_420726}
+This bug\footnote{\url{https://bugs.eclipse.org/bugs/show\_bug.cgi?id=420726}}  
+was found when analyzing what kinds of names were to be considered as 
+\emph{unfixes} \see{unfixes}.
+
+\subsection{The bug}
+The bug emerges when trying to move a method from one class to another, and when 
+the target for the move (must be a variable, local or field) is both a parameter 
+variable and also is assigned to within the method body. Eclipse allows this to 
+happen, although it is the sure path to a compilation error. This is because we 
+would then have an assignment to a \var{this} expression, which is not allowed 
+in Java.
+
+\subsection{The solution}
+The solution to this problem is to add all simple names that are assigned to in 
+a method body to the set of unfixes.
+
+\section{Eclipse bug 429416: IAE when moving method from anonymous class}
+I 
+discovered\footnote{\url{https://bugs.eclipse.org/bugs/show\_bug.cgi?id=429416}} 
+this bug during a batch change on the \type{org.eclipse.jdt.ui} project.
+
+\subsection{The bug}
+This bug surfaces when trying to use the Move Method refactoring to move a 
+method from an anonymous class to another class. This happens both for my 
+simulation as well as in Eclipse, through the user interface. It only occurs 
+when Eclipse analyzes the program and finds it necessary to pass an instance of 
+the originating class as a parameter to the moved method. I.e., it wants to pass a 
+\var{this} expression. The execution ends in an 
+\typewithref{java.lang}{IllegalArgumentException} in 
+\typewithref{org.eclipse.jdt.core.dom}{SimpleName} and its 
+\method{setIdentifier(String)} method. The creation of the simple name is 
+attempted in the method
+\methodwithref{org.eclipse.jdt.internal.corext.refactoring.structure.\\MoveInstanceMethodProcessor}{createInlinedMethodInvocation} 
+so the \type{MoveInstanceMethodProcessor} was an early and clear suspect.
+
+The \method{createInlinedMethodInvocation} is the method that creates a method 
+invocation to replace the previous invocation of the method that was moved. From 
+its code it can be read that when a \var{this} expression is going to be passed 
+in to the invocation, it shall be qualified with the name of the original 
+method's declaring class, if the declaring class is either an anonymous class or 
+a member class. The problem with this is that an anonymous class does not have 
+a name, hence the term \emph{anonymous} class! Therefore, when its name, an 
+empty string, is passed into 
+\methodwithref{org.eclipse.jdt.core.dom.AST}{newSimpleName} it all ends in an 
+\type{IllegalArgumentException}.
+
+\subsection{How I solved the problem}
+Since the \type{MoveInstanceMethodProcessor} is instantiated in the 
+\typewithref{no.uio.ifi.refaktor.change.executors}{MoveMethod\-RefactoringExecutor}, 
+and only needs to be a 
+\typewithref{org.eclipse.ltk.core.refactoring.participants}{MoveProcessor}, I 
+was able to copy the code for the original move processor and modify it so that 
+it works better for me. It is now called 
+\typewithref{no.uio.ifi.refaktor.refactorings.processors}{ModifiedMoveInstanceMethodProcessor}.  
+The only modification done (in addition to some imports and suppression of 
+warnings), is in the \method{createInlinedMethodInvocation}. When the declaring 
+class of the method to move is anonymous, the \var{this} expression in the 
+parameter list is not qualified with the declaring class' (empty) name.
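+
+The decision made by the modified processor can be sketched as follows. This is 
+an illustration only, assuming the behavior described above; 
+\code{ThisQualifierSketch} and \code{thisArgument} are hypothetical names, and 
+the real code works on AST nodes rather than strings.

```java
class ThisQualifierSketch {

    // Hypothetical helper (not the processor's actual code): decides how the
    // 'this' argument is written in the inlined method invocation.
    static String thisArgument(String declaringClassName,
                               boolean isAnonymous, boolean isMember) {
        if (isAnonymous) {
            // An anonymous class has no name, so 'this' stays unqualified.
            return "this";
        }
        if (isMember) {
            // A member class has a name that can qualify 'this'.
            return declaringClassName + ".this";
        }
        return "this";
    }

    public static void main(String[] args) {
        System.out.println(thisArgument("", true, false));
        System.out.println(thisArgument("Inner", false, true));
    }
}
```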
+
+\section{Eclipse bug 429954: Extracting statement with reference to local type 
+breaks code}\label{eclipse_bug_429954}
+The bug\footnote{\url{https://bugs.eclipse.org/bugs/show\_bug.cgi?id=429954}} 
+was discovered when making some changes to the way unfixes are computed.
+
+\subsection{The bug}
+The problem is that Eclipse allows selections that reference variables of 
+local types to be extracted. When this happens the code is broken, since the 
+extracted method must take a parameter of a local type that is not in the 
+method's scope. The problem is illustrated in 
+\myref{lst:extractMethod_LocalClass}, although in a different setting.
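+
+A minimal sketch of the situation, with the hypothetical names 
+\code{LocalTypeDemo}, \code{Counter} and \code{extracted}, could look like this. 
+The commented-out method shows roughly what an extraction would have to 
+produce, and why it cannot compile.

```java
class LocalTypeDemo {

    static int compute() {
        // Counter is a local type: it is only visible inside compute().
        class Counter {
            int value = 41;
        }
        Counter counter = new Counter();

        // Extracting the next statement would require a parameter of type
        // Counter, which cannot be named outside this method.
        counter.value++;

        return counter.value;
    }

    // What an extraction would have to produce; it does not compile,
    // because Counter is not in scope here:
    // private static void extracted(Counter counter) { counter.value++; }

    public static void main(String[] args) {
        System.out.println(compute());
    }
}
```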
+
+\subsection{Actions taken}
+No actions spring directly from this bug, since the Extract 
+Method refactoring cannot be meant to behave this way. It is handled in the 
+analysis stage of our Extract and Move Method refactoring, where names representing 
+variables of local types are considered unfixes \see{unfixes}.
+\todoin{write more when fixing this in legal statements checker}
+
+\chapter{Related Work}
+
+\section{The compositional paradigm of refactoring}
+This paradigm builds upon the observation of Vakilian et 
+al.\citing{vakilian2012}, that of the many automated refactorings existing in 
+modern IDEs, the simplest ones are dominating the usage statistics. The report 
+mainly focuses on \emph{Eclipse} as the tool under investigation.
+
+The paradigm is described almost as the opposite of automated composition of 
+refactorings \see{compositeRefactorings}. It works by providing the programmer 
+with easily accessible primitive refactorings. These refactorings shall be 
+accessed via keyboard shortcuts or quick-assist menus\footnote{Think 
+quick-assist with Ctrl+1 in Eclipse} and be promptly executed, as opposed to the 
+currently dominating wizard-based refactoring paradigm. They are meant to 
+stimulate composing smaller refactorings into more complex changes, rather than 
+doing a large upfront configuration of a wizard-based refactoring, before 
+previewing and executing it. The compositional paradigm of refactoring is 
+supposed to give control back to the programmer, by supporting \himher with an 
+option of performing small rapid changes instead of large changes with a lesser 
+degree of control. The report authors hope this will lead to fewer unsuccessful 
+refactorings. It could also lower the bar for understanding the steps of a 
+larger composite refactoring and thus also help in figuring out what goes wrong 
+if one should choose to opt in to a wizard-based refactoring.
+
+Vakilian and his associates have performed a survey of the effectiveness of the 
+compositional paradigm versus the wizard-based one. They claim to have found 
+evidence that the \emph{compositional paradigm} outperforms the 
+\emph{wizard-based} one. It does so by reducing automation, which seems 
+counterintuitive. Therefore they ask the question ``What is an appropriate level 
+of automation?'', and thus question what they feel is a rush toward more 
+automation in the software engineering community.
+
+
  \backmatter{}
  \printbibliography
  \listoftodos