From: Erlend Kristiansen <erlenkr@ifi.uio.no>
Date: Fri, 25 Apr 2014 12:23:12 +0000 (+0200)
Subject: Thesis: adding motivation
X-Git-Url: http://git.uio.no/git/?a=commitdiff_plain;h=97ae6df14f9a902f0c75927e7f58aa714a08bd99;p=ifi-stolz-refaktor.git

Thesis: adding motivation
---

diff --git a/thesis/.gitignore b/thesis/.gitignore
index a3908b6d..1deee966 100644
--- a/thesis/.gitignore
+++ b/thesis/.gitignore
@@ -17,3 +17,4 @@
 *.pyg
 *.ist
 *.bcf
+*.lol
diff --git a/thesis/master-thesis-erlenkr.tex b/thesis/master-thesis-erlenkr.tex
index e86d0419..fd5369a6 100644
--- a/thesis/master-thesis-erlenkr.tex
+++ b/thesis/master-thesis-erlenkr.tex
@@ -166,6 +166,10 @@ identifies its participators and how they collaborate},
 \DefineBibliographyStrings{english}{%
   bibliography = {References},
 }
+\newbibmacro{string+doi}[1]{%
+  \iffieldundef{doi}{#1}{\href{http://dx.doi.org/\thefield{doi}}{#1}}}
+\DeclareFieldFormat{title}{\usebibmacro{string+doi}{\mkbibemph{#1}}}
+\DeclareFieldFormat[article]{title}{\usebibmacro{string+doi}{\mkbibquote{#1}}}
 
 % UML comment in TikZ:
 % ref: https://tex.stackexchange.com/questions/103688/folded-paper-shape-tikz
@@ -225,6 +229,7 @@ identifies its participators and how they collaborate},
 \newcolumntype{L}[1]{>{\hsize=#1\hsize\raggedright\arraybackslash}X}%
 \newcolumntype{R}[1]{>{\hsize=#1\hsize\raggedleft\arraybackslash}X}%
 
+
 \begin{document}
 %\pagenumbering{arabic}
 \mainmatter
@@ -241,13 +246,41 @@ Can be done by removing ``draft'' from documentclass.}}
 \tableofcontents{}
 \listoffigures{}
 \listoftables{}
+\listoflistings{}
 
 %\mainmatter
 %\setcounter{page}{13}
 
 \chapter{Introduction}
+
 \section{Motivation and structure}
 
+For large software projects, complex program source code is an issue. It impacts 
+the cost of maintenance in a negative way. It often stalls the implementation of 
+new functionality and other program changes. The code may be difficult to 
+understand, the changes may introduce new bugs that are hard to find and its 
+complexity can simply keep people from doing code changes in fear of breaking 
+some dependent piece of code.  All these problems are related, and often lead to 
+a vicious circle that slowly degrades the overall quality of a project.
+
+More specifically, and in an object-oriented context, a class may depend on a 
+number of other classes. Sometimes these intimate relationships are appropriate, 
+and sometimes they are not. Inappropriate \emph{coupling} between classes can 
+make it difficult to know whether or not a change that is aimed at fixing a 
+specific problem also alters the behavior of another part of a program.
+
+One of the tools that are used to fight complexity and coupling in program 
+source code is \emph{refactoring}. The intention for this master's thesis is 
+therefore to create an automated composite refactoring that reduces coupling 
+between classes. The refactoring shall be able to operate automatically in all 
+phases of a refactoring, from performing analysis to executing changes. It is 
+also a requirement that it should be able to process large quantities of source 
+code in a reasonable amount of time.
+
+
+\todoin{Structure. Write later\ldots}
+
+
 \section{What is refactoring?}
 
 This question is best answered by first defining the concept of a 
@@ -986,6 +1019,8 @@ tracematch (C c, X x) {
 
 
 \section{The Project}
+In this section we look at the work that shall be done for this project, its 
+building blocks and some of the methodologies used.
 
 \subsection{Project description}
 The aim of this master's project will be to explore the relationship between the 
@@ -1014,6 +1049,64 @@ as well as executing it over a larger code base, as a case study. To be able to
 execute the refactoring automatically, I have to make it analyze code to 
 determine the best selections to extract into new methods.
 
+\subsection{The premises}
+Before we can start manipulating source code and write a tool for doing so, we 
+need to decide on a programming language for the code we are going to 
+manipulate. Also, since we do not want to start from scratch by implementing 
+primitive refactorings ourselves, we need to choose an existing tool that 
+provides the needed refactorings. In addition to being able to perform changes, we 
+need a framework for analyzing source code for the language we select.
+
+\subsubsection{Choosing the target language}
+Choosing the programming language of the code that shall be manipulated is not 
+a very difficult task. We choose to limit the possible 
+languages to the object-oriented programming languages, since most of the 
+terminology and literature regarding refactoring comes from the world of 
+object-oriented programming. In addition, the language must have existing tool 
+support for refactoring.
+
+The \name{Java} programming language\footnote{\url{https://www.java.com/}} is 
+the dominating language when it comes to example code in the literature of 
+refactoring, and is thus a natural choice. Java is perhaps currently the most 
+influential programming language in the world, with its \name{Java Virtual 
+Machine} that runs on all of the most popular architectures and also supports 
+dozens of other programming languages\footnote{They compile to Java bytecode.}, 
+with \name{Scala}, \name{Clojure} and \name{Groovy} as the most prominent ones.  
+Java is currently the language that every other programming language is compared 
+against. It is also the primary programming language for the author of this 
+thesis.
+
+\subsubsection{Choosing the tools}
+When choosing a tool for manipulating Java, there are certain criteria that 
+have to be met. First of all, the tool should have some existing refactoring 
+support that this thesis can build upon. Secondly it should provide some kind of 
+framework for parsing and analyzing Java source code. Third, it should itself be 
+open source. This is both because of the need to be able to browse the code for 
+the existing refactorings that are contained in the tool, and also because open 
+source projects hold value in themselves. Another important aspect to consider 
+is that open source projects of a certain size usually have large communities of 
+people connected to them, who are committed to answering questions regarding the 
+use and misuse of the products, which to a large degree are made by the community 
+itself.
+
+There is a certain class of tools that meet these criteria, namely the class of 
+\emph{IDEs}\footnote{\emph{Integrated Development Environment}}. These are 
+programs that are meant to support the whole production cycle of a computer 
+program, and the most popular IDEs that support Java generally have quite good 
+refactoring support.
+
+The main contenders for this thesis are the \name{Eclipse IDE}, with the 
+\name{Java development tools} (JDT), the \name{IntelliJ IDEA Community Edition} 
+and the \name{NetBeans IDE} \see{toolSupport}. \name{Eclipse} and 
+\name{NetBeans} are both free, open source and community driven, while the 
+\name{IntelliJ IDEA} has an open sourced community edition that is free of 
+charge, but also offers an \name{Ultimate Edition} with an extended set of 
+features, at additional cost.  All three IDEs support adding plugins to extend 
+their functionality and provide tools that can be used to parse and analyze Java 
+source code. But one of the IDEs stands out as a favorite, and that is the \name{Eclipse 
+IDE}. This is the most popular\citing{javaReport2011} among them and seems to be 
+the de facto standard IDE for Java development regardless of platform.
+
 \subsection{The primitive refactorings}
 The refactorings presented here are the primitive refactorings used in this 
 project. They are the abstract building blocks used by the \ExtractAndMoveMethod 
@@ -1183,58 +1276,98 @@ And, assuming the refactoring does in fact improve the quality of source code:
 usefulness of the refactoring in a software development setting? In what parts 
 of the development process can the refactoring play a role?
 
-\subsection{The premises}
-\todoin{Appropriate name?}
+\subsection{Methodology}
 
-\subsubsection{Choosing the target language}
-Choosing which programming language the code that shall be manipulated shall be 
-written in, is not a very difficult task. We choose to limit the possible 
-languages to the object-oriented programming languages, since most of the 
-terminology and literature regarding refactoring comes from the world of 
-object-oriented programming. In addition, the language must have existing tool 
-support for refactoring.
+\subsubsection{Evolutionary design}
+In the programming work for this project, I have tried to use a design strategy 
+called evolutionary design, also known as continuous or incremental 
+design\citing{wiki_continuous_2014}.  It is a software design strategy 
+advocated by the Extreme Programming community.  The essence of the strategy is 
+that you should let the design of your program evolve naturally as your 
+requirements change.  This is seen in contrast with up-front design, where 
+design decisions are made early in the process. 
 
-The \name{Java} programming language\footnote{\url{https://www.java.com/}} is 
-the dominating language when it comes to example code in the literature of 
-refactoring, and is thus a natural choice. Java is perhaps, currently the most 
-influential programming language in the world, with its \name{Java Virtual 
-Machine} that runs on all of the most popular architectures and also supports 
-dozens of other programming languages\footnote{They compile to Java bytecode.}, 
-with \name{Scala}, \name{Clojure} and \name{Groovy} as the most prominent ones.  
-Java is currently the language that every other programming language is compared 
-against. It is also the primary programming language for the author of this 
-thesis.
+The motivation behind evolutionary design is to keep the design of software as 
+simple as possible. This means not introducing unneeded functionality into a 
+program. You should defer introducing flexibility into your software, until it 
+is needed to be able to add functionality in a clean way.
 
-\subsubsection{Choosing the tools}
-When choosing a tool for manipulating Java, there are certain criteria that 
-have to be met. First of all, the tool should have some existing refactoring 
-support that this thesis can build upon. Secondly it should provide some kind of 
-framework for parsing and analyzing Java source code. Third, it should itself be 
-open source. This is both because of the need to be able to browse the code for 
-the existing refactorings that is contained in the tool, and also because open 
-source projects hold value in them selves. Another important aspect to consider 
-is that open source projects of a certain size, usually has large communities of 
-people connected to them, that are committed to answering questions regarding the 
-use and misuse of the products, that to a large degree is made by the community 
-itself.
+Deferring design decisions implies that the time will eventually come when 
+decisions have to be made. The flexibility of the design then relies on the 
+programmer's abilities to perform the necessary refactoring, and \his confidence 
+in those abilities. From my experience working on this project, I can say that 
+this confidence is greatly enhanced by having automated tests to rely on 
+\see{tdd}.
 
-There is a certain class of tools that meet these criteria, namely the class of 
-\emph{IDEs}\footnote{\emph{Integrated Development Environment}}. These are 
-programs that is meant to support the whole production cycle of a computer 
-program, and the most popular IDEs that support Java, generally have quite good 
-refactoring support.
+The choice of going for evolutionary design developed naturally. As Fowler 
+points out in his article \tit{Is Design Dead?}, evolutionary design much 
+resembles the ``code and fix'' development strategy\citing{fowler_design_2004},
+a strategy that most of us have practiced in school. This was also the case when 
+I first started this work. I had to learn the inner workings of Eclipse and its 
+refactoring-related plugins. That meant a lot of fumbling around with code I did 
+not know, in a trial and error fashion. Eventually I started writing tests for 
+my code, and my design began to evolve.
+
+\subsubsection{Test-driven development}\label{tdd}
+As mentioned before, the project started out as a classic code and fix 
+development process. My focus was aimed at getting something to work, rather than 
+doing so according to best practice. This resulted in a project that got out of 
+its starting blocks, but it was not accompanied by any tests. Hence it was soon 
+difficult to make any code changes with the confidence that the program was 
+still correct afterwards (assuming it was so before changing it). I always knew 
+that I had to introduce some tests at one point, but this experience accelerated 
+the process of leading me onto the path of testing.
+
+I then wrote tests for the core functionality of the plugin, and thus gained 
+more confidence in the correctness of my code. I could now perform quite drastic 
+changes without ``wetting my pants''. After this, nearly all of the semantic 
+changes done to the business logic of the project, or the additions of new 
+functionality, were made in a test-driven manner. This means that before 
+performing any changes, I would define the desired functionality through a set 
+of tests. I would then run the tests to check that they were run and that they 
+did not pass.  Then I would do any code changes necessary to make the tests 
+pass.  The definition of how the program is supposed to operate is then captured 
+by the tests.  However, this does not prove the correctness of the analysis 
+leading to the test definitions.
+
+\subsubsection{Continuous integration}
+\todoin{???}
+
+\section{Related Work}
+
+\subsection{Safer refactorings}
+\todoin{write}
+
+\subsection{The compositional paradigm of refactoring}
+This paradigm builds upon the observation of Vakilian et 
+al.\citing{vakilian2012}, that of the many automated refactorings existing in 
+modern IDEs, the simplest ones are dominating the usage statistics. The report 
+mainly focuses on \name{Eclipse} as the tool under investigation.
+
+The paradigm is described almost as the opposite of automated composition of 
+refactorings \see{compositeRefactorings}. It works by providing the programmer 
+with easily accessible primitive refactorings. These refactorings shall be 
+accessed via keyboard shortcuts or quick-assist menus\footnote{Think 
+quick-assist with Ctrl+1 in \name{Eclipse}} and be promptly executed, as opposed to in the 
+currently dominating wizard-based refactoring paradigm. They are meant to 
+stimulate composing smaller refactorings into more complex changes, rather than 
+doing a large upfront configuration of a wizard-based refactoring, before 
+previewing and executing it. The compositional paradigm of refactoring is 
+supposed to give control back to the programmer, by supporting \himher with an 
+option of performing small rapid changes instead of large changes with a lesser 
+degree of control. The report authors hope this will lead to fewer unsuccessful 
+refactorings. It also could lower the bar for understanding the steps of a 
+larger composite refactoring and thus also help in figuring out what goes wrong 
+if one should choose to opt for a wizard-based refactoring.
+
+Vakilian and his associates have performed a survey of the effectiveness of the 
+compositional paradigm versus the wizard-based one. They claim to have found 
+evidence that the \emph{compositional paradigm} outperforms the 
+\emph{wizard-based} one. It does so by reducing automation, which seems 
+counterintuitive. Therefore they ask the question ``What is an appropriate level 
+of automation?'', and thus question what they feel is a rush toward more 
+automation in the software engineering community.
 
-The main contenders for this thesis is the \name{Eclipse IDE}, with the 
-\name{Java development tools} (JDT), the \name{IntelliJ IDEA Community Edition} 
-and the \name{NetBeans IDE} \see{toolSupport}. \name{Eclipse} and 
-\name{NetBeans} are both free, open source and community driven, while the 
-\name{IntelliJ IDEA} has an open sourced community edition that is free of 
-charge, but also offer an \name{Ultimate Edition} with an extended set of 
-features, at additional cost.  All three IDEs supports adding plugins to extend 
-their functionality and tools that can be used to parse and analyze Java source 
-code. But one of the IDEs stand out as a favorite, and that is the \name{Eclipse 
-IDE}. This is the most popular\citing{javaReport2011} among them and seems to be 
-de facto standard IDE for Java development regardless of platform.
 
 
 \chapter{The search-based Extract and Move Method refactoring}
@@ -4543,105 +4676,12 @@ while, before they were solved. This is reflected in the ``Test Result Trend''
 and ``Code Coverage Trend'' reported by Jenkins.
 
 
-\chapter{Methodology}
-
-\section{Evolutionary design}
-In the programming work for this project, it have tried to use a design strategy 
-called evolutionary design, also known as continuous or incremental 
-design\citing{wiki_continuous_2014}.  It is a software design strategy 
-advocated by the Extreme Programming community.  The essence of the strategy is 
-that you should let the design of your program evolve naturally as your 
-requirements change.  This is seen in contrast with up-front design, where 
-design decisions are made early in the process. 
-
-The motivation behind evolutionary design is to keep the design of software as 
-simple as possible. This means not introducing unneeded functionality into a 
-program. You should defer introducing flexibility into your software, until it 
-is needed to be able to add functionality in a clean way.
-
-Holding up design decisions, implies that the time will eventually come when 
-decisions have to be made. The flexibility of the design then relies on the 
-programmer's abilities to perform the necessary refactoring, and \his confidence 
-in those abilities. From my experience working on this project, I can say that 
-this confidence is greatly enhanced by having automated tests to rely on 
-\see{tdd}.
-
-The choice of going for evolutionary design developed naturally. As Fowler 
-points out in his article \tit{Is Design Dead?}, evolutionary design much 
-resembles the ``code and fix'' development strategy\citing{fowler_design_2004}.
-A strategy that most of us have practiced in school. This was also the case when 
-I first started this work. I had to learn the inner workings of Eclipse and its 
-refactoring-related plugins. That meant a lot of fumbling around with code I did 
-not know, in a trial and error fashion. Eventually I started writing tests for 
-my code, and my design began to evolve.
-
-\section{Test-driven development}\label{tdd}
-As mentioned before, the project started out as a classic code and fix 
-developmen process. My focus was aimed at getting something to work, rather than 
-doing so according to best practice. This resulted in a project that got out of 
-its starting blocks, but it was not accompanied by any tests. Hence it was soon 
-difficult to make any code changes with the confidence that the program was 
-still correct afterwards (assuming it was so before changing it). I always knew 
-that I had to introduce some tests at one point, but this experience accelerated 
-the process of leading me onto the path of testing.
-
-I then wrote tests for the core functionality of the plugin, and thus gained 
-more confidence in the correctness of my code. I could now perform quite drastic 
-changes without ``wetting my pants``. After this, nearly all of the semantic 
-changes done to the business logic of the project, or the addition of new 
-functionality, was made in a test-driven manner. This means that before 
-performing any changes, I would define the desired functionality through a set 
-of tests. I would then run the tests to check that they were run and that they 
-did not pass.  Then I would do any code changes necessary to make the tests 
-pass.  The definition of how the program is supposed to operate is then captured 
-by the tests.  However, this does not prove the correctness of the analysis 
-leading to the test definitions.
-
-\section{Continuous integration}
-\todoin{???}
-
 
 \chapter{Conclusions and Future Work}
 \todoin{Write}
 
 \section{Future work}
 
-\chapter{Related Work}
-
-\section{Safer refactorings}
-\todoin{write}
-
-\section{The compositional paradigm of refactoring}
-This paradigm builds upon the observation of Vakilian et 
-al.\citing{vakilian2012}, that of the many automated refactorings existing in 
-modern IDEs, the simplest ones are dominating the usage statistics. The report 
-mainly focuses on \name{Eclipse} as the tool under investigation.
-
-The paradigm is described almost as the opposite of automated composition of 
-refactorings \see{compositeRefactorings}. It works by providing the programmer 
-with easily accessible primitive refactorings. These refactorings shall be 
-accessed via keyboard shortcuts or quick-assist menus\footnote{Think 
-quick-assist with Ctrl+1 in \name{Eclipse}} and be promptly executed, opposed to in the 
-currently dominating wizard-based refactoring paradigm. They are meant to 
-stimulate composing smaller refactorings into more complex changes, rather than 
-doing a large upfront configuration of a wizard-based refactoring, before 
-previewing and executing it. The compositional paradigm of refactoring is 
-supposed to give control back to the programmer, by supporting \himher with an 
-option of performing small rapid changes instead of large changes with a lesser 
-degree of control. The report authors hope this will lead to fewer unsuccessful 
-refactorings. It also could lower the bar for understanding the steps of a 
-larger composite refactoring and thus also help in figuring out what goes wrong 
-if one should choose to op in on a wizard-based refactoring.
-
-Vakilian and his associates have performed a survey of the effectiveness of the 
-compositional paradigm versus the wizard-based one. They claim to have found 
-evidence of that the \emph{compositional paradigm} outperforms the 
-\emph{wizard-based}. It does so by reducing automation, which seem 
-counterintuitive. Therefore they ask the question ``What is an appropriate level 
-of automation?'', and thus questions what they feel is a rush toward more 
-automation in the software engineering community.
-
-
 
 \appendix