thesis/master-thesis-erlenkr.tex

   1 \documentclass[USenglish]{ifimaster}
   2 \usepackage{import}
   3 \usepackage[utf8]{inputenc}
   4 \usepackage[T1]{fontenc,url}
   5 \usepackage{lmodern} % using Latin Modern to be able to use bold typewriter font
   6 \urlstyle{sf}
   7 \usepackage{babel,textcomp,csquotes,ifimasterforside,varioref,graphicx}
   8 \usepackage[hidelinks]{hyperref}
   9 \usepackage[style=numeric-comp,backend=bibtex]{biblatex}
  10 \usepackage{amsthm}
  11 \usepackage{todonotes}
  12 \usepackage{verbatim}
  13 \usepackage{minted}
  14 \usemintedstyle{bw}
  15 \usepackage{perpage} %the perpage package
  16 \MakePerPage{footnote} %the perpage package command
  17
  18 \theoremstyle{plain}
  19 \newtheorem*{wordDef}{Definition}
  20
  21 \graphicspath{ {./figures/} }
  22
  23 \newcommand{\definition}[1]{\begin{wordDef}#1\end{wordDef}}
  24 \newcommand{\see}[1]{(see section \ref{#1})}
  25 \newcommand{\See}[1]{(See section \ref{#1}.)}
  26 \newcommand{\explanation}[3]{\noindent\textbf{\textit{#1}}\\*\emph{When:}
  27 #2\\*\emph{How:} #3\\*[-7px]}
  28
  29 \newcommand{\type}[1]{\texttt{\textbf{#1}}}
  30 \newcommand{\typeref}[1]{\footnote{\type{#1}}}
  31 \newcommand{\typewithref}[2]{\type{#2}\typeref{#1.#2}}
  32 \newcommand{\method}[1]{\type{#1}}
  33 \newcommand{\methodref}[2]{\footnote{\type{#1}\method{\##2()}}}
  34 \newcommand{\methodwithref}[2]{\method{#2}\footnote{\type{#1}\method{\##2()}}}
  35 \newcommand{\var}[1]{\type{#1}}
  36
  37 \newcommand{\refactoring}[1]{\emph{#1}}
  38 \newcommand{\refactoringsp}[1]{\refactoring{#1} }
  39 \newcommand{\ExtractMethod}{\refactoringsp{Extract Method}}
  40 \newcommand{\MoveMethod}{\refactoringsp{Move Method}}
  41
  42 \newcommand{\citing}[1]{~\cite{#1}}
  43
  44 \newcommand\todoin[2][]{\todo[inline, caption={2do}, #1]{
  45 \begin{minipage}{\textwidth-4pt}#2\end{minipage}}}
  46
  47
  48 \title{Refactoring}
  49 \subtitle{An unfinished essay}
  50 \author{Erlend Kristiansen}
  51
  52 \bibliography{bibliography/master-thesis-erlenkr-bibliography}
  53
  54 \begin{document}
  55 \ififorside
  56 \frontmatter{}
  57
  58
  59 \chapter*{Abstract}
  60 \todoin{\textbf{Remove all todos (including list) before delivery/printing!!!}}
  61 \todoin{Write abstract}
  62
  63 \tableofcontents{}
  64 \listoffigures{}
  65 \listoftables{}
  66
  67 \chapter*{Preface}
  68
  69 To make it clear already from the beginning: The discussions in this report must
  70 be seen in the context of object-oriented programming languages, and Java in
  71 particular, since that is the language in which most of the examples will be
  72 given. Although the techniques discussed may be applicable to languages from
  73 other paradigms, those languages will not be the subject of this report.
  74
  75 \mainmatter
  76
  77 \chapter{What is Refactoring?}
  78
  79 This question is best answered by first defining the concept of a
  80 \emph{refactoring} and what it is to \emph{refactor}, and then discussing which
  81 aspects of programming make people want to refactor their code.
  82
  83 \section{Defining refactoring}
  84 Martin Fowler, in his masterpiece on refactoring \cite{refactoring}, defines a
  85 refactoring like this:
  86
  87 \begin{quote}
  88   \emph{Refactoring} (noun): a change made to the \todo{what does he mean by
  89   internal?} internal structure of software to make it easier to understand and
  90   cheaper to modify without changing its observable
  91   behavior.\citing{refactoring}
  92   % page 53
  93 \end{quote}
  94
  95 \noindent This definition assigns additional meaning to the word
  96 \emph{refactoring}, beyond the composition of the prefix \emph{re-}, usually
  97 meaning something like ``again'' or ``anew'', and the word \emph{factoring},
  98 which can mean to determine the \emph{factors} of something. Here a
  99 \emph{factor} is close to the mathematical notion of something that
 100 divides a quantity without leaving a remainder. Fowler is mixing the
 101 \emph{motivation} behind refactoring into his definition. Instead, it could be
 102 made cleaner by considering only the mechanical and behavioral aspects of
 103 refactoring, that is, factoring the program again and putting it together in a
 104 different way than before, while preserving the behavior of the program. An
 105 alternative definition could then be:
 106
 107 \definition{A refactoring is a transformation
 108 done to a program without altering its external behavior.}
 109
 110 From this we can conclude that a refactoring primarily changes how the
 111 \emph{code} of a program is perceived by the \emph{programmer}, and not the
 112 \emph{behavior} experienced by any user of the program. Although the logical
 113 meaning is preserved, such changes could potentially alter the program's
 114 behavior when it comes to performance gains or penalties. Any logic depending
 115 on the performance of a program could therefore make the program behave
 116 differently after a refactoring.
 117
 118 In the extreme case one could argue that such a thing as \emph{software
 119 obfuscation} is to refactor. If we were to define it as a refactoring, it could
 120 be defined as a composite refactoring \see{intro_composite}, consisting of, for
 121 instance, a series of rename refactorings. (But it could of course be much more
 122 complex, and the mechanics of it would not exactly be carved in stone.) To
 123 perform some serious obfuscation one would also take advantage of techniques not
 124 found among established refactorings, such as removing whitespace. This might
 125 not even generate a different syntax tree for languages not sensitive to
 126 whitespace, placing it in the gray area of which kinds of transformations are
 127 to be considered refactorings.
 128
 129 Finally, to \emph{refactor} is (quoting Martin Fowler)
 130 \begin{quote}
 131   \ldots to restructure software by applying a series of refactorings without
 132   changing its observable behavior.\citing{refactoring} % page 54, definition
 133 \end{quote}
 134
 135 \section{The etymology of ``refactoring''}
 136 It is a little difficult to pinpoint the exact origin of the word
 137 ``refactoring'', as it seems to have evolved as part of a colloquial
 138 terminology, more than a scientific term. There is no authoritative source for a
 139 formal definition of it.
 140
 141 According to Martin Fowler\citing{etymology-refactoring}, there may also be more
 142 than one origin of the word. The most well-known source, when it comes to the
 143 origin of \emph{refactoring}, is the Smalltalk\footnote{\emph{Smalltalk},
 144 object-oriented, dynamically typed, reflective programming language.}\todo{find
 145 reference to Smalltalk website or similar?} community and their famous
 146 \emph{Refactoring
 147 Browser}\footnote{\url{http://st-www.cs.illinois.edu/users/brant/Refactory/RefactoringBrowser.html}}
 148 described in the article \emph{A Refactoring Tool for
 149 Smalltalk}\citing{refactoringBrowser1997}, published in 1997.
 150 Allegedly\citing{etymology-refactoring}, the metaphor of factoring programs was
 151 also present in the Forth\footnote{\emph{Forth} -- stack-based, extensible
 152 programming language, without type-checking. See \url{http://www.forth.org}}
 153 community, and the word ``refactoring'' is mentioned in a book by Leo Brodie,
 154 called \emph{Thinking Forth}\citing{brodie1984}, first published in
 155 1984\footnote{\emph{Thinking Forth} was first published in 1984 by the
 156 \emph{Forth Interest Group}.  Then it was reprinted in 1994 with minor
 157 typographical corrections, before it was transcribed into an electronic edition
 158 typeset in \LaTeX\ and published under a Creative Commons licence in 2004. The
 159 edition cited here is the 2004 edition, but the content should essentially be as
 160 in 1984.}. The exact word is printed in only one place\footnote{p. 232}, but the
 161 term \emph{factoring} is prominent in the book, which also contains a whole
 162 chapter dedicated to (re)factoring and how to keep the (Forth) code clean and
 163 maintainable.
 164
 165 \begin{quote}
 166   \ldots good factoring technique is perhaps the most important skill for a
 167   Forth programmer.\citing{brodie1984}
 168 \end{quote}
 169
 170 \noindent Brodie also expresses what \emph{factoring} means to him:
 171
 172 \begin{quote}
 173   Factoring means organizing code into useful fragments. To make a fragment
 174   useful, you often must separate reusable parts from non-reusable parts. The
 175   reusable parts become new definitions. The non-reusable parts become arguments
 176   or parameters to the definitions.\citing{brodie1984}
 177 \end{quote}
 178
 179 Fowler claims that the usage of the word \emph{refactoring} did not pass between
 180 the \emph{Forth} and \emph{Smalltalk} communities, but that it emerged
 181 independently in each of the communities.
 182
 183 \todoin{more history?}
 184
 185 \section{Motivation -- Why people refactor}
 186 To get a grasp of what refactoring is all about, we can try to answer this
 187 question: \emph{Why do people refactor?} Possible answers could include: ``To
 188 remove duplication'' or ``to break up long methods''.  Practitioners of the art
 189 of Design Patterns\citing{dp} could say that they do it to introduce a
 190 long-needed pattern into their program's design.  So it is safe to say that
 191 people's intentions are to make their programs \emph{better} in some sense. But
 192 which aspects of the programs are being improved?
 193
 194 As already mentioned, people often refactor to get rid of duplication. They move
 195 identical or similar code into methods, and maybe push those up or down in
 196 their class hierarchies. They make template methods for overlapping
 197 algorithms or functionality, and so on. It is all about gathering what belongs
 198 together and putting it all in one place. And the result? The code is easier to
 199 maintain. When the implicit coupling between the code snippets is removed, the
 200 location of a bug is limited to only one place, and new functionality needs
 201 to be added in only this one place, instead of in a number of places people
 202 might not even remember.
 203
 204 The same people find out that their program contains a lot of long and
 205 hard-to-grasp methods. Then what do they do? They begin dividing their methods
 206 into smaller ones, using the \ExtractMethod
 207 refactoring\citing{refactoring}.  Then they may discover something about their
 208 program that they weren't aware of before, revealing bugs they didn't know about
 209 or couldn't find due to the complex structure of their program. \todo{Proof?}
 210 Making the methods smaller and giving good names to the new ones clarifies the
 211 algorithms and enhances the \emph{understandability} of the program
 212 \see{magic_number_seven}. This makes simple refactoring an excellent method for
 213 exploring unknown program code, or code that you had forgotten that you wrote!
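
As an illustration, a minimal sketch of \ExtractMethod could look like the following. The class \type{Order} and all of its members are hypothetical names chosen for this example and are not taken from any of the cited sources. The method \method{summaryBefore} mixes the summation with the formatting, while \method{summaryAfter} delegates the summation to the extracted method \method{total}.

\begin{minted}{java}
import java.util.List;

// Hypothetical example class, used only to illustrate Extract Method.
class Order {
    private final List<Integer> prices;

    Order(List<Integer> prices) {
        this.prices = prices;
    }

    // Before: the summation loop is buried inside the formatting code.
    String summaryBefore() {
        int sum = 0;
        for (int price : prices) {
            sum += price;
        }
        return "Total: " + sum;
    }

    // After: the loop has been extracted and given a descriptive name.
    String summaryAfter() {
        return "Total: " + total();
    }

    // The extracted method.
    private int total() {
        int sum = 0;
        for (int price : prices) {
            sum += price;
        }
        return sum;
    }
}
\end{minted}

Both methods return the same string for the same list of prices, so the observable behavior is unchanged. Only the structure of the code differs.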
 214
 215 The word \emph{simple} came up above. In fact, most primitive
 216 refactorings are simple. Their true power is revealed only when they are
 217 combined into larger --- higher level --- refactorings, called \emph{composite
 218 refactorings} \see{intro_composite}. Often the goal of such a series of
 219 refactorings is a design pattern. Thus the \emph{design} can be evolved
 220 throughout the lifetime of a program, as opposed to designing up-front.  It's all
 221 about being structured and taking small steps to improve a program's design.
 222
 223 Many refactorings are aimed at lowering the coupling between different classes
 224 and different layers of logic. \todo{which refactorings?} Say for instance that
 225 the coupling between the user interface and the business logic of a program is
 226 lowered.  Then the business logic of the program could much more easily be the target
 227 of automated tests, increasing the productivity in the software development
 228 process. It is also easier to distribute (e.g. between computers) the different
 229 components of a program if they are sufficiently decoupled.
 230
 231 Another effect of refactoring is that with the increased separation of concerns
 232 coming out of many refactorings, the \emph{performance} can be improved.  When
 233 profiling programs, the problem parts are narrowed down to smaller parts of the
 234 code, which are easier to tune, and optimization can be performed only where
 235 needed and in a more effective way.
 236
 237 Last, but not least, and this is probably the best reason to refactor, is
 238 refactoring to \emph{facilitate a program change}. If one has managed to keep
 239 one's code clean and tidy, and the code is not bloated with design patterns that
 240 are never going to be needed, then some refactoring might be needed to
 241 introduce a design pattern that is appropriate for the change that is going to
 242 happen.
 243
 244 Refactoring program code --- with a goal in mind --- can give the code itself
 245 more value, in the form of robustness to bugs, understandability and
 246 maintainability. The first is an obvious advantage, but the other
 247 two are also very important for software development. By incorporating
 248 refactoring in the development process, bugs are found faster, new functionality
 249 is added more easily and code is easier to understand by the next person exposed
 250 to it, which might as well be the person who wrote it. The consequence of this
 251 is that refactoring can increase the average productivity of the development
 252 process, and thus also add to the monetary value of a business in the long run.
 253 This last point should also open the eyes of some nearsighted managers who
 254 seldom see beyond the next milestone.
 255
 256 \section{The magical number seven}\label{magic_number_seven}
 257 \emph{The magical number seven, plus or minus two: some limits on our capacity
 258 for processing information}\citing{miller1956} is an article by George A. Miller
 259 that was published in the journal \emph{Psychological Review} in 1956. It
 260 presents evidence that the number of objects a human being can hold in
 261 working memory is roughly seven, plus or minus two. This number varies a bit
 262 depending on the nature and complexity of the
 263 objects, but is according to Miller ``\ldots never changing so much as to be
 264 unrecognizable.''
 265
 266 Miller's article culminates in the section called \emph{Recoding}, a term he
 267 borrows from communication theory. The central result in this section is that by
 268 recoding information, the amount of information that a human can
 269 process at a time is increased. By \emph{recoding}, Miller means to group
 270 objects together in chunks and give each chunk a new name that it can be
 271 remembered by. By organizing objects into patterns of ever growing depth, one
 272 can memorize and process a much larger amount of data than if it were to be
 273 represented as its basic pieces. This grouping and renaming is analogous to how
 274 many refactorings work, by grouping pieces of code and giving them a new name.
 275 Examples are the central \ExtractMethod and \refactoring{Extract Class}
 276 refactorings\citing{refactoring}.
 277
 278 \begin{quote}
 279   \ldots recoding is an extremely powerful weapon for increasing the amount of
 280   information that we can deal with.\citing{miller1956}
 281 \end{quote}
 282
 283 An example from the article addresses the problem of memorizing a sequence of
 284 binary digits. Let us say we have the following sequence\footnote{The example
 285   presented here is slightly modified (and shortened) from what is presented in
 286   the original article\citing{miller1956}, but it is essentially the same.} of
 287 16 binary digits: ``1010001001110011''. Most of us will have a hard time
 288 memorizing this sequence by only reading it once or twice. Imagine if we instead
 289 translate it to this sequence: ``A273''. If you have a background in computer
 290 science, it will be obvious that the latter sequence is the first sequence
 291 recoded to be represented by digits with base 16. Most people should be able to
 292 memorize this last sequence by only looking at it once.
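
The recoding in this example can be sketched in a few lines of Java. The class \type{Recoding} and the method \method{toHex} are hypothetical names chosen for this illustration. The sketch assumes that the input consists only of the characters 0 and 1, and it rejects input whose length is not a multiple of four.

\begin{minted}{java}
// Hypothetical helper, used only to illustrate Miller's recoding example.
class Recoding {
    // Recodes binary digits into base 16, one hex digit per chunk of four bits.
    static String toHex(String bits) {
        if (bits.length() % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < bits.length(); i += 4) {
            // Each group of four binary digits is one chunk with a new name.
            int chunk = Integer.parseInt(bits.substring(i, i + 4), 2);
            hex.append(Character.toUpperCase(Character.forDigit(chunk, 16)));
        }
        return hex.toString();
    }
}
\end{minted}

Calling \method{toHex} on the 16 binary digits above yields ``A273'', since the four chunks 1010, 0010, 0111 and 0011 are recoded as A, 2, 7 and 3.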
 293
 294 Another result from the Miller article is that when the amount of information a
 295 human must interpret increases, it is crucial that the translation from one code
 296 to another is almost automatic for the subject to be able to remember the
 297 translation, before he or she is presented with new information to recode. Thus
 298 learning and understanding how to best organize certain kinds of data is
 299 essential to efficiently handle that kind of data in the future. This is much
 300 like when children learn to read. First they must learn how to recognize
 301 letters. Then they can learn distinct words, and later read sequences of words
 302 that form whole sentences. Eventually, most of them will be able to read whole
 303 books and briefly retell the important parts of their content. This suggests that
 304 the use of design patterns\citing{dp} is a good idea when reasoning about
 305 computer programs. With extensive use of design patterns when creating complex
 306 program structures, one does not always have to read whole classes of code to
 307 comprehend how they function; it may be sufficient to only see the name of a
 308 class to almost fully understand its responsibilities.
 309
 310 \begin{quote}
 311   Our language is tremendously useful for repackaging material into a few chunks
 312   rich in information.\citing{miller1956}
 313 \end{quote}
 314
 315 Without further evidence, these results at least indicate that refactoring
 316 source code into smaller units with higher cohesion and, when needed,
 317 introducing appropriate design patterns, should aid in the cause of creating
 318 computer programs that are easier to maintain and have code that is easier (and
 319 better) understood.
 320
 321 \section{Notable contributions to the refactoring literature}
 322 \todoin{Update with more contributions}
 323 \begin{description}
 324   \item[1992] William F. Opdyke submits his doctoral dissertation called
 325     \emph{Refactoring Object-Oriented Frameworks}\citing{opdyke1992}. This
 326     work defines a set of refactorings, that are behavior preserving given that
 327     their preconditions are met. The dissertation is focused on the automation
 328     of refactorings.
 329   \item[1999] Martin Fowler et al.: \emph{Refactoring: Improving the Design of
 330     Existing Code}\citing{refactoring}. This is maybe the most influential text
 331     on refactoring. It bears similarities to Opdyke's thesis\citing{opdyke1992}
 332     in the way that it provides a catalog of refactorings. But Fowler's book is
 333     more about the craft of refactoring, as he focuses on establishing a
 334     vocabulary for refactoring, together with the mechanics of different
 335     refactorings and when to perform them. His methodology is also founded on
 336     the principles of test-driven development.
 337   \item[todo] \emph{Refactoring to Patterns}\todo{include}
 338 \end{description}
 339
 340 \section{Tool support}\label{toolSupport}
 341
 342 \subsection{Tool support for Java}
 343 This section will briefly compare the refactoring support of the three IDEs
 344 \emph{Eclipse}\footnote{\url{http://www.eclipse.org/}}, \emph{IntelliJ
 345 IDEA}\footnote{The IDE under comparison is the \emph{Community Edition},
 346 \url{http://www.jetbrains.com/idea/}} and
 347 \emph{NetBeans}\footnote{\url{https://netbeans.org/}}. These are the most
 348 popular Java IDEs\citing{javaReport2011}.
 349
 350 All three IDEs provide support for the most useful refactorings, like the
 351 different extract, move and rename refactorings. In fact, Java-targeted IDEs are
 352 known for their good refactoring support, so this did not appear as a big
 353 surprise.
 354
 355 The IDEs seem to have excellent support for the \ExtractMethod refactoring, so
 356 at least they have all passed the first refactoring
 357 rubicon\citing{fowlerRubicon2001,secondRubicon2012}.
 358
 359 Regarding the \MoveMethod refactoring, the \emph{Eclipse} and \emph{IntelliJ}
 360 IDEs do the job in very similar manners. In most situations they both do a
 361 satisfactory job by producing the expected outcome. But they do nothing to check
 362 that the result does not break the semantics of the program. \See{correctness}
 363 The \emph{NetBeans} IDE implements this refactoring in a somewhat clumsy way.
 364 For starters, its default destination for the move is itself, although it
 365 refuses to perform the refactoring if chosen. But the worst part is that
 366 moving the method \method{f} of the code below to \type{X} will break the
 367 code. Given
 368
 369 \begin{minted}[samepage]{java}
 370 public class C {
 371     private X x;
 372     ...
 373     public void f() {
 374         x.m();
 375         x.n();
 376     }
 377 }
 378 \end{minted}
 379
 380 \noindent the move refactoring will produce the following in class \type{X}:
 381
 382 \begin{minted}[samepage]{java}
 383 public class X {
 384     ...
 385     public void f(C c) {
 386         c.x.m();
 387         c.x.n();
 388     }
 389 }
 390 \end{minted}
 391
 392 NetBeans will try to make code that calls the methods \method{m} and \method{n}
 393 of \type{X} by accessing them through \var{c.x}, where \var{c} is a parameter of
 394 type \type{C} that is added to the method \method{f} when it is moved.  If
 395 \var{c.x} for some reason is inaccessible to \type{X}, as in this case, the
 396 refactoring breaks the code, and it will not compile. NetBeans has a preview of the
 397 refactoring outcome, but that does not catch that it is about to do something
 398 stupid. Ironically, this ``feature'' of NetBeans keeps it from breaking the code
 399 in the example from section \ref{correctness}.
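
For comparison, one outcome of the move that would still compile is sketched below. This is an assumption about what a careful \MoveMethod could produce, not the output of any of the three IDEs. The bodies of \method{m} and \method{n}, the \var{log} field and the constructor of \type{C} are hypothetical and only serve to make the sketch self-contained.

\begin{minted}{java}
// Sketch of a Move Method result that compiles; not produced by any IDE.
class X {
    // Hypothetical log, only here to make the calls observable.
    final StringBuilder log = new StringBuilder();

    void m() { log.append("m"); }
    void n() { log.append("n"); }

    // f now lives in X and calls m and n directly on this.
    void f() {
        m();
        n();
    }
}

class C {
    private final X x;

    C(X x) {
        this.x = x;
    }

    // The old method is kept as a simple delegation.
    void f() {
        x.f();
    }
}
\end{minted}

Since \method{f} only uses members of \type{X}, it needs no parameter of type \type{C} after the move, and the private field \var{x} never has to be accessed from outside \type{C}.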
 400
 401 The IDEs under investigation seem to have fairly good support for primitive
 402 refactorings, but what about more complex ones, such as the \refactoring{Extract
 403 Class}\citing{refactoring}? The \refactoring{Extract Class} refactoring works by
 404 creating a class, then moving members to that class and accessing them from
 405 the old class via a reference to the new class. \emph{IntelliJ} seems to handle
 406 this in a fairly good manner, although, in the case of private methods, it
 407 leaves unused methods behind. These are methods that delegate to a field of the
 408 new class, but are not used anywhere. \emph{Eclipse} has added (or withdrawn)
 409 its own fun twist to the refactoring, and only allows for \emph{fields} to be
 410 moved to a new class. This makes it effectively only extracting a data
 411 structure, and calling it \refactoring{Extract Class} is a little misleading.
 412 One would often be better off with textual extract and paste than using the
 413 Extract Class refactoring in Eclipse. When it comes to \emph{NetBeans}, it does
 414 not even seem to have made an attempt at providing this refactoring. (Well, it
 415 probably has, but it does not show in the IDE.)
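
To make the difference concrete, the following is a minimal sketch of an \refactoring{Extract Class} result where both fields and a method have been moved. The classes \type{Person} and \type{TelephoneNumber} and all their members are hypothetical and do not show the output of any of the IDEs.

\begin{minted}{java}
// Hypothetical new class, holding the extracted fields and behavior.
class TelephoneNumber {
    private final String areaCode;
    private final String number;

    TelephoneNumber(String areaCode, String number) {
        this.areaCode = areaCode;
        this.number = number;
    }

    // This method was moved along with the fields it uses.
    String formatted() {
        return "(" + areaCode + ") " + number;
    }
}

// The old class accesses the moved members via a reference to the new class.
class Person {
    private final String name;
    private final TelephoneNumber officeTelephone;

    Person(String name, String areaCode, String number) {
        this.name = name;
        this.officeTelephone = new TelephoneNumber(areaCode, number);
    }

    String getName() {
        return name;
    }

    // Delegates to the extracted class.
    String getTelephoneNumber() {
        return officeTelephone.formatted();
    }
}
\end{minted}

If only the two fields had been moved, \type{TelephoneNumber} would be a plain data structure, and the formatting logic would remain in \type{Person}.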
 416
 417 \todoin{Visual Studio (C++/C\#), Smalltalk refactoring browser?,
 418 second refactoring rubicon?}
 419
 420 \section{Relation to design patterns}
 421 \todoin{refactoring to patterns?}
 422 \begin{comment}
 423
 424 \section{Classification of refactorings}
 425 % only interesting refactorings
 426 % with 2 detailed examples? One for structured and one for intra-method?
 427 % Is replacing Bubblesort with Quick Sort considered a refactoring?
 428
 429 \subsection{Structural refactorings}
 430
 431 \subsubsection{Primitive refactorings}
 432
 433 % Composing Methods
 434 \explanation{Extract Method}{You have a code fragment that can be grouped
 435 together.}{Turn the fragment into a method whose name explains the purpose of
 436 the method.}
 437
 438 \explanation{Inline Method}{A method's body is just as clear as its name.}{Put
 439 the method's body into the body of its callers and remove the method.}
 440
 441 \explanation{Inline Temp}{You have a temp that is assigned to once with a simple
 442 expression, and the temp is getting in the way of other refactorings.}{Replace
 443 all references to that temp with the expression}
 444
 445 % Moving Features Between Objects
 446 \explanation{Move Method}{A method is, or will be, using or used by more
 447 features of another class than the class on which it is defined.}{Create a new
 448 method with a similar body in the class it uses most. Either turn the old method
 449 into a simple delegation, or remove it altogether.}
 450
 451 \explanation{Move Field}{A field is, or will be, used by another class more than
 452 the class on which it is defined}{Create a new field in the target class, and
 453 change all its users.}
 454
 455 % Organizing Data
 456 \explanation{Replace Magic Number with Symbolic Constant}{You have a literal
 457 number with a particular meaning.}{Create a constant, name it after the meaning,
 458 and replace the number with it.}
 459
 460 \explanation{Encapsulate Field}{There is a public field.}{Make it private and
 461 provide accessors.}
 462
 463 \explanation{Replace Type Code with Class}{A class has a numeric type code that
 464 does not affect its behavior.}{Replace the number with a new class.}
 465
 466 \explanation{Replace Type Code with Subclasses}{You have an immutable type code
 467 that affects the behavior of a class.}{Replace the type code with subclasses.}
 468
 469 \explanation{Replace Type Code with State/Strategy}{You have a type code that
 470 affects the behavior of a class, but you cannot use subclassing.}{Replace the
 471 type code with a state object.}
 472
 473 % Simplifying Conditional Expressions
 474 \explanation{Consolidate Duplicate Conditional Fragments}{The same fragment of
 475 code is in all branches of a conditional expression.}{Move it outside of the
 476 expression.}
 477
 478 \explanation{Remove Control Flag}{You have a variable that is acting as a
 479 control flag for a series of boolean expressions.}{Use a break or return
 480 instead.}
 481
 482 \explanation{Replace Nested Conditional with Guard Clauses}{A method has
 483 conditional behavior that does not make clear the normal path of
 484 execution.}{Use guard clauses for all special cases.}
 485
 486 \explanation{Introduce Null Object}{You have repeated checks for a null
 487 value.}{Replace the null value with a null object.}
 488
 489 \explanation{Introduce Assertion}{A section of code assumes something about the
 490 state of the program.}{Make the assumption explicit with an assertion.}
 491
 492 % Making Method Calls Simpler
 493 \explanation{Rename Method}{The name of a method does not reveal its
 494 purpose.}{Change the name of the method}
 495
 496 \explanation{Add Parameter}{A method needs more information from its
 497 caller.}{Add a parameter for an object that can pass on this information.}
 498
 499 \explanation{Remove Parameter}{A parameter is no longer used by the method
 500 body.}{Remove it.}
 501
 502 %\explanation{Parameterize Method}{Several methods do similar things but with
 503 %different values contained in the method.}{Create one method that uses a
 504 %parameter for the different values.}
 505
 506 \explanation{Preserve Whole Object}{You are getting several values from an
 507 object and passing these values as parameters in a method call.}{Send the whole
 508 object instead.}
 509
 510 \explanation{Remove Setting Method}{A field should be set at creation time and
 511 never altered.}{Remove any setting method for that field.}
 512
 513 \explanation{Hide Method}{A method is not used by any other class.}{Make the
 514 method private.}
 515
 516 \explanation{Replace Constructor with Factory Method}{You want to do more than
 517 simple construction when you create an object}{Replace the constructor with a
 518 factory method.}
 519
 520 % Dealing with Generalization
 521 \explanation{Pull Up Field}{Two subclasses have the same field.}{Move the field
 522 to the superclass.}
 523
 524 \explanation{Pull Up Method}{You have methods with identical results on
 525 subclasses.}{Move them to the superclass.}
 526
 527 \explanation{Push Down Method}{Behavior on a superclass is relevant only for
 528 some of its subclasses.}{Move it to those subclasses.}
 529
 530 \explanation{Push Down Field}{A field is used only by some subclasses.}{Move the
 531 field to those subclasses}
 532
 533 \explanation{Extract Interface}{Several clients use the same subset of a class's
 534 interface, or two classes have part of their interfaces in common.}{Extract the
 535 subset into an interface.}
 536
 537 \explanation{Replace Inheritance with Delegation}{A subclass uses only part of a
 538 superclass's interface or does not want to inherit data.}{Create a field for the
 539 superclass, adjust methods to delegate to the superclass, and remove the
 540 subclassing.}
 541
 542 \explanation{Replace Delegation with Inheritance}{You're using delegation and
 543 are often writing many simple delegations for the entire interface}{Make the
 544 delegating class a subclass of the delegate.}
 545
 546 \subsubsection{Composite refactorings}
 547
 548 % Composing Methods
 549 % \explanation{Replace Method with Method Object}{}{}
 550
 551 % Moving Features Between Objects
 552 \explanation{Extract Class}{You have one class doing work that should be done by
 553 two}{Create a new class and move the relevant fields and methods from the old
 554 class into the new class.}
 555
 556 \explanation{Inline Class}{A class isn't doing very much.}{Move all its features
 557 into another class and delete it.}
 558
 559 \explanation{Hide Delegate}{A client is calling a delegate class of an
 560 object.}{Create Methods on the server to hide the delegate.}
 561
 562 \explanation{Remove Middle Man}{A class is doing too much simple delegation.}{Get
 563 the client to call the delegate directly.}
 564
 565 % Organizing Data
 566 \explanation{Replace Data Value with Object}{You have a data item that needs
 567 additional data or behavior.}{Turn the data item into an object.}
 568
 569 \explanation{Change Value to Reference}{You have a class with many equal
 570 instances that you want to replace with a single object.}{Turn the object into a
 571 reference object.}
 572
 573 \explanation{Encapsulate Collection}{A method returns a collection}{Make it
 574 return a read-only view and provide add/remove methods.}
 575
 576 % \explanation{Replace Array with Object}{}{}
 577
 578 \explanation{Replace Subclass with Fields}{You have subclasses that vary only in
 579 methods that return constant data.}{Change the methods to superclass fields and
 580 eliminate the subclasses.}
 581
 582 % Simplifying Conditional Expressions
 583 \explanation{Decompose Conditional}{You have a complicated conditional
 584 (if-then-else) statement.}{Extract methods from the condition, then part, and
 585 else part.}
 586
 587 \explanation{Consolidate Conditional Expression}{You have a sequence of
 588 conditional tests with the same result.}{Combine them into a single conditional
 589 expression and extract it.}
 590
 591 \explanation{Replace Conditional with Polymorphism}{You have a conditional that
 592 chooses different behavior depending on the type of an object.}{Move each leg
 593 of the conditional to an overriding method in a subclass. Make the original
 594 method abstract.}
 595
 596 % Making Method Calls Simpler
 597 \explanation{Replace Parameter with Method}{An object invokes a method, then
 598 passes the result as a parameter for a method. The receiver can also invoke this
 599 method.}{Remove the parameter and let the receiver invoke the method.}
 600
 601 \explanation{Introduce Parameter Object}{You have a group of parameters that
 602 naturally go together.}{Replace them with an object.}
 603
 604 % Dealing with Generalization
 605 \explanation{Extract Subclass}{A class has features that are used only in some
 606 instances.}{Create a subclass for that subset of features.}
 607
 608 \explanation{Extract Superclass}{You have two classes with similar
 609 features.}{Create a superclass and move the common features to the
 610 superclass.}
 611
 612 \explanation{Collapse Hierarchy}{A superclass and subclass are not very
 613 different.}{Merge them together.}
 614
 615 \explanation{Form Template Method}{You have two methods in subclasses that
 616 perform similar steps in the same order, yet the steps are different.}{Get the
 617 steps into methods with the same signature, so that the original methods become
 618 the same. Then you can pull them up.}
 619
 620
 621 \subsection{Functional refactorings}
 622
 623 \explanation{Substitute Algorithm}{You want to replace an algorithm with one
 624 that is clearer.}{Replace the body of the method with the new algorithm.}
 625
 626 \end{comment}
 627
 628 \section{The impact on software quality}
 629
 630 \subsection{What is meant by quality?}
 631 The term \emph{software quality} has many meanings. It all depends on the
 632 context we put it in. If we look at it with the eyes of a software developer, it
 633 usually means that the software is easily maintainable and testable, or in other
 634 words, that it is \emph{well designed}. This often correlates with the
 635 management scale, where \emph{keeping the schedule} and \emph{customer
 636 satisfaction} are at the center. From the customer's point of view, in addition to
 637 good usability, \emph{performance} and \emph{lack of bugs} are always
 638 appreciated, measurements that are also shared by the software developer. (In
 639 addition, such things as good documentation could be measured, but this is out
 640 of the scope of this document.)
 641
 642 \subsection{The impact on performance}
 643 \begin{quote}
 644   Refactoring certainly will make software go more slowly, but it also makes the
 645   software more amenable to performance tuning.\citing{refactoring} % page 69
 646 \end{quote}
 647
 648 \noindent There is a common belief that refactoring compromises performance, due
 649 to an increased degree of indirection and because polymorphism is assumed to be
 650 slower than conditionals.
 651
 652 In a survey, Demeyer\citing{demeyer2002} disproves this view in the case of
 653 polymorphism. He conducts an experiment on what he calls ``Transform Self Type
 654 Checks'', where you introduce a new polymorphic method and a new class hierarchy
 655 to get rid of a class' type checking of a ``type attribute''. He uses this kind
 656 of transformation to represent other ways of replacing conditionals with
 657 polymorphism as well. The experiment is performed on the C++ programming
 658 language and with three different compilers and platforms. \todo{But is the
 659 result better?} Demeyer concludes that, with compiler optimization turned on,
 660 polymorphism beats medium to large sized if-statements and does as well as
 661 case-statements.  (In accordance with his hypothesis, due to similarities
 662 between the way C++ handles polymorphism and case-statements.)
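
\noindent To illustrate the kind of transformation being measured, the
following minimal sketch shows a class that checks its own type attribute,
followed by a polymorphic equivalent. It is written in Java rather than the C++
used in the experiment, and all names in it (\type{TypeCheckedShape},
\type{Shape}, \type{Circle}, \type{Square}) are hypothetical and not taken from
Demeyer's work:

\begin{minted}{java}
// Before: the class inspects its own "type attribute".
class TypeCheckedShape {
    static final int CIRCLE = 0;
    static final int SQUARE = 1;

    private final int kind;    // the type attribute
    private final double size;

    TypeCheckedShape(int kind, double size) {
        this.kind = kind;
        this.size = size;
    }

    double area() {
        switch (kind) {
            case CIRCLE: return Math.PI * size * size;
            case SQUARE: return size * size;
            default: throw new IllegalStateException("unknown kind " + kind);
        }
    }
}

// After: one subclass per former value of the type attribute.
abstract class Shape {
    protected final double size;
    Shape(double size) { this.size = size; }
    abstract double area();
}

class Circle extends Shape {
    Circle(double size) { super(size); }
    @Override double area() { return Math.PI * size * size; }
}

class Square extends Shape {
    Square(double size) { super(size); }
    @Override double area() { return size * size; }
}
\end{minted}

\noindent The type check in \method{area} is replaced by dynamic dispatch, at
the price of one extra level of indirection for each call.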
 663
 664 \begin{quote}
 665   The interesting thing about performance is that if you analyze most programs,
 666   you find that they waste most of their time in a small fraction of the
 667   code.\citing{refactoring}
 668 \end{quote}
 669
 670 \noindent So, although an increased amount of method calls could potentially
 671 slow down programs, one should avoid premature optimization and sacrificing good
 672 design, leaving the performance tuning until after profiling\footnote{For an
 673   example of a Java profiler, check out VisualVM:
 674   \url{http://visualvm.java.net/}} the software and having isolated the actual
 675   problem areas.
 676
 677 \section{Composite refactorings} \label{intro_composite}
 678 \todo{motivation, examples, manual vs automated?, what about refactoring in a
 679 very large code base?}
 680 Generally, when thinking about refactoring, at the mechanical level, there are
 681 essentially two kinds of refactorings. There are the \emph{primitive}
 682 refactorings, and the \emph{composite} refactorings. A primitive refactoring can
 683 be defined like this:
 684
 685 \definition{A primitive refactoring is a refactoring that cannot be expressed in
 686 terms of other refactorings.}
 687
 688 \noindent Examples are the \refactoring{Pull Up Field} and \refactoring{Pull Up
 689 Method} refactorings\citing{refactoring}, which move members up in their class
 690 hierarchies.
 691
 692 A composite refactoring is more complex, and can be defined like this:
 693
 694 \definition{A composite refactoring is a refactoring that can be expressed in
 695 terms of two or more primitive refactorings.}
 696
 697 \noindent An example of a composite refactoring is the \refactoring{Extract
 698 Superclass} refactoring\citing{refactoring}. In its simplest form, it is composed
 699 of the previously described primitive refactorings, in addition to the
 700 \refactoring{Pull Up Constructor Body} refactoring\citing{refactoring}.  It works
 701 by creating an abstract superclass that the target class(es) inherit from, then
 702 by applying \refactoring{Pull Up Field}, \refactoring{Pull Up Method} and
 703 \refactoring{Pull Up Constructor Body} on the members that are to be members of
 704 the new superclass. For an overview of the \refactoring{Extract Superclass}
 705 refactoring, see figure \ref{fig:extractSuperclass}.
 706
 707 \begin{figure}[h]
 708   \centering
 709   \includegraphics[angle=270,width=\linewidth]{extractSuperclassItalic.pdf}
 710   \caption{The Extract Superclass refactoring}
 711   \label{fig:extractSuperclass}
 712 \end{figure}
 713
 714 \section{Manual vs. automated refactorings}
 715 Refactoring is something every programmer does, even if he or she does not know
 716 the term \emph{refactoring}. Every refinement of source code that does not alter
 717 the program's behavior is a refactoring. For small refactorings, such as
 718 \ExtractMethod, executing it manually is a manageable task, but is still
 719 prone to errors. Getting it right the first time is not easy, considering the
 720 signature and all the other aspects of the refactoring that have to be in place.
 721
 722 Take for instance the renaming of classes, methods and fields. For complex
 723 programs these refactorings are almost impossible to get right.  Attacking them
 724 with textual search and replace, or even regular expressions, will fall short on
 725 these tasks. Then it is crucial to have proper tool support that can perform
 726 them automatically: tools that can parse source code and thus have semantic
 727 knowledge about which occurrences of which names belong to what construct
 728 in the program. Even to attempt one of these complex tasks manually,
 729 one would have to be very confident in the existing test suite \see{testing}.
 730
 731 \section{Correctness of refactorings}\label{correctness}
 732 For automated refactorings to be truly useful, they must show a high degree of
 733 behavior preservation. This last sentence might seem obvious, but there are
 734 examples of refactorings in existing tools that break programs. I will now
 735 present an example of an \ExtractMethod refactoring followed by a \MoveMethod
 736 refactoring that breaks a program in both the \emph{Eclipse} and \emph{IntelliJ}
 737 IDEs\footnote{The NetBeans IDE handles this particular situation, mainly because
 738   its Move Method refactoring implementation is crippled in other ways
 739   \see{toolSupport}.}. The following piece of code shows the target for the
 740   composed refactoring:
 741
 742 \begin{minted}[linenos,samepage]{java}
 743 public class C {
 744     public X x = new X();
 745
 746     public void f() {
 747         x.m(this);
 748         x.n();
 749     }
 750 }
 751 \end{minted}
 752
 753 \noindent The next piece of code shows the destination of the refactoring. Note
 754 that the method \method{m(C c)} of class \type{X} assigns to the field \var{x}
 755 of the argument \var{c} that has type \type{C}:
 756
 757 \begin{minted}[samepage]{java}
 758 public class X {
 759     public void m(C c) {
 760         c.x = new X();
 761     }
 762     public void n() {}
 763 }
 764 \end{minted}
 765
 766 The refactoring sequence works by extracting lines 5 and 6 from the original
 767 class \type{C} into a method \method{f} with the statements from those lines as
 768 its method body. The method is then moved to the class \type{X}. The result is
 769 shown in the following two pieces of code:
 770
 771 \begin{minted}[linenos,samepage]{java}
 772 public class C {
 773     public X x = new X();
 774
 775     public void f() {
 776         x.f(this);
 777     }
 778 }
 779 \end{minted}
 780
 781 \begin{minted}[linenos,samepage]{java}
 782 public class X {
 783     public void m(C c) {
 784         c.x = new X();
 785     }
 786     public void n() {}
 787     public void f(C c) {
 788         m(c);
 789         n();
 790     }
 791 }
 792 \end{minted}
 793
 794 After the refactoring, the method \method{f} of class \type{C} calls the method
 795 \method{f} of class \type{X}, and the program breaks.  (See line 5 of the
 796 version of class \type{C} after the refactoring.) Before the refactoring, the
 797 methods \method{m} and \method{n} of class \type{X} are called on different
 798 object instances (see lines 5 and 6 of the original class \type{C}). After, they
 799 are called on the same object, and the statement on line 3 of class \type{X}
 800 (the version after the refactoring) no longer has any effect in our example.
 801
 802 The bug introduced in the previous example is of such a nature that it is very
 803 difficult to spot if the refactored code is not covered by tests. It does not
 804 generate compilation errors, and will thus only result in a runtime error or
 805 corrupted data, which might be hard to detect.
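
\noindent The difference in behavior can be made observable with a small,
self-contained sketch. The classes below reproduce the two versions of the
example side by side; the suffixes \type{Before} and \type{After}, the counter
field \var{nCalls} and the class \type{BehaviorChangeDemo} are additions made
only for this illustration and are not part of the original example:

\begin{minted}{java}
// Version before the refactoring, instrumented with a call counter.
class XBefore {
    int nCalls = 0;                 // added for observation only
    void m(CBefore c) { c.x = new XBefore(); }
    void n() { nCalls++; }
}

class CBefore {
    XBefore x = new XBefore();
    void f() {
        x.m(this);   // replaces this.x with a fresh object
        x.n();       // therefore runs on the fresh object
    }
}

// Version after Extract Method followed by Move Method.
class XAfter {
    int nCalls = 0;                 // added for observation only
    void m(CAfter c) { c.x = new XAfter(); }
    void n() { nCalls++; }
    void f(CAfter c) {
        m(c);        // replaces c.x ...
        n();         // ... but n() still runs on the old object
    }
}

class CAfter {
    XAfter x = new XAfter();
    void f() { x.f(this); }
}

class BehaviorChangeDemo {
    public static void main(String[] args) {
        CBefore before = new CBefore();
        before.f();
        CAfter after = new CAfter();
        after.f();
        // Number of n() calls received by the object that the field x
        // refers to once f() has returned.
        System.out.println("before: " + before.x.nCalls);
        System.out.println("after:  " + after.x.nCalls);
    }
}
\end{minted}

\noindent In the first version the object referred to by the field \var{x}
after the call has received one call to \method{n}; in the second version it
has received none. A unit test asserting on this count would pass before the
refactoring and fail after it.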
 806
 807 \section{Refactoring and testing}\label{testing}
 808 \begin{quote}
 809   If you want to refactor, the essential precondition is having solid
 810   tests.\citing{refactoring}
 811 \end{quote}
 812
 813 When refactoring, there are roughly two kinds of errors that can be made. There
 814 are errors that make the code unable to compile, and there are the silent
 815 errors, only popping up at runtime. Compile-time errors are the nice ones. They
 816 flash up at the moment they are made (at least when using an IDE), and are
 817 usually easy to fix. The other kind of error is the dangerous one. It is the
 818 kind of error introduced in the example of section \ref{correctness}. It is an
 819 error that sneaks into your code, possibly without you noticing. For discovering that
 820 kind of error when refactoring, it is essential to have good test coverage. It
 821 is not a way to \emph{prove} that the code is correct, but it is a way to make
 822 you confident that it \emph{probably} works as desired. In the context of test
 823 driven development, the tests are even a way to define how the program is
 824 supposed to work. It is then, by definition, working if the tests are passing.
 825
 826 If the test coverage for a code base is perfect, then it should, theoretically,
 827 be risk-free to perform refactorings on it. This is why tests and refactoring are
 828 such a great match.
 829
 830 \section{Software metrics}
 831
 832 %\part{The project}
 833 %\chapter{Planning the project}
 834 %\part{Conclusion}
 835 %\chapter{Results}
 836
 837
 838
 839 \chapter{\ldots}
 840 \todoin{write}
 841 \section{The problem statement}
 842 \section{Choosing the target language}
 843 Choosing which programming language to use as the target for manipulation is not
 844 a very difficult task. The language has to be an object-oriented programming
 845 language, and it must have existing tool support for refactoring. The
 846 \emph{Java} programming language\footnote{\url{https://www.java.com/}} is the
 847 dominating language when it comes to examples in the literature of refactoring,
 848 and is thus a natural choice. Java is perhaps currently the most influential
 849 programming language in the world, with its \emph{Java Virtual Machine} that
 850 runs on all of the most popular architectures and also supports\footnote{They
 851 compile to Java bytecode.} dozens of other programming languages, with
 852 \emph{Scala}, \emph{Clojure} and \emph{Groovy} as the most prominent ones. Java
 853 is currently the language that every other programming language is compared
 854 against. It is also the primary language of the author of this thesis.
 855
 856 \section{Choosing the tools}
 857 When choosing a tool for manipulating Java, there are certain criteria that
 858 have to be met. First of all, the tool should have some existing refactoring
 859 support that this thesis can build upon. Secondly, it should provide some kind of
 860 framework for parsing and analyzing Java source code. Third, it should itself be
 861 open source. This is both because of the need to be able to browse the code for
 862 the existing refactorings that are contained in the tool, and also because open
 863 source projects hold value in themselves. Another important aspect to consider
 864 is that open source projects of a certain size usually have large communities of
 865 people connected to them, who are committed to answering questions regarding the
 866 use and misuse of the products, which to a large degree are made by the community
 867 itself.
 868
 869 There is a certain class of tools that meet these criteria, namely the class of
 870 \emph{IDEs}\footnote{\emph{Integrated Development Environment}}. These are
 871 programs that are meant to support the whole production cycle of a computer
 872 program, and the most popular IDEs that support Java generally have quite good
 873 refactoring support.
 874
 875 The main contenders for this thesis are the \emph{Eclipse IDE}, with the
 876 \emph{Java development tools} (JDT), the \emph{IntelliJ IDEA Community Edition}
 877 and the \emph{NetBeans IDE}. \See{toolSupport} Eclipse and NetBeans are both
 878 free, open source and community driven, while the IntelliJ IDEA has an open
 879 sourced community edition that is free of charge, but also offers an
 880 \emph{Ultimate Edition} with an extended set of features, at additional cost.
 881 All three IDEs support adding plugins to extend their functionality and tools
 882 that can be used to parse and analyze Java source code. \todo{investigate if
 883 this is true} But one of the IDEs stands out as a favorite, and that is the
 884 \emph{Eclipse IDE}. This is the most popular\citing{javaReport2011} among them
 885 and seems to be the de facto standard IDE for Java development regardless of
 886 platform.
 887
 888
 889 \chapter{Refactorings in Eclipse JDT: Design, Shortcomings and Wishful
 890 Thinking}\label{ch:jdt_refactorings}
 891
 892 This chapter will deal with some of the design behind refactoring support in
 893 Eclipse, and the JDT in particular. This is followed by a section about
 894 shortcomings of the refactoring API in terms of composition of refactorings. The
 895 chapter will be concluded with a section describing some of the ways the
 896 implementation of refactorings in the JDT could have worked to facilitate
 897 composition of refactorings.
 898
 899 \section{Design}
 900 The refactoring world of Eclipse can in general be separated into two parts: The
 901 language independent part and the part written for a specific programming
 902 language -- the language that is the target of the supported refactorings.
 903 \todo{What about the language specific part?}
 904
 905 \subsection{The Language Toolkit}
 906 The Language Toolkit, or LTK for short, is the framework that is used to
 907 implement refactorings in Eclipse. It is language independent and provides the
 908 abstractions of a refactoring and the change it generates, in the form of the
 909 classes \typewithref{org.eclipse.ltk.core.refactoring}{Refactoring} and
 910 \typewithref{org.eclipse.ltk.core.refactoring}{Change}. (There are also parts of
 911 the LTK that are concerned with user interaction, but they will not be discussed
 912 here, since they are of little value to us and our use of the framework.)
 913
 914 \subsubsection{The Refactoring Class}
 915 The abstract class \type{Refactoring} is the core of the LTK framework. Every
 916 refactoring that is going to be supported by the LTK has to end up creating an
 917 instance of one of its subclasses. The main responsibilities of subclasses of
 918 \type{Refactoring} are to implement template methods for condition checking
 919 (\methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkInitialConditions}
 920 and
 921 \methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkFinalConditions}),
 922 in addition to the
 923 \methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{createChange}
 924 method that creates and returns an instance of the \type{Change} class.
 925
 926 If a refactoring is to support participation by others when it is
 927 executed, it has to be a processor-based
 928 refactoring\typeref{org.eclipse.ltk.core.refactoring.participants.ProcessorBasedRefactoring}.
 929 It then delegates to its given
 930 \typewithref{org.eclipse.ltk.core.refactoring.participants}{RefactoringProcessor}
 931 for condition checking and change creation.
 932
 933 \subsubsection{The Change Class}
 934 This class is the base class for objects that are responsible for performing the
 935 actual workspace transformations in a refactoring. The main responsibilities for
 936 its subclasses are to implement the
 937 \methodwithref{org.eclipse.ltk.core.refactoring.Change}{perform} and
 938 \methodwithref{org.eclipse.ltk.core.refactoring.Change}{isValid} methods. The
 939 \method{isValid} method verifies that the change object is valid and thus can be
 940 executed by calling its \method{perform} method. The \method{perform} method
 941 performs the desired change and returns an undo change that can be executed to
 942 reverse the effect of the transformation done by its originating change object.
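
\noindent The idea that performing a change yields its own undo change can be
sketched without any Eclipse dependencies. The following is a simplified
model, not the LTK API: the interface \type{SketchChange} and the classes
\type{AppendLine} and \type{RemoveLastLine} are hypothetical, and the real
\method{perform} and \method{isValid} methods have different signatures (they
take a progress monitor, among other things):

\begin{minted}{java}
import java.util.List;

// Simplified stand-in for the idea behind a change object; NOT the LTK class.
interface SketchChange {
    boolean isValid();
    SketchChange perform();   // returns the change that undoes this one
}

// Appends a line to an in-memory "document".
class AppendLine implements SketchChange {
    private final List<String> doc;
    private final String line;

    AppendLine(List<String> doc, String line) {
        this.doc = doc;
        this.line = line;
    }

    public boolean isValid() { return doc != null; }

    public SketchChange perform() {
        doc.add(line);
        return new RemoveLastLine(doc);
    }
}

// Removes the last line; performing it yields a change that puts it back.
class RemoveLastLine implements SketchChange {
    private final List<String> doc;

    RemoveLastLine(List<String> doc) { this.doc = doc; }

    public boolean isValid() { return doc != null && !doc.isEmpty(); }

    public SketchChange perform() {
        String removed = doc.remove(doc.size() - 1);
        return new AppendLine(doc, removed);
    }
}
\end{minted}

\noindent Performing the undo change in turn returns a change that redoes the
original transformation, which is what makes an undo/redo history possible.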
 943
 944 \subsubsection{Executing a Refactoring}\label{executing_refactoring}
 945 The life cycle of a refactoring generally follows two steps after creation:
 946 condition checking and change creation. By letting the refactoring object be
 947 handled by a
 948 \typewithref{org.eclipse.ltk.core.refactoring}{CheckConditionsOperation} that
 949 in turn is handled by a
 950 \typewithref{org.eclipse.ltk.core.refactoring}{CreateChangeOperation}, it is
 951 assured that the change creation process is managed in a proper manner.
 952
 953 The actual execution of a change object has to follow a detailed life cycle.
 954 This life cycle is honored if the \type{CreateChangeOperation} is handled by a
 955 \typewithref{org.eclipse.ltk.core.refactoring}{PerformChangeOperation}. If an
 956 undo manager\typeref{org.eclipse.ltk.core.refactoring.IUndoManager} is also set
 957 for the \type{PerformChangeOperation}, the undo change is added into the undo
 958 history.
 959
 960 \section{Shortcomings}
 961 This section is introduced naturally with a conclusion: The JDT refactoring
 962 implementation does not facilitate composition of refactorings.
 963 \todo{refine}This section will try to explain why, and also identify other
 964 shortcomings of both the usability and the readability of the JDT refactoring
 965 source code.
 966
 967 I will begin at the end and work my way toward the composition part of this
 968 section.
 969
 970 \subsection{Absence of Generics in Eclipse Source Code}
 971 This section is not only concerning the JDT refactoring API, but also large
 972 quantities of the Eclipse source code. The code shows a striking absence of the
 973 Java language feature of generics. It is hard to read a class' interface when
 974 methods return objects or take parameters of raw types such as \type{List} or
 975 \type{Map}. This sometimes results in having to read a lot of source code to
 976 understand what is going on, instead of relying on the available interfaces. In
 977 addition, it results in a lot of ugly code, making the use of typecasting more
 978 of a rule than an exception.
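
\noindent The difference is easy to see in a small example. The class
\type{RawVersusGeneric} below is hypothetical and not taken from the Eclipse
source code; it only contrasts a method returning a raw \type{List} with one
returning a parameterized list:

\begin{minted}{java}
import java.util.ArrayList;
import java.util.List;

class RawVersusGeneric {
    // Raw type: the signature does not reveal the element type.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List rawNames() {
        List names = new ArrayList();
        names.add("f");
        return names;
    }

    // Parameterized type: the element type is part of the interface.
    static List<String> typedNames() {
        List<String> names = new ArrayList<>();
        names.add("f");
        return names;
    }

    public static void main(String[] args) {
        String fromRaw = (String) rawNames().get(0); // cast required
        String fromTyped = typedNames().get(0);      // no cast needed
        System.out.println(fromRaw.equals(fromTyped));
    }
}
\end{minted}

\noindent With the raw version, the caller has to find out from documentation
or from the method body that the elements are strings, and a wrong guess only
shows up as a \type{ClassCastException} at runtime.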
 979
 980 \subsection{Composite Refactorings Will Not Appear as Atomic Actions}
 981
 982 \subsubsection{Missing Flexibility from JDT Refactorings}
 983 The JDT refactorings are not made with composition of refactorings in mind. When
 984 a JDT refactoring is executed, it assumes that all conditions for it to be
 985 applied successfully can be found by reading source files that have been
 986 persisted to disk. They can only operate on the actual source material, and not
 987 (in-memory) copies thereof. This constitutes a major disadvantage when trying to
 988 compose refactorings, since if an exception occurs in the middle of a sequence of
 989 refactorings, it can leave the project in a state where the composite
 990 refactoring was executed only partly. It makes it hard to discard the changes
 991 done without monitoring and consulting the undo manager, an approach that is not
 992 bulletproof.
 993
 994 \subsubsection{Broken Undo History}
 995 When designing a composed refactoring that is to be performed as a sequence of
 996 refactorings, you would like it to appear as a single change to the workspace.
 997 This implies that you would also like to be able to undo all the changes done by
 998 the refactoring in a single step. This is not the way it appears when a sequence
 999 of JDT refactorings is executed. It leaves the undo history filled up with
1000 individual undo actions corresponding to every single JDT refactoring in the
1001 sequence. This problem is not trivial to handle in Eclipse. (See section
1002 \ref{hacking_undo_history}.)
1003
1004 \section{Wishful Thinking}
1005
1006
1007
1008 \chapter{Composite Refactorings in Eclipse}
1009
1010 \section{A Simple Ad Hoc Model}
1011 As pointed out in chapter \ref{ch:jdt_refactorings}, the Eclipse JDT refactoring
1012 model is not very well suited for making composite refactorings. Therefore a
1013 simple model using changer objects (of type \type{RefaktorChanger}) is used as
1014 an abstraction layer on top of the existing Eclipse refactorings.
1015
1016 \section{The Extract and Move Method Refactoring}
1017 %The Extract and Move Method Refactoring is implemented mainly using these
1018 %classes:
1019 %\begin{itemize}
1020 %  \item \type{ExtractAndMoveMethodChanger}
1021 %  \item \type{ExtractAndMoveMethodPrefixesExtractor}
1022 %  \item \type{Prefix}
1023 %  \item \type{PrefixSet}
1024 %\end{itemize}
1025
1026 \subsection{The Building Blocks}
1027 This is a composite refactoring, and hence is built up using several primitive
1028 refactorings. These basic building blocks are, as its name implies, the
1029 \ExtractMethod refactoring\citing{refactoring} and the \MoveMethod
1030 refactoring\citing{refactoring}. In Eclipse, the implementations of these
1031 refactorings are found in the classes
1032 \typewithref{org.eclipse.jdt.internal.corext.refactoring.code}{ExtractMethodRefactoring}
1033 and
1034 \typewithref{org.eclipse.jdt.internal.corext.refactoring.structure}{MoveInstanceMethodProcessor},
1035 where the latter class is designed to be used together with the processor-based
1036 \typewithref{org.eclipse.ltk.core.refactoring.participants}{MoveRefactoring}.
1037
1038 \subsubsection{The ExtractMethodRefactoring Class}
1039 This class is quite simple in its use. The only parameters it requires for
1040 construction are a compilation
1041 unit\typeref{org.eclipse.jdt.core.ICompilationUnit}, the offset into the source
1042 code where the extraction shall start, and the length of the source to be
1043 extracted. Then you have to set the method name for the new method together with
1044 the access modifier to be used and some less interesting parameters.
1045
1046 \subsubsection{The MoveInstanceMethodProcessor Class}
1047 For the Move Method, the processor requires slightly more advanced input than
1048 the class for the Extract Method. For construction it requires a method
1049 handle\typeref{org.eclipse.jdt.core.IMethod} from the Java Model for the method
1050 that is to be moved. Then the target for the move has to be supplied as the
1051 variable binding from a chosen variable declaration. In addition to this, one
1052 has to set some parameters regarding setters/getters and delegation.
1053
1054 To make a whole refactoring from the processor, one has to construct a
1055 \type{MoveRefactoring} from it.
1056
1057 \subsection{The ExtractAndMoveMethodChanger Class}
1058 The \typewithref{no.uio.ifi.refaktor.changers}{ExtractAndMoveMethodChanger}
1059 class, which is a subclass of the class
1060 \typewithref{no.uio.ifi.refaktor.changers}{RefaktorChanger}, is the class
1061 responsible for composing the \type{ExtractMethodRefactoring} and the
1062 \type{MoveRefactoring}. Its constructor takes a project
1063 handle\typeref{org.eclipse.core.resources.IProject}, the method name for the new
1064 method and a \typewithref{no.uio.ifi.refaktor.utils}{SmartTextSelection}.
1065
1066 A \type{SmartTextSelection} is basically a text
1067 selection\typeref{org.eclipse.jface.text.ITextSelection} object that enforces
1068 the providing of the underlying document during creation. I.e. its
1069 \methodwithref{no.uio.ifi.refaktor.utils.SmartTextSelection}{getDocument} method
1070 will never return \type{null}.
1071
1072 Before extracting the new method, the possible targets for the move operation are
1073 found with the help of an
1074 \typewithref{no.uio.ifi.refaktor.extractors}{ExtractAndMoveMethodPrefixesExtractor}.
1075 The possible targets are computed from the prefixes that the extractor returns
1076 from its
1077 \methodwithref{no.uio.ifi.refaktor.extractors.ExtractAndMoveMethodPrefixesExtractor}{getSafePrefixes}
1078 method. The changer then chooses the most suitable target by finding the most
1079 frequently occurring prefix among the safe ones. The target is the type of the
1080 first part of the prefix.
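
\noindent The selection of the most frequently occurring prefix can be
sketched as follows. This is an illustration under simplifying assumptions and
not the code of the changer: prefixes are represented as plain strings with
one list entry per occurrence, the class \type{MostFrequentPrefix} is
hypothetical, and the tie-breaking rule (the prefix seen first wins) is an
assumption:

\begin{minted}{java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MostFrequentPrefix {
    // Returns the prefix occurring most often, or null for an empty list.
    // Ties are resolved in favor of the prefix seen first (an assumption).
    static String mostFrequent(List<String> safePrefixOccurrences) {
        // LinkedHashMap keeps the order in which prefixes were first seen.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String prefix : safePrefixOccurrences) {
            counts.merge(prefix, 1, Integer::sum);
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }
}
\end{minted}

\noindent For the occurrences \texttt{a.b}, \texttt{c}, \texttt{a.b} the
sketch selects \texttt{a.b}, and the move target would then be the type of
\texttt{a}.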
1081
1082 After finding a suitable target, the \type{ExtractAndMoveMethodChanger} first
1083 creates an \type{ExtractMethodRefactoring} and performs it as explained in
1084 section \ref{executing_refactoring} about the execution of refactorings. Then it
1085 creates and performs the \type{MoveRefactoring} in the same way, based on the
1086 changes done by the Extract Method refactoring.
1087
1088 \subsection{The ExtractAndMoveMethodPrefixesExtractor Class}
1089 This extractor extracts properties needed for building the Extract and Move
1090 Method refactoring. It searches through the given selection to find safe
1091 prefixes, and those prefixes form a base that can be used to compute possible
1092 targets for the move part of the refactoring.  It finds both the candidates, in
1093 the form of prefixes, and the non-candidates, called unfixes. All prefixes (and
1094 unfixes) are represented by a
1095 \typewithref{no.uio.ifi.refaktor.extractors}{Prefix}, and they are collected
1096 into prefix sets\typeref{no.uio.ifi.refaktor.extractors.PrefixSet}.
1097
1098 The prefixes and unfixes are found by property
1099 collectors\typeref{no.uio.ifi.refaktor.extractors.collectors.PropertyCollector}.
1100 A property collector follows the visitor pattern \cite{dp} and is of the
1101 \typewithref{org.eclipse.jdt.core.dom}{ASTVisitor} type.  An \type{ASTVisitor}
1102 visits nodes in an abstract syntax tree that forms the Java document object
1103 model. The tree consists of nodes of type
1104 \typewithref{org.eclipse.jdt.core.dom}{ASTNode}.
1105
1106 \subsubsection{The PrefixesCollector}
1107 The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{PrefixesCollector}
1108 is of type \type{PropertyCollector}. It visits expression
1109 statements\typeref{org.eclipse.jdt.core.dom.ExpressionStatement} and creates
1110 prefixes from their expressions in the case of method invocations. The prefixes
1111 found are registered with a prefix set, together with all their sub-prefixes.
1112 \todo{Rewrite in the case of changes to the way prefixes are found}
1113
1114 \subsubsection{The UnfixesCollector}
1115 The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{UnfixesCollector}
1116 finds unfixes within the selection. An unfix is a name that is assigned to
1117 within the selection. The reason that this cannot be allowed is that, if such a
1118 name were used as the target of the move, the result would be an assignment to the \type{this} keyword, which is not valid in Java.
1119
1120 \subsubsection{Computing Safe Prefixes}
1121 A safe prefix is a prefix that does not enclose an unfix. A prefix is enclosing
1122 an unfix if the unfix is in the set of its sub-prefixes. As an example,
1123 \texttt{``a.b''} is enclosing \texttt{``a''}, as is \texttt{``a''}. The safe
1124 prefixes are unified in a \type{PrefixSet} and can be fetched by calling the
1125 \method{getSafePrefixes} method of the
1126 \type{ExtractAndMoveMethodPrefixesExtractor}.
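
\noindent The definition of a safe prefix can be expressed as a short sketch.
It models prefixes and unfixes as dot-separated strings instead of
\type{Prefix} objects, and the class \type{SafePrefixes} with its methods is
hypothetical; it is meant to illustrate the definition, not to reproduce the
implementation in the extractor:

\begin{minted}{java}
import java.util.LinkedHashSet;
import java.util.Set;

class SafePrefixes {
    // Sub-prefixes of "a.b.c" are "a", "a.b" and "a.b.c" itself.
    static Set<String> subPrefixes(String prefix) {
        Set<String> result = new LinkedHashSet<>();
        StringBuilder current = new StringBuilder();
        for (String part : prefix.split("\\.")) {
            if (current.length() > 0) {
                current.append('.');
            }
            current.append(part);
            result.add(current.toString());
        }
        return result;
    }

    // A prefix encloses an unfix if the unfix is among its sub-prefixes.
    static boolean encloses(String prefix, String unfix) {
        return subPrefixes(prefix).contains(unfix);
    }

    // Keeps only the prefixes that do not enclose any unfix.
    static Set<String> safePrefixes(Set<String> prefixes, Set<String> unfixes) {
        Set<String> safe = new LinkedHashSet<>();
        for (String prefix : prefixes) {
            boolean enclosesUnfix = false;
            for (String unfix : unfixes) {
                if (encloses(prefix, unfix)) {
                    enclosesUnfix = true;
                    break;
                }
            }
            if (!enclosesUnfix) {
                safe.add(prefix);
            }
        }
        return safe;
    }
}
\end{minted}

\noindent With the prefixes \texttt{a}, \texttt{a.b} and \texttt{c.d} and the
single unfix \texttt{a}, only \texttt{c.d} remains as a safe prefix.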
1127
1128 \subsection{The Prefix Class}
1129 \todo{?}
1130 \subsection{The PrefixSet Class}
1131
1132 \subsection{Hacking the Refactoring Undo
1133 History}\label{hacking_undo_history}
1134 \todo{Where to put this section?}
1135
1136 As an attempt to make multiple subsequent changes to the workspace appear as a
1137 single action (i.e. make the undo changes appear as such), I tried to alter
1138 the undo changes\typeref{org.eclipse.ltk.core.refactoring.Change} in the history
1139 of the refactorings.
1140
1141 My first impulse was to remove the, in this case, last two undo changes from the
1142 undo manager\typeref{org.eclipse.ltk.core.refactoring.IUndoManager} for the
1143 Eclipse refactorings, and then add them to a composite
1144 change\typeref{org.eclipse.ltk.core.refactoring.CompositeChange} that could be
1145 added back to the manager. The interface of the undo manager does not offer a
1146 way to remove/pop the last added undo change, so a possible solution could be to
1147 decorate \cite{dp} the undo manager, to intercept and collect the undo changes
1148 before delegating to the \method{addUndo}
1149 method\methodref{org.eclipse.ltk.core.refactoring.IUndoManager}{addUndo} of the
1150 manager. Instead of giving it the intended undo change, a null change could be
1151 given to prevent it from making any changes if run. Then one could let the
1152 collected undo changes form a composite change to be added to the manager.
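
\noindent The decorator idea can be sketched in isolation. The code below does
not use the Eclipse API: \type{SketchUndoManager} is a hypothetical,
one-method stand-in for the much larger \type{IUndoManager} interface, undo
changes are modeled as \type{Runnable} objects, and
\type{RecordingUndoManager} exists only so that the effect can be inspected:

\begin{minted}{java}
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for an undo manager; NOT the real IUndoManager interface.
interface SketchUndoManager {
    void addUndo(String name, Runnable undo);
}

// Records what it is given, so that the sketch can be inspected.
class RecordingUndoManager implements SketchUndoManager {
    final List<String> names = new ArrayList<>();
    final List<Runnable> undos = new ArrayList<>();

    public void addUndo(String name, Runnable undo) {
        names.add(name);
        undos.add(undo);
    }
}

// Decorator: collects the real undo changes and passes a no-op on instead.
class CollectingUndoManager implements SketchUndoManager {
    private final SketchUndoManager delegate;
    private final List<Runnable> collected = new ArrayList<>();

    CollectingUndoManager(SketchUndoManager delegate) {
        this.delegate = delegate;
    }

    public void addUndo(String name, Runnable undo) {
        collected.add(undo);
        delegate.addUndo(name, () -> { });   // plays the role of the null change
    }

    // Registers one composite undo that runs the collected undos, newest first.
    void registerComposite(String name) {
        List<Runnable> snapshot = new ArrayList<>(collected);
        collected.clear();
        delegate.addUndo(name, () -> {
            for (int i = snapshot.size() - 1; i >= 0; i--) {
                snapshot.get(i).run();
            }
        });
    }
}
\end{minted}

\noindent After two intercepted undo changes and one call to
\method{registerComposite}, the wrapped manager in this sketch holds three
entries: two that do nothing and one that performs both collected undo changes
in reverse order.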
1153
1154 There is a technical challenge with this approach, and it relates to the undo
1155 manager, and the concrete implementation
1156 UndoManager2\typeref{org.eclipse.ltk.internal.core.refactoring.UndoManager2}.
1157 This implementation is designed in a way that it is not possible to just add an
1158 undo change, you have to do it in the context of an active
1159 operation\typeref{org.eclipse.core.commands.operations.TriggeredOperations}.
1160 One could imagine that it might be possible to trick the undo manager into
1161 believing that you are doing a real change, by executing a refactoring that is
1162 returning a kind of null change that is returning our composite change of undo
1163 refactorings when it is performed.
1164
1165 Apart from the technical problems with this solution, there is a functional
1166 problem: If it all had worked out as planned, this would leave the undo history
1167 in a dirty state, with multiple empty undo operations corresponding to each of
1168 the sequentially executed refactoring operations, followed by a composite undo
1169 change corresponding to an empty change of the workspace for rounding off our
1170 composite refactoring. The solution to this particular problem could be to
1171 intercept the registration of the intermediate changes in the undo manager, and
1172 only register the last empty change.
1173
1174 Unfortunately, not everything works as desired with this solution. The grouping
1175 of the undo changes into the composite change does not make the undo operation
1176 appear as an atomic operation. The undo operation is still split up into
1177 separate undo actions, corresponding to the change done by its originating
1178 refactoring. And in addition, the undo actions have to be performed separately in
1179 all the editors involved. This makes it no solution at all, but a step toward
1180 something worse.
1181
1182 There might be a solution to this problem, but it remains to be found. The
1183 design of the refactoring undo management is partly to be blamed for this, as it
1184 is too complex to be easily manipulated.
1185
1186
1187 \backmatter{}
1188 \printbibliography
1189 \listoftodos
1190 \end{document}