thesis/master-thesis-erlenkr.tex

   1 \documentclass[USenglish]{ifimaster}
   2 \usepackage{import}
   3 \usepackage[utf8]{inputenc}
   4 \usepackage[T1]{fontenc,url}
   5 \urlstyle{sf}
   6 \usepackage{babel,textcomp,csquotes,ifimasterforside,varioref,graphicx}
   7 \usepackage[style=numeric-comp,backend=bibtex]{biblatex}
   8 \usepackage{amsthm}
   9 \usepackage{todonotes}
  10 \usepackage{verbatim}
  11 \usepackage{perpage} %the perpage package
  12 \MakePerPage{footnote} %the perpage package command
  13
  14 \theoremstyle{plain}
  15 \newtheorem*{wordDef}{Definition}
  16
  17 \newcommand{\definition}[1]{\begin{wordDef}#1\end{wordDef}}
  18 \newcommand{\see}[1]{(see \ref{#1})}
  19 \newcommand{\explanation}[3]{\noindent\textbf{\textit{#1}}\\*\emph{When:}
  20 #2\\*\emph{How:} #3\\*[-7px]}
  21 \newcommand{\type}[1]{\texttt{#1}}
  22 \newcommand{\typeref}[1]{\footnote{\type{#1}}}
  23 \newcommand{\typewithref}[2]{\type{#2}\typeref{#1.#2}}
  24 \newcommand{\method}[1]{\type{#1}}
  25 \newcommand{\methodref}[2]{\footnote{\type{#1}\method{\##2()}}}
  26 \newcommand{\methodwithref}[2]{\method{#2}\footnote{\type{#1}\method{\##2()}}}
  27
  28
  29 \title{Refactoring}
  30 \subtitle{An unfinished essay}
  31 \author{Erlend Kristiansen}
  32
  33 \bibliography{bibliography/master-thesis-erlenkr-bibliography}
  34
  35 \begin{document}
  36 \ififorside
  37 \frontmatter{}
  38
  39
  40 \chapter*{Abstract}
  41 Empty document.
  42
  43 \tableofcontents{}
  44 \listoffigures{}
  45 \listoftables{}
  46
  47 \chapter*{Preface}
  48
  49 \mainmatter
  50
  51 \chapter{Introduction}
  52
  53 \section{What is Refactoring?}
  54
  55 This question is best answered in two parts: first by defining the concept of
  56 a refactoring, and then by discussing what the discipline of refactoring is all
  57 about. To make it clear from the beginning: the discussions in this report must
  58 be seen in the context of object-oriented programming languages. Although the
  59 techniques discussed may be applicable to languages from other paradigms, such
  60 languages will not be the subject of this report.
  61
  62 \subsection{Defining refactoring}
  63 Martin Fowler, in his masterpiece on refactoring \cite{refactoring}, defines a
  64 refactoring like this:
  65 \begin{quote}
  66   \emph{Refactoring} (noun): a change made to the \todo{what does he mean by
  67   internal?} internal structure of software to make it easier to understand and
  68   cheaper to modify without changing its observable
  69   behavior.~\cite{refactoring} % page 53
  70 \end{quote}
  71 This definition assigns additional meaning to the word \emph{refactoring},
  72 beyond the composition of the prefix \emph{re-}, usually meaning something like
  73 ``again'' or ``anew'', and the word \emph{factoring}, which can mean to determine
  74 the \emph{factors} of something. Here a \emph{factor} is close to the
  75 mathematical notion of something that divides a quantity without leaving a
  76 remainder. Fowler mixes the \emph{motivation} behind refactoring into his
  77 definition. Instead, it could be kept clean by considering only the mechanical
  78 and behavioral aspects of refactoring: to factor the program again, putting it
  79 together in a different way than before, while preserving the behavior of the
  80 program. An alternative definition could then be:
  81
  82 \definition{A refactoring is a transformation
  83 done to a program without altering its external behavior.}
  84
  85 From this we can conclude that a refactoring primarily changes how the
  86 \emph{code} of a program is perceived by the \emph{programmer}, and not the
  87 behavior experienced by any user of the program. Although the logical meaning is
  88 preserved, such changes could potentially alter the program's behavior when it
  89 comes to performance gains or penalties. So any logic depending on the
  90 performance of a program could make the program behave differently after a
  91 refactoring.
  92
  93 In the extreme case one could argue that such a thing as \emph{software
  94 obfuscation} is to refactor. If we were to define it as a refactoring, it could
  95 be defined as a composite refactoring \see{intro_composite}, consisting of, for
  96 instance, a series of rename refactorings. (But it could of course be much more
  97 complex, and the mechanics of it would not exactly be carved in stone.) To
  98 perform some serious obfuscation one would also take advantage of techniques not
  99 found among established refactorings, such as removing whitespace. This might
 100 not even generate a different syntax tree for languages not sensitive to
 101 whitespace, placing it in the gray area of which kinds of transformations are to
 102 be considered refactorings.
 103
 104 Finally, to \emph{refactor} is (quoting Martin Fowler)
 105 \begin{quote}
 106   \ldots to restructure software by applying a series of refactorings without
 107   changing its observable behavior.~\cite{refactoring} % page 54, definition
 108 \end{quote}
 109
 110 \todo{subsection with the history of refactoring?}
 111
 112 \subsection{Motivation}\todo{better headline? section vs. subsection}
 113 To get a grasp of what refactoring is all about, we can answer this question:
 114 \emph{Why do people refactor?} Possible answers could include: ``To remove
 115 duplication'' or ``to break up long methods''.  Practitioners of the art of
 116 Design Patterns~\cite{dp} could say that they do it to introduce a long-needed
 117 pattern into their program's design.  So it is safe to say that people's
 118 intention is to make their programs \emph{better} in some sense. But which
 119 aspects of the programs are being improved?
 120
 121 As already mentioned, people often refactor to get rid of duplication. They move
 122 identical or similar code into methods, and maybe push those up or down in
 123 their class hierarchies. They make template methods for overlapping algorithms
 124 \todo{better?: functionality} and so on.  It is all about gathering what belongs
 125 together and putting it all in one place.  And the result? The code is easier to
 126 maintain. When the implicit coupling between the code snippets is removed, the
 127 location of a bug is limited to only one place, and new functionality needs to
 128 be added in this one place only, instead of in a number of places people might
 129 not even remember.
 130
 131 The same people find out that their program contains a lot of long and
 132 hard-to-grasp methods. Then what do they do? They begin dividing their methods
 133 into smaller ones, using the \emph{Extract Method}
 134 refactoring~\cite{refactoring}.  Then they may discover something about their
 135 program that they were not aware of before, such as bugs they did not know about
 136 or could not find due to the complex structure of their program. \todo{Proof?}
 137 Making the methods smaller and giving good names to the new ones clarifies the
 138 algorithms and enhances the \emph{understandability} of the program. This makes
 139 simple refactoring an excellent method for exploring unknown program code, or
 140 code that you had forgotten that you wrote!
 141
 142 The word \emph{simple} came up in the last paragraph. In fact, most basic
 143 refactorings are simple. Their true power is revealed only when they are
 144 combined into larger, higher-level refactorings, called \emph{composite
 145 refactorings} \see{intro_composite}. Often the goal of such a series of
 146 refactorings is a design pattern. Thus the \emph{design} can be evolved
 147 throughout the lifetime of a program, as opposed to designing up-front.  It is
 148 all about being structured and taking small steps to improve the design.
 149
 150 Many refactorings are aimed at lowering the coupling between different classes
 151 and different layers of logic. Say for instance that the coupling between the
 152 user interface and the business logic of a program is lowered. Then the business
 153 logic of the program could much more easily be the target of automated tests,
 154 increasing the productivity in the software development process. It is also
 155 easier to distribute (e.g. between computers) the different components of a
 156 program if they are sufficiently decoupled.
 157
 158 Another effect of refactoring is that the increased separation of concerns
 159 coming out of many refactorings makes \emph{performance} easier to improve. When
 160 profiling programs, the problem areas are narrowed down to smaller parts of the
 161 code, which are easier to tune, and optimization can be performed only where
 162 needed and in a more effective way.
 163
 164 Refactoring program code with a goal in mind can give the code itself more
 165 value, in the form of robustness to bugs, understandability and
 166 maintainability. The first is an obvious advantage, but the latter two are also
 167 very important for software development. By incorporating
 168 refactoring in the development process, bugs are found faster, new functionality
 169 is added more easily and code is easier to understand by the next person exposed
 170 to it, which might as well be the person who wrote it. The consequence of this
 171 is that refactoring can increase the average productivity of the development
 172 process, and thus also add to the monetary value of a business in the long run.
 173 This last point should also open the eyes of some nearsighted managers who
 174 seldom see beyond the next milestone.
 175
 176 \todo{motivation: support new functionality?}
 177
 178 \subsection{The etymology of ``refactoring''}
 179 It is a little difficult to pinpoint the exact origin of the word
 180 ``refactoring'', as it seems to have evolved as part of a colloquial
 181 terminology, more than a scientific term. There is no authoritative source for a
 182 formal definition of it.
 183
 184 According to Martin Fowler~\cite{etymology-refactoring}, there may also be more
 185 than one origin of the word. The most well-known source, when it comes to the
 186 origin of \emph{refactoring}, is the Smalltalk\footnote{\emph{Smalltalk},
 187 object-oriented, dynamically typed, reflective programming language.}\todo{find
 188 reference to Smalltalk website or similar?} community and their famous
 189 \emph{Refactoring
 190 Browser}\footnote{\url{http://st-www.cs.illinois.edu/users/brant/Refactory/RefactoringBrowser.html}}
 191 described in the article \emph{A Refactoring Tool for
 192 Smalltalk}~\cite{refactoringBrowser1997}, published in 1997.
 193 Allegedly~\cite{etymology-refactoring}, the metaphor of factoring programs was
 194 also present in the Forth\footnote{\emph{Forth} -- stack-based, extensible
 195 programming language, without type-checking. See \url{http://www.forth.org}}
 196 community, and the word ``refactoring'' is mentioned in a book by Leo Brodie,
 197 called \emph{Thinking Forth}~\cite{brodie1984}, first published in
 198 1984\footnote{\emph{Thinking Forth} was first published in 1984 by the
 199 \emph{Forth Interest Group}.  Then it was reprinted in 1994 with minor
 200 typographical corrections, before it was transcribed into an electronic edition
 201 typeset in \LaTeX\ and published under a Creative Commons licence in 2004. The
 202 edition cited here is the 2004 edition, but the content should essentially be as
 203 in 1984.}. The exact word is printed in only one place\footnote{p. 232}, but the
 204 term \emph{factoring} is prominent in the book, which also contains a whole
 205 chapter dedicated to (re)factoring, and how to keep the (Forth) code clean and
 206 maintainable.
 207 \begin{quote}
 208   \ldots good factoring technique is perhaps the most important skill for a
 209   Forth programmer.~\cite{brodie1984}
 210 \end{quote}
 211 Brodie also expresses what \emph{factoring} means to him:
 212 \begin{quote}
 213   Factoring means organizing code into useful fragments. To make a fragment
 214   useful, you often must separate reusable parts from non-reusable parts. The
 215   reusable parts become new definitions. The non-reusable parts become arguments
 216   or parameters to the definitions.~\cite{brodie1984}
 217 \end{quote}
 218
 219 Fowler claims that the usage of the word \emph{refactoring} did not pass between
 220 the \emph{Forth} and \emph{Smalltalk} communities, but that it emerged
 221 independently in each of the communities.
 222
 223 \subsection{Notable contributions to the refactoring literature}
 224 \todo{Update with more contributions}
 225 \begin{description}
 226   \item[1992] William F. Opdyke submits his doctoral dissertation called
 227     \emph{Refactoring Object-Oriented Frameworks}~\cite{opdyke1992}. This
 228     work defines a set of refactorings, that are behavior preserving given that
 229     their preconditions are met. The dissertation is focused on the automation
 230     of refactorings.
 231   \item[1999] Martin Fowler et al.: \emph{Refactoring: Improving the Design of
 232     Existing Code}~\cite{refactoring}. This is maybe the most influential text
 233     on refactoring. It bears similarities to Opdyke's thesis~\cite{opdyke1992}
 234     in the way that it provides a catalog of refactorings. But Fowler's book is
 235     more about the craft of refactoring, as he focuses on establishing a
 236     vocabulary for refactoring, together with the mechanics of different
 237     refactorings and when to perform them. His methodology is also founded on
 238     the principles of test-driven development.
 239 \end{description}
 240
 241 \section{Tool support}
 242 \todo{write, section vs. subsection}
 243
 244 \section{Relation to design patterns}
 245 \todo{write, section vs. subsection}
 246 \begin{comment}
 247
 248 \section{Classification of refactorings}
 249 % only interesting refactorings
 250 % with 2 detailed examples? One for structured and one for intra-method?
 251 % Is replacing Bubblesort with Quick Sort considered a refactoring?
 252
 253 \subsection{Structural refactorings}
 254
 255 \subsubsection{Basic refactorings}
 256
 257 % Composing Methods
 258 \explanation{Extract Method}{You have a code fragment that can be grouped
 259 together.}{Turn the fragment into a method whose name explains the purpose of
 260 the method.}
 261
 262 \explanation{Inline Method}{A method's body is just as clear as its name.}{Put
 263 the method's body into the body of its callers and remove the method.}
 264
 265 \explanation{Inline Temp}{You have a temp that is assigned to once with a simple
 266 expression, and the temp is getting in the way of other refactorings.}{Replace
 267 all references to that temp with the expression}
 268
 269 % Moving Features Between Objects
 270 \explanation{Move Method}{A method is, or will be, using or used by more
 271 features of another class than the class on which it is defined.}{Create a new
 272 method with a similar body in the class it uses most. Either turn the old method
 273 into a simple delegation, or remove it altogether.}
 274
 275 \explanation{Move Field}{A field is, or will be, used by another class more than
 276 the class on which it is defined}{Create a new field in the target class, and
 277 change all its users.}
 278
 279 % Organizing Data
 280 \explanation{Replace Magic Number with Symbolic Constant}{You have a literal
 281 number with a particular meaning.}{Create a constant, name it after the meaning,
 282 and replace the number with it.}
 283
 284 \explanation{Encapsulate Field}{There is a public field.}{Make it private and
 285 provide accessors.}
 286
 287 \explanation{Replace Type Code with Class}{A class has a numeric type code that
 288 does not affect its behavior.}{Replace the number with a new class.}
 289
 290 \explanation{Replace Type Code with Subclasses}{You have an immutable type code
 291 that affects the behavior of a class.}{Replace the type code with subclasses.}
 292
 293 \explanation{Replace Type Code with State/Strategy}{You have a type code that
 294 affects the behavior of a class, but you cannot use subclassing.}{Replace the
 295 type code with a state object.}
 296
 297 % Simplifying Conditional Expressions
 298 \explanation{Consolidate Duplicate Conditional Fragments}{The same fragment of
 299 code is in all branches of a conditional expression.}{Move it outside of the
 300 expression.}
 301
 302 \explanation{Remove Control Flag}{You have a variable that is acting as a
 303 control flag for a series of boolean expressions.}{Use a break or return
 304 instead.}
 305
 306 \explanation{Replace Nested Conditional with Guard Clauses}{A method has
 307 conditional behavior that does not make clear the normal path of
 308 execution.}{Use guard clauses for all special cases.}
 309
 310 \explanation{Introduce Null Object}{You have repeated checks for a null
 311 value.}{Replace the null value with a null object.}
 312
 313 \explanation{Introduce Assertion}{A section of code assumes something about the
 314 state of the program.}{Make the assumption explicit with an assertion.}
 315
 316 % Making Method Calls Simpler
 317 \explanation{Rename Method}{The name of a method does not reveal its
 318 purpose.}{Change the name of the method}
 319
 320 \explanation{Add Parameter}{A method needs more information from its
 321 caller.}{Add a parameter for an object that can pass on this information.}
 322
 323 \explanation{Remove Parameter}{A parameter is no longer used by the method
 324 body.}{Remove it.}
 325
 326 %\explanation{Parameterize Method}{Several methods do similar things but with
 327 %different values contained in the method.}{Create one method that uses a
 328 %parameter for the different values.}
 329
 330 \explanation{Preserve Whole Object}{You are getting several values from an
 331 object and passing these values as parameters in a method call.}{Send the whole
 332 object instead.}
 333
 334 \explanation{Remove Setting Method}{A field should be set at creation time and
 335 never altered.}{Remove any setting method for that field.}
 336
 337 \explanation{Hide Method}{A method is not used by any other class.}{Make the
 338 method private.}
 339
 340 \explanation{Replace Constructor with Factory Method}{You want to do more than
 341 simple construction when you create an object}{Replace the constructor with a
 342 factory method.}
 343
 344 % Dealing with Generalization
 345 \explanation{Pull Up Field}{Two subclasses have the same field.}{Move the field
 346 to the superclass.}
 347
 348 \explanation{Pull Up Method}{You have methods with identical results on
 349 subclasses.}{Move them to the superclass.}
 350
 351 \explanation{Push Down Method}{Behavior on a superclass is relevant only for
 352 some of its subclasses.}{Move it to those subclasses.}
 353
 354 \explanation{Push Down Field}{A field is used only by some subclasses.}{Move the
 355 field to those subclasses}
 356
 357 \explanation{Extract Interface}{Several clients use the same subset of a class's
 358 interface, or two classes have part of their interfaces in common.}{Extract the
 359 subset into an interface.}
 360
 361 \explanation{Replace Inheritance with Delegation}{A subclass uses only part of a
 362 superclass's interface or does not want to inherit data.}{Create a field for the
 363 superclass, adjust methods to delegate to the superclass, and remove the
 364 subclassing.}
 365
 366 \explanation{Replace Delegation with Inheritance}{You're using delegation and
 367 are often writing many simple delegations for the entire interface}{Make the
 368 delegating class a subclass of the delegate.}
 369
 370 \subsubsection{Composite refactorings}
 371
 372 % Composing Methods
 373 % \explanation{Replace Method with Method Object}{}{}
 374
 375 % Moving Features Between Objects
 376 \explanation{Extract Class}{You have one class doing work that should be done by
 377 two}{Create a new class and move the relevant fields and methods from the old
 378 class into the new class.}
 379
 380 \explanation{Inline Class}{A class isn't doing very much.}{Move all its features
 381 into another class and delete it.}
 382
 383 \explanation{Hide Delegate}{A client is calling a delegate class of an
 384 object.}{Create Methods on the server to hide the delegate.}
 385
 386 \explanation{Remove Middle Man}{A class is doing too much simple delegation.}{Get
 387 the client to call the delegate directly.}
 388
 389 % Organizing Data
 390 \explanation{Replace Data Value with Object}{You have a data item that needs
 391 additional data or behavior.}{Turn the data item into an object.}
 392
 393 \explanation{Change Value to Reference}{You have a class with many equal
 394 instances that you want to replace with a single object.}{Turn the object into a
 395 reference object.}
 396
 397 \explanation{Encapsulate Collection}{A method returns a collection}{Make it
 398 return a read-only view and provide add/remove methods.}
 399
 400 % \explanation{Replace Array with Object}{}{}
 401
 402 \explanation{Replace Subclass with Fields}{You have subclasses that vary only in
 403 methods that return constant data.}{Change the methods to superclass fields and
 404 eliminate the subclasses.}
 405
 406 % Simplifying Conditional Expressions
 407 \explanation{Decompose Conditional}{You have a complicated conditional
 408 (if-then-else) statement.}{Extract methods from the condition, then part, and
 409 else part.}
 410
 411 \explanation{Consolidate Conditional Expression}{You have a sequence of
 412 conditional tests with the same result.}{Combine them into a single conditional
 413 expression and extract it.}
 414
 415 \explanation{Replace Conditional with Polymorphism}{You have a conditional that
 416 chooses different behavior depending on the type of an object.}{Move each leg
 417 of the conditional to an overriding method in a subclass. Make the original
 418 method abstract.}
 419
 420 % Making Method Calls Simpler
 421 \explanation{Replace Parameter with Method}{An object invokes a method, then
 422 passes the result as a parameter for a method. The receiver can also invoke this
 423 method.}{Remove the parameter and let the receiver invoke the method.}
 424
 425 \explanation{Introduce Parameter Object}{You have a group of parameters that
 426 naturally go together.}{Replace them with an object.}
 427
 428 % Dealing with Generalization
 429 \explanation{Extract Subclass}{A class has features that are used only in some
 430 instances.}{Create a subclass for that subset of features.}
 431
 432 \explanation{Extract Superclass}{You have two classes with similar
 433 features.}{Create a superclass and move the common features to the
 434 superclass.}
 435
 436 \explanation{Collapse Hierarchy}{A superclass and subclass are not very
 437 different.}{Merge them together.}
 438
 439 \explanation{Form Template Method}{You have two methods in subclasses that
 440 perform similar steps in the same order, yet the steps are different.}{Get the
 441 steps into methods with the same signature, so that the original methods become
 442 the same. Then you can pull them up.}
 443
 444
 445 \subsection{Functional refactorings}
 446
 447 \explanation{Substitute Algorithm}{You want to replace an algorithm with one
 448 that is clearer.}{Replace the body of the method with the new algorithm.}
 449
 450 \end{comment}
 451
 452 \section{The impact on software quality}
 453
 454 \subsection{What is meant by quality?}
 455 The term \emph{software quality} has many meanings. It all depends on the
 456 context we put it in. If we look at it with the eyes of a software developer, it
 457 usually means that the software is easily maintainable and testable, or in other
 458 words, that it is \emph{well designed}. This often correlates with the
 459 management scale, where \emph{keeping the schedule} and \emph{customer
 460 satisfaction} are at the center. From the customer's point of view, in addition
 461 to good usability, \emph{performance} and \emph{lack of bugs} are always
 462 appreciated, measurements that are also shared by the software developer. (In
 463 addition, such things as good documentation could be measured, but this is out
 464 of the scope of this document.)
 465
 466 \subsection{The impact on performance}
 467 \begin{quote}
 468   Refactoring certainly will make software go more slowly, but it also makes the
 469   software more amenable to performance tuning.~\cite{refactoring} % page 69
 470 \end{quote}
 471 There is a common belief that refactoring compromises performance, due to an
 472 increased degree of indirection and because polymorphism is assumed to be slower
 473 than conditionals.
 474
 475 In a survey, Demeyer~\cite{demeyer2002} disproves this view in the case of
 476 polymorphism. He conducts an experiment on what he calls ``Transform Self Type
 477 Checks'', where you introduce a new polymorphic method and a new class hierarchy
 478 to get rid of a class's type checking of a ``type attribute''. He uses this kind
 479 of transformation to represent other ways of replacing conditionals with
 480 polymorphism as well. The experiment is performed on the C++ programming
 481 language and with three different compilers and platforms. \todo{But is the
 482 result better?} Demeyer concludes that, with compiler optimization turned on,
 483 polymorphism beats medium to large sized if-statements and does as well as
 484 case-statements.  (In accordance with his hypothesis, due to similarities
 485 between the way C++ handles polymorphism and case-statements.)
 486 \begin{quote}
 487   The interesting thing about performance is that if you analyze most programs,
 488   you find that they waste most of their time in a small fraction of the code.
 489   ~\cite{refactoring}
 490 \end{quote}
 491 So, although an increased number of method calls could potentially slow down
 492 programs, one should avoid premature optimization and sacrificing good design,
 493 leaving the performance tuning until after profiling\footnote{For an example of
 494   a Java profiler, check out VisualVM: \url{http://visualvm.java.net/}} the
 495   software and having isolated the actual problem areas.
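
The kind of transformation discussed above, where a conditional on a ``type
attribute'' is replaced by polymorphism, can be sketched as follows. This is a
minimal Java illustration, not the C++ code used in Demeyer's experiment, and
all class names (\type{ShapeBefore}, \type{Shape}, \type{Circle},
\type{Square}) are hypothetical.

\begin{verbatim}
```java
// Before: the class inspects its own "type attribute" in a conditional.
class ShapeBefore {
    static final int CIRCLE = 0;
    static final int SQUARE = 1;

    private final int kind;     // the "type attribute"
    private final double size;

    ShapeBefore(int kind, double size) {
        this.kind = kind;
        this.size = size;
    }

    double area() {
        if (kind == CIRCLE) {
            return Math.PI * size * size;
        } else if (kind == SQUARE) {
            return size * size;
        }
        throw new IllegalStateException("unknown kind: " + kind);
    }
}

// After: one subclass per former type code. Dynamic dispatch on area()
// replaces the conditional.
abstract class Shape {
    abstract double area();
}

class Circle extends Shape {
    private final double radius;

    Circle(double radius) { this.radius = radius; }

    @Override
    double area() { return Math.PI * radius * radius; }
}

class Square extends Shape {
    private final double side;

    Square(double side) { this.side = side; }

    @Override
    double area() { return side * side; }
}
```
\end{verbatim}

Both versions compute the same areas. Only the place where the decision about
the kind of shape is made has moved, from an explicit conditional to the
method dispatch.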
 496
 497
 498
 499 \section{Correctness of refactorings}
 500 % Volker's example?
 501
 502 \section{Composite refactorings} \label{intro_composite}
 503 % motivation, examples
 504 % manual vs automated?
 505 % what about refactoring in a very large code base?
 506
 507 \section{Software metrics}
 508
 509
 510 %\part{The project}
 511 %\chapter{Planning the project}
 512 %\part{Conclusion}
 513 %\chapter{Results}
 514
 515
 516
 517 \chapter{\ldots}
 518 \todo{write}
 519 \section{The problem statement}
 520 \section{Choosing the language}
 521 \section{Choosing the tool}
 522
 523 \chapter{Refactorings in Eclipse JDT: Design, Shortcomings and Wishful
 524 Thinking}\label{ch:jdt_refactorings}
 525
 526 This chapter deals with some of the design behind refactoring support in
 527 Eclipse, and the JDT in particular. It is followed by a section about
 528 shortcomings of the refactoring API in terms of composition of refactorings. The
 529 chapter concludes with a section describing some of the ways the
 530 implementation of refactorings in the JDT could have worked to facilitate
 531 composition of refactorings.
 532
 533 \section{Design}
 534 The refactoring world of Eclipse can in general be separated into two parts: The
 535 language independent part and the part written for a specific programming
 536 language -- the language that is the target of the supported refactorings.
 537 \todo{What about the language specific part?}
 538
 539 \subsection{The Language Toolkit}
 540 The Language Toolkit, or LTK for short, is the framework that is used to
 541 implement refactorings in Eclipse. It is language independent and provides the
 542 abstractions of a refactoring and the change it generates, in the form of the
 543 classes \typewithref{org.eclipse.ltk.core.refactoring}{Refactoring} and
 544 \typewithref{org.eclipse.ltk.core.refactoring}{Change}. (There are also parts of
 545 the LTK that are concerned with user interaction, but they will not be discussed
 546 here, since they are of little value to us and our use of the framework.)
 547
 548 \subsubsection{The Refactoring Class}
 549 The abstract class \type{Refactoring} is the core of the LTK framework. Every
 550 refactoring that is going to be supported by the LTK has to end up creating an
 551 instance of one of its subclasses. The main responsibilities of subclasses of
 552 \type{Refactoring} are to implement template methods for condition checking
 553 (\methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkInitialConditions}
 554 and
 555 \methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkFinalConditions}),
 556 in addition to the
 557 \methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{createChange}
 558 method that creates and returns an instance of the \type{Change} class.
 559
 560 If a refactoring is to let others participate in it when it is
 561 executed, the refactoring has to be a processor-based
 562 refactoring\typeref{org.eclipse.ltk.core.refactoring.participants.ProcessorBasedRefactoring}.
 563 It then delegates to its given
 564 \typewithref{org.eclipse.ltk.core.refactoring.participants}{RefactoringProcessor}
 565 for condition checking and change creation.
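
The division of responsibilities described above can be illustrated with a
small, self-contained Java sketch. It does not use the LTK itself:
\type{SketchRefactoring}, \type{SketchStatus}, \type{SketchEdit} and
\type{RenameSketch} are hypothetical stand-ins, and details of the real API,
such as progress monitors, are left out. The renaming is done by naive text
replacement, only to keep the example short.

\begin{verbatim}
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a status object that collects problems.
class SketchStatus {
    private final List<String> errors = new ArrayList<>();

    void addError(String message) { errors.add(message); }

    boolean hasError() { return !errors.isEmpty(); }
}

// Hypothetical stand-in for the change a refactoring produces.
interface SketchEdit {
    void apply();
}

// Hypothetical stand-in for the abstract refactoring class: subclasses
// fill in the condition checks and the change creation.
abstract class SketchRefactoring {
    abstract SketchStatus checkInitialConditions();
    abstract SketchStatus checkFinalConditions();
    abstract SketchEdit createChange();
}

// A toy refactoring that renames an identifier by text replacement.
class RenameSketch extends SketchRefactoring {
    private final StringBuilder source;
    private final String oldName;
    private final String newName;

    RenameSketch(StringBuilder source, String oldName, String newName) {
        this.source = source;
        this.oldName = oldName;
        this.newName = newName;
    }

    @Override
    SketchStatus checkInitialConditions() {
        SketchStatus status = new SketchStatus();
        if (source.indexOf(oldName) < 0) {
            status.addError("Nothing to rename: " + oldName);
        }
        return status;
    }

    @Override
    SketchStatus checkFinalConditions() {
        SketchStatus status = new SketchStatus();
        if (source.indexOf(newName) >= 0) {
            status.addError("Name already in use: " + newName);
        }
        return status;
    }

    @Override
    SketchEdit createChange() {
        // Nothing is modified here; the edit runs when apply() is called.
        return () -> {
            String renamed = source.toString().replace(oldName, newName);
            source.setLength(0);
            source.append(renamed);
        };
    }
}
```
\end{verbatim}

The point of the sketch is the separation: the condition checks only inspect
the source, while the object returned from \method{createChange} is the only
thing that modifies it.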
 566
 567 \subsubsection{The Change Class}
 568 This class is the base class for objects that are responsible for performing the
 569 actual workspace transformations in a refactoring. The main responsibilities of
 570 its subclasses are to implement the
 571 \methodwithref{org.eclipse.ltk.core.refactoring.Change}{perform} and
 572 \methodwithref{org.eclipse.ltk.core.refactoring.Change}{isValid} methods. The
 573 \method{isValid} method verifies that the change object is valid and thus can be
 574 executed by calling its \method{perform} method. The \method{perform} method
 575 performs the desired change and returns an undo change that can be executed to
 576 reverse the effect of the transformation done by its originating change object.
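
The relationship between validity checking, performing a change and its undo
change can be sketched in plain Java. The classes \type{UndoableChange} and
\type{ReplaceTextChange} below are hypothetical and much simpler than the LTK
\type{Change} class. They only show the idea that performing a change yields
another change that reverses it.

\begin{verbatim}
```java
// Hypothetical stand-in for a change object.
abstract class UndoableChange {
    // True if the change can still be applied to its target.
    abstract boolean isValid();

    // Applies the change and returns a change that undoes it.
    abstract UndoableChange perform();
}

// Replaces the whole content of a document, but only if the document
// still has the content the change was created for.
class ReplaceTextChange extends UndoableChange {
    private final StringBuilder document;
    private final String expected;
    private final String replacement;

    ReplaceTextChange(StringBuilder document, String expected,
                      String replacement) {
        this.document = document;
        this.expected = expected;
        this.replacement = replacement;
    }

    @Override
    boolean isValid() {
        return document.toString().equals(expected);
    }

    @Override
    UndoableChange perform() {
        if (!isValid()) {
            throw new IllegalStateException("document was modified");
        }
        document.setLength(0);
        document.append(replacement);
        // The undo change is the same kind of change with the roles
        // of the old and new content swapped.
        return new ReplaceTextChange(document, replacement, expected);
    }
}
```
\end{verbatim}

In this sketch, a change that has already been performed is no longer valid,
since the document no longer has the expected content, while its undo change
is valid until the document is modified again.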
 577
 578 \subsubsection{Executing a Refactoring}\label{executing_refactoring}
 579 The life cycle of a refactoring generally follows two steps after creation:
 580 condition checking and change creation. By letting the refactoring object be
 581 handled by a
 582 \typewithref{org.eclipse.ltk.core.refactoring}{CheckConditionsOperation} that
 583 in turn is handled by a
 584 \typewithref{org.eclipse.ltk.core.refactoring}{CreateChangeOperation}, it is
 585 assured that the change creation process is managed in a proper manner.
 586
 587 The actual execution of a change object has to follow a detailed life cycle.
 588 This life cycle is honored if the \type{CreateChangeOperation} is handled by a
 589 \typewithref{org.eclipse.ltk.core.refactoring}{PerformChangeOperation}. If an
 590 undo manager\typeref{org.eclipse.ltk.core.refactoring.IUndoManager} is also set
 591 for the \type{PerformChangeOperation}, the undo change is added into the undo
 592 history.
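
The order of the steps described above (condition checking, change creation,
change execution and registration of the undo change) can be summarized in a
simplified model. The sketch below is self-contained Java and does not use the
LTK operation classes. The names \type{LifecyclePlan}, \type{LifecycleStep},
\type{LifecycleRunner}, \type{AddLine}, \type{RemoveLine} and
\type{AddLinePlan} are hypothetical.

\begin{verbatim}
```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a change: performing it yields its undo.
interface LifecycleStep {
    LifecycleStep perform();
}

// Hypothetical stand-in for a refactoring.
interface LifecyclePlan {
    boolean conditionsHold();
    LifecycleStep createChange();
}

// Runs the steps in a fixed order and records the undo change.
class LifecycleRunner {
    private final Deque<LifecycleStep> undoHistory = new ArrayDeque<>();

    boolean run(LifecyclePlan plan) {
        if (!plan.conditionsHold()) {               // 1. condition checking
            return false;
        }
        LifecycleStep change = plan.createChange(); // 2. change creation
        LifecycleStep undo = change.perform();      // 3. change execution
        undoHistory.push(undo);                     // 4. undo registration
        return true;
    }

    boolean undoLast() {
        if (undoHistory.isEmpty()) {
            return false;
        }
        undoHistory.pop().perform();
        return true;
    }

    int undoDepth() { return undoHistory.size(); }
}

// Toy change pair: adding a line is undone by removing it.
class AddLine implements LifecycleStep {
    private final List<String> lines;
    private final String line;

    AddLine(List<String> lines, String line) {
        this.lines = lines;
        this.line = line;
    }

    @Override
    public LifecycleStep perform() {
        lines.add(line);
        return new RemoveLine(lines, line);
    }
}

class RemoveLine implements LifecycleStep {
    private final List<String> lines;
    private final String line;

    RemoveLine(List<String> lines, String line) {
        this.lines = lines;
        this.line = line;
    }

    @Override
    public LifecycleStep perform() {
        lines.remove(line);
        return new AddLine(lines, line);
    }
}

// Toy refactoring: only applicable while the line is not yet present.
class AddLinePlan implements LifecyclePlan {
    private final List<String> lines;
    private final String line;

    AddLinePlan(List<String> lines, String line) {
        this.lines = lines;
        this.line = line;
    }

    @Override
    public boolean conditionsHold() { return !lines.contains(line); }

    @Override
    public LifecycleStep createChange() { return new AddLine(lines, line); }
}
```
\end{verbatim}

Note that in this model every successful run adds exactly one entry to the
undo history, so running two plans in sequence leaves two separate undo
entries.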
 593
 594 \section{Shortcomings}
 595 This section is introduced naturally with a conclusion: The JDT refactoring
 596 implementation does not facilitate composition of refactorings.
 597 \todo{refine}This section will try to explain why, and also identify other
 598 shortcomings of both the usability and the readability of the JDT refactoring
 599 source code.
 600
 601 I will begin at the end and work my way toward the composition part of this
 602 section.
 603
 604 \subsection{Absence of Generics in Eclipse Source Code}
 605 This section does not only concern the JDT refactoring API, but also large
 606 quantities of the Eclipse source code. The code shows a striking absence of the
 607 Java language feature of generics. It is hard to read a class's interface when
 608 methods return objects or take parameters of raw types such as \type{List} or
 609 \type{Map}. This sometimes results in having to read a lot of source code to
 610 understand what is going on, instead of relying on the available interfaces. In
 611 addition, it results in a lot of ugly code, making the use of typecasting more
 612 of a rule than an exception.
 613
 614 \subsection{Composite Refactorings Will Not Appear as Atomic Actions}
 615
 616 \subsubsection{Missing Flexibility from JDT Refactorings}
 617 The JDT refactorings are not made with composition of refactorings in mind. When
 618 a JDT refactoring is executed, it assumes that all conditions for it to be
 619 applied successfully can be found by reading source files that have been
 620 persisted to disk. They can only operate on the actual source material, and not
 621 (in-memory) copies thereof. This constitutes a major disadvantage when trying to
 622 compose refactorings, since if an exception occurs in the middle of a sequence of
 623 refactorings, it can leave the project in a state where the composite
 624 refactoring was executed only partly. It makes it hard to discard the changes
 625 done without monitoring and consulting the undo manager, an approach that is not
 626 bulletproof.
 627
 628 \subsubsection{Broken Undo History}
 629 When designing a composed refactoring that is to be performed as a sequence of
 630 refactorings, you would like it to appear as a single change to the workspace.
 631 This implies that you would also like to be able to undo all the changes done by
 632 the refactoring in a single step. This is not the way it appears when a sequence
 633 of JDT refactorings is executed. It leaves the undo history filled up with
 634 individual undo actions corresponding to every single JDT refactoring in the
 635 sequence. This problem is not trivial to handle in Eclipse. (See section
 636 \ref{hacking_undo_history}.)
 637
 638 \section{Wishful Thinking}
 639
 640
 641
 642 \chapter{Composite Refactorings in Eclipse}
 643
 644 \section{A Simple Ad Hoc Model}
 645 As pointed out in chapter \ref{ch:jdt_refactorings}, the Eclipse JDT refactoring
 646 model is not very well suited for making composite refactorings. Therefore a
 647 simple model using changer objects (of type \type{RefaktorChanger}) is used as
 648 an abstraction layer on top of the existing Eclipse refactorings.
 649
 650 \section{The Extract and Move Method Refactoring}
 651 %The Extract and Move Method Refactoring is implemented mainly using these
 652 %classes:
 653 %\begin{itemize}
 654 %  \item \type{ExtractAndMoveMethodChanger}
 655 %  \item \type{ExtractAndMoveMethodPrefixesExtractor}
 656 %  \item \type{Prefix}
 657 %  \item \type{PrefixSet}
 658 %\end{itemize}
 659
 660 \subsection{The Building Blocks}
 661 This is a composite refactoring, and hence is built up using several primitive
 662 refactorings. These basic building blocks are, as the name implies, the Extract
 663 Method Refactoring \cite{refactoring} and the Move Method Refactoring
 664 \cite{refactoring}. In Eclipse, the implementations of these refactorings are
 665 found in the classes
 666 \typewithref{org.eclipse.jdt.internal.corext.refactoring.code}{ExtractMethodRefactoring}
 667 and
 668 \typewithref{org.eclipse.jdt.internal.corext.refactoring.structure}{MoveInstanceMethodProcessor},
 669 where the latter class is designed to be used together with the processor-based
 670 \typewithref{org.eclipse.ltk.core.refactoring.participants}{MoveRefactoring}.
 671
 672 \subsubsection{The ExtractMethodRefactoring Class}
 673 This class is quite simple in its use. The only parameters it requires for
 674 construction are a compilation
 675 unit\typeref{org.eclipse.jdt.core.ICompilationUnit}, the offset into the source
 676 code where the extraction shall start, and the length of the source to be
 677 extracted. Then you have to set the name of the new method, the access modifier
 678 to be used, and some less interesting parameters.
 679
 680 \subsubsection{The MoveInstanceMethodProcessor Class}
 681 For Move Method, the processor requires slightly more advanced input than
 682 the class for Extract Method. For construction it requires a method
 683 handle\typeref{org.eclipse.jdt.core.IMethod} from the Java Model for the method
 684 that is to be moved. Then the target for the move has to be supplied as the
 685 variable binding from a chosen variable declaration. In addition to this, one
 686 has to set some parameters regarding setters/getters and delegation.
 687
 688 To make a whole refactoring from the processor, one has to construct a
 689 \type{MoveRefactoring} from it.
 690
 691 \subsection{The ExtractAndMoveMethodChanger Class}
 692 The \typewithref{no.uio.ifi.refaktor.changers}{ExtractAndMoveMethodChanger}
 693 class, which is a subclass of the class
 694 \typewithref{no.uio.ifi.refaktor.changers}{RefaktorChanger}, is the class
 695 responsible for composing the \type{ExtractMethodRefactoring} and the
 696 \type{MoveRefactoring}. Its constructor takes a project
 697 handle\typeref{org.eclipse.core.resources.IProject}, the method name for the new
 698 method and a \typewithref{no.uio.ifi.refaktor.utils}{SmartTextSelection}.
 699
 700 A \type{SmartTextSelection} is basically a text
 701 selection\typeref{org.eclipse.jface.text.ITextSelection} object that requires
 702 the underlying document to be provided at creation, i.e., its
 703 \methodwithref{no.uio.ifi.refaktor.utils.SmartTextSelection}{getDocument} method
 704 will never return \type{null}.
 705
 706 Before extracting the new method, the possible targets for the move operation are
 707 found with the help of an
 708 \typewithref{no.uio.ifi.refaktor.extractors}{ExtractAndMoveMethodPrefixesExtractor}.
 709 The possible targets are computed from the prefixes that the extractor returns
 710 from its
 711 \methodwithref{no.uio.ifi.refaktor.extractors.ExtractAndMoveMethodPrefixesExtractor}{getSafePrefixes}
 712 method. The changer then chooses the most suitable target by finding the most
 713 frequently occurring prefix among the safe ones. The target is the type of the
 714 first part of the prefix.
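
The selection step can be illustrated with a minimal sketch. It is not the
actual \type{ExtractAndMoveMethodChanger} code: prefixes are modeled as plain
dotted strings with an occurrence count, the class
\type{TargetChooserSketch} and its methods are hypothetical, and breaking ties
in favor of the first prefix encountered is an assumption. Resolving the type of
the first part would in practice require bindings from the Java model, which is
left out here.

\begin{verbatim}
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: prefixes are dotted strings mapped to how often
// they occur in the selection. Not the thesis' Prefix/PrefixSet classes.
final class TargetChooserSketch {

    // Returns the prefix with the highest count. Ties go to the prefix
    // that comes first in the map's iteration order (an assumption).
    static Optional<String> mostFrequent(Map<String, Integer> counts) {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return Optional.ofNullable(best);
    }

    // The move target is derived from the first part of the prefix,
    // for instance "a" for the prefix "a.b".
    static String firstPart(String prefix) {
        int dot = prefix.indexOf('.');
        return dot < 0 ? prefix : prefix.substring(0, dot);
    }
}
\end{verbatim}

With the counts \texttt{a.b}: 2 and \texttt{c}: 3, the sketch would pick
\texttt{c}; for the prefix \texttt{a.b} the first part is \texttt{a}.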
 715
 716 After finding a suitable target, the \type{ExtractAndMoveMethodChanger} first
 717 creates an \type{ExtractMethodRefactoring} and performs it as explained in
 718 section \ref{executing_refactoring} about the execution of refactorings. Then it
 719 creates and performs the \type{MoveRefactoring} in the same way, based on the
 720 changes done by the Extract Method refactoring.
 721
 722 \subsection{The ExtractAndMoveMethodPrefixesExtractor Class}
 723 This extractor extracts properties needed for building the Extract and Move
 724 Method refactoring. It searches through the given selection to find safe
 725 prefixes, and those prefixes form a base that can be used to compute possible
 726 targets for the move part of the refactoring.  It finds both the candidates, in
 727 the form of prefixes, and the non-candidates, called unfixes. All prefixes (and
 728 unfixes) are represented by a
 729 \typewithref{no.uio.ifi.refaktor.extractors}{Prefix}, and they are collected
 730 into prefix sets\typeref{no.uio.ifi.refaktor.extractors.PrefixSet}.
 731
 732 The prefixes and unfixes are found by property
 733 collectors\typeref{no.uio.ifi.refaktor.extractors.collectors.PropertyCollector}.
 734 A property collector follows the visitor pattern \cite{dp} and is of the
 735 \typewithref{org.eclipse.jdt.core.dom}{ASTVisitor} type.  An \type{ASTVisitor}
 736 visits nodes in an abstract syntax tree that forms the Java document object
 737 model. The tree consists of nodes of type
 738 \typewithref{org.eclipse.jdt.core.dom}{ASTNode}.
 739
 740 \subsubsection{The PrefixesCollector}
 741 The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{PrefixesCollector}
 742 is of type \type{PropertyCollector}. It visits expression
 743 statements\typeref{org.eclipse.jdt.core.dom.ExpressionStatement} and creates
 744 prefixes from their expressions in the case of method invocations. The prefixes
 745 found are registered with a prefix set, together with all their sub-prefixes.
 746 \todo{Rewrite in the case of changes to the way prefixes are found}
 747
 748 \subsubsection{The UnfixesCollector}
 749 The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{UnfixesCollector}
 750 finds unfixes within the selection. An unfix is a name that is assigned to
 751 within the selection. The reason that this cannot be allowed is that the result
 752 would be an assignment to the \type{this} keyword, which is not valid in Java.
 753
 754 \subsubsection{Computing Safe Prefixes}
 755 A safe prefix is a prefix that does not enclose an unfix. A prefix encloses
 756 an unfix if the unfix is in the set of its sub-prefixes. As an example,
 757 \texttt{``a.b''} encloses \texttt{``a''}, and so does \texttt{``a''} itself. The safe
 758 prefixes are collected in a \type{PrefixSet} and can be fetched by calling the
 759 \method{getSafePrefixes} method of the
 760 \type{ExtractAndMoveMethodPrefixesExtractor}.
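
The definition above can be sketched in a few lines of Java. This is only an
illustration under simplifying assumptions: prefixes and unfixes are modeled as
dotted strings rather than \type{Prefix} objects, and the class
\type{SafePrefixSketch} with its methods is hypothetical and does not mirror the
interface of \type{PrefixSet}.

\begin{verbatim}
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: prefixes and unfixes are plain dotted strings.
final class SafePrefixSketch {

    // The sub-prefixes of "a.b.c" are "a", "a.b" and "a.b.c",
    // so a prefix is always among its own sub-prefixes.
    static List<String> subPrefixes(String prefix) {
        List<String> result = new ArrayList<>();
        int dot = prefix.indexOf('.');
        while (dot >= 0) {
            result.add(prefix.substring(0, dot));
            dot = prefix.indexOf('.', dot + 1);
        }
        result.add(prefix);
        return result;
    }

    // A prefix encloses an unfix if the unfix is one of its sub-prefixes.
    static boolean encloses(String prefix, String unfix) {
        return subPrefixes(prefix).contains(unfix);
    }

    // Keeps only the prefixes that enclose none of the unfixes.
    static Set<String> safePrefixes(Set<String> prefixes,
                                    Set<String> unfixes) {
        Set<String> safe = new LinkedHashSet<>();
        for (String prefix : prefixes) {
            boolean isSafe = true;
            for (String unfix : unfixes) {
                if (encloses(prefix, unfix)) {
                    isSafe = false;
                    break;
                }
            }
            if (isSafe) {
                safe.add(prefix);
            }
        }
        return safe;
    }
}
\end{verbatim}

Given the prefixes \texttt{a}, \texttt{a.b} and \texttt{c.d} and the single
unfix \texttt{a}, only \texttt{c.d} remains safe in this sketch.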
 761
 762 \subsection{The Prefix Class}
 763 \todo{?}
 764 \subsection{The PrefixSet Class}
 765
 766 \subsection{Hacking the Refactoring Undo
 767 History}\label{hacking_undo_history}
 768 \todo{Where to put this section?}
 769
 770 As an attempt to make multiple subsequent changes to the workspace appear as a
 771 single action (i.e. make the undo changes appear as such), I tried to alter
 772 the undo changes\typeref{org.eclipse.ltk.core.refactoring.Change} in the history
 773 of the refactorings.
 774
 775 My first impulse was to remove the, in this case, last two undo changes from the
 776 undo manager\typeref{org.eclipse.ltk.core.refactoring.IUndoManager} for the
 777 Eclipse refactorings, and then add them to a composite
 778 change\typeref{org.eclipse.ltk.core.refactoring.CompositeChange} that could be
 779 added back to the manager. The interface of the undo manager does not offer a
 780 way to remove/pop the last added undo change, so a possible solution could be to
 781 decorate \cite{dp} the undo manager, to intercept and collect the undo changes
 782 before delegating to the \method{addUndo}
 783 method\methodref{org.eclipse.ltk.core.refactoring.IUndoManager}{addUndo} of the
 784 manager. Instead of giving it the intended undo change, a null change could be
 785 given to prevent it from making any changes if run. Then one could let the
 786 collected undo changes form a composite change to be added to the manager.
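
The decorator idea can be sketched independently of Eclipse. The interfaces
\type{UndoChange} and \type{UndoSink} below are hypothetical, heavily reduced
stand-ins for \type{Change} and \type{IUndoManager}, and
\type{CollectingUndoSink} is an assumed name. The sketch only shows the
interception and collection of undo changes. It does not address the
constraints of the real undo manager.

\begin{verbatim}
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for Change and IUndoManager.
interface UndoChange {
    void perform();
}

interface UndoSink {
    void addUndo(String name, UndoChange change);
}

// Decorator that collects the real undo changes and hands a change
// that does nothing to the decorated sink.
final class CollectingUndoSink implements UndoSink {
    private final UndoSink delegate;
    private final List<UndoChange> collected = new ArrayList<>();

    CollectingUndoSink(UndoSink delegate) {
        this.delegate = delegate;
    }

    @Override
    public void addUndo(String name, UndoChange change) {
        collected.add(change);
        delegate.addUndo(name, () -> { });
    }

    // Composite undo: performs the collected undo changes in reverse
    // order of registration, so the newest change is undone first.
    UndoChange composite() {
        List<UndoChange> snapshot = new ArrayList<>(collected);
        return () -> {
            for (int i = snapshot.size() - 1; i >= 0; i--) {
                snapshot.get(i).perform();
            }
        };
    }
}
\end{verbatim}

After registering an undo change for the extraction and one for the move, the
composite would first undo the move and then the extraction.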
 787
 788 There is a technical challenge with this approach, and it relates to the undo
 789 manager, and the concrete implementation
 790 UndoManager2\typeref{org.eclipse.ltk.internal.core.refactoring.UndoManager2}.
 791 This implementation is designed in such a way that it is not possible to just add an
 792 undo change; you have to do it in the context of an active
 793 operation\typeref{org.eclipse.core.commands.operations.TriggeredOperations}.
 794 One could imagine that it might be possible to trick the undo manager into
 795 believing that you are doing a real change, by executing a refactoring that
 796 returns a kind of null change, which in turn returns our composite change of undo
 797 changes when it is performed.
 798
 799 Apart from the technical problems with this solution, there is a functional
 800 problem: If it all had worked out as planned, this would leave the undo history
 801 in a dirty state, with multiple empty undo operations corresponding to each of
 802 the sequentially executed refactoring operations, followed by a composite undo
 803 change corresponding to an empty change of the workspace for rounding off our
 804 composite refactoring. The solution to this particular problem could be to
 805 intercept the registration of the intermediate changes in the undo manager, and
 806 only register the last empty change.
 807
 808 Unfortunately, not everything works as desired with this solution. The grouping
 809 of the undo changes into the composite change does not make the undo operation
 810 appear as an atomic operation. The undo operation is still split up into
 811 separate undo actions, corresponding to the change done by its originating
 812 refactoring. In addition, the undo actions have to be performed separately in
 813 all the editors involved. This makes it no solution at all, but a step toward
 814 something worse.
 815
 816 There might be a solution to this problem, but it remains to be found. The
 817 design of the refactoring undo management is partly to blame for this, as it
 818 is too complex to be easily manipulated.
 819
 820
 821 \backmatter{}
 822 \printbibliography
 823 \listoftodos
 824 \end{document}