1 \documentclass[USenglish]{ifimaster}
3 \usepackage[utf8]{inputenc}
4 \usepackage[T1]{fontenc,url}
6 \usepackage{babel,textcomp,csquotes,ifimasterforside,varioref,graphicx}
7 \usepackage[style=numeric-comp]{biblatex}
8 %\usepackage[backend=biber,style=numeric-comp]{biblatex}
10 \usepackage{todonotes}
11 \usepackage{perpage} %the perpage package
12 \MakePerPage{footnote} %the perpage package command
15 \newtheorem*{wordDef}{Definition}
17 \newcommand{\definition}[1]{\begin{wordDef}#1\end{wordDef}}
18 \newcommand{\see}[1]{(see \ref{#1})}
19 \newcommand{\explanation}[3]{\noindent\textbf{\textit{#1}}\\*\emph{When:}
20 #2\\*\emph{How:} #3\\*[-7px]}
21 \newcommand{\type}[1]{\texttt{#1}}
22 \newcommand{\typeref}[1]{\footnote{\type{#1}}}
23 \newcommand{\typewithref}[2]{\type{#2}\typeref{#1.#2}}
24 \newcommand{\method}[1]{\type{#1}}
25 \newcommand{\methodref}[2]{\footnote{\type{#1}\method{\##2()}}}
26 \newcommand{\methodwithref}[2]{\method{#2}\footnote{\type{#1}\method{\##2()}}}
31 \author{Erlend Kristiansen}
33 \bibliography{bibliography/master-thesis-erlenkr-bibliography}
51 \chapter{Introduction}
53 \section{What is Refactoring?}
This question is best answered in two parts: first by defining the concept of a
refactoring, and then by discussing what the discipline of refactoring is all
about. To make it clear from the beginning: the discussions in this report must
be seen in the context of object-oriented programming languages. This may be
obvious, but much of the material will not make much sense otherwise, although
some of the techniques may be applicable to sequential \todo{sequential?}
languages, possibly in other forms.
63 \subsection{Defining refactoring}
Martin Fowler, in his masterpiece on refactoring~\cite{refactoring}, defines a
refactoring like this:
\begin{quote}
\emph{Refactoring} (noun): a change made to the \todo{what does he mean by
internal?} internal structure of software to make it easier to understand and
cheaper to modify without changing its observable
behavior.~\cite{refactoring} % page 53
\end{quote}

This definition gives additional meaning to the word \emph{refactoring}, beyond
its \todo{original?} original meaning. Fowler mixes the \emph{motivation}
behind refactoring into his definition. The definition could instead be kept
clean by considering only the mechanical and behavioral aspects of refactoring:
to factor the program again, putting it together in a different way than before,
while preserving the behavior of the program. An alternative definition could
be:

\definition{A refactoring is a transformation
done to a program without altering its external behavior.}

So a refactoring primarily changes how the \emph{code} of a program is perceived
by the \emph{programmer}, and not the behavior experienced by any user of the
program. Although the logical meaning is preserved, such changes could
alter the program's performance, for better or for worse. Any logic that
depends on the performance of a program could therefore make the
program behave differently after a refactoring.

In the extreme case one could argue that \emph{software
obfuscation} is a form of refactoring. If we were to define it as a refactoring,
it could be defined as a composite refactoring \see{intro_composite}, consisting
of, for instance, a series of rename refactorings. (It could of course be much
more complex, and its mechanics would not exactly be carved in stone.) To
perform some serious obfuscation one would also take advantage of techniques not
found among established refactorings, such as removing whitespace. This might
not even generate a different syntax tree for languages not sensitive to
whitespace, placing it in the gray area of which transformations are to be
considered refactorings.

Finally, to \emph{refactor} is (quoting Martin Fowler)
\begin{quote}
\ldots to restructure software by applying a series of refactorings without
changing its observable behavior.~\cite{refactoring} % page 54, definition
\end{quote}
107 % subsection with the history of refactoring?
109 \subsection{Motivation} % better headline?
To get a grasp of what refactoring is all about, we can ask the question
\emph{Why do people refactor?} Possible answers include ``to remove
duplication'' or ``to break up long methods''. Practitioners of the art of
Design Patterns~\cite{dp} could say that they do it to introduce a long-needed
pattern into their program's design. So it is safe to say that people's
intentions are to make their programs \emph{better} in some sense. But what
aspects of the programs are being improved?

As already mentioned, people often refactor to get rid of duplication. They move
identical or similar code into methods, and maybe push those up or down in
their hierarchies. They make template methods for overlapping algorithms
\todo{better?: functionality} and so on. It is all about gathering what belongs
together and putting it all in one place. And the result? The code is easier to
maintain. When the implicit coupling between the code snippets is removed, the
location of a bug is limited to only one place, and new functionality need only
be added in this one place, instead of in a number of places people might not
even know about.

The same people find out that their program contains a lot of long and
hard-to-grasp methods. Then what do they do? They begin dividing their methods
into smaller ones, using the \emph{Extract Method}
refactoring~\cite{refactoring}. Then they may discover something about their
program that they were not aware of before, such as bugs they did not know about
or could not find due to the complex structure of their program. \todo{Proof?}
Making the methods smaller and giving good names to the new ones clarifies the
algorithms and enhances the \emph{understandability} of the program. This makes
simple refactoring an excellent method for exploring unknown program code, or
code that you had forgotten that you wrote!

The word \emph{simple} came up in the last paragraph. In fact, most basic
refactorings are simple. Their true power is revealed only when they are
combined into larger, higher-level refactorings, called \emph{composite
refactorings} \see{intro_composite}. Often the goal of such a series of
refactorings is a design pattern. Thus the \emph{design} can evolve
throughout the lifetime of a program, as opposed to being fixed up front. It is
all about being structured and taking small steps to improve the design.

Many refactorings are aimed at lowering the coupling between different classes
and different layers of logic. Say, for instance, that the coupling between the
user interface and the business logic of a program is lowered. Then the business
logic of the program could much more easily be the target of automated tests,
increasing the productivity of the software development process. It would also
be much easier to distribute the different parts of the program if they were
decoupled.

Another effect of refactoring is that the increased separation of concerns
coming out of many refactorings makes \emph{performance} easier to improve. When
profiling programs, the problem parts are narrowed down to smaller parts of the
code, which are easier to tune, and optimization can be performed only where
needed and in a more effective way.

Refactoring program code with a goal in mind can give the code itself
more value, in the form of robustness to bugs, understandability and
maintainability. The first is an obvious advantage, but the other
two are also very important in software development. By incorporating
refactoring in the development process, bugs are found faster, new functionality
is added more easily and code is easier to understand for the next person
exposed to it, who might as well be the person who wrote it. So refactoring can
also add to the monetary value of a business, through increased productivity of
the development process in the long run. This last point should also open
the eyes of some nearsighted managers who seldom see beyond the next milestone.
173 \section{Classification of refactorings}
174 % only interesting refactorings
175 % with 2 detailed examples? One for structured and one for intra-method?
176 % Is replacing Bubblesort with Quick Sort considered a refactoring?
178 \subsection{Structural refactorings}
180 \subsubsection{Basic refactorings}
\explanation{Extract Method}{You have a code fragment that can be grouped
together.}{Turn the fragment into a method whose name explains the purpose of
the method.}
187 \explanation{Inline Method}{A method's body is just as clear as its name.}{Put
188 the method's body into the body of its callers and remove the method.}
\explanation{Inline Temp}{You have a temp that is assigned to once with a simple
expression, and the temp is getting in the way of other refactorings.}{Replace
all references to that temp with the expression.}
194 % Moving Features Between Objects
195 \explanation{Move Method}{A method is, or will be, using or used by more
196 features of another class than the class on which it is defined.}{Create a new
197 method with a similar body in the class it uses most. Either turn the old method
198 into a simple delegation, or remove it altogether.}
\explanation{Move Field}{A field is, or will be, used by another class more than
the class on which it is defined.}{Create a new field in the target class, and
change all its users.}

\explanation{Replace Magic Number with Symbolic Constant}{You have a literal
number with a particular meaning.}{Create a constant, name it after the meaning,
and replace the number with it.}

\explanation{Encapsulate Field}{There is a public field.}{Make it private and
provide accessors.}
212 \explanation{Replace Type Code with Class}{A class has a numeric type code that
213 does not affect its behavior.}{Replace the number with a new class.}
215 \explanation{Replace Type Code with Subclasses}{You have an immutable type code
216 that affects the behavior of a class.}{Replace the type code with subclasses.}
218 \explanation{Replace Type Code with State/Strategy}{You have a type code that
219 affects the behavior of a class, but you cannot use subclassing.}{Replace the
220 type code with a state object.}
222 % Simplifying Conditional Expressions
\explanation{Consolidate Duplicate Conditional Fragments}{The same fragment of
code is in all branches of a conditional expression.}{Move it outside of the
expression.}

\explanation{Remove Control Flag}{You have a variable that is acting as a
control flag for a series of boolean expressions.}{Use a break or return
instead.}
231 \explanation{Replace Nested Conditional with Guard Clauses}{A method has
232 conditional behavior that does not make clear the normal path of
233 execution.}{Use guard clauses for all special cases.}
235 \explanation{Introduce Null Object}{You have repeated checks for a null
236 value.}{Replace the null value with a null object.}
238 \explanation{Introduce Assertion}{A section of code assumes something about the
239 state of the program.}{Make the assumption explicit with an assertion.}
241 % Making Method Calls Simpler
\explanation{Rename Method}{The name of a method does not reveal its
purpose.}{Change the name of the method.}

\explanation{Add Parameter}{A method needs more information from its
caller.}{Add a parameter for an object that can pass on this information.}

\explanation{Remove Parameter}{A parameter is no longer used by the method
body.}{Remove it.}
251 %\explanation{Parameterize Method}{Several methods do similar things but with
252 %different values contained in the method.}{Create one method that uses a
253 %parameter for the different values.}
\explanation{Preserve Whole Object}{You are getting several values from an
object and passing these values as parameters in a method call.}{Send the whole
object instead.}
259 \explanation{Remove Setting Method}{A field should be set at creation time and
260 never altered.}{Remove any setting method for that field.}
\explanation{Hide Method}{A method is not used by any other class.}{Make the
method private.}

\explanation{Replace Constructor with Factory Method}{You want to do more than
simple construction when you create an object.}{Replace the constructor with a
factory method.}

% Dealing with Generalization
\explanation{Pull Up Field}{Two subclasses have the same field.}{Move the field
to the superclass.}
273 \explanation{Pull Up Method}{You have methods with identical results on
274 subclasses.}{Move them to the superclass.}
276 \explanation{Push Down Method}{Behavior on a superclass is relevant only for
277 some of its subclasses.}{Move it to those subclasses.}
\explanation{Push Down Field}{A field is used only by some subclasses.}{Move the
field to those subclasses.}
282 \explanation{Extract Interface}{Several clients use the same subset of a class's
283 interface, or two classes have part of their interfaces in common.}{Extract the
284 subset into an interface.}
\explanation{Replace Inheritance with Delegation}{A subclass uses only part of a
superclass's interface or does not want to inherit data.}{Create a field for the
superclass, adjust methods to delegate to the superclass, and remove the
subclassing.}

\explanation{Replace Delegation with Inheritance}{You're using delegation and
are often writing many simple delegations for the entire interface.}{Make the
delegating class a subclass of the delegate.}
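
To make a couple of the basic refactorings above concrete, the following is a
minimal sketch that is not taken from any of the cited sources. It applies
\emph{Extract Method} and \emph{Replace Magic Number with Symbolic Constant} to
one small method. The classes \type{InvoiceBefore} and \type{InvoiceAfter}, the
method names and the rate of 0.25 are all hypothetical and chosen only for
illustration.

```java
// Hypothetical example: all names and the 0.25 rate are made up for illustration.

// Before: one method mixes summing and tax calculation, using a magic number.
class InvoiceBefore {
    static double total(double[] prices) {
        double sum = 0;
        for (double price : prices) {
            sum += price;
        }
        return sum + sum * 0.25;
    }
}

// After: the magic number has a name, and each step lives in its own method.
class InvoiceAfter {
    // Replace Magic Number with Symbolic Constant
    static final double VAT_RATE = 0.25;

    static double total(double[] prices) {
        double net = netSum(prices);
        return net + vat(net);
    }

    // Extract Method: the summing loop, named after its purpose
    static double netSum(double[] prices) {
        double sum = 0;
        for (double price : prices) {
            sum += price;
        }
        return sum;
    }

    // Extract Method: the tax calculation
    static double vat(double net) {
        return net * VAT_RATE;
    }
}

class BasicRefactoringDemo {
    public static void main(String[] args) {
        double[] prices = {100.0, 200.0};
        // Both versions are expected to agree, since behavior is preserved.
        System.out.println(InvoiceBefore.total(prices) == InvoiceAfter.total(prices));
    }
}
```

Both versions compute the same total for the same input, which is the
behavior-preserving property required by the definition of a refactoring. Only
the way the code reads to the programmer has changed.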
295 \subsubsection{Composite refactorings}
298 % \explanation{Replace Method with Method Object}{}{}
300 % Moving Features Between Objects
\explanation{Extract Class}{You have one class doing work that should be done by
two.}{Create a new class and move the relevant fields and methods from the old
class into the new class.}
305 \explanation{Inline Class}{A class isn't doing very much.}{Move all its features
306 into another class and delete it.}
\explanation{Hide Delegate}{A client is calling a delegate class of an
object.}{Create methods on the server to hide the delegate.}

\explanation{Remove Middle Man}{A class is doing too much simple
delegation.}{Get the client to call the delegate directly.}
315 \explanation{Replace Data Value with Object}{You have a data item that needs
316 additional data or behavior.}{Turn the data item into an object.}
\explanation{Change Value to Reference}{You have a class with many equal
instances that you want to replace with a single object.}{Turn the object into a
reference object.}

\explanation{Encapsulate Collection}{A method returns a collection.}{Make it
return a read-only view and provide add/remove methods.}
325 % \explanation{Replace Array with Object}{}{}
327 \explanation{Replace Subclass with Fields}{You have subclasses that vary only in
328 methods that return constant data.}{Change the methods to superclass fields and
329 eliminate the subclasses.}
331 % Simplifying Conditional Expressions
\explanation{Decompose Conditional}{You have a complicated conditional
(if-then-else) statement.}{Extract methods from the condition, then part, and
else parts.}
336 \explanation{Consolidate Conditional Expression}{You have a sequence of
337 conditional tests with the same result.}{Combine them into a single conditional
338 expression and extract it.}
\explanation{Replace Conditional with Polymorphism}{You have a conditional that
chooses different behavior depending on the type of an object.}{Move each leg
of the conditional to an overriding method in a subclass. Make the original
method abstract.}
345 % Making Method Calls Simpler
346 \explanation{Replace Parameter with Method}{An object invokes a method, then
347 passes the result as a parameter for a method. The receiver can also invoke this
348 method.}{Remove the parameter and let the receiver invoke the method.}
350 \explanation{Introduce Parameter Object}{You have a group of parameters that
351 naturally go together.}{Replace them with an object.}
353 % Dealing with Generalization
354 \explanation{Extract Subclass}{A class has features that are used only in some
355 instances.}{Create a subclass for that subset of features.}
\explanation{Extract Superclass}{You have two classes with similar
features.}{Create a superclass and move the common features to the
superclass.}
361 \explanation{Collapse Hierarchy}{A superclass and subclass are not very
362 different.}{Merge them together.}
364 \explanation{Form Template Method}{You have two methods in subclasses that
365 perform similar steps in the same order, yet the steps are different.}{Get the
366 steps into methods with the same signature, so that the original methods become
367 the same. Then you can pull them up.}
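
As an illustration of one of the composite refactorings listed above, the
following sketch shows what \emph{Replace Conditional with Polymorphism} could
look like on a toy example. It is an assumption-laden sketch and not an example
from the cited literature: the classes \type{ShapeBefore}, \type{Shape},
\type{Circle} and \type{Square} are hypothetical.

```java
// Hypothetical example: all class names are made up for illustration.

// Before: a numeric type code selects the behavior inside a conditional.
class ShapeBefore {
    static final int CIRCLE = 0;
    static final int SQUARE = 1;

    private final int type;
    private final double size;

    ShapeBefore(int type, double size) {
        this.type = type;
        this.size = size;
    }

    double area() {
        if (type == CIRCLE) {
            return Math.PI * size * size;
        } else {
            return size * size;
        }
    }
}

// After: each leg of the conditional is an overriding method in a subclass,
// and the original method has become abstract.
abstract class Shape {
    protected final double size;

    Shape(double size) {
        this.size = size;
    }

    abstract double area();
}

class Circle extends Shape {
    Circle(double radius) {
        super(radius);
    }

    @Override
    double area() {
        return Math.PI * size * size;
    }
}

class Square extends Shape {
    Square(double side) {
        super(side);
    }

    @Override
    double area() {
        return size * size;
    }
}
```

Getting from the first version to the second takes several basic steps
(introducing the subclasses, moving each branch, making the original method
abstract), which is why it is natural to regard the transformation as a
composite one.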
370 \subsection{Functional refactorings}
372 \explanation{Substitute Algorithm}{You want to replace an algorithm with one
373 that is clearer.}{Replace the body of the method with the new algorithm.}
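
A minimal sketch of \emph{Substitute Algorithm} could look as follows. The class
\type{PersonFinder}, its method names and the list of names are hypothetical and
serve only to show that the body of a method is replaced while its result stays
the same.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example: class, methods and names are made up for illustration.
class PersonFinder {

    // Before: a chain of explicit comparisons.
    static String foundPersonBefore(String[] people) {
        for (String person : people) {
            if (person.equals("Don")) {
                return "Don";
            }
            if (person.equals("John")) {
                return "John";
            }
            if (person.equals("Kent")) {
                return "Kent";
            }
        }
        return "";
    }

    // After: the same result, computed by a lookup in a list of candidates.
    static String foundPersonAfter(String[] people) {
        List<String> candidates = Arrays.asList("Don", "John", "Kent");
        for (String person : people) {
            if (candidates.contains(person)) {
                return person;
            }
        }
        return "";
    }
}
```

Unlike the structural refactorings, nothing moves between classes or methods
here. Only the algorithm inside one method is exchanged.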
376 \section{The impact on software quality}
378 \subsection{What is meant by quality?}
The term \emph{software quality} has many meanings. It all depends on the
context we put it in. If we look at it with the eyes of a software developer, it
usually means that the software is easily maintainable and testable, or in other
words, that it is \emph{well designed}. This often correlates with the
management perspective, where \emph{keeping the schedule} and \emph{customer
satisfaction} are at the center. From the customer's point of view, in addition
to good usability, \emph{performance} and \emph{lack of bugs} are always
appreciated, measures that are also shared by the software developer. (In
addition, such things as good documentation could be measured, but this is out
of the scope of this document.)
390 \subsection{The impact on performance}
\begin{quote}
Refactoring certainly will make software go more slowly, but it also makes the
software more amenable to performance tuning.~\cite{refactoring} % page 69
\end{quote}

There is a common belief that refactoring compromises performance, due to an
increased degree of indirection and because polymorphism is assumed to be slower
than conditionals.

In an experimental study, Demeyer~\cite{demeyer2002} disproves this view in the
case of polymorphism. He performs an experiment on what he calls ``Transform
Self Type Checks'', where you introduce a new polymorphic method and a new class
hierarchy to get rid of a class's type checking of a ``type attribute''. He uses
this kind of transformation to represent other ways of replacing conditionals
with polymorphism as well. The experiment is performed on the C++ programming
language, with three different compilers and platforms. \todo{But is the
result better?} Demeyer concludes that, with compiler optimization turned on,
polymorphism beats medium to large if-statements and does as well as
case-statements. (This is in accordance with his hypothesis, due to similarities
between the way C++ handles polymorphism and case-statements.)
\begin{quote}
The interesting thing about performance is that if you analyze most programs,
you find that they waste most of their time in a small fraction of the
code.~\cite{refactoring}
\end{quote}

So, although an increased number of method calls could potentially slow down
programs, one should avoid premature optimization and sacrificing good design,
leaving the performance tuning until after profiling the software and having
isolated the actual problem areas.
422 \section{Correctness of refactorings}
425 \section{Composite refactorings} \label{intro_composite}
426 % motivation, example(s)
427 % manual vs automated?
428 % what about refactoring in a very large code base?
430 \section{Software metrics}
434 %\chapter{Planning the project}
439 \chapter{Refactorings in Eclipse JDT: Design and
440 Shortcomings}\label{ch:jdt_refactorings}
The refactoring world of Eclipse can in general be separated into two parts: the
language-independent part and the part written for a specific programming
language, namely the language that is the target of the supported refactorings.
447 \subsection{The Language Toolkit}
The Language Toolkit, or LTK for short, is the framework that is used to
implement refactorings in Eclipse. It is language independent and provides the
abstractions of a refactoring and the change it generates, in the form of the
classes \typewithref{org.eclipse.ltk.core.refactoring}{Refactoring} and
\typewithref{org.eclipse.ltk.core.refactoring}{Change}. (There are also parts of
the LTK that are concerned with user interaction, but they will not be discussed
here, since they are of little value to us and our use of the framework.)
456 \subsubsection{The Refactoring Class}
The abstract class \type{Refactoring} is the core of the LTK framework. Every
refactoring that is going to be supported by the LTK has to end up creating an
instance of one of its subclasses. The main responsibilities of subclasses of
\type{Refactoring} are to implement template methods for condition checking
(\methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkInitialConditions}
and
\methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkFinalConditions}),
and the
\methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{createChange}
method that creates and returns an instance of the \type{Change} class that is
responsible for performing the actual workspace transformations.
468 \todo{Write something about processor-based refactorings?}
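
The division of responsibilities described above can be sketched in a few lines
of plain Java. Note that the sketch below does \emph{not} use the LTK. The types
\type{SketchRefactoring}, \type{SketchChange} and \type{RenameSketch}, as well
as the driver method \type{run}, are hypothetical stand-ins. They borrow only
the three method names mentioned in the text, leave out progress monitors and
status objects, and model the workspace as a plain list of source lines.

```java
import java.util.List;

// Hypothetical stand-in for a change: transforms the "workspace" (a list of lines).
interface SketchChange {
    void perform(List<String> workspace);
}

// Hypothetical stand-in for a refactoring; only the method names follow the text.
abstract class SketchRefactoring {
    abstract boolean checkInitialConditions(List<String> workspace);

    abstract boolean checkFinalConditions(List<String> workspace);

    abstract SketchChange createChange();

    // Assumed driver: check both sets of conditions, then perform the change.
    // Returns true if the change was applied.
    final boolean run(List<String> workspace) {
        if (!checkInitialConditions(workspace)) {
            return false;
        }
        if (!checkFinalConditions(workspace)) {
            return false;
        }
        createChange().perform(workspace);
        return true;
    }
}

// A deliberately naive, purely textual rename, used only to fill in the template.
class RenameSketch extends SketchRefactoring {
    private final String oldName;
    private final String newName;

    RenameSketch(String oldName, String newName) {
        this.oldName = oldName;
        this.newName = newName;
    }

    @Override
    boolean checkInitialConditions(List<String> workspace) {
        // There must be something to rename.
        return workspace.stream().anyMatch(line -> line.contains(oldName));
    }

    @Override
    boolean checkFinalConditions(List<String> workspace) {
        // The new name must not already occur anywhere.
        return workspace.stream().noneMatch(line -> line.contains(newName));
    }

    @Override
    SketchChange createChange() {
        return workspace -> workspace.replaceAll(line -> line.replace(oldName, newName));
    }
}
```

The point of the sketch is the separation of concerns: the refactoring object
decides \emph{whether} a transformation is allowed and \emph{what} it should be,
while the change object is the only part that actually touches the workspace.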
470 \subsubsection{The Change Class}
473 \section{Shortcomings}
475 \chapter{Composite Refactorings in Eclipse}
477 \section{A Simple Ad Hoc Model}
As pointed out in Chapter~\ref{ch:jdt_refactorings}, the Eclipse JDT refactoring
model is not very well suited for making composite refactorings. Therefore a
simple model using changer objects (of type \type{RefaktorChanger}) is used as
an abstraction layer on top of the existing Eclipse refactorings.
483 \section{The Extract and Move Method Refactoring}
The Extract and Move Method Refactoring is implemented mainly using these
classes:
\begin{itemize}
  \item \type{ExtractAndMoveMethodChanger}
  \item \type{ExtractAndMoveMethodPrefixesExtractor}
  \item \type{Prefix}
  \item \type{PrefixSet}
\end{itemize}
493 \subsection{The ExtractAndMoveMethodChanger Class}
494 \subsection{The ExtractAndMoveMethodPrefixesExtractor Class}
495 \subsection{The Prefix Class}
496 \subsection{The PrefixSet Class}
498 \subsection{Hacking the Refactoring Undo History}
499 \todo{Where to put this section?}
As an attempt to make multiple subsequent changes to the workspace appear as a
single action (i.e.\ make the undo changes appear as such), I tried to alter
the undo changes\typeref{org.eclipse.ltk.core.refactoring.Change} in the
history.

My first impulse was to remove the last undo changes (two, in this case) from
the undo manager\typeref{org.eclipse.ltk.core.refactoring.IUndoManager} for the
Eclipse refactorings, and then add them to a composite
change\typeref{org.eclipse.ltk.core.refactoring.CompositeChange} that could be
added back to the manager. The interface of the undo manager does not offer a
way to remove or pop the last added undo change, so a possible solution could be
to decorate~\cite{dp} the undo manager, to intercept and collect the undo
changes before delegating to the \method{addUndo}
method\methodref{org.eclipse.ltk.core.refactoring.IUndoManager}{addUndo} of the
manager. Instead of giving it the intended undo change, a null change could be
given to prevent it from making any changes if run. Then one could let the
collected undo changes form a composite change to be added to the manager.
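
The decoration idea can be sketched independently of Eclipse. The code below is
an illustration under loud assumptions: \type{UndoLog}, \type{SimpleUndoLog} and
\type{CollectingUndoLog} are hypothetical stand-ins, undo changes are modeled as
plain \type{Runnable} objects, and the method \type{addComposite} is invented
for the sketch. Nothing here is the actual LTK undo manager.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an undo manager that only supports adding.
interface UndoLog {
    void addUndo(String name, Runnable undo);
}

// A trivial log that records what it is given, standing in for the real manager.
class SimpleUndoLog implements UndoLog {
    final List<String> names = new ArrayList<>();
    final List<Runnable> undos = new ArrayList<>();

    @Override
    public void addUndo(String name, Runnable undo) {
        names.add(name);
        undos.add(undo);
    }
}

// Decorator: intercepts and collects the undo changes, and hands the wrapped
// log a do-nothing change instead of the intended one.
class CollectingUndoLog implements UndoLog {
    private final UndoLog delegate;
    private final List<Runnable> collected = new ArrayList<>();

    CollectingUndoLog(UndoLog delegate) {
        this.delegate = delegate;
    }

    @Override
    public void addUndo(String name, Runnable undo) {
        collected.add(undo);
        delegate.addUndo(name, () -> { }); // null change: does nothing if run
    }

    // Assumed helper: registers all collected undos as one composite entry.
    // The steps are undone in reverse order of how they were performed.
    void addComposite(String name) {
        List<Runnable> steps = new ArrayList<>(collected);
        Collections.reverse(steps);
        collected.clear();
        delegate.addUndo(name, () -> steps.forEach(Runnable::run));
    }
}
```

In this sketch the wrapped log ends up with one empty entry per intercepted
change plus the composite entry, so the history is not clean even when the
mechanism itself works.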

There is a technical challenge with this approach, and it relates to the undo
manager and its concrete implementation
\type{UndoManager2}\typeref{org.eclipse.ltk.internal.core.refactoring.UndoManager2}.
This implementation is designed in such a way that it is not possible to just
add an undo change; you have to do it in the context of an active
operation\typeref{org.eclipse.core.commands.operations.TriggeredOperations}.
One could imagine that it might be possible to trick the undo manager into
believing that you are doing a real change, by executing a refactoring that
returns a kind of null change, which in turn returns our composite change of
undo refactorings when it is performed.

Apart from the technical problems with this solution, there is a functional
problem: even if it all had worked out as planned, this would leave the undo
history in a dirty state, with multiple empty undo operations corresponding to
each of the sequentially executed refactoring operations, followed by a
composite undo change corresponding to an empty change of the workspace for
rounding off our composite refactoring. The solution to this particular problem
could be to intercept the registration of the intermediate changes in the undo
manager, and only register the last empty change.

Unfortunately, not everything works as desired with this solution. The grouping
of the undo changes into the composite change does not make the undo operation
appear as an atomic operation. The undo operation is still split up into
separate undo actions, each corresponding to the change done by its originating
refactoring. In addition, the undo actions have to be performed separately in
all the editors involved. This makes it no solution at all, but a step toward
it.

There might be a solution to this problem, but it remains to be found. The
design of the refactoring undo management is partly to blame for this, as it
is too complex to be easily manipulated.