1 \documentclass[USenglish]{ifimaster}
3 \usepackage[utf8]{inputenc}
4 \usepackage[T1]{fontenc,url}
6 \usepackage{babel,textcomp,csquotes,ifimasterforside,varioref,graphicx}
7 \usepackage[style=numeric-comp]{biblatex}
8 %\usepackage[backend=biber,style=numeric-comp]{biblatex}
10 \usepackage{todonotes}
13 \newtheorem*{wordDef}{Definition}
15 \newcommand{\definition}[1]{\begin{wordDef}#1\end{wordDef}}
16 \newcommand{\see}[1]{(see \ref{#1})}
17 \newcommand{\explanation}[3]{\noindent\textbf{\textit{#1}}\\*\emph{When:}
18 #2\\*\emph{How:} #3\\*[-7px]}
23 \author{Erlend Kristiansen}
25 \bibliography{bibliography/master-thesis-erlenkr-bibliography}
47 \chapter{Refactoring in general}
49 \section{What is refactoring?}
This question is best answered in two parts: first by defining the concept of
a refactoring, then by discussing what the discipline of refactoring is all
about. To make it clear from the beginning: the discussions in this report must
be seen in the context of object-oriented programming languages. This may be
obvious, but much of the material will not make much sense otherwise, although
some of the techniques may be applicable to sequential \todo{sequential?}
languages, possibly in other forms.
59 \subsection{Defining refactoring}
60 Martin Fowler, in his masterpiece on refactoring~\cite{refactoring}, defines a
61 refactoring like this:
63 \emph{Refactoring} (noun): a change made to the \todo{what does he mean by
64 internal?} internal structure of software to make it easier to understand and
65 cheaper to modify without changing its observable
66 behaviour.~\cite{refactoring} % page 53
This definition gives additional meaning to the word \emph{refactoring}, beyond
its \todo{original?} original meaning. Fowler is mixing the \emph{motivation}
behind refactoring into his definition. Instead, the definition could be kept
clean by considering only the mechanical and behavioural aspects of refactoring:
to factor the program again, putting it together in a different way than before,
while preserving the behaviour of the program. An alternative definition could
be:
76 \definition{A refactoring is a transformation
77 done to a program without altering its external behaviour.}
So a refactoring primarily changes how the \emph{code} of a program is perceived
by the \emph{programmer}, and not the behaviour experienced by any user of the
program. Although the logical meaning is preserved, such changes could still
affect the program's performance, for better or worse. Any logic that depends on
the performance of the program could therefore make it behave differently after
a refactoring.
In the extreme case one could argue that \emph{software obfuscation} is a form
of refactoring. If we were to define it as a refactoring, it could be defined as
a composite refactoring \see{intro_composite}, consisting of, for instance, a
series of rename refactorings. (It could of course be much more complex, and its
mechanics would not exactly be carved in stone.) To perform some serious
obfuscation one would also take advantage of techniques not found among
established refactorings, such as removing whitespace. This might not even
generate a different syntax tree for languages not sensitive to whitespace,
placing it in the gray area of which transformations are to be considered
refactorings.
97 Finally, to \emph{refactor} is (quoting Martin Fowler)
99 \ldots to restructure software by applying a series of refactorings without
100 changing its observable behaviour.~\cite{refactoring} % page 54, definition
103 % subsection with the history of refactoring?
105 \subsection{Motivation} % better headline?
106 To get a grasp of what refactoring is all about, we can answer this question:
107 \emph{Why do people refactor?} Possible answers could include: ``To remove
108 duplication'' or ``to break up long methods''. Practitioners of the art of
109 Design Patterns~\cite{dp} could say that they do it to introduce a long-needed
pattern to their program's design. So it's safe to say that people's intentions
are to make their programs \emph{better} in some sense. But what aspects of the
programs are being improved?
As already mentioned, people often refactor to get rid of duplication. They move
identical or similar code into methods, perhaps pushing those up or down in
their class hierarchies, or make template methods for overlapping algorithms
\todo{better?: functionality}, and so on. It's all about gathering what belongs
together and putting it in one place. And the result? The code is easier to
maintain. When the implicit coupling between the code snippets is removed, the
location of a bug is limited to one place, and new functionality needs to be
added in only this one place, instead of in a number of places people might not
even be aware of.
124 The same people find out that their program contains a lot of long and
125 hard-to-grasp methods. Then what do they do? They begin dividing their methods
126 into smaller ones, using the \emph{Extract Method}
127 refactoring~\cite{refactoring}. Then they may discover something about their
program that they weren't aware of before, revealing bugs they didn't know about
129 or couldn't find due to the complex structure of their program. \todo{Proof?}
130 Making the methods smaller and giving good names to the new ones clarifies the
131 algorithms and enhances the \emph{understandability} of the program. This makes
132 simple refactoring an excellent method for exploring unknown program code, or
133 code that you had forgotten that you wrote!
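A minimal Java sketch of this use of \emph{Extract Method} could look as
follows. The class \texttt{Invoice} and all its members are hypothetical names,
invented here for illustration; they are not taken from~\cite{refactoring}. The
loop that computes the total is extracted from \texttt{summaryBefore} into a
method of its own, while the observable result stays the same.

```java
// Hypothetical example class; all names are invented for illustration.
class Invoice {
    private final int[] amounts;

    Invoice(int[] amounts) {
        this.amounts = amounts;
    }

    // Before: summing and formatting are tangled together in one method.
    String summaryBefore() {
        int total = 0;
        for (int amount : amounts) {
            total += amount;
        }
        return "Total: " + total;
    }

    // After Extract Method: the loop lives in a method named after its purpose.
    String summaryAfter() {
        return "Total: " + total();
    }

    private int total() {
        int sum = 0;
        for (int amount : amounts) {
            sum += amount;
        }
        return sum;
    }

    public static void main(String[] args) {
        Invoice invoice = new Invoice(new int[] {10, 20, 12});
        // Both variants produce the same string; only the structure differs.
        System.out.println(invoice.summaryBefore());
        System.out.println(invoice.summaryAfter());
    }
}
```

In a real program the extracted method would typically replace the original
loop; both variants are kept side by side here only to make the comparison
visible.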
The word \emph{simple} came up in the last paragraph. In fact, most basic
refactorings are simple. Their true power is revealed only when they are
combined into larger, higher-level refactorings, called \emph{composite
refactorings} \see{intro_composite}. Often the goal of such a series of
refactorings is a design pattern. Thus the \emph{design} can evolve throughout
the lifetime of a program, as opposed to being designed up-front. It's all
about being structured and taking small steps to improve the design.
Many refactorings are aimed at lowering the coupling between different classes
and different layers of logic. Say for instance that the coupling between the
user interface and the business logic of a program is lowered. Then the business
logic of the program could much more easily be the target of automated tests,
increasing the productivity of the software development process. It would also
be much easier to distribute the different parts of the program if they were
decoupled.
Another effect of refactoring is that the increased separation of concerns
resulting from many refactorings makes it easier to improve \emph{performance}.
When profiling a program, the problem areas are narrowed down to smaller parts
of the code, which are easier to tune, so optimization can be applied only where
needed and more effectively.
Refactoring program code, with a goal in mind, can give the code itself more
value in the form of robustness to bugs, understandability and maintainability.
The first is an obvious advantage, but the latter two are also very important in
software development. By incorporating refactoring in the development process,
bugs are found faster, new functionality is added more easily and the code is
easier to understand for the next person exposed to it, who might well be the
person who wrote it. Refactoring can therefore also add to the monetary value of
a business through increased productivity of the development process in the
long run. This last point should also open the eyes of some nearsighted managers
who seldom see beyond the next milestone.
169 \section{Classification of refactorings}
170 % only interesting refactorings
171 % with 2 detailed examples? One for structured and one for intra-method?
172 % Is replacing Bubblesort with Quick Sort considered a refactoring?
174 \subsection{Structural refactorings}
176 \subsubsection{Basic refactorings}
\explanation{Extract Method}{You have a code fragment that can be grouped
together.}{Turn the fragment into a method whose name explains the purpose of
the method.}
183 \explanation{Inline Method}{A method's body is just as clear as its name.}{Put
184 the method's body into the body of its callers and remove the method.}
186 \explanation{Inline Temp}{You have a temp that is assigned to once with a simple
187 expression, and the temp is getting in the way of other refactorings.}{Replace
all references to that temp with the expression.}
190 % Moving Features Between Objects
191 \explanation{Move Method}{A method is, or will be, using or used by more
192 features of another class than the class on which it is defined.}{Create a new
193 method with a similar body in the class it uses most. Either turn the old method
194 into a simple delegation, or remove it altogether.}
196 \explanation{Move Field}{A field is, or will be, used by another class more than
the class on which it is defined.}{Create a new field in the target class, and
198 change all its users.}
201 \explanation{Replace Magic Number with Symbolic Constant}{You have a literal
202 number with a particular meaning.}{Create a constant, name it after the meaning,
203 and replace the number with it.}
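As a small illustration of this refactoring, the following Java sketch replaces
a literal with a named constant. The class \texttt{Physics}, its methods and the
constant \texttt{STANDARD\_GRAVITY} are hypothetical names chosen for this
example only.

```java
// Hypothetical example; class and method names are invented for illustration.
class Physics {
    // The literal gets a name that states its meaning (standard gravity, m/s^2).
    static final double STANDARD_GRAVITY = 9.81;

    // Before: the reader has to guess what 9.81 stands for.
    static double potentialEnergyBefore(double mass, double height) {
        return mass * 9.81 * height;
    }

    // After: the same computation, with the number replaced by the constant.
    static double potentialEnergyAfter(double mass, double height) {
        return mass * STANDARD_GRAVITY * height;
    }

    public static void main(String[] args) {
        // Prints whether both variants agree for one sample input.
        System.out.println(
            potentialEnergyBefore(2.0, 3.0) == potentialEnergyAfter(2.0, 3.0));
    }
}
```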
\explanation{Encapsulate Field}{There is a public field.}{Make it private and
provide accessors.}
208 \explanation{Replace Type Code with Class}{A class has a numeric type code that
209 does not affect its behaviour.}{Replace the number with a new class.}
211 \explanation{Replace Type Code with Subclasses}{You have an immutable type code
212 that affects the behaviour of a class.}{Replace the type code with subclasses.}
214 \explanation{Replace Type Code with State/Strategy}{You have a type code that
215 affects the behaviour of a class, but you cannot use subclassing.}{Replace the
216 type code with a state object.}
218 % Simplifying Conditional Expressions
\explanation{Consolidate Duplicate Conditional Fragments}{The same fragment of
code is in all branches of a conditional expression.}{Move it outside of the
expression.}
\explanation{Remove Control Flag}{You have a variable that is acting as a
control flag for a series of boolean expressions.}{Use a break or return
instead.}
227 \explanation{Replace Nested Conditional with Guard Clauses}{A method has
228 conditional behaviour that does not make clear the normal path of
229 execution.}{Use guard clauses for all special cases.}
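The effect of guard clauses can be sketched in Java as follows. The class
\texttt{Employee}, its flags and the pay amounts are hypothetical and only serve
to show the shape of the transformation.

```java
// Hypothetical example; the pay rules and amounts are invented for illustration.
class Employee {
    boolean dead;
    boolean separated;
    boolean retired;

    // Before: the normal case is buried at the bottom of nested conditionals.
    int payAmountBefore() {
        int result;
        if (dead) {
            result = 0;
        } else {
            if (separated) {
                result = 200;
            } else {
                if (retired) {
                    result = 500;
                } else {
                    result = 1000; // the normal path
                }
            }
        }
        return result;
    }

    // After: each special case is a guard clause that returns at once,
    // leaving the normal path clearly visible at the end.
    int payAmountAfter() {
        if (dead) return 0;
        if (separated) return 200;
        if (retired) return 500;
        return 1000;
    }
}
```

Both methods return the same value for every combination of the three flags;
only the visibility of the normal path differs.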
\explanation{Introduce Null Object}{You have repeated checks for a null
232 value.}{Replace the null value with a null object.}
234 \explanation{Introduce Assertion}{A section of code assumes something about the
235 state of the program.}{Make the assumption explicit with an assertion.}
237 % Making Method Calls Simpler
238 \explanation{Rename Method}{The name of a method does not reveal its
purpose.}{Change the name of the method.}
241 \explanation{Add Parameter}{A method needs more information from its
242 caller.}{Add a parameter for an object that can pass on this information.}
\explanation{Remove Parameter}{A parameter is no longer used by the method
body.}{Remove it.}
247 %\explanation{Parameterize Method}{Several methods do similar things but with
248 %different values contained in the method.}{Create one method that uses a
249 %parameter for the different values.}
\explanation{Preserve Whole Object}{You are getting several values from an
object and passing these values as parameters in a method call.}{Send the whole
object instead.}
255 \explanation{Remove Setting Method}{A field should be set at creation time and
256 never altered.}{Remove any setting method for that field.}
\explanation{Hide Method}{A method is not used by any other class.}{Make the
method private.}
\explanation{Replace Constructor with Factory Method}{You want to do more than
simple construction when you create an object.}{Replace the constructor with a
factory method.}
265 % Dealing with Generalization
\explanation{Pull Up Field}{Two subclasses have the same field.}{Move the field
to the superclass.}
269 \explanation{Pull Up Method}{You have methods with identical results on
270 subclasses.}{Move them to the superclass.}
272 \explanation{Push Down Method}{Behaviour on a superclass is relevant only for
273 some of its subclasses.}{Move it to those subclasses.}
275 \explanation{Push Down Field}{A field is used only by some subclasses.}{Move the
field to those subclasses.}
278 \explanation{Extract Interface}{Several clients use the same subset of a class's
interface, or two classes have part of their interfaces in common.}{Extract the
280 subset into an interface.}
\explanation{Replace Inheritance with Delegation}{A subclass uses only part of a
superclass's interface or does not want to inherit data.}{Create a field for the
superclass, adjust methods to delegate to the superclass, and remove the
subclassing.}
287 \explanation{Replace Delegation with Inheritance}{You're using delegation and
are often writing many simple delegations for the entire interface.}{Make the
289 delegating class a subclass of the delegate.}
291 \subsubsection{Composite refactorings}
294 % \explanation{Replace Method with Method Object}{}{}
296 % Moving Features Between Objects
\explanation{Extract Class}{You have one class doing work that should be done by
two.}{Create a new class and move the relevant fields and methods from the old
class into the new class.}
301 \explanation{Inline Class}{A class isn't doing very much.}{Move all its features
302 into another class and delete it.}
\explanation{Hide Delegate}{A client is calling a delegate class of an
object.}{Create methods on the server to hide the delegate.}
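A possible Java sketch of \emph{Hide Delegate} is given here. The classes
\texttt{Person} and \texttt{Department} and their methods are hypothetical,
chosen only to show where the delegating method ends up.

```java
// Hypothetical example; all class and method names are invented for illustration.
class Department {
    private final String manager;

    Department(String manager) {
        this.manager = manager;
    }

    String getManager() {
        return manager;
    }
}

class Person {
    private final Department department;

    Person(Department department) {
        this.department = department;
    }

    // Before: clients had to write person.getDepartment().getManager(),
    // which couples them to the Department class.
    Department getDepartment() {
        return department;
    }

    // After Hide Delegate: the server (Person) offers a delegating method,
    // so clients no longer need to know about Department.
    String getManager() {
        return department.getManager();
    }
}
```

Once all clients use \texttt{getManager} on \texttt{Person}, the accessor
\texttt{getDepartment} can be considered for removal.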
\explanation{Remove Middle Man}{A class is doing too much simple delegation.}{Get
308 the client to call the delegate directly.}
311 \explanation{Replace Data Value with Object}{You have a data item that needs
312 additional data or behaviour.}{Turn the data item into an object.}
\explanation{Change Value to Reference}{You have a class with many equal
instances that you want to replace with a single object.}{Turn the object into a
reference object.}
\explanation{Encapsulate Collection}{A method returns a collection.}{Make it
return a read-only view and provide add/remove methods.}
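In Java, one way to sketch \emph{Encapsulate Collection} is with the standard
library method \texttt{Collections.unmodifiableList}. The class
\texttt{Student} and its methods are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example; all names are invented for illustration.
class Student {
    private final List<String> courses = new ArrayList<>();

    // The getter hands out a read-only view, so callers cannot
    // modify the internal list behind the owner's back.
    List<String> getCourses() {
        return Collections.unmodifiableList(courses);
    }

    // All modifications go through the owning class.
    void addCourse(String course) {
        courses.add(course);
    }

    void removeCourse(String course) {
        courses.remove(course);
    }
}
```

An attempt to call \texttt{add} on the returned view throws an
\texttt{UnsupportedOperationException}, while \texttt{addCourse} and
\texttt{removeCourse} keep working as before.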
321 % \explanation{Replace Array with Object}{}{}
323 \explanation{Replace Subclass with Fields}{You have subclasses that vary only in
324 methods that return constant data.}{Change the methods to superclass fields and
325 eliminate the subclasses.}
327 % Simplifying Conditional Expressions
\explanation{Decompose Conditional}{You have a complicated conditional
(if-then-else) statement.}{Extract methods from the condition, then part, and
else parts.}
332 \explanation{Consolidate Conditional Expression}{You have a sequence of
333 conditional tests with the same result.}{Combine them into a single conditional
334 expression and extract it.}
\explanation{Replace Conditional with Polymorphism}{You have a conditional that
chooses different behaviour depending on the type of an object.}{Move each leg
of the conditional to an overriding method in a subclass. Make the original
method abstract.}
341 % Making Method Calls Simpler
342 \explanation{Replace Parameter with Method}{An object invokes a method, then
343 passes the result as a parameter for a method. The receiver can also invoke this
344 method.}{Remove the parameter and let the receiver invoke the method.}
346 \explanation{Introduce Parameter Object}{You have a group of parameters that
347 naturally go together.}{Replace them with an object.}
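The following Java sketch illustrates \emph{Introduce Parameter Object} under
the assumption that a start and an end day always travel together. The classes
\texttt{DayRange} and \texttt{Ledger} and their members are hypothetical.

```java
// Hypothetical example; all names are invented for illustration.
final class DayRange {
    final int start;
    final int end;

    DayRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // Behaviour that belongs to the parameter group can move here.
    boolean includes(int day) {
        return day >= start && day <= end;
    }
}

class Ledger {
    private final int[] days;    // day number of each entry
    private final int[] amounts; // amount of each entry, same index as days

    Ledger(int[] days, int[] amounts) {
        this.days = days;
        this.amounts = amounts;
    }

    // Before: start and end always travel together as two parameters.
    int flowBetweenBefore(int start, int end) {
        int sum = 0;
        for (int i = 0; i < days.length; i++) {
            if (days[i] >= start && days[i] <= end) sum += amounts[i];
        }
        return sum;
    }

    // After: the pair is replaced by one parameter object.
    int flowBetweenAfter(DayRange range) {
        int sum = 0;
        for (int i = 0; i < days.length; i++) {
            if (range.includes(days[i])) sum += amounts[i];
        }
        return sum;
    }
}
```

A side effect worth noting is that the range check now has a natural home in
\texttt{DayRange}, where other methods can reuse it.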
349 % Dealing with Generalization
350 \explanation{Extract Subclass}{A class has features that are used only in some
351 instances.}{Create a subclass for that subset of features.}
\explanation{Extract Superclass}{You have two classes with similar
features.}{Create a superclass and move the common features to the
superclass.}
357 \explanation{Collapse Hierarchy}{A superclass and subclass are not very
358 different.}{Merge them together.}
360 \explanation{Form Template Method}{You have two methods in subclasses that
361 perform similar steps in the same order, yet the steps are different.}{Get the
362 steps into methods with the same signature, so that the original methods become
363 the same. Then you can pull them up.}
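The end result of \emph{Form Template Method} might look like the following
Java sketch. The classes \texttt{AbstractStatement}, \texttt{TextStatement} and
\texttt{HtmlStatement}, as well as the output formats, are hypothetical and only
illustrate the structure.

```java
// Hypothetical example; all names and formats are invented for illustration.
abstract class AbstractStatement {
    // The template method: the shared sequence of steps, pulled up
    // into the superclass once the steps have the same signatures.
    final String render(String customer, int amount) {
        return header(customer) + line(amount) + footer();
    }

    // The steps that differ between the subclasses.
    abstract String header(String customer);
    abstract String line(int amount);
    abstract String footer();
}

class TextStatement extends AbstractStatement {
    String header(String customer) { return "Statement for " + customer + "\n"; }
    String line(int amount) { return "Amount owed: " + amount + "\n"; }
    String footer() { return "Thank you\n"; }
}

class HtmlStatement extends AbstractStatement {
    String header(String customer) { return "<h1>Statement for " + customer + "</h1>"; }
    String line(int amount) { return "<p>Amount owed: " + amount + "</p>"; }
    String footer() { return "<p>Thank you</p>"; }
}
```

The order of the steps is stated once, in \texttt{render}, and each subclass
supplies only what is specific to its output format.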
366 \subsection{Functional refactorings}
368 \explanation{Substitute Algorithm}{You want to replace an algorithm with one
369 that is clearer.}{Replace the body of the method with the new algorithm.}
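A minimal Java sketch of \emph{Substitute Algorithm} is shown here. The class
\texttt{PersonFinder}, its methods and the names being searched for are
hypothetical. Both methods return the first element of the input that matches
one of three names, or an empty string if none matches.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example; all names are invented for illustration.
class PersonFinder {
    // Before: one hand-written comparison per candidate name.
    static String foundPersonBefore(String[] people) {
        for (String person : people) {
            if (person.equals("Don")) return "Don";
            if (person.equals("John")) return "John";
            if (person.equals("Kent")) return "Kent";
        }
        return "";
    }

    // After: the method body is replaced by a clearer algorithm
    // that looks each person up in a list of candidates.
    static String foundPersonAfter(String[] people) {
        List<String> candidates = Arrays.asList("Don", "John", "Kent");
        for (String person : people) {
            if (candidates.contains(person)) return person;
        }
        return "";
    }
}
```

Because the whole body is replaced, it is prudent to run both versions against
the same inputs before discarding the old one.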
372 \section{The impact on software quality}
374 \subsection{What is meant by quality?}
The term \emph{software quality} has many meanings, depending on the context we
put it in. Seen through the eyes of a software developer, it usually means that
the software is easily maintainable and testable, or in other words, that it is
\emph{well designed}. This often correlates with the management perspective,
where \emph{keeping the schedule} and \emph{customer satisfaction} are at the
center. From the customer's point of view, in addition to good usability,
\emph{performance} and \emph{lack of bugs} are always appreciated, measures
that are also shared by the software developer. (In addition, such things as
good documentation could be measured, but this is outside the scope of this
document.)
386 \subsection{The impact on performance}
388 Refactoring certainly will make software go more slowly, but it also makes the
389 software more amenable to performance tuning.~\cite{refactoring} % page 69
There is a common belief that refactoring compromises performance, due to an
increased degree of indirection and because polymorphism is assumed to be slower
than conditionals.
In a survey, Demeyer~\cite{demeyer2002} disproves this view in the case of
polymorphism. He conducts an experiment on what he calls ``Transform Self Type
Checks'', where a new polymorphic method and a new class hierarchy are
introduced to get rid of a class's type checking of a ``type attribute''. He
uses this kind of transformation to represent other ways of replacing
conditionals with polymorphism as well. The experiment is performed on the C++
programming language, with three different compilers and platforms. \todo{But is
the result better?} Demeyer concludes that, with compiler optimization turned
on, polymorphism beats medium to large if-statements and does as well as
case-statements. (This is in accordance with his hypothesis, which is based on
similarities between the way C++ handles polymorphism and case-statements.)
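To make the kind of transformation concrete, the following sketch shows a class
that checks its own type attribute, and a polymorphic replacement. Note the
assumptions: the sketch is written in Java, whereas Demeyer's experiment used
C++, and all class names are hypothetical. It illustrates the structure of the
transformation only and says nothing about the measured performance.

```java
// Hypothetical example; all names are invented for illustration.
// Before: one class inspects its own type attribute.
class TypedShape {
    static final int CIRCLE = 0;
    static final int SQUARE = 1;

    private final int type;
    private final double size;

    TypedShape(int type, double size) {
        this.type = type;
        this.size = size;
    }

    double area() {
        if (type == CIRCLE) {
            return Math.PI * size * size;
        } else if (type == SQUARE) {
            return size * size;
        }
        throw new IllegalStateException("unknown type " + type);
    }
}

// After: a class hierarchy with a polymorphic method replaces the checks.
abstract class Shape {
    abstract double area();
}

class Circle extends Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    double area() {
        return Math.PI * radius * radius;
    }
}

class Square extends Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    double area() {
        return side * side;
    }
}
```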
407 The interesting thing about performance is that if you analyze most programs,
408 you find that they waste most of their time in a small fraction of the code.
So, although an increased number of method calls could potentially slow down
programs, one should avoid premature optimization and sacrificing good design,
leaving the performance tuning until after the software has been profiled and
the actual problem areas have been isolated.
418 \section{Correctness of refactorings}
421 \section{Composite refactorings} \label{intro_composite}
422 % motivation, example(s)
423 % manual vs automated?
424 % what about refactoring in a very large code base?
426 \section{Software metrics}
430 %\chapter{Planning the project}