Commit | Line | Data |
---|---|---|
1 | \documentclass[USenglish]{ifimaster} | |
2 | \usepackage{import} | |
3 | \usepackage[utf8]{inputenc} | |
4 | \usepackage[T1]{fontenc,url} | |
5 | \usepackage{lmodern} % using Latin Modern to be able to use bold typewriter font | |
6 | \urlstyle{sf} | |
7 | \usepackage{listings} | |
8 | \usepackage{tabularx} | |
9 | \usepackage{tikz} | |
10 | \usepackage{tikz-qtree} | |
11 | \usetikzlibrary{shapes,snakes,trees,arrows,shadows,positioning,calc} | |
12 | \usepackage{babel,textcomp,csquotes,ifimasterforside,varioref} | |
13 | \usepackage[hidelinks]{hyperref} | |
14 | \usepackage{cleveref} | |
15 | \usepackage[style=numeric-comp,backend=bibtex]{biblatex} | |
16 | \usepackage{amsthm} | |
17 | \usepackage{graphicx} | |
18 | % use 'disable' before printing: | |
19 | \usepackage[]{todonotes} | |
20 | \usepackage{xspace} | |
21 | \usepackage{he-she} | |
22 | \usepackage{verbatim} | |
23 | \usepackage{minted} | |
24 | \usepackage{multicol} | |
25 | \usemintedstyle{bw} | |
26 | \usepackage{perpage} %the perpage package | |
27 | \MakePerPage{footnote} %the perpage package command | |
28 | ||
29 | \theoremstyle{definition} | |
30 | \newtheorem*{wordDef}{Definition} | |
31 | ||
32 | \graphicspath{ {./figures/} } | |
33 | ||
34 | \newcommand{\citing}[1]{~\cite{#1}} | |
35 | \newcommand{\myref}[1]{\cref{#1} on \cpageref{#1}} | |
36 | ||
37 | \newcommand{\definition}[1]{\begin{wordDef}#1\end{wordDef}} | |
38 | \newcommand{\see}[1]{(see \myref{#1})} | |
39 | \newcommand{\See}[1]{(See \myref{#1}.)} | |
40 | \newcommand{\explanation}[3]{\noindent\textbf{\textit{#1}}\\*\emph{When:} | |
41 | #2\\*\emph{How:} #3\\*[-7px]} | |
42 | ||
43 | %\newcommand{\type}[1]{\lstinline{#1}} | |
44 | \newcommand{\code}[1]{\texttt{\textbf{#1}}} | |
45 | \newcommand{\type}[1]{\code{#1}} | |
46 | \newcommand{\typeref}[1]{\footnote{\type{#1}}} | |
47 | \newcommand{\typewithref}[2]{\type{#2}\typeref{#1.#2}} | |
48 | \newcommand{\method}[1]{\type{#1}} | |
49 | \newcommand{\methodref}[2]{\footnote{\type{#1}\method{\##2()}}} | |
50 | \newcommand{\methodwithref}[2]{\method{#2}\footnote{\type{#1}\method{\##2()}}} | |
51 | \newcommand{\var}[1]{\type{#1}} | |
52 | ||
53 | \newcommand{\refactoring}[1]{\emph{#1}} | |
54 | \newcommand{\ExtractMethod}{\refactoring{Extract Method}\xspace} | |
55 | \newcommand{\MoveMethod}{\refactoring{Move Method}\xspace} | |
56 | \newcommand{\ExtractAndMoveMethod}{\refactoring{Extract and Move Method}\xspace} | |
57 | ||
58 | \newcommand\todoin[2][]{\todo[inline, caption={2do}, #1]{ | |
59 | \begin{minipage}{\textwidth-4pt}#2\end{minipage}}} | |
60 | ||
61 | \title{Refactoring} | |
62 | \subtitle{An essay} | |
63 | \author{Erlend Kristiansen} | |
64 | ||
65 | \bibliography{bibliography/master-thesis-erlenkr-bibliography} | |
66 | ||
67 | % UML comment in TikZ: | |
68 | % ref: https://tex.stackexchange.com/questions/103688/folded-paper-shape-tikz | |
69 | \makeatletter | |
70 | \pgfdeclareshape{umlcomment}{ | |
71 | \inheritsavedanchors[from=rectangle] % this is nearly a rectangle | |
72 | \inheritanchorborder[from=rectangle] | |
73 | \inheritanchor[from=rectangle]{center} | |
74 | \inheritanchor[from=rectangle]{north} | |
75 | \inheritanchor[from=rectangle]{south} | |
76 | \inheritanchor[from=rectangle]{west} | |
77 | \inheritanchor[from=rectangle]{east} | |
78 | % ... and possibly more | |
79 | \backgroundpath{% this is new | |
80 | % store lower right in xa/ya and upper right in xb/yb | |
81 | \southwest \pgf@xa=\pgf@x \pgf@ya=\pgf@y | |
82 | \northeast \pgf@xb=\pgf@x \pgf@yb=\pgf@y | |
83 | % compute corner of ‘‘flipped page’’ | |
84 | \pgf@xc=\pgf@xb \advance\pgf@xc by-10pt % this should be a parameter | |
85 | \pgf@yc=\pgf@yb \advance\pgf@yc by-10pt | |
86 | % construct main path | |
87 | \pgfpathmoveto{\pgfpoint{\pgf@xa}{\pgf@ya}} | |
88 | \pgfpathlineto{\pgfpoint{\pgf@xa}{\pgf@yb}} | |
89 | \pgfpathlineto{\pgfpoint{\pgf@xc}{\pgf@yb}} | |
90 | \pgfpathlineto{\pgfpoint{\pgf@xb}{\pgf@yc}} | |
91 | \pgfpathlineto{\pgfpoint{\pgf@xb}{\pgf@ya}} | |
92 | \pgfpathclose | |
93 | % add little corner | |
94 | \pgfpathmoveto{\pgfpoint{\pgf@xc}{\pgf@yb}} | |
95 | \pgfpathlineto{\pgfpoint{\pgf@xc}{\pgf@yc}} | |
96 | \pgfpathlineto{\pgfpoint{\pgf@xb}{\pgf@yc}} | |
97 | \pgfpathlineto{\pgfpoint{\pgf@xc}{\pgf@yc}} | |
98 | } | |
99 | } | |
100 | \makeatother | |
101 | ||
102 | \tikzstyle{comment}=[% | |
103 | draw, | |
104 | drop shadow, | |
105 | fill=white, | |
106 | align=center, | |
107 | shape=document, | |
108 | minimum width=20mm, | |
109 | minimum height=10mm, | |
110 | shape=umlcomment, | |
111 | inner sep=2ex, | |
112 | font=\ttfamily, | |
113 | ] | |
114 | ||
115 | \begin{document} | |
116 | \ififorside | |
117 | \frontmatter{} | |
118 | ||
119 | ||
120 | \chapter*{Abstract} | |
121 | \todoin{\textbf{Remove all todos (including list) before delivery/printing!!! | |
122 | Can be done by removing ``draft'' from documentclass.}} | |
123 | \todoin{Write abstract} | |
124 | ||
125 | \tableofcontents{} | |
126 | \listoffigures{} | |
127 | \listoftables{} | |
128 | ||
129 | \chapter*{Preface} | |
130 | ||
131 | The discussions in this report must be seen in the context of object oriented | |
132 | programming languages, and Java in particular, since that is the language in | |
133 | which most of the examples will be given. Although the techniques discussed | |
134 | may be applicable to languages from other paradigms, they will not be the | |
135 | subject of this report. | |
136 | ||
137 | \mainmatter | |
138 | ||
139 | \chapter{What is Refactoring?} | |
140 | ||
141 | This question is best answered by first defining the concept of a | |
142 | \emph{refactoring}, what it is to \emph{refactor}, and then discussing what aspects | |
143 | of programming make people want to refactor their code. | |
144 | ||
145 | \section{Defining refactoring} | |
146 | Martin Fowler, in his classic book on refactoring\citing{refactoring}, defines a | |
147 | refactoring like this: | |
148 | ||
149 | \begin{quote} | |
150 | \emph{Refactoring} (noun): a change made to the internal | |
151 | structure\footnote{The structure observable by the programmer.} of software to | |
152 | make it easier to understand and cheaper to modify without changing its | |
153 | observable behavior.~\cite[p.~53]{refactoring} | |
154 | \end{quote} | |
155 | ||
156 | \noindent This definition assigns additional meaning to the word | |
157 | \emph{refactoring}, beyond the composition of the prefix \emph{re-}, usually | |
158 | meaning something like ``again'' or ``anew'', and the word \emph{factoring}, | |
159 | that can mean to isolate the \emph{factors} of something. Here a \emph{factor} | |
160 | would be close to the mathematical definition of something that divides a | |
161 | quantity, without leaving a remainder. Fowler is mixing the \emph{motivation} | |
162 | behind refactoring into his definition. Instead it could be more refined, formed | |
163 | to only consider the \emph{mechanical} and \emph{behavioral} aspects of | |
164 | refactoring. That is to factor the program again, putting it together in a | |
165 | different way than before, while preserving the behavior of the program. An | |
166 | alternative definition could then be: | |
167 | ||
168 | \definition{A \emph{refactoring} is a transformation | |
169 | done to a program without altering its external behavior.} | |
170 | ||
171 | From this we can conclude that a refactoring primarily changes how the | |
172 | \emph{code} of a program is perceived by the \emph{programmer}, and not the | |
173 | \emph{behavior} experienced by any user of the program. Although the logical | |
174 | meaning is preserved, such changes could potentially alter the program's | |
175 | behavior when it comes to performance gains or penalties. So any logic depending | |
176 | on the performance of a program could make the program behave differently after | |
177 | a refactoring. | |
178 | ||
179 | In the extreme case one could argue that such a thing as \emph{software | |
180 | obfuscation} is refactoring. Software obfuscation is to make source code harder | |
181 | to read and analyze, while preserving its semantics. It could be done by composing | |
182 | many, more or less randomly chosen, refactorings. Then the question arises | |
183 | whether it can be called a \emph{composite refactoring} | |
184 | \see{compositeRefactorings} or not. The answer is not obvious. First, there is | |
185 | no way to describe \emph{the} mechanics of software obfuscation, because there | |
186 | are infinitely many ways to do that. Second, \emph{obfuscation} can be thought | |
187 | of as \emph{one operation}: Either the code is obfuscated, or it is not. Third, | |
188 | it makes no sense to call software obfuscation \emph{a} refactoring, since it | |
189 | holds different meaning to different people. The last point is important, since | |
190 | one of the motivations behind defining different refactorings is to build up a | |
191 | vocabulary for software professionals to reason about and discuss programs, | |
192 | similar to the motivation behind design patterns\citing{designPatterns}. So for | |
193 | describing \emph{software obfuscation}, it might be more appropriate to define | |
194 | what you do when performing it rather than precisely defining its mechanics in | |
195 | terms of other refactorings. | |
196 | ||
197 | \section{The etymology of 'refactoring'} | |
198 | It is a little difficult to pinpoint the exact origin of the word | |
199 | ``refactoring'', as it seems to have evolved as part of a colloquial | |
200 | terminology, more than a scientific term. There is no authoritative source for a | |
201 | formal definition of it. | |
202 | ||
203 | According to Martin Fowler\citing{etymology-refactoring}, there may also be more | |
204 | than one origin of the word. The most well-known source, when it comes to the | |
205 | origin of \emph{refactoring}, is the Smalltalk\footnote{\emph{Smalltalk}, | |
206 | object-oriented, dynamically typed, reflective programming language. See | |
207 | \url{http://www.smalltalk.org}} community and their famous \emph{Refactoring | |
208 | Browser}\footnote{\url{http://st-www.cs.illinois.edu/users/brant/Refactory/RefactoringBrowser.html}} | |
209 | described in the article \emph{A Refactoring Tool for | |
210 | Smalltalk}\citing{refactoringBrowser1997}, published in 1997. | |
211 | Allegedly\citing{etymology-refactoring}, the metaphor of factoring programs was | |
212 | also present in the Forth\footnote{\emph{Forth} -- stack-based, extensible | |
213 | programming language, without type-checking. See \url{http://www.forth.org}} | |
214 | community, and the word ``refactoring'' is mentioned in a book by Leo Brodie, | |
215 | called \emph{Thinking Forth}\citing{brodie1984}, first published in | |
216 | 1984\footnote{\emph{Thinking Forth} was first published in 1984 by the | |
217 | \emph{Forth Interest Group}. Then it was reprinted in 1994 with minor | |
218 | typographical corrections, before it was transcribed into an electronic edition | |
219 | typeset in \LaTeX\ and published under a Creative Commons licence in 2004. The | |
220 | edition cited here is the 2004 edition, but the content should essentially be as | |
221 | in 1984.}. The exact word is only printed in one place~\cite[p.~232]{brodie1984}, | |
222 | but the term \emph{factoring} is prominent in the book, which also contains a | |
223 | whole chapter dedicated to (re)factoring, and how to keep the (Forth) code clean | |
224 | and maintainable. | |
225 | ||
226 | \begin{quote} | |
227 | \ldots good factoring technique is perhaps the most important skill for a | |
228 | Forth programmer.~\cite[p.~172]{brodie1984} | |
229 | \end{quote} | |
230 | ||
231 | \noindent Brodie also expresses what \emph{factoring} means to him: | |
232 | ||
233 | \begin{quote} | |
234 | Factoring means organizing code into useful fragments. To make a fragment | |
235 | useful, you often must separate reusable parts from non-reusable parts. The | |
236 | reusable parts become new definitions. The non-reusable parts become arguments | |
237 | or parameters to the definitions.~\cite[p.~172]{brodie1984} | |
238 | \end{quote} | |
239 | ||
240 | Fowler claims that the usage of the word \emph{refactoring} did not pass between | |
241 | the \emph{Forth} and \emph{Smalltalk} communities, but that it emerged | |
242 | independently in each of the communities. | |
243 | ||
244 | \section{Motivation -- Why people refactor} | |
245 | There are many reasons why people want to refactor their programs. They can for | |
246 | instance do it to remove duplication, break up long methods or to introduce | |
247 | design patterns\citing{designPatterns} into their software systems. The shared | |
248 | trait of all these is that people's intentions are to make their programs | |
249 | \emph{better}, in some sense. But what aspects of their programs are becoming | |
250 | improved? | |
251 | ||
252 | As already mentioned, people often refactor to get rid of duplication: moving | |
253 | identical or similar code into methods, maybe pushing methods up or down in | |
254 | their class hierarchies, making template methods for overlapping | |
255 | algorithms/functionality, and so on. It is all about gathering what belongs | |
256 | together and putting it all in one place. The resulting code is then easier to | |
257 | maintain. When removing the implicit coupling\footnote{When duplicating code, | |
258 | the code might not be coupled in other ways than that it is supposed to | |
259 | represent the same functionality. So if this functionality is going to change, | |
260 | it might need to change in more than one place, thus creating an implicit | |
261 | coupling between the multiple pieces of code.} between code snippets, the | |
262 | location of a bug is limited to only one place, and new functionality needs only | |
263 | to be added to this one place, instead of a number of places people might not | |
264 | even remember. | |
265 | ||
266 | A problem you often encounter when programming is that a program contains a lot | |
267 | of long and hard-to-grasp methods. It can then help to break the methods into | |
268 | smaller ones, using the \ExtractMethod refactoring\citing{refactoring}. Then you | |
269 | may discover something about a program that you were not aware of before; | |
270 | revealing bugs you did not know about or could not find due to the complex | |
271 | structure of your program. \todo{Proof?} Making the methods smaller and giving | |
272 | good names to the new ones clarifies the algorithms and enhances the | |
273 | \emph{understandability} of the program \see{magic_number_seven}. This makes | |
274 | refactoring an excellent method for exploring unknown program code, or code that | |
275 | you had forgotten that you wrote. | |
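
As a minimal sketch of the effect described above, consider the following Java class. The class \type{Order}, its fields and its method names are hypothetical and invented only for this illustration. The method \method{totalBefore} mixes two steps in one body, while \method{totalAfter} shows the same computation after \ExtractMethod has given each step a name of its own.

```java
import java.util.List;

// Hypothetical example: Order and all its members are invented for illustration.
class Order {
    private final List<Double> prices;
    private final double taxRate;

    Order(List<Double> prices, double taxRate) {
        this.prices = prices;
        this.taxRate = taxRate;
    }

    // Before: summing the prices and adding tax are tangled in one method.
    double totalBefore() {
        double sum = 0;
        for (double price : prices) {
            sum += price;
        }
        return sum + sum * taxRate;
    }

    // After Extract Method: the method reads as a short description of itself.
    double totalAfter() {
        double subtotal = subtotal();
        return subtotal + tax(subtotal);
    }

    // Extracted: the sum of all prices.
    private double subtotal() {
        double sum = 0;
        for (double price : prices) {
            sum += price;
        }
        return sum;
    }

    // Extracted: the tax owed on a given amount.
    private double tax(double amount) {
        return amount * taxRate;
    }
}
```

Both methods compute the same value, so the behavior is preserved. Only the way the code presents itself to the reader has changed.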
276 | ||
277 | Most primitive refactorings are simple. Their true power is first revealed when | |
278 | they are combined into larger --- higher level --- refactorings, called | |
279 | \emph{composite refactorings} \see{compositeRefactorings}. Often the goal of | |
280 | such a series of refactorings is a design pattern. Thus the \emph{design} can be | |
281 | evolved throughout the lifetime of a program, as opposed to designing up-front. | |
282 | It is all about being structured and taking small steps to improve a program's | |
283 | design. | |
284 | ||
285 | Many software design patterns are aimed at lowering the coupling between | |
286 | different classes and different layers of logic. One of the most famous is | |
287 | perhaps the \emph{Model-View-Controller}\citing{designPatterns} pattern. It is | |
288 | aimed at lowering the coupling between the user interface and the business logic | |
289 | and data representation of a program. This also has the added benefit that the | |
290 | business logic could much more easily be the target of automated tests, increasing | |
291 | the productivity in the software development process. Refactoring is an | |
292 | important tool on the way to something greater. | |
293 | ||
294 | Another effect of refactoring is that with the increased separation of concerns | |
295 | coming out of many refactorings, the \emph{performance} can be improved. When | |
296 | profiling programs, the problematic parts are narrowed down to smaller parts of | |
297 | the code, which are easier to tune, and optimization can be performed only where | |
298 | needed and in a more effective way. | |
299 | ||
300 | Last, but not least, and this is probably the best reason to refactor, is | |
301 | refactoring to \emph{facilitate a program change}. If one has managed to keep | |
302 | one's code clean and tidy, and the code is not bloated with design patterns that | |
303 | are not ever going to be needed, then some refactoring might be needed to | |
304 | introduce a design pattern that is appropriate for the change that is going to | |
305 | happen. | |
306 | ||
307 | Refactoring program code --- with a goal in mind --- can give the code itself | |
308 | more value. That is in the form of robustness to bugs, understandability and | |
309 | maintainability. Having robust code is an obvious advantage, but | |
310 | understandability and maintainability are both very important aspects of | |
311 | software development. By incorporating refactoring in the development process, | |
312 | bugs are found faster, new functionality is added more easily and code is easier | |
313 | to understand by the next person exposed to it, which might as well be the | |
314 | person who wrote it. The consequence of this is that refactoring can increase | |
315 | the average productivity of the development process, and thus also add to the | |
316 | monetary value of a business in the long run. The perspective on productivity | |
317 | and money should also be able to open the eyes of the many nearsighted managers | |
318 | that seldom see beyond the next milestone. | |
319 | ||
320 | \section{The magical number seven}\label{magic_number_seven} | |
321 | The article \emph{The magical number seven, plus or minus two: some limits on | |
322 | our capacity for processing information}\citing{miller1956} by George A. | |
323 | Miller, was published in the journal \emph{Psychological Review} in 1956. It | |
324 | presents evidence that the number of objects a | |
325 | human being can hold in working memory is roughly seven, plus or minus two | |
326 | objects. This number varies a bit depending on the nature and complexity of the | |
327 | objects, but is according to Miller ``\ldots never changing so much as to be | |
328 | unrecognizable.'' | |
329 | ||
330 | Miller's article culminates in the section called \emph{Recoding}, a term he | |
331 | borrows from communication theory. The central result in this section is that by | |
332 | recoding information, the amount of information that a human can | |
333 | process at a time is increased. By \emph{recoding}, Miller means to group | |
334 | objects together in chunks and give each chunk a new name that it can be | |
335 | remembered by. By organizing objects into patterns of ever growing depth, one | |
336 | can memorize and process a much larger amount of data than if it were to be | |
337 | represented as its basic pieces. This grouping and renaming is analogous to how | |
338 | many refactorings work, by grouping pieces of code and giving them a new name. | |
339 | Examples are the fundamental \ExtractMethod and \refactoring{Extract Class} | |
340 | refactorings\citing{refactoring}. | |
341 | ||
342 | \begin{quote} | |
343 | \ldots recoding is an extremely powerful weapon for increasing the amount of | |
344 | information that we can deal with.~\cite[p.~95]{miller1956} | |
345 | \end{quote} | |
346 | ||
347 | An example from the article addresses the problem of memorizing a sequence of | |
348 | binary digits. Let us say we have the following sequence\footnote{The example | |
349 | presented here is slightly modified (and shortened) from what is presented in | |
350 | the original article\citing{miller1956}, but it is essentially the same.} of | |
351 | 16 binary digits: ``1010001001110011''. Most of us will have a hard time | |
352 | memorizing this sequence by only reading it once or twice. Imagine if we instead | |
353 | translate it to this sequence: ``A273''. If you have a background from computer | |
354 | science, it will be obvious that the latter sequence is the first sequence | |
355 | recoded to be represented by digits with base 16. Most people should be able to | |
356 | memorize this last sequence by only looking at it once. | |
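
The recoding in this example can be sketched in a few lines of Java. The class name \type{Recoder} and the method name \method{toHex} are hypothetical and chosen only for this illustration. The method groups the binary digits into chunks of four and gives each chunk a new one-character name, which is the same grouping and renaming that Miller describes.

```java
// Hypothetical helper, invented for illustration.
class Recoder {
    // Recodes a string of binary digits into base 16, one chunk of four digits at a time.
    static String toHex(String bits) {
        if (bits.length() % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of four");
        }
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < bits.length(); i += 4) {
            // Each chunk of four binary digits has a value between 0 and 15.
            int value = Integer.parseInt(bits.substring(i, i + 4), 2);
            hex.append(Character.toUpperCase(Character.forDigit(value, 16)));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // 1010 -> A, 0010 -> 2, 0111 -> 7, 0011 -> 3
        System.out.println(toHex("1010001001110011")); // prints A273
    }
}
```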
357 | ||
358 | Another result from the Miller article is that when the amount of information a | |
359 | human must interpret increases, it is crucial that the translation from one code | |
360 | to another must be almost automatic for the subject to be able to remember the | |
361 | translation, before \heshe is presented with new information to recode. Thus | |
362 | learning and understanding how to best organize certain kinds of data is | |
363 | essential to efficiently handle that kind of data in the future. This is much | |
364 | like when humans learn to read. First they must learn how to recognize letters. | |
365 | Then they can learn distinct words, and later read sequences of words that form | |
366 | whole sentences. Eventually, most of them will be able to read whole books and | |
367 | briefly retell the important parts of their content. This suggests that the use of | |
368 | design patterns\citing{designPatterns} is a good idea when reasoning about | |
369 | computer programs. With extensive use of design patterns when creating complex | |
370 | program structures, one does not always have to read whole classes of code to | |
371 | comprehend how they function; it may be sufficient to only see the name of a | |
372 | class to almost fully understand its responsibilities. | |
373 | ||
374 | \begin{quote} | |
375 | Our language is tremendously useful for repackaging material into a few chunks | |
376 | rich in information.~\cite[p.~95]{miller1956} | |
377 | \end{quote} | |
378 | ||
379 | Without further evidence, these results at least indicate that refactoring | |
380 | source code into smaller units with higher cohesion and, when needed, | |
381 | introducing appropriate design patterns, should aid in the cause of creating | |
382 | computer programs that are easier to maintain and have code that is easier (and | |
383 | better) understood. | |
384 | ||
385 | \section{Notable contributions to the refactoring literature} | |
386 | \todoin{Update with more contributions} | |
387 | ||
388 | \begin{description} | |
389 | \item[1992] William F. Opdyke submits his doctoral dissertation called | |
390 | \emph{Refactoring Object-Oriented Frameworks}\citing{opdyke1992}. This | |
391 | work defines a set of refactorings, that are behavior preserving given that | |
392 | their preconditions are met. The dissertation is focused on the automation | |
393 | of refactorings. | |
394 | \item[1999] Martin Fowler et al.: \emph{Refactoring: Improving the Design of | |
395 | Existing Code}\citing{refactoring}. This is maybe the most influential text | |
396 | on refactoring. It bears similarities to Opdyke's thesis\citing{opdyke1992} | |
397 | in the way that it provides a catalog of refactorings. But Fowler's book is | |
398 | more about the craft of refactoring, as he focuses on establishing a | |
399 | vocabulary for refactoring, together with the mechanics of different | |
400 | refactorings and when to perform them. His methodology is also founded on | |
401 | the principles of test-driven development. | |
402 | \item[2005] Joshua Kerievsky: \emph{Refactoring to | |
403 | Patterns}\citing{kerievsky2005}. This book is heavily influenced by Fowler's | |
404 | \emph{Refactoring}\citing{refactoring} and the ``Gang of Four'' \emph{Design | |
405 | Patterns}\citing{designPatterns}. It is building on the refactoring | |
406 | catalog from Fowler's book, but is trying to bridge the gap between | |
407 | \emph{refactoring} and \emph{design patterns} by providing a series of | |
408 | higher-level composite refactorings that make code evolve toward or away | |
409 | from certain design patterns. The book is trying to build up the reader's | |
410 | intuition around \emph{why} one would want to use a particular design | |
411 | pattern, and not just \emph{how}. The book is encouraging evolutionary | |
412 | design. \See{relationToDesignPatterns} | |
413 | \end{description} | |
414 | ||
415 | \section{Tool support (for Java)}\label{toolSupport} | |
416 | This section will briefly compare the refactoring support of the three IDEs | |
417 | \emph{Eclipse}\footnote{\url{http://www.eclipse.org/}}, \emph{IntelliJ | |
418 | IDEA}\footnote{The IDE under comparison is the \emph{Community Edition}, | |
419 | \url{http://www.jetbrains.com/idea/}} and | |
420 | \emph{NetBeans}\footnote{\url{https://netbeans.org/}}. These are the most | |
421 | popular Java IDEs\citing{javaReport2011}. | |
422 | ||
423 | All three IDEs provide support for the most useful refactorings, like the | |
424 | different extract, move and rename refactorings. In fact, Java-targeted IDEs are | |
425 | known for their good refactoring support, so this did not appear as a big | |
426 | surprise. | |
427 | ||
428 | The IDEs seem to have excellent support for the \ExtractMethod refactoring, so | |
429 | at least they have all passed the first refactoring | |
430 | Rubicon\citing{fowlerRubicon2001,secondRubicon2012}. | |
431 | ||
432 | Regarding the \MoveMethod refactoring, the \emph{Eclipse} and \emph{IntelliJ} | |
433 | IDEs do the job in very similar manners. In most situations they both do a | |
434 | satisfactory job by producing the expected outcome. But they do nothing to check | |
435 | that the result does not break the semantics of the program \see{correctness}. | |
436 | The \emph{NetBeans} IDE implements this refactoring in a somewhat | |
437 | unsophisticated way. For starters, its default destination for the move is | |
438 | itself, although it refuses to perform the refactoring if chosen. But the worst | |
439 | part is that, when moving the method \method{f} of the class \type{C} to the class | |
440 | \type{X}, it will break the code. The result is shown in | |
441 | \myref{lst:moveMethod_NetBeans}. | |
442 | ||
443 | \begin{listing} | |
444 | \begin{multicols}{2} | |
445 | \begin{minted}[samepage]{java} | |
446 | public class C { | |
447 | private X x; | |
448 | ... | |
449 | public void f() { | |
450 | x.m(); | |
451 | x.n(); | |
452 | } | |
453 | } | |
454 | \end{minted} | |
455 | ||
456 | \columnbreak | |
457 | ||
458 | \begin{minted}[samepage]{java} | |
459 | public class X { | |
460 | ... | |
461 | public void f(C c) { | |
462 | c.x.m(); | |
463 | c.x.n(); | |
464 | } | |
465 | } | |
466 | \end{minted} | |
467 | \end{multicols} | |
468 | \caption{Moving method \method{f} from \type{C} to \type{X}.} | |
469 | \label{lst:moveMethod_NetBeans} | |
470 | \end{listing} | |
471 | ||
472 | NetBeans will try to make code that calls the methods \method{m} and \method{n} | |
473 | of \type{X} by accessing them through \var{c.x}, where \var{c} is a parameter of | |
474 | type \type{C} that is added to the method \method{f} when it is moved. (This is | |
475 | seldom the desired outcome of this refactoring, but ironically, this ``feature'' | |
476 | keeps NetBeans from breaking the code in the example from \myref{correctness}.) | |
477 | If \var{c.x} for some reason is inaccessible to \type{X}, as in this case, the | |
478 | refactoring breaks the code, and it will not compile. NetBeans presents a | |
479 | preview of the refactoring outcome, but the preview does not catch it if the IDE | |
480 | is about to break the program. | |
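
For comparison, here is a sketch of an outcome that would keep the example compiling: the moved method calls \method{m} and \method{n} directly on its new home, and \type{C} keeps a delegating method so that existing callers still work. This is an assumption about what a programmer would typically hope for, not a description of what any of the three IDEs produces. The bodies of \method{m} and \method{n}, the \var{calls} counter and the \method{callCount} accessor are hypothetical and exist only to make the sketch runnable.

```java
// Hypothetical sketch of a behavior-preserving Move Method outcome.
class X {
    int calls = 0; // invented for illustration: counts invocations of m and n

    void m() { calls++; }
    void n() { calls++; }

    // f now lives in X and no longer needs access to a private field of C.
    void f() {
        m();
        n();
    }
}

class C {
    private final X x = new X();

    // Delegating method left behind, so callers of C.f() are unaffected.
    void f() {
        x.f();
    }

    // Invented accessor, used only to observe the effect of f.
    int callCount() {
        return x.calls;
    }
}
```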
481 | ||
482 | The IDEs under investigation seem to have fairly good support for primitive | |
483 | refactorings, but what about more complex ones, such as the \refactoring{Extract | |
484 | Class}\citing{refactoring}? The \refactoring{Extract Class} refactoring works by | |
485 | creating a class, and then moving members to that class and accessing them from | |
486 | the old class via a reference to the new class. \emph{IntelliJ} handles this in | |
487 | a fairly good manner, although, in the case of private methods, it leaves unused | |
488 | methods behind. These are methods that delegate to a field with the type of the | |
489 | new class, but are not used anywhere. \emph{Eclipse} has added (or withdrawn) | |
490 | its own quirk to the Extract Class refactoring, and only allows for | |
491 | \emph{fields} to be moved to a new class, \emph{not methods}. This makes it | |
492 | effectively only extracting a data structure, and calling it | |
493 | \refactoring{Extract Class} is a little misleading. One would often be better | |
494 | off with textual extract and paste than using the Extract Class refactoring in | |
495 | Eclipse. When it comes to \emph{NetBeans}, it does not even seem to have made an | |
496 | attempt at providing this refactoring. (Well, it probably has, but it does not | |
497 | show in the IDE.) | |
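
The mechanics of \refactoring{Extract Class} described above can be sketched as follows. All class and member names are hypothetical and chosen only for this illustration. \type{PersonBefore} holds both a name and the pieces of a telephone number. After the refactoring, the telephone fields \emph{and} the method that formats them live in the new class \type{TelephoneNumber}, and \type{Person} reaches them through a reference to it.

```java
// Hypothetical example: all names are invented for illustration.

// Before: one class with two responsibilities.
class PersonBefore {
    private final String name;
    private final String areaCode;
    private final String number;

    PersonBefore(String name, String areaCode, String number) {
        this.name = name;
        this.areaCode = areaCode;
        this.number = number;
    }

    String telephoneNumber() {
        return "(" + areaCode + ") " + number;
    }
}

// After: the extracted class owns both the fields and the behavior.
class TelephoneNumber {
    private final String areaCode;
    private final String number;

    TelephoneNumber(String areaCode, String number) {
        this.areaCode = areaCode;
        this.number = number;
    }

    String format() {
        return "(" + areaCode + ") " + number;
    }
}

// After: the old class accesses the moved members via a reference.
class Person {
    private final String name;
    private final TelephoneNumber officeTelephone;

    Person(String name, String areaCode, String number) {
        this.name = name;
        this.officeTelephone = new TelephoneNumber(areaCode, number);
    }

    String telephoneNumber() {
        return officeTelephone.format();
    }
}
```

A variant that moved only the two fields, leaving \method{telephoneNumber} to assemble the string in \type{Person}, would correspond to the fields-only behavior attributed to Eclipse above.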
498 | ||
499 | \todoin{Visual Studio (C++/C\#), Smalltalk refactoring browser?, | |
500 | second refactoring rubicon?} | |
501 | ||
502 | \section{The relation to design patterns}\label{relationToDesignPatterns} | |
503 | ||
504 | \emph{Refactoring} and \emph{design patterns} have at least one thing in common: | |
505 | they are both promoted by advocates of \emph{clean code}\citing{cleanCode} as | |
506 | fundamental tools on the road to more maintainable and extendable source code. | |
507 | ||
508 | \begin{quote} | |
509 | Design patterns help you determine how to reorganize a design, and they can | |
510 | reduce the amount of refactoring you need to do | |
511 | later.~\cite[p.~353]{designPatterns} | |
512 | \end{quote} | |
513 | ||
514 | Although sometimes associated with | |
515 | over-engineering\citing{kerievsky2005,refactoring}, design patterns are in | |
516 | general assumed to be good for maintainability of source code. That may be | |
517 | because many of them are designed to support the \emph{open/closed principle} of | |
518 | object-oriented programming. The principle was first formulated by Bertrand | |
519 | Meyer, the creator of the Eiffel programming language, like this: ``Modules | |
520 | should be both open and closed.''\citing{meyer1988} It has been popularized, | |
521 | with this as a common version: | |
522 | ||
523 | \begin{quote} | |
524 | Software entities (classes, modules, functions, etc.) should be open for | |
525 | extension, but closed for modification.\footnote{See | |
526 | \url{http://c2.com/cgi/wiki?OpenClosedPrinciple} or | |
527 | \url{https://en.wikipedia.org/wiki/Open/closed_principle}} | |
528 | \end{quote} | |
529 | ||
530 | Maintainability is often thought of as the ability to introduce new | |
531 | functionality without having to change too much of the old code. When | |
532 | refactoring, the motivation is often to facilitate adding new functionality. It | |
533 | is about factoring the old code in a way that lets the new functionality | |
534 | benefit from the functionality already residing in a software system, | |
535 | without having to copy old code into new. Then, the next time someone adds new | |
536 | functionality, it is less likely that the old code has to change. Assuming that | |
537 | a design pattern is the best way to get rid of duplication and assist in | |
538 | implementing new functionality, it is reasonable to conclude that a design | |
539 | pattern often is the target of a series of refactorings. Having a repertoire of | |
540 | design patterns can also help in knowing when and how to refactor a program to | |
541 | make it reflect certain desired characteristics. | |
542 | ||
543 | \begin{quote} | |
544 | There is a natural relation between patterns and refactorings. Patterns are | |
545 | where you want to be; refactorings are ways to get there from somewhere | |
546 | else.~\cite[p.~107]{refactoring} | |
547 | \end{quote} | |
548 | ||
549 | This quote is wise in many contexts, but it is not always appropriate to say | |
550 | ``Patterns are where you want to be\ldots''. \emph{Sometimes}, patterns are | |
551 | where you want to be, but only because it will benefit your design. It is not | |
552 | true that one should always try to incorporate as many design patterns as | |
553 | possible into a program. Patterns have no intrinsic value. They only add | |
554 | value to a system when they support its design. Otherwise, the use of design | |
555 | patterns may only lead to a program that is more complex than necessary. | |
556 | ||
557 | \begin{quote} | |
558 | The overuse of patterns tends to result from being patterns happy. We are | |
559 | \emph{patterns happy} when we become so enamored of patterns that we simply | |
560 | must use them in our code.~\cite[p.~24]{kerievsky2005} | |
561 | \end{quote} | |
562 | ||
563 | This can easily happen when relying largely on up-front design. Then it is | |
564 | natural, in the very beginning, to try to build in all the flexibility that one | |
565 | believes will be necessary throughout the lifetime of a software system. | |
566 | According to Joshua Kerievsky ``That sounds reasonable --- if you happen to be | |
567 | psychic.''~\cite[p.~1]{kerievsky2005} He is advocating what he believes is a | |
568 | better approach: to let software continually evolve, starting with a simple | |
569 | design that meets today's needs and tackling future needs by refactoring to | |
570 | satisfy them. He believes that this is a more economical approach than investing | |
571 | time and money into a design that inevitably is going to change. By relying on | |
572 | continuously refactoring a system, its design can be made simpler without | |
573 | sacrificing flexibility. To be able to fully rely on this approach, it is of | |
574 | utmost importance to have a reliable suite of tests to lean on. \See{testing} This | |
575 | makes the design process more natural and less characterized by difficult | |
576 | decisions that have to be made before proceeding in the process, and that are | |
577 | going to define a project for all of its unforeseeable future. | |
578 | ||
579 | \begin{comment} | |
580 | ||
581 | \section{Classification of refactorings} | |
582 | % only interesting refactorings | |
583 | % with 2 detailed examples? One for structured and one for intra-method? | |
584 | % Is replacing Bubblesort with Quick Sort considered a refactoring? | |
585 | ||
586 | \subsection{Structural refactorings} | |
587 | ||
588 | \subsubsection{Primitive refactorings} | |
589 | ||
590 | % Composing Methods | |
591 | \explanation{Extract Method}{You have a code fragment that can be grouped | |
592 | together.}{Turn the fragment into a method whose name explains the purpose of | |
593 | the method.} | |
594 | ||
595 | \explanation{Inline Method}{A method's body is just as clear as its name.}{Put | |
596 | the method's body into the body of its callers and remove the method.} | |
597 | ||
598 | \explanation{Inline Temp}{You have a temp that is assigned to once with a simple | |
599 | expression, and the temp is getting in the way of other refactorings.}{Replace | |
600 | all references to that temp with the expression} | |
601 | ||
602 | % Moving Features Between Objects | |
603 | \explanation{Move Method}{A method is, or will be, using or used by more | |
604 | features of another class than the class on which it is defined.}{Create a new | |
605 | method with a similar body in the class it uses most. Either turn the old method | |
606 | into a simple delegation, or remove it altogether.} | |
607 | ||
608 | \explanation{Move Field}{A field is, or will be, used by another class more than | |
609 | the class on which it is defined}{Create a new field in the target class, and | |
610 | change all its users.} | |
611 | ||
612 | % Organizing Data | |
613 | \explanation{Replace Magic Number with Symbolic Constant}{You have a literal | |
614 | number with a particular meaning.}{Create a constant, name it after the meaning, | |
615 | and replace the number with it.} | |
616 | ||
617 | \explanation{Encapsulate Field}{There is a public field.}{Make it private and | |
618 | provide accessors.} | |
619 | ||
620 | \explanation{Replace Type Code with Class}{A class has a numeric type code that | |
621 | does not affect its behavior.}{Replace the number with a new class.} | |
622 | ||
623 | \explanation{Replace Type Code with Subclasses}{You have an immutable type code | |
624 | that affects the behavior of a class.}{Replace the type code with subclasses.} | |
625 | ||
626 | \explanation{Replace Type Code with State/Strategy}{You have a type code that | |
627 | affects the behavior of a class, but you cannot use subclassing.}{Replace the | |
628 | type code with a state object.} | |
629 | ||
630 | % Simplifying Conditional Expressions | |
631 | \explanation{Consolidate Duplicate Conditional Fragments}{The same fragment of | |
632 | code is in all branches of a conditional expression.}{Move it outside of the | |
633 | expression.} | |
634 | ||
635 | \explanation{Remove Control Flag}{You have a variable that is acting as a | |
636 | control flag for a series of boolean expressions.}{Use a break or return | |
637 | instead.} | |
638 | ||
639 | \explanation{Replace Nested Conditional with Guard Clauses}{A method has | |
640 | conditional behavior that does not make clear the normal path of | |
641 | execution.}{Use guard clauses for all special cases.} | |
642 | ||
643 | \explanation{Introduce Null Object}{You have repeated checks for a null | |
644 | value.}{Replace the null value with a null object.} | |
645 | ||
646 | \explanation{Introduce Assertion}{A section of code assumes something about the | |
647 | state of the program.}{Make the assumption explicit with an assertion.} | |
648 | ||
649 | % Making Method Calls Simpler | |
650 | \explanation{Rename Method}{The name of a method does not reveal its | |
651 | purpose.}{Change the name of the method} | |
652 | ||
653 | \explanation{Add Parameter}{A method needs more information from its | |
654 | caller.}{Add a parameter for an object that can pass on this information.} | |
655 | ||
656 | \explanation{Remove Parameter}{A parameter is no longer used by the method | |
657 | body.}{Remove it.} | |
658 | ||
659 | %\explanation{Parameterize Method}{Several methods do similar things but with | |
660 | %different values contained in the method.}{Create one method that uses a | |
661 | %parameter for the different values.} | |
662 | ||
663 | \explanation{Preserve Whole Object}{You are getting several values from an | |
664 | object and passing these values as parameters in a method call.}{Send the whole | |
665 | object instead.} | |
666 | ||
667 | \explanation{Remove Setting Method}{A field should be set at creation time and | |
668 | never altered.}{Remove any setting method for that field.} | |
669 | ||
670 | \explanation{Hide Method}{A method is not used by any other class.}{Make the | |
671 | method private.} | |
672 | ||
673 | \explanation{Replace Constructor with Factory Method}{You want to do more than | |
674 | simple construction when you create an object}{Replace the constructor with a | |
675 | factory method.} | |
676 | ||
677 | % Dealing with Generalization | |
678 | \explanation{Pull Up Field}{Two subclasses have the same field.}{Move the field | |
679 | to the superclass.} | |
680 | ||
681 | \explanation{Pull Up Method}{You have methods with identical results on | |
682 | subclasses.}{Move them to the superclass.} | |
683 | ||
684 | \explanation{Push Down Method}{Behavior on a superclass is relevant only for | |
685 | some of its subclasses.}{Move it to those subclasses.} | |
686 | ||
687 | \explanation{Push Down Field}{A field is used only by some subclasses.}{Move the | |
688 | field to those subclasses} | |
689 | ||
690 | \explanation{Extract Interface}{Several clients use the same subset of a class's | |
691 | interface, or two classes have part of their interfaces in common.}{Extract the | |
692 | subset into an interface.} | |
693 | ||
694 | \explanation{Replace Inheritance with Delegation}{A subclass uses only part of a | |
695 | superclass's interface or does not want to inherit data.}{Create a field for the | |
696 | superclass, adjust methods to delegate to the superclass, and remove the | |
697 | subclassing.} | |
698 | ||
699 | \explanation{Replace Delegation with Inheritance}{You're using delegation and | |
700 | are often writing many simple delegations for the entire interface}{Make the | |
701 | delegating class a subclass of the delegate.} | |
702 | ||
703 | \subsubsection{Composite refactorings} | |
704 | ||
705 | % Composing Methods | |
706 | % \explanation{Replace Method with Method Object}{}{} | |
707 | ||
708 | % Moving Features Between Objects | |
709 | \explanation{Extract Class}{You have one class doing work that should be done by | |
710 | two}{Create a new class and move the relevant fields and methods from the old | |
711 | class into the new class.} | |
712 | ||
713 | \explanation{Inline Class}{A class isn't doing very much.}{Move all its features | |
714 | into another class and delete it.} | |
715 | ||
716 | \explanation{Hide Delegate}{A client is calling a delegate class of an | |
717 | object.}{Create Methods on the server to hide the delegate.} | |
718 | ||
719 | \explanation{Remove Middle Man}{A class is doing too much simple delegation.}{Get | |
720 | the client to call the delegate directly.} | |
721 | ||
722 | % Organizing Data | |
723 | \explanation{Replace Data Value with Object}{You have a data item that needs | |
724 | additional data or behavior.}{Turn the data item into an object.} | |
725 | ||
726 | \explanation{Change Value to Reference}{You have a class with many equal | |
727 | instances that you want to replace with a single object.}{Turn the object into a | |
728 | reference object.} | |
729 | ||
730 | \explanation{Encapsulate Collection}{A method returns a collection}{Make it | |
731 | return a read-only view and provide add/remove methods.} | |
732 | ||
733 | % \explanation{Replace Array with Object}{}{} | |
734 | ||
735 | \explanation{Replace Subclass with Fields}{You have subclasses that vary only in | |
736 | methods that return constant data.}{Change the methods to superclass fields and | |
737 | eliminate the subclasses.} | |
738 | ||
739 | % Simplifying Conditional Expressions | |
740 | \explanation{Decompose Conditional}{You have a complicated conditional | |
741 | (if-then-else) statement.}{Extract methods from the condition, then part, and | |
742 | else part.} | |
743 | ||
744 | \explanation{Consolidate Conditional Expression}{You have a sequence of | |
745 | conditional tests with the same result.}{Combine them into a single conditional | |
746 | expression and extract it.} | |
747 | ||
748 | \explanation{Replace Conditional with Polymorphism}{You have a conditional that | |
749 | chooses different behavior depending on the type of an object.}{Move each leg | |
750 | of the conditional to an overriding method in a subclass. Make the original | |
751 | method abstract.} | |
752 | ||
753 | % Making Method Calls Simpler | |
754 | \explanation{Replace Parameter with Method}{An object invokes a method, then | |
755 | passes the result as a parameter for a method. The receiver can also invoke this | |
756 | method.}{Remove the parameter and let the receiver invoke the method.} | |
757 | ||
758 | \explanation{Introduce Parameter Object}{You have a group of parameters that | |
759 | naturally go together.}{Replace them with an object.} | |
760 | ||
761 | % Dealing with Generalization | |
762 | \explanation{Extract Subclass}{A class has features that are used only in some | |
763 | instances.}{Create a subclass for that subset of features.} | |
764 | ||
765 | \explanation{Extract Superclass}{You have two classes with similar | |
766 | features.}{Create a superclass and move the common features to the | |
767 | superclass.} | |
768 | ||
769 | \explanation{Collapse Hierarchy}{A superclass and subclass are not very | |
770 | different.}{Merge them together.} | |
771 | ||
772 | \explanation{Form Template Method}{You have two methods in subclasses that | |
773 | perform similar steps in the same order, yet the steps are different.}{Get the | |
774 | steps into methods with the same signature, so that the original methods become | |
775 | the same. Then you can pull them up.} | |
776 | ||
777 | ||
778 | \subsection{Functional refactorings} | |
779 | ||
780 | \explanation{Substitute Algorithm}{You want to replace an algorithm with one | |
781 | that is clearer.}{Replace the body of the method with the new algorithm.} | |
782 | ||
783 | \end{comment} | |
784 | ||
785 | \section{The impact on software quality} | |
786 | ||
787 | \subsection{What is software quality?} | |
788 | The term \emph{software quality} has many meanings. It all depends on the | |
789 | context we put it in. If we look at it with the eyes of a software developer, it | |
790 | usually means that the software is easily maintainable and testable, or in other | |
791 | words, that it is \emph{well designed}. This often correlates with the | |
792 | management scale, where \emph{keeping the schedule} and \emph{customer | |
793 | satisfaction} are at the center. From the customer's point of view, in addition to | |
794 | good usability, \emph{performance} and \emph{lack of bugs} are always | |
795 | appreciated, measures that are also shared by the software developer. (In | |
796 | addition, such things as good documentation could be measured, but this is out | |
797 | of the scope of this document.) | |
798 | ||
799 | \subsection{The impact on performance} | |
800 | \begin{quote} | |
801 | Refactoring certainly will make software go more slowly\footnote{With today's | |
802 | compiler optimization techniques and performance tuning of e.g. the Java | |
803 | virtual machine, the penalties of object creation and method calls are | |
804 | debatable.}, but it also makes the software more amenable to performance | |
805 | tuning.~\cite[p.~69]{refactoring} | |
806 | \end{quote} | |
807 | ||
808 | \noindent There is a common belief that refactoring compromises performance, due | |
809 | to an increased degree of indirection and to polymorphism being slower than | |
810 | conditionals. | |
811 | ||
812 | In a survey, Demeyer\citing{demeyer2002} disproves this view in the case of | |
813 | polymorphism. He ran an experiment on what he calls ``Transform Self Type | |
814 | Checks'' where you introduce a new polymorphic method and a new class hierarchy | |
815 | to get rid of a class's type checking of a ``type attribute''. He uses this kind | |
816 | of transformation to represent other ways of replacing conditionals with | |
817 | polymorphism as well. The experiment is performed on the C++ programming | |
818 | language and with three different compilers and platforms. Demeyer concludes | |
819 | that, with compiler optimization turned on, polymorphism beats medium to large | |
820 | sized if-statements and does as well as case-statements. (In accordance with | |
821 | his hypothesis, due to similarities between the way C++ handles polymorphism and | |
822 | case-statements.) | |
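
The shape of such a transformation can be sketched as follows. Note that
Demeyer's experiment was performed on C++; the Java code below is only an
illustration of the kind of change involved and says nothing about the measured
performance. All class and constant names (\type{MessageBefore},
\type{Message}, \type{TextMessage}, \type{ActionMessage}) are hypothetical.

\begin{minted}{java}
// Before: one class inspects its own "type attribute" with conditionals.
final class MessageBefore {
    static final int TEXT = 0;
    static final int ACTION = 1;

    private final int type;

    MessageBefore(int type) {
        this.type = type;
    }

    String render() {
        if (type == TEXT) {
            return "text";
        } else if (type == ACTION) {
            return "action";
        }
        throw new IllegalStateException("unknown type: " + type);
    }
}

// After: the type attribute is replaced by a class hierarchy, and the
// conditional is replaced by a polymorphic method.
abstract class Message {
    abstract String render();
}

final class TextMessage extends Message {
    @Override
    String render() {
        return "text";
    }
}

final class ActionMessage extends Message {
    @Override
    String render() {
        return "action";
    }
}
\end{minted}

\noindent Each branch of the original conditional ends up as one overriding
method, so the dispatch is done by the runtime instead of by explicit type
checks.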
823 | ||
824 | \begin{quote} | |
825 | The interesting thing about performance is that if you analyze most programs, | |
826 | you find that they waste most of their time in a small fraction of the | |
827 | code.~\cite[p.~70]{refactoring} | |
828 | \end{quote} | |
829 | ||
830 | \noindent So, although an increased number of method calls could potentially | |
831 | slow down programs, one should avoid premature optimization and sacrificing good | |
832 | design, leaving the performance tuning until after profiling\footnote{For an | |
833 | example of a Java profiler, check out VisualVM: | |
834 | \url{http://visualvm.java.net/}} the software and having isolated the actual | |
835 | problem areas. | |
836 | ||
837 | \section{Composite refactorings}\label{compositeRefactorings} | |
838 | \todo{motivation, examples, manual vs automated?, what about refactoring in a | |
839 | very large code base?} | |
840 | Generally, when thinking about refactoring at the mechanical level, there are | |
841 | essentially two kinds of refactorings: the \emph{primitive} | |
842 | refactorings and the \emph{composite} refactorings. | |
843 | ||
844 | \definition{A \emph{primitive refactoring} is a refactoring that cannot be | |
845 | expressed in terms of other refactorings.} | |
846 | ||
847 | \noindent Examples are the \refactoring{Pull Up Field} and \refactoring{Pull Up | |
848 | Method} refactorings\citing{refactoring}, that move members up in their class | |
849 | hierarchies. | |
850 | ||
851 | \definition{A \emph{composite refactoring} is a refactoring that can be | |
852 | expressed in terms of two or more other refactorings.} | |
853 | ||
854 | \noindent An example of a composite refactoring is the \refactoring{Extract | |
855 | Superclass} refactoring\citing{refactoring}. In its simplest form, it is composed | |
856 | of the previously described primitive refactorings, in addition to the | |
857 | \refactoring{Pull Up Constructor Body} refactoring\citing{refactoring}. It works | |
858 | by creating an abstract superclass that the target class(es) inherit from, then | |
859 | by applying \refactoring{Pull Up Field}, \refactoring{Pull Up Method} and | |
860 | \refactoring{Pull Up Constructor Body} on the members that are to be members of | |
861 | the new superclass. For an overview of the \refactoring{Extract Superclass} | |
862 | refactoring, see \myref{fig:extractSuperclass}. | |
863 | ||
864 | \begin{figure}[h] | |
865 | \centering | |
866 | \includegraphics[angle=270,width=\linewidth]{extractSuperclassItalic.pdf} | |
867 | \caption{The Extract Superclass refactoring} | |
868 | \label{fig:extractSuperclass} | |
869 | \end{figure} | |
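
To make the composition concrete, the following Java sketch shows a possible
\emph{result} of applying Extract Superclass to two classes that both had a
name field, a getter for it and a constructor initializing it. The classes
\type{Party}, \type{Employee} and \type{Department} are hypothetical and only
serve as an illustration; the comments indicate which of the constituent
refactorings would have produced each member of the new superclass.

\begin{minted}{java}
import java.util.ArrayList;
import java.util.List;

// The new abstract superclass created by the refactoring.
abstract class Party {
    private final String name;         // result of Pull Up Field

    protected Party(String name) {     // result of Pull Up Constructor Body
        this.name = name;
    }

    String getName() {                 // result of Pull Up Method
        return name;
    }
}

class Employee extends Party {
    private final int annualCost;

    Employee(String name, int annualCost) {
        super(name);                   // the pulled-up constructor body
        this.annualCost = annualCost;
    }

    int getAnnualCost() {
        return annualCost;
    }
}

class Department extends Party {
    private final List<Employee> staff = new ArrayList<>();

    Department(String name) {
        super(name);
    }

    void addStaff(Employee employee) {
        staff.add(employee);
    }

    int getTotalAnnualCost() {
        int total = 0;
        for (Employee employee : staff) {
            total += employee.getAnnualCost();
        }
        return total;
    }
}
\end{minted}

\noindent Members that are specific to one of the classes, such as the staff
list, stay where they were; only the shared members are pulled up.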
870 | ||
871 | \section{Manual vs. automated refactorings} | |
872 | Refactoring is something every programmer does, even if \heshe does not know | |
873 | the term \emph{refactoring}. Every refinement of source code that does not alter | |
874 | the program's behavior is a refactoring. For small refactorings, such as | |
875 | \ExtractMethod, executing it manually is a manageable task, but is still prone | |
876 | to errors. Getting it right the first time is not easy, considering the method | |
877 | signature and all the other aspects of the refactoring that have to be in place. | |
878 | ||
879 | Take for instance the renaming of classes, methods and fields. For complex | |
880 | programs these refactorings are almost impossible to get right by hand. Attacking them | |
881 | with textual search and replace, or even regular expressions, will fall short on | |
882 | these tasks. Then it is crucial to have proper tool support that can perform | |
883 | them automatically: tools that can parse source code and thus have semantic | |
884 | knowledge about which occurrences of which names belong to what construct in the | |
885 | program. Before even trying to perform one of these complex tasks manually, one | |
886 | would have to be very confident in the existing test suite \see{testing}. | |
887 | ||
888 | \section{Correctness of refactorings}\label{correctness} | |
889 | For automated refactorings to be truly useful, they must show a high degree of | |
890 | behavior preservation. This last sentence might seem obvious, but there are | |
891 | examples of refactorings in existing tools that break programs. I will now | |
892 | present an example of an \ExtractMethod refactoring followed by a \MoveMethod | |
893 | refactoring that breaks a program in both the \emph{Eclipse} and \emph{IntelliJ} | |
894 | IDEs\footnote{The NetBeans IDE handles this particular situation without | |
895 | altering the program's behavior, mainly because its Move Method refactoring | |
896 | implementation is a bit rancid in other ways \see{toolSupport}.}. The | |
897 | following piece of code shows the target for the composed refactoring: | |
898 | ||
899 | \begin{minted}[linenos,samepage]{java} | |
900 | public class C { | |
901 |     public X x = new X(); | |
902 | ||
903 |     public void f() { | |
904 |         x.m(this); | |
905 |         x.n(); | |
906 |     } | |
907 | } | |
908 | \end{minted} | |
909 | ||
910 | \noindent The next piece of code shows the destination of the refactoring. Note | |
911 | that the method \method{m(C c)} of class \type{X} assigns to the field \var{x} | |
912 | of the argument \var{c} that has type \type{C}: | |
913 | ||
914 | \begin{minted}[samepage]{java} | |
915 | public class X { | |
916 |     public void m(C c) { | |
917 |         c.x = new X(); | |
918 |     } | |
919 |     public void n() {} | |
920 | } | |
921 | \end{minted} | |
922 | ||
923 | The refactoring sequence works by extracting lines 5 and 6 from the original | |
924 | class \type{C} into a method \method{f} with the statements from those lines as | |
925 | its method body. The method is then moved to the class \type{X}. The result is | |
926 | shown in the following two pieces of code: | |
927 | ||
928 | \begin{minted}[linenos,samepage]{java} | |
929 | public class C { | |
930 |     public X x = new X(); | |
931 | ||
932 |     public void f() { | |
933 |         x.f(this); | |
934 |     } | |
935 | } | |
936 | \end{minted} | |
937 | ||
938 | \begin{minted}[linenos,samepage]{java} | |
939 | public class X { | |
940 |     public void m(C c) { | |
941 |         c.x = new X(); | |
942 |     } | |
943 |     public void n() {} | |
944 |     public void f(C c) { | |
945 |         m(c); | |
946 |         n(); | |
947 |     } | |
948 | } | |
949 | \end{minted} | |
950 | ||
951 | After the refactoring, the method \method{f} of class \type{C} is calling the | |
952 | method \method{f} of class \type{X}, and the program now behaves differently than | |
953 | before. (See line 5 of the version of class \type{C} after the refactoring.) | |
954 | Before the refactoring, the methods \method{m} and \method{n} of class \type{X} | |
955 | are called on different object instances (see lines 5 and 6 of the original class | |
956 | \type{C}). After, they are called on the same object, and the statement on line | |
957 | 3 of class \type{X} (the version after the refactoring) no longer has any | |
958 | effect in our example. | |
959 | ||
960 | The bug introduced in the previous example is of such a nature\footnote{Caused | |
961 | by aliasing. See \url{https://en.wikipedia.org/wiki/Aliasing_(computing)}} | |
962 | that it is very difficult to spot if the refactored code is not covered by | |
963 | tests. It does not generate compilation errors, and will thus only result in | |
964 | a runtime error or corrupted data, which might be hard to detect. | |
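
The difference in behavior can be made observable with a small, self-contained
Java sketch. It is not part of the original example: both versions of
\method{f} are kept side by side under the hypothetical names
\method{fBefore} and \method{fAfter}, and the list \code{receiversOfN} as well
as the class \type{AliasingDemo} are instrumentation added purely for this
demonstration.

\begin{minted}{java}
import java.util.ArrayList;
import java.util.List;

class X {
    // Instrumentation (an assumption of this sketch): records the
    // receiver of every call to n().
    static final List<X> receiversOfN = new ArrayList<>();

    void m(C c) {
        c.x = new X();
    }

    void n() {
        receiversOfN.add(this);
    }

    // The extracted and moved method, as in the refactored class X.
    void f(C c) {
        m(c);
        n(); // implicit "this": the X that c.x referred to BEFORE m(c)
    }
}

class C {
    X x = new X();

    // Body of f() before the refactoring.
    void fBefore() {
        x.m(this);
        x.n(); // x is re-read here, so n() runs on the new X
    }

    // Body of f() after the refactoring.
    void fAfter() {
        x.f(this);
    }
}

class AliasingDemo {
    // True if n() was called on the object that c.x refers to afterwards.
    static boolean nCalledOnCurrentX(boolean refactored) {
        X.receiversOfN.clear();
        C c = new C();
        if (refactored) {
            c.fAfter();
        } else {
            c.fBefore();
        }
        return X.receiversOfN.get(0) == c.x;
    }

    public static void main(String[] args) {
        System.out.println("before: " + nCalledOnCurrentX(false)); // before: true
        System.out.println("after:  " + nCalledOnCurrentX(true));  // after:  false
    }
}
\end{minted}

\noindent Both versions compile and run without exceptions; only the identity
of the object receiving the call to \method{n} differs, which is exactly why
such a change is easy to miss.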
965 | ||
966 | \section{Refactoring and the importance of testing}\label{testing} | |
967 | \begin{quote} | |
968 | If you want to refactor, the essential precondition is having solid | |
969 | tests.\citing{refactoring} | |
970 | \end{quote} | |
971 | ||
972 | When refactoring, there are roughly three classes of errors that can be made. | |
973 | The first class of errors are the ones that make the code unable to compile. | |
974 | These \emph{compile-time} errors are of the nicer kind. They flash up at the | |
975 | moment they are made (at least when using an IDE), and are usually easy to fix. | |
976 | The second class are the \emph{runtime} errors. Although they take a bit longer | |
977 | to surface, they usually manifest after some time in an illegal argument | |
978 | exception, null pointer exception or similar during the program execution. | |
979 | These kinds of errors are a bit harder to handle, but at least they will show, | |
980 | eventually. Then there are the \emph{behavior-changing} errors. These errors are | |
981 | of the worst kind. They do not show up during compilation and they do not turn | |
982 | on a blinking red light during runtime either. The program can seem to work | |
983 | perfectly fine with them in play, but the business logic can be damaged in ways | |
984 | that will only show up over time. | |
985 | ||
986 | For discovering runtime errors and behavior changes when refactoring, it is | |
987 | essential to have good test coverage. Testing in this context means writing | |
988 | automated tests. Manual testing may have its uses, but when refactoring, it is | |
989 | automated unit testing that dominates. For discovering behavior changes it is | |
990 | especially important to have tests that cover potential problems, since these | |
991 | kinds of errors do not reveal themselves. | |
992 | ||
993 | Unit testing is not a way to \emph{prove} that a program is correct, but it is a | |
994 | way to make you confident that it \emph{probably} works as desired. In the | |
995 | context of test driven development (commonly known as TDD), the tests are even a | |
996 | way to define how the program is \emph{supposed} to work. It is then, by | |
997 | definition, working if the tests are passing. | |
998 | ||
999 | If the test coverage for a code base is perfect, then it should, theoretically, | |
1000 | be risk-free to perform refactorings on it. This is why automated tests and | |
1001 | refactoring are such a great match. | |
1002 | ||
1003 | \subsection{Testing the code from the correctness section} | |
1004 | The worst thing that can happen when refactoring is to introduce changes to the | |
1005 | behavior of a program, as in the example on \myref{correctness}. This example | |
1006 | may be trivial, but the essence is clear. The only problem with the example is | |
1007 | that it is not clear how to create automated tests for it, without changing it | |
1008 | in intrusive ways. | |
1009 | ||
1010 | Unit tests, as they are known from the different xUnit frameworks around, are | |
1011 | only suitable to test the \emph{result} of isolated operations. They cannot | |
1012 | easily (if at all) observe the \emph{history} of a program. | |
1013 | ||
1014 | ||
1015 | \todoin{Write \ldots} | |
1016 | ||
1017 | Assuming a sequential (non-concurrent) program: | |
1018 | ||
1019 | \begin{minted}{java} | |
1020 | tracematch (C c, X x) { | |
1021 | sym m before: | |
1022 | call(* X.m(C)) && args(c) && cflow(within(C)); | |
1023 | sym n before: | |
1024 | call(* X.n()) && target(x) && cflow(within(C)); | |
1025 | sym setCx after: | |
1026 | set(C.x) && target(c) && !cflow(m); | |
1027 | ||
1028 | m n | |
1029 | ||
1030 | { assert x == c.x; } | |
1031 | } | |
1032 | \end{minted} | |
1033 | ||
1034 | %\begin{minted}{java} | |
1035 | %tracematch (X x1, X x2) { | |
1036 | % sym m before: | |
1037 | % call(* X.m(C)) && target(x1); | |
1038 | % sym n before: | |
1039 | % call(* X.n()) && target(x2); | |
1040 | % sym setX after: | |
1041 | % set(C.x) && !cflow(m) && !cflow(n); | |
1042 | % | |
1043 | % m n | |
1044 | % | |
1045 | % { assert x1 != x2; } | |
1046 | %} | |
1047 | %\end{minted} | |
1048 | ||
1049 | \section{The project} | |
1050 | The aim of this project will be to investigate the relationship between a | |
1051 | composite refactoring composed of the \ExtractMethod and \MoveMethod | |
1052 | refactorings, and its impact on one or more software metrics. | |
1053 | ||
1054 | The composition of \ExtractMethod and \MoveMethod springs naturally out of the | |
1055 | need to move procedures closer to the data they manipulate. This composed | |
1056 | refactoring is not well described in the literature, but it is implemented in at | |
1057 | least one tool called | |
1058 | \emph{CodeRush}\footnote{\url{https://help.devexpress.com/\#CodeRush/CustomDocument3519}}, | |
1059 | that is an extension for \emph{MS Visual | |
1060 | Studio}\footnote{\url{http://www.visualstudio.com/}}. In CodeRush it is called | |
1061 | \emph{Extract Method to | |
1062 | Type}\footnote{\url{https://help.devexpress.com/\#CodeRush/CustomDocument6710}}, | |
1063 | but I choose to call it \ExtractAndMoveMethod, since I feel it better | |
1064 | communicates which primitive refactorings it is composed of. | |
1065 | ||
1066 | For the metrics, I will at least measure the \emph{Coupling between object | |
1067 | classes} (CBO) metric that is described by Chidamber and Kemerer in their | |
1068 | article \emph{A Metrics Suite for Object Oriented | |
1069 | Design}\citing{metricsSuite1994}. | |
1070 | ||
1071 | The project will then consist of implementing the \ExtractAndMoveMethod | |
1072 | refactoring, as well as executing it over a larger code base. Then the effect of | |
1073 | the change must be measured by calculating the chosen software metrics both | |
1074 | before and after the execution. To be able to execute the refactoring | |
1075 | automatically I have to make it analyze code to determine the best selections to | |
1076 | extract into new methods. | |
1077 | ||
1078 | \section{Software metrics} | |
1079 | \todoin{Is this the appropriate place to have this section?} | |
1080 | ||
1081 | %\part{The project} | |
1082 | %\chapter{Planning the project} | |
1083 | %\part{Conclusion} | |
1084 | %\chapter{Results} | |
1085 | ||
1086 | ||
1087 | ||
1088 | \chapter{\ldots} | |
1089 | \todoin{write} | |
1090 | \section{The problem statement} | |
1091 | \section{Choosing the target language} | |
1092 | Choosing which programming language to use as the target for manipulation is not | |
1093 | a very difficult task. The language has to be an object-oriented programming | |
1094 | language, and it must have existing tool support for refactoring. The | |
1095 | \emph{Java} programming language\footnote{\url{https://www.java.com/}} is the | |
1096 | dominating language when it comes to examples in the literature of refactoring, | |
1097 | and is thus a natural choice. Java is perhaps currently the most influential | |
1098 | programming language in the world, with its \emph{Java Virtual Machine} that | |
1099 | runs on all of the most popular architectures and also supports\footnote{They | |
1100 | compile to Java bytecode.} dozens of other programming languages, with | |
1101 | \emph{Scala}, \emph{Clojure} and \emph{Groovy} as the most prominent ones. Java | |
1102 | is currently the language that every other programming language is compared | |
1103 | against. It is also the primary language of the author of this thesis. | |
1104 | ||
1105 | \section{Choosing the tools} | |
1106 | When choosing a tool for manipulating Java, there are certain criteria that | |
1107 | have to be met. First of all, the tool should have some existing refactoring | |
1108 | support that this thesis can build upon. Secondly it should provide some kind of | |
1109 | framework for parsing and analyzing Java source code. Third, it should itself be | |
1110 | open source. This is both because of the need to be able to browse the code for | |
1111 | the existing refactorings that are contained in the tool, and also because open | |
1112 | source projects hold value in themselves. Another important aspect to consider | |
1113 | is that open source projects of a certain size usually have large communities of | |
1114 | people connected to them, who are committed to answering questions regarding the | |
1115 | use and misuse of the products, which to a large degree are made by the community | |
1116 | itself. | |
1117 | ||
1118 | There is a certain class of tools that meet these criteria, namely the class of | |
1119 | \emph{IDEs}\footnote{\emph{Integrated Development Environment}}. These are | |
1120 | programs that are meant to support the whole production cycle of a computer | |
1121 | program, and the most popular IDEs that support Java generally have quite good | |
1122 | refactoring support. | |
1123 | ||
1124 | The main contenders for this thesis are the \emph{Eclipse IDE}, with the | |
1125 | \emph{Java development tools} (JDT), the \emph{IntelliJ IDEA Community Edition} | |
1126 | and the \emph{NetBeans IDE}. \See{toolSupport} Eclipse and NetBeans are both | |
1127 | free, open source and community driven, while the IntelliJ IDEA has an open | |
1128 | sourced community edition that is free of charge, but also offers an | |
1129 | \emph{Ultimate Edition} with an extended set of features, at additional cost. | |
1130 | All three IDEs support adding plugins to extend their functionality and tools | |
1131 | that can be used to parse and analyze Java source code. But one of the IDEs | |
1132 | stands out as a favorite, and that is the \emph{Eclipse IDE}. This is the most | |
1133 | popular\citing{javaReport2011} among them and seems to be the de facto standard IDE | |
1134 | for Java development regardless of platform. | |
1135 | ||
1136 | ||
1137 | \chapter{Refactorings in Eclipse JDT: Design, Shortcomings and Wishful | |
1138 | Thinking}\label{ch:jdt_refactorings} | |
1139 | ||
1140 | This chapter deals with some of the design behind the refactoring support in | |
1141 | Eclipse, and in the JDT specifically. This is followed by a section about | |
1142 | shortcomings of the refactoring API in terms of composition of refactorings. The | |
1143 | chapter concludes with a section describing some of the ways the | |
1144 | implementation of refactorings in the JDT could have worked to facilitate | |
1145 | composition of refactorings. | |
1146 | ||
1147 | \section{Design} | |
1148 | The refactoring world of Eclipse can in general be separated into two parts: The | |
1149 | language independent part and the part written for a specific programming | |
1150 | language -- the language that is the target of the supported refactorings. | |
1151 | \todo{What about the language specific part?} | |
1152 | ||
1153 | \subsection{The Language Toolkit} | |
1154 | The Language Toolkit, or LTK for short, is the framework that is used to | |
1155 | implement refactorings in Eclipse. It is language independent and provides the | |
1156 | abstractions of a refactoring and the change it generates, in the form of the | |
1157 | classes \typewithref{org.eclipse.ltk.core.refactoring}{Refactoring} and | |
1158 | \typewithref{org.eclipse.ltk.core.refactoring}{Change}. (There are also parts of | |
1159 | the LTK that are concerned with user interaction, but they will not be discussed | |
1160 | here, since they are of little value to us and our use of the framework.) | |
1161 | ||
1162 | \subsubsection{The Refactoring Class} | |
1163 | The abstract class \type{Refactoring} is the core of the LTK framework. Every | |
1164 | refactoring that is going to be supported by the LTK has to end up creating an | |
1165 | instance of one of its subclasses. The main responsibilities of subclasses of | |
1166 | \type{Refactoring} are to implement template methods for condition checking | |
1167 | (\methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkInitialConditions} | |
1168 | and | |
1169 | \methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{checkFinalConditions}), | |
1170 | in addition to the | |
1171 | \methodwithref{org.eclipse.ltk.core.refactoring.Refactoring}{createChange} | |
1172 | method that creates and returns an instance of the \type{Change} class. | |
1173 | ||
1174 | If a refactoring is to allow others to participate in it when it is | |
1175 | executed, the refactoring has to be a processor-based | |
1176 | refactoring\typeref{org.eclipse.ltk.core.refactoring.participants.ProcessorBasedRefactoring}. | |
1177 | It then delegates to its given | |
1178 | \typewithref{org.eclipse.ltk.core.refactoring.participants}{RefactoringProcessor} | |
1179 | for condition checking and change creation. | |
1180 | ||
1181 | \subsubsection{The Change Class} | |
1182 | This class is the base class for objects that are responsible for performing the | |
1183 | actual workspace transformations in a refactoring. The main responsibilities of | |
1184 | its subclasses are to implement the | |
1185 | \methodwithref{org.eclipse.ltk.core.refactoring.Change}{perform} and | |
1186 | \methodwithref{org.eclipse.ltk.core.refactoring.Change}{isValid} methods. The | |
1187 | \method{isValid} method verifies that the change object is valid and thus can be | |
1188 | executed by calling its \method{perform} method. The \method{perform} method | |
1189 | performs the desired change and returns an undo change that can be executed to | |
1190 | reverse the effect of the transformation done by its originating change object. | |
1191 | ||
1192 | \subsubsection{Executing a Refactoring}\label{executing_refactoring} | |
1193 | The life cycle of a refactoring generally follows two steps after creation: | |
1194 | condition checking and change creation. By letting the refactoring object be | |
1195 | handled by a | |
1196 | \typewithref{org.eclipse.ltk.core.refactoring}{CheckConditionsOperation} that | |
1197 | in turn is handled by a | |
1198 | \typewithref{org.eclipse.ltk.core.refactoring}{CreateChangeOperation}, it is | |
1199 | assured that the change creation process is managed in a proper manner. | |
1200 | ||
1201 | The actual execution of a change object has to follow a detailed life cycle. | |
1202 | This life cycle is honored if the \type{CreateChangeOperation} is handled by a | |
1203 | \typewithref{org.eclipse.ltk.core.refactoring}{PerformChangeOperation}. If | |
1204 | an undo manager\typeref{org.eclipse.ltk.core.refactoring.IUndoManager} is also set | |
1205 | for the \type{PerformChangeOperation}, the undo change is added into the undo | |
1206 | history. | |
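
The nesting of operations described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not code taken from the
implementation discussed in this thesis: the class name
\type{RefactoringRunnerSketch} and its method \method{run} are made up for the
example, the code has to run inside an Eclipse runtime with an open workspace,
and the use of \type{ALL\_CONDITIONS} together with
\type{RefactoringStatus.FATAL} as the failure threshold is only one possible
configuration.

\begin{listing}
\begin{minted}[]{java}
import org.eclipse.core.resources.ResourcesPlugin;
import org.eclipse.core.runtime.CoreException;
import org.eclipse.core.runtime.NullProgressMonitor;
import org.eclipse.ltk.core.refactoring.CheckConditionsOperation;
import org.eclipse.ltk.core.refactoring.CreateChangeOperation;
import org.eclipse.ltk.core.refactoring.PerformChangeOperation;
import org.eclipse.ltk.core.refactoring.Refactoring;
import org.eclipse.ltk.core.refactoring.RefactoringCore;
import org.eclipse.ltk.core.refactoring.RefactoringStatus;

// Hypothetical helper class; the name is an assumption.
class RefactoringRunnerSketch {

    // Returns true if the change was actually executed.
    static boolean run(Refactoring refactoring)
            throws CoreException {
        // Step 1: check initial and final conditions.
        CheckConditionsOperation check =
            new CheckConditionsOperation(refactoring,
                CheckConditionsOperation.ALL_CONDITIONS);
        // Step 2: create the change, unless the condition
        // checking ended with a fatal status.
        CreateChangeOperation create =
            new CreateChangeOperation(check,
                RefactoringStatus.FATAL);
        // Step 3: perform the change and register its undo
        // change with the refactoring undo manager.
        PerformChangeOperation perform =
            new PerformChangeOperation(create);
        perform.setUndoManager(
            RefactoringCore.getUndoManager(),
            refactoring.getName());
        ResourcesPlugin.getWorkspace().run(
            perform, new NullProgressMonitor());
        return perform.changeExecuted();
    }
}
\end{minted}
\caption{A sketch of how a refactoring could be executed through the LTK
operations. The helper class is hypothetical.}
\end{listing}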
1207 | ||
1208 | \section{Shortcomings} | |
1209 | This section is introduced naturally with a conclusion: The JDT refactoring | |
1210 | implementation does not facilitate composition of refactorings. | |
1211 | \todo{refine}This section will try to explain why, and also identify other | |
1212 | shortcomings of both the usability and the readability of the JDT refactoring | |
1213 | source code. | |
1214 | ||
1215 | I will begin at the end and work my way toward the composition part of this | |
1216 | section. | |
1217 | ||
1218 | \subsection{Absence of Generics in Eclipse Source Code} | |
1219 | This section concerns not only the JDT refactoring API, but also large | |
1220 | quantities of the Eclipse source code. The code shows a striking absence of the | |
1221 | Java language feature of generics. It is hard to read a class' interface when | |
1222 | methods return objects or take parameters of raw types such as \type{List} or | |
1223 | \type{Map}. This sometimes results in having to read a lot of source code to | |
1224 | understand what is going on, instead of relying on the available interfaces. In | |
1225 | addition, it results in a lot of ugly code, making the use of typecasting more | |
1226 | of a rule than an exception. | |
1227 | ||
1228 | \subsection{Composite Refactorings Will Not Appear as Atomic Actions} | |
1229 | ||
1230 | \subsubsection{Missing Flexibility from JDT Refactorings} | |
1231 | The JDT refactorings are not made with composition of refactorings in mind. When | |
1232 | a JDT refactoring is executed, it assumes that all conditions for it to be | |
1233 | applied successfully can be found by reading source files that have been | |
1234 | persisted to disk. They can only operate on the actual source material, and not | |
1235 | (in-memory) copies thereof. This constitutes a major disadvantage when trying to | |
1236 | compose refactorings, since if an exception occurs in the middle of a sequence of | |
1237 | refactorings, it can leave the project in a state where the composite | |
1238 | refactoring was executed only partly. It makes it hard to discard the changes | |
1239 | done without monitoring and consulting the undo manager, an approach that is not | |
1240 | bulletproof. | |
1241 | ||
1242 | \subsubsection{Broken Undo History} | |
1243 | When designing a composed refactoring that is to be performed as a sequence of | |
1244 | refactorings, you would like it to appear as a single change to the workspace. | |
1245 | This implies that you would also like to be able to undo all the changes done by | |
1246 | the refactoring in a single step. This is not the way it appears when a sequence | |
1247 | of JDT refactorings is executed. It leaves the undo history filled up with | |
1248 | individual undo actions corresponding to every single JDT refactoring in the | |
1249 | sequence. This problem is not trivial to handle in Eclipse. | |
1250 | \See{hacking_undo_history} | |
1251 | ||
1252 | \section{Wishful Thinking} | |
1253 | \todoin{???} | |
1254 | ||
1255 | \chapter{Composite Refactorings in Eclipse} | |
1256 | ||
1257 | \section{A Simple Ad Hoc Model} | |
1258 | As pointed out in \myref{ch:jdt_refactorings}, the Eclipse JDT refactoring model | |
1259 | is not very well suited for making composite refactorings. Therefore a simple | |
1260 | model using changer objects (of type \type{RefaktorChanger}) is used as an | |
1261 | abstraction layer on top of the existing Eclipse refactorings, instead of | |
1262 | extending the \typewithref{org.eclipse.ltk.core.refactoring}{Refactoring} class. | |
1263 | ||
1264 | The use of an additional abstraction layer is a deliberate choice. It is due to | |
1265 | the problem of creating a composite | |
1266 | \typewithref{org.eclipse.ltk.core.refactoring}{Change} that can handle text | |
1267 | changes that interfere with each other. Thus, a \type{RefaktorChanger} may, or | |
1268 | may not, take advantage of one or more existing refactorings, but it is always | |
1269 | intended to make a change to the workspace. | |
1270 | ||
1271 | \subsection{A typical \type{RefaktorChanger}} | |
1272 | The typical refaktor changer class has two responsibilities: checking | |
1273 | preconditions and executing the requested changes. This is not too different | |
1274 | from the responsibilities of an LTK refactoring, with the distinction that a | |
1275 | refaktor changer also executes the change, while an LTK refactoring is only | |
1276 | responsible for creating the object that can later be used to do the job. | |
1277 | ||
1278 | Checking of preconditions is typically done by an | |
1279 | \typewithref{no.uio.ifi.refaktor.analyze.analyzers}{Analyzer}. If the | |
1280 | preconditions validate, the upcoming changes are executed by an | |
1281 | \typewithref{no.uio.ifi.refaktor.change.executors}{Executor}. | |
1282 | ||
1283 | \section{The Extract and Move Method Refactoring} | |
1284 | %The Extract and Move Method Refactoring is implemented mainly using these | |
1285 | %classes: | |
1286 | %\begin{itemize} | |
1287 | % \item \type{ExtractAndMoveMethodChanger} | |
1288 | % \item \type{ExtractAndMoveMethodPrefixesExtractor} | |
1289 | % \item \type{Prefix} | |
1290 | % \item \type{PrefixSet} | |
1291 | %\end{itemize} | |
1292 | ||
1293 | \subsection{The Building Blocks} | |
1294 | This is a composite refactoring, and hence is built up using several primitive | |
1295 | refactorings. These basic building blocks are, as its name implies, the | |
1296 | \ExtractMethod refactoring\citing{refactoring} and the \MoveMethod | |
1297 | refactoring\citing{refactoring}. In Eclipse, the implementations of these | |
1298 | refactorings are found in the classes | |
1299 | \typewithref{org.eclipse.jdt.internal.corext.refactoring.code}{ExtractMethodRefactoring} | |
1300 | and | |
1301 | \typewithref{org.eclipse.jdt.internal.corext.refactoring.structure}{MoveInstanceMethodProcessor}, | |
1302 | where the last class is designed to be used together with the processor-based | |
1303 | \typewithref{org.eclipse.ltk.core.refactoring.participants}{MoveRefactoring}. | |
1304 | ||
1305 | \subsubsection{The ExtractMethodRefactoring Class} | |
1306 | This class is quite simple in its use. The only parameters it requires for | |
1307 | construction are a compilation | |
1308 | unit\typeref{org.eclipse.jdt.core.ICompilationUnit}, the offset into the source | |
1309 | code where the extraction shall start, and the length of the source to be | |
1310 | extracted. Then you have to set the method name for the new method together with | |
1311 | its visibility and some less interesting parameters. | |
1312 | ||
1313 | \subsubsection{The MoveInstanceMethodProcessor Class} | |
1314 | For the Move Method, the processor requires slightly more advanced input than | |
1315 | the class for the Extract Method. For construction it requires a method | |
1316 | handle\typeref{org.eclipse.jdt.core.IMethod} for the method that is to be moved. | |
1317 | Then the target for the move has to be supplied as the variable binding from a | |
1318 | chosen variable declaration. In addition to this, one has to set some | |
1319 | parameters regarding setters/getters, as well as delegation. | |
1320 | ||
1321 | To make a working refactoring from the processor, one has to create a | |
1322 | \type{MoveRefactoring} with it. | |
1323 | ||
1324 | \subsection{The ExtractAndMoveMethodChanger Class} | |
1325 | ||
1326 | The \typewithref{no.uio.ifi.refaktor.changers}{ExtractAndMoveMethodChanger} | |
1327 | class is a subclass of the class | |
1328 | \typewithref{no.uio.ifi.refaktor.changers}{RefaktorChanger}. It is responsible | |
1329 | for analyzing and finding the best target for, and also executing, a composition | |
1330 | of the Extract Method and Move Method refactorings. This particular changer is | |
1331 | the one of my changers that is closest to being a true LTK refactoring. It can | |
1332 | be reworked to be one if the problems with overlapping changes are resolved. The | |
1333 | changer requires a text selection and the name of the new method, or else a | |
1334 | method name will be generated. The selection has to be of the type | |
1335 | \typewithref{no.uio.ifi.refaktor.utils}{CompilationUnitTextSelection}. This | |
1336 | class is a custom extension of | |
1337 | \typewithref{org.eclipse.jface.text}{TextSelection}, which in addition to the | |
1338 | basic offset, length and similar methods, also carries an instance of the | |
1339 | underlying compilation unit handle for the selection. | |
1340 | ||
1341 | \subsubsection{The \type{ExtractAndMoveMethodAnalyzer}} | |
1342 | The analysis and precondition checking is done by the | |
1343 | \typewithref{no.uio.ifi.refaktor.analyze.analyzers}{ExtractAnd\-MoveMethodAnalyzer}. | |
1344 | First it checks whether the selection is a valid selection or not, with respect | |
1345 | to statement boundaries and whether it actually selects anything. Then it | |
1346 | checks the legality of both extracting the selection and also moving it to | |
1347 | another class. If the selection is approved as legal, it is analyzed to find the | |
1348 | presumably best target to move the extracted method to. | |
1349 | ||
1350 | To find the most suitable target, the analyzer uses a | |
1351 | \typewithref{no.uio.ifi.refaktor.analyze.collectors}{PrefixesCollector} that | |
1352 | collects all the possible candidates for the refactoring. All the non-candidates | |
1353 | are found by an | |
1354 | \typewithref{no.uio.ifi.refaktor.analyze.collectors}{UnfixesCollector} that | |
1355 | collects all the targets that will give some kind of error if used. All prefixes | |
1356 | (and unfixes) are represented by a | |
1357 | \typewithref{no.uio.ifi.refaktor.extractors}{Prefix}, and they are collected | |
1358 | into sets of prefixes. The safe prefixes are found by subtracting from the set of | |
1359 | candidate prefixes the prefixes that enclose any of the unfixes. A prefix | |
1360 | encloses an unfix if the unfix is in the set of its sub-prefixes. As an | |
1361 | example, \texttt{``a.b''} encloses \texttt{``a''}, as does \texttt{``a''}. The | |
1362 | safe prefixes are unified in a \type{PrefixSet}. If a prefix has only one | |
1363 | occurrence, and is a simple expression, it is considered unsuitable as a move | |
1364 | target. This occurs in statements such as \texttt{``a.foo()''}. For such | |
1365 | statements it makes no sense to extract and move them. It only generates an | |
1366 | extra method and the calling of it. | |
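
The subtraction of enclosing prefixes can be sketched as follows. This is a
minimal, self-contained illustration and not the actual implementation:
prefixes are modelled as plain dotted strings instead of \type{Prefix} objects
wrapping AST expressions, and the names \type{SafePrefixesSketch},
\method{subPrefixes}, \method{encloses} and \method{safePrefixes} are made up
for the example.

\begin{listing}
\begin{minted}[]{java}
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a prefix is a dotted string like "a.b".
class SafePrefixesSketch {

    // The sub-prefixes of "a.b.c" are "a", "a.b" and "a.b.c".
    static List<String> subPrefixes(String prefix) {
        List<String> result = new ArrayList<>();
        int dot = prefix.indexOf('.');
        while (dot != -1) {
            result.add(prefix.substring(0, dot));
            dot = prefix.indexOf('.', dot + 1);
        }
        result.add(prefix);
        return result;
    }

    // A prefix encloses an unfix if the unfix is among the
    // sub-prefixes of the prefix.
    static boolean encloses(String prefix, String unfix) {
        return subPrefixes(prefix).contains(unfix);
    }

    // Keeps only the candidates that enclose none of the unfixes.
    static Set<String> safePrefixes(Set<String> candidates,
                                    Set<String> unfixes) {
        Set<String> safe = new LinkedHashSet<>();
        for (String candidate : candidates) {
            boolean enclosesUnfix = false;
            for (String unfix : unfixes) {
                if (encloses(candidate, unfix)) {
                    enclosesUnfix = true;
                    break;
                }
            }
            if (!enclosesUnfix) {
                safe.add(candidate);
            }
        }
        return safe;
    }
}
\end{minted}
\caption{A sketch of how safe prefixes could be computed, with prefixes
simplified to strings. All names are hypothetical.}
\end{listing}

With the candidates \texttt{a}, \texttt{a.b}, \texttt{c} and \texttt{c.d} and
the single unfix \texttt{a}, this sketch keeps only \texttt{c} and
\texttt{c.d}, since both \texttt{a} and \texttt{a.b} enclose the unfix.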
1367 | ||
1368 | \todoin{Clean up sections/subsections.} | |
1369 | ||
1370 | \subsubsection{The \type{ExtractAndMoveMethodExecutor}} | |
1371 | If the analysis finds a possible target for the composite refactoring, it is | |
1372 | executed by an | |
1373 | \typewithref{no.uio.ifi.refaktor.change.executors}{ExtractAndMoveMethodExecutor}. | |
1374 | It is composed of the two executors known as | |
1375 | \typewithref{no.uio.ifi.refaktor.change.executors}{ExtractMethodRefactoringExecutor} | |
1376 | and | |
1377 | \typewithref{no.uio.ifi.refaktor.change.executors}{MoveMethodRefactoringExecutor}. | |
1378 | The \type{ExtractAndMoveMethodExecutor} is responsible for gluing the two | |
1379 | together by feeding the \type{MoveMethod\-RefactoringExecutor} with the | |
1380 | resources needed after executing the extract method refactoring. | |
1381 | \See{postExtractExecution} | |
1382 | ||
1383 | \subsubsection{The \type{ExtractMethodRefactoringExecutor}} | |
1384 | This executor is responsible for creating and executing an instance of the | |
1385 | \type{ExtractMethodRefactoring} class. It is also responsible for collecting | |
1386 | some post execution resources that can be used to find the method handle for the | |
1387 | extracted method, as well as information about its parameters, including the | |
1388 | variable they originated from. | |
1389 | ||
1390 | \subsubsection{The \type{MoveMethodRefactoringExecutor}} | |
1391 | This executor is responsible for creating and executing an instance of the | |
1392 | \type{MoveRefactoring}. The move refactoring is a processor-based refactoring, | |
1393 | and for the Move Method refactoring it is the \type{MoveInstanceMethodProcessor} | |
1394 | that is used. | |
1395 | ||
1396 | The handle for the method to be moved is found on the basis of the information | |
1397 | gathered after the execution of the Extract Method refactoring. The only | |
1398 | information the \type{ExtractMethodRefactoring} is sharing after its execution, | |
1399 | regarding finding the method handle, is the textual representation of the new | |
1400 | method signature. Therefore it must be parsed, the strings for types of the | |
1401 | parameters must be found and translated to a form that can be used to look up | |
1402 | the method handle from its type handle. They have to be in the unresolved | |
1403 | form.\todo{Elaborate?} The name for the type is found from the original | |
1404 | selection, since an extracted method must end up in the same type as the | |
1405 | originating method. | |
1406 | ||
1407 | When analyzing a selection prior to performing the Extract Method refactoring, a | |
1408 | target is chosen. It has to be a variable binding, so it is either a field or a | |
1409 | local variable/parameter. If the target is a field, it can be used with the | |
1410 | \type{MoveInstanceMethodProcessor} as it is, since the extracted method still is | |
1411 | in its scope. But if the target is local to the originating method, the target | |
1412 | that is to be used for the processor must be among its parameters. Thus the | |
1413 | target must be found among the extracted method's parameters. This is done by | |
1414 | finding the parameter information object that corresponds to the parameter that | |
1415 | was declared on the basis of the original target's variable when the method was | |
1416 | extracted. (The extracted method must take one such parameter for each local | |
1417 | variable that is declared outside the selection that is extracted.) To match the | |
1418 | original target with the correct parameter information object, the key for the | |
1419 | information object is compared to the key from the original target's binding. | |
1420 | The source code must then be parsed to find the method declaration for the | |
1421 | extracted method. The new target must be found by searching through the | |
1422 | parameters of the declaration and choosing the one that has the same type as the | |
1423 | old binding from the parameter information object, as well as the same name that | |
1424 | is provided by the parameter information object. | |
1425 | ||
1426 | ||
1427 | \subsection{Finding the IMethod}\label{postExtractExecution} | |
1428 | \todoin{Rename section. Write.} | |
1429 | ||
1430 | \subsection{Property collectors} | |
1431 | The prefixes and unfixes are found by property | |
1432 | collectors\typeref{no.uio.ifi.refaktor.extractors.collectors.PropertyCollector}. | |
1433 | A property collector follows the visitor pattern\citing{designPatterns} and is | |
1434 | of the \typewithref{org.eclipse.jdt.core.dom}{ASTVisitor} type. An | |
1435 | \type{ASTVisitor} visits nodes in an abstract syntax tree that forms the Java | |
1436 | document object model. The tree consists of nodes of type | |
1437 | \typewithref{org.eclipse.jdt.core.dom}{ASTNode}. | |
1438 | ||
1439 | \subsubsection{The PrefixesCollector} | |
1440 | The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{PrefixesCollector} | |
1441 | finds prefixes that make up the basis for calculating move targets for the | |
1442 | Extract and Move Method refactoring. It visits expression | |
1443 | statements\typeref{org.eclipse.jdt.core.dom.ExpressionStatement} and creates | |
1444 | prefixes from their expressions in the case of method invocations. The prefixes | |
1445 | found are registered with a prefix set, together with all their sub-prefixes. | |
1446 | \todo{Rewrite in the case of changes to the way prefixes are found} | |
1447 | ||
1448 | \subsubsection{The UnfixesCollector}\label{unfixes} | |
1449 | The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{UnfixesCollector} | |
1450 | finds unfixes within a selection. That is, prefixes that cannot be used as a | |
1451 | basis for finding a move target in a refactoring. | |
1452 | ||
1453 | An unfix can be a name that is assigned to within a selection. The reason that | |
1454 | this cannot be allowed is that the result would be an assignment to the | |
1455 | \type{this} keyword, which is not valid in Java \see{eclipse_bug_420726}. | |
1456 | ||
1457 | Prefixes that originate from variable declarations within the same selection | |
1458 | are also considered unfixes. This is because when a method is moved, it needs to | |
1459 | be called through a variable. If this variable is also within the method that is | |
1460 | to be moved, this obviously cannot be done. | |
1461 | ||
1462 | Also considered as unfixes are variable references that are of types that are not | |
1463 | suitable for moving a method to. This can be either because it is not | |
1464 | physically possible to move the method to the desired class or that it will | |
1465 | cause compilation errors by doing so. | |
1466 | ||
1467 | If the type binding for a name is not resolved it is considered an unfix. The | |
1468 | same applies to types that are only found in compiled code, so they have no | |
1469 | underlying source that is accessible to us. (E.g. the \type{java.lang.String} | |
1470 | class.) | |
1471 | ||
1472 | Interface types are not suitable as targets. This is simply because interfaces | |
1473 | in Java cannot contain methods with bodies. (This thesis does not deal with | |
1474 | features of Java versions later than Java 7. Java 8 has interfaces with default | |
1475 | implementations of methods.) Neither are local types allowed. This applies to | |
1476 | both local and anonymous classes. Anonymous classes are effectively the same as | |
1477 | interface types with respect to unfixes. Local classes could in theory be used | |
1478 | as targets, but this is not possible due to limitations of the implementation of | |
1479 | the Extract and Move Method refactoring. The problem is that the refactoring is | |
1480 | done in two steps, so the intermediate state between the two refactorings would | |
1481 | not be legal Java code. In the case of local classes, the problem is that, in | |
1482 | the intermediate step, a selection referencing a local class would need to take | |
1483 | the local class as a parameter if it were to be extracted to a new method. This | |
1484 | new method would need to live in the scope of the declaring class of the | |
1485 | originating method. The local class would then not be in the scope of the | |
1486 | extracted method, thus bringing the source code into an illegal state. One could | |
1487 | imagine that the method was extracted and moved in one operation, without an | |
1488 | intermediate state. Then it would make sense to include variables with types of | |
1489 | local classes in the set of legal targets, since the local classes would then be | |
1490 | in the scopes of the method calls. Whether this would make any difference for software | |
1491 | metrics that measure coupling is a different discussion. | |
1492 | ||
1493 | \begin{listing} | |
1494 | \begin{multicols}{2} | |
1495 | \begin{minted}[]{java} | |
1496 | // Before | |
1497 | void declaresLocalClass() { | |
1498 | class LocalClass { | |
1499 | void foo() {} | |
1500 | void bar() {} | |
1501 | } | |
1502 | ||
1503 | LocalClass inst = | |
1504 | new LocalClass(); | |
1505 | inst.foo(); | |
1506 | inst.bar(); | |
1507 | } | |
1508 | \end{minted} | |
1509 | ||
1510 | \columnbreak | |
1511 | ||
1512 | \begin{minted}[]{java} | |
1513 | // After Extract Method | |
1514 | void declaresLocalClass() { | |
1515 | class LocalClass { | |
1516 | void foo() {} | |
1517 | void bar() {} | |
1518 | } | |
1519 | ||
1520 | LocalClass inst = | |
1521 | new LocalClass(); | |
1522 | fooBar(inst); | |
1523 | } | |
1524 | ||
1525 | // Intermediate step | |
1526 | void fooBar(LocalClass inst) { | |
1527 | inst.foo(); | |
1528 | inst.bar(); | |
1529 | } | |
1530 | \end{minted} | |
1531 | \end{multicols} | |
1532 | \caption{When Extract and Move Method tries to use a variable with a local type | |
1533 | as the move target, an intermediate step is taken that is not allowed. Here: | |
1534 | \type{LocalClass} is not in the scope of \method{fooBar} in its intermediate | |
1535 | location.} | |
1536 | \label{lst:extractMethod_LocalClass} | |
1537 | \end{listing} | |
1538 | ||
1539 | The last class of names considered unfixes consists of names used in null tests. | |
1540 | These are tests that read like this: if \texttt{<name>} equals \var{null} then | |
1541 | do something. If we allowed variables used in those kinds of expressions as | |
1542 | targets for moving methods, we would end up with code containing boolean | |
1543 | expressions like \texttt{this == null}, which would not be meaningful, since | |
1544 | \var{this} would never be \var{null}. | |
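
The effect of using such a variable as the move target can be illustrated with
a small example. The classes \type{Origin} and \type{Target} and their methods
are hypothetical and only serve as an illustration; the method
\method{movedSelection} shows, written by hand, what the selection would look
like if it were moved to the class of the tested variable.

\begin{listing}
\begin{minted}[]{java}
// Hypothetical class that would receive the moved method.
class Target {
    int calls = 0;

    void foo() {
        calls++;
    }

    // The selection as it would look after a move with
    // `target` as the move target: `target` has become `this`.
    void movedSelection() {
        if (this != null) { // always true, so the test is useless
            foo();
        }
    }
}

class Origin {
    // The if statement is the selection to extract and move.
    void original(Target target) {
        if (target != null) {
            target.foo();
        }
    }
}
\end{minted}
\caption{A hypothetical illustration of a null test that loses its meaning when
the tested variable is used as the move target.}
\end{listing}

In this illustration the null test no longer protects anything after the move:
a call such as \texttt{target.movedSelection()} would throw a
\type{NullPointerException} if \texttt{target} were \texttt{null}, whereas the
original method would simply do nothing.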
1545 | ||
1546 | \subsection{The Prefix Class} | |
1547 | This class exists mainly for holding data about a prefix, such as the expression | |
1548 | that the prefix represents and the occurrence count of the prefix within a | |
1549 | selection. In addition to this, it has some functionality such as calculating | |
1550 | its sub-prefixes and intersecting it with another prefix. The definition of the | |
1551 | intersection between two prefixes is a prefix representing the longest common | |
1552 | expression between the two. | |
1553 | ||
1554 | \subsection{The PrefixSet Class} | |
1555 | A prefix set holds elements of type \type{Prefix}. It is implemented with the | |
1556 | help of a \typewithref{java.util}{HashMap} and contains some typical set | |
1557 | operations, but it does not implement the \typewithref{java.util}{Set} | |
1558 | interface, since the prefix set does not need all of the functionality a | |
1559 | \type{Set} requires to be implemented. In addition, it needs some other | |
1560 | functionality not found in the \type{Set} interface. So due to the relatively | |
1561 | limited use of prefix sets, and that it almost always needs to be referenced as | |
1562 | such, and not a \type{Set<Prefix>}, it remains as an ad hoc solution to a | |
1563 | concrete problem. | |
1564 | ||
1565 | There are two ways of adding prefixes to a \type{PrefixSet}. The first is through | |
1566 | its \method{add} method. This works like one would expect from a set. It adds | |
1567 | the prefix to the set if it does not already contain the prefix. The other way | |
1568 | is to \emph{register} the prefix with the set. When registering a prefix, if the | |
1569 | set does not contain the prefix, it is just added. If the set contains the | |
1570 | prefix, its count gets incremented. This is how the occurrence count is handled. | |
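
The difference between adding and registering can be sketched as follows. This
is a simplified illustration and not the actual \type{PrefixSet}: prefixes are
plain strings, the occurrence count is kept directly in the map instead of in
\type{Prefix} objects, a newly added prefix is assumed to start with a count of
one, and the names \type{PrefixSetSketch}, \method{count} and \method{size} are
made up for the example.

\begin{listing}
\begin{minted}[]{java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a counting set backed by a HashMap.
class PrefixSetSketch {

    // Maps each prefix to its occurrence count.
    private final Map<String, Integer> counts = new HashMap<>();

    // Set semantics: only adds the prefix if it is not present.
    void add(String prefix) {
        if (!counts.containsKey(prefix)) {
            counts.put(prefix, 1); // assumed initial count
        }
    }

    // Adds a new prefix, or increments the count of a known one.
    void register(String prefix) {
        Integer count = counts.get(prefix);
        counts.put(prefix, count == null ? 1 : count + 1);
    }

    // The occurrence count, or 0 if the prefix is not in the set.
    int count(String prefix) {
        Integer count = counts.get(prefix);
        return count == null ? 0 : count;
    }

    int size() {
        return counts.size();
    }
}
\end{minted}
\caption{A sketch of the add and register operations of a prefix set, with
prefixes simplified to strings. All names are hypothetical.}
\end{listing}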
1571 | ||
1572 | The prefix set also computes the set of prefixes that do not enclose any | |
1573 | prefixes of another set. This is a kind of set difference operation, only for | |
1574 | enclosing prefixes. | |
1575 | ||
1576 | \subsection{Hacking the Refactoring Undo | |
1577 | History}\label{hacking_undo_history} | |
1578 | \todoin{Where to put this section?} | |
1579 | ||
1580 | As an attempt to make multiple subsequent changes to the workspace appear as a | |
1581 | single action (i.e. make the undo changes appear as such), I tried to alter | |
1582 | the undo changes\typeref{org.eclipse.ltk.core.refactoring.Change} in the history | |
1583 | of the refactorings. | |
1584 | ||
1585 | My first impulse was to remove the, in this case, last two undo changes from the | |
1586 | undo manager\typeref{org.eclipse.ltk.core.refactoring.IUndoManager} for the | |
1587 | Eclipse refactorings, and then add them to a composite | |
1588 | change\typeref{org.eclipse.ltk.core.refactoring.CompositeChange} that could be | |
1589 | added back to the manager. The interface of the undo manager does not offer a | |
1590 | way to remove/pop the last added undo change, so a possible solution could be to | |
1591 | decorate\citing{designPatterns} the undo manager, to intercept and collect the | |
1592 | undo changes before delegating to the \method{addUndo} | |
1593 | method\methodref{org.eclipse.ltk.core.refactoring.IUndoManager}{addUndo} of the | |
1594 | manager. Instead of giving it the intended undo change, a null change could be | |
1595 | given to prevent it from making any changes if run. Then one could let the | |
1596 | collected undo changes form a composite change to be added to the manager. | |
1597 | ||
1598 | There is a technical challenge with this approach, and it relates to the undo | |
1599 | manager, and the concrete implementation | |
1600 | UndoManager2\typeref{org.eclipse.ltk.internal.core.refactoring.UndoManager2}. | |
1601 | This implementation is designed in a way that it is not possible to just add an | |
1602 | undo change, you have to do it in the context of an active | |
1603 | operation\typeref{org.eclipse.core.commands.operations.TriggeredOperations}. | |
1604 | One could imagine that it might be possible to trick the undo manager into | |
1605 | believing that you are doing a real change, by executing a refactoring that is | |
1606 | returning a kind of null change that is returning our composite change of undo | |
1607 | refactorings when it is performed. | |
1608 | ||
1609 | Apart from the technical problems with this solution, there is a functional | |
1610 | problem: If it all had worked out as planned, this would leave the undo history | |
1611 | in a dirty state, with multiple empty undo operations corresponding to each of | |
1612 | the sequentially executed refactoring operations, followed by a composite undo | |
1613 | change corresponding to an empty change of the workspace for rounding off our | |
1614 | composite refactoring. The solution to this particular problem could be to | |
1615 | intercept the registration of the intermediate changes in the undo manager, and | |
1616 | only register the last empty change. | |
1617 | ||
1618 | Unfortunately, not everything works as desired with this solution. The grouping | |
1619 | of the undo changes into the composite change does not make the undo operation | |
1620 | appear as an atomic operation. The undo operation is still split up into | |
1621 | separate undo actions, each corresponding to the change done by its originating | |
1622 | refactoring. In addition, the undo actions have to be performed separately in | |
1623 | all the editors involved. This makes it no solution at all, but a step toward | |
1624 | something worse. | |
1625 | ||
1626 | There might be a solution to this problem, but it remains to be found. The | |
1627 | design of the refactoring undo management is partly to blame for this, as it | |
1628 | is too complex to be easily manipulated. | |
1629 | ||
1630 | ||
1631 | ||
1632 | ||
1633 | \chapter{Analyzing Source Code in Eclipse} | |
1634 | ||
1635 | \section{The Java model} | |
1636 | The Java model of Eclipse is its internal representation of a Java project. It | |
1637 | is light-weight, and has only limited possibilities for manipulating source | |
1638 | code. It is typically used as a basis for the Package Explorer in Eclipse. | |
1639 | ||
1640 | The elements of the Java model are only handles to the underlying elements. This | |
1641 | means that the underlying element of a handle does not need to actually exist. | |
1642 | Hence the user of a handle must always check that it exists by calling the | |
1643 | \method{exists} method of the handle. | |
1644 | ||
1645 | The handles, with descriptions, are listed in \myref{tab:javaModelTable}. | |
1646 | ||
1647 | \begin{table}[h] | |
1648 | \centering | |
1649 | ||
1650 | \newcolumntype{L}[1]{>{\hsize=#1\hsize\raggedright\arraybackslash}X}% | |
1651 | % sum must equal number of columns (3) | |
1652 | \begin{tabularx}{\textwidth}{| L{0.7} | L{1.1} | L{1.2} |} | |
1653 | \hline | |
1654 | \textbf{Project Element} & \textbf{Java Model element} & | |
1655 | \textbf{Description} \\ | |
1656 | \hline | |
1657 | Java project & \type{IJavaProject} & The Java project which contains all other objects. \\ | |
1658 | \hline | |
1659 | Source folder /\linebreak[2] binary folder /\linebreak[3] external library & | |
1660 | \type{IPackageFragmentRoot} & Holds source or binary files; can be a folder | |
1661 | or a library (zip / jar file). \\ | |
1662 | \hline | |
1663 | Each package & \type{IPackageFragment} & Each package is below the | |
1664 | \type{IPackageFragmentRoot}. Sub-packages are not leaves of the package; | |
1665 | they are listed directly under \type{IPackageFragmentRoot}. \\ | |
1666 | \hline | |
1667 | Java Source file & \type{ICompilationUnit} & The Source file is always below | |
1668 | the package node. \\ | |
1669 | \hline | |
1670 | Types /\linebreak[2] Fields /\linebreak[3] Methods & \type{IType} / | |
1671 | \linebreak[0] | |
1672 | \type{IField} /\linebreak[3] \type{IMethod} & Types, fields and methods. \\ | |
1673 | \hline | |
1674 | \end{tabularx} | |
1675 | \caption{The elements of the Java Model. {\footnotesize Taken from | |
1676 | \url{http://www.vogella.com/tutorials/EclipseJDT/article.html}}} | |
1677 | \label{tab:javaModelTable} | |
1678 | \end{table} | |
1679 | ||
1680 | The hierarchy of the Java Model is shown in \myref{fig:javaModel}. | |
1681 | ||
1682 | \begin{figure}[h] | |
1683 | \centering | |
1684 | \begin{tikzpicture}[% | |
1685 | grow via three points={one child at (0,-0.7) and | |
1686 | two children at (0,-0.7) and (0,-1.4)}, | |
1687 | edge from parent path={(\tikzparentnode.south west)+(0.5,0) |- | |
1688 | (\tikzchildnode.west)}] | |
1689 | \tikzstyle{every node}=[draw=black,thick,anchor=west] | |
1690 | \tikzstyle{selected}=[draw=red,fill=red!30] | |
1691 | \tikzstyle{optional}=[dashed,fill=gray!50] | |
1692 | \node {\type{IJavaProject}} | |
1693 | child { node {\type{IPackageFragmentRoot}} | |
1694 | child { node {\type{IPackageFragment}} | |
1695 | child { node {\type{ICompilationUnit}} | |
1696 | child { node {\type{IType}} | |
1697 | child { node {\type{\{ IType \}*}} | |
1698 | child { node {\type{\ldots}}} | |
1699 | } | |
1700 | child [missing] {} | |
1701 | child { node {\type{\{ IField \}*}}} | |
1702 | child { node {\type{IMethod}} | |
1703 | child { node {\type{\{ IType \}*}} | |
1704 | child { node {\type{\ldots}}} | |
1705 | } | |
1706 | } | |
1707 | child [missing] {} | |
1708 | child [missing] {} | |
1709 | child { node {\type{\{ IMethod \}*}}} | |
1710 | } | |
1711 | child [missing] {} | |
1712 | child [missing] {} | |
1713 | child [missing] {} | |
1714 | child [missing] {} | |
1715 | child [missing] {} | |
1716 | child [missing] {} | |
1717 | child [missing] {} | |
1718 | child { node {\type{\{ IType \}*}}} | |
1719 | } | |
1720 | child [missing] {} | |
1721 | child [missing] {} | |
1722 | child [missing] {} | |
1723 | child [missing] {} | |
1724 | child [missing] {} | |
1725 | child [missing] {} | |
1726 | child [missing] {} | |
1727 | child [missing] {} | |
1728 | child [missing] {} | |
1729 | child { node {\type{\{ ICompilationUnit \}*}}} | |
1730 | } | |
1731 | child [missing] {} | |
1732 | child [missing] {} | |
1733 | child [missing] {} | |
1734 | child [missing] {} | |
1735 | child [missing] {} | |
1736 | child [missing] {} | |
1737 | child [missing] {} | |
1738 | child [missing] {} | |
1739 | child [missing] {} | |
1740 | child [missing] {} | |
1741 | child [missing] {} | |
1742 | child { node {\type{\{ IPackageFragment \}*}}} | |
1743 | } | |
1744 | child [missing] {} | |
1745 | child [missing] {} | |
1746 | child [missing] {} | |
1747 | child [missing] {} | |
1748 | child [missing] {} | |
1749 | child [missing] {} | |
1750 | child [missing] {} | |
1751 | child [missing] {} | |
1752 | child [missing] {} | |
1753 | child [missing] {} | |
1754 | child [missing] {} | |
1755 | child [missing] {} | |
1756 | child [missing] {} | |
1757 | child { node {\type{\{ IPackageFragmentRoot \}*}}} | |
1758 | ; | |
1759 | \end{tikzpicture} | |
1760 | \caption{The Java model of Eclipse. ``\type{\{ SomeElement \}*}'' means | |
1761 | \type{SomeElement} zero or more times. For recursive structures, | |
1762 | ``\type{\ldots}'' is used.} | |
1763 | \label{fig:javaModel} | |
1764 | \end{figure} | |
1765 | ||
1766 | \section{The Abstract Syntax Tree} | |
1767 | Eclipse follows the common paradigm of using an abstract syntax tree for | |
1768 | source code analysis and manipulation. | |
1769 | ||
1770 | When parsing program source code into something that can be used as a foundation | |
1771 | for analysis, the start of the process follows the same steps as in a compiler. | |
1772 | This is natural, because the way a compiler analyzes code is no different | |
1773 | from how source manipulation programs would do it, except that some properties of | |
1774 | the code are analyzed already in the parser, and that the two may differ in what | |
1775 | kinds of properties they analyze. Thus the process of translating source code | |
1776 | into a structure that is suitable for analysis can be seen as a kind of | |
1777 | interrupted compilation process \see{fig:interruptedCompilationProcess}. | |
1778 | ||
1779 | \begin{figure}[h] | |
1780 | \centering | |
1781 | \tikzset{ | |
1782 | base/.style={anchor=north, align=center, rectangle, minimum height=1.4cm}, | |
1783 | basewithshadow/.style={base, drop shadow, fill=white}, | |
1784 | outlined/.style={basewithshadow, draw, rounded corners, minimum | |
1785 | width=0.4cm}, | |
1786 | primary/.style={outlined, font=\bfseries}, | |
1787 | dashedbox/.style={outlined, dashed}, | |
1788 | arrowpath/.style={black, align=center, font=\small}, | |
1789 | processarrow/.style={arrowpath, ->, >=angle 90, shorten >=1pt}, | |
1790 | } | |
1791 | \begin{tikzpicture}[node distance=1.3cm and 3cm, scale=1, every | |
1792 | node/.style={transform shape}] | |
1793 | \node[base](AuxNode1){\small source code}; | |
1794 | \node[primary, right=of AuxNode1, xshift=-2.5cm](Scanner){Scanner}; | |
1795 | \node[primary, right=of Scanner, xshift=0.5cm](Parser){Parser}; | |
1796 | \node[dashedbox, below=of Parser](SemanticAnalyzer){Semantic\\Analyzer}; | |
1797 | \node[dashedbox, left=of SemanticAnalyzer](SourceCodeOptimizer){Source | |
1798 | Code\\Optimizer}; | |
1799 | \node[dashedbox, below=of SourceCodeOptimizer | |
1800 | ](CodeGenerator){Code\\Generator}; | |
1801 | \node[dashedbox, right=of CodeGenerator](TargetCodeOptimizer){Target | |
1802 | Code\\Optimizer}; | |
1803 | \node[base, right=of TargetCodeOptimizer](AuxNode2){}; | |
1804 | ||
1805 | \draw[processarrow](AuxNode1) -- (Scanner); | |
1806 | ||
1807 | \path[arrowpath] (Scanner) -- node [sloped](tokens){tokens}(Parser); | |
1808 | \draw[processarrow](Scanner) -- (tokens) -- (Parser); | |
1809 | ||
1810 | \path[arrowpath] (Parser) -- node (syntax){syntax | |
1811 | tree}(SemanticAnalyzer); | |
1812 | \draw[processarrow](Parser) -- (syntax) -- (SemanticAnalyzer); | |
1813 | ||
1814 | \path[arrowpath] (SemanticAnalyzer) -- node | |
1815 | [sloped](annotated){annotated\\tree}(SourceCodeOptimizer); | |
1816 | \draw[processarrow, dashed](SemanticAnalyzer) -- (annotated) -- | |
1817 | (SourceCodeOptimizer); | |
1818 | ||
1819 | \path[arrowpath] (SourceCodeOptimizer) -- node | |
1820 | (intermediate){intermediate code}(CodeGenerator); | |
1821 | \draw[processarrow, dashed](SourceCodeOptimizer) -- (intermediate) -- | |
1822 | (CodeGenerator); | |
1823 | ||
1824 | \path[arrowpath] (CodeGenerator) -- node [sloped](target1){target | |
1825 | code}(TargetCodeOptimizer); | |
1826 | \draw[processarrow, dashed](CodeGenerator) -- (target1) -- | |
1827 | (TargetCodeOptimizer); | |
1828 | ||
1829 | \path[arrowpath](TargetCodeOptimizer) -- node [sloped](target2){target | |
1830 | code}(AuxNode2); | |
1831 | \draw[processarrow, dashed](TargetCodeOptimizer) -- (target2) (AuxNode2); | |
1832 | \end{tikzpicture} | |
1833 | \caption{Interrupted compilation process. {\footnotesize (Full compilation | |
1834 | process from \emph{Compiler construction: principles and practice} by | |
1835 | Kenneth C. Louden\citing{louden1997}.)}} | |
1836 | \label{fig:interruptedCompilationProcess} | |
1837 | \end{figure} | |
1838 | ||
1839 | \todoin{Refine \myref{fig:interruptedCompilationProcess}.} | |
1840 | ||
1841 | The process starts with a \emph{scanner}, or lexer. The job of the scanner is to | |
1842 | read the source code and divide it into tokens for the parser. Therefore, it is | |
1843 | also sometimes called a tokenizer. A token is a logical unit, defined in the | |
1844 | language specification, consisting of one or more consecutive characters. In | |
1845 | the Java language the tokens can for instance be the \var{this} keyword, a curly | |
1846 | bracket \var{\{} or a \var{nameToken}. A token is recognized by the scanner on the | |
1847 | basis of something equivalent to a regular expression. This part of the process | |
1848 | is often implemented with the use of a finite automaton. In fact, it is common to | |
1849 | specify the tokens as regular expressions, which in turn are translated into a | |
1850 | finite automaton lexer. This process can be automated. | |
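
As a minimal sketch of the scanning step, the following toy scanner specifies a few token classes as regular expressions and lets \type{java.util.regex} do the matching. The class name \type{ToyScanner} and the token classes are assumptions made for this illustration; they do not correspond to the scanner used by Eclipse, and the token set is far from the complete Java lexical grammar.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** A toy scanner (hypothetical): splits source text into tokens by regular expressions. */
final class ToyScanner {
    // Each named group describes one illustrative token class.
    // Order matters: keywords are tried before general names.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<KEYWORD>\\b(?:this|return)\\b)"
        + "|(?<NAME>[A-Za-z_][A-Za-z_0-9]*)"
        + "|(?<NUMBER>[0-9]+)"
        + "|(?<LBRACE>\\{)"
        + "|(?<RBRACE>\\})"
        + "|(?<OTHER>\\S)");

    private static final String[] KINDS =
        {"KEYWORD", "NAME", "NUMBER", "LBRACE", "RBRACE", "OTHER"};

    /** Returns the tokens of the source as "KIND:text" strings; whitespace is skipped. */
    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(source);
        while (matcher.find()) {
            for (String kind : KINDS) {
                if (matcher.group(kind) != null) {
                    tokens.add(kind + ":" + matcher.group(kind));
                    break;
                }
            }
        }
        return tokens;
    }
}
```

Calling \code{ToyScanner.scan} on a small block such as a return statement in curly brackets yields one entry per token, with the keywords separated from ordinary names. A generated lexer would compile the same kind of specification into a finite automaton instead of interpreting the expressions at run time.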
1851 | ||
1852 | The program component used to translate a stream of tokens into something | |
1853 | meaningful is called a parser. A parser is fed tokens from the scanner and | |
1854 | performs an analysis of the structure of a program. It verifies that the syntax | |
1855 | is correct according to the grammar rules of a language, which are usually | |
1856 | specified in a context-free grammar, and often in a variant of the | |
1857 | \emph{Backus--Naur | |
1858 | Form}\footnote{\url{https://en.wikipedia.org/wiki/Backus-Naur\_Form}}. The | |
1859 | result coming from the parser is in the form of an \emph{Abstract Syntax Tree}, | |
1860 | AST for short. It is called \emph{abstract}, because the structure does not | |
1861 | contain all of the tokens produced by the scanner. It only contains logical | |
1862 | constructs, and because it forms a tree, all kinds of parentheses and brackets | |
1863 | are implicit in the structure. It is this AST that is used when performing the | |
1864 | semantic analysis of the code. | |
1865 | ||
1866 | As an example we can think of the expression \code{(5 + 7) * 2}. The root of | |
1867 | this tree would in Eclipse be an \type{InfixExpression} with the operator | |
1868 | \var{TIMES}, and a left operand that is also an \type{InfixExpression} with the | |
1869 | operator \var{PLUS}. The left operand \type{InfixExpression} has in turn a left | |
1870 | operand of type \type{NumberLiteral} with the value \var{``5''} and a right | |
1871 | operand \type{NumberLiteral} with the value \var{``7''}. The root will have a | |
1872 | right operand of type \type{NumberLiteral} and value \var{``2''}. The AST for | |
1873 | this expression is illustrated in \myref{fig:astInfixExpression}. | |
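
The tree described above can be sketched with a few toy classes. The types \type{Expr}, \type{Num}, \type{Infix} and \type{Op} are hypothetical stand-ins invented for this illustration; they are not the Eclipse classes \type{InfixExpression} and \type{NumberLiteral}, and the evaluation method is only there to show that the tree shape alone encodes the grouping.

```java
/** Toy stand-ins for AST nodes; these are NOT the Eclipse JDT classes. */
interface Expr {
    int eval();
    String show();
}

/** A number literal keeps its source text, such as "5". */
final class Num implements Expr {
    final String token;
    Num(String token) { this.token = token; }
    @Override public int eval() { return Integer.parseInt(token); }
    @Override public String show() { return token; }
}

enum Op { PLUS, TIMES }

/** An infix expression with a left operand, an operator and a right operand. */
final class Infix implements Expr {
    final Expr left;
    final Op op;
    final Expr right;

    Infix(Expr left, Op op, Expr right) {
        this.left = left;
        this.op = op;
        this.right = right;
    }

    @Override public int eval() {
        int l = left.eval();
        int r = right.eval();
        return op == Op.PLUS ? l + r : l * r;
    }

    @Override public String show() {
        return "(" + left.show() + " " + op + " " + right.show() + ")";
    }
}

final class AstDemo {
    /** Builds the tree for (5 + 7) * 2; the grouping is given by nesting alone. */
    static Expr example() {
        Expr sum = new Infix(new Num("5"), Op.PLUS, new Num("7"));
        return new Infix(sum, Op.TIMES, new Num("2"));
    }
}
```

Because the sum is the left child of the product, no node for the parentheses is needed in this toy tree to evaluate the expression in the right order.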
1874 | ||
1875 | Contrary to the Java Model, an abstract syntax tree is a heavy-weight | |
1876 | representation of source code. It contains information about properties like type | |
1877 | bindings for variables and variable bindings for names. | |
1878 | ||
1879 | ||
1880 | \begin{figure}[h] | |
1881 | \centering | |
1882 | \begin{tikzpicture}[scale=0.8] | |
1883 | \tikzset{level distance=40pt} | |
1884 | \tikzset{sibling distance=5pt} | |
1885 | \tikzstyle{thescale}=[scale=0.8] | |
1886 | \tikzset{every tree node/.style={align=center}} | |
1887 | \tikzset{edge from parent/.append style={thick}} | |
1888 | \tikzstyle{inode}=[rectangle,rounded corners,draw,fill=lightgray,drop | |
1889 | shadow,align=center] | |
1890 | \tikzset{every internal node/.style={inode}} | |
1891 | \tikzset{every leaf node/.style={draw=none,fill=none}} | |
1892 | ||
1893 | \Tree [.\type{InfixExpression} [.\type{InfixExpression} | |
1894 | [.\type{NumberLiteral} \var{``5''} ] [.\type{Operator} \var{PLUS} ] | |
1895 | [.\type{NumberLiteral} \var{``7''} ] ] | |
1896 | [.\type{Operator} \var{TIMES} ] | |
1897 | [.\type{NumberLiteral} \var{``2''} ] | |
1898 | ] | |
1899 | \end{tikzpicture} | |
1900 | \caption{The abstract syntax tree for the expression \code{(5 + 7) * 2}.} | |
1901 | \label{fig:astInfixExpression} | |
1902 | \end{figure} | |
1903 | ||
1904 | \subsection{The AST in Eclipse} | |
1905 | In Eclipse, every node in the AST is an instance of a subclass of the abstract class | |
1906 | \typewithref{org.eclipse.jdt.core.dom}{ASTNode}. Every \type{ASTNode}, among a | |
1907 | lot of other things, provides information about its position and length in the | |
1908 | source code, as well as a reference to its parent and to the root of the tree. | |
1909 | ||
1910 | The root of the AST is always of type \type{CompilationUnit}. It is not the same | |
1911 | as an instance of an \type{ICompilationUnit}, which is the compilation unit | |
1912 | handle of the Java model. The children of a \type{CompilationUnit} are an | |
1913 | optional \type{PackageDeclaration}, zero or more nodes of type | |
1914 | \type{ImportDeclaration} and all its top-level type declarations, which have node | |
1915 | type \type{AbstractTypeDeclaration}. | |
1916 | ||
1917 | An \type{AbstractType\-Declaration} can be one of the types | |
1918 | \type{AnnotationType\-Declaration}, \type{Enum\-Declaration} or | |
1919 | \type{Type\-Declaration}. The children of an \type{AbstractType\-Declaration} | |
1920 | must be a subtype of a \type{BodyDeclaration}. These subtypes are: | |
1921 | \type{AnnotationTypeMember\-Declaration}, \type{EnumConstant\-Declaration}, | |
1922 | \type{Field\-Declaration}, \type{Initializer} and \type{Method\-Declaration}. | |
1923 | ||
1924 | Of the body declarations, the \type{Method\-Declaration} is the most interesting | |
1925 | one. Its children include lists of modifiers, type parameters, parameters and | |
1926 | exceptions. It has a return type node and a body node. The body, if present, is | |
1927 | of type \type{Block}. A \type{Block} is itself a \type{Statement}, and its | |
1928 | children are a list of \type{Statement} nodes. | |
1929 | ||
1930 | There are too many subtypes of the abstract type \type{Statement} to list here, but | |
1931 | there exists a subtype of \type{Statement} for every statement type of Java, as | |
1932 | one would expect. This also applies to the abstract type \type{Expression}. | |
1933 | However, the expression \type{Name} is a little special, since it is both used | |
1934 | as an operand in compound expressions, as well as for names in type declarations | |
1935 | and such. | |
1936 | ||
1937 | There is an overview of some of the structure of an Eclipse AST in | |
1938 | \myref{fig:astEclipse}. | |
1939 | ||
1940 | \begin{figure}[h] | |
1941 | \centering | |
1942 | \begin{tikzpicture}[scale=0.8] | |
1943 | \tikzset{level distance=50pt} | |
1944 | \tikzset{sibling distance=5pt} | |
1945 | \tikzstyle{thescale}=[scale=0.8] | |
1946 | \tikzset{every tree node/.style={align=center}} | |
1947 | \tikzset{edge from parent/.append style={thick}} | |
1948 | \tikzstyle{inode}=[rectangle,rounded corners,draw,fill=lightgray,drop | |
1949 | shadow,align=center] | |
1950 | \tikzset{every internal node/.style={inode}} | |
1951 | \tikzset{every leaf node/.style={draw=none,fill=none}} | |
1952 | ||
1953 | \Tree [.\type{CompilationUnit} [.\type{[ PackageDeclaration ]} [.\type{Name} ] | |
1954 | [.\type{\{ Annotation \}*} ] ] | |
1955 | [.\type{\{ ImportDeclaration \}*} [.\type{Name} ] ] | |
1956 | [.\type{\{ AbstractTypeDeclaration \}+} [.\node(site){\type{\{ | |
1957 | BodyDeclaration \}*}}; ] [.\type{SimpleName} ] ] | |
1958 | ] | |
1959 | \begin{scope}[shift={(0.5,-6)}] | |
1960 | \node[inode,thescale](root){\type{MethodDeclaration}}; | |
1961 | \node[inode,thescale](modifiers) at (4.5,-5){\type{\{ IExtendedModifier \}*} | |
1962 | \\ {\footnotesize (Of type \type{Modifier} or \type{Annotation})}}; | |
1963 | \node[inode,thescale](typeParameters) at (-6,-3.5){\type{\{ TypeParameter | |
1964 | \}*}}; | |
1965 | \node[inode,thescale](parameters) at (-5,-5){\type{\{ | |
1966 | SingleVariableDeclaration \}*} \\ {\footnotesize (Parameters)}}; | |
1967 | \node[inode,thescale](exceptions) at (5,-3){\type{\{ Name \}*} \\ | |
1968 | {\footnotesize (Exceptions)}}; | |
1969 | \node[inode,thescale](return) at (-6.5,-2){\type{Type} \\ {\footnotesize | |
1970 | (Return type)}}; | |
1971 | \begin{scope}[shift={(0,-5)}] | |
1972 | \Tree [.\node(body){\type{[ Block ]} \\ {\footnotesize (Body)}}; | |
1973 | [.\type{\{ Statement \}*} [.\type{\{ Expression \}*} ] | |
1974 | [.\type{\{ Statement \}*} [.\type{\ldots} ]] | |
1975 | ] | |
1976 | ] | |
1977 | \end{scope} | |
1978 | \end{scope} | |
1979 | \draw[->,>=triangle 90,shorten >=1pt](root.east)..controls +(east:2) and | |
1980 | +(south:1)..(site.south); | |
1981 | ||
1982 | \draw (root.south) -- (modifiers); | |
1983 | \draw (root.south) -- (typeParameters); | |
1984 | \draw (root.south) -- ($ (parameters.north) + (2,0) $); | |
1985 | \draw (root.south) -- (exceptions); | |
1986 | \draw (root.south) -- (return); | |
1987 | \draw (root.south) -- (body); | |
1988 | ||
1989 | \end{tikzpicture} | |
1990 | \caption{The format of the abstract syntax tree in Eclipse.} | |
1991 | \label{fig:astEclipse} | |
1992 | \end{figure} | |
1993 | \todoin{Add more to the AST format tree? \myref{fig:astEclipse}} | |
1994 | ||
1995 | \section{The ASTVisitor} | |
1996 | So far, the only thing that has been addressed is how the data that is going | |
1997 | to be the basis for our analysis is structured. Another aspect is how we | |
1998 | are going to traverse the AST to gather the information we need, so we can | |
1999 | draw conclusions about the properties we are analyzing. It is of course possible to | |
2000 | start at the top of the tree, and manually search through its nodes for the ones | |
2001 | we are looking for, but that is a bit inconvenient. To be able to efficiently | |
2002 | utilize such an approach, we would need to make our own framework for traversing | |
2003 | the tree and visiting only the types of nodes we are after. Luckily, this | |
2004 | functionality is already provided in Eclipse, by its | |
2005 | \typewithref{org.eclipse.jdt.core.dom}{ASTVisitor}. | |
2006 | ||
2007 | The Eclipse AST, together with its \type{ASTVisitor}, follows the \emph{Visitor} | |
2008 | pattern\citing{designPatterns}. The intent of this design pattern is to | |
2009 | facilitate extending the functionality of classes without touching the classes | |
2010 | themselves. | |
2011 | ||
2012 | Let us say that there is a class hierarchy of \emph{Elements}. These elements | |
2013 | all have a method \method{accept(Visitor visitor)}. In its simplest form, the | |
2014 | \method{accept} method just calls the \method{visit} method of the visitor with | |
2015 | itself as an argument, like this: \code{visitor.visit(this)}. For the visitors | |
2016 | to be able to extend the functionality of all the classes in the elements | |
2017 | hierarchy, each \type{Visitor} must have one visit method for each concrete | |
2018 | class in the hierarchy. Say the hierarchy consists of the concrete classes | |
2019 | \type{ConcreteElementA} and \type{ConcreteElementB}. Then each visitor must have | |
2020 | the (possibly empty) methods \method{visit(ConcreteElementA element)} and | |
2021 | \method{visit(ConcreteElementB element)}. This scenario is depicted in | |
2022 | \myref{fig:visitorPattern}. | |
2023 | ||
2024 | \begin{figure}[h] | |
2025 | \centering | |
2026 | \tikzstyle{abstract}=[rectangle, draw=black, fill=white, drop shadow, text | |
2027 | centered, anchor=north, text=black, text width=6cm, every one node | |
2028 | part/.style={align=center, font=\bfseries\itshape}] | |
2029 | \tikzstyle{concrete}=[rectangle, draw=black, fill=white, drop shadow, text | |
2030 | centered, anchor=north, text=black, text width=6cm] | |
2031 | \tikzstyle{inheritarrow}=[->, >=open triangle 90, thick] | |
2032 | \tikzstyle{commentarrow}=[->, >=angle 90, dashed] | |
2033 | \tikzstyle{line}=[-, thick] | |
2034 | \tikzset{every one node part/.style={align=center, font=\bfseries}} | |
2035 | \tikzset{every second node part/.style={align=center, font=\ttfamily}} | |
2036 | ||
2037 | \begin{tikzpicture}[node distance=1cm, scale=0.8, every node/.style={transform | |
2038 | shape}] | |
2039 | \node (Element) [abstract, rectangle split, rectangle split parts=2] | |
2040 | { | |
2041 | \nodepart{one}{Element} | |
2042 | \nodepart{second}{+accept(visitor: Visitor)} | |
2043 | }; | |
2044 | \node (AuxNode01) [text width=0, minimum height=2cm, below=of Element] {}; | |
2045 | \node (ConcreteElementA) [concrete, rectangle split, rectangle split | |
2046 | parts=2, left=of AuxNode01] | |
2047 | { | |
2048 | \nodepart{one}{ConcreteElementA} | |
2049 | \nodepart{second}{+accept(visitor: Visitor)} | |
2050 | }; | |
2051 | \node (ConcreteElementB) [concrete, rectangle split, rectangle split | |
2052 | parts=2, right=of AuxNode01] | |
2053 | { | |
2054 | \nodepart{one}{ConcreteElementB} | |
2055 | \nodepart{second}{+accept(visitor: Visitor)} | |
2056 | }; | |
2057 | ||
2058 | \node[comment, below=of ConcreteElementA] (CommentA) {visitor.visit(this)}; | |
2059 | ||
2060 | \node[comment, below=of ConcreteElementB] (CommentB) {visitor.visit(this)}; | |
2061 | ||
2062 | \node (AuxNodeX) [text width=0, minimum height=1cm, below=of AuxNode01] {}; | |
2063 | ||
2064 | \node (Visitor) [abstract, rectangle split, rectangle split parts=2, | |
2065 | below=of AuxNodeX] | |
2066 | { | |
2067 | \nodepart{one}{Visitor} | |
2068 | \nodepart{second}{+visit(ConcreteElementA)\\+visit(ConcreteElementB)} | |
2069 | }; | |
2070 | \node (AuxNode02) [text width=0, minimum height=2cm, below=of Visitor] {}; | |
2071 | \node (ConcreteVisitor1) [concrete, rectangle split, rectangle split | |
2072 | parts=2, left=of AuxNode02] | |
2073 | { | |
2074 | \nodepart{one}{ConcreteVisitor1} | |
2075 | \nodepart{second}{+visit(ConcreteElementA)\\+visit(ConcreteElementB)} | |
2076 | }; | |
2077 | \node (ConcreteVisitor2) [concrete, rectangle split, rectangle split | |
2078 | parts=2, right=of AuxNode02] | |
2079 | { | |
2080 | \nodepart{one}{ConcreteVisitor2} | |
2081 | \nodepart{second}{+visit(ConcreteElementA)\\+visit(ConcreteElementB)} | |
2082 | }; | |
2083 | ||
2084 | ||
2085 | \draw[inheritarrow] (ConcreteElementA.north) -- ++(0,0.7) -| | |
2086 | (Element.south); | |
2087 | \draw[line] (ConcreteElementA.north) -- ++(0,0.7) -| | |
2088 | (ConcreteElementB.north); | |
2089 | ||
2090 | \draw[inheritarrow] (ConcreteVisitor1.north) -- ++(0,0.7) -| | |
2091 | (Visitor.south); | |
2092 | \draw[line] (ConcreteVisitor1.north) -- ++(0,0.7) -| | |
2093 | (ConcreteVisitor2.north); | |
2094 | ||
2095 | \draw[commentarrow] (CommentA.north) -- (ConcreteElementA.south); | |
2096 | \draw[commentarrow] (CommentB.north) -- (ConcreteElementB.south); | |
2097 | ||
2098 | ||
2099 | \end{tikzpicture} | |
2100 | \caption{The Visitor Pattern.} | |
2101 | \label{fig:visitorPattern} | |
2102 | \end{figure} | |
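
The scenario with \type{ConcreteElementA} and \type{ConcreteElementB} can be written out as a minimal sketch. The visitor \type{NamingVisitor} and the class \type{VisitorPatternDemo} are hypothetical names introduced for this illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** One visit method per concrete class in the element hierarchy. */
interface Visitor {
    void visit(ConcreteElementA element);
    void visit(ConcreteElementB element);
}

interface Element {
    void accept(Visitor visitor);
}

final class ConcreteElementA implements Element {
    // The static type of "this" selects visit(ConcreteElementA).
    @Override public void accept(Visitor visitor) { visitor.visit(this); }
}

final class ConcreteElementB implements Element {
    // The static type of "this" selects visit(ConcreteElementB).
    @Override public void accept(Visitor visitor) { visitor.visit(this); }
}

/** A hypothetical visitor that adds a "name yourself" operation to the elements. */
final class NamingVisitor implements Visitor {
    final List<String> seen = new ArrayList<>();
    @Override public void visit(ConcreteElementA element) { seen.add("A"); }
    @Override public void visit(ConcreteElementB element) { seen.add("B"); }
}

final class VisitorPatternDemo {
    /** Lets one visitor visit a mixed list of elements and returns what it recorded. */
    static List<String> run() {
        NamingVisitor visitor = new NamingVisitor();
        List<Element> elements = Arrays.asList(
            new ConcreteElementA(), new ConcreteElementB(), new ConcreteElementA());
        for (Element element : elements) {
            element.accept(visitor);
        }
        return visitor.seen;
    }
}
```

The new operation lives entirely in \type{NamingVisitor}; neither element class had to change to support it, which is the point of the pattern.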
2103 | ||
2104 | The use of the visitor pattern can be appropriate when the hierarchy of elements | |
2105 | is mostly stable, but the family of operations over its elements is constantly | |
2106 | growing. This is clearly the case for the Eclipse AST, since the hierarchy of | |
2107 | type \type{ASTNode} is very stable, but the functionality of its elements is | |
2108 | extended every time someone needs to operate on the AST. Another aspect of the | |
2109 | Eclipse implementation is that it is a public API, and the visitor pattern is an | |
2110 | easy way to provide access to the nodes in the tree. | |
2111 | ||
2112 | The version of the visitor pattern implemented for the AST nodes in Eclipse also | |
2113 | provides an elegant way to traverse the tree. It does so by following the | |
2114 | convention that every node in the tree first lets the visitor visit itself, | |
2115 | before it also makes all its children accept the visitor. The children are only | |
2116 | visited if the visit method of their parent returns \var{true}. This pattern | |
2117 | then makes for a prefix traversal of the AST. If postfix traversal is desired, | |
2118 | the visitors also have \method{endVisit} methods for each node type, which are | |
2119 | called after the \method{visit} method for a node and after its children have | |
2120 | been visited. In addition to these visit methods, there are also the methods | |
2121 | \method{preVisit(ASTNode)}, \method{postVisit(ASTNode)} and | |
2122 | \method{preVisit2(ASTNode)}. The \method{preVisit} method is called before the | |
2123 | type-specific \method{visit} method. The \method{postVisit} method is called | |
2124 | after the type-specific \method{endVisit}. The type-specific \method{visit} is | |
2125 | only called if \method{preVisit2} returns \var{true}. Overriding | |
2126 | \method{preVisit2} also alters the behavior of \method{preVisit}, since the | |
2127 | default implementation of \method{preVisit2} is responsible for calling it. | |
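
The calling order can be made concrete with a simplified model of the traversal. The classes \type{TreeVisitor}, \type{Node} and \type{TraceVisitor} below are toy stand-ins with a single node type, written for this illustration under the assumption that the order is as described above; they are not the Eclipse classes, where each node type has its own \method{visit} and \method{endVisit} methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy visitor with the hooks described in the text; not the Eclipse ASTVisitor. */
class TreeVisitor {
    // The default preVisit2 is what calls preVisit.
    public boolean preVisit2(Node node) { preVisit(node); return true; }
    public void preVisit(Node node) { }
    public boolean visit(Node node) { return true; }
    public void endVisit(Node node) { }
    public void postVisit(Node node) { }
}

/** Toy tree node with a label and a list of children. */
final class Node {
    final String label;
    final List<Node> children;

    Node(String label, Node... children) {
        this.label = label;
        this.children = Arrays.asList(children);
    }

    void accept(TreeVisitor visitor) {
        if (visitor.preVisit2(this)) {
            // Children are only visited if visit returns true.
            if (visitor.visit(this)) {
                for (Node child : children) {
                    child.accept(visitor);
                }
            }
            visitor.endVisit(this);
        }
        visitor.postVisit(this);
    }
}

/** Records every callback; refuses to descend into nodes labeled "skip". */
final class TraceVisitor extends TreeVisitor {
    final List<String> log = new ArrayList<>();
    @Override public void preVisit(Node n) { log.add("pre " + n.label); }
    @Override public boolean visit(Node n) {
        log.add("visit " + n.label);
        return !n.label.equals("skip");
    }
    @Override public void endVisit(Node n) { log.add("end " + n.label); }
    @Override public void postVisit(Node n) { log.add("post " + n.label); }
}

final class TraversalDemo {
    static List<String> trace() {
        Node tree = new Node("root",
            new Node("leaf"),
            new Node("skip", new Node("hidden")));
        TraceVisitor visitor = new TraceVisitor();
        tree.accept(visitor);
        return visitor.log;
    }
}
```

In the resulting trace, the \method{visit} entries appear in prefix order and the \method{endVisit} entries in postfix order, while the node labeled \code{hidden} never shows up, because the \method{visit} method of its parent returned \var{false}.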
2128 | ||
2129 | An example of a trivial \type{ASTVisitor} is shown in | |
2130 | \myref{lst:astVisitorExample}. | |
2131 | ||
2132 | \begin{listing} | |
2133 | \begin{minted}{java} | |
2134 | public class CollectNamesVisitor extends ASTVisitor { | |
2135 | Collection<Name> names = new LinkedList<Name>(); | |
2136 | ||
2137 | @Override | |
2138 | public boolean visit(QualifiedName node) { | |
2139 | names.add(node); | |
2140 | return false; | |
2141 | } | |
2142 | ||
2143 | @Override | |
2144 | public boolean visit(SimpleName node) { | |
2145 | names.add(node); | |
2146 | return true; | |
2147 | } | |
2148 | } | |
2149 | \end{minted} | |
2150 | \caption{An \type{ASTVisitor} that visits all the names in a subtree and adds | |
2151 | them to a collection, except those names that are children of any | |
2152 | \type{QualifiedName}.} | |
2153 | \label{lst:astVisitorExample} | |
2154 | \end{listing} | |
2155 | ||
2156 | ||
2157 | \section{Illegal selections} | |
2158 | ||
2159 | \subsection{Not all branches end in return} | |
2160 | ||
2161 | \subsection{Ambiguous return statement} | |
2162 | This problem occurs when there is either more than one assignment to a local | |
2163 | variable that is used outside of the selection, or there is only one, but there | |
2164 | are also return statements in the selection. | |
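
The two cases can be illustrated with a small, hypothetical example. It assumes that the first case refers to assignments to more than one local variable that is needed after the selection; the class and method names are invented, and the comments mark an imagined selection.

```java
/** Hypothetical methods whose marked selections cannot be extracted cleanly. */
final class AmbiguousReturn {

    /** Case 1: two locals are assigned in the selection and both are used afterwards. */
    static int twoAssignments(int x) {
        int a;
        int b;
        // --- selection start ---
        a = x + 1;
        b = x * 2;
        // --- selection end ---
        return a + b; // an extracted method could only return one of a and b
    }

    /** Case 2: one assigned local is used afterwards, but the selection also returns. */
    static int assignmentAndReturn(int x) {
        int a;
        // --- selection start ---
        if (x < 0) {
            return -1; // a return statement inside the selection
        }
        a = x + 1;
        // --- selection end ---
        return a * 2; // the extracted method would need to return both a and the early result
    }
}
```

In both methods a single return value of an extracted method would have to carry more than one piece of information, which is what makes the return statement ambiguous.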
2165 | ||
2166 | \todoin{Explain why we do not need to consider variables assigned inside | |
2167 | local/anonymous classes. (The referenced variables need to be final and so | |
2168 | on\ldots)} | |
2169 | ||
2170 | \chapter{Eclipse Bugs Found} | |
2171 | \todoin{Add other things and change headline?} | |
2172 | ||
2173 | \section{Eclipse bug 420726: Code is broken when moving a method that is | |
2174 | assigning to the parameter that is also the move | |
2175 | destination}\label{eclipse_bug_420726} | |
2176 | This bug\footnote{\url{https://bugs.eclipse.org/bugs/show\_bug.cgi?id=420726}} | |
2177 | was found when analyzing what kinds of names were to be considered as | |
2178 | \emph{unfixes} \see{unfixes}. | |
2179 | ||
2180 | \subsection{The bug} | |
2181 | The bug emerges when trying to move a method from one class to another, and when | |
2182 | the target for the move (must be a variable, local or field) is both a parameter | |
2183 | variable and also is assigned to within the method body. Eclipse allows this to | |
2184 | happen, although it is a sure path to a compilation error. This is because we | |
2185 | would then have an assignment to a \var{this} expression, which is not allowed | |
2186 | in Java. | |
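
A minimal, hypothetical example of code that triggers the situation is sketched below. The classes \type{Source} and \type{Target} and the method \method{update} are invented for this illustration; the code compiles as it stands, and the problem only appears after the move.

```java
/** Hypothetical destination class for the move. */
final class Target {
    final int value;
    Target(int value) { this.value = value; }
}

/** Hypothetical originating class. */
final class Source {
    // Candidate for Move Method, with the parameter "target" as destination.
    int update(Target target) {
        target = new Target(42); // assignment to the would-be receiver
        return target.value;
    }
}

// After a move to Target, every use of "target" in the body would become
// "this", so the first statement would read "this = new Target(42);",
// which is not legal Java.
```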
2187 | ||
2188 | \subsection{The solution} | |
2189 | The solution to this problem is to add all simple names that are assigned to in | |
2190 | a method body to the set of unfixes. | |
2191 | ||
2192 | \section{Eclipse bug 429416: IAE when moving method from anonymous class} | |
2193 | I | |
2194 | discovered\footnote{\url{https://bugs.eclipse.org/bugs/show\_bug.cgi?id=429416}} | |
2195 | this bug during a batch change on the \type{org.eclipse.jdt.ui} project. | |
2196 | ||
2197 | \subsection{The bug} | |
2198 | This bug surfaces when trying to use the Move Method refactoring to move a | |
2199 | method from an anonymous class to another class. This happens both for my | |
2200 | simulation as well as in Eclipse, through the user interface. It only occurs | |
2201 | when Eclipse analyzes the program and finds it necessary to pass an instance of | |
2202 | the originating class as a parameter to the moved method. That is, it wants to pass a | |
2203 | \var{this} expression. The execution ends in an | |
2204 | \typewithref{java.lang}{IllegalArgumentException} in | |
2205 | \typewithref{org.eclipse.jdt.core.dom}{SimpleName} and its | |
2206 | \method{setIdentifier(String)} method. An attempt to create the simple name is made in | |
2207 | the method | |
2208 | \methodwithref{org.eclipse.jdt.internal.corext.refactoring.structure.\\MoveInstanceMethodProcessor}{createInlinedMethodInvocation} | |
2209 | so the \type{MoveInstanceMethodProcessor} was a clear suspect early on. | |
2210 | ||
2211 | The \method{createInlinedMethodInvocation} method creates a new method | |
2212 | invocation at the place of the previous invocation of the moved method. From | |
2213 | its code it can be read that when a \var{this} expression is going to be passed | |
2214 | in to the invocation, it shall be qualified with the name of the original | |
2215 | method's declaring class, if the declaring class is either an anonymous class or | |
2216 | a member class. The problem with this, is that an anonymous class does not have | |
2217 | a name, hence the term \emph{anonymous} class! Therefore, when its name, an | |
2218 | empty string, is passed into | |
2219 | \methodwithref{org.eclipse.jdt.core.dom.AST}{newSimpleName} it all ends in an | |
2220 | \type{IllegalArgumentException}. | |
2221 | ||
2222 | \subsection{How I solved the problem} | |
2223 | Since the \type{MoveInstanceMethodProcessor} is instantiated in the | |
2224 | \typewithref{no.uio.ifi.refaktor.change.executors}{MoveMethod\-RefactoringExecutor}, | |
2225 | and only needs to be a | |
2226 | \typewithref{org.eclipse.ltk.core.refactoring.participants}{MoveProcessor}, I | |
2227 | was able to copy the code for the original move processor and modify it so that | |
2228 | it works better for me. It is now called | |
2229 | \typewithref{no.uio.ifi.refaktor.refactorings.processors}{ModifiedMoveInstanceMethodProcessor}. | |
2230 | The only modification made (in addition to some imports and suppression of | |
2231 | warnings) is in \method{createInlinedMethodInvocation}. When the declaring | |
2232 | class of the method to move is anonymous, the \var{this} expression in the | |
2233 | parameter list is no longer qualified with the declaring class's (empty) name. | |
2234 | ||
2235 | \section{Eclipse bug 429954: Extracting statement with reference to local type | |
2236 | breaks code}\label{eclipse_bug_429954} | |
2237 | The bug\footnote{\url{https://bugs.eclipse.org/bugs/show\_bug.cgi?id=429954}} | |
2238 | was discovered while changing the way unfixes are computed. | |
2239 | ||
2240 | \subsection{The bug} | |
2241 | The problem is that Eclipse allows selections that reference variables of | |
2242 | local types to be extracted. When this happens the code is broken, since the | |
2243 | extracted method must take a parameter of a local type that is not in the | |
2244 | method's scope. The problem is illustrated in | |
2245 | \myref{lst:extractMethod_LocalClass}, although in a different setting. | |
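
As a small sketch of the situation described here, consider the hypothetical
class \type{LocalTypeDemo} below. The names are made up for illustration and
are not taken from the bug report. The statement marked in \method{compute}
references a variable of the local type \type{Counter}. The comment after the
method shows the kind of signature an extracted method would need, and why it
cannot compile.

```java
// Hypothetical example; names are chosen for illustration only.
public class LocalTypeDemo {

    static int compute() {
        // A local class: it is only in scope inside compute().
        class Counter {
            int value = 41;

            void increment() {
                value++;
            }
        }

        Counter counter = new Counter();
        counter.increment(); // the statement selected for extraction
        return counter.value;
    }

    // Extracting the marked statement would require a method such as
    //
    //     static void extracted(Counter counter) { counter.increment(); }
    //
    // which does not compile, because Counter is not in scope
    // outside of compute().

    public static void main(String[] args) {
        System.out.println(compute());
    }
}
```

The code compiles as written. It is only the extraction of the marked statement
that would produce a method whose parameter type is out of scope.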
2246 | ||
2247 | \subsection{Actions taken} | |
2248 | No actions spring directly from this bug, since the Extract | |
2249 | Method refactoring can hardly be meant to behave this way. Instead, it is handled in the | |
2250 | analysis stage of our Extract and Move Method refactoring, where names representing | |
2251 | variables of local types are considered unfixes \see{unfixes}. | |
2252 | \todoin{write more when fixing this in legal statements checker} | |
2253 | ||
2254 | \chapter{Related Work} | |
2255 | ||
2256 | \section{The compositional paradigm of refactoring} | |
2257 | This paradigm builds upon the observation of Vakilian et | |
2258 | al.\citing{vakilian2012}, that of the many automated refactorings existing in | |
2259 | modern IDEs, the simplest ones dominate the usage statistics. The report | |
2260 | mainly focuses on \emph{Eclipse} as the tool under investigation. | |
2261 | ||
2262 | The paradigm is described almost as the opposite of automated composition of | |
2263 | refactorings \see{compositeRefactorings}. It works by providing the programmer | |
2264 | with easily accessible primitive refactorings. These refactorings shall be | |
2265 | accessed via keyboard shortcuts or quick-assist menus\footnote{Think | |
2266 | quick-assist with Ctrl+1 in Eclipse} and be promptly executed, as opposed to the | |
2267 | currently dominant wizard-based refactoring paradigm. They are meant to | |
2268 | stimulate composing smaller refactorings into more complex changes, rather than | |
2269 | doing a large upfront configuration of a wizard-based refactoring, before | |
2270 | previewing and executing it. The compositional paradigm of refactoring is | |
2271 | supposed to give control back to the programmer, by supporting \himher with an | |
2272 | option of performing small rapid changes instead of large changes with a lesser | |
2273 | degree of control. The report authors hope this will lead to fewer unsuccessful | |
2274 | refactorings. It could also lower the bar for understanding the steps of a | |
2275 | larger composite refactoring and thus also help in figuring out what goes wrong | |
2276 | if one should choose to opt in to a wizard-based refactoring. | |
2277 | ||
2278 | Vakilian and his associates have performed a survey of the effectiveness of the | |
2279 | compositional paradigm versus the wizard-based one. They claim to have found | |
2280 | evidence that the \emph{compositional paradigm} outperforms the | |
2281 | \emph{wizard-based} one. It does so by reducing automation, which seems | |
2282 | counterintuitive. Therefore they ask the question ``What is an appropriate level | |
2283 | of automation?'', and thus question what they feel is a rush toward more | |
2284 | automation in the software engineering community. | |
2285 | ||
2286 | ||
2287 | \backmatter{} | |
2288 | \printbibliography | |
2289 | \listoftodos | |
2290 | \end{document} |