Thesis: double class instance invocations

[ifi-stolz-refaktor.git] / thesis / master-thesis-erlenkr.tex
diff --git a/thesis/master-thesis-erlenkr.tex b/thesis/master-thesis-erlenkr.tex

index abdf498b9bb116845a98d65d6a32ef2424239757..7c3e3d31114466e88ae4fce6f06ebea09ef040fa 100644 (file)
--- a/thesis/master-thesis-erlenkr.tex
+++ b/thesis/master-thesis-erlenkr.tex
@@ -1653,6 +1653,24 @@ This shows that classes acting as superclasses are especially fragile to
  introducing errors in the context of automated refactoring. This is also shown 
  in bug\ldots \todoin{File Eclipse bug report}
  
+\subsection{A double class instance creation}
+The following is a problem caused solely by the underlying \MoveMethod 
+refactoring.  The problem occurs if two classes are instantiated such that the 
+first constructor invocation is an argument to a second, and that the first 
+constructor invocation takes an argument that is built up using a field. As an 
+example, say that \var{name} is a field of the enclosing class, and we have the 
+expression \code{new A(new B(name))}. If this expression is located in a 
+selection that is moved to another class, \var{name} will be left untouched, 
+instead of being prefixed with a variable of the same type as it is declared in.  
+If \var{name} is the destination for the move, it is not replaced by 
+\code{this}, or removed if it is a prefix to a member access 
+(\code{name.member}), but it is still left by itself.
+
+Situations like this would lead to code that will not compile. Therefore, we 
+have to avoid them by not allowing selections to contain such double class 
+instance creations that also contains references to fields.
+\todoin{File Eclipse bug report}
+
  \subsection{Instantiation of non-static inner class}
  When a non-static inner class is instantiated, this must happen in the scope of 
  its declaring class. This is because it must have access to the members of the 
@@ -1668,16 +1686,16 @@ unsuitable for the \ExtractAndMoveMethod refactoring.
  \subsection{References to enclosing instances of the enclosing class}
  The title of this section may be a little hard to grasp at first. What it means 
  is that there is a (non-static) class \m{C} that is declared in the scope of 
-possibly multiple other classes. And there is a statement in a method declared 
-in class \m{C} that contains a reference to one or more instances of these 
-enclosing classes of \m{C}.
+possibly multiple other classes. And there is a statement in the body of a 
+method declared in class \m{C}, that contains a reference to one or more 
+instances of these enclosing classes of \m{C}.
  
  The problem with this, is that these references may not be valid if they are 
  moved to another class. Theoretically, some situations could easily be solved by 
  passing, to the moved method, a reference to the instance where the problematic 
-referenced member is declared, in the case where this member is public. This is 
-not done in the underlying \MoveMethod refactoring, so it cannot be allowed in 
-the \ExtractAndMoveMethod refactoring either.
+referenced member is declared. This should work in the case where this member is 
+publicly accessible. This is not done in the underlying \MoveMethod refactoring, 
+so it cannot be allowed in the \ExtractAndMoveMethod refactoring either.
  
  \subsection{Inconsistent return statements}
  To verify that a text selection is consistent with respect to return statements, 
@@ -1699,38 +1717,40 @@ thrown exception.
  
  Return statements can be either explicit or implicit. An \emph{explicit} return 
  statement is formed by using the \code{return} keyword, while an \emph{implicit} 
-return statement is a statement that is not formed using \code{return} keyword, 
-but must be the last statement of a method that can have any side effects. This 
-can happen in methods with a void return type. An example is a statement that is 
+return statement is a statement that is not formed using \code{return}, but must 
+be the last statement of a method that can have any side effects. This can 
+happen in methods with a void return type. An example is a statement that is 
  inside one or more blocks. The last statement of a method could for instance be 
  a synchronized statement, but the last statement that is executed in the method, 
  and that can have any side effects, may be located inside the body of the 
  synchronized statement.
  
  We can start the check for this property by looking at the last statement of a 
-selection to see if it is a return statement (explicit or implicit) or a 
-throw statement.  If it is, then the property holds, assuming the selected code 
-does not contain any compilation errors.
+selection to see if it is a return statement (explicit or implicit) or a throw 
+statement.  If this is the case, then the property holds, assuming the selected 
+code does not contain any compilation errors. All execution paths within the 
+selection should end in either this, or another, return or throw statement.
  \todoin{State somewhere that we assume no compilation errors?}
  
  If the last statement of the selection is not a return or throw, the execution 
-of it must end in one. This means that all branches of the last statement of 
-every branch must end in a return or throw. Given this recursive definition, 
-there are only five types of statements that are guaranteed to end in a return 
-or throw if their child branches does. All other statements would have to be 
-considered illegal. The first three, block-statements, labeled statements and 
-do-statements are all kinds of fall-through statements that always gets their 
-body executed. Do-statements would not make much sense if written such that they
+of it must eventually end in one for the selection to be legal. This means that 
+all branches of the last statement of every branch must end in a return or 
+throw.  Given this recursive definition, there are only five types of statements 
+that are guaranteed to end in a return or throw if their child branches does.  
+All other statements would have to be considered illegal. The first three: 
+Block-statements, labeled statements and do-statements are all kinds of 
+fall-through statements that always gets their body executed. Do-statements 
+would not make much sense if written such that they
  always ends after the first round of execution of their body, but that is not 
  our concern. The remaining two statements that can end in a return or throw are 
  if-statements and try-statements.
  
  For an if-statement, the rule is that if its then-part does not contain any 
-return or throw statements, it is considered illegal. If the then-part does 
+return or throw statements, this is considered illegal. If the then-part does 
  contain a return or throw, the else-part is checked. If its else-part is 
  non-existent, or it does not contain any return or throw statements, the 
  statement is considered illegal. If an if-statement is not considered illegal, 
-the bodies of its parts must be checked. 
+the bodies of its two parts must be checked. 
  
  Try-statements are handled much the same way as if-statements. The body of a 
  try-statement must contain a return or throw. The same applies to its catch 
@@ -1752,7 +1772,8 @@ one such reference is done, or zero if any return statements are found.
  
  \subsection{Illegal statements}
  An illegal statement may be a statement that is of a type that is never allowed, 
-or it may be a statement that is only allowed if certain conditions are true.
+or it may be a statement of a type that is only allowed if certain conditions 
+are true.
  
  Any use of the \var{super} keyword is prohibited, since its meaning is altered 
  when moving a method to another class.
@@ -1792,8 +1813,8 @@ time to analyze all statements types.}
  
  
  
-\chapter{Refactorings in Eclipse JDT: Design, Shortcomings and Wishful 
-Thinking}\label{ch:jdt_refactorings}
+\chapter{Refactorings in Eclipse JDT: Design and 
+Shortcomings}\label{ch:jdt_refactorings}
  
  This chapter will deal with some of the design behind refactoring support in 
  \name{Eclipse}, and the JDT in specific. After which it will follow a section about 
@@ -1918,8 +1939,6 @@ individual undo actions corresponding to every single JDT refactoring in the
  sequence. This problem is not trivial to handle in \name{Eclipse} 
  \see{hacking_undo_history}.
  
-\section{Wishful Thinking}
-\todoin{???}
  
  
  \chapter{Composite Refactorings in Eclipse}
@@ -2130,13 +2149,13 @@ generates text selections of all the statement lists for the analyzer to work
  with.
  
  \paragraph{The statement lists creator}
-is responsible for generating lists of statements for all the possible levels of 
-statements in the method. The statement lists creator is implemented as an AST 
-visitor \see{astVisitor}. It generates lists of statements by visiting all the 
-blocks in the method declaration and stores their statements in a collection of 
-statement lists. In addition, it visits all of the other statements that can 
-have a statement as a child, such as the different control structures and the 
-labeled statement.
+is responsible for generating lists of statements for all the possible nesting 
+levels of statements in the method. The statement lists creator is implemented 
+as an AST visitor \see{astVisitor}. It generates lists of statements by visiting 
+all the blocks in the method declaration and stores their statements in a 
+collection of statement lists. In addition, it visits all of the other 
+statements that can have a statement as a child, such as the different control 
+structures and the labeled statement.
  
  The switch statement is the only kind of statement that is not straight forward 
  to obtain the child statements from. It stores all of its children in a flat 
@@ -2145,8 +2164,8 @@ there are potential statement lists between all of these case statements. The
  list of statements from a switch statement is therefore traversed, and the 
  statements between the case statements are grouped as separate lists.
  
-There is an example of how the statement lists creator would generate lists for 
-a simple method in \myref{lst:statementListsExample}.
+\Myref{lst:statementListsExample} shows an example of how the statement lists 
+creator would generate lists for a simple method.
  
  \begin{listing}[h]
  \def\charwidth{5.7pt}
@@ -2185,11 +2204,10 @@ into lists of statements. Each highlighted rectangle represents a list.}
  \end{listing}
  
  \paragraph{The text selections generator} generates text selections for each 
-list of statements from the statement lists creator. Conceptually, the generator 
-generates a text selection for every possible ordered \todo{make clearer} 
-combination of statements in a list. For a list of statements, the boundary 
-statements span out a text selection. This means that there are many different 
-lists that could span out the same selection.
+list of statements from the statement lists creator. The generator generates a 
+text selection for every sub-sequence of statements in a list. For a sequence of 
+statements, the first statement and the last statement span out a text 
+selection. 
  
  In practice, the text selections are calculated by only one traversal of the 
  statement list. There is a set of generated text selections. For each statement, 
@@ -2979,6 +2997,7 @@ found is registered with a prefix set, together with all its sub-prefixes.
  \subsection{The UnfixesCollector}\label{unfixes}
  The \typewithref{no.uio.ifi.refaktor.extractors.collectors}{UnfixesCollector} 
  finds unfixes within a selection.
+\todoin{Give more technical detail?}
  
  
  
@@ -3020,6 +3039,17 @@ very simple. It looks for calls to methods that are either protected or
  package-private within the selection, and throws an 
  \type{IllegalExpressionFoundException} if one is found.
  
+\subsection{The DoubleClassInstanceCreationChecker}
+The \type{DoubleClassInstanceCreationChecker} checks that there are no double 
+class instance creations where the inner constructor call take and argument that 
+is built up using field references.
+
+The checker visits all nodes of type \type{ClassInstanceCreation} within a 
+selection. For all of these nodes, if its parent also is a class instance 
+creation, it accepts a visitor that throws a 
+\type{IllegalExpressionFoundException} if it enclounters a name that is a field 
+reference.
+
  \subsection{The InstantiationOfNonStaticInnerClassChecker}
  The \type{InstantiationOfNonStaticInnerClassChecker} checks that selections
  does not contain instantiations of non-static inner classes. The 
@@ -3028,12 +3058,9 @@ instantiations gracefully when moving a method. This problem is also related to
  bug\ldots \todoin{File Eclipse bug report}
  
  \subsection{The EnclosingInstanceReferenceChecker}
-\todoin{Contains duplicate text from semantic section}
-The purpose of this checker is to verify that the names in a selection are not 
-referencing any enclosing instances. This is for making sure that all references 
-is legal in a method that is to be moved. Theoretically, some situations could 
-be easily solved my passing a reference to the referenced class with the moved 
-method (e.g. when calling public methods), but the dependency on the 
+The purpose of this checker is to verify that the names in a text selection are 
+not referencing any enclosing instances. In theory, the underlying problem could 
+be solved in some situations, but our dependency on the 
  \type{MoveInstanceMethodProcessor} prevents this.
  
  The 
@@ -3041,60 +3068,33 @@ The
  is a modified version of the 
  \typewithref{org.eclipse.jdt.internal.corext.refactoring.structure.MoveInstanceMethod\-Processor}{EnclosingInstanceReferenceFinder} 
  from the \type{MoveInstanceMethodProcessor}. Wherever the 
-\type{EnclosingInstanceReferenceFinder} would create a fatal error status, the 
-checker throws a \type{CheckerException}.
+\type{EnclosingInstanceReferenceFinder} would create a fatal error status, the
+checker will throw a \type{CheckerException}.
  
-It works by first finding all of the enclosing types of a selection. Thereafter 
-it visits all its simple names to check that they are not references to 
-variables or methods declared in any of the enclosing types. In addition the 
-checker visits \var{this}-expressions to verify that no such expressions are 
-qualified with any name.
+The checker works by first finding all of the enclosing types of a selection.  
+Thereafter, it visits all the simple names of the selection to check that they 
+are not references to variables or methods declared in any of the enclosing 
+types. In addition, the checker visits \var{this}-expressions to verify that no 
+such expressions are qualified with any name.
  
  \subsection{The ReturnStatementsChecker}\label{returnStatementsChecker}
-\todoin{Contains duplicate text from semantic section}
-The checker for return statements is meant to verify that if a text selection 
-contains a return statement, then every possible execution path within the 
-selection ends in a return statement. This property is important regarding the 
-\ExtractMethod refactoring. If it holds, it means that a method could be 
-extracted from the selection, and a call to it could be substituted for the 
-selection. If the method has a non-void return type, then a call to it would 
-also be a valid return point for the calling method. If its return value is of 
-the void type, then the \type{ExtractMethodRefactoring} of \name{Eclipse} 
-appends an empty return statement to the back of the method call. Therefore, the 
-analysis does not discriminate on either kinds of return statements, with or 
-without a return value.
-
-The property description implies that if the selection is free from return 
-statements, then the checker validates. So this is the first thing the checker 
-investigates.
+The checker for return statements is meant to verify that a text selection is 
+consistent regarding return statements.
+
+If the selection is free from return statements, then the checker validates.  So 
+this is the first thing the checker investigates.
  
  If the checker proceedes any further, it is because the selection contains one 
  or more return statements. The next test is therefore to check if the last 
-statement of the selection ends in either a return or a throw statement. If the 
-last statement of the selection ends in a return statement, then all execution 
-paths within the selection should end in either this, or another, return 
-statement. This is also true for a throw statement, since it causes an immediate 
-exit from the current block, together with all outer blocks in its control flow 
-that does not catch the thrown exception.
-
-Return statements can be either explicit or implicit. An \emph{explicit} return 
-statement is formed by using the \code{return} keyword, while an \emph{implicit} 
-return statement is a statement that is not formed by the \code{return} keyword, 
-but must be the last statement of a method that can have any side effects.  This 
-can happen in methods with a void return type. An example is a statement that is 
-inside one or more blocks. The last statement of a method could for instance be 
-an if-statement, but the last statement that is executed in the method, and that 
-can have any side effects, may be located inside the block of the else part of 
-the if-statement.
-
-The responsibility for checking that the last statement of the selection 
-eventually ends in a return or throw statement, is put on the 
+statement of the selection ends in either a return or a throw statement. The 
+responsibility for checking that the last statement of the selection eventually 
+ends in a return or throw statement, is put on the 
  \type{LastStatementOfSelectionEndsInReturnOrThrowChecker}. For every node 
-visited, if it is a statement, it does a test to see if the statement is a 
+visited, if the node is a statement, it does a test to see if the statement is a 
  return, a throw or if it is an implicit return statement. If this is the case, 
  no further checking is done. This checking is done in the \code{preVisit2} 
  method \see{astVisitor}. If the node is not of a type that is being handled by 
-its type specific visit method, the checker performs a simple test. If the node 
+its type-specific visit method, the checker performs a simple test. If the node 
  being visited is not the last statement of its parent that is also enclosed by 
  the selection, an \type{IllegalStatementFoundException} is thrown. This ensures 
  that all statements are taken care of, one way or the other. It also ensures 
@@ -3105,41 +3105,37 @@ To examine if a statement is an implicit return statement, the checker first
  finds the last statement declared in its enclosing method. If this statement is 
  the same as the one under investigation, it is considered an implicit return 
  statement. If the statements are not the same, the checker does a search to see 
-if statement examined is also the last statement of the method that can be 
+if the statement examined is also the last statement of the method that can be 
  reached. This includes the last statement of a block statement, a labeled 
  statement, a synchronized statement or a try statement, that in turn is the last 
-statement enclosed by the statement types listed. This search goes through all 
-the parents of a statement until a statement is found that is not one of the 
-mentioned acceptable parent statements. If the search ends in a method 
-declaration, then the statement is considered to be the last reachable statement 
-of the method, and thus also an implicit return statement.
-
-There are two kinds of statements that are handled explicitly. It is 
-if-statements and try-statements. Block, labeled and do-statements are handled 
-by fall-through to the other two. Do-statements are considered equal to blocks 
-in this context, since their bodies are always evaluated at least one time.  If- 
-and try-statements are visited only if they are the last node of their parent 
-within the selection.
-
-For if-statements, the rule is that if the then-part does not contain any return 
-or throw statements, it is considered illegal. If it does contain a return or 
+statement enclosed by one of the statement types listed. This search goes 
+through all the parents of a statement until a statement is found that is not 
+one of the mentioned acceptable parent statements. If the search ends in a 
+method declaration, then the statement is considered to be the last reachable 
+statement of the method, and thus it is an implicit return statement.
+
+There are two kinds of statements that are handled explicitly: If-statements and 
+try-statements. Block, labeled and do-statements are handled by fall-through to 
+the other two.
+
+If-statements are handled explicitly by overriding their type-specific visit 
+method. If the then-part does not contain any return or throw statements an 
+\type{IllegalStatementFoundException} is thrown. If it does contain a return or 
  throw, its else-part is checked. If the else-part is non-existent, or it does 
-not contain any return or throw statements, it is considered illegal. If the 
-statement is not regarded illegal, its children are visited.
+not contain any return or throw statements an exception is thrown. If no 
+exception is thrown while visiting the if-statement, its children are visited.
  
-Try-statements are handled much the same way as if-statements. Its body must 
+A try-statement is checked very similar to an if-statement. Its body must 
  contain a return or throw. The same applies to its catch clauses and finally 
-body. 
+body. Failure to validate produces an \type{IllegalStatementFoundException}.
  
  If the checker does not complain at any point, the selection is considered valid 
  with respect to return statements.
  
  \subsection{The AmbiguousReturnValueChecker}
-\todoin{revisit}
-This checker verifies that there are no \emph{ambiguous return values} in a 
-selection.
+This checker verifies that there are no ambiguous return values in a selection.
  
-First the checker needs to collect some data. Those data are the binding keys 
+First, the checker needs to collect some data. Those data are the binding keys 
  for all simple names that are assigned to within the selection, including 
  variable declarations, but excluding fields. The checker also collects whether 
  there exists a return statement in the selection or not. No further checks of 
@@ -3158,29 +3154,17 @@ is thrown.
  %on\ldots)}
  
  \subsection{The IllegalStatementsChecker}
-\todoin{Contains duplicate text from semantic section}
  This checker is designed to check for illegal statements.
  
-Any use of the \var{super} keyword is prohibited, since its meaning is altered 
-when moving a method to another class.
-
-For a \emph{break} statement, there is two situations to consider: A break 
-statement with or without a label. If the break statement has a label, it is 
-checked that whole of the labeled statement is inside the selection. Since a 
-label does not have any binding information, we have to search upwards in the 
-AST to find the \type{LabeledStatement} that corresponds to the label from the 
-break statement, and check that it is contained in the selection. If the break 
-statement does not have a label attached to it, it is checked that its innermost 
-enclosing loop or switch statement also is inside the selection.
-
-The situation for a \emph{continue} statement is the same as for a break 
-statement, except that it is not allowed inside switch statements.
-
-Regarding \emph{assignments}, two types of assignments is allowed: Assignment to 
-a non-final variable and assignment to an array access. All other assignments is 
-regarded illegal.
+Notice that labels in break and continue statements needs some special 
+treatment. Since a label does not have any binding information, we have to 
+search upwards in the AST to find the \type{LabeledStatement} that corresponds 
+to the label from the break or continue statement, and check that it is 
+contained in the selection. If the break or continue statement does not have a 
+label attached to it, it is checked that its innermost enclosing loop or switch 
+statement (break statements only) also is contained in the selection.
  
-\todoin{Finish\ldots}
+\todoin{Follow the development in the semantics section\ldots}
  
  
  \chapter{Benchmarking}