\section{Code style}

TODO: Conventions, ex.: iterators

As the final system will hopefully have a long life cycle TODO
and will be used and refined by many people, high code quality was an important aim.
Beyond architectural issues, this also involves cleanliness on the lower level,
such as the design of classes and the implementation of methods.
Common software development principles were followed TODO and
the unfamiliar reader was constantly taken into account
to yield clean, usable and readable code.

\subsection{Comments}
\label{comments}
Comments were used in places where ambiguities or misinterpretations could arise,
yet care was taken to tackle such problems at their roots and solve them
wherever possible instead of merely eliminating the ambiguity with comments.

Consider the following method in \file{CLIDatabaseInteraction.java}:
\codepar{public static void promptAbortRetrieveDBSchemaAndWait\\
 \ind(final FutureTask<DBSchema> retriever) throws SQLException}

It could have been called \code{promptAbortRetrieveDBSchema} only, with the
waiting mentioned in a comment.
However, the waiting is such an important part of its behavior that this
would not have been enough, so the waiting was included in the method name.
Since the method is called in one place only, lengthening the method
name by 7 characters, or about 26\,\%, is not a problem.

More generally, ``speaking code'' was used wherever possible,
as described in section \fullref{speaking},
which rendered many uses of comments unnecessary.
In fact, the number of plain (i.e. non-\name{Javadoc}) comments was consciously minimized
to encourage speaking code and avoid redundancy.
This technique is known TODO.

An exception to this, of course, is the highlighting of subdivisions.
In class and method implementations, comments like
\codepar{//********************** Constructors **********************\textbackslash\textbackslash}

were deliberately used to ease navigation inside source files for unfamiliar
readers, but also to enhance readability: independent parts of method
implementations, for example, were visually separated this way.
An alternative would have been to use separate methods for these pieces of
code, as was done in other cases, thereby sticking strictly to the so-called
``Composed Method Pattern'' \cite{composed}.
However, this would have introduced additional artifacts with either long or non-speaking names,
would have interrupted the reading flow and would also have increased complexity,
because these methods would have been callable from at least everywhere in the source file.
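
As a minimal sketch of this convention, the following hypothetical class (\code{DividerCommentSketch} and its method \code{normalize} are illustrative names, not part of \myprog{}) separates two independent steps of one method with divider comments instead of extracting helper methods:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class; not taken from the thesis' code base.
final class DividerCommentSketch {
    // Returns the trimmed, non-empty names in natural order.
    static List<String> normalize(List<String> rawNames) {
        //********************** Trim and filter **********************
        List<String> cleaned = new ArrayList<>();
        for (String raw : rawNames) {
            String trimmed = raw.trim();
            if (!trimmed.isEmpty()) {
                cleaned.add(trimmed);
            }
        }

        //********************** Sort **********************
        Collections.sort(cleaned);
        return cleaned;
    }
}
```

Under the Composed Method alternative, each of the two parts would become a method of its own, visible throughout the class.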

Wherever possible, the appropriate \name{Javadoc} comments were used in preference to
plain comments, for example to specify parameters, return values, exceptions
and links to other parts of the documentation.
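
The following sketch illustrates the kind of \name{Javadoc} comment meant here. The class \code{JavadocSketch} and the method \code{parsePosition} are hypothetical and serve only to show the standard tags \code{@param}, \code{@return}, \code{@throws} and \code{@see}:

```java
// Hypothetical example class; not taken from the thesis' code base.
final class JavadocSketch {
    /**
     * Parses a column position given as text.
     *
     * @param text the text to parse, for example {@code "3"}
     * @return the parsed, non-negative position
     * @throws IllegalArgumentException if {@code text} is not a non-negative integer
     * @see Integer#parseInt(String)
     */
    static int parsePosition(String text) {
        try {
            int position = Integer.parseInt(text.trim());
            if (position < 0) {
                throw new IllegalArgumentException("negative position: " + position);
            }
            return position;
        } catch (NumberFormatException e) {
            // Translate the low-level parsing error into the documented exception type.
            throw new IllegalArgumentException("not an integer: " + text, e);
        }
    }
}
```

Everything a plain comment would have had to state informally (meaning of the parameter, result, failure case) is carried by tags that documentation tools can process.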

\subsection{Speaking code}
\label{speaking}
As mentioned in section \fullref{comments}, the use of ``speaking code'' as
introduced TODO
renders many uses of comments unnecessary.
In particular, the following aspects are commonly considered when referring to
the term ``speaking code'' TODO:

\begin{itemize}
 \item Variable names
 \item Control flow
\end{itemize}

\subsubsection{Variable names}
A very important part of speaking code

\subsection{Robustness against incorrect use}
Care was taken to produce code that is robust against incorrect use, making it
suitable for the expected environment of sporadic updates by unfamiliar and
potentially even inexperienced programmers who very likely focus
on the concepts of bootstrapping rather than on details of the present code.

In fact, carefully avoiding technical artifacts that have to be kept in mind
and thus prevent programmers from focusing on the actual program logic
is an important principle of writing clean code TODO.

In modern programming languages, the main instruments for achieving
this are, of course, the type system and exceptions.
In particular, static type information should be used to reflect data
abstraction and the ``kind'' of data an object represents,
while dynamic type information should only be used implicitly,
through dynamically dispatched method invocations~\cite{str4}.
Exceptions, on the other hand, should be used in any place related to errors
and error handling, separating error handling noticeably from other code and
enforcing the treatment of errors, which in many cases prevents the programmer
from using corrupted information.

An example of both mechanisms, static type information and exceptions, acting
in combination while cleanly fitting into the context of dynamic dispatching,
is given by the following methods from \file{Column.java}:
\codepar{public Boolean isNonNull()\\public Boolean isUnique()}

Their return type is the \name{Java} class \code{Boolean}, not the plain type
\code{boolean}, because the information they return is not always known.
At an early stage of the program, they returned \code{boolean} and were
accompanied TODO by two methods
\code{public boolean knownIsNonNull()} and \code{public boolean knownIsUnique()},
telling the caller whether the respective information was known and thus whether the
value returned by \code{isNonNull()} or \code{isUnique()}, respectively,
was reliable.

They were then changed to return the \name{Java} class \code{Boolean} and to return
\code{null} in case the respective information is not known.
This eliminated any possibility of using unreliable data in favor of generating
exceptions instead, in this case a \code{NullPointerException}, which is thrown
automatically by the \name{Java Runtime Environment} if the programmer forgets the
null check and tries to get a definite value from one of these methods
when the correct value is currently not known.

Since the change, comparing two unknown values -- that is, two null references --
also yields the desired result, \code{true},
even if the programmer forgets that objects are being compared.
However, when comparing two return values of one of the methods in general
-- as opposed to comparing one such return value against a constant --,
errors could occur if the programmer writes \code{col1.isUnique() == col2.isUnique()}
instead of \code{col1.isUnique().booleanValue() == col2.isUnique().booleanValue()}.
\\TODO: Java rules.
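
The behavior described above can be reproduced with a small, self-contained sketch. The classes \code{TriStateColumnSketch} and \code{TriStateDemo} are hypothetical stand-ins, not the actual \code{Column} class. The sketch relies on two language facts: unboxing a \code{null} \code{Boolean} throws a \code{NullPointerException}, and \code{==} applied to two \code{Boolean} operands compares references rather than values. \code{Objects.equals} is shown as one possible null-safe value comparison; it is an assumption of this sketch, not necessarily what the program uses.

```java
import java.util.Objects;

// Hypothetical, reduced stand-in for a column with a possibly unknown property.
final class TriStateColumnSketch {
    // null means "not known", mirroring the convention described in the text.
    private final Boolean unique;

    TriStateColumnSketch(Boolean unique) {
        this.unique = unique;
    }

    Boolean isUnique() {
        return unique;
    }
}

final class TriStateDemo {
    // Returns true if forgetting the null check leads to a NullPointerException.
    static boolean unboxingThrows(TriStateColumnSketch col) {
        try {
            boolean value = col.isUnique(); // implicit call of booleanValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Null-safe comparison by value: true if both are unknown or both hold the same value.
    static boolean sameUniqueness(TriStateColumnSketch a, TriStateColumnSketch b) {
        return Objects.equals(a.isUnique(), b.isUnique());
    }

    public static void main(String[] args) {
        TriStateColumnSketch unknown = new TriStateColumnSketch(null);
        TriStateColumnSketch unique = new TriStateColumnSketch(Boolean.TRUE);
        System.out.println(unboxingThrows(unknown));
        System.out.println(sameUniqueness(unknown, new TriStateColumnSketch(null)));
        System.out.println(sameUniqueness(unique, unknown));
    }
}
```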

TODO: more, summary

\subsection{Classes}
\label{classes}
Following the object-oriented programming paradigm, classes were heavily used
to abstract from implementation details and to yield intuitively usable objects with
a set of useful operations \cite{obj}.

\subsubsection{Identification of classes}
To identify potential classes, entities from the problem domain were -- if reasonable --
directly represented as \name{Java} classes.
The approach of choosing ``the program that most directly models the aspects of the
real world that we are interested in'' to yield clean code,
as described and recommended by Stroustrup \cite{str3}, proved to be extremely useful
and effective.
As a consequence, the code declares classes like \code{Column}, \code{ColumnSet},
\code{ForeignKey}, \code{Table}, \code{TableSchema} and \code{SQLType}.
As described in section \fullref{speaking}, class names were chosen to be concise
but nevertheless expressive TODO.
\name{Java} packages were used to help attain this aim,
which is why the previously mentioned class names are unambiguous
(for details about package use, see section \fullref{packages}; for the description
of the packages in \myprog{} and their structuring, see section \fullref{coarse}).

Care was taken not to introduce unnecessary classes, which would have complicated
the code structure and increased the number of source files and program entities.
In particular, artificial classes with little or no reference to real-world
objects could be avoided in most cases.
On the other hand, of course, avoiding such artificial classes entirely
is usually not the cleanest solution.

\subsubsection{Const correctness}
Specifying in the code which objects may be altered and which shall remain constant,
thus allowing for additional static checks that prevent undesired modifications,
is commonly referred to as ``const correctness'' TODO.

Unfortunately, \name{Java} lacks a keyword like \name{C++}'s \code{const},
making it harder to achieve const correctness.
It only specifies the similar keyword \code{final}, which is much less expressive and
does not allow for similarly effective error prevention \cite{final}.
In particular, because \code{final} is not part of an object's type information,
it is not possible to declare methods that return read-only objects TODO --
placing a \code{final} before the method's return type would declare the
method \code{final}. Similarly, there is no way to express that a method must not change
the state of its object parameters. A method like \code{public void f(final Object obj)}
is only prevented from assigning a new value to its parameter \code{obj} \cite{java}
(which, if allowed, would not affect the caller anyway \cite{java}).
Methods that change the state of \code{obj}, however, may be called on it without
restrictions \cite{java}.
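
The limitation can be seen in a few lines. The class \code{FinalParameterDemo} and its method \code{appendMarker} are hypothetical names chosen for this sketch; \code{List} merely serves as an example of a mutable parameter type:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not taken from the thesis' code base.
final class FinalParameterDemo {
    // 'final' only forbids reassigning the parameter variable itself ...
    static void appendMarker(final List<String> names) {
        // names = new ArrayList<>();  // would not compile: 'names' is final
        names.add("modified");         // ... but state-changing calls are still allowed
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        appendMarker(names);
        // The caller observes the modification despite the 'final' parameter.
        System.out.println(names);
    }
}
```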

Several possibilities were considered to address this problem:

\begin{itemize}
 \item Not implementing const correctness, but stating the access rules in
 comments only
 \item Giving the methods which modify object states special names
 like\\\code{setName\textendash\textendash USE\_WITH\_CARE}
 \item Delegating changes of objects to special ``editor'' objects to be
 obtained when an object shall be altered TODO
 \item Deriving classes offering the modifying methods from the read-only
 classes
\end{itemize}

Not implementing const correctness at all would of course have been the simplest
possibility, producing the shortest and most readable code. However,
careless manipulation of objects could have introduced subtle,
hard-to-spot errors which in many cases would have occurred under additional
conditions only and in other places, for example when inserting a \code{Column}
into a \code{ColumnSet}, so this option was not seriously considered.

Using intentionally awkward, conspicuous names was not seriously considered either,
since it would have cluttered the code for the sole purpose of hopefully warning
programmers of possible errors, without attempting to prevent them technically.

So the introduction of new classes was considered the most effective and cleanest
solution, either in the form of ``editor'' classes or derived classes offering the
modifying methods directly. Again -- as in the identification of classes --,
the most direct solution was considered the best, so the latter form of introducing
additional classes was chosen, and classes like \code{ReadableColumn},
\code{ReadableColumnSet} et cetera were introduced, which offer only the read-only
functionality and usually occur in interfaces.
Their counterparts, which include the modifying methods, were derived from them, and the
implications of modifications were explained in their documentation, while the
issue and the approach as such were also mentioned in the documentation of the
\code{Readable...} classes.
The \code{Readable...} classes can be converted to their fully functional
counterparts via downcasting (only), thereby giving a strong hint to
programmers that the resulting objects are to be used with care.
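
A minimal sketch of this scheme, under the assumption of a single \code{name} attribute, could look as follows. The class names \code{ReadableColumnSketch}, \code{ModifiableColumnSketch} and \code{ConstCorrectnessDemo} are hypothetical; the actual \code{ReadableColumn} and \code{Column} classes are more extensive.

```java
// Read-only base class: offers only non-modifying operations.
class ReadableColumnSketch {
    protected String name;

    ReadableColumnSketch(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Fully functional counterpart, derived from the read-only class.
class ModifiableColumnSketch extends ReadableColumnSketch {
    ModifiableColumnSketch(String name) {
        super(name);
    }

    void setName(String name) {
        this.name = name;
    }
}

final class ConstCorrectnessDemo {
    // Interfaces hand out the read-only type only.
    static ReadableColumnSketch readOnlyView(ModifiableColumnSketch column) {
        return column;
    }

    // Modification requires an explicit, conspicuous downcast.
    static void rename(ReadableColumnSketch view, String newName) {
        ((ModifiableColumnSketch) view).setName(newName);
    }

    public static void main(String[] args) {
        ReadableColumnSketch view = readOnlyView(new ModifiableColumnSketch("id"));
        // view.setName("pk");  // would not compile: no such method on the read-only type
        rename(view, "pk");
        System.out.println(view.getName());
    }
}
```

Unlike \name{C++}'s \code{const}, this does not make modification impossible, but accidental modification through a \code{Readable...} reference is rejected by the compiler.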

\subsubsection{Java interfaces}
In \name{Java} programming, it is quite common and often recommended that every
class has at least one \code{interface} it \code{implements},
specifying the operations the class provides. TODO
If no obvious \code{interface} exists for a class or the desired
interface name is already given to some other entity,
the interface is often given a name like \code{ITableSchema}
or \code{TableSchemaInterface}.

However, for a special-purpose program with a relatively fixed set of classes
mostly representing real-world artifacts from the problem domain,
this approach was considered overly cluttering, introducing artificial
code entities for no benefit.
In particular, as explained in section TODO, all program classes either
stand alone TODO or belong to a class hierarchy derived from at least one
interface.
So, except for the standalone classes, an interface existed anyway, either
``naturally'' (as in the case of \code{Key}, for example) or because of
the chosen way to implement const correctness.
In some cases, these were interfaces declared in the program code, while
in other cases, \name{Java} interfaces like \code{Set} were implemented
(an obvious choice, of course, for \code{ColumnSet}).
Introducing artificial interfaces for the standalone classes was considered
unnecessary at least, if not messy.

\subsection{Packages}
\label{packages}
As mentioned in section \fullref{classes}, class names were chosen to be
concise but nevertheless expressive.
This was only possible through the use of \name{Java} \code{package}s,
which also helped structure the program.

For the current, relatively limited extent of the program, which
comprises $45$ (\code{public}) classes, a flat package structure was
considered ideal, because it is simple and does not bury source files deep
in subdirectories (in \name{Java}, the directory structure of the source tree
is required to reflect the package structure TODO).
Because, in addition, every class belongs to a package,
each source file is to be found exactly one directory below the root
program source directory, which in many cases eases their handling.

The following $11$ packages exist in the program
(their purpose and more details about the package structure are
described in section \fullref{coarse}):

\begin{multicols}{3}\begin{itemize}
 \item \code{boostrapping}
 \item \code{cli}
 \item \code{database}
 \item \code{helpers}
 \item \code{log}
 \item \code{main}
 \item \code{osl}
 \item \code{output}
 \item \code{settings}
 \item \code{specification}
 \item \code{test}
\end{itemize}\end{multicols}

Each package is also documented in the source code, specifically in a file
\file{package-info.java} residing in the respective package directory.
This is a common scheme supported by the \name{Eclipse} IDE as well as the
documentation generation systems \name{javadoc} and \name{doxygen} TODO
(all of which were used in the creation of the program,
as described in section TODO).