\section{Code style}

TODO: Conventions, ex.: iterators

As the final system will hopefully have a long life cycle TODO
and will be used and refined by many people, high code quality was an important aim.
Beyond architectural issues, this also involves cleanness on the lower level,
such as the design of classes and the implementation of methods.
Common software development principles were followed TODO and
the unfamiliar reader was constantly taken into account
to yield clean, usable and readable code.

Comments were used at places where ambiguities or misinterpretations could arise,
yet care was taken to face such problems at their roots and solve them
wherever possible instead of just eliminating the ambiguity with comments.

Consider the following method in \file{}:
\codepar{public static void promptAbortRetrieveDBSchemaAndWait\\
 \ind(final FutureTask<DBSchema> retriever) throws SQLException}

It could have been called \code{promptAbortRetrieveDBSchema} only, with the
waiting mentioned in a comment.
However, the waiting is such an important part of its behavior that this
would not have been enough, so the waiting was included in the method name.
Since the method is called at one place only, lengthening the method
name by 7 characters, or about 26 \%, is not a problem.

More generally, ``speaking code'' was used wherever possible,
as described in section \fullref{speaking},
which rendered many uses of comments unnecessary.
In fact, the number of plain (i.e. non-\name{Javadoc}) comments was consciously minimized
to enforce speaking code and avoid redundancy.
This technique is known TODO.

An exception to this, of course, is the highlighting of subdivisions.
In class and method implementations, comments like
\codepar{//********************** Constructors **********************TODO}

were deliberately used to ease navigation inside source files for unfamiliar
readers, but also to enhance readability: independent parts of method
implementations, for example, were visually separated this way.
An alternative would have been to use separate methods for these code
pieces, as was done in other cases, but this would have introduced
additional artifacts with either long or non-speaking names.
Additionally, it would have increased complexity, because these methods
would have been callable at least from everywhere in the source file,
and would have interrupted the reading flow.
This technique is known TODO, while TODO

Wherever possible, the appropriate \name{Javadoc} comments were used in preference to
plain comments, for example to specify parameters, return values, exceptions
and links to other parts of the documentation.

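
The following is a minimal sketch of the kind of \name{Javadoc} comment meant here, covering a parameter, a return value, an exception and links to other documented members. The class \code{ExampleSchema} and its methods are hypothetical and serve illustration only; they are not taken from the program.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a schema class; not part of the actual program. */
class ExampleSchema {
    private final List<String> tableNames = new ArrayList<>();

    /**
     * Registers a table under the given name.
     *
     * @param name the name of the table to add, must not be {@code null} or empty
     * @return the number of tables known after the insertion
     * @throws IllegalArgumentException if {@code name} is {@code null} or empty
     * @see #tableCount()
     */
    int addTable(final String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("table name must not be empty");
        }
        tableNames.add(name);
        return tableCount();
    }

    /**
     * Returns the number of registered tables.
     *
     * @return the current table count, as also returned by {@link #addTable(String)}
     */
    int tableCount() {
        return tableNames.size();
    }
}
```

With comments of this form, the contract of \code{addTable} is visible in the generated documentation, so no additional plain comment is needed at the call site.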
\subsection{Speaking code}

As mentioned in section \fullref{comments}, the use of ``speaking code'' as
introduced TODO
renders many uses of comments unnecessary.
In particular, the following aspects are commonly considered when referring to
the term ``speaking code'' TODO:

\begin{itemize}
 \item Variable names
 \item Control flow
\end{itemize}

\subsubsection{Variable names}
A very important part of speaking code

\subsection{Robustness against incorrect use}
Care was taken to produce code that is robust against incorrect use, making it
suitable for the expected environment of sporadic updates by unfamiliar and
potentially even inexperienced programmers who very likely focus
on the concepts of bootstrapping rather than on details of the present code.

In fact, carefully avoiding the introduction of technical artifacts that programmers
have to keep in mind, and that prevent them from focusing on the actual program logic,
is an important principle of writing clean code TODO.

In modern programming languages, the main instruments for achieving
this are, of course, the type system and exceptions.
In particular, static type information should be used to reflect data
abstraction and the ``kind'' of data an object reflects,
while dynamic type information should only be used implicitly,
through dynamically dispatched method invocations \cite{str4}.
Exceptions, on the other hand, should be used at any place related to errors
and error handling, separating error handling noticeably from other code and
enforcing the treatment of errors, which in many cases prevents the programmer
from using corrupted information.

An example of both mechanisms, static type information and exceptions, acting
in combination, while cleanly fitting into the context of dynamic dispatching,
is given by the following methods from \file{}:
\codepar{public Boolean isNonNull()\\public Boolean isUnique()}

Their return type is the \name{Java} class \code{Boolean}, not the primitive type
\code{boolean}, because the information they return is not always known.
In an early stage of the program, they returned \code{boolean} and were
accompanied TODO by two methods
\code{public boolean knownIsNonNull()} and \code{public boolean knownIsUnique()},
telling the caller whether the respective information was known and thus whether the
value returned by \code{isNonNull()} or \code{isUnique()}, respectively,
was reliable.

They were then changed to return the \name{Java} class \code{Boolean} and to return
\code{null} in case the respective information is not known.
This eliminated any possibility of using unreliable data in favor of generating
exceptions instead, in this case a \code{NullPointerException}, which is thrown
automatically by the \name{Java Runtime Environment} if the programmer forgets the
null check and tries to get a definite value from one of these methods
while the correct value is not known.

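
The effect can be sketched as follows. The class \code{ExampleColumn} and its helper methods are hypothetical and serve illustration only; just the convention of returning \code{null} for unknown information is taken from the description above.

```java
/** Hypothetical column stand-in; only the return convention mirrors the text. */
class ExampleColumn {
    private final Boolean unique; // null means "not known"

    ExampleColumn(final Boolean unique) {
        this.unique = unique;
    }

    /** Returns the uniqueness if it is known, null otherwise. */
    Boolean isUnique() {
        return unique;
    }

    /** Careful caller: checks for null before using the value. */
    static String describe(final ExampleColumn col) {
        final Boolean unique = col.isUnique();
        if (unique == null) {
            return "unknown";
        }
        return unique ? "unique" : "not unique";
    }

    /** Careless caller: unboxing a null Boolean throws NullPointerException. */
    static boolean carelessIsUnique(final ExampleColumn col) {
        final boolean unique = col.isUnique(); // implicit unboxing
        return unique;
    }
}
```

A caller that forgets the null check therefore fails loudly at the point of unboxing instead of silently continuing with an unreliable value.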
Since the change, comparing two unknown values, that is, two null pointers,
also yields the desired result, \code{true},
even if the programmer forgets that objects are being compared.
However, when comparing two return values of one of the methods in general
(as opposed to comparing one such return value against a constant),
errors could occur if the programmer writes \code{col1.isUnique() == col2.isUnique()}
instead of \code{col1.isUnique().booleanValue() == col2.isUnique().booleanValue()},
because the former compares object references rather than values.
\\TODO: Java rules.
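
As a hedged illustration of the two kinds of comparison, the following sketch contrasts a null-safe comparison with a comparison of definite values. The class \code{BooleanComparison} and its method names are hypothetical; \code{Objects.equals} is the standard method from \code{java.util}.

```java
import java.util.Objects;

/** Hypothetical helper contrasting two ways to compare Boolean results. */
class BooleanComparison {
    /** Null-safe comparison: two unknown (null) values count as equal. */
    static boolean sameKnowledge(final Boolean a, final Boolean b) {
        return Objects.equals(a, b);
    }

    /** Compares two definite values; throws NullPointerException if one is unknown. */
    static boolean sameValue(final Boolean a, final Boolean b) {
        return a.booleanValue() == b.booleanValue();
    }
}
```

Both variants compare values instead of object references, so they do not depend on whether two \code{Boolean} objects happen to be the same instance.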

TODO: more, summary

Following the object-oriented programming paradigm, classes were heavily used
to abstract from implementation details and to yield intuitively usable objects with
a set of useful operations \cite{obj}.

\subsubsection{Identification of classes}
To identify potential classes, entities from the problem domain were, if reasonable,
directly represented as \name{Java} classes.
The approach of choosing ``the program that most directly models the aspects of the
real world that we are interested in'' to yield clean code,
as described and recommended by Stroustrup \cite{str3}, proved to be extremely useful
and effective.
As a consequence, the code declares classes like \code{Column}, \code{ColumnSet},
\code{ForeignKey}, \code{Table}, \code{TableSchema} and \code{SQLType}.
As described in section \fullref{speaking}, class names were chosen to be concise
but nevertheless expressive TODO.
\name{Java} packages were used to help attain this aim,
which is why the previously mentioned class names are unambiguous
(for details about package use, see section \fullref{packages}; for the description
of the packages in \myprog{} and their structuring, see section \fullref{coarse}).

Care was taken not to introduce unnecessary classes, which would have complicated the
code structure and increased the number of source files and program entities.
In particular, artificial classes having little or no reference to real-world
objects could most often be avoided.
On the other hand, of course, avoiding such artificial classes entirely
is usually not the cleanest solution either.

\subsubsection{Const correctness}
Specifying in the code which objects may be altered and which shall remain constant,
thus allowing for additional static checks preventing undesired modifications,
is commonly referred to as ``const correctness'' TODO.

Unfortunately, \name{Java} lacks a keyword like \name{C++}'s \code{const},
making it harder to achieve const correctness.
It only provides the similar keyword \code{final}, which is much less expressive and
does not allow for similarly effective error prevention \cite{final}.
In particular, because \code{final} is not part of an object's type information,
it is not possible to declare methods that return read-only objects TODO;
placing a \code{final} before the method's return type would declare the
method \code{final}. Similarly, there is no way to express that a method must not change
the state of its object parameters. A method like \code{public void f(final Object obj)}
is only prevented from assigning a new value to its parameter \code{obj} \cite{java}
(which, if allowed, would not affect the caller anyway \cite{java}).
Methods changing the state of \code{obj}, however, may be called on it without
restrictions \cite{java}.

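
The limitation of \code{final} can be seen in a small sketch. The class \code{FinalDemo} and its method are hypothetical and not part of the program.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demonstration that final protects the reference, not the object. */
class FinalDemo {
    static void appendMarker(final List<String> names) {
        // names = new ArrayList<>();  // would not compile: names is final
        names.add("modified");          // compiles: the object itself is not protected
    }
}
```

After calling \code{appendMarker} on a list, the caller's list contains the additional element, although the parameter was declared \code{final}.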
Several possibilities were considered to address this problem:

\begin{itemize}
 \item Not implementing const correctness, but stating the access rules in
 comments only
 \item Giving the methods which modify object states special names
 like\\\code{setName\textendash\textendash USE\_WITH\_CARE}
 \item Delegating changes of objects to special ``editor'' objects to be
 obtained when an object shall be altered TODO
 \item Deriving classes offering the modifying methods from the read-only
 classes
\end{itemize}

Not implementing const correctness at all would, of course, have been the simplest
possibility, producing the shortest and most readable code. However,
incautious manipulation of objects could have introduced subtle,
hard-to-spot errors, which in many cases would have occurred under additional
conditions only and at other places, for example when inserting a \code{Column}
into a \code{ColumnSet}. This option was therefore not seriously considered.

Using intentionally awkward, conspicuous names was not considered seriously either,
since it would have cluttered the code for the sole purpose of warning
programmers of possible errors, without attempting to prevent them technically.

So the introduction of new classes was considered the most effective and cleanest
solution, either in the form of ``editor'' classes or derived classes offering the
modifying methods directly. Again, as in the identification of classes,
the most direct solution was considered the best, so the latter form of introducing
additional classes was chosen, and classes like \code{ReadableColumn},
\code{ReadableColumnSet} et cetera were introduced, which offer only the read-only
functionality and usually occur in interfaces.
Their counterparts, which add the modifying methods, were derived from them, and the
implications of modifications were explained in their documentation, while the
issue and the approach as such were also mentioned in the documentation of the
\code{Readable...} classes.
Objects of the \code{Readable...} classes can be converted to their fully functional
counterparts via downcasting only, thereby giving a strong hint to
programmers that the resulting objects are to be used with care.

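
The chosen scheme can be sketched as follows. All names in this sketch (\code{ReadableItem}, \code{Item}, \code{ItemRegistry}) and their members are hypothetical and only imitate the relationship between the \code{Readable...} classes and their counterparts; the actual classes of the program may differ in detail.

```java
/** Read-only type; hypothetical stand-in for a class like ReadableColumn. */
class ReadableItem {
    protected String name;

    ReadableItem(final String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

/** Fully functional counterpart, adding the modifying method. */
class Item extends ReadableItem {
    Item(final String name) {
        super(name);
    }

    void setName(final String name) {
        this.name = name;
    }
}

/** Hypothetical owner whose interface hands out the read-only type only. */
class ItemRegistry {
    private final Item item = new Item("id");

    ReadableItem getItem() {
        return item;
    }
}
```

A caller receiving a \code{ReadableItem} cannot call \code{setName} on it directly; a modification requires an explicit downcast such as \code{((Item) registry.getItem()).setName("key")}, which makes the intent visible in the code.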
\subsubsection{Java interfaces}
In \name{Java} programming, it is quite common and often recommended that every
class has at least one \code{interface} it \code{implements},
specifying the operations the class provides. TODO
If no obvious \code{interface} exists for a class or the desired
interface name is already given to some other entity,
the interface is often given a name like \code{ITableSchema}
or \code{TableSchemaInterface}.

However, for a special-purpose program with a relatively fixed set of classes
mostly representing real-world artifacts from the problem domain,
this approach was considered overly cluttering, introducing artificial
code entities for no benefit.
In particular, as explained in section TODO, all program classes either
stand alone TODO or belong to a class hierarchy derived from at least one
interface.
So, except for the standalone classes, an interface existed anyway, either
``naturally'' (as in the case of \code{Key}, for example) or because of
the chosen way to implement const correctness.
In some cases, these were interfaces declared in the program code, while
in other cases, \name{Java} interfaces like \code{Set} were implemented
(an obvious choice, of course, for \code{ColumnSet}).
Introducing artificial interfaces for the standalone classes was considered
unnecessary at least, if not messy.

As mentioned in section \fullref{classes}, class names were chosen to be
concise but nevertheless expressive.
This was only possible through the use of \name{Java} \code{package}s,
which also helped structure the program.

For the current, relatively limited extent of the program, which
comprises $45$ (\code{public}) classes, a flat package structure was
considered ideal, because it is simple and does not stash source files deep
in subdirectories (in \name{Java}, the directory structure of the source tree
is required to reflect the package structure TODO).
Because, in addition, every class belongs to a package,
each source file is found exactly one directory below the root
program source directory, which in many cases eases their handling.

The following $11$ packages exist in the program
(their purpose and more details about the package structure are
described in section \fullref{coarse}):

\begin{multicols}{3}\begin{itemize}
 \item \code{bootstrapping}
 \item \code{cli}
 \item \code{database}
 \item \code{helpers}
 \item \code{log}
 \item \code{main}
 \item \code{osl}
 \item \code{output}
 \item \code{settings}
 \item \code{specification}
 \item \code{test}
\end{itemize}\end{multicols}

Each package is also documented in the source code, namely in a file
\file{} residing in the respective package directory.
This is a common scheme supported by the \name{Eclipse} IDE as well as the
documentation generation systems \name{javadoc} and \name{doxygen} TODO
(all of which were used in the creation of the program,
as described in section TODO).