program_code.tex

   1 \section{Code style}
   2 \label{code}
   3 TODO: Conventions, ex.: iterators
   4 As the final system will hopefully have a long life cycle TODO
   5 and will be used and refined by many people, high code quality was an important aim.
   6 Beyond architectural issues, this also involves cleanliness at the lower level,
   7 such as the design of classes and the implementation of methods.
   8 Common software development principles were followed TODO and
   9 the unfamiliar reader was kept in mind throughout
  10 to yield clean, usable and readable code.
  11
  12 \subsection{Comments}
  13 \label{comments}
  14 Comments were used in places where ambiguities or misinterpretations could arise,
  15 yet care was taken to tackle such problems at their roots and solve them
  16 wherever possible instead of merely explaining the ambiguity away in comments.
  17
  18 Consider the following method in \file{CLIDatabaseInteraction.java}:
  19 \codepar{public static void promptAbortRetrieveDBSchemaAndWait\\
  20         \ind(final FutureTask<DBSchema> retriever) throws SQLException}
  21
  22 It could have been called simply \code{promptAbortRetrieveDBSchema}, with the
  23 waiting mentioned in a comment.
  24 However, the waiting is such an important part of its behavior that this
  25 would not have been enough, so the waiting was included in the method name.
  26 Since the method is called in one place only, lengthening the method
  27 name by 7 characters, or about 26 \%, is hardly a problem.
  28
  29 More generally, ``speaking code'' was used wherever possible,
  30 as described in section \fullref{speaking},
  31 which rendered many uses of comments unnecessary.
  32 In fact, the number of plain (i.e. non-\name{Javadoc}) comments was consciously minimized
  33 to enforce speaking code and avoid redundancy.
  34 This technique is known TODO.
  35
  36 An exception to this, of course, is the highlighting of subdivisions.
  37 In class and method implementations, comments like
  38 \codepar{//********************** Constructors **********************TODO}
  39
  40 were deliberately used to ease navigation inside source files for unfamiliar
  41 readers, but also to enhance readability: independent parts of method
  42 implementations, for example, were visually separated this way.
  43 An alternative would have been to move these pieces of code into separate
  44 methods, as was done in other cases, but this would have introduced
  45 additional artifacts with either long or non-speaking names.
  46 Additionally, it would have increased complexity, because these methods
  47 would have been callable at least from everywhere in the source file,
  48 and would have interrupted the reading flow.
  49 This technique is known TODO, while TODO
  50
  51 Wherever possible, the appropriate \name{Javadoc} comments were used in preference to
  52 plain comments, for example to specify parameters, return values, exceptions
  53 and links to other parts of the documentation.
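
The following minimal sketch illustrates how subdivision comments and \name{Javadoc} comments can be combined.
The class \code{NameList} and all of its members are hypothetical and chosen for illustration only;
they are not taken from the program.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical example class; not part of the described program.
 */
class NameList {
    //********************** Members **********************
    private final List<String> names = new ArrayList<>();

    //********************** Methods **********************
    /**
     * Adds a name to this list.
     *
     * @param name the name to add; must not be {@code null}
     * @return the number of names stored after the insertion
     * @throws IllegalArgumentException if {@code name} is {@code null}
     * @see #size()
     */
    int add(final String name) {
        if (name == null) {
            throw new IllegalArgumentException("name must not be null");
        }
        names.add(name);
        return names.size();
    }

    /**
     * Returns the number of names currently stored.
     *
     * @return the number of names currently stored
     */
    int size() {
        return names.size();
    }
}
```

The tags \code{@param}, \code{@return}, \code{@throws} and \code{@see} cover parameters, return values, exceptions and links,
so no plain comment is needed for any of them.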
  54
  55 \subsection{Speaking code}
  56 \label{speaking}
  57 As mentioned in section \fullref{comments}, the use of ``speaking code'' as
  58 introduced TODO
  59 renders many uses of comments unnecessary.
  60 In particular, the following aspects are commonly considered when referring to
  61 the term ``speaking code'' TODO:
  62
  63 \begin{itemize}
  64         \item Variable names
  65         \item Control flow
  66 \end{itemize}
  67
  68 \subsubsection{Variable names}
  69 A very important part of speaking code
  70
  71 \subsection{Robustness against incorrect use}
  72 Care was taken to produce code that is robust against incorrect use, making it
  73 suitable for the expected environment of sporadic updates by unfamiliar and
  74 potentially even inexperienced programmers, who will very likely focus
  75 on the concepts of bootstrapping rather than on details of the present code.
  76
  77 In fact, carefully avoiding technical artifacts that programmers must keep in mind,
  78 and that prevent them from focusing on the actual program logic,
  79 is an important principle of writing clean code TODO.
  80
  81 In modern programming languages, the main instruments for achieving
  82 this are, of course, the type system and exceptions.
  83 In particular, static type information should be used to reflect data
  84 abstraction and the ``kind'' of data an object represents,
  85 while dynamic type information should only be used implicitly,
  86 through dynamically dispatched method invocations\cite{str4}.
  87 Exceptions, on the other hand, should be used wherever errors
  88 and error handling are concerned: they visibly separate error handling from other code,
  89 enforce the treatment of errors and in many cases prevent the programmer from using
  90 corrupted information.
  91
  92 An example of both mechanisms, static type information and exceptions, acting
  93 in combination while fitting cleanly into the context of dynamic dispatching
  94 is provided by the following methods from \file{Column.java}:
  95 \codepar{public Boolean isNonNull()\\public Boolean isUnique()}
  96
  97 Their return type is the \name{Java} class \code{Boolean}, not the primitive type
  98 \code{boolean}, because the information they return is not always known.
  99 At an early stage of the program, they returned \code{boolean} and were
 100 accompanied TODO by two methods
 101 \code{public boolean knownIsNonNull()} and \code{public boolean knownIsUnique()},
 102 telling the caller whether the respective information was known and thus whether the
 103 value returned by \code{isNonNull()} or \code{isUnique()}, respectively,
 104 was reliable.
 105
 106 They were then changed to return the \name{Java} class \code{Boolean} and to return
 107 \code{null} if the respective information is not known.
 108 This replaced any possibility of using unreliable data with an
 109 exception, in this case a \code{NullPointerException}, which is thrown
 110 automatically by the \name{Java Runtime Environment} if the programmer forgets the
 111 null check and tries to get a definite value from one of these methods
 112 while the correct value is not known.
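
The behavior described above can be sketched as follows.
\code{TriStateColumn} is a simplified, hypothetical stand-in for \code{Column};
its field, its constructor and the helper methods in \code{TriStateDemo} are assumptions made for illustration.

```java
/** Simplified stand-in for the Column class; field and constructor are assumptions. */
class TriStateColumn {
    private final Boolean unique; // null means "not known"

    TriStateColumn(final Boolean unique) {
        this.unique = unique;
    }

    /** Returns TRUE or FALSE if the information is known, null otherwise. */
    Boolean isUnique() {
        return unique;
    }
}

class TriStateDemo {
    /** Forgets the null check: auto-unboxing throws if the value is unknown. */
    static String describe(final TriStateColumn col) {
        boolean unique = col.isUnique(); // NullPointerException if unknown
        return unique ? "unique" : "not unique";
    }

    /** Handles the unknown case explicitly. */
    static String describeSafely(final TriStateColumn col) {
        final Boolean unique = col.isUnique();
        if (unique == null) {
            return "unknown";
        }
        return unique ? "unique" : "not unique";
    }

    public static void main(String[] args) {
        System.out.println(describeSafely(new TriStateColumn(null)));   // prints "unknown"
        System.out.println(describe(new TriStateColumn(Boolean.TRUE))); // prints "unique"
        try {
            describe(new TriStateColumn(null));
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```

With a primitive \code{boolean} return type, \code{describe} would silently have worked with an unreliable value;
with \code{Boolean}, the missing check surfaces as an exception.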
 113
 114 Since the change, comparing two unknown values -- that is, two \code{null} references --
 115 also yields the desired result, \code{true},
 116 even when the programmer forgets that they are dealing with objects.
 117 However, when comparing two return values of one of the methods in general
 118 -- as opposed to comparing one such return value against a constant --,
 119 errors could occur if the programmer writes \code{col1.isUnique() == col2.isUnique()},
 120 which compares object references, instead of \code{col1.isUnique().booleanValue() == col2.isUnique().booleanValue()}.
 121 \\TODO: Java rules.
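
As a hedged illustration of the comparison issue: applied to two \code{Boolean} operands, \code{==} compares references,
which yields the expected result only as long as both objects are the canonical instances \code{Boolean.TRUE} and
\code{Boolean.FALSE} (as produced by autoboxing and \code{Boolean.valueOf}).
The sketch below shows two value-based alternatives; the class and method names are hypothetical.

```java
import java.util.Objects;

class BooleanCompareDemo {
    /** Null-safe value comparison: two unknown values count as equal. */
    static boolean sameValue(final Boolean a, final Boolean b) {
        return Objects.equals(a, b);
    }

    /** Strict comparison: throws NullPointerException if either value is unknown. */
    static boolean sameKnownValue(final Boolean a, final Boolean b) {
        return a.booleanValue() == b.booleanValue();
    }

    public static void main(String[] args) {
        System.out.println(sameValue(null, null));         // prints "true"
        System.out.println(sameValue(Boolean.TRUE, null)); // prints "false"
        System.out.println(sameKnownValue(Boolean.TRUE, Boolean.valueOf(true))); // prints "true"
    }
}
```

Which of the two variants is appropriate depends on whether two unknown values should be treated as equal or as an error.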
 122
 123 TODO: more, summary
 124
 125 \subsection{Classes}
 126 \label{classes}
 127 Following the object-oriented programming paradigm, classes were heavily used
 128 to abstract from implementation details and to yield intuitively usable objects with
 129 a set of useful operations \cite{obj}.
 130
 131 \subsubsection{Identification of classes}
 132 To identify potential classes, entities from the problem domain were -- if reasonable --
 133 directly represented as \name{Java} classes.
 134 The approach of choosing ``the program that most directly models the aspects of the
 135 real world that we are interested in'' to yield clean code,
 136 as described and recommended by Stroustrup \cite{str3}, proved to be extremely useful
 137 and effective.
 138 As a consequence, the code declares classes like \code{Column}, \code{ColumnSet},
 139 \code{ForeignKey}, \code{Table}, \code{TableSchema} and \code{SQLType}.
 140 As described in section \fullref{speaking}, class names were chosen to be concise
 141 but nevertheless expressive TODO.
 142 \name{Java} packages were used to help attain this aim,
 143 which is why the previously mentioned class names are unambiguous
 144 (for details about package use, see section \fullref{packages}; for the description
 145 of the packages in \myprog{} and their structure, see section \fullref{coarse}).
 146
 147 Care was taken not to introduce unnecessary classes, which would have complicated
 148 the code structure and increased the number of source files and program entities.
 149 In particular, artificial classes with little or no reference to real-world
 150 objects could most often be avoided.
 151 On the other hand, of course, avoiding such artificial classes entirely
 152 is usually not the cleanest solution either.
 153
 154 \subsubsection{Const correctness}
 155 Specifying in the code which objects may be altered and which shall remain constant,
 156 thus allowing for additional static checks preventing undesired modifications,
 157 is commonly referred to as ``const correctness'' TODO.
 158
 159 Unfortunately, \name{Java} lacks a keyword like \name{C++}'s \code{const},
 160 making it harder to achieve const correctness.
 161 It only provides the similar keyword \code{final}, which is much less expressive and
 162 does not allow for similarly effective error prevention \cite{final}.
 163 In particular, because \code{final} is not part of an object's type information,
 164 it is not possible to declare methods that return read-only objects TODO --
 165 placing a \code{final} before the method's return type would declare the
 166 method \code{final}. Similarly, there is no way to express that a method must not change
 167 the state of its object parameters. A method like \code{public void f(final Object obj)}
 168 is only prevented from assigning a new value to its parameter \code{obj} \cite{java}
 169 (which, if allowed, would not affect the caller anyway \cite{java}).
 170 Methods that change the state of \code{obj}, however, may be called on it without
 171 restrictions \cite{java}.
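
A minimal sketch of this limitation, using a hypothetical method \code{appendMarker} that is not part of the program:
the \code{final} modifier forbids reassigning the parameter but does not protect the object it refers to.

```java
import java.util.ArrayList;
import java.util.List;

class FinalParameterDemo {
    /** final only forbids reassigning the parameter; the object itself can be changed. */
    static void appendMarker(final List<String> names) {
        // names = new ArrayList<>();  // would not compile: final parameter cannot be assigned
        names.add("modified");         // compiles: changes the caller's object
    }

    public static void main(String[] args) {
        final List<String> names = new ArrayList<>();
        names.add("id");
        appendMarker(names);
        System.out.println(names); // prints "[id, modified]"
    }
}
```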
 172
 173 Several possibilities were considered to address this problem:
 174 \begin{itemize}
 175         \item Not implementing const correctness, but stating the access rules in
 176         comments only
 177         \item Giving the methods which modify object states special names
 178         like\\\code{setName\textendash\textendash USE\_WITH\_CARE}
 179         \item Delegating changes of objects to special ``editor'' objects to be
 180         obtained when an object shall be altered TODO
 181         \item Deriving classes offering the modifying methods from the read-only
 182         classes
 183 \end{itemize}
 184
 185 Not implementing const correctness at all would, of course, have been the simplest
 186 possibility, producing the shortest and most readable code. However,
 187 careless manipulation of objects could have introduced subtle,
 188 hard-to-spot errors, which in many cases would have surfaced only under additional
 189 conditions and in other places, for example when inserting a \code{Column}
 190 into a \code{ColumnSet}. This option was therefore not seriously considered.
 191
 192 Using intentionally awkward, conspicuous names was not seriously considered either,
 193 since it would have cluttered the code for the sole purpose of warning
 194 programmers of possible errors, without attempting to prevent them technically.
 195
 196 So the introduction of new classes was considered the most effective and cleanest
 197 solution, either in the form of ``editor'' classes or derived classes offering the
 198 modifying methods directly. Again -- as in the identification of classes --,
 199 the most direct solution was considered the best, so the latter form of introducing
 200 additional classes was chosen, and classes like \code{ReadableColumn},
 201 \code{ReadableColumnSet} et cetera were introduced, which offer only the read-only
 202 functionality and usually occur in interfaces.
 203 Their counterparts, which include the modifying methods, were derived from them and the
 204 implications of modifications were explained in their documentation, while the
 205 issue and the approach as such were also mentioned in the documentation of the
 206 \code{Readable...} classes.
 207 The \code{Readable...} classes can be converted to their fully functional
 208 counterparts only via downcasting, thereby giving a strong hint to
 209 programmers that the resulting objects are to be used with care.
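
The chosen approach can be sketched as follows.
Only the class names \code{ReadableColumn} and \code{Column} come from the program;
the members shown here (\code{getName}, \code{setName}, the \code{name} field) and the class \code{ReadableDemo}
are assumptions made for illustration, not the actual implementation.

```java
/** Read-only class; members are assumptions chosen for illustration. */
class ReadableColumn {
    protected String name;

    ReadableColumn(final String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

/** Fully functional counterpart, derived from the read-only class. */
class Column extends ReadableColumn {
    Column(final String name) {
        super(name);
    }

    void setName(final String name) {
        this.name = name;
    }
}

class ReadableDemo {
    /** Receives the read-only type, so modifying methods are not visible here. */
    static String describe(final ReadableColumn col) {
        // col.setName("x");  // would not compile: ReadableColumn has no setName
        return "column " + col.getName();
    }

    /** The explicit downcast marks the spot where modification becomes possible. */
    static void rename(final ReadableColumn col, final String newName) {
        ((Column) col).setName(newName);
    }

    public static void main(String[] args) {
        final ReadableColumn col = new Column("id");
        System.out.println(describe(col)); // prints "column id"
        rename(col, "key");
        System.out.println(describe(col)); // prints "column key"
    }
}
```

Unlike \name{C++}'s \code{const}, this is not a hard guarantee, since the downcast remains possible;
it does, however, make every modification through a read-only reference explicit in the source code.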
 210
 211 \subsubsection{Java interfaces}
 212 In \name{Java} programming, it is quite common and often recommended that every
 213 class has at least one \code{interface} it \code{implements},
 214 specifying the operations the class provides. TODO
 215 If no obvious \code{interface} exists for a class or the desired
 216 interface name is already taken by some other entity,
 217 the interface is often given a name like \code{ITableSchema}
 218 or \code{TableSchemaInterface}.
 219
 220 However, for a special-purpose program with a relatively fixed set of classes
 221 mostly representing real-world artifacts from the problem domain,
 222 this approach was considered overly cluttering, introducing artificial
 223 code entities for no benefit.
 224 In particular, as explained in section TODO, all program classes either
 225 stand alone TODO or belong to a class hierarchy derived from at least one
 226 interface.
 227 So, except for the standalone classes, an interface existed anyway, either
 228 ``naturally'' (as in the case of \code{Key}, for example) or because of
 229 the chosen way to implement const correctness.
 230 In some cases, these were interfaces declared in the program code, while
 231 in others, \name{Java} interfaces like \code{Set} were implemented
 232 (an obvious choice, of course, for \code{ColumnSet}).
 233 Introducing artificial interfaces for the standalone classes was considered
 234 unnecessary at best, if not messy.
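
To illustrate how an existing \name{Java} interface can take the place of an artificial one, the following sketch
implements \code{Set} via \code{AbstractSet}.
It is an assumption-laden simplification: columns are represented by plain strings, and the actual
\code{ColumnSet} in the program may be built quite differently.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

/** Sketch only: columns are represented by their names here. */
class ColumnSet extends AbstractSet<String> {
    private final Set<String> columns = new LinkedHashSet<>();

    @Override
    public boolean add(final String column) {
        return columns.add(column);
    }

    @Override
    public Iterator<String> iterator() {
        return columns.iterator();
    }

    @Override
    public int size() {
        return columns.size();
    }
}

class ColumnSetDemo {
    /** Works with any Set, so no artificial interface is needed for ColumnSet. */
    static int countColumns(final Set<String> columns) {
        return columns.size();
    }

    public static void main(String[] args) {
        final ColumnSet key = new ColumnSet();
        key.add("id");
        key.add("id");   // duplicate, ignored
        key.add("name");
        System.out.println(countColumns(key)); // prints "2"
    }
}
```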
 235
 236 \subsection{Packages}
 237 \label{packages}
 238 As mentioned in section \fullref{classes}, class names were chosen to be
 239 concise but nevertheless expressive.
 240 This was only possible through the use of \name{Java} \code{package}s,
 241 which also helped structure the program.
 242
 243 For the current, relatively limited extent of the program, which
 244 comprises $45$ (\code{public}) classes, a flat package structure was
 245 considered ideal, because it is simple and does not bury source files deep
 246 in subdirectories (in \name{Java}, the directory structure of the source tree
 247 is required to reflect the package structure TODO).
 248 Because every class also belongs to a package,
 249 each source file is found exactly one directory below the root
 250 program source directory, which in many cases eases its handling.
 251
 252 The following $11$ packages exist in the program
 253 (their purpose and more details about the package structure are
 254 described in section \fullref{coarse}):
 255 \begin{multicols}{3}\begin{itemize}
 256         \item \code{boostrapping}
 257         \item \code{cli}
 258         \item \code{database}
 259         \item \code{helpers}
 260         \item \code{log}
 261         \item \code{main}
 262         \item \code{osl}
 263         \item \code{output}
 264         \item \code{settings}
 265         \item \code{specification}
 266         \item \code{test}
 267 \end{itemize}\end{multicols}
 268
 269 Each package is also documented in the source code, namely in a file
 270 \file{package-info.java} residing in the respective package directory.
 271 This is a common scheme supported by the \name{Eclipse} IDE as well as the
 272 documentation generation systems \name{javadoc} and \name{doxygen} TODO
 273 (all of which were used in the creation of the program,
 274 as described in section TODO).
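
A \file{package-info.java} file consists of a \name{Javadoc} comment followed by the package declaration.
The sketch below uses the \code{database} package as an example;
the wording of the description is an assumption and not taken from the program.

```java
/**
 * Classes representing database entities.
 * (The wording of this description is an assumption made for illustration.)
 */
package database;
```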