Commit | Line | Data |
---|---|---|
c31df1ed PM |
1 | \section{Code style} |
2 | \label{code} | |
3 | TODO: Conventions, ex.: iterators | |
4 | As the final system will hopefully have a long life cycle TODO | |
5 | and will be used and refined by many people, high code quality was an important aim. | |
6 | Beyond architectural issues this also involves cleanness on the lower level, | |
7 | like the design of classes and the implementation of methods. | |
8 | Common software development principles were followed TODO and | |
9 | the unfamiliar reader was constantly taken into account | |
10 | to yield clean, usable and readable code. | |
11 | ||
12 | \subsection{Comments} | |
13 | \label{comments} | |
14 | Comments were used in places where ambiguities or misinterpretations could arise, | |
15 | yet care was taken to face such problems at their roots and solve them | |
16 | wherever possible instead of just eliminating the ambiguity with comments. | |
17 | ||
18 | Consider the following method in \file{CLIDatabaseInteraction.java}: | |
19 | \codepar{public static void promptAbortRetrieveDBSchemaAndWait\\ | |
20 | \ind(final FutureTask<DBSchema> retriever) throws SQLException} | |
21 | ||
22 | It could have been called \code{promptAbortRetrieveDBSchema} only, with the | |
23 | waiting mentioned in a comment. | |
24 | However, the waiting is such an important part of its behavior that this | |
25 | wouldn't have been enough, so the waiting was included in the function name. | |
26 | Since the method is called at one place only, the lengthening of the method | |
27 | name by 7 characters or about 26 \% is really not a problem. | |
28 | ||
29 | More generally, ``speaking code'' was used wherever possible, | |
30 | as described in section \fullref{speaking}, | |
31 | which rendered many uses of comments unnecessary. | |
28b54c67 | 32 | In fact, the number of (plain, i.e. non-\name{Javadoc}) comments was consciously minimized, |
c31df1ed PM |
33 | to enforce speaking code and avoid redundancy. |
34 | This technique is known TODO. | |
35 | ||
36 | An exception to this, of course, is the highlighting of subdivisions. | |
37 | In class and method implementations, comments like | |
38 | \codepar{//********************** Constructors **********************TODO} | |
39 | ||
40 | were deliberately used to ease navigation inside source files for unfamiliar | |
41 | readers, but also to enhance readability: independent parts of method | |
42 | implementations, for example, were visually separated this way. | |
43 | An alternative would have been to use separate methods for these code | |
44 | pieces, as was done in other cases, but this would then have introduced | |
45 | additional artifacts with either long or non-speaking names. | |
46 | Additionally, it would have increased complexity, because these methods | |
47 | would have been callable at least from everywhere in the source file, | |
48 | and would have interrupted the reading flow. | |
49 | This technique is known TODO, while TODO | |
50 | ||
28b54c67 | 51 | Wherever possible, the appropriate \name{Javadoc} comments were used in favor of |
c31df1ed PM |
52 | plain comments, for example to specify parameters, return types, exceptions |
53 | and links to other parts of the documentation. | |
54 | ||
55 | \subsection{Speaking code} | |
56 | \label{speaking} | |
57 | As mentioned in section \fullref{comments}, the use of ``speaking code'' as | |
58 | introduced TODO | |
59 | renders many uses of comments unnecessary. | |
60 | In particular, the following aspects are commonly considered when referring to | |
61 | the term ``speaking code'' TODO: | |
62 | ||
63 | \begin{itemize} | |
64 | \item Variable names | |
65 | \item Control flow | |
66 | \end{itemize} | |
67 | ||
68 | \subsubsection{Variable names} | |
69 | A very important part of speaking code | |
70 | ||
71 | \subsection{Robustness against incorrect use} | |
72 | Care was taken to produce code that is robust to incorrect use, making it | |
73 | suitable for the expected environment of sporadic updates by unfamiliar and | |
74 | potentially even unpracticed programmers who very likely have their emphasis | |
75 | on the concepts of bootstrapping rather than details of the present code. | |
76 | ||
77 | In fact, carefully avoiding the introduction of technical artifacts that must be kept in mind | |
78 | and that prevent programmers from focusing on the actual program logic | |
79 | is an important principle of writing clean code TODO. | |
80 | ||
81 | In modern programming languages, of course the main instruments for achieving | |
82 | this are the type system and exceptions. | |
83 | In particular, static type information should be used to reflect data | |
84 | abstraction and the ``kind'' of data an object reflects, | |
85 | while dynamic type information should only be used implicitly, | |
86 | through dynamically dispatching method invocations\cite{str4}. | |
87 | Exceptions on the other hand should be used at any place related to errors | |
88 | and error handling, separating error handling noticeably from other code and | |
89 | enforcing the treatment of errors, preventing the programmer from using | |
90 | corrupted information in many cases. | |
91 | ||
92 | An example of both mechanisms, static type information and exceptions, acting | |
93 | in combination, while cleanly fitting into the context of dynamic dispatching, | |
94 | is given by the following methods from \file{Column.java}: | |
95 | \codepar{public Boolean isNonNull()\\public Boolean isUnique()} | |
96 | ||
28b54c67 | 97 | Their return type is the \name{Java} class \code{Boolean}, not the plain type |
c31df1ed PM |
98 | \code{boolean}, because the information they return is not always known. |
99 | In an early stage of the program, they returned \code{boolean} and were | |
100 | accompanied TODO by two methods | |
101 | \code{public boolean knownIsNonNull()} and \code{public boolean knownIsUnique()}, | |
102 | telling the caller whether the respective information was known and thus the | |
103 | value returned by \code{isNonNull()} or \code{isUnique()}, respectively, | |
104 | was reliable. | |
105 | ||
28b54c67 | 106 | They were then changed to return the \name{Java} class \code{Boolean} and to return |
c31df1ed PM |
107 | null pointers in case the respective information is not known. |
108 | This eliminated any possibility of using unreliable data in favor of generating | |
109 | exceptions instead, in this case a \code{NullPointerException}, which is thrown | |
28b54c67 | 110 | automatically by the \name{Java Runtime Environment} if the programmer forgets the |
c31df1ed PM |
111 | null check and tries to get a definite value from one of these methods |
112 | when the correct value currently is not known. | |
113 | ||
114 | Since the change, comparing two unknown values -- that is, two null pointers -- | |
115 | also yields the desired result, \code{true}, | |
116 | even when the programmer forgets that objects are being compared. | |
117 | However, when comparing two return values of one of the methods in general | |
118 | -- as opposed to comparing one such return value against a constant --, | |
119 | errors could occur if the programmer writes \code{col1.isUnique() == col2.isUnique()} | |
120 | instead of \code{col1.isUnique().booleanValue() == col2.isUnique().booleanValue()}. | |
121 | \\TODO: Java rules. | |
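One way to sidestep the reference-versus-value question when comparing two such return values is a null-safe value comparison. The sketch below is an assumption about how this could be done, not something taken from the program: \code{sameKnowledge} is a hypothetical helper built on the standard \code{java.util.Objects.equals}, which treats two \code{null} references as equal and otherwise delegates to \code{Boolean.equals}.

```java
import java.util.Objects;

public class BooleanComparisonDemo {
    /** Hypothetical helper: value comparison in which two unknown (null) flags count as equal. */
    static boolean sameKnowledge(Boolean a, Boolean b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(sameKnowledge(null, null));                          // prints true
        System.out.println(sameKnowledge(Boolean.TRUE, null));                  // prints false
        System.out.println(sameKnowledge(Boolean.TRUE, Boolean.valueOf(true))); // prints true
        System.out.println(sameKnowledge(Boolean.TRUE, Boolean.FALSE));         // prints false
    }
}
```

Unlike the \code{booleanValue()} variant, this comparison does not throw when one of the values is unknown; whether that is desirable depends on the caller.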
122 | ||
123 | TODO: more, summary | |
124 | ||
125 | \subsection{Classes} | |
126 | \label{classes} | |
127 | Following the object-oriented programming paradigm, classes were heavily used | |
128 | to abstract from implementation details and to yield intuitively usable objects with | |
129 | a set of useful operations \cite{obj}. | |
130 | ||
131 | \subsubsection{Identification of classes} | |
132 | To identify potential classes, entities from the problem domain were -- if reasonable -- | |
28b54c67 | 133 | directly represented as \name{Java} classes. |
c31df1ed PM |
134 | The approach of choosing ``the program that most directly models the aspects of the |
135 | real world that we are interested in'' to yield clean code, | |
136 | as described and recommended by Stroustrup \cite{str3}, proved to be extremely useful | |
137 | and effective. | |
138 | As a consequence, the code declares classes like \code{Column}, \code{ColumnSet}, | |
139 | \code{ForeignKey}, \code{Table}, \code{TableSchema} and \code{SQLType}. | |
140 | As described in section \fullref{speaking}, class names were chosen to be concise | |
141 | but nevertheless expressive TODO. | |
28b54c67 PM |
142 | \name{Java} packages were used to help attain this aim, |
143 | which is why the previously mentioned class names are unambiguous | |
144 | (for details about package use, see section \fullref{packages}, for the description | |
145 | of the packages in \myprog{} and their structuring, see section \fullref{coarse}). | |
c31df1ed PM |
146 | |
147 | Care was taken not to introduce unnecessary classes, thereby complicating | |
148 | code structure and increasing the number of source files and program entities. | |
149 | Especially artificial classes, having little or no reference to real-world | |
150 | objects, could most often be avoided. | |
151 | On the other hand of course, it usually is not the cleanest solution | |
152 | to avoid such artificial classes entirely. | |
153 | ||
154 | \subsubsection{Const correctness} | |
28b54c67 PM |
155 | Specifying in the code which objects may be altered and which shall remain constant, |
156 | thus allowing for additional static checks preventing undesired modifications, | |
c31df1ed PM |
157 | is commonly referred to as ``const correctness'' TODO. |
158 | ||
28b54c67 PM |
159 | Unfortunately, \name{Java} lacks a keyword like \name{C++}'s \code{const}, |
160 | making it harder to achieve const correctness. | |
c31df1ed PM |
161 | It only specifies the similar keyword \code{final}, which is much less expressive and |
162 | doesn't allow for a similarly effective error prevention \cite{final}. | |
163 | In particular, because \code{final} is not part of an object's type information, | |
164 | it is not possible to declare methods that return read-only objects TODO -- | |
165 | placing a \code{final} before the method's return type would declare the | |
166 | method \code{final}. Similarly, there is no way to express that a method must not change | |
167 | the state of its object parameters. A method like \code{public void f(final Object obj)} | |
168 | is only prevented from assigning a new value to its parameter \code{obj} \cite{java} | |
169 | (which, if allowed, wouldn't affect the caller anyway \cite{java}). | |
170 | Methods changing its state, however, are allowed to be called on \code{obj} without | |
171 | restrictions \cite{java}. | |
172 | ||
173 | Several possibilities were considered to address this problem: | |
174 | \begin{itemize} | |
175 | \item Not implementing const correctness, but stating the access rules in | |
176 | comments only | |
177 | \item Giving the methods which modify object states special names | |
178 | like\\\code{setName\textendash\textendash USE\_WITH\_CARE} | |
179 | \item Delegating changes of objects to special ``editor'' objects to be | |
28b54c67 | 180 | obtained when an object shall be altered TODO |
c31df1ed PM |
181 | \item Deriving classes offering the modifying methods from the read-only |
182 | classes | |
183 | \end{itemize} | |
184 | ||
185 | Not implementing const correctness at all of course would have been the simplest | |
186 | possibility, producing the shortest and most readable code, but since | |
187 | incautious manipulation of objects would possibly have introduced subtle, | |
188 | hard-to-spot errors which in many cases would have occurred under additional | |
189 | conditions only and at other places, for example when inserting a \code{Column} | |
190 | into a \code{ColumnSet}, this method was not seriously considered. | |
191 | ||
192 | Using intentionally awkward, conspicuous names was not seriously considered either, | |
193 | since it would have cluttered the code for the sole purpose of warning | |
194 | programmers of possible errors -- without attempting to prevent them technically. | |
195 | ||
196 | So the introduction of new classes was considered the most effective and cleanest | |
197 | solution, either in the form of ``editor'' classes or derived classes offering the | |
198 | modifying methods directly. Again -- as in the identification of classes --, | |
199 | the most direct solution was considered the best, so the latter form of introducing | |
200 | additional classes was chosen and classes like \code{ReadableColumn}, | |
201 | \code{ReadableColumnSet} et cetera were introduced which offer only the read-only | |
202 | functionality and usually occur in interfaces. | |
203 | Their counterparts including modifying methods also were derived from them and the | |
204 | implications of modifications were explained in their documentation, while the | |
205 | issue and the approach as such were also mentioned in the documentation of the | |
206 | \code{Readable...} classes. | |
207 | The \code{Readable...} classes can be converted to their fully-functional | |
208 | counterparts via downcasting (only), thereby giving a strong hint to | |
209 | programmers that the resulting objects are to be used with care. | |
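The chosen approach can be sketched as follows. The type names mirror those in the text, but the members shown (\code{getName}, \code{setName}) and the helper methods are hypothetical, and modelling \code{ReadableColumn} as an interface is an assumption made for brevity.

```java
public class ReadableDemo {
    /** Read-only view: exposes no modifying methods. */
    interface ReadableColumn {
        String getName();
    }

    /** Fully functional counterpart, derived from the read-only type. */
    static final class Column implements ReadableColumn {
        private String name;

        Column(String name) {
            this.name = name;
        }

        @Override
        public String getName() {
            return name;
        }

        void setName(String name) {
            this.name = name;
        }
    }

    /** Code that only reads can be handed the read-only type. */
    static String describe(ReadableColumn col) {
        // col.setName("x") would not compile here: setName is not part of ReadableColumn.
        return "column " + col.getName();
    }

    /** Modification requires an explicit, conspicuous downcast. */
    static void rename(ReadableColumn col, String newName) {
        ((Column) col).setName(newName);
    }

    public static void main(String[] args) {
        ReadableColumn col = new Column("id");
        System.out.println(describe(col)); // prints column id
        rename(col, "key");
        System.out.println(describe(col)); // prints column key
    }
}
```

The cast in \code{rename} is the only place where the read-only guarantee is given up, which makes such places easy to find in a code review.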
210 | ||
211 | \subsubsection{Java interfaces} | |
28b54c67 PM |
212 | In \name{Java} programming, it is quite common and often recommended that every |
213 | class has at least one \code{interface} it \code{implements}, | |
c31df1ed PM |
214 | specifying the operations the class provides. TODO |
215 | If no obvious \code{interface} exists for a class or the desired | |
216 | interface name is already given to some other entity, | |
217 | the interface is often given names like \code{ITableSchema} | |
218 | or \code{TableSchemaInterface}. | |
219 | ||
220 | However, for a special purpose program with a relatively fixed set of classes | |
221 | mostly representing real-world artifacts from the problem domain, | |
222 | this approach was considered overly cluttering, introducing artificial | |
223 | code entities for no benefit. | |
224 | In particular, as explained in section TODO, all program classes either | |
225 | stand alone TODO or belong to a class hierarchy derived from at least one | |
226 | interface. | |
227 | So, except for the standalone classes, an interface existed anyway, either | |
228 | ``naturally'' (as in the case of \code{Key}, for example) or because of | |
229 | the chosen way to implement const correctness. | |
230 | In some cases, these were interfaces declared in the program code, while | |
28b54c67 | 231 | in some cases, \name{Java} interfaces like \code{Set} were implemented |
c31df1ed PM |
232 | (an obvious choice, of course, for \code{ColumnSet}). |
233 | Introducing artificial interfaces for the standalone classes was considered | |
234 | unnecessary at least, if not messy. | |
235 | ||
236 | \subsection{Packages} | |
237 | \label{packages} | |
238 | As mentioned in section \fullref{classes}, class names were chosen to be | |
239 | concise but nevertheless expressive. | |
28b54c67 | 240 | This only was possible through the use of \name{Java} \code{package}s, |
c31df1ed PM |
241 | which also helped structure the program. |
242 | ||
243 | For the current, relatively limited, extent of the program which currently | |
244 | comprises $45$ (\code{public}) classes, a flat package structure was | |
245 | considered ideal, because it is simple and doesn't stash source files deep | |
28b54c67 | 246 | in subdirectories (in \name{Java}, the directory structure of the source tree |
c31df1ed PM |
247 | is required to reflect the package structure TODO). |
248 | Because every class also belongs to a package, | |
249 | each source file is to be found exactly one directory below the root | |
250 | program source directory, which in many cases eases their handling. | |
251 | ||
252 | The following $11$ packages exist in the program | |
253 | (their purpose and more details about the package structure are | |
28b54c67 PM |
254 | described in section \fullref{coarse}): |
255 | \begin{multicols}{3}\begin{itemize} | |
c31df1ed PM |
256 | \item \code{boostrapping} |
257 | \item \code{cli} | |
258 | \item \code{database} | |
259 | \item \code{helpers} | |
260 | \item \code{log} | |
261 | \item \code{main} | |
262 | \item \code{osl} | |
263 | \item \code{output} | |
264 | \item \code{settings} | |
265 | \item \code{specification} | |
266 | \item \code{test} | |
28b54c67 | 267 | \end{itemize}\end{multicols} |
c31df1ed PM |
268 | |
269 | Each package is also documented in the source code, namely in a file | |
270 | \file{package-info.java} residing in the respective package directory. | |
271 | This is a common scheme supported by the \name{Eclipse} IDE as well as the | |
272 | documentation generation systems \name{javadoc} and \name{doxygen} TODO | |
273 | (all of which were used in the creation of the program, | |
274 | as described in section TODO). |