\section{Architecture}
\label{arch}
\subsection{Libraries used}
\subsection{Coarse structuring}
\label{coarse}
TODO: overall description, modularity, extendability, ex: easy to add new in-/output formats
TODO: mapping profiles (maybe better in next subsection)

\subsubsection{Package structuring}
The $45$ classes of \myprog{} were assigned to $11$ packages, each containing
classes responsible for the same area of operation or taking over similar roles.
Care was taken to divide the packages sensibly, producing meaningful packages
with obvious areas of responsibility on the one hand, while on the other hand
achieving a clear separation with a notable degree of decoupling.
The packages were deliberately not nested but laid out flat.
Since this has no functional implications \cite{java}, but is rather an
implementation detail, it is explained further in section \fullref{code_packages}.

The packages are introduced and described in table \ref{arch_tbl_packages}.
The classes each package contains are listed in table \ref{arch_tbl_classes}
in the next section, \fullref{fine}.
For a detailed package description, refer to Appendix TODO.

\begin{table}[H]
  \begin{tabular}{p{3cm}|p{13cm}} %\KOMAoption{fontsize}{\smallerfontsize{}}
    Package & Description\\
    \hline
    \code{bootstrapping} & Classes performing bootstrapping\\
    \code{cli} & Classes related to the command line interface of \myprog{}\\
    \code{database} & Classes related to the representation of relational databases and attached tasks\\
    \code{helpers} & Helper classes used program-wide\\
    \code{log} & Classes related to logging and diagnostic output\\
    \code{main} & The \code{Main} class\\
    \code{osl} & Classes representing OBDA specifications (as described in \cite{eng}) using the OBDA Specification Language (\osl{})\\
    \code{output} & Classes used to output OBDA specifications as described in \cite{eng}\\
    \code{settings} & Classes related to program and job settings (including command line parsing)\\
    \code{specification} & Classes representing (parts of) OBDA specifications (as described in \cite{eng}) directly, without involving \osl{}\\
    \code{test} & Classes offering testing facilities\\
  \end{tabular} %\KOMAoption{fontsize}{\myfontsize{}}
  \caption{Descriptions of the packages in \myprog{}}
  \label{arch_tbl_packages}
\end{table}

As stated, the partitioning of the program into these packages was guided not only
by intuition but also by deliberate analysis:
the package interaction under a given structure was examined, and changes were
made until the structure achieved the desired pronounced decoupling with
few and comprehensible dependencies.

The \code{main} package was introduced to make the \code{Main} class,
which carries information needed by other packages
(most prominently, the program name),
\code{import}able from inside these packages.
For this, \code{Main} must not reside in the default (unnamed) package,
since classes in that package cannot be imported \cite{java}.

In some cases, part of the functionality of a package was moved out into a new package
(which, in a nested package structure, would most probably have become a sub-package),
thus sacrificing the benefit of having fewer packages.
Specifically, \code{osl} is a package of its own instead of being part of the
\code{specification} package; the \code{bootstrapping} classes likewise form a package
of their own instead of belonging to the \code{specification} package;
the classes of the \code{log} and the \code{cli} packages were not merged into
one package, although logging currently happens exclusively on the command line;
and the functionality of the \code{test} package, though comprising only a few
lines of code, was separated into its own package.

Even though merging these packages would have made the package structure considerably
simpler ($4$ out of $11$ packages could have been saved this way),
the first aim mentioned, meaningfulness and intuitiveness, was taken seriously,
and the presented partitioning was considered a more natural and
comprehensible structure that emphasizes the different roles and thus forms a sounder
foundation for future extensions of the program.
For example, the \code{bootstrapping} package is central to the program and
takes over TODO an active, processing role, which makes it fundamentally different
from the classes of the \code{specification} package, which in turn have a
\emph{representing} role; it was therefore considered sensible not to merge these two packages.
This underlines the separation of concerns within the program and stresses that
the functionality of the \code{bootstrapping} package should not be interwoven with
that of the \code{specification} package, making it easier for both to stay
independent and to develop further into understandable and suitable units.


\subsubsection{Package interaction}
As mentioned, the structuring of the packages was driven by the aim of achieving a notable
amount of decoupling.
How this is reflected in the dependency structure, that is, in the classes from other
packages that the classes of a package depend on, is described in the following.
As was also mentioned, the findings presented here in turn influenced the
package partitioning, which was changed as a consequence.

Dependencies of or on package \code{helpers} are not considered in the following,
since this package was meant precisely to offer services used by many other packages.
In fact, all facilities provided by \code{helpers} could just as well be part of
the \name{Java} API, but unfortunately are not.
The current dependency structure, factoring in this restriction, is shown in figure
\ref{arch_fig_packages} and reveals a reasonably tidy system of dependencies.

\begin{figure}[H]\begin{center}
  \includegraphics[scale=0.86]{Images/package_graph.pdf}
  \caption[Package dependencies in \myprog{}]{Package dependencies in \myprog{}.
    ``$\rightarrow$'' means ``depends on''.}
  \label{arch_fig_packages}
\end{center}\end{figure}

Except for the package \code{settings} (which is explained further below),
every package has at most two outgoing edges, that is, packages it depends on.
Previous versions of \myprog{} had a considerably more complicated package dependency
structure, depicted in figure \ref{arch_fig_packages_earlier}.
In this previous package structure, the maximum number of dependencies of
packages other than \code{settings} on other packages is three, which also seems
reasonably low.
However, in the new structure, \code{specification} does not depend on any other package
and thus suits its purpose of providing a plain and straightforward
representation of an OBDA specification much better.
%Furthermore, \code{output} doesn't depend on \code{database} anymore.

\begin{figure}[H]\begin{center}
  \includegraphics[scale=0.86]{Images/package_graph_earlier.pdf}
  \caption[Package dependencies in earlier versions of \myprog{}]{Package dependencies
    in earlier versions of \myprog{}. ``$\rightarrow$'' again means ``depends on''.}
  \label{arch_fig_packages_earlier}
\end{center}\end{figure}

Though there still are quite a number of dependencies ($19$, to be precise),
many of them ($8$, thus nearly half) involve one central package
in the middle, \code{settings}.
This may seem odd at first glance, considering that most of the edges connected to
the \code{settings} node are outgoing edges and only one is incoming,
whereas in a design where the settings are configured from within a single package
and accessed from many other packages it would be the other way round.
The reason for this arrangement is that, as described in section \fullref{interface},
all settings in \myprog{} are configured per bootstrapping job
(there are no global settings), and so \code{settings}
contains a class \code{Job} (and currently no other classes),
which represents the configuration of a bootstrapping job but also
provides a \code{perform()} method combining the facilities offered by the other
packages.

In this way, the \code{perform()} method of the \code{Job} class acts as the central
driver of the bootstrapping process, reducing the \code{main()} method to
only $7$ lines of code and turning \code{settings} into something like an externalized
part of the \code{main} package.
If, in a future version of the program, this approach is changed and global settings or
configuration files are introduced, \code{settings} will still be the central package,
leaving the package structure and dependencies unchanged,
since either way it contains information used by many other packages.
This is why it was not renamed to, for example, \code{driver}, which was
considered because at first glance it seems rather unnatural
to have the driver class reside in a package called ``settings''.
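
The driver role of \code{Job} described above can be illustrated with a minimal sketch.
It is not the actual implementation: only the names \code{Job}, \code{perform()} and
\code{main()} are taken from the text, while \code{MainSketch}, \code{fromArguments()},
\code{steps()}, the \code{databaseName} setting and the assumed order of the steps are
hypothetical and merely stand in for calls into the other packages.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for main.Main: it only builds a job and runs it.
class MainSketch {
    public static void main(String[] args) {
        Job job = Job.fromArguments(args); // hypothetical factory method
        job.perform();
        for (String step : job.steps()) {
            System.out.println(step);
        }
    }
}

// Sketch of settings.Job: holds the per-job configuration and drives the run.
class Job {
    private final String databaseName;                    // hypothetical per-job setting
    private final List<String> steps = new ArrayList<>(); // records what was done

    Job(String databaseName) {
        this.databaseName = databaseName;
    }

    // Hypothetical: builds a job from command line arguments.
    static Job fromArguments(String[] args) {
        return new Job(args.length > 0 ? args[0] : "testdb");
    }

    // Combines the facilities of the other packages (assumed order).
    void perform() {
        steps.add("database: retrieve schema of " + databaseName);
        steps.add("bootstrapping: derive OBDA specification");
        steps.add("output: print specification");
    }

    // Read-only view of the steps performed so far.
    List<String> steps() {
        return Collections.unmodifiableList(steps);
    }
}
```

Because all configuration and the whole control flow live in \code{Job}, the
\code{main()} method in such a design has little more to do than construct a job and
call \code{perform()}, which matches the outgoing edges of \code{settings} in the
dependency graph.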

%Package \code{helpers} depends on package \code{database}, which provides the \code{static}
%method \code{getSQLTypeName}.

\subsection{Fine structuring}
\label{fine}
While the packages in \myprog{} are introduced and described in section \fullref{coarse},
the classes that comprise them are addressed in this section.
For a detailed class index, refer to Appendix TODO.
TODO: total classes etc.

\subsubsection{Package contents}
\label{package_details}

Table \ref{arch_tbl_classes} lists the classes each package contains.
The packages \code{cli}, \code{main}, \code{osl} and \code{settings} contain only
one class each, while by far the most extensive package is \code{database},
containing $15$ classes.

\begin{table}[H]
  \begin{multicols}{2}\begin{itemize} %\KOMAoption{fontsize}{\smallerfontsize{}}
    \item \code{bootstrapping}
    \begin{itemize}
      \item \code{Bootstrapping}
      \item \code{DirectMappingURIBuilder}
      \item \code{URIBuilder}
    \end{itemize}
    \item \code{cli}
    \begin{itemize}
      \item \code{CLIDatabaseInteraction}
    \end{itemize}
    \item \code{database}
    \begin{itemize}
      \item \code{Column}
      \item \code{ColumnSet}
      \item \code{DatabaseException}
      \item \code{DBSchema}
      \item \code{ForeignKey}
      \item \code{Key}
      \item \code{PrimaryKey}
      \item \code{ReadableColumn}
      \item \code{ReadableColumnSet}
      \item \code{ReadableForeignKey}
      \item \code{ReadableKey}
      \item \code{ReadablePrimaryKey}
      \item \code{RetrieveDBSchema}
      \item \code{Table}
      \item \code{TableSchema}
    \end{itemize}
    \item \code{helpers}
    \begin{itemize}
      \item \code{Helpers}
      \item \code{MapValueIterable}
      \item \code{MapValueIterator}
      \item \code{ReadOnlyIterable}
      \item \code{ReadOnlyIterator}
      \item \code{SQLType}
      \item \code{UserAbortException}
    \end{itemize}
    \newpage
    \item \code{log}
    \begin{itemize}
      \item \code{ConsoleDiagnosticOutputHandler}
      \item \code{GlobalLogger}
    \end{itemize}
    \item \code{main}
    \begin{itemize}
      \item \code{Main}
    \end{itemize}
    \item \code{osl}
    \begin{itemize}
      \item \code{OSLSpecification}
    \end{itemize}
    \item \code{output}
    \begin{itemize}
      \item \code{ObjectSpecPrinter}
      \item \code{OSLSpecPrinter}
      \item \code{SpecPrinter}
    \end{itemize}
    \item \code{settings}
    \begin{itemize}
      \item \code{Job}
    \end{itemize}
    \item \code{specification}
    \begin{itemize}
      \item \code{AttributeMap}
      \item \code{EntityMap}
      \item \code{IdentifierMap}
      \item \code{InvalidSpecificationException}
      \item \code{OBDAMap}
      \item \code{OBDASpecification}
      \item \code{RelationMap}
      \item \code{SubtypeMap}
      \item \code{TranslationTable}
    \end{itemize}
    \item \code{test}
    \begin{itemize}
      \item \code{CreateTestDBSchema}
      \item \code{GetSomeDBSchema}
    \end{itemize}
  \end{itemize}\end{multicols} %\KOMAoption{fontsize}{\myfontsize{}}
  \caption{Class attachment to packages in \myprog{}}
  \label{arch_tbl_classes}
\end{table}

\subsubsection{Class organization}
\label{hierarchies}
Organizing classes in a structured, obvious manner, such that classes have well-defined
roles and behave in an intuitive way, ideally representing artifacts from the world
modeled in the program directly \cite{str3}, is a prerequisite for making the code
clear and comprehensible on the architectural level.

Section \fullref{code_classes}, as part of section \fullref{code},
describes the identification and naming scheme for the classes in \myprog{}.
However, it is also important to arrange these classes in useful, comprehensible
class hierarchies in order to avoid code duplication, make appropriate use of the type
system, ease the design of precise and flexible interfaces, and enhance the
adaptability and extensibility of the program.
Figure \ref{arch_fig_hierachies} shows the class hierarchies in \myprog{},
while standalone classes are listed in table \ref{arch_tbl_lone_classes}.
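
How an interface at the top of a hierarchy supports extensibility can be sketched with
the class names of the \code{output} package.
The sketch is an assumption, not the actual code: that \code{SpecPrinter} is the
interface of this hierarchy is inferred from the class names only, and the
\code{print(String)} signature, the output strings and \code{SpecPrinterSketch} are
hypothetical.

```java
// Hypothetical caller: it depends on the SpecPrinter interface only.
class SpecPrinterSketch {
    public static void main(String[] args) {
        SpecPrinter[] printers = { new ObjectSpecPrinter(), new OSLSpecPrinter() };
        for (SpecPrinter printer : printers) {
            System.out.println(printer.print("Person"));
        }
    }
}

// Assumed top of the hierarchy; the method signature is hypothetical.
interface SpecPrinter {
    String print(String entityName);
}

// One output format (placeholder output).
class ObjectSpecPrinter implements SpecPrinter {
    @Override
    public String print(String entityName) {
        return "object: " + entityName;
    }
}

// Another output format (placeholder output).
class OSLSpecPrinter implements SpecPrinter {
    @Override
    public String print(String entityName) {
        return "osl: " + entityName;
    }
}
```

In such an arrangement, supporting a further output format amounts to adding one more
class implementing the interface; code written against \code{SpecPrinter} stays
unchanged.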

\stepcounter{figure}
\newcounter{figureNumberOfClassHierarchyFigure}
\setcounter{figureNumberOfClassHierarchyFigure}{\value{figure}}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_8.pdf}
  \label{first_hierarchy}
\end{center}\end{figure}
\vspace{6px}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_5.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_7.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_19.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_1.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_17.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_21_extended.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_13.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_3.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_18.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_12.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
  \ContinuedFloat*
  \includegraphics[scale=0.86]{Images/inherit_graph_4.pdf}
  \setcounter{figure}{\value{figureNumberOfClassHierarchyFigure}}
  \caption[Class hierarchies in \myprog{}]{Class hierarchies in \myprog{}.
    Interface names are italicized,
    external classes or interfaces are surrounded by a gray frame.}
  \label{arch_fig_hierachies}
\end{center}\end{figure}

\begin{table}[H]\begin{center}
  \begin{tabular}{l}
    \itm{} \code{main.Main}\\
    \itm{} \code{database.RetrieveDBSchema}\\
    \itm{} \code{database.Table}\\
    \itm{} \code{helpers.Helpers}\\
    \itm{} \code{helpers.SQLType}\\
    \itm{} \code{specification.OBDASpecification}\\
    \itm{} \code{osl.OSLSpecification}\\
    \itm{} \code{bootstrapping.Bootstrapping}\\
    \itm{} \code{cli.CLIDatabaseInteraction}\\
    \itm{} \code{log.GlobalLogger}\\
    \itm{} \code{test.CreateTestDBSchema}\\
    \itm{} \code{test.GetSomeDBSchema}\\
  \end{tabular}\\
  \caption{Standalone classes in \myprog{}}
  \label{arch_tbl_lone_classes}
\end{center}\end{table}

Note that every class hierarchy has at least one \code{interface} at its top.
Classes not belonging to a class hierarchy were deliberately not given an
``artificial'' interface, which would have made them part of a (small) class hierarchy TODO.
The scheme often recommended in the Java world TODO,
to give every class an interface it \code{implements}, was deliberately not followed.
Instead, the approach described by Stroustrup \cite{str4} was adopted: to provide a rich
set of so-called ``concrete types'' not designed for use within class hierarchies, which
``build the foundation of every well-designed program'' \cite{str4} TODO.
The details of this consideration are explained in section \fullref{code_interfaces}.
In fact, many useful types were already offered by the \name{Java} API
and of course were not re-implemented.

Class \code{Column} with its interface \code{ReadableColumn} is an exception TODO
in that it was given an interface although it is basically a concrete type.
The reason for this is the chosen way of implementing const correctness,
described in section \nameref{const} (which is part of section \fullref{code_classes}).
This technique not only forced class \code{Column} to
\code{implement} an interface, thus needlessly making it part of a class hierarchy,
but also complicated the structure of some class hierarchies.
Consider the class hierarchy around \code{ColumnSet},
shown in the \hyperref[first_hierarchy]{first graph} of figure \ref{arch_fig_hierachies}.
Admittedly, it seems overly complicated at first glance.
But this complexity is introduced solely by the artificial
\code{Readable...} interfaces;
if \name{Java} provided a mechanism like \name{C++}'s \code{const},
this hierarchy would be as simple as in the following graph:

\begin{figure}[H]\begin{center}
  \includegraphics[scale=0.86]{Images/inherit_graph_8_simplified.pdf}
  \caption{\code{ColumnSet} class hierarchy in \myprog{} -- simplified}
  \label{arch_fig_colset_hierarchy_simplified}
\end{center}\end{figure}

However, since const correctness is an important mechanism that effectively prevents
errors and at the same time adds clarity of its own, it was considered
too important to be sacrificed, even for a cleaner and more intuitive class hierarchy.
The fact that the \code{Readable...} scheme is very straightforward, and that a programmer
reading the documentation knows about its purpose and about the real, much smaller
complexity, also compensates somewhat for the simplicity sacrificed.
In many cases, the const correctness mechanism itself keeps uninformed or inattentive
programmers from mistakenly using the wrong class in an interface.

For more information about the program structure on the class level,
see section \fullref{code}, while for a detailed class index refer to Appendix TODO.
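
The \code{Readable...} scheme discussed above can be sketched as follows.
This is a minimal illustration under stated assumptions, not the actual code of
\myprog{}: only the names \code{Column} and \code{ReadableColumn} are taken from the
text, while the members \code{getName()}, \code{isNullable()} and \code{setNullable()}
as well as \code{ConstCorrectnessSketch} and \code{describe()} are hypothetical.

```java
// Hypothetical client code demonstrating read-only versus full access.
class ConstCorrectnessSketch {
    public static void main(String[] args) {
        Column column = new Column("id", false);
        column.setNullable(true);             // mutation through the full type
        System.out.println(describe(column)); // read-only access
    }

    // Taking a ReadableColumn documents that this method only reads:
    // no setter can be called on the parameter without an explicit cast.
    static String describe(ReadableColumn column) {
        return column.getName() + (column.isNullable() ? " NULL" : " NOT NULL");
    }
}

// Read-only view: exposes getters only (member names are hypothetical).
interface ReadableColumn {
    String getName();
    boolean isNullable();
}

// Full type: implements the read-only view and adds mutating methods.
class Column implements ReadableColumn {
    private final String name;
    private boolean nullable;

    Column(String name, boolean nullable) {
        this.name = name;
        this.nullable = nullable;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public boolean isNullable() {
        return nullable;
    }

    void setNullable(boolean nullable) {
        this.nullable = nullable;
    }
}
```

A method that declares a \code{ReadableColumn} parameter plays the role that a
reference to \code{const} would play in \name{C++}: the compiler rejects calls to
mutating methods, at the price of one additional interface per class.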