\section{Architecture}
\label{arch}
\subsection{Libraries used}
\subsection{Coarse structuring}
\label{coarse}
TODO: overall description, modularity, extendability, ex: easy to add new in-/output formats
TODO: mapping profiles (maybe better in next subsection)

\subsubsection{Package structuring}
The $45$ classes of \myprog{} were assigned to $11$ packages, each containing
classes responsible for the same area of operation or taking over similar roles.
Care was taken to divide the packages sensibly: on the one hand, the packages should be
meaningful and have obvious areas of responsibility, while on the other hand the
separation should be sharp and achieve a notable degree of decoupling.
The packages were deliberately not nested but laid out flat.
Since this has no functional implications \cite{java} but is rather an
implementation detail, it is explained further in section \fullref{code_packages}.

The packages are introduced and described in table \ref{arch_tbl_packages}.
The classes each package contains are listed in table \ref{arch_tbl_classes}
in the next section, \fullref{fine}.
For a detailed package description, refer to Appendix TODO.

\begin{table}[H]
 \begin{tabular}{p{3cm}|p{13cm}} %\KOMAoption{fontsize}{\smallerfontsize{}}
 Package & Description\\
 \hline
 \code{bootstrapping} & Classes performing bootstrapping\\
 \code{cli} & Classes related to the command line interface of \myprog{}\\
 \code{database} & Classes related to the representation of relational databases and attached tasks\\
 \code{helpers} & Helper classes used program-wide\\
 \code{log} & Classes related to logging and diagnostic output\\
 \code{main} & The \code{Main} class\\
 \code{osl} & Classes representing OBDA specifications (as described in \cite{eng}) using the OBDA Specification Language (\osl{})\\
 \code{output} & Classes used to output OBDA Specifications as described in \cite{eng}\\
 \code{settings} & Classes related to program and job settings (including command line parsing)\\
 \code{specification} & Classes representing (parts of) OBDA specifications (as described in \cite{eng}) directly, without involving \osl{}\\
 \code{test} & Classes offering testing facilities\\
 \end{tabular} %\KOMAoption{fontsize}{\myfontsize{}}
 \caption{Descriptions of the packages in \myprog{}}
 \label{arch_tbl_packages}
\end{table}

As stated, partitioning the program into these packages relied not only on intuition
but also on deliberate care: the package interaction under a given structure was
analyzed, and the structure was changed until it achieved the desired pronounced
decoupling with few and intelligible dependencies.

The \code{main} package was introduced to make the \code{Main} class,
which carries information needed by other packages
(most prominently, the program name),
\code{import}able from inside these packages.
For this, \code{Main} must not reside in the \code{default}
package \cite{java}.

In some cases, part of the functionality of a package was moved into a new package
(which, in a nested package structure, would most probably have become a sub-package),
thus sacrificing the benefit of having fewer packages.
Specifically, \code{osl} is a package of its own instead of being part of the
\code{specification} package, the \code{bootstrapping} classes likewise form a package
of their own instead of belonging to the \code{specification} package,
the classes of the \code{log} and the \code{cli} packages were not merged into
one package, although logging currently happens exclusively on the command line,
and the functionality of the \code{test} package, though comprising only a few
lines of code, was separated into its own package.

Merging these packages would have made the package structure considerably simpler
($4$ out of $11$ packages could have been saved this way).
However, the first aim mentioned, meaningfulness and intuitiveness, was taken seriously,
and the presented partitioning was considered a more natural and
comprehensible structure that emphasizes the different roles and is thus a better
foundation for future extensions of the program.
For example, the \code{bootstrapping} package is central to the program and
takes over TODO an active, processing role, which is completely different
from the \emph{representing} role of the classes of the \code{specification} package.
It was therefore considered sensible not to merge these two packages.
This underlines the separation of concerns within the program and stresses that
the functionality of the \code{bootstrapping} package should not be interwoven with
that of the \code{specification} package, making it easier for both to stay
independent and to develop further into understandable and suitable units.


\subsubsection{Package interaction}
As mentioned, the structuring of the packages was driven by the aim of achieving a notable
amount of decoupling.
How this is reflected in the dependency structure, that is, in the classes from other
packages that the classes of a package depend on, is described in the following.
As was also mentioned, the information presented here in turn fed back into the
package partitioning, which changed as a consequence.

Dependencies of or on the package \code{helpers} are not considered in the following,
since this package was meant precisely to offer services used by many other packages.
In fact, all facilities provided by \code{helpers} could just as well be part of
the \name{Java} API, but unfortunately are not.
The current dependency structure, factoring in this restriction, is shown in figure
\ref{arch_fig_packages} and reveals a fairly tidy system of dependencies.

\begin{figure}[H]\begin{center}
 \includegraphics[scale=0.86]{Images/package_graph.pdf}
 \caption[Package dependencies in \myprog{}]{Package dependencies in \myprog{}.
 ``$\rightarrow$'' means ``depends on''.}
 \label{arch_fig_packages}
\end{center}\end{figure}

Except for the package \code{settings} (which is explained further below),
every package has at most two outgoing edges, that is, packages it depends on.
Previous versions of \myprog{} had a considerably more complicated package dependency
structure, depicted in figure \ref{arch_fig_packages_earlier}.
In that earlier package structure, the maximum number of dependencies of
a package other than \code{settings} on other packages was three, which also seems
reasonably low.
However, in the new structure, \code{specification} depends on no other package
and thus suits its purpose of providing a plain and straightforward
representation of an OBDA specification much better.
%Furthermore, \code{output} doesn't depend on \code{database} anymore.

\begin{figure}[H]\begin{center}
 \includegraphics[scale=0.86]{Images/package_graph_earlier.pdf}
 \caption[Package dependencies in earlier versions of \myprog{}]{Package dependencies
 in earlier versions of \myprog{}. ``$\rightarrow$'' again means ``depends on''.}
 \label{arch_fig_packages_earlier}
\end{center}\end{figure}

Though there are still quite a number of dependencies ($19$, to be precise),
many of them ($8$, that is, nearly half) involve one central package
in the middle, \code{settings}.
This may seem odd at first glance, considering that most of the edges connected to
the \code{settings} node are outgoing edges and only one is incoming,
whereas in a design where the settings are configured from within a single package
and accessed from many other packages it would be the other way round.
The reason for this constellation is that, as described in section \fullref{interface},
all settings in \myprog{} are configured per bootstrapping job
(there are no global settings), and so \code{settings}
contains a class \code{Job} (and currently no other classes),
which represents the configuration of a bootstrapping job but also
provides a \code{perform()} method combining the facilities offered by the other
packages.

By this means, the \code{perform()} method of the \code{Job} class acts as the central
driver performing the bootstrapping process, reducing the \code{main()} method to
only $7$ lines of code and turning \code{settings} into something like an externalized
part of the \code{main} package.
If, in a future version of the program, this approach is changed and global settings or
configuration files are introduced, \code{settings} will still be the central package,
leaving the package structure and dependencies unchanged,
since either way it contains information used by many other packages.
This is why it was not renamed to, for example, \code{driver}, which was
considered, since at first glance it seems somewhat unnatural
to have the driver class reside in a package called ``settings''.
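
The driver role of \code{Job} described above can be illustrated with a minimal Java sketch. It is not the actual \myprog{} code: only the names \code{Job} and \code{perform()} come from the text, while the fields, the constructor and the step descriptions are assumptions made up for illustration, and the calls into the other packages are replaced by plain strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a per-job settings object that also drives the work. */
final class JobDriverSketch {

    /** Stand-in for settings.Job; the two fields are invented for illustration. */
    static final class Job {
        private final String databaseUrl;
        private final String outputFormat;

        Job(String databaseUrl, String outputFormat) {
            this.databaseUrl = databaseUrl;
            this.outputFormat = outputFormat;
        }

        /**
         * Performs the job by invoking, in order, stand-ins for the services
         * of the other packages. Here each call is just recorded as a string.
         */
        List<String> perform() {
            List<String> steps = new ArrayList<>();
            steps.add("retrieve schema from " + databaseUrl);      // database
            steps.add("bootstrap specification");                  // bootstrapping
            steps.add("print specification as " + outputFormat);   // output
            return steps;
        }
    }

    /** With the driver in Job, main only creates a job and runs it. */
    public static void main(String[] args) {
        Job job = new Job("jdbc:example", "osl");
        for (String step : job.perform()) {
            System.out.println(step);
        }
    }
}
```

In such a layout, the package holding \code{Job} naturally has outgoing edges to every package whose services \code{perform()} uses, which matches the shape of the dependency graph discussed above.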

%Package \code{helpers} depends on package \code{database}, which provides the \code{static}
%method \code{getSQLTypeName}.

\subsection{Fine structuring}
\label{fine}
While the packages in \myprog{} are introduced and described in section \fullref{coarse},
the classes they consist of are addressed in this section.
For a detailed class index, refer to Appendix TODO.
TODO: total classes etc.

\subsubsection{Package contents}
\label{package_details}

Table \ref{arch_tbl_classes} lists the classes each package contains.
The packages \code{cli}, \code{main}, \code{osl} and \code{settings} contain only
one class each, while by far the most extensive package is \code{database},
containing $15$ classes.

\begin{table}[H]
 \begin{multicols}{2}\begin{itemize} %\KOMAoption{fontsize}{\smallerfontsize{}}
  \item \code{bootstrapping}
  \begin{itemize}
   \item \code{Bootstrapping}
   \item \code{DirectMappingURIBuilder}
   \item \code{URIBuilder}
  \end{itemize}
  \item \code{cli}
  \begin{itemize}
   \item \code{CLIDatabaseInteraction}
  \end{itemize}
  \item \code{database}
  \begin{itemize}
   \item \code{Column}
   \item \code{ColumnSet}
   \item \code{DatabaseException}
   \item \code{DBSchema}
   \item \code{ForeignKey}
   \item \code{Key}
   \item \code{PrimaryKey}
   \item \code{ReadableColumn}
   \item \code{ReadableColumnSet}
   \item \code{ReadableForeignKey}
   \item \code{ReadableKey}
   \item \code{ReadablePrimaryKey}
   \item \code{RetrieveDBSchema}
   \item \code{Table}
   \item \code{TableSchema}
  \end{itemize}
  \item \code{helpers}
  \begin{itemize}
   \item \code{Helpers}
   \item \code{MapValueIterable}
   \item \code{MapValueIterator}
   \item \code{ReadOnlyIterable}
   \item \code{ReadOnlyIterator}
   \item \code{SQLType}
   \item \code{UserAbortException}
  \end{itemize}
  \newpage
  \item \code{log}
  \begin{itemize}
   \item \code{ConsoleDiagnosticOutputHandler}
   \item \code{GlobalLogger}
  \end{itemize}
  \item \code{main}
  \begin{itemize}
   \item \code{Main}
  \end{itemize}
  \item \code{osl}
  \begin{itemize}
   \item \code{OSLSpecification}
  \end{itemize}
  \item \code{output}
  \begin{itemize}
   \item \code{ObjectSpecPrinter}
   \item \code{OSLSpecPrinter}
   \item \code{SpecPrinter}
  \end{itemize}
  \item \code{settings}
  \begin{itemize}
   \item \code{Job}
  \end{itemize}
  \item \code{specification}
  \begin{itemize}
   \item \code{AttributeMap}
   \item \code{EntityMap}
   \item \code{IdentifierMap}
   \item \code{InvalidSpecificationException}
   \item \code{OBDAMap}
   \item \code{OBDASpecification}
   \item \code{RelationMap}
   \item \code{SubtypeMap}
   \item \code{TranslationTable}
  \end{itemize}
  \item \code{test}
  \begin{itemize}
   \item \code{CreateTestDBSchema}
   \item \code{GetSomeDBSchema}
  \end{itemize}
 \end{itemize}\end{multicols} %\KOMAoption{fontsize}{\myfontsize{}}
 \caption{Class attachment to packages in \myprog{}}
 \label{arch_tbl_classes}
\end{table}

\subsubsection{Class organization}
\label{hierarchies}
Organizing classes in a structured, obvious manner, such that they have well-defined
roles, behave intuitively and ideally represent artifacts from the world
modeled in the program directly \cite{str3}, is a prerequisite for code that is
clear and comprehensible on the architectural level.

Section \fullref{code_classes}, which is part of section \fullref{code},
describes the identification and naming scheme for the classes in \myprog{}.
However, it is also important to arrange these classes in useful, comprehensible
class hierarchies to avoid code duplication, make appropriate use of the type system,
ease the design of precise and flexible interfaces and enhance the
adaptability and extensibility of the program.
Figure \ref{arch_fig_hierachies} shows the class hierarchies in \myprog{},
while standalone classes are listed in table \ref{arch_tbl_lone_classes}.

\stepcounter{figure}
\newcounter{figureNumberOfClassHierarchyFigure}
\setcounter{figureNumberOfClassHierarchyFigure}{\value{figure}}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_8.pdf}
 \label{first_hierarchy}
\end{center}\end{figure}
\vspace{6px}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_5.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_7.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_19.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_1.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_17.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_21_extended.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_13.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_3.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_18.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_12.pdf}
\end{center}\end{figure}
\begin{figure}[H]\begin{center}
 \ContinuedFloat*
 \includegraphics[scale=0.86]{Images/inherit_graph_4.pdf}
 \setcounter{figure}{\value{figureNumberOfClassHierarchyFigure}}
 \caption[Class hierarchies in \myprog{}]{Class hierarchies in \myprog{}.
 Interface names are italicized,
 external classes or interfaces are hemmed with a gray frame.}
 \label{arch_fig_hierachies}
\end{center}\end{figure}

\begin{table}[H]\begin{center}
 \begin{tabular}{l}
 \itm{} \code{main.Main}\\
 \itm{} \code{database.RetrieveDBSchema}\\
 \itm{} \code{database.Table}\\
 \itm{} \code{helpers.Helpers}\\
 \itm{} \code{helpers.SQLType}\\
 \itm{} \code{specification.OBDASpecification}\\
 \itm{} \code{osl.OSLSpecification}\\
 \itm{} \code{bootstrapping.Bootstrapping}\\
 \itm{} \code{cli.CLIDatabaseInteraction}\\
 \itm{} \code{log.GlobalLogger}\\
 \itm{} \code{test.CreateTestDBSchema}\\
 \itm{} \code{test.GetSomeDBSchema}\\
 \end{tabular}\\
 \caption{Standalone classes in \myprog{}}
 \label{arch_tbl_lone_classes}
\end{center}\end{table}

Note that every class hierarchy has at least one \code{interface} at its top.
Classes not belonging to a class hierarchy were deliberately not given an interface
merely for the sake of having one,
which would have made them part of a (small) class hierarchy TODO.
The scheme often recommended in the Java world TODO,
to give every class an interface it \code{implements}, was deliberately not followed.
Instead, the approach described by Stroustrup \cite{str4} was taken: to provide a rich set
of so-called ``concrete types'' not designed for use within class hierarchies, which
``build the foundation of every well-designed program \cite{str4}'' TODO.
The details of this consideration are explained in section \fullref{code_interfaces}.
In fact, many useful types were already offered by the \name{Java} API
and of course were not re-implemented.

Class \code{Column} with its interface \code{ReadableColumn} is an exception TODO
in that it was given an interface although it is basically a concrete type.
The reason for this is the chosen way to implement const correctness,
described in section \nameref{const} (which is part of section \fullref{code_classes}).
This technique not only forced class \code{Column} to
\code{implement} an interface, thus needlessly making it part of a class hierarchy,
but also complicated the structure of some class hierarchies.
Consider the class hierarchy around \code{ColumnSet},
shown in the \hyperref[first_hierarchy]{first graph} of figure \ref{arch_fig_hierachies}.
It certainly seems overly complicated at first glance.
But this complexity is introduced solely by the artificial
\code{Readable...} interfaces;
if \name{Java} provided a mechanism like \name{C++}'s \code{const},
this hierarchy would be as simple as in the following graph:

\begin{figure}[H]\begin{center}
 \includegraphics[scale=0.86]{Images/inherit_graph_8_simplified.pdf}
 \caption{\code{ColumnSet} class hierarchy in \myprog{} (simplified)}
 \label{arch_fig_colset_hierarchy_simplified}
\end{center}\end{figure}

However, const correctness is an important mechanism that effectively prevents
errors and at the same time introduces clarity of its own, so it was considered
too important to be sacrificed, even for a cleaner and more intuitive class hierarchy.
The \code{Readable...} scheme is very straightforward, and a programmer
reading the documentation knows about its purpose and about the real, much smaller
complexity, which compensates somewhat for the simplicity sacrificed.
In many cases, the const correctness mechanism itself also keeps uninformed or careless
programmers from mistakenly using the wrong class in an interface.
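
The effect of the \code{Readable...} scheme can be sketched in a few lines of Java. This is an illustration under assumptions, not the \myprog{} implementation: the type names \code{ReadableColumn} and \code{Column} are taken from the text, whereas the methods \code{getName()}, \code{setName()} and \code{describe()} are hypothetical.

```java
/** Hypothetical sketch of const correctness via a read-only interface. */
final class ReadableInterfaceSketch {

    /** Read-only view: declares accessors only (method name is an assumption). */
    interface ReadableColumn {
        String getName();
    }

    /** Full type: implements the read-only view and adds a mutator. */
    static final class Column implements ReadableColumn {
        private String name;

        Column(String name) {
            this.name = name;
        }

        @Override
        public String getName() {
            return name;
        }

        void setName(String name) {
            this.name = name;
        }
    }

    /**
     * Taking a ReadableColumn signals that the argument is only read here.
     * Short of an explicit downcast, the compiler rejects any mutation.
     */
    static String describe(ReadableColumn column) {
        // column.setName("x"); // would not compile: not declared in ReadableColumn
        return "column " + column.getName();
    }

    public static void main(String[] args) {
        Column column = new Column("id");
        column.setName("person_id");          // allowed: we hold the full type
        System.out.println(describe(column)); // passed on as a read-only view
    }
}
```

The price is the one named in the text: every type that needs a read-only view gains an extra interface, so the hierarchy grows even though no additional behavior is modeled.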
26717a83
PM
391
392For more information about the program structure on the class level,
62fe6284 393see section \fullref{code}, while for a detailed class index refer to Appendix TODO.