1 \section{Architecture}
2 \label{arch}
3 \subsection{Libraries used}
4 \subsection{Coarse structuring}
5 \label{coarse}
6 TODO: overall description, modularity, extendability, ex: easy to add new in-/output formats
7 TODO: mapping profiles (maybe better in next subsection)
8
9 \subsubsection{Package partitioning}
10 The $45$ classes of \myprog{} were assigned to $11$ packages, each containing
11 classes responsible for the same area of operation or taking over similar roles.
12 Care was taken to divide the program sensibly, producing meaningful packages
13 with obvious areas of responsibility on the one hand, while on the other hand achieving a
14 clear separation with a notable degree of decoupling.
15 The packages are deliberately not nested but laid out flat.
16 Since this has no functional implications \cite{java}, but is rather an
17 implementation detail, it is explained further in section \fullref{code_packages}.
18
19 The packages are introduced and described in table \ref{arch_tbl_packages}.
20 The lists of classes each package contains are given in table \ref{arch_tbl_classes}
21 in the next section \fullref{fine}.
22
23 \begin{table}[H]
24         \begin{tabular}{p{3cm}|p{13cm}} %\KOMAoption{fontsize}{\smallerfontsize{}}
25                 Package & Description\\
26                 \hline
27                 \code{bootstrapping} & Classes performing bootstrapping\\
28                 \code{cli} & Classes related to the command line interface of \myprog{}\\
29                 \code{database} & Classes related to the representation of relational databases and attached tasks\\
30                 \code{helpers} & Helper classes used program-wide\\
31                 \code{log} & Classes related to logging and diagnostic output\\
32                 \code{main} & The \code{Main} class\\
33                 \code{osl} & Classes representing OBDA specifications (as described in \cite{eng}) using the OBDA Specification Language (\osl{})\\
34                 \code{output} & Classes used to output OBDA specifications as described in \cite{eng}\\
35                 \code{settings} & Classes related to program and job settings (including command line parsing)\\
36                 \code{specification} & Classes representing (parts of) OBDA specifications (as described in \cite{eng}) directly, without involving \osl{}\\
37                 \code{test} & Classes offering testing facilities\\
38         \end{tabular} %\KOMAoption{fontsize}{\myfontsize{}}
39         \caption{Descriptions of the packages in \myprog{}}
40         \label{arch_tbl_packages}
41 \end{table}
42
43 As stated, intuition alone did not guide the partitioning of the program into
44 these packages. The package
45 interaction under a given structure was analyzed, and changes were made
46 until this structure achieved the desired pronounced decoupling with
47 limited and intelligible dependencies.
48
49 The \code{main} package was introduced to make the \code{Main} class,
50 which carries information needed by other packages
51 -- most prominently, the program name --,
52 \code{import}able from inside these packages.
53 For this, \code{Main} must not reside in the default (unnamed)
54 package \cite{java}.
55
56 \subsubsection{Package interaction}
57 As mentioned, the structuring of the packages was driven by the aim of achieving a notable
58 degree of decoupling.
59 How this is reflected in the dependency structure, that is, in the classes from other packages
60 that the classes of a package depend on, is described in the following.
61 As was also mentioned, the information presented here in turn fed back into the
62 package partitioning, which changed as a consequence.
63
64 Dependencies of or on package \code{helpers} are not considered in the following,
65 since this package is meant precisely to offer services used by many other packages.
66 In fact, all facilities provided by \code{helpers} could just as well be part of
67 the \name{Java} API, but unfortunately are not.
68 The current dependency structure, factoring in this restriction, is shown in figure
69 \ref{arch_fig_packages} and reveals a remarkably tidy system of dependencies.
70
71 \begin{figure}[H]\begin{center}
72                 \includegraphics[scale=0.86]{Images/package_graph.pdf}
73                 \caption[Package dependencies in \myprog{}]{Package dependencies in \myprog{}.
74                         ``$\rightarrow$'' means ``depends on''.}
75                 \label{arch_fig_packages}
76 \end{center}\end{figure}
77
78 Though there are quite a number of dependencies ($19$, to be precise), many of them
79 ($8$, thus nearly half) can be attributed to one central package in the middle,
80 \code{settings}.
81 This may seem odd at first glance, considering that most of the edges of the node
82 representing package \code{settings} are outgoing edges and only one is incoming,
83 whereas in a design where the settings are configured from within a single package
84 and accessed from many other packages this would be the other way round.
85 The reason for this arrangement is that, as described in section \fullref{interface},
86 all settings in \myprog{} are configured per bootstrapping job
87 (there are no global settings) and so \code{settings}
88 contains a class \code{Job} (and currently no other classes).
89 \code{Job} represents the configuration of a bootstrapping job but also
90 provides a \code{perform()} method combining the facilities offered by the other
91 packages.
92
93 This way, the \code{perform()} method of the \code{Job} class acts as the central
94 driver performing the bootstrapping process, reducing the \code{main()} method to
95 only $7$ lines of code and turning \code{settings} into something like an externalized
96 part of the \code{main} package.
97 If, in a future version of the program, this approach is changed and global settings or
98 configuration files are introduced, \code{settings} will still be the central package,
99 leaving the package structure and dependencies unchanged,
100 since in either case it contains information used by many other packages.
101 This is why it was not renamed to, for example, \code{driver}, which was
102 considered, since at first glance it seems somewhat unnatural to have the driver class reside in
103 a package called ``settings''.
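
The role of \code{Job} described above can be illustrated by a minimal \name{Java} sketch.
It is not the actual implementation of \myprog{}: the field names, the step descriptions and
the wrapper class \code{JobSketch} are assumptions chosen for illustration only.
The sketch merely shows how a class that holds per-job settings can also act as the driver,
so that \code{main()} stays very short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of the member names below are taken from the real program.
class JobSketch {
    // Holds the settings of one bootstrapping job and drives the process.
    static final class Job {
        // Per-job settings (assumed examples); there are no global settings.
        private final String databaseUrl;
        private final String outputFormat;

        Job(String databaseUrl, String outputFormat) {
            this.databaseUrl = databaseUrl;
            this.outputFormat = outputFormat;
        }

        // Central driver: combines the facilities of the other packages in order.
        // Each string stands in for a call into one of those packages.
        List<String> perform() {
            List<String> steps = new ArrayList<>();
            steps.add("retrieve schema from " + databaseUrl);    // package database
            steps.add("bootstrap specification");                 // package bootstrapping
            steps.add("print specification as " + outputFormat);  // package output
            return steps;
        }
    }

    // With the driver logic in Job, main() only creates a job and runs it.
    public static void main(String[] args) {
        Job job = new Job("jdbc:example://localhost/db", "osl");
        for (String step : job.perform()) {
            System.out.println(step);
        }
    }
}
```

In this arrangement the dependencies point from the package holding \code{Job} to the packages
it calls, which matches the mostly outgoing edges of \code{settings} discussed above.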
104
105 Previous versions of \myprog{} had a considerably more complicated package dependency structure,
106 depicted in figure \ref{arch_fig_packages_earlier}.
107
108 \begin{figure}[H]\begin{center}
109                 \includegraphics[scale=0.86]{Images/package_graph_earlier.pdf}
110                 \caption[Package dependencies in earlier versions of \myprog{}]{Package dependencies
111                         in earlier versions of \myprog{}. ``$\rightarrow$'' again means ``depends on''.}
112                 \label{arch_fig_packages_earlier}
113 \end{center}\end{figure}
114
115 %Package \code{helpers} depends on package \code{database}, which provides the \code{static}
116 %method \code{getSQLTypeName}.
117
118 \subsection{Fine structuring}
119 \label{fine}
120 \subsubsection{Package contents}
121 \label{package_details}
122 While the packages in \myprog{} are introduced and described in section \fullref{coarse},
123 the classes that comprise them are addressed in this section.
124
125 Table \ref{arch_tbl_classes} lists the classes each package contains.
126 The packages \code{cli}, \code{main}, \code{osl} and \code{settings} contain only
127 one class each, while by far the most extensive package is \code{database},
128 containing $17$ classes.
129
130 \begin{table}[H]
131         \begin{multicols}{2}\begin{itemize} %\KOMAoption{fontsize}{\smallerfontsize{}}
132                         \item \code{bootstrapping}
133                         \begin{itemize}
134                                 \item \code{Bootstrapping}
135                                 \item \code{DirectMappingURIBuilder}
136                                 \item \code{URIBuilder}
137                         \end{itemize}
138                         \item \code{cli}
139                         \begin{itemize}
140                                 \item \code{CLIDatabaseInteraction}
141                         \end{itemize}
142                         \item \code{database}
143                         \begin{itemize}
144                                 \item \code{Column}
145                                 \item \code{ColumnSet}
146                                 \item \code{DatabaseException}
147                                 \item \code{DBSchema}
148                                 \item \code{ForeignKey}
149                                 \item \code{Helpers}
150                                 \item \code{Key}
151                                 \item \code{PrimaryKey}
152                                 \item \code{ReadableColumn}
153                                 \item \code{ReadableColumnSet}
154                                 \item \code{ReadableForeignKey}
155                                 \item \code{ReadableKey}
156                                 \item \code{ReadablePrimaryKey}
157                                 \item \code{RetrieveDBSchema}
158                                 \item \code{SQLType}
159                                 \item \code{Table}
160                                 \item \code{TableSchema}
161                         \end{itemize}
162                         \item \code{helpers}
163                         \begin{itemize}
164                                 \item \code{MapValueIterable}
165                                 \item \code{MapValueIterator}
166                                 \item \code{ReadOnlyIterable}
167                                 \item \code{ReadOnlyIterator}
168                                 \item \code{UserAbortException}
169                         \end{itemize}
170                         \newpage
171                         \item \code{log}
172                         \begin{itemize}
173                                 \item \code{ConsoleDiagnosticOutputHandler}
174                                 \item \code{GlobalLogger}
175                         \end{itemize}
176                         \item \code{main}
177                         \begin{itemize}
178                                 \item \code{Main}
179                         \end{itemize}
180                         \item \code{osl}
181                         \begin{itemize}
182                                 \item \code{OSLSpecification}
183                         \end{itemize}
184                         \item \code{output}
185                         \begin{itemize}
186                                 \item \code{ObjectSpecPrinter}
187                                 \item \code{OSLSpecPrinter}
188                                 \item \code{SpecPrinter}
189                         \end{itemize}
190                         \item \code{settings}
191                         \begin{itemize}
192                                 \item \code{Job}
193                         \end{itemize}
194                         \item \code{specification}
195                         \begin{itemize}
196                                 \item \code{AttributeMap}
197                                 \item \code{EntityMap}
198                                 \item \code{IdentifierMap}
199                                 \item \code{InvalidSpecificationException}
200                                 \item \code{OBDAMap}
201                                 \item \code{OBDASpecification}
202                                 \item \code{RelationMap}
203                                 \item \code{SubtypeMap}
204                                 \item \code{TranslationTable}
205                         \end{itemize}
206                         \item \code{test}
207                         \begin{itemize}
208                                 \item \code{CreateTestDBSchema}
209                                 \item \code{GetSomeDBSchema}
210                         \end{itemize}
211                 \end{itemize}\end{multicols} %\KOMAoption{fontsize}{\myfontsize{}}
212                 \caption{Class attachment to packages in \myprog{}}
213                 \label{arch_tbl_classes}
214 \end{table}
215
216 \subsubsection{Class organization}
217 \label{hierarchies}
218 Organizing classes in a structured, obvious manner, such that classes have well-defined
219 roles, behave in an intuitive way and ideally represent artifacts from the world
220 modeled in the program directly \cite{str3}, is a prerequisite for making the code
221 clear and comprehensible on the architectural level.
222
223 Section \fullref{code_classes} describes the identification and naming scheme for
224 the classes in \myprog{}.
225 However, it is also important to arrange these classes in useful, comprehensible
226 class hierarchies to avoid code duplication, make appropriate use of the type system,
227 ease the design of precise and flexible interfaces and enhance the
228 adaptability and extensibility of the program.
229 Figure \ref{arch_fig_hierachies} shows the class hierarchies in \myprog{},
230 while standalone classes are listed in table \ref{arch_tbl_lone_classes}.
231
232 \begin{figure}[H]\begin{center}
233                 \ContinuedFloat*
234                 \includegraphics[scale=0.86]{Images/inherit_graph_8.pdf}
235                 \label{first_hierarchy}
236 \end{center}\end{figure}
237 \vspace{6px}
238 \begin{figure}[H]\begin{center}
239                 \ContinuedFloat*
240                 \includegraphics[scale=0.86]{Images/inherit_graph_5.pdf}
241         \end{center}\end{figure}
242 \begin{figure}[H]\begin{center}
243                 \ContinuedFloat*
244                 \includegraphics[scale=0.86]{Images/inherit_graph_7.pdf}
245 \end{center}\end{figure}
246 \begin{figure}[H]\begin{center}
247                 \ContinuedFloat*
248                 \includegraphics[scale=0.86]{Images/inherit_graph_19.pdf}
249 \end{center}\end{figure}
250 \begin{figure}[H]\begin{center}
251                 \ContinuedFloat*
252                 \includegraphics[scale=0.86]{Images/inherit_graph_1.pdf}
253 \end{center}\end{figure}
254 \begin{figure}[H]\begin{center}
255                 \ContinuedFloat*
256                 \includegraphics[scale=0.86]{Images/inherit_graph_17.pdf}
257 \end{center}\end{figure}
258 \begin{figure}[H]\begin{center}
259                 \ContinuedFloat*
260                 \includegraphics[scale=0.86]{Images/inherit_graph_21_extended.pdf}
261 \end{center}\end{figure}
262 \begin{figure}[H]\begin{center}
263                 \ContinuedFloat*
264                 \includegraphics[scale=0.86]{Images/inherit_graph_13.pdf}
265 \end{center}\end{figure}
266 \begin{figure}[H]\begin{center}
267                 \ContinuedFloat*
268                 \includegraphics[scale=0.86]{Images/inherit_graph_3.pdf}
269 \end{center}\end{figure}
270 \begin{figure}[H]\begin{center}
271                 \ContinuedFloat*
272                 \includegraphics[scale=0.86]{Images/inherit_graph_18.pdf}
273 \end{center}\end{figure}
274 \begin{figure}[H]\begin{center}
275                 \ContinuedFloat*
276                 \includegraphics[scale=0.86]{Images/inherit_graph_12.pdf}
277 \end{center}\end{figure}
278 \begin{figure}[H]\begin{center}
279                 \ContinuedFloat*
280                 \includegraphics[scale=0.86]{Images/inherit_graph_4.pdf}
281                 \setcounter{figure}{2} %TODO: variable
282                 \caption[Class hierarchies in \myprog{}]{Class hierarchies in \myprog{}.
283                         Interface names are italicized,
284                         external classes or interfaces are hemmed with a gray frame.}
285                 \label{arch_fig_hierachies}
286 \end{center}\end{figure}
287
288 \begin{table}[H]\begin{center}
289                 \begin{tabular}{l}
290                         \itm{} \code{main.Main}\\
291                         \itm{} \code{database.Helpers}\\
292                         \itm{} \code{database.RetrieveDBSchema}\\
293                         \itm{} \code{database.SQLType}\\
294                         \itm{} \code{database.Table}\\
295                         \itm{} \code{specification.OBDASpecification}\\
296                         \itm{} \code{osl.OSLSpecification}\\
297                         \itm{} \code{bootstrapping.Bootstrapping}\\
298                         \itm{} \code{cli.CLIDatabaseInteraction}\\
299                         \itm{} \code{log.GlobalLogger}\\
300                         \itm{} \code{test.CreateTestDBSchema}\\
301                         \itm{} \code{test.GetSomeDBSchema}\\
302                 \end{tabular}\\
303         \caption{Standalone classes in \myprog{}}
304         \label{arch_tbl_lone_classes}
305 \end{center}\end{table}
306
307 Note that every class hierarchy has at least one \code{interface} at its top.
308 Classes not belonging to a class hierarchy were deliberately not given an
309 artificial interface, which would have made them part of a (small) class hierarchy TODO.
310 The scheme often recommended in the \name{Java} world TODO
311 of giving every class an interface it \code{implements} was not followed.
312 Instead, the approach described by Stroustrup \cite{str4} was adopted: to provide a rich set of
313 so-called ``concrete types'' not designed for use within class hierarchies, which
314 ``build the foundation of every well-designed program'' \cite{str4} TODO.
315 The details of this consideration are explained in section \fullref{code_interfaces}.
316 In fact, many useful types were already offered by the \name{Java} API,
317 so they were not re-implemented.
318
319 Class \code{Column} with its interface \code{ReadableColumn} is an exception TODO
320 in that it was given an interface although it is basically a concrete type.
321 The reason for this is the chosen way to implement const correctness,
322 described in section \nameref{const} (which is part of section \fullref{code_classes}).
323 This technique not only forced class \code{Column} to
324 \code{implement} an interface, thus needlessly making it part of a class hierarchy,
325 but also complicated the structure of some class hierarchies.
326 Consider the class hierarchy around \code{ColumnSet},
327 shown in the first graph TODO of figure \ref{arch_fig_hierachies}.
328 Admittedly, it seems overly complicated at first glance.
329 But this complexity is introduced solely by the artificial
330 \code{Readable...} interfaces;
331 if \name{Java} provided a mechanism like \name{C++}'s \code{const},
332 this hierarchy would be as simple as in the following graph:
333
334 \begin{figure}[H]\begin{center}
335                 \includegraphics[scale=0.86]{Images/inherit_graph_8_simplified.pdf}
336                 \caption{\code{ColumnSet} class hierarchy in \myprog{} -- simplified}
337                 \label{arch_fig_colset_hierarchy_simplified}
338 \end{center}\end{figure}
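
How a \code{Readable...} interface can stand in for \name{C++}'s \code{const} is sketched below.
This is a minimal illustration under assumptions, not the code of \myprog{}:
only the names \code{Column} and \code{ReadableColumn} come from the program, while all members,
the method \code{describe()} and the wrapper class \code{ConstSketch} are hypothetical.

```java
// Hypothetical sketch of the read-only-interface technique; member names are assumptions.
class ConstSketch {
    // Read-only view: exposes accessors only, comparable to a const reference in C++.
    interface ReadableColumn {
        String getName();
        boolean isNonNull();
    }

    // Concrete type: implements the read-only view and adds the mutators.
    static final class Column implements ReadableColumn {
        private String name;
        private boolean nonNull;

        Column(String name, boolean nonNull) {
            this.name = name;
            this.nonNull = nonNull;
        }

        @Override public String getName() { return name; }
        @Override public boolean isNonNull() { return nonNull; }

        void setName(String name) { this.name = name; }
        void setNonNull(boolean nonNull) { this.nonNull = nonNull; }
    }

    // Accepts the read-only view, so the compiler rejects any mutator call in here.
    static String describe(ReadableColumn column) {
        return column.getName() + (column.isNonNull() ? " NOT NULL" : "");
    }

    public static void main(String[] args) {
        Column column = new Column("id", false);
        column.setNonNull(true);               // allowed: the caller holds the mutable type
        System.out.println(describe(column));  // the callee sees only ReadableColumn
    }
}
```

The price of this technique is visible in the sketch: a type that would otherwise be a plain
concrete class needs an accompanying interface, which is what enlarges the hierarchies shown above.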
339
340 However, TODO: still good
341
342 TODO: rest self-explanatory
343
344 For more information about the program structure on the class level,
345 see section \fullref{code}.