Minor change
[u/philim/db2osl_thesis.git] / program_arch.tex
index 5c5561f..245e751 100644 (file)
 \label{arch}
 \subsection{Libraries used}
 \subsection{Coarse structuring}
+\label{coarse}
 TODO: overall description, modularity, extendability, ex: easy to add new in-/output formats
 TODO: mapping profiles (maybe better in next subsection)
-TODO: package description
-TODO: package interaction description
+
+\subsubsection{Package partitioning}
+The $45$ classes of \myprog{} were assigned to $11$ packages, each containing
+classes responsible for the same area of operation or taking over similar roles.
+Care was taken to divide the program sensibly: on the one hand, the packages should be
+meaningful and have obvious areas of responsibility; on the other hand, the separation
+should be sharp and yield a notable degree of decoupling.
+The packages were deliberately not nested but laid out flat.
+Since this has no functional implications \cite{java} and is rather an
+implementation detail, it is explained further in section \fullref{code_packages}.
+
+The packages are introduced and described in table \ref{arch_tbl_packages}.
+The lists of classes each package contains are given in table \ref{arch_tbl_classes}
+in the next section \fullref{fine}.
+
+\begin{table}[H]
+       \begin{tabular}{p{3cm}|p{13cm}} %\KOMAoption{fontsize}{\smallerfontsize{}}
+               Package & Description\\
+               \hline
+               \code{bootstrapping} & Classes performing bootstrapping\\
+               \code{cli} & Classes related to the command line interface of \myprog{}\\
+               \code{database} & Classes related to the representation of relational databases and attached tasks\\
+               \code{helpers} & Helper classes used program-wide\\
+               \code{log} & Classes related to logging and diagnostic output\\
+               \code{main} & The \code{Main} class\\
+               \code{osl} & Classes representing OBDA specifications (as described in \cite{eng}) using the OBDA Specification Language (\osl{})\\
+               \code{output} & Classes used to output OBDA specifications as described in \cite{eng}\\
+               \code{settings} & Classes related to program and job settings (including command line parsing)\\
+               \code{specification} & Classes representing (parts of) OBDA specifications (as described in \cite{eng}) directly, without involving \osl{}\\
+               \code{test} & Classes offering testing facilities\\
+       \end{tabular} %\KOMAoption{fontsize}{\myfontsize{}}
+       \caption{Descriptions of the packages in \myprog{}}
+       \label{arch_tbl_packages}
+\end{table}
+
+As stated, the partitioning of the program into these packages was not left to
+intuition alone. The package interaction under a given structure was analyzed,
+and the structure was changed where necessary to achieve the desired pronounced
+decoupling with few, comprehensible dependencies.
+
+The \code{main} package was introduced to make the \code{Main} class,
+which carries information needed by other packages
+(most prominently, the program name),
+\code{import}able from inside these packages.
+For this, \code{Main} must not reside in the unnamed (``default'')
+package \cite{java}.
+
+\subsubsection{Package interaction}
+As mentioned, the structuring of the packages was driven by the aim of achieving a notable
+degree of decoupling.
+How this is reflected in the dependency structure, that is, in the classes from other
+packages that the classes of a package depend on, is described in the following.
+As was also mentioned, the findings presented here in turn influenced the
+package partitioning, which was changed as a consequence.
+
+Dependencies of or on package \code{helpers} are not considered in the following,
+since this package was meant precisely to offer services used by many other packages.
+In fact, all facilities provided by \code{helpers} could just as well be part of
+the \name{Java} API, but unfortunately are not.
+The current dependency structure, factoring in this restriction, is shown in figure
+\ref{arch_fig_packages} and reveals a reasonably tidy system of dependencies.
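+
+To illustrate the general-purpose character of the \code{helpers} facilities,
+a read-only iterator wrapper might look roughly as follows.
+This is only a minimal sketch: the class name \code{ReadOnlyIterator} is taken from
+table \ref{arch_tbl_classes}, but the constructor and the implementation shown here
+are assumptions and not the actual code of \myprog{}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of a general-purpose helper; the real class may differ.
final class ReadOnlyIterator<T> implements Iterator<T> {
    private final Iterator<? extends T> wrapped;

    ReadOnlyIterator(Iterator<? extends T> wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public boolean hasNext() {
        return wrapped.hasNext();
    }

    @Override
    public T next() {
        return wrapped.next();
    }

    // Modification through the iterator is rejected.
    @Override
    public void remove() {
        throw new UnsupportedOperationException("read-only iterator");
    }
}

final class ReadOnlyIteratorDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("id", "name"));
        Iterator<String> it = new ReadOnlyIterator<>(names.iterator());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

+Nothing in this sketch is specific to bootstrapping or databases, which is the sense
+in which such a class could equally be part of a standard library.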
+
+\begin{figure}[H]\begin{center}
+               \includegraphics[scale=0.86]{Images/package_graph.pdf}
+               \caption[Package dependencies in \myprog{}]{Package dependencies in \myprog{}.
+                       ``$\rightarrow$'' means ``depends on''.}
+               \label{arch_fig_packages}
+\end{center}\end{figure}
+
+Though there are quite a number of dependencies ($19$, to be precise), many of them
+($8$, thus nearly half) trace back to one central package in the middle,
+\code{settings}.
+This may seem odd at first glance, considering that most of the edges of the node
+representing package \code{settings} are outgoing edges and only one is incoming,
+whereas in a design where the settings are configured from within a single package
+and accessed from many other packages it would be the other way round.
+The reason for this constellation is that, as described in section \fullref{interface},
+all settings in \myprog{} are configured per bootstrapping job
+(there are no global settings) and so \code{settings}
+contains a class \code{Job} (and currently no other classes).
+\code{Job} represents the configuration of a bootstrapping job but also
+provides a \code{perform()} method combining the facilities offered by the other
+packages.
+
+This way, the \code{perform()} method of the \code{Job} class acts as the central
+driver performing the bootstrapping process, reducing the \code{main()} method to
+only $7$ lines of code and turning \code{settings} into something like an externalized
+part of the \code{main} package.
+If, in a future version of the program, this approach is changed and global settings or
+configuration files are introduced, \code{settings} will still be the central package,
+leaving the package structure and dependencies unchanged,
+since it contains information used by many other packages either way.
+This is why it was not renamed to, for example, \code{driver}, which was
+considered, since at first glance it seems rather unnatural to have the driver class
+reside in a package called ``settings''.
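+
+The double role of \code{Job} described above, holding the configuration of a
+bootstrapping job and driving the process, can be sketched as follows.
+Only the names \code{Job} and \code{perform()} come from the text;
+the fields, the constructor, the recorded steps and the demo class are hypothetical
+and merely stand in for calls into the other packages.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: fields and steps are assumptions, not the actual code.
final class Job {
    private final String databaseName;  // assumed per-job setting
    private final String outputFormat;  // assumed per-job setting
    private final List<String> performedSteps = new ArrayList<>();

    Job(String databaseName, String outputFormat) {
        this.databaseName = databaseName;
        this.outputFormat = outputFormat;
    }

    // Central driver: combines the facilities that other packages would offer.
    // Each step is only recorded here instead of being delegated.
    void perform() {
        performedSteps.add("retrieve schema of " + databaseName);
        performedSteps.add("bootstrap specification");
        performedSteps.add("print specification as " + outputFormat);
    }

    List<String> performedSteps() {
        return Collections.unmodifiableList(performedSteps);
    }
}

final class JobDemo {
    // With the driver inside Job, the entry point stays very short.
    public static void main(String[] args) {
        Job job = new Job("exampledb", "osl");
        job.perform();
        for (String step : job.performedSteps()) {
            System.out.println(step);
        }
    }
}
```

+In such a layout the package containing \code{Job} depends on the packages whose
+facilities it combines, which matches the many outgoing edges of \code{settings}.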
+
+Previous versions of \myprog{} had a considerably more complicated package dependency structure,
+depicted in figure \ref{arch_fig_packages_earlier}.
+
+\begin{figure}[H]\begin{center}
+               \includegraphics[scale=0.86]{Images/package_graph_earlier.pdf}
+               \caption[Package dependencies in earlier versions of \myprog{}]{Package dependencies
+                       in earlier versions of \myprog{}. ``$\rightarrow$'' again means ``depends on''.}
+               \label{arch_fig_packages_earlier}
+\end{center}\end{figure}
+
+%Package \code{helpers} depends on package \code{database}, which provides the \code{static}
+%method \code{getSQLTypeName}.
 
 \subsection{Fine structuring}
+\label{fine}
+\subsubsection{Package contents}
+\label{package_details}
+While the packages in \myprog{} are introduced and described in section \fullref{coarse},
+the classes that comprise them are addressed in this section.
+
+Table \ref{arch_tbl_classes} lists the classes each package contains.
+The packages \code{cli}, \code{main}, \code{osl} and \code{settings} contain only
+one class each, while by far the most extensive package is \code{database},
+containing $17$ classes.
+
+\begin{table}[H]
+       \begin{multicols}{2}\begin{itemize} %\KOMAoption{fontsize}{\smallerfontsize{}}
+                       \item \code{bootstrapping}
+                       \begin{itemize}
+                               \item \code{Bootstrapping}
+                               \item \code{DirectMappingURIBuilder}
+                               \item \code{URIBuilder}
+                       \end{itemize}
+                       \item \code{cli}
+                       \begin{itemize}
+                               \item \code{CLIDatabaseInteraction}
+                       \end{itemize}
+                       \item \code{database}
+                       \begin{itemize}
+                               \item \code{Column}
+                               \item \code{ColumnSet}
+                               \item \code{DatabaseException}
+                               \item \code{DBSchema}
+                               \item \code{ForeignKey}
+                               \item \code{Helpers}
+                               \item \code{Key}
+                               \item \code{PrimaryKey}
+                               \item \code{ReadableColumn}
+                               \item \code{ReadableColumnSet}
+                               \item \code{ReadableForeignKey}
+                               \item \code{ReadableKey}
+                               \item \code{ReadablePrimaryKey}
+                               \item \code{RetrieveDBSchema}
+                               \item \code{SQLType}
+                               \item \code{Table}
+                               \item \code{TableSchema}
+                       \end{itemize}
+                       \item \code{helpers}
+                       \begin{itemize}
+                               \item \code{MapValueIterable}
+                               \item \code{MapValueIterator}
+                               \item \code{ReadOnlyIterable}
+                               \item \code{ReadOnlyIterator}
+                               \item \code{UserAbortException}
+                       \end{itemize}
+                       \newpage
+                       \item \code{log}
+                       \begin{itemize}
+                               \item \code{ConsoleDiagnosticOutputHandler}
+                               \item \code{GlobalLogger}
+                       \end{itemize}
+                       \item \code{main}
+                       \begin{itemize}
+                               \item \code{Main}
+                       \end{itemize}
+                       \item \code{osl}
+                       \begin{itemize}
+                               \item \code{OSLSpecification}
+                       \end{itemize}
+                       \item \code{output}
+                       \begin{itemize}
+                               \item \code{ObjectSpecPrinter}
+                               \item \code{OSLSpecPrinter}
+                               \item \code{SpecPrinter}
+                       \end{itemize}
+                       \item \code{settings}
+                       \begin{itemize}
+                               \item \code{Job}
+                       \end{itemize}
+                       \item \code{specification}
+                       \begin{itemize}
+                               \item \code{AttributeMap}
+                               \item \code{EntityMap}
+                               \item \code{IdentifierMap}
+                               \item \code{InvalidSpecificationException}
+                               \item \code{OBDAMap}
+                               \item \code{OBDASpecification}
+                               \item \code{RelationMap}
+                               \item \code{SubtypeMap}
+                               \item \code{TranslationTable}
+                       \end{itemize}
+                       \item \code{test}
+                       \begin{itemize}
+                               \item \code{CreateTestDBSchema}
+                               \item \code{GetSomeDBSchema}
+                       \end{itemize}
+               \end{itemize}\end{multicols} %\KOMAoption{fontsize}{\myfontsize{}}
+               \caption{Assignment of classes to packages in \myprog{}}
+               \label{arch_tbl_classes}
+\end{table}
+
+\subsubsection{Class organization}
+\label{hierarchies}
+Organizing classes in a structured, obvious manner, such that they have well-defined
+roles, behave intuitively and ideally represent artifacts from the world
+modeled in the program directly \cite{str3}, is a prerequisite for making the code
+clear and comprehensible on the architectural level.
+
+Section \fullref{code_classes} describes the identification and naming scheme for
+the classes in \myprog{}.
+However, it is also important to arrange these classes in useful, comprehensible
+class hierarchies to avoid code duplication, make appropriate use of the type system,
+ease the design of precise and flexible interfaces and enhance the
+adaptability and extensibility of the program.
+Figure \ref{arch_fig_hierachies} shows the class hierarchies in \myprog{},
+while standalone classes are listed in table \ref{arch_tbl_lone_classes}.
+
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_8.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_8.pdf}
+               \label{first_hierarchy}
 \end{center}\end{figure}
+\vspace{6px}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_7.png}
-\end{center}\end{figure}
+               \includegraphics[scale=0.86]{Images/inherit_graph_5.pdf}
+       \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_5.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_7.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_19.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_19.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_1.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_1.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_17.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_17.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_21.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_21_extended.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_13.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_13.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_3.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_3.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_18.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_18.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_12.png}
+               \includegraphics[scale=0.86]{Images/inherit_graph_12.pdf}
 \end{center}\end{figure}
 \begin{figure}[H]\begin{center}
                \ContinuedFloat*
-               \includegraphics[scale=0.86]{Images/inherit_graph_4.png}
-               \setcounter{figure}{1}
-               \caption{Class hierarchies in \myprog{}}
-               \label{arch_fig_inheritance}
+               \includegraphics[scale=0.86]{Images/inherit_graph_4.pdf}
+               \setcounter{figure}{2} %TODO: variable
+               \caption[Class hierarchies in \myprog{}]{Class hierarchies in \myprog{}.
+                       Interface names are italicized,
+                       external classes or interfaces are hemmed with a gray frame.}
+               \label{arch_fig_hierachies}
 \end{center}\end{figure}
 
-
 \begin{table}[H]\begin{center}
                \begin{tabular}{l}
                        \itm{} \code{main.Main}\\
@@ -77,7 +301,45 @@ TODO: package interaction description
                        \itm{} \code{test.GetSomeDBSchema}\\
                \end{tabular}\\
        \caption{Standalone classes in \myprog{}}
-       \label{arch_tbl_classes}
+       \label{arch_tbl_lone_classes}
 \end{center}\end{table}
 
-%For more information about the program structure on the class level, see section \fullref{code}.
+Note that every class hierarchy has at least one \code{interface} at its top.
+Classes not belonging to a class hierarchy were deliberately not given an interface
+artificially, which would have made them part of a (small) class hierarchy TODO.
+The scheme often recommended in the Java world TODO
+of giving every class an interface it \code{implements} was deliberately not followed.
+Instead, the approach described by Stroustrup \cite{str4} was taken, which is to provide
+a rich set of so-called ``concrete types'' not designed for use within class hierarchies, which
+``build the foundation of every well-designed program \cite{str4}'' TODO.
+The details of this consideration are explained in section \fullref{code_interfaces}.
+In fact, many useful types were already offered by the \name{Java} API,
+so they were not re-implemented.
+
+Class \code{Column} with its interface \code{ReadableColumn} is an exception TODO
+in that it was given an interface although it is basically a concrete type.
+The reason for this is the chosen way to implement const correctness,
+described in section \nameref{const} (which is part of section \fullref{code_classes}).
+This technique not only forced class \code{Column} to
+\code{implement} an interface, thus needlessly making it part of a class hierarchy,
+but also complicated the structure of some class hierarchies.
+Consider the class hierarchy around \code{ColumnSet},
+shown in the first graph TODO of figure \ref{arch_fig_hierachies}.
+Admittedly, it seems overly complicated at first glance.
+But this complexity is introduced solely by the artificial
+\code{Readable...} interfaces;
+if \name{Java} provided a mechanism like \name{C++}'s \code{const},
+this hierarchy would be as simple as in the following graph:
+
+\begin{figure}[H]\begin{center}
+               \includegraphics[scale=0.86]{Images/inherit_graph_8_simplified.pdf}
+               \caption{\code{ColumnSet} class hierarchy in \myprog{} -- simplified}
+               \label{arch_fig_colset_hierarchy_simplified}
+\end{center}\end{figure}
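+
+The way a \code{Readable...} interface can stand in for \name{C++}'s \code{const}
+may be illustrated with a minimal sketch.
+The names \code{Column} and \code{ReadableColumn} are taken from the text above;
+the methods \code{getName()} and \code{setName()} as well as the demo class are
+assumptions made for illustration only.

```java
// Hypothetical sketch; the member names are assumptions, not the actual code.
interface ReadableColumn {
    String getName();  // read access only
}

final class Column implements ReadableColumn {
    private String name;

    Column(String name) {
        this.name = name;
    }

    @Override
    public String getName() {
        return name;
    }

    // Write access exists only on the class, not on the interface.
    void setName(String name) {
        this.name = name;
    }
}

final class ConstDemo {
    // Taking a ReadableColumn plays the role of a "const Column&" parameter
    // in C++: calling column.setName(...) here would not compile.
    static String describe(ReadableColumn column) {
        return "column " + column.getName();
    }

    public static void main(String[] args) {
        Column column = new Column("id");
        column.setName("person_id");  // allowed through the full type
        System.out.println(describe(column));
    }
}
```

+The price of this technique is visible even in the sketch: a type that would
+otherwise stand alone needs an accompanying interface and thereby becomes part of
+a class hierarchy.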
+
+However, TODO: still good
+
+TODO: rest self-explanatory
+
+For more information about the program structure on the class level,
+see section \fullref{code}.