%________________________________________________________
\section{Introduction}
\label{Note:INTRO}

Based on the official ALICE documents
\cite{Note:RefPPR,Note:RefComputingTDR}, the computing model of the
experiment can be described as follows:
\begin{itemize}
\item Tier 0 provides permanent storage of the raw data, distributes
  them to the Tier 1s and performs the calibration and alignment
  tasks as well as the first reconstruction pass. The calibration
  procedure will also be addressed by PROOF clusters such as the CERN
  Analysis Facility (CAF) \cite{Note:RefCAF}.

\item Tier 1s outside CERN collectively provide permanent storage of
  a copy of the raw data. All Tier 1s perform the subsequent
  reconstruction passes and the scheduled analysis tasks.

\item Tier 2s generate and reconstruct the simulated Monte Carlo
  data and perform the chaotic analysis submitted by the physicists.
\end{itemize}

The experience of past experiments shows that typical data analysis
(the so-called chaotic analysis) will consume a large fraction of the
total amount of resources. The time needed to reconstruct and analyze
events depends mainly on the reconstruction and analysis algorithms.
The GRID user data analysis has been developed and tested with two
approaches: the asynchronous (batch) and the synchronous
(interactive) analysis.

In this note we describe the distributed framework and the steps
needed to analyze data. We also provide practical examples for the
users based on the new analysis framework which has been adopted by
the collaboration \cite{Note:RefAnalysisFramework}. Before going into
detail on the different analysis tasks, we would like to address the
general steps a user needs to take before submitting an analysis job:

\begin{itemize}
\item Code validation: in order to validate the code, users should
  copy a few AliESDs.root files locally and try to analyze them by
  following the instructions listed in section \ref{Note:LOCAL}.

\item Interactive analysis: once users are satisfied with both the
  sanity of the code and the corresponding results, the next step is
  to increase the statistics by submitting an interactive job that
  analyzes ESDs stored on the GRID. This task is performed in such a
  way as to simulate the behavior of a GRID worker node. If this step
  succeeds, the batch job is very likely to be executed properly.
  Detailed instructions on how to perform this task are listed in
  section \ref{Note:INTERACTIVE}.

\item Batch analysis: finally, if users are satisfied with the
  results of the previous step, a batch job can be launched that
  takes advantage of the whole GRID infrastructure in order to
  analyze files stored in different storage elements. This step is
  covered in detail in section \ref{Note:BATCH}.
\end{itemize}
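
The local code validation step can be sketched as a ROOT macro along
the following lines, assuming the \texttt{AliESDEvent} interface of
AliRoot; the file paths and the event loop body are merely
illustrative placeholders, while \texttt{esdTree} is the standard
name of the ESD tree:

\begin{verbatim}
// validate.C -- a minimal local validation sketch.
// The file paths below are hypothetical; point them to the
// AliESDs.root files copied locally from the GRID.
void validate()
{
  TChain *chain = new TChain("esdTree");
  chain->Add("./run1/AliESDs.root");
  chain->Add("./run2/AliESDs.root");

  AliESDEvent *esd = new AliESDEvent();
  esd->ReadFromTree(chain);  // connect the ESD branches

  for (Long64_t i = 0; i < chain->GetEntries(); i++) {
    chain->GetEntry(i);
    // ... user analysis code for this event ...
    printf("Event %lld: %d tracks\n", i,
           esd->GetNumberOfTracks());
  }
}
\end{verbatim}

If such a macro runs without errors on the local copies, the same
analysis code can be moved on to the interactive and batch stages.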

It should be pointed out that what we describe in this note involves
the whole metadata machinery of the ALICE experiment: that is, both
the file/run level metadata \cite{Note:RefFileCatalogMetadataNote}
and the \tag\ \cite{Note:RefEventTagNote}. The latter is used
extensively because, apart from providing an event filtering
mechanism to the users and thus reducing the overall analysis time
significantly \cite{Note:RefEventTagNote}, it also provides a
transparent way to retrieve the desired input data collection in the
proper format (a chain of ESD files) which can be analyzed directly.
If, on the other hand, the \tag\ is not used, then apart from not
being able to utilize the event filtering, users also have to create
the input data collection (a chain of ESD files) manually.
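
For illustration, creating such an input data collection manually
(i.e.\ without the \tag) amounts to adding every ESD file to a chain
by hand; the GRID file names below are hypothetical:

\begin{verbatim}
// Without the tag system the chain of ESD files has to be
// assembled by hand, one Add() call per file
// (hypothetical GRID file names):
TChain *chain = new TChain("esdTree");
chain->Add("alien:///alice/sim/run001/AliESDs.root");
chain->Add("alien:///alice/sim/run002/AliESDs.root");
// ... and so on, for every file of the data set ...
\end{verbatim}

With the \tag, by contrast, the equivalent chain is returned directly
by the event tag machinery once the user-defined selection criteria
have been applied.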