/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *                                                                        *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 *                                                                        *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/

/*
$Log$
Revision 1.73  2007/12/14 19:31:36  acolla
Sending email to DCS experts is temporarily commented

Revision 1.72  2007/12/13 15:44:28  acolla
Run type added in mail sent to detector expert (eases understanding)

Revision 1.71  2007/12/12 14:56:14  jgrosseo
sending shuttle_ignore to ML also in case of 0 events

Revision 1.70  2007/12/12 13:45:35  acolla
Monalisa started in Collect() function. Alive message to monitor is sent at each Collect and every minute during preprocessor processing.

Revision 1.69  2007/12/12 10:06:29  acolla
in AliShuttle.cxx: SHUTTLE logbook is updated in case of invalid run times:

time_start==0 && time_end==0

logbook is NOT updated if time_start != 0 && time_end == 0, because it may mean that the run is still ongoing.

Revision 1.68  2007/12/11 10:15:17  acolla
Added marking SHUTTLE=DONE for invalid runs
(invalid start time or end time) and runs with totalEvents < 1

Revision 1.67  2007/12/07 19:14:36  acolla
in AliShuttleTrigger:

Added automatic collection of new runs on a regular time basis (settable from the configuration)

in AliShuttleConfig: new members

- triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
- mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)

in AliShuttle:

- logs now stored in logs/#RUN/DET_#RUN.log

Revision 1.66  2007/12/05 10:45:19  jgrosseo
changed order of arguments to TMonaLisaWriter

Revision 1.65  2007/11/26 16:58:37  acolla
Monalisa configuration added: host and table name

Revision 1.64  2007/11/13 16:15:47  acolla
DCS map is stored in a file in the temp folder where the detector is processed.
If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.

Revision 1.63  2007/11/02 10:53:16  acolla
Protection added to AliShuttle::CopyFileLocally

Revision 1.62  2007/10/31 18:23:13  acolla
Further development on the Shuttle:

- Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
are now built from /alice/data, e.g.:
/alice/data/2007/LHC07a/OCDB

the year and LHC period are taken from the Shuttle.
Raw metadata files are stored by GRP to:
/alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root

- Shuttle sends a mail to DCS experts each time DP retrieval fails.

Revision 1.61  2007/10/30 20:33:51  acolla
Improved managing of temporary folders, which weren't correctly handled.
Resolved bug introduced in StoreReferenceFile, which caused the SPD preprocessor to fail.

Revision 1.60  2007/10/29 18:06:16  acolla

New function StoreRunMetadataFile added to preprocessor and Shuttle interface
This function can be used by GRP only. It stores raw data tags merged file to the
raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).

KNOWN ISSUES:

1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
has been corrected in the root package on the Shuttle machine.

Revision 1.59  2007/10/05 12:40:55  acolla

Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).

Revision 1.58  2007/09/28 15:27:40  acolla

AliDCSClient "multiSplit" option added in the DCS configuration
in AliDCSMessage: variable MAX_BODY_SIZE set to 500000

Revision 1.57  2007/09/27 16:53:13  acolla
Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.

Revision 1.56  2007/09/14 16:46:14  jgrosseo
1) Connect and Close are called before and after each query, so one can
keep the same AliDCSClient object.
2) The splitting of a query is moved to GetDPValues/GetAliasValues.
3) Splitting interval can be specified in constructor

Revision 1.55  2007/08/06 12:26:40  acolla
Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
read from the run logbook.

Revision 1.54  2007/07/12 09:51:25  jgrosseo
removed duplicated log message in GetFile

Revision 1.53  2007/07/12 09:26:28  jgrosseo
updating hlt fxs base path

Revision 1.52  2007/07/12 08:06:45  jgrosseo
adding log messages in getfile... functions
adding not implemented copy constructor in alishuttleconfigholder

Revision 1.51  2007/07/03 17:24:52  acolla
root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.

Revision 1.50  2007/07/02 17:19:32  acolla
preprocessor is run in a temp directory that is removed when process is finished.

Revision 1.49  2007/06/29 10:45:06  acolla
Number of columns in MySql Shuttle logbook increased by one (HLT added)

Revision 1.48  2007/06/21 13:06:19  acolla
GetFileSources returns dummy list with 1 source if system=DCS (better than
returning error as it was)

Revision 1.47  2007/06/19 17:28:56  acolla
HLT updated; missing map bug removed.

Revision 1.46  2007/06/09 13:01:09  jgrosseo
Switching to retrieval of several DCS DPs at a time (multiDPrequest)

Revision 1.45  2007/05/30 06:35:20  jgrosseo
Adding functionality to the Shuttle/TestShuttle:
o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
o) Function to retrieve list of IDs for a given source      (GetFileIDs)
These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
Example code has been added to the TestProcessor in TestShuttle

Revision 1.44  2007/05/11 16:09:32  acolla
Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
example: ITS/SPD/100_filename.root

Revision 1.43  2007/05/10 09:59:51  acolla
Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)

Revision 1.42  2007/05/03 08:01:39  jgrosseo
typo in last commit :-(

Revision 1.41  2007/05/03 08:00:48  jgrosseo
fixing log message when pp want to skip dcs value retrieval

Revision 1.40  2007/04/27 07:06:48  jgrosseo
GetFileSources returns empty list in case of no files, but successful query
No mails sent in testmode

Revision 1.39  2007/04/17 12:43:57  acolla
Correction in StoreOCDB; change of text in mail to detector expert

Revision 1.38  2007/04/12 08:26:18  jgrosseo
updated comment

Revision 1.37  2007/04/10 16:53:14  jgrosseo
redirecting sub detector stdout, stderr to sub detector log file

Revision 1.35  2007/04/04 16:26:38  acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. in AliShuttle.cxx: processing time and memory consumption info on a single line.

Revision 1.34  2007/04/04 10:33:36  jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
If you always need DCS data (like before), you do not need to implement it.

6) The run type has been added to the monitoring page

Revision 1.33  2007/04/03 13:56:01  acolla
Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
run type.

Revision 1.32  2007/02/28 10:41:56  acolla
Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
AliPreprocessor::GetRunType() function.
Added some ldap definition files.

Revision 1.30  2007/02/13 11:23:21  acolla
Moved getters and setters of Shuttle's main OCDB/Reference, local
OCDB/Reference, temp and log folders to AliShuttleInterface

Revision 1.27  2007/01/30 17:52:42  jgrosseo
adding monalisa monitoring

Revision 1.26  2007/01/23 19:20:03  acolla
Removed old ldif files, added TOF, MCH ldif files. Added some options in
AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
SetShuttleLogDir

Revision 1.25  2007/01/15 19:13:52  acolla
Moved some AliInfo to AliDebug in SendMail function

Revision 1.21  2006/12/07 08:51:26  jgrosseo
update (alberto):
table, db names in ldap configuration
added GRP preprocessor
DCS data can also be retrieved by data point

Revision 1.20  2006/11/16 16:16:48  jgrosseo
introducing strict run ordering flag
removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)

Revision 1.19  2006/11/06 14:23:04  jgrosseo
major update (Alberto)
o) reading of run parameters from the logbook
o) online offline naming conversion
o) standalone DCSclient package

Revision 1.18  2006/10/20 15:22:59  jgrosseo
o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
o) Merging Collect, CollectAll, CollectNew function
o) Removing implementation of empty copy constructors (declaration still there!)

Revision 1.17  2006/10/05 16:20:55  jgrosseo
adapting to new CDB classes

Revision 1.16  2006/10/05 15:46:26  jgrosseo
applying to the new interface

Revision 1.15  2006/10/02 16:38:39  jgrosseo
update (alberto):
fixed memory leaks
storing of objects that failed to be stored to the grid before
interfacing of shuttle status table in daq system

Revision 1.14  2006/08/29 09:16:05  jgrosseo
small update

Revision 1.13  2006/08/15 10:50:00  jgrosseo
effc++ corrections (alberto)

Revision 1.12  2006/08/08 14:19:29  jgrosseo
Update to shuttle classes (Alberto)

- Possibility to set the full object's path in the Preprocessor's and
Shuttle's  Store functions
- Possibility to extend the object's run validity in the same classes
("startValidity" and "validityInfinite" parameters)
- Implementation of the StoreReferenceData function to store reference
data in a dedicated CDB storage.

Revision 1.11  2006/07/21 07:37:20  jgrosseo
last run is stored after each run

Revision 1.10  2006/07/20 09:54:40  jgrosseo
introducing status management: The processing per subdetector is divided into several steps,
after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
can keep track of the number of failures and skips further processing after a certain threshold is
exceeded. These thresholds can be configured in LDAP.

Revision 1.9  2006/07/19 10:09:55  jgrosseo
new configuration, access to DAQ FES (Alberto)

Revision 1.8  2006/07/11 12:44:36  jgrosseo
adding parameters for extended validity range of data produced by preprocessor

Revision 1.7  2006/07/10 14:37:09  jgrosseo
small fix + todo comment

Revision 1.6  2006/07/10 13:01:41  jgrosseo
enhanced storing of last successfully processed run (alberto)

Revision 1.5  2006/07/04 14:59:57  jgrosseo
revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2

Revision 1.4  2006/06/12 09:11:16  jgrosseo
coding conventions (Alberto)

Revision 1.3  2006/06/06 14:26:40  jgrosseo
o) removed files that were moved to STEER
o) shuttle updated to follow the new interface (Alberto)

Revision 1.2  2006/03/07 07:52:34  hristov
New version (B.Yordanov)

Revision 1.6  2005/11/19 17:19:14  byordano
RetrieveDATEEntries and RetrieveConditionsData added

Revision 1.5  2005/11/19 11:09:27  byordano
AliShuttle declaration added

Revision 1.4  2005/11/17 17:47:34  byordano
TList changed to TObjArray

Revision 1.3  2005/11/17 14:43:23  byordano
import to local CVS

Revision 1.1.1.1  2005/10/28 07:33:58  hristov
Initial import as subdirectory in AliRoot

Revision 1.2  2005/09/13 08:41:15  byordano
default startTime endTime added

Revision 1.4  2005/08/30 09:13:02  byordano
some docs added

Revision 1.3  2005/08/29 21:15:47  byordano
some docs added

*/

//
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in the configuration (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it is used according to the
// schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved data is
// stored automatically in the underlying AliCDBStorage.
// The alias name is used as detSpec.
//

#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"
#include "AliLog.h"
#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TSystem.h>
#include <TObject.h>
#include <TString.h>
#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>
#include <TSQLRow.h>
#include <TMutex.h>
#include <TSystemDirectory.h>
#include <TSystemFile.h>
#include <TFile.h>
#include <TGrid.h>
#include <TGridResult.h>

#include <TMonaLisaWriter.h>

#include <fstream>

#include <sys/types.h>
#include <sys/wait.h>

ClassImp(AliShuttle)

//______________________________________________________________________________________________
AliShuttle::AliShuttle(const AliShuttleConfig* config,
                UInt_t timeout, Int_t retries):
fConfig(config),
fTimeout(timeout), fRetries(retries),
fPreprocessorMap(),
fLogbookEntry(0),
fCurrentDetector(),
fStatusEntry(0),
fMonitoringMutex(0),
fLastActionTime(0),
fLastAction(),
fMonaLisa(0),
fTestMode(kNone),
fReadTestMode(kFALSE),
fOutputRedirected(kFALSE)
{
        //
        // config: AliShuttleConfig used
        // timeout: timeout used for AliDCSClient connection
        // retries: the number of retries in case of connection error.
        //

        if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
        for(int iSys=0;iSys<4;iSys++) {
                fServer[iSys]=0;
                if (iSys < 3)
                        fFXSlist[iSys].SetOwner(kTRUE);
        }
        fPreprocessorMap.SetOwner(kTRUE);

        for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
                fFirstUnprocessed[iDet] = kFALSE;

        fMonitoringMutex = new TMutex();
}

//______________________________________________________________________________________________
AliShuttle::~AliShuttle()
{
        //
        // destructor
        //

        fPreprocessorMap.DeleteAll();
        for(int iSys=0;iSys<4;iSys++)
                if(fServer[iSys]) {
                        fServer[iSys]->Close();
                        delete fServer[iSys];
                        fServer[iSys] = 0;
                }

        if (fStatusEntry){
                delete fStatusEntry;
                fStatusEntry = 0;
        }

        if (fMonitoringMutex)
        {
                delete fMonitoringMutex;
                fMonitoringMutex = 0;
        }
}

//______________________________________________________________________________________________
void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
{
        //
        // Registers a new AliPreprocessor.
        // GetName() is used as the identifier of the preprocessor.
        // The preprocessor is registered only if no other preprocessor
        // with the same identifier (GetName()) is already registered.
        //

        const char* detName = preprocessor->GetName();
        if(GetDetPos(detName) < 0)
                AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));

        if (fPreprocessorMap.GetValue(detName)) {
                AliWarning(Form("AliPreprocessor %s is already registered!", detName));
                return;
        }

        fPreprocessorMap.Add(new TObjString(detName), preprocessor);
}
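
/*
Illustrative sketch, not part of the Shuttle sources. The registration rule
used by RegisterPreprocessor (reject an unknown detector name, ignore a second
registration under the same name, otherwise add) can be tried out without ROOT.
SketchPreprocessor, SketchRegistry and SketchRegisterResult are hypothetical
names introduced only for this example, and a status value is returned where
the real code calls AliFatal or AliWarning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for AliPreprocessor; only the name matters here.
struct SketchPreprocessor {
    std::string name;
};

// Outcome of a registration attempt (sketch only).
enum class SketchRegisterResult { kAdded, kDuplicate, kUnknownDetector };

class SketchRegistry {
public:
    explicit SketchRegistry(std::vector<std::string> knownDetectors)
        : fKnown(std::move(knownDetectors)) {}

    // The registry keeps what it accepts; a rejected object is destroyed
    // when the by-value parameter goes out of scope.
    SketchRegisterResult Register(std::unique_ptr<SketchPreprocessor> pp) {
        assert(pp != nullptr);
        if (std::find(fKnown.begin(), fKnown.end(), pp->name) == fKnown.end())
            return SketchRegisterResult::kUnknownDetector;
        if (fMap.count(pp->name) != 0)
            return SketchRegisterResult::kDuplicate;
        const std::string key = pp->name;  // copy the key before moving pp
        fMap.emplace(key, std::move(pp));
        return SketchRegisterResult::kAdded;
    }

    std::size_t Size() const { return fMap.size(); }

private:
    std::vector<std::string> fKnown;  // valid detector names
    std::map<std::string, std::unique_ptr<SketchPreprocessor>> fMap;
};
```

Registering "SPD" twice in a registry that knows "SPD" and "TPC" yields kAdded
and then kDuplicate, and the map still holds a single entry.
*/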
//______________________________________________________________________________________________
Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
                AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
{
        // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
        // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
        // using this function. Use StoreReferenceData instead!
        // It calls the StoreLocally function, which temporarily stores the data locally; when the
        // preprocessor finishes, the data are transferred to the main storage (Grid).

        return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
}

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
{
        // Stores a CDB object in the storage for reference data. These objects will not be available
        // during offline reconstruction. Use this function for reference data only!
        // It calls the StoreLocally function, which temporarily stores the data locally; when the
        // preprocessor finishes, the data are transferred to the main storage (Grid).

        return StoreLocally(fgkLocalRefStorage, path, object, metaData);
}

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreLocally(const TString& localUri,
                        const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
                        Int_t validityStart, Bool_t validityInfinite)
{
        // Stores the object temporarily in local storage. Parameters are passed by the Store and
        // StoreReferenceData functions. When the preprocessor finishes, the data are transferred
        // to the main storage (Grid).
        // The parameters are:
        //   1) URI of the backup storage (local)
        //   2) the object's path
        //   3) the object to be stored
        //   4) the metaData to be associated with the object
        //   5) the validity start run number w.r.t. the current run;
        //      if the data is valid only for this run, leave the default 0
        //   6) specifies if the calibration data is valid for infinity (this means until updated),
        //      typical for calibration runs; the default is kFALSE
        //
        // returns kFALSE on failure, kTRUE otherwise

        if (fTestMode & kErrorStorage)
        {
                Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
                return kFALSE;
        }

        const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";

        Int_t firstRun = GetCurrentRun() - validityStart;
        if(firstRun < 0) {
                AliWarning("First valid run happens to be less than 0! Setting it to 0.");
                firstRun=0;
        }

        Int_t lastRun = -1;
        if(validityInfinite) {
                lastRun = AliCDBRunRange::Infinity();
        } else {
                lastRun = GetCurrentRun();
        }

        // Version is set to current run, it will be used later to transfer data to Grid
        AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);

        if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
                TObjString runUsed = Form("%d", GetCurrentRun());
                metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
        }

        Bool_t result = kFALSE;

        if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
                Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
        } else {
                result = AliCDBManager::Instance()->GetStorage(localUri)
                                        ->Put(object, id, metaData);
        }

        if(!result) {

                Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
        }

        return result;
}
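
/*
Illustrative sketch, not part of the Shuttle sources. The run range that
StoreLocally builds from validityStart and validityInfinite can be checked in
isolation. SketchRunRange, SketchValidity and kSketchInfinity are hypothetical
names; in particular, kSketchInfinity only stands in for
AliCDBRunRange::Infinity(), whose actual value is not shown in this file, and
the non-negative run number is an assumption of the sketch.

```cpp
#include <cassert>
#include <limits>

// First and last run for which a stored object is valid (sketch only).
struct SketchRunRange {
    int first;
    int last;
};

// Assumed placeholder for AliCDBRunRange::Infinity().
constexpr int kSketchInfinity = std::numeric_limits<int>::max();

// Same arithmetic as StoreLocally: the validity starts validityStart runs
// before the current run (clamped at 0) and ends either at the current run
// or at "infinity".
inline SketchRunRange SketchValidity(int currentRun, int validityStart,
                                     bool validityInfinite) {
    assert(currentRun >= 0);  // assumption of this sketch
    int first = currentRun - validityStart;
    if (first < 0) first = 0;
    const int last = validityInfinite ? kSketchInfinity : currentRun;
    return {first, last};
}
```

With the defaults (validityStart = 0, validityInfinite = false) the range is
just the current run; SketchValidity(3, 10, true) is clamped to start at run 0
and ends at kSketchInfinity.
*/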

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB()
{
        //
        // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
        // Calls underlying StoreOCDB(const TString&) function twice, for OCDB and Reference storage.
        // Then calls CopyFilesToGrid to store reference files and, for GRP, the run metadata file.
        //

        if (fTestMode & kErrorGrid)
        {
                Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
                Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
                return kFALSE;
        }

        Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
        Bool_t resultCDB = StoreOCDB(fgkMainCDB);

        Log("SHUTTLE","StoreOCDB - Storing reference data ...");
        Bool_t resultRef = StoreOCDB(fgkMainRefStorage);

        Log("SHUTTLE","StoreOCDB - Storing reference files ...");
        Bool_t resultRefFiles = CopyFilesToGrid("reference");

        Bool_t resultMetadata = kTRUE;
        if(fCurrentDetector == "GRP")
        {
                // The first argument is the log target, the message carries the function name
                Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
                resultMetadata = CopyFilesToGrid("metadata");
        }

        return resultCDB && resultRef && resultRefFiles && resultMetadata;
}
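
/*
Illustrative sketch, not part of the Shuttle sources. StoreLocally and
StoreOCDB test fTestMode with a bitwise AND (fTestMode & kErrorStorage,
fTestMode & kErrorGrid), which suggests independent flags that can be combined.
The enum below shows that pattern; the names carry a Sketch prefix and the
numeric values are assumptions, since the real enum is declared elsewhere and
not shown in this file.

```cpp
#include <cassert>

// Assumed flag layout: one bit per simulated failure.
enum SketchTestMode : unsigned int {
    kSketchNone         = 0,
    kSketchErrorStorage = 1u << 0,
    kSketchErrorGrid    = 1u << 1
};

// True if the given failure is being simulated in this mode.
inline bool SketchIsSet(unsigned int mode, SketchTestMode flag) {
    return (mode & flag) != 0;
}

// Usage check: both failures can be switched on at the same time.
inline void SketchUsage() {
    const unsigned int mode = kSketchErrorStorage | kSketchErrorGrid;
    assert(SketchIsSet(mode, kSketchErrorStorage));
    assert(SketchIsSet(mode, kSketchErrorGrid));
    assert(!SketchIsSet(kSketchNone, kSketchErrorGrid));
}
```
*/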

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
{
        //
        // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
        //

        TObjArray* gridIds=0;

        Bool_t result = kTRUE;

        const char* type = 0;
        TString localURI;
        if(gridURI == fgkMainCDB) {
                type = "OCDB";
                localURI = fgkLocalCDB;
        } else if(gridURI == fgkMainRefStorage) {
                type = "reference";
                localURI = fgkLocalRefStorage;
        } else {
                AliError(Form("Invalid storage URI: %s", gridURI.Data()));
                return kFALSE;
        }

        AliCDBManager* man = AliCDBManager::Instance();

        AliCDBStorage *gridSto = man->GetStorage(gridURI);
        if(!gridSto) {
                Log("SHUTTLE",
                        Form("StoreOCDB - cannot activate main %s storage", type));
                return kFALSE;
        }

        gridIds = gridSto->GetQueryCDBList();

        // get objects previously stored in local CDB
        AliCDBStorage *localSto = man->GetStorage(localURI);
        if(!localSto) {
                Log("SHUTTLE",
                        Form("StoreOCDB - cannot activate local %s storage", type));
                return kFALSE;
        }
        AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
        // Local objects were stored with current run as Grid version!
        TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
        localEntries->SetOwner(1);

        // loop on local stored objects
        TIter localIter(localEntries);
        AliCDBEntry *aLocEntry = 0;
        while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
                aLocEntry->SetOwner(1);
                AliCDBId aLocId = aLocEntry->GetId();
                aLocEntry->SetVersion(-1);
                aLocEntry->SetSubVersion(-1);

                // If local object is valid up to infinity we store it only if it is
                // the first unprocessed run!
                if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
                        !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
                {
                        Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
                                                "there are previous unprocessed runs!",
                                                fCurrentDetector.Data(), aLocId.GetPath().Data()));
                        result = kFALSE;
                        continue;
                }

                // loop on Grid valid Id's
                Bool_t store = kTRUE;
                TIter gridIter(gridIds);
                AliCDBId* aGridId = 0;
                while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
                        if(aGridId->GetPath() != aLocId.GetPath()) continue;
                        // skip all objects valid up to infinity
                        if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
                        // if we get here, it means there's already some more recent object stored on Grid!
                        store = kFALSE;
                        break;
                }

                // Put the object on the Grid only if no more recent object was found there
                Bool_t storeOk = store ? gridSto->Put(aLocEntry) : kFALSE;
                if(!store || storeOk){

                        if (!store)
                        {
                                Log(fCurrentDetector.Data(),
                                        Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
                                                type, aGridId->ToString().Data()));
                        } else {
                                Log("SHUTTLE",
                                        Form("StoreOCDB - Object <%s> successfully put into %s storage",
                                                aLocId.ToString().Data(), type));
                                Log(fCurrentDetector.Data(),
                                        Form("StoreOCDB - Object <%s> successfully put into %s storage",
                                                aLocId.ToString().Data(), type));
                        }

                        // removing local filename...
                        TString filename;
                        localSto->IdToFilename(aLocId, filename);
                        Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
                        RemoveFile(filename.Data());
                        continue;
                } else  {
                        Log("SHUTTLE",
                                Form("StoreOCDB - Grid %s storage of object <%s> failed",
                                        type, aLocId.ToString().Data()));
                        Log(fCurrentDetector.Data(),
                                Form("StoreOCDB - Grid %s storage of object <%s> failed",
                                        type, aLocId.ToString().Data()));
                        result = kFALSE;
                }
        }
        localEntries->Clear();

        return result;
}
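
/*
Illustrative sketch, not part of the Shuttle sources. The two checks that
StoreOCDB(const TString&) applies to every local entry (infinite validity with
earlier unprocessed runs, and a Grid object on the same path with finite
validity) can be written as one pure function. SketchId, SketchDecision,
SketchDecide and kSketchInfinity are hypothetical names, kSketchInfinity is an
assumed placeholder for AliCDBRunRange::Infinity(), and what the Grid id list
contains is taken at face value from the loop above.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

// Minimal stand-in for AliCDBId: object path and last valid run.
struct SketchId {
    std::string path;
    int lastRun;
};

// Assumed placeholder for AliCDBRunRange::Infinity().
constexpr int kSketchInfinity = std::numeric_limits<int>::max();

enum class SketchDecision { kStore, kNewerOnGrid, kWaitForEarlierRuns };

// Mirrors the order of the checks in the loop over local entries.
inline SketchDecision SketchDecide(const SketchId& local,
                                   const std::vector<SketchId>& gridIds,
                                   bool firstUnprocessed) {
    assert(!local.path.empty());  // assumption of this sketch

    // Infinite validity is only accepted for the first unprocessed run.
    if (local.lastRun == kSketchInfinity && !firstUnprocessed)
        return SketchDecision::kWaitForEarlierRuns;

    for (const SketchId& grid : gridIds) {
        if (grid.path != local.path) continue;          // other object
        if (grid.lastRun == kSketchInfinity) continue;  // skipped, as in the loop
        return SketchDecision::kNewerOnGrid;            // finite-validity match
    }
    return SketchDecision::kStore;
}
```

Keeping the decision separate from the Put call makes each of the three
outcomes easy to exercise without a Grid connection.
*/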
715
716 //______________________________________________________________________________________________
717 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
718 {
719         // clears the directory used to store reference files of a given subdetector
720   
721         AliCDBManager* man = AliCDBManager::Instance();
722         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
723         TString localBaseFolder = sto->GetBaseFolder();
724
725         TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
726         
727         Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
728
729         TString begin;
730         begin.Form("%d_", GetCurrentRun());
731         
732         TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
733         if (!baseDir)
734                 return kTRUE;
735                 
736         TList* dirList = baseDir->GetListOfFiles();
737         delete baseDir;
738         
739         if (!dirList) return kTRUE;
740                         
741         if (dirList->GetEntries() < 3) 
742         {
743                 delete dirList;
744                 return kTRUE;
745         }
746                                 
747         Int_t nDirs = 0, nDel = 0;
748         TIter dirIter(dirList);
749         TSystemFile* entry = 0;
750
751         Bool_t success = kTRUE;
752         
753         while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
754         {                                       
755                 if (entry->IsDirectory())
756                         continue;
757                 
758                 TString fileName(entry->GetName());
759                 if (!fileName.BeginsWith(begin))
760                         continue;
761                         
762                 nDirs++;
763                                                 
764                 // delete file (entry names are relative to targetDir, so build the full path)
765                 Int_t result = gSystem->Unlink(Form("%s/%s", targetDir.Data(), fileName.Data()));
766                 
767                 if (result)
768                 {
769                         Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
770                         success = kFALSE;
771                 } else {
772                         nDel++;
773                 }
774         }
775
776         if(nDirs > 0)
777                 Log("SHUTTLE", Form("CleanReferenceStorage - %d (out of %d) reference files in folder %s were deleted.", 
778                         nDel, nDirs, targetDir.Data()));
779
780                 
781         delete dirList;
782         return success;
809 }
810
811 //______________________________________________________________________________________________
812 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
813 {
814         //
815         // Stores reference file directly (without opening it). This function stores the file locally.
816         //
817         // The file is stored under the following location: 
818         // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
819         // where <gridFileName> is the second parameter given to the function
820         // 
821         
822         if (fTestMode & kErrorStorage)
823         {
824                 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
825                 return kFALSE;
826         }
827         
828         AliCDBManager* man = AliCDBManager::Instance();
829         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
830         
831         TString localBaseFolder = sto->GetBaseFolder();
832         
833         TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);    
834         target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
835         
836         return CopyFileLocally(localFile, target);
837 }
838
839 //______________________________________________________________________________________________
840 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
841 {
842         //
843         // Stores the run metadata file locally. CopyFilesToGrid later transfers it to the run folder on the Grid.
844         //
845         // Only GRP can call this function.
846         
847         if (fTestMode & kErrorStorage)
848         {
849                 Log(fCurrentDetector, "StoreRunMetadataFile - In TESTMODE - Simulating error while storing locally");
850                 return kFALSE;
851         }
852         
853         AliCDBManager* man = AliCDBManager::Instance();
854         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
855         
856         TString localBaseFolder = sto->GetBaseFolder();
857         
858         // Build Run level folder
859         // folder = /alice/data/year/lhcPeriod/runNb/Raw
860         
861                 
862         TString lhcPeriod = GetLHCPeriod();     
863         if (lhcPeriod.Length() == 0) 
864         {
865                 Log("SHUTTLE","StoreRunMetadataFile - LHCPeriod not found in logbook!");
866                 return kFALSE;
867         }
868         
869         TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s", 
870                                 localBaseFolder.Data(), GetCurrentYear(), 
871                                 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
872                                         
873         return CopyFileLocally(localFile, target);
874 }
875
876 //______________________________________________________________________________________________
877 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
878 {
879         //
880         // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
881         // Files are temporarily stored in the local reference storage. When the preprocessor 
882         // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn 
883         // (in reference or run level folders)
884         //
885         
886         TString targetDir(target(0, target.Last('/')));
887         
888         // try to open the target directory; create it if it does not exist
889         void* dir = gSystem->OpenDirectory(targetDir.Data());
890         if (dir == NULL) {
891                 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
892                         Log("SHUTTLE", Form("StoreFileLocally - Can't create directory <%s>", targetDir.Data()));
893                         return kFALSE;
894                 }
895
896         } else {
897                 gSystem->FreeDirectory(dir);
898         }
899         
900         Int_t result = 0;
901         
902         result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
903         if (result)
904         {
905                 Log("SHUTTLE", Form("StoreFileLocally - %s does not exist", localFile));
906                 return kFALSE;
907         }
908
909         result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
910         if (!result)
911         {
912                 Log("SHUTTLE", Form("StoreFileLocally - target file %s already exists, removing...", target.Data()));
913                 if (gSystem->Unlink(target.Data()))
914                 {
915                         Log("SHUTTLE", Form("StoreFileLocally - Could not remove existing target file %s!", target.Data()));
916                         return kFALSE;
917                 }
918         }       
919         
920         result = gSystem->CopyFile(localFile, target);
921
922         if (result == 0)
923         {
924                 Log("SHUTTLE", Form("StoreFileLocally - File %s stored locally to %s", localFile, target.Data()));
925                 return kTRUE;
926         }
927         else
928         {
929                 Log("SHUTTLE", Form("StoreFileLocally - Could not store file %s to %s! Error code = %d", 
930                                 localFile, target.Data(), result));
931                 return kFALSE;
932         }       
933
934
935
936 }
937
938 //______________________________________________________________________________________________
939 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
940 {
941         //
942         // Transfers local files to the Grid. Local files can be reference files 
943         // or run metadata file (from GRP only).
944         //
945         // According to the type (ref, metadata) the files are stored under the following location: 
946         // ref --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
947         // metadata --> <run data folder>/<MetadataFileName>
948         //
949                 
950         AliCDBManager* man = AliCDBManager::Instance();
951         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
952         if (!sto)
953                 return kFALSE;
954         TString localBaseFolder = sto->GetBaseFolder();
955         
956         TString dir;
957         TString alienDir;
958         TString begin;
959         
960         if (strcmp(type, "reference") == 0) 
961         {
962                 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
963                 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
964                 if (!gridSto)
965                         return kFALSE;
966                 TString gridBaseFolder = gridSto->GetBaseFolder();
967                 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
968                 begin = Form("%d_", GetCurrentRun());
969         } 
970         else if (strcmp(type, "metadata") == 0)
971         {
972                         
973                 TString lhcPeriod = GetLHCPeriod();
974         
975                 if (lhcPeriod.Length() == 0) 
976                 {
977                         Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
978                         return kFALSE;
979                 }
980                 
981                 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw", 
982                                 localBaseFolder.Data(), GetCurrentYear(), 
983                                 lhcPeriod.Data(), GetCurrentRun());
984                 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
985                 
986                 begin = "";
987         }
988         else 
989         {
990                 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
991                 return kFALSE;
992         }
993                 
994         TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
995         if (!baseDir)
996                 return kTRUE;
997                 
998         TList* dirList = baseDir->GetListOfFiles();
999         delete baseDir;
1000         
1001         if (!dirList) return kTRUE;
1002                 
1003         if (dirList->GetEntries() < 3) 
1004         {
1005                 delete dirList;
1006                 return kTRUE;
1007         }
1008                         
1009         if (!gGrid)
1010         { 
1011                 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
1012                 delete dirList;
1013                 return kFALSE;
1014         }
1015         
1016         Int_t nDirs = 0, nTransfer = 0;
1017         TIter dirIter(dirList);
1018         TSystemFile* entry = 0;
1019
1020         Bool_t success = kTRUE;
1021         Bool_t first = kTRUE;
1022         
1023         while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1024         {                       
1025                 if (entry->IsDirectory())
1026                         continue;
1027                         
1028                 TString fileName(entry->GetName());
1029                 if (!fileName.BeginsWith(begin))
1030                         continue;
1031                         
1032                 nDirs++;
1033                         
1034                 if (first)
1035                 {
1036                         first = kFALSE;
1037                         // check that folder exists, otherwise create it
1038                         TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1039                         
1040                         if (!result)
1041                         {
1042                                 delete dirList;
1043                                 return kFALSE;
1044                         }
1045                         
1046                         if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
1047                         {
1048                                 // TODO It does not work currently! Bug in TAliEn::Mkdir
1049                                 // TODO Manually fixed in local root v5-16-00
1050                                 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1051                                 {
1052                                         Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1053                                                         alienDir.Data()));
1054                                         delete dirList;
1055                                         return kFALSE;
1056                                 } else {
1057                                         Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1058                                 }
1059                                 
1060                         } else {
1061                                         Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1062                         }
1063                 }
1064                         
1065                 TString fullLocalPath;
1066                 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1067                 
1068                 TString fullGridPath;
1069                 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1070
1071                 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1072                 
1073                 if (result)
1074                 {
1075                         Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!", 
1076                                                 fullLocalPath.Data(), fullGridPath.Data()));
1077                         RemoveFile(fullLocalPath);
1078                         nTransfer++;
1079                 }
1080                 else
1081                 {
1082                         Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!", 
1083                                                 fullLocalPath.Data(), fullGridPath.Data()));
1084                         success = kFALSE;
1085                 }
1086         }
1087
1088         Log("SHUTTLE", Form("CopyFilesToGrid - %d (out of %d) files in folder %s copied to Grid.", 
1089                                                 nTransfer, nDirs, dir.Data()));
1090
1091                 
1092         delete dirList;
1093         return success;
1094 }
1095
1096 //______________________________________________________________________________________________
1097 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1098 {
1099         //
1100         // Get folder name of reference files 
1101         //
1102
1103         TString offDetStr(GetOfflineDetName(detector));
1104         static TString dir; // static: the returned pointer must stay valid after return (overwritten by the next call)
1105         if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1106         {
1107                 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1108         } else {
1109                 dir.Form("%s/%s", base, offDetStr.Data());
1110         }
1111         
1112         return dir.Data();
1113         
1114
1115 }
1116
1117 //______________________________________________________________________________________________
1118 void AliShuttle::CleanLocalStorage(const TString& uri)
1119 {
1120         //
1121         // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
1122         //
1123
1124         const char* type = 0;
1125         if(uri == fgkLocalCDB) {
1126                 type = "OCDB";
1127         } else if(uri == fgkLocalRefStorage) {
1128                 type = "Reference";
1129         } else {
1130                 AliError(Form("Invalid storage URI: %s", uri.Data()));
1131                 return;
1132         }
1133
1134         AliCDBManager* man = AliCDBManager::Instance();
1135
1136         // open local storage
1137         AliCDBStorage *localSto = man->GetStorage(uri);
1138         if(!localSto) {
1139                 Log("SHUTTLE",
1140                         Form("CleanLocalStorage - cannot activate local %s storage", type));
1141                 return;
1142         }
1143
1144         TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1145                 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1146
1147         AliDebug(2, Form("filename = %s", filename.Data()));
1148
1149         Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1150                 GetCurrentRun(), fCurrentDetector.Data()));
1151
1152         RemoveFile(filename.Data());
1153
1154 }
1155
1156 //______________________________________________________________________________________________
1157 void AliShuttle::RemoveFile(const char* filename)
1158 {
1159         //
1160         // removes local file
1161         //
1162
1163         TString command(Form("rm -f %s", filename));
1164
1165         Int_t result = gSystem->Exec(command.Data());
1166         if(result != 0)
1167         {
1168                 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1169                         fCurrentDetector.Data(), filename));
1170         }
1171 }
1172
1173 //______________________________________________________________________________________________
1174 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1175 {
1176         //
1177         // Reads the AliShuttleStatus from the CDB
1178         //
1179
1180         if (fStatusEntry){
1181                 delete fStatusEntry;
1182                 fStatusEntry = 0;
1183         }
1184
1185         fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1186                 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1187
1188         if (!fStatusEntry) return 0;
1189         fStatusEntry->SetOwner(1);
1190
1191         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1192         if (!status) {
1193                 AliError("Invalid object stored to CDB!");
1194                 return 0;
1195         }
1196
1197         return status;
1198 }
1199
1200 //______________________________________________________________________________________________
1201 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1202 {
1203         //
1204         // writes the status for one subdetector
1205         //
1206
1207         if (fStatusEntry){
1208                 delete fStatusEntry;
1209                 fStatusEntry = 0;
1210         }
1211
1212         Int_t run = GetCurrentRun();
1213
1214         AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1215
1216         fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1217         fStatusEntry->SetOwner(1);
1218
1219         UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1220
1221         if (!result) {
1222                 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1223                                                 fCurrentDetector.Data(), run));
1224                 return kFALSE;
1225         }
1226         
1227         SendMLInfo();
1228
1229         return kTRUE;
1230 }
1231
1232 //______________________________________________________________________________________________
1233 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1234 {
1235         //
1236         // changes the AliShuttleStatus for the given detector and run to the given status
1237         //
1238
1239         if (!fStatusEntry){
1240                 AliError("UNEXPECTED: fStatusEntry empty");
1241                 return;
1242         }
1243
1244         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1245
1246         if (!status){
1247                 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1248                 return;
1249         }
1250
1251         TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1252                                 fCurrentDetector.Data(),
1253                                 status->GetStatusName(),
1254                                 status->GetStatusName(newStatus));
1255         Log("SHUTTLE", actionStr);
1256         SetLastAction(actionStr);
1257
1258         status->SetStatus(newStatus);
1259         if (increaseCount) status->IncreaseCount();
1260
1261         AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1262
1263         SendMLInfo();
1264 }
1265
1266 //______________________________________________________________________________________________
1267 void AliShuttle::SendMLInfo()
1268 {
1269         //
1270         // sends ML information about the current status of the current detector being processed
1271         //
1272         
1273         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1274         
1275         if (!status){
1276                 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1277                 return;
1278         }
1279         
1280         TMonaLisaText  mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1281         TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1282
1283         TList mlList;
1284         mlList.Add(&mlStatus);
1285         mlList.Add(&mlRetryCount);
1286
1287         TString mlID;
1288         mlID.Form("%d", GetCurrentRun());
1289         fMonaLisa->SendParameters(&mlList, mlID);
1290 }
1291
1292 //______________________________________________________________________________________________
1293 Bool_t AliShuttle::ContinueProcessing()
1294 {
1295         // this function reads the AliShuttleStatus information from CDB and
1296         // checks if the processing should be continued
1297         // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1298
1299         if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1300
1301         AliPreprocessor* aPreprocessor =
1302                 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1303         if (!aPreprocessor)
1304         {
1305                 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1306                 return kFALSE;
1307         }
1308
1309         AliShuttleLogbookEntry::Status entryStatus =
1310                 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1311
1312         if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1313                 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1314                                 fCurrentDetector.Data(),
1315                                 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1316                 return kFALSE;
1317         }
1318
1319         // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1320
1321         // check if current run is first unprocessed run for current detector
1322         if (fConfig->StrictRunOrder(fCurrentDetector) &&
1323                 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1324         {
1325                 if (fTestMode == kNone)
1326                 {
1327                         Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
1328                                         " but this is not the first unprocessed run!", fCurrentDetector.Data()));
1329                         return kFALSE;
1330                 }
1331                 else
1332                 {
1333                         Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
1334                                         "Although %s requires strict run ordering "
1335                                         "and this is not the first unprocessed run, "
1336                                         "the SHUTTLE continues", fCurrentDetector.Data()));
1337                 }
1338         }
1339
1340         AliShuttleStatus* status = ReadShuttleStatus();
1341         if (!status) {
1342                 // first time
1343                 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1344                                 fCurrentDetector.Data()));
1345                 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1346                 return WriteShuttleStatus(status);
1347         }
1348
1349         // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1350         // If it happens it may mean Logbook updating failed... let's do it now!
1351         if (status->GetStatus() == AliShuttleStatus::kDone ||
1352             status->GetStatus() == AliShuttleStatus::kFailed){
1353                 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1354                                         fCurrentDetector.Data(),
1355                                         status->GetStatusName(status->GetStatus())));
1356                 UpdateShuttleLogbook(fCurrentDetector.Data(),
1357                                         status->GetStatusName(status->GetStatus()));
1358                 return kFALSE;
1359         }
1360
1361         if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1362                 Log("SHUTTLE",
1363                         Form("ContinueProcessing - %s: Grid storage of one or more "
1364                                 "objects failed. Trying again now",
1365                                 fCurrentDetector.Data()));
1366                 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1367                 if (StoreOCDB()){
1368                         Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1369                                 "successfully stored into main storage",
1370                                 fCurrentDetector.Data()));
1371                 } else {
1372                         Log("SHUTTLE",
1373                                 Form("ContinueProcessing - %s: Grid storage failed again",
1374                                         fCurrentDetector.Data()));
1375                         UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1376                 }
1377                 return kFALSE;
1378         }
1379
1380         // if we get here, there is a restart
1381         Bool_t cont = kFALSE;
1382
1383         // abort conditions
1384         if (status->GetCount() >= fConfig->GetMaxRetries()) {
1385                 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1386                                 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1387                                 status->GetCount(), status->GetStatusName()));
1388                 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1389                 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1390
1391                 // there may still be objects in local OCDB and reference storage
1392                 // and FXS databases may be not updated: do it now!
1393                 
1394                 // TODO Currently disabled, we want to keep files in case of failure!
1395                 // CleanLocalStorage(fgkLocalCDB);
1396                 // CleanLocalStorage(fgkLocalRefStorage);
1397                 // UpdateTableFailCase();
1398                 
1399                 // Send mail to detector expert!
1400                 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...", 
1401                                         fCurrentDetector.Data()));
1402                 if (!SendMail())
1403                         Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1404                                         fCurrentDetector.Data()));
1405
1406         } else {
1407                 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1408                                 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1409                                 status->GetStatusName(), status->GetCount()));
1410                 Bool_t increaseCount = kTRUE;
1411                 if (status->GetStatus() == AliShuttleStatus::kDCSError || 
1412                         status->GetStatus() == AliShuttleStatus::kDCSStarted)
1413                                 increaseCount = kFALSE;
1414                                 
1415                 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
1416                 cont = kTRUE;
1417         }
1418
1419         return cont;
1420 }
1421
1422 //______________________________________________________________________________________________
1423 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1424 {
1425         //
1426         // Makes data retrieval for all detectors in the configuration.
1427         // entry: Shuttle logbook entry, contains run parameters and status of detectors
1428         // (Unprocessed, Inactive, Failed or Done).
1429         // Returns kFALSE if an error occurred and kTRUE otherwise
1430         //
1431
1432         if (!entry) return kFALSE;
1433
1434         fLogbookEntry = entry;
1435
1436         Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1437                                         GetCurrentRun()));
1438
1439         // Send the information to ML
1440         TMonaLisaText  mlStatus("SHUTTLE_status", "Processing");
1441         TMonaLisaText  mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1442
1443         TList mlList;
1444         mlList.Add(&mlStatus);
1445         mlList.Add(&mlRunType);
1446
1447         TString mlID;
1448         mlID.Form("%d", GetCurrentRun());
1449         fMonaLisa->SendParameters(&mlList, mlID);
1450
1451         if (fLogbookEntry->IsDone())
1452         {
1453                 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1454                 UpdateShuttleLogbook("shuttle_done");
1455                 fLogbookEntry = 0;
1456                 return kTRUE;
1457         }
1458
1459         // read test mode if flag is set
1460         if (fReadTestMode)
1461         {
1462                 fTestMode = kNone;
1463                 TString logEntry(entry->GetRunParameter("log"));
1464                 //printf("log entry = %s\n", logEntry.Data());
1465                 TString searchStr("Testmode: ");
1466                 Int_t pos = logEntry.Index(searchStr.Data());
1467                 //printf("%d\n", pos);
1468                 if (pos >= 0)
1469                 {
1470                         TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1471                         //printf("%s\n", subStr.String().Data());
1472                         TString newStr(subStr.Data());
1473                         TObjArray* token = newStr.Tokenize(' ');
1474                         if (token)
1475                         {
1476                                 //token->Print();
1477                                 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1478                                 if (tmpStr)
1479                                 {
1480                                         Int_t testMode = tmpStr->String().Atoi();
1481                                         if (testMode > 0)
1482                                         {
1483                                                 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1484                                                 SetTestMode((TestMode) testMode);
1485                                         }
1486                                 }
1487                                 delete token;          
1488                         }
1489                 }
1490         }
1491                 
1492         fLogbookEntry->Print("all");
1493
1494         // Initialization
1495         Bool_t hasError = kFALSE;
1496
1497         // Set the CDB and Reference folders according to the year and LHC period
1498         TString lhcPeriod(GetLHCPeriod());
1499         if (lhcPeriod.Length() == 0) 
1500         {
1501                 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1502                 return kFALSE;
1503         }       
1504         
1505         if (fgkMainCDB.Length() == 0)
1506                 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache", 
1507                                         GetCurrentYear(), lhcPeriod.Data());
1508         
1509         if (fgkMainRefStorage.Length() == 0)
1510                 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache", 
1511                                         GetCurrentYear(), lhcPeriod.Data());
1512         
1513         // Loop on detectors in the configuration
1514         TIter iter(fConfig->GetDetectors());
1515         TObjString* aDetector = 0;
1516
1517         Bool_t first = kTRUE;
1518
1519         while ((aDetector = (TObjString*) iter.Next()))
1520         {
1521                 fCurrentDetector = aDetector->String();
1522
1523                 if (ContinueProcessing() == kFALSE) continue;
1524                 
1525                 if (first)
1526                 {
1527                   // call QueryCDB only when needed, and only once
1528                   AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1529                   if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1530                   AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1531                   if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1532                   first = kFALSE;
1533                 }
1534
1535                 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START  ******",
1536                                                 GetCurrentRun(), aDetector->GetName()));
1537
1538                 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1539
1540                 Log(fCurrentDetector.Data(), "Process - Starting processing");
1541
1542                 Int_t pid = fork();
1543
1544                 if (pid < 0)
1545                 {
1546                         Log("SHUTTLE", "Process - ERROR: Forking failed");
1547                 }
1548                 else if (pid > 0)
1549                 {
1550                         // parent
1551                         Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1552                                                         GetCurrentRun(), aDetector->GetName()));
1553
1554                         Long_t begin = time(0);
1555
1556                         int status; // to be used with waitpid, on purpose an int (not Int_t)!
1557                         while (waitpid(pid, &status, WNOHANG) == 0)
1558                         {
1559                                 Long_t expiredTime = time(0) - begin;
1560
1561                                 if (expiredTime > fConfig->GetPPTimeOut())
1562                                 {
1563                                         TString tmp;
1564                                         tmp.Form("Process - Process of %s timed out. "
1565                                                         "Run time: %ld seconds. Killing...",
1566                                                         fCurrentDetector.Data(), expiredTime);
1567                                         Log("SHUTTLE", tmp);
1568                                         Log(fCurrentDetector, tmp);
1569
1570                                         kill(pid, 9);
1571
1572                                         UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1573                                         hasError = kTRUE;
1574
1575                                         gSystem->Sleep(1000);
1576                                 }
1577                                 else
1578                                 {
1579                                         gSystem->Sleep(1000);
1580                                         
1581                                         TString checkStr;
1582                                         checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1583                                         FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1584                                         if (!pipe)
1585                                         {
1586                                                 Log("SHUTTLE", Form("Process - Error: "
1587                                                         "Could not open pipe to %s", checkStr.Data()));
1588                                                 continue;
1589                                         }
1590                                                 
1591                                         char buffer[100];
1592                                         if (!fgets(buffer, 100, pipe))
1593                                         {
1594                                                 Log("SHUTTLE", "Process - Error: ps did not return anything");
1595                                                 gSystem->ClosePipe(pipe);
1596                                                 continue;
1597                                         }
1598                                         gSystem->ClosePipe(pipe);
1599                                         
1600                                         //Log("SHUTTLE", Form("ps returned %s", buffer));
1601                                         
1602                                         Int_t mem = 0;
1603                                         if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1604                                         {
1605                                                 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1606                                                 continue;
1607                                         }
1608                                         
1609                                         if (expiredTime % 60 == 0)
1610                                         {
1611                                                 Log("SHUTTLE", Form("Process - %s: Checking process. "
1612                                                         "Run time: %ld seconds - Memory consumption: %d KB",
1613                                                         fCurrentDetector.Data(), expiredTime, mem));
1614                                                 SendAlive();
1615                                         }
1616                                         
1617                                         if (mem > fConfig->GetPPMaxMem())
1618                                         {
1619                                                 TString tmp;
1620                                                 tmp.Form("Process - Process exceeds maximum allowed memory "
1621                                                         "(%d KB > %d KB). Killing...",
1622                                                         mem, fConfig->GetPPMaxMem());
1623                                                 Log("SHUTTLE", tmp);
1624                                                 Log(fCurrentDetector, tmp);
1625         
1626                                                 kill(pid, 9);
1627         
1628                                                 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1629                                                 hasError = kTRUE;
1630         
1631                                                 gSystem->Sleep(1000);
1632                                         }
1633                                 }
1634                         }
1635
1636                         Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1637                                                                 GetCurrentRun(), aDetector->GetName()));
1638
1639                         if (WIFEXITED(status))
1640                         {
1641                                 Int_t returnCode = WEXITSTATUS(status);
1642
1643                                 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1644                                                                                 returnCode));
1645
1646                                 if (returnCode == 0) hasError = kTRUE; // the client exits with 'success': 0 means failure
1647                         }
1648                 }
1649                 else if (pid == 0)
1650                 {
1651                         // client
1652                         Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1653                                 aDetector->GetName()));
1654
1655                         Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1656
1657                         if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1658                         {
1659                                 Log("SHUTTLE", "Process - Could not freopen stdout");
1660                         }
1661                         else
1662                         {
1663                                 fOutputRedirected = kTRUE;
1664                                 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1665                                         Log("SHUTTLE", "Process - Could not redirect stderr");
1666                                 
1667                         }
1668                         
1669                         TString wd = gSystem->WorkingDirectory();
1670                         TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(), 
1671                                 fCurrentDetector.Data(), GetCurrentRun());
1672                         
1673                         Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1674                         if (!result) // temp dir already exists!
1675                         {
1676                                 Log(fCurrentDetector.Data(), 
1677                                         Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1678                                 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));         
1679                         } 
1680                         
1681                         if (gSystem->mkdir(tmpDir.Data(), 1))
1682                         {
1683                                 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1684                                 gSystem->Exit(0); // 0 = failure: the parent treats return code 0 as an error
1685                         }
1686                         
1687                         if (!gSystem->ChangeDirectory(tmpDir.Data())) 
1688                         {
1689                                 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1690                                 gSystem->Exit(0); // 0 = failure: the parent treats return code 0 as an error
1691                         }
1692                         
1693                         Bool_t success = ProcessCurrentDetector();
1694                         
1695                         gSystem->ChangeDirectory(wd.Data());
1696                                                 
1697                         if (success) // Preprocessor finished successfully!
1698                         { 
1699                                 // remove temporary folder
1700                                 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1701                                 
1702                                 // Update time_processed field in FXS DB
1703                                 if (UpdateTable() == kFALSE)
1704                                         Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", 
1705                                                         fCurrentDetector.Data()));
1706
1707                                 // Transfer the data from local storage to main storage (Grid)
1708                                 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1709                                 if (StoreOCDB() == kFALSE)
1710                                 {
1711                                         Log("SHUTTLE", 
1712                                                 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1713                                                         GetCurrentRun(), aDetector->GetName()));
1714                                         UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1715                                         success = kFALSE;
1716                                 } else {
1717                                         Log("SHUTTLE", 
1718                                                 Form("\t\t\t****** run %d - %s: DONE ******",
1719                                                         GetCurrentRun(), aDetector->GetName()));
1720                                         UpdateShuttleStatus(AliShuttleStatus::kDone);
1721                                         UpdateShuttleLogbook(fCurrentDetector, "DONE");
1722                                 }
1723                         } else 
1724                         {
1725                                 Log("SHUTTLE", 
1726                                         Form("\t\t\t****** run %d - %s: PP ERROR ******",
1727                                                 GetCurrentRun(), aDetector->GetName()));
1728                         }
1729
1730                         for (UInt_t iSys=0; iSys<3; iSys++)
1731                         {
1732                                 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1733                         }
1734
1735                         Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1736                                                         GetCurrentRun(), aDetector->GetName(), success));
1737
1738                         // the client exits here
1739                         gSystem->Exit(success);
1740
1741                         AliError("We should never get here!!!");
1742                 }
1743         }
1744
1745         Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1746                                                         GetCurrentRun()));
1747
1748         //check if shuttle is done for this run, if so update logbook
1749         TObjArray checkEntryArray;
1750         checkEntryArray.SetOwner(1);
1751         TString whereClause = Form("where run=%d", GetCurrentRun());
1752         if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || 
1753                         checkEntryArray.GetEntries() == 0) {
1754                 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1755                                                 GetCurrentRun()));
1756                 return hasError == kFALSE;
1757         }
1758
1759         AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1760                                                 (checkEntryArray.At(0));
1761
1762         if (checkEntry)
1763         {
1764                 if (checkEntry->IsDone())
1765                 {
1766                         Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1767                         UpdateShuttleLogbook("shuttle_done");
1768                 }
1769                 else
1770                 {
1771                         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1772                         {
1773                                 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1774                                 {
1775                                         AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1776                                                         checkEntry->GetRun(), GetDetName(iDet)));
1777                                         fFirstUnprocessed[iDet] = kFALSE;
1778                                 }
1779                         }
1780                 }
1781         }
1782
1783         fLogbookEntry = 0;
1784
1785         return hasError == kFALSE;
1786 }
1787
1788 //______________________________________________________________________________________________
1789 Bool_t AliShuttle::ProcessCurrentDetector()
1790 {
1791         //
1792         // Retrieves the data for a single detector (fCurrentDetector).
1793         // There should be a configuration for this detector.
1794
1795         Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d", 
1796                                                 fCurrentDetector.Data(), GetCurrentRun()));
1797
1798         TString wd = gSystem->WorkingDirectory();
1799         
1800         if (!CleanReferenceStorage(fCurrentDetector.Data()))
1801                 return kFALSE;
1802         
1803         gSystem->ChangeDirectory(wd.Data());
1804         
1805         TMap* dcsMap = new TMap();
1806
1807         // call preprocessor
1808         AliPreprocessor* aPreprocessor =
1809                 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1810
1811         aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1812
1813         Bool_t processDCS = aPreprocessor->ProcessDCS();
1814
1815         if (!processDCS)
1816         {
1817                 Log(fCurrentDetector, "ProcessCurrentDetector -"
1818                         " The preprocessor requested to skip the retrieval of DCS values");
1819         }
1820         else if (fTestMode & kSkipDCS)
1821         {
1822                 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1823         } 
1824         else if (fTestMode & kErrorDCS)
1825         {
1826                 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1827                 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1828                 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1829                 delete dcsMap;
1830                 return kFALSE;
1831         } else {
1832
1833                 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1834
1835                 // Query DCS archive
1836                 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1837                 
1838                 for (int iServ=0; iServ<nServers; iServ++)
1839                 {
1840                 
1841                         TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1842                         Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1843                         Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1844
1845                         Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1846                                         " Querying DCS Amanda server %s:%d (%d of %d)", 
1847                                         host.Data(), port, iServ+1, nServers));
1848                         
1849                         TMap* aliasMap = 0;
1850                         TMap* dpMap = 0;
1851         
1852                         if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1853                         {
1854                                 aliasMap = GetValueSet(host, port, 
1855                                                 fConfig->GetDCSAliases(fCurrentDetector, iServ), 
1856                                                 kAlias, multiSplit);
1857                                 if (!aliasMap)
1858                                 {
1859                                         Log(fCurrentDetector, 
1860                                                 Form("ProcessCurrentDetector -"
1861                                                         " Error retrieving DCS aliases from server %s."
1862                                                         " Sending mail to DCS experts!", host.Data()));
1863                                         UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1864                                         
1865                                         //if (!SendMailToDCS())
1866                                         //      Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1867
1868                                         delete dcsMap;
1869                                         return kFALSE;
1870                                 }
1871                         }
1872                         
1873                         if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1874                         {
1875                                 dpMap = GetValueSet(host, port, 
1876                                                 fConfig->GetDCSDataPoints(fCurrentDetector, iServ), 
1877                                                 kDP, multiSplit);
1878                                 if (!dpMap)
1879                                 {
1880                                         Log(fCurrentDetector, 
1881                                                 Form("ProcessCurrentDetector -"
1882                                                         " Error retrieving DCS data points from server %s."
1883                                                         " Sending mail to DCS experts!", host.Data()));
1884                                         UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1885                                         
1886                                         //if (!SendMailToDCS())
1887                                         //      Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1888                                         
1889                                         if (aliasMap) delete aliasMap;
1890                                         delete dcsMap;
1891                                         return kFALSE;
1892                                 }                               
1893                         }
1894                         
1895                         // merge aliasMap and dpMap into dcsMap
1896                         if(aliasMap) {
1897                                 TIter iter(aliasMap);
1898                                 TObjString* key = 0;
1899                                 while ((key = (TObjString*) iter.Next()))
1900                                         dcsMap->Add(key, aliasMap->GetValue(key->String()));
1901                                 
1902                                 aliasMap->SetOwner(kFALSE);
1903                                 delete aliasMap;
1904                         }       
1905                         
1906                         if(dpMap) {
1907                                 TIter iter(dpMap);
1908                                 TObjString* key = 0;
1909                                 while ((key = (TObjString*) iter.Next()))
1910                                         dcsMap->Add(key, dpMap->GetValue(key->String()));
1911                                 
1912                                 dpMap->SetOwner(kFALSE);
1913                                 delete dpMap;
1914                         }
1915                 }
1916         }
1917         
1918         // save map into file, to help debugging in case of preprocessor error
1919         /*TFile* f = TFile::Open("DCSMap.root","recreate");
1920         f->cd();
1921         dcsMap->Write("DCSMap", TObject::kSingleKey);
1922         f->Close();
1923         delete f;*/
1924         
1925         // DCS Archive DB processing successful. Call Preprocessor!
1926         UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1927
1928         UInt_t returnValue = aPreprocessor->Process(dcsMap);
1929
1930         if (returnValue > 0) // Preprocessor error!
1931         {
1932                 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1933                                 "Preprocessor failed. Process returned %u.", returnValue));
1934                 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1935                 dcsMap->DeleteAll();
1936                 delete dcsMap;
1937                 return kFALSE;
1938         }
1939         
1940         // preprocessor ok!
1941         UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1942         Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1943                                 fCurrentDetector.Data()));
1944
1945         dcsMap->DeleteAll();
1946         delete dcsMap;
1947
1948         return kTRUE;
1949 }
1950
1951 //______________________________________________________________________________________________
1952 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1953                 TObjArray& entries)
1954 {
1955         // Queries DAQ's Shuttle logbook and fills the detector status object.
1956         // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1957         //
1958
1959         entries.SetOwner(1);
1960
1961         // check the connection; connect if needed
1962         if(!Connect(3)) return kFALSE;
1963
1964         TString sqlQuery;
1965         sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1966
1967         TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1968         if (!aResult) {
1969                 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1970                 return kFALSE;
1971         }
1972
1973         AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1974
1975         if(aResult->GetRowCount() == 0) {
1976                 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1977                 delete aResult;
1978                 return kTRUE;
1979         }
1980
1981         // TODO Check field count!
1982         const UInt_t nCols = 23;
1983         if (aResult->GetFieldCount() != (Int_t) nCols) {
1984                 Log("SHUTTLE", "Invalid SQL result field number!");
1985                 delete aResult;
1986                 return kFALSE;
1987         }
1988
1989         TSQLRow* aRow;
1990         while ((aRow = aResult->Next())) {
1991                 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1992                 Int_t run = runString.Atoi();
1993
1994                 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1995                 if (!entry) {
1996                         delete aRow; continue; } // free the row before skipping this run
1997
1998                 // loop on detectors
1999                 for(UInt_t ii = 0; ii < nCols; ii++)
2000                         entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2001
2002                 entries.AddLast(entry);
2003                 delete aRow;
2004         }
2005
2006         delete aResult;
2007         return kTRUE;
2008 }
2009
2010 //______________________________________________________________________________________________
2011 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2012 {
2013         //
2014         // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
2015         //
2016
2017         // check the connection; connect if needed
2018         if (!Connect(3))
2019                 return 0;
2020
2021         TString sqlQuery;
2022         sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2023
2024         TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2025         if (!aResult) {
2026                 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
2027                 return 0;
2028         }
2029
2030         if (aResult->GetRowCount() == 0) {
2031                 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2032                 delete aResult;
2033                 return 0;
2034         }
2035
2036         if (aResult->GetRowCount() > 1) {
2037                 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2038                                 "more than one entry in DAQ Logbook for run %d!", run));
2039                 delete aResult;
2040                 return 0;
2041         }
2042
2043         TSQLRow* aRow = aResult->Next();
2044         if (!aRow)
2045         {
2046                 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2047                 delete aResult;
2048                 return 0;
2049         }
2050
2051         AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2052
2053         for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2054                 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2055
2056         UInt_t startTime = entry->GetStartTime();
2057         UInt_t endTime = entry->GetEndTime();
2058
2059 //      if (!startTime || !endTime || startTime > endTime) 
2060 //      {
2061 //              Log("SHUTTLE",
2062 //                      Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d. Skipping!",
2063 //                              run, startTime, endTime));              
2064 //              
2065 //              Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2066 //              fLogbookEntry = entry;  
2067 //              if (!UpdateShuttleLogbook("shuttle_done"))
2068 //              {
2069 //                      AliError(Form("Could not update logbook for run %d !", run));
2070 //              }
2071 //              fLogbookEntry = 0;
2072 //                              
2073 //              delete entry;
2074 //              delete aRow;
2075 //              delete aResult;
2076 //              return 0;
2077 //      }
2078
2079         if (!startTime) 
2080         {
2081                 Log("SHUTTLE",
2082                         Form("QueryRunParameters - Invalid parameters for Run %d: " 
2083                                 "startTime = %u, endTime = %u. Skipping!",
2084                                         run, startTime, endTime));
2085
2086                 Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2087                 fLogbookEntry = entry;  
2088                 if (!UpdateShuttleLogbook("shuttle_ignored"))
2089                 {
2090                         AliError(Form("Could not update logbook for run %d !", run));
2091                 }
2092                 fLogbookEntry = 0;
2093                                 
2094                 delete entry;
2095                 delete aRow;
2096                 delete aResult;
2097                 return 0;
2098         }
2099         
2100         if (startTime && !endTime) 
2101         {
2102                 // TODO Here we don't mark SHUTTLE done, because this may mean 
2103                 //the run is still ongoing!!            
2104                 Log("SHUTTLE",
2105                         Form("QueryRunParameters - Invalid parameters for Run %d: "
2106                              "startTime = %d, endTime = %d. Skipping (Shuttle won't be marked as DONE)!",
2107                                         run, startTime, endTime));              
2108                 
2109                 //Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2110                 //fLogbookEntry = entry;        
2111                 //if (!UpdateShuttleLogbook("shuttle_done"))
2112                 //{
2113                 //      AliError(Form("Could not update logbook for run %d !", run));
2114                 //}
2115                 //fLogbookEntry = 0;
2116                                 
2117                 delete entry;
2118                 delete aRow;
2119                 delete aResult;
2120                 return 0;
2121         }
2122                         
2123         if (startTime && endTime && (startTime > endTime)) 
2124         {
2125                 Log("SHUTTLE",
2126                         Form("QueryRunParameters - Invalid parameters for Run %d: "
2127                                 "startTime = %d, endTime = %d. Skipping!",
2128                                         run, startTime, endTime));              
2129                 
2130                 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2131                 fLogbookEntry = entry;  
2132                 if (!UpdateShuttleLogbook("shuttle_ignored"))
2133                 {
2134                         AliError(Form("Could not update logbook for run %d !", run));
2135                 }
2136                 fLogbookEntry = 0;
2137                                 
2138                 delete entry;
2139                 delete aRow;
2140                 delete aResult;
2141                 return 0;
2142         }
2143                         
2144         TString totEventsStr = entry->GetRunParameter("totalEvents");  
2145         Int_t totEvents = totEventsStr.Atoi();
2146         if (totEvents < 1) 
2147         {
2148                 Log("SHUTTLE",
2149                         Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));             
2150                 
2151                 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));           
2152                 fLogbookEntry = entry;  
2153                 if (!UpdateShuttleLogbook("shuttle_ignored"))
2154                 {
2155                         AliError(Form("Could not update logbook for run %d !", run));
2156                 }
2157                 fLogbookEntry = 0;
2158                                 
2159                 delete entry;
2160                 delete aRow;
2161                 delete aResult;
2162                 return 0;
2163         }
2164
2165         delete aRow;
2166         delete aResult;
2167
2168         return entry;
2169 }
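
// Editorial sketch, not part of AliShuttle: the three time checks at the end of
// QueryRunParameters above can be read as a single classification of the pair
// (startTime, endTime). The names RunTimeStatus, ClassifyRunTimes and
// ShouldMarkIgnored are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical names, not part of AliShuttle.
enum class RunTimeStatus { kValid, kNoStart, kStillOngoing, kEndBeforeStart };

// Applies the checks in the same order as QueryRunParameters.
inline RunTimeStatus ClassifyRunTimes(std::uint32_t startTime, std::uint32_t endTime)
{
    if (startTime == 0) return RunTimeStatus::kNoStart;             // run skipped, logbook updated
    if (endTime == 0) return RunTimeStatus::kStillOngoing;          // run skipped, logbook untouched
    if (startTime > endTime) return RunTimeStatus::kEndBeforeStart; // run skipped, logbook updated
    return RunTimeStatus::kValid;
}

// True for the cases in which the code above calls
// UpdateShuttleLogbook("shuttle_ignored").
inline bool ShouldMarkIgnored(RunTimeStatus status)
{
    return status == RunTimeStatus::kNoStart ||
           status == RunTimeStatus::kEndBeforeStart;
}
```

// Only the "still ongoing" case leaves the logbook untouched, so that the run
// can be picked up again once its end time has been filled in.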
2170
2171 //______________________________________________________________________________________________
2172 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2173                               DCSType type, Int_t multiSplit)
2174 {
2175         // Retrieve all "entry" data points from the DCS server
2176         // host, port: TSocket connection parameters
2177         // entries: list of name of the alias or data point
2178         // type: kAlias or kDP
2179         // returns TMap of values, 0 when failure
2180         
2181         AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2182
2183         TMap* result = 0;
2184         if (type == kAlias)
2185         {
2186                 result = client.GetAliasValues(entries, GetCurrentStartTime(), 
2187                         GetCurrentEndTime());
2188         } 
2189         else if (type == kDP)
2190         {
2191                 result = client.GetDPValues(entries, GetCurrentStartTime(), 
2192                         GetCurrentEndTime());
2193         }
2194
2195         if (result == 0)
2196         {
2197                 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2198                         client.GetErrorString(client.GetResultErrorCode())));
2199                 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)        
2200                         Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2201                                 client.GetServerError().Data()));
2202
2203                 return 0;
2204         }
2205                 
2206         return result;
2207 }
2208
2209 //______________________________________________________________________________________________
2210 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2211                 const char* id, const char* source)
2212 {
2213         // Get calibration file from file exchange servers
2214         // First queries the FXS database for the file name, using the run, detector, id and source info
2215         // then calls RetrieveFile(filename) for actual copy to local disk
2216         // run: current run being processed (given by Logbook entry fLogbookEntry)
2217         // detector: the Preprocessor name
2218         // id: provided as a parameter by the Preprocessor
2219         // source: provided by the Preprocessor through GetFileSources function
2220
2221         // check if test mode should simulate a FXS error
2222         if (fTestMode & kErrorFXSFiles)
2223         {
2224                 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2225                 return 0;
2226         }
2227         
2228         // check connection; connect if not already connected
2229         if (!Connect(system))
2230         {
2231                 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2232                 return 0;
2233         }
2234
2235         // Query preparation
2236         TString sourceName(source);
2237         Int_t nFields = 3;
2238         TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2239                                                                 fConfig->GetFXSdbTable(system));
2240         TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2241                                                                 GetCurrentRun(), detector, id);
2242
2243         if (system == kDAQ)
2244         {
2245                 whereClause += Form(" and DAQsource=\"%s\"", source);
2246         }
2247         else if (system == kDCS)
2248         {
2249                 sourceName="none";
2250         }
2251         else if (system == kHLT)
2252         {
2253                 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2254                 nFields = 3;
2255         }
2256
2257         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2258
2259         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2260
2261         // Query execution
2262         TSQLResult* aResult = 0;
2263         aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2264         if (!aResult) {
2265                 Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
2266                                 GetSystemName(system), id, sourceName.Data()));
2267                 return 0;
2268         }
2269
2270         if(aResult->GetRowCount() == 0)
2271         {
2272                 Log(detector,
2273                         Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
2274                                 GetSystemName(system), id, sourceName.Data()));
2275                 delete aResult;
2276                 return 0;
2277         }
2278
2279         if (aResult->GetRowCount() > 1) {
2280                 Log(detector,
2281                         Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
2282                                 GetSystemName(system), id, sourceName.Data()));
2283                 delete aResult;
2284                 return 0;
2285         }
2286
2287         if (aResult->GetFieldCount() != nFields) {
2288                 Log(detector,
2289                         Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
2290                                 GetSystemName(system), id, sourceName.Data()));
2291                 delete aResult;
2292                 return 0;
2293         }
2294
2295         TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2296
2297         if (!aRow){
2298                 Log(detector, Form("GetFile - Empty result set in %s FXS db from query: id = %s, source = %s",
2299                                 GetSystemName(system), id, sourceName.Data()));
2300                 delete aResult;
2301                 return 0;
2302         }
2303
2304         TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2305         TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2306         TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2307
2308         delete aResult;
2309         delete aRow;
2310
2311         AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2312                                 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2313
2314         // retrieved file is renamed to make it unique
2315         TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2316                                         GetShuttleTempDir(), detector, GetCurrentRun(),
2317                                         GetSystemName(system), detector, GetCurrentRun(), 
2318                                         id, sourceName.Data());
2319
2320
2321         // file retrieval from FXS
2322         UInt_t nRetries = 0;
2323         UInt_t maxRetries = 3;
2324         Bool_t result = kFALSE;
2325
2326         // copy!! if successful TSystem::Exec returns 0
2327         while(nRetries++ < maxRetries) {
2328                 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
2329                 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2330                 if(!result)
2331                 {
2332                         Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
2333                                         filePath.Data(), GetSystemName(system)));
2334                         continue;
2335                 } 
2336
2337                 if (fileChecksum.Length()>0)
2338                 {
2339                         // compare md5sum of local file with the one stored in the FXS DB
2340                         Int_t md5Comp = gSystem->Exec(Form("md5sum %s | grep %s > /dev/null 2>&1",
2341                                                 localFileName.Data(), fileChecksum.Data()));
2342
2343                         if (md5Comp != 0)
2344                         {
2345                                 Log(detector, Form("GetFile - md5sum of file %s does not match the local copy!",
2346                                                         filePath.Data()));
2347                                 result = kFALSE;
2348                                 continue;
2349                         }
2350                 } else {
2351                         Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2352                                                         filePath.Data(), GetSystemName(system)));
2353                 }
2354                 if (result) break;
2355         }
2356
2357         if(!result) return 0;
2358
2359         fFXSCalled[system]=kTRUE;
2360         TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2361         fFXSlist[system].Add(fileParams);
2362
2363         static TString staticLocalFileName;
2364         staticLocalFileName.Form("%s", localFileName.Data());
2365         
2366         Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2367                         "source %s from %s to %s", id, source, 
2368                         GetSystemName(system), localFileName.Data()));
2369                         
2370         return staticLocalFileName.Data();
2371 }
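
// Editorial sketch, not part of AliShuttle: the retrieval loop in GetFile above
// retries the copy and accepts it only when the checksum matches. The two
// helpers below illustrate that pattern without ROOT or a shell. The names
// FetchWithRetries and ChecksumMatches are hypothetical. ChecksumMatches
// assumes the usual md5sum output layout "<digest>  <file name>" and compares
// only the digest field, whereas the grep used above would also accept a match
// inside the file name.

```cpp
#include <cctype>
#include <functional>
#include <string>

// Hypothetical helper: calls fetch() up to maxTries times and accepts the
// first attempt for which both fetch() and verify() succeed.
inline bool FetchWithRetries(unsigned maxTries,
                             const std::function<bool()>& fetch,
                             const std::function<bool()>& verify)
{
    for (unsigned attempt = 1; attempt <= maxTries; ++attempt) {
        if (!fetch()) continue;     // copy failed, try again
        if (verify()) return true;  // copy succeeded and passed verification
    }
    return false;
}

// Hypothetical helper: compares the first whitespace-delimited field of an
// md5sum output line with the expected digest, ignoring letter case.
inline bool ChecksumMatches(const std::string& md5sumLine, const std::string& expected)
{
    const std::string digest = md5sumLine.substr(0, md5sumLine.find_first_of(" \t"));
    if (digest.empty() || digest.size() != expected.size()) return false;
    for (std::string::size_type i = 0; i < digest.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(digest[i])) !=
            std::tolower(static_cast<unsigned char>(expected[i]))) {
            return false;
        }
    }
    return true;
}
```

// In GetFile the fetch step corresponds to RetrieveFile and the verify step to
// the md5sum comparison, which is skipped when the database holds no checksum.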
2372
2373 //______________________________________________________________________________________________
2374 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2375 {
2376         //
2377         // Copies file from FXS to local Shuttle machine
2378         //
2379
2380         // check temp directory: trying to cd to temp; if it does not exist, create it
2381         AliDebug(2, Form("Copy file %s from %s FXS into %s",
2382                         fxsFileName, GetSystemName(system), localFileName));
2383                         
2384         TString tmpDir(localFileName);
2385         
2386         tmpDir = tmpDir(0,tmpDir.Last('/'));
2387
2388         Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
2389         if (noDir) // temp dir does not exist!
2390         {
2391                 if (gSystem->mkdir(tmpDir.Data(), 1))
2392                 {
2393                         Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2394                         return kFALSE;
2395                 }
2396         }
2397
2398         TString baseFXSFolder;
2399         if (system == kDAQ)
2400         {
2401                 baseFXSFolder = "FES/";
2402         }
2403         else if (system == kDCS)
2404         {
2405                 baseFXSFolder = "";
2406         }
2407         else if (system == kHLT)
2408         {
2409                 baseFXSFolder = "/opt/FXS/";
2410         }
2411
2412
2413         TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2414                 fConfig->GetFXSPort(system),
2415                 fConfig->GetFXSUser(system),
2416                 fConfig->GetFXSHost(system),
2417                 baseFXSFolder.Data(),
2418                 fxsFileName,
2419                 localFileName);
2420
2421         AliDebug(2, Form("%s",command.Data()));
2422
2423         Bool_t result = (gSystem->Exec(command.Data()) == 0);
2424
2425         return result;
2426 }
2427
2428 //______________________________________________________________________________________________
2429 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2430 {
2431         //
2432         // Get sources producing the condition file Id from file exchange servers
2433         // if id is NULL all sources are returned (distinct)
2434         //
2435
2436         Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id ? id : "(any)", GetSystemName(system)));
2437         
2438         // check if test mode should simulate a FXS error
2439         if (fTestMode & kErrorFXSSources)
2440         {
2441                 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2442                 return 0;
2443         }
2444
2445         if (system == kDCS)
2446         {
2447                 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2448                 TList *list = new TList();
2449                 list->SetOwner(1);
2450                 list->Add(new TObjString(" "));
2451                 return list;
2452         }
2453
2454         // check connection; connect if not already connected
2455         if (!Connect(system))
2456         {
2457                 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2458                 return NULL;
2459         }
2460
2461         TString sourceName;
2462         if (system == kDAQ)
2463         {
2464                 sourceName = "DAQsource";
2465         } else if (system == kHLT)
2466         {
2467                 sourceName = "DDLnumbers";
2468         }
2469
2470         TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2471         TString whereClause = Form("run=%d and detector=\"%s\"",
2472                                 GetCurrentRun(), detector);
2473         if (id)
2474                 whereClause += Form(" and fileId=\"%s\"", id);
2475         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2476
2477         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2478
2479         // Query execution
2480         TSQLResult* aResult;
2481         aResult = fServer[system]->Query(sqlQuery);
2482         if (!aResult) {
2483                 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2484                                 GetSystemName(system), id ? id : "(any)"));
2485                 return 0;
2486         }
2487
2488         TList *list = new TList();
2489         list->SetOwner(1);
2490         
2491         if (aResult->GetRowCount() == 0)
2492         {
2493                 Log(detector,
2494                         Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id ? id : "(any)"));
2495                 delete aResult;
2496                 return list;
2497         }
2498
2499         Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2500
2501         TSQLRow* aRow;
2502         while ((aRow = aResult->Next()))
2503         {
2504
2505                 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2506                 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2507                 list->Add(new TObjString(source));
2508                 delete aRow;
2509         }
2510
2511         delete aResult;
2512
2513         return list;
2514 }
2515
2516 //______________________________________________________________________________________________
2517 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2518 {
2519         //
2520         // Get all ids of condition files produced by a given source from file exchange servers
2521         //
2522         
2523         Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source ? source : "(any)", GetSystemName(system)));
2524
2525         // check if test mode should simulate a FXS error
2526         if (fTestMode & kErrorFXSSources)
2527         {
2528                 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2529                 return 0;
2530         }
2531
2532         // check connection; connect if not already connected
2533         if (!Connect(system))
2534         {
2535                 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2536                 return NULL;
2537         }
2538
2539         TString sourceName;
2540         if (system == kDAQ)
2541         {
2542                 sourceName = "DAQsource";
2543         } else if (system == kHLT)
2544         {
2545                 sourceName = "DDLnumbers";
2546         }
2547
2548         TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2549         TString whereClause = Form("run=%d and detector=\"%s\"",
2550                                 GetCurrentRun(), detector);
2551         if (sourceName.Length() > 0 && source)
2552                 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2553         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2554
2555         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2556
2557         // Query execution
2558         TSQLResult* aResult;
2559         aResult = fServer[system]->Query(sqlQuery);
2560         if (!aResult) {
2561                 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2562                                 GetSystemName(system), source));
2563                 return 0;
2564         }
2565
2566         TList *list = new TList();
2567         list->SetOwner(1);
2568         
2569         if (aResult->GetRowCount() == 0)
2570         {
2571                 Log(detector,
2572                         Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2573                 delete aResult;
2574                 return list;
2575         }
2576
2577         Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2578
2579         TSQLRow* aRow;
2580
2581         while ((aRow = aResult->Next()))
2582         {
2583
2584                 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2585                 AliDebug(2, Form("fileId = %s", id.Data()));
2586                 list->Add(new TObjString(id));
2587                 delete aRow;
2588         }
2589
2590         delete aResult;
2591
2592         return list;
2593 }
2594
2595 //______________________________________________________________________________________________
2596 Bool_t AliShuttle::Connect(Int_t system)
2597 {
2598         // Connect to the MySQL server of the system's FXS database, or of the DAQ logbook when system is 3
2599         // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2600         //
2601
2602         // check connection: if already connected return
2603         if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2604
2605         TString dbHost, dbUser, dbPass, dbName;
2606
2607         if (system < 3) // FXS db servers
2608         {
2609                 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2610                 dbUser = fConfig->GetFXSdbUser(system);
2611                 dbPass = fConfig->GetFXSdbPass(system);
2612                 dbName =   fConfig->GetFXSdbName(system);
2613         } else { // Run & Shuttle logbook servers
2614         // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2615                 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2616                 dbUser = fConfig->GetDAQlbUser();
2617                 dbPass = fConfig->GetDAQlbPass();
2618                 dbName =   fConfig->GetDAQlbDB();
2619         }
2620
2621         fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2622         if (!fServer[system] || !fServer[system]->IsConnected()) {
2623                 if(system < 3)
2624                 {
2625                 AliError(Form("Can't establish connection to FXS database for %s",
2626                                         AliShuttleInterface::GetSystemName(system)));
2627                 } else {
2628                 AliError("Can't establish connection to Run logbook.");
2629                 }
2630                 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; } // reset to avoid a dangling pointer on the next call
2631                 return kFALSE;
2632         }
2633
2634         // Get tables
2635         TSQLResult* aResult=0;
2636         switch(system){
2637                 case kDAQ:
2638                         aResult = fServer[kDAQ]->GetTables(dbName.Data());
2639                         break;
2640                 case kDCS:
2641                         aResult = fServer[kDCS]->GetTables(dbName.Data());
2642                         break;
2643                 case kHLT:
2644                         aResult = fServer[kHLT]->GetTables(dbName.Data());
2645                         break;
2646                 default:
2647                         aResult = fServer[3]->GetTables(dbName.Data());
2648                         break;
2649         }
2650
2651         delete aResult;
2652         return kTRUE;
2653 }
2654
2655 //______________________________________________________________________________________________
2656 Bool_t AliShuttle::UpdateTable()
2657 {
2658         //
2659         // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2660         //
2661
2662         Bool_t result = kTRUE;
2663
2664         for (UInt_t system=0; system<3; system++)
2665         {
2666                 if(!fFXSCalled[system]) continue;
2667
2668                 // check connection; connect if not already connected
2669                 if (!Connect(system))
2670                 {
2671                         Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2672                         result = kFALSE;
2673                         continue;
2674                 }
2675
2676                 TTimeStamp now; // now
2677
2678                 // Loop on FXS list entries
2679                 TIter iter(&fFXSlist[system]);
2680                 TObjString *aFXSentry=0;
2681                 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2682                 {
2683                         TString aFXSentrystr = aFXSentry->String();
2684                         TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2685                         if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2686                         {
2687                                 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2688                                         GetSystemName(system), aFXSentrystr.Data()));
2689                                 if(aFXSarray) delete aFXSarray;
2690                                 result = kFALSE;
2691                                 continue;
2692                         }
2693                         const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2694                         const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2695
2696                         TString whereClause;
2697                         if (system == kDAQ)
2698                         {
2699                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2700                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2701                         }
2702                         else if (system == kDCS)
2703                         {
2704                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2705                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId);
2706                         }
2707                         else if (system == kHLT)
2708                         {
2709                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2710                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2711                         }
2712
2713                         delete aFXSarray;
2714
2715                         TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2716                                                                 (Long_t) now.GetSec(), whereClause.Data());
2717
2718                         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2719
2720                         // Query execution
2721                         TSQLResult* aResult;
2722                         aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2723                         if (!aResult)
2724                         {
2725                                 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2726                                                                 GetSystemName(system), sqlQuery.Data()));
2727                                 result = kFALSE;
2728                                 continue;
2729                         }
2730                         delete aResult;
2731                 }
2732         }
2733
2734         return result;
2735 }
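
// Editorial sketch, not part of AliShuttle: GetFile stores each retrieved file
// as "id#!?!#source" and UpdateTable above splits it with TString::Tokenize,
// which treats its argument as a set of delimiter characters. An id or source
// containing '#', '!' or '?' would therefore yield more than two tokens and be
// rejected. The hypothetical helper SplitFxsEntry below shows one way to split
// on the full separator string instead, using only the standard library.

```cpp
#include <string>

// Hypothetical helper: splits "id#!?!#source" at the first occurrence of the
// complete separator. Returns false when the separator is absent.
inline bool SplitFxsEntry(const std::string& entry, std::string& id, std::string& source)
{
    static const std::string kSeparator = "#!?!#";
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;
    id = entry.substr(0, pos);
    source = entry.substr(pos + kSeparator.size());
    return true;
}
```

// Unlike the Tokenize-based check, this accepts an empty id or source; a caller
// that needs both to be non-empty would have to test for that separately.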
2736
2737 //______________________________________________________________________________________________
2738 Bool_t AliShuttle::UpdateTableFailCase()
2739 {
2740         // Update the FXS table, filling the time_processed field in all rows corresponding to the current run and detector.
2741         // This is called when the preprocessor is declared failed for the current run, because
2742         // otherwise the fields are updated only in case of success
2743
2744         Bool_t result = kTRUE;
2745
2746         for (UInt_t system=0; system<3; system++)
2747         {
2748                 // check connection; connect if not already connected
2749                 if (!Connect(system))
2750                 {
2751                         Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2752                                                         GetSystemName(system)));
2753                         result = kFALSE;
2754                         continue;
2755                 }
2756
2757                 TTimeStamp now; // now
2758
2759                 // Update all rows of the current run and detector with a single query
2760
2761                 TString whereClause = Form("where run=%d and detector=\"%s\";",
2762                                                 GetCurrentRun(), fCurrentDetector.Data());
2763
2764
2765                 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2766                                                         (Long_t) now.GetSec(), whereClause.Data());
2767
2768                 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2769
2770                 // Query execution
2771                 TSQLResult* aResult;
2772                 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2773                 if (!aResult)
2774                 {
2775                         Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2776                                                         GetSystemName(system), sqlQuery.Data()));
2777                         result = kFALSE;
2778                         continue;
2779                 }
2780                 delete aResult;
2781         }
2782
2783         return result;
2784 }
2785
2786 //______________________________________________________________________________________________
2787 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2788 {
2789         //
2790         // Update Shuttle logbook: fill the detector column, or set shuttle_done when detector is "shuttle_done" or "shuttle_ignored"
2791         // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2792         //
2793
2794         // check connection; connect if not already connected
2795         if(!Connect(3)){
2796                 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2797                 return kFALSE;
2798         }
2799
2800         TString detName(detector);
2801         TString setClause;
2802         if (detName == "shuttle_done" || detName == "shuttle_ignored")
2803         {
2804                 setClause = "set shuttle_done=1";
2805
2806                 if (detName == "shuttle_done")
2807                 {
2808                         // Send the information to ML
2809                         TMonaLisaText  mlStatus("SHUTTLE_status", "Done");
2810
2811                         TList mlList;
2812                         mlList.Add(&mlStatus);
2813                 
2814                         TString mlID;
2815                         mlID.Form("%d", GetCurrentRun());
2816                         fMonaLisa->SendParameters(&mlList, mlID);
2817                 }
2818         } else {
2819                 TString statusStr(status);
2820                 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2821                    statusStr.Contains("failed", TString::kIgnoreCase)){
2822                         setClause = Form("set %s=\"%s\"", detector, status);
2823                 } else {
2824                         Log("SHUTTLE",
2825                                 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2826                                         status, detector));
2827                         return kFALSE;
2828                 }
2829         }
2830
2831         TString whereClause = Form("where run=%d", GetCurrentRun());
2832
2833         TString sqlQuery = Form("update %s %s %s",
2834                                         fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2835
2836         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2837
2838         // Query execution
2839         TSQLResult* aResult;
2840         aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2841         if (!aResult) {
2842                 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2843                 return kFALSE;
2844         }
2845         delete aResult;
2846
2847         return kTRUE;
2848 }
2849
2850 //______________________________________________________________________________________________
2851 Int_t AliShuttle::GetCurrentRun() const
2852 {
2853         //
2854         // Get current run from logbook entry
2855         //
2856
2857         return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2858 }
2859
2860 //______________________________________________________________________________________________
2861 UInt_t AliShuttle::GetCurrentStartTime() const
2862 {
2863         //
2864         // Get current start time from logbook entry
2865         //
2866
2867         return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2868 }
2869
2870 //______________________________________________________________________________________________
2871 UInt_t AliShuttle::GetCurrentEndTime() const
2872 {
2873         //
2874         // Get current end time from logbook entry
2875         //
2876
2877         return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2878 }
2879
2880 //______________________________________________________________________________________________
2881 UInt_t AliShuttle::GetCurrentYear() const
2882 {
2883         //
2884         // Get current year from logbook entry
2885         //
2886
2887         if (!fLogbookEntry) return 0;
2888         
2889         TTimeStamp startTime(GetCurrentStartTime());
2890         TString year =  Form("%d",startTime.GetDate());
2891         year = year(0,4);
2892         
2893         return year.Atoi();
2894 }
2895
2896 //______________________________________________________________________________________________
2897 const char* AliShuttle::GetLHCPeriod() const
2898 {
2899         //
2900         // Get current LHC period from logbook entry
2901         //
2902
2903         if (!fLogbookEntry) return 0;
2904                 
2905         return fLogbookEntry->GetRunParameter("LHCperiod");
2906 }
2907
2908 //______________________________________________________________________________________________
2909 void AliShuttle::Log(const char* detector, const char* message)
2910 {
2911         //
2912         // Fill log string with a message
2913         //
2914
2915         TString logRunDir = GetShuttleLogDir();
2916         if (GetCurrentRun() >=0)
2917                 logRunDir += Form("/%d", GetCurrentRun());
2918         
2919         void* dir = gSystem->OpenDirectory(logRunDir.Data());
2920         if (dir == NULL) {
2921                 if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
2922                         AliError(Form("Can't create directory <%s>", logRunDir.Data()));
2923                         return;
2924                 }
2925
2926         } else {
2927                 gSystem->FreeDirectory(dir);
2928         }
2929
2930         TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2931         if (GetCurrentRun() >= 0) 
2932                 toLog += Form("run %d - ", GetCurrentRun());
2933         toLog += Form("%s", message);
2934
2935         AliInfo(toLog.Data());
2936         
2937         // if the log output is already redirected to the file, stop here
2938         if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2939                 return;
2940
2941         TString fileName = GetLogFileName(detector);
2942         
2943         gSystem->ExpandPathName(fileName);
2944
2945         ofstream logFile;
2946         logFile.open(fileName, ofstream::out | ofstream::app);
2947
2948         if (!logFile.is_open()) {
2949                 AliError(Form("Could not open file %s", fileName.Data()));
2950                 return;
2951         }
2952
2953         logFile << toLog.Data() << "\n";
2954
2955         logFile.close();
2956 }
2957
2958 //______________________________________________________________________________________________
2959 TString AliShuttle::GetLogFileName(const char* detector) const
2960 {
2961         // 
2962         // Returns the name of the log file for a given subdetector
2963         //
2964         
2965         TString fileName;
2966         
2967         if (GetCurrentRun() >= 0) 
2968         {
2969                 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(), 
2970                         detector, GetCurrentRun());
2971         } else {
2972                 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2973         }
2974
2975         return fileName;
2976 }
2977
2978 //______________________________________________________________________________________________
2979 void AliShuttle::SendAlive()
2980 {
2981         // Sends an "Alive" status message to MonALISA (ML)
2982         
2983         TMonaLisaText mlStatus("SHUTTLE_status", "Alive");
2984
2985         TList mlList;
2986         mlList.Add(&mlStatus);
2987
2988         fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
2989 }
2990
2991 //______________________________________________________________________________________________
2992 Bool_t AliShuttle::Collect(Int_t run)
2993 {
2994         //
2995         // Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default).
2996         // If a specific run is given, only that run is processed.
2997         //
2998         // In operational mode, this is the Shuttle function triggered by the EOR signal.
2999         //
3000
3001         if (run == -1)
3002                 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
3003         else
3004                 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
3005
3006         SetLastAction("Starting");
3007
3008         // create ML instance
3009         if (!fMonaLisa)
3010                 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable());
3011                 
3012
3013         SendAlive();
3014
3015         TString whereClause("where shuttle_done=0");
3016         if (run != -1)
3017                 whereClause += Form(" and run=%d", run);
3018
3019         TObjArray shuttleLogbookEntries;
3020         if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
3021         {
3022                 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3023                 return kFALSE;
3024         }
3025
3026         if (shuttleLogbookEntries.GetEntries() == 0)
3027         {
3028                 if (run == -1)
3029                         Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
3030                 else
3031                         Log("SHUTTLE", Form("Collect - Run %d is already DONE "
3032                                                 "or it does not exist in Shuttle logbook", run));
3033                 return kTRUE;
3034         }
3035
3036         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3037                 fFirstUnprocessed[iDet] = kTRUE;
3038
3039         if (run != -1)
3040         {
3041                 // query the Shuttle logbook for earlier runs; if a detector is still unprocessed there,
3042                 // clear its flag in the fFirstUnprocessed array
3043                 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
3044                 TObjArray tmpLogbookEntries;
3045                 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
3046                 {
3047                         Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3048                         return kFALSE;
3049                 }
3050
3051                 TIter iter(&tmpLogbookEntries);
3052                 AliShuttleLogbookEntry* anEntry = 0;
3053                 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
3054                 {
3055                         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3056                         {
3057                                 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
3058                                 {
3059                                         AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
3060                                                         anEntry->GetRun(), GetDetName(iDet)));
3061                                         fFirstUnprocessed[iDet] = kFALSE;
3062                                 }
3063                         }
3064
3065                 }
3066
3067         }
3068
3069         if (!RetrieveConditionsData(shuttleLogbookEntries))
3070         {
3071                 Log("SHUTTLE", "Collect - Processing of at least one run failed");
3072                 return kFALSE;
3073         }
3074
3075         Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
3076         return kTRUE;
3077 }
3078
3079 //______________________________________________________________________________________________
3080 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
3081 {
3082         //