in AliShuttle.cxx: SHUTTLE logbook is updated in case of invalid run times:
1 /**************************************************************************
2  * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3  *                                                                        *
4  * Author: The ALICE Off-line Project.                                    *
5  * Contributors are mentioned in the code where appropriate.              *
6  *                                                                        *
7  * Permission to use, copy, modify and distribute this software and its   *
8  * documentation strictly for non-commercial purposes is hereby granted   *
9  * without fee, provided that the above copyright notice appears in all   *
10  * copies and that both the copyright notice and this permission notice   *
11  * appear in the supporting documentation. The authors make no claims     *
12  * about the suitability of this software for any purpose. It is          *
13  * provided "as is" without express or implied warranty.                  *
14  **************************************************************************/
15
16 /*
17 $Log$
18 Revision 1.68  2007/12/11 10:15:17  acolla
19 Added marking SHUTTLE=DONE for invalid runs
20 (invalid start time or end time) and runs with totalEvents < 1
21
22 Revision 1.67  2007/12/07 19:14:36  acolla
23 in AliShuttleTrigger:
24
25 Added automatic collection of new runs on a regular time basis (settable from the configuration)
26
27 in AliShuttleConfig: new members
28
29 - triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
30 - mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)
31
32 in AliShuttle:
33
34 - logs now stored in logs/#RUN/DET_#RUN.log
35
36 Revision 1.66  2007/12/05 10:45:19  jgrosseo
37 changed order of arguments to TMonaLisaWriter
38
39 Revision 1.65  2007/11/26 16:58:37  acolla
40 Monalisa configuration added: host and table name
41
42 Revision 1.64  2007/11/13 16:15:47  acolla
43 DCS map is stored in a file in the temp folder where the detector is processed.
44 If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.
45
46 Revision 1.63  2007/11/02 10:53:16  acolla
47 Protection added to AliShuttle::CopyFileLocally
48
49 Revision 1.62  2007/10/31 18:23:13  acolla
50 Further development on the Shuttle:
51
52 - Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
53 are now built from /alice/data, e.g.:
54 /alice/data/2007/LHC07a/OCDB
55
56 the year and LHC period are taken from the Shuttle.
57 Raw metadata files are stored by GRP to:
58 /alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root
59
60 - Shuttle sends a mail to DCS experts each time DP retrieval fails.
61
62 Revision 1.61  2007/10/30 20:33:51  acolla
63 Improved managing of temporary folders, which weren't correctly handled.
64 Resolved bug introduced in StoreReferenceFile, which caused the SPD preprocessor to fail.
65
66 Revision 1.60  2007/10/29 18:06:16  acolla
67
68 New function StoreRunMetadataFile added to preprocessor and Shuttle interface
69 This function can be used by GRP only. It stores raw data tags merged file to the
70 raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).
71
72 KNOWN ISSUES:
73
74 1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
75 2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
76 has been corrected in the root package on the Shuttle machine.
77
78 Revision 1.59  2007/10/05 12:40:55  acolla
79
80 Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).
81
82 Revision 1.58  2007/09/28 15:27:40  acolla
83
84 AliDCSClient "multiSplit" option added in the DCS configuration
85 in AliDCSMessage: variable MAX_BODY_SIZE set to 500000
86
87 Revision 1.57  2007/09/27 16:53:13  acolla
88 Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
89 merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.
90
91 Revision 1.56  2007/09/14 16:46:14  jgrosseo
92 1) Connect and Close are called before and after each query, so one can
93 keep the same AliDCSClient object.
94 2) The splitting of a query is moved to GetDPValues/GetAliasValues.
95 3) Splitting interval can be specified in constructor
96
97 Revision 1.55  2007/08/06 12:26:40  acolla
98 Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
99 read from the run logbook.
100
101 Revision 1.54  2007/07/12 09:51:25  jgrosseo
102 removed duplicated log message in GetFile
103
104 Revision 1.53  2007/07/12 09:26:28  jgrosseo
105 updating hlt fxs base path
106
107 Revision 1.52  2007/07/12 08:06:45  jgrosseo
108 adding log messages in getfile... functions
109 adding not implemented copy constructor in alishuttleconfigholder
110
111 Revision 1.51  2007/07/03 17:24:52  acolla
112 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
113
114 Revision 1.50  2007/07/02 17:19:32  acolla
115 preprocessor is run in a temp directory that is removed when process is finished.
116
117 Revision 1.49  2007/06/29 10:45:06  acolla
118 Number of columns in MySql Shuttle logbook increased by one (HLT added)
119
120 Revision 1.48  2007/06/21 13:06:19  acolla
121 GetFileSources returns dummy list with 1 source if system=DCS (better than
122 returning error as it was)
123
124 Revision 1.47  2007/06/19 17:28:56  acolla
125 HLT updated; missing map bug removed.
126
127 Revision 1.46  2007/06/09 13:01:09  jgrosseo
128 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
129
130 Revision 1.45  2007/05/30 06:35:20  jgrosseo
131 Adding functionality to the Shuttle/TestShuttle:
132 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
133 o) Function to retrieve list of IDs for a given source      (GetFileIDs)
134 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
135 Example code has been added to the TestProcessor in TestShuttle
136
137 Revision 1.44  2007/05/11 16:09:32  acolla
138 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
139 example: ITS/SPD/100_filename.root
140
141 Revision 1.43  2007/05/10 09:59:51  acolla
142 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
143
144 Revision 1.42  2007/05/03 08:01:39  jgrosseo
145 typo in last commit :-(
146
147 Revision 1.41  2007/05/03 08:00:48  jgrosseo
148 fixing log message when pp want to skip dcs value retrieval
149
150 Revision 1.40  2007/04/27 07:06:48  jgrosseo
151 GetFileSources returns empty list in case of no files, but successful query
152 No mails sent in testmode
153
154 Revision 1.39  2007/04/17 12:43:57  acolla
155 Correction in StoreOCDB; change of text in mail to detector expert
156
157 Revision 1.38  2007/04/12 08:26:18  jgrosseo
158 updated comment
159
160 Revision 1.37  2007/04/10 16:53:14  jgrosseo
161 redirecting sub detector stdout, stderr to sub detector log file
162
163 Revision 1.35  2007/04/04 16:26:38  acolla
164 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
165 2. Added missing dependency in test preprocessors.
166 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
167
168 Revision 1.34  2007/04/04 10:33:36  jgrosseo
169 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
170 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
171
172 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
173
174 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
175
176 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
177
178 5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
179 If you always need DCS data (like before), you do not need to implement it.
180
181 6) The run type has been added to the monitoring page
182
183 Revision 1.33  2007/04/03 13:56:01  acolla
184 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
185 run type.
186
187 Revision 1.32  2007/02/28 10:41:56  acolla
188 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
189 AliPreprocessor::GetRunType() function.
190 Added some ldap definition files.
191
192 Revision 1.30  2007/02/13 11:23:21  acolla
193 Moved getters and setters of Shuttle's main OCDB/Reference, local
194 OCDB/Reference, temp and log folders to AliShuttleInterface
195
196 Revision 1.27  2007/01/30 17:52:42  jgrosseo
197 adding monalisa monitoring
198
199 Revision 1.26  2007/01/23 19:20:03  acolla
200 Removed old ldif files, added TOF, MCH ldif files. Added some options in
201 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
202 SetShuttleLogDir
203
204 Revision 1.25  2007/01/15 19:13:52  acolla
205 Moved some AliInfo to AliDebug in SendMail function
206
207 Revision 1.21  2006/12/07 08:51:26  jgrosseo
208 update (alberto):
209 table, db names in ldap configuration
210 added GRP preprocessor
211 DCS data can also be retrieved by data point
212
213 Revision 1.20  2006/11/16 16:16:48  jgrosseo
214 introducing strict run ordering flag
215 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
216
217 Revision 1.19  2006/11/06 14:23:04  jgrosseo
218 major update (Alberto)
219 o) reading of run parameters from the logbook
220 o) online offline naming conversion
221 o) standalone DCSclient package
222
223 Revision 1.18  2006/10/20 15:22:59  jgrosseo
224 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
225 o) Merging Collect, CollectAll, CollectNew function
226 o) Removing implementation of empty copy constructors (declaration still there!)
227
228 Revision 1.17  2006/10/05 16:20:55  jgrosseo
229 adapting to new CDB classes
230
231 Revision 1.16  2006/10/05 15:46:26  jgrosseo
232 applying to the new interface
233
234 Revision 1.15  2006/10/02 16:38:39  jgrosseo
235 update (alberto):
236 fixed memory leaks
237 storing of objects that failed to be stored to the grid before
238 interfacing of shuttle status table in daq system
239
240 Revision 1.14  2006/08/29 09:16:05  jgrosseo
241 small update
242
243 Revision 1.13  2006/08/15 10:50:00  jgrosseo
244 effc++ corrections (alberto)
245
246 Revision 1.12  2006/08/08 14:19:29  jgrosseo
247 Update to shuttle classes (Alberto)
248
249 - Possibility to set the full object's path in the Preprocessor's and
250 Shuttle's  Store functions
251 - Possibility to extend the object's run validity in the same classes
252 ("startValidity" and "validityInfinite" parameters)
253 - Implementation of the StoreReferenceData function to store reference
254 data in a dedicated CDB storage.
255
256 Revision 1.11  2006/07/21 07:37:20  jgrosseo
257 last run is stored after each run
258
259 Revision 1.10  2006/07/20 09:54:40  jgrosseo
260 introducing status management: The processing per subdetector is divided into several steps,
261 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
262 can keep track of the number of failures and skips further processing after a certain threshold is
263 exceeded. These thresholds can be configured in LDAP.
264
265 Revision 1.9  2006/07/19 10:09:55  jgrosseo
266 new configuration, access to DAQ FES (Alberto)
267
268 Revision 1.8  2006/07/11 12:44:36  jgrosseo
269 adding parameters for extended validity range of data produced by preprocessor
270
271 Revision 1.7  2006/07/10 14:37:09  jgrosseo
272 small fix + todo comment
273
274 Revision 1.6  2006/07/10 13:01:41  jgrosseo
275 enhanced storing of last successfully processed run (alberto)
276
277 Revision 1.5  2006/07/04 14:59:57  jgrosseo
278 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
279
280 Revision 1.4  2006/06/12 09:11:16  jgrosseo
281 coding conventions (Alberto)
282
283 Revision 1.3  2006/06/06 14:26:40  jgrosseo
284 o) removed files that were moved to STEER
285 o) shuttle updated to follow the new interface (Alberto)
286
287 Revision 1.2  2006/03/07 07:52:34  hristov
288 New version (B.Yordanov)
289
290 Revision 1.6  2005/11/19 17:19:14  byordano
291 RetrieveDATEEntries and RetrieveConditionsData added
292
293 Revision 1.5  2005/11/19 11:09:27  byordano
294 AliShuttle declaration added
295
296 Revision 1.4  2005/11/17 17:47:34  byordano
297 TList changed to TObjArray
298
299 Revision 1.3  2005/11/17 14:43:23  byordano
300 import to local CVS
301
302 Revision 1.1.1.1  2005/10/28 07:33:58  hristov
303 Initial import as subdirectory in AliRoot
304
305 Revision 1.2  2005/09/13 08:41:15  byordano
306 default startTime endTime added
307
308 Revision 1.4  2005/08/30 09:13:02  byordano
309 some docs added
310
311 Revision 1.3  2005/08/29 21:15:47  byordano
312 some docs added
313
314 */
315
316 //
317 // This class is the main manager of the Shuttle.
318 // It organizes the data retrieval from DCS and calls the
319 // interface methods of AliPreprocessor.
320 // For every detector in the AliShuttleConfig (see AliShuttleConfig),
321 // data for its set of aliases is retrieved. If an AliPreprocessor
322 // is registered for this detector, it is used
323 // according to the schema (see AliPreprocessor).
324 // If no AliPreprocessor is registered, the retrieved
325 // data is stored automatically in the underlying AliCDBStorage.
326 // The alias name is used as detSpec.
327 //
328
329 #include "AliShuttle.h"
330
331 #include "AliCDBManager.h"
332 #include "AliCDBStorage.h"
333 #include "AliCDBId.h"
334 #include "AliCDBRunRange.h"
335 #include "AliCDBPath.h"
336 #include "AliCDBEntry.h"
337 #include "AliShuttleConfig.h"
338 #include "DCSClient/AliDCSClient.h"
339 #include "AliLog.h"
340 #include "AliPreprocessor.h"
341 #include "AliShuttleStatus.h"
342 #include "AliShuttleLogbookEntry.h"
343
344 #include <TSystem.h>
345 #include <TObject.h>
346 #include <TString.h>
347 #include <TTimeStamp.h>
348 #include <TObjString.h>
349 #include <TSQLServer.h>
350 #include <TSQLResult.h>
351 #include <TSQLRow.h>
352 #include <TMutex.h>
353 #include <TSystemDirectory.h>
354 #include <TSystemFile.h>
355 #include <TFile.h>
356 #include <TGrid.h>
357 #include <TGridResult.h>
358
359 #include <TMonaLisaWriter.h>
360
361 #include <fstream>
362
363 #include <sys/types.h>
364 #include <sys/wait.h>
365
366 ClassImp(AliShuttle)
367
368 //______________________________________________________________________________________________
369 AliShuttle::AliShuttle(const AliShuttleConfig* config,
370                 UInt_t timeout, Int_t retries):
371 fConfig(config),
372 fTimeout(timeout), fRetries(retries),
373 fPreprocessorMap(),
374 fLogbookEntry(0),
375 fCurrentDetector(),
376 fStatusEntry(0),
377 fMonitoringMutex(0),
378 fLastActionTime(0),
379 fLastAction(),
380 fMonaLisa(0),
381 fTestMode(kNone),
382 fReadTestMode(kFALSE),
383 fOutputRedirected(kFALSE)
384 {
385         //
386         // config: AliShuttleConfig used
387         // timeout: timeout used for AliDCSClient connection
388         // retries: the number of retries in case of connection error.
389         //
390
391         if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
392         for(int iSys=0;iSys<4;iSys++) {
393                 fServer[iSys]=0;
394                 if (iSys < 3)
395                         fFXSlist[iSys].SetOwner(kTRUE);
396         }
397         fPreprocessorMap.SetOwner(kTRUE);
398
399         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
400                 fFirstUnprocessed[iDet] = kFALSE;
401
402         fMonitoringMutex = new TMutex();
403 }
404
405 //______________________________________________________________________________________________
406 AliShuttle::~AliShuttle()
407 {
408         //
409         // destructor
410         //
411
412         fPreprocessorMap.DeleteAll();
413         for(int iSys=0;iSys<4;iSys++)
414                 if(fServer[iSys]) {
415                         fServer[iSys]->Close();
416                         delete fServer[iSys];
417                         fServer[iSys] = 0;
418                 }
419
420         if (fStatusEntry){
421                 delete fStatusEntry;
422                 fStatusEntry = 0;
423         }
424         
425         if (fMonitoringMutex) 
426         {
427                 delete fMonitoringMutex;
428                 fMonitoringMutex = 0;
429         }
430 }
431
432 //______________________________________________________________________________________________
433 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
434 {
435         //
436         // Registers new AliPreprocessor.
437         // It uses GetName() as the identifier of the preprocessor.
438         // The preprocessor is registered only if no other preprocessor
439         // with the same identifier (GetName()) is already registered.
440         //
441
442         const char* detName = preprocessor->GetName();
443         if(GetDetPos(detName) < 0)
444                 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
445
446         if (fPreprocessorMap.GetValue(detName)) {
447                 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
448                 return;
449         }
450
451         fPreprocessorMap.Add(new TObjString(detName), preprocessor);
452 }
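
// Illustration only, not part of AliShuttle: a minimal, ROOT-free sketch of the
// "register by name, reject duplicates" rule used by RegisterPreprocessor above.
// Preprocessor and Registry are hypothetical stand-ins (std::map plays the role of
// the owning TMap). The sketch omits the detector-name check that makes the real
// method abort, and unlike the real method it destroys a rejected duplicate.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for AliPreprocessor: only the name matters here.
struct Preprocessor {
    std::string name;
};

// Hypothetical registry keyed by the preprocessor name.
class Registry {
public:
    // Returns true if the preprocessor was registered,
    // false if another one with the same name is already present.
    bool Register(std::unique_ptr<Preprocessor> pp) {
        assert(pp && "null preprocessor");
        const std::string key = pp->name;
        if (fMap.count(key) != 0) return false;  // duplicate: keep the first one
        fMap[key] = std::move(pp);
        return true;
    }

    std::size_t Size() const { return fMap.size(); }

private:
    std::map<std::string, std::unique_ptr<Preprocessor>> fMap;
};
```

// Usage: registering "TPC" twice succeeds the first time and is refused the second
// time, so the registry still holds exactly one entry.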
453 //______________________________________________________________________________________________
454 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
455                 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
456 {
457         // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
458         // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
459         // using this function. Use StoreReferenceData instead!
460         // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
461         // finishes the data are transferred to the main storage (Grid).
462
463         return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
464 }
465
466 //______________________________________________________________________________________________
467 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
468 {
469         // Stores a CDB object in the storage for reference data. These objects will not be available during
470         // offline reconstruction. Use this function for reference data only!
471         // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
472         // finishes the data are transferred to the main storage (Grid).
473
474         return StoreLocally(fgkLocalRefStorage, path, object, metaData);
475 }
476
477 //______________________________________________________________________________________________
478 Bool_t AliShuttle::StoreLocally(const TString& localUri,
479                         const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
480                         Int_t validityStart, Bool_t validityInfinite)
481 {
482         // Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
483         // When the preprocessor finishes, the data are transferred to the main storage (Grid).
484         // The parameters are:
485         //   1) URI of the backup storage (local)
486         //   2) the object's path
487         //   3) the object to be stored
488         //   4) the metadata to be associated with the object
489         //   5) the validity start, as a number of runs before the current run (first valid run = current run - validityStart);
490         //      if the data is valid only for this run leave the default 0
491         //   6) specifies if the calibration data is valid for infinity (this means until updated),
492         //      typical for calibration runs, the default is kFALSE
493         //
494         // Returns kFALSE on failure, kTRUE otherwise
495
496         if (fTestMode & kErrorStorage)
497         {
498                 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
499                 return kFALSE;
500         }
501         
502         const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
503
504         Int_t firstRun = GetCurrentRun() - validityStart;
505         if(firstRun < 0) {
506                 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
507                 firstRun=0;
508         }
509
510         Int_t lastRun = -1;
511         if(validityInfinite) {
512                 lastRun = AliCDBRunRange::Infinity();
513         } else {
514                 lastRun = GetCurrentRun();
515         }
516
517         // Version is set to current run, it will be used later to transfer data to Grid
518         AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
519
520         if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
521                 TObjString runUsed = Form("%d", GetCurrentRun());
522                 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
523         }
524
525         Bool_t result = kFALSE;
526
527         if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
528                 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
529         } else {
530                 result = AliCDBManager::Instance()->GetStorage(localUri)
531                                         ->Put(object, id, metaData);
532         }
533
534         if(!result) {
535
536                 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
537         }
538
539         return result;
540 }
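
// Illustration only, not part of AliShuttle: a ROOT-free sketch of how StoreLocally
// above derives the run validity range from validityStart and validityInfinite.
// RunRange, ComputeValidity and kRunInfinity are hypothetical names; in particular
// the numeric value of kRunInfinity is an assumption standing in for
// AliCDBRunRange::Infinity().

```cpp
#include <cassert>

// Placeholder for AliCDBRunRange::Infinity(); the value is an assumption.
const int kRunInfinity = 999999999;

struct RunRange {
    int first;  // first run for which the object is valid
    int last;   // last run for which the object is valid
};

RunRange ComputeValidity(int currentRun, int validityStart, bool validityInfinite)
{
    assert(currentRun >= 0);

    // validityStart counts runs backwards from the current run.
    int first = currentRun - validityStart;
    if (first < 0) first = 0;  // same clamp as in StoreLocally

    // Either open-ended validity or validity for the current run only.
    const int last = validityInfinite ? kRunInfinity : currentRun;

    return RunRange{first, last};
}
```

// Usage: ComputeValidity(100, 3, false) describes runs 97 to 100, while
// ComputeValidity(2, 5, true) is clamped to start at run 0 and is open-ended.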
541
542 //______________________________________________________________________________________________
543 Bool_t AliShuttle::StoreOCDB()
544 {
545         //
546         // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
547         // Calls underlying StoreOCDB(const TString&) function twice, for OCDB and Reference storage.
548         // Then calls CopyFilesToGrid to store reference files and, for GRP, the run metadata file.
549         //
550         
551         if (fTestMode & kErrorGrid)
552         {
553                 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
554                 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
555                 return kFALSE;
556         }
557         
558         Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
559         Bool_t resultCDB = StoreOCDB(fgkMainCDB);
560
561         Log("SHUTTLE","StoreOCDB - Storing reference data ...");
562         Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
563         
564         Log("SHUTTLE","StoreOCDB - Storing reference files ...");
565         Bool_t resultRefFiles = CopyFilesToGrid("reference");
566         
567         Bool_t resultMetadata = kTRUE;
568         if(fCurrentDetector == "GRP") 
569         {
570                 Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
571                 resultMetadata = CopyFilesToGrid("metadata");
572         }
573         
574         return resultCDB && resultRef && resultRefFiles && resultMetadata;
575 }
576
577 //______________________________________________________________________________________________
578 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
579 {
580         //
581         // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
582         //
583
584         TObjArray* gridIds=0;
585
586         Bool_t result = kTRUE;
587
588         const char* type = 0;
589         TString localURI;
590         if(gridURI == fgkMainCDB) {
591                 type = "OCDB";
592                 localURI = fgkLocalCDB;
593         } else if(gridURI == fgkMainRefStorage) {
594                 type = "reference";
595                 localURI = fgkLocalRefStorage;
596         } else {
597                 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
598                 return kFALSE;
599         }
600
601         AliCDBManager* man = AliCDBManager::Instance();
602
603         AliCDBStorage *gridSto = man->GetStorage(gridURI);
604         if(!gridSto) {
605                 Log("SHUTTLE",
606                         Form("StoreOCDB - cannot activate main %s storage", type));
607                 return kFALSE;
608         }
609
610         gridIds = gridSto->GetQueryCDBList();
611
612         // get objects previously stored in local CDB
613         AliCDBStorage *localSto = man->GetStorage(localURI);
614         if(!localSto) {
615                 Log("SHUTTLE",
616                         Form("StoreOCDB - cannot activate local %s storage", type));
617                 return kFALSE;
618         }
619         AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
620         // Local objects were stored with current run as Grid version!
621         TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
622         localEntries->SetOwner(1);
623
624         // loop on local stored objects
625         TIter localIter(localEntries);
626         AliCDBEntry *aLocEntry = 0;
627         while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
628                 aLocEntry->SetOwner(1);
629                 AliCDBId aLocId = aLocEntry->GetId();
630                 aLocEntry->SetVersion(-1);
631                 aLocEntry->SetSubVersion(-1);
632
633                 // If local object is valid up to infinity we store it only if it is
634                 // the first unprocessed run!
635                 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
636                         !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
637                 {
638                         Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
639                                                 "there are previous unprocessed runs!",
640                                                 fCurrentDetector.Data(), aLocId.GetPath().Data()));
641                         continue;
642                 }
643
644                 // loop on Grid valid Id's
645                 Bool_t store = kTRUE;
646                 TIter gridIter(gridIds);
647                 AliCDBId* aGridId = 0;
648                 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
649                         if(aGridId->GetPath() != aLocId.GetPath()) continue;
650                         // skip all objects valid up to infinity
651                         if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
652                         // if we get here, it means there's already some more recent object stored on Grid!
653                         store = kFALSE;
654                         break;
655                 }
656
657                 // Put the object on the Grid only if no more recent object already exists there
658                 Bool_t storeOk = store && gridSto->Put(aLocEntry);
659                 if(!store || storeOk){
660
661                         if (!store)
662                         {
663                                 Log(fCurrentDetector.Data(),
664                                         Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
665                                                 type, aGridId->ToString().Data()));
666                         } else {
667                                 Log("SHUTTLE",
668                                         Form("StoreOCDB - Object <%s> successfully put into %s storage",
669                                                 aLocId.ToString().Data(), type));
670                                 Log(fCurrentDetector.Data(),
671                                         Form("StoreOCDB - Object <%s> successfully put into %s storage",
672                                                 aLocId.ToString().Data(), type));
673                         }
674
675                         // removing local filename...
676                         TString filename;
677                         localSto->IdToFilename(aLocId, filename);
678                         Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
679                         RemoveFile(filename.Data());
680                         continue;
681                 } else  {
682                         Log("SHUTTLE",
683                                 Form("StoreOCDB - Grid %s storage of object <%s> failed",
684                                         type, aLocId.ToString().Data()));
685                         Log(fCurrentDetector.Data(),
686                                 Form("StoreOCDB - Grid %s storage of object <%s> failed",
687                                         type, aLocId.ToString().Data()));
688                         result = kFALSE;
689                 }
690         }
691         localEntries->Clear();
692
693         return result;
694 }
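
// Illustration only, not part of AliShuttle: a ROOT-free sketch of the per-object
// decision that the loop in StoreOCDB(const TString&) above is meant to take.
// ObjectId, Decision, DecideStore and kRunInfinity are hypothetical names, and the
// value of kRunInfinity is an assumed stand-in for AliCDBRunRange::Infinity().
// The sketch covers only the decision; a failed Grid transfer (local file kept,
// overall result kFALSE) is outside its scope.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Placeholder for AliCDBRunRange::Infinity(); the value is an assumption.
const int kRunInfinity = 999999999;

// Hypothetical, reduced stand-in for AliCDBId.
struct ObjectId {
    std::string path;
    int lastRun;
};

enum class Decision {
    kStore,           // transfer the object to the Grid
    kSkipKeepLocal,   // do not transfer, keep the local copy for a later attempt
    kSkipRemoveLocal  // do not transfer, the local copy can be removed
};

Decision DecideStore(const ObjectId& local,
                     const std::vector<ObjectId>& gridIds,
                     bool firstUnprocessed)
{
    assert(!local.path.empty());

    // Objects valid up to infinity are stored only for the first unprocessed run.
    if (local.lastRun == kRunInfinity && !firstUnprocessed)
        return Decision::kSkipKeepLocal;

    for (const ObjectId& grid : gridIds) {
        if (grid.path != local.path) continue;         // unrelated object
        if (grid.lastRun == kRunInfinity) continue;    // open-ended objects are ignored
        return Decision::kSkipRemoveLocal;             // a more recent object exists
    }
    return Decision::kStore;
}
```

// Design note: returning an explicit decision keeps the "skip" cases separate, since
// one of them keeps the local file and the other removes it.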
695
696 //______________________________________________________________________________________________
697 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
698 {
699         // clears the directory used to store reference files of a given subdetector
700   
701         AliCDBManager* man = AliCDBManager::Instance();
702         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
703         TString localBaseFolder = sto->GetBaseFolder();
704
705         TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
706         
707         Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
708
709         TString begin;
710         begin.Form("%d_", GetCurrentRun());
711         
712         TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
713         if (!baseDir)
714                 return kTRUE;
715                 
716         TList* dirList = baseDir->GetListOfFiles();
717         delete baseDir;
718         
719         if (!dirList) return kTRUE;
720                         
721         if (dirList->GetEntries() < 3) 
722         {
723                 delete dirList;
724                 return kTRUE;
725         }
726                                 
727         Int_t nDirs = 0, nDel = 0;
728         TIter dirIter(dirList);
729         TSystemFile* entry = 0;
730
731         Bool_t success = kTRUE;
732         
733         while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
734         {                                       
735                 if (entry->IsDirectory())
736                         continue;
737                 
738                 TString fileName(entry->GetName());
739                 if (!fileName.BeginsWith(begin))
740                         continue;
741                         
742                 nDirs++;
743                                                 
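744                 // delete file; Unlink needs the full path, the list entry only holds the file name
745                 Int_t result = gSystem->Unlink(Form("%s/%s", targetDir.Data(), fileName.Data()));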
746                 
747                 if (result)
748                 {
749                         Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
750                         success = kFALSE;
751                 } else {
752                         nDel++;
753                 }
754         }
755
756         if(nDirs > 0)
757                 Log("SHUTTLE", Form("CleanReferenceStorage - %d (out of %d) reference files in folder %s were deleted.", 
758                         nDel, nDirs, targetDir.Data()));
759
760                 
761         delete dirList;
762         return success;
789 }
790
791 //______________________________________________________________________________________________
792 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
793 {
794         //
795         // Stores reference file directly (without opening it). This function stores the file locally.
796         //
797         // The file is stored under the following location: 
798         // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
799         // where <gridFileName> is the second parameter given to the function
800         // 
801         
802         if (fTestMode & kErrorStorage)
803         {
804                 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
805                 return kFALSE;
806         }
807         
808         AliCDBManager* man = AliCDBManager::Instance();
809         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
810         
811         TString localBaseFolder = sto->GetBaseFolder();
812         
813         TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);    
814         target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
815         
816         return CopyFileLocally(localFile, target);
817 }
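
// ----------------------------------------------------------------------------------------------
// Illustrative sketch, not part of AliShuttle: the naming rule used by StoreReferenceFile
// (<prefix>/<RUN#>_<gridFileName>) and the matching "<RUN#>_" filter that
// CleanReferenceStorage and CopyFilesToGrid apply with TString::BeginsWith.
// It uses std::string instead of TString so it can be compiled without ROOT.
// RefFileTargetSketch and BelongsToRunSketch are hypothetical helper names.
#include <string>

// Builds the local target path for a reference file of the given run.
std::string RefFileTargetSketch(const std::string& prefix, int run, const std::string& gridFileName)
{
        return prefix + "/" + std::to_string(run) + "_" + gridFileName;
}

// Returns true if fileName starts with "<run>_". The trailing underscore keeps
// run 123 from matching files that belong to run 1234.
bool BelongsToRunSketch(const std::string& fileName, int run)
{
        const std::string begin = std::to_string(run) + "_";
        return fileName.compare(0, begin.size(), begin) == 0;
}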
818
819 //______________________________________________________________________________________________
820 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
821 {
822         //
823         // Stores Run metadata file to the Grid, in the run folder
824         //
825         // Only GRP can call this function.
826         
827         if (fTestMode & kErrorStorage)
828         {
829                 Log(fCurrentDetector, "StoreRunMetadataFile - In TESTMODE - Simulating error while storing locally");
830                 return kFALSE;
831         }
832         
833         AliCDBManager* man = AliCDBManager::Instance();
834         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
835         
836         TString localBaseFolder = sto->GetBaseFolder();
837         
838         // Build Run level folder
839         // folder = /alice/data/year/lhcPeriod/runNb/Raw
840         
841                 
842         TString lhcPeriod = GetLHCPeriod();     
843         if (lhcPeriod.Length() == 0) 
844         {
845                 Log("SHUTTLE","StoreRunMetadataFile - LHCPeriod not found in logbook!");
846                 return kFALSE;
847         }
848         
849         TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s", 
850                                 localBaseFolder.Data(), GetCurrentYear(), 
851                                 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
852                                         
853         return CopyFileLocally(localFile, target);
854 }
855
856 //______________________________________________________________________________________________
857 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
858 {
859         //
860         // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
861         // Files are temporarily stored in the local reference storage. When the preprocessor 
862         // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn 
863         // (in reference or run level folders)
864         //
865         
866         TString targetDir(target(0, target.Last('/')));
867         
868         // try to open the target folder; create it if it does not exist
869         void* dir = gSystem->OpenDirectory(targetDir.Data());
870         if (dir == NULL) {
871                 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
872                         Log("SHUTTLE", Form("CopyFileLocally - Can't create directory <%s>", targetDir.Data()));
873                         return kFALSE;
874                 }
875
876         } else {
877                 gSystem->FreeDirectory(dir);
878         }
879         
880         Int_t result = 0;
881         
882         result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
883         if (result)
884         {
885                 Log("SHUTTLE", Form("CopyFileLocally - %s does not exist", localFile));
886                 return kFALSE;
887         }
888
889         result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
890         if (!result)
891         {
892                 Log("SHUTTLE", Form("CopyFileLocally - target file %s already exists, removing...", target.Data()));
893                 if (gSystem->Unlink(target.Data()))
894                 {
895                         Log("SHUTTLE", Form("CopyFileLocally - Could not remove existing target file %s!", target.Data()));
896                         return kFALSE;
897                 }
898         }       
899         
900         result = gSystem->CopyFile(localFile, target);
901
902         if (result == 0)
903         {
904                 Log("SHUTTLE", Form("CopyFileLocally - File %s stored locally to %s", localFile, target.Data()));
905                 return kTRUE;
906         }
907         else
908         {
909                 Log("SHUTTLE", Form("CopyFileLocally - Could not store file %s to %s! Error code = %d", 
910                                 localFile, target.Data(), result));
911                 return kFALSE;
912         }       
913
914
915
916 }
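
// ----------------------------------------------------------------------------------------------
// Illustrative sketch, not part of AliShuttle: how CopyFileLocally derives the folder to
// create from the full target path, i.e. everything before the last '/'
// (the TString expression target(0, target.Last('/')) above).
// std::string is used so the sketch compiles without ROOT. ParentDirSketch is a
// hypothetical name, and returning an empty string when the path has no '/' is an
// assumption of this sketch, not documented behaviour of the original code.
#include <string>

std::string ParentDirSketch(const std::string& target)
{
        const std::string::size_type pos = target.rfind('/');
        if (pos == std::string::npos)
                return std::string();   // no directory component
        return target.substr(0, pos);   // drop "/<file name>"
}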
917
918 //______________________________________________________________________________________________
919 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
920 {
921         //
922         // Transfers local files to the Grid. Local files can be reference files 
923         // or run metadata file (from GRP only).
924         //
925         // According to the type (ref, metadata) the files are stored under the following location: 
926         // ref --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
927         // metadata --> <run data folder>/<MetadataFileName>
928         //
929                 
930         AliCDBManager* man = AliCDBManager::Instance();
931         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
932         if (!sto)
933                 return kFALSE;
934         TString localBaseFolder = sto->GetBaseFolder();
935         
936         TString dir;
937         TString alienDir;
938         TString begin;
939         
940         if (strcmp(type, "reference") == 0) 
941         {
942                 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
943                 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
944                 if (!gridSto)
945                         return kFALSE;
946                 TString gridBaseFolder = gridSto->GetBaseFolder();
947                 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
948                 begin = Form("%d_", GetCurrentRun());
949         } 
950         else if (strcmp(type, "metadata") == 0)
951         {
952                         
953                 TString lhcPeriod = GetLHCPeriod();
954         
955                 if (lhcPeriod.Length() == 0) 
956                 {
957                         Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
958                         return kFALSE;
959                 }
960                 
961                 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw", 
962                                 localBaseFolder.Data(), GetCurrentYear(), 
963                                 lhcPeriod.Data(), GetCurrentRun());
964                 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
965                 
966                 begin = "";
967         }
968         else 
969         {
970                 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
971                 return kFALSE;
972         }
973                 
974         TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
975         if (!baseDir)
976                 return kTRUE;
977                 
978         TList* dirList = baseDir->GetListOfFiles();
979         delete baseDir;
980         
981         if (!dirList) return kTRUE;
982                 
983         if (dirList->GetEntries() < 3) 
984         {
985                 delete dirList;
986                 return kTRUE;
987         }
988                         
989         if (!gGrid)
990         { 
991                 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
992                 delete dirList;
993                 return kFALSE;
994         }
995         
996         Int_t nDirs = 0, nTransfer = 0;
997         TIter dirIter(dirList);
998         TSystemFile* entry = 0;
999
1000         Bool_t success = kTRUE;
1001         Bool_t first = kTRUE;
1002         
1003         while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1004         {                       
1005                 if (entry->IsDirectory())
1006                         continue;
1007                         
1008                 TString fileName(entry->GetName());
1009                 if (!fileName.BeginsWith(begin))
1010                         continue;
1011                         
1012                 nDirs++;
1013                         
1014                 if (first)
1015                 {
1016                         first = kFALSE;
1017                         // check that folder exists, otherwise create it
1018                         TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1019                         
1020                         if (!result)
1021                         {
1022                                 delete dirList;
1023                                 return kFALSE;
1024                         }
1025                         
1026                         if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
1027                         {
1028                                 // TODO It does not work currently! Bug in TAliEn::Mkdir
1029                                 // TODO Manually fixed in local root v5-16-00
1030                                 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1031                                 {
1032                                         Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1033                                                         alienDir.Data()));
1034                                         delete dirList;
1035                                         return kFALSE;
1036                                 } else {
1037                                         Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1038                                 }
1039                                 
1040                         } else {
1041                                         Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1042                         }
1043                 }
1044                         
1045                 TString fullLocalPath;
1046                 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1047                 
1048                 TString fullGridPath;
1049                 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1050
1051                 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1052                 
1053                 if (result)
1054                 {
1055                         Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!", 
1056                                                 fullLocalPath.Data(), fullGridPath.Data()));
1057                         RemoveFile(fullLocalPath);
1058                         nTransfer++;
1059                 }
1060                 else
1061                 {
1062                         Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!", 
1063                                                 fullLocalPath.Data(), fullGridPath.Data()));
1064                         success = kFALSE;
1065                 }
1066         }
1067
1068         Log("SHUTTLE", Form("CopyFilesToGrid - %d (out of %d) files in folder %s copied to Grid.", 
1069                                                 nTransfer, nDirs, dir.Data()));
1070
1071                 
1072         delete dirList;
1073         return success;
1074 }
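
// ----------------------------------------------------------------------------------------------
// Illustrative sketch, not part of AliShuttle: how the "metadata" branch of CopyFilesToGrid
// maps the local run folder to the Grid folder. The local folder is built with the same
// format string as above (run number zero-padded to 9 digits), and the Grid folder is the
// tail of that path starting at "/alice/data/".
// std::string and snprintf are used so the sketch compiles without ROOT. The function names
// and the example values in the usage comment ("/local", "LHC07w") are hypothetical.
#include <cstdio>
#include <string>

std::string RunMetadataDirSketch(const std::string& base, int year, const std::string& period, int run)
{
        char buf[512];
        std::snprintf(buf, sizeof(buf), "%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
                        base.c_str(), year, period.c_str(), run);
        return std::string(buf);
}

std::string AlienDirSketch(const std::string& localDir)
{
        const std::string::size_type pos = localDir.find("/alice/data/");
        if (pos == std::string::npos)
                return std::string();   // marker not found: nothing to map
        return localDir.substr(pos);    // keep the marker and everything after it
}

// Usage: AlienDirSketch(RunMetadataDirSketch("/local", 2007, "LHC07w", 12345))
// strips the local base folder and the "GRP/RunMetadata" part from the path.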
1075
1076 //______________________________________________________________________________________________
1077 TString AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1078 {
1079         //
1080         // Get folder name of reference files. The name is returned by value:
1081         // returning dir.Data() would hand out a pointer into a destroyed local TString.
1082         // NOTE: the declaration in AliShuttle.h has to return TString as well.
1083         TString offDetStr(GetOfflineDetName(detector));
1084         TString dir;
1085         if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1086         {
1087                 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1088         } else {
1089                 dir.Form("%s/%s", base, offDetStr.Data());
1090         }
1091         
1092         return dir;
1093         
1094
1095 }
1096
1097 //______________________________________________________________________________________________
1098 void AliShuttle::CleanLocalStorage(const TString& uri)
1099 {
1100         //
1101         // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
1102         //
1103
1104         const char* type = 0;
1105         if(uri == fgkLocalCDB) {
1106                 type = "OCDB";
1107         } else if(uri == fgkLocalRefStorage) {
1108                 type = "Reference";
1109         } else {
1110                 AliError(Form("Invalid storage URI: %s", uri.Data()));
1111                 return;
1112         }
1113
1114         AliCDBManager* man = AliCDBManager::Instance();
1115
1116         // open local storage
1117         AliCDBStorage *localSto = man->GetStorage(uri);
1118         if(!localSto) {
1119                 Log("SHUTTLE",
1120                         Form("CleanLocalStorage - cannot activate local %s storage", type));
1121                 return;
1122         }
1123
1124         TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1125                 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1126
1127         AliDebug(2, Form("filename = %s", filename.Data()));
1128
1129         Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1130                 GetCurrentRun(), fCurrentDetector.Data()));
1131
1132         RemoveFile(filename.Data());
1133
1134 }
1135
1136 //______________________________________________________________________________________________
1137 void AliShuttle::RemoveFile(const char* filename)
1138 {
1139         //
1140         // removes local file
1141         //
1142
1143         TString command(Form("rm -f %s", filename));
1144
1145         Int_t result = gSystem->Exec(command.Data());
1146         if(result != 0)
1147         {
1148                 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1149                         fCurrentDetector.Data(), filename));
1150         }
1151 }
1152
1153 //______________________________________________________________________________________________
1154 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1155 {
1156         //
1157         // Reads the AliShuttleStatus from the CDB
1158         //
1159
1160         if (fStatusEntry){
1161                 delete fStatusEntry;
1162                 fStatusEntry = 0;
1163         }
1164
1165         fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1166                 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1167
1168         if (!fStatusEntry) return 0;
1169         fStatusEntry->SetOwner(1);
1170
1171         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1172         if (!status) {
1173                 AliError("Invalid object stored to CDB!");
1174                 return 0;
1175         }
1176
1177         return status;
1178 }
1179
1180 //______________________________________________________________________________________________
1181 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1182 {
1183         //
1184         // writes the status for one subdetector
1185         //
1186
1187         if (fStatusEntry){
1188                 delete fStatusEntry;
1189                 fStatusEntry = 0;
1190         }
1191
1192         Int_t run = GetCurrentRun();
1193
1194         AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1195
1196         fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1197         fStatusEntry->SetOwner(1);
1198
1199         UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1200
1201         if (!result) {
1202                 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1203                                                 fCurrentDetector.Data(), run));
1204                 return kFALSE;
1205         }
1206         
1207         SendMLInfo();
1208
1209         return kTRUE;
1210 }
1211
1212 //______________________________________________________________________________________________
1213 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1214 {
1215         //
1216         // changes the AliShuttleStatus for the given detector and run to the given status
1217         //
1218
1219         if (!fStatusEntry){
1220                 AliError("UNEXPECTED: fStatusEntry empty");
1221                 return;
1222         }
1223
1224         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1225
1226         if (!status){
1227                 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1228                 return;
1229         }
1230
1231         TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1232                                 fCurrentDetector.Data(),
1233                                 status->GetStatusName(),
1234                                 status->GetStatusName(newStatus));
1235         Log("SHUTTLE", actionStr);
1236         SetLastAction(actionStr);
1237
1238         status->SetStatus(newStatus);
1239         if (increaseCount) status->IncreaseCount();
1240
1241         AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1242
1243         SendMLInfo();
1244 }
1245
1246 //______________________________________________________________________________________________
1247 void AliShuttle::SendMLInfo()
1248 {
1249         //
1250         // sends ML information about the current status of the current detector being processed
1251         //
1252         
1253         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1254         
1255         if (!status){
1256                 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1257                 return;
1258         }
1259         
1260         TMonaLisaText  mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1261         TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1262
1263         TList mlList;
1264         mlList.Add(&mlStatus);
1265         mlList.Add(&mlRetryCount);
1266
1267         fMonaLisa->SendParameters(&mlList);
1268 }
1269
1270 //______________________________________________________________________________________________
1271 Bool_t AliShuttle::ContinueProcessing()
1272 {
1273         // this function reads the AliShuttleStatus information from CDB and
1274         // checks if the processing should be continued
1275         // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1276
1277         if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1278
1279         AliPreprocessor* aPreprocessor =
1280                 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1281         if (!aPreprocessor)
1282         {
1283                 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1284                 return kFALSE;
1285         }
1286
1287         AliShuttleLogbookEntry::Status entryStatus =
1288                 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1289
1290         if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1291                 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1292                                 fCurrentDetector.Data(),
1293                                 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1294                 return kFALSE;
1295         }
1296
1297         // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1298
1299         // check if current run is first unprocessed run for current detector
1300         if (fConfig->StrictRunOrder(fCurrentDetector) &&
1301                 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1302         {
1303                 if (fTestMode == kNone)
1304                 {
1305                         Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
1306                                         " but this is not the first unprocessed run!", fCurrentDetector.Data()));
1307                         return kFALSE;
1308                 }
1309                 else
1310                 {
1311                         Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
1312                                         "Although %s requires strict run ordering "
1313                                         "and this is not the first unprocessed run, "
1314                                         "the SHUTTLE continues", fCurrentDetector.Data()));
1315                 }
1316         }
1317
1318         AliShuttleStatus* status = ReadShuttleStatus();
1319         if (!status) {
1320                 // first time
1321                 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1322                                 fCurrentDetector.Data()));
1323                 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1324                 return WriteShuttleStatus(status);
1325         }
1326
1327         // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1328         // If it happens it may mean Logbook updating failed... let's do it now!
1329         if (status->GetStatus() == AliShuttleStatus::kDone ||
1330             status->GetStatus() == AliShuttleStatus::kFailed){
1331                 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1332                                         fCurrentDetector.Data(),
1333                                         status->GetStatusName(status->GetStatus())));
1334                 UpdateShuttleLogbook(fCurrentDetector.Data(),
1335                                         status->GetStatusName(status->GetStatus()));
1336                 return kFALSE;
1337         }
1338
1339         if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1340                 Log("SHUTTLE",
1341                         Form("ContinueProcessing - %s: Grid storage of one or more "
1342                                 "objects failed. Trying again now",
1343                                 fCurrentDetector.Data()));
1344                 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1345                 if (StoreOCDB()){
1346                         Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1347                                 "successfully stored into main storage",
1348                                 fCurrentDetector.Data()));
1349                 } else {
1350                         Log("SHUTTLE",
1351                                 Form("ContinueProcessing - %s: Grid storage failed again",
1352                                         fCurrentDetector.Data()));
1353                         UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1354                 }
1355                 return kFALSE;
1356         }
1357
1358         // if we get here, there is a restart
1359         Bool_t cont = kFALSE;
1360
1361         // abort conditions
1362         if (status->GetCount() >= fConfig->GetMaxRetries()) {
1363                 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1364                                 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1365                                 status->GetCount(), status->GetStatusName()));
1366                 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1367                 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1368
1369                 // there may still be objects in local OCDB and reference storage
1370                 // and FXS databases may be not updated: do it now!
1371                 
1372                 // TODO Currently disabled, we want to keep files in case of failure!
1373                 // CleanLocalStorage(fgkLocalCDB);
1374                 // CleanLocalStorage(fgkLocalRefStorage);
1375                 // UpdateTableFailCase();
1376                 
1377                 // Send mail to detector expert!
1378                 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...", 
1379                                         fCurrentDetector.Data()));
1380                 if (!SendMail())
1381                         Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1382                                         fCurrentDetector.Data()));
1383
1384         } else {
1385                 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1386                                 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1387                                 status->GetStatusName(), status->GetCount()));
1388                 Bool_t increaseCount = kTRUE;
1389                 if (status->GetStatus() == AliShuttleStatus::kDCSError || 
1390                         status->GetStatus() == AliShuttleStatus::kDCSStarted)
1391                                 increaseCount = kFALSE;
1392                                 
1393                 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
1394                 cont = kTRUE;
1395         }
1396
1397         return cont;
1398 }
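
// ----------------------------------------------------------------------------------------------
// Illustrative sketch, not part of AliShuttle: the restart branch at the end of
// ContinueProcessing, reduced to a pure function. Once the retry count has reached the
// maximum the detector is marked failed; otherwise processing restarts, and the retry
// counter is increased unless the previous attempt stopped in a DCS state.
// The enum, the struct and the function are hypothetical stand-ins for AliShuttleStatus
// and fConfig; kSketchOther stands for any status that is not a DCS state.
enum StatusSketch { kSketchDCSStarted, kSketchDCSError, kSketchOther };

struct RestartDecisionSketch {
        bool fContinue;       // value ContinueProcessing would return
        bool fIncreaseCount;  // whether the retry counter is incremented
        bool fMarkFailed;     // whether the detector is set to FAILED
};

RestartDecisionSketch DecideRestartSketch(StatusSketch status, int count, int maxRetries)
{
        RestartDecisionSketch d = { false, false, false };
        if (count >= maxRetries) {
                d.fMarkFailed = true;   // abort condition: too many retries
                return d;
        }
        d.fContinue = true;
        d.fIncreaseCount = !(status == kSketchDCSError || status == kSketchDCSStarted);
        return d;
}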
1399
1400 //______________________________________________________________________________________________
1401 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1402 {
1403         //
1404         // Makes data retrieval for all detectors in the configuration.
1405         // entry: Shuttle logbook entry, contains run parameters and status of detectors
1406         // (Unprocessed, Inactive, Failed or Done).
1407         // Returns kFALSE if an error occurred and kTRUE otherwise
1408         //
1409
1410         if (!entry) return kFALSE;
1411
1412         fLogbookEntry = entry;
1413
1414         Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1415                                         GetCurrentRun()));
1416
1417         // create ML instance that monitors this run
1418         fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable(), Form("%d", GetCurrentRun()));
1419
1420         // Send the information to ML
1421         TMonaLisaText  mlStatus("SHUTTLE_status", "Processing");
1422         TMonaLisaText  mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1423
1424         TList mlList;
1425         mlList.Add(&mlStatus);
1426         mlList.Add(&mlRunType);
1427
1428         fMonaLisa->SendParameters(&mlList);
1429
1430         if (fLogbookEntry->IsDone())
1431         {
1432                 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1433                 UpdateShuttleLogbook("shuttle_done");
1434                 fLogbookEntry = 0;
1435                 return kTRUE;
1436         }
1437
1438         // read test mode if flag is set
1439         if (fReadTestMode)
1440         {
1441                 fTestMode = kNone;
1442                 TString logEntry(entry->GetRunParameter("log"));
1443                 //printf("log entry = %s\n", logEntry.Data());
1444                 TString searchStr("Testmode: ");
1445                 Int_t pos = logEntry.Index(searchStr.Data());
1446                 //printf("%d\n", pos);
1447                 if (pos >= 0)
1448                 {
1449                         TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1450                         //printf("%s\n", subStr.String().Data());
1451                         TString newStr(subStr.Data());
1452                         TObjArray* token = newStr.Tokenize(' ');
1453                         if (token)
1454                         {
1455                                 //token->Print();
1456                                 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1457                                 if (tmpStr)
1458                                 {
1459                                         Int_t testMode = tmpStr->String().Atoi();
1460                                         if (testMode > 0)
1461                                         {
1462                                                 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1463                                                 SetTestMode((TestMode) testMode);
1464                                         }
1465                                 }
1466                                 delete token;          
1467                         }
1468                 }
1469         }
1470                 
1471         fLogbookEntry->Print("all");
1472
1473         // Initialization
1474         Bool_t hasError = kFALSE;
1475
1476         // Set the CDB and Reference folders according to the year and LHC period
1477         TString lhcPeriod(GetLHCPeriod());
1478         if (lhcPeriod.Length() == 0) 
1479         {
1480                 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1481                 return kFALSE;
1482         }       
1483         
1484         if (fgkMainCDB.Length() == 0)
1485                 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache", 
1486                                         GetCurrentYear(), lhcPeriod.Data());
1487         
1488         if (fgkMainRefStorage.Length() == 0)
1489                 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache", 
1490                                         GetCurrentYear(), lhcPeriod.Data());
1491         
1492         AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1493         if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1494         AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1495         if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1496
1497         // Loop on detectors in the configuration
1498         TIter iter(fConfig->GetDetectors());
1499         TObjString* aDetector = 0;
1500
1501         while ((aDetector = (TObjString*) iter.Next()))
1502         {
1503                 fCurrentDetector = aDetector->String();
1504
1505                 if (ContinueProcessing() == kFALSE) continue;
1506
1507                 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START  ******",
1508                                                 GetCurrentRun(), aDetector->GetName()));
1509
1510                 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1511
1512                 Log(fCurrentDetector.Data(), "Process - Starting processing");
1513
1514                 Int_t pid = fork();
1515
1516                 if (pid < 0)
1517                 {
1518                         Log("SHUTTLE", "Process - ERROR: Forking failed");
1519                 }
1520                 else if (pid > 0)
1521                 {
1522                         // parent
1523                         Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1524                                                         GetCurrentRun(), aDetector->GetName()));
1525
1526                         Long_t begin = time(0);
1527
1528                         int status; // to be used with waitpid, on purpose an int (not Int_t)!
1529                         while (waitpid(pid, &status, WNOHANG) == 0)
1530                         {
1531                                 Long_t expiredTime = time(0) - begin;
1532
1533                                 if (expiredTime > fConfig->GetPPTimeOut())
1534                                 {
1535                                         TString tmp;
1536                                         tmp.Form("Process - Process of %s time out. "
1537                                                         "Run time: %ld seconds. Killing...",
1538                                                         fCurrentDetector.Data(), expiredTime);
1539                                         Log("SHUTTLE", tmp);
1540                                         Log(fCurrentDetector, tmp);
1541
1542                                         kill(pid, 9);
1543
1544                                         UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1545                                         hasError = kTRUE;
1546
1547                                         gSystem->Sleep(1000);
1548                                 }
1549                                 else
1550                                 {
1551                                         gSystem->Sleep(1000);
1552                                         
1553                                         TString checkStr;
1554                                         checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1555                                         FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1556                                         if (!pipe)
1557                                         {
1558                                                 Log("SHUTTLE", Form("Process - Error: "
1559                                                         "Could not open pipe to %s", checkStr.Data()));
1560                                                 continue;
1561                                         }
1562                                                 
1563                                         char buffer[100];
1564                                         if (!fgets(buffer, 100, pipe))
1565                                         {
1566                                                 Log("SHUTTLE", "Process - Error: ps did not return anything");
1567                                                 gSystem->ClosePipe(pipe);
1568                                                 continue;
1569                                         }
1570                                         gSystem->ClosePipe(pipe);
1571                                         
1572                                         //Log("SHUTTLE", Form("ps returned %s", buffer));
1573                                         
1574                                         Int_t mem = 0;
1575                                         if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1576                                         {
1577                                                 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1578                                                 continue;
1579                                         }
1580                                         
1581                                         if (expiredTime % 60 == 0)
1582                                                 Log("SHUTTLE", Form("Process - %s: Checking process. "
1583                                                         "Run time: %ld seconds - Memory consumption: %d KB",
1584                                                         fCurrentDetector.Data(), expiredTime, mem));
1585                                         
1586                                         if (mem > fConfig->GetPPMaxMem())
1587                                         {
1588                                                 TString tmp;
1589                                                 tmp.Form("Process - Process exceeds maximum allowed memory "
1590                                                         "(%d KB > %d KB). Killing...",
1591                                                         mem, fConfig->GetPPMaxMem());
1592                                                 Log("SHUTTLE", tmp);
1593                                                 Log(fCurrentDetector, tmp);
1594         
1595                                                 kill(pid, 9);
1596         
1597                                                 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1598                                                 hasError = kTRUE;
1599         
1600                                                 gSystem->Sleep(1000);
1601                                         }
1602                                 }
1603                         }
1604
1605                         Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1606                                                                 GetCurrentRun(), aDetector->GetName()));
1607
1608                         if (WIFEXITED(status))
1609                         {
1610                                 Int_t returnCode = WEXITSTATUS(status);
1611
1612                                 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1613                                                                                 returnCode));
1614
1615                                 if (returnCode == 0) hasError = kTRUE; // client exits with 'success': 1 = ok, 0 = failure
1616                         }
1617                 }
1618                 else if (pid == 0)
1619                 {
1620                         // client
1621                         Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1622                                 aDetector->GetName()));
1623
1624                         Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1625
1626                         if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1627                         {
1628                                 Log("SHUTTLE", "Process - Could not freopen stdout");
1629                         }
1630                         else
1631                         {
1632                                 fOutputRedirected = kTRUE;
1633                                 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1634                                         Log("SHUTTLE", "Process - Could not redirect stderr");
1635                                 
1636                         }
1637                         
1638                         TString wd = gSystem->WorkingDirectory();
1639                         TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(), 
1640                                 fCurrentDetector.Data(), GetCurrentRun());
1641                         
1642                         Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1643                         if (!result) // temp dir already exists!
1644                         {
1645                                 Log(fCurrentDetector.Data(), 
1646                                         Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1647                                 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));         
1648                         } 
1649                         
1650                         if (gSystem->mkdir(tmpDir.Data(), 1))
1651                         {
1652                                 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1653                                 gSystem->Exit(1);
1654                         }
1655                         
1656                         if (!gSystem->ChangeDirectory(tmpDir.Data())) 
1657                         {
1658                                 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1659                                 gSystem->Exit(1);                       
1660                         }
1661                         
1662                         Bool_t success = ProcessCurrentDetector();
1663                         
1664                         gSystem->ChangeDirectory(wd.Data());
1665                                                 
1666                         if (success) // Preprocessor finished successfully!
1667                         { 
1668                                 // remove temporary folder
1669                                 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1670                                 
1671                                 // Update time_processed field in FXS DB
1672                                 if (UpdateTable() == kFALSE)
1673                                         Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", 
1674                                                         fCurrentDetector.Data()));
1675
1676                                 // Transfer the data from local storage to main storage (Grid)
1677                                 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1678                                 if (StoreOCDB() == kFALSE)
1679                                 {
1680                                         Log("SHUTTLE", 
1681                                                 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1682                                                         GetCurrentRun(), aDetector->GetName()));
1683                                         UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1684                                         success = kFALSE;
1685                                 } else {
1686                                         Log("SHUTTLE", 
1687                                                 Form("\t\t\t****** run %d - %s: DONE ******",
1688                                                         GetCurrentRun(), aDetector->GetName()));
1689                                         UpdateShuttleStatus(AliShuttleStatus::kDone);
1690                                         UpdateShuttleLogbook(fCurrentDetector, "DONE");
1691                                 }
1692                         } else 
1693                         {
1694                                 Log("SHUTTLE", 
1695                                         Form("\t\t\t****** run %d - %s: PP ERROR ******",
1696                                                 GetCurrentRun(), aDetector->GetName()));
1697                         }
1698
1699                         for (UInt_t iSys=0; iSys<3; iSys++)
1700                         {
1701                                 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1702                         }
1703
1704                         Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1705                                                         GetCurrentRun(), aDetector->GetName(), success));
1706
1707                         // the client exits here
1708                         gSystem->Exit(success);
1709
1710                         AliError("We should never get here!!!");
1711                 }
1712         }
1713
1714         Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1715                                                         GetCurrentRun()));
1716
1717         //check if shuttle is done for this run, if so update logbook
1718         TObjArray checkEntryArray;
1719         checkEntryArray.SetOwner(1);
1720         TString whereClause = Form("where run=%d", GetCurrentRun());
1721         if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || 
1722                         checkEntryArray.GetEntries() == 0) {
1723                 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1724                                                 GetCurrentRun()));
1725                 return hasError == kFALSE;
1726         }
1727
1728         AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1729                                                 (checkEntryArray.At(0));
1730
1731         if (checkEntry)
1732         {
1733                 if (checkEntry->IsDone())
1734                 {
1735                         Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1736                         UpdateShuttleLogbook("shuttle_done");
1737                 }
1738                 else
1739                 {
1740                         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1741                         {
1742                                 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1743                                 {
1744                                         AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1745                                                         checkEntry->GetRun(), GetDetName(iDet)));
1746                                         fFirstUnprocessed[iDet] = kFALSE;
1747                                 }
1748                         }
1749                 }
1750         }
1751
1752         // remove ML instance
1753         delete fMonaLisa;
1754         fMonaLisa = 0;
1755
1756         fLogbookEntry = 0;
1757
1758         return hasError == kFALSE;
1759 }
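
// Illustration only, not part of AliShuttle: a minimal, self-contained sketch of the
// fork / waitpid(WNOHANG) watchdog pattern used in Process() above. The names
// RunWithTimeout and ChildOutcome are hypothetical, and the 100 ms polling interval is
// an assumption (the listing sleeps 1000 ms per iteration). As in the listing, the child
// exits with 1 on success and 0 on failure, the inverse of the usual shell convention.

```cpp
#include <csignal>
#include <ctime>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical outcome type; the listing tracks this through hasError instead.
enum class ChildOutcome { kSuccess, kFailure, kTimedOut, kForkFailed };

// Runs work() in a forked child and polls it from the parent.
// The child is killed once it has run for more than timeoutSeconds.
ChildOutcome RunWithTimeout(bool (*work)(), long timeoutSeconds)
{
    pid_t pid = fork();
    if (pid < 0) return ChildOutcome::kForkFailed;

    if (pid == 0) {
        // child: exit code 1 means success, 0 means failure
        _exit(work() ? 1 : 0);
    }

    // parent: monitor the child
    const time_t begin = time(nullptr);
    bool killed = false;
    int status = 0; // plain int, as waitpid expects

    // waitpid with WNOHANG returns 0 while the child is still running
    while (waitpid(pid, &status, WNOHANG) == 0) {
        if (!killed && time(nullptr) - begin > timeoutSeconds) {
            kill(pid, SIGKILL);
            killed = true;
        }
        usleep(100 * 1000); // poll every 100 ms
    }

    if (killed && WIFSIGNALED(status)) return ChildOutcome::kTimedOut;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 1) return ChildOutcome::kSuccess;
    return ChildOutcome::kFailure;
}
```

// A caller would pass a capture-less lambda or a free function as work, for example
// RunWithTimeout([]() { return true; }, 5), and treat anything other than kSuccess
// as an error for the current detector.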
1760
1761 //______________________________________________________________________________________________
1762 Bool_t AliShuttle::ProcessCurrentDetector()
1763 {
1764         //
1765         // Retrieves and processes the data for a single detector (fCurrentDetector).
1766         // There should be a configuration for this detector.
1767
1768         Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d", 
1769                                                 fCurrentDetector.Data(), GetCurrentRun()));
1770
1771         TString wd = gSystem->WorkingDirectory();
1772         
1773         if (!CleanReferenceStorage(fCurrentDetector.Data()))
1774                 return kFALSE;
1775         
1776         gSystem->ChangeDirectory(wd.Data());
1777         
1778         TMap* dcsMap = new TMap();
1779
1780         // call preprocessor
1781         AliPreprocessor* aPreprocessor =
1782                 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1783
1784         aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1785
1786         Bool_t processDCS = aPreprocessor->ProcessDCS();
1787
1788         if (!processDCS)
1789         {
1790                 Log(fCurrentDetector, "ProcessCurrentDetector -"
1791                         " The preprocessor requested to skip the retrieval of DCS values");
1792         }
1793         else if (fTestMode & kSkipDCS)
1794         {
1795                 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1796         } 
1797         else if (fTestMode & kErrorDCS)
1798         {
1799                 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1800                 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1801                 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1802                 delete dcsMap;
1803                 return kFALSE;
1804         } else {
1805
1806                 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1807
1808                 // Query DCS archive
1809                 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1810                 
1811                 for (int iServ=0; iServ<nServers; iServ++)
1812                 {
1813                 
1814                         TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1815                         Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1816                         Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1817
1818                         Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1819                                         " Querying DCS Amanda server %s:%d (%d of %d)", 
1820                                         host.Data(), port, iServ+1, nServers));
1821                         
1822                         TMap* aliasMap = 0;
1823                         TMap* dpMap = 0;
1824         
1825                         if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1826                         {
1827                                 aliasMap = GetValueSet(host, port, 
1828                                                 fConfig->GetDCSAliases(fCurrentDetector, iServ), 
1829                                                 kAlias, multiSplit);
1830                                 if (!aliasMap)
1831                                 {
1832                                         Log(fCurrentDetector, 
1833                                                 Form("ProcessCurrentDetector -"
1834                                                         " Error retrieving DCS aliases from server %s."
1835                                                         " Sending mail to DCS experts!", host.Data()));
1836                                         UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1837                                         
1838                                         if (!SendMailToDCS())
1839                                                 Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1840
1841                                         delete dcsMap;
1842                                         return kFALSE;
1843                                 }
1844                         }
1845                         
1846                         if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1847                         {
1848                                 dpMap = GetValueSet(host, port, 
1849                                                 fConfig->GetDCSDataPoints(fCurrentDetector, iServ), 
1850                                                 kDP, multiSplit);
1851                                 if (!dpMap)
1852                                 {
1853                                         Log(fCurrentDetector, 
1854                                                 Form("ProcessCurrentDetector -"
1855                                                         " Error retrieving DCS data points from server %s."
1856                                                         " Sending mail to DCS experts!", host.Data()));
1857                                         UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1858                                         
1859                                         if (!SendMailToDCS())
1860                                                 Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1861                                         
1862                                         if (aliasMap) delete aliasMap;
1863                                         delete dcsMap;
1864                                         return kFALSE;
1865                                 }                               
1866                         }
1867                         
1868                         // merge aliasMap and dpMap into dcsMap
1869                         if(aliasMap) {
1870                                 TIter iter(aliasMap);
1871                                 TObjString* key = 0;
1872                                 while ((key = (TObjString*) iter.Next()))
1873                                         dcsMap->Add(key, aliasMap->GetValue(key->String()));
1874                                 
1875                                 aliasMap->SetOwner(kFALSE);
1876                                 delete aliasMap;
1877                         }       
1878                         
1879                         if(dpMap) {
1880                                 TIter iter(dpMap);
1881                                 TObjString* key = 0;
1882                                 while ((key = (TObjString*) iter.Next()))
1883                                         dcsMap->Add(key, dpMap->GetValue(key->String()));
1884                                 
1885                                 dpMap->SetOwner(kFALSE);
1886                                 delete dpMap;
1887                         }
1888                 }
1889         }
1890         
1891         // save map into file, to help debugging in case of preprocessor error
1892         TFile* f = TFile::Open("DCSMap.root","recreate");
1893         f->cd();
1894         dcsMap->Write("DCSMap", TObject::kSingleKey);
1895         f->Close();
1896         delete f;
1897         
1898         // DCS Archive DB processing successful. Call Preprocessor!
1899         UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1900
1901         UInt_t returnValue = aPreprocessor->Process(dcsMap);
1902
1903         if (returnValue > 0) // Preprocessor error!
1904         {
1905                 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1906                                 "Preprocessor failed. Process returned %d.", returnValue));
1907                 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1908                 dcsMap->DeleteAll();
1909                 delete dcsMap;
1910                 return kFALSE;
1911         }
1912         
1913         // preprocessor ok!
1914         UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1915         Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1916                                 fCurrentDetector.Data()));
1917
1918         dcsMap->DeleteAll();
1919         delete dcsMap;
1920
1921         return kTRUE;
1922 }
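
// Illustration only, not part of AliShuttle: a minimal sketch of the order in which
// ProcessCurrentDetector() above decides what to do about DCS retrieval. The names
// DcsAction and ChooseDcsAction are hypothetical, and the numeric flag values are
// assumptions chosen only so that the flags can be combined bitwise.

```cpp
// Assumed bit-flag values; only the names kSkipDCS and kErrorDCS appear in the listing.
enum TestModeFlag { kNone = 0, kSkipDCS = 1, kErrorDCS = 2 };

// Hypothetical result type naming the four branches of the if/else chain.
enum class DcsAction {
    kSkippedByPreprocessor, // preprocessor asked to skip DCS retrieval
    kSkippedByTestMode,     // test mode: skip DCS processing
    kSimulatedError,        // test mode: simulate a DCS error and abort
    kQueryArchive           // normal case: query the DCS servers
};

// The preprocessor's own request is checked first, then the test-mode flags,
// so kSkipDCS takes precedence over kErrorDCS when both bits are set.
DcsAction ChooseDcsAction(bool preprocessorWantsDcs, int testMode)
{
    if (!preprocessorWantsDcs) return DcsAction::kSkippedByPreprocessor;
    if (testMode & kSkipDCS)   return DcsAction::kSkippedByTestMode;
    if (testMode & kErrorDCS)  return DcsAction::kSimulatedError;
    return DcsAction::kQueryArchive;
}
```

// Only kQueryArchive leads to the loop over the configured DCS servers;
// kSimulatedError corresponds to the early "return kFALSE" branch.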
1923
1924 //______________________________________________________________________________________________
1925 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1926                 TObjArray& entries)
1927 {
1928         // Queries DAQ's Shuttle logbook and fills the detector status objects.
1929         // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1930         //
1931
1932         entries.SetOwner(1);
1933
1934         // check the connection; connect if necessary
1935         if(!Connect(3)) return kFALSE;
1936
1937         TString sqlQuery;
1938         sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1939
1940         TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1941         if (!aResult) {
1942                 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1943                 return kFALSE;
1944         }
1945
1946         AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1947
1948         if(aResult->GetRowCount() == 0) {
1949                 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1950                 delete aResult;
1951                 return kTRUE;
1952         }
1953
1954         // TODO Check field count!
1955         const UInt_t nCols = 23;
1956         if (aResult->GetFieldCount() != (Int_t) nCols) {
1957                 Log("SHUTTLE", "Invalid SQL result field number!");
1958                 delete aResult;
1959                 return kFALSE;
1960         }
1961
1962         TSQLRow* aRow;
1963         while ((aRow = aResult->Next())) {
1964                 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1965                 Int_t run = runString.Atoi();
1966
1967                 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1968                 if (!entry)
1969                         { delete aRow; continue; } // free the row before skipping this run
1970
1971                 // loop on detectors
1972                 for(UInt_t ii = 0; ii < nCols; ii++)
1973                         entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1974
1975                 entries.AddLast(entry);
1976                 delete aRow;
1977         }
1978
1979         delete aResult;
1980         return kTRUE;
1981 }
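
// Illustration only, not part of AliShuttle: a minimal sketch of the run validity
// checks performed by QueryRunParameters() in this listing. The names RunCheck and
// ClassifyRun are hypothetical. In the listing every outcome other than "valid" makes
// the function return 0; the sketch only separates the cases in which the run is also
// marked as done in the Shuttle logbook from the one in which it is not.

```cpp
// Hypothetical classification of a run's logbook parameters.
enum class RunCheck {
    kValid,         // run can be processed
    kMarkDone,      // invalid run: skip it and mark SHUTTLE done
    kSkipStillOpen  // start but no end time: skip, but do not mark done
};

RunCheck ClassifyRun(unsigned startTime, unsigned endTime, int totalEvents)
{
    if (startTime == 0)      return RunCheck::kMarkDone;      // no start time
    if (endTime == 0)        return RunCheck::kSkipStillOpen; // run may still be ongoing
    if (startTime > endTime) return RunCheck::kMarkDone;      // inconsistent times
    if (totalEvents < 1)     return RunCheck::kMarkDone;      // nothing was recorded
    return RunCheck::kValid;
}
```

// The order of the checks matters: a missing start time is tested before a missing
// end time, so a run with neither is treated as invalid and marked as done.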
1982
1983 //______________________________________________________________________________________________
1984 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1985 {
1986         //
1987         // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1988         //
1989
1990         // check the connection; connect if necessary
1991         if (!Connect(3))
1992                 return 0;
1993
1994         TString sqlQuery;
1995         sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1996
1997         TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1998         if (!aResult) {
1999                 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
2000                 return 0;
2001         }
2002
2003         if (aResult->GetRowCount() == 0) {
2004                 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2005                 delete aResult;
2006                 return 0;
2007         }
2008
2009         if (aResult->GetRowCount() > 1) {
2010                 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2011                                 "more than one entry in DAQ Logbook for run %d!", run));
2012                 delete aResult;
2013                 return 0;
2014         }
2015
2016         TSQLRow* aRow = aResult->Next();
2017         if (!aRow)
2018         {
2019                 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2020                 delete aResult;
2021                 return 0;
2022         }
2023
2024         AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2025
2026         for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2027                 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2028
2029         UInt_t startTime = entry->GetStartTime();
2030         UInt_t endTime = entry->GetEndTime();
2031
2032 //      if (!startTime || !endTime || startTime > endTime) 
2033 //      {
2034 //              Log("SHUTTLE",
2035 //                      Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d. Skipping!",
2036 //                              run, startTime, endTime));              
2037 //              
2038 //              Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2039 //              fLogbookEntry = entry;  
2040 //              if (!UpdateShuttleLogbook("shuttle_done"))
2041 //              {
2042 //                      AliError(Form("Could not update logbook for run %d !", run));
2043 //              }
2044 //              fLogbookEntry = 0;
2045 //                              
2046 //              delete entry;
2047 //              delete aRow;
2048 //              delete aResult;
2049 //              return 0;
2050 //      }
2051
2052         if (!startTime) 
2053         {
2054                 Log("SHUTTLE",
2055                         Form("QueryRunParameters - Invalid parameters for Run %d: " 
2056                                 "startTime = %u, endTime = %u. Skipping!",
2057                                         run, startTime, endTime));              
2058                 
2059                 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2060                 fLogbookEntry = entry;  
2061                 if (!UpdateShuttleLogbook("shuttle_done"))
2062                 {
2063                         AliError(Form("Could not update logbook for run %d !", run));
2064                 }
2065                 fLogbookEntry = 0;
2066                                 
2067                 delete entry;
2068                 delete aRow;
2069                 delete aResult;
2070                 return 0;
2071         }
2072         
2073         if (startTime && !endTime) 
2074         {
2075                 // TODO Here we don't mark SHUTTLE done, because this may mean
2076                 // the run is still ongoing!
2077                 Log("SHUTTLE",
2078                         Form("QueryRunParameters - Invalid parameters for Run %d: "
2079                              "startTime = %u, endTime = %u. Skipping (Shuttle won't be marked as DONE)!",
2080                                         run, startTime, endTime));              
2081                 
2082                 //Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2083                 //fLogbookEntry = entry;        
2084                 //if (!UpdateShuttleLogbook("shuttle_done"))
2085                 //{
2086                 //      AliError(Form("Could not update logbook for run %d !", run));
2087                 //}
2088                 //fLogbookEntry = 0;
2089                                 
2090                 delete entry;
2091                 delete aRow;
2092                 delete aResult;
2093                 return 0;
2094         }
2095                         
2096         if (startTime && endTime && (startTime > endTime)) 
2097         {
2098                 Log("SHUTTLE",
2099                         Form("QueryRunParameters - Invalid parameters for Run %d: "
2100                                 "startTime = %d, endTime = %d. Skipping!",
2101                                         run, startTime, endTime));              
2102                 
2103                 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2104                 fLogbookEntry = entry;  
2105                 if (!UpdateShuttleLogbook("shuttle_done"))
2106                 {
2107                         AliError(Form("Could not update logbook for run %d !", run));
2108                 }
2109                 fLogbookEntry = 0;
2110                                 
2111                 delete entry;
2112                 delete aRow;
2113                 delete aResult;
2114                 return 0;
2115         }
2116                         
2117         TString totEventsStr = entry->GetRunParameter("totalEvents");  
2118         Int_t totEvents = totEventsStr.Atoi();
2119         if (totEvents < 1) 
2120         {
2121                 Log("SHUTTLE",
2122                         Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));             
2123                 
2124                 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));           
2125                 fLogbookEntry = entry;  
2126                 if (!UpdateShuttleLogbook("shuttle_done"))
2127                 {
2128                         AliError(Form("Could not update logbook for run %d !", run));
2129                 }
2130                 fLogbookEntry = 0;
2131                                 
2132                 delete entry;
2133                 delete aRow;
2134                 delete aResult;
2135                 return 0;
2136         }
2137
2138         delete aRow;
2139         delete aResult;
2140
2141         return entry;
2142 }
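
// The four validity checks that close QueryRunParameters can be summarised as a small
// decision function. The sketch below is illustrative only: RunAction and ClassifyRun are
// hypothetical names that do not exist in AliShuttle, and the time arguments are assumed
// to be unsigned integers. It mirrors the order of the checks above: a missing end time
// is the only case where the run is skipped without being marked as done.

```cpp
#include <cassert>

// Hypothetical names, not part of AliShuttle.
enum RunAction { kProcess, kSkipMarkDone, kSkipKeepOpen };

// Mirrors the order of the checks at the end of QueryRunParameters.
RunAction ClassifyRun(unsigned int startTime, unsigned int endTime, int totalEvents)
{
    if (startTime == 0) return kSkipMarkDone;       // no valid start time
    if (endTime == 0) return kSkipKeepOpen;         // run may still be ongoing
    if (startTime > endTime) return kSkipMarkDone;  // inconsistent times
    if (totalEvents < 1) return kSkipMarkDone;      // nothing to process
    return kProcess;                                // run parameters look valid
}
```

// For instance, ClassifyRun(100, 0, 10) yields kSkipKeepOpen, so such a run would be
// picked up again on a later pass.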
2143
2144 //______________________________________________________________________________________________
2145 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2146                               DCSType type, Int_t multiSplit)
2147 {
2148         // Retrieve the values of all requested aliases or data points from the DCS server
2149         // host, port: TSocket connection parameters
2150         // entries: list of alias or data point names
2151         // type: kAlias or kDP
2152         // returns a TMap of values, or 0 on failure
2153         
2154         AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2155
2156         TMap* result = 0;
2157         if (type == kAlias)
2158         {
2159                 result = client.GetAliasValues(entries, GetCurrentStartTime(), 
2160                         GetCurrentEndTime());
2161         } 
2162         else if (type == kDP)
2163         {
2164                 result = client.GetDPValues(entries, GetCurrentStartTime(), 
2165                         GetCurrentEndTime());
2166         }
2167
2168         if (result == 0)
2169         {
2170                 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2171                         client.GetErrorString(client.GetResultErrorCode())));
2172                 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)        
2173                         Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2174                                 client.GetServerError().Data()));
2175
2176                 return 0;
2177         }
2178                 
2179         return result;
2180 }
2181
2182 //______________________________________________________________________________________________
2183 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2184                 const char* id, const char* source)
2185 {
2186         // Get calibration file from file exchange servers
2187         // First queries the FXS database for the file name, using the run, detector, id and source info
2188         // then calls RetrieveFile(filename) for actual copy to local disk
2189         // run: current run being processed (given by Logbook entry fLogbookEntry)
2190         // detector: the Preprocessor name
2191         // id: provided as a parameter by the Preprocessor
2192         // source: provided by the Preprocessor through GetFileSources function
2193
2194         // check if test mode should simulate a FXS error
2195         if (fTestMode & kErrorFXSFiles)
2196         {
2197                 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2198                 return 0;
2199         }
2200         
2201         // check connection; connect if not yet connected
2202         if (!Connect(system))
2203         {
2204                 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2205                 return 0;
2206         }
2207
2208         // Query preparation
2209         TString sourceName(source);
2210         Int_t nFields = 3;
2211         TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2212                                                                 fConfig->GetFXSdbTable(system));
2213         TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2214                                                                 GetCurrentRun(), detector, id);
2215
2216         if (system == kDAQ)
2217         {
2218                 whereClause += Form(" and DAQsource=\"%s\"", source);
2219         }
2220         else if (system == kDCS)
2221         {
2222                 sourceName="none";
2223         }
2224         else if (system == kHLT)
2225         {
2226                 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2227                 nFields = 3;
2228         }
2229
2230         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2231
2232         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2233
2234         // Query execution
2235         TSQLResult* aResult = 0;
2236         aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2237         if (!aResult) {
2238                 Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
2239                                 GetSystemName(system), id, sourceName.Data()));
2240                 return 0;
2241         }
2242
2243         if(aResult->GetRowCount() == 0)
2244         {
2245                 Log(detector,
2246                         Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
2247                                 GetSystemName(system), id, sourceName.Data()));
2248                 delete aResult;
2249                 return 0;
2250         }
2251
2252         if (aResult->GetRowCount() > 1) {
2253                 Log(detector,
2254                         Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
2255                                 GetSystemName(system), id, sourceName.Data()));
2256                 delete aResult;
2257                 return 0;
2258         }
2259
2260         if (aResult->GetFieldCount() != nFields) {
2261                 Log(detector,
2262                         Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
2263                                 GetSystemName(system), id, sourceName.Data()));
2264                 delete aResult;
2265                 return 0;
2266         }
2267
2268         TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2269
2270         if (!aRow){
2271                 Log(detector, Form("GetFile - Empty result set in %s FXS db from query: id = %s, source = %s",
2272                                 GetSystemName(system), id, sourceName.Data()));
2273                 delete aResult;
2274                 return 0;
2275         }
2276
2277         TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2278         TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2279         TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2280
2281         delete aResult;
2282         delete aRow;
2283
2284         AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2285                                 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2286
2287         // retrieved file is renamed to make it unique
2288         TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2289                                         GetShuttleTempDir(), detector, GetCurrentRun(),
2290                                         GetSystemName(system), detector, GetCurrentRun(), 
2291                                         id, sourceName.Data());
2292
2293
2294         // file retrieval from FXS
2295         UInt_t nRetries = 0;
2296         UInt_t maxRetries = 3;
2297         Bool_t result = kFALSE;
2298
2299         // copy the file; RetrieveFile returns kTRUE when TSystem::Exec returns 0
2300         while(nRetries++ < maxRetries) {
2301                 AliDebug(2, Form("Trying to copy file. Attempt # %u", nRetries));
2302                 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2303                 if(!result)
2304                 {
2305                         Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
2306                                         filePath.Data(), GetSystemName(system)));
2307                         continue;
2308                 } 
2309
2310                 if (fileChecksum.Length()>0)
2311                 {
2312                         // compare md5sum of local file with the one stored in the FXS DB
2313                         Int_t md5Comp = gSystem->Exec(Form("md5sum %s | grep %s > /dev/null 2>&1",
2314                                                 localFileName.Data(), fileChecksum.Data()));
2315
2316                         if (md5Comp != 0)
2317                         {
2318                                 Log(detector, Form("GetFile - md5sum of file %s does not match that of the local copy!",
2319                                                         filePath.Data()));
2320                                 result = kFALSE;
2321                                 continue;
2322                         }
2323                 } else {
2324                         Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2325                                                         filePath.Data(), GetSystemName(system)));
2326                 }
2327                 if (result) break;
2328         }
2329
2330         if(!result) return 0;
2331
2332         fFXSCalled[system]=kTRUE;
2333         TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2334         fFXSlist[system].Add(fileParams);
2335
2336         static TString staticLocalFileName;
2337         staticLocalFileName.Form("%s", localFileName.Data());
2338         
2339         Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2340                         "source %s from %s to %s", id, source, 
2341                         GetSystemName(system), localFileName.Data()));
2342                         
2343         return staticLocalFileName.Data();
2344 }
2345
2346 //______________________________________________________________________________________________
2347 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2348 {
2349         //
2350         // Copies file from FXS to local Shuttle machine
2351         //
2352
2353         // check temp directory: if it does not exist, create it
2354         AliDebug(2, Form("Copy file %s from %s FXS into %s",
2355                         fxsFileName, GetSystemName(system), localFileName));
2356                         
2357         TString tmpDir(localFileName);
2358         
2359         tmpDir = tmpDir(0,tmpDir.Last('/'));
2360
2361         Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
2362         if (noDir) // temp dir does not exist!
2363         {
2364                 if (gSystem->mkdir(tmpDir.Data(), 1))
2365                 {
2366                         Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2367                         return kFALSE;
2368                 }
2369         }
2370
2371         TString baseFXSFolder;
2372         if (system == kDAQ)
2373         {
2374                 baseFXSFolder = "FES/";
2375         }
2376         else if (system == kDCS)
2377         {
2378                 baseFXSFolder = "";
2379         }
2380         else if (system == kHLT)
2381         {
2382                 baseFXSFolder = "/opt/FXS/";
2383         }
2384
2385
2386         TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2387                 fConfig->GetFXSPort(system),
2388                 fConfig->GetFXSUser(system),
2389                 fConfig->GetFXSHost(system),
2390                 baseFXSFolder.Data(),
2391                 fxsFileName,
2392                 localFileName);
2393
2394         AliDebug(2, Form("%s",command.Data()));
2395
2396         Bool_t result = (gSystem->Exec(command.Data()) == 0);
2397
2398         return result;
2399 }
2400
2401 //______________________________________________________________________________________________
2402 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2403 {
2404         //
2405         // Get sources producing the condition file Id from file exchange servers
2406         // if id is NULL all sources are returned (distinct)
2407         //
2408
2409         Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2410         
2411         // check if test mode should simulate a FXS error
2412         if (fTestMode & kErrorFXSSources)
2413         {
2414                 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2415                 return 0;
2416         }
2417
2418         if (system == kDCS)
2419         {
2420                 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2421                 TList *list = new TList();
2422                 list->SetOwner(1);
2423                 list->Add(new TObjString(" "));
2424                 return list;
2425         }
2426
2427         // check connection; connect if not yet connected
2428         if (!Connect(system))
2429         {
2430                 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2431                 return NULL;
2432         }
2433
2434         TString sourceName;
2435         if (system == kDAQ)
2436         {
2437                 sourceName = "DAQsource";
2438         } else if (system == kHLT)
2439         {
2440                 sourceName = "DDLnumbers";
2441         }
2442
2443         TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2444         TString whereClause = Form("run=%d and detector=\"%s\"",
2445                                 GetCurrentRun(), detector);
2446         if (id)
2447                 whereClause += Form(" and fileId=\"%s\"", id);
2448         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2449
2450         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2451
2452         // Query execution
2453         TSQLResult* aResult;
2454         aResult = fServer[system]->Query(sqlQuery);
2455         if (!aResult) {
2456                 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2457                                 GetSystemName(system), id));
2458                 return 0;
2459         }
2460
2461         TList *list = new TList();
2462         list->SetOwner(1);
2463         
2464         if (aResult->GetRowCount() == 0)
2465         {
2466                 Log(detector,
2467                         Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2468                 delete aResult;
2469                 return list;
2470         }
2471
2472         Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2473
2474         TSQLRow* aRow;
2475         while ((aRow = aResult->Next()))
2476         {
2477
2478                 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2479                 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2480                 list->Add(new TObjString(source));
2481                 delete aRow;
2482         }
2483
2484         delete aResult;
2485
2486         return list;
2487 }
2488
2489 //______________________________________________________________________________________________
2490 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2491 {
2492         //
2493         // Get all ids of condition files produced by a given source from file exchange servers
2494         //
2495         
2496         Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2497
2498         // check if test mode should simulate a FXS error
2499         if (fTestMode & kErrorFXSSources)
2500         {
2501                 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2502                 return 0;
2503         }
2504
2505         // check connection; connect if not yet connected
2506         if (!Connect(system))
2507         {
2508                 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2509                 return NULL;
2510         }
2511
2512         TString sourceName;
2513         if (system == kDAQ)
2514         {
2515                 sourceName = "DAQsource";
2516         } else if (system == kHLT)
2517         {
2518                 sourceName = "DDLnumbers";
2519         }
2520
2521         TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2522         TString whereClause = Form("run=%d and detector=\"%s\"",
2523                                 GetCurrentRun(), detector);
2524         if (sourceName.Length() > 0 && source)
2525                 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2526         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2527
2528         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2529
2530         // Query execution
2531         TSQLResult* aResult;
2532         aResult = fServer[system]->Query(sqlQuery);
2533         if (!aResult) {
2534                 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2535                                 GetSystemName(system), source));
2536                 return 0;
2537         }
2538
2539         TList *list = new TList();
2540         list->SetOwner(1);
2541         
2542         if (aResult->GetRowCount() == 0)
2543         {
2544                 Log(detector,
2545                         Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2546                 delete aResult;
2547                 return list;
2548         }
2549
2550         Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2551
2552         TSQLRow* aRow;
2553
2554         while ((aRow = aResult->Next()))
2555         {
2556
2557                 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2558                 AliDebug(2, Form("fileId = %s", id.Data()));
2559                 list->Add(new TObjString(id));
2560                 delete aRow;
2561         }
2562
2563         delete aResult;
2564
2565         return list;
2566 }
2567
2568 //______________________________________________________________________________________________
2569 Bool_t AliShuttle::Connect(Int_t system)
2570 {
2571         // Connect to MySQL Server of the system's FXS MySQL databases
2572         // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2573         //
2574
2575         // check connection: if already connected return
2576         if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2577
2578         TString dbHost, dbUser, dbPass, dbName;
2579
2580         if (system < 3) // FXS db servers
2581         {
2582                 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2583                 dbUser = fConfig->GetFXSdbUser(system);
2584                 dbPass = fConfig->GetFXSdbPass(system);
2585                 dbName =   fConfig->GetFXSdbName(system);
2586         } else { // Run & Shuttle logbook servers
2587         // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2588                 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2589                 dbUser = fConfig->GetDAQlbUser();
2590                 dbPass = fConfig->GetDAQlbPass();
2591                 dbName =   fConfig->GetDAQlbDB();
2592         }
2593
2594         fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2595         if (!fServer[system] || !fServer[system]->IsConnected()) {
2596                 if(system < 3)
2597                 {
2598                 AliError(Form("Can't establish connection to FXS database for %s",
2599                                         AliShuttleInterface::GetSystemName(system)));
2600                 } else {
2601                 AliError("Can't establish connection to Run logbook.");
2602                 }
2603                 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; } // avoid dangling pointer on next Connect
2604                 return kFALSE;
2605         }
2606
2607         // Get tables
2608         TSQLResult* aResult=0;
2609         switch(system){
2610                 case kDAQ:
2611                         aResult = fServer[kDAQ]->GetTables(dbName.Data());
2612                         break;
2613                 case kDCS:
2614                         aResult = fServer[kDCS]->GetTables(dbName.Data());
2615                         break;
2616                 case kHLT:
2617                         aResult = fServer[kHLT]->GetTables(dbName.Data());
2618                         break;
2619                 default:
2620                         aResult = fServer[3]->GetTables(dbName.Data());
2621                         break;
2622         }
2623
2624         delete aResult;
2625         return kTRUE;
2626 }
2627
2628 //______________________________________________________________________________________________
2629 Bool_t AliShuttle::UpdateTable()
2630 {
2631         //
2632         // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2633         //
2634
2635         Bool_t result = kTRUE;
2636
2637         for (UInt_t system=0; system<3; system++)
2638         {
2639                 if(!fFXSCalled[system]) continue;
2640
2641                 // check connection; connect if not yet connected
2642                 if (!Connect(system))
2643                 {
2644                         Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2645                         result = kFALSE;
2646                         continue;
2647                 }
2648
2649                 TTimeStamp now; // now
2650
2651                 // Loop on FXS list entries
2652                 TIter iter(&fFXSlist[system]);
2653                 TObjString *aFXSentry=0;
2654                 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2655                 {
2656                         TString aFXSentrystr = aFXSentry->String();
2657                         TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2658                         if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2659                         {
2660                                 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2661                                         GetSystemName(system), aFXSentrystr.Data()));
2662                                 if(aFXSarray) delete aFXSarray;
2663                                 result = kFALSE;
2664                                 continue;
2665                         }
2666                         const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2667                         const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2668
2669                         TString whereClause;
2670                         if (system == kDAQ)
2671                         {
2672                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2673                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2674                         }
2675                         else if (system == kDCS)
2676                         {
2677                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2678                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId);
2679                         }
2680                         else if (system == kHLT)
2681                         {
2682                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2683                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2684                         }
2685
2686                         delete aFXSarray;
2687
2688                         TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2689                                                                 (Int_t) now.GetSec(), whereClause.Data());
2690
2691                         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2692
2693                         // Query execution
2694                         TSQLResult* aResult;
2695                         aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2696                         if (!aResult)
2697                         {
2698                                 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2699                                                                 GetSystemName(system), sqlQuery.Data()));
2700                                 result = kFALSE;
2701                                 continue;
2702                         }
2703                         delete aResult;
2704                 }
2705         }
2706
2707         return result;
2708 }
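
// UpdateTable splits each stored entry with TString::Tokenize("#!?!#"). As far as we
// understand the ROOT documentation, Tokenize treats its argument as a set of delimiter
// characters rather than as one separator string, so an id or source containing '#', '!'
// or '?' would produce more than two tokens and be reported as an error. The sketch below
// shows one possible alternative that splits on the whole separator. SplitFXSEntry is a
// hypothetical helper, not part of AliShuttle, and it uses std::string instead of TString
// only to stay self-contained.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: splits "<id>#!?!#<source>" on the full separator string.
// Returns false if the separator is missing or appears more than once.
bool SplitFXSEntry(const std::string& entry, std::pair<std::string, std::string>& out)
{
    static const std::string kSeparator = "#!?!#";
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;
    // reject entries that contain the separator a second time
    if (entry.find(kSeparator, pos + kSeparator.size()) != std::string::npos) return false;
    out.first = entry.substr(0, pos);                      // file id
    out.second = entry.substr(pos + kSeparator.size());    // source name
    return true;
}
```

// With this helper an entry such as "id?v2#!?!#src" splits into "id?v2" and "src".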
2709
2710 //______________________________________________________________________________________________
2711 Bool_t AliShuttle::UpdateTableFailCase()
2712 {
2713         // Update FXS table filling time_processed field in all rows corresponding to current run and detector.
2714         // This is called when the preprocessor is declared failed for the current run, because
2715         // otherwise the field is filled only in case of success (see UpdateTable)
2716
2717         Bool_t result = kTRUE;
2718
2719         for (UInt_t system=0; system<3; system++)
2720         {
2721                 // check connection; connect if not yet connected
2722                 if (!Connect(system))
2723                 {
2724                         Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2725                                                         GetSystemName(system)));
2726                         result = kFALSE;
2727                         continue;
2728                 }
2729
2730                 TTimeStamp now; // now
2731
2732                 // Update all rows of the current run and detector (no loop on the FXS list here)
2733
2734                 TString whereClause = Form("where run=%d and detector=\"%s\";",
2735                                                 GetCurrentRun(), fCurrentDetector.Data());
2736
2737
2738                 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2739                                                         (Int_t) now.GetSec(), whereClause.Data());
2740
2741                 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2742
2743                 // Query execution
2744                 TSQLResult* aResult;
2745                 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2746                 if (!aResult)
2747                 {
2748                         Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2749                                                         GetSystemName(system), sqlQuery.Data()));
2750                         result = kFALSE;
2751                         continue;
2752                 }
2753                 delete aResult;
2754         }
2755
2756         return result;
2757 }
2758
2759 //______________________________________________________________________________________________
2760 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2761 {
2762         //
2763         // Update Shuttle logbook filling detector or shuttle_done column
2764         // Example usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2765         //
2766
2767         // check connection; connect if not yet connected
2768         if(!Connect(3)){
2769                 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2770                 return kFALSE;
2771         }
2772
2773         TString detName(detector);
2774         TString setClause;
2775         if(detName == "shuttle_done")
2776         {
2777                 setClause = "set shuttle_done=1";
2778
2779                 if (fMonaLisa)
2780                 {
2781                         // Send the information to ML
2782                         TMonaLisaText  mlStatus("SHUTTLE_status", "Done");
2783
2784                         TList mlList;
2785                         mlList.Add(&mlStatus);
2786                 
2787                         fMonaLisa->SendParameters(&mlList);
2788                 }
2789         } else {
2790                 TString statusStr(status);
2791                 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2792                    statusStr.Contains("failed", TString::kIgnoreCase)){
2793                         setClause = Form("set %s=\"%s\"", detector, status);
2794                 } else {
2795                         Log("SHUTTLE",
2796                                 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2797                                         status, detector));
2798                         return kFALSE;
2799                 }
2800         }
2801
2802         TString whereClause = Form("where run=%d", GetCurrentRun());
2803
2804         TString sqlQuery = Form("update %s %s %s",
2805                                         fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2806
2807         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2808
2809         // Query execution
2810         TSQLResult* aResult;
2811         aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2812         if (!aResult) {
2813                 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2814                 return kFALSE;
2815         }
2816         delete aResult;
2817
2818         return kTRUE;
2819 }
2820
2821 //______________________________________________________________________________________________
Int_t AliShuttle::GetCurrentRun() const
{
        //
        // Get current run from logbook entry
        //

        return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentStartTime() const
{
        //
        // Get current start time from logbook entry
        //

        return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentEndTime() const
{
        //
        // Get current end time from logbook entry
        //

        return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentYear() const
{
        //
        // Get current year from logbook entry
        //

        if (!fLogbookEntry) return 0;

        TTimeStamp startTime(GetCurrentStartTime());
        TString year = Form("%d",startTime.GetDate());
        year = year(0,4);

        return year.Atoi();
}

//______________________________________________________________________________________________
const char* AliShuttle::GetLHCPeriod() const
{
        //
        // Get current LHC period from logbook entry
        //

        if (!fLogbookEntry) return 0;

        return fLogbookEntry->GetRunParameter("LHCperiod");
}

//______________________________________________________________________________________________
void AliShuttle::Log(const char* detector, const char* message)
{
        //
        // Fill log string with a message
        //

        TString logRunDir = GetShuttleLogDir();
        if (GetCurrentRun() >=0)
                logRunDir += Form("/%d", GetCurrentRun());

        void* dir = gSystem->OpenDirectory(logRunDir.Data());
        if (dir == NULL) {
                if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
                        // report the directory that actually failed, not the log root
                        AliError(Form("Can't open or create directory <%s>", logRunDir.Data()));
                        return;
                }

        } else {
                gSystem->FreeDirectory(dir);
        }

        TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
        if (GetCurrentRun() >= 0)
                toLog += Form("run %d - ", GetCurrentRun());
        toLog += Form("%s", message);

        AliInfo(toLog.Data());

        // if the log output is already redirected to the file, return here
        if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
                return;

        TString fileName = GetLogFileName(detector);

        gSystem->ExpandPathName(fileName);

        ofstream logFile;
        logFile.open(fileName, ofstream::out | ofstream::app);

        if (!logFile.is_open()) {
                AliError(Form("Could not open file %s", fileName.Data()));
                return;
        }

        logFile << toLog.Data() << "\n";

        logFile.close();
}

//______________________________________________________________________________________________
TString AliShuttle::GetLogFileName(const char* detector) const
{
        //
        // returns the name of the log file for a given sub detector
        //

        TString fileName;

        if (GetCurrentRun() >= 0)
        {
                fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
                        detector, GetCurrentRun());
        } else {
                fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
        }

        return fileName;
}

//______________________________________________________________________________________________
Bool_t AliShuttle::Collect(Int_t run)
{
        //
        // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
        // If a dedicated run is given, only this run is processed
        //
        // In operational mode, this is the Shuttle function triggered by the EOR signal.
        //

        if (run == -1)
                Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
        else
                Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));

        SetLastAction("Starting");

        TString whereClause("where shuttle_done=0");
        if (run != -1)
                whereClause += Form(" and run=%d", run);

        TObjArray shuttleLogbookEntries;
        if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
        {
                Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
                return kFALSE;
        }

        if (shuttleLogbookEntries.GetEntries() == 0)
        {
                if (run == -1)
                        Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
                else
                        Log("SHUTTLE", Form("Collect - Run %d is already DONE "
                                                "or it does not exist in Shuttle logbook", run));
                return kTRUE;
        }

        for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
                fFirstUnprocessed[iDet] = kTRUE;

        if (run != -1)
        {
                // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
                // flag them into fFirstUnprocessed array
                TString whereClause(Form("where shuttle_done=0 and run < %d", run));
                TObjArray tmpLogbookEntries;
                if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
                {
                        Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
                        return kFALSE;
                }

                TIter iter(&tmpLogbookEntries);
                AliShuttleLogbookEntry* anEntry = 0;
                while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
                {
                        for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
                        {
                                if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
                                {
                                        AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
                                                        anEntry->GetRun(), GetDetName(iDet)));
                                        fFirstUnprocessed[iDet] = kFALSE;
                                }
                        }

                }

        }

        if (!RetrieveConditionsData(shuttleLogbookEntries))
        {
                Log("SHUTTLE", "Collect - Process of at least one run failed");
                return kFALSE;
        }

        Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
        return kTRUE;
}

//______________________________________________________________________________________________
Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
{
        //
        // Retrieve conditions data for all runs that aren't processed yet
        //

        Bool_t hasError = kFALSE;

        TIter iter(&dateEntries);
        AliShuttleLogbookEntry* anEntry;

        while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
                if (!Process(anEntry)){
                        hasError = kTRUE;
                }

                // clean SHUTTLE temp directory
                //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
                //RemoveFile(filename.Data());
        }

        return hasError == kFALSE;
}

//______________________________________________________________________________________________
ULong_t AliShuttle::GetTimeOfLastAction() const
{
        //
        // Gets time of last action
        //

        ULong_t tmp;

        fMonitoringMutex->Lock();

        tmp = fLastActionTime;

        fMonitoringMutex->UnLock();

        return tmp;
}

//______________________________________________________________________________________________
const TString AliShuttle::GetLastAction() const
{
        //
        // returns a string description of the last action
        //

        TString tmp;

        fMonitoringMutex->Lock();

        tmp = fLastAction;

        fMonitoringMutex->UnLock();

        return tmp;