Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
[u/mrichter/AliRoot.git] / SHUTTLE / AliShuttle.cxx
1 /**************************************************************************
2  * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3  *                                                                        *
4  * Author: The ALICE Off-line Project.                                    *
5  * Contributors are mentioned in the code where appropriate.              *
6  *                                                                        *
7  * Permission to use, copy, modify and distribute this software and its   *
8  * documentation strictly for non-commercial purposes is hereby granted   *
9  * without fee, provided that the above copyright notice appears in all   *
10  * copies and that both the copyright notice and this permission notice   *
11  * appear in the supporting documentation. The authors make no claims     *
12  * about the suitability of this software for any purpose. It is          *
13  * provided "as is" without express or implied warranty.                  *
14  **************************************************************************/
15
16 /*
17 $Log$
18 Revision 1.54  2007/07/12 09:51:25  jgrosseo
19 removed duplicated log message in GetFile
20
21 Revision 1.53  2007/07/12 09:26:28  jgrosseo
22 updating hlt fxs base path
23
24 Revision 1.52  2007/07/12 08:06:45  jgrosseo
25 adding log messages in getfile... functions
26 adding not implemented copy constructor in alishuttleconfigholder
27
28 Revision 1.51  2007/07/03 17:24:52  acolla
29 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
30
31 Revision 1.50  2007/07/02 17:19:32  acolla
32 preprocessor is run in a temp directory that is removed when process is finished.
33
34 Revision 1.49  2007/06/29 10:45:06  acolla
35 Number of columns in MySql Shuttle logbook increased by one (HLT added)
36
37 Revision 1.48  2007/06/21 13:06:19  acolla
38 GetFileSources returns dummy list with 1 source if system=DCS (better than
39 returning error as it was)
40
41 Revision 1.47  2007/06/19 17:28:56  acolla
42 HLT updated; missing map bug removed.
43
44 Revision 1.46  2007/06/09 13:01:09  jgrosseo
45 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
46
47 Revision 1.45  2007/05/30 06:35:20  jgrosseo
48 Adding functionality to the Shuttle/TestShuttle:
49 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
50 o) Function to retrieve list of IDs for a given source      (GetFileIDs)
51 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
52 Example code has been added to the TestProcessor in TestShuttle
53
54 Revision 1.44  2007/05/11 16:09:32  acolla
55 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
56 example: ITS/SPD/100_filename.root
57
58 Revision 1.43  2007/05/10 09:59:51  acolla
59 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
60
61 Revision 1.42  2007/05/03 08:01:39  jgrosseo
62 typo in last commit :-(
63
64 Revision 1.41  2007/05/03 08:00:48  jgrosseo
65 fixing log message when pp want to skip dcs value retrieval
66
67 Revision 1.40  2007/04/27 07:06:48  jgrosseo
68 GetFileSources returns empty list in case of no files, but successful query
69 No mails sent in testmode
70
71 Revision 1.39  2007/04/17 12:43:57  acolla
72 Correction in StoreOCDB; change of text in mail to detector expert
73
74 Revision 1.38  2007/04/12 08:26:18  jgrosseo
75 updated comment
76
77 Revision 1.37  2007/04/10 16:53:14  jgrosseo
78 redirecting sub detector stdout, stderr to sub detector log file
79
80 Revision 1.35  2007/04/04 16:26:38  acolla
81 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
82 2. Added missing dependency in test preprocessors.
83 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
84
85 Revision 1.34  2007/04/04 10:33:36  jgrosseo
86 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
87 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
88
89 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
90
91 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
92
93 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
94
95 5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. if there is a certain run type.
96 If you always need DCS data (like before), you do not need to implement it.
97
98 6) The run type has been added to the monitoring page
99
100 Revision 1.33  2007/04/03 13:56:01  acolla
101 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
102 run type.
103
104 Revision 1.32  2007/02/28 10:41:56  acolla
105 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
106 AliPreprocessor::GetRunType() function.
107 Added some ldap definition files.
108
109 Revision 1.30  2007/02/13 11:23:21  acolla
110 Moved getters and setters of Shuttle's main OCDB/Reference, local
111 OCDB/Reference, temp and log folders to AliShuttleInterface
112
113 Revision 1.27  2007/01/30 17:52:42  jgrosseo
114 adding monalisa monitoring
115
116 Revision 1.26  2007/01/23 19:20:03  acolla
117 Removed old ldif files, added TOF, MCH ldif files. Added some options in
118 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
119 SetShuttleLogDir
120
121 Revision 1.25  2007/01/15 19:13:52  acolla
122 Moved some AliInfo to AliDebug in SendMail function
123
124 Revision 1.21  2006/12/07 08:51:26  jgrosseo
125 update (alberto):
126 table, db names in ldap configuration
127 added GRP preprocessor
128 DCS data can also be retrieved by data point
129
130 Revision 1.20  2006/11/16 16:16:48  jgrosseo
131 introducing strict run ordering flag
132 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
133
134 Revision 1.19  2006/11/06 14:23:04  jgrosseo
135 major update (Alberto)
136 o) reading of run parameters from the logbook
137 o) online offline naming conversion
138 o) standalone DCSclient package
139
140 Revision 1.18  2006/10/20 15:22:59  jgrosseo
141 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
142 o) Merging Collect, CollectAll, CollectNew function
143 o) Removing implementation of empty copy constructors (declaration still there!)
144
145 Revision 1.17  2006/10/05 16:20:55  jgrosseo
146 adapting to new CDB classes
147
148 Revision 1.16  2006/10/05 15:46:26  jgrosseo
149 applying to the new interface
150
151 Revision 1.15  2006/10/02 16:38:39  jgrosseo
152 update (alberto):
153 fixed memory leaks
154 storing of objects that failed to be stored to the grid before
155 interfacing of shuttle status table in daq system
156
157 Revision 1.14  2006/08/29 09:16:05  jgrosseo
158 small update
159
160 Revision 1.13  2006/08/15 10:50:00  jgrosseo
161 effc++ corrections (alberto)
162
163 Revision 1.12  2006/08/08 14:19:29  jgrosseo
164 Update to shuttle classes (Alberto)
165
166 - Possibility to set the full object's path in the Preprocessor's and
167 Shuttle's  Store functions
168 - Possibility to extend the object's run validity in the same classes
169 ("startValidity" and "validityInfinite" parameters)
170 - Implementation of the StoreReferenceData function to store reference
171 data in a dedicated CDB storage.
172
173 Revision 1.11  2006/07/21 07:37:20  jgrosseo
174 last run is stored after each run
175
176 Revision 1.10  2006/07/20 09:54:40  jgrosseo
177 introducing status management: The processing per subdetector is divided into several steps,
178 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
179 can keep track of the number of failures and skips further processing after a certain threshold is
180 exceeded. These thresholds can be configured in LDAP.
181
182 Revision 1.9  2006/07/19 10:09:55  jgrosseo
183 new configuration, access to DAQ FES (Alberto)
184
185 Revision 1.8  2006/07/11 12:44:36  jgrosseo
186 adding parameters for extended validity range of data produced by preprocessor
187
188 Revision 1.7  2006/07/10 14:37:09  jgrosseo
189 small fix + todo comment
190
191 Revision 1.6  2006/07/10 13:01:41  jgrosseo
192 enhanced storing of last successfully processed run (alberto)
193
194 Revision 1.5  2006/07/04 14:59:57  jgrosseo
195 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
196
197 Revision 1.4  2006/06/12 09:11:16  jgrosseo
198 coding conventions (Alberto)
199
200 Revision 1.3  2006/06/06 14:26:40  jgrosseo
201 o) removed files that were moved to STEER
202 o) shuttle updated to follow the new interface (Alberto)
203
204 Revision 1.2  2006/03/07 07:52:34  hristov
205 New version (B.Yordanov)
206
207 Revision 1.6  2005/11/19 17:19:14  byordano
208 RetrieveDATEEntries and RetrieveConditionsData added
209
210 Revision 1.5  2005/11/19 11:09:27  byordano
211 AliShuttle declaration added
212
213 Revision 1.4  2005/11/17 17:47:34  byordano
214 TList changed to TObjArray
215
216 Revision 1.3  2005/11/17 14:43:23  byordano
217 import to local CVS
218
219 Revision 1.1.1.1  2005/10/28 07:33:58  hristov
220 Initial import as subdirectory in AliRoot
221
222 Revision 1.2  2005/09/13 08:41:15  byordano
223 default startTime endTime added
224
225 Revision 1.4  2005/08/30 09:13:02  byordano
226 some docs added
227
228 Revision 1.3  2005/08/29 21:15:47  byordano
229 some docs added
230
231 */
232
233 //
234 // This class is the main manager for AliShuttle.
235 // It organizes the data retrieval from DCS and calls the
236 // interface methods of AliPreprocessor.
237 // For every detector in AliShuttleConfig (see AliShuttleConfig),
238 // data for its set of aliases is retrieved. If an AliPreprocessor
239 // is registered for this detector, it is used
240 // according to the schema (see AliPreprocessor).
241 // If no AliPreprocessor is registered, the retrieved
242 // data is stored automatically to the underlying AliCDBStorage.
243 // The alias name is used as detSpec.
244 //
245
246 #include "AliShuttle.h"
247
248 #include "AliCDBManager.h"
249 #include "AliCDBStorage.h"
250 #include "AliCDBId.h"
251 #include "AliCDBRunRange.h"
252 #include "AliCDBPath.h"
253 #include "AliCDBEntry.h"
254 #include "AliShuttleConfig.h"
255 #include "DCSClient/AliDCSClient.h"
256 #include "AliLog.h"
257 #include "AliPreprocessor.h"
258 #include "AliShuttleStatus.h"
259 #include "AliShuttleLogbookEntry.h"
260
261 #include <TSystem.h>
262 #include <TObject.h>
263 #include <TString.h>
264 #include <TTimeStamp.h>
265 #include <TObjString.h>
266 #include <TSQLServer.h>
267 #include <TSQLResult.h>
268 #include <TSQLRow.h>
269 #include <TMutex.h>
270 #include <TSystemDirectory.h>
271 #include <TSystemFile.h>
272 #include <TFile.h>
273 #include <TFileMerger.h>
274 #include <TGrid.h>
275 #include <TGridResult.h>
276
277 #include <TMonaLisaWriter.h>
278
279 #include <fstream>
280
281 #include <sys/types.h>
282 #include <sys/wait.h>
283
284 ClassImp(AliShuttle)
285
286 //______________________________________________________________________________________________
287 AliShuttle::AliShuttle(const AliShuttleConfig* config,
288                 UInt_t timeout, Int_t retries):
289 fConfig(config),
290 fTimeout(timeout), fRetries(retries),
291 fPreprocessorMap(),
292 fLogbookEntry(0),
293 fCurrentDetector(),
294 fStatusEntry(0),
295 fMonitoringMutex(0),
296 fLastActionTime(0),
297 fLastAction(),
298 fMonaLisa(0),
299 fTestMode(kNone),
300 fReadTestMode(kFALSE),
301 fOutputRedirected(kFALSE)
302 {
303         //
304         // config: AliShuttleConfig used
305         // timeout: timeout used for AliDCSClient connection
306         // retries: the number of retries in case of connection error.
307         //
308
309         if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
310         for(int iSys=0;iSys<4;iSys++) {
311                 fServer[iSys]=0;
312                 if (iSys < 3)
313                         fFXSlist[iSys].SetOwner(kTRUE);
314         }
315         fPreprocessorMap.SetOwner(kTRUE);
316
317         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
318                 fFirstUnprocessed[iDet] = kFALSE;
319
320         fMonitoringMutex = new TMutex();
321 }
322
323 //______________________________________________________________________________________________
324 AliShuttle::~AliShuttle()
325 {
326         //
327         // destructor
328         //
329
330         fPreprocessorMap.DeleteAll();
331         for(int iSys=0;iSys<4;iSys++)
332                 if(fServer[iSys]) {
333                         fServer[iSys]->Close();
334                         delete fServer[iSys];
335                         fServer[iSys] = 0;
336                 }
337
338         if (fStatusEntry){
339                 delete fStatusEntry;
340                 fStatusEntry = 0;
341         }
342         
343         if (fMonitoringMutex) 
344         {
345                 delete fMonitoringMutex;
346                 fMonitoringMutex = 0;
347         }
348 }
349
350 //______________________________________________________________________________________________
351 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
352 {
353         //
354         // Registers new AliPreprocessor.
355         // It uses GetName() as the identifier of the preprocessor.
356         // The preprocessor is registered only if no other preprocessor
357         // with the same identifier (GetName()) is already registered.
358         //
359
360         const char* detName = preprocessor->GetName();
361         if(GetDetPos(detName) < 0)
362                 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
363
364         if (fPreprocessorMap.GetValue(detName)) {
365                 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
366                 return;
367         }
368
369         fPreprocessorMap.Add(new TObjString(detName), preprocessor);
370 }
371 //______________________________________________________________________________________________
372 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
373                 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
374 {
375         // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
376         // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
377         // using this function. Use StoreReferenceData instead!
378         // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
379         // finishes the data are transferred to the main storage (Grid).
380
381         return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
382 }
383
384 //______________________________________________________________________________________________
385 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
386 {
387         // Stores a CDB object in the storage for reference data. These objects will not be available during
388         // offline reconstruction. Use this function for reference data only!
389         // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
390         // finishes the data are transferred to the main storage (Grid).
391
392         return StoreLocally(fgkLocalRefStorage, path, object, metaData);
393 }
394
395 //______________________________________________________________________________________________
396 Bool_t AliShuttle::StoreLocally(const TString& localUri,
397                         const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
398                         Int_t validityStart, Bool_t validityInfinite)
399 {
400         // Store object temporarily in local storage. Parameters are passed by Store and StoreReferenceData functions.
401         // When the preprocessor finishes, the data are transferred to the main storage (Grid).
402         // The parameters are:
403         //   1) Uri of the backup storage (Local)
404         //   2) the object's path.
405         //   3) the object to be stored
406         //   4) the metaData to be associated with the object
407         //   5) the validity start run number w.r.t. the current run,
408         //      if the data is valid only for this run leave the default 0
409         //   6) specifies if the calibration data is valid for infinity (this means until updated),
410         //      typical for calibration runs, the default is kFALSE
411         //
412         // returns kFALSE on failure, kTRUE otherwise
413
414         if (fTestMode & kErrorStorage)
415         {
416                 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
417                 return kFALSE;
418         }
419         
420         const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
421
422         Int_t firstRun = GetCurrentRun() - validityStart;
423         if(firstRun < 0) {
424                 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
425                 firstRun=0;
426         }
427
428         Int_t lastRun = -1;
429         if(validityInfinite) {
430                 lastRun = AliCDBRunRange::Infinity();
431         } else {
432                 lastRun = GetCurrentRun();
433         }
434
435         // Version is set to current run, it will be used later to transfer data to Grid
436         AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
437
438         if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
439                 TObjString runUsed = Form("%d", GetCurrentRun());
440                 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
441         }
442
443         Bool_t result = kFALSE;
444
445         if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
446                 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
447         } else {
448                 result = AliCDBManager::Instance()->GetStorage(localUri)
449                                         ->Put(object, id, metaData);
450         }
451
452         if(!result) {
453
454                 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
455         }
456
457         return result;
458 }
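// Illustrative sketch only, not part of AliShuttle: the run-range arithmetic
// that StoreLocally applies before building the AliCDBId, written as a
// standalone function. ComputeRunRange and kInfinityRun are hypothetical
// names; kInfinityRun merely stands in for AliCDBRunRange::Infinity(), whose
// actual value is not shown in this file.

```cpp
#include <climits>
#include <utility>

// Hypothetical placeholder for AliCDBRunRange::Infinity().
const int kInfinityRun = INT_MAX;

// Returns {firstRun, lastRun} for an object stored while processing currentRun.
// validityStart is the number of runs before currentRun from which the object
// is already valid; validityInfinite extends the validity until it is updated.
std::pair<int, int> ComputeRunRange(int currentRun, int validityStart, bool validityInfinite)
{
    int firstRun = currentRun - validityStart;
    if (firstRun < 0) firstRun = 0;  // negative run numbers are clamped to 0

    const int lastRun = validityInfinite ? kInfinityRun : currentRun;
    return std::make_pair(firstRun, lastRun);
}
```

// With the defaults (validityStart = 0, validityInfinite = kFALSE) the object
// is valid for the current run only, i.e. firstRun == lastRun == currentRun.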
459
460 //______________________________________________________________________________________________
461 Bool_t AliShuttle::StoreOCDB()
462 {
463         //
464         // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
465         // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
466         // Then calls StoreRefFilesToGrid to store reference files. 
467         //
468         
469         if (fTestMode & kErrorGrid)
470         {
471                 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
472                 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
473                 return kFALSE;
474         }
475         
476         Log("SHUTTLE","Storing OCDB data ...");
477         Bool_t resultCDB = StoreOCDB(fgkMainCDB);
478
479         Log("SHUTTLE","Storing reference data ...");
480         Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
481         
482         Log("SHUTTLE","Storing reference files ...");
483         Bool_t resultRefFiles = StoreRefFilesToGrid();
484         
485         return resultCDB && resultRef && resultRefFiles;
486 }
487
488 //______________________________________________________________________________________________
489 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
490 {
491         //
492         // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
493         //
494
495         TObjArray* gridIds=0;
496
497         Bool_t result = kTRUE;
498
499         const char* type = 0;
500         TString localURI;
501         if(gridURI == fgkMainCDB) {
502                 type = "OCDB";
503                 localURI = fgkLocalCDB;
504         } else if(gridURI == fgkMainRefStorage) {
505                 type = "reference";
506                 localURI = fgkLocalRefStorage;
507         } else {
508                 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
509                 return kFALSE;
510         }
511
512         AliCDBManager* man = AliCDBManager::Instance();
513
514         AliCDBStorage *gridSto = man->GetStorage(gridURI);
515         if(!gridSto) {
516                 Log("SHUTTLE",
517                         Form("StoreOCDB - cannot activate main %s storage", type));
518                 return kFALSE;
519         }
520
521         gridIds = gridSto->GetQueryCDBList();
522
523         // get objects previously stored in local CDB
524         AliCDBStorage *localSto = man->GetStorage(localURI);
525         if(!localSto) {
526                 Log("SHUTTLE",
527                         Form("StoreOCDB - cannot activate local %s storage", type));
528                 return kFALSE;
529         }
530         AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
531         // Local objects were stored with current run as Grid version!
532         TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
533         localEntries->SetOwner(1);
534
535         // loop on local stored objects
536         TIter localIter(localEntries);
537         AliCDBEntry *aLocEntry = 0;
538         while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
539                 aLocEntry->SetOwner(1);
540                 AliCDBId aLocId = aLocEntry->GetId();
541                 aLocEntry->SetVersion(-1);
542                 aLocEntry->SetSubVersion(-1);
543
544                 // If local object is valid up to infinity we store it only if it is
545                 // the first unprocessed run!
546                 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
547                         !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
548                 {
549                         Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
550                                                 "there are previous unprocessed runs!",
551                                                 fCurrentDetector.Data(), aLocId.GetPath().Data()));
552                         continue;
553                 }
554
555                 // loop on Grid valid Id's
556                 Bool_t store = kTRUE;
557                 TIter gridIter(gridIds);
558                 AliCDBId* aGridId = 0;
559                 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
560                         if(aGridId->GetPath() != aLocId.GetPath()) continue;
561                         // skip all objects valid up to infinity
562                         if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
563                         // if we get here, it means there's already some more recent object stored on Grid!
564                         store = kFALSE;
565                         break;
566                 }
567
568                 // Store the object on the Grid only if no more recent object exists there
569                 Bool_t storeOk = store ? gridSto->Put(aLocEntry) : kFALSE;
570                 if(!store || storeOk){
571
572                         if (!store)
573                         {
574                                 Log(fCurrentDetector.Data(),
575                                         Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
576                                                 type, aGridId->ToString().Data()));
577                         } else {
578                                 Log("SHUTTLE",
579                                         Form("StoreOCDB - Object <%s> successfully put into %s storage",
580                                                 aLocId.ToString().Data(), type));
581                                 Log(fCurrentDetector.Data(),
582                                         Form("StoreOCDB - Object <%s> successfully put into %s storage",
583                                                 aLocId.ToString().Data(), type));
584                         }
585
586                         // removing local filename...
587                         TString filename;
588                         localSto->IdToFilename(aLocId, filename);
589                         AliInfo(Form("Removing local file %s", filename.Data()));
590                         RemoveFile(filename.Data());
591                         continue;
592                 } else  {
593                         Log("SHUTTLE",
594                                 Form("StoreOCDB - Grid %s storage of object <%s> failed",
595                                         type, aLocId.ToString().Data()));
596                         Log(fCurrentDetector.Data(),
597                                 Form("StoreOCDB - Grid %s storage of object <%s> failed",
598                                         type, aLocId.ToString().Data()));
599                         result = kFALSE;
600                 }
601         }
602         localEntries->Clear();
603
604         return result;
605 }
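// Illustrative sketch only, not part of AliShuttle: the per-object decision
// that the comments in StoreOCDB(const TString&) describe, reduced to a pure
// function. ObjectId, UploadDecision and DecideUpload are hypothetical names;
// ObjectId keeps just the two AliCDBId fields the decision looks at, and
// infinityRun stands in for AliCDBRunRange::Infinity().

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for AliCDBId.
struct ObjectId {
    std::string path;
    int lastRun;
};

enum UploadDecision { kSkipInfinite, kAlreadyOnGrid, kUpload };

UploadDecision DecideUpload(const ObjectId& local, const std::vector<ObjectId>& gridIds,
                            bool firstUnprocessed, int infinityRun)
{
    // An object valid up to infinity is only stored for the first unprocessed run.
    if (local.lastRun == infinityRun && !firstUnprocessed)
        return kSkipInfinite;

    for (std::size_t i = 0; i < gridIds.size(); ++i) {
        if (gridIds[i].path != local.path) continue;      // different object
        if (gridIds[i].lastRun == infinityRun) continue;  // infinite entries are ignored
        return kAlreadyOnGrid;  // a more recent object with this path exists
    }
    return kUpload;
}
```

// In the function above, kSkipInfinite corresponds to the early "continue",
// kAlreadyOnGrid to store == kFALSE, and kUpload to store == kTRUE.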
606
607 //______________________________________________________________________________________________
608 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
609 {
610         // clears the directory used to store reference files of a given subdetector
611   
612         AliCDBManager* man = AliCDBManager::Instance();
613         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
614         TString localBaseFolder = sto->GetBaseFolder();
615
616         TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
617         
618         Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
619
620         TString begin;
621         begin.Form("%d_", GetCurrentRun());
622         
623         TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
624         if (!baseDir)
625                 return kTRUE;
626                 
627         TList* dirList = baseDir->GetListOfFiles();
628         delete baseDir;
629         
630         if (!dirList) return kTRUE;
631                         
632         if (dirList->GetEntries() < 3) 
633         {
634                 delete dirList;
635                 return kTRUE;
636         }
637                                 
638         Int_t nDirs = 0, nDel = 0;
639         TIter dirIter(dirList);
640         TSystemFile* entry = 0;
641
642         Bool_t success = kTRUE;
643         
644         while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
645         {                                       
646                 if (entry->IsDirectory())
647                         continue;
648                 
649                 TString fileName(entry->GetName());
650                 if (!fileName.BeginsWith(begin))
651                         continue;
652                         
653                 nDirs++;
654                                                 
655                 // delete file
656                 Int_t result = gSystem->Unlink(Form("%s/%s", targetDir.Data(), fileName.Data()));
657                 
658                 if (result)
659                 {
660                         Log("SHUTTLE", Form("Could not delete file %s!", fileName.Data()));
661                         success = kFALSE;
662                 } else {
663                         nDel++;
664                 }
665         }
666
667         if(nDirs > 0)
668                 Log("SHUTTLE", Form("CleanReferenceStorage - %d (out of %d) reference files in folder %s were deleted.", 
669                         nDel, nDirs, targetDir.Data()));
670
671                 
672         delete dirList;
673         return success;
700 }
701
702 //______________________________________________________________________________________________
703 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
704 {
705         //
706         // Stores reference file directly (without opening it). This function stores the file locally.
707         //
708         // The file is stored under the following location: 
709         // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
710         // where <gridFileName> is the third parameter given to the function
711         // 
712         
713         if (fTestMode & kErrorStorage)
714         {
715                 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
716                 return kFALSE;
717         }
718         
719         AliCDBManager* man = AliCDBManager::Instance();
720         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
721         
722         TString localBaseFolder = sto->GetBaseFolder();
723         
724         TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector); 
725         
726         // try to open the folder; create it if it does not exist
727         void* dir = gSystem->OpenDirectory(targetDir.Data());
728         if (dir == NULL) {
729                 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
730                         Log("SHUTTLE", Form("Can't create directory <%s>", targetDir.Data()));
731                         return kFALSE;
732                 }
733
734         } else {
735                 gSystem->FreeDirectory(dir);
736         }
737
738         TString target;
739         target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
740         
741         Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
742         if (result)
743         {
744                 Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
745                 return kFALSE;
746         }
747
748         result = gSystem->CopyFile(localFile, target);
749
750         if (result == 0)
751         {
752                 Log("SHUTTLE", Form("StoreReferenceFile - File %s stored locally to %s", localFile, target.Data()));
753                 return kTRUE;
754         }
755         else
756         {
757                 Log("SHUTTLE", Form("StoreReferenceFile - Could not store file %s to %s!. Error code = %d", 
758                                 localFile, target.Data(), result));
759                 return kFALSE;
760         }       
761 }
762
763 //______________________________________________________________________________________________
764 Bool_t AliShuttle::StoreRefFilesToGrid()
765 {
766         //
767         // Transfers the reference files to the Grid.
768         //
769         // The files are stored under the following location: 
770         // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
771         //
772         
773         AliCDBManager* man = AliCDBManager::Instance();
774         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
775         if (!sto)
776                 return kFALSE;
777         TString localBaseFolder = sto->GetBaseFolder();
778                 
779         TString dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
780                 
781         AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
782         if (!gridSto)
783                 return kFALSE;
784         
785         TString gridBaseFolder = gridSto->GetBaseFolder();
786
787         TString alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
788         
789         TString begin;
790         begin.Form("%d_", GetCurrentRun());
791         
792         TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
793         if (!baseDir)
794                 return kTRUE;
795                 
796         TList* dirList = baseDir->GetListOfFiles();
797         delete baseDir;
798         
799         if (!dirList) return kTRUE;
800                 
801         if (dirList->GetEntries() < 3) 
802         {
803                 delete dirList;
804                 return kTRUE;
805         }
806                         
807         if (!gGrid)
808         { 
809                 Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
810                 delete dirList;
811                 return kFALSE;
812         }
813         
814         Int_t nDirs = 0, nTransfer = 0;
815         TIter dirIter(dirList);
816         TSystemFile* entry = 0;
817
818         Bool_t success = kTRUE;
819         Bool_t first = kTRUE;
820         
821         while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
822         {                       
823                 if (entry->IsDirectory())
824                         continue;
825                         
826                 TString fileName(entry->GetName());
827                 if (!fileName.BeginsWith(begin))
828                         continue;
829                         
830                 nDirs++;
831                         
832                 if (first)
833                 {
834                         first = kFALSE;
835                         // check that DET folder exists, otherwise create it
836                         TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
837                         
838                         if (!result)
839                         {
840                                 delete dirList;
841                                 return kFALSE;
842                         }
843                         
844                         if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
845                         {
846                                 if (!gGrid->Mkdir(alienDir.Data(),"",0))
847                                 {
848                                         Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
849                                                         alienDir.Data()));
850                                         delete dirList;
851                                         return kFALSE;
852                                 } else {
853                                         Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
854                                 }
855                                 
856                         } else {
857                                         Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
858                         }
859                 }
860                         
861                 TString fullLocalPath;
862                 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
863                 
864                 TString fullGridPath;
865                 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
866
868                 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
869                 
870                 if (result)
871                 {
872                         Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded!", fullLocalPath.Data(), fullGridPath.Data()));
873                         RemoveFile(fullLocalPath);
874                         nTransfer++;
875                 }
876                 else
877                 {
878                         Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s FAILED!", fullLocalPath.Data(), fullGridPath.Data()));
879                         success = kFALSE;
880                 }
881         }
882
883         Log("SHUTTLE", Form("StoreRefFilesToGrid - %d (out of %d) reference files in folder %s copied to Grid.", nTransfer, nDirs, dir.Data()));
884
885                 
886         delete dirList;
887         return success;
888 }
889
890 //______________________________________________________________________________________________
891 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
892 {
893         //
894         // Get folder name of reference files 
895         //
896
897         TString offDetStr(GetOfflineDetName(detector));
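898         static TString dir; // static: the pointer returned by dir.Data() must stay valid after this call (callers should copy it at once)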
899         if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
900         {
901                 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
902         } else {
903                 dir.Form("%s/%s", base, offDetStr.Data());
904         }
905         
906         return dir.Data();
907         
908
909 }
910 //______________________________________________________________________________________________
911 void AliShuttle::CleanLocalStorage(const TString& uri)
912 {
913         //
914         // Called when the preprocessor is declared failed. Removes the remaining objects from the local storages.
915         //
916
917         const char* type = 0;
918         if(uri == fgkLocalCDB) {
919                 type = "OCDB";
920         } else if(uri == fgkLocalRefStorage) {
921                 type = "Reference";
922         } else {
923                 AliError(Form("Invalid storage URI: %s", uri.Data()));
924                 return;
925         }
926
927         AliCDBManager* man = AliCDBManager::Instance();
928
929         // open local storage
930         AliCDBStorage *localSto = man->GetStorage(uri);
931         if(!localSto) {
932                 Log("SHUTTLE",
933                         Form("CleanLocalStorage - cannot activate local %s storage", type));
934                 return;
935         }
936
937         TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
938                 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
939
940         AliInfo(Form("filename = %s", filename.Data()));
941
942         AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
943                 GetCurrentRun(), fCurrentDetector.Data()));
944
945         RemoveFile(filename.Data());
946
947 }
948
949 //______________________________________________________________________________________________
950 void AliShuttle::RemoveFile(const char* filename)
951 {
952         //
953         // removes local file
954         //
955
956         TString command(Form("rm -f %s", filename));
957
958         Int_t result = gSystem->Exec(command.Data());
959         if(result != 0)
960         {
961                 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
962                         fCurrentDetector.Data(), filename));
963         }
964 }
965
966 //______________________________________________________________________________________________
967 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
968 {
969         //
970         // Reads the AliShuttleStatus from the CDB
971         //
972
973         if (fStatusEntry){
974                 delete fStatusEntry;
975                 fStatusEntry = 0;
976         }
977
978         fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
979                 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
980
981         if (!fStatusEntry) return 0;
982         fStatusEntry->SetOwner(1);
983
984         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
985         if (!status) {
986                 AliError("Invalid object stored to CDB!");
987                 return 0;
988         }
989
990         return status;
991 }
992
993 //______________________________________________________________________________________________
994 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
995 {
996         //
997         // writes the status for one subdetector
998         //
999
1000         if (fStatusEntry){
1001                 delete fStatusEntry;
1002                 fStatusEntry = 0;
1003         }
1004
1005         Int_t run = GetCurrentRun();
1006
1007         AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1008
1009         fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1010         fStatusEntry->SetOwner(1);
1011
1012         UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1013
1014         if (!result) {
1015                 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1016                                                 fCurrentDetector.Data(), run));
1017                 return kFALSE;
1018         }
1019         
1020         SendMLInfo();
1021
1022         return kTRUE;
1023 }
1024
1025 //______________________________________________________________________________________________
1026 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1027 {
1028         //
1029         // changes the AliShuttleStatus for the given detector and run to the given status
1030         //
1031
1032         if (!fStatusEntry){
1033                 AliError("UNEXPECTED: fStatusEntry empty");
1034                 return;
1035         }
1036
1037         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1038
1039         if (!status){
1040                 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
1041                 return;
1042         }
1043
1044         TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1045                                 fCurrentDetector.Data(),
1046                                 status->GetStatusName(),
1047                                 status->GetStatusName(newStatus));
1048         Log("SHUTTLE", actionStr);
1049         SetLastAction(actionStr);
1050
1051         status->SetStatus(newStatus);
1052         if (increaseCount) status->IncreaseCount();
1053
1054         AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1055
1056         SendMLInfo();
1057 }
1058
1059 //______________________________________________________________________________________________
1060 void AliShuttle::SendMLInfo()
1061 {
1062         //
1063         // sends ML information about the current status of the current detector being processed
1064         //
1065         
1066         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1067         
1068         if (!status){
1069                 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1070                 return;
1071         }
1072         
1073         TMonaLisaText  mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1074         TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1075
1076         TList mlList;
1077         mlList.Add(&mlStatus);
1078         mlList.Add(&mlRetryCount);
1079
1080         fMonaLisa->SendParameters(&mlList);
1081 }
1082
1083 //______________________________________________________________________________________________
1084 Bool_t AliShuttle::ContinueProcessing()
1085 {
1086         // this function reads the AliShuttleStatus information from CDB and
1087         // checks if the processing should be continued
1088         // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1089
1090         if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1091
1092         AliPreprocessor* aPreprocessor =
1093                 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1094         if (!aPreprocessor)
1095         {
1096                 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
1097                 return kFALSE;
1098         }
1099
1100         AliShuttleLogbookEntry::Status entryStatus =
1101                 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1102
1103         if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1104                 AliInfo(Form("ContinueProcessing - %s is %s",
1105                                 fCurrentDetector.Data(),
1106                                 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1107                 return kFALSE;
1108         }
1109
1110         // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1111
1112         // check if current run is first unprocessed run for current detector
1113         if (fConfig->StrictRunOrder(fCurrentDetector) &&
1114                 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1115         {
1116                 if (fTestMode == kNone)
1117                 {
1118                         Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
1119                         return kFALSE;
1120                 }
1121                 else
1122                 {
1123                         Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
1124                 }
1125         }
1126
1127         AliShuttleStatus* status = ReadShuttleStatus();
1128         if (!status) {
1129                 // first time
1130                 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1131                                 fCurrentDetector.Data()));
1132                 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1133                 return WriteShuttleStatus(status);
1134         }
1135
1136         // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1137         // If it happens it may mean Logbook updating failed... let's do it now!
1138         if (status->GetStatus() == AliShuttleStatus::kDone ||
1139             status->GetStatus() == AliShuttleStatus::kFailed){
1140                 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1141                                         fCurrentDetector.Data(),
1142                                         status->GetStatusName(status->GetStatus())));
1143                 UpdateShuttleLogbook(fCurrentDetector.Data(),
1144                                         status->GetStatusName(status->GetStatus()));
1145                 return kFALSE;
1146         }
1147
1148         if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1149                 Log("SHUTTLE",
1150                         Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1151                                 fCurrentDetector.Data()));
1152                 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1153                 if (StoreOCDB()){
1154                         Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1155                                 fCurrentDetector.Data()));
1156                         UpdateShuttleStatus(AliShuttleStatus::kDone);
1157                         UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1158                 } else {
1159                         Log("SHUTTLE",
1160                                 Form("ContinueProcessing - %s: Grid storage failed again",
1161                                         fCurrentDetector.Data()));
1162                         UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1163                 }
1164                 return kFALSE;
1165         }
1166
1167         // if we get here, there is a restart
1168         Bool_t cont = kFALSE;
1169
1170         // abort conditions
1171         if (status->GetCount() >= fConfig->GetMaxRetries()) {
1172                 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1173                                 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1174                                 status->GetCount(), status->GetStatusName()));
1175                 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1176                 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1177
1178                 // there may still be objects in local OCDB and reference storage
1179                 // and FXS databases may be not updated: do it now!
1180                 
1181                 // TODO Currently disabled, we want to keep files in case of failure!
1182                 // CleanLocalStorage(fgkLocalCDB);
1183                 // CleanLocalStorage(fgkLocalRefStorage);
1184                 // UpdateTableFailCase();
1185                 
1186                 // Send mail to detector expert!
1187                 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1188                 if (!SendMail())
1189                         Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1190                                         fCurrentDetector.Data()));
1191
1192         } else {
1193                 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1194                                 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1195                                 status->GetStatusName(), status->GetCount()));
1196                 Bool_t increaseCount = kTRUE;
1197                 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1198                         increaseCount = kFALSE;
1199                 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
1200                 cont = kTRUE;
1201         }
1202
1203         return cont;
1204 }
1205
1206 //______________________________________________________________________________________________
1207 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1208 {
1209         //
1210         // Makes data retrieval for all detectors in the configuration.
1211         // entry: Shuttle logbook entry, contains run parameters and status of detectors
1212         // (Unprocessed, Inactive, Failed or Done).
1213         // Returns kFALSE if an error occurred and kTRUE otherwise
1214         //
1215
1216         if (!entry) return kFALSE;
1217
1218         fLogbookEntry = entry;
1219
1220         AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1221                                         GetCurrentRun()));
1222
1223         // create ML instance that monitors this run
1224         fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1225         // disable monitoring of other parameters that come e.g. from TFile
1226         gMonitoringWriter = 0;
1227
1228         // Send the information to ML
1229         TMonaLisaText  mlStatus("SHUTTLE_status", "Processing");
1230         TMonaLisaText  mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1231
1232         TList mlList;
1233         mlList.Add(&mlStatus);
1234         mlList.Add(&mlRunType);
1235
1236         fMonaLisa->SendParameters(&mlList);
1237
1238         if (fLogbookEntry->IsDone())
1239         {
1240                 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1241                 UpdateShuttleLogbook("shuttle_done");
1242                 fLogbookEntry = 0;
1243                 return kTRUE;
1244         }
1245
1246         // read test mode if flag is set
1247         if (fReadTestMode)
1248         {
1249                 fTestMode = kNone;
1250                 TString logEntry(entry->GetRunParameter("log"));
1251                 //printf("log entry = %s\n", logEntry.Data());
1252                 TString searchStr("Testmode: ");
1253                 Int_t pos = logEntry.Index(searchStr.Data());
1254                 //printf("%d\n", pos);
1255                 if (pos >= 0)
1256                 {
1257                         TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1258                         //printf("%s\n", subStr.String().Data());
1259                         TString newStr(subStr.Data());
1260                         TObjArray* token = newStr.Tokenize(' ');
1261                         if (token)
1262                         {
1263                                 //token->Print();
1264                                 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1265                                 if (tmpStr)
1266                                 {
1267                                         Int_t testMode = tmpStr->String().Atoi();
1268                                         if (testMode > 0)
1269                                         {
1270                                                 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1271                                                 SetTestMode((TestMode) testMode);
1272                                         }
1273                                 }
1274                                 delete token;          
1275                         }
1276                 }
1277         }
1278         
1279         Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1280         
1281         fLogbookEntry->Print("all");
1282
1283         // Initialization
1284         Bool_t hasError = kFALSE;
1285
1286         AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1287         if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1288         AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1289         if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1290
1291         // Loop on detectors in the configuration
1292         TIter iter(fConfig->GetDetectors());
1293         TObjString* aDetector = 0;
1294
1295         while ((aDetector = (TObjString*) iter.Next()))
1296         {
1297                 fCurrentDetector = aDetector->String();
1298
1299                 if (ContinueProcessing() == kFALSE) continue;
1300
1301                 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START  ******",
1302                                                 GetCurrentRun(), aDetector->GetName()));
1303
1304                 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1305
1306                 Log(fCurrentDetector.Data(), "Starting processing");
1307
1308                 Int_t pid = fork();
1309
1310                 if (pid < 0)
1311                 {
1312                         Log("SHUTTLE", "ERROR: Forking failed");
1313                 }
1314                 else if (pid > 0)
1315                 {
1316                         // parent
1317                         AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1318                                                         GetCurrentRun(), aDetector->GetName()));
1319
1320                         Long_t begin = time(0);
1321
1322                         int status; // to be used with waitpid, on purpose an int (not Int_t)!
1323                         while (waitpid(pid, &status, WNOHANG) == 0)
1324                         {
1325                                 Long_t expiredTime = time(0) - begin;
1326
1327                                 if (expiredTime > fConfig->GetPPTimeOut())
1328                                 {
1329                                         TString tmp;
1330                                         tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1331                                                                 fCurrentDetector.Data(), expiredTime);
1332                                         Log("SHUTTLE", tmp);
1333                                         Log(fCurrentDetector, tmp);
1334
1335                                         kill(pid, 9);
1336
1337                                         UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1338                                         hasError = kTRUE;
1339
1340                                         gSystem->Sleep(1000);
1341                                 }
1342                                 else
1343                                 {
1344                                         gSystem->Sleep(1000);
1345                                         
1346                                         TString checkStr;
1347                                         checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1348                                         FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1349                                         if (!pipe)
1350                                         {
1351                                                 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1352                                                 continue;
1353                                         }
1354                                                 
1355                                         char buffer[100];
1356                                         if (!fgets(buffer, 100, pipe))
1357                                         {
1358                                                 Log("SHUTTLE", "Error: ps did not return anything");
1359                                                 gSystem->ClosePipe(pipe);
1360                                                 continue;
1361                                         }
1362                                         gSystem->ClosePipe(pipe);
1363                                         
1364                                         //Log("SHUTTLE", Form("ps returned %s", buffer));
1365                                         
1366                                         Int_t mem = 0;
1367                                         if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1368                                         {
1369                                                 Log("SHUTTLE", "Error: Could not parse output of ps");
1370                                                 continue;
1371                                         }
1372                                         
1373                                         if (expiredTime % 60 == 0)
1374                                                 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1375                                                                 fCurrentDetector.Data(), expiredTime, mem));
1376                                         
1377                                         if (mem > fConfig->GetPPMaxMem())
1378                                         {
1379                                                 TString tmp;
1380                                                 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1381                                                         mem, fConfig->GetPPMaxMem());
1382                                                 Log("SHUTTLE", tmp);
1383                                                 Log(fCurrentDetector, tmp);
1384         
1385                                                 kill(pid, 9);
1386         
1387                                                 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1388                                                 hasError = kTRUE;
1389         
1390                                                 gSystem->Sleep(1000);
1391                                         }
1392                                 }
1393                         }
1394
1395                         AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1396                                                                 GetCurrentRun(), aDetector->GetName()));
1397
1398                         if (WIFEXITED(status))
1399                         {
1400                                 Int_t returnCode = WEXITSTATUS(status);
1401
1402                                 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1403                                                                                 returnCode));
1404
1405                                 if (returnCode == 0) hasError = kTRUE; // the client reports its Bool_t success flag as exit value (see below), so 0 means failure
1406                         }
1407                 }
1408                 else if (pid == 0)
1409                 {
1410                         // client
1411                         AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1412
1413                         AliInfo("Redirecting output...");
1414
1415                         if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1416                         {
1417                                 Log("SHUTTLE", "Could not freopen stdout");
1418                         }
1419                         else
1420                         {
1421                                 fOutputRedirected = kTRUE;
1422                                 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1423                                         Log("SHUTTLE", "Could not redirect stderr");
1424                                 
1425                         }
1426                         
1427                         TString wd = gSystem->WorkingDirectory();
1428                         TString tmpDir = Form("%s/%s_process",GetShuttleTempDir(),fCurrentDetector.Data());
1429                         
1430                         gSystem->mkdir(tmpDir.Data());
1431                         gSystem->ChangeDirectory(tmpDir.Data());
1432                         
1433                         Bool_t success = ProcessCurrentDetector();
1434                         
1435                         gSystem->ChangeDirectory(wd.Data());
1436                         
1437                         gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1438                         
1439                         if (success) // Preprocessor finished successfully!
1440                         { 
1441                                 // Update time_processed field in FXS DB
1442                                 if (UpdateTable() == kFALSE)
1443                                         Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", 
1444                                                         fCurrentDetector.Data()));
1445
1446                                 // Transfer the data from local storage to main storage (Grid)
1447                                 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1448                                 if (StoreOCDB() == kFALSE)
1449                                 {
1450                                         AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1451                                                         GetCurrentRun(), aDetector->GetName()));
1452                                         UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1453                                         success = kFALSE;
1454                                 } else {
1455                                         AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1456                                                         GetCurrentRun(), aDetector->GetName()));
1457                                         UpdateShuttleStatus(AliShuttleStatus::kDone);
1458                                         UpdateShuttleLogbook(fCurrentDetector, "DONE");
1459                                 }
1460                         }
1461
1462                         for (UInt_t iSys=0; iSys<3; iSys++)
1463                         {
1464                                 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1465                         }
1466
1467                         AliInfo(Form("Client process of %d - %s is exiting now with %d.",
1468                                                         GetCurrentRun(), aDetector->GetName(), success));
1469
1470                         // the client exits here
1471                         gSystem->Exit(success);
1472
1473                         AliError("We should never get here!!!");
1474                 }
1475         }
1476
1477         AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1478                                                         GetCurrentRun()));
1479
1480         //check if shuttle is done for this run, if so update logbook
1481         TObjArray checkEntryArray;
1482         checkEntryArray.SetOwner(1);
1483         TString whereClause = Form("where run=%d", GetCurrentRun());
1484         if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1485                 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1486                                                 GetCurrentRun()));
1487                 return hasError == kFALSE;
1488         }
1489
1490         AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1491                                                 (checkEntryArray.At(0));
1492
1493         if (checkEntry)
1494         {
1495                 if (checkEntry->IsDone())
1496                 {
1497                         Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1498                         UpdateShuttleLogbook("shuttle_done");
1499                 }
1500                 else
1501                 {
1502                         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1503                         {
1504                                 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1505                                 {
1506                                         AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1507                                                         checkEntry->GetRun(), GetDetName(iDet)));
1508                                         fFirstUnprocessed[iDet] = kFALSE;
1509                                 }
1510                         }
1511                 }
1512         }
1513
1514         // remove ML instance
1515         delete fMonaLisa;
1516         fMonaLisa = 0;
1517
1518         fLogbookEntry = 0;
1519
1520         return hasError == kFALSE;
1521 }
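
/*
Illustrative sketch, not part of AliShuttle. In the client branch above the
child calls gSystem->Exit(success), so an exit code of 1 means success, and
the parent treats a return code of 0 as an error. The helper below is
hypothetical (its name and the -1 failure value are assumptions) and shows
the same convention with plain POSIX calls.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks, lets the child exit with `success` as its
// exit code, and returns that code as seen by the parent (-1 on failure).
int ExitCodeSeenByParent(bool success)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 // fork failed
    if (pid == 0)
        _exit(success ? 1 : 0);    // child: 1 means success, 0 means failure
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;                 // child did not terminate normally
    return WEXITSTATUS(status);    // parent: 0 therefore signals an error
}
```

Under this convention a parent would set its error flag when the returned
value is 0, which is the inverse of the usual shell convention.
*/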
1522
1523 //______________________________________________________________________________________________
1524 Bool_t AliShuttle::ProcessCurrentDetector()
1525 {
1526         //
1527         // Makes data retrieval just for a specific detector (fCurrentDetector).
1528         // There should be a configuration for this detector.
1529
1530         AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
1531
1532         if (!CleanReferenceStorage(fCurrentDetector.Data()))
1533                 return kFALSE;
1534
1535         TMap* dcsMap = 0;
1536
1537         // call preprocessor
1538         AliPreprocessor* aPreprocessor =
1539                 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1540
1541         aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1542
1543         Bool_t processDCS = aPreprocessor->ProcessDCS();
1544
1545         if (!processDCS)
1546         {
1547                 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1548         }
1549         else if (fTestMode & kSkipDCS)
1550         {
1551                 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
1552         } 
1553         else if (fTestMode & kErrorDCS)
1554         {
1555                 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1556                 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1557                 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1558                 return kFALSE;
1559         } else {
1560
1561                 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1562
1563                 TString host(fConfig->GetDCSHost(fCurrentDetector));
1564                 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1565
1566                 if (fConfig->GetDCSAliases(fCurrentDetector)->GetEntries() > 0)
1567                 {
1568                         dcsMap = GetValueSet(host, port, fConfig->GetDCSAliases(fCurrentDetector), kAlias);
1569                         if (!dcsMap)
1570                         {
1571                                 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS aliases");
1572                                 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1573                                 return kFALSE;
1574                         }
1575                 }
1576                 
1577                 if (fConfig->GetDCSDataPoints(fCurrentDetector)->GetEntries() > 0)
1578                 {
1579                         TMap* dcsMap2 = GetValueSet(host, port, fConfig->GetDCSDataPoints(fCurrentDetector), kDP);
1580                         if (!dcsMap2)
1581                         {
1582                                 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS data points");
1583                                 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1584                                 if (dcsMap)
1585                                         delete dcsMap;
1586                                 return kFALSE;
1587                         }
1588                         
1589                         if (!dcsMap)
1590                         {
1591                                 dcsMap = dcsMap2;
1592                         }
1593                         else // merge
1594                         {
1595                                 TIter iter(dcsMap2);
1596                                 TObjString* key = 0;
1597                                 while ((key = (TObjString*) iter.Next()))
1598                                         dcsMap->Add(key, dcsMap2->GetValue(key->String()));
1599                                         
1600                                 dcsMap2->SetOwner(kFALSE);
1601                                 delete dcsMap2;
1602                         }
1603                 }
1604                 
1605         }
1606
1607         // still no map?
1608         if (!dcsMap)
1609                 dcsMap = new TMap;
1610         
1611         // DCS Archive DB processing successful. Call Preprocessor!
1612         UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1613
1614         UInt_t returnValue = aPreprocessor->Process(dcsMap);
1615
1616         if (returnValue > 0) // Preprocessor error!
1617         {
1618                 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %u.", returnValue));
1619                 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1620                 dcsMap->DeleteAll();
1621                 delete dcsMap;
1622                 return kFALSE;
1623         }
1624         
1625         // preprocessor ok!
1626         UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1627         Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1628                                 fCurrentDetector.Data()));
1629
1630         dcsMap->DeleteAll();
1631         delete dcsMap;
1632
1633         return kTRUE;
1634 }
1635
1636 //______________________________________________________________________________________________
1637 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1638                 TObjArray& entries)
1639 {
1640         // Queries DAQ's Shuttle logbook and fills the detector status object.
1641         // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1642         //
1643
1644         entries.SetOwner(1);
1645
1646         // check connection; connect if needed
1647         if(!Connect(3)) return kFALSE;
1648
1649         TString sqlQuery;
1650         sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1651
1652         TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1653         if (!aResult) {
1654                 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1655                 return kFALSE;
1656         }
1657
1658         AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1659
1660         if(aResult->GetRowCount() == 0) {
1661                 AliInfo("No entries in Shuttle Logbook match request");
1662                 delete aResult;
1663                 return kTRUE;
1664         }
1665
1666         // TODO Check field count!
1667         const UInt_t nCols = 23;
1668         if (aResult->GetFieldCount() != (Int_t) nCols) {
1669                 AliError("Invalid SQL result field number!");
1670                 delete aResult;
1671                 return kFALSE;
1672         }
1673
1674         TSQLRow* aRow;
1675         while ((aRow = aResult->Next())) {
1676                 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1677                 Int_t run = runString.Atoi();
1678
1679                 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1680                 if (!entry)
1681                         continue;
1682
1683                 // loop on detectors
1684                 for(UInt_t ii = 0; ii < nCols; ii++)
1685                         entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1686
1687                 entries.AddLast(entry);
1688                 delete aRow;
1689         }
1690
1691         delete aResult;
1692         return kTRUE;
1693 }
1694
1695 //______________________________________________________________________________________________
1696 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1697 {
1698         //
1699         // Retrieve run parameters written in the DAQ logbook and sets them into AliShuttleLogbookEntry object
1700         //
1701
1702         // check connection; connect if needed
1703         if (!Connect(3))
1704                 return 0;
1705
1706         TString sqlQuery;
1707         sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1708
1709         TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1710         if (!aResult) {
1711                 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1712                 return 0;
1713         }
1714
1715         if (aResult->GetRowCount() == 0) {
1716                 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1717                 delete aResult;
1718                 return 0;
1719         }
1720
1721         if (aResult->GetRowCount() > 1) {
1722                 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1723                 delete aResult;
1724                 return 0;
1725         }
1726
1727         TSQLRow* aRow = aResult->Next();
1728         if (!aRow)
1729         {
1730                 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1731                 delete aResult;
1732                 return 0;
1733         }
1734
1735         AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
1736
1737         for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1738                 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
1739
1740         UInt_t startTime = entry->GetStartTime();
1741         UInt_t endTime = entry->GetEndTime();
1742
1743         if (!startTime || !endTime || startTime > endTime) {
1744                 Log("SHUTTLE",
1745                         Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
1746                                 run, startTime, endTime));
1747                 delete entry;
1748                 delete aRow;
1749                 delete aResult;
1750                 return 0;
1751         }
1752
1753         delete aRow;
1754         delete aResult;
1755
1756         return entry;
1757 }
1758
1759 //______________________________________________________________________________________________
1760 Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1761                                 TObjArray* valueSet, DCSType type)
1762 {
1763         // Retrieve all "entry" data points from the DCS server
1764         // host, port: TSocket connection parameters
1765         // entry: name of the alias or data point
1766         // valueSet: array of retrieved AliDCSValue's
1767         // type: kAlias or kDP
1768
1769         AliDCSClient client(host, port, fTimeout, fRetries);
1770         if (!client.IsConnected())
1771         {
1772                 return kFALSE;
1773         }
1774
1775         Int_t result=0;
1776
1777         if (type == kAlias)
1778         {
1779                 result = client.GetAliasValues(entry,
1780                         GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1781         } else
1782         if (type == kDP)
1783         {
1784                 result = client.GetDPValues(entry,
1785                         GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1786         }
1787
1788         if (result < 0)
1789         {
1790                 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
1791                         entry, AliDCSClient::GetErrorString(result)));
1792
1793                 if (result == AliDCSClient::fgkServerError)
1794                 {
1795                         Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
1796                                 client.GetServerError().Data()));
1797                 }
1798
1799                 return kFALSE;
1800         }
1801
1802         return kTRUE;
1803 }
1804
1805 //______________________________________________________________________________________________
1806 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
1807                               DCSType type)
1808 {
1809         // Retrieve all "entry" data points from the DCS server
1810         // host, port: TSocket connection parameters
1811         // entries: list of alias or data point names
1812         // type: kAlias or kDP
1813         // returns TMap of values, 0 when failure
1814
1815         const Int_t kSplit = 100; // maximum number of DPs at a time
1816         
1817         Int_t totalEntries = entries->GetEntries();
1818         
1819         TMap* result = 0;
1820         
1821         for (Int_t index=0; index < totalEntries; index += kSplit)
1822         {
1823                 Int_t endIndex = index + kSplit;
1824         
1825                 AliDCSClient client(host, port, fTimeout, fRetries);
1826                 if (!client.IsConnected())
1827                         return 0;
1828
1829                 TMap* partialResult = 0;
1830
1831                 if (type == kAlias)
1832                 {
1833                         partialResult = client.GetAliasValues(entries, GetCurrentStartTime(), 
1834                                 GetCurrentEndTime(), index, endIndex);
1835                 } 
1836                 else if (type == kDP)
1837                 {
1838                         partialResult = client.GetDPValues(entries, GetCurrentStartTime(), 
1839                                 GetCurrentEndTime(), index, endIndex);
1840                 }
1841
1842                 if (partialResult == 0)
1843                 {
1844                         Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries (%d...%d)! Reason: %s",
1845                                 index, endIndex, client.GetServerError().Data()));
1846         
1847                         if (result)
1848                                 delete result;
1849                                 
1850                         return 0;
1851                 }
1852                 
1853                 AliInfo(Form("Retrieved entries %d..%d (total %d); E.g. %s has %d values collected",
1854                                         index, endIndex, totalEntries, entries->At(index)->GetName(), ((TObjArray*)
1855                                         partialResult->GetValue(entries->At(index)->GetName()))->GetEntriesFast()));
1856                 
1857                 if (!result)
1858                 {
1859                         result = partialResult;
1860                 }
1861                 else
1862                 {               
1863                         TIter iter(partialResult);
1864                         TObjString* key = 0;
1865                         while ((key = (TObjString*) iter.Next()))
1866                                 result->Add(key, partialResult->GetValue(key->String()));
1867                                 
1868                         partialResult->SetOwner(kFALSE);
1869                         delete partialResult;
1870                 }
1871         
1872         }
1873
1874         return result;
1875 }
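
/*
Illustrative sketch, not part of AliShuttle. The function above requests the
entries in ranges [index, index + kSplit). The hypothetical helper below
reproduces that loop in plain C++. Unlike the original, it clamps the last
range to the total number of entries; that clamping is an assumption made for
the sketch, not something AliDCSClient is known to require.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: returns half-open [begin, end) ranges covering
// `total` entries, with at most `split` entries per range.
std::vector<std::pair<int, int> > SplitRanges(int total, int split)
{
    std::vector<std::pair<int, int> > ranges;
    if (split <= 0)
        return ranges; // a non-positive chunk size yields no ranges
    for (int index = 0; index < total; index += split)
        ranges.push_back(std::make_pair(index, std::min(index + split, total)));
    return ranges;
}
```

For 250 entries and a split of 100 this yields three ranges, the last one
covering entries 200 to 249.
*/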
1876 //______________________________________________________________________________________________
1877 const char* AliShuttle::GetFile(Int_t system, const char* detector,
1878                 const char* id, const char* source)
1879 {
1880         // Get calibration file from file exchange servers
1881         // First queries the FXS database for the file name, using the run, detector, id and source info
1882         // then calls RetrieveFile(filename) for actual copy to local disk
1883         // run: current run being processed (given by Logbook entry fLogbookEntry)
1884         // detector: the Preprocessor name
1885         // id: provided as a parameter by the Preprocessor
1886         // source: provided by the Preprocessor through GetFileSources function
1887
1888         // check if test mode should simulate a FXS error
1889         if (fTestMode & kErrorFXSFiles)
1890         {
1891                 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1892                 return 0;
1893         }
1894         
1895         // check connection; connect if needed
1896         if (!Connect(system))
1897         {
1898                 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
1899                 return 0;
1900         }
1901
1902         // Query preparation
1903         TString sourceName(source);
1904         Int_t nFields = 3;
1905         TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1906                                                                 fConfig->GetFXSdbTable(system));
1907         TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1908                                                                 GetCurrentRun(), detector, id);
1909
1910         if (system == kDAQ)
1911         {
1912                 whereClause += Form(" and DAQsource=\"%s\"", source);
1913         }
1914         else if (system == kDCS)
1915         {
1916                 sourceName="none";
1917         }
1918         else if (system == kHLT)
1919         {
1920                 whereClause += Form(" and DDLnumbers=\"%s\"", source);
1921                 nFields = 3;
1922         }
1923
1924         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1925
1926         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1927
1928         // Query execution
1929         TSQLResult* aResult = 0;
1930         aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
1931         if (!aResult) {
1932                 Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
1933                                 GetSystemName(system), id, sourceName.Data()));
1934                 return 0;
1935         }
1936
1937         if(aResult->GetRowCount() == 0)
1938         {
1939                 Log(detector,
1940                         Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
1941                                 GetSystemName(system), id, sourceName.Data()));
1942                 delete aResult;
1943                 return 0;
1944         }
1945
1946         if (aResult->GetRowCount() > 1) {
1947                 Log(detector,
1948                         Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
1949                                 GetSystemName(system), id, sourceName.Data()));
1950                 delete aResult;
1951                 return 0;
1952         }
1953
1954         if (aResult->GetFieldCount() != nFields) {
1955                 Log(detector,
1956                         Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
1957                                 GetSystemName(system), id, sourceName.Data()));
1958                 delete aResult;
1959                 return 0;
1960         }
1961
1962         TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1963
1964         if (!aRow){
1965                 Log(detector, Form("GetFile - Empty result set in %s FXS db from query: id = %s, source = %s",
1966                                 GetSystemName(system), id, sourceName.Data()));
1967                 delete aResult;
1968                 return 0;
1969         }
1970
1971         TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1972         TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
1973         TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
1974
1975         delete aResult;
1976         delete aRow;
1977
1978         AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1979                                 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
1980
1981         // retrieved file is renamed to make it unique
1982         TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1983                                         GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1984
1985
1986         // file retrieval from FXS
1987         UInt_t nRetries = 0;
1988         UInt_t maxRetries = 3;
1989         Bool_t result = kFALSE;
1990
1991         // copy!! if successful TSystem::Exec returns 0
1992         while(nRetries++ < maxRetries) {
1993                 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1994                 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1995                 if(!result)
1996                 {
1997                         Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
1998                                         filePath.Data(), GetSystemName(system)));
1999                         continue;
2000                 } 
2001
2002                 if (fileChecksum.Length()>0)
2003                 {
2004                         // compare md5sum of local file with the one stored in the FXS DB
2005                         Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
2006                                                 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
2007
2008                         if (md5Comp != 0)
2009                         {
2010                                 Log(detector, Form("GetFile - md5sum of file %s does not match the local copy!",
2011                                                         filePath.Data()));
2012                                 result = kFALSE;
2013                                 continue;
2014                         }
2015                 } else {
2016                         Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2017                                                         filePath.Data(), GetSystemName(system)));
2018                 }
2019                 if (result) break;
2020         }
2021
2022         if(!result) return 0;
2023
2024         fFXSCalled[system]=kTRUE;
2025         TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2026         fFXSlist[system].Add(fileParams);
2027
2028         static TString fullLocalFileName;
2029         fullLocalFileName.Form("%s/%s", GetShuttleTempDir(), localFileName.Data());
2030
2031         Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and source %s from %s to %s", id, source, GetSystemName(system), fullLocalFileName.Data()));
2032
2033         return fullLocalFileName.Data();
2034 }
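
/*
Illustrative sketch, not part of AliShuttle. The retrieval loop above retries
up to maxRetries times and accepts a file only when both the copy and the
checksum comparison succeed. The hypothetical helper below shows the same
control flow in plain C++; the callbacks stand in for RetrieveFile and the
md5sum comparison and are assumptions made for the sketch.

```cpp
#include <functional>

// Hypothetical helper: runs `fetch` and then `verify`, for at most
// `maxRetries` attempts. Returns true as soon as one attempt passes both.
bool RetryFetch(unsigned maxRetries,
                const std::function<bool()>& fetch,
                const std::function<bool()>& verify)
{
    for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
        if (!fetch())
            continue;   // copy failed: try again
        if (!verify())
            continue;   // checksum mismatch: try again
        return true;    // copied and verified
    }
    return false;       // all attempts exhausted
}
```

A caller would treat a false return the way GetFile treats a failed
retrieval, that is, by reporting the error and returning no file name.
*/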
2035
2036 //______________________________________________________________________________________________
2037 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2038 {
2039         //
2040         // Copies file from FXS to local Shuttle machine
2041         //
2042
2043         // check temp directory: trying to cd to temp; if it does not exist, create it
2044         AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
2045                         fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
2046
2047         void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
2048         if (dir == NULL) {
2049                 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
2050                         AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
2051                         return kFALSE;
2052                 }
2053
2054         } else {
2055                 gSystem->FreeDirectory(dir);
2056         }
2057
2058         TString baseFXSFolder;
2059         if (system == kDAQ)
2060         {
2061                 baseFXSFolder = "FES/";
2062         }
2063         else if (system == kDCS)
2064         {
2065                 baseFXSFolder = "";
2066         }
2067         else if (system == kHLT)
2068         {
2069                 baseFXSFolder = "/opt/FXS/";
2070         }
2071
2072
2073         TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
2074                 fConfig->GetFXSPort(system),
2075                 fConfig->GetFXSUser(system),
2076                 fConfig->GetFXSHost(system),
2077                 baseFXSFolder.Data(),
2078                 fxsFileName,
2079                 GetShuttleTempDir(),
2080                 localFileName);
2081
2082         AliDebug(2, Form("%s",command.Data()));
2083
2084         Bool_t result = (gSystem->Exec(command.Data()) == 0);
2085
2086         return result;
2087 }
2088
2089 //______________________________________________________________________________________________
2090 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2091 {
2092         //
2093         // Get sources producing the condition file Id from file exchange servers
2094         // if id is NULL all sources are returned (distinct)
2095         //
2096
2097         Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2098         
2099         // check if test mode should simulate a FXS error
2100         if (fTestMode & kErrorFXSSources)
2101         {
2102                 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2103                 return 0;
2104         }
2105
2106         if (system == kDCS)
2107         {
2108                 AliWarning("DCS system has only one source of data!");
2109                 TList *list = new TList();
2110                 list->SetOwner(1);
2111                 list->Add(new TObjString(" "));
2112                 return list;
2113         }
2114
2115         // check connection; connect if needed
2116         if (!Connect(system))
2117         {
2118                 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2119                 return NULL;
2120         }
2121
2122         TString sourceName;
2123         if (system == kDAQ)
2124         {
2125                 sourceName = "DAQsource";
2126         } else if (system == kHLT)
2127         {
2128                 sourceName = "DDLnumbers";
2129         }
2130
2131         TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2132         TString whereClause = Form("run=%d and detector=\"%s\"",
2133                                 GetCurrentRun(), detector);
2134         if (id)
2135                 whereClause += Form(" and fileId=\"%s\"", id);
2136         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2137
2138         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2139
2140         // Query execution
2141         TSQLResult* aResult;
2142         aResult = fServer[system]->Query(sqlQuery);
2143         if (!aResult) {
2144                 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2145                                 GetSystemName(system), id));
2146                 return 0;
2147         }
2148
2149         TList *list = new TList();
2150         list->SetOwner(1);
2151         
2152         if (aResult->GetRowCount() == 0)
2153         {
2154                 Log(detector,
2155                         Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2156                 delete aResult;
2157                 return list;
2158         }
2159
2160         Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2161
2162         TSQLRow* aRow;
2163         while ((aRow = aResult->Next()))
2164         {
2165
2166                 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2167                 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2168                 list->Add(new TObjString(source));
2169                 delete aRow;
2170         }
2171
2172         delete aResult;
2173
2174         return list;
2175 }
2176
2177 //______________________________________________________________________________________________
2178 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2179 {
2180         //
2181         // Get all ids of condition files produced by a given source from file exchange servers
2182         //
2183         
2184         Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2185
2186         // check if test mode should simulate a FXS error
2187         if (fTestMode & kErrorFXSSources)
2188         {
2189                 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2190                 return 0;
2191         }
2192
2193         // check connection; connect if not already connected
2194         if (!Connect(system))
2195         {
2196                 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2197                 return NULL;
2198         }
2199
2200         TString sourceName;
2201         if (system == kDAQ)
2202         {
2203                 sourceName = "DAQsource";
2204         } else if (system == kHLT)
2205         {
2206                 sourceName = "DDLnumbers";
2207         }
2208
2209         TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2210         TString whereClause = Form("run=%d and detector=\"%s\"",
2211                                 GetCurrentRun(), detector);
2212         if (sourceName.Length() > 0 && source)
2213                 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2214         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2215
2216         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2217
2218         // Query execution
2219         TSQLResult* aResult;
2220         aResult = fServer[system]->Query(sqlQuery);
2221         if (!aResult) {
2222                 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2223                                 GetSystemName(system), source));
2224                 return 0;
2225         }
2226
2227         TList *list = new TList();
2228         list->SetOwner(1);
2229         
2230         if (aResult->GetRowCount() == 0)
2231         {
2232                 Log(detector,
2233                         Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2234                 delete aResult;
2235                 return list;
2236         }
2237
2238         Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2239
2240         TSQLRow* aRow;
2241
2242         while ((aRow = aResult->Next()))
2243         {
2244
2245                 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2246                 AliDebug(2, Form("fileId = %s", id.Data()));
2247                 list->Add(new TObjString(id));
2248                 delete aRow;
2249         }
2250
2251         delete aResult;
2252
2253         return list;
2254 }
2255
2256 //______________________________________________________________________________________________
2257 Bool_t AliShuttle::Connect(Int_t system)
2258 {
2259         // Connect to MySQL Server of the system's FXS MySQL databases
2260         // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2261         //
2262
2263         // check connection: if already connected return
2264         if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2265
2266         TString dbHost, dbUser, dbPass, dbName;
2267
2268         if (system < 3) // FXS db servers
2269         {
2270                 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2271                 dbUser = fConfig->GetFXSdbUser(system);
2272                 dbPass = fConfig->GetFXSdbPass(system);
2273                 dbName =   fConfig->GetFXSdbName(system);
2274         } else { // Run & Shuttle logbook servers
2275         // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2276                 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2277                 dbUser = fConfig->GetDAQlbUser();
2278                 dbPass = fConfig->GetDAQlbPass();
2279                 dbName =   fConfig->GetDAQlbDB();
2280         }
2281
2282         fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2283         if (!fServer[system] || !fServer[system]->IsConnected()) {
2284                 if(system < 3)
2285                 {
2286                 AliError(Form("Can't establish connection to FXS database for %s",
2287                                         AliShuttleInterface::GetSystemName(system)));
2288                 } else {
2289                 AliError("Can't establish connection to Run logbook.");
2290                 }
2291                 delete fServer[system]; fServer[system] = 0; // reset so the next Connect() does not touch a deleted server
2292                 return kFALSE;
2293         }
2294
2295         // Get tables
2296         TSQLResult* aResult=0;
2297         switch(system){
2298                 case kDAQ:
2299                         aResult = fServer[kDAQ]->GetTables(dbName.Data());
2300                         break;
2301                 case kDCS:
2302                         aResult = fServer[kDCS]->GetTables(dbName.Data());
2303                         break;
2304                 case kHLT:
2305                         aResult = fServer[kHLT]->GetTables(dbName.Data());
2306                         break;
2307                 default:
2308                         aResult = fServer[3]->GetTables(dbName.Data());
2309                         break;
2310         }
2311
2312         delete aResult;
2313         return kTRUE;
2314 }
2315
2316 //______________________________________________________________________________________________
2317 Bool_t AliShuttle::UpdateTable()
2318 {
2319         //
2320         // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2321         //
2322
2323         Bool_t result = kTRUE;
2324
2325         for (UInt_t system=0; system<3; system++)
2326         {
2327                 if(!fFXSCalled[system]) continue;
2328
2329                 // check connection; connect if not already connected
2330                 if (!Connect(system))
2331                 {
2332                         Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2333                         result = kFALSE;
2334                         continue;
2335                 }
2336
2337                 TTimeStamp now; // now
2338
2339                 // Loop on FXS list entries
2340                 TIter iter(&fFXSlist[system]);
2341                 TObjString *aFXSentry=0;
2342                 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2343                 {
2344                         TString aFXSentrystr = aFXSentry->String();
2345                         TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2346                         if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2347                         {
2348                                 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2349                                         GetSystemName(system), aFXSentrystr.Data()));
2350                                 if(aFXSarray) delete aFXSarray;
2351                                 result = kFALSE;
2352                                 continue;
2353                         }
2354                         const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2355                         const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2356
2357                         TString whereClause;
2358                         if (system == kDAQ)
2359                         {
2360                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2361                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2362                         }
2363                         else if (system == kDCS)
2364                         {
2365                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2366                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId);
2367                         }
2368                         else if (system == kHLT)
2369                         {
2370                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2371                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2372                         }
2373
2374                         delete aFXSarray;
2375
2376                         TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2377                                                                 (Long_t) now.GetSec(), whereClause.Data());
2378
2379                         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2380
2381                         // Query execution
2382                         TSQLResult* aResult;
2383                         aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2384                         if (!aResult)
2385                         {
2386                                 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2387                                                                 GetSystemName(system), sqlQuery.Data()));
2388                                 result = kFALSE;
2389                                 continue;
2390                         }
2391                         delete aResult;
2392                 }
2393         }
2394
2395         return result;
2396 }
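
Note on the entry format parsed in UpdateTable: TString::Tokenize treats its argument as a set of delimiter characters rather than as one literal separator, so "#!?!#" splits on any of '#', '!' or '?'. The standalone sketch below mimics that behavior with std::string to show why an entry whose fileId itself contains one of those characters would no longer yield exactly two tokens. The helper name SplitOnAnyOf and the sample strings are hypothetical and are not part of AliShuttle or ROOT.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on ANY character contained in `delims`,
// skipping empty tokens (the same semantics as a character-set tokenizer).
std::vector<std::string> SplitOnAnyOf(const std::string& text, const std::string& delims)
{
    std::vector<std::string> tokens;
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        // end is npos when the token runs to the end of the string;
        // substr() then simply takes the remainder.
        std::string::size_type end = text.find_first_of(delims, start);
        tokens.push_back(text.substr(start, end - start));
        start = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```

With these semantics, SplitOnAnyOf("PEDESTALS#!?!#LDC0", "#!?!#") gives the two tokens the update loop expects, while a made-up id such as "PED#1" produces three tokens and would take the error branch.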
2397
2398 //______________________________________________________________________________________________
2399 Bool_t AliShuttle::UpdateTableFailCase()
2400 {
2401         // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2402         // this is called in case the preprocessor is declared failed for the current run, because
2403         // the fields are updated only in case of success
2404
2405         Bool_t result = kTRUE;
2406
2407         for (UInt_t system=0; system<3; system++)
2408         {
2409                 // check connection; connect if not already connected
2410                 if (!Connect(system))
2411                 {
2412                         Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2413                                                         GetSystemName(system)));
2414                         result = kFALSE;
2415                         continue;
2416                 }
2417
2418                 TTimeStamp now; // now
2419                 // Update all rows of the current run and detector with a single query
2420                 // Loop on FXS list entries
2421
2422                 TString whereClause = Form("where run=%d and detector=\"%s\";",
2423                                                 GetCurrentRun(), fCurrentDetector.Data());
2424
2425
2426                 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2427                                                         (Long_t) now.GetSec(), whereClause.Data());
2428
2429                 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2430
2431                 // Query execution
2432                 TSQLResult* aResult;
2433                 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2434                 if (!aResult)
2435                 {
2436                         Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2437                                                         GetSystemName(system), sqlQuery.Data()));
2438                         result = kFALSE;
2439                         continue;
2440                 }
2441                 delete aResult;
2442         }
2443
2444         return result;
2445 }
2446
2447 //______________________________________________________________________________________________
2448 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2449 {
2450         //
2451         // Update Shuttle logbook filling detector or shuttle_done column
2452         // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2453         //
2454
2455         // check connection; connect if not already connected
2456         if(!Connect(3)){
2457                 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2458                 return kFALSE;
2459         }
2460
2461         TString detName(detector);
2462         TString setClause;
2463         if(detName == "shuttle_done")
2464         {
2465                 setClause = "set shuttle_done=1";
2466
2467                 // Send the information to ML
2468                 TMonaLisaText  mlStatus("SHUTTLE_status", "Done");
2469
2470                 TList mlList;
2471                 mlList.Add(&mlStatus);
2472
2473                 fMonaLisa->SendParameters(&mlList);
2474         } else {
2475                 TString statusStr(status);
2476                 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2477                    statusStr.Contains("failed", TString::kIgnoreCase)){
2478                         setClause = Form("set %s=\"%s\"", detector, status);
2479                 } else {
2480                         Log("SHUTTLE",
2481                                 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2482                                         status, detector));
2483                         return kFALSE;
2484                 }
2485         }
2486
2487         TString whereClause = Form("where run=%d", GetCurrentRun());
2488
2489         TString sqlQuery = Form("update %s %s %s",
2490                                         fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2491
2492         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2493
2494         // Query execution
2495         TSQLResult* aResult;
2496         aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2497         if (!aResult) {
2498                 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2499                 return kFALSE;
2500         }
2501         delete aResult;
2502
2503         return kTRUE;
2504 }
2505
2506 //______________________________________________________________________________________________
2507 Int_t AliShuttle::GetCurrentRun() const
2508 {
2509         //
2510         // Get current run from logbook entry
2511         //
2512
2513         return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2514 }
2515
2516 //______________________________________________________________________________________________
2517 UInt_t AliShuttle::GetCurrentStartTime() const
2518 {
2519         //
2520         // get current start time
2521         //
2522
2523         return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2524 }
2525
2526 //______________________________________________________________________________________________
2527 UInt_t AliShuttle::GetCurrentEndTime() const
2528 {
2529         //
2530         // get current end time from logbook entry
2531         //
2532
2533         return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2534 }
2535
2536 //______________________________________________________________________________________________
2537 void AliShuttle::Log(const char* detector, const char* message)
2538 {
2539         //
2540         // Fill log string with a message
2541         //
2542
2543         void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2544         if (dir == NULL) {
2545                 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2546                         AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
2547                         return;
2548                 }
2549
2550         } else {
2551                 gSystem->FreeDirectory(dir);
2552         }
2553
2554         TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2555         if (GetCurrentRun() >= 0) 
2556                 toLog += Form("run %d - ", GetCurrentRun());
2557         toLog += Form("%s", message);
2558
2559         AliInfo(toLog.Data());
2560         
2561         // if we redirect the log output already to the file, leave here
2562         if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2563                 return;
2564
2565         TString fileName = GetLogFileName(detector);
2566         
2567         gSystem->ExpandPathName(fileName);
2568
2569         ofstream logFile;
2570         logFile.open(fileName, ofstream::out | ofstream::app);
2571
2572         if (!logFile.is_open()) {
2573                 AliError(Form("Could not open file %s", fileName.Data()));
2574                 return;
2575         }
2576
2577         logFile << toLog.Data() << "\n";
2578
2579         logFile.close();
2580 }
2581
2582 //______________________________________________________________________________________________
2583 TString AliShuttle::GetLogFileName(const char* detector) const
2584 {
2585         // 
2586         // returns the name of the log file for a given sub detector
2587         //
2588         
2589         TString fileName;
2590         
2591         if (GetCurrentRun() >= 0) 
2592                 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2593         else
2594                 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2595
2596         return fileName;
2597 }
2598
2599 //______________________________________________________________________________________________
2600 Bool_t AliShuttle::Collect(Int_t run)
2601 {
2602         //
2603         // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
2604         // If a dedicated run is given this run is processed
2605         //
2606         // In operational mode, this is the Shuttle function triggered by the EOR signal.
2607         //
2608
2609         if (run == -1)
2610                 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2611         else
2612                 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2613
2614         SetLastAction("Starting");
2615
2616         TString whereClause("where shuttle_done=0");
2617         if (run != -1)
2618                 whereClause += Form(" and run=%d", run);
2619
2620         TObjArray shuttleLogbookEntries;
2621         if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2622         {
2623                 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2624                 return kFALSE;
2625         }
2626
2627         if (shuttleLogbookEntries.GetEntries() == 0)
2628         {
2629                 if (run == -1)
2630                         Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2631                 else
2632                         Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2633                                                 "or it does not exist in Shuttle logbook", run));
2634                 return kTRUE;
2635         }
2636
2637         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2638                 fFirstUnprocessed[iDet] = kTRUE;
2639
2640         if (run != -1)
2641         {
2642                 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2643                 // flag them into fFirstUnprocessed array
2644                 TString earlierRunsClause(Form("where shuttle_done=0 and run < %d", run));
2645                 TObjArray tmpLogbookEntries;
2646                 if (!QueryShuttleLogbook(earlierRunsClause, tmpLogbookEntries))
2647                 {
2648                         Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2649                         return kFALSE;
2650                 }
2651
2652                 TIter iter(&tmpLogbookEntries);
2653                 AliShuttleLogbookEntry* anEntry = 0;
2654                 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2655                 {
2656                         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2657                         {
2658                                 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2659                                 {
2660                                         AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2661                                                         anEntry->GetRun(), GetDetName(iDet)));
2662                                         fFirstUnprocessed[iDet] = kFALSE;
2663                                 }
2664                         }
2665
2666                 }
2667
2668         }
2669
2670         if (!RetrieveConditionsData(shuttleLogbookEntries))
2671         {
2672                 Log("SHUTTLE", "Collect - Process of at least one run failed");
2673                 return kFALSE;
2674         }
2675
2676         Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
2677         return kTRUE;
2678 }
2679
2680 //______________________________________________________________________________________________
2681 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2682 {
2683         //
2684         // Retrieve conditions data for all runs that aren't processed yet
2685         //
2686
2687         Bool_t hasError = kFALSE;
2688
2689         TIter iter(&dateEntries);
2690         AliShuttleLogbookEntry* anEntry;
2691
2692         while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2693                 if (!Process(anEntry)){
2694                         hasError = kTRUE;
2695                 }
2696
2697                 // clean SHUTTLE temp directory
2698                 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2699                 RemoveFile(filename.Data());
2700         }
2701
2702         return hasError == kFALSE;
2703 }
2704
2705 //______________________________________________________________________________________________
2706 ULong_t AliShuttle::GetTimeOfLastAction() const
2707 {
2708         //
2709         // Gets time of last action
2710         //
2711
2712         ULong_t tmp;
2713
2714         fMonitoringMutex->Lock();
2715
2716         tmp = fLastActionTime;
2717
2718         fMonitoringMutex->UnLock();
2719
2720         return tmp;
2721 }
2722
2723 //______________________________________________________________________________________________
2724 const TString AliShuttle::GetLastAction() const
2725 {
2726         //
2727         // returns a string description of the last action
2728         //
2729
2730         TString tmp;
2731
2732         fMonitoringMutex->Lock();
2733         
2734         tmp = fLastAction;
2735         
2736         fMonitoringMutex->UnLock();
2737
2738         return tmp;
2739 }
2740
2741 //______________________________________________________________________________________________
2742 void AliShuttle::SetLastAction(const char* action)
2743 {
2744         //
2745         // updates the monitoring variables
2746         //
2747
2748         fMonitoringMutex->Lock();
2749
2750         fLastAction = action;
2751         fLastActionTime = time(0);
2752         
2753         fMonitoringMutex->UnLock();
2754 }
2755
2756 //______________________________________________________________________________________________
2757 const char* AliShuttle::GetRunParameter(const char* param)
2758 {
2759         //
2760         // returns run parameter read from DAQ logbook
2761         //
2762
2763         if(!fLogbookEntry) {
2764                 AliError("No logbook entry!");
2765                 return 0;
2766         }
2767
2768         return fLogbookEntry->GetRunParameter(param);
2769 }
2770
2771 //______________________________________________________________________________________________
2772 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
2773 {
2774         //
2775         // returns object from OCDB valid for current run
2776         //
2777
2778         if (fTestMode & kErrorOCDB)
2779         {
2780                 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2781                 return 0;
2782         }
2783         
2784         AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2785         if (!sto)
2786         {
2787                 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
2788                 return 0;
2789         }
2790
2791         return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2792 }
2793
2794 //______________________________________________________________________________________________
2795 Bool_t AliShuttle::SendMail()
2796 {
2797         //
2798         // sends a mail to the subdetector expert in case of preprocessor error
2799         //
2800         
2801         if (fTestMode != kNone)
2802                 return kTRUE;
2803
2804         void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2805         if (dir == NULL)
2806         {
2807                 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
2808                 {
2809                         AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
2810                         return kFALSE;
2811                 }
2812
2813         } else {
2814                 gSystem->FreeDirectory(dir);
2815         }
2816
2817         TString bodyFileName;
2818         bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
2819         gSystem->ExpandPathName(bodyFileName);
2820
2821         ofstream mailBody;
2822         mailBody.open(bodyFileName, ofstream::out);
2823
2824         if (!mailBody.is_open())
2825         {
2826                 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2827                 return kFALSE;
2828         }
2829
2830         TString to="";
2831         TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2832         TObjString *anExpert=0;
2833         while ((anExpert = (TObjString*) iterExperts.Next()))
2834         {
2835                 to += Form("%s,", anExpert->GetName());
2836         }
2837         if (to.IsNull()) {
2838                 AliInfo("List of detector responsibles not yet set!");
2839                 return kFALSE;
2840         }
2841
2842         to.Remove(to.Length()-1); // strip the trailing comma
2843         AliDebug(2, Form("to: %s",to.Data()));
2844
2845         TString cc="alberto.colla@cern.ch";
2846
2847         TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
2848                                 fCurrentDetector.Data(), GetCurrentRun());
2849         AliDebug(2, Form("subject: %s", subject.Data()));
2850
2851         TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2852         body += Form("SHUTTLE just detected that your preprocessor "
2853                         "failed processing run %d!!\n\n", GetCurrentRun());
2854         body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
2855         body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
2856         body += Form("Find the %s log for the current run on \n\n"
2857                 "\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n", 
2858                 fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
2859         body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
2860
2861         AliDebug(2, Form("Body begin: %s", body.Data()));
2862
2863         mailBody << body.Data();
2864         mailBody.close();
2865         mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2866
2867         TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
2868         TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2869         if (gSystem->Exec(tailCommand.Data()))
2870         {
2871                 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2872         }
2873
2874         TString endBody = Form("------------------------------------------------------\n\n");
2875         endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2876         endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
2877         endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
2878
2879         AliDebug(2, Form("Body end: %s", endBody.Data()));
2880
2881         mailBody << endBody.Data();
2882
2883         mailBody.close();
2884
2885         // send mail!
2886         TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2887                                                 subject.Data(),
2888                                                 cc.Data(),
2889                                                 to.Data(),
2890                                                 bodyFileName.Data());
2891         AliDebug(2, Form("mail command: %s", mailCommand.Data()));
2892
2893         Int_t result = gSystem->Exec(mailCommand.Data());
2894
2895         return result == 0;
2896 }
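
The recipient list in SendMail is built by appending "name," for every expert and trimming the last comma afterwards. One alternative that needs no trimming, and that stays well defined for an empty list, is to write the separator before every element except the first. The following is a minimal standalone sketch of that idea using std::string. JoinWithCommas is a hypothetical helper, not an AliShuttle or ROOT function, and the addresses in the usage note are placeholders.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: join the given names with commas.
// Returns an empty string for an empty list, so the caller can test
// the result before building a mail command from it.
std::string JoinWithCommas(const std::vector<std::string>& names)
{
    std::string joined;
    for (std::vector<std::string>::size_type i = 0; i < names.size(); ++i) {
        if (i > 0)
            joined += ",";   // separator goes between elements only
        joined += names[i];
    }
    return joined;
}
```

For example, joining the placeholder addresses "a@cern.ch" and "b@cern.ch" yields "a@cern.ch,b@cern.ch", and an empty list yields an empty string that the caller can reject before sending anything.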
2897
2898 //______________________________________________________________________________________________
2899 const char* AliShuttle::GetRunType()
2900 {
2901         //
2902         // returns run type read from "run type" logbook
2903         //
2904
2905         if(!fLogbookEntry) {
2906                 AliError("No logbook entry!");
2907                 return 0;
2908         }
2909
2910         return fLogbookEntry->GetRunType();
2911 }
2912
2913 //______________________________________________________________________________________________
2914 Bool_t AliShuttle::GetHLTStatus()
2915 {
2916         // Return HLT status (ON=1 OFF=0)
2917         // Converts the HLT status from the status string read in the run logbook (not just a bool)
2918
2919         if(!fLogbookEntry) {
2920                 AliError("No logbook entry!");
2921                 return 0;
2922         }
2923
2924         // TODO implement when HLTStatus is inserted in run logbook
2925         //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
2926         //if(hltStatus == "OFF") {return kFALSE;}
2927
2928         return kTRUE;
2929 }
2930
2931 //______________________________________________________________________________________________
2932 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2933 {
2934         //
2935         // sets Shuttle temp directory
2936         //
2937
2938         fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2939 }
2940
2941 //______________________________________________________________________________________________
2942 void AliShuttle::SetShuttleLogDir(const char* logDir)
2943 {
2944         //
2945         // sets Shuttle log directory
2946         //
2947
2948         fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2949 }