Furter developement on the Shuttle:
[u/mrichter/AliRoot.git] / SHUTTLE / AliShuttle.cxx
1 /**************************************************************************
2  * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3  *                                                                        *
4  * Author: The ALICE Off-line Project.                                    *
5  * Contributors are mentioned in the code where appropriate.              *
6  *                                                                        *
7  * Permission to use, copy, modify and distribute this software and its   *
8  * documentation strictly for non-commercial purposes is hereby granted   *
9  * without fee, provided that the above copyright notice appears in all   *
10  * copies and that both the copyright notice and this permission notice   *
11  * appear in the supporting documentation. The authors make no claims     *
12  * about the suitability of this software for any purpose. It is          *
13  * provided "as is" without express or implied warranty.                  *
14  **************************************************************************/
15
16 /*
17 $Log$
18 Revision 1.61  2007/10/30 20:33:51  acolla
19 Improved managing of temporary folders, which weren't correctly handled.
20 Resolved bug introduced in StoreReferenceFile, which caused SPD preprocessor fail.
21
22 Revision 1.60  2007/10/29 18:06:16  acolla
23
24 New function StoreRunMetadataFile added to preprocessor and Shuttle interface
25 This function can be used by GRP only. It stores raw data tags merged file to the
26 raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).
27
28 KNOWN ISSUES:
29
30 1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
31 2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
32 has been corrected in the root package on the Shuttle machine.
33
34 Revision 1.59  2007/10/05 12:40:55  acolla
35
36 Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).
37
38 Revision 1.58  2007/09/28 15:27:40  acolla
39
40 AliDCSClient "multiSplit" option added in the DCS configuration
41 in AliDCSMessage: variable MAX_BODY_SIZE set to 500000
42
43 Revision 1.57  2007/09/27 16:53:13  acolla
44 Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
45 merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.
46
47 Revision 1.56  2007/09/14 16:46:14  jgrosseo
48 1) Connect and Close are called before and after each query, so one can
49 keep the same AliDCSClient object.
50 2) The splitting of a query is moved to GetDPValues/GetAliasValues.
51 3) Splitting interval can be specified in constructor
52
53 Revision 1.55  2007/08/06 12:26:40  acolla
54 Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
55 read from the run logbook.
56
57 Revision 1.54  2007/07/12 09:51:25  jgrosseo
58 removed duplicated log message in GetFile
59
60 Revision 1.53  2007/07/12 09:26:28  jgrosseo
61 updating hlt fxs base path
62
63 Revision 1.52  2007/07/12 08:06:45  jgrosseo
64 adding log messages in getfile... functions
65 adding not implemented copy constructor in alishuttleconfigholder
66
67 Revision 1.51  2007/07/03 17:24:52  acolla
68 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
69
70 Revision 1.50  2007/07/02 17:19:32  acolla
71 preprocessor is run in a temp directory that is removed when process is finished.
72
73 Revision 1.49  2007/06/29 10:45:06  acolla
74 Number of columns in MySql Shuttle logbook increased by one (HLT added)
75
76 Revision 1.48  2007/06/21 13:06:19  acolla
77 GetFileSources returns dummy list with 1 source if system=DCS (better than
78 returning error as it was)
79
80 Revision 1.47  2007/06/19 17:28:56  acolla
81 HLT updated; missing map bug removed.
82
83 Revision 1.46  2007/06/09 13:01:09  jgrosseo
84 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
85
86 Revision 1.45  2007/05/30 06:35:20  jgrosseo
87 Adding functionality to the Shuttle/TestShuttle:
88 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
89 o) Function to retrieve list of IDs for a given source      (GetFileIDs)
90 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
91 Example code has been added to the TestProcessor in TestShuttle
92
93 Revision 1.44  2007/05/11 16:09:32  acolla
94 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
95 example: ITS/SPD/100_filename.root
96
97 Revision 1.43  2007/05/10 09:59:51  acolla
98 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
99
100 Revision 1.42  2007/05/03 08:01:39  jgrosseo
101 typo in last commit :-(
102
103 Revision 1.41  2007/05/03 08:00:48  jgrosseo
104 fixing log message when pp want to skip dcs value retrieval
105
106 Revision 1.40  2007/04/27 07:06:48  jgrosseo
107 GetFileSources returns empty list in case of no files, but successful query
108 No mails sent in testmode
109
110 Revision 1.39  2007/04/17 12:43:57  acolla
111 Correction in StoreOCDB; change of text in mail to detector expert
112
113 Revision 1.38  2007/04/12 08:26:18  jgrosseo
114 updated comment
115
116 Revision 1.37  2007/04/10 16:53:14  jgrosseo
117 redirecting sub detector stdout, stderr to sub detector log file
118
119 Revision 1.35  2007/04/04 16:26:38  acolla
120 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
121 2. Added missing dependency in test preprocessors.
122 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
123
124 Revision 1.34  2007/04/04 10:33:36  jgrosseo
125 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
126 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
127
128 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
129
130 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
131
132 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
133
134 5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implemting this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
135 If you always need DCS data (like before), you do not need to implement it.
136
137 6) The run type has been added to the monitoring page
138
139 Revision 1.33  2007/04/03 13:56:01  acolla
140 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
141 run type.
142
143 Revision 1.32  2007/02/28 10:41:56  acolla
144 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
145 AliPreprocessor::GetRunType() function.
146 Added some ldap definition files.
147
148 Revision 1.30  2007/02/13 11:23:21  acolla
149 Moved getters and setters of Shuttle's main OCDB/Reference, local
150 OCDB/Reference, temp and log folders to AliShuttleInterface
151
152 Revision 1.27  2007/01/30 17:52:42  jgrosseo
153 adding monalisa monitoring
154
155 Revision 1.26  2007/01/23 19:20:03  acolla
156 Removed old ldif files, added TOF, MCH ldif files. Added some options in
157 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
158 SetShuttleLogDir
159
160 Revision 1.25  2007/01/15 19:13:52  acolla
161 Moved some AliInfo to AliDebug in SendMail function
162
163 Revision 1.21  2006/12/07 08:51:26  jgrosseo
164 update (alberto):
165 table, db names in ldap configuration
166 added GRP preprocessor
167 DCS data can also be retrieved by data point
168
169 Revision 1.20  2006/11/16 16:16:48  jgrosseo
170 introducing strict run ordering flag
171 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
172
173 Revision 1.19  2006/11/06 14:23:04  jgrosseo
174 major update (Alberto)
175 o) reading of run parameters from the logbook
176 o) online offline naming conversion
177 o) standalone DCSclient package
178
179 Revision 1.18  2006/10/20 15:22:59  jgrosseo
180 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
181 o) Merging Collect, CollectAll, CollectNew function
182 o) Removing implementation of empty copy constructors (declaration still there!)
183
184 Revision 1.17  2006/10/05 16:20:55  jgrosseo
185 adapting to new CDB classes
186
187 Revision 1.16  2006/10/05 15:46:26  jgrosseo
188 applying to the new interface
189
190 Revision 1.15  2006/10/02 16:38:39  jgrosseo
191 update (alberto):
192 fixed memory leaks
193 storing of objects that failed to be stored to the grid before
194 interfacing of shuttle status table in daq system
195
196 Revision 1.14  2006/08/29 09:16:05  jgrosseo
197 small update
198
199 Revision 1.13  2006/08/15 10:50:00  jgrosseo
200 effc++ corrections (alberto)
201
202 Revision 1.12  2006/08/08 14:19:29  jgrosseo
203 Update to shuttle classes (Alberto)
204
205 - Possibility to set the full object's path in the Preprocessor's and
206 Shuttle's  Store functions
207 - Possibility to extend the object's run validity in the same classes
208 ("startValidity" and "validityInfinite" parameters)
209 - Implementation of the StoreReferenceData function to store reference
210 data in a dedicated CDB storage.
211
212 Revision 1.11  2006/07/21 07:37:20  jgrosseo
213 last run is stored after each run
214
215 Revision 1.10  2006/07/20 09:54:40  jgrosseo
216 introducing status management: The processing per subdetector is divided into several steps,
217 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
218 can keep track of the number of failures and skips further processing after a certain threshold is
219 exceeded. These thresholds can be configured in LDAP.
220
221 Revision 1.9  2006/07/19 10:09:55  jgrosseo
222 new configuration, accesst to DAQ FES (Alberto)
223
224 Revision 1.8  2006/07/11 12:44:36  jgrosseo
225 adding parameters for extended validity range of data produced by preprocessor
226
227 Revision 1.7  2006/07/10 14:37:09  jgrosseo
228 small fix + todo comment
229
230 Revision 1.6  2006/07/10 13:01:41  jgrosseo
231 enhanced storing of last sucessfully processed run (alberto)
232
233 Revision 1.5  2006/07/04 14:59:57  jgrosseo
234 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
235
236 Revision 1.4  2006/06/12 09:11:16  jgrosseo
237 coding conventions (Alberto)
238
239 Revision 1.3  2006/06/06 14:26:40  jgrosseo
240 o) removed files that were moved to STEER
241 o) shuttle updated to follow the new interface (Alberto)
242
243 Revision 1.2  2006/03/07 07:52:34  hristov
244 New version (B.Yordanov)
245
246 Revision 1.6  2005/11/19 17:19:14  byordano
247 RetrieveDATEEntries and RetrieveConditionsData added
248
249 Revision 1.5  2005/11/19 11:09:27  byordano
250 AliShuttle declaration added
251
252 Revision 1.4  2005/11/17 17:47:34  byordano
253 TList changed to TObjArray
254
255 Revision 1.3  2005/11/17 14:43:23  byordano
256 import to local CVS
257
258 Revision 1.1.1.1  2005/10/28 07:33:58  hristov
259 Initial import as subdirectory in AliRoot
260
261 Revision 1.2  2005/09/13 08:41:15  byordano
262 default startTime endTime added
263
264 Revision 1.4  2005/08/30 09:13:02  byordano
265 some docs added
266
267 Revision 1.3  2005/08/29 21:15:47  byordano
268 some docs added
269
270 */
271
272 //
273 // This class is the main manager for AliShuttle. 
274 // It organizes the data retrieval from DCS and call the 
275 // interface methods of AliPreprocessor.
276 // For every detector in AliShuttleConfgi (see AliShuttleConfig),
277 // data for its set of aliases is retrieved. If there is registered
278 // AliPreprocessor for this detector then it will be used
279 // accroding to the schema (see AliPreprocessor).
280 // If there isn't registered AliPreprocessor than the retrieved
281 // data is stored automatically to the undelying AliCDBStorage.
282 // For detSpec is used the alias name.
283 //
284
285 #include "AliShuttle.h"
286
287 #include "AliCDBManager.h"
288 #include "AliCDBStorage.h"
289 #include "AliCDBId.h"
290 #include "AliCDBRunRange.h"
291 #include "AliCDBPath.h"
292 #include "AliCDBEntry.h"
293 #include "AliShuttleConfig.h"
294 #include "DCSClient/AliDCSClient.h"
295 #include "AliLog.h"
296 #include "AliPreprocessor.h"
297 #include "AliShuttleStatus.h"
298 #include "AliShuttleLogbookEntry.h"
299
300 #include <TSystem.h>
301 #include <TObject.h>
302 #include <TString.h>
303 #include <TTimeStamp.h>
304 #include <TObjString.h>
305 #include <TSQLServer.h>
306 #include <TSQLResult.h>
307 #include <TSQLRow.h>
308 #include <TMutex.h>
309 #include <TSystemDirectory.h>
310 #include <TSystemFile.h>
311 #include <TFile.h>
312 #include <TGrid.h>
313 #include <TGridResult.h>
314
315 #include <TMonaLisaWriter.h>
316
317 #include <fstream>
318
319 #include <sys/types.h>
320 #include <sys/wait.h>
321
322 ClassImp(AliShuttle)
323
324 //______________________________________________________________________________________________
325 AliShuttle::AliShuttle(const AliShuttleConfig* config,
326                 UInt_t timeout, Int_t retries):
327 fConfig(config),
328 fTimeout(timeout), fRetries(retries),
329 fPreprocessorMap(),
330 fLogbookEntry(0),
331 fCurrentDetector(),
332 fStatusEntry(0),
333 fMonitoringMutex(0),
334 fLastActionTime(0),
335 fLastAction(),
336 fMonaLisa(0),
337 fTestMode(kNone),
338 fReadTestMode(kFALSE),
339 fOutputRedirected(kFALSE)
340 {
341         //
342         // config: AliShuttleConfig used
343         // timeout: timeout used for AliDCSClient connection
344         // retries: the number of retries in case of connection error.
345         //
346
347         if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
348         for(int iSys=0;iSys<4;iSys++) {
349                 fServer[iSys]=0;
350                 if (iSys < 3)
351                         fFXSlist[iSys].SetOwner(kTRUE);
352         }
353         fPreprocessorMap.SetOwner(kTRUE);
354
355         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
356                 fFirstUnprocessed[iDet] = kFALSE;
357
358         fMonitoringMutex = new TMutex();
359 }
360
361 //______________________________________________________________________________________________
362 AliShuttle::~AliShuttle()
363 {
364         //
365         // destructor
366         //
367
368         fPreprocessorMap.DeleteAll();
369         for(int iSys=0;iSys<4;iSys++)
370                 if(fServer[iSys]) {
371                         fServer[iSys]->Close();
372                         delete fServer[iSys];
373                         fServer[iSys] = 0;
374                 }
375
376         if (fStatusEntry){
377                 delete fStatusEntry;
378                 fStatusEntry = 0;
379         }
380         
381         if (fMonitoringMutex) 
382         {
383                 delete fMonitoringMutex;
384                 fMonitoringMutex = 0;
385         }
386 }
387
388 //______________________________________________________________________________________________
389 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
390 {
391         //
392         // Registers new AliPreprocessor.
393         // It uses GetName() for indentificator of the pre processor.
394         // The pre processor is registered it there isn't any other
395         // with the same identificator (GetName()).
396         //
397
398         const char* detName = preprocessor->GetName();
399         if(GetDetPos(detName) < 0)
400                 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
401
402         if (fPreprocessorMap.GetValue(detName)) {
403                 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
404                 return;
405         }
406
407         fPreprocessorMap.Add(new TObjString(detName), preprocessor);
408 }
409 //______________________________________________________________________________________________
410 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
411                 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
412 {
413         // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
414         // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
415         // using this function. Use StoreReferenceData instead!
416         // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
417         // finishes the data are transferred to the main storage (Grid).
418
419         return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
420 }
421
422 //______________________________________________________________________________________________
423 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
424 {
425         // Stores a CDB object in the storage for reference data. This objects will not be available during
426         // offline reconstrunction. Use this function for reference data only!
427         // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
428         // finishes the data are transferred to the main storage (Grid).
429
430         return StoreLocally(fgkLocalRefStorage, path, object, metaData);
431 }
432
433 //______________________________________________________________________________________________
434 Bool_t AliShuttle::StoreLocally(const TString& localUri,
435                         const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
436                         Int_t validityStart, Bool_t validityInfinite)
437 {
438         // Store object temporarily in local storage. Parameters are passed by Store and StoreReferenceData functions.
439         // when the preprocessor finishes the data are transferred to the main storage (Grid).
440         // The parameters are:
441         //   1) Uri of the backup storage (Local)
442         //   2) the object's path.
443         //   3) the object to be stored
444         //   4) the metaData to be associated with the object
445         //   5) the validity start run number w.r.t. the current run,
446         //      if the data is valid only for this run leave the default 0
447         //   6) specifies if the calibration data is valid for infinity (this means until updated),
448         //      typical for calibration runs, the default is kFALSE
449         //
450         // returns 0 if fail, 1 otherwise
451
452         if (fTestMode & kErrorStorage)
453         {
454                 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
455                 return kFALSE;
456         }
457         
458         const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
459
460         Int_t firstRun = GetCurrentRun() - validityStart;
461         if(firstRun < 0) {
462                 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
463                 firstRun=0;
464         }
465
466         Int_t lastRun = -1;
467         if(validityInfinite) {
468                 lastRun = AliCDBRunRange::Infinity();
469         } else {
470                 lastRun = GetCurrentRun();
471         }
472
473         // Version is set to current run, it will be used later to transfer data to Grid
474         AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
475
476         if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
477                 TObjString runUsed = Form("%d", GetCurrentRun());
478                 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
479         }
480
481         Bool_t result = kFALSE;
482
483         if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
484                 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
485         } else {
486                 result = AliCDBManager::Instance()->GetStorage(localUri)
487                                         ->Put(object, id, metaData);
488         }
489
490         if(!result) {
491
492                 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
493         }
494
495         return result;
496 }
497
498 //______________________________________________________________________________________________
499 Bool_t AliShuttle::StoreOCDB()
500 {
501         //
502         // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
503         // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
504         // Then calls StoreRefFilesToGrid to store reference files. 
505         //
506         
507         if (fTestMode & kErrorGrid)
508         {
509                 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
510                 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
511                 return kFALSE;
512         }
513         
514         Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
515         Bool_t resultCDB = StoreOCDB(fgkMainCDB);
516
517         Log("SHUTTLE","StoreOCDB - Storing reference data ...");
518         Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
519         
520         Log("SHUTTLE","StoreOCDB - Storing reference files ...");
521         Bool_t resultRefFiles = CopyFilesToGrid("reference");
522         
523         Bool_t resultMetadata = kTRUE;
524         if(fCurrentDetector == "GRP") 
525         {
526                 Log("StoreOCDB - SHUTTLE","Storing Run Metadata file ...");
527                 resultMetadata = CopyFilesToGrid("metadata");
528         }
529         
530         return resultCDB && resultRef && resultRefFiles && resultMetadata;
531 }
532
533 //______________________________________________________________________________________________
534 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
535 {
536         //
537         // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
538         //
539
540         TObjArray* gridIds=0;
541
542         Bool_t result = kTRUE;
543
544         const char* type = 0;
545         TString localURI;
546         if(gridURI == fgkMainCDB) {
547                 type = "OCDB";
548                 localURI = fgkLocalCDB;
549         } else if(gridURI == fgkMainRefStorage) {
550                 type = "reference";
551                 localURI = fgkLocalRefStorage;
552         } else {
553                 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
554                 return kFALSE;
555         }
556
557         AliCDBManager* man = AliCDBManager::Instance();
558
559         AliCDBStorage *gridSto = man->GetStorage(gridURI);
560         if(!gridSto) {
561                 Log("SHUTTLE",
562                         Form("StoreOCDB - cannot activate main %s storage", type));
563                 return kFALSE;
564         }
565
566         gridIds = gridSto->GetQueryCDBList();
567
568         // get objects previously stored in local CDB
569         AliCDBStorage *localSto = man->GetStorage(localURI);
570         if(!localSto) {
571                 Log("SHUTTLE",
572                         Form("StoreOCDB - cannot activate local %s storage", type));
573                 return kFALSE;
574         }
575         AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
576         // Local objects were stored with current run as Grid version!
577         TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
578         localEntries->SetOwner(1);
579
580         // loop on local stored objects
581         TIter localIter(localEntries);
582         AliCDBEntry *aLocEntry = 0;
583         while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
584                 aLocEntry->SetOwner(1);
585                 AliCDBId aLocId = aLocEntry->GetId();
586                 aLocEntry->SetVersion(-1);
587                 aLocEntry->SetSubVersion(-1);
588
589                 // If local object is valid up to infinity we store it only if it is
590                 // the first unprocessed run!
591                 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
592                         !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
593                 {
594                         Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
595                                                 "there are previous unprocessed runs!",
596                                                 fCurrentDetector.Data(), aLocId.GetPath().Data()));
597                         continue;
598                 }
599
600                 // loop on Grid valid Id's
601                 Bool_t store = kTRUE;
602                 TIter gridIter(gridIds);
603                 AliCDBId* aGridId = 0;
604                 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
605                         if(aGridId->GetPath() != aLocId.GetPath()) continue;
606                         // skip all objects valid up to infinity
607                         if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
608                         // if we get here, it means there's already some more recent object stored on Grid!
609                         store = kFALSE;
610                         break;
611                 }
612
613                 // If we get here, the file can be stored!
614                 Bool_t storeOk = gridSto->Put(aLocEntry);
615                 if(!store || storeOk){
616
617                         if (!store)
618                         {
619                                 Log(fCurrentDetector.Data(),
620                                         Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
621                                                 type, aGridId->ToString().Data()));
622                         } else {
623                                 Log("SHUTTLE",
624                                         Form("StoreOCDB - Object <%s> successfully put into %s storage",
625                                                 aLocId.ToString().Data(), type));
626                                 Log(fCurrentDetector.Data(),
627                                         Form("StoreOCDB - Object <%s> successfully put into %s storage",
628                                                 aLocId.ToString().Data(), type));
629                         }
630
631                         // removing local filename...
632                         TString filename;
633                         localSto->IdToFilename(aLocId, filename);
634                         Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
635                         RemoveFile(filename.Data());
636                         continue;
637                 } else  {
638                         Log("SHUTTLE",
639                                 Form("StoreOCDB - Grid %s storage of object <%s> failed",
640                                         type, aLocId.ToString().Data()));
641                         Log(fCurrentDetector.Data(),
642                                 Form("StoreOCDB - Grid %s storage of object <%s> failed",
643                                         type, aLocId.ToString().Data()));
644                         result = kFALSE;
645                 }
646         }
647         localEntries->Clear();
648
649         return result;
650 }
651
652 //______________________________________________________________________________________________
653 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
654 {
655         // clears the directory used to store reference files of a given subdetector
656   
657         AliCDBManager* man = AliCDBManager::Instance();
658         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
659         TString localBaseFolder = sto->GetBaseFolder();
660
661         TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
662         
663         Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
664
665         TString begin;
666         begin.Form("%d_", GetCurrentRun());
667         
668         TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
669         if (!baseDir)
670                 return kTRUE;
671                 
672         TList* dirList = baseDir->GetListOfFiles();
673         delete baseDir;
674         
675         if (!dirList) return kTRUE;
676                         
677         if (dirList->GetEntries() < 3) 
678         {
679                 delete dirList;
680                 return kTRUE;
681         }
682                                 
683         Int_t nDirs = 0, nDel = 0;
684         TIter dirIter(dirList);
685         TSystemFile* entry = 0;
686
687         Bool_t success = kTRUE;
688         
689         while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
690         {                                       
691                 if (entry->IsDirectory())
692                         continue;
693                 
694                 TString fileName(entry->GetName());
695                 if (!fileName.BeginsWith(begin))
696                         continue;
697                         
698                 nDirs++;
699                                                 
700                 // delete file
701                 Int_t result = gSystem->Unlink(fileName.Data());
702                 
703                 if (result)
704                 {
705                         Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
706                         success = kFALSE;
707                 } else {
708                         nDel++;
709                 }
710         }
711
712         if(nDirs > 0)
713                 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.", 
714                         nDel, nDirs, targetDir.Data()));
715
716                 
717         delete dirList;
718         return success;
719
720
721
722
723
724
725   Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
726   if (result == 0)
727   {
728     // delete directory
729     result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
730     if (result != 0)
731     {  
732       Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
733       return kFALSE;
734     }
735   }
736
737   result = gSystem->mkdir(targetDir, kTRUE);
738   if (result != 0)
739   {
740     Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
741     return kFALSE;
742   }
743         
744   return kTRUE;
745 }
746
747 //______________________________________________________________________________________________
748 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
749 {
750         //
751         // Stores reference file directly (without opening it). This function stores the file locally.
752         //
753         // The file is stored under the following location: 
754         // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
755         // where <gridFileName> is the second parameter given to the function
756         // 
757         
758         if (fTestMode & kErrorStorage)
759         {
760                 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
761                 return kFALSE;
762         }
763         
764         AliCDBManager* man = AliCDBManager::Instance();
765         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
766         
767         TString localBaseFolder = sto->GetBaseFolder();
768         
769         TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);    
770         target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
771         
772         return CopyFileLocally(localFile, target);
773 }
774
775 //______________________________________________________________________________________________
776 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
777 {
778         //
779         // Stores Run metadata file to the Grid, in the run folder
780         //
781         // Only GRP can call this function.
782         
783         if (fTestMode & kErrorStorage)
784         {
785                 Log(fCurrentDetector, "StoreRunMetaDataFile - In TESTMODE - Simulating error while storing locally");
786                 return kFALSE;
787         }
788         
789         AliCDBManager* man = AliCDBManager::Instance();
790         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
791         
792         TString localBaseFolder = sto->GetBaseFolder();
793         
794         // Build Run level folder
795         // folder = /alice/data/year/lhcPeriod/runNb/Raw
796         
797                 
798         TString lhcPeriod = GetLHCPeriod();     
799         if (lhcPeriod.Length() == 0) 
800         {
801                 Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
802                 return 0;
803         }
804         
805         TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s", 
806                                 localBaseFolder.Data(), GetCurrentYear(), 
807                                 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
808                                         
809         return CopyFileLocally(localFile, target);
810 }
811
812 //______________________________________________________________________________________________
813 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
814 {
815         //
816         // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
817         // Files are temporarily stored in the local reference storage. When the preprocessor 
818         // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn 
819         // (in reference or run level folders)
820         //
821         
822         TString targetDir(target(0, target.Last('/')));
823         
824         //try to open base dir folder, if it does not exist
825         void* dir = gSystem->OpenDirectory(targetDir.Data());
826         if (dir == NULL) {
827                 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
828                         Log("SHUTTLE", Form("StoreFileLocally - Can't open directory <%s>", targetDir.Data()));
829                         return kFALSE;
830                 }
831
832         } else {
833                 gSystem->FreeDirectory(dir);
834         }
835         
836         Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
837         if (result)
838         {
839                 Log("SHUTTLE", Form("StoreFileLocally - %s does not exist", localFile));
840                 return kFALSE;
841         }
842
843         result = gSystem->CopyFile(localFile, target);
844
845         if (result == 0)
846         {
847                 Log("SHUTTLE", Form("StoreFileLocally - File %s stored locally to %s", localFile, target.Data()));
848                 return kTRUE;
849         }
850         else
851         {
852                 Log("SHUTTLE", Form("StoreFileLocally - Could not store file %s to %s! Error code = %d", 
853                                 localFile, target.Data(), result));
854                 return kFALSE;
855         }       
856
857
858
859 }
860
861 //______________________________________________________________________________________________
862 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
863 {
864         //
865         // Transfers local files to the Grid. Local files can be reference files 
866         // or run metadata file (from GRP only).
867         //
868         // According to the type (ref, metadata) the files are stored under the following location: 
869         // ref --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
870         // metadata --> <run data folder>/<MetadataFileName>
871         //
872                 
873         AliCDBManager* man = AliCDBManager::Instance();
874         AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
875         if (!sto)
876                 return kFALSE;
877         TString localBaseFolder = sto->GetBaseFolder();
878         
879         TString dir;
880         TString alienDir;
881         TString begin;
882         
883         if (strcmp(type, "reference") == 0) 
884         {
885                 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
886                 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
887                 if (!gridSto)
888                         return kFALSE;
889                 TString gridBaseFolder = gridSto->GetBaseFolder();
890                 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
891                 begin = Form("%d_", GetCurrentRun());
892         } 
893         else if (strcmp(type, "metadata") == 0)
894         {
895                         
896                 TString lhcPeriod = GetLHCPeriod();
897         
898                 if (lhcPeriod.Length() == 0) 
899                 {
900                         Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
901                         return 0;
902                 }
903                 
904                 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw", 
905                                 localBaseFolder.Data(), GetCurrentYear(), 
906                                 lhcPeriod.Data(), GetCurrentRun());
907                 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
908                 
909                 begin = "";
910         }
911         else 
912         {
913                 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
914                 return kFALSE;
915         }
916                 
917         TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
918         if (!baseDir)
919                 return kTRUE;
920                 
921         TList* dirList = baseDir->GetListOfFiles();
922         delete baseDir;
923         
924         if (!dirList) return kTRUE;
925                 
926         if (dirList->GetEntries() < 3) 
927         {
928                 delete dirList;
929                 return kTRUE;
930         }
931                         
932         if (!gGrid)
933         { 
934                 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
935                 delete dirList;
936                 return kFALSE;
937         }
938         
939         Int_t nDirs = 0, nTransfer = 0;
940         TIter dirIter(dirList);
941         TSystemFile* entry = 0;
942
943         Bool_t success = kTRUE;
944         Bool_t first = kTRUE;
945         
946         while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
947         {                       
948                 if (entry->IsDirectory())
949                         continue;
950                         
951                 TString fileName(entry->GetName());
952                 if (!fileName.BeginsWith(begin))
953                         continue;
954                         
955                 nDirs++;
956                         
957                 if (first)
958                 {
959                         first = kFALSE;
960                         // check that folder exists, otherwise create it
961                         TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
962                         
963                         if (!result)
964                         {
965                                 delete dirList;
966                                 return kFALSE;
967                         }
968                         
969                         if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
970                         {
971                                 // TODO It does not work currently! Bug in TAliEn::Mkdir
972                                 // TODO Manually fixed in local root v5-16-00
973                                 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
974                                 {
975                                         Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
976                                                         alienDir.Data()));
977                                         delete dirList;
978                                         return kFALSE;
979                                 } else {
980                                         Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
981                                 }
982                                 
983                         } else {
984                                         Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
985                         }
986                 }
987                         
988                 TString fullLocalPath;
989                 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
990                 
991                 TString fullGridPath;
992                 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
993
994                 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
995                 
996                 if (result)
997                 {
998                         Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!", 
999                                                 fullLocalPath.Data(), fullGridPath.Data()));
1000                         RemoveFile(fullLocalPath);
1001                         nTransfer++;
1002                 }
1003                 else
1004                 {
1005                         Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!", 
1006                                                 fullLocalPath.Data(), fullGridPath.Data()));
1007                         success = kFALSE;
1008                 }
1009         }
1010
1011         Log("SHUTTLE", Form("CopyFilesToGrid - %d (over %d) files in folder %s copied to Grid.", 
1012                                                 nTransfer, nDirs, dir.Data()));
1013
1014                 
1015         delete dirList;
1016         return success;
1017 }
1018
1019 //______________________________________________________________________________________________
1020 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1021 {
1022         //
1023         // Get folder name of reference files 
1024         //
1025
1026         TString offDetStr(GetOfflineDetName(detector));
1027         TString dir;
1028         if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1029         {
1030                 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1031         } else {
1032                 dir.Form("%s/%s", base, offDetStr.Data());
1033         }
1034         
1035         return dir.Data();
1036         
1037
1038 }
1039
1040 //______________________________________________________________________________________________
1041 void AliShuttle::CleanLocalStorage(const TString& uri)
1042 {
1043         //
1044         // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
1045         //
1046
1047         const char* type = 0;
1048         if(uri == fgkLocalCDB) {
1049                 type = "OCDB";
1050         } else if(uri == fgkLocalRefStorage) {
1051                 type = "Reference";
1052         } else {
1053                 AliError(Form("Invalid storage URI: %s", uri.Data()));
1054                 return;
1055         }
1056
1057         AliCDBManager* man = AliCDBManager::Instance();
1058
1059         // open local storage
1060         AliCDBStorage *localSto = man->GetStorage(uri);
1061         if(!localSto) {
1062                 Log("SHUTTLE",
1063                         Form("CleanLocalStorage - cannot activate local %s storage", type));
1064                 return;
1065         }
1066
1067         TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1068                 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1069
1070         AliDebug(2, Form("filename = %s", filename.Data()));
1071
1072         Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1073                 GetCurrentRun(), fCurrentDetector.Data()));
1074
1075         RemoveFile(filename.Data());
1076
1077 }
1078
1079 //______________________________________________________________________________________________
1080 void AliShuttle::RemoveFile(const char* filename)
1081 {
1082         //
1083         // removes local file
1084         //
1085
1086         TString command(Form("rm -f %s", filename));
1087
1088         Int_t result = gSystem->Exec(command.Data());
1089         if(result != 0)
1090         {
1091                 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1092                         fCurrentDetector.Data(), filename));
1093         }
1094 }
1095
1096 //______________________________________________________________________________________________
1097 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1098 {
1099         //
1100         // Reads the AliShuttleStatus from the CDB
1101         //
1102
1103         if (fStatusEntry){
1104                 delete fStatusEntry;
1105                 fStatusEntry = 0;
1106         }
1107
1108         fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1109                 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1110
1111         if (!fStatusEntry) return 0;
1112         fStatusEntry->SetOwner(1);
1113
1114         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1115         if (!status) {
1116                 AliError("Invalid object stored to CDB!");
1117                 return 0;
1118         }
1119
1120         return status;
1121 }
1122
1123 //______________________________________________________________________________________________
1124 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1125 {
1126         //
1127         // writes the status for one subdetector
1128         //
1129
1130         if (fStatusEntry){
1131                 delete fStatusEntry;
1132                 fStatusEntry = 0;
1133         }
1134
1135         Int_t run = GetCurrentRun();
1136
1137         AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1138
1139         fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1140         fStatusEntry->SetOwner(1);
1141
1142         UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1143
1144         if (!result) {
1145                 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1146                                                 fCurrentDetector.Data(), run));
1147                 return kFALSE;
1148         }
1149         
1150         SendMLInfo();
1151
1152         return kTRUE;
1153 }
1154
1155 //______________________________________________________________________________________________
1156 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1157 {
1158         //
1159         // changes the AliShuttleStatus for the given detector and run to the given status
1160         //
1161
1162         if (!fStatusEntry){
1163                 AliError("UNEXPECTED: fStatusEntry empty");
1164                 return;
1165         }
1166
1167         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1168
1169         if (!status){
1170                 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1171                 return;
1172         }
1173
1174         TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1175                                 fCurrentDetector.Data(),
1176                                 status->GetStatusName(),
1177                                 status->GetStatusName(newStatus));
1178         Log("SHUTTLE", actionStr);
1179         SetLastAction(actionStr);
1180
1181         status->SetStatus(newStatus);
1182         if (increaseCount) status->IncreaseCount();
1183
1184         AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1185
1186         SendMLInfo();
1187 }
1188
1189 //______________________________________________________________________________________________
1190 void AliShuttle::SendMLInfo()
1191 {
1192         //
1193         // sends ML information about the current status of the current detector being processed
1194         //
1195         
1196         AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1197         
1198         if (!status){
1199                 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1200                 return;
1201         }
1202         
1203         TMonaLisaText  mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1204         TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1205
1206         TList mlList;
1207         mlList.Add(&mlStatus);
1208         mlList.Add(&mlRetryCount);
1209
1210         fMonaLisa->SendParameters(&mlList);
1211 }
1212
1213 //______________________________________________________________________________________________
1214 Bool_t AliShuttle::ContinueProcessing()
1215 {
1216         // this function reads the AliShuttleStatus information from CDB and
1217         // checks if the processing should be continued
1218         // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1219
1220         if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1221
1222         AliPreprocessor* aPreprocessor =
1223                 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1224         if (!aPreprocessor)
1225         {
1226                 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1227                 return kFALSE;
1228         }
1229
1230         AliShuttleLogbookEntry::Status entryStatus =
1231                 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1232
1233         if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1234                 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1235                                 fCurrentDetector.Data(),
1236                                 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1237                 return kFALSE;
1238         }
1239
1240         // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1241
1242         // check if current run is first unprocessed run for current detector
1243         if (fConfig->StrictRunOrder(fCurrentDetector) &&
1244                 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1245         {
1246                 if (fTestMode == kNone)
1247                 {
1248                         Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
1249                                         " but this is not the first unprocessed run!"));
1250                         return kFALSE;
1251                 }
1252                 else
1253                 {
1254                         Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
1255                                         "Although %s requires strict run ordering "
1256                                         "and this is not the first unprocessed run, "
1257                                         "the SHUTTLE continues"));
1258                 }
1259         }
1260
1261         AliShuttleStatus* status = ReadShuttleStatus();
1262         if (!status) {
1263                 // first time
1264                 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1265                                 fCurrentDetector.Data()));
1266                 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1267                 return WriteShuttleStatus(status);
1268         }
1269
1270         // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1271         // If it happens it may mean Logbook updating failed... let's do it now!
1272         if (status->GetStatus() == AliShuttleStatus::kDone ||
1273             status->GetStatus() == AliShuttleStatus::kFailed){
1274                 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1275                                         fCurrentDetector.Data(),
1276                                         status->GetStatusName(status->GetStatus())));
1277                 UpdateShuttleLogbook(fCurrentDetector.Data(),
1278                                         status->GetStatusName(status->GetStatus()));
1279                 return kFALSE;
1280         }
1281
1282         if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1283                 Log("SHUTTLE",
1284                         Form("ContinueProcessing - %s: Grid storage of one or more "
1285                                 "objects failed. Trying again now",
1286                                 fCurrentDetector.Data()));
1287                 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1288                 if (StoreOCDB()){
1289                         Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1290                                 "successfully stored into main storage",
1291                                 fCurrentDetector.Data()));
1292                         UpdateShuttleStatus(AliShuttleStatus::kDone);
1293                         UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1294                 } else {
1295                         Log("SHUTTLE",
1296                                 Form("ContinueProcessing - %s: Grid storage failed again",
1297                                         fCurrentDetector.Data()));
1298                         UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1299                 }
1300                 return kFALSE;
1301         }
1302
1303         // if we get here, there is a restart
1304         Bool_t cont = kFALSE;
1305
1306         // abort conditions
1307         if (status->GetCount() >= fConfig->GetMaxRetries()) {
1308                 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1309                                 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1310                                 status->GetCount(), status->GetStatusName()));
1311                 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1312                 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1313
1314                 // there may still be objects in local OCDB and reference storage
1315                 // and FXS databases may be not updated: do it now!
1316                 
1317                 // TODO Currently disabled, we want to keep files in case of failure!
1318                 // CleanLocalStorage(fgkLocalCDB);
1319                 // CleanLocalStorage(fgkLocalRefStorage);
1320                 // UpdateTableFailCase();
1321                 
1322                 // Send mail to detector expert!
1323                 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...", 
1324                                         fCurrentDetector.Data()));
1325                 if (!SendMail())
1326                         Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1327                                         fCurrentDetector.Data()));
1328
1329         } else {
1330                 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1331                                 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1332                                 status->GetStatusName(), status->GetCount()));
1333                 Bool_t increaseCount = kTRUE;
1334                 if (status->GetStatus() == AliShuttleStatus::kDCSError || 
1335                         status->GetStatus() == AliShuttleStatus::kDCSStarted)
1336                                 increaseCount = kFALSE;
1337                                 
1338                 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
1339                 cont = kTRUE;
1340         }
1341
1342         return cont;
1343 }
1344
1345 //______________________________________________________________________________________________
1346 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1347 {
1348         //
1349         // Makes data retrieval for all detectors in the configuration.
1350         // entry: Shuttle logbook entry, contains run paramenters and status of detectors
1351         // (Unprocessed, Inactive, Failed or Done).
1352         // Returns kFALSE in case of error occured and kTRUE otherwise
1353         //
1354
1355         if (!entry) return kFALSE;
1356
1357         fLogbookEntry = entry;
1358
1359         Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1360                                         GetCurrentRun()));
1361
1362         // create ML instance that monitors this run
1363         fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1364         // disable monitoring of other parameters that come e.g. from TFile
1365         gMonitoringWriter = 0;
1366
1367         // Send the information to ML
1368         TMonaLisaText  mlStatus("SHUTTLE_status", "Processing");
1369         TMonaLisaText  mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1370
1371         TList mlList;
1372         mlList.Add(&mlStatus);
1373         mlList.Add(&mlRunType);
1374
1375         fMonaLisa->SendParameters(&mlList);
1376
1377         if (fLogbookEntry->IsDone())
1378         {
1379                 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1380                 UpdateShuttleLogbook("shuttle_done");
1381                 fLogbookEntry = 0;
1382                 return kTRUE;
1383         }
1384
1385         // read test mode if flag is set
1386         if (fReadTestMode)
1387         {
1388                 fTestMode = kNone;
1389                 TString logEntry(entry->GetRunParameter("log"));
1390                 //printf("log entry = %s\n", logEntry.Data());
1391                 TString searchStr("Testmode: ");
1392                 Int_t pos = logEntry.Index(searchStr.Data());
1393                 //printf("%d\n", pos);
1394                 if (pos >= 0)
1395                 {
1396                         TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1397                         //printf("%s\n", subStr.String().Data());
1398                         TString newStr(subStr.Data());
1399                         TObjArray* token = newStr.Tokenize(' ');
1400                         if (token)
1401                         {
1402                                 //token->Print();
1403                                 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1404                                 if (tmpStr)
1405                                 {
1406                                         Int_t testMode = tmpStr->String().Atoi();
1407                                         if (testMode > 0)
1408                                         {
1409                                                 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1410                                                 SetTestMode((TestMode) testMode);
1411                                         }
1412                                 }
1413                                 delete token;          
1414                         }
1415                 }
1416         }
1417                 
1418         fLogbookEntry->Print("all");
1419
1420         // Initialization
1421         Bool_t hasError = kFALSE;
1422
1423         // Set the CDB and Reference folders according to the year and LHC period
1424         TString lhcPeriod(GetLHCPeriod());
1425         if (lhcPeriod.Length() == 0) 
1426         {
1427                 Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
1428                 return 0;
1429         }       
1430         
1431         if (fgkMainCDB.Length() == 0)
1432                 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache", 
1433                                         GetCurrentYear(), lhcPeriod.Data());
1434         
1435         if (fgkMainRefStorage.Length() == 0)
1436                 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache", 
1437                                         GetCurrentYear(), lhcPeriod.Data());
1438         
1439         AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1440         if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1441         AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1442         if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1443
1444         // Loop on detectors in the configuration
1445         TIter iter(fConfig->GetDetectors());
1446         TObjString* aDetector = 0;
1447
1448         while ((aDetector = (TObjString*) iter.Next()))
1449         {
1450                 fCurrentDetector = aDetector->String();
1451
1452                 if (ContinueProcessing() == kFALSE) continue;
1453
1454                 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START  ******",
1455                                                 GetCurrentRun(), aDetector->GetName()));
1456
1457                 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1458
1459                 Log(fCurrentDetector.Data(), "Process - Starting processing");
1460
1461                 Int_t pid = fork();
1462
1463                 if (pid < 0)
1464                 {
1465                         Log("SHUTTLE", "Process - ERROR: Forking failed");
1466                 }
1467                 else if (pid > 0)
1468                 {
1469                         // parent
1470                         Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1471                                                         GetCurrentRun(), aDetector->GetName()));
1472
1473                         Long_t begin = time(0);
1474
1475                         int status; // to be used with waitpid, on purpose an int (not Int_t)!
1476                         while (waitpid(pid, &status, WNOHANG) == 0)
1477                         {
1478                                 Long_t expiredTime = time(0) - begin;
1479
1480                                 if (expiredTime > fConfig->GetPPTimeOut())
1481                                 {
1482                                         TString tmp;
1483                                         tmp.Form("Process - Process of %s time out. "
1484                                                         "Run time: %d seconds. Killing...",
1485                                                         fCurrentDetector.Data(), expiredTime);
1486                                         Log("SHUTTLE", tmp);
1487                                         Log(fCurrentDetector, tmp);
1488
1489                                         kill(pid, 9);
1490
1491                                         UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1492                                         hasError = kTRUE;
1493
1494                                         gSystem->Sleep(1000);
1495                                 }
1496                                 else
1497                                 {
1498                                         gSystem->Sleep(1000);
1499                                         
1500                                         TString checkStr;
1501                                         checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1502                                         FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1503                                         if (!pipe)
1504                                         {
1505                                                 Log("SHUTTLE", Form("Process - Error: "
1506                                                         "Could not open pipe to %s", checkStr.Data()));
1507                                                 continue;
1508                                         }
1509                                                 
1510                                         char buffer[100];
1511                                         if (!fgets(buffer, 100, pipe))
1512                                         {
1513                                                 Log("SHUTTLE", "Process - Error: ps did not return anything");
1514                                                 gSystem->ClosePipe(pipe);
1515                                                 continue;
1516                                         }
1517                                         gSystem->ClosePipe(pipe);
1518                                         
1519                                         //Log("SHUTTLE", Form("ps returned %s", buffer));
1520                                         
1521                                         Int_t mem = 0;
1522                                         if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1523                                         {
1524                                                 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1525                                                 continue;
1526                                         }
1527                                         
1528                                         if (expiredTime % 60 == 0)
1529                                                 Log("SHUTTLE", Form("Process - %s: Checking process. "
1530                                                         "Run time: %d seconds - Memory consumption: %d KB",
1531                                                         fCurrentDetector.Data(), expiredTime, mem));
1532                                         
1533                                         if (mem > fConfig->GetPPMaxMem())
1534                                         {
1535                                                 TString tmp;
1536                                                 tmp.Form("Process - Process exceeds maximum allowed memory "
1537                                                         "(%d KB > %d KB). Killing...",
1538                                                         mem, fConfig->GetPPMaxMem());
1539                                                 Log("SHUTTLE", tmp);
1540                                                 Log(fCurrentDetector, tmp);
1541         
1542                                                 kill(pid, 9);
1543         
1544                                                 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1545                                                 hasError = kTRUE;
1546         
1547                                                 gSystem->Sleep(1000);
1548                                         }
1549                                 }
1550                         }
1551
1552                         Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1553                                                                 GetCurrentRun(), aDetector->GetName()));
1554
1555                         if (WIFEXITED(status))
1556                         {
1557                                 Int_t returnCode = WEXITSTATUS(status);
1558
1559                                 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1560                                                                                 returnCode));
1561
1562                                 if (returnCode == 0) hasError = kTRUE;
1563                         }
1564                 }
1565                 else if (pid == 0)
1566                 {
1567                         // client
1568                         Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1569                                 aDetector->GetName()));
1570
1571                         Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1572
1573                         if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1574                         {
1575                                 Log("SHUTTLE", "Process - Could not freopen stdout");
1576                         }
1577                         else
1578                         {
1579                                 fOutputRedirected = kTRUE;
1580                                 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1581                                         Log("SHUTTLE", "Process - Could not redirect stderr");
1582                                 
1583                         }
1584                         
1585                         TString wd = gSystem->WorkingDirectory();
1586                         TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(), 
1587                                 fCurrentDetector.Data(), GetCurrentRun());
1588                         
1589                         Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1590                         if (!result) // temp dir already exists!
1591                         {
1592                                 Log(fCurrentDetector.Data(), 
1593                                         Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1594                                 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));         
1595                         } 
1596                         
1597                         if (gSystem->mkdir(tmpDir.Data(), 1))
1598                         {
1599                                 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1600                                 gSystem->Exit(1);
1601                         }
1602                         
1603                         if (!gSystem->ChangeDirectory(tmpDir.Data())) 
1604                         {
1605                                 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1606                                 gSystem->Exit(1);                       
1607                         }
1608                         
1609                         Bool_t success = ProcessCurrentDetector();
1610                         
1611                         gSystem->ChangeDirectory(wd.Data());
1612                         
1613                         gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1614                         
1615                         if (success) // Preprocessor finished successfully!
1616                         { 
1617                                 // Update time_processed field in FXS DB
1618                                 if (UpdateTable() == kFALSE)
1619                                         Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", 
1620                                                         fCurrentDetector.Data()));
1621
1622                                 // Transfer the data from local storage to main storage (Grid)
1623                                 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1624                                 if (StoreOCDB() == kFALSE)
1625                                 {
1626                                         Log("SHUTTLE", 
1627                                                 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1628                                                         GetCurrentRun(), aDetector->GetName()));
1629                                         UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1630                                         success = kFALSE;
1631                                 } else {
1632                                         Log("SHUTTLE", 
1633                                                 Form("\t\t\t****** run %d - %s: DONE ******",
1634                                                         GetCurrentRun(), aDetector->GetName()));
1635                                         UpdateShuttleStatus(AliShuttleStatus::kDone);
1636                                         UpdateShuttleLogbook(fCurrentDetector, "DONE");
1637                                 }
1638                         } else 
1639                         {
1640                                 Log("SHUTTLE", 
1641                                         Form("\t\t\t****** run %d - %s: PP ERROR ******",
1642                                                 GetCurrentRun(), aDetector->GetName()));
1643                         }
1644
1645                         for (UInt_t iSys=0; iSys<3; iSys++)
1646                         {
1647                                 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1648                         }
1649
1650                         Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1651                                                         GetCurrentRun(), aDetector->GetName(), success));
1652
1653                         // the client exits here
1654                         gSystem->Exit(success);
1655
1656                         AliError("We should never get here!!!");
1657                 }
1658         }
1659
1660         Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1661                                                         GetCurrentRun()));
1662
1663         //check if shuttle is done for this run, if so update logbook
1664         TObjArray checkEntryArray;
1665         checkEntryArray.SetOwner(1);
1666         TString whereClause = Form("where run=%d", GetCurrentRun());
1667         if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1668                 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1669                                                 GetCurrentRun()));
1670                 return hasError == kFALSE;
1671         }
1672
1673         AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1674                                                 (checkEntryArray.At(0));
1675
1676         if (checkEntry)
1677         {
1678                 if (checkEntry->IsDone())
1679                 {
1680                         Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1681                         UpdateShuttleLogbook("shuttle_done");
1682                 }
1683                 else
1684                 {
1685                         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1686                         {
1687                                 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1688                                 {
1689                                         AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1690                                                         checkEntry->GetRun(), GetDetName(iDet)));
1691                                         fFirstUnprocessed[iDet] = kFALSE;
1692                                 }
1693                         }
1694                 }
1695         }
1696
1697         // remove ML instance
1698         delete fMonaLisa;
1699         fMonaLisa = 0;
1700
1701         fLogbookEntry = 0;
1702
1703         return hasError == kFALSE;
1704 }
1705
1706 //______________________________________________________________________________________________
1707 Bool_t AliShuttle::ProcessCurrentDetector()
1708 {
1709         //
1710         // Makes data retrieval just for a specific detector (fCurrentDetector).
1711         // Threre should be a configuration for this detector.
1712
1713         Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d", 
1714                                                 fCurrentDetector.Data(), GetCurrentRun()));
1715
1716         TString wd = gSystem->WorkingDirectory();
1717         
1718         if (!CleanReferenceStorage(fCurrentDetector.Data()))
1719                 return kFALSE;
1720         
1721         gSystem->ChangeDirectory(wd.Data());
1722         
1723         TMap* dcsMap = new TMap();
1724
1725         // call preprocessor
1726         AliPreprocessor* aPreprocessor =
1727                 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1728
1729         aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1730
1731         Bool_t processDCS = aPreprocessor->ProcessDCS();
1732
1733         if (!processDCS)
1734         {
1735                 Log(fCurrentDetector, "ProcessCurrentDetector -"
1736                         " The preprocessor requested to skip the retrieval of DCS values");
1737         }
1738         else if (fTestMode & kSkipDCS)
1739         {
1740                 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1741         } 
1742         else if (fTestMode & kErrorDCS)
1743         {
1744                 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1745                 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1746                 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1747                 delete dcsMap;
1748                 return kFALSE;
1749         } else {
1750
1751                 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1752
1753                 // Query DCS archive
1754                 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1755                 
1756                 for (int iServ=0; iServ<nServers; iServ++)
1757                 {
1758                 
1759                         TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1760                         Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1761                         Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1762
1763                         Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1764                                         " Querying DCS Amanda server %s:%d (%d of %d)", 
1765                                         host.Data(), port, iServ+1, nServers));
1766                         
1767                         TMap* aliasMap = 0;
1768                         TMap* dpMap = 0;
1769         
1770                         if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1771                         {
1772                                 aliasMap = GetValueSet(host, port, 
1773                                                 fConfig->GetDCSAliases(fCurrentDetector, iServ), 
1774                                                 kAlias, multiSplit);
1775                                 if (!aliasMap)
1776                                 {
1777                                         Log(fCurrentDetector, 
1778                                                 Form("ProcessCurrentDetector -"
1779                                                         " Error retrieving DCS aliases from server %s."
1780                                                         " Sending mail to DCS experts!", host.Data()));
1781                                         UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1782                                         
1783                                         if (!SendMailToDCS())
1784                                                 Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1785
1786                                         delete dcsMap;
1787                                         return kFALSE;
1788                                 }
1789                         }
1790                         
1791                         if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1792                         {
1793                                 dpMap = GetValueSet(host, port, 
1794                                                 fConfig->GetDCSDataPoints(fCurrentDetector, iServ), 
1795                                                 kDP, multiSplit);
1796                                 if (!dpMap)
1797                                 {
1798                                         Log(fCurrentDetector, 
1799                                                 Form("ProcessCurrentDetector -"
1800                                                         " Error retrieving DCS data points from server %s."
1801                                                         " Sending mail to DCS experts!", host.Data()));
1802                                         UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1803                                         
1804                                         if (!SendMailToDCS())
1805                                                 Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1806                                         
1807                                         if (aliasMap) delete aliasMap;
1808                                         delete dcsMap;
1809                                         return kFALSE;
1810                                 }                               
1811                         }
1812                         
1813                         // merge aliasMap and dpMap into dcsMap
1814                         if(aliasMap) {
1815                                 TIter iter(aliasMap);
1816                                 TObjString* key = 0;
1817                                 while ((key = (TObjString*) iter.Next()))
1818                                         dcsMap->Add(key, aliasMap->GetValue(key->String()));
1819                                 
1820                                 aliasMap->SetOwner(kFALSE);
1821                                 delete aliasMap;
1822                         }       
1823                         
1824                         if(dpMap) {
1825                                 TIter iter(dpMap);
1826                                 TObjString* key = 0;
1827                                 while ((key = (TObjString*) iter.Next()))
1828                                         dcsMap->Add(key, dpMap->GetValue(key->String()));
1829                                 
1830                                 dpMap->SetOwner(kFALSE);
1831                                 delete dpMap;
1832                         }
1833                 }
1834         }
1835         
1836         // DCS Archive DB processing successful. Call Preprocessor!
1837         UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1838
1839         UInt_t returnValue = aPreprocessor->Process(dcsMap);
1840
1841         if (returnValue > 0) // Preprocessor error!
1842         {
1843                 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1844                                 "Preprocessor failed. Process returned %d.", returnValue));
1845                 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1846                 dcsMap->DeleteAll();
1847                 delete dcsMap;
1848                 return kFALSE;
1849         }
1850         
1851         // preprocessor ok!
1852         UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1853         Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1854                                 fCurrentDetector.Data()));
1855
1856         dcsMap->DeleteAll();
1857         delete dcsMap;
1858
1859         return kTRUE;
1860 }
1861
1862 //______________________________________________________________________________________________
1863 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1864                 TObjArray& entries)
1865 {
1866         // Query DAQ's Shuttle logbook and fills detector status object.
1867         // Call QueryRunParameters to query DAQ logbook for run parameters.
1868         //
1869
1870         entries.SetOwner(1);
1871
1872         // check connection, in case connect
1873         if(!Connect(3)) return kFALSE;
1874
1875         TString sqlQuery;
1876         sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1877
1878         TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1879         if (!aResult) {
1880                 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1881                 return kFALSE;
1882         }
1883
1884         AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1885
1886         if(aResult->GetRowCount() == 0) {
1887                 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1888                 delete aResult;
1889                 return kTRUE;
1890         }
1891
1892         // TODO Check field count!
1893         const UInt_t nCols = 23;
1894         if (aResult->GetFieldCount() != (Int_t) nCols) {
1895                 Log("SHUTTLE", "Invalid SQL result field number!");
1896                 delete aResult;
1897                 return kFALSE;
1898         }
1899
1900         TSQLRow* aRow;
1901         while ((aRow = aResult->Next())) {
1902                 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1903                 Int_t run = runString.Atoi();
1904
1905                 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1906                 if (!entry)
1907                         continue;
1908
1909                 // loop on detectors
1910                 for(UInt_t ii = 0; ii < nCols; ii++)
1911                         entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1912
1913                 entries.AddLast(entry);
1914                 delete aRow;
1915         }
1916
1917         delete aResult;
1918         return kTRUE;
1919 }
1920
1921 //______________________________________________________________________________________________
1922 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1923 {
1924         //
1925         // Retrieve run parameters written in the DAQ logbook and sets them into AliShuttleLogbookEntry object
1926         //
1927
1928         // check connection, in case connect
1929         if (!Connect(3))
1930                 return 0;
1931
1932         TString sqlQuery;
1933         sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1934
1935         TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1936         if (!aResult) {
1937                 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
1938                 return 0;
1939         }
1940
1941         if (aResult->GetRowCount() == 0) {
1942                 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1943                 delete aResult;
1944                 return 0;
1945         }
1946
1947         if (aResult->GetRowCount() > 1) {
1948                 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
1949                                 "more than one entry in DAQ Logbook for run %d!", run));
1950                 delete aResult;
1951                 return 0;
1952         }
1953
1954         TSQLRow* aRow = aResult->Next();
1955         if (!aRow)
1956         {
1957                 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
1958                 delete aResult;
1959                 return 0;
1960         }
1961
1962         AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
1963
1964         for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1965                 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
1966
1967         UInt_t startTime = entry->GetStartTime();
1968         UInt_t endTime = entry->GetEndTime();
1969
1970         if (!startTime || !endTime || startTime > endTime) {
1971                 Log("SHUTTLE",
1972                         Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d",
1973                                 run, startTime, endTime));
1974                 delete entry;
1975                 delete aRow;
1976                 delete aResult;
1977                 return 0;
1978         }
1979
1980         delete aRow;
1981         delete aResult;
1982
1983         return entry;
1984 }
1985
1986 //______________________________________________________________________________________________
1987 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
1988                               DCSType type, Int_t multiSplit)
1989 {
1990         // Retrieve all "entry" data points from the DCS server
1991         // host, port: TSocket connection parameters
1992         // entries: list of name of the alias or data point
1993         // type: kAlias or kDP
1994         // returns TMap of values, 0 when failure
1995         
1996         AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
1997
1998         TMap* result = 0;
1999         if (type == kAlias)
2000         {
2001                 result = client.GetAliasValues(entries, GetCurrentStartTime(), 
2002                         GetCurrentEndTime());
2003         } 
2004         else if (type == kDP)
2005         {
2006                 result = client.GetDPValues(entries, GetCurrentStartTime(), 
2007                         GetCurrentEndTime());
2008         }
2009
2010         if (result == 0)
2011         {
2012                 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2013                         client.GetErrorString(client.GetResultErrorCode())));
2014                 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)        
2015                         Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2016                                 client.GetServerError().Data()));
2017
2018                 return 0;
2019         }
2020                 
2021         return result;
2022 }
2023
2024 //______________________________________________________________________________________________
2025 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2026                 const char* id, const char* source)
2027 {
2028         // Get calibration file from file exchange servers
2029         // First queris the FXS database for the file name, using the run, detector, id and source info
2030         // then calls RetrieveFile(filename) for actual copy to local disk
2031         // run: current run being processed (given by Logbook entry fLogbookEntry)
2032         // detector: the Preprocessor name
2033         // id: provided as a parameter by the Preprocessor
2034         // source: provided by the Preprocessor through GetFileSources function
2035
2036         // check if test mode should simulate a FXS error
2037         if (fTestMode & kErrorFXSFiles)
2038         {
2039                 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2040                 return 0;
2041         }
2042         
2043         // check connection, in case connect
2044         if (!Connect(system))
2045         {
2046                 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2047                 return 0;
2048         }
2049
2050         // Query preparation
2051         TString sourceName(source);
2052         Int_t nFields = 3;
2053         TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2054                                                                 fConfig->GetFXSdbTable(system));
2055         TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2056                                                                 GetCurrentRun(), detector, id);
2057
2058         if (system == kDAQ)
2059         {
2060                 whereClause += Form(" and DAQsource=\"%s\"", source);
2061         }
2062         else if (system == kDCS)
2063         {
2064                 sourceName="none";
2065         }
2066         else if (system == kHLT)
2067         {
2068                 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2069                 nFields = 3;
2070         }
2071
2072         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2073
2074         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2075
2076         // Query execution
2077         TSQLResult* aResult = 0;
2078         aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2079         if (!aResult) {
2080                 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
2081                                 GetSystemName(system), id, sourceName.Data()));
2082                 return 0;
2083         }
2084
2085         if(aResult->GetRowCount() == 0)
2086         {
2087                 Log(detector,
2088                         Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
2089                                 GetSystemName(system), id, sourceName.Data()));
2090                 delete aResult;
2091                 return 0;
2092         }
2093
2094         if (aResult->GetRowCount() > 1) {
2095                 Log(detector,
2096                         Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
2097                                 GetSystemName(system), id, sourceName.Data()));
2098                 delete aResult;
2099                 return 0;
2100         }
2101
2102         if (aResult->GetFieldCount() != nFields) {
2103                 Log(detector,
2104                         Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
2105                                 GetSystemName(system), id, sourceName.Data()));
2106                 delete aResult;
2107                 return 0;
2108         }
2109
2110         TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2111
2112         if (!aRow){
2113                 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
2114                                 GetSystemName(system), id, sourceName.Data()));
2115                 delete aResult;
2116                 return 0;
2117         }
2118
2119         TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2120         TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2121         TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2122
2123         delete aResult;
2124         delete aRow;
2125
2126         AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2127                                 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2128
2129         // retrieved file is renamed to make it unique
2130         TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2131                                         GetShuttleTempDir(), detector, GetCurrentRun(),
2132                                         GetSystemName(system), detector, GetCurrentRun(), 
2133                                         id, sourceName.Data());
2134
2135
2136         // file retrieval from FXS
2137         UInt_t nRetries = 0;
2138         UInt_t maxRetries = 3;
2139         Bool_t result = kFALSE;
2140
2141         // copy!! if successful TSystem::Exec returns 0
2142         while(nRetries++ < maxRetries) {
2143                 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
2144                 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2145                 if(!result)
2146                 {
2147                         Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
2148                                         filePath.Data(), GetSystemName(system)));
2149                         continue;
2150                 } 
2151
2152                 if (fileChecksum.Length()>0)
2153                 {
2154                         // compare md5sum of local file with the one stored in the FXS DB
2155                         Int_t md5Comp = gSystem->Exec(Form("md5sum %s |grep %s 2>&1 > /dev/null",
2156                                                 localFileName.Data(), fileChecksum.Data()));
2157
2158                         if (md5Comp != 0)
2159                         {
2160                                 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
2161                                                         filePath.Data()));
2162                                 result = kFALSE;
2163                                 continue;
2164                         }
2165                 } else {
2166                         Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2167                                                         filePath.Data(), GetSystemName(system)));
2168                 }
2169                 if (result) break;
2170         }
2171
2172         if(!result) return 0;
2173
2174         fFXSCalled[system]=kTRUE;
2175         TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2176         fFXSlist[system].Add(fileParams);
2177
2178         static TString staticLocalFileName;
2179         staticLocalFileName.Form("%s", localFileName.Data());
2180         
2181         Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2182                         "source %s from %s to %s", id, source, 
2183                         GetSystemName(system), localFileName.Data()));
2184                         
2185         return staticLocalFileName.Data();
2186 }
2187
2188 //______________________________________________________________________________________________
2189 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2190 {
2191         //
2192         // Copies file from FXS to local Shuttle machine
2193         //
2194
2195         // check temp directory: trying to cd to temp; if it does not exist, create it
2196         AliDebug(2, Form("Copy file %s from %s FXS into %s",
2197                         GetSystemName(system), fxsFileName, localFileName));
2198                         
2199         TString tmpDir(localFileName);
2200         
2201         tmpDir = tmpDir(0,tmpDir.Last('/'));
2202
2203         Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
2204         if (noDir) // temp dir does not exists!
2205         {
2206                 if (gSystem->mkdir(tmpDir.Data(), 1))
2207                 {
2208                         Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2209                         return kFALSE;
2210                 }
2211         }
2212
2213         TString baseFXSFolder;
2214         if (system == kDAQ)
2215         {
2216                 baseFXSFolder = "FES/";
2217         }
2218         else if (system == kDCS)
2219         {
2220                 baseFXSFolder = "";
2221         }
2222         else if (system == kHLT)
2223         {
2224                 baseFXSFolder = "/opt/FXS/";
2225         }
2226
2227
2228         TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2229                 fConfig->GetFXSPort(system),
2230                 fConfig->GetFXSUser(system),
2231                 fConfig->GetFXSHost(system),
2232                 baseFXSFolder.Data(),
2233                 fxsFileName,
2234                 localFileName);
2235
2236         AliDebug(2, Form("%s",command.Data()));
2237
2238         Bool_t result = (gSystem->Exec(command.Data()) == 0);
2239
2240         return result;
2241 }
2242
2243 //______________________________________________________________________________________________
2244 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2245 {
2246         //
2247         // Get sources producing the condition file Id from file exchange servers
2248         // if id is NULL all sources are returned (distinct)
2249         //
2250
2251         Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2252         
2253         // check if test mode should simulate a FXS error
2254         if (fTestMode & kErrorFXSSources)
2255         {
2256                 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2257                 return 0;
2258         }
2259
2260         if (system == kDCS)
2261         {
2262                 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2263                 TList *list = new TList();
2264                 list->SetOwner(1);
2265                 list->Add(new TObjString(" "));
2266                 return list;
2267         }
2268
2269         // check connection, in case connect
2270         if (!Connect(system))
2271         {
2272                 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2273                 return NULL;
2274         }
2275
2276         TString sourceName = 0;
2277         if (system == kDAQ)
2278         {
2279                 sourceName = "DAQsource";
2280         } else if (system == kHLT)
2281         {
2282                 sourceName = "DDLnumbers";
2283         }
2284
2285         TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2286         TString whereClause = Form("run=%d and detector=\"%s\"",
2287                                 GetCurrentRun(), detector);
2288         if (id)
2289                 whereClause += Form(" and fileId=\"%s\"", id);
2290         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2291
2292         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2293
2294         // Query execution
2295         TSQLResult* aResult;
2296         aResult = fServer[system]->Query(sqlQuery);
2297         if (!aResult) {
2298                 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2299                                 GetSystemName(system), id));
2300                 return 0;
2301         }
2302
2303         TList *list = new TList();
2304         list->SetOwner(1);
2305         
2306         if (aResult->GetRowCount() == 0)
2307         {
2308                 Log(detector,
2309                         Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2310                 delete aResult;
2311                 return list;
2312         }
2313
2314         Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2315
2316         TSQLRow* aRow;
2317         while ((aRow = aResult->Next()))
2318         {
2319
2320                 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2321                 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2322                 list->Add(new TObjString(source));
2323                 delete aRow;
2324         }
2325
2326         delete aResult;
2327
2328         return list;
2329 }
2330
2331 //______________________________________________________________________________________________
2332 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2333 {
2334         //
2335         // Get all ids of condition files produced by a given source from file exchange servers
2336         //
2337         
2338         Log(detector, Form("GetFileIDs - Retrieving ids with source %s with %s", source, GetSystemName(system)));
2339
2340         // check if test mode should simulate a FXS error
2341         if (fTestMode & kErrorFXSSources)
2342         {
2343                 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2344                 return 0;
2345         }
2346
2347         // check connection, in case connect
2348         if (!Connect(system))
2349         {
2350                 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2351                 return NULL;
2352         }
2353
2354         TString sourceName = 0;
2355         if (system == kDAQ)
2356         {
2357                 sourceName = "DAQsource";
2358         } else if (system == kHLT)
2359         {
2360                 sourceName = "DDLnumbers";
2361         }
2362
2363         TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2364         TString whereClause = Form("run=%d and detector=\"%s\"",
2365                                 GetCurrentRun(), detector);
2366         if (sourceName.Length() > 0 && source)
2367                 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2368         TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2369
2370         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2371
2372         // Query execution
2373         TSQLResult* aResult;
2374         aResult = fServer[system]->Query(sqlQuery);
2375         if (!aResult) {
2376                 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2377                                 GetSystemName(system), source));
2378                 return 0;
2379         }
2380
2381         TList *list = new TList();
2382         list->SetOwner(1);
2383         
2384         if (aResult->GetRowCount() == 0)
2385         {
2386                 Log(detector,
2387                         Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2388                 delete aResult;
2389                 return list;
2390         }
2391
2392         Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2393
2394         TSQLRow* aRow;
2395
2396         while ((aRow = aResult->Next()))
2397         {
2398
2399                 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2400                 AliDebug(2, Form("fileId = %s", id.Data()));
2401                 list->Add(new TObjString(id));
2402                 delete aRow;
2403         }
2404
2405         delete aResult;
2406
2407         return list;
2408 }
2409
2410 //______________________________________________________________________________________________
2411 Bool_t AliShuttle::Connect(Int_t system)
2412 {
2413         // Connect to MySQL Server of the system's FXS MySQL databases
2414         // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2415         //
2416
2417         // check connection: if already connected return
2418         if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2419
2420         TString dbHost, dbUser, dbPass, dbName;
2421
2422         if (system < 3) // FXS db servers
2423         {
2424                 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2425                 dbUser = fConfig->GetFXSdbUser(system);
2426                 dbPass = fConfig->GetFXSdbPass(system);
2427                 dbName =   fConfig->GetFXSdbName(system);
2428         } else { // Run & Shuttle logbook servers
2429         // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2430                 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2431                 dbUser = fConfig->GetDAQlbUser();
2432                 dbPass = fConfig->GetDAQlbPass();
2433                 dbName =   fConfig->GetDAQlbDB();
2434         }
2435
2436         fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2437         if (!fServer[system] || !fServer[system]->IsConnected()) {
2438                 if(system < 3)
2439                 {
2440                 AliError(Form("Can't establish connection to FXS database for %s",
2441                                         AliShuttleInterface::GetSystemName(system)));
2442                 } else {
2443                 AliError("Can't establish connection to Run logbook.");
2444                 }
2445                 if(fServer[system]) delete fServer[system];
2446                 return kFALSE;
2447         }
2448
2449         // Get tables
2450         TSQLResult* aResult=0;
2451         switch(system){
2452                 case kDAQ:
2453                         aResult = fServer[kDAQ]->GetTables(dbName.Data());
2454                         break;
2455                 case kDCS:
2456                         aResult = fServer[kDCS]->GetTables(dbName.Data());
2457                         break;
2458                 case kHLT:
2459                         aResult = fServer[kHLT]->GetTables(dbName.Data());
2460                         break;
2461                 default:
2462                         aResult = fServer[3]->GetTables(dbName.Data());
2463                         break;
2464         }
2465
2466         delete aResult;
2467         return kTRUE;
2468 }
2469
2470 //______________________________________________________________________________________________
2471 Bool_t AliShuttle::UpdateTable()
2472 {
2473         //
2474         // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2475         //
2476
2477         Bool_t result = kTRUE;
2478
2479         for (UInt_t system=0; system<3; system++)
2480         {
2481                 if(!fFXSCalled[system]) continue;
2482
2483                 // check connection, in case connect
2484                 if (!Connect(system))
2485                 {
2486                         Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2487                         result = kFALSE;
2488                         continue;
2489                 }
2490
2491                 TTimeStamp now; // now
2492
2493                 // Loop on FXS list entries
2494                 TIter iter(&fFXSlist[system]);
2495                 TObjString *aFXSentry=0;
2496                 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2497                 {
2498                         TString aFXSentrystr = aFXSentry->String();
2499                         TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2500                         if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2501                         {
2502                                 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2503                                         GetSystemName(system), aFXSentrystr.Data()));
2504                                 if(aFXSarray) delete aFXSarray;
2505                                 result = kFALSE;
2506                                 continue;
2507                         }
2508                         const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2509                         const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2510
2511                         TString whereClause;
2512                         if (system == kDAQ)
2513                         {
2514                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2515                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2516                         }
2517                         else if (system == kDCS)
2518                         {
2519                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2520                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId);
2521                         }
2522                         else if (system == kHLT)
2523                         {
2524                                 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2525                                                         GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2526                         }
2527
2528                         delete aFXSarray;
2529
2530                         TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2531                                                                 now.GetSec(), whereClause.Data());
2532
2533                         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2534
2535                         // Query execution
2536                         TSQLResult* aResult;
2537                         aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2538                         if (!aResult)
2539                         {
2540                                 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2541                                                                 GetSystemName(system), sqlQuery.Data()));
2542                                 result = kFALSE;
2543                                 continue;
2544                         }
2545                         delete aResult;
2546                 }
2547         }
2548
2549         return result;
2550 }
2551
2552 //______________________________________________________________________________________________
2553 Bool_t AliShuttle::UpdateTableFailCase()
2554 {
2555         // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2556         // this is called in case the preprocessor is declared failed for the current run, because
2557         // the fields are updated only in case of success
2558
2559         Bool_t result = kTRUE;
2560
2561         for (UInt_t system=0; system<3; system++)
2562         {
2563                 // check connection, in case connect
2564                 if (!Connect(system))
2565                 {
2566                         Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2567                                                         GetSystemName(system)));
2568                         result = kFALSE;
2569                         continue;
2570                 }
2571
2572                 TTimeStamp now; // now
2573
2574                 // Loop on FXS list entries
2575
2576                 TString whereClause = Form("where run=%d and detector=\"%s\";",
2577                                                 GetCurrentRun(), fCurrentDetector.Data());
2578
2579
2580                 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2581                                                         now.GetSec(), whereClause.Data());
2582
2583                 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2584
2585                 // Query execution
2586                 TSQLResult* aResult;
2587                 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2588                 if (!aResult)
2589                 {
2590                         Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2591                                                         GetSystemName(system), sqlQuery.Data()));
2592                         result = kFALSE;
2593                         continue;
2594                 }
2595                 delete aResult;
2596         }
2597
2598         return result;
2599 }
2600
2601 //______________________________________________________________________________________________
2602 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2603 {
2604         //
2605         // Update Shuttle logbook filling detector or shuttle_done column
2606         // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2607         //
2608
2609         // check connection, in case connect
2610         if(!Connect(3)){
2611                 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2612                 return kFALSE;
2613         }
2614
2615         TString detName(detector);
2616         TString setClause;
2617         if(detName == "shuttle_done")
2618         {
2619                 setClause = "set shuttle_done=1";
2620
2621                 // Send the information to ML
2622                 TMonaLisaText  mlStatus("SHUTTLE_status", "Done");
2623
2624                 TList mlList;
2625                 mlList.Add(&mlStatus);
2626
2627                 fMonaLisa->SendParameters(&mlList);
2628         } else {
2629                 TString statusStr(status);
2630                 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2631                    statusStr.Contains("failed", TString::kIgnoreCase)){
2632                         setClause = Form("set %s=\"%s\"", detector, status);
2633                 } else {
2634                         Log("SHUTTLE",
2635                                 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2636                                         status, detector));
2637                         return kFALSE;
2638                 }
2639         }
2640
2641         TString whereClause = Form("where run=%d", GetCurrentRun());
2642
2643         TString sqlQuery = Form("update %s %s %s",
2644                                         fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2645
2646         AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2647
2648         // Query execution
2649         TSQLResult* aResult;
2650         aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2651         if (!aResult) {
2652                 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2653                 return kFALSE;
2654         }
2655         delete aResult;
2656
2657         return kTRUE;
2658 }
2659
2660 //______________________________________________________________________________________________
2661 Int_t AliShuttle::GetCurrentRun() const
2662 {
2663         //
2664         // Get current run from logbook entry
2665         //
2666
2667         return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2668 }
2669
2670 //______________________________________________________________________________________________
2671 UInt_t AliShuttle::GetCurrentStartTime() const
2672 {
2673         //
2674         // get current start time
2675         //
2676
2677         return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2678 }
2679
2680 //______________________________________________________________________________________________
2681 UInt_t AliShuttle::GetCurrentEndTime() const
2682 {
2683         //
2684         // get current end time from logbook entry
2685         //
2686
2687         return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2688 }
2689
2690 //______________________________________________________________________________________________
2691 UInt_t AliShuttle::GetCurrentYear() const
2692 {
2693         //
2694         // Get current year from logbook entry
2695         //
2696
2697         if (!fLogbookEntry) return 0;
2698         
2699         TTimeStamp startTime(GetCurrentStartTime());
2700         TString year =  Form("%d",startTime.GetDate());
2701         year = year(0,4);
2702         
2703         return year.Atoi();
2704 }
2705
2706 //______________________________________________________________________________________________
2707 const char* AliShuttle::GetLHCPeriod() const
2708 {
2709         //
2710         // Get current LHC period from logbook entry
2711         //
2712
2713         if (!fLogbookEntry) return 0;
2714                 
2715         return fLogbookEntry->GetRunParameter("LHCperiod");
2716 }
2717
2718 //______________________________________________________________________________________________
2719 void AliShuttle::Log(const char* detector, const char* message)
2720 {
2721         //
2722         // Fill log string with a message
2723         //
2724
2725         void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2726         if (dir == NULL) {
2727                 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2728                         AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
2729                         return;
2730                 }
2731
2732         } else {
2733                 gSystem->FreeDirectory(dir);
2734         }
2735
2736         TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2737         if (GetCurrentRun() >= 0) 
2738                 toLog += Form("run %d - ", GetCurrentRun());
2739         toLog += Form("%s", message);
2740
2741         AliInfo(toLog.Data());
2742         
2743         // if we redirect the log output already to the file, leave here
2744         if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2745                 return;
2746
2747         TString fileName = GetLogFileName(detector);
2748         
2749         gSystem->ExpandPathName(fileName);
2750
2751         ofstream logFile;
2752         logFile.open(fileName, ofstream::out | ofstream::app);
2753
2754         if (!logFile.is_open()) {
2755                 AliError(Form("Could not open file %s", fileName.Data()));
2756                 return;
2757         }
2758
2759         logFile << toLog.Data() << "\n";
2760
2761         logFile.close();
2762 }
2763
2764 //______________________________________________________________________________________________
2765 TString AliShuttle::GetLogFileName(const char* detector) const
2766 {
2767         // 
2768         // returns the name of the log file for a given sub detector
2769         //
2770         
2771         TString fileName;
2772         
2773         if (GetCurrentRun() >= 0) 
2774                 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2775         else
2776                 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2777
2778         return fileName;
2779 }
2780
2781 //______________________________________________________________________________________________
2782 Bool_t AliShuttle::Collect(Int_t run)
2783 {
2784         //
2785         // Collects conditions data for all UNPROCESSED run written to DAQ LogBook in case of run = -1 (default)
2786         // If a dedicated run is given this run is processed
2787         //
2788         // In operational mode, this is the Shuttle function triggered by the EOR signal.
2789         //
2790
2791         if (run == -1)
2792                 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2793         else
2794                 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2795
2796         SetLastAction("Starting");
2797
2798         TString whereClause("where shuttle_done=0");
2799         if (run != -1)
2800                 whereClause += Form(" and run=%d", run);
2801
2802         TObjArray shuttleLogbookEntries;
2803         if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2804         {
2805                 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2806                 return kFALSE;
2807         }
2808
2809         if (shuttleLogbookEntries.GetEntries() == 0)
2810         {
2811                 if (run == -1)
2812                         Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2813                 else
2814                         Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2815                                                 "or it does not exist in Shuttle logbook", run));
2816                 return kTRUE;
2817         }
2818
2819         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2820                 fFirstUnprocessed[iDet] = kTRUE;
2821
2822         if (run != -1)
2823         {
2824                 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2825                 // flag them into fFirstUnprocessed array
2826                 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2827                 TObjArray tmpLogbookEntries;
2828                 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2829                 {
2830                         Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2831                         return kFALSE;
2832                 }
2833
2834                 TIter iter(&tmpLogbookEntries);
2835                 AliShuttleLogbookEntry* anEntry = 0;
2836                 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2837                 {
2838                         for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2839                         {
2840                                 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2841                                 {
2842                                         AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2843                                                         anEntry->GetRun(), GetDetName(iDet)));
2844                                         fFirstUnprocessed[iDet] = kFALSE;
2845                                 }
2846                         }
2847
2848                 }
2849
2850         }
2851
2852         if (!RetrieveConditionsData(shuttleLogbookEntries))
2853         {
2854                 Log("SHUTTLE", "Collect - Process of at least one run failed");
2855                 return kFALSE;
2856         }
2857
2858         Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
2859         return kTRUE;
2860 }
2861
2862 //______________________________________________________________________________________________
2863 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2864 {
2865         //
2866         // Retrieve conditions data for all runs that aren't processed yet
2867         //
2868
2869         Bool_t hasError = kFALSE;
2870
2871         TIter iter(&dateEntries);
2872         AliShuttleLogbookEntry* anEntry;
2873
2874         while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2875                 if (!Process(anEntry)){
2876                         hasError = kTRUE;
2877                 }
2878
2879                 // clean SHUTTLE temp directory
2880                 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2881                 //RemoveFile(filename.Data());
2882         }
2883
2884         return hasError == kFALSE;
2885 }
2886
2887 //______________________________________________________________________________________________
2888 ULong_t AliShuttle::GetTimeOfLastAction() const
2889 {
2890         //
2891         // Gets time of last action
2892         //
2893
2894         ULong_t tmp;
2895
2896         fMonitoringMutex->Lock();
2897
2898         tmp = fLastActionTime;
2899
2900         fMonitoringMutex->UnLock();
2901
2902         return tmp;
2903 }
2904
2905 //______________________________________________________________________________________________
2906 const TString AliShuttle::GetLastAction() const
2907 {
2908         //
2909         // returns a string description of the last action
2910         //
2911
2912         TString tmp;
2913
2914         fMonitoringMutex->Lock();
2915         
2916         tmp = fLastAction;
2917         
2918         fMonitoringMutex->UnLock();
2919
2920         return tmp;
2921 }
2922
2923 //______________________________________________________________________________________________
2924 void AliShuttle::SetLastAction(const char* action)
2925 {
2926         //
2927         // updates the monitoring variables
2928         //
2929
2930         fMonitoringMutex->Lock();
2931
2932         fLastAction = action;
2933         fLastActionTime = time(0);
2934         
2935         fMonitoringMutex->UnLock();
2936 }
2937
2938 //______________________________________________________________________________________________
2939 const char* AliShuttle::GetRunParameter(const char* param)
2940 {
2941         //
2942         // returns run parameter read from DAQ logbook
2943         //
2944
2945         if(!fLogbookEntry) {
2946                 AliError("No logbook entry!");
2947                 return 0;
2948         }
2949
2950         return fLogbookEntry->GetRunParameter(param);
2951 }
2952
2953 //______________________________________________________________________________________________
2954 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
2955 {
2956         //
2957         // returns object from OCDB valid for current run
2958         //
2959
2960         if (fTestMode & kErrorOCDB)
2961         {
2962                 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2963                 return 0;
2964         }
2965         
2966         AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2967         if (!sto)
2968         {
2969                 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
2970                 return 0;
2971         }
2972
2973         return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2974 }
2975
2976 //______________________________________________________________________________________________
2977 Bool_t AliShuttle::SendMail()
2978 {
2979         //
2980         // sends a mail to the subdetector expert in case of preprocessor error
2981         //
2982         
2983         if (fTestMode != kNone)
2984                 return kTRUE;
2985
2986         void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2987         if (dir == NULL)
2988         {
2989                 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
2990                 {
2991                         Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
2992                         return kFALSE;
2993                 }
2994
2995         } else {
2996                 gSystem->FreeDirectory(dir);
2997         }
2998
2999         TString bodyFileName;
3000         bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3001         gSystem->ExpandPathName(bodyFileName);
3002
3003         ofstream mailBody;
3004         mailBody.open(bodyFileName, ofstream::out);
3005
3006         if (!mailBody.is_open())
3007         {
3008                 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3009                 return kFALSE;
3010         }
3011
3012         TString to="";
3013         TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3014         TObjString *anExpert=0;
3015         while ((anExpert = (TObjString*) iterExperts.Next()))
3016         {
3017                 to += Form("%s,", anExpert->GetName());
3018         }
3019         to.Remove(to.Length()-1);
3020         AliDebug(2, Form("to: %s",to.Data()));
3021
3022         if (to.IsNull()) {
3023                 Log("SHUTTLE", "List of detector responsibles not yet set!");
3024                 return kFALSE;
3025         }
3026
3027         TString cc="alberto.colla@cern.ch";
3028
3029         TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
3030                                 fCurrentDetector.Data(), GetCurrentRun());
3031         AliDebug(2, Form("subject: %s", subject.Data()));
3032
3033         TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3034         body += Form("SHUTTLE just detected that your preprocessor "
3035                         "failed processing run %d!!\n\n", GetCurrentRun());
3036         body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
3037         body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3038         body += Form("Find the %s log for the current run on \n\n"
3039                 "\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n", 
3040                 fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
3041         body += Form("The last 10 lines of %s log file are following:\n\n");
3042
3043         AliDebug(2, Form("Body begin: %s", body.Data()));
3044
3045         mailBody << body.Data();
3046         mailBody.close();
3047         mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3048
3049         TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
3050         TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3051         if (gSystem->Exec(tailCommand.Data()))
3052         {
3053                 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3054         }
3055
3056         TString endBody = Form("------------------------------------------------------\n\n");
3057         endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3058         endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3059         endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3060
3061         AliDebug(2, Form("Body end: %s", endBody.Data()));
3062
3063         mailBody << endBody.Data();
3064
3065         mailBody.close();
3066
3067         // send mail!
3068         TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3069                                                 subject.Data(),
3070                                                 cc.Data(),
3071                                                 to.Data(),
3072                                                 bodyFileName.Data());
3073         AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3074
3075         Bool_t result = gSystem->Exec(mailCommand.Data());
3076
3077         return result == 0;
3078 }
3079
3080 //______________________________________________________________________________________________
3081 Bool_t AliShuttle::SendMailToDCS()
3082 {
3083         //
3084         // sends a mail to the DCS experts in case of DCS error
3085         //
3086         
3087         if (fTestMode != kNone)
3088                 return kTRUE;
3089
3090         void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3091         if (dir == NULL)
3092         {
3093                 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3094                 {
3095                         Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3096                         return kFALSE;
3097                 }
3098
3099         } else {
3100                 gSystem->FreeDirectory(dir);
3101         }
3102
3103         TString bodyFileName;
3104         bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3105         gSystem->ExpandPathName(bodyFileName);