/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 * Author: The ALICE Off-line Project. *
 * Contributors are mentioned in the code where appropriate. *
 * Permission to use, copy, modify and distribute this software and its *
 * documentation strictly for non-commercial purposes is hereby granted *
 * without fee, provided that the above copyright notice appears in all *
 * copies and that both the copyright notice and this permission notice *
 * appear in the supporting documentation. The authors make no claims *
 * about the suitability of this software for any purpose. It is *
 * provided "as is" without express or implied warranty. *
 **************************************************************************/
Revision 1.44 2007/05/11 16:09:32 acolla
Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
example: ITS/SPD/100_filename.root

Revision 1.43 2007/05/10 09:59:51 acolla
Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)

Revision 1.42 2007/05/03 08:01:39 jgrosseo
typo in last commit :-(

Revision 1.41 2007/05/03 08:00:48 jgrosseo
fixing log message when pp want to skip dcs value retrieval

Revision 1.40 2007/04/27 07:06:48 jgrosseo
GetFileSources returns empty list in case of no files, but successful query
No mails sent in testmode

Revision 1.39 2007/04/17 12:43:57 acolla
Correction in StoreOCDB; change of text in mail to detector expert

Revision 1.38 2007/04/12 08:26:18 jgrosseo

Revision 1.37 2007/04/10 16:53:14 jgrosseo
redirecting sub detector stdout, stderr to sub detector log file

Revision 1.35 2007/04/04 16:26:38 acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. in AliShuttle.cxx: processing time and memory consumption info on a single line.

Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
If you always need DCS data (like before), you do not need to implement it.

6) The run type has been added to the monitoring page

Revision 1.33 2007/04/03 13:56:01 acolla
Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the

Revision 1.32 2007/02/28 10:41:56 acolla
Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
AliPreprocessor::GetRunType() function.
Added some ldap definition files.

Revision 1.30 2007/02/13 11:23:21 acolla
Moved getters and setters of Shuttle's main OCDB/Reference, local
OCDB/Reference, temp and log folders to AliShuttleInterface

Revision 1.27 2007/01/30 17:52:42 jgrosseo
adding monalisa monitoring

Revision 1.26 2007/01/23 19:20:03 acolla
Removed old ldif files, added TOF, MCH ldif files. Added some options in
AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and

Revision 1.25 2007/01/15 19:13:52 acolla
Moved some AliInfo to AliDebug in SendMail function

Revision 1.21 2006/12/07 08:51:26 jgrosseo
table, db names in ldap configuration
added GRP preprocessor
DCS data can also be retrieved by data point

Revision 1.20 2006/11/16 16:16:48 jgrosseo
introducing strict run ordering flag
removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)

Revision 1.19 2006/11/06 14:23:04 jgrosseo
major update (Alberto)
o) reading of run parameters from the logbook
o) online offline naming conversion
o) standalone DCSclient package

Revision 1.18 2006/10/20 15:22:59 jgrosseo
o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
o) Merging Collect, CollectAll, CollectNew function
o) Removing implementation of empty copy constructors (declaration still there!)

Revision 1.17 2006/10/05 16:20:55 jgrosseo
adapting to new CDB classes

Revision 1.16 2006/10/05 15:46:26 jgrosseo
applying to the new interface

Revision 1.15 2006/10/02 16:38:39 jgrosseo
storing of objects that failed to be stored to the grid before
interfacing of shuttle status table in daq system

Revision 1.14 2006/08/29 09:16:05 jgrosseo

Revision 1.13 2006/08/15 10:50:00 jgrosseo
effc++ corrections (alberto)

Revision 1.12 2006/08/08 14:19:29 jgrosseo
Update to shuttle classes (Alberto)
- Possibility to set the full object's path in the Preprocessor's and
Shuttle's Store functions
- Possibility to extend the object's run validity in the same classes
("startValidity" and "validityInfinite" parameters)
- Implementation of the StoreReferenceData function to store reference
data in a dedicated CDB storage.

Revision 1.11 2006/07/21 07:37:20 jgrosseo
last run is stored after each run

Revision 1.10 2006/07/20 09:54:40 jgrosseo
introducing status management: The processing per subdetector is divided into several steps,
after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
can keep track of the number of failures and skips further processing after a certain threshold is
exceeded. These thresholds can be configured in LDAP.

Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)

Revision 1.8 2006/07/11 12:44:36 jgrosseo
adding parameters for extended validity range of data produced by preprocessor

Revision 1.7 2006/07/10 14:37:09 jgrosseo
small fix + todo comment

Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)

Revision 1.5 2006/07/04 14:59:57 jgrosseo
revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2

Revision 1.4 2006/06/12 09:11:16 jgrosseo
coding conventions (Alberto)

Revision 1.3 2006/06/06 14:26:40 jgrosseo
o) removed files that were moved to STEER
o) shuttle updated to follow the new interface (Alberto)

Revision 1.2 2006/03/07 07:52:34 hristov
New version (B.Yordanov)

Revision 1.6 2005/11/19 17:19:14 byordano
RetrieveDATEEntries and RetrieveConditionsData added

Revision 1.5 2005/11/19 11:09:27 byordano
AliShuttle declaration added

Revision 1.4 2005/11/17 17:47:34 byordano
TList changed to TObjArray

Revision 1.3 2005/11/17 14:43:23 byordano

Revision 1.1.1.1 2005/10/28 07:33:58 hristov
Initial import as subdirectory in AliRoot

Revision 1.2 2005/09/13 08:41:15 byordano
default startTime endTime added

Revision 1.4 2005/08/30 09:13:02 byordano

Revision 1.3 2005/08/29 21:15:47 byordano

// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it is used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically to the underlying AliCDBStorage.
// The alias name is used for detSpec.

#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"
#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>
#include <TSystemDirectory.h>
#include <TSystemFile.h>
#include <TFileMerger.h>
#include <TGridResult.h>
#include <TMonaLisaWriter.h>

#include <sys/types.h>
#include <sys/wait.h>

//______________________________________________________________________________________________
AliShuttle::AliShuttle(const AliShuttleConfig* config,
UInt_t timeout, Int_t retries):
fTimeout(timeout), fRetries(retries),
fReadTestMode(kFALSE),
fOutputRedirected(kFALSE)
// config: AliShuttleConfig used
// timeout: timeout used for AliDCSClient connection
// retries: the number of retries in case of connection error.
if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
for(int iSys=0;iSys<4;iSys++) {
fFXSlist[iSys].SetOwner(kTRUE);
fPreprocessorMap.SetOwner(kTRUE);
for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
fFirstUnprocessed[iDet] = kFALSE;
fMonitoringMutex = new TMutex();

//______________________________________________________________________________________________
AliShuttle::~AliShuttle()
fPreprocessorMap.DeleteAll();
for(int iSys=0;iSys<4;iSys++)
fServer[iSys]->Close();
delete fServer[iSys];
if (fMonitoringMutex)
delete fMonitoringMutex;
fMonitoringMutex = 0;

//______________________________________________________________________________________________
void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
// Registers a new AliPreprocessor.
// GetName() is used as the identifier of the preprocessor.
// The preprocessor is registered only if no other preprocessor
// with the same identifier (GetName()) is already registered.
const char* detName = preprocessor->GetName();
if(GetDetPos(detName) < 0)
AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
if (fPreprocessorMap.GetValue(detName)) {
AliWarning(Form("AliPreprocessor %s is already registered!", detName));
fPreprocessorMap.Add(new TObjString(detName), preprocessor);

//______________________________________________________________________________________________
Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
// Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
// offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
// using this function. Use StoreReferenceData instead!
// It calls the StoreLocally function, which temporarily stores the data locally; when the
// preprocessor finishes, the data are transferred to the main storage (Grid).
return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
// Stores a CDB object in the storage for reference data. These objects will not be available during
// offline reconstruction. Use this function for reference data only!
// It calls the StoreLocally function, which temporarily stores the data locally; when the
// preprocessor finishes, the data are transferred to the main storage (Grid).
return StoreLocally(fgkLocalRefStorage, path, object, metaData);

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreLocally(const TString& localUri,
const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
Int_t validityStart, Bool_t validityInfinite)
// Stores the object temporarily in local storage. Parameters are passed by the Store and
// StoreReferenceData functions. When the preprocessor finishes, the data are transferred to
// the main storage (Grid).
// The parameters are:
// 1) URI of the backup storage (Local)
// 2) the object's path
// 3) the object to be stored
// 4) the metaData to be associated with the object
// 5) the validity start run number w.r.t. the current run;
// if the data is valid only for this run, leave the default 0
// 6) specifies whether the calibration data is valid for infinity (this means until updated),
// typical for calibration runs; the default is kFALSE
// Returns kFALSE on failure, kTRUE otherwise
if (fTestMode & kErrorStorage)
Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
Int_t firstRun = GetCurrentRun() - validityStart;
AliWarning("First valid run happens to be less than 0! Setting it to 0.");
if(validityInfinite) {
lastRun = AliCDBRunRange::Infinity();
lastRun = GetCurrentRun();
// Version is set to current run, it will be used later to transfer data to Grid
AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
TObjString runUsed = Form("%d", GetCurrentRun());
metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
Bool_t result = kFALSE;
if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
result = AliCDBManager::Instance()->GetStorage(localUri)
->Put(object, id, metaData);
Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
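
// Illustrative sketch, not part of AliShuttle: the run-validity range that StoreLocally
// derives from validityStart and validityInfinite, written as a standalone function.
// ValidityRange, ComputeValidity and kInfinity are hypothetical names; kInfinity only
// stands in for AliCDBRunRange::Infinity(), whose actual value is not shown in this file.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
const int kInfinity = std::numeric_limits<int>::max();

struct ValidityRange {
    int firstRun;
    int lastRun;
};

// validityStart counts runs backwards from currentRun (0 means "this run only").
ValidityRange ComputeValidity(int currentRun, int validityStart, bool validityInfinite)
{
    int firstRun = currentRun - validityStart;
    if (firstRun < 0) firstRun = 0;  // clamp, as the warning in StoreLocally describes
    int lastRun = validityInfinite ? kInfinity : currentRun;
    return {firstRun, lastRun};
}
```

// With these assumptions, run 100 with validityStart = 10 and validityInfinite = true
// yields the range [90, kInfinity]; the defaults (0, false) yield [100, 100].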

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB()
// Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
// Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
// Then calls StoreRefFilesToGrid to store reference files.
if (fTestMode & kErrorGrid)
Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
Log("SHUTTLE","Storing OCDB data ...");
Bool_t resultCDB = StoreOCDB(fgkMainCDB);
Log("SHUTTLE","Storing reference data ...");
Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
Log("SHUTTLE","Storing reference files ...");
Bool_t resultRefFiles = StoreRefFilesToGrid();
return resultCDB && resultRef && resultRefFiles;

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
// Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
TObjArray* gridIds=0;
Bool_t result = kTRUE;
const char* type = 0;
if(gridURI == fgkMainCDB) {
localURI = fgkLocalCDB;
} else if(gridURI == fgkMainRefStorage) {
localURI = fgkLocalRefStorage;
AliError(Form("Invalid storage URI: %s", gridURI.Data()));
AliCDBManager* man = AliCDBManager::Instance();
AliCDBStorage *gridSto = man->GetStorage(gridURI);
Form("StoreOCDB - cannot activate main %s storage", type));
gridIds = gridSto->GetQueryCDBList();
// get objects previously stored in local CDB
AliCDBStorage *localSto = man->GetStorage(localURI);
Form("StoreOCDB - cannot activate local %s storage", type));
AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
// Local objects were stored with current run as Grid version!
TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
localEntries->SetOwner(1);
// loop on local stored objects
TIter localIter(localEntries);
AliCDBEntry *aLocEntry = 0;
while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
aLocEntry->SetOwner(1);
AliCDBId aLocId = aLocEntry->GetId();
aLocEntry->SetVersion(-1);
aLocEntry->SetSubVersion(-1);
// If local object is valid up to infinity we store it only if it is
// the first unprocessed run!
if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
!fFirstUnprocessed[GetDetPos(fCurrentDetector)])
Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
"there are previous unprocessed runs!",
fCurrentDetector.Data(), aLocId.GetPath().Data()));
// loop on Grid valid Id's
Bool_t store = kTRUE;
TIter gridIter(gridIds);
AliCDBId* aGridId = 0;
while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
if(aGridId->GetPath() != aLocId.GetPath()) continue;
// skip all objects valid up to infinity
if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
// if we get here, it means there's already some more recent object stored on Grid!
// If we get here, the file can be stored!
Bool_t storeOk = gridSto->Put(aLocEntry);
if(!store || storeOk){
Log(fCurrentDetector.Data(),
Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
type, aGridId->ToString().Data()));
Form("StoreOCDB - Object <%s> successfully put into %s storage",
aLocId.ToString().Data(), type));
Log(fCurrentDetector.Data(),
Form("StoreOCDB - Object <%s> successfully put into %s storage",
aLocId.ToString().Data(), type));
// removing local filename...
localSto->IdToFilename(aLocId, filename);
AliInfo(Form("Removing local file %s", filename.Data()));
RemoveFile(filename.Data());
Form("StoreOCDB - Grid %s storage of object <%s> failed",
type, aLocId.ToString().Data()));
Log(fCurrentDetector.Data(),
Form("StoreOCDB - Grid %s storage of object <%s> failed",
type, aLocId.ToString().Data()));
localEntries->Clear();
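
// Illustrative sketch, not part of AliShuttle: the two checks that the comments in
// StoreOCDB(const TString&) describe for each local object, reduced to a pure function.
// GridId, StoreDecision, DecideStore and kInfinity are hypothetical names; kInfinity
// stands in for AliCDBRunRange::Infinity(). It is an assumption here that the Grid id
// list only contains candidates for "more recent" objects, as the source comment implies.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
const int kInfinity = std::numeric_limits<int>::max();

struct GridId {
    std::string path;
    int lastRun;
};

enum StoreDecision { kStore, kSkipNotFirstUnprocessed, kSkipNewerOnGrid };

StoreDecision DecideStore(const std::string& localPath, int localLastRun,
                          bool firstUnprocessed, const std::vector<GridId>& gridIds)
{
    // An object valid up to infinity is stored only for the first unprocessed run.
    if (localLastRun == kInfinity && !firstUnprocessed)
        return kSkipNotFirstUnprocessed;

    for (const GridId& id : gridIds) {
        if (id.path != localPath) continue;    // different object
        if (id.lastRun == kInfinity) continue; // objects valid up to infinity are skipped
        return kSkipNewerOnGrid;               // a more recent object is already on the Grid
    }
    return kStore;
}
```

// The sketch only models the decision; logging, the actual Put call and the removal of
// the local file after a successful transfer are left out.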

//______________________________________________________________________________________________
Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
// clears the directory used to store reference files of a given subdetector
AliCDBManager* man = AliCDBManager::Instance();
AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
TString localBaseFolder = sto->GetBaseFolder();
TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
begin.Form("%d_", GetCurrentRun());
TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
TList* dirList = baseDir->GetListOfFiles();
if (!dirList) return kTRUE;
if (dirList->GetEntries() < 3)
Int_t nDirs = 0, nDel = 0;
TIter dirIter(dirList);
TSystemFile* entry = 0;
Bool_t success = kTRUE;
while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
if (entry->IsDirectory())
TString fileName(entry->GetName());
if (!fileName.BeginsWith(begin))
Int_t result = gSystem->Unlink(fileName.Data());
Log("SHUTTLE", Form("Could not delete file %s!", fileName.Data()));
Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
nDel, nDirs, targetDir.Data()));
Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
result = gSystem->Exec(Form("rm -r %s", targetDir.Data()));
Log("SHUTTLE", Form("StoreReferenceFile - Could not clear directory %s", targetDir.Data()));
result = gSystem->mkdir(targetDir, kTRUE);
Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
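
// Illustrative sketch, not part of AliShuttle: how the "<run>_" prefix built with
// begin.Form("%d_", GetCurrentRun()) selects the files of the current run. BelongsToRun
// is a hypothetical helper that uses std::string in place of TString::BeginsWith.

```cpp
#include <cassert>
#include <string>

// Returns true if fileName starts with "<run>_".
bool BelongsToRun(const std::string& fileName, int run)
{
    const std::string prefix = std::to_string(run) + "_";
    return fileName.compare(0, prefix.size(), prefix) == 0;
}
```

// Because the underscore is part of the prefix, a file of run 1000 is not mistaken for
// a file of run 100.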

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
// Stores reference file directly (without opening it). This function stores the file locally.
// The file is stored under the following location:
// <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
// where <gridFileName> is the third parameter given to the function
if (fTestMode & kErrorStorage)
Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
AliCDBManager* man = AliCDBManager::Instance();
AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
TString localBaseFolder = sto->GetBaseFolder();
TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
// try to open the folder; create it if it does not exist
void* dir = gSystem->OpenDirectory(targetDir.Data());
if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
Log("SHUTTLE", Form("Can't open directory <%s>", targetDir.Data()));
gSystem->FreeDirectory(dir);
target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
result = gSystem->CopyFile(localFile, target);
Log("SHUTTLE", Form("StoreReferenceFile - File %s stored locally to %s", localFile, target.Data()));
Log("SHUTTLE", Form("StoreReferenceFile - Could not store file %s to %s!. Error code = %d",
localFile, target.Data(), result));

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreRefFilesToGrid()
// Transfers the reference files of the current run to the Grid.
// The files are stored under the following location:
// <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
AliCDBManager* man = AliCDBManager::Instance();
AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
TString localBaseFolder = sto->GetBaseFolder();
TString dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
TString gridBaseFolder = gridSto->GetBaseFolder();
TString alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
begin.Form("%d_", GetCurrentRun());
TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
TList* dirList = baseDir->GetListOfFiles();
if (!dirList) return kTRUE;
if (dirList->GetEntries() < 3)
Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
Int_t nDirs = 0, nTransfer = 0;
TIter dirIter(dirList);
TSystemFile* entry = 0;
Bool_t success = kTRUE;
Bool_t first = kTRUE;
while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
if (entry->IsDirectory())
TString fileName(entry->GetName());
if (!fileName.BeginsWith(begin))
// check that DET folder exists, otherwise create it
TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
if (!gGrid->Mkdir(alienDir.Data(),"",0))
Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
TString fullLocalPath;
fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
TString fullGridPath;
fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
TFileMerger fileMerger;
Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded!", fullLocalPath.Data(), fullGridPath.Data()));
RemoveFile(fullLocalPath);
Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s FAILED!", fullLocalPath.Data(), fullGridPath.Data()));
Log("SHUTTLE", Form("StoreRefFilesToGrid - %d (over %d) reference files in folder %s copied to Grid.", nTransfer, nDirs, dir.Data()));

//______________________________________________________________________________________________
const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
// Get folder name of reference files
TString offDetStr(GetOfflineDetName(detector));
if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
dir.Form("%s/%s", base, offDetStr.Data());
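
// Illustrative sketch, not part of AliShuttle: the folder layout produced by
// GetRefFilePrefix together with the "<RUN#>_<gridFileName>" naming used by
// StoreReferenceFile. RefFilePrefix and RefFileName are hypothetical helpers; the
// offline/online name mapping done by GetOfflineDetName is not reproduced, so both
// names are passed in explicitly.

```cpp
#include <cassert>
#include <string>

// ITS, MUON and PHOS get an extra sub-folder named after the online detector.
std::string RefFilePrefix(const std::string& base, const std::string& offlineDet,
                          const std::string& onlineDet)
{
    if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS")
        return base + "/" + offlineDet + "/" + onlineDet;
    return base + "/" + offlineDet;
}

// Full path of a reference file: <prefix>/<run>_<gridFileName>.
std::string RefFileName(const std::string& prefix, int run, const std::string& gridFileName)
{
    return prefix + "/" + std::to_string(run) + "_" + gridFileName;
}
```

// For base "/ref", offline detector ITS, online detector SPD and run 100 this gives
// /ref/ITS/SPD/100_filename.root, which matches the example in the revision 1.44 log entry.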

//______________________________________________________________________________________________
void AliShuttle::CleanLocalStorage(const TString& uri)
// Called when the preprocessor is declared failed. Removes the remaining objects from the local storages.
const char* type = 0;
if(uri == fgkLocalCDB) {
} else if(uri == fgkLocalRefStorage) {
AliError(Form("Invalid storage URI: %s", uri.Data()));
AliCDBManager* man = AliCDBManager::Instance();
// open local storage
AliCDBStorage *localSto = man->GetStorage(uri);
Form("CleanLocalStorage - cannot activate local %s storage", type));
TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
AliInfo(Form("filename = %s", filename.Data()));
AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
GetCurrentRun(), fCurrentDetector.Data()));
RemoveFile(filename.Data());

//______________________________________________________________________________________________
void AliShuttle::RemoveFile(const char* filename)
// removes local file
TString command(Form("rm -f %s", filename));
Int_t result = gSystem->Exec(command.Data());
Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
fCurrentDetector.Data(), filename));

//______________________________________________________________________________________________
AliShuttleStatus* AliShuttle::ReadShuttleStatus()
// Reads the AliShuttleStatus from the CDB
fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
if (!fStatusEntry) return 0;
fStatusEntry->SetOwner(1);
AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
AliError("Invalid object stored to CDB!");

//______________________________________________________________________________________________
Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
// writes the status for one subdetector
Int_t run = GetCurrentRun();
AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
fStatusEntry->SetOwner(1);
UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
fCurrentDetector.Data(), run));

//______________________________________________________________________________________________
void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
// changes the AliShuttleStatus for the given detector and run to the given status
AliError("UNEXPECTED: fStatusEntry empty");
AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
fCurrentDetector.Data(),
status->GetStatusName(),
status->GetStatusName(newStatus));
Log("SHUTTLE", actionStr);
SetLastAction(actionStr);
status->SetStatus(newStatus);
if (increaseCount) status->IncreaseCount();
AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);

//______________________________________________________________________________________________
void AliShuttle::SendMLInfo()
// sends MonaLisa (ML) information about the status of the detector currently being processed
AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
mlList.Add(&mlStatus);
mlList.Add(&mlRetryCount);
fMonaLisa->SendParameters(&mlList);
1046 //______________________________________________________________________________________________
1047 Bool_t AliShuttle::ContinueProcessing()
1049 // this function reads the AliShuttleStatus information from CDB and
1050 // checks if the processing should be continued
1051 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1053 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1055 AliPreprocessor* aPreprocessor =
1056 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1059 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
1063 AliShuttleLogbookEntry::Status entryStatus =
1064 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1066 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1067 AliInfo(Form("ContinueProcessing - %s is %s",
1068 fCurrentDetector.Data(),
1069 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1073 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1075 // check if current run is first unprocessed run for current detector
1076 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1077 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1079 if (fTestMode == kNone)
1081 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!"));
1086 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues"));
1090 AliShuttleStatus* status = ReadShuttleStatus();
1093 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1094 fCurrentDetector.Data()));
1095 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1096 return WriteShuttleStatus(status);
1099 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1100 // If it happens it may mean Logbook updating failed... let's do it now!
1101 if (status->GetStatus() == AliShuttleStatus::kDone ||
1102 status->GetStatus() == AliShuttleStatus::kFailed){
1103 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1104 fCurrentDetector.Data(),
1105 status->GetStatusName(status->GetStatus())));
1106 UpdateShuttleLogbook(fCurrentDetector.Data(),
1107 status->GetStatusName(status->GetStatus()));
1111 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1113 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1114 fCurrentDetector.Data()));
1115 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1117 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1118 fCurrentDetector.Data()));
1119 UpdateShuttleStatus(AliShuttleStatus::kDone);
1120 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1123 Form("ContinueProcessing - %s: Grid storage failed again",
1124 fCurrentDetector.Data()));
1125 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1130 // if we get here, there is a restart
1131 Bool_t cont = kFALSE;
1134 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1135 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1136 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1137 status->GetCount(), status->GetStatusName()));
1138 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1139 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1141 // there may still be objects in local OCDB and reference storage
1142 // and FXS databases may not be updated: do it now!
1144 // TODO Currently disabled, we want to keep files in case of failure!
1145 // CleanLocalStorage(fgkLocalCDB);
1146 // CleanLocalStorage(fgkLocalRefStorage);
1147 // UpdateTableFailCase();
1149 // Send mail to detector expert!
1150 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1152 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1153 fCurrentDetector.Data()));
1156 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1157 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1158 status->GetStatusName(), status->GetCount()));
1159 Bool_t increaseCount = kTRUE;
1160 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1161 increaseCount = kFALSE;
1162 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
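// Illustrative sketch only, not part of AliShuttle: the restart branch above
// boils down to a small decision rule. Give up once the retry counter reaches
// the configured maximum; otherwise retry, and leave the counter untouched when
// the previous attempt stopped in a DCS-related state. The names Status, Action,
// RestartDecision and DecideRestart are hypothetical, and the sketch assumes
// that "increaseCount" adds exactly one to the stored counter.

```cpp
#include <cassert>

// Hypothetical stand-ins for a few AliShuttleStatus values.
enum class Status { kStarted, kDCSStarted, kDCSError, kPPStarted, kPPError };

enum class Action { kGiveUp, kRetry };

struct RestartDecision {
  Action action;    // what the caller should do next
  int    newCount;  // retry counter to store together with the restarted status
};

// Decide how to restart a detector whose previous attempt was aborted.
RestartDecision DecideRestart(Status previous, int count, int maxRetries)
{
  assert(maxRetries >= 0);

  // Too many attempts already: the detector is declared failed.
  if (count >= maxRetries)
    return {Action::kGiveUp, count};

  // DCS-related aborts do not consume a retry.
  const bool dcsRelated =
      (previous == Status::kDCSError || previous == Status::kDCSStarted);
  return {Action::kRetry, dcsRelated ? count : count + 1};
}
```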
1169 //______________________________________________________________________________________________
1170 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1173 // Makes data retrieval for all detectors in the configuration.
1174 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1175 // (Unprocessed, Inactive, Failed or Done).
1176 // Returns kFALSE if an error occurred and kTRUE otherwise
1179 if (!entry) return kFALSE;
1181 fLogbookEntry = entry;
1183 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1186 // create ML instance that monitors this run
1187 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1188 // disable monitoring of other parameters that come e.g. from TFile
1189 gMonitoringWriter = 0;
1191 // Send the information to ML
1192 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1193 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1196 mlList.Add(&mlStatus);
1197 mlList.Add(&mlRunType);
1199 fMonaLisa->SendParameters(&mlList);
1201 if (fLogbookEntry->IsDone())
1203 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1204 UpdateShuttleLogbook("shuttle_done");
1209 // read test mode if flag is set
1213 TString logEntry(entry->GetRunParameter("log"));
1214 //printf("log entry = %s\n", logEntry.Data());
1215 TString searchStr("Testmode: ");
1216 Int_t pos = logEntry.Index(searchStr.Data());
1217 //printf("%d\n", pos);
1220 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1221 //printf("%s\n", subStr.String().Data());
1222 TString newStr(subStr.Data());
1223 TObjArray* token = newStr.Tokenize(' ');
1227 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1230 Int_t testMode = tmpStr->String().Atoi();
1233 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1234 SetTestMode((TestMode) testMode);
1242 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1244 fLogbookEntry->Print("all");
1247 Bool_t hasError = kFALSE;
1249 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1250 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1251 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1252 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1254 // Loop on detectors in the configuration
1255 TIter iter(fConfig->GetDetectors());
1256 TObjString* aDetector = 0;
1258 while ((aDetector = (TObjString*) iter.Next()))
1260 fCurrentDetector = aDetector->String();
1262 if (ContinueProcessing() == kFALSE) continue;
1264 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1265 GetCurrentRun(), aDetector->GetName()));
1267 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1269 Log(fCurrentDetector.Data(), "Starting processing");
1275 Log("SHUTTLE", "ERROR: Forking failed");
1280 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1281 GetCurrentRun(), aDetector->GetName()));
1283 Long_t begin = time(0);
1285 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1286 while (waitpid(pid, &status, WNOHANG) == 0)
1288 Long_t expiredTime = time(0) - begin;
1290 if (expiredTime > fConfig->GetPPTimeOut())
1293 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1294 fCurrentDetector.Data(), expiredTime);
1295 Log("SHUTTLE", tmp);
1296 Log(fCurrentDetector, tmp);
1300 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1303 gSystem->Sleep(1000);
1307 gSystem->Sleep(1000);
1310 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1311 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1314 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1319 if (!fgets(buffer, 100, pipe))
1321 Log("SHUTTLE", "Error: ps did not return anything");
1322 gSystem->ClosePipe(pipe);
1325 gSystem->ClosePipe(pipe);
1327 //Log("SHUTTLE", Form("ps returned %s", buffer));
1330 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1332 Log("SHUTTLE", "Error: Could not parse output of ps");
1336 if (expiredTime % 60 == 0)
1337 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1338 fCurrentDetector.Data(), expiredTime, mem));
1340 if (mem > fConfig->GetPPMaxMem())
1343 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1344 mem, fConfig->GetPPMaxMem());
1345 Log("SHUTTLE", tmp);
1346 Log(fCurrentDetector, tmp);
1350 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1353 gSystem->Sleep(1000);
1358 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1359 GetCurrentRun(), aDetector->GetName()));
1361 if (WIFEXITED(status))
1363 Int_t returnCode = WEXITSTATUS(status);
1365 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1368 if (returnCode == 0) hasError = kTRUE; // the client exits with 'success', so 0 means failure
1374 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1376 AliInfo("Redirecting output...");
1378 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1380 Log("SHUTTLE", "Could not freopen stdout");
1384 fOutputRedirected = kTRUE;
1385 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1386 Log("SHUTTLE", "Could not redirect stderr");
1390 Bool_t success = ProcessCurrentDetector();
1391 if (success) // Preprocessor finished successfully!
1393 // Update time_processed field in FXS DB
1394 if (UpdateTable() == kFALSE)
1395 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1397 // Transfer the data from local storage to main storage (Grid)
1398 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1399 if (StoreOCDB() == kFALSE)
1401 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1402 GetCurrentRun(), aDetector->GetName()));
1403 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1406 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1407 GetCurrentRun(), aDetector->GetName()));
1408 UpdateShuttleStatus(AliShuttleStatus::kDone);
1409 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1413 for (UInt_t iSys=0; iSys<3; iSys++)
1415 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1418 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
1419 GetCurrentRun(), aDetector->GetName(), success));
1421 // the client exits here
1422 gSystem->Exit(success);
1424 AliError("We should never get here!!!");
1428 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1431 //check if shuttle is done for this run, if so update logbook
1432 TObjArray checkEntryArray;
1433 checkEntryArray.SetOwner(1);
1434 TString whereClause = Form("where run=%d", GetCurrentRun());
1435 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1436 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1438 return hasError == kFALSE;
1441 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1442 (checkEntryArray.At(0));
1446 if (checkEntry->IsDone())
1448 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1449 UpdateShuttleLogbook("shuttle_done");
1453 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1455 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1457 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1458 checkEntry->GetRun(), GetDetName(iDet)));
1459 fFirstUnprocessed[iDet] = kFALSE;
1465 // remove ML instance
1471 return hasError == kFALSE;
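// Illustrative sketch only, not part of AliShuttle: Process() forks one client
// per detector and the parent polls it with waitpid(WNOHANG), killing it when
// it runs for too long. The helper below shows that pattern in isolation.
// RunWithTimeout and its return-code convention are hypothetical, and the
// memory check via "ps" is left out on purpose.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>
#include <cassert>
#include <chrono>

// Runs job() in a forked child and monitors it from the parent.
// Returns the child's exit code, -1 if the child was killed after exceeding
// timeoutMs milliseconds, or -2 if fork/waitpid failed or the child ended
// abnormally.
template <typename Job>
int RunWithTimeout(Job job, long timeoutMs)
{
  assert(timeoutMs >= 0);

  const pid_t pid = fork();
  if (pid < 0) return -2;        // forking failed
  if (pid == 0) _exit(job());    // child: run the job, exit with its result

  const auto begin = std::chrono::steady_clock::now();
  int status = 0;                // plain int, as required by waitpid
  pid_t waited = 0;
  while ((waited = waitpid(pid, &status, WNOHANG)) == 0) {
    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - begin).count();
    if (elapsed > timeoutMs) {
      kill(pid, SIGKILL);        // child overran its time budget
      waitpid(pid, &status, 0);  // reap it so that no zombie is left behind
      return -1;
    }
    usleep(10 * 1000);           // poll every 10 ms
  }
  if (waited < 0) return -2;     // waitpid itself failed
  return WIFEXITED(status) ? WEXITSTATUS(status) : -2;
}
```

// As in Process(), the caller has to agree with the child on what the exit
// code means; there the client exits with 'success', so 0 signals a failure.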
1474 //______________________________________________________________________________________________
1475 Bool_t AliShuttle::ProcessCurrentDetector()
1478 // Makes data retrieval just for a specific detector (fCurrentDetector).
1479 // There should be a configuration for this detector.
1481 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
1483 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1489 Bool_t aDCSError = kFALSE;
1491 // call preprocessor
1492 AliPreprocessor* aPreprocessor =
1493 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1495 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1497 Bool_t processDCS = aPreprocessor->ProcessDCS();
1501 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1503 else if (fTestMode & kSkipDCS)
1505 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
1507 else if (fTestMode & kErrorDCS)
1509 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1510 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1511 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1515 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1517 TString host(fConfig->GetDCSHost(fCurrentDetector));
1518 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1520 // Retrieval of Aliases
1521 TObjString* anAlias = 0;
1523 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
1524 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1525 while ((anAlias = (TObjString*) iterAliases.Next()))
1527 TObjArray *valueSet = new TObjArray();
1528 valueSet->SetOwner(1);
1531 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1535 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1536 AliInfo(Form("Alias %s (%d of %d) - %d values collected",
1537 anAlias->GetName(), iAlias, nTotAliases, valueSet->GetEntriesFast()));
1538 dcsMap.Add(anAlias->Clone(), valueSet);
1540 Log(fCurrentDetector,
1541 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1542 anAlias->GetName()));
1543 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1549 // Retrieval of Data Points
1550 TObjString* aDP = 0;
1552 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
1553 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1554 while ((aDP = (TObjString*) iterDP.Next()))
1556 TObjArray *valueSet = new TObjArray();
1557 valueSet->SetOwner(1);
1558 if (((iDP-1) % 500) == 0 || iDP == nTotDPs)
1559 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1560 aDP->GetName(), iDP++, nTotDPs));
1561 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1565 dcsMap.Add(aDP->Clone(), valueSet);
1567 Log(fCurrentDetector,
1568 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1570 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1577 // DCS Archive DB processing successful. Call Preprocessor!
1578 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1580 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
1582 if (returnValue > 0) // Preprocessor error!
1584 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
1585 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1591 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1592 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1593 fCurrentDetector.Data()));
1600 //______________________________________________________________________________________________
1601 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1604 // Queries DAQ's Shuttle logbook and fills the detector status object.
1605 // Call QueryRunParameters to query DAQ logbook for run parameters.
1608 entries.SetOwner(1);
1610 // check connection; connect if not yet connected
1611 if(!Connect(3)) return kFALSE;
1614 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1616 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1618 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1622 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1624 if(aResult->GetRowCount() == 0) {
1625 AliInfo("No entries in Shuttle Logbook match request");
1630 // TODO Check field count!
1631 const UInt_t nCols = 22;
1632 if (aResult->GetFieldCount() != (Int_t) nCols) {
1633 AliError("Invalid SQL result field number!");
1639 while ((aRow = aResult->Next())) {
1640 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1641 Int_t run = runString.Atoi();
1643 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1647 // loop on detectors
1648 for(UInt_t ii = 0; ii < nCols; ii++)
1649 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1651 entries.AddLast(entry);
1659 //______________________________________________________________________________________________
1660 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1663 // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1666 // check connection; connect if not yet connected
1671 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1673 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1675 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1679 if (aResult->GetRowCount() == 0) {
1680 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1685 if (aResult->GetRowCount() > 1) {
1686 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1691 TSQLRow* aRow = aResult->Next();
1694 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1699 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
1701 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1702 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
1704 UInt_t startTime = entry->GetStartTime();
1705 UInt_t endTime = entry->GetEndTime();
1707 if (!startTime || !endTime || startTime > endTime) {
1709 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d",
1710 run, startTime, endTime));
1723 //______________________________________________________________________________________________
1724 Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1725 TObjArray* valueSet, DCSType type)
1727 // Retrieve all "entry" data points from the DCS server
1728 // host, port: TSocket connection parameters
1729 // entry: name of the alias or data point
1730 // valueSet: array of retrieved AliDCSValue's
1731 // type: kAlias or kDP
1733 AliDCSClient client(host, port, fTimeout, fRetries);
1734 if (!client.IsConnected())
1743 result = client.GetAliasValues(entry,
1744 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1748 result = client.GetDPValues(entry,
1749 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1754 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
1755 entry, AliDCSClient::GetErrorString(result)));
1757 if (result == AliDCSClient::fgkServerError)
1759 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
1760 client.GetServerError().Data()));
1769 //______________________________________________________________________________________________
1770 const char* AliShuttle::GetFile(Int_t system, const char* detector,
1771 const char* id, const char* source)
1773 // Get calibration file from file exchange servers
1774 // First queries the FXS database for the file name, using the run, detector, id and source info
1775 // then calls RetrieveFile(filename) for actual copy to local disk
1776 // run: current run being processed (given by Logbook entry fLogbookEntry)
1777 // detector: the Preprocessor name
1778 // id: provided as a parameter by the Preprocessor
1779 // source: provided by the Preprocessor through GetFileSources function
1781 // check if test mode should simulate a FXS error
1782 if (fTestMode & kErrorFXSFiles)
1784 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1788 // check connection; connect if not yet connected
1789 if (!Connect(system))
1791 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
1795 // Query preparation
1796 TString sourceName(source);
1798 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1799 fConfig->GetFXSdbTable(system));
1800 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1801 GetCurrentRun(), detector, id);
1805 whereClause += Form(" and DAQsource=\"%s\"", source);
1807 else if (system == kDCS)
1811 else if (system == kHLT)
1813 whereClause += Form(" and DDLnumbers=\"%s\"", source);
1817 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1819 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1822 TSQLResult* aResult = 0;
1823 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
1825 Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
1826 GetSystemName(system), id, sourceName.Data()));
1830 if(aResult->GetRowCount() == 0)
1833 Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
1834 GetSystemName(system), id, sourceName.Data()));
1839 if (aResult->GetRowCount() > 1) {
1841 Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
1842 GetSystemName(system), id, sourceName.Data()));
1847 if (aResult->GetFieldCount() != nFields) {
1849 Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
1850 GetSystemName(system), id, sourceName.Data()));
1855 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1858 Log(detector, Form("GetFile - Empty set result in %s FXS db from query: id = %s, source = %s",
1859 GetSystemName(system), id, sourceName.Data()));
1864 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1865 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
1866 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
1871 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1872 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
1874 // retrieved file is renamed to make it unique
1875 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1876 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1879 // file retrieval from FXS
1880 UInt_t nRetries = 0;
1881 UInt_t maxRetries = 3;
1882 Bool_t result = kFALSE;
1884 // copy!! if successful TSystem::Exec returns 0
1885 while(nRetries++ < maxRetries) {
1886 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1887 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1890 Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
1891 filePath.Data(), GetSystemName(system)));
1894 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1895 filePath.Data(), GetSystemName(system),
1896 GetShuttleTempDir(), localFileName.Data()));
1899 if (fileChecksum.Length()>0)
1901 // compare md5sum of local file with the one stored in the FXS DB
1902 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
1903 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
1907 Log(detector, Form("GetFile - md5sum of file %s does not match with local copy!",
1913 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1914 filePath.Data(), GetSystemName(system)));
1919 if(!result) return 0;
1921 fFXSCalled[system]=kTRUE;
1922 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1923 fFXSlist[system].Add(fileParams);
1925 static TString fullLocalFileName;
1926 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1928 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1930 return fullLocalFileName.Data();
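// Illustrative sketch only, not part of AliShuttle: GetFile() tries the copy
// (and the checksum comparison) a bounded number of times and stops at the
// first attempt that succeeds. The helper below isolates that bounded-retry
// loop. RetryCopy is a hypothetical name; "attempt" stands for one complete
// copy-and-verify step.

```cpp
#include <cassert>
#include <functional>

// Calls attempt() at most maxRetries times and stops at the first success.
// Returns the 1-based number of the successful attempt, or 0 if all failed.
unsigned RetryCopy(const std::function<bool()>& attempt, unsigned maxRetries)
{
  assert(attempt);  // an empty std::function cannot be called

  for (unsigned n = 1; n <= maxRetries; ++n) {
    if (attempt()) return n;  // success: no further retries needed
  }
  return 0;                   // every attempt failed
}
```

// Returning the attempt number rather than a plain bool makes it easy to log
// how many retries were needed.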
1934 //______________________________________________________________________________________________
1935 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
1938 // Copies file from FXS to local Shuttle machine
1941 // check temp directory: trying to cd to temp; if it does not exist, create it
1942 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1943 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
1945 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
1947 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1948 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
1953 gSystem->FreeDirectory(dir);
1956 TString baseFXSFolder;
1959 baseFXSFolder = "FES/";
1961 else if (system == kDCS)
1965 else if (system == kHLT)
1967 baseFXSFolder = "~/";
1971 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1972 fConfig->GetFXSPort(system),
1973 fConfig->GetFXSUser(system),
1974 fConfig->GetFXSHost(system),
1975 baseFXSFolder.Data(),
1977 GetShuttleTempDir(),
1980 AliDebug(2, Form("%s",command.Data()));
1982 Bool_t result = (gSystem->Exec(command.Data()) == 0);
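// Illustrative sketch only, not part of AliShuttle: RetrieveFile() assembles a
// single scp command line from the FXS port, user, host, base folder, remote
// file name and local target. The helper below rebuilds the same layout with
// std::string. BuildScpCommand is a hypothetical name, and the values in any
// example call are placeholders, not real ALICE hosts or paths.

```cpp
#include <cassert>
#include <string>

// Builds "scp -oPort=<port> -2 <user>@<host>:<baseFolder><remoteFile> <localDir>/<localFile>".
std::string BuildScpCommand(int port,
                            const std::string& user,
                            const std::string& host,
                            const std::string& baseFolder,
                            const std::string& remoteFile,
                            const std::string& localDir,
                            const std::string& localFile)
{
  assert(port > 0);

  return "scp -oPort=" + std::to_string(port) + " -2 " +
         user + "@" + host + ":" + baseFolder + remoteFile + " " +
         localDir + "/" + localFile;
}
```

// None of the arguments is quoted or escaped here, so the sketch is only safe
// for names without spaces or shell metacharacters.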
1987 //______________________________________________________________________________________________
1988 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1991 // Get sources producing the condition file Id from file exchange servers
1992 // if id is NULL all sources are returned (distinct)
1995 // check if test mode should simulate a FXS error
1996 if (fTestMode & kErrorFXSSources)
1998 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2005 AliError("DCS system has only one source of data!");
2009 // check connection; connect if not yet connected
2010 if (!Connect(system))
2012 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2016 TString sourceName = 0;
2019 sourceName = "DAQsource";
2020 } else if (system == kHLT)
2022 sourceName = "DDLnumbers";
2025 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2026 TString whereClause = Form("run=%d and detector=\"%s\"",
2027 GetCurrentRun(), detector);
2029 whereClause += Form(" and fileId=\"%s\"", id);
2030 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2032 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2035 TSQLResult* aResult;
2036 aResult = fServer[system]->Query(sqlQuery);
2038 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2039 GetSystemName(system), id));
2043 TList *list = new TList();
2046 if (aResult->GetRowCount() == 0)
2049 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2056 while ((aRow = aResult->Next()))
2059 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2060 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2061 list->Add(new TObjString(source));
2070 //______________________________________________________________________________________________
2071 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2074 // Get all ids of condition files produced by a given source from file exchange servers
2077 // check if test mode should simulate a FXS error
2078 if (fTestMode & kErrorFXSSources)
2080 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2084 // check connection; connect if not yet connected
2085 if (!Connect(system))
2087 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2091 TString sourceName = 0;
2094 sourceName = "DAQsource";
2095 } else if (system == kHLT)
2097 sourceName = "DDLnumbers";
2100 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2101 TString whereClause = Form("run=%d and detector=\"%s\"",
2102 GetCurrentRun(), detector);
2103 if (sourceName.Length() > 0 && source)
2104 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2105 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2107 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2110 TSQLResult* aResult;
2111 aResult = fServer[system]->Query(sqlQuery);
2113 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2114 GetSystemName(system), source));
2118 TList *list = new TList();
2121 if (aResult->GetRowCount() == 0)
2124 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2131 while ((aRow = aResult->Next()))
2134 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2135 AliDebug(2, Form("fileId = %s", id.Data()));
2136 list->Add(new TObjString(id));
2145 //______________________________________________________________________________________________
2146 Bool_t AliShuttle::Connect(Int_t system)
2148 // Connect to MySQL Server of the system's FXS MySQL databases
2149 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2152 // check connection: if already connected return
2153 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2155 TString dbHost, dbUser, dbPass, dbName;
2157 if (system < 3) // FXS db servers
2159 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2160 dbUser = fConfig->GetFXSdbUser(system);
2161 dbPass = fConfig->GetFXSdbPass(system);
2162 dbName = fConfig->GetFXSdbName(system);
2163 } else { // Run & Shuttle logbook servers
2164 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2165 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2166 dbUser = fConfig->GetDAQlbUser();
2167 dbPass = fConfig->GetDAQlbPass();
2168 dbName = fConfig->GetDAQlbDB();
2171 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2172 if (!fServer[system] || !fServer[system]->IsConnected()) {
2175 AliError(Form("Can't establish connection to FXS database for %s",
2176 AliShuttleInterface::GetSystemName(system)));
2178 AliError("Can't establish connection to Run logbook.");
2180 if(fServer[system]) delete fServer[system];
2185 TSQLResult* aResult=0;
2188 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2191 aResult = fServer[kDCS]->GetTables(dbName.Data());
2194 aResult = fServer[kHLT]->GetTables(dbName.Data());
2197 aResult = fServer[3]->GetTables(dbName.Data());
2205 //______________________________________________________________________________________________
2206 Bool_t AliShuttle::UpdateTable()
2209 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2212 Bool_t result = kTRUE;
2214 for (UInt_t system=0; system<3; system++)
2216 if(!fFXSCalled[system]) continue;
2218 // check connection; connect if not yet connected
2219 if (!Connect(system))
2221 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2226 TTimeStamp now; // now
2228 // Loop on FXS list entries
2229 TIter iter(&fFXSlist[system]);
2230 TObjString *aFXSentry=0;
2231 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2233 TString aFXSentrystr = aFXSentry->String();
2234 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2235 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2237 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2238 GetSystemName(system), aFXSentrystr.Data()));
2239 if(aFXSarray) delete aFXSarray;
2243 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2244 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2246 TString whereClause;
2249 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2250 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2252 else if (system == kDCS)
2254 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2255 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2257 else if (system == kHLT)
2259 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2260 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2265 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2266 now.GetSec(), whereClause.Data());
2268 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2271 TSQLResult* aResult;
2272 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2275 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2276 GetSystemName(system), sqlQuery.Data()));
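// Illustrative sketch only, not part of AliShuttle: GetFile() records each
// retrieved file as "<id>#!?!#<source>" and UpdateTable() splits that string
// again. TString::Tokenize treats its argument as a set of delimiter
// characters, so an id or source containing '#', '!' or '?' would yield more
// than two tokens and be rejected by the entry-count check. The sketch below
// splits on the whole separator string instead. PackFileParams and
// UnpackFileParams are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>

const std::string kSeparator = "#!?!#";

// Joins file id and source into the single string kept in the FXS list.
std::string PackFileParams(const std::string& id, const std::string& source)
{
  return id + kSeparator + source;
}

// Splits a packed string at the first occurrence of the full separator.
// Returns false, leaving 'out' untouched, if the separator is missing.
bool UnpackFileParams(const std::string& packed,
                      std::pair<std::string, std::string>& out)
{
  const std::string::size_type pos = packed.find(kSeparator);
  if (pos == std::string::npos) return false;

  out.first  = packed.substr(0, pos);                    // file id
  out.second = packed.substr(pos + kSeparator.size());   // source
  return true;
}
```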
2287 //______________________________________________________________________________________________
2288 Bool_t AliShuttle::UpdateTableFailCase()
2290 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2291 // this is called in case the preprocessor is declared failed for the current run, because
2292 // the fields are updated only in case of success
2294 Bool_t result = kTRUE;
2296 for (UInt_t system=0; system<3; system++)
2298 // check connection; connect if not yet connected
2299 if (!Connect(system))
2301 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2302 GetSystemName(system)));
2307 TTimeStamp now; // now
2309 // Update all rows of the current run and detector
2311 TString whereClause = Form("where run=%d and detector=\"%s\";",
2312 GetCurrentRun(), fCurrentDetector.Data());
2315 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2316 now.GetSec(), whereClause.Data());
2318 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2321 TSQLResult* aResult;
2322 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2325 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2326 GetSystemName(system), sqlQuery.Data()));
//______________________________________________________________________________________________
Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
{
	// Update the Shuttle logbook, filling either the given detector column or the shuttle_done column.
	// Example usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")

	// check the connection; connect if necessary
	if (!Connect(3))
	{
		Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
		return kFALSE;
	}

	TString detName(detector);
	TString setClause;
	if(detName == "shuttle_done")
	{
		setClause = "set shuttle_done=1";

		// Send the information to ML
		TMonaLisaText mlStatus("SHUTTLE_status", "Done");

		TList mlList;
		mlList.Add(&mlStatus);

		fMonaLisa->SendParameters(&mlList);
	} else {
		TString statusStr(status);
		if(statusStr.Contains("done", TString::kIgnoreCase) ||
		   statusStr.Contains("failed", TString::kIgnoreCase)){
			setClause = Form("set %s=\"%s\"", detector, status);
		} else {
			Log("SHUTTLE",
				Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
					status, detector));
			return kFALSE;
		}
	}

	TString whereClause = Form("where run=%d", GetCurrentRun());

	TString sqlQuery = Form("update %s %s %s",
		fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());

	AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));

	// Query execution
	TSQLResult* aResult;
	aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
	if (!aResult) {
		Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
		return kFALSE;
	}
	delete aResult;

	return kTRUE;
}

//______________________________________________________________________________________________
Int_t AliShuttle::GetCurrentRun() const
{
	// Get current run from logbook entry (-1 if there is no logbook entry)

	return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentStartTime() const
{
	// Get current start time from logbook entry (0 if there is no logbook entry)

	return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentEndTime() const
{
	// Get current end time from logbook entry (0 if there is no logbook entry)

	return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
}

//______________________________________________________________________________________________
void AliShuttle::Log(const char* detector, const char* message)
{
	// Write a time-stamped message to the AliRoot log and to the detector's log file

	// make sure the log directory exists
	void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
	if (dir == NULL) {
		if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
			AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
			return;
		}
	} else {
		gSystem->FreeDirectory(dir);
	}

	TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
	if (GetCurrentRun() >= 0)
		toLog += Form("run %d - ", GetCurrentRun());
	toLog += Form("%s", message);

	AliInfo(toLog.Data());

	// if the log output is already redirected to the detector's file, stop here
	if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
		return;

	TString fileName = GetLogFileName(detector);

	gSystem->ExpandPathName(fileName);

	ofstream logFile;
	logFile.open(fileName, ofstream::out | ofstream::app);

	if (!logFile.is_open()) {
		AliError(Form("Could not open file %s", fileName.Data()));
		return;
	}

	logFile << toLog.Data() << "\n";

	logFile.close();
}

//______________________________________________________________________________________________
TString AliShuttle::GetLogFileName(const char* detector) const
{
	// returns the name of the log file for a given sub detector;
	// the run number is part of the name only while a run is being processed

	TString fileName;

	if (GetCurrentRun() >= 0)
		fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
	else
		fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);

	return fileName;
}

//______________________________________________________________________________________________
Bool_t AliShuttle::Collect(Int_t run)
{
	// Collects conditions data for all UNPROCESSED runs written to the DAQ logbook if run = -1 (default).
	// If a specific run is given, only that run is processed.
	//
	// In operational mode, this is the Shuttle function triggered by the EOR signal.

	if (run == -1)
		Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
	else
		Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));

	SetLastAction("Starting");

	TString whereClause("where shuttle_done=0");
	if (run != -1)
		whereClause += Form(" and run=%d", run);

	TObjArray shuttleLogbookEntries;
	if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
	{
		Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
		return kFALSE;
	}

	if (shuttleLogbookEntries.GetEntries() == 0)
	{
		if (run == -1)
			Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
		else
			Log("SHUTTLE", Form("Collect - Run %d is already DONE "
				"or it does not exist in Shuttle logbook", run));
		return kTRUE;
	}

	for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
		fFirstUnprocessed[iDet] = kTRUE;

	if (run != -1)
	{
		// query the Shuttle logbook for earlier runs; detectors that are still unprocessed
		// in any of them are flagged in the fFirstUnprocessed array
		TString earlierRunsClause(Form("where shuttle_done=0 and run < %d", run));
		TObjArray tmpLogbookEntries;
		if (!QueryShuttleLogbook(earlierRunsClause, tmpLogbookEntries))
		{
			Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
			return kFALSE;
		}

		TIter iter(&tmpLogbookEntries);
		AliShuttleLogbookEntry* anEntry = 0;
		while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
		{
			for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
			{
				if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
				{
					AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
						anEntry->GetRun(), GetDetName(iDet)));
					fFirstUnprocessed[iDet] = kFALSE;
				}
			}
		}
	}

	if (!RetrieveConditionsData(shuttleLogbookEntries))
	{
		Log("SHUTTLE", "Collect - Process of at least one run failed");
		return kFALSE;
	}

	Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
	return kTRUE;
}

//______________________________________________________________________________________________
Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
{
	// Retrieve conditions data for all runs that aren't processed yet;
	// returns kFALSE if processing failed for at least one run

	Bool_t hasError = kFALSE;

	TIter iter(&dateEntries);
	AliShuttleLogbookEntry* anEntry;

	while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
		if (!Process(anEntry)){
			hasError = kTRUE;
		}

		// clean SHUTTLE temp directory
		TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
		RemoveFile(filename.Data());
	}

	return !hasError;
}

//______________________________________________________________________________________________
ULong_t AliShuttle::GetTimeOfLastAction() const
{
	// Gets time of last action (read under the monitoring mutex)

	ULong_t tmp;

	fMonitoringMutex->Lock();

	tmp = fLastActionTime;

	fMonitoringMutex->UnLock();

	return tmp;
}

//______________________________________________________________________________________________
const TString AliShuttle::GetLastAction() const
{
	// returns a string description of the last action (read under the monitoring mutex)

	TString tmp;

	fMonitoringMutex->Lock();

	tmp = fLastAction;

	fMonitoringMutex->UnLock();

	return tmp;
}

//______________________________________________________________________________________________
void AliShuttle::SetLastAction(const char* action)
{
	// updates the monitoring variables

	fMonitoringMutex->Lock();

	fLastAction = action;
	fLastActionTime = time(0);

	fMonitoringMutex->UnLock();
}

//______________________________________________________________________________________________
const char* AliShuttle::GetRunParameter(const char* param)
{
	// returns run parameter read from DAQ logbook

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunParameter(param);
}

//______________________________________________________________________________________________
AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
{
	// returns object from OCDB valid for current run

	if (fTestMode & kErrorOCDB)
	{
		Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
		return 0;
	}

	AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
	if (!sto)
	{
		Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
		return 0;
	}

	return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
}

//______________________________________________________________________________________________
Bool_t AliShuttle::SendMail()
{
	// sends a mail to the subdetector expert(s) in case of preprocessor error

	// no mails are sent in test mode
	if (fTestMode != kNone)
		return kTRUE;

	// make sure the log directory exists
	void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
	if (dir == NULL)
	{
		if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
		{
			AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
			return kFALSE;
		}
	} else {
		gSystem->FreeDirectory(dir);
	}

	TString bodyFileName;
	bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
	gSystem->ExpandPathName(bodyFileName);

	ofstream mailBody;
	mailBody.open(bodyFileName, ofstream::out);

	if (!mailBody.is_open())
	{
		AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
		return kFALSE;
	}

	// build the comma-separated list of recipients
	TString to="";
	TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
	TObjString *anExpert=0;
	while ((anExpert = (TObjString*) iterExperts.Next()))
	{
		to += Form("%s,", anExpert->GetName());
	}
	if (to.Length() > 0)
		to.Remove(to.Length()-1); // drop the trailing comma
	AliDebug(2, Form("to: %s",to.Data()));

	if (to.IsNull()) {
		AliInfo("List of detector responsibles not yet set!");
		return kFALSE;
	}

	TString cc="alberto.colla@cern.ch";

	TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
		fCurrentDetector.Data(), GetCurrentRun());
	AliDebug(2, Form("subject: %s", subject.Data()));

	TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
	body += Form("SHUTTLE just detected that your preprocessor "
		"failed processing run %d!!\n\n", GetCurrentRun());
	body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
	body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
	body += Form("Find the %s log for the current run on \n\n"
		"\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n",
		fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
	body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());

	AliDebug(2, Form("Body begin: %s", body.Data()));

	mailBody << body.Data();
	// close the file so that the tail output lands after the text above, then reopen it in append mode
	mailBody.close();
	mailBody.open(bodyFileName, ofstream::out | ofstream::app);

	TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
	TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
	if (gSystem->Exec(tailCommand.Data()))
	{
		mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
	}

	TString endBody = Form("------------------------------------------------------\n\n");
	endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
	endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
	endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";

	AliDebug(2, Form("Body end: %s", endBody.Data()));

	mailBody << endBody.Data();

	mailBody.close();

	// send mail!
	TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
		subject.Data(),
		cc.Data(),
		to.Data(),
		bodyFileName.Data());
	AliDebug(2, Form("mail command: %s", mailCommand.Data()));

	// Exec returns the exit status of the command: 0 means success
	Int_t result = gSystem->Exec(mailCommand.Data());

	return result == 0;
}

//______________________________________________________________________________________________
const char* AliShuttle::GetRunType()
{
	// returns run type read from "run type" logbook

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunType();
}

//______________________________________________________________________________________________
void AliShuttle::SetShuttleTempDir(const char* tmpDir)
{
	// sets Shuttle temp directory

	fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
}

//______________________________________________________________________________________________
void AliShuttle::SetShuttleLogDir(const char* logDir)
{
	// sets Shuttle log directory

	fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
}