/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *
 * Author: The ALICE Off-line Project. *
 * Contributors are mentioned in the code where appropriate. *
 *
 * Permission to use, copy, modify and distribute this software and its *
 * documentation strictly for non-commercial purposes is hereby granted *
 * without fee, provided that the above copyright notice appears in all *
 * copies and that both the copyright notice and this permission notice *
 * appear in the supporting documentation. The authors make no claims *
 * about the suitability of this software for any purpose. It is *
 * provided "as is" without express or implied warranty. *
 **************************************************************************/
/*
Revision 1.50 2007/07/02 17:19:32 acolla
preprocessor is run in a temp directory that is removed when process is finished.

Revision 1.49 2007/06/29 10:45:06 acolla
Number of columns in MySql Shuttle logbook increased by one (HLT added)

Revision 1.48 2007/06/21 13:06:19 acolla
GetFileSources returns dummy list with 1 source if system=DCS (better than
returning error as it was)

Revision 1.47 2007/06/19 17:28:56 acolla
HLT updated; missing map bug removed.

Revision 1.46 2007/06/09 13:01:09 jgrosseo
Switching to retrieval of several DCS DPs at a time (multiDPrequest)

Revision 1.45 2007/05/30 06:35:20 jgrosseo
Adding functionality to the Shuttle/TestShuttle:
o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
o) Function to retrieve list of IDs for a given source (GetFileIDs)
These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
Example code has been added to the TestProcessor in TestShuttle

Revision 1.44 2007/05/11 16:09:32 acolla
Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
example: ITS/SPD/100_filename.root

Revision 1.43 2007/05/10 09:59:51 acolla
Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)

Revision 1.42 2007/05/03 08:01:39 jgrosseo
typo in last commit :-(

Revision 1.41 2007/05/03 08:00:48 jgrosseo
fixing log message when pp wants to skip dcs value retrieval

Revision 1.40 2007/04/27 07:06:48 jgrosseo
GetFileSources returns empty list in case of no files, but successful query
No mails sent in testmode

Revision 1.39 2007/04/17 12:43:57 acolla
Correction in StoreOCDB; change of text in mail to detector expert

Revision 1.38 2007/04/12 08:26:18 jgrosseo

Revision 1.37 2007/04/10 16:53:14 jgrosseo
redirecting sub detector stdout, stderr to sub detector log file

Revision 1.35 2007/04/04 16:26:38 acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. in AliShuttle.cxx: processing time and memory consumption info on a single line.

Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
If you always need DCS data (like before), you do not need to implement it.

6) The run type has been added to the monitoring page

Revision 1.33 2007/04/03 13:56:01 acolla
Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the

Revision 1.32 2007/02/28 10:41:56 acolla
Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
AliPreprocessor::GetRunType() function.
Added some ldap definition files.

Revision 1.30 2007/02/13 11:23:21 acolla
Moved getters and setters of Shuttle's main OCDB/Reference, local
OCDB/Reference, temp and log folders to AliShuttleInterface

Revision 1.27 2007/01/30 17:52:42 jgrosseo
adding monalisa monitoring

Revision 1.26 2007/01/23 19:20:03 acolla
Removed old ldif files, added TOF, MCH ldif files. Added some options in
AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and

Revision 1.25 2007/01/15 19:13:52 acolla
Moved some AliInfo to AliDebug in SendMail function

Revision 1.21 2006/12/07 08:51:26 jgrosseo
table, db names in ldap configuration
added GRP preprocessor
DCS data can also be retrieved by data point

Revision 1.20 2006/11/16 16:16:48 jgrosseo
introducing strict run ordering flag
removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)

Revision 1.19 2006/11/06 14:23:04 jgrosseo
major update (Alberto)
o) reading of run parameters from the logbook
o) online offline naming conversion
o) standalone DCSclient package

Revision 1.18 2006/10/20 15:22:59 jgrosseo
o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
o) Merging Collect, CollectAll, CollectNew function
o) Removing implementation of empty copy constructors (declaration still there!)

Revision 1.17 2006/10/05 16:20:55 jgrosseo
adapting to new CDB classes

Revision 1.16 2006/10/05 15:46:26 jgrosseo
applying to the new interface

Revision 1.15 2006/10/02 16:38:39 jgrosseo
storing of objects that failed to be stored to the grid before
interfacing of shuttle status table in daq system

Revision 1.14 2006/08/29 09:16:05 jgrosseo

Revision 1.13 2006/08/15 10:50:00 jgrosseo
effc++ corrections (alberto)

Revision 1.12 2006/08/08 14:19:29 jgrosseo
Update to shuttle classes (Alberto)
- Possibility to set the full object's path in the Preprocessor's and
Shuttle's Store functions
- Possibility to extend the object's run validity in the same classes
("startValidity" and "validityInfinite" parameters)
- Implementation of the StoreReferenceData function to store reference
data in a dedicated CDB storage.

Revision 1.11 2006/07/21 07:37:20 jgrosseo
last run is stored after each run

Revision 1.10 2006/07/20 09:54:40 jgrosseo
introducing status management: The processing per subdetector is divided into several steps,
after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
can keep track of the number of failures and skips further processing after a certain threshold is
exceeded. These thresholds can be configured in LDAP.

Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)

Revision 1.8 2006/07/11 12:44:36 jgrosseo
adding parameters for extended validity range of data produced by preprocessor

Revision 1.7 2006/07/10 14:37:09 jgrosseo
small fix + todo comment

Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)

Revision 1.5 2006/07/04 14:59:57 jgrosseo
revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2

Revision 1.4 2006/06/12 09:11:16 jgrosseo
coding conventions (Alberto)

Revision 1.3 2006/06/06 14:26:40 jgrosseo
o) removed files that were moved to STEER
o) shuttle updated to follow the new interface (Alberto)

Revision 1.2 2006/03/07 07:52:34 hristov
New version (B.Yordanov)

Revision 1.6 2005/11/19 17:19:14 byordano
RetrieveDATEEntries and RetrieveConditionsData added

Revision 1.5 2005/11/19 11:09:27 byordano
AliShuttle declaration added

Revision 1.4 2005/11/17 17:47:34 byordano
TList changed to TObjArray

Revision 1.3 2005/11/17 14:43:23 byordano

Revision 1.1.1.1 2005/10/28 07:33:58 hristov
Initial import as subdirectory in AliRoot

Revision 1.2 2005/09/13 08:41:15 byordano
default startTime endTime added

Revision 1.4 2005/08/30 09:13:02 byordano

Revision 1.3 2005/08/29 21:15:47 byordano
*/
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in the configuration (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it is used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically in the underlying AliCDBStorage.
// The alias name is used as detSpec.
#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"
#include "AliLog.h"              // AliFatal, AliError, AliWarning, AliInfo
#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TSystem.h>             // gSystem
#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>
#include <TMutex.h>              // TMutex
#include <TSystemDirectory.h>
#include <TSystemFile.h>
#include <TFile.h>               // TFile::Cp
#include <TFileMerger.h>
#include <TGrid.h>               // gGrid
#include <TGridResult.h>
#include <TMonaLisaWriter.h>

#include <sys/types.h>
#include <sys/wait.h>
//______________________________________________________________________________________________
AliShuttle::AliShuttle(const AliShuttleConfig* config,
		UInt_t timeout, Int_t retries):
	fTimeout(timeout), fRetries(retries),
	fReadTestMode(kFALSE),
	fOutputRedirected(kFALSE)
	// config: AliShuttleConfig used
	// timeout: timeout used for AliDCSClient connection
	// retries: the number of retries in case of connection error.
	if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
	for(int iSys=0;iSys<4;iSys++) {
	fFXSlist[iSys].SetOwner(kTRUE);
	fPreprocessorMap.SetOwner(kTRUE);
	for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
	fFirstUnprocessed[iDet] = kFALSE;
	fMonitoringMutex = new TMutex();
//______________________________________________________________________________________________
AliShuttle::~AliShuttle()
	fPreprocessorMap.DeleteAll();
	for(int iSys=0;iSys<4;iSys++)
	fServer[iSys]->Close();
	delete fServer[iSys];
	if (fMonitoringMutex)
	delete fMonitoringMutex;
	fMonitoringMutex = 0;
//______________________________________________________________________________________________
void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
	// Registers a new AliPreprocessor.
	// It uses GetName() as the identifier of the preprocessor.
	// The preprocessor is registered only if no other preprocessor
	// with the same identifier (GetName()) is already registered.
	const char* detName = preprocessor->GetName();
	if(GetDetPos(detName) < 0)
	AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
	if (fPreprocessorMap.GetValue(detName)) {
	AliWarning(Form("AliPreprocessor %s is already registered!", detName));
	fPreprocessorMap.Add(new TObjString(detName), preprocessor);
//______________________________________________________________________________________________
Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
		AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
	// Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
	// offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
	// using this function. Use StoreReferenceData instead!
	// It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
	// finishes the data are transferred to the main storage (Grid).
	return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
	// Stores a CDB object in the storage for reference data. These objects will not be available during
	// offline reconstruction. Use this function for reference data only!
	// It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
	// finishes the data are transferred to the main storage (Grid).
	return StoreLocally(fgkLocalRefStorage, path, object, metaData);
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreLocally(const TString& localUri,
		const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
		Int_t validityStart, Bool_t validityInfinite)
	// Stores the object temporarily in local storage. Parameters are passed by the Store and
	// StoreReferenceData functions. When the preprocessor finishes, the data are transferred
	// to the main storage (Grid).
	// The parameters are:
	// 1) URI of the backup storage (Local)
	// 2) the object's path
	// 3) the object to be stored
	// 4) the metaData to be associated with the object
	// 5) the validity start run number w.r.t. the current run;
	// if the data is valid only for this run, leave the default 0
	// 6) specifies if the calibration data is valid for infinity (this means until updated),
	// typical for calibration runs; the default is kFALSE
	// Returns kFALSE on failure, kTRUE otherwise
	if (fTestMode & kErrorStorage)
	Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
	const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
	Int_t firstRun = GetCurrentRun() - validityStart;
	AliWarning("First valid run happens to be less than 0! Setting it to 0.");
	if(validityInfinite) {
	lastRun = AliCDBRunRange::Infinity();
	lastRun = GetCurrentRun();
	// Version is set to current run, it will be used later to transfer data to Grid
	AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
	if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
	TObjString runUsed = Form("%d", GetCurrentRun());
	metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
	Bool_t result = kFALSE;
	if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
	Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
	result = AliCDBManager::Instance()->GetStorage(localUri)
		->Put(object, id, metaData);
	Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
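// Illustration (not part of AliShuttle): a minimal standalone sketch of the run-validity
// computation that StoreLocally appears to perform, written without ROOT so it can be
// compiled on its own. The names RunRange, ComputeValidity and kInfinity are hypothetical;
// kInfinity only stands in for AliCDBRunRange::Infinity(), whose real value is not shown here.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
constexpr int kInfinity = std::numeric_limits<int>::max();

// First and last run for which a stored object is valid.
struct RunRange {
    int first;
    int last;
};

// validityStart is counted backwards from the current run; a negative first run
// is clamped to 0, matching the warning message in StoreLocally.
inline RunRange ComputeValidity(int currentRun, int validityStart, bool validityInfinite)
{
    int first = currentRun - validityStart;
    if (first < 0) first = 0;
    const int last = validityInfinite ? kInfinity : currentRun;
    return RunRange{first, last};
}
```

// With the defaults (validityStart = 0, validityInfinite = false) the object is valid
// for the current run only.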
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB()
	// Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
	// Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
	// Then calls StoreRefFilesToGrid to store reference files.
	if (fTestMode & kErrorGrid)
	Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
	Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
	Log("SHUTTLE","Storing OCDB data ...");
	Bool_t resultCDB = StoreOCDB(fgkMainCDB);
	Log("SHUTTLE","Storing reference data ...");
	Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
	Log("SHUTTLE","Storing reference files ...");
	Bool_t resultRefFiles = StoreRefFilesToGrid();
	return resultCDB && resultRef && resultRefFiles;
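// Illustration (not part of AliShuttle): the checks "fTestMode & kErrorStorage" and
// "fTestMode & kErrorGrid" treat the test mode as a set of bit flags. The sketch below shows
// that pattern in isolation. The numeric values are assumptions chosen for the example; the
// real enumerators are declared in the class header, which is not shown here.

```cpp
#include <cassert>

// Hypothetical flag values, one bit per simulated failure.
enum TestMode : unsigned {
    kNone         = 0,
    kErrorStorage = 1u << 0,
    kErrorGrid    = 1u << 1
};

// True when the given failure should be simulated in the current test mode.
inline bool SimulatesError(unsigned testMode, TestMode flag)
{
    return (testMode & flag) != 0;
}
```

// Because each flag occupies its own bit, several simulated failures can be combined
// with bitwise OR, e.g. kErrorStorage | kErrorGrid.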
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
	// Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
	TObjArray* gridIds=0;
	Bool_t result = kTRUE;
	const char* type = 0;
	if(gridURI == fgkMainCDB) {
	localURI = fgkLocalCDB;
	} else if(gridURI == fgkMainRefStorage) {
	localURI = fgkLocalRefStorage;
	AliError(Form("Invalid storage URI: %s", gridURI.Data()));
	AliCDBManager* man = AliCDBManager::Instance();
	AliCDBStorage *gridSto = man->GetStorage(gridURI);
		Form("StoreOCDB - cannot activate main %s storage", type));
	gridIds = gridSto->GetQueryCDBList();
	// get objects previously stored in local CDB
	AliCDBStorage *localSto = man->GetStorage(localURI);
		Form("StoreOCDB - cannot activate local %s storage", type));
	AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
	// Local objects were stored with current run as Grid version!
	TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
	localEntries->SetOwner(1);
	// loop on local stored objects
	TIter localIter(localEntries);
	AliCDBEntry *aLocEntry = 0;
	while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
	aLocEntry->SetOwner(1);
	AliCDBId aLocId = aLocEntry->GetId();
	aLocEntry->SetVersion(-1);
	aLocEntry->SetSubVersion(-1);
	// If local object is valid up to infinity we store it only if it is
	// the first unprocessed run!
	if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
		!fFirstUnprocessed[GetDetPos(fCurrentDetector)])
	Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
		"there are previous unprocessed runs!",
		fCurrentDetector.Data(), aLocId.GetPath().Data()));
	// loop on Grid valid Id's
	Bool_t store = kTRUE;
	TIter gridIter(gridIds);
	AliCDBId* aGridId = 0;
	while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
	if(aGridId->GetPath() != aLocId.GetPath()) continue;
	// skip all objects valid up to infinity
	if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
	// if we get here, it means there's already some more recent object stored on Grid!
	// If we get here, the file can be stored!
	Bool_t storeOk = gridSto->Put(aLocEntry);
	if(!store || storeOk){
	Log(fCurrentDetector.Data(),
		Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
		type, aGridId->ToString().Data()));
		Form("StoreOCDB - Object <%s> successfully put into %s storage",
		aLocId.ToString().Data(), type));
	Log(fCurrentDetector.Data(),
		Form("StoreOCDB - Object <%s> successfully put into %s storage",
		aLocId.ToString().Data(), type));
	// removing local filename...
	localSto->IdToFilename(aLocId, filename);
	AliInfo(Form("Removing local file %s", filename.Data()));
	RemoveFile(filename.Data());
		Form("StoreOCDB - Grid %s storage of object <%s> failed",
		type, aLocId.ToString().Data()));
	Log(fCurrentDetector.Data(),
		Form("StoreOCDB - Grid %s storage of object <%s> failed",
		type, aLocId.ToString().Data()));
	localEntries->Clear();
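// Illustration (not part of AliShuttle): a standalone sketch of the inner "loop on Grid valid
// Id's" in StoreOCDB(const TString&), which decides whether a more recent object already exists
// on the Grid for a given path. GridId, MoreRecentExists and kInfinity are hypothetical names;
// kInfinity stands in for AliCDBRunRange::Infinity(). Only the two skip rules visible in the
// listing are modelled: a different path is ignored, and an entry valid up to infinity is ignored.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
constexpr int kInfinity = std::numeric_limits<int>::max();

// Minimal model of an object ID already present in the Grid storage.
struct GridId {
    std::string path;
    int lastRun;
};

// Returns true if some Grid entry with the same path has a finite validity,
// which the original code treats as "a more recent object already exists".
inline bool MoreRecentExists(const std::string& localPath, const std::vector<GridId>& gridIds)
{
    for (const GridId& id : gridIds) {
        if (id.path != localPath) continue;     // different object
        if (id.lastRun == kInfinity) continue;  // skip objects valid up to infinity
        return true;
    }
    return false;
}
```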
//______________________________________________________________________________________________
Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
	// clears the directory used to store reference files of a given subdetector
	AliCDBManager* man = AliCDBManager::Instance();
	AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
	TString localBaseFolder = sto->GetBaseFolder();
	TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
	Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
	begin.Form("%d_", GetCurrentRun());
	TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
	TList* dirList = baseDir->GetListOfFiles();
	if (!dirList) return kTRUE;
	if (dirList->GetEntries() < 3)
	Int_t nDirs = 0, nDel = 0;
	TIter dirIter(dirList);
	TSystemFile* entry = 0;
	Bool_t success = kTRUE;
	while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
	if (entry->IsDirectory())
	TString fileName(entry->GetName());
	if (!fileName.BeginsWith(begin))
	Int_t result = gSystem->Unlink(fileName.Data());
	Log("SHUTTLE", Form("Could not delete file %s!", fileName.Data()));
	Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
		nDel, nDirs, targetDir.Data()));
	Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
	result = gSystem->Exec(Form("rm -r %s", targetDir.Data()));
	Log("SHUTTLE", Form("CleanReferenceStorage - Could not clear directory %s", targetDir.Data()));
	result = gSystem->mkdir(targetDir, kTRUE);
	Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
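// Illustration (not part of AliShuttle): both CleanReferenceStorage and StoreRefFilesToGrid
// pick the files of the current run by testing the file name against the prefix "<run>_".
// The standalone sketch below shows that selection on a plain list of names. SelectRunFiles
// is a hypothetical helper, and the directory listing is passed in as a vector instead of
// being read through TSystemDirectory.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the names that start with "<run>_", in their original order.
inline std::vector<std::string> SelectRunFiles(const std::vector<std::string>& names, int run)
{
    const std::string begin = std::to_string(run) + "_";
    std::vector<std::string> selected;
    for (const std::string& name : names) {
        // compare() on the leading begin.size() characters is a prefix test
        if (name.compare(0, begin.size(), begin) == 0) selected.push_back(name);
    }
    return selected;
}
```

// The trailing underscore matters: for run 100 the name "1000_b.root" is not selected,
// because its first four characters are "1000" rather than "100_".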
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
	// Stores reference file directly (without opening it). This function stores the file locally.
	// The file is stored under the following location:
	// <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
	// where <gridFileName> is the third parameter given to the function
	if (fTestMode & kErrorStorage)
	Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
	AliCDBManager* man = AliCDBManager::Instance();
	AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
	TString localBaseFolder = sto->GetBaseFolder();
	TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
	// try to open the folder; create it if it does not exist
	void* dir = gSystem->OpenDirectory(targetDir.Data());
	if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
	Log("SHUTTLE", Form("Can't open directory <%s>", targetDir.Data()));
	gSystem->FreeDirectory(dir);
	target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
	Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
	Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
	result = gSystem->CopyFile(localFile, target);
	Log("SHUTTLE", Form("StoreReferenceFile - File %s stored locally to %s", localFile, target.Data()));
	Log("SHUTTLE", Form("StoreReferenceFile - Could not store file %s to %s!. Error code = %d",
		localFile, target.Data(), result));
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreRefFilesToGrid()
	// Transfers the reference file to the Grid.
	// The files are stored under the following location:
	// <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
	AliCDBManager* man = AliCDBManager::Instance();
	AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
	TString localBaseFolder = sto->GetBaseFolder();
	TString dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
	AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
	TString gridBaseFolder = gridSto->GetBaseFolder();
	TString alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
	begin.Form("%d_", GetCurrentRun());
	TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
	TList* dirList = baseDir->GetListOfFiles();
	if (!dirList) return kTRUE;
	if (dirList->GetEntries() < 3)
	Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
	Int_t nDirs = 0, nTransfer = 0;
	TIter dirIter(dirList);
	TSystemFile* entry = 0;
	Bool_t success = kTRUE;
	Bool_t first = kTRUE;
	while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
	if (entry->IsDirectory())
	TString fileName(entry->GetName());
	if (!fileName.BeginsWith(begin))
	// check that DET folder exists, otherwise create it
	TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
	if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
	if (!gGrid->Mkdir(alienDir.Data(),"",0))
	Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
	Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
	Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
	TString fullLocalPath;
	fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
	TString fullGridPath;
	fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
	TFileMerger fileMerger;
	Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
	Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded!", fullLocalPath.Data(), fullGridPath.Data()));
	RemoveFile(fullLocalPath);
	Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s FAILED!", fullLocalPath.Data(), fullGridPath.Data()));
	Log("SHUTTLE", Form("StoreRefFilesToGrid - %d (over %d) reference files in folder %s copied to Grid.", nTransfer, nDirs, dir.Data()));
//______________________________________________________________________________________________
const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
	// Get folder name of reference files
	TString offDetStr(GetOfflineDetName(detector));
	if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
	dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
	dir.Form("%s/%s", base, offDetStr.Data());
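// Illustration (not part of AliShuttle): a standalone sketch of the folder rule in
// GetRefFilePrefix. For ITS, MUON and PHOS the reference files go to
// <base>/<offline name>/<online name>, for every other detector to <base>/<offline name>
// (compare the revision 1.44 note: ITS/SPD/100_filename.root). RefFilePrefix is a hypothetical
// helper; the online-to-offline name mapping done by GetOfflineDetName is not shown in this
// listing, so both names are passed in explicitly. Returning std::string by value is a choice
// made for the sketch.

```cpp
#include <cassert>
#include <string>

// Builds the reference-file folder for one detector.
inline std::string RefFilePrefix(const std::string& base,
                                 const std::string& offlineDet,
                                 const std::string& onlineDet)
{
    if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS") {
        // these offline detectors get one sub-folder per online detector name
        return base + "/" + offlineDet + "/" + onlineDet;
    }
    return base + "/" + offlineDet;
}
```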
//______________________________________________________________________________________________
void AliShuttle::CleanLocalStorage(const TString& uri)
	// Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
	const char* type = 0;
	if(uri == fgkLocalCDB) {
	} else if(uri == fgkLocalRefStorage) {
	AliError(Form("Invalid storage URI: %s", uri.Data()));
	AliCDBManager* man = AliCDBManager::Instance();
	// open local storage
	AliCDBStorage *localSto = man->GetStorage(uri);
		Form("CleanLocalStorage - cannot activate local %s storage", type));
	TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
		localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
	AliInfo(Form("filename = %s", filename.Data()));
	AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
		GetCurrentRun(), fCurrentDetector.Data()));
	RemoveFile(filename.Data());
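// Illustration (not part of AliShuttle): CleanLocalStorage builds a shell wildcard pattern
// that selects the local files whose version equals the current run (StoreLocally sets the
// version to the current run). The sketch below rebuilds that pattern and checks candidate
// paths against it with POSIX fnmatch. CleanupPattern and MatchesCleanupPattern are
// hypothetical helpers, and the sample file names in the usage note are assumptions about
// the local storage naming, not taken from this listing.

```cpp
#include <cassert>
#include <fnmatch.h>
#include <string>

// Same layout as the Form() call in CleanLocalStorage:
// <base>/<offline det>/*/Run*_v<run>_s*.root
inline std::string CleanupPattern(const std::string& baseFolder,
                                  const std::string& offlineDet, int run)
{
    return baseFolder + "/" + offlineDet + "/*/Run*_v" + std::to_string(run) + "_s*.root";
}

// FNM_PATHNAME keeps '*' from matching '/', similar to shell globbing.
inline bool MatchesCleanupPattern(const std::string& pattern, const std::string& file)
{
    return fnmatch(pattern.c_str(), file.c_str(), FNM_PATHNAME) == 0;
}
```

// Under the naming assumed here, /local/TPC/Calib/Run123_123_v123_s0.root would match the
// pattern for run 123, while a file carrying _v124_ would be left in place.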
//______________________________________________________________________________________________
void AliShuttle::RemoveFile(const char* filename)
	// removes local file
	TString command(Form("rm -f %s", filename));
	Int_t result = gSystem->Exec(command.Data());
	Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
		fCurrentDetector.Data(), filename));
//______________________________________________________________________________________________
AliShuttleStatus* AliShuttle::ReadShuttleStatus()
	// Reads the AliShuttleStatus from the CDB
	fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
		->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
	if (!fStatusEntry) return 0;
	fStatusEntry->SetOwner(1);
	AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
	AliError("Invalid object stored to CDB!");
//______________________________________________________________________________________________
Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
	// writes the status for one subdetector
	Int_t run = GetCurrentRun();
	AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
	fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
	fStatusEntry->SetOwner(1);
	UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
	Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
		fCurrentDetector.Data(), run));
//______________________________________________________________________________________________
void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
	// changes the AliShuttleStatus for the given detector and run to the given status
	AliError("UNEXPECTED: fStatusEntry empty");
	AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
	Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
	TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
		fCurrentDetector.Data(),
		status->GetStatusName(),
		status->GetStatusName(newStatus));
	Log("SHUTTLE", actionStr);
	SetLastAction(actionStr);
	status->SetStatus(newStatus);
	if (increaseCount) status->IncreaseCount();
	AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
//______________________________________________________________________________________________
void AliShuttle::SendMLInfo()
	// sends ML information about the current status of the current detector being processed
	AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
	Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
	TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
	TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
	mlList.Add(&mlStatus);
	mlList.Add(&mlRetryCount);
	fMonaLisa->SendParameters(&mlList);
1070 //______________________________________________________________________________________________
1071 Bool_t AliShuttle::ContinueProcessing()
1073 // this function reads the AliShuttleStatus information from CDB and
1074 // checks if the processing should be continued
1075 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1077 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1079 AliPreprocessor* aPreprocessor =
1080 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1083 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
1087 AliShuttleLogbookEntry::Status entryStatus =
1088 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1090 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1091 AliInfo(Form("ContinueProcessing - %s is %s",
1092 fCurrentDetector.Data(),
1093 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1097 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1099 // check if current run is first unprocessed run for current detector
1100 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1101 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1103 if (fTestMode == kNone)
1105 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
1110 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
1114 AliShuttleStatus* status = ReadShuttleStatus();
1117 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1118 fCurrentDetector.Data()));
1119 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1120 return WriteShuttleStatus(status);
1123 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1124 // If it happens it may mean Logbook updating failed... let's do it now!
1125 if (status->GetStatus() == AliShuttleStatus::kDone ||
1126 status->GetStatus() == AliShuttleStatus::kFailed){
1127 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1128 fCurrentDetector.Data(),
1129 status->GetStatusName(status->GetStatus())));
1130 UpdateShuttleLogbook(fCurrentDetector.Data(),
1131 status->GetStatusName(status->GetStatus()));
1135 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1137 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1138 fCurrentDetector.Data()));
1139 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1141 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1142 fCurrentDetector.Data()));
1143 UpdateShuttleStatus(AliShuttleStatus::kDone);
1144 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1147 Form("ContinueProcessing - %s: Grid storage failed again",
1148 fCurrentDetector.Data()));
1149 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1154 // if we get here, there is a restart
1155 Bool_t cont = kFALSE;
1158 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1159 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1160 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1161 status->GetCount(), status->GetStatusName()));
1162 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1163 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1165 // there may still be objects in local OCDB and reference storage
1166 // and the FXS databases may not have been updated: do it now!
1168 // TODO Currently disabled, we want to keep files in case of failure!
1169 // CleanLocalStorage(fgkLocalCDB);
1170 // CleanLocalStorage(fgkLocalRefStorage);
1171 // UpdateTableFailCase();
1173 // Send mail to detector expert!
1174 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1176 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1177 fCurrentDetector.Data()));
1180 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1181 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1182 status->GetStatusName(), status->GetCount()));
1183 Bool_t increaseCount = kTRUE;
1184 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1185 increaseCount = kFALSE;
1186 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
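// Illustrative sketch, not part of AliShuttle: the restart rule applied at the
// end of ContinueProcessing(), reduced to a free function. The names
// SketchStatus, SketchRestart and SketchDecideRestart are hypothetical, and it
// is assumed that UpdateShuttleStatus() with increaseCount set raises the
// retry counter by exactly one.

```cpp
// Hypothetical stand-in for the AliShuttleStatus values relevant on restart.
enum class SketchStatus { kStarted, kDCSStarted, kDCSError, kPPStarted, kPPError };

struct SketchRestart {
    bool retry;      // false: the detector is declared FAILED for this run
    int  nextCount;  // retry counter after the decision
};

// Decide whether a detector that was aborted in state `previous` is restarted.
SketchRestart SketchDecideRestart(SketchStatus previous, int count, int maxRetries)
{
    // Too many attempts: give up and leave the counter untouched.
    if (count >= maxRetries) return {false, count};

    // Aborts during DCS retrieval are not counted against the detector.
    const bool dcsAbort = (previous == SketchStatus::kDCSStarted ||
                           previous == SketchStatus::kDCSError);
    return {true, dcsAbort ? count : count + 1};
}
```

// With maxRetries = 3, a detector aborted in a preprocessor state is restarted
// with an increased counter until the counter reaches 3; DCS aborts are
// restarted without increasing it.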
1193 //______________________________________________________________________________________________
1194 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1197 // Retrieves data for all detectors in the configuration.
1198 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1199 // (Unprocessed, Inactive, Failed or Done).
1200 // Returns kFALSE if an error occurred and kTRUE otherwise
1203 if (!entry) return kFALSE;
1205 fLogbookEntry = entry;
1207 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1210 // create ML instance that monitors this run
1211 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1212 // disable monitoring of other parameters that come e.g. from TFile
1213 gMonitoringWriter = 0;
1215 // Send the information to ML
1216 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1217 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1220 mlList.Add(&mlStatus);
1221 mlList.Add(&mlRunType);
1223 fMonaLisa->SendParameters(&mlList);
1225 if (fLogbookEntry->IsDone())
1227 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1228 UpdateShuttleLogbook("shuttle_done");
1233 // read test mode if flag is set
1237 TString logEntry(entry->GetRunParameter("log"));
1238 //printf("log entry = %s\n", logEntry.Data());
1239 TString searchStr("Testmode: ");
1240 Int_t pos = logEntry.Index(searchStr.Data());
1241 //printf("%d\n", pos);
1244 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1245 //printf("%s\n", subStr.String().Data());
1246 TString newStr(subStr.Data());
1247 TObjArray* token = newStr.Tokenize(' ');
1251 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1254 Int_t testMode = tmpStr->String().Atoi();
1257 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1258 SetTestMode((TestMode) testMode);
1266 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1268 fLogbookEntry->Print("all");
1271 Bool_t hasError = kFALSE;
1273 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1274 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1275 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1276 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1278 // Loop on detectors in the configuration
1279 TIter iter(fConfig->GetDetectors());
1280 TObjString* aDetector = 0;
1282 while ((aDetector = (TObjString*) iter.Next()))
1284 fCurrentDetector = aDetector->String();
1286 if (ContinueProcessing() == kFALSE) continue;
1288 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1289 GetCurrentRun(), aDetector->GetName()));
1291 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1293 Log(fCurrentDetector.Data(), "Starting processing");
1299 Log("SHUTTLE", "ERROR: Forking failed");
1304 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1305 GetCurrentRun(), aDetector->GetName()));
1307 Long_t begin = time(0);
1309 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1310 while (waitpid(pid, &status, WNOHANG) == 0)
1312 Long_t expiredTime = time(0) - begin;
1314 if (expiredTime > fConfig->GetPPTimeOut())
1317 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1318 fCurrentDetector.Data(), expiredTime);
1319 Log("SHUTTLE", tmp);
1320 Log(fCurrentDetector, tmp);
1324 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1327 gSystem->Sleep(1000);
1331 gSystem->Sleep(1000);
1334 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1335 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1338 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1343 if (!fgets(buffer, 100, pipe))
1345 Log("SHUTTLE", "Error: ps did not return anything");
1346 gSystem->ClosePipe(pipe);
1349 gSystem->ClosePipe(pipe);
1351 //Log("SHUTTLE", Form("ps returned %s", buffer));
1354 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1356 Log("SHUTTLE", "Error: Could not parse output of ps");
1360 if (expiredTime % 60 == 0)
1361 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1362 fCurrentDetector.Data(), expiredTime, mem));
1364 if (mem > fConfig->GetPPMaxMem())
1367 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1368 mem, fConfig->GetPPMaxMem());
1369 Log("SHUTTLE", tmp);
1370 Log(fCurrentDetector, tmp);
1374 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1377 gSystem->Sleep(1000);
1382 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1383 GetCurrentRun(), aDetector->GetName()));
1385 if (WIFEXITED(status))
1387 Int_t returnCode = WEXITSTATUS(status);
1389 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1392 if (returnCode == 0) hasError = kTRUE; // the client exits with `success`, so 0 signals failure
1398 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1400 AliInfo("Redirecting output...");
1402 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1404 Log("SHUTTLE", "Could not freopen stdout");
1408 fOutputRedirected = kTRUE;
1409 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1410 Log("SHUTTLE", "Could not redirect stderr");
1414 TString wd = gSystem->WorkingDirectory();
1415 TString tmpDir = Form("%s/%s_process",GetShuttleTempDir(),fCurrentDetector.Data());
1417 gSystem->mkdir(tmpDir.Data());
1418 gSystem->ChangeDirectory(tmpDir.Data());
1420 Bool_t success = ProcessCurrentDetector();
1422 gSystem->ChangeDirectory(wd.Data());
1424 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1426 if (success) // Preprocessor finished successfully!
1428 // Update time_processed field in FXS DB
1429 if (UpdateTable() == kFALSE)
1430 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1431 fCurrentDetector.Data()));
1433 // Transfer the data from local storage to main storage (Grid)
1434 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1435 if (StoreOCDB() == kFALSE)
1437 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1438 GetCurrentRun(), aDetector->GetName()));
1439 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1442 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1443 GetCurrentRun(), aDetector->GetName()));
1444 UpdateShuttleStatus(AliShuttleStatus::kDone);
1445 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1449 for (UInt_t iSys=0; iSys<3; iSys++)
1451 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1454 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
1455 GetCurrentRun(), aDetector->GetName(), success));
1457 // the client exits here
1458 gSystem->Exit(success);
1460 AliError("We should never get here!!!");
1464 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1467 //check if shuttle is done for this run, if so update logbook
1468 TObjArray checkEntryArray;
1469 checkEntryArray.SetOwner(1);
1470 TString whereClause = Form("where run=%d", GetCurrentRun());
1471 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1472 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1474 return hasError == kFALSE;
1477 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1478 (checkEntryArray.At(0));
1482 if (checkEntry->IsDone())
1484 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1485 UpdateShuttleLogbook("shuttle_done");
1489 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1491 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1493 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1494 checkEntry->GetRun(), GetDetName(iDet)));
1495 fFirstUnprocessed[iDet] = kFALSE;
1501 // remove ML instance
1507 return hasError == kFALSE;
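// Illustrative sketch, not part of AliShuttle: the test-mode lookup done in
// Process() (find "Testmode: " in the run's log entry and convert the first
// blank-delimited token that follows) written with std::string only. The name
// SketchParseTestMode and the -1 "not found" convention are assumptions made
// for this example.

```cpp
#include <cstdlib>
#include <string>

// Returns the integer that follows "Testmode: " in logEntry, or -1 if the
// marker is absent or nothing follows it.
int SketchParseTestMode(const std::string& logEntry)
{
    const std::string marker = "Testmode: ";
    const std::string::size_type pos = logEntry.find(marker);
    if (pos == std::string::npos) return -1;

    // Skip any additional blanks, then cut the first blank-delimited token.
    const std::string::size_type begin =
        logEntry.find_first_not_of(' ', pos + marker.size());
    if (begin == std::string::npos) return -1;
    const std::string::size_type end = logEntry.find(' ', begin);
    const std::string token = logEntry.substr(
        begin, end == std::string::npos ? std::string::npos : end - begin);

    // Like TString::Atoi(), a non-numeric token converts to 0.
    return std::atoi(token.c_str());
}
```

// The returned value corresponds to what Process() casts to TestMode and
// passes to SetTestMode().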
1510 //______________________________________________________________________________________________
1511 Bool_t AliShuttle::ProcessCurrentDetector()
1514 // Retrieves data just for a specific detector (fCurrentDetector).
1515 // There should be a configuration for this detector.
1517 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
1519 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1524 // call preprocessor
1525 AliPreprocessor* aPreprocessor =
1526 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1528 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1530 Bool_t processDCS = aPreprocessor->ProcessDCS();
1534 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1536 else if (fTestMode & kSkipDCS)
1538 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
1540 else if (fTestMode & kErrorDCS)
1542 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1543 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1544 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1548 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1550 TString host(fConfig->GetDCSHost(fCurrentDetector));
1551 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1553 if (fConfig->GetDCSAliases(fCurrentDetector)->GetEntries() > 0)
1555 dcsMap = GetValueSet(host, port, fConfig->GetDCSAliases(fCurrentDetector), kAlias);
1558 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS aliases");
1559 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1564 if (fConfig->GetDCSDataPoints(fCurrentDetector)->GetEntries() > 0)
1566 TMap* dcsMap2 = GetValueSet(host, port, fConfig->GetDCSDataPoints(fCurrentDetector), kDP);
1569 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS data points");
1570 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1582 TIter iter(dcsMap2);
1583 TObjString* key = 0;
1584 while ((key = (TObjString*) iter.Next()))
1585 dcsMap->Add(key, dcsMap2->GetValue(key->String()));
1587 dcsMap2->SetOwner(kFALSE);
1598 // DCS Archive DB processing successful. Call Preprocessor!
1599 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1601 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1603 if (returnValue > 0) // Preprocessor error!
1605 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %u.", returnValue));
1606 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1607 dcsMap->DeleteAll();
1613 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1614 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1615 fCurrentDetector.Data()));
1617 dcsMap->DeleteAll();
1623 //______________________________________________________________________________________________
1624 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1627 // Queries DAQ's Shuttle logbook and fills the detector status objects.
1628 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1631 entries.SetOwner(1);
1633 // check connection; connect if necessary
1634 if(!Connect(3)) return kFALSE;
1637 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1639 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1641 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1645 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1647 if(aResult->GetRowCount() == 0) {
1648 AliInfo("No entries in Shuttle Logbook match request");
1653 // TODO Check field count!
1654 const UInt_t nCols = 23;
1655 if (aResult->GetFieldCount() != (Int_t) nCols) {
1656 AliError("Invalid SQL result field number!");
1662 while ((aRow = aResult->Next())) {
1663 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1664 Int_t run = runString.Atoi();
1666 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1670 // loop on detectors
1671 for(UInt_t ii = 0; ii < nCols; ii++)
1672 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1674 entries.AddLast(entry);
1682 //______________________________________________________________________________________________
1683 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1686 // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1689 // check connection; connect if necessary
1694 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1696 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1698 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1702 if (aResult->GetRowCount() == 0) {
1703 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1708 if (aResult->GetRowCount() > 1) {
1709 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1714 TSQLRow* aRow = aResult->Next();
1717 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1722 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
1724 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1725 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
1727 UInt_t startTime = entry->GetStartTime();
1728 UInt_t endTime = entry->GetEndTime();
1730 if (!startTime || !endTime || startTime > endTime) {
1732 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
1733 run, startTime, endTime));
1746 //______________________________________________________________________________________________
1747 Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1748 TObjArray* valueSet, DCSType type)
1750 // Retrieve all "entry" data points from the DCS server
1751 // host, port: TSocket connection parameters
1752 // entry: name of the alias or data point
1753 // valueSet: array of retrieved AliDCSValue's
1754 // type: kAlias or kDP
1756 AliDCSClient client(host, port, fTimeout, fRetries);
1757 if (!client.IsConnected())
1766 result = client.GetAliasValues(entry,
1767 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1771 result = client.GetDPValues(entry,
1772 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1777 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
1778 entry, AliDCSClient::GetErrorString(result)));
1780 if (result == AliDCSClient::fgkServerError)
1782 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
1783 client.GetServerError().Data()));
1792 //______________________________________________________________________________________________
1793 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
1796 // Retrieve the values of all "entries" from the DCS server, in blocks of kSplit
1797 // host, port: TSocket connection parameters
1798 // entries: list of alias or data point names
1799 // type: kAlias or kDP
1800 // returns a TMap of values, or 0 on failure
1802 const Int_t kSplit = 100; // maximum number of DPs at a time
1804 Int_t totalEntries = entries->GetEntries();
1808 for (Int_t index=0; index < totalEntries; index += kSplit)
1810 Int_t endIndex = index + kSplit;
1812 AliDCSClient client(host, port, fTimeout, fRetries);
1813 if (!client.IsConnected())
1816 TMap* partialResult = 0;
1820 partialResult = client.GetAliasValues(entries, GetCurrentStartTime(),
1821 GetCurrentEndTime(), index, endIndex);
1823 else if (type == kDP)
1825 partialResult = client.GetDPValues(entries, GetCurrentStartTime(),
1826 GetCurrentEndTime(), index, endIndex);
1829 if (partialResult == 0)
1831 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries (%d...%d)! Reason: %s",
1832 index, endIndex, client.GetServerError().Data()));
1840 AliInfo(Form("Retrieved entries %d..%d (total %d); E.g. %s has %d values collected",
1841 index, endIndex, totalEntries, entries->At(index)->GetName(), ((TObjArray*)
1842 partialResult->GetValue(entries->At(index)->GetName()))->GetEntriesFast()));
1846 result = partialResult;
1850 TIter iter(partialResult);
1851 TObjString* key = 0;
1852 while ((key = (TObjString*) iter.Next()))
1853 result->Add(key, partialResult->GetValue(key->String()));
1855 partialResult->SetOwner(kFALSE);
1856 delete partialResult;
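// Illustrative sketch, not part of AliShuttle: the block-wise request pattern
// of GetValueSet() (at most kSplit entries per DCS request) expressed as a
// list of half-open index ranges. SketchSplitRanges is a hypothetical helper.
// Unlike the loop above, which passes index + kSplit unchanged, this sketch
// clamps the last range to the number of entries; whether AliDCSClient does
// that clamping itself is not visible here.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns the ranges [first, second) that cover 0..total in steps of split.
std::vector<std::pair<int, int> > SketchSplitRanges(int total, int split)
{
    std::vector<std::pair<int, int> > ranges;
    if (split <= 0) return ranges;  // guard against an endless loop

    for (int index = 0; index < total; index += split)
        ranges.push_back(std::make_pair(index, std::min(index + split, total)));

    return ranges;
}
```

// For 250 entries and a split of 100 this yields the three ranges
// [0,100), [100,200) and [200,250), i.e. three separate client requests.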
1863 //______________________________________________________________________________________________
1864 const char* AliShuttle::GetFile(Int_t system, const char* detector,
1865 const char* id, const char* source)
1867 // Get calibration file from file exchange servers
1868 // First queries the FXS database for the file name, using the run, detector, id and source info
1869 // then calls RetrieveFile(filename) for actual copy to local disk
1870 // run: current run being processed (given by Logbook entry fLogbookEntry)
1871 // detector: the Preprocessor name
1872 // id: provided as a parameter by the Preprocessor
1873 // source: provided by the Preprocessor through GetFileSources function
1875 // check if test mode should simulate a FXS error
1876 if (fTestMode & kErrorFXSFiles)
1878 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1882 // check connection; connect if necessary
1883 if (!Connect(system))
1885 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
1889 // Query preparation
1890 TString sourceName(source);
1892 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1893 fConfig->GetFXSdbTable(system));
1894 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1895 GetCurrentRun(), detector, id);
1899 whereClause += Form(" and DAQsource=\"%s\"", source);
1901 else if (system == kDCS)
1905 else if (system == kHLT)
1907 whereClause += Form(" and DDLnumbers=\"%s\"", source);
1911 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1913 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1916 TSQLResult* aResult = 0;
1917 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
1919 Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
1920 GetSystemName(system), id, sourceName.Data()));
1924 if(aResult->GetRowCount() == 0)
1927 Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
1928 GetSystemName(system), id, sourceName.Data()));
1933 if (aResult->GetRowCount() > 1) {
1935 Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
1936 GetSystemName(system), id, sourceName.Data()));
1941 if (aResult->GetFieldCount() != nFields) {
1943 Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
1944 GetSystemName(system), id, sourceName.Data()));
1949 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1952 Log(detector, Form("GetFile - Empty result set in %s FXS db from query: id = %s, source = %s",
1953 GetSystemName(system), id, sourceName.Data()));
1958 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1959 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
1960 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
1965 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1966 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
1968 // retrieved file is renamed to make it unique
1969 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1970 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1973 // file retrieval from FXS
1974 UInt_t nRetries = 0;
1975 UInt_t maxRetries = 3;
1976 Bool_t result = kFALSE;
1978 // copy!! if successful TSystem::Exec returns 0
1979 while(nRetries++ < maxRetries) {
1980 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1981 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1984 Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
1985 filePath.Data(), GetSystemName(system)));
1988 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1989 filePath.Data(), GetSystemName(system),
1990 GetShuttleTempDir(), localFileName.Data()));
1993 if (fileChecksum.Length()>0)
1995 // compare md5sum of local file with the one stored in the FXS DB
1996 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
1997 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
2001 Log(detector, Form("GetFile - md5sum of file %s does not match the local copy!",
2007 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2008 filePath.Data(), GetSystemName(system)));
2013 if(!result) return 0;
2015 fFXSCalled[system]=kTRUE;
2016 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2017 fFXSlist[system].Add(fileParams);
2019 static TString fullLocalFileName;
2020 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
2022 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
2024 return fullLocalFileName.Data();
2028 //______________________________________________________________________________________________
2029 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2032 // Copies file from FXS to local Shuttle machine
2035 // check temp directory: trying to cd to temp; if it does not exist, create it
2036 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
2037 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
2039 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
2041 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
2042 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
2047 gSystem->FreeDirectory(dir);
2050 TString baseFXSFolder;
2053 baseFXSFolder = "FES/";
2055 else if (system == kDCS)
2059 else if (system == kHLT)
2061 baseFXSFolder = "/opt/FXS";
2065 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
2066 fConfig->GetFXSPort(system),
2067 fConfig->GetFXSUser(system),
2068 fConfig->GetFXSHost(system),
2069 baseFXSFolder.Data(),
2071 GetShuttleTempDir(),
2074 AliDebug(2, Form("%s",command.Data()));
2076 Bool_t result = (gSystem->Exec(command.Data()) == 0);
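// Illustrative sketch, not part of AliShuttle: how the scp command line used
// by RetrieveFile() is assembled from its parts. SketchBuildScpCommand and all
// argument values in the usage note are hypothetical; no shell quoting is
// applied, so the sketch assumes paths without blanks or shell metacharacters.

```cpp
#include <sstream>
#include <string>

// Builds "scp -oPort=<port> -2 <user>@<host>:<baseFolder><fxsFileName> <tempDir>/<localFileName>".
std::string SketchBuildScpCommand(int port,
                                  const std::string& user,
                                  const std::string& host,
                                  const std::string& baseFolder,
                                  const std::string& fxsFileName,
                                  const std::string& tempDir,
                                  const std::string& localFileName)
{
    std::ostringstream command;
    command << "scp -oPort=" << port << " -2 "
            << user << '@' << host << ':' << baseFolder << fxsFileName  // remote source
            << ' ' << tempDir << '/' << localFileName;                   // local target
    return command.str();
}
```

// The base folder and the FXS file name are concatenated without a separator,
// so the base folder is expected to end with '/' or the file name to start
// with one.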
2081 //______________________________________________________________________________________________
2082 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2085 // Get sources producing the condition file Id from file exchange servers
2086 // if id is NULL all sources are returned (distinct)
2089 // check if test mode should simulate a FXS error
2090 if (fTestMode & kErrorFXSSources)
2092 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2099 AliWarning("DCS system has only one source of data!");
2100 TList *list = new TList();
2102 list->Add(new TObjString(" "));
2106 // check connection; connect if necessary
2107 if (!Connect(system))
2109 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2113 TString sourceName; // stays empty for systems without a source column
2116 sourceName = "DAQsource";
2117 } else if (system == kHLT)
2119 sourceName = "DDLnumbers";
2122 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2123 TString whereClause = Form("run=%d and detector=\"%s\"",
2124 GetCurrentRun(), detector);
2126 whereClause += Form(" and fileId=\"%s\"", id);
2127 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2129 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2132 TSQLResult* aResult;
2133 aResult = fServer[system]->Query(sqlQuery);
2135 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2136 GetSystemName(system), id));
2140 TList *list = new TList();
2143 if (aResult->GetRowCount() == 0)
2146 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2153 while ((aRow = aResult->Next()))
2156 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2157 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2158 list->Add(new TObjString(source));
2167 //______________________________________________________________________________________________
2168 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2171 // Get all ids of condition files produced by a given source from file exchange servers
2174 // check if test mode should simulate a FXS error
2175 if (fTestMode & kErrorFXSSources)
2177 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2181 // check connection; connect if necessary
2182 if (!Connect(system))
2184 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2188 TString sourceName; // stays empty for systems without a source column
2191 sourceName = "DAQsource";
2192 } else if (system == kHLT)
2194 sourceName = "DDLnumbers";
2197 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2198 TString whereClause = Form("run=%d and detector=\"%s\"",
2199 GetCurrentRun(), detector);
2200 if (sourceName.Length() > 0 && source)
2201 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2202 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2204 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2207 TSQLResult* aResult;
2208 aResult = fServer[system]->Query(sqlQuery);
2210 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2211 GetSystemName(system), source));
2215 TList *list = new TList();
2218 if (aResult->GetRowCount() == 0)
2221 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2228 while ((aRow = aResult->Next()))
2231 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2232 AliDebug(2, Form("fileId = %s", id.Data()));
2233 list->Add(new TObjString(id));
2242 //______________________________________________________________________________________________
2243 Bool_t AliShuttle::Connect(Int_t system)
2245 // Connect to MySQL Server of the system's FXS MySQL databases
2246 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2249 // check connection: if already connected return
2250 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2252 TString dbHost, dbUser, dbPass, dbName;
2254 if (system < 3) // FXS db servers
2256 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2257 dbUser = fConfig->GetFXSdbUser(system);
2258 dbPass = fConfig->GetFXSdbPass(system);
2259 dbName = fConfig->GetFXSdbName(system);
2260 } else { // Run & Shuttle logbook servers
2261 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2262 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2263 dbUser = fConfig->GetDAQlbUser();
2264 dbPass = fConfig->GetDAQlbPass();
2265 dbName = fConfig->GetDAQlbDB();
2268 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2269 if (!fServer[system] || !fServer[system]->IsConnected()) {
2272 AliError(Form("Can't establish connection to FXS database for %s",
2273 AliShuttleInterface::GetSystemName(system)));
2275 AliError("Can't establish connection to Run logbook.");
2277 if(fServer[system]) delete fServer[system];
2282 TSQLResult* aResult=0;
2285 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2288 aResult = fServer[kDCS]->GetTables(dbName.Data());
2291 aResult = fServer[kHLT]->GetTables(dbName.Data());
2294 aResult = fServer[3]->GetTables(dbName.Data());
2302 //______________________________________________________________________________________________
2303 Bool_t AliShuttle::UpdateTable()
2306 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2309 Bool_t result = kTRUE;
2311 for (UInt_t system=0; system<3; system++)
2313 if(!fFXSCalled[system]) continue;
2315 // check connection; connect if necessary
2316 if (!Connect(system))
2318 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2323 TTimeStamp now; // now
2325 // Loop on FXS list entries
2326 TIter iter(&fFXSlist[system]);
2327 TObjString *aFXSentry=0;
2328 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2330 TString aFXSentrystr = aFXSentry->String();
2331 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2332 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2334 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2335 GetSystemName(system), aFXSentrystr.Data()));
2336 if(aFXSarray) delete aFXSarray;
2340 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2341 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2343 TString whereClause;
2346 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2347 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2349 else if (system == kDCS)
2351 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2352 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2354 else if (system == kHLT)
2356 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2357 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2362 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2363 now.GetSec(), whereClause.Data());
2365 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2368 TSQLResult* aResult;
2369 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2372 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2373 GetSystemName(system), sqlQuery.Data()));
//______________________________________________________________________________________________
Bool_t AliShuttle::UpdateTableFailCase()
// Update the FXS table, filling the time_processed field in all rows corresponding to the current run and detector.
// This is called when the preprocessor is declared failed for the current run, because
// UpdateTable fills the field only in case of success
Bool_t result = kTRUE;
for (UInt_t system=0; system<3; system++)
// check the connection; connect if needed
if (!Connect(system))
Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
    GetSystemName(system)));
TTimeStamp now; // current time
// Update all rows for the current run and detector (no loop on FXS list entries here)
TString whereClause = Form("where run=%d and detector=\"%s\";",
    GetCurrentRun(), fCurrentDetector.Data());
// GetSec() returns time_t; cast it so the argument matches the %ld conversion
TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
    static_cast<Long_t>(now.GetSec()), whereClause.Data());
AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
TSQLResult* aResult;
aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
    GetSystemName(system), sqlQuery.Data()));
//______________________________________________________________________________________________
Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
// Update the Shuttle logbook, filling either the detector column or the shuttle_done column
// Example usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
// check the connection; connect if needed
Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
TString detName(detector);
if(detName == "shuttle_done")
setClause = "set shuttle_done=1";
// Send the information to ML
TMonaLisaText mlStatus("SHUTTLE_status", "Done");
mlList.Add(&mlStatus);
fMonaLisa->SendParameters(&mlList);
TString statusStr(status);
if(statusStr.Contains("done", TString::kIgnoreCase) ||
    statusStr.Contains("failed", TString::kIgnoreCase)){
setClause = Form("set %s=\"%s\"", detector, status);
Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
TString whereClause = Form("where run=%d", GetCurrentRun());
TString sqlQuery = Form("update %s %s %s",
    fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
TSQLResult* aResult;
aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
//______________________________________________________________________________________________
Int_t AliShuttle::GetCurrentRun() const
// Get current run from logbook entry (-1 if there is no logbook entry)
return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentStartTime() const
// Get current start time from logbook entry (0 if there is no logbook entry)
return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentEndTime() const
// Get current end time from logbook entry (0 if there is no logbook entry)
return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
//______________________________________________________________________________________________
void AliShuttle::Log(const char* detector, const char* message)
// Fill log string with a message
void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
gSystem->FreeDirectory(dir);
TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
if (GetCurrentRun() >= 0)
toLog += Form("run %d - ", GetCurrentRun());
toLog += Form("%s", message);
AliInfo(toLog.Data());
// if the log output is already redirected to the file, return here
if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
TString fileName = GetLogFileName(detector);
gSystem->ExpandPathName(fileName);
logFile.open(fileName, ofstream::out | ofstream::app);
if (!logFile.is_open()) {
AliError(Form("Could not open file %s", fileName.Data()));
logFile << toLog.Data() << "\n";
//______________________________________________________________________________________________
TString AliShuttle::GetLogFileName(const char* detector) const
// returns the name of the log file for a given sub detector
if (GetCurrentRun() >= 0)
fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
//______________________________________________________________________________________________
Bool_t AliShuttle::Collect(Int_t run)
// Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default).
// If a specific run is given, only that run is processed.
// In operational mode, this is the Shuttle function triggered by the EOR signal.
Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
SetLastAction("Starting");
TString whereClause("where shuttle_done=0");
whereClause += Form(" and run=%d", run);
TObjArray shuttleLogbookEntries;
if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
if (shuttleLogbookEntries.GetEntries() == 0)
Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
Log("SHUTTLE", Form("Collect - Run %d is already DONE "
    "or it does not exist in Shuttle logbook", run));
for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
fFirstUnprocessed[iDet] = kTRUE;
// query the Shuttle logbook for earlier runs; if a detector is still unprocessed in one of them,
// clear its flag in the fFirstUnprocessed array
TString whereClause(Form("where shuttle_done=0 and run < %d", run));
TObjArray tmpLogbookEntries;
if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
TIter iter(&tmpLogbookEntries);
AliShuttleLogbookEntry* anEntry = 0;
while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
    anEntry->GetRun(), GetDetName(iDet)));
fFirstUnprocessed[iDet] = kFALSE;
if (!RetrieveConditionsData(shuttleLogbookEntries))
Log("SHUTTLE", "Collect - Process of at least one run failed");
Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
//______________________________________________________________________________________________
Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
// Retrieve conditions data for all runs that aren't processed yet
Bool_t hasError = kFALSE;
TIter iter(&dateEntries);
AliShuttleLogbookEntry* anEntry;
while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
if (!Process(anEntry)){
// clean SHUTTLE temp directory
TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
RemoveFile(filename.Data());
return hasError == kFALSE;
//______________________________________________________________________________________________
ULong_t AliShuttle::GetTimeOfLastAction() const
// Gets time of last action
fMonitoringMutex->Lock();
tmp = fLastActionTime;
fMonitoringMutex->UnLock();
//______________________________________________________________________________________________
const TString AliShuttle::GetLastAction() const
// returns a string description of the last action
fMonitoringMutex->Lock();
fMonitoringMutex->UnLock();
//______________________________________________________________________________________________
void AliShuttle::SetLastAction(const char* action)
// updates the monitoring variables
fMonitoringMutex->Lock();
fLastAction = action;
fLastActionTime = time(0);
fMonitoringMutex->UnLock();
//______________________________________________________________________________________________
const char* AliShuttle::GetRunParameter(const char* param)
// returns run parameter read from DAQ logbook
if(!fLogbookEntry) {
AliError("No logbook entry!");
return fLogbookEntry->GetRunParameter(param);
//______________________________________________________________________________________________
AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
// returns object from OCDB valid for current run
if (fTestMode & kErrorOCDB)
Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
//______________________________________________________________________________________________
Bool_t AliShuttle::SendMail()
// sends a mail to the subdetector expert in case of preprocessor error
if (fTestMode != kNone)
void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
gSystem->FreeDirectory(dir);
TString bodyFileName;
bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
gSystem->ExpandPathName(bodyFileName);
mailBody.open(bodyFileName, ofstream::out);
if (!mailBody.is_open())
AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
TObjString *anExpert=0;
while ((anExpert = (TObjString*) iterExperts.Next()))
to += Form("%s,", anExpert->GetName());
to.Remove(to.Length()-1);
AliDebug(2, Form("to: %s",to.Data()));
AliInfo("List of detector responsibles not yet set!");
TString cc="alberto.colla@cern.ch";
TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
    fCurrentDetector.Data(), GetCurrentRun());
AliDebug(2, Form("subject: %s", subject.Data()));
TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
body += Form("SHUTTLE just detected that your preprocessor "
    "failed processing run %d!!\n\n", GetCurrentRun());
body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
body += Form("Find the %s log for the current run on \n\n"
    "\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n",
    fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
// the %s conversion needs its argument: pass the detector name
body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
AliDebug(2, Form("Body begin: %s", body.Data()));
mailBody << body.Data();
mailBody.open(bodyFileName, ofstream::out | ofstream::app);
TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
if (gSystem->Exec(tailCommand.Data()))
mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
TString endBody = Form("------------------------------------------------------\n\n");
endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
AliDebug(2, Form("Body end: %s", endBody.Data()));
mailBody << endBody.Data();
TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
    bodyFileName.Data());
AliDebug(2, Form("mail command: %s", mailCommand.Data()));
Bool_t result = gSystem->Exec(mailCommand.Data());
//______________________________________________________________________________________________
const char* AliShuttle::GetRunType()
// returns run type read from "run type" logbook
if(!fLogbookEntry) {
AliError("No logbook entry!");
return fLogbookEntry->GetRunType();
//______________________________________________________________________________________________
void AliShuttle::SetShuttleTempDir(const char* tmpDir)
// sets Shuttle temp directory
fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
//______________________________________________________________________________________________
void AliShuttle::SetShuttleLogDir(const char* logDir)
// sets Shuttle log directory
fgkShuttleLogDir = gSystem->ExpandPathName(logDir);