/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/
/*
Revision 1.45 2007/05/30 06:35:20 jgrosseo
Adding functionality to the Shuttle/TestShuttle:
o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
o) Function to retrieve list of IDs for a given source (GetFileIDs)
These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
Example code has been added to the TestProcessor in TestShuttle

Revision 1.44 2007/05/11 16:09:32 acolla
Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
example: ITS/SPD/100_filename.root

Revision 1.43 2007/05/10 09:59:51 acolla
Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)

Revision 1.42 2007/05/03 08:01:39 jgrosseo
typo in last commit :-(

Revision 1.41 2007/05/03 08:00:48 jgrosseo
fixing log message when pp wants to skip dcs value retrieval

Revision 1.40 2007/04/27 07:06:48 jgrosseo
GetFileSources returns empty list in case of no files, but successful query
No mails sent in testmode

Revision 1.39 2007/04/17 12:43:57 acolla
Correction in StoreOCDB; change of text in mail to detector expert

Revision 1.38 2007/04/12 08:26:18 jgrosseo

Revision 1.37 2007/04/10 16:53:14 jgrosseo
redirecting sub detector stdout, stderr to sub detector log file

Revision 1.35 2007/04/04 16:26:38 acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. in AliShuttle.cxx: processing time and memory consumption info on a single line.

Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
If you always need DCS data (like before), you do not need to implement it.

6) The run type has been added to the monitoring page

Revision 1.33 2007/04/03 13:56:01 acolla
Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the

Revision 1.32 2007/02/28 10:41:56 acolla
Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
AliPreprocessor::GetRunType() function.
Added some ldap definition files.

Revision 1.30 2007/02/13 11:23:21 acolla
Moved getters and setters of Shuttle's main OCDB/Reference, local
OCDB/Reference, temp and log folders to AliShuttleInterface

Revision 1.27 2007/01/30 17:52:42 jgrosseo
adding monalisa monitoring

Revision 1.26 2007/01/23 19:20:03 acolla
Removed old ldif files, added TOF, MCH ldif files. Added some options in
AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and

Revision 1.25 2007/01/15 19:13:52 acolla
Moved some AliInfo to AliDebug in SendMail function

Revision 1.21 2006/12/07 08:51:26 jgrosseo
table, db names in ldap configuration
added GRP preprocessor
DCS data can also be retrieved by data point

Revision 1.20 2006/11/16 16:16:48 jgrosseo
introducing strict run ordering flag
removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)

Revision 1.19 2006/11/06 14:23:04 jgrosseo
major update (Alberto)
o) reading of run parameters from the logbook
o) online offline naming conversion
o) standalone DCSclient package

Revision 1.18 2006/10/20 15:22:59 jgrosseo
o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
o) Merging Collect, CollectAll, CollectNew function
o) Removing implementation of empty copy constructors (declaration still there!)

Revision 1.17 2006/10/05 16:20:55 jgrosseo
adapting to new CDB classes

Revision 1.16 2006/10/05 15:46:26 jgrosseo
applying to the new interface

Revision 1.15 2006/10/02 16:38:39 jgrosseo
storing of objects that failed to be stored to the grid before
interfacing of shuttle status table in daq system

Revision 1.14 2006/08/29 09:16:05 jgrosseo

Revision 1.13 2006/08/15 10:50:00 jgrosseo
effc++ corrections (alberto)

Revision 1.12 2006/08/08 14:19:29 jgrosseo
Update to shuttle classes (Alberto)
- Possibility to set the full object's path in the Preprocessor's and
Shuttle's Store functions
- Possibility to extend the object's run validity in the same classes
("startValidity" and "validityInfinite" parameters)
- Implementation of the StoreReferenceData function to store reference
data in a dedicated CDB storage.

Revision 1.11 2006/07/21 07:37:20 jgrosseo
last run is stored after each run

Revision 1.10 2006/07/20 09:54:40 jgrosseo
introducing status management: The processing per subdetector is divided into several steps,
after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
can keep track of the number of failures and skips further processing after a certain threshold is
exceeded. These thresholds can be configured in LDAP.

Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)

Revision 1.8 2006/07/11 12:44:36 jgrosseo
adding parameters for extended validity range of data produced by preprocessor

Revision 1.7 2006/07/10 14:37:09 jgrosseo
small fix + todo comment

Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)

Revision 1.5 2006/07/04 14:59:57 jgrosseo
revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2

Revision 1.4 2006/06/12 09:11:16 jgrosseo
coding conventions (Alberto)

Revision 1.3 2006/06/06 14:26:40 jgrosseo
o) removed files that were moved to STEER
o) shuttle updated to follow the new interface (Alberto)

Revision 1.2 2006/03/07 07:52:34 hristov
New version (B.Yordanov)

Revision 1.6 2005/11/19 17:19:14 byordano
RetrieveDATEEntries and RetrieveConditionsData added

Revision 1.5 2005/11/19 11:09:27 byordano
AliShuttle declaration added

Revision 1.4 2005/11/17 17:47:34 byordano
TList changed to TObjArray

Revision 1.3 2005/11/17 14:43:23 byordano

Revision 1.1.1.1 2005/10/28 07:33:58 hristov
Initial import as subdirectory in AliRoot

Revision 1.2 2005/09/13 08:41:15 byordano
default startTime endTime added

Revision 1.4 2005/08/30 09:13:02 byordano

Revision 1.3 2005/08/29 21:15:47 byordano
*/
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it will be used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically to the underlying AliCDBStorage.
// The alias name is used as detSpec.
#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"
#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>
#include <TSystemDirectory.h>
#include <TSystemFile.h>
#include <TFileMerger.h>
#include <TGridResult.h>
#include <TMonaLisaWriter.h>

#include <sys/types.h>
#include <sys/wait.h>
256 //______________________________________________________________________________________________
257 AliShuttle::AliShuttle(const AliShuttleConfig* config,
258 UInt_t timeout, Int_t retries):
260 fTimeout(timeout), fRetries(retries),
270 fReadTestMode(kFALSE),
271 fOutputRedirected(kFALSE)
274 // config: AliShuttleConfig used
275 // timeout: timeout used for AliDCSClient connection
276 // retries: the number of retries in case of connection error.
279 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
280 for(int iSys=0;iSys<4;iSys++) {
283 fFXSlist[iSys].SetOwner(kTRUE);
285 fPreprocessorMap.SetOwner(kTRUE);
287 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
288 fFirstUnprocessed[iDet] = kFALSE;
290 fMonitoringMutex = new TMutex();
293 //______________________________________________________________________________________________
294 AliShuttle::~AliShuttle()
300 fPreprocessorMap.DeleteAll();
301 for(int iSys=0;iSys<4;iSys++)
303 fServer[iSys]->Close();
304 delete fServer[iSys];
313 if (fMonitoringMutex)
315 delete fMonitoringMutex;
316 fMonitoringMutex = 0;
320 //______________________________________________________________________________________________
321 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
// Registers a new AliPreprocessor.
// GetName() is used as the identifier of the preprocessor.
// The preprocessor is registered only if no other preprocessor
// with the same identifier (GetName()) is already registered.
330 const char* detName = preprocessor->GetName();
331 if(GetDetPos(detName) < 0)
332 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
334 if (fPreprocessorMap.GetValue(detName)) {
335 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
339 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
341 //______________________________________________________________________________________________
342 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
343 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
345 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
346 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
347 // using this function. Use StoreReferenceData instead!
348 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
349 // finishes the data are transferred to the main storage (Grid).
351 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
354 //______________________________________________________________________________________________
355 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
// Stores a CDB object in the storage for reference data. These objects will not be available during
// offline reconstruction. Use this function for reference data only!
// It calls the StoreLocally function, which temporarily stores the data locally; when the preprocessor
// finishes, the data are transferred to the main storage (Grid).
362 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
365 //______________________________________________________________________________________________
366 Bool_t AliShuttle::StoreLocally(const TString& localUri,
367 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
368 Int_t validityStart, Bool_t validityInfinite)
// Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
// When the preprocessor finishes, the data are transferred to the main storage (Grid).
// The parameters are:
// 1) URI of the backup storage (local)
// 2) the object's path
// 3) the object to be stored
// 4) the metaData to be associated with the object
// 5) the validity start run number w.r.t. the current run;
//    if the data is valid only for this run, leave the default 0
// 6) specifies whether the calibration data is valid for infinity (this means until updated),
//    typical for calibration runs; the default is kFALSE
// Returns kFALSE on failure, kTRUE otherwise.
384 if (fTestMode & kErrorStorage)
386 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
390 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
392 Int_t firstRun = GetCurrentRun() - validityStart;
394 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
399 if(validityInfinite) {
400 lastRun = AliCDBRunRange::Infinity();
402 lastRun = GetCurrentRun();
405 // Version is set to current run, it will be used later to transfer data to Grid
406 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
408 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
409 TObjString runUsed = Form("%d", GetCurrentRun());
410 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
413 Bool_t result = kFALSE;
415 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
416 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
418 result = AliCDBManager::Instance()->GetStorage(localUri)
419 ->Put(object, id, metaData);
424 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
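// Illustration only (not part of AliShuttle): a minimal, ROOT-free sketch of how the run
// validity range described in the StoreLocally comment could be derived from the current run,
// validityStart and validityInfinite. The names ValidityRange and kRunInfinity are hypothetical;
// kRunInfinity merely stands in for AliCDBRunRange::Infinity(), whose real value is not shown here.

```cpp
#include <cassert>
#include <climits>
#include <utility>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
constexpr int kRunInfinity = INT_MAX;

// Returns {firstRun, lastRun} for an object produced while processing currentRun.
// validityStart is counted backwards from the current run, as in StoreLocally.
inline std::pair<int, int> ValidityRange(int currentRun, int validityStart, bool validityInfinite)
{
  assert(currentRun >= 0);

  int firstRun = currentRun - validityStart;
  if (firstRun < 0) firstRun = 0;  // clamp, matching the "Setting it to 0" warning

  // Infinite validity means "until updated"; otherwise the object is valid up to the current run.
  const int lastRun = validityInfinite ? kRunInfinity : currentRun;
  return {firstRun, lastRun};
}
```

// With the defaults (validityStart = 0, validityInfinite = kFALSE) the range collapses to the
// current run only, e.g. ValidityRange(100, 0, false) gives {100, 100}.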
430 //______________________________________________________________________________________________
431 Bool_t AliShuttle::StoreOCDB()
// Called when the preprocessor ends successfully or when a previous storage attempt failed (kStoreError status).
// Calls the underlying StoreOCDB(const TString&) function twice, for the OCDB and the Reference storage.
// Then calls StoreRefFilesToGrid to store the reference files.
439 if (fTestMode & kErrorGrid)
441 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
442 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
446 Log("SHUTTLE","Storing OCDB data ...");
447 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
449 Log("SHUTTLE","Storing reference data ...");
450 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
452 Log("SHUTTLE","Storing reference files ...");
453 Bool_t resultRefFiles = StoreRefFilesToGrid();
455 return resultCDB && resultRef && resultRefFiles;
458 //______________________________________________________________________________________________
459 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
462 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
465 TObjArray* gridIds=0;
467 Bool_t result = kTRUE;
469 const char* type = 0;
471 if(gridURI == fgkMainCDB) {
473 localURI = fgkLocalCDB;
474 } else if(gridURI == fgkMainRefStorage) {
476 localURI = fgkLocalRefStorage;
478 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
482 AliCDBManager* man = AliCDBManager::Instance();
484 AliCDBStorage *gridSto = man->GetStorage(gridURI);
487 Form("StoreOCDB - cannot activate main %s storage", type));
491 gridIds = gridSto->GetQueryCDBList();
493 // get objects previously stored in local CDB
494 AliCDBStorage *localSto = man->GetStorage(localURI);
497 Form("StoreOCDB - cannot activate local %s storage", type));
500 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
501 // Local objects were stored with current run as Grid version!
502 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
503 localEntries->SetOwner(1);
505 // loop on local stored objects
506 TIter localIter(localEntries);
507 AliCDBEntry *aLocEntry = 0;
508 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
509 aLocEntry->SetOwner(1);
510 AliCDBId aLocId = aLocEntry->GetId();
511 aLocEntry->SetVersion(-1);
512 aLocEntry->SetSubVersion(-1);
514 // If local object is valid up to infinity we store it only if it is
515 // the first unprocessed run!
516 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
517 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
519 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
520 "there are previous unprocessed runs!",
521 fCurrentDetector.Data(), aLocId.GetPath().Data()));
525 // loop on Grid valid Id's
526 Bool_t store = kTRUE;
527 TIter gridIter(gridIds);
528 AliCDBId* aGridId = 0;
529 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
530 if(aGridId->GetPath() != aLocId.GetPath()) continue;
531 // skip all objects valid up to infinity
532 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
533 // if we get here, it means there's already some more recent object stored on Grid!
// Put the object on the Grid only if no more recent object was found there.
Bool_t storeOk = store ? gridSto->Put(aLocEntry) : kFALSE;
if(!store || storeOk){
544 Log(fCurrentDetector.Data(),
545 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
546 type, aGridId->ToString().Data()));
549 Form("StoreOCDB - Object <%s> successfully put into %s storage",
550 aLocId.ToString().Data(), type));
551 Log(fCurrentDetector.Data(),
552 Form("StoreOCDB - Object <%s> successfully put into %s storage",
553 aLocId.ToString().Data(), type));
556 // removing local filename...
558 localSto->IdToFilename(aLocId, filename);
559 AliInfo(Form("Removing local file %s", filename.Data()));
560 RemoveFile(filename.Data());
564 Form("StoreOCDB - Grid %s storage of object <%s> failed",
565 type, aLocId.ToString().Data()));
566 Log(fCurrentDetector.Data(),
567 Form("StoreOCDB - Grid %s storage of object <%s> failed",
568 type, aLocId.ToString().Data()));
572 localEntries->Clear();
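// Illustration only (not part of AliShuttle): a ROOT-free sketch of the check performed in the
// loop over the Grid IDs above, as far as it can be read from the comments and log messages.
// A local object is treated as storable unless the queried Grid list contains an entry with the
// same path that is not valid up to infinity. SketchId, kInfiniteRun and CanStoreOnGrid are
// hypothetical names; SketchId is a reduced stand-in for AliCDBId.

```cpp
#include <climits>
#include <string>
#include <vector>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
constexpr int kInfiniteRun = INT_MAX;

// Reduced stand-in for AliCDBId: only the fields the check looks at.
struct SketchId {
  std::string path;
  int lastRun;
};

// Returns false if a more recent object with the same path is already on the Grid.
inline bool CanStoreOnGrid(const SketchId& local, const std::vector<SketchId>& gridIds)
{
  for (const SketchId& grid : gridIds) {
    if (grid.path != local.path) continue;       // different object, ignore
    if (grid.lastRun == kInfiniteRun) continue;  // objects valid up to infinity are skipped
    return false;                                // a more recent object already exists
  }
  return true;
}
```

// The separate rule for local objects with infinite validity (stored only for the first
// unprocessed run) is left out of this sketch.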
577 //______________________________________________________________________________________________
578 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
580 // clears the directory used to store reference files of a given subdetector
582 AliCDBManager* man = AliCDBManager::Instance();
583 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
584 TString localBaseFolder = sto->GetBaseFolder();
586 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
588 Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
591 begin.Form("%d_", GetCurrentRun());
593 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
597 TList* dirList = baseDir->GetListOfFiles();
600 if (!dirList) return kTRUE;
602 if (dirList->GetEntries() < 3)
608 Int_t nDirs = 0, nDel = 0;
609 TIter dirIter(dirList);
610 TSystemFile* entry = 0;
612 Bool_t success = kTRUE;
614 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
616 if (entry->IsDirectory())
619 TString fileName(entry->GetName());
620 if (!fileName.BeginsWith(begin))
626 Int_t result = gSystem->Unlink(fileName.Data());
630 Log("SHUTTLE", Form("Could not delete file %s!", fileName.Data()));
638 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
639 nDel, nDirs, targetDir.Data()));
650 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
654 result = gSystem->Exec(Form("rm -r %s", targetDir.Data()));
Log("SHUTTLE", Form("CleanReferenceStorage - Could not clear directory %s", targetDir.Data()));
result = gSystem->mkdir(targetDir, kTRUE);
Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
672 //______________________________________________________________________________________________
673 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
// Stores a reference file directly (without opening it). This function stores the file locally.
// The file is stored under the following location:
// <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
// where <gridFileName> is the third parameter given to the function
683 if (fTestMode & kErrorStorage)
685 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
689 AliCDBManager* man = AliCDBManager::Instance();
690 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
692 TString localBaseFolder = sto->GetBaseFolder();
694 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
// try to open the folder; create it if it does not exist
697 void* dir = gSystem->OpenDirectory(targetDir.Data());
699 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
700 Log("SHUTTLE", Form("Can't open directory <%s>", targetDir.Data()));
705 gSystem->FreeDirectory(dir);
709 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
711 Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
714 Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
718 result = gSystem->CopyFile(localFile, target);
722 Log("SHUTTLE", Form("StoreReferenceFile - File %s stored locally to %s", localFile, target.Data()));
727 Log("SHUTTLE", Form("StoreReferenceFile - Could not store file %s to %s!. Error code = %d",
728 localFile, target.Data(), result));
733 //______________________________________________________________________________________________
734 Bool_t AliShuttle::StoreRefFilesToGrid()
// Transfers the reference files of the current run and detector to the Grid.
739 // The files are stored under the following location:
740 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
743 AliCDBManager* man = AliCDBManager::Instance();
744 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
747 TString localBaseFolder = sto->GetBaseFolder();
749 TString dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
751 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
755 TString gridBaseFolder = gridSto->GetBaseFolder();
757 TString alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
760 begin.Form("%d_", GetCurrentRun());
762 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
766 TList* dirList = baseDir->GetListOfFiles();
769 if (!dirList) return kTRUE;
771 if (dirList->GetEntries() < 3)
779 Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
784 Int_t nDirs = 0, nTransfer = 0;
785 TIter dirIter(dirList);
786 TSystemFile* entry = 0;
788 Bool_t success = kTRUE;
789 Bool_t first = kTRUE;
791 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
793 if (entry->IsDirectory())
796 TString fileName(entry->GetName());
797 if (!fileName.BeginsWith(begin))
805 // check that DET folder exists, otherwise create it
806 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
814 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
816 if (!gGrid->Mkdir(alienDir.Data(),"",0))
818 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
823 Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
827 Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
831 TString fullLocalPath;
832 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
834 TString fullGridPath;
835 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
837 TFileMerger fileMerger;
838 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
842 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded!", fullLocalPath.Data(), fullGridPath.Data()));
843 RemoveFile(fullLocalPath);
848 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s FAILED!", fullLocalPath.Data(), fullGridPath.Data()));
853 Log("SHUTTLE", Form("StoreRefFilesToGrid - %d (over %d) reference files in folder %s copied to Grid.", nTransfer, nDirs, dir.Data()));
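// Illustration only (not part of AliShuttle): reference files are named <RUN#>_<gridFileName>,
// and both CleanReferenceStorage and StoreRefFilesToGrid select files by the "<run>_" prefix.
// The ROOT-free sketch below shows that selection on plain file names. FilesForRun is a
// hypothetical helper; the directory check done with TSystemFile::IsDirectory is not modelled.

```cpp
#include <string>
#include <vector>

// Returns the names that start with "<run>_", preserving their order.
inline std::vector<std::string> FilesForRun(const std::vector<std::string>& names, int run)
{
  const std::string begin = std::to_string(run) + "_";

  std::vector<std::string> selected;
  for (const std::string& name : names) {
    // Equivalent of TString::BeginsWith(begin)
    if (name.compare(0, begin.size(), begin) == 0) selected.push_back(name);
  }
  return selected;
}
```

// Because the underscore is part of the prefix, run 100 does not pick up files of run 1000.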
860 //______________________________________________________________________________________________
861 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
864 // Get folder name of reference files
867 TString offDetStr(GetOfflineDetName(detector));
869 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
871 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
873 dir.Form("%s/%s", base, offDetStr.Data());
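// Illustration only (not part of AliShuttle): a ROOT-free sketch of the folder layout built
// above. For ITS, MUON and PHOS the online detector name is appended as a sub-folder; all other
// detectors use the offline name only. RefFilePrefix is a hypothetical helper that takes the
// offline name as a parameter instead of calling GetOfflineDetName.

```cpp
#include <string>

// Builds <base>/<offline>/<online> for ITS, MUON and PHOS, and <base>/<offline> otherwise.
inline std::string RefFilePrefix(const std::string& base,
                                 const std::string& offlineDet,
                                 const std::string& onlineDet)
{
  if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS") {
    return base + "/" + offlineDet + "/" + onlineDet;
  }
  return base + "/" + offlineDet;
}
```

// Example: RefFilePrefix("/ref", "ITS", "SPD") yields "/ref/ITS/SPD", which matches the
// ITS/SPD/100_filename.root layout mentioned in the revision log.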
880 //______________________________________________________________________________________________
881 void AliShuttle::CleanLocalStorage(const TString& uri)
// Called in case the preprocessor is declared failed. Removes the remaining objects from the local storages.
887 const char* type = 0;
888 if(uri == fgkLocalCDB) {
890 } else if(uri == fgkLocalRefStorage) {
893 AliError(Form("Invalid storage URI: %s", uri.Data()));
897 AliCDBManager* man = AliCDBManager::Instance();
899 // open local storage
900 AliCDBStorage *localSto = man->GetStorage(uri);
903 Form("CleanLocalStorage - cannot activate local %s storage", type));
907 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
908 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
910 AliInfo(Form("filename = %s", filename.Data()));
912 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
913 GetCurrentRun(), fCurrentDetector.Data()));
915 RemoveFile(filename.Data());
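// Illustration only (not part of AliShuttle): the wildcard pattern handed to RemoveFile above
// matches every locally stored object of the current detector whose version equals the current
// run. The ROOT-free sketch below builds the same pattern with snprintf. LocalFilePattern is a
// hypothetical helper, and the folder and detector names in the example are made up.

```cpp
#include <cstdio>
#include <string>

// Builds "<base>/<DET>/*/Run*_v<run>_s*.root", the pattern used in CleanLocalStorage.
inline std::string LocalFilePattern(const std::string& baseFolder,
                                    const std::string& offlineDet,
                                    int run)
{
  char buffer[1024];
  std::snprintf(buffer, sizeof(buffer), "%s/%s/*/Run*_v%d_s*.root",
                baseFolder.c_str(), offlineDet.c_str(), run);
  return std::string(buffer);
}
```

// The version field can be matched this way because StoreLocally sets the version of each
// local object to the current run.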
919 //______________________________________________________________________________________________
920 void AliShuttle::RemoveFile(const char* filename)
923 // removes local file
926 TString command(Form("rm -f %s", filename));
928 Int_t result = gSystem->Exec(command.Data());
931 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
932 fCurrentDetector.Data(), filename));
936 //______________________________________________________________________________________________
937 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
940 // Reads the AliShuttleStatus from the CDB
948 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
949 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
951 if (!fStatusEntry) return 0;
952 fStatusEntry->SetOwner(1);
954 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
956 AliError("Invalid object stored to CDB!");
963 //______________________________________________________________________________________________
964 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
967 // writes the status for one subdetector
975 Int_t run = GetCurrentRun();
977 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
979 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
980 fStatusEntry->SetOwner(1);
Bool_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
985 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
986 fCurrentDetector.Data(), run));
995 //______________________________________________________________________________________________
996 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
999 // changes the AliShuttleStatus for the given detector and run to the given status
1003 AliError("UNEXPECTED: fStatusEntry empty");
1007 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1010 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
1014 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1015 fCurrentDetector.Data(),
1016 status->GetStatusName(),
1017 status->GetStatusName(newStatus));
1018 Log("SHUTTLE", actionStr);
1019 SetLastAction(actionStr);
1021 status->SetStatus(newStatus);
1022 if (increaseCount) status->IncreaseCount();
1024 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1029 //______________________________________________________________________________________________
1030 void AliShuttle::SendMLInfo()
// sends MonaLisa (ML) information about the current status of the detector being processed
1036 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1039 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1043 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1044 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1047 mlList.Add(&mlStatus);
1048 mlList.Add(&mlRetryCount);
1050 fMonaLisa->SendParameters(&mlList);
1053 //______________________________________________________________________________________________
1054 Bool_t AliShuttle::ContinueProcessing()
1056 // this function reads the AliShuttleStatus information from CDB and
1057 // checks if the processing should be continued
1058 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1060 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1062 AliPreprocessor* aPreprocessor =
1063 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1066 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
1070 AliShuttleLogbookEntry::Status entryStatus =
1071 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1073 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1074 AliInfo(Form("ContinueProcessing - %s is %s",
1075 fCurrentDetector.Data(),
1076 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1080 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1082 // check if current run is first unprocessed run for current detector
1083 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1084 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1086 if (fTestMode == kNone)
Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
1097 AliShuttleStatus* status = ReadShuttleStatus();
1100 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1101 fCurrentDetector.Data()));
1102 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1103 return WriteShuttleStatus(status);
1106 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1107 // If it happens it may mean Logbook updating failed... let's do it now!
1108 if (status->GetStatus() == AliShuttleStatus::kDone ||
1109 status->GetStatus() == AliShuttleStatus::kFailed){
1110 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1111 fCurrentDetector.Data(),
1112 status->GetStatusName(status->GetStatus())));
1113 UpdateShuttleLogbook(fCurrentDetector.Data(),
1114 status->GetStatusName(status->GetStatus()));
1118 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1120 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1121 fCurrentDetector.Data()));
1122 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1124 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1125 fCurrentDetector.Data()));
1126 UpdateShuttleStatus(AliShuttleStatus::kDone);
1127 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1130 Form("ContinueProcessing - %s: Grid storage failed again",
1131 fCurrentDetector.Data()));
1132 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1137 // if we get here, there is a restart
1138 Bool_t cont = kFALSE;
1141 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1142 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1143 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1144 status->GetCount(), status->GetStatusName()));
1145 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1146 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1148 // there may still be objects in local OCDB and reference storage
1149 // and FXS databases may not have been updated: do it now!
1151 // TODO Currently disabled, we want to keep files in case of failure!
1152 // CleanLocalStorage(fgkLocalCDB);
1153 // CleanLocalStorage(fgkLocalRefStorage);
1154 // UpdateTableFailCase();
1156 // Send mail to detector expert!
1157 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1159 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1160 fCurrentDetector.Data()));
1163 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1164 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1165 status->GetStatusName(), status->GetCount()));
1166 Bool_t increaseCount = kTRUE;
1167 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1168 increaseCount = kFALSE;
1169 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
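// Illustrative sketch (not part of AliShuttle): the restart decision above,
// reduced to a pure function so it can be checked in isolation. The enum,
// the struct and DecideRestart are hypothetical names. It is assumed that
// passing increaseCount = kTRUE raises the stored retry counter by exactly
// one, and that DCS-related aborts do not consume a retry.

```cpp
#include <cassert>

// Hypothetical mirror of the status values consulted by the restart logic.
enum class Status { kStarted, kDCSStarted, kDCSError, kPPStarted, kPPError };

struct RestartDecision {
    bool continueProcessing; // false: give up and mark the detector FAILED
    int  newCount;           // retry counter to store with the new status
};

// Decide whether a detector that aborted in state `last` may be restarted.
RestartDecision DecideRestart(Status last, int count, int maxRetries)
{
    assert(maxRetries >= 0);
    if (count >= maxRetries)
        return {false, count};
    // Assumption: aborts during DCS retrieval leave the counter untouched.
    const bool increase =
        !(last == Status::kDCSError || last == Status::kDCSStarted);
    return {true, increase ? count + 1 : count};
}
```

// With maxRetries = 3, a preprocessor error at count 1 restarts with count 2,
// while a DCS error at count 1 restarts with the count unchanged.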
1176 //______________________________________________________________________________________________
1177 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1180 // Makes data retrieval for all detectors in the configuration.
1181 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1182 // (Unprocessed, Inactive, Failed or Done).
1183 // Returns kFALSE if an error occurred and kTRUE otherwise
1186 if (!entry) return kFALSE;
1188 fLogbookEntry = entry;
1190 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1193 // create ML instance that monitors this run
1194 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1195 // disable monitoring of other parameters that come e.g. from TFile
1196 gMonitoringWriter = 0;
1198 // Send the information to ML
1199 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1200 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1203 mlList.Add(&mlStatus);
1204 mlList.Add(&mlRunType);
1206 fMonaLisa->SendParameters(&mlList);
1208 if (fLogbookEntry->IsDone())
1210 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1211 UpdateShuttleLogbook("shuttle_done");
1216 // read test mode if flag is set
1220 TString logEntry(entry->GetRunParameter("log"));
1221 //printf("log entry = %s\n", logEntry.Data());
1222 TString searchStr("Testmode: ");
1223 Int_t pos = logEntry.Index(searchStr.Data());
1224 //printf("%d\n", pos);
1227 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1228 //printf("%s\n", subStr.String().Data());
1229 TString newStr(subStr.Data());
1230 TObjArray* token = newStr.Tokenize(' ');
1234 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1237 Int_t testMode = tmpStr->String().Atoi();
1240 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1241 SetTestMode((TestMode) testMode);
1249 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1251 fLogbookEntry->Print("all");
1254 Bool_t hasError = kFALSE;
1256 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1257 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1258 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1259 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1261 // Loop on detectors in the configuration
1262 TIter iter(fConfig->GetDetectors());
1263 TObjString* aDetector = 0;
1265 while ((aDetector = (TObjString*) iter.Next()))
1267 fCurrentDetector = aDetector->String();
1269 if (ContinueProcessing() == kFALSE) continue;
1271 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1272 GetCurrentRun(), aDetector->GetName()));
1274 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1276 Log(fCurrentDetector.Data(), "Starting processing");
1282 Log("SHUTTLE", "ERROR: Forking failed");
1287 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1288 GetCurrentRun(), aDetector->GetName()));
1290 Long_t begin = time(0);
1292 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1293 while (waitpid(pid, &status, WNOHANG) == 0)
1295 Long_t expiredTime = time(0) - begin;
1297 if (expiredTime > fConfig->GetPPTimeOut())
1300 tmp.Form("Process of %s time out. Run time: %ld seconds. Killing...",
1301 fCurrentDetector.Data(), expiredTime);
1302 Log("SHUTTLE", tmp);
1303 Log(fCurrentDetector, tmp);
1307 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1310 gSystem->Sleep(1000);
1314 gSystem->Sleep(1000);
1317 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1318 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1321 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1326 if (!fgets(buffer, 100, pipe))
1328 Log("SHUTTLE", "Error: ps did not return anything");
1329 gSystem->ClosePipe(pipe);
1332 gSystem->ClosePipe(pipe);
1334 //Log("SHUTTLE", Form("ps returned %s", buffer));
1337 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1339 Log("SHUTTLE", "Error: Could not parse output of ps");
1343 if (expiredTime % 60 == 0)
1344 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1345 fCurrentDetector.Data(), expiredTime, mem));
1347 if (mem > fConfig->GetPPMaxMem())
1350 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1351 mem, fConfig->GetPPMaxMem());
1352 Log("SHUTTLE", tmp);
1353 Log(fCurrentDetector, tmp);
1357 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1360 gSystem->Sleep(1000);
1365 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1366 GetCurrentRun(), aDetector->GetName()));
1368 if (WIFEXITED(status))
1370 Int_t returnCode = WEXITSTATUS(status);
1372 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1375 if (returnCode == 0) hasError = kTRUE;
1381 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1383 AliInfo("Redirecting output...");
1385 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1387 Log("SHUTTLE", "Could not freopen stdout");
1391 fOutputRedirected = kTRUE;
1392 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1393 Log("SHUTTLE", "Could not redirect stderr");
1397 Bool_t success = ProcessCurrentDetector();
1398 if (success) // Preprocessor finished successfully!
1400 // Update time_processed field in FXS DB
1401 if (UpdateTable() == kFALSE)
1402 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1404 // Transfer the data from local storage to main storage (Grid)
1405 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1406 if (StoreOCDB() == kFALSE)
1408 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1409 GetCurrentRun(), aDetector->GetName()));
1410 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1413 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1414 GetCurrentRun(), aDetector->GetName()));
1415 UpdateShuttleStatus(AliShuttleStatus::kDone);
1416 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1420 for (UInt_t iSys=0; iSys<3; iSys++)
1422 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1425 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
1426 GetCurrentRun(), aDetector->GetName(), success));
1428 // the client exits here
1429 gSystem->Exit(success);
1431 AliError("We should never get here!!!");
1435 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1438 //check if shuttle is done for this run, if so update logbook
1439 TObjArray checkEntryArray;
1440 checkEntryArray.SetOwner(1);
1441 TString whereClause = Form("where run=%d", GetCurrentRun());
1442 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1443 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1445 return hasError == kFALSE;
1448 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1449 (checkEntryArray.At(0));
1453 if (checkEntry->IsDone())
1455 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1456 UpdateShuttleLogbook("shuttle_done");
1460 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1462 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1464 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1465 checkEntry->GetRun(), GetDetName(iDet)));
1466 fFirstUnprocessed[iDet] = kFALSE;
1472 // remove ML instance
1478 return hasError == kFALSE;
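// Illustrative sketch (not part of AliShuttle): how a "Testmode: N" flag can
// be pulled out of the free-text log entry of a run, as Process() does above.
// ParseTestMode is a hypothetical helper written with the standard library
// only. Unlike TString::Atoi, which yields 0 for non-numeric text, this
// version returns -1 when no number follows the marker.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the integer that follows "Testmode: " in `logEntry`, or -1 if the
// marker is absent or is not followed by a number.
int ParseTestMode(const std::string& logEntry)
{
    const std::string marker = "Testmode: ";
    const std::string::size_type pos = logEntry.find(marker);
    if (pos == std::string::npos) return -1;

    const char* start = logEntry.c_str() + pos + marker.size();
    char* end = nullptr;
    const long value = std::strtol(start, &end, 10);
    if (end == start) return -1; // no digits after the marker
    return static_cast<int>(value);
}
```

// Example: ParseTestMode("calib run Testmode: 3 beam off") yields 3.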
1481 //______________________________________________________________________________________________
1482 Bool_t AliShuttle::ProcessCurrentDetector()
1485 // Makes data retrieval just for a specific detector (fCurrentDetector).
1486 // There should be a configuration for this detector.
1488 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
1490 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1495 // call preprocessor
1496 AliPreprocessor* aPreprocessor =
1497 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1499 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1501 Bool_t processDCS = aPreprocessor->ProcessDCS();
1505 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1507 else if (fTestMode & kSkipDCS)
1509 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
1511 else if (fTestMode & kErrorDCS)
1513 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1514 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1515 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1519 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1521 TString host(fConfig->GetDCSHost(fCurrentDetector));
1522 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1524 if (fConfig->GetDCSAliases(fCurrentDetector)->GetEntries() > 0)
1526 dcsMap = GetValueSet(host, port, fConfig->GetDCSAliases(fCurrentDetector), kAlias);
1529 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS aliases");
1530 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1535 if (fConfig->GetDCSDataPoints(fCurrentDetector)->GetEntries() > 0)
1537 TMap* dcsMap2 = GetValueSet(host, port, fConfig->GetDCSDataPoints(fCurrentDetector), kDP);
1540 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS data points");
1541 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1553 TIter iter(dcsMap2);
1554 TObjString* key = 0;
1555 while ((key = (TObjString*) iter.Next()))
1556 dcsMap->Add(key, dcsMap2->GetValue(key->String()));
1558 dcsMap2->SetOwner(kFALSE);
1568 // DCS Archive DB processing successful. Call Preprocessor!
1569 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1571 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1573 if (returnValue > 0) // Preprocessor error!
1575 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
1576 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1577 dcsMap->DeleteAll();
1583 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1584 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1585 fCurrentDetector.Data()));
1587 dcsMap->DeleteAll();
1593 //______________________________________________________________________________________________
1594 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1597 // Queries DAQ's Shuttle logbook and fills the detector status object.
1598 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1601 entries.SetOwner(1);
1603 // check connection; connect if needed
1604 if(!Connect(3)) return kFALSE;
1607 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1609 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1611 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1615 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1617 if(aResult->GetRowCount() == 0) {
1618 AliInfo("No entries in Shuttle Logbook match request");
1623 // TODO Check field count!
1624 const UInt_t nCols = 22;
1625 if (aResult->GetFieldCount() != (Int_t) nCols) {
1626 AliError("Invalid SQL result field number!");
1632 while ((aRow = aResult->Next())) {
1633 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1634 Int_t run = runString.Atoi();
1636 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1640 // loop on detectors
1641 for(UInt_t ii = 0; ii < nCols; ii++)
1642 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1644 entries.AddLast(entry);
1652 //______________________________________________________________________________________________
1653 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1656 // Retrieves run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
1659 // check connection; connect if needed
1664 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1666 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1668 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1672 if (aResult->GetRowCount() == 0) {
1673 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1678 if (aResult->GetRowCount() > 1) {
1679 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1684 TSQLRow* aRow = aResult->Next();
1687 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1692 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
1694 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1695 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
1697 UInt_t startTime = entry->GetStartTime();
1698 UInt_t endTime = entry->GetEndTime();
1700 if (!startTime || !endTime || startTime > endTime) {
1702 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
1703 run, startTime, endTime));
1716 //______________________________________________________________________________________________
1717 Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1718 TObjArray* valueSet, DCSType type)
1720 // Retrieve all "entry" data points from the DCS server
1721 // host, port: TSocket connection parameters
1722 // entry: name of the alias or data point
1723 // valueSet: array of retrieved AliDCSValue's
1724 // type: kAlias or kDP
1726 AliDCSClient client(host, port, fTimeout, fRetries);
1727 if (!client.IsConnected())
1736 result = client.GetAliasValues(entry,
1737 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1741 result = client.GetDPValues(entry,
1742 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1747 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
1748 entry, AliDCSClient::GetErrorString(result)));
1750 if (result == AliDCSClient::fgkServerError)
1752 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
1753 client.GetServerError().Data()));
1762 //______________________________________________________________________________________________
1763 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
1766 // Retrieve all "entry" data points from the DCS server
1767 // host, port: TSocket connection parameters
1768 // entries: list of alias or data point names
1769 // type: kAlias or kDP
1770 // returns TMap of values, 0 when failure
1772 const Int_t kSplit = 100; // maximum number of DPs at a time
1774 Int_t totalEntries = entries->GetEntries();
1778 for (Int_t index=0; index < totalEntries; index += kSplit)
1780 Int_t endIndex = index + kSplit;
1782 AliDCSClient client(host, port, fTimeout, fRetries);
1783 if (!client.IsConnected())
1786 TMap* partialResult = 0;
1790 partialResult = client.GetAliasValues(entries, GetCurrentStartTime(),
1791 GetCurrentEndTime(), index, endIndex);
1793 else if (type == kDP)
1795 partialResult = client.GetDPValues(entries, GetCurrentStartTime(),
1796 GetCurrentEndTime(), index, endIndex);
1799 if (partialResult == 0)
1801 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries (%d...%d)! Reason: %s",
1802 index, endIndex, client.GetServerError().Data()));
1810 AliInfo(Form("Retrieved entries %d..%d (total %d); E.g. %s has %d values collected",
1811 index, endIndex, totalEntries, entries->At(index)->GetName(), ((TObjArray*)
1812 partialResult->GetValue(entries->At(index)->GetName()))->GetEntriesFast()));
1816 result = partialResult;
1820 TIter iter(partialResult);
1821 TObjString* key = 0;
1822 while ((key = (TObjString*) iter.Next()))
1823 result->Add(key, partialResult->GetValue(key->String()));
1825 partialResult->SetOwner(kFALSE);
1826 delete partialResult;
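// Illustrative sketch (not part of AliShuttle): the batching used above, where
// at most kSplit entries are requested from the DCS server per connection.
// SplitRanges is a hypothetical helper. The half-open [begin, end) convention
// and the clamping of the last batch are assumptions, since the lines that
// adjust endIndex are not shown in this listing.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Split [0, total) into consecutive half-open ranges of at most `chunk` entries.
std::vector<std::pair<int, int>> SplitRanges(int total, int chunk)
{
    assert(chunk > 0);
    std::vector<std::pair<int, int>> ranges;
    for (int begin = 0; begin < total; begin += chunk) {
        const int end = std::min(begin + chunk, total); // clamp the last batch
        ranges.emplace_back(begin, end);
    }
    return ranges;
}
```

// For 250 entries and a batch size of 100 this gives [0,100), [100,200), [200,250).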
1833 //______________________________________________________________________________________________
1834 const char* AliShuttle::GetFile(Int_t system, const char* detector,
1835 const char* id, const char* source)
1837 // Get calibration file from file exchange servers
1838 // First queries the FXS database for the file name, using the run, detector, id and source info
1839 // then calls RetrieveFile(filename) for actual copy to local disk
1840 // run: current run being processed (given by Logbook entry fLogbookEntry)
1841 // detector: the Preprocessor name
1842 // id: provided as a parameter by the Preprocessor
1843 // source: provided by the Preprocessor through GetFileSources function
1845 // check if test mode should simulate a FXS error
1846 if (fTestMode & kErrorFXSFiles)
1848 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1852 // check connection; connect if needed
1853 if (!Connect(system))
1855 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
1859 // Query preparation
1860 TString sourceName(source);
1862 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1863 fConfig->GetFXSdbTable(system));
1864 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1865 GetCurrentRun(), detector, id);
1869 whereClause += Form(" and DAQsource=\"%s\"", source);
1871 else if (system == kDCS)
1875 else if (system == kHLT)
1877 whereClause += Form(" and DDLnumbers=\"%s\"", source);
1881 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1883 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1886 TSQLResult* aResult = 0;
1887 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
1889 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1890 GetSystemName(system), id, sourceName.Data()));
1894 if(aResult->GetRowCount() == 0)
1897 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1898 GetSystemName(system), id, sourceName.Data()));
1903 if (aResult->GetRowCount() > 1) {
1905 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1906 GetSystemName(system), id, sourceName.Data()));
1911 if (aResult->GetFieldCount() != nFields) {
1913 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1914 GetSystemName(system), id, sourceName.Data()));
1919 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1922 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1923 GetSystemName(system), id, sourceName.Data()));
1928 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1929 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
1930 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
1935 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1936 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
1938 // retrieved file is renamed to make it unique
1939 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1940 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1943 // file retrieval from FXS
1944 UInt_t nRetries = 0;
1945 UInt_t maxRetries = 3;
1946 Bool_t result = kFALSE;
1948 // copy!! if successful TSystem::Exec returns 0
1949 while(nRetries++ < maxRetries) {
1950 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1951 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1954 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
1955 filePath.Data(), GetSystemName(system)));
1958 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1959 filePath.Data(), GetSystemName(system),
1960 GetShuttleTempDir(), localFileName.Data()));
1963 if (fileChecksum.Length()>0)
1965 // compare md5sum of local file with the one stored in the FXS DB
1966 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s |grep %s 2>&1 > /dev/null",
1967 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
1971 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1977 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1978 filePath.Data(), GetSystemName(system)));
1983 if(!result) return 0;
1985 fFXSCalled[system]=kTRUE;
1986 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1987 fFXSlist[system].Add(fileParams);
1989 static TString fullLocalFileName;
1990 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1992 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1994 return fullLocalFileName.Data();
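// Illustrative sketch (not part of AliShuttle): the bounded retry pattern used
// above for copying a file from the FXS. RetryUntilSuccess is a hypothetical
// helper; the callable stands in for RetrieveFile plus the checksum check,
// and how the original loop leaves early on success is assumed, since those
// lines are not shown in this listing.

```cpp
#include <cassert>
#include <functional>

// Calls `attempt` until it succeeds or `maxRetries` attempts have been made.
// Returns the number of attempts used on success, 0 if every attempt failed.
unsigned RetryUntilSuccess(const std::function<bool()>& attempt, unsigned maxRetries)
{
    for (unsigned n = 1; n <= maxRetries; ++n) {
        if (attempt()) return n; // stop at the first successful attempt
    }
    return 0;
}
```

// Returning the attempt count keeps the "which retry succeeded" information
// available for logging without a separate counter at the call site.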
1998 //______________________________________________________________________________________________
1999 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2002 // Copies file from FXS to local Shuttle machine
2005 // check temp directory: trying to cd to temp; if it does not exist, create it
2006 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
2007 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
2009 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
2011 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
2012 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
2017 gSystem->FreeDirectory(dir);
2020 TString baseFXSFolder;
2023 baseFXSFolder = "FES/";
2025 else if (system == kDCS)
2029 else if (system == kHLT)
2031 baseFXSFolder = "~/";
2035 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
2036 fConfig->GetFXSPort(system),
2037 fConfig->GetFXSUser(system),
2038 fConfig->GetFXSHost(system),
2039 baseFXSFolder.Data(),
2041 GetShuttleTempDir(),
2044 AliDebug(2, Form("%s",command.Data()));
2046 Bool_t result = (gSystem->Exec(command.Data()) == 0);
2051 //______________________________________________________________________________________________
2052 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2055 // Get sources producing the condition file Id from file exchange servers
2056 // if id is NULL all sources are returned (distinct)
2059 // check if test mode should simulate a FXS error
2060 if (fTestMode & kErrorFXSSources)
2062 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2069 AliError("DCS system has only one source of data!");
2073 // check connection; connect if needed
2074 if (!Connect(system))
2076 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2080 TString sourceName = 0;
2083 sourceName = "DAQsource";
2084 } else if (system == kHLT)
2086 sourceName = "DDLnumbers";
2089 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2090 TString whereClause = Form("run=%d and detector=\"%s\"",
2091 GetCurrentRun(), detector);
2093 whereClause += Form(" and fileId=\"%s\"", id);
2094 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2096 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2099 TSQLResult* aResult;
2100 aResult = fServer[system]->Query(sqlQuery);
2102 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2103 GetSystemName(system), id));
2107 TList *list = new TList();
2110 if (aResult->GetRowCount() == 0)
2113 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2120 while ((aRow = aResult->Next()))
2123 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2124 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2125 list->Add(new TObjString(source));
2134 //______________________________________________________________________________________________
2135 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2138 // Get all ids of condition files produced by a given source from file exchange servers
2141 // check if test mode should simulate a FXS error
2142 if (fTestMode & kErrorFXSSources)
2144 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2148 // check connection; connect if needed
2149 if (!Connect(system))
2151 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2155 TString sourceName = 0;
2158 sourceName = "DAQsource";
2159 } else if (system == kHLT)
2161 sourceName = "DDLnumbers";
2164 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2165 TString whereClause = Form("run=%d and detector=\"%s\"",
2166 GetCurrentRun(), detector);
2167 if (sourceName.Length() > 0 && source)
2168 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2169 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2171 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2174 TSQLResult* aResult;
2175 aResult = fServer[system]->Query(sqlQuery);
2177 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2178 GetSystemName(system), source));
2182 TList *list = new TList();
2185 if (aResult->GetRowCount() == 0)
2188 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2195 while ((aRow = aResult->Next()))
2198 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2199 AliDebug(2, Form("fileId = %s", id.Data()));
2200 list->Add(new TObjString(id));
2209 //______________________________________________________________________________________________
2210 Bool_t AliShuttle::Connect(Int_t system)
2212 // Connect to MySQL Server of the system's FXS MySQL databases
2213 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2216 // check connection: if already connected return
2217 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2219 TString dbHost, dbUser, dbPass, dbName;
2221 if (system < 3) // FXS db servers
2223 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2224 dbUser = fConfig->GetFXSdbUser(system);
2225 dbPass = fConfig->GetFXSdbPass(system);
2226 dbName = fConfig->GetFXSdbName(system);
2227 } else { // Run & Shuttle logbook servers
2228 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2229 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2230 dbUser = fConfig->GetDAQlbUser();
2231 dbPass = fConfig->GetDAQlbPass();
2232 dbName = fConfig->GetDAQlbDB();
2235 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2236 if (!fServer[system] || !fServer[system]->IsConnected()) {
2239 AliError(Form("Can't establish connection to FXS database for %s",
2240 AliShuttleInterface::GetSystemName(system)));
2242 AliError("Can't establish connection to Run logbook.");
2244 if(fServer[system]) delete fServer[system];
2249 TSQLResult* aResult=0;
2252 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2255 aResult = fServer[kDCS]->GetTables(dbName.Data());
2258 aResult = fServer[kHLT]->GetTables(dbName.Data());
2261 aResult = fServer[3]->GetTables(dbName.Data());
2269 //______________________________________________________________________________________________
2270 Bool_t AliShuttle::UpdateTable()
2273 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2276 Bool_t result = kTRUE;
2278 for (UInt_t system=0; system<3; system++)
2280 if(!fFXSCalled[system]) continue;
2282 // check connection; connect if needed
2283 if (!Connect(system))
2285 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2290 TTimeStamp now; // now
2292 // Loop on FXS list entries
2293 TIter iter(&fFXSlist[system]);
2294 TObjString *aFXSentry=0;
2295 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2297 TString aFXSentrystr = aFXSentry->String();
2298 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2299 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2301 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2302 GetSystemName(system), aFXSentrystr.Data()));
2303 if(aFXSarray) delete aFXSarray;
2307 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2308 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2310 TString whereClause;
2313 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2314 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2316 else if (system == kDCS)
2318 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2319 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2321 else if (system == kHLT)
2323 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2324 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2329 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2330 (Long_t) now.GetSec(), whereClause.Data());
2332 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2335 TSQLResult* aResult;
2336 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2339 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2340 GetSystemName(system), sqlQuery.Data()));
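// Illustrative sketch (not part of AliShuttle): splitting an "id#!?!#source"
// entry at the full separator string. As far as the ROOT documentation
// describes it, TString::Tokenize treats its argument as a set of delimiter
// characters, so an id or source containing '#', '!' or '?', or an empty
// source, would not yield exactly two tokens in the loop above.
// SplitFileParams is a hypothetical helper using only the standard library.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Splits "id#!?!#source" at the first occurrence of the whole separator.
// Returns false (leaving `out` untouched) if the separator is missing.
bool SplitFileParams(const std::string& entry,
                     std::pair<std::string, std::string>& out)
{
    static const std::string kSeparator = "#!?!#";
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;
    out.first  = entry.substr(0, pos);
    out.second = entry.substr(pos + kSeparator.size());
    return true;
}
```

// An entry such as "a#b#!?!#" then yields id "a#b" and an empty source.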
2351 //______________________________________________________________________________________________
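// Illustrative sketch only (not part of AliShuttle): how the update query built in
// UpdateTable above is assembled for one FXS entry. The entry layout "fileId#!?!#source"
// and the column names are taken from the code above. The namespace, the enum values,
// the function names and any table name passed in are assumptions made for this sketch,
// and std::string stands in for TString/Form. As in the original, values are inserted
// into the query without escaping.
#include <string>
#include <utility>

namespace shuttle_sketch {

enum System { kDAQ = 0, kDCS = 1, kHLT = 2 }; // assumed numbering

// Simplified split of "fileId#!?!#source": any of '#', '!', '?' acts as a separator.
inline std::pair<std::string, std::string> SplitFxsEntry(const std::string& entry)
{
    const char* delims = "#!?";
    const std::string::size_type endId = entry.find_first_of(delims);
    if (endId == std::string::npos) return std::make_pair(entry, std::string());
    const std::string::size_type startSrc = entry.find_first_not_of(delims, endId);
    if (startSrc == std::string::npos) return std::make_pair(entry.substr(0, endId), std::string());
    return std::make_pair(entry.substr(0, endId), entry.substr(startSrc));
}

// Builds the "update ... set time_processed=... where ..." statement for one entry.
// DAQ rows are additionally matched on DAQsource, HLT rows on DDLnumbers.
inline std::string BuildUpdateQuery(const std::string& table, System system, int run,
                                    const std::string& detector, const std::string& fileId,
                                    const std::string& source, long timeProcessed)
{
    std::string where = "where run=" + std::to_string(run) +
                        " and detector=\"" + detector + "\" and fileId=\"" + fileId + "\"";
    if (system == kDAQ) where += " and DAQsource=\"" + source + "\"";
    else if (system == kHLT) where += " and DDLnumbers=\"" + source + "\"";
    where += ";";
    return "update " + table + " set time_processed=" + std::to_string(timeProcessed) + " " + where;
}

} // namespace shuttle_sketch
//______________________________________________________________________________________________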
2352 Bool_t AliShuttle::UpdateTableFailCase()
2354 // Update the FXS table: fill the time_processed field in all rows matching the current run and detector.
2355 // This is called when the preprocessor is declared failed for the current run, because
2356 // otherwise the fields are updated only in case of success
2358 Bool_t result = kTRUE;
2360 for (UInt_t system=0; system<3; system++)
2362 // check the connection; connect if needed
2363 if (!Connect(system))
2365 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2366 GetSystemName(system)));
2371 TTimeStamp now; // current time
2373 // A single query updates all rows for the current run and detector (no loop on FXS list entries)
2375 TString whereClause = Form("where run=%d and detector=\"%s\";",
2376 GetCurrentRun(), fCurrentDetector.Data());
2379 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2380 (Long_t) now.GetSec(), whereClause.Data()); // GetSec() returns time_t: cast to match %ld
2382 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2385 TSQLResult* aResult;
2386 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2389 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2390 GetSystemName(system), sqlQuery.Data()));
2400 //______________________________________________________________________________________________
2401 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2404 // Update the Shuttle logbook: fill either a detector column or the shuttle_done column
2405 // Usage examples: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2408 // check the connection; connect if needed
2410 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2414 TString detName(detector);
2416 if(detName == "shuttle_done")
2418 setClause = "set shuttle_done=1";
2420 // Send the information to ML
2421 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2424 mlList.Add(&mlStatus);
2426 fMonaLisa->SendParameters(&mlList);
2428 TString statusStr(status);
2429 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2430 statusStr.Contains("failed", TString::kIgnoreCase)){
2431 setClause = Form("set %s=\"%s\"", detector, status);
2434 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2440 TString whereClause = Form("where run=%d", GetCurrentRun());
2442 TString sqlQuery = Form("update %s %s %s",
2443 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2445 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2448 TSQLResult* aResult;
2449 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2451 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2459 //______________________________________________________________________________________________
2460 Int_t AliShuttle::GetCurrentRun() const
2463 // Get current run from logbook entry
2466 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2469 //______________________________________________________________________________________________
2470 UInt_t AliShuttle::GetCurrentStartTime() const
2473 // Get current start time from logbook entry
2476 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2479 //______________________________________________________________________________________________
2480 UInt_t AliShuttle::GetCurrentEndTime() const
2483 // Get current end time from logbook entry
2486 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2489 //______________________________________________________________________________________________
2490 void AliShuttle::Log(const char* detector, const char* message)
2493 // Write a message to the log: printed via AliInfo and appended to the detector's log file
2496 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2498 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2499 AliError(Form("Can't open or create directory <%s>", GetShuttleLogDir()));
2504 gSystem->FreeDirectory(dir);
2507 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2508 if (GetCurrentRun() >= 0)
2509 toLog += Form("run %d - ", GetCurrentRun());
2510 toLog += Form("%s", message);
2512 AliInfo(toLog.Data());
2514 // if the log output is already redirected to the file, return here
2515 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2518 TString fileName = GetLogFileName(detector);
2520 gSystem->ExpandPathName(fileName);
2523 logFile.open(fileName, ofstream::out | ofstream::app);
2525 if (!logFile.is_open()) {
2526 AliError(Form("Could not open file %s", fileName.Data()));
2530 logFile << toLog.Data() << "\n";
2535 //______________________________________________________________________________________________
2536 TString AliShuttle::GetLogFileName(const char* detector) const
2539 // returns the name of the log file for a given sub detector
2544 if (GetCurrentRun() >= 0)
2545 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2547 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2552 //______________________________________________________________________________________________
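// Illustrative sketch only (not part of AliShuttle): the log file naming rule used by
// GetLogFileName above, written with std::string. The namespace, the function name and
// the directory used in any call are assumptions made for this sketch. A negative run
// number means "no current run", matching GetCurrentRun() returning -1.
#include <string>

namespace shuttle_sketch {

// Returns "<logDir>/<detector>_<run>.log" when a run is set, else "<logDir>/<detector>.log".
inline std::string LogFileName(const std::string& logDir, const std::string& detector, int run)
{
    if (run >= 0) return logDir + "/" + detector + "_" + std::to_string(run) + ".log";
    return logDir + "/" + detector + ".log";
}

} // namespace shuttle_sketch
//______________________________________________________________________________________________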
2553 Bool_t AliShuttle::Collect(Int_t run)
2556 // Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default).
2557 // If a specific run is given, only that run is processed.
2559 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2563 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2565 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2567 SetLastAction("Starting");
2569 TString whereClause("where shuttle_done=0");
2571 whereClause += Form(" and run=%d", run);
2573 TObjArray shuttleLogbookEntries;
2574 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2576 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2580 if (shuttleLogbookEntries.GetEntries() == 0)
2583 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2585 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2586 "or it does not exist in Shuttle logbook", run));
2590 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2591 fFirstUnprocessed[iDet] = kTRUE;
2595 // query the Shuttle logbook for earlier runs, check whether some detectors are still unprocessed,
2596 // and flag them in the fFirstUnprocessed array
2597 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2598 TObjArray tmpLogbookEntries;
2599 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2601 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2605 TIter iter(&tmpLogbookEntries);
2606 AliShuttleLogbookEntry* anEntry = 0;
2607 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2609 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2611 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2613 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2614 anEntry->GetRun(), GetDetName(iDet)));
2615 fFirstUnprocessed[iDet] = kFALSE;
2623 if (!RetrieveConditionsData(shuttleLogbookEntries))
2625 Log("SHUTTLE", "Collect - Processing of at least one run failed");
2629 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
2633 //______________________________________________________________________________________________
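// Illustrative sketch only (not part of AliShuttle): the "first unprocessed" flagging done
// in Collect above, on plain containers. Every flag starts as true and is cleared for a
// detector as soon as one earlier run still has that detector unprocessed. The namespace,
// the enum and the function name are assumptions made for this sketch.
#include <cstddef>
#include <vector>

namespace shuttle_sketch {

enum DetStatus { kSketchUnprocessed, kSketchDone, kSketchFailed };

// earlierRuns[r][d] holds the status of detector d in earlier run r.
inline std::vector<bool> FirstUnprocessedFlags(
    const std::vector<std::vector<DetStatus> >& earlierRuns, std::size_t nDetectors)
{
    std::vector<bool> first(nDetectors, true);
    for (std::size_t r = 0; r < earlierRuns.size(); ++r) {
        for (std::size_t d = 0; d < nDetectors && d < earlierRuns[r].size(); ++d) {
            if (earlierRuns[r][d] == kSketchUnprocessed) first[d] = false;
        }
    }
    return first;
}

} // namespace shuttle_sketch
//______________________________________________________________________________________________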
2634 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2637 // Retrieve conditions data for all runs that aren't processed yet
2640 Bool_t hasError = kFALSE;
2642 TIter iter(&dateEntries);
2643 AliShuttleLogbookEntry* anEntry;
2645 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2646 if (!Process(anEntry)){
2650 // clean SHUTTLE temp directory
2651 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2652 RemoveFile(filename.Data());
2655 return hasError == kFALSE;
2658 //______________________________________________________________________________________________
2659 ULong_t AliShuttle::GetTimeOfLastAction() const
2662 // Gets time of last action
2667 fMonitoringMutex->Lock();
2669 tmp = fLastActionTime;
2671 fMonitoringMutex->UnLock();
2676 //______________________________________________________________________________________________
2677 const TString AliShuttle::GetLastAction() const
2680 // returns a string description of the last action
2685 fMonitoringMutex->Lock();
2689 fMonitoringMutex->UnLock();
2694 //______________________________________________________________________________________________
2695 void AliShuttle::SetLastAction(const char* action)
2698 // updates the monitoring variables
2701 fMonitoringMutex->Lock();
2703 fLastAction = action;
2704 fLastActionTime = time(0);
2706 fMonitoringMutex->UnLock();
2709 //______________________________________________________________________________________________
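// Illustrative sketch only (not part of AliShuttle): the mutex-protected monitoring
// variables handled by GetTimeOfLastAction, GetLastAction and SetLastAction above.
// The original locks and unlocks fMonitoringMutex by hand; this sketch swaps in
// std::mutex with std::lock_guard so the lock is released on every return path.
// The namespace, class and member names are assumptions made for this sketch.
#include <ctime>
#include <mutex>
#include <string>

namespace shuttle_sketch {

class LastActionMonitor {
public:
    LastActionMonitor() : fTime(0) {}

    // Records the action and the time at which it was set.
    void Set(const std::string& action)
    {
        std::lock_guard<std::mutex> lock(fMutex);
        fAction = action;
        fTime = std::time(0);
    }

    // Returns a copy, so the caller never reads the member while it is being modified.
    std::string Action() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fAction;
    }

    std::time_t Time() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fTime;
    }

private:
    mutable std::mutex fMutex; // mutable: const getters still need to lock
    std::string fAction;
    std::time_t fTime;
};

} // namespace shuttle_sketch
//______________________________________________________________________________________________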
2710 const char* AliShuttle::GetRunParameter(const char* param)
2713 // returns run parameter read from DAQ logbook
2716 if(!fLogbookEntry) {
2717 AliError("No logbook entry!");
2721 return fLogbookEntry->GetRunParameter(param);
2724 //______________________________________________________________________________________________
2725 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
2728 // returns object from OCDB valid for current run
2731 if (fTestMode & kErrorOCDB)
2733 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2737 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2740 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
2744 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2747 //______________________________________________________________________________________________
2748 Bool_t AliShuttle::SendMail()
2751 // sends a mail to the subdetector expert in case of preprocessor error
2754 if (fTestMode != kNone)
2757 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2760 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
2762 AliError(Form("Can't open or create directory <%s>", GetShuttleLogDir()));
2767 gSystem->FreeDirectory(dir);
2770 TString bodyFileName;
2771 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
2772 gSystem->ExpandPathName(bodyFileName);
2775 mailBody.open(bodyFileName, ofstream::out);
2777 if (!mailBody.is_open())
2779 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2784 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2785 TObjString *anExpert=0;
2786 while ((anExpert = (TObjString*) iterExperts.Next()))
2788 to += Form("%s,", anExpert->GetName());
2790 to.Remove(to.Length()-1);
2791 AliDebug(2, Form("to: %s",to.Data()));
2794 AliInfo("List of detector responsibles not yet set!");
2798 TString cc="alberto.colla@cern.ch";
2800 TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
2801 fCurrentDetector.Data(), GetCurrentRun());
2802 AliDebug(2, Form("subject: %s", subject.Data()));
2804 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2805 body += Form("SHUTTLE just detected that your preprocessor "
2806 "failed processing run %d!!\n\n", GetCurrentRun());
2807 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
2808 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
2809 body += Form("Find the %s log for the current run on \n\n"
2810 "\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n",
2811 fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
2812 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
2814 AliDebug(2, Form("Body begin: %s", body.Data()));
2816 mailBody << body.Data();
2818 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2820 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
2821 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2822 if (gSystem->Exec(tailCommand.Data()))
2824 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2827 TString endBody = Form("------------------------------------------------------\n\n");
2828 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2829 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
2830 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
2832 AliDebug(2, Form("Body end: %s", endBody.Data()));
2834 mailBody << endBody.Data();
2839 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2843 bodyFileName.Data());
2844 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
2846 Bool_t result = gSystem->Exec(mailCommand.Data());
2851 //______________________________________________________________________________________________
2852 const char* AliShuttle::GetRunType()
2855 // returns run type read from "run type" logbook
2858 if(!fLogbookEntry) {
2859 AliError("No logbook entry!");
2863 return fLogbookEntry->GetRunType();
2866 //______________________________________________________________________________________________
2867 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2870 // sets Shuttle temp directory
2873 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2876 //______________________________________________________________________________________________
2877 void AliShuttle::SetShuttleLogDir(const char* logDir)
2880 // sets Shuttle log directory
2883 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);