1 /**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
18 Revision 1.35 2007/04/04 16:26:38 acolla
19 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
20 2. Added missing dependency in test preprocessors.
21 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
23 Revision 1.34 2007/04/04 10:33:36 jgrosseo
24 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
25 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
27 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
29 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
31 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
33 5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
34 If you always need DCS data (like before), you do not need to implement it.
36 6) The run type has been added to the monitoring page
38 Revision 1.33 2007/04/03 13:56:01 acolla
39 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
42 Revision 1.32 2007/02/28 10:41:56 acolla
43 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
44 AliPreprocessor::GetRunType() function.
45 Added some ldap definition files.
47 Revision 1.30 2007/02/13 11:23:21 acolla
48 Moved getters and setters of Shuttle's main OCDB/Reference, local
49 OCDB/Reference, temp and log folders to AliShuttleInterface
51 Revision 1.27 2007/01/30 17:52:42 jgrosseo
52 adding monalisa monitoring
54 Revision 1.26 2007/01/23 19:20:03 acolla
55 Removed old ldif files, added TOF, MCH ldif files. Added some options in
56 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
59 Revision 1.25 2007/01/15 19:13:52 acolla
60 Moved some AliInfo to AliDebug in SendMail function
62 Revision 1.21 2006/12/07 08:51:26 jgrosseo
64 table, db names in ldap configuration
65 added GRP preprocessor
66 DCS data can also be retrieved by data point
68 Revision 1.20 2006/11/16 16:16:48 jgrosseo
69 introducing strict run ordering flag
70 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
72 Revision 1.19 2006/11/06 14:23:04 jgrosseo
73 major update (Alberto)
74 o) reading of run parameters from the logbook
75 o) online offline naming conversion
76 o) standalone DCSclient package
78 Revision 1.18 2006/10/20 15:22:59 jgrosseo
79 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
80 o) Merging Collect, CollectAll, CollectNew function
81 o) Removing implementation of empty copy constructors (declaration still there!)
83 Revision 1.17 2006/10/05 16:20:55 jgrosseo
84 adapting to new CDB classes
86 Revision 1.16 2006/10/05 15:46:26 jgrosseo
87 applying to the new interface
89 Revision 1.15 2006/10/02 16:38:39 jgrosseo
92 storing of objects that failed to be stored to the grid before
93 interfacing of shuttle status table in daq system
95 Revision 1.14 2006/08/29 09:16:05 jgrosseo
98 Revision 1.13 2006/08/15 10:50:00 jgrosseo
99 effc++ corrections (alberto)
101 Revision 1.12 2006/08/08 14:19:29 jgrosseo
102 Update to shuttle classes (Alberto)
104 - Possibility to set the full object's path in the Preprocessor's and
105 Shuttle's Store functions
106 - Possibility to extend the object's run validity in the same classes
107 ("startValidity" and "validityInfinite" parameters)
108 - Implementation of the StoreReferenceData function to store reference
109 data in a dedicated CDB storage.
111 Revision 1.11 2006/07/21 07:37:20 jgrosseo
112 last run is stored after each run
114 Revision 1.10 2006/07/20 09:54:40 jgrosseo
115 introducing status management: The processing per subdetector is divided into several steps,
116 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
117 can keep track of the number of failures and skips further processing after a certain threshold is
118 exceeded. These thresholds can be configured in LDAP.
120 Revision 1.9 2006/07/19 10:09:55 jgrosseo
121 new configuration, access to DAQ FES (Alberto)
123 Revision 1.8 2006/07/11 12:44:36 jgrosseo
124 adding parameters for extended validity range of data produced by preprocessor
126 Revision 1.7 2006/07/10 14:37:09 jgrosseo
127 small fix + todo comment
129 Revision 1.6 2006/07/10 13:01:41 jgrosseo
130 enhanced storing of last successfully processed run (alberto)
132 Revision 1.5 2006/07/04 14:59:57 jgrosseo
133 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
135 Revision 1.4 2006/06/12 09:11:16 jgrosseo
136 coding conventions (Alberto)
138 Revision 1.3 2006/06/06 14:26:40 jgrosseo
139 o) removed files that were moved to STEER
140 o) shuttle updated to follow the new interface (Alberto)
142 Revision 1.2 2006/03/07 07:52:34 hristov
143 New version (B.Yordanov)
145 Revision 1.6 2005/11/19 17:19:14 byordano
146 RetrieveDATEEntries and RetrieveConditionsData added
148 Revision 1.5 2005/11/19 11:09:27 byordano
149 AliShuttle declaration added
151 Revision 1.4 2005/11/17 17:47:34 byordano
152 TList changed to TObjArray
154 Revision 1.3 2005/11/17 14:43:23 byordano
157 Revision 1.1.1.1 2005/10/28 07:33:58 hristov
158 Initial import as subdirectory in AliRoot
160 Revision 1.2 2005/09/13 08:41:15 byordano
161 default startTime endTime added
163 Revision 1.4 2005/08/30 09:13:02 byordano
166 Revision 1.3 2005/08/29 21:15:47 byordano
172 // This class is the main manager for AliShuttle.
173 // It organizes the data retrieval from DCS and calls the
174 // interface methods of AliPreprocessor.
175 // For every detector in the AliShuttleConfig,
176 // data for its set of aliases is retrieved. If an AliPreprocessor
177 // is registered for this detector, it is used
178 // according to the schema (see AliPreprocessor).
179 // If no AliPreprocessor is registered, the retrieved
180 // data is stored automatically to the underlying AliCDBStorage.
181 // The alias name is used as detSpec.
184 #include "AliShuttle.h"
186 #include "AliCDBManager.h"
187 #include "AliCDBStorage.h"
188 #include "AliCDBId.h"
189 #include "AliCDBRunRange.h"
190 #include "AliCDBPath.h"
191 #include "AliCDBEntry.h"
192 #include "AliShuttleConfig.h"
193 #include "DCSClient/AliDCSClient.h"
195 #include "AliPreprocessor.h"
196 #include "AliShuttleStatus.h"
197 #include "AliShuttleLogbookEntry.h"
202 #include <TTimeStamp.h>
203 #include <TObjString.h>
204 #include <TSQLServer.h>
205 #include <TSQLResult.h>
208 #include <TSystemDirectory.h>
209 #include <TSystemFile.h>
210 #include <TFileMerger.h>
212 #include <TGridResult.h>
214 #include <TMonaLisaWriter.h>
218 #include <sys/types.h>
219 #include <sys/wait.h>
223 //______________________________________________________________________________________________
224 AliShuttle::AliShuttle(const AliShuttleConfig* config,
225 UInt_t timeout, Int_t retries):
227 fTimeout(timeout), fRetries(retries),
237 fReadTestMode(kFALSE),
238 fOutputRedirected(kFALSE)
241 // config: AliShuttleConfig used
242 // timeout: timeout used for AliDCSClient connection
243 // retries: the number of retries in case of connection error.
246 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
247 for(int iSys=0;iSys<4;iSys++) {
250 fFXSlist[iSys].SetOwner(kTRUE);
252 fPreprocessorMap.SetOwner(kTRUE);
254 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
255 fFirstUnprocessed[iDet] = kFALSE;
257 fMonitoringMutex = new TMutex();
260 //______________________________________________________________________________________________
261 AliShuttle::~AliShuttle()
267 fPreprocessorMap.DeleteAll();
268 for(int iSys=0;iSys<4;iSys++)
270 fServer[iSys]->Close();
271 delete fServer[iSys];
280 if (fMonitoringMutex)
282 delete fMonitoringMutex;
283 fMonitoringMutex = 0;
287 //______________________________________________________________________________________________
288 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
291 // Registers a new AliPreprocessor.
292 // It uses GetName() as the identifier of the preprocessor.
293 // The preprocessor is registered only if there isn't any other
294 // with the same identifier (GetName()).
297 const char* detName = preprocessor->GetName();
298 if(GetDetPos(detName) < 0)
299 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
301 if (fPreprocessorMap.GetValue(detName)) {
302 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
306 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
308 //______________________________________________________________________________________________
309 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
310 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
312 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
313 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
314 // using this function. Use StoreReferenceData instead!
315 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
316 // finishes the data are transferred to the main storage (Grid).
318 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
321 //______________________________________________________________________________________________
322 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
324 // Stores a CDB object in the storage for reference data. These objects will not be available during
325 // offline reconstruction. Use this function for reference data only!
326 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
327 // finishes the data are transferred to the main storage (Grid).
329 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
332 //______________________________________________________________________________________________
333 Bool_t AliShuttle::StoreLocally(const TString& localUri,
334 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
335 Int_t validityStart, Bool_t validityInfinite)
337 // Store object temporarily in local storage. Parameters are passed by Store and StoreReferenceData functions.
338 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
339 // The parameters are:
340 // 1) Uri of the backup storage (Local)
341 // 2) the object's path.
342 // 3) the object to be stored
343 // 4) the metaData to be associated with the object
344 // 5) the validity start run number w.r.t. the current run,
345 // if the data is valid only for this run leave the default 0
346 // 6) specifies if the calibration data is valid for infinity (this means until updated),
347 // typical for calibration runs, the default is kFALSE
349 // returns kFALSE on failure, kTRUE otherwise
351 if (fTestMode & kErrorStorage)
353 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
357 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
359 Int_t firstRun = GetCurrentRun() - validityStart;
361 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
366 if(validityInfinite) {
367 lastRun = AliCDBRunRange::Infinity();
369 lastRun = GetCurrentRun();
372 // Version is set to current run, it will be used later to transfer data to Grid
373 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
375 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
376 TObjString runUsed = Form("%d", GetCurrentRun());
377 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
380 Bool_t result = kFALSE;
382 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
383 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
385 result = AliCDBManager::Instance()->GetStorage(localUri)
386 ->Put(object, id, metaData);
391 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
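/*
 Illustrative sketch, not part of AliShuttle: the validity range that
 StoreLocally derives from validityStart and validityInfinite can be written
 as a small pure function. The names RunRange, ComputeValidity and
 kRunInfinity are hypothetical; kRunInfinity only stands in for
 AliCDBRunRange::Infinity(), whose actual value is not shown in this file.

```cpp
#include <limits>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
const int kRunInfinity = std::numeric_limits<int>::max();

// Hypothetical holder for the first and last valid run.
struct RunRange {
    int first;
    int last;
};

// The range starts validityStart runs before the current run (clamped at 0)
// and ends at the current run, or at "infinity" if validityInfinite is set.
RunRange ComputeValidity(int currentRun, int validityStart, bool validityInfinite)
{
    int first = currentRun - validityStart;
    if (first < 0) first = 0;  // same clamping as the warning in StoreLocally
    int last = validityInfinite ? kRunInfinity : currentRun;
    RunRange r = { first, last };
    return r;
}
```

 With the defaults (validityStart = 0, validityInfinite = kFALSE) the object
 is valid for the current run only.
*/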
397 //______________________________________________________________________________________________
398 Bool_t AliShuttle::StoreOCDB()
401 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
402 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
403 // Then calls StoreRefFilesToGrid to store reference files.
406 if (fTestMode & kErrorGrid)
408 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
409 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
413 AliInfo("Storing OCDB data ...");
414 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
416 AliInfo("Storing reference data ...");
417 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
419 AliInfo("Storing reference files ...");
420 Bool_t resultRefFiles = StoreRefFilesToGrid();
422 return resultCDB && resultRef && resultRefFiles;
425 //______________________________________________________________________________________________
426 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
429 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
432 TObjArray* gridIds=0;
434 Bool_t result = kTRUE;
436 const char* type = 0;
438 if(gridURI == fgkMainCDB) {
440 localURI = fgkLocalCDB;
441 } else if(gridURI == fgkMainRefStorage) {
443 localURI = fgkLocalRefStorage;
445 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
449 AliCDBManager* man = AliCDBManager::Instance();
451 AliCDBStorage *gridSto = man->GetStorage(gridURI);
454 Form("StoreOCDB - cannot activate main %s storage", type));
458 gridIds = gridSto->GetQueryCDBList();
460 // get objects previously stored in local CDB
461 AliCDBStorage *localSto = man->GetStorage(localURI);
464 Form("StoreOCDB - cannot activate local %s storage", type));
467 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
468 // Local objects were stored with current run as Grid version!
469 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
470 localEntries->SetOwner(1);
472 // loop on local stored objects
473 TIter localIter(localEntries);
474 AliCDBEntry *aLocEntry = 0;
475 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
476 aLocEntry->SetOwner(1);
477 AliCDBId aLocId = aLocEntry->GetId();
478 aLocEntry->SetVersion(-1);
479 aLocEntry->SetSubVersion(-1);
481 // If local object is valid up to infinity we store it only if it is
482 // the first unprocessed run!
483 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
484 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
486 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
487 "there are previous unprocessed runs!",
488 fCurrentDetector.Data(), aLocId.GetPath().Data()));
492 // loop on Grid valid Id's
493 Bool_t store = kTRUE;
494 TIter gridIter(gridIds);
495 AliCDBId* aGridId = 0;
496 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
497 if(aGridId->GetPath() != aLocId.GetPath()) continue;
498 // skip all objects valid up to infinity
499 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
500 // if we get here, it means there's already some more recent object stored on Grid!
505 // If we get here, the file can be stored!
506 Bool_t storeOk = gridSto->Put(aLocEntry);
507 if(!store || storeOk){
511 Log(fCurrentDetector.Data(),
512 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
513 type, aGridId->ToString().Data()));
516 Form("StoreOCDB - Object <%s> successfully put into %s storage",
517 aLocId.ToString().Data(), type));
520 // removing local filename...
522 localSto->IdToFilename(aLocId, filename);
523 AliInfo(Form("Removing local file %s", filename.Data()));
524 RemoveFile(filename.Data());
528 Form("StoreOCDB - Grid %s storage of object <%s> failed",
529 type, aLocId.ToString().Data()));
533 localEntries->Clear();
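/*
 Illustrative sketch, not part of AliShuttle: the per-object decision made in
 the loop of StoreOCDB(const TString&) can be summarised as below. ObjectId,
 StoreDecision, DecideGridStore and kRunInfinity are hypothetical stand-ins.
 The sketch assumes that an object with infinite validity is skipped when
 earlier runs are still unprocessed, and that a matching Grid object with
 finite validity blocks the transfer; parts of that branch are elided in
 this listing, so both points are assumptions.

```cpp
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
const int kRunInfinity = std::numeric_limits<int>::max();

// Hypothetical, minimal stand-in for AliCDBId.
struct ObjectId {
    std::string path;
    int lastRun;
};

enum StoreDecision { kStore, kSkipInfinite, kSkipNewerOnGrid };

StoreDecision DecideGridStore(const ObjectId& local,
                              const std::vector<ObjectId>& gridIds,
                              bool isFirstUnprocessedRun)
{
    // Infinite validity is only accepted for the first unprocessed run.
    if (local.lastRun == kRunInfinity && !isFirstUnprocessedRun)
        return kSkipInfinite;

    for (std::size_t i = 0; i < gridIds.size(); ++i) {
        if (gridIds[i].path != local.path) continue;     // different object
        if (gridIds[i].lastRun == kRunInfinity) continue; // ignore infinite ones
        return kSkipNewerOnGrid;  // a more recent object is already on the Grid
    }
    return kStore;
}
```

 Here gridIds plays the role of the list returned by GetQueryCDBList(), i.e.
 the Grid objects valid for the current run.
*/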
538 //______________________________________________________________________________________________
539 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
542 // Stores reference file directly (without opening it). This function stores the file locally
543 // renaming it to #runNumber_gridFileName.
546 if (fTestMode & kErrorStorage)
548 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
552 AliCDBManager* man = AliCDBManager::Instance();
553 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
555 TString localBaseFolder = sto->GetBaseFolder();
558 targetDir.Form("%s/%s", localBaseFolder.Data(), detector);
561 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
563 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
566 result = gSystem->mkdir(targetDir, kTRUE);
569 Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
574 result = gSystem->CopyFile(localFile, target);
578 Log("SHUTTLE", Form("StoreReferenceFile - Stored file %s locally to %s", localFile, target.Data()));
583 Log("SHUTTLE", Form("StoreReferenceFile - Storing file %s locally to %s failed", localFile, target.Data()));
588 //______________________________________________________________________________________________
589 Bool_t AliShuttle::StoreRefFilesToGrid()
592 // Transfers the reference file to the Grid.
593 // The final full path of the file is:
594 // gridBaseReferenceFolder/DET/#runNumber_gridFileName
597 AliCDBManager* man = AliCDBManager::Instance();
598 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
601 TString localBaseFolder = sto->GetBaseFolder();
604 dir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
606 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
609 TString gridBaseFolder = gridSto->GetBaseFolder();
611 alienDir.Form("%s%s", gridBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
617 begin.Form("%d_", GetCurrentRun());
619 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
623 TList* dirList = baseDir->GetListOfFiles();
630 Int_t nDirs = dirList->GetEntries();
632 Bool_t success = kTRUE;
633 Bool_t first = kTRUE;
635 for (Int_t iDir=0; iDir<nDirs; ++iDir)
637 TSystemFile* entry = dynamic_cast<TSystemFile*> (dirList->At(iDir));
641 if (entry->IsDirectory())
644 TString fileName(entry->GetName());
645 if (!fileName.BeginsWith(begin))
651 // check that DET folder exists, otherwise create it
652 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
657 if (!result->GetFileName(0))
659 if (!gGrid->Mkdir(alienDir.Data(),"",0))
661 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
670 TString fullLocalPath;
671 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
673 TString fullGridPath;
674 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
676 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s", fullLocalPath.Data(), fullGridPath.Data()));
678 TFileMerger fileMerger;
679 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
683 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded", fullLocalPath.Data(), fullGridPath.Data()));
684 RemoveFile(fullLocalPath);
688 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s failed", fullLocalPath.Data(), fullGridPath.Data()));
698 //______________________________________________________________________________________________
699 void AliShuttle::CleanLocalStorage(const TString& uri)
702 // Called in case the preprocessor is declared failed. Removes remaining objects from the local storages.
705 const char* type = 0;
706 if(uri == fgkLocalCDB) {
708 } else if(uri == fgkLocalRefStorage) {
711 AliError(Form("Invalid storage URI: %s", uri.Data()));
715 AliCDBManager* man = AliCDBManager::Instance();
717 // open local storage
718 AliCDBStorage *localSto = man->GetStorage(uri);
721 Form("CleanLocalStorage - cannot activate local %s storage", type));
725 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
726 localSto->GetBaseFolder().Data(), fCurrentDetector.Data(), GetCurrentRun()));
728 AliInfo(Form("filename = %s", filename.Data()));
730 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
731 GetCurrentRun(), fCurrentDetector.Data()));
733 RemoveFile(filename.Data());
737 //______________________________________________________________________________________________
738 void AliShuttle::RemoveFile(const char* filename)
741 // removes local file
744 TString command(Form("rm -f %s", filename));
746 Int_t result = gSystem->Exec(command.Data());
749 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
750 fCurrentDetector.Data(), filename));
754 //______________________________________________________________________________________________
755 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
758 // Reads the AliShuttleStatus from the CDB
766 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
767 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
769 if (!fStatusEntry) return 0;
770 fStatusEntry->SetOwner(1);
772 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
774 AliError("Invalid object stored to CDB!");
781 //______________________________________________________________________________________________
782 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
785 // writes the status for one subdetector
793 Int_t run = GetCurrentRun();
795 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
797 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
798 fStatusEntry->SetOwner(1);
800 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
803 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
804 fCurrentDetector.Data(), run));
813 //______________________________________________________________________________________________
814 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
817 // changes the AliShuttleStatus for the given detector and run to the given status
821 AliError("UNEXPECTED: fStatusEntry empty");
825 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
828 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
832 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
833 fCurrentDetector.Data(),
834 status->GetStatusName(),
835 status->GetStatusName(newStatus));
836 Log("SHUTTLE", actionStr);
837 SetLastAction(actionStr);
839 status->SetStatus(newStatus);
840 if (increaseCount) status->IncreaseCount();
842 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
847 //______________________________________________________________________________________________
848 void AliShuttle::SendMLInfo()
851 // sends ML information about the current status of the current detector being processed
854 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
857 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
861 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
862 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
865 mlList.Add(&mlStatus);
866 mlList.Add(&mlRetryCount);
868 fMonaLisa->SendParameters(&mlList);
871 //______________________________________________________________________________________________
872 Bool_t AliShuttle::ContinueProcessing()
874 // this function reads the AliShuttleStatus information from CDB and
875 // checks if the processing should be continued
876 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
878 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
880 AliPreprocessor* aPreprocessor =
881 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
884 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
888 AliShuttleLogbookEntry::Status entryStatus =
889 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
891 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
892 AliInfo(Form("ContinueProcessing - %s is %s",
893 fCurrentDetector.Data(),
894 fLogbookEntry->GetDetectorStatusName(entryStatus)));
898 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
900 // check if current run is first unprocessed run for current detector
901 if (fConfig->StrictRunOrder(fCurrentDetector) &&
902 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
904 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
908 AliShuttleStatus* status = ReadShuttleStatus();
911 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
912 fCurrentDetector.Data()));
913 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
914 return WriteShuttleStatus(status);
917 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
918 // If it happens it may mean Logbook updating failed... let's do it now!
919 if (status->GetStatus() == AliShuttleStatus::kDone ||
920 status->GetStatus() == AliShuttleStatus::kFailed){
921 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
922 fCurrentDetector.Data(),
923 status->GetStatusName(status->GetStatus())));
924 UpdateShuttleLogbook(fCurrentDetector.Data(),
925 status->GetStatusName(status->GetStatus()));
929 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
931 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
932 fCurrentDetector.Data()));
933 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
935 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
936 fCurrentDetector.Data()));
937 UpdateShuttleStatus(AliShuttleStatus::kDone);
938 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
941 Form("ContinueProcessing - %s: Grid storage failed again",
942 fCurrentDetector.Data()));
943 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
948 // if we get here, there is a restart
949 Bool_t cont = kFALSE;
952 if (status->GetCount() >= fConfig->GetMaxRetries()) {
953 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
954 "Updating Shuttle Logbook", fCurrentDetector.Data(),
955 status->GetCount(), status->GetStatusName()));
956 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
957 UpdateShuttleStatus(AliShuttleStatus::kFailed);
959 // there may still be objects in local OCDB and reference storage
960 // and FXS databases may be not updated: do it now!
962 // TODO Currently disabled, we want to keep files in case of failure!
963 // CleanLocalStorage(fgkLocalCDB);
964 // CleanLocalStorage(fgkLocalRefStorage);
965 // UpdateTableFailCase();
967 // Send mail to detector expert!
968 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
970 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
971 fCurrentDetector.Data()));
974 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
975 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
976 status->GetStatusName(), status->GetCount()));
977 Bool_t increaseCount = kTRUE;
978 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
979 increaseCount = kFALSE;
980 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
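/*
 Illustrative sketch, not part of AliShuttle: the restart logic of
 ContinueProcessing, reduced to a pure function of the stored status, the
 failure count and the configured maximum number of retries. The enums and
 DecideNextStep are hypothetical simplifications; the real code uses
 AliShuttleStatus and also consults the Shuttle logbook and the strict run
 ordering flag, which are left out here.

```cpp
// Simplified stand-ins for a subset of the AliShuttleStatus states.
enum Status { kNone, kStarted, kDCSStarted, kDCSError, kPPTimeOut,
              kStoreError, kDone, kFailed };

// What the caller should do next.
enum Action { kStartFirstTime, kSyncLogbookAndSkip, kRetryGridStore,
              kGiveUp, kRestart };

struct Decision {
    Action action;
    bool increaseCount;  // only meaningful for kRestart
};

Decision DecideNextStep(Status status, int count, int maxRetries)
{
    Decision d = { kRestart, false };
    if (status == kNone) {                        // no status stored yet
        d.action = kStartFirstTime;
    } else if (status == kDone || status == kFailed) {
        d.action = kSyncLogbookAndSkip;           // logbook was not updated
    } else if (status == kStoreError) {
        d.action = kRetryGridStore;               // preprocessor is not rerun
    } else if (count >= maxRetries) {
        d.action = kGiveUp;                       // mark FAILED, notify expert
    } else {
        // DCS problems do not count against the retry budget.
        d.increaseCount = !(status == kDCSError || status == kDCSStarted);
    }
    return d;
}
```

 The order of the checks follows the function above: a Grid storage error is
 retried before the retry limit is consulted.
*/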
987 //______________________________________________________________________________________________
988 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
991 // Retrieves data for all detectors in the configuration.
992 // entry: Shuttle logbook entry, contains run parameters and status of detectors
993 // (Unprocessed, Inactive, Failed or Done).
994 // Returns kFALSE if an error occurred and kTRUE otherwise
997 if (!entry) return kFALSE;
999 fLogbookEntry = entry;
1001 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1004 // create ML instance that monitors this run
1005 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1006 // disable monitoring of other parameters that come e.g. from TFile
1007 gMonitoringWriter = 0;
1009 // Send the information to ML
1010 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1011 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1014 mlList.Add(&mlStatus);
1015 mlList.Add(&mlRunType);
1017 fMonaLisa->SendParameters(&mlList);
1019 if (fLogbookEntry->IsDone())
1021 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1022 UpdateShuttleLogbook("shuttle_done");
1027 // read test mode if flag is set
1031 TString logEntry(entry->GetRunParameter("log"));
1032 //printf("log entry = %s\n", logEntry.Data());
1033 TString searchStr("Testmode: ");
1034 Int_t pos = logEntry.Index(searchStr.Data());
1035 //printf("%d\n", pos);
1038 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1039 //printf("%s\n", subStr.String().Data());
1040 TString newStr(subStr.Data());
1041 TObjArray* token = newStr.Tokenize(' ');
1045 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1048 Int_t testMode = tmpStr->String().Atoi();
1051 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1052 SetTestMode((TestMode) testMode);
1060 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
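/*
 Illustrative sketch, not part of AliShuttle: extracting the test mode from
 the logbook "log" field, as done above with TString and Tokenize, written
 with the standard library only. ParseTestMode is a hypothetical helper, and
 returning -1 when no value is found is an assumption of this sketch.

```cpp
#include <cstdlib>
#include <string>

// Returns the integer that follows "Testmode: " in logEntry, or -1 if the
// marker or the value is missing.
int ParseTestMode(const std::string& logEntry)
{
    const std::string marker("Testmode: ");
    const std::string::size_type pos = logEntry.find(marker);
    if (pos == std::string::npos) return -1;

    // Text after the marker; take its first space-separated token.
    const std::string rest = logEntry.substr(pos + marker.size());
    const std::string::size_type begin = rest.find_first_not_of(' ');
    if (begin == std::string::npos) return -1;
    const std::string::size_type end = rest.find(' ', begin);
    const std::string token = (end == std::string::npos)
        ? rest.substr(begin)
        : rest.substr(begin, end - begin);

    return std::atoi(token.c_str());
}
```

 The parsed value corresponds to what is passed to SetTestMode above.
*/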
1062 fLogbookEntry->Print("all");
1065 Bool_t hasError = kFALSE;
1067 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1068 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1069 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1070 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1072 // Loop on detectors in the configuration
1073 TIter iter(fConfig->GetDetectors());
1074 TObjString* aDetector = 0;
1076 while ((aDetector = (TObjString*) iter.Next()))
1078 fCurrentDetector = aDetector->String();
1080 if (ContinueProcessing() == kFALSE) continue;
1082 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1083 GetCurrentRun(), aDetector->GetName()));
1085 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1087 Log(fCurrentDetector.Data(), "Starting processing");
1093 Log("SHUTTLE", "ERROR: Forking failed");
1098 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1099 GetCurrentRun(), aDetector->GetName()));
1101 Long_t begin = time(0);
1103 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1104 while (waitpid(pid, &status, WNOHANG) == 0)
1106 Long_t expiredTime = time(0) - begin;
1108 if (expiredTime > fConfig->GetPPTimeOut())
1111 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1112 fCurrentDetector.Data(), expiredTime);
1113 Log("SHUTTLE", tmp);
1114 Log(fCurrentDetector, tmp);
1118 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1121 gSystem->Sleep(1000);
1125 gSystem->Sleep(1000);
1128 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1129 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1132 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1137 if (!fgets(buffer, 100, pipe))
1139 Log("SHUTTLE", "Error: ps did not return anything");
1140 gSystem->ClosePipe(pipe);
1143 gSystem->ClosePipe(pipe);
1145 //Log("SHUTTLE", Form("ps returned %s", buffer));
1148 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1150 Log("SHUTTLE", "Error: Could not parse output of ps");
1154 if (expiredTime % 60 == 0)
1155 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1156 fCurrentDetector.Data(), expiredTime, mem));
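/*
Illustrative sketch, not part of AliShuttle: the memory check above reads the
last line printed by "ps -o vsize" and converts it to an integer number of KB.
The helper name ParseVsizeKB is an assumption for this example. Like the code
above it rejects a zero value; rejecting negative values is an extra assumption.

```cpp
#include <cstdio>

// Hypothetical helper: parse one line of ps output holding the virtual memory
// size in KB. Returns false when the line does not start with a positive
// integer; kiloBytes is only written on success.
bool ParseVsizeKB(const char* line, int& kiloBytes)
{
  int value = 0;
  if (std::sscanf(line, "%d", &value) != 1 || value <= 0) return false;
  kiloBytes = value;
  return true;
}
```

A header line such as "VSZ" fails the conversion, so the caller can log the
parse error and try again on the next polling cycle.
*/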
1158 if (mem > fConfig->GetPPMaxMem())
1161 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1162 mem, fConfig->GetPPMaxMem());
1163 Log("SHUTTLE", tmp);
1164 Log(fCurrentDetector, tmp);
1168 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1171 gSystem->Sleep(1000);
1176 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1177 GetCurrentRun(), aDetector->GetName()));
1179 if (WIFEXITED(status))
1181 Int_t returnCode = WEXITSTATUS(status);
1183 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1186 if (returnCode == 0) hasError = kTRUE;
1192 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1194 AliInfo("Redirecting output...");
1196 if ((freopen(GetLogFileName(fCurrentDetector), "w", stdout)) == 0)
1198 Log("SHUTTLE", "Could not freopen stdout");
1202 fOutputRedirected = kTRUE;
1203 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1204 Log("SHUTTLE", "Could not redirect stderr");
1208 Bool_t success = ProcessCurrentDetector();
1209 if (success) // Preprocessor finished successfully!
1211 // Update time_processed field in FXS DB
1212 if (UpdateTable() == kFALSE)
1213 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1215 // Transfer the data from local storage to main storage (Grid)
1216 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1217 if (StoreOCDB() == kFALSE)
1219 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1220 GetCurrentRun(), aDetector->GetName()));
1221 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1224 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1225 GetCurrentRun(), aDetector->GetName()));
1226 UpdateShuttleStatus(AliShuttleStatus::kDone);
1227 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1231 for (UInt_t iSys=0; iSys<3; iSys++)
1233 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1236 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
1237 GetCurrentRun(), aDetector->GetName(), success));
1239 // the client exits here
1240 gSystem->Exit(success);
1242 AliError("We should never get here!!!");
1246 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1249 //check if shuttle is done for this run, if so update logbook
1250 TObjArray checkEntryArray;
1251 checkEntryArray.SetOwner(1);
1252 TString whereClause = Form("where run=%d", GetCurrentRun());
1253 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1254 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1256 return hasError == kFALSE;
1259 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1260 (checkEntryArray.At(0));
1264 if (checkEntry->IsDone())
1266 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1267 UpdateShuttleLogbook("shuttle_done");
1271 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1273 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1275 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1276 checkEntry->GetRun(), GetDetName(iDet)));
1277 fFirstUnprocessed[iDet] = kFALSE;
1283 // remove ML instance
1289 return hasError == kFALSE;
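/*
Illustrative sketch, not part of AliShuttle: Process() runs each preprocessor in
a forked child and polls it from the parent with waitpid(..., WNOHANG), killing
it when a time limit is exceeded. The sketch below shows only that pattern with
POSIX calls. The name RunWithTimeout, the 100 ms polling interval and the
negative return codes are assumptions for this example; the memory check done
through ps is left out.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <csignal>
#include <ctime>

// Hypothetical helper: run `work` in a child process and watch it from the parent.
// Returns the child's exit code, -1 if fork() failed, or -2 if the child was
// killed after the timeout or did not exit normally.
int RunWithTimeout(int (*work)(), long timeoutSeconds)
{
  const pid_t pid = fork();
  if (pid < 0) return -1;          // fork failed
  if (pid == 0) _exit(work());     // child: the exit code carries the result

  const time_t begin = time(0);
  int status = 0;                  // plain int, as waitpid requires
  while (waitpid(pid, &status, WNOHANG) == 0) {   // 0 means: still running
    if (time(0) - begin > timeoutSeconds) {
      kill(pid, SIGKILL);
      waitpid(pid, &status, 0);    // reap the killed child
      return -2;
    }
    usleep(100 * 1000);            // poll every 100 ms
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -2;
}
```

In Process() the child exits with the Bool_t returned by ProcessCurrentDetector(),
so an exit code of 0 is the failure case there; the sketch simply hands the code
back to the caller.
*/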
1292 //______________________________________________________________________________________________
1293 Bool_t AliShuttle::ProcessCurrentDetector()
1296 // Retrieves the data for a single detector (fCurrentDetector).
1297 // There should be a configuration for this detector.
1299 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
1304 Bool_t aDCSError = kFALSE;
1306 // call preprocessor
1307 AliPreprocessor* aPreprocessor =
1308 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1310 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1312 Bool_t processDCS = aPreprocessor->ProcessDCS();
1314 if (!processDCS || (fTestMode & kSkipDCS))
1316 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
1318 else if (fTestMode & kErrorDCS)
1320 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1321 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1322 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1326 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1328 TString host(fConfig->GetDCSHost(fCurrentDetector));
1329 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1331 // Retrieval of Aliases
1332 TObjString* anAlias = 0;
1334 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
1335 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1336 while ((anAlias = (TObjString*) iterAliases.Next()))
1338 TObjArray *valueSet = new TObjArray();
1339 valueSet->SetOwner(1);
1341 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1342 AliInfo(Form("Querying DCS archive: alias %s (%d of %d)",
1343 anAlias->GetName(), iAlias, nTotAliases));
iAlias++; // advance the counter on every pass, not only when the message is printed
1344 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1348 dcsMap.Add(anAlias->Clone(), valueSet);
1350 Log(fCurrentDetector,
1351 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1352 anAlias->GetName()));
1353 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1359 // Retrieval of Data Points
1360 TObjString* aDP = 0;
1362 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
1363 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1364 while ((aDP = (TObjString*) iterDP.Next()))
1366 TObjArray *valueSet = new TObjArray();
1367 valueSet->SetOwner(1);
1368 if (((iDP-1) % 500) == 0 || iDP == nTotDPs)
1369 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1370 aDP->GetName(), iDP, nTotDPs));
iDP++; // advance the counter on every pass, not only when the message is printed
1371 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1375 dcsMap.Add(aDP->Clone(), valueSet);
1377 Log(fCurrentDetector,
1378 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1380 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1387 // DCS Archive DB processing successful. Call Preprocessor!
1388 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1390 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
1392 if (returnValue > 0) // Preprocessor error!
1394 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
1395 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1401 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1402 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1403 fCurrentDetector.Data()));
1410 //______________________________________________________________________________________________
1411 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1414 // Queries DAQ's Shuttle logbook and fills the detector status object.
1415 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1418 entries.SetOwner(1);
1420 // check connection; connect if necessary
1421 if(!Connect(3)) return kFALSE;
1424 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1426 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1428 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1432 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1434 if(aResult->GetRowCount() == 0) {
1435 AliInfo("No entries in Shuttle Logbook match request");
1440 // TODO Check field count!
1441 const UInt_t nCols = 22;
1442 if (aResult->GetFieldCount() != (Int_t) nCols) {
1443 AliError("Invalid SQL result field number!");
1449 while ((aRow = aResult->Next())) {
1450 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1451 Int_t run = runString.Atoi();
1453 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1457 // loop on detectors
1458 for(UInt_t ii = 0; ii < nCols; ii++)
1459 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1461 entries.AddLast(entry);
1469 //______________________________________________________________________________________________
1470 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1473 // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1476 // check connection; connect if necessary
1481 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1483 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1485 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1489 if (aResult->GetRowCount() == 0) {
1490 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1495 if (aResult->GetRowCount() > 1) {
1496 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1501 TSQLRow* aRow = aResult->Next();
1504 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1509 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
1511 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1512 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
1514 UInt_t startTime = entry->GetStartTime();
1515 UInt_t endTime = entry->GetEndTime();
1517 if (!startTime || !endTime || startTime > endTime) {
1519 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
1520 run, startTime, endTime));
1533 //______________________________________________________________________________________________
1534 Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1535 TObjArray* valueSet, DCSType type)
1537 // Retrieve all "entry" data points from the DCS server
1538 // host, port: TSocket connection parameters
1539 // entry: name of the alias or data point
1540 // valueSet: array of retrieved AliDCSValue's
1541 // type: kAlias or kDP
1543 AliDCSClient client(host, port, fTimeout, fRetries);
1544 if (!client.IsConnected())
1553 result = client.GetAliasValues(entry,
1554 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1558 result = client.GetDPValues(entry,
1559 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1564 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
1565 entry, AliDCSClient::GetErrorString(result)));
1567 if (result == AliDCSClient::fgkServerError)
1569 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
1570 client.GetServerError().Data()));
1579 //______________________________________________________________________________________________
1580 const char* AliShuttle::GetFile(Int_t system, const char* detector,
1581 const char* id, const char* source)
1583 // Get calibration file from file exchange servers
1584 // First queries the FXS database for the file name, using the run, detector, id and source info
1585 // then calls RetrieveFile(filename) for actual copy to local disk
1586 // run: current run being processed (given by Logbook entry fLogbookEntry)
1587 // detector: the Preprocessor name
1588 // id: provided as a parameter by the Preprocessor
1589 // source: provided by the Preprocessor through GetFileSources function
1591 // check if test mode should simulate a FXS error
1592 if (fTestMode & kErrorFXSFiles)
1594 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1598 // check connection; connect if necessary
1599 if (!Connect(system))
1601 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
1605 // Query preparation
1606 TString sourceName(source);
1608 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1609 fConfig->GetFXSdbTable(system));
1610 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1611 GetCurrentRun(), detector, id);
1615 whereClause += Form(" and DAQsource=\"%s\"", source);
1617 else if (system == kDCS)
1621 else if (system == kHLT)
1623 whereClause += Form(" and DDLnumbers=\"%s\"", source);
1627 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1629 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
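/*
Illustrative sketch, not part of AliShuttle: the query assembled above selects
one row by run, detector and file id, and adds a source column that depends on
the system (DAQsource for DAQ, DDLnumbers for HLT, none for DCS, as in the where
clauses of UpdateTable). The function name BuildFxsQuery and the enum values are
assumptions for this example, and the values are interpolated as-is with no
escaping, exactly like the Form() calls above.

```cpp
#include <sstream>
#include <string>

enum FxsSystem { kDAQ = 0, kDCS = 1, kHLT = 2 };  // assumed numbering

// Hypothetical helper: build the FXS lookup query as one string.
std::string BuildFxsQuery(FxsSystem system, const std::string& table, int run,
                          const std::string& detector, const std::string& id,
                          const std::string& source)
{
  std::ostringstream query;
  query << "select filePath,size,fileChecksum from " << table << " where"
        << " run=" << run
        << " and detector=\"" << detector << "\""
        << " and fileId=\"" << id << "\"";

  if (system == kDAQ)      query << " and DAQsource=\"" << source << "\"";
  else if (system == kHLT) query << " and DDLnumbers=\"" << source << "\"";
  // kDCS: single source of data, so no source column is added

  return query.str();
}
```
*/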
1632 TSQLResult* aResult = 0;
1633 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
1635 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1636 GetSystemName(system), id, sourceName.Data()));
1640 if(aResult->GetRowCount() == 0)
1643 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1644 GetSystemName(system), id, sourceName.Data()));
1649 if (aResult->GetRowCount() > 1) {
1651 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1652 GetSystemName(system), id, sourceName.Data()));
1657 if (aResult->GetFieldCount() != nFields) {
1659 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1660 GetSystemName(system), id, sourceName.Data()));
1665 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1668 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1669 GetSystemName(system), id, sourceName.Data()));
1674 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1675 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
1676 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
1681 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1682 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
1684 // retrieved file is renamed to make it unique
1685 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1686 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1689 // file retrieval from FXS
1690 UInt_t nRetries = 0;
1691 UInt_t maxRetries = 3;
1692 Bool_t result = kFALSE;
1694 // copy!! if successful TSystem::Exec returns 0
1695 while(nRetries++ < maxRetries) {
1696 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1697 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1700 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
1701 filePath.Data(), GetSystemName(system)));
1704 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1705 filePath.Data(), GetSystemName(system),
1706 GetShuttleTempDir(), localFileName.Data()));
1709 if (fileChecksum.Length()>0)
1711 // compare md5sum of local file with the one stored in the FXS DB
1712 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s |grep %s > /dev/null 2>&1",
1713 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
1717 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1723 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1724 filePath.Data(), GetSystemName(system)));
1729 if(!result) return 0;
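/*
Illustrative sketch, not part of AliShuttle: the retrieval above is a bounded
retry loop in which one attempt means "copy the file, then verify its checksum",
and the loop stops at the first attempt that passes both steps. The helper name
RetryUntilSuccess is an assumption for this example; the callable stands in for
the RetrieveFile call plus the md5sum comparison.

```cpp
#include <functional>

// Hypothetical helper: call `attempt` until it reports success or until
// maxRetries attempts have been made. Returns true on the first success.
bool RetryUntilSuccess(const std::function<bool()>& attempt, unsigned maxRetries)
{
  for (unsigned n = 0; n < maxRetries; ++n) {
    if (attempt()) return true;   // copy and checksum both succeeded
  }
  return false;                   // every attempt failed
}
```

Keeping the copy and the verification inside a single attempt means that a
corrupted transfer is retried in the same way as a failed one.
*/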
1731 fFXSCalled[system]=kTRUE;
1732 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1733 fFXSlist[system].Add(fileParams);
1735 static TString fullLocalFileName;
1736 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1738 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1740 return fullLocalFileName.Data();
1744 //______________________________________________________________________________________________
1745 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
1748 // Copies file from FXS to local Shuttle machine
1751 // check temp directory: trying to cd to temp; if it does not exist, create it
1752 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1753 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
1755 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
1757 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1758 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
1763 gSystem->FreeDirectory(dir);
1766 TString baseFXSFolder;
1769 baseFXSFolder = "FES/";
1771 else if (system == kDCS)
1775 else if (system == kHLT)
1777 baseFXSFolder = "~/";
1781 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1782 fConfig->GetFXSPort(system),
1783 fConfig->GetFXSUser(system),
1784 fConfig->GetFXSHost(system),
1785 baseFXSFolder.Data(),
1787 GetShuttleTempDir(),
1790 AliDebug(2, Form("%s",command.Data()));
1792 Bool_t result = (gSystem->Exec(command.Data()) == 0);
1797 //______________________________________________________________________________________________
1798 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1801 // Get sources producing the condition file Id from file exchange servers
1804 // check if test mode should simulate a FXS error
1805 if (fTestMode & kErrorFXSSources)
1807 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1814 AliError("DCS system has only one source of data!");
1818 // check connection; connect if necessary
1819 if (!Connect(system))
1821 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
1825 TString sourceName;
1828 sourceName = "DAQsource";
1829 } else if (system == kHLT)
1831 sourceName = "DDLnumbers";
1834 TString sqlQueryStart = Form("select %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
1835 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1836 GetCurrentRun(), detector, id);
1837 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1839 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1842 TSQLResult* aResult;
1843 aResult = fServer[system]->Query(sqlQuery);
1845 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
1846 GetSystemName(system), id));
1850 if (aResult->GetRowCount() == 0)
1853 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
1859 TList *list = new TList();
1862 while ((aRow = aResult->Next()))
1865 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
1866 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
1867 list->Add(new TObjString(source));
1876 //______________________________________________________________________________________________
1877 Bool_t AliShuttle::Connect(Int_t system)
1879 // Connect to MySQL Server of the system's FXS MySQL databases
1880 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
1883 // check connection: if already connected return
1884 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
1886 TString dbHost, dbUser, dbPass, dbName;
1888 if (system < 3) // FXS db servers
1890 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
1891 dbUser = fConfig->GetFXSdbUser(system);
1892 dbPass = fConfig->GetFXSdbPass(system);
1893 dbName = fConfig->GetFXSdbName(system);
1894 } else { // Run & Shuttle logbook servers
1895 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
1896 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
1897 dbUser = fConfig->GetDAQlbUser();
1898 dbPass = fConfig->GetDAQlbPass();
1899 dbName = fConfig->GetDAQlbDB();
1902 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
1903 if (!fServer[system] || !fServer[system]->IsConnected()) {
1906 AliError(Form("Can't establish connection to FXS database for %s",
1907 AliShuttleInterface::GetSystemName(system)));
1909 AliError("Can't establish connection to Run logbook.");
1911 if(fServer[system]) delete fServer[system];
1916 TSQLResult* aResult=0;
1919 aResult = fServer[kDAQ]->GetTables(dbName.Data());
1922 aResult = fServer[kDCS]->GetTables(dbName.Data());
1925 aResult = fServer[kHLT]->GetTables(dbName.Data());
1928 aResult = fServer[3]->GetTables(dbName.Data());
1936 //______________________________________________________________________________________________
1937 Bool_t AliShuttle::UpdateTable()
1940 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
1943 Bool_t result = kTRUE;
1945 for (UInt_t system=0; system<3; system++)
1947 if(!fFXSCalled[system]) continue;
1949 // check connection; connect if necessary
1950 if (!Connect(system))
1952 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
1957 TTimeStamp now; // now
1959 // Loop on FXS list entries
1960 TIter iter(&fFXSlist[system]);
1961 TObjString *aFXSentry=0;
1962 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
1964 TString aFXSentrystr = aFXSentry->String();
1965 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
1966 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
1968 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
1969 GetSystemName(system), aFXSentrystr.Data()));
1970 if(aFXSarray) delete aFXSarray;
1974 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
1975 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
1977 TString whereClause;
1980 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
1981 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
1983 else if (system == kDCS)
1985 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
1986 GetCurrentRun(), fCurrentDetector.Data(), fileId);
1988 else if (system == kHLT)
1990 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
1991 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
1996 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
1997 now.GetSec(), whereClause.Data());
1999 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2002 TSQLResult* aResult;
2003 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2006 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2007 GetSystemName(system), sqlQuery.Data()));
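/*
Illustrative sketch, not part of AliShuttle: GetFile() packs the file id and the
source into one string joined by "#!?!#", and UpdateTable() splits it again with
TString::Tokenize. Tokenize treats its argument as a set of delimiter characters,
so an id or source that itself contains '#', '!' or '?' would produce more than
two tokens and end up in the error branch above. The sketch below splits on the
whole separator string instead. The names kSeparator, JoinFileParams and
SplitFileParams are assumptions for this example.

```cpp
#include <string>

const std::string kSeparator = "#!?!#";   // same separator text as above

// Hypothetical helper: pack id and source into one string.
std::string JoinFileParams(const std::string& id, const std::string& source)
{
  return id + kSeparator + source;
}

// Hypothetical helper: split at the first occurrence of the full separator.
// Returns false when the separator is missing.
bool SplitFileParams(const std::string& packed, std::string& id, std::string& source)
{
  const std::string::size_type pos = packed.find(kSeparator);
  if (pos == std::string::npos) return false;
  id = packed.substr(0, pos);
  source = packed.substr(pos + kSeparator.size());
  return true;
}
```
*/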
2018 //______________________________________________________________________________________________
2019 Bool_t AliShuttle::UpdateTableFailCase()
2021 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2022 // this is called in case the preprocessor is declared failed for the current run, because
2023 // the fields are updated only in case of success
2025 Bool_t result = kTRUE;
2027 for (UInt_t system=0; system<3; system++)
2029 // check connection; connect if necessary
2030 if (!Connect(system))
2032 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2033 GetSystemName(system)));
2038 TTimeStamp now; // now
2040 // Update all rows of the current run and detector
2042 TString whereClause = Form("where run=%d and detector=\"%s\";",
2043 GetCurrentRun(), fCurrentDetector.Data());
2046 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2047 now.GetSec(), whereClause.Data());
2049 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2052 TSQLResult* aResult;
2053 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2056 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2057 GetSystemName(system), sqlQuery.Data()));
2067 //______________________________________________________________________________________________
2068 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2071 // Update Shuttle logbook filling detector or shuttle_done column
2072 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2075 // check connection; connect if necessary
2077 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2081 TString detName(detector);
2083 if(detName == "shuttle_done")
2085 setClause = "set shuttle_done=1";
2087 // Send the information to ML
2088 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2091 mlList.Add(&mlStatus);
2093 fMonaLisa->SendParameters(&mlList);
2095 TString statusStr(status);
2096 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2097 statusStr.Contains("failed", TString::kIgnoreCase)){
2098 setClause = Form("set %s=\"%s\"", detector, status);
2101 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2107 TString whereClause = Form("where run=%d", GetCurrentRun());
2109 TString sqlQuery = Form("update %s %s %s",
2110 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2112 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2115 TSQLResult* aResult;
2116 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2118 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2126 //______________________________________________________________________________________________
2127 Int_t AliShuttle::GetCurrentRun() const
2130 // Get current run from logbook entry
2133 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2136 //______________________________________________________________________________________________
2137 UInt_t AliShuttle::GetCurrentStartTime() const
2140 // get current start time
2143 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2146 //______________________________________________________________________________________________
2147 UInt_t AliShuttle::GetCurrentEndTime() const
2150 // get current end time from logbook entry
2153 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2156 //______________________________________________________________________________________________
2157 void AliShuttle::Log(const char* detector, const char* message)
2160 // Fill log string with a message
2163 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2165 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2166 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
2171 gSystem->FreeDirectory(dir);
2174 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2175 if (GetCurrentRun() >= 0)
2176 toLog += Form("run %d - ", GetCurrentRun());
2177 toLog += Form("%s", message);
2179 AliInfo(toLog.Data());
2181 // if the log output is already redirected to the file, return here
2182 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2185 TString fileName = GetLogFileName(detector);
2187 gSystem->ExpandPathName(fileName);
2190 logFile.open(fileName, ofstream::out | ofstream::app);
2192 if (!logFile.is_open()) {
2193 AliError(Form("Could not open file %s", fileName.Data()));
2197 logFile << toLog.Data() << "\n";
2202 //______________________________________________________________________________________________
2203 TString AliShuttle::GetLogFileName(const char* detector) const
2206 // returns the name of the log file for a given sub detector
2211 if (GetCurrentRun() >= 0)
2212 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2214 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
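/*
Illustrative sketch, not part of AliShuttle: the naming rule above is
"<log dir>/<detector>_<run>.log" while a run is being processed and
"<log dir>/<detector>.log" otherwise. The function name LogFileName and the
explicit parameters are assumptions for this example; the code above takes the
directory and the run from the Shuttle itself.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: build the log file name for one detector.
// A negative run number means that no run is currently being processed.
std::string LogFileName(const std::string& logDir, const std::string& detector, int run)
{
  std::ostringstream name;
  name << logDir << "/" << detector;
  if (run >= 0) name << "_" << run;
  name << ".log";
  return name.str();
}
```
*/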
2219 //______________________________________________________________________________________________
2220 Bool_t AliShuttle::Collect(Int_t run)
2223 // Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook when run = -1 (default)
2224 // If a dedicated run is given this run is processed
2226 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2230 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2232 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2234 SetLastAction("Starting");
2236 TString whereClause("where shuttle_done=0");
2238 whereClause += Form(" and run=%d", run);
2240 TObjArray shuttleLogbookEntries;
2241 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2243 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2247 if (shuttleLogbookEntries.GetEntries() == 0)
2250 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2252 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2253 "or it does not exist in Shuttle logbook", run));
2257 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2258 fFirstUnprocessed[iDet] = kTRUE;
2262 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2263 // flag them into fFirstUnprocessed array
2264 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2265 TObjArray tmpLogbookEntries;
2266 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2268 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2272 TIter iter(&tmpLogbookEntries);
2273 AliShuttleLogbookEntry* anEntry = 0;
2274 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2276 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2278 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2280 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2281 anEntry->GetRun(), GetDetName(iDet)));
2282 fFirstUnprocessed[iDet] = kFALSE;
2290 if (!RetrieveConditionsData(shuttleLogbookEntries))
2292 Log("SHUTTLE", "Collect - Process of at least one run failed");
2296 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
2300 //______________________________________________________________________________________________
2301 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2304 // Retrieve conditions data for all runs that aren't processed yet
2307 Bool_t hasError = kFALSE;
2309 TIter iter(&dateEntries);
2310 AliShuttleLogbookEntry* anEntry;
2312 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2313 if (!Process(anEntry)){
2317 // clean SHUTTLE temp directory
2318 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2319 RemoveFile(filename.Data());
2322 return hasError == kFALSE;
//______________________________________________________________________________________________
ULong_t AliShuttle::GetTimeOfLastAction() const
{
	// Gets time of last action (copied under the monitoring mutex)

	ULong_t tmp;
	fMonitoringMutex->Lock();
	tmp = fLastActionTime;
	fMonitoringMutex->UnLock();

	return tmp;
}
//______________________________________________________________________________________________
const TString AliShuttle::GetLastAction() const
{
	// returns a string description of the last action
	TString tmp;
	fMonitoringMutex->Lock();
	tmp = fLastAction;
	fMonitoringMutex->UnLock();
	return tmp;
}
//______________________________________________________________________________________________
void AliShuttle::SetLastAction(const char* action)
{
	// updates the monitoring variables

	fMonitoringMutex->Lock();

	fLastAction = action;
	fLastActionTime = time(0);

	fMonitoringMutex->UnLock();
}
//______________________________________________________________________________________________
const char* AliShuttle::GetRunParameter(const char* param)
{
	// returns run parameter read from DAQ logbook

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunParameter(param);
}
//______________________________________________________________________________________________
AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
{
	// returns object from OCDB valid for current run
	if (fTestMode & kErrorOCDB)
	{
		Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
		return 0;
	}
	AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
	if (!sto)
	{
		Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
		return 0;
	}
	return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
}
//______________________________________________________________________________________________
Bool_t AliShuttle::SendMail()
{
	// sends a mail to the subdetector expert in case of preprocessor error

	if (fTestMode != kNone)
		return kTRUE;

	void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
	if (dir == NULL)
	{
		if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
		{
			AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
			return kFALSE;
		}
	} else {
		gSystem->FreeDirectory(dir);
	}

	TString bodyFileName;
	bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
	gSystem->ExpandPathName(bodyFileName);

	ofstream mailBody;
	mailBody.open(bodyFileName, ofstream::out);

	if (!mailBody.is_open())
	{
		AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
		return kFALSE;
	}

	TString to="";
	TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
	TObjString *anExpert=0;
	while ((anExpert = (TObjString*) iterExperts.Next()))
	{
		to += Form("%s,", anExpert->GetName());
	}
	to.Remove(to.Length()-1);
	AliDebug(2, Form("to: %s",to.Data()));

	// TODO this will be removed...
	if (to.Contains("not_yet_set")) {
		AliInfo("List of detector responsibles not yet set!");
		return kFALSE;
	}

	TString cc="alberto.colla@cern.ch";

	TString subject = Form("%s Shuttle preprocessor error in run %d !",
				fCurrentDetector.Data(), GetCurrentRun());
	AliDebug(2, Form("subject: %s", subject.Data()));

	TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
	body += Form("SHUTTLE just detected that your preprocessor "
			"exited with ERROR state in run %d!!\n\n", GetCurrentRun());
	body += Form("Please check %s status on the web page asap!\n\n", fCurrentDetector.Data());
	body += Form("The last 10 lines of %s log file are following:\n\n", fCurrentDetector.Data());

	AliDebug(2, Form("Body begin: %s", body.Data()));

	mailBody << body.Data();
	mailBody.close();
	mailBody.open(bodyFileName, ofstream::out | ofstream::app);

	TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
	TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
	if (gSystem->Exec(tailCommand.Data()))
	{
		mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
	}

	TString endBody = Form("------------------------------------------------------\n\n");
	endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
	endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
	endBody += "Sincerely yours,\n\n \t\t\tthe SHUTTLE\n";

	AliDebug(2, Form("Body end: %s", endBody.Data()));

	mailBody << endBody.Data();
	mailBody.close();

	// send mail!
	TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
						subject.Data(),
						cc.Data(),
						to.Data(),
						bodyFileName.Data());
	AliDebug(2, Form("mail command: %s", mailCommand.Data()));

	Bool_t result = gSystem->Exec(mailCommand.Data());

	return result == 0;
}
//______________________________________________________________________________________________
const char* AliShuttle::GetRunType()
{
	// returns run type read from "run type" logbook

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunType();
}
//______________________________________________________________________________________________
void AliShuttle::SetShuttleTempDir(const char* tmpDir)
{
	// sets Shuttle temp directory

	fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
}
//______________________________________________________________________________________________
void AliShuttle::SetShuttleLogDir(const char* logDir)
{
	// sets Shuttle log directory

	fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
}