/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/
18 Revision 1.38 2007/04/12 08:26:18 jgrosseo
21 Revision 1.37 2007/04/10 16:53:14 jgrosseo
22 redirecting sub detector stdout, stderr to sub detector log file
24 Revision 1.35 2007/04/04 16:26:38 acolla
25 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
26 2. Added missing dependency in test preprocessors.
27 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
29 Revision 1.34 2007/04/04 10:33:36 jgrosseo
30 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
31 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
33 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
35 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
37 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
40 If you always need DCS data (like before), you do not need to implement it.
42 6) The run type has been added to the monitoring page
44 Revision 1.33 2007/04/03 13:56:01 acolla
45 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
48 Revision 1.32 2007/02/28 10:41:56 acolla
49 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
50 AliPreprocessor::GetRunType() function.
51 Added some ldap definition files.
53 Revision 1.30 2007/02/13 11:23:21 acolla
54 Moved getters and setters of Shuttle's main OCDB/Reference, local
55 OCDB/Reference, temp and log folders to AliShuttleInterface
57 Revision 1.27 2007/01/30 17:52:42 jgrosseo
58 adding monalisa monitoring
60 Revision 1.26 2007/01/23 19:20:03 acolla
61 Removed old ldif files, added TOF, MCH ldif files. Added some options in
62 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
65 Revision 1.25 2007/01/15 19:13:52 acolla
66 Moved some AliInfo to AliDebug in SendMail function
68 Revision 1.21 2006/12/07 08:51:26 jgrosseo
70 table, db names in ldap configuration
71 added GRP preprocessor
72 DCS data can also be retrieved by data point
74 Revision 1.20 2006/11/16 16:16:48 jgrosseo
75 introducing strict run ordering flag
76 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
78 Revision 1.19 2006/11/06 14:23:04 jgrosseo
79 major update (Alberto)
80 o) reading of run parameters from the logbook
81 o) online offline naming conversion
82 o) standalone DCSclient package
84 Revision 1.18 2006/10/20 15:22:59 jgrosseo
85 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
86 o) Merging Collect, CollectAll, CollectNew function
87 o) Removing implementation of empty copy constructors (declaration still there!)
89 Revision 1.17 2006/10/05 16:20:55 jgrosseo
90 adapting to new CDB classes
92 Revision 1.16 2006/10/05 15:46:26 jgrosseo
93 applying to the new interface
95 Revision 1.15 2006/10/02 16:38:39 jgrosseo
98 storing of objects that failed to be stored to the grid before
99 interfacing of shuttle status table in daq system
101 Revision 1.14 2006/08/29 09:16:05 jgrosseo
104 Revision 1.13 2006/08/15 10:50:00 jgrosseo
105 effc++ corrections (alberto)
107 Revision 1.12 2006/08/08 14:19:29 jgrosseo
108 Update to shuttle classes (Alberto)
110 - Possibility to set the full object's path in the Preprocessor's and
111 Shuttle's Store functions
112 - Possibility to extend the object's run validity in the same classes
113 ("startValidity" and "validityInfinite" parameters)
114 - Implementation of the StoreReferenceData function to store reference
115 data in a dedicated CDB storage.
117 Revision 1.11 2006/07/21 07:37:20 jgrosseo
118 last run is stored after each run
120 Revision 1.10 2006/07/20 09:54:40 jgrosseo
121 introducing status management: The processing per subdetector is divided into several steps,
122 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
123 can keep track of the number of failures and skips further processing after a certain threshold is
124 exceeded. These thresholds can be configured in LDAP.
126 Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)
129 Revision 1.8 2006/07/11 12:44:36 jgrosseo
130 adding parameters for extended validity range of data produced by preprocessor
132 Revision 1.7 2006/07/10 14:37:09 jgrosseo
133 small fix + todo comment
135 Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)
138 Revision 1.5 2006/07/04 14:59:57 jgrosseo
139 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
141 Revision 1.4 2006/06/12 09:11:16 jgrosseo
142 coding conventions (Alberto)
144 Revision 1.3 2006/06/06 14:26:40 jgrosseo
145 o) removed files that were moved to STEER
146 o) shuttle updated to follow the new interface (Alberto)
148 Revision 1.2 2006/03/07 07:52:34 hristov
149 New version (B.Yordanov)
151 Revision 1.6 2005/11/19 17:19:14 byordano
152 RetrieveDATEEntries and RetrieveConditionsData added
154 Revision 1.5 2005/11/19 11:09:27 byordano
155 AliShuttle declaration added
157 Revision 1.4 2005/11/17 17:47:34 byordano
158 TList changed to TObjArray
160 Revision 1.3 2005/11/17 14:43:23 byordano
163 Revision 1.1.1.1 2005/10/28 07:33:58 hristov
164 Initial import as subdirectory in AliRoot
166 Revision 1.2 2005/09/13 08:41:15 byordano
167 default startTime endTime added
169 Revision 1.4 2005/08/30 09:13:02 byordano
172 Revision 1.3 2005/08/29 21:15:47 byordano
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it will be used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically to the underlying AliCDBStorage.
// The alias name is used as detSpec.
#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"

#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>

#include <TSystemDirectory.h>
#include <TSystemFile.h>
#include <TFileMerger.h>

#include <TGridResult.h>

#include <TMonaLisaWriter.h>

#include <sys/types.h>
#include <sys/wait.h>
229 //______________________________________________________________________________________________
230 AliShuttle::AliShuttle(const AliShuttleConfig* config,
231 UInt_t timeout, Int_t retries):
233 fTimeout(timeout), fRetries(retries),
243 fReadTestMode(kFALSE),
244 fOutputRedirected(kFALSE)
247 // config: AliShuttleConfig used
248 // timeout: timeout used for AliDCSClient connection
249 // retries: the number of retries in case of connection error.
252 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
253 for(int iSys=0;iSys<4;iSys++) {
256 fFXSlist[iSys].SetOwner(kTRUE);
258 fPreprocessorMap.SetOwner(kTRUE);
260 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
261 fFirstUnprocessed[iDet] = kFALSE;
263 fMonitoringMutex = new TMutex();
266 //______________________________________________________________________________________________
267 AliShuttle::~AliShuttle()
273 fPreprocessorMap.DeleteAll();
274 for(int iSys=0;iSys<4;iSys++)
276 fServer[iSys]->Close();
277 delete fServer[iSys];
286 if (fMonitoringMutex)
288 delete fMonitoringMutex;
289 fMonitoringMutex = 0;
293 //______________________________________________________________________________________________
294 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
// Registers a new AliPreprocessor.
// It uses GetName() as the identifier of the preprocessor.
// The preprocessor is registered if there isn't any other
// with the same identifier (GetName()).
303 const char* detName = preprocessor->GetName();
304 if(GetDetPos(detName) < 0)
305 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
307 if (fPreprocessorMap.GetValue(detName)) {
308 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
312 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
314 //______________________________________________________________________________________________
315 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
316 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
318 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
319 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
320 // using this function. Use StoreReferenceData instead!
321 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
322 // finishes the data are transferred to the main storage (Grid).
324 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
327 //______________________________________________________________________________________________
328 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
// Stores a CDB object in the storage for reference data. These objects will not be available during
// offline reconstruction. Use this function for reference data only!
332 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
333 // finishes the data are transferred to the main storage (Grid).
335 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
338 //______________________________________________________________________________________________
339 Bool_t AliShuttle::StoreLocally(const TString& localUri,
340 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
341 Int_t validityStart, Bool_t validityInfinite)
// Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
// When the preprocessor finishes, the data are transferred to the main storage (Grid).
345 // The parameters are:
346 // 1) Uri of the backup storage (Local)
347 // 2) the object's path.
348 // 3) the object to be stored
349 // 4) the metaData to be associated with the object
350 // 5) the validity start run number w.r.t. the current run,
351 // if the data is valid only for this run leave the default 0
352 // 6) specifies if the calibration data is valid for infinity (this means until updated),
353 // typical for calibration runs, the default is kFALSE
355 // returns 0 if fail, 1 otherwise
357 if (fTestMode & kErrorStorage)
359 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
363 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
365 Int_t firstRun = GetCurrentRun() - validityStart;
367 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
372 if(validityInfinite) {
373 lastRun = AliCDBRunRange::Infinity();
375 lastRun = GetCurrentRun();
378 // Version is set to current run, it will be used later to transfer data to Grid
379 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
381 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
382 TObjString runUsed = Form("%d", GetCurrentRun());
383 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
386 Bool_t result = kFALSE;
388 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
389 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
391 result = AliCDBManager::Instance()->GetStorage(localUri)
392 ->Put(object, id, metaData);
397 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
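// Illustrative sketch, not part of AliShuttle: the validity range that StoreLocally
// derives from validityStart and validityInfinite, written as a small ROOT-free
// function. ComputeValidity and kRunInfinity are hypothetical names; kRunInfinity
// only stands in for AliCDBRunRange::Infinity(), whose real value is defined in AliRoot.

```cpp
#include <limits>
#include <utility>

// Assumption: placeholder for AliCDBRunRange::Infinity().
constexpr int kRunInfinity = std::numeric_limits<int>::max();

// Returns {firstRun, lastRun} for an object produced while processing currentRun.
// validityStart is counted backwards from the current run, as in StoreLocally.
std::pair<int, int> ComputeValidity(int currentRun, int validityStart, bool validityInfinite)
{
  int firstRun = currentRun - validityStart;
  if (firstRun < 0) firstRun = 0;  // a negative first run is reset to 0
  const int lastRun = validityInfinite ? kRunInfinity : currentRun;
  return {firstRun, lastRun};
}
```

// For example, ComputeValidity(100, 5, false) yields the run range [95, 100].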
403 //______________________________________________________________________________________________
404 Bool_t AliShuttle::StoreOCDB()
407 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
408 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
409 // Then calls StoreRefFilesToGrid to store reference files.
412 if (fTestMode & kErrorGrid)
414 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
415 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
419 AliInfo("Storing reference data ...");
420 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
422 AliInfo("Storing reference files ...");
423 Bool_t resultRefFiles = StoreRefFilesToGrid();
425 AliInfo("Storing OCDB data ...");
426 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
428 return resultCDB && resultRef && resultRefFiles;
431 //______________________________________________________________________________________________
432 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
435 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
438 TObjArray* gridIds=0;
440 Bool_t result = kTRUE;
441 // to check whether all files have been transferred, or some files were left behind
442 // because the run is not first unprocessed
443 Bool_t willDoAgain = kFALSE;
445 const char* type = 0;
447 if(gridURI == fgkMainCDB) {
449 localURI = fgkLocalCDB;
450 } else if(gridURI == fgkMainRefStorage) {
452 localURI = fgkLocalRefStorage;
454 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
458 AliCDBManager* man = AliCDBManager::Instance();
460 AliCDBStorage *gridSto = man->GetStorage(gridURI);
463 Form("StoreOCDB - cannot activate main %s storage", type));
467 gridIds = gridSto->GetQueryCDBList();
469 // get objects previously stored in local CDB
470 AliCDBStorage *localSto = man->GetStorage(localURI);
473 Form("StoreOCDB - cannot activate local %s storage", type));
476 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
477 // Local objects were stored with current run as Grid version!
478 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
479 localEntries->SetOwner(1);
481 // loop on local stored objects
482 TIter localIter(localEntries);
483 AliCDBEntry *aLocEntry = 0;
484 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
485 aLocEntry->SetOwner(1);
486 AliCDBId aLocId = aLocEntry->GetId();
487 aLocEntry->SetVersion(-1);
488 aLocEntry->SetSubVersion(-1);
490 // If local object is valid up to infinity we store it only if it is
491 // the first unprocessed run!
492 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
493 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
495 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
496 "there are previous unprocessed runs!",
497 fCurrentDetector.Data(), aLocId.GetPath().Data()));
502 // loop on Grid valid Id's
503 Bool_t store = kTRUE;
504 TIter gridIter(gridIds);
505 AliCDBId* aGridId = 0;
506 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
507 if(aGridId->GetPath() != aLocId.GetPath()) continue;
508 // skip all objects valid up to infinity
509 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
510 // if we get here, it means there's already some more recent object stored on Grid!
515 // If we get here, the file can be stored!
516 Bool_t storeOk = gridSto->Put(aLocEntry);
517 if(!store || storeOk){
521 Log(fCurrentDetector.Data(),
522 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
523 type, aGridId->ToString().Data()));
526 Form("StoreOCDB - Object <%s> successfully put into %s storage",
527 aLocId.ToString().Data(), type));
530 // removing local filename...
532 localSto->IdToFilename(aLocId, filename);
533 AliInfo(Form("Removing local file %s", filename.Data()));
534 RemoveFile(filename.Data());
538 Form("StoreOCDB - Grid %s storage of object <%s> failed",
539 type, aLocId.ToString().Data()));
543 localEntries->Clear();
545 if(result && willDoAgain) {
546 Log(fCurrentDetector.Data(),
547 "Some files have been left on local storage, will try again later!");
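// Illustrative sketch, not part of AliShuttle: the inner loop over the Grid ids in
// StoreOCDB(const TString&) looks for an object with the same path whose validity is
// not infinite. GridId and HasMoreRecentGridObject are hypothetical names, and
// kRunInfinity stands in for AliCDBRunRange::Infinity(). The sketch assumes the id
// list is the one obtained for the current run, as with GetQueryCDBList().

```cpp
#include <limits>
#include <string>
#include <vector>

// Assumption: placeholder for AliCDBRunRange::Infinity().
constexpr int kRunInfinity = std::numeric_limits<int>::max();

// Minimal stand-in for the parts of AliCDBId used by the check.
struct GridId {
  std::string path;
  int lastRun;
};

// True if the Grid already holds an object with the same path and finite validity.
bool HasMoreRecentGridObject(const std::vector<GridId>& gridIds, const std::string& localPath)
{
  for (const GridId& id : gridIds) {
    if (id.path != localPath) continue;        // different object
    if (id.lastRun == kRunInfinity) continue;  // skip objects valid up to infinity
    return true;
  }
  return false;
}
```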
554 //______________________________________________________________________________________________
555 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
558 // Stores reference file directly (without opening it). This function stores the file locally.
560 // The file is stored under the following location:
561 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
// where <gridFileName> is the third parameter given to the function
565 if (fTestMode & kErrorStorage)
567 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
571 AliCDBManager* man = AliCDBManager::Instance();
572 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
574 TString localBaseFolder = sto->GetBaseFolder();
577 targetDir.Form("%s/%s", localBaseFolder.Data(), detector);
580 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
582 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
585 result = gSystem->mkdir(targetDir, kTRUE);
588 Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
593 result = gSystem->CopyFile(localFile, target);
597 Log("SHUTTLE", Form("StoreReferenceFile - Stored file %s locally to %s", localFile, target.Data()));
602 Log("SHUTTLE", Form("StoreReferenceFile - Storing file %s locally to %s failed", localFile, target.Data()));
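// Illustrative sketch, not part of AliShuttle: how the local target path
// <base folder>/<DET>/<RUN#>_<gridFileName> used by StoreReferenceFile is composed.
// ReferenceFileTarget is a hypothetical helper that uses std::string instead of TString.

```cpp
#include <string>

// Builds "<baseFolder>/<detector>/<run>_<gridFileName>".
std::string ReferenceFileTarget(const std::string& baseFolder, const std::string& detector,
                                int run, const std::string& gridFileName)
{
  const std::string targetDir = baseFolder + "/" + detector;
  return targetDir + "/" + std::to_string(run) + "_" + gridFileName;
}
```

// For example, ReferenceFileTarget("/tmp/ref", "TPC", 1234, "pedestals.root")
// gives "/tmp/ref/TPC/1234_pedestals.root" (paths and names are made up).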
607 //______________________________________________________________________________________________
608 Bool_t AliShuttle::StoreRefFilesToGrid()
611 // Transfers the reference file to the Grid.
613 // The file is stored under the following location:
614 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
// where <gridFileName> is the name that was passed to StoreReferenceFile
618 AliCDBManager* man = AliCDBManager::Instance();
619 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
622 TString localBaseFolder = sto->GetBaseFolder();
625 dir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
627 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
630 TString gridBaseFolder = gridSto->GetBaseFolder();
632 alienDir.Form("%s%s", gridBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
638 begin.Form("%d_", GetCurrentRun());
640 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
644 TList* dirList = baseDir->GetListOfFiles();
651 Int_t nDirs = dirList->GetEntries();
653 Bool_t success = kTRUE;
654 Bool_t first = kTRUE;
656 for (Int_t iDir=0; iDir<nDirs; ++iDir)
658 TSystemFile* entry = dynamic_cast<TSystemFile*> (dirList->At(iDir));
662 if (entry->IsDirectory())
665 TString fileName(entry->GetName());
666 if (!fileName.BeginsWith(begin))
672 // check that DET folder exists, otherwise create it
673 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
678 if (!result->GetFileName(0))
680 if (!gGrid->Mkdir(alienDir.Data(),"",0))
682 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
691 TString fullLocalPath;
692 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
694 TString fullGridPath;
695 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
697 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s", fullLocalPath.Data(), fullGridPath.Data()));
699 TFileMerger fileMerger;
700 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
704 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded", fullLocalPath.Data(), fullGridPath.Data()));
705 RemoveFile(fullLocalPath);
709 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s failed", fullLocalPath.Data(), fullGridPath.Data()));
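// Illustrative sketch, not part of AliShuttle: StoreRefFilesToGrid only transfers
// files whose name begins with "<RUN#>_". SelectRunFiles is a hypothetical helper
// showing that prefix filter on a plain list of file names.

```cpp
#include <string>
#include <vector>

// Returns the names that start with "<run>_", in their original order.
std::vector<std::string> SelectRunFiles(const std::vector<std::string>& names, int run)
{
  const std::string prefix = std::to_string(run) + "_";
  std::vector<std::string> selected;
  for (const std::string& name : names) {
    // compare() on the leading characters plays the role of TString::BeginsWith
    if (name.compare(0, prefix.size(), prefix) == 0) selected.push_back(name);
  }
  return selected;
}
```

// The trailing underscore in the prefix matters: for run 1234, a file of run 12345
// is not selected.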
719 //______________________________________________________________________________________________
720 void AliShuttle::CleanLocalStorage(const TString& uri)
723 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
726 const char* type = 0;
727 if(uri == fgkLocalCDB) {
729 } else if(uri == fgkLocalRefStorage) {
732 AliError(Form("Invalid storage URI: %s", uri.Data()));
736 AliCDBManager* man = AliCDBManager::Instance();
738 // open local storage
739 AliCDBStorage *localSto = man->GetStorage(uri);
742 Form("CleanLocalStorage - cannot activate local %s storage", type));
746 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
747 localSto->GetBaseFolder().Data(), fCurrentDetector.Data(), GetCurrentRun()));
749 AliInfo(Form("filename = %s", filename.Data()));
751 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
752 GetCurrentRun(), fCurrentDetector.Data()));
754 RemoveFile(filename.Data());
758 //______________________________________________________________________________________________
759 void AliShuttle::RemoveFile(const char* filename)
762 // removes local file
765 TString command(Form("rm -f %s", filename));
767 Int_t result = gSystem->Exec(command.Data());
770 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
771 fCurrentDetector.Data(), filename));
775 //______________________________________________________________________________________________
776 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
779 // Reads the AliShuttleStatus from the CDB
787 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
788 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
790 if (!fStatusEntry) return 0;
791 fStatusEntry->SetOwner(1);
793 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
795 AliError("Invalid object stored to CDB!");
802 //______________________________________________________________________________________________
803 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
806 // writes the status for one subdetector
814 Int_t run = GetCurrentRun();
816 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
818 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
819 fStatusEntry->SetOwner(1);
821 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
824 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
825 fCurrentDetector.Data(), run));
834 //______________________________________________________________________________________________
835 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
838 // changes the AliShuttleStatus for the given detector and run to the given status
842 AliError("UNEXPECTED: fStatusEntry empty");
846 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
849 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
853 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
854 fCurrentDetector.Data(),
855 status->GetStatusName(),
856 status->GetStatusName(newStatus));
857 Log("SHUTTLE", actionStr);
858 SetLastAction(actionStr);
860 status->SetStatus(newStatus);
861 if (increaseCount) status->IncreaseCount();
863 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
868 //______________________________________________________________________________________________
869 void AliShuttle::SendMLInfo()
872 // sends ML information about the current status of the current detector being processed
875 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
878 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
882 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
883 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
886 mlList.Add(&mlStatus);
887 mlList.Add(&mlRetryCount);
889 fMonaLisa->SendParameters(&mlList);
892 //______________________________________________________________________________________________
893 Bool_t AliShuttle::ContinueProcessing()
895 // this function reads the AliShuttleStatus information from CDB and
896 // checks if the processing should be continued
897 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
899 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
901 AliPreprocessor* aPreprocessor =
902 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
905 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
909 AliShuttleLogbookEntry::Status entryStatus =
910 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
912 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
913 AliInfo(Form("ContinueProcessing - %s is %s",
914 fCurrentDetector.Data(),
915 fLogbookEntry->GetDetectorStatusName(entryStatus)));
919 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
921 // check if current run is first unprocessed run for current detector
922 if (fConfig->StrictRunOrder(fCurrentDetector) &&
923 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
929 AliShuttleStatus* status = ReadShuttleStatus();
932 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
933 fCurrentDetector.Data()));
934 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
935 return WriteShuttleStatus(status);
938 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
939 // If it happens it may mean Logbook updating failed... let's do it now!
940 if (status->GetStatus() == AliShuttleStatus::kDone ||
941 status->GetStatus() == AliShuttleStatus::kFailed){
942 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
943 fCurrentDetector.Data(),
944 status->GetStatusName(status->GetStatus())));
945 UpdateShuttleLogbook(fCurrentDetector.Data(),
946 status->GetStatusName(status->GetStatus()));
950 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
952 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
953 fCurrentDetector.Data()));
954 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
956 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
957 fCurrentDetector.Data()));
958 UpdateShuttleStatus(AliShuttleStatus::kDone);
959 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
962 Form("ContinueProcessing - %s: Grid storage failed again",
963 fCurrentDetector.Data()));
964 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
969 // if we get here, there is a restart
970 Bool_t cont = kFALSE;
973 if (status->GetCount() >= fConfig->GetMaxRetries()) {
974 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
975 "Updating Shuttle Logbook", fCurrentDetector.Data(),
976 status->GetCount(), status->GetStatusName()));
977 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
978 UpdateShuttleStatus(AliShuttleStatus::kFailed);
980 // there may still be objects in local OCDB and reference storage
981 // and FXS databases may be not updated: do it now!
983 // TODO Currently disabled, we want to keep files in case of failure!
984 // CleanLocalStorage(fgkLocalCDB);
985 // CleanLocalStorage(fgkLocalRefStorage);
986 // UpdateTableFailCase();
988 // Send mail to detector expert!
989 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
991 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
992 fCurrentDetector.Data()));
995 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
996 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
997 status->GetStatusName(), status->GetCount()));
998 Bool_t increaseCount = kTRUE;
999 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1000 increaseCount = kFALSE;
1001 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
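// Illustrative sketch, not part of AliShuttle: the restart decision at the end of
// ContinueProcessing. Once the retry count reaches the configured maximum the
// detector is declared failed. Otherwise processing restarts, and the count is
// increased unless the previous attempt stopped in a DCS-related status.
// Status, Decision and DecideRestart are hypothetical names; the enumerators only
// mirror some of the AliShuttleStatus values that appear in this file.

```cpp
// Assumption: reduced stand-in for AliShuttleStatus::Status.
enum class Status { kStarted, kDCSStarted, kDCSError, kPPTimeOut, kStoreError };

struct Decision {
  bool giveUp;         // true: mark the detector as FAILED
  bool increaseCount;  // only meaningful when giveUp is false
};

Decision DecideRestart(Status previous, unsigned count, unsigned maxRetries)
{
  if (count >= maxRetries) return {true, false};
  const bool dcsRelated =
      previous == Status::kDCSError || previous == Status::kDCSStarted;
  return {false, !dcsRelated};
}
```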
1008 //______________________________________________________________________________________________
1009 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1012 // Makes data retrieval for all detectors in the configuration.
// entry: Shuttle logbook entry, contains run parameters and status of detectors
// (Unprocessed, Inactive, Failed or Done).
// Returns kFALSE if an error occurred and kTRUE otherwise
1018 if (!entry) return kFALSE;
1020 fLogbookEntry = entry;
1022 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1025 // create ML instance that monitors this run
1026 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1027 // disable monitoring of other parameters that come e.g. from TFile
1028 gMonitoringWriter = 0;
1030 // Send the information to ML
1031 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1032 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1035 mlList.Add(&mlStatus);
1036 mlList.Add(&mlRunType);
1038 fMonaLisa->SendParameters(&mlList);
1040 if (fLogbookEntry->IsDone())
1042 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1043 UpdateShuttleLogbook("shuttle_done");
1048 // read test mode if flag is set
1052 TString logEntry(entry->GetRunParameter("log"));
1053 //printf("log entry = %s\n", logEntry.Data());
1054 TString searchStr("Testmode: ");
1055 Int_t pos = logEntry.Index(searchStr.Data());
1056 //printf("%d\n", pos);
1059 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1060 //printf("%s\n", subStr.String().Data());
1061 TString newStr(subStr.Data());
1062 TObjArray* token = newStr.Tokenize(' ');
1066 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1069 Int_t testMode = tmpStr->String().Atoi();
1072 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1073 SetTestMode((TestMode) testMode);
1081 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1083 fLogbookEntry->Print("all");
1086 Bool_t hasError = kFALSE;
1088 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1089 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1090 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1091 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1093 // Loop on detectors in the configuration
1094 TIter iter(fConfig->GetDetectors());
1095 TObjString* aDetector = 0;
1097 while ((aDetector = (TObjString*) iter.Next()))
1099 fCurrentDetector = aDetector->String();
1101 if (ContinueProcessing() == kFALSE) continue;
1103 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1104 GetCurrentRun(), aDetector->GetName()));
1106 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1108 Log(fCurrentDetector.Data(), "Starting processing");
1114 Log("SHUTTLE", "ERROR: Forking failed");
1119 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1120 GetCurrentRun(), aDetector->GetName()));
1122 Long_t begin = time(0);
1124 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1125 while (waitpid(pid, &status, WNOHANG) == 0)
1127 Long_t expiredTime = time(0) - begin;
1129 if (expiredTime > fConfig->GetPPTimeOut())
tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
	fCurrentDetector.Data(), expiredTime); // expiredTime is a Long_t, hence %ld
1134 Log("SHUTTLE", tmp);
1135 Log(fCurrentDetector, tmp);
1139 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1142 gSystem->Sleep(1000);
1146 gSystem->Sleep(1000);
1149 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1150 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1153 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1158 if (!fgets(buffer, 100, pipe))
1160 Log("SHUTTLE", "Error: ps did not return anything");
1161 gSystem->ClosePipe(pipe);
1164 gSystem->ClosePipe(pipe);
1166 //Log("SHUTTLE", Form("ps returned %s", buffer));
1169 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1171 Log("SHUTTLE", "Error: Could not parse output of ps");
if (expiredTime % 60 == 0)
	Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
		fCurrentDetector.Data(), expiredTime, mem));
1179 if (mem > fConfig->GetPPMaxMem())
1182 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1183 mem, fConfig->GetPPMaxMem());
1184 Log("SHUTTLE", tmp);
1185 Log(fCurrentDetector, tmp);
1189 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1192 gSystem->Sleep(1000);
1197 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1198 GetCurrentRun(), aDetector->GetName()));
1200 if (WIFEXITED(status))
1202 Int_t returnCode = WEXITSTATUS(status);
1204 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
// the client exits with its Bool_t success flag (see gSystem->Exit(success)), so 0 means failure
if (returnCode == 0) hasError = kTRUE;
1213 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1215 AliInfo("Redirecting output...");
1217 if ((freopen(GetLogFileName(fCurrentDetector), "w", stdout)) == 0)
1219 Log("SHUTTLE", "Could not freopen stdout");
1223 fOutputRedirected = kTRUE;
1224 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1225 Log("SHUTTLE", "Could not redirect stderr");
1229 Bool_t success = ProcessCurrentDetector();
1230 if (success) // Preprocessor finished successfully!
1232 // Update time_processed field in FXS DB
1233 if (UpdateTable() == kFALSE)
	Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1236 // Transfer the data from local storage to main storage (Grid)
1237 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1238 if (StoreOCDB() == kFALSE)
1240 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1241 GetCurrentRun(), aDetector->GetName()));
1242 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1245 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1246 GetCurrentRun(), aDetector->GetName()));
1247 UpdateShuttleStatus(AliShuttleStatus::kDone);
1248 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1252 for (UInt_t iSys=0; iSys<3; iSys++)
1254 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1257 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
1258 GetCurrentRun(), aDetector->GetName(), success));
1260 // the client exits here
1261 gSystem->Exit(success);
1263 AliError("We should never get here!!!");
1267 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1270 //check if shuttle is done for this run, if so update logbook
1271 TObjArray checkEntryArray;
1272 checkEntryArray.SetOwner(1);
1273 TString whereClause = Form("where run=%d", GetCurrentRun());
1274 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1275 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1277 return hasError == kFALSE;
1280 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1281 (checkEntryArray.At(0));
1285 if (checkEntry->IsDone())
1287 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1288 UpdateShuttleLogbook("shuttle_done");
1292 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1294 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1296 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1297 checkEntry->GetRun(), GetDetName(iDet)));
1298 fFirstUnprocessed[iDet] = kFALSE;
1304 // remove ML instance
1310 return hasError == kFALSE;
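// Illustrative sketch, not part of AliShuttle: the "Testmode: N" lookup that Process()
// performs on the run's log entry, written against the standard library only.
// ParseTestModeSketch is a hypothetical helper. It assumes the flag is a non-negative
// decimal integer directly following the literal "Testmode: " and returns -1 when the
// marker or the number is missing.
#include <cstdlib>
#include <string>

namespace {
int ParseTestModeSketch(const std::string& logEntry)
{
	const std::string marker("Testmode: ");
	const std::string::size_type pos = logEntry.find(marker);
	if (pos == std::string::npos) return -1;        // flag not present in the log entry

	const char* begin = logEntry.c_str() + pos + marker.size();
	char* end = 0;
	const long value = std::strtol(begin, &end, 10);
	if (end == begin || value < 0) return -1;       // no usable number after the marker

	return static_cast<int>(value);
}
} // namespace
// Possible usage: ParseTestModeSketch("cosmics Testmode: 3") yields 3, which a caller
// could then cast to TestMode in the same way Process() does.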
1313 //______________________________________________________________________________________________
1314 Bool_t AliShuttle::ProcessCurrentDetector()
// Makes data retrieval just for a specific detector (fCurrentDetector).
// There should be a configuration for this detector.
1320 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
1325 Bool_t aDCSError = kFALSE;
1327 // call preprocessor
1328 AliPreprocessor* aPreprocessor =
1329 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1331 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1333 Bool_t processDCS = aPreprocessor->ProcessDCS();
1335 if (!processDCS || (fTestMode & kSkipDCS))
1337 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
1339 else if (fTestMode & kErrorDCS)
1341 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1342 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1343 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1347 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1349 TString host(fConfig->GetDCSHost(fCurrentDetector));
1350 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1352 // Retrieval of Aliases
1353 TObjString* anAlias = 0;
1355 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
1356 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1357 while ((anAlias = (TObjString*) iterAliases.Next()))
1359 TObjArray *valueSet = new TObjArray();
1360 valueSet->SetOwner(1);
if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
	AliInfo(Form("Querying DCS archive: alias %s (%d of %d)",
		anAlias->GetName(), iAlias, nTotAliases));
iAlias++; // advance for every alias, not only when the progress line is printed
1365 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1369 dcsMap.Add(anAlias->Clone(), valueSet);
1371 Log(fCurrentDetector,
1372 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1373 anAlias->GetName()));
1374 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1380 // Retrieval of Data Points
1381 TObjString* aDP = 0;
1383 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
1384 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1385 while ((aDP = (TObjString*) iterDP.Next()))
1387 TObjArray *valueSet = new TObjArray();
1388 valueSet->SetOwner(1);
if (((iDP-1) % 500) == 0 || iDP == nTotDPs)
	AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
		aDP->GetName(), iDP, nTotDPs));
iDP++; // advance for every data point, not only when the progress line is printed
1392 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1396 dcsMap.Add(aDP->Clone(), valueSet);
1398 Log(fCurrentDetector,
1399 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1401 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1408 // DCS Archive DB processing successful. Call Preprocessor!
1409 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1411 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
1413 if (returnValue > 0) // Preprocessor error!
1415 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
1416 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1422 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1423 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1424 fCurrentDetector.Data()));
1431 //______________________________________________________________________________________________
1432 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
// Queries DAQ's Shuttle logbook and fills the detector status objects.
// Calls QueryRunParameters to query the DAQ logbook for run parameters.
1439 entries.SetOwner(1);
1441 // check connection, in case connect
1442 if(!Connect(3)) return kFALSE;
1445 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1447 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1449 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1453 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1455 if(aResult->GetRowCount() == 0) {
1456 AliInfo("No entries in Shuttle Logbook match request");
1461 // TODO Check field count!
1462 const UInt_t nCols = 22;
1463 if (aResult->GetFieldCount() != (Int_t) nCols) {
1464 AliError("Invalid SQL result field number!");
1470 while ((aRow = aResult->Next())) {
1471 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1472 Int_t run = runString.Atoi();
1474 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1478 // loop on detectors
1479 for(UInt_t ii = 0; ii < nCols; ii++)
1480 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1482 entries.AddLast(entry);
1490 //______________________________________________________________________________________________
1491 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
// Retrieves the run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
1497 // check connection, in case connect
1502 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1504 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1506 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1510 if (aResult->GetRowCount() == 0) {
1511 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1516 if (aResult->GetRowCount() > 1) {
1517 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1522 TSQLRow* aRow = aResult->Next();
1525 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1530 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
1532 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1533 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
1535 UInt_t startTime = entry->GetStartTime();
1536 UInt_t endTime = entry->GetEndTime();
1538 if (!startTime || !endTime || startTime > endTime) {
Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
	run, startTime, endTime)); // start and end times are UInt_t, hence %u
1554 //______________________________________________________________________________________________
1555 Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1556 TObjArray* valueSet, DCSType type)
1558 // Retrieve all "entry" data points from the DCS server
1559 // host, port: TSocket connection parameters
1560 // entry: name of the alias or data point
1561 // valueSet: array of retrieved AliDCSValue's
1562 // type: kAlias or kDP
1564 AliDCSClient client(host, port, fTimeout, fRetries);
1565 if (!client.IsConnected())
1574 result = client.GetAliasValues(entry,
1575 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1579 result = client.GetDPValues(entry,
1580 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1585 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
1586 entry, AliDCSClient::GetErrorString(result)));
1588 if (result == AliDCSClient::fgkServerError)
1590 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
1591 client.GetServerError().Data()));
1600 //______________________________________________________________________________________________
1601 const char* AliShuttle::GetFile(Int_t system, const char* detector,
1602 const char* id, const char* source)
1604 // Get calibration file from file exchange servers
// First queries the FXS database for the file name, using the run, detector, id and source info,
// then calls RetrieveFile(filename) for the actual copy to local disk
1607 // run: current run being processed (given by Logbook entry fLogbookEntry)
1608 // detector: the Preprocessor name
1609 // id: provided as a parameter by the Preprocessor
1610 // source: provided by the Preprocessor through GetFileSources function
1612 // check if test mode should simulate a FXS error
1613 if (fTestMode & kErrorFXSFiles)
1615 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1619 // check connection, in case connect
1620 if (!Connect(system))
1622 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
1626 // Query preparation
1627 TString sourceName(source);
1629 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1630 fConfig->GetFXSdbTable(system));
1631 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1632 GetCurrentRun(), detector, id);
1636 whereClause += Form(" and DAQsource=\"%s\"", source);
1638 else if (system == kDCS)
1642 else if (system == kHLT)
1644 whereClause += Form(" and DDLnumbers=\"%s\"", source);
1648 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1650 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1653 TSQLResult* aResult = 0;
1654 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
1656 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1657 GetSystemName(system), id, sourceName.Data()));
1661 if(aResult->GetRowCount() == 0)
1664 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1665 GetSystemName(system), id, sourceName.Data()));
1670 if (aResult->GetRowCount() > 1) {
1672 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1673 GetSystemName(system), id, sourceName.Data()));
1678 if (aResult->GetFieldCount() != nFields) {
1680 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1681 GetSystemName(system), id, sourceName.Data()));
1686 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1689 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1690 GetSystemName(system), id, sourceName.Data()));
1695 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1696 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
1697 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
1702 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1703 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
1705 // retrieved file is renamed to make it unique
1706 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1707 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1710 // file retrieval from FXS
1711 UInt_t nRetries = 0;
1712 UInt_t maxRetries = 3;
1713 Bool_t result = kFALSE;
1715 // copy!! if successful TSystem::Exec returns 0
1716 while(nRetries++ < maxRetries) {
1717 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1718 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1721 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
1722 filePath.Data(), GetSystemName(system)));
1725 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1726 filePath.Data(), GetSystemName(system),
1727 GetShuttleTempDir(), localFileName.Data()));
1730 if (fileChecksum.Length()>0)
1732 // compare md5sum of local file with the one stored in the FXS DB
// discard both stdout and stderr of grep; only the exit status is used
Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
	GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
1738 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1744 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1745 filePath.Data(), GetSystemName(system)));
1750 if(!result) return 0;
1752 fFXSCalled[system]=kTRUE;
1753 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1754 fFXSlist[system].Add(fileParams);
1756 static TString fullLocalFileName;
1757 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1759 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1761 return fullLocalFileName.Data();
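// Illustrative sketch, not part of AliShuttle: the bounded retry pattern GetFile() uses
// around RetrieveFile(), reduced to a generic helper. RetrySketch and
// SucceedsOnCallSketch are hypothetical names. The sketch assumes the loop stops at the
// first successful attempt and otherwise gives up after maxRetries attempts.
namespace {
template <typename Operation>
bool RetrySketch(Operation& attempt, unsigned int maxRetries, unsigned int& nTried)
{
	nTried = 0;
	while (nTried < maxRetries) {
		++nTried;                       // count the attempt before running it
		if (attempt()) return true;     // success: no further retries
	}
	return false;                       // every attempt failed
}

// Stand-in for a copy operation, used only to exercise the helper:
// it fails until it has been called succeedOn times.
struct SucceedsOnCallSketch {
	unsigned int fCalls;
	unsigned int fSucceedOn;
	explicit SucceedsOnCallSketch(unsigned int succeedOn) : fCalls(0), fSucceedOn(succeedOn) {}
	bool operator()() { return ++fCalls >= fSucceedOn; }
};
} // namespace
// Design note: returning the number of attempts separately keeps the success flag
// unambiguous, which matters when the caller logs the retry count.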
1765 //______________________________________________________________________________________________
1766 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
1769 // Copies file from FXS to local Shuttle machine
1772 // check temp directory: trying to cd to temp; if it does not exist, create it
AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
	fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
1776 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
1778 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1779 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
1784 gSystem->FreeDirectory(dir);
1787 TString baseFXSFolder;
1790 baseFXSFolder = "FES/";
1792 else if (system == kDCS)
1796 else if (system == kHLT)
1798 baseFXSFolder = "~/";
1802 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1803 fConfig->GetFXSPort(system),
1804 fConfig->GetFXSUser(system),
1805 fConfig->GetFXSHost(system),
1806 baseFXSFolder.Data(),
1808 GetShuttleTempDir(),
1811 AliDebug(2, Form("%s",command.Data()));
1813 Bool_t result = (gSystem->Exec(command.Data()) == 0);
1818 //______________________________________________________________________________________________
1819 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1822 // Get sources producing the condition file Id from file exchange servers
1825 // check if test mode should simulate a FXS error
1826 if (fTestMode & kErrorFXSSources)
1828 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1835 AliError("DCS system has only one source of data!");
1839 // check connection, in case connect
1840 if (!Connect(system))
Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName; // starts empty; set below according to the system
1849 sourceName = "DAQsource";
1850 } else if (system == kHLT)
1852 sourceName = "DDLnumbers";
1855 TString sqlQueryStart = Form("select %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
1856 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1857 GetCurrentRun(), detector, id);
1858 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1860 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1863 TSQLResult* aResult;
1864 aResult = fServer[system]->Query(sqlQuery);
1866 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
1867 GetSystemName(system), id));
1871 if (aResult->GetRowCount() == 0)
1874 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
1880 TList *list = new TList();
1883 while ((aRow = aResult->Next()))
1886 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
1887 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
1888 list->Add(new TObjString(source));
1897 //______________________________________________________________________________________________
1898 Bool_t AliShuttle::Connect(Int_t system)
1900 // Connect to MySQL Server of the system's FXS MySQL databases
1901 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
1904 // check connection: if already connected return
1905 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
1907 TString dbHost, dbUser, dbPass, dbName;
1909 if (system < 3) // FXS db servers
1911 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
1912 dbUser = fConfig->GetFXSdbUser(system);
1913 dbPass = fConfig->GetFXSdbPass(system);
1914 dbName = fConfig->GetFXSdbName(system);
1915 } else { // Run & Shuttle logbook servers
1916 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
1917 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
1918 dbUser = fConfig->GetDAQlbUser();
1919 dbPass = fConfig->GetDAQlbPass();
1920 dbName = fConfig->GetDAQlbDB();
1923 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
1924 if (!fServer[system] || !fServer[system]->IsConnected()) {
1927 AliError(Form("Can't establish connection to FXS database for %s",
1928 AliShuttleInterface::GetSystemName(system)));
1930 AliError("Can't establish connection to Run logbook.");
1932 if(fServer[system]) delete fServer[system];
1937 TSQLResult* aResult=0;
1940 aResult = fServer[kDAQ]->GetTables(dbName.Data());
1943 aResult = fServer[kDCS]->GetTables(dbName.Data());
1946 aResult = fServer[kHLT]->GetTables(dbName.Data());
1949 aResult = fServer[3]->GetTables(dbName.Data());
1957 //______________________________________________________________________________________________
1958 Bool_t AliShuttle::UpdateTable()
1961 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
1964 Bool_t result = kTRUE;
1966 for (UInt_t system=0; system<3; system++)
1968 if(!fFXSCalled[system]) continue;
1970 // check connection, in case connect
1971 if (!Connect(system))
1973 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
1978 TTimeStamp now; // now
1980 // Loop on FXS list entries
1981 TIter iter(&fFXSlist[system]);
1982 TObjString *aFXSentry=0;
1983 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
1985 TString aFXSentrystr = aFXSentry->String();
1986 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
1987 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
1989 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
1990 GetSystemName(system), aFXSentrystr.Data()));
1991 if(aFXSarray) delete aFXSarray;
1995 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
1996 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
1998 TString whereClause;
2001 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2002 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2004 else if (system == kDCS)
2006 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2007 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2009 else if (system == kHLT)
2011 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2012 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
	(Long_t) now.GetSec(), whereClause.Data()); // GetSec() returns a time_t
2020 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2023 TSQLResult* aResult;
2024 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2027 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2028 GetSystemName(system), sqlQuery.Data()));
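// Illustrative sketch, not part of AliShuttle: splitting an "id#!?!#source" entry on the
// whole separator string. As far as the ROOT documentation describes it,
// TString::Tokenize treats its argument as a set of delimiter characters, so an id or
// source that itself contains '#', '!' or '?' would yield more than two tokens.
// SplitFXSEntrySketch is a hypothetical helper that searches for the complete
// separator instead; it splits at the first occurrence.
#include <string>
#include <utility>

namespace {
bool SplitFXSEntrySketch(const std::string& entry,
			 std::pair<std::string, std::string>& idAndSource)
{
	const std::string separator("#!?!#");
	const std::string::size_type pos = entry.find(separator);
	if (pos == std::string::npos) return false;                 // malformed entry

	idAndSource.first = entry.substr(0, pos);                   // file id
	idAndSource.second = entry.substr(pos + separator.size());  // source
	return true;
}
} // namespace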
2039 //______________________________________________________________________________________________
2040 Bool_t AliShuttle::UpdateTableFailCase()
// Updates the FXS table, filling the time_processed field in all rows corresponding to the
// current run and detector. This is called when the preprocessor is declared failed for the
// current run, because UpdateTable() is only called in case of success.
2046 Bool_t result = kTRUE;
2048 for (UInt_t system=0; system<3; system++)
2050 // check connection, in case connect
2051 if (!Connect(system))
2053 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2054 GetSystemName(system)));
2059 TTimeStamp now; // now
// Update all rows of the current run and detector (no per-file loop is needed here)
2063 TString whereClause = Form("where run=%d and detector=\"%s\";",
2064 GetCurrentRun(), fCurrentDetector.Data());
TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
	(Long_t) now.GetSec(), whereClause.Data()); // GetSec() returns a time_t
2070 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2073 TSQLResult* aResult;
2074 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2077 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2078 GetSystemName(system), sqlQuery.Data()));
2088 //______________________________________________________________________________________________
2089 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2092 // Update Shuttle logbook filling detector or shuttle_done column
2093 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2096 // check connection, in case connect
2098 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2102 TString detName(detector);
2104 if(detName == "shuttle_done")
2106 setClause = "set shuttle_done=1";
2108 // Send the information to ML
2109 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2112 mlList.Add(&mlStatus);
2114 fMonaLisa->SendParameters(&mlList);
2116 TString statusStr(status);
2117 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2118 statusStr.Contains("failed", TString::kIgnoreCase)){
2119 setClause = Form("set %s=\"%s\"", detector, status);
2122 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2128 TString whereClause = Form("where run=%d", GetCurrentRun());
2130 TString sqlQuery = Form("update %s %s %s",
2131 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2133 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2136 TSQLResult* aResult;
2137 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2139 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
//______________________________________________________________________________________________
Int_t AliShuttle::GetCurrentRun() const
{
	// Get current run from logbook entry

	return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentStartTime() const
{
	// get current start time

	return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentEndTime() const
{
	// get current end time from logbook entry

	return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
}

2177 //______________________________________________________________________________________________
2178 void AliShuttle::Log(const char* detector, const char* message)
2181 // Fill log string with a message
2184 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2186 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2187 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
2192 gSystem->FreeDirectory(dir);
2195 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2196 if (GetCurrentRun() >= 0)
2197 toLog += Form("run %d - ", GetCurrentRun());
2198 toLog += Form("%s", message);
2200 AliInfo(toLog.Data());
// if the output is already redirected to the detector log file, stop here
2203 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2206 TString fileName = GetLogFileName(detector);
2208 gSystem->ExpandPathName(fileName);
2211 logFile.open(fileName, ofstream::out | ofstream::app);
2213 if (!logFile.is_open()) {
2214 AliError(Form("Could not open file %s", fileName.Data()));
2218 logFile << toLog.Data() << "\n";
2223 //______________________________________________________________________________________________
2224 TString AliShuttle::GetLogFileName(const char* detector) const
2227 // returns the name of the log file for a given sub detector
2232 if (GetCurrentRun() >= 0)
2233 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2235 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
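// Illustrative sketch, not part of AliShuttle: the naming rule applied by
// GetLogFileName(), written against the standard library only. LogFileNameSketch is a
// hypothetical helper; it assumes, as above, that a negative run number means
// "no current run" and that the run suffix is dropped in that case.
#include <sstream>
#include <string>

namespace {
std::string LogFileNameSketch(const std::string& logDir, const std::string& detector, int run)
{
	std::ostringstream name;
	name << logDir << "/" << detector;
	if (run >= 0) name << "_" << run;   // per-run log file
	name << ".log";
	return name.str();
}
} // namespace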
2240 //______________________________________________________________________________________________
2241 Bool_t AliShuttle::Collect(Int_t run)
// Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default).
// If a specific run is given, only this run is processed.
2247 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2251 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2253 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2255 SetLastAction("Starting");
2257 TString whereClause("where shuttle_done=0");
2259 whereClause += Form(" and run=%d", run);
2261 TObjArray shuttleLogbookEntries;
2262 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2264 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2268 if (shuttleLogbookEntries.GetEntries() == 0)
2271 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2273 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2274 "or it does not exist in Shuttle logbook", run));
2278 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2279 fFirstUnprocessed[iDet] = kTRUE;
2283 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2284 // flag them into fFirstUnprocessed array
2285 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2286 TObjArray tmpLogbookEntries;
2287 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2289 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2293 TIter iter(&tmpLogbookEntries);
2294 AliShuttleLogbookEntry* anEntry = 0;
2295 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2297 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2299 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2301 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2302 anEntry->GetRun(), GetDetName(iDet)));
2303 fFirstUnprocessed[iDet] = kFALSE;
2311 if (!RetrieveConditionsData(shuttleLogbookEntries))
2313 Log("SHUTTLE", "Collect - Process of at least one run failed");
2317 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
//______________________________________________________________________________________________
Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
{
	//
	// Retrieve conditions data for all runs that aren't processed yet
	//

	Bool_t hasError = kFALSE;

	TIter iter(&dateEntries);
	AliShuttleLogbookEntry* anEntry;

	while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
		if (!Process(anEntry)){
			hasError = kTRUE;
		}

		// clean SHUTTLE temp directory
		TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
		RemoveFile(filename.Data());
	}

	return hasError == kFALSE;
}
//______________________________________________________________________________________________
ULong_t AliShuttle::GetTimeOfLastAction() const
{
	// Gets time of last action

	ULong_t tmp;

	fMonitoringMutex->Lock();
	tmp = fLastActionTime;
	fMonitoringMutex->UnLock();

	return tmp;
}
//______________________________________________________________________________________________
const TString AliShuttle::GetLastAction() const
{
	// returns a string description of the last action

	TString tmp;

	fMonitoringMutex->Lock();
	tmp = fLastAction;
	fMonitoringMutex->UnLock();

	return tmp;
}
//______________________________________________________________________________________________
void AliShuttle::SetLastAction(const char* action)
{
	// updates the monitoring variables

	fMonitoringMutex->Lock();

	fLastAction = action;
	fLastActionTime = time(0);

	fMonitoringMutex->UnLock();
}
//______________________________________________________________________________________________
const char* AliShuttle::GetRunParameter(const char* param)
{
	// returns run parameter read from DAQ logbook

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunParameter(param);
}
//______________________________________________________________________________________________
AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
{
	// returns object from OCDB valid for current run

	if (fTestMode & kErrorOCDB)
	{
		Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
		return 0;
	}

	AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
	if (!sto)
	{
		Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
		return 0;
	}

	return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
}
//______________________________________________________________________________________________
Bool_t AliShuttle::SendMail()
{
	//
	// sends a mail to the subdetector expert in case of preprocessor error
	//

	if (fTestMode != kNone)
		return kTRUE;

	void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
	if (dir == NULL)
	{
		if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
		{
			AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
			return kFALSE;
		}
	} else {
		gSystem->FreeDirectory(dir);
	}

	TString bodyFileName;
	bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
	gSystem->ExpandPathName(bodyFileName);

	ofstream mailBody;
	mailBody.open(bodyFileName, ofstream::out);

	if (!mailBody.is_open())
	{
		AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
		return kFALSE;
	}

	// build the comma-separated list of recipients
	TString to="";
	TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
	TObjString *anExpert=0;
	while ((anExpert = (TObjString*) iterExperts.Next()))
	{
		to += Form("%s,", anExpert->GetName());
	}
	to.Remove(to.Length()-1); // drop the trailing comma
	AliDebug(2, Form("to: %s",to.Data()));

	// TODO this will be removed...
	if (to.Contains("not_yet_set")) {
		AliInfo("List of detector responsibles not yet set!");
		return kFALSE;
	}

	TString cc="alberto.colla@cern.ch";

	TString subject = Form("%s Shuttle preprocessor error in run %d !",
				fCurrentDetector.Data(), GetCurrentRun());
	AliDebug(2, Form("subject: %s", subject.Data()));

	TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
	body += Form("SHUTTLE just detected that your preprocessor "
			"FAILED after %d retries in run %d!!\n\n", fConfig->GetMaxRetries(), GetCurrentRun());
	body += Form("Please check %s status on the web page asap!\n\n", fCurrentDetector.Data());
	// the %s below needs its argument; it was missing before
	body += Form("The last 10 lines of %s log file follow:\n\n", fCurrentDetector.Data());

	AliDebug(2, Form("Body begin: %s", body.Data()));

	mailBody << body.Data();
	// close and reopen in append mode, so that the output of tail lands after the body
	mailBody.close();
	mailBody.open(bodyFileName, ofstream::out | ofstream::app);

	TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
	TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
	if (gSystem->Exec(tailCommand.Data()))
	{
		mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
	}

	TString endBody = "------------------------------------------------------\n\n";
	endBody += "In case of problems please contact the SHUTTLE core team.\n\n";
	endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
	endBody += "Sincerely yours,\n\n \t\t\tthe SHUTTLE\n";

	AliDebug(2, Form("Body end: %s", endBody.Data()));

	mailBody << endBody.Data();

	// flush and close the body file before the mail command reads it
	mailBody.close();

	TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
						subject.Data(),
						cc.Data(),
						to.Data(),
						bodyFileName.Data());
	AliDebug(2, Form("mail command: %s", mailCommand.Data()));

	// Exec returns the exit status of the command, 0 on success
	Int_t result = gSystem->Exec(mailCommand.Data());

	return result == 0;
}
//______________________________________________________________________________________________
const char* AliShuttle::GetRunType()
{
	// returns run type read from "run type" logbook

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunType();
}
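//______________________________________________________________________________________________
void AliShuttle::SetShuttleTempDir(const char* tmpDir)
{
	// sets Shuttle temp directory

	// expand in place: ExpandPathName(const char*) returns a heap-allocated
	// string that the caller has to delete, so assigning it directly leaks
	fgkShuttleTempDir = tmpDir;
	gSystem->ExpandPathName(fgkShuttleTempDir);
}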
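//______________________________________________________________________________________________
void AliShuttle::SetShuttleLogDir(const char* logDir)
{
	// sets Shuttle log directory

	// expand in place: ExpandPathName(const char*) returns a heap-allocated
	// string that the caller has to delete, so assigning it directly leaks
	fgkShuttleLogDir = logDir;
	gSystem->ExpandPathName(fgkShuttleLogDir);
}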