1 /**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
18 Revision 1.54 2007/07/12 09:51:25 jgrosseo
19 removed duplicated log message in GetFile
21 Revision 1.53 2007/07/12 09:26:28 jgrosseo
22 updating hlt fxs base path
24 Revision 1.52 2007/07/12 08:06:45 jgrosseo
25 adding log messages in getfile... functions
26 adding not implemented copy constructor in alishuttleconfigholder
28 Revision 1.51 2007/07/03 17:24:52 acolla
29 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
31 Revision 1.50 2007/07/02 17:19:32 acolla
32 preprocessor is run in a temp directory that is removed when process is finished.
34 Revision 1.49 2007/06/29 10:45:06 acolla
35 Number of columns in MySql Shuttle logbook increased by one (HLT added)
37 Revision 1.48 2007/06/21 13:06:19 acolla
38 GetFileSources returns dummy list with 1 source if system=DCS (better than
39 returning error as it was)
41 Revision 1.47 2007/06/19 17:28:56 acolla
42 HLT updated; missing map bug removed.
44 Revision 1.46 2007/06/09 13:01:09 jgrosseo
45 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
47 Revision 1.45 2007/05/30 06:35:20 jgrosseo
48 Adding functionality to the Shuttle/TestShuttle:
49 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
50 o) Function to retrieve list of IDs for a given source (GetFileIDs)
51 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
52 Example code has been added to the TestProcessor in TestShuttle
54 Revision 1.44 2007/05/11 16:09:32 acolla
55 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
56 example: ITS/SPD/100_filename.root
58 Revision 1.43 2007/05/10 09:59:51 acolla
59 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
61 Revision 1.42 2007/05/03 08:01:39 jgrosseo
62 typo in last commit :-(
64 Revision 1.41 2007/05/03 08:00:48 jgrosseo
65 fixing log message when pp want to skip dcs value retrieval
67 Revision 1.40 2007/04/27 07:06:48 jgrosseo
68 GetFileSources returns empty list in case of no files, but successful query
69 No mails sent in testmode
71 Revision 1.39 2007/04/17 12:43:57 acolla
72 Correction in StoreOCDB; change of text in mail to detector expert
74 Revision 1.38 2007/04/12 08:26:18 jgrosseo
77 Revision 1.37 2007/04/10 16:53:14 jgrosseo
78 redirecting sub detector stdout, stderr to sub detector log file
80 Revision 1.35 2007/04/04 16:26:38 acolla
81 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
82 2. Added missing dependency in test preprocessors.
83 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
85 Revision 1.34 2007/04/04 10:33:36 jgrosseo
86 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
87 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
89 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
91 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
93 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
95 5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
96 If you always need DCS data (like before), you do not need to implement it.
98 6) The run type has been added to the monitoring page
100 Revision 1.33 2007/04/03 13:56:01 acolla
101 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
104 Revision 1.32 2007/02/28 10:41:56 acolla
105 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
106 AliPreprocessor::GetRunType() function.
107 Added some ldap definition files.
109 Revision 1.30 2007/02/13 11:23:21 acolla
110 Moved getters and setters of Shuttle's main OCDB/Reference, local
111 OCDB/Reference, temp and log folders to AliShuttleInterface
113 Revision 1.27 2007/01/30 17:52:42 jgrosseo
114 adding monalisa monitoring
116 Revision 1.26 2007/01/23 19:20:03 acolla
117 Removed old ldif files, added TOF, MCH ldif files. Added some options in
118 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
121 Revision 1.25 2007/01/15 19:13:52 acolla
122 Moved some AliInfo to AliDebug in SendMail function
124 Revision 1.21 2006/12/07 08:51:26 jgrosseo
126 table, db names in ldap configuration
127 added GRP preprocessor
128 DCS data can also be retrieved by data point
130 Revision 1.20 2006/11/16 16:16:48 jgrosseo
131 introducing strict run ordering flag
132 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
134 Revision 1.19 2006/11/06 14:23:04 jgrosseo
135 major update (Alberto)
136 o) reading of run parameters from the logbook
137 o) online offline naming conversion
138 o) standalone DCSclient package
140 Revision 1.18 2006/10/20 15:22:59 jgrosseo
141 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
142 o) Merging Collect, CollectAll, CollectNew function
143 o) Removing implementation of empty copy constructors (declaration still there!)
145 Revision 1.17 2006/10/05 16:20:55 jgrosseo
146 adapting to new CDB classes
148 Revision 1.16 2006/10/05 15:46:26 jgrosseo
149 applying to the new interface
151 Revision 1.15 2006/10/02 16:38:39 jgrosseo
154 storing of objects that failed to be stored to the grid before
155 interfacing of shuttle status table in daq system
157 Revision 1.14 2006/08/29 09:16:05 jgrosseo
160 Revision 1.13 2006/08/15 10:50:00 jgrosseo
161 effc++ corrections (alberto)
163 Revision 1.12 2006/08/08 14:19:29 jgrosseo
164 Update to shuttle classes (Alberto)
166 - Possibility to set the full object's path in the Preprocessor's and
167 Shuttle's Store functions
168 - Possibility to extend the object's run validity in the same classes
169 ("startValidity" and "validityInfinite" parameters)
170 - Implementation of the StoreReferenceData function to store reference
171 data in a dedicated CDB storage.
173 Revision 1.11 2006/07/21 07:37:20 jgrosseo
174 last run is stored after each run
176 Revision 1.10 2006/07/20 09:54:40 jgrosseo
177 introducing status management: The processing per subdetector is divided into several steps,
178 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
179 can keep track of the number of failures and skips further processing after a certain threshold is
180 exceeded. These thresholds can be configured in LDAP.
182 Revision 1.9 2006/07/19 10:09:55 jgrosseo
183 new configuration, access to DAQ FES (Alberto)
185 Revision 1.8 2006/07/11 12:44:36 jgrosseo
186 adding parameters for extended validity range of data produced by preprocessor
188 Revision 1.7 2006/07/10 14:37:09 jgrosseo
189 small fix + todo comment
191 Revision 1.6 2006/07/10 13:01:41 jgrosseo
192 enhanced storing of last successfully processed run (alberto)
194 Revision 1.5 2006/07/04 14:59:57 jgrosseo
195 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
197 Revision 1.4 2006/06/12 09:11:16 jgrosseo
198 coding conventions (Alberto)
200 Revision 1.3 2006/06/06 14:26:40 jgrosseo
201 o) removed files that were moved to STEER
202 o) shuttle updated to follow the new interface (Alberto)
204 Revision 1.2 2006/03/07 07:52:34 hristov
205 New version (B.Yordanov)
207 Revision 1.6 2005/11/19 17:19:14 byordano
208 RetrieveDATEEntries and RetrieveConditionsData added
210 Revision 1.5 2005/11/19 11:09:27 byordano
211 AliShuttle declaration added
213 Revision 1.4 2005/11/17 17:47:34 byordano
214 TList changed to TObjArray
216 Revision 1.3 2005/11/17 14:43:23 byordano
219 Revision 1.1.1.1 2005/10/28 07:33:58 hristov
220 Initial import as subdirectory in AliRoot
222 Revision 1.2 2005/09/13 08:41:15 byordano
223 default startTime endTime added
225 Revision 1.4 2005/08/30 09:13:02 byordano
228 Revision 1.3 2005/08/29 21:15:47 byordano
234 // This class is the main manager for AliShuttle.
235 // It organizes the data retrieval from DCS and calls the
236 // interface methods of AliPreprocessor.
237 // For every detector in AliShuttleConfig (see AliShuttleConfig),
238 // data for its set of aliases is retrieved. If there is a registered
239 // AliPreprocessor for this detector then it will be used
240 // according to the schema (see AliPreprocessor).
241 // If there isn't a registered AliPreprocessor then the retrieved
242 // data is stored automatically to the underlying AliCDBStorage.
243 // The alias name is used as detSpec.
246 #include "AliShuttle.h"
248 #include "AliCDBManager.h"
249 #include "AliCDBStorage.h"
250 #include "AliCDBId.h"
251 #include "AliCDBRunRange.h"
252 #include "AliCDBPath.h"
253 #include "AliCDBEntry.h"
254 #include "AliShuttleConfig.h"
255 #include "DCSClient/AliDCSClient.h"
257 #include "AliPreprocessor.h"
258 #include "AliShuttleStatus.h"
259 #include "AliShuttleLogbookEntry.h"
264 #include <TTimeStamp.h>
265 #include <TObjString.h>
266 #include <TSQLServer.h>
267 #include <TSQLResult.h>
270 #include <TSystemDirectory.h>
271 #include <TSystemFile.h>
273 #include <TFileMerger.h>
275 #include <TGridResult.h>
277 #include <TMonaLisaWriter.h>
281 #include <sys/types.h>
282 #include <sys/wait.h>
286 //______________________________________________________________________________________________
287 AliShuttle::AliShuttle(const AliShuttleConfig* config,
288 UInt_t timeout, Int_t retries):
290 fTimeout(timeout), fRetries(retries),
300 fReadTestMode(kFALSE),
301 fOutputRedirected(kFALSE)
304 // config: AliShuttleConfig used
305 // timeout: timeout used for AliDCSClient connection
306 // retries: the number of retries in case of connection error.
309 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
310 for(int iSys=0;iSys<4;iSys++) {
313 fFXSlist[iSys].SetOwner(kTRUE);
315 fPreprocessorMap.SetOwner(kTRUE);
317 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
318 fFirstUnprocessed[iDet] = kFALSE;
320 fMonitoringMutex = new TMutex();
323 //______________________________________________________________________________________________
324 AliShuttle::~AliShuttle()
330 fPreprocessorMap.DeleteAll();
331 for(int iSys=0;iSys<4;iSys++)
333 fServer[iSys]->Close();
334 delete fServer[iSys];
343 if (fMonitoringMutex)
345 delete fMonitoringMutex;
346 fMonitoringMutex = 0;
350 //______________________________________________________________________________________________
351 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
354 // Registers new AliPreprocessor.
355 // It uses GetName() as the identifier of the preprocessor.
356 // The preprocessor is registered only if there isn't any other
357 // with the same identifier (GetName()).
360 const char* detName = preprocessor->GetName();
361 if(GetDetPos(detName) < 0)
362 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
364 if (fPreprocessorMap.GetValue(detName)) {
365 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
369 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
371 //______________________________________________________________________________________________
372 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
373 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
375 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
376 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
377 // using this function. Use StoreReferenceData instead!
378 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
379 // finishes the data are transferred to the main storage (Grid).
381 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
384 //______________________________________________________________________________________________
385 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
387 // Stores a CDB object in the storage for reference data. These objects will not be available during
388 // offline reconstruction. Use this function for reference data only!
389 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
390 // finishes the data are transferred to the main storage (Grid).
392 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
395 //______________________________________________________________________________________________
396 Bool_t AliShuttle::StoreLocally(const TString& localUri,
397 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
398 Int_t validityStart, Bool_t validityInfinite)
400 // Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
401 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
402 // The parameters are:
403 // 1) Uri of the backup storage (Local)
404 // 2) the object's path.
405 // 3) the object to be stored
406 // 4) the metaData to be associated with the object
407 // 5) the validity start run number w.r.t. the current run,
408 // if the data is valid only for this run leave the default 0
409 // 6) specifies if the calibration data is valid for infinity (this means until updated),
410 // typical for calibration runs, the default is kFALSE
412 // returns kFALSE on failure, kTRUE otherwise
414 if (fTestMode & kErrorStorage)
416 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
420 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
422 Int_t firstRun = GetCurrentRun() - validityStart;
424 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
429 if(validityInfinite) {
430 lastRun = AliCDBRunRange::Infinity();
432 lastRun = GetCurrentRun();
435 // Version is set to current run, it will be used later to transfer data to Grid
436 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
438 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
439 TObjString runUsed = Form("%d", GetCurrentRun());
440 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
443 Bool_t result = kFALSE;
445 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
446 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
448 result = AliCDBManager::Instance()->GetStorage(localUri)
449 ->Put(object, id, metaData);
454 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
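// Illustration only (not part of AliShuttle): a minimal standalone sketch of the
// validity range computed in StoreLocally. The first valid run is the current run
// minus validityStart, clamped at 0; the last valid run is either the current run
// or "infinity". The names ValidityRange, ComputeValidity and kInfinityRun are
// hypothetical; kInfinityRun merely stands in for AliCDBRunRange::Infinity(),
// whose actual value is not assumed here.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for AliCDBRunRange::Infinity(); the real value may differ.
const int kInfinityRun = std::numeric_limits<int>::max();

struct ValidityRange {
    int firstRun;  // first run for which the object is valid
    int lastRun;   // last run for which the object is valid
};

// validityStart is counted backwards from the current run (0 = this run only).
ValidityRange ComputeValidity(int currentRun, int validityStart, bool validityInfinite)
{
    assert(currentRun >= 0);

    int firstRun = currentRun - validityStart;
    if (firstRun < 0) firstRun = 0;  // clamp, mirroring the warning in StoreLocally

    ValidityRange range;
    range.firstRun = firstRun;
    range.lastRun = validityInfinite ? kInfinityRun : currentRun;
    return range;
}
```

// With these assumptions, ComputeValidity(100, 5, false) covers runs 95 to 100,
// and ComputeValidity(3, 10, true) starts at run 0 and stays valid until updated.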
460 //______________________________________________________________________________________________
461 Bool_t AliShuttle::StoreOCDB()
464 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
465 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
466 // Then calls StoreRefFilesToGrid to store reference files.
469 if (fTestMode & kErrorGrid)
471 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
472 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
476 Log("SHUTTLE","Storing OCDB data ...");
477 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
479 Log("SHUTTLE","Storing reference data ...");
480 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
482 Log("SHUTTLE","Storing reference files ...");
483 Bool_t resultRefFiles = StoreRefFilesToGrid();
485 return resultCDB && resultRef && resultRefFiles;
488 //______________________________________________________________________________________________
489 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
492 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
495 TObjArray* gridIds=0;
497 Bool_t result = kTRUE;
499 const char* type = 0;
501 if(gridURI == fgkMainCDB) {
503 localURI = fgkLocalCDB;
504 } else if(gridURI == fgkMainRefStorage) {
506 localURI = fgkLocalRefStorage;
508 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
512 AliCDBManager* man = AliCDBManager::Instance();
514 AliCDBStorage *gridSto = man->GetStorage(gridURI);
517 Form("StoreOCDB - cannot activate main %s storage", type));
521 gridIds = gridSto->GetQueryCDBList();
523 // get objects previously stored in local CDB
524 AliCDBStorage *localSto = man->GetStorage(localURI);
527 Form("StoreOCDB - cannot activate local %s storage", type));
530 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
531 // Local objects were stored with current run as Grid version!
532 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
533 localEntries->SetOwner(1);
535 // loop on local stored objects
536 TIter localIter(localEntries);
537 AliCDBEntry *aLocEntry = 0;
538 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
539 aLocEntry->SetOwner(1);
540 AliCDBId aLocId = aLocEntry->GetId();
541 aLocEntry->SetVersion(-1);
542 aLocEntry->SetSubVersion(-1);
544 // If local object is valid up to infinity we store it only if it is
545 // the first unprocessed run!
546 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
547 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
549 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
550 "there are previous unprocessed runs!",
551 fCurrentDetector.Data(), aLocId.GetPath().Data()));
555 // loop on Grid valid Id's
556 Bool_t store = kTRUE;
557 TIter gridIter(gridIds);
558 AliCDBId* aGridId = 0;
559 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
560 if(aGridId->GetPath() != aLocId.GetPath()) continue;
561 // skip all objects valid up to infinity
562 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
563 // if we get here, it means there's already some more recent object stored on Grid!
568 // If we get here, the file can be stored!
569 Bool_t storeOk = gridSto->Put(aLocEntry);
570 if(!store || storeOk){
574 Log(fCurrentDetector.Data(),
575 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
576 type, aGridId->ToString().Data()));
579 Form("StoreOCDB - Object <%s> successfully put into %s storage",
580 aLocId.ToString().Data(), type));
581 Log(fCurrentDetector.Data(),
582 Form("StoreOCDB - Object <%s> successfully put into %s storage",
583 aLocId.ToString().Data(), type));
586 // removing local filename...
588 localSto->IdToFilename(aLocId, filename);
589 AliInfo(Form("Removing local file %s", filename.Data()));
590 RemoveFile(filename.Data());
594 Form("StoreOCDB - Grid %s storage of object <%s> failed",
595 type, aLocId.ToString().Data()));
596 Log(fCurrentDetector.Data(),
597 Form("StoreOCDB - Grid %s storage of object <%s> failed",
598 type, aLocId.ToString().Data()));
602 localEntries->Clear();
607 //______________________________________________________________________________________________
608 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
610 // clears the directory used to store reference files of a given subdetector
612 AliCDBManager* man = AliCDBManager::Instance();
613 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
614 TString localBaseFolder = sto->GetBaseFolder();
616 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
618 Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
621 begin.Form("%d_", GetCurrentRun());
623 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
627 TList* dirList = baseDir->GetListOfFiles();
630 if (!dirList) return kTRUE;
632 if (dirList->GetEntries() < 3)
638 Int_t nDirs = 0, nDel = 0;
639 TIter dirIter(dirList);
640 TSystemFile* entry = 0;
642 Bool_t success = kTRUE;
644 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
646 if (entry->IsDirectory())
649 TString fileName(entry->GetName());
650 if (!fileName.BeginsWith(begin))
656 Int_t result = gSystem->Unlink(fileName.Data());
660 Log("SHUTTLE", Form("Could not delete file %s!", fileName.Data()));
668 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
669 nDel, nDirs, targetDir.Data()));
680 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
684 result = gSystem->Exec(Form("rm -r %s", targetDir.Data()));
687 Log("SHUTTLE", Form("CleanReferenceStorage - Could not clear directory %s", targetDir.Data()));
692 result = gSystem->mkdir(targetDir, kTRUE);
695 Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
702 //______________________________________________________________________________________________
703 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
706 // Stores reference file directly (without opening it). This function stores the file locally.
708 // The file is stored under the following location:
709 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
710 // where <gridFileName> is the second parameter given to the function
713 if (fTestMode & kErrorStorage)
715 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
719 AliCDBManager* man = AliCDBManager::Instance();
720 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
722 TString localBaseFolder = sto->GetBaseFolder();
724 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
726 // try to open the folder; create it if it does not exist
727 void* dir = gSystem->OpenDirectory(targetDir.Data());
729 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
730 Log("SHUTTLE", Form("Can't open directory <%s>", targetDir.Data()));
735 gSystem->FreeDirectory(dir);
739 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
741 Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
744 Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
748 result = gSystem->CopyFile(localFile, target);
752 Log("SHUTTLE", Form("StoreReferenceFile - File %s stored locally to %s", localFile, target.Data()));
757 Log("SHUTTLE", Form("StoreReferenceFile - Could not store file %s to %s! Error code = %d",
758 localFile, target.Data(), result));
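// Illustration only (not part of AliShuttle): a small sketch of the reference file
// naming scheme used here, <targetDir>/<RUN#>_<gridFileName>, and of the "<RUN#>_"
// prefix test that CleanReferenceStorage and StoreRefFilesToGrid apply when picking
// the files of the current run. RefFileTarget and BelongsToRun are hypothetical
// helper names, written with std::string instead of TString so that the sketch
// compiles without ROOT (C++11 is assumed for std::to_string).

```cpp
#include <string>

// Builds "<targetDir>/<run>_<gridFileName>".
std::string RefFileTarget(const std::string& targetDir, int run,
                          const std::string& gridFileName)
{
    return targetDir + "/" + std::to_string(run) + "_" + gridFileName;
}

// True if fileName starts with "<run>_", i.e. it was stored for the given run.
bool BelongsToRun(const std::string& fileName, int run)
{
    const std::string prefix = std::to_string(run) + "_";
    return fileName.compare(0, prefix.size(), prefix) == 0;
}
```

// Including the underscore in the prefix keeps run 100 from matching files of
// run 1000, which a bare comparison against "100" would not.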
763 //______________________________________________________________________________________________
764 Bool_t AliShuttle::StoreRefFilesToGrid()
767 // Transfers the reference file to the Grid.
769 // The files are stored under the following location:
770 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
773 AliCDBManager* man = AliCDBManager::Instance();
774 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
777 TString localBaseFolder = sto->GetBaseFolder();
779 TString dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
781 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
785 TString gridBaseFolder = gridSto->GetBaseFolder();
787 TString alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
790 begin.Form("%d_", GetCurrentRun());
792 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
796 TList* dirList = baseDir->GetListOfFiles();
799 if (!dirList) return kTRUE;
801 if (dirList->GetEntries() < 3)
809 Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
814 Int_t nDirs = 0, nTransfer = 0;
815 TIter dirIter(dirList);
816 TSystemFile* entry = 0;
818 Bool_t success = kTRUE;
819 Bool_t first = kTRUE;
821 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
823 if (entry->IsDirectory())
826 TString fileName(entry->GetName());
827 if (!fileName.BeginsWith(begin))
835 // check that DET folder exists, otherwise create it
836 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
844 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
846 if (!gGrid->Mkdir(alienDir.Data(),"",0))
848 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
853 Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
857 Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
861 TString fullLocalPath;
862 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
864 TString fullGridPath;
865 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
867 TFileMerger fileMerger;
868 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
872 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded!", fullLocalPath.Data(), fullGridPath.Data()));
873 RemoveFile(fullLocalPath);
878 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s FAILED!", fullLocalPath.Data(), fullGridPath.Data()));
883 Log("SHUTTLE", Form("StoreRefFilesToGrid - %d (over %d) reference files in folder %s copied to Grid.", nTransfer, nDirs, dir.Data()));
890 //______________________________________________________________________________________________
891 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
894 // Get folder name of reference files
897 TString offDetStr(GetOfflineDetName(detector));
899 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
901 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
903 dir.Form("%s/%s", base, offDetStr.Data());
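// Illustration only (not part of AliShuttle): a standalone sketch of the folder
// rule above. ITS, MUON and PHOS get an extra sub-folder named after the online
// detector (e.g. ITS/SPD); all other detectors use the offline name alone.
// RefFilePrefix is a hypothetical name, and the offline detector name is assumed
// to be resolved by the caller (the real code obtains it from GetOfflineDetName).

```cpp
#include <string>

// base:       base folder of the reference storage
// offlineDet: offline detector name (assumed already resolved by the caller)
// onlineDet:  online (sub)detector name as known to the Shuttle
std::string RefFilePrefix(const std::string& base,
                          const std::string& offlineDet,
                          const std::string& onlineDet)
{
    if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS")
        return base + "/" + offlineDet + "/" + onlineDet;

    return base + "/" + offlineDet;
}
```

// The sketch returns the string by value, so the result does not depend on the
// lifetime of any buffer local to the function.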
910 //______________________________________________________________________________________________
911 void AliShuttle::CleanLocalStorage(const TString& uri)
914 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
917 const char* type = 0;
918 if(uri == fgkLocalCDB) {
920 } else if(uri == fgkLocalRefStorage) {
923 AliError(Form("Invalid storage URI: %s", uri.Data()));
927 AliCDBManager* man = AliCDBManager::Instance();
929 // open local storage
930 AliCDBStorage *localSto = man->GetStorage(uri);
933 Form("CleanLocalStorage - cannot activate local %s storage", type));
937 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
938 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
940 AliInfo(Form("filename = %s", filename.Data()));
942 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
943 GetCurrentRun(), fCurrentDetector.Data()));
945 RemoveFile(filename.Data());
949 //______________________________________________________________________________________________
950 void AliShuttle::RemoveFile(const char* filename)
953 // removes local file
956 TString command(Form("rm -f %s", filename));
958 Int_t result = gSystem->Exec(command.Data());
961 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
962 fCurrentDetector.Data(), filename));
966 //______________________________________________________________________________________________
967 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
970 // Reads the AliShuttleStatus from the CDB
978 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
979 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
981 if (!fStatusEntry) return 0;
982 fStatusEntry->SetOwner(1);
984 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
986 AliError("Invalid object stored to CDB!");
993 //______________________________________________________________________________________________
994 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
997 // writes the status for one subdetector
1001 delete fStatusEntry;
1005 Int_t run = GetCurrentRun();
1007 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1009 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1010 fStatusEntry->SetOwner(1);
1012 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1015 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1016 fCurrentDetector.Data(), run));
1025 //______________________________________________________________________________________________
1026 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1029 // changes the AliShuttleStatus for the given detector and run to the given status
1033 AliError("UNEXPECTED: fStatusEntry empty");
1037 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1040 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
1044 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1045 fCurrentDetector.Data(),
1046 status->GetStatusName(),
1047 status->GetStatusName(newStatus));
1048 Log("SHUTTLE", actionStr);
1049 SetLastAction(actionStr);
1051 status->SetStatus(newStatus);
1052 if (increaseCount) status->IncreaseCount();
1054 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1059 //______________________________________________________________________________________________
1060 void AliShuttle::SendMLInfo()
1063 // sends ML information about the current status of the current detector being processed
1066 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1069 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1073 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1074 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1077 mlList.Add(&mlStatus);
1078 mlList.Add(&mlRetryCount);
1080 fMonaLisa->SendParameters(&mlList);
1083 //______________________________________________________________________________________________
1084 Bool_t AliShuttle::ContinueProcessing()
1086 // this function reads the AliShuttleStatus information from CDB and
1087 // checks if the processing should be continued
1088 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1090 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1092 AliPreprocessor* aPreprocessor =
1093 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1096 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
1100 AliShuttleLogbookEntry::Status entryStatus =
1101 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1103 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1104 AliInfo(Form("ContinueProcessing - %s is %s",
1105 fCurrentDetector.Data(),
1106 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1110 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1112 // check if current run is first unprocessed run for current detector
1113 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1114 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1116 if (fTestMode == kNone)
Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
1127 AliShuttleStatus* status = ReadShuttleStatus();
1130 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1131 fCurrentDetector.Data()));
1132 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1133 return WriteShuttleStatus(status);
1136 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1137 // If it happens it may mean Logbook updating failed... let's do it now!
1138 if (status->GetStatus() == AliShuttleStatus::kDone ||
1139 status->GetStatus() == AliShuttleStatus::kFailed){
1140 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1141 fCurrentDetector.Data(),
1142 status->GetStatusName(status->GetStatus())));
1143 UpdateShuttleLogbook(fCurrentDetector.Data(),
1144 status->GetStatusName(status->GetStatus()));
1148 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1150 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1151 fCurrentDetector.Data()));
1152 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1154 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1155 fCurrentDetector.Data()));
1156 UpdateShuttleStatus(AliShuttleStatus::kDone);
1157 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1160 Form("ContinueProcessing - %s: Grid storage failed again",
1161 fCurrentDetector.Data()));
1162 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1167 // if we get here, there is a restart
1168 Bool_t cont = kFALSE;
1171 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1172 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1173 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1174 status->GetCount(), status->GetStatusName()));
1175 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1176 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1178 // there may still be objects in local OCDB and reference storage
// and FXS databases may not be updated: do it now!
1181 // TODO Currently disabled, we want to keep files in case of failure!
1182 // CleanLocalStorage(fgkLocalCDB);
1183 // CleanLocalStorage(fgkLocalRefStorage);
1184 // UpdateTableFailCase();
1186 // Send mail to detector expert!
1187 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1189 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1190 fCurrentDetector.Data()));
1193 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1194 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1195 status->GetStatusName(), status->GetCount()));
1196 Bool_t increaseCount = kTRUE;
1197 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1198 increaseCount = kFALSE;
1199 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
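/*
Illustrative sketch, not part of AliShuttle: the restart decision at the end of ContinueProcessing(), reduced to a pure function so that it can be read and tested in isolation. The names RestartStatus, RestartDecision and DecideRestart are hypothetical. That the restart branch lets processing continue is an assumption inferred from the log messages above, not something the listing confirms.

```cpp
#include <cassert>

// Hypothetical stand-in for the subset of AliShuttleStatus states used here.
enum class RestartStatus { kStarted, kDCSStarted, kDCSError, kPPStarted, kPPError };

// Outcome of the restart check.
struct RestartDecision {
    bool continueProcessing; // true: restart the detector
    bool markFailed;         // true: retries exhausted, flag the detector as FAILED
    bool increaseCount;      // true: this restart consumes one retry
};

RestartDecision DecideRestart(RestartStatus status, int count, int maxRetries)
{
    assert(maxRetries >= 0);

    // Retries exhausted: give up and mark the detector as failed.
    if (count >= maxRetries) return {false, true, false};

    // Aborts during DCS retrieval do not count against the retry budget.
    const bool dcsAbort = (status == RestartStatus::kDCSError ||
                           status == RestartStatus::kDCSStarted);
    return {true, false, !dcsAbort};
}
```

With maxRetries = 2, a detector that stopped in a preprocessor error after two attempts is marked failed, while one that stopped in a DCS error after one attempt restarts without consuming a retry.
*/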
1206 //______________________________________________________________________________________________
1207 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1210 // Makes data retrieval for all detectors in the configuration.
// entry: Shuttle logbook entry, contains run parameters and status of detectors
// (Unprocessed, Inactive, Failed or Done).
// Returns kFALSE if an error occurred and kTRUE otherwise
1216 if (!entry) return kFALSE;
1218 fLogbookEntry = entry;
1220 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1223 // create ML instance that monitors this run
1224 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1225 // disable monitoring of other parameters that come e.g. from TFile
1226 gMonitoringWriter = 0;
1228 // Send the information to ML
1229 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1230 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1233 mlList.Add(&mlStatus);
1234 mlList.Add(&mlRunType);
1236 fMonaLisa->SendParameters(&mlList);
1238 if (fLogbookEntry->IsDone())
1240 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1241 UpdateShuttleLogbook("shuttle_done");
1246 // read test mode if flag is set
1250 TString logEntry(entry->GetRunParameter("log"));
1251 //printf("log entry = %s\n", logEntry.Data());
1252 TString searchStr("Testmode: ");
1253 Int_t pos = logEntry.Index(searchStr.Data());
1254 //printf("%d\n", pos);
1257 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1258 //printf("%s\n", subStr.String().Data());
1259 TString newStr(subStr.Data());
1260 TObjArray* token = newStr.Tokenize(' ');
1264 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1267 Int_t testMode = tmpStr->String().Atoi();
1270 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1271 SetTestMode((TestMode) testMode);
1279 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
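/*
Illustrative sketch, not part of AliShuttle: the "Testmode: N" extraction above, written with std::string instead of TString/TObjArray. ParseTestMode is a hypothetical name, and returning -1 when no flag is found is a convention chosen for this sketch only.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the integer following "Testmode: " in logEntry, or -1 if the
// marker is absent or no token follows it.
int ParseTestMode(const std::string& logEntry)
{
    const std::string marker = "Testmode: ";
    const std::string::size_type pos = logEntry.find(marker);
    if (pos == std::string::npos) return -1;

    // Skip any extra blanks, then take everything up to the next blank.
    const std::string::size_type start =
        logEntry.find_first_not_of(' ', pos + marker.size());
    if (start == std::string::npos) return -1;
    const std::string::size_type end = logEntry.find(' ', start);
    const std::string token =
        logEntry.substr(start, end == std::string::npos ? end : end - start);

    // std::atoi yields 0 for text that does not start with a number.
    return std::atoi(token.c_str());
}
```

Only the first blank-separated token after the marker is used, so trailing text in the log entry is ignored.
*/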
1281 fLogbookEntry->Print("all");
1284 Bool_t hasError = kFALSE;
1286 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1287 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1288 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1289 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1291 // Loop on detectors in the configuration
1292 TIter iter(fConfig->GetDetectors());
1293 TObjString* aDetector = 0;
1295 while ((aDetector = (TObjString*) iter.Next()))
1297 fCurrentDetector = aDetector->String();
1299 if (ContinueProcessing() == kFALSE) continue;
1301 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1302 GetCurrentRun(), aDetector->GetName()));
1304 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1306 Log(fCurrentDetector.Data(), "Starting processing");
1312 Log("SHUTTLE", "ERROR: Forking failed");
1317 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1318 GetCurrentRun(), aDetector->GetName()));
1320 Long_t begin = time(0);
1322 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1323 while (waitpid(pid, &status, WNOHANG) == 0)
1325 Long_t expiredTime = time(0) - begin;
1327 if (expiredTime > fConfig->GetPPTimeOut())
tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
fCurrentDetector.Data(), expiredTime);
1332 Log("SHUTTLE", tmp);
1333 Log(fCurrentDetector, tmp);
1337 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1340 gSystem->Sleep(1000);
1344 gSystem->Sleep(1000);
1347 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1348 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1351 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1356 if (!fgets(buffer, 100, pipe))
1358 Log("SHUTTLE", "Error: ps did not return anything");
1359 gSystem->ClosePipe(pipe);
1362 gSystem->ClosePipe(pipe);
1364 //Log("SHUTTLE", Form("ps returned %s", buffer));
1367 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1369 Log("SHUTTLE", "Error: Could not parse output of ps");
1373 if (expiredTime % 60 == 0)
Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
fCurrentDetector.Data(), expiredTime, mem));
1377 if (mem > fConfig->GetPPMaxMem())
1380 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1381 mem, fConfig->GetPPMaxMem());
1382 Log("SHUTTLE", tmp);
1383 Log(fCurrentDetector, tmp);
1387 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1390 gSystem->Sleep(1000);
1395 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1396 GetCurrentRun(), aDetector->GetName()));
1398 if (WIFEXITED(status))
1400 Int_t returnCode = WEXITSTATUS(status);
1402 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
if (returnCode == 0) hasError = kTRUE; // the client exits with its success flag, so 0 signals failure
1411 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1413 AliInfo("Redirecting output...");
1415 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1417 Log("SHUTTLE", "Could not freopen stdout");
1421 fOutputRedirected = kTRUE;
1422 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1423 Log("SHUTTLE", "Could not redirect stderr");
1427 TString wd = gSystem->WorkingDirectory();
1428 TString tmpDir = Form("%s/%s_process",GetShuttleTempDir(),fCurrentDetector.Data());
1430 gSystem->mkdir(tmpDir.Data());
1431 gSystem->ChangeDirectory(tmpDir.Data());
1433 Bool_t success = ProcessCurrentDetector();
1435 gSystem->ChangeDirectory(wd.Data());
1437 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1439 if (success) // Preprocessor finished successfully!
1441 // Update time_processed field in FXS DB
1442 if (UpdateTable() == kFALSE)
1443 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1444 fCurrentDetector.Data()));
1446 // Transfer the data from local storage to main storage (Grid)
1447 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1448 if (StoreOCDB() == kFALSE)
1450 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1451 GetCurrentRun(), aDetector->GetName()));
1452 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1455 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1456 GetCurrentRun(), aDetector->GetName()));
1457 UpdateShuttleStatus(AliShuttleStatus::kDone);
1458 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1462 for (UInt_t iSys=0; iSys<3; iSys++)
1464 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1467 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
1468 GetCurrentRun(), aDetector->GetName(), success));
1470 // the client exits here
1471 gSystem->Exit(success);
1473 AliError("We should never get here!!!");
1477 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1480 //check if shuttle is done for this run, if so update logbook
1481 TObjArray checkEntryArray;
1482 checkEntryArray.SetOwner(1);
1483 TString whereClause = Form("where run=%d", GetCurrentRun());
1484 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1485 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1487 return hasError == kFALSE;
1490 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1491 (checkEntryArray.At(0));
1495 if (checkEntry->IsDone())
1497 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1498 UpdateShuttleLogbook("shuttle_done");
1502 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1504 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1506 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1507 checkEntry->GetRun(), GetDetName(iDet)));
1508 fFirstUnprocessed[iDet] = kFALSE;
1514 // remove ML instance
1520 return hasError == kFALSE;
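/*
Illustrative sketch, not part of AliShuttle: the fork / waitpid(WNOHANG) supervision pattern used in Process(), reduced to plain POSIX calls. RunSupervised and ChildOutcome are hypothetical names. Killing the child with SIGKILL is an assumption (the listing only shows the "Killing..." log message), and the memory check done through ps is left out. As in Process(), the child exits with its success flag, so exit code 1 means success and 0 means failure.

```cpp
#include <cassert>
#include <ctime>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildOutcome { kSucceeded, kFailed, kTimedOut, kForkError };

// Runs childWork in a forked child and polls it without blocking.
ChildOutcome RunSupervised(bool (*childWork)(), long timeoutSeconds)
{
    assert(childWork != nullptr);

    const pid_t pid = fork();
    if (pid < 0) return ChildOutcome::kForkError;

    if (pid == 0) {
        // Child: the success flag becomes the exit code.
        _exit(childWork() ? 1 : 0);
    }

    // Parent: poll until the child terminates or the timeout expires.
    const time_t begin = time(nullptr);
    int status = 0; // plain int on purpose, as required by waitpid
    while (waitpid(pid, &status, WNOHANG) == 0) {
        if (time(nullptr) - begin > timeoutSeconds) {
            kill(pid, SIGKILL);       // assumption: hard kill on timeout
            waitpid(pid, &status, 0); // reap the killed child
            return ChildOutcome::kTimedOut;
        }
        usleep(100 * 1000); // poll every 100 ms
    }

    if (WIFEXITED(status) && WEXITSTATUS(status) == 1) return ChildOutcome::kSucceeded;
    return ChildOutcome::kFailed;
}
```

Polling with WNOHANG keeps the parent free to enforce limits between checks, which a blocking waitpid would not allow.
*/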
1523 //______________________________________________________________________________________________
1524 Bool_t AliShuttle::ProcessCurrentDetector()
1527 // Makes data retrieval just for a specific detector (fCurrentDetector).
// There should be a configuration for this detector.
1530 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
1532 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1537 // call preprocessor
1538 AliPreprocessor* aPreprocessor =
1539 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1541 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1543 Bool_t processDCS = aPreprocessor->ProcessDCS();
1547 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1549 else if (fTestMode & kSkipDCS)
1551 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
1553 else if (fTestMode & kErrorDCS)
1555 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1556 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1557 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1561 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1563 TString host(fConfig->GetDCSHost(fCurrentDetector));
1564 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1566 if (fConfig->GetDCSAliases(fCurrentDetector)->GetEntries() > 0)
1568 dcsMap = GetValueSet(host, port, fConfig->GetDCSAliases(fCurrentDetector), kAlias);
1571 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS aliases");
1572 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1577 if (fConfig->GetDCSDataPoints(fCurrentDetector)->GetEntries() > 0)
1579 TMap* dcsMap2 = GetValueSet(host, port, fConfig->GetDCSDataPoints(fCurrentDetector), kDP);
1582 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS data points");
1583 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1595 TIter iter(dcsMap2);
1596 TObjString* key = 0;
1597 while ((key = (TObjString*) iter.Next()))
1598 dcsMap->Add(key, dcsMap2->GetValue(key->String()));
1600 dcsMap2->SetOwner(kFALSE);
1611 // DCS Archive DB processing successful. Call Preprocessor!
1612 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1614 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1616 if (returnValue > 0) // Preprocessor error!
1618 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
1619 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1620 dcsMap->DeleteAll();
1626 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1627 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1628 fCurrentDetector.Data()));
1630 dcsMap->DeleteAll();
1636 //______________________________________________________________________________________________
1637 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
// Queries DAQ's Shuttle logbook and fills the detector status objects.
// Calls QueryRunParameters to query the DAQ logbook for run parameters.
1644 entries.SetOwner(1);
// check connection; connect if necessary
1647 if(!Connect(3)) return kFALSE;
1650 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1652 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1654 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1658 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1660 if(aResult->GetRowCount() == 0) {
1661 AliInfo("No entries in Shuttle Logbook match request");
1666 // TODO Check field count!
1667 const UInt_t nCols = 23;
1668 if (aResult->GetFieldCount() != (Int_t) nCols) {
1669 AliError("Invalid SQL result field number!");
1675 while ((aRow = aResult->Next())) {
1676 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1677 Int_t run = runString.Atoi();
1679 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1683 // loop on detectors
1684 for(UInt_t ii = 0; ii < nCols; ii++)
1685 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1687 entries.AddLast(entry);
1695 //______________________________________________________________________________________________
1696 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
// Retrieves the run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
// check connection; connect if necessary
1707 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1709 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1711 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1715 if (aResult->GetRowCount() == 0) {
1716 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1721 if (aResult->GetRowCount() > 1) {
1722 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1727 TSQLRow* aRow = aResult->Next();
1730 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1735 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
1737 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1738 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
1740 UInt_t startTime = entry->GetStartTime();
1741 UInt_t endTime = entry->GetEndTime();
1743 if (!startTime || !endTime || startTime > endTime) {
Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
run, startTime, endTime));
1759 //______________________________________________________________________________________________
1760 Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1761 TObjArray* valueSet, DCSType type)
1763 // Retrieve all "entry" data points from the DCS server
1764 // host, port: TSocket connection parameters
1765 // entry: name of the alias or data point
1766 // valueSet: array of retrieved AliDCSValue's
1767 // type: kAlias or kDP
1769 AliDCSClient client(host, port, fTimeout, fRetries);
1770 if (!client.IsConnected())
1779 result = client.GetAliasValues(entry,
1780 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1784 result = client.GetDPValues(entry,
1785 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1790 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
1791 entry, AliDCSClient::GetErrorString(result)));
1793 if (result == AliDCSClient::fgkServerError)
1795 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
1796 client.GetServerError().Data()));
1805 //______________________________________________________________________________________________
1806 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
1809 // Retrieve all "entry" data points from the DCS server
1810 // host, port: TSocket connection parameters
// entries: list of alias or data point names
1812 // type: kAlias or kDP
1813 // returns TMap of values, 0 when failure
1815 const Int_t kSplit = 100; // maximum number of DPs at a time
1817 Int_t totalEntries = entries->GetEntries();
1821 for (Int_t index=0; index < totalEntries; index += kSplit)
1823 Int_t endIndex = index + kSplit;
1825 AliDCSClient client(host, port, fTimeout, fRetries);
1826 if (!client.IsConnected())
1829 TMap* partialResult = 0;
1833 partialResult = client.GetAliasValues(entries, GetCurrentStartTime(),
1834 GetCurrentEndTime(), index, endIndex);
1836 else if (type == kDP)
1838 partialResult = client.GetDPValues(entries, GetCurrentStartTime(),
1839 GetCurrentEndTime(), index, endIndex);
1842 if (partialResult == 0)
1844 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries (%d...%d)! Reason: %s",
1845 index, endIndex, client.GetServerError().Data()));
1853 AliInfo(Form("Retrieved entries %d..%d (total %d); E.g. %s has %d values collected",
1854 index, endIndex, totalEntries, entries->At(index)->GetName(), ((TObjArray*)
1855 partialResult->GetValue(entries->At(index)->GetName()))->GetEntriesFast()));
1859 result = partialResult;
1863 TIter iter(partialResult);
1864 TObjString* key = 0;
1865 while ((key = (TObjString*) iter.Next()))
1866 result->Add(key, partialResult->GetValue(key->String()));
1868 partialResult->SetOwner(kFALSE);
1869 delete partialResult;
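/*
Illustrative sketch, not part of AliShuttle: the batching scheme of GetValueSet(), which requests at most kSplit entries per call and merges each partial map into the final result. FetchInChunks, ValueMap and FetchFn are hypothetical names. The fetch callback stands in for AliDCSClient::GetAliasValues / GetDPValues, and clamping the end index to the number of entries is an assumption about lines not visible in the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ValueMap = std::map<std::string, std::vector<double>>;

// fetch(names, begin, end, out) fills out for names[begin, end) and
// returns false on failure.
using FetchFn = std::function<bool(const std::vector<std::string>&,
                                   std::size_t, std::size_t, ValueMap&)>;

bool FetchInChunks(const std::vector<std::string>& names, std::size_t chunkSize,
                   const FetchFn& fetch, ValueMap& result)
{
    result.clear();
    if (chunkSize == 0) return false;

    for (std::size_t index = 0; index < names.size(); index += chunkSize) {
        // Clamp the last chunk to the number of available entries.
        const std::size_t endIndex = std::min(index + chunkSize, names.size());

        ValueMap partial;
        if (!fetch(names, index, endIndex, partial)) {
            result.clear(); // a failed chunk invalidates the whole request
            return false;
        }

        // Move the values over; the partial map gives up ownership.
        for (auto& kv : partial) result[kv.first] = std::move(kv.second);
    }
    return true;
}
```

For 250 names and a chunk size of 100, the callback is invoked three times, covering [0,100), [100,200) and [200,250).
*/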
1876 //______________________________________________________________________________________________
1877 const char* AliShuttle::GetFile(Int_t system, const char* detector,
1878 const char* id, const char* source)
1880 // Get calibration file from file exchange servers
// First queries the FXS database for the file name, using the run, detector, id and source info
1882 // then calls RetrieveFile(filename) for actual copy to local disk
1883 // run: current run being processed (given by Logbook entry fLogbookEntry)
1884 // detector: the Preprocessor name
1885 // id: provided as a parameter by the Preprocessor
1886 // source: provided by the Preprocessor through GetFileSources function
1888 // check if test mode should simulate a FXS error
1889 if (fTestMode & kErrorFXSFiles)
1891 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
// check connection; connect if necessary
1896 if (!Connect(system))
1898 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
1902 // Query preparation
1903 TString sourceName(source);
1905 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1906 fConfig->GetFXSdbTable(system));
1907 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1908 GetCurrentRun(), detector, id);
1912 whereClause += Form(" and DAQsource=\"%s\"", source);
1914 else if (system == kDCS)
1918 else if (system == kHLT)
1920 whereClause += Form(" and DDLnumbers=\"%s\"", source);
1924 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1926 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1929 TSQLResult* aResult = 0;
1930 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
1932 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1933 GetSystemName(system), id, sourceName.Data()));
1937 if(aResult->GetRowCount() == 0)
1940 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1941 GetSystemName(system), id, sourceName.Data()));
1946 if (aResult->GetRowCount() > 1) {
1948 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1949 GetSystemName(system), id, sourceName.Data()));
1954 if (aResult->GetFieldCount() != nFields) {
1956 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1957 GetSystemName(system), id, sourceName.Data()));
1962 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1965 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1966 GetSystemName(system), id, sourceName.Data()));
1971 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1972 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
1973 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
1978 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1979 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
1981 // retrieved file is renamed to make it unique
1982 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1983 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1986 // file retrieval from FXS
1987 UInt_t nRetries = 0;
1988 UInt_t maxRetries = 3;
1989 Bool_t result = kFALSE;
1991 // copy!! if successful TSystem::Exec returns 0
1992 while(nRetries++ < maxRetries) {
1993 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1994 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1997 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
1998 filePath.Data(), GetSystemName(system)));
2002 if (fileChecksum.Length()>0)
2004 // compare md5sum of local file with the one stored in the FXS DB
Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
2006 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
2010 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
2016 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2017 filePath.Data(), GetSystemName(system)));
2022 if(!result) return 0;
2024 fFXSCalled[system]=kTRUE;
2025 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2026 fFXSlist[system].Add(fileParams);
2028 static TString fullLocalFileName;
2029 fullLocalFileName.Form("%s/%s", GetShuttleTempDir(), localFileName.Data());
2031 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and source %s from %s to %s", id, source, GetSystemName(system), fullLocalFileName.Data()));
2033 return fullLocalFileName.Data();
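/*
Illustrative sketch, not part of AliShuttle: the copy-and-verify retry loop of GetFile(), with the scp call and the md5sum comparison replaced by callbacks. RetrieveWithRetries and its parameters are hypothetical names. The sketch assumes that a checksum mismatch is retried in the same way as a failed copy.

```cpp
#include <cassert>
#include <functional>

// Tries up to maxRetries times to copy a file and confirm its checksum.
// Returns true as soon as one attempt passes both steps.
bool RetrieveWithRetries(unsigned maxRetries,
                         const std::function<bool()>& copyFile,
                         const std::function<bool()>& checksumMatches,
                         unsigned* attemptsUsed = nullptr)
{
    assert(copyFile && checksumMatches);

    unsigned attempts = 0;
    bool ok = false;
    while (attempts < maxRetries) {
        ++attempts;
        if (!copyFile()) continue;        // copy failed: try again
        if (!checksumMatches()) continue; // corrupted transfer: try again
        ok = true;
        break;
    }

    if (attemptsUsed) *attemptsUsed = attempts;
    return ok;
}
```

Keeping the checksum test inside the loop means a corrupted transfer triggers a fresh copy instead of being reported as a final failure.
*/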
2036 //______________________________________________________________________________________________
2037 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2040 // Copies file from FXS to local Shuttle machine
2043 // check temp directory: trying to cd to temp; if it does not exist, create it
AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
2047 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
2049 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
2050 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
2055 gSystem->FreeDirectory(dir);
2058 TString baseFXSFolder;
2061 baseFXSFolder = "FES/";
2063 else if (system == kDCS)
2067 else if (system == kHLT)
2069 baseFXSFolder = "/opt/FXS/";
2073 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
2074 fConfig->GetFXSPort(system),
2075 fConfig->GetFXSUser(system),
2076 fConfig->GetFXSHost(system),
2077 baseFXSFolder.Data(),
2079 GetShuttleTempDir(),
2082 AliDebug(2, Form("%s",command.Data()));
2084 Bool_t result = (gSystem->Exec(command.Data()) == 0);
2089 //______________________________________________________________________________________________
2090 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2093 // Get sources producing the condition file Id from file exchange servers
2094 // if id is NULL all sources are returned (distinct)
2097 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2099 // check if test mode should simulate a FXS error
2100 if (fTestMode & kErrorFXSSources)
2102 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2108 AliWarning("DCS system has only one source of data!");
2109 TList *list = new TList();
2111 list->Add(new TObjString(" "));
// check connection; connect if necessary
2116 if (!Connect(system))
2118 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName;
2125 sourceName = "DAQsource";
2126 } else if (system == kHLT)
2128 sourceName = "DDLnumbers";
2131 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2132 TString whereClause = Form("run=%d and detector=\"%s\"",
2133 GetCurrentRun(), detector);
2135 whereClause += Form(" and fileId=\"%s\"", id);
2136 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2138 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2141 TSQLResult* aResult;
2142 aResult = fServer[system]->Query(sqlQuery);
2144 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2145 GetSystemName(system), id));
2149 TList *list = new TList();
2152 if (aResult->GetRowCount() == 0)
2155 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2160 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2163 while ((aRow = aResult->Next()))
2166 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2167 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2168 list->Add(new TObjString(source));
2177 //______________________________________________________________________________________________
2178 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2181 // Get all ids of condition files produced by a given source from file exchange servers
Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2186 // check if test mode should simulate a FXS error
2187 if (fTestMode & kErrorFXSSources)
2189 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
// check connection; connect if necessary
2194 if (!Connect(system))
2196 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName;
2203 sourceName = "DAQsource";
2204 } else if (system == kHLT)
2206 sourceName = "DDLnumbers";
2209 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2210 TString whereClause = Form("run=%d and detector=\"%s\"",
2211 GetCurrentRun(), detector);
2212 if (sourceName.Length() > 0 && source)
2213 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2214 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2216 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2219 TSQLResult* aResult;
2220 aResult = fServer[system]->Query(sqlQuery);
2222 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2223 GetSystemName(system), source));
2227 TList *list = new TList();
2230 if (aResult->GetRowCount() == 0)
2233 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2238 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2242 while ((aRow = aResult->Next()))
2245 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2246 AliDebug(2, Form("fileId = %s", id.Data()));
2247 list->Add(new TObjString(id));
2256 //______________________________________________________________________________________________
2257 Bool_t AliShuttle::Connect(Int_t system)
2259 // Connect to MySQL Server of the system's FXS MySQL databases
2260 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2263 // check connection: if already connected return
2264 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2266 TString dbHost, dbUser, dbPass, dbName;
2268 if (system < 3) // FXS db servers
2270 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2271 dbUser = fConfig->GetFXSdbUser(system);
2272 dbPass = fConfig->GetFXSdbPass(system);
2273 dbName = fConfig->GetFXSdbName(system);
2274 } else { // Run & Shuttle logbook servers
2275 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2276 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2277 dbUser = fConfig->GetDAQlbUser();
2278 dbPass = fConfig->GetDAQlbPass();
2279 dbName = fConfig->GetDAQlbDB();
2282 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2283 if (!fServer[system] || !fServer[system]->IsConnected()) {
2286 AliError(Form("Can't establish connection to FXS database for %s",
2287 AliShuttleInterface::GetSystemName(system)));
2289 AliError("Can't establish connection to Run logbook.");
2291 if(fServer[system]) delete fServer[system];
2296 TSQLResult* aResult=0;
2299 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2302 aResult = fServer[kDCS]->GetTables(dbName.Data());
2305 aResult = fServer[kHLT]->GetTables(dbName.Data());
2308 aResult = fServer[3]->GetTables(dbName.Data());
2316 //______________________________________________________________________________________________
2317 Bool_t AliShuttle::UpdateTable()
2320 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2323 Bool_t result = kTRUE;
2325 for (UInt_t system=0; system<3; system++)
2327 if(!fFXSCalled[system]) continue;
// check connection; connect if necessary
2330 if (!Connect(system))
2332 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2337 TTimeStamp now; // now
2339 // Loop on FXS list entries
2340 TIter iter(&fFXSlist[system]);
2341 TObjString *aFXSentry=0;
2342 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2344 TString aFXSentrystr = aFXSentry->String();
2345 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2346 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2348 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2349 GetSystemName(system), aFXSentrystr.Data()));
2350 if(aFXSarray) delete aFXSarray;
2354 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2355 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2357 TString whereClause;
2360 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2361 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2363 else if (system == kDCS)
2365 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2366 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2368 else if (system == kHLT)
2370 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2371 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2376 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2377 static_cast<Long_t>(now.GetSec()), whereClause.Data()); // GetSec() returns time_t: cast to match %ld
2379 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2382 TSQLResult* aResult;
2383 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2386 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2387 GetSystemName(system), sqlQuery.Data()));
2398 //______________________________________________________________________________________________
2399 Bool_t AliShuttle::UpdateTableFailCase()
2401 // Update FXS table, filling the time_processed field in all rows corresponding to the current run and detector.
2402 // This is called when the preprocessor is declared failed for the current run, because
2403 // otherwise the fields are updated only in case of success
2405 Bool_t result = kTRUE;
2407 for (UInt_t system=0; system<3; system++)
2409 // check connection; connect if necessary
2410 if (!Connect(system))
2412 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2413 GetSystemName(system)));
2418 TTimeStamp now; // now
2420 // No loop on FXS list entries here: update all rows of the current run and detector
2422 TString whereClause = Form("where run=%d and detector=\"%s\";",
2423 GetCurrentRun(), fCurrentDetector.Data());
2426 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2427 static_cast<Long_t>(now.GetSec()), whereClause.Data()); // GetSec() returns time_t: cast to match %ld
2429 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2432 TSQLResult* aResult;
2433 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2436 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2437 GetSystemName(system), sqlQuery.Data()));
2447 //______________________________________________________________________________________________
2448 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2451 // Update Shuttle logbook, filling the detector or shuttle_done column
2452 // Example usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2455 // check connection; connect if necessary
2457 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2461 TString detName(detector);
2463 if(detName == "shuttle_done")
2465 setClause = "set shuttle_done=1";
2467 // Send the information to ML
2468 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2471 mlList.Add(&mlStatus);
2473 fMonaLisa->SendParameters(&mlList);
2475 TString statusStr(status);
2476 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2477 statusStr.Contains("failed", TString::kIgnoreCase)){
2478 setClause = Form("set %s=\"%s\"", detector, status);
2481 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2487 TString whereClause = Form("where run=%d", GetCurrentRun());
2489 TString sqlQuery = Form("update %s %s %s",
2490 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2492 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2495 TSQLResult* aResult;
2496 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2498 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2506 //______________________________________________________________________________________________
2507 Int_t AliShuttle::GetCurrentRun() const
2510 // Get current run from logbook entry
2513 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2516 //______________________________________________________________________________________________
2517 UInt_t AliShuttle::GetCurrentStartTime() const
2520 // get current start time from logbook entry
2523 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2526 //______________________________________________________________________________________________
2527 UInt_t AliShuttle::GetCurrentEndTime() const
2530 // get current end time from logbook entry
2533 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2536 //______________________________________________________________________________________________
2537 void AliShuttle::Log(const char* detector, const char* message)
2540 // Fill log string with a message
2543 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2545 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2546 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
2551 gSystem->FreeDirectory(dir);
2554 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2555 if (GetCurrentRun() >= 0)
2556 toLog += Form("run %d - ", GetCurrentRun());
2557 toLog += Form("%s", message);
2559 AliInfo(toLog.Data());
2561 // if the log output is already redirected to the file, return here
2562 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2565 TString fileName = GetLogFileName(detector);
2567 gSystem->ExpandPathName(fileName);
2570 logFile.open(fileName, ofstream::out | ofstream::app);
2572 if (!logFile.is_open()) {
2573 AliError(Form("Could not open file %s", fileName.Data()));
2577 logFile << toLog.Data() << "\n";
2582 //______________________________________________________________________________________________
2583 TString AliShuttle::GetLogFileName(const char* detector) const
2586 // returns the name of the log file for a given sub detector
2591 if (GetCurrentRun() >= 0)
2592 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2594 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2599 //______________________________________________________________________________________________
2600 Bool_t AliShuttle::Collect(Int_t run)
2603 // Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default).
2604 // If a specific run is given, only this run is processed
2606 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2610 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2612 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2614 SetLastAction("Starting");
2616 TString whereClause("where shuttle_done=0");
2618 whereClause += Form(" and run=%d", run);
2620 TObjArray shuttleLogbookEntries;
2621 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2623 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2627 if (shuttleLogbookEntries.GetEntries() == 0)
2630 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2632 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2633 "or it does not exist in Shuttle logbook", run));
2637 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2638 fFirstUnprocessed[iDet] = kTRUE;
2642 // query Shuttle logbook for earlier runs, check whether some detectors are still unprocessed,
2643 // and flag them in the fFirstUnprocessed array
2644 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2645 TObjArray tmpLogbookEntries;
2646 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2648 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2652 TIter iter(&tmpLogbookEntries);
2653 AliShuttleLogbookEntry* anEntry = 0;
2654 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2656 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2658 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2660 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2661 anEntry->GetRun(), GetDetName(iDet)));
2662 fFirstUnprocessed[iDet] = kFALSE;
2670 if (!RetrieveConditionsData(shuttleLogbookEntries))
2672 Log("SHUTTLE", "Collect - Process of at least one run failed");
2676 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
2680 //______________________________________________________________________________________________
2681 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2684 // Retrieve conditions data for all runs that aren't processed yet
2687 Bool_t hasError = kFALSE;
2689 TIter iter(&dateEntries);
2690 AliShuttleLogbookEntry* anEntry;
2692 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2693 if (!Process(anEntry)){
2697 // clean SHUTTLE temp directory
2698 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2699 RemoveFile(filename.Data());
2702 return hasError == kFALSE;
2705 //______________________________________________________________________________________________
2706 ULong_t AliShuttle::GetTimeOfLastAction() const
2709 // Gets time of last action
2714 fMonitoringMutex->Lock();
2716 tmp = fLastActionTime;
2718 fMonitoringMutex->UnLock();
2723 //______________________________________________________________________________________________
2724 const TString AliShuttle::GetLastAction() const
2727 // returns a string description of the last action
2732 fMonitoringMutex->Lock();
2736 fMonitoringMutex->UnLock();
2741 //______________________________________________________________________________________________
2742 void AliShuttle::SetLastAction(const char* action)
2745 // updates the monitoring variables
2748 fMonitoringMutex->Lock();
2750 fLastAction = action;
2751 fLastActionTime = time(0);
2753 fMonitoringMutex->UnLock();
2756 //______________________________________________________________________________________________
2757 const char* AliShuttle::GetRunParameter(const char* param)
2760 // returns run parameter read from DAQ logbook
2763 if(!fLogbookEntry) {
2764 AliError("No logbook entry!");
2768 return fLogbookEntry->GetRunParameter(param);
2771 //______________________________________________________________________________________________
2772 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
2775 // returns object from OCDB valid for current run
2778 if (fTestMode & kErrorOCDB)
2780 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2784 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2787 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
2791 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2794 //______________________________________________________________________________________________
2795 Bool_t AliShuttle::SendMail()
2798 // sends a mail to the subdetector expert in case of preprocessor error
2801 if (fTestMode != kNone)
2804 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
2807 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
2809 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
2814 gSystem->FreeDirectory(dir);
2817 TString bodyFileName;
2818 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
2819 gSystem->ExpandPathName(bodyFileName);
2822 mailBody.open(bodyFileName, ofstream::out);
2824 if (!mailBody.is_open())
2826 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2831 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2832 TObjString *anExpert=0;
2833 while ((anExpert = (TObjString*) iterExperts.Next()))
2835 to += Form("%s,", anExpert->GetName());
2837 to.Remove(to.Length()-1);
2838 AliDebug(2, Form("to: %s",to.Data()));
2841 AliInfo("List of detector responsibles not yet set!");
2845 TString cc="alberto.colla@cern.ch";
2847 TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
2848 fCurrentDetector.Data(), GetCurrentRun());
2849 AliDebug(2, Form("subject: %s", subject.Data()));
2851 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2852 body += Form("SHUTTLE just detected that your preprocessor "
2853 "failed processing run %d!!\n\n", GetCurrentRun());
2854 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
2855 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
2856 body += Form("Find the %s log for the current run on \n\n"
2857 "\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n",
2858 fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
2859 body += Form("The last 10 lines of %s log file follow:\n\n", fCurrentDetector.Data());
2861 AliDebug(2, Form("Body begin: %s", body.Data()));
2863 mailBody << body.Data();
2865 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2867 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
2868 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2869 if (gSystem->Exec(tailCommand.Data()))
2871 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2874 TString endBody = Form("------------------------------------------------------\n\n");
2875 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2876 endBody += "Please do not answer this message directly; it is automatically generated.\n\n";
2877 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
2879 AliDebug(2, Form("Body end: %s", endBody.Data()));
2881 mailBody << endBody.Data();
2886 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2890 bodyFileName.Data());
2891 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
2893 Bool_t result = gSystem->Exec(mailCommand.Data());
2898 //______________________________________________________________________________________________
2899 const char* AliShuttle::GetRunType()
2902 // returns run type read from "run type" logbook
2905 if(!fLogbookEntry) {
2906 AliError("No logbook entry!");
2910 return fLogbookEntry->GetRunType();
2913 //______________________________________________________________________________________________
2914 Bool_t AliShuttle::GetHLTStatus()
2916 // Return HLT status (ON=1 OFF=0)
2917 // Converts the HLT status from the status string read in the run logbook (not just a bool)
2919 if(!fLogbookEntry) {
2920 AliError("No logbook entry!");
2924 // TODO implement when HLTStatus is inserted in run logbook
2925 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
2926 //if(hltStatus == "OFF") { return kFALSE; }
2931 //______________________________________________________________________________________________
2932 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2935 // sets Shuttle temp directory
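2938 fgkShuttleTempDir = tmpDir; gSystem->ExpandPathName(fgkShuttleTempDir); // expand in place: the const char* overload returns a buffer the caller must delete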
2941 //______________________________________________________________________________________________
2942 void AliShuttle::SetShuttleLogDir(const char* logDir)
2945 // sets Shuttle log directory
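2948 fgkShuttleLogDir = logDir; gSystem->ExpandPathName(fgkShuttleLogDir); // expand in place: the const char* overload returns a buffer the caller must delete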