/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *                                                                        *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 *                                                                        *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/
18 Revision 1.66 2007/12/05 10:45:19 jgrosseo
19 changed order of arguments to TMonaLisaWriter
21 Revision 1.65 2007/11/26 16:58:37 acolla
22 Monalisa configuration added: host and table name
24 Revision 1.64 2007/11/13 16:15:47 acolla
25 DCS map is stored in a file in the temp folder where the detector is processed.
26 If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.
28 Revision 1.63 2007/11/02 10:53:16 acolla
29 Protection added to AliShuttle::CopyFileLocally
Revision 1.62 2007/10/31 18:23:13 acolla
Further development on the Shuttle:

- Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
are now built from /alice/data, e.g.:
/alice/data/2007/LHC07a/OCDB

the year and LHC period are taken from the Shuttle.
Raw metadata files are stored by GRP to:
/alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root

- Shuttle sends a mail to DCS experts each time DP retrieval fails.
Revision 1.61 2007/10/30 20:33:51 acolla
Improved management of temporary folders, which weren't correctly handled.
Resolved a bug introduced in StoreReferenceFile, which caused the SPD preprocessor to fail.
48 Revision 1.60 2007/10/29 18:06:16 acolla
50 New function StoreRunMetadataFile added to preprocessor and Shuttle interface
51 This function can be used by GRP only. It stores raw data tags merged file to the
52 raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).
56 1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
57 2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
58 has been corrected in the root package on the Shuttle machine.
60 Revision 1.59 2007/10/05 12:40:55 acolla
62 Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).
64 Revision 1.58 2007/09/28 15:27:40 acolla
66 AliDCSClient "multiSplit" option added in the DCS configuration
67 in AliDCSMessage: variable MAX_BODY_SIZE set to 500000
69 Revision 1.57 2007/09/27 16:53:13 acolla
70 Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
71 merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.
73 Revision 1.56 2007/09/14 16:46:14 jgrosseo
74 1) Connect and Close are called before and after each query, so one can
75 keep the same AliDCSClient object.
76 2) The splitting of a query is moved to GetDPValues/GetAliasValues.
77 3) Splitting interval can be specified in constructor
79 Revision 1.55 2007/08/06 12:26:40 acolla
80 Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
81 read from the run logbook.
83 Revision 1.54 2007/07/12 09:51:25 jgrosseo
84 removed duplicated log message in GetFile
86 Revision 1.53 2007/07/12 09:26:28 jgrosseo
87 updating hlt fxs base path
89 Revision 1.52 2007/07/12 08:06:45 jgrosseo
90 adding log messages in getfile... functions
91 adding not implemented copy constructor in alishuttleconfigholder
93 Revision 1.51 2007/07/03 17:24:52 acolla
94 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
96 Revision 1.50 2007/07/02 17:19:32 acolla
97 preprocessor is run in a temp directory that is removed when process is finished.
99 Revision 1.49 2007/06/29 10:45:06 acolla
100 Number of columns in MySql Shuttle logbook increased by one (HLT added)
102 Revision 1.48 2007/06/21 13:06:19 acolla
103 GetFileSources returns dummy list with 1 source if system=DCS (better than
104 returning error as it was)
106 Revision 1.47 2007/06/19 17:28:56 acolla
107 HLT updated; missing map bug removed.
109 Revision 1.46 2007/06/09 13:01:09 jgrosseo
110 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
112 Revision 1.45 2007/05/30 06:35:20 jgrosseo
113 Adding functionality to the Shuttle/TestShuttle:
114 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
115 o) Function to retrieve list of IDs for a given source (GetFileIDs)
116 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
117 Example code has been added to the TestProcessor in TestShuttle
119 Revision 1.44 2007/05/11 16:09:32 acolla
120 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
121 example: ITS/SPD/100_filename.root
123 Revision 1.43 2007/05/10 09:59:51 acolla
124 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
126 Revision 1.42 2007/05/03 08:01:39 jgrosseo
127 typo in last commit :-(
129 Revision 1.41 2007/05/03 08:00:48 jgrosseo
130 fixing log message when pp want to skip dcs value retrieval
132 Revision 1.40 2007/04/27 07:06:48 jgrosseo
133 GetFileSources returns empty list in case of no files, but successful query
134 No mails sent in testmode
136 Revision 1.39 2007/04/17 12:43:57 acolla
137 Correction in StoreOCDB; change of text in mail to detector expert
139 Revision 1.38 2007/04/12 08:26:18 jgrosseo
142 Revision 1.37 2007/04/10 16:53:14 jgrosseo
143 redirecting sub detector stdout, stderr to sub detector log file
145 Revision 1.35 2007/04/04 16:26:38 acolla
146 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
147 2. Added missing dependency in test preprocessors.
148 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
150 Revision 1.34 2007/04/04 10:33:36 jgrosseo
151 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
152 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
154 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
156 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
158 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
If you always need DCS data (like before), you do not need to implement it.
163 6) The run type has been added to the monitoring page
165 Revision 1.33 2007/04/03 13:56:01 acolla
166 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
169 Revision 1.32 2007/02/28 10:41:56 acolla
170 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
171 AliPreprocessor::GetRunType() function.
172 Added some ldap definition files.
174 Revision 1.30 2007/02/13 11:23:21 acolla
175 Moved getters and setters of Shuttle's main OCDB/Reference, local
176 OCDB/Reference, temp and log folders to AliShuttleInterface
178 Revision 1.27 2007/01/30 17:52:42 jgrosseo
179 adding monalisa monitoring
181 Revision 1.26 2007/01/23 19:20:03 acolla
182 Removed old ldif files, added TOF, MCH ldif files. Added some options in
183 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
186 Revision 1.25 2007/01/15 19:13:52 acolla
187 Moved some AliInfo to AliDebug in SendMail function
189 Revision 1.21 2006/12/07 08:51:26 jgrosseo
191 table, db names in ldap configuration
192 added GRP preprocessor
193 DCS data can also be retrieved by data point
195 Revision 1.20 2006/11/16 16:16:48 jgrosseo
196 introducing strict run ordering flag
197 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
199 Revision 1.19 2006/11/06 14:23:04 jgrosseo
200 major update (Alberto)
201 o) reading of run parameters from the logbook
202 o) online offline naming conversion
203 o) standalone DCSclient package
205 Revision 1.18 2006/10/20 15:22:59 jgrosseo
206 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
207 o) Merging Collect, CollectAll, CollectNew function
208 o) Removing implementation of empty copy constructors (declaration still there!)
210 Revision 1.17 2006/10/05 16:20:55 jgrosseo
211 adapting to new CDB classes
213 Revision 1.16 2006/10/05 15:46:26 jgrosseo
214 applying to the new interface
216 Revision 1.15 2006/10/02 16:38:39 jgrosseo
219 storing of objects that failed to be stored to the grid before
220 interfacing of shuttle status table in daq system
222 Revision 1.14 2006/08/29 09:16:05 jgrosseo
225 Revision 1.13 2006/08/15 10:50:00 jgrosseo
226 effc++ corrections (alberto)
228 Revision 1.12 2006/08/08 14:19:29 jgrosseo
229 Update to shuttle classes (Alberto)
231 - Possibility to set the full object's path in the Preprocessor's and
232 Shuttle's Store functions
233 - Possibility to extend the object's run validity in the same classes
234 ("startValidity" and "validityInfinite" parameters)
235 - Implementation of the StoreReferenceData function to store reference
236 data in a dedicated CDB storage.
238 Revision 1.11 2006/07/21 07:37:20 jgrosseo
239 last run is stored after each run
241 Revision 1.10 2006/07/20 09:54:40 jgrosseo
242 introducing status management: The processing per subdetector is divided into several steps,
243 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
244 can keep track of the number of failures and skips further processing after a certain threshold is
245 exceeded. These thresholds can be configured in LDAP.
Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)
250 Revision 1.8 2006/07/11 12:44:36 jgrosseo
251 adding parameters for extended validity range of data produced by preprocessor
253 Revision 1.7 2006/07/10 14:37:09 jgrosseo
254 small fix + todo comment
Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)
259 Revision 1.5 2006/07/04 14:59:57 jgrosseo
260 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
262 Revision 1.4 2006/06/12 09:11:16 jgrosseo
263 coding conventions (Alberto)
265 Revision 1.3 2006/06/06 14:26:40 jgrosseo
266 o) removed files that were moved to STEER
267 o) shuttle updated to follow the new interface (Alberto)
269 Revision 1.2 2006/03/07 07:52:34 hristov
270 New version (B.Yordanov)
272 Revision 1.6 2005/11/19 17:19:14 byordano
273 RetrieveDATEEntries and RetrieveConditionsData added
275 Revision 1.5 2005/11/19 11:09:27 byordano
276 AliShuttle declaration added
278 Revision 1.4 2005/11/17 17:47:34 byordano
279 TList changed to TObjArray
281 Revision 1.3 2005/11/17 14:43:23 byordano
284 Revision 1.1.1.1 2005/10/28 07:33:58 hristov
285 Initial import as subdirectory in AliRoot
287 Revision 1.2 2005/09/13 08:41:15 byordano
288 default startTime endTime added
290 Revision 1.4 2005/08/30 09:13:02 byordano
293 Revision 1.3 2005/08/29 21:15:47 byordano
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it is used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically in the underlying AliCDBStorage.
// The alias name is used as detSpec.
311 #include "AliShuttle.h"
313 #include "AliCDBManager.h"
314 #include "AliCDBStorage.h"
315 #include "AliCDBId.h"
316 #include "AliCDBRunRange.h"
317 #include "AliCDBPath.h"
318 #include "AliCDBEntry.h"
319 #include "AliShuttleConfig.h"
320 #include "DCSClient/AliDCSClient.h"
322 #include "AliPreprocessor.h"
323 #include "AliShuttleStatus.h"
324 #include "AliShuttleLogbookEntry.h"
329 #include <TTimeStamp.h>
330 #include <TObjString.h>
331 #include <TSQLServer.h>
332 #include <TSQLResult.h>
335 #include <TSystemDirectory.h>
336 #include <TSystemFile.h>
339 #include <TGridResult.h>
341 #include <TMonaLisaWriter.h>
345 #include <sys/types.h>
346 #include <sys/wait.h>
350 //______________________________________________________________________________________________
351 AliShuttle::AliShuttle(const AliShuttleConfig* config,
352 UInt_t timeout, Int_t retries):
354 fTimeout(timeout), fRetries(retries),
364 fReadTestMode(kFALSE),
365 fOutputRedirected(kFALSE)
368 // config: AliShuttleConfig used
369 // timeout: timeout used for AliDCSClient connection
370 // retries: the number of retries in case of connection error.
373 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
374 for(int iSys=0;iSys<4;iSys++) {
377 fFXSlist[iSys].SetOwner(kTRUE);
379 fPreprocessorMap.SetOwner(kTRUE);
381 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
382 fFirstUnprocessed[iDet] = kFALSE;
384 fMonitoringMutex = new TMutex();
387 //______________________________________________________________________________________________
388 AliShuttle::~AliShuttle()
394 fPreprocessorMap.DeleteAll();
395 for(int iSys=0;iSys<4;iSys++)
397 fServer[iSys]->Close();
398 delete fServer[iSys];
407 if (fMonitoringMutex)
409 delete fMonitoringMutex;
410 fMonitoringMutex = 0;
414 //______________________________________________________________________________________________
415 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
// Registers a new AliPreprocessor.
// GetName() is used as the identifier of the preprocessor.
// The preprocessor is registered only if no other preprocessor
// with the same identifier (GetName()) is already registered.
424 const char* detName = preprocessor->GetName();
425 if(GetDetPos(detName) < 0)
426 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
428 if (fPreprocessorMap.GetValue(detName)) {
429 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
433 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
435 //______________________________________________________________________________________________
436 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
437 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
439 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
440 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
441 // using this function. Use StoreReferenceData instead!
442 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
443 // finishes the data are transferred to the main storage (Grid).
445 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
448 //______________________________________________________________________________________________
449 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
// Stores a CDB object in the storage for reference data. These objects will not be available during
// offline reconstruction. Use this function for reference data only!
// It calls the StoreLocally function, which temporarily stores the data locally; when the preprocessor
// finishes, the data are transferred to the main storage (Grid).
456 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
459 //______________________________________________________________________________________________
460 Bool_t AliShuttle::StoreLocally(const TString& localUri,
461 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
462 Int_t validityStart, Bool_t validityInfinite)
// Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
// When the preprocessor finishes, the data are transferred to the main storage (Grid).
// The parameters are:
// 1) URI of the backup storage (local)
// 2) the object's path
// 3) the object to be stored
// 4) the metaData to be associated with the object
// 5) the validity start run number w.r.t. the current run;
//    if the data is valid only for this run, leave the default 0
// 6) whether the calibration data is valid for infinity (this means until updated),
//    typical for calibration runs; the default is kFALSE
//
// Returns kFALSE on failure, kTRUE otherwise
478 if (fTestMode & kErrorStorage)
480 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
484 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
486 Int_t firstRun = GetCurrentRun() - validityStart;
488 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
493 if(validityInfinite) {
494 lastRun = AliCDBRunRange::Infinity();
496 lastRun = GetCurrentRun();
499 // Version is set to current run, it will be used later to transfer data to Grid
500 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
502 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
503 TObjString runUsed = Form("%d", GetCurrentRun());
504 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
507 Bool_t result = kFALSE;
509 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
510 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
512 result = AliCDBManager::Instance()->GetStorage(localUri)
513 ->Put(object, id, metaData);
518 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
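// Illustrative sketch (not part of AliShuttle): the validity range that StoreLocally derives
// from the current run, validityStart and validityInfinite can be written as a standalone
// helper. The names RunRange, ComputeRunRange and kInfinity are hypothetical; kInfinity only
// stands in for AliCDBRunRange::Infinity(), whose actual value is not shown in this file.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for AliCDBRunRange::Infinity(); the real value may differ.
const int kInfinity = std::numeric_limits<int>::max();

struct RunRange {
  int first;  // first run for which the object is valid
  int last;   // last run for which the object is valid
};

// The range starts validityStart runs before the current run (clamped at 0)
// and ends at the current run, or at "infinity" when validityInfinite is set.
RunRange ComputeRunRange(int currentRun, int validityStart, bool validityInfinite)
{
  assert(currentRun >= 0);
  RunRange range;
  range.first = currentRun - validityStart;
  if (range.first < 0) range.first = 0;  // StoreLocally warns and resets to 0 here
  range.last = validityInfinite ? kInfinity : currentRun;
  return range;
}
```

// With the defaults (validityStart = 0, validityInfinite = kFALSE) the object is valid
// for the current run only.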
524 //______________________________________________________________________________________________
525 Bool_t AliShuttle::StoreOCDB()
528 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
529 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
530 // Then calls StoreRefFilesToGrid to store reference files.
533 if (fTestMode & kErrorGrid)
535 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
536 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
540 Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
541 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
543 Log("SHUTTLE","StoreOCDB - Storing reference data ...");
544 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
546 Log("SHUTTLE","StoreOCDB - Storing reference files ...");
547 Bool_t resultRefFiles = CopyFilesToGrid("reference");
549 Bool_t resultMetadata = kTRUE;
550 if(fCurrentDetector == "GRP")
		Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
553 resultMetadata = CopyFilesToGrid("metadata");
556 return resultCDB && resultRef && resultRefFiles && resultMetadata;
559 //______________________________________________________________________________________________
560 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
563 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
566 TObjArray* gridIds=0;
568 Bool_t result = kTRUE;
570 const char* type = 0;
572 if(gridURI == fgkMainCDB) {
574 localURI = fgkLocalCDB;
575 } else if(gridURI == fgkMainRefStorage) {
577 localURI = fgkLocalRefStorage;
579 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
583 AliCDBManager* man = AliCDBManager::Instance();
585 AliCDBStorage *gridSto = man->GetStorage(gridURI);
588 Form("StoreOCDB - cannot activate main %s storage", type));
592 gridIds = gridSto->GetQueryCDBList();
594 // get objects previously stored in local CDB
595 AliCDBStorage *localSto = man->GetStorage(localURI);
598 Form("StoreOCDB - cannot activate local %s storage", type));
601 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
602 // Local objects were stored with current run as Grid version!
603 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
604 localEntries->SetOwner(1);
606 // loop on local stored objects
607 TIter localIter(localEntries);
608 AliCDBEntry *aLocEntry = 0;
609 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
610 aLocEntry->SetOwner(1);
611 AliCDBId aLocId = aLocEntry->GetId();
612 aLocEntry->SetVersion(-1);
613 aLocEntry->SetSubVersion(-1);
615 // If local object is valid up to infinity we store it only if it is
616 // the first unprocessed run!
617 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
618 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
620 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
621 "there are previous unprocessed runs!",
622 fCurrentDetector.Data(), aLocId.GetPath().Data()));
626 // loop on Grid valid Id's
627 Bool_t store = kTRUE;
628 TIter gridIter(gridIds);
629 AliCDBId* aGridId = 0;
630 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
631 if(aGridId->GetPath() != aLocId.GetPath()) continue;
632 // skip all objects valid up to infinity
633 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
634 // if we get here, it means there's already some more recent object stored on Grid!
639 // If we get here, the file can be stored!
640 Bool_t storeOk = gridSto->Put(aLocEntry);
641 if(!store || storeOk){
645 Log(fCurrentDetector.Data(),
646 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
647 type, aGridId->ToString().Data()));
650 Form("StoreOCDB - Object <%s> successfully put into %s storage",
651 aLocId.ToString().Data(), type));
652 Log(fCurrentDetector.Data(),
653 Form("StoreOCDB - Object <%s> successfully put into %s storage",
654 aLocId.ToString().Data(), type));
657 // removing local filename...
659 localSto->IdToFilename(aLocId, filename);
660 Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
661 RemoveFile(filename.Data());
665 Form("StoreOCDB - Grid %s storage of object <%s> failed",
666 type, aLocId.ToString().Data()));
667 Log(fCurrentDetector.Data(),
668 Form("StoreOCDB - Grid %s storage of object <%s> failed",
669 type, aLocId.ToString().Data()));
673 localEntries->Clear();
678 //______________________________________________________________________________________________
679 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
681 // clears the directory used to store reference files of a given subdetector
683 AliCDBManager* man = AliCDBManager::Instance();
684 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
685 TString localBaseFolder = sto->GetBaseFolder();
687 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
689 Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
692 begin.Form("%d_", GetCurrentRun());
694 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
698 TList* dirList = baseDir->GetListOfFiles();
701 if (!dirList) return kTRUE;
703 if (dirList->GetEntries() < 3)
709 Int_t nDirs = 0, nDel = 0;
710 TIter dirIter(dirList);
711 TSystemFile* entry = 0;
713 Bool_t success = kTRUE;
715 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
717 if (entry->IsDirectory())
720 TString fileName(entry->GetName());
721 if (!fileName.BeginsWith(begin))
727 Int_t result = gSystem->Unlink(fileName.Data());
731 Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
739 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
740 nDel, nDirs, targetDir.Data()));
751 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
755 result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
758 Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
763 result = gSystem->mkdir(targetDir, kTRUE);
766 Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
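// Illustrative sketch (not part of AliShuttle): CleanReferenceStorage only touches files
// whose name begins with "<run>_". The helper name HasRunPrefix is hypothetical and uses
// std::string in place of TString::BeginsWith.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Returns true if fileName starts with "<run>_", the naming convention used for
// reference files of a given run (e.g. "100_filename.root").
bool HasRunPrefix(const std::string& fileName, int run)
{
  assert(run >= 0);
  char prefix[32];
  std::snprintf(prefix, sizeof(prefix), "%d_", run);
  const std::string begin(prefix);
  // compare() on the leading characters is the std::string equivalent of BeginsWith
  return fileName.compare(0, begin.size(), begin) == 0;
}
```

// The trailing underscore matters: without it, files of run 1000 would also match run 100.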
773 //______________________________________________________________________________________________
774 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
// Stores a reference file directly (without opening it). This function stores the file locally.
//
// The file is stored under the following location:
// <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
// where <gridFileName> is the third parameter given to the function
784 if (fTestMode & kErrorStorage)
786 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
790 AliCDBManager* man = AliCDBManager::Instance();
791 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
793 TString localBaseFolder = sto->GetBaseFolder();
795 TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);
796 target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
798 return CopyFileLocally(localFile, target);
801 //______________________________________________________________________________________________
802 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
805 // Stores Run metadata file to the Grid, in the run folder
807 // Only GRP can call this function.
809 if (fTestMode & kErrorStorage)
811 Log(fCurrentDetector, "StoreRunMetaDataFile - In TESTMODE - Simulating error while storing locally");
815 AliCDBManager* man = AliCDBManager::Instance();
816 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
818 TString localBaseFolder = sto->GetBaseFolder();
820 // Build Run level folder
821 // folder = /alice/data/year/lhcPeriod/runNb/Raw
824 TString lhcPeriod = GetLHCPeriod();
825 if (lhcPeriod.Length() == 0)
827 Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
831 TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s",
832 localBaseFolder.Data(), GetCurrentYear(),
833 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
835 return CopyFileLocally(localFile, target);
838 //______________________________________________________________________________________________
839 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
842 // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
843 // Files are temporarily stored in the local reference storage. When the preprocessor
844 // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn
845 // (in reference or run level folders)
848 TString targetDir(target(0, target.Last('/')));
	// try to open the target directory; create it if it does not exist
851 void* dir = gSystem->OpenDirectory(targetDir.Data());
853 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
854 Log("SHUTTLE", Form("StoreFileLocally - Can't open directory <%s>", targetDir.Data()));
859 gSystem->FreeDirectory(dir);
864 result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
867 Log("SHUTTLE", Form("StoreFileLocally - %s does not exist", localFile));
871 result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
		Log("SHUTTLE", Form("StoreFileLocally - target file %s already exists, removing...", target.Data()));
875 if (gSystem->Unlink(target.Data()))
877 Log("SHUTTLE", Form("StoreFileLocally - Could not remove existing target file %s!", target.Data()));
882 result = gSystem->CopyFile(localFile, target);
886 Log("SHUTTLE", Form("StoreFileLocally - File %s stored locally to %s", localFile, target.Data()));
891 Log("SHUTTLE", Form("StoreFileLocally - Could not store file %s to %s! Error code = %d",
892 localFile, target.Data(), result));
900 //______________________________________________________________________________________________
901 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
// Transfers local files to the Grid. Local files can be reference files
// or the run metadata file (from GRP only).
//
// According to the type (reference, metadata), the files are stored under the following location:
// reference --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
// metadata --> <run data folder>/<MetadataFileName>
912 AliCDBManager* man = AliCDBManager::Instance();
913 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
916 TString localBaseFolder = sto->GetBaseFolder();
922 if (strcmp(type, "reference") == 0)
924 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
925 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
928 TString gridBaseFolder = gridSto->GetBaseFolder();
929 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
930 begin = Form("%d_", GetCurrentRun());
932 else if (strcmp(type, "metadata") == 0)
935 TString lhcPeriod = GetLHCPeriod();
937 if (lhcPeriod.Length() == 0)
939 Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
943 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
944 localBaseFolder.Data(), GetCurrentYear(),
945 lhcPeriod.Data(), GetCurrentRun());
946 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
952 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
956 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
960 TList* dirList = baseDir->GetListOfFiles();
963 if (!dirList) return kTRUE;
965 if (dirList->GetEntries() < 3)
973 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
978 Int_t nDirs = 0, nTransfer = 0;
979 TIter dirIter(dirList);
980 TSystemFile* entry = 0;
982 Bool_t success = kTRUE;
983 Bool_t first = kTRUE;
985 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
987 if (entry->IsDirectory())
990 TString fileName(entry->GetName());
991 if (!fileName.BeginsWith(begin))
999 // check that folder exists, otherwise create it
1000 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1008 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
			// TODO It does not work currently! Bug in TAlien::Mkdir
			// TODO Manually fixed in local root v5-16-00
1012 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1014 Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1019 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1023 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1027 TString fullLocalPath;
1028 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1030 TString fullGridPath;
1031 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1033 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1037 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!",
1038 fullLocalPath.Data(), fullGridPath.Data()));
1039 RemoveFile(fullLocalPath);
1044 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!",
1045 fullLocalPath.Data(), fullGridPath.Data()));
1050 Log("SHUTTLE", Form("CopyFilesToGrid - %d (out of %d) files in folder %s copied to Grid.",
1051 nTransfer, nDirs, dir.Data()));
1058 //______________________________________________________________________________________________
1059 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1062 // Get folder name of reference files
1065 TString offDetStr(GetOfflineDetName(detector));
1067 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1069 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1071 dir.Form("%s/%s", base, offDetStr.Data());
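/* Illustrative sketch, not part of AliShuttle: the folder rule above can be
restated with plain std::string. The function name RefFilePrefix and the
detector names used in the usage note are assumptions made for the example;
the real code obtains the offline name from GetOfflineDetName().

```cpp
#include <string>

// Hypothetical stand-in for the rule in GetRefFilePrefix: detectors whose
// offline name is ITS, MUON or PHOS get one extra folder level named after
// the online detector; all others use the offline name only.
std::string RefFilePrefix(const std::string& base,
                          const std::string& offlineName,
                          const std::string& onlineName)
{
    const bool shared = offlineName == "ITS" ||
                        offlineName == "MUON" ||
                        offlineName == "PHOS";
    if (shared) return base + "/" + offlineName + "/" + onlineName;
    return base + "/" + offlineName;
}
```

For example, RefFilePrefix("/ref", "ITS", "SPD") would give "/ref/ITS/SPD",
while RefFilePrefix("/ref", "TPC", "TPC") would give "/ref/TPC". */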
1079 //______________________________________________________________________________________________
1080 void AliShuttle::CleanLocalStorage(const TString& uri)
1083 // Called when the preprocessor is declared failed. Removes the remaining objects from the local storage.
1086 const char* type = 0;
1087 if(uri == fgkLocalCDB) {
1089 } else if(uri == fgkLocalRefStorage) {
1092 AliError(Form("Invalid storage URI: %s", uri.Data()));
1096 AliCDBManager* man = AliCDBManager::Instance();
1098 // open local storage
1099 AliCDBStorage *localSto = man->GetStorage(uri);
1102 Form("CleanLocalStorage - cannot activate local %s storage", type));
1106 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1107 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1109 AliDebug(2, Form("filename = %s", filename.Data()));
1111 Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1112 GetCurrentRun(), fCurrentDetector.Data()));
1114 RemoveFile(filename.Data());
1118 //______________________________________________________________________________________________
1119 void AliShuttle::RemoveFile(const char* filename)
1122 // removes local file
1125 TString command(Form("rm -f %s", filename));
1127 Int_t result = gSystem->Exec(command.Data());
1130 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1131 fCurrentDetector.Data(), filename));
1135 //______________________________________________________________________________________________
1136 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1139 // Reads the AliShuttleStatus from the CDB
1143 delete fStatusEntry;
1147 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1148 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1150 if (!fStatusEntry) return 0;
1151 fStatusEntry->SetOwner(1);
1153 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1155 AliError("Invalid object stored to CDB!");
1162 //______________________________________________________________________________________________
1163 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1166 // writes the status for one subdetector
1170 delete fStatusEntry;
1174 Int_t run = GetCurrentRun();
1176 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1178 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1179 fStatusEntry->SetOwner(1);
1181 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1184 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1185 fCurrentDetector.Data(), run));
1194 //______________________________________________________________________________________________
1195 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1198 // changes the AliShuttleStatus for the given detector and run to the given status
1202 AliError("UNEXPECTED: fStatusEntry empty");
1206 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1209 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1213 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1214 fCurrentDetector.Data(),
1215 status->GetStatusName(),
1216 status->GetStatusName(newStatus));
1217 Log("SHUTTLE", actionStr);
1218 SetLastAction(actionStr);
1220 status->SetStatus(newStatus);
1221 if (increaseCount) status->IncreaseCount();
1223 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1228 //______________________________________________________________________________________________
1229 void AliShuttle::SendMLInfo()
1232 // sends ML information about the current status of the current detector being processed
1235 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1238 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1242 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1243 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1246 mlList.Add(&mlStatus);
1247 mlList.Add(&mlRetryCount);
1249 fMonaLisa->SendParameters(&mlList);
1252 //______________________________________________________________________________________________
1253 Bool_t AliShuttle::ContinueProcessing()
1255 // this function reads the AliShuttleStatus information from CDB and
1256 // checks if the processing should be continued
1257 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1259 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1261 AliPreprocessor* aPreprocessor =
1262 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1265 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1269 AliShuttleLogbookEntry::Status entryStatus =
1270 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1272 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1273 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1274 fCurrentDetector.Data(),
1275 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1279 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1281 // check if current run is first unprocessed run for current detector
1282 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1283 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1285 if (fTestMode == kNone)
1287 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
1288 " but this is not the first unprocessed run!", fCurrentDetector.Data()));
1293 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
1294 "Although %s requires strict run ordering "
1295 "and this is not the first unprocessed run, "
1296 "the SHUTTLE continues", fCurrentDetector.Data()));
1300 AliShuttleStatus* status = ReadShuttleStatus();
1303 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1304 fCurrentDetector.Data()));
1305 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1306 return WriteShuttleStatus(status);
1309 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1310 // If it happens it may mean Logbook updating failed... let's do it now!
1311 if (status->GetStatus() == AliShuttleStatus::kDone ||
1312 status->GetStatus() == AliShuttleStatus::kFailed){
1313 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1314 fCurrentDetector.Data(),
1315 status->GetStatusName(status->GetStatus())));
1316 UpdateShuttleLogbook(fCurrentDetector.Data(),
1317 status->GetStatusName(status->GetStatus()));
1321 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1323 Form("ContinueProcessing - %s: Grid storage of one or more "
1324 "objects failed. Trying again now",
1325 fCurrentDetector.Data()));
1326 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1328 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1329 "successfully stored into main storage",
1330 fCurrentDetector.Data()));
1331 UpdateShuttleStatus(AliShuttleStatus::kDone);
1332 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1335 Form("ContinueProcessing - %s: Grid storage failed again",
1336 fCurrentDetector.Data()));
1337 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1342 // if we get here, there is a restart
1343 Bool_t cont = kFALSE;
1346 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1347 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1348 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1349 status->GetCount(), status->GetStatusName()));
1350 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1351 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1353 // there may still be objects in local OCDB and reference storage
1354 // and FXS databases may not be updated: do it now!
1356 // TODO Currently disabled, we want to keep files in case of failure!
1357 // CleanLocalStorage(fgkLocalCDB);
1358 // CleanLocalStorage(fgkLocalRefStorage);
1359 // UpdateTableFailCase();
1361 // Send mail to detector expert!
1362 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...",
1363 fCurrentDetector.Data()));
1365 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1366 fCurrentDetector.Data()));
1369 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1370 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1371 status->GetStatusName(), status->GetCount()));
1372 Bool_t increaseCount = kTRUE;
1373 if (status->GetStatus() == AliShuttleStatus::kDCSError ||
1374 status->GetStatus() == AliShuttleStatus::kDCSStarted)
1375 increaseCount = kFALSE;
1377 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
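/* Illustrative sketch, not part of AliShuttle: the restart branch above boils
down to two decisions, which can be isolated in a small pure function. The
enum, struct and function names are hypothetical; only the rules (give up once
the retry count reaches the maximum, and do not count restarts that follow a
DCS-phase abort) are taken from the code above.

```cpp
// Reduced, hypothetical status set; the real AliShuttleStatus has more states.
enum class Status { Started, DCSStarted, DCSError, PPStarted, PPError };

struct RestartDecision {
    bool giveUp;         // mark the detector FAILED and stop retrying
    bool increaseCount;  // whether this restart counts as a retry
};

RestartDecision DecideRestart(Status last, int count, int maxRetries)
{
    // Too many attempts already: no further restart.
    if (count >= maxRetries) return {true, false};

    // Aborts during DCS retrieval are restarted without using up a retry.
    const bool dcsPhase = last == Status::DCSError ||
                          last == Status::DCSStarted;
    return {false, !dcsPhase};
}
```

Keeping the decision free of logging and CDB access makes it easy to check
each case in isolation. */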
1384 //______________________________________________________________________________________________
1385 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1388 // Retrieves data for all detectors in the configuration.
1389 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1390 // (Unprocessed, Inactive, Failed or Done).
1391 // Returns kFALSE if an error occurred and kTRUE otherwise
1394 if (!entry) return kFALSE;
1396 fLogbookEntry = entry;
1398 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1401 // create ML instance that monitors this run
1402 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable(), Form("%d", GetCurrentRun()));
1404 // Send the information to ML
1405 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1406 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1409 mlList.Add(&mlStatus);
1410 mlList.Add(&mlRunType);
1412 fMonaLisa->SendParameters(&mlList);
1414 if (fLogbookEntry->IsDone())
1416 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1417 UpdateShuttleLogbook("shuttle_done");
1422 // read test mode if flag is set
1426 TString logEntry(entry->GetRunParameter("log"));
1427 //printf("log entry = %s\n", logEntry.Data());
1428 TString searchStr("Testmode: ");
1429 Int_t pos = logEntry.Index(searchStr.Data());
1430 //printf("%d\n", pos);
1433 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1434 //printf("%s\n", subStr.String().Data());
1435 TString newStr(subStr.Data());
1436 TObjArray* token = newStr.Tokenize(' ');
1440 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1443 Int_t testMode = tmpStr->String().Atoi();
1446 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1447 SetTestMode((TestMode) testMode);
1455 fLogbookEntry->Print("all");
1458 Bool_t hasError = kFALSE;
1460 // Set the CDB and Reference folders according to the year and LHC period
1461 TString lhcPeriod(GetLHCPeriod());
1462 if (lhcPeriod.Length() == 0)
1464 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1468 if (fgkMainCDB.Length() == 0)
1469 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache",
1470 GetCurrentYear(), lhcPeriod.Data());
1472 if (fgkMainRefStorage.Length() == 0)
1473 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache",
1474 GetCurrentYear(), lhcPeriod.Data());
1476 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1477 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1478 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1479 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1481 // Loop on detectors in the configuration
1482 TIter iter(fConfig->GetDetectors());
1483 TObjString* aDetector = 0;
1485 while ((aDetector = (TObjString*) iter.Next()))
1487 fCurrentDetector = aDetector->String();
1489 if (ContinueProcessing() == kFALSE) continue;
1491 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START ******",
1492 GetCurrentRun(), aDetector->GetName()));
1494 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1496 Log(fCurrentDetector.Data(), "Process - Starting processing");
1502 Log("SHUTTLE", "Process - ERROR: Forking failed");
1507 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1508 GetCurrentRun(), aDetector->GetName()));
1510 Long_t begin = time(0);
1512 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1513 while (waitpid(pid, &status, WNOHANG) == 0)
1515 Long_t expiredTime = time(0) - begin;
1517 if (expiredTime > fConfig->GetPPTimeOut())
1520 tmp.Form("Process - Process of %s time out. "
1521 "Run time: %ld seconds. Killing...",
1522 fCurrentDetector.Data(), expiredTime);
1523 Log("SHUTTLE", tmp);
1524 Log(fCurrentDetector, tmp);
1528 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1531 gSystem->Sleep(1000);
1535 gSystem->Sleep(1000);
1538 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1539 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1542 Log("SHUTTLE", Form("Process - Error: "
1543 "Could not open pipe to %s", checkStr.Data()));
1548 if (!fgets(buffer, 100, pipe))
1550 Log("SHUTTLE", "Process - Error: ps did not return anything");
1551 gSystem->ClosePipe(pipe);
1554 gSystem->ClosePipe(pipe);
1556 //Log("SHUTTLE", Form("ps returned %s", buffer));
1559 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1561 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1565 if (expiredTime % 60 == 0)
1566 Log("SHUTTLE", Form("Process - %s: Checking process. "
1567 "Run time: %ld seconds - Memory consumption: %d KB",
1568 fCurrentDetector.Data(), expiredTime, mem));
1570 if (mem > fConfig->GetPPMaxMem())
1573 tmp.Form("Process - Process exceeds maximum allowed memory "
1574 "(%d KB > %d KB). Killing...",
1575 mem, fConfig->GetPPMaxMem());
1576 Log("SHUTTLE", tmp);
1577 Log(fCurrentDetector, tmp);
1581 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1584 gSystem->Sleep(1000);
1589 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1590 GetCurrentRun(), aDetector->GetName()));
1592 if (WIFEXITED(status))
1594 Int_t returnCode = WEXITSTATUS(status);
1596 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1599 if (returnCode == 0) hasError = kTRUE;
1605 Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1606 aDetector->GetName()));
1608 Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1610 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1612 Log("SHUTTLE", "Process - Could not freopen stdout");
1616 fOutputRedirected = kTRUE;
1617 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1618 Log("SHUTTLE", "Process - Could not redirect stderr");
1622 TString wd = gSystem->WorkingDirectory();
1623 TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(),
1624 fCurrentDetector.Data(), GetCurrentRun());
1626 Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1627 if (!result) // temp dir already exists!
1629 Log(fCurrentDetector.Data(),
1630 Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1631 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1634 if (gSystem->mkdir(tmpDir.Data(), 1))
1636 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1640 if (!gSystem->ChangeDirectory(tmpDir.Data()))
1642 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1646 Bool_t success = ProcessCurrentDetector();
1648 gSystem->ChangeDirectory(wd.Data());
1650 if (success) // Preprocessor finished successfully!
1652 // remove temporary folder
1653 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1655 // Update time_processed field in FXS DB
1656 if (UpdateTable() == kFALSE)
1657 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1658 fCurrentDetector.Data()));
1660 // Transfer the data from local storage to main storage (Grid)
1661 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1662 if (StoreOCDB() == kFALSE)
1665 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1666 GetCurrentRun(), aDetector->GetName()));
1667 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1671 Form("\t\t\t****** run %d - %s: DONE ******",
1672 GetCurrentRun(), aDetector->GetName()));
1673 UpdateShuttleStatus(AliShuttleStatus::kDone);
1674 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1679 Form("\t\t\t****** run %d - %s: PP ERROR ******",
1680 GetCurrentRun(), aDetector->GetName()));
1683 for (UInt_t iSys=0; iSys<3; iSys++)
1685 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1688 Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1689 GetCurrentRun(), aDetector->GetName(), success));
1691 // the client exits here
1692 gSystem->Exit(success);
1694 AliError("We should never get here!!!");
1698 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1701 //check if shuttle is done for this run, if so update logbook
1702 TObjArray checkEntryArray;
1703 checkEntryArray.SetOwner(1);
1704 TString whereClause = Form("where run=%d", GetCurrentRun());
1705 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1706 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1708 return hasError == kFALSE;
1711 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1712 (checkEntryArray.At(0));
1716 if (checkEntry->IsDone())
1718 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1719 UpdateShuttleLogbook("shuttle_done");
1723 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1725 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1727 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1728 checkEntry->GetRun(), GetDetName(iDet)));
1729 fFirstUnprocessed[iDet] = kFALSE;
1735 // remove ML instance
1741 return hasError == kFALSE;
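/* Illustrative sketch, not part of AliShuttle: the parent/child pattern used
in Process (fork, then poll the child with waitpid and WNOHANG until it ends
or a time limit is exceeded) can be reduced to the helper below. The name
RunWithTimeout, the negative return codes, the use of SIGKILL and the 100 ms
polling interval are assumptions made for this example. The memory check done
through ps is left out.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Runs `work` in a child process and monitors it from the parent.
// Returns the child's exit code, -1 if fork failed, -2 if the child was
// killed after exceeding timeoutSeconds, -3 if it ended abnormally.
int RunWithTimeout(int (*work)(), long timeoutSeconds)
{
    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(work());            // child: never returns

    const time_t begin = time(nullptr);
    int status = 0;                         // plain int, as waitpid expects
    pid_t ended = 0;
    while ((ended = waitpid(pid, &status, WNOHANG)) == 0) {
        if (time(nullptr) - begin > timeoutSeconds) {
            kill(pid, SIGKILL);
            waitpid(pid, &status, 0);       // reap the killed child
            return -2;
        }
        usleep(100 * 1000);                 // poll every 100 ms
    }
    if (ended < 0 || !WIFEXITED(status)) return -3;
    return WEXITSTATUS(status);
}
```

Note that in Process the client exits with `success`, so the parent treats a
return code of 0 as an error; a caller of this sketch would have to apply the
same convention if it mirrors that behaviour. */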
1744 //______________________________________________________________________________________________
1745 Bool_t AliShuttle::ProcessCurrentDetector()
1748 // Retrieves data just for a specific detector (fCurrentDetector).
1749 // There should be a configuration for this detector.
1751 Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d",
1752 fCurrentDetector.Data(), GetCurrentRun()));
1754 TString wd = gSystem->WorkingDirectory();
1756 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1759 gSystem->ChangeDirectory(wd.Data());
1761 TMap* dcsMap = new TMap();
1763 // call preprocessor
1764 AliPreprocessor* aPreprocessor =
1765 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1767 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1769 Bool_t processDCS = aPreprocessor->ProcessDCS();
1773 Log(fCurrentDetector, "ProcessCurrentDetector -"
1774 " The preprocessor requested to skip the retrieval of DCS values");
1776 else if (fTestMode & kSkipDCS)
1778 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1780 else if (fTestMode & kErrorDCS)
1782 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1783 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1784 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1789 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1791 // Query DCS archive
1792 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1794 for (int iServ=0; iServ<nServers; iServ++)
1797 TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1798 Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1799 Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1801 Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1802 " Querying DCS Amanda server %s:%d (%d of %d)",
1803 host.Data(), port, iServ+1, nServers));
1808 if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1810 aliasMap = GetValueSet(host, port,
1811 fConfig->GetDCSAliases(fCurrentDetector, iServ),
1812 kAlias, multiSplit);
1815 Log(fCurrentDetector,
1816 Form("ProcessCurrentDetector -"
1817 " Error retrieving DCS aliases from server %s."
1818 " Sending mail to DCS experts!", host.Data()));
1819 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1821 if (!SendMailToDCS())
1822 Log("SHUTTLE", "ProcessCurrentDetector - Could not send mail to DCS experts!");
1829 if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1831 dpMap = GetValueSet(host, port,
1832 fConfig->GetDCSDataPoints(fCurrentDetector, iServ),
1836 Log(fCurrentDetector,
1837 Form("ProcessCurrentDetector -"
1838 " Error retrieving DCS data points from server %s."
1839 " Sending mail to DCS experts!", host.Data()));
1840 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1842 if (!SendMailToDCS())
1843 Log("SHUTTLE", "ProcessCurrentDetector - Could not send mail to DCS experts!");
1845 if (aliasMap) delete aliasMap;
1851 // merge aliasMap and dpMap into dcsMap
1853 TIter iter(aliasMap);
1854 TObjString* key = 0;
1855 while ((key = (TObjString*) iter.Next()))
1856 dcsMap->Add(key, aliasMap->GetValue(key->String()));
1858 aliasMap->SetOwner(kFALSE);
1864 TObjString* key = 0;
1865 while ((key = (TObjString*) iter.Next()))
1866 dcsMap->Add(key, dpMap->GetValue(key->String()));
1868 dpMap->SetOwner(kFALSE);
1874 // save map into file, to help debugging in case of preprocessor error
1875 TFile* f = TFile::Open("DCSMap.root","recreate");
1877 dcsMap->Write("DCSMap", TObject::kSingleKey);
1881 // DCS Archive DB processing successful. Call Preprocessor!
1882 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1884 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1886 if (returnValue > 0) // Preprocessor error!
1888 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1889 "Preprocessor failed. Process returned %d.", returnValue));
1890 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1891 dcsMap->DeleteAll();
1897 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1898 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1899 fCurrentDetector.Data()));
1901 dcsMap->DeleteAll();
1907 //______________________________________________________________________________________________
1908 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1911 // Queries DAQ's Shuttle logbook and fills the detector status object.
1912 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1915 entries.SetOwner(1);
1917 // check connection; connect if necessary
1918 if(!Connect(3)) return kFALSE;
1921 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1923 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1925 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1929 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1931 if(aResult->GetRowCount() == 0) {
1932 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1937 // TODO Check field count!
1938 const UInt_t nCols = 23;
1939 if (aResult->GetFieldCount() != (Int_t) nCols) {
1940 Log("SHUTTLE", "Invalid SQL result field number!");
1946 while ((aRow = aResult->Next())) {
1947 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1948 Int_t run = runString.Atoi();
1950 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1954 // loop on detectors
1955 for(UInt_t ii = 0; ii < nCols; ii++)
1956 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1958 entries.AddLast(entry);
1966 //______________________________________________________________________________________________
1967 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1970 // Retrieves run parameters written in the DAQ logbook and sets them in the AliShuttleLogbookEntry object
1973 // check connection; connect if necessary
1978 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1980 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1982 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
1986 if (aResult->GetRowCount() == 0) {
1987 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1992 if (aResult->GetRowCount() > 1) {
1993 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
1994 "more than one entry in DAQ Logbook for run %d!", run));
1999 TSQLRow* aRow = aResult->Next();
2002 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2007 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2009 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2010 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2012 UInt_t startTime = entry->GetStartTime();
2013 UInt_t endTime = entry->GetEndTime();
2015 if (!startTime || !endTime || startTime > endTime) {
2017 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
2018 run, startTime, endTime));
2031 //______________________________________________________________________________________________
2032 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2033 DCSType type, Int_t multiSplit)
2035 // Retrieves all the data points listed in "entries" from the DCS server
2036 // host, port: TSocket connection parameters
2037 // entries: list of alias or data point names
2038 // type: kAlias or kDP
2039 // returns TMap of values, 0 when failure
2041 AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2046 result = client.GetAliasValues(entries, GetCurrentStartTime(),
2047 GetCurrentEndTime());
2049 else if (type == kDP)
2051 result = client.GetDPValues(entries, GetCurrentStartTime(),
2052 GetCurrentEndTime());
2057 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2058 client.GetErrorString(client.GetResultErrorCode())));
2059 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)
2060 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2061 client.GetServerError().Data()));
2069 //______________________________________________________________________________________________
2070 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2071 const char* id, const char* source)
2073 // Get calibration file from file exchange servers
2074 // First queries the FXS database for the file name, using the run, detector, id and source info,
2075 // then calls RetrieveFile(filename) for the actual copy to local disk
2076 // run: current run being processed (given by Logbook entry fLogbookEntry)
2077 // detector: the Preprocessor name
2078 // id: provided as a parameter by the Preprocessor
2079 // source: provided by the Preprocessor through GetFileSources function
2081 // check if test mode should simulate a FXS error
2082 if (fTestMode & kErrorFXSFiles)
2084 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2088 // check connection; connect if necessary
2089 if (!Connect(system))
2091 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2095 // Query preparation
2096 TString sourceName(source);
2098 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2099 fConfig->GetFXSdbTable(system));
2100 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2101 GetCurrentRun(), detector, id);
2105 whereClause += Form(" and DAQsource=\"%s\"", source);
2107 else if (system == kDCS)
2111 else if (system == kHLT)
2113 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2117 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2119 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2122 TSQLResult* aResult = 0;
2123 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2125 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
2126 GetSystemName(system), id, sourceName.Data()));
2130 if(aResult->GetRowCount() == 0)
2133 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
2134 GetSystemName(system), id, sourceName.Data()));
2139 if (aResult->GetRowCount() > 1) {
2141 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
2142 GetSystemName(system), id, sourceName.Data()));
2147 if (aResult->GetFieldCount() != nFields) {
2149 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
2150 GetSystemName(system), id, sourceName.Data()));
2155 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2158 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
2159 GetSystemName(system), id, sourceName.Data()));
2164 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2165 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2166 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2171 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2172 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2174 // retrieved file is renamed to make it unique
2175 TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2176 GetShuttleTempDir(), detector, GetCurrentRun(),
2177 GetSystemName(system), detector, GetCurrentRun(),
2178 id, sourceName.Data());
2181 // file retrieval from FXS
2182 UInt_t nRetries = 0;
2183 UInt_t maxRetries = 3;
2184 Bool_t result = kFALSE;
2186 // copy!! if successful TSystem::Exec returns 0
2187 while(nRetries++ < maxRetries) {
2188 AliDebug(2, Form("Trying to copy file. Retry # %u", nRetries));
2189 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2192 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
2193 filePath.Data(), GetSystemName(system)));
2197 if (fileChecksum.Length()>0)
2199 // compare md5sum of local file with the one stored in the FXS DB
2200 Int_t md5Comp = gSystem->Exec(Form("md5sum %s |grep %s > /dev/null 2>&1",
2201 localFileName.Data(), fileChecksum.Data()));
2205 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
2211 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2212 filePath.Data(), GetSystemName(system)));
2217 if(!result) return 0;
2219 fFXSCalled[system]=kTRUE;
2220 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2221 fFXSlist[system].Add(fileParams);
2223 static TString staticLocalFileName;
2224 staticLocalFileName.Form("%s", localFileName.Data());
2226 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2227 "source %s from %s to %s", id, source,
2228 GetSystemName(system), localFileName.Data()));
2230 return staticLocalFileName.Data();
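/* Illustrative sketch, not part of AliShuttle: the copy-and-verify loop in
GetFile can be expressed independently of scp and md5sum by passing the two
operations in as callables. FetchWithRetries and its parameters are
hypothetical names; the behaviour assumed here is that a failed copy or a
checksum mismatch triggers another attempt, and that an empty expected
checksum skips the comparison.

```cpp
#include <functional>
#include <string>

// Tries `copy` up to maxRetries times. An attempt succeeds when the copy
// works and, if an expected checksum is known, `checksumOf` returns the
// same value for the local file.
bool FetchWithRetries(const std::function<bool()>& copy,
                      const std::function<std::string()>& checksumOf,
                      const std::string& expectedChecksum,
                      unsigned maxRetries = 3)
{
    for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
        if (!copy()) continue;                      // copy failed: retry
        if (expectedChecksum.empty()) return true;  // nothing to compare
        if (checksumOf() == expectedChecksum) return true;
        // checksum mismatch: fall through and retry
    }
    return false;
}
```

Injecting the copy and checksum steps keeps the retry policy testable
without network access or external commands. */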
2233 //______________________________________________________________________________________________
2234 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2237 // Copies file from FXS to local Shuttle machine
2240 // check the temp directory: if it does not exist, create it
2241 AliDebug(2, Form("Copy file %s from %s FXS into %s",
2242 fxsFileName, GetSystemName(system), localFileName));
2244 TString tmpDir(localFileName);
2246 tmpDir = tmpDir(0,tmpDir.Last('/'));
2248 Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
2249 if (noDir) // temp dir does not exist!
2251 if (gSystem->mkdir(tmpDir.Data(), 1))
2253 Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2258 TString baseFXSFolder;
2261 baseFXSFolder = "FES/";
2263 else if (system == kDCS)
2267 else if (system == kHLT)
2269 baseFXSFolder = "/opt/FXS/";
2273 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2274 fConfig->GetFXSPort(system),
2275 fConfig->GetFXSUser(system),
2276 fConfig->GetFXSHost(system),
2277 baseFXSFolder.Data(),
2281 AliDebug(2, Form("%s",command.Data()));
2283 Bool_t result = (gSystem->Exec(command.Data()) == 0);
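/* Editorial sketch, not part of AliShuttle: RetrieveFile derives the target
   directory by cutting the local file name at its last '/'. A plain C++
   equivalent of that step could look as follows. ParentDir is a hypothetical
   name, and the choice to return an empty string when no '/' is present is an
   assumption of this sketch.

```cpp
#include <string>

// Hypothetical helper: returns everything before the last '/' in `path`,
// or an empty string when `path` contains no '/'.
inline std::string ParentDir(const std::string& path)
{
    const std::string::size_type pos = path.rfind('/');
    return pos == std::string::npos ? std::string() : path.substr(0, pos);
}
```
*/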
2288 //______________________________________________________________________________________________
2289 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2292 // Get sources producing the condition file Id from file exchange servers
2293 // if id is NULL all sources are returned (distinct)
2296 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2298 // check if test mode should simulate a FXS error
2299 if (fTestMode & kErrorFXSSources)
2301 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2307 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2308 TList *list = new TList();
2310 list->Add(new TObjString(" "));
2314 // check connection; connect if needed
2315 if (!Connect(system))
2317 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2321 TString sourceName; // stays empty unless the system defines a source column
2324 sourceName = "DAQsource";
2325 } else if (system == kHLT)
2327 sourceName = "DDLnumbers";
2330 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2331 TString whereClause = Form("run=%d and detector=\"%s\"",
2332 GetCurrentRun(), detector);
2334 whereClause += Form(" and fileId=\"%s\"", id);
2335 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2337 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2340 TSQLResult* aResult;
2341 aResult = fServer[system]->Query(sqlQuery);
2343 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2344 GetSystemName(system), id));
2348 TList *list = new TList();
2351 if (aResult->GetRowCount() == 0)
2354 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2359 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2362 while ((aRow = aResult->Next()))
2365 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2366 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2367 list->Add(new TObjString(source));
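/* Editorial sketch, not part of AliShuttle: GetFileSources assembles a
   "select distinct" query whose fileId filter is only added when an id is
   given. The string assembly can be sketched in plain C++ as below.
   BuildSourceQuery and the table name used in any call are hypothetical.
   Like the original, the sketch interpolates values directly and therefore
   assumes trusted input.

```cpp
#include <string>

// Hypothetical helper: builds the query text. A null `id` means "all sources".
inline std::string BuildSourceQuery(const std::string& table,
                                    const std::string& sourceColumn,
                                    int run,
                                    const std::string& detector,
                                    const char* id)
{
    std::string query = "select distinct " + sourceColumn + " from " + table +
                        " where run=" + std::to_string(run) +
                        " and detector=\"" + detector + "\"";
    if (id) {
        // restrict the result to one condition file
        query += std::string(" and fileId=\"") + id + "\"";
    }
    return query;
}
```
*/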
2376 //______________________________________________________________________________________________
2377 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2380 // Get all ids of condition files produced by a given source from file exchange servers
2383 Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2385 // check if test mode should simulate a FXS error
2386 if (fTestMode & kErrorFXSSources)
2388 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2392 // check connection; connect if needed
2393 if (!Connect(system))
2395 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2399 TString sourceName; // stays empty unless the system defines a source column
2402 sourceName = "DAQsource";
2403 } else if (system == kHLT)
2405 sourceName = "DDLnumbers";
2408 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2409 TString whereClause = Form("run=%d and detector=\"%s\"",
2410 GetCurrentRun(), detector);
2411 if (sourceName.Length() > 0 && source)
2412 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2413 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2415 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2418 TSQLResult* aResult;
2419 aResult = fServer[system]->Query(sqlQuery);
2421 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2422 GetSystemName(system), source));
2426 TList *list = new TList();
2429 if (aResult->GetRowCount() == 0)
2432 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2437 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2441 while ((aRow = aResult->Next()))
2444 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2445 AliDebug(2, Form("fileId = %s", id.Data()));
2446 list->Add(new TObjString(id));
2455 //______________________________________________________________________________________________
2456 Bool_t AliShuttle::Connect(Int_t system)
2458 // Connect to the MySQL server hosting the given system's FXS database
2459 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2462 // check connection: if already connected return
2463 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2465 TString dbHost, dbUser, dbPass, dbName;
2467 if (system < 3) // FXS db servers
2469 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2470 dbUser = fConfig->GetFXSdbUser(system);
2471 dbPass = fConfig->GetFXSdbPass(system);
2472 dbName = fConfig->GetFXSdbName(system);
2473 } else { // Run & Shuttle logbook servers
2474 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2475 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2476 dbUser = fConfig->GetDAQlbUser();
2477 dbPass = fConfig->GetDAQlbPass();
2478 dbName = fConfig->GetDAQlbDB();
2481 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2482 if (!fServer[system] || !fServer[system]->IsConnected()) {
2485 AliError(Form("Can't establish connection to FXS database for %s",
2486 AliShuttleInterface::GetSystemName(system)));
2488 AliError("Can't establish connection to Run logbook.");
2490 if(fServer[system]) delete fServer[system];
2495 TSQLResult* aResult=0;
2498 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2501 aResult = fServer[kDCS]->GetTables(dbName.Data());
2504 aResult = fServer[kHLT]->GetTables(dbName.Data());
2507 aResult = fServer[3]->GetTables(dbName.Data());
2515 //______________________________________________________________________________________________
2516 Bool_t AliShuttle::UpdateTable()
2519 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2522 Bool_t result = kTRUE;
2524 for (UInt_t system=0; system<3; system++)
2526 if(!fFXSCalled[system]) continue;
2528 // check connection; connect if needed
2529 if (!Connect(system))
2531 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2536 TTimeStamp now; // now
2538 // Loop on FXS list entries
2539 TIter iter(&fFXSlist[system]);
2540 TObjString *aFXSentry=0;
2541 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2543 TString aFXSentrystr = aFXSentry->String();
2544 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2545 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2547 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2548 GetSystemName(system), aFXSentrystr.Data()));
2549 if(aFXSarray) delete aFXSarray;
2553 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2554 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2556 TString whereClause;
2559 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2560 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2562 else if (system == kDCS)
2564 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2565 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2567 else if (system == kHLT)
2569 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2570 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2575 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2576 (Int_t) now.GetSec(), whereClause.Data()); // cast: GetSec() returns time_t, format is %d
2578 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2581 TSQLResult* aResult;
2582 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2585 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2586 GetSystemName(system), sqlQuery.Data()));
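/* Editorial sketch, not part of AliShuttle: the FXS list entries are stored
   as "id#!?!#source" and split again in UpdateTable. As far as the ROOT
   documentation describes it, TString::Tokenize treats its argument as a set
   of delimiter characters, so an id or source containing '#', '!' or '?'
   would yield more than two tokens. The sketch below splits on the complete
   separator string instead. SplitEntry is a hypothetical name.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits "id#!?!#source" at the first occurrence of the
// full separator. Returns false (leaving `out` untouched) if it is absent.
inline bool SplitEntry(const std::string& entry,
                       std::pair<std::string, std::string>& out)
{
    static const std::string kSeparator = "#!?!#";
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;
    out.first  = entry.substr(0, pos);                   // file id
    out.second = entry.substr(pos + kSeparator.size());  // source name
    return true;
}
```
*/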
2597 //______________________________________________________________________________________________
2598 Bool_t AliShuttle::UpdateTableFailCase()
2600 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2601 // this is called in case the preprocessor is declared failed for the current run, because
2602 // the fields are updated only in case of success
2604 Bool_t result = kTRUE;
2606 for (UInt_t system=0; system<3; system++)
2608 // check connection; connect if needed
2609 if (!Connect(system))
2611 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2612 GetSystemName(system)));
2617 TTimeStamp now; // now
2619 // Update all rows of the current run and detector (no per-file loop here)
2621 TString whereClause = Form("where run=%d and detector=\"%s\";",
2622 GetCurrentRun(), fCurrentDetector.Data());
2625 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2626 (Int_t) now.GetSec(), whereClause.Data()); // cast: GetSec() returns time_t, format is %d
2628 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2631 TSQLResult* aResult;
2632 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2635 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2636 GetSystemName(system), sqlQuery.Data()));
2646 //______________________________________________________________________________________________
2647 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2650 // Update Shuttle logbook filling detector or shuttle_done column
2651 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2654 // check connection; connect if needed
2656 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2660 TString detName(detector);
2662 if(detName == "shuttle_done")
2664 setClause = "set shuttle_done=1";
2666 // Send the information to ML
2667 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2670 mlList.Add(&mlStatus);
2672 fMonaLisa->SendParameters(&mlList);
2674 TString statusStr(status);
2675 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2676 statusStr.Contains("failed", TString::kIgnoreCase)){
2677 setClause = Form("set %s=\"%s\"", detector, status);
2680 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2686 TString whereClause = Form("where run=%d", GetCurrentRun());
2688 TString sqlQuery = Form("update %s %s %s",
2689 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2691 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2694 TSQLResult* aResult;
2695 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2697 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2705 //______________________________________________________________________________________________
2706 Int_t AliShuttle::GetCurrentRun() const
2709 // Get current run from logbook entry
2712 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2715 //______________________________________________________________________________________________
2716 UInt_t AliShuttle::GetCurrentStartTime() const
2719 // get current start time
2722 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2725 //______________________________________________________________________________________________
2726 UInt_t AliShuttle::GetCurrentEndTime() const
2729 // get current end time from logbook entry
2732 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2735 //______________________________________________________________________________________________
2736 UInt_t AliShuttle::GetCurrentYear() const
2739 // Get current year from logbook entry
2742 if (!fLogbookEntry) return 0;
2744 TTimeStamp startTime(GetCurrentStartTime());
2745 TString year = Form("%d",startTime.GetDate());
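/* Editorial sketch, not part of AliShuttle: TTimeStamp::GetDate() returns the
   date as an integer of the form YYYYMMDD, so the year is its leading four
   digits. A string-free way to obtain it is integer division, sketched here.
   YearFromDate is a hypothetical name.

```cpp
// Hypothetical helper: extracts the year from a date encoded as YYYYMMDD.
inline unsigned YearFromDate(unsigned yyyymmdd)
{
    return yyyymmdd / 10000;  // drops the four MMDD digits
}
```
*/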
2751 //______________________________________________________________________________________________
2752 const char* AliShuttle::GetLHCPeriod() const
2755 // Get current LHC period from logbook entry
2758 if (!fLogbookEntry) return 0;
2760 return fLogbookEntry->GetRunParameter("LHCperiod");
2763 //______________________________________________________________________________________________
2764 void AliShuttle::Log(const char* detector, const char* message)
2767 // Fill log string with a message
2770 TString logRunDir = GetShuttleLogDir();
2771 if (GetCurrentRun() >=0)
2772 logRunDir += Form("/%d", GetCurrentRun());
2774 void* dir = gSystem->OpenDirectory(logRunDir.Data());
2776 if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
2777 AliError(Form("Can't open directory <%s>", logRunDir.Data()));
2782 gSystem->FreeDirectory(dir);
2785 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2786 if (GetCurrentRun() >= 0)
2787 toLog += Form("run %d - ", GetCurrentRun());
2788 toLog += Form("%s", message);
2790 AliInfo(toLog.Data());
2792 // if the log output is already redirected to the file, return here
2793 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2796 TString fileName = GetLogFileName(detector);
2798 gSystem->ExpandPathName(fileName);
2801 logFile.open(fileName, ofstream::out | ofstream::app);
2803 if (!logFile.is_open()) {
2804 AliError(Form("Could not open file %s", fileName.Data()));
2808 logFile << toLog.Data() << "\n";
2813 //______________________________________________________________________________________________
2814 TString AliShuttle::GetLogFileName(const char* detector) const
2817 // returns the name of the log file for a given sub detector
2822 if (GetCurrentRun() >= 0)
2824 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
2825 detector, GetCurrentRun());
2827 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2833 //______________________________________________________________________________________________
2834 Bool_t AliShuttle::Collect(Int_t run)
2837 // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
2838 // If a dedicated run is given this run is processed
2840 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2844 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2846 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2848 SetLastAction("Starting");
2850 TString whereClause("where shuttle_done=0");
2852 whereClause += Form(" and run=%d", run);
2854 TObjArray shuttleLogbookEntries;
2855 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2857 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2861 if (shuttleLogbookEntries.GetEntries() == 0)
2864 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2866 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2867 "or it does not exist in Shuttle logbook", run));
2871 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2872 fFirstUnprocessed[iDet] = kTRUE;
2876 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2877 // flag them into fFirstUnprocessed array
2878 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2879 TObjArray tmpLogbookEntries;
2880 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2882 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2886 TIter iter(&tmpLogbookEntries);
2887 AliShuttleLogbookEntry* anEntry = 0;
2888 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2890 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2892 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2894 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2895 anEntry->GetRun(), GetDetName(iDet)));
2896 fFirstUnprocessed[iDet] = kFALSE;
2904 if (!RetrieveConditionsData(shuttleLogbookEntries))
2906 Log("SHUTTLE", "Collect - Process of at least one run failed");
2910 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
2914 //______________________________________________________________________________________________
2915 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2918 // Retrieve conditions data for all runs that aren't processed yet
2921 Bool_t hasError = kFALSE;
2923 TIter iter(&dateEntries);
2924 AliShuttleLogbookEntry* anEntry;
2926 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2927 if (!Process(anEntry)){
2931 // clean SHUTTLE temp directory
2932 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2933 //RemoveFile(filename.Data());
2936 return hasError == kFALSE;
2939 //______________________________________________________________________________________________
2940 ULong_t AliShuttle::GetTimeOfLastAction() const
2943 // Gets time of last action
2948 fMonitoringMutex->Lock();
2950 tmp = fLastActionTime;
2952 fMonitoringMutex->UnLock();
2957 //______________________________________________________________________________________________
2958 const TString AliShuttle::GetLastAction() const
2961 // returns a string description of the last action
2966 fMonitoringMutex->Lock();
2970 fMonitoringMutex->UnLock();
2975 //______________________________________________________________________________________________
2976 void AliShuttle::SetLastAction(const char* action)
2979 // updates the monitoring variables
2982 fMonitoringMutex->Lock();
2984 fLastAction = action;
2985 fLastActionTime = time(0);
2987 fMonitoringMutex->UnLock();
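/* Editorial sketch, not part of AliShuttle: the monitoring accessors above
   pair every Lock() with an explicit UnLock(). With the C++ standard library
   the same protection can be written with a scoped guard, which releases the
   mutex on every exit path. ActionMonitor and its members are hypothetical
   stand-ins, not the real class layout.

```cpp
#include <ctime>
#include <mutex>
#include <string>

// Hypothetical stand-in for the monitoring state guarded by a mutex.
class ActionMonitor {
public:
    void Set(const std::string& action)
    {
        std::lock_guard<std::mutex> lock(fMutex);  // released at scope exit
        fLastAction = action;
        fLastActionTime = std::time(nullptr);
    }

    std::string Get() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastAction;  // copied while the lock is held
    }

    std::time_t Time() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastActionTime;
    }

private:
    mutable std::mutex fMutex;  // mutable so const getters can lock it
    std::string fLastAction;
    std::time_t fLastActionTime = 0;
};
```
*/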
2990 //______________________________________________________________________________________________
2991 const char* AliShuttle::GetRunParameter(const char* param)
2994 // returns run parameter read from DAQ logbook
2997 if(!fLogbookEntry) {
2998 AliError("No logbook entry!");
3002 return fLogbookEntry->GetRunParameter(param);
3005 //______________________________________________________________________________________________
3006 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
3009 // returns object from OCDB valid for current run
3012 if (fTestMode & kErrorOCDB)
3014 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
3018 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
3021 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
3025 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
3028 //______________________________________________________________________________________________
3029 Bool_t AliShuttle::SendMail()
3032 // sends a mail to the subdetector expert in case of preprocessor error
3035 if (fTestMode != kNone)
3038 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3041 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3043 Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
3048 gSystem->FreeDirectory(dir);
3051 TString bodyFileName;
3052 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3053 gSystem->ExpandPathName(bodyFileName);
3056 mailBody.open(bodyFileName, ofstream::out);
3058 if (!mailBody.is_open())
3060 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3065 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3066 TObjString *anExpert=0;
3067 while ((anExpert = (TObjString*) iterExperts.Next()))
3069 to += Form("%s,", anExpert->GetName());
3071 to.Remove(to.Length()-1);
3072 AliDebug(2, Form("to: %s",to.Data()));
3075 Log("SHUTTLE", "List of detector responsibles not yet set!");
3079 TString cc="alberto.colla@cern.ch";
3081 TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
3082 fCurrentDetector.Data(), GetCurrentRun());
3083 AliDebug(2, Form("subject: %s", subject.Data()));
3085 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3086 body += Form("SHUTTLE just detected that your preprocessor "
3087 "failed processing run %d!!\n\n", GetCurrentRun());
3088 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3089 fCurrentDetector.Data());
3090 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3092 TString logFolder = "logs";
3093 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3094 logFolder += "_PROD";
3097 body += Form("Find the %s log for the current run on \n\n"
3098 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3099 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3100 fCurrentDetector.Data(), GetCurrentRun());
3101 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3103 AliDebug(2, Form("Body begin: %s", body.Data()));
3105 mailBody << body.Data();
3107 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3109 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(),
3110 GetCurrentRun(), fCurrentDetector.Data(), GetCurrentRun());
3111 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3112 if (gSystem->Exec(tailCommand.Data()))
3114 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3117 TString endBody = Form("------------------------------------------------------\n\n");
3118 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3119 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3120 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3122 AliDebug(2, Form("Body end: %s", endBody.Data()));
3124 mailBody << endBody.Data();
3129 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3133 bodyFileName.Data());
3134 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3136 Bool_t result = gSystem->Exec(mailCommand.Data());
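/* Editorial sketch, not part of AliShuttle: SendMail builds the recipient
   list by appending "address," for each expert and then removing the final
   comma. Inserting the separator only between elements avoids the trailing
   comma and also handles an empty list. JoinRecipients is a hypothetical
   name, and the addresses in any call are placeholders.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins addresses with ',' and no trailing separator.
inline std::string JoinRecipients(const std::vector<std::string>& addresses)
{
    std::string to;
    for (const std::string& address : addresses) {
        if (!to.empty()) to += ',';  // separator only between entries
        to += address;
    }
    return to;
}
```
*/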
3141 //______________________________________________________________________________________________
3142 Bool_t AliShuttle::SendMailToDCS()
3145 // sends a mail to the DCS experts in case of DCS error
3148 if (fTestMode != kNone)
3151 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3154 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3156 Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3161 gSystem->FreeDirectory(dir);
3164 TString bodyFileName;
3165 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3166 gSystem->ExpandPathName(bodyFileName);
3169 mailBody.open(bodyFileName, ofstream::out);
3171 if (!mailBody.is_open())
3173 Log("SHUTTLE", Form("SendMailToDCS - Could not open mail body file %s", bodyFileName.Data()));
3177 TString to="Vladimir.Fekete@cern.ch, Svetozar.Kapusta@cern.ch";
3178 //TString to="alberto.colla@cern.ch";
3179 AliDebug(2, Form("to: %s",to.Data()));
3182 Log("SHUTTLE", "List of detector responsibles not yet set!");
3186 TString cc="alberto.colla@cern.ch";
3188 TString subject = Form("Retrieval of data points for %s FAILED in run %d !",
3189 fCurrentDetector.Data(), GetCurrentRun());
3190 AliDebug(2, Form("subject: %s", subject.Data()));
3192 TString body = Form("Dear DCS experts, \n\n");
3193 body += Form("SHUTTLE couldn\'t retrieve the data points for detector %s "
3194 "in run %d!!\n\n", fCurrentDetector.Data(), GetCurrentRun());
3195 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3196 fCurrentDetector.Data());
3197 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3199 TString logFolder = "logs";
3200 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3201 logFolder += "_PROD";
3204 body += Form("Find the %s log for the current run on \n\n"
3205 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3206 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3207 fCurrentDetector.Data(), GetCurrentRun());
3208 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3210 AliDebug(2, Form("Body begin: %s", body.Data()));
3212 mailBody << body.Data();
3214 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3216 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3217 fCurrentDetector.Data(), GetCurrentRun());
3218 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3219 if (gSystem->Exec(tailCommand.Data()))
3221 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3224 TString endBody = Form("------------------------------------------------------\n\n");
3225 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3226 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3227 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3229 AliDebug(2, Form("Body end: %s", endBody.Data()));
3231 mailBody << endBody.Data();
3236 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3240 bodyFileName.Data());
3241 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3243 Bool_t result = gSystem->Exec(mailCommand.Data());
3248 //______________________________________________________________________________________________
3249 const char* AliShuttle::GetRunType()
3252 // returns run type read from "run type" logbook
3255 if(!fLogbookEntry) {
3256 AliError("No logbook entry!");
3260 return fLogbookEntry->GetRunType();
3263 //______________________________________________________________________________________________
3264 Bool_t AliShuttle::GetHLTStatus()
3266 // Return HLT status (ON=1 OFF=0)
3267 // Converts the HLT status from the status string read in the run logbook (not just a bool)
3269 if(!fLogbookEntry) {
3270 AliError("No logbook entry!");
3274 // TODO implement when HLTStatus is inserted in run logbook
3275 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
3276 //if(hltStatus == "OFF") {return kFALSE};
3281 //______________________________________________________________________________________________
3282 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
3285 // sets Shuttle temp directory
3288 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
3291 //______________________________________________________________________________________________
3292 void AliShuttle::SetShuttleLogDir(const char* logDir)
3295 // sets Shuttle log directory
3298 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);