/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *                                                                        *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 *                                                                        *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/
/*
Revision 1.70 2007/12/12 13:45:35 acolla
Monalisa started in Collect() function. Alive message to monitor is sent at each Collect and every minute during preprocessor processing.

Revision 1.69 2007/12/12 10:06:29 acolla
in AliShuttle.cxx: SHUTTLE logbook is updated in case of invalid run times:

time_start==0 && time_end==0

logbook is NOT updated if time_start != 0 && time_end == 0, because it may mean that the run is still ongoing.

Revision 1.68 2007/12/11 10:15:17 acolla
Added marking SHUTTLE=DONE for invalid runs
(invalid start time or end time) and runs with totalEvents < 1

Revision 1.67 2007/12/07 19:14:36 acolla

Added automatic collection of new runs on a regular time basis (settable from the configuration)

in AliShuttleConfig: new members

- triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
- mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)

- logs now stored in logs/#RUN/DET_#RUN.log

Revision 1.66 2007/12/05 10:45:19 jgrosseo
changed order of arguments to TMonaLisaWriter

Revision 1.65 2007/11/26 16:58:37 acolla
Monalisa configuration added: host and table name

Revision 1.64 2007/11/13 16:15:47 acolla
DCS map is stored in a file in the temp folder where the detector is processed.
If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.

Revision 1.63 2007/11/02 10:53:16 acolla
Protection added to AliShuttle::CopyFileLocally

Revision 1.62 2007/10/31 18:23:13 acolla
Further development on the Shuttle:

- Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
are now built from /alice/data, e.g.:
/alice/data/2007/LHC07a/OCDB

the year and LHC period are taken from the Shuttle.
Raw metadata files are stored by GRP to:
/alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root

- Shuttle sends a mail to DCS experts each time DP retrieval fails.

Revision 1.61 2007/10/30 20:33:51 acolla
Improved managing of temporary folders, which weren't correctly handled.
Resolved bug introduced in StoreReferenceFile, which caused the SPD preprocessor to fail.

Revision 1.60 2007/10/29 18:06:16 acolla

New function StoreRunMetadataFile added to preprocessor and Shuttle interface
This function can be used by GRP only. It stores raw data tags merged file to the
raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).

1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
has been corrected in the root package on the Shuttle machine.

Revision 1.59 2007/10/05 12:40:55 acolla

Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).

Revision 1.58 2007/09/28 15:27:40 acolla

AliDCSClient "multiSplit" option added in the DCS configuration
in AliDCSMessage: variable MAX_BODY_SIZE set to 500000

Revision 1.57 2007/09/27 16:53:13 acolla
Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.

Revision 1.56 2007/09/14 16:46:14 jgrosseo
1) Connect and Close are called before and after each query, so one can
keep the same AliDCSClient object.
2) The splitting of a query is moved to GetDPValues/GetAliasValues.
3) Splitting interval can be specified in constructor

Revision 1.55 2007/08/06 12:26:40 acolla
Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
read from the run logbook.

Revision 1.54 2007/07/12 09:51:25 jgrosseo
removed duplicated log message in GetFile

Revision 1.53 2007/07/12 09:26:28 jgrosseo
updating hlt fxs base path

Revision 1.52 2007/07/12 08:06:45 jgrosseo
adding log messages in getfile... functions
adding not implemented copy constructor in alishuttleconfigholder

Revision 1.51 2007/07/03 17:24:52 acolla
root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.

Revision 1.50 2007/07/02 17:19:32 acolla
preprocessor is run in a temp directory that is removed when process is finished.

Revision 1.49 2007/06/29 10:45:06 acolla
Number of columns in MySql Shuttle logbook increased by one (HLT added)

Revision 1.48 2007/06/21 13:06:19 acolla
GetFileSources returns dummy list with 1 source if system=DCS (better than
returning error as it was)

Revision 1.47 2007/06/19 17:28:56 acolla
HLT updated; missing map bug removed.

Revision 1.46 2007/06/09 13:01:09 jgrosseo
Switching to retrieval of several DCS DPs at a time (multiDPrequest)

Revision 1.45 2007/05/30 06:35:20 jgrosseo
Adding functionality to the Shuttle/TestShuttle:
o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
o) Function to retrieve list of IDs for a given source (GetFileIDs)
These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
Example code has been added to the TestProcessor in TestShuttle

Revision 1.44 2007/05/11 16:09:32 acolla
Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
example: ITS/SPD/100_filename.root

Revision 1.43 2007/05/10 09:59:51 acolla
Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)

Revision 1.42 2007/05/03 08:01:39 jgrosseo
typo in last commit :-(

Revision 1.41 2007/05/03 08:00:48 jgrosseo
fixing log message when pp wants to skip dcs value retrieval

Revision 1.40 2007/04/27 07:06:48 jgrosseo
GetFileSources returns empty list in case of no files, but successful query
No mails sent in testmode

Revision 1.39 2007/04/17 12:43:57 acolla
Correction in StoreOCDB; change of text in mail to detector expert

Revision 1.38 2007/04/12 08:26:18 jgrosseo

Revision 1.37 2007/04/10 16:53:14 jgrosseo
redirecting sub detector stdout, stderr to sub detector log file

Revision 1.35 2007/04/04 16:26:38 acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. in AliShuttle.cxx: processing time and memory consumption info on a single line.

Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
If you always need DCS data (like before), you do not need to implement it.

6) The run type has been added to the monitoring page

Revision 1.33 2007/04/03 13:56:01 acolla
Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the

Revision 1.32 2007/02/28 10:41:56 acolla
Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
AliPreprocessor::GetRunType() function.
Added some ldap definition files.

Revision 1.30 2007/02/13 11:23:21 acolla
Moved getters and setters of Shuttle's main OCDB/Reference, local
OCDB/Reference, temp and log folders to AliShuttleInterface

Revision 1.27 2007/01/30 17:52:42 jgrosseo
adding monalisa monitoring

Revision 1.26 2007/01/23 19:20:03 acolla
Removed old ldif files, added TOF, MCH ldif files. Added some options in
AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and

Revision 1.25 2007/01/15 19:13:52 acolla
Moved some AliInfo to AliDebug in SendMail function

Revision 1.21 2006/12/07 08:51:26 jgrosseo

table, db names in ldap configuration
added GRP preprocessor
DCS data can also be retrieved by data point

Revision 1.20 2006/11/16 16:16:48 jgrosseo
introducing strict run ordering flag
removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)

Revision 1.19 2006/11/06 14:23:04 jgrosseo
major update (Alberto)
o) reading of run parameters from the logbook
o) online offline naming conversion
o) standalone DCSclient package

Revision 1.18 2006/10/20 15:22:59 jgrosseo
o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
o) Merging Collect, CollectAll, CollectNew function
o) Removing implementation of empty copy constructors (declaration still there!)

Revision 1.17 2006/10/05 16:20:55 jgrosseo
adapting to new CDB classes

Revision 1.16 2006/10/05 15:46:26 jgrosseo
applying to the new interface

Revision 1.15 2006/10/02 16:38:39 jgrosseo

storing of objects that failed to be stored to the grid before
interfacing of shuttle status table in daq system

Revision 1.14 2006/08/29 09:16:05 jgrosseo

Revision 1.13 2006/08/15 10:50:00 jgrosseo
effc++ corrections (alberto)

Revision 1.12 2006/08/08 14:19:29 jgrosseo
Update to shuttle classes (Alberto)

- Possibility to set the full object's path in the Preprocessor's and
Shuttle's Store functions
- Possibility to extend the object's run validity in the same classes
("startValidity" and "validityInfinite" parameters)
- Implementation of the StoreReferenceData function to store reference
data in a dedicated CDB storage.

Revision 1.11 2006/07/21 07:37:20 jgrosseo
last run is stored after each run

Revision 1.10 2006/07/20 09:54:40 jgrosseo
introducing status management: The processing per subdetector is divided into several steps,
after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
can keep track of the number of failures and skips further processing after a certain threshold is
exceeded. These thresholds can be configured in LDAP.

Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)

Revision 1.8 2006/07/11 12:44:36 jgrosseo
adding parameters for extended validity range of data produced by preprocessor

Revision 1.7 2006/07/10 14:37:09 jgrosseo
small fix + todo comment

Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)

Revision 1.5 2006/07/04 14:59:57 jgrosseo
revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2

Revision 1.4 2006/06/12 09:11:16 jgrosseo
coding conventions (Alberto)

Revision 1.3 2006/06/06 14:26:40 jgrosseo
o) removed files that were moved to STEER
o) shuttle updated to follow the new interface (Alberto)

Revision 1.2 2006/03/07 07:52:34 hristov
New version (B.Yordanov)

Revision 1.6 2005/11/19 17:19:14 byordano
RetrieveDATEEntries and RetrieveConditionsData added

Revision 1.5 2005/11/19 11:09:27 byordano
AliShuttle declaration added

Revision 1.4 2005/11/17 17:47:34 byordano
TList changed to TObjArray

Revision 1.3 2005/11/17 14:43:23 byordano

Revision 1.1.1.1 2005/10/28 07:33:58 hristov
Initial import as subdirectory in AliRoot

Revision 1.2 2005/09/13 08:41:15 byordano
default startTime endTime added

Revision 1.4 2005/08/30 09:13:02 byordano

Revision 1.3 2005/08/29 21:15:47 byordano
*/
//
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it is used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically to the underlying AliCDBStorage.
// The alias name is used as detSpec.
//
#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"
#include "AliLog.h"
#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TSystem.h>
#include <TFile.h>
#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>
#include <TMutex.h>
#include <TSystemDirectory.h>
#include <TSystemFile.h>
#include <TGrid.h>
#include <TGridResult.h>

#include <TMonaLisaWriter.h>

#include <sys/types.h>
#include <sys/wait.h>
//______________________________________________________________________________________________
AliShuttle::AliShuttle(const AliShuttleConfig* config,
		UInt_t timeout, Int_t retries):
fConfig(config),
fTimeout(timeout), fRetries(retries),
fReadTestMode(kFALSE),
fOutputRedirected(kFALSE)
{
	//
	// config: AliShuttleConfig used
	// timeout: timeout used for AliDCSClient connection
	// retries: the number of retries in case of connection error.
	//

	if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
	for(int iSys=0;iSys<4;iSys++) {
		fServer[iSys] = 0; // no SQL connection open yet
		fFXSlist[iSys].SetOwner(kTRUE);
	}
	fPreprocessorMap.SetOwner(kTRUE);

	for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
		fFirstUnprocessed[iDet] = kFALSE;

	fMonitoringMutex = new TMutex();
}
//______________________________________________________________________________________________
AliShuttle::~AliShuttle()
{
	// destructor

	fPreprocessorMap.DeleteAll();
	for(int iSys=0;iSys<4;iSys++) {
		if(fServer[iSys]) {
			fServer[iSys]->Close();
			delete fServer[iSys];
		}
	}

	if (fMonitoringMutex) {
		delete fMonitoringMutex;
		fMonitoringMutex = 0;
	}
}
//______________________________________________________________________________________________
void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
{
	//
	// Registers a new AliPreprocessor.
	// GetName() is used as the identifier of the preprocessor.
	// The preprocessor is registered only if no other preprocessor
	// with the same identifier (GetName()) is already registered.
	//

	const char* detName = preprocessor->GetName();
	if(GetDetPos(detName) < 0)
		AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));

	if (fPreprocessorMap.GetValue(detName)) {
		AliWarning(Form("AliPreprocessor %s is already registered!", detName));
		return;
	}

	fPreprocessorMap.Add(new TObjString(detName), preprocessor);
}
//______________________________________________________________________________________________
Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
		AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
{
	// Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
	// offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
	// using this function. Use StoreReferenceData instead!
	// It calls the StoreLocally function, which temporarily stores the data locally; when the
	// preprocessor finishes, the data are transferred to the main storage (Grid).

	return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
}
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
{
	// Stores a CDB object in the storage for reference data. These objects will not be available during
	// offline reconstruction. Use this function for reference data only!
	// It calls the StoreLocally function, which temporarily stores the data locally; when the
	// preprocessor finishes, the data are transferred to the main storage (Grid).

	return StoreLocally(fgkLocalRefStorage, path, object, metaData);
}
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreLocally(const TString& localUri,
			const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
			Int_t validityStart, Bool_t validityInfinite)
{
	// Stores the object temporarily in local storage. Parameters are passed by the Store and
	// StoreReferenceData functions. When the preprocessor finishes, the data are transferred
	// to the main storage (Grid).
	// The parameters are:
	//   1) URI of the backup storage (local)
	//   2) the object's path
	//   3) the object to be stored
	//   4) the metaData to be associated with the object
	//   5) the number of runs before the current run at which the validity starts
	//      (first valid run = current run - validityStart);
	//      if the data is valid only for this run leave the default 0
	//   6) specifies if the calibration data is valid for infinity (this means until updated),
	//      typical for calibration runs, the default is kFALSE
	//
	// returns kFALSE on failure, kTRUE otherwise

	if (fTestMode & kErrorStorage)
	{
		Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
		return kFALSE;
	}

	const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";

	Int_t firstRun = GetCurrentRun() - validityStart;
	if(firstRun < 0) {
		AliWarning("First valid run happens to be less than 0! Setting it to 0.");
		firstRun = 0;
	}

	Int_t lastRun = -1;
	if(validityInfinite) {
		lastRun = AliCDBRunRange::Infinity();
	} else {
		lastRun = GetCurrentRun();
	}

	// Version is set to current run, it will be used later to transfer data to Grid
	AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);

	if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
		TObjString runUsed = Form("%d", GetCurrentRun());
		metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
	}

	Bool_t result = kFALSE;

	if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
		Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
	} else {
		result = AliCDBManager::Instance()->GetStorage(localUri)
					->Put(object, id, metaData);
	}

	if(!result) {
		Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
	}

	return result;
}
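
/*
 Illustration only, not part of AliShuttle: a minimal, self-contained sketch of the run-range
 arithmetic that StoreLocally applies before building the AliCDBId. The names RunRange,
 ValidityRange and kInfinity are hypothetical; kInfinity merely stands in for
 AliCDBRunRange::Infinity(), whose actual value is defined by AliRoot and is assumed here.

```cpp
#include <cassert>
#include <climits>

// Hypothetical stand-in for AliCDBRunRange::Infinity(); the real value is defined by AliRoot.
const int kInfinity = INT_MAX;

// First and last run for which a stored object is valid.
struct RunRange {
	int first;
	int last;
};

// Same arithmetic as in StoreLocally: validity starts validityStart runs before the
// current run (clamped at 0) and ends at the current run, or never if validityInfinite.
RunRange ValidityRange(int currentRun, int validityStart, bool validityInfinite)
{
	RunRange range;
	range.first = currentRun - validityStart;
	if (range.first < 0) range.first = 0;
	range.last = validityInfinite ? kInfinity : currentRun;
	return range;
}
```

 Under these assumptions, current run 100 with validityStart = 3 gives a first valid run of 97,
 and a validityStart larger than the current run is clamped to a first valid run of 0.
*/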
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB()
{
	//
	// Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
	// Calls underlying StoreOCDB(const TString&) function twice, for OCDB and Reference storage.
	// Then calls CopyFilesToGrid to store reference files and, for GRP only, the run metadata file.
	//

	if (fTestMode & kErrorGrid)
	{
		Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
		Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
		return kFALSE;
	}

	Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
	Bool_t resultCDB = StoreOCDB(fgkMainCDB);

	Log("SHUTTLE","StoreOCDB - Storing reference data ...");
	Bool_t resultRef = StoreOCDB(fgkMainRefStorage);

	Log("SHUTTLE","StoreOCDB - Storing reference files ...");
	Bool_t resultRefFiles = CopyFilesToGrid("reference");

	Bool_t resultMetadata = kTRUE;
	if(fCurrentDetector == "GRP")
	{
		Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
		resultMetadata = CopyFilesToGrid("metadata");
	}

	return resultCDB && resultRef && resultRefFiles && resultMetadata;
}
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
{
	//
	// Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
	//

	TObjArray* gridIds=0;

	Bool_t result = kTRUE;

	const char* type = 0;
	TString localURI;
	if(gridURI == fgkMainCDB) {
		type = "OCDB";
		localURI = fgkLocalCDB;
	} else if(gridURI == fgkMainRefStorage) {
		type = "Reference";
		localURI = fgkLocalRefStorage;
	} else {
		AliError(Form("Invalid storage URI: %s", gridURI.Data()));
		return kFALSE;
	}

	AliCDBManager* man = AliCDBManager::Instance();

	AliCDBStorage *gridSto = man->GetStorage(gridURI);
	if(!gridSto) {
		Log("SHUTTLE",
			Form("StoreOCDB - cannot activate main %s storage", type));
		return kFALSE;
	}

	gridIds = gridSto->GetQueryCDBList();

	// get objects previously stored in local CDB
	AliCDBStorage *localSto = man->GetStorage(localURI);
	if(!localSto) {
		Log("SHUTTLE",
			Form("StoreOCDB - cannot activate local %s storage", type));
		return kFALSE;
	}
	AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
	// Local objects were stored with current run as Grid version!
	TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
	localEntries->SetOwner(1);

	// loop on local stored objects
	TIter localIter(localEntries);
	AliCDBEntry *aLocEntry = 0;
	while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
		aLocEntry->SetOwner(1);
		AliCDBId aLocId = aLocEntry->GetId();
		aLocEntry->SetVersion(-1);
		aLocEntry->SetSubVersion(-1);

		// If the local object is valid up to infinity we store it only if this is
		// the first unprocessed run!
		if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
			!fFirstUnprocessed[GetDetPos(fCurrentDetector)])
		{
			Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
						"there are previous unprocessed runs!",
						fCurrentDetector.Data(), aLocId.GetPath().Data()));
			continue;
		}

		// loop on Grid valid Id's
		Bool_t store = kTRUE;
		TIter gridIter(gridIds);
		AliCDBId* aGridId = 0;
		while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
			if(aGridId->GetPath() != aLocId.GetPath()) continue;
			// skip all objects valid up to infinity
			if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
			// if we get here, it means there's already some more recent object stored on Grid!
			store = kFALSE;
			break;
		}

		// If we get here, the file can be stored!
		Bool_t storeOk = gridSto->Put(aLocEntry);
		if(!store || storeOk){

			if (!store)
			{
				Log(fCurrentDetector.Data(),
					Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
						type, aGridId->ToString().Data()));
			} else {
				Log("SHUTTLE",
					Form("StoreOCDB - Object <%s> successfully put into %s storage",
						aLocId.ToString().Data(), type));
				Log(fCurrentDetector.Data(),
					Form("StoreOCDB - Object <%s> successfully put into %s storage",
						aLocId.ToString().Data(), type));
			}

			// removing local file...
			TString filename;
			localSto->IdToFilename(aLocId, filename);
			Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
			RemoveFile(filename.Data());
			continue;
		} else {
			Log("SHUTTLE",
				Form("StoreOCDB - Grid %s storage of object <%s> failed",
					type, aLocId.ToString().Data()));
			Log(fCurrentDetector.Data(),
				Form("StoreOCDB - Grid %s storage of object <%s> failed",
					type, aLocId.ToString().Data()));
			result = kFALSE;
		}
	}
	localEntries->Clear();

	return result;
}
//______________________________________________________________________________________________
Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
{
	// Removes the reference files of the current run from the directory of the given subdetector
	AliCDBManager* man = AliCDBManager::Instance();
	AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
	TString localBaseFolder = sto->GetBaseFolder();
	TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
	Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));

	TString begin;
	begin.Form("%d_", GetCurrentRun());
	TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
	TList* dirList = baseDir->GetListOfFiles();
	delete baseDir;
	if (!dirList) return kTRUE;
	// fewer than 3 entries: only "." and ".." are present, nothing to delete
	if (dirList->GetEntries() < 3) {
		delete dirList;
		return kTRUE;
	}

	Int_t nDirs = 0, nDel = 0;
	TIter dirIter(dirList);
	TSystemFile* entry = 0;
	Bool_t success = kTRUE;

	while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next()))) {
		if (entry->IsDirectory())
			continue;
		TString fileName(entry->GetName());
		if (!fileName.BeginsWith(begin))
			continue;
		nDirs++;
		Int_t result = gSystem->Unlink(fileName.Data());
		if (result) {
			Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
			success = kFALSE;
		} else {
			nDel++;
		}
	}

	if (nDirs > 0)
		Log("SHUTTLE", Form("CleanReferenceStorage - %d (out of %d) reference files in folder %s were deleted.",
			nDel, nDirs, targetDir.Data()));
	delete dirList;
	return success;

	// NOTE: not reached. Older implementation that removed and re-created the whole directory.
	Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
	if (result == 0) {
		result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
		if (result != 0) {
			Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
			return kFALSE;
		}
	}
	result = gSystem->mkdir(targetDir, kTRUE);
	if (result != 0) {
		Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
		return kFALSE;
	}
	return kTRUE;
}
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
{
	//
	// Stores reference file directly (without opening it). This function stores the file locally.
	//
	// The file is stored under the following location:
	// <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
	// where <gridFileName> is the third parameter given to the function
	//
	if (fTestMode & kErrorStorage)
	{
		Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
		return kFALSE;
	}

	AliCDBManager* man = AliCDBManager::Instance();
	AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);

	TString localBaseFolder = sto->GetBaseFolder();

	TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);
	target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));

	return CopyFileLocally(localFile, target);
}
//______________________________________________________________________________________________
Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
{
	//
	// Stores the run metadata file locally; it is transferred to the run folder on the Grid
	// when the preprocessor finishes (see CopyFilesToGrid).
	//
	// Only GRP can call this function.

	if (fTestMode & kErrorStorage)
	{
		Log(fCurrentDetector, "StoreRunMetadataFile - In TESTMODE - Simulating error while storing locally");
		return kFALSE;
	}

	AliCDBManager* man = AliCDBManager::Instance();
	AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);

	TString localBaseFolder = sto->GetBaseFolder();

	// Build Run level folder
	// folder = /alice/data/year/lhcPeriod/runNb/Raw

	TString lhcPeriod = GetLHCPeriod();
	if (lhcPeriod.Length() == 0)
	{
		Log("SHUTTLE","StoreRunMetadataFile - LHCPeriod not found in logbook!");
		return kFALSE;
	}

	TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s",
				localBaseFolder.Data(), GetCurrentYear(),
				lhcPeriod.Data(), GetCurrentRun(), gridFileName);

	return CopyFileLocally(localFile, target);
}
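
/*
 Illustration only, not part of AliShuttle: a minimal sketch of how the local target path of
 the run metadata file is laid out, using the same format string as StoreRunMetadataFile.
 The function name RunMetadataTarget and the sample values in the note below are hypothetical;
 in the Shuttle the year, LHC period and run number come from GetCurrentYear(), GetLHCPeriod()
 and GetCurrentRun().

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Builds <localBase>/GRP/RunMetadata/alice/data/<year>/<period>/<run, 9 digits>/Raw/<file>.
// The run number is zero-padded to nine digits by the %09d conversion.
std::string RunMetadataTarget(const std::string& localBase, int year,
		const std::string& lhcPeriod, int run, const std::string& gridFileName)
{
	char buffer[1024];
	std::snprintf(buffer, sizeof(buffer), "%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s",
			localBase.c_str(), year, lhcPeriod.c_str(), run, gridFileName.c_str());
	return std::string(buffer);
}
```

 With the assumed inputs ("/tmp/ref", 2007, "LHC07a", 12345, "RunMetadata.root") this yields
 /tmp/ref/GRP/RunMetadata/alice/data/2007/LHC07a/000012345/Raw/RunMetadata.root.
*/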
//______________________________________________________________________________________________
Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
{
	// Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile.
	// Files are temporarily stored in the local reference storage. When the preprocessor
	// finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn
	// (in reference or run level folders)

	TString targetDir(target(0, target.Last('/')));
	// try to open the target folder; create it if it does not exist
	void* dir = gSystem->OpenDirectory(targetDir.Data());
	if (dir == NULL) {
		if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
			Log("SHUTTLE", Form("CopyFileLocally - Can't create directory <%s>", targetDir.Data()));
			return kFALSE;
		}
	} else {
		gSystem->FreeDirectory(dir);
	}

	Int_t result = 0;
	result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
	if (result) {
		Log("SHUTTLE", Form("CopyFileLocally - %s does not exist", localFile));
		return kFALSE;
	}
	result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
	if (!result) {
		Log("SHUTTLE", Form("CopyFileLocally - target file %s already exists, removing...", target.Data()));
		if (gSystem->Unlink(target.Data())) {
			Log("SHUTTLE", Form("CopyFileLocally - Could not remove existing target file %s!", target.Data()));
			return kFALSE;
		}
	}

	result = gSystem->CopyFile(localFile, target);
	if (result == 0) {
		Log("SHUTTLE", Form("CopyFileLocally - File %s stored locally to %s", localFile, target.Data()));
		return kTRUE;
	}
	Log("SHUTTLE", Form("CopyFileLocally - Could not store file %s to %s! Error code = %d",
				localFile, target.Data(), result));
	return kFALSE;
}
928 //______________________________________________________________________________________________
929 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
932 // Transfers local files to the Grid. Local files can be reference files
933 // or run metadata file (from GRP only).
935 // According to the type (ref, metadata) the files are stored under the following location:
936 // ref --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
937 // metadata --> <run data folder>/<MetadataFileName>
940 AliCDBManager* man = AliCDBManager::Instance();
941 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
944 TString localBaseFolder = sto->GetBaseFolder();
950 if (strcmp(type, "reference") == 0)
952 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
953 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
956 TString gridBaseFolder = gridSto->GetBaseFolder();
957 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
958 begin = Form("%d_", GetCurrentRun());
960 else if (strcmp(type, "metadata") == 0)
963 TString lhcPeriod = GetLHCPeriod();
965 if (lhcPeriod.Length() == 0)
967 Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
971 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
972 localBaseFolder.Data(), GetCurrentYear(),
973 lhcPeriod.Data(), GetCurrentRun());
974 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
980 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
984 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
988 TList* dirList = baseDir->GetListOfFiles();
991 if (!dirList) return kTRUE;
993 if (dirList->GetEntries() < 3)
1001 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
1006 Int_t nDirs = 0, nTransfer = 0;
1007 TIter dirIter(dirList);
1008 TSystemFile* entry = 0;
1010 Bool_t success = kTRUE;
1011 Bool_t first = kTRUE;
1013 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1015 if (entry->IsDirectory())
1018 TString fileName(entry->GetName());
1019 if (!fileName.BeginsWith(begin))
1027 // check that folder exists, otherwise create it
1028 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1036 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
1038 // TODO It does not work currently! Bug in TAliEn::Mkdir
1039 // TODO Manually fixed in local root v5-16-00
1040 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1042 Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1047 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1051 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1055 TString fullLocalPath;
1056 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1058 TString fullGridPath;
1059 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1061 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1065 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!",
1066 fullLocalPath.Data(), fullGridPath.Data()));
1067 RemoveFile(fullLocalPath);
1072 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!",
1073 fullLocalPath.Data(), fullGridPath.Data()));
1078 Log("SHUTTLE", Form("CopyFilesToGrid - %d (over %d) files in folder %s copied to Grid.",
1079 nTransfer, nDirs, dir.Data()));
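/*
Illustrative sketch, not part of AliShuttle: how the "metadata" branch of
CopyFilesToGrid derives the Grid directory from the local one. The helper names
MakeLocalMetadataDir and GridDirFromLocal are hypothetical and use std::string
instead of TString; only the path layout and the cut at "/alice/data/" are
taken from the function above.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds the local run-metadata directory with the same
// format string that CopyFilesToGrid uses for the "metadata" type.
std::string MakeLocalMetadataDir(const std::string& base, int year,
                                 const std::string& lhcPeriod, int run)
{
    char buf[512];
    std::snprintf(buf, sizeof(buf), "%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
                  base.c_str(), year, lhcPeriod.c_str(), run);
    return std::string(buf);
}

// Hypothetical helper: the Grid directory is the tail of the local path,
// starting at "/alice/data/". Returns an empty string if the marker is absent.
std::string GridDirFromLocal(const std::string& localDir)
{
    const std::string::size_type pos = localDir.find("/alice/data/");
    if (pos == std::string::npos) return std::string();
    return localDir.substr(pos);
}
```

The run number is zero-padded to nine digits, so the local and Grid paths differ
only by the local base folder and the GRP/RunMetadata prefix.
*/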
1086 //______________________________________________________________________________________________
1087 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1090 // Get folder name of reference files
1093 TString offDetStr(GetOfflineDetName(detector));
1095 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1097 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1099 dir.Form("%s/%s", base, offDetStr.Data());
1107 //______________________________________________________________________________________________
1108 void AliShuttle::CleanLocalStorage(const TString& uri)
1111 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
1114 const char* type = 0;
1115 if(uri == fgkLocalCDB) {
1117 } else if(uri == fgkLocalRefStorage) {
1120 AliError(Form("Invalid storage URI: %s", uri.Data()));
1124 AliCDBManager* man = AliCDBManager::Instance();
1126 // open local storage
1127 AliCDBStorage *localSto = man->GetStorage(uri);
1130 Form("CleanLocalStorage - cannot activate local %s storage", type));
1134 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1135 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1137 AliDebug(2, Form("filename = %s", filename.Data()));
1139 Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1140 GetCurrentRun(), fCurrentDetector.Data()));
1142 RemoveFile(filename.Data());
1146 //______________________________________________________________________________________________
1147 void AliShuttle::RemoveFile(const char* filename)
1150 // removes local file
1153 TString command(Form("rm -f %s", filename));
1155 Int_t result = gSystem->Exec(command.Data());
1158 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1159 fCurrentDetector.Data(), filename));
1163 //______________________________________________________________________________________________
1164 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1167 // Reads the AliShuttleStatus from the CDB
1171 delete fStatusEntry;
1175 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1176 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1178 if (!fStatusEntry) return 0;
1179 fStatusEntry->SetOwner(1);
1181 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1183 AliError("Invalid object stored to CDB!");
1190 //______________________________________________________________________________________________
1191 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1194 // writes the status for one subdetector
1198 delete fStatusEntry;
1202 Int_t run = GetCurrentRun();
1204 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1206 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1207 fStatusEntry->SetOwner(1);
1209 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1212 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1213 fCurrentDetector.Data(), run));
1222 //______________________________________________________________________________________________
1223 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1226 // changes the AliShuttleStatus for the given detector and run to the given status
1230 AliError("UNEXPECTED: fStatusEntry empty");
1234 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1237 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1241 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1242 fCurrentDetector.Data(),
1243 status->GetStatusName(),
1244 status->GetStatusName(newStatus));
1245 Log("SHUTTLE", actionStr);
1246 SetLastAction(actionStr);
1248 status->SetStatus(newStatus);
1249 if (increaseCount) status->IncreaseCount();
1251 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1256 //______________________________________________________________________________________________
1257 void AliShuttle::SendMLInfo()
1260 // sends ML information about the current status of the current detector being processed
1263 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1266 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1270 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1271 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1274 mlList.Add(&mlStatus);
1275 mlList.Add(&mlRetryCount);
1278 mlID.Form("%d", GetCurrentRun());
1279 fMonaLisa->SendParameters(&mlList, mlID);
1282 //______________________________________________________________________________________________
1283 Bool_t AliShuttle::ContinueProcessing()
1285 // this function reads the AliShuttleStatus information from CDB and
1286 // checks if the processing should be continued
1287 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1289 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1291 AliPreprocessor* aPreprocessor =
1292 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1295 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1299 AliShuttleLogbookEntry::Status entryStatus =
1300 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1302 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1303 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1304 fCurrentDetector.Data(),
1305 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1309 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1311 // check if current run is first unprocessed run for current detector
1312 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1313 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1315 if (fTestMode == kNone)
Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
" but this is not the first unprocessed run!", fCurrentDetector.Data()));
Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
"Although %s requires strict run ordering "
"and this is not the first unprocessed run, "
"the SHUTTLE continues", fCurrentDetector.Data()));
1330 AliShuttleStatus* status = ReadShuttleStatus();
1333 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1334 fCurrentDetector.Data()));
1335 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1336 return WriteShuttleStatus(status);
1339 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1340 // If it happens it may mean Logbook updating failed... let's do it now!
1341 if (status->GetStatus() == AliShuttleStatus::kDone ||
1342 status->GetStatus() == AliShuttleStatus::kFailed){
1343 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1344 fCurrentDetector.Data(),
1345 status->GetStatusName(status->GetStatus())));
1346 UpdateShuttleLogbook(fCurrentDetector.Data(),
1347 status->GetStatusName(status->GetStatus()));
1351 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1353 Form("ContinueProcessing - %s: Grid storage of one or more "
1354 "objects failed. Trying again now",
1355 fCurrentDetector.Data()));
1356 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1358 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1359 "successfully stored into main storage",
1360 fCurrentDetector.Data()));
1363 Form("ContinueProcessing - %s: Grid storage failed again",
1364 fCurrentDetector.Data()));
1365 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1370 // if we get here, there is a restart
1371 Bool_t cont = kFALSE;
1374 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1375 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1376 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1377 status->GetCount(), status->GetStatusName()));
1378 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1379 UpdateShuttleStatus(AliShuttleStatus::kFailed);
// there may still be objects in local OCDB and reference storage
// and FXS databases may not be updated: do it now!
1384 // TODO Currently disabled, we want to keep files in case of failure!
1385 // CleanLocalStorage(fgkLocalCDB);
1386 // CleanLocalStorage(fgkLocalRefStorage);
1387 // UpdateTableFailCase();
1389 // Send mail to detector expert!
1390 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...",
1391 fCurrentDetector.Data()));
1393 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1394 fCurrentDetector.Data()));
1397 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1398 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1399 status->GetStatusName(), status->GetCount()));
1400 Bool_t increaseCount = kTRUE;
1401 if (status->GetStatus() == AliShuttleStatus::kDCSError ||
1402 status->GetStatus() == AliShuttleStatus::kDCSStarted)
1403 increaseCount = kFALSE;
1405 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
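/*
Illustrative sketch, not part of AliShuttle: the restart decision taken at the
end of ContinueProcessing, reduced to a pure function. The names Status,
RestartDecision and DecideRestart are hypothetical, and the enum lists only a
few of the real status values. It assumes that the restart branch ends with
processing being continued, which is not visible in this listing.

```cpp
// Reduced, hypothetical copy of the status values relevant to the decision.
enum class Status { kStarted, kDCSStarted, kDCSError, kPPStarted, kPPError };

struct RestartDecision {
    bool proceed;        // run the preprocessor again
    bool markFailed;     // give up and flag the detector as FAILED
    bool increaseCount;  // count this attempt against the retry limit
};

// Mirrors the logic above: once the retry limit is reached the detector is
// marked as failed; otherwise processing restarts, and the retry counter is
// increased unless the previous attempt stopped in a DCS state.
RestartDecision DecideRestart(Status last, int count, int maxRetries)
{
    if (count >= maxRetries) return {false, true, false};
    const bool dcsProblem = (last == Status::kDCSStarted || last == Status::kDCSError);
    return {true, false, !dcsProblem};
}
```

Attempts that stopped in a DCS state are not counted, so a DCS archive outage
alone does not push a detector over the retry limit.
*/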
1412 //______________________________________________________________________________________________
1413 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
// Retrieves data for all detectors in the configuration.
// entry: Shuttle logbook entry, contains run parameters and status of detectors
// (Unprocessed, Inactive, Failed or Done).
// Returns kFALSE if an error occurred and kTRUE otherwise
1422 if (!entry) return kFALSE;
1424 fLogbookEntry = entry;
1426 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1429 // Send the information to ML
1430 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1431 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1434 mlList.Add(&mlStatus);
1435 mlList.Add(&mlRunType);
1438 mlID.Form("%d", GetCurrentRun());
1439 fMonaLisa->SendParameters(&mlList, mlID);
1441 if (fLogbookEntry->IsDone())
1443 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1444 UpdateShuttleLogbook("shuttle_done");
1449 // read test mode if flag is set
1453 TString logEntry(entry->GetRunParameter("log"));
1454 //printf("log entry = %s\n", logEntry.Data());
1455 TString searchStr("Testmode: ");
1456 Int_t pos = logEntry.Index(searchStr.Data());
1457 //printf("%d\n", pos);
1460 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1461 //printf("%s\n", subStr.String().Data());
1462 TString newStr(subStr.Data());
1463 TObjArray* token = newStr.Tokenize(' ');
1467 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1470 Int_t testMode = tmpStr->String().Atoi();
1473 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1474 SetTestMode((TestMode) testMode);
1482 fLogbookEntry->Print("all");
1485 Bool_t hasError = kFALSE;
1487 // Set the CDB and Reference folders according to the year and LHC period
1488 TString lhcPeriod(GetLHCPeriod());
1489 if (lhcPeriod.Length() == 0)
1491 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1495 if (fgkMainCDB.Length() == 0)
1496 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache",
1497 GetCurrentYear(), lhcPeriod.Data());
1499 if (fgkMainRefStorage.Length() == 0)
1500 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache",
1501 GetCurrentYear(), lhcPeriod.Data());
1503 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1504 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1505 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1506 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1508 // Loop on detectors in the configuration
1509 TIter iter(fConfig->GetDetectors());
1510 TObjString* aDetector = 0;
1512 while ((aDetector = (TObjString*) iter.Next()))
1514 fCurrentDetector = aDetector->String();
1516 if (ContinueProcessing() == kFALSE) continue;
1518 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START ******",
1519 GetCurrentRun(), aDetector->GetName()));
1521 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1523 Log(fCurrentDetector.Data(), "Process - Starting processing");
1529 Log("SHUTTLE", "Process - ERROR: Forking failed");
1534 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1535 GetCurrentRun(), aDetector->GetName()));
1537 Long_t begin = time(0);
1539 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1540 while (waitpid(pid, &status, WNOHANG) == 0)
1542 Long_t expiredTime = time(0) - begin;
1544 if (expiredTime > fConfig->GetPPTimeOut())
tmp.Form("Process - Process of %s timed out. "
"Run time: %ld seconds. Killing...",
fCurrentDetector.Data(), expiredTime); // expiredTime is Long_t, hence %ld
1550 Log("SHUTTLE", tmp);
1551 Log(fCurrentDetector, tmp);
1555 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1558 gSystem->Sleep(1000);
1562 gSystem->Sleep(1000);
1565 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1566 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1569 Log("SHUTTLE", Form("Process - Error: "
1570 "Could not open pipe to %s", checkStr.Data()));
1575 if (!fgets(buffer, 100, pipe))
1577 Log("SHUTTLE", "Process - Error: ps did not return anything");
1578 gSystem->ClosePipe(pipe);
1581 gSystem->ClosePipe(pipe);
1583 //Log("SHUTTLE", Form("ps returned %s", buffer));
1586 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1588 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1592 if (expiredTime % 60 == 0)
Log("SHUTTLE", Form("Process - %s: Checking process. "
"Run time: %ld seconds - Memory consumption: %d KB",
fCurrentDetector.Data(), expiredTime, mem)); // expiredTime is Long_t, hence %ld
1600 if (mem > fConfig->GetPPMaxMem())
1603 tmp.Form("Process - Process exceeds maximum allowed memory "
1604 "(%d KB > %d KB). Killing...",
1605 mem, fConfig->GetPPMaxMem());
1606 Log("SHUTTLE", tmp);
1607 Log(fCurrentDetector, tmp);
1611 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1614 gSystem->Sleep(1000);
1619 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1620 GetCurrentRun(), aDetector->GetName()));
1622 if (WIFEXITED(status))
1624 Int_t returnCode = WEXITSTATUS(status);
1626 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1629 if (returnCode == 0) hasError = kTRUE;
1635 Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1636 aDetector->GetName()));
1638 Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1640 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1642 Log("SHUTTLE", "Process - Could not freopen stdout");
1646 fOutputRedirected = kTRUE;
1647 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1648 Log("SHUTTLE", "Process - Could not redirect stderr");
1652 TString wd = gSystem->WorkingDirectory();
1653 TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(),
1654 fCurrentDetector.Data(), GetCurrentRun());
1656 Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1657 if (!result) // temp dir already exists!
1659 Log(fCurrentDetector.Data(),
1660 Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1661 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1664 if (gSystem->mkdir(tmpDir.Data(), 1))
1666 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1670 if (!gSystem->ChangeDirectory(tmpDir.Data()))
1672 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1676 Bool_t success = ProcessCurrentDetector();
1678 gSystem->ChangeDirectory(wd.Data());
1680 if (success) // Preprocessor finished successfully!
1682 // remove temporary folder
1683 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1685 // Update time_processed field in FXS DB
1686 if (UpdateTable() == kFALSE)
1687 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1688 fCurrentDetector.Data()));
1690 // Transfer the data from local storage to main storage (Grid)
1691 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1692 if (StoreOCDB() == kFALSE)
1695 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1696 GetCurrentRun(), aDetector->GetName()));
1697 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1701 Form("\t\t\t****** run %d - %s: DONE ******",
1702 GetCurrentRun(), aDetector->GetName()));
1703 UpdateShuttleStatus(AliShuttleStatus::kDone);
1704 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1709 Form("\t\t\t****** run %d - %s: PP ERROR ******",
1710 GetCurrentRun(), aDetector->GetName()));
1713 for (UInt_t iSys=0; iSys<3; iSys++)
1715 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1718 Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1719 GetCurrentRun(), aDetector->GetName(), success));
1721 // the client exits here
1722 gSystem->Exit(success);
1724 AliError("We should never get here!!!");
1728 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1731 //check if shuttle is done for this run, if so update logbook
1732 TObjArray checkEntryArray;
1733 checkEntryArray.SetOwner(1);
1734 TString whereClause = Form("where run=%d", GetCurrentRun());
1735 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) ||
1736 checkEntryArray.GetEntries() == 0) {
1737 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1739 return hasError == kFALSE;
1742 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1743 (checkEntryArray.At(0));
1747 if (checkEntry->IsDone())
1749 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1750 UpdateShuttleLogbook("shuttle_done");
1754 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1756 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1758 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1759 checkEntry->GetRun(), GetDetName(iDet)));
1760 fFirstUnprocessed[iDet] = kFALSE;
1768 return hasError == kFALSE;
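/*
Illustrative sketch, not part of AliShuttle: the fork-and-monitor pattern that
Process applies to each detector, reduced to plain POSIX calls. The name
RunWithTimeout and its return codes are hypothetical. The sketch covers only
the time-out check; the memory check via ps is left out, and the child is
reaped right after being killed.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Runs work() in a forked child and polls it from the parent.
// Returns the child's exit code, or a negative value:
//   -1 fork failed, -2 child killed after the time-out, -3 abnormal termination.
int RunWithTimeout(int (*work)(), long timeoutSec)
{
    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(work());      // child: do the work, then leave at once

    const time_t begin = time(0);
    int status = 0;                   // plain int, as waitpid expects
    for (;;) {
        const pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) break;          // child has terminated
        if (r < 0) return -3;         // waitpid itself failed
        if (time(0) - begin > timeoutSec) {
            kill(pid, SIGKILL);       // child overran its time budget
            waitpid(pid, &status, 0); // reap it so that no zombie is left
            return -2;
        }
        usleep(100 * 1000);           // poll every 100 ms
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -3;
}
```

Polling with WNOHANG keeps the parent free to run further checks between two
polls, which is where Process inspects the memory consumption of the child.
*/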
1771 //______________________________________________________________________________________________
1772 Bool_t AliShuttle::ProcessCurrentDetector()
// Retrieves data for a specific detector only (fCurrentDetector).
// There should be a configuration for this detector.
1778 Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d",
1779 fCurrentDetector.Data(), GetCurrentRun()));
1781 TString wd = gSystem->WorkingDirectory();
1783 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1786 gSystem->ChangeDirectory(wd.Data());
1788 TMap* dcsMap = new TMap();
1790 // call preprocessor
1791 AliPreprocessor* aPreprocessor =
1792 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1794 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1796 Bool_t processDCS = aPreprocessor->ProcessDCS();
1800 Log(fCurrentDetector, "ProcessCurrentDetector -"
1801 " The preprocessor requested to skip the retrieval of DCS values");
1803 else if (fTestMode & kSkipDCS)
1805 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1807 else if (fTestMode & kErrorDCS)
1809 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1810 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1811 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1816 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1818 // Query DCS archive
1819 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1821 for (int iServ=0; iServ<nServers; iServ++)
1824 TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1825 Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1826 Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1828 Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1829 " Querying DCS Amanda server %s:%d (%d of %d)",
1830 host.Data(), port, iServ+1, nServers));
1835 if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1837 aliasMap = GetValueSet(host, port,
1838 fConfig->GetDCSAliases(fCurrentDetector, iServ),
1839 kAlias, multiSplit);
1842 Log(fCurrentDetector,
1843 Form("ProcessCurrentDetector -"
1844 " Error retrieving DCS aliases from server %s."
1845 " Sending mail to DCS experts!", host.Data()));
1846 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1848 if (!SendMailToDCS())
1849 Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1856 if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1858 dpMap = GetValueSet(host, port,
1859 fConfig->GetDCSDataPoints(fCurrentDetector, iServ),
1863 Log(fCurrentDetector,
1864 Form("ProcessCurrentDetector -"
1865 " Error retrieving DCS data points from server %s."
1866 " Sending mail to DCS experts!", host.Data()));
1867 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1869 if (!SendMailToDCS())
1870 Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1872 if (aliasMap) delete aliasMap;
1878 // merge aliasMap and dpMap into dcsMap
1880 TIter iter(aliasMap);
1881 TObjString* key = 0;
1882 while ((key = (TObjString*) iter.Next()))
1883 dcsMap->Add(key, aliasMap->GetValue(key->String()));
1885 aliasMap->SetOwner(kFALSE);
1891 TObjString* key = 0;
1892 while ((key = (TObjString*) iter.Next()))
1893 dcsMap->Add(key, dpMap->GetValue(key->String()));
1895 dpMap->SetOwner(kFALSE);
1901 // save map into file, to help debugging in case of preprocessor error
1902 TFile* f = TFile::Open("DCSMap.root","recreate");
1904 dcsMap->Write("DCSMap", TObject::kSingleKey);
1908 // DCS Archive DB processing successful. Call Preprocessor!
1909 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1911 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1913 if (returnValue > 0) // Preprocessor error!
1915 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1916 "Preprocessor failed. Process returned %d.", returnValue));
1917 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1918 dcsMap->DeleteAll();
1924 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1925 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1926 fCurrentDetector.Data()));
1928 dcsMap->DeleteAll();
1934 //______________________________________________________________________________________________
1935 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
// Queries DAQ's Shuttle logbook and fills the detector status objects.
// Calls QueryRunParameters to query the DAQ logbook for run parameters.
1942 entries.SetOwner(1);
1944 // check connection, in case connect
1945 if(!Connect(3)) return kFALSE;
1948 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1950 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1952 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1956 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1958 if(aResult->GetRowCount() == 0) {
1959 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1964 // TODO Check field count!
1965 const UInt_t nCols = 23;
1966 if (aResult->GetFieldCount() != (Int_t) nCols) {
1967 Log("SHUTTLE", "Invalid SQL result field number!");
1973 while ((aRow = aResult->Next())) {
1974 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1975 Int_t run = runString.Atoi();
1977 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1981 // loop on detectors
1982 for(UInt_t ii = 0; ii < nCols; ii++)
1983 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1985 entries.AddLast(entry);
1993 //______________________________________________________________________________________________
1994 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
// Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
2000 // check connection, in case connect
2005 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2007 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2009 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
2013 if (aResult->GetRowCount() == 0) {
2014 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2019 if (aResult->GetRowCount() > 1) {
2020 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2021 "more than one entry in DAQ Logbook for run %d!", run));
2026 TSQLRow* aRow = aResult->Next();
2029 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2034 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2036 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2037 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2039 UInt_t startTime = entry->GetStartTime();
2040 UInt_t endTime = entry->GetEndTime();
2042 // if (!startTime || !endTime || startTime > endTime)
2045 // Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d. Skipping!",
2046 // run, startTime, endTime));
2048 // Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2049 // fLogbookEntry = entry;
2050 // if (!UpdateShuttleLogbook("shuttle_done"))
2052 // AliError(Form("Could not update logbook for run %d !", run));
2054 // fLogbookEntry = 0;
2065 Form("QueryRunParameters - Invalid parameters for Run %d: "
2066 "startTime = %d, endTime = %d. Skipping!",
2067 run, startTime, endTime));
Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2070 fLogbookEntry = entry;
2071 if (!UpdateShuttleLogbook("shuttle_ignored"))
2073 AliError(Form("Could not update logbook for run %d !", run));
2083 if (startTime && !endTime)
// TODO Here we don't mark SHUTTLE done, because this may mean
// that the run is still ongoing!
2088 Form("QueryRunParameters - Invalid parameters for Run %d: "
2089 "startTime = %d, endTime = %d. Skipping (Shuttle won't be marked as DONE)!",
2090 run, startTime, endTime));
2092 //Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2093 //fLogbookEntry = entry;
2094 //if (!UpdateShuttleLogbook("shuttle_done"))
2096 // AliError(Form("Could not update logbook for run %d !", run));
2098 //fLogbookEntry = 0;
2106 if (startTime && endTime && (startTime > endTime))
2109 Form("QueryRunParameters - Invalid parameters for Run %d: "
2110 "startTime = %d, endTime = %d. Skipping!",
2111 run, startTime, endTime));
Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2114 fLogbookEntry = entry;
2115 if (!UpdateShuttleLogbook("shuttle_ignored"))
2117 AliError(Form("Could not update logbook for run %d !", run));
2127 TString totEventsStr = entry->GetRunParameter("totalEvents");
2128 Int_t totEvents = totEventsStr.Atoi();
2132 Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));
Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2135 fLogbookEntry = entry;
2136 if (!UpdateShuttleLogbook("shuttle_ignored"))
2138 AliError(Form("Could not update logbook for run %d !", run));
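/*
Illustrative sketch, not part of AliShuttle: the run validity checks of
QueryRunParameters as a pure function. The names RunVerdict and ClassifyRun are
hypothetical. The condition of the first check is not visible in this listing;
the sketch assumes that it fires whenever startTime is 0. The event threshold
(totalEvents < 1) follows the revision log of this file.

```cpp
enum class RunVerdict { kProcess, kIgnore, kMaybeOngoing };

// kIgnore:       the run is skipped and marked as ignored in the Shuttle logbook
// kMaybeOngoing: the run is skipped but the logbook is left untouched
// kProcess:      the run parameters look valid
RunVerdict ClassifyRun(unsigned startTime, unsigned endTime, int totalEvents)
{
    if (startTime == 0) return RunVerdict::kIgnore;     // assumption, see above
    if (endTime == 0) return RunVerdict::kMaybeOngoing; // run may still be taking data
    if (startTime > endTime) return RunVerdict::kIgnore;
    if (totalEvents < 1) return RunVerdict::kIgnore;
    return RunVerdict::kProcess;
}
```

Only the "start time set, end time missing" case leaves the logbook alone, so
that a run which is still ongoing can be picked up again later.
*/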
2154 //______________________________________________________________________________________________
2155 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2156 DCSType type, Int_t multiSplit)
// Retrieves the values of all aliases or data points listed in "entries" from the DCS server
// host, port: TSocket connection parameters
// entries: list of alias or data point names
// type: kAlias or kDP
// returns a TMap of values, 0 on failure
2164 AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2169 result = client.GetAliasValues(entries, GetCurrentStartTime(),
2170 GetCurrentEndTime());
2172 else if (type == kDP)
2174 result = client.GetDPValues(entries, GetCurrentStartTime(),
2175 GetCurrentEndTime());
2180 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2181 client.GetErrorString(client.GetResultErrorCode())));
2182 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)
2183 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2184 client.GetServerError().Data()));
2192 //______________________________________________________________________________________________
2193 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2194 const char* id, const char* source)
// Gets a calibration file from the file exchange servers (FXS).
// First queries the FXS database for the file name, using the run, detector, id and source info,
// then calls RetrieveFile(filename) for the actual copy to local disk
// run: current run being processed (given by Logbook entry fLogbookEntry)
// detector: the Preprocessor name
// id: provided as a parameter by the Preprocessor
// source: provided by the Preprocessor through GetFileSources function
2204 // check if test mode should simulate a FXS error
2205 if (fTestMode & kErrorFXSFiles)
2207 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2211 // check connection, in case connect
2212 if (!Connect(system))
2214 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2218 // Query preparation
2219 TString sourceName(source);
2221 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2222 fConfig->GetFXSdbTable(system));
2223 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2224 GetCurrentRun(), detector, id);
2228 whereClause += Form(" and DAQsource=\"%s\"", source);
2230 else if (system == kDCS)
2234 else if (system == kHLT)
2236 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2240 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2242 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2245 TSQLResult* aResult = 0;
2246 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
GetSystemName(system), id, sourceName.Data()));
2253 if(aResult->GetRowCount() == 0)
2256 Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
2257 GetSystemName(system), id, sourceName.Data()));
2262 if (aResult->GetRowCount() > 1) {
2264 Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
2265 GetSystemName(system), id, sourceName.Data()));
2270 if (aResult->GetFieldCount() != nFields) {
2272 Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
2273 GetSystemName(system), id, sourceName.Data()));
2278 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2281 Log(detector, Form("GetFile - Empty result set in %s FXS db from query: id = %s, source = %s",
2282 GetSystemName(system), id, sourceName.Data()));
2287 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2288 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2289 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2294 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2295 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2297 // retrieved file is renamed to make it unique
2298 TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2299 GetShuttleTempDir(), detector, GetCurrentRun(),
2300 GetSystemName(system), detector, GetCurrentRun(),
2301 id, sourceName.Data());
2304 // file retrieval from FXS
2305 UInt_t nRetries = 0;
2306 UInt_t maxRetries = 3;
2307 Bool_t result = kFALSE;
2309 // copy the file; RetrieveFile succeeds when TSystem::Exec returns 0
2310 while(nRetries++ < maxRetries) {
2311 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
2312 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2315 Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
2316 filePath.Data(), GetSystemName(system)));
2320 if (fileChecksum.Length()>0)
2322 // compare md5sum of local file with the one stored in the FXS DB
2323 Int_t md5Comp = gSystem->Exec(Form("md5sum %s | grep %s > /dev/null 2>&1",
2324 localFileName.Data(), fileChecksum.Data()));
2328 Log(detector, Form("GetFile - md5sum of file %s does not match with local copy!",
2334 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2335 filePath.Data(), GetSystemName(system)));
2340 if(!result) return 0;
2342 fFXSCalled[system]=kTRUE;
2343 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2344 fFXSlist[system].Add(fileParams);
2346 static TString staticLocalFileName;
2347 staticLocalFileName.Form("%s", localFileName.Data());
2349 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2350 "source %s from %s to %s", id, source,
2351 GetSystemName(system), localFileName.Data()));
2353 return staticLocalFileName.Data();
2356 //______________________________________________________________________________________________
2357 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2360 // Copies file from FXS to local Shuttle machine
2363 // check temp directory: if it does not exist, create it
2364 AliDebug(2, Form("Copy file %s from %s FXS into %s",
2365 fxsFileName, GetSystemName(system), localFileName));
2367 TString tmpDir(localFileName);
2369 tmpDir = tmpDir(0,tmpDir.Last('/'));
2371 Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
2372 if (noDir) // temp dir does not exist!
2374 if (gSystem->mkdir(tmpDir.Data(), 1))
2376 Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2381 TString baseFXSFolder;
2384 baseFXSFolder = "FES/";
2386 else if (system == kDCS)
2390 else if (system == kHLT)
2392 baseFXSFolder = "/opt/FXS/";
2396 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2397 fConfig->GetFXSPort(system),
2398 fConfig->GetFXSUser(system),
2399 fConfig->GetFXSHost(system),
2400 baseFXSFolder.Data(),
2404 AliDebug(2, Form("%s",command.Data()));
2406 Bool_t result = (gSystem->Exec(command.Data()) == 0);
2411 //______________________________________________________________________________________________
2412 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2415 // Get sources producing the condition file Id from file exchange servers
2416 // if id is NULL all sources are returned (distinct)
2419 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2421 // check if test mode should simulate a FXS error
2422 if (fTestMode & kErrorFXSSources)
2424 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2430 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2431 TList *list = new TList();
2433 list->Add(new TObjString(" "));
2437 // check connection; connect if not yet connected
2438 if (!Connect(system))
2440 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2444 TString sourceName;
2447 sourceName = "DAQsource";
2448 } else if (system == kHLT)
2450 sourceName = "DDLnumbers";
2453 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2454 TString whereClause = Form("run=%d and detector=\"%s\"",
2455 GetCurrentRun(), detector);
2457 whereClause += Form(" and fileId=\"%s\"", id);
2458 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2460 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2463 TSQLResult* aResult;
2464 aResult = fServer[system]->Query(sqlQuery);
2466 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2467 GetSystemName(system), id));
2471 TList *list = new TList();
2474 if (aResult->GetRowCount() == 0)
2477 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2482 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2485 while ((aRow = aResult->Next()))
2488 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2489 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2490 list->Add(new TObjString(source));
2499 //______________________________________________________________________________________________
2500 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2503 // Get all ids of condition files produced by a given source from file exchange servers
2506 Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2508 // check if test mode should simulate a FXS error
2509 if (fTestMode & kErrorFXSSources)
2511 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2515 // check connection; connect if not yet connected
2516 if (!Connect(system))
2518 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2522 TString sourceName;
2525 sourceName = "DAQsource";
2526 } else if (system == kHLT)
2528 sourceName = "DDLnumbers";
2531 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2532 TString whereClause = Form("run=%d and detector=\"%s\"",
2533 GetCurrentRun(), detector);
2534 if (sourceName.Length() > 0 && source)
2535 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2536 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2538 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2541 TSQLResult* aResult;
2542 aResult = fServer[system]->Query(sqlQuery);
2544 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2545 GetSystemName(system), source));
2549 TList *list = new TList();
2552 if (aResult->GetRowCount() == 0)
2555 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2560 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2564 while ((aRow = aResult->Next()))
2567 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2568 AliDebug(2, Form("fileId = %s", id.Data()));
2569 list->Add(new TObjString(id));
2578 //______________________________________________________________________________________________
2579 Bool_t AliShuttle::Connect(Int_t system)
2581 // Connect to the MySQL server of the given system: an FXS database (system < 3) or the Run logbook
2582 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2585 // check connection: if already connected return
2586 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2588 TString dbHost, dbUser, dbPass, dbName;
2590 if (system < 3) // FXS db servers
2592 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2593 dbUser = fConfig->GetFXSdbUser(system);
2594 dbPass = fConfig->GetFXSdbPass(system);
2595 dbName = fConfig->GetFXSdbName(system);
2596 } else { // Run & Shuttle logbook servers
2597 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2598 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2599 dbUser = fConfig->GetDAQlbUser();
2600 dbPass = fConfig->GetDAQlbPass();
2601 dbName = fConfig->GetDAQlbDB();
2604 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2605 if (!fServer[system] || !fServer[system]->IsConnected()) {
2608 AliError(Form("Can't establish connection to FXS database for %s",
2609 AliShuttleInterface::GetSystemName(system)));
2611 AliError("Can't establish connection to Run logbook.");
2613 if(fServer[system]) delete fServer[system];
2618 TSQLResult* aResult=0;
2621 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2624 aResult = fServer[kDCS]->GetTables(dbName.Data());
2627 aResult = fServer[kHLT]->GetTables(dbName.Data());
2630 aResult = fServer[3]->GetTables(dbName.Data());
2638 //______________________________________________________________________________________________
2639 Bool_t AliShuttle::UpdateTable()
2642 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2645 Bool_t result = kTRUE;
2647 for (UInt_t system=0; system<3; system++)
2649 if(!fFXSCalled[system]) continue;
2651 // check connection; connect if not yet connected
2652 if (!Connect(system))
2654 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2659 TTimeStamp now; // now
2661 // Loop on FXS list entries
2662 TIter iter(&fFXSlist[system]);
2663 TObjString *aFXSentry=0;
2664 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2666 TString aFXSentrystr = aFXSentry->String();
2667 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
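// Note: TString::Tokenize treats its argument as a set of delimiter characters
// ('#', '!' and '?'), not as one multi-character separator; tokens are separated by
// at least one of these characters. For example, an entry stored by GetFile as
// "PEDESTALS#!?!#LDC1" (hypothetical id and source) yields the two tokens "PEDESTALS"
// and "LDC1". An id or source that itself contains one of these characters would
// produce more than two tokens and fail the two-token check below.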
2668 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2670 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2671 GetSystemName(system), aFXSentrystr.Data()));
2672 if(aFXSarray) delete aFXSarray;
2676 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2677 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2679 TString whereClause;
2682 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2683 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2685 else if (system == kDCS)
2687 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2688 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2690 else if (system == kHLT)
2692 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2693 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2698 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2699 now.GetSec(), whereClause.Data());
2701 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2704 TSQLResult* aResult;
2705 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2708 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2709 GetSystemName(system), sqlQuery.Data()));
2720 //______________________________________________________________________________________________
2721 Bool_t AliShuttle::UpdateTableFailCase()
2723 // Update FXS table filling time_processed field in all rows corresponding to current run and detector.
2724 // This is called when the preprocessor is declared failed for the current run, because
2725 // UpdateTable updates the fields only in case of success.
2727 Bool_t result = kTRUE;
2729 for (UInt_t system=0; system<3; system++)
2731 // check connection; connect if not yet connected
2732 if (!Connect(system))
2734 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2735 GetSystemName(system)));
2740 TTimeStamp now; // now
2742 // Update all FXS rows of the current run and detector
2744 TString whereClause = Form("where run=%d and detector=\"%s\";",
2745 GetCurrentRun(), fCurrentDetector.Data());
2748 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2749 now.GetSec(), whereClause.Data());
2751 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2754 TSQLResult* aResult;
2755 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2758 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2759 GetSystemName(system), sqlQuery.Data()));
2769 //______________________________________________________________________________________________
2770 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2773 // Update Shuttle logbook filling detector or shuttle_done column
2774 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2777 // check connection; connect if not yet connected
2779 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2783 TString detName(detector);
2785 if (detName == "shuttle_done" || detName == "shuttle_ignored")
2787 setClause = "set shuttle_done=1";
2789 if (detName == "shuttle_done")
2791 // Send the information to ML
2792 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2795 mlList.Add(&mlStatus);
2798 mlID.Form("%d", GetCurrentRun());
2799 fMonaLisa->SendParameters(&mlList, mlID);
2802 TString statusStr(status);
2803 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2804 statusStr.Contains("failed", TString::kIgnoreCase)){
2805 setClause = Form("set %s=\"%s\"", detector, status);
2808 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2814 TString whereClause = Form("where run=%d", GetCurrentRun());
2816 TString sqlQuery = Form("update %s %s %s",
2817 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2819 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2822 TSQLResult* aResult;
2823 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2825 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2833 //______________________________________________________________________________________________
2834 Int_t AliShuttle::GetCurrentRun() const
2837 // Get current run from logbook entry
2840 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2843 //______________________________________________________________________________________________
2844 UInt_t AliShuttle::GetCurrentStartTime() const
2847 // Get current start time from logbook entry
2850 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2853 //______________________________________________________________________________________________
2854 UInt_t AliShuttle::GetCurrentEndTime() const
2857 // get current end time from logbook entry
2860 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2863 //______________________________________________________________________________________________
2864 UInt_t AliShuttle::GetCurrentYear() const
2867 // Get current year from logbook entry
2870 if (!fLogbookEntry) return 0;
2872 TTimeStamp startTime(GetCurrentStartTime());
2873 TString year = Form("%d",startTime.GetDate());
2879 //______________________________________________________________________________________________
2880 const char* AliShuttle::GetLHCPeriod() const
2883 // Get current LHC period from logbook entry
2886 if (!fLogbookEntry) return 0;
2888 return fLogbookEntry->GetRunParameter("LHCperiod");
2891 //______________________________________________________________________________________________
2892 void AliShuttle::Log(const char* detector, const char* message)
2895 // Fill log string with a message
2898 TString logRunDir = GetShuttleLogDir();
2899 if (GetCurrentRun() >=0)
2900 logRunDir += Form("/%d", GetCurrentRun());
2902 void* dir = gSystem->OpenDirectory(logRunDir.Data());
2904 if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
2905 AliError(Form("Can't create directory <%s>", logRunDir.Data()));
2910 gSystem->FreeDirectory(dir);
2913 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2914 if (GetCurrentRun() >= 0)
2915 toLog += Form("run %d - ", GetCurrentRun());
2916 toLog += Form("%s", message);
2918 AliInfo(toLog.Data());
2920 // if the log output is already redirected to the file, stop here
2921 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2924 TString fileName = GetLogFileName(detector);
2926 gSystem->ExpandPathName(fileName);
2929 logFile.open(fileName, ofstream::out | ofstream::app);
2931 if (!logFile.is_open()) {
2932 AliError(Form("Could not open file %s", fileName.Data()));
2936 logFile << toLog.Data() << "\n";
2941 //______________________________________________________________________________________________
2942 TString AliShuttle::GetLogFileName(const char* detector) const
2945 // returns the name of the log file for a given sub detector
2950 if (GetCurrentRun() >= 0)
2952 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
2953 detector, GetCurrentRun());
2955 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2961 //______________________________________________________________________________________________
2962 void AliShuttle::SendAlive()
2964 // sends alive message to ML
2966 TMonaLisaText mlStatus("SHUTTLE_status", "Alive");
2969 mlList.Add(&mlStatus);
2971 fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
2974 //______________________________________________________________________________________________
2975 Bool_t AliShuttle::Collect(Int_t run)
2978 // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
2979 // If a dedicated run is given this run is processed
2981 // In operational mode, this is the Shuttle function triggered by the EOR signal.
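// Usage sketch (illustrative; "shuttle" is an assumed AliShuttle instance and
// 12345 a placeholder run number):
//   shuttle.Collect();       // process every run still marked shuttle_done=0
//   shuttle.Collect(12345);  // process only run 12345, if it is not yet done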
2985 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2987 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2989 SetLastAction("Starting");
2991 // create ML instance
2993 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable());
2998 TString whereClause("where shuttle_done=0");
3000 whereClause += Form(" and run=%d", run);
3002 TObjArray shuttleLogbookEntries;
3003 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
3005 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3009 if (shuttleLogbookEntries.GetEntries() == 0)
3012 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
3014 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
3015 "or it does not exist in Shuttle logbook", run));
3019 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3020 fFirstUnprocessed[iDet] = kTRUE;
3024 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
3025 // flag them into fFirstUnprocessed array
3026 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
3027 TObjArray tmpLogbookEntries;
3028 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
3030 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3034 TIter iter(&tmpLogbookEntries);
3035 AliShuttleLogbookEntry* anEntry = 0;
3036 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
3038 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3040 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
3042 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
3043 anEntry->GetRun(), GetDetName(iDet)));
3044 fFirstUnprocessed[iDet] = kFALSE;
3052 if (!RetrieveConditionsData(shuttleLogbookEntries))
3054 Log("SHUTTLE", "Collect - Process of at least one run failed");
3058 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
3062 //______________________________________________________________________________________________
3063 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
3066 // Retrieve conditions data for all runs that aren't processed yet
3069 Bool_t hasError = kFALSE;
3071 TIter iter(&dateEntries);
3072 AliShuttleLogbookEntry* anEntry;
3074 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
3075 if (!Process(anEntry)){
3079 // clean SHUTTLE temp directory
3080 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
3081 //RemoveFile(filename.Data());
3084 return hasError == kFALSE;
3087 //______________________________________________________________________________________________
3088 ULong_t AliShuttle::GetTimeOfLastAction() const
3091 // Gets time of last action
3096 fMonitoringMutex->Lock();
3098 tmp = fLastActionTime;
3100 fMonitoringMutex->UnLock();
3105 //______________________________________________________________________________________________
3106 const TString AliShuttle::GetLastAction() const
3109 // returns a string description of the last action
3114 fMonitoringMutex->Lock();
3118 fMonitoringMutex->UnLock();
3123 //______________________________________________________________________________________________
3124 void AliShuttle::SetLastAction(const char* action)
3127 // updates the monitoring variables
3130 fMonitoringMutex->Lock();
3132 fLastAction = action;
3133 fLastActionTime = time(0);
3135 fMonitoringMutex->UnLock();
3138 //______________________________________________________________________________________________
3139 const char* AliShuttle::GetRunParameter(const char* param)
3142 // returns run parameter read from DAQ logbook
3145 if(!fLogbookEntry) {
3146 AliError("No logbook entry!");
3150 return fLogbookEntry->GetRunParameter(param);
3153 //______________________________________________________________________________________________
3154 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
3157 // returns object from OCDB valid for current run
3160 if (fTestMode & kErrorOCDB)
3162 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
3166 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
3169 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
3173 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
3176 //______________________________________________________________________________________________
3177 Bool_t AliShuttle::SendMail()
3180 // sends a mail to the subdetector expert in case of preprocessor error
3183 if (fTestMode != kNone)
3186 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3189 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3191 Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
3196 gSystem->FreeDirectory(dir);
3199 TString bodyFileName;
3200 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3201 gSystem->ExpandPathName(bodyFileName);
3204 mailBody.open(bodyFileName, ofstream::out);
3206 if (!mailBody.is_open())
3208 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3213 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3214 TObjString *anExpert=0;
3215 while ((anExpert = (TObjString*) iterExperts.Next()))
3217 to += Form("%s,", anExpert->GetName());
3219 to.Remove(to.Length()-1);
3220 AliDebug(2, Form("to: %s",to.Data()));
3223 Log("SHUTTLE", "List of detector responsibles not yet set!");
3227 TString cc="alberto.colla@cern.ch";
3229 TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
3230 fCurrentDetector.Data(), GetCurrentRun());
3231 AliDebug(2, Form("subject: %s", subject.Data()));
3233 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3234 body += Form("SHUTTLE just detected that your preprocessor "
3235 "failed processing run %d!!\n\n", GetCurrentRun());
3236 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3237 fCurrentDetector.Data());
3238 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3240 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3242 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD&time=168 \n\n");
3246 TString logFolder = "logs";
3247 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3248 logFolder += "_PROD";
3251 body += Form("Find the %s log for the current run on \n\n"
3252 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3253 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3254 fCurrentDetector.Data(), GetCurrentRun());
3255 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3257 AliDebug(2, Form("Body begin: %s", body.Data()));
3259 mailBody << body.Data();
3261 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3263 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(),
3264 GetCurrentRun(), fCurrentDetector.Data(), GetCurrentRun());
3265 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3266 if (gSystem->Exec(tailCommand.Data()))
3268 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3271 TString endBody = Form("------------------------------------------------------\n\n");
3272 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3273 endBody += "Please do not reply to this message directly; it is automatically generated.\n\n";
3274 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3276 AliDebug(2, Form("Body end: %s", endBody.Data()));
3278 mailBody << endBody.Data();
3283 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3287 bodyFileName.Data());
3288 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3290 Bool_t result = gSystem->Exec(mailCommand.Data());
3295 //______________________________________________________________________________________________
3296 Bool_t AliShuttle::SendMailToDCS()
3299 // sends a mail to the DCS experts in case of DCS error
3302 if (fTestMode != kNone)
3305 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3308 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3310 Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3315 gSystem->FreeDirectory(dir);
3318 TString bodyFileName;
3319 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3320 gSystem->ExpandPathName(bodyFileName);
3323 mailBody.open(bodyFileName, ofstream::out);
3325 if (!mailBody.is_open())
3327 Log("SHUTTLE", Form("SendMailToDCS - Could not open mail body file %s", bodyFileName.Data()));
3331 TString to="Vladimir.Fekete@cern.ch, Svetozar.Kapusta@cern.ch";
3332 //TString to="alberto.colla@cern.ch";
3333 AliDebug(2, Form("to: %s",to.Data()));
3336 Log("SHUTTLE", "List of detector responsibles not yet set!");
3340 TString cc="alberto.colla@cern.ch";
3342 TString subject = Form("Retrieval of data points for %s FAILED in run %d !",
3343 fCurrentDetector.Data(), GetCurrentRun());
3344 AliDebug(2, Form("subject: %s", subject.Data()));
3346 TString body = Form("Dear DCS experts, \n\n");
3347 body += Form("SHUTTLE couldn\'t retrieve the data points for detector %s "
3348 "in run %d!!\n\n", fCurrentDetector.Data(), GetCurrentRun());
3349 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3350 fCurrentDetector.Data());
3351 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3353 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3355 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD&time=168 \n\n");
3358 TString logFolder = "logs";
3359 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3360 logFolder += "_PROD";
3363 body += Form("Find the %s log for the current run on \n\n"
3364 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3365 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3366 fCurrentDetector.Data(), GetCurrentRun());
3367 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3369 AliDebug(2, Form("Body begin: %s", body.Data()));
3371 mailBody << body.Data();
3373 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3375 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3376 fCurrentDetector.Data(), GetCurrentRun());
3377 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3378 if (gSystem->Exec(tailCommand.Data()))
3380 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3383 TString endBody = Form("------------------------------------------------------\n\n");
3384 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3385 endBody += "Please do not reply to this message directly; it is automatically generated.\n\n";
3386 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3388 AliDebug(2, Form("Body end: %s", endBody.Data()));
3390 mailBody << endBody.Data();
3395 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3399 bodyFileName.Data());
3400 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3402 Bool_t result = gSystem->Exec(mailCommand.Data());
3407 //______________________________________________________________________________________________
3408 const char* AliShuttle::GetRunType()
3411 // returns run type read from "run type" logbook
3414 if(!fLogbookEntry) {
3415 AliError("No logbook entry!");
3419 return fLogbookEntry->GetRunType();
3422 //______________________________________________________________________________________________
3423 Bool_t AliShuttle::GetHLTStatus()
3425 // Return HLT status (ON=1 OFF=0)
3426 // Converts the HLT status from the status string read in the run logbook (not just a bool)
3428 if(!fLogbookEntry) {
3429 AliError("No logbook entry!");
3433 // TODO implement when HLTStatus is inserted in run logbook
3434 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
3435 //if(hltStatus == "OFF") {return kFALSE};
3440 //______________________________________________________________________________________________
3441 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
3444 // sets Shuttle temp directory
3447 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
3450 //______________________________________________________________________________________________
3451 void AliShuttle::SetShuttleLogDir(const char* logDir)
3454 // sets Shuttle log directory
3457 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);