/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *                                                                        *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 *                                                                        *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/
18 Revision 1.72 2007/12/13 15:44:28 acolla
19 Run type added in mail sent to detector expert (eases understanding)
21 Revision 1.71 2007/12/12 14:56:14 jgrosseo
22 sending shuttle_ignore to ML also in case of 0 events
24 Revision 1.70 2007/12/12 13:45:35 acolla
25 Monalisa started in Collect() function. Alive message to monitor is sent at each Collect and every minute during preprocessor processing.
27 Revision 1.69 2007/12/12 10:06:29 acolla
28 in AliShuttle.cxx: SHUTTLE logbook is updated in case of invalid run times:
30 time_start==0 && time_end==0
32 logbook is NOT updated if time_start != 0 && time_end == 0, because it may mean that the run is still ongoing.
34 Revision 1.68 2007/12/11 10:15:17 acolla
35 Added marking SHUTTLE=DONE for invalid runs
36 (invalid start time or end time) and runs with totalEvents < 1
38 Revision 1.67 2007/12/07 19:14:36 acolla
41 Added automatic collection of new runs on a regular time basis (settable from the configuration)
43 in AliShuttleConfig: new members
45 - triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
46 - mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)
50 - logs now stored in logs/#RUN/DET_#RUN.log
52 Revision 1.66 2007/12/05 10:45:19 jgrosseo
53 changed order of arguments to TMonaLisaWriter
55 Revision 1.65 2007/11/26 16:58:37 acolla
56 Monalisa configuration added: host and table name
58 Revision 1.64 2007/11/13 16:15:47 acolla
59 DCS map is stored in a file in the temp folder where the detector is processed.
60 If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.
62 Revision 1.63 2007/11/02 10:53:16 acolla
63 Protection added to AliShuttle::CopyFileLocally
65 Revision 1.62 2007/10/31 18:23:13 acolla
Further development on the Shuttle:
68 - Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
69 are now built from /alice/data, e.g.:
70 /alice/data/2007/LHC07a/OCDB
72 the year and LHC period are taken from the Shuttle.
73 Raw metadata files are stored by GRP to:
74 /alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root
76 - Shuttle sends a mail to DCS experts each time DP retrieval fails.
78 Revision 1.61 2007/10/30 20:33:51 acolla
79 Improved managing of temporary folders, which weren't correctly handled.
80 Resolved bug introduced in StoreReferenceFile, which caused SPD preprocessor fail.
82 Revision 1.60 2007/10/29 18:06:16 acolla
84 New function StoreRunMetadataFile added to preprocessor and Shuttle interface
85 This function can be used by GRP only. It stores raw data tags merged file to the
86 raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).
90 1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
91 2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
92 has been corrected in the root package on the Shuttle machine.
94 Revision 1.59 2007/10/05 12:40:55 acolla
96 Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).
98 Revision 1.58 2007/09/28 15:27:40 acolla
100 AliDCSClient "multiSplit" option added in the DCS configuration
101 in AliDCSMessage: variable MAX_BODY_SIZE set to 500000
103 Revision 1.57 2007/09/27 16:53:13 acolla
104 Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
105 merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.
107 Revision 1.56 2007/09/14 16:46:14 jgrosseo
108 1) Connect and Close are called before and after each query, so one can
109 keep the same AliDCSClient object.
110 2) The splitting of a query is moved to GetDPValues/GetAliasValues.
111 3) Splitting interval can be specified in constructor
113 Revision 1.55 2007/08/06 12:26:40 acolla
114 Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
115 read from the run logbook.
117 Revision 1.54 2007/07/12 09:51:25 jgrosseo
118 removed duplicated log message in GetFile
120 Revision 1.53 2007/07/12 09:26:28 jgrosseo
121 updating hlt fxs base path
123 Revision 1.52 2007/07/12 08:06:45 jgrosseo
124 adding log messages in getfile... functions
125 adding not implemented copy constructor in alishuttleconfigholder
127 Revision 1.51 2007/07/03 17:24:52 acolla
128 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
130 Revision 1.50 2007/07/02 17:19:32 acolla
131 preprocessor is run in a temp directory that is removed when process is finished.
133 Revision 1.49 2007/06/29 10:45:06 acolla
134 Number of columns in MySql Shuttle logbook increased by one (HLT added)
136 Revision 1.48 2007/06/21 13:06:19 acolla
137 GetFileSources returns dummy list with 1 source if system=DCS (better than
138 returning error as it was)
140 Revision 1.47 2007/06/19 17:28:56 acolla
141 HLT updated; missing map bug removed.
143 Revision 1.46 2007/06/09 13:01:09 jgrosseo
144 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
146 Revision 1.45 2007/05/30 06:35:20 jgrosseo
147 Adding functionality to the Shuttle/TestShuttle:
148 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
149 o) Function to retrieve list of IDs for a given source (GetFileIDs)
150 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
151 Example code has been added to the TestProcessor in TestShuttle
153 Revision 1.44 2007/05/11 16:09:32 acolla
154 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
155 example: ITS/SPD/100_filename.root
157 Revision 1.43 2007/05/10 09:59:51 acolla
158 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
160 Revision 1.42 2007/05/03 08:01:39 jgrosseo
161 typo in last commit :-(
163 Revision 1.41 2007/05/03 08:00:48 jgrosseo
164 fixing log message when pp want to skip dcs value retrieval
166 Revision 1.40 2007/04/27 07:06:48 jgrosseo
167 GetFileSources returns empty list in case of no files, but successful query
168 No mails sent in testmode
170 Revision 1.39 2007/04/17 12:43:57 acolla
171 Correction in StoreOCDB; change of text in mail to detector expert
173 Revision 1.38 2007/04/12 08:26:18 jgrosseo
176 Revision 1.37 2007/04/10 16:53:14 jgrosseo
177 redirecting sub detector stdout, stderr to sub detector log file
179 Revision 1.35 2007/04/04 16:26:38 acolla
180 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
181 2. Added missing dependency in test preprocessors.
182 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
184 Revision 1.34 2007/04/04 10:33:36 jgrosseo
185 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
186 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
188 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
190 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
192 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
5) New function AliPreprocessor::ProcessDCS(). If you do not always need DCS data, you can skip its retrieval by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
If you always need DCS data (like before), you do not need to implement it.
197 6) The run type has been added to the monitoring page
199 Revision 1.33 2007/04/03 13:56:01 acolla
200 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
203 Revision 1.32 2007/02/28 10:41:56 acolla
204 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
205 AliPreprocessor::GetRunType() function.
206 Added some ldap definition files.
208 Revision 1.30 2007/02/13 11:23:21 acolla
209 Moved getters and setters of Shuttle's main OCDB/Reference, local
210 OCDB/Reference, temp and log folders to AliShuttleInterface
212 Revision 1.27 2007/01/30 17:52:42 jgrosseo
213 adding monalisa monitoring
215 Revision 1.26 2007/01/23 19:20:03 acolla
216 Removed old ldif files, added TOF, MCH ldif files. Added some options in
217 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
220 Revision 1.25 2007/01/15 19:13:52 acolla
221 Moved some AliInfo to AliDebug in SendMail function
223 Revision 1.21 2006/12/07 08:51:26 jgrosseo
225 table, db names in ldap configuration
226 added GRP preprocessor
227 DCS data can also be retrieved by data point
229 Revision 1.20 2006/11/16 16:16:48 jgrosseo
230 introducing strict run ordering flag
231 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
233 Revision 1.19 2006/11/06 14:23:04 jgrosseo
234 major update (Alberto)
235 o) reading of run parameters from the logbook
236 o) online offline naming conversion
237 o) standalone DCSclient package
239 Revision 1.18 2006/10/20 15:22:59 jgrosseo
240 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
241 o) Merging Collect, CollectAll, CollectNew function
242 o) Removing implementation of empty copy constructors (declaration still there!)
244 Revision 1.17 2006/10/05 16:20:55 jgrosseo
245 adapting to new CDB classes
247 Revision 1.16 2006/10/05 15:46:26 jgrosseo
248 applying to the new interface
250 Revision 1.15 2006/10/02 16:38:39 jgrosseo
253 storing of objects that failed to be stored to the grid before
254 interfacing of shuttle status table in daq system
256 Revision 1.14 2006/08/29 09:16:05 jgrosseo
259 Revision 1.13 2006/08/15 10:50:00 jgrosseo
260 effc++ corrections (alberto)
262 Revision 1.12 2006/08/08 14:19:29 jgrosseo
263 Update to shuttle classes (Alberto)
265 - Possibility to set the full object's path in the Preprocessor's and
266 Shuttle's Store functions
267 - Possibility to extend the object's run validity in the same classes
268 ("startValidity" and "validityInfinite" parameters)
269 - Implementation of the StoreReferenceData function to store reference
270 data in a dedicated CDB storage.
272 Revision 1.11 2006/07/21 07:37:20 jgrosseo
273 last run is stored after each run
275 Revision 1.10 2006/07/20 09:54:40 jgrosseo
276 introducing status management: The processing per subdetector is divided into several steps,
277 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
278 can keep track of the number of failures and skips further processing after a certain threshold is
279 exceeded. These thresholds can be configured in LDAP.
281 Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)
284 Revision 1.8 2006/07/11 12:44:36 jgrosseo
285 adding parameters for extended validity range of data produced by preprocessor
287 Revision 1.7 2006/07/10 14:37:09 jgrosseo
288 small fix + todo comment
290 Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)
293 Revision 1.5 2006/07/04 14:59:57 jgrosseo
294 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
296 Revision 1.4 2006/06/12 09:11:16 jgrosseo
297 coding conventions (Alberto)
299 Revision 1.3 2006/06/06 14:26:40 jgrosseo
300 o) removed files that were moved to STEER
301 o) shuttle updated to follow the new interface (Alberto)
303 Revision 1.2 2006/03/07 07:52:34 hristov
304 New version (B.Yordanov)
306 Revision 1.6 2005/11/19 17:19:14 byordano
307 RetrieveDATEEntries and RetrieveConditionsData added
309 Revision 1.5 2005/11/19 11:09:27 byordano
310 AliShuttle declaration added
312 Revision 1.4 2005/11/17 17:47:34 byordano
313 TList changed to TObjArray
315 Revision 1.3 2005/11/17 14:43:23 byordano
318 Revision 1.1.1.1 2005/10/28 07:33:58 hristov
319 Initial import as subdirectory in AliRoot
321 Revision 1.2 2005/09/13 08:41:15 byordano
322 default startTime endTime added
324 Revision 1.4 2005/08/30 09:13:02 byordano
327 Revision 1.3 2005/08/29 21:15:47 byordano
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it is used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically in the underlying AliCDBStorage.
// The alias name is used as detSpec.
345 #include "AliShuttle.h"
347 #include "AliCDBManager.h"
348 #include "AliCDBStorage.h"
349 #include "AliCDBId.h"
350 #include "AliCDBRunRange.h"
351 #include "AliCDBPath.h"
352 #include "AliCDBEntry.h"
353 #include "AliShuttleConfig.h"
354 #include "DCSClient/AliDCSClient.h"
356 #include "AliPreprocessor.h"
357 #include "AliShuttleStatus.h"
358 #include "AliShuttleLogbookEntry.h"
363 #include <TTimeStamp.h>
364 #include <TObjString.h>
365 #include <TSQLServer.h>
366 #include <TSQLResult.h>
369 #include <TSystemDirectory.h>
370 #include <TSystemFile.h>
373 #include <TGridResult.h>
375 #include <TMonaLisaWriter.h>
379 #include <sys/types.h>
380 #include <sys/wait.h>
384 //______________________________________________________________________________________________
385 AliShuttle::AliShuttle(const AliShuttleConfig* config,
386 UInt_t timeout, Int_t retries):
388 fTimeout(timeout), fRetries(retries),
398 fReadTestMode(kFALSE),
399 fOutputRedirected(kFALSE)
402 // config: AliShuttleConfig used
403 // timeout: timeout used for AliDCSClient connection
404 // retries: the number of retries in case of connection error.
407 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
408 for(int iSys=0;iSys<4;iSys++) {
411 fFXSlist[iSys].SetOwner(kTRUE);
413 fPreprocessorMap.SetOwner(kTRUE);
415 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
416 fFirstUnprocessed[iDet] = kFALSE;
418 fMonitoringMutex = new TMutex();
421 //______________________________________________________________________________________________
422 AliShuttle::~AliShuttle()
428 fPreprocessorMap.DeleteAll();
429 for(int iSys=0;iSys<4;iSys++)
431 fServer[iSys]->Close();
432 delete fServer[iSys];
441 if (fMonitoringMutex)
443 delete fMonitoringMutex;
444 fMonitoringMutex = 0;
448 //______________________________________________________________________________________________
449 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
// Registers a new AliPreprocessor.
// GetName() is used as the identifier of the preprocessor.
// The preprocessor is registered only if no other one
// with the same identifier (GetName()) is already registered.
458 const char* detName = preprocessor->GetName();
459 if(GetDetPos(detName) < 0)
460 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
462 if (fPreprocessorMap.GetValue(detName)) {
463 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
467 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
469 //______________________________________________________________________________________________
470 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
471 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
473 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
474 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
475 // using this function. Use StoreReferenceData instead!
476 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
477 // finishes the data are transferred to the main storage (Grid).
479 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
482 //______________________________________________________________________________________________
483 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
// Stores a CDB object in the storage for reference data. These objects will not be available during
// offline reconstruction. Use this function for reference data only!
487 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
488 // finishes the data are transferred to the main storage (Grid).
490 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
493 //______________________________________________________________________________________________
494 Bool_t AliShuttle::StoreLocally(const TString& localUri,
495 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
496 Int_t validityStart, Bool_t validityInfinite)
// Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
// When the preprocessor finishes, the data are transferred to the main storage (Grid).
// The parameters are:
// 1) URI of the backup storage (local)
// 2) the object's path
// 3) the object to be stored
// 4) the metaData to be associated with the object
// 5) the validity start, given as a number of runs before the current run;
// if the data is valid only for this run, leave the default 0
// 6) specifies whether the calibration data is valid up to infinity (i.e. until updated),
// typical for calibration runs; the default is kFALSE
// Returns kFALSE on failure, kTRUE otherwise
512 if (fTestMode & kErrorStorage)
514 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
518 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
520 Int_t firstRun = GetCurrentRun() - validityStart;
522 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
527 if(validityInfinite) {
528 lastRun = AliCDBRunRange::Infinity();
530 lastRun = GetCurrentRun();
533 // Version is set to current run, it will be used later to transfer data to Grid
534 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
536 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
537 TObjString runUsed = Form("%d", GetCurrentRun());
538 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
541 Bool_t result = kFALSE;
543 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
544 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
546 result = AliCDBManager::Instance()->GetStorage(localUri)
547 ->Put(object, id, metaData);
552 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
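// Illustrative sketch, not part of the Shuttle code: the run validity range that
// StoreLocally derives from validityStart and validityInfinite could be computed
// roughly as below. SketchRunRange and kSketchRunInfinity are hypothetical names;
// the constant is only a stand-in for AliCDBRunRange::Infinity(), whose actual
// value is not shown in this file.

```cpp
#include <algorithm>
#include <utility>

// Assumed placeholder for AliCDBRunRange::Infinity() (hypothetical value).
const int kSketchRunInfinity = 999999999;

// Returns {firstRun, lastRun} for an object stored while processing currentRun.
// validityStart counts runs backwards from currentRun (0 = this run only).
std::pair<int, int> SketchRunRange(int currentRun, int validityStart, bool validityInfinite)
{
    // A negative first run is clamped to 0, as in the warning branch above.
    int firstRun = std::max(0, currentRun - validityStart);
    // The last run is either "infinity" or the current run.
    int lastRun = validityInfinite ? kSketchRunInfinity : currentRun;
    return std::make_pair(firstRun, lastRun);
}
```

// Usage: SketchRunRange(100, 3, true) yields the range [97, kSketchRunInfinity].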
558 //______________________________________________________________________________________________
559 Bool_t AliShuttle::StoreOCDB()
562 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
563 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
564 // Then calls StoreRefFilesToGrid to store reference files.
567 if (fTestMode & kErrorGrid)
569 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
570 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
574 Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
575 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
577 Log("SHUTTLE","StoreOCDB - Storing reference data ...");
578 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
580 Log("SHUTTLE","StoreOCDB - Storing reference files ...");
581 Bool_t resultRefFiles = CopyFilesToGrid("reference");
583 Bool_t resultMetadata = kTRUE;
584 if(fCurrentDetector == "GRP")
Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
587 resultMetadata = CopyFilesToGrid("metadata");
590 return resultCDB && resultRef && resultRefFiles && resultMetadata;
593 //______________________________________________________________________________________________
594 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
597 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
600 TObjArray* gridIds=0;
602 Bool_t result = kTRUE;
604 const char* type = 0;
606 if(gridURI == fgkMainCDB) {
608 localURI = fgkLocalCDB;
609 } else if(gridURI == fgkMainRefStorage) {
611 localURI = fgkLocalRefStorage;
613 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
617 AliCDBManager* man = AliCDBManager::Instance();
619 AliCDBStorage *gridSto = man->GetStorage(gridURI);
622 Form("StoreOCDB - cannot activate main %s storage", type));
626 gridIds = gridSto->GetQueryCDBList();
628 // get objects previously stored in local CDB
629 AliCDBStorage *localSto = man->GetStorage(localURI);
632 Form("StoreOCDB - cannot activate local %s storage", type));
635 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
636 // Local objects were stored with current run as Grid version!
637 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
638 localEntries->SetOwner(1);
640 // loop on local stored objects
641 TIter localIter(localEntries);
642 AliCDBEntry *aLocEntry = 0;
643 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
644 aLocEntry->SetOwner(1);
645 AliCDBId aLocId = aLocEntry->GetId();
646 aLocEntry->SetVersion(-1);
647 aLocEntry->SetSubVersion(-1);
649 // If local object is valid up to infinity we store it only if it is
650 // the first unprocessed run!
651 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
652 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
654 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
655 "there are previous unprocessed runs!",
656 fCurrentDetector.Data(), aLocId.GetPath().Data()));
660 // loop on Grid valid Id's
661 Bool_t store = kTRUE;
662 TIter gridIter(gridIds);
663 AliCDBId* aGridId = 0;
664 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
665 if(aGridId->GetPath() != aLocId.GetPath()) continue;
666 // skip all objects valid up to infinity
667 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
668 // if we get here, it means there's already some more recent object stored on Grid!
673 // If we get here, the file can be stored!
674 Bool_t storeOk = gridSto->Put(aLocEntry);
675 if(!store || storeOk){
679 Log(fCurrentDetector.Data(),
680 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
681 type, aGridId->ToString().Data()));
684 Form("StoreOCDB - Object <%s> successfully put into %s storage",
685 aLocId.ToString().Data(), type));
686 Log(fCurrentDetector.Data(),
687 Form("StoreOCDB - Object <%s> successfully put into %s storage",
688 aLocId.ToString().Data(), type));
691 // removing local filename...
693 localSto->IdToFilename(aLocId, filename);
694 Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
695 RemoveFile(filename.Data());
699 Form("StoreOCDB - Grid %s storage of object <%s> failed",
700 type, aLocId.ToString().Data()));
701 Log(fCurrentDetector.Data(),
702 Form("StoreOCDB - Grid %s storage of object <%s> failed",
703 type, aLocId.ToString().Data()));
707 localEntries->Clear();
712 //______________________________________________________________________________________________
713 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
715 // clears the directory used to store reference files of a given subdetector
717 AliCDBManager* man = AliCDBManager::Instance();
718 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
719 TString localBaseFolder = sto->GetBaseFolder();
721 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
723 Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
726 begin.Form("%d_", GetCurrentRun());
728 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
732 TList* dirList = baseDir->GetListOfFiles();
735 if (!dirList) return kTRUE;
737 if (dirList->GetEntries() < 3)
743 Int_t nDirs = 0, nDel = 0;
744 TIter dirIter(dirList);
745 TSystemFile* entry = 0;
747 Bool_t success = kTRUE;
749 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
751 if (entry->IsDirectory())
754 TString fileName(entry->GetName());
755 if (!fileName.BeginsWith(begin))
761 Int_t result = gSystem->Unlink(fileName.Data());
765 Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
773 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
774 nDel, nDirs, targetDir.Data()));
785 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
789 result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
792 Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
797 result = gSystem->mkdir(targetDir, kTRUE);
800 Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
807 //______________________________________________________________________________________________
808 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
811 // Stores reference file directly (without opening it). This function stores the file locally.
813 // The file is stored under the following location:
814 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
// where <gridFileName> is the last parameter given to the function
818 if (fTestMode & kErrorStorage)
820 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
824 AliCDBManager* man = AliCDBManager::Instance();
825 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
827 TString localBaseFolder = sto->GetBaseFolder();
829 TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);
830 target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
832 return CopyFileLocally(localFile, target);
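// Illustrative sketch, not part of the Shuttle code: how the target name
// <prefix>/<RUN#>_<gridFileName> is assembled. SketchRefFileTarget is a
// hypothetical helper; "prefix" stands for whatever GetRefFilePrefix returns
// for the detector and is simply taken as an input here.

```cpp
#include <string>

// Builds "<prefix>/<run>_<gridFileName>" from an already resolved folder prefix.
std::string SketchRefFileTarget(const std::string& prefix, int run,
                                const std::string& gridFileName)
{
    return prefix + "/" + std::to_string(run) + "_" + gridFileName;
}
```

// Usage: SketchRefFileTarget("/tmp/ref/ITS/SPD", 100, "filename.root")
// gives "/tmp/ref/ITS/SPD/100_filename.root" (the base folder is made up).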
835 //______________________________________________________________________________________________
836 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
839 // Stores Run metadata file to the Grid, in the run folder
841 // Only GRP can call this function.
843 if (fTestMode & kErrorStorage)
845 Log(fCurrentDetector, "StoreRunMetaDataFile - In TESTMODE - Simulating error while storing locally");
849 AliCDBManager* man = AliCDBManager::Instance();
850 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
852 TString localBaseFolder = sto->GetBaseFolder();
854 // Build Run level folder
855 // folder = /alice/data/year/lhcPeriod/runNb/Raw
858 TString lhcPeriod = GetLHCPeriod();
859 if (lhcPeriod.Length() == 0)
861 Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
865 TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s",
866 localBaseFolder.Data(), GetCurrentYear(),
867 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
869 return CopyFileLocally(localFile, target);
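// Illustrative sketch, not part of the Shuttle code: the same path layout as the
// Form() call above, written with the standard library only. The run number is
// zero-padded to nine digits (%09d). SketchRunMetadataTarget and the example
// values are hypothetical.

```cpp
#include <cstdio>
#include <string>

// Builds <localBase>/GRP/RunMetadata/alice/data/<year>/<period>/<9-digit run>/Raw/<file>.
std::string SketchRunMetadataTarget(const std::string& localBase, int year,
                                    const std::string& lhcPeriod, int run,
                                    const std::string& fileName)
{
    char runStr[32];
    std::snprintf(runStr, sizeof(runStr), "%09d", run); // zero-pad to 9 digits
    return localBase + "/GRP/RunMetadata/alice/data/" + std::to_string(year) + "/" +
           lhcPeriod + "/" + runStr + "/Raw/" + fileName;
}
```

// Usage: with localBase "/tmp/ref", year 2007, period "LHC07a" and run 12345,
// the run folder component becomes "000012345".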
872 //______________________________________________________________________________________________
873 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
876 // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
877 // Files are temporarily stored in the local reference storage. When the preprocessor
878 // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn
879 // (in reference or run level folders)
882 TString targetDir(target(0, target.Last('/')));
// try to open the target folder; create it if it does not exist
885 void* dir = gSystem->OpenDirectory(targetDir.Data());
887 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
888 Log("SHUTTLE", Form("StoreFileLocally - Can't open directory <%s>", targetDir.Data()));
893 gSystem->FreeDirectory(dir);
898 result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
901 Log("SHUTTLE", Form("StoreFileLocally - %s does not exist", localFile));
905 result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
Log("SHUTTLE", Form("StoreFileLocally - target file %s already exists, removing...", target.Data()));
909 if (gSystem->Unlink(target.Data()))
911 Log("SHUTTLE", Form("StoreFileLocally - Could not remove existing target file %s!", target.Data()));
916 result = gSystem->CopyFile(localFile, target);
920 Log("SHUTTLE", Form("StoreFileLocally - File %s stored locally to %s", localFile, target.Data()));
925 Log("SHUTTLE", Form("StoreFileLocally - Could not store file %s to %s! Error code = %d",
926 localFile, target.Data(), result));
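// Illustrative sketch, not part of the Shuttle code: CopyFileLocally derives the
// folder to create from the target path by cutting at the last '/'. The same idea
// with std::string is shown below. SketchParentDir is a hypothetical helper, and
// returning an empty string when there is no '/' is a choice made for this sketch.

```cpp
#include <string>

// Returns everything before the last '/' of target, or "" if there is none.
std::string SketchParentDir(const std::string& target)
{
    const std::string::size_type pos = target.rfind('/');
    if (pos == std::string::npos) return std::string(); // no directory part
    return target.substr(0, pos);
}
```

// Usage: SketchParentDir("/a/b/100_f.root") gives "/a/b".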
934 //______________________________________________________________________________________________
935 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
938 // Transfers local files to the Grid. Local files can be reference files
939 // or run metadata file (from GRP only).
// Depending on the type ("reference" or "metadata"), the files are stored at the following location:
// reference --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
// metadata --> <run data folder>/<MetadataFileName>
946 AliCDBManager* man = AliCDBManager::Instance();
947 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
950 TString localBaseFolder = sto->GetBaseFolder();
956 if (strcmp(type, "reference") == 0)
958 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
959 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
962 TString gridBaseFolder = gridSto->GetBaseFolder();
963 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
964 begin = Form("%d_", GetCurrentRun());
966 else if (strcmp(type, "metadata") == 0)
969 TString lhcPeriod = GetLHCPeriod();
971 if (lhcPeriod.Length() == 0)
973 Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
977 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
978 localBaseFolder.Data(), GetCurrentYear(),
979 lhcPeriod.Data(), GetCurrentRun());
980 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
986 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
990 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
994 TList* dirList = baseDir->GetListOfFiles();
997 if (!dirList) return kTRUE;
999 if (dirList->GetEntries() < 3)
1007 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
1012 Int_t nDirs = 0, nTransfer = 0;
1013 TIter dirIter(dirList);
1014 TSystemFile* entry = 0;
1016 Bool_t success = kTRUE;
1017 Bool_t first = kTRUE;
1019 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1021 if (entry->IsDirectory())
1024 TString fileName(entry->GetName());
1025 if (!fileName.BeginsWith(begin))
1033 // check that folder exists, otherwise create it
1034 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1042 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
1044 // TODO It does not work currently! Bug in TAliEn::Mkdir
1045 // TODO Manually fixed in local root v5-16-00
1046 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1048 Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1053 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1057 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1061 TString fullLocalPath;
1062 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1064 TString fullGridPath;
1065 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1067 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1071 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!",
1072 fullLocalPath.Data(), fullGridPath.Data()));
1073 RemoveFile(fullLocalPath);
1078 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!",
1079 fullLocalPath.Data(), fullGridPath.Data()));
1084 Log("SHUTTLE", Form("CopyFilesToGrid - %d (over %d) files in folder %s copied to Grid.",
1085 nTransfer, nDirs, dir.Data()));
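/*
 Illustrative sketch only, not part of AliShuttle: it shows how the "metadata"
 branch of CopyFilesToGrid derives the Grid folder from the local folder by
 keeping everything from "/alice/data/" onwards. MetadataDir and
 GridDirFromLocal are hypothetical helper names, std::string stands in for
 TString, and the base folder and LHC period used in the example are made up.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Builds <base>/GRP/RunMetadata/alice/data/<year>/<period>/<run, 9 digits>/Raw,
// following the Form() pattern used in CopyFilesToGrid.
std::string MetadataDir(const std::string& base, int year,
                        const std::string& lhcPeriod, int run)
{
    assert(run >= 0);
    char buf[1024];
    std::snprintf(buf, sizeof(buf),
                  "%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
                  base.c_str(), year, lhcPeriod.c_str(), run);
    return std::string(buf);
}

// Keeps the part of the local path starting at "/alice/data/".
// Returns an empty string if the marker is not found.
std::string GridDirFromLocal(const std::string& localDir)
{
    const std::string::size_type pos = localDir.find("/alice/data/");
    return pos == std::string::npos ? std::string() : localDir.substr(pos);
}
```

 With base "/tmp/ref", year 2007, period "LHC07w" and run 12345, the local
 folder ends in ".../alice/data/2007/LHC07w/000012345/Raw" and the Grid folder
 is that same tail starting at "/alice/data/".
*/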
1092 //______________________________________________________________________________________________
1093 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1096 // Get folder name of reference files
1099 TString offDetStr(GetOfflineDetName(detector));
1101 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1103 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1105 dir.Form("%s/%s", base, offDetStr.Data());
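/*
 Illustrative sketch only, not part of AliShuttle: the folder rule implemented
 by GetRefFilePrefix, written with std::string. RefFilePrefix is a hypothetical
 name, and the offline detector name is passed in by the caller here, whereas
 the real code obtains it from GetOfflineDetName. The detector names in the
 note below are assumed examples.

```cpp
#include <cassert>
#include <string>

// offlineName: offline detector name; onlineName: detector name as known to
// the Shuttle. ITS, MUON and PHOS get an extra sub-folder per online detector;
// every other detector uses <base>/<offlineName> only.
std::string RefFilePrefix(const std::string& base,
                          const std::string& offlineName,
                          const std::string& onlineName)
{
    assert(!base.empty());
    if (offlineName == "ITS" || offlineName == "MUON" || offlineName == "PHOS")
        return base + "/" + offlineName + "/" + onlineName;
    return base + "/" + offlineName;
}
```

 For instance, RefFilePrefix("/ref", "ITS", "SPD") gives "/ref/ITS/SPD", while
 RefFilePrefix("/ref", "TPC", "TPC") gives "/ref/TPC".
*/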
1113 //______________________________________________________________________________________________
1114 void AliShuttle::CleanLocalStorage(const TString& uri)
1117 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
1120 const char* type = 0;
1121 if(uri == fgkLocalCDB) {
1123 } else if(uri == fgkLocalRefStorage) {
1126 AliError(Form("Invalid storage URI: %s", uri.Data()));
1130 AliCDBManager* man = AliCDBManager::Instance();
1132 // open local storage
1133 AliCDBStorage *localSto = man->GetStorage(uri);
1136 Form("CleanLocalStorage - cannot activate local %s storage", type));
1140 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1141 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1143 AliDebug(2, Form("filename = %s", filename.Data()));
1145 Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1146 GetCurrentRun(), fCurrentDetector.Data()));
1148 RemoveFile(filename.Data());
1152 //______________________________________________________________________________________________
1153 void AliShuttle::RemoveFile(const char* filename)
1156 // removes local file
1159 TString command(Form("rm -f %s", filename));
1161 Int_t result = gSystem->Exec(command.Data());
1164 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1165 fCurrentDetector.Data(), filename));
1169 //______________________________________________________________________________________________
1170 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1173 // Reads the AliShuttleStatus from the CDB
1177 delete fStatusEntry;
1181 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1182 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1184 if (!fStatusEntry) return 0;
1185 fStatusEntry->SetOwner(1);
1187 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1189 AliError("Invalid object stored to CDB!");
1196 //______________________________________________________________________________________________
1197 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1200 // writes the status for one subdetector
1204 delete fStatusEntry;
1208 Int_t run = GetCurrentRun();
1210 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1212 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1213 fStatusEntry->SetOwner(1);
1215 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1218 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1219 fCurrentDetector.Data(), run));
1228 //______________________________________________________________________________________________
1229 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1232 // changes the AliShuttleStatus for the given detector and run to the given status
1236 AliError("UNEXPECTED: fStatusEntry empty");
1240 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1243 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1247 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1248 fCurrentDetector.Data(),
1249 status->GetStatusName(),
1250 status->GetStatusName(newStatus));
1251 Log("SHUTTLE", actionStr);
1252 SetLastAction(actionStr);
1254 status->SetStatus(newStatus);
1255 if (increaseCount) status->IncreaseCount();
1257 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1262 //______________________________________________________________________________________________
1263 void AliShuttle::SendMLInfo()
1266 // sends ML information about the current status of the current detector being processed
1269 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1272 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1276 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1277 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1280 mlList.Add(&mlStatus);
1281 mlList.Add(&mlRetryCount);
1284 mlID.Form("%d", GetCurrentRun());
1285 fMonaLisa->SendParameters(&mlList, mlID);
1288 //______________________________________________________________________________________________
1289 Bool_t AliShuttle::ContinueProcessing()
1291 // Reads the AliShuttleStatus information from the CDB and
1292 // checks whether the processing should be continued.
1293 // If so, it returns kTRUE and sets the AliShuttleStatus to kStarted.
1295 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1297 AliPreprocessor* aPreprocessor =
1298 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1301 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1305 AliShuttleLogbookEntry::Status entryStatus =
1306 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1308 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1309 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1310 fCurrentDetector.Data(),
1311 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1315 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1317 // check if current run is first unprocessed run for current detector
1318 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1319 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1321 if (fTestMode == kNone)
1323 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
1324 " but this is not the first unprocessed run!", fCurrentDetector.Data()));
1329 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
1330 "Although %s requires strict run ordering "
1331 "and this is not the first unprocessed run, "
1332 "the SHUTTLE continues", fCurrentDetector.Data()));
1336 AliShuttleStatus* status = ReadShuttleStatus();
1339 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1340 fCurrentDetector.Data()));
1341 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1342 return WriteShuttleStatus(status);
1345 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1346 // If it happens it may mean Logbook updating failed... let's do it now!
1347 if (status->GetStatus() == AliShuttleStatus::kDone ||
1348 status->GetStatus() == AliShuttleStatus::kFailed){
1349 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1350 fCurrentDetector.Data(),
1351 status->GetStatusName(status->GetStatus())));
1352 UpdateShuttleLogbook(fCurrentDetector.Data(),
1353 status->GetStatusName(status->GetStatus()));
1357 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1359 Form("ContinueProcessing - %s: Grid storage of one or more "
1360 "objects failed. Trying again now",
1361 fCurrentDetector.Data()));
1362 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1364 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1365 "successfully stored into main storage",
1366 fCurrentDetector.Data()));
1369 Form("ContinueProcessing - %s: Grid storage failed again",
1370 fCurrentDetector.Data()));
1371 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1376 // if we get here, there is a restart
1377 Bool_t cont = kFALSE;
1380 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1381 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1382 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1383 status->GetCount(), status->GetStatusName()));
1384 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1385 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1387 // there may still be objects in local OCDB and reference storage
1388 // and FXS databases may be not updated: do it now!
1390 // TODO Currently disabled, we want to keep files in case of failure!
1391 // CleanLocalStorage(fgkLocalCDB);
1392 // CleanLocalStorage(fgkLocalRefStorage);
1393 // UpdateTableFailCase();
1395 // Send mail to detector expert!
1396 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...",
1397 fCurrentDetector.Data()));
1399 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1400 fCurrentDetector.Data()));
1403 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1404 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1405 status->GetStatusName(), status->GetCount()));
1406 Bool_t increaseCount = kTRUE;
1407 if (status->GetStatus() == AliShuttleStatus::kDCSError ||
1408 status->GetStatus() == AliShuttleStatus::kDCSStarted)
1409 increaseCount = kFALSE;
1411 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
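/*
 Illustrative sketch only, not part of AliShuttle: the restart decision taken
 at the end of ContinueProcessing, reduced to a pure function. The Status enum
 is a cut-down stand-in for AliShuttleStatus::Status, and RestartDecision and
 DecideRestart are hypothetical names introduced for this example.

```cpp
#include <cassert>

enum class Status { kStarted, kDCSStarted, kDCSError, kPPStarted, kPPError };

struct RestartDecision {
    bool giveUp;         // retry budget exhausted: mark the detector FAILED
    bool increaseCount;  // whether this restart counts as a retry
};

// count: retries recorded so far; maxRetries: configured limit.
RestartDecision DecideRestart(Status last, unsigned count, unsigned maxRetries)
{
    assert(maxRetries > 0);
    if (count >= maxRetries) return {true, false};
    // Aborts during DCS retrieval do not consume a retry.
    const bool dcsProblem =
        (last == Status::kDCSError || last == Status::kDCSStarted);
    return {false, !dcsProblem};
}
```

 A detector that stopped in kPPError with count 1 of 3 is restarted and its
 count is increased; one that stopped in kDCSError is restarted without
 increasing the count; any detector with count >= maxRetries is given up.
*/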
1418 //______________________________________________________________________________________________
1419 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1422 // Makes data retrieval for all detectors in the configuration.
1423 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1424 // (Unprocessed, Inactive, Failed or Done).
1425 // Returns kFALSE if an error occurred and kTRUE otherwise
1428 if (!entry) return kFALSE;
1430 fLogbookEntry = entry;
1432 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1435 // Send the information to ML
1436 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1437 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1440 mlList.Add(&mlStatus);
1441 mlList.Add(&mlRunType);
1444 mlID.Form("%d", GetCurrentRun());
1445 fMonaLisa->SendParameters(&mlList, mlID);
1447 if (fLogbookEntry->IsDone())
1449 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1450 UpdateShuttleLogbook("shuttle_done");
1455 // read test mode if flag is set
1459 TString logEntry(entry->GetRunParameter("log"));
1460 //printf("log entry = %s\n", logEntry.Data());
1461 TString searchStr("Testmode: ");
1462 Int_t pos = logEntry.Index(searchStr.Data());
1463 //printf("%d\n", pos);
1466 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1467 //printf("%s\n", subStr.String().Data());
1468 TString newStr(subStr.Data());
1469 TObjArray* token = newStr.Tokenize(' ');
1473 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1476 Int_t testMode = tmpStr->String().Atoi();
1479 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1480 SetTestMode((TestMode) testMode);
1488 fLogbookEntry->Print("all");
1491 Bool_t hasError = kFALSE;
1493 // Set the CDB and Reference folders according to the year and LHC period
1494 TString lhcPeriod(GetLHCPeriod());
1495 if (lhcPeriod.Length() == 0)
1497 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1501 if (fgkMainCDB.Length() == 0)
1502 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache",
1503 GetCurrentYear(), lhcPeriod.Data());
1505 if (fgkMainRefStorage.Length() == 0)
1506 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache",
1507 GetCurrentYear(), lhcPeriod.Data());
1509 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1510 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1511 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1512 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1514 // Loop on detectors in the configuration
1515 TIter iter(fConfig->GetDetectors());
1516 TObjString* aDetector = 0;
1518 while ((aDetector = (TObjString*) iter.Next()))
1520 fCurrentDetector = aDetector->String();
1522 if (ContinueProcessing() == kFALSE) continue;
1524 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START ******",
1525 GetCurrentRun(), aDetector->GetName()));
1527 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1529 Log(fCurrentDetector.Data(), "Process - Starting processing");
1535 Log("SHUTTLE", "Process - ERROR: Forking failed");
1540 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1541 GetCurrentRun(), aDetector->GetName()));
1543 Long_t begin = time(0);
1545 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1546 while (waitpid(pid, &status, WNOHANG) == 0)
1548 Long_t expiredTime = time(0) - begin;
1550 if (expiredTime > fConfig->GetPPTimeOut())
1553 tmp.Form("Process - Process of %s time out. "
1554 "Run time: %ld seconds. Killing...",
1555 fCurrentDetector.Data(), expiredTime);
1556 Log("SHUTTLE", tmp);
1557 Log(fCurrentDetector, tmp);
1561 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1564 gSystem->Sleep(1000);
1568 gSystem->Sleep(1000);
1571 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1572 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1575 Log("SHUTTLE", Form("Process - Error: "
1576 "Could not open pipe to %s", checkStr.Data()));
1581 if (!fgets(buffer, 100, pipe))
1583 Log("SHUTTLE", "Process - Error: ps did not return anything");
1584 gSystem->ClosePipe(pipe);
1587 gSystem->ClosePipe(pipe);
1589 //Log("SHUTTLE", Form("ps returned %s", buffer));
1592 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1594 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1598 if (expiredTime % 60 == 0)
1600 Log("SHUTTLE", Form("Process - %s: Checking process. "
1601 "Run time: %ld seconds - Memory consumption: %d KB",
1602 fCurrentDetector.Data(), expiredTime, mem));
1606 if (mem > fConfig->GetPPMaxMem())
1609 tmp.Form("Process - Process exceeds maximum allowed memory "
1610 "(%d KB > %d KB). Killing...",
1611 mem, fConfig->GetPPMaxMem());
1612 Log("SHUTTLE", tmp);
1613 Log(fCurrentDetector, tmp);
1617 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1620 gSystem->Sleep(1000);
1625 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1626 GetCurrentRun(), aDetector->GetName()));
1628 if (WIFEXITED(status))
1630 Int_t returnCode = WEXITSTATUS(status);
1632 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1635 if (returnCode == 0) hasError = kTRUE; // the client exits with its "success" flag (see gSystem->Exit below), so 0 means failure
1641 Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1642 aDetector->GetName()));
1644 Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1646 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1648 Log("SHUTTLE", "Process - Could not freopen stdout");
1652 fOutputRedirected = kTRUE;
1653 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1654 Log("SHUTTLE", "Process - Could not redirect stderr");
1658 TString wd = gSystem->WorkingDirectory();
1659 TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(),
1660 fCurrentDetector.Data(), GetCurrentRun());
1662 Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1663 if (!result) // temp dir already exists!
1665 Log(fCurrentDetector.Data(),
1666 Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1667 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1670 if (gSystem->mkdir(tmpDir.Data(), 1))
1672 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1676 if (!gSystem->ChangeDirectory(tmpDir.Data()))
1678 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1682 Bool_t success = ProcessCurrentDetector();
1684 gSystem->ChangeDirectory(wd.Data());
1686 if (success) // Preprocessor finished successfully!
1688 // remove temporary folder
1689 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1691 // Update time_processed field in FXS DB
1692 if (UpdateTable() == kFALSE)
1693 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1694 fCurrentDetector.Data()));
1696 // Transfer the data from local storage to main storage (Grid)
1697 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1698 if (StoreOCDB() == kFALSE)
1701 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1702 GetCurrentRun(), aDetector->GetName()));
1703 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1707 Form("\t\t\t****** run %d - %s: DONE ******",
1708 GetCurrentRun(), aDetector->GetName()));
1709 UpdateShuttleStatus(AliShuttleStatus::kDone);
1710 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1715 Form("\t\t\t****** run %d - %s: PP ERROR ******",
1716 GetCurrentRun(), aDetector->GetName()));
1719 for (UInt_t iSys=0; iSys<3; iSys++)
1721 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1724 Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1725 GetCurrentRun(), aDetector->GetName(), success));
1727 // the client exits here
1728 gSystem->Exit(success);
1730 AliError("We should never get here!!!");
1734 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1737 //check if shuttle is done for this run, if so update logbook
1738 TObjArray checkEntryArray;
1739 checkEntryArray.SetOwner(1);
1740 TString whereClause = Form("where run=%d", GetCurrentRun());
1741 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) ||
1742 checkEntryArray.GetEntries() == 0) {
1743 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1745 return hasError == kFALSE;
1748 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1749 (checkEntryArray.At(0));
1753 if (checkEntry->IsDone())
1755 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1756 UpdateShuttleLogbook("shuttle_done");
1760 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1762 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1764 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1765 checkEntry->GetRun(), GetDetName(iDet)));
1766 fFirstUnprocessed[iDet] = kFALSE;
1774 return hasError == kFALSE;
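/*
 Illustrative sketch only, not part of AliShuttle: how a test mode can be read
 from the "log" run parameter, as Process does with the "Testmode: " marker.
 ParseTestMode is a hypothetical name and std::string stands in for TString.
 The real code tokenizes the remainder on spaces and converts the first token;
 std::atoi on the remainder is used here as an approximation of that step.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Looks for "Testmode: <number>" in logEntry. On success stores the number in
// mode and returns true; otherwise leaves mode untouched and returns false.
bool ParseTestMode(const std::string& logEntry, int& mode)
{
    const std::string marker("Testmode: ");
    const std::string::size_type pos = logEntry.find(marker);
    if (pos == std::string::npos) return false;
    // atoi reads the leading digits and stops at the first other character.
    mode = std::atoi(logEntry.c_str() + pos + marker.size());
    return true;
}
```

 A log entry such as "cosmic run Testmode: 8 extra text" yields mode 8; an
 entry without the marker leaves the current mode unchanged.
*/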
1777 //______________________________________________________________________________________________
1778 Bool_t AliShuttle::ProcessCurrentDetector()
1781 // Makes data retrieval just for a specific detector (fCurrentDetector).
1782 // There should be a configuration for this detector.
1784 Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d",
1785 fCurrentDetector.Data(), GetCurrentRun()));
1787 TString wd = gSystem->WorkingDirectory();
1789 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1792 gSystem->ChangeDirectory(wd.Data());
1794 TMap* dcsMap = new TMap();
1796 // call preprocessor
1797 AliPreprocessor* aPreprocessor =
1798 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1800 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1802 Bool_t processDCS = aPreprocessor->ProcessDCS();
1806 Log(fCurrentDetector, "ProcessCurrentDetector -"
1807 " The preprocessor requested to skip the retrieval of DCS values");
1809 else if (fTestMode & kSkipDCS)
1811 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1813 else if (fTestMode & kErrorDCS)
1815 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1816 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1817 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1822 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1824 // Query DCS archive
1825 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1827 for (int iServ=0; iServ<nServers; iServ++)
1830 TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1831 Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1832 Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1834 Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1835 " Querying DCS Amanda server %s:%d (%d of %d)",
1836 host.Data(), port, iServ+1, nServers));
1841 if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1843 aliasMap = GetValueSet(host, port,
1844 fConfig->GetDCSAliases(fCurrentDetector, iServ),
1845 kAlias, multiSplit);
1848 Log(fCurrentDetector,
1849 Form("ProcessCurrentDetector -"
1850 " Error retrieving DCS aliases from server %s."
1851 " Sending mail to DCS experts!", host.Data()));
1852 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1854 //if (!SendMailToDCS())
1855 // Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1862 if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1864 dpMap = GetValueSet(host, port,
1865 fConfig->GetDCSDataPoints(fCurrentDetector, iServ),
1869 Log(fCurrentDetector,
1870 Form("ProcessCurrentDetector -"
1871 " Error retrieving DCS data points from server %s."
1872 " Sending mail to DCS experts!", host.Data()));
1873 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1875 //if (!SendMailToDCS())
1876 // Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1878 if (aliasMap) delete aliasMap;
1884 // merge aliasMap and dpMap into dcsMap
1886 TIter iter(aliasMap);
1887 TObjString* key = 0;
1888 while ((key = (TObjString*) iter.Next()))
1889 dcsMap->Add(key, aliasMap->GetValue(key->String()));
1891 aliasMap->SetOwner(kFALSE);
1897 TObjString* key = 0;
1898 while ((key = (TObjString*) iter.Next()))
1899 dcsMap->Add(key, dpMap->GetValue(key->String()));
1901 dpMap->SetOwner(kFALSE);
1907 // save map into file, to help debugging in case of preprocessor error
1908 TFile* f = TFile::Open("DCSMap.root","recreate");
1910 dcsMap->Write("DCSMap", TObject::kSingleKey);
1914 // DCS Archive DB processing successful. Call Preprocessor!
1915 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1917 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1919 if (returnValue > 0) // Preprocessor error!
1921 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1922 "Preprocessor failed. Process returned %d.", returnValue));
1923 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1924 dcsMap->DeleteAll();
1930 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1931 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1932 fCurrentDetector.Data()));
1934 dcsMap->DeleteAll();
1940 //______________________________________________________________________________________________
1941 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1944 // Queries DAQ's Shuttle logbook and fills the detector status object.
1945 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1948 entries.SetOwner(1);
1950 // check connection; connect if needed
1951 if(!Connect(3)) return kFALSE;
1954 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1956 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1958 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1962 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1964 if(aResult->GetRowCount() == 0) {
1965 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1970 // TODO Check field count!
1971 const UInt_t nCols = 23;
1972 if (aResult->GetFieldCount() != (Int_t) nCols) {
1973 Log("SHUTTLE", "Invalid SQL result field number!");
1979 while ((aRow = aResult->Next())) {
1980 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1981 Int_t run = runString.Atoi();
1983 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1987 // loop on detectors
1988 for(UInt_t ii = 0; ii < nCols; ii++)
1989 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1991 entries.AddLast(entry);
1999 //______________________________________________________________________________________________
2000 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2003 // Retrieves the run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
2006 // check connection; connect if needed
2011 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2013 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2015 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
2019 if (aResult->GetRowCount() == 0) {
2020 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2025 if (aResult->GetRowCount() > 1) {
2026 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2027 "more than one entry in DAQ Logbook for run %d!", run));
2032 TSQLRow* aRow = aResult->Next();
2035 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2040 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2042 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2043 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2045 UInt_t startTime = entry->GetStartTime();
2046 UInt_t endTime = entry->GetEndTime();
2048 // if (!startTime || !endTime || startTime > endTime)
2051 // Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d. Skipping!",
2052 // run, startTime, endTime));
2054 // Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2055 // fLogbookEntry = entry;
2056 // if (!UpdateShuttleLogbook("shuttle_done"))
2058 // AliError(Form("Could not update logbook for run %d !", run));
2060 // fLogbookEntry = 0;
2071 Form("QueryRunParameters - Invalid parameters for Run %d: "
2072 "startTime = %u, endTime = %u. Skipping!",
2073 run, startTime, endTime));
2075 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2076 fLogbookEntry = entry;
2077 if (!UpdateShuttleLogbook("shuttle_ignored"))
2079 AliError(Form("Could not update logbook for run %d !", run));
2089 if (startTime && !endTime)
2091 // TODO Here we don't mark SHUTTLE done, because this may mean
2092 //the run is still ongoing!!
2094 Form("QueryRunParameters - Invalid parameters for Run %d: "
2095 "startTime = %u, endTime = %u. Skipping (Shuttle won't be marked as DONE)!",
2096 run, startTime, endTime));
2098 //Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2099 //fLogbookEntry = entry;
2100 //if (!UpdateShuttleLogbook("shuttle_done"))
2102 // AliError(Form("Could not update logbook for run %d !", run));
2104 //fLogbookEntry = 0;
2112 if (startTime && endTime && (startTime > endTime))
2115 Form("QueryRunParameters - Invalid parameters for Run %d: "
2116 "startTime = %u, endTime = %u. Skipping!",
2117 run, startTime, endTime));
2119 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2120 fLogbookEntry = entry;
2121 if (!UpdateShuttleLogbook("shuttle_ignored"))
2123 AliError(Form("Could not update logbook for run %d !", run));
2133 TString totEventsStr = entry->GetRunParameter("totalEvents");
2134 Int_t totEvents = totEventsStr.Atoi();
2138 Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));
2140 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2141 fLogbookEntry = entry;
2142 if (!UpdateShuttleLogbook("shuttle_ignored"))
2144 AliError(Form("Could not update logbook for run %d !", run));
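/*
 Illustrative sketch only, not part of AliShuttle: the run sanity checks made
 by QueryRunParameters, collected into one function. RunCheck and ClassifyRun
 are hypothetical names. The first and last conditions are taken from the
 revision notes at the top of this file (time_start==0 && time_end==0, and
 totalEvents < 1); this is an assumption about the exact tests used.

```cpp
#include <cassert>

enum class RunCheck {
    kValid,    // run can be processed
    kIgnore,   // invalid run: logbook is updated so the run is not retried
    kOngoing   // start set but no end: skipped, logbook left untouched
};

RunCheck ClassifyRun(unsigned startTime, unsigned endTime, int totalEvents)
{
    if (startTime == 0 && endTime == 0) return RunCheck::kIgnore;
    if (startTime != 0 && endTime == 0) return RunCheck::kOngoing;
    if (startTime != 0 && endTime != 0 && startTime > endTime)
        return RunCheck::kIgnore;
    if (totalEvents < 1) return RunCheck::kIgnore;
    return RunCheck::kValid;
}
```

 Only kIgnore leads to a logbook update; kOngoing runs are picked up again at
 a later collection because they may still be taking data.
*/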
2160 //______________________________________________________________________________________________
2161 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2162 DCSType type, Int_t multiSplit)
2164 // Retrieves the values of all the given entries (aliases or data points) from the DCS server
2165 // host, port: TSocket connection parameters
2166 // entries: list of alias or data point names
2167 // type: kAlias or kDP
2168 // returns a TMap of values, 0 on failure
2170 AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2175 result = client.GetAliasValues(entries, GetCurrentStartTime(),
2176 GetCurrentEndTime());
2178 else if (type == kDP)
2180 result = client.GetDPValues(entries, GetCurrentStartTime(),
2181 GetCurrentEndTime());
2186 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2187 client.GetErrorString(client.GetResultErrorCode())));
2188 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)
2189 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2190 client.GetServerError().Data()));
2198 //______________________________________________________________________________________________
2199 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2200 const char* id, const char* source)
2202 // Get calibration file from file exchange servers
2203 // First queries the FXS database for the file name, using the run, detector, id and source info
2204 // then calls RetrieveFile(filename) for actual copy to local disk
2205 // run: current run being processed (given by Logbook entry fLogbookEntry)
2206 // detector: the Preprocessor name
2207 // id: provided as a parameter by the Preprocessor
2208 // source: provided by the Preprocessor through GetFileSources function
2210 // check if test mode should simulate a FXS error
2211 if (fTestMode & kErrorFXSFiles)
2213 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2217 // check connection; connect if needed
2218 if (!Connect(system))
2220 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2224 // Query preparation
2225 TString sourceName(source);
2227 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2228 fConfig->GetFXSdbTable(system));
2229 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2230 GetCurrentRun(), detector, id);
2234 whereClause += Form(" and DAQsource=\"%s\"", source);
2236 else if (system == kDCS)
2240 else if (system == kHLT)
2242 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2246 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2248 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2251 TSQLResult* aResult = 0;
2252 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2254 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
2255 GetSystemName(system), id, sourceName.Data()));
2259 if(aResult->GetRowCount() == 0)
2262 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
2263 GetSystemName(system), id, sourceName.Data()));
2268 if (aResult->GetRowCount() > 1) {
2270 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
2271 GetSystemName(system), id, sourceName.Data()));
2276 if (aResult->GetFieldCount() != nFields) {
2278 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
2279 GetSystemName(system), id, sourceName.Data()));
2284 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2287 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
2288 GetSystemName(system), id, sourceName.Data()));
2293 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2294 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2295 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2300 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2301 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2303 // retrieved file is renamed to make it unique
2304 TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2305 GetShuttleTempDir(), detector, GetCurrentRun(),
2306 GetSystemName(system), detector, GetCurrentRun(),
2307 id, sourceName.Data());
2310 // file retrieval from FXS
2311 UInt_t nRetries = 0;
2312 UInt_t maxRetries = 3;
2313 Bool_t result = kFALSE;
	// copy the file; RetrieveFile succeeds when the underlying TSystem::Exec call returns 0
2316 while(nRetries++ < maxRetries) {
2317 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
2318 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2321 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
2322 filePath.Data(), GetSystemName(system)));
2326 if (fileChecksum.Length()>0)
			// compare md5sum of local file with the one stored in the FXS DB
			// (stdout is redirected first, then stderr, so that both are discarded)
			Int_t md5Comp = gSystem->Exec(Form("md5sum %s |grep %s > /dev/null 2>&1",
						localFileName.Data(), fileChecksum.Data()));
2334 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
2340 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2341 filePath.Data(), GetSystemName(system)));
2346 if(!result) return 0;
2348 fFXSCalled[system]=kTRUE;
2349 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2350 fFXSlist[system].Add(fileParams);
2352 static TString staticLocalFileName;
2353 staticLocalFileName.Form("%s", localFileName.Data());
2355 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2356 "source %s from %s to %s", id, source,
2357 GetSystemName(system), localFileName.Data()));
2359 return staticLocalFileName.Data();
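/*
Illustrative sketch, not part of AliShuttle: GetFile() above appears to build a
unique local file name and then to retry the copy a fixed number of times,
accepting an attempt only when the checksum comparison also passes. The
standalone version below uses only the C++ standard library. MakeLocalFileName
and CopyWithRetries are hypothetical names, and the copy and verify steps are
passed in as callables instead of calling RetrieveFile or md5sum.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Builds "<tmp>/<det>_<run>_process/<system>_<det>_<run>_<id>_<source>.shuttle",
// the same pattern as the Form() call in GetFile.
std::string MakeLocalFileName(const std::string& tmpDir, const std::string& system,
                              const std::string& detector, int run,
                              const std::string& id, const std::string& source)
{
    const std::string runStr = std::to_string(run);
    return tmpDir + "/" + detector + "_" + runStr + "_process/" +
           system + "_" + detector + "_" + runStr + "_" + id + "_" + source + ".shuttle";
}

// Calls copy() up to maxRetries times. An attempt counts as successful only
// if verify() also accepts the copied file.
bool CopyWithRetries(const std::function<bool()>& copy,
                     const std::function<bool()>& verify,
                     unsigned maxRetries = 3)
{
    assert(maxRetries > 0);
    for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
        if (!copy()) continue;      // copy failed: try again
        if (verify()) return true;  // copy and verification both fine
    }
    return false;
}
```

Passing the two steps as callables keeps the retry policy testable without a
real FXS host. A caller would wrap the scp call in copy and the md5sum check in
verify.
*/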
2362 //______________________________________________________________________________________________
2363 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
	// Copies file from FXS to local Shuttle machine

	// check temp directory: if it does not exist, create it
	AliDebug(2, Form("Copy file %s from %s FXS into %s",
				fxsFileName, GetSystemName(system), localFileName));
2373 TString tmpDir(localFileName);
2375 tmpDir = tmpDir(0,tmpDir.Last('/'));
2377 Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
	if (noDir) // temp dir does not exist!
2380 if (gSystem->mkdir(tmpDir.Data(), 1))
2382 Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2387 TString baseFXSFolder;
2390 baseFXSFolder = "FES/";
2392 else if (system == kDCS)
2396 else if (system == kHLT)
2398 baseFXSFolder = "/opt/FXS/";
2402 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2403 fConfig->GetFXSPort(system),
2404 fConfig->GetFXSUser(system),
2405 fConfig->GetFXSHost(system),
2406 baseFXSFolder.Data(),
2410 AliDebug(2, Form("%s",command.Data()));
2412 Bool_t result = (gSystem->Exec(command.Data()) == 0);
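/*
Illustrative sketch, not part of AliShuttle: the scp command line in
RetrieveFile() is assembled from the per-system port, user, host and base
folder. The standalone function below reproduces the same layout with the C++
standard library. FxsEndpoint and BuildScpCommand are hypothetical names, and
the struct only stands in for the values that AliShuttleConfig provides.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the per-system FXS settings.
struct FxsEndpoint {
    int port;
    std::string user;
    std::string host;
    std::string baseFolder;  // e.g. "FES/" or "/opt/FXS/", as in the code above
};

// Mirrors the format "scp -oPort=%d -2 %s@%s:%s%s %s" used in RetrieveFile.
std::string BuildScpCommand(const FxsEndpoint& fxs, const std::string& remoteFile,
                            const std::string& localFile)
{
    assert(!fxs.host.empty());
    return "scp -oPort=" + std::to_string(fxs.port) + " -2 " +
           fxs.user + "@" + fxs.host + ":" + fxs.baseFolder + remoteFile +
           " " + localFile;
}
```

The file names are inserted verbatim, as in the original, so names containing
spaces or shell metacharacters would need quoting before being passed to a shell.
*/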
2417 //______________________________________________________________________________________________
2418 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2421 // Get sources producing the condition file Id from file exchange servers
2422 // if id is NULL all sources are returned (distinct)
2425 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2427 // check if test mode should simulate a FXS error
2428 if (fTestMode & kErrorFXSSources)
2430 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2436 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2437 TList *list = new TList();
2439 list->Add(new TObjString(" "));
2443 // check connection, in case connect
2444 if (!Connect(system))
2446 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
	TString sourceName;
2453 sourceName = "DAQsource";
2454 } else if (system == kHLT)
2456 sourceName = "DDLnumbers";
2459 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2460 TString whereClause = Form("run=%d and detector=\"%s\"",
2461 GetCurrentRun(), detector);
2463 whereClause += Form(" and fileId=\"%s\"", id);
2464 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2466 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2469 TSQLResult* aResult;
2470 aResult = fServer[system]->Query(sqlQuery);
2472 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2473 GetSystemName(system), id));
2477 TList *list = new TList();
2480 if (aResult->GetRowCount() == 0)
2483 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2488 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2491 while ((aRow = aResult->Next()))
2494 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2495 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2496 list->Add(new TObjString(source));
2505 //______________________________________________________________________________________________
2506 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2509 // Get all ids of condition files produced by a given source from file exchange servers
	Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2514 // check if test mode should simulate a FXS error
2515 if (fTestMode & kErrorFXSSources)
2517 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2521 // check connection, in case connect
2522 if (!Connect(system))
2524 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
	TString sourceName;
2531 sourceName = "DAQsource";
2532 } else if (system == kHLT)
2534 sourceName = "DDLnumbers";
2537 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2538 TString whereClause = Form("run=%d and detector=\"%s\"",
2539 GetCurrentRun(), detector);
2540 if (sourceName.Length() > 0 && source)
2541 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2542 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2544 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2547 TSQLResult* aResult;
2548 aResult = fServer[system]->Query(sqlQuery);
2550 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2551 GetSystemName(system), source));
2555 TList *list = new TList();
2558 if (aResult->GetRowCount() == 0)
2561 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2566 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2570 while ((aRow = aResult->Next()))
2573 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2574 AliDebug(2, Form("fileId = %s", id.Data()));
2575 list->Add(new TObjString(id));
2584 //______________________________________________________________________________________________
2585 Bool_t AliShuttle::Connect(Int_t system)
	// Connect to the MySQL server hosting the FXS database of the given system.
	// DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2591 // check connection: if already connected return
2592 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2594 TString dbHost, dbUser, dbPass, dbName;
2596 if (system < 3) // FXS db servers
2598 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2599 dbUser = fConfig->GetFXSdbUser(system);
2600 dbPass = fConfig->GetFXSdbPass(system);
2601 dbName = fConfig->GetFXSdbName(system);
2602 } else { // Run & Shuttle logbook servers
2603 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2604 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2605 dbUser = fConfig->GetDAQlbUser();
2606 dbPass = fConfig->GetDAQlbPass();
2607 dbName = fConfig->GetDAQlbDB();
2610 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2611 if (!fServer[system] || !fServer[system]->IsConnected()) {
2614 AliError(Form("Can't establish connection to FXS database for %s",
2615 AliShuttleInterface::GetSystemName(system)));
2617 AliError("Can't establish connection to Run logbook.");
2619 if(fServer[system]) delete fServer[system];
2624 TSQLResult* aResult=0;
2627 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2630 aResult = fServer[kDCS]->GetTables(dbName.Data());
2633 aResult = fServer[kHLT]->GetTables(dbName.Data());
2636 aResult = fServer[3]->GetTables(dbName.Data());
2644 //______________________________________________________________________________________________
2645 Bool_t AliShuttle::UpdateTable()
2648 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2651 Bool_t result = kTRUE;
2653 for (UInt_t system=0; system<3; system++)
2655 if(!fFXSCalled[system]) continue;
2657 // check connection, in case connect
2658 if (!Connect(system))
2660 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2665 TTimeStamp now; // now
2667 // Loop on FXS list entries
2668 TIter iter(&fFXSlist[system]);
2669 TObjString *aFXSentry=0;
2670 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2672 TString aFXSentrystr = aFXSentry->String();
2673 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2674 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2676 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2677 GetSystemName(system), aFXSentrystr.Data()));
2678 if(aFXSarray) delete aFXSarray;
2682 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2683 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2685 TString whereClause;
2688 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2689 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2691 else if (system == kDCS)
2693 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2694 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2696 else if (system == kHLT)
2698 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2699 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2704 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2705 now.GetSec(), whereClause.Data());
2707 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2710 TSQLResult* aResult;
2711 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2714 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2715 GetSystemName(system), sqlQuery.Data()));
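/*
Illustrative sketch, not part of AliShuttle: UpdateTable() above splits each
stored "<id>#!?!#<source>" entry with TString::Tokenize. ROOT documents
Tokenize as splitting on any of the characters in its argument, so an id or
source containing '#', '!' or '?' would yield more than two tokens and be
rejected. The standalone version below splits on the full separator string
instead. SplitFxsEntry and kSeparator are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Separator used above when storing "<id><sep><source>" entries.
const std::string kSeparator = "#!?!#";

// Splits entry at the first full occurrence of kSeparator.
// Returns false and leaves id and source untouched if the separator is missing.
bool SplitFxsEntry(const std::string& entry, std::string& id, std::string& source)
{
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;
    id = entry.substr(0, pos);
    source = entry.substr(pos + kSeparator.size());
    // the two parts plus the separator must add up to the whole entry
    assert(id.size() + kSeparator.size() + source.size() == entry.size());
    return true;
}
```

With this approach an entry such as "what?#!?!#src" still splits into exactly
two parts, because single separator characters inside the id are ignored.
*/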
2726 //______________________________________________________________________________________________
2727 Bool_t AliShuttle::UpdateTableFailCase()
	// Update FXS table, filling the time_processed field in all rows corresponding to current run and detector.
	// This is called when the preprocessor is declared failed for the current run, because
	// UpdateTable() fills the field only in case of success.
2733 Bool_t result = kTRUE;
2735 for (UInt_t system=0; system<3; system++)
2737 // check connection, in case connect
2738 if (!Connect(system))
2740 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2741 GetSystemName(system)));
2746 TTimeStamp now; // now
		// no loop on FXS list entries here: all rows of the current run and detector are updated
2750 TString whereClause = Form("where run=%d and detector=\"%s\";",
2751 GetCurrentRun(), fCurrentDetector.Data());
2754 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2755 now.GetSec(), whereClause.Data());
2757 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2760 TSQLResult* aResult;
2761 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2764 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2765 GetSystemName(system), sqlQuery.Data()));
2775 //______________________________________________________________________________________________
2776 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2779 // Update Shuttle logbook filling detector or shuttle_done column
2780 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2783 // check connection, in case connect
2785 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2789 TString detName(detector);
2791 if (detName == "shuttle_done" || detName == "shuttle_ignored")
2793 setClause = "set shuttle_done=1";
2795 if (detName == "shuttle_done")
2797 // Send the information to ML
2798 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2801 mlList.Add(&mlStatus);
2804 mlID.Form("%d", GetCurrentRun());
2805 fMonaLisa->SendParameters(&mlList, mlID);
2808 TString statusStr(status);
2809 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2810 statusStr.Contains("failed", TString::kIgnoreCase)){
2811 setClause = Form("set %s=\"%s\"", detector, status);
2814 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2820 TString whereClause = Form("where run=%d", GetCurrentRun());
2822 TString sqlQuery = Form("update %s %s %s",
2823 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2825 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2828 TSQLResult* aResult;
2829 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2831 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
//______________________________________________________________________________________________
Int_t AliShuttle::GetCurrentRun() const
{
	// Get current run from logbook entry (-1 if there is no logbook entry)

	return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentStartTime() const
{
	// Get current start time from logbook entry (0 if there is no logbook entry)

	return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
}

//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentEndTime() const
{
	// Get current end time from logbook entry (0 if there is no logbook entry)

	return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
}
2869 //______________________________________________________________________________________________
2870 UInt_t AliShuttle::GetCurrentYear() const
2873 // Get current year from logbook entry
2876 if (!fLogbookEntry) return 0;
2878 TTimeStamp startTime(GetCurrentStartTime());
2879 TString year = Form("%d",startTime.GetDate());
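/*
Illustrative sketch, not part of AliShuttle: TTimeStamp::GetDate() returns the
date packed into one integer of the form YYYYMMDD, and GetCurrentYear() formats
that integer into a string. Assuming only the year is needed, it can also be
obtained arithmetically, as the standalone function below shows. YearFromDate
is a hypothetical name.

```cpp
#include <cassert>

// Extracts the year from a date encoded as the integer YYYYMMDD.
unsigned YearFromDate(unsigned yyyymmdd)
{
    assert(yyyymmdd >= 10000101u);  // eight digits, i.e. year 1000 or later
    return yyyymmdd / 10000u;       // drop the MMDD part
}
```

For the date 20071213 this yields 2007. The integer division avoids the
round trip through a string.
*/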
//______________________________________________________________________________________________
const char* AliShuttle::GetLHCPeriod() const
{
	// Get current LHC period from logbook entry

	if (!fLogbookEntry) return 0;

	return fLogbookEntry->GetRunParameter("LHCperiod");
}
2897 //______________________________________________________________________________________________
2898 void AliShuttle::Log(const char* detector, const char* message)
2901 // Fill log string with a message
2904 TString logRunDir = GetShuttleLogDir();
2905 if (GetCurrentRun() >=0)
2906 logRunDir += Form("/%d", GetCurrentRun());
2908 void* dir = gSystem->OpenDirectory(logRunDir.Data());
2910 if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
			AliError(Form("Can't open directory <%s>", logRunDir.Data()));
2916 gSystem->FreeDirectory(dir);
2919 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2920 if (GetCurrentRun() >= 0)
2921 toLog += Form("run %d - ", GetCurrentRun());
2922 toLog += Form("%s", message);
2924 AliInfo(toLog.Data());
	// if the log output is already redirected to the file, stop here
2927 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2930 TString fileName = GetLogFileName(detector);
2932 gSystem->ExpandPathName(fileName);
2935 logFile.open(fileName, ofstream::out | ofstream::app);
2937 if (!logFile.is_open()) {
2938 AliError(Form("Could not open file %s", fileName.Data()));
2942 logFile << toLog.Data() << "\n";
2947 //______________________________________________________________________________________________
2948 TString AliShuttle::GetLogFileName(const char* detector) const
2951 // returns the name of the log file for a given sub detector
2956 if (GetCurrentRun() >= 0)
2958 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
2959 detector, GetCurrentRun());
2961 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
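/*
Illustrative sketch, not part of AliShuttle: GetLogFileName() chooses between a
per-run log file, <logDir>/<run>/<detector>_<run>.log, and a global one,
<logDir>/<detector>.log, depending on whether a current run is set. The
standalone function below shows the same rule with the C++ standard library.
LogFileName is a hypothetical name, and the run is passed explicitly instead of
being read from the logbook entry.

```cpp
#include <cassert>
#include <string>

// run < 0 means "no current run", mirroring GetCurrentRun() returning -1.
std::string LogFileName(const std::string& logDir, const std::string& detector, int run)
{
    assert(!detector.empty());
    if (run >= 0) {
        const std::string runStr = std::to_string(run);
        return logDir + "/" + runStr + "/" + detector + "_" + runStr + ".log";
    }
    return logDir + "/" + detector + ".log";
}
```

Messages logged outside of any run, for example by the SHUTTLE itself at start
up, therefore end up in a single file directly under the log directory.
*/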
2967 //______________________________________________________________________________________________
2968 void AliShuttle::SendAlive()
2970 // sends alive message to ML
2972 TMonaLisaText mlStatus("SHUTTLE_status", "Alive");
2975 mlList.Add(&mlStatus);
2977 fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
2980 //______________________________________________________________________________________________
2981 Bool_t AliShuttle::Collect(Int_t run)
	// Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default).
	// If a specific run is given, only that run is processed.
	// In operational mode, this is the Shuttle function triggered by the EOR signal.
2991 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2993 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2995 SetLastAction("Starting");
2997 // create ML instance
2999 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable());
3004 TString whereClause("where shuttle_done=0");
3006 whereClause += Form(" and run=%d", run);
3008 TObjArray shuttleLogbookEntries;
3009 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
3011 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3015 if (shuttleLogbookEntries.GetEntries() == 0)
3018 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
3020 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
3021 "or it does not exist in Shuttle logbook", run));
3025 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3026 fFirstUnprocessed[iDet] = kTRUE;
3030 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
3031 // flag them into fFirstUnprocessed array
3032 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
3033 TObjArray tmpLogbookEntries;
3034 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
3036 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3040 TIter iter(&tmpLogbookEntries);
3041 AliShuttleLogbookEntry* anEntry = 0;
3042 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
3044 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3046 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
3048 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
3049 anEntry->GetRun(), GetDetName(iDet)));
3050 fFirstUnprocessed[iDet] = kFALSE;
3058 if (!RetrieveConditionsData(shuttleLogbookEntries))
3060 Log("SHUTTLE", "Collect - Process of at least one run failed");
3064 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
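/*
Illustrative sketch, not part of AliShuttle: when a specific run is requested,
Collect() queries the earlier runs that are not yet closed and clears
fFirstUnprocessed for every detector that is still unprocessed in one of them.
The standalone function below expresses that rule on plain containers. The
Status enum is a simplified, hypothetical stand-in for the
AliShuttleLogbookEntry status codes, and FirstUnprocessedFlags is a
hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Status { kUnprocessed, kDone, kFailed };  // simplified status set

// earlierRuns[r][d] is the status of detector d in earlier run r.
// A detector keeps its "first unprocessed" flag only if no earlier run
// still lists it as unprocessed.
std::vector<bool> FirstUnprocessedFlags(const std::vector<std::vector<Status>>& earlierRuns,
                                        std::size_t nDetectors)
{
    std::vector<bool> first(nDetectors, true);
    for (const auto& run : earlierRuns) {
        assert(run.size() == nDetectors);
        for (std::size_t d = 0; d < nDetectors; ++d) {
            if (run[d] == Status::kUnprocessed) first[d] = false;
        }
    }
    return first;
}
```

With no earlier open runs every flag stays true, which matches the default
initialisation done in Collect() when all unprocessed runs are collected.
*/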
3068 //______________________________________________________________________________________________
3069 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
3072 // Retrieve conditions data for all runs that aren't processed yet
3075 Bool_t hasError = kFALSE;
3077 TIter iter(&dateEntries);
3078 AliShuttleLogbookEntry* anEntry;
3080 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
3081 if (!Process(anEntry)){
3085 // clean SHUTTLE temp directory
3086 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
3087 //RemoveFile(filename.Data());
3090 return hasError == kFALSE;
//______________________________________________________________________________________________
ULong_t AliShuttle::GetTimeOfLastAction() const
{
	// Gets time of last action

	ULong_t tmp;

	fMonitoringMutex->Lock();
	tmp = fLastActionTime;
	fMonitoringMutex->UnLock();

	return tmp;
}

//______________________________________________________________________________________________
const TString AliShuttle::GetLastAction() const
{
	// returns a string description of the last action

	TString tmp;

	fMonitoringMutex->Lock();
	tmp = fLastAction;
	fMonitoringMutex->UnLock();

	return tmp;
}

//______________________________________________________________________________________________
void AliShuttle::SetLastAction(const char* action)
{
	// updates the monitoring variables

	fMonitoringMutex->Lock();
	fLastAction = action;
	fLastActionTime = time(0);
	fMonitoringMutex->UnLock();
}
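/*
Illustrative sketch, not part of AliShuttle: the three monitoring accessors
above guard fLastAction and fLastActionTime with explicit Lock() and UnLock()
calls. The standalone class below shows the same pattern with std::mutex and
std::lock_guard from the C++ standard library, which release the lock
automatically on every return path. ActionMonitor is a hypothetical name, and
this is an alternative formulation, not the ROOT TMutex API used in the file.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>
#include <string>

class ActionMonitor {
public:
    void SetLastAction(const std::string& action)
    {
        std::lock_guard<std::mutex> lock(fMutex);
        fLastAction = action;
        fLastActionTime = std::time(nullptr);
    }

    std::string GetLastAction() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastAction;  // the copy is made while the lock is held
    }

    std::time_t GetTimeOfLastAction() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastActionTime;
    }

private:
    mutable std::mutex fMutex;  // mutable so that const getters can lock it
    std::string fLastAction;
    std::time_t fLastActionTime = 0;
};
```

Returning copies rather than references matters here: a reference would let the
caller read the member after the lock has been released.
*/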
//______________________________________________________________________________________________
const char* AliShuttle::GetRunParameter(const char* param)
{
	// returns run parameter read from DAQ logbook

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunParameter(param);
}
3159 //______________________________________________________________________________________________
3160 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
3163 // returns object from OCDB valid for current run
3166 if (fTestMode & kErrorOCDB)
3168 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
3172 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
3175 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
3179 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
3182 //______________________________________________________________________________________________
3183 Bool_t AliShuttle::SendMail()
3186 // sends a mail to the subdetector expert in case of preprocessor error
3189 if (fTestMode != kNone)
3192 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3195 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3197 Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
3202 gSystem->FreeDirectory(dir);
3205 TString bodyFileName;
3206 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3207 gSystem->ExpandPathName(bodyFileName);
3210 mailBody.open(bodyFileName, ofstream::out);
3212 if (!mailBody.is_open())
3214 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3219 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3220 TObjString *anExpert=0;
3221 while ((anExpert = (TObjString*) iterExperts.Next()))
3223 to += Form("%s,", anExpert->GetName());
3225 to.Remove(to.Length()-1);
3226 AliDebug(2, Form("to: %s",to.Data()));
3229 Log("SHUTTLE", "List of detector responsibles not yet set!");
3233 TString cc="alberto.colla@cern.ch";
3235 TString subject = Form("%s Shuttle preprocessor FAILED in run %d (run type = %s)!",
3236 fCurrentDetector.Data(), GetCurrentRun(), GetRunType());
3237 AliDebug(2, Form("subject: %s", subject.Data()));
3239 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3240 body += Form("SHUTTLE just detected that your preprocessor "
3241 "failed processing run %d (run type = %s)!!\n\n",
3242 GetCurrentRun(), GetRunType());
3243 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3244 fCurrentDetector.Data());
3245 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3247 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3249 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD?time=168 \n\n");
3253 TString logFolder = "logs";
3254 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3255 logFolder += "_PROD";
3258 body += Form("Find the %s log for the current run on \n\n"
3259 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3260 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3261 fCurrentDetector.Data(), GetCurrentRun());
	body += Form("The last 10 lines of %s log file follow:\n\n", fCurrentDetector.Data());
3264 AliDebug(2, Form("Body begin: %s", body.Data()));
3266 mailBody << body.Data();
3268 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3270 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(),
3271 GetCurrentRun(), fCurrentDetector.Data(), GetCurrentRun());
3272 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3273 if (gSystem->Exec(tailCommand.Data()))
3275 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3278 TString endBody = Form("------------------------------------------------------\n\n");
3279 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3280 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3281 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3283 AliDebug(2, Form("Body end: %s", endBody.Data()));
3285 mailBody << endBody.Data();
3290 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3294 bodyFileName.Data());
3295 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3297 Bool_t result = gSystem->Exec(mailCommand.Data());
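/*
Illustrative sketch, not part of AliShuttle: SendMail() above builds the
recipient list by appending "<address>," for every expert and then removing the
last character. The standalone function below shows an alternative that adds
the separator only between elements, so nothing has to be trimmed and an empty
list simply gives an empty string. JoinRecipients is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Joins the addresses with commas, without a trailing separator.
std::string JoinRecipients(const std::vector<std::string>& addresses)
{
    std::string to;
    for (const std::string& address : addresses) {
        assert(!address.empty());
        if (!to.empty()) to += ",";
        to += address;
    }
    return to;
}
```

An empty result can then be used directly as the "no responsibles set" check
before the mail command is assembled.
*/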
3302 //______________________________________________________________________________________________
3303 Bool_t AliShuttle::SendMailToDCS()
3306 // sends a mail to the DCS experts in case of DCS error
3309 if (fTestMode != kNone)
3312 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3315 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3317 Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3322 gSystem->FreeDirectory(dir);
3325 TString bodyFileName;
3326 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3327 gSystem->ExpandPathName(bodyFileName);
3330 mailBody.open(bodyFileName, ofstream::out);
3332 if (!mailBody.is_open())
3334 Log("SHUTTLE", Form("SendMailToDCS - Could not open mail body file %s", bodyFileName.Data()));
3338 TString to="Vladimir.Fekete@cern.ch, Svetozar.Kapusta@cern.ch";
3339 //TString to="alberto.colla@cern.ch";
3340 AliDebug(2, Form("to: %s",to.Data()));
3343 Log("SHUTTLE", "List of detector responsibles not yet set!");
3347 TString cc="alberto.colla@cern.ch";
3349 TString subject = Form("Retrieval of data points for %s FAILED in run %d !",
3350 fCurrentDetector.Data(), GetCurrentRun());
3351 AliDebug(2, Form("subject: %s", subject.Data()));
3353 TString body = Form("Dear DCS experts, \n\n");
3354 body += Form("SHUTTLE couldn\'t retrieve the data points for detector %s "
3355 "in run %d!!\n\n", fCurrentDetector.Data(), GetCurrentRun());
3356 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3357 fCurrentDetector.Data());
3358 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3360 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3362 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD?time=168 \n\n");
3365 TString logFolder = "logs";
3366 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3367 logFolder += "_PROD";
3370 body += Form("Find the %s log for the current run on \n\n"
3371 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3372 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3373 fCurrentDetector.Data(), GetCurrentRun());
	body += Form("The last 10 lines of %s log file follow:\n\n", fCurrentDetector.Data());
3376 AliDebug(2, Form("Body begin: %s", body.Data()));
3378 mailBody << body.Data();
3380 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3382 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3383 fCurrentDetector.Data(), GetCurrentRun());
3384 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3385 if (gSystem->Exec(tailCommand.Data()))
3387 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3390 TString endBody = Form("------------------------------------------------------\n\n");
3391 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3392 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3393 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3395 AliDebug(2, Form("Body end: %s", endBody.Data()));
3397 mailBody << endBody.Data();
3402 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3406 bodyFileName.Data());
3407 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3409 Bool_t result = gSystem->Exec(mailCommand.Data());
//______________________________________________________________________________________________
const char* AliShuttle::GetRunType()
{
	// returns run type read from "run type" logbook

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunType();
}
3429 //______________________________________________________________________________________________
3430 Bool_t AliShuttle::GetHLTStatus()
3432 // Return HLT status (ON=1 OFF=0)
3433 // Converts the HLT status from the status string read in the run logbook (not just a bool)
3435 if(!fLogbookEntry) {
3436 AliError("No logbook entry!");
3440 // TODO implement when HLTStatus is inserted in run logbook
3441 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
3442 //if(hltStatus == "OFF") {return kFALSE};
//______________________________________________________________________________________________
void AliShuttle::SetShuttleTempDir(const char* tmpDir)
{
	// sets Shuttle temp directory

	fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
}

//______________________________________________________________________________________________
void AliShuttle::SetShuttleLogDir(const char* logDir)
{
	// sets Shuttle log directory

	fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
}