/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *                                                                        *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 *                                                                        *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/
Revision 1.71 2007/12/12 14:56:14 jgrosseo
sending shuttle_ignore to ML also in case of 0 events

Revision 1.70 2007/12/12 13:45:35 acolla
Monalisa started in Collect() function. Alive message to monitor is sent at each Collect and every minute during preprocessor processing.

Revision 1.69 2007/12/12 10:06:29 acolla
in AliShuttle.cxx: SHUTTLE logbook is updated in case of invalid run times:

time_start==0 && time_end==0

logbook is NOT updated if time_start != 0 && time_end == 0, because it may mean that the run is still ongoing.

Revision 1.68 2007/12/11 10:15:17 acolla
Added marking SHUTTLE=DONE for invalid runs
(invalid start time or end time) and runs with totalEvents < 1

Revision 1.67 2007/12/07 19:14:36 acolla

Added automatic collection of new runs on a regular time basis (settable from the configuration)

in AliShuttleConfig: new members

- triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
- mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)

- logs now stored in logs/#RUN/DET_#RUN.log

Revision 1.66 2007/12/05 10:45:19 jgrosseo
changed order of arguments to TMonaLisaWriter

Revision 1.65 2007/11/26 16:58:37 acolla
Monalisa configuration added: host and table name

Revision 1.64 2007/11/13 16:15:47 acolla
DCS map is stored in a file in the temp folder where the detector is processed.
If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.

Revision 1.63 2007/11/02 10:53:16 acolla
Protection added to AliShuttle::CopyFileLocally

Revision 1.62 2007/10/31 18:23:13 acolla
Further development on the Shuttle:

- Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
are now built from /alice/data, e.g.:
/alice/data/2007/LHC07a/OCDB

the year and LHC period are taken from the Shuttle.
Raw metadata files are stored by GRP to:
/alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root

- Shuttle sends a mail to DCS experts each time DP retrieval fails.

Revision 1.61 2007/10/30 20:33:51 acolla
Improved managing of temporary folders, which weren't correctly handled.
Resolved bug introduced in StoreReferenceFile, which caused the SPD preprocessor to fail.

Revision 1.60 2007/10/29 18:06:16 acolla

New function StoreRunMetadataFile added to preprocessor and Shuttle interface
This function can be used by GRP only. It stores raw data tags merged file to the
raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).

1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
has been corrected in the root package on the Shuttle machine.

Revision 1.59 2007/10/05 12:40:55 acolla

Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).

Revision 1.58 2007/09/28 15:27:40 acolla

AliDCSClient "multiSplit" option added in the DCS configuration
in AliDCSMessage: variable MAX_BODY_SIZE set to 500000

Revision 1.57 2007/09/27 16:53:13 acolla
Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.

Revision 1.56 2007/09/14 16:46:14 jgrosseo
1) Connect and Close are called before and after each query, so one can
keep the same AliDCSClient object.
2) The splitting of a query is moved to GetDPValues/GetAliasValues.
3) Splitting interval can be specified in constructor

Revision 1.55 2007/08/06 12:26:40 acolla
Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
read from the run logbook.

Revision 1.54 2007/07/12 09:51:25 jgrosseo
removed duplicated log message in GetFile

Revision 1.53 2007/07/12 09:26:28 jgrosseo
updating hlt fxs base path

Revision 1.52 2007/07/12 08:06:45 jgrosseo
adding log messages in getfile... functions
adding not implemented copy constructor in alishuttleconfigholder

Revision 1.51 2007/07/03 17:24:52 acolla
root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.

Revision 1.50 2007/07/02 17:19:32 acolla
preprocessor is run in a temp directory that is removed when process is finished.

Revision 1.49 2007/06/29 10:45:06 acolla
Number of columns in MySql Shuttle logbook increased by one (HLT added)

Revision 1.48 2007/06/21 13:06:19 acolla
GetFileSources returns dummy list with 1 source if system=DCS (better than
returning error as it was)

Revision 1.47 2007/06/19 17:28:56 acolla
HLT updated; missing map bug removed.

Revision 1.46 2007/06/09 13:01:09 jgrosseo
Switching to retrieval of several DCS DPs at a time (multiDPrequest)

Revision 1.45 2007/05/30 06:35:20 jgrosseo
Adding functionality to the Shuttle/TestShuttle:
o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
o) Function to retrieve list of IDs for a given source (GetFileIDs)
These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
Example code has been added to the TestProcessor in TestShuttle

Revision 1.44 2007/05/11 16:09:32 acolla
Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
example: ITS/SPD/100_filename.root

Revision 1.43 2007/05/10 09:59:51 acolla
Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)

Revision 1.42 2007/05/03 08:01:39 jgrosseo
typo in last commit :-(

Revision 1.41 2007/05/03 08:00:48 jgrosseo
fixing log message when pp want to skip dcs value retrieval

Revision 1.40 2007/04/27 07:06:48 jgrosseo
GetFileSources returns empty list in case of no files, but successful query
No mails sent in testmode

Revision 1.39 2007/04/17 12:43:57 acolla
Correction in StoreOCDB; change of text in mail to detector expert

Revision 1.38 2007/04/12 08:26:18 jgrosseo

Revision 1.37 2007/04/10 16:53:14 jgrosseo
redirecting sub detector stdout, stderr to sub detector log file

Revision 1.35 2007/04/04 16:26:38 acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. in AliShuttle.cxx: processing time and memory consumption info on a single line.

Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
If you always need DCS data (like before), you do not need to implement it.

6) The run type has been added to the monitoring page

Revision 1.33 2007/04/03 13:56:01 acolla
Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the

Revision 1.32 2007/02/28 10:41:56 acolla
Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
AliPreprocessor::GetRunType() function.
Added some ldap definition files.

Revision 1.30 2007/02/13 11:23:21 acolla
Moved getters and setters of Shuttle's main OCDB/Reference, local
OCDB/Reference, temp and log folders to AliShuttleInterface

Revision 1.27 2007/01/30 17:52:42 jgrosseo
adding monalisa monitoring

Revision 1.26 2007/01/23 19:20:03 acolla
Removed old ldif files, added TOF, MCH ldif files. Added some options in
AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and

Revision 1.25 2007/01/15 19:13:52 acolla
Moved some AliInfo to AliDebug in SendMail function

Revision 1.21 2006/12/07 08:51:26 jgrosseo

table, db names in ldap configuration
added GRP preprocessor
DCS data can also be retrieved by data point

Revision 1.20 2006/11/16 16:16:48 jgrosseo
introducing strict run ordering flag
removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)

Revision 1.19 2006/11/06 14:23:04 jgrosseo
major update (Alberto)
o) reading of run parameters from the logbook
o) online offline naming conversion
o) standalone DCSclient package

Revision 1.18 2006/10/20 15:22:59 jgrosseo
o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
o) Merging Collect, CollectAll, CollectNew function
o) Removing implementation of empty copy constructors (declaration still there!)

Revision 1.17 2006/10/05 16:20:55 jgrosseo
adapting to new CDB classes

Revision 1.16 2006/10/05 15:46:26 jgrosseo
applying to the new interface

Revision 1.15 2006/10/02 16:38:39 jgrosseo

storing of objects that failed to be stored to the grid before
interfacing of shuttle status table in daq system

Revision 1.14 2006/08/29 09:16:05 jgrosseo

Revision 1.13 2006/08/15 10:50:00 jgrosseo
effc++ corrections (alberto)

Revision 1.12 2006/08/08 14:19:29 jgrosseo
Update to shuttle classes (Alberto)

- Possibility to set the full object's path in the Preprocessor's and
Shuttle's Store functions
- Possibility to extend the object's run validity in the same classes
("startValidity" and "validityInfinite" parameters)
- Implementation of the StoreReferenceData function to store reference
data in a dedicated CDB storage.

Revision 1.11 2006/07/21 07:37:20 jgrosseo
last run is stored after each run

Revision 1.10 2006/07/20 09:54:40 jgrosseo
introducing status management: The processing per subdetector is divided into several steps,
after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
can keep track of the number of failures and skips further processing after a certain threshold is
exceeded. These thresholds can be configured in LDAP.

Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)

Revision 1.8 2006/07/11 12:44:36 jgrosseo
adding parameters for extended validity range of data produced by preprocessor

Revision 1.7 2006/07/10 14:37:09 jgrosseo
small fix + todo comment

Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)

Revision 1.5 2006/07/04 14:59:57 jgrosseo
revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2

Revision 1.4 2006/06/12 09:11:16 jgrosseo
coding conventions (Alberto)

Revision 1.3 2006/06/06 14:26:40 jgrosseo
o) removed files that were moved to STEER
o) shuttle updated to follow the new interface (Alberto)

Revision 1.2 2006/03/07 07:52:34 hristov
New version (B.Yordanov)

Revision 1.6 2005/11/19 17:19:14 byordano
RetrieveDATEEntries and RetrieveConditionsData added

Revision 1.5 2005/11/19 11:09:27 byordano
AliShuttle declaration added

Revision 1.4 2005/11/17 17:47:34 byordano
TList changed to TObjArray

Revision 1.3 2005/11/17 14:43:23 byordano

Revision 1.1.1.1 2005/10/28 07:33:58 hristov
Initial import as subdirectory in AliRoot

Revision 1.2 2005/09/13 08:41:15 byordano
default startTime endTime added

Revision 1.4 2005/08/30 09:13:02 byordano

Revision 1.3 2005/08/29 21:15:47 byordano

// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it will be used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically to the underlying AliCDBStorage.
// The alias name is used as detSpec.

#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"
#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>
#include <TSystemDirectory.h>
#include <TSystemFile.h>
#include <TGridResult.h>
#include <TMonaLisaWriter.h>

#include <sys/types.h>
#include <sys/wait.h>

381 //______________________________________________________________________________________________
382 AliShuttle::AliShuttle(const AliShuttleConfig* config,
383 UInt_t timeout, Int_t retries):
385 fTimeout(timeout), fRetries(retries),
395 fReadTestMode(kFALSE),
396 fOutputRedirected(kFALSE)
399 // config: AliShuttleConfig used
400 // timeout: timeout used for AliDCSClient connection
401 // retries: the number of retries in case of connection error.
404 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
405 for(int iSys=0;iSys<4;iSys++) {
408 fFXSlist[iSys].SetOwner(kTRUE);
410 fPreprocessorMap.SetOwner(kTRUE);
412 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
413 fFirstUnprocessed[iDet] = kFALSE;
415 fMonitoringMutex = new TMutex();
418 //______________________________________________________________________________________________
419 AliShuttle::~AliShuttle()
425 fPreprocessorMap.DeleteAll();
426 for(int iSys=0;iSys<4;iSys++)
428 fServer[iSys]->Close();
429 delete fServer[iSys];
438 if (fMonitoringMutex)
440 delete fMonitoringMutex;
441 fMonitoringMutex = 0;
445 //______________________________________________________________________________________________
446 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
449 // Registers a new AliPreprocessor.
450 // It uses GetName() as the identifier of the preprocessor.
451 // The preprocessor is registered only if no other preprocessor
452 // with the same identifier (GetName()) is already registered.
455 const char* detName = preprocessor->GetName();
456 if(GetDetPos(detName) < 0)
457 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
459 if (fPreprocessorMap.GetValue(detName)) {
460 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
464 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
466 //______________________________________________________________________________________________
467 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
468 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
470 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
471 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
472 // using this function. Use StoreReferenceData instead!
473 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
474 // finishes the data are transferred to the main storage (Grid).
476 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
479 //______________________________________________________________________________________________
480 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
482 // Stores a CDB object in the storage for reference data. These objects will not be available during
483 // offline reconstruction. Use this function for reference data only!
484 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
485 // finishes the data are transferred to the main storage (Grid).
487 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
490 //______________________________________________________________________________________________
491 Bool_t AliShuttle::StoreLocally(const TString& localUri,
492 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
493 Int_t validityStart, Bool_t validityInfinite)
495 // Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
496 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
497 // The parameters are:
498 // 1) Uri of the backup storage (Local)
499 // 2) the object's path.
500 // 3) the object to be stored
501 // 4) the metaData to be associated with the object
502 // 5) the validity start run number w.r.t. the current run,
503 // if the data is valid only for this run leave the default 0
504 // 6) specifies if the calibration data is valid for infinity (this means until updated),
505 // typical for calibration runs, the default is kFALSE
507 // returns kFALSE on failure, kTRUE otherwise
509 if (fTestMode & kErrorStorage)
511 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
515 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
517 Int_t firstRun = GetCurrentRun() - validityStart;
519 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
524 if(validityInfinite) {
525 lastRun = AliCDBRunRange::Infinity();
527 lastRun = GetCurrentRun();
530 // Version is set to current run, it will be used later to transfer data to Grid
531 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
533 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
534 TObjString runUsed = Form("%d", GetCurrentRun());
535 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
538 Bool_t result = kFALSE;
540 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
541 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
543 result = AliCDBManager::Instance()->GetStorage(localUri)
544 ->Put(object, id, metaData);
549 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
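// Illustration only (not part of AliShuttle): a minimal, ROOT-free sketch of how
// StoreLocally derives the run validity range from the current run, validityStart
// and validityInfinite, as described in the parameter list above. RunRange,
// ComputeValidity and kInfinity are hypothetical names; kInfinity merely plays the
// role of AliCDBRunRange::Infinity(), whose actual value is not shown in this file.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
const int kInfinity = std::numeric_limits<int>::max();

// Hypothetical, reduced run range: first and last valid run.
struct RunRange {
    int first;
    int last;
};

// validityStart is counted backwards from the current run;
// a negative first run is clamped to 0, as the warning in StoreLocally says.
RunRange ComputeValidity(int currentRun, int validityStart, bool validityInfinite)
{
    RunRange range;
    range.first = currentRun - validityStart;
    if (range.first < 0) range.first = 0;
    range.last = validityInfinite ? kInfinity : currentRun;
    return range;
}
```

// With the defaults (validityStart = 0, validityInfinite = false) the object is
// valid for the current run only.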
555 //______________________________________________________________________________________________
556 Bool_t AliShuttle::StoreOCDB()
559 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
560 // Calls underlying StoreOCDB(const TString&) function twice, for OCDB and Reference storage.
561 // Then calls CopyFilesToGrid to store reference files and, for GRP only, the run metadata file.
564 if (fTestMode & kErrorGrid)
566 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
567 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
571 Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
572 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
574 Log("SHUTTLE","StoreOCDB - Storing reference data ...");
575 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
577 Log("SHUTTLE","StoreOCDB - Storing reference files ...");
578 Bool_t resultRefFiles = CopyFilesToGrid("reference");
580 Bool_t resultMetadata = kTRUE;
581 if(fCurrentDetector == "GRP")
583 Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
584 resultMetadata = CopyFilesToGrid("metadata");
587 return resultCDB && resultRef && resultRefFiles && resultMetadata;
590 //______________________________________________________________________________________________
591 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
594 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
597 TObjArray* gridIds=0;
599 Bool_t result = kTRUE;
601 const char* type = 0;
603 if(gridURI == fgkMainCDB) {
605 localURI = fgkLocalCDB;
606 } else if(gridURI == fgkMainRefStorage) {
608 localURI = fgkLocalRefStorage;
610 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
614 AliCDBManager* man = AliCDBManager::Instance();
616 AliCDBStorage *gridSto = man->GetStorage(gridURI);
619 Form("StoreOCDB - cannot activate main %s storage", type));
623 gridIds = gridSto->GetQueryCDBList();
625 // get objects previously stored in local CDB
626 AliCDBStorage *localSto = man->GetStorage(localURI);
629 Form("StoreOCDB - cannot activate local %s storage", type));
632 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
633 // Local objects were stored with current run as Grid version!
634 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
635 localEntries->SetOwner(1);
637 // loop on local stored objects
638 TIter localIter(localEntries);
639 AliCDBEntry *aLocEntry = 0;
640 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
641 aLocEntry->SetOwner(1);
642 AliCDBId aLocId = aLocEntry->GetId();
643 aLocEntry->SetVersion(-1);
644 aLocEntry->SetSubVersion(-1);
646 // If local object is valid up to infinity we store it only if it is
647 // the first unprocessed run!
648 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
649 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
651 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
652 "there are previous unprocessed runs!",
653 fCurrentDetector.Data(), aLocId.GetPath().Data()));
657 // loop on Grid valid Id's
658 Bool_t store = kTRUE;
659 TIter gridIter(gridIds);
660 AliCDBId* aGridId = 0;
661 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
662 if(aGridId->GetPath() != aLocId.GetPath()) continue;
663 // skip all objects valid up to infinity
664 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
665 // if we get here, it means there's already some more recent object stored on Grid!
670 // If we get here, the file can be stored!
671 Bool_t storeOk = gridSto->Put(aLocEntry);
672 if(!store || storeOk){
676 Log(fCurrentDetector.Data(),
677 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
678 type, aGridId->ToString().Data()));
681 Form("StoreOCDB - Object <%s> successfully put into %s storage",
682 aLocId.ToString().Data(), type));
683 Log(fCurrentDetector.Data(),
684 Form("StoreOCDB - Object <%s> successfully put into %s storage",
685 aLocId.ToString().Data(), type));
688 // removing local filename...
690 localSto->IdToFilename(aLocId, filename);
691 Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
692 RemoveFile(filename.Data());
696 Form("StoreOCDB - Grid %s storage of object <%s> failed",
697 type, aLocId.ToString().Data()));
698 Log(fCurrentDetector.Data(),
699 Form("StoreOCDB - Grid %s storage of object <%s> failed",
700 type, aLocId.ToString().Data()));
704 localEntries->Clear();
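// Illustration only (not part of AliShuttle): a ROOT-free sketch of the decision
// that the comments in StoreOCDB describe for each locally stored object.
// ObjectId, CanStoreOnGrid and kInfiniteRun are hypothetical names, and the sketch
// assumes that gridIds already contains only the Grid IDs valid for the current run.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
const int kInfiniteRun = std::numeric_limits<int>::max();

// Hypothetical, reduced object ID: path and last valid run only.
struct ObjectId {
    std::string path;
    int lastRun;
};

bool CanStoreOnGrid(const ObjectId& local, bool firstUnprocessed,
                    const std::vector<ObjectId>& gridIds)
{
    // An object valid up to infinity is stored only for the first unprocessed run.
    if (local.lastRun == kInfiniteRun && !firstUnprocessed) return false;

    for (std::size_t i = 0; i < gridIds.size(); ++i) {
        if (gridIds[i].path != local.path) continue;      // different object
        if (gridIds[i].lastRun == kInfiniteRun) continue; // infinite validity is skipped
        return false; // a more recent object is already on the Grid
    }
    return true;
}
```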
709 //______________________________________________________________________________________________
710 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
712 // clears the directory used to store reference files of a given subdetector
714 AliCDBManager* man = AliCDBManager::Instance();
715 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
716 TString localBaseFolder = sto->GetBaseFolder();
718 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
720 Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
723 begin.Form("%d_", GetCurrentRun());
725 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
729 TList* dirList = baseDir->GetListOfFiles();
732 if (!dirList) return kTRUE;
734 if (dirList->GetEntries() < 3)
740 Int_t nDirs = 0, nDel = 0;
741 TIter dirIter(dirList);
742 TSystemFile* entry = 0;
744 Bool_t success = kTRUE;
746 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
748 if (entry->IsDirectory())
751 TString fileName(entry->GetName());
752 if (!fileName.BeginsWith(begin))
758 Int_t result = gSystem->Unlink(fileName.Data());
762 Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
770 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
771 nDel, nDirs, targetDir.Data()));
782 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
786 result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
789 Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
794 result = gSystem->mkdir(targetDir, kTRUE);
797 Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
804 //______________________________________________________________________________________________
805 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
808 // Stores reference file directly (without opening it). This function stores the file locally.
810 // The file is stored under the following location:
811 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
812 // where <gridFileName> is the second parameter given to the function
815 if (fTestMode & kErrorStorage)
817 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
821 AliCDBManager* man = AliCDBManager::Instance();
822 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
824 TString localBaseFolder = sto->GetBaseFolder();
826 TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);
827 target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
829 return CopyFileLocally(localFile, target);
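// Illustration only (not part of AliShuttle): a ROOT-free sketch of how the local
// target path <base>/<DET>/<RUN#>_<gridFileName> can be assembled. RefFilePrefix
// and RefFileTarget are hypothetical helpers. The ITS/MUON/PHOS rule follows
// GetRefFilePrefix and the changelog example ITS/SPD/100_filename.root; the
// fallback for all other detectors is an assumption, since that branch is not
// shown in this listing.

```cpp
#include <cassert>
#include <sstream>
#include <string>

std::string RefFilePrefix(const std::string& base,
                          const std::string& offlineDet,
                          const std::string& onlineDet)
{
    // ITS, MUON and PHOS keep one sub-folder per online detector name.
    if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS")
        return base + "/" + offlineDet + "/" + onlineDet;
    // Assumption: every other detector uses the offline name only.
    return base + "/" + offlineDet;
}

std::string RefFileTarget(const std::string& prefix, int run,
                          const std::string& gridFileName)
{
    std::ostringstream out;
    out << prefix << "/" << run << "_" << gridFileName;
    return out.str();
}
```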
832 //______________________________________________________________________________________________
833 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
836 // Stores Run metadata file to the Grid, in the run folder
838 // Only GRP can call this function.
840 if (fTestMode & kErrorStorage)
842 Log(fCurrentDetector, "StoreRunMetaDataFile - In TESTMODE - Simulating error while storing locally");
846 AliCDBManager* man = AliCDBManager::Instance();
847 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
849 TString localBaseFolder = sto->GetBaseFolder();
851 // Build Run level folder
852 // folder = /alice/data/year/lhcPeriod/runNb/Raw
855 TString lhcPeriod = GetLHCPeriod();
856 if (lhcPeriod.Length() == 0)
858 Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
862 TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s",
863 localBaseFolder.Data(), GetCurrentYear(),
864 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
866 return CopyFileLocally(localFile, target);
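// Illustration only (not part of AliShuttle): a ROOT-free sketch of the run-level
// folder layout used for the metadata file, and of how the Grid folder can be cut
// out of the local path at "/alice/data/". RunMetadataDir and GridDirFromLocal are
// hypothetical helpers; the format string is the one used in this file.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

std::string RunMetadataDir(const std::string& localBase, int year,
                           const std::string& lhcPeriod, int run)
{
    char buf[1024];
    // %09d zero-pads the run number to nine digits, e.g. 99999 -> 000099999.
    std::snprintf(buf, sizeof(buf), "%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
                  localBase.c_str(), year, lhcPeriod.c_str(), run);
    return std::string(buf);
}

std::string GridDirFromLocal(const std::string& localDir)
{
    // Everything from "/alice/data/" onwards is the folder on the Grid side.
    const std::string::size_type pos = localDir.find("/alice/data/");
    return pos == std::string::npos ? std::string() : localDir.substr(pos);
}
```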
869 //______________________________________________________________________________________________
870 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
873 // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
874 // Files are temporarily stored in the local reference storage. When the preprocessor
875 // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn
876 // (in reference or run level folders)
879 TString targetDir(target(0, target.Last('/')));
881 // try to open the target directory; create it if it does not exist
882 void* dir = gSystem->OpenDirectory(targetDir.Data());
884 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
885 Log("SHUTTLE", Form("StoreFileLocally - Can't open directory <%s>", targetDir.Data()));
890 gSystem->FreeDirectory(dir);
895 result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
898 Log("SHUTTLE", Form("StoreFileLocally - %s does not exist", localFile));
902 result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
905 Log("SHUTTLE", Form("StoreFileLocally - target file %s already exists, removing...", target.Data()));
906 if (gSystem->Unlink(target.Data()))
908 Log("SHUTTLE", Form("StoreFileLocally - Could not remove existing target file %s!", target.Data()));
913 result = gSystem->CopyFile(localFile, target);
917 Log("SHUTTLE", Form("StoreFileLocally - File %s stored locally to %s", localFile, target.Data()));
922 Log("SHUTTLE", Form("StoreFileLocally - Could not store file %s to %s! Error code = %d",
923 localFile, target.Data(), result));
931 //______________________________________________________________________________________________
932 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
935 // Transfers local files to the Grid. Local files can be reference files
936 // or run metadata file (from GRP only).
938 // According to the type (reference, metadata) the files are stored under the following location:
939 // reference --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
940 // metadata --> <run data folder>/<MetadataFileName>
943 AliCDBManager* man = AliCDBManager::Instance();
944 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
947 TString localBaseFolder = sto->GetBaseFolder();
953 if (strcmp(type, "reference") == 0)
955 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
956 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
959 TString gridBaseFolder = gridSto->GetBaseFolder();
960 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
961 begin = Form("%d_", GetCurrentRun());
963 else if (strcmp(type, "metadata") == 0)
966 TString lhcPeriod = GetLHCPeriod();
968 if (lhcPeriod.Length() == 0)
970 Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
974 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
975 localBaseFolder.Data(), GetCurrentYear(),
976 lhcPeriod.Data(), GetCurrentRun());
977 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
983 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
987 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
991 TList* dirList = baseDir->GetListOfFiles();
994 if (!dirList) return kTRUE;
996 if (dirList->GetEntries() < 3)
1004 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
1009 Int_t nDirs = 0, nTransfer = 0;
1010 TIter dirIter(dirList);
1011 TSystemFile* entry = 0;
1013 Bool_t success = kTRUE;
1014 Bool_t first = kTRUE;
1016 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1018 if (entry->IsDirectory())
1021 TString fileName(entry->GetName());
1022 if (!fileName.BeginsWith(begin))
1030 // check that folder exists, otherwise create it
1031 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1039 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
1041 // TODO It does not work currently! Bug in TAliEn::Mkdir
1042 // TODO Manually fixed in local root v5-16-00
1043 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1045 Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1050 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1054 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1058 TString fullLocalPath;
1059 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1061 TString fullGridPath;
1062 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1064 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1068 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!",
1069 fullLocalPath.Data(), fullGridPath.Data()));
1070 RemoveFile(fullLocalPath);
1075 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!",
1076 fullLocalPath.Data(), fullGridPath.Data()));
1081 Log("SHUTTLE", Form("CopyFilesToGrid - %d (over %d) files in folder %s copied to Grid.",
1082 nTransfer, nDirs, dir.Data()));
1089 //______________________________________________________________________________________________
1090 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1093 // Get folder name of reference files
1096 TString offDetStr(GetOfflineDetName(detector));
1098 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1100 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1102 dir.Form("%s/%s", base, offDetStr.Data());
1110 //______________________________________________________________________________________________
1111 void AliShuttle::CleanLocalStorage(const TString& uri)
// Called when the preprocessor is declared failed. Removes the remaining objects from the local storages.
1117 const char* type = 0;
1118 if(uri == fgkLocalCDB) {
1120 } else if(uri == fgkLocalRefStorage) {
1123 AliError(Form("Invalid storage URI: %s", uri.Data()));
1127 AliCDBManager* man = AliCDBManager::Instance();
1129 // open local storage
1130 AliCDBStorage *localSto = man->GetStorage(uri);
1133 Form("CleanLocalStorage - cannot activate local %s storage", type));
1137 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1138 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1140 AliDebug(2, Form("filename = %s", filename.Data()));
1142 Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1143 GetCurrentRun(), fCurrentDetector.Data()));
1145 RemoveFile(filename.Data());
1149 //______________________________________________________________________________________________
1150 void AliShuttle::RemoveFile(const char* filename)
1153 // removes local file
1156 TString command(Form("rm -f %s", filename));
1158 Int_t result = gSystem->Exec(command.Data());
1161 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1162 fCurrentDetector.Data(), filename));
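/*
 Illustration only, not part of AliShuttle. RemoveFile() hands its argument to
 the shell ("rm -f"), which is what lets CleanLocalStorage() pass a wildcard
 pattern, but it also means that spaces or shell metacharacters in a path are
 interpreted by the shell. A minimal sketch of a shell-free alternative is
 given here, assuming a POSIX system; RemoveMatchingFiles is a hypothetical
 name and the return convention is an assumption of this sketch.

```cpp
// Hypothetical helper: removes every file matching a glob pattern
// without spawning a shell. Assumes POSIX glob() and unlink().
#include <glob.h>    // glob(), globfree()
#include <unistd.h>  // unlink()
#include <cstddef>
#include <string>

// Returns the number of files removed, or -1 if the pattern could not be
// expanded or at least one match could not be removed.
int RemoveMatchingFiles(const std::string& pattern)
{
    glob_t matches = {};
    const int rc = glob(pattern.c_str(), 0, nullptr, &matches);
    if (rc == GLOB_NOMATCH) {
        globfree(&matches);
        return 0;  // nothing to remove, the same outcome as "rm -f"
    }
    if (rc != 0) {
        globfree(&matches);
        return -1;  // read error or out of memory during expansion
    }

    int removed = 0;
    bool failed = false;
    for (std::size_t i = 0; i < matches.gl_pathc; ++i) {
        if (unlink(matches.gl_pathv[i]) == 0)
            ++removed;
        else
            failed = true;  // keep going, but report the failure
    }
    globfree(&matches);
    return failed ? -1 : removed;
}
```

 Unlike "rm -f", this sketch reports how many files were removed, which could
 be logged next to the existing "Cannot remove file" message.
*/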
1166 //______________________________________________________________________________________________
1167 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1170 // Reads the AliShuttleStatus from the CDB
1174 delete fStatusEntry;
1178 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1179 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1181 if (!fStatusEntry) return 0;
1182 fStatusEntry->SetOwner(1);
1184 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1186 AliError("Invalid object stored to CDB!");
1193 //______________________________________________________________________________________________
1194 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1197 // writes the status for one subdetector
1201 delete fStatusEntry;
1205 Int_t run = GetCurrentRun();
1207 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1209 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1210 fStatusEntry->SetOwner(1);
1212 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1215 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1216 fCurrentDetector.Data(), run));
1225 //______________________________________________________________________________________________
1226 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1229 // changes the AliShuttleStatus for the given detector and run to the given status
1233 AliError("UNEXPECTED: fStatusEntry empty");
1237 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1240 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1244 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1245 fCurrentDetector.Data(),
1246 status->GetStatusName(),
1247 status->GetStatusName(newStatus));
1248 Log("SHUTTLE", actionStr);
1249 SetLastAction(actionStr);
1251 status->SetStatus(newStatus);
1252 if (increaseCount) status->IncreaseCount();
1254 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1259 //______________________________________________________________________________________________
1260 void AliShuttle::SendMLInfo()
1263 // sends ML information about the current status of the current detector being processed
1266 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1269 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1273 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1274 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1277 mlList.Add(&mlStatus);
1278 mlList.Add(&mlRetryCount);
1281 mlID.Form("%d", GetCurrentRun());
1282 fMonaLisa->SendParameters(&mlList, mlID);
1285 //______________________________________________________________________________________________
1286 Bool_t AliShuttle::ContinueProcessing()
// Reads the AliShuttleStatus information from the CDB and
// checks whether the processing should be continued.
// If so, it returns kTRUE and updates the AliShuttleStatus accordingly.
1292 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1294 AliPreprocessor* aPreprocessor =
1295 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1298 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1302 AliShuttleLogbookEntry::Status entryStatus =
1303 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1305 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1306 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1307 fCurrentDetector.Data(),
1308 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1312 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1314 // check if current run is first unprocessed run for current detector
1315 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1316 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1318 if (fTestMode == kNone)
Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
	" but this is not the first unprocessed run!",
	fCurrentDetector.Data()));
Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
	"Although %s requires strict run ordering "
	"and this is not the first unprocessed run, "
	"the SHUTTLE continues",
	fCurrentDetector.Data()));
1333 AliShuttleStatus* status = ReadShuttleStatus();
1336 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1337 fCurrentDetector.Data()));
1338 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1339 return WriteShuttleStatus(status);
1342 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1343 // If it happens it may mean Logbook updating failed... let's do it now!
1344 if (status->GetStatus() == AliShuttleStatus::kDone ||
1345 status->GetStatus() == AliShuttleStatus::kFailed){
1346 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1347 fCurrentDetector.Data(),
1348 status->GetStatusName(status->GetStatus())));
1349 UpdateShuttleLogbook(fCurrentDetector.Data(),
1350 status->GetStatusName(status->GetStatus()));
1354 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1356 Form("ContinueProcessing - %s: Grid storage of one or more "
1357 "objects failed. Trying again now",
1358 fCurrentDetector.Data()));
1359 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1361 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1362 "successfully stored into main storage",
1363 fCurrentDetector.Data()));
1366 Form("ContinueProcessing - %s: Grid storage failed again",
1367 fCurrentDetector.Data()));
1368 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1373 // if we get here, there is a restart
1374 Bool_t cont = kFALSE;
1377 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1378 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1379 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1380 status->GetCount(), status->GetStatusName()));
1381 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1382 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1384 // there may still be objects in local OCDB and reference storage
1385 // and FXS databases may be not updated: do it now!
1387 // TODO Currently disabled, we want to keep files in case of failure!
1388 // CleanLocalStorage(fgkLocalCDB);
1389 // CleanLocalStorage(fgkLocalRefStorage);
1390 // UpdateTableFailCase();
1392 // Send mail to detector expert!
1393 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...",
1394 fCurrentDetector.Data()));
1396 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1397 fCurrentDetector.Data()));
1400 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1401 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1402 status->GetStatusName(), status->GetCount()));
1403 Bool_t increaseCount = kTRUE;
1404 if (status->GetStatus() == AliShuttleStatus::kDCSError ||
1405 status->GetStatus() == AliShuttleStatus::kDCSStarted)
1406 increaseCount = kFALSE;
1408 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
1415 //______________________________________________________________________________________________
1416 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
// Retrieves data for all detectors in the configuration.
// entry: Shuttle logbook entry, contains run parameters and status of detectors
// (Unprocessed, Inactive, Failed or Done).
// Returns kFALSE if an error occurred and kTRUE otherwise
1425 if (!entry) return kFALSE;
1427 fLogbookEntry = entry;
1429 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1432 // Send the information to ML
1433 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1434 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1437 mlList.Add(&mlStatus);
1438 mlList.Add(&mlRunType);
1441 mlID.Form("%d", GetCurrentRun());
1442 fMonaLisa->SendParameters(&mlList, mlID);
1444 if (fLogbookEntry->IsDone())
1446 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1447 UpdateShuttleLogbook("shuttle_done");
1452 // read test mode if flag is set
1456 TString logEntry(entry->GetRunParameter("log"));
1457 //printf("log entry = %s\n", logEntry.Data());
1458 TString searchStr("Testmode: ");
1459 Int_t pos = logEntry.Index(searchStr.Data());
1460 //printf("%d\n", pos);
1463 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1464 //printf("%s\n", subStr.String().Data());
1465 TString newStr(subStr.Data());
1466 TObjArray* token = newStr.Tokenize(' ');
1470 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1473 Int_t testMode = tmpStr->String().Atoi();
1476 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1477 SetTestMode((TestMode) testMode);
1485 fLogbookEntry->Print("all");
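/*
 Illustration only, not part of AliShuttle. The block above looks for the
 marker "Testmode: " in the run's log entry and converts the token that
 follows it into an integer test mode. The same idea is sketched here with
 the standard library only; ParseTestMode is a hypothetical name, and
 returning -1 when no mode is found is an assumption of this sketch (the
 code above relies on TString::Atoi instead).

```cpp
// Hypothetical helper: extracts the integer that follows "Testmode: ".
#include <cstdlib>
#include <string>

// Returns the parsed test mode, or -1 if the marker is missing or is not
// followed by a number.
int ParseTestMode(const std::string& logEntry)
{
    const std::string marker = "Testmode: ";
    const std::string::size_type pos = logEntry.find(marker);
    if (pos == std::string::npos)
        return -1;  // marker not present

    const char* start = logEntry.c_str() + pos + marker.size();
    char* end = nullptr;
    const long value = std::strtol(start, &end, 10);
    if (end == start)
        return -1;  // no digits after the marker
    return static_cast<int>(value);
}
```

 A caller would still have to check that the value maps onto a valid
 TestMode bit pattern before using it.
*/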
1488 Bool_t hasError = kFALSE;
1490 // Set the CDB and Reference folders according to the year and LHC period
1491 TString lhcPeriod(GetLHCPeriod());
1492 if (lhcPeriod.Length() == 0)
1494 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1498 if (fgkMainCDB.Length() == 0)
1499 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache",
1500 GetCurrentYear(), lhcPeriod.Data());
1502 if (fgkMainRefStorage.Length() == 0)
1503 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache",
1504 GetCurrentYear(), lhcPeriod.Data());
1506 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1507 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1508 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1509 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1511 // Loop on detectors in the configuration
1512 TIter iter(fConfig->GetDetectors());
1513 TObjString* aDetector = 0;
1515 while ((aDetector = (TObjString*) iter.Next()))
1517 fCurrentDetector = aDetector->String();
1519 if (ContinueProcessing() == kFALSE) continue;
1521 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START ******",
1522 GetCurrentRun(), aDetector->GetName()));
1524 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1526 Log(fCurrentDetector.Data(), "Process - Starting processing");
1532 Log("SHUTTLE", "Process - ERROR: Forking failed");
1537 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1538 GetCurrentRun(), aDetector->GetName()));
1540 Long_t begin = time(0);
1542 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1543 while (waitpid(pid, &status, WNOHANG) == 0)
1545 Long_t expiredTime = time(0) - begin;
1547 if (expiredTime > fConfig->GetPPTimeOut())
tmp.Form("Process - Process of %s timed out. "
	"Run time: %ld seconds. Killing...",
	fCurrentDetector.Data(), expiredTime);
1553 Log("SHUTTLE", tmp);
1554 Log(fCurrentDetector, tmp);
1558 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1561 gSystem->Sleep(1000);
1565 gSystem->Sleep(1000);
1568 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1569 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1572 Log("SHUTTLE", Form("Process - Error: "
1573 "Could not open pipe to %s", checkStr.Data()));
1578 if (!fgets(buffer, 100, pipe))
1580 Log("SHUTTLE", "Process - Error: ps did not return anything");
1581 gSystem->ClosePipe(pipe);
1584 gSystem->ClosePipe(pipe);
1586 //Log("SHUTTLE", Form("ps returned %s", buffer));
1589 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1591 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1595 if (expiredTime % 60 == 0)
Log("SHUTTLE", Form("Process - %s: Checking process. "
	"Run time: %ld seconds - Memory consumption: %d KB",
	fCurrentDetector.Data(), expiredTime, mem));
1603 if (mem > fConfig->GetPPMaxMem())
1606 tmp.Form("Process - Process exceeds maximum allowed memory "
1607 "(%d KB > %d KB). Killing...",
1608 mem, fConfig->GetPPMaxMem());
1609 Log("SHUTTLE", tmp);
1610 Log(fCurrentDetector, tmp);
1614 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1617 gSystem->Sleep(1000);
1622 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1623 GetCurrentRun(), aDetector->GetName()));
1625 if (WIFEXITED(status))
1627 Int_t returnCode = WEXITSTATUS(status);
1629 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
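// the client exits with its Bool_t success flag (see gSystem->Exit(success)),
// so a return code of 0 means that the preprocessor failed
if (returnCode == 0) hasError = kTRUE;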
1638 Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1639 aDetector->GetName()));
1641 Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1643 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1645 Log("SHUTTLE", "Process - Could not freopen stdout");
1649 fOutputRedirected = kTRUE;
1650 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1651 Log("SHUTTLE", "Process - Could not redirect stderr");
1655 TString wd = gSystem->WorkingDirectory();
1656 TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(),
1657 fCurrentDetector.Data(), GetCurrentRun());
1659 Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1660 if (!result) // temp dir already exists!
1662 Log(fCurrentDetector.Data(),
1663 Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1664 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1667 if (gSystem->mkdir(tmpDir.Data(), 1))
1669 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1673 if (!gSystem->ChangeDirectory(tmpDir.Data()))
1675 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1679 Bool_t success = ProcessCurrentDetector();
1681 gSystem->ChangeDirectory(wd.Data());
1683 if (success) // Preprocessor finished successfully!
1685 // remove temporary folder
1686 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1688 // Update time_processed field in FXS DB
1689 if (UpdateTable() == kFALSE)
1690 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1691 fCurrentDetector.Data()));
1693 // Transfer the data from local storage to main storage (Grid)
1694 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1695 if (StoreOCDB() == kFALSE)
1698 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1699 GetCurrentRun(), aDetector->GetName()));
1700 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1704 Form("\t\t\t****** run %d - %s: DONE ******",
1705 GetCurrentRun(), aDetector->GetName()));
1706 UpdateShuttleStatus(AliShuttleStatus::kDone);
1707 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1712 Form("\t\t\t****** run %d - %s: PP ERROR ******",
1713 GetCurrentRun(), aDetector->GetName()));
1716 for (UInt_t iSys=0; iSys<3; iSys++)
1718 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1721 Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1722 GetCurrentRun(), aDetector->GetName(), success));
1724 // the client exits here
1725 gSystem->Exit(success);
1727 AliError("We should never get here!!!");
1731 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1734 //check if shuttle is done for this run, if so update logbook
1735 TObjArray checkEntryArray;
1736 checkEntryArray.SetOwner(1);
1737 TString whereClause = Form("where run=%d", GetCurrentRun());
1738 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) ||
1739 checkEntryArray.GetEntries() == 0) {
1740 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1742 return hasError == kFALSE;
1745 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1746 (checkEntryArray.At(0));
1750 if (checkEntry->IsDone())
1752 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1753 UpdateShuttleLogbook("shuttle_done");
1757 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1759 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1761 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1762 checkEntry->GetRun(), GetDetName(iDet)));
1763 fFirstUnprocessed[iDet] = kFALSE;
1771 return hasError == kFALSE;
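/*
 Illustration only, not part of AliShuttle. Process() forks one client per
 detector and has the parent poll it with waitpid(..., WNOHANG), killing the
 client when it exceeds the configured time-out. The bare pattern is sketched
 here under simplifying assumptions: RunWithTimeout and its return codes are
 hypothetical, the memory check is left out, and the child leaves through
 _exit() rather than gSystem->Exit().

```cpp
// Hypothetical helper showing the fork / waitpid(WNOHANG) watchdog pattern.
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>
#include <ctime>

// Runs work() in a child process.
// Returns the child's exit code, -1 if fork/waitpid failed or the child did
// not exit normally, and -2 if the child was killed after the time-out.
int RunWithTimeout(int (*work)(), long timeoutSeconds)
{
    const pid_t pid = fork();
    if (pid < 0)
        return -1;        // forking failed
    if (pid == 0)
        _exit(work());    // child: run the job and exit with its result

    const time_t begin = time(nullptr);
    int status = 0;       // plain int, as waitpid expects
    for (;;) {
        const pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            break;        // child has terminated
        if (r < 0)
            return -1;    // waitpid error
        if (time(nullptr) - begin > timeoutSeconds) {
            kill(pid, SIGKILL);
            waitpid(pid, &status, 0);  // reap the killed child
            return -2;
        }
        usleep(100 * 1000);            // poll every 100 ms
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

 Note that the Shuttle client exits with its success flag, so in Process() an
 exit code of 0 signals failure; this sketch simply passes the code through.
*/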
1774 //______________________________________________________________________________________________
1775 Bool_t AliShuttle::ProcessCurrentDetector()
// Retrieves data for a single detector (fCurrentDetector).
// There should be a configuration for this detector.
1781 Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d",
1782 fCurrentDetector.Data(), GetCurrentRun()));
1784 TString wd = gSystem->WorkingDirectory();
1786 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1789 gSystem->ChangeDirectory(wd.Data());
1791 TMap* dcsMap = new TMap();
1793 // call preprocessor
1794 AliPreprocessor* aPreprocessor =
1795 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1797 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1799 Bool_t processDCS = aPreprocessor->ProcessDCS();
1803 Log(fCurrentDetector, "ProcessCurrentDetector -"
1804 " The preprocessor requested to skip the retrieval of DCS values");
1806 else if (fTestMode & kSkipDCS)
1808 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1810 else if (fTestMode & kErrorDCS)
1812 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1813 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1814 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1819 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1821 // Query DCS archive
1822 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1824 for (int iServ=0; iServ<nServers; iServ++)
1827 TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1828 Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1829 Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1831 Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1832 " Querying DCS Amanda server %s:%d (%d of %d)",
1833 host.Data(), port, iServ+1, nServers));
1838 if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1840 aliasMap = GetValueSet(host, port,
1841 fConfig->GetDCSAliases(fCurrentDetector, iServ),
1842 kAlias, multiSplit);
1845 Log(fCurrentDetector,
1846 Form("ProcessCurrentDetector -"
1847 " Error retrieving DCS aliases from server %s."
1848 " Sending mail to DCS experts!", host.Data()));
1849 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1851 if (!SendMailToDCS())
Log("SHUTTLE", "ProcessCurrentDetector - Could not send mail to DCS experts!");
1859 if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1861 dpMap = GetValueSet(host, port,
1862 fConfig->GetDCSDataPoints(fCurrentDetector, iServ),
1866 Log(fCurrentDetector,
1867 Form("ProcessCurrentDetector -"
1868 " Error retrieving DCS data points from server %s."
1869 " Sending mail to DCS experts!", host.Data()));
1870 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1872 if (!SendMailToDCS())
Log("SHUTTLE", "ProcessCurrentDetector - Could not send mail to DCS experts!");
1875 if (aliasMap) delete aliasMap;
1881 // merge aliasMap and dpMap into dcsMap
1883 TIter iter(aliasMap);
1884 TObjString* key = 0;
1885 while ((key = (TObjString*) iter.Next()))
1886 dcsMap->Add(key, aliasMap->GetValue(key->String()));
1888 aliasMap->SetOwner(kFALSE);
1894 TObjString* key = 0;
1895 while ((key = (TObjString*) iter.Next()))
1896 dcsMap->Add(key, dpMap->GetValue(key->String()));
1898 dpMap->SetOwner(kFALSE);
1904 // save map into file, to help debugging in case of preprocessor error
1905 TFile* f = TFile::Open("DCSMap.root","recreate");
1907 dcsMap->Write("DCSMap", TObject::kSingleKey);
1911 // DCS Archive DB processing successful. Call Preprocessor!
1912 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1914 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1916 if (returnValue > 0) // Preprocessor error!
1918 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1919 "Preprocessor failed. Process returned %d.", returnValue));
1920 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1921 dcsMap->DeleteAll();
1927 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1928 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1929 fCurrentDetector.Data()));
1931 dcsMap->DeleteAll();
1937 //______________________________________________________________________________________________
1938 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
// Queries DAQ's Shuttle logbook and fills the detector status objects.
// Calls QueryRunParameters to query the DAQ logbook for run parameters.
1945 entries.SetOwner(1);
// check the connection; connect if necessary
1948 if(!Connect(3)) return kFALSE;
1951 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1953 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1955 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1959 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1961 if(aResult->GetRowCount() == 0) {
1962 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1967 // TODO Check field count!
1968 const UInt_t nCols = 23;
1969 if (aResult->GetFieldCount() != (Int_t) nCols) {
1970 Log("SHUTTLE", "Invalid SQL result field number!");
1976 while ((aRow = aResult->Next())) {
1977 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1978 Int_t run = runString.Atoi();
1980 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1984 // loop on detectors
1985 for(UInt_t ii = 0; ii < nCols; ii++)
1986 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1988 entries.AddLast(entry);
1996 //______________________________________________________________________________________________
1997 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
// Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
// check the connection; connect if necessary
2008 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2010 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2012 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
2016 if (aResult->GetRowCount() == 0) {
2017 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2022 if (aResult->GetRowCount() > 1) {
2023 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2024 "more than one entry in DAQ Logbook for run %d!", run));
2029 TSQLRow* aRow = aResult->Next();
2032 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2037 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2039 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2040 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2042 UInt_t startTime = entry->GetStartTime();
2043 UInt_t endTime = entry->GetEndTime();
2045 // if (!startTime || !endTime || startTime > endTime)
2048 // Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d. Skipping!",
2049 // run, startTime, endTime));
2051 // Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2052 // fLogbookEntry = entry;
2053 // if (!UpdateShuttleLogbook("shuttle_done"))
2055 // AliError(Form("Could not update logbook for run %d !", run));
2057 // fLogbookEntry = 0;
2068 Form("QueryRunParameters - Invalid parameters for Run %d: "
2069 "startTime = %d, endTime = %d. Skipping!",
2070 run, startTime, endTime));
Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2073 fLogbookEntry = entry;
2074 if (!UpdateShuttleLogbook("shuttle_ignored"))
2076 AliError(Form("Could not update logbook for run %d !", run));
2086 if (startTime && !endTime)
// TODO Here we don't mark SHUTTLE done, because this may mean
// that the run is still ongoing!!
2091 Form("QueryRunParameters - Invalid parameters for Run %d: "
2092 "startTime = %d, endTime = %d. Skipping (Shuttle won't be marked as DONE)!",
2093 run, startTime, endTime));
2095 //Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2096 //fLogbookEntry = entry;
2097 //if (!UpdateShuttleLogbook("shuttle_done"))
2099 // AliError(Form("Could not update logbook for run %d !", run));
2101 //fLogbookEntry = 0;
2109 if (startTime && endTime && (startTime > endTime))
2112 Form("QueryRunParameters - Invalid parameters for Run %d: "
2113 "startTime = %d, endTime = %d. Skipping!",
2114 run, startTime, endTime));
Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2117 fLogbookEntry = entry;
2118 if (!UpdateShuttleLogbook("shuttle_ignored"))
2120 AliError(Form("Could not update logbook for run %d !", run));
2130 TString totEventsStr = entry->GetRunParameter("totalEvents");
2131 Int_t totEvents = totEventsStr.Atoi();
2135 Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));
Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2138 fLogbookEntry = entry;
2139 if (!UpdateShuttleLogbook("shuttle_ignored"))
2141 AliError(Form("Could not update logbook for run %d !", run));
2157 //______________________________________________________________________________________________
2158 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2159 DCSType type, Int_t multiSplit)
// Retrieves the values of all "entries" (aliases or data points) from the DCS server
// host, port: TSocket connection parameters
// entries: list of alias or data point names
// type: kAlias or kDP
// returns a TMap of values, 0 on failure
2167 AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2172 result = client.GetAliasValues(entries, GetCurrentStartTime(),
2173 GetCurrentEndTime());
2175 else if (type == kDP)
2177 result = client.GetDPValues(entries, GetCurrentStartTime(),
2178 GetCurrentEndTime());
2183 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2184 client.GetErrorString(client.GetResultErrorCode())));
2185 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)
2186 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2187 client.GetServerError().Data()));
2195 //______________________________________________________________________________________________
2196 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2197 const char* id, const char* source)
// Gets a calibration file from the file exchange servers (FXS).
// First queries the FXS database for the file name, using the run, detector, id and source info,
// then calls RetrieveFile(filename) for the actual copy to local disk
2202 // run: current run being processed (given by Logbook entry fLogbookEntry)
2203 // detector: the Preprocessor name
2204 // id: provided as a parameter by the Preprocessor
2205 // source: provided by the Preprocessor through GetFileSources function
2207 // check if test mode should simulate a FXS error
2208 if (fTestMode & kErrorFXSFiles)
2210 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
// check the connection; connect if necessary
2215 if (!Connect(system))
2217 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2221 // Query preparation
2222 TString sourceName(source);
2224 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2225 fConfig->GetFXSdbTable(system));
2226 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2227 GetCurrentRun(), detector, id);
2231 whereClause += Form(" and DAQsource=\"%s\"", source);
2233 else if (system == kDCS)
2237 else if (system == kHLT)
2239 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2243 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2245 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2248 TSQLResult* aResult = 0;
2249 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
2252 GetSystemName(system), id, sourceName.Data()));
2256 if(aResult->GetRowCount() == 0)
2259 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
2260 GetSystemName(system), id, sourceName.Data()));
2265 if (aResult->GetRowCount() > 1) {
2267 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
2268 GetSystemName(system), id, sourceName.Data()));
2273 if (aResult->GetFieldCount() != nFields) {
2275 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
2276 GetSystemName(system), id, sourceName.Data()));
2281 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2284 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
2285 GetSystemName(system), id, sourceName.Data()));
2290 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2291 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2292 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2297 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2298 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2300 // retrieved file is renamed to make it unique
2301 TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2302 GetShuttleTempDir(), detector, GetCurrentRun(),
2303 GetSystemName(system), detector, GetCurrentRun(),
2304 id, sourceName.Data());
2307 // file retrieval from FXS
2308 UInt_t nRetries = 0;
2309 UInt_t maxRetries = 3;
2310 Bool_t result = kFALSE;
// copy the file; TSystem::Exec returns 0 on success
2313 while(nRetries++ < maxRetries) {
AliDebug(2, Form("Trying to copy file. Retry # %u", nRetries));
2315 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2318 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
2319 filePath.Data(), GetSystemName(system)));
2323 if (fileChecksum.Length()>0)
2325 // compare md5sum of local file with the one stored in the FXS DB
// stdout is redirected first so that stderr also ends up in /dev/null
Int_t md5Comp = gSystem->Exec(Form("md5sum %s | grep %s > /dev/null 2>&1",
localFileName.Data(), fileChecksum.Data()));
2331 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
2337 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2338 filePath.Data(), GetSystemName(system)));
2343 if(!result) return 0;
2345 fFXSCalled[system]=kTRUE;
2346 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2347 fFXSlist[system].Add(fileParams);
2349 static TString staticLocalFileName;
2350 staticLocalFileName.Form("%s", localFileName.Data());
2352 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2353 "source %s from %s to %s", id, source,
2354 GetSystemName(system), localFileName.Data()));
2356 return staticLocalFileName.Data();
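// The retrieval loop above can be summarised by the sketch below. It is not part
// of AliShuttle: FetchWithRetries and its two callbacks are hypothetical names, and
// it assumes that a failed copy or a checksum mismatch simply leads to another
// attempt (the lines that decide this are not shown in this listing).

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: try the copy up to maxRetries times and accept it only
// if the checksum matches, or if no expected checksum is available.
bool FetchWithRetries(unsigned maxRetries,
                      const std::string& expectedChecksum,
                      const std::function<bool()>& copy,
                      const std::function<std::string()>& checksumOfLocalCopy)
{
    for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
        if (!copy()) continue;                      // copy failed: next attempt
        if (expectedChecksum.empty()) return true;  // nothing to compare against
        if (checksumOfLocalCopy() == expectedChecksum) return true;
        // checksum mismatch: fall through and retry
    }
    return false;                                   // all attempts exhausted
}
```

// With maxRetries = 3, as in GetFile, a copy that succeeds on the second attempt
// is accepted after two calls to the copy callback.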
2359 //______________________________________________________________________________________________
2360 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2363 // Copies file from FXS to local Shuttle machine
2366 // check temp directory: trying to cd to temp; if it does not exist, create it
AliDebug(2, Form("Copy file %s from %s FXS into %s",
fxsFileName, GetSystemName(system), localFileName));
2370 TString tmpDir(localFileName);
2372 tmpDir = tmpDir(0,tmpDir.Last('/'));
2374 Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
if (noDir) // temp dir does not exist!
2377 if (gSystem->mkdir(tmpDir.Data(), 1))
2379 Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2384 TString baseFXSFolder;
2387 baseFXSFolder = "FES/";
2389 else if (system == kDCS)
2393 else if (system == kHLT)
2395 baseFXSFolder = "/opt/FXS/";
2399 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2400 fConfig->GetFXSPort(system),
2401 fConfig->GetFXSUser(system),
2402 fConfig->GetFXSHost(system),
2403 baseFXSFolder.Data(),
2407 AliDebug(2, Form("%s",command.Data()));
2409 Bool_t result = (gSystem->Exec(command.Data()) == 0);
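// A minimal sketch of how the scp command line above is assembled, written with
// the standard library instead of Form(). BuildScpCommand is a hypothetical name
// and the host, user and paths in the usage note are made-up values; the real ones
// come from AliShuttleConfig. No shell quoting is applied, as in the original.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper mirroring the format string
// "scp -oPort=%d -2 %s@%s:%s%s %s" used in RetrieveFile.
std::string BuildScpCommand(int port,
                            const std::string& user,
                            const std::string& host,
                            const std::string& baseFolder,
                            const std::string& fxsFileName,
                            const std::string& localFileName)
{
    std::ostringstream cmd;
    cmd << "scp -oPort=" << port << " -2 "
        << user << '@' << host << ':'
        << baseFolder << fxsFileName   // remote path = base folder + FXS file name
        << ' ' << localFileName;       // destination on the Shuttle machine
    return cmd.str();
}
```

// Example: BuildScpCommand(22, "shuttle", "fxs.example.org", "/opt/FXS/",
// "TPC/ped.root", "/tmp/ped.shuttle") yields
// "scp -oPort=22 -2 shuttle@fxs.example.org:/opt/FXS/TPC/ped.root /tmp/ped.shuttle".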
2414 //______________________________________________________________________________________________
2415 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2418 // Get sources producing the condition file Id from file exchange servers
2419 // if id is NULL all sources are returned (distinct)
2422 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2424 // check if test mode should simulate a FXS error
2425 if (fTestMode & kErrorFXSSources)
2427 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2433 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2434 TList *list = new TList();
2436 list->Add(new TObjString(" "));
// check connection; connect if not yet connected
2441 if (!Connect(system))
2443 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName;
2450 sourceName = "DAQsource";
2451 } else if (system == kHLT)
2453 sourceName = "DDLnumbers";
2456 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2457 TString whereClause = Form("run=%d and detector=\"%s\"",
2458 GetCurrentRun(), detector);
2460 whereClause += Form(" and fileId=\"%s\"", id);
2461 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2463 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2466 TSQLResult* aResult;
2467 aResult = fServer[system]->Query(sqlQuery);
2469 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2470 GetSystemName(system), id));
2474 TList *list = new TList();
2477 if (aResult->GetRowCount() == 0)
2480 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2485 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2488 while ((aRow = aResult->Next()))
2491 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2492 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2493 list->Add(new TObjString(source));
2502 //______________________________________________________________________________________________
2503 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2506 // Get all ids of condition files produced by a given source from file exchange servers
Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2511 // check if test mode should simulate a FXS error
2512 if (fTestMode & kErrorFXSSources)
2514 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
// check connection; connect if not yet connected
2519 if (!Connect(system))
2521 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName;
2528 sourceName = "DAQsource";
2529 } else if (system == kHLT)
2531 sourceName = "DDLnumbers";
2534 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2535 TString whereClause = Form("run=%d and detector=\"%s\"",
2536 GetCurrentRun(), detector);
2537 if (sourceName.Length() > 0 && source)
2538 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2539 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2541 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2544 TSQLResult* aResult;
2545 aResult = fServer[system]->Query(sqlQuery);
2547 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2548 GetSystemName(system), source));
2552 TList *list = new TList();
2555 if (aResult->GetRowCount() == 0)
2558 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2563 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2567 while ((aRow = aResult->Next()))
2570 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2571 AliDebug(2, Form("fileId = %s", id.Data()));
2572 list->Add(new TObjString(id));
2581 //______________________________________________________________________________________________
2582 Bool_t AliShuttle::Connect(Int_t system)
// Connect to the MySQL server hosting the given system's FXS database
2585 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2588 // check connection: if already connected return
2589 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2591 TString dbHost, dbUser, dbPass, dbName;
2593 if (system < 3) // FXS db servers
2595 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2596 dbUser = fConfig->GetFXSdbUser(system);
2597 dbPass = fConfig->GetFXSdbPass(system);
2598 dbName = fConfig->GetFXSdbName(system);
2599 } else { // Run & Shuttle logbook servers
2600 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2601 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2602 dbUser = fConfig->GetDAQlbUser();
2603 dbPass = fConfig->GetDAQlbPass();
2604 dbName = fConfig->GetDAQlbDB();
2607 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2608 if (!fServer[system] || !fServer[system]->IsConnected()) {
2611 AliError(Form("Can't establish connection to FXS database for %s",
2612 AliShuttleInterface::GetSystemName(system)));
2614 AliError("Can't establish connection to Run logbook.");
2616 if(fServer[system]) delete fServer[system];
2621 TSQLResult* aResult=0;
2624 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2627 aResult = fServer[kDCS]->GetTables(dbName.Data());
2630 aResult = fServer[kHLT]->GetTables(dbName.Data());
2633 aResult = fServer[3]->GetTables(dbName.Data());
2641 //______________________________________________________________________________________________
2642 Bool_t AliShuttle::UpdateTable()
2645 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2648 Bool_t result = kTRUE;
2650 for (UInt_t system=0; system<3; system++)
2652 if(!fFXSCalled[system]) continue;
// check connection; connect if not yet connected
2655 if (!Connect(system))
2657 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2662 TTimeStamp now; // now
2664 // Loop on FXS list entries
2665 TIter iter(&fFXSlist[system]);
2666 TObjString *aFXSentry=0;
2667 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2669 TString aFXSentrystr = aFXSentry->String();
2670 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2671 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2673 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2674 GetSystemName(system), aFXSentrystr.Data()));
2675 if(aFXSarray) delete aFXSarray;
2679 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2680 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2682 TString whereClause;
2685 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2686 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2688 else if (system == kDCS)
2690 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2691 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2693 else if (system == kHLT)
2695 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2696 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2701 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2702 now.GetSec(), whereClause.Data());
2704 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2707 TSQLResult* aResult;
2708 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2711 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2712 GetSystemName(system), sqlQuery.Data()));
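// Note on the "#!?!#" separator used for the fFXSlist entries: TString::Tokenize
// treats its argument as a set of delimiter characters, not as one multi-character
// separator. The sketch below reproduces that behaviour with the standard library
// (SplitOnAnyOf is a hypothetical name, and the ids are made-up examples) to show
// that an id or source containing '#', '!' or '?' would yield more than two tokens
// and be rejected by the GetEntries() != 2 check.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits 'text' at every character contained in 'delims' and drops empty tokens.
std::vector<std::string> SplitOnAnyOf(const std::string& text, const std::string& delims)
{
    std::vector<std::string> tokens;
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = text.find_first_of(delims, start);
        tokens.push_back(text.substr(start, end - start));  // end == npos takes the rest
        start = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```

// "PEDESTALS#!?!#LDC1" splits into the two expected tokens, while
// "file#1#!?!#LDC1" splits into three ("file", "1", "LDC1").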
2723 //______________________________________________________________________________________________
2724 Bool_t AliShuttle::UpdateTableFailCase()
// Update the FXS table, filling the time_processed field in all rows corresponding to the current run and detector.
// This is called when the preprocessor is declared failed for the current run, because
// otherwise the field is updated only in case of success.
2730 Bool_t result = kTRUE;
2732 for (UInt_t system=0; system<3; system++)
// check connection; connect if not yet connected
2735 if (!Connect(system))
2737 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2738 GetSystemName(system)));
2743 TTimeStamp now; // now
2745 // Loop on FXS list entries
2747 TString whereClause = Form("where run=%d and detector=\"%s\";",
2748 GetCurrentRun(), fCurrentDetector.Data());
2751 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2752 now.GetSec(), whereClause.Data());
2754 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2757 TSQLResult* aResult;
2758 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2761 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2762 GetSystemName(system), sqlQuery.Data()));
2772 //______________________________________________________________________________________________
2773 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2776 // Update Shuttle logbook filling detector or shuttle_done column
2777 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
// check connection; connect if not yet connected
2782 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2786 TString detName(detector);
2788 if (detName == "shuttle_done" || detName == "shuttle_ignored")
2790 setClause = "set shuttle_done=1";
2792 if (detName == "shuttle_done")
2794 // Send the information to ML
2795 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2798 mlList.Add(&mlStatus);
2801 mlID.Form("%d", GetCurrentRun());
2802 fMonaLisa->SendParameters(&mlList, mlID);
2805 TString statusStr(status);
2806 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2807 statusStr.Contains("failed", TString::kIgnoreCase)){
2808 setClause = Form("set %s=\"%s\"", detector, status);
2811 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2817 TString whereClause = Form("where run=%d", GetCurrentRun());
2819 TString sqlQuery = Form("update %s %s %s",
2820 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2822 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2825 TSQLResult* aResult;
2826 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2828 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2836 //______________________________________________________________________________________________
2837 Int_t AliShuttle::GetCurrentRun() const
2840 // Get current run from logbook entry
2843 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2846 //______________________________________________________________________________________________
2847 UInt_t AliShuttle::GetCurrentStartTime() const
// Get current start time from logbook entry
2853 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2856 //______________________________________________________________________________________________
2857 UInt_t AliShuttle::GetCurrentEndTime() const
2860 // get current end time from logbook entry
2863 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2866 //______________________________________________________________________________________________
2867 UInt_t AliShuttle::GetCurrentYear() const
2870 // Get current year from logbook entry
2873 if (!fLogbookEntry) return 0;
2875 TTimeStamp startTime(GetCurrentStartTime());
2876 TString year = Form("%d",startTime.GetDate());
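// TTimeStamp::GetDate() returns the date packed as an integer of the form YYYYMMDD,
// so the year is its leading four digits. The remaining lines of this function are
// not shown in this listing; the sketch below is one way to extract the year
// arithmetically, with YearFromPackedDate as a hypothetical name.

```cpp
#include <cassert>

// Extracts the year from a date packed as YYYYMMDD (e.g. 20071212 -> 2007).
unsigned int YearFromPackedDate(unsigned int yyyymmdd)
{
    return yyyymmdd / 10000;  // integer division drops the MMDD part
}
```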
2882 //______________________________________________________________________________________________
2883 const char* AliShuttle::GetLHCPeriod() const
2886 // Get current LHC period from logbook entry
2889 if (!fLogbookEntry) return 0;
2891 return fLogbookEntry->GetRunParameter("LHCperiod");
2894 //______________________________________________________________________________________________
2895 void AliShuttle::Log(const char* detector, const char* message)
2898 // Fill log string with a message
2901 TString logRunDir = GetShuttleLogDir();
2902 if (GetCurrentRun() >=0)
2903 logRunDir += Form("/%d", GetCurrentRun());
2905 void* dir = gSystem->OpenDirectory(logRunDir.Data());
2907 if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
AliError(Form("Can't open directory <%s>", logRunDir.Data()));
2913 gSystem->FreeDirectory(dir);
2916 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2917 if (GetCurrentRun() >= 0)
2918 toLog += Form("run %d - ", GetCurrentRun());
2919 toLog += Form("%s", message);
2921 AliInfo(toLog.Data());
// if the log output is already redirected to the file, return here
2924 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2927 TString fileName = GetLogFileName(detector);
2929 gSystem->ExpandPathName(fileName);
2932 logFile.open(fileName, ofstream::out | ofstream::app);
2934 if (!logFile.is_open()) {
2935 AliError(Form("Could not open file %s", fileName.Data()));
2939 logFile << toLog.Data() << "\n";
2944 //______________________________________________________________________________________________
2945 TString AliShuttle::GetLogFileName(const char* detector) const
2948 // returns the name of the log file for a given sub detector
2953 if (GetCurrentRun() >= 0)
2955 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
2956 detector, GetCurrentRun());
2958 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
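// The naming scheme above can be sketched with the standard library as follows.
// LogFileNameSketch is a hypothetical stand-alone function; the log directory is
// passed in explicitly here, whereas the member function uses GetShuttleLogDir().

```cpp
#include <cassert>
#include <string>

// Per-run logs go to <logDir>/<run>/<detector>_<run>.log; when no run is being
// processed (run < 0) the log is <logDir>/<detector>.log.
std::string LogFileNameSketch(const std::string& logDir,
                              const std::string& detector,
                              int run)
{
    if (run >= 0) {
        const std::string runStr = std::to_string(run);
        return logDir + "/" + runStr + "/" + detector + "_" + runStr + ".log";
    }
    return logDir + "/" + detector + ".log";
}
```

// Example: LogFileNameSketch("logs", "TPC", 1234) gives "logs/1234/TPC_1234.log".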
2964 //______________________________________________________________________________________________
2965 void AliShuttle::SendAlive()
2967 // sends alive message to ML
2969 TMonaLisaText mlStatus("SHUTTLE_status", "Alive");
2972 mlList.Add(&mlStatus);
2974 fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
2977 //______________________________________________________________________________________________
2978 Bool_t AliShuttle::Collect(Int_t run)
// Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
2982 // If a dedicated run is given this run is processed
2984 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2988 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2990 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2992 SetLastAction("Starting");
2994 // create ML instance
2996 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable());
3001 TString whereClause("where shuttle_done=0");
3003 whereClause += Form(" and run=%d", run);
3005 TObjArray shuttleLogbookEntries;
3006 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
3008 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3012 if (shuttleLogbookEntries.GetEntries() == 0)
3015 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
3017 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
3018 "or it does not exist in Shuttle logbook", run));
3022 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3023 fFirstUnprocessed[iDet] = kTRUE;
3027 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
3028 // flag them into fFirstUnprocessed array
3029 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
3030 TObjArray tmpLogbookEntries;
3031 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
3033 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3037 TIter iter(&tmpLogbookEntries);
3038 AliShuttleLogbookEntry* anEntry = 0;
3039 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
3041 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3043 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
3045 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
3046 anEntry->GetRun(), GetDetName(iDet)));
3047 fFirstUnprocessed[iDet] = kFALSE;
3055 if (!RetrieveConditionsData(shuttleLogbookEntries))
3057 Log("SHUTTLE", "Collect - Process of at least one run failed");
3061 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
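// The bookkeeping done in Collect for fFirstUnprocessed can be sketched as below:
// every detector starts as "first unprocessed", and the flag is cleared for any
// detector that is still unprocessed in an earlier run. DetStatus and
// FirstUnprocessedFlags are hypothetical names; only the kUnprocessed state is
// taken from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class DetStatus { kUnprocessed, kDone, kFailed };  // kDone/kFailed are assumed

// earlierRuns[r][d] is the status of detector d in the r-th earlier run.
std::vector<bool> FirstUnprocessedFlags(
    std::size_t nDetectors,
    const std::vector<std::vector<DetStatus>>& earlierRuns)
{
    std::vector<bool> first(nDetectors, true);  // default: first time unprocessed
    for (const auto& run : earlierRuns) {
        for (std::size_t iDet = 0; iDet < nDetectors && iDet < run.size(); ++iDet) {
            if (run[iDet] == DetStatus::kUnprocessed)
                first[iDet] = false;            // an earlier run is still pending
        }
    }
    return first;
}
```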
3065 //______________________________________________________________________________________________
3066 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
3069 // Retrieve conditions data for all runs that aren't processed yet
3072 Bool_t hasError = kFALSE;
3074 TIter iter(&dateEntries);
3075 AliShuttleLogbookEntry* anEntry;
3077 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
3078 if (!Process(anEntry)){
3082 // clean SHUTTLE temp directory
3083 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
3084 //RemoveFile(filename.Data());
3087 return hasError == kFALSE;
3090 //______________________________________________________________________________________________
3091 ULong_t AliShuttle::GetTimeOfLastAction() const
3094 // Gets time of last action
3099 fMonitoringMutex->Lock();
3101 tmp = fLastActionTime;
3103 fMonitoringMutex->UnLock();
3108 //______________________________________________________________________________________________
3109 const TString AliShuttle::GetLastAction() const
3112 // returns a string description of the last action
3117 fMonitoringMutex->Lock();
3121 fMonitoringMutex->UnLock();
3126 //______________________________________________________________________________________________
3127 void AliShuttle::SetLastAction(const char* action)
3130 // updates the monitoring variables
3133 fMonitoringMutex->Lock();
3135 fLastAction = action;
3136 fLastActionTime = time(0);
3138 fMonitoringMutex->UnLock();
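// The three accessors above share one pattern: lock the mutex, touch the
// monitoring variables, unlock. A minimal sketch of the same pattern with
// std::mutex and std::lock_guard, which unlocks automatically on every return
// path. LastActionMonitor is a hypothetical class; AliShuttle itself uses a ROOT
// mutex (fMonitoringMutex).

```cpp
#include <cassert>
#include <ctime>
#include <mutex>
#include <string>

class LastActionMonitor {
public:
    // Records the action together with the current time.
    void Set(const std::string& action) {
        std::lock_guard<std::mutex> lock(fMutex);
        fAction = action;
        fTime = std::time(nullptr);
    }
    // Returns a copy, so the caller never reads the shared string unlocked.
    std::string Action() const {
        std::lock_guard<std::mutex> lock(fMutex);
        return fAction;
    }
    std::time_t Time() const {
        std::lock_guard<std::mutex> lock(fMutex);
        return fTime;
    }
private:
    mutable std::mutex fMutex;  // mutable so that const getters can lock it
    std::string fAction;
    std::time_t fTime = 0;
};
```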
3141 //______________________________________________________________________________________________
3142 const char* AliShuttle::GetRunParameter(const char* param)
3145 // returns run parameter read from DAQ logbook
3148 if(!fLogbookEntry) {
3149 AliError("No logbook entry!");
3153 return fLogbookEntry->GetRunParameter(param);
3156 //______________________________________________________________________________________________
3157 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
3160 // returns object from OCDB valid for current run
3163 if (fTestMode & kErrorOCDB)
3165 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
3169 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
3172 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
3176 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
3179 //______________________________________________________________________________________________
3180 Bool_t AliShuttle::SendMail()
3183 // sends a mail to the subdetector expert in case of preprocessor error
3186 if (fTestMode != kNone)
3189 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3192 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3194 Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
3199 gSystem->FreeDirectory(dir);
3202 TString bodyFileName;
3203 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3204 gSystem->ExpandPathName(bodyFileName);
3207 mailBody.open(bodyFileName, ofstream::out);
3209 if (!mailBody.is_open())
3211 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3216 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3217 TObjString *anExpert=0;
3218 while ((anExpert = (TObjString*) iterExperts.Next()))
3220 to += Form("%s,", anExpert->GetName());
3222 to.Remove(to.Length()-1);
3223 AliDebug(2, Form("to: %s",to.Data()));
3226 Log("SHUTTLE", "List of detector responsibles not yet set!");
3230 TString cc="alberto.colla@cern.ch";
3232 TString subject = Form("%s Shuttle preprocessor FAILED in run %d (run type = %s)!",
3233 fCurrentDetector.Data(), GetCurrentRun(), GetRunType());
3234 AliDebug(2, Form("subject: %s", subject.Data()));
3236 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3237 body += Form("SHUTTLE just detected that your preprocessor "
3238 "failed processing run %d (run type = %s)!!\n\n",
3239 GetCurrentRun(), GetRunType());
3240 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3241 fCurrentDetector.Data());
3242 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3244 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3246 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD?time=168 \n\n");
3250 TString logFolder = "logs";
3251 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3252 logFolder += "_PROD";
3255 body += Form("Find the %s log for the current run on \n\n"
3256 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3257 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3258 fCurrentDetector.Data(), GetCurrentRun());
body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3261 AliDebug(2, Form("Body begin: %s", body.Data()));
3263 mailBody << body.Data();
3265 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3267 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(),
3268 GetCurrentRun(), fCurrentDetector.Data(), GetCurrentRun());
3269 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3270 if (gSystem->Exec(tailCommand.Data()))
3272 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3275 TString endBody = Form("------------------------------------------------------\n\n");
3276 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
endBody += "Please do not reply to this message directly; it is automatically generated.\n\n";
3278 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3280 AliDebug(2, Form("Body end: %s", endBody.Data()));
3282 mailBody << endBody.Data();
3287 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3291 bodyFileName.Data());
3292 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3294 Bool_t result = gSystem->Exec(mailCommand.Data());
3299 //______________________________________________________________________________________________
3300 Bool_t AliShuttle::SendMailToDCS()
3303 // sends a mail to the DCS experts in case of DCS error
3306 if (fTestMode != kNone)
3309 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3312 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3314 Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3319 gSystem->FreeDirectory(dir);
3322 TString bodyFileName;
3323 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3324 gSystem->ExpandPathName(bodyFileName);
3327 mailBody.open(bodyFileName, ofstream::out);
3329 if (!mailBody.is_open())
3331 Log("SHUTTLE", Form("SendMailToDCS - Could not open mail body file %s", bodyFileName.Data()));
3335 TString to="Vladimir.Fekete@cern.ch, Svetozar.Kapusta@cern.ch";
3336 //TString to="alberto.colla@cern.ch";
3337 AliDebug(2, Form("to: %s",to.Data()));
3340 Log("SHUTTLE", "List of detector responsibles not yet set!");
3344 TString cc="alberto.colla@cern.ch";
3346 TString subject = Form("Retrieval of data points for %s FAILED in run %d !",
3347 fCurrentDetector.Data(), GetCurrentRun());
3348 AliDebug(2, Form("subject: %s", subject.Data()));
3350 TString body = Form("Dear DCS experts, \n\n");
3351 body += Form("SHUTTLE couldn\'t retrieve the data points for detector %s "
3352 "in run %d!!\n\n", fCurrentDetector.Data(), GetCurrentRun());
3353 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3354 fCurrentDetector.Data());
3355 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3357 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3359 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD?time=168 \n\n");
3362 TString logFolder = "logs";
3363 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3364 logFolder += "_PROD";
3367 body += Form("Find the %s log for the current run on \n\n"
3368 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3369 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3370 fCurrentDetector.Data(), GetCurrentRun());
body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3373 AliDebug(2, Form("Body begin: %s", body.Data()));
3375 mailBody << body.Data();
3377 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3379 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3380 fCurrentDetector.Data(), GetCurrentRun());
3381 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3382 if (gSystem->Exec(tailCommand.Data()))
3384 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3387 TString endBody = Form("------------------------------------------------------\n\n");
3388 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
endBody += "Please do not reply to this message directly; it is automatically generated.\n\n";
3390 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3392 AliDebug(2, Form("Body end: %s", endBody.Data()));
3394 mailBody << endBody.Data();
3399 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3403 bodyFileName.Data());
3404 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3406 Bool_t result = gSystem->Exec(mailCommand.Data());
3411 //______________________________________________________________________________________________
3412 const char* AliShuttle::GetRunType()
3415 // returns run type read from "run type" logbook
3418 if(!fLogbookEntry) {
3419 AliError("No logbook entry!");
3423 return fLogbookEntry->GetRunType();
3426 //______________________________________________________________________________________________
3427 Bool_t AliShuttle::GetHLTStatus()
3429 // Return HLT status (ON=1 OFF=0)
3430 // Converts the HLT status from the status string read in the run logbook (not just a bool)
3432 if(!fLogbookEntry) {
3433 AliError("No logbook entry!");
3437 // TODO implement when HLTStatus is inserted in run logbook
3438 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
3439 //if(hltStatus == "OFF") {return kFALSE};
3444 //______________________________________________________________________________________________
3445 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
3448 // sets Shuttle temp directory
3451 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
3454 //______________________________________________________________________________________________
3455 void AliShuttle::SetShuttleLogDir(const char* logDir)
3458 // sets Shuttle log directory
3461 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);