/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *                                                                        *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 *                                                                        *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/
Revision 1.73 2007/12/14 19:31:36 acolla
Sending email to DCS experts is temporarily commented out

Revision 1.72 2007/12/13 15:44:28 acolla
Run type added in mail sent to detector expert (eases understanding)

Revision 1.71 2007/12/12 14:56:14 jgrosseo
sending shuttle_ignore to ML also in case of 0 events

Revision 1.70 2007/12/12 13:45:35 acolla
Monalisa started in Collect() function. Alive message to monitor is sent at each Collect and every minute during preprocessor processing.

Revision 1.69 2007/12/12 10:06:29 acolla
in AliShuttle.cxx: SHUTTLE logbook is updated in case of invalid run times:

time_start==0 && time_end==0

logbook is NOT updated if time_start != 0 && time_end == 0, because it may mean that the run is still ongoing.

Revision 1.68 2007/12/11 10:15:17 acolla
Added marking SHUTTLE=DONE for invalid runs
(invalid start time or end time) and runs with totalEvents < 1

Revision 1.67 2007/12/07 19:14:36 acolla

Added automatic collection of new runs on a regular time basis (settable from the configuration)

in AliShuttleConfig: new members

- triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
- mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)

- logs now stored in logs/#RUN/DET_#RUN.log

Revision 1.66 2007/12/05 10:45:19 jgrosseo
changed order of arguments to TMonaLisaWriter

Revision 1.65 2007/11/26 16:58:37 acolla
Monalisa configuration added: host and table name

Revision 1.64 2007/11/13 16:15:47 acolla
DCS map is stored in a file in the temp folder where the detector is processed.
If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.

Revision 1.63 2007/11/02 10:53:16 acolla
Protection added to AliShuttle::CopyFileLocally

Revision 1.62 2007/10/31 18:23:13 acolla
Further development on the Shuttle:

- Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
are now built from /alice/data, e.g.:
/alice/data/2007/LHC07a/OCDB

the year and LHC period are taken from the Shuttle.
Raw metadata files are stored by GRP to:
/alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root

- Shuttle sends a mail to DCS experts each time DP retrieval fails.

Revision 1.61 2007/10/30 20:33:51 acolla
Improved managing of temporary folders, which weren't correctly handled.
Resolved bug introduced in StoreReferenceFile, which caused the SPD preprocessor to fail.

Revision 1.60 2007/10/29 18:06:16 acolla

New function StoreRunMetadataFile added to preprocessor and Shuttle interface
This function can be used by GRP only. It stores raw data tags merged file to the
raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).

1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
has been corrected in the root package on the Shuttle machine.

Revision 1.59 2007/10/05 12:40:55 acolla

Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).

Revision 1.58 2007/09/28 15:27:40 acolla

AliDCSClient "multiSplit" option added in the DCS configuration
in AliDCSMessage: variable MAX_BODY_SIZE set to 500000

Revision 1.57 2007/09/27 16:53:13 acolla
Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
merges the DCS aliases/DPs in one TMap and sends it to the preprocessor.

Revision 1.56 2007/09/14 16:46:14 jgrosseo
1) Connect and Close are called before and after each query, so one can
keep the same AliDCSClient object.
2) The splitting of a query is moved to GetDPValues/GetAliasValues.
3) Splitting interval can be specified in constructor
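
The query splitting mentioned in items 2) and 3) can be pictured with the minimal sketch below. It is not the AliDCSClient implementation: the helper name SplitTimeRange, the TimeRange struct and the half-open interval convention are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of AliDCSClient.
struct TimeRange {
    std::uint32_t start; // inclusive (assumed convention)
    std::uint32_t end;   // exclusive (assumed convention)
};

// Cut [start, end) into consecutive pieces of at most `interval` seconds,
// so that each piece can be sent as a separate query.
std::vector<TimeRange> SplitTimeRange(std::uint32_t start, std::uint32_t end,
                                      std::uint32_t interval)
{
    assert(interval > 0);
    std::vector<TimeRange> chunks;
    std::uint32_t t = start;
    while (t < end) {
        // Compare against the remaining length to avoid overflow in t + interval.
        const std::uint32_t next = (end - t > interval) ? t + interval : end;
        chunks.push_back({t, next});
        t = next;
    }
    return chunks;
}
```

An empty or inverted range yields no chunks, so a caller would simply issue no query in that case.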

Revision 1.55 2007/08/06 12:26:40 acolla
Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
read from the run logbook.

Revision 1.54 2007/07/12 09:51:25 jgrosseo
removed duplicated log message in GetFile

Revision 1.53 2007/07/12 09:26:28 jgrosseo
updating hlt fxs base path

Revision 1.52 2007/07/12 08:06:45 jgrosseo
adding log messages in getfile... functions
adding not implemented copy constructor in alishuttleconfigholder

Revision 1.51 2007/07/03 17:24:52 acolla
root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.

Revision 1.50 2007/07/02 17:19:32 acolla
preprocessor is run in a temp directory that is removed when process is finished.

Revision 1.49 2007/06/29 10:45:06 acolla
Number of columns in MySql Shuttle logbook increased by one (HLT added)

Revision 1.48 2007/06/21 13:06:19 acolla
GetFileSources returns dummy list with 1 source if system=DCS (better than
returning error as it was)

Revision 1.47 2007/06/19 17:28:56 acolla
HLT updated; missing map bug removed.

Revision 1.46 2007/06/09 13:01:09 jgrosseo
Switching to retrieval of several DCS DPs at a time (multiDPrequest)

Revision 1.45 2007/05/30 06:35:20 jgrosseo
Adding functionality to the Shuttle/TestShuttle:
o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
o) Function to retrieve list of IDs for a given source (GetFileIDs)
These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
Example code has been added to the TestProcessor in TestShuttle

Revision 1.44 2007/05/11 16:09:32 acolla
Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
example: ITS/SPD/100_filename.root

Revision 1.43 2007/05/10 09:59:51 acolla
Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)

Revision 1.42 2007/05/03 08:01:39 jgrosseo
typo in last commit :-(

Revision 1.41 2007/05/03 08:00:48 jgrosseo
fixing log message when pp wants to skip dcs value retrieval

Revision 1.40 2007/04/27 07:06:48 jgrosseo
GetFileSources returns empty list in case of no files, but successful query
No mails sent in testmode

Revision 1.39 2007/04/17 12:43:57 acolla
Correction in StoreOCDB; change of text in mail to detector expert

Revision 1.38 2007/04/12 08:26:18 jgrosseo

Revision 1.37 2007/04/10 16:53:14 jgrosseo
redirecting sub detector stdout, stderr to sub detector log file

Revision 1.35 2007/04/04 16:26:38 acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. in AliShuttle.cxx: processing time and memory consumption info on a single line.

Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
If you always need DCS data (like before), you do not need to implement it.

6) The run type has been added to the monitoring page
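
The two rules from items 2) and 4) above (return code 0 means success, and a preprocessor using more than 2 GB is terminated) can be written down as the small sketch below. The function names and the reading of "2 GB" as 2 GiB are assumptions for illustration; they are not taken from the Shuttle code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; only the two rules come from the log entry above.
// "2 GB" is read here as 2 GiB, which is an assumption.
constexpr std::uint64_t kMemoryLimitBytes = 2ULL * 1024 * 1024 * 1024;

// Return code convention: 0 is success, any other value is a failure code
// that can carry details about the error condition.
bool PreprocessorSucceeded(unsigned int returnCode)
{
    return returnCode == 0;
}

// True when the monitored process should be terminated.
bool ExceedsMemoryLimit(std::uint64_t usedBytes)
{
    return usedBytes > kMemoryLimitBytes;
}
```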

Revision 1.33 2007/04/03 13:56:01 acolla
Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the

Revision 1.32 2007/02/28 10:41:56 acolla
Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
AliPreprocessor::GetRunType() function.
Added some ldap definition files.

Revision 1.30 2007/02/13 11:23:21 acolla
Moved getters and setters of Shuttle's main OCDB/Reference, local
OCDB/Reference, temp and log folders to AliShuttleInterface

Revision 1.27 2007/01/30 17:52:42 jgrosseo
adding monalisa monitoring

Revision 1.26 2007/01/23 19:20:03 acolla
Removed old ldif files, added TOF, MCH ldif files. Added some options in
AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and

Revision 1.25 2007/01/15 19:13:52 acolla
Moved some AliInfo to AliDebug in SendMail function

Revision 1.21 2006/12/07 08:51:26 jgrosseo
table, db names in ldap configuration
added GRP preprocessor
DCS data can also be retrieved by data point

Revision 1.20 2006/11/16 16:16:48 jgrosseo
introducing strict run ordering flag
removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)

Revision 1.19 2006/11/06 14:23:04 jgrosseo
major update (Alberto)
o) reading of run parameters from the logbook
o) online offline naming conversion
o) standalone DCSclient package

Revision 1.18 2006/10/20 15:22:59 jgrosseo
o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
o) Merging Collect, CollectAll, CollectNew function
o) Removing implementation of empty copy constructors (declaration still there!)

Revision 1.17 2006/10/05 16:20:55 jgrosseo
adapting to new CDB classes

Revision 1.16 2006/10/05 15:46:26 jgrosseo
applying to the new interface

Revision 1.15 2006/10/02 16:38:39 jgrosseo
storing of objects that failed to be stored to the grid before
interfacing of shuttle status table in daq system

Revision 1.14 2006/08/29 09:16:05 jgrosseo

Revision 1.13 2006/08/15 10:50:00 jgrosseo
effc++ corrections (alberto)

Revision 1.12 2006/08/08 14:19:29 jgrosseo
Update to shuttle classes (Alberto)

- Possibility to set the full object's path in the Preprocessor's and
Shuttle's Store functions
- Possibility to extend the object's run validity in the same classes
("startValidity" and "validityInfinite" parameters)
- Implementation of the StoreReferenceData function to store reference
data in a dedicated CDB storage.

Revision 1.11 2006/07/21 07:37:20 jgrosseo
last run is stored after each run

Revision 1.10 2006/07/20 09:54:40 jgrosseo
introducing status management: The processing per subdetector is divided into several steps,
after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
can keep track of the number of failures and skips further processing after a certain threshold is
exceeded. These thresholds can be configured in LDAP.
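
The failure bookkeeping described in revision 1.10 can be sketched as follows. The struct, the function names and the strict "greater than" comparison are assumptions chosen to match the wording "after a certain threshold is exceeded"; the real status class (AliShuttleStatus) and its persistence on disk are not reproduced here.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-detector status that is stored on disk.
struct DetectorStatus {
    int failures = 0; // number of processing attempts that failed or crashed
};

// Record one more failed attempt for this detector and run.
void RecordFailure(DetectorStatus& status)
{
    ++status.failures;
}

// Skip further processing once the configured threshold is exceeded.
// Whether the real check uses > or >= is an assumption here.
bool ShouldSkipProcessing(const DetectorStatus& status, int maxFailures)
{
    return status.failures > maxFailures;
}
```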

Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)

Revision 1.8 2006/07/11 12:44:36 jgrosseo
adding parameters for extended validity range of data produced by preprocessor

Revision 1.7 2006/07/10 14:37:09 jgrosseo
small fix + todo comment

Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)

Revision 1.5 2006/07/04 14:59:57 jgrosseo
revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2

Revision 1.4 2006/06/12 09:11:16 jgrosseo
coding conventions (Alberto)

Revision 1.3 2006/06/06 14:26:40 jgrosseo
o) removed files that were moved to STEER
o) shuttle updated to follow the new interface (Alberto)

Revision 1.2 2006/03/07 07:52:34 hristov
New version (B.Yordanov)

Revision 1.6 2005/11/19 17:19:14 byordano
RetrieveDATEEntries and RetrieveConditionsData added

Revision 1.5 2005/11/19 11:09:27 byordano
AliShuttle declaration added

Revision 1.4 2005/11/17 17:47:34 byordano
TList changed to TObjArray

Revision 1.3 2005/11/17 14:43:23 byordano

Revision 1.1.1.1 2005/10/28 07:33:58 hristov
Initial import as subdirectory in AliRoot

Revision 1.2 2005/09/13 08:41:15 byordano
default startTime endTime added

Revision 1.4 2005/08/30 09:13:02 byordano

Revision 1.3 2005/08/29 21:15:47 byordano

// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it will be used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically to the underlying AliCDBStorage.
// The alias name is used as detSpec.

#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"

#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>

#include <TSystemDirectory.h>
#include <TSystemFile.h>

#include <TGridResult.h>

#include <TMonaLisaWriter.h>

#include <sys/types.h>
#include <sys/wait.h>

//______________________________________________________________________________________________
AliShuttle::AliShuttle(const AliShuttleConfig* config,
	UInt_t timeout, Int_t retries):
fTimeout(timeout), fRetries(retries),
fReadTestMode(kFALSE),
fOutputRedirected(kFALSE)
// config: AliShuttleConfig used
// timeout: timeout used for AliDCSClient connection
// retries: the number of retries in case of connection error.
if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
for(int iSys=0;iSys<4;iSys++) {
fFXSlist[iSys].SetOwner(kTRUE);
fPreprocessorMap.SetOwner(kTRUE);
for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
	fFirstUnprocessed[iDet] = kFALSE;
fMonitoringMutex = new TMutex();

//______________________________________________________________________________________________
AliShuttle::~AliShuttle()
fPreprocessorMap.DeleteAll();
for(int iSys=0;iSys<4;iSys++)
fServer[iSys]->Close();
delete fServer[iSys];
if (fMonitoringMutex)
delete fMonitoringMutex;
fMonitoringMutex = 0;

//______________________________________________________________________________________________
void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
// Registers a new AliPreprocessor.
// It uses GetName() as the identifier of the preprocessor.
// The preprocessor is registered if there isn't any other
// with the same identifier (GetName()).
const char* detName = preprocessor->GetName();
if(GetDetPos(detName) < 0)
	AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
if (fPreprocessorMap.GetValue(detName)) {
AliWarning(Form("AliPreprocessor %s is already registered!", detName));
fPreprocessorMap.Add(new TObjString(detName), preprocessor);
//______________________________________________________________________________________________
Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
	AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
// Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
// offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
// using this function. Use StoreReferenceData instead!
// It calls the StoreLocally function, which temporarily stores the data locally; when the preprocessor
// finishes, the data are transferred to the main storage (Grid).
return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
// Stores a CDB object in the storage for reference data. These objects will not be available during
// offline reconstruction. Use this function for reference data only!
// It calls the StoreLocally function, which temporarily stores the data locally; when the preprocessor
// finishes, the data are transferred to the main storage (Grid).
return StoreLocally(fgkLocalRefStorage, path, object, metaData);

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreLocally(const TString& localUri,
	const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
	Int_t validityStart, Bool_t validityInfinite)
// Stores the object temporarily in local storage. Parameters are passed by the Store and
// StoreReferenceData functions. When the preprocessor finishes, the data are transferred
// to the main storage (Grid).
// The parameters are:
// 1) URI of the backup storage (Local)
// 2) the object's path.
// 3) the object to be stored
// 4) the metaData to be associated with the object
// 5) the validity start run number w.r.t. the current run,
//    if the data is valid only for this run leave the default 0
// 6) specifies if the calibration data is valid for infinity (this means until updated),
//    typical for calibration runs, the default is kFALSE
// returns 0 on failure, 1 otherwise
if (fTestMode & kErrorStorage)
Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
Int_t firstRun = GetCurrentRun() - validityStart;
AliWarning("First valid run happens to be less than 0! Setting it to 0.");
if(validityInfinite) {
lastRun = AliCDBRunRange::Infinity();
lastRun = GetCurrentRun();
// Version is set to current run, it will be used later to transfer data to Grid
AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
TObjString runUsed = Form("%d", GetCurrentRun());
metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
Bool_t result = kFALSE;
if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
result = AliCDBManager::Instance()->GetStorage(localUri)
	->Put(object, id, metaData);
Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
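
// The run validity range computed by StoreLocally can be summarized by the
// standalone sketch below. It only mirrors what the comments and the warning
// message above state (start = current run minus validityStart, clamped to 0;
// end = current run, or infinity). RunRange, ValidityRange and kRunInfinity are
// hypothetical names; kRunInfinity merely stands in for AliCDBRunRange::Infinity().

```cpp
#include <cassert>
#include <limits>

// Stand-in for AliCDBRunRange::Infinity(); the actual value is not assumed here.
constexpr int kRunInfinity = std::numeric_limits<int>::max();

struct RunRange {
    int first;
    int last;
};

RunRange ValidityRange(int currentRun, int validityStart, bool validityInfinite)
{
    int first = currentRun - validityStart;
    if (first < 0) {
        first = 0; // "First valid run happens to be less than 0! Setting it to 0."
    }
    const int last = validityInfinite ? kRunInfinity : currentRun;
    return {first, last};
}
```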

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB()
// Called when the preprocessor ends successfully or when a previous storage attempt failed (kStoreError status)
// Calls the underlying StoreOCDB(const TString&) function twice, for OCDB and Reference storage.
// Then calls CopyFilesToGrid to store reference files (and, for GRP, the run metadata file).
if (fTestMode & kErrorGrid)
Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
Bool_t resultCDB = StoreOCDB(fgkMainCDB);
Log("SHUTTLE","StoreOCDB - Storing reference data ...");
Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
Log("SHUTTLE","StoreOCDB - Storing reference files ...");
Bool_t resultRefFiles = CopyFilesToGrid("reference");
Bool_t resultMetadata = kTRUE;
if(fCurrentDetector == "GRP")
Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
resultMetadata = CopyFilesToGrid("metadata");
return resultCDB && resultRef && resultRefFiles && resultMetadata;

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
// Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
TObjArray* gridIds=0;
Bool_t result = kTRUE;
const char* type = 0;
if(gridURI == fgkMainCDB) {
localURI = fgkLocalCDB;
} else if(gridURI == fgkMainRefStorage) {
localURI = fgkLocalRefStorage;
AliError(Form("Invalid storage URI: %s", gridURI.Data()));
AliCDBManager* man = AliCDBManager::Instance();
AliCDBStorage *gridSto = man->GetStorage(gridURI);
	Form("StoreOCDB - cannot activate main %s storage", type));
gridIds = gridSto->GetQueryCDBList();
// get objects previously stored in local CDB
AliCDBStorage *localSto = man->GetStorage(localURI);
	Form("StoreOCDB - cannot activate local %s storage", type));
AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
// Local objects were stored with current run as Grid version!
TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
localEntries->SetOwner(1);
// loop on local stored objects
TIter localIter(localEntries);
AliCDBEntry *aLocEntry = 0;
while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
aLocEntry->SetOwner(1);
AliCDBId aLocId = aLocEntry->GetId();
aLocEntry->SetVersion(-1);
aLocEntry->SetSubVersion(-1);
// If local object is valid up to infinity we store it only if it is
// the first unprocessed run!
if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
	!fFirstUnprocessed[GetDetPos(fCurrentDetector)])
Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
	"there are previous unprocessed runs!",
	fCurrentDetector.Data(), aLocId.GetPath().Data()));
// loop on Grid valid Id's
Bool_t store = kTRUE;
TIter gridIter(gridIds);
AliCDBId* aGridId = 0;
while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
if(aGridId->GetPath() != aLocId.GetPath()) continue;
// skip all objects valid up to infinity
if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
// if we get here, it means there's already some more recent object stored on Grid!
// If we get here, the file can be stored!
Bool_t storeOk = gridSto->Put(aLocEntry);
if(!store || storeOk){
Log(fCurrentDetector.Data(),
	Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
	type, aGridId->ToString().Data()));
	Form("StoreOCDB - Object <%s> successfully put into %s storage",
	aLocId.ToString().Data(), type));
Log(fCurrentDetector.Data(),
	Form("StoreOCDB - Object <%s> successfully put into %s storage",
	aLocId.ToString().Data(), type));
// removing local filename...
localSto->IdToFilename(aLocId, filename);
Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
RemoveFile(filename.Data());
	Form("StoreOCDB - Grid %s storage of object <%s> failed",
	type, aLocId.ToString().Data()));
Log(fCurrentDetector.Data(),
	Form("StoreOCDB - Grid %s storage of object <%s> failed",
	type, aLocId.ToString().Data()));
localEntries->Clear();
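
// The two checks described by the comments in StoreOCDB(const TString&) can be
// condensed into the sketch below. It is an illustration under assumptions, not the
// Shuttle's code: GridId and CanStoreOnGrid are hypothetical names, kInfiniteRun stands
// in for AliCDBRunRange::Infinity(), and gridIds is assumed to hold only the ids that
// are valid for the current run (the role of GetQueryCDBList() above).

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

constexpr int kInfiniteRun = std::numeric_limits<int>::max(); // stand-in value

struct GridId {
    std::string path;
    int lastRun;
};

bool CanStoreOnGrid(const std::string& localPath, int localLastRun,
                    bool firstUnprocessed, const std::vector<GridId>& gridIds)
{
    // An object valid up to infinity is stored only for the first unprocessed run.
    if (localLastRun == kInfiniteRun && !firstUnprocessed) {
        return false;
    }
    for (const GridId& id : gridIds) {
        if (id.path != localPath) continue;       // different object
        if (id.lastRun == kInfiniteRun) continue; // skip objects valid up to infinity
        return false; // a more recent object is already stored on the Grid
    }
    return true;
}
```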

//______________________________________________________________________________________________
Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
// clears the directory used to store reference files of a given subdetector
AliCDBManager* man = AliCDBManager::Instance();
AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
TString localBaseFolder = sto->GetBaseFolder();
TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
begin.Form("%d_", GetCurrentRun());
TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
TList* dirList = baseDir->GetListOfFiles();
if (!dirList) return kTRUE;
if (dirList->GetEntries() < 3)
Int_t nDirs = 0, nDel = 0;
TIter dirIter(dirList);
TSystemFile* entry = 0;
Bool_t success = kTRUE;
while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
if (entry->IsDirectory())
TString fileName(entry->GetName());
if (!fileName.BeginsWith(begin))
Int_t result = gSystem->Unlink(fileName.Data());
Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
	nDel, nDirs, targetDir.Data()));
Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
result = gSystem->mkdir(targetDir, kTRUE);
Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
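
// The file selection used above (names beginning with "<run>_") can be tried in
// isolation with the sketch below. FilesForRun is a hypothetical helper written with
// the standard library only; it stands in for the TString::BeginsWith test and does
// not touch the file system.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Keep only the names that start with "<run>_", the prefix built with Form("%d_", run).
std::vector<std::string> FilesForRun(const std::vector<std::string>& names, int run)
{
    const std::string prefix = std::to_string(run) + "_";
    std::vector<std::string> selected;
    for (const std::string& name : names) {
        if (name.compare(0, prefix.size(), prefix) == 0) {
            selected.push_back(name);
        }
    }
    return selected;
}
```

// Because the underscore is part of the prefix, files of run 1000 are not picked up
// when cleaning run 100.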

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
// Stores reference file directly (without opening it). This function stores the file locally.
// The file is stored under the following location:
// <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
// where <gridFileName> is the last parameter given to the function
if (fTestMode & kErrorStorage)
Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
AliCDBManager* man = AliCDBManager::Instance();
AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
TString localBaseFolder = sto->GetBaseFolder();
TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);
target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
return CopyFileLocally(localFile, target);
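
// A standalone sketch of the target name built above, <prefix>/<RUN#>_<gridFileName>.
// RefFileTarget is a hypothetical helper; its first argument plays the role of the
// value returned by GetRefFilePrefix, whose internals are not reproduced here.

```cpp
#include <cassert>
#include <string>

// Mirrors target.Append(Form("/%d_%s", run, gridFileName)).
std::string RefFileTarget(const std::string& refFilePrefix, int run,
                          const std::string& gridFileName)
{
    return refFilePrefix + "/" + std::to_string(run) + "_" + gridFileName;
}
```

// With a prefix ending in ITS/SPD and run 100 this gives .../ITS/SPD/100_filename.root,
// the layout quoted in the log entry for revision 1.44.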

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
// Stores Run metadata file to the Grid, in the run folder
// Only GRP can call this function.
if (fTestMode & kErrorStorage)
Log(fCurrentDetector, "StoreRunMetaDataFile - In TESTMODE - Simulating error while storing locally");
AliCDBManager* man = AliCDBManager::Instance();
AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
TString localBaseFolder = sto->GetBaseFolder();
// Build Run level folder
// folder = /alice/data/year/lhcPeriod/runNb/Raw
TString lhcPeriod = GetLHCPeriod();
if (lhcPeriod.Length() == 0)
Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s",
	localBaseFolder.Data(), GetCurrentYear(),
	lhcPeriod.Data(), GetCurrentRun(), gridFileName);
return CopyFileLocally(localFile, target);
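
// The local staging path for the run metadata file follows the format string above,
// with the run number zero-padded to nine digits. The sketch below rebuilds it with
// the standard library; RunMetadataTarget and the sample base folder used with it are
// hypothetical and serve only as an illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Mirrors Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s", ...).
std::string RunMetadataTarget(const std::string& localBaseFolder, int year,
                              const std::string& lhcPeriod, int run,
                              const std::string& gridFileName)
{
    char runField[16];
    std::snprintf(runField, sizeof runField, "%09d", run); // e.g. 99999 -> 000099999
    return localBaseFolder + "/GRP/RunMetadata/alice/data/" + std::to_string(year)
         + "/" + lhcPeriod + "/" + runField + "/Raw/" + gridFileName;
}
```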

//______________________________________________________________________________________________
Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
// Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
// Files are temporarily stored in the local reference storage. When the preprocessor
// finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn
// (in reference or run level folders)
TString targetDir(target(0, target.Last('/')));
// try to open the base dir folder; create it if it does not exist
void* dir = gSystem->OpenDirectory(targetDir.Data());
if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
Log("SHUTTLE", Form("StoreFileLocally - Can't open directory <%s>", targetDir.Data()));
gSystem->FreeDirectory(dir);
result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
Log("SHUTTLE", Form("StoreFileLocally - %s does not exist", localFile));
result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
Log("SHUTTLE", Form("StoreFileLocally - target file %s already exist, removing...", target.Data()));
if (gSystem->Unlink(target.Data()))
Log("SHUTTLE", Form("StoreFileLocally - Could not remove existing target file %s!", target.Data()));
result = gSystem->CopyFile(localFile, target);
Log("SHUTTLE", Form("StoreFileLocally - File %s stored locally to %s", localFile, target.Data()));
Log("SHUTTLE", Form("StoreFileLocally - Could not store file %s to %s! Error code = %d",
	localFile, target.Data(), result));
938 //______________________________________________________________________________________________
939 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
942 // Transfers local files to the Grid. Local files can be reference files
943 // or run metadata file (from GRP only).
945 // According to the type (ref, metadata) the files are stored under the following location:
946 // ref --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
947 // metadata --> <run data folder>/<MetadataFileName>
950 AliCDBManager* man = AliCDBManager::Instance();
951 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
954 TString localBaseFolder = sto->GetBaseFolder();
960 if (strcmp(type, "reference") == 0)
962 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
963 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
966 TString gridBaseFolder = gridSto->GetBaseFolder();
967 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
968 begin = Form("%d_", GetCurrentRun());
970 else if (strcmp(type, "metadata") == 0)
973 TString lhcPeriod = GetLHCPeriod();
975 if (lhcPeriod.Length() == 0)
977 Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
981 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
982 localBaseFolder.Data(), GetCurrentYear(),
983 lhcPeriod.Data(), GetCurrentRun());
984 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
990 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
994 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
998 TList* dirList = baseDir->GetListOfFiles();
1001 if (!dirList) return kTRUE;
1003 if (dirList->GetEntries() < 3)
1011 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
1016 Int_t nDirs = 0, nTransfer = 0;
1017 TIter dirIter(dirList);
1018 TSystemFile* entry = 0;
1020 Bool_t success = kTRUE;
1021 Bool_t first = kTRUE;
1023 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1025 if (entry->IsDirectory())
1028 TString fileName(entry->GetName());
1029 if (!fileName.BeginsWith(begin))
1037 // check that folder exists, otherwise create it
1038 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1046 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
1048 // TODO It does not work currently! Bug in TAliEn::Mkdir
1049 // TODO Manually fixed in local root v5-16-00
1050 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1052 Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1057 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1061 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1065 TString fullLocalPath;
1066 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1068 TString fullGridPath;
1069 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1071 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1075 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!",
1076 fullLocalPath.Data(), fullGridPath.Data()));
1077 RemoveFile(fullLocalPath);
1082 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!",
1083 fullLocalPath.Data(), fullGridPath.Data()));
Log("SHUTTLE", Form("CopyFilesToGrid - %d (out of %d) files in folder %s copied to Grid.",
	nTransfer, nDirs, dir.Data()));
1096 //______________________________________________________________________________________________
1097 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1100 // Get folder name of reference files
1103 TString offDetStr(GetOfflineDetName(detector));
1105 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1107 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1109 dir.Form("%s/%s", base, offDetStr.Data());
1117 //______________________________________________________________________________________________
1118 void AliShuttle::CleanLocalStorage(const TString& uri)
// Called when the preprocessor is declared failed. Removes the remaining objects from the local storage.
1124 const char* type = 0;
1125 if(uri == fgkLocalCDB) {
1127 } else if(uri == fgkLocalRefStorage) {
1130 AliError(Form("Invalid storage URI: %s", uri.Data()));
1134 AliCDBManager* man = AliCDBManager::Instance();
1136 // open local storage
1137 AliCDBStorage *localSto = man->GetStorage(uri);
1140 Form("CleanLocalStorage - cannot activate local %s storage", type));
1144 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1145 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1147 AliDebug(2, Form("filename = %s", filename.Data()));
1149 Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1150 GetCurrentRun(), fCurrentDetector.Data()));
1152 RemoveFile(filename.Data());
1156 //______________________________________________________________________________________________
1157 void AliShuttle::RemoveFile(const char* filename)
1160 // removes local file
1163 TString command(Form("rm -f %s", filename));
1165 Int_t result = gSystem->Exec(command.Data());
1168 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1169 fCurrentDetector.Data(), filename));
1173 //______________________________________________________________________________________________
1174 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1177 // Reads the AliShuttleStatus from the CDB
1181 delete fStatusEntry;
1185 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1186 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1188 if (!fStatusEntry) return 0;
1189 fStatusEntry->SetOwner(1);
1191 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1193 AliError("Invalid object stored to CDB!");
1200 //______________________________________________________________________________________________
1201 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1204 // writes the status for one subdetector
1208 delete fStatusEntry;
1212 Int_t run = GetCurrentRun();
1214 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1216 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1217 fStatusEntry->SetOwner(1);
1219 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1222 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1223 fCurrentDetector.Data(), run));
1232 //______________________________________________________________________________________________
1233 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1236 // changes the AliShuttleStatus for the given detector and run to the given status
1240 AliError("UNEXPECTED: fStatusEntry empty");
1244 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1247 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1251 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1252 fCurrentDetector.Data(),
1253 status->GetStatusName(),
1254 status->GetStatusName(newStatus));
1255 Log("SHUTTLE", actionStr);
1256 SetLastAction(actionStr);
1258 status->SetStatus(newStatus);
1259 if (increaseCount) status->IncreaseCount();
1261 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1266 //______________________________________________________________________________________________
1267 void AliShuttle::SendMLInfo()
1270 // sends ML information about the current status of the current detector being processed
1273 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1276 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1280 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1281 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1284 mlList.Add(&mlStatus);
1285 mlList.Add(&mlRetryCount);
1288 mlID.Form("%d", GetCurrentRun());
1289 fMonaLisa->SendParameters(&mlList, mlID);
1292 //______________________________________________________________________________________________
1293 Bool_t AliShuttle::ContinueProcessing()
// Reads the AliShuttleStatus information from the CDB and
// checks whether the processing should be continued.
// If so, it returns kTRUE and updates the AliShuttleStatus accordingly
1299 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1301 AliPreprocessor* aPreprocessor =
1302 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1305 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1309 AliShuttleLogbookEntry::Status entryStatus =
1310 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1312 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1313 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1314 fCurrentDetector.Data(),
1315 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1319 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1321 // check if current run is first unprocessed run for current detector
1322 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1323 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1325 if (fTestMode == kNone)
Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
	" but this is not the first unprocessed run!", fCurrentDetector.Data()));
Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
	"Although %s requires strict run ordering "
	"and this is not the first unprocessed run, "
	"the SHUTTLE continues", fCurrentDetector.Data()));
1340 AliShuttleStatus* status = ReadShuttleStatus();
1343 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1344 fCurrentDetector.Data()));
1345 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1346 return WriteShuttleStatus(status);
// The following two cases should not happen if the Shuttle Logbook was correctly updated.
// If they do, the Logbook update may have failed, so update it now.
1351 if (status->GetStatus() == AliShuttleStatus::kDone ||
1352 status->GetStatus() == AliShuttleStatus::kFailed){
1353 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1354 fCurrentDetector.Data(),
1355 status->GetStatusName(status->GetStatus())));
1356 UpdateShuttleLogbook(fCurrentDetector.Data(),
1357 status->GetStatusName(status->GetStatus()));
1361 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1363 Form("ContinueProcessing - %s: Grid storage of one or more "
1364 "objects failed. Trying again now",
1365 fCurrentDetector.Data()));
1366 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1368 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1369 "successfully stored into main storage",
1370 fCurrentDetector.Data()));
1373 Form("ContinueProcessing - %s: Grid storage failed again",
1374 fCurrentDetector.Data()));
1375 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1380 // if we get here, there is a restart
1381 Bool_t cont = kFALSE;
1384 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1385 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1386 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1387 status->GetCount(), status->GetStatusName()));
1388 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1389 UpdateShuttleStatus(AliShuttleStatus::kFailed);
// there may still be objects in the local OCDB and reference storage,
// and the FXS databases may not have been updated: do it now!
1394 // TODO Currently disabled, we want to keep files in case of failure!
1395 // CleanLocalStorage(fgkLocalCDB);
1396 // CleanLocalStorage(fgkLocalRefStorage);
1397 // UpdateTableFailCase();
1399 // Send mail to detector expert!
1400 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...",
1401 fCurrentDetector.Data()));
1403 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1404 fCurrentDetector.Data()));
1407 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1408 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1409 status->GetStatusName(), status->GetCount()));
1410 Bool_t increaseCount = kTRUE;
1411 if (status->GetStatus() == AliShuttleStatus::kDCSError ||
1412 status->GetStatus() == AliShuttleStatus::kDCSStarted)
1413 increaseCount = kFALSE;
1415 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
1422 //______________________________________________________________________________________________
1423 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
// Retrieves data for all detectors in the configuration.
// entry: Shuttle logbook entry, contains run parameters and status of detectors
// (Unprocessed, Inactive, Failed or Done).
// Returns kFALSE if an error occurred and kTRUE otherwise
1432 if (!entry) return kFALSE;
1434 fLogbookEntry = entry;
1436 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1439 // Send the information to ML
1440 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1441 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1444 mlList.Add(&mlStatus);
1445 mlList.Add(&mlRunType);
1448 mlID.Form("%d", GetCurrentRun());
1449 fMonaLisa->SendParameters(&mlList, mlID);
1451 if (fLogbookEntry->IsDone())
1453 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1454 UpdateShuttleLogbook("shuttle_done");
1459 // read test mode if flag is set
1463 TString logEntry(entry->GetRunParameter("log"));
1464 //printf("log entry = %s\n", logEntry.Data());
1465 TString searchStr("Testmode: ");
1466 Int_t pos = logEntry.Index(searchStr.Data());
1467 //printf("%d\n", pos);
1470 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1471 //printf("%s\n", subStr.String().Data());
1472 TString newStr(subStr.Data());
1473 TObjArray* token = newStr.Tokenize(' ');
1477 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1480 Int_t testMode = tmpStr->String().Atoi();
1483 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1484 SetTestMode((TestMode) testMode);
1492 fLogbookEntry->Print("all");
1495 Bool_t hasError = kFALSE;
1497 // Set the CDB and Reference folders according to the year and LHC period
1498 TString lhcPeriod(GetLHCPeriod());
1499 if (lhcPeriod.Length() == 0)
1501 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1505 if (fgkMainCDB.Length() == 0)
1506 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache",
1507 GetCurrentYear(), lhcPeriod.Data());
1509 if (fgkMainRefStorage.Length() == 0)
1510 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache",
1511 GetCurrentYear(), lhcPeriod.Data());
1513 // Loop on detectors in the configuration
1514 TIter iter(fConfig->GetDetectors());
1515 TObjString* aDetector = 0;
1517 Bool_t first = kTRUE;
1519 while ((aDetector = (TObjString*) iter.Next()))
1521 fCurrentDetector = aDetector->String();
1523 if (ContinueProcessing() == kFALSE) continue;
// call QueryCDB only when needed, and only once
1528 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1529 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1530 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1531 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1535 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START ******",
1536 GetCurrentRun(), aDetector->GetName()));
1538 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1540 Log(fCurrentDetector.Data(), "Process - Starting processing");
1546 Log("SHUTTLE", "Process - ERROR: Forking failed");
1551 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1552 GetCurrentRun(), aDetector->GetName()));
1554 Long_t begin = time(0);
1556 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1557 while (waitpid(pid, &status, WNOHANG) == 0)
1559 Long_t expiredTime = time(0) - begin;
1561 if (expiredTime > fConfig->GetPPTimeOut())
tmp.Form("Process - Process of %s timed out. "
	"Run time: %ld seconds. Killing...",
	fCurrentDetector.Data(), expiredTime);
1567 Log("SHUTTLE", tmp);
1568 Log(fCurrentDetector, tmp);
1572 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1575 gSystem->Sleep(1000);
1579 gSystem->Sleep(1000);
1582 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1583 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1586 Log("SHUTTLE", Form("Process - Error: "
1587 "Could not open pipe to %s", checkStr.Data()));
1592 if (!fgets(buffer, 100, pipe))
1594 Log("SHUTTLE", "Process - Error: ps did not return anything");
1595 gSystem->ClosePipe(pipe);
1598 gSystem->ClosePipe(pipe);
1600 //Log("SHUTTLE", Form("ps returned %s", buffer));
1603 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1605 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1609 if (expiredTime % 60 == 0)
Log("SHUTTLE", Form("Process - %s: Checking process. "
	"Run time: %ld seconds - Memory consumption: %d KB",
	fCurrentDetector.Data(), expiredTime, mem));
1617 if (mem > fConfig->GetPPMaxMem())
1620 tmp.Form("Process - Process exceeds maximum allowed memory "
1621 "(%d KB > %d KB). Killing...",
1622 mem, fConfig->GetPPMaxMem());
1623 Log("SHUTTLE", tmp);
1624 Log(fCurrentDetector, tmp);
1628 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1631 gSystem->Sleep(1000);
1636 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1637 GetCurrentRun(), aDetector->GetName()));
1639 if (WIFEXITED(status))
1641 Int_t returnCode = WEXITSTATUS(status);
1643 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
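// The client exits with the Bool_t success flag (gSystem->Exit(success)),
// so a return code of 0 means that the client process failed
if (returnCode == 0) hasError = kTRUE;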
1652 Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1653 aDetector->GetName()));
1655 Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1657 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1659 Log("SHUTTLE", "Process - Could not freopen stdout");
1663 fOutputRedirected = kTRUE;
1664 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1665 Log("SHUTTLE", "Process - Could not redirect stderr");
1669 TString wd = gSystem->WorkingDirectory();
1670 TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(),
1671 fCurrentDetector.Data(), GetCurrentRun());
1673 Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1674 if (!result) // temp dir already exists!
1676 Log(fCurrentDetector.Data(),
1677 Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1678 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1681 if (gSystem->mkdir(tmpDir.Data(), 1))
1683 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1687 if (!gSystem->ChangeDirectory(tmpDir.Data()))
1689 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1693 Bool_t success = ProcessCurrentDetector();
1695 gSystem->ChangeDirectory(wd.Data());
1697 if (success) // Preprocessor finished successfully!
1699 // remove temporary folder
1700 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1702 // Update time_processed field in FXS DB
1703 if (UpdateTable() == kFALSE)
1704 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1705 fCurrentDetector.Data()));
1707 // Transfer the data from local storage to main storage (Grid)
1708 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1709 if (StoreOCDB() == kFALSE)
1712 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1713 GetCurrentRun(), aDetector->GetName()));
1714 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1718 Form("\t\t\t****** run %d - %s: DONE ******",
1719 GetCurrentRun(), aDetector->GetName()));
1720 UpdateShuttleStatus(AliShuttleStatus::kDone);
1721 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1726 Form("\t\t\t****** run %d - %s: PP ERROR ******",
1727 GetCurrentRun(), aDetector->GetName()));
1730 for (UInt_t iSys=0; iSys<3; iSys++)
1732 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1735 Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1736 GetCurrentRun(), aDetector->GetName(), success));
1738 // the client exits here
1739 gSystem->Exit(success);
1741 AliError("We should never get here!!!");
1745 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1748 //check if shuttle is done for this run, if so update logbook
1749 TObjArray checkEntryArray;
1750 checkEntryArray.SetOwner(1);
1751 TString whereClause = Form("where run=%d", GetCurrentRun());
1752 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) ||
1753 checkEntryArray.GetEntries() == 0) {
1754 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1756 return hasError == kFALSE;
1759 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1760 (checkEntryArray.At(0));
1764 if (checkEntry->IsDone())
1766 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1767 UpdateShuttleLogbook("shuttle_done");
1771 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1773 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1775 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1776 checkEntry->GetRun(), GetDetName(iDet)));
1777 fFirstUnprocessed[iDet] = kFALSE;
1785 return hasError == kFALSE;
1788 //______________________________________________________________________________________________
1789 Bool_t AliShuttle::ProcessCurrentDetector()
// Retrieves data just for a specific detector (fCurrentDetector).
// There should be a configuration for this detector.
1795 Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d",
1796 fCurrentDetector.Data(), GetCurrentRun()));
1798 TString wd = gSystem->WorkingDirectory();
1800 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1803 gSystem->ChangeDirectory(wd.Data());
1805 TMap* dcsMap = new TMap();
1807 // call preprocessor
1808 AliPreprocessor* aPreprocessor =
1809 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1811 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1813 Bool_t processDCS = aPreprocessor->ProcessDCS();
1817 Log(fCurrentDetector, "ProcessCurrentDetector -"
1818 " The preprocessor requested to skip the retrieval of DCS values");
1820 else if (fTestMode & kSkipDCS)
1822 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1824 else if (fTestMode & kErrorDCS)
1826 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1827 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1828 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1833 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1835 // Query DCS archive
1836 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1838 for (int iServ=0; iServ<nServers; iServ++)
1841 TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1842 Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1843 Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1845 Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1846 " Querying DCS Amanda server %s:%d (%d of %d)",
1847 host.Data(), port, iServ+1, nServers));
1852 if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1854 aliasMap = GetValueSet(host, port,
1855 fConfig->GetDCSAliases(fCurrentDetector, iServ),
1856 kAlias, multiSplit);
1859 Log(fCurrentDetector,
1860 Form("ProcessCurrentDetector -"
1861 " Error retrieving DCS aliases from server %s."
1862 " Sending mail to DCS experts!", host.Data()));
1863 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1865 //if (!SendMailToDCS())
1866 // Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1873 if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1875 dpMap = GetValueSet(host, port,
1876 fConfig->GetDCSDataPoints(fCurrentDetector, iServ),
1880 Log(fCurrentDetector,
1881 Form("ProcessCurrentDetector -"
1882 " Error retrieving DCS data points from server %s."
1883 " Sending mail to DCS experts!", host.Data()));
1884 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1886 //if (!SendMailToDCS())
1887 // Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
delete aliasMap; // deleting a null pointer is a no-op
1895 // merge aliasMap and dpMap into dcsMap
1897 TIter iter(aliasMap);
1898 TObjString* key = 0;
1899 while ((key = (TObjString*) iter.Next()))
1900 dcsMap->Add(key, aliasMap->GetValue(key->String()));
1902 aliasMap->SetOwner(kFALSE);
1908 TObjString* key = 0;
1909 while ((key = (TObjString*) iter.Next()))
1910 dcsMap->Add(key, dpMap->GetValue(key->String()));
1912 dpMap->SetOwner(kFALSE);
1918 // save map into file, to help debugging in case of preprocessor error
1919 /*TFile* f = TFile::Open("DCSMap.root","recreate");
1921 dcsMap->Write("DCSMap", TObject::kSingleKey);
1925 // DCS Archive DB processing successful. Call Preprocessor!
1926 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1928 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1930 if (returnValue > 0) // Preprocessor error!
1932 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1933 "Preprocessor failed. Process returned %d.", returnValue));
1934 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1935 dcsMap->DeleteAll();
1941 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1942 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1943 fCurrentDetector.Data()));
1945 dcsMap->DeleteAll();
1951 //______________________________________________________________________________________________
1952 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
// Queries DAQ's Shuttle logbook and fills the detector status object.
// Calls QueryRunParameters to query the DAQ logbook for run parameters.
1959 entries.SetOwner(1);
// check the connection and connect if necessary
1962 if(!Connect(3)) return kFALSE;
1965 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1967 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1969 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1973 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1975 if(aResult->GetRowCount() == 0) {
1976 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1981 // TODO Check field count!
1982 const UInt_t nCols = 23;
1983 if (aResult->GetFieldCount() != (Int_t) nCols) {
1984 Log("SHUTTLE", "Invalid SQL result field number!");
1990 while ((aRow = aResult->Next())) {
1991 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1992 Int_t run = runString.Atoi();
1994 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1998 // loop on detectors
1999 for(UInt_t ii = 0; ii < nCols; ii++)
2000 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2002 entries.AddLast(entry);
2010 //______________________________________________________________________________________________
2011 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
// Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
// check the connection and connect if necessary
2022 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2024 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2026 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
2030 if (aResult->GetRowCount() == 0) {
2031 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2036 if (aResult->GetRowCount() > 1) {
2037 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2038 "more than one entry in DAQ Logbook for run %d!", run));
2043 TSQLRow* aRow = aResult->Next();
2046 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2051 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2053 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2054 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2056 UInt_t startTime = entry->GetStartTime();
2057 UInt_t endTime = entry->GetEndTime();
2059 // if (!startTime || !endTime || startTime > endTime)
2062 // Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d. Skipping!",
2063 // run, startTime, endTime));
2065 // Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2066 // fLogbookEntry = entry;
2067 // if (!UpdateShuttleLogbook("shuttle_done"))
2069 // AliError(Form("Could not update logbook for run %d !", run));
2071 // fLogbookEntry = 0;
Form("QueryRunParameters - Invalid parameters for Run %d: "
	"startTime = %u, endTime = %u. Skipping!",
	run, startTime, endTime));

Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2087 fLogbookEntry = entry;
2088 if (!UpdateShuttleLogbook("shuttle_ignored"))
2090 AliError(Form("Could not update logbook for run %d !", run));
2100 if (startTime && !endTime)
// TODO Here we don't mark SHUTTLE done, because this may mean
// the run is still ongoing!!
Form("QueryRunParameters - Invalid parameters for Run %d: "
	"startTime = %u, endTime = %u. Skipping (Shuttle won't be marked as DONE)!",
	run, startTime, endTime));
2109 //Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2110 //fLogbookEntry = entry;
2111 //if (!UpdateShuttleLogbook("shuttle_done"))
2113 // AliError(Form("Could not update logbook for run %d !", run));
2115 //fLogbookEntry = 0;
2123 if (startTime && endTime && (startTime > endTime))
Form("QueryRunParameters - Invalid parameters for Run %d: "
	"startTime = %u, endTime = %u. Skipping!",
	run, startTime, endTime));

Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2131 fLogbookEntry = entry;
2132 if (!UpdateShuttleLogbook("shuttle_ignored"))
2134 AliError(Form("Could not update logbook for run %d !", run));
2144 TString totEventsStr = entry->GetRunParameter("totalEvents");
2145 Int_t totEvents = totEventsStr.Atoi();
2149 Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));
Log("SHUTTLE", Form("Marking SHUTTLE ignored for run %d", run));
2152 fLogbookEntry = entry;
2153 if (!UpdateShuttleLogbook("shuttle_ignored"))
2155 AliError(Form("Could not update logbook for run %d !", run));
2171 //______________________________________________________________________________________________
2172 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2173 DCSType type, Int_t multiSplit)
// Retrieves all "entry" data points from the DCS server
// host, port: TSocket connection parameters
// entries: list of alias or data point names
// type: kAlias or kDP
// returns a TMap of values, or 0 on failure
2181 AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2186 result = client.GetAliasValues(entries, GetCurrentStartTime(),
2187 GetCurrentEndTime());
2189 else if (type == kDP)
2191 result = client.GetDPValues(entries, GetCurrentStartTime(),
2192 GetCurrentEndTime());
2197 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2198 client.GetErrorString(client.GetResultErrorCode())));
2199 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)
2200 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2201 client.GetServerError().Data()));
2209 //______________________________________________________________________________________________
2210 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2211 const char* id, const char* source)
// Gets a calibration file from the file exchange servers (FXS).
// First queries the FXS database for the file name, using the run, detector, id and source info,
// then calls RetrieveFile(filename) for the actual copy to local disk
// run: current run being processed (given by Logbook entry fLogbookEntry)
// detector: the Preprocessor name
// id: provided as a parameter by the Preprocessor
// source: provided by the Preprocessor through GetFileSources function
2221 // check if test mode should simulate a FXS error
2222 if (fTestMode & kErrorFXSFiles)
2224 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
// check the connection and connect if necessary
2229 if (!Connect(system))
2231 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2235 // Query preparation
2236 TString sourceName(source);
2238 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2239 fConfig->GetFXSdbTable(system));
2240 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2241 GetCurrentRun(), detector, id);
2245 whereClause += Form(" and DAQsource=\"%s\"", source);
2247 else if (system == kDCS)
2251 else if (system == kHLT)
2253 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2257 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2259 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2262 TSQLResult* aResult = 0;
2263 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
GetSystemName(system), id, sourceName.Data()));
2270 if(aResult->GetRowCount() == 0)
Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
GetSystemName(system), id, sourceName.Data()));
2279 if (aResult->GetRowCount() > 1) {
Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
GetSystemName(system), id, sourceName.Data()));
2287 if (aResult->GetFieldCount() != nFields) {
Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
GetSystemName(system), id, sourceName.Data()));
2295 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
Log(detector, Form("GetFile - Empty result set in %s FXS db from query: id = %s, source = %s",
GetSystemName(system), id, sourceName.Data()));
2304 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2305 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2306 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2311 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2312 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2314 // retrieved file is renamed to make it unique
2315 TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2316 GetShuttleTempDir(), detector, GetCurrentRun(),
2317 GetSystemName(system), detector, GetCurrentRun(),
2318 id, sourceName.Data());
2321 // file retrieval from FXS
2322 UInt_t nRetries = 0;
2323 UInt_t maxRetries = 3;
2324 Bool_t result = kFALSE;
2326 // copy!! if successful TSystem::Exec returns 0
2327 while(nRetries++ < maxRetries) {
2328 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
2329 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
filePath.Data(), GetSystemName(system)));
2337 if (fileChecksum.Length()>0)
// compare md5sum of local file with the one stored in the FXS DB
// (stdout is redirected first, then stderr, so that both are discarded)
Int_t md5Comp = gSystem->Exec(Form("md5sum %s | grep %s > /dev/null 2>&1",
localFileName.Data(), fileChecksum.Data()));
Log(detector, Form("GetFile - md5sum of file %s does not match the local copy!",
2351 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2352 filePath.Data(), GetSystemName(system)));
2357 if(!result) return 0;
2359 fFXSCalled[system]=kTRUE;
2360 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2361 fFXSlist[system].Add(fileParams);
2363 static TString staticLocalFileName;
2364 staticLocalFileName.Form("%s", localFileName.Data());
2366 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2367 "source %s from %s to %s", id, source,
2368 GetSystemName(system), localFileName.Data()));
2370 return staticLocalFileName.Data();
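// Illustration only, not part of AliShuttle: a minimal standalone sketch of the
// retry-and-verify pattern described in GetFile above (copy the file, then compare
// its checksum with the one stored in the FXS database, up to 3 attempts).
// FetchWithRetries and its callback parameters are hypothetical names introduced here.
// The sketch assumes that a checksum mismatch is retried like a failed copy and that
// an empty expected checksum means the comparison is skipped.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper (not an AliShuttle method): try to fetch a file up to
// maxRetries times and accept it only if its checksum matches.
// An empty expected checksum means "not set": the comparison is skipped.
bool FetchWithRetries(const std::function<bool()>& copyFile,
                      const std::function<std::string()>& localChecksum,
                      const std::string& expectedChecksum,
                      unsigned maxRetries = 3)
{
    for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
        if (!copyFile()) continue;                 // copy failed: retry
        if (expectedChecksum.empty()) return true; // nothing to compare against
        if (localChecksum() == expectedChecksum) return true;
        // checksum mismatch: fall through and retry the copy
    }
    return false; // every attempt failed
}
```

// In a real setting copyFile would wrap the scp call and localChecksum the md5sum
// of the local copy; passing them as callbacks keeps the retry logic testable.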
2373 //______________________________________________________________________________________________
2374 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2377 // Copies file from FXS to local Shuttle machine
AliDebug(2, Form("Copy file %s from %s FXS into %s",
fxsFileName, GetSystemName(system), localFileName));

// check temp directory: if it does not exist, create it
2384 TString tmpDir(localFileName);
2386 tmpDir = tmpDir(0,tmpDir.Last('/'));
2388 Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
if (noDir) // temp dir does not exist!
2391 if (gSystem->mkdir(tmpDir.Data(), 1))
2393 Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2398 TString baseFXSFolder;
2401 baseFXSFolder = "FES/";
2403 else if (system == kDCS)
2407 else if (system == kHLT)
2409 baseFXSFolder = "/opt/FXS/";
2413 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2414 fConfig->GetFXSPort(system),
2415 fConfig->GetFXSUser(system),
2416 fConfig->GetFXSHost(system),
2417 baseFXSFolder.Data(),
2421 AliDebug(2, Form("%s",command.Data()));
2423 Bool_t result = (gSystem->Exec(command.Data()) == 0);
2428 //______________________________________________________________________________________________
2429 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2432 // Get sources producing the condition file Id from file exchange servers
2433 // if id is NULL all sources are returned (distinct)
2436 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2438 // check if test mode should simulate a FXS error
2439 if (fTestMode & kErrorFXSSources)
2441 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2447 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2448 TList *list = new TList();
2450 list->Add(new TObjString(" "));
// check connection; connect if not yet connected
2455 if (!Connect(system))
2457 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName; // stays empty unless a source column applies to this system
2464 sourceName = "DAQsource";
2465 } else if (system == kHLT)
2467 sourceName = "DDLnumbers";
2470 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2471 TString whereClause = Form("run=%d and detector=\"%s\"",
2472 GetCurrentRun(), detector);
2474 whereClause += Form(" and fileId=\"%s\"", id);
2475 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2477 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2480 TSQLResult* aResult;
2481 aResult = fServer[system]->Query(sqlQuery);
2483 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2484 GetSystemName(system), id));
2488 TList *list = new TList();
2491 if (aResult->GetRowCount() == 0)
2494 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2499 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2502 while ((aRow = aResult->Next()))
2505 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2506 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2507 list->Add(new TObjString(source));
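// Illustration only, not part of AliShuttle: a minimal standalone sketch of how the
// "select distinct" query of GetFileSources is composed, with the fileId filter
// appended only when an id is given. BuildSourcesQuery is a hypothetical helper and
// the table name used in the usage note is made up. Values are concatenated
// unescaped, as in the code above; this is a sketch, not a hardened query builder.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the query that lists the distinct sources.
// sourceColumn would be "DAQsource" or "DDLnumbers"; id == nullptr means "all ids".
std::string BuildSourcesQuery(const std::string& table, const std::string& sourceColumn,
                              int run, const std::string& detector, const char* id)
{
    std::string query = "select distinct " + sourceColumn + " from " + table + " where";
    query += " run=" + std::to_string(run) + " and detector=\"" + detector + "\"";
    if (id) query += std::string(" and fileId=\"") + id + "\""; // optional filter
    return query;
}
```

// For example, BuildSourcesQuery("some_fxs_table", "DAQsource", 7, "TPC", nullptr)
// lists every source of run 7 for TPC, whatever the file id.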
2516 //______________________________________________________________________________________________
2517 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2520 // Get all ids of condition files produced by a given source from file exchange servers
Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2525 // check if test mode should simulate a FXS error
2526 if (fTestMode & kErrorFXSSources)
2528 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
// check connection; connect if not yet connected
2533 if (!Connect(system))
2535 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName; // stays empty unless a source column applies to this system
2542 sourceName = "DAQsource";
2543 } else if (system == kHLT)
2545 sourceName = "DDLnumbers";
2548 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2549 TString whereClause = Form("run=%d and detector=\"%s\"",
2550 GetCurrentRun(), detector);
2551 if (sourceName.Length() > 0 && source)
2552 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2553 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2555 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2558 TSQLResult* aResult;
2559 aResult = fServer[system]->Query(sqlQuery);
2561 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2562 GetSystemName(system), source));
2566 TList *list = new TList();
2569 if (aResult->GetRowCount() == 0)
2572 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2577 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2581 while ((aRow = aResult->Next()))
2584 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2585 AliDebug(2, Form("fileId = %s", id.Data()));
2586 list->Add(new TObjString(id));
2595 //______________________________________________________________________________________________
2596 Bool_t AliShuttle::Connect(Int_t system)
2598 // Connect to MySQL Server of the system's FXS MySQL databases
2599 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2602 // check connection: if already connected return
2603 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2605 TString dbHost, dbUser, dbPass, dbName;
2607 if (system < 3) // FXS db servers
2609 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2610 dbUser = fConfig->GetFXSdbUser(system);
2611 dbPass = fConfig->GetFXSdbPass(system);
2612 dbName = fConfig->GetFXSdbName(system);
2613 } else { // Run & Shuttle logbook servers
2614 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2615 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2616 dbUser = fConfig->GetDAQlbUser();
2617 dbPass = fConfig->GetDAQlbPass();
2618 dbName = fConfig->GetDAQlbDB();
2621 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2622 if (!fServer[system] || !fServer[system]->IsConnected()) {
2625 AliError(Form("Can't establish connection to FXS database for %s",
2626 AliShuttleInterface::GetSystemName(system)));
2628 AliError("Can't establish connection to Run logbook.");
2630 if(fServer[system]) delete fServer[system];
2635 TSQLResult* aResult=0;
2638 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2641 aResult = fServer[kDCS]->GetTables(dbName.Data());
2644 aResult = fServer[kHLT]->GetTables(dbName.Data());
2647 aResult = fServer[3]->GetTables(dbName.Data());
2655 //______________________________________________________________________________________________
2656 Bool_t AliShuttle::UpdateTable()
2659 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2662 Bool_t result = kTRUE;
2664 for (UInt_t system=0; system<3; system++)
2666 if(!fFXSCalled[system]) continue;
// check connection; connect if not yet connected
2669 if (!Connect(system))
2671 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2676 TTimeStamp now; // now
2678 // Loop on FXS list entries
2679 TIter iter(&fFXSlist[system]);
2680 TObjString *aFXSentry=0;
2681 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2683 TString aFXSentrystr = aFXSentry->String();
2684 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2685 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2687 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2688 GetSystemName(system), aFXSentrystr.Data()));
2689 if(aFXSarray) delete aFXSarray;
2693 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2694 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2696 TString whereClause;
2699 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2700 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2702 else if (system == kDCS)
2704 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2705 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2707 else if (system == kHLT)
2709 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2710 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2715 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2716 now.GetSec(), whereClause.Data());
2718 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2721 TSQLResult* aResult;
2722 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2725 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2726 GetSystemName(system), sqlQuery.Data()));
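// Illustration only, not part of AliShuttle: GetFile packs each retrieved file as
// "<id>#!?!#<source>" and UpdateTable splits it again. The standalone sketch below
// shows the same packing with the whole "#!?!#" string treated as one separator.
// PackEntry and UnpackEntry are hypothetical names. Note that ROOT's
// TString::Tokenize treats its argument as a set of delimiter characters, so this
// sketch is a variant under that assumption, not a copy of the code above.

```cpp
#include <cassert>
#include <string>

// Separator placed between the file id and the source.
static const std::string kSeparator = "#!?!#";

std::string PackEntry(const std::string& id, const std::string& source)
{
    return id + kSeparator + source;
}

// Returns false unless the separator occurs exactly once in entry.
bool UnpackEntry(const std::string& entry, std::string& id, std::string& source)
{
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;               // no separator
    if (entry.find(kSeparator, pos + kSeparator.size()) != std::string::npos)
        return false;                                         // ambiguous entry
    id = entry.substr(0, pos);
    source = entry.substr(pos + kSeparator.size());
    return true;
}
```

// Splitting on the full separator keeps ids or sources that contain a single
// '#', '!' or '?' intact, at the cost of not using Tokenize directly.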
2737 //______________________________________________________________________________________________
2738 Bool_t AliShuttle::UpdateTableFailCase()
// Update FXS table filling time_processed field in all rows corresponding to current run and detector.
// This is called when the preprocessor is declared failed for the current run, because
// otherwise the field is filled only in case of success
2744 Bool_t result = kTRUE;
2746 for (UInt_t system=0; system<3; system++)
// check connection; connect if not yet connected
2749 if (!Connect(system))
2751 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2752 GetSystemName(system)));
2757 TTimeStamp now; // now
// Update all FXS entries of the current run and detector (no loop on single files here)
2761 TString whereClause = Form("where run=%d and detector=\"%s\";",
2762 GetCurrentRun(), fCurrentDetector.Data());
2765 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2766 now.GetSec(), whereClause.Data());
2768 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2771 TSQLResult* aResult;
2772 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2775 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2776 GetSystemName(system), sqlQuery.Data()));
2786 //______________________________________________________________________________________________
2787 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2790 // Update Shuttle logbook filling detector or shuttle_done column
2791 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
// check connection; connect if not yet connected
2796 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2800 TString detName(detector);
2802 if (detName == "shuttle_done" || detName == "shuttle_ignored")
2804 setClause = "set shuttle_done=1";
2806 if (detName == "shuttle_done")
2808 // Send the information to ML
2809 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2812 mlList.Add(&mlStatus);
2815 mlID.Form("%d", GetCurrentRun());
2816 fMonaLisa->SendParameters(&mlList, mlID);
2819 TString statusStr(status);
2820 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2821 statusStr.Contains("failed", TString::kIgnoreCase)){
2822 setClause = Form("set %s=\"%s\"", detector, status);
2825 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2831 TString whereClause = Form("where run=%d", GetCurrentRun());
2833 TString sqlQuery = Form("update %s %s %s",
2834 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2836 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2839 TSQLResult* aResult;
2840 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2842 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
//______________________________________________________________________________________________
Int_t AliShuttle::GetCurrentRun() const
{
	//
	// Get current run from logbook entry
	//

	return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
}
//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentStartTime() const
{
	//
	// Get current start time from logbook entry
	//

	return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
}
//______________________________________________________________________________________________
UInt_t AliShuttle::GetCurrentEndTime() const
{
	//
	// Get current end time from logbook entry
	//

	return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
}
2880 //______________________________________________________________________________________________
2881 UInt_t AliShuttle::GetCurrentYear() const
2884 // Get current year from logbook entry
2887 if (!fLogbookEntry) return 0;
2889 TTimeStamp startTime(GetCurrentStartTime());
2890 TString year = Form("%d",startTime.GetDate());
//______________________________________________________________________________________________
const char* AliShuttle::GetLHCPeriod() const
{
	//
	// Get current LHC period from logbook entry
	//

	if (!fLogbookEntry) return 0;

	return fLogbookEntry->GetRunParameter("LHCperiod");
}
2908 //______________________________________________________________________________________________
2909 void AliShuttle::Log(const char* detector, const char* message)
2912 // Fill log string with a message
2915 TString logRunDir = GetShuttleLogDir();
2916 if (GetCurrentRun() >=0)
2917 logRunDir += Form("/%d", GetCurrentRun());
2919 void* dir = gSystem->OpenDirectory(logRunDir.Data());
if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
AliError(Form("Can't create directory <%s>", logRunDir.Data()));
2927 gSystem->FreeDirectory(dir);
2930 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2931 if (GetCurrentRun() >= 0)
2932 toLog += Form("run %d - ", GetCurrentRun());
2933 toLog += Form("%s", message);
2935 AliInfo(toLog.Data());
// if the log output is already redirected to the file, return here
2938 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2941 TString fileName = GetLogFileName(detector);
2943 gSystem->ExpandPathName(fileName);
2946 logFile.open(fileName, ofstream::out | ofstream::app);
2948 if (!logFile.is_open()) {
2949 AliError(Form("Could not open file %s", fileName.Data()));
2953 logFile << toLog.Data() << "\n";
2958 //______________________________________________________________________________________________
2959 TString AliShuttle::GetLogFileName(const char* detector) const
2962 // returns the name of the log file for a given sub detector
2967 if (GetCurrentRun() >= 0)
2969 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
2970 detector, GetCurrentRun());
2972 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
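// Illustration only, not part of AliShuttle: a minimal standalone sketch of the
// naming scheme applied by GetLogFileName above. LogFileName is a hypothetical free
// function; the log directory is passed in explicitly instead of being read from
// GetShuttleLogDir().

```cpp
#include <cassert>
#include <string>

// Mirrors the two patterns used above:
//   <logDir>/<run>/<detector>_<run>.log   while a run is being processed
//   <logDir>/<detector>.log               otherwise (run < 0)
std::string LogFileName(const std::string& logDir, const std::string& detector, int run)
{
    if (run >= 0) {
        const std::string runStr = std::to_string(run);
        return logDir + "/" + runStr + "/" + detector + "_" + runStr + ".log";
    }
    return logDir + "/" + detector + ".log";
}
```

// A negative run corresponds to GetCurrentRun() returning -1, that is, no logbook
// entry is being processed and the detector log lives directly in the log directory.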
2978 //______________________________________________________________________________________________
2979 void AliShuttle::SendAlive()
2981 // sends alive message to ML
2983 TMonaLisaText mlStatus("SHUTTLE_status", "Alive");
2986 mlList.Add(&mlStatus);
2988 fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
2991 //______________________________________________________________________________________________
2992 Bool_t AliShuttle::Collect(Int_t run)
// Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default).
// If a dedicated run is given, only this run is processed.
// In operational mode, this is the Shuttle function triggered by the EOR signal.
3002 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
3004 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
3006 SetLastAction("Starting");
3008 // create ML instance
3010 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable());
3015 TString whereClause("where shuttle_done=0");
3017 whereClause += Form(" and run=%d", run);
3019 TObjArray shuttleLogbookEntries;
3020 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
3022 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3026 if (shuttleLogbookEntries.GetEntries() == 0)
3029 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
3031 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
3032 "or it does not exist in Shuttle logbook", run));
3036 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3037 fFirstUnprocessed[iDet] = kTRUE;
3041 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
3042 // flag them into fFirstUnprocessed array
3043 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
3044 TObjArray tmpLogbookEntries;
3045 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
3047 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3051 TIter iter(&tmpLogbookEntries);
3052 AliShuttleLogbookEntry* anEntry = 0;
3053 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
3055 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3057 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
3059 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
3060 anEntry->GetRun(), GetDetName(iDet)));
3061 fFirstUnprocessed[iDet] = kFALSE;
3069 if (!RetrieveConditionsData(shuttleLogbookEntries))
3071 Log("SHUTTLE", "Collect - Process of at least one run failed");
3075 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
3079 //______________________________________________________________________________________________
3080 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
3083 // Retrieve conditions data for all runs that aren't processed yet
3086 Bool_t hasError = kFALSE;
3088 TIter iter(&dateEntries);
3089 AliShuttleLogbookEntry* anEntry;
3091 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
3092 if (!Process(anEntry)){
3096 // clean SHUTTLE temp directory
3097 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
3098 //RemoveFile(filename.Data());
3101 return hasError == kFALSE;
3104 //______________________________________________________________________________________________
3105 ULong_t AliShuttle::GetTimeOfLastAction() const
3108 // Gets time of last action
3113 fMonitoringMutex->Lock();
3115 tmp = fLastActionTime;
3117 fMonitoringMutex->UnLock();
3122 //______________________________________________________________________________________________
3123 const TString AliShuttle::GetLastAction() const
3126 // returns a string description of the last action
3131 fMonitoringMutex->Lock();
3135 fMonitoringMutex->UnLock();
3140 //______________________________________________________________________________________________
3141 void AliShuttle::SetLastAction(const char* action)
3144 // updates the monitoring variables
3147 fMonitoringMutex->Lock();
3149 fLastAction = action;
3150 fLastActionTime = time(0);
3152 fMonitoringMutex->UnLock();
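// Illustration only, not part of AliShuttle: the three monitoring accessors above
// lock fMonitoringMutex, touch the shared members, and unlock by hand. The standalone
// sketch below shows the same idea with std::mutex and std::lock_guard, which
// releases the lock on every exit path. MonitoringState is a hypothetical class,
// and the standard-library types stand in for the ROOT ones used in this file.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>
#include <string>

// Hypothetical stand-in for the monitoring members of AliShuttle.
class MonitoringState {
public:
    void SetLastAction(const std::string& action)
    {
        std::lock_guard<std::mutex> lock(fMutex); // released when lock goes out of scope
        fLastAction = action;
        fLastActionTime = std::time(nullptr);
    }

    // Returns a copy, so the caller never keeps a reference to guarded data.
    std::string GetLastAction() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastAction;
    }

    std::time_t GetTimeOfLastAction() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastActionTime;
    }

private:
    mutable std::mutex fMutex;       // mutable: locked from const getters
    std::string fLastAction;         // description of the last action
    std::time_t fLastActionTime = 0; // 0 until the first action is recorded
};
```

// Returning by value mirrors the getters above, which copy the member into a
// temporary while the mutex is held and return the copy afterwards.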
3155 //______________________________________________________________________________________________
3156 const char* AliShuttle::GetRunParameter(const char* param)
3159 // returns run parameter read from DAQ logbook
3162 if(!fLogbookEntry) {
3163 AliError("No logbook entry!");
3167 return fLogbookEntry->GetRunParameter(param);
3170 //______________________________________________________________________________________________
3171 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
3174 // returns object from OCDB valid for current run
3177 if (fTestMode & kErrorOCDB)
3179 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
3183 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
3186 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
3190 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
3193 //______________________________________________________________________________________________
3194 Bool_t AliShuttle::SendMail()
3197 // sends a mail to the subdetector expert in case of preprocessor error
3200 if (fTestMode != kNone)
3203 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3206 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3208 Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
3213 gSystem->FreeDirectory(dir);
3217 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3218 TObjString *anExpert=0;
3219 while ((anExpert = (TObjString*) iterExperts.Next()))
3221 to += Form("%s,", anExpert->GetName());
3223 if (to.Length() > 0)
3224 to.Remove(to.Length()-1);
3225 AliDebug(2, Form("to: %s",to.Data()));
3228 Log("SHUTTLE", "List of detector responsibles not yet set!");
3232 TString bodyFileName;
3233 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3234 gSystem->ExpandPathName(bodyFileName);
3237 mailBody.open(bodyFileName, ofstream::out);
3239 if (!mailBody.is_open())
3241 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3245 TString cc="alberto.colla@cern.ch";
3247 TString subject = Form("%s Shuttle preprocessor FAILED in run %d (run type = %s)!",
3248 fCurrentDetector.Data(), GetCurrentRun(), GetRunType());
3249 AliDebug(2, Form("subject: %s", subject.Data()));
3251 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3252 body += Form("SHUTTLE just detected that your preprocessor "
3253 "failed processing run %d (run type = %s)!!\n\n",
3254 GetCurrentRun(), GetRunType());
3255 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3256 fCurrentDetector.Data());
3257 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3259 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3261 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD&time=168 \n\n");
3265 TString logFolder = "logs";
3266 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3267 logFolder += "_PROD";
3270 body += Form("Find the %s log for the current run on \n\n"
3271 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3272 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3273 fCurrentDetector.Data(), GetCurrentRun());
body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3276 AliDebug(2, Form("Body begin: %s", body.Data()));
3278 mailBody << body.Data();
3280 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3282 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(),
3283 GetCurrentRun(), fCurrentDetector.Data(), GetCurrentRun());
3284 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3285 if (gSystem->Exec(tailCommand.Data()))
3287 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3290 TString endBody = Form("------------------------------------------------------\n\n");
3291 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3292 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3293 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3295 AliDebug(2, Form("Body end: %s", endBody.Data()));
3297 mailBody << endBody.Data();
3302 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3306 bodyFileName.Data());
3307 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3309 Bool_t result = gSystem->Exec(mailCommand.Data());
3314 //______________________________________________________________________________________________
3315 Bool_t AliShuttle::SendMailToDCS()
3318 // sends a mail to the DCS experts in case of DCS error
3321 if (fTestMode != kNone)
3324 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3327 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3329 Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3334 gSystem->FreeDirectory(dir);
3337 TString bodyFileName;
3338 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3339 gSystem->ExpandPathName(bodyFileName);
3342 mailBody.open(bodyFileName, ofstream::out);
3344 if (!mailBody.is_open())
3346 Log("SHUTTLE", Form("SendMailToDCS - Could not open mail body file %s", bodyFileName.Data()));
3350 TString to="Vladimir.Fekete@cern.ch, Svetozar.Kapusta@cern.ch";
3351 //TString to="alberto.colla@cern.ch";
3352 AliDebug(2, Form("to: %s",to.Data()));
Log("SHUTTLE", "SendMailToDCS - List of DCS experts not yet set!");
3359 TString cc="alberto.colla@cern.ch";
3361 TString subject = Form("Retrieval of data points for %s FAILED in run %d !",
3362 fCurrentDetector.Data(), GetCurrentRun());
3363 AliDebug(2, Form("subject: %s", subject.Data()));
3365 TString body = Form("Dear DCS experts, \n\n");
3366 body += Form("SHUTTLE couldn\'t retrieve the data points for detector %s "
3367 "in run %d!!\n\n", fCurrentDetector.Data(), GetCurrentRun());
3368 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3369 fCurrentDetector.Data());
3370 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3372 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD&time=168 \n\n");
3377 TString logFolder = "logs";
3378 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3379 logFolder += "_PROD";
3382 body += Form("Find the %s log for the current run on \n\n"
3383 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3384 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3385 fCurrentDetector.Data(), GetCurrentRun());
body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3388 AliDebug(2, Form("Body begin: %s", body.Data()));
3390 mailBody << body.Data();
3392 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3394 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3395 fCurrentDetector.Data(), GetCurrentRun());
3396 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3397 if (gSystem->Exec(tailCommand.Data()))
3399 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3402 TString endBody = Form("------------------------------------------------------\n\n");
3403 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3404 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3405 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3407 AliDebug(2, Form("Body end: %s", endBody.Data()));
3409 mailBody << endBody.Data();
3414 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3418 bodyFileName.Data());
3419 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3421 Bool_t result = gSystem->Exec(mailCommand.Data());
3426 //______________________________________________________________________________________________
3427 const char* AliShuttle::GetRunType()
3430 // returns run type read from "run type" logbook
3433 if(!fLogbookEntry) {
3434 AliError("No logbook entry!");
3438 return fLogbookEntry->GetRunType();
3441 //______________________________________________________________________________________________
3442 Bool_t AliShuttle::GetHLTStatus()
3444 // Return HLT status (ON=1 OFF=0)
3445 // Converts the HLT status from the status string read in the run logbook (not just a bool)
3447 if(!fLogbookEntry) {
3448 AliError("No logbook entry!");
3452 // TODO implement when HLTStatus is inserted in run logbook
3453 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
3454 //if(hltStatus == "OFF") {return kFALSE};
3459 //______________________________________________________________________________________________
3460 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
3463 // sets Shuttle temp directory
3466 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
3469 //______________________________________________________________________________________________
3470 void AliShuttle::SetShuttleLogDir(const char* logDir)
3473 // sets Shuttle log directory
3476 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);