1 /**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
18 Revision 1.74 2007/12/17 03:23:32 jgrosseo
20 added "empty preprocessor" as placeholder for Acorde in FDR
22 Revision 1.73 2007/12/14 19:31:36 acolla
23 Sending email to DCS experts is temporarily commented
25 Revision 1.72 2007/12/13 15:44:28 acolla
26 Run type added in mail sent to detector expert (eases understanding)
28 Revision 1.71 2007/12/12 14:56:14 jgrosseo
29 sending shuttle_ignore to ML also in case of 0 events
31 Revision 1.70 2007/12/12 13:45:35 acolla
32 Monalisa started in Collect() function. Alive message to monitor is sent at each Collect and every minute during preprocessor processing.
34 Revision 1.69 2007/12/12 10:06:29 acolla
35 in AliShuttle.cxx: SHUTTLE logbook is updated in case of invalid run times:
37 time_start==0 && time_end==0
39 logbook is NOT updated if time_start != 0 && time_end == 0, because it may mean that the run is still ongoing.
41 Revision 1.68 2007/12/11 10:15:17 acolla
42 Added marking SHUTTLE=DONE for invalid runs
43 (invalid start time or end time) and runs with totalEvents < 1
45 Revision 1.67 2007/12/07 19:14:36 acolla
48 Added automatic collection of new runs on a regular time basis (settable from the configuration)
50 in AliShuttleConfig: new members
52 - triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
53 - mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)
57 - logs now stored in logs/#RUN/DET_#RUN.log
59 Revision 1.66 2007/12/05 10:45:19 jgrosseo
60 changed order of arguments to TMonaLisaWriter
62 Revision 1.65 2007/11/26 16:58:37 acolla
63 Monalisa configuration added: host and table name
65 Revision 1.64 2007/11/13 16:15:47 acolla
66 DCS map is stored in a file in the temp folder where the detector is processed.
67 If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.
69 Revision 1.63 2007/11/02 10:53:16 acolla
70 Protection added to AliShuttle::CopyFileLocally
72 Revision 1.62 2007/10/31 18:23:13 acolla
73 Further development on the Shuttle:
75 - Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
76 are now built from /alice/data, e.g.:
77 /alice/data/2007/LHC07a/OCDB
79 the year and LHC period are taken from the Shuttle.
80 Raw metadata files are stored by GRP to:
81 /alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root
83 - Shuttle sends a mail to DCS experts each time DP retrieval fails.
85 Revision 1.61 2007/10/30 20:33:51 acolla
86 Improved managing of temporary folders, which weren't correctly handled.
87 Resolved bug introduced in StoreReferenceFile, which caused the SPD preprocessor to fail.
89 Revision 1.60 2007/10/29 18:06:16 acolla
91 New function StoreRunMetadataFile added to preprocessor and Shuttle interface
92 This function can be used by GRP only. It stores raw data tags merged file to the
93 raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).
97 1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
98 2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
99 has been corrected in the root package on the Shuttle machine.
101 Revision 1.59 2007/10/05 12:40:55 acolla
103 Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).
105 Revision 1.58 2007/09/28 15:27:40 acolla
107 AliDCSClient "multiSplit" option added in the DCS configuration
108 in AliDCSMessage: variable MAX_BODY_SIZE set to 500000
110 Revision 1.57 2007/09/27 16:53:13 acolla
111 Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
112 merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.
114 Revision 1.56 2007/09/14 16:46:14 jgrosseo
115 1) Connect and Close are called before and after each query, so one can
116 keep the same AliDCSClient object.
117 2) The splitting of a query is moved to GetDPValues/GetAliasValues.
118 3) Splitting interval can be specified in constructor
120 Revision 1.55 2007/08/06 12:26:40 acolla
121 Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
122 read from the run logbook.
124 Revision 1.54 2007/07/12 09:51:25 jgrosseo
125 removed duplicated log message in GetFile
127 Revision 1.53 2007/07/12 09:26:28 jgrosseo
128 updating hlt fxs base path
130 Revision 1.52 2007/07/12 08:06:45 jgrosseo
131 adding log messages in getfile... functions
132 adding not implemented copy constructor in alishuttleconfigholder
134 Revision 1.51 2007/07/03 17:24:52 acolla
135 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
137 Revision 1.50 2007/07/02 17:19:32 acolla
138 preprocessor is run in a temp directory that is removed when process is finished.
140 Revision 1.49 2007/06/29 10:45:06 acolla
141 Number of columns in MySql Shuttle logbook increased by one (HLT added)
143 Revision 1.48 2007/06/21 13:06:19 acolla
144 GetFileSources returns dummy list with 1 source if system=DCS (better than
145 returning error as it was)
147 Revision 1.47 2007/06/19 17:28:56 acolla
148 HLT updated; missing map bug removed.
150 Revision 1.46 2007/06/09 13:01:09 jgrosseo
151 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
153 Revision 1.45 2007/05/30 06:35:20 jgrosseo
154 Adding functionality to the Shuttle/TestShuttle:
155 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
156 o) Function to retrieve list of IDs for a given source (GetFileIDs)
157 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
158 Example code has been added to the TestProcessor in TestShuttle
160 Revision 1.44 2007/05/11 16:09:32 acolla
161 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
162 example: ITS/SPD/100_filename.root
164 Revision 1.43 2007/05/10 09:59:51 acolla
165 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
167 Revision 1.42 2007/05/03 08:01:39 jgrosseo
168 typo in last commit :-(
170 Revision 1.41 2007/05/03 08:00:48 jgrosseo
171 fixing log message when pp wants to skip dcs value retrieval
173 Revision 1.40 2007/04/27 07:06:48 jgrosseo
174 GetFileSources returns empty list in case of no files, but successful query
175 No mails sent in testmode
177 Revision 1.39 2007/04/17 12:43:57 acolla
178 Correction in StoreOCDB; change of text in mail to detector expert
180 Revision 1.38 2007/04/12 08:26:18 jgrosseo
183 Revision 1.37 2007/04/10 16:53:14 jgrosseo
184 redirecting sub detector stdout, stderr to sub detector log file
186 Revision 1.35 2007/04/04 16:26:38 acolla
187 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
188 2. Added missing dependency in test preprocessors.
189 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
191 Revision 1.34 2007/04/04 10:33:36 jgrosseo
192 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
193 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
195 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
197 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
199 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
201 5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
202 If you always need DCS data (like before), you do not need to implement it.
204 6) The run type has been added to the monitoring page
206 Revision 1.33 2007/04/03 13:56:01 acolla
207 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
210 Revision 1.32 2007/02/28 10:41:56 acolla
211 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
212 AliPreprocessor::GetRunType() function.
213 Added some ldap definition files.
215 Revision 1.30 2007/02/13 11:23:21 acolla
216 Moved getters and setters of Shuttle's main OCDB/Reference, local
217 OCDB/Reference, temp and log folders to AliShuttleInterface
219 Revision 1.27 2007/01/30 17:52:42 jgrosseo
220 adding monalisa monitoring
222 Revision 1.26 2007/01/23 19:20:03 acolla
223 Removed old ldif files, added TOF, MCH ldif files. Added some options in
224 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
227 Revision 1.25 2007/01/15 19:13:52 acolla
228 Moved some AliInfo to AliDebug in SendMail function
230 Revision 1.21 2006/12/07 08:51:26 jgrosseo
232 table, db names in ldap configuration
233 added GRP preprocessor
234 DCS data can also be retrieved by data point
236 Revision 1.20 2006/11/16 16:16:48 jgrosseo
237 introducing strict run ordering flag
238 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
240 Revision 1.19 2006/11/06 14:23:04 jgrosseo
241 major update (Alberto)
242 o) reading of run parameters from the logbook
243 o) online offline naming conversion
244 o) standalone DCSclient package
246 Revision 1.18 2006/10/20 15:22:59 jgrosseo
247 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
248 o) Merging Collect, CollectAll, CollectNew function
249 o) Removing implementation of empty copy constructors (declaration still there!)
251 Revision 1.17 2006/10/05 16:20:55 jgrosseo
252 adapting to new CDB classes
254 Revision 1.16 2006/10/05 15:46:26 jgrosseo
255 applying to the new interface
257 Revision 1.15 2006/10/02 16:38:39 jgrosseo
260 storing of objects that failed to be stored to the grid before
261 interfacing of shuttle status table in daq system
263 Revision 1.14 2006/08/29 09:16:05 jgrosseo
266 Revision 1.13 2006/08/15 10:50:00 jgrosseo
267 effc++ corrections (alberto)
269 Revision 1.12 2006/08/08 14:19:29 jgrosseo
270 Update to shuttle classes (Alberto)
272 - Possibility to set the full object's path in the Preprocessor's and
273 Shuttle's Store functions
274 - Possibility to extend the object's run validity in the same classes
275 ("startValidity" and "validityInfinite" parameters)
276 - Implementation of the StoreReferenceData function to store reference
277 data in a dedicated CDB storage.
279 Revision 1.11 2006/07/21 07:37:20 jgrosseo
280 last run is stored after each run
282 Revision 1.10 2006/07/20 09:54:40 jgrosseo
283 introducing status management: The processing per subdetector is divided into several steps,
284 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
285 can keep track of the number of failures and skips further processing after a certain threshold is
286 exceeded. These thresholds can be configured in LDAP.
288 Revision 1.9 2006/07/19 10:09:55 jgrosseo
289 new configuration, access to DAQ FES (Alberto)
291 Revision 1.8 2006/07/11 12:44:36 jgrosseo
292 adding parameters for extended validity range of data produced by preprocessor
294 Revision 1.7 2006/07/10 14:37:09 jgrosseo
295 small fix + todo comment
297 Revision 1.6 2006/07/10 13:01:41 jgrosseo
298 enhanced storing of last successfully processed run (alberto)
300 Revision 1.5 2006/07/04 14:59:57 jgrosseo
301 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
303 Revision 1.4 2006/06/12 09:11:16 jgrosseo
304 coding conventions (Alberto)
306 Revision 1.3 2006/06/06 14:26:40 jgrosseo
307 o) removed files that were moved to STEER
308 o) shuttle updated to follow the new interface (Alberto)
310 Revision 1.2 2006/03/07 07:52:34 hristov
311 New version (B.Yordanov)
313 Revision 1.6 2005/11/19 17:19:14 byordano
314 RetrieveDATEEntries and RetrieveConditionsData added
316 Revision 1.5 2005/11/19 11:09:27 byordano
317 AliShuttle declaration added
319 Revision 1.4 2005/11/17 17:47:34 byordano
320 TList changed to TObjArray
322 Revision 1.3 2005/11/17 14:43:23 byordano
325 Revision 1.1.1.1 2005/10/28 07:33:58 hristov
326 Initial import as subdirectory in AliRoot
328 Revision 1.2 2005/09/13 08:41:15 byordano
329 default startTime endTime added
331 Revision 1.4 2005/08/30 09:13:02 byordano
334 Revision 1.3 2005/08/29 21:15:47 byordano
340 // This class is the main manager for AliShuttle.
341 // It organizes the data retrieval from DCS and calls the
342 // interface methods of AliPreprocessor.
343 // For every detector in AliShuttleConfig (see AliShuttleConfig),
344 // data for its set of aliases is retrieved. If an AliPreprocessor
345 // is registered for this detector, then it will be used
346 // according to the schema (see AliPreprocessor).
347 // If no AliPreprocessor is registered, then the retrieved
348 // data is stored automatically to the underlying AliCDBStorage.
349 // The alias name is used for detSpec.
352 #include "AliShuttle.h"
354 #include "AliCDBManager.h"
355 #include "AliCDBStorage.h"
356 #include "AliCDBId.h"
357 #include "AliCDBRunRange.h"
358 #include "AliCDBPath.h"
359 #include "AliCDBEntry.h"
360 #include "AliShuttleConfig.h"
361 #include "DCSClient/AliDCSClient.h"
363 #include "AliPreprocessor.h"
364 #include "AliShuttleStatus.h"
365 #include "AliShuttleLogbookEntry.h"
370 #include <TTimeStamp.h>
371 #include <TObjString.h>
372 #include <TSQLServer.h>
373 #include <TSQLResult.h>
376 #include <TSystemDirectory.h>
377 #include <TSystemFile.h>
380 #include <TGridResult.h>
382 #include <TMonaLisaWriter.h>
386 #include <sys/types.h>
387 #include <sys/wait.h>
391 //______________________________________________________________________________________________
392 AliShuttle::AliShuttle(const AliShuttleConfig* config,
393 UInt_t timeout, Int_t retries):
395 fTimeout(timeout), fRetries(retries),
405 fReadTestMode(kFALSE),
406 fOutputRedirected(kFALSE)
409 // config: AliShuttleConfig used
410 // timeout: timeout used for AliDCSClient connection
411 // retries: the number of retries in case of connection error.
414 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
415 for(int iSys=0;iSys<4;iSys++) {
418 fFXSlist[iSys].SetOwner(kTRUE);
420 fPreprocessorMap.SetOwner(kTRUE);
422 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
423 fFirstUnprocessed[iDet] = kFALSE;
425 fMonitoringMutex = new TMutex();
428 //______________________________________________________________________________________________
429 AliShuttle::~AliShuttle()
435 fPreprocessorMap.DeleteAll();
436 for(int iSys=0;iSys<4;iSys++)
438 fServer[iSys]->Close();
439 delete fServer[iSys];
448 if (fMonitoringMutex)
450 delete fMonitoringMutex;
451 fMonitoringMutex = 0;
455 //______________________________________________________________________________________________
456 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
459 // Registers a new AliPreprocessor.
460 // It uses GetName() as the identifier of the preprocessor.
461 // The preprocessor is registered if there isn't any other
462 // with the same identifier (GetName()).
465 const char* detName = preprocessor->GetName();
466 if(GetDetPos(detName) < 0)
467 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
469 if (fPreprocessorMap.GetValue(detName)) {
470 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
474 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
476 //______________________________________________________________________________________________
477 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
478 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
480 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
481 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
482 // using this function. Use StoreReferenceData instead!
483 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
484 // finishes the data are transferred to the main storage (Grid).
486 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
489 //______________________________________________________________________________________________
490 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
492 // Stores a CDB object in the storage for reference data. These objects will not be available during
493 // offline reconstruction. Use this function for reference data only!
494 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
495 // finishes the data are transferred to the main storage (Grid).
497 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
500 //______________________________________________________________________________________________
501 Bool_t AliShuttle::StoreLocally(const TString& localUri,
502 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
503 Int_t validityStart, Bool_t validityInfinite)
505 // Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
506 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
507 // The parameters are:
508 // 1) Uri of the backup storage (Local)
509 // 2) the object's path.
510 // 3) the object to be stored
511 // 4) the metaData to be associated with the object
512 // 5) the validity start run number w.r.t. the current run,
513 // if the data is valid only for this run leave the default 0
514 // 6) specifies if the calibration data is valid for infinity (this means until updated),
515 // typical for calibration runs, the default is kFALSE
517 // returns kFALSE on failure, kTRUE otherwise
519 if (fTestMode & kErrorStorage)
521 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
525 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
527 Int_t firstRun = GetCurrentRun() - validityStart;
529 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
534 if(validityInfinite) {
535 lastRun = AliCDBRunRange::Infinity();
537 lastRun = GetCurrentRun();
540 // Version is set to current run, it will be used later to transfer data to Grid
541 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
543 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
544 TObjString runUsed = Form("%d", GetCurrentRun());
545 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
548 Bool_t result = kFALSE;
550 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
551 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
553 result = AliCDBManager::Instance()->GetStorage(localUri)
554 ->Put(object, id, metaData);
559 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
565 //______________________________________________________________________________________________
566 Bool_t AliShuttle::StoreOCDB()
569 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
570 // Calls underlying StoreOCDB(const TString&) function twice, for OCDB and Reference storage.
571 // Then calls CopyFilesToGrid to store reference files (and, for GRP, the run metadata file).
574 if (fTestMode & kErrorGrid)
576 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
577 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
581 Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
582 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
584 Log("SHUTTLE","StoreOCDB - Storing reference data ...");
585 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
587 Log("SHUTTLE","StoreOCDB - Storing reference files ...");
588 Bool_t resultRefFiles = CopyFilesToGrid("reference");
590 Bool_t resultMetadata = kTRUE;
591 if(fCurrentDetector == "GRP")
593 Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
594 resultMetadata = CopyFilesToGrid("metadata");
597 return resultCDB && resultRef && resultRefFiles && resultMetadata;
600 //______________________________________________________________________________________________
601 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
604 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
607 TObjArray* gridIds=0;
609 Bool_t result = kTRUE;
611 const char* type = 0;
613 if(gridURI == fgkMainCDB) {
615 localURI = fgkLocalCDB;
616 } else if(gridURI == fgkMainRefStorage) {
618 localURI = fgkLocalRefStorage;
620 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
624 AliCDBManager* man = AliCDBManager::Instance();
626 AliCDBStorage *gridSto = man->GetStorage(gridURI);
629 Form("StoreOCDB - cannot activate main %s storage", type));
633 gridIds = gridSto->GetQueryCDBList();
635 // get objects previously stored in local CDB
636 AliCDBStorage *localSto = man->GetStorage(localURI);
639 Form("StoreOCDB - cannot activate local %s storage", type));
642 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
643 // Local objects were stored with current run as Grid version!
644 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
645 localEntries->SetOwner(1);
647 // loop on local stored objects
648 TIter localIter(localEntries);
649 AliCDBEntry *aLocEntry = 0;
650 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
651 aLocEntry->SetOwner(1);
652 AliCDBId aLocId = aLocEntry->GetId();
653 aLocEntry->SetVersion(-1);
654 aLocEntry->SetSubVersion(-1);
656 // If local object is valid up to infinity we store it only if it is
657 // the first unprocessed run!
658 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
659 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
661 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
662 "there are previous unprocessed runs!",
663 fCurrentDetector.Data(), aLocId.GetPath().Data()));
668 // loop on Grid valid Id's
669 Bool_t store = kTRUE;
670 TIter gridIter(gridIds);
671 AliCDBId* aGridId = 0;
672 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
673 if(aGridId->GetPath() != aLocId.GetPath()) continue;
674 // skip all objects valid up to infinity
675 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
676 // if we get here, it means there's already some more recent object stored on Grid!
681 // If we get here, the file can be stored!
682 Bool_t storeOk = gridSto->Put(aLocEntry);
683 if(!store || storeOk){
687 Log(fCurrentDetector.Data(),
688 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
689 type, aGridId->ToString().Data()));
692 Form("StoreOCDB - Object <%s> successfully put into %s storage",
693 aLocId.ToString().Data(), type));
694 Log(fCurrentDetector.Data(),
695 Form("StoreOCDB - Object <%s> successfully put into %s storage",
696 aLocId.ToString().Data(), type));
699 // removing local file...
701 localSto->IdToFilename(aLocId, filename);
702 Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
703 RemoveFile(filename.Data());
707 Form("StoreOCDB - Grid %s storage of object <%s> failed",
708 type, aLocId.ToString().Data()));
709 Log(fCurrentDetector.Data(),
710 Form("StoreOCDB - Grid %s storage of object <%s> failed",
711 type, aLocId.ToString().Data()));
715 localEntries->Clear();
720 //______________________________________________________________________________________________
721 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
723 // clears the directory used to store reference files of a given subdetector
725 AliCDBManager* man = AliCDBManager::Instance();
726 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
727 TString localBaseFolder = sto->GetBaseFolder();
729 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
731 Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
734 begin.Form("%d_", GetCurrentRun());
736 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
740 TList* dirList = baseDir->GetListOfFiles();
743 if (!dirList) return kTRUE;
745 if (dirList->GetEntries() < 3)
751 Int_t nDirs = 0, nDel = 0;
752 TIter dirIter(dirList);
753 TSystemFile* entry = 0;
755 Bool_t success = kTRUE;
757 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
759 if (entry->IsDirectory())
762 TString fileName(entry->GetName());
763 if (!fileName.BeginsWith(begin))
769 Int_t result = gSystem->Unlink(fileName.Data());
773 Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
781 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
782 nDel, nDirs, targetDir.Data()));
793 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
797 result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
800 Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
805 result = gSystem->mkdir(targetDir, kTRUE);
808 Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
815 //______________________________________________________________________________________________
816 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
819 // Stores reference file directly (without opening it). This function stores the file locally.
821 // The file is stored under the following location:
822 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
823 // where <gridFileName> is the third parameter given to the function
826 if (fTestMode & kErrorStorage)
828 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
832 AliCDBManager* man = AliCDBManager::Instance();
833 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
835 TString localBaseFolder = sto->GetBaseFolder();
837 TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);
838 target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
840 return CopyFileLocally(localFile, target);
843 //______________________________________________________________________________________________
844 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
847 // Stores Run metadata file to the Grid, in the run folder
849 // Only GRP can call this function.
851 if (fTestMode & kErrorStorage)
853 Log(fCurrentDetector, "StoreRunMetaDataFile - In TESTMODE - Simulating error while storing locally");
857 AliCDBManager* man = AliCDBManager::Instance();
858 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
860 TString localBaseFolder = sto->GetBaseFolder();
862 // Build Run level folder
863 // folder = /alice/data/year/lhcPeriod/runNb/Raw
866 TString lhcPeriod = GetLHCPeriod();
867 if (lhcPeriod.Length() == 0)
869 Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
873 TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s",
874 localBaseFolder.Data(), GetCurrentYear(),
875 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
877 return CopyFileLocally(localFile, target);
880 //______________________________________________________________________________________________
881 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
884 // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
885 // Files are temporarily stored in the local reference storage. When the preprocessor
886 // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn
887 // (in reference or run level folders)
890 TString targetDir(target(0, target.Last('/')));
892 // try to open the target directory; create it if it does not exist
893 void* dir = gSystem->OpenDirectory(targetDir.Data());
895 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
896 Log("SHUTTLE", Form("StoreFileLocally - Can't open directory <%s>", targetDir.Data()));
901 gSystem->FreeDirectory(dir);
906 result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
909 Log("SHUTTLE", Form("StoreFileLocally - %s does not exist", localFile));
913 result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
916 Log("SHUTTLE", Form("StoreFileLocally - target file %s already exists, removing...", target.Data()));
917 if (gSystem->Unlink(target.Data()))
919 Log("SHUTTLE", Form("StoreFileLocally - Could not remove existing target file %s!", target.Data()));
924 result = gSystem->CopyFile(localFile, target);
928 Log("SHUTTLE", Form("StoreFileLocally - File %s stored locally to %s", localFile, target.Data()));
933 Log("SHUTTLE", Form("StoreFileLocally - Could not store file %s to %s! Error code = %d",
934 localFile, target.Data(), result));
942 //______________________________________________________________________________________________
943 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
946 // Transfers local files to the Grid. Local files can be reference files
947 // or run metadata file (from GRP only).
949 // According to the type (reference, metadata) the files are stored under the following location:
950 // reference --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
951 // metadata --> <run data folder>/<MetadataFileName>
954 AliCDBManager* man = AliCDBManager::Instance();
955 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
958 TString localBaseFolder = sto->GetBaseFolder();
964 if (strcmp(type, "reference") == 0)
966 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
967 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
970 TString gridBaseFolder = gridSto->GetBaseFolder();
971 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
972 begin = Form("%d_", GetCurrentRun());
974 else if (strcmp(type, "metadata") == 0)
977 TString lhcPeriod = GetLHCPeriod();
979 if (lhcPeriod.Length() == 0)
981 Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
985 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
986 localBaseFolder.Data(), GetCurrentYear(),
987 lhcPeriod.Data(), GetCurrentRun());
988 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
994 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
998 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
1002 TList* dirList = baseDir->GetListOfFiles();
1005 if (!dirList) return kTRUE;
1007 if (dirList->GetEntries() < 3)
1015 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
1020 Int_t nDirs = 0, nTransfer = 0;
1021 TIter dirIter(dirList);
1022 TSystemFile* entry = 0;
1024 Bool_t success = kTRUE;
1025 Bool_t first = kTRUE;
1027 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1029 if (entry->IsDirectory())
1032 TString fileName(entry->GetName());
1033 if (!fileName.BeginsWith(begin))
1041 // check that folder exists, otherwise create it
1042 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1050 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
1052 // TODO It does not work currently! Bug in TAliEn::Mkdir
1053 // TODO Manually fixed in local root v5-16-00
1054 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1056 Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1061 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1065 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1069 TString fullLocalPath;
1070 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1072 TString fullGridPath;
1073 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1075 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1079 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!",
1080 fullLocalPath.Data(), fullGridPath.Data()));
1081 RemoveFile(fullLocalPath);
1086 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!",
1087 fullLocalPath.Data(), fullGridPath.Data()));
1092 Log("SHUTTLE", Form("CopyFilesToGrid - %d (out of %d) files in folder %s copied to Grid.",
1093 nTransfer, nDirs, dir.Data()));
1100 //______________________________________________________________________________________________
1101 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1104 // Get folder name of reference files
1107 TString offDetStr(GetOfflineDetName(detector));
1109 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1111 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1113 dir.Form("%s/%s", base, offDetStr.Data());
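// Illustrative sketch, not part of AliShuttle: the folder rule used by GetRefFilePrefix,
// restated with std::string so it can be tried without ROOT. RefFilePrefixSketch is a
// hypothetical helper; it takes the offline detector name as a parameter instead of
// calling GetOfflineDetName().

```cpp
#include <cassert>
#include <string>

// Builds "<base>/<offline>/<detector>" for ITS, MUON and PHOS,
// and "<base>/<offline>" for every other offline detector name.
std::string RefFilePrefixSketch(const std::string& base,
                                const std::string& offlineName,
                                const std::string& detector)
{
  assert(!base.empty());  // an empty base folder would yield a path rooted at "/"
  if (offlineName == "ITS" || offlineName == "MUON" || offlineName == "PHOS")
    return base + "/" + offlineName + "/" + detector;
  return base + "/" + offlineName;
}
```

// Under these assumptions, RefFilePrefixSketch("/ref", "ITS", "SPD") gives "/ref/ITS/SPD",
// while RefFilePrefixSketch("/ref", "TPC", "TPC") gives "/ref/TPC".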
1121 //______________________________________________________________________________________________
1122 void AliShuttle::CleanLocalStorage(const TString& uri)
1125 // Called when the preprocessor is declared failed. Removes the remaining objects from the local storage.
1128 const char* type = 0;
1129 if(uri == fgkLocalCDB) {
1131 } else if(uri == fgkLocalRefStorage) {
1134 AliError(Form("Invalid storage URI: %s", uri.Data()));
1138 AliCDBManager* man = AliCDBManager::Instance();
1140 // open local storage
1141 AliCDBStorage *localSto = man->GetStorage(uri);
1144 Form("CleanLocalStorage - cannot activate local %s storage", type));
1148 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1149 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1151 AliDebug(2, Form("filename = %s", filename.Data()));
1153 Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1154 GetCurrentRun(), fCurrentDetector.Data()));
1156 RemoveFile(filename.Data());
1160 //______________________________________________________________________________________________
1161 void AliShuttle::RemoveFile(const char* filename)
1164 // removes local file
1167 TString command(Form("rm -f %s", filename));
1169 Int_t result = gSystem->Exec(command.Data());
1172 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1173 fCurrentDetector.Data(), filename));
1177 //______________________________________________________________________________________________
1178 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1181 // Reads the AliShuttleStatus from the CDB
1185 delete fStatusEntry;
1189 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1190 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1192 if (!fStatusEntry) return 0;
1193 fStatusEntry->SetOwner(1);
1195 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1197 AliError("Invalid object stored to CDB!");
1204 //______________________________________________________________________________________________
1205 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1208 // writes the status for one subdetector
1212 delete fStatusEntry;
1216 Int_t run = GetCurrentRun();
1218 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1220 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1221 fStatusEntry->SetOwner(1);
1223 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1226 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1227 fCurrentDetector.Data(), run));
1236 //______________________________________________________________________________________________
1237 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1240 // changes the AliShuttleStatus for the given detector and run to the given status
1244 AliError("UNEXPECTED: fStatusEntry empty");
1248 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1251 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1255 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1256 fCurrentDetector.Data(),
1257 status->GetStatusName(),
1258 status->GetStatusName(newStatus));
1259 Log("SHUTTLE", actionStr);
1260 SetLastAction(actionStr);
1262 status->SetStatus(newStatus);
1263 if (increaseCount) status->IncreaseCount();
1265 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1270 //______________________________________________________________________________________________
1271 void AliShuttle::SendMLInfo()
1274 // sends ML information about the current status of the current detector being processed
1277 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1280 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1284 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1285 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1288 mlList.Add(&mlStatus);
1289 mlList.Add(&mlRetryCount);
1292 mlID.Form("%d", GetCurrentRun());
1293 fMonaLisa->SendParameters(&mlList, mlID);
1296 //______________________________________________________________________________________________
1297 Bool_t AliShuttle::ContinueProcessing()
1299 // this function reads the AliShuttleStatus information from CDB and
1300 // checks if the processing should be continued
1301 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1303 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1305 AliPreprocessor* aPreprocessor =
1306 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1309 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1313 AliShuttleLogbookEntry::Status entryStatus =
1314 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1316 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1317 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1318 fCurrentDetector.Data(),
1319 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1323 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1325 // check if current run is first unprocessed run for current detector
1326 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1327 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1329 if (fTestMode == kNone)
1331 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
1332 " but this is not the first unprocessed run!", fCurrentDetector.Data()));
1337 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
1338 "Although %s requires strict run ordering "
1339 "and this is not the first unprocessed run, "
1340 "the SHUTTLE continues", fCurrentDetector.Data()));
1344 AliShuttleStatus* status = ReadShuttleStatus();
1347 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1348 fCurrentDetector.Data()));
1349 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1350 return WriteShuttleStatus(status);
1353 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1354 // If it happens it may mean Logbook updating failed... let's do it now!
1355 if (status->GetStatus() == AliShuttleStatus::kDone ||
1356 status->GetStatus() == AliShuttleStatus::kFailed){
1357 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1358 fCurrentDetector.Data(),
1359 status->GetStatusName(status->GetStatus())));
1360 UpdateShuttleLogbook(fCurrentDetector.Data(),
1361 status->GetStatusName(status->GetStatus()));
1365 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1367 Form("ContinueProcessing - %s: Grid storage of one or more "
1368 "objects failed. Trying again now",
1369 fCurrentDetector.Data()));
1370 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1372 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1373 "successfully stored into main storage",
1374 fCurrentDetector.Data()));
1377 Form("ContinueProcessing - %s: Grid storage failed again",
1378 fCurrentDetector.Data()));
1379 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1384 // if we get here, there is a restart
1385 Bool_t cont = kFALSE;
1388 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1389 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1390 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1391 status->GetCount(), status->GetStatusName()));
1392 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1393 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1395 // there may still be objects in local OCDB and reference storage
1396 // and FXS databases may be not updated: do it now!
1398 // TODO Currently disabled, we want to keep files in case of failure!
1399 // CleanLocalStorage(fgkLocalCDB);
1400 // CleanLocalStorage(fgkLocalRefStorage);
1401 // UpdateTableFailCase();
1403 // Send mail to detector expert!
1404 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...",
1405 fCurrentDetector.Data()));
1407 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1408 fCurrentDetector.Data()));
1411 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1412 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1413 status->GetStatusName(), status->GetCount()));
1414 Bool_t increaseCount = kTRUE;
1415 if (status->GetStatus() == AliShuttleStatus::kDCSError ||
1416 status->GetStatus() == AliShuttleStatus::kDCSStarted)
1417 increaseCount = kFALSE;
1419 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
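// Illustrative sketch, not part of AliShuttle: the final restart branch of
// ContinueProcessing as a pure function. SketchStatus, RestartDecision and DecideRestart
// are hypothetical names; the earlier cases (first processing, already DONE/FAILED,
// store error) are deliberately left out.

```cpp
#include <cassert>

enum class SketchStatus { kStarted, kDCSStarted, kDCSError, kPPError };

struct RestartDecision {
  bool giveUp;         // true: mark the detector FAILED and stop retrying
  bool increaseCount;  // true: this restart counts against the retry limit
};

// Gives up once the retry count has reached the limit. Otherwise restarts,
// and does not count restarts that follow an abort in the DCS stage.
RestartDecision DecideRestart(SketchStatus last, int count, int maxRetries)
{
  assert(maxRetries >= 0);
  if (count >= maxRetries) return {true, false};
  const bool dcsAbort =
      (last == SketchStatus::kDCSError || last == SketchStatus::kDCSStarted);
  return {false, !dcsAbort};
}
```

// With maxRetries = 3, a detector that aborted with kPPError at count 1 is restarted and
// its count increases; the same detector at count 3 is given up on.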
1426 //______________________________________________________________________________________________
1427 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1430 // Makes data retrieval for all detectors in the configuration.
1431 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1432 // (Unprocessed, Inactive, Failed or Done).
1433 // Returns kFALSE if an error occurred and kTRUE otherwise
1436 if (!entry) return kFALSE;
1438 fLogbookEntry = entry;
1440 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1443 // Send the information to ML
1444 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1445 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1448 mlList.Add(&mlStatus);
1449 mlList.Add(&mlRunType);
1452 mlID.Form("%d", GetCurrentRun());
1453 fMonaLisa->SendParameters(&mlList, mlID);
1455 if (fLogbookEntry->IsDone())
1457 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1458 UpdateShuttleLogbook("shuttle_done");
1463 // read test mode if flag is set
1467 TString logEntry(entry->GetRunParameter("log"));
1468 //printf("log entry = %s\n", logEntry.Data());
1469 TString searchStr("Testmode: ");
1470 Int_t pos = logEntry.Index(searchStr.Data());
1471 //printf("%d\n", pos);
1474 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1475 //printf("%s\n", subStr.String().Data());
1476 TString newStr(subStr.Data());
1477 TObjArray* token = newStr.Tokenize(' ');
1481 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1484 Int_t testMode = tmpStr->String().Atoi();
1487 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1488 SetTestMode((TestMode) testMode);
1496 fLogbookEntry->Print("all");
1499 Bool_t hasError = kFALSE;
1501 // Set the CDB and Reference folders according to the year and LHC period
1502 TString lhcPeriod(GetLHCPeriod());
1503 if (lhcPeriod.Length() == 0)
1505 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1509 if (fgkMainCDB.Length() == 0)
1510 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache",
1511 GetCurrentYear(), lhcPeriod.Data());
1513 if (fgkMainRefStorage.Length() == 0)
1514 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache",
1515 GetCurrentYear(), lhcPeriod.Data());
1517 // Loop on detectors in the configuration
1518 TIter iter(fConfig->GetDetectors());
1519 TObjString* aDetector = 0;
1521 Bool_t first = kTRUE;
1523 while ((aDetector = (TObjString*) iter.Next()))
1525 fCurrentDetector = aDetector->String();
1527 if (ContinueProcessing() == kFALSE) continue;
1531 // only call QueryCDB when needed and only once
1532 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1533 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1534 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1535 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1539 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START ******",
1540 GetCurrentRun(), aDetector->GetName()));
1542 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1544 Log(fCurrentDetector.Data(), "Process - Starting processing");
1550 Log("SHUTTLE", "Process - ERROR: Forking failed");
1555 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1556 GetCurrentRun(), aDetector->GetName()));
1558 Long_t begin = time(0);
1560 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1561 while (waitpid(pid, &status, WNOHANG) == 0)
1563 Long_t expiredTime = time(0) - begin;
1565 if (expiredTime > fConfig->GetPPTimeOut())
1568 tmp.Form("Process - Process of %s timed out. "
1569 "Run time: %ld seconds. Killing...",
1570 fCurrentDetector.Data(), expiredTime);
1571 Log("SHUTTLE", tmp);
1572 Log(fCurrentDetector, tmp);
1576 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1579 gSystem->Sleep(1000);
1583 gSystem->Sleep(1000);
1586 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1587 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1590 Log("SHUTTLE", Form("Process - Error: "
1591 "Could not open pipe to %s", checkStr.Data()));
1596 if (!fgets(buffer, 100, pipe))
1598 Log("SHUTTLE", "Process - Error: ps did not return anything");
1599 gSystem->ClosePipe(pipe);
1602 gSystem->ClosePipe(pipe);
1604 //Log("SHUTTLE", Form("ps returned %s", buffer));
1607 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1609 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1613 if (expiredTime % 60 == 0)
1615 Log("SHUTTLE", Form("Process - %s: Checking process. "
1616 "Run time: %ld seconds - Memory consumption: %d KB",
1617 fCurrentDetector.Data(), expiredTime, mem));
1621 if (mem > fConfig->GetPPMaxMem())
1624 tmp.Form("Process - Process exceeds maximum allowed memory "
1625 "(%d KB > %d KB). Killing...",
1626 mem, fConfig->GetPPMaxMem());
1627 Log("SHUTTLE", tmp);
1628 Log(fCurrentDetector, tmp);
1632 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1635 gSystem->Sleep(1000);
1640 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1641 GetCurrentRun(), aDetector->GetName()));
1643 if (WIFEXITED(status))
1645 Int_t returnCode = WEXITSTATUS(status);
1647 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1650 if (returnCode == 0) hasError = kTRUE; // the client exits with its success flag, so 0 means failure
1656 Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1657 aDetector->GetName()));
1659 Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1661 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1663 Log("SHUTTLE", "Process - Could not freopen stdout");
1667 fOutputRedirected = kTRUE;
1668 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1669 Log("SHUTTLE", "Process - Could not redirect stderr");
1673 TString wd = gSystem->WorkingDirectory();
1674 TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(),
1675 fCurrentDetector.Data(), GetCurrentRun());
1677 Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1678 if (!result) // temp dir already exists!
1680 Log(fCurrentDetector.Data(),
1681 Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1682 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1685 if (gSystem->mkdir(tmpDir.Data(), 1))
1687 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1691 if (!gSystem->ChangeDirectory(tmpDir.Data()))
1693 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1697 Bool_t success = ProcessCurrentDetector();
1699 gSystem->ChangeDirectory(wd.Data());
1701 if (success) // Preprocessor finished successfully!
1703 // remove temporary folder
1704 // temporarily commented out (JF)
1705 //gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1707 // Update time_processed field in FXS DB
1708 if (UpdateTable() == kFALSE)
1709 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1710 fCurrentDetector.Data()));
1712 // Transfer the data from local storage to main storage (Grid)
1713 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1714 if (StoreOCDB() == kFALSE)
1717 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1718 GetCurrentRun(), aDetector->GetName()));
1719 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1723 Form("\t\t\t****** run %d - %s: DONE ******",
1724 GetCurrentRun(), aDetector->GetName()));
1725 UpdateShuttleStatus(AliShuttleStatus::kDone);
1726 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1731 Form("\t\t\t****** run %d - %s: PP ERROR ******",
1732 GetCurrentRun(), aDetector->GetName()));
1735 for (UInt_t iSys=0; iSys<3; iSys++)
1737 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1740 Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1741 GetCurrentRun(), aDetector->GetName(), success));
1743 // the client exits here
1744 gSystem->Exit(success);
1746 AliError("We should never get here!!!");
1750 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1753 //check if shuttle is done for this run, if so update logbook
1754 TObjArray checkEntryArray;
1755 checkEntryArray.SetOwner(1);
1756 TString whereClause = Form("where run=%d", GetCurrentRun());
1757 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) ||
1758 checkEntryArray.GetEntries() == 0) {
1759 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1761 return hasError == kFALSE;
1764 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1765 (checkEntryArray.At(0));
1769 if (checkEntry->IsDone())
1771 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1772 UpdateShuttleLogbook("shuttle_done");
1776 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1778 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1780 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1781 checkEntry->GetRun(), GetDetName(iDet)));
1782 fFirstUnprocessed[iDet] = kFALSE;
1790 return hasError == kFALSE;
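// Illustrative sketch, not part of AliShuttle: the fork / waitpid(WNOHANG) monitoring
// pattern used in Process, reduced to a timeout check only (the memory check via ps is
// omitted). RunWithTimeout is a hypothetical helper and assumes a POSIX system.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs job() in a child process and polls it without blocking.
// Returns the child's exit code, -1 if it was killed on timeout or ended
// abnormally, and -2 if fork() failed.
int RunWithTimeout(const std::function<int()>& job, std::chrono::milliseconds timeout)
{
  assert(timeout.count() >= 0);
  pid_t pid = fork();
  if (pid < 0) return -2;
  if (pid == 0) _exit(job());  // child: report the result through the exit status

  const auto begin = std::chrono::steady_clock::now();
  int status = 0;  // plain int on purpose, as required by waitpid
  for (;;) {
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == pid) break;      // child has terminated
    if (r < 0) return -1;     // waitpid itself failed
    if (std::chrono::steady_clock::now() - begin > timeout) {
      kill(pid, SIGKILL);
      waitpid(pid, &status, 0);  // reap the killed child
      return -1;
    }
    usleep(10000);  // poll every 10 ms
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

// As in Process, a caller following the convention "exit code = success flag" would treat
// a return value of 0 as a failed job and 1 as a successful one.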
1793 //______________________________________________________________________________________________
1794 Bool_t AliShuttle::ProcessCurrentDetector()
1797 // Makes data retrieval just for a specific detector (fCurrentDetector).
1798 // There should be a configuration for this detector.
1800 Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d",
1801 fCurrentDetector.Data(), GetCurrentRun()));
1803 TString wd = gSystem->WorkingDirectory();
1805 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1808 gSystem->ChangeDirectory(wd.Data());
1810 TMap* dcsMap = new TMap();
1812 // call preprocessor
1813 AliPreprocessor* aPreprocessor =
1814 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1816 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1818 Bool_t processDCS = aPreprocessor->ProcessDCS();
1822 Log(fCurrentDetector, "ProcessCurrentDetector -"
1823 " The preprocessor requested to skip the retrieval of DCS values");
1825 else if (fTestMode & kSkipDCS)
1827 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1829 else if (fTestMode & kErrorDCS)
1831 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1832 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1833 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1838 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1840 // Query DCS archive
1841 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1843 for (int iServ=0; iServ<nServers; iServ++)
1846 TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1847 Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1848 Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1850 Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1851 " Querying DCS Amanda server %s:%d (%d of %d)",
1852 host.Data(), port, iServ+1, nServers));
1857 if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1859 aliasMap = GetValueSet(host, port,
1860 fConfig->GetDCSAliases(fCurrentDetector, iServ),
1861 kAlias, multiSplit);
1864 Log(fCurrentDetector,
1865 Form("ProcessCurrentDetector -"
1866 " Error retrieving DCS aliases from server %s."
1867 " Sending mail to DCS experts!", host.Data()));
1868 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1870 //if (!SendMailToDCS())
1871 // Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1878 if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1880 dpMap = GetValueSet(host, port,
1881 fConfig->GetDCSDataPoints(fCurrentDetector, iServ),
1885 Log(fCurrentDetector,
1886 Form("ProcessCurrentDetector -"
1887 " Error retrieving DCS data points from server %s."
1888 " Sending mail to DCS experts!", host.Data()));
1889 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1891 //if (!SendMailToDCS())
1892 // Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1894 if (aliasMap) delete aliasMap;
1900 // merge aliasMap and dpMap into dcsMap
1902 TIter iter(aliasMap);
1903 TObjString* key = 0;
1904 while ((key = (TObjString*) iter.Next()))
1905 dcsMap->Add(key, aliasMap->GetValue(key->String()));
1907 aliasMap->SetOwner(kFALSE);
1913 TObjString* key = 0;
1914 while ((key = (TObjString*) iter.Next()))
1915 dcsMap->Add(key, dpMap->GetValue(key->String()));
1917 dpMap->SetOwner(kFALSE);
1923 // save map into file, to help debugging in case of preprocessor error
1924 TFile* f = TFile::Open("DCSMap.root","recreate");
1926 dcsMap->Write("DCSMap", TObject::kSingleKey);
1930 // DCS Archive DB processing successful. Call Preprocessor!
1931 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1933 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1935 if (returnValue > 0) // Preprocessor error!
1937 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1938 "Preprocessor failed. Process returned %d.", returnValue));
1939 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1940 dcsMap->DeleteAll();
1946 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1947 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1948 fCurrentDetector.Data()));
1950 dcsMap->DeleteAll();
1956 //______________________________________________________________________________________________
1957 void AliShuttle::CountOpenRuns()
1959 // Queries the DAQ Shuttle logbook and sends the number of open runs to ML
1961 // check connection; connect if needed
1966 sqlQuery = Form("select count(*) from %s where shuttle_done=0", fConfig->GetShuttlelbTable());
1968 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1970 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1974 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1976 if (aResult->GetRowCount() == 0) {
1977 AliError(Form("No result for query %s received", sqlQuery.Data()));
1981 if (aResult->GetFieldCount() != 1) {
1982 AliError(Form("Invalid field count for query %s received", sqlQuery.Data()));
1986 TSQLRow* aRow = aResult->Next();
1988 AliError(Form("Could not receive result of query %s", sqlQuery.Data()));
1992 TString result(aRow->GetField(0), aRow->GetFieldLength(0));
1993 Int_t count = result.Atoi();
1995 Log("SHUTTLE", Form("%d unprocessed runs", count));
2000 TMonaLisaValue mlStatus("SHUTTLE_openruns", count);
2003 mlList.Add(&mlStatus);
2005 fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
2008 //______________________________________________________________________________________________
2009 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
2012 // Queries the DAQ Shuttle logbook and fills the detector status objects.
2013 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
2016 entries.SetOwner(1);
2018 // check connection; connect if needed
2019 if (!Connect(3)) return kFALSE;
2022 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2024 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2026 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
2030 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
2032 if(aResult->GetRowCount() == 0) {
2033 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
2038 // TODO Check field count!
2039 const UInt_t nCols = 23;
2040 if (aResult->GetFieldCount() != (Int_t) nCols) {
2041 Log("SHUTTLE", "Invalid SQL result field number!");
2047 while ((aRow = aResult->Next())) {
2048 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
2049 Int_t run = runString.Atoi();
2051 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
2055 // loop on detectors
2056 for(UInt_t ii = 0; ii < nCols; ii++)
2057 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2059 entries.AddLast(entry);
2067 //______________________________________________________________________________________________
2068 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2071 // Retrieves the run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
2074 // check connection; connect if needed
2079 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2081 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2083 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
2087 if (aResult->GetRowCount() == 0) {
2088 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2093 if (aResult->GetRowCount() > 1) {
2094 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2095 "more than one entry in DAQ Logbook for run %d!", run));
2100 TSQLRow* aRow = aResult->Next();
2103 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2108 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2110 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2111 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2113 UInt_t startTime = entry->GetStartTime();
2114 UInt_t endTime = entry->GetEndTime();
2116 // if (!startTime || !endTime || startTime > endTime)
2119 // Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d. Skipping!",
2120 // run, startTime, endTime));
2122 // Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2123 // fLogbookEntry = entry;
2124 // if (!UpdateShuttleLogbook("shuttle_done"))
2126 // AliError(Form("Could not update logbook for run %d !", run));
2128 // fLogbookEntry = 0;
2139 Form("QueryRunParameters - Invalid parameters for Run %d: "
2140 "startTime = %d, endTime = %d. Skipping!",
2141 run, startTime, endTime));
2143 Log("SHUTTLE", Form("Marking run %d as ignored by SHUTTLE", run));
2144 fLogbookEntry = entry;
2145 if (!UpdateShuttleLogbook("shuttle_ignored"))
2147 AliError(Form("Could not update logbook for run %d !", run));
2157 if (startTime && !endTime)
2159 // TODO Here we don't mark SHUTTLE done, because this may mean
2160 //the run is still ongoing!!
2162 Form("QueryRunParameters - Invalid parameters for Run %d: "
2163 "startTime = %d, endTime = %d. Skipping (Shuttle won't be marked as DONE)!",
2164 run, startTime, endTime));
2166 //Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2167 //fLogbookEntry = entry;
2168 //if (!UpdateShuttleLogbook("shuttle_done"))
2170 // AliError(Form("Could not update logbook for run %d !", run));
2172 //fLogbookEntry = 0;
2180 if (startTime && endTime && (startTime > endTime))
2183 Form("QueryRunParameters - Invalid parameters for Run %d: "
2184 "startTime = %d, endTime = %d. Skipping!",
2185 run, startTime, endTime));
2187 Log("SHUTTLE", Form("Marking run %d as ignored by SHUTTLE", run));
2188 fLogbookEntry = entry;
2189 if (!UpdateShuttleLogbook("shuttle_ignored"))
2191 AliError(Form("Could not update logbook for run %d !", run));
2201 TString totEventsStr = entry->GetRunParameter("totalEvents");
2202 Int_t totEvents = totEventsStr.Atoi();
2206 Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));
2208 Log("SHUTTLE", Form("Marking run %d as ignored by SHUTTLE", run));
2209 fLogbookEntry = entry;
2210 if (!UpdateShuttleLogbook("shuttle_ignored"))
2212 AliError(Form("Could not update logbook for run %d !", run));
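// Illustrative sketch, not part of AliShuttle: the run validity checks of
// QueryRunParameters as a pure function. RunAction and ClassifyRun are hypothetical names.
// The condition guarding the first branch is not visible in this listing; the sketch
// assumes it fires on a missing start time (the revision log mentions
// time_start==0 && time_end==0). The "fewer than 1 event" rule is taken from the
// revision log as well.

```cpp
enum class RunAction {
  kProcess,          // run looks valid, hand it to the preprocessors
  kIgnore,           // invalid run, logbook is updated with shuttle_ignored
  kSkipMaybeOngoing  // start set but no end: skip, leave the logbook untouched
};

RunAction ClassifyRun(unsigned startTime, unsigned endTime, int totalEvents)
{
  if (startTime == 0) return RunAction::kIgnore;          // assumption, see note above
  if (endTime == 0) return RunAction::kSkipMaybeOngoing;  // run may still be ongoing
  if (startTime > endTime) return RunAction::kIgnore;     // inconsistent times
  if (totalEvents < 1) return RunAction::kIgnore;         // empty run
  return RunAction::kProcess;
}
```

// Keeping the "maybe ongoing" case separate matters because it is the only invalid
// combination for which the logbook is not updated.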
2228 //______________________________________________________________________________________________
2229 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2230 DCSType type, Int_t multiSplit)
2232 // Retrieves the values of all requested entries from the DCS server
2233 // host, port: TSocket connection parameters
2234 // entries: list of alias or data point names
2235 // type: kAlias or kDP
2236 // returns a TMap of values, 0 on failure
2238 AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2243 result = client.GetAliasValues(entries, GetCurrentStartTime(),
2244 GetCurrentEndTime());
2246 else if (type == kDP)
2248 result = client.GetDPValues(entries, GetCurrentStartTime(),
2249 GetCurrentEndTime());
2254 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2255 client.GetErrorString(client.GetResultErrorCode())));
2256 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)
2257 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2258 client.GetServerError().Data()));
2266 //______________________________________________________________________________________________
2267 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2268 const char* id, const char* source)
2270 // Get calibration file from file exchange servers
2271 // First queries the FXS database for the file name, using the run, detector, id and source info
2272 // then calls RetrieveFile(filename) for actual copy to local disk
2273 // run: current run being processed (given by Logbook entry fLogbookEntry)
2274 // detector: the Preprocessor name
2275 // id: provided as a parameter by the Preprocessor
2276 // source: provided by the Preprocessor through GetFileSources function
2278 // check if test mode should simulate a FXS error
2279 if (fTestMode & kErrorFXSFiles)
2281 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2285 // check connection; connect if not yet connected
2286 if (!Connect(system))
2288 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2292 // Query preparation
2293 TString sourceName(source);
2295 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2296 fConfig->GetFXSdbTable(system));
2297 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2298 GetCurrentRun(), detector, id);
2302 whereClause += Form(" and DAQsource=\"%s\"", source);
2304 else if (system == kDCS)
2308 else if (system == kHLT)
2310 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2314 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2316 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2319 TSQLResult* aResult = 0;
2320 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2322 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
2323 GetSystemName(system), id, sourceName.Data()));
2327 if(aResult->GetRowCount() == 0)
2330 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
2331 GetSystemName(system), id, sourceName.Data()));
2336 if (aResult->GetRowCount() > 1) {
2338 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
2339 GetSystemName(system), id, sourceName.Data()));
2344 if (aResult->GetFieldCount() != nFields) {
2346 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
2347 GetSystemName(system), id, sourceName.Data()));
2352 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2355 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
2356 GetSystemName(system), id, sourceName.Data()));
2361 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2362 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2363 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2368 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2369 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2371 // retrieved file is renamed to make it unique
2372 TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2373 GetShuttleTempDir(), detector, GetCurrentRun(),
2374 GetSystemName(system), detector, GetCurrentRun(),
2375 id, sourceName.Data());
2378 // file retrieval from FXS
2379 UInt_t nRetries = 0;
2380 UInt_t maxRetries = 3;
2381 Bool_t result = kFALSE;
2383 // copy!! if successful TSystem::Exec returns 0
2384 while(nRetries++ < maxRetries) {
2385 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
2386 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2389 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
2390 filePath.Data(), GetSystemName(system)));
2394 if (fileChecksum.Length()>0)
2396 // compare md5sum of local file with the one stored in the FXS DB
2397 Int_t md5Comp = gSystem->Exec(Form("md5sum %s |grep %s > /dev/null 2>&1",
2398 localFileName.Data(), fileChecksum.Data()));
2402 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
2408 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2409 filePath.Data(), GetSystemName(system)));
2414 if(!result) return 0;
2416 fFXSCalled[system]=kTRUE;
2417 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2418 fFXSlist[system].Add(fileParams);
2420 static TString staticLocalFileName;
2421 staticLocalFileName.Form("%s", localFileName.Data());
2423 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2424 "source %s from %s to %s", id, source,
2425 GetSystemName(system), localFileName.Data()));
2427 return staticLocalFileName.Data();
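// Illustration only: the copy loop in GetFile tries RetrieveFile up to maxRetries
// times and accepts the file once the copy (and, if a checksum is stored, the md5sum
// comparison) succeeds. A minimal standalone sketch of that retry pattern follows.
// RetryUntilSuccess and the copyAndVerify callback are hypothetical names, not part
// of AliShuttle.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: call copyAndVerify up to maxRetries times and stop at the
// first success. Returns the 1-based attempt that succeeded, or 0 if all failed.
unsigned int RetryUntilSuccess(unsigned int maxRetries,
                               const std::function<bool()>& copyAndVerify)
{
  for (unsigned int attempt = 1; attempt <= maxRetries; ++attempt) {
    if (copyAndVerify()) return attempt;
  }
  return 0;
}
```

// In GetFile the callback would roughly correspond to "RetrieveFile succeeded and
// the checksum matched"; a return value of 0 maps to the final "return 0" path.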
2430 //______________________________________________________________________________________________
2431 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2434 // Copies file from FXS to local Shuttle machine
2437 // check temp directory: trying to cd to temp; if it does not exist, create it
2438 AliDebug(2, Form("Copy file %s from %s FXS into %s",
2439 fxsFileName, GetSystemName(system), localFileName));
2441 TString tmpDir(localFileName);
2443 tmpDir = tmpDir(0,tmpDir.Last('/'));
2445 Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
2446 if (noDir) // temp dir does not exist!
2448 if (gSystem->mkdir(tmpDir.Data(), 1))
2450 Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2455 TString baseFXSFolder;
2458 baseFXSFolder = "FES/";
2460 else if (system == kDCS)
2464 else if (system == kHLT)
2466 baseFXSFolder = "/opt/FXS/";
2470 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2471 fConfig->GetFXSPort(system),
2472 fConfig->GetFXSUser(system),
2473 fConfig->GetFXSHost(system),
2474 baseFXSFolder.Data(),
2478 AliDebug(2, Form("%s",command.Data()));
2480 Bool_t result = (gSystem->Exec(command.Data()) == 0);
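// Illustration only: how the scp command line above is assembled from the FXS
// configuration. This is a standalone sketch using std::string; BuildScpCommand and
// all argument values in the usage note are hypothetical. The base folder is passed
// in by the caller because the per-system values are set in the branches above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the format "scp -oPort=%d -2 %s@%s:%s%s %s".
std::string BuildScpCommand(int port, const std::string& user, const std::string& host,
                            const std::string& baseFolder, const std::string& remoteFile,
                            const std::string& localFile)
{
  return "scp -oPort=" + std::to_string(port) + " -2 " + user + "@" + host + ":" +
         baseFolder + remoteFile + " " + localFile;
}
```

// Example (made-up values): BuildScpCommand(22, "shuttle", "fxs.example.org", "FES/",
// "run1/file.root", "/tmp/file.shuttle"). Paths containing spaces or shell
// metacharacters would need quoting before being handed to a shell.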
2485 //______________________________________________________________________________________________
2486 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2489 // Get sources producing the condition file Id from file exchange servers
2490 // if id is NULL all sources are returned (distinct)
2493 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
2495 // check if test mode should simulate a FXS error
2496 if (fTestMode & kErrorFXSSources)
2498 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2504 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2505 TList *list = new TList();
2507 list->Add(new TObjString(" "));
2511 // check connection; connect if not yet connected
2512 if (!Connect(system))
2514 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2518 TString sourceName; // empty unless set for DAQ or HLT below
2521 sourceName = "DAQsource";
2522 } else if (system == kHLT)
2524 sourceName = "DDLnumbers";
2527 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2528 TString whereClause = Form("run=%d and detector=\"%s\"",
2529 GetCurrentRun(), detector);
2531 whereClause += Form(" and fileId=\"%s\"", id);
2532 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2534 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2537 TSQLResult* aResult;
2538 aResult = fServer[system]->Query(sqlQuery);
2540 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2541 GetSystemName(system), id));
2545 TList *list = new TList();
2548 if (aResult->GetRowCount() == 0)
2551 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2556 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2559 while ((aRow = aResult->Next()))
2562 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2563 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2564 list->Add(new TObjString(source));
2573 //______________________________________________________________________________________________
2574 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2577 // Get all ids of condition files produced by a given source from file exchange servers
2580 Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2582 // check if test mode should simulate a FXS error
2583 if (fTestMode & kErrorFXSSources)
2585 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2589 // check connection; connect if not yet connected
2590 if (!Connect(system))
2592 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2596 TString sourceName; // empty unless set for DAQ or HLT below
2599 sourceName = "DAQsource";
2600 } else if (system == kHLT)
2602 sourceName = "DDLnumbers";
2605 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2606 TString whereClause = Form("run=%d and detector=\"%s\"",
2607 GetCurrentRun(), detector);
2608 if (sourceName.Length() > 0 && source)
2609 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2610 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2612 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2615 TSQLResult* aResult;
2616 aResult = fServer[system]->Query(sqlQuery);
2618 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2619 GetSystemName(system), source));
2623 TList *list = new TList();
2626 if (aResult->GetRowCount() == 0)
2629 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2634 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2638 while ((aRow = aResult->Next()))
2641 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2642 AliDebug(2, Form("fileId = %s", id.Data()));
2643 list->Add(new TObjString(id));
2652 //______________________________________________________________________________________________
2653 Bool_t AliShuttle::Connect(Int_t system)
2655 // Connect to the MySQL server hosting the given system's FXS database
2656 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2659 // check connection: if already connected return
2660 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2662 TString dbHost, dbUser, dbPass, dbName;
2664 if (system < 3) // FXS db servers
2666 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2667 dbUser = fConfig->GetFXSdbUser(system);
2668 dbPass = fConfig->GetFXSdbPass(system);
2669 dbName = fConfig->GetFXSdbName(system);
2670 } else { // Run & Shuttle logbook servers
2671 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2672 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2673 dbUser = fConfig->GetDAQlbUser();
2674 dbPass = fConfig->GetDAQlbPass();
2675 dbName = fConfig->GetDAQlbDB();
2678 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2679 if (!fServer[system] || !fServer[system]->IsConnected()) {
2682 AliError(Form("Can't establish connection to FXS database for %s",
2683 AliShuttleInterface::GetSystemName(system)));
2685 AliError("Can't establish connection to Run logbook.");
2687 if(fServer[system]) delete fServer[system];
2692 TSQLResult* aResult=0;
2695 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2698 aResult = fServer[kDCS]->GetTables(dbName.Data());
2701 aResult = fServer[kHLT]->GetTables(dbName.Data());
2704 aResult = fServer[3]->GetTables(dbName.Data());
2712 //______________________________________________________________________________________________
2713 Bool_t AliShuttle::UpdateTable()
2716 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2719 Bool_t result = kTRUE;
2721 for (UInt_t system=0; system<3; system++)
2723 if(!fFXSCalled[system]) continue;
2725 // check connection; connect if not yet connected
2726 if (!Connect(system))
2728 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2733 TTimeStamp now; // now
2735 // Loop on FXS list entries
2736 TIter iter(&fFXSlist[system]);
2737 TObjString *aFXSentry=0;
2738 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2740 TString aFXSentrystr = aFXSentry->String();
2741 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2742 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2744 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2745 GetSystemName(system), aFXSentrystr.Data()));
2746 if(aFXSarray) delete aFXSarray;
2750 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2751 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2753 TString whereClause;
2756 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2757 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2759 else if (system == kDCS)
2761 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2762 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2764 else if (system == kHLT)
2766 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2767 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2772 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2773 now.GetSec(), whereClause.Data());
2775 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2778 TSQLResult* aResult;
2779 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2782 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2783 GetSystemName(system), sqlQuery.Data()));
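// Illustration only: GetFile stores each retrieved file as "<id>#!?!#<source>" and
// UpdateTable splits it again with TString::Tokenize. Tokenize treats its argument
// as a set of delimiter characters, so an id or source containing '#', '!' or '?'
// would produce more than two tokens and end up in the error branch above. The
// standalone sketch below splits on the exact separator string instead.
// SplitIdAndSource is a hypothetical helper, not part of AliShuttle.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: split entry at the first exact occurrence of separator.
// Returns false (and leaves out untouched) if the separator is absent.
bool SplitIdAndSource(const std::string& entry, const std::string& separator,
                      std::pair<std::string, std::string>& out)
{
  const std::string::size_type pos = entry.find(separator);
  if (pos == std::string::npos) return false;
  out.first = entry.substr(0, pos);
  out.second = entry.substr(pos + separator.size());
  return true;
}
```

// With separator "#!?!#", the entry "a#b#!?!#c" splits into id "a#b" and source "c",
// whereas a character-set tokenizer would return three tokens for the same input.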
2794 //______________________________________________________________________________________________
2795 Bool_t AliShuttle::UpdateTableFailCase()
2797 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2798 // this is called in case the preprocessor is declared failed for the current run, because
2799 // the fields are updated only in case of success
2801 Bool_t result = kTRUE;
2803 for (UInt_t system=0; system<3; system++)
2805 // check connection; connect if not yet connected
2806 if (!Connect(system))
2808 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2809 GetSystemName(system)));
2814 TTimeStamp now; // now
2816 // Update all rows of the current run and detector (no per-file loop here)
2818 TString whereClause = Form("where run=%d and detector=\"%s\";",
2819 GetCurrentRun(), fCurrentDetector.Data());
2822 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2823 now.GetSec(), whereClause.Data());
2825 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2828 TSQLResult* aResult;
2829 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2832 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2833 GetSystemName(system), sqlQuery.Data()));
2843 //______________________________________________________________________________________________
2844 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2847 // Update Shuttle logbook filling detector or shuttle_done column
2848 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2851 // check connection; connect if not yet connected
2853 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2857 TString detName(detector);
2859 if (detName == "shuttle_done" || detName == "shuttle_ignored")
2861 setClause = "set shuttle_done=1";
2863 if (detName == "shuttle_done")
2865 // Send the information to ML
2866 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2869 mlList.Add(&mlStatus);
2872 mlID.Form("%d", GetCurrentRun());
2873 fMonaLisa->SendParameters(&mlList, mlID);
2876 TString statusStr(status);
2877 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2878 statusStr.Contains("failed", TString::kIgnoreCase)){
2879 setClause = Form("set %s=\"%s\"", detector, status);
2882 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2888 TString whereClause = Form("where run=%d", GetCurrentRun());
2890 TString sqlQuery = Form("update %s %s %s",
2891 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2893 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2896 TSQLResult* aResult;
2897 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2899 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2907 //______________________________________________________________________________________________
2908 Int_t AliShuttle::GetCurrentRun() const
2911 // Get current run from logbook entry
2914 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2917 //______________________________________________________________________________________________
2918 UInt_t AliShuttle::GetCurrentStartTime() const
2921 // get current start time
2924 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2927 //______________________________________________________________________________________________
2928 UInt_t AliShuttle::GetCurrentEndTime() const
2931 // get current end time from logbook entry
2934 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2937 //______________________________________________________________________________________________
2938 UInt_t AliShuttle::GetCurrentYear() const
2941 // Get current year from logbook entry
2944 if (!fLogbookEntry) return 0;
2946 TTimeStamp startTime(GetCurrentStartTime());
2947 TString year = Form("%d",startTime.GetDate());
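// Illustration only: TTimeStamp::GetDate() returns the date packed as the integer
// YYYYMMDD, so the year is its leading four digits. The string form above can take
// the first four characters; the standalone sketch below does the same with integer
// division. YearFromDate is a hypothetical helper, not part of AliShuttle.

```cpp
#include <cassert>

// Hypothetical helper: extract the year from a date packed as YYYYMMDD.
unsigned int YearFromDate(unsigned int yyyymmdd)
{
  return yyyymmdd / 10000;
}
```

// For example, a run started on 17 December 2007 has the packed date 20071217.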
2953 //______________________________________________________________________________________________
2954 const char* AliShuttle::GetLHCPeriod() const
2957 // Get current LHC period from logbook entry
2960 if (!fLogbookEntry) return 0;
2962 return fLogbookEntry->GetRunParameter("LHCperiod");
2965 //______________________________________________________________________________________________
2966 void AliShuttle::Log(const char* detector, const char* message)
2969 // Fill log string with a message
2972 TString logRunDir = GetShuttleLogDir();
2973 if (GetCurrentRun() >=0)
2974 logRunDir += Form("/%d", GetCurrentRun());
2976 void* dir = gSystem->OpenDirectory(logRunDir.Data());
2978 if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
2979 AliError(Form("Can't open directory <%s>", logRunDir.Data()));
2984 gSystem->FreeDirectory(dir);
2987 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2988 if (GetCurrentRun() >= 0)
2989 toLog += Form("run %d - ", GetCurrentRun());
2990 toLog += Form("%s", message);
2992 AliInfo(toLog.Data());
2994 // if the log output is already redirected to the file, return here
2995 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2998 TString fileName = GetLogFileName(detector);
3000 gSystem->ExpandPathName(fileName);
3003 logFile.open(fileName, ofstream::out | ofstream::app);
3005 if (!logFile.is_open()) {
3006 AliError(Form("Could not open file %s", fileName.Data()));
3010 logFile << toLog.Data() << "\n";
3015 //______________________________________________________________________________________________
3016 TString AliShuttle::GetLogFileName(const char* detector) const
3019 // returns the name of the log file for a given sub detector
3024 if (GetCurrentRun() >= 0)
3026 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3027 detector, GetCurrentRun());
3029 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
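// Illustration only: the two log file layouts produced by GetLogFileName, as a
// standalone sketch with std::string. LogFileName and the directory used in the
// usage note are hypothetical; a negative run stands for "no current run".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring "%s/%d/%s_%d.log" and "%s/%s.log".
std::string LogFileName(const std::string& logDir, const std::string& detector, int run)
{
  if (run >= 0) {
    const std::string runStr = std::to_string(run);
    return logDir + "/" + runStr + "/" + detector + "_" + runStr + ".log";
  }
  return logDir + "/" + detector + ".log";
}
```

// With logDir "/logs": run 1234 and detector "TPC" give "/logs/1234/TPC_1234.log",
// while no current run and detector "SHUTTLE" give "/logs/SHUTTLE.log".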
3035 //______________________________________________________________________________________________
3036 void AliShuttle::SendAlive()
3038 // sends alive message to ML
3040 TMonaLisaText mlStatus("SHUTTLE_status", "Alive");
3043 mlList.Add(&mlStatus);
3045 fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
3048 //______________________________________________________________________________________________
3049 Bool_t AliShuttle::Collect(Int_t run)
3052 // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
3053 // If a dedicated run is given this run is processed
3055 // In operational mode, this is the Shuttle function triggered by the EOR signal.
3059 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
3061 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
3063 SetLastAction("Starting");
3065 // create ML instance
3067 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable());
3072 TString whereClause("where shuttle_done=0");
3074 whereClause += Form(" and run=%d", run);
3076 TObjArray shuttleLogbookEntries;
3077 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
3079 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3083 if (shuttleLogbookEntries.GetEntries() == 0)
3086 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
3088 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
3089 "or it does not exist in Shuttle logbook", run));
3093 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3094 fFirstUnprocessed[iDet] = kTRUE;
3098 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
3099 // flag them into fFirstUnprocessed array
3100 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
3101 TObjArray tmpLogbookEntries;
3102 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
3104 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3108 TIter iter(&tmpLogbookEntries);
3109 AliShuttleLogbookEntry* anEntry = 0;
3110 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
3112 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3114 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
3116 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
3117 anEntry->GetRun(), GetDetName(iDet)));
3118 fFirstUnprocessed[iDet] = kFALSE;
3126 if (!RetrieveConditionsData(shuttleLogbookEntries))
3128 Log("SHUTTLE", "Collect - Process of at least one run failed");
3132 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
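// Illustration only: how the fFirstUnprocessed flags are derived in Collect. Every
// detector starts as "first unprocessed"; any earlier run in which that detector is
// still unprocessed clears the flag. Standalone sketch: DetStatus and
// FirstUnprocessedFlags are hypothetical, and only kUnprocessed corresponds to a
// name used above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class DetStatus { kUnprocessed, kDone, kFailed };  // assumed status values

// Hypothetical helper: one status vector per earlier, not yet completed run.
std::vector<bool> FirstUnprocessedFlags(std::size_t nDetectors,
                                        const std::vector<std::vector<DetStatus>>& earlierRuns)
{
  std::vector<bool> first(nDetectors, true);
  for (const auto& run : earlierRuns) {
    for (std::size_t iDet = 0; iDet < nDetectors && iDet < run.size(); ++iDet) {
      if (run[iDet] == DetStatus::kUnprocessed) first[iDet] = false;
    }
  }
  return first;
}
```

// A detector keeps its flag only if no earlier run is still waiting for it.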
3136 //______________________________________________________________________________________________
3137 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
3140 // Retrieve conditions data for all runs that aren't processed yet
3143 Bool_t hasError = kFALSE;
3145 TIter iter(&dateEntries);
3146 AliShuttleLogbookEntry* anEntry;
3148 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
3149 if (!Process(anEntry)){
3153 // clean SHUTTLE temp directory
3154 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
3155 //RemoveFile(filename.Data());
3158 return hasError == kFALSE;
3161 //______________________________________________________________________________________________
3162 ULong_t AliShuttle::GetTimeOfLastAction() const
3165 // Gets time of last action
3170 fMonitoringMutex->Lock();
3172 tmp = fLastActionTime;
3174 fMonitoringMutex->UnLock();
3179 //______________________________________________________________________________________________
3180 const TString AliShuttle::GetLastAction() const
3183 // returns a string description of the last action
3188 fMonitoringMutex->Lock();
3192 fMonitoringMutex->UnLock();
3197 //______________________________________________________________________________________________
3198 void AliShuttle::SetLastAction(const char* action)
3201 // updates the monitoring variables
3204 fMonitoringMutex->Lock();
3206 fLastAction = action;
3207 fLastActionTime = time(0);
3209 fMonitoringMutex->UnLock();
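// Illustration only: the Lock()/UnLock() pairs in GetTimeOfLastAction, GetLastAction
// and SetLastAction guard the monitoring variables by hand. A scope-bound guard
// releases the mutex on every exit path. The standalone sketch below uses std::mutex
// and std::lock_guard rather than the ROOT mutex used in this class;
// LastActionMonitor is a hypothetical class, not part of AliShuttle.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>
#include <string>

class LastActionMonitor {
public:
  // Store the action text and the time at which it was set.
  void Set(const std::string& action) {
    std::lock_guard<std::mutex> guard(fMutex);
    fAction = action;
    fTime = std::time(nullptr);
  }
  // Return a copy so the caller never touches the shared string unlocked.
  std::string Get() const {
    std::lock_guard<std::mutex> guard(fMutex);
    return fAction;
  }
  std::time_t GetTime() const {
    std::lock_guard<std::mutex> guard(fMutex);
    return fTime;
  }
private:
  mutable std::mutex fMutex;  // mutable so const getters can lock
  std::string fAction;
  std::time_t fTime = 0;
};
```

// Returning copies from the getters matches the temporary-variable pattern used above.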
3212 //______________________________________________________________________________________________
3213 const char* AliShuttle::GetRunParameter(const char* param)
3216 // returns run parameter read from DAQ logbook
3219 if(!fLogbookEntry) {
3220 AliError("No logbook entry!");
3224 return fLogbookEntry->GetRunParameter(param);
3227 //______________________________________________________________________________________________
3228 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
3231 // returns object from OCDB valid for current run
3234 if (fTestMode & kErrorOCDB)
3236 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
3240 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
3243 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
3247 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
3250 //______________________________________________________________________________________________
3251 Bool_t AliShuttle::SendMail()
3254 // sends a mail to the subdetector expert in case of preprocessor error
3257 if (fTestMode != kNone)
3261 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3262 TObjString *anExpert=0;
3263 while ((anExpert = (TObjString*) iterExperts.Next()))
3265 to += Form("%s,", anExpert->GetName());
3267 if (to.Length() > 0)
3268 to.Remove(to.Length()-1);
3269 AliDebug(2, Form("to: %s",to.Data()));
3272 Log("SHUTTLE", "List of detector responsibles not yet set!");
3276 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3279 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3281 Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
3286 gSystem->FreeDirectory(dir);
3289 TString bodyFileName;
3290 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3291 gSystem->ExpandPathName(bodyFileName);
3294 mailBody.open(bodyFileName, ofstream::out);
3296 if (!mailBody.is_open())
3298 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3302 TString cc="alberto.colla@cern.ch";
3304 TString subject = Form("%s Shuttle preprocessor FAILED in run %d (run type = %s)!",
3305 fCurrentDetector.Data(), GetCurrentRun(), GetRunType());
3306 AliDebug(2, Form("subject: %s", subject.Data()));
3308 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3309 body += Form("SHUTTLE just detected that your preprocessor "
3310 "failed processing run %d (run type = %s)!!\n\n",
3311 GetCurrentRun(), GetRunType());
3312 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3313 fCurrentDetector.Data());
3314 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3316 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3318 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD&time=168 \n\n");
3322 TString logFolder = "logs";
3323 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3324 logFolder += "_PROD";
3327 body += Form("Find the %s log for the current run on \n\n"
3328 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3329 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3330 fCurrentDetector.Data(), GetCurrentRun());
3331 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3333 AliDebug(2, Form("Body begin: %s", body.Data()));
3335 mailBody << body.Data();
3337 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3339 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(),
3340 GetCurrentRun(), fCurrentDetector.Data(), GetCurrentRun());
3341 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3342 if (gSystem->Exec(tailCommand.Data()))
3344 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3347 TString endBody = Form("------------------------------------------------------\n\n");
3348 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3349 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3350 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3352 AliDebug(2, Form("Body end: %s", endBody.Data()));
3354 mailBody << endBody.Data();
3359 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3363 bodyFileName.Data());
3364 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3366 Bool_t result = gSystem->Exec(mailCommand.Data());
3371 //______________________________________________________________________________________________
3372 Bool_t AliShuttle::SendMailToDCS()
3375 // sends a mail to the DCS experts in case of DCS error
3378 if (fTestMode != kNone)
3381 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3384 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3386 Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3391 gSystem->FreeDirectory(dir);
3394 TString bodyFileName;
3395 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3396 gSystem->ExpandPathName(bodyFileName);
3399 mailBody.open(bodyFileName, ofstream::out);
3401 if (!mailBody.is_open())
3403 Log("SHUTTLE", Form("SendMailToDCS - Could not open mail body file %s", bodyFileName.Data()));
3407 TString to="Vladimir.Fekete@cern.ch, Svetozar.Kapusta@cern.ch";
3408 //TString to="alberto.colla@cern.ch";
3409 AliDebug(2, Form("to: %s",to.Data()));
3412 Log("SHUTTLE", "List of detector responsibles not yet set!");
3416 TString cc="alberto.colla@cern.ch";
3418 TString subject = Form("Retrieval of data points for %s FAILED in run %d !",
3419 fCurrentDetector.Data(), GetCurrentRun());
3420 AliDebug(2, Form("subject: %s", subject.Data()));
3422 TString body = Form("Dear DCS experts, \n\n");
3423 body += Form("SHUTTLE couldn\'t retrieve the data points for detector %s "
3424 "in run %d!!\n\n", fCurrentDetector.Data(), GetCurrentRun());
3425 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3426 fCurrentDetector.Data());
3427 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3429 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3431 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD&time=168 \n\n");
3434 TString logFolder = "logs";
3435 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3436 logFolder += "_PROD";
3439 body += Form("Find the %s log for the current run on \n\n"
3440 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3441 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3442 fCurrentDetector.Data(), GetCurrentRun());
3443 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3445 AliDebug(2, Form("Body begin: %s", body.Data()));
3447 mailBody << body.Data();
3449 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3451 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3452 fCurrentDetector.Data(), GetCurrentRun());
3453 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3454 if (gSystem->Exec(tailCommand.Data()))
3456 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3459 TString endBody = Form("------------------------------------------------------\n\n");
3460 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3461 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3462 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3464 AliDebug(2, Form("Body end: %s", endBody.Data()));
3466 mailBody << endBody.Data();
3471 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3475 bodyFileName.Data());
3476 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3478 Bool_t result = gSystem->Exec(mailCommand.Data());
3483 //______________________________________________________________________________________________
3484 const char* AliShuttle::GetRunType()
3487 // returns run type read from "run type" logbook
3490 if(!fLogbookEntry) {
3491 AliError("No logbook entry!");
3495 return fLogbookEntry->GetRunType();
3498 //______________________________________________________________________________________________
3499 Bool_t AliShuttle::GetHLTStatus()
3501 // Return HLT status (ON=1 OFF=0)
3502 // Converts the HLT status from the status string read in the run logbook (not just a bool)
3504 if(!fLogbookEntry) {
3505 AliError("No logbook entry!");
3509 // TODO implement when HLTStatus is inserted in run logbook
3510 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
3511 //if(hltStatus == "OFF") {return kFALSE;}
3516 //______________________________________________________________________________________________
3517 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
3520 // sets Shuttle temp directory
3523 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
3526 //______________________________________________________________________________________________
3527 void AliShuttle::SetShuttleLogDir(const char* logDir)
3530 // sets Shuttle log directory
3533 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);