1 /**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
18 Revision 1.77 2007/12/19 11:16:16 acolla
19 More meaningful log message added in GetFileSources
21 Revision 1.76 2007/12/19 07:45:20 acolla
22 bug fix in the name of the raw tag files (Raw instead of raw)
24 Revision 1.75 2007/12/18 15:42:14 jgrosseo
25 adding number of open runs to monitoring
27 Revision 1.74 2007/12/17 03:23:32 jgrosseo
29 added "empty preprocessor" as placeholder for Acorde in FDR
31 Revision 1.73 2007/12/14 19:31:36 acolla
32 Sending email to DCS experts is temporarily commented
34 Revision 1.72 2007/12/13 15:44:28 acolla
35 Run type added in mail sent to detector expert (eases understanding)
37 Revision 1.71 2007/12/12 14:56:14 jgrosseo
38 sending shuttle_ignore to ML also in case of 0 events
40 Revision 1.70 2007/12/12 13:45:35 acolla
41 Monalisa started in Collect() function. Alive message to monitor is sent at each Collect and every minute during preprocessor processing.
43 Revision 1.69 2007/12/12 10:06:29 acolla
44 in AliShuttle.cxx: SHUTTLE logbook is updated in case of invalid run times:
46 time_start==0 && time_end==0
48 logbook is NOT updated if time_start != 0 && time_end == 0, because it may mean that the run is still ongoing.
50 Revision 1.68 2007/12/11 10:15:17 acolla
51 Added marking SHUTTLE=DONE for invalid runs
52 (invalid start time or end time) and runs with totalEvents < 1
54 Revision 1.67 2007/12/07 19:14:36 acolla
57 Added automatic collection of new runs on a regular time basis (settable from the configuration)
59 in AliShuttleConfig: new members
61 - triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
62 - mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)
66 - logs now stored in logs/#RUN/DET_#RUN.log
68 Revision 1.66 2007/12/05 10:45:19 jgrosseo
69 changed order of arguments to TMonaLisaWriter
71 Revision 1.65 2007/11/26 16:58:37 acolla
72 Monalisa configuration added: host and table name
74 Revision 1.64 2007/11/13 16:15:47 acolla
75 DCS map is stored in a file in the temp folder where the detector is processed.
76 If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.
78 Revision 1.63 2007/11/02 10:53:16 acolla
79 Protection added to AliShuttle::CopyFileLocally
81 Revision 1.62 2007/10/31 18:23:13 acolla
82 Furter developement on the Shuttle:
84 - Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
85 are now built from /alice/data, e.g.:
86 /alice/data/2007/LHC07a/OCDB
88 the year and LHC period are taken from the Shuttle.
89 Raw metadata files are stored by GRP to:
90 /alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root
92 - Shuttle sends a mail to DCS experts each time DP retrieval fails.
94 Revision 1.61 2007/10/30 20:33:51 acolla
95 Improved managing of temporary folders, which weren't correctly handled.
96 Resolved bug introduced in StoreReferenceFile, which caused SPD preprocessor fail.
98 Revision 1.60 2007/10/29 18:06:16 acolla
100 New function StoreRunMetadataFile added to preprocessor and Shuttle interface
101 This function can be used by GRP only. It stores raw data tags merged file to the
102 raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).
106 1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
107 2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
108 has been corrected in the root package on the Shuttle machine.
110 Revision 1.59 2007/10/05 12:40:55 acolla
112 Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).
114 Revision 1.58 2007/09/28 15:27:40 acolla
116 AliDCSClient "multiSplit" option added in the DCS configuration
117 in AliDCSMessage: variable MAX_BODY_SIZE set to 500000
119 Revision 1.57 2007/09/27 16:53:13 acolla
120 Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
121 merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.
123 Revision 1.56 2007/09/14 16:46:14 jgrosseo
124 1) Connect and Close are called before and after each query, so one can
125 keep the same AliDCSClient object.
126 2) The splitting of a query is moved to GetDPValues/GetAliasValues.
127 3) Splitting interval can be specified in constructor
129 Revision 1.55 2007/08/06 12:26:40 acolla
130 Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
131 read from the run logbook.
133 Revision 1.54 2007/07/12 09:51:25 jgrosseo
134 removed duplicated log message in GetFile
136 Revision 1.53 2007/07/12 09:26:28 jgrosseo
137 updating hlt fxs base path
139 Revision 1.52 2007/07/12 08:06:45 jgrosseo
140 adding log messages in getfile... functions
141 adding not implemented copy constructor in alishuttleconfigholder
143 Revision 1.51 2007/07/03 17:24:52 acolla
144 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
146 Revision 1.50 2007/07/02 17:19:32 acolla
147 preprocessor is run in a temp directory that is removed when process is finished.
149 Revision 1.49 2007/06/29 10:45:06 acolla
150 Number of columns in MySql Shuttle logbook increased by one (HLT added)
152 Revision 1.48 2007/06/21 13:06:19 acolla
153 GetFileSources returns dummy list with 1 source if system=DCS (better than
154 returning error as it was)
156 Revision 1.47 2007/06/19 17:28:56 acolla
157 HLT updated; missing map bug removed.
159 Revision 1.46 2007/06/09 13:01:09 jgrosseo
160 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
162 Revision 1.45 2007/05/30 06:35:20 jgrosseo
163 Adding functionality to the Shuttle/TestShuttle:
164 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
165 o) Function to retrieve list of IDs for a given source (GetFileIDs)
166 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
167 Example code has been added to the TestProcessor in TestShuttle
169 Revision 1.44 2007/05/11 16:09:32 acolla
170 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
171 example: ITS/SPD/100_filename.root
173 Revision 1.43 2007/05/10 09:59:51 acolla
174 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
176 Revision 1.42 2007/05/03 08:01:39 jgrosseo
177 typo in last commit :-(
179 Revision 1.41 2007/05/03 08:00:48 jgrosseo
180 fixing log message when pp want to skip dcs value retrieval
182 Revision 1.40 2007/04/27 07:06:48 jgrosseo
183 GetFileSources returns empty list in case of no files, but successful query
184 No mails sent in testmode
186 Revision 1.39 2007/04/17 12:43:57 acolla
187 Correction in StoreOCDB; change of text in mail to detector expert
189 Revision 1.38 2007/04/12 08:26:18 jgrosseo
192 Revision 1.37 2007/04/10 16:53:14 jgrosseo
193 redirecting sub detector stdout, stderr to sub detector log file
195 Revision 1.35 2007/04/04 16:26:38 acolla
196 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
197 2. Added missing dependency in test preprocessors.
198 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
200 Revision 1.34 2007/04/04 10:33:36 jgrosseo
201 1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
202 In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
204 2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
206 3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
208 4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
210 5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implemting this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
211 If you always need DCS data (like before), you do not need to implement it.
213 6) The run type has been added to the monitoring page
215 Revision 1.33 2007/04/03 13:56:01 acolla
216 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
219 Revision 1.32 2007/02/28 10:41:56 acolla
220 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
221 AliPreprocessor::GetRunType() function.
222 Added some ldap definition files.
224 Revision 1.30 2007/02/13 11:23:21 acolla
225 Moved getters and setters of Shuttle's main OCDB/Reference, local
226 OCDB/Reference, temp and log folders to AliShuttleInterface
228 Revision 1.27 2007/01/30 17:52:42 jgrosseo
229 adding monalisa monitoring
231 Revision 1.26 2007/01/23 19:20:03 acolla
232 Removed old ldif files, added TOF, MCH ldif files. Added some options in
233 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
236 Revision 1.25 2007/01/15 19:13:52 acolla
237 Moved some AliInfo to AliDebug in SendMail function
239 Revision 1.21 2006/12/07 08:51:26 jgrosseo
241 table, db names in ldap configuration
242 added GRP preprocessor
243 DCS data can also be retrieved by data point
245 Revision 1.20 2006/11/16 16:16:48 jgrosseo
246 introducing strict run ordering flag
247 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
249 Revision 1.19 2006/11/06 14:23:04 jgrosseo
250 major update (Alberto)
251 o) reading of run parameters from the logbook
252 o) online offline naming conversion
253 o) standalone DCSclient package
255 Revision 1.18 2006/10/20 15:22:59 jgrosseo
256 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
257 o) Merging Collect, CollectAll, CollectNew function
258 o) Removing implementation of empty copy constructors (declaration still there!)
260 Revision 1.17 2006/10/05 16:20:55 jgrosseo
261 adapting to new CDB classes
263 Revision 1.16 2006/10/05 15:46:26 jgrosseo
264 applying to the new interface
266 Revision 1.15 2006/10/02 16:38:39 jgrosseo
269 storing of objects that failed to be stored to the grid before
270 interfacing of shuttle status table in daq system
272 Revision 1.14 2006/08/29 09:16:05 jgrosseo
275 Revision 1.13 2006/08/15 10:50:00 jgrosseo
276 effc++ corrections (alberto)
278 Revision 1.12 2006/08/08 14:19:29 jgrosseo
279 Update to shuttle classes (Alberto)
281 - Possibility to set the full object's path in the Preprocessor's and
282 Shuttle's Store functions
283 - Possibility to extend the object's run validity in the same classes
284 ("startValidity" and "validityInfinite" parameters)
285 - Implementation of the StoreReferenceData function to store reference
286 data in a dedicated CDB storage.
288 Revision 1.11 2006/07/21 07:37:20 jgrosseo
289 last run is stored after each run
291 Revision 1.10 2006/07/20 09:54:40 jgrosseo
292 introducing status management: The processing per subdetector is divided into several steps,
293 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
294 can keep track of the number of failures and skips further processing after a certain threshold is
295 exceeded. These thresholds can be configured in LDAP.
297 Revision 1.9 2006/07/19 10:09:55 jgrosseo
298 new configuration, accesst to DAQ FES (Alberto)
300 Revision 1.8 2006/07/11 12:44:36 jgrosseo
301 adding parameters for extended validity range of data produced by preprocessor
303 Revision 1.7 2006/07/10 14:37:09 jgrosseo
304 small fix + todo comment
306 Revision 1.6 2006/07/10 13:01:41 jgrosseo
307 enhanced storing of last sucessfully processed run (alberto)
309 Revision 1.5 2006/07/04 14:59:57 jgrosseo
310 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
312 Revision 1.4 2006/06/12 09:11:16 jgrosseo
313 coding conventions (Alberto)
315 Revision 1.3 2006/06/06 14:26:40 jgrosseo
316 o) removed files that were moved to STEER
317 o) shuttle updated to follow the new interface (Alberto)
319 Revision 1.2 2006/03/07 07:52:34 hristov
320 New version (B.Yordanov)
322 Revision 1.6 2005/11/19 17:19:14 byordano
323 RetrieveDATEEntries and RetrieveConditionsData added
325 Revision 1.5 2005/11/19 11:09:27 byordano
326 AliShuttle declaration added
328 Revision 1.4 2005/11/17 17:47:34 byordano
329 TList changed to TObjArray
331 Revision 1.3 2005/11/17 14:43:23 byordano
334 Revision 1.1.1.1 2005/10/28 07:33:58 hristov
335 Initial import as subdirectory in AliRoot
337 Revision 1.2 2005/09/13 08:41:15 byordano
338 default startTime endTime added
340 Revision 1.4 2005/08/30 09:13:02 byordano
343 Revision 1.3 2005/08/29 21:15:47 byordano
349 // This class is the main manager for AliShuttle.
350 // It organizes the data retrieval from DCS and call the
351 // interface methods of AliPreprocessor.
352 // For every detector in AliShuttleConfgi (see AliShuttleConfig),
353 // data for its set of aliases is retrieved. If there is registered
354 // AliPreprocessor for this detector then it will be used
355 // accroding to the schema (see AliPreprocessor).
356 // If there isn't registered AliPreprocessor than the retrieved
357 // data is stored automatically to the undelying AliCDBStorage.
358 // For detSpec is used the alias name.
361 #include "AliShuttle.h"
363 #include "AliCDBManager.h"
364 #include "AliCDBStorage.h"
365 #include "AliCDBId.h"
366 #include "AliCDBRunRange.h"
367 #include "AliCDBPath.h"
368 #include "AliCDBEntry.h"
369 #include "AliShuttleConfig.h"
370 #include "DCSClient/AliDCSClient.h"
372 #include "AliPreprocessor.h"
373 #include "AliShuttleStatus.h"
374 #include "AliShuttleLogbookEntry.h"
379 #include <TTimeStamp.h>
380 #include <TObjString.h>
381 #include <TSQLServer.h>
382 #include <TSQLResult.h>
385 #include <TSystemDirectory.h>
386 #include <TSystemFile.h>
389 #include <TGridResult.h>
391 #include <TMonaLisaWriter.h>
395 #include <sys/types.h>
396 #include <sys/wait.h>
400 //______________________________________________________________________________________________
401 AliShuttle::AliShuttle(const AliShuttleConfig* config,
402 UInt_t timeout, Int_t retries):
404 fTimeout(timeout), fRetries(retries),
414 fReadTestMode(kFALSE),
415 fOutputRedirected(kFALSE)
418 // config: AliShuttleConfig used
419 // timeout: timeout used for AliDCSClient connection
420 // retries: the number of retries in case of connection error.
423 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
424 for(int iSys=0;iSys<4;iSys++) {
427 fFXSlist[iSys].SetOwner(kTRUE);
429 fPreprocessorMap.SetOwner(kTRUE);
431 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
432 fFirstUnprocessed[iDet] = kFALSE;
434 fMonitoringMutex = new TMutex();
437 //______________________________________________________________________________________________
438 AliShuttle::~AliShuttle()
444 fPreprocessorMap.DeleteAll();
445 for(int iSys=0;iSys<4;iSys++)
447 fServer[iSys]->Close();
448 delete fServer[iSys];
457 if (fMonitoringMutex)
459 delete fMonitoringMutex;
460 fMonitoringMutex = 0;
464 //______________________________________________________________________________________________
465 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
468 // Registers new AliPreprocessor.
469 // It uses GetName() for indentificator of the pre processor.
470 // The pre processor is registered it there isn't any other
471 // with the same identificator (GetName()).
474 const char* detName = preprocessor->GetName();
475 if(GetDetPos(detName) < 0)
476 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
478 if (fPreprocessorMap.GetValue(detName)) {
479 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
483 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
485 //______________________________________________________________________________________________
486 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
487 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
489 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
490 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
491 // using this function. Use StoreReferenceData instead!
492 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
493 // finishes the data are transferred to the main storage (Grid).
495 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
498 //______________________________________________________________________________________________
499 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
501 // Stores a CDB object in the storage for reference data. This objects will not be available during
502 // offline reconstrunction. Use this function for reference data only!
503 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
504 // finishes the data are transferred to the main storage (Grid).
506 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
509 //______________________________________________________________________________________________
510 Bool_t AliShuttle::StoreLocally(const TString& localUri,
511 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
512 Int_t validityStart, Bool_t validityInfinite)
514 // Store object temporarily in local storage. Parameters are passed by Store and StoreReferenceData functions.
515 // when the preprocessor finishes the data are transferred to the main storage (Grid).
516 // The parameters are:
517 // 1) Uri of the backup storage (Local)
518 // 2) the object's path.
519 // 3) the object to be stored
520 // 4) the metaData to be associated with the object
521 // 5) the validity start run number w.r.t. the current run,
522 // if the data is valid only for this run leave the default 0
523 // 6) specifies if the calibration data is valid for infinity (this means until updated),
524 // typical for calibration runs, the default is kFALSE
526 // returns 0 if fail, 1 otherwise
528 if (fTestMode & kErrorStorage)
530 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
534 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
536 Int_t firstRun = GetCurrentRun() - validityStart;
538 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
543 if(validityInfinite) {
544 lastRun = AliCDBRunRange::Infinity();
546 lastRun = GetCurrentRun();
549 // Version is set to current run, it will be used later to transfer data to Grid
550 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
552 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
553 TObjString runUsed = Form("%d", GetCurrentRun());
554 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
557 Bool_t result = kFALSE;
559 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
560 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
562 result = AliCDBManager::Instance()->GetStorage(localUri)
563 ->Put(object, id, metaData);
568 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
574 //______________________________________________________________________________________________
575 Bool_t AliShuttle::StoreOCDB()
578 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
579 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
580 // Then calls StoreRefFilesToGrid to store reference files.
583 if (fTestMode & kErrorGrid)
585 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
586 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
590 Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
591 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
593 Log("SHUTTLE","StoreOCDB - Storing reference data ...");
594 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
596 Log("SHUTTLE","StoreOCDB - Storing reference files ...");
597 Bool_t resultRefFiles = CopyFilesToGrid("reference");
599 Bool_t resultMetadata = kTRUE;
600 if(fCurrentDetector == "GRP")
602 Log("StoreOCDB - SHUTTLE","Storing Run Metadata file ...");
603 resultMetadata = CopyFilesToGrid("metadata");
606 return resultCDB && resultRef && resultRefFiles && resultMetadata;
609 //______________________________________________________________________________________________
610 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
613 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
616 TObjArray* gridIds=0;
618 Bool_t result = kTRUE;
620 const char* type = 0;
622 if(gridURI == fgkMainCDB) {
624 localURI = fgkLocalCDB;
625 } else if(gridURI == fgkMainRefStorage) {
627 localURI = fgkLocalRefStorage;
629 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
633 AliCDBManager* man = AliCDBManager::Instance();
635 AliCDBStorage *gridSto = man->GetStorage(gridURI);
638 Form("StoreOCDB - cannot activate main %s storage", type));
642 gridIds = gridSto->GetQueryCDBList();
644 // get objects previously stored in local CDB
645 AliCDBStorage *localSto = man->GetStorage(localURI);
648 Form("StoreOCDB - cannot activate local %s storage", type));
651 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
652 // Local objects were stored with current run as Grid version!
653 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
654 localEntries->SetOwner(1);
656 // loop on local stored objects
657 TIter localIter(localEntries);
658 AliCDBEntry *aLocEntry = 0;
659 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
660 aLocEntry->SetOwner(1);
661 AliCDBId aLocId = aLocEntry->GetId();
662 aLocEntry->SetVersion(-1);
663 aLocEntry->SetSubVersion(-1);
665 // If local object is valid up to infinity we store it only if it is
666 // the first unprocessed run!
667 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
668 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
670 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
671 "there are previous unprocessed runs!",
672 fCurrentDetector.Data(), aLocId.GetPath().Data()));
677 // loop on Grid valid Id's
678 Bool_t store = kTRUE;
679 TIter gridIter(gridIds);
680 AliCDBId* aGridId = 0;
681 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
682 if(aGridId->GetPath() != aLocId.GetPath()) continue;
683 // skip all objects valid up to infinity
684 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
685 // if we get here, it means there's already some more recent object stored on Grid!
690 // If we get here, the file can be stored!
691 Bool_t storeOk = gridSto->Put(aLocEntry);
692 if(!store || storeOk){
696 Log(fCurrentDetector.Data(),
697 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
698 type, aGridId->ToString().Data()));
701 Form("StoreOCDB - Object <%s> successfully put into %s storage",
702 aLocId.ToString().Data(), type));
703 Log(fCurrentDetector.Data(),
704 Form("StoreOCDB - Object <%s> successfully put into %s storage",
705 aLocId.ToString().Data(), type));
708 // removing local filename...
710 localSto->IdToFilename(aLocId, filename);
711 Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
712 RemoveFile(filename.Data());
716 Form("StoreOCDB - Grid %s storage of object <%s> failed",
717 type, aLocId.ToString().Data()));
718 Log(fCurrentDetector.Data(),
719 Form("StoreOCDB - Grid %s storage of object <%s> failed",
720 type, aLocId.ToString().Data()));
724 localEntries->Clear();
729 //______________________________________________________________________________________________
730 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
732 // clears the directory used to store reference files of a given subdetector
734 AliCDBManager* man = AliCDBManager::Instance();
735 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
736 TString localBaseFolder = sto->GetBaseFolder();
738 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
740 Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
743 begin.Form("%d_", GetCurrentRun());
745 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
749 TList* dirList = baseDir->GetListOfFiles();
752 if (!dirList) return kTRUE;
754 if (dirList->GetEntries() < 3)
760 Int_t nDirs = 0, nDel = 0;
761 TIter dirIter(dirList);
762 TSystemFile* entry = 0;
764 Bool_t success = kTRUE;
766 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
768 if (entry->IsDirectory())
771 TString fileName(entry->GetName());
772 if (!fileName.BeginsWith(begin))
778 Int_t result = gSystem->Unlink(fileName.Data());
782 Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
790 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
791 nDel, nDirs, targetDir.Data()));
802 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
806 result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
809 Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
814 result = gSystem->mkdir(targetDir, kTRUE);
817 Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
824 //______________________________________________________________________________________________
825 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
828 // Stores reference file directly (without opening it). This function stores the file locally.
830 // The file is stored under the following location:
831 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
832 // where <gridFileName> is the second parameter given to the function
835 if (fTestMode & kErrorStorage)
837 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
841 AliCDBManager* man = AliCDBManager::Instance();
842 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
844 TString localBaseFolder = sto->GetBaseFolder();
846 TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);
847 target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
849 return CopyFileLocally(localFile, target);
852 //______________________________________________________________________________________________
853 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
856 // Stores Run metadata file to the Grid, in the run folder
858 // Only GRP can call this function.
860 if (fTestMode & kErrorStorage)
862 Log(fCurrentDetector, "StoreRunMetaDataFile - In TESTMODE - Simulating error while storing locally");
866 AliCDBManager* man = AliCDBManager::Instance();
867 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
869 TString localBaseFolder = sto->GetBaseFolder();
871 // Build Run level folder
872 // folder = /alice/data/year/lhcPeriod/runNb/raw
875 TString lhcPeriod = GetLHCPeriod();
876 if (lhcPeriod.Length() == 0)
878 Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
882 // TODO partitions with one detector only write data into LHCperiod_DET
883 TString partition = GetRunParameter("partition");
885 if (partition.Length() > 0 && partition != "ALICE")
887 lhcPeriod.Append(Form("_%s", partition.Data()));
888 Log(fCurrentDetector, Form("Run data tags merged file will be written in %s",
892 TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/raw/%s",
893 localBaseFolder.Data(), GetCurrentYear(),
894 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
896 return CopyFileLocally(localFile, target);
899 //______________________________________________________________________________________________
900 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
903 // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
904 // Files are temporarily stored in the local reference storage. When the preprocessor
905 // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn
906 // (in reference or run level folders)
909 TString targetDir(target(0, target.Last('/')));
911 //try to open base dir folder, if it does not exist
912 void* dir = gSystem->OpenDirectory(targetDir.Data());
914 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
915 Log("SHUTTLE", Form("CopyFileLocally - Can't open directory <%s>", targetDir.Data()));
920 gSystem->FreeDirectory(dir);
925 result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
928 Log("SHUTTLE", Form("CopyFileLocally - %s does not exist", localFile));
932 result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
935 Log("SHUTTLE", Form("CopyFileLocally - target file %s already exist, removing...", target.Data()));
936 if (gSystem->Unlink(target.Data()))
938 Log("SHUTTLE", Form("CopyFileLocally - Could not remove existing target file %s!", target.Data()));
943 result = gSystem->CopyFile(localFile, target);
947 Log("SHUTTLE", Form("CopyFileLocally - File %s stored locally to %s", localFile, target.Data()));
952 Log("SHUTTLE", Form("CopyFileLocally - Could not store file %s to %s! Error code = %d",
953 localFile, target.Data(), result));
961 //______________________________________________________________________________________________
962 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
965 // Transfers local files to the Grid. Local files can be reference files
966 // or run metadata file (from GRP only).
968 // According to the type (ref, metadata) the files are stored under the following location:
969 // ref --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
970 // metadata --> <run data folder>/<MetadataFileName>
973 AliCDBManager* man = AliCDBManager::Instance();
974 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
977 TString localBaseFolder = sto->GetBaseFolder();
983 if (strcmp(type, "reference") == 0)
985 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
986 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
989 TString gridBaseFolder = gridSto->GetBaseFolder();
990 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
991 begin = Form("%d_", GetCurrentRun());
993 else if (strcmp(type, "metadata") == 0)
996 TString lhcPeriod = GetLHCPeriod();
998 if (lhcPeriod.Length() == 0)
1000 Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
1004 // TODO partitions with one detector only write data into LHCperiod_DET
1005 TString partition = GetRunParameter("partition");
1007 if (partition.Length() > 0 && partition != "ALICE")
1009 lhcPeriod.Append(Form("_%s", partition.Data()));
1012 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/raw",
1013 localBaseFolder.Data(), GetCurrentYear(),
1014 lhcPeriod.Data(), GetCurrentRun());
1015 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
1021 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
1025 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
1029 TList* dirList = baseDir->GetListOfFiles();
1032 if (!dirList) return kTRUE;
1034 if (dirList->GetEntries() < 3)
1042 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
1047 Int_t nDirs = 0, nTransfer = 0;
1048 TIter dirIter(dirList);
1049 TSystemFile* entry = 0;
1051 Bool_t success = kTRUE;
1052 Bool_t first = kTRUE;
1054 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1056 if (entry->IsDirectory())
1059 TString fileName(entry->GetName());
1060 if (!fileName.BeginsWith(begin))
1068 // check that folder exists, otherwise create it
1069 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1077 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
1079 // TODO It does not work currently! Bug in TAliEn::Mkdir
1080 // TODO Manually fixed in local root v5-16-00
1081 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1083 Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1088 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1092 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1096 TString fullLocalPath;
1097 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1099 TString fullGridPath;
1100 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1102 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1106 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!",
1107 fullLocalPath.Data(), fullGridPath.Data()));
1108 RemoveFile(fullLocalPath);
1113 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!",
1114 fullLocalPath.Data(), fullGridPath.Data()));
1119 Log("SHUTTLE", Form("CopyFilesToGrid - %d (over %d) files in folder %s copied to Grid.",
1120 nTransfer, nDirs, dir.Data()));
1127 //______________________________________________________________________________________________
1128 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1131 // Get folder name of reference files
1134 TString offDetStr(GetOfflineDetName(detector));
1136 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1138 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1140 dir.Form("%s/%s", base, offDetStr.Data());
1148 //______________________________________________________________________________________________
1149 void AliShuttle::CleanLocalStorage(const TString& uri)
1152 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
1155 const char* type = 0;
1156 if(uri == fgkLocalCDB) {
1158 } else if(uri == fgkLocalRefStorage) {
1161 AliError(Form("Invalid storage URI: %s", uri.Data()));
1165 AliCDBManager* man = AliCDBManager::Instance();
1167 // open local storage
1168 AliCDBStorage *localSto = man->GetStorage(uri);
1171 Form("CleanLocalStorage - cannot activate local %s storage", type));
1175 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1176 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1178 AliDebug(2, Form("filename = %s", filename.Data()));
1180 Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1181 GetCurrentRun(), fCurrentDetector.Data()));
1183 RemoveFile(filename.Data());
1187 //______________________________________________________________________________________________
1188 void AliShuttle::RemoveFile(const char* filename)
1191 // removes local file
1194 TString command(Form("rm -f %s", filename));
1196 Int_t result = gSystem->Exec(command.Data());
1199 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1200 fCurrentDetector.Data(), filename));
1204 //______________________________________________________________________________________________
1205 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1208 // Reads the AliShuttleStatus from the CDB
1212 delete fStatusEntry;
1216 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1217 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1219 if (!fStatusEntry) return 0;
1220 fStatusEntry->SetOwner(1);
1222 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1224 AliError("Invalid object stored to CDB!");
1231 //______________________________________________________________________________________________
1232 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1235 // writes the status for one subdetector
1239 delete fStatusEntry;
1243 Int_t run = GetCurrentRun();
1245 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1247 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1248 fStatusEntry->SetOwner(1);
1250 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1253 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1254 fCurrentDetector.Data(), run));
1263 //______________________________________________________________________________________________
1264 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1267 // changes the AliShuttleStatus for the given detector and run to the given status
1271 AliError("UNEXPECTED: fStatusEntry empty");
1275 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1278 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1282 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1283 fCurrentDetector.Data(),
1284 status->GetStatusName(),
1285 status->GetStatusName(newStatus));
1286 Log("SHUTTLE", actionStr);
1287 SetLastAction(actionStr);
1289 status->SetStatus(newStatus);
1290 if (increaseCount) status->IncreaseCount();
1292 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1297 //______________________________________________________________________________________________
1298 void AliShuttle::SendMLInfo()
1301 // sends ML information about the current status of the current detector being processed
1304 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1307 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1311 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1312 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1315 mlList.Add(&mlStatus);
1316 mlList.Add(&mlRetryCount);
1319 mlID.Form("%d", GetCurrentRun());
1320 fMonaLisa->SendParameters(&mlList, mlID);
1323 //______________________________________________________________________________________________
1324 Bool_t AliShuttle::ContinueProcessing()
1326 // this function reads the AliShuttleStatus information from CDB and
1327 // checks if the processing should be continued
1328 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
1330 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1332 AliPreprocessor* aPreprocessor =
1333 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1336 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1340 AliShuttleLogbookEntry::Status entryStatus =
1341 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1343 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1344 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1345 fCurrentDetector.Data(),
1346 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1350 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1352 // check if current run is first unprocessed run for current detector
1353 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1354 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1356 if (fTestMode == kNone)
1358 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
1359 " but this is not the first unprocessed run!"));
1364 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
1365 "Although %s requires strict run ordering "
1366 "and this is not the first unprocessed run, "
1367 "the SHUTTLE continues"));
1371 AliShuttleStatus* status = ReadShuttleStatus();
1374 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1375 fCurrentDetector.Data()));
1376 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1377 return WriteShuttleStatus(status);
1380 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1381 // If it happens it may mean Logbook updating failed... let's do it now!
1382 if (status->GetStatus() == AliShuttleStatus::kDone ||
1383 status->GetStatus() == AliShuttleStatus::kFailed){
1384 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1385 fCurrentDetector.Data(),
1386 status->GetStatusName(status->GetStatus())));
1387 UpdateShuttleLogbook(fCurrentDetector.Data(),
1388 status->GetStatusName(status->GetStatus()));
1392 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1394 Form("ContinueProcessing - %s: Grid storage of one or more "
1395 "objects failed. Trying again now",
1396 fCurrentDetector.Data()));
1397 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1399 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1400 "successfully stored into main storage",
1401 fCurrentDetector.Data()));
1404 Form("ContinueProcessing - %s: Grid storage failed again",
1405 fCurrentDetector.Data()));
1406 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1411 // if we get here, there is a restart
1412 Bool_t cont = kFALSE;
1415 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1416 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1417 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1418 status->GetCount(), status->GetStatusName()));
1419 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1420 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1422 // there may still be objects in local OCDB and reference storage
1423 // and FXS databases may be not updated: do it now!
1425 // TODO Currently disabled, we want to keep files in case of failure!
1426 // CleanLocalStorage(fgkLocalCDB);
1427 // CleanLocalStorage(fgkLocalRefStorage);
1428 // UpdateTableFailCase();
1430 // Send mail to detector expert!
1431 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...",
1432 fCurrentDetector.Data()));
1434 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1435 fCurrentDetector.Data()));
1438 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1439 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1440 status->GetStatusName(), status->GetCount()));
1441 Bool_t increaseCount = kTRUE;
1442 if (status->GetStatus() == AliShuttleStatus::kDCSError ||
1443 status->GetStatus() == AliShuttleStatus::kDCSStarted)
1444 increaseCount = kFALSE;
1446 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
1453 //______________________________________________________________________________________________
1454 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1457 // Makes data retrieval for all detectors in the configuration.
1458 // entry: Shuttle logbook entry, contains run paramenters and status of detectors
1459 // (Unprocessed, Inactive, Failed or Done).
1460 // Returns kFALSE in case of error occured and kTRUE otherwise
1463 if (!entry) return kFALSE;
1465 fLogbookEntry = entry;
1467 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1470 // Send the information to ML
1471 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1472 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1475 mlList.Add(&mlStatus);
1476 mlList.Add(&mlRunType);
1479 mlID.Form("%d", GetCurrentRun());
1480 fMonaLisa->SendParameters(&mlList, mlID);
1482 if (fLogbookEntry->IsDone())
1484 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1485 UpdateShuttleLogbook("shuttle_done");
1490 // read test mode if flag is set
1494 TString logEntry(entry->GetRunParameter("log"));
1495 //printf("log entry = %s\n", logEntry.Data());
1496 TString searchStr("Testmode: ");
1497 Int_t pos = logEntry.Index(searchStr.Data());
1498 //printf("%d\n", pos);
1501 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1502 //printf("%s\n", subStr.String().Data());
1503 TString newStr(subStr.Data());
1504 TObjArray* token = newStr.Tokenize(' ');
1508 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1511 Int_t testMode = tmpStr->String().Atoi();
1514 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1515 SetTestMode((TestMode) testMode);
1523 fLogbookEntry->Print("all");
1526 Bool_t hasError = kFALSE;
1528 // Set the CDB and Reference folders according to the year and LHC period
1529 TString lhcPeriod(GetLHCPeriod());
1530 if (lhcPeriod.Length() == 0)
1532 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1536 if (fgkMainCDB.Length() == 0)
1537 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache",
1538 GetCurrentYear(), lhcPeriod.Data());
1540 if (fgkMainRefStorage.Length() == 0)
1541 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache",
1542 GetCurrentYear(), lhcPeriod.Data());
1544 // Loop on detectors in the configuration
1545 TIter iter(fConfig->GetDetectors());
1546 TObjString* aDetector = 0;
1548 Bool_t first = kTRUE;
1550 while ((aDetector = (TObjString*) iter.Next()))
1552 fCurrentDetector = aDetector->String();
1554 if (ContinueProcessing() == kFALSE) continue;
1558 // only read QueryCDB when needed and only once
1559 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1560 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1561 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1562 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1566 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START ******",
1567 GetCurrentRun(), aDetector->GetName()));
1569 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1571 Log(fCurrentDetector.Data(), "Process - Starting processing");
1577 Log("SHUTTLE", "Process - ERROR: Forking failed");
1582 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1583 GetCurrentRun(), aDetector->GetName()));
1585 Long_t begin = time(0);
1587 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1588 while (waitpid(pid, &status, WNOHANG) == 0)
1590 Long_t expiredTime = time(0) - begin;
1592 if (expiredTime > fConfig->GetPPTimeOut())
1595 tmp.Form("Process - Process of %s time out. "
1596 "Run time: %d seconds. Killing...",
1597 fCurrentDetector.Data(), expiredTime);
1598 Log("SHUTTLE", tmp);
1599 Log(fCurrentDetector, tmp);
1603 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1606 gSystem->Sleep(1000);
1610 gSystem->Sleep(1000);
1613 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1614 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1617 Log("SHUTTLE", Form("Process - Error: "
1618 "Could not open pipe to %s", checkStr.Data()));
1623 if (!fgets(buffer, 100, pipe))
1625 Log("SHUTTLE", "Process - Error: ps did not return anything");
1626 gSystem->ClosePipe(pipe);
1629 gSystem->ClosePipe(pipe);
1631 //Log("SHUTTLE", Form("ps returned %s", buffer));
1634 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1636 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1640 if (expiredTime % 60 == 0)
1642 Log("SHUTTLE", Form("Process - %s: Checking process. "
1643 "Run time: %d seconds - Memory consumption: %d KB",
1644 fCurrentDetector.Data(), expiredTime, mem));
1648 if (mem > fConfig->GetPPMaxMem())
1651 tmp.Form("Process - Process exceeds maximum allowed memory "
1652 "(%d KB > %d KB). Killing...",
1653 mem, fConfig->GetPPMaxMem());
1654 Log("SHUTTLE", tmp);
1655 Log(fCurrentDetector, tmp);
1659 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1662 gSystem->Sleep(1000);
1667 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1668 GetCurrentRun(), aDetector->GetName()));
1670 if (WIFEXITED(status))
1672 Int_t returnCode = WEXITSTATUS(status);
1674 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1677 if (returnCode == 0) hasError = kTRUE;
1683 Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1684 aDetector->GetName()));
1686 Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1688 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1690 Log("SHUTTLE", "Process - Could not freopen stdout");
1694 fOutputRedirected = kTRUE;
1695 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1696 Log("SHUTTLE", "Process - Could not redirect stderr");
1700 TString wd = gSystem->WorkingDirectory();
1701 TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(),
1702 fCurrentDetector.Data(), GetCurrentRun());
1704 Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1705 if (!result) // temp dir already exists!
1707 Log(fCurrentDetector.Data(),
1708 Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1709 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1712 if (gSystem->mkdir(tmpDir.Data(), 1))
1714 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1718 if (!gSystem->ChangeDirectory(tmpDir.Data()))
1720 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1724 Bool_t success = ProcessCurrentDetector();
1726 gSystem->ChangeDirectory(wd.Data());
1728 if (success) // Preprocessor finished successfully!
1730 // remove temporary folder
1731 // temporary commented (JF)
1732 //gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1734 // Update time_processed field in FXS DB
1735 if (UpdateTable() == kFALSE)
1736 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1737 fCurrentDetector.Data()));
1739 // Transfer the data from local storage to main storage (Grid)
1740 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1741 if (StoreOCDB() == kFALSE)
1744 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1745 GetCurrentRun(), aDetector->GetName()));
1746 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1750 Form("\t\t\t****** run %d - %s: DONE ******",
1751 GetCurrentRun(), aDetector->GetName()));
1752 UpdateShuttleStatus(AliShuttleStatus::kDone);
1753 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1758 Form("\t\t\t****** run %d - %s: PP ERROR ******",
1759 GetCurrentRun(), aDetector->GetName()));
1762 for (UInt_t iSys=0; iSys<3; iSys++)
1764 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1767 Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1768 GetCurrentRun(), aDetector->GetName(), success));
1770 // the client exits here
1771 gSystem->Exit(success);
1773 AliError("We should never get here!!!");
1777 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1780 //check if shuttle is done for this run, if so update logbook
1781 TObjArray checkEntryArray;
1782 checkEntryArray.SetOwner(1);
1783 TString whereClause = Form("where run=%d", GetCurrentRun());
1784 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) ||
1785 checkEntryArray.GetEntries() == 0) {
1786 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1788 return hasError == kFALSE;
1791 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1792 (checkEntryArray.At(0));
1796 if (checkEntry->IsDone())
1798 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1799 UpdateShuttleLogbook("shuttle_done");
1803 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1805 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1807 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1808 checkEntry->GetRun(), GetDetName(iDet)));
1809 fFirstUnprocessed[iDet] = kFALSE;
1817 return hasError == kFALSE;
1820 //______________________________________________________________________________________________
1821 Bool_t AliShuttle::ProcessCurrentDetector()
1824 // Makes data retrieval just for a specific detector (fCurrentDetector).
1825 // Threre should be a configuration for this detector.
1827 Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d",
1828 fCurrentDetector.Data(), GetCurrentRun()));
1830 TString wd = gSystem->WorkingDirectory();
1832 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1835 gSystem->ChangeDirectory(wd.Data());
1837 TMap* dcsMap = new TMap();
1839 // call preprocessor
1840 AliPreprocessor* aPreprocessor =
1841 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1843 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1845 Bool_t processDCS = aPreprocessor->ProcessDCS();
1849 Log(fCurrentDetector, "ProcessCurrentDetector -"
1850 " The preprocessor requested to skip the retrieval of DCS values");
1852 else if (fTestMode & kSkipDCS)
1854 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1856 else if (fTestMode & kErrorDCS)
1858 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1859 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1860 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1865 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1867 // Query DCS archive
1868 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1870 for (int iServ=0; iServ<nServers; iServ++)
1873 TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1874 Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1875 Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1877 Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1878 " Querying DCS Amanda server %s:%d (%d of %d)",
1879 host.Data(), port, iServ+1, nServers));
1884 if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1886 aliasMap = GetValueSet(host, port,
1887 fConfig->GetDCSAliases(fCurrentDetector, iServ),
1888 kAlias, multiSplit);
1891 Log(fCurrentDetector,
1892 Form("ProcessCurrentDetector -"
1893 " Error retrieving DCS aliases from server %s."
1894 " Sending mail to DCS experts!", host.Data()));
1895 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1897 //if (!SendMailToDCS())
1898 // Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1905 if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1907 dpMap = GetValueSet(host, port,
1908 fConfig->GetDCSDataPoints(fCurrentDetector, iServ),
1912 Log(fCurrentDetector,
1913 Form("ProcessCurrentDetector -"
1914 " Error retrieving DCS data points from server %s."
1915 " Sending mail to DCS experts!", host.Data()));
1916 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1918 //if (!SendMailToDCS())
1919 // Log("SHUTTLE", Form("ProcessCurrentDetector - Could not send mail to DCS experts!"));
1921 if (aliasMap) delete aliasMap;
1927 // merge aliasMap and dpMap into dcsMap
1929 TIter iter(aliasMap);
1930 TObjString* key = 0;
1931 while ((key = (TObjString*) iter.Next()))
1932 dcsMap->Add(key, aliasMap->GetValue(key->String()));
1934 aliasMap->SetOwner(kFALSE);
1940 TObjString* key = 0;
1941 while ((key = (TObjString*) iter.Next()))
1942 dcsMap->Add(key, dpMap->GetValue(key->String()));
1944 dpMap->SetOwner(kFALSE);
1950 // save map into file, to help debugging in case of preprocessor error
1951 TFile* f = TFile::Open("DCSMap.root","recreate");
1953 dcsMap->Write("DCSMap", TObject::kSingleKey);
1957 // DCS Archive DB processing successful. Call Preprocessor!
1958 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1960 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1962 if (returnValue > 0) // Preprocessor error!
1964 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1965 "Preprocessor failed. Process returned %d.", returnValue));
1966 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1967 dcsMap->DeleteAll();
1973 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1974 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1975 fCurrentDetector.Data()));
1977 dcsMap->DeleteAll();
1983 //______________________________________________________________________________________________
1984 void AliShuttle::CountOpenRuns()
1986 // Query DAQ's Shuttle logbook and sends the number of open runs to ML
1988 // check connection, in case connect
1993 sqlQuery = Form("select count(*) from %s where shuttle_done=0", fConfig->GetShuttlelbTable());
1995 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1997 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
2001 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
2003 if (aResult->GetRowCount() == 0) {
2004 AliError(Form("No result for query %s received", sqlQuery.Data()));
2008 if (aResult->GetFieldCount() != 1) {
2009 AliError(Form("Invalid field count for query %s received", sqlQuery.Data()));
2013 TSQLRow* aRow = aResult->Next();
2015 AliError(Form("Could not receive result of query %s", sqlQuery.Data()));
2019 TString result(aRow->GetField(0), aRow->GetFieldLength(0));
2020 Int_t count = result.Atoi();
2022 Log("SHUTTLE", Form("%d unprocessed runs", count));
2027 TMonaLisaValue mlStatus("SHUTTLE_openruns", count);
2030 mlList.Add(&mlStatus);
2032 fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
2035 //______________________________________________________________________________________________
2036 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
2039 // Query DAQ's Shuttle logbook and fills detector status object.
2040 // Call QueryRunParameters to query DAQ logbook for run parameters.
2043 entries.SetOwner(1);
2045 // check connection, in case connect
2046 if (!Connect(3)) return kFALSE;
2049 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2051 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2053 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
2057 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
2059 if(aResult->GetRowCount() == 0) {
2060 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
2065 // TODO Check field count!
2066 const UInt_t nCols = 23;
2067 if (aResult->GetFieldCount() != (Int_t) nCols) {
2068 Log("SHUTTLE", "Invalid SQL result field number!");
2074 while ((aRow = aResult->Next())) {
2075 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
2076 Int_t run = runString.Atoi();
2078 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
2082 // loop on detectors
2083 for(UInt_t ii = 0; ii < nCols; ii++)
2084 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2086 entries.AddLast(entry);
2094 //______________________________________________________________________________________________
2095 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2098 // Retrieve run parameters written in the DAQ logbook and sets them into AliShuttleLogbookEntry object
2101 // check connection, in case connect
2106 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2108 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2110 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
2114 if (aResult->GetRowCount() == 0) {
2115 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2120 if (aResult->GetRowCount() > 1) {
2121 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2122 "more than one entry in DAQ Logbook for run %d!", run));
2127 TSQLRow* aRow = aResult->Next();
2130 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2135 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2137 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2138 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2140 UInt_t startTime = entry->GetStartTime();
2141 UInt_t endTime = entry->GetEndTime();
2143 // if (!startTime || !endTime || startTime > endTime)
2146 // Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d. Skipping!",
2147 // run, startTime, endTime));
2149 // Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2150 // fLogbookEntry = entry;
2151 // if (!UpdateShuttleLogbook("shuttle_done"))
2153 // AliError(Form("Could not update logbook for run %d !", run));
2155 // fLogbookEntry = 0;
2166 Form("QueryRunParameters - Invalid parameters for Run %d: "
2167 "startTime = %d, endTime = %d. Skipping!",
2168 run, startTime, endTime));
2170 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2171 fLogbookEntry = entry;
2172 if (!UpdateShuttleLogbook("shuttle_ignored"))
2174 AliError(Form("Could not update logbook for run %d !", run));
2184 if (startTime && !endTime)
2186 // TODO Here we don't mark SHUTTLE done, because this may mean
2187 //the run is still ongoing!!
2189 Form("QueryRunParameters - Invalid parameters for Run %d: "
2190 "startTime = %d, endTime = %d. Skipping (Shuttle won't be marked as DONE)!",
2191 run, startTime, endTime));
2193 //Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2194 //fLogbookEntry = entry;
2195 //if (!UpdateShuttleLogbook("shuttle_done"))
2197 // AliError(Form("Could not update logbook for run %d !", run));
2199 //fLogbookEntry = 0;
2207 if (startTime && endTime && (startTime > endTime))
2210 Form("QueryRunParameters - Invalid parameters for Run %d: "
2211 "startTime = %d, endTime = %d. Skipping!",
2212 run, startTime, endTime));
2214 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2215 fLogbookEntry = entry;
2216 if (!UpdateShuttleLogbook("shuttle_ignored"))
2218 AliError(Form("Could not update logbook for run %d !", run));
2228 TString totEventsStr = entry->GetRunParameter("totalEvents");
2229 Int_t totEvents = totEventsStr.Atoi();
2233 Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));
2235 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2236 fLogbookEntry = entry;
2237 if (!UpdateShuttleLogbook("shuttle_ignored"))
2239 AliError(Form("Could not update logbook for run %d !", run));
2255 //______________________________________________________________________________________________
2256 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2257 DCSType type, Int_t multiSplit)
2259 // Retrieve all "entry" data points from the DCS server
2260 // host, port: TSocket connection parameters
2261 // entries: list of name of the alias or data point
2262 // type: kAlias or kDP
2263 // returns TMap of values, 0 when failure
2265 AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2270 result = client.GetAliasValues(entries, GetCurrentStartTime(),
2271 GetCurrentEndTime());
2273 else if (type == kDP)
2275 result = client.GetDPValues(entries, GetCurrentStartTime(),
2276 GetCurrentEndTime());
2281 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2282 client.GetErrorString(client.GetResultErrorCode())));
2283 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)
2284 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2285 client.GetServerError().Data()));
2293 //______________________________________________________________________________________________
2294 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2295 const char* id, const char* source)
2297 // Get calibration file from file exchange servers
2298 // First queris the FXS database for the file name, using the run, detector, id and source info
2299 // then calls RetrieveFile(filename) for actual copy to local disk
2300 // run: current run being processed (given by Logbook entry fLogbookEntry)
2301 // detector: the Preprocessor name
2302 // id: provided as a parameter by the Preprocessor
2303 // source: provided by the Preprocessor through GetFileSources function
2305 // check if test mode should simulate a FXS error
2306 if (fTestMode & kErrorFXSFiles)
2308 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2312 // check connection, in case connect
2313 if (!Connect(system))
2315 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2319 // Query preparation
2320 TString sourceName(source);
2322 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2323 fConfig->GetFXSdbTable(system));
2324 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2325 GetCurrentRun(), detector, id);
2329 whereClause += Form(" and DAQsource=\"%s\"", source);
2331 else if (system == kDCS)
2335 else if (system == kHLT)
2337 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2341 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2343 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2346 TSQLResult* aResult = 0;
2347 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2349 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
2350 GetSystemName(system), id, sourceName.Data()));
2354 if(aResult->GetRowCount() == 0)
2357 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
2358 GetSystemName(system), id, sourceName.Data()));
2363 if (aResult->GetRowCount() > 1) {
2365 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
2366 GetSystemName(system), id, sourceName.Data()));
2371 if (aResult->GetFieldCount() != nFields) {
2373 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
2374 GetSystemName(system), id, sourceName.Data()));
2379 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2382 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
2383 GetSystemName(system), id, sourceName.Data()));
2388 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2389 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2390 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2395 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2396 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2398 // retrieved file is renamed to make it unique
2399 TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2400 GetShuttleTempDir(), detector, GetCurrentRun(),
2401 GetSystemName(system), detector, GetCurrentRun(),
2402 id, sourceName.Data());
2405 // file retrieval from FXS
2406 UInt_t nRetries = 0;
2407 UInt_t maxRetries = 3;
2408 Bool_t result = kFALSE;
2410 // copy!! if successful TSystem::Exec returns 0
2411 while(nRetries++ < maxRetries) {
2412 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
2413 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2416 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
2417 filePath.Data(), GetSystemName(system)));
2421 if (fileChecksum.Length()>0)
2423 // compare md5sum of local file with the one stored in the FXS DB
2424 Int_t md5Comp = gSystem->Exec(Form("md5sum %s |grep %s 2>&1 > /dev/null",
2425 localFileName.Data(), fileChecksum.Data()));
2429 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
2435 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2436 filePath.Data(), GetSystemName(system)));
2441 if(!result) return 0;
2443 fFXSCalled[system]=kTRUE;
2444 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2445 fFXSlist[system].Add(fileParams);
2447 static TString staticLocalFileName;
2448 staticLocalFileName.Form("%s", localFileName.Data());
2450 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2451 "source %s from %s to %s", id, source,
2452 GetSystemName(system), localFileName.Data()));
2454 return staticLocalFileName.Data();
2457 //______________________________________________________________________________________________
2458 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2461 // Copies file from FXS to local Shuttle machine
2464 // check temp directory: trying to cd to temp; if it does not exist, create it
2465 AliDebug(2, Form("Copy file %s from %s FXS into %s",
2466 GetSystemName(system), fxsFileName, localFileName));
2468 TString tmpDir(localFileName);
2470 tmpDir = tmpDir(0,tmpDir.Last('/'));
2472 Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
2473 if (noDir) // temp dir does not exists!
2475 if (gSystem->mkdir(tmpDir.Data(), 1))
2477 Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2482 TString baseFXSFolder;
2485 baseFXSFolder = "FES/";
2487 else if (system == kDCS)
2491 else if (system == kHLT)
2493 baseFXSFolder = "/opt/FXS/";
2497 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2498 fConfig->GetFXSPort(system),
2499 fConfig->GetFXSUser(system),
2500 fConfig->GetFXSHost(system),
2501 baseFXSFolder.Data(),
2505 AliDebug(2, Form("%s",command.Data()));
2507 Bool_t result = (gSystem->Exec(command.Data()) == 0);
2512 //______________________________________________________________________________________________
2513 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2516 // Get sources producing the condition file Id from file exchange servers
2517 // if id is NULL all sources are returned (distinct)
2522 Log(detector, Form("GetFileSources - Querying %s FXS for files with id %s produced by %s", GetSystemName(system), id, detector));
2524 Log(detector, Form("GetFileSources - Querying %s FXS for files produced by %s", GetSystemName(system), detector));
2527 // check if test mode should simulate a FXS error
2528 if (fTestMode & kErrorFXSSources)
2530 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2536 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2537 TList *list = new TList();
2539 list->Add(new TObjString(" "));
2543 // check connection, in case connect
2544 if (!Connect(system))
2546 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
2550 TString sourceName = 0;
2553 sourceName = "DAQsource";
2554 } else if (system == kHLT)
2556 sourceName = "DDLnumbers";
2559 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2560 TString whereClause = Form("run=%d and detector=\"%s\"",
2561 GetCurrentRun(), detector);
2563 whereClause += Form(" and fileId=\"%s\"", id);
2564 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2566 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2569 TSQLResult* aResult;
2570 aResult = fServer[system]->Query(sqlQuery);
2572 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2573 GetSystemName(system), id));
2577 TList *list = new TList();
2580 if (aResult->GetRowCount() == 0)
2583 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2588 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2591 while ((aRow = aResult->Next()))
2594 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2595 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2596 list->Add(new TObjString(source));
2605 //______________________________________________________________________________________________
2606 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2609 // Get all ids of condition files produced by a given source from file exchange servers
2612 Log(detector, Form("GetFileIDs - Retrieving ids with source %s with %s", source, GetSystemName(system)));
2614 // check if test mode should simulate a FXS error
2615 if (fTestMode & kErrorFXSSources)
2617 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2621 // check connection, in case connect
2622 if (!Connect(system))
2624 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2628 TString sourceName = 0;
2631 sourceName = "DAQsource";
2632 } else if (system == kHLT)
2634 sourceName = "DDLnumbers";
2637 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2638 TString whereClause = Form("run=%d and detector=\"%s\"",
2639 GetCurrentRun(), detector);
2640 if (sourceName.Length() > 0 && source)
2641 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2642 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2644 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2647 TSQLResult* aResult;
2648 aResult = fServer[system]->Query(sqlQuery);
2650 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2651 GetSystemName(system), source));
2655 TList *list = new TList();
2658 if (aResult->GetRowCount() == 0)
2661 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2666 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2670 while ((aRow = aResult->Next()))
2673 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2674 AliDebug(2, Form("fileId = %s", id.Data()));
2675 list->Add(new TObjString(id));
2684 //______________________________________________________________________________________________
2685 Bool_t AliShuttle::Connect(Int_t system)
2687 // Connect to MySQL Server of the system's FXS MySQL databases
2688 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2691 // check connection: if already connected return
2692 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2694 TString dbHost, dbUser, dbPass, dbName;
2696 if (system < 3) // FXS db servers
2698 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2699 dbUser = fConfig->GetFXSdbUser(system);
2700 dbPass = fConfig->GetFXSdbPass(system);
2701 dbName = fConfig->GetFXSdbName(system);
2702 } else { // Run & Shuttle logbook servers
2703 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2704 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2705 dbUser = fConfig->GetDAQlbUser();
2706 dbPass = fConfig->GetDAQlbPass();
2707 dbName = fConfig->GetDAQlbDB();
2710 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2711 if (!fServer[system] || !fServer[system]->IsConnected()) {
2714 AliError(Form("Can't establish connection to FXS database for %s",
2715 AliShuttleInterface::GetSystemName(system)));
2717 AliError("Can't establish connection to Run logbook.");
2719 if(fServer[system]) delete fServer[system];
2724 TSQLResult* aResult=0;
2727 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2730 aResult = fServer[kDCS]->GetTables(dbName.Data());
2733 aResult = fServer[kHLT]->GetTables(dbName.Data());
2736 aResult = fServer[3]->GetTables(dbName.Data());
2744 //______________________________________________________________________________________________
2745 Bool_t AliShuttle::UpdateTable()
2748 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2751 Bool_t result = kTRUE;
2753 for (UInt_t system=0; system<3; system++)
2755 if(!fFXSCalled[system]) continue;
2757 // check connection, in case connect
2758 if (!Connect(system))
2760 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2765 TTimeStamp now; // now
2767 // Loop on FXS list entries
2768 TIter iter(&fFXSlist[system]);
2769 TObjString *aFXSentry=0;
2770 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2772 TString aFXSentrystr = aFXSentry->String();
2773 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2774 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2776 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2777 GetSystemName(system), aFXSentrystr.Data()));
2778 if(aFXSarray) delete aFXSarray;
2782 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2783 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2785 TString whereClause;
2788 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2789 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2791 else if (system == kDCS)
2793 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2794 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2796 else if (system == kHLT)
2798 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2799 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2804 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2805 now.GetSec(), whereClause.Data());
2807 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2810 TSQLResult* aResult;
2811 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2814 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2815 GetSystemName(system), sqlQuery.Data()));
2826 //______________________________________________________________________________________________
2827 Bool_t AliShuttle::UpdateTableFailCase()
2829 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2830 // this is called in case the preprocessor is declared failed for the current run, because
2831 // the fields are updated only in case of success
2833 Bool_t result = kTRUE;
2835 for (UInt_t system=0; system<3; system++)
2837 // check connection, in case connect
2838 if (!Connect(system))
2840 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2841 GetSystemName(system)));
2846 TTimeStamp now; // now
2848 // Loop on FXS list entries
2850 TString whereClause = Form("where run=%d and detector=\"%s\";",
2851 GetCurrentRun(), fCurrentDetector.Data());
2854 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2855 now.GetSec(), whereClause.Data());
2857 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2860 TSQLResult* aResult;
2861 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2864 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2865 GetSystemName(system), sqlQuery.Data()));
2875 //______________________________________________________________________________________________
2876 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2879 // Update Shuttle logbook filling detector or shuttle_done column
2880 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2883 // check connection, in case connect
2885 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2889 TString detName(detector);
2891 if (detName == "shuttle_done" || detName == "shuttle_ignored")
2893 setClause = "set shuttle_done=1";
2895 if (detName == "shuttle_done")
2897 // Send the information to ML
2898 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2901 mlList.Add(&mlStatus);
2904 mlID.Form("%d", GetCurrentRun());
2905 fMonaLisa->SendParameters(&mlList, mlID);
2908 TString statusStr(status);
2909 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2910 statusStr.Contains("failed", TString::kIgnoreCase)){
2911 setClause = Form("set %s=\"%s\"", detector, status);
2914 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2920 TString whereClause = Form("where run=%d", GetCurrentRun());
2922 TString sqlQuery = Form("update %s %s %s",
2923 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2925 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2928 TSQLResult* aResult;
2929 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2931 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2939 //______________________________________________________________________________________________
2940 Int_t AliShuttle::GetCurrentRun() const
2943 // Get current run from logbook entry
2946 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2949 //______________________________________________________________________________________________
2950 UInt_t AliShuttle::GetCurrentStartTime() const
2953 // get current start time
2956 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2959 //______________________________________________________________________________________________
2960 UInt_t AliShuttle::GetCurrentEndTime() const
2963 // get current end time from logbook entry
2966 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2969 //______________________________________________________________________________________________
2970 UInt_t AliShuttle::GetCurrentYear() const
2973 // Get current year from logbook entry
2976 if (!fLogbookEntry) return 0;
2978 TTimeStamp startTime(GetCurrentStartTime());
2979 TString year = Form("%d",startTime.GetDate());
2985 //______________________________________________________________________________________________
2986 const char* AliShuttle::GetLHCPeriod() const
2989 // Get current LHC period from logbook entry
2992 if (!fLogbookEntry) return 0;
2994 return fLogbookEntry->GetRunParameter("LHCperiod");
2997 //______________________________________________________________________________________________
2998 void AliShuttle::Log(const char* detector, const char* message)
3001 // Fill log string with a message
3004 TString logRunDir = GetShuttleLogDir();
3005 if (GetCurrentRun() >=0)
3006 logRunDir += Form("/%d", GetCurrentRun());
3008 void* dir = gSystem->OpenDirectory(logRunDir.Data());
3010 if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
3011 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
3016 gSystem->FreeDirectory(dir);
3019 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
3020 if (GetCurrentRun() >= 0)
3021 toLog += Form("run %d - ", GetCurrentRun());
3022 toLog += Form("%s", message);
3024 AliInfo(toLog.Data());
3026 // if we redirect the log output already to the file, leave here
3027 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
3030 TString fileName = GetLogFileName(detector);
3032 gSystem->ExpandPathName(fileName);
3035 logFile.open(fileName, ofstream::out | ofstream::app);
3037 if (!logFile.is_open()) {
3038 AliError(Form("Could not open file %s", fileName.Data()));
3042 logFile << toLog.Data() << "\n";
3047 //______________________________________________________________________________________________
3048 TString AliShuttle::GetLogFileName(const char* detector) const
3051 // returns the name of the log file for a given sub detector
3056 if (GetCurrentRun() >= 0)
3058 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3059 detector, GetCurrentRun());
3061 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
3067 //______________________________________________________________________________________________
3068 void AliShuttle::SendAlive()
3070 // sends alive message to ML
3072 TMonaLisaText mlStatus("SHUTTLE_status", "Alive");
3075 mlList.Add(&mlStatus);
3077 fMonaLisa->SendParameters(&mlList, "__PROCESSINGINFO__");
3080 //______________________________________________________________________________________________
3081 Bool_t AliShuttle::Collect(Int_t run)
3084 // Collects conditions data for all UNPROCESSED run written to DAQ LogBook in case of run = -1 (default)
3085 // If a dedicated run is given this run is processed
3087 // In operational mode, this is the Shuttle function triggered by the EOR signal.
3091 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
3093 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
3095 SetLastAction("Starting");
3097 // create ML instance
3099 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable());
3104 TString whereClause("where shuttle_done=0");
3106 whereClause += Form(" and run=%d", run);
3108 TObjArray shuttleLogbookEntries;
3109 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
3111 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3115 if (shuttleLogbookEntries.GetEntries() == 0)
3118 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
3120 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
3121 "or it does not exist in Shuttle logbook", run));
3125 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3126 fFirstUnprocessed[iDet] = kTRUE;
3130 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
3131 // flag them into fFirstUnprocessed array
3132 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
3133 TObjArray tmpLogbookEntries;
3134 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
3136 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
3140 TIter iter(&tmpLogbookEntries);
3141 AliShuttleLogbookEntry* anEntry = 0;
3142 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
3144 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
3146 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
3148 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
3149 anEntry->GetRun(), GetDetName(iDet)));
3150 fFirstUnprocessed[iDet] = kFALSE;
3158 if (!RetrieveConditionsData(shuttleLogbookEntries))
3160 Log("SHUTTLE", "Collect - Process of at least one run failed");
3164 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
3168 //______________________________________________________________________________________________
3169 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
3172 // Retrieve conditions data for all runs that aren't processed yet
3175 Bool_t hasError = kFALSE;
3177 TIter iter(&dateEntries);
3178 AliShuttleLogbookEntry* anEntry;
3180 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
3181 if (!Process(anEntry)){
3185 // clean SHUTTLE temp directory
3186 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
3187 //RemoveFile(filename.Data());
3190 return hasError == kFALSE;
3193 //______________________________________________________________________________________________
3194 ULong_t AliShuttle::GetTimeOfLastAction() const
3197 // Gets time of last action
3202 fMonitoringMutex->Lock();
3204 tmp = fLastActionTime;
3206 fMonitoringMutex->UnLock();
3211 //______________________________________________________________________________________________
3212 const TString AliShuttle::GetLastAction() const
3215 // returns a string description of the last action
3220 fMonitoringMutex->Lock();
3224 fMonitoringMutex->UnLock();
3229 //______________________________________________________________________________________________
3230 void AliShuttle::SetLastAction(const char* action)
3233 // updates the monitoring variables
3236 fMonitoringMutex->Lock();
3238 fLastAction = action;
3239 fLastActionTime = time(0);
3241 fMonitoringMutex->UnLock();
3244 //______________________________________________________________________________________________
3245 const char* AliShuttle::GetRunParameter(const char* param)
3248 // returns run parameter read from DAQ logbook
3251 if(!fLogbookEntry) {
3252 AliError("No logbook entry!");
3256 return fLogbookEntry->GetRunParameter(param);
3259 //______________________________________________________________________________________________
3260 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
3263 // returns object from OCDB valid for current run
3266 if (fTestMode & kErrorOCDB)
3268 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
3272 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
3275 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
3279 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
3282 //______________________________________________________________________________________________
3283 Bool_t AliShuttle::SendMail()
3286 // sends a mail to the subdetector expert in case of preprocessor error
3289 if (fTestMode != kNone)
3293 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3294 TObjString *anExpert=0;
3295 while ((anExpert = (TObjString*) iterExperts.Next()))
3297 to += Form("%s,", anExpert->GetName());
3299 if (to.Length() > 0)
3300 to.Remove(to.Length()-1);
3301 AliDebug(2, Form("to: %s",to.Data()));
3304 Log("SHUTTLE", "List of detector responsibles not yet set!");
3308 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3311 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3313 Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
3318 gSystem->FreeDirectory(dir);
3321 TString bodyFileName;
3322 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3323 gSystem->ExpandPathName(bodyFileName);
3326 mailBody.open(bodyFileName, ofstream::out);
3328 if (!mailBody.is_open())
3330 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3334 TString cc="alberto.colla@cern.ch";
3336 TString subject = Form("%s Shuttle preprocessor FAILED in run %d (run type = %s)!",
3337 fCurrentDetector.Data(), GetCurrentRun(), GetRunType());
3338 AliDebug(2, Form("subject: %s", subject.Data()));
3340 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3341 body += Form("SHUTTLE just detected that your preprocessor "
3342 "failed processing run %d (run type = %s)!!\n\n",
3343 GetCurrentRun(), GetRunType());
3344 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3345 fCurrentDetector.Data());
3346 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3348 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3350 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD&time=168 \n\n");
3354 TString logFolder = "logs";
3355 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3356 logFolder += "_PROD";
3359 body += Form("Find the %s log for the current run on \n\n"
3360 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3361 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3362 fCurrentDetector.Data(), GetCurrentRun());
3363 body += Form("The last 10 lines of %s log file are following:\n\n", fCurrentDetector.Data());
3365 AliDebug(2, Form("Body begin: %s", body.Data()));
3367 mailBody << body.Data();
3369 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3371 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(),
3372 GetCurrentRun(), fCurrentDetector.Data(), GetCurrentRun());
3373 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3374 if (gSystem->Exec(tailCommand.Data()))
3376 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3379 TString endBody = Form("------------------------------------------------------\n\n");
3380 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3381 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3382 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3384 AliDebug(2, Form("Body end: %s", endBody.Data()));
3386 mailBody << endBody.Data();
3391 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3395 bodyFileName.Data());
3396 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3398 Bool_t result = gSystem->Exec(mailCommand.Data());
3403 //______________________________________________________________________________________________
3404 Bool_t AliShuttle::SendMailToDCS()
3407 // sends a mail to the DCS experts in case of DCS error
3410 if (fTestMode != kNone)
3413 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3416 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3418 Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3423 gSystem->FreeDirectory(dir);
3426 TString bodyFileName;
3427 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3428 gSystem->ExpandPathName(bodyFileName);
3431 mailBody.open(bodyFileName, ofstream::out);
3433 if (!mailBody.is_open())
3435 Log("SHUTTLE", Form("SendMailToDCS - Could not open mail body file %s", bodyFileName.Data()));
3439 TString to="Vladimir.Fekete@cern.ch, Svetozar.Kapusta@cern.ch";
3440 //TString to="alberto.colla@cern.ch";
3441 AliDebug(2, Form("to: %s",to.Data()));
3444 Log("SHUTTLE", "List of detector responsibles not yet set!");
3448 TString cc="alberto.colla@cern.ch";
3450 TString subject = Form("Retrieval of data points for %s FAILED in run %d !",
3451 fCurrentDetector.Data(), GetCurrentRun());
3452 AliDebug(2, Form("subject: %s", subject.Data()));
3454 TString body = Form("Dear DCS experts, \n\n");
3455 body += Form("SHUTTLE couldn\'t retrieve the data points for detector %s "
3456 "in run %d!!\n\n", fCurrentDetector.Data(), GetCurrentRun());
3457 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3458 fCurrentDetector.Data());
3459 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3461 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3463 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD?time=168 \n\n");
3466 TString logFolder = "logs";
3467 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3468 logFolder += "_PROD";
3471 body += Form("Find the %s log for the current run on \n\n"
3472 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3473 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3474 fCurrentDetector.Data(), GetCurrentRun());
3475 body += Form("The last 10 lines of %s log file are following:\n\n", fCurrentDetector.Data());
3477 AliDebug(2, Form("Body begin: %s", body.Data()));
3479 mailBody << body.Data();
3481 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3483 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3484 fCurrentDetector.Data(), GetCurrentRun());
3485 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3486 if (gSystem->Exec(tailCommand.Data()))
3488 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3491 TString endBody = Form("------------------------------------------------------\n\n");
3492 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
3493 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
3494 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3496 AliDebug(2, Form("Body end: %s", endBody.Data()));
3498 mailBody << endBody.Data();
3503 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3507 bodyFileName.Data());
3508 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3510 Bool_t result = gSystem->Exec(mailCommand.Data());
3515 //______________________________________________________________________________________________
3516 const char* AliShuttle::GetRunType()
3519 // returns run type read from "run type" logbook
3522 if(!fLogbookEntry) {
3523 AliError("No logbook entry!");
3527 return fLogbookEntry->GetRunType();
3530 //______________________________________________________________________________________________
3531 Bool_t AliShuttle::GetHLTStatus()
3533 // Return HLT status (ON=1 OFF=0)
3534 // Converts the HLT status from the status string read in the run logbook (not just a bool)
3536 if(!fLogbookEntry) {
3537 AliError("No logbook entry!");
3541 // TODO implement when HLTStatus is inserted in run logbook
3542 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
3543 //if(hltStatus == "OFF") {return kFALSE};
3548 //______________________________________________________________________________________________
3549 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
3552 // sets Shuttle temp directory
3555 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
3558 //______________________________________________________________________________________________
3559 void AliShuttle::SetShuttleLogDir(const char* logDir)
3562 // sets Shuttle log directory
3565 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);