/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *                                                                        *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 *                                                                        *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/
18 Revision 1.67 2007/12/07 19:14:36 acolla
21 Added automatic collection of new runs on a regular time basis (settable from the configuration)
23 in AliShuttleConfig: new members
25 - triggerWait: time to wait for DIM trigger (s) before starting automatic collection of new runs
26 - mode: run mode (test, prod) -> used to build log folder (logs or logs_PROD)
30 - logs now stored in logs/#RUN/DET_#RUN.log
32 Revision 1.66 2007/12/05 10:45:19 jgrosseo
33 changed order of arguments to TMonaLisaWriter
35 Revision 1.65 2007/11/26 16:58:37 acolla
36 Monalisa configuration added: host and table name
38 Revision 1.64 2007/11/13 16:15:47 acolla
39 DCS map is stored in a file in the temp folder where the detector is processed.
40 If the preprocessor fails, the temp folder is not removed. This will help the debugging of the problem.
42 Revision 1.63 2007/11/02 10:53:16 acolla
43 Protection added to AliShuttle::CopyFileLocally
Revision 1.62 2007/10/31 18:23:13 acolla
Further development on the Shuttle:
- Shuttle now connects to the Grid as alidaq. The OCDB and Reference folders
are now built from /alice/data, e.g.:
/alice/data/2007/LHC07a/OCDB
the year and LHC period are taken from the Shuttle.
Raw metadata files are stored by GRP to:
/alice/data/2007/LHC07a/<runNb>/Raw/RunMetadata.root
- Shuttle sends a mail to DCS experts each time DP retrieval fails.
Revision 1.61 2007/10/30 20:33:51 acolla
Improved management of temporary folders, which weren't correctly handled.
Resolved bug introduced in StoreReferenceFile, which caused the SPD preprocessor to fail.
62 Revision 1.60 2007/10/29 18:06:16 acolla
64 New function StoreRunMetadataFile added to preprocessor and Shuttle interface
65 This function can be used by GRP only. It stores raw data tags merged file to the
66 raw data folder (e.g. /alice/data/2008/LHC08a/000099999/Raw).
70 1. Shuttle cannot write to /alice/data/ because it belongs to alidaq. Tag file is stored in /alice/simulation/... for the time being.
71 2. Due to a bug in TAlien::Mkdir, the creation of a folder in recursive mode (-p option) does not work. The problem
72 has been corrected in the root package on the Shuttle machine.
74 Revision 1.59 2007/10/05 12:40:55 acolla
76 Result error code added to AliDCSClient data members (it was "lost" with the new implementation of TMap* GetAliasValues and GetDPValues).
78 Revision 1.58 2007/09/28 15:27:40 acolla
80 AliDCSClient "multiSplit" option added in the DCS configuration
81 in AliDCSMessage: variable MAX_BODY_SIZE set to 500000
83 Revision 1.57 2007/09/27 16:53:13 acolla
84 Detectors can have more than one AMANDA server. SHUTTLE queries the servers sequentially,
85 merges the dcs aliases/DPs in one TMap and sends it to the preprocessor.
87 Revision 1.56 2007/09/14 16:46:14 jgrosseo
88 1) Connect and Close are called before and after each query, so one can
89 keep the same AliDCSClient object.
90 2) The splitting of a query is moved to GetDPValues/GetAliasValues.
91 3) Splitting interval can be specified in constructor
93 Revision 1.55 2007/08/06 12:26:40 acolla
94 Function Bool_t GetHLTStatus added to preprocessor. It returns the status of HLT
95 read from the run logbook.
97 Revision 1.54 2007/07/12 09:51:25 jgrosseo
98 removed duplicated log message in GetFile
100 Revision 1.53 2007/07/12 09:26:28 jgrosseo
101 updating hlt fxs base path
103 Revision 1.52 2007/07/12 08:06:45 jgrosseo
104 adding log messages in getfile... functions
105 adding not implemented copy constructor in alishuttleconfigholder
107 Revision 1.51 2007/07/03 17:24:52 acolla
108 root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
110 Revision 1.50 2007/07/02 17:19:32 acolla
111 preprocessor is run in a temp directory that is removed when process is finished.
113 Revision 1.49 2007/06/29 10:45:06 acolla
114 Number of columns in MySql Shuttle logbook increased by one (HLT added)
116 Revision 1.48 2007/06/21 13:06:19 acolla
117 GetFileSources returns dummy list with 1 source if system=DCS (better than
118 returning error as it was)
120 Revision 1.47 2007/06/19 17:28:56 acolla
121 HLT updated; missing map bug removed.
123 Revision 1.46 2007/06/09 13:01:09 jgrosseo
124 Switching to retrieval of several DCS DPs at a time (multiDPrequest)
126 Revision 1.45 2007/05/30 06:35:20 jgrosseo
127 Adding functionality to the Shuttle/TestShuttle:
128 o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
129 o) Function to retrieve list of IDs for a given source (GetFileIDs)
130 These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
131 Example code has been added to the TestProcessor in TestShuttle
133 Revision 1.44 2007/05/11 16:09:32 acolla
134 Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
135 example: ITS/SPD/100_filename.root
137 Revision 1.43 2007/05/10 09:59:51 acolla
138 Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
140 Revision 1.42 2007/05/03 08:01:39 jgrosseo
141 typo in last commit :-(
143 Revision 1.41 2007/05/03 08:00:48 jgrosseo
144 fixing log message when pp want to skip dcs value retrieval
146 Revision 1.40 2007/04/27 07:06:48 jgrosseo
147 GetFileSources returns empty list in case of no files, but successful query
148 No mails sent in testmode
150 Revision 1.39 2007/04/17 12:43:57 acolla
151 Correction in StoreOCDB; change of text in mail to detector expert
153 Revision 1.38 2007/04/12 08:26:18 jgrosseo
156 Revision 1.37 2007/04/10 16:53:14 jgrosseo
157 redirecting sub detector stdout, stderr to sub detector log file
159 Revision 1.35 2007/04/04 16:26:38 acolla
160 1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
161 2. Added missing dependency in test preprocessors.
162 3. in AliShuttle.cxx: processing time and memory consumption info on a single line.
Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return type of these two functions has changed from UInt_t to Bool_t, which tells you whether the storing succeeded.
In case of an error with the Grid, the Shuttle will retry the storing later; the preprocessor does not need to be run again.
2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
If you always need DCS data (like before), you do not need to implement it.
6) The run type has been added to the monitoring page
179 Revision 1.33 2007/04/03 13:56:01 acolla
180 Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
183 Revision 1.32 2007/02/28 10:41:56 acolla
184 Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
185 AliPreprocessor::GetRunType() function.
186 Added some ldap definition files.
188 Revision 1.30 2007/02/13 11:23:21 acolla
189 Moved getters and setters of Shuttle's main OCDB/Reference, local
190 OCDB/Reference, temp and log folders to AliShuttleInterface
192 Revision 1.27 2007/01/30 17:52:42 jgrosseo
193 adding monalisa monitoring
195 Revision 1.26 2007/01/23 19:20:03 acolla
196 Removed old ldif files, added TOF, MCH ldif files. Added some options in
197 AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
200 Revision 1.25 2007/01/15 19:13:52 acolla
201 Moved some AliInfo to AliDebug in SendMail function
203 Revision 1.21 2006/12/07 08:51:26 jgrosseo
205 table, db names in ldap configuration
206 added GRP preprocessor
207 DCS data can also be retrieved by data point
209 Revision 1.20 2006/11/16 16:16:48 jgrosseo
210 introducing strict run ordering flag
211 removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
213 Revision 1.19 2006/11/06 14:23:04 jgrosseo
214 major update (Alberto)
215 o) reading of run parameters from the logbook
216 o) online offline naming conversion
217 o) standalone DCSclient package
219 Revision 1.18 2006/10/20 15:22:59 jgrosseo
220 o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
221 o) Merging Collect, CollectAll, CollectNew function
222 o) Removing implementation of empty copy constructors (declaration still there!)
224 Revision 1.17 2006/10/05 16:20:55 jgrosseo
225 adapting to new CDB classes
227 Revision 1.16 2006/10/05 15:46:26 jgrosseo
228 applying to the new interface
230 Revision 1.15 2006/10/02 16:38:39 jgrosseo
233 storing of objects that failed to be stored to the grid before
234 interfacing of shuttle status table in daq system
236 Revision 1.14 2006/08/29 09:16:05 jgrosseo
239 Revision 1.13 2006/08/15 10:50:00 jgrosseo
240 effc++ corrections (alberto)
242 Revision 1.12 2006/08/08 14:19:29 jgrosseo
243 Update to shuttle classes (Alberto)
245 - Possibility to set the full object's path in the Preprocessor's and
246 Shuttle's Store functions
247 - Possibility to extend the object's run validity in the same classes
248 ("startValidity" and "validityInfinite" parameters)
249 - Implementation of the StoreReferenceData function to store reference
250 data in a dedicated CDB storage.
252 Revision 1.11 2006/07/21 07:37:20 jgrosseo
253 last run is stored after each run
255 Revision 1.10 2006/07/20 09:54:40 jgrosseo
256 introducing status management: The processing per subdetector is divided into several steps,
257 after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
258 can keep track of the number of failures and skips further processing after a certain threshold is
259 exceeded. These thresholds can be configured in LDAP.
Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)
264 Revision 1.8 2006/07/11 12:44:36 jgrosseo
265 adding parameters for extended validity range of data produced by preprocessor
267 Revision 1.7 2006/07/10 14:37:09 jgrosseo
268 small fix + todo comment
Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)
273 Revision 1.5 2006/07/04 14:59:57 jgrosseo
274 revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
276 Revision 1.4 2006/06/12 09:11:16 jgrosseo
277 coding conventions (Alberto)
279 Revision 1.3 2006/06/06 14:26:40 jgrosseo
280 o) removed files that were moved to STEER
281 o) shuttle updated to follow the new interface (Alberto)
283 Revision 1.2 2006/03/07 07:52:34 hristov
284 New version (B.Yordanov)
286 Revision 1.6 2005/11/19 17:19:14 byordano
287 RetrieveDATEEntries and RetrieveConditionsData added
289 Revision 1.5 2005/11/19 11:09:27 byordano
290 AliShuttle declaration added
292 Revision 1.4 2005/11/17 17:47:34 byordano
293 TList changed to TObjArray
295 Revision 1.3 2005/11/17 14:43:23 byordano
298 Revision 1.1.1.1 2005/10/28 07:33:58 hristov
299 Initial import as subdirectory in AliRoot
301 Revision 1.2 2005/09/13 08:41:15 byordano
302 default startTime endTime added
304 Revision 1.4 2005/08/30 09:13:02 byordano
307 Revision 1.3 2005/08/29 21:15:47 byordano
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it is used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically in the underlying AliCDBStorage.
// The alias name is used as detSpec.
325 #include "AliShuttle.h"
327 #include "AliCDBManager.h"
328 #include "AliCDBStorage.h"
329 #include "AliCDBId.h"
330 #include "AliCDBRunRange.h"
331 #include "AliCDBPath.h"
332 #include "AliCDBEntry.h"
333 #include "AliShuttleConfig.h"
334 #include "DCSClient/AliDCSClient.h"
336 #include "AliPreprocessor.h"
337 #include "AliShuttleStatus.h"
338 #include "AliShuttleLogbookEntry.h"
343 #include <TTimeStamp.h>
344 #include <TObjString.h>
345 #include <TSQLServer.h>
346 #include <TSQLResult.h>
349 #include <TSystemDirectory.h>
350 #include <TSystemFile.h>
353 #include <TGridResult.h>
355 #include <TMonaLisaWriter.h>
359 #include <sys/types.h>
360 #include <sys/wait.h>
364 //______________________________________________________________________________________________
365 AliShuttle::AliShuttle(const AliShuttleConfig* config,
366 UInt_t timeout, Int_t retries):
368 fTimeout(timeout), fRetries(retries),
378 fReadTestMode(kFALSE),
379 fOutputRedirected(kFALSE)
382 // config: AliShuttleConfig used
383 // timeout: timeout used for AliDCSClient connection
384 // retries: the number of retries in case of connection error.
387 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
388 for(int iSys=0;iSys<4;iSys++) {
391 fFXSlist[iSys].SetOwner(kTRUE);
393 fPreprocessorMap.SetOwner(kTRUE);
395 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
396 fFirstUnprocessed[iDet] = kFALSE;
398 fMonitoringMutex = new TMutex();
401 //______________________________________________________________________________________________
402 AliShuttle::~AliShuttle()
408 fPreprocessorMap.DeleteAll();
409 for(int iSys=0;iSys<4;iSys++)
411 fServer[iSys]->Close();
412 delete fServer[iSys];
421 if (fMonitoringMutex)
423 delete fMonitoringMutex;
424 fMonitoringMutex = 0;
428 //______________________________________________________________________________________________
429 void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
// Registers a new AliPreprocessor.
// It uses GetName() as the identifier of the preprocessor.
// The preprocessor is registered only if no other preprocessor
// with the same identifier (GetName()) is already registered.
438 const char* detName = preprocessor->GetName();
439 if(GetDetPos(detName) < 0)
440 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
442 if (fPreprocessorMap.GetValue(detName)) {
443 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
447 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
449 //______________________________________________________________________________________________
450 Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
451 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
453 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
454 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
455 // using this function. Use StoreReferenceData instead!
456 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
457 // finishes the data are transferred to the main storage (Grid).
459 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
462 //______________________________________________________________________________________________
463 Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
// Stores a CDB object in the storage for reference data. These objects will not be available during
// offline reconstruction. Use this function for reference data only!
// It calls the StoreLocally function, which temporarily stores the data locally; when the preprocessor
// finishes, the data are transferred to the main storage (Grid).
470 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
473 //______________________________________________________________________________________________
474 Bool_t AliShuttle::StoreLocally(const TString& localUri,
475 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
476 Int_t validityStart, Bool_t validityInfinite)
// Stores the object temporarily in local storage. Parameters are passed by the Store and
// StoreReferenceData functions. When the preprocessor finishes, the data are transferred
// to the main storage (Grid).
// The parameters are:
// 1) URI of the backup storage (Local)
// 2) the object's path
// 3) the object to be stored
// 4) the metaData to be associated with the object
// 5) the validity start run number w.r.t. the current run;
//    if the data is valid only for this run, leave the default 0
// 6) specifies whether the calibration data is valid for infinity (this means until updated),
//    typical for calibration runs; the default is kFALSE
//
// Returns kFALSE on failure, kTRUE otherwise
492 if (fTestMode & kErrorStorage)
494 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
498 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
500 Int_t firstRun = GetCurrentRun() - validityStart;
502 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
507 if(validityInfinite) {
508 lastRun = AliCDBRunRange::Infinity();
510 lastRun = GetCurrentRun();
513 // Version is set to current run, it will be used later to transfer data to Grid
514 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
516 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
517 TObjString runUsed = Form("%d", GetCurrentRun());
518 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
521 Bool_t result = kFALSE;
523 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
524 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
526 result = AliCDBManager::Instance()->GetStorage(localUri)
527 ->Put(object, id, metaData);
532 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
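// The run-validity rule described in the StoreLocally comment can be sketched on its own.
// This is a minimal illustration, not the Shuttle's code: ValidityRange and kInfinity are
// hypothetical names, and kInfinity only stands in for AliCDBRunRange::Infinity(), whose
// actual value is not shown in this file.

```cpp
#include <cassert>
#include <limits>
#include <utility>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
constexpr int kInfinity = std::numeric_limits<int>::max();

// Returns {firstRun, lastRun} for an object stored while processing currentRun.
// validityStart counts runs backwards from the current run (0 = this run only).
std::pair<int, int> ValidityRange(int currentRun, int validityStart, bool validityInfinite)
{
    assert(currentRun >= 0);
    int firstRun = currentRun - validityStart;
    if (firstRun < 0) firstRun = 0;  // clamp, matching the warning issued in StoreLocally
    const int lastRun = validityInfinite ? kInfinity : currentRun;
    return {firstRun, lastRun};
}
```

// For example, ValidityRange(100, 5, false) covers runs 95 to 100, while a validityStart
// larger than the current run is clamped so that the range starts at run 0.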
538 //______________________________________________________________________________________________
539 Bool_t AliShuttle::StoreOCDB()
542 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
543 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
544 // Then calls StoreRefFilesToGrid to store reference files.
547 if (fTestMode & kErrorGrid)
549 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
550 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
554 Log("SHUTTLE","StoreOCDB - Storing OCDB data ...");
555 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
557 Log("SHUTTLE","StoreOCDB - Storing reference data ...");
558 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
560 Log("SHUTTLE","StoreOCDB - Storing reference files ...");
561 Bool_t resultRefFiles = CopyFilesToGrid("reference");
Bool_t resultMetadata = kTRUE;
if(fCurrentDetector == "GRP")
{
	Log("SHUTTLE","StoreOCDB - Storing Run Metadata file ...");
	resultMetadata = CopyFilesToGrid("metadata");
}

return resultCDB && resultRef && resultRefFiles && resultMetadata;
573 //______________________________________________________________________________________________
574 Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
577 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
580 TObjArray* gridIds=0;
582 Bool_t result = kTRUE;
584 const char* type = 0;
586 if(gridURI == fgkMainCDB) {
588 localURI = fgkLocalCDB;
589 } else if(gridURI == fgkMainRefStorage) {
591 localURI = fgkLocalRefStorage;
593 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
597 AliCDBManager* man = AliCDBManager::Instance();
599 AliCDBStorage *gridSto = man->GetStorage(gridURI);
602 Form("StoreOCDB - cannot activate main %s storage", type));
606 gridIds = gridSto->GetQueryCDBList();
608 // get objects previously stored in local CDB
609 AliCDBStorage *localSto = man->GetStorage(localURI);
612 Form("StoreOCDB - cannot activate local %s storage", type));
615 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
616 // Local objects were stored with current run as Grid version!
617 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
618 localEntries->SetOwner(1);
620 // loop on local stored objects
621 TIter localIter(localEntries);
622 AliCDBEntry *aLocEntry = 0;
623 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
624 aLocEntry->SetOwner(1);
625 AliCDBId aLocId = aLocEntry->GetId();
626 aLocEntry->SetVersion(-1);
627 aLocEntry->SetSubVersion(-1);
629 // If local object is valid up to infinity we store it only if it is
630 // the first unprocessed run!
631 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
632 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
634 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
635 "there are previous unprocessed runs!",
636 fCurrentDetector.Data(), aLocId.GetPath().Data()));
640 // loop on Grid valid Id's
641 Bool_t store = kTRUE;
642 TIter gridIter(gridIds);
643 AliCDBId* aGridId = 0;
644 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
645 if(aGridId->GetPath() != aLocId.GetPath()) continue;
646 // skip all objects valid up to infinity
647 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
648 // if we get here, it means there's already some more recent object stored on Grid!
653 // If we get here, the file can be stored!
654 Bool_t storeOk = gridSto->Put(aLocEntry);
655 if(!store || storeOk){
659 Log(fCurrentDetector.Data(),
660 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
661 type, aGridId->ToString().Data()));
664 Form("StoreOCDB - Object <%s> successfully put into %s storage",
665 aLocId.ToString().Data(), type));
666 Log(fCurrentDetector.Data(),
667 Form("StoreOCDB - Object <%s> successfully put into %s storage",
668 aLocId.ToString().Data(), type));
671 // removing local filename...
673 localSto->IdToFilename(aLocId, filename);
674 Log("SHUTTLE", Form("StoreOCDB - Removing local file %s", filename.Data()));
675 RemoveFile(filename.Data());
679 Form("StoreOCDB - Grid %s storage of object <%s> failed",
680 type, aLocId.ToString().Data()));
681 Log(fCurrentDetector.Data(),
682 Form("StoreOCDB - Grid %s storage of object <%s> failed",
683 type, aLocId.ToString().Data()));
687 localEntries->Clear();
692 //______________________________________________________________________________________________
693 Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
695 // clears the directory used to store reference files of a given subdetector
697 AliCDBManager* man = AliCDBManager::Instance();
698 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
699 TString localBaseFolder = sto->GetBaseFolder();
701 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
703 Log("SHUTTLE", Form("CleanReferenceStorage - Cleaning %s", targetDir.Data()));
706 begin.Form("%d_", GetCurrentRun());
708 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
712 TList* dirList = baseDir->GetListOfFiles();
715 if (!dirList) return kTRUE;
717 if (dirList->GetEntries() < 3)
723 Int_t nDirs = 0, nDel = 0;
724 TIter dirIter(dirList);
725 TSystemFile* entry = 0;
727 Bool_t success = kTRUE;
729 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
731 if (entry->IsDirectory())
734 TString fileName(entry->GetName());
735 if (!fileName.BeginsWith(begin))
741 Int_t result = gSystem->Unlink(fileName.Data());
745 Log("SHUTTLE", Form("CleanReferenceStorage - Could not delete file %s!", fileName.Data()));
753 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
754 nDel, nDirs, targetDir.Data()));
765 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
769 result = gSystem->Exec(Form("rm -rf %s", targetDir.Data()));
772 Log("SHUTTLE", Form("CleanReferenceStorage - Could not clean directory %s", targetDir.Data()));
777 result = gSystem->mkdir(targetDir, kTRUE);
780 Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
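// CleanReferenceStorage and CopyFilesToGrid both select the files of the current run by
// the "<RUN#>_" name prefix. The selection can be sketched without ROOT as below; FilesOfRun
// is a hypothetical helper written for illustration, using std::string in place of TString.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Keeps only the file names that start with "<run>_".
std::vector<std::string> FilesOfRun(const std::vector<std::string>& names, int run)
{
    assert(run >= 0);
    const std::string begin = std::to_string(run) + "_";
    std::vector<std::string> selected;
    for (const std::string& name : names) {
        // compare() on the leading characters plays the role of TString::BeginsWith
        if (name.compare(0, begin.size(), begin) == 0) selected.push_back(name);
    }
    return selected;
}
```

// The trailing underscore matters: without it, the prefix for run 100 would also match
// files that belong to run 1000.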
787 //______________________________________________________________________________________________
788 Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
// Stores reference file directly (without opening it). This function stores the file locally.
//
// The file is stored under the following location:
// <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
// where <gridFileName> is the third parameter given to the function
798 if (fTestMode & kErrorStorage)
800 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
804 AliCDBManager* man = AliCDBManager::Instance();
805 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
807 TString localBaseFolder = sto->GetBaseFolder();
809 TString target = GetRefFilePrefix(localBaseFolder.Data(), detector);
810 target.Append(Form("/%d_%s", GetCurrentRun(), gridFileName));
812 return CopyFileLocally(localFile, target);
815 //______________________________________________________________________________________________
816 Bool_t AliShuttle::StoreRunMetadataFile(const char* localFile, const char* gridFileName)
819 // Stores Run metadata file to the Grid, in the run folder
821 // Only GRP can call this function.
823 if (fTestMode & kErrorStorage)
825 Log(fCurrentDetector, "StoreRunMetaDataFile - In TESTMODE - Simulating error while storing locally");
829 AliCDBManager* man = AliCDBManager::Instance();
830 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
832 TString localBaseFolder = sto->GetBaseFolder();
834 // Build Run level folder
835 // folder = /alice/data/year/lhcPeriod/runNb/Raw
838 TString lhcPeriod = GetLHCPeriod();
839 if (lhcPeriod.Length() == 0)
841 Log("SHUTTLE","StoreRunMetaDataFile - LHCPeriod not found in logbook!");
845 TString target = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw/%s",
846 localBaseFolder.Data(), GetCurrentYear(),
847 lhcPeriod.Data(), GetCurrentRun(), gridFileName);
849 return CopyFileLocally(localFile, target);
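// The run-level folder layout used above (and reused by CopyFilesToGrid to derive the AliEn
// folder) can be sketched as follows. LocalMetadataDir and GridDirFromLocal are hypothetical
// helpers for illustration only; they use std::string and snprintf in place of TString::Form.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Builds <localBase>/GRP/RunMetadata/alice/data/<year>/<lhcPeriod>/<run, 9 digits>/Raw.
std::string LocalMetadataDir(const std::string& localBase, int year,
                             const std::string& lhcPeriod, int run)
{
    assert(year >= 0 && run >= 0);
    char buf[512];  // assumed large enough for this sketch; longer paths would be truncated
    std::snprintf(buf, sizeof(buf), "%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
                  localBase.c_str(), year, lhcPeriod.c_str(), run);
    return buf;
}

// Returns the part of the local path starting at "/alice/data/", or an empty
// string if that marker is not present.
std::string GridDirFromLocal(const std::string& localDir)
{
    const std::string::size_type pos = localDir.find("/alice/data/");
    return pos == std::string::npos ? std::string() : localDir.substr(pos);
}
```

// Keeping the Grid path embedded in the local path means the AliEn target folder can be
// recovered from the local folder alone, without storing it separately.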
852 //______________________________________________________________________________________________
853 Bool_t AliShuttle::CopyFileLocally(const char* localFile, const TString& target)
856 // Stores file locally. Called by StoreReferenceFile and StoreRunMetadataFile
857 // Files are temporarily stored in the local reference storage. When the preprocessor
858 // finishes, the Shuttle calls CopyFilesToGrid to transfer the files to AliEn
859 // (in reference or run level folders)
862 TString targetDir(target(0, target.Last('/')));
// try to open the target folder; create it if it does not exist
865 void* dir = gSystem->OpenDirectory(targetDir.Data());
867 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
868 Log("SHUTTLE", Form("StoreFileLocally - Can't open directory <%s>", targetDir.Data()));
873 gSystem->FreeDirectory(dir);
878 result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
881 Log("SHUTTLE", Form("StoreFileLocally - %s does not exist", localFile));
885 result = gSystem->GetPathInfo(target, 0, (Long64_t*) 0, 0, 0);
888 Log("SHUTTLE", Form("StoreFileLocally - target file %s already exist, removing...", target.Data()));
889 if (gSystem->Unlink(target.Data()))
891 Log("SHUTTLE", Form("StoreFileLocally - Could not remove existing target file %s!", target.Data()));
896 result = gSystem->CopyFile(localFile, target);
900 Log("SHUTTLE", Form("StoreFileLocally - File %s stored locally to %s", localFile, target.Data()));
905 Log("SHUTTLE", Form("StoreFileLocally - Could not store file %s to %s! Error code = %d",
906 localFile, target.Data(), result));
914 //______________________________________________________________________________________________
915 Bool_t AliShuttle::CopyFilesToGrid(const char* type)
// Transfers local files to the Grid. Local files can be reference files
// or the run metadata file (from GRP only).
//
// According to the type (reference, metadata), the files are stored under the following location:
// reference --> <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
// metadata --> <run data folder>/<MetadataFileName>
926 AliCDBManager* man = AliCDBManager::Instance();
927 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
930 TString localBaseFolder = sto->GetBaseFolder();
936 if (strcmp(type, "reference") == 0)
938 dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
939 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
942 TString gridBaseFolder = gridSto->GetBaseFolder();
943 alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
944 begin = Form("%d_", GetCurrentRun());
946 else if (strcmp(type, "metadata") == 0)
949 TString lhcPeriod = GetLHCPeriod();
951 if (lhcPeriod.Length() == 0)
953 Log("SHUTTLE","CopyFilesToGrid - LHCPeriod not found in logbook!");
957 dir = Form("%s/GRP/RunMetadata/alice/data/%d/%s/%09d/Raw",
958 localBaseFolder.Data(), GetCurrentYear(),
959 lhcPeriod.Data(), GetCurrentRun());
960 alienDir = dir(dir.Index("/alice/data/"), dir.Length());
966 Log("SHUTTLE", "CopyFilesToGrid - Unexpected: type label must be reference or metadata!");
970 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
974 TList* dirList = baseDir->GetListOfFiles();
977 if (!dirList) return kTRUE;
979 if (dirList->GetEntries() < 3)
987 Log("SHUTTLE", "CopyFilesToGrid - Connection to Grid failed: Cannot continue!");
992 Int_t nDirs = 0, nTransfer = 0;
993 TIter dirIter(dirList);
994 TSystemFile* entry = 0;
996 Bool_t success = kTRUE;
997 Bool_t first = kTRUE;
999 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
1001 if (entry->IsDirectory())
1004 TString fileName(entry->GetName());
1005 if (!fileName.BeginsWith(begin))
1013 // check that folder exists, otherwise create it
1014 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
1022 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
// TODO It does not work currently! Bug in TAlien::Mkdir
// TODO Manually fixed in local root v5-16-00
1026 if (!gGrid->Mkdir(alienDir.Data(),"-p",0))
1028 Log("SHUTTLE", Form("CopyFilesToGrid - Cannot create directory %s",
1033 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s created", alienDir.Data()));
1037 Log("SHUTTLE",Form("CopyFilesToGrid - Folder %s found", alienDir.Data()));
1041 TString fullLocalPath;
1042 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
1044 TString fullGridPath;
1045 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
1047 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
1051 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s succeeded!",
1052 fullLocalPath.Data(), fullGridPath.Data()));
1053 RemoveFile(fullLocalPath);
1058 Log("SHUTTLE", Form("CopyFilesToGrid - Copying local file %s to %s FAILED!",
1059 fullLocalPath.Data(), fullGridPath.Data()));
1064 Log("SHUTTLE", Form("CopyFilesToGrid - %d (over %d) files in folder %s copied to Grid.",
1065 nTransfer, nDirs, dir.Data()));
1072 //______________________________________________________________________________________________
1073 const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
1076 // Get folder name of reference files
1079 TString offDetStr(GetOfflineDetName(detector));
1081 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
1083 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
1085 dir.Form("%s/%s", base, offDetStr.Data());
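/*
Illustrative sketch, not part of AliShuttle: the folder rule used by GetRefFilePrefix,
restated with std::string so it can be tried without ROOT. The name RefFilePrefixSketch is
hypothetical, and the offline detector name is passed in directly as an assumption instead of
being looked up through GetOfflineDetName.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-alone helper mirroring the rule above:
// ITS, MUON and PHOS get an extra sub-folder named after the online detector,
// every other offline detector uses base/offlineDet only.
std::string RefFilePrefixSketch(const std::string& base,
                                const std::string& offlineDet,
                                const std::string& detector)
{
    if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS")
        return base + "/" + offlineDet + "/" + detector;
    return base + "/" + offlineDet;
}
```

With these inputs, ("/ref", "ITS", "SPD") gives "/ref/ITS/SPD", while ("/ref", "TPC", "TPC")
gives "/ref/TPC".
*/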
1093 //______________________________________________________________________________________________
1094 void AliShuttle::CleanLocalStorage(const TString& uri)
1097 // Called when the preprocessor is declared failed. Removes the remaining objects from the given local storage.
1100 const char* type = 0;
1101 if(uri == fgkLocalCDB) {
1103 } else if(uri == fgkLocalRefStorage) {
1106 AliError(Form("Invalid storage URI: %s", uri.Data()));
1110 AliCDBManager* man = AliCDBManager::Instance();
1112 // open local storage
1113 AliCDBStorage *localSto = man->GetStorage(uri);
1116 Form("CleanLocalStorage - cannot activate local %s storage", type));
1120 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
1121 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
1123 AliDebug(2, Form("filename = %s", filename.Data()));
1125 Log("SHUTTLE", Form("Removing remaining local files for run %d and detector %s ...",
1126 GetCurrentRun(), fCurrentDetector.Data()));
1128 RemoveFile(filename.Data());
1132 //______________________________________________________________________________________________
1133 void AliShuttle::RemoveFile(const char* filename)
1136 // removes local file
1139 TString command(Form("rm -f %s", filename));
1141 Int_t result = gSystem->Exec(command.Data());
1144 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
1145 fCurrentDetector.Data(), filename));
1149 //______________________________________________________________________________________________
1150 AliShuttleStatus* AliShuttle::ReadShuttleStatus()
1153 // Reads the AliShuttleStatus from the CDB
1157 delete fStatusEntry;
1161 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
1162 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
1164 if (!fStatusEntry) return 0;
1165 fStatusEntry->SetOwner(1);
1167 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1169 AliError("Invalid object stored to CDB!");
1176 //______________________________________________________________________________________________
1177 Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
1180 // writes the status for one subdetector
1184 delete fStatusEntry;
1188 Int_t run = GetCurrentRun();
1190 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
1192 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1193 fStatusEntry->SetOwner(1);
1195 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1198 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1199 fCurrentDetector.Data(), run));
1208 //______________________________________________________________________________________________
1209 void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1212 // changes the AliShuttleStatus for the given detector and run to the given status
1216 AliError("UNEXPECTED: fStatusEntry empty");
1220 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1223 Log("SHUTTLE", "UpdateShuttleStatus - UNEXPECTED: status could not be read from current CDB entry");
1227 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
1228 fCurrentDetector.Data(),
1229 status->GetStatusName(),
1230 status->GetStatusName(newStatus));
1231 Log("SHUTTLE", actionStr);
1232 SetLastAction(actionStr);
1234 status->SetStatus(newStatus);
1235 if (increaseCount) status->IncreaseCount();
1237 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
1242 //______________________________________________________________________________________________
1243 void AliShuttle::SendMLInfo()
1246 // sends ML information about the current status of the current detector being processed
1249 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1252 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
1256 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1257 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1260 mlList.Add(&mlStatus);
1261 mlList.Add(&mlRetryCount);
1263 fMonaLisa->SendParameters(&mlList);
1266 //______________________________________________________________________________________________
1267 Bool_t AliShuttle::ContinueProcessing()
1269 // Reads the AliShuttleStatus information from the CDB and
1270 // checks whether processing should continue.
1271 // If so, returns kTRUE and updates the AliShuttleStatus accordingly.
1273 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1275 AliPreprocessor* aPreprocessor =
1276 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1279 Log("SHUTTLE", Form("ContinueProcessing - %s: no preprocessor registered", fCurrentDetector.Data()));
1283 AliShuttleLogbookEntry::Status entryStatus =
1284 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
1286 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
1287 Log("SHUTTLE", Form("ContinueProcessing - %s is %s",
1288 fCurrentDetector.Data(),
1289 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1293 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
1295 // check if current run is first unprocessed run for current detector
1296 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1297 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1299 if (fTestMode == kNone)
1301 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering"
1302 " but this is not the first unprocessed run!", fCurrentDetector.Data()));
1307 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - "
1308 "Although %s requires strict run ordering "
1309 "and this is not the first unprocessed run, "
1310 "the SHUTTLE continues", fCurrentDetector.Data()));
1314 AliShuttleStatus* status = ReadShuttleStatus();
1317 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1318 fCurrentDetector.Data()));
1319 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1320 return WriteShuttleStatus(status);
1323 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1324 // If it happens it may mean Logbook updating failed... let's do it now!
1325 if (status->GetStatus() == AliShuttleStatus::kDone ||
1326 status->GetStatus() == AliShuttleStatus::kFailed){
1327 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1328 fCurrentDetector.Data(),
1329 status->GetStatusName(status->GetStatus())));
1330 UpdateShuttleLogbook(fCurrentDetector.Data(),
1331 status->GetStatusName(status->GetStatus()));
1335 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
1337 Form("ContinueProcessing - %s: Grid storage of one or more "
1338 "objects failed. Trying again now",
1339 fCurrentDetector.Data()));
1340 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1342 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects "
1343 "successfully stored into main storage",
1344 fCurrentDetector.Data()));
1347 Form("ContinueProcessing - %s: Grid storage failed again",
1348 fCurrentDetector.Data()));
1349 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1354 // if we get here, there is a restart
1355 Bool_t cont = kFALSE;
1358 if (status->GetCount() >= fConfig->GetMaxRetries()) {
1359 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1360 "Updating Shuttle Logbook", fCurrentDetector.Data(),
1361 status->GetCount(), status->GetStatusName()));
1362 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
1363 UpdateShuttleStatus(AliShuttleStatus::kFailed);
1365 // there may still be objects in local OCDB and reference storage
1366 // and FXS databases may be not updated: do it now!
1368 // TODO Currently disabled, we want to keep files in case of failure!
1369 // CleanLocalStorage(fgkLocalCDB);
1370 // CleanLocalStorage(fgkLocalRefStorage);
1371 // UpdateTableFailCase();
1373 // Send mail to detector expert!
1374 Log("SHUTTLE", Form("ContinueProcessing - Sending mail to %s expert...",
1375 fCurrentDetector.Data()));
1377 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1378 fCurrentDetector.Data()));
1381 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1382 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1383 status->GetStatusName(), status->GetCount()));
1384 Bool_t increaseCount = kTRUE;
1385 if (status->GetStatus() == AliShuttleStatus::kDCSError ||
1386 status->GetStatus() == AliShuttleStatus::kDCSStarted)
1387 increaseCount = kFALSE;
1389 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
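/*
Illustrative sketch, not part of AliShuttle: the restart branch at the end of
ContinueProcessing reduced to a pure function. SketchStatus, RestartDecision and DecideRestart
are hypothetical names. Only the comparison against the maximum retry count and the exception
for the two DCS states are taken from the code above.

```cpp
#include <cassert>

// Hypothetical subset of states, enough to show the rule.
enum class SketchStatus { kStarted, kDCSStarted, kDCSError, kPPStarted, kPPError };

struct RestartDecision {
    bool restart;        // false: retries exhausted, the detector is marked FAILED
    bool increaseCount;  // only meaningful when restart is true
};

RestartDecision DecideRestart(SketchStatus last, int count, int maxRetries)
{
    // Too many attempts already: give up.
    if (count >= maxRetries) return {false, false};

    // Aborts during the DCS phase do not count as a retry.
    const bool dcsPhase = (last == SketchStatus::kDCSError ||
                           last == SketchStatus::kDCSStarted);
    return {true, !dcsPhase};
}
```

Keeping the decision free of logging and CDB access would make it easy to unit-test. Whether
such a split suits the real class is a design choice left open here.
*/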
1396 //______________________________________________________________________________________________
1397 Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
1400 // Retrieves data for all detectors in the configuration.
1401 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1402 // (Unprocessed, Inactive, Failed or Done).
1403 // Returns kFALSE if an error occurred and kTRUE otherwise
1406 if (!entry) return kFALSE;
1408 fLogbookEntry = entry;
1410 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^*",
1413 // create ML instance that monitors this run
1414 fMonaLisa = new TMonaLisaWriter(fConfig->GetMonitorHost(), fConfig->GetMonitorTable(), Form("%d", GetCurrentRun()));
1416 // Send the information to ML
1417 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
1418 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
1421 mlList.Add(&mlStatus);
1422 mlList.Add(&mlRunType);
1424 fMonaLisa->SendParameters(&mlList);
1426 if (fLogbookEntry->IsDone())
1428 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1429 UpdateShuttleLogbook("shuttle_done");
1434 // read test mode if flag is set
1438 TString logEntry(entry->GetRunParameter("log"));
1439 //printf("log entry = %s\n", logEntry.Data());
1440 TString searchStr("Testmode: ");
1441 Int_t pos = logEntry.Index(searchStr.Data());
1442 //printf("%d\n", pos);
1445 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1446 //printf("%s\n", subStr.String().Data());
1447 TString newStr(subStr.Data());
1448 TObjArray* token = newStr.Tokenize(' ');
1452 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1455 Int_t testMode = tmpStr->String().Atoi();
1458 Log("SHUTTLE", Form("Process - Enabling test mode %d", testMode));
1459 SetTestMode((TestMode) testMode);
1467 fLogbookEntry->Print("all");
1470 Bool_t hasError = kFALSE;
1472 // Set the CDB and Reference folders according to the year and LHC period
1473 TString lhcPeriod(GetLHCPeriod());
1474 if (lhcPeriod.Length() == 0)
1476 Log("SHUTTLE","Process - LHCPeriod not found in logbook!");
1480 if (fgkMainCDB.Length() == 0)
1481 fgkMainCDB = Form("alien://folder=/alice/data/%d/%s/OCDB?user=alidaq?cacheFold=/tmp/OCDBCache",
1482 GetCurrentYear(), lhcPeriod.Data());
1484 if (fgkMainRefStorage.Length() == 0)
1485 fgkMainRefStorage = Form("alien://folder=/alice/data/%d/%s/Reference?user=alidaq?cacheFold=/tmp/OCDBCache",
1486 GetCurrentYear(), lhcPeriod.Data());
1488 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1489 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1490 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1491 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
1493 // Loop on detectors in the configuration
1494 TIter iter(fConfig->GetDetectors());
1495 TObjString* aDetector = 0;
1497 while ((aDetector = (TObjString*) iter.Next()))
1499 fCurrentDetector = aDetector->String();
1501 if (ContinueProcessing() == kFALSE) continue;
1503 Log("SHUTTLE", Form("\t\t\t****** run %d - %s: START ******",
1504 GetCurrentRun(), aDetector->GetName()));
1506 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1508 Log(fCurrentDetector.Data(), "Process - Starting processing");
1514 Log("SHUTTLE", "Process - ERROR: Forking failed");
1519 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Starting monitoring",
1520 GetCurrentRun(), aDetector->GetName()));
1522 Long_t begin = time(0);
1524 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1525 while (waitpid(pid, &status, WNOHANG) == 0)
1527 Long_t expiredTime = time(0) - begin;
1529 if (expiredTime > fConfig->GetPPTimeOut())
1532 tmp.Form("Process - Process of %s time out. "
1533 "Run time: %ld seconds. Killing...",
1534 fCurrentDetector.Data(), expiredTime);
1535 Log("SHUTTLE", tmp);
1536 Log(fCurrentDetector, tmp);
1540 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
1543 gSystem->Sleep(1000);
1547 gSystem->Sleep(1000);
1550 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1551 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1554 Log("SHUTTLE", Form("Process - Error: "
1555 "Could not open pipe to %s", checkStr.Data()));
1560 if (!fgets(buffer, 100, pipe))
1562 Log("SHUTTLE", "Process - Error: ps did not return anything");
1563 gSystem->ClosePipe(pipe);
1566 gSystem->ClosePipe(pipe);
1568 //Log("SHUTTLE", Form("ps returned %s", buffer));
1571 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1573 Log("SHUTTLE", "Process - Error: Could not parse output of ps");
1577 if (expiredTime % 60 == 0)
1578 Log("SHUTTLE", Form("Process - %s: Checking process. "
1579 "Run time: %ld seconds - Memory consumption: %d KB",
1580 fCurrentDetector.Data(), expiredTime, mem));
1582 if (mem > fConfig->GetPPMaxMem())
1585 tmp.Form("Process - Process exceeds maximum allowed memory "
1586 "(%d KB > %d KB). Killing...",
1587 mem, fConfig->GetPPMaxMem());
1588 Log("SHUTTLE", tmp);
1589 Log(fCurrentDetector, tmp);
1593 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1596 gSystem->Sleep(1000);
1601 Log("SHUTTLE", Form("Process - In parent process of %d - %s: Client has terminated.",
1602 GetCurrentRun(), aDetector->GetName()));
1604 if (WIFEXITED(status))
1606 Int_t returnCode = WEXITSTATUS(status);
1608 Log("SHUTTLE", Form("Process - %s: the return code is %d", fCurrentDetector.Data(),
1611 if (returnCode == 0) hasError = kTRUE; // the client exits with its success flag, so 0 means failure
1617 Log("SHUTTLE", Form("Process - In client process of %d - %s", GetCurrentRun(),
1618 aDetector->GetName()));
1620 Log("SHUTTLE", Form("Process - Redirecting output to %s log",fCurrentDetector.Data()));
1622 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
1624 Log("SHUTTLE", "Process - Could not freopen stdout");
1628 fOutputRedirected = kTRUE;
1629 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1630 Log("SHUTTLE", "Process - Could not redirect stderr");
1634 TString wd = gSystem->WorkingDirectory();
1635 TString tmpDir = Form("%s/%s_%d_process", GetShuttleTempDir(),
1636 fCurrentDetector.Data(), GetCurrentRun());
1638 Int_t result = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
1639 if (!result) // temp dir already exists!
1641 Log(fCurrentDetector.Data(),
1642 Form("Process - %s dir already exists! Removing...", tmpDir.Data()));
1643 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1646 if (gSystem->mkdir(tmpDir.Data(), 1))
1648 Log(fCurrentDetector.Data(), "Process - could not make temp directory!!");
1652 if (!gSystem->ChangeDirectory(tmpDir.Data()))
1654 Log(fCurrentDetector.Data(), "Process - could not change directory!!");
1658 Bool_t success = ProcessCurrentDetector();
1660 gSystem->ChangeDirectory(wd.Data());
1662 if (success) // Preprocessor finished successfully!
1664 // remove temporary folder
1665 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1667 // Update time_processed field in FXS DB
1668 if (UpdateTable() == kFALSE)
1669 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1670 fCurrentDetector.Data()));
1672 // Transfer the data from local storage to main storage (Grid)
1673 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1674 if (StoreOCDB() == kFALSE)
1677 Form("\t\t\t****** run %d - %s: STORAGE ERROR ******",
1678 GetCurrentRun(), aDetector->GetName()));
1679 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
1683 Form("\t\t\t****** run %d - %s: DONE ******",
1684 GetCurrentRun(), aDetector->GetName()));
1685 UpdateShuttleStatus(AliShuttleStatus::kDone);
1686 UpdateShuttleLogbook(fCurrentDetector, "DONE");
1691 Form("\t\t\t****** run %d - %s: PP ERROR ******",
1692 GetCurrentRun(), aDetector->GetName()));
1695 for (UInt_t iSys=0; iSys<3; iSys++)
1697 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1700 Log("SHUTTLE", Form("Process - Client process of %d - %s is exiting now with %d.",
1701 GetCurrentRun(), aDetector->GetName(), success));
1703 // the client exits here
1704 gSystem->Exit(success);
1706 AliError("We should never get here!!!");
1710 Log("SHUTTLE", Form("\t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^*",
1713 //check if shuttle is done for this run, if so update logbook
1714 TObjArray checkEntryArray;
1715 checkEntryArray.SetOwner(1);
1716 TString whereClause = Form("where run=%d", GetCurrentRun());
1717 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) ||
1718 checkEntryArray.GetEntries() == 0) {
1719 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1721 return hasError == kFALSE;
1724 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1725 (checkEntryArray.At(0));
1729 if (checkEntry->IsDone())
1731 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1732 UpdateShuttleLogbook("shuttle_done");
1736 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
1738 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
1740 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1741 checkEntry->GetRun(), GetDetName(iDet)));
1742 fFirstUnprocessed[iDet] = kFALSE;
1748 // remove ML instance
1754 return hasError == kFALSE;
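/*
Illustrative sketch, not part of AliShuttle: the test-mode lookup in Process searches the
logbook "log" field for the marker "Testmode: " and converts the first space-delimited token
after it to an integer. The same steps with std::string, assuming a return value of -1 when
the marker is absent (ParseTestModeSketch and that sentinel are hypothetical):

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the integer following "Testmode: " in logEntry,
// or -1 (an assumed sentinel) when the marker is not present.
int ParseTestModeSketch(const std::string& logEntry)
{
    const std::string marker("Testmode: ");
    const std::string::size_type pos = logEntry.find(marker);
    if (pos == std::string::npos) return -1;

    // First space-delimited token after the marker.
    const std::string::size_type start = pos + marker.size();
    const std::string::size_type end = logEntry.find(' ', start);
    const std::string token =
        (end == std::string::npos) ? logEntry.substr(start)
                                   : logEntry.substr(start, end - start);

    // Like TString::Atoi, std::atoi yields 0 for a non-numeric token.
    return std::atoi(token.c_str());
}
```

A caller would still have to decide which integers are valid test modes before casting.
*/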
1757 //______________________________________________________________________________________________
1758 Bool_t AliShuttle::ProcessCurrentDetector()
1761 // Retrieves data for one specific detector (fCurrentDetector).
1762 // There should be a configuration for this detector.
1764 Log("SHUTTLE", Form("ProcessCurrentDetector - Retrieving values for %s, run %d",
1765 fCurrentDetector.Data(), GetCurrentRun()));
1767 TString wd = gSystem->WorkingDirectory();
1769 if (!CleanReferenceStorage(fCurrentDetector.Data()))
1772 gSystem->ChangeDirectory(wd.Data());
1774 TMap* dcsMap = new TMap();
1776 // call preprocessor
1777 AliPreprocessor* aPreprocessor =
1778 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1780 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1782 Bool_t processDCS = aPreprocessor->ProcessDCS();
1786 Log(fCurrentDetector, "ProcessCurrentDetector -"
1787 " The preprocessor requested to skip the retrieval of DCS values");
1789 else if (fTestMode & kSkipDCS)
1791 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Skipping DCS processing");
1793 else if (fTestMode & kErrorDCS)
1795 Log(fCurrentDetector, "ProcessCurrentDetector - In TESTMODE: Simulating DCS error");
1796 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1797 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1802 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1804 // Query DCS archive
1805 Int_t nServers = fConfig->GetNServers(fCurrentDetector);
1807 for (int iServ=0; iServ<nServers; iServ++)
1810 TString host(fConfig->GetDCSHost(fCurrentDetector, iServ));
1811 Int_t port = fConfig->GetDCSPort(fCurrentDetector, iServ);
1812 Int_t multiSplit = fConfig->GetMultiSplit(fCurrentDetector, iServ);
1814 Log(fCurrentDetector, Form("ProcessCurrentDetector -"
1815 " Querying DCS Amanda server %s:%d (%d of %d)",
1816 host.Data(), port, iServ+1, nServers));
1821 if (fConfig->GetDCSAliases(fCurrentDetector, iServ)->GetEntries() > 0)
1823 aliasMap = GetValueSet(host, port,
1824 fConfig->GetDCSAliases(fCurrentDetector, iServ),
1825 kAlias, multiSplit);
1828 Log(fCurrentDetector,
1829 Form("ProcessCurrentDetector -"
1830 " Error retrieving DCS aliases from server %s."
1831 " Sending mail to DCS experts!", host.Data()));
1832 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1834 if (!SendMailToDCS())
1835 Log("SHUTTLE", "ProcessCurrentDetector - Could not send mail to DCS experts!");
1842 if (fConfig->GetDCSDataPoints(fCurrentDetector, iServ)->GetEntries() > 0)
1844 dpMap = GetValueSet(host, port,
1845 fConfig->GetDCSDataPoints(fCurrentDetector, iServ),
1849 Log(fCurrentDetector,
1850 Form("ProcessCurrentDetector -"
1851 " Error retrieving DCS data points from server %s."
1852 " Sending mail to DCS experts!", host.Data()));
1853 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1855 if (!SendMailToDCS())
1856 Log("SHUTTLE", "ProcessCurrentDetector - Could not send mail to DCS experts!");
1858 if (aliasMap) delete aliasMap;
1864 // merge aliasMap and dpMap into dcsMap
1866 TIter iter(aliasMap);
1867 TObjString* key = 0;
1868 while ((key = (TObjString*) iter.Next()))
1869 dcsMap->Add(key, aliasMap->GetValue(key->String()));
1871 aliasMap->SetOwner(kFALSE);
1877 TObjString* key = 0;
1878 while ((key = (TObjString*) iter.Next()))
1879 dcsMap->Add(key, dpMap->GetValue(key->String()));
1881 dpMap->SetOwner(kFALSE);
1887 // save map into file, to help debugging in case of preprocessor error
1888 TFile* f = TFile::Open("DCSMap.root","recreate");
1890 dcsMap->Write("DCSMap", TObject::kSingleKey);
1894 // DCS Archive DB processing successful. Call Preprocessor!
1895 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
1897 UInt_t returnValue = aPreprocessor->Process(dcsMap);
1899 if (returnValue > 0) // Preprocessor error!
1901 Log(fCurrentDetector, Form("ProcessCurrentDetector - "
1902 "Preprocessor failed. Process returned %d.", returnValue));
1903 UpdateShuttleStatus(AliShuttleStatus::kPPError);
1904 dcsMap->DeleteAll();
1910 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1911 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1912 fCurrentDetector.Data()));
1914 dcsMap->DeleteAll();
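/*
Illustrative sketch, not part of AliShuttle: the two merge loops in ProcessCurrentDetector
hand every key and value of aliasMap and dpMap over to dcsMap and then switch off ownership
in the source map, so that deleting the source does not delete the transferred objects. A
rough standard-library analogue, assuming the values can be modelled as std::vector<double>
(ValueMap and MergeInto are hypothetical names):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ValueMap = std::map<std::string, std::vector<double>>;

// Moves the entries of src into dst and leaves src empty.
// Keys already present in dst keep their existing value.
void MergeInto(ValueMap& dst, ValueMap& src)
{
    for (auto& kv : src)
        dst.emplace(kv.first, std::move(kv.second));
    src.clear();
}
```

With value semantics the ownership hand-over is implicit in the move, which is the part the
ROOT code has to spell out through SetOwner(kFALSE).
*/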
1920 //______________________________________________________________________________________________
1921 Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1924 // Queries the DAQ Shuttle logbook and fills the detector status object.
1925 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1928 entries.SetOwner(1);
1930 // check connection; connect if needed
1931 if(!Connect(3)) return kFALSE;
1934 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
1936 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1938 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1942 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1944 if(aResult->GetRowCount() == 0) {
1945 Log("SHUTTLE", "No entries in Shuttle Logbook match request");
1950 // TODO Check field count!
1951 const UInt_t nCols = 23;
1952 if (aResult->GetFieldCount() != (Int_t) nCols) {
1953 Log("SHUTTLE", "Invalid SQL result field number!");
1959 while ((aRow = aResult->Next())) {
1960 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1961 Int_t run = runString.Atoi();
1963 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1967 // loop on detectors
1968 for(UInt_t ii = 0; ii < nCols; ii++)
1969 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
1971 entries.AddLast(entry);
1979 //______________________________________________________________________________________________
1980 AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
1983 // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1986 // check connection; connect if needed
1991 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
1993 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
1995 Log("SHUTTLE", Form("Can't execute query <%s>!", sqlQuery.Data()));
1999 if (aResult->GetRowCount() == 0) {
2000 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
2005 if (aResult->GetRowCount() > 1) {
2006 Log("SHUTTLE", Form("QueryRunParameters - UNEXPECTED: "
2007 "more than one entry in DAQ Logbook for run %d!", run));
2012 TSQLRow* aRow = aResult->Next();
2015 Log("SHUTTLE", Form("QueryRunParameters - Could not retrieve row for run %d. Skipping", run));
2020 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2022 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
2023 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2025 UInt_t startTime = entry->GetStartTime();
2026 UInt_t endTime = entry->GetEndTime();
2028 if (!startTime || !endTime || startTime > endTime)
2031 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u. Skipping!",
2032 run, startTime, endTime));
2034 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2035 fLogbookEntry = entry;
2036 if (!UpdateShuttleLogbook("shuttle_done"))
2038 AliError(Form("Could not update logbook for run %d !", run));
2048 TString totEventsStr = entry->GetRunParameter("totalEvents");
2049 Int_t totEvents = totEventsStr.Atoi();
2053 Form("QueryRunParameters - Run %d has 0 events - Skipping!", run));
2055 Log("SHUTTLE", Form("Marking SHUTTLE done for run %d", run));
2056 fLogbookEntry = entry;
2057 if (!UpdateShuttleLogbook("shuttle_done"))
2059 AliError(Form("Could not update logbook for run %d !", run));
2075 //______________________________________________________________________________________________
2076 TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
2077 DCSType type, Int_t multiSplit)
2079 // Retrieves all "entry" data points from the DCS server
2080 // host, port: TSocket connection parameters
2081 // entries: list of alias or data point names
2082 // type: kAlias or kDP
2083 // returns a TMap of values, 0 on failure
2085 AliDCSClient client(host, port, fTimeout, fRetries, multiSplit);
2090 result = client.GetAliasValues(entries, GetCurrentStartTime(),
2091 GetCurrentEndTime());
2093 else if (type == kDP)
2095 result = client.GetDPValues(entries, GetCurrentStartTime(),
2096 GetCurrentEndTime());
2101 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries! Reason: %s",
2102 client.GetErrorString(client.GetResultErrorCode())));
2103 if (client.GetResultErrorCode() == AliDCSClient::fgkServerError)
2104 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error code: %s",
2105 client.GetServerError().Data()));
2113 //______________________________________________________________________________________________
2114 const char* AliShuttle::GetFile(Int_t system, const char* detector,
2115 const char* id, const char* source)
2117 // Gets a calibration file from the file exchange servers.
2118 // First queries the FXS database for the file name, using the run, detector, id and source info,
2119 // then calls RetrieveFile(filename) for the actual copy to local disk
2120 // run: current run being processed (given by Logbook entry fLogbookEntry)
2121 // detector: the Preprocessor name
2122 // id: provided as a parameter by the Preprocessor
2123 // source: provided by the Preprocessor through GetFileSources function
2125 // check if test mode should simulate a FXS error
2126 if (fTestMode & kErrorFXSFiles)
2128 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2132 // check connection; connect if needed
2133 if (!Connect(system))
2135 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
2139 // Query preparation
2140 TString sourceName(source);
2142 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
2143 fConfig->GetFXSdbTable(system));
2144 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
2145 GetCurrentRun(), detector, id);
2149 whereClause += Form(" and DAQsource=\"%s\"", source);
2151 else if (system == kDCS)
2155 else if (system == kHLT)
2157 whereClause += Form(" and DDLnumbers=\"%s\"", source);
2161 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2163 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2166 TSQLResult* aResult = 0;
2167 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2169 Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
2170 GetSystemName(system), id, sourceName.Data()));
2174 if(aResult->GetRowCount() == 0)
2177 Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
2178 GetSystemName(system), id, sourceName.Data()));
2183 if (aResult->GetRowCount() > 1) {
2185 Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
2186 GetSystemName(system), id, sourceName.Data()));
2191 if (aResult->GetFieldCount() != nFields) {
2193 Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
2194 GetSystemName(system), id, sourceName.Data()));
2199 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
2202 Log(detector, Form("GetFile - Empty set result in %s FXS db from query: id = %s, source = %s",
2203 GetSystemName(system), id, sourceName.Data()));
2208 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
2209 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
2210 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
2215 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
2216 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
2218 // retrieved file is renamed to make it unique
2219 TString localFileName = Form("%s/%s_%d_process/%s_%s_%d_%s_%s.shuttle",
2220 GetShuttleTempDir(), detector, GetCurrentRun(),
2221 GetSystemName(system), detector, GetCurrentRun(),
2222 id, sourceName.Data());
2225 // file retrieval from FXS
2226 UInt_t nRetries = 0;
2227 UInt_t maxRetries = 3;
2228 Bool_t result = kFALSE;
2230 // copy!! if successful TSystem::Exec returns 0
2231 while(nRetries++ < maxRetries) {
2232 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
2233 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
2236 Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
2237 filePath.Data(), GetSystemName(system)));
2241 if (fileChecksum.Length()>0)
2243 // compare md5sum of local file with the one stored in the FXS DB
Int_t md5Comp = gSystem->Exec(Form("md5sum %s | grep %s > /dev/null 2>&1",
2245 localFileName.Data(), fileChecksum.Data()));
Log(detector, Form("GetFileName - md5sum of file %s does not match the local copy!",
2255 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2256 filePath.Data(), GetSystemName(system)));
2261 if(!result) return 0;
2263 fFXSCalled[system]=kTRUE;
2264 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2265 fFXSlist[system].Add(fileParams);
2267 static TString staticLocalFileName;
2268 staticLocalFileName.Form("%s", localFileName.Data());
2270 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and "
2271 "source %s from %s to %s", id, source,
2272 GetSystemName(system), localFileName.Data()));
2274 return staticLocalFileName.Data();
2277 //______________________________________________________________________________________________
2278 Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
2281 // Copies file from FXS to local Shuttle machine
2284 // check temp directory: trying to cd to temp; if it does not exist, create it
AliDebug(2, Form("Copy file %s from %s FXS into %s",
fxsFileName, GetSystemName(system), localFileName));
2288 TString tmpDir(localFileName);
2290 tmpDir = tmpDir(0,tmpDir.Last('/'));
2292 Int_t noDir = gSystem->GetPathInfo(tmpDir.Data(), 0, (Long64_t*) 0, 0, 0);
if (noDir) // temp dir does not exist!
2295 if (gSystem->mkdir(tmpDir.Data(), 1))
2297 Log(fCurrentDetector.Data(), "RetrieveFile - could not make temp directory!!");
2302 TString baseFXSFolder;
2305 baseFXSFolder = "FES/";
2307 else if (system == kDCS)
2311 else if (system == kHLT)
2313 baseFXSFolder = "/opt/FXS/";
2317 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s",
2318 fConfig->GetFXSPort(system),
2319 fConfig->GetFXSUser(system),
2320 fConfig->GetFXSHost(system),
2321 baseFXSFolder.Data(),
2325 AliDebug(2, Form("%s",command.Data()));
2327 Bool_t result = (gSystem->Exec(command.Data()) == 0);
2332 //______________________________________________________________________________________________
2333 TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2336 // Get sources producing the condition file Id from file exchange servers
2337 // if id is NULL all sources are returned (distinct)
2340 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id, GetSystemName(system)));
// check if test mode should simulate an FXS error
2343 if (fTestMode & kErrorFXSSources)
2345 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2351 Log(detector, "GetFileSources - WARNING: DCS system has only one source of data!");
2352 TList *list = new TList();
2354 list->Add(new TObjString(" "));
// check connection; connect if not yet connected
2359 if (!Connect(system))
2361 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName; // empty for DCS, which has a single source
2368 sourceName = "DAQsource";
2369 } else if (system == kHLT)
2371 sourceName = "DDLnumbers";
2374 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2375 TString whereClause = Form("run=%d and detector=\"%s\"",
2376 GetCurrentRun(), detector);
2378 whereClause += Form(" and fileId=\"%s\"", id);
2379 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2381 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2384 TSQLResult* aResult;
2385 aResult = fServer[system]->Query(sqlQuery);
2387 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2388 GetSystemName(system), id));
2392 TList *list = new TList();
2395 if (aResult->GetRowCount() == 0)
2398 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
2403 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
2406 while ((aRow = aResult->Next()))
2409 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2410 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2411 list->Add(new TObjString(source));
2420 //______________________________________________________________________________________________
2421 TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2424 // Get all ids of condition files produced by a given source from file exchange servers
Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
// check if test mode should simulate an FXS error
2430 if (fTestMode & kErrorFXSSources)
2432 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
// check connection; connect if not yet connected
2437 if (!Connect(system))
2439 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
TString sourceName; // empty for DCS, which has a single source
2446 sourceName = "DAQsource";
2447 } else if (system == kHLT)
2449 sourceName = "DDLnumbers";
2452 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2453 TString whereClause = Form("run=%d and detector=\"%s\"",
2454 GetCurrentRun(), detector);
2455 if (sourceName.Length() > 0 && source)
2456 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2457 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2459 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2462 TSQLResult* aResult;
2463 aResult = fServer[system]->Query(sqlQuery);
2465 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2466 GetSystemName(system), source));
2470 TList *list = new TList();
2473 if (aResult->GetRowCount() == 0)
2476 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2481 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2485 while ((aRow = aResult->Next()))
2488 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2489 AliDebug(2, Form("fileId = %s", id.Data()));
2490 list->Add(new TObjString(id));
2499 //______________________________________________________________________________________________
2500 Bool_t AliShuttle::Connect(Int_t system)
2502 // Connect to MySQL Server of the system's FXS MySQL databases
2503 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2506 // check connection: if already connected return
2507 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
2509 TString dbHost, dbUser, dbPass, dbName;
2511 if (system < 3) // FXS db servers
2513 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2514 dbUser = fConfig->GetFXSdbUser(system);
2515 dbPass = fConfig->GetFXSdbPass(system);
2516 dbName = fConfig->GetFXSdbName(system);
2517 } else { // Run & Shuttle logbook servers
2518 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2519 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2520 dbUser = fConfig->GetDAQlbUser();
2521 dbPass = fConfig->GetDAQlbPass();
2522 dbName = fConfig->GetDAQlbDB();
2525 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2526 if (!fServer[system] || !fServer[system]->IsConnected()) {
2529 AliError(Form("Can't establish connection to FXS database for %s",
2530 AliShuttleInterface::GetSystemName(system)));
2532 AliError("Can't establish connection to Run logbook.");
2534 if(fServer[system]) delete fServer[system];
2539 TSQLResult* aResult=0;
2542 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2545 aResult = fServer[kDCS]->GetTables(dbName.Data());
2548 aResult = fServer[kHLT]->GetTables(dbName.Data());
2551 aResult = fServer[3]->GetTables(dbName.Data());
2559 //______________________________________________________________________________________________
2560 Bool_t AliShuttle::UpdateTable()
2563 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2566 Bool_t result = kTRUE;
2568 for (UInt_t system=0; system<3; system++)
2570 if(!fFXSCalled[system]) continue;
// check connection; connect if not yet connected
2573 if (!Connect(system))
2575 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2580 TTimeStamp now; // now
2582 // Loop on FXS list entries
2583 TIter iter(&fFXSlist[system]);
2584 TObjString *aFXSentry=0;
2585 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2587 TString aFXSentrystr = aFXSentry->String();
2588 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2589 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2591 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2592 GetSystemName(system), aFXSentrystr.Data()));
2593 if(aFXSarray) delete aFXSarray;
2597 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2598 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2600 TString whereClause;
2603 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2604 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2606 else if (system == kDCS)
2608 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2609 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2611 else if (system == kHLT)
2613 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2614 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2619 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2620 now.GetSec(), whereClause.Data());
2622 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2625 TSQLResult* aResult;
2626 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2629 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2630 GetSystemName(system), sqlQuery.Data()));
2641 //______________________________________________________________________________________________
2642 Bool_t AliShuttle::UpdateTableFailCase()
// Update FXS table, filling the time_processed field in all rows corresponding to the current run and detector.
// Called when the preprocessor is declared failed for the current run,
// because UpdateTable fills these fields only in case of success.
2648 Bool_t result = kTRUE;
2650 for (UInt_t system=0; system<3; system++)
// check connection; connect if not yet connected
2653 if (!Connect(system))
2655 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2656 GetSystemName(system)));
2661 TTimeStamp now; // now
// Update all rows for the current run and detector
2665 TString whereClause = Form("where run=%d and detector=\"%s\";",
2666 GetCurrentRun(), fCurrentDetector.Data());
2669 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2670 now.GetSec(), whereClause.Data());
2672 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2675 TSQLResult* aResult;
2676 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2679 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2680 GetSystemName(system), sqlQuery.Data()));
2690 //______________________________________________________________________________________________
2691 Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2694 // Update Shuttle logbook filling detector or shuttle_done column
2695 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
// check connection; connect if not yet connected
2700 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2704 TString detName(detector);
2706 if(detName == "shuttle_done")
2708 setClause = "set shuttle_done=1";
2712 // Send the information to ML
2713 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2716 mlList.Add(&mlStatus);
2718 fMonaLisa->SendParameters(&mlList);
2721 TString statusStr(status);
2722 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2723 statusStr.Contains("failed", TString::kIgnoreCase)){
2724 setClause = Form("set %s=\"%s\"", detector, status);
2727 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2733 TString whereClause = Form("where run=%d", GetCurrentRun());
2735 TString sqlQuery = Form("update %s %s %s",
2736 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
2738 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2741 TSQLResult* aResult;
2742 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2744 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2752 //______________________________________________________________________________________________
2753 Int_t AliShuttle::GetCurrentRun() const
2756 // Get current run from logbook entry
2759 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
2762 //______________________________________________________________________________________________
2763 UInt_t AliShuttle::GetCurrentStartTime() const
2766 // get current start time
2769 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
2772 //______________________________________________________________________________________________
2773 UInt_t AliShuttle::GetCurrentEndTime() const
2776 // get current end time from logbook entry
2779 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
2782 //______________________________________________________________________________________________
2783 UInt_t AliShuttle::GetCurrentYear() const
2786 // Get current year from logbook entry
2789 if (!fLogbookEntry) return 0;
2791 TTimeStamp startTime(GetCurrentStartTime());
2792 TString year = Form("%d",startTime.GetDate());
2798 //______________________________________________________________________________________________
2799 const char* AliShuttle::GetLHCPeriod() const
2802 // Get current LHC period from logbook entry
2805 if (!fLogbookEntry) return 0;
2807 return fLogbookEntry->GetRunParameter("LHCperiod");
2810 //______________________________________________________________________________________________
2811 void AliShuttle::Log(const char* detector, const char* message)
2814 // Fill log string with a message
2817 TString logRunDir = GetShuttleLogDir();
2818 if (GetCurrentRun() >=0)
2819 logRunDir += Form("/%d", GetCurrentRun());
2821 void* dir = gSystem->OpenDirectory(logRunDir.Data());
2823 if (gSystem->mkdir(logRunDir.Data(), kTRUE)) {
AliError(Form("Can't open directory <%s>", logRunDir.Data()));
2829 gSystem->FreeDirectory(dir);
2832 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
2833 if (GetCurrentRun() >= 0)
2834 toLog += Form("run %d - ", GetCurrentRun());
2835 toLog += Form("%s", message);
2837 AliInfo(toLog.Data());
// if the log output is already redirected to the file, return here
2840 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2843 TString fileName = GetLogFileName(detector);
2845 gSystem->ExpandPathName(fileName);
2848 logFile.open(fileName, ofstream::out | ofstream::app);
2850 if (!logFile.is_open()) {
2851 AliError(Form("Could not open file %s", fileName.Data()));
2855 logFile << toLog.Data() << "\n";
2860 //______________________________________________________________________________________________
2861 TString AliShuttle::GetLogFileName(const char* detector) const
2864 // returns the name of the log file for a given sub detector
2869 if (GetCurrentRun() >= 0)
2871 fileName.Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
2872 detector, GetCurrentRun());
2874 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2880 //______________________________________________________________________________________________
2881 Bool_t AliShuttle::Collect(Int_t run)
// Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default).
// If a specific run is given, only that run is processed.
2887 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2891 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2893 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
2895 SetLastAction("Starting");
2897 TString whereClause("where shuttle_done=0");
2899 whereClause += Form(" and run=%d", run);
2901 TObjArray shuttleLogbookEntries;
2902 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2904 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2908 if (shuttleLogbookEntries.GetEntries() == 0)
2911 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2913 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2914 "or it does not exist in Shuttle logbook", run));
2918 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2919 fFirstUnprocessed[iDet] = kTRUE;
2923 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2924 // flag them into fFirstUnprocessed array
2925 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2926 TObjArray tmpLogbookEntries;
2927 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2929 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2933 TIter iter(&tmpLogbookEntries);
2934 AliShuttleLogbookEntry* anEntry = 0;
2935 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2937 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2939 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2941 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2942 anEntry->GetRun(), GetDetName(iDet)));
2943 fFirstUnprocessed[iDet] = kFALSE;
2951 if (!RetrieveConditionsData(shuttleLogbookEntries))
2953 Log("SHUTTLE", "Collect - Process of at least one run failed");
2957 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
2961 //______________________________________________________________________________________________
2962 Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2965 // Retrieve conditions data for all runs that aren't processed yet
2968 Bool_t hasError = kFALSE;
2970 TIter iter(&dateEntries);
2971 AliShuttleLogbookEntry* anEntry;
2973 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2974 if (!Process(anEntry)){
2978 // clean SHUTTLE temp directory
2979 //TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2980 //RemoveFile(filename.Data());
2983 return hasError == kFALSE;
2986 //______________________________________________________________________________________________
2987 ULong_t AliShuttle::GetTimeOfLastAction() const
2990 // Gets time of last action
2995 fMonitoringMutex->Lock();
2997 tmp = fLastActionTime;
2999 fMonitoringMutex->UnLock();
3004 //______________________________________________________________________________________________
3005 const TString AliShuttle::GetLastAction() const
3008 // returns a string description of the last action
3013 fMonitoringMutex->Lock();
3017 fMonitoringMutex->UnLock();
3022 //______________________________________________________________________________________________
3023 void AliShuttle::SetLastAction(const char* action)
3026 // updates the monitoring variables
3029 fMonitoringMutex->Lock();
3031 fLastAction = action;
3032 fLastActionTime = time(0);
3034 fMonitoringMutex->UnLock();
3037 //______________________________________________________________________________________________
3038 const char* AliShuttle::GetRunParameter(const char* param)
3041 // returns run parameter read from DAQ logbook
3044 if(!fLogbookEntry) {
3045 AliError("No logbook entry!");
3049 return fLogbookEntry->GetRunParameter(param);
3052 //______________________________________________________________________________________________
3053 AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
3056 // returns object from OCDB valid for current run
3059 if (fTestMode & kErrorOCDB)
3061 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
3065 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
3068 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
3072 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
3075 //______________________________________________________________________________________________
3076 Bool_t AliShuttle::SendMail()
3079 // sends a mail to the subdetector expert in case of preprocessor error
3082 if (fTestMode != kNone)
3085 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3088 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3090 Log("SHUTTLE", Form("SendMail - Can't open directory <%s>", GetShuttleLogDir()));
3095 gSystem->FreeDirectory(dir);
3098 TString bodyFileName;
3099 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3100 gSystem->ExpandPathName(bodyFileName);
3103 mailBody.open(bodyFileName, ofstream::out);
3105 if (!mailBody.is_open())
3107 Log("SHUTTLE", Form("Could not open mail body file %s", bodyFileName.Data()));
3112 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
3113 TObjString *anExpert=0;
3114 while ((anExpert = (TObjString*) iterExperts.Next()))
3116 to += Form("%s,", anExpert->GetName());
3118 to.Remove(to.Length()-1);
3119 AliDebug(2, Form("to: %s",to.Data()));
3122 Log("SHUTTLE", "List of detector responsibles not yet set!");
3126 TString cc="alberto.colla@cern.ch";
3128 TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
3129 fCurrentDetector.Data(), GetCurrentRun());
3130 AliDebug(2, Form("subject: %s", subject.Data()));
3132 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
3133 body += Form("SHUTTLE just detected that your preprocessor "
3134 "failed processing run %d!!\n\n", GetCurrentRun());
3135 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3136 fCurrentDetector.Data());
3137 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3139 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3141 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD?time=168 \n\n");
3145 TString logFolder = "logs";
3146 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3147 logFolder += "_PROD";
3150 body += Form("Find the %s log for the current run on \n\n"
3151 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3152 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3153 fCurrentDetector.Data(), GetCurrentRun());
body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3156 AliDebug(2, Form("Body begin: %s", body.Data()));
3158 mailBody << body.Data();
3160 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3162 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(),
3163 GetCurrentRun(), fCurrentDetector.Data(), GetCurrentRun());
3164 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3165 if (gSystem->Exec(tailCommand.Data()))
3167 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3170 TString endBody = Form("------------------------------------------------------\n\n");
3171 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
endBody += "Please do not reply to this message directly; it is automatically generated.\n\n";
3173 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3175 AliDebug(2, Form("Body end: %s", endBody.Data()));
3177 mailBody << endBody.Data();
3182 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3186 bodyFileName.Data());
3187 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3189 Bool_t result = gSystem->Exec(mailCommand.Data());
3194 //______________________________________________________________________________________________
3195 Bool_t AliShuttle::SendMailToDCS()
3198 // sends a mail to the DCS experts in case of DCS error
3201 if (fTestMode != kNone)
3204 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
3207 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
3209 Log("SHUTTLE", Form("SendMailToDCS - Can't open directory <%s>", GetShuttleLogDir()));
3214 gSystem->FreeDirectory(dir);
3217 TString bodyFileName;
3218 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
3219 gSystem->ExpandPathName(bodyFileName);
3222 mailBody.open(bodyFileName, ofstream::out);
3224 if (!mailBody.is_open())
3226 Log("SHUTTLE", Form("SendMailToDCS - Could not open mail body file %s", bodyFileName.Data()));
3230 TString to="Vladimir.Fekete@cern.ch, Svetozar.Kapusta@cern.ch";
3231 //TString to="alberto.colla@cern.ch";
3232 AliDebug(2, Form("to: %s",to.Data()));
3235 Log("SHUTTLE", "List of detector responsibles not yet set!");
3239 TString cc="alberto.colla@cern.ch";
3241 TString subject = Form("Retrieval of data points for %s FAILED in run %d !",
3242 fCurrentDetector.Data(), GetCurrentRun());
3243 AliDebug(2, Form("subject: %s", subject.Data()));
3245 TString body = Form("Dear DCS experts, \n\n");
3246 body += Form("SHUTTLE couldn\'t retrieve the data points for detector %s "
3247 "in run %d!!\n\n", fCurrentDetector.Data(), GetCurrentRun());
3248 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n",
3249 fCurrentDetector.Data());
3250 if (fConfig->GetRunMode() == AliShuttleConfig::kTest)
3252 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
3254 body += Form("\thttp://pcalimonitor.cern.ch/shuttle.jsp?instance=PROD?time=168 \n\n");
3257 TString logFolder = "logs";
3258 if (fConfig->GetRunMode() == AliShuttleConfig::kProd)
3259 logFolder += "_PROD";
3262 body += Form("Find the %s log for the current run on \n\n"
3263 "\thttp://pcalishuttle01.cern.ch:8880/%s/%d/%s_%d.log \n\n",
3264 fCurrentDetector.Data(), logFolder.Data(), GetCurrentRun(),
3265 fCurrentDetector.Data(), GetCurrentRun());
body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
3268 AliDebug(2, Form("Body begin: %s", body.Data()));
3270 mailBody << body.Data();
3272 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
3274 TString logFileName = Form("%s/%d/%s_%d.log", GetShuttleLogDir(), GetCurrentRun(),
3275 fCurrentDetector.Data(), GetCurrentRun());
3276 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
3277 if (gSystem->Exec(tailCommand.Data()))
3279 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
3282 TString endBody = Form("------------------------------------------------------\n\n");
3283 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
endBody += "Please do not reply to this message directly; it is automatically generated.\n\n";
3285 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
3287 AliDebug(2, Form("Body end: %s", endBody.Data()));
3289 mailBody << endBody.Data();
3294 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
3298 bodyFileName.Data());
3299 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
3301 Bool_t result = gSystem->Exec(mailCommand.Data());
3306 //______________________________________________________________________________________________
3307 const char* AliShuttle::GetRunType()
3310 // returns run type read from "run type" logbook
3313 if(!fLogbookEntry) {
3314 AliError("No logbook entry!");
3318 return fLogbookEntry->GetRunType();
3321 //______________________________________________________________________________________________
3322 Bool_t AliShuttle::GetHLTStatus()
3324 // Return HLT status (ON=1 OFF=0)
3325 // Converts the HLT status from the status string read in the run logbook (not just a bool)
3327 if(!fLogbookEntry) {
3328 AliError("No logbook entry!");
3332 // TODO implement when HLTStatus is inserted in run logbook
3333 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
3334 //if(hltStatus == "OFF") {return kFALSE};
3339 //______________________________________________________________________________________________
3340 void AliShuttle::SetShuttleTempDir(const char* tmpDir)
3343 // sets Shuttle temp directory
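// Expand in place through the TString overload: the char* overload returns a
// heap-allocated string that the caller would have to delete.
// (Assumes fgkShuttleTempDir is a TString.)
TString expandedTmpDir(tmpDir);
gSystem->ExpandPathName(expandedTmpDir);
fgkShuttleTempDir = expandedTmpDir;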
3349 //______________________________________________________________________________________________
3350 void AliShuttle::SetShuttleLogDir(const char* logDir)
3353 // sets Shuttle log directory
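// Expand in place through the TString overload: the char* overload returns a
// heap-allocated string that the caller would have to delete.
// (Assumes fgkShuttleLogDir is a TString.)
TString expandedLogDir(logDir);
gSystem->ExpandPathName(expandedLogDir);
fgkShuttleLogDir = expandedLogDir;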