[u/mrichter/AliRoot.git] / SHUTTLE / AliShuttle.cxx
Commit Line Data
73abe331 1/**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3 * *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
6 * *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
15
16/*
17$Log$
3d8bc902 18Revision 1.35 2007/04/04 16:26:38 acolla
191. Re-organization of function calls in TestPreprocessor to make it more meaningful.
202. Added missing dependency in test preprocessors.
213. in AliShuttle.cxx: processing time and memory consumption info on a single line.
22
886d60e6 23Revision 1.34 2007/04/04 10:33:36 jgrosseo
241) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
25In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
26
272) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
28
293) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
30
314) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
32
335) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
34If you always need DCS data (like before), you do not need to implement it.
35
366) The run type has been added to the monitoring page
37
9827400b 38Revision 1.33 2007/04/03 13:56:01 acolla
39Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
40run type.
41
3301427a 42Revision 1.32 2007/02/28 10:41:56 acolla
43Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
44AliPreprocessor::GetRunType() function.
45Added some ldap definition files.
46
d386d623 47Revision 1.30 2007/02/13 11:23:21 acolla
48Moved getters and setters of Shuttle's main OCDB/Reference, local
49OCDB/Reference, temp and log folders to AliShuttleInterface
50
9d733021 51Revision 1.27 2007/01/30 17:52:42 jgrosseo
52adding monalisa monitoring
53
e7f62f16 54Revision 1.26 2007/01/23 19:20:03 acolla
55Removed old ldif files, added TOF, MCH ldif files. Added some options in
56AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
57SetShuttleLogDir
58
36c99a6a 59Revision 1.25 2007/01/15 19:13:52 acolla
60Moved some AliInfo to AliDebug in SendMail function
61
fc5a4708 62Revision 1.21 2006/12/07 08:51:26 jgrosseo
63update (alberto):
64table, db names in ldap configuration
65added GRP preprocessor
66DCS data can also be retrieved by data point
67
2c15234c 68Revision 1.20 2006/11/16 16:16:48 jgrosseo
69introducing strict run ordering flag
70removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
71
be48e3ea 72Revision 1.19 2006/11/06 14:23:04 jgrosseo
73major update (Alberto)
74o) reading of run parameters from the logbook
75o) online offline naming conversion
76o) standalone DCSclient package
77
eba76848 78Revision 1.18 2006/10/20 15:22:59 jgrosseo
79o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
80o) Merging Collect, CollectAll, CollectNew function
81o) Removing implementation of empty copy constructors (declaration still there!)
82
cb343cfd 83Revision 1.17 2006/10/05 16:20:55 jgrosseo
84adapting to new CDB classes
85
6ec0e06c 86Revision 1.16 2006/10/05 15:46:26 jgrosseo
87applying to the new interface
88
481441a2 89Revision 1.15 2006/10/02 16:38:39 jgrosseo
90update (alberto):
91fixed memory leaks
92storing of objects that failed to be stored to the grid before
93interfacing of shuttle status table in daq system
94
2bb7b766 95Revision 1.14 2006/08/29 09:16:05 jgrosseo
96small update
97
85a80aa9 98Revision 1.13 2006/08/15 10:50:00 jgrosseo
99effc++ corrections (alberto)
100
4f0ab988 101Revision 1.12 2006/08/08 14:19:29 jgrosseo
102Update to shuttle classes (Alberto)
103
104- Possibility to set the full object's path in the Preprocessor's and
105Shuttle's Store functions
106- Possibility to extend the object's run validity in the same classes
107("startValidity" and "validityInfinite" parameters)
108- Implementation of the StoreReferenceData function to store reference
109data in a dedicated CDB storage.
110
84090f85 111Revision 1.11 2006/07/21 07:37:20 jgrosseo
112last run is stored after each run
113
7bfb2090 114Revision 1.10 2006/07/20 09:54:40 jgrosseo
115introducing status management: The processing per subdetector is divided into several steps,
116after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
117can keep track of the number of failures and skips further processing after a certain threshold is
118exceeded. These thresholds can be configured in LDAP.
119
5164a766 120Revision 1.9 2006/07/19 10:09:55 jgrosseo
121new configuration, access to DAQ FES (Alberto)
122
57f50b3c 123Revision 1.8 2006/07/11 12:44:36 jgrosseo
124adding parameters for extended validity range of data produced by preprocessor
125
17111222 126Revision 1.7 2006/07/10 14:37:09 jgrosseo
127small fix + todo comment
128
e090413b 129Revision 1.6 2006/07/10 13:01:41 jgrosseo
130enhanced storing of last successfully processed run (alberto)
131
a7160fe9 132Revision 1.5 2006/07/04 14:59:57 jgrosseo
133revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
134
45a493ce 135Revision 1.4 2006/06/12 09:11:16 jgrosseo
136coding conventions (Alberto)
137
58bc3020 138Revision 1.3 2006/06/06 14:26:40 jgrosseo
139o) removed files that were moved to STEER
140o) shuttle updated to follow the new interface (Alberto)
141
b948db8d 142Revision 1.2 2006/03/07 07:52:34 hristov
143New version (B.Yordanov)
144
d477ad88 145Revision 1.6 2005/11/19 17:19:14 byordano
146RetrieveDATEEntries and RetrieveConditionsData added
147
148Revision 1.5 2005/11/19 11:09:27 byordano
149AliShuttle declaration added
150
151Revision 1.4 2005/11/17 17:47:34 byordano
152TList changed to TObjArray
153
154Revision 1.3 2005/11/17 14:43:23 byordano
155import to local CVS
156
157Revision 1.1.1.1 2005/10/28 07:33:58 hristov
158Initial import as subdirectory in AliRoot
159
73abe331 160Revision 1.2 2005/09/13 08:41:15 byordano
161default startTime endTime added
162
163Revision 1.4 2005/08/30 09:13:02 byordano
164some docs added
165
166Revision 1.3 2005/08/29 21:15:47 byordano
167some docs added
168
169*/
170
171//
172// This class is the main manager for AliShuttle.
173// It organizes the data retrieval from DCS and calls the
b948db8d 174// interface methods of AliPreprocessor.
73abe331 175// For every detector in the configuration (see AliShuttleConfig),
176// data for its set of aliases is retrieved. If an AliPreprocessor
b948db8d 177// is registered for this detector, it is used
178// according to the schema (see AliPreprocessor).
179// If no AliPreprocessor is registered, the retrieved
73abe331 180// data is stored automatically in the underlying AliCDBStorage.
181// The alias name is used as detSpec.
182//
183
184#include "AliShuttle.h"
185
186#include "AliCDBManager.h"
187#include "AliCDBStorage.h"
188#include "AliCDBId.h"
84090f85 189#include "AliCDBRunRange.h"
190#include "AliCDBPath.h"
5164a766 191#include "AliCDBEntry.h"
73abe331 192#include "AliShuttleConfig.h"
eba76848 193#include "DCSClient/AliDCSClient.h"
73abe331 194#include "AliLog.h"
b948db8d 195#include "AliPreprocessor.h"
5164a766 196#include "AliShuttleStatus.h"
2bb7b766 197#include "AliShuttleLogbookEntry.h"
73abe331 198
57f50b3c 199#include <TSystem.h>
58bc3020 200#include <TObject.h>
b948db8d 201#include <TString.h>
57f50b3c 202#include <TTimeStamp.h>
73abe331 203#include <TObjString.h>
57f50b3c 204#include <TSQLServer.h>
205#include <TSQLResult.h>
206#include <TSQLRow.h>
cb343cfd 207#include <TMutex.h>
9827400b 208#include <TSystemDirectory.h>
209#include <TSystemFile.h>
210#include <TFileMerger.h>
211#include <TGrid.h>
212#include <TGridResult.h>
73abe331 213
e7f62f16 214#include <TMonaLisaWriter.h>
215
5164a766 216#include <fstream>
217
cb343cfd 218#include <sys/types.h>
219#include <sys/wait.h>
220
73abe331 221ClassImp(AliShuttle)
222
b948db8d 223//______________________________________________________________________________________________
224AliShuttle::AliShuttle(const AliShuttleConfig* config,
225 UInt_t timeout, Int_t retries):
4f0ab988 226fConfig(config),
227fTimeout(timeout), fRetries(retries),
228fPreprocessorMap(),
2bb7b766 229fLogbookEntry(0),
eba76848 230fCurrentDetector(),
85a80aa9 231fStatusEntry(0),
cb343cfd 232fMonitoringMutex(0),
eba76848 233fLastActionTime(0),
e7f62f16 234fLastAction(),
9827400b 235fMonaLisa(0),
236fTestMode(kNone),
ffa29e93 237fReadTestMode(kFALSE),
238fOutputRedirected(kFALSE)
73abe331 239{
240 //
241 // config: AliShuttleConfig used
73abe331 242 // timeout: timeout used for AliDCSClient connection
243 // retries: the number of retries in case of connection error.
244 //
245
57f50b3c 246 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
be48e3ea 247 for(int iSys=0;iSys<4;iSys++) {
57f50b3c 248 fServer[iSys]=0;
be48e3ea 249 if (iSys < 3)
2c15234c 250 fFXSlist[iSys].SetOwner(kTRUE);
57f50b3c 251 }
2bb7b766 252 fPreprocessorMap.SetOwner(kTRUE);
be48e3ea 253
254 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
255 fFirstUnprocessed[iDet] = kFALSE;
256
cb343cfd 257 fMonitoringMutex = new TMutex();
58bc3020 258}
259
b948db8d 260//______________________________________________________________________________________________
57f50b3c 261AliShuttle::~AliShuttle()
58bc3020 262{
9827400b 263 //
264 // destructor
265 //
58bc3020 266
b948db8d 267 fPreprocessorMap.DeleteAll();
be48e3ea 268 for(int iSys=0;iSys<4;iSys++)
57f50b3c 269 if(fServer[iSys]) {
270 fServer[iSys]->Close();
271 delete fServer[iSys];
eba76848 272 fServer[iSys] = 0;
57f50b3c 273 }
2bb7b766 274
275 if (fStatusEntry){
276 delete fStatusEntry;
277 fStatusEntry = 0;
278 }
cb343cfd 279
280 if (fMonitoringMutex)
281 {
282 delete fMonitoringMutex;
283 fMonitoringMutex = 0;
284 }
73abe331 285}
286
b948db8d 287//______________________________________________________________________________________________
57f50b3c 288void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
58bc3020 289{
73abe331 290 //
b948db8d 291 // Registers new AliPreprocessor.
73abe331 292 // It uses GetName() as the identifier of the preprocessor.
293 // The preprocessor is registered only if no other one
294 // with the same identifier (GetName()) is already registered.
295 //
296
eba76848 297 const char* detName = preprocessor->GetName();
298 if(GetDetPos(detName) < 0)
299 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
300
301 if (fPreprocessorMap.GetValue(detName)) {
302 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
73abe331 303 return;
304 }
305
eba76848 306 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
73abe331 307}
b948db8d 308//______________________________________________________________________________________________
3301427a 309Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
84090f85 310 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
73abe331 311{
9827400b 312 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
313 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
314 // using this function. Use StoreReferenceData instead!
315 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
316 // finishes the data are transferred to the main storage (Grid).
b948db8d 317
3301427a 318 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
84090f85 319}
320
321//______________________________________________________________________________________________
3301427a 322Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
84090f85 323{
9827400b 324 // Stores a CDB object in the storage for reference data. These objects will not be available during
325 // offline reconstruction. Use this function for reference data only!
326 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
327 // finishes the data are transferred to the main storage (Grid).
85a80aa9 328
3301427a 329 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
85a80aa9 330}
331
332//______________________________________________________________________________________________
3301427a 333Bool_t AliShuttle::StoreLocally(const TString& localUri,
85a80aa9 334 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
335 Int_t validityStart, Bool_t validityInfinite)
336{
9827400b 337 // Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
338 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
339 // The parameters are:
340 // 1) Uri of the backup storage (Local)
341 // 2) the object's path.
342 // 3) the object to be stored
343 // 4) the metaData to be associated with the object
344 // 5) the validity start run number w.r.t. the current run,
345 // if the data is valid only for this run leave the default 0
346 // 6) specifies if the calibration data is valid for infinity (this means until updated),
347 // typical for calibration runs, the default is kFALSE
348 //
349 // returns kFALSE on failure, kTRUE otherwise
84090f85 350
9827400b 351 if (fTestMode & kErrorStorage)
352 {
353 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
354 return kFALSE;
355 }
356
3301427a 357 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
2bb7b766 358
85a80aa9 359 Int_t firstRun = GetCurrentRun() - validityStart;
84090f85 360 if(firstRun < 0) {
9827400b 361 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
84090f85 362 firstRun=0;
363 }
364
365 Int_t lastRun = -1;
366 if(validityInfinite) {
367 lastRun = AliCDBRunRange::Infinity();
368 } else {
369 lastRun = GetCurrentRun();
370 }
371
3301427a 372 // Version is set to current run, it will be used later to transfer data to Grid
373 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
2bb7b766 374
375 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
376 TObjString runUsed = Form("%d", GetCurrentRun());
9e080f92 377 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
2bb7b766 378 }
84090f85 379
3301427a 380 Bool_t result = kFALSE;
84090f85 381
3301427a 382 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
383 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
84090f85 384 } else {
3301427a 385 result = AliCDBManager::Instance()->GetStorage(localUri)
84090f85 386 ->Put(object, id, metaData);
387 }
388
389 if(!result) {
390
9827400b 391 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
3301427a 392 }
2bb7b766 393
3301427a 394 return result;
395}
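// Illustration (not part of AliShuttle): a minimal, ROOT-free sketch of how StoreLocally
// above derives the run range of the stored object. ComputeRunRange and kRunInfinity are
// hypothetical names introduced here; kRunInfinity only stands in for
// AliCDBRunRange::Infinity(), whose actual value is defined by AliRoot.

```cpp
#include <cassert>
#include <limits>
#include <utility>

// Assumption: placeholder for AliCDBRunRange::Infinity().
constexpr int kRunInfinity = std::numeric_limits<int>::max();

// Returns {firstRun, lastRun} following the rules used in StoreLocally:
// the first run is the current run minus validityStart, clamped at 0;
// the last run is either "infinity" or the current run.
std::pair<int, int> ComputeRunRange(int currentRun, int validityStart, bool validityInfinite)
{
  assert(currentRun >= 0);
  int firstRun = currentRun - validityStart;
  if (firstRun < 0) firstRun = 0;  // same clamping as in StoreLocally
  const int lastRun = validityInfinite ? kRunInfinity : currentRun;
  return {firstRun, lastRun};
}
```

// With the defaults (validityStart = 0, validityInfinite = kFALSE) the object is valid
// for the current run only; a validityStart larger than the run number is clamped to run 0.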
84090f85 396
3301427a 397//______________________________________________________________________________________________
398Bool_t AliShuttle::StoreOCDB()
399{
9827400b 400 //
401 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
402 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
403 // Then calls StoreRefFilesToGrid to store reference files.
404 //
405
406 if (fTestMode & kErrorGrid)
407 {
408 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
409 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
410 return kFALSE;
411 }
412
3301427a 413 AliInfo("Storing OCDB data ...");
414 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
415
416 AliInfo("Storing reference data ...");
417 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
9827400b 418
419 AliInfo("Storing reference files ...");
420 Bool_t resultRefFiles = StoreRefFilesToGrid();
421
422 return resultCDB && resultRef && resultRefFiles;
3301427a 423}
424
425//______________________________________________________________________________________________
426Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
427{
428 //
429 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
430 //
431
432 TObjArray* gridIds=0;
433
434 Bool_t result = kTRUE;
435
436 const char* type = 0;
437 TString localURI;
438 if(gridURI == fgkMainCDB) {
439 type = "OCDB";
440 localURI = fgkLocalCDB;
441 } else if(gridURI == fgkMainRefStorage) {
442 type = "reference";
443 localURI = fgkLocalRefStorage;
444 } else {
445 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
446 return kFALSE;
447 }
448
449 AliCDBManager* man = AliCDBManager::Instance();
450
451 AliCDBStorage *gridSto = man->GetStorage(gridURI);
452 if(!gridSto) {
453 Log("SHUTTLE",
454 Form("StoreOCDB - cannot activate main %s storage", type));
455 return kFALSE;
456 }
457
458 gridIds = gridSto->GetQueryCDBList();
459
460 // get objects previously stored in local CDB
461 AliCDBStorage *localSto = man->GetStorage(localURI);
462 if(!localSto) {
463 Log("SHUTTLE",
464 Form("StoreOCDB - cannot activate local %s storage", type));
465 return kFALSE;
466 }
467 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
468 // Local objects were stored with current run as Grid version!
469 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
470 localEntries->SetOwner(1);
471
472 // loop on local stored objects
473 TIter localIter(localEntries);
474 AliCDBEntry *aLocEntry = 0;
475 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
476 aLocEntry->SetOwner(1);
477 AliCDBId aLocId = aLocEntry->GetId();
478 aLocEntry->SetVersion(-1);
479 aLocEntry->SetSubVersion(-1);
480
481 // If local object is valid up to infinity we store it only if it is
482 // the first unprocessed run!
483 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
484 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
485 {
486 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
487 "there are previous unprocessed runs!",
488 fCurrentDetector.Data(), aLocId.GetPath().Data()));
489 continue;
490 }
491
492 // loop on Grid valid Id's
493 Bool_t store = kTRUE;
494 TIter gridIter(gridIds);
495 AliCDBId* aGridId = 0;
496 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
497 if(aGridId->GetPath() != aLocId.GetPath()) continue;
498 // skip all objects valid up to infinity
499 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
500 // if we get here, it means there's already some more recent object stored on Grid!
501 store = kFALSE;
502 break;
503 }
504
505 // If we get here, the file can be stored!
506 Bool_t storeOk = gridSto->Put(aLocEntry);
507 if(!store || storeOk){
508
509 if (!store)
510 {
511 Log(fCurrentDetector.Data(),
512 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
513 type, aGridId->ToString().Data()));
514 } else {
515 Log("SHUTTLE",
516 Form("StoreOCDB - Object <%s> successfully put into %s storage",
517 aLocId.ToString().Data(), type));
518 }
84090f85 519
3301427a 520 // removing local filename...
521 TString filename;
522 localSto->IdToFilename(aLocId, filename);
523 AliInfo(Form("Removing local file %s", filename.Data()));
524 RemoveFile(filename.Data());
525 continue;
526 } else {
527 Log("SHUTTLE",
528 Form("StoreOCDB - Grid %s storage of object <%s> failed",
529 type, aLocId.ToString().Data()));
530 result = kFALSE;
b948db8d 531 }
532 }
3301427a 533 localEntries->Clear();
2bb7b766 534
b948db8d 535 return result;
3301427a 536}
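// Illustration (not part of AliShuttle): a ROOT-free sketch of how the inner loop of
// StoreOCDB(const TString&) above derives its "store" flag. GridId, kInfiniteRun and
// NoFiniteGridObjectForPath are hypothetical names; kInfiniteRun only stands in for
// AliCDBRunRange::Infinity(). The sketch covers the flag only, not the Put() call.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

// Assumption: placeholder for AliCDBRunRange::Infinity().
constexpr int kInfiniteRun = std::numeric_limits<int>::max();

// Simplified stand-in for the (path, last run) part of an AliCDBId.
struct GridId {
  std::string path;
  int lastRun;
};

// Returns false as soon as the Grid list contains an object with the same path
// whose validity is finite; objects valid up to infinity are skipped.
bool NoFiniteGridObjectForPath(const std::vector<GridId>& gridIds, const std::string& localPath)
{
  for (const GridId& id : gridIds) {
    if (id.path != localPath) continue;       // different object
    if (id.lastRun == kInfiniteRun) continue;  // valid up to infinity: skipped
    return false;                              // finite-validity object already on Grid
  }
  return true;
}
```

// An empty Grid list, or one that only holds infinite-validity objects for the path,
// leaves the flag at kTRUE.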
537
538//______________________________________________________________________________________________
9827400b 539Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
540{
541 //
542 // Stores reference file directly (without opening it). This function stores the file locally
543 // renaming it to #runNumber_gridFileName.
544 //
545
546 if (fTestMode & kErrorStorage)
547 {
548 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
549 return kFALSE;
550 }
551
552 AliCDBManager* man = AliCDBManager::Instance();
553 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
554
555 TString localBaseFolder = sto->GetBaseFolder();
556
557 TString targetDir;
558 targetDir.Form("%s/%s", localBaseFolder.Data(), detector);
559
560 TString target;
561 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
562
563 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
564 if (result)
565 {
566 result = gSystem->mkdir(targetDir, kTRUE);
567 if (result != 0)
568 {
569 Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
570 return kFALSE;
571 }
572 }
573
574 result = gSystem->CopyFile(localFile, target);
575
576 if (result == 0)
577 {
578 Log("SHUTTLE", Form("StoreReferenceFile - Stored file %s locally to %s", localFile, target.Data()));
579 return kTRUE;
580 }
581 else
582 {
583 Log("SHUTTLE", Form("StoreReferenceFile - Storing file %s locally to %s failed", localFile, target.Data()));
584 return kFALSE;
585 }
586}
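// Illustration (not part of AliShuttle): a ROOT-free sketch of the local target name built
// in StoreReferenceFile above, i.e. <localBaseFolder>/<detector>/<run>_<gridFileName>.
// ReferenceFileTarget is a hypothetical helper; the folder and file names in the usage
// note are made-up examples.

```cpp
#include <cassert>
#include <string>

// Builds the same path as the two TString::Form calls in StoreReferenceFile:
// "%s/%s" for the target directory, then "%s/%d_%s" for the file inside it.
std::string ReferenceFileTarget(const std::string& localBaseFolder,
                                const std::string& detector,
                                int run,
                                const std::string& gridFileName)
{
  assert(run >= 0);
  const std::string targetDir = localBaseFolder + "/" + detector;
  return targetDir + "/" + std::to_string(run) + "_" + gridFileName;
}
```

// For example, ReferenceFileTarget("/tmp/ref", "TPC", 1234, "pedestals.root")
// yields "/tmp/ref/TPC/1234_pedestals.root".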
587
588//______________________________________________________________________________________________
589Bool_t AliShuttle::StoreRefFilesToGrid()
590{
591 //
592 // Transfers the reference file to the Grid.
593 // The final full path of the file is:
594 // gridBaseReferenceFolder/DET/#runNumber_gridFileName
595 //
596
597 AliCDBManager* man = AliCDBManager::Instance();
598 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
599 if (!sto)
600 return kFALSE;
601 TString localBaseFolder = sto->GetBaseFolder();
602
603 TString dir;
3d8bc902 604 dir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 605
606 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
607 if (!gridSto)
608 return kFALSE;
609 TString gridBaseFolder = gridSto->GetBaseFolder();
610 TString alienDir;
3d8bc902 611 alienDir.Form("%s%s", gridBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 612
3d8bc902 613 if (!gGrid)
9827400b 614 return kFALSE;
615
9827400b 616 TString begin;
617 begin.Form("%d_", GetCurrentRun());
618
619 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
3d8bc902 620 if (!baseDir)
621 return kTRUE;
622
9827400b 623 TList* dirList = baseDir->GetListOfFiles();
624 if (!dirList)
3d8bc902 625 {
626 delete baseDir;
9827400b 627 return kTRUE;
3d8bc902 628 }
9827400b 629
630 Int_t nDirs = dirList->GetEntries();
631
632 Bool_t success = kTRUE;
3d8bc902 633 Bool_t first = kTRUE;
9827400b 634
635 for (Int_t iDir=0; iDir<nDirs; ++iDir)
636 {
637 TSystemFile* entry = dynamic_cast<TSystemFile*> (dirList->At(iDir));
638 if (!entry)
639 continue;
640
641 if (entry->IsDirectory())
642 continue;
643
644 TString fileName(entry->GetName());
645 if (!fileName.BeginsWith(begin))
646 continue;
647
3d8bc902 648 if (first)
649 {
650 first = kFALSE;
651 // check that DET folder exists, otherwise create it
652 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
653
654 if (!result)
655 return kFALSE;
656
657 if (!result->GetFileName(0))
658 {
659 if (!gGrid->Mkdir(alienDir.Data(),"",0))
660 {
661 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
662 alienDir.Data()));
663 delete baseDir;
664 return kFALSE;
665 }
666
667 }
668 }
669
9827400b 670 TString fullLocalPath;
671 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
672
673 TString fullGridPath;
674 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
675
676 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s", fullLocalPath.Data(), fullGridPath.Data()));
677
678 TFileMerger fileMerger;
679 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
680
681 if (result)
682 {
683 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded", fullLocalPath.Data(), fullGridPath.Data()));
684 RemoveFile(fullLocalPath);
685 }
686 else
687 {
688 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s failed", fullLocalPath.Data(), fullGridPath.Data()));
689 success = kFALSE;
690 }
691 }
692
693 delete baseDir;
694
695 return success;
696}
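// Illustration (not part of AliShuttle): a ROOT-free sketch of the file selection done in
// StoreRefFilesToGrid above, which only transfers files whose name begins with
// "<run>_". SelectRunFiles is a hypothetical helper operating on plain file names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Keeps the names that start with "<run>_", mirroring the
// fileName.BeginsWith(begin) test in StoreRefFilesToGrid.
std::vector<std::string> SelectRunFiles(const std::vector<std::string>& names, int run)
{
  assert(run >= 0);
  const std::string prefix = std::to_string(run) + "_";
  std::vector<std::string> selected;
  for (const std::string& name : names) {
    if (name.compare(0, prefix.size(), prefix) == 0) selected.push_back(name);
  }
  return selected;
}
```

// Because the underscore is part of the prefix, files of run 1234 are not picked up
// while run 123 is being processed.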
697
698//______________________________________________________________________________________________
3301427a 699void AliShuttle::CleanLocalStorage(const TString& uri)
700{
9827400b 701 //
702 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
703 //
3301427a 704
705 const char* type = 0;
706 if(uri == fgkLocalCDB) {
707 type = "OCDB";
708 } else if(uri == fgkLocalRefStorage) {
709 type = "reference";
710 } else {
711 AliError(Form("Invalid storage URI: %s", uri.Data()));
712 return;
713 }
714
715 AliCDBManager* man = AliCDBManager::Instance();
b948db8d 716
3301427a 717 // open local storage
718 AliCDBStorage *localSto = man->GetStorage(uri);
719 if(!localSto) {
720 Log("SHUTTLE",
721 Form("CleanLocalStorage - cannot activate local %s storage", type));
722 return;
723 }
724
725 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
726 localSto->GetBaseFolder().Data(), fCurrentDetector.Data(), GetCurrentRun()));
727
728 AliInfo(Form("filename = %s", filename.Data()));
729
730 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
731 GetCurrentRun(), fCurrentDetector.Data()));
732
733 RemoveFile(filename.Data());
734
735}
736
737//______________________________________________________________________________________________
738void AliShuttle::RemoveFile(const char* filename)
739{
9827400b 740 //
741 // removes local file
742 //
3301427a 743
744 TString command(Form("rm -f %s", filename));
745
746 Int_t result = gSystem->Exec(command.Data());
747 if(result != 0)
748 {
749 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
750 fCurrentDetector.Data(), filename));
751 }
73abe331 752}
753
b948db8d 754//______________________________________________________________________________________________
5164a766 755AliShuttleStatus* AliShuttle::ReadShuttleStatus()
756{
9827400b 757 //
758 // Reads the AliShuttleStatus from the CDB
759 //
5164a766 760
2bb7b766 761 if (fStatusEntry){
762 delete fStatusEntry;
763 fStatusEntry = 0;
764 }
5164a766 765
10a5a932 766 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
2bb7b766 767 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
5164a766 768
2bb7b766 769 if (!fStatusEntry) return 0;
770 fStatusEntry->SetOwner(1);
5164a766 771
2bb7b766 772 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
773 if (!status) {
774 AliError("Invalid object stored to CDB!");
775 return 0;
776 }
5164a766 777
2bb7b766 778 return status;
5164a766 779}
780
781//______________________________________________________________________________________________
7bfb2090 782Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
5164a766 783{
9827400b 784 //
785 // writes the status for one subdetector
786 //
2bb7b766 787
788 if (fStatusEntry){
789 delete fStatusEntry;
790 fStatusEntry = 0;
791 }
5164a766 792
2bb7b766 793 Int_t run = GetCurrentRun();
5164a766 794
2bb7b766 795 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
5164a766 796
2bb7b766 797 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
798 fStatusEntry->SetOwner(1);
5164a766 799
2bb7b766 800 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
7bfb2090 801
2bb7b766 802 if (!result) {
3301427a 803 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
804 fCurrentDetector.Data(), run));
2bb7b766 805 return kFALSE;
806 }
e7f62f16 807
808 SendMLInfo();
7bfb2090 809
2bb7b766 810 return kTRUE;
5164a766 811}
812
813//______________________________________________________________________________________________
814void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
815{
9827400b 816 //
817 // changes the AliShuttleStatus for the given detector and run to the given status
818 //
5164a766 819
2bb7b766 820 if (!fStatusEntry){
821 AliError("UNEXPECTED: fStatusEntry empty");
822 return;
823 }
5164a766 824
2bb7b766 825 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
5164a766 826
2bb7b766 827 if (!status){
3301427a 828 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
2bb7b766 829 return;
830 }
5164a766 831
2c15234c 832 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
eba76848 833 fCurrentDetector.Data(),
36c99a6a 834 status->GetStatusName(),
eba76848 835 status->GetStatusName(newStatus));
cb343cfd 836 Log("SHUTTLE", actionStr);
837 SetLastAction(actionStr);
5164a766 838
2bb7b766 839 status->SetStatus(newStatus);
840 if (increaseCount) status->IncreaseCount();
5164a766 841
2bb7b766 842 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
e7f62f16 843
844 SendMLInfo();
5164a766 845}
e7f62f16 846
847//______________________________________________________________________________________________
848void AliShuttle::SendMLInfo()
849{
850 //
851 // sends ML information about the current status of the current detector being processed
852 //
853
854 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
855
856 if (!status){
3301427a 857 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
e7f62f16 858 return;
859 }
860
861 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
862 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
863
864 TList mlList;
865 mlList.Add(&mlStatus);
866 mlList.Add(&mlRetryCount);
867
868 fMonaLisa->SendParameters(&mlList);
869}
870
5164a766 871//______________________________________________________________________________________________
872Bool_t AliShuttle::ContinueProcessing()
873{
9827400b 874 // this function reads the AliShuttleStatus information from CDB and
875 // checks if the processing should be continued
876 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 877
57c1a579 878 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
879
880 AliPreprocessor* aPreprocessor =
881 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
882 if (!aPreprocessor)
883 {
884 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
885 return kFALSE;
886 }
887
2bb7b766 888 AliShuttleLogbookEntry::Status entryStatus =
eba76848 889 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 890
891 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 892 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 893 fCurrentDetector.Data(),
894 fLogbookEntry->GetDetectorStatusName(entryStatus)));
895 return kFALSE;
896 }
897
898 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 899
900 // check if current run is first unprocessed run for current detector
901 if (fConfig->StrictRunOrder(fCurrentDetector) &&
902 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
903 {
904 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
905 return kFALSE;
906 }
907
2bb7b766 908 AliShuttleStatus* status = ReadShuttleStatus();
909 if (!status) {
910 // first time
911 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
912 fCurrentDetector.Data()));
913 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
914 return WriteShuttleStatus(status);
915 }
916
917 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
918 // If it happens it may mean Logbook updating failed... let's do it now!
919 if (status->GetStatus() == AliShuttleStatus::kDone ||
920 status->GetStatus() == AliShuttleStatus::kFailed){
921 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
922 fCurrentDetector.Data(),
923 status->GetStatusName(status->GetStatus())));
924 UpdateShuttleLogbook(fCurrentDetector.Data(),
925 status->GetStatusName(status->GetStatus()));
926 return kFALSE;
927 }
928
3301427a 929 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 930 Log("SHUTTLE",
931 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
932 fCurrentDetector.Data()));
9827400b 933 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
934 if (StoreOCDB()){
3301427a 935 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
936 fCurrentDetector.Data()));
2bb7b766 937 UpdateShuttleStatus(AliShuttleStatus::kDone);
938 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
939 } else {
940 Log("SHUTTLE",
941 Form("ContinueProcessing - %s: Grid storage failed again",
942 fCurrentDetector.Data()));
9827400b 943 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 944 }
945 return kFALSE;
946 }
947
948 // if we get here, there is a restart
57c1a579 949 Bool_t cont = kFALSE;
2bb7b766 950
951 // abort conditions
cb343cfd 952 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 953 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
954 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 955 status->GetCount(), status->GetStatusName()));
956 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 957 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 958
959 // there may still be objects in local OCDB and reference storage
960 // and FXS databases may not be updated: do it now!
9827400b 961
962 // TODO Currently disabled, we want to keep files in case of failure!
963 // CleanLocalStorage(fgkLocalCDB);
964 // CleanLocalStorage(fgkLocalRefStorage);
965 // UpdateTableFailCase();
966
967 // Send mail to detector expert!
968 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
969 if (!SendMail())
970 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
971 fCurrentDetector.Data()));
3301427a 972
57c1a579 973 } else {
974 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
975 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
976 status->GetStatusName(), status->GetCount()));
9827400b 977 Bool_t increaseCount = kTRUE;
978 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
979 increaseCount = kFALSE;
980 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 981 cont = kTRUE;
2bb7b766 982 }
983
57c1a579 984 return cont;
5164a766 985}
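// Illustration only, not part of AliShuttle: a minimal, ROOT-free sketch of the
// restart decision that ContinueProcessing() above takes once the logbook says
// the detector is still unprocessed. State, Decision and DecideRetry are
// hypothetical names, and the states are a reduced stand-in for AliShuttleStatus.

```cpp
// Hypothetical, simplified stand-ins for the AliShuttleStatus states.
enum class State { kNone, kStarted, kDCSStarted, kDCSError, kPPError, kStoreError, kDone, kFailed };

struct Decision {
  bool proceed;        // run the preprocessor now?
  bool increaseCount;  // does this attempt count towards the retry limit?
};

// last: state left behind by the previous attempt (kNone if there was none)
// count: retries recorded so far; maxRetries: configured limit
Decision DecideRetry(State last, int count, int maxRetries)
{
  if (last == State::kNone) return {true, false};      // first attempt
  if (last == State::kDone || last == State::kFailed)
    return {false, false};                             // only the logbook needs updating
  if (last == State::kStoreError) return {false, false};  // only the Grid storage is retried
  if (count >= maxRetries) return {false, false};      // give up and mark as failed

  // Restart. Attempts that stopped in the DCS stage are not counted as retries.
  const bool dcsStage = (last == State::kDCSStarted || last == State::kDCSError);
  return {true, !dcsStage};
}
```

// Under these assumptions, DecideRetry(State::kPPError, 1, 3) restarts and counts
// the attempt, while DecideRetry(State::kDCSError, 1, 3) restarts without counting it.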
986
987//______________________________________________________________________________________________
2bb7b766 988Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 989{
73abe331 990 //
b948db8d 991 // Makes data retrieval for all detectors in the configuration.
2bb7b766 992 // entry: Shuttle logbook entry, contains run parameters and status of detectors
993 // (Unprocessed, Inactive, Failed or Done).
d477ad88 994 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 995 //
996
9827400b 997 if (!entry) return kFALSE;
2bb7b766 998
999 fLogbookEntry = entry;
1000
9827400b 1001 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1002 GetCurrentRun()));
2bb7b766 1003
e7f62f16 1004 // create ML instance that monitors this run
1005 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1006 // disable monitoring of other parameters that come e.g. from TFile
1007 gMonitoringWriter = 0;
2bb7b766 1008
e7f62f16 1009 // Send the information to ML
1010 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1011 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1012
1013 TList mlList;
1014 mlList.Add(&mlStatus);
9827400b 1015 mlList.Add(&mlRunType);
e7f62f16 1016
1017 fMonaLisa->SendParameters(&mlList);
3301427a 1018
9827400b 1019 if (fLogbookEntry->IsDone())
1020 {
1021 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1022 UpdateShuttleLogbook("shuttle_done");
1023 fLogbookEntry = 0;
1024 return kTRUE;
1025 }
1026
1027 // read test mode if flag is set
1028 if (fReadTestMode)
1029 {
3d8bc902 1030 fTestMode = kNone;
9827400b 1031 TString logEntry(entry->GetRunParameter("log"));
1032 //printf("log entry = %s\n", logEntry.Data());
1033 TString searchStr("Testmode: ");
1034 Int_t pos = logEntry.Index(searchStr.Data());
1035 //printf("%d\n", pos);
1036 if (pos >= 0)
1037 {
1038 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1039 //printf("%s\n", subStr.String().Data());
1040 TString newStr(subStr.Data());
1041 TObjArray* token = newStr.Tokenize(' ');
1042 if (token)
1043 {
1044 //token->Print();
1045 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1046 if (tmpStr)
1047 {
1048 Int_t testMode = tmpStr->String().Atoi();
1049 if (testMode > 0)
1050 {
1051 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1052 SetTestMode((TestMode) testMode);
1053 }
1054 }
1055 delete token;
1056 }
1057 }
1058 }
1059
3d8bc902 1060 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1061
eba76848 1062 fLogbookEntry->Print("all");
57f50b3c 1063
1064 // Initialization
d477ad88 1065 Bool_t hasError = kFALSE;
5164a766 1066
2bb7b766 1067 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1068 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1069 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1070 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1071
57f50b3c 1072 // Loop on detectors in the configuration
b948db8d 1073 TIter iter(fConfig->GetDetectors());
2bb7b766 1074 TObjString* aDetector = 0;
b948db8d 1075
be48e3ea 1076 while ((aDetector = (TObjString*) iter.Next()))
1077 {
7bfb2090 1078 fCurrentDetector = aDetector->String();
5164a766 1079
9e080f92 1080 if (ContinueProcessing() == kFALSE) continue;
1081
2bb7b766 1082 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1083 GetCurrentRun(), aDetector->GetName()));
1084
9d733021 1085 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1086
e7f62f16 1087 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1088
be48e3ea 1089 Int_t pid = fork();
1090
1091 if (pid < 0)
1092 {
1093 Log("SHUTTLE", "ERROR: Forking failed");
1094 }
1095 else if (pid > 0)
1096 {
1097 // parent
1098 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1099 GetCurrentRun(), aDetector->GetName()));
1100
1101 Long_t begin = time(0);
1102
1103 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1104 while (waitpid(pid, &status, WNOHANG) == 0)
1105 {
1106 Long_t expiredTime = time(0) - begin;
1107
1108 if (expiredTime > fConfig->GetPPTimeOut())
1109 {
9827400b 1110 TString tmp;
1111 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1112 fCurrentDetector.Data(), expiredTime);
1113 Log("SHUTTLE", tmp);
1114 Log(fCurrentDetector, tmp);
be48e3ea 1115
1116 kill(pid, 9);
1117
3301427a 1118 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1119 hasError = kTRUE;
1120
1121 gSystem->Sleep(1000);
1122 }
1123 else
1124 {
be48e3ea 1125 gSystem->Sleep(1000);
9827400b 1126
1127 TString checkStr;
1128 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1129 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1130 if (!pipe)
1131 {
1132 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1133 continue;
1134 }
1135
1136 char buffer[100];
1137 if (!fgets(buffer, 100, pipe))
1138 {
1139 Log("SHUTTLE", "Error: ps did not return anything");
1140 gSystem->ClosePipe(pipe);
1141 continue;
1142 }
1143 gSystem->ClosePipe(pipe);
1144
1145 //Log("SHUTTLE", Form("ps returned %s", buffer));
1146
1147 Int_t mem = 0;
1148 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1149 {
1150 Log("SHUTTLE", "Error: Could not parse output of ps");
1151 continue;
1152 }
1153
1154 if (expiredTime % 60 == 0)
886d60e6 1155 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1156 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1157
1158 if (mem > fConfig->GetPPMaxMem())
1159 {
1160 TString tmp;
1161 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1162 mem, fConfig->GetPPMaxMem());
1163 Log("SHUTTLE", tmp);
1164 Log(fCurrentDetector, tmp);
1165
1166 kill(pid, 9);
1167
1168 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1169 hasError = kTRUE;
1170
1171 gSystem->Sleep(1000);
1172 }
be48e3ea 1173 }
1174 }
1175
1176 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1177 GetCurrentRun(), aDetector->GetName()));
1178
1179 if (WIFEXITED(status))
1180 {
1181 Int_t returnCode = WEXITSTATUS(status);
1182
3301427a 1183 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1184 returnCode));
be48e3ea 1185
9827400b 1186 if (returnCode == 0) hasError = kTRUE; // the client exits with its Bool_t success flag, so 0 means failure
be48e3ea 1187 }
1188 }
1189 else if (pid == 0)
1190 {
1191 // client
1192 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1193
ffa29e93 1194 AliInfo("Redirecting output...");
1195
1196 if ((freopen(GetLogFileName(fCurrentDetector), "w", stdout)) == 0)
1197 {
1198 Log("SHUTTLE", "Could not freopen stdout");
1199 }
1200 else
1201 {
1202 fOutputRedirected = kTRUE;
1203 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1204 Log("SHUTTLE", "Could not redirect stderr");
1205
1206 }
1207
9827400b 1208 Bool_t success = ProcessCurrentDetector();
1209 if (success) // Preprocessor finished successfully!
1210 {
3301427a 1211 // Update time_processed field in FXS DB
1212 if (UpdateTable() == kFALSE)
1213 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1214
1215 // Transfer the data from local storage to main storage (Grid)
1216 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1217 if (StoreOCDB() == kFALSE)
1218 {
1219 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1220 GetCurrentRun(), aDetector->GetName()));
1221 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1222 success = kFALSE;
3301427a 1223 } else {
1224 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1225 GetCurrentRun(), aDetector->GetName()));
1226 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1227 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1228 }
be48e3ea 1229 }
1230
4b95672b 1231 for (UInt_t iSys=0; iSys<3; iSys++)
1232 {
1233 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1234 }
1235
be48e3ea 1236 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1237 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1238
1239 // the client exits here
9827400b 1240 gSystem->Exit(success);
be48e3ea 1241
1242 AliError("We should never get here!!!");
1243 }
7bfb2090 1244 }
5164a766 1245
2bb7b766 1246 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1247 GetCurrentRun()));
1248
1249 //check if shuttle is done for this run, if so update logbook
1250 TObjArray checkEntryArray;
1251 checkEntryArray.SetOwner(1);
9e080f92 1252 TString whereClause = Form("where run=%d", GetCurrentRun());
1253 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1254 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1255 GetCurrentRun()));
1256 return hasError == kFALSE;
1257 }
b948db8d 1258
9e080f92 1259 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1260 (checkEntryArray.At(0));
2bb7b766 1261
9e080f92 1262 if (checkEntry)
1263 {
1264 if (checkEntry->IsDone())
be48e3ea 1265 {
9e080f92 1266 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1267 UpdateShuttleLogbook("shuttle_done");
1268 }
1269 else
1270 {
1271 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1272 {
9e080f92 1273 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1274 {
9e080f92 1275 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1276 checkEntry->GetRun(), GetDetName(iDet)));
1277 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1278 }
1279 }
2bb7b766 1280 }
1281 }
1282
e7f62f16 1283 // remove ML instance
1284 delete fMonaLisa;
1285 fMonaLisa = 0;
1286
2bb7b766 1287 fLogbookEntry = 0;
85a80aa9 1288
a7160fe9 1289 return hasError == kFALSE;
73abe331 1290}
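// Illustration only, not part of AliShuttle: a minimal POSIX sketch of the
// fork / waitpid(WNOHANG) / kill pattern that Process() above uses to supervise
// a preprocessor. RunWithTimeout is a hypothetical helper. It omits the memory
// check and polls every 100 ms instead of every second.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Runs job() in a child process. Returns the child's exit code,
// -1 if it was killed after timeoutSec seconds or died abnormally,
// -2 if fork() or waitpid() failed.
int RunWithTimeout(int (*job)(), int timeoutSec)
{
  const pid_t pid = fork();
  if (pid < 0) return -2;
  if (pid == 0) _exit(job());  // child: run the job and exit with its result

  const time_t begin = time(nullptr);
  int status = 0;  // plain int on purpose, as required by waitpid
  bool killed = false;
  pid_t done = 0;
  while ((done = waitpid(pid, &status, WNOHANG)) == 0) {
    if (!killed && time(nullptr) - begin > timeoutSec) {
      kill(pid, SIGKILL);  // same effect as kill(pid, 9)
      killed = true;
    }
    usleep(100 * 1000);
  }

  if (done < 0) return -2;
  if (killed || !WIFEXITED(status)) return -1;
  return WEXITSTATUS(status);
}
```

// A caller would pass a function that does the work and returns its exit code,
// then treat any negative result as a supervision failure.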
1291
b948db8d 1292//______________________________________________________________________________________________
9827400b 1293Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1294{
1295 //
2bb7b766 1296 // Makes data retrieval just for a specific detector (fCurrentDetector).
73abe331 1297 // There should be a configuration for this detector.
73abe331 1298
2bb7b766 1299 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1300
2c15234c 1301 TMap dcsMap;
1302 dcsMap.SetOwner(1);
73abe331 1303
85a80aa9 1304 Bool_t aDCSError = kFALSE;
3301427a 1305
1306 // call preprocessor
1307 AliPreprocessor* aPreprocessor =
1308 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1309
1310 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1311
1312 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1313
3d8bc902 1314 if (!processDCS || (fTestMode & kSkipDCS))
2c15234c 1315 {
3d8bc902 1316 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
9827400b 1317 }
1318 else if (fTestMode & kErrorDCS)
1319 {
3d8bc902 1320 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1321 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1322 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1323 return kFALSE;
2c15234c 1324 } else {
3301427a 1325
1326 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1327
2c15234c 1328 TString host(fConfig->GetDCSHost(fCurrentDetector));
1329 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1330
1331 // Retrieval of Aliases
1332 TObjString* anAlias = 0;
36c99a6a 1333 Int_t iAlias = 1;
1334 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
2c15234c 1335 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1336 while ((anAlias = (TObjString*) iterAliases.Next()))
1337 {
1338 TObjArray *valueSet = new TObjArray();
1339 valueSet->SetOwner(1);
1340
36c99a6a 1341 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1342 AliInfo(Form("Querying DCS archive: alias %s (%d of %d)",
1343 anAlias->GetName(), iAlias, nTotAliases));
2c15234c 1344 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1345 iAlias++; // count every alias, not only the logged ones
1346 if(!aDCSError)
1347 {
1348 dcsMap.Add(anAlias->Clone(), valueSet);
1349 } else {
1350 Log(fCurrentDetector,
1351 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1352 anAlias->GetName()));
1353 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1354 dcsMap.DeleteAll();
9827400b 1355 return kFALSE;
2c15234c 1356 }
4f0ab988 1357 }
2c15234c 1358
1359 // Retrieval of Data Points
1360 TObjString* aDP = 0;
36c99a6a 1361 Int_t iDP = 1;
1362 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
2c15234c 1363 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1364 while ((aDP = (TObjString*) iterDP.Next()))
1365 {
1366 TObjArray *valueSet = new TObjArray();
1367 valueSet->SetOwner(1);
36c99a6a 1368 if (((iDP-1) % 500) == 0 || iDP == nTotDPs)
1369 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1370 aDP->GetName(), iDP, nTotDPs));
2c15234c 1371 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1372 iDP++; // count every data point, not only the logged ones
1373 if(!aDCSError)
1374 {
1375 dcsMap.Add(aDP->Clone(), valueSet);
1376 } else {
1377 Log(fCurrentDetector,
1378 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1379 aDP->GetName()));
1380 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1381 dcsMap.DeleteAll();
9827400b 1382 return kFALSE;
2c15234c 1383 }
73abe331 1384 }
1385 }
b948db8d 1386
2bb7b766 1387 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1388 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1389
3301427a 1390 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
b948db8d 1391
3301427a 1392 if (returnValue > 0) // Preprocessor error!
1393 {
9827400b 1394 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
cb343cfd 1395 UpdateShuttleStatus(AliShuttleStatus::kPPError);
9827400b 1396 dcsMap.DeleteAll();
1397 return kFALSE;
1398 }
1399
1400 // preprocessor ok!
1401 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1402 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1403 fCurrentDetector.Data()));
b948db8d 1404
2c15234c 1405 dcsMap.DeleteAll();
b948db8d 1406
9827400b 1407 return kTRUE;
2bb7b766 1408}
1409
1410//______________________________________________________________________________________________
1411Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1412 TObjArray& entries)
1413{
9827400b 1414 // Queries DAQ's Shuttle logbook and fills the detector status object.
1415 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1416 //
2bb7b766 1417
fc5a4708 1418 entries.SetOwner(1);
1419
2bb7b766 1420 // check connection, in case connect
be48e3ea 1421 if(!Connect(3)) return kFALSE;
2bb7b766 1422
1423 TString sqlQuery;
441b0e9c 1424 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1425
be48e3ea 1426 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1427 if (!aResult) {
1428 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1429 return kFALSE;
1430 }
1431
fc5a4708 1432 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1433
2bb7b766 1434 if(aResult->GetRowCount() == 0) {
9827400b 1435 AliInfo("No entries in Shuttle Logbook match request");
1436 delete aResult;
1437 return kTRUE;
2bb7b766 1438 }
1439
1440 // TODO Check field count!
fc5a4708 1441 const UInt_t nCols = 22;
2bb7b766 1442 if (aResult->GetFieldCount() != (Int_t) nCols) {
1443 AliError("Invalid SQL result field number!");
1444 delete aResult;
1445 return kFALSE;
1446 }
1447
2bb7b766 1448 TSQLRow* aRow;
1449 while ((aRow = aResult->Next())) {
1450 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1451 Int_t run = runString.Atoi();
1452
eba76848 1453 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1454 if (!entry)
1455 continue;
2bb7b766 1456
1457 // loop on detectors
eba76848 1458 for(UInt_t ii = 0; ii < nCols; ii++)
1459 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1460
eba76848 1461 entries.AddLast(entry);
2bb7b766 1462 delete aRow;
1463 }
1464
2bb7b766 1465 delete aResult;
1466 return kTRUE;
1467}
1468
1469//______________________________________________________________________________________________
eba76848 1470AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1471{
eba76848 1472 //
1473 // Retrieves the run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
1474 //
2bb7b766 1475
1476 // check connection, in case connect
be48e3ea 1477 if (!Connect(3))
eba76848 1478 return 0;
2bb7b766 1479
1480 TString sqlQuery;
2c15234c 1481 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1482
be48e3ea 1483 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1484 if (!aResult) {
1485 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1486 return 0;
2bb7b766 1487 }
1488
eba76848 1489 if (aResult->GetRowCount() == 0) {
2bb7b766 1490 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1491 delete aResult;
eba76848 1492 return 0;
2bb7b766 1493 }
1494
eba76848 1495 if (aResult->GetRowCount() > 1) {
2bb7b766 1496 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1497 delete aResult;
eba76848 1498 return 0;
2bb7b766 1499 }
1500
eba76848 1501 TSQLRow* aRow = aResult->Next();
1502 if (!aRow)
1503 {
1504 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1505 delete aResult;
1506 return 0;
1507 }
2bb7b766 1508
eba76848 1509 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1510
eba76848 1511 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1512 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1513
eba76848 1514 UInt_t startTime = entry->GetStartTime();
1515 UInt_t endTime = entry->GetEndTime();
1516
1517 if (!startTime || !endTime || startTime > endTime) {
1518 Log("SHUTTLE",
1519 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d",
1520 run, startTime, endTime));
1521 delete entry;
2bb7b766 1522 delete aRow;
eba76848 1523 delete aResult;
1524 return 0;
2bb7b766 1525 }
1526
eba76848 1527 delete aRow;
2bb7b766 1528 delete aResult;
eba76848 1529
1530 return entry;
2bb7b766 1531}
1532
1533//______________________________________________________________________________________________
2c15234c 1534Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1535 TObjArray* valueSet, DCSType type)
73abe331 1536{
9827400b 1537 // Retrieve all "entry" data points from the DCS server
1538 // host, port: TSocket connection parameters
1539 // entry: name of the alias or data point
1540 // valueSet: array of retrieved AliDCSValue's
1541 // type: kAlias or kDP
58bc3020 1542
73abe331 1543 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1544 if (!client.IsConnected())
1545 {
b948db8d 1546 return kFALSE;
73abe331 1547 }
1548
2c15234c 1549 Int_t result=0;
73abe331 1550
2c15234c 1551 if (type == kAlias)
1552 {
1553 result = client.GetAliasValues(entry,
1554 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1555 } else
1556 if (type == kDP)
1557 {
1558 result = client.GetDPValues(entry,
1559 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1560 }
1561
1562 if (result < 0)
1563 {
2bb7b766 1564 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1565 entry, AliDCSClient::GetErrorString(result)));
73abe331 1566
2c15234c 1567 if (result == AliDCSClient::fgkServerError)
1568 {
2bb7b766 1569 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1570 client.GetServerError().Data()));
1571 }
1572
1573 return kFALSE;
1574 }
1575
1576 return kTRUE;
1577}
b948db8d 1578
1579//______________________________________________________________________________________________
57f50b3c 1580const char* AliShuttle::GetFile(Int_t system, const char* detector,
1581 const char* id, const char* source)
b948db8d 1582{
9827400b 1583 // Get calibration file from file exchange servers
1584 // First queries the FXS database for the file name, using the run, detector, id and source info
1585 // then calls RetrieveFile(filename) for actual copy to local disk
1586 // run: current run being processed (given by Logbook entry fLogbookEntry)
1587 // detector: the Preprocessor name
1588 // id: provided as a parameter by the Preprocessor
1589 // source: provided by the Preprocessor through GetFileSources function
1590
1591 // check if test mode should simulate a FXS error
1592 if (fTestMode & kErrorFXSFiles)
1593 {
1594 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1595 return 0;
1596 }
1597
57f50b3c 1598 // check connection, in case connect
9d733021 1599 if (!Connect(system))
eba76848 1600 {
9d733021 1601 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1602 return 0;
1603 }
1604
1605 // Query preparation
9d733021 1606 TString sourceName(source);
d386d623 1607 Int_t nFields = 3;
1608 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1609 fConfig->GetFXSdbTable(system));
1610 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1611 GetCurrentRun(), detector, id);
1612
9d733021 1613 if (system == kDAQ)
1614 {
d386d623 1615 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1616 }
9d733021 1617 else if (system == kDCS)
eba76848 1618 {
9d733021 1619 sourceName="none";
57f50b3c 1620 }
9d733021 1621 else if (system == kHLT)
9e080f92 1622 {
d386d623 1623 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1624 nFields = 3;
9e080f92 1625 }
1626
9e080f92 1627 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1628
1629 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1630
1631 // Query execution
1632 TSQLResult* aResult = 0;
9d733021 1633 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1634 if (!aResult) {
9d733021 1635 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1636 GetSystemName(system), id, sourceName.Data()));
9e080f92 1637 return 0;
1638 }
1639
1640 if(aResult->GetRowCount() == 0)
1641 {
1642 Log(detector,
9d733021 1643 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1644 GetSystemName(system), id, sourceName.Data()));
9e080f92 1645 delete aResult;
1646 return 0;
1647 }
2bb7b766 1648
9e080f92 1649 if (aResult->GetRowCount() > 1) {
1650 Log(detector,
9d733021 1651 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1652 GetSystemName(system), id, sourceName.Data()));
9e080f92 1653 delete aResult;
1654 return 0;
1655 }
1656
9d733021 1657 if (aResult->GetFieldCount() != nFields) {
9e080f92 1658 Log(detector,
9d733021 1659 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1660 GetSystemName(system), id, sourceName.Data()));
9e080f92 1661 delete aResult;
1662 return 0;
1663 }
1664
1665 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1666
1667 if (!aRow){
9d733021 1668 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1669 GetSystemName(system), id, sourceName.Data()));
9e080f92 1670 delete aResult;
1671 return 0;
1672 }
1673
1674 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1675 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1676 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1677
1678 delete aResult;
1679 delete aRow;
1680
d386d623 1681 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1682 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1683
9e080f92 1684 // retrieved file is renamed to make it unique
9d733021 1685 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1686 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1687
9e080f92 1688
9d733021 1689 // file retrieval from FXS
4b95672b 1690 UInt_t nRetries = 0;
1691 UInt_t maxRetries = 3;
1692 Bool_t result = kFALSE;
1693
1694 // copy!! if successful TSystem::Exec returns 0
1695 while(nRetries++ < maxRetries) {
1696 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1697 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1698 if(!result)
1699 {
1700 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
9d733021 1701 filePath.Data(), GetSystemName(system)));
4b95672b 1702 continue;
1703 } else {
1704 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1705 filePath.Data(), GetSystemName(system),
1706 GetShuttleTempDir(), localFileName.Data()));
1707 }
9e080f92 1708
d386d623 1709 if (fileChecksum.Length()>0)
4b95672b 1710 {
1711 // compare md5sum of local file with the one stored in the FXS DB
1712 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s |grep %s > /dev/null 2>&1",
d386d623 1713 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 1714
4b95672b 1715 if (md5Comp != 0)
1716 {
1717 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1718 filePath.Data()));
1719 result = kFALSE;
1720 continue;
1721 }
d386d623 1722 } else {
1723 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1724 filePath.Data(), GetSystemName(system)));
9d733021 1725 }
4b95672b 1726 if (result) break;
9e080f92 1727 }
1728
4b95672b 1729 if(!result) return 0;
1730
9d733021 1731 fFXSCalled[system]=kTRUE;
1732 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1733 fFXSlist[system].Add(fileParams);
9e080f92 1734
1735 static TString fullLocalFileName;
36c99a6a 1736 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1737
9e080f92 1738 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1739
1740 return fullLocalFileName.Data();
2bb7b766 1741
1742}
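// Illustration only, not part of AliShuttle: a minimal sketch of the
// copy-then-verify retry loop used in GetFile() above. FetchWithRetries and its
// callback parameters are hypothetical. The real code copies via RetrieveFile()
// and compares checksums by calling md5sum through gSystem->Exec.

```cpp
#include <functional>
#include <string>

// copy: performs one copy attempt, returns true on success
// checksum: returns the checksum of the local copy
// expected: checksum stored in the database (empty means "not set")
bool FetchWithRetries(const std::function<bool()>& copy,
                      const std::function<std::string()>& checksum,
                      const std::string& expected,
                      unsigned maxRetries = 3)
{
  for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
    if (!copy()) continue;                    // copy failed: try again
    if (expected.empty()) return true;        // nothing to compare against
    if (checksum() == expected) return true;  // copy verified
    // checksum mismatch: fall through and retry the copy
  }
  return false;  // all attempts failed
}
```

// A failed copy and a checksum mismatch are handled the same way: both use up
// one attempt, and the file is accepted only when a copy passes verification.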
1743
1744//______________________________________________________________________________________________
9d733021 1745Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 1746{
9827400b 1747 //
1748 // Copies file from FXS to local Shuttle machine
1749 //
2bb7b766 1750
9e080f92 1751 // check temp directory: trying to cd to temp; if it does not exist, create it
9d733021 1752 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1753 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 1754
36c99a6a 1755 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 1756 if (dir == NULL) {
36c99a6a 1757 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1758 AliError(Form("Can't create directory <%s>", GetShuttleTempDir()));
9e080f92 1759 return kFALSE;
1760 }
1761
1762 } else {
1763 gSystem->FreeDirectory(dir);
1764 }
1765
9d733021 1766 TString baseFXSFolder;
1767 if (system == kDAQ)
1768 {
1769 baseFXSFolder = "FES/";
1770 }
1771 else if (system == kDCS)
1772 {
1773 baseFXSFolder = "";
1774 }
1775 else if (system == kHLT)
1776 {
1777 baseFXSFolder = "~/";
1778 }
1779
1780
1781 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1782 fConfig->GetFXSPort(system),
1783 fConfig->GetFXSUser(system),
1784 fConfig->GetFXSHost(system),
1785 baseFXSFolder.Data(),
1786 fxsFileName,
36c99a6a 1787 GetShuttleTempDir(),
9e080f92 1788 localFileName);
1789
1790 AliDebug(2, Form("%s",command.Data()));
1791
4b95672b 1792 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 1793
4b95672b 1794 return result;
9e080f92 1795}
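
// Illustrative sketch, not part of AliShuttle: how RetrieveFile() assembles its scp command.
// The enum values, the helper names BaseFxsFolder and BuildScpCommand, and the host, user
// and file names used in any call are assumptions made for this example. Only the command
// layout and the per-system base folders are taken from the code above.

```cpp
#include <cassert>
#include <string>

// Assumed to mirror the system indices used by the Shuttle.
enum FxsSystem { kDAQ = 0, kDCS = 1, kHLT = 2 };

// Base folder on the FXS host, as selected in RetrieveFile().
std::string BaseFxsFolder(FxsSystem system)
{
  switch (system) {
    case kDAQ: return "FES/";
    case kHLT: return "~/";
    default:   return "";  // kDCS: no prefix is prepended
  }
}

// Same layout as the Form() call: scp -oPort=<port> -2 <user>@<host>:<base><file> <tmp>/<local>
std::string BuildScpCommand(FxsSystem system, int port, const std::string& user,
                            const std::string& host, const std::string& fxsFileName,
                            const std::string& tempDir, const std::string& localFileName)
{
  return "scp -oPort=" + std::to_string(port) + " -2 " + user + "@" + host + ":" +
         BaseFxsFolder(system) + fxsFileName + " " + tempDir + "/" + localFileName;
}
```

// The string is passed to the shell unchanged, so file names containing spaces or shell
// metacharacters would need quoting before a command like this is executed.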
1796
1797//______________________________________________________________________________________________
9d733021 1798TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1799{
9827400b 1800 //
1801 // Get sources producing the condition file Id from file exchange servers
1802 //
1803
1804 // check if test mode should simulate a FXS error
1805 if (fTestMode & kErrorFXSSources)
1806 {
1807 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1808 return 0;
1809 }
1810
9d733021 1811
1812 if (system == kDCS)
1813 {
1814 AliError("DCS system has only one source of data!");
1815 return NULL;
9d733021 1816 }
9e080f92 1817
1818 // check connection, connect if needed
9d733021 1819 if (!Connect(system))
1820 {
1821 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
1822 return NULL;
9e080f92 1823 }
1824
9d733021 1825 TString sourceName;
1826 if (system == kDAQ)
1827 {
1828 sourceName = "DAQsource";
1829 } else if (system == kHLT)
1830 {
1831 sourceName = "DDLnumbers";
1832 }
1833
d386d623 1834 TString sqlQueryStart = Form("select %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
9e080f92 1835 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1836 GetCurrentRun(), detector, id);
1837 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1838
1839 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1840
1841 // Query execution
1842 TSQLResult* aResult;
9d733021 1843 aResult = fServer[system]->Query(sqlQuery);
9e080f92 1844 if (!aResult) {
9d733021 1845 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
1846 GetSystemName(system), id));
9e080f92 1847 return 0;
1848 }
1849
9d733021 1850 if (aResult->GetRowCount() == 0)
1851 {
9e080f92 1852 Log(detector,
9d733021 1853 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
9e080f92 1854 delete aResult;
1855 return 0;
1856 }
1857
1858 TSQLRow* aRow;
1859 TList *list = new TList();
1860 list->SetOwner(1);
1861
9d733021 1862 while ((aRow = aResult->Next()))
1863 {
9e080f92 1864
9d733021 1865 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
1866 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
1867 list->Add(new TObjString(source));
9e080f92 1868 delete aRow;
1869 }
9d733021 1870
9e080f92 1871 delete aResult;
1872
1873 return list;
2bb7b766 1874}
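
// Illustrative sketch, not part of AliShuttle: the query that GetFileSources() sends to the
// FXS database, rebuilt with std::string. BuildSourcesQuery is a hypothetical helper and the
// table name used in any call is an assumption. The column is "DAQsource" for DAQ and
// "DDLnumbers" for HLT, as in the code above.

```cpp
#include <cassert>
#include <string>

// Returns: select <column> from <table> where run=<run> and detector="<det>" and fileId="<id>"
std::string BuildSourcesQuery(const std::string& sourceColumn, const std::string& table,
                              int run, const std::string& detector, const std::string& fileId)
{
  return "select " + sourceColumn + " from " + table +
         " where run=" + std::to_string(run) +
         " and detector=\"" + detector + "\"" +
         " and fileId=\"" + fileId + "\"";
}
```

// Values are pasted into the SQL text without escaping, exactly as in the original Form()
// calls, so this is only safe for trusted detector names and file ids.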
1875
1876//______________________________________________________________________________________________
9d733021 1877Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 1878{
9827400b 1879 // Connect to the MySQL server hosting the FXS database of the given system
1880 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
1881 //
57f50b3c 1882
9d733021 1883 // check connection: if already connected return
1884 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 1885
9d733021 1886 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 1887
9d733021 1888 if (system < 3) // FXS db servers
1889 {
1890 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
1891 dbUser = fConfig->GetFXSdbUser(system);
1892 dbPass = fConfig->GetFXSdbPass(system);
1893 dbName = fConfig->GetFXSdbName(system);
1894 } else { // Run & Shuttle logbook servers
1895 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
1896 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
1897 dbUser = fConfig->GetDAQlbUser();
1898 dbPass = fConfig->GetDAQlbPass();
1899 dbName = fConfig->GetDAQlbDB();
1900 }
57f50b3c 1901
9d733021 1902 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
1903 if (!fServer[system] || !fServer[system]->IsConnected()) {
1904 if(system < 3)
1905 {
1906 AliError(Form("Can't establish connection to FXS database for %s",
1907 AliShuttleInterface::GetSystemName(system)));
1908 } else {
1909 AliError("Can't establish connection to Run logbook.");
57f50b3c 1910 }
9d733021 1911 delete fServer[system]; fServer[system] = 0; // reset so the next Connect() does not use a deleted server
1912 return kFALSE;
2bb7b766 1913 }
57f50b3c 1914
9d733021 1915 // Get tables
1916 TSQLResult* aResult=0;
1917 switch(system){
1918 case kDAQ:
1919 aResult = fServer[kDAQ]->GetTables(dbName.Data());
1920 break;
1921 case kDCS:
1922 aResult = fServer[kDCS]->GetTables(dbName.Data());
1923 break;
1924 case kHLT:
1925 aResult = fServer[kHLT]->GetTables(dbName.Data());
1926 break;
1927 default:
1928 aResult = fServer[3]->GetTables(dbName.Data());
1929 break;
1930 }
1931
1932 delete aResult;
2bb7b766 1933 return kTRUE;
1934}
57f50b3c 1935
9e080f92 1936//______________________________________________________________________________________________
9d733021 1937Bool_t AliShuttle::UpdateTable()
9e080f92 1938{
9827400b 1939 //
1940 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
1941 //
9e080f92 1942
9d733021 1943 Bool_t result = kTRUE;
9e080f92 1944
9d733021 1945 for (UInt_t system=0; system<3; system++)
1946 {
1947 if(!fFXSCalled[system]) continue;
9e080f92 1948
9d733021 1949 // check connection, connect if needed
1950 if (!Connect(system))
1951 {
1952 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
1953 result = kFALSE;
1954 continue;
9e080f92 1955 }
9e080f92 1956
9d733021 1957 TTimeStamp now; // now
1958
1959 // Loop on FXS list entries
1960 TIter iter(&fFXSlist[system]);
1961 TObjString *aFXSentry=0;
1962 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
1963 {
1964 TString aFXSentrystr = aFXSentry->String();
1965 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
1966 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
1967 {
1968 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
1969 GetSystemName(system), aFXSentrystr.Data()));
1970 if(aFXSarray) delete aFXSarray;
1971 result = kFALSE;
1972 continue;
1973 }
1974 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
1975 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
1976
1977 TString whereClause;
1978 if (system == kDAQ)
1979 {
1980 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
1981 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
1982 }
1983 else if (system == kDCS)
1984 {
1985 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
1986 GetCurrentRun(), fCurrentDetector.Data(), fileId);
1987 }
1988 else if (system == kHLT)
1989 {
1990 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
1991 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
1992 }
1993
1994 delete aFXSarray;
9e080f92 1995
9d733021 1996 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
1997 (Int_t) now.GetSec(), whereClause.Data()); // cast: GetSec() returns time_t, the format expects int
9e080f92 1998
9d733021 1999 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2000
9d733021 2001 // Query execution
2002 TSQLResult* aResult;
2003 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2004 if (!aResult)
2005 {
2006 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2007 GetSystemName(system), sqlQuery.Data()));
2008 result = kFALSE;
2009 continue;
2010 }
2011 delete aResult;
9e080f92 2012 }
9e080f92 2013 }
2014
9d733021 2015 return result;
9e080f92 2016}
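
// Illustrative sketch, not part of AliShuttle: the update statement built in UpdateTable()
// for one FXS list entry. BuildUpdateQuery is a hypothetical helper and the table name is
// an assumption. Passing an empty sourceColumn reproduces the DCS case, which has a single
// source and therefore no extra condition.

```cpp
#include <cassert>
#include <string>

// sourceColumn: "DAQsource" for DAQ, "DDLnumbers" for HLT, empty for DCS.
std::string BuildUpdateQuery(const std::string& table, long timeProcessed, int run,
                             const std::string& detector, const std::string& fileId,
                             const std::string& sourceColumn, const std::string& source)
{
  std::string query = "update " + table +
                      " set time_processed=" + std::to_string(timeProcessed) +
                      " where run=" + std::to_string(run) +
                      " and detector=\"" + detector + "\"" +
                      " and fileId=\"" + fileId + "\"";
  if (!sourceColumn.empty())
    query += " and " + sourceColumn + "=\"" + source + "\"";
  return query + ";";
}
```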
57f50b3c 2017
2bb7b766 2018//______________________________________________________________________________________________
3301427a 2019Bool_t AliShuttle::UpdateTableFailCase()
2020{
9827400b 2021 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2022 // this is called in case the preprocessor is declared failed for the current run, because
2023 // the fields are updated only in case of success
3301427a 2024
2025 Bool_t result = kTRUE;
2026
2027 for (UInt_t system=0; system<3; system++)
2028 {
2029 // check connection, connect if needed
2030 if (!Connect(system))
2031 {
2032 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2033 GetSystemName(system)));
2034 result = kFALSE;
2035 continue;
2036 }
2037
2038 TTimeStamp now; // now
2039
2040 // No loop on FXS list entries here: all rows of the current run and detector are updated
2041
2042 TString whereClause = Form("where run=%d and detector=\"%s\";",
2043 GetCurrentRun(), fCurrentDetector.Data());
2044
2045
2046 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2047 (Int_t) now.GetSec(), whereClause.Data()); // cast: GetSec() returns time_t, the format expects int
2048
2049 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2050
2051 // Query execution
2052 TSQLResult* aResult;
2053 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2054 if (!aResult)
2055 {
2056 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2057 GetSystemName(system), sqlQuery.Data()));
2058 result = kFALSE;
2059 continue;
2060 }
2061 delete aResult;
2062 }
2063
2064 return result;
2065}
2066
2067//______________________________________________________________________________________________
2bb7b766 2068Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2069{
e7f62f16 2070 //
2071 // Update Shuttle logbook filling detector or shuttle_done column
2072 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2073 //
57f50b3c 2074
2bb7b766 2075 // check connection, connect if needed
be48e3ea 2076 if(!Connect(3)){
2bb7b766 2077 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2078 return kFALSE;
57f50b3c 2079 }
2080
2bb7b766 2081 TString detName(detector);
2082 TString setClause;
e7f62f16 2083 if(detName == "shuttle_done")
2084 {
2bb7b766 2085 setClause = "set shuttle_done=1";
e7f62f16 2086
2087 // Send the information to ML
2088 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2089
2090 TList mlList;
2091 mlList.Add(&mlStatus);
2092
2093 fMonaLisa->SendParameters(&mlList);
2bb7b766 2094 } else {
2bb7b766 2095 TString statusStr(status);
2096 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2097 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2098 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2099 } else {
2100 Log("SHUTTLE",
2101 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2102 status, detector));
2103 return kFALSE;
2104 }
2105 }
57f50b3c 2106
2bb7b766 2107 TString whereClause = Form("where run=%d", GetCurrentRun());
2108
441b0e9c 2109 TString sqlQuery = Form("update %s %s %s",
2110 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2111
2bb7b766 2112 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2113
2114 // Query execution
2115 TSQLResult* aResult;
be48e3ea 2116 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2117 if (!aResult) {
2118 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2119 return kFALSE;
57f50b3c 2120 }
2bb7b766 2121 delete aResult;
57f50b3c 2122
2123 return kTRUE;
2124}
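
// Illustrative sketch, not part of AliShuttle: the status check and "set" clause of
// UpdateShuttleLogbook(), written with std::string. BuildSetClause and ToLower are
// hypothetical helpers. As in the original, any status that contains "done" or "failed"
// (ignoring case) is accepted, and "shuttle_done" is handled as a special column.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Lower-case copy of a string, used for the case-insensitive comparison.
std::string ToLower(std::string s)
{
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// Fills setClause and returns true, or returns false for an invalid status.
bool BuildSetClause(const std::string& detector, const std::string& status,
                    std::string& setClause)
{
  if (detector == "shuttle_done") {
    setClause = "set shuttle_done=1";
    return true;
  }
  const std::string lower = ToLower(status);
  if (lower.find("done") == std::string::npos &&
      lower.find("failed") == std::string::npos)
    return false;
  setClause = "set " + detector + "=\"" + status + "\"";
  return true;
}
```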
2125
2126//______________________________________________________________________________________________
2bb7b766 2127Int_t AliShuttle::GetCurrentRun() const
2128{
9827400b 2129 //
2130 // Get current run from logbook entry
2131 //
57f50b3c 2132
2bb7b766 2133 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2134}
2135
2136//______________________________________________________________________________________________
2bb7b766 2137UInt_t AliShuttle::GetCurrentStartTime() const
2138{
9827400b 2139 //
2140 // get current start time
2141 //
57f50b3c 2142
2bb7b766 2143 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2144}
2145
2146//______________________________________________________________________________________________
2bb7b766 2147UInt_t AliShuttle::GetCurrentEndTime() const
2148{
9827400b 2149 //
2150 // get current end time from logbook entry
2151 //
57f50b3c 2152
2bb7b766 2153 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2154}
2155
2156//______________________________________________________________________________________________
b948db8d 2157void AliShuttle::Log(const char* detector, const char* message)
2158{
9827400b 2159 //
2160 // Fill log string with a message
2161 //
b948db8d 2162
36c99a6a 2163 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2164 if (dir == NULL) {
36c99a6a 2165 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2166 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
84090f85 2167 return;
2168 }
b948db8d 2169
84090f85 2170 } else {
2171 gSystem->FreeDirectory(dir);
2172 }
b948db8d 2173
cb343cfd 2174 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2175 if (GetCurrentRun() >= 0)
2176 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2177 toLog += Form("%s", message);
2178
84090f85 2179 AliInfo(toLog.Data());
ffa29e93 2180
2181 // if we redirect the log output already to the file, leave here
2182 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2183 return;
b948db8d 2184
ffa29e93 2185 TString fileName = GetLogFileName(detector);
e7f62f16 2186
84090f85 2187 gSystem->ExpandPathName(fileName);
2188
2189 ofstream logFile;
2190 logFile.open(fileName, ofstream::out | ofstream::app);
2191
2192 if (!logFile.is_open()) {
2193 AliError(Form("Could not open file %s", fileName.Data()));
2194 return;
2195 }
7bfb2090 2196
84090f85 2197 logFile << toLog.Data() << "\n";
b948db8d 2198
84090f85 2199 logFile.close();
b948db8d 2200}
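
// Illustrative sketch, not part of AliShuttle: the layout of one log line as assembled in
// Log(). FormatLogLine is a hypothetical helper; the timestamp and process id are passed in
// as arguments here (an assumption made so the example is deterministic), whereas the
// original obtains them from TTimeStamp and getpid().

```cpp
#include <cassert>
#include <string>

// "<timestamp> (<pid>): <detector> - [run <run> - ]<message>"
// The run part is written only when a current run is known (run >= 0).
std::string FormatLogLine(const std::string& timestamp, int pid, const std::string& detector,
                          int run, const std::string& message)
{
  std::string line = timestamp + " (" + std::to_string(pid) + "): " + detector + " - ";
  if (run >= 0)
    line += "run " + std::to_string(run) + " - ";
  return line + message;
}
```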
2bb7b766 2201
2bb7b766 2202//______________________________________________________________________________________________
ffa29e93 2203TString AliShuttle::GetLogFileName(const char* detector) const
2204{
2205 //
2206 // returns the name of the log file for a given sub detector
2207 //
2208
2209 TString fileName;
2210
2211 if (GetCurrentRun() >= 0)
2212 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2213 else
2214 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2215
2216 return fileName;
2217}
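
// Illustrative sketch, not part of AliShuttle: the naming rule of GetLogFileName() with
// std::string. LogFileName is a hypothetical helper and the directory is passed explicitly
// (an assumption; the original reads it from GetShuttleLogDir()).

```cpp
#include <cassert>
#include <string>

// One log file per detector and run; without a current run (run < 0) the run suffix is omitted.
std::string LogFileName(const std::string& logDir, const std::string& detector, int run)
{
  if (run >= 0)
    return logDir + "/" + detector + "_" + std::to_string(run) + ".log";
  return logDir + "/" + detector + ".log";
}
```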
2218
2219//______________________________________________________________________________________________
2bb7b766 2220Bool_t AliShuttle::Collect(Int_t run)
2221{
9827400b 2222 //
2223 // Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default)
2224 // If a dedicated run is given this run is processed
2225 //
2226 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2227 //
2bb7b766 2228
eba76848 2229 if (run == -1)
2230 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2231 else
2232 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2233
2234 SetLastAction("Starting");
2bb7b766 2235
2236 TString whereClause("where shuttle_done=0");
eba76848 2237 if (run != -1)
2238 whereClause += Form(" and run=%d", run);
2bb7b766 2239
2240 TObjArray shuttleLogbookEntries;
be48e3ea 2241 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2242 {
cb343cfd 2243 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2244 return kFALSE;
2245 }
2246
9e080f92 2247 if (shuttleLogbookEntries.GetEntries() == 0)
2248 {
2249 if (run == -1)
2250 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2251 else
2252 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2253 "or it does not exist in Shuttle logbook", run));
2254 return kTRUE;
2255 }
2256
be48e3ea 2257 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2258 fFirstUnprocessed[iDet] = kTRUE;
2259
fc5a4708 2260 if (run != -1)
be48e3ea 2261 {
2262 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2263 // flag them into fFirstUnprocessed array
2264 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2265 TObjArray tmpLogbookEntries;
2266 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2267 {
2268 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2269 return kFALSE;
2270 }
2271
2272 TIter iter(&tmpLogbookEntries);
2273 AliShuttleLogbookEntry* anEntry = 0;
2274 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2275 {
2276 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2277 {
2278 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2279 {
2280 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2281 anEntry->GetRun(), GetDetName(iDet)));
2282 fFirstUnprocessed[iDet] = kFALSE;
2283 }
2284 }
2285
2286 }
2287
2288 }
2289
2290 if (!RetrieveConditionsData(shuttleLogbookEntries))
2291 {
cb343cfd 2292 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2293 return kFALSE;
2294 }
2295
36c99a6a 2296 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2297 return kTRUE;
2bb7b766 2298}
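
// Illustrative sketch, not part of AliShuttle: how Collect() derives the fFirstUnprocessed
// flags when a dedicated run is requested. Every detector starts as "first unprocessed" and
// loses the flag as soon as any earlier run still lists it as unprocessed. DetectorStatus
// and FirstUnprocessedFlags are hypothetical stand-ins for the AliShuttleLogbookEntry status
// values and for the loop in the function above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-detector status stored in a logbook entry.
enum DetectorStatus { kUnprocessed, kDone, kFailed };

// earlierRuns[r][d] is the status of detector d in the r-th earlier run.
std::vector<bool> FirstUnprocessedFlags(
    std::size_t nDetectors, const std::vector<std::vector<DetectorStatus> >& earlierRuns)
{
  std::vector<bool> first(nDetectors, true);
  for (std::size_t iRun = 0; iRun < earlierRuns.size(); ++iRun) {
    const std::vector<DetectorStatus>& run = earlierRuns[iRun];
    for (std::size_t iDet = 0; iDet < nDetectors && iDet < run.size(); ++iDet) {
      if (run[iDet] == kUnprocessed)
        first[iDet] = false;  // an earlier run is still pending for this detector
    }
  }
  return first;
}
```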
2299
2bb7b766 2300//______________________________________________________________________________________________
2301Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2302{
9827400b 2303 //
2304 // Retrieve conditions data for all runs that aren't processed yet
2305 //
2bb7b766 2306
2307 Bool_t hasError = kFALSE;
2308
2309 TIter iter(&dateEntries);
2310 AliShuttleLogbookEntry* anEntry;
2311
2312 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2313 if (!Process(anEntry)){
2314 hasError = kTRUE;
2315 }
4b95672b 2316
2317 // clean SHUTTLE temp directory
3301427a 2318 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2319 RemoveFile(filename.Data());
2bb7b766 2320 }
2321
2322 return hasError == kFALSE;
2323}
cb343cfd 2324
2325//______________________________________________________________________________________________
2326ULong_t AliShuttle::GetTimeOfLastAction() const
2327{
9827400b 2328 //
2329 // Gets time of last action
2330 //
2331
cb343cfd 2332 ULong_t tmp;
36c99a6a 2333
cb343cfd 2334 fMonitoringMutex->Lock();
be48e3ea 2335
cb343cfd 2336 tmp = fLastActionTime;
36c99a6a 2337
cb343cfd 2338 fMonitoringMutex->UnLock();
36c99a6a 2339
cb343cfd 2340 return tmp;
2341}
2342
2343//______________________________________________________________________________________________
2344const TString AliShuttle::GetLastAction() const
2345{
9827400b 2346 //
cb343cfd 2347 // returns a string description of the last action
9827400b 2348 //
cb343cfd 2349
2350 TString tmp;
36c99a6a 2351
cb343cfd 2352 fMonitoringMutex->Lock();
2353
2354 tmp = fLastAction;
2355
2356 fMonitoringMutex->UnLock();
2357
36c99a6a 2358 return tmp;
cb343cfd 2359}
2360
2361//______________________________________________________________________________________________
2362void AliShuttle::SetLastAction(const char* action)
2363{
9827400b 2364 //
cb343cfd 2365 // updates the monitoring variables
9827400b 2366 //
36c99a6a 2367
cb343cfd 2368 fMonitoringMutex->Lock();
36c99a6a 2369
cb343cfd 2370 fLastAction = action;
2371 fLastActionTime = time(0);
2372
2373 fMonitoringMutex->UnLock();
2374}
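
// Illustrative sketch, not part of AliShuttle: the monitoring accessors above copy the
// shared value while holding fMonitoringMutex and return the copy after unlocking. The same
// pattern is shown here with std::mutex and std::lock_guard, which is a swapped-in technique
// (the original uses explicit Lock()/UnLock() calls on its own mutex object). ActionMonitor
// is a hypothetical class name.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>
#include <string>

class ActionMonitor {
public:
  ActionMonitor() : fLastActionTime(0) {}

  // Update both monitoring variables under the lock.
  void SetLastAction(const std::string& action)
  {
    std::lock_guard<std::mutex> lock(fMutex);
    fLastAction = action;
    fLastActionTime = std::time(0);
  }

  // Return copies, so callers never hold references into the guarded state.
  std::string GetLastAction() const
  {
    std::lock_guard<std::mutex> lock(fMutex);
    return fLastAction;
  }

  std::time_t GetTimeOfLastAction() const
  {
    std::lock_guard<std::mutex> lock(fMutex);
    return fLastActionTime;
  }

private:
  mutable std::mutex fMutex;    // mutable: the getters are const but must lock
  std::string fLastAction;
  std::time_t fLastActionTime;
};
```

// The guard releases the mutex on every return path, including exceptions, which removes
// the need for a matching unlock call.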
eba76848 2375
2376//______________________________________________________________________________________________
2377const char* AliShuttle::GetRunParameter(const char* param)
2378{
9827400b 2379 //
2380 // returns run parameter read from DAQ logbook
2381 //
eba76848 2382
2383 if(!fLogbookEntry) {
2384 AliError("No logbook entry!");
2385 return 0;
2386 }
2387
2388 return fLogbookEntry->GetRunParameter(param);
2389}
57c1a579 2390
2391//______________________________________________________________________________________________
9827400b 2392AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
d386d623 2393{
9827400b 2394 //
2395 // returns object from OCDB valid for current run
2396 //
d386d623 2397
9827400b 2398 if (fTestMode & kErrorOCDB)
2399 {
2400 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2401 return 0;
2402 }
2403
d386d623 2404 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2405 if (!sto)
2406 {
9827400b 2407 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
d386d623 2408 return 0;
2409 }
2410
2411 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2412}
2413
2414//______________________________________________________________________________________________
57c1a579 2415Bool_t AliShuttle::SendMail()
2416{
9827400b 2417 //
2418 // sends a mail to the subdetector expert in case of preprocessor error
2419 //
2420
2421 if (fTestMode != kNone)
2422 return kTRUE;
57c1a579 2423
36c99a6a 2424 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
57c1a579 2425 if (dir == NULL)
2426 {
36c99a6a 2427 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
57c1a579 2428 {
36c99a6a 2429 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
57c1a579 2430 return kFALSE;
2431 }
2432
2433 } else {
2434 gSystem->FreeDirectory(dir);
2435 }
2436
2437 TString bodyFileName;
36c99a6a 2438 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
57c1a579 2439 gSystem->ExpandPathName(bodyFileName);
2440
2441 ofstream mailBody;
2442 mailBody.open(bodyFileName, ofstream::out);
2443
2444 if (!mailBody.is_open())
2445 {
2446 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2447 return kFALSE;
2448 }
2449
2450 TString to="";
2451 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2452 TObjString *anExpert=0;
2453 while ((anExpert = (TObjString*) iterExperts.Next()))
2454 {
2455 to += Form("%s,", anExpert->GetName());
2456 }
2457 if (to.Length() > 0) to.Remove(to.Length()-1); // drop the trailing comma, if any
909732f7 2458 AliDebug(2, Form("to: %s",to.Data()));
57c1a579 2459
36c99a6a 2460 // TODO this will be removed...
2461 if (to.Contains("not_yet_set")) {
2462 AliInfo("List of detector responsibles not yet set!");
2463 return kFALSE;
2464 }
2465
57c1a579 2466 TString cc="alberto.colla@cern.ch";
2467
2468 TString subject = Form("%s Shuttle preprocessor error in run %d !",
2469 fCurrentDetector.Data(), GetCurrentRun());
909732f7 2470 AliDebug(2, Form("subject: %s", subject.Data()));
57c1a579 2471
2472 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2473 body += Form("SHUTTLE just detected that your preprocessor "
36c99a6a 2474 "exited with ERROR state in run %d!!\n\n", GetCurrentRun());
57c1a579 2475 body += Form("Please check %s status on the web page asap!\n\n", fCurrentDetector.Data());
2476 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
2477
909732f7 2478 AliDebug(2, Form("Body begin: %s", body.Data()));
57c1a579 2479
2480 mailBody << body.Data();
2481 mailBody.close();
2482 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2483
9d733021 2484 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2485 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2486 if (gSystem->Exec(tailCommand.Data()))
2487 {
2488 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2489 }
2490
2491 TString endBody = Form("------------------------------------------------------\n\n");
36c99a6a 2492 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2493 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
57c1a579 2494 endBody += "Sincerely yours,\n\n \t\t\tthe SHUTTLE\n";
2495
909732f7 2496 AliDebug(2, Form("Body end: %s", endBody.Data()));
57c1a579 2497
2498 mailBody << endBody.Data();
2499
2500 mailBody.close();
2501
2502 // send mail!
2503 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2504 subject.Data(),
2505 cc.Data(),
2506 to.Data(),
2507 bodyFileName.Data());
909732f7 2508 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
57c1a579 2509
2510 Int_t result = gSystem->Exec(mailCommand.Data());
2511
2512 return result == 0;
2513}
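
// Illustrative sketch, not part of AliShuttle: SendMail() builds the recipient list by
// appending "<address>," for every expert and then removing the last character. The helper
// below (JoinRecipients, a hypothetical name) places the comma before each address except
// the first, which gives the same result and also copes with an empty list. The addresses
// used in any call are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Comma-separated recipient list without a trailing comma.
std::string JoinRecipients(const std::vector<std::string>& experts)
{
  std::string to;
  for (std::size_t i = 0; i < experts.size(); ++i) {
    if (i > 0) to += ",";
    to += experts[i];
  }
  return to;
}
```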
d386d623 2514
2515//______________________________________________________________________________________________
9827400b 2516const char* AliShuttle::GetRunType()
441b0e9c 2517{
9827400b 2518 //
2519 // returns run type read from "run type" logbook
2520 //
441b0e9c 2521
2522 if(!fLogbookEntry) {
2523 AliError("No logbook entry!");
2524 return 0;
2525 }
2526
9827400b 2527 return fLogbookEntry->GetRunType();
441b0e9c 2528}
2529
2530//______________________________________________________________________________________________
d386d623 2531void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2532{
9827400b 2533 //
2534 // sets Shuttle temp directory
2535 //
d386d623 2536
2537 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2538}
2539
2540//______________________________________________________________________________________________
2541void AliShuttle::SetShuttleLogDir(const char* logDir)
2542{
9827400b 2543 //
2544 // sets Shuttle log directory
2545 //
d386d623 2546
2547 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2548}