/**************************************************************************
 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
 *                                                                        *
 * Author: The ALICE Off-line Project.                                    *
 * Contributors are mentioned in the code where appropriate.              *
 *                                                                        *
 * Permission to use, copy, modify and distribute this software and its   *
 * documentation strictly for non-commercial purposes is hereby granted   *
 * without fee, provided that the above copyright notice appears in all   *
 * copies and that both the copyright notice and this permission notice   *
 * appear in the supporting documentation. The authors make no claims     *
 * about the suitability of this software for any purpose. It is          *
 * provided "as is" without express or implied warranty.                  *
 **************************************************************************/

/*
$Log$
Revision 1.37 2007/04/10 16:53:14 jgrosseo
redirecting sub detector stdout, stderr to sub detector log file

Revision 1.35 2007/04/04 16:26:38 acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. in AliShuttle.cxx: processing time and memory consumption info on a single line.

Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
If you always need DCS data (like before), you do not need to implement it.

6) The run type has been added to the monitoring page

Revision 1.33 2007/04/03 13:56:01 acolla
Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
run type.

Revision 1.32 2007/02/28 10:41:56 acolla
Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
AliPreprocessor::GetRunType() function.
Added some ldap definition files.

Revision 1.30 2007/02/13 11:23:21 acolla
Moved getters and setters of Shuttle's main OCDB/Reference, local
OCDB/Reference, temp and log folders to AliShuttleInterface

Revision 1.27 2007/01/30 17:52:42 jgrosseo
adding monalisa monitoring

Revision 1.26 2007/01/23 19:20:03 acolla
Removed old ldif files, added TOF, MCH ldif files. Added some options in
AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
SetShuttleLogDir

Revision 1.25 2007/01/15 19:13:52 acolla
Moved some AliInfo to AliDebug in SendMail function

Revision 1.21 2006/12/07 08:51:26 jgrosseo
update (alberto):
table, db names in ldap configuration
added GRP preprocessor
DCS data can also be retrieved by data point

Revision 1.20 2006/11/16 16:16:48 jgrosseo
introducing strict run ordering flag
removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)

Revision 1.19 2006/11/06 14:23:04 jgrosseo
major update (Alberto)
o) reading of run parameters from the logbook
o) online offline naming conversion
o) standalone DCSclient package

Revision 1.18 2006/10/20 15:22:59 jgrosseo
o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
o) Merging Collect, CollectAll, CollectNew function
o) Removing implementation of empty copy constructors (declaration still there!)

Revision 1.17 2006/10/05 16:20:55 jgrosseo
adapting to new CDB classes

Revision 1.16 2006/10/05 15:46:26 jgrosseo
applying to the new interface

Revision 1.15 2006/10/02 16:38:39 jgrosseo
update (alberto):
fixed memory leaks
storing of objects that failed to be stored to the grid before
interfacing of shuttle status table in daq system

Revision 1.14 2006/08/29 09:16:05 jgrosseo
small update

Revision 1.13 2006/08/15 10:50:00 jgrosseo
effc++ corrections (alberto)

Revision 1.12 2006/08/08 14:19:29 jgrosseo
Update to shuttle classes (Alberto)

- Possibility to set the full object's path in the Preprocessor's and
Shuttle's Store functions
- Possibility to extend the object's run validity in the same classes
("startValidity" and "validityInfinite" parameters)
- Implementation of the StoreReferenceData function to store reference
data in a dedicated CDB storage.

Revision 1.11 2006/07/21 07:37:20 jgrosseo
last run is stored after each run

Revision 1.10 2006/07/20 09:54:40 jgrosseo
introducing status management: The processing per subdetector is divided into several steps,
after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
can keep track of the number of failures and skips further processing after a certain threshold is
exceeded. These thresholds can be configured in LDAP.

Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)

Revision 1.8 2006/07/11 12:44:36 jgrosseo
adding parameters for extended validity range of data produced by preprocessor

Revision 1.7 2006/07/10 14:37:09 jgrosseo
small fix + todo comment

Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)

Revision 1.5 2006/07/04 14:59:57 jgrosseo
revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2

Revision 1.4 2006/06/12 09:11:16 jgrosseo
coding conventions (Alberto)

Revision 1.3 2006/06/06 14:26:40 jgrosseo
o) removed files that were moved to STEER
o) shuttle updated to follow the new interface (Alberto)

Revision 1.2 2006/03/07 07:52:34 hristov
New version (B.Yordanov)

Revision 1.6 2005/11/19 17:19:14 byordano
RetrieveDATEEntries and RetrieveConditionsData added

Revision 1.5 2005/11/19 11:09:27 byordano
AliShuttle declaration added

Revision 1.4 2005/11/17 17:47:34 byordano
TList changed to TObjArray

Revision 1.3 2005/11/17 14:43:23 byordano
import to local CVS

Revision 1.1.1.1 2005/10/28 07:33:58 hristov
Initial import as subdirectory in AliRoot

Revision 1.2 2005/09/13 08:41:15 byordano
default startTime endTime added

Revision 1.4 2005/08/30 09:13:02 byordano
some docs added

Revision 1.3 2005/08/29 21:15:47 byordano
some docs added

*/

//
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If there is a registered
// AliPreprocessor for this detector then it will be used
// according to the schema (see AliPreprocessor).
// If there isn't a registered AliPreprocessor then the retrieved
// data is stored automatically to the underlying AliCDBStorage.
// The alias name is used as detSpec.
//

#include "AliShuttle.h"

#include "AliCDBManager.h"
#include "AliCDBStorage.h"
#include "AliCDBId.h"
#include "AliCDBRunRange.h"
#include "AliCDBPath.h"
#include "AliCDBEntry.h"
#include "AliShuttleConfig.h"
#include "DCSClient/AliDCSClient.h"
#include "AliLog.h"
#include "AliPreprocessor.h"
#include "AliShuttleStatus.h"
#include "AliShuttleLogbookEntry.h"

#include <TSystem.h>
#include <TObject.h>
#include <TString.h>
#include <TTimeStamp.h>
#include <TObjString.h>
#include <TSQLServer.h>
#include <TSQLResult.h>
#include <TSQLRow.h>
#include <TMutex.h>
#include <TSystemDirectory.h>
#include <TSystemFile.h>
#include <TFileMerger.h>
#include <TGrid.h>
#include <TGridResult.h>

#include <TMonaLisaWriter.h>

#include <fstream>

#include <sys/types.h>
#include <sys/wait.h>

ClassImp(AliShuttle)

//______________________________________________________________________________________________
AliShuttle::AliShuttle(const AliShuttleConfig* config,
    UInt_t timeout, Int_t retries):
fConfig(config),
fTimeout(timeout), fRetries(retries),
fPreprocessorMap(),
fLogbookEntry(0),
fCurrentDetector(),
fStatusEntry(0),
fMonitoringMutex(0),
fLastActionTime(0),
fLastAction(),
fMonaLisa(0),
fTestMode(kNone),
fReadTestMode(kFALSE),
fOutputRedirected(kFALSE)
{
  //
  // config: AliShuttleConfig used
  // timeout: timeout used for AliDCSClient connection
  // retries: the number of retries in case of connection error.
  //

  if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
  for(int iSys=0;iSys<4;iSys++) {
    fServer[iSys]=0;
    if (iSys < 3)
      fFXSlist[iSys].SetOwner(kTRUE);
  }
  fPreprocessorMap.SetOwner(kTRUE);

  for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
    fFirstUnprocessed[iDet] = kFALSE;

  fMonitoringMutex = new TMutex();
}

//______________________________________________________________________________________________
AliShuttle::~AliShuttle()
{
  //
  // destructor
  //

  fPreprocessorMap.DeleteAll();
  for(int iSys=0;iSys<4;iSys++)
    if(fServer[iSys]) {
      fServer[iSys]->Close();
      delete fServer[iSys];
      fServer[iSys] = 0;
    }

  if (fStatusEntry){
    delete fStatusEntry;
    fStatusEntry = 0;
  }

  if (fMonitoringMutex)
  {
    delete fMonitoringMutex;
    fMonitoringMutex = 0;
  }
}

//______________________________________________________________________________________________
void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
{
  //
  // Registers a new AliPreprocessor.
  // It uses GetName() as the identifier of the preprocessor.
  // The preprocessor is registered only if there isn't any other
  // with the same identifier (GetName()).
  //

  const char* detName = preprocessor->GetName();
  if(GetDetPos(detName) < 0)
    AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));

  if (fPreprocessorMap.GetValue(detName)) {
    AliWarning(Form("AliPreprocessor %s is already registered!", detName));
    return;
  }

  fPreprocessorMap.Add(new TObjString(detName), preprocessor);
}
//______________________________________________________________________________________________
Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
    AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
{
  // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
  // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
  // using this function. Use StoreReferenceData instead!
  // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
  // finishes the data are transferred to the main storage (Grid).

  return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
}

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
{
  // Stores a CDB object in the storage for reference data. These objects will not be available during
  // offline reconstruction. Use this function for reference data only!
  // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
  // finishes the data are transferred to the main storage (Grid).

  return StoreLocally(fgkLocalRefStorage, path, object, metaData);
}

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreLocally(const TString& localUri,
    const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
    Int_t validityStart, Bool_t validityInfinite)
{
  // Stores the object temporarily in local storage. Parameters are passed by the Store and
  // StoreReferenceData functions. When the preprocessor finishes, the data are transferred
  // to the main storage (Grid).
  // The parameters are:
  // 1) URI of the backup storage (local)
  // 2) the object's path
  // 3) the object to be stored
  // 4) the metaData to be associated with the object
  // 5) the validity start run number w.r.t. the current run,
  //    if the data is valid only for this run leave the default 0
  // 6) specifies if the calibration data is valid for infinity (this means until updated),
  //    typical for calibration runs, the default is kFALSE
  //
  // returns kFALSE on failure, kTRUE otherwise

  if (fTestMode & kErrorStorage)
  {
    Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
    return kFALSE;
  }

  const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";

  Int_t firstRun = GetCurrentRun() - validityStart;
  if(firstRun < 0) {
    AliWarning("First valid run happens to be less than 0! Setting it to 0.");
    firstRun=0;
  }

  Int_t lastRun = -1;
  if(validityInfinite) {
    lastRun = AliCDBRunRange::Infinity();
  } else {
    lastRun = GetCurrentRun();
  }

  // Version is set to current run, it will be used later to transfer data to Grid
  AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);

  if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
    TObjString runUsed = Form("%d", GetCurrentRun());
    metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
  }

  Bool_t result = kFALSE;

  if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
    Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
  } else {
    result = AliCDBManager::Instance()->GetStorage(localUri)
          ->Put(object, id, metaData);
  }

  if(!result) {

    Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
  }

  return result;
}

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB()
{
  //
  // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
  // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
  // Then calls StoreRefFilesToGrid to store reference files.
  //

  if (fTestMode & kErrorGrid)
  {
    Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
    Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
    return kFALSE;
  }

  AliInfo("Storing OCDB data ...");
  Bool_t resultCDB = StoreOCDB(fgkMainCDB);

  AliInfo("Storing reference data ...");
  Bool_t resultRef = StoreOCDB(fgkMainRefStorage);

  AliInfo("Storing reference files ...");
  Bool_t resultRefFiles = StoreRefFilesToGrid();

  return resultCDB && resultRef && resultRefFiles;
}

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
{
  //
  // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
  //

  TObjArray* gridIds=0;

  Bool_t result = kTRUE;

  const char* type = 0;
  TString localURI;
  if(gridURI == fgkMainCDB) {
    type = "OCDB";
    localURI = fgkLocalCDB;
  } else if(gridURI == fgkMainRefStorage) {
    type = "reference";
    localURI = fgkLocalRefStorage;
  } else {
    AliError(Form("Invalid storage URI: %s", gridURI.Data()));
    return kFALSE;
  }

  AliCDBManager* man = AliCDBManager::Instance();

  AliCDBStorage *gridSto = man->GetStorage(gridURI);
  if(!gridSto) {
    Log("SHUTTLE",
      Form("StoreOCDB - cannot activate main %s storage", type));
    return kFALSE;
  }

  gridIds = gridSto->GetQueryCDBList();

  // get objects previously stored in local CDB
  AliCDBStorage *localSto = man->GetStorage(localURI);
  if(!localSto) {
    Log("SHUTTLE",
      Form("StoreOCDB - cannot activate local %s storage", type));
    return kFALSE;
  }
  AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
  // Local objects were stored with current run as Grid version!
  TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
  localEntries->SetOwner(1);

  // loop on local stored objects
  TIter localIter(localEntries);
  AliCDBEntry *aLocEntry = 0;
  while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
    aLocEntry->SetOwner(1);
    AliCDBId aLocId = aLocEntry->GetId();
    aLocEntry->SetVersion(-1);
    aLocEntry->SetSubVersion(-1);

    // If local object is valid up to infinity we store it only if it is
    // the first unprocessed run!
    if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
        !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
    {
      Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
            "there are previous unprocessed runs!",
            fCurrentDetector.Data(), aLocId.GetPath().Data()));
      continue;
    }

    // loop on Grid valid Id's
    Bool_t store = kTRUE;
    TIter gridIter(gridIds);
    AliCDBId* aGridId = 0;
    while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
      if(aGridId->GetPath() != aLocId.GetPath()) continue;
      // skip all objects valid up to infinity
      if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
      // if we get here, it means there's already some more recent object stored on Grid!
      store = kFALSE;
      break;
    }

    // Put the object on the Grid only if no more recent object was found there
    Bool_t storeOk = kFALSE;
    if (store) storeOk = gridSto->Put(aLocEntry);
    if(!store || storeOk){

      if (!store)
      {
        Log(fCurrentDetector.Data(),
          Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
            type, aGridId->ToString().Data()));
      } else {
        Log("SHUTTLE",
          Form("StoreOCDB - Object <%s> successfully put into %s storage",
            aLocId.ToString().Data(), type));
      }

      // removing local filename...
      TString filename;
      localSto->IdToFilename(aLocId, filename);
      AliInfo(Form("Removing local file %s", filename.Data()));
      RemoveFile(filename.Data());
      continue;
    } else {
      Log("SHUTTLE",
        Form("StoreOCDB - Grid %s storage of object <%s> failed",
          type, aLocId.ToString().Data()));
      result = kFALSE;
    }
  }
  localEntries->Clear();

  return result;
}
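
// Illustrative sketch (not part of AliShuttle): the check that StoreOCDB(const TString&)
// performs against the Grid query list, reduced to plain C++. SketchId,
// SketchIsBlockedByGrid and kSketchInfiniteRun are hypothetical names; the sentinel only
// stands in for AliCDBRunRange::Infinity().
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

const int kSketchInfiniteRun = std::numeric_limits<int>::max();

struct SketchId {
  std::string fPath; // e.g. "DET/Calib/Data"
  int fLastRun;      // last valid run, or kSketchInfiniteRun
};

// Returns true if the Grid already holds an object with the same path whose validity is
// finite; entries valid up to infinity are skipped and never block the upload.
bool SketchIsBlockedByGrid(const SketchId& local, const std::vector<SketchId>& gridIds)
{
  for (std::size_t i = 0; i < gridIds.size(); ++i) {
    if (gridIds[i].fPath != local.fPath) continue;
    if (gridIds[i].fLastRun == kSketchInfiniteRun) continue;
    return true;
  }
  return false;
}
// A local object is uploaded only when SketchIsBlockedByGrid returns false for it.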

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
{
  //
  // Stores reference file directly (without opening it). This function stores the file locally.
  //
  // The file is stored under the following location:
  // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
  // where <gridFileName> is the third parameter given to the function
  //

  if (fTestMode & kErrorStorage)
  {
    Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
    return kFALSE;
  }

  AliCDBManager* man = AliCDBManager::Instance();
  AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
  if (!sto)
  {
    // same guard as in StoreRefFilesToGrid: do not dereference a missing storage
    Log("SHUTTLE", "StoreReferenceFile - Cannot activate local reference storage");
    return kFALSE;
  }

  TString localBaseFolder = sto->GetBaseFolder();

  TString targetDir;
  targetDir.Form("%s/%s", localBaseFolder.Data(), detector);

  TString target;
  target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);

  Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
  if (result)
  {
    result = gSystem->mkdir(targetDir, kTRUE);
    if (result != 0)
    {
      Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
      return kFALSE;
    }
  }

  result = gSystem->CopyFile(localFile, target);

  if (result == 0)
  {
    Log("SHUTTLE", Form("StoreReferenceFile - Stored file %s locally to %s", localFile, target.Data()));
    return kTRUE;
  }
  else
  {
    Log("SHUTTLE", Form("StoreReferenceFile - Storing file %s locally to %s failed", localFile, target.Data()));
    return kFALSE;
  }
}
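
// Illustrative sketch (not part of AliShuttle): how the local target of StoreReferenceFile
// is composed, <base folder>/<DET>/<RUN#>_<gridFileName>, using only the standard library.
// SketchReferenceTarget is a hypothetical helper; the function above builds the same
// string with TString::Form.
#include <sstream>
#include <string>

std::string SketchReferenceTarget(const std::string& baseFolder, const std::string& detector,
                                  int run, const std::string& gridFileName)
{
  std::ostringstream target;
  target << baseFolder << "/" << detector << "/" << run << "_" << gridFileName;
  return target.str();
}
// Example: SketchReferenceTarget("/tmp/ref", "TPC", 7, "pedestals.root") yields
// "/tmp/ref/TPC/7_pedestals.root". The run-number prefix is what StoreRefFilesToGrid
// later uses to select the files that belong to the current run.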

//______________________________________________________________________________________________
Bool_t AliShuttle::StoreRefFilesToGrid()
{
  //
  // Transfers the reference files of the current run and detector to the Grid.
  //
  // Each file is stored under the following location:
  // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
  // where <gridFileName> is the name that was given to StoreReferenceFile
  //

  AliCDBManager* man = AliCDBManager::Instance();
  AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
  if (!sto)
    return kFALSE;
  TString localBaseFolder = sto->GetBaseFolder();

  TString dir;
  dir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));

  AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
  if (!gridSto)
    return kFALSE;
  TString gridBaseFolder = gridSto->GetBaseFolder();
  TString alienDir;
  alienDir.Form("%s%s", gridBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));

  if (!gGrid)
    return kFALSE;

  TString begin;
  begin.Form("%d_", GetCurrentRun());

  TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
  if (!baseDir)
    return kTRUE;

  TList* dirList = baseDir->GetListOfFiles();
  if (!dirList)
  {
    delete baseDir;
    return kTRUE;
  }

  Int_t nDirs = dirList->GetEntries();

  Bool_t success = kTRUE;
  Bool_t first = kTRUE;

  for (Int_t iDir=0; iDir<nDirs; ++iDir)
  {
    TSystemFile* entry = dynamic_cast<TSystemFile*> (dirList->At(iDir));
    if (!entry)
      continue;

    if (entry->IsDirectory())
      continue;

    TString fileName(entry->GetName());
    if (!fileName.BeginsWith(begin))
      continue;

    if (first)
    {
      first = kFALSE;
      // check that DET folder exists, otherwise create it
      TGridResult* result = gGrid->Ls(alienDir.Data(), "a");

      if (!result)
      {
        // release the directory object before the early exit
        delete baseDir;
        return kFALSE;
      }

      if (!result->GetFileName(0))
      {
        if (!gGrid->Mkdir(alienDir.Data(),"",0))
        {
          Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
              alienDir.Data()));
          delete baseDir;
          return kFALSE;
        }

      }
    }

    TString fullLocalPath;
    fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());

    TString fullGridPath;
    fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());

    Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s", fullLocalPath.Data(), fullGridPath.Data()));

    TFileMerger fileMerger;
    Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);

    if (result)
    {
      Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded", fullLocalPath.Data(), fullGridPath.Data()));
      RemoveFile(fullLocalPath);
    }
    else
    {
      Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s failed", fullLocalPath.Data(), fullGridPath.Data()));
      success = kFALSE;
    }
  }

  delete baseDir;

  return success;
}

//______________________________________________________________________________________________
void AliShuttle::CleanLocalStorage(const TString& uri)
{
  //
  // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
  //

  const char* type = 0;
  if(uri == fgkLocalCDB) {
    type = "OCDB";
  } else if(uri == fgkLocalRefStorage) {
    type = "reference";
  } else {
    AliError(Form("Invalid storage URI: %s", uri.Data()));
    return;
  }

  AliCDBManager* man = AliCDBManager::Instance();

  // open local storage
  AliCDBStorage *localSto = man->GetStorage(uri);
  if(!localSto) {
    Log("SHUTTLE",
      Form("CleanLocalStorage - cannot activate local %s storage", type));
    return;
  }

  TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
    localSto->GetBaseFolder().Data(), fCurrentDetector.Data(), GetCurrentRun()));

  AliInfo(Form("filename = %s", filename.Data()));

  AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
    GetCurrentRun(), fCurrentDetector.Data()));

  RemoveFile(filename.Data());

}

//______________________________________________________________________________________________
void AliShuttle::RemoveFile(const char* filename)
{
  //
  // removes local file
  //

  TString command(Form("rm -f %s", filename));

  Int_t result = gSystem->Exec(command.Data());
  if(result != 0)
  {
    Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
      fCurrentDetector.Data(), filename));
  }
}

//______________________________________________________________________________________________
AliShuttleStatus* AliShuttle::ReadShuttleStatus()
{
  //
  // Reads the AliShuttleStatus from the CDB
  //

  if (fStatusEntry){
    delete fStatusEntry;
    fStatusEntry = 0;
  }

  fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
    ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());

  if (!fStatusEntry) return 0;
  fStatusEntry->SetOwner(1);

  AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
  if (!status) {
    AliError("Invalid object stored to CDB!");
    return 0;
  }

  return status;
}

//______________________________________________________________________________________________
Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
{
  //
  // writes the status for one subdetector
  //

  if (fStatusEntry){
    delete fStatusEntry;
    fStatusEntry = 0;
  }

  Int_t run = GetCurrentRun();

  AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);

  fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
  fStatusEntry->SetOwner(1);

  // Put reports success as a boolean
  Bool_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);

  if (!result) {
    Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
      fCurrentDetector.Data(), run));
    return kFALSE;
  }

  SendMLInfo();

  return kTRUE;
}

//______________________________________________________________________________________________
void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
{
  //
  // changes the AliShuttleStatus for the given detector and run to the given status
  //

  if (!fStatusEntry){
    AliError("UNEXPECTED: fStatusEntry empty");
    return;
  }

  AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());

  if (!status){
    Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
    return;
  }

  TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
        fCurrentDetector.Data(),
        status->GetStatusName(),
        status->GetStatusName(newStatus));
  Log("SHUTTLE", actionStr);
  SetLastAction(actionStr);

  status->SetStatus(newStatus);
  if (increaseCount) status->IncreaseCount();

  AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);

  SendMLInfo();
}

//______________________________________________________________________________________________
void AliShuttle::SendMLInfo()
{
  //
  // sends ML information about the current status of the current detector being processed
  //

  AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());

  if (!status){
    Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
    return;
  }

  TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
  TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());

  TList mlList;
  mlList.Add(&mlStatus);
  mlList.Add(&mlRetryCount);

  fMonaLisa->SendParameters(&mlList);
}
878
5164a766 879//______________________________________________________________________________________________
880Bool_t AliShuttle::ContinueProcessing()
881{
9827400b 882 // this function reads the AliShuttleStatus information from CDB and
883 // checks if the processing should be continued
884 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 885
57c1a579 886 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
887
888 AliPreprocessor* aPreprocessor =
889 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
890 if (!aPreprocessor)
891 {
892 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
893 return kFALSE;
894 }
895
2bb7b766 896 AliShuttleLogbookEntry::Status entryStatus =
eba76848 897 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 898
899 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 900 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 901 fCurrentDetector.Data(),
902 fLogbookEntry->GetDetectorStatusName(entryStatus)));
903 return kFALSE;
904 }
905
906 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 907
908 // check if current run is first unprocessed run for current detector
909 if (fConfig->StrictRunOrder(fCurrentDetector) &&
910 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
911 {
912 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
913 return kFALSE;
914 }
915
2bb7b766 916 AliShuttleStatus* status = ReadShuttleStatus();
917 if (!status) {
918 // first time
919 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
920 fCurrentDetector.Data()));
921 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
922 return WriteShuttleStatus(status);
923 }
924
925 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
926 // If it happens it may mean Logbook updating failed... let's do it now!
927 if (status->GetStatus() == AliShuttleStatus::kDone ||
928 status->GetStatus() == AliShuttleStatus::kFailed){
929 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
930 fCurrentDetector.Data(),
931 status->GetStatusName(status->GetStatus())));
932 UpdateShuttleLogbook(fCurrentDetector.Data(),
933 status->GetStatusName(status->GetStatus()));
934 return kFALSE;
935 }
936
3301427a 937 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 938 Log("SHUTTLE",
939 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
940 fCurrentDetector.Data()));
9827400b 941 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
942 if (StoreOCDB()){
3301427a 943 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
944 fCurrentDetector.Data()));
2bb7b766 945 UpdateShuttleStatus(AliShuttleStatus::kDone);
946 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
947 } else {
948 Log("SHUTTLE",
949 Form("ContinueProcessing - %s: Grid storage failed again",
950 fCurrentDetector.Data()));
9827400b 951 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 952 }
953 return kFALSE;
954 }
955
956 // if we get here, there is a restart
57c1a579 957 Bool_t cont = kFALSE;
2bb7b766 958
959 // abort conditions
cb343cfd 960 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 961 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
962 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 963 status->GetCount(), status->GetStatusName()));
964 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 965 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 966
967 // there may still be objects in local OCDB and reference storage
968 // and FXS databases may not be updated: do it now!
9827400b 969
970 // TODO Currently disabled, we want to keep files in case of failure!
971 // CleanLocalStorage(fgkLocalCDB);
972 // CleanLocalStorage(fgkLocalRefStorage);
973 // UpdateTableFailCase();
974
975 // Send mail to detector expert!
976 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
977 if (!SendMail())
978 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
979 fCurrentDetector.Data()));
3301427a 980
57c1a579 981 } else {
982 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
983 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
984 status->GetStatusName(), status->GetCount()));
9827400b 985 Bool_t increaseCount = kTRUE;
986 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
987 increaseCount = kFALSE;
988 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 989 cont = kTRUE;
2bb7b766 990 }
991
57c1a579 992 return cont;
5164a766 993}
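// The order of the checks in ContinueProcessing() for a detector that already
// has a status entry can be condensed into a small decision function. This is
// only a sketch: the Status, Decision and RetryState types and the function
// names below are hypothetical stand-ins, not part of AliShuttle.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for AliShuttleStatus::Status values.
enum class Status { kStarted, kDCSStarted, kDCSError, kPPStarted,
                    kPPError, kStoreError, kDone, kFailed };

// What the caller should do next for the current detector.
enum class Decision { kSkip, kRetryStore, kGiveUp, kRestart };

struct RetryState {
  Status status;  // status read back from the local CDB
  int count;      // number of attempts recorded so far
};

// Applies the checks in the same order as ContinueProcessing().
Decision Decide(const RetryState& s, int maxRetries) {
  // Already finished: only the logbook needs to be brought up to date.
  if (s.status == Status::kDone || s.status == Status::kFailed) return Decision::kSkip;
  // Preprocessor succeeded earlier, only the Grid transfer is pending.
  if (s.status == Status::kStoreError) return Decision::kRetryStore;
  // Too many attempts: mark as failed and notify the expert.
  if (s.count >= maxRetries) return Decision::kGiveUp;
  return Decision::kRestart;
}

// DCS-related aborts do not consume a retry (see increaseCount above).
bool CountsAsRetry(Status previous) {
  return previous != Status::kDCSError && previous != Status::kDCSStarted;
}
```

// Note that a pending storage retry takes precedence over the retry limit,
// because the kStoreError check comes before the count check.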
994
995//______________________________________________________________________________________________
2bb7b766 996Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 997{
73abe331 998 //
b948db8d 999 // Makes data retrieval for all detectors in the configuration.
2bb7b766 1000 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1001 // (Unprocessed, Inactive, Failed or Done).
d477ad88 1002 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 1003 //
1004
9827400b 1005 if (!entry) return kFALSE;
2bb7b766 1006
1007 fLogbookEntry = entry;
1008
9827400b 1009 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1010 GetCurrentRun()));
2bb7b766 1011
e7f62f16 1012 // create ML instance that monitors this run
1013 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1014 // disable monitoring of other parameters that come e.g. from TFile
1015 gMonitoringWriter = 0;
2bb7b766 1016
e7f62f16 1017 // Send the information to ML
1018 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1019 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1020
1021 TList mlList;
1022 mlList.Add(&mlStatus);
9827400b 1023 mlList.Add(&mlRunType);
e7f62f16 1024
1025 fMonaLisa->SendParameters(&mlList);
3301427a 1026
9827400b 1027 if (fLogbookEntry->IsDone())
1028 {
1029 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1030 UpdateShuttleLogbook("shuttle_done");
1031 fLogbookEntry = 0;
1032 return kTRUE;
1033 }
1034
1035 // read test mode if flag is set
1036 if (fReadTestMode)
1037 {
3d8bc902 1038 fTestMode = kNone;
9827400b 1039 TString logEntry(entry->GetRunParameter("log"));
1040 //printf("log entry = %s\n", logEntry.Data());
1041 TString searchStr("Testmode: ");
1042 Int_t pos = logEntry.Index(searchStr.Data());
1043 //printf("%d\n", pos);
1044 if (pos >= 0)
1045 {
1046 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1047 //printf("%s\n", subStr.String().Data());
1048 TString newStr(subStr.Data());
1049 TObjArray* token = newStr.Tokenize(' ');
1050 if (token)
1051 {
1052 //token->Print();
1053 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1054 if (tmpStr)
1055 {
1056 Int_t testMode = tmpStr->String().Atoi();
1057 if (testMode > 0)
1058 {
1059 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1060 SetTestMode((TestMode) testMode);
1061 }
1062 }
1063 delete token;
1064 }
1065 }
1066 }
1067
3d8bc902 1068 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1069
eba76848 1070 fLogbookEntry->Print("all");
57f50b3c 1071
1072 // Initialization
d477ad88 1073 Bool_t hasError = kFALSE;
5164a766 1074
2bb7b766 1075 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1076 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1077 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1078 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1079
57f50b3c 1080 // Loop on detectors in the configuration
b948db8d 1081 TIter iter(fConfig->GetDetectors());
2bb7b766 1082 TObjString* aDetector = 0;
b948db8d 1083
be48e3ea 1084 while ((aDetector = (TObjString*) iter.Next()))
1085 {
7bfb2090 1086 fCurrentDetector = aDetector->String();
5164a766 1087
9e080f92 1088 if (ContinueProcessing() == kFALSE) continue;
1089
2bb7b766 1090 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1091 GetCurrentRun(), aDetector->GetName()));
1092
9d733021 1093 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1094
e7f62f16 1095 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1096
be48e3ea 1097 Int_t pid = fork();
1098
1099 if (pid < 0)
1100 {
1101 Log("SHUTTLE", "ERROR: Forking failed");
1102 }
1103 else if (pid > 0)
1104 {
1105 // parent
1106 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1107 GetCurrentRun(), aDetector->GetName()));
1108
1109 Long_t begin = time(0);
1110
1111 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1112 while (waitpid(pid, &status, WNOHANG) == 0)
1113 {
1114 Long_t expiredTime = time(0) - begin;
1115
1116 if (expiredTime > fConfig->GetPPTimeOut())
1117 {
9827400b 1118 TString tmp;
1119 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1120 fCurrentDetector.Data(), expiredTime);
1121 Log("SHUTTLE", tmp);
1122 Log(fCurrentDetector, tmp);
be48e3ea 1123
1124 kill(pid, 9);
1125
3301427a 1126 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1127 hasError = kTRUE;
1128
1129 gSystem->Sleep(1000);
1130 }
1131 else
1132 {
be48e3ea 1133 gSystem->Sleep(1000);
9827400b 1134
1135 TString checkStr;
1136 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1137 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1138 if (!pipe)
1139 {
1140 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1141 continue;
1142 }
1143
1144 char buffer[100];
1145 if (!fgets(buffer, 100, pipe))
1146 {
1147 Log("SHUTTLE", "Error: ps did not return anything");
1148 gSystem->ClosePipe(pipe);
1149 continue;
1150 }
1151 gSystem->ClosePipe(pipe);
1152
1153 //Log("SHUTTLE", Form("ps returned %s", buffer));
1154
1155 Int_t mem = 0;
1156 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1157 {
1158 Log("SHUTTLE", "Error: Could not parse output of ps");
1159 continue;
1160 }
1161
1162 if (expiredTime % 60 == 0)
886d60e6 1163 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1164 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1165
1166 if (mem > fConfig->GetPPMaxMem())
1167 {
1168 TString tmp;
1169 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1170 mem, fConfig->GetPPMaxMem());
1171 Log("SHUTTLE", tmp);
1172 Log(fCurrentDetector, tmp);
1173
1174 kill(pid, 9);
1175
1176 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1177 hasError = kTRUE;
1178
1179 gSystem->Sleep(1000);
1180 }
be48e3ea 1181 }
1182 }
1183
1184 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1185 GetCurrentRun(), aDetector->GetName()));
1186
1187 if (WIFEXITED(status))
1188 {
1189 Int_t returnCode = WEXITSTATUS(status);
1190
3301427a 1191 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1192 returnCode));
be48e3ea 1193
9827400b 1194 if (returnCode == 0) hasError = kTRUE; // the client exits with its Bool_t success flag, so 0 means failure
be48e3ea 1195 }
1196 }
1197 else if (pid == 0)
1198 {
1199 // client
1200 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1201
ffa29e93 1202 AliInfo("Redirecting output...");
1203
1204 if ((freopen(GetLogFileName(fCurrentDetector), "w", stdout)) == 0)
1205 {
1206 Log("SHUTTLE", "Could not freopen stdout");
1207 }
1208 else
1209 {
1210 fOutputRedirected = kTRUE;
1211 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1212 Log("SHUTTLE", "Could not redirect stderr");
1213
1214 }
1215
9827400b 1216 Bool_t success = ProcessCurrentDetector();
1217 if (success) // Preprocessor finished successfully!
1218 {
3301427a 1219 // Update time_processed field in FXS DB
1220 if (UpdateTable() == kFALSE)
1221 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1222
1223 // Transfer the data from local storage to main storage (Grid)
1224 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1225 if (StoreOCDB() == kFALSE)
1226 {
1227 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1228 GetCurrentRun(), aDetector->GetName()));
1229 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1230 success = kFALSE;
3301427a 1231 } else {
1232 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1233 GetCurrentRun(), aDetector->GetName()));
1234 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1235 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1236 }
be48e3ea 1237 }
1238
4b95672b 1239 for (UInt_t iSys=0; iSys<3; iSys++)
1240 {
1241 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1242 }
1243
be48e3ea 1244 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1245 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1246
1247 // the client exits here
9827400b 1248 gSystem->Exit(success);
be48e3ea 1249
1250 AliError("We should never get here!!!");
1251 }
7bfb2090 1252 }
5164a766 1253
2bb7b766 1254 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1255 GetCurrentRun()));
1256
1257 //check if shuttle is done for this run, if so update logbook
1258 TObjArray checkEntryArray;
1259 checkEntryArray.SetOwner(1);
9e080f92 1260 TString whereClause = Form("where run=%d", GetCurrentRun());
1261 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1262 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1263 GetCurrentRun()));
1264 return hasError == kFALSE;
1265 }
b948db8d 1266
9e080f92 1267 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1268 (checkEntryArray.At(0));
2bb7b766 1269
9e080f92 1270 if (checkEntry)
1271 {
1272 if (checkEntry->IsDone())
be48e3ea 1273 {
9e080f92 1274 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1275 UpdateShuttleLogbook("shuttle_done");
1276 }
1277 else
1278 {
1279 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1280 {
9e080f92 1281 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1282 {
9e080f92 1283 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1284 checkEntry->GetRun(), GetDetName(iDet)));
1285 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1286 }
1287 }
2bb7b766 1288 }
1289 }
1290
e7f62f16 1291 // remove ML instance
1292 delete fMonaLisa;
1293 fMonaLisa = 0;
1294
2bb7b766 1295 fLogbookEntry = 0;
85a80aa9 1296
a7160fe9 1297 return hasError == kFALSE;
73abe331 1298}
1299
b948db8d 1300//______________________________________________________________________________________________
9827400b 1301Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1302{
1303 //
2bb7b766 1304 // Makes data retrieval just for a specific detector (fCurrentDetector).
73abe331 1305 // There should be a configuration for this detector.
73abe331 1306
2bb7b766 1307 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1308
2c15234c 1309 TMap dcsMap;
1310 dcsMap.SetOwner(1);
73abe331 1311
85a80aa9 1312 Bool_t aDCSError = kFALSE;
3301427a 1313
1314 // call preprocessor
1315 AliPreprocessor* aPreprocessor =
1316 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1317
1318 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1319
1320 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1321
3d8bc902 1322 if (!processDCS || (fTestMode & kSkipDCS))
2c15234c 1323 {
3d8bc902 1324 Log(fCurrentDetector, "Skipping DCS processing (not requested by the preprocessor, or disabled by test mode)");
9827400b 1325 }
1326 else if (fTestMode & kErrorDCS)
1327 {
3d8bc902 1328 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1329 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1330 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1331 return kFALSE;
2c15234c 1332 } else {
3301427a 1333
1334 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1335
2c15234c 1336 TString host(fConfig->GetDCSHost(fCurrentDetector));
1337 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1338
1339 // Retrieval of Aliases
1340 TObjString* anAlias = 0;
36c99a6a 1341 Int_t iAlias = 1;
1342 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
2c15234c 1343 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1344 while ((anAlias = (TObjString*) iterAliases.Next()))
1345 {
1346 TObjArray *valueSet = new TObjArray();
1347 valueSet->SetOwner(1);
1348
36c99a6a 1349 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1350 AliInfo(Form("Querying DCS archive: alias %s (%d of %d)",
1351 anAlias->GetName(), iAlias, nTotAliases));
2c15234c 1352 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1353 iAlias++; // advance the counter on every iteration, not only when a message is logged
1354 if(!aDCSError)
1355 {
1356 dcsMap.Add(anAlias->Clone(), valueSet);
1357 } else {
1358 Log(fCurrentDetector,
1359 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1360 anAlias->GetName()));
1361 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1362 dcsMap.DeleteAll();
9827400b 1363 return kFALSE;
2c15234c 1364 }
4f0ab988 1365 }
2c15234c 1366
1367 // Retrieval of Data Points
1368 TObjString* aDP = 0;
36c99a6a 1369 Int_t iDP = 1;
1370 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
2c15234c 1371 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1372 while ((aDP = (TObjString*) iterDP.Next()))
1373 {
1374 TObjArray *valueSet = new TObjArray();
1375 valueSet->SetOwner(1);
36c99a6a 1376 if (((iDP-1) % 500) == 0 || iDP == nTotDPs)
1377 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1378 aDP->GetName(), iDP, nTotDPs));
2c15234c 1379 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1380 iDP++; // advance the counter on every iteration, not only when a message is logged
1381 if(!aDCSError)
1382 {
1383 dcsMap.Add(aDP->Clone(), valueSet);
1384 } else {
1385 Log(fCurrentDetector,
1386 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1387 aDP->GetName()));
1388 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1389 dcsMap.DeleteAll();
9827400b 1390 return kFALSE;
2c15234c 1391 }
73abe331 1392 }
1393 }
b948db8d 1394
2bb7b766 1395 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1396 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1397
3301427a 1398 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
b948db8d 1399
3301427a 1400 if (returnValue > 0) // Preprocessor error!
1401 {
9827400b 1402 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
cb343cfd 1403 UpdateShuttleStatus(AliShuttleStatus::kPPError);
9827400b 1404 dcsMap.DeleteAll();
1405 return kFALSE;
1406 }
1407
1408 // preprocessor ok!
1409 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1410 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1411 fCurrentDetector.Data()));
b948db8d 1412
2c15234c 1413 dcsMap.DeleteAll();
b948db8d 1414
9827400b 1415 return kTRUE;
2bb7b766 1416}
1417
1418//______________________________________________________________________________________________
1419Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1420 TObjArray& entries)
1421{
9827400b 1422 // Queries DAQ's Shuttle logbook and fills the detector status object.
1423 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1424 //
2bb7b766 1425
fc5a4708 1426 entries.SetOwner(1);
1427
2bb7b766 1428 // check connection, in case connect
be48e3ea 1429 if(!Connect(3)) return kFALSE;
2bb7b766 1430
1431 TString sqlQuery;
441b0e9c 1432 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1433
be48e3ea 1434 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1435 if (!aResult) {
1436 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1437 return kFALSE;
1438 }
1439
fc5a4708 1440 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1441
2bb7b766 1442 if(aResult->GetRowCount() == 0) {
9827400b 1443 AliInfo("No entries in Shuttle Logbook match request");
1444 delete aResult;
1445 return kTRUE;
2bb7b766 1446 }
1447
1448 // TODO Check field count!
fc5a4708 1449 const UInt_t nCols = 22;
2bb7b766 1450 if (aResult->GetFieldCount() != (Int_t) nCols) {
1451 AliError("Invalid SQL result field number!");
1452 delete aResult;
1453 return kFALSE;
1454 }
1455
2bb7b766 1456 TSQLRow* aRow;
1457 while ((aRow = aResult->Next())) {
1458 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1459 Int_t run = runString.Atoi();
1460
eba76848 1461 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1462 if (!entry)
1463 continue;
2bb7b766 1464
1465 // loop on detectors
eba76848 1466 for(UInt_t ii = 0; ii < nCols; ii++)
1467 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1468
eba76848 1469 entries.AddLast(entry);
2bb7b766 1470 delete aRow;
1471 }
1472
2bb7b766 1473 delete aResult;
1474 return kTRUE;
1475}
1476
1477//______________________________________________________________________________________________
eba76848 1478AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1479{
eba76848 1480 //
1481 // Retrieves the run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
1482 //
2bb7b766 1483
1484 // check connection, in case connect
be48e3ea 1485 if (!Connect(3))
eba76848 1486 return 0;
2bb7b766 1487
1488 TString sqlQuery;
2c15234c 1489 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1490
be48e3ea 1491 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1492 if (!aResult) {
1493 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1494 return 0;
2bb7b766 1495 }
1496
eba76848 1497 if (aResult->GetRowCount() == 0) {
2bb7b766 1498 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1499 delete aResult;
eba76848 1500 return 0;
2bb7b766 1501 }
1502
eba76848 1503 if (aResult->GetRowCount() > 1) {
2bb7b766 1504 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1505 delete aResult;
eba76848 1506 return 0;
2bb7b766 1507 }
1508
eba76848 1509 TSQLRow* aRow = aResult->Next();
1510 if (!aRow)
1511 {
1512 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1513 delete aResult;
1514 return 0;
1515 }
2bb7b766 1516
eba76848 1517 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1518
eba76848 1519 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1520 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1521
eba76848 1522 UInt_t startTime = entry->GetStartTime();
1523 UInt_t endTime = entry->GetEndTime();
1524
1525 if (!startTime || !endTime || startTime > endTime) {
1526 Log("SHUTTLE",
1527 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d",
1528 run, startTime, endTime));
1529 delete entry;
2bb7b766 1530 delete aRow;
eba76848 1531 delete aResult;
1532 return 0;
2bb7b766 1533 }
1534
eba76848 1535 delete aRow;
2bb7b766 1536 delete aResult;
eba76848 1537
1538 return entry;
2bb7b766 1539}
1540
1541//______________________________________________________________________________________________
2c15234c 1542Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1543 TObjArray* valueSet, DCSType type)
73abe331 1544{
9827400b 1545 // Retrieve all "entry" data points from the DCS server
1546 // host, port: TSocket connection parameters
1547 // entry: name of the alias or data point
1548 // valueSet: array of retrieved AliDCSValue's
1549 // type: kAlias or kDP
58bc3020 1550
73abe331 1551 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1552 if (!client.IsConnected())
1553 {
b948db8d 1554 return kFALSE;
73abe331 1555 }
1556
2c15234c 1557 Int_t result=0;
73abe331 1558
2c15234c 1559 if (type == kAlias)
1560 {
1561 result = client.GetAliasValues(entry,
1562 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1563 } else
1564 if (type == kDP)
1565 {
1566 result = client.GetDPValues(entry,
1567 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1568 }
1569
1570 if (result < 0)
1571 {
2bb7b766 1572 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1573 entry, AliDCSClient::GetErrorString(result)));
73abe331 1574
2c15234c 1575 if (result == AliDCSClient::fgkServerError)
1576 {
2bb7b766 1577 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1578 client.GetServerError().Data()));
1579 }
1580
1581 return kFALSE;
1582 }
1583
1584 return kTRUE;
1585}
b948db8d 1586
1587//______________________________________________________________________________________________
57f50b3c 1588const char* AliShuttle::GetFile(Int_t system, const char* detector,
1589 const char* id, const char* source)
b948db8d 1590{
9827400b 1591 // Get calibration file from file exchange servers
1592 // First queries the FXS database for the file name, using the run, detector, id and source info
1593 // then calls RetrieveFile(filename) for actual copy to local disk
1594 // run: current run being processed (given by Logbook entry fLogbookEntry)
1595 // detector: the Preprocessor name
1596 // id: provided as a parameter by the Preprocessor
1597 // source: provided by the Preprocessor through GetFileSources function
1598
1599 // check if test mode should simulate a FXS error
1600 if (fTestMode & kErrorFXSFiles)
1601 {
1602 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1603 return 0;
1604 }
1605
57f50b3c 1606 // check connection, in case connect
9d733021 1607 if (!Connect(system))
eba76848 1608 {
9d733021 1609 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1610 return 0;
1611 }
1612
1613 // Query preparation
9d733021 1614 TString sourceName(source);
d386d623 1615 Int_t nFields = 3;
1616 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1617 fConfig->GetFXSdbTable(system));
1618 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1619 GetCurrentRun(), detector, id);
1620
9d733021 1621 if (system == kDAQ)
1622 {
d386d623 1623 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1624 }
9d733021 1625 else if (system == kDCS)
eba76848 1626 {
9d733021 1627 sourceName="none";
57f50b3c 1628 }
9d733021 1629 else if (system == kHLT)
9e080f92 1630 {
d386d623 1631 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1632 nFields = 3;
9e080f92 1633 }
1634
9e080f92 1635 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1636
1637 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1638
1639 // Query execution
1640 TSQLResult* aResult = 0;
9d733021 1641 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1642 if (!aResult) {
9d733021 1643 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1644 GetSystemName(system), id, sourceName.Data()));
9e080f92 1645 return 0;
1646 }
1647
1648 if(aResult->GetRowCount() == 0)
1649 {
1650 Log(detector,
9d733021 1651 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1652 GetSystemName(system), id, sourceName.Data()));
9e080f92 1653 delete aResult;
1654 return 0;
1655 }
2bb7b766 1656
9e080f92 1657 if (aResult->GetRowCount() > 1) {
1658 Log(detector,
9d733021 1659 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1660 GetSystemName(system), id, sourceName.Data()));
9e080f92 1661 delete aResult;
1662 return 0;
1663 }
1664
9d733021 1665 if (aResult->GetFieldCount() != nFields) {
9e080f92 1666 Log(detector,
9d733021 1667 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1668 GetSystemName(system), id, sourceName.Data()));
9e080f92 1669 delete aResult;
1670 return 0;
1671 }
1672
1673 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1674
1675 if (!aRow){
9d733021 1676 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1677 GetSystemName(system), id, sourceName.Data()));
9e080f92 1678 delete aResult;
1679 return 0;
1680 }
1681
1682 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1683 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1684 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1685
1686 delete aResult;
1687 delete aRow;
1688
d386d623 1689 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1690 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1691
9e080f92 1692 // retrieved file is renamed to make it unique
9d733021 1693 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1694 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1695
9e080f92 1696
9d733021 1697 // file retrieval from FXS
4b95672b 1698 UInt_t nRetries = 0;
1699 UInt_t maxRetries = 3;
1700 Bool_t result = kFALSE;
1701
1702 // copy!! if successful TSystem::Exec returns 0
1703 while(nRetries++ < maxRetries) {
1704 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1705 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1706 if(!result)
1707 {
1708 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
9d733021 1709 filePath.Data(), GetSystemName(system)));
4b95672b 1710 continue;
1711 } else {
1712 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1713 filePath.Data(), GetSystemName(system),
1714 GetShuttleTempDir(), localFileName.Data()));
1715 }
9e080f92 1716
d386d623 1717 if (fileChecksum.Length()>0)
4b95672b 1718 {
1719 // compare md5sum of local file with the one stored in the FXS DB
1720 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s |grep %s 2>&1 > /dev/null",
d386d623 1721 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 1722
4b95672b 1723 if (md5Comp != 0)
1724 {
1725 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1726 filePath.Data()));
1727 result = kFALSE;
1728 continue;
1729 }
d386d623 1730 } else {
1731 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1732 filePath.Data(), GetSystemName(system)));
9d733021 1733 }
4b95672b 1734 if (result) break;
9e080f92 1735 }
1736
4b95672b 1737 if(!result) return 0;
1738
9d733021 1739 fFXSCalled[system]=kTRUE;
1740 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1741 fFXSlist[system].Add(fileParams);
9e080f92 1742
1743 static TString fullLocalFileName;
36c99a6a 1744 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1745
9e080f92 1746 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1747
1748 return fullLocalFileName.Data();
2bb7b766 1749
1750}
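// The copy-and-verify loop in GetFile() follows a common pattern: retry the
// transfer a bounded number of times and accept a copy only if its checksum
// matches, or if no checksum is available to compare against. A minimal
// sketch, assuming hypothetical fetch/verify callbacks that stand in for
// RetrieveFile() and the md5sum comparison:

```cpp
#include <cassert>
#include <functional>
#include <string>

// fetch:  copies the file, returns true on success (hypothetical callback).
// verify: returns true if the local copy matches expectedChecksum
//         (hypothetical callback).
// Returns true as soon as one attempt yields an accepted copy.
bool FetchWithRetries(const std::function<bool()>& fetch,
                      const std::function<bool()>& verify,
                      const std::string& expectedChecksum,
                      unsigned maxRetries) {
  for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
    if (!fetch()) continue;                     // copy failed, try again
    if (expectedChecksum.empty()) return true;  // nothing to compare against
    if (verify()) return true;                  // checksum matches
    // checksum mismatch: fall through and retry the copy
  }
  return false;
}
```

// As in GetFile(), a missing checksum skips the comparison instead of
// failing, so files registered without one are still accepted.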
1751
1752//______________________________________________________________________________________________
9d733021 1753Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 1754{
9827400b 1755 //
1756 // Copies file from FXS to local Shuttle machine
1757 //
2bb7b766 1758
9e080f92 1759 // check temp directory: trying to cd to temp; if it does not exist, create it
9d733021 1760 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1761 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 1762
36c99a6a 1763 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 1764 if (dir == NULL) {
36c99a6a 1765 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1766 AliError(Form("Can't create directory <%s>", GetShuttleTempDir()));
9e080f92 1767 return kFALSE;
1768 }
1769
1770 } else {
1771 gSystem->FreeDirectory(dir);
1772 }
1773
9d733021 1774 TString baseFXSFolder;
1775 if (system == kDAQ)
1776 {
1777 baseFXSFolder = "FES/";
1778 }
1779 else if (system == kDCS)
1780 {
1781 baseFXSFolder = "";
1782 }
1783 else if (system == kHLT)
1784 {
1785 baseFXSFolder = "~/";
1786 }
1787
1788
1789 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1790 fConfig->GetFXSPort(system),
1791 fConfig->GetFXSUser(system),
1792 fConfig->GetFXSHost(system),
1793 baseFXSFolder.Data(),
1794 fxsFileName,
36c99a6a 1795 GetShuttleTempDir(),
9e080f92 1796 localFileName);
1797
1798 AliDebug(2, Form("%s",command.Data()));
1799
4b95672b 1800 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 1801
4b95672b 1802 return result;
9e080f92 1803}
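RetrieveFile builds the `scp` command by pasting file names directly into a shell command line, so a name containing spaces or quotes would be split or misread by the local shell. A minimal sketch of POSIX single-quote escaping, assuming a `/bin/sh`-style shell; `ShellQuote` is a hypothetical helper and not part of AliShuttle or ROOT:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not in AliShuttle): wraps a value in single quotes for
// a POSIX shell, rewriting each embedded single quote as '\'' so the value is
// passed to the command as one literal argument.
std::string ShellQuote(const std::string& value)
{
    std::string quoted = "'";
    for (std::string::size_type i = 0; i < value.size(); ++i) {
        if (value[i] == '\'') quoted += "'\\''";
        else quoted += value[i];
    }
    quoted += "'";
    return quoted;
}
```

This only covers the local shell. The remote part of an `scp` source path may be interpreted again on the server side, which this sketch does not address.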
1804
1805//______________________________________________________________________________________________
9d733021 1806TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1807{
9827400b 1808 //
1809 // Get sources producing the condition file Id from file exchange servers
1810 //
1811
1812 // check if test mode should simulate a FXS error
1813 if (fTestMode & kErrorFXSSources)
1814 {
1815 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1816 return 0;
1817 }
1818
9d733021 1819
1820 if (system == kDCS)
1821 {
1822 AliError("DCS system has only one source of data!");
1823 return NULL;
9d733021 1824 }
9e080f92 1825
1826 // check connection; connect if needed
9d733021 1827 if (!Connect(system))
1828 {
1829 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
1830 return NULL;
9e080f92 1831 }
1832
9d733021 1833 TString sourceName;
1834 if (system == kDAQ)
1835 {
1836 sourceName = "DAQsource";
1837 } else if (system == kHLT)
1838 {
1839 sourceName = "DDLnumbers";
1840 }
1841
d386d623 1842 TString sqlQueryStart = Form("select %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
9e080f92 1843 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1844 GetCurrentRun(), detector, id);
1845 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1846
1847 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1848
1849 // Query execution
1850 TSQLResult* aResult;
9d733021 1851 aResult = fServer[system]->Query(sqlQuery);
9e080f92 1852 if (!aResult) {
9d733021 1853 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
1854 GetSystemName(system), id));
9e080f92 1855 return 0;
1856 }
1857
9d733021 1858 if (aResult->GetRowCount() == 0)
1859 {
9e080f92 1860 Log(detector,
9d733021 1861 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
9e080f92 1862 delete aResult;
1863 return 0;
1864 }
1865
1866 TSQLRow* aRow;
1867 TList *list = new TList();
1868 list->SetOwner(1);
1869
9d733021 1870 while ((aRow = aResult->Next()))
1871 {
9e080f92 1872
9d733021 1873 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
1874 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
1875 list->Add(new TObjString(source));
9e080f92 1876 delete aRow;
1877 }
9d733021 1878
9e080f92 1879 delete aResult;
1880
1881 return list;
2bb7b766 1882}
1883
1884//______________________________________________________________________________________________
9d733021 1885Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 1886{
9827400b 1887 // Connect to MySQL Server of the system's FXS MySQL databases
1888 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
1889 //
57f50b3c 1890
9d733021 1891 // check connection: if already connected return
1892 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 1893
9d733021 1894 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 1895
9d733021 1896 if (system < 3) // FXS db servers
1897 {
1898 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
1899 dbUser = fConfig->GetFXSdbUser(system);
1900 dbPass = fConfig->GetFXSdbPass(system);
1901 dbName = fConfig->GetFXSdbName(system);
1902 } else { // Run & Shuttle logbook servers
1903 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
1904 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
1905 dbUser = fConfig->GetDAQlbUser();
1906 dbPass = fConfig->GetDAQlbPass();
1907 dbName = fConfig->GetDAQlbDB();
1908 }
57f50b3c 1909
9d733021 1910 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
1911 if (!fServer[system] || !fServer[system]->IsConnected()) {
1912 if(system < 3)
1913 {
1914 AliError(Form("Can't establish connection to FXS database for %s",
1915 AliShuttleInterface::GetSystemName(system)));
1916 } else {
1917 AliError("Can't establish connection to Run logbook.");
57f50b3c 1918 }
9d733021 1919 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; } // avoid dangling pointer on next Connect
1920 return kFALSE;
2bb7b766 1921 }
57f50b3c 1922
9d733021 1923 // Get tables
1924 TSQLResult* aResult=0;
1925 switch(system){
1926 case kDAQ:
1927 aResult = fServer[kDAQ]->GetTables(dbName.Data());
1928 break;
1929 case kDCS:
1930 aResult = fServer[kDCS]->GetTables(dbName.Data());
1931 break;
1932 case kHLT:
1933 aResult = fServer[kHLT]->GetTables(dbName.Data());
1934 break;
1935 default:
1936 aResult = fServer[3]->GetTables(dbName.Data());
1937 break;
1938 }
1939
1940 delete aResult;
2bb7b766 1941 return kTRUE;
1942}
57f50b3c 1943
9e080f92 1944//______________________________________________________________________________________________
9d733021 1945Bool_t AliShuttle::UpdateTable()
9e080f92 1946{
9827400b 1947 //
1948 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
1949 //
9e080f92 1950
9d733021 1951 Bool_t result = kTRUE;
9e080f92 1952
9d733021 1953 for (UInt_t system=0; system<3; system++)
1954 {
1955 if(!fFXSCalled[system]) continue;
9e080f92 1956
9d733021 1957 // check connection; connect if needed
1958 if (!Connect(system))
1959 {
1960 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
1961 result = kFALSE;
1962 continue;
9e080f92 1963 }
9e080f92 1964
9d733021 1965 TTimeStamp now; // now
1966
1967 // Loop on FXS list entries
1968 TIter iter(&fFXSlist[system]);
1969 TObjString *aFXSentry=0;
1970 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
1971 {
1972 TString aFXSentrystr = aFXSentry->String();
1973 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
1974 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
1975 {
1976 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
1977 GetSystemName(system), aFXSentrystr.Data()));
1978 if(aFXSarray) delete aFXSarray;
1979 result = kFALSE;
1980 continue;
1981 }
1982 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
1983 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
1984
1985 TString whereClause;
1986 if (system == kDAQ)
1987 {
1988 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
1989 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
1990 }
1991 else if (system == kDCS)
1992 {
1993 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
1994 GetCurrentRun(), fCurrentDetector.Data(), fileId);
1995 }
1996 else if (system == kHLT)
1997 {
1998 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
1999 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2000 }
2001
2002 delete aFXSarray;
9e080f92 2003
9d733021 2004 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2005 (Int_t) now.GetSec(), whereClause.Data());
9e080f92 2006
9d733021 2007 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2008
9d733021 2009 // Query execution
2010 TSQLResult* aResult;
2011 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2012 if (!aResult)
2013 {
2014 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2015 GetSystemName(system), sqlQuery.Data()));
2016 result = kFALSE;
2017 continue;
2018 }
2019 delete aResult;
9e080f92 2020 }
9e080f92 2021 }
2022
9d733021 2023 return result;
9e080f92 2024}
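UpdateTable splits each stored entry with `Tokenize("#!?!#")`. ROOT's `TString::Tokenize` treats its argument as a set of delimiter characters, so a file id or source containing `#`, `!` or `?` would yield more than two tokens and be rejected. A minimal sketch of splitting on the complete separator instead, using only the standard library; `SplitFXSEntry` is a hypothetical helper and not part of AliShuttle:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not in AliShuttle): splits "fileId#!?!#source" at the
// first occurrence of the complete separator string.
// Returns false if the separator does not occur in the entry.
bool SplitFXSEntry(const std::string& entry, std::string& fileId, std::string& source)
{
    static const std::string kSeparator = "#!?!#";
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;
    fileId = entry.substr(0, pos);
    source = entry.substr(pos + kSeparator.size());
    return true;
}
```

With this approach an id such as `PEDESTALS#1` survives the round trip, at the cost of no longer detecting entries that contain the separator twice.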
57f50b3c 2025
2bb7b766 2026//______________________________________________________________________________________________
3301427a 2027Bool_t AliShuttle::UpdateTableFailCase()
2028{
9827400b 2029 // Update FXS table filling time_processed field in all rows corresponding to current run and detector.
2030 // This is called when the preprocessor is declared failed for the current run, because
2031 // the fields are otherwise updated only in case of success.
3301427a 2032
2033 Bool_t result = kTRUE;
2034
2035 for (UInt_t system=0; system<3; system++)
2036 {
2037 // check connection; connect if needed
2038 if (!Connect(system))
2039 {
2040 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2041 GetSystemName(system)));
2042 result = kFALSE;
2043 continue;
2044 }
2045
2046 TTimeStamp now; // now
2047
2048 // Update all rows of the current run and detector
2049
2050 TString whereClause = Form("where run=%d and detector=\"%s\";",
2051 GetCurrentRun(), fCurrentDetector.Data());
2052
2053
2054 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2055 (Int_t) now.GetSec(), whereClause.Data());
2056
2057 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2058
2059 // Query execution
2060 TSQLResult* aResult;
2061 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2062 if (!aResult)
2063 {
2064 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2065 GetSystemName(system), sqlQuery.Data()));
2066 result = kFALSE;
2067 continue;
2068 }
2069 delete aResult;
2070 }
2071
2072 return result;
2073}
2074
2075//______________________________________________________________________________________________
2bb7b766 2076Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2077{
e7f62f16 2078 //
2079 // Update Shuttle logbook filling detector or shuttle_done column
2080 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2081 //
57f50b3c 2082
2bb7b766 2083 // check connection; connect if needed
be48e3ea 2084 if(!Connect(3)){
2bb7b766 2085 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2086 return kFALSE;
57f50b3c 2087 }
2088
2bb7b766 2089 TString detName(detector);
2090 TString setClause;
e7f62f16 2091 if(detName == "shuttle_done")
2092 {
2bb7b766 2093 setClause = "set shuttle_done=1";
e7f62f16 2094
2095 // Send the information to ML
2096 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2097
2098 TList mlList;
2099 mlList.Add(&mlStatus);
2100
2101 fMonaLisa->SendParameters(&mlList);
2bb7b766 2102 } else {
2bb7b766 2103 TString statusStr(status);
2104 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2105 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2106 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2107 } else {
2108 Log("SHUTTLE",
2109 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2110 status, detector));
2111 return kFALSE;
2112 }
2113 }
57f50b3c 2114
2bb7b766 2115 TString whereClause = Form("where run=%d", GetCurrentRun());
2116
441b0e9c 2117 TString sqlQuery = Form("update %s %s %s",
2118 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2119
2bb7b766 2120 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2121
2122 // Query execution
2123 TSQLResult* aResult;
be48e3ea 2124 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2125 if (!aResult) {
2126 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2127 return kFALSE;
57f50b3c 2128 }
2bb7b766 2129 delete aResult;
57f50b3c 2130
2131 return kTRUE;
2132}
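UpdateShuttleLogbook accepts any status that contains "done" or "failed", so a string such as "undone" would also pass the check. A minimal sketch of an exact, case-insensitive check, assuming that "done" and "failed" are the only intended values (the source does not confirm this); `IsValidStatus` is a hypothetical helper and not part of AliShuttle:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not in AliShuttle): lower-cases a copy of the status
// and accepts it only if it is exactly "done" or "failed".
bool IsValidStatus(std::string status)
{
    std::transform(status.begin(), status.end(), status.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return status == "done" || status == "failed";
}
```

If the logbook is meant to hold longer status strings, the substring check in the original is the right choice and this sketch does not apply.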
2133
2134//______________________________________________________________________________________________
2bb7b766 2135Int_t AliShuttle::GetCurrentRun() const
2136{
9827400b 2137 //
2138 // Get current run from logbook entry
2139 //
57f50b3c 2140
2bb7b766 2141 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2142}
2143
2144//______________________________________________________________________________________________
2bb7b766 2145UInt_t AliShuttle::GetCurrentStartTime() const
2146{
9827400b 2147 //
2148 // get current start time
2149 //
57f50b3c 2150
2bb7b766 2151 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2152}
2153
2154//______________________________________________________________________________________________
2bb7b766 2155UInt_t AliShuttle::GetCurrentEndTime() const
2156{
9827400b 2157 //
2158 // get current end time from logbook entry
2159 //
57f50b3c 2160
2bb7b766 2161 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2162}
2163
2164//______________________________________________________________________________________________
b948db8d 2165void AliShuttle::Log(const char* detector, const char* message)
2166{
9827400b 2167 //
2168 // Fill log string with a message
2169 //
b948db8d 2170
36c99a6a 2171 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2172 if (dir == NULL) {
36c99a6a 2173 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2174 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
84090f85 2175 return;
2176 }
b948db8d 2177
84090f85 2178 } else {
2179 gSystem->FreeDirectory(dir);
2180 }
b948db8d 2181
cb343cfd 2182 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2183 if (GetCurrentRun() >= 0)
2184 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2185 toLog += Form("%s", message);
2186
84090f85 2187 AliInfo(toLog.Data());
ffa29e93 2188
2189 // if we redirect the log output already to the file, leave here
2190 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2191 return;
b948db8d 2192
ffa29e93 2193 TString fileName = GetLogFileName(detector);
e7f62f16 2194
84090f85 2195 gSystem->ExpandPathName(fileName);
2196
2197 ofstream logFile;
2198 logFile.open(fileName, ofstream::out | ofstream::app);
2199
2200 if (!logFile.is_open()) {
2201 AliError(Form("Could not open file %s", fileName.Data()));
2202 return;
2203 }
7bfb2090 2204
84090f85 2205 logFile << toLog.Data() << "\n";
b948db8d 2206
84090f85 2207 logFile.close();
b948db8d 2208}
2bb7b766 2209
2bb7b766 2210//______________________________________________________________________________________________
ffa29e93 2211TString AliShuttle::GetLogFileName(const char* detector) const
2212{
2213 //
2214 // returns the name of the log file for a given sub detector
2215 //
2216
2217 TString fileName;
2218
2219 if (GetCurrentRun() >= 0)
2220 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2221 else
2222 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2223
2224 return fileName;
2225}
2226
2227//______________________________________________________________________________________________
2bb7b766 2228Bool_t AliShuttle::Collect(Int_t run)
2229{
9827400b 2230 //
2231 // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
2232 // If a dedicated run is given this run is processed
2233 //
2234 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2235 //
2bb7b766 2236
eba76848 2237 if (run == -1)
2238 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2239 else
2240 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2241
2242 SetLastAction("Starting");
2bb7b766 2243
2244 TString whereClause("where shuttle_done=0");
eba76848 2245 if (run != -1)
2246 whereClause += Form(" and run=%d", run);
2bb7b766 2247
2248 TObjArray shuttleLogbookEntries;
be48e3ea 2249 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2250 {
cb343cfd 2251 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2252 return kFALSE;
2253 }
2254
9e080f92 2255 if (shuttleLogbookEntries.GetEntries() == 0)
2256 {
2257 if (run == -1)
2258 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2259 else
2260 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2261 "or it does not exist in Shuttle logbook", run));
2262 return kTRUE;
2263 }
2264
be48e3ea 2265 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2266 fFirstUnprocessed[iDet] = kTRUE;
2267
fc5a4708 2268 if (run != -1)
be48e3ea 2269 {
2270 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2271 // flag them into fFirstUnprocessed array
2272 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2273 TObjArray tmpLogbookEntries;
2274 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2275 {
2276 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2277 return kFALSE;
2278 }
2279
2280 TIter iter(&tmpLogbookEntries);
2281 AliShuttleLogbookEntry* anEntry = 0;
2282 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2283 {
2284 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2285 {
2286 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2287 {
2288 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2289 anEntry->GetRun(), GetDetName(iDet)));
2290 fFirstUnprocessed[iDet] = kFALSE;
2291 }
2292 }
2293
2294 }
2295
2296 }
2297
2298 if (!RetrieveConditionsData(shuttleLogbookEntries))
2299 {
cb343cfd 2300 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2301 return kFALSE;
2302 }
2303
36c99a6a 2304 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2305 return kTRUE;
2bb7b766 2306}
2307
2bb7b766 2308//______________________________________________________________________________________________
2309Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2310{
9827400b 2311 //
2312 // Retrieve conditions data for all runs that aren't processed yet
2313 //
2bb7b766 2314
2315 Bool_t hasError = kFALSE;
2316
2317 TIter iter(&dateEntries);
2318 AliShuttleLogbookEntry* anEntry;
2319
2320 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2321 if (!Process(anEntry)){
2322 hasError = kTRUE;
2323 }
4b95672b 2324
2325 // clean SHUTTLE temp directory
3301427a 2326 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2327 RemoveFile(filename.Data());
2bb7b766 2328 }
2329
2330 return hasError == kFALSE;
2331}
cb343cfd 2332
2333//______________________________________________________________________________________________
2334ULong_t AliShuttle::GetTimeOfLastAction() const
2335{
9827400b 2336 //
2337 // Gets time of last action
2338 //
2339
cb343cfd 2340 ULong_t tmp;
36c99a6a 2341
cb343cfd 2342 fMonitoringMutex->Lock();
be48e3ea 2343
cb343cfd 2344 tmp = fLastActionTime;
36c99a6a 2345
cb343cfd 2346 fMonitoringMutex->UnLock();
36c99a6a 2347
cb343cfd 2348 return tmp;
2349}
2350
2351//______________________________________________________________________________________________
2352const TString AliShuttle::GetLastAction() const
2353{
9827400b 2354 //
cb343cfd 2355 // returns a string description of the last action
9827400b 2356 //
cb343cfd 2357
2358 TString tmp;
36c99a6a 2359
cb343cfd 2360 fMonitoringMutex->Lock();
2361
2362 tmp = fLastAction;
2363
2364 fMonitoringMutex->UnLock();
2365
36c99a6a 2366 return tmp;
cb343cfd 2367}
2368
2369//______________________________________________________________________________________________
2370void AliShuttle::SetLastAction(const char* action)
2371{
9827400b 2372 //
cb343cfd 2373 // updates the monitoring variables
9827400b 2374 //
36c99a6a 2375
cb343cfd 2376 fMonitoringMutex->Lock();
36c99a6a 2377
cb343cfd 2378 fLastAction = action;
2379 fLastActionTime = time(0);
2380
2381 fMonitoringMutex->UnLock();
2382}
eba76848 2383
2384//______________________________________________________________________________________________
2385const char* AliShuttle::GetRunParameter(const char* param)
2386{
9827400b 2387 //
2388 // returns run parameter read from DAQ logbook
2389 //
eba76848 2390
2391 if(!fLogbookEntry) {
2392 AliError("No logbook entry!");
2393 return 0;
2394 }
2395
2396 return fLogbookEntry->GetRunParameter(param);
2397}
57c1a579 2398
2399//______________________________________________________________________________________________
9827400b 2400AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
d386d623 2401{
9827400b 2402 //
2403 // returns object from OCDB valid for current run
2404 //
d386d623 2405
9827400b 2406 if (fTestMode & kErrorOCDB)
2407 {
2408 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2409 return 0;
2410 }
2411
d386d623 2412 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2413 if (!sto)
2414 {
9827400b 2415 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
d386d623 2416 return 0;
2417 }
2418
2419 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2420}
2421
2422//______________________________________________________________________________________________
57c1a579 2423Bool_t AliShuttle::SendMail()
2424{
9827400b 2425 //
2426 // sends a mail to the subdetector expert in case of preprocessor error
2427 //
2428
2429 if (fTestMode != kNone)
2430 return kTRUE;
57c1a579 2431
36c99a6a 2432 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
57c1a579 2433 if (dir == NULL)
2434 {
36c99a6a 2435 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
57c1a579 2436 {
36c99a6a 2437 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
57c1a579 2438 return kFALSE;
2439 }
2440
2441 } else {
2442 gSystem->FreeDirectory(dir);
2443 }
2444
2445 TString bodyFileName;
36c99a6a 2446 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
57c1a579 2447 gSystem->ExpandPathName(bodyFileName);
2448
2449 ofstream mailBody;
2450 mailBody.open(bodyFileName, ofstream::out);
2451
2452 if (!mailBody.is_open())
2453 {
2454 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2455 return kFALSE;
2456 }
2457
2458 TString to="";
2459 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2460 TObjString *anExpert=0;
2461 while ((anExpert = (TObjString*) iterExperts.Next()))
2462 {
2463 to += Form("%s,", anExpert->GetName());
2464 }
2465 if (to.Length() > 0) to.Remove(to.Length()-1); // strip trailing comma
909732f7 2466 AliDebug(2, Form("to: %s",to.Data()));
57c1a579 2467
36c99a6a 2468 // TODO this will be removed...
2469 if (to.Contains("not_yet_set")) {
2470 AliInfo("List of detector responsibles not yet set!");
2471 return kFALSE;
2472 }
2473
57c1a579 2474 TString cc="alberto.colla@cern.ch";
2475
2476 TString subject = Form("%s Shuttle preprocessor error in run %d !",
2477 fCurrentDetector.Data(), GetCurrentRun());
909732f7 2478 AliDebug(2, Form("subject: %s", subject.Data()));
57c1a579 2479
2480 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2481 body += Form("SHUTTLE just detected that your preprocessor "
36c99a6a 2482 "exited with ERROR state in run %d!!\n\n", GetCurrentRun());
57c1a579 2483 body += Form("Please check %s status on the web page asap!\n\n", fCurrentDetector.Data());
2484 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
2485
909732f7 2486 AliDebug(2, Form("Body begin: %s", body.Data()));
57c1a579 2487
2488 mailBody << body.Data();
2489 mailBody.close();
2490 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2491
9d733021 2492 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2493 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2494 if (gSystem->Exec(tailCommand.Data()))
2495 {
2496 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2497 }
2498
2499 TString endBody = Form("------------------------------------------------------\n\n");
36c99a6a 2500 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2501 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
57c1a579 2502 endBody += "Sincerely yours,\n\n \t\t\tthe SHUTTLE\n";
2503
909732f7 2504 AliDebug(2, Form("Body end: %s", endBody.Data()));
57c1a579 2505
2506 mailBody << endBody.Data();
2507
2508 mailBody.close();
2509
2510 // send mail!
2511 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2512 subject.Data(),
2513 cc.Data(),
2514 to.Data(),
2515 bodyFileName.Data());
909732f7 2516 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
57c1a579 2517
2518 Int_t result = gSystem->Exec(mailCommand.Data());
2519
2520 return result == 0;
2521}
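SendMail builds the recipient list by appending a comma after each name and then trimming the last character. A minimal sketch of joining with a separator placed only between entries, which also handles an empty list; `JoinRecipients` is a hypothetical helper and the addresses in the usage note are placeholders:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not in AliShuttle): joins addresses with commas,
// inserting the separator only between entries so nothing has to be trimmed.
std::string JoinRecipients(const std::vector<std::string>& experts)
{
    std::string to;
    for (std::vector<std::string>::size_type i = 0; i < experts.size(); ++i) {
        if (i > 0) to += ",";
        to += experts[i];
    }
    return to;
}
```

For example, joining `expert1@example.org` and `expert2@example.org` gives `expert1@example.org,expert2@example.org`, and an empty list gives an empty string.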
d386d623 2522
2523//______________________________________________________________________________________________
9827400b 2524const char* AliShuttle::GetRunType()
441b0e9c 2525{
9827400b 2526 //
2527 // returns run type read from "run type" logbook
2528 //
441b0e9c 2529
2530 if(!fLogbookEntry) {
2531 AliError("No logbook entry!");
2532 return 0;
2533 }
2534
9827400b 2535 return fLogbookEntry->GetRunType();
441b0e9c 2536}
2537
2538//______________________________________________________________________________________________
d386d623 2539void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2540{
9827400b 2541 //
2542 // sets Shuttle temp directory
2543 //
d386d623 2544
2545 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2546}
2547
2548//______________________________________________________________________________________________
2549void AliShuttle::SetShuttleLogDir(const char* logDir)
2550{
9827400b 2551 //
2552 // sets Shuttle log directory
2553 //
d386d623 2554
2555 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2556}