git.uio.no Git - u/mrichter/AliRoot.git/blame - SHUTTLE/AliShuttle.cxx
73abe331 1/**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3 * *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
6 * *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
15
16/*
17$Log$
26758fce 18Revision 1.38 2007/04/12 08:26:18 jgrosseo
19updated comment
20
3c2a21c8 21Revision 1.37 2007/04/10 16:53:14 jgrosseo
22redirecting sub detector stdout, stderr to sub detector log file
23
3d8bc902 24Revision 1.35 2007/04/04 16:26:38 acolla
251. Re-organization of function calls in TestPreprocessor to make it more meaningful.
262. Added missing dependency in test preprocessors.
273. in AliShuttle.cxx: processing time and memory consumption info on a single line.
28
886d60e6 29Revision 1.34 2007/04/04 10:33:36 jgrosseo
301) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
31In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
32
332) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
34
353) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
36
374) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
38
395) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
40If you always need DCS data (like before), you do not need to implement it.
41
426) The run type has been added to the monitoring page
43
9827400b 44Revision 1.33 2007/04/03 13:56:01 acolla
45Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
46run type.
47
3301427a 48Revision 1.32 2007/02/28 10:41:56 acolla
49Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
50AliPreprocessor::GetRunType() function.
51Added some ldap definition files.
52
d386d623 53Revision 1.30 2007/02/13 11:23:21 acolla
54Moved getters and setters of Shuttle's main OCDB/Reference, local
55OCDB/Reference, temp and log folders to AliShuttleInterface
56
9d733021 57Revision 1.27 2007/01/30 17:52:42 jgrosseo
58adding monalisa monitoring
59
e7f62f16 60Revision 1.26 2007/01/23 19:20:03 acolla
61Removed old ldif files, added TOF, MCH ldif files. Added some options in
62AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
63SetShuttleLogDir
64
36c99a6a 65Revision 1.25 2007/01/15 19:13:52 acolla
66Moved some AliInfo to AliDebug in SendMail function
67
fc5a4708 68Revision 1.21 2006/12/07 08:51:26 jgrosseo
69update (alberto):
70table, db names in ldap configuration
71added GRP preprocessor
72DCS data can also be retrieved by data point
73
2c15234c 74Revision 1.20 2006/11/16 16:16:48 jgrosseo
75introducing strict run ordering flag
76removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
77
be48e3ea 78Revision 1.19 2006/11/06 14:23:04 jgrosseo
79major update (Alberto)
80o) reading of run parameters from the logbook
81o) online offline naming conversion
82o) standalone DCSclient package
83
eba76848 84Revision 1.18 2006/10/20 15:22:59 jgrosseo
85o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
86o) Merging Collect, CollectAll, CollectNew function
87o) Removing implementation of empty copy constructors (declaration still there!)
88
cb343cfd 89Revision 1.17 2006/10/05 16:20:55 jgrosseo
90adapting to new CDB classes
91
6ec0e06c 92Revision 1.16 2006/10/05 15:46:26 jgrosseo
93applying to the new interface
94
481441a2 95Revision 1.15 2006/10/02 16:38:39 jgrosseo
96update (alberto):
97fixed memory leaks
98storing of objects that failed to be stored to the grid before
99interfacing of shuttle status table in daq system
100
2bb7b766 101Revision 1.14 2006/08/29 09:16:05 jgrosseo
102small update
103
85a80aa9 104Revision 1.13 2006/08/15 10:50:00 jgrosseo
105effc++ corrections (alberto)
106
4f0ab988 107Revision 1.12 2006/08/08 14:19:29 jgrosseo
108Update to shuttle classes (Alberto)
109
110- Possibility to set the full object's path in the Preprocessor's and
111Shuttle's Store functions
112- Possibility to extend the object's run validity in the same classes
113("startValidity" and "validityInfinite" parameters)
114- Implementation of the StoreReferenceData function to store reference
115data in a dedicated CDB storage.
116
84090f85 117Revision 1.11 2006/07/21 07:37:20 jgrosseo
118last run is stored after each run
119
7bfb2090 120Revision 1.10 2006/07/20 09:54:40 jgrosseo
121introducing status management: The processing per subdetector is divided into several steps,
122after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
123can keep track of the number of failures and skips further processing after a certain threshold is
124exceeded. These thresholds can be configured in LDAP.
125
5164a766 126Revision 1.9 2006/07/19 10:09:55 jgrosseo
127new configuration, access to DAQ FES (Alberto)
128
57f50b3c 129Revision 1.8 2006/07/11 12:44:36 jgrosseo
130adding parameters for extended validity range of data produced by preprocessor
131
17111222 132Revision 1.7 2006/07/10 14:37:09 jgrosseo
133small fix + todo comment
134
e090413b 135Revision 1.6 2006/07/10 13:01:41 jgrosseo
136enhanced storing of last successfully processed run (alberto)
137
a7160fe9 138Revision 1.5 2006/07/04 14:59:57 jgrosseo
139revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
140
45a493ce 141Revision 1.4 2006/06/12 09:11:16 jgrosseo
142coding conventions (Alberto)
143
58bc3020 144Revision 1.3 2006/06/06 14:26:40 jgrosseo
145o) removed files that were moved to STEER
146o) shuttle updated to follow the new interface (Alberto)
147
b948db8d 148Revision 1.2 2006/03/07 07:52:34 hristov
149New version (B.Yordanov)
150
d477ad88 151Revision 1.6 2005/11/19 17:19:14 byordano
152RetrieveDATEEntries and RetrieveConditionsData added
153
154Revision 1.5 2005/11/19 11:09:27 byordano
155AliShuttle declaration added
156
157Revision 1.4 2005/11/17 17:47:34 byordano
158TList changed to TObjArray
159
160Revision 1.3 2005/11/17 14:43:23 byordano
161import to local CVS
162
163Revision 1.1.1.1 2005/10/28 07:33:58 hristov
164Initial import as subdirectory in AliRoot
165
73abe331 166Revision 1.2 2005/09/13 08:41:15 byordano
167default startTime endTime added
168
169Revision 1.4 2005/08/30 09:13:02 byordano
170some docs added
171
172Revision 1.3 2005/08/29 21:15:47 byordano
173some docs added
174
175*/
176
177//
178// This class is the main manager for AliShuttle.
179// It organizes the data retrieval from DCS and calls the
b948db8d 180// interface methods of AliPreprocessor.
73abe331 181// For every detector in the configuration (see AliShuttleConfig),
182// data for its set of aliases is retrieved. If an AliPreprocessor is
b948db8d 183// registered for this detector, it is used
184// according to the schema (see AliPreprocessor).
185// If no AliPreprocessor is registered, the retrieved
73abe331 186// data is stored automatically to the underlying AliCDBStorage.
187// The alias name is used as detSpec.
188//
189
190#include "AliShuttle.h"
191
192#include "AliCDBManager.h"
193#include "AliCDBStorage.h"
194#include "AliCDBId.h"
84090f85 195#include "AliCDBRunRange.h"
196#include "AliCDBPath.h"
5164a766 197#include "AliCDBEntry.h"
73abe331 198#include "AliShuttleConfig.h"
eba76848 199#include "DCSClient/AliDCSClient.h"
73abe331 200#include "AliLog.h"
b948db8d 201#include "AliPreprocessor.h"
5164a766 202#include "AliShuttleStatus.h"
2bb7b766 203#include "AliShuttleLogbookEntry.h"
73abe331 204
57f50b3c 205#include <TSystem.h>
58bc3020 206#include <TObject.h>
b948db8d 207#include <TString.h>
57f50b3c 208#include <TTimeStamp.h>
73abe331 209#include <TObjString.h>
57f50b3c 210#include <TSQLServer.h>
211#include <TSQLResult.h>
212#include <TSQLRow.h>
cb343cfd 213#include <TMutex.h>
9827400b 214#include <TSystemDirectory.h>
215#include <TSystemFile.h>
216#include <TFileMerger.h>
217#include <TGrid.h>
218#include <TGridResult.h>
73abe331 219
e7f62f16 220#include <TMonaLisaWriter.h>
221
5164a766 222#include <fstream>
223
cb343cfd 224#include <sys/types.h>
225#include <sys/wait.h>
226
73abe331 227ClassImp(AliShuttle)
228
b948db8d 229//______________________________________________________________________________________________
230AliShuttle::AliShuttle(const AliShuttleConfig* config,
231 UInt_t timeout, Int_t retries):
4f0ab988 232fConfig(config),
233fTimeout(timeout), fRetries(retries),
234fPreprocessorMap(),
2bb7b766 235fLogbookEntry(0),
eba76848 236fCurrentDetector(),
85a80aa9 237fStatusEntry(0),
cb343cfd 238fMonitoringMutex(0),
eba76848 239fLastActionTime(0),
e7f62f16 240fLastAction(),
9827400b 241fMonaLisa(0),
242fTestMode(kNone),
ffa29e93 243fReadTestMode(kFALSE),
244fOutputRedirected(kFALSE)
73abe331 245{
246 //
247 // config: AliShuttleConfig used
73abe331 248 // timeout: timeout used for AliDCSClient connection
249 // retries: the number of retries in case of connection error.
250 //
251
57f50b3c 252 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
be48e3ea 253 for(int iSys=0;iSys<4;iSys++) {
57f50b3c 254 fServer[iSys]=0;
be48e3ea 255 if (iSys < 3)
2c15234c 256 fFXSlist[iSys].SetOwner(kTRUE);
57f50b3c 257 }
2bb7b766 258 fPreprocessorMap.SetOwner(kTRUE);
be48e3ea 259
260 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
261 fFirstUnprocessed[iDet] = kFALSE;
262
cb343cfd 263 fMonitoringMutex = new TMutex();
58bc3020 264}
265
b948db8d 266//______________________________________________________________________________________________
57f50b3c 267AliShuttle::~AliShuttle()
58bc3020 268{
9827400b 269 //
270 // destructor
271 //
58bc3020 272
b948db8d 273 fPreprocessorMap.DeleteAll();
be48e3ea 274 for(int iSys=0;iSys<4;iSys++)
57f50b3c 275 if(fServer[iSys]) {
276 fServer[iSys]->Close();
277 delete fServer[iSys];
eba76848 278 fServer[iSys] = 0;
57f50b3c 279 }
2bb7b766 280
281 if (fStatusEntry){
282 delete fStatusEntry;
283 fStatusEntry = 0;
284 }
cb343cfd 285
286 if (fMonitoringMutex)
287 {
288 delete fMonitoringMutex;
289 fMonitoringMutex = 0;
290 }
73abe331 291}
292
b948db8d 293//______________________________________________________________________________________________
57f50b3c 294void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
58bc3020 295{
73abe331 296 //
b948db8d 297 // Registers a new AliPreprocessor.
73abe331 298 // GetName() is used as the identifier of the preprocessor.
299 // The preprocessor is registered only if no other one
300 // with the same identifier (GetName()) is registered.
301 //
302
eba76848 303 const char* detName = preprocessor->GetName();
304 if(GetDetPos(detName) < 0)
305 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
306
307 if (fPreprocessorMap.GetValue(detName)) {
308 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
73abe331 309 return;
310 }
311
eba76848 312 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
73abe331 313}
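The registration rule described above (first registration wins, a later one with the same name is rejected) can be sketched without ROOT as follows. This is only an illustration: `Preprocessor` and `PreprocessorRegistry` are hypothetical stand-ins, not AliRoot classes, and the detector-name validation done via `GetDetPos` is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for AliPreprocessor: only the name matters here.
struct Preprocessor {
    std::string name;
};

// Hypothetical registry, not part of AliRoot. It owns the registered objects,
// similar in spirit to fPreprocessorMap after SetOwner(kTRUE).
class PreprocessorRegistry {
public:
    // Returns true if the preprocessor was added, false if another one with
    // the same name was already registered (the existing entry is kept).
    // Unlike the original code, a rejected object is destroyed here because
    // the unique_ptr argument goes out of scope.
    bool Register(std::unique_ptr<Preprocessor> p) {
        assert(p && "a null preprocessor cannot be registered");
        if (fMap.count(p->name) != 0) return false;
        std::string key = p->name;
        fMap.emplace(std::move(key), std::move(p));
        return true;
    }

    std::size_t Size() const { return fMap.size(); }

private:
    std::map<std::string, std::unique_ptr<Preprocessor>> fMap;
};
```

Registering two objects named "TPC" in this sketch keeps the first and reports failure for the second, which corresponds to the AliWarning branch in RegisterPreprocessor.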
b948db8d 314//______________________________________________________________________________________________
3301427a 315Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
84090f85 316 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
73abe331 317{
9827400b 318 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
319 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
320 // using this function. Use StoreReferenceData instead!
321 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
322 // finishes the data are transferred to the main storage (Grid).
b948db8d 323
3301427a 324 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
84090f85 325}
326
327//______________________________________________________________________________________________
3301427a 328Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
84090f85 329{
9827400b 330 // Stores a CDB object in the storage for reference data. These objects will not be available during
331 // offline reconstruction. Use this function for reference data only!
332 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
333 // finishes the data are transferred to the main storage (Grid).
85a80aa9 334
3301427a 335 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
85a80aa9 336}
337
338//______________________________________________________________________________________________
3301427a 339Bool_t AliShuttle::StoreLocally(const TString& localUri,
85a80aa9 340 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
341 Int_t validityStart, Bool_t validityInfinite)
342{
9827400b 343 // Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
344 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
345 // The parameters are:
346 // 1) Uri of the backup storage (Local)
347 // 2) the object's path.
348 // 3) the object to be stored
349 // 4) the metaData to be associated with the object
350 // 5) the validity start run number w.r.t. the current run,
351 // if the data is valid only for this run leave the default 0
352 // 6) specifies if the calibration data is valid for infinity (this means until updated),
353 // typical for calibration runs, the default is kFALSE
354 //
355 // returns kFALSE on failure, kTRUE otherwise
84090f85 356
9827400b 357 if (fTestMode & kErrorStorage)
358 {
359 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
360 return kFALSE;
361 }
362
3301427a 363 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
2bb7b766 364
85a80aa9 365 Int_t firstRun = GetCurrentRun() - validityStart;
84090f85 366 if(firstRun < 0) {
9827400b 367 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
84090f85 368 firstRun=0;
369 }
370
371 Int_t lastRun = -1;
372 if(validityInfinite) {
373 lastRun = AliCDBRunRange::Infinity();
374 } else {
375 lastRun = GetCurrentRun();
376 }
377
3301427a 378 // Version is set to current run, it will be used later to transfer data to Grid
379 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
2bb7b766 380
381 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
382 TObjString runUsed = Form("%d", GetCurrentRun());
9e080f92 383 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
2bb7b766 384 }
84090f85 385
3301427a 386 Bool_t result = kFALSE;
84090f85 387
3301427a 388 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
389 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
84090f85 390 } else {
3301427a 391 result = AliCDBManager::Instance()->GetStorage(localUri)
84090f85 392 ->Put(object, id, metaData);
393 }
394
395 if(!result) {
396
9827400b 397 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
3301427a 398 }
2bb7b766 399
3301427a 400 return result;
401}
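The run-validity computation in StoreLocally (parameters 5 and 6) can be isolated in a small ROOT-free sketch. `RunRange`, `ComputeValidity` and `kRunInfinity` are names made up for this illustration; in particular `kRunInfinity` only stands in for `AliCDBRunRange::Infinity()`, whose actual value is not assumed here.

```cpp
#include <cassert>
#include <limits>

// Hypothetical placeholder for AliCDBRunRange::Infinity().
constexpr int kRunInfinity = std::numeric_limits<int>::max();

// Hypothetical helper type holding the validity range of a stored object.
struct RunRange {
    int first;
    int last;
};

// validityStart counts runs backwards from currentRun (0 = this run only).
// validityInfinite extends the range until the object is updated.
RunRange ComputeValidity(int currentRun, int validityStart, bool validityInfinite) {
    assert(currentRun >= 0);
    int first = currentRun - validityStart;
    if (first < 0) first = 0;  // same clamp as in StoreLocally
    const int last = validityInfinite ? kRunInfinity : currentRun;
    return RunRange{first, last};
}
```

For example, with current run 100, `validityStart = 5` and `validityInfinite = true`, the sketch yields the range starting at run 95 and ending at the infinity marker.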
84090f85 402
3301427a 403//______________________________________________________________________________________________
404Bool_t AliShuttle::StoreOCDB()
405{
9827400b 406 //
407 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
408 // Calls the underlying StoreOCDB(const TString&) function twice, for the reference and OCDB storages,
409 // and calls StoreRefFilesToGrid in between to store the reference files.
410 //
411
412 if (fTestMode & kErrorGrid)
413 {
414 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
415 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
416 return kFALSE;
417 }
418
3301427a 419 AliInfo("Storing reference data ...");
420 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
9827400b 421
422 AliInfo("Storing reference files ...");
423 Bool_t resultRefFiles = StoreRefFilesToGrid();
424
26758fce 425 AliInfo("Storing OCDB data ...");
426 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
427
9827400b 428 return resultCDB && resultRef && resultRefFiles;
3301427a 429}
430
431//______________________________________________________________________________________________
432Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
433{
434 //
435 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
436 //
437
438 TObjArray* gridIds=0;
439
440 Bool_t result = kTRUE;
26758fce 441 // to check whether all files have been transferred, or some files were left behind
442 // because the run is not first unprocessed
443 Bool_t willDoAgain = kFALSE;
3301427a 444
445 const char* type = 0;
446 TString localURI;
447 if(gridURI == fgkMainCDB) {
448 type = "OCDB";
449 localURI = fgkLocalCDB;
450 } else if(gridURI == fgkMainRefStorage) {
451 type = "reference";
452 localURI = fgkLocalRefStorage;
453 } else {
454 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
455 return kFALSE;
456 }
457
458 AliCDBManager* man = AliCDBManager::Instance();
459
460 AliCDBStorage *gridSto = man->GetStorage(gridURI);
461 if(!gridSto) {
462 Log("SHUTTLE",
463 Form("StoreOCDB - cannot activate main %s storage", type));
464 return kFALSE;
465 }
466
467 gridIds = gridSto->GetQueryCDBList();
468
469 // get objects previously stored in local CDB
470 AliCDBStorage *localSto = man->GetStorage(localURI);
471 if(!localSto) {
472 Log("SHUTTLE",
473 Form("StoreOCDB - cannot activate local %s storage", type));
474 return kFALSE;
475 }
476 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
477 // Local objects were stored with current run as Grid version!
478 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
479 localEntries->SetOwner(1);
480
481 // loop on local stored objects
482 TIter localIter(localEntries);
483 AliCDBEntry *aLocEntry = 0;
484 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
485 aLocEntry->SetOwner(1);
486 AliCDBId aLocId = aLocEntry->GetId();
487 aLocEntry->SetVersion(-1);
488 aLocEntry->SetSubVersion(-1);
489
490 // If local object is valid up to infinity we store it only if it is
491 // the first unprocessed run!
492 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
493 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
494 {
495 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
496 "there are previous unprocessed runs!",
497 fCurrentDetector.Data(), aLocId.GetPath().Data()));
26758fce 498 willDoAgain=kTRUE;
3301427a 499 continue;
500 }
501
502 // loop on Grid valid Id's
503 Bool_t store = kTRUE;
504 TIter gridIter(gridIds);
505 AliCDBId* aGridId = 0;
506 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
507 if(aGridId->GetPath() != aLocId.GetPath()) continue;
508 // skip all objects valid up to infinity
509 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
510 // if we get here, it means there's already some more recent object stored on Grid!
511 store = kFALSE;
512 break;
513 }
514
515 // If we get here, the file can be stored!
516 Bool_t storeOk = gridSto->Put(aLocEntry);
517 if(!store || storeOk){
518
519 if (!store)
520 {
521 Log(fCurrentDetector.Data(),
522 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
523 type, aGridId->ToString().Data()));
524 } else {
525 Log("SHUTTLE",
526 Form("StoreOCDB - Object <%s> successfully put into %s storage",
527 aLocId.ToString().Data(), type));
528 }
84090f85 529
3301427a 530 // removing local file...
531 TString filename;
532 localSto->IdToFilename(aLocId, filename);
533 AliInfo(Form("Removing local file %s", filename.Data()));
534 RemoveFile(filename.Data());
535 continue;
536 } else {
537 Log("SHUTTLE",
538 Form("StoreOCDB - Grid %s storage of object <%s> failed",
539 type, aLocId.ToString().Data()));
540 result = kFALSE;
b948db8d 541 }
542 }
3301427a 543 localEntries->Clear();
26758fce 544
545 if(result && willDoAgain) {
546 Log(fCurrentDetector.Data(),
547 "Some files have been left on local storage, will try again later!");
548 result = kFALSE;
549 }
2bb7b766 550
b948db8d 551 return result;
3301427a 552}
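The per-object checks in StoreOCDB(const TString&) can be read as a three-way decision. The following ROOT-free sketch models only how the `willDoAgain` and `store` flags are derived; `ObjectId`, `UploadDecision`, `Decide` and `kRunInfinity` are hypothetical names for this illustration. It does not model the transfer itself, and it takes no position on the fact that the original calls `gridSto->Put` before consulting `store`.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

// Hypothetical placeholder for AliCDBRunRange::Infinity().
constexpr int kRunInfinity = std::numeric_limits<int>::max();

// Hypothetical, reduced stand-in for AliCDBId.
struct ObjectId {
    std::string path;
    int lastRun;
};

enum class UploadDecision {
    kStore,                       // no finite-validity Grid object on this path
    kDeferUntilFirstUnprocessed,  // infinite validity, but earlier runs pending
    kSupersededOnGrid             // a finite-validity Grid object already exists
};

UploadDecision Decide(const ObjectId& local,
                      const std::vector<ObjectId>& gridIds,
                      bool isFirstUnprocessedRun) {
    assert(!local.path.empty());
    // Objects valid up to infinity are only uploaded for the first
    // unprocessed run (this sets willDoAgain in the original).
    if (local.lastRun == kRunInfinity && !isFirstUnprocessedRun)
        return UploadDecision::kDeferUntilFirstUnprocessed;

    for (const ObjectId& grid : gridIds) {
        if (grid.path != local.path) continue;         // different object
        if (grid.lastRun == kRunInfinity) continue;    // skipped, as in the original
        return UploadDecision::kSupersededOnGrid;      // store = kFALSE
    }
    return UploadDecision::kStore;
}
```

In this reading, a deferred object stays on local storage and makes the whole call return kFALSE, so that the Shuttle tries again later.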
553
9827400b 554//______________________________________________________________________________________________
555Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
556{
557 //
3c2a21c8 558 // Stores reference file directly (without opening it). This function stores the file locally.
9827400b 559 //
3c2a21c8 560 // The file is stored under the following location:
561 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
562 // where <gridFileName> is the third parameter given to the function
563 //
9827400b 564
565 if (fTestMode & kErrorStorage)
566 {
567 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
568 return kFALSE;
569 }
570
571 AliCDBManager* man = AliCDBManager::Instance();
572 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
573
574 TString localBaseFolder = sto->GetBaseFolder();
575
576 TString targetDir;
577 targetDir.Form("%s/%s", localBaseFolder.Data(), detector);
578
579 TString target;
580 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
581
582 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
583 if (result)
584 {
585 result = gSystem->mkdir(targetDir, kTRUE);
586 if (result != 0)
587 {
588 Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
589 return kFALSE;
590 }
591 }
592
593 result = gSystem->CopyFile(localFile, target);
594
595 if (result == 0)
596 {
597 Log("SHUTTLE", Form("StoreReferenceFile - Stored file %s locally to %s", localFile, target.Data()));
598 return kTRUE;
599 }
600 else
601 {
602 Log("SHUTTLE", Form("StoreReferenceFile - Storing file %s locally to %s failed", localFile, target.Data()));
603 return kFALSE;
604 }
605}
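The naming scheme `<base folder>/<DET>/<RUN#>_<gridFileName>` used by StoreReferenceFile can be written down as a plain string helper. `ReferenceFileTarget` is a hypothetical function for illustration only; the base folder and file name in the example below are invented.

```cpp
#include <cassert>
#include <string>

// Builds the local target path in the same layout as StoreReferenceFile:
//   <baseFolder>/<detector>/<run>_<gridFileName>
std::string ReferenceFileTarget(const std::string& baseFolder,
                                const std::string& detector,
                                int run,
                                const std::string& gridFileName) {
    assert(!gridFileName.empty());
    return baseFolder + "/" + detector + "/" +
           std::to_string(run) + "_" + gridFileName;
}
```

With the invented inputs `"/tmp/ref"`, `"TPC"`, run `12345` and `"pedestals.root"`, the helper produces `/tmp/ref/TPC/12345_pedestals.root`.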
606
607//______________________________________________________________________________________________
608Bool_t AliShuttle::StoreRefFilesToGrid()
609{
610 //
611 // Transfers the reference files of the current run and detector to the Grid.
9827400b 612 //
3c2a21c8 613 // Each file is stored under the following location:
614 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
615 // where <gridFileName> is the name that was passed to StoreReferenceFile
616 //
9827400b 617
618 AliCDBManager* man = AliCDBManager::Instance();
619 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
620 if (!sto)
621 return kFALSE;
622 TString localBaseFolder = sto->GetBaseFolder();
623
624 TString dir;
3d8bc902 625 dir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 626
627 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
628 if (!gridSto)
629 return kFALSE;
630 TString gridBaseFolder = gridSto->GetBaseFolder();
631 TString alienDir;
3d8bc902 632 alienDir.Form("%s%s", gridBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 633
3d8bc902 634 if (!gGrid)
9827400b 635 return kFALSE;
636
9827400b 637 TString begin;
638 begin.Form("%d_", GetCurrentRun());
639
640 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
3d8bc902 641 if (!baseDir)
642 return kTRUE;
643
9827400b 644 TList* dirList = baseDir->GetListOfFiles();
645 if (!dirList)
3d8bc902 646 {
647 delete baseDir;
9827400b 648 return kTRUE;
3d8bc902 649 }
9827400b 650
651 Int_t nDirs = dirList->GetEntries();
652
653 Bool_t success = kTRUE;
3d8bc902 654 Bool_t first = kTRUE;
9827400b 655
656 for (Int_t iDir=0; iDir<nDirs; ++iDir)
657 {
658 TSystemFile* entry = dynamic_cast<TSystemFile*> (dirList->At(iDir));
659 if (!entry)
660 continue;
661
662 if (entry->IsDirectory())
663 continue;
664
665 TString fileName(entry->GetName());
666 if (!fileName.BeginsWith(begin))
667 continue;
668
3d8bc902 669 if (first)
670 {
671 first = kFALSE;
672 // check that DET folder exists, otherwise create it
673 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
674
675 if (!result)
676 return kFALSE;
677
678 if (!result->GetFileName(0))
679 {
680 if (!gGrid->Mkdir(alienDir.Data(),"",0))
681 {
682 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
683 alienDir.Data()));
684 delete baseDir;
685 return kFALSE;
686 }
687
688 }
689 }
690
9827400b 691 TString fullLocalPath;
692 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
693
694 TString fullGridPath;
695 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
696
697 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s", fullLocalPath.Data(), fullGridPath.Data()));
698
699 TFileMerger fileMerger;
700 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
701
702 if (result)
703 {
704 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded", fullLocalPath.Data(), fullGridPath.Data()));
705 RemoveFile(fullLocalPath);
706 }
707 else
708 {
709 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s failed", fullLocalPath.Data(), fullGridPath.Data()));
710 success = kFALSE;
711 }
712 }
713
714 delete baseDir;
715
716 return success;
717}
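StoreRefFilesToGrid picks the files to transfer by checking that the file name begins with `<RUN#>_`. A ROOT-free sketch of that selection is shown below; `FilesForRun` is a hypothetical helper, and the directory listing is passed in as a plain vector instead of coming from TSystemDirectory.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the names that start with "<run>_", in their original order.
// The trailing underscore keeps run 12 from matching files of run 123.
std::vector<std::string> FilesForRun(const std::vector<std::string>& names, int run) {
    assert(run >= 0);
    const std::string prefix = std::to_string(run) + "_";
    std::vector<std::string> selected;
    for (const std::string& name : names) {
        if (name.compare(0, prefix.size(), prefix) == 0)
            selected.push_back(name);
    }
    return selected;
}
```

Files of other runs that share the same detector folder are left untouched by this filter, which matches the `continue` on `!fileName.BeginsWith(begin)` above.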
718
3301427a 719//______________________________________________________________________________________________
720void AliShuttle::CleanLocalStorage(const TString& uri)
721{
9827400b 722 //
723 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
724 //
3301427a 725
726 const char* type = 0;
727 if(uri == fgkLocalCDB) {
728 type = "OCDB";
729 } else if(uri == fgkLocalRefStorage) {
730 type = "reference";
731 } else {
732 AliError(Form("Invalid storage URI: %s", uri.Data()));
733 return;
734 }
735
736 AliCDBManager* man = AliCDBManager::Instance();
b948db8d 737
3301427a 738 // open local storage
739 AliCDBStorage *localSto = man->GetStorage(uri);
740 if(!localSto) {
741 Log("SHUTTLE",
742 Form("CleanLocalStorage - cannot activate local %s storage", type));
743 return;
744 }
745
746 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
747 localSto->GetBaseFolder().Data(), fCurrentDetector.Data(), GetCurrentRun()));
748
749 AliInfo(Form("filename = %s", filename.Data()));
750
751 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
752 GetCurrentRun(), fCurrentDetector.Data()));
753
754 RemoveFile(filename.Data());
755
756}
757
758//______________________________________________________________________________________________
759void AliShuttle::RemoveFile(const char* filename)
760{
9827400b 761 //
762 // removes local file
763 //
3301427a 764
765 TString command(Form("rm -f %s", filename));
766
767 Int_t result = gSystem->Exec(command.Data());
768 if(result != 0)
769 {
770 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
771 fCurrentDetector.Data(), filename));
772 }
73abe331 773}
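RemoveFile hands the file name to a shell via `rm -f`, which is what allows CleanLocalStorage to pass a wildcard pattern. For the single-file case, a shell-free variant could look like the sketch below. `RemoveSingleFile` is a hypothetical helper, it does not expand wildcards, and it assumes a POSIX-like platform where `std::remove` sets `errno` to `ENOENT` for a missing file.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <string>

// Removes one file without going through a shell.
// Like "rm -f", a file that does not exist is not treated as an error.
bool RemoveSingleFile(const std::string& path) {
    assert(!path.empty());
    if (std::remove(path.c_str()) == 0) return true;
    return errno == ENOENT;  // assumption: POSIX-style errno reporting
}
```

Because no command line is built, file names containing spaces or shell metacharacters are passed through unchanged in this sketch.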
774
b948db8d 775//______________________________________________________________________________________________
5164a766 776AliShuttleStatus* AliShuttle::ReadShuttleStatus()
777{
9827400b 778 //
779 // Reads the AliShuttleStatus from the CDB
780 //
5164a766 781
2bb7b766 782 if (fStatusEntry){
783 delete fStatusEntry;
784 fStatusEntry = 0;
785 }
5164a766 786
10a5a932 787 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
2bb7b766 788 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
5164a766 789
2bb7b766 790 if (!fStatusEntry) return 0;
791 fStatusEntry->SetOwner(1);
5164a766 792
2bb7b766 793 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
794 if (!status) {
795 AliError("Invalid object stored to CDB!");
796 return 0;
797 }
5164a766 798
2bb7b766 799 return status;
5164a766 800}
801
802//______________________________________________________________________________________________
7bfb2090 803Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
5164a766 804{
9827400b 805 //
806 // writes the status for one subdetector
807 //
2bb7b766 808
809 if (fStatusEntry){
810 delete fStatusEntry;
811 fStatusEntry = 0;
812 }
5164a766 813
2bb7b766 814 Int_t run = GetCurrentRun();
5164a766 815
2bb7b766 816 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
5164a766 817
2bb7b766 818 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
819 fStatusEntry->SetOwner(1);
5164a766 820
2bb7b766 821 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
7bfb2090 822
2bb7b766 823 if (!result) {
3301427a 824 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
825 fCurrentDetector.Data(), run));
2bb7b766 826 return kFALSE;
827 }
e7f62f16 828
829 SendMLInfo();
7bfb2090 830
2bb7b766 831 return kTRUE;
5164a766 832}
833
834//______________________________________________________________________________________________
835void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
836{
9827400b 837 //
838 // changes the AliShuttleStatus for the given detector and run to the given status
839 //
5164a766 840
2bb7b766 841 if (!fStatusEntry){
842 AliError("UNEXPECTED: fStatusEntry empty");
843 return;
844 }
5164a766 845
2bb7b766 846 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
5164a766 847
2bb7b766 848 if (!status){
3301427a 849 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
2bb7b766 850 return;
851 }
5164a766 852
2c15234c 853 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
eba76848 854 fCurrentDetector.Data(),
36c99a6a 855 status->GetStatusName(),
eba76848 856 status->GetStatusName(newStatus));
cb343cfd 857 Log("SHUTTLE", actionStr);
858 SetLastAction(actionStr);
5164a766 859
2bb7b766 860 status->SetStatus(newStatus);
861 if (increaseCount) status->IncreaseCount();
5164a766 862
2bb7b766 863 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
e7f62f16 864
865 SendMLInfo();
5164a766 866}
e7f62f16 867
868//______________________________________________________________________________________________
869void AliShuttle::SendMLInfo()
870{
871 //
872 // sends ML information about the current status of the current detector being processed
873 //
874
875 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
876
877 if (!status){
3301427a 878 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
e7f62f16 879 return;
880 }
881
882 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
883 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
884
885 TList mlList;
886 mlList.Add(&mlStatus);
887 mlList.Add(&mlRetryCount);
888
889 fMonaLisa->SendParameters(&mlList);
890}
891
5164a766 892//______________________________________________________________________________________________
893Bool_t AliShuttle::ContinueProcessing()
894{
9827400b 895 // this function reads the AliShuttleStatus information from CDB and
896 // checks if the processing should be continued
897 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 898
57c1a579 899 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
900
901 AliPreprocessor* aPreprocessor =
902 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
903 if (!aPreprocessor)
904 {
905 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
906 return kFALSE;
907 }
908
2bb7b766 909 AliShuttleLogbookEntry::Status entryStatus =
eba76848 910 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 911
912 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 913 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 914 fCurrentDetector.Data(),
915 fLogbookEntry->GetDetectorStatusName(entryStatus)));
916 return kFALSE;
917 }
918
919 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 920
921 // check if current run is first unprocessed run for current detector
922 if (fConfig->StrictRunOrder(fCurrentDetector) &&
923 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
924 {
925 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
926 return kFALSE;
927 }
928
2bb7b766 929 AliShuttleStatus* status = ReadShuttleStatus();
930 if (!status) {
931 // first time
932 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
933 fCurrentDetector.Data()));
934 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
935 return WriteShuttleStatus(status);
936 }
937
938 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
939 // If it happens it may mean Logbook updating failed... let's do it now!
940 if (status->GetStatus() == AliShuttleStatus::kDone ||
941 status->GetStatus() == AliShuttleStatus::kFailed){
942 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
943 fCurrentDetector.Data(),
944 status->GetStatusName(status->GetStatus())));
945 UpdateShuttleLogbook(fCurrentDetector.Data(),
946 status->GetStatusName(status->GetStatus()));
947 return kFALSE;
948 }
949
3301427a 950 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 951 Log("SHUTTLE",
952 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
953 fCurrentDetector.Data()));
9827400b 954 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
955 if (StoreOCDB()){
3301427a 956 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
957 fCurrentDetector.Data()));
2bb7b766 958 UpdateShuttleStatus(AliShuttleStatus::kDone);
959 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
960 } else {
961 Log("SHUTTLE",
962 Form("ContinueProcessing - %s: Grid storage failed again",
963 fCurrentDetector.Data()));
9827400b 964 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 965 }
966 return kFALSE;
967 }
968
969 // if we get here, there is a restart
57c1a579 970 Bool_t cont = kFALSE;
2bb7b766 971
972 // abort conditions
cb343cfd 973 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 974 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
975 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 976 status->GetCount(), status->GetStatusName()));
977 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 978 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 979
980 // there may still be objects in local OCDB and reference storage
981 // and FXS databases may be not updated: do it now!
9827400b 982
983 // TODO Currently disabled, we want to keep files in case of failure!
984 // CleanLocalStorage(fgkLocalCDB);
985 // CleanLocalStorage(fgkLocalRefStorage);
986 // UpdateTableFailCase();
987
988 // Send mail to detector expert!
989 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
990 if (!SendMail())
991 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
992 fCurrentDetector.Data()));
3301427a 993
57c1a579 994 } else {
995 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
996 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
997 status->GetStatusName(), status->GetCount()));
9827400b 998 Bool_t increaseCount = kTRUE;
999 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1000 increaseCount = kFALSE;
1001 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 1002 cont = kTRUE;
2bb7b766 1003 }
1004
57c1a579 1005 return cont;
5164a766 1006}
1007
1008//______________________________________________________________________________________________
2bb7b766 1009Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 1010{
73abe331 1011 //
b948db8d 1012 // Retrieves data for all detectors in the configuration.
2bb7b766 1013 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1014 // (Unprocessed, Inactive, Failed or Done).
d477ad88 1015 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 1016 //
1017
9827400b 1018 if (!entry) return kFALSE;
2bb7b766 1019
1020 fLogbookEntry = entry;
1021
9827400b 1022 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1023 GetCurrentRun()));
2bb7b766 1024
e7f62f16 1025 // create ML instance that monitors this run
1026 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1027 // disable monitoring of other parameters that come e.g. from TFile
1028 gMonitoringWriter = 0;
2bb7b766 1029
e7f62f16 1030 // Send the information to ML
1031 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1032 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1033
1034 TList mlList;
1035 mlList.Add(&mlStatus);
9827400b 1036 mlList.Add(&mlRunType);
e7f62f16 1037
1038 fMonaLisa->SendParameters(&mlList);
3301427a 1039
9827400b 1040 if (fLogbookEntry->IsDone())
1041 {
1042 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1043 UpdateShuttleLogbook("shuttle_done");
1044 fLogbookEntry = 0;
1045 return kTRUE;
1046 }
1047
1048 // read test mode if flag is set
1049 if (fReadTestMode)
1050 {
3d8bc902 1051 fTestMode = kNone;
9827400b 1052 TString logEntry(entry->GetRunParameter("log"));
1053 //printf("log entry = %s\n", logEntry.Data());
1054 TString searchStr("Testmode: ");
1055 Int_t pos = logEntry.Index(searchStr.Data());
1056 //printf("%d\n", pos);
1057 if (pos >= 0)
1058 {
1059 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1060 //printf("%s\n", subStr.String().Data());
1061 TString newStr(subStr.Data());
1062 TObjArray* token = newStr.Tokenize(' ');
1063 if (token)
1064 {
1065 //token->Print();
1066 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1067 if (tmpStr)
1068 {
1069 Int_t testMode = tmpStr->String().Atoi();
1070 if (testMode > 0)
1071 {
1072 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1073 SetTestMode((TestMode) testMode);
1074 }
1075 }
1076 delete token;
1077 }
1078 }
1079 }
1080
3d8bc902 1081 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1082
eba76848 1083 fLogbookEntry->Print("all");
57f50b3c 1084
1085 // Initialization
d477ad88 1086 Bool_t hasError = kFALSE;
5164a766 1087
2bb7b766 1088 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1089 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1090 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1091 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1092
57f50b3c 1093 // Loop on detectors in the configuration
b948db8d 1094 TIter iter(fConfig->GetDetectors());
2bb7b766 1095 TObjString* aDetector = 0;
b948db8d 1096
be48e3ea 1097 while ((aDetector = (TObjString*) iter.Next()))
1098 {
7bfb2090 1099 fCurrentDetector = aDetector->String();
5164a766 1100
9e080f92 1101 if (ContinueProcessing() == kFALSE) continue;
1102
2bb7b766 1103 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1104 GetCurrentRun(), aDetector->GetName()));
1105
9d733021 1106 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1107
e7f62f16 1108 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1109
be48e3ea 1110 Int_t pid = fork();
1111
1112 if (pid < 0)
1113 {
1114 Log("SHUTTLE", "ERROR: Forking failed");
1115 }
1116 else if (pid > 0)
1117 {
1118 // parent
1119 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1120 GetCurrentRun(), aDetector->GetName()));
1121
1122 Long_t begin = time(0);
1123
1124 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1125 while (waitpid(pid, &status, WNOHANG) == 0)
1126 {
1127 Long_t expiredTime = time(0) - begin;
1128
1129 if (expiredTime > fConfig->GetPPTimeOut())
1130 {
9827400b 1131 TString tmp;
1132 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1133 fCurrentDetector.Data(), expiredTime);
1134 Log("SHUTTLE", tmp);
1135 Log(fCurrentDetector, tmp);
be48e3ea 1136
1137 kill(pid, 9);
1138
3301427a 1139 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1140 hasError = kTRUE;
1141
1142 gSystem->Sleep(1000);
1143 }
1144 else
1145 {
be48e3ea 1146 gSystem->Sleep(1000);
9827400b 1147
1148 TString checkStr;
1149 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1150 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1151 if (!pipe)
1152 {
1153 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1154 continue;
1155 }
1156
1157 char buffer[100];
1158 if (!fgets(buffer, 100, pipe))
1159 {
1160 Log("SHUTTLE", "Error: ps did not return anything");
1161 gSystem->ClosePipe(pipe);
1162 continue;
1163 }
1164 gSystem->ClosePipe(pipe);
1165
1166 //Log("SHUTTLE", Form("ps returned %s", buffer));
1167
1168 Int_t mem = 0;
1169 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1170 {
1171 Log("SHUTTLE", "Error: Could not parse output of ps");
1172 continue;
1173 }
1174
1175 if (expiredTime % 60 == 0)
886d60e6 1176 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1177 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1178
1179 if (mem > fConfig->GetPPMaxMem())
1180 {
1181 TString tmp;
1182 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1183 mem, fConfig->GetPPMaxMem());
1184 Log("SHUTTLE", tmp);
1185 Log(fCurrentDetector, tmp);
1186
1187 kill(pid, 9);
1188
1189 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1190 hasError = kTRUE;
1191
1192 gSystem->Sleep(1000);
1193 }
be48e3ea 1194 }
1195 }
1196
1197 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1198 GetCurrentRun(), aDetector->GetName()));
1199
1200 if (WIFEXITED(status))
1201 {
1202 Int_t returnCode = WEXITSTATUS(status);
1203
3301427a 1204 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1205 returnCode));
be48e3ea 1206
9827400b 1207 if (returnCode == 0) hasError = kTRUE; // the client calls gSystem->Exit(success), so 0 means failure
be48e3ea 1208 }
1209 }
1210 else if (pid == 0)
1211 {
1212 // client
1213 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1214
ffa29e93 1215 AliInfo("Redirecting output...");
1216
1217 if ((freopen(GetLogFileName(fCurrentDetector), "w", stdout)) == 0)
1218 {
1219 Log("SHUTTLE", "Could not freopen stdout");
1220 }
1221 else
1222 {
1223 fOutputRedirected = kTRUE;
1224 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1225 Log("SHUTTLE", "Could not redirect stderr");
1226
1227 }
1228
9827400b 1229 Bool_t success = ProcessCurrentDetector();
1230 if (success) // Preprocessor finished successfully!
1231 {
3301427a 1232 // Update time_processed field in FXS DB
1233 if (UpdateTable() == kFALSE)
1234 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1235
1236 // Transfer the data from local storage to main storage (Grid)
1237 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1238 if (StoreOCDB() == kFALSE)
1239 {
1240 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1241 GetCurrentRun(), aDetector->GetName()));
1242 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1243 success = kFALSE;
3301427a 1244 } else {
1245 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1246 GetCurrentRun(), aDetector->GetName()));
1247 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1248 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1249 }
be48e3ea 1250 }
1251
4b95672b 1252 for (UInt_t iSys=0; iSys<3; iSys++)
1253 {
1254 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1255 }
1256
be48e3ea 1257 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1258 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1259
1260 // the client exits here
9827400b 1261 gSystem->Exit(success);
be48e3ea 1262
1263 AliError("We should never get here!!!");
1264 }
7bfb2090 1265 }
5164a766 1266
2bb7b766 1267 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1268 GetCurrentRun()));
1269
1270 //check if shuttle is done for this run, if so update logbook
1271 TObjArray checkEntryArray;
1272 checkEntryArray.SetOwner(1);
9e080f92 1273 TString whereClause = Form("where run=%d", GetCurrentRun());
1274 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1275 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1276 GetCurrentRun()));
1277 return hasError == kFALSE;
1278 }
b948db8d 1279
9e080f92 1280 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1281 (checkEntryArray.At(0));
2bb7b766 1282
9e080f92 1283 if (checkEntry)
1284 {
1285 if (checkEntry->IsDone())
be48e3ea 1286 {
9e080f92 1287 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1288 UpdateShuttleLogbook("shuttle_done");
1289 }
1290 else
1291 {
1292 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1293 {
9e080f92 1294 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1295 {
9e080f92 1296 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1297 checkEntry->GetRun(), GetDetName(iDet)));
1298 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1299 }
1300 }
2bb7b766 1301 }
1302 }
1303
e7f62f16 1304 // remove ML instance
1305 delete fMonaLisa;
1306 fMonaLisa = 0;
1307
2bb7b766 1308 fLogbookEntry = 0;
85a80aa9 1309
a7160fe9 1310 return hasError == kFALSE;
73abe331 1311}
1312
b948db8d 1313//______________________________________________________________________________________________
9827400b 1314Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1315{
1316 //
2bb7b766 1317 // Retrieves data just for a specific detector (fCurrentDetector).
73abe331 1318 // There should be a configuration for this detector.
73abe331 1319
2bb7b766 1320 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1321
2c15234c 1322 TMap dcsMap;
1323 dcsMap.SetOwner(1);
73abe331 1324
85a80aa9 1325 Bool_t aDCSError = kFALSE;
3301427a 1326
1327 // call preprocessor
1328 AliPreprocessor* aPreprocessor =
1329 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1330
1331 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1332
1333 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1334
3d8bc902 1335 if (!processDCS || (fTestMode & kSkipDCS))
2c15234c 1336 {
3d8bc902 1337 Log(fCurrentDetector, processDCS ? "In TESTMODE - Skipping DCS processing!" : "Preprocessor does not request DCS data - Skipping DCS processing!");
9827400b 1338 }
1339 else if (fTestMode & kErrorDCS)
1340 {
3d8bc902 1341 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1342 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1343 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1344 return kFALSE;
2c15234c 1345 } else {
3301427a 1346
1347 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1348
2c15234c 1349 TString host(fConfig->GetDCSHost(fCurrentDetector));
1350 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1351
1352 // Retrieval of Aliases
1353 TObjString* anAlias = 0;
36c99a6a 1354 Int_t iAlias = 1;
1355 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
2c15234c 1356 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1357 while ((anAlias = (TObjString*) iterAliases.Next()))
1358 {
1359 TObjArray *valueSet = new TObjArray();
1360 valueSet->SetOwner(1);
1361
36c99a6a 1362 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1363 AliInfo(Form("Querying DCS archive: alias %s (%d of %d)",
1364 anAlias->GetName(), iAlias, nTotAliases));
iAlias++; // count every alias, not only the ones that are logged
2c15234c 1365 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1366
1367 if(!aDCSError)
1368 {
1369 dcsMap.Add(anAlias->Clone(), valueSet);
1370 } else {
1371 Log(fCurrentDetector,
1372 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1373 anAlias->GetName()));
1374 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1375 dcsMap.DeleteAll();
9827400b 1376 return kFALSE;
2c15234c 1377 }
4f0ab988 1378 }
2c15234c 1379
1380 // Retrieval of Data Points
1381 TObjString* aDP = 0;
36c99a6a 1382 Int_t iDP = 0;
1383 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
2c15234c 1384 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1385 while ((aDP = (TObjString*) iterDP.Next()))
1386 {
1387 TObjArray *valueSet = new TObjArray();
1388 valueSet->SetOwner(1);
36c99a6a 1389 if ((iDP % 500) == 0 || iDP + 1 == nTotDPs)
1390 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1391 aDP->GetName(), iDP + 1, nTotDPs));
iDP++; // iDP is zero-based; count every data point, not only the ones that are logged
2c15234c 1392 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1393
1394 if(!aDCSError)
1395 {
1396 dcsMap.Add(aDP->Clone(), valueSet);
1397 } else {
1398 Log(fCurrentDetector,
1399 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1400 aDP->GetName()));
1401 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1402 dcsMap.DeleteAll();
9827400b 1403 return kFALSE;
2c15234c 1404 }
73abe331 1405 }
1406 }
b948db8d 1407
2bb7b766 1408 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1409 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1410
3301427a 1411 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
b948db8d 1412
3301427a 1413 if (returnValue > 0) // Preprocessor error!
1414 {
9827400b 1415 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
cb343cfd 1416 UpdateShuttleStatus(AliShuttleStatus::kPPError);
9827400b 1417 dcsMap.DeleteAll();
1418 return kFALSE;
1419 }
1420
1421 // preprocessor ok!
1422 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1423 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1424 fCurrentDetector.Data()));
b948db8d 1425
2c15234c 1426 dcsMap.DeleteAll();
b948db8d 1427
9827400b 1428 return kTRUE;
2bb7b766 1429}
1430
1431//______________________________________________________________________________________________
1432Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1433 TObjArray& entries)
1434{
9827400b 1435 // Queries DAQ's Shuttle logbook and fills the detector status objects.
1436 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1437 //
2bb7b766 1438
fc5a4708 1439 entries.SetOwner(1);
1440
2bb7b766 1441 // check connection; connect if needed
be48e3ea 1442 if(!Connect(3)) return kFALSE;
2bb7b766 1443
1444 TString sqlQuery;
441b0e9c 1445 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1446
be48e3ea 1447 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1448 if (!aResult) {
1449 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1450 return kFALSE;
1451 }
1452
fc5a4708 1453 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1454
2bb7b766 1455 if(aResult->GetRowCount() == 0) {
9827400b 1456 AliInfo("No entries in Shuttle Logbook match request");
1457 delete aResult;
1458 return kTRUE;
2bb7b766 1459 }
1460
1461 // TODO Check field count!
fc5a4708 1462 const UInt_t nCols = 22;
2bb7b766 1463 if (aResult->GetFieldCount() != (Int_t) nCols) {
1464 AliError("Invalid SQL result field number!");
1465 delete aResult;
1466 return kFALSE;
1467 }
1468
2bb7b766 1469 TSQLRow* aRow;
1470 while ((aRow = aResult->Next())) {
1471 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1472 Int_t run = runString.Atoi();
1473
eba76848 1474 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1475 if (!entry) { // skip this run, but free its row first
1476 delete aRow; continue; }
2bb7b766 1477
1478 // loop on detectors
eba76848 1479 for(UInt_t ii = 0; ii < nCols; ii++)
1480 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1481
eba76848 1482 entries.AddLast(entry);
2bb7b766 1483 delete aRow;
1484 }
1485
2bb7b766 1486 delete aResult;
1487 return kTRUE;
1488}
1489
1490//______________________________________________________________________________________________
eba76848 1491AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1492{
eba76848 1493 //
1494 // Retrieves run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1495 //
2bb7b766 1496
1497 // check connection; connect if needed
be48e3ea 1498 if (!Connect(3))
eba76848 1499 return 0;
2bb7b766 1500
1501 TString sqlQuery;
2c15234c 1502 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1503
be48e3ea 1504 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1505 if (!aResult) {
1506 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1507 return 0;
2bb7b766 1508 }
1509
eba76848 1510 if (aResult->GetRowCount() == 0) {
2bb7b766 1511 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1512 delete aResult;
eba76848 1513 return 0;
2bb7b766 1514 }
1515
eba76848 1516 if (aResult->GetRowCount() > 1) {
2bb7b766 1517 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1518 delete aResult;
eba76848 1519 return 0;
2bb7b766 1520 }
1521
eba76848 1522 TSQLRow* aRow = aResult->Next();
1523 if (!aRow)
1524 {
1525 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1526 delete aResult;
1527 return 0;
1528 }
2bb7b766 1529
eba76848 1530 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1531
eba76848 1532 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1533 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1534
eba76848 1535 UInt_t startTime = entry->GetStartTime();
1536 UInt_t endTime = entry->GetEndTime();
1537
1538 if (!startTime || !endTime || startTime > endTime) {
1539 Log("SHUTTLE",
1540 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d",
1541 run, startTime, endTime));
1542 delete entry;
2bb7b766 1543 delete aRow;
eba76848 1544 delete aResult;
1545 return 0;
2bb7b766 1546 }
1547
eba76848 1548 delete aRow;
2bb7b766 1549 delete aResult;
eba76848 1550
1551 return entry;
2bb7b766 1552}
1553
b948db8d 1554//______________________________________________________________________________________________
2c15234c 1555Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1556 TObjArray* valueSet, DCSType type)
73abe331 1557{
9827400b 1558 // Retrieve all "entry" data points from the DCS server
1559 // host, port: TSocket connection parameters
1560 // entry: name of the alias or data point
1561 // valueSet: array of retrieved AliDCSValue's
1562 // type: kAlias or kDP
58bc3020 1563
73abe331 1564 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1565 if (!client.IsConnected())
1566 {
b948db8d 1567 return kFALSE;
73abe331 1568 }
1569
2c15234c 1570 Int_t result=0;
73abe331 1571
2c15234c 1572 if (type == kAlias)
1573 {
1574 result = client.GetAliasValues(entry,
1575 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1576 } else
1577 if (type == kDP)
1578 {
1579 result = client.GetDPValues(entry,
1580 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1581 }
1582
1583 if (result < 0)
1584 {
2bb7b766 1585 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1586 entry, AliDCSClient::GetErrorString(result)));
73abe331 1587
2c15234c 1588 if (result == AliDCSClient::fgkServerError)
1589 {
2bb7b766 1590 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1591 client.GetServerError().Data()));
1592 }
1593
1594 return kFALSE;
1595 }
1596
1597 return kTRUE;
1598}
b948db8d 1599
1600//______________________________________________________________________________________________
57f50b3c 1601const char* AliShuttle::GetFile(Int_t system, const char* detector,
1602 const char* id, const char* source)
b948db8d 1603{
9827400b 1604 // Get calibration file from file exchange servers
1605 // First queries the FXS database for the file name, using the run, detector, id and source info
1606 // then calls RetrieveFile(filename) for actual copy to local disk
1607 // run: current run being processed (given by Logbook entry fLogbookEntry)
1608 // detector: the Preprocessor name
1609 // id: provided as a parameter by the Preprocessor
1610 // source: provided by the Preprocessor through GetFileSources function
1611
1612 // check if test mode should simulate a FXS error
1613 if (fTestMode & kErrorFXSFiles)
1614 {
1615 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1616 return 0;
1617 }
1618
57f50b3c 1619 // check connection; connect if needed
9d733021 1620 if (!Connect(system))
eba76848 1621 {
9d733021 1622 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1623 return 0;
1624 }
1625
1626 // Query preparation
9d733021 1627 TString sourceName(source);
d386d623 1628 Int_t nFields = 3;
1629 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1630 fConfig->GetFXSdbTable(system));
1631 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1632 GetCurrentRun(), detector, id);
1633
9d733021 1634 if (system == kDAQ)
1635 {
d386d623 1636 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1637 }
9d733021 1638 else if (system == kDCS)
eba76848 1639 {
9d733021 1640 sourceName="none";
57f50b3c 1641 }
9d733021 1642 else if (system == kHLT)
9e080f92 1643 {
d386d623 1644 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1645 nFields = 3;
9e080f92 1646 }
1647
9e080f92 1648 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1649
1650 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1651
1652 // Query execution
1653 TSQLResult* aResult = 0;
9d733021 1654 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1655 if (!aResult) {
9d733021 1656 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1657 GetSystemName(system), id, sourceName.Data()));
9e080f92 1658 return 0;
1659 }
1660
1661 if(aResult->GetRowCount() == 0)
1662 {
1663 Log(detector,
9d733021 1664 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1665 GetSystemName(system), id, sourceName.Data()));
9e080f92 1666 delete aResult;
1667 return 0;
1668 }
2bb7b766 1669
9e080f92 1670 if (aResult->GetRowCount() > 1) {
1671 Log(detector,
9d733021 1672 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1673 GetSystemName(system), id, sourceName.Data()));
9e080f92 1674 delete aResult;
1675 return 0;
1676 }
1677
9d733021 1678 if (aResult->GetFieldCount() != nFields) {
9e080f92 1679 Log(detector,
9d733021 1680 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1681 GetSystemName(system), id, sourceName.Data()));
9e080f92 1682 delete aResult;
1683 return 0;
1684 }
1685
1686 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1687
1688 if (!aRow){
9d733021 1689 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1690 GetSystemName(system), id, sourceName.Data()));
9e080f92 1691 delete aResult;
1692 return 0;
1693 }
1694
1695 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1696 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1697 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1698
1699 delete aResult;
1700 delete aRow;
1701
d386d623 1702 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1703 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1704
9e080f92 1705 // retrieved file is renamed to make it unique
9d733021 1706 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1707 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1708
9e080f92 1709
9d733021 1710 // file retrieval from FXS
4b95672b 1711 UInt_t nRetries = 0;
1712 UInt_t maxRetries = 3;
1713 Bool_t result = kFALSE;
1714
1715 // copy!! if successful TSystem::Exec returns 0
1716 while(nRetries++ < maxRetries) {
1717 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1718 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1719 if(!result)
1720 {
1721 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
9d733021 1722 filePath.Data(), GetSystemName(system)));
4b95672b 1723 continue;
1724 } else {
1725 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1726 filePath.Data(), GetSystemName(system),
1727 GetShuttleTempDir(), localFileName.Data()));
1728 }
9e080f92 1729
d386d623 1730 if (fileChecksum.Length()>0)
4b95672b 1731 {
1732 // compare md5sum of local file with the one stored in the FXS DB
1733 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
d386d623 1734 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 1735
4b95672b 1736 if (md5Comp != 0)
1737 {
1738 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1739 filePath.Data()));
1740 result = kFALSE;
1741 continue;
1742 }
d386d623 1743 } else {
1744 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1745 filePath.Data(), GetSystemName(system)));
9d733021 1746 }
4b95672b 1747 if (result) break;
9e080f92 1748 }
1749
4b95672b 1750 if(!result) return 0;
1751
9d733021 1752 fFXSCalled[system]=kTRUE;
1753 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1754 fFXSlist[system].Add(fileParams);
9e080f92 1755
1756 static TString fullLocalFileName;
36c99a6a 1757 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1758
9e080f92 1759 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1760
1761 return fullLocalFileName.Data();
2bb7b766 1762
1763}
1764
1765//______________________________________________________________________________________________
9d733021 1766Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 1767{
9827400b 1768 //
1769 // Copies file from FXS to local Shuttle machine
1770 //
2bb7b766 1771
9e080f92 1772 // check temp directory: trying to cd to temp; if it does not exist, create it
9d733021 1773 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1774 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 1775
36c99a6a 1776 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 1777 if (dir == NULL) {
36c99a6a 1778 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1779 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
9e080f92 1780 return kFALSE;
1781 }
1782
1783 } else {
1784 gSystem->FreeDirectory(dir);
1785 }
1786
9d733021 1787 TString baseFXSFolder;
1788 if (system == kDAQ)
1789 {
1790 baseFXSFolder = "FES/";
1791 }
1792 else if (system == kDCS)
1793 {
1794 baseFXSFolder = "";
1795 }
1796 else if (system == kHLT)
1797 {
1798 baseFXSFolder = "~/";
1799 }
1800
1801
1802 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1803 fConfig->GetFXSPort(system),
1804 fConfig->GetFXSUser(system),
1805 fConfig->GetFXSHost(system),
1806 baseFXSFolder.Data(),
1807 fxsFileName,
36c99a6a 1808 GetShuttleTempDir(),
9e080f92 1809 localFileName);
1810
1811 AliDebug(2, Form("%s",command.Data()));
1812
4b95672b 1813 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 1814
4b95672b 1815 return result;
9e080f92 1816}
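
// RetrieveFile interpolates fxsFileName and localFileName directly into a shell
// command line, so a name containing spaces or shell metacharacters would be split
// or interpreted by the shell. One possible mitigation, sketched here under the
// assumption of a POSIX shell, is to single-quote each argument before formatting
// the command. ShellQuote is a hypothetical helper, not an existing ROOT or AliRoot function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: wraps `arg` in single quotes for a POSIX shell.
// An embedded single quote is emitted as '\'' (close quote, escaped quote, reopen).
inline std::string ShellQuote(const std::string& arg)
{
    std::string quoted = "'";
    for (char c : arg) {
        if (c == '\'')
            quoted += "'\\''";
        else
            quoted += c;
    }
    quoted += "'";
    return quoted;
}
```

// The quoted strings could then be passed to Form() in place of the raw names.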
1817
1818//______________________________________________________________________________________________
9d733021 1819TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1820{
9827400b 1821 //
1822 // Get sources producing the condition file Id from file exchange servers
1823 //
1824
1825 // check if test mode should simulate a FXS error
1826 if (fTestMode & kErrorFXSSources)
1827 {
1828 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1829 return 0;
1830 }
1831
9d733021 1832
1833 if (system == kDCS)
1834 {
1835 AliError("DCS system has only one source of data!");
1836 return NULL;
9d733021 1837 }
9e080f92 1838
1839 // check connection; connect if needed
9d733021 1840 if (!Connect(system))
1841 {
1842 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
1843 return NULL;
9e080f92 1844 }
1845
9d733021 1846 TString sourceName;
1847 if (system == kDAQ)
1848 {
1849 sourceName = "DAQsource";
1850 } else if (system == kHLT)
1851 {
1852 sourceName = "DDLnumbers";
1853 }
1854
d386d623 1855 TString sqlQueryStart = Form("select %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
9e080f92 1856 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1857 GetCurrentRun(), detector, id);
1858 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1859
1860 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1861
1862 // Query execution
1863 TSQLResult* aResult;
9d733021 1864 aResult = fServer[system]->Query(sqlQuery);
9e080f92 1865 if (!aResult) {
9d733021 1866 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
1867 GetSystemName(system), id));
9e080f92 1868 return 0;
1869 }
1870
9d733021 1871 if (aResult->GetRowCount() == 0)
1872 {
9e080f92 1873 Log(detector,
9d733021 1874 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
9e080f92 1875 delete aResult;
1876 return 0;
1877 }
1878
1879 TSQLRow* aRow;
1880 TList *list = new TList();
1881 list->SetOwner(1);
1882
9d733021 1883 while ((aRow = aResult->Next()))
1884 {
9e080f92 1885
9d733021 1886 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
1887 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
1888 list->Add(new TObjString(source));
9e080f92 1889 delete aRow;
1890 }
9d733021 1891
9e080f92 1892 delete aResult;
1893
1894 return list;
2bb7b766 1895}
1896
1897//______________________________________________________________________________________________
9d733021 1898Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 1899{
9827400b 1900 // Connect to MySQL Server of the system's FXS MySQL databases
1901 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
1902 //
57f50b3c 1903
9d733021 1904 // check connection: if already connected return
1905 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 1906
9d733021 1907 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 1908
9d733021 1909 if (system < 3) // FXS db servers
1910 {
1911 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
1912 dbUser = fConfig->GetFXSdbUser(system);
1913 dbPass = fConfig->GetFXSdbPass(system);
1914 dbName = fConfig->GetFXSdbName(system);
1915 } else { // Run & Shuttle logbook servers
1916 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
1917 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
1918 dbUser = fConfig->GetDAQlbUser();
1919 dbPass = fConfig->GetDAQlbPass();
1920 dbName = fConfig->GetDAQlbDB();
1921 }
57f50b3c 1922
9d733021 1923 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
1924 if (!fServer[system] || !fServer[system]->IsConnected()) {
1925 if(system < 3)
1926 {
1927 AliError(Form("Can't establish connection to FXS database for %s",
1928 AliShuttleInterface::GetSystemName(system)));
1929 } else {
1930 AliError("Can't establish connection to Run logbook.");
57f50b3c 1931 }
9d733021 1932 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; } // avoid a dangling pointer on the next Connect call
1933 return kFALSE;
2bb7b766 1934 }
57f50b3c 1935
9d733021 1936 // Get tables
1937 TSQLResult* aResult=0;
1938 switch(system){
1939 case kDAQ:
1940 aResult = fServer[kDAQ]->GetTables(dbName.Data());
1941 break;
1942 case kDCS:
1943 aResult = fServer[kDCS]->GetTables(dbName.Data());
1944 break;
1945 case kHLT:
1946 aResult = fServer[kHLT]->GetTables(dbName.Data());
1947 break;
1948 default:
1949 aResult = fServer[3]->GetTables(dbName.Data());
1950 break;
1951 }
1952
1953 delete aResult;
2bb7b766 1954 return kTRUE;
1955}
57f50b3c 1956
9e080f92 1957//______________________________________________________________________________________________
9d733021 1958Bool_t AliShuttle::UpdateTable()
9e080f92 1959{
9827400b 1960 //
1961 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
1962 //
9e080f92 1963
9d733021 1964 Bool_t result = kTRUE;
9e080f92 1965
9d733021 1966 for (UInt_t system=0; system<3; system++)
1967 {
1968 if(!fFXSCalled[system]) continue;
9e080f92 1969
9d733021 1970 // check connection; connect if needed
1971 if (!Connect(system))
1972 {
1973 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
1974 result = kFALSE;
1975 continue;
9e080f92 1976 }
9e080f92 1977
9d733021 1978 TTimeStamp now; // now
1979
1980 // Loop on FXS list entries
1981 TIter iter(&fFXSlist[system]);
1982 TObjString *aFXSentry=0;
1983 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
1984 {
1985 TString aFXSentrystr = aFXSentry->String();
1986 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
1987 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
1988 {
1989 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
1990 GetSystemName(system), aFXSentrystr.Data()));
1991 if(aFXSarray) delete aFXSarray;
1992 result = kFALSE;
1993 continue;
1994 }
1995 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
1996 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
1997
1998 TString whereClause;
1999 if (system == kDAQ)
2000 {
2001 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2002 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2003 }
2004 else if (system == kDCS)
2005 {
2006 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2007 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2008 }
2009 else if (system == kHLT)
2010 {
2011 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2012 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2013 }
2014
2015 delete aFXSarray;
9e080f92 2016
9d733021 2017 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2018 (Long_t) now.GetSec(), whereClause.Data());
9e080f92 2019
9d733021 2020 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2021
9d733021 2022 // Query execution
2023 TSQLResult* aResult;
2024 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2025 if (!aResult)
2026 {
2027 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2028 GetSystemName(system), sqlQuery.Data()));
2029 result = kFALSE;
2030 continue;
2031 }
2032 delete aResult;
9e080f92 2033 }
9e080f92 2034 }
2035
9d733021 2036 return result;
9e080f92 2037}
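
// UpdateTable recovers the file id and the source from entries of the form
// "id#!?!#source". According to the ROOT documentation, TString::Tokenize treats its
// argument as a set of delimiter characters, so an id that itself contains '#', '!'
// or '?' would yield more than two tokens and be rejected. A minimal sketch of
// splitting on the exact separator string instead is given below. SplitFXSEntry is a
// hypothetical helper written with std::string; it is not part of AliShuttle.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: splits "fileId#!?!#source" on the exact separator string.
// Returns false if the separator is missing or occurs more than once.
inline bool SplitFXSEntry(const std::string& entry, std::string& fileId, std::string& source)
{
    static const std::string kSeparator = "#!?!#";

    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;

    // a second separator after the first one makes the entry ambiguous
    if (entry.find(kSeparator, pos + kSeparator.size()) != std::string::npos) return false;

    fileId = entry.substr(0, pos);
    source = entry.substr(pos + kSeparator.size());
    return true;
}
```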
57f50b3c 2038
3301427a 2039//______________________________________________________________________________________________
2040Bool_t AliShuttle::UpdateTableFailCase()
2041{
9827400b 2042 // Update FXS table filling time_processed field in all rows corresponding to current run and detector.
2043 // This is called when the preprocessor is declared failed for the current run, because
2044 // otherwise the fields are updated only in case of success
3301427a 2045
2046 Bool_t result = kTRUE;
2047
2048 for (UInt_t system=0; system<3; system++)
2049 {
2050 // check connection; connect if needed
2051 if (!Connect(system))
2052 {
2053 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2054 GetSystemName(system)));
2055 result = kFALSE;
2056 continue;
2057 }
2058
2059 TTimeStamp now; // now
2060
2061 // No loop on FXS list entries here: update all rows of the current run and detector
2062
2063 TString whereClause = Form("where run=%d and detector=\"%s\";",
2064 GetCurrentRun(), fCurrentDetector.Data());
2065
2066
2067 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2068 (Long_t) now.GetSec(), whereClause.Data());
2069
2070 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2071
2072 // Query execution
2073 TSQLResult* aResult;
2074 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2075 if (!aResult)
2076 {
2077 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2078 GetSystemName(system), sqlQuery.Data()));
2079 result = kFALSE;
2080 continue;
2081 }
2082 delete aResult;
2083 }
2084
2085 return result;
2086}
2087
2bb7b766 2088//______________________________________________________________________________________________
2089Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2090{
e7f62f16 2091 //
2092 // Update Shuttle logbook filling detector or shuttle_done column
2093 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2094 //
57f50b3c 2095
2bb7b766 2096 // check connection; connect if needed
be48e3ea 2097 if(!Connect(3)){
2bb7b766 2098 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2099 return kFALSE;
57f50b3c 2100 }
2101
2bb7b766 2102 TString detName(detector);
2103 TString setClause;
e7f62f16 2104 if(detName == "shuttle_done")
2105 {
2bb7b766 2106 setClause = "set shuttle_done=1";
e7f62f16 2107
2108 // Send the information to ML
2109 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2110
2111 TList mlList;
2112 mlList.Add(&mlStatus);
2113
2114 fMonaLisa->SendParameters(&mlList);
2bb7b766 2115 } else {
2bb7b766 2116 TString statusStr(status);
2117 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2118 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2119 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2120 } else {
2121 Log("SHUTTLE",
2122 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2123 status, detector));
2124 return kFALSE;
2125 }
2126 }
57f50b3c 2127
2bb7b766 2128 TString whereClause = Form("where run=%d", GetCurrentRun());
2129
441b0e9c 2130 TString sqlQuery = Form("update %s %s %s",
2131 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2132
2bb7b766 2133 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2134
2135 // Query execution
2136 TSQLResult* aResult;
be48e3ea 2137 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2138 if (!aResult) {
2139 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2140 return kFALSE;
57f50b3c 2141 }
2bb7b766 2142 delete aResult;
57f50b3c 2143
2144 return kTRUE;
2145}
2146
2147//______________________________________________________________________________________________
2bb7b766 2148Int_t AliShuttle::GetCurrentRun() const
2149{
9827400b 2150 //
2151 // Get current run from logbook entry
2152 //
57f50b3c 2153
2bb7b766 2154 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2155}
2156
2157//______________________________________________________________________________________________
2bb7b766 2158UInt_t AliShuttle::GetCurrentStartTime() const
2159{
9827400b 2160 //
2161 // get current start time
2162 //
57f50b3c 2163
2bb7b766 2164 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2165}
2166
2167//______________________________________________________________________________________________
2bb7b766 2168UInt_t AliShuttle::GetCurrentEndTime() const
2169{
9827400b 2170 //
2171 // get current end time from logbook entry
2172 //
57f50b3c 2173
2bb7b766 2174 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2175}
2176
b948db8d 2177//______________________________________________________________________________________________
2178void AliShuttle::Log(const char* detector, const char* message)
2179{
9827400b 2180 //
2181 // Fill log string with a message
2182 //
b948db8d 2183
36c99a6a 2184 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2185 if (dir == NULL) {
36c99a6a 2186 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2187 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
84090f85 2188 return;
2189 }
b948db8d 2190
84090f85 2191 } else {
2192 gSystem->FreeDirectory(dir);
2193 }
b948db8d 2194
cb343cfd 2195 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2196 if (GetCurrentRun() >= 0)
2197 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2198 toLog += Form("%s", message);
2199
84090f85 2200 AliInfo(toLog.Data());
ffa29e93 2201
2202 // if we redirect the log output already to the file, leave here
2203 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2204 return;
b948db8d 2205
ffa29e93 2206 TString fileName = GetLogFileName(detector);
e7f62f16 2207
84090f85 2208 gSystem->ExpandPathName(fileName);
2209
2210 ofstream logFile;
2211 logFile.open(fileName, ofstream::out | ofstream::app);
2212
2213 if (!logFile.is_open()) {
2214 AliError(Form("Could not open file %s", fileName.Data()));
2215 return;
2216 }
7bfb2090 2217
84090f85 2218 logFile << toLog.Data() << "\n";
b948db8d 2219
84090f85 2220 logFile.close();
b948db8d 2221}
2bb7b766 2222
ffa29e93 2223//______________________________________________________________________________________________
2224TString AliShuttle::GetLogFileName(const char* detector) const
2225{
2226 //
2227 // returns the name of the log file for a given sub detector
2228 //
2229
2230 TString fileName;
2231
2232 if (GetCurrentRun() >= 0)
2233 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2234 else
2235 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2236
2237 return fileName;
2238}
2239
2bb7b766 2240//______________________________________________________________________________________________
2241Bool_t AliShuttle::Collect(Int_t run)
2242{
9827400b 2243 //
2244 // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
2245 // If a dedicated run is given this run is processed
2246 //
2247 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2248 //
2bb7b766 2249
eba76848 2250 if (run == -1)
2251 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2252 else
2253 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2254
2255 SetLastAction("Starting");
2bb7b766 2256
2257 TString whereClause("where shuttle_done=0");
eba76848 2258 if (run != -1)
2259 whereClause += Form(" and run=%d", run);
2bb7b766 2260
2261 TObjArray shuttleLogbookEntries;
be48e3ea 2262 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2263 {
cb343cfd 2264 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2265 return kFALSE;
2266 }
2267
9e080f92 2268 if (shuttleLogbookEntries.GetEntries() == 0)
2269 {
2270 if (run == -1)
2271 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2272 else
2273 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2274 "or it does not exist in Shuttle logbook", run));
2275 return kTRUE;
2276 }
2277
be48e3ea 2278 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2279 fFirstUnprocessed[iDet] = kTRUE;
2280
fc5a4708 2281 if (run != -1)
be48e3ea 2282 {
2283 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2284 // flag them into fFirstUnprocessed array
2285 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2286 TObjArray tmpLogbookEntries;
2287 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2288 {
2289 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2290 return kFALSE;
2291 }
2292
2293 TIter iter(&tmpLogbookEntries);
2294 AliShuttleLogbookEntry* anEntry = 0;
2295 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2296 {
2297 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2298 {
2299 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2300 {
2301 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2302 anEntry->GetRun(), GetDetName(iDet)));
2303 fFirstUnprocessed[iDet] = kFALSE;
2304 }
2305 }
2306
2307 }
2308
2309 }
2310
2311 if (!RetrieveConditionsData(shuttleLogbookEntries))
2312 {
cb343cfd 2313 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2314 return kFALSE;
2315 }
2316
36c99a6a 2317 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2318 return kTRUE;
2bb7b766 2319}
2320
2bb7b766 2321//______________________________________________________________________________________________
2322Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2323{
9827400b 2324 //
2325 // Retrieve conditions data for all runs that aren't processed yet
2326 //
2bb7b766 2327
2328 Bool_t hasError = kFALSE;
2329
2330 TIter iter(&dateEntries);
2331 AliShuttleLogbookEntry* anEntry;
2332
2333 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2334 if (!Process(anEntry)){
2335 hasError = kTRUE;
2336 }
4b95672b 2337
2338 // clean SHUTTLE temp directory
3301427a 2339 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2340 RemoveFile(filename.Data());
2bb7b766 2341 }
2342
2343 return hasError == kFALSE;
2344}
cb343cfd 2345
2346//______________________________________________________________________________________________
2347ULong_t AliShuttle::GetTimeOfLastAction() const
2348{
9827400b 2349 //
2350 // Gets time of last action
2351 //
2352
cb343cfd 2353 ULong_t tmp;
36c99a6a 2354
cb343cfd 2355 fMonitoringMutex->Lock();
be48e3ea 2356
cb343cfd 2357 tmp = fLastActionTime;
36c99a6a 2358
cb343cfd 2359 fMonitoringMutex->UnLock();
36c99a6a 2360
cb343cfd 2361 return tmp;
2362}
2363
2364//______________________________________________________________________________________________
2365const TString AliShuttle::GetLastAction() const
2366{
9827400b 2367 //
cb343cfd 2368 // returns a string description of the last action
9827400b 2369 //
cb343cfd 2370
2371 TString tmp;
36c99a6a 2372
cb343cfd 2373 fMonitoringMutex->Lock();
2374
2375 tmp = fLastAction;
2376
2377 fMonitoringMutex->UnLock();
2378
36c99a6a 2379 return tmp;
cb343cfd 2380}
2381
2382//______________________________________________________________________________________________
2383void AliShuttle::SetLastAction(const char* action)
2384{
9827400b 2385 //
cb343cfd 2386 // updates the monitoring variables
9827400b 2387 //
36c99a6a 2388
cb343cfd 2389 fMonitoringMutex->Lock();
36c99a6a 2390
cb343cfd 2391 fLastAction = action;
2392 fLastActionTime = time(0);
2393
2394 fMonitoringMutex->UnLock();
2395}
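
// The three monitoring accessors above pair every Lock() with an explicit UnLock().
// The same pattern can be written with a scope-bound guard so that the mutex is
// released on every return path. The sketch below uses std::mutex and
// std::lock_guard, which assumes a C++11 compiler; ActionMonitor is a hypothetical
// stand-in for the monitoring members of AliShuttle, not the actual class.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>
#include <string>

// Hypothetical stand-in for fLastAction, fLastActionTime and fMonitoringMutex.
class ActionMonitor {
public:
    ActionMonitor() : fLastActionTime(0) {}

    // updates both monitoring variables under the lock
    void SetLastAction(const std::string& action) {
        std::lock_guard<std::mutex> lock(fMutex);
        fLastAction = action;
        fLastActionTime = std::time(nullptr);
    }

    // returns a copy, so the caller never touches the shared string unlocked
    std::string GetLastAction() const {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastAction;
    }

    std::time_t GetTimeOfLastAction() const {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastActionTime;
    }

private:
    mutable std::mutex fMutex;     // mutable: locked from const getters
    std::string fLastAction;       // description of the last action
    std::time_t fLastActionTime;   // time of the last action, 0 if none yet
};
```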
eba76848 2396
2397//______________________________________________________________________________________________
2398const char* AliShuttle::GetRunParameter(const char* param)
2399{
9827400b 2400 //
2401 // returns run parameter read from DAQ logbook
2402 //
eba76848 2403
2404 if(!fLogbookEntry) {
2405 AliError("No logbook entry!");
2406 return 0;
2407 }
2408
2409 return fLogbookEntry->GetRunParameter(param);
2410}
57c1a579 2411
d386d623 2412//______________________________________________________________________________________________
9827400b 2413AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
d386d623 2414{
9827400b 2415 //
2416 // returns object from OCDB valid for current run
2417 //
d386d623 2418
9827400b 2419 if (fTestMode & kErrorOCDB)
2420 {
2421 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2422 return 0;
2423 }
2424
d386d623 2425 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2426 if (!sto)
2427 {
9827400b 2428 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
d386d623 2429 return 0;
2430 }
2431
2432 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2433}
2434
57c1a579 2435//______________________________________________________________________________________________
2436Bool_t AliShuttle::SendMail()
2437{
9827400b 2438 //
2439 // sends a mail to the subdetector expert in case of preprocessor error
2440 //
2441
2442 if (fTestMode != kNone)
2443 return kTRUE;
57c1a579 2444
36c99a6a 2445 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
57c1a579 2446 if (dir == NULL)
2447 {
36c99a6a 2448 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
57c1a579 2449 {
36c99a6a 2450 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
57c1a579 2451 return kFALSE;
2452 }
2453
2454 } else {
2455 gSystem->FreeDirectory(dir);
2456 }
2457
2458 TString bodyFileName;
36c99a6a 2459 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
57c1a579 2460 gSystem->ExpandPathName(bodyFileName);
2461
2462 ofstream mailBody;
2463 mailBody.open(bodyFileName, ofstream::out);
2464
2465 if (!mailBody.is_open())
2466 {
2467 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2468 return kFALSE;
2469 }
2470
2471 TString to="";
2472 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2473 TObjString *anExpert=0;
2474 while ((anExpert = (TObjString*) iterExperts.Next()))
2475 {
2476 to += Form("%s,", anExpert->GetName());
2477 }
2478 if (to.Length() > 0) to.Remove(to.Length()-1); // strip trailing comma, if any
909732f7 2479 AliDebug(2, Form("to: %s",to.Data()));
57c1a579 2480
36c99a6a 2481 // TODO this will be removed...
2482 if (to.Contains("not_yet_set")) {
2483 AliInfo("List of detector responsibles not yet set!");
2484 return kFALSE;
2485 }
2486
57c1a579 2487 TString cc="alberto.colla@cern.ch";
2488
2489 TString subject = Form("%s Shuttle preprocessor error in run %d !",
2490 fCurrentDetector.Data(), GetCurrentRun());
909732f7 2491 AliDebug(2, Form("subject: %s", subject.Data()));
57c1a579 2492
2493 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2494 body += Form("SHUTTLE just detected that your preprocessor "
26758fce 2495 "FAILED after %d retries in run %d!!\n\n", fConfig->GetMaxRetries(), GetCurrentRun());
57c1a579 2496 body += Form("Please check %s status on the web page asap!\n\n", fCurrentDetector.Data());
2497 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
2498
909732f7 2499 AliDebug(2, Form("Body begin: %s", body.Data()));
57c1a579 2500
2501 mailBody << body.Data();
2502 mailBody.close();
2503 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2504
9d733021 2505 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2506 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2507 if (gSystem->Exec(tailCommand.Data()))
2508 {
2509 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2510 }
2511
2512 TString endBody = Form("------------------------------------------------------\n\n");
36c99a6a 2513 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2514 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
57c1a579 2515 endBody += "Sincerely yours,\n\n \t\t\tthe SHUTTLE\n";
2516
909732f7 2517 AliDebug(2, Form("Body end: %s", endBody.Data()));
57c1a579 2518
2519 mailBody << endBody.Data();
2520
2521 mailBody.close();
2522
2523 // send mail!
2524 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2525 subject.Data(),
2526 cc.Data(),
2527 to.Data(),
2528 bodyFileName.Data());
909732f7 2529 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
57c1a579 2530
2531 Int_t result = gSystem->Exec(mailCommand.Data());
2532
2533 return result == 0;
2534}
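
// SendMail builds the recipient list by appending "address," for each expert and
// then removing the last character. A sketch of an alternative that never produces
// a trailing separator, and therefore also handles an empty list, is shown below.
// JoinRecipients is a hypothetical helper using std::vector and std::string rather
// than the ROOT containers used above; the addresses in the usage are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins addresses with commas, without a trailing comma.
inline std::string JoinRecipients(const std::vector<std::string>& addresses)
{
    std::string joined;
    for (std::size_t i = 0; i < addresses.size(); ++i) {
        if (i > 0) joined += ",";  // separator only between entries
        joined += addresses[i];
    }
    return joined;
}
```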
d386d623 2535
441b0e9c 2536//______________________________________________________________________________________________
9827400b 2537const char* AliShuttle::GetRunType()
441b0e9c 2538{
9827400b 2539 //
2540 // returns run type read from "run type" logbook
2541 //
441b0e9c 2542
2543 if(!fLogbookEntry) {
2544 AliError("No logbook entry!");
2545 return 0;
2546 }
2547
9827400b 2548 return fLogbookEntry->GetRunType();
441b0e9c 2549}
2550
d386d623 2551//______________________________________________________________________________________________
2552void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2553{
9827400b 2554 //
2555 // sets Shuttle temp directory
2556 //
d386d623 2557
2558 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2559}
2560
2561//______________________________________________________________________________________________
2562void AliShuttle::SetShuttleLogDir(const char* logDir)
2563{
9827400b 2564 //
2565 // sets Shuttle log directory
2566 //
d386d623 2567
2568 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2569}