Last moment fixes and changes from v4-05-Release (Silvia)
[u/mrichter/AliRoot.git] / SHUTTLE / AliShuttle.cxx
73abe331 1/**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3 * *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
6 * *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
15
16/*
17$Log$
8b739301 18Revision 1.41 2007/05/03 08:00:48 jgrosseo
19fixing log message when pp want to skip dcs value retrieval
20
651fdaab 21Revision 1.40 2007/04/27 07:06:48 jgrosseo
22GetFileSources returns empty list in case of no files, but successful query
23No mails sent in testmode
24
86aa42c3 25Revision 1.39 2007/04/17 12:43:57 acolla
26Correction in StoreOCDB; change of text in mail to detector expert
27
26758fce 28Revision 1.38 2007/04/12 08:26:18 jgrosseo
29updated comment
30
3c2a21c8 31Revision 1.37 2007/04/10 16:53:14 jgrosseo
32redirecting sub detector stdout, stderr to sub detector log file
33
3d8bc902 34Revision 1.35 2007/04/04 16:26:38 acolla
351. Re-organization of function calls in TestPreprocessor to make it more meaningful.
362. Added missing dependency in test preprocessors.
373. in AliShuttle.cxx: processing time and memory consumption info on a single line.
38
886d60e6 39Revision 1.34 2007/04/04 10:33:36 jgrosseo
401) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
41In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
42
432) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
44
453) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
46
474) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
48
495) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
50If you always need DCS data (like before), you do not need to implement it.
51
526) The run type has been added to the monitoring page
53
9827400b 54Revision 1.33 2007/04/03 13:56:01 acolla
55Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
56run type.
57
3301427a 58Revision 1.32 2007/02/28 10:41:56 acolla
59Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
60AliPreprocessor::GetRunType() function.
61Added some ldap definition files.
62
d386d623 63Revision 1.30 2007/02/13 11:23:21 acolla
64Moved getters and setters of Shuttle's main OCDB/Reference, local
65OCDB/Reference, temp and log folders to AliShuttleInterface
66
9d733021 67Revision 1.27 2007/01/30 17:52:42 jgrosseo
68adding monalisa monitoring
69
e7f62f16 70Revision 1.26 2007/01/23 19:20:03 acolla
71Removed old ldif files, added TOF, MCH ldif files. Added some options in
72AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
73SetShuttleLogDir
74
36c99a6a 75Revision 1.25 2007/01/15 19:13:52 acolla
76Moved some AliInfo to AliDebug in SendMail function
77
fc5a4708 78Revision 1.21 2006/12/07 08:51:26 jgrosseo
79update (alberto):
80table, db names in ldap configuration
81added GRP preprocessor
82DCS data can also be retrieved by data point
83
2c15234c 84Revision 1.20 2006/11/16 16:16:48 jgrosseo
85introducing strict run ordering flag
86removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
87
be48e3ea 88Revision 1.19 2006/11/06 14:23:04 jgrosseo
89major update (Alberto)
90o) reading of run parameters from the logbook
91o) online offline naming conversion
92o) standalone DCSclient package
93
eba76848 94Revision 1.18 2006/10/20 15:22:59 jgrosseo
95o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
96o) Merging Collect, CollectAll, CollectNew function
97o) Removing implementation of empty copy constructors (declaration still there!)
98
cb343cfd 99Revision 1.17 2006/10/05 16:20:55 jgrosseo
100adapting to new CDB classes
101
6ec0e06c 102Revision 1.16 2006/10/05 15:46:26 jgrosseo
103applying to the new interface
104
481441a2 105Revision 1.15 2006/10/02 16:38:39 jgrosseo
106update (alberto):
107fixed memory leaks
108storing of objects that failed to be stored to the grid before
109interfacing of shuttle status table in daq system
110
2bb7b766 111Revision 1.14 2006/08/29 09:16:05 jgrosseo
112small update
113
85a80aa9 114Revision 1.13 2006/08/15 10:50:00 jgrosseo
115effc++ corrections (alberto)
116
4f0ab988 117Revision 1.12 2006/08/08 14:19:29 jgrosseo
118Update to shuttle classes (Alberto)
119
120- Possibility to set the full object's path in the Preprocessor's and
121Shuttle's Store functions
122- Possibility to extend the object's run validity in the same classes
123("startValidity" and "validityInfinite" parameters)
124- Implementation of the StoreReferenceData function to store reference
125data in a dedicated CDB storage.
126
84090f85 127Revision 1.11 2006/07/21 07:37:20 jgrosseo
128last run is stored after each run
129
7bfb2090 130Revision 1.10 2006/07/20 09:54:40 jgrosseo
131introducing status management: The processing per subdetector is divided into several steps,
132after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
133can keep track of the number of failures and skips further processing after a certain threshold is
134exceeded. These thresholds can be configured in LDAP.
135
5164a766 136Revision 1.9 2006/07/19 10:09:55 jgrosseo
137new configuration, access to DAQ FES (Alberto)
138
57f50b3c 139Revision 1.8 2006/07/11 12:44:36 jgrosseo
140adding parameters for extended validity range of data produced by preprocessor
141
17111222 142Revision 1.7 2006/07/10 14:37:09 jgrosseo
143small fix + todo comment
144
e090413b 145Revision 1.6 2006/07/10 13:01:41 jgrosseo
146enhanced storing of last successfully processed run (alberto)
147
a7160fe9 148Revision 1.5 2006/07/04 14:59:57 jgrosseo
149revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
150
45a493ce 151Revision 1.4 2006/06/12 09:11:16 jgrosseo
152coding conventions (Alberto)
153
58bc3020 154Revision 1.3 2006/06/06 14:26:40 jgrosseo
155o) removed files that were moved to STEER
156o) shuttle updated to follow the new interface (Alberto)
157
b948db8d 158Revision 1.2 2006/03/07 07:52:34 hristov
159New version (B.Yordanov)
160
d477ad88 161Revision 1.6 2005/11/19 17:19:14 byordano
162RetrieveDATEEntries and RetrieveConditionsData added
163
164Revision 1.5 2005/11/19 11:09:27 byordano
165AliShuttle declaration added
166
167Revision 1.4 2005/11/17 17:47:34 byordano
168TList changed to TObjArray
169
170Revision 1.3 2005/11/17 14:43:23 byordano
171import to local CVS
172
173Revision 1.1.1.1 2005/10/28 07:33:58 hristov
174Initial import as subdirectory in AliRoot
175
73abe331 176Revision 1.2 2005/09/13 08:41:15 byordano
177default startTime endTime added
178
179Revision 1.4 2005/08/30 09:13:02 byordano
180some docs added
181
182Revision 1.3 2005/08/29 21:15:47 byordano
183some docs added
184
185*/
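// Illustrative sketch only (not part of AliShuttle): the conventions described in
// revision 1.34 of the log above, shown with a hypothetical stand-in for a
// preprocessor. The class ExamplePreprocessor, the helper Succeeded, the run type
// "PEDESTAL" and the failure codes 1 and 2 are assumptions made for this example;
// only the return-value conventions themselves are taken from the log.
#include <string>

namespace shuttle_example_revision_134 {

class ExamplePreprocessor {
public:
    explicit ExamplePreprocessor(const std::string& runType) : fRunType(runType) {}

    // Item 5 of the log: return false to skip the DCS query for certain run types.
    bool ProcessDCS() const { return fRunType != "PEDESTAL"; }

    // Item 2 of the log: 0 means success, any other value means failure and can
    // be used to keep details about the error condition.
    unsigned int Process(bool haveDcsData, bool storeOk) const
    {
        if (ProcessDCS() && !haveDcsData) return 1; // DCS data required but missing
        if (!storeOk) return 2;                     // item 1: storing reports success as a boolean
        return 0;
    }

private:
    std::string fRunType; // run type of the current run
};

// A caller following item 2 treats only 0 as success.
inline bool Succeeded(unsigned int returnCode) { return returnCode == 0; }

} // namespace shuttle_example_revision_134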
186
187//
188// This class is the main manager for AliShuttle.
189// It organizes the data retrieval from DCS and calls the
b948db8d 190// interface methods of AliPreprocessor.
73abe331 191// For every detector in the configuration (see AliShuttleConfig),
192// data for its set of aliases is retrieved. If an AliPreprocessor
b948db8d 193// is registered for this detector, it is used
194// according to the schema (see AliPreprocessor).
195// If no AliPreprocessor is registered, the retrieved
73abe331 196// data is stored automatically in the underlying AliCDBStorage.
197// The alias name is used as detSpec.
198//
199
200#include "AliShuttle.h"
201
202#include "AliCDBManager.h"
203#include "AliCDBStorage.h"
204#include "AliCDBId.h"
84090f85 205#include "AliCDBRunRange.h"
206#include "AliCDBPath.h"
5164a766 207#include "AliCDBEntry.h"
73abe331 208#include "AliShuttleConfig.h"
eba76848 209#include "DCSClient/AliDCSClient.h"
73abe331 210#include "AliLog.h"
b948db8d 211#include "AliPreprocessor.h"
5164a766 212#include "AliShuttleStatus.h"
2bb7b766 213#include "AliShuttleLogbookEntry.h"
73abe331 214
57f50b3c 215#include <TSystem.h>
58bc3020 216#include <TObject.h>
b948db8d 217#include <TString.h>
57f50b3c 218#include <TTimeStamp.h>
73abe331 219#include <TObjString.h>
57f50b3c 220#include <TSQLServer.h>
221#include <TSQLResult.h>
222#include <TSQLRow.h>
cb343cfd 223#include <TMutex.h>
9827400b 224#include <TSystemDirectory.h>
225#include <TSystemFile.h>
226#include <TFileMerger.h>
227#include <TGrid.h>
228#include <TGridResult.h>
73abe331 229
e7f62f16 230#include <TMonaLisaWriter.h>
231
5164a766 232#include <fstream>
233
cb343cfd 234#include <sys/types.h>
235#include <sys/wait.h>
236
73abe331 237ClassImp(AliShuttle)
238
b948db8d 239//______________________________________________________________________________________________
240AliShuttle::AliShuttle(const AliShuttleConfig* config,
241 UInt_t timeout, Int_t retries):
4f0ab988 242fConfig(config),
243fTimeout(timeout), fRetries(retries),
244fPreprocessorMap(),
2bb7b766 245fLogbookEntry(0),
eba76848 246fCurrentDetector(),
85a80aa9 247fStatusEntry(0),
cb343cfd 248fMonitoringMutex(0),
eba76848 249fLastActionTime(0),
e7f62f16 250fLastAction(),
9827400b 251fMonaLisa(0),
252fTestMode(kNone),
ffa29e93 253fReadTestMode(kFALSE),
254fOutputRedirected(kFALSE)
73abe331 255{
256 //
257 // config: AliShuttleConfig used
73abe331 258 // timeout: timeout used for AliDCSClient connection
259 // retries: the number of retries in case of connection error.
260 //
261
57f50b3c 262 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
be48e3ea 263 for(int iSys=0;iSys<4;iSys++) {
57f50b3c 264 fServer[iSys]=0;
be48e3ea 265 if (iSys < 3)
2c15234c 266 fFXSlist[iSys].SetOwner(kTRUE);
57f50b3c 267 }
2bb7b766 268 fPreprocessorMap.SetOwner(kTRUE);
be48e3ea 269
270 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
271 fFirstUnprocessed[iDet] = kFALSE;
272
cb343cfd 273 fMonitoringMutex = new TMutex();
58bc3020 274}
275
b948db8d 276//______________________________________________________________________________________________
57f50b3c 277AliShuttle::~AliShuttle()
58bc3020 278{
9827400b 279 //
280 // destructor
281 //
58bc3020 282
b948db8d 283 fPreprocessorMap.DeleteAll();
be48e3ea 284 for(int iSys=0;iSys<4;iSys++)
57f50b3c 285 if(fServer[iSys]) {
286 fServer[iSys]->Close();
287 delete fServer[iSys];
eba76848 288 fServer[iSys] = 0;
57f50b3c 289 }
2bb7b766 290
291 if (fStatusEntry){
292 delete fStatusEntry;
293 fStatusEntry = 0;
294 }
cb343cfd 295
296 if (fMonitoringMutex)
297 {
298 delete fMonitoringMutex;
299 fMonitoringMutex = 0;
300 }
73abe331 301}
302
b948db8d 303//______________________________________________________________________________________________
57f50b3c 304void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
58bc3020 305{
73abe331 306 //
b948db8d 307 // Registers a new AliPreprocessor.
73abe331 308 // GetName() is used as the identifier of the preprocessor.
309 // The preprocessor is registered only if no other one
310 // with the same identifier (GetName()) is already registered.
311 //
312
eba76848 313 const char* detName = preprocessor->GetName();
314 if(GetDetPos(detName) < 0)
315 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
316
317 if (fPreprocessorMap.GetValue(detName)) {
318 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
73abe331 319 return;
320 }
321
eba76848 322 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
73abe331 323}
b948db8d 324//______________________________________________________________________________________________
3301427a 325Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
84090f85 326 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
73abe331 327{
9827400b 328 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
329 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
330 // using this function. Use StoreReferenceData instead!
331 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
332 // finishes the data are transferred to the main storage (Grid).
b948db8d 333
3301427a 334 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
84090f85 335}
336
337//______________________________________________________________________________________________
3301427a 338Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
84090f85 339{
9827400b 340 // Stores a CDB object in the storage for reference data. These objects will not be available during
341 // offline reconstruction. Use this function for reference data only!
342 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
343 // finishes the data are transferred to the main storage (Grid).
85a80aa9 344
3301427a 345 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
85a80aa9 346}
347
348//______________________________________________________________________________________________
3301427a 349Bool_t AliShuttle::StoreLocally(const TString& localUri,
85a80aa9 350 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
351 Int_t validityStart, Bool_t validityInfinite)
352{
9827400b 353 // Store object temporarily in local storage. Parameters are passed by Store and StoreReferenceData functions.
354 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
355 // The parameters are:
356 // 1) Uri of the backup storage (Local)
357 // 2) the object's path.
358 // 3) the object to be stored
359 // 4) the metaData to be associated with the object
360 // 5) the validity start run number w.r.t. the current run,
361 // if the data is valid only for this run leave the default 0
362 // 6) specifies if the calibration data is valid for infinity (this means until updated),
363 // typical for calibration runs, the default is kFALSE
364 //
365 // Returns kFALSE on failure, kTRUE otherwise
84090f85 366
9827400b 367 if (fTestMode & kErrorStorage)
368 {
369 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
370 return kFALSE;
371 }
372
3301427a 373 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
2bb7b766 374
85a80aa9 375 Int_t firstRun = GetCurrentRun() - validityStart;
84090f85 376 if(firstRun < 0) {
9827400b 377 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
84090f85 378 firstRun=0;
379 }
380
381 Int_t lastRun = -1;
382 if(validityInfinite) {
383 lastRun = AliCDBRunRange::Infinity();
384 } else {
385 lastRun = GetCurrentRun();
386 }
387
3301427a 388 // Version is set to current run, it will be used later to transfer data to Grid
389 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
2bb7b766 390
391 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
392 TObjString runUsed = Form("%d", GetCurrentRun());
9e080f92 393 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
2bb7b766 394 }
84090f85 395
3301427a 396 Bool_t result = kFALSE;
84090f85 397
3301427a 398 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
399 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
84090f85 400 } else {
3301427a 401 result = AliCDBManager::Instance()->GetStorage(localUri)
84090f85 402 ->Put(object, id, metaData);
403 }
404
405 if(!result) {
406
9827400b 407 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
3301427a 408 }
2bb7b766 409
3301427a 410 return result;
411}
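
// Illustrative sketch only (not part of AliShuttle): how the validityStart and
// validityInfinite parameters of StoreLocally translate into a run range.
// ValidityRange is a hypothetical helper, and kExampleInfinity is a stand-in for
// AliCDBRunRange::Infinity(); its numeric value is an assumption for this example.
#include <utility>

namespace shuttle_example_validity {

const int kExampleInfinity = 999999999; // assumed value, not taken from AliCDBRunRange

// Returns (firstRun, lastRun) for an object produced while processing currentRun.
inline std::pair<int, int> ValidityRange(int currentRun, int validityStart, bool validityInfinite)
{
    int firstRun = currentRun - validityStart; // validityStart counts runs backwards
    if (firstRun < 0) firstRun = 0;            // clamped to 0, as StoreLocally does
    const int lastRun = validityInfinite ? kExampleInfinity : currentRun;
    return std::make_pair(firstRun, lastRun);
}

} // namespace shuttle_example_validity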
84090f85 412
3301427a 413//______________________________________________________________________________________________
414Bool_t AliShuttle::StoreOCDB()
415{
9827400b 416 //
417 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
418 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
419 // Then calls StoreRefFilesToGrid to store reference files.
420 //
421
422 if (fTestMode & kErrorGrid)
423 {
424 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
425 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
426 return kFALSE;
427 }
428
86aa42c3 429 AliInfo("Storing OCDB data ...");
430 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
431
3301427a 432 AliInfo("Storing reference data ...");
433 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
9827400b 434
435 AliInfo("Storing reference files ...");
436 Bool_t resultRefFiles = StoreRefFilesToGrid();
437
438 return resultCDB && resultRef && resultRefFiles;
3301427a 439}
440
441//______________________________________________________________________________________________
442Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
443{
444 //
445 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
446 //
447
448 TObjArray* gridIds=0;
449
450 Bool_t result = kTRUE;
451
452 const char* type = 0;
453 TString localURI;
454 if(gridURI == fgkMainCDB) {
455 type = "OCDB";
456 localURI = fgkLocalCDB;
457 } else if(gridURI == fgkMainRefStorage) {
458 type = "reference";
459 localURI = fgkLocalRefStorage;
460 } else {
461 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
462 return kFALSE;
463 }
464
465 AliCDBManager* man = AliCDBManager::Instance();
466
467 AliCDBStorage *gridSto = man->GetStorage(gridURI);
468 if(!gridSto) {
469 Log("SHUTTLE",
470 Form("StoreOCDB - cannot activate main %s storage", type));
471 return kFALSE;
472 }
473
474 gridIds = gridSto->GetQueryCDBList();
475
476 // get objects previously stored in local CDB
477 AliCDBStorage *localSto = man->GetStorage(localURI);
478 if(!localSto) {
479 Log("SHUTTLE",
480 Form("StoreOCDB - cannot activate local %s storage", type));
481 return kFALSE;
482 }
483 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
484 // Local objects were stored with current run as Grid version!
485 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
486 localEntries->SetOwner(1);
487
488 // loop on local stored objects
489 TIter localIter(localEntries);
490 AliCDBEntry *aLocEntry = 0;
491 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
492 aLocEntry->SetOwner(1);
493 AliCDBId aLocId = aLocEntry->GetId();
494 aLocEntry->SetVersion(-1);
495 aLocEntry->SetSubVersion(-1);
496
497 // If local object is valid up to infinity we store it only if it is
498 // the first unprocessed run!
499 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
500 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
501 {
502 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
503 "there are previous unprocessed runs!",
504 fCurrentDetector.Data(), aLocId.GetPath().Data()));
505 continue;
506 }
507
508 // loop on Grid valid Id's
509 Bool_t store = kTRUE;
510 TIter gridIter(gridIds);
511 AliCDBId* aGridId = 0;
512 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
513 if(aGridId->GetPath() != aLocId.GetPath()) continue;
514 // skip all objects valid up to infinity
515 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
516 // if we get here, it means there's already some more recent object stored on Grid!
517 store = kFALSE;
518 break;
519 }
520
521 // Put the object on the Grid only if no more recent object was found there
522 Bool_t storeOk = store ? gridSto->Put(aLocEntry) : kFALSE;
523 if(!store || storeOk){
524
525 if (!store)
526 {
527 Log(fCurrentDetector.Data(),
528 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
529 type, aGridId->ToString().Data()));
530 } else {
531 Log("SHUTTLE",
532 Form("StoreOCDB - Object <%s> successfully put into %s storage",
533 aLocId.ToString().Data(), type));
534 }
84090f85 535
3301427a 536 // removing local filename...
537 TString filename;
538 localSto->IdToFilename(aLocId, filename);
539 AliInfo(Form("Removing local file %s", filename.Data()));
540 RemoveFile(filename.Data());
541 continue;
542 } else {
543 Log("SHUTTLE",
544 Form("StoreOCDB - Grid %s storage of object <%s> failed",
545 type, aLocId.ToString().Data()));
546 result = kFALSE;
b948db8d 547 }
548 }
3301427a 549 localEntries->Clear();
2bb7b766 550
b948db8d 551 return result;
3301427a 552}
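
// Illustrative sketch only (not part of AliShuttle): the two checks that
// StoreOCDB(const TString&) applies to each local object, reduced to plain data.
// ExampleId, MayStore and kExampleInfinity are hypothetical names for this
// example; kExampleInfinity stands in for AliCDBRunRange::Infinity() with an
// assumed value. The sketch merges both rejection cases into one "false";
// in the function above they are handled differently (the first skips the
// object, the second also removes the local file).
#include <cstddef>
#include <string>
#include <vector>

namespace shuttle_example_storeocdb {

const int kExampleInfinity = 999999999; // assumed value

struct ExampleId {
    std::string path; // object path, e.g. "DET/Calib/Data"
    int lastRun;      // last run of the validity range
};

inline bool MayStore(const ExampleId& local, const std::vector<ExampleId>& gridIds,
                     bool firstUnprocessed)
{
    // Objects valid up to infinity are accepted only for the first unprocessed run.
    if (local.lastRun == kExampleInfinity && !firstUnprocessed) return false;

    for (std::size_t i = 0; i < gridIds.size(); ++i) {
        if (gridIds[i].path != local.path) continue;        // different object
        if (gridIds[i].lastRun == kExampleInfinity) continue; // infinite validity is skipped
        return false; // an object with the same path is already on the Grid
    }
    return true;
}

} // namespace shuttle_example_storeocdb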
553
554//______________________________________________________________________________________________
9827400b 555Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
556{
557 //
3c2a21c8 558 // Stores reference file directly (without opening it). This function stores the file locally.
9827400b 559 //
3c2a21c8 560 // The file is stored under the following location:
561 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
562 // where <gridFileName> is the last parameter given to the function
563 //
9827400b 564
565 if (fTestMode & kErrorStorage)
566 {
567 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
568 return kFALSE;
569 }
570
571 AliCDBManager* man = AliCDBManager::Instance();
572 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
573
574 TString localBaseFolder = sto->GetBaseFolder();
575
576 TString targetDir;
577 targetDir.Form("%s/%s", localBaseFolder.Data(), detector);
578
579 TString target;
580 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
581
582 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
583 if (result)
584 {
585 result = gSystem->mkdir(targetDir, kTRUE);
586 if (result != 0)
587 {
588 Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
589 return kFALSE;
590 }
591 }
592
593 result = gSystem->CopyFile(localFile, target);
594
595 if (result == 0)
596 {
597 Log("SHUTTLE", Form("StoreReferenceFile - Stored file %s locally to %s", localFile, target.Data()));
598 return kTRUE;
599 }
600 else
601 {
602 Log("SHUTTLE", Form("StoreReferenceFile - Storing file %s locally to %s failed", localFile, target.Data()));
603 return kFALSE;
604 }
605}
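
// Illustrative sketch only (not part of AliShuttle): the local target path
// <base folder>/<DET>/<RUN#>_<gridFileName> that StoreReferenceFile builds.
// LocalReferenceTarget is a hypothetical helper written with the standard
// library instead of TString; the folder, detector and file names used with it
// are example values, not real configuration.
#include <sstream>
#include <string>

namespace shuttle_example_reffile {

inline std::string LocalReferenceTarget(const std::string& baseFolder,
                                        const std::string& detector,
                                        int run,
                                        const std::string& gridFileName)
{
    std::ostringstream target;
    // The run number prefix is what StoreRefFilesToGrid later uses to select files.
    target << baseFolder << "/" << detector << "/" << run << "_" << gridFileName;
    return target.str();
}

} // namespace shuttle_example_reffile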
606
607//______________________________________________________________________________________________
608Bool_t AliShuttle::StoreRefFilesToGrid()
609{
610 //
611 // Transfers the reference files of the current run and detector to the Grid.
9827400b 612 //
86aa42c3 613 // The files are stored under the following location:
3c2a21c8 614 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
86aa42c3 615 //
9827400b 616
617 AliCDBManager* man = AliCDBManager::Instance();
618 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
619 if (!sto)
620 return kFALSE;
621 TString localBaseFolder = sto->GetBaseFolder();
622
623 TString dir;
3d8bc902 624 dir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 625
626 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
627 if (!gridSto)
628 return kFALSE;
629 TString gridBaseFolder = gridSto->GetBaseFolder();
630 TString alienDir;
3d8bc902 631 alienDir.Form("%s%s", gridBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 632
3d8bc902 633 if (!gGrid)
9827400b 634 return kFALSE;
635
9827400b 636 TString begin;
637 begin.Form("%d_", GetCurrentRun());
638
639 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
3d8bc902 640 if (!baseDir)
641 return kTRUE;
642
9827400b 643 TList* dirList = baseDir->GetListOfFiles();
644 if (!dirList)
3d8bc902 645 {
646 delete baseDir;
9827400b 647 return kTRUE;
3d8bc902 648 }
9827400b 649
650 Int_t nDirs = dirList->GetEntries();
651
652 Bool_t success = kTRUE;
3d8bc902 653 Bool_t first = kTRUE;
9827400b 654
655 for (Int_t iDir=0; iDir<nDirs; ++iDir)
656 {
657 TSystemFile* entry = dynamic_cast<TSystemFile*> (dirList->At(iDir));
658 if (!entry)
659 continue;
660
661 if (entry->IsDirectory())
662 continue;
663
664 TString fileName(entry->GetName());
665 if (!fileName.BeginsWith(begin))
666 continue;
667
3d8bc902 668 if (first)
669 {
670 first = kFALSE;
671 // check that DET folder exists, otherwise create it
672 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
673
674 if (!result)
675 return kFALSE;
676
677 if (!result->GetFileName(0))
678 {
679 if (!gGrid->Mkdir(alienDir.Data(),"",0))
680 {
681 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
682 alienDir.Data()));
683 delete baseDir;
684 return kFALSE;
685 }
686
687 }
688 }
689
9827400b 690 TString fullLocalPath;
691 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
692
693 TString fullGridPath;
694 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
695
696 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s", fullLocalPath.Data(), fullGridPath.Data()));
697
698 TFileMerger fileMerger;
699 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
700
701 if (result)
702 {
703 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded", fullLocalPath.Data(), fullGridPath.Data()));
704 RemoveFile(fullLocalPath);
705 }
706 else
707 {
708 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s failed", fullLocalPath.Data(), fullGridPath.Data()));
709 success = kFALSE;
710 }
711 }
712
713 delete baseDir;
714
715 return success;
716}
717
718//______________________________________________________________________________________________
3301427a 719void AliShuttle::CleanLocalStorage(const TString& uri)
720{
9827400b 721 //
722 // Called when the preprocessor is declared failed. Removes the remaining objects from the local storage.
723 //
3301427a 724
725 const char* type = 0;
726 if(uri == fgkLocalCDB) {
727 type = "OCDB";
728 } else if(uri == fgkLocalRefStorage) {
729 type = "reference";
730 } else {
731 AliError(Form("Invalid storage URI: %s", uri.Data()));
732 return;
733 }
734
735 AliCDBManager* man = AliCDBManager::Instance();
b948db8d 736
3301427a 737 // open local storage
738 AliCDBStorage *localSto = man->GetStorage(uri);
739 if(!localSto) {
740 Log("SHUTTLE",
741 Form("CleanLocalStorage - cannot activate local %s storage", type));
742 return;
743 }
744
745 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
746 localSto->GetBaseFolder().Data(), fCurrentDetector.Data(), GetCurrentRun()));
747
748 AliInfo(Form("filename = %s", filename.Data()));
749
750 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
751 GetCurrentRun(), fCurrentDetector.Data()));
752
753 RemoveFile(filename.Data());
754
755}
756
757//______________________________________________________________________________________________
758void AliShuttle::RemoveFile(const char* filename)
759{
9827400b 760 //
761 // removes local file
762 //
3301427a 763
764 TString command(Form("rm -f %s", filename));
765
766 Int_t result = gSystem->Exec(command.Data());
767 if(result != 0)
768 {
769 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
770 fCurrentDetector.Data(), filename));
771 }
73abe331 772}
773
b948db8d 774//______________________________________________________________________________________________
5164a766 775AliShuttleStatus* AliShuttle::ReadShuttleStatus()
776{
9827400b 777 //
778 // Reads the AliShuttleStatus from the CDB
779 //
5164a766 780
2bb7b766 781 if (fStatusEntry){
782 delete fStatusEntry;
783 fStatusEntry = 0;
784 }
5164a766 785
10a5a932 786 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
2bb7b766 787 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
5164a766 788
2bb7b766 789 if (!fStatusEntry) return 0;
790 fStatusEntry->SetOwner(1);
5164a766 791
2bb7b766 792 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
793 if (!status) {
794 AliError("Invalid object stored to CDB!");
795 return 0;
796 }
5164a766 797
2bb7b766 798 return status;
5164a766 799}
800
801//______________________________________________________________________________________________
7bfb2090 802Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
5164a766 803{
9827400b 804 //
805 // writes the status for one subdetector
806 //
2bb7b766 807
808 if (fStatusEntry){
809 delete fStatusEntry;
810 fStatusEntry = 0;
811 }
5164a766 812
2bb7b766 813 Int_t run = GetCurrentRun();
5164a766 814
2bb7b766 815 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
5164a766 816
2bb7b766 817 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
818 fStatusEntry->SetOwner(1);
5164a766 819
2bb7b766 820 Bool_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
7bfb2090 821
2bb7b766 822 if (!result) {
3301427a 823 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
824 fCurrentDetector.Data(), run));
2bb7b766 825 return kFALSE;
826 }
e7f62f16 827
828 SendMLInfo();
7bfb2090 829
2bb7b766 830 return kTRUE;
5164a766 831}
832
833//______________________________________________________________________________________________
834void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
835{
9827400b 836 //
837 // changes the AliShuttleStatus for the given detector and run to the given status
838 //
5164a766 839
2bb7b766 840 if (!fStatusEntry){
841 AliError("UNEXPECTED: fStatusEntry empty");
842 return;
843 }
5164a766 844
2bb7b766 845 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
5164a766 846
2bb7b766 847 if (!status){
3301427a 848 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
2bb7b766 849 return;
850 }
5164a766 851
2c15234c 852 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
eba76848 853 fCurrentDetector.Data(),
36c99a6a 854 status->GetStatusName(),
eba76848 855 status->GetStatusName(newStatus));
cb343cfd 856 Log("SHUTTLE", actionStr);
857 SetLastAction(actionStr);
5164a766 858
2bb7b766 859 status->SetStatus(newStatus);
860 if (increaseCount) status->IncreaseCount();
5164a766 861
2bb7b766 862 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
e7f62f16 863
864 SendMLInfo();
5164a766 865}
e7f62f16 866
867//______________________________________________________________________________________________
868void AliShuttle::SendMLInfo()
869{
870 //
871 // sends ML information about the current status of the current detector being processed
872 //
873
874 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
875
876 if (!status){
3301427a 877 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
e7f62f16 878 return;
879 }
880
881 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
882 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
883
884 TList mlList;
885 mlList.Add(&mlStatus);
886 mlList.Add(&mlRetryCount);
887
888 fMonaLisa->SendParameters(&mlList);
889}
890
5164a766 891//______________________________________________________________________________________________
892Bool_t AliShuttle::ContinueProcessing()
893{
9827400b 894 // this function reads the AliShuttleStatus information from CDB and
895 // checks if the processing should be continued
896 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 897
57c1a579 898 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
899
900 AliPreprocessor* aPreprocessor =
901 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
902 if (!aPreprocessor)
903 {
904 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
905 return kFALSE;
906 }
907
2bb7b766 908 AliShuttleLogbookEntry::Status entryStatus =
eba76848 909 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 910
911 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 912 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 913 fCurrentDetector.Data(),
914 fLogbookEntry->GetDetectorStatusName(entryStatus)));
915 return kFALSE;
916 }
917
918 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 919
920 // check if current run is first unprocessed run for current detector
921 if (fConfig->StrictRunOrder(fCurrentDetector) &&
922 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
923 {
86aa42c3 924 if (fTestMode == kNone)
925 {
926 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
927 return kFALSE;
928 }
929 else
930 {
931 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
932 }
be48e3ea 933 }
934
2bb7b766 935 AliShuttleStatus* status = ReadShuttleStatus();
936 if (!status) {
937 // first time
938 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
939 fCurrentDetector.Data()));
940 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
941 return WriteShuttleStatus(status);
942 }
943
944 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
945 // If it happens it may mean Logbook updating failed... let's do it now!
946 if (status->GetStatus() == AliShuttleStatus::kDone ||
947 status->GetStatus() == AliShuttleStatus::kFailed){
948 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
949 fCurrentDetector.Data(),
950 status->GetStatusName(status->GetStatus())));
951 UpdateShuttleLogbook(fCurrentDetector.Data(),
952 status->GetStatusName(status->GetStatus()));
953 return kFALSE;
954 }
955
3301427a 956 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 957 Log("SHUTTLE",
958 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
959 fCurrentDetector.Data()));
9827400b 960 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
961 if (StoreOCDB()){
3301427a 962 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
963 fCurrentDetector.Data()));
2bb7b766 964 UpdateShuttleStatus(AliShuttleStatus::kDone);
965 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
966 } else {
967 Log("SHUTTLE",
968 Form("ContinueProcessing - %s: Grid storage failed again",
969 fCurrentDetector.Data()));
9827400b 970 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 971 }
972 return kFALSE;
973 }
974
975 // if we get here, there is a restart
57c1a579 976 Bool_t cont = kFALSE;
2bb7b766 977
978 // abort conditions
cb343cfd 979 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 980 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
981 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 982 status->GetCount(), status->GetStatusName()));
983 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 984 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 985
986 // there may still be objects in local OCDB and reference storage
987 // and FXS databases may be not updated: do it now!
9827400b 988
989 // TODO Currently disabled, we want to keep files in case of failure!
990 // CleanLocalStorage(fgkLocalCDB);
991 // CleanLocalStorage(fgkLocalRefStorage);
992 // UpdateTableFailCase();
993
994 // Send mail to detector expert!
995 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
996 if (!SendMail())
997 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
998 fCurrentDetector.Data()));
3301427a 999
57c1a579 1000 } else {
1001 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1002 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1003 status->GetStatusName(), status->GetCount()));
9827400b 1004 Bool_t increaseCount = kTRUE;
1005 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1006 increaseCount = kFALSE;
1007 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 1008 cont = kTRUE;
2bb7b766 1009 }
1010
57c1a579 1011 return cont;
5164a766 1012}
1013
1014//______________________________________________________________________________________________
2bb7b766 1015Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 1016{
73abe331 1017 //
b948db8d 1018 // Retrieves data for all detectors in the configuration.
2bb7b766 1019 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1020 // (Unprocessed, Inactive, Failed or Done).
d477ad88 1021 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 1022 //
1023
9827400b 1024 if (!entry) return kFALSE;
2bb7b766 1025
1026 fLogbookEntry = entry;
1027
9827400b 1028 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1029 GetCurrentRun()));
2bb7b766 1030
e7f62f16 1031 // create ML instance that monitors this run
1032 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1033 // disable monitoring of other parameters that come e.g. from TFile
1034 gMonitoringWriter = 0;
2bb7b766 1035
e7f62f16 1036 // Send the information to ML
1037 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1038 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1039
1040 TList mlList;
1041 mlList.Add(&mlStatus);
9827400b 1042 mlList.Add(&mlRunType);
e7f62f16 1043
1044 fMonaLisa->SendParameters(&mlList);
3301427a 1045
9827400b 1046 if (fLogbookEntry->IsDone())
1047 {
1048 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1049 UpdateShuttleLogbook("shuttle_done");
1050 fLogbookEntry = 0;
1051 return kTRUE;
1052 }
1053
1054 // read test mode if flag is set
1055 if (fReadTestMode)
1056 {
3d8bc902 1057 fTestMode = kNone;
9827400b 1058 TString logEntry(entry->GetRunParameter("log"));
1059 //printf("log entry = %s\n", logEntry.Data());
1060 TString searchStr("Testmode: ");
1061 Int_t pos = logEntry.Index(searchStr.Data());
1062 //printf("%d\n", pos);
1063 if (pos >= 0)
1064 {
1065 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1066 //printf("%s\n", subStr.String().Data());
1067 TString newStr(subStr.Data());
1068 TObjArray* token = newStr.Tokenize(' ');
1069 if (token)
1070 {
1071 //token->Print();
1072 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1073 if (tmpStr)
1074 {
1075 Int_t testMode = tmpStr->String().Atoi();
1076 if (testMode > 0)
1077 {
1078 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1079 SetTestMode((TestMode) testMode);
1080 }
1081 }
1082 delete token;
1083 }
1084 }
1085 }
1086
3d8bc902 1087 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1088
eba76848 1089 fLogbookEntry->Print("all");
57f50b3c 1090
1091 // Initialization
d477ad88 1092 Bool_t hasError = kFALSE;
5164a766 1093
2bb7b766 1094 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1095 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1096 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1097 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1098
57f50b3c 1099 // Loop on detectors in the configuration
b948db8d 1100 TIter iter(fConfig->GetDetectors());
2bb7b766 1101 TObjString* aDetector = 0;
b948db8d 1102
be48e3ea 1103 while ((aDetector = (TObjString*) iter.Next()))
1104 {
7bfb2090 1105 fCurrentDetector = aDetector->String();
5164a766 1106
9e080f92 1107 if (ContinueProcessing() == kFALSE) continue;
1108
2bb7b766 1109 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1110 GetCurrentRun(), aDetector->GetName()));
1111
9d733021 1112 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1113
e7f62f16 1114 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1115
be48e3ea 1116 Int_t pid = fork();
1117
1118 if (pid < 0)
1119 {
1120 Log("SHUTTLE", "ERROR: Forking failed");
1121 }
1122 else if (pid > 0)
1123 {
1124 // parent
1125 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1126 GetCurrentRun(), aDetector->GetName()));
1127
1128 Long_t begin = time(0);
1129
1130 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1131 while (waitpid(pid, &status, WNOHANG) == 0)
1132 {
1133 Long_t expiredTime = time(0) - begin;
1134
1135 if (expiredTime > fConfig->GetPPTimeOut())
1136 {
9827400b 1137 TString tmp;
1138 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1139 fCurrentDetector.Data(), expiredTime);
1140 Log("SHUTTLE", tmp);
1141 Log(fCurrentDetector, tmp);
be48e3ea 1142
1143 kill(pid, 9);
1144
3301427a 1145 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1146 hasError = kTRUE;
1147
1148 gSystem->Sleep(1000);
1149 }
1150 else
1151 {
be48e3ea 1152 gSystem->Sleep(1000);
9827400b 1153
1154 TString checkStr;
1155 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1156 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1157 if (!pipe)
1158 {
1159 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1160 continue;
1161 }
1162
1163 char buffer[100];
1164 if (!fgets(buffer, 100, pipe))
1165 {
1166 Log("SHUTTLE", "Error: ps did not return anything");
1167 gSystem->ClosePipe(pipe);
1168 continue;
1169 }
1170 gSystem->ClosePipe(pipe);
1171
1172 //Log("SHUTTLE", Form("ps returned %s", buffer));
1173
1174 Int_t mem = 0;
1175 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1176 {
1177 Log("SHUTTLE", "Error: Could not parse output of ps");
1178 continue;
1179 }
1180
1181 if (expiredTime % 60 == 0)
886d60e6 1182 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1183 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1184
1185 if (mem > fConfig->GetPPMaxMem())
1186 {
1187 TString tmp;
1188 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1189 mem, fConfig->GetPPMaxMem());
1190 Log("SHUTTLE", tmp);
1191 Log(fCurrentDetector, tmp);
1192
1193 kill(pid, 9);
1194
1195 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1196 hasError = kTRUE;
1197
1198 gSystem->Sleep(1000);
1199 }
be48e3ea 1200 }
1201 }
1202
1203 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1204 GetCurrentRun(), aDetector->GetName()));
1205
1206 if (WIFEXITED(status))
1207 {
1208 Int_t returnCode = WEXITSTATUS(status);
1209
3301427a 1210 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1211 returnCode));
be48e3ea 1212
9827400b 1213 if (returnCode == 0) hasError = kTRUE; // the client exits with its Bool_t success flag, so 0 means failure
be48e3ea 1214 }
1215 }
1216 else if (pid == 0)
1217 {
1218 // client
1219 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1220
ffa29e93 1221 AliInfo("Redirecting output...");
1222
1223 if ((freopen(GetLogFileName(fCurrentDetector), "w", stdout)) == 0)
1224 {
1225 Log("SHUTTLE", "Could not freopen stdout");
1226 }
1227 else
1228 {
1229 fOutputRedirected = kTRUE;
1230 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1231 Log("SHUTTLE", "Could not redirect stderr");
1232
1233 }
1234
9827400b 1235 Bool_t success = ProcessCurrentDetector();
1236 if (success) // Preprocessor finished successfully!
1237 {
3301427a 1238 // Update time_processed field in FXS DB
1239 if (UpdateTable() == kFALSE)
1240 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1241
1242 // Transfer the data from local storage to main storage (Grid)
1243 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1244 if (StoreOCDB() == kFALSE)
1245 {
1246 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1247 GetCurrentRun(), aDetector->GetName()));
1248 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1249 success = kFALSE;
3301427a 1250 } else {
1251 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1252 GetCurrentRun(), aDetector->GetName()));
1253 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1254 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1255 }
be48e3ea 1256 }
1257
4b95672b 1258 for (UInt_t iSys=0; iSys<3; iSys++)
1259 {
1260 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1261 }
1262
be48e3ea 1263 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1264 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1265
1266 // the client exits here
9827400b 1267 gSystem->Exit(success);
be48e3ea 1268
1269 AliError("We should never get here!!!");
1270 }
7bfb2090 1271 }
5164a766 1272
2bb7b766 1273 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1274 GetCurrentRun()));
1275
1276 //check if shuttle is done for this run, if so update logbook
1277 TObjArray checkEntryArray;
1278 checkEntryArray.SetOwner(1);
9e080f92 1279 TString whereClause = Form("where run=%d", GetCurrentRun());
1280 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1281 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1282 GetCurrentRun()));
1283 return hasError == kFALSE;
1284 }
b948db8d 1285
9e080f92 1286 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1287 (checkEntryArray.At(0));
2bb7b766 1288
9e080f92 1289 if (checkEntry)
1290 {
1291 if (checkEntry->IsDone())
be48e3ea 1292 {
9e080f92 1293 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1294 UpdateShuttleLogbook("shuttle_done");
1295 }
1296 else
1297 {
1298 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1299 {
9e080f92 1300 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1301 {
9e080f92 1302 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1303 checkEntry->GetRun(), GetDetName(iDet)));
1304 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1305 }
1306 }
2bb7b766 1307 }
1308 }
1309
e7f62f16 1310 // remove ML instance
1311 delete fMonaLisa;
1312 fMonaLisa = 0;
1313
2bb7b766 1314 fLogbookEntry = 0;
85a80aa9 1315
a7160fe9 1316 return hasError == kFALSE;
73abe331 1317}
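/*
 Illustrative sketch, not part of AliShuttle: the test-mode lookup in Process() searches the
 "log" run parameter for "Testmode: " and converts the first token after it to an integer,
 enabling test mode only for values greater than zero. The same logic is shown below with
 std::string only. ParseTestMode is a hypothetical name, and returning 0 for "no test mode"
 is an assumption made for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the positive integer that follows "Testmode: "
// in logEntry, or 0 if the marker is missing or the value is not positive.
int ParseTestMode(const std::string& logEntry)
{
    const std::string marker = "Testmode: ";
    const std::string::size_type pos = logEntry.find(marker);
    if (pos == std::string::npos) return 0;

    // Skip any extra spaces after the marker, then take the first token.
    const std::string::size_type begin =
        logEntry.find_first_not_of(' ', pos + marker.size());
    if (begin == std::string::npos) return 0;

    const std::string::size_type end = logEntry.find(' ', begin);
    const std::string token =
        logEntry.substr(begin, end == std::string::npos ? end : end - begin);

    // atoi yields 0 for non-numeric text, which is treated as "not set".
    const int mode = std::atoi(token.c_str());
    return mode > 0 ? mode : 0;
}
```

 A caller would pass the log text of the run and act only on a non-zero result, as
 Process() does before calling SetTestMode.
*/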
1318
b948db8d 1319//______________________________________________________________________________________________
9827400b 1320Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1321{
1322 //
2bb7b766 1323 // Retrieves data just for a specific detector (fCurrentDetector).
73abe331 1324 // There should be a configuration for this detector.
73abe331 1325
2bb7b766 1326 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1327
2c15234c 1328 TMap dcsMap;
1329 dcsMap.SetOwner(1);
73abe331 1330
85a80aa9 1331 Bool_t aDCSError = kFALSE;
3301427a 1332
1333 // call preprocessor
1334 AliPreprocessor* aPreprocessor =
1335 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1336
1337 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1338
1339 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1340
651fdaab 1341 if (!processDCS)
1342 {
1343 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1344 }
8b739301 1345 else if (fTestMode & kSkipDCS)
2c15234c 1346 {
3d8bc902 1347 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
9827400b 1348 }
1349 else if (fTestMode & kErrorDCS)
1350 {
3d8bc902 1351 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1352 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1353 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1354 return kFALSE;
2c15234c 1355 } else {
3301427a 1356
1357 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1358
2c15234c 1359 TString host(fConfig->GetDCSHost(fCurrentDetector));
1360 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1361
1362 // Retrieval of Aliases
1363 TObjString* anAlias = 0;
36c99a6a 1364 Int_t iAlias = 1;
1365 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
2c15234c 1366 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1367 while ((anAlias = (TObjString*) iterAliases.Next()))
1368 {
1369 TObjArray *valueSet = new TObjArray();
1370 valueSet->SetOwner(1);
1371
36c99a6a 1372 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1373 AliInfo(Form("Querying DCS archive: alias %s (%d of %d)",
1374 anAlias->GetName(), iAlias, nTotAliases));
iAlias++; // advance for every alias, not only when the message is printed
2c15234c 1375 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1376
1377 if(!aDCSError)
1378 {
1379 dcsMap.Add(anAlias->Clone(), valueSet);
1380 } else {
1381 Log(fCurrentDetector,
1382 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1383 anAlias->GetName()));
1384 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1385 dcsMap.DeleteAll();
9827400b 1386 return kFALSE;
2c15234c 1387 }
4f0ab988 1388 }
2c15234c 1389
1390 // Retrieval of Data Points
1391 TObjString* aDP = 0;
36c99a6a 1392 Int_t iDP = 1;
1393 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
2c15234c 1394 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1395 while ((aDP = (TObjString*) iterDP.Next()))
1396 {
1397 TObjArray *valueSet = new TObjArray();
1398 valueSet->SetOwner(1);
36c99a6a 1399 if (((iDP-1) % 500) == 0 || iDP == nTotDPs)
1400 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1401 aDP->GetName(), iDP, nTotDPs));
iDP++; // advance for every data point, not only when the message is printed
2c15234c 1402 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1403
1404 if(!aDCSError)
1405 {
1406 dcsMap.Add(aDP->Clone(), valueSet);
1407 } else {
1408 Log(fCurrentDetector,
1409 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1410 aDP->GetName()));
1411 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1412 dcsMap.DeleteAll();
9827400b 1413 return kFALSE;
2c15234c 1414 }
73abe331 1415 }
1416 }
b948db8d 1417
2bb7b766 1418 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1419 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1420
3301427a 1421 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
b948db8d 1422
3301427a 1423 if (returnValue > 0) // Preprocessor error!
1424 {
9827400b 1425 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
cb343cfd 1426 UpdateShuttleStatus(AliShuttleStatus::kPPError);
9827400b 1427 dcsMap.DeleteAll();
1428 return kFALSE;
1429 }
1430
1431 // preprocessor ok!
1432 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1433 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1434 fCurrentDetector.Data()));
b948db8d 1435
2c15234c 1436 dcsMap.DeleteAll();
b948db8d 1437
9827400b 1438 return kTRUE;
2bb7b766 1439}
1440
1441//______________________________________________________________________________________________
1442Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1443 TObjArray& entries)
1444{
9827400b 1445 // Queries DAQ's Shuttle logbook and fills the detector status objects.
1446 // Call QueryRunParameters to query DAQ logbook for run parameters.
1447 //
2bb7b766 1448
fc5a4708 1449 entries.SetOwner(1);
1450
2bb7b766 1451 // check connection; connect if needed
be48e3ea 1452 if(!Connect(3)) return kFALSE;
2bb7b766 1453
1454 TString sqlQuery;
441b0e9c 1455 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1456
be48e3ea 1457 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1458 if (!aResult) {
1459 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1460 return kFALSE;
1461 }
1462
fc5a4708 1463 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1464
2bb7b766 1465 if(aResult->GetRowCount() == 0) {
9827400b 1466 AliInfo("No entries in Shuttle Logbook match request");
1467 delete aResult;
1468 return kTRUE;
2bb7b766 1469 }
1470
1471 // TODO Check field count!
fc5a4708 1472 const UInt_t nCols = 22;
2bb7b766 1473 if (aResult->GetFieldCount() != (Int_t) nCols) {
1474 AliError("Invalid SQL result field number!");
1475 delete aResult;
1476 return kFALSE;
1477 }
1478
2bb7b766 1479 TSQLRow* aRow;
1480 while ((aRow = aResult->Next())) {
1481 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1482 Int_t run = runString.Atoi();
1483
eba76848 1484 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1485 if (!entry)
1486 continue;
2bb7b766 1487
1488 // loop on detectors
eba76848 1489 for(UInt_t ii = 0; ii < nCols; ii++)
1490 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1491
eba76848 1492 entries.AddLast(entry);
2bb7b766 1493 delete aRow;
1494 }
1495
2bb7b766 1496 delete aResult;
1497 return kTRUE;
1498}
1499
1500//______________________________________________________________________________________________
eba76848 1501AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1502{
eba76848 1503 //
1504 // Retrieves run parameters written in the DAQ logbook and sets them into an AliShuttleLogbookEntry object
1505 //
2bb7b766 1506
1507 // check connection; connect if needed
be48e3ea 1508 if (!Connect(3))
eba76848 1509 return 0;
2bb7b766 1510
1511 TString sqlQuery;
2c15234c 1512 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1513
be48e3ea 1514 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1515 if (!aResult) {
1516 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1517 return 0;
2bb7b766 1518 }
1519
eba76848 1520 if (aResult->GetRowCount() == 0) {
2bb7b766 1521 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1522 delete aResult;
eba76848 1523 return 0;
2bb7b766 1524 }
1525
eba76848 1526 if (aResult->GetRowCount() > 1) {
2bb7b766 1527 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1528 delete aResult;
eba76848 1529 return 0;
2bb7b766 1530 }
1531
eba76848 1532 TSQLRow* aRow = aResult->Next();
1533 if (!aRow)
1534 {
1535 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1536 delete aResult;
1537 return 0;
1538 }
2bb7b766 1539
eba76848 1540 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1541
eba76848 1542 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1543 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1544
eba76848 1545 UInt_t startTime = entry->GetStartTime();
1546 UInt_t endTime = entry->GetEndTime();
1547
1548 if (!startTime || !endTime || startTime > endTime) {
1549 Log("SHUTTLE",
1550 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d",
1551 run, startTime, endTime));
1552 delete entry;
2bb7b766 1553 delete aRow;
eba76848 1554 delete aResult;
1555 return 0;
2bb7b766 1556 }
1557
eba76848 1558 delete aRow;
2bb7b766 1559 delete aResult;
eba76848 1560
1561 return entry;
2bb7b766 1562}
1563
1564//______________________________________________________________________________________________
2c15234c 1565Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1566 TObjArray* valueSet, DCSType type)
73abe331 1567{
9827400b 1568 // Retrieve all "entry" data points from the DCS server
1569 // host, port: TSocket connection parameters
1570 // entry: name of the alias or data point
1571 // valueSet: array of retrieved AliDCSValue's
1572 // type: kAlias or kDP
58bc3020 1573
73abe331 1574 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1575 if (!client.IsConnected())
1576 {
b948db8d 1577 return kFALSE;
73abe331 1578 }
1579
2c15234c 1580 Int_t result=0;
73abe331 1581
2c15234c 1582 if (type == kAlias)
1583 {
1584 result = client.GetAliasValues(entry,
1585 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1586 } else
1587 if (type == kDP)
1588 {
1589 result = client.GetDPValues(entry,
1590 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1591 }
1592
1593 if (result < 0)
1594 {
2bb7b766 1595 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1596 entry, AliDCSClient::GetErrorString(result)));
73abe331 1597
2c15234c 1598 if (result == AliDCSClient::fgkServerError)
1599 {
2bb7b766 1600 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1601 client.GetServerError().Data()));
1602 }
1603
1604 return kFALSE;
1605 }
1606
1607 return kTRUE;
1608}
b948db8d 1609
1610//______________________________________________________________________________________________
57f50b3c 1611const char* AliShuttle::GetFile(Int_t system, const char* detector,
1612 const char* id, const char* source)
b948db8d 1613{
9827400b 1614 // Get calibration file from file exchange servers
1615 // First queries the FXS database for the file name, using the run, detector, id and source info
1616 // then calls RetrieveFile(filename) for actual copy to local disk
1617 // run: current run being processed (given by Logbook entry fLogbookEntry)
1618 // detector: the Preprocessor name
1619 // id: provided as a parameter by the Preprocessor
1620 // source: provided by the Preprocessor through GetFileSources function
1621
1622 // check if test mode should simulate a FXS error
1623 if (fTestMode & kErrorFXSFiles)
1624 {
1625 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1626 return 0;
1627 }
1628
57f50b3c 1629 // check connection; connect if needed
9d733021 1630 if (!Connect(system))
eba76848 1631 {
9d733021 1632 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1633 return 0;
1634 }
1635
1636 // Query preparation
9d733021 1637 TString sourceName(source);
d386d623 1638 Int_t nFields = 3;
1639 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1640 fConfig->GetFXSdbTable(system));
1641 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1642 GetCurrentRun(), detector, id);
1643
9d733021 1644 if (system == kDAQ)
1645 {
d386d623 1646 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1647 }
9d733021 1648 else if (system == kDCS)
eba76848 1649 {
9d733021 1650 sourceName="none";
57f50b3c 1651 }
9d733021 1652 else if (system == kHLT)
9e080f92 1653 {
d386d623 1654 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1655 nFields = 3;
9e080f92 1656 }
1657
9e080f92 1658 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1659
1660 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1661
1662 // Query execution
1663 TSQLResult* aResult = 0;
9d733021 1664 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1665 if (!aResult) {
9d733021 1666 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1667 GetSystemName(system), id, sourceName.Data()));
9e080f92 1668 return 0;
1669 }
1670
1671 if(aResult->GetRowCount() == 0)
1672 {
1673 Log(detector,
9d733021 1674 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1675 GetSystemName(system), id, sourceName.Data()));
9e080f92 1676 delete aResult;
1677 return 0;
1678 }
2bb7b766 1679
9e080f92 1680 if (aResult->GetRowCount() > 1) {
1681 Log(detector,
9d733021 1682 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1683 GetSystemName(system), id, sourceName.Data()));
9e080f92 1684 delete aResult;
1685 return 0;
1686 }
1687
9d733021 1688 if (aResult->GetFieldCount() != nFields) {
9e080f92 1689 Log(detector,
9d733021 1690 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1691 GetSystemName(system), id, sourceName.Data()));
9e080f92 1692 delete aResult;
1693 return 0;
1694 }
1695
1696 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1697
1698 if (!aRow){
9d733021 1699 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1700 GetSystemName(system), id, sourceName.Data()));
9e080f92 1701 delete aResult;
1702 return 0;
1703 }
1704
1705 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1706 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1707 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1708
1709 delete aResult;
1710 delete aRow;
1711
d386d623 1712 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1713 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1714
9e080f92 1715 // retrieved file is renamed to make it unique
9d733021 1716 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1717 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1718
9e080f92 1719
9d733021 1720 // file retrieval from FXS
4b95672b 1721 UInt_t nRetries = 0;
1722 UInt_t maxRetries = 3;
1723 Bool_t result = kFALSE;
1724
1725 // copy!! if successful TSystem::Exec returns 0
1726 while(nRetries++ < maxRetries) {
1727 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1728 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1729 if(!result)
1730 {
1731 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
9d733021 1732 filePath.Data(), GetSystemName(system)));
4b95672b 1733 continue;
1734 } else {
1735 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1736 filePath.Data(), GetSystemName(system),
1737 GetShuttleTempDir(), localFileName.Data()));
1738 }
9e080f92 1739
d386d623 1740 if (fileChecksum.Length()>0)
4b95672b 1741 {
1742 // compare md5sum of local file with the one stored in the FXS DB
1743 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s |grep %s 2>&1 > /dev/null",
d386d623 1744 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 1745
4b95672b 1746 if (md5Comp != 0)
1747 {
1748 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1749 filePath.Data()));
1750 result = kFALSE;
1751 continue;
1752 }
d386d623 1753 } else {
1754 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1755 filePath.Data(), GetSystemName(system)));
9d733021 1756 }
4b95672b 1757 if (result) break;
9e080f92 1758 }
1759
4b95672b 1760 if(!result) return 0;
1761
9d733021 1762 fFXSCalled[system]=kTRUE;
1763 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1764 fFXSlist[system].Add(fileParams);
9e080f92 1765
1766 static TString fullLocalFileName;
36c99a6a 1767 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1768
9e080f92 1769 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1770
1771 return fullLocalFileName.Data();
2bb7b766 1772
1773}
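
// Illustrative sketch, not part of AliShuttle: the bounded retry loop above (copy, then
// checksum comparison, then retry on either failure) reduced to plain C++. The name
// fetchWithRetries and the two callbacks are hypothetical stand-ins for RetrieveFile
// and the md5sum check.
```cpp
#include <functional>

// Hypothetical helper: calls `fetch` up to maxRetries times and accepts an
// attempt only if `verify` also succeeds. Returns the 1-based number of the
// attempt that succeeded, or 0 if every attempt failed.
unsigned fetchWithRetries(unsigned maxRetries,
                          const std::function<bool()>& fetch,
                          const std::function<bool()>& verify)
{
    for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
        if (!fetch()) continue;   // copy failed: try again
        if (!verify()) continue;  // checksum mismatch: try again
        return attempt;           // copied and verified
    }
    return 0;
}
```
// Keeping the copy and the verification as separate steps makes it explicit that a
// checksum mismatch consumes a retry in the same way as a failed copy.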
1774
1775//______________________________________________________________________________________________
9d733021 1776Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 1777{
9827400b 1778 //
1779 // Copies file from FXS to local Shuttle machine
1780 //
2bb7b766 1781
9e080f92 1782 // check temp directory: trying to cd to temp; if it does not exist, create it
9d733021 1783 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1784 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 1785
36c99a6a 1786 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 1787 if (dir == NULL) {
36c99a6a 1788 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1789 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
9e080f92 1790 return kFALSE;
1791 }
1792
1793 } else {
1794 gSystem->FreeDirectory(dir);
1795 }
1796
9d733021 1797 TString baseFXSFolder;
1798 if (system == kDAQ)
1799 {
1800 baseFXSFolder = "FES/";
1801 }
1802 else if (system == kDCS)
1803 {
1804 baseFXSFolder = "";
1805 }
1806 else if (system == kHLT)
1807 {
1808 baseFXSFolder = "~/";
1809 }
1810
1811
1812 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1813 fConfig->GetFXSPort(system),
1814 fConfig->GetFXSUser(system),
1815 fConfig->GetFXSHost(system),
1816 baseFXSFolder.Data(),
1817 fxsFileName,
36c99a6a 1818 GetShuttleTempDir(),
9e080f92 1819 localFileName);
1820
1821 AliDebug(2, Form("%s",command.Data()));
1822
4b95672b 1823 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 1824
4b95672b 1825 return result;
9e080f92 1826}
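
// Illustrative sketch, not part of AliShuttle: how the scp command line above is put
// together from the per-system base folder and the connection settings. FxsSystem,
// baseFolder and buildScpCommand are hypothetical names; the host, user and port values
// used with them are placeholders, not real configuration.
```cpp
#include <string>

// Stand-in for the kDAQ / kDCS / kHLT constants used in the class.
enum class FxsSystem { DAQ, DCS, HLT };

// Base folder on the file exchange server, mirroring the if/else chain above.
std::string baseFolder(FxsSystem s)
{
    switch (s) {
        case FxsSystem::DAQ: return "FES/";
        case FxsSystem::DCS: return "";
        case FxsSystem::HLT: return "~/";
    }
    return "";
}

// Builds "scp -oPort=<port> -2 <user>@<host>:<base><remote> <tempDir>/<local>".
std::string buildScpCommand(FxsSystem s, int port,
                            const std::string& user, const std::string& host,
                            const std::string& remoteFile,
                            const std::string& tempDir, const std::string& localFile)
{
    return "scp -oPort=" + std::to_string(port) + " -2 " + user + "@" + host + ":" +
           baseFolder(s) + remoteFile + " " + tempDir + "/" + localFile;
}
```
// Building the string in a separate function makes the command easy to inspect or log
// before it is handed to the shell.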
1827
1828//______________________________________________________________________________________________
9d733021 1829TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1830{
9827400b 1831 //
1832 // Get sources producing the condition file Id from file exchange servers
1833 //
1834
1835 // check if test mode should simulate a FXS error
1836 if (fTestMode & kErrorFXSSources)
1837 {
1838 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1839 return 0;
1840 }
1841
9d733021 1842
1843 if (system == kDCS)
1844 {
1845 AliError("DCS system has only one source of data!");
1846 return NULL;
9d733021 1847 }
9e080f92 1848
1849 // check connection; connect if needed
9d733021 1850 if (!Connect(system))
1851 {
1852 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
1853 return NULL;
9e080f92 1854 }
1855
9d733021 1856 TString sourceName;
1857 if (system == kDAQ)
1858 {
1859 sourceName = "DAQsource";
1860 } else if (system == kHLT)
1861 {
1862 sourceName = "DDLnumbers";
1863 }
1864
d386d623 1865 TString sqlQueryStart = Form("select %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
9e080f92 1866 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1867 GetCurrentRun(), detector, id);
1868 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1869
1870 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1871
1872 // Query execution
1873 TSQLResult* aResult;
9d733021 1874 aResult = fServer[system]->Query(sqlQuery);
9e080f92 1875 if (!aResult) {
9d733021 1876 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
1877 GetSystemName(system), id));
9e080f92 1878 return 0;
1879 }
1880
86aa42c3 1881 TList *list = new TList();
1882 list->SetOwner(1);
1883
9d733021 1884 if (aResult->GetRowCount() == 0)
1885 {
9e080f92 1886 Log(detector,
9d733021 1887 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
9e080f92 1888 delete aResult;
86aa42c3 1889 return list;
9e080f92 1890 }
1891
1892 TSQLRow* aRow;
9e080f92 1893
9d733021 1894 while ((aRow = aResult->Next()))
1895 {
9e080f92 1896
9d733021 1897 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
1898 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
1899 list->Add(new TObjString(source));
9e080f92 1900 delete aRow;
1901 }
9d733021 1902
9e080f92 1903 delete aResult;
1904
1905 return list;
2bb7b766 1906}
1907
1908//______________________________________________________________________________________________
9d733021 1909Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 1910{
9827400b 1911 // Connect to MySQL Server of the system's FXS MySQL databases
1912 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
1913 //
57f50b3c 1914
9d733021 1915 // check connection: if already connected return
1916 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 1917
9d733021 1918 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 1919
9d733021 1920 if (system < 3) // FXS db servers
1921 {
1922 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
1923 dbUser = fConfig->GetFXSdbUser(system);
1924 dbPass = fConfig->GetFXSdbPass(system);
1925 dbName = fConfig->GetFXSdbName(system);
1926 } else { // Run & Shuttle logbook servers
1927 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
1928 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
1929 dbUser = fConfig->GetDAQlbUser();
1930 dbPass = fConfig->GetDAQlbPass();
1931 dbName = fConfig->GetDAQlbDB();
1932 }
57f50b3c 1933
9d733021 1934 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
1935 if (!fServer[system] || !fServer[system]->IsConnected()) {
1936 if(system < 3)
1937 {
1938 AliError(Form("Can't establish connection to FXS database for %s",
1939 AliShuttleInterface::GetSystemName(system)));
1940 } else {
1941 AliError("Can't establish connection to Run logbook.");
57f50b3c 1942 }
9d733021 1943 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; } // avoid a dangling pointer on the next Connect()
1944 return kFALSE;
2bb7b766 1945 }
57f50b3c 1946
9d733021 1947 // Get tables
1948 TSQLResult* aResult=0;
1949 switch(system){
1950 case kDAQ:
1951 aResult = fServer[kDAQ]->GetTables(dbName.Data());
1952 break;
1953 case kDCS:
1954 aResult = fServer[kDCS]->GetTables(dbName.Data());
1955 break;
1956 case kHLT:
1957 aResult = fServer[kHLT]->GetTables(dbName.Data());
1958 break;
1959 default:
1960 aResult = fServer[3]->GetTables(dbName.Data());
1961 break;
1962 }
1963
1964 delete aResult;
2bb7b766 1965 return kTRUE;
1966}
57f50b3c 1967
9e080f92 1968//______________________________________________________________________________________________
9d733021 1969Bool_t AliShuttle::UpdateTable()
9e080f92 1970{
9827400b 1971 //
1972 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
1973 //
9e080f92 1974
9d733021 1975 Bool_t result = kTRUE;
9e080f92 1976
9d733021 1977 for (UInt_t system=0; system<3; system++)
1978 {
1979 if(!fFXSCalled[system]) continue;
9e080f92 1980
9d733021 1981 // check connection; connect if needed
1982 if (!Connect(system))
1983 {
1984 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
1985 result = kFALSE;
1986 continue;
9e080f92 1987 }
9e080f92 1988
9d733021 1989 TTimeStamp now; // now
1990
1991 // Loop on FXS list entries
1992 TIter iter(&fFXSlist[system]);
1993 TObjString *aFXSentry=0;
1994 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
1995 {
1996 TString aFXSentrystr = aFXSentry->String();
1997 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
1998 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
1999 {
2000 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2001 GetSystemName(system), aFXSentrystr.Data()));
2002 if(aFXSarray) delete aFXSarray;
2003 result = kFALSE;
2004 continue;
2005 }
2006 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2007 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2008
2009 TString whereClause;
2010 if (system == kDAQ)
2011 {
2012 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2013 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2014 }
2015 else if (system == kDCS)
2016 {
2017 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2018 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2019 }
2020 else if (system == kHLT)
2021 {
2022 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2023 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2024 }
2025
2026 delete aFXSarray;
9e080f92 2027
9d733021 2028 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2029 now.GetSec(), whereClause.Data());
9e080f92 2030
9d733021 2031 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2032
9d733021 2033 // Query execution
2034 TSQLResult* aResult;
2035 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2036 if (!aResult)
2037 {
2038 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2039 GetSystemName(system), sqlQuery.Data()));
2040 result = kFALSE;
2041 continue;
2042 }
2043 delete aResult;
9e080f92 2044 }
9e080f92 2045 }
2046
9d733021 2047 return result;
9e080f92 2048}
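
// Illustrative sketch, not part of AliShuttle: splitting an "id#!?!#source" entry on the
// exact separator string. As far as the ROOT documentation describes it, TString::Tokenize
// treats its argument as a set of delimiter characters, so an id or source containing
// '#', '!' or '?' would be split into extra tokens. splitFxsEntry is a hypothetical helper
// that shows the stricter behaviour with std::string.
```cpp
#include <string>

// Splits "id#!?!#source" at the first exact occurrence of the separator.
// Returns false if the separator is missing or either part is empty.
bool splitFxsEntry(const std::string& entry, std::string& id, std::string& source)
{
    static const std::string kSeparator = "#!?!#";
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;
    id = entry.substr(0, pos);
    source = entry.substr(pos + kSeparator.size());
    return !id.empty() && !source.empty();
}
```
// With this helper an id such as "a!b" survives intact, whereas a character-set split
// would break it at the '!'.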
57f50b3c 2049
2bb7b766 2050//______________________________________________________________________________________________
3301427a 2051Bool_t AliShuttle::UpdateTableFailCase()
2052{
9827400b 2053 // Update FXS table, filling the time_processed field in all rows for the current run and detector.
2054 // This is called when the preprocessor is declared failed for the current run, because
2055 // otherwise the fields are updated only in case of success.
3301427a 2056
2057 Bool_t result = kTRUE;
2058
2059 for (UInt_t system=0; system<3; system++)
2060 {
2061 // check connection; connect if needed
2062 if (!Connect(system))
2063 {
2064 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2065 GetSystemName(system)));
2066 result = kFALSE;
2067 continue;
2068 }
2069
2070 TTimeStamp now; // now
2071
2072 // Update all rows of the current run and detector (no loop over the FXS list here)
2073
2074 TString whereClause = Form("where run=%d and detector=\"%s\";",
2075 GetCurrentRun(), fCurrentDetector.Data());
2076
2077
2078 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2079 now.GetSec(), whereClause.Data());
2080
2081 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2082
2083 // Query execution
2084 TSQLResult* aResult;
2085 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2086 if (!aResult)
2087 {
2088 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2089 GetSystemName(system), sqlQuery.Data()));
2090 result = kFALSE;
2091 continue;
2092 }
2093 delete aResult;
2094 }
2095
2096 return result;
2097}
2098
2099//______________________________________________________________________________________________
2bb7b766 2100Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2101{
e7f62f16 2102 //
2103 // Update Shuttle logbook filling detector or shuttle_done column
2104 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2105 //
57f50b3c 2106
2bb7b766 2107 // check connection; connect if needed
be48e3ea 2108 if(!Connect(3)){
2bb7b766 2109 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2110 return kFALSE;
57f50b3c 2111 }
2112
2bb7b766 2113 TString detName(detector);
2114 TString setClause;
e7f62f16 2115 if(detName == "shuttle_done")
2116 {
2bb7b766 2117 setClause = "set shuttle_done=1";
e7f62f16 2118
2119 // Send the information to ML
2120 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2121
2122 TList mlList;
2123 mlList.Add(&mlStatus);
2124
2125 fMonaLisa->SendParameters(&mlList);
2bb7b766 2126 } else {
2bb7b766 2127 TString statusStr(status);
2128 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2129 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2130 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2131 } else {
2132 Log("SHUTTLE",
2133 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2134 status, detector));
2135 return kFALSE;
2136 }
2137 }
57f50b3c 2138
2bb7b766 2139 TString whereClause = Form("where run=%d", GetCurrentRun());
2140
441b0e9c 2141 TString sqlQuery = Form("update %s %s %s",
2142 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2143
2bb7b766 2144 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2145
2146 // Query execution
2147 TSQLResult* aResult;
be48e3ea 2148 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2149 if (!aResult) {
2150 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2151 return kFALSE;
57f50b3c 2152 }
2bb7b766 2153 delete aResult;
57f50b3c 2154
2155 return kTRUE;
2156}
2157
2158//______________________________________________________________________________________________
2bb7b766 2159Int_t AliShuttle::GetCurrentRun() const
2160{
9827400b 2161 //
2162 // Get current run from logbook entry
2163 //
57f50b3c 2164
2bb7b766 2165 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2166}
2167
2168//______________________________________________________________________________________________
2bb7b766 2169UInt_t AliShuttle::GetCurrentStartTime() const
2170{
9827400b 2171 //
2172 // get current start time
2173 //
57f50b3c 2174
2bb7b766 2175 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2176}
2177
2178//______________________________________________________________________________________________
2bb7b766 2179UInt_t AliShuttle::GetCurrentEndTime() const
2180{
9827400b 2181 //
2182 // get current end time from logbook entry
2183 //
57f50b3c 2184
2bb7b766 2185 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2186}
2187
2188//______________________________________________________________________________________________
b948db8d 2189void AliShuttle::Log(const char* detector, const char* message)
2190{
9827400b 2191 //
2192 // Fill log string with a message
2193 //
b948db8d 2194
36c99a6a 2195 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2196 if (dir == NULL) {
36c99a6a 2197 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2198 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
84090f85 2199 return;
2200 }
b948db8d 2201
84090f85 2202 } else {
2203 gSystem->FreeDirectory(dir);
2204 }
b948db8d 2205
cb343cfd 2206 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2207 if (GetCurrentRun() >= 0)
2208 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2209 toLog += Form("%s", message);
2210
84090f85 2211 AliInfo(toLog.Data());
ffa29e93 2212
2213 // if we redirect the log output already to the file, leave here
2214 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2215 return;
b948db8d 2216
ffa29e93 2217 TString fileName = GetLogFileName(detector);
e7f62f16 2218
84090f85 2219 gSystem->ExpandPathName(fileName);
2220
2221 ofstream logFile;
2222 logFile.open(fileName, ofstream::out | ofstream::app);
2223
2224 if (!logFile.is_open()) {
2225 AliError(Form("Could not open file %s", fileName.Data()));
2226 return;
2227 }
7bfb2090 2228
84090f85 2229 logFile << toLog.Data() << "\n";
b948db8d 2230
84090f85 2231 logFile.close();
b948db8d 2232}
2bb7b766 2233
2bb7b766 2234//______________________________________________________________________________________________
ffa29e93 2235TString AliShuttle::GetLogFileName(const char* detector) const
2236{
2237 //
2238 // returns the name of the log file for a given sub detector
2239 //
2240
2241 TString fileName;
2242
2243 if (GetCurrentRun() >= 0)
2244 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2245 else
2246 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2247
2248 return fileName;
2249}
2250
2251//______________________________________________________________________________________________
2bb7b766 2252Bool_t AliShuttle::Collect(Int_t run)
2253{
9827400b 2254 //
2255 // Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook when run = -1 (default).
2256 // If a specific run is given, only that run is processed.
2257 //
2258 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2259 //
2bb7b766 2260
eba76848 2261 if (run == -1)
2262 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2263 else
2264 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2265
2266 SetLastAction("Starting");
2bb7b766 2267
2268 TString whereClause("where shuttle_done=0");
eba76848 2269 if (run != -1)
2270 whereClause += Form(" and run=%d", run);
2bb7b766 2271
2272 TObjArray shuttleLogbookEntries;
be48e3ea 2273 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2274 {
cb343cfd 2275 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2276 return kFALSE;
2277 }
2278
9e080f92 2279 if (shuttleLogbookEntries.GetEntries() == 0)
2280 {
2281 if (run == -1)
2282 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2283 else
2284 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2285 "or it does not exist in Shuttle logbook", run));
2286 return kTRUE;
2287 }
2288
be48e3ea 2289 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2290 fFirstUnprocessed[iDet] = kTRUE;
2291
fc5a4708 2292 if (run != -1)
be48e3ea 2293 {
2294 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2295 // flag them into fFirstUnprocessed array
2296 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2297 TObjArray tmpLogbookEntries;
2298 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2299 {
2300 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2301 return kFALSE;
2302 }
2303
2304 TIter iter(&tmpLogbookEntries);
2305 AliShuttleLogbookEntry* anEntry = 0;
2306 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2307 {
2308 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2309 {
2310 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2311 {
2312 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2313 anEntry->GetRun(), GetDetName(iDet)));
2314 fFirstUnprocessed[iDet] = kFALSE;
2315 }
2316 }
2317
2318 }
2319
2320 }
2321
2322 if (!RetrieveConditionsData(shuttleLogbookEntries))
2323 {
cb343cfd 2324 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2325 return kFALSE;
2326 }
2327
36c99a6a 2328 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2329 return kTRUE;
2bb7b766 2330}
2331
2bb7b766 2332//______________________________________________________________________________________________
2333Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2334{
9827400b 2335 //
2336 // Retrieve conditions data for all runs that aren't processed yet
2337 //
2bb7b766 2338
2339 Bool_t hasError = kFALSE;
2340
2341 TIter iter(&dateEntries);
2342 AliShuttleLogbookEntry* anEntry;
2343
2344 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2345 if (!Process(anEntry)){
2346 hasError = kTRUE;
2347 }
4b95672b 2348
2349 // clean SHUTTLE temp directory
3301427a 2350 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2351 RemoveFile(filename.Data());
2bb7b766 2352 }
2353
2354 return hasError == kFALSE;
2355}
cb343cfd 2356
2357//______________________________________________________________________________________________
2358ULong_t AliShuttle::GetTimeOfLastAction() const
2359{
9827400b 2360 //
2361 // Gets time of last action
2362 //
2363
cb343cfd 2364 ULong_t tmp;
36c99a6a 2365
cb343cfd 2366 fMonitoringMutex->Lock();
be48e3ea 2367
cb343cfd 2368 tmp = fLastActionTime;
36c99a6a 2369
cb343cfd 2370 fMonitoringMutex->UnLock();
36c99a6a 2371
cb343cfd 2372 return tmp;
2373}
2374
2375//______________________________________________________________________________________________
2376const TString AliShuttle::GetLastAction() const
2377{
9827400b 2378 //
cb343cfd 2379 // returns a string description of the last action
9827400b 2380 //
cb343cfd 2381
2382 TString tmp;
36c99a6a 2383
cb343cfd 2384 fMonitoringMutex->Lock();
2385
2386 tmp = fLastAction;
2387
2388 fMonitoringMutex->UnLock();
2389
36c99a6a 2390 return tmp;
cb343cfd 2391}
2392
2393//______________________________________________________________________________________________
2394void AliShuttle::SetLastAction(const char* action)
2395{
9827400b 2396 //
cb343cfd 2397 // updates the monitoring variables
9827400b 2398 //
36c99a6a 2399
cb343cfd 2400 fMonitoringMutex->Lock();
36c99a6a 2401
cb343cfd 2402 fLastAction = action;
2403 fLastActionTime = time(0);
2404
2405 fMonitoringMutex->UnLock();
2406}
eba76848 2407
2408//______________________________________________________________________________________________
2409const char* AliShuttle::GetRunParameter(const char* param)
2410{
9827400b 2411 //
2412 // returns run parameter read from DAQ logbook
2413 //
eba76848 2414
2415 if(!fLogbookEntry) {
2416 AliError("No logbook entry!");
2417 return 0;
2418 }
2419
2420 return fLogbookEntry->GetRunParameter(param);
2421}
57c1a579 2422
2423//______________________________________________________________________________________________
9827400b 2424AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
d386d623 2425{
9827400b 2426 //
2427 // returns object from OCDB valid for current run
2428 //
d386d623 2429
9827400b 2430 if (fTestMode & kErrorOCDB)
2431 {
2432 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2433 return 0;
2434 }
2435
d386d623 2436 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2437 if (!sto)
2438 {
9827400b 2439 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
d386d623 2440 return 0;
2441 }
2442
2443 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2444}
2445
2446//______________________________________________________________________________________________
57c1a579 2447Bool_t AliShuttle::SendMail()
2448{
9827400b 2449 //
2450 // sends a mail to the subdetector expert in case of preprocessor error
2451 //
2452
2453 if (fTestMode != kNone)
2454 return kTRUE;
57c1a579 2455
36c99a6a 2456 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
57c1a579 2457 if (dir == NULL)
2458 {
36c99a6a 2459 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
57c1a579 2460 {
36c99a6a 2461 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
57c1a579 2462 return kFALSE;
2463 }
2464
2465 } else {
2466 gSystem->FreeDirectory(dir);
2467 }
2468
2469 TString bodyFileName;
36c99a6a 2470 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
57c1a579 2471 gSystem->ExpandPathName(bodyFileName);
2472
2473 ofstream mailBody;
2474 mailBody.open(bodyFileName, ofstream::out);
2475
2476 if (!mailBody.is_open())
2477 {
2478 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2479 return kFALSE;
2480 }
2481
2482 TString to="";
2483 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2484 TObjString *anExpert=0;
2485 while ((anExpert = (TObjString*) iterExperts.Next()))
2486 {
2487 to += Form("%s,", anExpert->GetName());
2488 }
2489 if (to.Length() > 0) to.Remove(to.Length()-1); // strip trailing comma; skip if no expert was found
909732f7 2490 AliDebug(2, Form("to: %s",to.Data()));
57c1a579 2491
86aa42c3 2492 if (to.IsNull()) {
36c99a6a 2493 AliInfo("List of detector responsibles not yet set!");
2494 return kFALSE;
2495 }
2496
57c1a579 2497 TString cc="alberto.colla@cern.ch";
2498
2499 TString subject = Form("%s Shuttle preprocessor error in run %d !",
2500 fCurrentDetector.Data(), GetCurrentRun());
909732f7 2501 AliDebug(2, Form("subject: %s", subject.Data()));
57c1a579 2502
2503 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2504 body += Form("SHUTTLE just detected that your preprocessor "
86aa42c3 2505 "exited with ERROR state in run %d!!\n\n", GetCurrentRun());
57c1a579 2506 body += Form("Please check %s status on the web page asap!\n\n", fCurrentDetector.Data());
2507 body += Form("The last 10 lines of %s log file are following:\n\n", fCurrentDetector.Data());
2508
909732f7 2509 AliDebug(2, Form("Body begin: %s", body.Data()));
57c1a579 2510
2511 mailBody << body.Data();
2512 mailBody.close();
2513 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2514
9d733021 2515 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2516 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2517 if (gSystem->Exec(tailCommand.Data()))
2518 {
2519 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2520 }
2521
2522 TString endBody = Form("------------------------------------------------------\n\n");
36c99a6a 2523 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2524 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
57c1a579 2525 endBody += "Sincerely yours,\n\n \t\t\tthe SHUTTLE\n";
2526
909732f7 2527 AliDebug(2, Form("Body end: %s", endBody.Data()));
57c1a579 2528
2529 mailBody << endBody.Data();
2530
2531 mailBody.close();
2532
2533 // send mail!
2534 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2535 subject.Data(),
2536 cc.Data(),
2537 to.Data(),
2538 bodyFileName.Data());
909732f7 2539 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
57c1a579 2540
2541 Int_t result = gSystem->Exec(mailCommand.Data());
2542
2543 return result == 0;
2544}
d386d623 2545
2546//______________________________________________________________________________________________
9827400b 2547const char* AliShuttle::GetRunType()
441b0e9c 2548{
9827400b 2549 //
2550 // returns run type read from "run type" logbook
2551 //
441b0e9c 2552
2553 if(!fLogbookEntry) {
2554 AliError("No logbook entry!");
2555 return 0;
2556 }
2557
9827400b 2558 return fLogbookEntry->GetRunType();
441b0e9c 2559}
2560
2561//______________________________________________________________________________________________
d386d623 2562void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2563{
9827400b 2564 //
2565 // sets Shuttle temp directory
2566 //
d386d623 2567
2568 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2569}
2570
2571//______________________________________________________________________________________________
2572void AliShuttle::SetShuttleLogDir(const char* logDir)
2573{
9827400b 2574 //
2575 // sets Shuttle log directory
2576 //
d386d623 2577
2578 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2579}