git.uio.no Git - u/mrichter/AliRoot.git/blame - SHUTTLE/AliShuttle.cxx
fixing log message when pp want to skip dcs value retrieval
Commit Line Data
73abe331 1/**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3 * *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
6 * *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
15
16/*
17$Log$
651fdaab 18Revision 1.40 2007/04/27 07:06:48 jgrosseo
19GetFileSources returns empty list in case of no files, but successful query
20No mails sent in testmode
21
86aa42c3 22Revision 1.39 2007/04/17 12:43:57 acolla
23Correction in StoreOCDB; change of text in mail to detector expert
24
26758fce 25Revision 1.38 2007/04/12 08:26:18 jgrosseo
26updated comment
27
3c2a21c8 28Revision 1.37 2007/04/10 16:53:14 jgrosseo
29redirecting sub detector stdout, stderr to sub detector log file
30
3d8bc902 31Revision 1.35 2007/04/04 16:26:38 acolla
321. Re-organization of function calls in TestPreprocessor to make it more meaningful.
332. Added missing dependency in test preprocessors.
343. in AliShuttle.cxx: processing time and memory consumption info on a single line.
35
886d60e6 36Revision 1.34 2007/04/04 10:33:36 jgrosseo
371) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
38In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
39
402) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
41
423) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
43
444) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
45
465) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
47If you always need DCS data (like before), you do not need to implement it.
48
496) The run type has been added to the monitoring page
50
9827400b 51Revision 1.33 2007/04/03 13:56:01 acolla
52Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
53run type.
54
3301427a 55Revision 1.32 2007/02/28 10:41:56 acolla
56Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
57AliPreprocessor::GetRunType() function.
58Added some ldap definition files.
59
d386d623 60Revision 1.30 2007/02/13 11:23:21 acolla
61Moved getters and setters of Shuttle's main OCDB/Reference, local
62OCDB/Reference, temp and log folders to AliShuttleInterface
63
9d733021 64Revision 1.27 2007/01/30 17:52:42 jgrosseo
65adding monalisa monitoring
66
e7f62f16 67Revision 1.26 2007/01/23 19:20:03 acolla
68Removed old ldif files, added TOF, MCH ldif files. Added some options in
69AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
70SetShuttleLogDir
71
36c99a6a 72Revision 1.25 2007/01/15 19:13:52 acolla
73Moved some AliInfo to AliDebug in SendMail function
74
fc5a4708 75Revision 1.21 2006/12/07 08:51:26 jgrosseo
76update (alberto):
77table, db names in ldap configuration
78added GRP preprocessor
79DCS data can also be retrieved by data point
80
2c15234c 81Revision 1.20 2006/11/16 16:16:48 jgrosseo
82introducing strict run ordering flag
83removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
84
be48e3ea 85Revision 1.19 2006/11/06 14:23:04 jgrosseo
86major update (Alberto)
87o) reading of run parameters from the logbook
88o) online offline naming conversion
89o) standalone DCSclient package
90
eba76848 91Revision 1.18 2006/10/20 15:22:59 jgrosseo
92o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
93o) Merging Collect, CollectAll, CollectNew function
94o) Removing implementation of empty copy constructors (declaration still there!)
95
cb343cfd 96Revision 1.17 2006/10/05 16:20:55 jgrosseo
97adapting to new CDB classes
98
6ec0e06c 99Revision 1.16 2006/10/05 15:46:26 jgrosseo
100applying to the new interface
101
481441a2 102Revision 1.15 2006/10/02 16:38:39 jgrosseo
103update (alberto):
104fixed memory leaks
105storing of objects that failed to be stored to the grid before
106interfacing of shuttle status table in daq system
107
2bb7b766 108Revision 1.14 2006/08/29 09:16:05 jgrosseo
109small update
110
85a80aa9 111Revision 1.13 2006/08/15 10:50:00 jgrosseo
112effc++ corrections (alberto)
113
4f0ab988 114Revision 1.12 2006/08/08 14:19:29 jgrosseo
115Update to shuttle classes (Alberto)
116
117- Possibility to set the full object's path in the Preprocessor's and
118Shuttle's Store functions
119- Possibility to extend the object's run validity in the same classes
120("startValidity" and "validityInfinite" parameters)
121- Implementation of the StoreReferenceData function to store reference
122data in a dedicated CDB storage.
123
84090f85 124Revision 1.11 2006/07/21 07:37:20 jgrosseo
125last run is stored after each run
126
7bfb2090 127Revision 1.10 2006/07/20 09:54:40 jgrosseo
128introducing status management: The processing per subdetector is divided into several steps,
129after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
130can keep track of the number of failures and skips further processing after a certain threshold is
131exceeded. These thresholds can be configured in LDAP.
132
5164a766 133Revision 1.9 2006/07/19 10:09:55 jgrosseo
134new configuration, access to DAQ FES (Alberto)
135
57f50b3c 136Revision 1.8 2006/07/11 12:44:36 jgrosseo
137adding parameters for extended validity range of data produced by preprocessor
138
17111222 139Revision 1.7 2006/07/10 14:37:09 jgrosseo
140small fix + todo comment
141
e090413b 142Revision 1.6 2006/07/10 13:01:41 jgrosseo
143enhanced storing of last successfully processed run (alberto)
144
a7160fe9 145Revision 1.5 2006/07/04 14:59:57 jgrosseo
146revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
147
45a493ce 148Revision 1.4 2006/06/12 09:11:16 jgrosseo
149coding conventions (Alberto)
150
58bc3020 151Revision 1.3 2006/06/06 14:26:40 jgrosseo
152o) removed files that were moved to STEER
153o) shuttle updated to follow the new interface (Alberto)
154
b948db8d 155Revision 1.2 2006/03/07 07:52:34 hristov
156New version (B.Yordanov)
157
d477ad88 158Revision 1.6 2005/11/19 17:19:14 byordano
159RetrieveDATEEntries and RetrieveConditionsData added
160
161Revision 1.5 2005/11/19 11:09:27 byordano
162AliShuttle declaration added
163
164Revision 1.4 2005/11/17 17:47:34 byordano
165TList changed to TObjArray
166
167Revision 1.3 2005/11/17 14:43:23 byordano
168import to local CVS
169
170Revision 1.1.1.1 2005/10/28 07:33:58 hristov
171Initial import as subdirectory in AliRoot
172
73abe331 173Revision 1.2 2005/09/13 08:41:15 byordano
174default startTime endTime added
175
176Revision 1.4 2005/08/30 09:13:02 byordano
177some docs added
178
179Revision 1.3 2005/08/29 21:15:47 byordano
180some docs added
181
182*/
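The 1.18 log entry above says that the Shuttle forks and the parent process monitors the child, so that a preprocessor can be stopped after a timeout; the 1.34 entry defines a return code of 0 as success. The following is a minimal POSIX sketch of that pattern, not the code used in AliShuttle. `RunWithTimeout` and the task functions are hypothetical names, the one-second polling interval is an assumption, and the memory check from the 1.34 entry is left out.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper, not part of AliShuttle: runs `task` in a forked child
// and kills the child if it is still running after `timeoutSeconds`.
// Returns the child's exit code (0 = success), or -1 on timeout, fork failure
// or abnormal termination.
int RunWithTimeout(int (*task)(), unsigned timeoutSeconds)
{
  pid_t pid = fork();
  if (pid < 0) return -1;        // fork failed
  if (pid == 0) _exit(task());   // child: run the task, report via exit code

  // parent: poll the child once per second (assumed interval)
  for (unsigned waited = 0; ; ++waited) {
    int status = 0;
    pid_t done = waitpid(pid, &status, WNOHANG);
    if (done == pid)
      return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    if (done < 0) return -1;     // waitpid error
    if (waited >= timeoutSeconds) {
      kill(pid, SIGKILL);        // timeout: terminate the child
      waitpid(pid, &status, 0);  // reap it so no zombie is left behind
      return -1;
    }
    sleep(1);
  }
}

// Illustrative tasks standing in for a preprocessor run.
int QuickTask()   { return 0; }
int FailingTask() { return 3; }
int HangingTask() { sleep(30); return 0; }
```

A caller would treat any non-zero return value as a failure, in line with the return-code convention described in the 1.34 entry. Note that exit codes are truncated to 8 bits by the operating system.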
183
184//
185// This class is the main manager for AliShuttle.
186// It organizes the data retrieval from DCS and calls the
b948db8d 187// interface methods of AliPreprocessor.
73abe331 188// For every detector listed in the configuration (see AliShuttleConfig),
189// data for its set of aliases is retrieved. If there is a registered
b948db8d 190// AliPreprocessor for this detector then it will be used
191// according to the schema (see AliPreprocessor).
192// If there isn't a registered AliPreprocessor then the retrieved
73abe331 193// data is stored automatically to the underlying AliCDBStorage.
194// The alias name is used as detSpec.
195//
196
197#include "AliShuttle.h"
198
199#include "AliCDBManager.h"
200#include "AliCDBStorage.h"
201#include "AliCDBId.h"
84090f85 202#include "AliCDBRunRange.h"
203#include "AliCDBPath.h"
5164a766 204#include "AliCDBEntry.h"
73abe331 205#include "AliShuttleConfig.h"
eba76848 206#include "DCSClient/AliDCSClient.h"
73abe331 207#include "AliLog.h"
b948db8d 208#include "AliPreprocessor.h"
5164a766 209#include "AliShuttleStatus.h"
2bb7b766 210#include "AliShuttleLogbookEntry.h"
73abe331 211
57f50b3c 212#include <TSystem.h>
58bc3020 213#include <TObject.h>
b948db8d 214#include <TString.h>
57f50b3c 215#include <TTimeStamp.h>
73abe331 216#include <TObjString.h>
57f50b3c 217#include <TSQLServer.h>
218#include <TSQLResult.h>
219#include <TSQLRow.h>
cb343cfd 220#include <TMutex.h>
9827400b 221#include <TSystemDirectory.h>
222#include <TSystemFile.h>
223#include <TFileMerger.h>
224#include <TGrid.h>
225#include <TGridResult.h>
73abe331 226
e7f62f16 227#include <TMonaLisaWriter.h>
228
5164a766 229#include <fstream>
230
cb343cfd 231#include <sys/types.h>
232#include <sys/wait.h>
233
73abe331 234ClassImp(AliShuttle)
235
b948db8d 236//______________________________________________________________________________________________
237AliShuttle::AliShuttle(const AliShuttleConfig* config,
238 UInt_t timeout, Int_t retries):
4f0ab988 239fConfig(config),
240fTimeout(timeout), fRetries(retries),
241fPreprocessorMap(),
2bb7b766 242fLogbookEntry(0),
eba76848 243fCurrentDetector(),
85a80aa9 244fStatusEntry(0),
cb343cfd 245fMonitoringMutex(0),
eba76848 246fLastActionTime(0),
e7f62f16 247fLastAction(),
9827400b 248fMonaLisa(0),
249fTestMode(kNone),
ffa29e93 250fReadTestMode(kFALSE),
251fOutputRedirected(kFALSE)
73abe331 252{
253 //
254 // config: AliShuttleConfig used
73abe331 255 // timeout: timeout used for AliDCSClient connection
256 // retries: the number of retries in case of connection error.
257 //
258
57f50b3c 259 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
be48e3ea 260 for(int iSys=0;iSys<4;iSys++) {
57f50b3c 261 fServer[iSys]=0;
be48e3ea 262 if (iSys < 3)
2c15234c 263 fFXSlist[iSys].SetOwner(kTRUE);
57f50b3c 264 }
2bb7b766 265 fPreprocessorMap.SetOwner(kTRUE);
be48e3ea 266
267 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
268 fFirstUnprocessed[iDet] = kFALSE;
269
cb343cfd 270 fMonitoringMutex = new TMutex();
58bc3020 271}
272
b948db8d 273//______________________________________________________________________________________________
57f50b3c 274AliShuttle::~AliShuttle()
58bc3020 275{
9827400b 276 //
277 // destructor
278 //
58bc3020 279
b948db8d 280 fPreprocessorMap.DeleteAll();
be48e3ea 281 for(int iSys=0;iSys<4;iSys++)
57f50b3c 282 if(fServer[iSys]) {
283 fServer[iSys]->Close();
284 delete fServer[iSys];
eba76848 285 fServer[iSys] = 0;
57f50b3c 286 }
2bb7b766 287
288 if (fStatusEntry){
289 delete fStatusEntry;
290 fStatusEntry = 0;
291 }
cb343cfd 292
293 if (fMonitoringMutex)
294 {
295 delete fMonitoringMutex;
296 fMonitoringMutex = 0;
297 }
73abe331 298}
299
b948db8d 300//______________________________________________________________________________________________
57f50b3c 301void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
58bc3020 302{
73abe331 303 //
b948db8d 304 // Registers new AliPreprocessor.
73abe331 305 // It uses GetName() as the identifier of the preprocessor.
306 // The preprocessor is registered only if there isn't any other
307 // with the same identifier (GetName()).
308 //
309
eba76848 310 const char* detName = preprocessor->GetName();
311 if(GetDetPos(detName) < 0)
312 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
313
314 if (fPreprocessorMap.GetValue(detName)) {
315 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
73abe331 316 return;
317 }
318
eba76848 319 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
73abe331 320}
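RegisterPreprocessor keeps one preprocessor per detector name and ignores a second registration under the same name. A minimal sketch of that idea with standard containers follows; `Preprocessor` and `Registry` are hypothetical stand-ins, and unlike fPreprocessorMap this registry does not take ownership of the registered objects and does not validate the detector name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for AliPreprocessor; only the name matters here.
struct Preprocessor {
  std::string name;
};

// Name-keyed registry that rejects duplicates.
class Registry {
public:
  // Returns true if the preprocessor was registered,
  // false if another one with the same name is already present.
  bool Register(Preprocessor* p) {
    return fMap.insert(std::make_pair(p->name, p)).second;
  }

  // Returns the registered preprocessor, or a null pointer if there is none.
  Preprocessor* Find(const std::string& name) const {
    std::map<std::string, Preprocessor*>::const_iterator it = fMap.find(name);
    return it == fMap.end() ? nullptr : it->second;
  }

private:
  std::map<std::string, Preprocessor*> fMap;
};
```

The first registration wins, which matches the early return after the "already registered" warning above.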
b948db8d 321//______________________________________________________________________________________________
3301427a 322Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
84090f85 323 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
73abe331 324{
9827400b 325 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
326 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
327 // using this function. Use StoreReferenceData instead!
328 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
329 // finishes the data are transferred to the main storage (Grid).
b948db8d 330
3301427a 331 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
84090f85 332}
333
334//______________________________________________________________________________________________
3301427a 335Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
84090f85 336{
9827400b 337 // Stores a CDB object in the storage for reference data. These objects will not be available during
338 // offline reconstruction. Use this function for reference data only!
339 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
340 // finishes the data are transferred to the main storage (Grid).
85a80aa9 341
3301427a 342 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
85a80aa9 343}
344
345//______________________________________________________________________________________________
3301427a 346Bool_t AliShuttle::StoreLocally(const TString& localUri,
85a80aa9 347 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
348 Int_t validityStart, Bool_t validityInfinite)
349{
9827400b 350 // Store object temporarily in local storage. Parameters are passed by Store and StoreReferenceData functions.
351 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
352 // The parameters are:
353 // 1) Uri of the backup storage (Local)
354 // 2) the object's path.
355 // 3) the object to be stored
356 // 4) the metaData to be associated with the object
357 // 5) the validity start run number w.r.t. the current run,
358 // if the data is valid only for this run leave the default 0
359 // 6) specifies if the calibration data is valid for infinity (this means until updated),
360 // typical for calibration runs, the default is kFALSE
361 //
362 // Returns kFALSE on failure, kTRUE otherwise.
84090f85 363
9827400b 364 if (fTestMode & kErrorStorage)
365 {
366 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
367 return kFALSE;
368 }
369
3301427a 370 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
2bb7b766 371
85a80aa9 372 Int_t firstRun = GetCurrentRun() - validityStart;
84090f85 373 if(firstRun < 0) {
9827400b 374 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
84090f85 375 firstRun=0;
376 }
377
378 Int_t lastRun = -1;
379 if(validityInfinite) {
380 lastRun = AliCDBRunRange::Infinity();
381 } else {
382 lastRun = GetCurrentRun();
383 }
384
3301427a 385 // Version is set to current run, it will be used later to transfer data to Grid
386 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
2bb7b766 387
388 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
389 TObjString runUsed = Form("%d", GetCurrentRun());
9e080f92 390 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
2bb7b766 391 }
84090f85 392
3301427a 393 Bool_t result = kFALSE;
84090f85 394
3301427a 395 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
396 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
84090f85 397 } else {
3301427a 398 result = AliCDBManager::Instance()->GetStorage(localUri)
84090f85 399 ->Put(object, id, metaData);
400 }
401
402 if(!result) {
403
9827400b 404 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
3301427a 405 }
2bb7b766 406
3301427a 407 return result;
408}
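The run range computed in StoreLocally can be isolated into a small function: the first valid run is the current run minus validityStart, clamped to 0, and the last valid run is either the current run or "infinity". The sketch below mirrors only that arithmetic. `RunRange`, `ValidityRange` and `kInfiniteRun` are hypothetical names, and the value used for infinity is an assumption, since AliCDBRunRange::Infinity() is not defined in this file.

```cpp
#include <cassert>
#include <limits>

// Assumption: stand-in for AliCDBRunRange::Infinity(); the real value is not shown here.
const int kInfiniteRun = std::numeric_limits<int>::max();

struct RunRange {
  int first;  // first run for which the object is valid
  int last;   // last run for which the object is valid
};

// Computes the validity range of an object stored during `currentRun`.
RunRange ValidityRange(int currentRun, int validityStart, bool validityInfinite)
{
  RunRange range;
  range.first = currentRun - validityStart;
  if (range.first < 0) range.first = 0;  // same clamping as in StoreLocally
  range.last = validityInfinite ? kInfiniteRun : currentRun;
  return range;
}
```

With the defaults (validityStart = 0, validityInfinite = false) the range collapses to the current run only.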
84090f85 409
3301427a 410//______________________________________________________________________________________________
411Bool_t AliShuttle::StoreOCDB()
412{
9827400b 413 //
414 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
415 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
416 // Then calls StoreRefFilesToGrid to store reference files.
417 //
418
419 if (fTestMode & kErrorGrid)
420 {
421 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
422 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
423 return kFALSE;
424 }
425
86aa42c3 426 AliInfo("Storing OCDB data ...");
427 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
428
3301427a 429 AliInfo("Storing reference data ...");
430 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
9827400b 431
432 AliInfo("Storing reference files ...");
433 Bool_t resultRefFiles = StoreRefFilesToGrid();
434
435 return resultCDB && resultRef && resultRefFiles;
3301427a 436}
437
438//______________________________________________________________________________________________
439Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
440{
441 //
442 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
443 //
444
445 TObjArray* gridIds=0;
446
447 Bool_t result = kTRUE;
448
449 const char* type = 0;
450 TString localURI;
451 if(gridURI == fgkMainCDB) {
452 type = "OCDB";
453 localURI = fgkLocalCDB;
454 } else if(gridURI == fgkMainRefStorage) {
455 type = "reference";
456 localURI = fgkLocalRefStorage;
457 } else {
458 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
459 return kFALSE;
460 }
461
462 AliCDBManager* man = AliCDBManager::Instance();
463
464 AliCDBStorage *gridSto = man->GetStorage(gridURI);
465 if(!gridSto) {
466 Log("SHUTTLE",
467 Form("StoreOCDB - cannot activate main %s storage", type));
468 return kFALSE;
469 }
470
471 gridIds = gridSto->GetQueryCDBList();
472
473 // get objects previously stored in local CDB
474 AliCDBStorage *localSto = man->GetStorage(localURI);
475 if(!localSto) {
476 Log("SHUTTLE",
477 Form("StoreOCDB - cannot activate local %s storage", type));
478 return kFALSE;
479 }
480 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
481 // Local objects were stored with current run as Grid version!
482 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
483 localEntries->SetOwner(1);
484
485 // loop on local stored objects
486 TIter localIter(localEntries);
487 AliCDBEntry *aLocEntry = 0;
488 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
489 aLocEntry->SetOwner(1);
490 AliCDBId aLocId = aLocEntry->GetId();
491 aLocEntry->SetVersion(-1);
492 aLocEntry->SetSubVersion(-1);
493
494 // If local object is valid up to infinity we store it only if it is
495 // the first unprocessed run!
496 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
497 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
498 {
499 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
500 "there are previous unprocessed runs!",
501 fCurrentDetector.Data(), aLocId.GetPath().Data()));
502 continue;
503 }
504
505 // loop on Grid valid Id's
506 Bool_t store = kTRUE;
507 TIter gridIter(gridIds);
508 AliCDBId* aGridId = 0;
509 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
510 if(aGridId->GetPath() != aLocId.GetPath()) continue;
511 // skip all objects valid up to infinity
512 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
513 // if we get here, it means there's already some more recent object stored on Grid!
514 store = kFALSE;
515 break;
516 }
517
518 // Put the object on the Grid only if no more recent object was found there
519 Bool_t storeOk = store && gridSto->Put(aLocEntry);
520 if(!store || storeOk){
521
522 if (!store)
523 {
524 Log(fCurrentDetector.Data(),
525 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
526 type, aGridId->ToString().Data()));
527 } else {
528 Log("SHUTTLE",
529 Form("StoreOCDB - Object <%s> successfully put into %s storage",
530 aLocId.ToString().Data(), type));
531 }
84090f85 532
3301427a 533 // removing local filename...
534 TString filename;
535 localSto->IdToFilename(aLocId, filename);
536 AliInfo(Form("Removing local file %s", filename.Data()));
537 RemoveFile(filename.Data());
538 continue;
539 } else {
540 Log("SHUTTLE",
541 Form("StoreOCDB - Grid %s storage of object <%s> failed",
542 type, aLocId.ToString().Data()));
543 result = kFALSE;
b948db8d 544 }
545 }
3301427a 546 localEntries->Clear();
2bb7b766 547
b948db8d 548 return result;
3301427a 549}
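The per-object decision in StoreOCDB(const TString&) has three outcomes: the object is held back because it has infinite validity while earlier runs are still unprocessed, it is dropped because the Grid already lists an object for the same path that is not valid to infinity, or it is uploaded. The sketch below restates that decision as a pure function, under the assumption that the Grid query list contains exactly the entries the loop above iterates over. All names (`GridId`, `DecideUpload`, `kGridInfinity`, the enum values) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Assumption: stand-in for AliCDBRunRange::Infinity().
const int kGridInfinity = std::numeric_limits<int>::max();

// Reduced view of an AliCDBId: just the path and the last valid run.
struct GridId {
  std::string path;
  int lastRun;
};

enum UploadDecision { kSkipInfiniteValidity, kAlreadyOnGrid, kUpload };

UploadDecision DecideUpload(const GridId& local,
                            const std::vector<GridId>& gridIds,
                            bool firstUnprocessedRun)
{
  // Objects valid to infinity are stored only for the first unprocessed run.
  if (local.lastRun == kGridInfinity && !firstUnprocessedRun)
    return kSkipInfiniteValidity;

  for (std::size_t i = 0; i < gridIds.size(); ++i) {
    if (gridIds[i].path != local.path) continue;      // different object
    if (gridIds[i].lastRun == kGridInfinity) continue; // infinite-validity entries are ignored
    return kAlreadyOnGrid;                             // a more recent object exists
  }
  return kUpload;
}
```

In the function above, both the "already on Grid" and the successful-upload cases end with the local file being removed; only a failed upload keeps it for a later retry.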
550
9827400b 551//______________________________________________________________________________________________
552Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
553{
554 //
3c2a21c8 555 // Stores reference file directly (without opening it). This function stores the file locally.
9827400b 556 //
3c2a21c8 557 // The file is stored under the following location:
558 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
559 // where <gridFileName> is the last parameter given to the function
560 //
9827400b 561
562 if (fTestMode & kErrorStorage)
563 {
564 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
565 return kFALSE;
566 }
567
568 AliCDBManager* man = AliCDBManager::Instance();
569 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
570
571 TString localBaseFolder = sto->GetBaseFolder();
572
573 TString targetDir;
574 targetDir.Form("%s/%s", localBaseFolder.Data(), detector);
575
576 TString target;
577 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
578
579 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
580 if (result)
581 {
582 result = gSystem->mkdir(targetDir, kTRUE);
583 if (result != 0)
584 {
585 Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
586 return kFALSE;
587 }
588 }
589
590 result = gSystem->CopyFile(localFile, target);
591
592 if (result == 0)
593 {
594 Log("SHUTTLE", Form("StoreReferenceFile - Stored file %s locally to %s", localFile, target.Data()));
595 return kTRUE;
596 }
597 else
598 {
599 Log("SHUTTLE", Form("StoreReferenceFile - Storing file %s locally to %s failed", localFile, target.Data()));
600 return kFALSE;
601 }
602}
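The target path used by StoreReferenceFile follows the pattern <base folder>/<DET>/<RUN#>_<gridFileName>. A small sketch of the same composition with std::string, where `ReferenceFileTarget` is a hypothetical helper and the example values in the usage note are made up:

```cpp
#include <cassert>
#include <string>

// Builds <baseFolder>/<detector>/<run>_<gridFileName>, the layout described above.
std::string ReferenceFileTarget(const std::string& baseFolder,
                                const std::string& detector,
                                int run,
                                const std::string& gridFileName)
{
  return baseFolder + "/" + detector + "/" + std::to_string(run) + "_" + gridFileName;
}
```

For instance, a base folder of /tmp/ref, detector TPC, run 1234 and file name pedestals.root would give /tmp/ref/TPC/1234_pedestals.root. Prefixing the run number is what later allows the files of one run to be picked out of the detector folder.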
603
604//______________________________________________________________________________________________
605Bool_t AliShuttle::StoreRefFilesToGrid()
606{
607 //
608 // Transfers the reference files of the current run to the Grid.
9827400b 609 //
86aa42c3 610 // The files are stored under the following location:
3c2a21c8 611 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
86aa42c3 612 //
9827400b 613
614 AliCDBManager* man = AliCDBManager::Instance();
615 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
616 if (!sto)
617 return kFALSE;
618 TString localBaseFolder = sto->GetBaseFolder();
619
620 TString dir;
3d8bc902 621 dir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 622
623 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
624 if (!gridSto)
625 return kFALSE;
626 TString gridBaseFolder = gridSto->GetBaseFolder();
627 TString alienDir;
3d8bc902 628 alienDir.Form("%s%s", gridBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 629
3d8bc902 630 if (!gGrid)
9827400b 631 return kFALSE;
632
9827400b 633 TString begin;
634 begin.Form("%d_", GetCurrentRun());
635
636 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
3d8bc902 637 if (!baseDir)
638 return kTRUE;
639
9827400b 640 TList* dirList = baseDir->GetListOfFiles();
641 if (!dirList)
3d8bc902 642 {
643 delete baseDir;
9827400b 644 return kTRUE;
3d8bc902 645 }
9827400b 646
647 Int_t nDirs = dirList->GetEntries();
648
649 Bool_t success = kTRUE;
3d8bc902 650 Bool_t first = kTRUE;
9827400b 651
652 for (Int_t iDir=0; iDir<nDirs; ++iDir)
653 {
654 TSystemFile* entry = dynamic_cast<TSystemFile*> (dirList->At(iDir));
655 if (!entry)
656 continue;
657
658 if (entry->IsDirectory())
659 continue;
660
661 TString fileName(entry->GetName());
662 if (!fileName.BeginsWith(begin))
663 continue;
664
3d8bc902 665 if (first)
666 {
667 first = kFALSE;
668 // check that DET folder exists, otherwise create it
669 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
670
671 if (!result)
672 return kFALSE;
673
674 if (!result->GetFileName(0))
675 {
676 if (!gGrid->Mkdir(alienDir.Data(),"",0))
677 {
678 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
679 alienDir.Data()));
680 delete baseDir;
681 return kFALSE;
682 }
683
684 }
685 }
686
9827400b 687 TString fullLocalPath;
688 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
689
690 TString fullGridPath;
691 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
692
693 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s", fullLocalPath.Data(), fullGridPath.Data()));
694
695 TFileMerger fileMerger;
696 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
697
698 if (result)
699 {
700 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded", fullLocalPath.Data(), fullGridPath.Data()));
701 RemoveFile(fullLocalPath);
702 }
703 else
704 {
705 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s failed", fullLocalPath.Data(), fullGridPath.Data()));
706 success = kFALSE;
707 }
708 }
709
710 delete baseDir;
711
712 return success;
713}
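StoreRefFilesToGrid selects the files to transfer by checking that the file name begins with "<RUN#>_". The sketch below shows only that filter on a list of names; `FilesForRun` is a hypothetical helper, and the directory listing and the Grid copy are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the names that start with "<run>_", in their original order.
std::vector<std::string> FilesForRun(const std::vector<std::string>& names, int run)
{
  const std::string prefix = std::to_string(run) + "_";
  std::vector<std::string> selected;
  for (std::size_t i = 0; i < names.size(); ++i) {
    // compare() on the leading characters is a "begins with" test
    if (names[i].compare(0, prefix.size(), prefix) == 0)
      selected.push_back(names[i]);
  }
  return selected;
}
```

Because the underscore is part of the prefix, files of run 12345 are not picked up when run 1234 is processed.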
714
3301427a 715//______________________________________________________________________________________________
716void AliShuttle::CleanLocalStorage(const TString& uri)
717{
9827400b 718 //
719 // Called when the preprocessor is declared failed. Removes the remaining objects of the current run and detector from the local storage.
720 //
3301427a 721
722 const char* type = 0;
723 if(uri == fgkLocalCDB) {
724 type = "OCDB";
725 } else if(uri == fgkLocalRefStorage) {
726 type = "reference";
727 } else {
728 AliError(Form("Invalid storage URI: %s", uri.Data()));
729 return;
730 }
731
732 AliCDBManager* man = AliCDBManager::Instance();
b948db8d 733
3301427a 734 // open local storage
735 AliCDBStorage *localSto = man->GetStorage(uri);
736 if(!localSto) {
737 Log("SHUTTLE",
738 Form("CleanLocalStorage - cannot activate local %s storage", type));
739 return;
740 }
741
742 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
743 localSto->GetBaseFolder().Data(), fCurrentDetector.Data(), GetCurrentRun()));
744
745 AliInfo(Form("filename = %s", filename.Data()));
746
747 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
748 GetCurrentRun(), fCurrentDetector.Data()));
749
750 RemoveFile(filename.Data());
751
752}
753
754//______________________________________________________________________________________________
755void AliShuttle::RemoveFile(const char* filename)
756{
9827400b 757 //
758 // Removes a local file; the name may contain shell wildcards (see CleanLocalStorage)
759 //
3301427a 760
761 TString command(Form("rm -f %s", filename));
762
763 Int_t result = gSystem->Exec(command.Data());
764 if(result != 0)
765 {
766 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
767 fCurrentDetector.Data(), filename));
768 }
73abe331 769}
770
b948db8d 771//______________________________________________________________________________________________
5164a766 772AliShuttleStatus* AliShuttle::ReadShuttleStatus()
773{
9827400b 774 //
775 // Reads the AliShuttleStatus from the CDB
776 //
5164a766 777
2bb7b766 778 if (fStatusEntry){
779 delete fStatusEntry;
780 fStatusEntry = 0;
781 }
5164a766 782
10a5a932 783 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
2bb7b766 784 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
5164a766 785
2bb7b766 786 if (!fStatusEntry) return 0;
787 fStatusEntry->SetOwner(1);
5164a766 788
2bb7b766 789 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
790 if (!status) {
791 AliError("Invalid object stored to CDB!");
792 return 0;
793 }
5164a766 794
2bb7b766 795 return status;
5164a766 796}
797
798//______________________________________________________________________________________________
7bfb2090 799Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
5164a766 800{
9827400b 801 //
802 // writes the status for one subdetector
803 //
2bb7b766 804
805 if (fStatusEntry){
806 delete fStatusEntry;
807 fStatusEntry = 0;
808 }
5164a766 809
2bb7b766 810 Int_t run = GetCurrentRun();
5164a766 811
2bb7b766 812 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
5164a766 813
2bb7b766 814 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
815 fStatusEntry->SetOwner(1);
5164a766 816
2bb7b766 817 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
7bfb2090 818
2bb7b766 819 if (!result) {
3301427a 820 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
821 fCurrentDetector.Data(), run));
2bb7b766 822 return kFALSE;
823 }
e7f62f16 824
825 SendMLInfo();
7bfb2090 826
2bb7b766 827 return kTRUE;
5164a766 828}
829
830//______________________________________________________________________________________________
831void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
832{
9827400b 833 //
834 // changes the AliShuttleStatus for the given detector and run to the given status
835 //
5164a766 836
2bb7b766 837 if (!fStatusEntry){
838 AliError("UNEXPECTED: fStatusEntry empty");
839 return;
840 }
5164a766 841
2bb7b766 842 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
5164a766 843
2bb7b766 844 if (!status){
3301427a 845 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
2bb7b766 846 return;
847 }
5164a766 848
2c15234c 849 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
eba76848 850 fCurrentDetector.Data(),
36c99a6a 851 status->GetStatusName(),
eba76848 852 status->GetStatusName(newStatus));
cb343cfd 853 Log("SHUTTLE", actionStr);
854 SetLastAction(actionStr);
5164a766 855
2bb7b766 856 status->SetStatus(newStatus);
857 if (increaseCount) status->IncreaseCount();
5164a766 858
2bb7b766 859 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
e7f62f16 860
861 SendMLInfo();
5164a766 862}
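UpdateShuttleStatus sets the new status and optionally increases a counter, and the 1.10 log entry says that processing is skipped once the number of failures exceeds a configurable threshold. The threshold check itself is not part of this section, so the following is only one plausible reading of that description. `DetectorStatus`, `UpdateStatus` and `RetryAllowed` are hypothetical, and the three states are a simplification of AliShuttleStatus.

```cpp
#include <cassert>

// Hypothetical, simplified status record; AliShuttleStatus is richer than this.
struct DetectorStatus {
  enum State { kStarted, kDone, kFailed };
  State state;
  int count;  // number of counted attempts so far
};

// Mirrors the two steps of UpdateShuttleStatus: set the state, optionally count.
void UpdateStatus(DetectorStatus& status, DetectorStatus::State newState, bool increaseCount)
{
  status.state = newState;
  if (increaseCount) ++status.count;
}

// Assumption: "threshold exceeded" is read as count > maxRetries.
bool RetryAllowed(const DetectorStatus& status, int maxRetries)
{
  return status.count <= maxRetries;
}
```

In the real class the updated status is written to the local CDB after every change, so the counter survives a crash of the process.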
e7f62f16 863
864//______________________________________________________________________________________________
865void AliShuttle::SendMLInfo()
866{
867 //
868 // sends ML information about the current status of the current detector being processed
869 //
870
871 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
872
873 if (!status){
3301427a 874 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
e7f62f16 875 return;
876 }
877
878 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
879 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
880
881 TList mlList;
882 mlList.Add(&mlStatus);
883 mlList.Add(&mlRetryCount);
884
885 fMonaLisa->SendParameters(&mlList);
886}
887
5164a766 888//______________________________________________________________________________________________
889Bool_t AliShuttle::ContinueProcessing()
890{
9827400b 891 // this function reads the AliShuttleStatus information from CDB and
892 // checks if the processing should be continued
893 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 894
57c1a579 895 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
896
897 AliPreprocessor* aPreprocessor =
898 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
899 if (!aPreprocessor)
900 {
901 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
902 return kFALSE;
903 }
904
2bb7b766 905 AliShuttleLogbookEntry::Status entryStatus =
eba76848 906 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 907
908 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 909 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 910 fCurrentDetector.Data(),
911 fLogbookEntry->GetDetectorStatusName(entryStatus)));
912 return kFALSE;
913 }
914
915 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 916
917 // check if current run is first unprocessed run for current detector
918 if (fConfig->StrictRunOrder(fCurrentDetector) &&
919 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
920 {
86aa42c3 921 if (fTestMode == kNone)
922 {
923 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
924 return kFALSE;
925 }
926 else
927 {
928 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
929 }
be48e3ea 930 }
931
2bb7b766 932 AliShuttleStatus* status = ReadShuttleStatus();
933 if (!status) {
934 // first time
935 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
936 fCurrentDetector.Data()));
937 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
938 return WriteShuttleStatus(status);
939 }
940
941 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
942 // If it happens it may mean Logbook updating failed... let's do it now!
943 if (status->GetStatus() == AliShuttleStatus::kDone ||
944 status->GetStatus() == AliShuttleStatus::kFailed){
945 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
946 fCurrentDetector.Data(),
947 status->GetStatusName(status->GetStatus())));
948 UpdateShuttleLogbook(fCurrentDetector.Data(),
949 status->GetStatusName(status->GetStatus()));
950 return kFALSE;
951 }
952
3301427a 953 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 954 Log("SHUTTLE",
955 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
956 fCurrentDetector.Data()));
9827400b 957 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
958 if (StoreOCDB()){
3301427a 959 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
960 fCurrentDetector.Data()));
2bb7b766 961 UpdateShuttleStatus(AliShuttleStatus::kDone);
962 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
963 } else {
964 Log("SHUTTLE",
965 Form("ContinueProcessing - %s: Grid storage failed again",
966 fCurrentDetector.Data()));
9827400b 967 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 968 }
969 return kFALSE;
970 }
971
972 // if we get here, there is a restart
57c1a579 973 Bool_t cont = kFALSE;
2bb7b766 974
975 // abort conditions
cb343cfd 976 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 977 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
978 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 979 status->GetCount(), status->GetStatusName()));
980 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 981 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 982
983 // there may still be objects in local OCDB and reference storage
984 // and FXS databases may be not updated: do it now!
9827400b 985
986 // TODO Currently disabled, we want to keep files in case of failure!
987 // CleanLocalStorage(fgkLocalCDB);
988 // CleanLocalStorage(fgkLocalRefStorage);
989 // UpdateTableFailCase();
990
991 // Send mail to detector expert!
992 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
993 if (!SendMail())
994 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
995 fCurrentDetector.Data()));
3301427a 996
57c1a579 997 } else {
998 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
999 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1000 status->GetStatusName(), status->GetCount()));
9827400b 1001 Bool_t increaseCount = kTRUE;
1002 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1003 increaseCount = kFALSE;
1004 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 1005 cont = kTRUE;
2bb7b766 1006 }
1007
57c1a579 1008 return cont;
5164a766 1009}
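// Illustrative sketch (not part of AliShuttle): the restart rule applied at the end of
// ContinueProcessing() can be written as a small pure function. The Status enum and the
// names RestartDecision and decideRestart are hypothetical stand-ins. Only the rule itself
// is taken from the code above: give up once the retry count has reached the maximum, and
// do not count a restart that follows an abort in the DCS stage.

```cpp
#include <cassert>

// Hypothetical, reduced stand-in for the AliShuttleStatus::Status values.
enum class Status { Started, DCSStarted, DCSError, PPStarted, PPError };

struct RestartDecision {
    bool restart;        // true: process the detector again
    bool increaseCount;  // true: this restart counts towards the retry limit
};

// Decides what to do with a detector whose previous attempt was aborted in state 'last'.
RestartDecision decideRestart(Status last, int count, int maxRetries) {
    assert(maxRetries >= 0);
    if (count >= maxRetries) return {false, false};  // abort condition: too many retries
    const bool abortedInDCS = (last == Status::DCSStarted || last == Status::DCSError);
    return {true, !abortedInDCS};                    // DCS-stage aborts are not counted
}
```

// Keeping the rule free of side effects makes it easy to check in isolation; logging,
// status updates and mail notification would stay in the caller.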
1010
1011//______________________________________________________________________________________________
2bb7b766 1012Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 1013{
73abe331 1014 //
b948db8d 1015 // Retrieves data for all detectors in the configuration.
2bb7b766 1016 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1017 // (Unprocessed, Inactive, Failed or Done).
d477ad88 1018 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 1019 //
1020
9827400b 1021 if (!entry) return kFALSE;
2bb7b766 1022
1023 fLogbookEntry = entry;
1024
9827400b 1025 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1026 GetCurrentRun()));
2bb7b766 1027
e7f62f16 1028 // create ML instance that monitors this run
1029 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1030 // disable monitoring of other parameters that come e.g. from TFile
1031 gMonitoringWriter = 0;
2bb7b766 1032
e7f62f16 1033 // Send the information to ML
1034 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1035 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1036
1037 TList mlList;
1038 mlList.Add(&mlStatus);
9827400b 1039 mlList.Add(&mlRunType);
e7f62f16 1040
1041 fMonaLisa->SendParameters(&mlList);
3301427a 1042
9827400b 1043 if (fLogbookEntry->IsDone())
1044 {
1045 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1046 UpdateShuttleLogbook("shuttle_done");
1047 fLogbookEntry = 0;
1048 return kTRUE;
1049 }
1050
1051 // read test mode if flag is set
1052 if (fReadTestMode)
1053 {
3d8bc902 1054 fTestMode = kNone;
9827400b 1055 TString logEntry(entry->GetRunParameter("log"));
1056 //printf("log entry = %s\n", logEntry.Data());
1057 TString searchStr("Testmode: ");
1058 Int_t pos = logEntry.Index(searchStr.Data());
1059 //printf("%d\n", pos);
1060 if (pos >= 0)
1061 {
1062 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1063 //printf("%s\n", subStr.String().Data());
1064 TString newStr(subStr.Data());
1065 TObjArray* token = newStr.Tokenize(' ');
1066 if (token)
1067 {
1068 //token->Print();
1069 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1070 if (tmpStr)
1071 {
1072 Int_t testMode = tmpStr->String().Atoi();
1073 if (testMode > 0)
1074 {
1075 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1076 SetTestMode((TestMode) testMode);
1077 }
1078 }
1079 delete token;
1080 }
1081 }
1082 }
1083
3d8bc902 1084 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1085
eba76848 1086 fLogbookEntry->Print("all");
57f50b3c 1087
1088 // Initialization
d477ad88 1089 Bool_t hasError = kFALSE;
5164a766 1090
2bb7b766 1091 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1092 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1093 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1094 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1095
57f50b3c 1096 // Loop on detectors in the configuration
b948db8d 1097 TIter iter(fConfig->GetDetectors());
2bb7b766 1098 TObjString* aDetector = 0;
b948db8d 1099
be48e3ea 1100 while ((aDetector = (TObjString*) iter.Next()))
1101 {
7bfb2090 1102 fCurrentDetector = aDetector->String();
5164a766 1103
9e080f92 1104 if (ContinueProcessing() == kFALSE) continue;
1105
2bb7b766 1106 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1107 GetCurrentRun(), aDetector->GetName()));
1108
9d733021 1109 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1110
e7f62f16 1111 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1112
be48e3ea 1113 Int_t pid = fork();
1114
1115 if (pid < 0)
1116 {
1117 Log("SHUTTLE", "ERROR: Forking failed");
1118 }
1119 else if (pid > 0)
1120 {
1121 // parent
1122 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1123 GetCurrentRun(), aDetector->GetName()));
1124
1125 Long_t begin = time(0);
1126
1127 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1128 while (waitpid(pid, &status, WNOHANG) == 0)
1129 {
1130 Long_t expiredTime = time(0) - begin;
1131
1132 if (expiredTime > fConfig->GetPPTimeOut())
1133 {
9827400b 1134 TString tmp;
1135 tmp.Form("Process of %s time out. Run time: %ld seconds. Killing...",
1136 fCurrentDetector.Data(), expiredTime);
1137 Log("SHUTTLE", tmp);
1138 Log(fCurrentDetector, tmp);
be48e3ea 1139
1140 kill(pid, 9);
1141
3301427a 1142 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1143 hasError = kTRUE;
1144
1145 gSystem->Sleep(1000);
1146 }
1147 else
1148 {
be48e3ea 1149 gSystem->Sleep(1000);
9827400b 1150
1151 TString checkStr;
1152 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1153 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1154 if (!pipe)
1155 {
1156 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1157 continue;
1158 }
1159
1160 char buffer[100];
1161 if (!fgets(buffer, 100, pipe))
1162 {
1163 Log("SHUTTLE", "Error: ps did not return anything");
1164 gSystem->ClosePipe(pipe);
1165 continue;
1166 }
1167 gSystem->ClosePipe(pipe);
1168
1169 //Log("SHUTTLE", Form("ps returned %s", buffer));
1170
1171 Int_t mem = 0;
1172 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1173 {
1174 Log("SHUTTLE", "Error: Could not parse output of ps");
1175 continue;
1176 }
1177
1178 if (expiredTime % 60 == 0)
886d60e6 1179 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1180 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1181
1182 if (mem > fConfig->GetPPMaxMem())
1183 {
1184 TString tmp;
1185 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1186 mem, fConfig->GetPPMaxMem());
1187 Log("SHUTTLE", tmp);
1188 Log(fCurrentDetector, tmp);
1189
1190 kill(pid, 9);
1191
1192 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1193 hasError = kTRUE;
1194
1195 gSystem->Sleep(1000);
1196 }
be48e3ea 1197 }
1198 }
1199
1200 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1201 GetCurrentRun(), aDetector->GetName()));
1202
1203 if (WIFEXITED(status))
1204 {
1205 Int_t returnCode = WEXITSTATUS(status);
1206
3301427a 1207 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1208 returnCode));
be48e3ea 1209
9827400b 1210 if (returnCode == 0) hasError = kTRUE; // the client exits with its Bool_t success flag, so 0 means failure
be48e3ea 1211 }
1212 }
1213 else if (pid == 0)
1214 {
1215 // client
1216 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1217
ffa29e93 1218 AliInfo("Redirecting output...");
1219
1220 if ((freopen(GetLogFileName(fCurrentDetector), "w", stdout)) == 0)
1221 {
1222 Log("SHUTTLE", "Could not freopen stdout");
1223 }
1224 else
1225 {
1226 fOutputRedirected = kTRUE;
1227 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1228 Log("SHUTTLE", "Could not redirect stderr");
1229
1230 }
1231
9827400b 1232 Bool_t success = ProcessCurrentDetector();
1233 if (success) // Preprocessor finished successfully!
1234 {
3301427a 1235 // Update time_processed field in FXS DB
1236 if (UpdateTable() == kFALSE)
1237 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1238
1239 // Transfer the data from local storage to main storage (Grid)
1240 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1241 if (StoreOCDB() == kFALSE)
1242 {
1243 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1244 GetCurrentRun(), aDetector->GetName()));
1245 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1246 success = kFALSE;
3301427a 1247 } else {
1248 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1249 GetCurrentRun(), aDetector->GetName()));
1250 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1251 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1252 }
be48e3ea 1253 }
1254
4b95672b 1255 for (UInt_t iSys=0; iSys<3; iSys++)
1256 {
1257 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1258 }
1259
be48e3ea 1260 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1261 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1262
1263 // the client exits here
9827400b 1264 gSystem->Exit(success);
be48e3ea 1265
1266 AliError("We should never get here!!!");
1267 }
7bfb2090 1268 }
5164a766 1269
2bb7b766 1270 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1271 GetCurrentRun()));
1272
1273 //check if shuttle is done for this run, if so update logbook
1274 TObjArray checkEntryArray;
1275 checkEntryArray.SetOwner(1);
9e080f92 1276 TString whereClause = Form("where run=%d", GetCurrentRun());
1277 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1278 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1279 GetCurrentRun()));
1280 return hasError == kFALSE;
1281 }
b948db8d 1282
9e080f92 1283 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1284 (checkEntryArray.At(0));
2bb7b766 1285
9e080f92 1286 if (checkEntry)
1287 {
1288 if (checkEntry->IsDone())
be48e3ea 1289 {
9e080f92 1290 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1291 UpdateShuttleLogbook("shuttle_done");
1292 }
1293 else
1294 {
1295 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1296 {
9e080f92 1297 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1298 {
9e080f92 1299 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1300 checkEntry->GetRun(), GetDetName(iDet)));
1301 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1302 }
1303 }
2bb7b766 1304 }
1305 }
1306
e7f62f16 1307 // remove ML instance
1308 delete fMonaLisa;
1309 fMonaLisa = 0;
1310
2bb7b766 1311 fLogbookEntry = 0;
85a80aa9 1312
a7160fe9 1313 return hasError == kFALSE;
73abe331 1314}
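// Illustrative sketch (not part of AliShuttle): a ROOT-free version of the parent-side
// monitoring loop in Process(). It forks a child, polls it with waitpid(WNOHANG) and kills
// it once a time budget is exceeded. The name runWithTimeout and its millisecond parameters
// are assumptions made for illustration. The sketch returns the child's exit code unchanged
// and omits the memory check done via ps.

```cpp
#include <cassert>
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs work() in a forked child and polls it without blocking.
// Returns the child's exit code, or -1 if fork/waitpid failed or the child
// was terminated by a signal (for example after the timeout).
int runWithTimeout(int (*work)(), int timeoutMs, int pollMs = 10) {
    assert(timeoutMs > 0 && pollMs > 0);

    const pid_t pid = fork();
    if (pid < 0) return -1;        // fork failed
    if (pid == 0) _exit(work());   // child: run the job and exit immediately

    const auto begin = std::chrono::steady_clock::now();
    bool killSent = false;
    int status = 0;                // plain int, as waitpid expects

    for (;;) {
        const pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) break;       // child has terminated
        if (r < 0) return -1;      // waitpid error

        const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now() - begin).count();
        if (!killSent && elapsed > timeoutMs) {
            kill(pid, SIGKILL);    // same signal as kill(pid, 9)
            killSent = true;
        }
        usleep(static_cast<useconds_t>(pollMs) * 1000);  // wait before polling again
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

// The loop keeps polling after the kill so that the child is always reaped; a killed child
// shows up as "terminated by signal" and is reported as -1.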
1315
b948db8d 1316//______________________________________________________________________________________________
9827400b 1317Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1318{
1319 //
2bb7b766 1320 // Retrieves data for a single detector (fCurrentDetector).
73abe331 1321 // There should be a configuration for this detector.
73abe331 1322
2bb7b766 1323 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1324
2c15234c 1325 TMap dcsMap;
1326 dcsMap.SetOwner(1);
73abe331 1327
85a80aa9 1328 Bool_t aDCSError = kFALSE;
3301427a 1329
1330 // call preprocessor
1331 AliPreprocessor* aPreprocessor =
1332 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1333
1334 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1335
1336 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1337
651fdaab 1338 if (!processDCS)
1339 {
1340 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1341 }
1342 else if (fTestMode & kSkipDCS)
2c15234c 1343 {
3d8bc902 1344 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
9827400b 1345 }
1346 else if (fTestMode & kErrorDCS)
1347 {
3d8bc902 1348 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1349 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1350 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1351 return kFALSE;
2c15234c 1352 } else {
3301427a 1353
1354 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1355
2c15234c 1356 TString host(fConfig->GetDCSHost(fCurrentDetector));
1357 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1358
1359 // Retrieval of Aliases
1360 TObjString* anAlias = 0;
36c99a6a 1361 Int_t iAlias = 1;
1362 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
2c15234c 1363 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1364 while ((anAlias = (TObjString*) iterAliases.Next()))
1365 {
1366 TObjArray *valueSet = new TObjArray();
1367 valueSet->SetOwner(1);
1368
36c99a6a 1369 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1370 AliInfo(Form("Querying DCS archive: alias %s (%d of %d)",
1371 anAlias->GetName(), iAlias, nTotAliases));
2c15234c 1372 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1373 iAlias++; // count every alias, not only the logged ones
1374 if(!aDCSError)
1375 {
1376 dcsMap.Add(anAlias->Clone(), valueSet);
1377 } else {
1378 Log(fCurrentDetector,
1379 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1380 anAlias->GetName()));
1381 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1382 dcsMap.DeleteAll();
9827400b 1383 return kFALSE;
2c15234c 1384 }
4f0ab988 1385 }
2c15234c 1386
1387 // Retrieval of Data Points
1388 TObjString* aDP = 0;
36c99a6a 1389 Int_t iDP = 1;
1390 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
2c15234c 1391 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1392 while ((aDP = (TObjString*) iterDP.Next()))
1393 {
1394 TObjArray *valueSet = new TObjArray();
1395 valueSet->SetOwner(1);
36c99a6a 1396 if (((iDP-1) % 500) == 0 || iDP == nTotDPs)
1397 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1398 aDP->GetName(), iDP, nTotDPs));
2c15234c 1399 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1400 iDP++; // count every data point, not only the logged ones
1401 if(!aDCSError)
1402 {
1403 dcsMap.Add(aDP->Clone(), valueSet);
1404 } else {
1405 Log(fCurrentDetector,
1406 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1407 aDP->GetName()));
1408 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1409 dcsMap.DeleteAll();
9827400b 1410 return kFALSE;
2c15234c 1411 }
73abe331 1412 }
1413 }
b948db8d 1414
2bb7b766 1415 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1416 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1417
3301427a 1418 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
b948db8d 1419
3301427a 1420 if (returnValue > 0) // Preprocessor error!
1421 {
9827400b 1422 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
cb343cfd 1423 UpdateShuttleStatus(AliShuttleStatus::kPPError);
9827400b 1424 dcsMap.DeleteAll();
1425 return kFALSE;
1426 }
1427
1428 // preprocessor ok!
1429 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1430 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1431 fCurrentDetector.Data()));
b948db8d 1432
2c15234c 1433 dcsMap.DeleteAll();
b948db8d 1434
9827400b 1435 return kTRUE;
2bb7b766 1436}
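// Illustrative sketch (not part of AliShuttle): the progress condition used in the two DCS
// retrieval loops above, isolated as a function. It reports the first item, every 500th
// item after it, and the last one. The name shouldLogProgress and the 'step' parameter are
// hypothetical; the counter is assumed to be 1-based and to advance on every iteration.

```cpp
#include <cassert>

// True when item 'index' (1-based) out of 'total' should produce a progress message.
bool shouldLogProgress(int index, int total, int step = 500) {
    assert(step > 0 && index >= 1);
    return ((index - 1) % step) == 0 || index == total;
}
```

// With 1200 items this yields messages for items 1, 501, 1001 and 1200. The counter has to
// be incremented outside the logging call, otherwise it stops advancing as soon as one
// message is skipped.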
1437
1438//______________________________________________________________________________________________
1439Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1440 TObjArray& entries)
1441{
9827400b 1442 // Queries DAQ's Shuttle logbook and fills the detector status objects.
1443 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1444 //
2bb7b766 1445
fc5a4708 1446 entries.SetOwner(1);
1447
2bb7b766 1448 // check connection; connect if necessary
be48e3ea 1449 if(!Connect(3)) return kFALSE;
2bb7b766 1450
1451 TString sqlQuery;
441b0e9c 1452 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1453
be48e3ea 1454 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1455 if (!aResult) {
1456 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1457 return kFALSE;
1458 }
1459
fc5a4708 1460 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1461
2bb7b766 1462 if(aResult->GetRowCount() == 0) {
9827400b 1463 AliInfo("No entries in Shuttle Logbook match request");
1464 delete aResult;
1465 return kTRUE;
2bb7b766 1466 }
1467
1468 // TODO Check field count!
fc5a4708 1469 const UInt_t nCols = 22;
2bb7b766 1470 if (aResult->GetFieldCount() != (Int_t) nCols) {
1471 AliError("Invalid SQL result field number!");
1472 delete aResult;
1473 return kFALSE;
1474 }
1475
2bb7b766 1476 TSQLRow* aRow;
1477 while ((aRow = aResult->Next())) {
1478 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1479 Int_t run = runString.Atoi();
1480
eba76848 1481 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1482 if (!entry)
1483 continue;
2bb7b766 1484
1485 // loop on detectors
eba76848 1486 for(UInt_t ii = 0; ii < nCols; ii++)
1487 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1488
eba76848 1489 entries.AddLast(entry);
2bb7b766 1490 delete aRow;
1491 }
1492
2bb7b766 1493 delete aResult;
1494 return kTRUE;
1495}
1496
1497//______________________________________________________________________________________________
eba76848 1498AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1499{
eba76848 1500 //
1501 // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1502 //
2bb7b766 1503
1504 // check connection; connect if necessary
be48e3ea 1505 if (!Connect(3))
eba76848 1506 return 0;
2bb7b766 1507
1508 TString sqlQuery;
2c15234c 1509 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1510
be48e3ea 1511 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1512 if (!aResult) {
1513 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1514 return 0;
2bb7b766 1515 }
1516
eba76848 1517 if (aResult->GetRowCount() == 0) {
2bb7b766 1518 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1519 delete aResult;
eba76848 1520 return 0;
2bb7b766 1521 }
1522
eba76848 1523 if (aResult->GetRowCount() > 1) {
2bb7b766 1524 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1525 delete aResult;
eba76848 1526 return 0;
2bb7b766 1527 }
1528
eba76848 1529 TSQLRow* aRow = aResult->Next();
1530 if (!aRow)
1531 {
1532 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1533 delete aResult;
1534 return 0;
1535 }
2bb7b766 1536
eba76848 1537 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1538
eba76848 1539 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1540 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1541
eba76848 1542 UInt_t startTime = entry->GetStartTime();
1543 UInt_t endTime = entry->GetEndTime();
1544
1545 if (!startTime || !endTime || startTime > endTime) {
1546 Log("SHUTTLE",
1547 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
1548 run, startTime, endTime));
1549 delete entry;
2bb7b766 1550 delete aRow;
eba76848 1551 delete aResult;
1552 return 0;
2bb7b766 1553 }
1554
eba76848 1555 delete aRow;
2bb7b766 1556 delete aResult;
eba76848 1557
1558 return entry;
2bb7b766 1559}
1560
b948db8d 1561//______________________________________________________________________________________________
2c15234c 1562Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1563 TObjArray* valueSet, DCSType type)
73abe331 1564{
9827400b 1565 // Retrieve all "entry" data points from the DCS server
1566 // host, port: TSocket connection parameters
1567 // entry: name of the alias or data point
1568 // valueSet: array of retrieved AliDCSValue's
1569 // type: kAlias or kDP
58bc3020 1570
73abe331 1571 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1572 if (!client.IsConnected())
1573 {
b948db8d 1574 return kFALSE;
73abe331 1575 }
1576
2c15234c 1577 Int_t result=0;
73abe331 1578
2c15234c 1579 if (type == kAlias)
1580 {
1581 result = client.GetAliasValues(entry,
1582 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1583 } else
1584 if (type == kDP)
1585 {
1586 result = client.GetDPValues(entry,
1587 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1588 }
1589
1590 if (result < 0)
1591 {
2bb7b766 1592 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1593 entry, AliDCSClient::GetErrorString(result)));
73abe331 1594
2c15234c 1595 if (result == AliDCSClient::fgkServerError)
1596 {
2bb7b766 1597 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1598 client.GetServerError().Data()));
1599 }
1600
1601 return kFALSE;
1602 }
1603
1604 return kTRUE;
1605}
b948db8d 1606
1607//______________________________________________________________________________________________
57f50b3c 1608const char* AliShuttle::GetFile(Int_t system, const char* detector,
1609 const char* id, const char* source)
b948db8d 1610{
9827400b 1611 // Get calibration file from file exchange servers
1612 // First queries the FXS database for the file name, using the run, detector, id and source info
1613 // then calls RetrieveFile(filename) for actual copy to local disk
1614 // run: current run being processed (given by Logbook entry fLogbookEntry)
1615 // detector: the Preprocessor name
1616 // id: provided as a parameter by the Preprocessor
1617 // source: provided by the Preprocessor through GetFileSources function
1618
1619 // check if test mode should simulate a FXS error
1620 if (fTestMode & kErrorFXSFiles)
1621 {
1622 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1623 return 0;
1624 }
1625
57f50b3c 1626 // check connection; connect if necessary
9d733021 1627 if (!Connect(system))
eba76848 1628 {
9d733021 1629 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1630 return 0;
1631 }
1632
1633 // Query preparation
9d733021 1634 TString sourceName(source);
d386d623 1635 Int_t nFields = 3;
1636 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1637 fConfig->GetFXSdbTable(system));
1638 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1639 GetCurrentRun(), detector, id);
1640
9d733021 1641 if (system == kDAQ)
1642 {
d386d623 1643 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1644 }
9d733021 1645 else if (system == kDCS)
eba76848 1646 {
9d733021 1647 sourceName="none";
57f50b3c 1648 }
9d733021 1649 else if (system == kHLT)
9e080f92 1650 {
d386d623 1651 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1652 nFields = 3;
9e080f92 1653 }
1654
9e080f92 1655 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1656
1657 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1658
1659 // Query execution
1660 TSQLResult* aResult = 0;
9d733021 1661 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1662 if (!aResult) {
9d733021 1663 Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
1664 GetSystemName(system), id, sourceName.Data()));
9e080f92 1665 return 0;
1666 }
1667
1668 if(aResult->GetRowCount() == 0)
1669 {
1670 Log(detector,
9d733021 1671 Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
1672 GetSystemName(system), id, sourceName.Data()));
9e080f92 1673 delete aResult;
1674 return 0;
1675 }
2bb7b766 1676
9e080f92 1677 if (aResult->GetRowCount() > 1) {
1678 Log(detector,
9d733021 1679 Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
1680 GetSystemName(system), id, sourceName.Data()));
9e080f92 1681 delete aResult;
1682 return 0;
1683 }
1684
9d733021 1685 if (aResult->GetFieldCount() != nFields) {
9e080f92 1686 Log(detector,
9d733021 1687 Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
1688 GetSystemName(system), id, sourceName.Data()));
9e080f92 1689 delete aResult;
1690 return 0;
1691 }
1692
1693 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1694
1695 if (!aRow){
9d733021 1696 Log(detector, Form("GetFile - Empty set result in %s FXS db from query: id = %s, source = %s",
1697 GetSystemName(system), id, sourceName.Data()));
9e080f92 1698 delete aResult;
1699 return 0;
1700 }
1701
1702 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1703 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1704 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1705
1706 delete aResult;
1707 delete aRow;
1708
d386d623 1709 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1710 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1711
9e080f92 1712 // retrieved file is renamed to make it unique
9d733021 1713 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1714 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1715
9e080f92 1716
9d733021 1717 // file retrieval from FXS
4b95672b 1718 UInt_t nRetries = 0;
1719 UInt_t maxRetries = 3;
1720 Bool_t result = kFALSE;
1721
1722 // copy!! if successful TSystem::Exec returns 0
1723 while(nRetries++ < maxRetries) {
1724 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1725 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1726 if(!result)
1727 {
1728 Log(detector, Form("GetFile - Copy of file %s from %s FXS failed",
9d733021 1729 filePath.Data(), GetSystemName(system)));
4b95672b 1730 continue;
1731 } else {
1732 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1733 filePath.Data(), GetSystemName(system),
1734 GetShuttleTempDir(), localFileName.Data()));
1735 }
9e080f92 1736
d386d623 1737 if (fileChecksum.Length()>0)
4b95672b 1738 {
1739 // compare md5sum of local file with the one stored in the FXS DB
1740 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
d386d623 1741 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 1742
4b95672b 1743 if (md5Comp != 0)
1744 {
1745 Log(detector, Form("GetFile - md5sum of file %s does not match the local copy!",
1746 filePath.Data()));
1747 result = kFALSE;
1748 continue;
1749 }
d386d623 1750 } else {
1751 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1752 filePath.Data(), GetSystemName(system)));
9d733021 1753 }
4b95672b 1754 if (result) break;
9e080f92 1755 }
1756
4b95672b 1757 if(!result) return 0;
1758
9d733021 1759 fFXSCalled[system]=kTRUE;
1760 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1761 fFXSlist[system].Add(fileParams);
9e080f92 1762
1763 static TString fullLocalFileName;
36c99a6a 1764 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1765
9e080f92 1766 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1767
1768 return fullLocalFileName.Data();
2bb7b766 1769
1770}
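// Sketch of the retry pattern used above, reduced to standard C++ (C++11 assumed).
// CopyWithRetries and its two callables are hypothetical names, not part of
// AliShuttle: copyFile stands in for RetrieveFile and verifyChecksum for the
// md5sum comparison. An attempt counts as successful only if both steps pass.

```cpp
#include <functional>

// Hypothetical helper, not part of AliShuttle: retries a copy step up to
// maxRetries times and accepts it only if the verification step also passes.
bool CopyWithRetries(const std::function<bool()>& copyFile,
                     const std::function<bool()>& verifyChecksum,
                     unsigned maxRetries = 3)
{
    for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
        if (!copyFile()) continue;        // copy failed: try again
        if (!verifyChecksum()) continue;  // checksum mismatch: try again
        return true;                      // copied and verified
    }
    return false;                         // all attempts exhausted
}
```

// Keeping the two steps as separate callables makes it explicit that a
// checksum mismatch consumes one attempt, just like a failed copy.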
1771
1772//______________________________________________________________________________________________
9d733021 1773Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 1774{
9827400b 1775 //
1776 // Copies file from FXS to local Shuttle machine
1777 //
2bb7b766 1778
9e080f92 1779 // check temp directory: trying to cd to temp; if it does not exist, create it
9d733021 1780 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1781 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 1782
36c99a6a 1783 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 1784 if (dir == NULL) {
36c99a6a 1785 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1786 AliError(Form("Can't create directory <%s>", GetShuttleTempDir()));
9e080f92 1787 return kFALSE;
1788 }
1789
1790 } else {
1791 gSystem->FreeDirectory(dir);
1792 }
1793
9d733021 1794 TString baseFXSFolder;
1795 if (system == kDAQ)
1796 {
1797 baseFXSFolder = "FES/";
1798 }
1799 else if (system == kDCS)
1800 {
1801 baseFXSFolder = "";
1802 }
1803 else if (system == kHLT)
1804 {
1805 baseFXSFolder = "~/";
1806 }
1807
1808
1809 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1810 fConfig->GetFXSPort(system),
1811 fConfig->GetFXSUser(system),
1812 fConfig->GetFXSHost(system),
1813 baseFXSFolder.Data(),
1814 fxsFileName,
36c99a6a 1815 GetShuttleTempDir(),
9e080f92 1816 localFileName);
1817
1818 AliDebug(2, Form("%s",command.Data()));
1819
4b95672b 1820 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 1821
4b95672b 1822 return result;
9e080f92 1823}
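// Sketch of how the scp command line in RetrieveFile is put together, written
// with std::string instead of Form(). BuildScpCommand is a hypothetical helper
// and the host, user and paths in any call are placeholders, not real FXS
// settings. Note that the pieces are concatenated without shell quoting, as in
// the original, so file names containing spaces would not survive.

```cpp
#include <string>

// Hypothetical stand-alone helper (not in AliShuttle): assembles the scp
// command line from its parts, mirroring the Form() call in RetrieveFile.
std::string BuildScpCommand(int port, const std::string& user,
                            const std::string& host,
                            const std::string& baseFolder,
                            const std::string& remoteFile,
                            const std::string& localDir,
                            const std::string& localFile)
{
    return "scp -oPort=" + std::to_string(port) + " -2 " +
           user + "@" + host + ":" + baseFolder + remoteFile + " " +
           localDir + "/" + localFile;
}
```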
1824
1825//______________________________________________________________________________________________
9d733021 1826TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1827{
9827400b 1828 //
1829 // Get sources producing the condition file Id from file exchange servers
1830 //
1831
1832 // check if test mode should simulate a FXS error
1833 if (fTestMode & kErrorFXSSources)
1834 {
1835 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1836 return 0;
1837 }
1838
9d733021 1839
1840 if (system == kDCS)
1841 {
1842 AliError("DCS system has only one source of data!");
1843 return NULL;
9d733021 1844 }
9e080f92 1845
1846 // check connection, in case connect
9d733021 1847 if (!Connect(system))
1848 {
1849 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
1850 return NULL;
9e080f92 1851 }
1852
9d733021 1853 TString sourceName;
1854 if (system == kDAQ)
1855 {
1856 sourceName = "DAQsource";
1857 } else if (system == kHLT)
1858 {
1859 sourceName = "DDLnumbers";
1860 }
1861
d386d623 1862 TString sqlQueryStart = Form("select %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
9e080f92 1863 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1864 GetCurrentRun(), detector, id);
1865 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1866
1867 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1868
1869 // Query execution
1870 TSQLResult* aResult;
9d733021 1871 aResult = fServer[system]->Query(sqlQuery);
9e080f92 1872 if (!aResult) {
9d733021 1873 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
1874 GetSystemName(system), id));
9e080f92 1875 return 0;
1876 }
1877
86aa42c3 1878 TList *list = new TList();
1879 list->SetOwner(1);
1880
9d733021 1881 if (aResult->GetRowCount() == 0)
1882 {
9e080f92 1883 Log(detector,
9d733021 1884 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
9e080f92 1885 delete aResult;
86aa42c3 1886 return list;
9e080f92 1887 }
1888
1889 TSQLRow* aRow;
9e080f92 1890
9d733021 1891 while ((aRow = aResult->Next()))
1892 {
9e080f92 1893
9d733021 1894 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
1895 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
1896 list->Add(new TObjString(source));
9e080f92 1897 delete aRow;
1898 }
9d733021 1899
9e080f92 1900 delete aResult;
1901
1902 return list;
2bb7b766 1903}
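// Sketch of the query composed in GetFileSources, in plain C++ so it can be
// checked in isolation. BuildSourcesQuery is a hypothetical helper; the table
// name used in any call is a placeholder for fConfig->GetFXSdbTable(system).
// Values are pasted into the statement unescaped, as in the original.

```cpp
#include <string>

// Hypothetical helper (not in AliShuttle): composes the select statement that
// returns the sources which produced a given file id for a run and detector.
std::string BuildSourcesQuery(const std::string& sourceColumn,
                              const std::string& table,
                              int run,
                              const std::string& detector,
                              const std::string& fileId)
{
    return "select " + sourceColumn + " from " + table +
           " where run=" + std::to_string(run) +
           " and detector=\"" + detector + "\"" +
           " and fileId=\"" + fileId + "\"";
}
```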
1904
1905//______________________________________________________________________________________________
9d733021 1906Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 1907{
9827400b 1908 // Connect to MySQL Server of the system's FXS MySQL databases
1909 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
1910 //
57f50b3c 1911
9d733021 1912 // check connection: if already connected return
1913 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 1914
9d733021 1915 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 1916
9d733021 1917 if (system < 3) // FXS db servers
1918 {
1919 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
1920 dbUser = fConfig->GetFXSdbUser(system);
1921 dbPass = fConfig->GetFXSdbPass(system);
1922 dbName = fConfig->GetFXSdbName(system);
1923 } else { // Run & Shuttle logbook servers
1924 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
1925 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
1926 dbUser = fConfig->GetDAQlbUser();
1927 dbPass = fConfig->GetDAQlbPass();
1928 dbName = fConfig->GetDAQlbDB();
1929 }
57f50b3c 1930
9d733021 1931 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
1932 if (!fServer[system] || !fServer[system]->IsConnected()) {
1933 if(system < 3)
1934 {
1935 AliError(Form("Can't establish connection to FXS database for %s",
1936 AliShuttleInterface::GetSystemName(system)));
1937 } else {
1938 AliError("Can't establish connection to Run logbook.");
57f50b3c 1939 }
9d733021 1940 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; }
1941 return kFALSE;
2bb7b766 1942 }
57f50b3c 1943
9d733021 1944 // Get tables
1945 TSQLResult* aResult=0;
1946 switch(system){
1947 case kDAQ:
1948 aResult = fServer[kDAQ]->GetTables(dbName.Data());
1949 break;
1950 case kDCS:
1951 aResult = fServer[kDCS]->GetTables(dbName.Data());
1952 break;
1953 case kHLT:
1954 aResult = fServer[kHLT]->GetTables(dbName.Data());
1955 break;
1956 default:
1957 aResult = fServer[3]->GetTables(dbName.Data());
1958 break;
1959 }
1960
1961 delete aResult;
2bb7b766 1962 return kTRUE;
1963}
57f50b3c 1964
9e080f92 1965//______________________________________________________________________________________________
9d733021 1966Bool_t AliShuttle::UpdateTable()
9e080f92 1967{
9827400b 1968 //
1969 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
1970 //
9e080f92 1971
9d733021 1972 Bool_t result = kTRUE;
9e080f92 1973
9d733021 1974 for (UInt_t system=0; system<3; system++)
1975 {
1976 if(!fFXSCalled[system]) continue;
9e080f92 1977
9d733021 1978 // check connection, in case connect
1979 if (!Connect(system))
1980 {
1981 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
1982 result = kFALSE;
1983 continue;
9e080f92 1984 }
9e080f92 1985
9d733021 1986 TTimeStamp now; // now
1987
1988 // Loop on FXS list entries
1989 TIter iter(&fFXSlist[system]);
1990 TObjString *aFXSentry=0;
1991 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
1992 {
1993 TString aFXSentrystr = aFXSentry->String();
1994 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
1995 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
1996 {
1997 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
1998 GetSystemName(system), aFXSentrystr.Data()));
1999 if(aFXSarray) delete aFXSarray;
2000 result = kFALSE;
2001 continue;
2002 }
2003 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2004 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2005
2006 TString whereClause;
2007 if (system == kDAQ)
2008 {
2009 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2010 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2011 }
2012 else if (system == kDCS)
2013 {
2014 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2015 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2016 }
2017 else if (system == kHLT)
2018 {
2019 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2020 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2021 }
2022
2023 delete aFXSarray;
9e080f92 2024
9d733021 2025 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2026 (Int_t) now.GetSec(), whereClause.Data());
9e080f92 2027
9d733021 2028 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2029
9d733021 2030 // Query execution
2031 TSQLResult* aResult;
2032 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2033 if (!aResult)
2034 {
2035 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2036 GetSystemName(system), sqlQuery.Data()));
2037 result = kFALSE;
2038 continue;
2039 }
2040 delete aResult;
9e080f92 2041 }
9e080f92 2042 }
2043
9d733021 2044 return result;
9e080f92 2045}
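// Note on the entry format "id#!?!#source" parsed in UpdateTable:
// TString::Tokenize treats its argument as a set of delimiter characters, so
// an id or source that itself contains '#', '!' or '?' would yield more than
// two tokens and be rejected. The sketch below shows one possible alternative
// that splits on the whole separator string. SplitFxsEntry is a hypothetical
// helper in plain C++, not part of AliShuttle.

```cpp
#include <string>

// Hypothetical helper: splits "id#!?!#source" at the first occurrence of the
// complete separator. Returns false if the separator is absent.
bool SplitFxsEntry(const std::string& entry,
                   std::string& fileId, std::string& source)
{
    static const std::string kSeparator = "#!?!#";
    const std::string::size_type pos = entry.find(kSeparator);
    if (pos == std::string::npos) return false;
    fileId = entry.substr(0, pos);
    source = entry.substr(pos + kSeparator.size());
    return true;
}
```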
57f50b3c 2046
3301427a 2047//______________________________________________________________________________________________
2048Bool_t AliShuttle::UpdateTableFailCase()
2049{
9827400b 2050 // Update FXS table filling time_processed field in all rows corresponding to current run and detector.
2051 // This is called when the preprocessor is declared failed for the current run, because
2052 // otherwise the fields are updated only in case of success.
3301427a 2053
2054 Bool_t result = kTRUE;
2055
2056 for (UInt_t system=0; system<3; system++)
2057 {
2058 // check connection, in case connect
2059 if (!Connect(system))
2060 {
2061 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2062 GetSystemName(system)));
2063 result = kFALSE;
2064 continue;
2065 }
2066
2067 TTimeStamp now; // now
2068
2069 // Update all rows of the current run and detector (no loop on FXS list entries)
2070
2071 TString whereClause = Form("where run=%d and detector=\"%s\";",
2072 GetCurrentRun(), fCurrentDetector.Data());
2073
2074
2075 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2076 (Int_t) now.GetSec(), whereClause.Data());
2077
2078 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2079
2080 // Query execution
2081 TSQLResult* aResult;
2082 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2083 if (!aResult)
2084 {
2085 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2086 GetSystemName(system), sqlQuery.Data()));
2087 result = kFALSE;
2088 continue;
2089 }
2090 delete aResult;
2091 }
2092
2093 return result;
2094}
2095
2bb7b766 2096//______________________________________________________________________________________________
2097Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2098{
e7f62f16 2099 //
2100 // Update Shuttle logbook filling detector or shuttle_done column
2101 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2102 //
57f50b3c 2103
2bb7b766 2104 // check connection, in case connect
be48e3ea 2105 if(!Connect(3)){
2bb7b766 2106 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2107 return kFALSE;
57f50b3c 2108 }
2109
2bb7b766 2110 TString detName(detector);
2111 TString setClause;
e7f62f16 2112 if(detName == "shuttle_done")
2113 {
2bb7b766 2114 setClause = "set shuttle_done=1";
e7f62f16 2115
2116 // Send the information to ML
2117 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2118
2119 TList mlList;
2120 mlList.Add(&mlStatus);
2121
2122 fMonaLisa->SendParameters(&mlList);
2bb7b766 2123 } else {
2bb7b766 2124 TString statusStr(status);
2125 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2126 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2127 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2128 } else {
2129 Log("SHUTTLE",
2130 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2131 status, detector));
2132 return kFALSE;
2133 }
2134 }
57f50b3c 2135
2bb7b766 2136 TString whereClause = Form("where run=%d", GetCurrentRun());
2137
441b0e9c 2138 TString sqlQuery = Form("update %s %s %s",
2139 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2140
2bb7b766 2141 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2142
2143 // Query execution
2144 TSQLResult* aResult;
be48e3ea 2145 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2146 if (!aResult) {
2147 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2148 return kFALSE;
57f50b3c 2149 }
2bb7b766 2150 delete aResult;
57f50b3c 2151
2152 return kTRUE;
2153}
2154
2155//______________________________________________________________________________________________
2bb7b766 2156Int_t AliShuttle::GetCurrentRun() const
2157{
9827400b 2158 //
2159 // Get current run from logbook entry
2160 //
57f50b3c 2161
2bb7b766 2162 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2163}
2164
2165//______________________________________________________________________________________________
2bb7b766 2166UInt_t AliShuttle::GetCurrentStartTime() const
2167{
9827400b 2168 //
2169 // get current start time
2170 //
57f50b3c 2171
2bb7b766 2172 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2173}
2174
2175//______________________________________________________________________________________________
2bb7b766 2176UInt_t AliShuttle::GetCurrentEndTime() const
2177{
9827400b 2178 //
2179 // get current end time from logbook entry
2180 //
57f50b3c 2181
2bb7b766 2182 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2183}
2184
b948db8d 2185//______________________________________________________________________________________________
2186void AliShuttle::Log(const char* detector, const char* message)
2187{
9827400b 2188 //
2189 // Fill log string with a message
2190 //
b948db8d 2191
36c99a6a 2192 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2193 if (dir == NULL) {
36c99a6a 2194 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2195 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
84090f85 2196 return;
2197 }
b948db8d 2198
84090f85 2199 } else {
2200 gSystem->FreeDirectory(dir);
2201 }
b948db8d 2202
cb343cfd 2203 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2204 if (GetCurrentRun() >= 0)
2205 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2206 toLog += Form("%s", message);
2207
84090f85 2208 AliInfo(toLog.Data());
ffa29e93 2209
2210 // if we redirect the log output already to the file, leave here
2211 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2212 return;
b948db8d 2213
ffa29e93 2214 TString fileName = GetLogFileName(detector);
e7f62f16 2215
84090f85 2216 gSystem->ExpandPathName(fileName);
2217
2218 ofstream logFile;
2219 logFile.open(fileName, ofstream::out | ofstream::app);
2220
2221 if (!logFile.is_open()) {
2222 AliError(Form("Could not open file %s", fileName.Data()));
2223 return;
2224 }
7bfb2090 2225
84090f85 2226 logFile << toLog.Data() << "\n";
b948db8d 2227
84090f85 2228 logFile.close();
b948db8d 2229}
2bb7b766 2230
ffa29e93 2231//______________________________________________________________________________________________
2232TString AliShuttle::GetLogFileName(const char* detector) const
2233{
2234 //
2235 // returns the name of the log file for a given sub detector
2236 //
2237
2238 TString fileName;
2239
2240 if (GetCurrentRun() >= 0)
2241 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2242 else
2243 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2244
2245 return fileName;
2246}
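// Sketch of the naming rule implemented by GetLogFileName, in plain C++:
// "<logDir>/<detector>_<run>.log" while a run is being processed, and
// "<logDir>/<detector>.log" otherwise (run < 0). LogFileName is a
// hypothetical free function; the directory passed in is a placeholder for
// GetShuttleLogDir().

```cpp
#include <string>

// Hypothetical helper (not in AliShuttle): per-run log file name when a run
// number is available, plain per-detector log file name otherwise.
std::string LogFileName(const std::string& logDir,
                        const std::string& detector, int run)
{
    if (run >= 0)
        return logDir + "/" + detector + "_" + std::to_string(run) + ".log";
    return logDir + "/" + detector + ".log";
}
```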
2247
2bb7b766 2248//______________________________________________________________________________________________
2249Bool_t AliShuttle::Collect(Int_t run)
2250{
9827400b 2251 //
2252 // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
2253 // If a dedicated run is given this run is processed
2254 //
2255 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2256 //
2bb7b766 2257
eba76848 2258 if (run == -1)
2259 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2260 else
2261 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2262
2263 SetLastAction("Starting");
2bb7b766 2264
2265 TString whereClause("where shuttle_done=0");
eba76848 2266 if (run != -1)
2267 whereClause += Form(" and run=%d", run);
2bb7b766 2268
2269 TObjArray shuttleLogbookEntries;
be48e3ea 2270 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2271 {
cb343cfd 2272 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2273 return kFALSE;
2274 }
2275
9e080f92 2276 if (shuttleLogbookEntries.GetEntries() == 0)
2277 {
2278 if (run == -1)
2279 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2280 else
2281 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2282 "or it does not exist in Shuttle logbook", run));
2283 return kTRUE;
2284 }
2285
be48e3ea 2286 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2287 fFirstUnprocessed[iDet] = kTRUE;
2288
fc5a4708 2289 if (run != -1)
be48e3ea 2290 {
2291 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2292 // flag them into fFirstUnprocessed array
2293 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2294 TObjArray tmpLogbookEntries;
2295 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2296 {
2297 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2298 return kFALSE;
2299 }
2300
2301 TIter iter(&tmpLogbookEntries);
2302 AliShuttleLogbookEntry* anEntry = 0;
2303 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2304 {
2305 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2306 {
2307 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2308 {
2309 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2310 anEntry->GetRun(), GetDetName(iDet)));
2311 fFirstUnprocessed[iDet] = kFALSE;
2312 }
2313 }
2314
2315 }
2316
2317 }
2318
2319 if (!RetrieveConditionsData(shuttleLogbookEntries))
2320 {
cb343cfd 2321 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2322 return kFALSE;
2323 }
2324
36c99a6a 2325 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2326 return kTRUE;
2bb7b766 2327}
2328
2bb7b766 2329//______________________________________________________________________________________________
2330Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2331{
9827400b 2332 //
2333 // Retrieve conditions data for all runs that aren't processed yet
2334 //
2bb7b766 2335
2336 Bool_t hasError = kFALSE;
2337
2338 TIter iter(&dateEntries);
2339 AliShuttleLogbookEntry* anEntry;
2340
2341 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2342 if (!Process(anEntry)){
2343 hasError = kTRUE;
2344 }
4b95672b 2345
2346 // clean SHUTTLE temp directory
3301427a 2347 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2348 RemoveFile(filename.Data());
2bb7b766 2349 }
2350
2351 return hasError == kFALSE;
2352}
cb343cfd 2353
2354//______________________________________________________________________________________________
2355ULong_t AliShuttle::GetTimeOfLastAction() const
2356{
9827400b 2357 //
2358 // Gets time of last action
2359 //
2360
cb343cfd 2361 ULong_t tmp;
36c99a6a 2362
cb343cfd 2363 fMonitoringMutex->Lock();
be48e3ea 2364
cb343cfd 2365 tmp = fLastActionTime;
36c99a6a 2366
cb343cfd 2367 fMonitoringMutex->UnLock();
36c99a6a 2368
cb343cfd 2369 return tmp;
2370}
2371
2372//______________________________________________________________________________________________
2373const TString AliShuttle::GetLastAction() const
2374{
9827400b 2375 //
cb343cfd 2376 // returns a string description of the last action
9827400b 2377 //
cb343cfd 2378
2379 TString tmp;
36c99a6a 2380
cb343cfd 2381 fMonitoringMutex->Lock();
2382
2383 tmp = fLastAction;
2384
2385 fMonitoringMutex->UnLock();
2386
36c99a6a 2387 return tmp;
cb343cfd 2388}
2389
2390//______________________________________________________________________________________________
2391void AliShuttle::SetLastAction(const char* action)
2392{
9827400b 2393 //
cb343cfd 2394 // updates the monitoring variables
9827400b 2395 //
36c99a6a 2396
cb343cfd 2397 fMonitoringMutex->Lock();
36c99a6a 2398
cb343cfd 2399 fLastAction = action;
2400 fLastActionTime = time(0);
2401
2402 fMonitoringMutex->UnLock();
2403}
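// The three monitoring accessors above share one pattern: lock, copy or
// assign, unlock. A minimal sketch of the same pattern with a scoped lock,
// assuming C++11 and std::mutex in place of the ROOT mutex; ActionMonitor is
// a hypothetical stand-in class, not AliShuttle itself.

```cpp
#include <ctime>
#include <mutex>
#include <string>

// Hypothetical stand-in for the monitoring state kept by the Shuttle.
class ActionMonitor {
public:
    void SetLastAction(const std::string& action) {
        std::lock_guard<std::mutex> lock(fMutex);  // released on scope exit
        fLastAction = action;
        fLastActionTime = std::time(nullptr);
    }
    std::string GetLastAction() const {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastAction;  // copy is made while the lock is held
    }
    std::time_t GetTimeOfLastAction() const {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastActionTime;
    }
private:
    mutable std::mutex fMutex;        // mutable so const getters can lock
    std::string fLastAction;
    std::time_t fLastActionTime = 0;
};
```

// With a scoped lock there is no explicit unlock call to forget on an early
// return, which is the main reason one might prefer it over manual pairs.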
eba76848 2404
2405//______________________________________________________________________________________________
2406const char* AliShuttle::GetRunParameter(const char* param)
2407{
9827400b 2408 //
2409 // returns run parameter read from DAQ logbook
2410 //
eba76848 2411
2412 if(!fLogbookEntry) {
2413 AliError("No logbook entry!");
2414 return 0;
2415 }
2416
2417 return fLogbookEntry->GetRunParameter(param);
2418}
57c1a579 2419
d386d623 2420//______________________________________________________________________________________________
9827400b 2421AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
d386d623 2422{
9827400b 2423 //
2424 // returns object from OCDB valid for current run
2425 //
d386d623 2426
9827400b 2427 if (fTestMode & kErrorOCDB)
2428 {
2429 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2430 return 0;
2431 }
2432
d386d623 2433 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2434 if (!sto)
2435 {
9827400b 2436 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
d386d623 2437 return 0;
2438 }
2439
2440 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2441}
2442
57c1a579 2443//______________________________________________________________________________________________
2444Bool_t AliShuttle::SendMail()
2445{
9827400b 2446 //
2447 // sends a mail to the subdetector expert in case of preprocessor error
2448 //
2449
2450 if (fTestMode != kNone)
2451 return kTRUE;
57c1a579 2452
36c99a6a 2453 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
57c1a579 2454 if (dir == NULL)
2455 {
36c99a6a 2456 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
57c1a579 2457 {
36c99a6a 2458 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
57c1a579 2459 return kFALSE;
2460 }
2461
2462 } else {
2463 gSystem->FreeDirectory(dir);
2464 }
2465
2466 TString bodyFileName;
36c99a6a 2467 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
57c1a579 2468 gSystem->ExpandPathName(bodyFileName);
2469
2470 ofstream mailBody;
2471 mailBody.open(bodyFileName, ofstream::out);
2472
2473 if (!mailBody.is_open())
2474 {
2475 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2476 return kFALSE;
2477 }
2478
2479 TString to="";
2480 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2481 TObjString *anExpert=0;
2482 while ((anExpert = (TObjString*) iterExperts.Next()))
2483 {
2484 to += Form("%s,", anExpert->GetName());
2485 }
2486 if (!to.IsNull()) to.Remove(to.Length()-1);
909732f7 2487 AliDebug(2, Form("to: %s",to.Data()));
57c1a579 2488
86aa42c3 2489 if (to.IsNull()) {
36c99a6a 2490 AliInfo("List of detector responsibles not yet set!");
2491 return kFALSE;
2492 }
2493
57c1a579 2494 TString cc="alberto.colla@cern.ch";
2495
2496 TString subject = Form("%s Shuttle preprocessor error in run %d !",
2497 fCurrentDetector.Data(), GetCurrentRun());
909732f7 2498 AliDebug(2, Form("subject: %s", subject.Data()));
57c1a579 2499
2500 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2501 body += Form("SHUTTLE just detected that your preprocessor "
86aa42c3 2502 "exited with ERROR state in run %d!!\n\n", GetCurrentRun());
57c1a579 2503 body += Form("Please check %s status on the web page asap!\n\n", fCurrentDetector.Data());
2504 body += Form("The last 10 lines of %s log file follow:\n\n", fCurrentDetector.Data());
2505
909732f7 2506 AliDebug(2, Form("Body begin: %s", body.Data()));
57c1a579 2507
2508 mailBody << body.Data();
2509 mailBody.close();
2510 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2511
9d733021 2512 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2513 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2514 if (gSystem->Exec(tailCommand.Data()))
2515 {
2516 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2517 }
2518
2519 TString endBody = Form("------------------------------------------------------\n\n");
36c99a6a 2520 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2521 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
57c1a579 2522 endBody += "Sincerely yours,\n\n \t\t\tthe SHUTTLE\n";
2523
909732f7 2524 AliDebug(2, Form("Body end: %s", endBody.Data()));
57c1a579 2525
2526 mailBody << endBody.Data();
2527
2528 mailBody.close();
2529
2530 // send mail!
2531 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2532 subject.Data(),
2533 cc.Data(),
2534 to.Data(),
2535 bodyFileName.Data());
909732f7 2536 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
57c1a579 2537
2538 Int_t result = gSystem->Exec(mailCommand.Data());
2539
2540 return result == 0;
2541}
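// SendMail builds the recipient list by appending "address," for every expert
// and then removing the last character. A sketch of an alternative that never
// produces a trailing comma and is safe for an empty list; JoinRecipients is
// a hypothetical helper in plain C++ (C++11 assumed), not part of AliShuttle.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins the addresses with commas, placing the separator
// before every element except the first.
std::string JoinRecipients(const std::vector<std::string>& experts)
{
    std::string to;
    for (std::size_t i = 0; i < experts.size(); ++i) {
        if (i > 0) to += ",";
        to += experts[i];
    }
    return to;
}
```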
d386d623 2542
441b0e9c 2543//______________________________________________________________________________________________
9827400b 2544const char* AliShuttle::GetRunType()
441b0e9c 2545{
9827400b 2546 //
2547 // returns run type read from "run type" logbook
2548 //
441b0e9c 2549
2550 if(!fLogbookEntry) {
2551 AliError("No logbook entry!");
2552 return 0;
2553 }
2554
9827400b 2555 return fLogbookEntry->GetRunType();
441b0e9c 2556}
2557
d386d623 2558//______________________________________________________________________________________________
2559void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2560{
9827400b 2561 //
2562 // sets Shuttle temp directory
2563 //
d386d623 2564
2565 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2566}
2567
2568//______________________________________________________________________________________________
2569void AliShuttle::SetShuttleLogDir(const char* logDir)
2570{
9827400b 2571 //
2572 // sets Shuttle log directory
2573 //
d386d623 2574
2575 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2576}