73abe331 1/**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3 * *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
6 * *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
15
16/*
17$Log$
4a33bdd9 18Revision 1.44 2007/05/11 16:09:32 acolla
19Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
20example: ITS/SPD/100_filename.root
21
2d9019b4 22Revision 1.43 2007/05/10 09:59:51 acolla
23Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
24
546242fb 25Revision 1.42 2007/05/03 08:01:39 jgrosseo
26typo in last commit :-(
27
8b739301 28Revision 1.41 2007/05/03 08:00:48 jgrosseo
29fixing log message when pp want to skip dcs value retrieval
30
651fdaab 31Revision 1.40 2007/04/27 07:06:48 jgrosseo
32GetFileSources returns empty list in case of no files, but successful query
33No mails sent in testmode
34
86aa42c3 35Revision 1.39 2007/04/17 12:43:57 acolla
36Correction in StoreOCDB; change of text in mail to detector expert
37
26758fce 38Revision 1.38 2007/04/12 08:26:18 jgrosseo
39updated comment
40
3c2a21c8 41Revision 1.37 2007/04/10 16:53:14 jgrosseo
42redirecting sub detector stdout, stderr to sub detector log file
43
Revision 1.35 2007/04/04 16:26:38 acolla
1. Re-organization of function calls in TestPreprocessor to make it more meaningful.
2. Added missing dependency in test preprocessors.
3. In AliShuttle.cxx: processing time and memory consumption info on a single line.
48
Revision 1.34 2007/04/04 10:33:36 jgrosseo
1) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor has finished. The return type of these two functions has changed from UInt_t to Bool_t, which tells you whether the storing succeeded.
In case of an error with the Grid, the Shuttle will retry the storing later; the preprocessor does not need to be run again.

2) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.

3) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.

4) The memory usage of the preprocessor is monitored. If it exceeds 2 GB, the preprocessor is terminated.

5) New function AliPreprocessor::ProcessDCS(). If you do not need DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
If you always need DCS data (as before), you do not need to implement it.

6) The run type has been added to the monitoring page
63
9827400b 64Revision 1.33 2007/04/03 13:56:01 acolla
65Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
66run type.
67
3301427a 68Revision 1.32 2007/02/28 10:41:56 acolla
69Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
70AliPreprocessor::GetRunType() function.
71Added some ldap definition files.
72
d386d623 73Revision 1.30 2007/02/13 11:23:21 acolla
74Moved getters and setters of Shuttle's main OCDB/Reference, local
75OCDB/Reference, temp and log folders to AliShuttleInterface
76
9d733021 77Revision 1.27 2007/01/30 17:52:42 jgrosseo
78adding monalisa monitoring
79
e7f62f16 80Revision 1.26 2007/01/23 19:20:03 acolla
81Removed old ldif files, added TOF, MCH ldif files. Added some options in
82AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
83SetShuttleLogDir
84
36c99a6a 85Revision 1.25 2007/01/15 19:13:52 acolla
86Moved some AliInfo to AliDebug in SendMail function
87
fc5a4708 88Revision 1.21 2006/12/07 08:51:26 jgrosseo
89update (alberto):
90table, db names in ldap configuration
91added GRP preprocessor
92DCS data can also be retrieved by data point
93
2c15234c 94Revision 1.20 2006/11/16 16:16:48 jgrosseo
95introducing strict run ordering flag
96removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
97
be48e3ea 98Revision 1.19 2006/11/06 14:23:04 jgrosseo
99major update (Alberto)
100o) reading of run parameters from the logbook
101o) online offline naming conversion
102o) standalone DCSclient package
103
eba76848 104Revision 1.18 2006/10/20 15:22:59 jgrosseo
105o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
106o) Merging Collect, CollectAll, CollectNew function
107o) Removing implementation of empty copy constructors (declaration still there!)
108
cb343cfd 109Revision 1.17 2006/10/05 16:20:55 jgrosseo
110adapting to new CDB classes
111
6ec0e06c 112Revision 1.16 2006/10/05 15:46:26 jgrosseo
113applying to the new interface
114
481441a2 115Revision 1.15 2006/10/02 16:38:39 jgrosseo
116update (alberto):
117fixed memory leaks
118storing of objects that failed to be stored to the grid before
119interfacing of shuttle status table in daq system
120
2bb7b766 121Revision 1.14 2006/08/29 09:16:05 jgrosseo
122small update
123
85a80aa9 124Revision 1.13 2006/08/15 10:50:00 jgrosseo
125effc++ corrections (alberto)
126
4f0ab988 127Revision 1.12 2006/08/08 14:19:29 jgrosseo
128Update to shuttle classes (Alberto)
129
130- Possibility to set the full object's path in the Preprocessor's and
131Shuttle's Store functions
132- Possibility to extend the object's run validity in the same classes
133("startValidity" and "validityInfinite" parameters)
134- Implementation of the StoreReferenceData function to store reference
135data in a dedicated CDB storage.
136
84090f85 137Revision 1.11 2006/07/21 07:37:20 jgrosseo
138last run is stored after each run
139
7bfb2090 140Revision 1.10 2006/07/20 09:54:40 jgrosseo
141introducing status management: The processing per subdetector is divided into several steps,
142after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
143can keep track of the number of failures and skips further processing after a certain threshold is
144exceeded. These thresholds can be configured in LDAP.
145
Revision 1.9 2006/07/19 10:09:55 jgrosseo
new configuration, access to DAQ FES (Alberto)
148
57f50b3c 149Revision 1.8 2006/07/11 12:44:36 jgrosseo
150adding parameters for extended validity range of data produced by preprocessor
151
17111222 152Revision 1.7 2006/07/10 14:37:09 jgrosseo
153small fix + todo comment
154
Revision 1.6 2006/07/10 13:01:41 jgrosseo
enhanced storing of last successfully processed run (alberto)
157
a7160fe9 158Revision 1.5 2006/07/04 14:59:57 jgrosseo
159revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
160
45a493ce 161Revision 1.4 2006/06/12 09:11:16 jgrosseo
162coding conventions (Alberto)
163
58bc3020 164Revision 1.3 2006/06/06 14:26:40 jgrosseo
165o) removed files that were moved to STEER
166o) shuttle updated to follow the new interface (Alberto)
167
b948db8d 168Revision 1.2 2006/03/07 07:52:34 hristov
169New version (B.Yordanov)
170
d477ad88 171Revision 1.6 2005/11/19 17:19:14 byordano
172RetrieveDATEEntries and RetrieveConditionsData added
173
174Revision 1.5 2005/11/19 11:09:27 byordano
175AliShuttle declaration added
176
177Revision 1.4 2005/11/17 17:47:34 byordano
178TList changed to TObjArray
179
180Revision 1.3 2005/11/17 14:43:23 byordano
181import to local CVS
182
183Revision 1.1.1.1 2005/10/28 07:33:58 hristov
184Initial import as subdirectory in AliRoot
185
73abe331 186Revision 1.2 2005/09/13 08:41:15 byordano
187default startTime endTime added
188
189Revision 1.4 2005/08/30 09:13:02 byordano
190some docs added
191
192Revision 1.3 2005/08/29 21:15:47 byordano
193some docs added
194
195*/
196
//
// This class is the main manager for AliShuttle.
// It organizes the data retrieval from DCS and calls the
// interface methods of AliPreprocessor.
// For every detector in AliShuttleConfig (see AliShuttleConfig),
// data for its set of aliases is retrieved. If an AliPreprocessor
// is registered for this detector, it will be used
// according to the schema (see AliPreprocessor).
// If no AliPreprocessor is registered, the retrieved
// data is stored automatically to the underlying AliCDBStorage.
// The alias name is used as detSpec.
//
209
210#include "AliShuttle.h"
211
212#include "AliCDBManager.h"
213#include "AliCDBStorage.h"
214#include "AliCDBId.h"
84090f85 215#include "AliCDBRunRange.h"
216#include "AliCDBPath.h"
5164a766 217#include "AliCDBEntry.h"
73abe331 218#include "AliShuttleConfig.h"
eba76848 219#include "DCSClient/AliDCSClient.h"
73abe331 220#include "AliLog.h"
b948db8d 221#include "AliPreprocessor.h"
5164a766 222#include "AliShuttleStatus.h"
2bb7b766 223#include "AliShuttleLogbookEntry.h"
73abe331 224
57f50b3c 225#include <TSystem.h>
58bc3020 226#include <TObject.h>
b948db8d 227#include <TString.h>
57f50b3c 228#include <TTimeStamp.h>
73abe331 229#include <TObjString.h>
57f50b3c 230#include <TSQLServer.h>
231#include <TSQLResult.h>
232#include <TSQLRow.h>
cb343cfd 233#include <TMutex.h>
9827400b 234#include <TSystemDirectory.h>
235#include <TSystemFile.h>
236#include <TFileMerger.h>
237#include <TGrid.h>
238#include <TGridResult.h>
73abe331 239
e7f62f16 240#include <TMonaLisaWriter.h>
241
5164a766 242#include <fstream>
243
cb343cfd 244#include <sys/types.h>
245#include <sys/wait.h>
246
73abe331 247ClassImp(AliShuttle)
248
b948db8d 249//______________________________________________________________________________________________
250AliShuttle::AliShuttle(const AliShuttleConfig* config,
251 UInt_t timeout, Int_t retries):
4f0ab988 252fConfig(config),
253fTimeout(timeout), fRetries(retries),
254fPreprocessorMap(),
2bb7b766 255fLogbookEntry(0),
eba76848 256fCurrentDetector(),
85a80aa9 257fStatusEntry(0),
cb343cfd 258fMonitoringMutex(0),
eba76848 259fLastActionTime(0),
e7f62f16 260fLastAction(),
9827400b 261fMonaLisa(0),
262fTestMode(kNone),
ffa29e93 263fReadTestMode(kFALSE),
264fOutputRedirected(kFALSE)
73abe331 265{
266 //
267 // config: AliShuttleConfig used
73abe331 268 // timeout: timeout used for AliDCSClient connection
269 // retries: the number of retries in case of connection error.
270 //
271
57f50b3c 272 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
be48e3ea 273 for(int iSys=0;iSys<4;iSys++) {
57f50b3c 274 fServer[iSys]=0;
be48e3ea 275 if (iSys < 3)
2c15234c 276 fFXSlist[iSys].SetOwner(kTRUE);
57f50b3c 277 }
2bb7b766 278 fPreprocessorMap.SetOwner(kTRUE);
be48e3ea 279
280 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
281 fFirstUnprocessed[iDet] = kFALSE;
282
cb343cfd 283 fMonitoringMutex = new TMutex();
58bc3020 284}
285
b948db8d 286//______________________________________________________________________________________________
57f50b3c 287AliShuttle::~AliShuttle()
58bc3020 288{
9827400b 289 //
290 // destructor
291 //
58bc3020 292
b948db8d 293 fPreprocessorMap.DeleteAll();
be48e3ea 294 for(int iSys=0;iSys<4;iSys++)
57f50b3c 295 if(fServer[iSys]) {
296 fServer[iSys]->Close();
297 delete fServer[iSys];
eba76848 298 fServer[iSys] = 0;
57f50b3c 299 }
2bb7b766 300
301 if (fStatusEntry){
302 delete fStatusEntry;
303 fStatusEntry = 0;
304 }
cb343cfd 305
306 if (fMonitoringMutex)
307 {
308 delete fMonitoringMutex;
309 fMonitoringMutex = 0;
310 }
73abe331 311}
312
b948db8d 313//______________________________________________________________________________________________
57f50b3c 314void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
58bc3020 315{
	//
	// Registers a new AliPreprocessor.
	// GetName() is used as the identifier of the preprocessor.
	// The preprocessor is registered only if no other preprocessor
	// with the same identifier (GetName()) has been registered yet.
	//
322
eba76848 323 const char* detName = preprocessor->GetName();
324 if(GetDetPos(detName) < 0)
325 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
326
327 if (fPreprocessorMap.GetValue(detName)) {
328 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
73abe331 329 return;
330 }
331
eba76848 332 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
73abe331 333}
b948db8d 334//______________________________________________________________________________________________
3301427a 335Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
84090f85 336 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
73abe331 337{
9827400b 338 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
339 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
340 // using this function. Use StoreReferenceData instead!
341 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
342 // finishes the data are transferred to the main storage (Grid).
b948db8d 343
3301427a 344 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
84090f85 345}
346
347//______________________________________________________________________________________________
3301427a 348Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
84090f85 349{
	// Stores a CDB object in the storage for reference data. These objects will not be available during
	// offline reconstruction. Use this function for reference data only!
	// It calls the StoreLocally function, which temporarily stores the data locally; when the preprocessor
	// finishes, the data are transferred to the main storage (Grid).
85a80aa9 354
3301427a 355 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
85a80aa9 356}
357
358//______________________________________________________________________________________________
3301427a 359Bool_t AliShuttle::StoreLocally(const TString& localUri,
85a80aa9 360 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
361 Int_t validityStart, Bool_t validityInfinite)
362{
	// Stores the object temporarily in local storage. Parameters are passed by the Store and
	// StoreReferenceData functions. When the preprocessor finishes, the data are transferred
	// to the main storage (Grid).
	// The parameters are:
	//   1) URI of the backup storage (local)
	//   2) the object's path
	//   3) the object to be stored
	//   4) the metaData to be associated with the object
	//   5) the validity start run number w.r.t. the current run;
	//      if the data is valid only for this run, leave the default 0
	//   6) specifies whether the calibration data is valid until infinity (i.e. until updated),
	//      typical for calibration runs; the default is kFALSE
	//
	// Returns kFALSE on failure, kTRUE otherwise.
84090f85 376
9827400b 377 if (fTestMode & kErrorStorage)
378 {
379 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
380 return kFALSE;
381 }
382
3301427a 383 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
2bb7b766 384
85a80aa9 385 Int_t firstRun = GetCurrentRun() - validityStart;
84090f85 386 if(firstRun < 0) {
9827400b 387 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
84090f85 388 firstRun=0;
389 }
390
391 Int_t lastRun = -1;
392 if(validityInfinite) {
393 lastRun = AliCDBRunRange::Infinity();
394 } else {
395 lastRun = GetCurrentRun();
396 }
397
3301427a 398 // Version is set to current run, it will be used later to transfer data to Grid
399 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
2bb7b766 400
401 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
402 TObjString runUsed = Form("%d", GetCurrentRun());
9e080f92 403 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
2bb7b766 404 }
84090f85 405
3301427a 406 Bool_t result = kFALSE;
84090f85 407
3301427a 408 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
409 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
84090f85 410 } else {
3301427a 411 result = AliCDBManager::Instance()->GetStorage(localUri)
84090f85 412 ->Put(object, id, metaData);
413 }
414
415 if(!result) {
416
9827400b 417 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
3301427a 418 }
2bb7b766 419
3301427a 420 return result;
421}
84090f85 422
3301427a 423//______________________________________________________________________________________________
424Bool_t AliShuttle::StoreOCDB()
425{
9827400b 426 //
427 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
428 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
429 // Then calls StoreRefFilesToGrid to store reference files.
430 //
431
432 if (fTestMode & kErrorGrid)
433 {
434 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
435 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
436 return kFALSE;
437 }
438
2d9019b4 439 Log("SHUTTLE","Storing OCDB data ...");
86aa42c3 440 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
441
2d9019b4 442 Log("SHUTTLE","Storing reference data ...");
3301427a 443 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
9827400b 444
2d9019b4 445 Log("SHUTTLE","Storing reference files ...");
9827400b 446 Bool_t resultRefFiles = StoreRefFilesToGrid();
447
448 return resultCDB && resultRef && resultRefFiles;
3301427a 449}
450
451//______________________________________________________________________________________________
452Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
453{
454 //
455 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
456 //
457
458 TObjArray* gridIds=0;
459
460 Bool_t result = kTRUE;
461
462 const char* type = 0;
463 TString localURI;
464 if(gridURI == fgkMainCDB) {
465 type = "OCDB";
466 localURI = fgkLocalCDB;
467 } else if(gridURI == fgkMainRefStorage) {
468 type = "reference";
469 localURI = fgkLocalRefStorage;
470 } else {
471 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
472 return kFALSE;
473 }
474
475 AliCDBManager* man = AliCDBManager::Instance();
476
477 AliCDBStorage *gridSto = man->GetStorage(gridURI);
478 if(!gridSto) {
479 Log("SHUTTLE",
480 Form("StoreOCDB - cannot activate main %s storage", type));
481 return kFALSE;
482 }
483
484 gridIds = gridSto->GetQueryCDBList();
485
486 // get objects previously stored in local CDB
487 AliCDBStorage *localSto = man->GetStorage(localURI);
488 if(!localSto) {
489 Log("SHUTTLE",
490 Form("StoreOCDB - cannot activate local %s storage", type));
491 return kFALSE;
492 }
493 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
494 // Local objects were stored with current run as Grid version!
495 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
496 localEntries->SetOwner(1);
497
498 // loop on local stored objects
499 TIter localIter(localEntries);
500 AliCDBEntry *aLocEntry = 0;
501 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
502 aLocEntry->SetOwner(1);
503 AliCDBId aLocId = aLocEntry->GetId();
504 aLocEntry->SetVersion(-1);
505 aLocEntry->SetSubVersion(-1);
506
507 // If local object is valid up to infinity we store it only if it is
508 // the first unprocessed run!
509 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
510 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
511 {
512 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
513 "there are previous unprocessed runs!",
514 fCurrentDetector.Data(), aLocId.GetPath().Data()));
515 continue;
516 }
517
518 // loop on Grid valid Id's
519 Bool_t store = kTRUE;
520 TIter gridIter(gridIds);
521 AliCDBId* aGridId = 0;
522 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
523 if(aGridId->GetPath() != aLocId.GetPath()) continue;
524 // skip all objects valid up to infinity
525 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
526 // if we get here, it means there's already some more recent object stored on Grid!
527 store = kFALSE;
528 break;
529 }
530
		// Put the object on the Grid only if no more recent object was found there
		Bool_t storeOk = store && gridSto->Put(aLocEntry);
533 if(!store || storeOk){
534
535 if (!store)
536 {
537 Log(fCurrentDetector.Data(),
538 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
539 type, aGridId->ToString().Data()));
540 } else {
541 Log("SHUTTLE",
542 Form("StoreOCDB - Object <%s> successfully put into %s storage",
543 aLocId.ToString().Data(), type));
2d9019b4 544 Log(fCurrentDetector.Data(),
545 Form("StoreOCDB - Object <%s> successfully put into %s storage",
546 aLocId.ToString().Data(), type));
3301427a 547 }
84090f85 548
3301427a 549 // removing local filename...
550 TString filename;
551 localSto->IdToFilename(aLocId, filename);
552 AliInfo(Form("Removing local file %s", filename.Data()));
553 RemoveFile(filename.Data());
554 continue;
555 } else {
556 Log("SHUTTLE",
557 Form("StoreOCDB - Grid %s storage of object <%s> failed",
558 type, aLocId.ToString().Data()));
2d9019b4 559 Log(fCurrentDetector.Data(),
560 Form("StoreOCDB - Grid %s storage of object <%s> failed",
561 type, aLocId.ToString().Data()));
3301427a 562 result = kFALSE;
b948db8d 563 }
564 }
3301427a 565 localEntries->Clear();
2bb7b766 566
b948db8d 567 return result;
3301427a 568}
569
546242fb 570//______________________________________________________________________________________________
571Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
572{
2d9019b4 573 // clears the directory used to store reference files of a given subdetector
546242fb 574
575 AliCDBManager* man = AliCDBManager::Instance();
576 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
2d9019b4 577 TString localBaseFolder = sto->GetBaseFolder();
578
579 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
580
581 Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
582
583 TString begin;
584 begin.Form("%d_", GetCurrentRun());
585
586 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
587 if (!baseDir)
588 return kTRUE;
589
590 TList* dirList = baseDir->GetListOfFiles();
591 delete baseDir;
592
593 if (!dirList) return kTRUE;
594
595 if (dirList->GetEntries() < 3)
596 {
597 delete dirList;
598 return kTRUE;
599 }
600
601 Int_t nDirs = 0, nDel = 0;
602 TIter dirIter(dirList);
603 TSystemFile* entry = 0;
546242fb 604
2d9019b4 605 Bool_t success = kTRUE;
546242fb 606
2d9019b4 607 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
608 {
609 if (entry->IsDirectory())
610 continue;
611
612 TString fileName(entry->GetName());
613 if (!fileName.BeginsWith(begin))
614 continue;
615
616 nDirs++;
617
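		// delete file (the entry name is relative to targetDir, so build the full path)
		TString fullPath;
		fullPath.Form("%s/%s", targetDir.Data(), fileName.Data());
		Int_t result = gSystem->Unlink(fullPath.Data());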
620
621 if (result)
622 {
623 Log("SHUTTLE", Form("Could not delete file %s!", fileName.Data()));
624 success = kFALSE;
625 } else {
626 nDel++;
627 }
628 }
629
630 if(nDirs > 0)
631 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
632 nDel, nDirs, targetDir.Data()));
633
634
635 delete dirList;
636 return success;
663}
664
9827400b 665//______________________________________________________________________________________________
666Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
667{
668 //
3c2a21c8 669 // Stores reference file directly (without opening it). This function stores the file locally.
9827400b 670 //
3c2a21c8 671 // The file is stored under the following location:
672 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
673 // where <gridFileName> is the second parameter given to the function
674 //
9827400b 675
676 if (fTestMode & kErrorStorage)
677 {
678 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
679 return kFALSE;
680 }
681
682 AliCDBManager* man = AliCDBManager::Instance();
683 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
684
685 TString localBaseFolder = sto->GetBaseFolder();
686
2d9019b4 687 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
9827400b 688
	// try to open the folder; create it if it does not exist
	void* dir = gSystem->OpenDirectory(targetDir.Data());
	if (dir == NULL) {
		if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
			Log("SHUTTLE", Form("StoreReferenceFile - Can't create directory <%s>", targetDir.Data()));
			return kFALSE;
		}

	} else {
		gSystem->FreeDirectory(dir);
	}
700
9827400b 701 TString target;
702 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
703
546242fb 704 Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
9827400b 705 if (result)
706 {
546242fb 707 Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
708 return kFALSE;
9827400b 709 }
546242fb 710
9827400b 711 result = gSystem->CopyFile(localFile, target);
712
713 if (result == 0)
714 {
2d9019b4 715 Log("SHUTTLE", Form("StoreReferenceFile - File %s stored locally to %s", localFile, target.Data()));
9827400b 716 return kTRUE;
717 }
718 else
719 {
2d9019b4 720 Log("SHUTTLE", Form("StoreReferenceFile - Could not store file %s to %s!. Error code = %d",
546242fb 721 localFile, target.Data(), result));
9827400b 722 return kFALSE;
723 }
724}
725
726//______________________________________________________________________________________________
727Bool_t AliShuttle::StoreRefFilesToGrid()
728{
729 //
730 // Transfers the reference file to the Grid.
9827400b 731 //
86aa42c3 732 // The files are stored under the following location:
3c2a21c8 733 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
86aa42c3 734 //
9827400b 735
736 AliCDBManager* man = AliCDBManager::Instance();
737 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
738 if (!sto)
739 return kFALSE;
740 TString localBaseFolder = sto->GetBaseFolder();
741
2d9019b4 742 TString dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
743
9827400b 744 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
745 if (!gridSto)
746 return kFALSE;
2d9019b4 747
9827400b 748 TString gridBaseFolder = gridSto->GetBaseFolder();
2d9019b4 749
750 TString alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
9827400b 751
9827400b 752 TString begin;
753 begin.Form("%d_", GetCurrentRun());
754
755 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
3d8bc902 756 if (!baseDir)
757 return kTRUE;
758
2d9019b4 759 TList* dirList = baseDir->GetListOfFiles();
760 delete baseDir;
761
762 if (!dirList) return kTRUE;
763
764 if (dirList->GetEntries() < 3)
3d8bc902 765 {
2d9019b4 766 delete dirList;
9827400b 767 return kTRUE;
3d8bc902 768 }
2d9019b4 769
546242fb 770 if (!gGrid)
771 {
772 Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
2d9019b4 773 delete dirList;
546242fb 774 return kFALSE;
775 }
776
2d9019b4 777 Int_t nDirs = 0, nTransfer = 0;
778 TIter dirIter(dirList);
779 TSystemFile* entry = 0;
780
9827400b 781 Bool_t success = kTRUE;
3d8bc902 782 Bool_t first = kTRUE;
9827400b 783
2d9019b4 784 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
785 {
9827400b 786 if (entry->IsDirectory())
787 continue;
788
789 TString fileName(entry->GetName());
790 if (!fileName.BeginsWith(begin))
791 continue;
792
2d9019b4 793 nDirs++;
794
3d8bc902 795 if (first)
796 {
797 first = kFALSE;
798 // check that DET folder exists, otherwise create it
799 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
800
801 if (!result)
2d9019b4 802 {
803 delete dirList;
3d8bc902 804 return kFALSE;
2d9019b4 805 }
3d8bc902 806
546242fb 807 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
3d8bc902 808 {
809 if (!gGrid->Mkdir(alienDir.Data(),"",0))
810 {
811 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
812 alienDir.Data()));
2d9019b4 813 delete dirList;
3d8bc902 814 return kFALSE;
546242fb 815 } else {
816 Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
3d8bc902 817 }
818
546242fb 819 } else {
820 Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
3d8bc902 821 }
822 }
823
9827400b 824 TString fullLocalPath;
825 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
826
827 TString fullGridPath;
828 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
829
9827400b 830 TFileMerger fileMerger;
831 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
832
833 if (result)
834 {
2d9019b4 835 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded!", fullLocalPath.Data(), fullGridPath.Data()));
9827400b 836 RemoveFile(fullLocalPath);
2d9019b4 837 nTransfer++;
9827400b 838 }
839 else
840 {
2d9019b4 841 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s FAILED!", fullLocalPath.Data(), fullGridPath.Data()));
9827400b 842 success = kFALSE;
843 }
844 }
2d9019b4 845
846 Log("SHUTTLE", Form("StoreRefFilesToGrid - %d (over %d) reference files in folder %s copied to Grid.", nTransfer, nDirs, dir.Data()));
847
848
849 delete dirList;
9827400b 850 return success;
851}
852
2d9019b4 853//______________________________________________________________________________________________
854const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
855{
856 //
857 // Get folder name of reference files
858 //
859
	// NOTE: 'dir' is static so that the returned pointer remains valid after this
	// function returns. Callers should copy the result immediately; this is not thread-safe.
	TString offDetStr(GetOfflineDetName(detector));
	static TString dir;
	if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
	{
		dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
	} else {
		dir.Form("%s/%s", base, offDetStr.Data());
	}

	return dir.Data();
872}
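
// Illustrative sketch (not part of AliShuttle): the reference file layout described
// above, written with std::string return values so that no pointer to a temporary
// buffer is handed out. RefFilePrefix and RefFileTarget are hypothetical helpers;
// the offline/online detector names are passed in explicitly instead of being
// looked up via GetOfflineDetName.

```cpp
#include <cassert>
#include <string>

// Folder holding the reference files of one detector.
// offlineDet: offline detector name (e.g. "ITS"); onlineDet: online name (e.g. "SPD").
inline std::string RefFilePrefix(const std::string& base,
                                 const std::string& offlineDet,
                                 const std::string& onlineDet)
{
	assert(!base.empty());

	// ITS, MUON and PHOS keep one subfolder per online detector.
	if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS")
		return base + "/" + offlineDet + "/" + onlineDet;

	return base + "/" + offlineDet;
}

// Full target name of a reference file: <prefix>/<RUN#>_<gridFileName>
inline std::string RefFileTarget(const std::string& prefix, int run,
                                 const std::string& gridFileName)
{
	return prefix + "/" + std::to_string(run) + "_" + gridFileName;
}
```

// With a hypothetical base folder "/ref", run 100 and online detector SPD, this yields
// "/ref/ITS/SPD/100_filename.root", matching the example given in the revision 1.44 log entry.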
3301427a 873//______________________________________________________________________________________________
874void AliShuttle::CleanLocalStorage(const TString& uri)
875{
9827400b 876 //
877 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
878 //
3301427a 879
880 const char* type = 0;
881 if(uri == fgkLocalCDB) {
882 type = "OCDB";
883 } else if(uri == fgkLocalRefStorage) {
546242fb 884 type = "Reference";
3301427a 885 } else {
886 AliError(Form("Invalid storage URI: %s", uri.Data()));
887 return;
888 }
889
890 AliCDBManager* man = AliCDBManager::Instance();
b948db8d 891
3301427a 892 // open local storage
893 AliCDBStorage *localSto = man->GetStorage(uri);
894 if(!localSto) {
895 Log("SHUTTLE",
896 Form("CleanLocalStorage - cannot activate local %s storage", type));
897 return;
898 }
899
900 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
546242fb 901 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
3301427a 902
903 AliInfo(Form("filename = %s", filename.Data()));
904
905 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
906 GetCurrentRun(), fCurrentDetector.Data()));
907
908 RemoveFile(filename.Data());
909
910}
911
912//______________________________________________________________________________________________
913void AliShuttle::RemoveFile(const char* filename)
914{
9827400b 915 //
916 // removes local file
917 //
3301427a 918
919 TString command(Form("rm -f %s", filename));
920
921 Int_t result = gSystem->Exec(command.Data());
922 if(result != 0)
923 {
924 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
925 fCurrentDetector.Data(), filename));
926 }
73abe331 927}
928
b948db8d 929//______________________________________________________________________________________________
5164a766 930AliShuttleStatus* AliShuttle::ReadShuttleStatus()
931{
9827400b 932 //
933 // Reads the AliShuttleStatus from the CDB
934 //
5164a766 935
2bb7b766 936 if (fStatusEntry){
937 delete fStatusEntry;
938 fStatusEntry = 0;
939 }
5164a766 940
10a5a932 941 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
2bb7b766 942 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
5164a766 943
2bb7b766 944 if (!fStatusEntry) return 0;
945 fStatusEntry->SetOwner(1);
5164a766 946
2bb7b766 947 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
948 if (!status) {
949 AliError("Invalid object stored to CDB!");
950 return 0;
951 }
5164a766 952
2bb7b766 953 return status;
5164a766 954}
955
956//______________________________________________________________________________________________
7bfb2090 957Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
5164a766 958{
9827400b 959 //
960 // writes the status for one subdetector
961 //
2bb7b766 962
963 if (fStatusEntry){
964 delete fStatusEntry;
965 fStatusEntry = 0;
966 }
5164a766 967
2bb7b766 968 Int_t run = GetCurrentRun();
5164a766 969
2bb7b766 970 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
5164a766 971
2bb7b766 972 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
973 fStatusEntry->SetOwner(1);
5164a766 974
2bb7b766 975 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
7bfb2090 976
2bb7b766 977 if (!result) {
3301427a 978 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
979 fCurrentDetector.Data(), run));
2bb7b766 980 return kFALSE;
981 }
e7f62f16 982
983 SendMLInfo();
7bfb2090 984
2bb7b766 985 return kTRUE;
5164a766 986}
987
988//______________________________________________________________________________________________
989void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
990{
9827400b 991 //
992 // changes the AliShuttleStatus for the given detector and run to the given status
993 //
5164a766 994
2bb7b766 995 if (!fStatusEntry){
996 AliError("UNEXPECTED: fStatusEntry empty");
997 return;
998 }
5164a766 999
2bb7b766 1000 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
5164a766 1001
2bb7b766 1002 if (!status){
3301427a 1003 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
2bb7b766 1004 return;
1005 }
5164a766 1006
2c15234c 1007 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
eba76848 1008 fCurrentDetector.Data(),
36c99a6a 1009 status->GetStatusName(),
eba76848 1010 status->GetStatusName(newStatus));
cb343cfd 1011 Log("SHUTTLE", actionStr);
1012 SetLastAction(actionStr);
5164a766 1013
2bb7b766 1014 status->SetStatus(newStatus);
1015 if (increaseCount) status->IncreaseCount();
5164a766 1016
2bb7b766 1017 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
e7f62f16 1018
1019 SendMLInfo();
5164a766 1020}
e7f62f16 1021
1022//______________________________________________________________________________________________
1023void AliShuttle::SendMLInfo()
1024{
1025 //
1026 // sends MonaLisa (ML) information about the current status of the detector being processed
1027 //
1028
1029 AliShuttleStatus* status = fStatusEntry ? dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject()) : 0;
1030
1031 if (!status){
3301427a 1032 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
e7f62f16 1033 return;
1034 }
1035
1036 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1037 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1038
1039 TList mlList;
1040 mlList.Add(&mlStatus);
1041 mlList.Add(&mlRetryCount);
1042
1043 fMonaLisa->SendParameters(&mlList);
1044}
1045
5164a766 1046//______________________________________________________________________________________________
1047Bool_t AliShuttle::ContinueProcessing()
1048{
9827400b 1049 // this function reads the AliShuttleStatus information from CDB and
1050 // checks if the processing should be continued
1051 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 1052
57c1a579 1053 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1054
1055 AliPreprocessor* aPreprocessor =
1056 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1057 if (!aPreprocessor)
1058 {
1059 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
1060 return kFALSE;
1061 }
1062
2bb7b766 1063 AliShuttleLogbookEntry::Status entryStatus =
eba76848 1064 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 1065
1066 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 1067 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 1068 fCurrentDetector.Data(),
1069 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1070 return kFALSE;
1071 }
1072
1073 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 1074
1075 // check if current run is first unprocessed run for current detector
1076 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1077 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1078 {
86aa42c3 1079 if (fTestMode == kNone)
1080 {
1081 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
1082 return kFALSE;
1083 }
1084 else
1085 {
1086 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
1087 }
be48e3ea 1088 }
1089
2bb7b766 1090 AliShuttleStatus* status = ReadShuttleStatus();
1091 if (!status) {
1092 // first time
1093 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1094 fCurrentDetector.Data()));
1095 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1096 return WriteShuttleStatus(status);
1097 }
1098
1099 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1100 // If it happens it may mean Logbook updating failed... let's do it now!
1101 if (status->GetStatus() == AliShuttleStatus::kDone ||
1102 status->GetStatus() == AliShuttleStatus::kFailed){
1103 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1104 fCurrentDetector.Data(),
1105 status->GetStatusName(status->GetStatus())));
1106 UpdateShuttleLogbook(fCurrentDetector.Data(),
1107 status->GetStatusName(status->GetStatus()));
1108 return kFALSE;
1109 }
1110
3301427a 1111 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 1112 Log("SHUTTLE",
1113 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1114 fCurrentDetector.Data()));
9827400b 1115 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1116 if (StoreOCDB()){
3301427a 1117 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1118 fCurrentDetector.Data()));
2bb7b766 1119 UpdateShuttleStatus(AliShuttleStatus::kDone);
1120 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1121 } else {
1122 Log("SHUTTLE",
1123 Form("ContinueProcessing - %s: Grid storage failed again",
1124 fCurrentDetector.Data()));
9827400b 1125 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 1126 }
1127 return kFALSE;
1128 }
1129
1130 // if we get here, there is a restart
57c1a579 1131 Bool_t cont = kFALSE;
2bb7b766 1132
1133 // abort conditions
cb343cfd 1134 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 1135 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1136 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 1137 status->GetCount(), status->GetStatusName()));
1138 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 1139 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 1140
1141 // there may still be objects in local OCDB and reference storage
1142 // and FXS databases may not have been updated: do it now!
9827400b 1143
1144 // TODO Currently disabled, we want to keep files in case of failure!
1145 // CleanLocalStorage(fgkLocalCDB);
1146 // CleanLocalStorage(fgkLocalRefStorage);
1147 // UpdateTableFailCase();
1148
1149 // Send mail to detector expert!
1150 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1151 if (!SendMail())
1152 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1153 fCurrentDetector.Data()));
3301427a 1154
57c1a579 1155 } else {
1156 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1157 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1158 status->GetStatusName(), status->GetCount()));
9827400b 1159 Bool_t increaseCount = kTRUE;
1160 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1161 increaseCount = kFALSE;
1162 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 1163 cont = kTRUE;
2bb7b766 1164 }
1165
57c1a579 1166 return cont;
5164a766 1167}
1168
1169//______________________________________________________________________________________________
2bb7b766 1170Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 1171{
73abe331 1172 //
b948db8d 1173 // Makes data retrieval for all detectors in the configuration.
2bb7b766 1174 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1175 // (Unprocessed, Inactive, Failed or Done).
d477ad88 1176 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 1177 //
1178
9827400b 1179 if (!entry) return kFALSE;
2bb7b766 1180
1181 fLogbookEntry = entry;
1182
9827400b 1183 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1184 GetCurrentRun()));
2bb7b766 1185
e7f62f16 1186 // create ML instance that monitors this run
1187 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1188 // disable monitoring of other parameters that come e.g. from TFile
1189 gMonitoringWriter = 0;
2bb7b766 1190
e7f62f16 1191 // Send the information to ML
1192 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1193 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1194
1195 TList mlList;
1196 mlList.Add(&mlStatus);
9827400b 1197 mlList.Add(&mlRunType);
e7f62f16 1198
1199 fMonaLisa->SendParameters(&mlList);
3301427a 1200
9827400b 1201 if (fLogbookEntry->IsDone())
1202 {
1203 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1204 UpdateShuttleLogbook("shuttle_done");
1205 fLogbookEntry = 0;
1206 return kTRUE;
1207 }
1208
1209 // read test mode if flag is set
1210 if (fReadTestMode)
1211 {
3d8bc902 1212 fTestMode = kNone;
9827400b 1213 TString logEntry(entry->GetRunParameter("log"));
1214 //printf("log entry = %s\n", logEntry.Data());
1215 TString searchStr("Testmode: ");
1216 Int_t pos = logEntry.Index(searchStr.Data());
1217 //printf("%d\n", pos);
1218 if (pos >= 0)
1219 {
1220 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1221 //printf("%s\n", subStr.String().Data());
1222 TString newStr(subStr.Data());
1223 TObjArray* token = newStr.Tokenize(' ');
1224 if (token)
1225 {
1226 //token->Print();
1227 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1228 if (tmpStr)
1229 {
1230 Int_t testMode = tmpStr->String().Atoi();
1231 if (testMode > 0)
1232 {
1233 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1234 SetTestMode((TestMode) testMode);
1235 }
1236 }
1237 delete token;
1238 }
1239 }
1240 }
1241
3d8bc902 1242 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1243
eba76848 1244 fLogbookEntry->Print("all");
57f50b3c 1245
1246 // Initialization
d477ad88 1247 Bool_t hasError = kFALSE;
5164a766 1248
2bb7b766 1249 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1250 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1251 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1252 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1253
57f50b3c 1254 // Loop on detectors in the configuration
b948db8d 1255 TIter iter(fConfig->GetDetectors());
2bb7b766 1256 TObjString* aDetector = 0;
b948db8d 1257
be48e3ea 1258 while ((aDetector = (TObjString*) iter.Next()))
1259 {
7bfb2090 1260 fCurrentDetector = aDetector->String();
5164a766 1261
9e080f92 1262 if (ContinueProcessing() == kFALSE) continue;
1263
2bb7b766 1264 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1265 GetCurrentRun(), aDetector->GetName()));
1266
9d733021 1267 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1268
e7f62f16 1269 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1270
be48e3ea 1271 Int_t pid = fork();
1272
1273 if (pid < 0)
1274 {
1275 Log("SHUTTLE", "ERROR: Forking failed");
1276 }
1277 else if (pid > 0)
1278 {
1279 // parent
1280 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1281 GetCurrentRun(), aDetector->GetName()));
1282
1283 Long_t begin = time(0);
1284
1285 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1286 while (waitpid(pid, &status, WNOHANG) == 0)
1287 {
1288 Long_t expiredTime = time(0) - begin;
1289
1290 if (expiredTime > fConfig->GetPPTimeOut())
1291 {
9827400b 1292 TString tmp;
1293 tmp.Form("Process of %s time out. Run time: %ld seconds. Killing...",
1294 fCurrentDetector.Data(), expiredTime);
1295 Log("SHUTTLE", tmp);
1296 Log(fCurrentDetector, tmp);
be48e3ea 1297
1298 kill(pid, 9);
1299
3301427a 1300 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1301 hasError = kTRUE;
1302
1303 gSystem->Sleep(1000);
1304 }
1305 else
1306 {
be48e3ea 1307 gSystem->Sleep(1000);
9827400b 1308
1309 TString checkStr;
1310 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1311 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1312 if (!pipe)
1313 {
1314 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1315 continue;
1316 }
1317
1318 char buffer[100];
1319 if (!fgets(buffer, 100, pipe))
1320 {
1321 Log("SHUTTLE", "Error: ps did not return anything");
1322 gSystem->ClosePipe(pipe);
1323 continue;
1324 }
1325 gSystem->ClosePipe(pipe);
1326
1327 //Log("SHUTTLE", Form("ps returned %s", buffer));
1328
1329 Int_t mem = 0;
1330 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1331 {
1332 Log("SHUTTLE", "Error: Could not parse output of ps");
1333 continue;
1334 }
1335
1336 if (expiredTime % 60 == 0)
886d60e6 1337 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1338 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1339
1340 if (mem > fConfig->GetPPMaxMem())
1341 {
1342 TString tmp;
1343 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1344 mem, fConfig->GetPPMaxMem());
1345 Log("SHUTTLE", tmp);
1346 Log(fCurrentDetector, tmp);
1347
1348 kill(pid, 9);
1349
1350 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1351 hasError = kTRUE;
1352
1353 gSystem->Sleep(1000);
1354 }
be48e3ea 1355 }
1356 }
1357
1358 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1359 GetCurrentRun(), aDetector->GetName()));
1360
1361 if (WIFEXITED(status))
1362 {
1363 Int_t returnCode = WEXITSTATUS(status);
1364
3301427a 1365 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1366 returnCode));
be48e3ea 1367
9827400b 1368 if (returnCode == 0) hasError = kTRUE; // the client exits with 'success' (1 = OK), so 0 signals a failure
be48e3ea 1369 }
1370 }
1371 else if (pid == 0)
1372 {
1373 // client
1374 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1375
ffa29e93 1376 AliInfo("Redirecting output...");
1377
546242fb 1378 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
ffa29e93 1379 {
1380 Log("SHUTTLE", "Could not freopen stdout");
1381 }
1382 else
1383 {
1384 fOutputRedirected = kTRUE;
1385 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1386 Log("SHUTTLE", "Could not redirect stderr");
1387
1388 }
1389
9827400b 1390 Bool_t success = ProcessCurrentDetector();
1391 if (success) // Preprocessor finished successfully!
1392 {
3301427a 1393 // Update time_processed field in FXS DB
1394 if (UpdateTable() == kFALSE)
1395 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1396
1397 // Transfer the data from local storage to main storage (Grid)
1398 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1399 if (StoreOCDB() == kFALSE)
1400 {
1401 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1402 GetCurrentRun(), aDetector->GetName()));
1403 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1404 success = kFALSE;
3301427a 1405 } else {
1406 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1407 GetCurrentRun(), aDetector->GetName()));
1408 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1409 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1410 }
be48e3ea 1411 }
1412
4b95672b 1413 for (UInt_t iSys=0; iSys<3; iSys++)
1414 {
1415 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1416 }
1417
be48e3ea 1418 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1419 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1420
1421 // the client exits here
9827400b 1422 gSystem->Exit(success);
be48e3ea 1423
1424 AliError("We should never get here!!!");
1425 }
7bfb2090 1426 }
5164a766 1427
2bb7b766 1428 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1429 GetCurrentRun()));
1430
1431 //check if shuttle is done for this run, if so update logbook
1432 TObjArray checkEntryArray;
1433 checkEntryArray.SetOwner(1);
9e080f92 1434 TString whereClause = Form("where run=%d", GetCurrentRun());
1435 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1436 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1437 GetCurrentRun()));
1438 return hasError == kFALSE;
1439 }
b948db8d 1440
9e080f92 1441 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1442 (checkEntryArray.At(0));
2bb7b766 1443
9e080f92 1444 if (checkEntry)
1445 {
1446 if (checkEntry->IsDone())
be48e3ea 1447 {
9e080f92 1448 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1449 UpdateShuttleLogbook("shuttle_done");
1450 }
1451 else
1452 {
1453 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1454 {
9e080f92 1455 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1456 {
9e080f92 1457 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1458 checkEntry->GetRun(), GetDetName(iDet)));
1459 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1460 }
1461 }
2bb7b766 1462 }
1463 }
1464
e7f62f16 1465 // remove ML instance
1466 delete fMonaLisa;
1467 fMonaLisa = 0;
1468
2bb7b766 1469 fLogbookEntry = 0;
85a80aa9 1470
a7160fe9 1471 return hasError == kFALSE;
73abe331 1472}
1473
b948db8d 1474//______________________________________________________________________________________________
9827400b 1475Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1476{
1477 //
2bb7b766 1478 // Makes data retrieval just for a specific detector (fCurrentDetector).
73abe331 1479 // There should be a configuration for this detector.
73abe331 1480
2bb7b766 1481 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1482
2d9019b4 1483 if (!CleanReferenceStorage(fCurrentDetector.Data()))
546242fb 1484 return kFALSE;
1485
2c15234c 1486 TMap dcsMap;
1487 dcsMap.SetOwner(1);
73abe331 1488
85a80aa9 1489 Bool_t aDCSError = kFALSE;
3301427a 1490
1491 // call preprocessor
1492 AliPreprocessor* aPreprocessor =
1493 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1494
1495 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1496
1497 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1498
651fdaab 1499 if (!processDCS)
1500 {
1501 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1502 }
8b739301 1503 else if (fTestMode & kSkipDCS)
2c15234c 1504 {
3d8bc902 1505 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
9827400b 1506 }
1507 else if (fTestMode & kErrorDCS)
1508 {
3d8bc902 1509 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1510 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1511 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1512 return kFALSE;
2c15234c 1513 } else {
3301427a 1514
1515 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1516
2c15234c 1517 TString host(fConfig->GetDCSHost(fCurrentDetector));
1518 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1519
1520 // Retrieval of Aliases
1521 TObjString* anAlias = 0;
546242fb 1522 Int_t iAlias = 0;
36c99a6a 1523 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
2c15234c 1524 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1525 while ((anAlias = (TObjString*) iterAliases.Next()))
1526 {
1527 TObjArray *valueSet = new TObjArray();
1528 valueSet->SetOwner(1);
1529
546242fb 1530 iAlias++;
2c15234c 1531 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1532
1533 if(!aDCSError)
1534 {
2d9019b4 1535 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1536 AliInfo(Form("Alias %s (%d of %d) - %d values collected",
1537 anAlias->GetName(), iAlias, nTotAliases, valueSet->GetEntriesFast()));
2c15234c 1538 dcsMap.Add(anAlias->Clone(), valueSet);
1539 } else {
1540 Log(fCurrentDetector,
1541 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1542 anAlias->GetName()));
1543 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1544 dcsMap.DeleteAll(); delete valueSet; // valueSet was not added to dcsMap, so free it here
9827400b 1545 return kFALSE;
2c15234c 1546 }
4f0ab988 1547 }
2c15234c 1548
1549 // Retrieval of Data Points
1550 TObjString* aDP = 0;
36c99a6a 1551 Int_t iDP = 0;
1552 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
2c15234c 1553 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1554 while ((aDP = (TObjString*) iterDP.Next()))
1555 {
1556 TObjArray *valueSet = new TObjArray();
1557 valueSet->SetOwner(1);
36c99a6a 1558 if ((iDP++ % 500) == 0 || iDP == nTotDPs) // count this DP; log the first, every 500th and the last one
1559 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1560 aDP->GetName(), iDP, nTotDPs));
2c15234c 1561 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1562
1563 if(!aDCSError)
1564 {
1565 dcsMap.Add(aDP->Clone(), valueSet);
1566 } else {
1567 Log(fCurrentDetector,
1568 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1569 aDP->GetName()));
1570 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1571 dcsMap.DeleteAll(); delete valueSet; // valueSet was not added to dcsMap, so free it here
9827400b 1572 return kFALSE;
2c15234c 1573 }
73abe331 1574 }
1575 }
b948db8d 1576
2bb7b766 1577 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1578 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1579
3301427a 1580 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
b948db8d 1581
3301427a 1582 if (returnValue > 0) // Preprocessor error!
1583 {
9827400b 1584 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
cb343cfd 1585 UpdateShuttleStatus(AliShuttleStatus::kPPError);
9827400b 1586 dcsMap.DeleteAll();
1587 return kFALSE;
1588 }
1589
1590 // preprocessor ok!
1591 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1592 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1593 fCurrentDetector.Data()));
b948db8d 1594
2c15234c 1595 dcsMap.DeleteAll();
b948db8d 1596
9827400b 1597 return kTRUE;
2bb7b766 1598}
1599
1600//______________________________________________________________________________________________
1601Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1602 TObjArray& entries)
1603{
9827400b 1604 // Queries DAQ's Shuttle logbook and fills the detector status objects.
1605 // Calls QueryRunParameters to query the DAQ logbook for the run parameters.
1606 //
2bb7b766 1607
fc5a4708 1608 entries.SetOwner(1);
1609
2bb7b766 1610 // check connection; connect if necessary
be48e3ea 1611 if(!Connect(3)) return kFALSE;
2bb7b766 1612
1613 TString sqlQuery;
441b0e9c 1614 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1615
be48e3ea 1616 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1617 if (!aResult) {
1618 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1619 return kFALSE;
1620 }
1621
fc5a4708 1622 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1623
2bb7b766 1624 if(aResult->GetRowCount() == 0) {
9827400b 1625 AliInfo("No entries in Shuttle Logbook match request");
1626 delete aResult;
1627 return kTRUE;
2bb7b766 1628 }
1629
1630 // TODO Check field count!
fc5a4708 1631 const UInt_t nCols = 22;
2bb7b766 1632 if (aResult->GetFieldCount() != (Int_t) nCols) {
1633 AliError("Invalid SQL result field number!");
1634 delete aResult;
1635 return kFALSE;
1636 }
1637
2bb7b766 1638 TSQLRow* aRow;
1639 while ((aRow = aResult->Next())) {
1640 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1641 Int_t run = runString.Atoi();
1642
eba76848 1643 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1644 if (!entry) {
1645 delete aRow; continue; } // free the row before skipping this run
2bb7b766 1646
1647 // loop on detectors
eba76848 1648 for(UInt_t ii = 0; ii < nCols; ii++)
1649 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1650
eba76848 1651 entries.AddLast(entry);
2bb7b766 1652 delete aRow;
1653 }
1654
2bb7b766 1655 delete aResult;
1656 return kTRUE;
1657}
1658
1659//______________________________________________________________________________________________
eba76848 1660AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1661{
eba76848 1662 //
1663 // Retrieves the run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
1664 //
2bb7b766 1665
1666 // check connection; connect if necessary
be48e3ea 1667 if (!Connect(3))
eba76848 1668 return 0;
2bb7b766 1669
1670 TString sqlQuery;
2c15234c 1671 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1672
be48e3ea 1673 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1674 if (!aResult) {
1675 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1676 return 0;
2bb7b766 1677 }
1678
eba76848 1679 if (aResult->GetRowCount() == 0) {
2bb7b766 1680 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1681 delete aResult;
eba76848 1682 return 0;
2bb7b766 1683 }
1684
eba76848 1685 if (aResult->GetRowCount() > 1) {
2bb7b766 1686 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1687 delete aResult;
eba76848 1688 return 0;
2bb7b766 1689 }
1690
eba76848 1691 TSQLRow* aRow = aResult->Next();
1692 if (!aRow)
1693 {
1694 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1695 delete aResult;
1696 return 0;
1697 }
2bb7b766 1698
eba76848 1699 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1700
eba76848 1701 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1702 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1703
eba76848 1704 UInt_t startTime = entry->GetStartTime();
1705 UInt_t endTime = entry->GetEndTime();
1706
1707 if (!startTime || !endTime || startTime > endTime) {
1708 Log("SHUTTLE",
1709 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d",
1710 run, startTime, endTime));
1711 delete entry;
2bb7b766 1712 delete aRow;
eba76848 1713 delete aResult;
1714 return 0;
2bb7b766 1715 }
1716
eba76848 1717 delete aRow;
2bb7b766 1718 delete aResult;
eba76848 1719
1720 return entry;
2bb7b766 1721}
1722
b948db8d 1723//______________________________________________________________________________________________
2c15234c 1724Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1725 TObjArray* valueSet, DCSType type)
73abe331 1726{
9827400b 1727 // Retrieve all "entry" data points from the DCS server
1728 // host, port: TSocket connection parameters
1729 // entry: name of the alias or data point
1730 // valueSet: array of retrieved AliDCSValue's
1731 // type: kAlias or kDP
58bc3020 1732
73abe331 1733 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1734 if (!client.IsConnected())
1735 {
b948db8d 1736 return kFALSE;
73abe331 1737 }
1738
2c15234c 1739 Int_t result=0;
73abe331 1740
2c15234c 1741 if (type == kAlias)
1742 {
1743 result = client.GetAliasValues(entry,
1744 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1745 } else
1746 if (type == kDP)
1747 {
1748 result = client.GetDPValues(entry,
1749 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1750 }
1751
1752 if (result < 0)
1753 {
2bb7b766 1754 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1755 entry, AliDCSClient::GetErrorString(result)));
73abe331 1756
2c15234c 1757 if (result == AliDCSClient::fgkServerError)
1758 {
2bb7b766 1759 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1760 client.GetServerError().Data()));
1761 }
1762
1763 return kFALSE;
1764 }
1765
1766 return kTRUE;
1767}
b948db8d 1768
1769//______________________________________________________________________________________________
57f50b3c 1770const char* AliShuttle::GetFile(Int_t system, const char* detector,
1771 const char* id, const char* source)
b948db8d 1772{
9827400b 1773 // Get calibration file from file exchange servers
1774 // First queries the FXS database for the file name, using the run, detector, id and source info
1775 // then calls RetrieveFile(filename) for actual copy to local disk
1776 // run: current run being processed (given by Logbook entry fLogbookEntry)
1777 // detector: the Preprocessor name
1778 // id: provided as a parameter by the Preprocessor
1779 // source: provided by the Preprocessor through GetFileSources function
1780
1781 // check if test mode should simulate a FXS error
1782 if (fTestMode & kErrorFXSFiles)
1783 {
1784 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1785 return 0;
1786 }
1787
57f50b3c 1788 // check connection; connect if not yet connected
9d733021 1789 if (!Connect(system))
eba76848 1790 {
9d733021 1791 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1792 return 0;
1793 }
1794
1795 // Query preparation
9d733021 1796 TString sourceName(source);
d386d623 1797 Int_t nFields = 3;
1798 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1799 fConfig->GetFXSdbTable(system));
1800 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1801 GetCurrentRun(), detector, id);
1802
9d733021 1803 if (system == kDAQ)
1804 {
d386d623 1805 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1806 }
9d733021 1807 else if (system == kDCS)
eba76848 1808 {
9d733021 1809 sourceName="none";
57f50b3c 1810 }
9d733021 1811 else if (system == kHLT)
9e080f92 1812 {
d386d623 1813 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1814 nFields = 3;
9e080f92 1815 }
1816
9e080f92 1817 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1818
1819 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1820
1821 // Query execution
1822 TSQLResult* aResult = 0;
9d733021 1823 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1824 if (!aResult) {
9d733021 1825 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1826 GetSystemName(system), id, sourceName.Data()));
9e080f92 1827 return 0;
1828 }
1829
1830 if(aResult->GetRowCount() == 0)
1831 {
1832 Log(detector,
9d733021 1833 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1834 GetSystemName(system), id, sourceName.Data()));
9e080f92 1835 delete aResult;
1836 return 0;
1837 }
2bb7b766 1838
9e080f92 1839 if (aResult->GetRowCount() > 1) {
1840 Log(detector,
9d733021 1841 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1842 GetSystemName(system), id, sourceName.Data()));
9e080f92 1843 delete aResult;
1844 return 0;
1845 }
1846
9d733021 1847 if (aResult->GetFieldCount() != nFields) {
9e080f92 1848 Log(detector,
9d733021 1849 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1850 GetSystemName(system), id, sourceName.Data()));
9e080f92 1851 delete aResult;
1852 return 0;
1853 }
1854
1855 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1856
1857 if (!aRow){
9d733021 1858 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1859 GetSystemName(system), id, sourceName.Data()));
9e080f92 1860 delete aResult;
1861 return 0;
1862 }
1863
1864 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1865 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1866 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1867
1868 delete aResult;
1869 delete aRow;
1870
d386d623 1871 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1872 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1873
9e080f92 1874 // retrieved file is renamed to make it unique
9d733021 1875 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1876 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1877
9e080f92 1878
9d733021 1879 // file retrieval from FXS
4b95672b 1880 UInt_t nRetries = 0;
1881 UInt_t maxRetries = 3;
1882 Bool_t result = kFALSE;
1883
1884 // copy!! if successful TSystem::Exec returns 0
1885 while(nRetries++ < maxRetries) {
1886 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1887 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1888 if(!result)
1889 {
1890 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
9d733021 1891 filePath.Data(), GetSystemName(system)));
4b95672b 1892 continue;
1893 } else {
1894 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1895 filePath.Data(), GetSystemName(system),
1896 GetShuttleTempDir(), localFileName.Data()));
1897 }
9e080f92 1898
d386d623 1899 if (fileChecksum.Length()>0)
4b95672b 1900 {
1901 // compare md5sum of local file with the one stored in the FXS DB
1902 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
d386d623 1903 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 1904
4b95672b 1905 if (md5Comp != 0)
1906 {
1907 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1908 filePath.Data()));
1909 result = kFALSE;
1910 continue;
1911 }
d386d623 1912 } else {
1913 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1914 filePath.Data(), GetSystemName(system)));
9d733021 1915 }
4b95672b 1916 if (result) break;
9e080f92 1917 }
1918
4b95672b 1919 if(!result) return 0;
1920
9d733021 1921 fFXSCalled[system]=kTRUE;
1922 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1923 fFXSlist[system].Add(fileParams);
9e080f92 1924
1925 static TString fullLocalFileName;
36c99a6a 1926 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1927
9e080f92 1928 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1929
1930 return fullLocalFileName.Data();
2bb7b766 1931
1932}
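// Sketch (not part of AliShuttle): the copy loop in GetFile above retries the transfer up to
// maxRetries times and accepts a copy only if its checksum matches the one stored in the FXS db.
// The helper below restates that pattern with the standard library only. RetryWithCheck and its
// callback parameters are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: calls fetch() up to maxRetries times and returns true as soon as
// one attempt both succeeds and passes verify(). attemptsUsed reports how many attempts ran.
bool RetryWithCheck(unsigned maxRetries,
                    const std::function<bool()>& fetch,
                    const std::function<bool()>& verify,
                    unsigned* attemptsUsed = nullptr)
{
    assert(maxRetries > 0);
    unsigned attempts = 0;
    bool ok = false;
    while (attempts < maxRetries && !ok) {
        ++attempts;
        if (!fetch()) continue;   // transfer failed: try again
        ok = verify();            // transfer worked: accept only if the check passes
    }
    if (attemptsUsed) *attemptsUsed = attempts;
    return ok;
}
```

// In GetFile the role of fetch() is played by RetrieveFile and that of verify() by the md5sum
// comparison; a missing checksum in the database counts as a passed check there.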
1933
1934//______________________________________________________________________________________________
9d733021 1935Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 1936{
9827400b 1937 //
1938 // Copies file from FXS to local Shuttle machine
1939 //
2bb7b766 1940
9e080f92 1941 // check temp directory: trying to cd to temp; if it does not exist, create it
9d733021 1942 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1943 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 1944
36c99a6a 1945 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 1946 if (dir == NULL) {
36c99a6a 1947 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1948 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
9e080f92 1949 return kFALSE;
1950 }
1951
1952 } else {
1953 gSystem->FreeDirectory(dir);
1954 }
1955
9d733021 1956 TString baseFXSFolder;
1957 if (system == kDAQ)
1958 {
1959 baseFXSFolder = "FES/";
1960 }
1961 else if (system == kDCS)
1962 {
1963 baseFXSFolder = "";
1964 }
1965 else if (system == kHLT)
1966 {
1967 baseFXSFolder = "~/";
1968 }
1969
1970
1971 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1972 fConfig->GetFXSPort(system),
1973 fConfig->GetFXSUser(system),
1974 fConfig->GetFXSHost(system),
1975 baseFXSFolder.Data(),
1976 fxsFileName,
36c99a6a 1977 GetShuttleTempDir(),
9e080f92 1978 localFileName);
1979
1980 AliDebug(2, Form("%s",command.Data()));
1981
4b95672b 1982 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 1983
4b95672b 1984 return result;
9e080f92 1985}
1986
1987//______________________________________________________________________________________________
9d733021 1988TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1989{
9827400b 1990 //
1991 // Get sources producing the condition file Id from file exchange servers
4a33bdd9 1992 // if id is NULL all sources are returned (distinct)
9827400b 1993 //
1994
1995 // check if test mode should simulate a FXS error
1996 if (fTestMode & kErrorFXSSources)
1997 {
1998 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1999 return 0;
2000 }
2001
9d733021 2002
2003 if (system == kDCS)
2004 {
2005 AliError("DCS system has only one source of data!");
2006 return NULL;
9d733021 2007 }
9e080f92 2008
2009 // check connection; connect if not yet connected
9d733021 2010 if (!Connect(system))
2011 {
4a33bdd9 2012 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
9d733021 2013 return NULL;
9e080f92 2014 }
2015
9d733021 2016 TString sourceName = 0;
2017 if (system == kDAQ)
2018 {
2019 sourceName = "DAQsource";
2020 } else if (system == kHLT)
2021 {
2022 sourceName = "DDLnumbers";
2023 }
2024
4a33bdd9 2025 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2026 TString whereClause = Form("run=%d and detector=\"%s\"",
2027 GetCurrentRun(), detector);
2028 if (id)
2029 whereClause += Form(" and fileId=\"%s\"", id);
9e080f92 2030 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2031
2032 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2033
2034 // Query execution
2035 TSQLResult* aResult;
9d733021 2036 aResult = fServer[system]->Query(sqlQuery);
9e080f92 2037 if (!aResult) {
9d733021 2038 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2039 GetSystemName(system), id));
9e080f92 2040 return 0;
2041 }
2042
86aa42c3 2043 TList *list = new TList();
2044 list->SetOwner(1);
2045
9d733021 2046 if (aResult->GetRowCount() == 0)
2047 {
9e080f92 2048 Log(detector,
9d733021 2049 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
9e080f92 2050 delete aResult;
86aa42c3 2051 return list;
9e080f92 2052 }
2053
2054 TSQLRow* aRow;
9e080f92 2055
9d733021 2056 while ((aRow = aResult->Next()))
2057 {
9e080f92 2058
9d733021 2059 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2060 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2061 list->Add(new TObjString(source));
9e080f92 2062 delete aRow;
2063 }
9d733021 2064
9e080f92 2065 delete aResult;
2066
2067 return list;
2bb7b766 2068}
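// Sketch (not part of AliShuttle): how the query in GetFileSources above is assembled, with the
// fileId filter added only when an id is given. BuildSourcesQuery is a hypothetical stand-in
// for the Form() calls and uses std::string instead of TString. As in the original, the values
// are pasted into the statement unescaped.

```cpp
#include <string>

// Hypothetical helper: builds the "select distinct ..." statement used to list sources.
// Pass an empty id to list the sources of all file ids for the given run and detector.
std::string BuildSourcesQuery(const std::string& sourceColumn, const std::string& table,
                              int run, const std::string& detector, const std::string& id)
{
    std::string query = "select distinct " + sourceColumn + " from " + table + " where";
    query += " run=" + std::to_string(run) + " and detector=\"" + detector + "\"";
    if (!id.empty())
        query += " and fileId=\"" + id + "\"";   // optional filter on the file id
    return query;
}
```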
2069
4a33bdd9 2070//______________________________________________________________________________________________
2071TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2072{
2073 //
2074 // Get all ids of condition files produced by a given source from file exchange servers
2075 //
2076
2077 // check if test mode should simulate a FXS error
2078 if (fTestMode & kErrorFXSSources)
2079 {
2080 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2081 return 0;
2082 }
2083
2084 // check connection; connect if not yet connected
2085 if (!Connect(system))
2086 {
2087 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2088 return NULL;
2089 }
2090
2091 TString sourceName = 0;
2092 if (system == kDAQ)
2093 {
2094 sourceName = "DAQsource";
2095 } else if (system == kHLT)
2096 {
2097 sourceName = "DDLnumbers";
2098 }
2099
2100 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2101 TString whereClause = Form("run=%d and detector=\"%s\"",
2102 GetCurrentRun(), detector);
2103 if (sourceName.Length() > 0 && source)
2104 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2105 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2106
2107 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2108
2109 // Query execution
2110 TSQLResult* aResult;
2111 aResult = fServer[system]->Query(sqlQuery);
2112 if (!aResult) {
2113 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2114 GetSystemName(system), source));
2115 return 0;
2116 }
2117
2118 TList *list = new TList();
2119 list->SetOwner(1);
2120
2121 if (aResult->GetRowCount() == 0)
2122 {
2123 Log(detector,
2124 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2125 delete aResult;
2126 return list;
2127 }
2128
2129 TSQLRow* aRow;
2130
2131 while ((aRow = aResult->Next()))
2132 {
2133
2134 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2135 AliDebug(2, Form("fileId = %s", id.Data()));
2136 list->Add(new TObjString(id));
2137 delete aRow;
2138 }
2139
2140 delete aResult;
2141
2142 return list;
2143}
2144
2bb7b766 2145//______________________________________________________________________________________________
9d733021 2146Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 2147{
9827400b 2148 // Connect to MySQL Server of the system's FXS MySQL databases
2149 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2150 //
57f50b3c 2151
9d733021 2152 // check connection: if already connected return
2153 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 2154
9d733021 2155 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 2156
9d733021 2157 if (system < 3) // FXS db servers
2158 {
2159 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2160 dbUser = fConfig->GetFXSdbUser(system);
2161 dbPass = fConfig->GetFXSdbPass(system);
2162 dbName = fConfig->GetFXSdbName(system);
2163 } else { // Run & Shuttle logbook servers
2164 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2165 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2166 dbUser = fConfig->GetDAQlbUser();
2167 dbPass = fConfig->GetDAQlbPass();
2168 dbName = fConfig->GetDAQlbDB();
2169 }
57f50b3c 2170
9d733021 2171 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2172 if (!fServer[system] || !fServer[system]->IsConnected()) {
2173 if(system < 3)
2174 {
2175 AliError(Form("Can't establish connection to FXS database for %s",
2176 AliShuttleInterface::GetSystemName(system)));
2177 } else {
2178 AliError("Can't establish connection to Run logbook.");
57f50b3c 2179 }
9d733021 2180 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; } // reset so the next Connect() call does not dereference a deleted server
2181 return kFALSE;
2bb7b766 2182 }
57f50b3c 2183
9d733021 2184 // Get tables
2185 TSQLResult* aResult=0;
2186 switch(system){
2187 case kDAQ:
2188 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2189 break;
2190 case kDCS:
2191 aResult = fServer[kDCS]->GetTables(dbName.Data());
2192 break;
2193 case kHLT:
2194 aResult = fServer[kHLT]->GetTables(dbName.Data());
2195 break;
2196 default:
2197 aResult = fServer[3]->GetTables(dbName.Data());
2198 break;
2199 }
2200
2201 delete aResult;
2bb7b766 2202 return kTRUE;
2203}
57f50b3c 2204
9e080f92 2205//______________________________________________________________________________________________
9d733021 2206Bool_t AliShuttle::UpdateTable()
9e080f92 2207{
9827400b 2208 //
2209 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2210 //
9e080f92 2211
9d733021 2212 Bool_t result = kTRUE;
9e080f92 2213
9d733021 2214 for (UInt_t system=0; system<3; system++)
2215 {
2216 if(!fFXSCalled[system]) continue;
9e080f92 2217
9d733021 2218 // check connection; connect if not yet connected
2219 if (!Connect(system))
2220 {
2221 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2222 result = kFALSE;
2223 continue;
9e080f92 2224 }
9e080f92 2225
9d733021 2226 TTimeStamp now; // now
2227
2228 // Loop on FXS list entries
2229 TIter iter(&fFXSlist[system]);
2230 TObjString *aFXSentry=0;
2231 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2232 {
2233 TString aFXSentrystr = aFXSentry->String();
2234 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2235 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2236 {
2237 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2238 GetSystemName(system), aFXSentrystr.Data()));
2239 if(aFXSarray) delete aFXSarray;
2240 result = kFALSE;
2241 continue;
2242 }
2243 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2244 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2245
2246 TString whereClause;
2247 if (system == kDAQ)
2248 {
2249 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2250 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2251 }
2252 else if (system == kDCS)
2253 {
2254 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2255 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2256 }
2257 else if (system == kHLT)
2258 {
2259 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2260 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2261 }
2262
2263 delete aFXSarray;
9e080f92 2264
9d733021 2265 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2266 now.GetSec(), whereClause.Data());
9e080f92 2267
9d733021 2268 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2269
9d733021 2270 // Query execution
2271 TSQLResult* aResult;
2272 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2273 if (!aResult)
2274 {
2275 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2276 GetSystemName(system), sqlQuery.Data()));
2277 result = kFALSE;
2278 continue;
2279 }
2280 delete aResult;
9e080f92 2281 }
9e080f92 2282 }
2283
9d733021 2284 return result;
9e080f92 2285}
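// Sketch (not part of AliShuttle): UpdateTable above splits each "id#!?!#source" entry with
// TString::Tokenize, which treats its argument as a set of delimiter characters rather than as
// one separator string. The hypothetical helper SplitOnAnyOf below mimics that behaviour with
// std::string to show the consequence: an id or source that itself contains '#', '!' or '?'
// produces more than two tokens and is rejected by the GetEntries() != 2 check.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits text at every character contained in delims and drops
// empty tokens, which is how a delimiter set (as opposed to a separator string) behaves.
std::vector<std::string> SplitOnAnyOf(const std::string& text, const std::string& delims)
{
    std::vector<std::string> tokens;
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = text.find_first_of(delims, start);
        tokens.push_back(text.substr(start, end - start));   // end == npos takes the rest
        start = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```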
57f50b3c 2286
3301427a 2287//______________________________________________________________________________________________
2288Bool_t AliShuttle::UpdateTableFailCase()
2289{
9827400b 2290 // Update the FXS table, filling the time_processed field in all rows corresponding to the current run and detector.
2291 // This is called when the preprocessor is declared failed for the current run, because
2292 // otherwise the fields are updated only in case of success.
3301427a 2293
2294 Bool_t result = kTRUE;
2295
2296 for (UInt_t system=0; system<3; system++)
2297 {
2298 // check connection; connect if not yet connected
2299 if (!Connect(system))
2300 {
2301 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2302 GetSystemName(system)));
2303 result = kFALSE;
2304 continue;
2305 }
2306
2307 TTimeStamp now; // now
2308
2309 // No loop on FXS list entries here: all rows of the current run and detector are updated
2310
2311 TString whereClause = Form("where run=%d and detector=\"%s\";",
2312 GetCurrentRun(), fCurrentDetector.Data());
2313
2314
2315 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2316 now.GetSec(), whereClause.Data());
2317
2318 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2319
2320 // Query execution
2321 TSQLResult* aResult;
2322 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2323 if (!aResult)
2324 {
2325 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2326 GetSystemName(system), sqlQuery.Data()));
2327 result = kFALSE;
2328 continue;
2329 }
2330 delete aResult;
2331 }
2332
2333 return result;
2334}
2335
2bb7b766 2336//______________________________________________________________________________________________
2337Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2338{
e7f62f16 2339 //
2340 // Update Shuttle logbook filling detector or shuttle_done column
2341 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2342 //
57f50b3c 2343
2bb7b766 2344 // check connection; connect if not yet connected
be48e3ea 2345 if(!Connect(3)){
2bb7b766 2346 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2347 return kFALSE;
57f50b3c 2348 }
2349
2bb7b766 2350 TString detName(detector);
2351 TString setClause;
e7f62f16 2352 if(detName == "shuttle_done")
2353 {
2bb7b766 2354 setClause = "set shuttle_done=1";
e7f62f16 2355
2356 // Send the information to ML
2357 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2358
2359 TList mlList;
2360 mlList.Add(&mlStatus);
2361
2362 fMonaLisa->SendParameters(&mlList);
2bb7b766 2363 } else {
2bb7b766 2364 TString statusStr(status);
2365 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2366 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2367 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2368 } else {
2369 Log("SHUTTLE",
2370 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2371 status, detector));
2372 return kFALSE;
2373 }
2374 }
57f50b3c 2375
2bb7b766 2376 TString whereClause = Form("where run=%d", GetCurrentRun());
2377
441b0e9c 2378 TString sqlQuery = Form("update %s %s %s",
2379 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2380
2bb7b766 2381 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2382
2383 // Query execution
2384 TSQLResult* aResult;
be48e3ea 2385 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2386 if (!aResult) {
2387 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2388 return kFALSE;
57f50b3c 2389 }
2bb7b766 2390 delete aResult;
57f50b3c 2391
2392 return kTRUE;
2393}
2394
2395//______________________________________________________________________________________________
2bb7b766 2396Int_t AliShuttle::GetCurrentRun() const
2397{
9827400b 2398 //
2399 // Get current run from logbook entry
2400 //
57f50b3c 2401
2bb7b766 2402 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2403}
2404
2405//______________________________________________________________________________________________
2bb7b766 2406UInt_t AliShuttle::GetCurrentStartTime() const
2407{
9827400b 2408 //
2409 // get current start time
2410 //
57f50b3c 2411
2bb7b766 2412 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2413}
2414
2415//______________________________________________________________________________________________
2bb7b766 2416UInt_t AliShuttle::GetCurrentEndTime() const
2417{
9827400b 2418 //
2419 // get current end time from logbook entry
2420 //
57f50b3c 2421
2bb7b766 2422 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2423}
2424
b948db8d 2425//______________________________________________________________________________________________
2426void AliShuttle::Log(const char* detector, const char* message)
2427{
9827400b 2428 //
2429 // Fill log string with a message
2430 //
b948db8d 2431
36c99a6a 2432 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2433 if (dir == NULL) {
36c99a6a 2434 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2435 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
84090f85 2436 return;
2437 }
b948db8d 2438
84090f85 2439 } else {
2440 gSystem->FreeDirectory(dir);
2441 }
b948db8d 2442
cb343cfd 2443 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2444 if (GetCurrentRun() >= 0)
2445 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2446 toLog += Form("%s", message);
2447
84090f85 2448 AliInfo(toLog.Data());
ffa29e93 2449
2450 // if we redirect the log output already to the file, leave here
2451 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2452 return;
b948db8d 2453
ffa29e93 2454 TString fileName = GetLogFileName(detector);
e7f62f16 2455
84090f85 2456 gSystem->ExpandPathName(fileName);
2457
2458 ofstream logFile;
2459 logFile.open(fileName, ofstream::out | ofstream::app);
2460
2461 if (!logFile.is_open()) {
2462 AliError(Form("Could not open file %s", fileName.Data()));
2463 return;
2464 }
7bfb2090 2465
84090f85 2466 logFile << toLog.Data() << "\n";
b948db8d 2467
84090f85 2468 logFile.close();
b948db8d 2469}
2bb7b766 2470
ffa29e93 2471//______________________________________________________________________________________________
2472TString AliShuttle::GetLogFileName(const char* detector) const
2473{
2474 //
2475 // returns the name of the log file for a given sub detector
2476 //
2477
2478 TString fileName;
2479
2480 if (GetCurrentRun() >= 0)
2481 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2482 else
2483 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2484
2485 return fileName;
2486}
2487
2bb7b766 2488//______________________________________________________________________________________________
2489Bool_t AliShuttle::Collect(Int_t run)
2490{
9827400b 2491 //
2492 // Collects conditions data for all UNPROCESSED runs written to the DAQ LogBook if run = -1 (default)
2493 // If a specific run is given, only that run is processed
2494 //
2495 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2496 //
2bb7b766 2497
eba76848 2498 if (run == -1)
2499 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2500 else
2501 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2502
2503 SetLastAction("Starting");
2bb7b766 2504
2505 TString whereClause("where shuttle_done=0");
eba76848 2506 if (run != -1)
2507 whereClause += Form(" and run=%d", run);
2bb7b766 2508
2509 TObjArray shuttleLogbookEntries;
be48e3ea 2510 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2511 {
cb343cfd 2512 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2513 return kFALSE;
2514 }
2515
9e080f92 2516 if (shuttleLogbookEntries.GetEntries() == 0)
2517 {
2518 if (run == -1)
2519 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2520 else
2521 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2522 "or it does not exist in Shuttle logbook", run));
2523 return kTRUE;
2524 }
2525
be48e3ea 2526 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2527 fFirstUnprocessed[iDet] = kTRUE;
2528
fc5a4708 2529 if (run != -1)
be48e3ea 2530 {
2531 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2532 // flag them into fFirstUnprocessed array
2533 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2534 TObjArray tmpLogbookEntries;
2535 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2536 {
2537 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2538 return kFALSE;
2539 }
2540
2541 TIter iter(&tmpLogbookEntries);
2542 AliShuttleLogbookEntry* anEntry = 0;
2543 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2544 {
2545 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2546 {
2547 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2548 {
2549 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2550 anEntry->GetRun(), GetDetName(iDet)));
2551 fFirstUnprocessed[iDet] = kFALSE;
2552 }
2553 }
2554
2555 }
2556
2557 }
2558
2559 if (!RetrieveConditionsData(shuttleLogbookEntries))
2560 {
cb343cfd 2561 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2562 return kFALSE;
2563 }
2564
36c99a6a 2565 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2566 return kTRUE;
2bb7b766 2567}
2568
2bb7b766 2569//______________________________________________________________________________________________
2570Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2571{
9827400b 2572 //
2573 // Retrieve conditions data for all runs that aren't processed yet
2574 //
2bb7b766 2575
2576 Bool_t hasError = kFALSE;
2577
2578 TIter iter(&dateEntries);
2579 AliShuttleLogbookEntry* anEntry;
2580
2581 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2582 if (!Process(anEntry)){
2583 hasError = kTRUE;
2584 }
4b95672b 2585
2586 // clean SHUTTLE temp directory
3301427a 2587 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2588 RemoveFile(filename.Data());
2bb7b766 2589 }
2590
2591 return hasError == kFALSE;
2592}
cb343cfd 2593
2594//______________________________________________________________________________________________
2595ULong_t AliShuttle::GetTimeOfLastAction() const
2596{
9827400b 2597 //
2598 // Gets time of last action
2599 //
2600
cb343cfd 2601 ULong_t tmp;
36c99a6a 2602
cb343cfd 2603 fMonitoringMutex->Lock();
be48e3ea 2604
cb343cfd 2605 tmp = fLastActionTime;
36c99a6a 2606
cb343cfd 2607 fMonitoringMutex->UnLock();
36c99a6a 2608
cb343cfd 2609 return tmp;
2610}
2611
2612//______________________________________________________________________________________________
2613const TString AliShuttle::GetLastAction() const
2614{
9827400b 2615 //
cb343cfd 2616 // returns a string description of the last action
9827400b 2617 //
cb343cfd 2618
2619 TString tmp;
36c99a6a 2620
cb343cfd 2621 fMonitoringMutex->Lock();
2622
2623 tmp = fLastAction;
2624
2625 fMonitoringMutex->UnLock();
2626
36c99a6a 2627 return tmp;
cb343cfd 2628}
2629
2630//______________________________________________________________________________________________
2631void AliShuttle::SetLastAction(const char* action)
2632{
9827400b 2633 //
cb343cfd 2634 // updates the monitoring variables
9827400b 2635 //
36c99a6a 2636
cb343cfd 2637 fMonitoringMutex->Lock();
36c99a6a 2638
cb343cfd 2639 fLastAction = action;
2640 fLastActionTime = time(0);
2641
2642 fMonitoringMutex->UnLock();
2643}
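// Sketch (not part of AliShuttle): the three accessors above all follow the same pattern of
// Lock(), copy or assign, UnLock(). The class below shows the same idea with a scoped lock from
// the standard library, so the mutex is released on every return path. LastActionMonitor and
// its members are hypothetical names; the original uses fMonitoringMutex with explicit calls.

```cpp
#include <ctime>
#include <mutex>
#include <string>

// Hypothetical stand-in for the fLastAction / fLastActionTime pair guarded by a mutex.
class LastActionMonitor {
public:
    void Set(const std::string& action) {
        std::lock_guard<std::mutex> lock(fMutex);   // released automatically on return
        fAction = action;
        fTime = std::time(nullptr);
    }
    std::string Get() const {
        std::lock_guard<std::mutex> lock(fMutex);
        return fAction;                             // copied while the lock is still held
    }
    std::time_t GetTime() const {
        std::lock_guard<std::mutex> lock(fMutex);
        return fTime;
    }
private:
    mutable std::mutex fMutex;   // mutable so the const getters can lock it
    std::string fAction;
    std::time_t fTime = 0;
};
```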
eba76848 2644
2645//______________________________________________________________________________________________
2646const char* AliShuttle::GetRunParameter(const char* param)
2647{
9827400b 2648 //
2649 // returns run parameter read from DAQ logbook
2650 //
eba76848 2651
2652 if(!fLogbookEntry) {
2653 AliError("No logbook entry!");
2654 return 0;
2655 }
2656
2657 return fLogbookEntry->GetRunParameter(param);
2658}
57c1a579 2659
d386d623 2660//______________________________________________________________________________________________
9827400b 2661AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
d386d623 2662{
9827400b 2663 //
2664 // returns object from OCDB valid for current run
2665 //
d386d623 2666
9827400b 2667 if (fTestMode & kErrorOCDB)
2668 {
2669 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2670 return 0;
2671 }
2672
d386d623 2673 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2674 if (!sto)
2675 {
9827400b 2676 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
d386d623 2677 return 0;
2678 }
2679
2680 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2681}
2682
57c1a579 2683//______________________________________________________________________________________________
2684Bool_t AliShuttle::SendMail()
2685{
9827400b 2686 //
2687 // sends a mail to the subdetector expert in case of preprocessor error
2688 //
2689
2690 if (fTestMode != kNone)
2691 return kTRUE;
57c1a579 2692
36c99a6a 2693 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
57c1a579 2694 if (dir == NULL)
2695 {
36c99a6a 2696 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
57c1a579 2697 {
36c99a6a 2698 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
57c1a579 2699 return kFALSE;
2700 }
2701
2702 } else {
2703 gSystem->FreeDirectory(dir);
2704 }
2705
2706 TString bodyFileName;
36c99a6a 2707 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
57c1a579 2708 gSystem->ExpandPathName(bodyFileName);
2709
2710 ofstream mailBody;
2711 mailBody.open(bodyFileName, ofstream::out);
2712
2713 if (!mailBody.is_open())
2714 {
2715 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2716 return kFALSE;
2717 }
2718
2719 TString to="";
2720 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2721 TObjString *anExpert=0;
2722 while ((anExpert = (TObjString*) iterExperts.Next()))
2723 {
2724 to += Form("%s,", anExpert->GetName());
2725 }
2726 if (to.Length() > 0) to.Remove(to.Length()-1); // strip the trailing comma, if any
909732f7 2727 AliDebug(2, Form("to: %s",to.Data()));
57c1a579 2728
86aa42c3 2729 if (to.IsNull()) {
36c99a6a 2730 AliInfo("List of detector responsibles not yet set!");
2731 return kFALSE;
2732 }
2733
57c1a579 2734 TString cc="alberto.colla@cern.ch";
2735
546242fb 2736 TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
57c1a579 2737 fCurrentDetector.Data(), GetCurrentRun());
909732f7 2738 AliDebug(2, Form("subject: %s", subject.Data()));
57c1a579 2739
2740 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2741 body += Form("SHUTTLE just detected that your preprocessor "
546242fb 2742 "failed processing run %d!!\n\n", GetCurrentRun());
2743 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
2744 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
2745 body += Form("Find the %s log for the current run on \n\n"
2746 "\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n",
2747 fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2748 body += Form("The last 10 lines of %s log file are following:\n\n", fCurrentDetector.Data());
2749
909732f7 2750 AliDebug(2, Form("Body begin: %s", body.Data()));
57c1a579 2751
2752 mailBody << body.Data();
2753 mailBody.close();
2754 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2755
9d733021 2756 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2757 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2758 if (gSystem->Exec(tailCommand.Data()))
2759 {
2760 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2761 }
2762
2763 TString endBody = Form("------------------------------------------------------\n\n");
36c99a6a 2764 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2765 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
546242fb 2766 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
57c1a579 2767
909732f7 2768 AliDebug(2, Form("Body end: %s", endBody.Data()));
57c1a579 2769
2770 mailBody << endBody.Data();
2771
2772 mailBody.close();
2773
2774 // send mail!
2775 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2776 subject.Data(),
2777 cc.Data(),
2778 to.Data(),
2779 bodyFileName.Data());
909732f7 2780 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
57c1a579 2781
2782 Int_t result = gSystem->Exec(mailCommand.Data()); // Exec returns the command's exit status
2783
2784 return result == 0;
2785}
d386d623 2786
441b0e9c 2787//______________________________________________________________________________________________
9827400b 2788const char* AliShuttle::GetRunType()
441b0e9c 2789{
9827400b 2790 //
2791 // returns run type read from "run type" logbook
2792 //
441b0e9c 2793
2794 if(!fLogbookEntry) {
2795 AliError("No logbook entry!");
2796 return 0;
2797 }
2798
9827400b 2799 return fLogbookEntry->GetRunType();
441b0e9c 2800}
2801
d386d623 2802//______________________________________________________________________________________________
2803void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2804{
9827400b 2805 //
2806 // sets Shuttle temp directory
2807 //
d386d623 2808
2809 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2810}
2811
2812//______________________________________________________________________________________________
2813void AliShuttle::SetShuttleLogDir(const char* logDir)
2814{
9827400b 2815 //
2816 // sets Shuttle log directory
2817 //
d386d623 2818
2819 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2820}