Fix the Copy function
[u/mrichter/AliRoot.git] / SHUTTLE / AliShuttle.cxx
73abe331 1/**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3 * *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
6 * *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
15
16/*
17$Log$
546242fb 18Revision 1.42 2007/05/03 08:01:39 jgrosseo
19typo in last commit :-(
20
8b739301 21Revision 1.41 2007/05/03 08:00:48 jgrosseo
22fixing log message when pp want to skip dcs value retrieval
23
651fdaab 24Revision 1.40 2007/04/27 07:06:48 jgrosseo
25GetFileSources returns empty list in case of no files, but successful query
26No mails sent in testmode
27
86aa42c3 28Revision 1.39 2007/04/17 12:43:57 acolla
29Correction in StoreOCDB; change of text in mail to detector expert
30
26758fce 31Revision 1.38 2007/04/12 08:26:18 jgrosseo
32updated comment
33
3c2a21c8 34Revision 1.37 2007/04/10 16:53:14 jgrosseo
35redirecting sub detector stdout, stderr to sub detector log file
36
3d8bc902 37Revision 1.35 2007/04/04 16:26:38 acolla
381. Re-organization of function calls in TestPreprocessor to make it more meaningful.
392. Added missing dependency in test preprocessors.
403. in AliShuttle.cxx: processing time and memory consumption info on a single line.
41
886d60e6 42Revision 1.34 2007/04/04 10:33:36 jgrosseo
431) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
44In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
45
462) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
47
483) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
49
504) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
51
525) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
53If you always need DCS data (like before), you do not need to implement it.
54
556) The run type has been added to the monitoring page
56
9827400b 57Revision 1.33 2007/04/03 13:56:01 acolla
58Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
59run type.
60
3301427a 61Revision 1.32 2007/02/28 10:41:56 acolla
62Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
63AliPreprocessor::GetRunType() function.
64Added some ldap definition files.
65
d386d623 66Revision 1.30 2007/02/13 11:23:21 acolla
67Moved getters and setters of Shuttle's main OCDB/Reference, local
68OCDB/Reference, temp and log folders to AliShuttleInterface
69
9d733021 70Revision 1.27 2007/01/30 17:52:42 jgrosseo
71adding monalisa monitoring
72
e7f62f16 73Revision 1.26 2007/01/23 19:20:03 acolla
74Removed old ldif files, added TOF, MCH ldif files. Added some options in
75AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
76SetShuttleLogDir
77
36c99a6a 78Revision 1.25 2007/01/15 19:13:52 acolla
79Moved some AliInfo to AliDebug in SendMail function
80
fc5a4708 81Revision 1.21 2006/12/07 08:51:26 jgrosseo
82update (alberto):
83table, db names in ldap configuration
84added GRP preprocessor
85DCS data can also be retrieved by data point
86
2c15234c 87Revision 1.20 2006/11/16 16:16:48 jgrosseo
88introducing strict run ordering flag
89removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
90
be48e3ea 91Revision 1.19 2006/11/06 14:23:04 jgrosseo
92major update (Alberto)
93o) reading of run parameters from the logbook
94o) online offline naming conversion
95o) standalone DCSclient package
96
eba76848 97Revision 1.18 2006/10/20 15:22:59 jgrosseo
98o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
99o) Merging Collect, CollectAll, CollectNew function
100o) Removing implementation of empty copy constructors (declaration still there!)
101
cb343cfd 102Revision 1.17 2006/10/05 16:20:55 jgrosseo
103adapting to new CDB classes
104
6ec0e06c 105Revision 1.16 2006/10/05 15:46:26 jgrosseo
106applying to the new interface
107
481441a2 108Revision 1.15 2006/10/02 16:38:39 jgrosseo
109update (alberto):
110fixed memory leaks
111storing of objects that failed to be stored to the grid before
112interfacing of shuttle status table in daq system
113
2bb7b766 114Revision 1.14 2006/08/29 09:16:05 jgrosseo
115small update
116
85a80aa9 117Revision 1.13 2006/08/15 10:50:00 jgrosseo
118effc++ corrections (alberto)
119
4f0ab988 120Revision 1.12 2006/08/08 14:19:29 jgrosseo
121Update to shuttle classes (Alberto)
122
123- Possibility to set the full object's path in the Preprocessor's and
124Shuttle's Store functions
125- Possibility to extend the object's run validity in the same classes
126("startValidity" and "validityInfinite" parameters)
127- Implementation of the StoreReferenceData function to store reference
128data in a dedicated CDB storage.
129
84090f85 130Revision 1.11 2006/07/21 07:37:20 jgrosseo
131last run is stored after each run
132
7bfb2090 133Revision 1.10 2006/07/20 09:54:40 jgrosseo
134introducing status management: The processing per subdetector is divided into several steps,
135after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
136can keep track of the number of failures and skips further processing after a certain threshold is
137exceeded. These thresholds can be configured in LDAP.
138
5164a766 139Revision 1.9 2006/07/19 10:09:55 jgrosseo
140new configuration, access to DAQ FES (Alberto)
141
57f50b3c 142Revision 1.8 2006/07/11 12:44:36 jgrosseo
143adding parameters for extended validity range of data produced by preprocessor
144
17111222 145Revision 1.7 2006/07/10 14:37:09 jgrosseo
146small fix + todo comment
147
e090413b 148Revision 1.6 2006/07/10 13:01:41 jgrosseo
149enhanced storing of last successfully processed run (alberto)
150
a7160fe9 151Revision 1.5 2006/07/04 14:59:57 jgrosseo
152revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
153
45a493ce 154Revision 1.4 2006/06/12 09:11:16 jgrosseo
155coding conventions (Alberto)
156
58bc3020 157Revision 1.3 2006/06/06 14:26:40 jgrosseo
158o) removed files that were moved to STEER
159o) shuttle updated to follow the new interface (Alberto)
160
b948db8d 161Revision 1.2 2006/03/07 07:52:34 hristov
162New version (B.Yordanov)
163
d477ad88 164Revision 1.6 2005/11/19 17:19:14 byordano
165RetrieveDATEEntries and RetrieveConditionsData added
166
167Revision 1.5 2005/11/19 11:09:27 byordano
168AliShuttle declaration added
169
170Revision 1.4 2005/11/17 17:47:34 byordano
171TList changed to TObjArray
172
173Revision 1.3 2005/11/17 14:43:23 byordano
174import to local CVS
175
176Revision 1.1.1.1 2005/10/28 07:33:58 hristov
177Initial import as subdirectory in AliRoot
178
73abe331 179Revision 1.2 2005/09/13 08:41:15 byordano
180default startTime endTime added
181
182Revision 1.4 2005/08/30 09:13:02 byordano
183some docs added
184
185Revision 1.3 2005/08/29 21:15:47 byordano
186some docs added
187
188*/
189
190//
191// This class is the main manager for AliShuttle.
192// It organizes the data retrieval from DCS and calls the
b948db8d 193// interface methods of AliPreprocessor.
73abe331 194// For every detector in AliShuttleConfig,
195// data for its set of aliases is retrieved. If an AliPreprocessor
b948db8d 196// is registered for this detector, it is used
197// according to the schema (see AliPreprocessor).
198// If no AliPreprocessor is registered, the retrieved
73abe331 199// data is stored automatically in the underlying AliCDBStorage.
200// The alias name is used as detSpec.
201//
202
203#include "AliShuttle.h"
204
205#include "AliCDBManager.h"
206#include "AliCDBStorage.h"
207#include "AliCDBId.h"
84090f85 208#include "AliCDBRunRange.h"
209#include "AliCDBPath.h"
5164a766 210#include "AliCDBEntry.h"
73abe331 211#include "AliShuttleConfig.h"
eba76848 212#include "DCSClient/AliDCSClient.h"
73abe331 213#include "AliLog.h"
b948db8d 214#include "AliPreprocessor.h"
5164a766 215#include "AliShuttleStatus.h"
2bb7b766 216#include "AliShuttleLogbookEntry.h"
73abe331 217
57f50b3c 218#include <TSystem.h>
58bc3020 219#include <TObject.h>
b948db8d 220#include <TString.h>
57f50b3c 221#include <TTimeStamp.h>
73abe331 222#include <TObjString.h>
57f50b3c 223#include <TSQLServer.h>
224#include <TSQLResult.h>
225#include <TSQLRow.h>
cb343cfd 226#include <TMutex.h>
9827400b 227#include <TSystemDirectory.h>
228#include <TSystemFile.h>
229#include <TFileMerger.h>
230#include <TGrid.h>
231#include <TGridResult.h>
73abe331 232
e7f62f16 233#include <TMonaLisaWriter.h>
234
5164a766 235#include <fstream>
236
cb343cfd 237#include <sys/types.h>
238#include <sys/wait.h>
239
73abe331 240ClassImp(AliShuttle)
241
b948db8d 242//______________________________________________________________________________________________
243AliShuttle::AliShuttle(const AliShuttleConfig* config,
244 UInt_t timeout, Int_t retries):
4f0ab988 245fConfig(config),
246fTimeout(timeout), fRetries(retries),
247fPreprocessorMap(),
2bb7b766 248fLogbookEntry(0),
eba76848 249fCurrentDetector(),
85a80aa9 250fStatusEntry(0),
cb343cfd 251fMonitoringMutex(0),
eba76848 252fLastActionTime(0),
e7f62f16 253fLastAction(),
9827400b 254fMonaLisa(0),
255fTestMode(kNone),
ffa29e93 256fReadTestMode(kFALSE),
257fOutputRedirected(kFALSE)
73abe331 258{
259 //
260 // config: AliShuttleConfig used
73abe331 261 // timeout: timeout used for AliDCSClient connection
262 // retries: the number of retries in case of connection error.
263 //
264
57f50b3c 265 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
be48e3ea 266 for(int iSys=0;iSys<4;iSys++) {
57f50b3c 267 fServer[iSys]=0;
be48e3ea 268 if (iSys < 3)
2c15234c 269 fFXSlist[iSys].SetOwner(kTRUE);
57f50b3c 270 }
2bb7b766 271 fPreprocessorMap.SetOwner(kTRUE);
be48e3ea 272
273 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
274 fFirstUnprocessed[iDet] = kFALSE;
275
cb343cfd 276 fMonitoringMutex = new TMutex();
58bc3020 277}
278
b948db8d 279//______________________________________________________________________________________________
57f50b3c 280AliShuttle::~AliShuttle()
58bc3020 281{
9827400b 282 //
283 // destructor
284 //
58bc3020 285
b948db8d 286 fPreprocessorMap.DeleteAll();
be48e3ea 287 for(int iSys=0;iSys<4;iSys++)
57f50b3c 288 if(fServer[iSys]) {
289 fServer[iSys]->Close();
290 delete fServer[iSys];
eba76848 291 fServer[iSys] = 0;
57f50b3c 292 }
2bb7b766 293
294 if (fStatusEntry){
295 delete fStatusEntry;
296 fStatusEntry = 0;
297 }
cb343cfd 298
299 if (fMonitoringMutex)
300 {
301 delete fMonitoringMutex;
302 fMonitoringMutex = 0;
303 }
73abe331 304}
305
b948db8d 306//______________________________________________________________________________________________
57f50b3c 307void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
58bc3020 308{
73abe331 309 //
b948db8d 310 // Registers new AliPreprocessor.
73abe331 311 // It uses GetName() as the identifier of the preprocessor.
312 // The preprocessor is registered only if no other one
313 // with the same identifier (GetName()) is registered.
314 //
315
eba76848 316 const char* detName = preprocessor->GetName();
317 if(GetDetPos(detName) < 0)
318 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
319
320 if (fPreprocessorMap.GetValue(detName)) {
321 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
73abe331 322 return;
323 }
324
eba76848 325 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
73abe331 326}
b948db8d 327//______________________________________________________________________________________________
3301427a 328Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
84090f85 329 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
73abe331 330{
9827400b 331 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
332 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
333 // using this function. Use StoreReferenceData instead!
334 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
335 // finishes the data are transferred to the main storage (Grid).
b948db8d 336
3301427a 337 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
84090f85 338}
339
340//______________________________________________________________________________________________
3301427a 341Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
84090f85 342{
9827400b 343 // Stores a CDB object in the storage for reference data. These objects will not be available during
344 // offline reconstruction. Use this function for reference data only!
345 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
346 // finishes the data are transferred to the main storage (Grid).
85a80aa9 347
3301427a 348 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
85a80aa9 349}
350
351//______________________________________________________________________________________________
3301427a 352Bool_t AliShuttle::StoreLocally(const TString& localUri,
85a80aa9 353 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
354 Int_t validityStart, Bool_t validityInfinite)
355{
9827400b 356 // Store object temporarily in local storage. Parameters are passed by Store and StoreReferenceData functions.
357 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
358 // The parameters are:
359 // 1) Uri of the backup storage (Local)
360 // 2) the object's path.
361 // 3) the object to be stored
362 // 4) the metaData to be associated with the object
363 // 5) the validity start run number w.r.t. the current run,
364 // if the data is valid only for this run leave the default 0
365 // 6) specifies if the calibration data is valid for infinity (this means until updated),
366 // typical for calibration runs, the default is kFALSE
367 //
368 // returns kFALSE on failure, kTRUE otherwise
84090f85 369
9827400b 370 if (fTestMode & kErrorStorage)
371 {
372 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
373 return kFALSE;
374 }
375
3301427a 376 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
2bb7b766 377
85a80aa9 378 Int_t firstRun = GetCurrentRun() - validityStart;
84090f85 379 if(firstRun < 0) {
9827400b 380 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
84090f85 381 firstRun=0;
382 }
383
384 Int_t lastRun = -1;
385 if(validityInfinite) {
386 lastRun = AliCDBRunRange::Infinity();
387 } else {
388 lastRun = GetCurrentRun();
389 }
390
3301427a 391 // Version is set to current run, it will be used later to transfer data to Grid
392 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
2bb7b766 393
394 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
395 TObjString runUsed = Form("%d", GetCurrentRun());
9e080f92 396 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
2bb7b766 397 }
84090f85 398
3301427a 399 Bool_t result = kFALSE;
84090f85 400
3301427a 401 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
402 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
84090f85 403 } else {
3301427a 404 result = AliCDBManager::Instance()->GetStorage(localUri)
84090f85 405 ->Put(object, id, metaData);
406 }
407
408 if(!result) {
409
9827400b 410 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
3301427a 411 }
2bb7b766 412
3301427a 413 return result;
414}
84090f85 415
3301427a 416//______________________________________________________________________________________________
417Bool_t AliShuttle::StoreOCDB()
418{
9827400b 419 //
420 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
421 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
422 // Then calls StoreRefFilesToGrid to store reference files.
423 //
424
425 if (fTestMode & kErrorGrid)
426 {
427 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
428 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
429 return kFALSE;
430 }
431
86aa42c3 432 AliInfo("Storing OCDB data ...");
433 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
434
3301427a 435 AliInfo("Storing reference data ...");
436 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
9827400b 437
438 AliInfo("Storing reference files ...");
439 Bool_t resultRefFiles = StoreRefFilesToGrid();
440
441 return resultCDB && resultRef && resultRefFiles;
3301427a 442}
443
444//______________________________________________________________________________________________
445Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
446{
447 //
448 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
449 //
450
451 TObjArray* gridIds=0;
452
453 Bool_t result = kTRUE;
454
455 const char* type = 0;
456 TString localURI;
457 if(gridURI == fgkMainCDB) {
458 type = "OCDB";
459 localURI = fgkLocalCDB;
460 } else if(gridURI == fgkMainRefStorage) {
461 type = "reference";
462 localURI = fgkLocalRefStorage;
463 } else {
464 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
465 return kFALSE;
466 }
467
468 AliCDBManager* man = AliCDBManager::Instance();
469
470 AliCDBStorage *gridSto = man->GetStorage(gridURI);
471 if(!gridSto) {
472 Log("SHUTTLE",
473 Form("StoreOCDB - cannot activate main %s storage", type));
474 return kFALSE;
475 }
476
477 gridIds = gridSto->GetQueryCDBList();
478
479 // get objects previously stored in local CDB
480 AliCDBStorage *localSto = man->GetStorage(localURI);
481 if(!localSto) {
482 Log("SHUTTLE",
483 Form("StoreOCDB - cannot activate local %s storage", type));
484 return kFALSE;
485 }
486 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
487 // Local objects were stored with current run as Grid version!
488 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
489 localEntries->SetOwner(1);
490
491 // loop on local stored objects
492 TIter localIter(localEntries);
493 AliCDBEntry *aLocEntry = 0;
494 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
495 aLocEntry->SetOwner(1);
496 AliCDBId aLocId = aLocEntry->GetId();
497 aLocEntry->SetVersion(-1);
498 aLocEntry->SetSubVersion(-1);
499
500 // If local object is valid up to infinity we store it only if it is
501 // the first unprocessed run!
502 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
503 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
504 {
505 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
506 "there are previous unprocessed runs!",
507 fCurrentDetector.Data(), aLocId.GetPath().Data()));
508 continue;
509 }
510
511 // loop on Grid valid Id's
512 Bool_t store = kTRUE;
513 TIter gridIter(gridIds);
514 AliCDBId* aGridId = 0;
515 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
516 if(aGridId->GetPath() != aLocId.GetPath()) continue;
517 // skip all objects valid up to infinity
518 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
519 // if we get here, it means there's already some more recent object stored on Grid!
520 store = kFALSE;
521 break;
522 }
523
524 // Put the object on the Grid only if no more recent object was found there
525 Bool_t storeOk = store && gridSto->Put(aLocEntry);
526 if(!store || storeOk){
527
528 if (!store)
529 {
530 Log(fCurrentDetector.Data(),
531 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
532 type, aGridId->ToString().Data()));
533 } else {
534 Log("SHUTTLE",
535 Form("StoreOCDB - Object <%s> successfully put into %s storage",
536 aLocId.ToString().Data(), type));
537 }
84090f85 538
3301427a 539 // removing local filename...
540 TString filename;
541 localSto->IdToFilename(aLocId, filename);
542 AliInfo(Form("Removing local file %s", filename.Data()));
543 RemoveFile(filename.Data());
544 continue;
545 } else {
546 Log("SHUTTLE",
547 Form("StoreOCDB - Grid %s storage of object <%s> failed",
548 type, aLocId.ToString().Data()));
549 result = kFALSE;
b948db8d 550 }
551 }
3301427a 552 localEntries->Clear();
2bb7b766 553
b948db8d 554 return result;
3301427a 555}
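
// Illustrative sketch, not part of AliShuttle: the per-object decision taken in
// the loop of StoreOCDB(const TString&), reduced to plain C++. SketchId,
// SketchDecision, DecideStore and kSketchInfinity are hypothetical names;
// SketchId keeps only the two AliCDBId fields the decision depends on.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for AliCDBRunRange::Infinity().
const int kSketchInfinity = std::numeric_limits<int>::max();

// Hypothetical, reduced version of AliCDBId.
struct SketchId {
    std::string path;
    int lastRun;
};

enum SketchDecision { kSkipInfinite, kAlreadyOnGrid, kStoreOnGrid };

SketchDecision DecideStore(const SketchId& local, bool firstUnprocessed,
                           const std::vector<SketchId>& gridIds)
{
    // Objects valid up to infinity are stored only for the first unprocessed run.
    if (local.lastRun == kSketchInfinity && !firstUnprocessed)
        return kSkipInfinite;

    for (std::size_t i = 0; i < gridIds.size(); ++i) {
        if (gridIds[i].path != local.path) continue;           // different object
        if (gridIds[i].lastRun == kSketchInfinity) continue;   // infinite Grid entries are ignored
        return kAlreadyOnGrid;                                 // a more recent object is on the Grid
    }
    return kStoreOnGrid;
}
```

// In StoreOCDB the kSkipInfinite case leaves the local file in place, whereas
// kAlreadyOnGrid and a successful store both remove it.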
556
557//______________________________________________________________________________________________
546242fb 558Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
559{
560 // clears the directory used to store reference files of a given subdetector
561
562 AliCDBManager* man = AliCDBManager::Instance();
563 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
564 TString localBaseFolder = sto->GetBaseFolder();
565
566 TString targetDir;
567 targetDir.Form("%s/%s", localBaseFolder.Data(), detector);
568
569 Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
570
571 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
572 if (result == 0)
573 {
574 // delete directory
575 result = gSystem->Exec(Form("rm -r %s", targetDir.Data()));
576 if (result != 0)
577 {
578 Log("SHUTTLE", Form("CleanReferenceStorage - Could not clear directory %s", targetDir.Data()));
579 return kFALSE;
580 }
581 }
582
583 result = gSystem->mkdir(targetDir, kTRUE);
584 if (result != 0)
585 {
586 Log("SHUTTLE", Form("CleanReferenceStorage - Error creating base directory %s", targetDir.Data()));
587 return kFALSE;
588 }
589
590 return kTRUE;
591}
592
593//______________________________________________________________________________________________
9827400b 594Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
595{
596 //
3c2a21c8 597 // Stores reference file directly (without opening it). This function stores the file locally.
9827400b 598 //
3c2a21c8 599 // The file is stored under the following location:
600 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
601 // where <gridFileName> is the third parameter given to the function
602 //
9827400b 603
604 if (fTestMode & kErrorStorage)
605 {
606 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
607 return kFALSE;
608 }
609
610 AliCDBManager* man = AliCDBManager::Instance();
611 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
612
613 TString localBaseFolder = sto->GetBaseFolder();
614
615 TString targetDir;
546242fb 616 targetDir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(detector));
9827400b 617
618 TString target;
619 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
620
546242fb 621 Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
9827400b 622 if (result)
623 {
546242fb 624 Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
625 return kFALSE;
9827400b 626 }
546242fb 627
9827400b 628 result = gSystem->CopyFile(localFile, target);
629
630 if (result == 0)
631 {
632 Log("SHUTTLE", Form("StoreReferenceFile - Stored file %s locally to %s", localFile, target.Data()));
633 return kTRUE;
634 }
635 else
636 {
546242fb 637 Log("SHUTTLE", Form("StoreReferenceFile - Storing file %s locally to %s failed. Error code = %d",
638 localFile, target.Data(), result));
9827400b 639 return kFALSE;
640 }
641}
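
// Illustrative sketch, not part of AliShuttle: how the local target name
// <base folder>/<DET>/<RUN#>_<gridFileName> used by StoreReferenceFile is
// composed. ReferenceFileTarget is a hypothetical helper and uses std::string
// instead of TString so that it builds without ROOT.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds <baseFolder>/<det>/<run>_<gridFileName>.
std::string ReferenceFileTarget(const std::string& baseFolder, const std::string& det,
                                int run, const std::string& gridFileName)
{
    std::ostringstream target;
    target << baseFolder << "/" << det << "/" << run << "_" << gridFileName;
    return target.str();
}
```

// The run number prefix is what later lets the Shuttle pick out the files of
// one run from the detector's folder.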
642
643//______________________________________________________________________________________________
644Bool_t AliShuttle::StoreRefFilesToGrid()
645{
646 //
647 // Transfers the reference files of the current detector and run to the Grid.
9827400b 648 //
86aa42c3 649 // The files are stored under the following location:
3c2a21c8 650 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
86aa42c3 651 //
9827400b 652
653 AliCDBManager* man = AliCDBManager::Instance();
654 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
655 if (!sto)
656 return kFALSE;
657 TString localBaseFolder = sto->GetBaseFolder();
658
659 TString dir;
3d8bc902 660 dir.Form("%s/%s", localBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 661
662 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
663 if (!gridSto)
664 return kFALSE;
665 TString gridBaseFolder = gridSto->GetBaseFolder();
666 TString alienDir;
3d8bc902 667 alienDir.Form("%s%s", gridBaseFolder.Data(), GetOfflineDetName(fCurrentDetector));
9827400b 668
9827400b 669 TString begin;
670 begin.Form("%d_", GetCurrentRun());
671
672 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
3d8bc902 673 if (!baseDir)
674 return kTRUE;
675
9827400b 676 TList* dirList = baseDir->GetListOfFiles();
677 if (!dirList)
3d8bc902 678 {
679 delete baseDir;
9827400b 680 return kTRUE;
3d8bc902 681 }
9827400b 682
683 Int_t nDirs = dirList->GetEntries();
684
546242fb 685 Log("SHUTTLE", Form("There are %d reference files in folder %s to be transferred to Grid",
686 nDirs, GetOfflineDetName(fCurrentDetector)));
687
688 if(nDirs < 1) return kTRUE;
689
690 if (!gGrid)
691 {
692 Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
693 return kFALSE;
694 }
695
9827400b 696 Bool_t success = kTRUE;
3d8bc902 697 Bool_t first = kTRUE;
9827400b 698
699 for (Int_t iDir=0; iDir<nDirs; ++iDir)
700 {
701 TSystemFile* entry = dynamic_cast<TSystemFile*> (dirList->At(iDir));
702 if (!entry)
703 continue;
704
705 if (entry->IsDirectory())
706 continue;
707
708 TString fileName(entry->GetName());
709 if (!fileName.BeginsWith(begin))
710 continue;
711
3d8bc902 712 if (first)
713 {
714 first = kFALSE;
715 // check that DET folder exists, otherwise create it
716 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
717
718 if (!result)
719 return kFALSE;
720
546242fb 721 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
3d8bc902 722 {
723 if (!gGrid->Mkdir(alienDir.Data(),"",0))
724 {
725 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
726 alienDir.Data()));
727 delete baseDir;
728 return kFALSE;
546242fb 729 } else {
730 Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
3d8bc902 731 }
732
546242fb 733 } else {
734 Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
3d8bc902 735 }
736 }
737
9827400b 738 TString fullLocalPath;
739 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
740
741 TString fullGridPath;
742 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
743
744 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s", fullLocalPath.Data(), fullGridPath.Data()));
745
746 TFileMerger fileMerger;
747 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
748
749 if (result)
750 {
751 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded", fullLocalPath.Data(), fullGridPath.Data()));
752 RemoveFile(fullLocalPath);
753 }
754 else
755 {
756 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s failed", fullLocalPath.Data(), fullGridPath.Data()));
757 success = kFALSE;
758 }
759 }
760
761 delete baseDir;
762
763 return success;
764}
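
// Illustrative sketch, not part of AliShuttle: the file selection used in
// StoreRefFilesToGrid, which only transfers files whose name begins with
// "<RUN#>_". BelongsToRun is a hypothetical helper written with std::string
// instead of TString::BeginsWith.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// True if fileName starts with "<run>_", i.e. it was stored for this run.
bool BelongsToRun(const std::string& fileName, int run)
{
    std::ostringstream begin;
    begin << run << "_";
    const std::string prefix = begin.str();

    // compare() on the leading characters is a prefix test; a name shorter
    // than the prefix never matches.
    return fileName.compare(0, prefix.size(), prefix) == 0;
}
```

// The trailing underscore matters: without it, files of run 12345 would also
// match when run 1234 is processed.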
765
766//______________________________________________________________________________________________
3301427a 767void AliShuttle::CleanLocalStorage(const TString& uri)
768{
9827400b 769 //
770 // Called when the preprocessor is declared failed. Removes the remaining objects from the local storage.
771 //
3301427a 772
773 const char* type = 0;
774 if(uri == fgkLocalCDB) {
775 type = "OCDB";
776 } else if(uri == fgkLocalRefStorage) {
546242fb 777 type = "Reference";
3301427a 778 } else {
779 AliError(Form("Invalid storage URI: %s", uri.Data()));
780 return;
781 }
782
783 AliCDBManager* man = AliCDBManager::Instance();
b948db8d 784
3301427a 785 // open local storage
786 AliCDBStorage *localSto = man->GetStorage(uri);
787 if(!localSto) {
788 Log("SHUTTLE",
789 Form("CleanLocalStorage - cannot activate local %s storage", type));
790 return;
791 }
792
793 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
546242fb 794 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
3301427a 795
796 AliInfo(Form("filename = %s", filename.Data()));
797
798 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
799 GetCurrentRun(), fCurrentDetector.Data()));
800
801 RemoveFile(filename.Data());
802
803}
804
805//______________________________________________________________________________________________
806void AliShuttle::RemoveFile(const char* filename)
807{
9827400b 808 //
809 // removes local file
810 //
3301427a 811
812 TString command(Form("rm -f %s", filename));
813
814 Int_t result = gSystem->Exec(command.Data());
815 if(result != 0)
816 {
817 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
818 fCurrentDetector.Data(), filename));
819 }
73abe331 820}
821
b948db8d 822//______________________________________________________________________________________________
5164a766 823AliShuttleStatus* AliShuttle::ReadShuttleStatus()
824{
9827400b 825 //
826 // Reads the AliShuttleStatus from the CDB
827 //
5164a766 828
2bb7b766 829 if (fStatusEntry){
830 delete fStatusEntry;
831 fStatusEntry = 0;
832 }
5164a766 833
10a5a932 834 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
2bb7b766 835 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
5164a766 836
2bb7b766 837 if (!fStatusEntry) return 0;
838 fStatusEntry->SetOwner(1);
5164a766 839
2bb7b766 840 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
841 if (!status) {
842 AliError("Invalid object stored to CDB!");
843 return 0;
844 }
5164a766 845
2bb7b766 846 return status;
5164a766 847}
848
849//______________________________________________________________________________________________
7bfb2090 850Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
5164a766 851{
9827400b 852 //
853 // writes the status for one subdetector
854 //
2bb7b766 855
856 if (fStatusEntry){
857 delete fStatusEntry;
858 fStatusEntry = 0;
859 }
5164a766 860
2bb7b766 861 Int_t run = GetCurrentRun();
5164a766 862
2bb7b766 863 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
5164a766 864
2bb7b766 865 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
866 fStatusEntry->SetOwner(1);
5164a766 867
2bb7b766 868 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
7bfb2090 869
2bb7b766 870 if (!result) {
3301427a 871 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
872 fCurrentDetector.Data(), run));
2bb7b766 873 return kFALSE;
874 }
e7f62f16 875
876 SendMLInfo();
7bfb2090 877
2bb7b766 878 return kTRUE;
5164a766 879}
880
881//______________________________________________________________________________________________
882void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
883{
9827400b 884 //
885 // changes the AliShuttleStatus for the given detector and run to the given status
886 //
5164a766 887
2bb7b766 888 if (!fStatusEntry){
889 AliError("UNEXPECTED: fStatusEntry empty");
890 return;
891 }
5164a766 892
2bb7b766 893 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
5164a766 894
2bb7b766 895 if (!status){
3301427a 896 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
2bb7b766 897 return;
898 }
5164a766 899
2c15234c 900 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
eba76848 901 fCurrentDetector.Data(),
36c99a6a 902 status->GetStatusName(),
eba76848 903 status->GetStatusName(newStatus));
cb343cfd 904 Log("SHUTTLE", actionStr);
905 SetLastAction(actionStr);
5164a766 906
2bb7b766 907 status->SetStatus(newStatus);
908 if (increaseCount) status->IncreaseCount();
5164a766 909
2bb7b766 910 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
e7f62f16 911
912 SendMLInfo();
5164a766 913}
e7f62f16 914
915//______________________________________________________________________________________________
916void AliShuttle::SendMLInfo()
917{
918 //
919 // sends ML information about the current status of the current detector being processed
920 //
921
922 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
923
924 if (!status){
3301427a 925 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
e7f62f16 926 return;
927 }
928
929 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
930 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
931
932 TList mlList;
933 mlList.Add(&mlStatus);
934 mlList.Add(&mlRetryCount);
935
936 fMonaLisa->SendParameters(&mlList);
937}
938
5164a766 939//______________________________________________________________________________________________
940Bool_t AliShuttle::ContinueProcessing()
941{
9827400b 942 // this function reads the AliShuttleStatus information from CDB and
943 // checks if the processing should be continued
944 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 945
57c1a579 946 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
947
948 AliPreprocessor* aPreprocessor =
949 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
950 if (!aPreprocessor)
951 {
952 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
953 return kFALSE;
954 }
955
2bb7b766 956 AliShuttleLogbookEntry::Status entryStatus =
eba76848 957 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 958
959 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 960 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 961 fCurrentDetector.Data(),
962 fLogbookEntry->GetDetectorStatusName(entryStatus)));
963 return kFALSE;
964 }
965
966 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 967
968 // check if current run is first unprocessed run for current detector
969 if (fConfig->StrictRunOrder(fCurrentDetector) &&
970 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
971 {
86aa42c3 972 if (fTestMode == kNone)
973 {
974 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
975 return kFALSE;
976 }
977 else
978 {
979 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
980 }
be48e3ea 981 }
982
2bb7b766 983 AliShuttleStatus* status = ReadShuttleStatus();
984 if (!status) {
985 // first time
986 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
987 fCurrentDetector.Data()));
988 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
989 return WriteShuttleStatus(status);
990 }
991
992 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
993 // If it happens it may mean Logbook updating failed... let's do it now!
994 if (status->GetStatus() == AliShuttleStatus::kDone ||
995 status->GetStatus() == AliShuttleStatus::kFailed){
996 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
997 fCurrentDetector.Data(),
998 status->GetStatusName(status->GetStatus())));
999 UpdateShuttleLogbook(fCurrentDetector.Data(),
1000 status->GetStatusName(status->GetStatus()));
1001 return kFALSE;
1002 }
1003
3301427a 1004 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 1005 Log("SHUTTLE",
1006 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1007 fCurrentDetector.Data()));
9827400b 1008 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1009 if (StoreOCDB()){
3301427a 1010 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1011 fCurrentDetector.Data()));
2bb7b766 1012 UpdateShuttleStatus(AliShuttleStatus::kDone);
1013 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1014 } else {
1015 Log("SHUTTLE",
1016 Form("ContinueProcessing - %s: Grid storage failed again",
1017 fCurrentDetector.Data()));
9827400b 1018 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 1019 }
1020 return kFALSE;
1021 }
1022
1023 // if we get here, there is a restart
57c1a579 1024 Bool_t cont = kFALSE;
2bb7b766 1025
1026 // abort conditions
cb343cfd 1027 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 1028 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1029 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 1030 status->GetCount(), status->GetStatusName()));
1031 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 1032 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 1033
1034 // there may still be objects in local OCDB and reference storage
1035 // and FXS databases may be not updated: do it now!
9827400b 1036
1037 // TODO Currently disabled, we want to keep files in case of failure!
1038 // CleanLocalStorage(fgkLocalCDB);
1039 // CleanLocalStorage(fgkLocalRefStorage);
1040 // UpdateTableFailCase();
1041
1042 // Send mail to detector expert!
1043 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1044 if (!SendMail())
1045 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1046 fCurrentDetector.Data()));
3301427a 1047
57c1a579 1048 } else {
1049 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1050 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1051 status->GetStatusName(), status->GetCount()));
9827400b 1052 Bool_t increaseCount = kTRUE;
1053 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1054 increaseCount = kFALSE;
1055 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 1056 cont = kTRUE;
2bb7b766 1057 }
1058
57c1a579 1059 return cont;
5164a766 1060}
1061
1062//______________________________________________________________________________________________
2bb7b766 1063Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 1064{
73abe331 1065 //
b948db8d 1066 // Makes data retrieval for all detectors in the configuration.
2bb7b766 1067 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1068 // (Unprocessed, Inactive, Failed or Done).
d477ad88 1069 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 1070 //
1071
9827400b 1072 if (!entry) return kFALSE;
2bb7b766 1073
1074 fLogbookEntry = entry;
1075
9827400b 1076 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1077 GetCurrentRun()));
2bb7b766 1078
e7f62f16 1079 // create ML instance that monitors this run
1080 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1081 // disable monitoring of other parameters that come e.g. from TFile
1082 gMonitoringWriter = 0;
2bb7b766 1083
e7f62f16 1084 // Send the information to ML
1085 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1086 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1087
1088 TList mlList;
1089 mlList.Add(&mlStatus);
9827400b 1090 mlList.Add(&mlRunType);
e7f62f16 1091
1092 fMonaLisa->SendParameters(&mlList);
3301427a 1093
9827400b 1094 if (fLogbookEntry->IsDone())
1095 {
1096 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1097 UpdateShuttleLogbook("shuttle_done");
1098 fLogbookEntry = 0;
1099 return kTRUE;
1100 }
1101
1102 // read test mode if flag is set
1103 if (fReadTestMode)
1104 {
3d8bc902 1105 fTestMode = kNone;
9827400b 1106 TString logEntry(entry->GetRunParameter("log"));
1107 //printf("log entry = %s\n", logEntry.Data());
1108 TString searchStr("Testmode: ");
1109 Int_t pos = logEntry.Index(searchStr.Data());
1110 //printf("%d\n", pos);
1111 if (pos >= 0)
1112 {
1113 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1114 //printf("%s\n", subStr.String().Data());
1115 TString newStr(subStr.Data());
1116 TObjArray* token = newStr.Tokenize(' ');
1117 if (token)
1118 {
1119 //token->Print();
1120 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1121 if (tmpStr)
1122 {
1123 Int_t testMode = tmpStr->String().Atoi();
1124 if (testMode > 0)
1125 {
1126 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1127 SetTestMode((TestMode) testMode);
1128 }
1129 }
1130 delete token;
1131 }
1132 }
1133 }
1134
3d8bc902 1135 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1136
eba76848 1137 fLogbookEntry->Print("all");
57f50b3c 1138
1139 // Initialization
d477ad88 1140 Bool_t hasError = kFALSE;
5164a766 1141
2bb7b766 1142 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1143 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1144 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1145 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1146
57f50b3c 1147 // Loop on detectors in the configuration
b948db8d 1148 TIter iter(fConfig->GetDetectors());
2bb7b766 1149 TObjString* aDetector = 0;
b948db8d 1150
be48e3ea 1151 while ((aDetector = (TObjString*) iter.Next()))
1152 {
7bfb2090 1153 fCurrentDetector = aDetector->String();
5164a766 1154
9e080f92 1155 if (ContinueProcessing() == kFALSE) continue;
1156
2bb7b766 1157 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1158 GetCurrentRun(), aDetector->GetName()));
1159
9d733021 1160 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1161
e7f62f16 1162 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1163
be48e3ea 1164 Int_t pid = fork();
1165
1166 if (pid < 0)
1167 {
1168 Log("SHUTTLE", "ERROR: Forking failed");
1169 }
1170 else if (pid > 0)
1171 {
1172 // parent
1173 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1174 GetCurrentRun(), aDetector->GetName()));
1175
1176 Long_t begin = time(0);
1177
1178 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1179 while (waitpid(pid, &status, WNOHANG) == 0)
1180 {
1181 Long_t expiredTime = time(0) - begin;
1182
1183 if (expiredTime > fConfig->GetPPTimeOut())
1184 {
9827400b 1185 TString tmp;
1186 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1187 fCurrentDetector.Data(), expiredTime);
1188 Log("SHUTTLE", tmp);
1189 Log(fCurrentDetector, tmp);
be48e3ea 1190
1191 kill(pid, 9);
1192
3301427a 1193 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1194 hasError = kTRUE;
1195
1196 gSystem->Sleep(1000);
1197 }
1198 else
1199 {
be48e3ea 1200 gSystem->Sleep(1000);
9827400b 1201
1202 TString checkStr;
1203 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1204 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1205 if (!pipe)
1206 {
1207 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1208 continue;
1209 }
1210
1211 char buffer[100];
1212 if (!fgets(buffer, 100, pipe))
1213 {
1214 Log("SHUTTLE", "Error: ps did not return anything");
1215 gSystem->ClosePipe(pipe);
1216 continue;
1217 }
1218 gSystem->ClosePipe(pipe);
1219
1220 //Log("SHUTTLE", Form("ps returned %s", buffer));
1221
1222 Int_t mem = 0;
1223 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1224 {
1225 Log("SHUTTLE", "Error: Could not parse output of ps");
1226 continue;
1227 }
1228
1229 if (expiredTime % 60 == 0)
886d60e6 1230 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1231 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1232
1233 if (mem > fConfig->GetPPMaxMem())
1234 {
1235 TString tmp;
1236 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1237 mem, fConfig->GetPPMaxMem());
1238 Log("SHUTTLE", tmp);
1239 Log(fCurrentDetector, tmp);
1240
1241 kill(pid, 9);
1242
1243 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1244 hasError = kTRUE;
1245
1246 gSystem->Sleep(1000);
1247 }
be48e3ea 1248 }
1249 }
1250
1251 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1252 GetCurrentRun(), aDetector->GetName()));
1253
1254 if (WIFEXITED(status))
1255 {
1256 Int_t returnCode = WEXITSTATUS(status);
1257
3301427a 1258 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1259 returnCode));
be48e3ea 1260
9827400b 1261 if (returnCode == 0) hasError = kTRUE; // the client exits with its Bool_t success flag, so 0 signals failure
be48e3ea 1262 }
1263 }
1264 else if (pid == 0)
1265 {
1266 // client
1267 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1268
ffa29e93 1269 AliInfo("Redirecting output...");
1270
546242fb 1271 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
ffa29e93 1272 {
1273 Log("SHUTTLE", "Could not freopen stdout");
1274 }
1275 else
1276 {
1277 fOutputRedirected = kTRUE;
1278 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1279 Log("SHUTTLE", "Could not redirect stderr");
1280
1281 }
1282
9827400b 1283 Bool_t success = ProcessCurrentDetector();
1284 if (success) // Preprocessor finished successfully!
1285 {
3301427a 1286 // Update time_processed field in FXS DB
1287 if (UpdateTable() == kFALSE)
1288 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1289
1290 // Transfer the data from local storage to main storage (Grid)
1291 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1292 if (StoreOCDB() == kFALSE)
1293 {
1294 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1295 GetCurrentRun(), aDetector->GetName()));
1296 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1297 success = kFALSE;
3301427a 1298 } else {
1299 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1300 GetCurrentRun(), aDetector->GetName()));
1301 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1302 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1303 }
be48e3ea 1304 }
1305
4b95672b 1306 for (UInt_t iSys=0; iSys<3; iSys++)
1307 {
1308 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1309 }
1310
be48e3ea 1311 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1312 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1313
1314 // the client exits here
9827400b 1315 gSystem->Exit(success);
be48e3ea 1316
1317 AliError("We should never get here!!!");
1318 }
7bfb2090 1319 }
5164a766 1320
2bb7b766 1321 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1322 GetCurrentRun()));
1323
1324 //check if shuttle is done for this run, if so update logbook
1325 TObjArray checkEntryArray;
1326 checkEntryArray.SetOwner(1);
9e080f92 1327 TString whereClause = Form("where run=%d", GetCurrentRun());
1328 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1329 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1330 GetCurrentRun()));
1331 return hasError == kFALSE;
1332 }
b948db8d 1333
9e080f92 1334 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1335 (checkEntryArray.At(0));
2bb7b766 1336
9e080f92 1337 if (checkEntry)
1338 {
1339 if (checkEntry->IsDone())
be48e3ea 1340 {
9e080f92 1341 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1342 UpdateShuttleLogbook("shuttle_done");
1343 }
1344 else
1345 {
1346 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1347 {
9e080f92 1348 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1349 {
9e080f92 1350 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1351 checkEntry->GetRun(), GetDetName(iDet)));
1352 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1353 }
1354 }
2bb7b766 1355 }
1356 }
1357
e7f62f16 1358 // remove ML instance
1359 delete fMonaLisa;
1360 fMonaLisa = 0;
1361
2bb7b766 1362 fLogbookEntry = 0;
85a80aa9 1363
a7160fe9 1364 return hasError == kFALSE;
73abe331 1365}
1366
b948db8d 1367//______________________________________________________________________________________________
9827400b 1368Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1369{
1370 //
2bb7b766 1371 // Makes data retrieval just for a specific detector (fCurrentDetector).
73abe331 1372 // There should be a configuration for this detector.
73abe331 1373
2bb7b766 1374 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1375
546242fb 1376 if (!CleanReferenceStorage(GetOfflineDetName(fCurrentDetector)))
1377 return kFALSE;
1378
2c15234c 1379 TMap dcsMap;
1380 dcsMap.SetOwner(1);
73abe331 1381
85a80aa9 1382 Bool_t aDCSError = kFALSE;
3301427a 1383
1384 // call preprocessor
1385 AliPreprocessor* aPreprocessor =
1386 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1387
1388 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1389
1390 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1391
651fdaab 1392 if (!processDCS)
1393 {
1394 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1395 }
8b739301 1396 else if (fTestMode & kSkipDCS)
2c15234c 1397 {
3d8bc902 1398 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
9827400b 1399 }
1400 else if (fTestMode & kErrorDCS)
1401 {
3d8bc902 1402 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1403 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1404 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1405 return kFALSE;
2c15234c 1406 } else {
3301427a 1407
1408 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1409
2c15234c 1410 TString host(fConfig->GetDCSHost(fCurrentDetector));
1411 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1412
1413 // Retrieval of Aliases
1414 TObjString* anAlias = 0;
546242fb 1415 Int_t iAlias = 0;
36c99a6a 1416 Int_t nTotAliases= ((TMap*)fConfig->GetDCSAliases(fCurrentDetector))->GetEntries();
2c15234c 1417 TIter iterAliases(fConfig->GetDCSAliases(fCurrentDetector));
1418 while ((anAlias = (TObjString*) iterAliases.Next()))
1419 {
1420 TObjArray *valueSet = new TObjArray();
1421 valueSet->SetOwner(1);
1422
546242fb 1423 iAlias++;
36c99a6a 1424 if (((iAlias-1) % 500) == 0 || iAlias == nTotAliases)
1425 AliInfo(Form("Querying DCS archive: alias %s (%d of %d)",
546242fb 1426 anAlias->GetName(), iAlias, nTotAliases));
2c15234c 1427 aDCSError = (GetValueSet(host, port, anAlias->String(), valueSet, kAlias) == 0);
1428
1429 if(!aDCSError)
1430 {
1431 dcsMap.Add(anAlias->Clone(), valueSet);
1432 } else {
1433 Log(fCurrentDetector,
1434 Form("ProcessCurrentDetector - Error while retrieving alias %s",
1435 anAlias->GetName()));
1436 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1437 dcsMap.DeleteAll();
9827400b 1438 return kFALSE;
2c15234c 1439 }
4f0ab988 1440 }
2c15234c 1441
1442 // Retrieval of Data Points
1443 TObjString* aDP = 0;
36c99a6a 1444 Int_t iDP = 0;
1445 Int_t nTotDPs= ((TMap*)fConfig->GetDCSDataPoints(fCurrentDetector))->GetEntries();
2c15234c 1446 TIter iterDP(fConfig->GetDCSDataPoints(fCurrentDetector));
1447 while ((aDP = (TObjString*) iterDP.Next()))
1448 {
1449 TObjArray *valueSet = new TObjArray();
1450 valueSet->SetOwner(1);
36c99a6a 1451 if ((iDP++ % 500) == 0 || iDP == nTotDPs) // post-increment: test the old index, report the new count
1452 AliInfo(Form("Querying DCS archive: DP %s (%d of %d)",
1453 aDP->GetName(), iDP, nTotDPs));
2c15234c 1454 aDCSError = (GetValueSet(host, port, aDP->String(), valueSet, kDP) == 0);
1455
1456 if(!aDCSError)
1457 {
1458 dcsMap.Add(aDP->Clone(), valueSet);
1459 } else {
1460 Log(fCurrentDetector,
1461 Form("ProcessCurrentDetector - Error while retrieving data point %s",
1462 aDP->GetName()));
1463 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1464 dcsMap.DeleteAll();
9827400b 1465 return kFALSE;
2c15234c 1466 }
73abe331 1467 }
1468 }
b948db8d 1469
2bb7b766 1470 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1471 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1472
3301427a 1473 UInt_t returnValue = aPreprocessor->Process(&dcsMap);
b948db8d 1474
3301427a 1475 if (returnValue > 0) // Preprocessor error!
1476 {
9827400b 1477 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
cb343cfd 1478 UpdateShuttleStatus(AliShuttleStatus::kPPError);
9827400b 1479 dcsMap.DeleteAll();
1480 return kFALSE;
1481 }
1482
1483 // preprocessor ok!
1484 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1485 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1486 fCurrentDetector.Data()));
b948db8d 1487
2c15234c 1488 dcsMap.DeleteAll();
b948db8d 1489
9827400b 1490 return kTRUE;
2bb7b766 1491}
1492
1493//______________________________________________________________________________________________
1494Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1495 TObjArray& entries)
1496{
9827400b 1497 // Queries DAQ's Shuttle logbook and fills the detector status object.
1498 // Call QueryRunParameters to query DAQ logbook for run parameters.
1499 //
2bb7b766 1500
fc5a4708 1501 entries.SetOwner(1);
1502
2bb7b766 1503 // check connection; connect if needed
be48e3ea 1504 if(!Connect(3)) return kFALSE;
2bb7b766 1505
1506 TString sqlQuery;
441b0e9c 1507 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1508
be48e3ea 1509 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1510 if (!aResult) {
1511 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1512 return kFALSE;
1513 }
1514
fc5a4708 1515 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1516
2bb7b766 1517 if(aResult->GetRowCount() == 0) {
9827400b 1518 AliInfo("No entries in Shuttle Logbook match request");
1519 delete aResult;
1520 return kTRUE;
2bb7b766 1521 }
1522
1523 // TODO Check field count!
fc5a4708 1524 const UInt_t nCols = 22;
2bb7b766 1525 if (aResult->GetFieldCount() != (Int_t) nCols) {
1526 AliError("Invalid SQL result field number!");
1527 delete aResult;
1528 return kFALSE;
1529 }
1530
2bb7b766 1531 TSQLRow* aRow;
1532 while ((aRow = aResult->Next())) {
1533 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1534 Int_t run = runString.Atoi();
1535
eba76848 1536 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1537 if (!entry)
1538 continue;
2bb7b766 1539
1540 // loop on detectors
eba76848 1541 for(UInt_t ii = 0; ii < nCols; ii++)
1542 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1543
eba76848 1544 entries.AddLast(entry);
2bb7b766 1545 delete aRow;
1546 }
1547
2bb7b766 1548 delete aResult;
1549 return kTRUE;
1550}
1551
1552//______________________________________________________________________________________________
eba76848 1553AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1554{
eba76848 1555 //
1556 // Retrieves run parameters written in the DAQ logbook and sets them in an AliShuttleLogbookEntry object
1557 //
2bb7b766 1558
1559 // check connection; connect if needed
be48e3ea 1560 if (!Connect(3))
eba76848 1561 return 0;
2bb7b766 1562
1563 TString sqlQuery;
2c15234c 1564 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1565
be48e3ea 1566 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1567 if (!aResult) {
1568 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1569 return 0;
2bb7b766 1570 }
1571
eba76848 1572 if (aResult->GetRowCount() == 0) {
2bb7b766 1573 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1574 delete aResult;
eba76848 1575 return 0;
2bb7b766 1576 }
1577
eba76848 1578 if (aResult->GetRowCount() > 1) {
2bb7b766 1579 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1580 delete aResult;
eba76848 1581 return 0;
2bb7b766 1582 }
1583
eba76848 1584 TSQLRow* aRow = aResult->Next();
1585 if (!aRow)
1586 {
1587 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1588 delete aResult;
1589 return 0;
1590 }
2bb7b766 1591
eba76848 1592 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1593
eba76848 1594 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1595 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1596
eba76848 1597 UInt_t startTime = entry->GetStartTime();
1598 UInt_t endTime = entry->GetEndTime();
1599
1600 if (!startTime || !endTime || startTime > endTime) {
1601 Log("SHUTTLE",
1602 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %d, endTime = %d",
1603 run, startTime, endTime));
1604 delete entry;
2bb7b766 1605 delete aRow;
eba76848 1606 delete aResult;
1607 return 0;
2bb7b766 1608 }
1609
eba76848 1610 delete aRow;
2bb7b766 1611 delete aResult;
eba76848 1612
1613 return entry;
2bb7b766 1614}
1615
1616//______________________________________________________________________________________________
2c15234c 1617Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1618 TObjArray* valueSet, DCSType type)
73abe331 1619{
9827400b 1620 // Retrieve all "entry" data points from the DCS server
1621 // host, port: TSocket connection parameters
1622 // entry: name of the alias or data point
1623 // valueSet: array of retrieved AliDCSValue's
1624 // type: kAlias or kDP
58bc3020 1625
73abe331 1626 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1627 if (!client.IsConnected())
1628 {
b948db8d 1629 return kFALSE;
73abe331 1630 }
1631
2c15234c 1632 Int_t result=0;
73abe331 1633
2c15234c 1634 if (type == kAlias)
1635 {
1636 result = client.GetAliasValues(entry,
1637 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1638 } else
1639 if (type == kDP)
1640 {
1641 result = client.GetDPValues(entry,
1642 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1643 }
1644
1645 if (result < 0)
1646 {
2bb7b766 1647 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1648 entry, AliDCSClient::GetErrorString(result)));
73abe331 1649
2c15234c 1650 if (result == AliDCSClient::fgkServerError)
1651 {
2bb7b766 1652 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1653 client.GetServerError().Data()));
1654 }
1655
1656 return kFALSE;
1657 }
1658
1659 return kTRUE;
1660}
b948db8d 1661
1662//______________________________________________________________________________________________
57f50b3c 1663const char* AliShuttle::GetFile(Int_t system, const char* detector,
1664 const char* id, const char* source)
b948db8d 1665{
9827400b 1666 // Get calibration file from file exchange servers
1667 // First queries the FXS database for the file name, using the run, detector, id and source info
1668 // then calls RetrieveFile(filename) for actual copy to local disk
1669 // run: current run being processed (given by Logbook entry fLogbookEntry)
1670 // detector: the Preprocessor name
1671 // id: provided as a parameter by the Preprocessor
1672 // source: provided by the Preprocessor through GetFileSources function
1673
1674 // check if test mode should simulate a FXS error
1675 if (fTestMode & kErrorFXSFiles)
1676 {
1677 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1678 return 0;
1679 }
1680
57f50b3c 1681 // check connection; connect if needed
9d733021 1682 if (!Connect(system))
eba76848 1683 {
9d733021 1684 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1685 return 0;
1686 }
1687
1688 // Query preparation
9d733021 1689 TString sourceName(source);
d386d623 1690 Int_t nFields = 3;
1691 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1692 fConfig->GetFXSdbTable(system));
1693 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1694 GetCurrentRun(), detector, id);
1695
9d733021 1696 if (system == kDAQ)
1697 {
d386d623 1698 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1699 }
9d733021 1700 else if (system == kDCS)
eba76848 1701 {
9d733021 1702 sourceName="none";
57f50b3c 1703 }
9d733021 1704 else if (system == kHLT)
9e080f92 1705 {
d386d623 1706 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1707 nFields = 3;
9e080f92 1708 }
1709
9e080f92 1710 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1711
1712 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1713
1714 // Query execution
1715 TSQLResult* aResult = 0;
9d733021 1716 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1717 if (!aResult) {
9d733021 1718 Log(detector, Form("GetFile - Can't execute SQL query to %s database for: id = %s, source = %s",
1719 GetSystemName(system), id, sourceName.Data()));
9e080f92 1720 return 0;
1721 }
1722
1723 if(aResult->GetRowCount() == 0)
1724 {
1725 Log(detector,
9d733021 1726 Form("GetFile - No entry in %s FXS db for: id = %s, source = %s",
1727 GetSystemName(system), id, sourceName.Data()));
9e080f92 1728 delete aResult;
1729 return 0;
1730 }
2bb7b766 1731
9e080f92 1732 if (aResult->GetRowCount() > 1) {
1733 Log(detector,
9d733021 1734 Form("GetFile - More than one entry in %s FXS db for: id = %s, source = %s",
1735 GetSystemName(system), id, sourceName.Data()));
9e080f92 1736 delete aResult;
1737 return 0;
1738 }
1739
9d733021 1740 if (aResult->GetFieldCount() != nFields) {
9e080f92 1741 Log(detector,
9d733021 1742 Form("GetFile - Wrong field count in %s FXS db for: id = %s, source = %s",
1743 GetSystemName(system), id, sourceName.Data()));
9e080f92 1744 delete aResult;
1745 return 0;
1746 }
1747
1748 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1749
1750 if (!aRow){
9d733021 1751 Log(detector, Form("GetFile - Empty set result in %s FXS db from query: id = %s, source = %s",
1752 GetSystemName(system), id, sourceName.Data()));
9e080f92 1753 delete aResult;
1754 return 0;
1755 }
1756
1757 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1758 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1759 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1760
1761 delete aResult;
1762 delete aRow;
1763
d386d623 1764 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1765 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1766
9e080f92 1767 // retrieved file is renamed to make it unique
9d733021 1768 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1769 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1770
9e080f92 1771
9d733021 1772 // file retrieval from FXS
4b95672b 1773 UInt_t nRetries = 0;
1774 UInt_t maxRetries = 3;
1775 Bool_t result = kFALSE;
1776
1777 // copy the file, retrying up to maxRetries times; RetrieveFile returns kTRUE on success
1778 while(nRetries++ < maxRetries) {
1779 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1780 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1781 if(!result)
1782 {
1783 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
9d733021 1784 filePath.Data(), GetSystemName(system)));
4b95672b 1785 continue;
1786 } else {
1787 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1788 filePath.Data(), GetSystemName(system),
1789 GetShuttleTempDir(), localFileName.Data()));
1790 }
9e080f92 1791
d386d623 1792 if (fileChecksum.Length()>0)
4b95672b 1793 {
1794 // compare md5sum of local file with the one stored in the FXS DB
1795 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
d386d623 1796 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 1797
4b95672b 1798 if (md5Comp != 0)
1799 {
1800 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1801 filePath.Data()));
1802 result = kFALSE;
1803 continue;
1804 }
d386d623 1805 } else {
1806 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1807 filePath.Data(), GetSystemName(system)));
9d733021 1808 }
4b95672b 1809 if (result) break;
9e080f92 1810 }
1811
4b95672b 1812 if(!result) return 0;
1813
9d733021 1814 fFXSCalled[system]=kTRUE;
1815 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1816 fFXSlist[system].Add(fileParams);
9e080f92 1817
1818 static TString fullLocalFileName;
36c99a6a 1819 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1820
9e080f92 1821 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1822
1823 return fullLocalFileName.Data();
2bb7b766 1824
1825}
1826
1827//______________________________________________________________________________________________
9d733021 1828Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 1829{
9827400b 1830 //
1831 // Copies file from FXS to local Shuttle machine
1832 //
2bb7b766 1833
9e080f92 1834 // check temp directory: try to open it; if it does not exist, create it
9d733021 1835 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
1836 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 1837
36c99a6a 1838 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 1839 if (dir == NULL) {
36c99a6a 1840 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
1841 AliError(Form("Can't create directory <%s>", GetShuttleTempDir()));
9e080f92 1842 return kFALSE;
1843 }
1844
1845 } else {
1846 gSystem->FreeDirectory(dir);
1847 }
1848
9d733021 1849 TString baseFXSFolder;
1850 if (system == kDAQ)
1851 {
1852 baseFXSFolder = "FES/";
1853 }
1854 else if (system == kDCS)
1855 {
1856 baseFXSFolder = "";
1857 }
1858 else if (system == kHLT)
1859 {
1860 baseFXSFolder = "~/";
1861 }
1862
1863
1864 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
1865 fConfig->GetFXSPort(system),
1866 fConfig->GetFXSUser(system),
1867 fConfig->GetFXSHost(system),
1868 baseFXSFolder.Data(),
1869 fxsFileName,
36c99a6a 1870 GetShuttleTempDir(),
9e080f92 1871 localFileName);
1872
1873 AliDebug(2, Form("%s",command.Data()));
1874
4b95672b 1875 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 1876
4b95672b 1877 return result;
9e080f92 1878}
1879
1880//______________________________________________________________________________________________
9d733021 1881TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
1882{
9827400b 1883 //
1884 // Get sources producing the condition file Id from file exchange servers
1885 //
1886
1887 // check if test mode should simulate a FXS error
1888 if (fTestMode & kErrorFXSSources)
1889 {
1890 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1891 return 0;
1892 }
1893
9d733021 1894
1895 if (system == kDCS)
1896 {
1897 AliError("DCS system has only one source of data!");
1898 return NULL;
9d733021 1899 }
9e080f92 1900
1901 // check connection; connect if not yet connected
9d733021 1902 if (!Connect(system))
1903 {
1904 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
1905 return NULL;
9e080f92 1906 }
1907
9d733021 1908 TString sourceName;
1909 if (system == kDAQ)
1910 {
1911 sourceName = "DAQsource";
1912 } else if (system == kHLT)
1913 {
1914 sourceName = "DDLnumbers";
1915 }
1916
d386d623 1917 TString sqlQueryStart = Form("select %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
9e080f92 1918 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1919 GetCurrentRun(), detector, id);
1920 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1921
1922 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1923
1924 // Query execution
1925 TSQLResult* aResult;
9d733021 1926 aResult = fServer[system]->Query(sqlQuery);
9e080f92 1927 if (!aResult) {
9d733021 1928 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
1929 GetSystemName(system), id));
9e080f92 1930 return 0;
1931 }
1932
86aa42c3 1933 TList *list = new TList();
1934 list->SetOwner(1);
1935
9d733021 1936 if (aResult->GetRowCount() == 0)
1937 {
9e080f92 1938 Log(detector,
9d733021 1939 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
9e080f92 1940 delete aResult;
86aa42c3 1941 return list;
9e080f92 1942 }
1943
1944 TSQLRow* aRow;
9e080f92 1945
9d733021 1946 while ((aRow = aResult->Next()))
1947 {
9e080f92 1948
9d733021 1949 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
1950 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
1951 list->Add(new TObjString(source));
9e080f92 1952 delete aRow;
1953 }
9d733021 1954
9e080f92 1955 delete aResult;
1956
1957 return list;
2bb7b766 1958}
1959
1960//______________________________________________________________________________________________
9d733021 1961Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 1962{
9827400b 1963 // Connect to MySQL Server of the system's FXS MySQL databases
1964 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
1965 //
57f50b3c 1966
9d733021 1967 // check connection: if already connected return
1968 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 1969
9d733021 1970 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 1971
9d733021 1972 if (system < 3) // FXS db servers
1973 {
1974 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
1975 dbUser = fConfig->GetFXSdbUser(system);
1976 dbPass = fConfig->GetFXSdbPass(system);
1977 dbName = fConfig->GetFXSdbName(system);
1978 } else { // Run & Shuttle logbook servers
1979 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
1980 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
1981 dbUser = fConfig->GetDAQlbUser();
1982 dbPass = fConfig->GetDAQlbPass();
1983 dbName = fConfig->GetDAQlbDB();
1984 }
57f50b3c 1985
9d733021 1986 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
1987 if (!fServer[system] || !fServer[system]->IsConnected()) {
1988 if(system < 3)
1989 {
1990 AliError(Form("Can't establish connection to FXS database for %s",
1991 AliShuttleInterface::GetSystemName(system)));
1992 } else {
1993 AliError("Can't establish connection to Run logbook.");
57f50b3c 1994 }
9d733021 1995 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; } // avoid a dangling pointer on the next Connect()
1996 return kFALSE;
2bb7b766 1997 }
57f50b3c 1998
9d733021 1999 // Get tables
2000 TSQLResult* aResult=0;
2001 switch(system){
2002 case kDAQ:
2003 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2004 break;
2005 case kDCS:
2006 aResult = fServer[kDCS]->GetTables(dbName.Data());
2007 break;
2008 case kHLT:
2009 aResult = fServer[kHLT]->GetTables(dbName.Data());
2010 break;
2011 default:
2012 aResult = fServer[3]->GetTables(dbName.Data());
2013 break;
2014 }
2015
2016 delete aResult;
2bb7b766 2017 return kTRUE;
2018}
57f50b3c 2019
9e080f92 2020//______________________________________________________________________________________________
9d733021 2021Bool_t AliShuttle::UpdateTable()
9e080f92 2022{
9827400b 2023 //
2024 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2025 //
9e080f92 2026
9d733021 2027 Bool_t result = kTRUE;
9e080f92 2028
9d733021 2029 for (UInt_t system=0; system<3; system++)
2030 {
2031 if(!fFXSCalled[system]) continue;
9e080f92 2032
9d733021 2033 // check connection; connect if not yet connected
2034 if (!Connect(system))
2035 {
2036 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2037 result = kFALSE;
2038 continue;
9e080f92 2039 }
9e080f92 2040
9d733021 2041 TTimeStamp now; // now
2042
2043 // Loop on FXS list entries
2044 TIter iter(&fFXSlist[system]);
2045 TObjString *aFXSentry=0;
2046 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2047 {
2048 TString aFXSentrystr = aFXSentry->String();
2049 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2050 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2051 {
2052 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2053 GetSystemName(system), aFXSentrystr.Data()));
2054 if(aFXSarray) delete aFXSarray;
2055 result = kFALSE;
2056 continue;
2057 }
2058 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2059 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2060
2061 TString whereClause;
2062 if (system == kDAQ)
2063 {
2064 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2065 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2066 }
2067 else if (system == kDCS)
2068 {
2069 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2070 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2071 }
2072 else if (system == kHLT)
2073 {
2074 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2075 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2076 }
2077
2078 delete aFXSarray;
9e080f92 2079
9d733021 2080 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2081 (Long_t) now.GetSec(), whereClause.Data());
9e080f92 2082
9d733021 2083 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2084
9d733021 2085 // Query execution
2086 TSQLResult* aResult;
2087 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2088 if (!aResult)
2089 {
2090 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2091 GetSystemName(system), sqlQuery.Data()));
2092 result = kFALSE;
2093 continue;
2094 }
2095 delete aResult;
9e080f92 2096 }
9e080f92 2097 }
2098
9d733021 2099 return result;
9e080f92 2100}
57f50b3c 2101
2bb7b766 2102//______________________________________________________________________________________________
3301427a 2103Bool_t AliShuttle::UpdateTableFailCase()
2104{
9827400b 2105 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2106 // this is called in case the preprocessor is declared failed for the current run, because
2107 // the fields are updated only in case of success
3301427a 2108
2109 Bool_t result = kTRUE;
2110
2111 for (UInt_t system=0; system<3; system++)
2112 {
2113 // check connection; connect if not yet connected
2114 if (!Connect(system))
2115 {
2116 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2117 GetSystemName(system)));
2118 result = kFALSE;
2119 continue;
2120 }
2121
2122 TTimeStamp now; // now
2123
2124 // No loop on FXS list entries here: all rows of the current run and detector are updated
2125
2126 TString whereClause = Form("where run=%d and detector=\"%s\";",
2127 GetCurrentRun(), fCurrentDetector.Data());
2128
2129
2130 TString sqlQuery = Form("update %s set time_processed=%ld %s", fConfig->GetFXSdbTable(system),
2131 (Long_t) now.GetSec(), whereClause.Data());
2132
2133 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2134
2135 // Query execution
2136 TSQLResult* aResult;
2137 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2138 if (!aResult)
2139 {
2140 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2141 GetSystemName(system), sqlQuery.Data()));
2142 result = kFALSE;
2143 continue;
2144 }
2145 delete aResult;
2146 }
2147
2148 return result;
2149}
2150
2151//______________________________________________________________________________________________
2bb7b766 2152Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2153{
e7f62f16 2154 //
2155 // Update Shuttle logbook filling detector or shuttle_done column
2156 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2157 //
57f50b3c 2158
2bb7b766 2159 // check connection; connect if not yet connected
be48e3ea 2160 if(!Connect(3)){
2bb7b766 2161 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2162 return kFALSE;
57f50b3c 2163 }
2164
2bb7b766 2165 TString detName(detector);
2166 TString setClause;
e7f62f16 2167 if(detName == "shuttle_done")
2168 {
2bb7b766 2169 setClause = "set shuttle_done=1";
e7f62f16 2170
2171 // Send the information to ML
2172 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2173
2174 TList mlList;
2175 mlList.Add(&mlStatus);
2176
2177 fMonaLisa->SendParameters(&mlList);
2bb7b766 2178 } else {
2bb7b766 2179 TString statusStr(status);
2180 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2181 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2182 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2183 } else {
2184 Log("SHUTTLE",
2185 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2186 status, detector));
2187 return kFALSE;
2188 }
2189 }
57f50b3c 2190
2bb7b766 2191 TString whereClause = Form("where run=%d", GetCurrentRun());
2192
441b0e9c 2193 TString sqlQuery = Form("update %s %s %s",
2194 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2195
2bb7b766 2196 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2197
2198 // Query execution
2199 TSQLResult* aResult;
be48e3ea 2200 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2201 if (!aResult) {
2202 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2203 return kFALSE;
57f50b3c 2204 }
2bb7b766 2205 delete aResult;
57f50b3c 2206
2207 return kTRUE;
2208}
2209
2210//______________________________________________________________________________________________
2bb7b766 2211Int_t AliShuttle::GetCurrentRun() const
2212{
9827400b 2213 //
2214 // Get current run from logbook entry
2215 //
57f50b3c 2216
2bb7b766 2217 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2218}
2219
2220//______________________________________________________________________________________________
2bb7b766 2221UInt_t AliShuttle::GetCurrentStartTime() const
2222{
9827400b 2223 //
2224 // get current start time
2225 //
57f50b3c 2226
2bb7b766 2227 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2228}
2229
2230//______________________________________________________________________________________________
2bb7b766 2231UInt_t AliShuttle::GetCurrentEndTime() const
2232{
9827400b 2233 //
2234 // get current end time from logbook entry
2235 //
57f50b3c 2236
2bb7b766 2237 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2238}
2239
2240//______________________________________________________________________________________________
b948db8d 2241void AliShuttle::Log(const char* detector, const char* message)
2242{
9827400b 2243 //
2244 // Fill log string with a message
2245 //
b948db8d 2246
36c99a6a 2247 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2248 if (dir == NULL) {
36c99a6a 2249 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2250 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
84090f85 2251 return;
2252 }
b948db8d 2253
84090f85 2254 } else {
2255 gSystem->FreeDirectory(dir);
2256 }
b948db8d 2257
cb343cfd 2258 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2259 if (GetCurrentRun() >= 0)
2260 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2261 toLog += Form("%s", message);
2262
84090f85 2263 AliInfo(toLog.Data());
ffa29e93 2264
2265 // if we redirect the log output already to the file, leave here
2266 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2267 return;
b948db8d 2268
ffa29e93 2269 TString fileName = GetLogFileName(detector);
e7f62f16 2270
84090f85 2271 gSystem->ExpandPathName(fileName);
2272
2273 ofstream logFile;
2274 logFile.open(fileName, ofstream::out | ofstream::app);
2275
2276 if (!logFile.is_open()) {
2277 AliError(Form("Could not open file %s", fileName.Data()));
2278 return;
2279 }
7bfb2090 2280
84090f85 2281 logFile << toLog.Data() << "\n";
b948db8d 2282
84090f85 2283 logFile.close();
b948db8d 2284}
2bb7b766 2285
2bb7b766 2286//______________________________________________________________________________________________
ffa29e93 2287TString AliShuttle::GetLogFileName(const char* detector) const
2288{
2289 //
2290 // returns the name of the log file for a given sub detector
2291 //
2292
2293 TString fileName;
2294
2295 if (GetCurrentRun() >= 0)
2296 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2297 else
2298 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2299
2300 return fileName;
2301}
2302
2303//______________________________________________________________________________________________
2bb7b766 2304Bool_t AliShuttle::Collect(Int_t run)
2305{
9827400b 2306 //
2307 // Collects conditions data for all UNPROCESSED runs written to DAQ LogBook in case of run = -1 (default)
2308 // If a dedicated run is given this run is processed
2309 //
2310 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2311 //
2bb7b766 2312
eba76848 2313 if (run == -1)
2314 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2315 else
2316 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2317
2318 SetLastAction("Starting");
2bb7b766 2319
2320 TString whereClause("where shuttle_done=0");
eba76848 2321 if (run != -1)
2322 whereClause += Form(" and run=%d", run);
2bb7b766 2323
2324 TObjArray shuttleLogbookEntries;
be48e3ea 2325 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2326 {
cb343cfd 2327 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2328 return kFALSE;
2329 }
2330
9e080f92 2331 if (shuttleLogbookEntries.GetEntries() == 0)
2332 {
2333 if (run == -1)
2334 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2335 else
2336 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2337 "or it does not exist in Shuttle logbook", run));
2338 return kTRUE;
2339 }
2340
be48e3ea 2341 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2342 fFirstUnprocessed[iDet] = kTRUE;
2343
fc5a4708 2344 if (run != -1)
be48e3ea 2345 {
2346 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2347 // flag them into fFirstUnprocessed array
2348 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2349 TObjArray tmpLogbookEntries;
2350 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2351 {
2352 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2353 return kFALSE;
2354 }
2355
2356 TIter iter(&tmpLogbookEntries);
2357 AliShuttleLogbookEntry* anEntry = 0;
2358 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2359 {
2360 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2361 {
2362 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2363 {
2364 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2365 anEntry->GetRun(), GetDetName(iDet)));
2366 fFirstUnprocessed[iDet] = kFALSE;
2367 }
2368 }
2369
2370 }
2371
2372 }
2373
2374 if (!RetrieveConditionsData(shuttleLogbookEntries))
2375 {
cb343cfd 2376 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2377 return kFALSE;
2378 }
2379
36c99a6a 2380 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2381 return kTRUE;
2bb7b766 2382}
2383
2bb7b766 2384//______________________________________________________________________________________________
2385Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2386{
9827400b 2387 //
2388 // Retrieve conditions data for all runs that aren't processed yet
2389 //
2bb7b766 2390
2391 Bool_t hasError = kFALSE;
2392
2393 TIter iter(&dateEntries);
2394 AliShuttleLogbookEntry* anEntry;
2395
2396 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2397 if (!Process(anEntry)){
2398 hasError = kTRUE;
2399 }
4b95672b 2400
2401 // clean SHUTTLE temp directory
3301427a 2402 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2403 RemoveFile(filename.Data());
2bb7b766 2404 }
2405
2406 return hasError == kFALSE;
2407}
cb343cfd 2408
2409//______________________________________________________________________________________________
2410ULong_t AliShuttle::GetTimeOfLastAction() const
2411{
9827400b 2412 //
2413 // Gets time of last action
2414 //
2415
cb343cfd 2416 ULong_t tmp;
36c99a6a 2417
cb343cfd 2418 fMonitoringMutex->Lock();
be48e3ea 2419
cb343cfd 2420 tmp = fLastActionTime;
36c99a6a 2421
cb343cfd 2422 fMonitoringMutex->UnLock();
36c99a6a 2423
cb343cfd 2424 return tmp;
2425}
2426
2427//______________________________________________________________________________________________
2428const TString AliShuttle::GetLastAction() const
2429{
9827400b 2430 //
cb343cfd 2431 // returns a string description of the last action
9827400b 2432 //
cb343cfd 2433
2434 TString tmp;
36c99a6a 2435
cb343cfd 2436 fMonitoringMutex->Lock();
2437
2438 tmp = fLastAction;
2439
2440 fMonitoringMutex->UnLock();
2441
36c99a6a 2442 return tmp;
cb343cfd 2443}
2444
2445//______________________________________________________________________________________________
2446void AliShuttle::SetLastAction(const char* action)
2447{
9827400b 2448 //
cb343cfd 2449 // updates the monitoring variables
9827400b 2450 //
36c99a6a 2451
cb343cfd 2452 fMonitoringMutex->Lock();
36c99a6a 2453
cb343cfd 2454 fLastAction = action;
2455 fLastActionTime = time(0);
2456
2457 fMonitoringMutex->UnLock();
2458}
eba76848 2459
2460//______________________________________________________________________________________________
2461const char* AliShuttle::GetRunParameter(const char* param)
2462{
9827400b 2463 //
2464 // returns run parameter read from DAQ logbook
2465 //
eba76848 2466
2467 if(!fLogbookEntry) {
2468 AliError("No logbook entry!");
2469 return 0;
2470 }
2471
2472 return fLogbookEntry->GetRunParameter(param);
2473}
57c1a579 2474
2475//______________________________________________________________________________________________
9827400b 2476AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
d386d623 2477{
9827400b 2478 //
2479 // returns object from OCDB valid for current run
2480 //
d386d623 2481
9827400b 2482 if (fTestMode & kErrorOCDB)
2483 {
2484 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2485 return 0;
2486 }
2487
d386d623 2488 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2489 if (!sto)
2490 {
9827400b 2491 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
d386d623 2492 return 0;
2493 }
2494
2495 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2496}
2497
2498//______________________________________________________________________________________________
57c1a579 2499Bool_t AliShuttle::SendMail()
2500{
9827400b 2501 //
2502 // sends a mail to the subdetector expert in case of preprocessor error
2503 //
2504
2505 if (fTestMode != kNone)
2506 return kTRUE;
57c1a579 2507
36c99a6a 2508 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
57c1a579 2509 if (dir == NULL)
2510 {
36c99a6a 2511 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
57c1a579 2512 {
36c99a6a 2513 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
57c1a579 2514 return kFALSE;
2515 }
2516
2517 } else {
2518 gSystem->FreeDirectory(dir);
2519 }
2520
2521 TString bodyFileName;
36c99a6a 2522 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
57c1a579 2523 gSystem->ExpandPathName(bodyFileName);
2524
2525 ofstream mailBody;
2526 mailBody.open(bodyFileName, ofstream::out);
2527
2528 if (!mailBody.is_open())
2529 {
2530 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2531 return kFALSE;
2532 }
2533
2534 TString to="";
2535 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2536 TObjString *anExpert=0;
2537 while ((anExpert = (TObjString*) iterExperts.Next()))
2538 {
2539 to += Form("%s,", anExpert->GetName());
2540 }
2541 if (to.Length() > 0) to.Remove(to.Length()-1); // drop the trailing comma, if any
909732f7 2542 AliDebug(2, Form("to: %s",to.Data()));
57c1a579 2543
86aa42c3 2544 if (to.IsNull()) {
36c99a6a 2545 AliInfo("List of detector responsibles not yet set!");
2546 return kFALSE;
2547 }
2548
57c1a579 2549 TString cc="alberto.colla@cern.ch";
2550
546242fb 2551 TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
57c1a579 2552 fCurrentDetector.Data(), GetCurrentRun());
909732f7 2553 AliDebug(2, Form("subject: %s", subject.Data()));
57c1a579 2554
2555 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2556 body += Form("SHUTTLE just detected that your preprocessor "
546242fb 2557 "failed processing run %d!!\n\n", GetCurrentRun());
2558 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
2559 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
2560 body += Form("Find the %s log for the current run on \n\n"
2561 "\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n",
2562 fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2563 body += Form("The last 10 lines of %s log file follow:\n\n", fCurrentDetector.Data());
2564
909732f7 2565 AliDebug(2, Form("Body begin: %s", body.Data()));
57c1a579 2566
2567 mailBody << body.Data();
2568 mailBody.close();
2569 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2570
9d733021 2571 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2572 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2573 if (gSystem->Exec(tailCommand.Data()))
2574 {
2575 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2576 }
2577
2578 TString endBody = Form("------------------------------------------------------\n\n");
36c99a6a 2579 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2580 endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
546242fb 2581 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
57c1a579 2582
909732f7 2583 AliDebug(2, Form("Body end: %s", endBody.Data()));
57c1a579 2584
2585 mailBody << endBody.Data();
2586
2587 mailBody.close();
2588
2589 // send mail!
2590 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2591 subject.Data(),
2592 cc.Data(),
2593 to.Data(),
2594 bodyFileName.Data());
909732f7 2595 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
57c1a579 2596
2597 Int_t result = gSystem->Exec(mailCommand.Data());
2598
2599 return result == 0;
2600}
d386d623 2601
2602//______________________________________________________________________________________________
9827400b 2603const char* AliShuttle::GetRunType()
441b0e9c 2604{
9827400b 2605 //
2606 // returns run type read from "run type" logbook
2607 //
441b0e9c 2608
2609 if(!fLogbookEntry) {
2610 AliError("No logbook entry!");
2611 return 0;
2612 }
2613
9827400b 2614 return fLogbookEntry->GetRunType();
441b0e9c 2615}
2616
2617//______________________________________________________________________________________________
d386d623 2618void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2619{
9827400b 2620 //
2621 // sets Shuttle temp directory
2622 //
d386d623 2623
2624 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2625}
2626
2627//______________________________________________________________________________________________
2628void AliShuttle::SetShuttleLogDir(const char* logDir)
2629{
9827400b 2630 //
2631 // sets Shuttle log directory
2632 //
d386d623 2633
2634 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2635}