git.uio.no Git - u/mrichter/AliRoot.git/blame - SHUTTLE/AliShuttle.cxx
More protection in solve (this should solve bug #26734) (suggested by P.H.)
Commit Line Data
73abe331 1/**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3 * *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
6 * *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
15
16/*
17$Log$
a038aa70 18Revision 1.45 2007/05/30 06:35:20 jgrosseo
19Adding functionality to the Shuttle/TestShuttle:
20o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
21o) Function to retrieve list of IDs for a given source (GetFileIDs)
22These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
23Example code has been added to the TestProcessor in TestShuttle
24
4a33bdd9 25Revision 1.44 2007/05/11 16:09:32 acolla
26Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
27example: ITS/SPD/100_filename.root
28
2d9019b4 29Revision 1.43 2007/05/10 09:59:51 acolla
30Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
31
546242fb 32Revision 1.42 2007/05/03 08:01:39 jgrosseo
33typo in last commit :-(
34
8b739301 35Revision 1.41 2007/05/03 08:00:48 jgrosseo
36fixing log message when pp want to skip dcs value retrieval
37
651fdaab 38Revision 1.40 2007/04/27 07:06:48 jgrosseo
39GetFileSources returns empty list in case of no files, but successful query
40No mails sent in testmode
41
86aa42c3 42Revision 1.39 2007/04/17 12:43:57 acolla
43Correction in StoreOCDB; change of text in mail to detector expert
44
26758fce 45Revision 1.38 2007/04/12 08:26:18 jgrosseo
46updated comment
47
3c2a21c8 48Revision 1.37 2007/04/10 16:53:14 jgrosseo
49redirecting sub detector stdout, stderr to sub detector log file
50
3d8bc902 51Revision 1.35 2007/04/04 16:26:38 acolla
521. Re-organization of function calls in TestPreprocessor to make it more meaningful.
532. Added missing dependency in test preprocessors.
543. in AliShuttle.cxx: processing time and memory consumption info on a single line.
55
886d60e6 56Revision 1.34 2007/04/04 10:33:36 jgrosseo
571) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
58In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
59
602) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
61
623) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
63
644) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
65
665) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions. E.g. if there is a certain run type.
67If you always need DCS data (like before), you do not need to implement it.
68
696) The run type has been added to the monitoring page
70
9827400b 71Revision 1.33 2007/04/03 13:56:01 acolla
72Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
73run type.
74
3301427a 75Revision 1.32 2007/02/28 10:41:56 acolla
76Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
77AliPreprocessor::GetRunType() function.
78Added some ldap definition files.
79
d386d623 80Revision 1.30 2007/02/13 11:23:21 acolla
81Moved getters and setters of Shuttle's main OCDB/Reference, local
82OCDB/Reference, temp and log folders to AliShuttleInterface
83
9d733021 84Revision 1.27 2007/01/30 17:52:42 jgrosseo
85adding monalisa monitoring
86
e7f62f16 87Revision 1.26 2007/01/23 19:20:03 acolla
88Removed old ldif files, added TOF, MCH ldif files. Added some options in
89AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
90SetShuttleLogDir
91
36c99a6a 92Revision 1.25 2007/01/15 19:13:52 acolla
93Moved some AliInfo to AliDebug in SendMail function
94
fc5a4708 95Revision 1.21 2006/12/07 08:51:26 jgrosseo
96update (alberto):
97table, db names in ldap configuration
98added GRP preprocessor
99DCS data can also be retrieved by data point
100
2c15234c 101Revision 1.20 2006/11/16 16:16:48 jgrosseo
102introducing strict run ordering flag
103removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
104
be48e3ea 105Revision 1.19 2006/11/06 14:23:04 jgrosseo
106major update (Alberto)
107o) reading of run parameters from the logbook
108o) online offline naming conversion
109o) standalone DCSclient package
110
eba76848 111Revision 1.18 2006/10/20 15:22:59 jgrosseo
112o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
113o) Merging Collect, CollectAll, CollectNew function
114o) Removing implementation of empty copy constructors (declaration still there!)
115
cb343cfd 116Revision 1.17 2006/10/05 16:20:55 jgrosseo
117adapting to new CDB classes
118
6ec0e06c 119Revision 1.16 2006/10/05 15:46:26 jgrosseo
120applying to the new interface
121
481441a2 122Revision 1.15 2006/10/02 16:38:39 jgrosseo
123update (alberto):
124fixed memory leaks
125storing of objects that failed to be stored to the grid before
126interfacing of shuttle status table in daq system
127
2bb7b766 128Revision 1.14 2006/08/29 09:16:05 jgrosseo
129small update
130
85a80aa9 131Revision 1.13 2006/08/15 10:50:00 jgrosseo
132effc++ corrections (alberto)
133
4f0ab988 134Revision 1.12 2006/08/08 14:19:29 jgrosseo
135Update to shuttle classes (Alberto)
136
137- Possibility to set the full object's path in the Preprocessor's and
138Shuttle's Store functions
139- Possibility to extend the object's run validity in the same classes
140("startValidity" and "validityInfinite" parameters)
141- Implementation of the StoreReferenceData function to store reference
142data in a dedicated CDB storage.
143
84090f85 144Revision 1.11 2006/07/21 07:37:20 jgrosseo
145last run is stored after each run
146
7bfb2090 147Revision 1.10 2006/07/20 09:54:40 jgrosseo
148introducing status management: The processing per subdetector is divided into several steps,
149after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
150can keep track of the number of failures and skips further processing after a certain threshold is
151exceeded. These thresholds can be configured in LDAP.
152
5164a766 153Revision 1.9 2006/07/19 10:09:55 jgrosseo
154new configuration, access to DAQ FES (Alberto)
155
57f50b3c 156Revision 1.8 2006/07/11 12:44:36 jgrosseo
157adding parameters for extended validity range of data produced by preprocessor
158
17111222 159Revision 1.7 2006/07/10 14:37:09 jgrosseo
160small fix + todo comment
161
e090413b 162Revision 1.6 2006/07/10 13:01:41 jgrosseo
163enhanced storing of last successfully processed run (alberto)
164
a7160fe9 165Revision 1.5 2006/07/04 14:59:57 jgrosseo
166revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
167
45a493ce 168Revision 1.4 2006/06/12 09:11:16 jgrosseo
169coding conventions (Alberto)
170
58bc3020 171Revision 1.3 2006/06/06 14:26:40 jgrosseo
172o) removed files that were moved to STEER
173o) shuttle updated to follow the new interface (Alberto)
174
b948db8d 175Revision 1.2 2006/03/07 07:52:34 hristov
176New version (B.Yordanov)
177
d477ad88 178Revision 1.6 2005/11/19 17:19:14 byordano
179RetrieveDATEEntries and RetrieveConditionsData added
180
181Revision 1.5 2005/11/19 11:09:27 byordano
182AliShuttle declaration added
183
184Revision 1.4 2005/11/17 17:47:34 byordano
185TList changed to TObjArray
186
187Revision 1.3 2005/11/17 14:43:23 byordano
188import to local CVS
189
190Revision 1.1.1.1 2005/10/28 07:33:58 hristov
191Initial import as subdirectory in AliRoot
192
73abe331 193Revision 1.2 2005/09/13 08:41:15 byordano
194default startTime endTime added
195
196Revision 1.4 2005/08/30 09:13:02 byordano
197some docs added
198
199Revision 1.3 2005/08/29 21:15:47 byordano
200some docs added
201
202*/
203
204//
205// This class is the main manager for AliShuttle.
206// It organizes the data retrieval from DCS and calls the
b948db8d 207// interface methods of AliPreprocessor.
73abe331 208// For every detector in the configuration (see AliShuttleConfig),
209// data for its set of aliases is retrieved. If an AliPreprocessor
b948db8d 210// is registered for this detector, it is used
211// according to the schema (see AliPreprocessor).
212// If no AliPreprocessor is registered, the retrieved
73abe331 213// data is stored automatically in the underlying AliCDBStorage.
214// The alias name is used as detSpec.
215//
216
217#include "AliShuttle.h"
218
219#include "AliCDBManager.h"
220#include "AliCDBStorage.h"
221#include "AliCDBId.h"
84090f85 222#include "AliCDBRunRange.h"
223#include "AliCDBPath.h"
5164a766 224#include "AliCDBEntry.h"
73abe331 225#include "AliShuttleConfig.h"
eba76848 226#include "DCSClient/AliDCSClient.h"
73abe331 227#include "AliLog.h"
b948db8d 228#include "AliPreprocessor.h"
5164a766 229#include "AliShuttleStatus.h"
2bb7b766 230#include "AliShuttleLogbookEntry.h"
73abe331 231
57f50b3c 232#include <TSystem.h>
58bc3020 233#include <TObject.h>
b948db8d 234#include <TString.h>
57f50b3c 235#include <TTimeStamp.h>
73abe331 236#include <TObjString.h>
57f50b3c 237#include <TSQLServer.h>
238#include <TSQLResult.h>
239#include <TSQLRow.h>
cb343cfd 240#include <TMutex.h>
9827400b 241#include <TSystemDirectory.h>
242#include <TSystemFile.h>
243#include <TFileMerger.h>
244#include <TGrid.h>
245#include <TGridResult.h>
73abe331 246
e7f62f16 247#include <TMonaLisaWriter.h>
248
5164a766 249#include <fstream>
250
cb343cfd 251#include <sys/types.h>
252#include <sys/wait.h>
253
73abe331 254ClassImp(AliShuttle)
255
b948db8d 256//______________________________________________________________________________________________
257AliShuttle::AliShuttle(const AliShuttleConfig* config,
258 UInt_t timeout, Int_t retries):
4f0ab988 259fConfig(config),
260fTimeout(timeout), fRetries(retries),
261fPreprocessorMap(),
2bb7b766 262fLogbookEntry(0),
eba76848 263fCurrentDetector(),
85a80aa9 264fStatusEntry(0),
cb343cfd 265fMonitoringMutex(0),
eba76848 266fLastActionTime(0),
e7f62f16 267fLastAction(),
9827400b 268fMonaLisa(0),
269fTestMode(kNone),
ffa29e93 270fReadTestMode(kFALSE),
271fOutputRedirected(kFALSE)
73abe331 272{
273 //
274 // config: AliShuttleConfig used
73abe331 275 // timeout: timeout used for AliDCSClient connection
276 // retries: the number of retries in case of connection error.
277 //
278
57f50b3c 279 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
be48e3ea 280 for(int iSys=0;iSys<4;iSys++) {
57f50b3c 281 fServer[iSys]=0;
be48e3ea 282 if (iSys < 3)
2c15234c 283 fFXSlist[iSys].SetOwner(kTRUE);
57f50b3c 284 }
2bb7b766 285 fPreprocessorMap.SetOwner(kTRUE);
be48e3ea 286
287 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
288 fFirstUnprocessed[iDet] = kFALSE;
289
cb343cfd 290 fMonitoringMutex = new TMutex();
58bc3020 291}
292
b948db8d 293//______________________________________________________________________________________________
57f50b3c 294AliShuttle::~AliShuttle()
58bc3020 295{
9827400b 296 //
297 // destructor
298 //
58bc3020 299
b948db8d 300 fPreprocessorMap.DeleteAll();
be48e3ea 301 for(int iSys=0;iSys<4;iSys++)
57f50b3c 302 if(fServer[iSys]) {
303 fServer[iSys]->Close();
304 delete fServer[iSys];
eba76848 305 fServer[iSys] = 0;
57f50b3c 306 }
2bb7b766 307
308 if (fStatusEntry){
309 delete fStatusEntry;
310 fStatusEntry = 0;
311 }
cb343cfd 312
313 if (fMonitoringMutex)
314 {
315 delete fMonitoringMutex;
316 fMonitoringMutex = 0;
317 }
73abe331 318}
319
b948db8d 320//______________________________________________________________________________________________
57f50b3c 321void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
58bc3020 322{
73abe331 323 //
b948db8d 324 // Registers new AliPreprocessor.
73abe331 325 // GetName() is used as the identifier of the preprocessor.
326 // The preprocessor is registered only if no other one
327 // with the same identifier (GetName()) is already registered.
328 //
329
eba76848 330 const char* detName = preprocessor->GetName();
331 if(GetDetPos(detName) < 0)
332 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
333
334 if (fPreprocessorMap.GetValue(detName)) {
335 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
73abe331 336 return;
337 }
338
eba76848 339 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
73abe331 340}
b948db8d 341//______________________________________________________________________________________________
3301427a 342Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
84090f85 343 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
73abe331 344{
9827400b 345 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
346 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
347 // using this function. Use StoreReferenceData instead!
348 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
349 // finishes the data are transferred to the main storage (Grid).
b948db8d 350
3301427a 351 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
84090f85 352}
353
354//______________________________________________________________________________________________
3301427a 355Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
84090f85 356{
9827400b 357 // Stores a CDB object in the storage for reference data. These objects will not be available during
358 // offline reconstruction. Use this function for reference data only!
359 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
360 // finishes the data are transferred to the main storage (Grid).
85a80aa9 361
3301427a 362 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
85a80aa9 363}
364
365//______________________________________________________________________________________________
3301427a 366Bool_t AliShuttle::StoreLocally(const TString& localUri,
85a80aa9 367 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
368 Int_t validityStart, Bool_t validityInfinite)
369{
9827400b 370 // Store object temporarily in local storage. Parameters are passed by Store and StoreReferenceData functions.
371 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
372 // The parameters are:
373 // 1) Uri of the backup storage (Local)
374 // 2) the object's path.
375 // 3) the object to be stored
376 // 4) the metaData to be associated with the object
377 // 5) the validity start run number w.r.t. the current run,
378 // if the data is valid only for this run leave the default 0
379 // 6) specifies if the calibration data is valid for infinity (this means until updated),
380 // typical for calibration runs, the default is kFALSE
381 //
382 // returns kFALSE on failure, kTRUE otherwise
84090f85 383
9827400b 384 if (fTestMode & kErrorStorage)
385 {
386 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
387 return kFALSE;
388 }
389
3301427a 390 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
2bb7b766 391
85a80aa9 392 Int_t firstRun = GetCurrentRun() - validityStart;
84090f85 393 if(firstRun < 0) {
9827400b 394 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
84090f85 395 firstRun=0;
396 }
397
398 Int_t lastRun = -1;
399 if(validityInfinite) {
400 lastRun = AliCDBRunRange::Infinity();
401 } else {
402 lastRun = GetCurrentRun();
403 }
404
3301427a 405 // Version is set to current run, it will be used later to transfer data to Grid
406 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
2bb7b766 407
408 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
409 TObjString runUsed = Form("%d", GetCurrentRun());
9e080f92 410 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
2bb7b766 411 }
84090f85 412
3301427a 413 Bool_t result = kFALSE;
84090f85 414
3301427a 415 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
416 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
84090f85 417 } else {
3301427a 418 result = AliCDBManager::Instance()->GetStorage(localUri)
84090f85 419 ->Put(object, id, metaData);
420 }
421
422 if(!result) {
423
9827400b 424 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
3301427a 425 }
2bb7b766 426
3301427a 427 return result;
428}
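// Illustrative sketch (not part of AliShuttle): the run-validity computation of
// StoreLocally restated as a standalone function without ROOT or AliRoot types.
// RunRange, ComputeValidity and kRunInfinity are hypothetical names; kRunInfinity
// only stands in for AliCDBRunRange::Infinity(), whose real value is not shown here.

```cpp
#include <cassert>
#include <limits>

// Assumption: placeholder for AliCDBRunRange::Infinity().
constexpr int kRunInfinity = std::numeric_limits<int>::max();

struct RunRange {
    int first;  // first run for which the object is valid
    int last;   // last run for which the object is valid
};

// validityStart is counted backwards from the current run, as in StoreLocally.
inline RunRange ComputeValidity(int currentRun, int validityStart, bool validityInfinite)
{
    assert(currentRun >= 0);
    int first = currentRun - validityStart;
    if (first < 0) first = 0;  // same clamp as the AliWarning branch
    const int last = validityInfinite ? kRunInfinity : currentRun;
    return RunRange{first, last};
}
```

// With the defaults (validityStart = 0, validityInfinite = kFALSE) the range
// collapses to the current run only.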
84090f85 429
3301427a 430//______________________________________________________________________________________________
431Bool_t AliShuttle::StoreOCDB()
432{
9827400b 433 //
434 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
435 // Calls underlying StoreOCDB(const char*) function twice, for OCDB and Reference storage.
436 // Then calls StoreRefFilesToGrid to store reference files.
437 //
438
439 if (fTestMode & kErrorGrid)
440 {
441 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
442 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
443 return kFALSE;
444 }
445
2d9019b4 446 Log("SHUTTLE","Storing OCDB data ...");
86aa42c3 447 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
448
2d9019b4 449 Log("SHUTTLE","Storing reference data ...");
3301427a 450 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
9827400b 451
2d9019b4 452 Log("SHUTTLE","Storing reference files ...");
9827400b 453 Bool_t resultRefFiles = StoreRefFilesToGrid();
454
455 return resultCDB && resultRef && resultRefFiles;
3301427a 456}
457
458//______________________________________________________________________________________________
459Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
460{
461 //
462 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
463 //
464
465 TObjArray* gridIds=0;
466
467 Bool_t result = kTRUE;
468
469 const char* type = 0;
470 TString localURI;
471 if(gridURI == fgkMainCDB) {
472 type = "OCDB";
473 localURI = fgkLocalCDB;
474 } else if(gridURI == fgkMainRefStorage) {
475 type = "reference";
476 localURI = fgkLocalRefStorage;
477 } else {
478 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
479 return kFALSE;
480 }
481
482 AliCDBManager* man = AliCDBManager::Instance();
483
484 AliCDBStorage *gridSto = man->GetStorage(gridURI);
485 if(!gridSto) {
486 Log("SHUTTLE",
487 Form("StoreOCDB - cannot activate main %s storage", type));
488 return kFALSE;
489 }
490
491 gridIds = gridSto->GetQueryCDBList();
492
493 // get objects previously stored in local CDB
494 AliCDBStorage *localSto = man->GetStorage(localURI);
495 if(!localSto) {
496 Log("SHUTTLE",
497 Form("StoreOCDB - cannot activate local %s storage", type));
498 return kFALSE;
499 }
500 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
501 // Local objects were stored with current run as Grid version!
502 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
503 localEntries->SetOwner(1);
504
505 // loop on local stored objects
506 TIter localIter(localEntries);
507 AliCDBEntry *aLocEntry = 0;
508 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
509 aLocEntry->SetOwner(1);
510 AliCDBId aLocId = aLocEntry->GetId();
511 aLocEntry->SetVersion(-1);
512 aLocEntry->SetSubVersion(-1);
513
514 // If local object is valid up to infinity we store it only if it is
515 // the first unprocessed run!
516 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
517 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
518 {
519 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
520 "there are previous unprocessed runs!",
521 fCurrentDetector.Data(), aLocId.GetPath().Data()));
522 continue;
523 }
524
525 // loop on Grid valid Id's
526 Bool_t store = kTRUE;
527 TIter gridIter(gridIds);
528 AliCDBId* aGridId = 0;
529 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
530 if(aGridId->GetPath() != aLocId.GetPath()) continue;
531 // skip all objects valid up to infinity
532 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
533 // if we get here, it means there's already some more recent object stored on Grid!
534 store = kFALSE;
535 break;
536 }
537
538 // Upload only if no more recent object was found on the Grid.
539 Bool_t storeOk = store ? gridSto->Put(aLocEntry) : kFALSE;
540 if(!store || storeOk){
541
542 if (!store)
543 {
544 Log(fCurrentDetector.Data(),
545 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
546 type, aGridId->ToString().Data()));
547 } else {
548 Log("SHUTTLE",
549 Form("StoreOCDB - Object <%s> successfully put into %s storage",
550 aLocId.ToString().Data(), type));
2d9019b4 551 Log(fCurrentDetector.Data(),
552 Form("StoreOCDB - Object <%s> successfully put into %s storage",
553 aLocId.ToString().Data(), type));
3301427a 554 }
84090f85 555
3301427a 556 // removing local filename...
557 TString filename;
558 localSto->IdToFilename(aLocId, filename);
559 AliInfo(Form("Removing local file %s", filename.Data()));
560 RemoveFile(filename.Data());
561 continue;
562 } else {
563 Log("SHUTTLE",
564 Form("StoreOCDB - Grid %s storage of object <%s> failed",
565 type, aLocId.ToString().Data()));
2d9019b4 566 Log(fCurrentDetector.Data(),
567 Form("StoreOCDB - Grid %s storage of object <%s> failed",
568 type, aLocId.ToString().Data()));
3301427a 569 result = kFALSE;
b948db8d 570 }
571 }
3301427a 572 localEntries->Clear();
2bb7b766 573
b948db8d 574 return result;
3301427a 575}
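// Illustrative sketch (not part of AliShuttle): the inner loop over the Grid ids in
// StoreOCDB(const TString&), reduced to plain C++. GridId, ShouldUpload and
// kInfiniteLastRun are hypothetical names; kInfiniteLastRun stands in for
// AliCDBRunRange::Infinity(). An object is uploaded unless the Grid already holds
// an object with the same path and a finite validity end.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <vector>

// Assumption: placeholder for AliCDBRunRange::Infinity().
constexpr int kInfiniteLastRun = std::numeric_limits<int>::max();

struct GridId {
    std::string path;  // CDB path, e.g. "TPC/Calib/Data"
    int lastRun;       // end of the validity range
};

inline bool ShouldUpload(const std::string& localPath, const std::vector<GridId>& gridIds)
{
    assert(!localPath.empty());
    for (const GridId& id : gridIds) {
        if (id.path != localPath) continue;            // different object
        if (id.lastRun == kInfiniteLastRun) continue;  // infinite validity is skipped
        return false;  // a more recent object with finite validity exists
    }
    return true;
}
```

// The result corresponds to the local variable "store" in the function above.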
576
546242fb 577//______________________________________________________________________________________________
578Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
579{
2d9019b4 580 // clears the directory used to store reference files of a given subdetector
546242fb 581
582 AliCDBManager* man = AliCDBManager::Instance();
583 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
2d9019b4 584 TString localBaseFolder = sto->GetBaseFolder();
585
586 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
587
588 Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
589
590 TString begin;
591 begin.Form("%d_", GetCurrentRun());
592
593 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
594 if (!baseDir)
595 return kTRUE;
596
597 TList* dirList = baseDir->GetListOfFiles();
598 delete baseDir;
599
600 if (!dirList) return kTRUE;
601
602 if (dirList->GetEntries() < 3)
603 {
604 delete dirList;
605 return kTRUE;
606 }
607
608 Int_t nDirs = 0, nDel = 0;
609 TIter dirIter(dirList);
610 TSystemFile* entry = 0;
546242fb 611
2d9019b4 612 Bool_t success = kTRUE;
546242fb 613
2d9019b4 614 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
615 {
616 if (entry->IsDirectory())
617 continue;
618
619 TString fileName(entry->GetName());
620 if (!fileName.BeginsWith(begin))
621 continue;
622
623 nDirs++;
624
625 // delete file
626 Int_t result = gSystem->Unlink(Form("%s/%s", targetDir.Data(), fileName.Data()));
627
628 if (result)
629 {
630 Log("SHUTTLE", Form("Could not delete file %s!", fileName.Data()));
631 success = kFALSE;
632 } else {
633 nDel++;
634 }
635 }
636
637 if(nDirs > 0)
638 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
639 nDel, nDirs, targetDir.Data()));
640
641
642 delete dirList;
643 return success;
670}
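// Illustrative sketch (not part of AliShuttle): CleanReferenceStorage and
// StoreRefFilesToGrid both keep only directory entries whose name starts with
// "<run>_". The selection is shown here on plain strings; SelectRunFiles is a
// hypothetical helper, and joining the directory to each name is an assumption
// made so that the resulting paths do not depend on the working directory.

```cpp
#include <cassert>
#include <string>
#include <vector>

inline std::vector<std::string> SelectRunFiles(const std::string& dir, int run,
                                               const std::vector<std::string>& names)
{
    assert(run >= 0);
    const std::string prefix = std::to_string(run) + "_";
    std::vector<std::string> selected;
    for (const std::string& name : names) {
        // compare() on the first prefix.size() characters acts as "begins with"
        if (name.compare(0, prefix.size(), prefix) == 0)
            selected.push_back(dir + "/" + name);
    }
    return selected;
}
```

// The trailing underscore in the prefix keeps run 100 from matching files of run 1000.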
671
9827400b 672//______________________________________________________________________________________________
673Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
674{
675 //
3c2a21c8 676 // Stores reference file directly (without opening it). This function stores the file locally.
9827400b 677 //
3c2a21c8 678 // The file is stored under the following location:
679 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
680 // where <gridFileName> is the second parameter given to the function
681 //
9827400b 682
683 if (fTestMode & kErrorStorage)
684 {
685 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
686 return kFALSE;
687 }
688
689 AliCDBManager* man = AliCDBManager::Instance();
690 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
691
692 TString localBaseFolder = sto->GetBaseFolder();
693
2d9019b4 694 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
9827400b 695
2d9019b4 696 // try to open the folder; create it if it does not exist
697 void* dir = gSystem->OpenDirectory(targetDir.Data());
698 if (dir == NULL) {
699 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
700 Log("SHUTTLE", Form("Can't open directory <%s>", targetDir.Data()));
701 return kFALSE;
702 }
703
704 } else {
705 gSystem->FreeDirectory(dir);
706 }
707
9827400b 708 TString target;
709 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
710
546242fb 711 Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
9827400b 712 if (result)
713 {
546242fb 714 Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
715 return kFALSE;
9827400b 716 }
546242fb 717
9827400b 718 result = gSystem->CopyFile(localFile, target);
719
720 if (result == 0)
721 {
2d9019b4 722 Log("SHUTTLE", Form("StoreReferenceFile - File %s stored locally to %s", localFile, target.Data()));
9827400b 723 return kTRUE;
724 }
725 else
726 {
2d9019b4 727 Log("SHUTTLE", Form("StoreReferenceFile - Could not store file %s to %s!. Error code = %d",
546242fb 728 localFile, target.Data(), result));
9827400b 729 return kFALSE;
730 }
731}
732
733//______________________________________________________________________________________________
734Bool_t AliShuttle::StoreRefFilesToGrid()
735{
736 //
737 // Transfers the reference files of the current run and detector to the Grid.
9827400b 738 //
86aa42c3 739 // The files are stored under the following location:
3c2a21c8 740 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
86aa42c3 741 //
9827400b 742
743 AliCDBManager* man = AliCDBManager::Instance();
744 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
745 if (!sto)
746 return kFALSE;
747 TString localBaseFolder = sto->GetBaseFolder();
748
2d9019b4 749 TString dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
750
9827400b 751 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
752 if (!gridSto)
753 return kFALSE;
2d9019b4 754
9827400b 755 TString gridBaseFolder = gridSto->GetBaseFolder();
2d9019b4 756
757 TString alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
9827400b 758
9827400b 759 TString begin;
760 begin.Form("%d_", GetCurrentRun());
761
762 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
3d8bc902 763 if (!baseDir)
764 return kTRUE;
765
2d9019b4 766 TList* dirList = baseDir->GetListOfFiles();
767 delete baseDir;
768
769 if (!dirList) return kTRUE;
770
771 if (dirList->GetEntries() < 3)
3d8bc902 772 {
2d9019b4 773 delete dirList;
9827400b 774 return kTRUE;
3d8bc902 775 }
2d9019b4 776
546242fb 777 if (!gGrid)
778 {
779 Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
2d9019b4 780 delete dirList;
546242fb 781 return kFALSE;
782 }
783
2d9019b4 784 Int_t nDirs = 0, nTransfer = 0;
785 TIter dirIter(dirList);
786 TSystemFile* entry = 0;
787
9827400b 788 Bool_t success = kTRUE;
3d8bc902 789 Bool_t first = kTRUE;
9827400b 790
2d9019b4 791 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
792 {
9827400b 793 if (entry->IsDirectory())
794 continue;
795
796 TString fileName(entry->GetName());
797 if (!fileName.BeginsWith(begin))
798 continue;
799
2d9019b4 800 nDirs++;
801
3d8bc902 802 if (first)
803 {
804 first = kFALSE;
805 // check that DET folder exists, otherwise create it
806 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
807
808 if (!result)
2d9019b4 809 {
810 delete dirList;
3d8bc902 811 return kFALSE;
2d9019b4 812 }
3d8bc902 813
546242fb 814 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
3d8bc902 815 {
816 if (!gGrid->Mkdir(alienDir.Data(),"",0))
817 {
818 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
819 alienDir.Data()));
2d9019b4 820 delete dirList;
3d8bc902 821 return kFALSE;
546242fb 822 } else {
823 Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
3d8bc902 824 }
825
546242fb 826 } else {
827 Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
3d8bc902 828 }
829 }
830
9827400b 831 TString fullLocalPath;
832 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
833
834 TString fullGridPath;
835 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
836
9827400b 837 TFileMerger fileMerger;
838 Bool_t result = fileMerger.Cp(fullLocalPath, fullGridPath);
839
840 if (result)
841 {
2d9019b4 842 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded!", fullLocalPath.Data(), fullGridPath.Data()));
9827400b 843 RemoveFile(fullLocalPath);
2d9019b4 844 nTransfer++;
9827400b 845 }
846 else
847 {
2d9019b4 848 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s FAILED!", fullLocalPath.Data(), fullGridPath.Data()));
9827400b 849 success = kFALSE;
850 }
851 }
2d9019b4 852
853 Log("SHUTTLE", Form("StoreRefFilesToGrid - %d (over %d) reference files in folder %s copied to Grid.", nTransfer, nDirs, dir.Data()));
854
855
856 delete dirList;
9827400b 857 return success;
858}
859
2d9019b4 860//______________________________________________________________________________________________
861const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
862{
863 //
864 // Get folder name of reference files
865 //
866
867 TString offDetStr(GetOfflineDetName(detector));
868 static TString dir; // static: the buffer returned by dir.Data() must outlive this call
869 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
870 {
871 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
872 } else {
873 dir.Form("%s/%s", base, offDetStr.Data());
874 }
875
876 return dir.Data();
877
878
879}
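// Illustration only, not part of AliShuttle: a minimal standard-C++ sketch of the
// folder rule used by GetRefFilePrefix(). The name RefFilePrefix and the use of
// std::string are assumptions made for this example. Returning the string by value
// is one way to make sure the caller never holds a pointer into a local buffer.

```cpp
#include <string>

// Hypothetical helper: offlineDet is the offline detector name (e.g. "ITS"),
// onlineDet the online detector name (e.g. "SPD").
std::string RefFilePrefix(const std::string& base,
                          const std::string& offlineDet,
                          const std::string& onlineDet)
{
    // ITS, MUON and PHOS keep one sub-folder per online detector.
    if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS")
        return base + "/" + offlineDet + "/" + onlineDet;

    // All other detectors store their reference files directly under the offline name.
    return base + "/" + offlineDet;
}
```

// Usage sketch: RefFilePrefix("ref", "ITS", "SPD") yields "ref/ITS/SPD", which matches
// the OfflineDetName/OnlineDetName layout described in the revision 1.44 log entry.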
3301427a 880//______________________________________________________________________________________________
881void AliShuttle::CleanLocalStorage(const TString& uri)
882{
9827400b 883 //
884 // Called in case the preprocessor is declared failed. Remove remaining objects from the local storages.
885 //
3301427a 886
887 const char* type = 0;
888 if(uri == fgkLocalCDB) {
889 type = "OCDB";
890 } else if(uri == fgkLocalRefStorage) {
546242fb 891 type = "Reference";
3301427a 892 } else {
893 AliError(Form("Invalid storage URI: %s", uri.Data()));
894 return;
895 }
896
897 AliCDBManager* man = AliCDBManager::Instance();
b948db8d 898
3301427a 899 // open local storage
900 AliCDBStorage *localSto = man->GetStorage(uri);
901 if(!localSto) {
902 Log("SHUTTLE",
903 Form("CleanLocalStorage - cannot activate local %s storage", type));
904 return;
905 }
906
907 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
546242fb 908 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
3301427a 909
910 AliInfo(Form("filename = %s", filename.Data()));
911
912 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
913 GetCurrentRun(), fCurrentDetector.Data()));
914
915 RemoveFile(filename.Data());
916
917}
918
919//______________________________________________________________________________________________
920void AliShuttle::RemoveFile(const char* filename)
921{
9827400b 922 //
923 // removes local file
924 //
3301427a 925
926 TString command(Form("rm -f %s", filename));
927
928 Int_t result = gSystem->Exec(command.Data());
929 if(result != 0)
930 {
931 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
932 fCurrentDetector.Data(), filename));
933 }
73abe331 934}
935
b948db8d 936//______________________________________________________________________________________________
5164a766 937AliShuttleStatus* AliShuttle::ReadShuttleStatus()
938{
9827400b 939 //
940 // Reads the AliShuttleStatus from the CDB
941 //
5164a766 942
2bb7b766 943 if (fStatusEntry){
944 delete fStatusEntry;
945 fStatusEntry = 0;
946 }
5164a766 947
10a5a932 948 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
2bb7b766 949 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
5164a766 950
2bb7b766 951 if (!fStatusEntry) return 0;
952 fStatusEntry->SetOwner(1);
5164a766 953
2bb7b766 954 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
955 if (!status) {
956 AliError("Invalid object stored to CDB!");
957 return 0;
958 }
5164a766 959
2bb7b766 960 return status;
5164a766 961}
962
963//______________________________________________________________________________________________
7bfb2090 964Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
5164a766 965{
9827400b 966 //
967 // writes the status for one subdetector
968 //
2bb7b766 969
970 if (fStatusEntry){
971 delete fStatusEntry;
972 fStatusEntry = 0;
973 }
5164a766 974
2bb7b766 975 Int_t run = GetCurrentRun();
5164a766 976
2bb7b766 977 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
5164a766 978
2bb7b766 979 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
980 fStatusEntry->SetOwner(1);
5164a766 981
2bb7b766 982 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
7bfb2090 983
2bb7b766 984 if (!result) {
3301427a 985 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
986 fCurrentDetector.Data(), run));
2bb7b766 987 return kFALSE;
988 }
e7f62f16 989
990 SendMLInfo();
7bfb2090 991
2bb7b766 992 return kTRUE;
5164a766 993}
994
995//______________________________________________________________________________________________
996void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
997{
9827400b 998 //
999 // changes the AliShuttleStatus for the given detector and run to the given status
1000 //
5164a766 1001
2bb7b766 1002 if (!fStatusEntry){
1003 AliError("UNEXPECTED: fStatusEntry empty");
1004 return;
1005 }
5164a766 1006
2bb7b766 1007 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
5164a766 1008
2bb7b766 1009 if (!status){
3301427a 1010 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
2bb7b766 1011 return;
1012 }
5164a766 1013
2c15234c 1014 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
eba76848 1015 fCurrentDetector.Data(),
36c99a6a 1016 status->GetStatusName(),
eba76848 1017 status->GetStatusName(newStatus));
cb343cfd 1018 Log("SHUTTLE", actionStr);
1019 SetLastAction(actionStr);
5164a766 1020
2bb7b766 1021 status->SetStatus(newStatus);
1022 if (increaseCount) status->IncreaseCount();
5164a766 1023
2bb7b766 1024 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
e7f62f16 1025
1026 SendMLInfo();
5164a766 1027}
e7f62f16 1028
1029//______________________________________________________________________________________________
1030void AliShuttle::SendMLInfo()
1031{
1032 //
1033 // sends ML information about the current status of the current detector being processed
1034 //
1035
1036 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1037
1038 if (!status){
3301427a 1039 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
e7f62f16 1040 return;
1041 }
1042
1043 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1044 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1045
1046 TList mlList;
1047 mlList.Add(&mlStatus);
1048 mlList.Add(&mlRetryCount);
1049
1050 fMonaLisa->SendParameters(&mlList);
1051}
1052
5164a766 1053//______________________________________________________________________________________________
1054Bool_t AliShuttle::ContinueProcessing()
1055{
9827400b 1056 // this function reads the AliShuttleStatus information from CDB and
1057 // checks if the processing should be continued
1058 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 1059
57c1a579 1060 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1061
1062 AliPreprocessor* aPreprocessor =
1063 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1064 if (!aPreprocessor)
1065 {
1066 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
1067 return kFALSE;
1068 }
1069
2bb7b766 1070 AliShuttleLogbookEntry::Status entryStatus =
eba76848 1071 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 1072
1073 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 1074 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 1075 fCurrentDetector.Data(),
1076 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1077 return kFALSE;
1078 }
1079
1080 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 1081
1082 // check if current run is first unprocessed run for current detector
1083 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1084 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1085 {
86aa42c3 1086 if (fTestMode == kNone)
1087 {
1088 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
1089 return kFALSE;
1090 }
1091 else
1092 {
1093 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
1094 }
be48e3ea 1095 }
1096
2bb7b766 1097 AliShuttleStatus* status = ReadShuttleStatus();
1098 if (!status) {
1099 // first time
1100 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1101 fCurrentDetector.Data()));
1102 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1103 return WriteShuttleStatus(status);
1104 }
1105
1106 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1107 // If it happens it may mean Logbook updating failed... let's do it now!
1108 if (status->GetStatus() == AliShuttleStatus::kDone ||
1109 status->GetStatus() == AliShuttleStatus::kFailed){
1110 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1111 fCurrentDetector.Data(),
1112 status->GetStatusName(status->GetStatus())));
1113 UpdateShuttleLogbook(fCurrentDetector.Data(),
1114 status->GetStatusName(status->GetStatus()));
1115 return kFALSE;
1116 }
1117
3301427a 1118 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 1119 Log("SHUTTLE",
1120 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1121 fCurrentDetector.Data()));
9827400b 1122 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1123 if (StoreOCDB()){
3301427a 1124 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1125 fCurrentDetector.Data()));
2bb7b766 1126 UpdateShuttleStatus(AliShuttleStatus::kDone);
1127 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1128 } else {
1129 Log("SHUTTLE",
1130 Form("ContinueProcessing - %s: Grid storage failed again",
1131 fCurrentDetector.Data()));
9827400b 1132 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 1133 }
1134 return kFALSE;
1135 }
1136
1137 // if we get here, there is a restart
57c1a579 1138 Bool_t cont = kFALSE;
2bb7b766 1139
1140 // abort conditions
cb343cfd 1141 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 1142 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1143 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 1144 status->GetCount(), status->GetStatusName()));
1145 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 1146 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 1147
1148 // there may still be objects in local OCDB and reference storage
1149 // and FXS databases may be not updated: do it now!
9827400b 1150
1151 // TODO Currently disabled, we want to keep files in case of failure!
1152 // CleanLocalStorage(fgkLocalCDB);
1153 // CleanLocalStorage(fgkLocalRefStorage);
1154 // UpdateTableFailCase();
1155
1156 // Send mail to detector expert!
1157 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1158 if (!SendMail())
1159 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1160 fCurrentDetector.Data()));
3301427a 1161
57c1a579 1162 } else {
1163 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1164 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1165 status->GetStatusName(), status->GetCount()));
9827400b 1166 Bool_t increaseCount = kTRUE;
1167 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1168 increaseCount = kFALSE;
1169 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 1170 cont = kTRUE;
2bb7b766 1171 }
1172
57c1a579 1173 return cont;
5164a766 1174}
1175
1176//______________________________________________________________________________________________
2bb7b766 1177Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 1178{
73abe331 1179 //
b948db8d 1180 // Makes data retrieval for all detectors in the configuration.
2bb7b766 1181 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1182 // (Unprocessed, Inactive, Failed or Done).
d477ad88 1183 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 1184 //
1185
9827400b 1186 if (!entry) return kFALSE;
2bb7b766 1187
1188 fLogbookEntry = entry;
1189
9827400b 1190 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1191 GetCurrentRun()));
2bb7b766 1192
e7f62f16 1193 // create ML instance that monitors this run
1194 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1195 // disable monitoring of other parameters that come e.g. from TFile
1196 gMonitoringWriter = 0;
2bb7b766 1197
e7f62f16 1198 // Send the information to ML
1199 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1200 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1201
1202 TList mlList;
1203 mlList.Add(&mlStatus);
9827400b 1204 mlList.Add(&mlRunType);
e7f62f16 1205
1206 fMonaLisa->SendParameters(&mlList);
3301427a 1207
9827400b 1208 if (fLogbookEntry->IsDone())
1209 {
1210 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1211 UpdateShuttleLogbook("shuttle_done");
1212 fLogbookEntry = 0;
1213 return kTRUE;
1214 }
1215
1216 // read test mode if flag is set
1217 if (fReadTestMode)
1218 {
3d8bc902 1219 fTestMode = kNone;
9827400b 1220 TString logEntry(entry->GetRunParameter("log"));
1221 //printf("log entry = %s\n", logEntry.Data());
1222 TString searchStr("Testmode: ");
1223 Int_t pos = logEntry.Index(searchStr.Data());
1224 //printf("%d\n", pos);
1225 if (pos >= 0)
1226 {
1227 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1228 //printf("%s\n", subStr.String().Data());
1229 TString newStr(subStr.Data());
1230 TObjArray* token = newStr.Tokenize(' ');
1231 if (token)
1232 {
1233 //token->Print();
1234 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1235 if (tmpStr)
1236 {
1237 Int_t testMode = tmpStr->String().Atoi();
1238 if (testMode > 0)
1239 {
1240 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1241 SetTestMode((TestMode) testMode);
1242 }
1243 }
1244 delete token;
1245 }
1246 }
1247 }
1248
3d8bc902 1249 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1250
eba76848 1251 fLogbookEntry->Print("all");
57f50b3c 1252
1253 // Initialization
d477ad88 1254 Bool_t hasError = kFALSE;
5164a766 1255
2bb7b766 1256 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1257 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1258 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1259 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1260
57f50b3c 1261 // Loop on detectors in the configuration
b948db8d 1262 TIter iter(fConfig->GetDetectors());
2bb7b766 1263 TObjString* aDetector = 0;
b948db8d 1264
be48e3ea 1265 while ((aDetector = (TObjString*) iter.Next()))
1266 {
7bfb2090 1267 fCurrentDetector = aDetector->String();
5164a766 1268
9e080f92 1269 if (ContinueProcessing() == kFALSE) continue;
1270
2bb7b766 1271 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1272 GetCurrentRun(), aDetector->GetName()));
1273
9d733021 1274 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1275
e7f62f16 1276 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1277
be48e3ea 1278 Int_t pid = fork();
1279
1280 if (pid < 0)
1281 {
1282 Log("SHUTTLE", "ERROR: Forking failed");
1283 }
1284 else if (pid > 0)
1285 {
1286 // parent
1287 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1288 GetCurrentRun(), aDetector->GetName()));
1289
1290 Long_t begin = time(0);
1291
1292 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1293 while (waitpid(pid, &status, WNOHANG) == 0)
1294 {
1295 Long_t expiredTime = time(0) - begin;
1296
1297 if (expiredTime > fConfig->GetPPTimeOut())
1298 {
9827400b 1299 TString tmp;
1300 tmp.Form("Process of %s time out. Run time: %ld seconds. Killing...",
1301 fCurrentDetector.Data(), expiredTime);
1302 Log("SHUTTLE", tmp);
1303 Log(fCurrentDetector, tmp);
be48e3ea 1304
1305 kill(pid, 9);
1306
3301427a 1307 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1308 hasError = kTRUE;
1309
1310 gSystem->Sleep(1000);
1311 }
1312 else
1313 {
be48e3ea 1314 gSystem->Sleep(1000);
9827400b 1315
1316 TString checkStr;
1317 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1318 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1319 if (!pipe)
1320 {
1321 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1322 continue;
1323 }
1324
1325 char buffer[100];
1326 if (!fgets(buffer, 100, pipe))
1327 {
1328 Log("SHUTTLE", "Error: ps did not return anything");
1329 gSystem->ClosePipe(pipe);
1330 continue;
1331 }
1332 gSystem->ClosePipe(pipe);
1333
1334 //Log("SHUTTLE", Form("ps returned %s", buffer));
1335
1336 Int_t mem = 0;
1337 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1338 {
1339 Log("SHUTTLE", "Error: Could not parse output of ps");
1340 continue;
1341 }
1342
1343 if (expiredTime % 60 == 0)
886d60e6 1344 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1345 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1346
1347 if (mem > fConfig->GetPPMaxMem())
1348 {
1349 TString tmp;
1350 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1351 mem, fConfig->GetPPMaxMem());
1352 Log("SHUTTLE", tmp);
1353 Log(fCurrentDetector, tmp);
1354
1355 kill(pid, 9);
1356
1357 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1358 hasError = kTRUE;
1359
1360 gSystem->Sleep(1000);
1361 }
be48e3ea 1362 }
1363 }
1364
1365 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1366 GetCurrentRun(), aDetector->GetName()));
1367
1368 if (WIFEXITED(status))
1369 {
1370 Int_t returnCode = WEXITSTATUS(status);
1371
3301427a 1372 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1373 returnCode));
be48e3ea 1374
9827400b 1375 if (returnCode == 0) hasError = kTRUE; // the client exits with 'success' (kTRUE == 1), so 0 means failure
be48e3ea 1376 }
1377 }
1378 else if (pid == 0)
1379 {
1380 // client
1381 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1382
ffa29e93 1383 AliInfo("Redirecting output...");
1384
546242fb 1385 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
ffa29e93 1386 {
1387 Log("SHUTTLE", "Could not freopen stdout");
1388 }
1389 else
1390 {
1391 fOutputRedirected = kTRUE;
1392 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1393 Log("SHUTTLE", "Could not redirect stderr");
1394
1395 }
1396
9827400b 1397 Bool_t success = ProcessCurrentDetector();
1398 if (success) // Preprocessor finished successfully!
1399 {
3301427a 1400 // Update time_processed field in FXS DB
1401 if (UpdateTable() == kFALSE)
1402 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!", fCurrentDetector.Data()));
1403
1404 // Transfer the data from local storage to main storage (Grid)
1405 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1406 if (StoreOCDB() == kFALSE)
1407 {
1408 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1409 GetCurrentRun(), aDetector->GetName()));
1410 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1411 success = kFALSE;
3301427a 1412 } else {
1413 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1414 GetCurrentRun(), aDetector->GetName()));
1415 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1416 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1417 }
be48e3ea 1418 }
1419
4b95672b 1420 for (UInt_t iSys=0; iSys<3; iSys++)
1421 {
1422 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1423 }
1424
be48e3ea 1425 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1426 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1427
1428 // the client exits here
9827400b 1429 gSystem->Exit(success);
be48e3ea 1430
1431 AliError("We should never get here!!!");
1432 }
7bfb2090 1433 }
5164a766 1434
2bb7b766 1435 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1436 GetCurrentRun()));
1437
1438 //check if shuttle is done for this run, if so update logbook
1439 TObjArray checkEntryArray;
1440 checkEntryArray.SetOwner(1);
9e080f92 1441 TString whereClause = Form("where run=%d", GetCurrentRun());
1442 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1443 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1444 GetCurrentRun()));
1445 return hasError == kFALSE;
1446 }
b948db8d 1447
9e080f92 1448 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1449 (checkEntryArray.At(0));
2bb7b766 1450
9e080f92 1451 if (checkEntry)
1452 {
1453 if (checkEntry->IsDone())
be48e3ea 1454 {
9e080f92 1455 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1456 UpdateShuttleLogbook("shuttle_done");
1457 }
1458 else
1459 {
1460 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1461 {
9e080f92 1462 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1463 {
9e080f92 1464 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1465 checkEntry->GetRun(), GetDetName(iDet)));
1466 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1467 }
1468 }
2bb7b766 1469 }
1470 }
1471
e7f62f16 1472 // remove ML instance
1473 delete fMonaLisa;
1474 fMonaLisa = 0;
1475
2bb7b766 1476 fLogbookEntry = 0;
85a80aa9 1477
a7160fe9 1478 return hasError == kFALSE;
73abe331 1479}
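
// Illustration only, not part of AliShuttle: a minimal POSIX sketch of the
// fork / waitpid(WNOHANG) / kill monitoring pattern used in Process(). The names
// RunWithTimeout and ChildResult are assumptions made for this example, and the
// memory check done via ps is left out. As in Process(), the child exits with its
// success flag, so an exit code of 0 means failure.

```cpp
#include <ctime>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildResult { kExitedOk, kExitedWithError, kTimedOut, kForkFailed };

// Hypothetical helper: runs work() in a child process and kills the child once
// it has been running for more than timeoutSec seconds.
template <typename Work>
ChildResult RunWithTimeout(Work work, long timeoutSec)
{
    const pid_t pid = fork();
    if (pid < 0) return ChildResult::kForkFailed;

    if (pid == 0) {
        // Child: report success as exit code 1, failure as exit code 0.
        _exit(work() ? 1 : 0);
    }

    // Parent: poll the child without blocking so the elapsed time can be checked.
    const time_t begin = time(nullptr);
    int status = 0; // plain int on purpose, as required by waitpid
    bool killed = false;
    while (waitpid(pid, &status, WNOHANG) == 0) {
        if (!killed && time(nullptr) - begin > timeoutSec) {
            kill(pid, SIGKILL);
            killed = true;
        }
        usleep(10000); // 10 ms between polls
    }

    if (WIFEXITED(status))
        return WEXITSTATUS(status) != 0 ? ChildResult::kExitedOk
                                        : ChildResult::kExitedWithError;
    return killed ? ChildResult::kTimedOut : ChildResult::kExitedWithError;
}
```

// Usage sketch: RunWithTimeout([]() -> bool { return true; }, 5) is expected to report
// kExitedOk, while a child that sleeps longer than the timeout is reported as kTimedOut.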
1480
b948db8d 1481//______________________________________________________________________________________________
9827400b 1482Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1483{
1484 //
2bb7b766 1485 // Makes data retrieval just for a specific detector (fCurrentDetector).
73abe331 1486 // There should be a configuration for this detector.
73abe331 1487
2bb7b766 1488 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1489
2d9019b4 1490 if (!CleanReferenceStorage(fCurrentDetector.Data()))
546242fb 1491 return kFALSE;
1492
a038aa70 1493 TMap* dcsMap = 0;
3301427a 1494
1495 // call preprocessor
1496 AliPreprocessor* aPreprocessor =
1497 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1498
1499 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1500
1501 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1502
651fdaab 1503 if (!processDCS)
1504 {
1505 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1506 }
8b739301 1507 else if (fTestMode & kSkipDCS)
2c15234c 1508 {
3d8bc902 1509 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
9827400b 1510 }
1511 else if (fTestMode & kErrorDCS)
1512 {
3d8bc902 1513 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1514 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1515 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1516 return kFALSE;
2c15234c 1517 } else {
3301427a 1518
1519 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1520
2c15234c 1521 TString host(fConfig->GetDCSHost(fCurrentDetector));
1522 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1523
a038aa70 1524 if (fConfig->GetDCSAliases(fCurrentDetector)->GetEntries() > 0)
2c15234c 1525 {
a038aa70 1526 dcsMap = GetValueSet(host, port, fConfig->GetDCSAliases(fCurrentDetector), kAlias);
1527 if (!dcsMap)
2c15234c 1528 {
a038aa70 1529 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS aliases");
2c15234c 1530 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
9827400b 1531 return kFALSE;
2c15234c 1532 }
4f0ab988 1533 }
a038aa70 1534
1535 if (fConfig->GetDCSDataPoints(fCurrentDetector)->GetEntries() > 0)
2c15234c 1536 {
a038aa70 1537 TMap* dcsMap2 = GetValueSet(host, port, fConfig->GetDCSDataPoints(fCurrentDetector), kDP);
1538 if (!dcsMap2)
2c15234c 1539 {
a038aa70 1540 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS data points");
2c15234c 1541 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
a038aa70 1542 if (dcsMap)
1543 delete dcsMap;
9827400b 1544 return kFALSE;
2c15234c 1545 }
a038aa70 1546
1547 if (!dcsMap)
1548 {
1549 dcsMap = dcsMap2;
1550 }
1551 else // merge
1552 {
1553 TIter iter(dcsMap2);
1554 TObjString* key = 0;
1555 while ((key = (TObjString*) iter.Next()))
1556 dcsMap->Add(key, dcsMap2->GetValue(key->String()));
1557
1558 dcsMap2->SetOwner(kFALSE);
1559 delete dcsMap2;
1560 }
73abe331 1561 }
a038aa70 1562
1563 // still no map?
1564 if (!dcsMap)
1565 dcsMap = new TMap;
73abe331 1566 }
b948db8d 1567
2bb7b766 1568 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1569 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1570
a038aa70 1571 UInt_t returnValue = aPreprocessor->Process(dcsMap);
b948db8d 1572
3301427a 1573 if (returnValue > 0) // Preprocessor error!
1574 {
9827400b 1575 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %u.", returnValue));
cb343cfd 1576 UpdateShuttleStatus(AliShuttleStatus::kPPError);
a038aa70 1577 dcsMap->DeleteAll();
1578 delete dcsMap;
9827400b 1579 return kFALSE;
1580 }
1581
1582 // preprocessor ok!
1583 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1584 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1585 fCurrentDetector.Data()));
b948db8d 1586
a038aa70 1587 dcsMap->DeleteAll();
1588 delete dcsMap;
b948db8d 1589
9827400b 1590 return kTRUE;
2bb7b766 1591}
1592
1593//______________________________________________________________________________________________
1594Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1595 TObjArray& entries)
1596{
9827400b 1597 // Queries DAQ's Shuttle logbook and fills the detector status object.
1598 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1599 //
2bb7b766 1600
fc5a4708 1601 entries.SetOwner(1);
1602
2bb7b766 1603 // check connection, in case connect
be48e3ea 1604 if(!Connect(3)) return kFALSE;
2bb7b766 1605
1606 TString sqlQuery;
441b0e9c 1607 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1608
be48e3ea 1609 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1610 if (!aResult) {
1611 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1612 return kFALSE;
1613 }
1614
fc5a4708 1615 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1616
2bb7b766 1617 if(aResult->GetRowCount() == 0) {
9827400b 1618 AliInfo("No entries in Shuttle Logbook match request");
1619 delete aResult;
1620 return kTRUE;
2bb7b766 1621 }
1622
1623 // TODO Check field count!
fc5a4708 1624 const UInt_t nCols = 22;
2bb7b766 1625 if (aResult->GetFieldCount() != (Int_t) nCols) {
1626 AliError("Invalid SQL result field number!");
1627 delete aResult;
1628 return kFALSE;
1629 }
1630
2bb7b766 1631 TSQLRow* aRow;
1632 while ((aRow = aResult->Next())) {
1633 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1634 Int_t run = runString.Atoi();
1635
eba76848 1636 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1637 if (!entry) {
1638 delete aRow; continue; } // release the row before skipping this run
2bb7b766 1639
1640 // loop on detectors
eba76848 1641 for(UInt_t ii = 0; ii < nCols; ii++)
1642 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1643
eba76848 1644 entries.AddLast(entry);
2bb7b766 1645 delete aRow;
1646 }
1647
2bb7b766 1648 delete aResult;
1649 return kTRUE;
1650}
1651
1652//______________________________________________________________________________________________
eba76848 1653AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1654{
eba76848 1655 //
1656 // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1657 //
2bb7b766 1658
1659 // check connection, in case connect
be48e3ea 1660 if (!Connect(3))
eba76848 1661 return 0;
2bb7b766 1662
1663 TString sqlQuery;
2c15234c 1664 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1665
be48e3ea 1666 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1667 if (!aResult) {
1668 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1669 return 0;
2bb7b766 1670 }
1671
eba76848 1672 if (aResult->GetRowCount() == 0) {
2bb7b766 1673 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1674 delete aResult;
eba76848 1675 return 0;
2bb7b766 1676 }
1677
eba76848 1678 if (aResult->GetRowCount() > 1) {
2bb7b766 1679 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1680 delete aResult;
eba76848 1681 return 0;
2bb7b766 1682 }
1683
eba76848 1684 TSQLRow* aRow = aResult->Next();
1685 if (!aRow)
1686 {
1687 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1688 delete aResult;
1689 return 0;
1690 }
2bb7b766 1691
eba76848 1692 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1693
eba76848 1694 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1695 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1696
eba76848 1697 UInt_t startTime = entry->GetStartTime();
1698 UInt_t endTime = entry->GetEndTime();
1699
1700 if (!startTime || !endTime || startTime > endTime) {
1701 Log("SHUTTLE",
1702 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
1703 run, startTime, endTime));
1704 delete entry;
2bb7b766 1705 delete aRow;
eba76848 1706 delete aResult;
1707 return 0;
2bb7b766 1708 }
1709
eba76848 1710 delete aRow;
2bb7b766 1711 delete aResult;
eba76848 1712
1713 return entry;
2bb7b766 1714}
1715
b948db8d 1716//______________________________________________________________________________________________
2c15234c 1717Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1718 TObjArray* valueSet, DCSType type)
73abe331 1719{
9827400b 1720 // Retrieve all "entry" data points from the DCS server
1721 // host, port: TSocket connection parameters
1722 // entry: name of the alias or data point
1723 // valueSet: array of retrieved AliDCSValue's
1724 // type: kAlias or kDP
58bc3020 1725
73abe331 1726 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1727 if (!client.IsConnected())
1728 {
b948db8d 1729 return kFALSE;
73abe331 1730 }
1731
2c15234c 1732 Int_t result=0;
73abe331 1733
2c15234c 1734 if (type == kAlias)
1735 {
1736 result = client.GetAliasValues(entry,
1737 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1738 } else
1739 if (type == kDP)
1740 {
1741 result = client.GetDPValues(entry,
1742 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1743 }
1744
1745 if (result < 0)
1746 {
2bb7b766 1747 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1748 entry, AliDCSClient::GetErrorString(result)));
73abe331 1749
2c15234c 1750 if (result == AliDCSClient::fgkServerError)
1751 {
2bb7b766 1752 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1753 client.GetServerError().Data()));
1754 }
1755
1756 return kFALSE;
1757 }
1758
1759 return kTRUE;
1760}
b948db8d 1761
a038aa70 1762//______________________________________________________________________________________________
1763TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
1764 DCSType type)
1765{
1766 // Retrieve all "entry" data points from the DCS server
1767 // host, port: TSocket connection parameters
1768 // entries: list of name of the alias or data point
1769 // type: kAlias or kDP
1770 // returns TMap of values, 0 when failure
1771
1772 const Int_t kSplit = 100; // maximum number of DPs at a time
1773
1774 Int_t totalEntries = entries->GetEntries();
1775
1776 TMap* result = 0;
1777
1778 for (Int_t index=0; index < totalEntries; index += kSplit)
1779 {
1780 Int_t endIndex = index + kSplit;
1781
1782 AliDCSClient client(host, port, fTimeout, fRetries);
1783 if (!client.IsConnected())
1784 return 0;
1785
1786 TMap* partialResult = 0;
1787
1788 if (type == kAlias)
1789 {
1790 partialResult = client.GetAliasValues(entries, GetCurrentStartTime(),
1791 GetCurrentEndTime(), index, endIndex);
1792 }
1793 else if (type == kDP)
1794 {
1795 partialResult = client.GetDPValues(entries, GetCurrentStartTime(),
1796 GetCurrentEndTime(), index, endIndex);
1797 }
1798
1799 if (partialResult == 0)
1800 {
1801 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries (%d...%d)! Reason: %s",
1802 index, endIndex, client.GetServerError().Data()));
1803
1804 if (result)
1805 delete result;
1806
1807 return 0;
1808 }
1809
1810 AliInfo(Form("Retrieved entries %d..%d (total %d); E.g. %s has %d values collected",
1811 index, endIndex, totalEntries, entries->At(index)->GetName(), ((TObjArray*)
1812 partialResult->GetValue(entries->At(index)->GetName()))->GetEntriesFast()));
1813
1814 if (!result)
1815 {
1816 result = partialResult;
1817 }
1818 else
1819 {
1820 TIter iter(partialResult);
1821 TObjString* key = 0;
1822 while ((key = (TObjString*) iter.Next()))
1823 result->Add(key, partialResult->GetValue(key->String()));
1824
1825 partialResult->SetOwner(kFALSE);
1826 delete partialResult;
1827 }
1828
1829 }
1830
1831 return result;
1832}
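// Illustration only, not part of AliShuttle: the loop above requests the entries in
// slices [index, index + kSplit). A minimal sketch of that chunking in plain C++ follows.
// SplitIntoChunks is a hypothetical helper; it clamps the end of the last slice to the
// total, whereas the original passes index + kSplit unclamped (whether AliDCSClient
// clamps it internally is not visible in this file).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the half-open [begin, end) index ranges that
// cover `total` entries in chunks of at most `chunkSize` entries each.
std::vector<std::pair<std::size_t, std::size_t>> SplitIntoChunks(std::size_t total,
                                                                 std::size_t chunkSize)
{
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    if (chunkSize == 0) return chunks; // a zero chunk size would never advance

    for (std::size_t begin = 0; begin < total; begin += chunkSize) {
        // clamp the end of the last chunk to the number of entries
        chunks.emplace_back(begin, std::min(begin + chunkSize, total));
    }
    return chunks;
}
```

// With 250 entries and a chunk size of 100 this yields three ranges, the last one
// covering entries 200 to 249.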
b948db8d 1833//______________________________________________________________________________________________
57f50b3c 1834const char* AliShuttle::GetFile(Int_t system, const char* detector,
1835 const char* id, const char* source)
b948db8d 1836{
9827400b 1837 // Get calibration file from file exchange servers
1838 // First queries the FXS database for the file name, using the run, detector, id and source info
1839 // then calls RetrieveFile(filename) for actual copy to local disk
1840 // run: current run being processed (given by Logbook entry fLogbookEntry)
1841 // detector: the Preprocessor name
1842 // id: provided as a parameter by the Preprocessor
1843 // source: provided by the Preprocessor through GetFileSources function
1844
1845 // check if test mode should simulate a FXS error
1846 if (fTestMode & kErrorFXSFiles)
1847 {
1848 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1849 return 0;
1850 }
1851
57f50b3c 1852 // check connection; connect if necessary
9d733021 1853 if (!Connect(system))
eba76848 1854 {
9d733021 1855 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1856 return 0;
1857 }
1858
1859 // Query preparation
9d733021 1860 TString sourceName(source);
d386d623 1861 Int_t nFields = 3;
1862 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1863 fConfig->GetFXSdbTable(system));
1864 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1865 GetCurrentRun(), detector, id);
1866
9d733021 1867 if (system == kDAQ)
1868 {
d386d623 1869 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1870 }
9d733021 1871 else if (system == kDCS)
eba76848 1872 {
9d733021 1873 sourceName="none";
57f50b3c 1874 }
9d733021 1875 else if (system == kHLT)
9e080f92 1876 {
d386d623 1877 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1878 nFields = 3;
9e080f92 1879 }
1880
9e080f92 1881 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1882
1883 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1884
1885 // Query execution
1886 TSQLResult* aResult = 0;
9d733021 1887 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1888 if (!aResult) {
9d733021 1889 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1890 GetSystemName(system), id, sourceName.Data()));
9e080f92 1891 return 0;
1892 }
1893
1894 if(aResult->GetRowCount() == 0)
1895 {
1896 Log(detector,
9d733021 1897 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1898 GetSystemName(system), id, sourceName.Data()));
9e080f92 1899 delete aResult;
1900 return 0;
1901 }
2bb7b766 1902
9e080f92 1903 if (aResult->GetRowCount() > 1) {
1904 Log(detector,
9d733021 1905 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1906 GetSystemName(system), id, sourceName.Data()));
9e080f92 1907 delete aResult;
1908 return 0;
1909 }
1910
9d733021 1911 if (aResult->GetFieldCount() != nFields) {
9e080f92 1912 Log(detector,
9d733021 1913 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1914 GetSystemName(system), id, sourceName.Data()));
9e080f92 1915 delete aResult;
1916 return 0;
1917 }
1918
1919 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1920
1921 if (!aRow){
9d733021 1922 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1923 GetSystemName(system), id, sourceName.Data()));
9e080f92 1924 delete aResult;
1925 return 0;
1926 }
1927
1928 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1929 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1930 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1931
1932 delete aResult;
1933 delete aRow;
1934
d386d623 1935 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1936 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1937
9e080f92 1938 // retrieved file is renamed to make it unique
9d733021 1939 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1940 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1941
9e080f92 1942
9d733021 1943 // file retrieval from FXS
4b95672b 1944 UInt_t nRetries = 0;
1945 UInt_t maxRetries = 3;
1946 Bool_t result = kFALSE;
1947
1948 // copy!! if successful TSystem::Exec returns 0
1949 while(nRetries++ < maxRetries) {
1950 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1951 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1952 if(!result)
1953 {
1954 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
9d733021 1955 filePath.Data(), GetSystemName(system)));
4b95672b 1956 continue;
1957 } else {
1958 AliInfo(Form("File %s copied from %s FXS into %s/%s",
1959 filePath.Data(), GetSystemName(system),
1960 GetShuttleTempDir(), localFileName.Data()));
1961 }
9e080f92 1962
d386d623 1963 if (fileChecksum.Length()>0)
4b95672b 1964 {
1965 // compare md5sum of local file with the one stored in the FXS DB
1966 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s |grep %s 2>&1 > /dev/null",
d386d623 1967 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 1968
4b95672b 1969 if (md5Comp != 0)
1970 {
1971 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
1972 filePath.Data()));
1973 result = kFALSE;
1974 continue;
1975 }
d386d623 1976 } else {
1977 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
1978 filePath.Data(), GetSystemName(system)));
9d733021 1979 }
4b95672b 1980 if (result) break;
9e080f92 1981 }
1982
4b95672b 1983 if(!result) return 0;
1984
9d733021 1985 fFXSCalled[system]=kTRUE;
1986 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
1987 fFXSlist[system].Add(fileParams);
9e080f92 1988
1989 static TString fullLocalFileName;
36c99a6a 1990 fullLocalFileName = TString::Format("%s/%s", GetShuttleTempDir(), localFileName.Data());
1991
9e080f92 1992 AliInfo(Form("fullLocalFileName = %s", fullLocalFileName.Data()));
1993
1994 return fullLocalFileName.Data();
2bb7b766 1995
1996}
1997
1998//______________________________________________________________________________________________
9d733021 1999Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 2000{
9827400b 2001 //
2002 // Copies file from FXS to local Shuttle machine
2003 //
2bb7b766 2004
9e080f92 2005 // check temp directory: trying to cd to temp; if it does not exist, create it
9d733021 2006 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
2007 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 2008
36c99a6a 2009 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 2010 if (dir == NULL) {
36c99a6a 2011 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
2012 AliError(Form("Can't open directory <%s>", GetShuttleTempDir()));
9e080f92 2013 return kFALSE;
2014 }
2015
2016 } else {
2017 gSystem->FreeDirectory(dir);
2018 }
2019
9d733021 2020 TString baseFXSFolder;
2021 if (system == kDAQ)
2022 {
2023 baseFXSFolder = "FES/";
2024 }
2025 else if (system == kDCS)
2026 {
2027 baseFXSFolder = "";
2028 }
2029 else if (system == kHLT)
2030 {
2031 baseFXSFolder = "~/";
2032 }
2033
2034
2035 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
2036 fConfig->GetFXSPort(system),
2037 fConfig->GetFXSUser(system),
2038 fConfig->GetFXSHost(system),
2039 baseFXSFolder.Data(),
2040 fxsFileName,
36c99a6a 2041 GetShuttleTempDir(),
9e080f92 2042 localFileName);
2043
2044 AliDebug(2, Form("%s",command.Data()));
2045
4b95672b 2046 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 2047
4b95672b 2048 return result;
9e080f92 2049}
2050
2051//______________________________________________________________________________________________
9d733021 2052TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2053{
9827400b 2054 //
2055 // Get sources producing the condition file Id from file exchange servers
4a33bdd9 2056 // if id is NULL all sources are returned (distinct)
9827400b 2057 //
2058
2059 // check if test mode should simulate a FXS error
2060 if (fTestMode & kErrorFXSSources)
2061 {
2062 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2063 return 0;
2064 }
2065
9d733021 2066
2067 if (system == kDCS)
2068 {
2069 AliError("DCS system has only one source of data!");
2070 return NULL;
9d733021 2071 }
9e080f92 2072
2073 // check connection; connect if necessary
9d733021 2074 if (!Connect(system))
2075 {
4a33bdd9 2076 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
9d733021 2077 return NULL;
9e080f92 2078 }
2079
9d733021 2080 TString sourceName = 0;
2081 if (system == kDAQ)
2082 {
2083 sourceName = "DAQsource";
2084 } else if (system == kHLT)
2085 {
2086 sourceName = "DDLnumbers";
2087 }
2088
4a33bdd9 2089 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2090 TString whereClause = Form("run=%d and detector=\"%s\"",
2091 GetCurrentRun(), detector);
2092 if (id)
2093 whereClause += Form(" and fileId=\"%s\"", id);
9e080f92 2094 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2095
2096 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2097
2098 // Query execution
2099 TSQLResult* aResult;
9d733021 2100 aResult = fServer[system]->Query(sqlQuery);
9e080f92 2101 if (!aResult) {
9d733021 2102 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2103 GetSystemName(system), id ? id : "(all)"));
9e080f92 2104 return 0;
2105 }
2106
86aa42c3 2107 TList *list = new TList();
2108 list->SetOwner(1);
2109
9d733021 2110 if (aResult->GetRowCount() == 0)
2111 {
9e080f92 2112 Log(detector,
9d733021 2113 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id ? id : "(all)"));
9e080f92 2114 delete aResult;
86aa42c3 2115 return list;
9e080f92 2116 }
2117
2118 TSQLRow* aRow;
9e080f92 2119
9d733021 2120 while ((aRow = aResult->Next()))
2121 {
9e080f92 2122
9d733021 2123 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2124 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2125 list->Add(new TObjString(source));
9e080f92 2126 delete aRow;
2127 }
9d733021 2128
9e080f92 2129 delete aResult;
2130
2131 return list;
2bb7b766 2132}
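// Illustration only, not part of AliShuttle: a minimal sketch of how the query in
// GetFileSources is assembled, including the optional fileId condition. BuildSourcesQuery
// and the table name used in the usage note are hypothetical; `table` and `column` stand
// for fConfig->GetFXSdbTable(system) and the per-system source column. As in the
// original, values are inserted without escaping, so they are assumed to be trusted.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the query construction in GetFileSources.
// A null `id` means "all sources of this run and detector".
std::string BuildSourcesQuery(const std::string& table, const std::string& column,
                              int run, const std::string& detector, const char* id)
{
    std::string query = "select distinct " + column + " from " + table +
                        " where run=" + std::to_string(run) +
                        " and detector=\"" + detector + "\"";

    // the fileId condition is appended only when an id was given
    if (id) query += std::string(" and fileId=\"") + id + "\"";

    return query;
}
```

// For example, BuildSourcesQuery("fxs_table", "DAQsource", 1234, "TPC", 0) gives
// select distinct DAQsource from fxs_table where run=1234 and detector="TPC"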
2133
4a33bdd9 2134//______________________________________________________________________________________________
2135TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2136{
2137 //
2138 // Get all ids of condition files produced by a given source from file exchange servers
2139 //
2140
2141 // check if test mode should simulate a FXS error
2142 if (fTestMode & kErrorFXSSources)
2143 {
2144 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2145 return 0;
2146 }
2147
2148 // check connection; connect if necessary
2149 if (!Connect(system))
2150 {
2151 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2152 return NULL;
2153 }
2154
2155 TString sourceName = 0;
2156 if (system == kDAQ)
2157 {
2158 sourceName = "DAQsource";
2159 } else if (system == kHLT)
2160 {
2161 sourceName = "DDLnumbers";
2162 }
2163
2164 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2165 TString whereClause = Form("run=%d and detector=\"%s\"",
2166 GetCurrentRun(), detector);
2167 if (sourceName.Length() > 0 && source)
2168 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2169 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2170
2171 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2172
2173 // Query execution
2174 TSQLResult* aResult;
2175 aResult = fServer[system]->Query(sqlQuery);
2176 if (!aResult) {
2177 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2178 GetSystemName(system), source ? source : "(all)"));
2179 return 0;
2180 }
2181
2182 TList *list = new TList();
2183 list->SetOwner(1);
2184
2185 if (aResult->GetRowCount() == 0)
2186 {
2187 Log(detector,
2188 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source ? source : "(all)"));
2189 delete aResult;
2190 return list;
2191 }
2192
2193 TSQLRow* aRow;
2194
2195 while ((aRow = aResult->Next()))
2196 {
2197
2198 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2199 AliDebug(2, Form("fileId = %s", id.Data()));
2200 list->Add(new TObjString(id));
2201 delete aRow;
2202 }
2203
2204 delete aResult;
2205
2206 return list;
2207}
2208
2bb7b766 2209//______________________________________________________________________________________________
9d733021 2210Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 2211{
9827400b 2212 // Connect to the MySQL server hosting the FXS database of the given system
2213 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2214 //
57f50b3c 2215
9d733021 2216 // check connection: if already connected return
2217 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 2218
9d733021 2219 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 2220
9d733021 2221 if (system < 3) // FXS db servers
2222 {
2223 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2224 dbUser = fConfig->GetFXSdbUser(system);
2225 dbPass = fConfig->GetFXSdbPass(system);
2226 dbName = fConfig->GetFXSdbName(system);
2227 } else { // Run & Shuttle logbook servers
2228 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2229 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2230 dbUser = fConfig->GetDAQlbUser();
2231 dbPass = fConfig->GetDAQlbPass();
2232 dbName = fConfig->GetDAQlbDB();
2233 }
57f50b3c 2234
9d733021 2235 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2236 if (!fServer[system] || !fServer[system]->IsConnected()) {
2237 if(system < 3)
2238 {
2239 AliError(Form("Can't establish connection to FXS database for %s",
2240 AliShuttleInterface::GetSystemName(system)));
2241 } else {
2242 AliError("Can't establish connection to Run logbook.");
57f50b3c 2243 }
9d733021 2244 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; }
2245 return kFALSE;
2bb7b766 2246 }
57f50b3c 2247
9d733021 2248 // Get tables
2249 TSQLResult* aResult=0;
2250 switch(system){
2251 case kDAQ:
2252 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2253 break;
2254 case kDCS:
2255 aResult = fServer[kDCS]->GetTables(dbName.Data());
2256 break;
2257 case kHLT:
2258 aResult = fServer[kHLT]->GetTables(dbName.Data());
2259 break;
2260 default:
2261 aResult = fServer[3]->GetTables(dbName.Data());
2262 break;
2263 }
2264
2265 delete aResult;
2bb7b766 2266 return kTRUE;
2267}
57f50b3c 2268
9e080f92 2269//______________________________________________________________________________________________
9d733021 2270Bool_t AliShuttle::UpdateTable()
9e080f92 2271{
9827400b 2272 //
2273 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2274 //
9e080f92 2275
9d733021 2276 Bool_t result = kTRUE;
9e080f92 2277
9d733021 2278 for (UInt_t system=0; system<3; system++)
2279 {
2280 if(!fFXSCalled[system]) continue;
9e080f92 2281
9d733021 2282 // check connection; connect if necessary
2283 if (!Connect(system))
2284 {
2285 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2286 result = kFALSE;
2287 continue;
9e080f92 2288 }
9e080f92 2289
9d733021 2290 TTimeStamp now; // now
2291
2292 // Loop on FXS list entries
2293 TIter iter(&fFXSlist[system]);
2294 TObjString *aFXSentry=0;
2295 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2296 {
2297 TString aFXSentrystr = aFXSentry->String();
2298 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2299 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2300 {
2301 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2302 GetSystemName(system), aFXSentrystr.Data()));
2303 if(aFXSarray) delete aFXSarray;
2304 result = kFALSE;
2305 continue;
2306 }
2307 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2308 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2309
2310 TString whereClause;
2311 if (system == kDAQ)
2312 {
2313 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2314 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2315 }
2316 else if (system == kDCS)
2317 {
2318 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2319 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2320 }
2321 else if (system == kHLT)
2322 {
2323 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2324 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2325 }
2326
2327 delete aFXSarray;
9e080f92 2328
9d733021 2329 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2330 (Int_t) now.GetSec(), whereClause.Data());
9e080f92 2331
9d733021 2332 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2333
9d733021 2334 // Query execution
2335 TSQLResult* aResult;
2336 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2337 if (!aResult)
2338 {
2339 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2340 GetSystemName(system), sqlQuery.Data()));
2341 result = kFALSE;
2342 continue;
2343 }
2344 delete aResult;
9e080f92 2345 }
9e080f92 2346 }
2347
9d733021 2348 return result;
9e080f92 2349}
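// Illustration only, not part of AliShuttle: GetFile stores each entry as
// "<id>#!?!#<source>" and UpdateTable splits it again with TString::Tokenize, which
// treats its argument as a set of delimiter characters rather than as one separator
// string. An id or source containing '#', '!' or '?' would therefore yield more than
// two tokens and end up in the error branch above. A minimal sketch of splitting on the
// complete separator in standard C++ follows; SplitOnSeparator is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on every occurrence of the complete
// separator string `sep`. Empty fields are kept.
std::vector<std::string> SplitOnSeparator(const std::string& text, const std::string& sep)
{
    std::vector<std::string> parts;
    if (sep.empty()) { // no separator: the whole text is one field
        parts.push_back(text);
        return parts;
    }

    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = text.find(sep, start);
        if (pos == std::string::npos) {
            parts.push_back(text.substr(start)); // last field
            break;
        }
        parts.push_back(text.substr(start, pos - start));
        start = pos + sep.size();
    }
    return parts;
}
```

// With this helper "id#1#!?!#LDC0" splits into exactly two fields, "id#1" and "LDC0".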
57f50b3c 2350
3301427a 2351//______________________________________________________________________________________________
2352Bool_t AliShuttle::UpdateTableFailCase()
2353{
9827400b 2354 // Update FXS table filling time_processed field in all rows corresponding to current run and detector.
2355 // This is called when the preprocessor is declared failed for the current run, because
2356 // otherwise the field is filled only in case of success (see UpdateTable)
3301427a 2357
2358 Bool_t result = kTRUE;
2359
2360 for (UInt_t system=0; system<3; system++)
2361 {
2362 // check connection; connect if necessary
2363 if (!Connect(system))
2364 {
2365 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2366 GetSystemName(system)));
2367 result = kFALSE;
2368 continue;
2369 }
2370
2371 TTimeStamp now; // now
2372
2373 // Update all rows of the current run and detector
2374
2375 TString whereClause = Form("where run=%d and detector=\"%s\";",
2376 GetCurrentRun(), fCurrentDetector.Data());
2377
2378
2379 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2380 (Int_t) now.GetSec(), whereClause.Data());
2381
2382 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2383
2384 // Query execution
2385 TSQLResult* aResult;
2386 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2387 if (!aResult)
2388 {
2389 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2390 GetSystemName(system), sqlQuery.Data()));
2391 result = kFALSE;
2392 continue;
2393 }
2394 delete aResult;
2395 }
2396
2397 return result;
2398}
2399
2bb7b766 2400//______________________________________________________________________________________________
2401Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2402{
e7f62f16 2403 //
2404 // Update Shuttle logbook filling detector or shuttle_done column
2405 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2406 //
57f50b3c 2407
2bb7b766 2408 // check connection; connect if necessary
be48e3ea 2409 if(!Connect(3)){
2bb7b766 2410 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2411 return kFALSE;
57f50b3c 2412 }
2413
2bb7b766 2414 TString detName(detector);
2415 TString setClause;
e7f62f16 2416 if(detName == "shuttle_done")
2417 {
2bb7b766 2418 setClause = "set shuttle_done=1";
e7f62f16 2419
2420 // Send the information to ML
2421 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2422
2423 TList mlList;
2424 mlList.Add(&mlStatus);
2425
2426 fMonaLisa->SendParameters(&mlList);
2bb7b766 2427 } else {
2bb7b766 2428 TString statusStr(status);
2429 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2430 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2431 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2432 } else {
2433 Log("SHUTTLE",
2434 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2435 status, detector));
2436 return kFALSE;
2437 }
2438 }
57f50b3c 2439
2bb7b766 2440 TString whereClause = Form("where run=%d", GetCurrentRun());
2441
441b0e9c 2442 TString sqlQuery = Form("update %s %s %s",
2443 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2444
2bb7b766 2445 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2446
2447 // Query execution
2448 TSQLResult* aResult;
be48e3ea 2449 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2450 if (!aResult) {
2451 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2452 return kFALSE;
57f50b3c 2453 }
2bb7b766 2454 delete aResult;
57f50b3c 2455
2456 return kTRUE;
2457}
2458
2459//______________________________________________________________________________________________
2bb7b766 2460Int_t AliShuttle::GetCurrentRun() const
2461{
9827400b 2462 //
2463 // Get current run from logbook entry
2464 //
57f50b3c 2465
2bb7b766 2466 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2467}
2468
2469//______________________________________________________________________________________________
2bb7b766 2470UInt_t AliShuttle::GetCurrentStartTime() const
2471{
9827400b 2472 //
2473 // get current start time from logbook entry
2474 //
57f50b3c 2475
2bb7b766 2476 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2477}
2478
2479//______________________________________________________________________________________________
2bb7b766 2480UInt_t AliShuttle::GetCurrentEndTime() const
2481{
9827400b 2482 //
2483 // get current end time from logbook entry
2484 //
57f50b3c 2485
2bb7b766 2486 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2487}
2488
b948db8d 2489//______________________________________________________________________________________________
2490void AliShuttle::Log(const char* detector, const char* message)
2491{
9827400b 2492 //
2493 // Fill log string with a message
2494 //
b948db8d 2495
36c99a6a 2496 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2497 if (dir == NULL) {
36c99a6a 2498 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2499 AliError(Form("Can't open directory <%s>", GetShuttleLogDir()));
84090f85 2500 return;
2501 }
b948db8d 2502
84090f85 2503 } else {
2504 gSystem->FreeDirectory(dir);
2505 }
b948db8d 2506
cb343cfd 2507 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2508 if (GetCurrentRun() >= 0)
2509 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2510 toLog += Form("%s", message);
2511
84090f85 2512 AliInfo(toLog.Data());
ffa29e93 2513
2514 // if we redirect the log output already to the file, leave here
2515 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2516 return;
b948db8d 2517
ffa29e93 2518 TString fileName = GetLogFileName(detector);
e7f62f16 2519
84090f85 2520 gSystem->ExpandPathName(fileName);
2521
2522 ofstream logFile;
2523 logFile.open(fileName, ofstream::out | ofstream::app);
2524
2525 if (!logFile.is_open()) {
2526 AliError(Form("Could not open file %s", fileName.Data()));
2527 return;
2528 }
7bfb2090 2529
84090f85 2530 logFile << toLog.Data() << "\n";
b948db8d 2531
84090f85 2532 logFile.close();
b948db8d 2533}
2bb7b766 2534
ffa29e93 2535//______________________________________________________________________________________________
2536TString AliShuttle::GetLogFileName(const char* detector) const
2537{
2538 //
2539 // returns the name of the log file for a given sub detector
2540 //
2541
2542 TString fileName;
2543
2544 if (GetCurrentRun() >= 0)
2545 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2546 else
2547 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2548
2549 return fileName;
2550}
2551
2bb7b766 2552//______________________________________________________________________________________________
2553Bool_t AliShuttle::Collect(Int_t run)
2554{
9827400b 2555 //
2556 // Collects conditions data for all UNPROCESSED run written to DAQ LogBook in case of run = -1 (default)
2557 // If a dedicated run is given this run is processed
2558 //
2559 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2560 //
2bb7b766 2561
eba76848 2562 if (run == -1)
2563 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2564 else
2565 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2566
2567 SetLastAction("Starting");
2bb7b766 2568
2569 TString whereClause("where shuttle_done=0");
eba76848 2570 if (run != -1)
2571 whereClause += Form(" and run=%d", run);
2bb7b766 2572
2573 TObjArray shuttleLogbookEntries;
be48e3ea 2574 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2575 {
cb343cfd 2576 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2577 return kFALSE;
2578 }
2579
9e080f92 2580 if (shuttleLogbookEntries.GetEntries() == 0)
2581 {
2582 if (run == -1)
2583 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2584 else
2585 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2586 "or it does not exist in Shuttle logbook", run));
2587 return kTRUE;
2588 }
2589
be48e3ea 2590 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2591 fFirstUnprocessed[iDet] = kTRUE;
2592
fc5a4708 2593 if (run != -1)
be48e3ea 2594 {
2595 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2596 // flag them into fFirstUnprocessed array
2597 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2598 TObjArray tmpLogbookEntries;
2599 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2600 {
2601 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2602 return kFALSE;
2603 }
2604
2605 TIter iter(&tmpLogbookEntries);
2606 AliShuttleLogbookEntry* anEntry = 0;
2607 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2608 {
2609 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2610 {
2611 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2612 {
2613 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2614 anEntry->GetRun(), GetDetName(iDet)));
2615 fFirstUnprocessed[iDet] = kFALSE;
2616 }
2617 }
2618
2619 }
2620
2621 }
2622
2623 if (!RetrieveConditionsData(shuttleLogbookEntries))
2624 {
cb343cfd 2625 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2626 return kFALSE;
2627 }
2628
36c99a6a 2629 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2630 return kTRUE;
2bb7b766 2631}
2632
2bb7b766 2633//______________________________________________________________________________________________
2634Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2635{
9827400b 2636 //
2637 // Retrieve conditions data for all runs that aren't processed yet
2638 //
2bb7b766 2639
2640 Bool_t hasError = kFALSE;
2641
2642 TIter iter(&dateEntries);
2643 AliShuttleLogbookEntry* anEntry;
2644
2645 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2646 if (!Process(anEntry)){
2647 hasError = kTRUE;
2648 }
4b95672b 2649
2650 // clean SHUTTLE temp directory
3301427a 2651 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2652 RemoveFile(filename.Data());
2bb7b766 2653 }
2654
2655 return hasError == kFALSE;
2656}
cb343cfd 2657
2658//______________________________________________________________________________________________
2659ULong_t AliShuttle::GetTimeOfLastAction() const
2660{
9827400b 2661 //
2662 // Gets time of last action
2663 //
2664
cb343cfd 2665 ULong_t tmp;
36c99a6a 2666
cb343cfd 2667 fMonitoringMutex->Lock();
be48e3ea 2668
cb343cfd 2669 tmp = fLastActionTime;
36c99a6a 2670
cb343cfd 2671 fMonitoringMutex->UnLock();
36c99a6a 2672
cb343cfd 2673 return tmp;
2674}
2675
2676//______________________________________________________________________________________________
2677const TString AliShuttle::GetLastAction() const
2678{
9827400b 2679 //
cb343cfd 2680 // returns a string description of the last action
9827400b 2681 //
cb343cfd 2682
2683 TString tmp;
36c99a6a 2684
cb343cfd 2685 fMonitoringMutex->Lock();
2686
2687 tmp = fLastAction;
2688
2689 fMonitoringMutex->UnLock();
2690
36c99a6a 2691 return tmp;
cb343cfd 2692}
2693
2694//______________________________________________________________________________________________
2695void AliShuttle::SetLastAction(const char* action)
2696{
9827400b 2697 //
cb343cfd 2698 // updates the monitoring variables
9827400b 2699 //
36c99a6a 2700
cb343cfd 2701 fMonitoringMutex->Lock();
36c99a6a 2702
cb343cfd 2703 fLastAction = action;
2704 fLastActionTime = time(0);
2705
2706 fMonitoringMutex->UnLock();
2707}
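// Illustration only, not part of AliShuttle: the three monitoring accessors above pair
// every Lock() with an explicit UnLock(). A minimal sketch of the same pattern with a
// scope-bound lock from the C++ standard library follows, assuming std::mutex in place
// of the ROOT mutex used here. ActionMonitor is a hypothetical stand-in class.

```cpp
#include <cassert>
#include <ctime>
#include <mutex>
#include <string>

// Hypothetical stand-in for the monitoring members of AliShuttle.
class ActionMonitor {
public:
    void SetLastAction(const std::string& action)
    {
        std::lock_guard<std::mutex> lock(fMutex); // released when the scope ends
        fLastAction = action;
        fLastActionTime = std::time(nullptr);
    }

    std::string GetLastAction() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastAction; // the copy is made while the lock is still held
    }

    std::time_t GetTimeOfLastAction() const
    {
        std::lock_guard<std::mutex> lock(fMutex);
        return fLastActionTime;
    }

private:
    mutable std::mutex fMutex; // mutable so that const getters can lock it
    std::string fLastAction;
    std::time_t fLastActionTime = 0;
};
```

// The guard unlocks on every exit path, so an early return cannot leave the mutex locked.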
eba76848 2708
2709//______________________________________________________________________________________________
2710const char* AliShuttle::GetRunParameter(const char* param)
2711{
9827400b 2712 //
2713 // returns run parameter read from DAQ logbook
2714 //
eba76848 2715
2716 if(!fLogbookEntry) {
2717 AliError("No logbook entry!");
2718 return 0;
2719 }
2720
2721 return fLogbookEntry->GetRunParameter(param);
2722}

//______________________________________________________________________________________________
AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
{
	//
	// returns object from OCDB valid for current run
	//

	if (fTestMode & kErrorOCDB)
	{
		Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
		return 0;
	}

	AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
	if (!sto)
	{
		Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
		return 0;
	}

	return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
}

//______________________________________________________________________________________________
Bool_t AliShuttle::SendMail()
{
	//
	// sends a mail to the subdetector expert in case of preprocessor error
	//

	if (fTestMode != kNone)
		return kTRUE;

	void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
	if (dir == NULL)
	{
		// log directory does not exist yet: try to create it (recursively)
		if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
		{
			AliError(Form("Can't open or create directory <%s>", GetShuttleLogDir()));
			return kFALSE;
		}

	} else {
		gSystem->FreeDirectory(dir);
	}

	TString bodyFileName;
	bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
	gSystem->ExpandPathName(bodyFileName);

	ofstream mailBody;
	mailBody.open(bodyFileName, ofstream::out);

	if (!mailBody.is_open())
	{
		AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
		return kFALSE;
	}

	TString to="";
	TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
	TObjString *anExpert=0;
	while ((anExpert = (TObjString*) iterExperts.Next()))
	{
		to += Form("%s,", anExpert->GetName());
	}
	// strip the trailing comma; skip this when no expert is configured,
	// otherwise Remove() would be called with position -1
	if (to.Length() > 0)
		to.Remove(to.Length()-1);
	AliDebug(2, Form("to: %s",to.Data()));

	if (to.IsNull()) {
		AliInfo("List of detector responsibles not yet set!");
		return kFALSE;
	}

	TString cc="alberto.colla@cern.ch";

	TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
				fCurrentDetector.Data(), GetCurrentRun());
	AliDebug(2, Form("subject: %s", subject.Data()));

	TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
	body += Form("SHUTTLE just detected that your preprocessor "
			"failed processing run %d!!\n\n", GetCurrentRun());
	body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
	body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
	body += Form("Find the %s log for the current run on \n\n"
		"\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n",
		fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
	// the %s placeholder needs the detector name as argument
	body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());

	AliDebug(2, Form("Body begin: %s", body.Data()));

	mailBody << body.Data();
	mailBody.close();
	mailBody.open(bodyFileName, ofstream::out | ofstream::app);

	TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
	TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
	if (gSystem->Exec(tailCommand.Data()))
	{
		mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
	}

	TString endBody = Form("------------------------------------------------------\n\n");
	endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
	endBody += "Please do not answer this message directly, it is automatically generated.\n\n";
	endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";

	AliDebug(2, Form("Body end: %s", endBody.Data()));

	mailBody << endBody.Data();

	mailBody.close();

	// send mail!
	TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
						subject.Data(),
						cc.Data(),
						to.Data(),
						bodyFileName.Data());
	AliDebug(2, Form("mail command: %s", mailCommand.Data()));

	// TSystem::Exec returns an integer status, so keep it as Int_t; 0 means success
	Int_t result = gSystem->Exec(mailCommand.Data());

	return result == 0;
}

//______________________________________________________________________________________________
const char* AliShuttle::GetRunType()
{
	//
	// returns run type read from "run type" logbook
	//

	if(!fLogbookEntry) {
		AliError("No logbook entry!");
		return 0;
	}

	return fLogbookEntry->GetRunType();
}

//______________________________________________________________________________________________
void AliShuttle::SetShuttleTempDir(const char* tmpDir)
{
	//
	// sets Shuttle temp directory
	//

	fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
}

//______________________________________________________________________________________________
void AliShuttle::SetShuttleLogDir(const char* logDir)
{
	//
	// sets Shuttle log directory
	//

	fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
}