git.uio.no Git - u/mrichter/AliRoot.git/blame - SHUTTLE/AliShuttle.cxx
73abe331 1/**************************************************************************
2 * Copyright(c) 1998-1999, ALICE Experiment at CERN, All rights reserved. *
3 * *
4 * Author: The ALICE Off-line Project. *
5 * Contributors are mentioned in the code where appropriate. *
6 * *
7 * Permission to use, copy, modify and distribute this software and its *
8 * documentation strictly for non-commercial purposes is hereby granted *
9 * without fee, provided that the above copyright notice appears in all *
10 * copies and that both the copyright notice and this permission notice *
11 * appear in the supporting documentation. The authors make no claims *
12 * about the suitability of this software for any purpose. It is *
13 * provided "as is" without express or implied warranty. *
14 **************************************************************************/
15
16/*
17$Log$
4859271b 18Revision 1.54 2007/07/12 09:51:25 jgrosseo
19removed duplicated log message in GetFile
20
4f0749a8 21Revision 1.53 2007/07/12 09:26:28 jgrosseo
22updating hlt fxs base path
23
42fde080 24Revision 1.52 2007/07/12 08:06:45 jgrosseo
25adding log messages in getfile... functions
26adding not implemented copy constructor in alishuttleconfigholder
27
1bcd28db 28Revision 1.51 2007/07/03 17:24:52 acolla
29root moved to v5-16-00. TFileMerger->Cp moved to TFile::Cp.
30
a986b218 31Revision 1.50 2007/07/02 17:19:32 acolla
32preprocessor is run in a temp directory that is removed when process is finished.
33
5bac2bde 34Revision 1.49 2007/06/29 10:45:06 acolla
35Number of columns in MySql Shuttle logbook increased by one (HLT added)
36
db99d43e 37Revision 1.48 2007/06/21 13:06:19 acolla
38GetFileSources returns dummy list with 1 source if system=DCS (better than
39returning error as it was)
40
6297b37d 41Revision 1.47 2007/06/19 17:28:56 acolla
42HLT updated; missing map bug removed.
43
dc25836b 44Revision 1.46 2007/06/09 13:01:09 jgrosseo
45Switching to retrieval of several DCS DPs at a time (multiDPrequest)
46
a038aa70 47Revision 1.45 2007/05/30 06:35:20 jgrosseo
48Adding functionality to the Shuttle/TestShuttle:
49o) Function to retrieve list of sources from a given system (GetFileSources with id=0)
50o) Function to retrieve list of IDs for a given source (GetFileIDs)
51These functions are needed for dealing with the tag files that are saved for the GRP preprocessor
52Example code has been added to the TestProcessor in TestShuttle
53
4a33bdd9 54Revision 1.44 2007/05/11 16:09:32 acolla
55Reference files for ITS, MUON and PHOS are now stored in OfflineDetName/OnlineDetName/run_...
56example: ITS/SPD/100_filename.root
57
2d9019b4 58Revision 1.43 2007/05/10 09:59:51 acolla
59Various bug fixes in StoreRefFilesToGrid; Cleaning of reference storage before processing detector (CleanReferenceStorage)
60
546242fb 61Revision 1.42 2007/05/03 08:01:39 jgrosseo
62typo in last commit :-(
63
8b739301 64Revision 1.41 2007/05/03 08:00:48 jgrosseo
65fixing log message when pp want to skip dcs value retrieval
66
651fdaab 67Revision 1.40 2007/04/27 07:06:48 jgrosseo
68GetFileSources returns empty list in case of no files, but successful query
69No mails sent in testmode
70
86aa42c3 71Revision 1.39 2007/04/17 12:43:57 acolla
72Correction in StoreOCDB; change of text in mail to detector expert
73
26758fce 74Revision 1.38 2007/04/12 08:26:18 jgrosseo
75updated comment
76
3c2a21c8 77Revision 1.37 2007/04/10 16:53:14 jgrosseo
78redirecting sub detector stdout, stderr to sub detector log file
79
3d8bc902 80Revision 1.35 2007/04/04 16:26:38 acolla
811. Re-organization of function calls in TestPreprocessor to make it more meaningful.
822. Added missing dependency in test preprocessors.
833. in AliShuttle.cxx: processing time and memory consumption info on a single line.
84
886d60e6 85Revision 1.34 2007/04/04 10:33:36 jgrosseo
861) Storing of files to the Grid is now done _after_ your preprocessors succeeded. This is transparent, which means that you can still use the same functions (Store, StoreReferenceData) to store files to the Grid. However, the Shuttle first stores them locally and transfers them after the preprocessor finished. The return code of these two functions has changed from UInt_t to Bool_t which gives you the success of the storing.
87In case of an error with the Grid, the Shuttle will retry the storing later, the preprocessor does not need to be run again.
88
892) The meaning of the return code of the preprocessor has changed. 0 is now success and any other value means failure. This value is stored in the log and you can use it to keep details about the error condition.
90
913) New function StoreReferenceFile to _directly_ store a file (without opening it) to the reference storage.
92
934) The memory usage of the preprocessor is monitored. If it exceeds 2 GB it is terminated.
94
955) New function AliPreprocessor::ProcessDCS(). If you do not need to have DCS data in all cases, you can skip the processing by implementing this function and returning kFALSE under certain conditions, e.g. for a certain run type.
96If you always need DCS data (like before), you do not need to implement it.
97
986) The run type has been added to the monitoring page
99
9827400b 100Revision 1.33 2007/04/03 13:56:01 acolla
101Grid Storage at the end of preprocessing. Added virtual method to disable DCS query according to the
102run type.
103
3301427a 104Revision 1.32 2007/02/28 10:41:56 acolla
105Run type field added in SHUTTLE framework. Run type is read from "run type" logbook and retrieved by
106AliPreprocessor::GetRunType() function.
107Added some ldap definition files.
108
d386d623 109Revision 1.30 2007/02/13 11:23:21 acolla
110Moved getters and setters of Shuttle's main OCDB/Reference, local
111OCDB/Reference, temp and log folders to AliShuttleInterface
112
9d733021 113Revision 1.27 2007/01/30 17:52:42 jgrosseo
114adding monalisa monitoring
115
e7f62f16 116Revision 1.26 2007/01/23 19:20:03 acolla
117Removed old ldif files, added TOF, MCH ldif files. Added some options in
118AliShuttleConfig::Print. Added in Ali Shuttle: SetShuttleTempDir and
119SetShuttleLogDir
120
36c99a6a 121Revision 1.25 2007/01/15 19:13:52 acolla
122Moved some AliInfo to AliDebug in SendMail function
123
fc5a4708 124Revision 1.21 2006/12/07 08:51:26 jgrosseo
125update (alberto):
126table, db names in ldap configuration
127added GRP preprocessor
128DCS data can also be retrieved by data point
129
2c15234c 130Revision 1.20 2006/11/16 16:16:48 jgrosseo
131introducing strict run ordering flag
132removed giving preprocessor name to preprocessor, they have to know their name themselves ;-)
133
be48e3ea 134Revision 1.19 2006/11/06 14:23:04 jgrosseo
135major update (Alberto)
136o) reading of run parameters from the logbook
137o) online offline naming conversion
138o) standalone DCSclient package
139
eba76848 140Revision 1.18 2006/10/20 15:22:59 jgrosseo
141o) Adding time out to the execution of the preprocessors: The Shuttle forks and the parent process monitors the child
142o) Merging Collect, CollectAll, CollectNew function
143o) Removing implementation of empty copy constructors (declaration still there!)
144
cb343cfd 145Revision 1.17 2006/10/05 16:20:55 jgrosseo
146adapting to new CDB classes
147
6ec0e06c 148Revision 1.16 2006/10/05 15:46:26 jgrosseo
149applying to the new interface
150
481441a2 151Revision 1.15 2006/10/02 16:38:39 jgrosseo
152update (alberto):
153fixed memory leaks
154storing of objects that failed to be stored to the grid before
155interfacing of shuttle status table in daq system
156
2bb7b766 157Revision 1.14 2006/08/29 09:16:05 jgrosseo
158small update
159
85a80aa9 160Revision 1.13 2006/08/15 10:50:00 jgrosseo
161effc++ corrections (alberto)
162
4f0ab988 163Revision 1.12 2006/08/08 14:19:29 jgrosseo
164Update to shuttle classes (Alberto)
165
166- Possibility to set the full object's path in the Preprocessor's and
167Shuttle's Store functions
168- Possibility to extend the object's run validity in the same classes
169("startValidity" and "validityInfinite" parameters)
170- Implementation of the StoreReferenceData function to store reference
171data in a dedicated CDB storage.
172
84090f85 173Revision 1.11 2006/07/21 07:37:20 jgrosseo
174last run is stored after each run
175
7bfb2090 176Revision 1.10 2006/07/20 09:54:40 jgrosseo
177introducing status management: The processing per subdetector is divided into several steps,
178after each step the status is stored on disk. If the system crashes in any of the steps the Shuttle
179can keep track of the number of failures and skips further processing after a certain threshold is
180exceeded. These thresholds can be configured in LDAP.
181
5164a766 182Revision 1.9 2006/07/19 10:09:55 jgrosseo
183new configuration, access to DAQ FES (Alberto)
184
57f50b3c 185Revision 1.8 2006/07/11 12:44:36 jgrosseo
186adding parameters for extended validity range of data produced by preprocessor
187
17111222 188Revision 1.7 2006/07/10 14:37:09 jgrosseo
189small fix + todo comment
190
e090413b 191Revision 1.6 2006/07/10 13:01:41 jgrosseo
192enhanced storing of last successfully processed run (alberto)
193
a7160fe9 194Revision 1.5 2006/07/04 14:59:57 jgrosseo
195revision of AliDCSValue: Removed wrapper classes, reduced storage size per value by factor 2
196
45a493ce 197Revision 1.4 2006/06/12 09:11:16 jgrosseo
198coding conventions (Alberto)
199
58bc3020 200Revision 1.3 2006/06/06 14:26:40 jgrosseo
201o) removed files that were moved to STEER
202o) shuttle updated to follow the new interface (Alberto)
203
b948db8d 204Revision 1.2 2006/03/07 07:52:34 hristov
205New version (B.Yordanov)
206
d477ad88 207Revision 1.6 2005/11/19 17:19:14 byordano
208RetrieveDATEEntries and RetrieveConditionsData added
209
210Revision 1.5 2005/11/19 11:09:27 byordano
211AliShuttle declaration added
212
213Revision 1.4 2005/11/17 17:47:34 byordano
214TList changed to TObjArray
215
216Revision 1.3 2005/11/17 14:43:23 byordano
217import to local CVS
218
219Revision 1.1.1.1 2005/10/28 07:33:58 hristov
220Initial import as subdirectory in AliRoot
221
73abe331 222Revision 1.2 2005/09/13 08:41:15 byordano
223default startTime endTime added
224
225Revision 1.4 2005/08/30 09:13:02 byordano
226some docs added
227
228Revision 1.3 2005/08/29 21:15:47 byordano
229some docs added
230
231*/
232
233//
234// This class is the main manager for AliShuttle.
235// It organizes the data retrieval from DCS and calls the
b948db8d 236// interface methods of AliPreprocessor.
73abe331 237// For every detector in AliShuttleConfig (see AliShuttleConfig),
238// data for its set of aliases is retrieved. If an AliPreprocessor
b948db8d 239// is registered for this detector, it is used
240// according to the schema (see AliPreprocessor).
241// If no AliPreprocessor is registered, the retrieved
73abe331 242// data is stored automatically in the underlying AliCDBStorage.
243// The alias name is used as detSpec.
244//
245
246#include "AliShuttle.h"
247
248#include "AliCDBManager.h"
249#include "AliCDBStorage.h"
250#include "AliCDBId.h"
84090f85 251#include "AliCDBRunRange.h"
252#include "AliCDBPath.h"
5164a766 253#include "AliCDBEntry.h"
73abe331 254#include "AliShuttleConfig.h"
eba76848 255#include "DCSClient/AliDCSClient.h"
73abe331 256#include "AliLog.h"
b948db8d 257#include "AliPreprocessor.h"
5164a766 258#include "AliShuttleStatus.h"
2bb7b766 259#include "AliShuttleLogbookEntry.h"
73abe331 260
57f50b3c 261#include <TSystem.h>
58bc3020 262#include <TObject.h>
b948db8d 263#include <TString.h>
57f50b3c 264#include <TTimeStamp.h>
73abe331 265#include <TObjString.h>
57f50b3c 266#include <TSQLServer.h>
267#include <TSQLResult.h>
268#include <TSQLRow.h>
cb343cfd 269#include <TMutex.h>
9827400b 270#include <TSystemDirectory.h>
271#include <TSystemFile.h>
a986b218 272#include <TFile.h>
9827400b 273#include <TFileMerger.h>
274#include <TGrid.h>
275#include <TGridResult.h>
73abe331 276
e7f62f16 277#include <TMonaLisaWriter.h>
278
5164a766 279#include <fstream>
280
cb343cfd 281#include <sys/types.h>
282#include <sys/wait.h>
283
73abe331 284ClassImp(AliShuttle)
285
b948db8d 286//______________________________________________________________________________________________
287AliShuttle::AliShuttle(const AliShuttleConfig* config,
288 UInt_t timeout, Int_t retries):
4f0ab988 289fConfig(config),
290fTimeout(timeout), fRetries(retries),
291fPreprocessorMap(),
2bb7b766 292fLogbookEntry(0),
eba76848 293fCurrentDetector(),
85a80aa9 294fStatusEntry(0),
cb343cfd 295fMonitoringMutex(0),
eba76848 296fLastActionTime(0),
e7f62f16 297fLastAction(),
9827400b 298fMonaLisa(0),
299fTestMode(kNone),
ffa29e93 300fReadTestMode(kFALSE),
301fOutputRedirected(kFALSE)
73abe331 302{
303 //
304 // config: AliShuttleConfig used
73abe331 305 // timeout: timeout used for AliDCSClient connection
306 // retries: the number of retries in case of connection error.
307 //
308
57f50b3c 309 if (!fConfig->IsValid()) AliFatal("********** !!!!! Invalid configuration !!!!! **********");
be48e3ea 310 for(int iSys=0;iSys<4;iSys++) {
57f50b3c 311 fServer[iSys]=0;
be48e3ea 312 if (iSys < 3)
2c15234c 313 fFXSlist[iSys].SetOwner(kTRUE);
57f50b3c 314 }
2bb7b766 315 fPreprocessorMap.SetOwner(kTRUE);
be48e3ea 316
317 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
318 fFirstUnprocessed[iDet] = kFALSE;
319
cb343cfd 320 fMonitoringMutex = new TMutex();
58bc3020 321}
322
b948db8d 323//______________________________________________________________________________________________
57f50b3c 324AliShuttle::~AliShuttle()
58bc3020 325{
9827400b 326 //
327 // destructor
328 //
58bc3020 329
b948db8d 330 fPreprocessorMap.DeleteAll();
be48e3ea 331 for(int iSys=0;iSys<4;iSys++)
57f50b3c 332 if(fServer[iSys]) {
333 fServer[iSys]->Close();
334 delete fServer[iSys];
eba76848 335 fServer[iSys] = 0;
57f50b3c 336 }
2bb7b766 337
338 if (fStatusEntry){
339 delete fStatusEntry;
340 fStatusEntry = 0;
341 }
cb343cfd 342
343 if (fMonitoringMutex)
344 {
345 delete fMonitoringMutex;
346 fMonitoringMutex = 0;
347 }
73abe331 348}
349
b948db8d 350//______________________________________________________________________________________________
57f50b3c 351void AliShuttle::RegisterPreprocessor(AliPreprocessor* preprocessor)
58bc3020 352{
73abe331 353 //
b948db8d 354 // Registers new AliPreprocessor.
73abe331 355 // It uses GetName() as the identifier of the preprocessor.
356 // The preprocessor is registered only if there isn't another one
357 // with the same identifier (GetName()).
358 //
359
eba76848 360 const char* detName = preprocessor->GetName();
361 if(GetDetPos(detName) < 0)
362 AliFatal(Form("********** !!!!! Invalid detector name: %s !!!!! **********", detName));
363
364 if (fPreprocessorMap.GetValue(detName)) {
365 AliWarning(Form("AliPreprocessor %s is already registered!", detName));
73abe331 366 return;
367 }
368
eba76848 369 fPreprocessorMap.Add(new TObjString(detName), preprocessor);
73abe331 370}
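
The registration rule above (one preprocessor per detector name, with duplicates rejected) can be sketched in standard C++ without ROOT. `Preprocessor` and `PreprocessorRegistry` are hypothetical stand-ins, not AliRoot classes, and unlike the original the sketch takes ownership of the argument and reports the outcome through its return value.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for AliPreprocessor: only the name matters here.
struct Preprocessor {
    std::string name;
    explicit Preprocessor(std::string n) : name(std::move(n)) {}
};

// Hypothetical registry keyed by detector name. It owns its entries,
// similar in spirit to a map on which SetOwner(kTRUE) was called.
class PreprocessorRegistry {
public:
    // Returns true if the preprocessor was registered, false if one with
    // the same name was already present (the new one is then discarded).
    bool Register(std::unique_ptr<Preprocessor> p) {
        const std::string key = p->name;
        return fMap.emplace(key, std::move(p)).second;
    }
    std::size_t Size() const { return fMap.size(); }

private:
    std::map<std::string, std::unique_ptr<Preprocessor>> fMap;
};
```

A caller would register one object per detector and treat a `false` return as the equivalent of the "already registered" warning.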
b948db8d 371//______________________________________________________________________________________________
3301427a 372Bool_t AliShuttle::Store(const AliCDBPath& path, TObject* object,
84090f85 373 AliCDBMetaData* metaData, Int_t validityStart, Bool_t validityInfinite)
73abe331 374{
9827400b 375 // Stores a CDB object in the storage for offline reconstruction. Objects that are not needed for
376 // offline reconstruction, but should be stored anyway (e.g. for debugging) should NOT be stored
377 // using this function. Use StoreReferenceData instead!
378 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
379 // finishes the data are transferred to the main storage (Grid).
b948db8d 380
3301427a 381 return StoreLocally(fgkLocalCDB, path, object, metaData, validityStart, validityInfinite);
84090f85 382}
383
384//______________________________________________________________________________________________
3301427a 385Bool_t AliShuttle::StoreReferenceData(const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData)
84090f85 386{
9827400b 387 // Stores a CDB object in the storage for reference data. These objects will not be available during
388 // offline reconstruction. Use this function for reference data only!
389 // It calls StoreLocally function which temporarily stores the data locally; when the preprocessor
390 // finishes the data are transferred to the main storage (Grid).
85a80aa9 391
3301427a 392 return StoreLocally(fgkLocalRefStorage, path, object, metaData);
85a80aa9 393}
394
395//______________________________________________________________________________________________
3301427a 396Bool_t AliShuttle::StoreLocally(const TString& localUri,
85a80aa9 397 const AliCDBPath& path, TObject* object, AliCDBMetaData* metaData,
398 Int_t validityStart, Bool_t validityInfinite)
399{
9827400b 400 // Stores the object temporarily in local storage. Parameters are passed by the Store and StoreReferenceData functions.
401 // When the preprocessor finishes, the data are transferred to the main storage (Grid).
402 // The parameters are:
403 // 1) Uri of the backup storage (Local)
404 // 2) the object's path.
405 // 3) the object to be stored
406 // 4) the metaData to be associated with the object
407 // 5) the validity start run number w.r.t. the current run,
408 // if the data is valid only for this run leave the default 0
409 // 6) specifies if the calibration data is valid for infinity (this means until updated),
410 // typical for calibration runs, the default is kFALSE
411 //
412 // returns kFALSE on failure, kTRUE otherwise
84090f85 413
9827400b 414 if (fTestMode & kErrorStorage)
415 {
416 Log(fCurrentDetector, "StoreLocally - In TESTMODE - Simulating error while storing locally");
417 return kFALSE;
418 }
419
3301427a 420 const char* cdbType = (localUri == fgkLocalCDB) ? "CDB" : "Reference";
2bb7b766 421
85a80aa9 422 Int_t firstRun = GetCurrentRun() - validityStart;
84090f85 423 if(firstRun < 0) {
9827400b 424 AliWarning("First valid run happens to be less than 0! Setting it to 0.");
84090f85 425 firstRun=0;
426 }
427
428 Int_t lastRun = -1;
429 if(validityInfinite) {
430 lastRun = AliCDBRunRange::Infinity();
431 } else {
432 lastRun = GetCurrentRun();
433 }
434
3301427a 435 // Version is set to current run, it will be used later to transfer data to Grid
436 AliCDBId id(path, firstRun, lastRun, GetCurrentRun(), -1);
2bb7b766 437
438 if(! dynamic_cast<TObjString*> (metaData->GetProperty("RunUsed(TObjString)"))){
439 TObjString runUsed = Form("%d", GetCurrentRun());
9e080f92 440 metaData->SetProperty("RunUsed(TObjString)", runUsed.Clone());
2bb7b766 441 }
84090f85 442
3301427a 443 Bool_t result = kFALSE;
84090f85 444
3301427a 445 if (!(AliCDBManager::Instance()->GetStorage(localUri))) {
446 Log("SHUTTLE", Form("StoreLocally - Cannot activate local %s storage", cdbType));
84090f85 447 } else {
3301427a 448 result = AliCDBManager::Instance()->GetStorage(localUri)
84090f85 449 ->Put(object, id, metaData);
450 }
451
452 if(!result) {
453
9827400b 454 Log(fCurrentDetector, Form("StoreLocally - Can't store object <%s>!", id.ToString().Data()));
3301427a 455 }
2bb7b766 456
3301427a 457 return result;
458}
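
The run-validity computation in StoreLocally can be isolated as a small pure function. This is a minimal sketch in standard C++: `ValidityWindow` is a hypothetical helper, and `kInfinityRun` is only a stand-in for `AliCDBRunRange::Infinity()`, whose actual value is an assumption here.

```cpp
#include <cassert>
#include <utility>

// Stand-in for AliCDBRunRange::Infinity(); the real value is an assumption.
const int kInfinityRun = 999999999;

// Returns {firstRun, lastRun} for an object produced in currentRun.
// validityStart is counted backwards from the current run.
std::pair<int, int> ValidityWindow(int currentRun, int validityStart, bool validityInfinite) {
    int firstRun = currentRun - validityStart;
    if (firstRun < 0) firstRun = 0;  // clamp at 0, as StoreLocally does
    const int lastRun = validityInfinite ? kInfinityRun : currentRun;
    return std::make_pair(firstRun, lastRun);
}
```

With the defaults (`validityStart = 0`, `validityInfinite = false`) the window collapses to the current run only.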
84090f85 459
3301427a 460//______________________________________________________________________________________________
461Bool_t AliShuttle::StoreOCDB()
462{
9827400b 463 //
464 // Called when preprocessor ends successfully or when previous storage attempt failed (kStoreError status)
465 // Calls underlying StoreOCDB(const TString&) function twice, for OCDB and Reference storage.
466 // Then calls StoreRefFilesToGrid to store reference files.
467 //
468
469 if (fTestMode & kErrorGrid)
470 {
471 Log("SHUTTLE", "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
472 Log(fCurrentDetector, "StoreOCDB - In TESTMODE - Simulating error while storing in the Grid");
473 return kFALSE;
474 }
475
2d9019b4 476 Log("SHUTTLE","Storing OCDB data ...");
86aa42c3 477 Bool_t resultCDB = StoreOCDB(fgkMainCDB);
478
2d9019b4 479 Log("SHUTTLE","Storing reference data ...");
3301427a 480 Bool_t resultRef = StoreOCDB(fgkMainRefStorage);
9827400b 481
2d9019b4 482 Log("SHUTTLE","Storing reference files ...");
9827400b 483 Bool_t resultRefFiles = StoreRefFilesToGrid();
484
485 return resultCDB && resultRef && resultRefFiles;
3301427a 486}
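
StoreOCDB() runs all three storage steps before combining their results, so a failure in the first step does not prevent the later ones from being attempted. A minimal sketch of that pattern in standard C++ follows; `RunAllSteps` is a hypothetical helper, not part of AliShuttle.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Runs every step even if an earlier one failed, then reports overall success.
// Writing step1() && step2() && step3() directly would stop at the first failure.
bool RunAllSteps(const std::vector<std::function<bool()>>& steps) {
    bool ok = true;
    for (const auto& step : steps) {
        const bool stepOk = step();  // evaluated unconditionally
        ok = ok && stepOk;
    }
    return ok;
}
```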
487
488//______________________________________________________________________________________________
489Bool_t AliShuttle::StoreOCDB(const TString& gridURI)
490{
491 //
492 // Called by StoreOCDB(), performs actual storage to the main OCDB and reference storages (Grid)
493 //
494
495 TObjArray* gridIds=0;
496
497 Bool_t result = kTRUE;
498
499 const char* type = 0;
500 TString localURI;
501 if(gridURI == fgkMainCDB) {
502 type = "OCDB";
503 localURI = fgkLocalCDB;
504 } else if(gridURI == fgkMainRefStorage) {
505 type = "reference";
506 localURI = fgkLocalRefStorage;
507 } else {
508 AliError(Form("Invalid storage URI: %s", gridURI.Data()));
509 return kFALSE;
510 }
511
512 AliCDBManager* man = AliCDBManager::Instance();
513
514 AliCDBStorage *gridSto = man->GetStorage(gridURI);
515 if(!gridSto) {
516 Log("SHUTTLE",
517 Form("StoreOCDB - cannot activate main %s storage", type));
518 return kFALSE;
519 }
520
521 gridIds = gridSto->GetQueryCDBList();
522
523 // get objects previously stored in local CDB
524 AliCDBStorage *localSto = man->GetStorage(localURI);
525 if(!localSto) {
526 Log("SHUTTLE",
527 Form("StoreOCDB - cannot activate local %s storage", type));
528 return kFALSE;
529 }
530 AliCDBPath aPath(GetOfflineDetName(fCurrentDetector.Data()),"*","*");
531 // Local objects were stored with current run as Grid version!
532 TList* localEntries = localSto->GetAll(aPath.GetPath(), GetCurrentRun(), GetCurrentRun());
533 localEntries->SetOwner(1);
534
535 // loop on local stored objects
536 TIter localIter(localEntries);
537 AliCDBEntry *aLocEntry = 0;
538 while((aLocEntry = dynamic_cast<AliCDBEntry*> (localIter.Next()))){
539 aLocEntry->SetOwner(1);
540 AliCDBId aLocId = aLocEntry->GetId();
541 aLocEntry->SetVersion(-1);
542 aLocEntry->SetSubVersion(-1);
543
544 // If local object is valid up to infinity we store it only if it is
545 // the first unprocessed run!
546 if (aLocId.GetLastRun() == AliCDBRunRange::Infinity() &&
547 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
548 {
549 Log("SHUTTLE", Form("StoreOCDB - %s: object %s has validity infinite but "
550 "there are previous unprocessed runs!",
551 fCurrentDetector.Data(), aLocId.GetPath().Data()));
552 continue;
553 }
554
555 // loop on Grid valid Id's
556 Bool_t store = kTRUE;
557 TIter gridIter(gridIds);
558 AliCDBId* aGridId = 0;
559 while((aGridId = dynamic_cast<AliCDBId*> (gridIter.Next()))){
560 if(aGridId->GetPath() != aLocId.GetPath()) continue;
561 // skip all objects valid up to infinity
562 if(aGridId->GetLastRun() == AliCDBRunRange::Infinity()) continue;
563 // if we get here, it means there's already some more recent object stored on Grid!
564 store = kFALSE;
565 break;
566 }
567
568 // Upload only if no more recent object was found on the Grid
569 Bool_t storeOk = store ? gridSto->Put(aLocEntry) : kFALSE;
570 if(!store || storeOk){
571
572 if (!store)
573 {
574 Log(fCurrentDetector.Data(),
575 Form("StoreOCDB - A more recent object already exists in %s storage: <%s>",
576 type, aGridId->ToString().Data()));
577 } else {
578 Log("SHUTTLE",
579 Form("StoreOCDB - Object <%s> successfully put into %s storage",
580 aLocId.ToString().Data(), type));
2d9019b4 581 Log(fCurrentDetector.Data(),
582 Form("StoreOCDB - Object <%s> successfully put into %s storage",
583 aLocId.ToString().Data(), type));
3301427a 584 }
84090f85 585
3301427a 586 // removing local filename...
587 TString filename;
588 localSto->IdToFilename(aLocId, filename);
589 AliInfo(Form("Removing local file %s", filename.Data()));
590 RemoveFile(filename.Data());
591 continue;
592 } else {
593 Log("SHUTTLE",
594 Form("StoreOCDB - Grid %s storage of object <%s> failed",
595 type, aLocId.ToString().Data()));
2d9019b4 596 Log(fCurrentDetector.Data(),
597 Form("StoreOCDB - Grid %s storage of object <%s> failed",
598 type, aLocId.ToString().Data()));
3301427a 599 result = kFALSE;
b948db8d 600 }
601 }
3301427a 602 localEntries->Clear();
2bb7b766 603
b948db8d 604 return result;
3301427a 605}
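
The decision whether a locally stored object should be uploaded can be read as a predicate over the ids reported by the Grid storage. The sketch below restates the two loops above in standard C++. `EntryId` is a hypothetical, reduced version of `AliCDBId`, and `kInfinityRun` is a stand-in for `AliCDBRunRange::Infinity()`.

```cpp
#include <cassert>
#include <string>
#include <vector>

const int kInfinityRun = 999999999;  // stand-in for AliCDBRunRange::Infinity()

// Hypothetical, reduced version of an AliCDBId: path plus last valid run.
struct EntryId {
    std::string path;
    int lastRun;
};

// Returns true if the local entry should be uploaded.
bool ShouldUpload(const EntryId& local, const std::vector<EntryId>& gridIds, bool firstUnprocessed) {
    // Infinite-validity objects are only accepted from the first unprocessed run.
    if (local.lastRun == kInfinityRun && !firstUnprocessed) return false;
    for (const EntryId& grid : gridIds) {
        if (grid.path != local.path) continue;       // different object
        if (grid.lastRun == kInfinityRun) continue;  // open-ended entries do not block
        return false;  // a finite-validity entry for this path is already on the Grid
    }
    return true;
}
```

Keeping the rule in one function makes it easy to check the edge cases separately from the actual transfer.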
606
546242fb 607//______________________________________________________________________________________________
608Bool_t AliShuttle::CleanReferenceStorage(const char* detector)
609{
2d9019b4 610 // clears the directory used to store reference files of a given subdetector
546242fb 611
612 AliCDBManager* man = AliCDBManager::Instance();
613 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
2d9019b4 614 TString localBaseFolder = sto->GetBaseFolder();
615
616 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
617
618 Log("SHUTTLE", Form("Cleaning %s", targetDir.Data()));
619
620 TString begin;
621 begin.Form("%d_", GetCurrentRun());
622
623 TSystemDirectory* baseDir = new TSystemDirectory("/", targetDir);
624 if (!baseDir)
625 return kTRUE;
626
627 TList* dirList = baseDir->GetListOfFiles();
628 delete baseDir;
629
630 if (!dirList) return kTRUE;
631
632 if (dirList->GetEntries() < 3)
633 {
634 delete dirList;
635 return kTRUE;
636 }
637
638 Int_t nDirs = 0, nDel = 0;
639 TIter dirIter(dirList);
640 TSystemFile* entry = 0;
546242fb 641
2d9019b4 642 Bool_t success = kTRUE;
546242fb 643
2d9019b4 644 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
645 {
646 if (entry->IsDirectory())
647 continue;
648
649 TString fileName(entry->GetName());
650 if (!fileName.BeginsWith(begin))
651 continue;
652
653 nDirs++;
654
655 // delete file
656 Int_t result = gSystem->Unlink(fileName.Data());
657
658 if (result)
659 {
660 Log("SHUTTLE", Form("Could not delete file %s!", fileName.Data()));
661 success = kFALSE;
662 } else {
663 nDel++;
664 }
665 }
666
667 if(nDirs > 0)
668 Log("SHUTTLE", Form("CleanReferenceStorage - %d (over %d) reference files in folder %s were deleted.",
669 nDel, nDirs, targetDir.Data()));
670
671
672 delete dirList;
673 return success;
674
675
676
677
678
546242fb 679
680 Int_t result = gSystem->GetPathInfo(targetDir, 0, (Long64_t*) 0, 0, 0);
681 if (result == 0)
682 {
683 // delete directory
684 result = gSystem->Exec(Form("rm -r %s", targetDir.Data()));
685 if (result != 0)
686 {
687 Log("SHUTTLE", Form("StoreReferenceFile - Could not clear directory %s", targetDir.Data()));
688 return kFALSE;
689 }
690 }
691
692 result = gSystem->mkdir(targetDir, kTRUE);
693 if (result != 0)
694 {
695 Log("SHUTTLE", Form("StoreReferenceFile - Error creating base directory %s", targetDir.Data()));
696 return kFALSE;
697 }
698
699 return kTRUE;
700}
701
9827400b 702//______________________________________________________________________________________________
703Bool_t AliShuttle::StoreReferenceFile(const char* detector, const char* localFile, const char* gridFileName)
704{
705 //
3c2a21c8 706 // Stores reference file directly (without opening it). This function stores the file locally.
9827400b 707 //
3c2a21c8 708 // The file is stored under the following location:
709 // <base folder of local reference storage>/<DET>/<RUN#>_<gridFileName>
710 // where <gridFileName> is the third parameter given to the function
711 //
9827400b 712
713 if (fTestMode & kErrorStorage)
714 {
715 Log(fCurrentDetector, "StoreReferenceFile - In TESTMODE - Simulating error while storing locally");
716 return kFALSE;
717 }
718
719 AliCDBManager* man = AliCDBManager::Instance();
720 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
721
722 TString localBaseFolder = sto->GetBaseFolder();
723
2d9019b4 724 TString targetDir = GetRefFilePrefix(localBaseFolder.Data(), detector);
9827400b 725
2d9019b4 726 // try to open the folder; create it if it does not exist
727 void* dir = gSystem->OpenDirectory(targetDir.Data());
728 if (dir == NULL) {
729 if (gSystem->mkdir(targetDir.Data(), kTRUE)) {
730 Log("SHUTTLE", Form("Can't open directory <%s>", targetDir.Data()));
731 return kFALSE;
732 }
733
734 } else {
735 gSystem->FreeDirectory(dir);
736 }
737
9827400b 738 TString target;
739 target.Form("%s/%d_%s", targetDir.Data(), GetCurrentRun(), gridFileName);
740
546242fb 741 Int_t result = gSystem->GetPathInfo(localFile, 0, (Long64_t*) 0, 0, 0);
9827400b 742 if (result)
743 {
546242fb 744 Log("SHUTTLE", Form("StoreReferenceFile - %s does not exist", localFile));
745 return kFALSE;
9827400b 746 }
546242fb 747
9827400b 748 result = gSystem->CopyFile(localFile, target);
749
750 if (result == 0)
751 {
2d9019b4 752 Log("SHUTTLE", Form("StoreReferenceFile - File %s stored locally to %s", localFile, target.Data()));
9827400b 753 return kTRUE;
754 }
755 else
756 {
2d9019b4 757 Log("SHUTTLE", Form("StoreReferenceFile - Could not store file %s to %s!. Error code = %d",
546242fb 758 localFile, target.Data(), result));
9827400b 759 return kFALSE;
760 }
761}
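
The naming convention for locally stored reference files, `<target dir>/<RUN#>_<gridFileName>`, can be expressed as a one-line helper. This is a sketch in standard C++. `RefFileTarget` is a hypothetical name, and the paths in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Builds "<targetDir>/<run>_<gridFileName>", mirroring the Form() call above.
std::string RefFileTarget(const std::string& targetDir, int run, const std::string& gridFileName) {
    return targetDir + "/" + std::to_string(run) + "_" + gridFileName;
}
```

For example, `RefFileTarget("/local/ref/TPC", 100, "pedestals.root")` yields `/local/ref/TPC/100_pedestals.root`.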
762
763//______________________________________________________________________________________________
764Bool_t AliShuttle::StoreRefFilesToGrid()
765{
766 //
767 // Transfers the reference file to the Grid.
9827400b 768 //
86aa42c3 769 // The files are stored under the following location:
3c2a21c8 770 // <base folder of reference storage>/<DET>/<RUN#>_<gridFileName>
86aa42c3 771 //
9827400b 772
773 AliCDBManager* man = AliCDBManager::Instance();
774 AliCDBStorage* sto = man->GetStorage(fgkLocalRefStorage);
775 if (!sto)
776 return kFALSE;
777 TString localBaseFolder = sto->GetBaseFolder();
778
2d9019b4 779 TString dir = GetRefFilePrefix(localBaseFolder.Data(), fCurrentDetector.Data());
780
9827400b 781 AliCDBStorage* gridSto = man->GetStorage(fgkMainRefStorage);
782 if (!gridSto)
783 return kFALSE;
2d9019b4 784
9827400b 785 TString gridBaseFolder = gridSto->GetBaseFolder();
2d9019b4 786
787 TString alienDir = GetRefFilePrefix(gridBaseFolder.Data(), fCurrentDetector.Data());
9827400b 788
9827400b 789 TString begin;
790 begin.Form("%d_", GetCurrentRun());
791
792 TSystemDirectory* baseDir = new TSystemDirectory("/", dir);
3d8bc902 793 if (!baseDir)
794 return kTRUE;
795
2d9019b4 796 TList* dirList = baseDir->GetListOfFiles();
797 delete baseDir;
798
799 if (!dirList) return kTRUE;
800
801 if (dirList->GetEntries() < 3)
3d8bc902 802 {
2d9019b4 803 delete dirList;
9827400b 804 return kTRUE;
3d8bc902 805 }
2d9019b4 806
546242fb 807 if (!gGrid)
808 {
809 Log("SHUTTLE", "Connection to Grid failed: Cannot continue!");
2d9019b4 810 delete dirList;
546242fb 811 return kFALSE;
812 }
813
2d9019b4 814 Int_t nDirs = 0, nTransfer = 0;
815 TIter dirIter(dirList);
816 TSystemFile* entry = 0;
817
9827400b 818 Bool_t success = kTRUE;
3d8bc902 819 Bool_t first = kTRUE;
9827400b 820
2d9019b4 821 while ((entry = dynamic_cast<TSystemFile*> (dirIter.Next())))
822 {
9827400b 823 if (entry->IsDirectory())
824 continue;
825
826 TString fileName(entry->GetName());
827 if (!fileName.BeginsWith(begin))
828 continue;
829
2d9019b4 830 nDirs++;
831
3d8bc902 832 if (first)
833 {
834 first = kFALSE;
835 // check that DET folder exists, otherwise create it
836 TGridResult* result = gGrid->Ls(alienDir.Data(), "a");
837
838 if (!result)
2d9019b4 839 {
840 delete dirList;
3d8bc902 841 return kFALSE;
2d9019b4 842 }
3d8bc902 843
546242fb 844 if (!result->GetFileName(1)) // TODO: It looks like element 0 is always 0!!
3d8bc902 845 {
846 if (!gGrid->Mkdir(alienDir.Data(),"",0))
847 {
848 Log("SHUTTLE", Form("StoreRefFilesToGrid - Cannot create directory %s",
849 alienDir.Data()));
2d9019b4 850 delete dirList;
3d8bc902 851 return kFALSE;
546242fb 852 } else {
853 Log("SHUTTLE",Form("Folder %s created", alienDir.Data()));
3d8bc902 854 }
855
546242fb 856 } else {
857 Log("SHUTTLE",Form("Folder %s found", alienDir.Data()));
3d8bc902 858 }
859 }
860
9827400b 861 TString fullLocalPath;
862 fullLocalPath.Form("%s/%s", dir.Data(), fileName.Data());
863
864 TString fullGridPath;
865 fullGridPath.Form("alien://%s/%s", alienDir.Data(), fileName.Data());
866
9827400b 867 TFileMerger fileMerger;
a986b218 868 Bool_t result = TFile::Cp(fullLocalPath, fullGridPath);
9827400b 869
870 if (result)
871 {
2d9019b4 872 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s succeeded!", fullLocalPath.Data(), fullGridPath.Data()));
9827400b 873 RemoveFile(fullLocalPath);
2d9019b4 874 nTransfer++;
9827400b 875 }
876 else
877 {
2d9019b4 878 Log("SHUTTLE", Form("StoreRefFilesToGrid - Copying local file %s to %s FAILED!", fullLocalPath.Data(), fullGridPath.Data()));
9827400b 879 success = kFALSE;
880 }
881 }
2d9019b4 882
883 Log("SHUTTLE", Form("StoreRefFilesToGrid - %d (over %d) reference files in folder %s copied to Grid.", nTransfer, nDirs, dir.Data()));
884
885
886 delete dirList;
9827400b 887 return success;
888}
889
2d9019b4 890//______________________________________________________________________________________________
891const char* AliShuttle::GetRefFilePrefix(const char* base, const char* detector)
892{
893 //
894 // Get folder name of reference files
895 //
896
897 TString offDetStr(GetOfflineDetName(detector));
898 static TString dir; // static: the pointer returned below must stay valid after this function returns
899 if (offDetStr == "ITS" || offDetStr == "MUON" || offDetStr == "PHOS")
900 {
901 dir.Form("%s/%s/%s", base, offDetStr.Data(), detector);
902 } else {
903 dir.Form("%s/%s", base, offDetStr.Data());
904 }
905
906 return dir.Data();
907
908
909}
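// Illustrative sketch, not part of AliShuttle: GetRefFilePrefix hands back a
// const char*, so the caller depends on the lifetime of the buffer behind it.
// Returning the path by value avoids that dependency. The name
// RefFilePrefixSketch and the use of std::string are assumptions made for
// this example; the folder rule (extra sub-folder for ITS, MUON, PHOS) is
// taken from the function above.

```cpp
#include <string>

// Hypothetical stand-alone variant: builds the reference-file folder name
// and returns it by value.
std::string RefFilePrefixSketch(const std::string& base,
                                const std::string& offlineDet,
                                const std::string& detector)
{
    // ITS, MUON and PHOS get an extra sub-folder named after the online detector
    if (offlineDet == "ITS" || offlineDet == "MUON" || offlineDet == "PHOS")
        return base + "/" + offlineDet + "/" + detector;

    // all other detectors use <base>/<offline detector name>
    return base + "/" + offlineDet;
}
```

// Usage: std::string dir = RefFilePrefixSketch("/alice/ref", "ITS", "SPD");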
3301427a 910//______________________________________________________________________________________________
911void AliShuttle::CleanLocalStorage(const TString& uri)
912{
9827400b 913 //
914 // Called when the preprocessor is declared failed. Removes the remaining objects from the local storage.
915 //
3301427a 916
917 const char* type = 0;
918 if(uri == fgkLocalCDB) {
919 type = "OCDB";
920 } else if(uri == fgkLocalRefStorage) {
546242fb 921 type = "Reference";
3301427a 922 } else {
923 AliError(Form("Invalid storage URI: %s", uri.Data()));
924 return;
925 }
926
927 AliCDBManager* man = AliCDBManager::Instance();
b948db8d 928
3301427a 929 // open local storage
930 AliCDBStorage *localSto = man->GetStorage(uri);
931 if(!localSto) {
932 Log("SHUTTLE",
933 Form("CleanLocalStorage - cannot activate local %s storage", type));
934 return;
935 }
936
937 TString filename(Form("%s/%s/*/Run*_v%d_s*.root",
546242fb 938 localSto->GetBaseFolder().Data(), GetOfflineDetName(fCurrentDetector.Data()), GetCurrentRun()));
3301427a 939
940 AliInfo(Form("filename = %s", filename.Data()));
941
942 AliInfo(Form("Removing remaining local files from run %d and detector %s ...",
943 GetCurrentRun(), fCurrentDetector.Data()));
944
945 RemoveFile(filename.Data());
946
947}
948
949//______________________________________________________________________________________________
950void AliShuttle::RemoveFile(const char* filename)
951{
9827400b 952 //
953 // removes local file
954 //
3301427a 955
956 TString command(Form("rm -f %s", filename));
957
958 Int_t result = gSystem->Exec(command.Data());
959 if(result != 0)
960 {
961 Log("SHUTTLE", Form("RemoveFile - %s: Cannot remove file %s!",
962 fCurrentDetector.Data(), filename));
963 }
73abe331 964}
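// Illustrative sketch, not part of AliShuttle: RemoveFile shells out to
// "rm -f", which also expands the wildcard pattern built in CleanLocalStorage.
// For a single, fully specified path the same effect can be had without a
// shell. RemoveSingleFileSketch is a hypothetical name; it assumes a POSIX
// system where std::remove sets errno on failure.

```cpp
#include <cerrno>
#include <cstdio>
#include <string>

// Hypothetical helper: remove one file without spawning a shell.
// Like "rm -f", a file that does not exist counts as success.
// Unlike the shell command, wildcards in `path` are NOT expanded.
bool RemoveSingleFileSketch(const std::string& path)
{
    if (std::remove(path.c_str()) == 0)
        return true;

    // tolerate a missing file, report every other error
    return errno == ENOENT;
}
```

// Design note: paths containing '*' would still need the shell (or an
// explicit directory scan), so this is not a drop-in replacement for RemoveFile.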
965
b948db8d 966//______________________________________________________________________________________________
5164a766 967AliShuttleStatus* AliShuttle::ReadShuttleStatus()
968{
9827400b 969 //
970 // Reads the AliShuttleStatus from the CDB
971 //
5164a766 972
2bb7b766 973 if (fStatusEntry){
974 delete fStatusEntry;
975 fStatusEntry = 0;
976 }
5164a766 977
10a5a932 978 fStatusEntry = AliCDBManager::Instance()->GetStorage(GetLocalCDB())
2bb7b766 979 ->Get(Form("/SHUTTLE/STATUS/%s", fCurrentDetector.Data()), GetCurrentRun());
5164a766 980
2bb7b766 981 if (!fStatusEntry) return 0;
982 fStatusEntry->SetOwner(1);
5164a766 983
2bb7b766 984 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
985 if (!status) {
986 AliError("Invalid object stored to CDB!");
987 return 0;
988 }
5164a766 989
2bb7b766 990 return status;
5164a766 991}
992
993//______________________________________________________________________________________________
7bfb2090 994Bool_t AliShuttle::WriteShuttleStatus(AliShuttleStatus* status)
5164a766 995{
9827400b 996 //
997 // writes the status for one subdetector
998 //
2bb7b766 999
1000 if (fStatusEntry){
1001 delete fStatusEntry;
1002 fStatusEntry = 0;
1003 }
5164a766 1004
2bb7b766 1005 Int_t run = GetCurrentRun();
5164a766 1006
2bb7b766 1007 AliCDBId id(AliCDBPath("SHUTTLE", "STATUS", fCurrentDetector), run, run);
5164a766 1008
2bb7b766 1009 fStatusEntry = new AliCDBEntry(status, id, new AliCDBMetaData);
1010 fStatusEntry->SetOwner(1);
5164a766 1011
2bb7b766 1012 UInt_t result = AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
7bfb2090 1013
2bb7b766 1014 if (!result) {
3301427a 1015 Log("SHUTTLE", Form("WriteShuttleStatus - Failed for %s, run %d",
1016 fCurrentDetector.Data(), run));
2bb7b766 1017 return kFALSE;
1018 }
e7f62f16 1019
1020 SendMLInfo();
7bfb2090 1021
2bb7b766 1022 return kTRUE;
5164a766 1023}
1024
1025//______________________________________________________________________________________________
1026void AliShuttle::UpdateShuttleStatus(AliShuttleStatus::Status newStatus, Bool_t increaseCount)
1027{
9827400b 1028 //
1029 // changes the AliShuttleStatus for the given detector and run to the given status
1030 //
5164a766 1031
2bb7b766 1032 if (!fStatusEntry){
1033 AliError("UNEXPECTED: fStatusEntry empty");
1034 return;
1035 }
5164a766 1036
2bb7b766 1037 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
5164a766 1038
2bb7b766 1039 if (!status){
3301427a 1040 Log("SHUTTLE", "UNEXPECTED: status could not be read from current CDB entry");
2bb7b766 1041 return;
1042 }
5164a766 1043
2c15234c 1044 TString actionStr = Form("UpdateShuttleStatus - %s: Changing state from %s to %s",
eba76848 1045 fCurrentDetector.Data(),
36c99a6a 1046 status->GetStatusName(),
eba76848 1047 status->GetStatusName(newStatus));
cb343cfd 1048 Log("SHUTTLE", actionStr);
1049 SetLastAction(actionStr);
5164a766 1050
2bb7b766 1051 status->SetStatus(newStatus);
1052 if (increaseCount) status->IncreaseCount();
5164a766 1053
2bb7b766 1054 AliCDBManager::Instance()->GetStorage(fgkLocalCDB)->Put(fStatusEntry);
e7f62f16 1055
1056 SendMLInfo();
5164a766 1057}
e7f62f16 1058
1059//______________________________________________________________________________________________
1060void AliShuttle::SendMLInfo()
1061{
1062 //
1063 // sends ML information about the current status of the current detector being processed
1064 //
1065
1066 AliShuttleStatus* status = dynamic_cast<AliShuttleStatus*> (fStatusEntry->GetObject());
1067
1068 if (!status){
3301427a 1069 Log("SHUTTLE", "SendMLInfo - UNEXPECTED: status could not be read from current CDB entry");
e7f62f16 1070 return;
1071 }
1072
1073 TMonaLisaText mlStatus(Form("%s_status", fCurrentDetector.Data()), status->GetStatusName());
1074 TMonaLisaValue mlRetryCount(Form("%s_count", fCurrentDetector.Data()), status->GetCount());
1075
1076 TList mlList;
1077 mlList.Add(&mlStatus);
1078 mlList.Add(&mlRetryCount);
1079
1080 fMonaLisa->SendParameters(&mlList);
1081}
1082
5164a766 1083//______________________________________________________________________________________________
1084Bool_t AliShuttle::ContinueProcessing()
1085{
9827400b 1086 // this function reads the AliShuttleStatus information from CDB and
1087 // checks if the processing should be continued
1088 // if yes it returns kTRUE and updates the AliShuttleStatus with nextStatus
2bb7b766 1089
57c1a579 1090 if (!fConfig->HostProcessDetector(fCurrentDetector)) return kFALSE;
1091
1092 AliPreprocessor* aPreprocessor =
1093 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1094 if (!aPreprocessor)
1095 {
1096 AliInfo(Form("%s: no preprocessor registered", fCurrentDetector.Data()));
1097 return kFALSE;
1098 }
1099
2bb7b766 1100 AliShuttleLogbookEntry::Status entryStatus =
eba76848 1101 fLogbookEntry->GetDetectorStatus(fCurrentDetector);
2bb7b766 1102
1103 if(entryStatus != AliShuttleLogbookEntry::kUnprocessed) {
9e080f92 1104 AliInfo(Form("ContinueProcessing - %s is %s",
2bb7b766 1105 fCurrentDetector.Data(),
1106 fLogbookEntry->GetDetectorStatusName(entryStatus)));
1107 return kFALSE;
1108 }
1109
1110 // if we get here, according to Shuttle logbook subdetector is in UNPROCESSED state
be48e3ea 1111
1112 // check if current run is first unprocessed run for current detector
1113 if (fConfig->StrictRunOrder(fCurrentDetector) &&
1114 !fFirstUnprocessed[GetDetPos(fCurrentDetector)])
1115 {
86aa42c3 1116 if (fTestMode == kNone)
1117 {
1118 Log("SHUTTLE", Form("ContinueProcessing - %s requires strict run ordering but this is not the first unprocessed run!", fCurrentDetector.Data()));
1119 return kFALSE;
1120 }
1121 else
1122 {
1123 Log("SHUTTLE", Form("ContinueProcessing - In TESTMODE - Although %s requires strict run ordering and this is not the first unprocessed run, the SHUTTLE continues", fCurrentDetector.Data()));
1124 }
be48e3ea 1125 }
1126
2bb7b766 1127 AliShuttleStatus* status = ReadShuttleStatus();
1128 if (!status) {
1129 // first time
1130 Log("SHUTTLE", Form("ContinueProcessing - %s: Processing first time",
1131 fCurrentDetector.Data()));
1132 status = new AliShuttleStatus(AliShuttleStatus::kStarted);
1133 return WriteShuttleStatus(status);
1134 }
1135
1136 // The following two cases shouldn't happen if Shuttle Logbook was correctly updated.
1137 // If it happens it may mean Logbook updating failed... let's do it now!
1138 if (status->GetStatus() == AliShuttleStatus::kDone ||
1139 status->GetStatus() == AliShuttleStatus::kFailed){
1140 Log("SHUTTLE", Form("ContinueProcessing - %s is already %s. Updating Shuttle Logbook",
1141 fCurrentDetector.Data(),
1142 status->GetStatusName(status->GetStatus())));
1143 UpdateShuttleLogbook(fCurrentDetector.Data(),
1144 status->GetStatusName(status->GetStatus()));
1145 return kFALSE;
1146 }
1147
3301427a 1148 if (status->GetStatus() == AliShuttleStatus::kStoreError) {
2bb7b766 1149 Log("SHUTTLE",
1150 Form("ContinueProcessing - %s: Grid storage of one or more objects failed. Trying again now",
1151 fCurrentDetector.Data()));
9827400b 1152 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1153 if (StoreOCDB()){
3301427a 1154 Log("SHUTTLE", Form("ContinueProcessing - %s: all objects successfully stored into main storage",
1155 fCurrentDetector.Data()));
2bb7b766 1156 UpdateShuttleStatus(AliShuttleStatus::kDone);
1157 UpdateShuttleLogbook(fCurrentDetector.Data(), "DONE");
1158 } else {
1159 Log("SHUTTLE",
1160 Form("ContinueProcessing - %s: Grid storage failed again",
1161 fCurrentDetector.Data()));
9827400b 1162 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
2bb7b766 1163 }
1164 return kFALSE;
1165 }
1166
1167 // if we get here, there is a restart
57c1a579 1168 Bool_t cont = kFALSE;
2bb7b766 1169
1170 // abort conditions
cb343cfd 1171 if (status->GetCount() >= fConfig->GetMaxRetries()) {
57c1a579 1172 Log("SHUTTLE", Form("ContinueProcessing - %s failed %d times in status %s - "
1173 "Updating Shuttle Logbook", fCurrentDetector.Data(),
2bb7b766 1174 status->GetCount(), status->GetStatusName()));
1175 UpdateShuttleLogbook(fCurrentDetector.Data(), "FAILED");
e7f62f16 1176 UpdateShuttleStatus(AliShuttleStatus::kFailed);
3301427a 1177
1178 // there may still be objects in local OCDB and reference storage
1179 // and FXS databases may not have been updated: do it now!
9827400b 1180
1181 // TODO Currently disabled, we want to keep files in case of failure!
1182 // CleanLocalStorage(fgkLocalCDB);
1183 // CleanLocalStorage(fgkLocalRefStorage);
1184 // UpdateTableFailCase();
1185
1186 // Send mail to detector expert!
1187 AliInfo(Form("Sending mail to %s expert...", fCurrentDetector.Data()));
1188 if (!SendMail())
1189 Log("SHUTTLE", Form("ContinueProcessing - Could not send mail to %s expert",
1190 fCurrentDetector.Data()));
3301427a 1191
57c1a579 1192 } else {
1193 Log("SHUTTLE", Form("ContinueProcessing - %s: restarting. "
1194 "Aborted before with %s. Retry number %d.", fCurrentDetector.Data(),
1195 status->GetStatusName(), status->GetCount()));
9827400b 1196 Bool_t increaseCount = kTRUE;
1197 if (status->GetStatus() == AliShuttleStatus::kDCSError || status->GetStatus() == AliShuttleStatus::kDCSStarted)
1198 increaseCount = kFALSE;
1199 UpdateShuttleStatus(AliShuttleStatus::kStarted, increaseCount);
57c1a579 1200 cont = kTRUE;
2bb7b766 1201 }
1202
57c1a579 1203 return cont;
5164a766 1204}
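// Illustrative sketch, not part of AliShuttle: the restart rules applied by
// ContinueProcessing once a stored status exists, reduced to a pure function.
// The enum and struct names are assumptions made for this example; the order
// of the checks follows the function above.

```cpp
// Simplified stand-ins for AliShuttleStatus::Status (hypothetical subset)
enum class Status { Started, DCSStarted, DCSError, PPStarted, PPError,
                    StoreError, Done, Failed };

enum class Action { UpdateLogbookOnly, RetryStore, GiveUp, Restart };

struct Decision {
    Action action;
    bool   increaseCount;  // only meaningful for Action::Restart
};

Decision DecideRestartSketch(Status status, int count, int maxRetries)
{
    // already final: only the logbook needs to be brought up to date
    if (status == Status::Done || status == Status::Failed)
        return {Action::UpdateLogbookOnly, false};

    // preprocessor was fine, only the Grid storage has to be retried
    if (status == Status::StoreError)
        return {Action::RetryStore, false};

    // abort condition: too many retries
    if (count >= maxRetries)
        return {Action::GiveUp, false};

    // DCS problems do not count against the retry budget
    const bool dcsProblem =
        (status == Status::DCSError || status == Status::DCSStarted);
    return {Action::Restart, !dcsProblem};
}
```

// Usage: DecideRestartSketch(Status::PPError, 1, 3) yields Restart with
// increaseCount set, while Status::DCSError yields Restart without it.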
1205
1206//______________________________________________________________________________________________
2bb7b766 1207Bool_t AliShuttle::Process(AliShuttleLogbookEntry* entry)
58bc3020 1208{
73abe331 1209 //
b948db8d 1210 // Makes data retrieval for all detectors in the configuration.
2bb7b766 1211 // entry: Shuttle logbook entry, contains run parameters and status of detectors
1212 // (Unprocessed, Inactive, Failed or Done).
d477ad88 1213 // Returns kFALSE if an error occurred and kTRUE otherwise
73abe331 1214 //
1215
9827400b 1216 if (!entry) return kFALSE;
2bb7b766 1217
1218 fLogbookEntry = entry;
1219
9827400b 1220 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: START ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1221 GetCurrentRun()));
2bb7b766 1222
e7f62f16 1223 // create ML instance that monitors this run
1224 fMonaLisa = new TMonaLisaWriter(Form("%d", GetCurrentRun()), "SHUTTLE", "aliendb1.cern.ch");
1225 // disable monitoring of other parameters that come e.g. from TFile
1226 gMonitoringWriter = 0;
2bb7b766 1227
e7f62f16 1228 // Send the information to ML
1229 TMonaLisaText mlStatus("SHUTTLE_status", "Processing");
9827400b 1230 TMonaLisaText mlRunType("SHUTTLE_runtype", Form("%s (%s)", entry->GetRunType(), entry->GetRunParameter("log")));
e7f62f16 1231
1232 TList mlList;
1233 mlList.Add(&mlStatus);
9827400b 1234 mlList.Add(&mlRunType);
e7f62f16 1235
1236 fMonaLisa->SendParameters(&mlList);
3301427a 1237
9827400b 1238 if (fLogbookEntry->IsDone())
1239 {
1240 Log("SHUTTLE","Process - Shuttle is already DONE. Updating logbook");
1241 UpdateShuttleLogbook("shuttle_done");
1242 fLogbookEntry = 0;
1243 return kTRUE;
1244 }
1245
1246 // read test mode if flag is set
1247 if (fReadTestMode)
1248 {
3d8bc902 1249 fTestMode = kNone;
9827400b 1250 TString logEntry(entry->GetRunParameter("log"));
1251 //printf("log entry = %s\n", logEntry.Data());
1252 TString searchStr("Testmode: ");
1253 Int_t pos = logEntry.Index(searchStr.Data());
1254 //printf("%d\n", pos);
1255 if (pos >= 0)
1256 {
1257 TSubString subStr = logEntry(pos + searchStr.Length(), logEntry.Length());
1258 //printf("%s\n", subStr.String().Data());
1259 TString newStr(subStr.Data());
1260 TObjArray* token = newStr.Tokenize(' ');
1261 if (token)
1262 {
1263 //token->Print();
1264 TObjString* tmpStr = dynamic_cast<TObjString*> (token->First());
1265 if (tmpStr)
1266 {
1267 Int_t testMode = tmpStr->String().Atoi();
1268 if (testMode > 0)
1269 {
1270 Log("SHUTTLE", Form("Enabling test mode %d", testMode));
1271 SetTestMode((TestMode) testMode);
1272 }
1273 }
1274 delete token;
1275 }
1276 }
1277 }
1278
3d8bc902 1279 Log("SHUTTLE", Form("The test mode flag is %d", (Int_t) fTestMode));
1280
eba76848 1281 fLogbookEntry->Print("all");
57f50b3c 1282
1283 // Initialization
d477ad88 1284 Bool_t hasError = kFALSE;
5164a766 1285
2bb7b766 1286 AliCDBStorage *mainCDBSto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
1287 if(mainCDBSto) mainCDBSto->QueryCDB(GetCurrentRun());
1288 AliCDBStorage *mainRefSto = AliCDBManager::Instance()->GetStorage(fgkMainRefStorage);
1289 if(mainRefSto) mainRefSto->QueryCDB(GetCurrentRun());
d477ad88 1290
57f50b3c 1291 // Loop on detectors in the configuration
b948db8d 1292 TIter iter(fConfig->GetDetectors());
2bb7b766 1293 TObjString* aDetector = 0;
b948db8d 1294
be48e3ea 1295 while ((aDetector = (TObjString*) iter.Next()))
1296 {
7bfb2090 1297 fCurrentDetector = aDetector->String();
5164a766 1298
9e080f92 1299 if (ContinueProcessing() == kFALSE) continue;
1300
2bb7b766 1301 AliInfo(Form("\n\n \t\t\t****** run %d - %s: START ******",
1302 GetCurrentRun(), aDetector->GetName()));
1303
9d733021 1304 for(Int_t iSys=0;iSys<3;iSys++) fFXSCalled[iSys]=kFALSE;
1305
e7f62f16 1306 Log(fCurrentDetector.Data(), "Starting processing");
85a80aa9 1307
be48e3ea 1308 Int_t pid = fork();
1309
1310 if (pid < 0)
1311 {
1312 Log("SHUTTLE", "ERROR: Forking failed");
1313 }
1314 else if (pid > 0)
1315 {
1316 // parent
1317 AliInfo(Form("In parent process of %d - %s: Starting monitoring",
1318 GetCurrentRun(), aDetector->GetName()));
1319
1320 Long_t begin = time(0);
1321
1322 int status; // to be used with waitpid, on purpose an int (not Int_t)!
1323 while (waitpid(pid, &status, WNOHANG) == 0)
1324 {
1325 Long_t expiredTime = time(0) - begin;
1326
1327 if (expiredTime > fConfig->GetPPTimeOut())
1328 {
9827400b 1329 TString tmp;
1330 tmp.Form("Process of %s timed out. Run time: %ld seconds. Killing...",
1331 fCurrentDetector.Data(), expiredTime);
1332 Log("SHUTTLE", tmp);
1333 Log(fCurrentDetector, tmp);
be48e3ea 1334
1335 kill(pid, 9);
1336
3301427a 1337 UpdateShuttleStatus(AliShuttleStatus::kPPTimeOut);
be48e3ea 1338 hasError = kTRUE;
1339
1340 gSystem->Sleep(1000);
1341 }
1342 else
1343 {
be48e3ea 1344 gSystem->Sleep(1000);
9827400b 1345
1346 TString checkStr;
1347 checkStr.Form("ps -o vsize --pid %d | tail -n 1", pid);
1348 FILE* pipe = gSystem->OpenPipe(checkStr, "r");
1349 if (!pipe)
1350 {
1351 Log("SHUTTLE", Form("Error: Could not open pipe to %s", checkStr.Data()));
1352 continue;
1353 }
1354
1355 char buffer[100];
1356 if (!fgets(buffer, 100, pipe))
1357 {
1358 Log("SHUTTLE", "Error: ps did not return anything");
1359 gSystem->ClosePipe(pipe);
1360 continue;
1361 }
1362 gSystem->ClosePipe(pipe);
1363
1364 //Log("SHUTTLE", Form("ps returned %s", buffer));
1365
1366 Int_t mem = 0;
1367 if ((sscanf(buffer, "%d\n", &mem) != 1) || !mem)
1368 {
1369 Log("SHUTTLE", "Error: Could not parse output of ps");
1370 continue;
1371 }
1372
1373 if (expiredTime % 60 == 0)
886d60e6 1374 Log("SHUTTLE", Form("%s: Checking process. Run time: %ld seconds - Memory consumption: %d KB",
1375 fCurrentDetector.Data(), expiredTime, mem));
9827400b 1376
1377 if (mem > fConfig->GetPPMaxMem())
1378 {
1379 TString tmp;
1380 tmp.Form("Process exceeds maximum allowed memory (%d KB > %d KB). Killing...",
1381 mem, fConfig->GetPPMaxMem());
1382 Log("SHUTTLE", tmp);
1383 Log(fCurrentDetector, tmp);
1384
1385 kill(pid, 9);
1386
1387 UpdateShuttleStatus(AliShuttleStatus::kPPOutOfMemory);
1388 hasError = kTRUE;
1389
1390 gSystem->Sleep(1000);
1391 }
be48e3ea 1392 }
1393 }
1394
1395 AliInfo(Form("In parent process of %d - %s: Client has terminated.",
1396 GetCurrentRun(), aDetector->GetName()));
1397
1398 if (WIFEXITED(status))
1399 {
1400 Int_t returnCode = WEXITSTATUS(status);
1401
3301427a 1402 Log("SHUTTLE", Form("%s: the return code is %d", fCurrentDetector.Data(),
1403 returnCode));
be48e3ea 1404
9827400b 1405 if (returnCode == 0) hasError = kTRUE; // the client calls Exit(success), so exit code 0 means failure
be48e3ea 1406 }
1407 }
1408 else if (pid == 0)
1409 {
1410 // client
1411 AliInfo(Form("In client process of %d - %s", GetCurrentRun(), aDetector->GetName()));
1412
ffa29e93 1413 AliInfo("Redirecting output...");
1414
546242fb 1415 if ((freopen(GetLogFileName(fCurrentDetector), "a", stdout)) == 0)
ffa29e93 1416 {
1417 Log("SHUTTLE", "Could not freopen stdout");
1418 }
1419 else
1420 {
1421 fOutputRedirected = kTRUE;
1422 if ((dup2(fileno(stdout), fileno(stderr))) < 0)
1423 Log("SHUTTLE", "Could not redirect stderr");
1424
1425 }
1426
5bac2bde 1427 TString wd = gSystem->WorkingDirectory();
1428 TString tmpDir = Form("%s/%s_process",GetShuttleTempDir(),fCurrentDetector.Data());
1429
1430 gSystem->mkdir(tmpDir.Data());
1431 gSystem->ChangeDirectory(tmpDir.Data());
1432
9827400b 1433 Bool_t success = ProcessCurrentDetector();
5bac2bde 1434
1435 gSystem->ChangeDirectory(wd.Data());
1436
1437 gSystem->Exec(Form("rm -rf %s",tmpDir.Data()));
1438
9827400b 1439 if (success) // Preprocessor finished successfully!
1440 {
3301427a 1441 // Update time_processed field in FXS DB
1442 if (UpdateTable() == kFALSE)
5bac2bde 1443 Log("SHUTTLE", Form("Process - %s: Could not update FXS databases!",
1444 fCurrentDetector.Data()));
3301427a 1445
1446 // Transfer the data from local storage to main storage (Grid)
1447 UpdateShuttleStatus(AliShuttleStatus::kStoreStarted);
1448 if (StoreOCDB() == kFALSE)
1449 {
1450 AliInfo(Form("\n \t\t\t****** run %d - %s: STORAGE ERROR ****** \n\n",
1451 GetCurrentRun(), aDetector->GetName()));
1452 UpdateShuttleStatus(AliShuttleStatus::kStoreError);
9827400b 1453 success = kFALSE;
3301427a 1454 } else {
1455 AliInfo(Form("\n \t\t\t****** run %d - %s: DONE ****** \n\n",
1456 GetCurrentRun(), aDetector->GetName()));
1457 UpdateShuttleStatus(AliShuttleStatus::kDone);
9827400b 1458 UpdateShuttleLogbook(fCurrentDetector, "DONE");
3301427a 1459 }
be48e3ea 1460 }
1461
4b95672b 1462 for (UInt_t iSys=0; iSys<3; iSys++)
1463 {
1464 if (fFXSCalled[iSys]) fFXSlist[iSys].Clear();
1465 }
1466
be48e3ea 1467 AliInfo(Form("Client process of %d - %s is exiting now with %d.",
9827400b 1468 GetCurrentRun(), aDetector->GetName(), success));
be48e3ea 1469
1470 // the client exits here
9827400b 1471 gSystem->Exit(success);
be48e3ea 1472
1473 AliError("We should never get here!!!");
1474 }
7bfb2090 1475 }
5164a766 1476
2bb7b766 1477 AliInfo(Form("\n\n \t\t\t^*^*^*^*^*^*^*^*^*^*^*^* run %d: FINISH ^*^*^*^*^*^*^*^*^*^*^*^* \n",
1478 GetCurrentRun()));
1479
1480 //check if shuttle is done for this run, if so update logbook
1481 TObjArray checkEntryArray;
1482 checkEntryArray.SetOwner(1);
9e080f92 1483 TString whereClause = Form("where run=%d", GetCurrentRun());
1484 if (!QueryShuttleLogbook(whereClause.Data(), checkEntryArray) || checkEntryArray.GetEntries() == 0) {
1485 Log("SHUTTLE", Form("Process - Warning: Cannot check status of run %d on Shuttle logbook!",
1486 GetCurrentRun()));
1487 return hasError == kFALSE;
1488 }
b948db8d 1489
9e080f92 1490 AliShuttleLogbookEntry* checkEntry = dynamic_cast<AliShuttleLogbookEntry*>
1491 (checkEntryArray.At(0));
2bb7b766 1492
9e080f92 1493 if (checkEntry)
1494 {
1495 if (checkEntry->IsDone())
be48e3ea 1496 {
9e080f92 1497 Log("SHUTTLE","Process - Shuttle is DONE. Updating logbook");
1498 UpdateShuttleLogbook("shuttle_done");
1499 }
1500 else
1501 {
1502 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
be48e3ea 1503 {
9e080f92 1504 if (checkEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
be48e3ea 1505 {
9e080f92 1506 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
1507 checkEntry->GetRun(), GetDetName(iDet)));
1508 fFirstUnprocessed[iDet] = kFALSE;
be48e3ea 1509 }
1510 }
2bb7b766 1511 }
1512 }
1513
e7f62f16 1514 // remove ML instance
1515 delete fMonaLisa;
1516 fMonaLisa = 0;
1517
2bb7b766 1518 fLogbookEntry = 0;
85a80aa9 1519
a7160fe9 1520 return hasError == kFALSE;
73abe331 1521}
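// Illustrative sketch, not part of AliShuttle: the fork / waitpid(WNOHANG)
// supervision pattern used in Process, reduced to a stand-alone POSIX helper.
// RunSupervisedSketch, ChildResult and the millisecond parameters are
// assumptions made for this example; timeoutMs plays the role of
// GetPPTimeOut(). The memory check done via "ps" is left out.

```cpp
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Result of supervising one child process.
struct ChildResult {
    bool timedOut;  // true if the parent had to kill the child
    int  exitCode;  // exit code if the child exited on its own, else -1
};

// Hypothetical helper: fork, run `work` in the child, poll from the parent.
ChildResult RunSupervisedSketch(int (*work)(), int timeoutMs, int pollMs = 10)
{
    pid_t pid = fork();
    if (pid < 0)
        return {false, -1};   // fork failed
    if (pid == 0)
        _exit(work());        // child: never returns to the caller

    const auto begin = std::chrono::steady_clock::now();
    int status = 0;           // plain int, as required by waitpid
    bool killed = false;

    // with WNOHANG, waitpid returns 0 while the child is still running
    while (waitpid(pid, &status, WNOHANG) == 0) {
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::milliseconds>(
                std::chrono::steady_clock::now() - begin).count();
        if (!killed && elapsed > timeoutMs) {
            kill(pid, SIGKILL);   // same effect as kill(pid, 9) above
            killed = true;
        }
        usleep(pollMs * 1000);
    }

    if (WIFEXITED(status))
        return {killed, WEXITSTATUS(status)};
    return {killed, -1};          // terminated by a signal
}
```

// Design note: the child leaves via _exit so that buffered output inherited
// from the parent is not flushed twice.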
1522
b948db8d 1523//______________________________________________________________________________________________
9827400b 1524Bool_t AliShuttle::ProcessCurrentDetector()
73abe331 1525{
1526 //
2bb7b766 1527 // Makes data retrieval just for a specific detector (fCurrentDetector).
73abe331 1528 // There should be a configuration for this detector.
73abe331 1529
2bb7b766 1530 AliInfo(Form("Retrieving values for %s, run %d", fCurrentDetector.Data(), GetCurrentRun()));
73abe331 1531
2d9019b4 1532 if (!CleanReferenceStorage(fCurrentDetector.Data()))
546242fb 1533 return kFALSE;
1534
a038aa70 1535 TMap* dcsMap = 0;
3301427a 1536
1537 // call preprocessor
1538 AliPreprocessor* aPreprocessor =
1539 dynamic_cast<AliPreprocessor*> (fPreprocessorMap.GetValue(fCurrentDetector));
1540
1541 aPreprocessor->Initialize(GetCurrentRun(), GetCurrentStartTime(), GetCurrentEndTime());
1542
1543 Bool_t processDCS = aPreprocessor->ProcessDCS();
d477ad88 1544
651fdaab 1545 if (!processDCS)
1546 {
1547 Log(fCurrentDetector, "The preprocessor requested to skip the retrieval of DCS values");
1548 }
8b739301 1549 else if (fTestMode & kSkipDCS)
2c15234c 1550 {
3d8bc902 1551 Log(fCurrentDetector, "In TESTMODE - Skipping DCS processing!");
9827400b 1552 }
1553 else if (fTestMode & kErrorDCS)
1554 {
3d8bc902 1555 Log(fCurrentDetector, "In TESTMODE - Simulating DCS error");
1556 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
9827400b 1557 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
1558 return kFALSE;
2c15234c 1559 } else {
3301427a 1560
1561 UpdateShuttleStatus(AliShuttleStatus::kDCSStarted);
1562
2c15234c 1563 TString host(fConfig->GetDCSHost(fCurrentDetector));
1564 Int_t port = fConfig->GetDCSPort(fCurrentDetector);
1565
a038aa70 1566 if (fConfig->GetDCSAliases(fCurrentDetector)->GetEntries() > 0)
2c15234c 1567 {
a038aa70 1568 dcsMap = GetValueSet(host, port, fConfig->GetDCSAliases(fCurrentDetector), kAlias);
1569 if (!dcsMap)
2c15234c 1570 {
a038aa70 1571 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS aliases");
2c15234c 1572 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
9827400b 1573 return kFALSE;
2c15234c 1574 }
4f0ab988 1575 }
a038aa70 1576
1577 if (fConfig->GetDCSDataPoints(fCurrentDetector)->GetEntries() > 0)
2c15234c 1578 {
a038aa70 1579 TMap* dcsMap2 = GetValueSet(host, port, fConfig->GetDCSDataPoints(fCurrentDetector), kDP);
1580 if (!dcsMap2)
2c15234c 1581 {
a038aa70 1582 Log(fCurrentDetector, "ProcessCurrentDetector - Error while retrieving DCS data points");
2c15234c 1583 UpdateShuttleStatus(AliShuttleStatus::kDCSError);
a038aa70 1584 if (dcsMap)
1585 delete dcsMap;
9827400b 1586 return kFALSE;
2c15234c 1587 }
a038aa70 1588
1589 if (!dcsMap)
1590 {
1591 dcsMap = dcsMap2;
1592 }
1593 else // merge
1594 {
1595 TIter iter(dcsMap2);
1596 TObjString* key = 0;
1597 while ((key = (TObjString*) iter.Next()))
1598 dcsMap->Add(key, dcsMap2->GetValue(key->String()));
1599
1600 dcsMap2->SetOwner(kFALSE);
1601 delete dcsMap2;
1602 }
73abe331 1603 }
a038aa70 1604
73abe331 1605 }
b948db8d 1606
dc25836b 1607 // still no map?
1608 if (!dcsMap)
1609 dcsMap = new TMap;
1610
2bb7b766 1611 // DCS Archive DB processing successful. Call Preprocessor!
85a80aa9 1612 UpdateShuttleStatus(AliShuttleStatus::kPPStarted);
a7160fe9 1613
a038aa70 1614 UInt_t returnValue = aPreprocessor->Process(dcsMap);
b948db8d 1615
3301427a 1616 if (returnValue > 0) // Preprocessor error!
1617 {
9827400b 1618 Log(fCurrentDetector, Form("Preprocessor failed. Process returned %d.", returnValue));
cb343cfd 1619 UpdateShuttleStatus(AliShuttleStatus::kPPError);
a038aa70 1620 dcsMap->DeleteAll();
1621 delete dcsMap;
9827400b 1622 return kFALSE;
1623 }
1624
1625 // preprocessor ok!
1626 UpdateShuttleStatus(AliShuttleStatus::kPPDone);
1627 Log(fCurrentDetector, Form("ProcessCurrentDetector - %s preprocessor returned success",
1628 fCurrentDetector.Data()));
b948db8d 1629
a038aa70 1630 dcsMap->DeleteAll();
1631 delete dcsMap;
b948db8d 1632
9827400b 1633 return kTRUE;
2bb7b766 1634}
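// Illustrative sketch, not part of AliShuttle: ProcessCurrentDetector merges
// the data-point map into the alias map and then deletes the second map
// without its contents. With standard containers the same hand-over of
// ownership can be expressed by moving the values. DcsMapSketch, ValueList
// and MergeIntoSketch are hypothetical names, and std::map stands in for TMap.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using ValueList    = std::vector<double>;
using DcsMapSketch = std::map<std::string, std::unique_ptr<ValueList>>;

// Move every entry of `source` into `target`.
// Keys that already exist in `target` keep the value stored in `target`.
void MergeIntoSketch(DcsMapSketch& target, DcsMapSketch&& source)
{
    for (auto& kv : source)
        target.emplace(kv.first, std::move(kv.second));

    // whatever was not taken over is released here
    source.clear();
}
```

// Usage: MergeIntoSketch(aliasMap, std::move(dataPointMap));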
1635
1636//______________________________________________________________________________________________
1637Bool_t AliShuttle::QueryShuttleLogbook(const char* whereClause,
1638 TObjArray& entries)
1639{
9827400b 1640 // Queries DAQ's Shuttle logbook and fills the detector status objects.
1641 // Calls QueryRunParameters to query the DAQ logbook for run parameters.
1642 //
2bb7b766 1643
fc5a4708 1644 entries.SetOwner(1);
1645
2bb7b766 1646 // check connection; connect if necessary
be48e3ea 1647 if(!Connect(3)) return kFALSE;
2bb7b766 1648
1649 TString sqlQuery;
441b0e9c 1650 sqlQuery = Form("select * from %s %s order by run", fConfig->GetShuttlelbTable(), whereClause);
2bb7b766 1651
be48e3ea 1652 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1653 if (!aResult) {
1654 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
1655 return kFALSE;
1656 }
1657
fc5a4708 1658 AliDebug(2,Form("Query = %s", sqlQuery.Data()));
1659
2bb7b766 1660 if(aResult->GetRowCount() == 0) {
9827400b 1661 AliInfo("No entries in Shuttle Logbook match request");
1662 delete aResult;
1663 return kTRUE;
2bb7b766 1664 }
1665
1666 // TODO Check field count!
db99d43e 1667 const UInt_t nCols = 23;
2bb7b766 1668 if (aResult->GetFieldCount() != (Int_t) nCols) {
1669 AliError("Invalid SQL result field number!");
1670 delete aResult;
1671 return kFALSE;
1672 }
1673
2bb7b766 1674 TSQLRow* aRow;
1675 while ((aRow = aResult->Next())) {
1676 TString runString(aRow->GetField(0), aRow->GetFieldLength(0));
1677 Int_t run = runString.Atoi();
1678
eba76848 1679 AliShuttleLogbookEntry *entry = QueryRunParameters(run);
1680 if (!entry)
1681 continue;
2bb7b766 1682
1683 // loop on detectors
eba76848 1684 for(UInt_t ii = 0; ii < nCols; ii++)
1685 entry->SetDetectorStatus(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1686
eba76848 1687 entries.AddLast(entry);
2bb7b766 1688 delete aRow;
1689 }
1690
2bb7b766 1691 delete aResult;
1692 return kTRUE;
1693}
1694
1695//______________________________________________________________________________________________
eba76848 1696AliShuttleLogbookEntry* AliShuttle::QueryRunParameters(Int_t run)
2bb7b766 1697{
eba76848 1698 //
1699 // Retrieves the run parameters written in the DAQ logbook and stores them in an AliShuttleLogbookEntry object
1700 //
2bb7b766 1701
1702 // check connection; connect if necessary
be48e3ea 1703 if (!Connect(3))
eba76848 1704 return 0;
2bb7b766 1705
1706 TString sqlQuery;
2c15234c 1707 sqlQuery.Form("select * from %s where run=%d", fConfig->GetDAQlbTable(), run);
2bb7b766 1708
be48e3ea 1709 TSQLResult* aResult = fServer[3]->Query(sqlQuery);
2bb7b766 1710 if (!aResult) {
1711 AliError(Form("Can't execute query <%s>!", sqlQuery.Data()));
eba76848 1712 return 0;
2bb7b766 1713 }
1714
eba76848 1715 if (aResult->GetRowCount() == 0) {
2bb7b766 1716 Log("SHUTTLE", Form("QueryRunParameters - No entry in DAQ Logbook for run %d. Skipping", run));
1717 delete aResult;
eba76848 1718 return 0;
2bb7b766 1719 }
1720
eba76848 1721 if (aResult->GetRowCount() > 1) {
2bb7b766 1722 AliError(Form("More than one entry in DAQ Logbook for run %d. Skipping", run));
1723 delete aResult;
eba76848 1724 return 0;
2bb7b766 1725 }
1726
eba76848 1727 TSQLRow* aRow = aResult->Next();
1728 if (!aRow)
1729 {
1730 AliError(Form("Could not retrieve row for run %d. Skipping", run));
1731 delete aResult;
1732 return 0;
1733 }
2bb7b766 1734
eba76848 1735 AliShuttleLogbookEntry* entry = new AliShuttleLogbookEntry(run);
2bb7b766 1736
eba76848 1737 for (Int_t ii = 0; ii < aResult->GetFieldCount(); ii++)
1738 entry->SetRunParameter(aResult->GetFieldName(ii), aRow->GetField(ii));
2bb7b766 1739
eba76848 1740 UInt_t startTime = entry->GetStartTime();
1741 UInt_t endTime = entry->GetEndTime();
1742
1743 if (!startTime || !endTime || startTime > endTime) {
1744 Log("SHUTTLE",
1745 Form("QueryRunParameters - Invalid parameters for Run %d: startTime = %u, endTime = %u",
1746 run, startTime, endTime));
1747 delete entry;
2bb7b766 1748 delete aRow;
eba76848 1749 delete aResult;
1750 return 0;
2bb7b766 1751 }
1752
eba76848 1753 delete aRow;
2bb7b766 1754 delete aResult;
eba76848 1755
1756 return entry;
2bb7b766 1757}
1758
b948db8d 1759//______________________________________________________________________________________________
2c15234c 1760Bool_t AliShuttle::GetValueSet(const char* host, Int_t port, const char* entry,
1761 TObjArray* valueSet, DCSType type)
73abe331 1762{
9827400b 1763 // Retrieve all "entry" data points from the DCS server
1764 // host, port: TSocket connection parameters
1765 // entry: name of the alias or data point
1766 // valueSet: array of retrieved AliDCSValue's
1767 // type: kAlias or kDP
58bc3020 1768
73abe331 1769 AliDCSClient client(host, port, fTimeout, fRetries);
2c15234c 1770 if (!client.IsConnected())
1771 {
b948db8d 1772 return kFALSE;
73abe331 1773 }
1774
2c15234c 1775 Int_t result=0;
73abe331 1776
2c15234c 1777 if (type == kAlias)
1778 {
1779 result = client.GetAliasValues(entry,
1780 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1781 } else
1782 if (type == kDP)
1783 {
1784 result = client.GetDPValues(entry,
1785 GetCurrentStartTime(), GetCurrentEndTime(), valueSet);
1786 }
1787
1788 if (result < 0)
1789 {
2bb7b766 1790 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get '%s'! Reason: %s",
2c15234c 1791 entry, AliDCSClient::GetErrorString(result)));
73abe331 1792
2c15234c 1793 if (result == AliDCSClient::fgkServerError)
1794 {
2bb7b766 1795 Log(fCurrentDetector.Data(), Form("GetValueSet - Server error: %s",
73abe331 1796 client.GetServerError().Data()));
1797 }
1798
1799 return kFALSE;
1800 }
1801
1802 return kTRUE;
1803}
b948db8d 1804
a038aa70 1805//______________________________________________________________________________________________
1806TMap* AliShuttle::GetValueSet(const char* host, Int_t port, const TSeqCollection* entries,
1807 DCSType type)
1808{
1809 // Retrieve the data points for all "entries" from the DCS server
1810 // host, port: TSocket connection parameters
1811 // entries: list of alias or data point names
1812 // type: kAlias or kDP
1813 // returns TMap of values, 0 when failure
1814
1815 const Int_t kSplit = 100; // maximum number of DPs at a time
1816
1817 Int_t totalEntries = entries->GetEntries();
1818
1819 TMap* result = 0;
1820
1821 for (Int_t index=0; index < totalEntries; index += kSplit)
1822 {
1823 Int_t endIndex = index + kSplit;
1824
1825 AliDCSClient client(host, port, fTimeout, fRetries);
1826 if (!client.IsConnected())
1827 return 0;
1828
1829 TMap* partialResult = 0;
1830
1831 if (type == kAlias)
1832 {
1833 partialResult = client.GetAliasValues(entries, GetCurrentStartTime(),
1834 GetCurrentEndTime(), index, endIndex);
1835 }
1836 else if (type == kDP)
1837 {
1838 partialResult = client.GetDPValues(entries, GetCurrentStartTime(),
1839 GetCurrentEndTime(), index, endIndex);
1840 }
1841
1842 if (partialResult == 0)
1843 {
1844 Log(fCurrentDetector.Data(), Form("GetValueSet - Can't get entries (%d...%d)! Reason: %s",
1845 index, endIndex, client.GetServerError().Data()));
1846
1847 if (result)
1848 delete result;
1849
1850 return 0;
1851 }
1852
1853 AliInfo(Form("Retrieved entries %d..%d (total %d); E.g. %s has %d values collected",
1854 index, endIndex, totalEntries, entries->At(index)->GetName(), ((TObjArray*)
1855 partialResult->GetValue(entries->At(index)->GetName()))->GetEntriesFast()));
1856
1857 if (!result)
1858 {
1859 result = partialResult;
1860 }
1861 else
1862 {
1863 TIter iter(partialResult);
1864 TObjString* key = 0;
1865 while ((key = (TObjString*) iter.Next()))
1866 result->Add(key, partialResult->GetValue(key->String()));
1867
1868 partialResult->SetOwner(kFALSE);
1869 delete partialResult;
1870 }
1871
1872 }
1873
1874 return result;
1875}
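// Editorial sketch (not part of AliShuttle.cxx): the TMap overload of GetValueSet above
// requests entries in batches of kSplit and passes endIndex = index + kSplit unclamped.
// Whether AliDCSClient clamps that bound is not visible in this file, so the helper below
// only illustrates, under that assumption, how the batch ranges could be computed with
// the last range clamped to the total. SplitRanges is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: split [0, total) into half-open ranges [begin, end)
// of at most `chunk` entries each; the last range is clamped to `total`.
std::vector<std::pair<int, int> > SplitRanges(int total, int chunk)
{
    assert(chunk > 0); // a non-positive chunk size would never advance
    std::vector<std::pair<int, int> > ranges;
    for (int begin = 0; begin < total; begin += chunk) {
        const int end = std::min(begin + chunk, total);
        ranges.push_back(std::make_pair(begin, end));
    }
    return ranges;
}
```

// With total = 250 and chunk = 100 this yields [0,100), [100,200), [200,250).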
b948db8d 1876//______________________________________________________________________________________________
57f50b3c 1877const char* AliShuttle::GetFile(Int_t system, const char* detector,
1878 const char* id, const char* source)
b948db8d 1879{
9827400b 1880 // Get calibration file from file exchange servers
1881 // First queries the FXS database for the file name, using the run, detector, id and source info
1882 // then calls RetrieveFile(filename) for actual copy to local disk
1883 // run: current run being processed (given by Logbook entry fLogbookEntry)
1884 // detector: the Preprocessor name
1885 // id: provided as a parameter by the Preprocessor
1886 // source: provided by the Preprocessor through GetFileSources function
1887
1888 // check if test mode should simulate a FXS error
1889 if (fTestMode & kErrorFXSFiles)
1890 {
1891 Log(detector, Form("GetFile - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
1892 return 0;
1893 }
1894
57f50b3c 1895 // check connection; connect if necessary
9d733021 1896 if (!Connect(system))
eba76848 1897 {
9d733021 1898 Log(detector, Form("GetFile - Couldn't connect to %s FXS database", GetSystemName(system)));
57f50b3c 1899 return 0;
1900 }
1901
1902 // Query preparation
9d733021 1903 TString sourceName(source);
d386d623 1904 Int_t nFields = 3;
1905 TString sqlQueryStart = Form("select filePath,size,fileChecksum from %s where",
1906 fConfig->GetFXSdbTable(system));
1907 TString whereClause = Form("run=%d and detector=\"%s\" and fileId=\"%s\"",
1908 GetCurrentRun(), detector, id);
1909
9d733021 1910 if (system == kDAQ)
1911 {
d386d623 1912 whereClause += Form(" and DAQsource=\"%s\"", source);
57f50b3c 1913 }
9d733021 1914 else if (system == kDCS)
eba76848 1915 {
9d733021 1916 sourceName="none";
57f50b3c 1917 }
9d733021 1918 else if (system == kHLT)
9e080f92 1919 {
d386d623 1920 whereClause += Form(" and DDLnumbers=\"%s\"", source);
9d733021 1921 nFields = 3;
9e080f92 1922 }
1923
9e080f92 1924 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
1925
1926 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
1927
1928 // Query execution
1929 TSQLResult* aResult = 0;
9d733021 1930 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
9e080f92 1931 if (!aResult) {
9d733021 1932 Log(detector, Form("GetFileName - Can't execute SQL query to %s database for: id = %s, source = %s",
1933 GetSystemName(system), id, sourceName.Data()));
9e080f92 1934 return 0;
1935 }
1936
1937 if(aResult->GetRowCount() == 0)
1938 {
1939 Log(detector,
9d733021 1940 Form("GetFileName - No entry in %s FXS db for: id = %s, source = %s",
1941 GetSystemName(system), id, sourceName.Data()));
9e080f92 1942 delete aResult;
1943 return 0;
1944 }
2bb7b766 1945
9e080f92 1946 if (aResult->GetRowCount() > 1) {
1947 Log(detector,
9d733021 1948 Form("GetFileName - More than one entry in %s FXS db for: id = %s, source = %s",
1949 GetSystemName(system), id, sourceName.Data()));
9e080f92 1950 delete aResult;
1951 return 0;
1952 }
1953
9d733021 1954 if (aResult->GetFieldCount() != nFields) {
9e080f92 1955 Log(detector,
9d733021 1956 Form("GetFileName - Wrong field count in %s FXS db for: id = %s, source = %s",
1957 GetSystemName(system), id, sourceName.Data()));
9e080f92 1958 delete aResult;
1959 return 0;
1960 }
1961
1962 TSQLRow* aRow = dynamic_cast<TSQLRow*> (aResult->Next());
1963
1964 if (!aRow){
9d733021 1965 Log(detector, Form("GetFileName - Empty set result in %s FXS db from query: id = %s, source = %s",
1966 GetSystemName(system), id, sourceName.Data()));
9e080f92 1967 delete aResult;
1968 return 0;
1969 }
1970
1971 TString filePath(aRow->GetField(0), aRow->GetFieldLength(0));
1972 TString fileSize(aRow->GetField(1), aRow->GetFieldLength(1));
d386d623 1973 TString fileChecksum(aRow->GetField(2), aRow->GetFieldLength(2));
9e080f92 1974
1975 delete aResult;
1976 delete aRow;
1977
d386d623 1978 AliDebug(2, Form("filePath = %s; size = %s, fileChecksum = %s",
1979 filePath.Data(), fileSize.Data(), fileChecksum.Data()));
9e080f92 1980
9e080f92 1981 // retrieved file is renamed to make it unique
9d733021 1982 TString localFileName = Form("%s_%s_%d_%s_%s.shuttle",
1983 GetSystemName(system), detector, GetCurrentRun(), id, sourceName.Data());
1984
9e080f92 1985
9d733021 1986 // file retrieval from FXS
4b95672b 1987 UInt_t nRetries = 0;
1988 UInt_t maxRetries = 3;
1989 Bool_t result = kFALSE;
1990
1991 // copy!! if successful TSystem::Exec returns 0
1992 while(nRetries++ < maxRetries) {
1993 AliDebug(2, Form("Trying to copy file. Retry # %d", nRetries));
1994 result = RetrieveFile(system, filePath.Data(), localFileName.Data());
1995 if(!result)
1996 {
1997 Log(detector, Form("GetFileName - Copy of file %s from %s FXS failed",
9d733021 1998 filePath.Data(), GetSystemName(system)));
4b95672b 1999 continue;
4f0749a8 2000 }
9e080f92 2001
d386d623 2002 if (fileChecksum.Length()>0)
4b95672b 2003 {
2004 // compare md5sum of local file with the one stored in the FXS DB
2005 Int_t md5Comp = gSystem->Exec(Form("md5sum %s/%s | grep %s > /dev/null 2>&1",
d386d623 2006 GetShuttleTempDir(), localFileName.Data(), fileChecksum.Data()));
9e080f92 2007
4b95672b 2008 if (md5Comp != 0)
2009 {
2010 Log(detector, Form("GetFileName - md5sum of file %s does not match with local copy!",
2011 filePath.Data()));
2012 result = kFALSE;
2013 continue;
2014 }
d386d623 2015 } else {
2016 Log(fCurrentDetector, Form("GetFile - md5sum of file %s not set in %s database, skipping comparison",
2017 filePath.Data(), GetSystemName(system)));
9d733021 2018 }
4b95672b 2019 if (result) break;
9e080f92 2020 }
2021
4b95672b 2022 if(!result) return 0;
2023
9d733021 2024 fFXSCalled[system]=kTRUE;
2025 TObjString *fileParams = new TObjString(Form("%s#!?!#%s", id, sourceName.Data()));
2026 fFXSlist[system].Add(fileParams);
9e080f92 2027
2028 static TString fullLocalFileName;
1bcd28db 2029 fullLocalFileName.Form("%s/%s", GetShuttleTempDir(), localFileName.Data());
36c99a6a 2030
1bcd28db 2031 Log(fCurrentDetector, Form("GetFile - Retrieved file with id %s and source %s from %s to %s", id, source, GetSystemName(system), fullLocalFileName.Data()));
9e080f92 2032
2033 return fullLocalFileName.Data();
2bb7b766 2034}
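// Editorial sketch (not part of AliShuttle.cxx): the retry loop in GetFile above copies the
// file up to maxRetries times and accepts a copy only when both the transfer and the
// checksum comparison succeed. The generic helper below shows the same control flow in
// isolation. RetryFetch and its two callables are hypothetical names; in GetFile the
// roles would be played by RetrieveFile and the md5sum comparison.

```cpp
#include <cassert>

// Hypothetical helper: call fetch(attempt) up to maxRetries times.
// An attempt counts as successful only if fetch and verify both return true.
template <typename Fetch, typename Verify>
bool RetryFetch(unsigned maxRetries, Fetch fetch, Verify verify)
{
    assert(maxRetries > 0);
    for (unsigned attempt = 1; attempt <= maxRetries; ++attempt) {
        if (!fetch(attempt)) continue; // transfer failed, try again
        if (!verify()) continue;       // checksum mismatch, try again
        return true;                   // first fully valid copy wins
    }
    return false; // all attempts exhausted
}
```

// Keeping the result local to each attempt avoids a stale success flag when a copy
// succeeds but its verification fails.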
2035
2036//______________________________________________________________________________________________
9d733021 2037Bool_t AliShuttle::RetrieveFile(UInt_t system, const char* fxsFileName, const char* localFileName)
9e080f92 2038{
9827400b 2039 //
2040 // Copies file from FXS to local Shuttle machine
2041 //
2bb7b766 2042
9e080f92 2043 // check temp directory: try to open it; if it does not exist, create it
9d733021 2044 AliDebug(2, Form("Copy file %s from %s FXS into %s/%s",
2045 fxsFileName, GetSystemName(system), GetShuttleTempDir(), localFileName));
9e080f92 2046
36c99a6a 2047 void* dir = gSystem->OpenDirectory(GetShuttleTempDir());
9e080f92 2048 if (dir == NULL) {
36c99a6a 2049 if (gSystem->mkdir(GetShuttleTempDir(), kTRUE)) {
2050 AliError(Form("Can't create directory <%s>", GetShuttleTempDir()));
9e080f92 2051 return kFALSE;
2052 }
2053
2054 } else {
2055 gSystem->FreeDirectory(dir);
2056 }
2057
9d733021 2058 TString baseFXSFolder;
2059 if (system == kDAQ)
2060 {
2061 baseFXSFolder = "FES/";
2062 }
2063 else if (system == kDCS)
2064 {
2065 baseFXSFolder = "";
2066 }
2067 else if (system == kHLT)
2068 {
42fde080 2069 baseFXSFolder = "/opt/FXS/";
9d733021 2070 }
2071
2072
2073 TString command = Form("scp -oPort=%d -2 %s@%s:%s%s %s/%s",
2074 fConfig->GetFXSPort(system),
2075 fConfig->GetFXSUser(system),
2076 fConfig->GetFXSHost(system),
2077 baseFXSFolder.Data(),
2078 fxsFileName,
36c99a6a 2079 GetShuttleTempDir(),
9e080f92 2080 localFileName);
2081
2082 AliDebug(2, Form("%s",command.Data()));
2083
4b95672b 2084 Bool_t result = (gSystem->Exec(command.Data()) == 0);
9e080f92 2085
4b95672b 2086 return result;
9e080f92 2087}
2088
2089//______________________________________________________________________________________________
9d733021 2090TList* AliShuttle::GetFileSources(Int_t system, const char* detector, const char* id)
2091{
9827400b 2092 //
2093 // Get sources producing the condition file Id from file exchange servers
4a33bdd9 2094 // if id is NULL all sources are returned (distinct)
9827400b 2095 //
1bcd28db 2096
2097 Log(detector, Form("GetFileSources - Retrieving sources with id %s from %s", id ? id : "(any)", GetSystemName(system)));
9827400b 2098
2099 // check if test mode should simulate a FXS error
2100 if (fTestMode & kErrorFXSSources)
2101 {
2102 Log(detector, Form("GetFileSources - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2103 return 0;
2104 }
2105
9d733021 2106 if (system == kDCS)
2107 {
6297b37d 2108 AliWarning("DCS system has only one source of data!");
2109 TList *list = new TList();
2110 list->SetOwner(1);
2111 list->Add(new TObjString(" "));
2112 return list;
9d733021 2113 }
9e080f92 2114
2115 // check connection; connect if necessary
9d733021 2116 if (!Connect(system))
2117 {
4a33bdd9 2118 Log(detector, Form("GetFileSources - Couldn't connect to %s FXS database", GetSystemName(system)));
9d733021 2119 return NULL;
9e080f92 2120 }
2121
9d733021 2122 TString sourceName = 0;
2123 if (system == kDAQ)
2124 {
2125 sourceName = "DAQsource";
2126 } else if (system == kHLT)
2127 {
2128 sourceName = "DDLnumbers";
2129 }
2130
4a33bdd9 2131 TString sqlQueryStart = Form("select distinct %s from %s where", sourceName.Data(), fConfig->GetFXSdbTable(system));
2132 TString whereClause = Form("run=%d and detector=\"%s\"",
2133 GetCurrentRun(), detector);
2134 if (id)
2135 whereClause += Form(" and fileId=\"%s\"", id);
9e080f92 2136 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2137
2138 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2139
2140 // Query execution
2141 TSQLResult* aResult;
9d733021 2142 aResult = fServer[system]->Query(sqlQuery);
9e080f92 2143 if (!aResult) {
9d733021 2144 Log(detector, Form("GetFileSources - Can't execute SQL query to %s database for id: %s",
2145 GetSystemName(system), id));
9e080f92 2146 return 0;
2147 }
2148
86aa42c3 2149 TList *list = new TList();
2150 list->SetOwner(1);
2151
9d733021 2152 if (aResult->GetRowCount() == 0)
2153 {
9e080f92 2154 Log(detector,
9d733021 2155 Form("GetFileSources - No entry in %s FXS table for id: %s", GetSystemName(system), id));
9e080f92 2156 delete aResult;
86aa42c3 2157 return list;
9e080f92 2158 }
2159
1bcd28db 2160 Log(detector, Form("GetFileSources - Found %d sources", aResult->GetRowCount()));
9e080f92 2161
1bcd28db 2162 TSQLRow* aRow;
9d733021 2163 while ((aRow = aResult->Next()))
2164 {
9e080f92 2165
9d733021 2166 TString source(aRow->GetField(0), aRow->GetFieldLength(0));
2167 AliDebug(2, Form("%s = %s", sourceName.Data(), source.Data()));
2168 list->Add(new TObjString(source));
9e080f92 2169 delete aRow;
2170 }
9d733021 2171
9e080f92 2172 delete aResult;
2173
2174 return list;
2bb7b766 2175}
2176
4a33bdd9 2177//______________________________________________________________________________________________
2178TList* AliShuttle::GetFileIDs(Int_t system, const char* detector, const char* source)
2179{
2180 //
2181 // Get all ids of condition files produced by a given source from file exchange servers
2182 //
2183
1bcd28db 2184 Log(detector, Form("GetFileIDs - Retrieving ids with source %s from %s", source, GetSystemName(system)));
2185
4a33bdd9 2186 // check if test mode should simulate a FXS error
2187 if (fTestMode & kErrorFXSSources)
2188 {
2189 Log(detector, Form("GetFileIDs - In TESTMODE - Simulating error while connecting to %s FXS", GetSystemName(system)));
2190 return 0;
2191 }
2192
2193 // check connection; connect if necessary
2194 if (!Connect(system))
2195 {
2196 Log(detector, Form("GetFileIDs - Couldn't connect to %s FXS database", GetSystemName(system)));
2197 return NULL;
2198 }
2199
2200 TString sourceName = 0;
2201 if (system == kDAQ)
2202 {
2203 sourceName = "DAQsource";
2204 } else if (system == kHLT)
2205 {
2206 sourceName = "DDLnumbers";
2207 }
2208
2209 TString sqlQueryStart = Form("select fileId from %s where", fConfig->GetFXSdbTable(system));
2210 TString whereClause = Form("run=%d and detector=\"%s\"",
2211 GetCurrentRun(), detector);
2212 if (sourceName.Length() > 0 && source)
2213 whereClause += Form(" and %s=\"%s\"", sourceName.Data(), source);
2214 TString sqlQuery = Form("%s %s", sqlQueryStart.Data(), whereClause.Data());
2215
2216 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2217
2218 // Query execution
2219 TSQLResult* aResult;
2220 aResult = fServer[system]->Query(sqlQuery);
2221 if (!aResult) {
2222 Log(detector, Form("GetFileIDs - Can't execute SQL query to %s database for source: %s",
2223 GetSystemName(system), source));
2224 return 0;
2225 }
2226
2227 TList *list = new TList();
2228 list->SetOwner(1);
2229
2230 if (aResult->GetRowCount() == 0)
2231 {
2232 Log(detector,
2233 Form("GetFileIDs - No entry in %s FXS table for source: %s", GetSystemName(system), source));
2234 delete aResult;
2235 return list;
2236 }
2237
1bcd28db 2238 Log(detector, Form("GetFileIDs - Found %d ids", aResult->GetRowCount()));
2239
4a33bdd9 2240 TSQLRow* aRow;
2241
2242 while ((aRow = aResult->Next()))
2243 {
2244
2245 TString id(aRow->GetField(0), aRow->GetFieldLength(0));
2246 AliDebug(2, Form("fileId = %s", id.Data()));
2247 list->Add(new TObjString(id));
2248 delete aRow;
2249 }
2250
2251 delete aResult;
2252
2253 return list;
2254}
2255
2bb7b766 2256//______________________________________________________________________________________________
9d733021 2257Bool_t AliShuttle::Connect(Int_t system)
2bb7b766 2258{
9827400b 2259 // Connect to MySQL Server of the system's FXS MySQL databases
2260 // DAQ Logbook, Shuttle Logbook and DAQ FXS db are on the same host
2261 //
57f50b3c 2262
9d733021 2263 // check connection: if already connected return
2264 if(fServer[system] && fServer[system]->IsConnected()) return kTRUE;
57f50b3c 2265
9d733021 2266 TString dbHost, dbUser, dbPass, dbName;
57f50b3c 2267
9d733021 2268 if (system < 3) // FXS db servers
2269 {
2270 dbHost = Form("mysql://%s:%d", fConfig->GetFXSdbHost(system), fConfig->GetFXSdbPort(system));
2271 dbUser = fConfig->GetFXSdbUser(system);
2272 dbPass = fConfig->GetFXSdbPass(system);
2273 dbName = fConfig->GetFXSdbName(system);
2274 } else { // Run & Shuttle logbook servers
2275 // TODO Will the Shuttle logbook server be the same as the Run logbook server ???
2276 dbHost = Form("mysql://%s:%d", fConfig->GetDAQlbHost(), fConfig->GetDAQlbPort());
2277 dbUser = fConfig->GetDAQlbUser();
2278 dbPass = fConfig->GetDAQlbPass();
2279 dbName = fConfig->GetDAQlbDB();
2280 }
57f50b3c 2281
9d733021 2282 fServer[system] = TSQLServer::Connect(dbHost.Data(), dbUser.Data(), dbPass.Data());
2283 if (!fServer[system] || !fServer[system]->IsConnected()) {
2284 if(system < 3)
2285 {
2286 AliError(Form("Can't establish connection to FXS database for %s",
2287 AliShuttleInterface::GetSystemName(system)));
2288 } else {
2289 AliError("Can't establish connection to Run logbook.");
57f50b3c 2290 }
9d733021 2291 if(fServer[system]) { delete fServer[system]; fServer[system] = 0; } // reset so a later Connect() does not use a dangling pointer
2292 return kFALSE;
2bb7b766 2293 }
57f50b3c 2294
9d733021 2295 // Get tables
2296 TSQLResult* aResult=0;
2297 switch(system){
2298 case kDAQ:
2299 aResult = fServer[kDAQ]->GetTables(dbName.Data());
2300 break;
2301 case kDCS:
2302 aResult = fServer[kDCS]->GetTables(dbName.Data());
2303 break;
2304 case kHLT:
2305 aResult = fServer[kHLT]->GetTables(dbName.Data());
2306 break;
2307 default:
2308 aResult = fServer[3]->GetTables(dbName.Data());
2309 break;
2310 }
2311
2312 delete aResult;
2bb7b766 2313 return kTRUE;
2314}
57f50b3c 2315
9e080f92 2316//______________________________________________________________________________________________
9d733021 2317Bool_t AliShuttle::UpdateTable()
9e080f92 2318{
9827400b 2319 //
2320 // Update FXS table filling time_processed field in all rows corresponding to current run and detector
2321 //
9e080f92 2322
9d733021 2323 Bool_t result = kTRUE;
9e080f92 2324
9d733021 2325 for (UInt_t system=0; system<3; system++)
2326 {
2327 if(!fFXSCalled[system]) continue;
9e080f92 2328
9d733021 2329 // check connection; connect if necessary
2330 if (!Connect(system))
2331 {
2332 Log(fCurrentDetector, Form("UpdateTable - Couldn't connect to %s FXS database", GetSystemName(system)));
2333 result = kFALSE;
2334 continue;
9e080f92 2335 }
9e080f92 2336
9d733021 2337 TTimeStamp now; // now
2338
2339 // Loop on FXS list entries
2340 TIter iter(&fFXSlist[system]);
2341 TObjString *aFXSentry=0;
2342 while ((aFXSentry = dynamic_cast<TObjString*> (iter.Next())))
2343 {
2344 TString aFXSentrystr = aFXSentry->String();
2345 TObjArray *aFXSarray = aFXSentrystr.Tokenize("#!?!#");
2346 if (!aFXSarray || aFXSarray->GetEntries() != 2 )
2347 {
2348 Log(fCurrentDetector, Form("UpdateTable - error updating %s FXS entry. Check string: <%s>",
2349 GetSystemName(system), aFXSentrystr.Data()));
2350 if(aFXSarray) delete aFXSarray;
2351 result = kFALSE;
2352 continue;
2353 }
2354 const char* fileId = ((TObjString*) aFXSarray->At(0))->GetName();
2355 const char* source = ((TObjString*) aFXSarray->At(1))->GetName();
2356
2357 TString whereClause;
2358 if (system == kDAQ)
2359 {
2360 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DAQsource=\"%s\";",
2361 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2362 }
2363 else if (system == kDCS)
2364 {
2365 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\";",
2366 GetCurrentRun(), fCurrentDetector.Data(), fileId);
2367 }
2368 else if (system == kHLT)
2369 {
2370 whereClause = Form("where run=%d and detector=\"%s\" and fileId=\"%s\" and DDLnumbers=\"%s\";",
2371 GetCurrentRun(), fCurrentDetector.Data(), fileId, source);
2372 }
2373
2374 delete aFXSarray;
9e080f92 2375
9d733021 2376 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2377 now.GetSec(), whereClause.Data());
9e080f92 2378
9d733021 2379 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
9e080f92 2380
9d733021 2381 // Query execution
2382 TSQLResult* aResult;
2383 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2384 if (!aResult)
2385 {
2386 Log(fCurrentDetector, Form("UpdateTable - %s db: can't execute SQL query <%s>",
2387 GetSystemName(system), sqlQuery.Data()));
2388 result = kFALSE;
2389 continue;
2390 }
2391 delete aResult;
9e080f92 2392 }
9e080f92 2393 }
2394
9d733021 2395 return result;
9e080f92 2396}
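// Editorial sketch (not part of AliShuttle.cxx): GetFile stores each retrieved file as
// "id#!?!#source" and UpdateTable above recovers the two parts with Tokenize("#!?!#").
// To my knowledge TString::Tokenize treats its argument as a set of delimiter characters
// rather than as one separator string, so an id or source containing '#', '!' or '?'
// would yield more than two tokens and be rejected. The helper below, with the
// hypothetical name SplitFxsEntry, splits on the whole separator instead.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: split "id<sep>source" at the first occurrence of the
// complete separator string. Returns false if the separator is not found.
bool SplitFxsEntry(const std::string& entry, const std::string& sep,
                   std::string& id, std::string& source)
{
    assert(!sep.empty());
    const std::string::size_type pos = entry.find(sep);
    if (pos == std::string::npos) return false;
    id = entry.substr(0, pos);
    source = entry.substr(pos + sep.size());
    return true;
}
```

// Example: "a#b#!?!#none" splits into id "a#b" and source "none".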
57f50b3c 2397
3301427a 2398//______________________________________________________________________________________________
2399Bool_t AliShuttle::UpdateTableFailCase()
2400{
9827400b 2401 // Update FXS table, filling the time_processed field in all rows for the current run and detector.
2402 // This is called when the preprocessor is declared failed for the current run, because
2403 // otherwise the fields are updated only in case of success.
3301427a 2404
2405 Bool_t result = kTRUE;
2406
2407 for (UInt_t system=0; system<3; system++)
2408 {
2409 // check connection; connect if necessary
2410 if (!Connect(system))
2411 {
2412 Log(fCurrentDetector, Form("UpdateTableFailCase - Couldn't connect to %s FXS database",
2413 GetSystemName(system)));
2414 result = kFALSE;
2415 continue;
2416 }
2417
2418 TTimeStamp now; // now
2419
2420 // Update all rows of the current run and detector (no loop over single FXS entries here)
2421
2422 TString whereClause = Form("where run=%d and detector=\"%s\";",
2423 GetCurrentRun(), fCurrentDetector.Data());
2424
2425
2426 TString sqlQuery = Form("update %s set time_processed=%d %s", fConfig->GetFXSdbTable(system),
2427 now.GetSec(), whereClause.Data());
2428
2429 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2430
2431 // Query execution
2432 TSQLResult* aResult;
2433 aResult = dynamic_cast<TSQLResult*> (fServer[system]->Query(sqlQuery));
2434 if (!aResult)
2435 {
2436 Log(fCurrentDetector, Form("UpdateTableFailCase - %s db: can't execute SQL query <%s>",
2437 GetSystemName(system), sqlQuery.Data()));
2438 result = kFALSE;
2439 continue;
2440 }
2441 delete aResult;
2442 }
2443
2444 return result;
2445}
2446
2bb7b766 2447//______________________________________________________________________________________________
2448Bool_t AliShuttle::UpdateShuttleLogbook(const char* detector, const char* status)
2449{
e7f62f16 2450 //
2451 // Update Shuttle logbook filling detector or shuttle_done column
2452 // ex. of usage: UpdateShuttleLogbook("PHOS", "DONE") or UpdateShuttleLogbook("shuttle_done")
2453 //
57f50b3c 2454
2bb7b766 2455 // check connection; connect if necessary
be48e3ea 2456 if(!Connect(3)){
2bb7b766 2457 Log("SHUTTLE", "UpdateShuttleLogbook - Couldn't connect to DAQ Logbook.");
2458 return kFALSE;
57f50b3c 2459 }
2460
2bb7b766 2461 TString detName(detector);
2462 TString setClause;
e7f62f16 2463 if(detName == "shuttle_done")
2464 {
2bb7b766 2465 setClause = "set shuttle_done=1";
e7f62f16 2466
2467 // Send the information to ML
2468 TMonaLisaText mlStatus("SHUTTLE_status", "Done");
2469
2470 TList mlList;
2471 mlList.Add(&mlStatus);
2472
2473 fMonaLisa->SendParameters(&mlList);
2bb7b766 2474 } else {
2bb7b766 2475 TString statusStr(status);
2476 if(statusStr.Contains("done", TString::kIgnoreCase) ||
2477 statusStr.Contains("failed", TString::kIgnoreCase)){
eba76848 2478 setClause = Form("set %s=\"%s\"", detector, status);
2bb7b766 2479 } else {
2480 Log("SHUTTLE",
2481 Form("UpdateShuttleLogbook - Invalid status <%s> for detector %s",
2482 status, detector));
2483 return kFALSE;
2484 }
2485 }
57f50b3c 2486
2bb7b766 2487 TString whereClause = Form("where run=%d", GetCurrentRun());
2488
441b0e9c 2489 TString sqlQuery = Form("update %s %s %s",
2490 fConfig->GetShuttlelbTable(), setClause.Data(), whereClause.Data());
57f50b3c 2491
2bb7b766 2492 AliDebug(2, Form("SQL query: \n%s",sqlQuery.Data()));
2493
2494 // Query execution
2495 TSQLResult* aResult;
be48e3ea 2496 aResult = dynamic_cast<TSQLResult*> (fServer[3]->Query(sqlQuery));
2bb7b766 2497 if (!aResult) {
2498 Log("SHUTTLE", Form("UpdateShuttleLogbook - Can't execute query <%s>", sqlQuery.Data()));
2499 return kFALSE;
57f50b3c 2500 }
2bb7b766 2501 delete aResult;
57f50b3c 2502
2503 return kTRUE;
2504}
2505
2506//______________________________________________________________________________________________
2bb7b766 2507Int_t AliShuttle::GetCurrentRun() const
2508{
9827400b 2509 //
2510 // Get current run from logbook entry
2511 //
57f50b3c 2512
2bb7b766 2513 return fLogbookEntry ? fLogbookEntry->GetRun() : -1;
57f50b3c 2514}
2515
2516//______________________________________________________________________________________________
2bb7b766 2517UInt_t AliShuttle::GetCurrentStartTime() const
2518{
9827400b 2519 //
2520 // get current start time
2521 //
57f50b3c 2522
2bb7b766 2523 return fLogbookEntry ? fLogbookEntry->GetStartTime() : 0;
57f50b3c 2524}
2525
2526//______________________________________________________________________________________________
2bb7b766 2527UInt_t AliShuttle::GetCurrentEndTime() const
2528{
9827400b 2529 //
2530 // get current end time from logbook entry
2531 //
57f50b3c 2532
2bb7b766 2533 return fLogbookEntry ? fLogbookEntry->GetEndTime() : 0;
57f50b3c 2534}
2535
b948db8d 2536//______________________________________________________________________________________________
2537void AliShuttle::Log(const char* detector, const char* message)
2538{
9827400b 2539 //
2540 // Fill log string with a message
2541 //
b948db8d 2542
36c99a6a 2543 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
84090f85 2544 if (dir == NULL) {
36c99a6a 2545 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE)) {
2546 AliError(Form("Can't create directory <%s>", GetShuttleLogDir()));
84090f85 2547 return;
2548 }
b948db8d 2549
84090f85 2550 } else {
2551 gSystem->FreeDirectory(dir);
2552 }
b948db8d 2553
cb343cfd 2554 TString toLog = Form("%s (%d): %s - ", TTimeStamp(time(0)).AsString("s"), getpid(), detector);
e7f62f16 2555 if (GetCurrentRun() >= 0)
2556 toLog += Form("run %d - ", GetCurrentRun());
2bb7b766 2557 toLog += Form("%s", message);
2558
84090f85 2559 AliInfo(toLog.Data());
ffa29e93 2560
2561 // if the log output is already redirected to the file, return here
2562 if (fOutputRedirected && strcmp(detector, "SHUTTLE") != 0)
2563 return;
b948db8d 2564
ffa29e93 2565 TString fileName = GetLogFileName(detector);
e7f62f16 2566
84090f85 2567 gSystem->ExpandPathName(fileName);
2568
2569 ofstream logFile;
2570 logFile.open(fileName, ofstream::out | ofstream::app);
2571
2572 if (!logFile.is_open()) {
2573 AliError(Form("Could not open file %s", fileName.Data()));
2574 return;
2575 }
7bfb2090 2576
84090f85 2577 logFile << toLog.Data() << "\n";
b948db8d 2578
84090f85 2579 logFile.close();
b948db8d 2580}
2bb7b766 2581
ffa29e93 2582//______________________________________________________________________________________________
2583TString AliShuttle::GetLogFileName(const char* detector) const
2584{
2585 //
2586 // returns the name of the log file for a given sub detector
2587 //
2588
2589 TString fileName;
2590
2591 if (GetCurrentRun() >= 0)
2592 fileName.Form("%s/%s_%d.log", GetShuttleLogDir(), detector, GetCurrentRun());
2593 else
2594 fileName.Form("%s/%s.log", GetShuttleLogDir(), detector);
2595
2596 return fileName;
2597}
2598
2bb7b766 2599//______________________________________________________________________________________________
2600Bool_t AliShuttle::Collect(Int_t run)
2601{
9827400b 2602 //
2603 // Collects conditions data for all UNPROCESSED runs written to the DAQ logbook if run = -1 (default).
2604 // If a specific run is given, only that run is processed.
2605 //
2606 // In operational mode, this is the Shuttle function triggered by the EOR signal.
2607 //
2bb7b766 2608
eba76848 2609 if (run == -1)
2610 Log("SHUTTLE","Collect - Shuttle called. Collecting conditions data for unprocessed runs");
2611 else
2612 Log("SHUTTLE", Form("Collect - Shuttle called. Collecting conditions data for run %d", run));
cb343cfd 2613
2614 SetLastAction("Starting");
2bb7b766 2615
2616 TString whereClause("where shuttle_done=0");
eba76848 2617 if (run != -1)
2618 whereClause += Form(" and run=%d", run);
2bb7b766 2619
2620 TObjArray shuttleLogbookEntries;
be48e3ea 2621 if (!QueryShuttleLogbook(whereClause, shuttleLogbookEntries))
2622 {
cb343cfd 2623 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2bb7b766 2624 return kFALSE;
2625 }
2626
9e080f92 2627 if (shuttleLogbookEntries.GetEntries() == 0)
2628 {
2629 if (run == -1)
2630 Log("SHUTTLE","Collect - Found no UNPROCESSED runs in Shuttle logbook");
2631 else
2632 Log("SHUTTLE", Form("Collect - Run %d is already DONE "
2633 "or it does not exist in Shuttle logbook", run));
2634 return kTRUE;
2635 }
2636
be48e3ea 2637 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2638 fFirstUnprocessed[iDet] = kTRUE;
2639
fc5a4708 2640 if (run != -1)
be48e3ea 2641 {
2642 // query Shuttle logbook for earlier runs, check if some detectors are unprocessed,
2643 // flag them into fFirstUnprocessed array
2644 TString whereClause(Form("where shuttle_done=0 and run < %d", run));
2645 TObjArray tmpLogbookEntries;
2646 if (!QueryShuttleLogbook(whereClause, tmpLogbookEntries))
2647 {
2648 Log("SHUTTLE", "Collect - Can't retrieve entries from Shuttle logbook");
2649 return kFALSE;
2650 }
2651
2652 TIter iter(&tmpLogbookEntries);
2653 AliShuttleLogbookEntry* anEntry = 0;
2654 while ((anEntry = dynamic_cast<AliShuttleLogbookEntry*> (iter.Next())))
2655 {
2656 for (UInt_t iDet=0; iDet<NDetectors(); iDet++)
2657 {
2658 if (anEntry->GetDetectorStatus(iDet) == AliShuttleLogbookEntry::kUnprocessed)
2659 {
2660 AliDebug(2, Form("Run %d: setting %s as \"not first time unprocessed\"",
2661 anEntry->GetRun(), GetDetName(iDet)));
2662 fFirstUnprocessed[iDet] = kFALSE;
2663 }
2664 }
2665
2666 }
2667
2668 }
2669
2670 if (!RetrieveConditionsData(shuttleLogbookEntries))
2671 {
cb343cfd 2672 Log("SHUTTLE", "Collect - Process of at least one run failed");
2bb7b766 2673 return kFALSE;
2674 }
2675
36c99a6a 2676 Log("SHUTTLE", "Collect - Requested run(s) successfully processed");
eba76848 2677 return kTRUE;
2bb7b766 2678}
2679
2bb7b766 2680//______________________________________________________________________________________________
2681Bool_t AliShuttle::RetrieveConditionsData(const TObjArray& dateEntries)
2682{
9827400b 2683 //
2684 // Retrieve conditions data for all runs that aren't processed yet
2685 //
2bb7b766 2686
2687 Bool_t hasError = kFALSE;
2688
2689 TIter iter(&dateEntries);
2690 AliShuttleLogbookEntry* anEntry;
2691
2692 while ((anEntry = (AliShuttleLogbookEntry*) iter.Next())){
2693 if (!Process(anEntry)){
2694 hasError = kTRUE;
2695 }
4b95672b 2696
2697 // clean SHUTTLE temp directory
3301427a 2698 TString filename = Form("%s/*.shuttle", GetShuttleTempDir());
2699 RemoveFile(filename.Data());
2bb7b766 2700 }
2701
2702 return hasError == kFALSE;
2703}
cb343cfd 2704
2705//______________________________________________________________________________________________
2706ULong_t AliShuttle::GetTimeOfLastAction() const
2707{
9827400b 2708 //
2709 // Gets time of last action
2710 //
2711
cb343cfd 2712 ULong_t tmp;
36c99a6a 2713
cb343cfd 2714 fMonitoringMutex->Lock();
be48e3ea 2715
cb343cfd 2716 tmp = fLastActionTime;
36c99a6a 2717
cb343cfd 2718 fMonitoringMutex->UnLock();
36c99a6a 2719
cb343cfd 2720 return tmp;
2721}
2722
2723//______________________________________________________________________________________________
2724const TString AliShuttle::GetLastAction() const
2725{
9827400b 2726 //
cb343cfd 2727 // returns a string description of the last action
9827400b 2728 //
cb343cfd 2729
2730 TString tmp;
36c99a6a 2731
cb343cfd 2732 fMonitoringMutex->Lock();
2733
2734 tmp = fLastAction;
2735
2736 fMonitoringMutex->UnLock();
2737
36c99a6a 2738 return tmp;
cb343cfd 2739}
2740
2741//______________________________________________________________________________________________
2742void AliShuttle::SetLastAction(const char* action)
2743{
9827400b 2744 //
cb343cfd 2745 // updates the monitoring variables
9827400b 2746 //
36c99a6a 2747
cb343cfd 2748 fMonitoringMutex->Lock();
36c99a6a 2749
cb343cfd 2750 fLastAction = action;
2751 fLastActionTime = time(0);
2752
2753 fMonitoringMutex->UnLock();
2754}
eba76848 2755
2756//______________________________________________________________________________________________
2757const char* AliShuttle::GetRunParameter(const char* param)
2758{
9827400b 2759 //
2760 // returns run parameter read from DAQ logbook
2761 //
eba76848 2762
2763 if(!fLogbookEntry) {
2764 AliError("No logbook entry!");
2765 return 0;
2766 }
2767
2768 return fLogbookEntry->GetRunParameter(param);
2769}
57c1a579 2770
d386d623 2771//______________________________________________________________________________________________
9827400b 2772AliCDBEntry* AliShuttle::GetFromOCDB(const char* detector, const AliCDBPath& path)
d386d623 2773{
9827400b 2774 //
2775 // returns object from OCDB valid for current run
2776 //
d386d623 2777
9827400b 2778 if (fTestMode & kErrorOCDB)
2779 {
2780 Log(detector, "GetFromOCDB - In TESTMODE - Simulating error with OCDB");
2781 return 0;
2782 }
2783
d386d623 2784 AliCDBStorage *sto = AliCDBManager::Instance()->GetStorage(fgkMainCDB);
2785 if (!sto)
2786 {
9827400b 2787 Log(detector, "GetFromOCDB - Cannot activate main OCDB for query!");
d386d623 2788 return 0;
2789 }
2790
2791 return dynamic_cast<AliCDBEntry*> (sto->Get(path, GetCurrentRun()));
2792}
2793
57c1a579 2794//______________________________________________________________________________________________
2795Bool_t AliShuttle::SendMail()
2796{
9827400b 2797 //
2798 // sends a mail to the subdetector expert in case of preprocessor error
2799 //
2800
2801 if (fTestMode != kNone)
2802 return kTRUE;
57c1a579 2803
36c99a6a 2804 void* dir = gSystem->OpenDirectory(GetShuttleLogDir());
57c1a579 2805 if (dir == NULL)
2806 {
36c99a6a 2807 if (gSystem->mkdir(GetShuttleLogDir(), kTRUE))
57c1a579 2808 {
36c99a6a 2809 AliError(Form("Can't open or create directory <%s>", GetShuttleLogDir()));
57c1a579 2810 return kFALSE;
2811 }
2812
2813 } else {
2814 gSystem->FreeDirectory(dir);
2815 }
2816
2817 TString bodyFileName;
36c99a6a 2818 bodyFileName.Form("%s/mail.body", GetShuttleLogDir());
57c1a579 2819 gSystem->ExpandPathName(bodyFileName);
2820
2821 ofstream mailBody;
2822 mailBody.open(bodyFileName, ofstream::out);
2823
2824 if (!mailBody.is_open())
2825 {
2826 AliError(Form("Could not open mail body file %s", bodyFileName.Data()));
2827 return kFALSE;
2828 }
2829
2830 TString to="";
2831 TIter iterExperts(fConfig->GetResponsibles(fCurrentDetector));
2832 TObjString *anExpert=0;
2833 while ((anExpert = (TObjString*) iterExperts.Next()))
2834 {
2835 to += Form("%s,", anExpert->GetName());
2836 }
2837 if (!to.IsNull()) to.Remove(to.Length()-1); // drop the trailing comma; skip when no expert is listed
909732f7 2838 AliDebug(2, Form("to: %s",to.Data()));
57c1a579 2839
86aa42c3 2840 if (to.IsNull()) {
36c99a6a 2841 AliInfo("List of detector responsibles not yet set!");
2842 return kFALSE;
2843 }
2844
57c1a579 2845 TString cc="alberto.colla@cern.ch";
2846
546242fb 2847 TString subject = Form("%s Shuttle preprocessor FAILED in run %d !",
57c1a579 2848 fCurrentDetector.Data(), GetCurrentRun());
909732f7 2849 AliDebug(2, Form("subject: %s", subject.Data()));
57c1a579 2850
2851 TString body = Form("Dear %s expert(s), \n\n", fCurrentDetector.Data());
2852 body += Form("SHUTTLE just detected that your preprocessor "
546242fb 2853 "failed processing run %d!!\n\n", GetCurrentRun());
2854 body += Form("Please check %s status on the SHUTTLE monitoring page: \n\n", fCurrentDetector.Data());
2855 body += Form("\thttp://pcalimonitor.cern.ch:8889/shuttle.jsp?time=168 \n\n");
2856 body += Form("Find the %s log for the current run on \n\n"
2857 "\thttp://pcalishuttle01.cern.ch:8880/logs/%s_%d.log \n\n",
2858 fCurrentDetector.Data(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2859 body += Form("The last 10 lines of the %s log file follow:\n\n", fCurrentDetector.Data());
2860
909732f7 2861 AliDebug(2, Form("Body begin: %s", body.Data()));
57c1a579 2862
2863 mailBody << body.Data();
2864 mailBody.close();
2865 mailBody.open(bodyFileName, ofstream::out | ofstream::app);
2866
9d733021 2867 TString logFileName = Form("%s/%s_%d.log", GetShuttleLogDir(), fCurrentDetector.Data(), GetCurrentRun());
57c1a579 2868 TString tailCommand = Form("tail -n 10 %s >> %s", logFileName.Data(), bodyFileName.Data());
2869 if (gSystem->Exec(tailCommand.Data()))
2870 {
2871 mailBody << Form("%s log file not found ...\n\n", fCurrentDetector.Data());
2872 }
2873
2874 TString endBody = Form("------------------------------------------------------\n\n");
36c99a6a 2875 endBody += Form("In case of problems please contact the SHUTTLE core team.\n\n");
2876 endBody += "Please do not reply to this message directly; it is automatically generated.\n\n";
546242fb 2877 endBody += "Greetings,\n\n \t\t\tthe SHUTTLE\n";
57c1a579 2878
909732f7 2879 AliDebug(2, Form("Body end: %s", endBody.Data()));
57c1a579 2880
2881 mailBody << endBody.Data();
2882
2883 mailBody.close();
2884
2885 // send mail!
2886 TString mailCommand = Form("mail -s \"%s\" -c %s %s < %s",
2887 subject.Data(),
2888 cc.Data(),
2889 to.Data(),
2890 bodyFileName.Data());
909732f7 2891 AliDebug(2, Form("mail command: %s", mailCommand.Data()));
57c1a579 2892
2893 Int_t result = gSystem->Exec(mailCommand.Data()); // Exec returns the command's exit status
2894
2895 return result == 0;
2896}
d386d623 2897
441b0e9c 2898//______________________________________________________________________________________________
9827400b 2899const char* AliShuttle::GetRunType()
441b0e9c 2900{
9827400b 2901 //
2902 // returns run type read from "run type" logbook
2903 //
441b0e9c 2904
2905 if(!fLogbookEntry) {
2906 AliError("No logbook entry!");
2907 return 0;
2908 }
2909
9827400b 2910 return fLogbookEntry->GetRunType();
441b0e9c 2911}
2912
4859271b 2913//______________________________________________________________________________________________
2914Bool_t AliShuttle::GetHLTStatus()
2915{
2916 // Return HLT status (ON=1 OFF=0)
2917 // Converts the HLT status from the status string read in the run logbook (not just a bool)
2918
2919 if(!fLogbookEntry) {
2920 AliError("No logbook entry!");
2921 return kFALSE;
2922 }
2923
2924 // TODO implement when HLTStatus is inserted in run logbook
2925 //TString hltStatus = fLogbookEntry->GetRunParameter("HLTStatus");
2926 //if(hltStatus == "OFF") {return kFALSE;}
2927
2928 return kTRUE;
2929}
2930
d386d623 2931//______________________________________________________________________________________________
2932void AliShuttle::SetShuttleTempDir(const char* tmpDir)
2933{
9827400b 2934 //
2935 // sets Shuttle temp directory
2936 //
d386d623 2937
2938 fgkShuttleTempDir = gSystem->ExpandPathName(tmpDir);
2939}
2940
2941//______________________________________________________________________________________________
2942void AliShuttle::SetShuttleLogDir(const char* logDir)
2943{
9827400b 2944 //
2945 // sets Shuttle log directory
2946 //
d386d623 2947
2948 fgkShuttleLogDir = gSystem->ExpandPathName(logDir);
2949}