RAL Tier1 weekly operations Grid 20110307
			
From GridPP Wiki
Operational Issues
| Description | Start | End | Affected VO(s) | Severity | Status | 
|  |  |  |  |  |  | 
Downtimes
| Description | Hosts | Type | Start | End | Affected VO(s) | 
Blocking Issues
| Description | Requested Date | Required By Date | Priority | Status | 
|  |  |  |  |  | 
Developments/Plans
Highlights for Tier-1 Ops Meeting
Highlights for Tier-1 VO Liaison Meeting
Detailed Individual Reports
Alastair
-  Working on ATLAS permission change.  [On hold]
-  Setting up xrootd for ATLAS at RAL.
-  Talking to ALICE
-  Looking into upgrading the CASTOR client on all WNs.
 
-  Disk pool merging and DB change.
-  Cleaning up dark data [Ongoing]
-  Writing change control [Done]
-  Moving files! [Ongoing]
 
-  Preparing for Beauty 2011 conference.
-  Requested new VO box for ATLAS Frontier.
Andrew
-  Migration to FTS groups for CMS [Done]
-  Prepared FTS groups setup for ATLAS [Done]
-  Feb accounting; migrated tape usage from vmgr to ns in UB schedule & capacity planning system [Done]
-  Kernel/errata updates [Done]
-  CMS storage consistency check; set up script/cron to run monthly. [Done]
-  CMS squid name changes [Ongoing]
-  CMS data ops
-  Installed new PA instances required for FNAL move to Lustre [Done]
 
Catalin
-  work on quattorised ATLAS Frontier installation
-  apply latest errata and kernel
-  assist work on LFC Oracle DB change [ongoing]
-  involved with CREAM CEs installation and configuration [ongoing]
-  two new VOs to be added to the LFC [done]
-  GGUS issue with pheno affecting lcgwms03 [done]
Derek
-  Catching up after leave [done]
-  Investigating load problems on lcgce05 [done]
-  Investigating BLParser issues on lcgce09 [ongoing]
-  Publishing whole node queue [ongoing]
Matt
-  Deploying test Hadoop instance. [Ongoing]
-  Contact NFS users. [Ongoing]
-  Deploying FTS test instance on new virtual hosts. [Done]
Richard
-  Updating Site level BDIIs to level 21. [Ongoing]
-  Moving one more top BDII into UPS room for better resilience. [Ongoing]
-  Trying out new hypervisor (hv-10) to see how much performance has improved; have moved an existing VM across to it. [Ongoing]
-  Building an ARGUS server using the new QWG templates [Ongoing]
-  Working on the "team status page" being developed as an action from team awayday [Ongoing]
-  Reviewing G/S process documentation  [Ongoing]
-  CASTOR items: developed a script to stress test FTS transfers in/out of the preprod instance. [Ongoing]
 
VO Reports
ALICE
ATLAS
CMS
-  2011-02-28: CREAM CE temporarily blacklisted by a CERN WMS, leading to 35 Job Robot jobs aborting.
-  Large MC reprocessing will start across all T1s sometime this week
LHCb
OnCall/AoD Cover
OnCall Rota
-  Primary OnCall: 
-  Grid OnCall: Derek
-  AoD: