RAL Tier1 weekly operations castor 15/03/2010
From GridPP Wiki
								
												
Summary of Previous Week
-  Matthew:
- Out of the office
 
-  Shaun:
- ..
 
-  Chris:
- ..
 
-  Cheney:
- installing various new hardware
- set up new robot controller
- cleaning away machine room junk
- fix castor151 backups
- fix castor151 crash
- fix castor150 disk space
- got borg-ed
 
-  Tim:
- VDQM issues
- CMS stream starvation issues
 
-  Richard:
- ..
 
-  Brian:
- Draining on lhcbDst RAID servers
- Investigation into non-migrating ATLAS MCTAPE files
- ATLAS FTS slot re-calculation
 
-  Jens:
- Only minor things this week.
 
Developments for this week
-  Matthew:
- Out of the office
 
-  Shaun:
- ..
 
-  Chris:
-  Cheney:
- Robot controller set up
 
-  Tim:
- More new kit install
- T10KB tests on pre-prod
- New tape server installs
 
-  Jens:
- Out of office.
 
Operations Issues
- ATLAS tape migration problem due to incorrect service class configuration
- LHCb operations contention with draining. Resolved with help of LHCb.
- CMS had problems with timeouts on transfers to ASGC early in the week.
- Crash of castor151 DB server
Blocking issues
- Waiting for networking for new tape servers
- Delivery of pre-prod database
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
| Description | Start | End | Type | Affected VO(s) | 
|---|---|---|---|---|
| LHCb Draining RAID 5 disk servers | 2010-03-11, 17:00:00 | 2010-03-15, 08:00:00 | At-risk | LHCb | 
Advanced Planning
- Gen upgrade to 2.1.8 (2010 Q1)
- Install/enable gridftp-internal on Gen (This year/before 2.1.8 upgrade)
Staffing
- Castor on-call person: Chris
- Matt on paternity leave for 1 more week
-  Staff absences: 
- Brian: Mon(pm), Tue, Wed
- Jens
 
