RAL Tier1 weekly operations castor 04/07/2010
Summary of Previous Week
-  Matthew:
- CoD work
- 2.1.9 change control document
- Arranging access to Gen for BADC/NEODC people and helping them get started
- Getting 2.1.9 DLF working on pre-prod with rsyslog, and testing
- Assisting with Security Challenge 4
- Set up website on e-Science wiki for 2.1.9 upgrade
 
-  Shaun:
- 2.1.9 Upgrades
- Moving last disk server into cmsTemp
- Fixed gdss539
 
-  Chris:
- Castor 2.1.9 tests and work related to it
- SSC4
- Alice disk servers filling up /var partitions
- rsyslog
 
-  Richard:
- Installed the machine lcg0625 as a test CIP server
- Trying to get latest (2.1.9-x) functional tests running on pre-prod
- Reconfigured pre-prod to reverse the "local ns" change
- Starting a run of stress tests for 2.1.9 pre-prod
 
-  Brian:
- ..
 
-  Jens:
- ..
 
Developments for this week
-  Matthew:
- Finalizing plans for NIS + networking for CASTOR for Facilities
- 2.1.9 DLF configuration & testing
- Testing 2.1.9 tape migration
- WLCG (Wed-Fri)
 
-  Shaun:
- SRM work
- CoD
 
-  Chris:
- Castor 2.1.9 tests and work related to it
- WLCG meeting
- Finishing SSC4
 
-  Richard:
- Continuing with stress tests for 2.1.9 pre-prod
- 4.5 days A/L
 
-  Brian:
- ..
 
-  Jens:
- ..
 
Operations Issues
- Approx. 1000 CMS files were lost on gdss67 (D1T0 cmsFarm) after a failure of a RAID array. The array was recreated prematurely, before CMS had been informed of the file loss. A postmortem has been written: http://www.gridpp.ac.uk/wiki/RAL_Tier1_Incident_20100630_Disk_Server_Data_Loss_CMS
- A Transtech disk array controller reset itself for an unknown reason, causing mounts on 6 nodes to go read-only and stopping the backup of DB redo logs. Rebooting the nodes during a downtime on 5/7/10 fixed the issue.
Blocking issues
None
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
None
Advanced Planning
- Upgrade to 2.1.8/2.1.9 in 2010
Staffing
- Castor on-call person: Shaun
-  Staff absences: 
- Jens (Mon,Tue)
 
