RAL Tier1 weekly operations castor 29/02/2016
From GridPP Wiki
Revision as of 10:41, 26 February 2016 by Alison Packer
Operations News
- No disk server issues this week.
- glibc updates applied and all CASTOR systems rebooted. Initial issues with head nodes: 7 failed to reboot due to their build history. ACTION: their Quattor builds need revisiting so that this does not recur.
- Main CIP system failed; we have failed over to the test CIP machine. The hardware failure will be fixed, then we will fail back over to the production system.
- The 11.2.0.4 DB client update had to be rescheduled and should go ahead on Monday 29th. It has been running in pre-production for a considerable amount of time, so the change should be transparent.
 
- CASTOR 2.1.15 update:
 - Nameserver upgrade 29th Feb - 3rd March; downtime for all VOs.
 - Stager upgrade for one VO in the week commencing 21/3/16.
 
- Repack updated to 2.1.14-15.
- 2.1.15 works on preprod (RAL xroot RPM build) but has not yet been put under stress.
- CASTOR 2.1.16 coming soon: SRM integration into the CASTOR code base.
- ATLAS gSOAP errors; JK (advised by SdW) restarted the SRM front ends.
- CMS AAA is still an issue.
- LHCb upload is still problematic.
 
- The VO DiRAC people from Leicester are coming online.
- The 2.1.15 upgrade had its first airing in change control; 2.1.15 is currently not working for us.
- New tape-backed disk servers for the Tier1, to replace CV11; recommendation made to Martin.
- Wiki page on merging tape pools created by Shaun.
- 2.1.15 name server tested.
- New SRM on vcert2.
- New SRM (SL6) with bug fixes available; needs testing.
- gfal-cat command failing for ATLAS reads of nsdumps from CASTOR: https://ggus.eu/index.php?mode=ticket_info&ticket_id=117846. Developers are looking to fix this, tracked in: https://ggus.eu/index.php?mode=ticket_info&ticket_id=118842
- LHCb batch jobs failing to copy results into CASTOR; changes made seem to have improved the situation but not fixed it (Raja). Increasing the number of connections to the NS DB (more threads).
- BD looking at porting the persistent tests to Ceph.
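The failing gfal-cat read reported in GGUS 117846 can be sketched as below. This is illustrative only: the SRM endpoint and nsdump path are placeholders, not the real RAL CASTOR locations, and a valid grid proxy would be needed for an actual read.

```shell
# Sketch of the failing read path (GGUS 117846). gfal-cat, from gfal2-util,
# streams a remote file to stdout. The endpoint and path are hypothetical.
NSDUMP_URL="srm://srm-atlas.example.org/castor/example.org/atlas/nsdump.txt"

if command -v gfal-cat >/dev/null 2>&1; then
    # With the affected files this read fails as described in the ticket.
    gfal-cat "$NSDUMP_URL" || echo "gfal-cat read failed"
else
    echo "gfal2-util not installed; command shown for illustration only"
fi
```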