RAL Tier1 weekly operations CASTOR 19/5/2017
Draft agenda
1. Problems encountered this week
2. Upgrades/improvements made this week
3. What are we planning to do next week?
4. Long-term project updates (if not already covered)
1. SL7 upgrade on tape servers
2. SRM upgrade to SL6/CASTOR 2.1.16
3. SL5 elimination from CASTOR functional test boxes and tape verification server
4. CASTOR stress test improvement
5. Special topics
1. Future CASTOR upgrade methodology
6. Actions
7. Anything for CASTOR-Fabric?
8. AoTechnicalB
9. Availability for next week
10. On-Call
11. AoOtherB
Operation problems
gdss724 and gdss744 crashed and were removed from production
When the diskmanager daemon was restarted after an obsolete protocol had been removed from castor.conf, the disk managers were not visible to the transfer manager. See the e-log entry
Operation news
Correct version of printdiskcopy pushed to all 2.1.16 headnodes (see e-log)
New StorageD box for Diamond in place
Plans for next week
Upgrade ATLAS to 2.1.16 on Tuesday
Long-term projects
CIP migration to aquilon and upgrade to SL6
SL6 upgrade on functional test boxes and tape verification server
Tape-server migration to aquilon and SL7 upgrade (on hold at the moment)
CASTOR stress test improvement
Actions
DB hardware upgrade tracking
Drain and decommission/recommission the 12 generation disk servers
RA to get a new source control management system sorted for CASTOR script development
GP to prepare a report on the performance of the WAN parameters deployed on CMS disk servers
Staffing
RA on call
