Data Management Overview
Managing data in a Grid environment involves moving end users away from the concept of files stored on a particular disk location (a Storage URL) and towards a more abstract filename space, usually called a Logical File Name.
The hope for data management is that ultimately users will never need to know the physical locations of their files - their work will be carried out entirely in Logical File Name space.
As usual, such simplicity for the user requires considerable complexity and robustness from the data management (DM) parts of Grid Middleware. The essential DM middleware components are:
- A File Catalog, managing the Logical File Name space and keeping track of the Storage URLs that correspond to each Logical File Name.
- A File Copy Service, to make new copies of files as necessary.
These components can be seen as part of a Data Management Middleware Stack.
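As a rough illustration of how these two components relate, the sketch below models a catalog as an in-memory mapping from Logical File Names to Storage URLs, plus a copy function that registers a new replica. It is a toy under stated assumptions, not any real middleware API: the class and function names, the `lfn:` and `srm://` prefixes, and the example.org hostnames are all hypothetical, and no data is actually transferred.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileCatalog:
    """Toy catalog (hypothetical): maps each Logical File Name to its Storage URLs."""
    replicas: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, lfn: str, surl: str) -> None:
        # Record a Storage URL for this Logical File Name, ignoring duplicates.
        surls = self.replicas.setdefault(lfn, [])
        if surl not in surls:
            surls.append(surl)

    def lookup(self, lfn: str) -> List[str]:
        # Return a copy so callers cannot modify the catalog by accident.
        return list(self.replicas.get(lfn, []))


def replicate(catalog: FileCatalog, lfn: str, dest_prefix: str) -> str:
    """Toy copy service (hypothetical): record a new replica under dest_prefix."""
    sources = catalog.lookup(lfn)
    if not sources:
        raise KeyError(f"no replica registered for {lfn}")
    # A real service would transfer the file from one of the sources here;
    # this sketch only works out the new location and registers it.
    new_surl = dest_prefix.rstrip("/") + "/" + lfn.split("/")[-1]
    catalog.register(lfn, new_surl)
    return new_surl


if __name__ == "__main__":
    catalog = FileCatalog()
    lfn = "lfn:/grid/myvo/data/run42.root"  # illustrative name only
    catalog.register(lfn, "srm://se1.example.org/data/myvo/run42.root")
    replicate(catalog, lfn, "srm://se2.example.org/data/myvo")
    # The user keeps working with the LFN; the catalog knows both locations.
    for surl in catalog.lookup(lfn):
        print(surl)
```

The point of the design is that the user only ever handles the Logical File Name; which Storage URL is used, and how many copies exist, is a matter for the catalog and the copy service.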
Data management middleware has to integrate seamlessly with other middleware components. The most important of these is the middleware for Grid Storage, which manages the actual disk servers on which the physical files are hosted.
For the end user, satisfying the Data Management Use Cases is essential; for Tier 2 sysadmins, the Tier 2 DM Requirements describe what is expected from their site.
