Distributed Data Management Services for Dynamic Data Grids
Houda Lamehamedi
Boleslaw K. Szymanski
Brenden Conte
Data grids are middleware systems that enable users and applications to locate, access, and place large numbers of data sets in geographically distributed storage sites. In most existing and deployed grid systems, however, control of the resources is centralized and usually handled by system administrators. Such configurations hinder dynamic and scalable expansion of the Grid infrastructure and resources. We propose a new lightweight, distributed, adaptive, and scalable data Grid middleware that provides transparent, fast, and reliable access to data and storage resources in data grids. At the core of our approach are dynamic data and replica location and placement techniques that adapt replica location and access to continuously changing network connectivity and users' behavior. The system is fully distributed and self-configuring. In this paper we present the design of the system and the algorithms we use to implement the data management services, and we demonstrate the scalability and performance of the overall system.
Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY
cs-05-16