CONDESA: A Framework for Controlling Data Distribution on Elastic Server Architectures

publication date

  • August 2014

start page

  • 2010

end page

  • 2019

issue

  • 8

volume

  • 25

International Standard Serial Number (ISSN)

  • 1045-9219

Electronic International Standard Serial Number (EISSN)

  • 1558-2183

abstract

  • Applications running in today's data centers show high workload variability. While seasonal patterns, trends and expected events may help build proactive resource allocation policies, this approach has to be complemented with adaptive strategies that address unexpected events such as flash crowds and volume spikes. Additionally, the limitations of current I/O infrastructures in the face of the dramatic increase in data generation require the ability to build novel abstractions and models for robust decision making regarding data layout and data locality. In this work, we present CONDESA (CONtrolling Data distribution on Elastic Server Architectures), a framework for exploring adaptive data distribution strategies for elastic server architectures. To the best of our knowledge, CONDESA is the first platform that permits the systematic study of the interplay between five data-related strategies: workload prediction, adaptive control of data distribution and server provisioning, adaptive data grouping, adaptive data placement, and adaptive system sizing. We demonstrate how CONDESA can be used for browsing the design space of adaptive data distribution policies. We show how prediction models can be compared in terms of overhead and accuracy. We evaluate the impact of change detection on prediction accuracy and show how CONDESA can be used for choosing an adequate prediction horizon. We demonstrate how adaptive prediction can be used for sizing a server system. Finally, we show how prediction models, change detection strategies and data placement policies can be combined and compared based on server utilization, load balance, data locality, and over- and underprovisioning.

keywords

  • adaptive control; adaptive data distribution strategies; adaptive data grouping; adaptive data placement; adaptive system sizing; data centers; data distribution control; data generation; data layout; data locality; elastic server architectures; input-output infrastructures; load balance; prediction horizon; resource allocation policies; server provisioning; server utilization; workload prediction; workload variability