A novel approach for large-scale environmental data partitioning on cloud and on-premises storage for compute continuum applications

publication date

  • November 2023

issue

  • 25

volume

  • 35

abstract

  • Cloud-based services have proved useful in several research fields, such as engineering, health science, and astrophysics. The computational environmental science community has developed a strong need for cloud facilities to store, process, and manage data from observations and from numerical models used for simulations and forecasts. Weather forecast models and global sensor networks deal with multidimensional geo-referenced data sets. However, applications that consume environmental data usually require only a relatively small slice of the multidimensional input data to analyze a specific area or time interval. Hence, reducing the data dimension for information retrieval is mandatory. This paper presents a twofold solution: a technique to load and retrieve sliced multidimensional data sets on different cloud services such as Amazon Web Services (AWS), Google Cloud Platform, and Microsoft Azure. Experiments performed on these cloud services show that the proposed method can significantly speed up loading and retrieving data slices compared to working with the entire data set in bulk or through an OPeNDAP server.

subjects

  • Computer Science

keywords

  • data decomposition; environmental data; multidimensional georeferenced data; object cloud storage