Electronic International Standard Serial Number (EISSN)
1872-7115
abstract
Typical big-data infrastructure comprises multiple machines, with data accessed from different remote locations through request-response patterns. Currently, most state-of-the-art remote invocation techniques focus on models for distributed interactions and have not explored the advantages of parallel computing, such as those available when running on distributed stream processors. In this context, the article defines a predictable remote procedure call (RPC) able to take advantage of distributed stream processing technology. Potentially, this type of infrastructure enables efficient parallel computations, reducing the effective response time of end-to-end invocations linearly with the number of resources assigned to the system as the data volume increases. The article describes a predictable model that defines maximum response times for different types of applications described in the context of Apache Storm. The evaluation also offers clues on the performance one may expect from this type of infrastructure.