E2Data develops a novel Big Data software stack that helps practitioners exploit the diverse, heterogeneous hardware resources available to them transparently and efficiently, without requiring programmers to re-engineer their software.

Today, data is typically streamed from the local network or edge devices to a cloud provider, where a customer rents resources to process it. The Big Data software stack, in an application- and hardware-agnostic manner, splits the execution stream into multiple tasks and sends them for processing on the nodes the customer has paid for. If the outcome does not meet a strict three-second business requirement, the customer has three options:

  • Scale-up (by upgrading processors at node level),
  • Scale-out (by adding nodes to their clusters), or
  • Manually implement code optimizations specific to the underlying hardware.

E2Data proposes an end-to-end solution for Big Data deployments that will fully exploit and advance the state of the art in infrastructure services by delivering a performance increase while using fewer cloud resources.

E2Data will provide a new Big Data paradigm that combines state-of-the-art software components to achieve maximum resource utilization in heterogeneous cloud deployments without affecting current programming norms (i.e., no code changes to the original source).

The evaluation will be conducted on both high-performance x86 and low-power ARM cluster architectures, using four resource-demanding applications from the finance, health, green buildings, and security domains that represent realistic execution scenarios of real-world deployments.