Research

ALOJA is a BSC-MSRC research initiative to produce mechanisms for the automated characterization of the cost-effectiveness of Hadoop deployments. Its initial results are summarized below.

Although Hadoop has become the de facto platform for Big Data deployments in recent years, little is understood about how the different layers of software and hardware deployment options affect its performance.

Early ALOJA findings show that Hadoop's runtime performance, and therefore its price, is critically affected by relatively simple software and hardware configuration choices, e.g., the number of mappers, compression, or volume configuration.
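The link between runtime and price can be made concrete with a small sketch. It assumes the simplest possible cost model, in which the cost of a run is its runtime multiplied by the hourly price of the cluster. The `Run` class, the configuration labels, and all figures are hypothetical illustrations, not ALOJA data or ALOJA's actual cost model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One benchmark execution under a given configuration (hypothetical)."""
    label: str
    runtime_s: float             # wall-clock runtime in seconds
    cluster_usd_per_hour: float  # assumed hourly price of the whole cluster

    @property
    def cost_usd(self) -> float:
        # Assumed cost model: runtime in hours times hourly cluster price.
        return self.runtime_s / 3600.0 * self.cluster_usd_per_hour


def cheapest(runs):
    """Return the run with the lowest cost under the assumed model."""
    return min(runs, key=lambda r: r.cost_usd)


# Made-up figures, for illustration only.
runs = [
    Run("baseline", 3600.0, 4.0),
    Run("compression on", 2700.0, 4.0),
    Run("larger VMs", 1800.0, 10.0),
]

for r in runs:
    print(f"{r.label}: {r.runtime_s:.0f} s, ${r.cost_usd:.2f}")
print("most cost-effective:", cheapest(runs).label)
```

In this made-up data the fastest configuration is not the cheapest one, which is why runtime and price need to be evaluated together.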

ALOJA presents a vendor-neutral repository (hadoop.bsc.es) featuring thousands of Hadoop runs, a test bed, and tools to evaluate the cost-effectiveness of different hardware, parameter tuning, and Cloud services for Hadoop.

As few organizations have the time or the performance-profiling expertise, we expect our growing repository to help Hadoop customers meet their Big Data application needs.
ALOJA seeks to provide both knowledge and an online service with which users can make better-informed configuration choices for their Hadoop compute infrastructure, whether on-premise or cloud-based.

