The advance of Big Data analytics and the need for real-time results in application environments such as IoT are driving the need for new approaches to storage. An example of this trend can be seen in the Alluxio in-memory file system. Alluxio is available as a free open source download. More robust versions are also available at cost from Alluxio. These include a Community Edition that adds a management interface and an Enterprise Edition that offers Kerberos security authentication, data replication and support.
In 2012, researchers at the UC Berkeley AMPLab open-sourced a memory-centric, fault-tolerant virtual distributed storage system called Tachyon. Because of its memory-centric design, it found early acceptance when coupled with Big Data analytics platforms such as Apache Spark and Storm, which are built to deliver real-time or near real-time results. The project was later renamed Alluxio.
With Alluxio, working data sets are loaded into Alluxio’s in-memory file system, where they can be accessed simultaneously and at memory speed by multiple applications. Alluxio’s tiered storage framework pre-loads data into distributed cluster memory from a unified, persistent storage layer. Accessed through an API, the long-term persistent storage layer can include both local stores (SSD, disk array, etc.) and distributed file stores, including the Hadoop Distributed File System (HDFS), Amazon S3, and Swift object stores. The pre-loading process can occur automatically or be done manually. When it is automated, the user defines policies for data allocation and eviction.
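To make the idea of a memory tier with an eviction policy concrete, here is a minimal Python sketch. It is not Alluxio's API or implementation: the class name `MemoryTier`, its methods, and the choice of a least-recently-used (LRU) policy are all assumptions made for illustration, and a plain dictionary stands in for the persistent storage layer.

```python
from collections import OrderedDict


class MemoryTier:
    """Toy read-through cache: a bounded memory tier in front of a persistent store.

    Hypothetical illustration only; this is not how Alluxio is implemented.
    """

    def __init__(self, under_store, capacity):
        self.under_store = under_store   # stands in for HDFS, S3, Swift, or local disk
        self.capacity = capacity         # maximum number of blocks held in memory
        self.blocks = OrderedDict()      # least recently used block comes first

    def preload(self, keys):
        """Manually load blocks into memory ahead of any read."""
        for key in keys:
            self._admit(key, self.under_store[key])

    def read(self, key):
        """Serve from memory when possible; otherwise fetch from the under store."""
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        value = self.under_store[key]
        self._admit(key, value)
        return value

    def _admit(self, key, value):
        """Place a block in memory, evicting LRU blocks if over capacity."""
        self.blocks[key] = value
        self.blocks.move_to_end(key)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block


store = {"a": b"1", "b": b"2", "c": b"3"}
tier = MemoryTier(store, capacity=2)
tier.preload(["a", "b"])   # manual pre-loading
tier.read("a")             # hit: "a" becomes most recently used
tier.read("c")             # miss: loads "c" and evicts "b"
resident = list(tier.blocks)
```

In this sketch, `preload` corresponds to manual pre-loading, while the capacity limit and LRU rule play the role of a user-defined eviction policy. A real deployment would choose its policy and tier sizes through the product's own configuration.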
We believe that the following combination of attributes makes Alluxio unique: