Apache SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams. Big data is defined as datasets whose size is beyond the ability of typical software tools to capture, store, manage, and analyze, due to time and memory complexity. Velocity is one of the main properties of big data. Apache SAMOA provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks, such as classification, clustering, and regression, as well as programming abstractions for developing new algorithms. It features a pluggable architecture that allows it to run on several distributed stream processing engines, such as Apache Flink, Apache Storm, Apache Samza, and Apache Apex. Apache SAMOA is written in Java and is available at https://samoa.incubator.apache.org/ under the Apache Software License version 2.0.