I have recently been taking the Cloud Computing Specialization MCS course on Coursera, both for fun and to gain some breadth in distributed systems.
One thing I recently learned about is Apache Storm, which is a distributed stream processing framework. At first glance, I wondered how it differs from the popular Apache Spark, so I did a little research and found the two comparison charts from here to be quite useful.
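So basically, the major difference seems to come from their fundamental architectures: Storm processes records one at a time as they arrive, whereas Spark is at heart a batch engine (typically reading from storage such as HDFS) that handles a stream as a series of small batches, which means it can also be used for plain batch processing.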
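To make the architectural contrast a bit more concrete, here is a toy sketch in plain Python. It does not use Storm or Spark at all; the function names (`process_per_record`, `process_micro_batches`) are made up for illustration, and grouping by a fixed record count is a simplifying assumption (Spark Streaming groups records by time interval). It only shows the general idea of per-record processing versus micro-batching.

```python
from typing import Callable, Iterable, Iterator, List


def process_per_record(stream: Iterable[int],
                       handle: Callable[[int], int]) -> Iterator[int]:
    """Storm-like idea: handle every record as soon as it arrives."""
    for record in stream:
        yield handle(record)


def process_micro_batches(stream: Iterable[int],
                          batch_size: int,
                          handle_batch: Callable[[List[int]], int]) -> Iterator[int]:
    """Spark-Streaming-like idea: buffer records, then process each small batch.

    Assumption: batches are cut by record count here, only to keep the toy simple.
    """
    batch: List[int] = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield handle_batch(batch)
            batch = []
    if batch:  # flush the last, possibly smaller, batch
        yield handle_batch(batch)


events = [1, 2, 3, 4, 5]

# One output per incoming record.
per_record = list(process_per_record(events, lambda x: x * 2))

# One output per batch of up to two records.
per_batch = list(process_micro_batches(events, 2, sum))

print(per_record)  # [2, 4, 6, 8, 10]
print(per_batch)   # [3, 7, 5]
```

The per-record version produces a result as soon as each event shows up, while the micro-batch version waits until a batch is full, which trades a little latency for the ability to reuse batch-style operations.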
(Interesting fact: the Coursera course was developed in 2014, around the time Spark had only just reached its 1.0 release; perhaps not a coincidence that such a powerful framework was not mentioned 🙂 )