Hacker News Clone

Benchmarking Streaming Computation Engines at Yahoo

by YAFZ on 12/18/2015, 1:20:58 PM with 3 comments

by kod on 12/18/2015, 3:25:36 PM
If I'm reading this correctly, the Kafka topic only had 5 partitions, but they had 10 workers.
With the Spark direct stream, Kafka partitions map 1:1 to Spark partitions, which means that without a shuffle at most half of the workers would be doing any work.
Seems like a pretty basic oversight that should be addressed.
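A rough sketch of that arithmetic, assuming each Kafka partition becomes exactly one Spark task per batch and nothing repartitions the stream. `busy_workers` is a made-up helper for illustration, not a Spark or Kafka API.

```python
def busy_workers(kafka_partitions: int, workers: int) -> int:
    """Upper bound on workers that can hold a task at the same time.

    Assumption: one Kafka partition -> one Spark partition -> one task,
    and no shuffle/repartition happens before the work is done.
    """
    return min(kafka_partitions, workers)


# Numbers as read from the benchmark write-up: 5 partitions, 10 workers.
partitions, workers = 5, 10
busy = busy_workers(partitions, workers)
print(f"{busy} of {workers} workers busy ({busy / workers:.0%})")
# prints "5 of 10 workers busy (50%)"
```

With at least as many partitions as workers (say 10 or 20), the same bound reaches all 10 workers, which is why the partition count matters here.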

by estefan on 12/18/2015, 2:58:50 PM
This is the first mention I've seen of Flink on HN.