Table 4 Comparison among popular non-spatial DB frameworks

From: A comparative study of big data use in Egyptian agriculture

| Features | Hadoop | Spark |
| --- | --- | --- |
| Processing type | Batch | Hybrid |
| Computing cluster architecture | YARN | YARN and Mesos |
| Data flow | MapReduce data flow | A queue of RDDs, called a DStream, processed one at a time using micro-batching |
| Data processing model | MapReduce | Exactly-once |
| Fault tolerance | Yes | Yes (using lineage) |
| Latency | Low | High |
| Scalability | Yes | Yes (user demand) |
| Back-pressure mechanism | No | Yes |
| Programming languages | Mostly Java | APIs for Scala, Java, Python, and R |
| Support for machine learning | Yes | Yes (Spark MLlib) |
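
The Spark data flow entry in the table (a queue of batches processed one at a time) can be illustrated with a minimal plain-Python sketch. It does not use Spark itself: the names `micro_batches` and `process_stream` are hypothetical, and batches are cut by record count here as a simplifying assumption, whereas Spark Streaming cuts them by time interval.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


def micro_batches(records: Iterable[int], batch_size: int) -> Deque[List[int]]:
    """Group an incoming record stream into a queue of small batches.

    Assumption: batches are cut by record count, not by time interval.
    """
    queue: Deque[List[int]] = deque()
    batch: List[int] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            queue.append(batch)
            batch = []
    if batch:  # keep the final, possibly shorter, batch
        queue.append(batch)
    return queue


def process_stream(queue: Deque[List[int]],
                   batch_fn: Callable[[List[int]], int]) -> List[int]:
    """Apply batch_fn to each queued batch, one batch at a time, in arrival order."""
    results: List[int] = []
    while queue:
        results.append(batch_fn(queue.popleft()))
    return results


# Seven records, batches of three: [1, 2, 3], [4, 5, 6], [7]
totals = process_stream(micro_batches(range(1, 8), batch_size=3), sum)
print(totals)  # [6, 15, 7]
```

Each batch is handled as a small batch job, which is why the table lists Spark's processing type as hybrid: the same batch-style function serves both stored and streaming data.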