Michael's Wiki

Data processing technologies include programming languages, the tools implemented within them, as well as how these tools are used in larger systems. They include tasks ranging from simple ETL or pipeline management to complicated machine learning and data mining systems.

First come languages where people have implemented data analysis techniques or designed a language for data manipulation or visualization. Large frameworks have been developed around these to support data scaling, parallelization, and specific types of data processing needs.

Data Analysis Languages

Distinguishing features of analysis languages

Data Processing Platforms


Apache Hadoop is the industry-standard open-source implementation of Google's MapReduce.


Other technologies avoid structured table formats and focus on creating extensible and customizable processing workflows. These are appropriate for streaming/large-quantity data.

Examples include Apache Storm or Microsoft's Azure Stream Analytics.