The demand for big data analysis makes centralized proprietary tools unsuitable for some purposes. Cloudera offers a Hadoop distribution for processing large data volumes in big data environments. Cloudera effectively combines clusters of hundreds of computers into a single data processing environment, which acts as a "data center operating system". Among the components of the Cloudera Hadoop distribution are HDFS, a distributed file system; YARN, a cluster resource manager; Spark, an in-memory data processing framework; Spark Streaming, for analysis of streaming data; Impala, a distributed SQL query engine; and Cloudera Manager, a graphical console for cluster management. As its vendor's slogan puts it, Cloudera makes it possible to ask bigger questions.
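To make the component list concrete, the sketch below shows how several of these pieces fit together in practice: a PySpark job reads a text file from HDFS, counts words across the cluster in memory, and writes the result back to HDFS, with YARN brokering the resources. This is an illustrative sketch, not runnable standalone: it assumes a configured Cloudera cluster, and the HDFS paths (`/user/demo/...`) and application name are hypothetical.

```python
# Sketch of a PySpark word-count job on a Cloudera Hadoop cluster.
# Assumes a working cluster; paths and names below are examples only.
from pyspark.sql import SparkSession

# The session is typically submitted to the cluster via spark-submit,
# with YARN acting as the resource manager ("master yarn").
spark = SparkSession.builder \
    .appName("wordcount-demo") \
    .getOrCreate()

# Read input from HDFS, the distributed file system of the cluster.
lines = spark.sparkContext.textFile("hdfs:///user/demo/input.txt")

# Classic word count, computed in memory across the cluster's nodes.
counts = (lines
          .flatMap(lambda line: line.split())   # split lines into words
          .map(lambda word: (word, 1))          # pair each word with 1
          .reduceByKey(lambda a, b: a + b))     # sum counts per word

# Write the result back to HDFS, where tools such as Impala
# could later query it via SQL.
counts.saveAsTextFile("hdfs:///user/demo/output")

spark.stop()
```

Submitting such a job with `spark-submit --master yarn wordcount.py` lets YARN allocate containers on the cluster, while Cloudera Manager can be used to monitor the running application.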