Hadoop
Hadoop is a popular open-source framework for distributed storage and distributed processing of large volumes of data, especially where the processing follows the MapReduce model. Unlike traditional cluster resources in MetaCentrum, which assume that data are staged in just before processing, the Hadoop environment is intended for mid- to long-term storage and collection of the data to be processed.
In addition to a large native Hadoop cluster, which can be extended on demand with additional virtual nodes, MetaCentrum also offers purely virtual Hadoop PaaS clusters for prototyping your solutions before large-scale deployment.
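For readers new to the MapReduce model mentioned above, the following is a minimal single-machine sketch of its three phases (map, shuffle, reduce) applied to word counting. It does not use Hadoop itself; the function names and the sample input are hypothetical and serve only as illustration. On a real cluster the framework distributes these phases across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    sample = ["big data on Hadoop", "Hadoop stores big data"]
    print(word_count(sample))
```

The map and reduce steps are kept as separate pure functions because that separation is what allows a framework to run them in parallel on different parts of the data.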
How to Enroll
The service is intended for MetaCentrum users and registration is subject to approval:
- Register with MetaCentrum (unless you are already a member).
- Apply for access to Hadoop.
- Log in to the Hadoop frontend at hador.ics.muni.cz over SSH.
User Documentation and Support
- User guides and tips at the MetaWiki.
- Support by e-mail.
Current Hardware Resources
- 27 × 16 cores
- 27 × 128 GB of RAM
- 1 PB of storage in HDFS
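As a rough aid for sizing jobs, the figures above can be turned into cluster-wide totals. The sketch below assumes that the list describes 27 nodes, each with 16 cores and 128 GB of RAM; that reading and the names used in the code are assumptions, not statements from the service documentation.

```python
# Assumed reading of the hardware list: 27 nodes, 16 cores and 128 GB RAM each.
NODES = 27
CORES_PER_NODE = 16
RAM_GB_PER_NODE = 128


def cluster_totals(nodes, cores_per_node, ram_gb_per_node):
    """Return aggregate cores, aggregate RAM and RAM available per core."""
    return {
        "total_cores": nodes * cores_per_node,
        "total_ram_gb": nodes * ram_gb_per_node,
        "ram_gb_per_core": ram_gb_per_node / cores_per_node,
    }


if __name__ == "__main__":
    for name, value in cluster_totals(NODES, CORES_PER_NODE, RAM_GB_PER_NODE).items():
        print(f"{name}: {value}")
```

Under that assumption the cluster provides 432 cores and 3456 GB of RAM in total, or 8 GB of RAM per core.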