MAMBA DSBA Hadoop User Notes

DSBA Hadoop runs Cloudera’s Distribution of Hadoop (CDH) 5.14 on top of Red Hat Enterprise Linux 7.2. Here are a few details regarding our implementation of CDH 5.14 on DSBA Hadoop.


Currently, only classes are added to the MAMBA Education environment. If you are unsure whether your class has access to MAMBA, please contact your TA or instructor.

Users must have already registered for DUO authentication. You can follow these instructions to do so:

dsba-hadoop is the MAMBA EDGE node (interactive host) and HUE server for the cluster. You may use SSH to log into dsba-hadoop, and you can use SCP or SFTP to transfer files to and from this host. Please use your NINERNET USERNAME and PASSWORD to log in.
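The SSH and SCP steps above can be sketched as a small Python helper that builds the command lines. This is only an illustration: the short host name "dsba-hadoop" is copied from these notes (your network may require a fully qualified name), and the username "jdoe" and the file names are hypothetical placeholders.

```python
import shlex

# Assumption: the short host name is taken from these notes; a fully
# qualified name may be required outside the cluster network.
EDGE_HOST = "dsba-hadoop"


def ssh_command(username, host=EDGE_HOST):
    """Return the argument list for an interactive SSH login."""
    return ["ssh", f"{username}@{host}"]


def scp_upload_command(local_path, username, remote_path, host=EDGE_HOST):
    """Return the argument list that copies a local file to the edge node."""
    return ["scp", local_path, f"{username}@{host}:{remote_path}"]


def scp_download_command(remote_path, username, local_path, host=EDGE_HOST):
    """Return the argument list that copies a file from the edge node."""
    return ["scp", f"{username}@{host}:{remote_path}", local_path]


def as_shell(command):
    """Render an argument list as a single, safely quoted shell line."""
    return " ".join(shlex.quote(part) for part in command)


if __name__ == "__main__":
    user = "jdoe"  # hypothetical NinerNET username
    print(as_shell(ssh_command(user)))
    print(as_shell(scp_upload_command("data.csv", user, "data.csv")))
    print(as_shell(scp_download_command("results.csv", user, ".")))
```

The printed lines can be pasted into a terminal. Each one will prompt for your NinerNET password and DUO confirmation as usual.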


The HDFS volume for DSBA Hadoop is 130 TB and is accessible through the "hadoop fs" (or "hdfs dfs") commands. Your HDFS home directory is /user/<username>, and it has a quota limit of 2 TB. HDFS is NOT BACKED UP. It is intended for computation on the Hadoop cluster, not for long-term storage, so please do not store anything in HDFS that you have not backed up to your NFS storage or somewhere else.
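As a rough sketch of everyday use, the Python helper below builds "hdfs dfs" command lines for copying data into your HDFS home directory, copying results back out (for example, to your NFS storage), and checking usage against the quota. The username "jdoe" and the file names are hypothetical; the subcommands used (-put, -get, -du, -count -q) are standard Hadoop file system shell options, and the commands are meant to be run on the edge node.

```python
import shlex


def hdfs_home(username):
    """Return the HDFS home directory described in these notes."""
    return f"/user/{username}"


def hdfs_command(*args):
    """Return an 'hdfs dfs' argument list ('hadoop fs' is equivalent)."""
    return ["hdfs", "dfs", *args]


def upload(local_path, username):
    """Copy a local file or directory into the user's HDFS home."""
    return hdfs_command("-put", local_path, hdfs_home(username))


def download(hdfs_path, local_dir):
    """Copy a file or directory out of HDFS, e.g. to back it up."""
    return hdfs_command("-get", hdfs_path, local_dir)


def usage(username):
    """Summarize how much space the HDFS home currently uses."""
    return hdfs_command("-du", "-s", "-h", hdfs_home(username))


def quota(username):
    """Report quota and remaining quota for the HDFS home."""
    return hdfs_command("-count", "-q", hdfs_home(username))


if __name__ == "__main__":
    user = "jdoe"  # hypothetical username
    commands = (
        upload("input.csv", user),
        download(hdfs_home(user) + "/output", "."),
        usage(user),
        quota(user),
    )
    for command in commands:
        print(" ".join(shlex.quote(part) for part in command))
```

Because HDFS is not backed up, it is worth running the download step as soon as a job finishes rather than leaving results in HDFS.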


The Hadoop URLs (below) are NOT directly accessible from the campus network. They are only accessible from the DSBA Hadoop cluster's internal network. In order to navigate to the following URLs, you must run Firefox from the MAMBA interactive (EDGE) node, using X11 forwarding:

Namenode Info
Yarn/MapRed Info
MapRed Job History
Spark Job History
Spark2 Job History
HBase Info
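One way to reach the pages listed above is to start Firefox on the edge node over an SSH session with X11 forwarding enabled. The sketch below builds such a command; it assumes an X server is running on your local machine, uses the short host name "dsba-hadoop" from these notes, and takes the page address as an argument because the actual URLs are not reproduced here. The username and the URL in the example are placeholders, not real addresses.

```python
import shlex


def firefox_over_x11(username, url, host="dsba-hadoop"):
    """Return an SSH command that runs Firefox on the edge node.

    The -X flag asks ssh to forward X11, so the browser window is
    displayed on the local machine while it runs on the remote host.
    """
    return ["ssh", "-X", f"{username}@{host}", "firefox", url]


if __name__ == "__main__":
    # Hypothetical placeholders: substitute your username and the
    # address of the page you want to open.
    command = firefox_over_x11("jdoe", "http://CLUSTER_HOST:PORT/")
    print(" ".join(shlex.quote(part) for part in command))
```

The browser runs on the cluster's internal network, which is why it can open addresses that are not reachable from the campus network.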