Recently I was working on some data mining tasks using MapReduce, and I needed to set up Hadoop on my Mac. I tried several tutorials, and I want to share my experience so that others can avoid the same mistakes.
Do not attempt this one! It was recommended by my instructor, but it caused many different errors, including the namenode failing to start, SSH failures, and being unable to install the application jar into Hadoop. It was also very outdated.
This one mostly works, but it does have a few gotchas:
1. When you create a new SSH key, make sure to use ssh-add to load it into the SSH agent so you don’t have to type your passphrase every time.
2. After installing, the file explorer web UI can’t create files or folders because of a permission issue. If it’s your personal computer, just run hdfs dfs -chmod 777 /YOUR_HADOOP_DIR to make it work.
3. Specifically on a Mac, whose filesystem is case-insensitive by default, simply running hadoop jar might give you an error like:
Exception in thread "main" java.io.IOException: Mkdirs failed to create //META-INF/license
because a case-insensitive filesystem can’t hold both LICENSE and license at the same path when the jar is unpacked. To solve this, follow the instructions here.
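To check whether a jar is affected before running it, a small script can list the entries whose paths differ only by case. This is a minimal sketch and not the linked instructions: the function names `find_case_collisions` and `drop_entries` are made up for illustration, and which entry is safe to drop (for example `META-INF/LICENSE`) is an assumption you should verify for your own jar.

```python
import zipfile
from collections import defaultdict


def find_case_collisions(jar_path):
    """Return groups of paths inside the jar that differ only by case."""
    spellings = defaultdict(set)
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            # Check every path prefix, so a directory that only appears
            # implicitly (as the parent of a file) is still compared
            # against files with the same name in a different case.
            parts = name.rstrip("/").split("/")
            for depth in range(1, len(parts) + 1):
                path = "/".join(parts[:depth])
                spellings[path.lower()].add(path)
    return sorted(sorted(group) for group in spellings.values() if len(group) > 1)


def drop_entries(src_jar, dst_jar, unwanted):
    """Copy src_jar to dst_jar, leaving out the entries named in unwanted."""
    unwanted = set(unwanted)
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename not in unwanted:
                dst.writestr(info, src.read(info.filename))
```

With a hypothetical jar named `app.jar`, calling `find_case_collisions("app.jar")` shows which paths clash, and `drop_entries("app.jar", "app-fixed.jar", ["META-INF/LICENSE"])` writes a copy without the chosen entry. The original jar is left untouched, so you can compare the two before running anything.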
Also, there is a quite useful link for debugging common issues here.