Hadoop-GIS Installation
Install Hadoop distribution
Please contact your system admin or see the Hadoop installation guide.
Please add or reconfigure the following environment variables so that they point to the correct paths (preferably in your ~/.bashrc file):
HADOOP_INSTALL
HADOOP_HDFS_HOME
YARN_HOME
HADOOP_PREFIX
The following variable is not set by default, so please add it manually:
HADOOP_STREAMING_PATH
(This is not a native Hadoop environment variable. It should point to the directory that contains a symbolic link, named hadoop-streaming.jar, to the Hadoop streaming jar file.)
It is needed because the scripts invoke the Hadoop streaming jar as: hadoop jar ${HADOOP_STREAMING_PATH}/hadoop-streaming.jar –args
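As a rough sketch of what these settings might look like in ~/.bashrc, the snippet below assumes a single Hadoop install directory that all four standard variables point to. The paths /opt/hadoop and $HOME/hadoop-streaming are placeholders, not locations that Hadoop-GIS prescribes, and because the streaming jar lives in different places in different Hadoop distributions, it is searched for rather than hard-coded.

```shell
# Placeholder install location (assumption); change it to match your cluster.
export HADOOP_INSTALL="${HADOOP_INSTALL:-/opt/hadoop}"
# Assumption: a single-directory install, so all variables share one path.
export HADOOP_HDFS_HOME="$HADOOP_INSTALL"
export YARN_HOME="$HADOOP_INSTALL"
export HADOOP_PREFIX="$HADOOP_INSTALL"

# Placeholder directory that will hold the hadoop-streaming.jar symlink.
export HADOOP_STREAMING_PATH="${HADOOP_STREAMING_PATH:-$HOME/hadoop-streaming}"
mkdir -p "$HADOOP_STREAMING_PATH"

# Look for a versioned streaming jar under the install directory.
streaming_jar=$(find "$HADOOP_INSTALL" -name 'hadoop-streaming*.jar' 2>/dev/null | head -n 1) || true

# Create (or refresh) the symlink only when a jar was actually found.
if [ -n "$streaming_jar" ]; then
  ln -sf "$streaming_jar" "$HADOOP_STREAMING_PATH/hadoop-streaming.jar"
else
  echo "No hadoop-streaming jar found under $HADOOP_INSTALL" >&2
fi
```

After opening a new shell, ls -l "$HADOOP_STREAMING_PATH" should show the link and its target.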
Install library dependencies
Below is the list of dependencies that Hadoop-GIS requires. They must be available on all compute (task) nodes. You can install them as the root user or as a non-admin user.
Install the libraries in the following order:
- C-boost 1.x (x > 47)
- g++ 4.x (x >= 4)
- python 3.x (x >= 0)
- GEOS 3.x (x >= 3)
- libspatialindex 1.8.x (x >= 0)
The install/dependencies directory contains several bootstrap scripts for installing the dependencies on different systems and with different user privileges.
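Before installing anything, it can help to see which of the listed minimum versions a node already satisfies. The sketch below is not part of Hadoop-GIS: version_ge and check_tool are hypothetical helper names, the comparison relies on GNU sort -V, and only tools that can report their own version (g++, python3, and GEOS through geos-config) are covered. Boost and libspatialindex would need a separate check.

```shell
# version_ge A B: succeeds when version A >= version B (uses GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM COMMAND...: runs COMMAND to obtain a version string
# and compares it with MINIMUM.
check_tool() {
  name=$1; min=$2; shift 2
  if ! found=$("$@" 2>/dev/null); then
    echo "$name: not found"
    return 1
  fi
  if version_ge "$found" "$min"; then
    echo "$name $found: ok (need >= $min)"
  else
    echo "$name $found: too old (need >= $min)"
    return 1
  fi
}

# "|| true" keeps going so that every dependency gets reported.
check_tool g++    4.4 g++ -dumpversion || true
check_tool python 3.0 python3 -c 'import platform; print(platform.python_version())' || true
check_tool GEOS   3.3 geos-config --version || true
```

Running this on every compute (task) node gives a quick view of which nodes still need which packages.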
Installing as an admin
TBU - After the installation of the above packages on all cluster
Installing as a non-admin user
You can install the above library dependencies into your home directory or a shared directory.
For most packages, run autogen.sh, bootstrap.sh, or ./configure --prefix=your_preferred_path before executing make and make install.
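The configure-based sequence can be wrapped in a small helper, sketched below. build_into_prefix is a hypothetical name and the paths in the usage comment are placeholders. It assumes an autotools-style package that provides ./configure, possibly generated by autogen.sh. Packages that ship bootstrap.sh instead may follow different steps and are not covered here.

```shell
# build_into_prefix SRC_DIR PREFIX
# Configures, builds and installs an autotools-style package into PREFIX.
build_into_prefix() {
  src_dir=$1
  prefix=$2
  (
    # The subshell keeps the caller's working directory unchanged.
    cd "$src_dir" || exit 1
    # Generate ./configure first when the package ships only autogen.sh.
    if [ ! -x ./configure ] && [ -x ./autogen.sh ]; then
      ./autogen.sh || exit 1
    fi
    ./configure --prefix="$prefix" || exit 1
    make || exit 1
    make install
  )
}

# Example call (placeholder paths):
#   build_into_prefix "$HOME/src/some-package" "$HOME/hadoopgis-deps"
```

Using the same prefix for every dependency keeps all headers and libraries under one directory tree.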
Update your $LD_LIBRARY_PATH environment variable with the paths to the dependent libraries.
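A minimal sketch of that update, suitable for ~/.bashrc, is shown below. DEPS_PREFIX and its default $HOME/hadoopgis-deps are placeholders for whatever prefix the libraries were installed into. The case guard avoids appending the same directory again each time the file is sourced.

```shell
# Placeholder prefix (assumption): where the dependencies were installed.
DEPS_PREFIX="${DEPS_PREFIX:-$HOME/hadoopgis-deps}"

# Prepend $DEPS_PREFIX/lib only when it is not already on the path.
case ":${LD_LIBRARY_PATH:-}:" in
  *":$DEPS_PREFIX/lib:"*)
    ;;  # already present, nothing to do
  *)
    export LD_LIBRARY_PATH="$DEPS_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    ;;
esac
```

The ${LD_LIBRARY_PATH:+:...} expansion adds the separating colon only when the variable was already non-empty, so no empty path entry is introduced.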
To test whether dependencies were properly installed,
It is recommended to install them into a common directory.