  
==== Input 3D Data Format ====
**Data Format**.

  * HadoopGIS currently supports the [[https://en.wikipedia.org/wiki/OFF_(file_format)|OFF]] data format.
  
**Data Preparation**. The data that iSPEED accepts or re-processes must have the following properties (a sample record is sketched after this list):
  * Each record is located on a separate line; a record representing a spatial object must be contained in a single line.
  * Each line starts with a unique object ID followed by the object geometry.
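
For illustration only (this sample is not from the original documentation), a record could look like the following: the object ID first, followed by a tetrahedron serialized as a single-line OFF geometry. The exact serialization iSPEED expects may differ.

<code>
1 OFF 4 4 6 0 0 0 1 0 0 0 1 0 0 0 1 3 0 1 2 3 0 1 3 3 1 2 3 3 0 2 3
</code>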
Upload the prepared data to HDFS:

''hdfs dfs -put testdata.off /user/testuser/rawdata1/''
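
For example, a complete upload sequence might look like the following sketch; the local file name and HDFS directory are placeholders carried over from the command above:

<code>
# create the HDFS input directory if it does not already exist
hdfs dfs -mkdir -p /user/testuser/rawdata1
# upload the OFF data file
hdfs dfs -put testdata.off /user/testuser/rawdata1/
# verify the upload
hdfs dfs -ls /user/testuser/rawdata1
</code>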
  
If you do not have data, you can download it from the [[:ispeed:examples|Examples]] page.
  
==== Query Parameters ====
Arguments are passed to iSPEED on the command line. The full list of arguments for the framework manager can be found by executing ''../build/bin/queryprocessor_3d %%--%%help''.
  
The following output will be displayed.
  
<code>
  -o [ --overwrite ]        Overwrite existing hdfs directory.
</code>

==== Parameter Explanation ====

Most of the parameters listed above are self-explanatory. A detailed explanation of the query parameters can be found on the [[:hadoopgis:features|Features]] page.

Here we present a 3D spatial join example. The following code sections show the critical commands for running the spatial join query; the complete script can be downloaded at [[http://bmidb.cs.stonybrook.edu/publicdata/ispeed/run_spatial_join.sh|run_spatial_join]].

The first step is data compression, which is performed by map-only MapReduce jobs.

<code>
#data compression first
../build/bin/queryprocessor_3d -t st_intersects -a your_hdfs_path/3Ddata/spjoin/testdata/d1 -b your_hdfs_path/3Ddata/spjoin/testdata/d2 -h your_hdfs_path/3Ddata/spjoin/testdata/output/ -q spjoin -s 1.0 -n 240 -u fg_3d --bucket 4000 -f tileid,1:1,2:1,intersect_volume -j 1 -i 1 --compression --overwrite
</code>

  * ''-t st_intersects'' is the default predicate for data compression; ''-a'' and ''-b'' specify the paths of the input datasets.
  * The key parameter in this command is ''%%--%%compression'', which enables the data compression operation (a sanity check is sketched after this list).
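
As a quick sanity check (this step is not in the original script), you can list the compression output directory before moving on; the path is the ''-h'' output path from the command above:

<code>
# list the compression job output; exact subdirectory names depend on the iSPEED version
hdfs dfs -ls your_hdfs_path/3Ddata/spjoin/testdata/output/
</code>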

The second step is to combine the compressed data, store it in memory, and share it across all cluster nodes. The object minimum bounding boxes (MBBs) are also combined for data partitioning.

<code>
#combine all binary data and mbb outputs
../build/bin/runcombiner.sh your_hdfs_path/3Ddata/spjoin/testdata/output

#remove intermediate output directories so the join step can write fresh results
hdfs dfs -rm -r your_hdfs_path/3Ddata/spjoin/testdata/output/output_partidx
hdfs dfs -rm -r your_hdfs_path/3Ddata/spjoin/testdata/output/output_joinout
</code>

The third step is to run the actual spatial join.

<code>
#run spatial join
../build/bin/queryprocessor_3d -t st_intersects -a your_hdfs_path/3Ddata/spjoin/testdata/d1 -b your_hdfs_path/3Ddata/spjoin/testdata/d2 -h your_hdfs_path/3Ddata/spjoin/testdata/output/ -q spjoin -s 1.0 -n 240 -u fg_3d --bucket 4000 -f tileid,1:1,2:1 -j 1 -i 1 --spatialproc --decomplod 100 --overwrite
</code>

  * ''-t st_intersects'' is the predicate for the spatial join query; ''-a'' and ''-b'' specify the paths of the input datasets, and ''-h'' specifies the output path.
  * ''-q spjoin'' specifies the spatial query type.
  * ''-s 1.0'' is the data sampling rate for spatial partitioning, ''-u fg_3d'' selects the fixed-grid partitioning method, and ''%%--%%bucket 4000'' is the bucket size for partitioning, so after partitioning each cuboid contains about 4000 objects.
  * ''-n 240'' is the number of reducers for this MapReduce job. This number is adjustable based on the cluster configuration.
  * ''%%--%%spatialproc'' indicates that this is the spatial query step, and ''%%--%%decomplod 100'' specifies the level of detail (LOD) used in the spatial join query. Users can choose different LODs (resolutions) to balance run time and accuracy in practice. A way to inspect the join output is sketched after this list.
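
To inspect the join results, you can read the output back from HDFS. This is a sketch only: it assumes the join writes to the ''output_joinout'' subdirectory seen in the cleanup step above and uses the standard MapReduce ''part-*'' file layout, neither of which is spelled out in the original documentation:

<code>
# print the first few join records (fields as configured with -f: tileid,1:1,2:1)
hdfs dfs -cat your_hdfs_path/3Ddata/spjoin/testdata/output/output_joinout/part-* | head -n 5
</code>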