Getting Started
Quick Installation
The system is portable and OS-independent, so installation amounts to downloading the files and running them.
Steps:
1. Download the required files from our download page.
2. Start Solr.
3. Prepare the input file in the specified format.
4. Run the program with the jar file. Check out the example.
That is it! Everything is bundled to keep the process as simple as possible.
Start Solr
First you need to start Solr. Go to the Solr folder on your computer. On Linux and Mac, use the command:
bin/solr start -e cloud -noprompt
On Windows, use:
bin/solr.cmd start -e cloud -noprompt
After your job is finished, you can stop Solr on Linux and Mac with the command:
bin/solr stop -all
And on Windows, with:
bin/solr.cmd stop -all
Input File Format
Currently only the CSV format is supported, with the following headers: Latitude, Longitude, Address 1, Address 2, City, County, State, Zip code, Country. Every address in the input file must meet the four requirements listed below.
1. Must have a meaningful building number.
2. Must have a zip code.
3. Must be in New York State.
4. Must have a street name.
When an address has two parts, put the place name in the Address 1 column and the part with the street name in the Address 2 column. Examples of correct input format:
Latitude | Longitude | Address1 | Address2 | city | county | state | zipcode | country |
---|---|---|---|---|---|---|---|---|
42.651895 | -73.764145 | 339 Hamilton St | | Albany | | NY | 12210 | USA |
 | | Strong Memorial Hospital | 601 Elmwood Ave | Rochester | | NY | 14642 | USA |
Sample input files are available on our download page; check them out.
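The four requirements above can be checked before a run with a small script. The following is a minimal sketch and not part of the distributed tool. The function name check_input is hypothetical, and the sketch assumes nine comma-separated columns in the documented order, no quoted commas inside fields, the state written as NY, and a five-digit zip code (optionally followed by a four-digit extension). The building-number test is only a heuristic that looks for leading digits in the street line.

```shell
#!/bin/sh
# check_input FILE: print every row that appears to break one of the four
# input requirements; exits with status 1 if any such row is found.
check_input() {
  awk -F',' '
    # Strip leading and trailing blanks (and a trailing carriage return).
    function trim(s) { gsub(/^[ \t\r]+|[ \t\r]+$/, "", s); return s }

    NR == 1 { next }            # skip the header row
    /^[ \t\r]*$/ { next }       # skip blank lines

    NF != 9 {
      printf "row %d: expected 9 columns, found %d\n", NR, NF
      bad = 1
      next
    }

    {
      addr1 = trim($3); addr2 = trim($4)
      state = trim($7); zip = trim($8)

      # For a two-part address the street line is in Address 2.
      street = (addr2 != "") ? addr2 : addr1

      # Heuristic: digits, an optional letter, a space, then the street name.
      if (street !~ /^[0-9]+[A-Za-z]? +[^ ]/) {
        printf "row %d: missing building number or street name\n", NR
        bad = 1
      }
      if (zip !~ /^[0-9][0-9][0-9][0-9][0-9](-[0-9][0-9][0-9][0-9])?$/) {
        printf "row %d: missing or malformed zip code\n", NR
        bad = 1
      }
      if (toupper(state) != "NY") {
        printf "row %d: state is not NY\n", NR
        bad = 1
      }
    }

    END { exit (bad ? 1 : 0) }
  ' "$1"
}

# Usage (addresses.csv is a placeholder name):
#   check_input addresses.csv && echo "input looks fine"
```

Rows are reported by their line number in the file, so row 2 is the first address after the header.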
Additional Parameters
1. Mode
This program works in two modes.
1. Comparison Mode. Measures the accuracy of the system against a list of addresses whose geolocations are given as the ground truth.
2. Searching Mode (default). Extracts geolocations for a list of addresses; this is the most common mode.
2. NumberOfThreads
Specifies the number of additional threads the program may use, which helps improve performance. The minimum is one, the default is 8, and the recommended number is two times the number of virtual CPU cores minus one.
3. Max
Number of rows, counted from the start of the input file, to geocode.
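The recommended thread count described under NumberOfThreads (two times the virtual cores minus one) can be computed on the machine that will run the program. This is a minimal sketch, assuming a POSIX shell on Linux or macOS where getconf _NPROCESSORS_ONLN reports the number of logical cores. The function name recommended_threads is hypothetical.

```shell
#!/bin/sh
# recommended_threads CORES: apply the "two times virtual cores minus one" rule.
recommended_threads() {
  echo $(( 2 * $1 - 1 ))
}

# Number of logical cores; falls back to 1 if getconf cannot report it.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

echo "suggested thread count: $(recommended_threads "$cores")"
```

With a single core the rule gives one, which matches the documented minimum.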
Example
The first parameter must always be the path to the input file.
java -jar EaserGeocoder_VX.XX.jar [inputfile]
java -jar EaserGeocoder_VX.XX.jar [inputfile] [outputfile]
The program geocodes the addresses in [inputfile] and writes the results to [outputfile]. If [outputfile] is omitted, the output file is named “output_[inputfile]_[date]” and placed in the “output” folder next to the jar file. If this folder does not exist, the program creates it automatically.
The jar file accepts the following optional parameters, listed with their default values.
Name | Description | Default Value | Parameter Syntax |
---|---|---|---|
Max | Number of rows of the input file to geocode | Until the end of the file | -m |
Mode | Specifying the output format | 2 | -o |
Threads | Number of additional active threads | 15 | -t |
An example of running the program with all additional parameters:
java -jar EaserGeocoder_VX.XX.jar [inputfile] [outputfile] -m 2000 -t 10 -o 2
This means the first 2000 addresses in [inputfile] will be geocoded using 10 additional threads, and the output will be generated in mode 2, which only extracts geolocations.
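The whole run (start Solr, geocode, stop Solr) can be wrapped in one script for Linux and Mac. This is a minimal sketch that uses only the commands shown in this guide. The function names run and geocode_file, the variables SOLR_DIR, GEOCODER_JAR and DRY_RUN, and the default ./solr location are assumptions for illustration. The jar name keeps the VX.XX placeholder, so set it to the file you actually downloaded.

```shell
#!/bin/sh
# Hypothetical locations; adjust them to where you unpacked the download.
SOLR_DIR=${SOLR_DIR:-./solr}
GEOCODER_JAR=${GEOCODER_JAR:-EaserGeocoder_VX.XX.jar}

# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# geocode_file INPUT OUTPUT MAX THREADS
# Starts Solr, geocodes the first MAX rows in searching mode (-o 2),
# then stops Solr even if the geocoder failed.
geocode_file() {
  run "$SOLR_DIR/bin/solr" start -e cloud -noprompt || return 1
  run java -jar "$GEOCODER_JAR" "$1" "$2" -m "$3" -t "$4" -o 2
  status=$?
  run "$SOLR_DIR/bin/solr" stop -all
  return "$status"
}

# Usage (file names are placeholders):
#   DRY_RUN=1 geocode_file input.csv output.csv 2000 10   # print the commands only
#   geocode_file input.csv output.csv 2000 10             # run them
```

Stopping Solr after the geocoder regardless of its exit status avoids leaving Solr running after a failed job.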
Troubleshooting
This section will be extended as common errors are reported.
1. Cannot start Solr
- Solr is not OS-independent; make sure the right command (solr or solr.cmd) is used.
- If you get an exception saying that Solr did not start within 30 seconds the first time you start Solr on your computer, stop Solr and start it again; the exception should clear.
- If the exception persists, download Apache Solr for your system, then replace the “example/cloud/” folder of our Solr with the new version you have downloaded.
If you run into any other problem, please feel free to contact us; we would be more than happy to help.