Data Uploading

After all the data have been generated and pre-processed, they need to be uploaded into the database. The following steps describe the process in detail.

1. Install Openslide

Openslide is required by our system: we use it to read the content of whole slide images, so it must be installed before running any of the C or Java programs. You can download the latest version online, but version 3.3.3 is highly recommended because some functions we use are deprecated in the latest version.

You can obtain openslide with git clone from the OpenSlide GitHub repository (read its README for setup information) or use the tar file included in the setup folder. First install the required dependencies, such as imagemagick, openjpeg, glib2, cairo and libxml, with the yum install command. Then unzip the tar file, enter the directory, and build and install openslide. The installation commands must be run as the root user or with the sudo prefix. Note that the latest openjpeg causes an unexpected error; please install openjpeg 1.5 from the source code provided in the folder dependency.

 $sudo yum install ImageMagick-devel.x86_64
 $sudo yum install glib2-devel.x86_64
 $sudo yum install cairo-devel.x86_64
 $sudo yum install libxml2-devel.x86_64
 $sudo yum install openjpeg-devel.x86_64
 $tar zxvf openslide.tar.gz
 $cd openslide-3.4.1
 $./configure --prefix=/usr
 $make
 $sudo make install

With the --prefix=/usr option used above, openslide is installed to /usr/lib. If this folder is not already included in LD_LIBRARY_PATH, add the following line to the file /etc/profile:

 export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/lib

Then run the command:

 $source /etc/profile
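
Because /etc/profile can be sourced more than once, the export line may append the same directory repeatedly. The following is a minimal sketch of a guarded alternative, assuming a POSIX shell; the helper name append_lib_path is hypothetical and not part of our system.

```shell
# Hypothetical helper: append a directory to LD_LIBRARY_PATH only if it is missing.
append_lib_path() {
    dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${dir}:"*) ;;   # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Add the openslide library folder used in this guide.
append_lib_path /usr/lib
```

Calling the helper a second time with the same directory leaves the variable unchanged.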

2. Compile and deploy Openslidetools

Openslidetools is a C program we developed to manage whole slide images. It can read the dimensions of a whole slide image, generate thumbnails for the whole image or for a specified region, extract the full-resolution tile for any region, and generate tiles for the image viewer. This tool is required for both data uploading and the web API, so it must be compiled and placed where our system can access it. Make sure that openslide is installed before compiling; if it is not, go back to step 1.

The C source file openslidetoolsbin.c is in the folder setup/lib. Compile it into a binary named openslidetools, then move the binary to a directory listed in PATH; otherwise other programs cannot run it. We recommend the folder /bin, because it requires no additional export lines in /etc/profile. Make sure that the environment variable LD_LIBRARY_PATH contains /usr/lib (the folder added to /etc/profile in step 1); otherwise the openslide library cannot be found. If a library-not-found error occurs when compiling or running openslidetools, run the command source /etc/profile to apply this setting.

 $gcc -o openslidetools openslidetoolsbin.c -I/usr/include/openslide -L/usr/lib -lopenslide
 $sudo mv openslidetools /bin

Then run the following command to check that openslidetools has been compiled and moved correctly.

 $openslidetools

The message “I’m ready” appears if the tool is working. If you see the following error instead:

 openslidetools: error while loading shared libraries: libopenslide.so.0: cannot open shared object file: No such file or directory

You should define LD_LIBRARY_PATH in /etc/profile to remove the error. Add this line to the file /etc/profile:

 export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/lib:/lib

Then run the following command:

 $source /etc/profile
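
To see whether the dynamic loader can now resolve the openslide library, the binary can be inspected with ldd. This is only a sketch, assuming a Linux system where ldd is available; the helper name check_openslide_link is hypothetical.

```shell
# Hypothetical helper: report whether a binary's libopenslide dependency resolves.
# Note: a binary that does not link libopenslide at all is also reported as ok.
check_openslide_link() {
    bin="$(command -v "$1" 2>/dev/null)" || { echo "missing: $1"; return 2; }
    if ldd "$bin" 2>/dev/null | grep 'libopenslide' | grep -q 'not found'; then
        echo "unresolved: libopenslide"
        return 1
    fi
    echo "ok: $bin"
}
```

After step 2, it would be called as check_openslide_link openslidetools. If it reports the library as unresolved, the LD_LIBRARY_PATH setting above has not taken effect in the current shell.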

3. Install Java

If Java is already installed on your system and the JDK version is 1.7 or higher, skip this step. Otherwise, follow the steps below to install Java.
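
The installed version can be checked from the shell before deciding whether to skip this step. The sketch below assumes that java -version prints a line containing version "…" on standard error, as Oracle and OpenJDK builds do; the helper name java_major is hypothetical.

```shell
# Hypothetical helper: turn a Java version string into its major number.
# "1.7.0_80" -> 7 (legacy 1.x scheme), "11.0.2" -> 11 (scheme used since Java 9).
java_major() {
    v="${1#1.}"            # drop a leading "1." if present
    echo "${v%%[._-]*}"    # keep everything before the first ".", "_" or "-"
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to standard error, so redirect it into the pipe.
    ver="$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)"
    if [ "$(java_major "$ver")" -ge 7 ] 2>/dev/null; then
        echo "Java $ver is new enough, step 3 can be skipped"
    else
        echo "Java $ver is too old or could not be parsed, continue with step 3"
    fi
else
    echo "java not found on PATH, continue with step 3"
fi
```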

(1) Download the newest JDK 7 and extract it to /usr/local/java/jdk-version

(2) In the folder /usr/local/java, run the following command to create a soft link:

 $sudo ln -s /usr/local/java/jdk-version /usr/local/java/jdk7

(3) Edit the file /etc/profile to add the following two lines at the end of this file

 export JAVA_HOME=/usr/local/java/jdk7
 export PATH=${JAVA_HOME}/bin:${PATH}

(4) Run the following command if you don't want to log in again.

 $source /etc/profile

4. Upload data

Please check the following list before trying to upload data:

  • db2 is installed
  • db2 spatial extender is installed
  • universal fix pack is applied
  • db is created
  • pais and pi tables are created
  • stored procedures and user defined functions are created
  • openslidetools is compiled and can be executed using the command openslidetools (double-check this before running any program below; reboot or log in again so that the exports in /etc/profile take effect)
  • Java is installed
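
The items in this list that correspond to executables can be checked quickly from the shell. The sketch below only tests that db2, openslidetools and java are found on PATH; it does not verify the database, tables, stored procedures or fix pack. The helper name missing_cmds is hypothetical.

```shell
# Hypothetical helper: print each of the given commands that is not on PATH.
missing_cmds() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || printf '%s\n' "$c"
    done
}

missing="$(missing_cmds db2 openslidetools java)"
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "db2, openslidetools and java are all on PATH"
fi
```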

After confirming that all the items above are satisfied, follow the steps below to upload your data. Please run them as any user other than root, because the root user does not initialize the parameters required from /etc/profile. If you still want to use the root user, run the command source /etc/profile before running the programs below.

(1) Modify the configuration files

After downloading the configuration files, you currently do not need to modify paisidconfig.xml or loadconfig_cell.xml; only the database configuration files paisdbc.xml and imagedbc.xml need to be changed. Two different configuration files are used because the image uploader is implemented with Hibernate and needs a Hibernate-style configuration file. Modifying them is simple: change the hostname, port, database, username and password to match your own local or remote server.

(2) Import the metadata of images into pi schema. Go into the folder uploadtool and run command:

 $java -jar imageloader.jar -dbc config/imagedbc.xml -pc config/paisidconfig.xml -i resource/wsi

(3) Upload the boundary documents into the database. Run the command:

 $java -jar paistools.jar uploader -i resource/boundary -dbc config/paisdbc.xml

(4) Load all uploaded boundaries into the database. Run the following command (here the argument -t is set to 2, which means 2 threads are used). This step is time-consuming.

 $java -jar paistools.jar loadmanager -dbc config/paisdbc.xml -lc config/loadconfig_cell.xml -t 2

(5) After uploading all the images and the boundary data, go to the sql folder to generate the histogram information and upload the patient information. Switch to the db2 instance user and run the following commands:

 $db2 connect to pais
 $db2 -tf gen_histogram.sql
 $db2 -tf load_patient.sql
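
Steps (2) to (4) can also be chained so that a failure in one step stops the later ones. The following is a minimal sketch, not a script shipped with the system: the function names run_step and upload_all and the environment variables DRY_RUN and THREADS are hypothetical, while the java commands are exactly those given in the steps above.

```shell
# Hypothetical wrapper: echo each command, then run it.
# With DRY_RUN=1 the command is only printed, not executed.
run_step() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Run steps (2), (3) and (4) in order; "&&" stops at the first failing step.
# THREADS defaults to 2, the value used in step (4).
upload_all() {
    run_step java -jar imageloader.jar -dbc config/imagedbc.xml -pc config/paisidconfig.xml -i resource/wsi &&
    run_step java -jar paistools.jar uploader -i resource/boundary -dbc config/paisdbc.xml &&
    run_step java -jar paistools.jar loadmanager -dbc config/paisdbc.xml -lc config/loadconfig_cell.xml -t "${THREADS:-2}"
}
```

After defining these functions in the folder uploadtool, setting DRY_RUN=1 and calling upload_all prints the three commands without running them. Step (5) is left out because it must be run as the db2 instance user.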