Abstract

RGB-D scanning of indoor environments is important for many applications, including real estate, interior design, and virtual reality. However, it is still challenging to register RGB-D images from a handheld camera over a long video sequence into a globally consistent 3D model. Current methods can often lose tracking or drift and thus fail to reconstruct salient structures in large environments (e.g., parallel walls in different rooms). To address this problem, we propose a “fine-to-coarse” global registration algorithm that leverages robust registrations at finer scales to seed detection and enforcement of new correspondence and structural constraints at coarser scales. To test global registration algorithms, we provide a benchmark with 10,401 manually clicked point correspondences in 25 scenes from the SUN3D dataset. In experiments with this benchmark, we find that our fine-to-coarse algorithm registers long RGB-D sequences better than previous methods.

Bibtex

@inproceedings{HalberF2C17,
  title     = {Fine-To-Coarse Global Registration of RGB-D Scans},
  author    = {Maciej Halber and Thomas Funkhouser},
  booktitle = {CVPR},
  year      = {2017}
}

Materials

Data

To evaluate the quality of reconstructions, we have extended the SUN3D dataset by Xiao et al. [1] by manually clicking on corresponding points in images from 25 SUN3D scenes. Below we provide the correspondence files, as well as all the SUN3D scenes converted to our format to make results easier to reproduce. Note, however, that we do not distribute the original images; these need to be obtained from the original dataset.

Downloads:

Correspondence Format

Each correspondence file is a simple text file, with 3 commands:

n_correspondences x — Number of correspondences
point2d scan_name_1 scan_name_2 x1 y1 x2 y2 — Location of a point {x1, y1} in image scan_name_1 and location of the corresponding point {x2, y2} in image scan_name_2
point3d scan_name_1 scan_name_2 x1 y1 z1 x2 y2 z2 — 3D location of a point {x1, y1, z1} backprojected using image scan_name_1 and 3D location of the corresponding point {x2, y2, z2} backprojected using image scan_name_2

We provide both the 2D locations clicked in the color images and the backprojected 3D locations, so that depth images do not have to be read in for evaluation.

Example:

n_correspondences 375
point2d SCAN:0000001-000000013044 SCAN:0000051-000001915092 480.241 205.955 477.353 203.068
point2d SCAN:0000001-000000013044 SCAN:0000051-000001915092 480.241 127.038 473.504 124.150
point2d SCAN:0000001-000000013044 SCAN:0000051-000001915092 133.774 172.271 129.925 170.346
point2d SCAN:0000001-000000013044 SCAN:0000051-000001915092 286.797 259.850 300.271 256.962
point2d SCAN:0000006-000000413475 SCAN:0000811-000027275722 474.466 124.150 396.511 169.383
...
point3d SCAN:0000001-000000013044 SCAN:0000051-000001915092  0.463  0.096 -1.647  0.459  0.106 -1.663
point3d SCAN:0000001-000000013044 SCAN:0000051-000001915092  0.470  0.329 -1.671  0.455  0.340 -1.680
point3d SCAN:0000001-000000013044 SCAN:0000051-000001915092 -0.705  0.256 -2.170 -0.720  0.264 -2.170
point3d SCAN:0000001-000000013044 SCAN:0000051-000001915092 -0.036 -0.023 -0.632 -0.021 -0.019 -0.625
point3d SCAN:0000006-000000413475 SCAN:0000811-000027275722  0.455  0.340 -1.680  0.217  0.197 -1.601
...
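
As an illustration of the format above, a correspondence file could be read with a short Python sketch like the following. The names Correspondence and parse_correspondences are hypothetical and are not part of the released code; the sketch assumes one command per line with whitespace-separated arguments, as in the example.

```python
from dataclasses import dataclass


@dataclass
class Correspondence:
    """One clicked correspondence between two scans (hypothetical helper type)."""
    scan_a: str
    scan_b: str
    point_a: tuple  # (x, y) for point2d, (x, y, z) for point3d
    point_b: tuple


def parse_correspondences(text):
    """Parse correspondence file text into (declared_count, points2d, points3d)."""
    declared = None
    points2d, points3d = [], []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        command, args = tokens[0], tokens[1:]
        if command == "n_correspondences":
            declared = int(args[0])
        elif command in ("point2d", "point3d"):
            dim = 2 if command == "point2d" else 3
            # Two scan names followed by `dim` coordinates for each point.
            if len(args) != 2 + 2 * dim:
                raise ValueError(f"malformed {command} line: {line!r}")
            coords = [float(value) for value in args[2:]]
            record = Correspondence(
                args[0], args[1], tuple(coords[:dim]), tuple(coords[dim:])
            )
            (points2d if dim == 2 else points3d).append(record)
        else:
            raise ValueError(f"unknown command: {command!r}")
    return declared, points2d, points3d
```

Keeping the 2D and 3D records in separate lists mirrors the file layout, so an evaluation that only needs the backprojected points can ignore the 2D list entirely.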

Scene Format

Each scene requires a configuration file (.conf) and a feature file (.fcb). Configuration files are the main format used by our system. They are simple text files with a number of supported commands:

dataset dataset_name — Name of dataset used. This is to distinguish how depth images are stored in different datasets.
n_images x — Number of images in the sequence
intrinsics filename — Filename containing camera intrinsic information
color_resolution width height — Resolution of color images
depth_resolution width height — Resolution of depth images
depth_directory dir_name — Directory of depth images
image_directory dir_name — Directory of color images
correspondences filename — Name of the file storing correspondences
pairwise_matches filename — Name of the file storing pairwise transformations for initialization
scan depth_image_name color_image_name transformation_matrix — Identifier of color and depth images along with their respective transformation (16 numbers)

We provide three configuration files per scene: *_initial.conf (input to our method), *_fine_to_coarse.conf (output of our method) and *_ground_truth.conf (the transformations obtained by optimizing for camera poses using the clicked correspondences).

Example:

dataset sun3d
n_images 5242
intrinsics intrinsics.txt
color_resolution 640 480
depth_resolution 640 480
depth_directory depth
image_directory image
correspondences correspondences/gt_corrs.txt
pairwise_matches matches/pairwise_matches.txt

scan       0000001-000000013044.png       0000001-000000000000.jpg  1 0 0 0  0 1 0 0  0 0 1 0  0 0 0 1
scan       0000002-000000046414.png       0000002-000000033516.jpg  1 0 0 0  0 1 0 0  0 0 1 0  0 0 0 1
scan       0000003-000000079783.png       0000003-000000067032.jpg  1 0 0 0  0 1 0 0  0 0 1 0  0 0 0 1
scan       0000004-000000113152.png       0000004-000000100548.jpg  1 0 0 0  0 1 0 0  0 0 1 0  0 0 0 1
scan       0000005-000000146521.png       0000005-000000134064.jpg  1 0 0 0  0 1 0 0  0 0 1 0  0 0 0 1
scan       0000006-000000413475.png       0000007-000000435708.jpg  1 0 0 0  0 1 0 0  0 0 1 0  0 0 0 1
...
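
The commands above can be read with a small parser. The following Python sketch is only an illustration: parse_conf is a hypothetical name, not part of the released code, and the sketch assumes that the 16 numbers of a scan command form a 4x4 matrix listed row by row, which the identity matrices in the example cannot confirm.

```python
def parse_conf(text):
    """Parse configuration file text into (settings, scans).

    settings maps each non-scan command to its value; scans is a list of
    dicts holding the depth image name, color image name, and 4x4 matrix.
    """
    settings = {}
    scans = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        command, args = tokens[0], tokens[1:]
        if command == "scan":
            # Two image names followed by 16 matrix entries.
            if len(args) != 18:
                raise ValueError(f"malformed scan line: {line!r}")
            values = [float(value) for value in args[2:]]
            # Assumption: entries are stored in row-major order.
            matrix = [values[row * 4:(row + 1) * 4] for row in range(4)]
            scans.append({"depth": args[0], "color": args[1], "matrix": matrix})
        elif command == "n_images":
            settings[command] = int(args[0])
        elif command in ("color_resolution", "depth_resolution"):
            settings[command] = (int(args[0]), int(args[1]))
        else:
            # dataset, intrinsics, directories, and file names stay as strings.
            settings[command] = " ".join(args)
    return settings, scans
```

Comparing len(scans) against settings["n_images"] after parsing is a cheap way to catch a truncated file.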

Feature files (.fcb) are sets of preprocessed RGB-D frames stored in a single file. We provide them to make it easier to experiment with our system. Our code also provides a conf2fet program that can produce a fet file from a configuration file.

To learn how to process a new scene, please see our code repository.

Benchmark

We provide a fetbenchmark program that evaluates the results of your registration algorithm on our correspondence dataset. You only need to provide a simplified configuration file containing the list of scans with their associated transformations (the n_images and scan commands), as well as the name of the correspondence file you wish to use. To obtain the RMSE, mean, and standard deviation of your errors, simply run:

fetbenchmark filename.conf

Example of a simplified configuration file:

n_images 875
correspondences correspondences/gt_corrs.txt

scan       0000001-000000000000.png       0000001-000000020293.jpg  0.760508 -0.215960 0.612364 1.246280  (...)
scan       0000002-000000033369.png       0000001-000000020293.jpg  0.755561 -0.218332 0.617624 1.273338  (...)
scan       0000003-000000066739.png       0000002-000000053809.jpg  0.755161 -0.215306 0.619173 1.272936  (...)
scan       0000004-000000100108.png       0000003-000000087325.jpg  0.756726 -0.210920 0.618772 1.272400  (...)
scan       0000005-000000133477.png       0000004-000000120841.jpg  0.761084 -0.207178 0.614677 1.273754  (...)
...
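
To make the reported statistics concrete, the following Python sketch shows one plausible way such errors could be computed from the point3d correspondences. It is not the fetbenchmark implementation. It assumes that each scan's matrix maps camera coordinates to world coordinates, that the matrix is stored row by row with the translation in the last column, and that the error of a correspondence is the Euclidean distance between its two transformed points. The function names and the use of the population standard deviation are likewise assumptions.

```python
import math
import statistics


def transform_point(matrix, point):
    """Apply a 4x4 rigid transform (list of four rows) to a 3D point."""
    x, y, z = point
    # Assumption: row-major matrix with translation in the fourth column.
    return tuple(
        row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix[:3]
    )


def correspondence_errors(poses, correspondences):
    """Distance between the two world-space positions of each correspondence.

    poses maps a scan name to its 4x4 matrix; correspondences is an iterable
    of (scan_a, scan_b, point_a, point_b) tuples with 3D points given in the
    coordinate frame of their respective scans.
    """
    errors = []
    for scan_a, scan_b, point_a, point_b in correspondences:
        world_a = transform_point(poses[scan_a], point_a)
        world_b = transform_point(poses[scan_b], point_b)
        errors.append(math.dist(world_a, world_b))
    return errors


def summarize(errors):
    """Return (RMSE, mean, population standard deviation) of the errors."""
    rmse = math.sqrt(sum(error * error for error in errors) / len(errors))
    return rmse, statistics.fmean(errors), statistics.pstdev(errors)
```

With perfectly registered scans, both points of every correspondence land on the same world position and all three statistics are zero; drift between distant scans shows up as larger distances for the correspondences that link them.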

Below are results obtained by our method. You should be able to reproduce these numbers by running fetbenchmark on the *_fine_to_coarse.conf files.

Sequence Name	Initial	Fine-to-Coarse [5]	Ground Truth
brown_bm_1/brown_bm_1	0.407691	0.083449	0.061802
brown_bm_4/brown_bm_4	1.528700	0.105455	0.044534
brown_cogsci_1/brown_cogsci_1	0.918856	0.071608	0.055644
brown_cs_2/brown_cs2	1.552957	0.063214	0.044288
brown_cs_3/brown_cs3	1.286955	0.106481	0.058540
harvard_c11/hv_c11_2	0.461329	0.064711	0.040733
harvard_c3/hv_c3_1	0.476576	0.064681	0.041180
harvard_c5/hv_c5_1	0.239101	0.077660	0.048203
harvard_c6/hv_c6_1	0.637067	0.075244	0.035075
harvard_c8/hv_c8_3	0.435362	0.086565	0.036319
home_at/home_at_scan1_2013_jan_1	0.248482	0.040599	0.022638
home_bksh/home_bksh_oct_30_2012_scan2_erika	0.265791	0.058708	0.042858
home_md/home_md_scan9_2012_sep_30	0.340615	0.060626	0.038748
hotel_nips2012/nips_4	0.171788	0.051005	0.049751
hotel_sf/scan1	0.633076	0.067881	0.064119
hotel_uc/scan3	0.536251	0.050381	0.051987
hotel_umd/maryland_hotel1	0.291628	0.061397	0.037778
hotel_umd/maryland_hotel3	0.191594	0.057937	0.029695
mit_32_d507/d507_2	0.357736	0.138743	0.034739
mit_46_ted_lab1/ted_lab_2	0.199438	0.046924	0.026515
mit_76_417/76-417b	0.117685	0.048521	0.033720
mit_76_studyroom/76-1studyroom2	0.454204	0.053273	0.037425
mit_dorm_next_sj/dorm_next_sj_oct_30_2012_scan1_erika	0.194821	0.088166	0.022229
mit_lab_hj/lab_hj_tea_nov_2_2012_scan1_erika	0.603828	0.089048	0.045952
mit_w20_athena/sc_athena_oct_29_2012_scan1_erika	0.420944	0.096238	0.056805

Fine-to-Coarse Global Registration of RGB-D Scans

Maciej Halber Thomas Funkhouser

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017