Abstract
RGB-D scanning of indoor environments is important for many applications, including real estate, interior design, and virtual reality. However, it is still challenging to register RGB-D images from a handheld camera over a long video sequence into a globally consistent 3D model. Current methods often lose tracking or drift, and thus fail to reconstruct salient structures in large environments (e.g., parallel walls in different rooms). To address this problem, we propose a “fine-to-coarse” global registration algorithm that leverages robust registrations at finer scales to seed detection and enforcement of new correspondence and structural constraints at coarser scales. To test global registration algorithms, we provide a benchmark with 10,401 manually clicked point correspondences in 25 scenes from the SUN3D dataset. In experiments with this benchmark, we find that our fine-to-coarse algorithm registers long RGB-D sequences better than previous methods.
Bibtex
@article{HalberF2C17,
  title = {Fine-To-Coarse Global Registration of RGB-D Scans},
  author = {Maciej Halber and Thomas Funkhouser},
  booktitle = {CVPR},
  year = {2017}
}
Materials
Data
To evaluate the quality of reconstructions, we have extended the SUN3D dataset by Xiao et al. [1] by manually clicking on corresponding points in images from 25 SUN3D scenes. Below we provide the correspondence files, as well as all the SUN3D scenes converted to our format so that results are easy to reproduce. Note, however, that we do not distribute the original images. These need to be obtained from the original dataset.
Downloads:
Correspondence Format
Each correspondence file is a simple text file, with 3 commands:
- n_correspondences x — Number of correspondences
- point2d scan_name_1 scan_name_2 x1 y1 x2 y2 — Location of a point {x1, y1} in image scan_name_1 and location of the corresponding point {x2, y2} in image scan_name_2
- point3d scan_name_1 scan_name_2 x1 y1 z1 x2 y2 z2 — 3D location of a point {x1, y1, z1} backprojected using image scan_name_1 and 3D location of the corresponding point {x2, y2, z2} backprojected using image scan_name_2
We provide both the 2D locations clicked in the color images and the backprojected 3D locations, so that depth images do not have to be read in for evaluation.
Example:
n_correspondences 375
point2d SCAN:0000001-000000013044 SCAN:0000051-000001915092 480.241 205.955 477.353 203.068
point2d SCAN:0000001-000000013044 SCAN:0000051-000001915092 480.241 127.038 473.504 124.150
point2d SCAN:0000001-000000013044 SCAN:0000051-000001915092 133.774 172.271 129.925 170.346
point2d SCAN:0000001-000000013044 SCAN:0000051-000001915092 286.797 259.850 300.271 256.962
point2d SCAN:0000006-000000413475 SCAN:0000811-000027275722 474.466 124.150 396.511 169.383
...
point3d SCAN:0000001-000000013044 SCAN:0000051-000001915092 0.463 0.096 -1.647 0.459 0.106 -1.663
point3d SCAN:0000001-000000013044 SCAN:0000051-000001915092 0.470 0.329 -1.671 0.455 0.340 -1.680
point3d SCAN:0000001-000000013044 SCAN:0000051-000001915092 -0.705 0.256 -2.170 -0.720 0.264 -2.170
point3d SCAN:0000001-000000013044 SCAN:0000051-000001915092 -0.036 -0.023 -0.632 -0.021 -0.019 -0.625
point3d SCAN:0000006-000000413475 SCAN:0000811-000027275722 0.455 0.340 -1.680 0.217 0.197 -1.601
...
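The format above is simple enough to read with a few lines of code. The following Python sketch is not part of our released tools; it assumes one command per line with whitespace-separated tokens, and the names parse_correspondences, Point2D and Point3D are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Point2D(NamedTuple):
    scan_a: str
    scan_b: str
    xy_a: Tuple[float, float]
    xy_b: Tuple[float, float]


class Point3D(NamedTuple):
    scan_a: str
    scan_b: str
    xyz_a: Tuple[float, float, float]
    xyz_b: Tuple[float, float, float]


def parse_correspondences(text: str):
    """Parse correspondence-file text into (declared_count, 2D list, 3D list).

    Assumption: each command sits on its own line and tokens are
    separated by whitespace. Unknown commands are skipped.
    """
    declared: Optional[int] = None
    points2d, points3d = [], []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        command = tokens[0]
        if command == "n_correspondences":
            declared = int(tokens[1])
        elif command == "point2d":
            v = [float(t) for t in tokens[3:7]]
            points2d.append(
                Point2D(tokens[1], tokens[2], (v[0], v[1]), (v[2], v[3])))
        elif command == "point3d":
            v = [float(t) for t in tokens[3:9]]
            points3d.append(
                Point3D(tokens[1], tokens[2],
                        (v[0], v[1], v[2]), (v[3], v[4], v[5])))
    return declared, points2d, points3d
```

Reading a file then amounts to calling parse_correspondences(open(path).read()), where path is whatever correspondence file you downloaded.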
Scene Format
Each scene requires a configuration file (.conf) and a feature file (.fcb). Configuration files are the main format used by our system. They are simple text files with a number of supported commands:
- dataset dataset_name — Name of dataset used. This is to distinguish how depth images are stored in different datasets.
- n_images x — Number of images in the sequence
- intrinsics filename — Filename containing camera intrinsic information
- color_resolution width height — Resolution of color images
- depth_resolution width height — Resolution of depth images
- depth_directory dir_name — Directory of depth images
- image_directory dir_name — Directory of color images
- correspondences filename — Name of the file storing correspondences
- pairwise_matches filename — Name of the file storing pairwise transformations for initialization
- scan depth_image_name color_image_name transformation_matrix — Identifier of color and depth images along with their respective transformation (16 numbers)
We provide three configuration files per scene: *_initial.conf (input to our method), *_fine_to_coarse.conf (output of our method) and *_ground_truth.conf (the transformations obtained by optimizing for camera poses using the clicked correspondences).
Example:
dataset sun3d
n_images 5242
intrinsics intrinsics.txt
color_resolution 640 480
depth_resolution 640 480
depth_directory depth
image_directory image
correspondences correspondences/gt_corrs.txt
pairwise_matches matches/pairwise_matches.txt
scan 0000001-000000013044.png 0000001-000000000000.jpg 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
scan 0000002-000000046414.png 0000002-000000033516.jpg 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
scan 0000003-000000079783.png 0000003-000000067032.jpg 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
scan 0000004-000000113152.png 0000004-000000100548.jpg 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
scan 0000005-000000146521.png 0000005-000000134064.jpg 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
scan 0000006-000000413475.png 0000007-000000435708.jpg 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
...
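For readers who want to inspect configuration files outside our system, a reader along the following lines may be enough. This is a sketch rather than our implementation: parse_conf and the dictionary keys are hypothetical names, one command per line is assumed, and the 16 numbers of each scan command are assumed to form a row-major 4x4 matrix.

```python
def parse_conf(text):
    """Parse configuration-file text into a plain dictionary.

    Assumptions (not confirmed by the format description): one command
    per line, whitespace-separated tokens, and a row-major 4x4 matrix
    for the 16 numbers that end each scan command.
    """
    conf = {"scans": []}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        command, args = tokens[0], tokens[1:]
        if command == "scan":
            if len(args) != 18:
                raise ValueError("scan needs 2 names and 16 numbers: " + line)
            values = [float(a) for a in args[2:]]
            # Group the 16 numbers into 4 rows of 4 (row-major assumption).
            matrix = [values[i:i + 4] for i in range(0, 16, 4)]
            conf["scans"].append({
                "depth_image": args[0],
                "color_image": args[1],
                "matrix": matrix,
            })
        elif command == "n_images":
            conf[command] = int(args[0])
        elif command in ("color_resolution", "depth_resolution"):
            conf[command] = (int(args[0]), int(args[1]))
        else:
            # dataset, intrinsics, directories, file names: keep as text.
            conf[command] = " ".join(args)
    return conf
```

Keeping every non-scan command as a plain dictionary entry means the same function also reads files that contain only a subset of the commands.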
Feature files (.fcb) are sets of preprocessed RGB-D frames stored in a single file. We provide them to make it easier to experiment with our system. Our code also provides the conf2fet program, which can produce a fet file from a configuration file.
To process a new scene, please see our code repository.
Benchmark
We provide the fetbenchmark program, which evaluates the results of your registration algorithm on our correspondence dataset. You only need to provide a simplified version of a configuration file containing the list of scans with their associated transformations (n_images and scan commands), as well as the name of the correspondence file you wish to use. To obtain the RMSE, mean and standard deviation of your errors, simply run:
fetbenchmark filename.conf
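To give a sense of what these three statistics mean for 3D point correspondences, here is a minimal Python sketch. It is not the fetbenchmark implementation: it assumes each transformation is a row-major 4x4 matrix that maps a scan's points into a common world frame, and that the error of a correspondence is the Euclidean distance between its two transformed points. The function names are hypothetical.

```python
import math


def transform_point(matrix, point):
    """Apply the top three rows of a 4x4 matrix to a 3D point."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix[:3])


def correspondence_errors(poses, correspondences):
    """Distance between the two world-space points of each correspondence.

    poses: dict mapping scan name -> 4x4 matrix (list of 4 rows).
    correspondences: iterable of (scan_a, scan_b, xyz_a, xyz_b).
    """
    errors = []
    for scan_a, scan_b, xyz_a, xyz_b in correspondences:
        world_a = transform_point(poses[scan_a], xyz_a)
        world_b = transform_point(poses[scan_b], xyz_b)
        errors.append(math.dist(world_a, world_b))
    return errors


def summarize(errors):
    """Return (rmse, mean, population standard deviation)."""
    n = len(errors)
    mean = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)
    return rmse, mean, std
```

With perfectly registered scans every correspondence would land on the same world-space point, so all three numbers would shrink toward the noise level of the clicked points.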
Example of a simplified configuration file:
n_images 875
correspondences correspondences/gt_corrs.txt
scan 0000001-000000000000.png 0000001-000000020293.jpg 0.760508 -0.215960 0.612364 1.246280 (...)
scan 0000002-000000033369.png 0000001-000000020293.jpg 0.755561 -0.218332 0.617624 1.273338 (...)
scan 0000003-000000066739.png 0000002-000000053809.jpg 0.755161 -0.215306 0.619173 1.272936 (...)
scan 0000004-000000100108.png 0000003-000000087325.jpg 0.756726 -0.210920 0.618772 1.272400 (...)
scan 0000005-000000133477.png 0000004-000000120841.jpg 0.761084 -0.207178 0.614677 1.273754 (...)
...
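If your registration code keeps poses in memory, a file like the one above could be produced with a short helper. The Python sketch below is an illustration only: format_simplified_conf is a hypothetical name, matrices are assumed to be 4x4 nested lists written out row by row, and six decimal places are used simply to mirror the example.

```python
def format_simplified_conf(correspondence_file, scans):
    """Build the text of a simplified configuration file.

    scans: list of (depth_image_name, color_image_name, matrix), where
    matrix is a 4x4 nested list. Writing the 16 values row by row is an
    assumption about the expected order.
    """
    lines = [
        "n_images %d" % len(scans),
        "correspondences %s" % correspondence_file,
    ]
    for depth_name, color_name, matrix in scans:
        flat = [value for row in matrix for value in row]
        if len(flat) != 16:
            raise ValueError("expected a 4x4 matrix for " + depth_name)
        numbers = " ".join("%.6f" % value for value in flat)
        lines.append("scan %s %s %s" % (depth_name, color_name, numbers))
    return "\n".join(lines) + "\n"
```

The returned string can be written to filename.conf and passed to fetbenchmark as described above.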
Below are results obtained by our method. You should be able to reproduce these numbers by running fetbenchmark on the *_fine_to_coarse.conf files.
Sequence Name | Initial | Fine-to-Coarse [5] | Ground Truth |
brown_bm_1/brown_bm_1 | 0.407691 | 0.083449 | 0.061802 |
brown_bm_4/brown_bm_4 | 1.528700 | 0.105455 | 0.044534 |
brown_cogsci_1/brown_cogsci_1 | 0.918856 | 0.071608 | 0.055644 |
brown_cs_2/brown_cs2 | 1.552957 | 0.063214 | 0.044288 |
brown_cs_3/brown_cs3 | 1.286955 | 0.106481 | 0.058540 |
harvard_c11/hv_c11_2 | 0.461329 | 0.064711 | 0.040733 |
harvard_c3/hv_c3_1 | 0.476576 | 0.064681 | 0.041180 |
harvard_c5/hv_c5_1 | 0.239101 | 0.077660 | 0.048203 |
harvard_c6/hv_c6_1 | 0.637067 | 0.075244 | 0.035075 |
harvard_c8/hv_c8_3 | 0.435362 | 0.086565 | 0.036319 |
home_at/home_at_scan1_2013_jan_1 | 0.248482 | 0.040599 | 0.022638 |
home_bksh/home_bksh_oct_30_2012_scan2_erika | 0.265791 | 0.058708 | 0.042858 |
home_md/home_md_scan9_2012_sep_30 | 0.340615 | 0.060626 | 0.038748 |
hotel_nips2012/nips_4 | 0.171788 | 0.051005 | 0.049751 |
hotel_sf/scan1 | 0.633076 | 0.067881 | 0.064119 |
hotel_uc/scan3 | 0.536251 | 0.050381 | 0.051987 |
hotel_umd/maryland_hotel1 | 0.291628 | 0.061397 | 0.037778 |
hotel_umd/maryland_hotel3 | 0.191594 | 0.057937 | 0.029695 |
mit_32_d507/d507_2 | 0.357736 | 0.138743 | 0.034739 |
mit_46_ted_lab1/ted_lab_2 | 0.199438 | 0.046924 | 0.026515 |
mit_76_417/76-417b | 0.117685 | 0.048521 | 0.033720 |
mit_76_studyroom/76-1studyroom2 | 0.454204 | 0.053273 | 0.037425 |
mit_dorm_next_sj/dorm_next_sj_oct_30_2012_scan1_erika | 0.194821 | 0.088166 | 0.022229 |
mit_lab_hj/lab_hj_tea_nov_2_2012_scan1_erika | 0.603828 | 0.089048 | 0.045952 |
mit_w20_athena/sc_athena_oct_29_2012_scan1_erika | 0.420944 | 0.096238 | 0.056805 |