Commit d41a4c7
Updated docs for v0.4

Change-Id: I70c9e73f3884db0aace52fc4804085c3b397532b
1 parent: cb0e76a

8 files changed: +182 -28 lines

CMakeLists.txt (+2 -2)
@@ -91,11 +91,11 @@ SET(CMAKE_RELEASE_POSTFIX "")
 SET(CMAKE_DEBUG_POSTFIX "-debug")
 
 SET(THEIA_VERSION_MAJOR 0)
-SET(THEIA_VERSION_MINOR 3)
+SET(THEIA_VERSION_MINOR 4)
 SET(THEIA_VERSION_PATCH 0)
 SET(THEIA_VERSION
   ${THEIA_VERSION_MAJOR}.${THEIA_VERSION_MINOR}.${THEIA_VERSION_PATCH})
-SET(THEIA_ABI_VERSION 0.3.0)
+SET(THEIA_ABI_VERSION 0.4.0)
 
 # THEIA data directory
 ADD_DEFINITIONS(-DTHEIA_DATA_DIR="${CMAKE_SOURCE_DIR}/data")

docs/source/bibliography.rst (+10 -0)
@@ -99,10 +99,17 @@ Bibliography
    Estimation via Residual Consensus**. *International Conference on Computer
    Vision (ICCV)*, 2011.
 
+.. [PhotoTourism] N. Snavely, S. Seitz, and R. Szeliski. **Photo tourism:
+   exploring photo collections in 3D**. *ACM Transactions on Graphics (TOG)*, 2006.
+
 .. [Stewenius5pt] H. Stewénius, C. Engels, D. Nistér. **Recent developments on
    direct relative orientation**. *ISPRS Journal of Photogrammetry and Remote
    Sensing*, 2006.
+
+.. [SweeneyCVPR2015] C. Sweeney, L. Kneip, T. Hollerer, M. Turk. **Computing
+   Similarity Transformations from Only Image Correspondences**. *Computer Vision
+   and Pattern Recognition (CVPR)*, 2015.
+
 .. [SweeneyGDLS] C. Sweeney, V. Fragoso, T. Hollerer, M. Turk. **gDLS: A
    Scalable Solution to the Generalized Pose and Scale Problem**. *European
    Conference on Computer Vision (ECCV)*, 2014.
@@ -119,5 +126,8 @@ Bibliography
    between two point patterns**. *IEEE Transactions on Pattern Analysis and Machine
    Intelligence*, 1991.
 
+.. [VisualSfM] Wu, Changchang. **Towards a Linear Time Incremental Structure From
+   Motion**. *International Conference on 3D Vision*, 2013.
+
 .. [WilsonECCV2014] Wilson, K. and Snavely, N. **Robust Global Translation with 1DSfM**.
    *European Conference on Computer Vision*, 2014.

docs/source/global_sfm.png (677 KB)

docs/source/incremental_sfm.png (55.7 KB)

docs/source/pisa.png (116 KB)

docs/source/releases.rst (+24 -2)
@@ -9,12 +9,34 @@ HEAD
 
 New Features
 ------------
-* Better rendering for point clouds.
 
 Bug Fixes
 ---------
-* Some Visual Studio bugs and incompatabilities (thanks to Pierre Moulon and Brojeshwar Bhowmick)
 
+`0.4.0 <https://github.com/sweeneychris/TheiaSfM/archive/v0.4.tar.gz>`_
+=======================================================================
+
+New Features
+------------
+* Incremental SfM pipeline.
+* New website: `www.theia-sfm.org <http://www.theia-sfm.org>`_.
+* Linear method for camera pose registration [JiangICCV]_.
+* Better rendering for point clouds.
+* Significantly better CMake scripts for Windows (thanks to bvanevery for testing).
+* Mutable priority queue class.
+* Bundle adjustment method for cameras only (points held constant).
+* Calibrated and uncalibrated absolute pose estimators.
+* Two-view bundle adjustment will now optimize camera intrinsics if they are not known.
+* New small- and large-scale benchmarking results on the Theia website.
+
+Bug Fixes
+---------
+* Some Visual Studio bugs and incompatibilities (thanks to Pierre Moulon and Brojeshwar Bhowmick).
+* Sample consensus estimators were incorrectly counting the number of samples needed (found by inspirit).
+* Proper normalization of the 1DSfM axis of projection.
+* OpenGL viewer properly sets zero-values of matrices upon initialization.
+* Relative translation optimization (with known rotation) is dramatically improved (thanks to Onur Ozyesil).
+* Translations solver uses SPARSE_NORMAL_CHOLESKY when no 3D points are used.
 
 `0.3.0 <https://github.com/sweeneychris/TheiaSfM/archive/v0.3.tar.gz>`_
 =======================================================================

docs/source/sfm.rst (+146 -24)
@@ -12,15 +12,17 @@ Theia has a full Structure-from-Motion pipeline that is extremely efficient. Our
 overall pipeline consists of several steps. First, we extract features (SIFT is
 the default). Then, we perform two-view matching and geometric verification to
 obtain relative poses between image pairs and create a :class:`ViewGraph`. Next,
-we perform global pose estimation with global SfM. Global SfM is different from
-incremental SfM in that it considers the entire view graph at the same time
-instead of incrementally adding more and more images to the
-:class:`Reconstruction`. Global SfM methods have been proven to be very fast
-with comparable or better accuracy to incremental SfM approaches (See
-[JiangICCV]_, [MoulonICCV]_, [WilsonECCV2014]_), and they are much more readily
-parallelized. After we have obtained camera poses, we perform triangulation and
-:class:`BundleAdjustment` to obtain a valid 3D reconstruction consisting of
-cameras and 3D points.
+we perform either incremental or global SfM. Incremental SfM is the standard
+approach that adds one image at a time to grow the reconstruction. While this
+method is robust, it is not scalable because it requires repeated rounds of
+expensive bundle adjustment. Global SfM differs from incremental SfM in that it
+considers the entire view graph at once instead of incrementally adding more
+and more images to the :class:`Reconstruction`. Global SfM methods have proven
+to be very fast, with accuracy comparable to or better than incremental SfM
+approaches (see [JiangICCV]_, [MoulonICCV]_, [WilsonECCV2014]_), and they are
+much more readily parallelized. After we have obtained camera poses, we perform
+triangulation and :class:`BundleAdjustment` to obtain a valid 3D reconstruction
+consisting of cameras and 3D points.
 
 The first step towards creating a reconstruction is to determine images which
 view the same objects. To do this, we must create a :class:`ViewGraph`.
@@ -29,8 +31,6 @@ view the same objects. To do this, we must create a :class:`ViewGraph`.
 #. Match features to obtain image correspondences.
 #. Estimate camera poses from two-view matches and geometries.
 
-.. TODO:: Insert figure.
-
 #1. and #2. have been covered in other sections, so we will focus on creating a
 reconstruction from two-view matches and geometry. First, we will describe the
 fundamental elements of our reconstruction.
@@ -40,7 +40,7 @@ Reconstruction
 
 .. class:: Reconstruction
 
-  .. TODO:: Insert figure.
+  .. image:: pisa.png
 
   At the core of our SfM pipeline is an SfM :class:`Reconstruction`. A
   :class:`Reconstruction` is the representation of a 3D reconstruction consisting
@@ -117,8 +117,6 @@ ViewGraph
 
 .. class:: ViewGraph
 
-  .. TODO:: INSERT FIGURE HERE
-
   A :class:`ViewGraph` is a basic SfM construct that is created from two-view
   matching information. Any pair of views that have a view correlation form an
   edge in the :class:`ViewGraph` such that the nodes in the graph are
@@ -296,9 +294,83 @@ In addition to typical getter/setter methods for the camera parameters, the
    according to the camera orientation in 3D space. The returned vector is not
    unit length.
 
+Incremental SfM Pipeline
+========================
+
+.. image:: incremental_sfm.png
+
+The incremental SfM pipeline follows very closely the pipelines of `Bundler
+<http://www.cs.cornell.edu/~snavely/bundler/>`_ [PhotoTourism]_ and `VisualSfM
+<http://ccwu.me/vsfm/>`_ [VisualSfM]_. The method begins by estimating the 3D
+structure and camera poses of two cameras from their relative pose. Additional
+cameras are then added sequentially, and new 3D structure is estimated as new
+parts of the scene are observed. Bundle adjustment is performed repeatedly as
+more cameras are added to ensure high-quality reconstructions and to avoid
+drift.
+
+The incremental SfM pipeline is as follows (a code sketch of this loop appears
+after the list):
+
+#. Choose an initial camera pair to reconstruct.
+#. Estimate the 3D structure of the scene.
+#. Bundle adjustment on the 2-view reconstruction.
+#. Localize a new camera to the current 3D points. Choose the camera that
+   observes the most 3D points currently in the scene.
+#. Estimate new 3D structure.
+#. Bundle adjustment if the model has grown by more than 5% since the last
+   bundle adjustment.
+#. Repeat steps 4-6 until all cameras have been added.
+
+Incremental SfM is generally considered to be more robust than global SfM
+methods; however, it requires many more rounds of bundle adjustment (which is
+very costly), so incremental SfM is not as efficient or scalable.
+
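A minimal sketch of the loop above follows. All types and helper functions here are hypothetical placeholders for illustration; they are not Theia's API.

.. code-block:: c++

  // Hypothetical placeholder types and helpers -- not Theia's API.
  struct Reconstruction {
    int num_views = 2;         // starts from the initial pair
    int num_unlocalized = 10;  // cameras still waiting to be added
  };

  void InitializeFromCameraPair(Reconstruction* r) {}  // steps 1-2
  void BundleAdjust(Reconstruction* r) {}              // steps 3 and 6
  void LocalizeNextBestView(Reconstruction* r) {       // step 4
    ++r->num_views;
    --r->num_unlocalized;
  }
  void EstimateNewStructure(Reconstruction* r) {}      // step 5

  void IncrementalSfM(Reconstruction* r) {
    InitializeFromCameraPair(r);
    BundleAdjust(r);
    int views_at_last_ba = r->num_views;
    while (r->num_unlocalized > 0) {
      LocalizeNextBestView(r);
      EstimateNewStructure(r);
      // Step 6: full bundle adjustment once the model grows by more than 5%.
      const double growth_percent =
          100.0 * (r->num_views - views_at_last_ba) / views_at_last_ba;
      if (growth_percent > 5.0) {
        BundleAdjust(r);
        views_at_last_ba = r->num_views;
      }
    }
  }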
+.. member:: double ReconstructionEstimatorOptions::multiple_view_localization_ratio
+
+  DEFAULT: ``0.8``
+
+  If M is the maximum number of 3D points observed by any view, we want to
+  localize all views that observe more than
+  M * multiple_view_localization_ratio 3D points. This allows multiple
+  well-conditioned views to be added to the reconstruction before bundle
+  adjustment is needed.
+
+.. member:: double ReconstructionEstimatorOptions::absolute_pose_reprojection_error_threshold
+
+  DEFAULT: ``8.0``
+
+  When adding a new view to the current reconstruction, this is the
+  reprojection error that determines whether a 2D-3D correspondence is an
+  inlier during localization.
+
+.. member:: int ReconstructionEstimatorOptions::min_num_absolute_pose_inliers
+
+  DEFAULT: ``30``
+
+  Minimum number of inliers for absolute pose estimation to be considered
+  successful.
+
+.. member:: double ReconstructionEstimatorOptions::full_bundle_adjustment_growth_percent
+
+  DEFAULT: ``5.0``
+
+  Bundle adjustment of the entire reconstruction is triggered when the
+  reconstruction has grown by more than this percent. That is, if we last ran
+  bundle adjustment when there were K views in the reconstruction and there
+  are now N views, then G = 100 * (N - K) / K is the percent that the model
+  has grown. We run bundle adjustment only if G is greater than this variable,
+  which is expressed in percent (e.g., 5.0 = 5%).
+
+.. member:: int ReconstructionEstimatorOptions::partial_bundle_adjustment_num_views
+
+  DEFAULT: ``20``
+
+  During incremental SfM we run "partial" bundle adjustment on the most
+  recent views that have been added to the 3D reconstruction. This parameter
+  controls how many views should be part of the partial bundle adjustment.
+
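As a usage illustration, the options documented above could be configured as follows. The values shown are the documented defaults; the umbrella header and ``theia`` namespace are assumptions about the install layout rather than something this page specifies.

.. code-block:: c++

  #include <theia/theia.h>  // assumed umbrella header for a Theia install

  int main() {
    // Members documented above; the values shown are their defaults.
    theia::ReconstructionEstimatorOptions options;
    options.multiple_view_localization_ratio = 0.8;
    options.absolute_pose_reprojection_error_threshold = 8.0;
    options.min_num_absolute_pose_inliers = 30;
    options.full_bundle_adjustment_growth_percent = 5.0;
    options.partial_bundle_adjustment_num_views = 20;
    return 0;
  }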
 Global SfM Pipeline
 ===================
 
+.. image:: global_sfm.png
+
 The global SfM pipelines in Theia follow a general procedure of filtering
 outliers and estimating camera poses or structure. Removing outliers can help
 increase performance dramatically for global SfM, though robust estimation
@@ -325,8 +397,8 @@ follows:
 .. class:: ReconstructionEstimator
 
-  This is the base class for which all SfM reconstruction pipelines derive
-  from. The reconstruction estimation type can be specified at runtime, though
-  currently only ``NONLINEAR`` is implemented.
+  This is the base class from which all SfM reconstruction pipelines derive.
+  The reconstruction estimation type can be specified at runtime
+  (currently ``NONLINEAR`` and ``INCREMENTAL`` are implemented).
 
 .. function:: ReconstructionEstimator::ReconstructionEstimator(const ReconstructionEstimatorOptions& options)
 
@@ -583,15 +655,17 @@ Estimating Global Positions
 ===========================
 
 Positions of cameras may be estimated simultaneously after the rotations are
-known. We use a nonlinear optimization to estimate camera positions based. Given
-pairwise relative translations from :class:`TwoViewInfo` and the estimated
-rotation, the constraint
+known. We use either a linear or a nonlinear optimization to estimate camera
+positions.
+
+Given pairwise relative translations from :class:`TwoViewInfo`
+and the estimated rotations, the constraint
 
 .. math:: R_i * (c_j - c_i) = \alpha_{i,j} * t_{i,j}
 
-Where :math:`\alpha_{i,j} = ||c_j - c_i||^2`. This ensures that we optimize for
-positions that agree with the relative positions computed in two-view
-estimation.
+is used to determine the global camera positions, where :math:`\alpha_{i,j} =
+||c_j - c_i||^2`. This ensures that we optimize for positions that agree with
+the relative positions computed in two-view estimation.
 
 .. class:: NonlinearPositionEstimatorOptions
 
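A small sketch of the per-pair residual implied by this constraint is shown below; the function name and conventions are illustrative rather than Theia's implementation. Normalizing the rotated baseline removes the unknown scale :math:`\alpha_{i,j}`, leaving a pure comparison of directions.

.. code-block:: c++

  #include <Eigen/Core>
  #include <Eigen/Geometry>

  // Illustrative residual for one view pair (i, j): the rotated baseline
  // R_i * (c_j - c_i), once normalized, should match the measured relative
  // translation direction t_ij from two-view estimation.
  Eigen::Vector3d PairwiseTranslationResidual(
      const Eigen::Matrix3d& rotation_i,             // orientation R_i
      const Eigen::Vector3d& position_i,             // camera position c_i
      const Eigen::Vector3d& position_j,             // camera position c_j
      const Eigen::Vector3d& relative_translation) { // unit-norm t_ij
    const Eigen::Vector3d predicted_direction =
        (rotation_i * (position_j - position_i)).normalized();
    return predicted_direction - relative_translation;
  }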
@@ -653,6 +727,40 @@ estimation.
    using the nonlinear algorithm described above. Only positions that have an
    orientation set are estimated. Returns true upon success and false on failure.
 
+
+.. class:: LinearPositionEstimator
+
+  .. image:: global_linear_position_estimation.png
+    :width: 40%
+    :align: center
+
+  For the linear position estimator of [JiangICCV]_, we utilize an approximate
+  geometric error to determine the camera positions within a triplet, as shown
+  above. The cost function we minimize is:
+
+  .. math:: f(i, j, k) = c_k - \dfrac{1}{2}\left((c_i + ||c_k - c_i|| c_{ik}) + (c_j + ||c_k - c_j|| c_{jk})\right)
+
+  This can be formed as a linear constraint in the unknown camera positions
+  :math:`c_i`. The solution that minimizes this cost lies in the null space of
+  the resulting linear system. Instead of extracting the entire null space as
+  [JiangICCV]_ does, we hold one camera constant at the origin and use the
+  inverse-iteration power method to efficiently determine the null vector that
+  best solves our minimization. This results in a dramatic speedup without
+  sacrificing accuracy.
+
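A minimal dense sketch of the inverse-iteration power method follows, assuming the system matrix is symmetric positive semi-definite and has been made nonsingular by fixing one camera at the origin. Theia's actual implementation operates on sparse matrices and differs in detail; the parameter names simply echo the options documented below.

.. code-block:: c++

  #include <algorithm>
  #include <Eigen/Dense>

  // Inverse power iteration: repeatedly solving A * x_{k+1} = x_k converges
  // to the eigenvector of A with the smallest-magnitude eigenvalue, i.e. the
  // approximate null vector of the linear system described above.
  Eigen::VectorXd SmallestEigenvector(const Eigen::MatrixXd& A,
                                      const int max_power_iterations,
                                      const double eigensolver_threshold) {
    // Factorize once and reuse the factorization in every iteration.
    const Eigen::FullPivLU<Eigen::MatrixXd> lu(A);
    Eigen::VectorXd x = Eigen::VectorXd::Random(A.cols()).normalized();
    for (int i = 0; i < max_power_iterations; ++i) {
      const Eigen::VectorXd next = lu.solve(x).normalized();
      // Converged when the iterate stops changing (up to sign).
      if (std::min((next - x).norm(), (next + x).norm()) <
          eigensolver_threshold) {
        return next;
      }
      x = next;
    }
    return x;
  }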
+.. NOTE:: Currently this position estimation method is not integrated into the
+   Theia global SfM pipeline. More testing needs to be done with this method
+   before it can be reliably integrated.
+
+.. member:: int LinearPositionEstimator::Options::num_threads
+
+  DEFAULT: ``1``
+
+  The number of threads to use to solve for camera positions.
+
+.. member:: int LinearPositionEstimator::Options::max_power_iterations
+
+  DEFAULT: ``1000``
+
+  Maximum number of power iterations to perform while solving for camera
+  positions.
+
+.. member:: double LinearPositionEstimator::Options::eigensolver_threshold
+
+  DEFAULT: ``1e-8``
+
+  This number determines the convergence of the power iteration method. The
+  lower the threshold, the longer it will take to converge.
+
 Triangulation
 =============
 
@@ -853,7 +961,7 @@ the reprojection error.
 Similarity Transformation
 =========================
 
-.. function:: void AlignPointCloudsICP(const int num_points, const double left[], const double right[], double rotation[3 * 3], double translation[3])
+.. function:: void AlignPointCloudsICP(const int num_points, const double left[], const double right[], double rotation[], double translation[])
 
   We implement ICP for point clouds. We use Besl-McKay registration to align
   point clouds. We use SVD decomposition to find the rotation, as this is much
@@ -863,7 +971,7 @@ Similarity Transformation
    the left and right reconstructions have the same number of points, and that the
    points are aligned by correspondence (i.e. left[i] corresponds to right[i]).
 
-.. function:: void AlignPointCloudsUmeyama(const int num_points, const double left[], const double right[], double rotation[3 * 3], double translation[3], double* scale)
+.. function:: void AlignPointCloudsUmeyama(const int num_points, const double left[], const double right[], double rotation[], double translation[], double* scale)
 
   This function estimates the 3D similarity transformation using the least
  squares method of [Umeyama]_. The returned rotation, translation, and scale
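For experimentation outside of Theia, Eigen ships its own implementation of the same Umeyama alignment. The snippet below uses Eigen's public ``Eigen::umeyama`` on synthetic correspondences; note that this is Eigen's API, not Theia's ``AlignPointCloudsUmeyama``.

.. code-block:: c++

  #include <iostream>
  #include <Eigen/Dense>

  int main() {
    // Synthetic correspondences: dst = s * R * src + t for a known similarity.
    const Eigen::Matrix3Xd src = Eigen::Matrix3Xd::Random(3, 10);
    const double s = 2.0;
    const Eigen::Matrix3d R =
        Eigen::AngleAxisd(0.3, Eigen::Vector3d::UnitZ()).toRotationMatrix();
    const Eigen::Vector3d t(1.0, -2.0, 0.5);
    const Eigen::Matrix3Xd dst = ((s * R) * src).colwise() + t;

    // Recover the 4x4 similarity transform mapping src onto dst.
    const Eigen::Matrix4d transform =
        Eigen::umeyama(src, dst, /*with_scaling=*/true);
    std::cout << "Recovered transform:\n" << transform << std::endl;
    return 0;
  }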
@@ -903,3 +1011,17 @@ Similarity Transformation
   ``solution_translation``: the translation of the candidate solutions
 
   ``solution_scale``: the scale of the candidate solutions
+
+.. function:: void SimTransformPartialRotation(const Eigen::Vector3d& rotation_axis, const Eigen::Vector3d image_one_ray_directions[5], const Eigen::Vector3d image_one_ray_origins[5], const Eigen::Vector3d image_two_ray_directions[5], const Eigen::Vector3d image_two_ray_origins[5], std::vector<Eigen::Quaterniond>* soln_rotations, std::vector<Eigen::Vector3d>* soln_translations, std::vector<double>* soln_scales)
+
+  Solves for the similarity transformation that will transform rays in image
+  two such that they intersect with rays in image one, i.e.:
+
+  .. math:: s * R * X' + t = X
+
+  where s, R, and t are the returned scale, rotation, and translation, X' is a
+  point in coordinate system 2, and X is the point transformed back to
+  coordinate system 1. Up to 8 solutions will be returned.
+
+  Please cite the paper "Computing Similarity Transformations from Only Image
+  Correspondences" by C. Sweeney et al. (CVPR 2015) [SweeneyCVPR2015]_ when
+  using this algorithm.
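Applying one candidate solution follows directly from the equation above; a tiny helper (with illustrative names, not part of Theia's API) makes the convention explicit.

.. code-block:: c++

  #include <Eigen/Core>
  #include <Eigen/Geometry>

  // Maps a point X' from coordinate system 2 back into coordinate system 1
  // using one candidate (s, R, t): X = s * R * X' + t.
  Eigen::Vector3d TransformToSystemOne(const double scale,
                                       const Eigen::Quaterniond& rotation,
                                       const Eigen::Vector3d& translation,
                                       const Eigen::Vector3d& point_in_two) {
    return scale * (rotation * point_in_two) + translation;
  }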
