Comparative analysis

Select the best architecture from the options below.

Encoder:

Variations:
- standard conv2
- depth-wise separable convolution (1)
- ~~MobileNet~~
Initialization:
- random (1)
- pre-trained as a classifier on ImageNet

Metric at component level: number of weights (lower is better).
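
To make the weight-count metric concrete, here is a minimal sketch comparing a standard convolution with a depth-wise separable one. The function names and the layer sizes (64 input channels, 128 output channels, 3x3 kernel) are hypothetical and chosen only for illustration; biases are ignored.

```python
def conv_weights(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_weights(c_in, c_out, k=3):
    """Weights in a depth-wise k x k convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
c_in, c_out = 64, 128
standard = conv_weights(c_in, c_out)
separable = depthwise_separable_weights(c_in, c_out)
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.2f}")
```

For a 3x3 kernel the separable variant needs roughly 1/9 + 1/c_out of the weights of the standard one, so the saving grows with the number of output channels.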

Decoder:

Fix the "UpConv Block"
Variations:
- NNConv3 with nearest-neighbor interpolation (1)
- NNConv3 with pixel shuffle
- Pixel shuffle only

Note: Why apply a convolution before upsampling during decoding?
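
The two upsampling operations compared above can be sketched in plain Python on nested lists shaped [C][H][W]. This is an illustrative sketch, not the project's implementation: the function names are hypothetical, and it assumes "NNConv3" means a 3x3 convolution followed by nearest-neighbor interpolation. The pixel shuffle uses the common channel-to-space layout, where input channel `c*r*r + i*r + j` fills output offset `(i, j)`.

```python
def upsample_nearest(x, r=2):
    """Nearest-neighbor upsampling of a [C][H][W] nested list by factor r."""
    return [
        [[plane[y // r][xx // r] for xx in range(len(plane[0]) * r)]
         for y in range(len(plane) * r)]
        for plane in x
    ]


def pixel_shuffle(x, r=2):
    """Rearrange [C*r*r][H][W] into [C][H*r][W*r] (channel-to-space)."""
    c_out = len(x) // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]  # channel holding offset (i, j)
                for y in range(h):
                    for xx in range(w):
                        out[c][y * r + i][xx * r + j] = src[y][xx]
    return out


# One channel of size 1x2 becomes 2x4; every value is repeated in a 2x2 patch.
nearest = upsample_nearest([[[1, 2]]])
# Four channels of size 1x1 become one channel of size 2x2.
shuffled = pixel_shuffle([[[1]], [[2]], [[3]], [[4]]])
print(nearest)
print(shuffled)
```

One possible answer to the note: pixel shuffle needs `C*r*r` input channels, which a preceding convolution can provide, and a convolution applied before upsampling runs on `r*r` times fewer pixels than the same convolution applied after it.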

Loss function:

- Train with only the last decoder output (decoder_0) (1)
- Train with multi-scale depth prediction
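
The difference between the two loss options can be sketched as follows. The page does not fix the loss type or the scale weights, so this sketch assumes an L1 (mean absolute error) loss, equal weights, and a ground truth that is subsampled by a factor of 2 per scale. All function names are hypothetical.

```python
def downsample(depth, factor):
    """Subsample a 2D depth map by keeping every `factor`-th pixel."""
    return [row[::factor] for row in depth[::factor]]


def l1_loss(pred, target):
    """Mean absolute error between two 2D maps of equal size."""
    diffs = [abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(diffs) / len(diffs)


def multi_scale_loss(preds, target, weights=None):
    """preds[s] is the prediction at 1 / 2**s of full resolution.

    preds[0] is the full-resolution output of the last decoder, so passing
    a single-element list reproduces the "last decoder only" option.
    """
    weights = weights or [1.0] * len(preds)
    return sum(w * l1_loss(p, downsample(target, 2 ** s))
               for s, (p, w) in enumerate(zip(preds, weights)))


# Toy data: a 4x4 ground truth, a 4x4 prediction, and a 2x2 prediction.
target = [[1.0] * 4 for _ in range(4)]
pred_full = [[1.5] * 4 for _ in range(4)]
pred_half = [[2.0] * 2 for _ in range(2)]

single = multi_scale_loss([pred_full], target)
multi = multi_scale_loss([pred_full, pred_half], target)
print(single, multi)
```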

Training method:

- Supervised learning (1)
- Unsupervised learning using stereo images
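
As a rough illustration of how stereo pairs can replace depth labels, the sketch below reconstructs one row of the left image by sampling the right image at `x - d` and measures the photometric error. It assumes rectified stereo images, a single image row, and an L1 error; the page does not specify the actual formulation, and the function names are hypothetical.

```python
def warp_row(right_row, disparity_row):
    """Reconstruct a left-image row by sampling the right row at x - d."""
    w = len(right_row)
    out = []
    for x, d in enumerate(disparity_row):
        src = min(max(x - d, 0.0), w - 1.0)  # clamp to the image border
        x0 = int(src)
        x1 = min(x0 + 1, w - 1)
        a = src - x0                         # linear interpolation weight
        out.append((1 - a) * right_row[x0] + a * right_row[x1])
    return out


def photometric_loss(left_row, right_row, disparity_row):
    """Mean absolute difference between the left row and its reconstruction."""
    recon = warp_row(right_row, disparity_row)
    return sum(abs(l - r) for l, r in zip(left_row, recon)) / len(left_row)


# Toy rows: the left row is the right row shifted by one pixel.
right = [10, 20, 30, 40, 50]
left = [10, 10, 20, 30, 40]

loss_correct = photometric_loss(left, right, [1] * 5)  # true disparity
loss_wrong = photometric_loss(left, right, [0] * 5)    # no shift
print(loss_correct, loss_wrong)
```

The correct disparity gives the lower loss, which is the training signal. For a calibrated rig, depth follows from disparity as focal length times baseline divided by disparity.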

Edited Jun 25, 2024 by Harley Nelson Lara Alonso