3D Object Classification with Selective Multi-View Fusion and Shape Rendering

Author name : Mona Saleh Ahmad Alzahrani

Publication Date : 2024-11-27

Journal Name : 2024 International Conference on Digital Image Computing: Techniques and Applications (DICTA)

Abstract

3D object classification is challenging because the data are high-dimensional, their spatial relationships are intricate, and viewpoints vary. We address a gap in view-based 3D object classification by examining the factors that influence classification effectiveness. First, we compare CNN-based and Transformer-based backbone networks side by side to determine their respective merits in feature extraction for 3D object recognition. We then evaluate various fusion strategies to determine the most effective method for integrating multiple views, and we ascertain the number of views that best balances classification accuracy and computational cost. We also examine how effectively different feature types produced by rendering techniques depict 3D objects. The investigation is supported by extensive experiments on a diverse set of 3D objects from the ModelNet40 dataset. Finally, based on this analysis, we present a Selective Multi-View deep model (SelectiveMV) that is efficient and achieves high accuracy from only a few views.

Keywords

Object Classification, 3D Classification, Typical Features, Object Recognition, Backbone Network, Fusion Strategy, Transformer, Classification Accuracy, Grayscale, Point Cloud, Number Of Objects, Majority Voting, Fully-connected Layer, Pre-trained Network, Typical Architecture, Importance Scores, Softmax Activation, 3D Datasets, Single View, Fully-connected Network, Virtual Camera, Late Fusion, Feature Extraction Backbone, Set Of Views