We introduce a Multi-Scale Pyramidal Pooling Network tailored to generic steel defect classification, featuring a novel pyramidal pooling layer at multiple scales and a novel encoding layer. Thanks to the former, the network does not require all images of a given classification task to be of equal size. The latter narrows the gap to bag-of-features approaches. On various benchmark datasets, we evaluate and compare our system to convolutional neural networks and state-of-the-art computer vision methods. We also present results on a real industrial steel defect classification problem, where existing architectures are not applicable as they require equally sized input images. Our method substantially outperforms previous methods based on engineered features. It can be seen as a fully supervised hierarchical bag-of-features extension that is trained online and can be fine-tuned for any given task. © 2013 IEEE.
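The abstract states that the pyramidal pooling layer removes the need for equally sized input images. A minimal sketch of how pooling over a pyramid of grids can achieve this is given below. It is not the authors' implementation: the function name `pyramid_max_pool`, the choice of max pooling, and the grid levels `(1, 2, 4)` are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def pyramid_max_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map over n x n grids and concatenate.

    Hypothetical helper: the levels and the max operator are assumptions,
    not details taken from the paper.
    """
    c, h, w = fmap.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of bin i: floor(i*h/n) up to ceil((i+1)*h/n), never empty.
            r0, r1 = (i * h) // n, -(-((i + 1) * h) // n)
            for j in range(n):
                c0, c1 = (j * w) // n, -(-((j + 1) * w) // n)
                # One maximum per channel for this bin.
                pooled.append(fmap[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(pooled)


rng = np.random.default_rng(0)
x_wide = rng.standard_normal((8, 37, 120))   # e.g. an elongated defect crop
x_square = rng.standard_normal((8, 64, 64))  # a differently sized crop

a = pyramid_max_pool(x_wide)
b = pyramid_max_pool(x_square)
# Both vectors have length 8 * (1 + 4 + 16) = 168, regardless of H and W.
print(a.shape, b.shape)
```

Because the number of bins depends only on the pyramid levels and not on the spatial size of the feature map, the layers that follow always receive a vector of the same length.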
|Original language|English (US)|
|Title of host publication|Proceedings of the International Joint Conference on Neural Networks|
|State|Published - Dec 1 2013|