Autotuning Techniques for Performance-Portable Point Set Registration in 3D
DOI:
https://doi.org/10.14529/jsfi180404Abstract
We present an autotuning approach applied to exhaustive performance engineering of the EM-ICP algorithm for the point set registration problem with a known reference. We were able to achieve progressively higher performance levels through a variety of code transformations and an automated procedure of generating a large number of implementation variants. Furthermore, we managed to exploit code patterns that are not common when only attempting manual optimization but which yielded in our tests better performance for the chosen registration algorithm. Finally, we also show how we maintained high levels of the performance rate in a portable fashion across a wide range of hardware platforms including multicore, manycore coprocessors, and accelerators. Each of these hardware classes is much different from the others and, consequently, cannot reliably be mastered by a single developer in a short time required to deliver a close-to-optimal implementation. We assert in our concluding remarks that our methodology as well as the presented tools provide a valid automation system for software optimization tasks on modern HPC hardware.
References
Bartok, A., Kondor, R., Csanyi, G.: On representing chemical environments. Physical Review B 87(18) (2013), DOI: 10.1103/PhysRevB.87.184115
Besl, P.J., McKay, N.D.: A method for registration of 3-D shapes. IEEE PAMI 14(2), 239–256 (1992), DOI: 10.1109/34.121791
Chui, H., Rangarajan, A.: A feature registration framework using mixture models. In: IEEE Workshop on MMBIA. pp. 190–197 (2000), DOI: 10.1109/MMBIA.2000.852377
Chui, H., Rangarajan, A.: A new algorithm for non-rigid point matching. In: CVPR. vol. 2, pp. 44–51. IEEE Press (2000), DOI: 10.1016/S1077-3142(03)00009-2
Chui, H., Rangarajan, A.: A new point matching algorithm for non-rigid registration. CVIU 89(2–3), 114–141 (2003), DOI: 10.1016/S1077-3142(03)00009-2
Du, S., Zheng, N., Xiong, L., Ying, S., Xue, J.: Scaling iterative closest point algorithm for registration of md point sets. Journal of Visual Communication and Image Representation 21(5–6), 442–452 (2010), DOI: 10.1016/j.jvcir.2010.02.005
Fitzgibbon, A.W.: Robust registration of 2D and 3D point sets. Image and Vision Computing 21, 1145–1153 (2003), DOI: 10.1016/j.imavis.2003.09.004
Gao, M.C., Yeh, J.W., Liaw, P.K., Zhang, Y.: High-entropy alloys: fundamentals and applications. Springer International Publishing Switzerland (2016), DOI: 10.1007/978-3-319-27013-5
Gold, S., Lu, C.P., Rangarajan, A., Pappu, S., Mjolsness, E.: New algorithms for 2D and 3D point matching: Pose estimation and corresp. In: NIPS. vol. 7, pp. 957–964. The MIT Press (1995), DOI: 10.1016/S0031-3203(98)80010-1
Granger, S., Pennec, X.: Multi-scale EM-ICP: A fast and robust approach for surface registration. In: et al., A.H. (ed.) ECCV 2002. pp. 418–432. LNCS 2353, (c) Springer-Verlag Berlin Heidelberg (2002), DOI: 10.1007/3-540-47979-1 28
Gupta, S., Agrawal, A., Gopalakrishnan, K., Narayanan, P.: Deep learning with limited numerical precision. CoRR abs/1502.02551 (2015), http://arxiv.org/abs/1502.02551, accessed: 2018-08-01
Gupta, S., Agrawal, A., Gopalakrishnan, K., Narayanan, P.: Deep learning with limited numerical precision. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference onMachine Learning. Proceedings of Machine Learning Research, vol. 37, pp. 1737– 1746. PMLR, Lille, France (2015), http://proceedings.mlr.press/v37/gupta15.html, accessed: 2018-08-01
Hockney, R.W., Curington, I.J.: f1/2: A parameter to characterize memory and communication bottlenecks. Parallel Computing 10(3), 277–286 (1989), DOI: 10.1016/0167-8191(89)90100-2
Kim, Y.M., Morozovska, A., Eliseev, E., Oxley, M., Mishra, R., Selbach, S., Grande, T., Pantelides, S., Kalinin, S., Borisevich, A.: Direct observation of ferroelectric field effect and vacancy-controlled screening at the BiFeO3/LaxSr1−xMnO3 interface. Nature Materials 13(11), 1019–1025 (2014), DOI: 10.1038/nmat4058
Larson, D., Prosa, T., Ulfig, R., Geiser, B., Kelly, T.: Local Electrode Atom Probe Tomography: A User’s Guide. Springer-Verlag New York (2013), DOI: 10.1007/978-1-4614-8721-0
Luszczek, P., Gates, M., Kurzak, J., Danalis, A., Dongarra, J.: Search space generation and pruning system for autotuners. In: Proceedings of IPDPSW. pp. 1545–1554. The Eleventh International Workshop on Automatic Performance Tuning (iWAPT) 2016, IEEE, Chicago, IL, USA (2016), DOI: 10.1109/IPDPSW.2016.197
Luszczek, P., Kurzak, J., Yamazaki, I., Dongarra, J.: Towards numerical benchmark for halfprecision floating point arithmetic. In: 2017 IEEE High Performance Extreme Computing Conference (2017), DOI: 10.1109/HPEC.2017.8091031
Luszczek, P., Kurzak, J., Yamazaki, I., Keffer, D., Dongarra, J.J.: Scaling point set registration in 3D across thread counts on multicore and hardware accelerator platforms through autotuning for large scale analysis of scientific point clouds. In: 2017 IEEE International Conference on Big Data (Big Data). pp. 2893–2902. Boston, MA, USA (2017), DOI: 10.1109/BigData.2017.8258258
Miller, M.K., Forbes, R.G.: Atom-Probe Tomography: The Local Electrode Atom Probe. Springer US (2014), DOI: 10.1007/978-1-4899-7430-3
Myronenko, A., Song, X., Carreira-Perpi˜n´an, M.A.: Non-rigid point set registration: Coherent Point Drift. In: Sch¨olkopf, B., Platt, J.C., Hoffman, T. (eds.) Advances in Neural Information Processing Systems 19. pp. 1009–1016. MIT Press (2007)
Rangarajan, A., Chui, H., Mjolsness, E., Davachi, L., Goldman-Rakic, P.S., Duncan, J.S.: A robust point matching algorithm for autoradiograph alignment. Medical Image Analysis 1(4), 379–398 (1997), DOI: 10.1016/S1361-8415(97)85008-6
Rusinkiewicz, S., Levoy, M.: Efficient variants of the ICP algorithm. In: International Conference on 3D Digital Imaging and Modeling (3DIM). pp. 145–152 (2001), DOI: 10.1109/IM.2001.924423
Rusu, R.B., Cousins, S.: 3D is here: Point Cloud Library (PCL). In: IEEE International Conference on Robotics and Automation (ICRA). Shanghai, China (2011), DOI: 10.1109/ICRA.2011.5980567
Sohlberg, K., Rashkeev, S., Borisevich, A., Pennycook, S., Pantelides, S.: Origin of anomalous Pt-Pt distances in the Pt/Alumina catalytic system. ChemPhysChem 5(12), 1893–1897 (2004), DOI: 10.1002/cphc.200400212
Volkov, V., Demmel, J.W.: Benchmarking GPUs to Tune Dense Linear Algebra. In: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing. pp. 31:1–31:11. SC ’08, IEEE Press, Piscataway, NJ, USA (2008), http://dl.acm.org/citation.cfm?id=1413370.1413402, accessed: 2018-08-01, DOI: 10.1145/1413370.1413402
Zhang, Y., Zuo, T.T., Tang, Z., Gao, M.C., Dahmen, K.A., Liaw, P.K., Lu, Z.P.: Microstructures and properties of high-entropy alloys. Progress in Materials Science 61, 1–93 (2014), DOI: 10.1016/j.pmatsci.2013.10.001
Zhang, Z.: Iterative point matching for registration of free-form curves and surfaces. International Journal of Computer Vision 13(2), 119–152 (1994), DOI: 10.1007/BF01427149
Downloads
Published
How to Cite
Issue
License
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-Non Commercial 3.0 License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.