Accelerating Seismic Redatuming Using Tile Low-Rank Approximations on NEC SX-Aurora TSUBASA
DOI:
https://doi.org/10.14529/jsfi210201
Abstract
To image subsurface discontinuities, seismic data recorded at the Earth's surface must be numerically repositioned inside the subsurface, where the reflections originated, a process referred to as redatuming. The recently developed Marchenko method can handle full-wavefield data, including multiple arrivals. A downside of this approach is that a multi-dimensional convolution operator must be evaluated repeatedly to solve an expensive inverse problem. Because this operator applies multiple dense matrix-vector multiplications (MVM), we identify and leverage the data sparsity structure of each frequency matrix and propose to accelerate the MVM step using tile low-rank (TLR) matrix approximations. We study the impact of TLR on the MVM time-to-solution for different accuracy thresholds while assessing the quality of the resulting subsurface seismic wavefields, and show that TLR leads to minimal degradation in signal-to-noise ratio on a 3D synthetic dataset. We mitigate the load imbalance overhead and provide performance evaluation on two distributed-memory systems. Our MPI+OpenMP TLR-MVM implementation reaches up to 3X performance speedup against the dense MVM counterpart from the NEC scientific library on 128 NEC SX-Aurora TSUBASA cards. Thanks to the second generation of high bandwidth memory technology, it further attains up to 67X performance speedup compared to the dense MVM from Intel MKL when running on 128 dual-socket 20-core Intel Cascade Lake nodes with DDR4 memory. This corresponds to 110 TB/s of aggregated sustained bandwidth for our TLR-MVM implementation, without suffering deterioration in the quality of the reconstructed seismic wavefields.
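The idea of a TLR matrix-vector product can be illustrated with a minimal single-process NumPy sketch. It is not the MPI+OpenMP implementation described in the abstract. The function names (`compress_tlr`, `tlr_mvm`), the tile size, the threshold, and the test kernel (a smooth oscillatory Green's-function-like matrix between two separated lines of points, standing in for one frequency matrix) are all assumptions chosen for illustration. Each tile is compressed with a truncated SVD whose rank is set by a relative accuracy threshold, and the product is then accumulated tile by tile from the low-rank factors.

```python
import numpy as np


def compress_tlr(A, tile, tol):
    """Split A into tile x tile blocks and keep a truncated SVD of each block.

    Returns a dict mapping the (row, col) offset of each tile to factors (U, V)
    such that the tile is approximately U @ V.
    """
    m, n = A.shape
    tiles = {}
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            U, s, Vh = np.linalg.svd(A[i:i + tile, j:j + tile], full_matrices=False)
            # Keep singular values above tol relative to the largest (at least one).
            k = max(1, int(np.count_nonzero(s > tol * s[0])))
            tiles[(i, j)] = (U[:, :k] * s[:k], Vh[:k, :])
    return tiles


def tlr_mvm(tiles, x, m):
    """Compute y ~= A @ x using only the low-rank factors of each tile."""
    y = np.zeros(m, dtype=complex)
    for (i, j), (U, V) in tiles.items():
        rows, cols = U.shape[0], V.shape[1]
        # Two thin products (V @ x, then U @ result) replace one dense tile product.
        y[i:i + rows] += U @ (V @ x[j:j + cols])
    return y


# Hypothetical stand-in for one frequency matrix: exp(i*w*r) / r between
# points on two parallel lines separated by a unit distance.
n, tile, tol, w = 256, 64, 1e-6, 20.0
src = np.linspace(0.0, 1.0, n)
rec = np.linspace(0.0, 1.0, n)
r = np.sqrt((rec[:, None] - src[None, :]) ** 2 + 1.0)
A = np.exp(1j * w * r) / r

rng = np.random.default_rng(0)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

tiles = compress_tlr(A, tile, tol)
y_dense = A @ x
y_tlr = tlr_mvm(tiles, x, n)

rel_err = np.linalg.norm(y_tlr - y_dense) / np.linalg.norm(y_dense)
dense_entries = A.size
compressed_entries = sum(U.size + V.size for U, V in tiles.values())
print(f"relative error: {rel_err:.2e}")
print(f"stored entries: {compressed_entries} (TLR) vs {dense_entries} (dense)")
```

Because the MVM is bound by memory traffic, the reduction in stored entries is a rough proxy for the achievable speedup. A looser threshold lowers the tile ranks and the storage at the cost of a larger error in the product.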
License
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-Non Commercial 3.0 License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.