Long Distance Geographically Distributed InfiniBand Based Computing

Authors

  • Karol Niedzielewski Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
  • Marcin Semeniuk Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
  • Jarosław Skomiał Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
  • Jerzy Proficz Centre of Informatics - Tricity Academic Supercomputer & networK, Gdańsk University of Technology, Gdańsk, Poland
  • Piotr Sumioka Centre of Informatics - Tricity Academic Supercomputer & networK, Gdańsk University of Technology, Gdańsk, Poland
  • Bartosz Pliszka Centre of Informatics - Tricity Academic Supercomputer & networK, Gdańsk University of Technology, Gdańsk, Poland
  • Marek Michalewicz Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland

DOI:

https://doi.org/10.14529/jsfi200202

Abstract

Collaboration between multiple computing centres, referred to as federated computing, is becoming an important pillar of High Performance Computing (HPC) and will be one of its key components in the future. To test the technical possibilities of such collaboration over a 100 Gb optical fiber link (900 km long, with a 9 ms round-trip time), we prepared two scenarios of operation.
In the first scenario, the Interdisciplinary Centre for Mathematical and Computational Modelling (ICM) in Warsaw and the Centre of Informatics - Tricity Academic Supercomputer & networK (CI-TASK) in Gdańsk prepared a long-distance, geographically distributed computing cluster. The system consisted of 14 nodes (10 at the ICM facility and 4 at the TASK facility) connected using InfiniBand. Our tests demonstrate that it is possible to perform computationally intensive data analysis on systems of this class without a substantial drop in performance for certain types of workloads. Additionally, we show that it is feasible to use High Performance ParalleX (HPX) [1], a high-level abstraction library for distributed computing, to develop software for such geographically distributed computing resources while maintaining the desired efficiency.
In the second scenario, we prepared a distributed simulation-postprocessing-visualization workflow using ADIOS2 [2] and two programming languages (C++ and Python). In this test, we demonstrate the capability of performing different parts of the analysis at separate sites.
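
The link figures quoted in the abstract can be sanity-checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes that signals travel through optical fiber at roughly two thirds of the vacuum speed of light, and it reads "100 Gb" as a line rate of 100 Gb/s. The function names are hypothetical.

```python
# Back-of-the-envelope check of the link parameters quoted in the abstract.
# Assumptions (not from the paper): the velocity factor of the fiber is 2/3,
# and the "100 Gb" link is a 100 Gb/s line rate.

SPEED_OF_LIGHT_KM_S = 299_792.458   # vacuum speed of light in km/s
FIBER_VELOCITY_FACTOR = 2 / 3       # assumed, typical for silica fiber


def propagation_rtt_ms(length_km, velocity_factor=FIBER_VELOCITY_FACTOR):
    """Round-trip propagation delay in milliseconds for a fiber of given length."""
    one_way_s = length_km / (SPEED_OF_LIGHT_KM_S * velocity_factor)
    return 2 * one_way_s * 1e3


def bandwidth_delay_product_bytes(rate_bits_per_s, rtt_s):
    """Amount of data that must be in flight to keep the link fully used."""
    return rate_bits_per_s * rtt_s / 8


if __name__ == "__main__":
    rtt_ms = propagation_rtt_ms(900)
    bdp_mb = bandwidth_delay_product_bytes(100e9, 9e-3) / 1e6
    print(f"Estimated RTT over 900 km of fiber: {rtt_ms:.2f} ms")
    print(f"Bandwidth-delay product at 100 Gb/s and 9 ms: {bdp_mb:.1f} MB")
```

Under these assumptions the estimated round-trip time comes out at about 9 ms, which agrees with the reported value and suggests that the latency is dominated by propagation in the fiber. The bandwidth-delay product is about 112.5 MB, which gives a rough sense of how much data must be in flight to saturate such a link.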

References

Kaiser, H., Lelbach aka wash, B.A., Heller, T., Berge, A., et al.: STEllAR-GROUP/hpx: HPX V1.3.0: The C++ Standards Library for Parallelism and Concurrency (2019), DOI: 10.5281/zenodo.3189323

The Adaptable Input Output System version 2, https://github.com/ornladios/ADIOS2/, accessed: 2020-02-08

Orlowski, L., Deng, Y., Michalewicz, M.: Galaxies of supercomputers and their underlying interconnect topologies hierarchies. In: International Supercomputer Conference, Leipzig, Germany (2014), DOI: 10.13140/2.1.4798.2728

Michalewicz, M., Southwell, D., Tan, T., Poppe, Y., et al.: InfiniCortex: concurrent supercomputing across the globe utilising trans-continental InfiniBand and Galaxy of Supercomputers. In: Supercomputing 2014: The International Conference for High Performance Computing, Networking, Storage and Analysis, At New Orleans, LA, USA (2014), DOI: 10.13140/2.1.3267.7444

Michalewicz, M.T., Lian, T.G., Seng, L., Low, J., et al.: InfiniCortex: Present and Future Invited Paper. In: Proceedings of the ACM International Conference on Computing Frontiers, May 2016, Como, Italy. pp. 267–273. Association for Computing Machinery, New York, NY, USA (2016), DOI: 10.1145/2903150.2912887

Noaje, G., Davis, A., Low, J., Lim, S., et al.: InfiniCortex – From Proof-of-concept to Production. Supercomputing Frontiers and Innovations 4(2), 87–102 (2017), DOI: 10.14529/jsfi170207

Obsidian Strategics Inc., https://www.cybersecurityintelligence.com/obsidian-strategics-106.html, accessed: 2020-06-01

Obsidian Strategics Inc., https://obsidianstrategics.com/index.html, accessed: 2020-06-01

Vcinity Inc., https://vcinity.io/, accessed: 2020-06-01

Mellanox MetroX R-2 Systems, https://www.mellanox.com/products/long-haul, accessed: 2020-06-01

Obsidian Longbow Campus Solutions Extend Its Columbia Supercomputer across Multiple NASA Locations, https://www.militaryaerospace.com/home/article/16725502/obsidian-longbow-campus-solutions-extend-its-columbia-supercomputer-across-multiple-nasa-locations, accessed: 2020-06-01

Eikenberry, S., Lindekugel, K., Stanzione, D.: Long Haul InfiniBand Technology: Implications for Cluster Computing, Arizona State University (2006), https://obsidianstrategics.com/archives/2006/asustanzione ccs.pdf, accessed: 2020-06-28

El-Harake, H.N., Gamboni, C., Gorini, S., Schoenemeyer, T.: Evaluation of infiniband range extension offered by obsidian (2011)

Richling, S., Kredel, H., Hau, S., Kruse, H.G.: A long-distance infiniband interconnection between two clusters in production use. In: State of the Practice Reports, November 2011, Seattle, Washington. Association for Computing Machinery, New York, NY, USA (2011), DOI: 10.1145/2063348.2063368

Ban, K., Chrzeszczyk, J., Howard, A., Li, D., Tan, T.W.: InfiniCloud: Leveraging the Global InfiniCortex Fabric and OpenStack Cloud for Borderless High Performance Computing of Genomic Data. Supercomputing Frontiers and Innovations 2(3), 14–27 (2015), DOI: 10.14529/jsfi150302

Chrzeszczyk, J., Howard, A., Chrzeszczyk, A., Swift, B., Davis, P., Low, J., Tan, T.W., Ban, K.: InfiniCloud 2.0: distributing High Performance Computing across continents. Supercomputing Frontiers and Innovations 3(2), 54–71 (2016), DOI: 10.14529/jsfi160204

Antypas, K.: Superfacility: How new workflows in the DOE Office of Science are influencing storage system requirements? (2016), https://storageconference.us/2016/Slides/KatieAntypas.pdf, accessed: 2020-06-01

NERSC Superfacility, https://www.nersc.gov/research-and-development/superfacility/, accessed: 2020-06-01

Creating Super-facilities: a Coupled Facility Model for Data-Intensive Science, Internet 2 Global Summit 2015, http://meetings.internet2.edu/2015-global-summit/detail/10003679/, accessed: 2020-06-01

Bell, G.: The Energy Sciences Network: Overview, Update, Impact (DoE) - presentation, https://science.osti.gov/-/media/ascr/ascac/pdf/meetings/20150324/Bell ESNet.pdf?la=en&hash=46C0168F7ADAB232EC32E4452C49A159453859C9, accessed: 2020-06-01

Fenix Research Infrastructure, https://fenix-ri.eu/about-fenix, accessed: 2020-06-01

Noaje, G.: InfiniCortex, InfiniBand nation-wide and world-wide, a talk given at Journee Scientifique ROMEO’2016, Reims, France (2016), https://romeo.univ-reims.fr/news/208/Journee Scientifique ROMEO 2016 le 9 juin 2016 a REIMS, accessed: 2020-06-01

Proficz, J., Sumionka, P., Skomiał, J., Semeniuk, M., Niedzielewski, K., Walczak, M.: Investigation into MPI All-Reduce Performance in a Distributed Cluster with Consideration of Imbalanced Process Arrival Patterns. In: International Conference on Advanced Information Networking and Applications, 15-17 April, Caserta, Italy. pp. 817–829. Springer (2020), DOI: 10.1007/978-3-030-44041-1_72

Niedzielewski, K., Marchwiany, M.E., Piliszek, R., Michalewicz, M., Rudnicki, W.: Multidimensional feature selection and high performance parallex. SN Computer Science 1(1), 40 (2020), DOI: 10.1007/s42979-019-0037-5

Open MPI: Open source high performance computing, https://www.open-mpi.org/, accessed: 2020-02-08

Guyon, I., Gunn, S., Ben-Hur, A., Dror, G.: Result analysis of the nips 2003 feature selection challenge. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems 17, pp. 545–552. MIT Press (2005), http://papers.nips.cc/paper/2728-result-analysis-of-the-nips-2003-feature-selection-challenge.pdf

Dua, D., Graff, C.: UCI machine learning repository (2017), http://archive.ics.uci.edu/ml

Application examples for the ADIOS2 I/O library, https://github.com/ornladios/ADIOS2-Examples, accessed: 2020-02-08

Pearson, J.E.: Complex Patterns in a Simple System. Science 261(5118), 189–192 (1993), DOI: 10.1126/science.261.5118.189

Published

2020-07-21

How to Cite

Niedzielewski, K., Semeniuk, M., Skomiał, J., Proficz, J., Sumioka, P., Pliszka, B., & Michalewicz, M. (2020). Long Distance Geographically Distributed InfiniBand Based Computing. Supercomputing Frontiers and Innovations, 7(2), 24–34. https://doi.org/10.14529/jsfi200202
