Grid Computing Evolution in Scientific Applications

Authors

M. A. Grigoryeva, A. A. Klimentov

DOI:

https://doi.org/10.14529/jsfi240101

Keywords:

grid, distributed computing, history, science, metacomputing, grid middleware, scientific grid, cloud computing

Abstract

The advent of interconnected machines laid the foundation for the use of distributed computing resources. At the turn of the century, rapid advances in computing technologies and significant hardware innovations coincided with exponential growth in scientific research. Because supercomputers remained limited in availability and usage, distributed computing harnessed idle and dedicated resources, bridging the gap between scientists and computationally intensive projects. This paper reviews the evolution of grid computing in scientific applications, from advances in network connection technologies and the concept of metacomputing to current developments that integrate cloud technologies with large-scale grids. It outlines the key milestones, advancements, and challenges encountered throughout this evolution, highlights the potential of grid computing to enable scientific breakthroughs, and addresses future research directions. The most popular middleware systems are considered, and scientific grid systems, both those that existed in the past and those still in operation today, are described. The article concludes by examining two of the most significant scientific discoveries that became possible largely thanks to grid technologies.

Published

2024-06-06

How to Cite

Grigoryeva, M. A., & Klimentov, A. A. (2024). Grid Computing Evolution in Scientific Applications. Supercomputing Frontiers and Innovations, 11(1), 4–50. https://doi.org/10.14529/jsfi240101