A Study on Cross-Architectural Modelling of Power Consumption Using Neural Networks

Vadim V. Elisseev; Milos Puzovic; Eun Kyung Lee

doi:10.14529/jsfi180403

Authors

Vadim V. Elisseev IBM
Milos Puzovic STFC
Eun Kyung Lee IBM

DOI:

https://doi.org/10.14529/jsfi180403

Abstract

On the path to Exascale, the goal of High Performance Computing (HPC) shifts from achieving maximum performance to achieving maximum performance under a strict power constraint. Novel approaches to hardware and software co-design of modern HPC systems have to be developed to address such challenges. In this paper, we study prediction of power consumption of HPC systems using metrics obtained from hardware performance counters. We argue that this methodology is portable across different microarchitecture implementations and compare results obtained on Intel 64, IBM® and Cavium ThunderX® ARMv8 microarchitectures. We discuss the optimal number and type of hardware performance counters required to accurately predict power consumption.
We compare the accuracy of power predictions provided by models based on Linear Regression (LR) and Neural Networks (NN). We find that the NN-based model provides more accurate predictions than the LR model. We also find that it is not yet possible to predict power consumption on a given microarchitecture using data obtained on a different microarchitecture. The results of our work can be used as a starting point for developing unified, cross-architectural models for predicting power consumption.
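
As an illustration of the LR baseline mentioned in the abstract, the following minimal sketch fits a linear model of the form power = b0 + b1*c1 + b2*c2 to counter-derived metrics by ordinary least squares. The abstract does not give the exact formulation, so everything here is an assumption: the two metrics (instructions per cycle and cache misses per kilo-instruction), the helper names `solve`, `fit_linear` and `predict`, and all numbers are synthetic placeholders, not data or code from the paper.

```python
# Minimal sketch: ordinary least squares for power ~ b0 + sum(b_i * counter_i).
# Metric names and all values are synthetic placeholders (assumptions),
# not measurements from the paper.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_linear(samples, power):
    """Return [b0, b1, ...] minimising squared error via the normal equations."""
    rows = [[1.0] + list(s) for s in samples]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * p for r, p in zip(rows, power)) for i in range(k)]
    return solve(xtx, xty)

def predict(coef, sample):
    """Predicted power for one sample of counter-derived metrics."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], sample))

# Synthetic samples: (instructions per cycle, cache misses per kilo-instruction).
samples = [(0.5, 10.0), (1.0, 5.0), (1.5, 20.0), (2.0, 2.0), (0.8, 15.0)]
# Synthetic "measured" power in watts, generated from 40 + 30*ipc + 0.5*mpki.
power = [40.0 + 30.0 * ipc + 0.5 * mpki for ipc, mpki in samples]

coef = fit_linear(samples, power)
```

Because the synthetic power values are generated from an exactly linear rule, the fit recovers that rule; real counter data would be noisy and possibly non-linear, which is the situation in which the abstract reports the NN-based model to be more accurate.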
