Evaluation on Performance and Energy Efficiency of Distributed Computing Systems

Ph.D. Dissertation

by

Tran Thi Xuan (MSc)

Supervised by

Prof. Do Van Tien (DSc)

Department of Networked Systems and Services

Budapest University of Technology and Economics

Hungary, 2020

Abstract

The increasing use of distributed computing systems to serve the growing demand for scientific computation and big data processing comes with a drastic growth in the energy consumption of computing clusters. Optimizing the energy consumption of computational clusters has therefore become more crucial than ever. This dissertation summarizes a study of the resource allocation problem in distributed systems, motivated by the need to take into account different resource characteristics and dynamic power management (DPM) techniques.

First, a generalized model of computational clusters built from heterogeneous types of commercial off-the-shelf (COTS) servers has been introduced to study resource-aware scheduling. A set of scheduling heuristics that consider the servers' performance and power consumption characteristics, together with the organization of waiting buffers, has been investigated. We show that the buffering scheme plays an important role in ensuring quality of service in terms of the waiting time and the response time experienced by arriving jobs. Moreover, scheduling based on the energy efficiency characteristic conserves system energy, while the policy that prioritizes high-performance servers yields the best performance.
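The contrast between the two ranking criteria can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the `Server` class, the performance-per-watt ratio used as the efficiency measure, and all numeric values are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Server:
    name: str
    performance: float   # hypothetical benchmark score (higher is faster)
    power_watts: float   # hypothetical average power draw under load
    busy: bool = False

    @property
    def efficiency(self) -> float:
        # Assumed efficiency measure: work delivered per watt.
        return self.performance / self.power_watts


def pick_server(servers: List[Server],
                key: Callable[[Server], float]) -> Optional[Server]:
    """Return the idle server with the highest ranking value, or None if all are busy."""
    idle = [s for s in servers if not s.busy]
    return max(idle, key=key) if idle else None


def by_efficiency(server: Server) -> float:
    return server.efficiency


def by_performance(server: Server) -> float:
    return server.performance


# Hypothetical three-server cluster.
cluster = [
    Server("A", performance=100.0, power_watts=200.0),
    Server("B", performance=80.0, power_watts=100.0),
    Server("C", performance=150.0, power_watts=400.0),
]

energy_choice = pick_server(cluster, by_efficiency)
speed_choice = pick_server(cluster, by_performance)
```

With these made-up numbers the efficiency ranking selects server B and the performance ranking selects server C, which shows how the two policies can send the same job to different machines.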

Second, new scheduling algorithms based on real-time measurements have been proposed to achieve a trade-off between the energy efficiency and the performance capability of computational clusters. Numerical results show that the proposed algorithms attain a balance between job execution time and energy efficiency.

Third, the impact of dynamic power management (DPM) in computing systems built from multicore processors has been investigated. Numerical results indicate that DPM at the core level of processors can help reduce energy consumption. A resource-aware scheduling solution has been proposed to achieve energy-efficient processing of parallel tasks in multicore systems. The results obtained indicate that the proposal reduces energy consumption significantly in comparison to random allocation.
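Why core-level DPM can save energy is easy to see from a back-of-the-envelope calculation. The sketch below assumes a simple three-state power model (active, idle, sleep) and ignores state-transition costs; the function name and every power value are hypothetical and are not taken from the dissertation.

```python
def interval_energy_joules(active_cores: int, total_cores: int, duration_s: float,
                           p_active_w: float, p_idle_w: float, p_sleep_w: float,
                           dpm_enabled: bool) -> float:
    """Energy used by one processor over an interval with a fixed number of busy cores."""
    if not 0 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 0 and total_cores")
    unused = total_cores - active_cores
    # With core-level DPM the unused cores drop to the sleep state;
    # without it they stay idle and keep drawing idle power.
    p_unused = p_sleep_w if dpm_enabled else p_idle_w
    return (active_cores * p_active_w + unused * p_unused) * duration_s


# Hypothetical 8-core processor running 2 busy cores for 100 seconds.
without_dpm = interval_energy_joules(2, 8, 100.0, 10.0, 4.0, 0.5, dpm_enabled=False)
with_dpm = interval_energy_joules(2, 8, 100.0, 10.0, 4.0, 0.5, dpm_enabled=True)
```

Under these assumed values the interval costs 4400 J without DPM and 2300 J with it. The saving grows with the number of unused cores, which is one reason the placement of parallel tasks on cores matters for energy.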

Last, the energy inefficiency of a common big data scheduler, Hadoop YARN, has been investigated. Since the resource allocation policy in a Hadoop YARN cluster is data-aware (i.e., the allocation strongly depends on the locations of data splits in the Hadoop Distributed File System, HDFS), a new data placement scheme for HDFS has been proposed to achieve energy efficiency when MapReduce tasks are processed by the cluster. Compared to the existing HDFS data layout scheme, the proposal reduces energy consumption by more than 50% at the small expense of an increase of ≈6% in job execution time.

I, the undersigned Tran Thi Xuan, hereby state that I have written this doctoral dissertation myself, and I have used only the sources given in it. I have clearly marked all the parts taken from other sources either word for word or reworded but with the same contents, indicating their sources.

The reviews of the dissertation and the report of the thesis discussion are available at the Dean's Office of the Electrical Engineering and Informatics Faculty, Budapest University of Technology and Economics.

Budapest, February 17, 2020

Tran Thi Xuan

Acknowledgements

I would like to thank all the people who have provided invaluable assistance during my studies towards the Ph.D. degree.

I would like to express my sincere gratitude to Prof. Dr. Do Van Tien for his intensive supervision. He guided the direction of my research from its preliminary stage. Without his continuous supervision and frank criticism, I could not have accomplished this study and obtained the Ph.D. degree.

I deeply thank Dr. Do Hoai Nam, a senior researcher in the Analysis, Design and Development of ICT Systems laboratory at our department, for his cooperation and enthusiastic support throughout my research. I also acknowledge all members of the Analysis, Design and Development of ICT Systems laboratory, the other Ph.D. students, and the university staff.

Finally, I dedicate my heartfelt thanks to my husband and son, Le Linh Bang and Le Minh Anh, for their love and encouragement. I am also grateful to all family members and friends who have supported me throughout.

Contents

Abstract
Acknowledgements
List of Figures
List of Tables
1 Introduction
2 A generalized model of heterogeneous computing clusters for investigation of scheduling schemes
2.1 Introduction
2.2 A generalized cluster model and Scheduling algorithms
2.2.1 Ranking of servers
2.2.2 Scheduling algorithms
2.2.3 Performance measures and energy metrics
2.3 Simulation Inputs and Numerical Results
2.3.1 Input parameters
2.3.2 Numerical results
2.4 Conclusion
3 New algorithms for balancing energy consumption and performance in computational clusters
3.1 Introduction
3.2 System description and proposed scheduling algorithms
3.2.1 Scheduling algorithms
3.3 Numerical Results
3.3.1 The parameters of a computational cluster
3.3.2 Job balance
3.3.3 System metrics
3.3.4 Impacts of DVFS
3.3.5 Evaluations with workload traces as input data
3.4 Conclusion
4 Impact of Dynamic power management techniques in computing systems of multicore processors
4.1 Introduction
4.2 Dynamic Power Management practices
4.3 System descriptions and operation scenarios
4.3.1 Job assignment scenarios
4.3.2 Performance and energy metrics
4.4 Evaluation on the impact of DPM
4.4.1 Simulation inputs
4.4.2 Analysis of obtained results
4.5 A proposal of Resource-aware scheduling algorithm
4.5.1 The proposed policy
4.5.2 Numerical results
4.6 Conclusion
5 A New Data Layout Scheme for Energy-Efficient MapReduce Processing Tasks
5.1 Introduction
5.2 Related Work
5.3 The operation of HDFS and YARN in a computing cluster
5.3.1 HDFS
5.3.2 Yet Another Resource Negotiator (YARN)
5.3.3 Processing Hadoop MapReduce applications
5.3.4 The default HDFS data layout
5.3.5 A locality relaxation algorithm for resource allocation in RM
5.4 A New Data Layout Algorithm
5.4.1 Subsets of servers
5.4.2 A proposed algorithm
5.5 Numerical Results
5.5.1 Parameters for a case study
5.5.2 Numerical results
5.6 Conclusion
6 Summary
Own Publications
Bibliography
