Difference between revisions of "AMGCC15 Program"

From HTC-as-a-Service, KISTI
(Created page with "== Workshop Program | Workshop Main Page== * '''Location''': Imperial College * '''Date''': September 12 (Friday), 2014 * '''Time''': 08:30 AM - 05:00 PM This wo...")
 
(Workshop Program | Workshop Main Page)
 
(44 intermediate revisions by the same user not shown)
Line 1: Line 1:
== Workshop Program | [[AMGCC14|Workshop Main Page]]==
+
== Workshop Program | [[AMGCC15|Workshop Main Page]]==
  
* '''Location''': Imperial College
+
* '''Location''': Boston Marriott Cambridge, Cambridge, MA, USA
* '''Date''': September 12 (Friday), 2014
+
* '''Date''': September 21 (Monday), 2015
* '''Time''': 08:30 AM - 05:00 PM
+
* '''Time''': 08:30 AM - 06:00 PM (Salon VII)
  
This workshop features a keynode by Dr. David Wallom (University of Oxford), two invited talks by Dr. Dana Petcu (West University of Timisoara) and Boris Parak (Czech Education and Scientific Network provider), and oral presentations of 5 full papers.
+
This workshop features two keynotes by Dr. Alan Edelman (Massachusetts Institute of Technology) and Dr. Kate Keahey (Argonne National Laboratory/University of Chicago), two invited talks by Robert Quick (Indiana University) and Dr. Hyeonsang Eom (Seoul National University), and oral presentations of 7 full papers.
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 16: Line 16:
 
|- align="center"
 
|- align="center"
  
| width="10%"| 08:30 - 09:30
+
| width="10%"| 08:30 - 09:00
| colspan="4" align="center"| CAC Registration and Check-in
+
| colspan="4" align="center"| ICCAC 2015 Registration and Check-in
  
 
|- align="center"
 
|- align="center"
| width="10%"| 09:30 - 09:40
+
| width="10%"| 09:00 - 09:10
 
| colspan="4" align="center"| Welcome Remarks by AMGCC workshop organizers
 
| colspan="4" align="center"| Welcome Remarks by AMGCC workshop organizers
  
 
|- align="center"
 
|- align="center"
| width="10%"| 09:40 - 10:35
+
| width="10%"| 09:10 - 10:00
| width="40%"| '''Keynote''': Federation of Cloud Computing to Create a Uniform e-infrastructure for Research
+
| width="40%"| '''Keynote''': Julia: A fresh approach to parallel programming
| width="20%"| David Wallom
+
| width="20%"| Alan Edelman
| width="20%"| University of Oxford
+
| width="20%"| MIT
  
| rowspan="2"| Soonwook Hwang
+
| rowspan="2"| Hyeonsang Eom
  
 
|- align="center"
 
|- align="center"
| width="10%"| 10:35 - 11:30
+
| width="10%"| 10:00 - 10:45
| width="40%"| '''Invited Talk''': Challenges of Multi Clouds
+
| width="40%"| '''Invited Talk''': Autonomy in the Open Science Grid or Pay No Attention to the Operator Behind the Curtains
| width="20%"| Dana Petcu
+
| width="20%"| Robert Quick
| width="20%"| West University of Timisoara
+
| width="20%"| Indiana University
  
 
|- align="center"
 
|- align="center"
| 11:30 - 12:00
+
| 10:45 - 11:00
 
| colspan="4" align="center"| Coffee Break
 
| colspan="4" align="center"| Coffee Break
  
 
|- align="center"
 
|- align="center"
| 12:00 - 12:30
+
| 11:00 - 11:30
| VM Auto-Scaling for Workflows in Hybrid Cloud Computing
+
| Performance Analysis of Loosely Coupled Applications in Heterogeneous Distributed Computing Systems
| Younsun Ahn, Yoonhee Kim
+
| Eunji Hwang, Seontae Kim, Tae-Kyung Yoo, Jik-Soo Kim, Soonwook Hwang and Young-Ri Choi
| Sookmyung Women's University
+
| Ulsan National Institute of Science and Technology, KISTI
  
| rowspan="2"| Jik-Soo Kim
+
| rowspan="3"| Jik-Soo Kim
  
 
|- align="center"
 
|- align="center"
| 12:30 - 13:00
+
| 11:30 - 12:00
| A User-Level File System for Fast Storage Devices
+
| Fine-Grained, Adaptive Resource Sharing for Real Pay-Per-Use Pricing in Clouds
| Yongseok Son, Nae Young Song, Hyuck Han, Hyeonsang Eom, Heon Young Yeom
+
| Young Choon Lee, Youngjin Kim, Hyuck Han and Sooyong Kang
| Seoul National University & Dongduk women's university
+
| Macquarie University, Hanyang University, Dongduk Women’s University
  
 
|- align="center"
 
|- align="center"
| 13:00 - 14:00
+
| 12:00 - 12:30
 +
| A Job Dispatch Optimization Method on Cluster and Cloud for Large-scale High-Throughput Computing Service
 +
| Jieun Choi, Seoyoung Kim, Theodora Adufu, Soonwook Hwang and Yoonhee Kim
 +
| Sookmyung Women's University, KISTI
 +
 
 +
|- align="center"
 +
| 12:30 - 14:00
 
| colspan="4" align="center"| Lunch
 
| colspan="4" align="center"| Lunch
  
 
|- align="center"
 
|- align="center"
| width="10%"| 14:00 - 14:50
+
| width="10%"| 14:00 - 14:45
| width="40%"| '''Invited Talk''': (r)OCCI -- Lessons in practical interoperability for IaaS clouds
+
| width="40%"| '''Keynote''': Chameleon: A Large-scale, Reconfigurable Experimental Environment for Next Generation Cloud Research
| width="20%"| Boris Parak
+
| width="20%"| Kate Keahey
| width="20%"| Czech Education and Scientific NETwork
+
| width="20%"| Argonne National Laboratory/University of Chicago
  
| rowspan="2"| Soonwook Hwang
+
| rowspan="3"| Jaehwan Lee
  
 
|- align="center"
 
|- align="center"
| 14:50 - 15:20
+
| 14:45 - 15:15
| Collaborative Multi-dimensional Dataset Processing with Distributed Cache Infrastructure in the Cloud
+
| An Empirical Evaluation of NVM Express SSD
| Youngmoon Eom, Jonghwan Moon, Jinwoong Kim, Beomseok Nam
+
| Yongseok Son, Hara Kang, Hyuck Han and Heon Young Yeom
| Ulsan National Institute of Science and Technology
+
| Seoul National University, Dongduk Women's University
  
 
|- align="center"
 
|- align="center"
| 15:20 - 15:50
+
| 15:15 - 15:45
 +
| SCOUT: A Monitor & Profiler of Grid Resources for Large-Scale Scientific Computing
 +
| Md Azam Hossain, Hieu Trong Vu, Jik-Soo Kim, Myungho Lee and Soonwook Hwang
 +
| University of Science & Technology, KISTI, Myongji University
 +
 
 +
|- align="center"
 +
| 15:45 - 16:00
 
| colspan="4" align="center"| Coffee Break
 
| colspan="4" align="center"| Coffee Break
  
 
|- align="center"
 
|- align="center"
| 15:50 - 16:20
+
| width="10%"| 16:00 - 16:45
| Toward a Multi-cluster Analytical Engine for Transportation Data
+
| width="40%"| '''Invited Talk''': How can we allocate “right” resources to virtual machines in virtualized data centers? – Workload-aware hierarchical scheduling with OpenStack
| Mark Shtern, Rizwan Mian, Marin Litoiu, Saeed Zareian, Hossam Abdelgawad, Ali Tizghadam
+
| width="20%"| Hyeonsang Eom
| York University & University of Toronto
+
| width="20%"| Seoul National University
  
| rowspan="2"| Yoonhee Kim
+
| rowspan="3"| Yoonhee Kim
  
 
|- align="center"
 
|- align="center"
| 16:20 - 16:50
+
| 16:45 - 17:15
| High Performance Parallelization of Boyer-Moore Algorithm on Many-Core Accelerators
+
| A CPU Overhead-aware VM Placement Algorithm for Network Bandwidth Guarantee in Virtualized Data Centers
| Yosang Jeong, Myungho Lee, Dukyun Nam, Jik-Soo Kim, Soonwook Hwang
+
| Kwonyong Lee and Sungyong Park
| Myongji University & KISTI
+
| Sogang University
  
 
|- align="center"
 
|- align="center"
| 16:50 - 17:00
+
| 17:15 - 17:45
 +
| Feasibility of the Computation Task Offloading to GPGPU-enabled Devices in Mobile Cloud
 +
| Kihan Choi, Jaehoon Lee, Youngjin Kim, Sooyong Kang and Hyuck Han
 +
| Hanyang University, Dongduk Women’s University
 +
 
 +
|- align="center"
 +
| 17:45 - 18:00
 
| colspan="4" align="center"| Closing Remarks
 
| colspan="4" align="center"| Closing Remarks
 
|}
 
|}
  
== Keynote ==
+
== Keynotes ==
  
=== Federation of Cloud Computing to Create a Uniform e-infrastructure for Research by Dr. David Wallom ===
+
=== Julia: A fresh approach to parallel programming ===
  
 +
==== Talk Abstract ====
  
==== Biography of the presenter ====
+
The Julia programming language is gaining enormous popularity. Julia was designed to be easy and fast. Most importantly, Julia shatters deeply established notions widely held in the applied community. Julia shows the fascinating dance between specialization and abstraction. Specialization allows for custom treatment. We can pick just the right algorithm for the right circumstance and this can happen at runtime based on argument types (code selection via multiple dispatch). Abstraction recognizes what remains the same after differences are stripped away and ignored as irrelevant. The recognition of abstraction allows for code reuse (generic programming). A simple idea that yields incredible power. Julia is many things to many people. In this talk we describe how Julia was built on the heels of our parallel computing experience with Star-P which began as an MIT research project and was a software product of Interactive Supercomputing. Our experience taught us that bolting parallelism onto an existing language that was not designed for performance or parallelism is difficult at best, and impossible at worst. One of our (not so secret) motivations to build Julia was to have the language we wanted for parallel numerical computing.
  
David Wallom is the Associate Director - Innovation at the Oxford e-Research Centre in the University of Oxford. He leads three separate activities within the centre, Energy and Environmental ICT, Volunteer Computing and advanced e-infrasructure including Cloud. He also leads engagement activities within the centre with industry. Following a period as the Technical Director of the UK NGS David has lead the EGI Federated Cloud activity snce its inception as one of the original architects.
+
=== Chameleon: A Large-scale, Reconfigurable Experimental Environment for Next Generation Cloud Research ===
  
 
==== Talk Abstract ====
 
==== Talk Abstract ====
  
The EGI Federated Cloud is Europes largest activity in the creation of a federation of heterogeneous public and private sector suitable for use by the research community. The service has been under development since Q4 2011 and was launched into production in May of this year. The FedCloud gives members of all user communities open access to reliable cloud computing resources, an alternative to large multinational providers. As the federation is made up of smaller providers, the service that we offer must include mechanisms where users can easily migrate from one service provider to another, a feature definitely not available in public cloud currently. To enable this we have built from day one with the principles of the usage of open standards where possible and published best practices in other areas.
+
Cloud services have become ubiquitous in all major 21st century economic activities -- there are, however, still many open questions surrounding this new technology. In particular, many open research questions concern the relationship between cloud computing and high performance computing, the suitability of cloud computing for data-intensive applications, and its position with respect to emergent trends such as Software Defined Networking. A persistent barrier to further understanding of those issues has been the lack of a large-scale testbed where they can be explored.
 +
 
 +
With funding from the National Science Foundation (NSF), the Chameleon project provides such a large-scale platform to the open research community, allowing them to explore transformative concepts in deeply programmable cloud services, design, and core technologies. The testbed, deployed at the University of Chicago and the Texas Advanced Computing Center, will ultimately consist of almost 15,000 cores and 5 PB of total disk space, and will leverage a 100 Gbps connection between the sites. While a large part of the testbed will consist of homogeneous hardware to support large-scale experiments, a portion of it will support heterogeneous units allowing experimentation with high-memory, large-disk, low-power, GPU, and co-processor units. To support a broad range of experiments, the project will support a graduated configuration system allowing full user configurability of the software stack, from provisioning of bare metal and network interconnects to delivery of fully functioning cloud environments. This talk will describe the goals, the building, and the modus operandi of the testbed.
  
 
== Invited Talks ==
 
== Invited Talks ==
  
=== Challenges of Multi-Clouds by Dr. Dana Petcu ===
+
=== Autonomy in the Open Science Grid or Pay No Attention to the Operator Behind the Curtains ===
 
+
==== Biography of the presenter ====
+
 
+
Dana Petcu (Mrs., PhD) is [http://web.info.uvt.ro/~petcu/programe.htm Professor] at Computer Science Department of [http://www.uvt.ro/ West University of Timisoara], scientific manager of its [http://hpc.uvt.ro/ supercomputing center], and CEO of the research spin-off [http://www.ieat.ro/ Institute e-Austria Timisoara]. Her interest in distributed and parallel computing is reflected in more than two hundred [http://web.info.uvt.ro/~petcu/publicat.html papers] about Cloud, Grid, Cluster or HPC computing. She was is and was involved in several [http://web.info.uvt.ro/~petcu/contract.htm projects] funded by European Commission and other research funding agencies, as coordinator, scientific coordinator, or local team leader. She is chief editor of the open-access journal [http://www.scpe.org/ Scalable Computing: Practice and Experience].
+
  
 
==== Talk Abstract ====
 
==== Talk Abstract ====
  
The Cloud heterogeneity is manifested today in the set of interfaces of the services from different Clouds, in the set of services from the same provider, in the software or hardware stacks, in the terms of performance or user quality of experience. This heterogeneity is favoring the Cloud service providers allowing them to be competitive in a very dynamic market especially by exposing unique solutions. However such heterogeneity is hindering the interoperability between these services, the portability of the applications consuming the services, the seamless migration of legacy applications towards Cloud environments, as well as the automation of Cloud application deployments.
+
The Open Science Grid (OSG) is a distributed computational facility providing resources for High Throughput Computing (HTC) workflows. These resources are located at 125 locations across North America and South America, with minor extensions to Asia, Europe, and Africa. By nature, a distributed computing facility of this extent is a chaotic ecosystem, with scheduled and unscheduled outages, network fluctuations, and resource and policy autonomy. To consolidate this system into a functional operational environment, the OSG uses a variety of technical and social techniques. These include central operational services that provide dynamic snapshots of the state of OSG, continuous monitoring and subsequent self-repairing actions, active 24x7 tracking and troubleshooting of critical production issues, automatically managed glide-in based workflows, and high availability operational services. This talk will cover an introduction to the OSG, a discussion of the scale and chaotic nature of the environment, and the techniques used to provide autonomic, production-quality service.
  
Various solutions to overcome the Cloud heterogeneity have been investigated in the last half decade, starting from the definition of uniform interfaces (defining the communalities, but loosing the specificities) and arriving to domain specific languages (allowing to conceive applications at a Cloud-agnostic level, but introducing a high overhead).
+
=== How can we allocate “right” resources to virtual machines in virtualized data centers? – Workload-aware hierarchical scheduling with OpenStack ===
 
+
We discuss the existing approaches and their completeness from the perspective of building support systems for Multi-Clouds, identifying the gaps and potential solutions. Concrete examples are taken from recent experiments in developing a particular Multi-Cloud support system in the frame of MODAClouds project (www.modaclouds.eu).
+
 
+
=== (r)OCCI -- Lessons in practical interoperability for IaaS clouds by Boris Parak ===
+
 
+
==== Biography of the presenter ====
+
 
+
Boris Parák is the lead developer in the rOCCI project with numerous major commits and principal author of the rOCCI-server and rOCCI-cli. He is also a system administrator and a member of the team responsible for designing, building, developing and maintaining CESNET’s private HPC cloud — MetaCloud, as well as a member of the EGI Federated Cloud on behalf of CESNET, currently acting as the task leader for Virtual Machine Management.
+
 
+
CESNET (Czech Education and Scientific Network provider) offers long-term experience in cloud and hybrid grid/cloud solutions through NGI activities and participates in the EGI Federated Cloud Task as a resource and technology provider contributing an infrastructure based on OpenNebula exposing an OGF OCCI service endpoint provided by rOCCI-server. It also runs its own private experimental HPC cloud infrastructure based on OpenNebula.
+
  
 
==== Talk Abstract ====
 
==== Talk Abstract ====
  
OGF OCCI (Open Cloud Computing Interface ) [1] is an open standard/protocol for management tasks in the cloud environment focused on integration, portability and interoperability with a high degree of extensibility. It is designed to bridge differences between various cloud management platforms and provide common ground for users and developers alike.
+
Data centers have been growing larger and more heterogeneous, and may be highly distributed. It is crucial to manage many heterogeneous resources effectively in order to provide services efficiently and cost-effectively; it is necessary to allocate the “right” resources to Virtual Machines (VMs) in virtualized data centers in order to decrease the cost of operation while meeting SLAs (Service Level Agreements), such as guaranteeing a latency requirement. One of the most effective ways to allocate the “right” resources to a VM would be to consider the characteristics of the VM, such as the memory intensiveness of the workload executed in it. However, existing schedulers, including the Nova scheduler of OpenStack and DRS (Distributed Resource Scheduler) of VMware, do not consider these kinds of characteristics. We propose a workload-aware hierarchical scheduler that schedules VMs on OpenStack clusters of nodes, considering the characteristics of the workloads executed in the VMs and the hierarchy of the resources to be allocated. Our experimental study shows that our memory-intensiveness-aware scheduler may outperform both the default scheduler of OpenStack and DRS in terms of throughput and latency.
 
+
The rOCCI framework [2][3][4][5] is designed to simplify the implementation of the OCCI 1.1 protocol in Ruby and provide the base for a working client and server implementation targeting multiple cloud management
+
frameworks and commercial service providers via its back-ends. It was adopted by the EGI Federated Cloud [6] and chosen to act as one of the designated OCCI implementations. This led to further development and provided much needed practical experience in real-world conditions.
+
 
+
This talk aims to provide information about the OGF OCCI standard/protocol, introduce its basic concepts and available implementations, describe functionality provided by the rOCCI framework in concert with multiple cloud management frameworks. It also briefly examines the use of OGF OCCI and rOCCI in the EGI FedCloud
+
environment and explores the possibility of further integration, extensions and improvements with interoperability in mind.
+
 
+
[1] http://occi-wg.org/
+
 
+
[2] https://github.com/EGI-FCTF/rOCCI-server
+
 
+
[3] https://github.com/EGI-FCTF/rOCCI-core
+
 
+
[4] https://github.com/EGI-FCTF/rOCCI-api
+
 
+
[5] https://github.com/EGI-FCTF/rOCCI-cli
+
 
+
[6] http://www.egi.eu/infrastructure/cloud/
+

Latest revision as of 14:05, 21 September 2015

Workshop Program | Workshop Main Page

  • Location: Boston Marriott Cambridge, Cambridge, MA, USA
  • Date: September 21 (Monday), 2015
  • Time: 08:30 AM - 06:00 PM (Salon VII)

This workshop features two keynotes by Dr. Alan Edelman (Massachusetts Institute of Technology) and Dr. Kate Keahey (Argonne National Laboratory/University of Chicago), two invited talks by Robert Quick (Indiana University) and Dr. Hyeonsang Eom (Seoul National University), and oral presentations of 7 full papers.

{| class="wikitable"
|- align="center"
! Time !! Description !! Presenter !! Institution !! Session Chair
|- align="center"
| 08:30 - 09:00 || colspan="4" align="center"| ICCAC 2015 Registration and Check-in
|- align="center"
| 09:00 - 09:10 || colspan="4" align="center"| Welcome Remarks by AMGCC workshop organizers
|- align="center"
| 09:10 - 10:00 || '''Keynote''': Julia: A fresh approach to parallel programming || Alan Edelman || MIT || rowspan="2"| Hyeonsang Eom
|- align="center"
| 10:00 - 10:45 || '''Invited Talk''': Autonomy in the Open Science Grid or Pay No Attention to the Operator Behind the Curtains || Robert Quick || Indiana University
|- align="center"
| 10:45 - 11:00 || colspan="4" align="center"| Coffee Break
|- align="center"
| 11:00 - 11:30 || Performance Analysis of Loosely Coupled Applications in Heterogeneous Distributed Computing Systems || Eunji Hwang, Seontae Kim, Tae-Kyung Yoo, Jik-Soo Kim, Soonwook Hwang and Young-Ri Choi || Ulsan National Institute of Science and Technology, KISTI || rowspan="3"| Jik-Soo Kim
|- align="center"
| 11:30 - 12:00 || Fine-Grained, Adaptive Resource Sharing for Real Pay-Per-Use Pricing in Clouds || Young Choon Lee, Youngjin Kim, Hyuck Han and Sooyong Kang || Macquarie University, Hanyang University, Dongduk Women’s University
|- align="center"
| 12:00 - 12:30 || A Job Dispatch Optimization Method on Cluster and Cloud for Large-scale High-Throughput Computing Service || Jieun Choi, Seoyoung Kim, Theodora Adufu, Soonwook Hwang and Yoonhee Kim || Sookmyung Women's University, KISTI
|- align="center"
| 12:30 - 14:00 || colspan="4" align="center"| Lunch
|- align="center"
| 14:00 - 14:45 || '''Keynote''': Chameleon: A Large-scale, Reconfigurable Experimental Environment for Next Generation Cloud Research || Kate Keahey || Argonne National Laboratory/University of Chicago || rowspan="3"| Jaehwan Lee
|- align="center"
| 14:45 - 15:15 || An Empirical Evaluation of NVM Express SSD || Yongseok Son, Hara Kang, Hyuck Han and Heon Young Yeom || Seoul National University, Dongduk Women's University
|- align="center"
| 15:15 - 15:45 || SCOUT: A Monitor & Profiler of Grid Resources for Large-Scale Scientific Computing || Md Azam Hossain, Hieu Trong Vu, Jik-Soo Kim, Myungho Lee and Soonwook Hwang || University of Science & Technology, KISTI, Myongji University
|- align="center"
| 15:45 - 16:00 || colspan="4" align="center"| Coffee Break
|- align="center"
| 16:00 - 16:45 || '''Invited Talk''': How can we allocate “right” resources to virtual machines in virtualized data centers? – Workload-aware hierarchical scheduling with OpenStack || Hyeonsang Eom || Seoul National University || rowspan="3"| Yoonhee Kim
|- align="center"
| 16:45 - 17:15 || A CPU Overhead-aware VM Placement Algorithm for Network Bandwidth Guarantee in Virtualized Data Centers || Kwonyong Lee and Sungyong Park || Sogang University
|- align="center"
| 17:15 - 17:45 || Feasibility of the Computation Task Offloading to GPGPU-enabled Devices in Mobile Cloud || Kihan Choi, Jaehoon Lee, Youngjin Kim, Sooyong Kang and Hyuck Han || Hanyang University, Dongduk Women’s University
|- align="center"
| 17:45 - 18:00 || colspan="4" align="center"| Closing Remarks
|}

Keynotes

Julia: A fresh approach to parallel programming

Talk Abstract

The Julia programming language is gaining enormous popularity. Julia was designed to be easy and fast. Most importantly, Julia shatters deeply established notions widely held in the applied community. Julia shows the fascinating dance between specialization and abstraction. Specialization allows for custom treatment. We can pick just the right algorithm for the right circumstance and this can happen at runtime based on argument types (code selection via multiple dispatch). Abstraction recognizes what remains the same after differences are stripped away and ignored as irrelevant. The recognition of abstraction allows for code reuse (generic programming). A simple idea that yields incredible power. Julia is many things to many people. In this talk we describe how Julia was built on the heels of our parallel computing experience with Star-P which began as an MIT research project and was a software product of Interactive Supercomputing. Our experience taught us that bolting parallelism onto an existing language that was not designed for performance or parallelism is difficult at best, and impossible at worst. One of our (not so secret) motivations to build Julia was to have the language we wanted for parallel numerical computing.
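
The idea of "code selection via multiple dispatch" mentioned in the abstract can be illustrated with a small sketch. This is not Julia and not code from the talk: it is a simplified Python imitation, and every name in it (`Matrix`, `Dense`, `Diagonal`, `method`, `multiply`) is hypothetical. It assumes a toy lookup rule that walks the class hierarchy of both arguments, which is much cruder than Julia's actual method resolution.

```python
from typing import Callable, Dict, List, Tuple


class Matrix:
    """Abstract square matrix (hypothetical type for this sketch)."""

    def to_rows(self) -> List[List[float]]:
        raise NotImplementedError


class Dense(Matrix):
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def to_rows(self):
        return self.rows


class Diagonal(Matrix):
    def __init__(self, diag):
        self.diag = list(diag)

    def to_rows(self):
        n = len(self.diag)
        return [[self.diag[i] if i == j else 0.0 for j in range(n)]
                for i in range(n)]


# Registry mapping a pair of argument types to an implementation.
_methods: Dict[Tuple[type, type], Callable] = {}


def method(type_a: type, type_b: type):
    """Register an implementation of multiply for (type_a, type_b)."""
    def decorator(func):
        _methods[(type_a, type_b)] = func
        return func
    return decorator


def multiply(a, b):
    """Choose an implementation at runtime from the types of BOTH arguments."""
    for type_a in type(a).__mro__:
        for type_b in type(b).__mro__:
            impl = _methods.get((type_a, type_b))
            if impl is not None:
                return impl(a, b)
    raise TypeError("no multiply method for these argument types")


@method(Matrix, Matrix)
def _multiply_generic(a, b):
    # Abstraction: works for any Matrix pair through to_rows().
    x, y = a.to_rows(), b.to_rows()
    n = len(x)
    return Dense([[sum(x[i][k] * y[k][j] for k in range(n))
                   for j in range(n)] for i in range(n)])


@method(Diagonal, Diagonal)
def _multiply_diagonal(a, b):
    # Specialization: diagonal times diagonal needs only n multiplications.
    return Diagonal([p * q for p, q in zip(a.diag, b.diag)])


fast = multiply(Diagonal([2, 3]), Diagonal([4, 5]))         # specialized method
slow = multiply(Dense([[1, 2], [3, 4]]), Diagonal([2, 3]))  # generic fallback
```

The generic method plays the role of abstraction (one definition reused for every matrix type), while the `Diagonal` method plays the role of specialization (a cheaper algorithm chosen only when both arguments allow it).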

Chameleon: A Large-scale, Reconfigurable Experimental Environment for Next Generation Cloud Research

Talk Abstract

Cloud services have become ubiquitous in all major 21st century economic activities -- there are, however, still many open questions surrounding this new technology. In particular, many open research questions concern the relationship between cloud computing and high performance computing, the suitability of cloud computing for data-intensive applications, and its position with respect to emergent trends such as Software Defined Networking. A persistent barrier to further understanding of those issues has been the lack of a large-scale testbed where they can be explored.

With funding from the National Science Foundation (NSF), the Chameleon project provides such a large-scale platform to the open research community, allowing them to explore transformative concepts in deeply programmable cloud services, design, and core technologies. The testbed, deployed at the University of Chicago and the Texas Advanced Computing Center, will ultimately consist of almost 15,000 cores and 5 PB of total disk space, and will leverage a 100 Gbps connection between the sites. While a large part of the testbed will consist of homogeneous hardware to support large-scale experiments, a portion of it will support heterogeneous units allowing experimentation with high-memory, large-disk, low-power, GPU, and co-processor units. To support a broad range of experiments, the project will support a graduated configuration system allowing full user configurability of the software stack, from provisioning of bare metal and network interconnects to delivery of fully functioning cloud environments. This talk will describe the goals, the building, and the modus operandi of the testbed.

Invited Talks

Autonomy in the Open Science Grid or Pay No Attention to the Operator Behind the Curtains

Talk Abstract

The Open Science Grid (OSG) is a distributed computational facility providing resources for High Throughput Computing (HTC) workflows. These resources are located at 125 locations across North America and South America, with minor extensions to Asia, Europe, and Africa. By nature, a distributed computing facility of this extent is a chaotic ecosystem, with scheduled and unscheduled outages, network fluctuations, and resource and policy autonomy. To consolidate this system into a functional operational environment, the OSG uses a variety of technical and social techniques. These include central operational services that provide dynamic snapshots of the state of OSG, continuous monitoring and subsequent self-repairing actions, active 24x7 tracking and troubleshooting of critical production issues, automatically managed glide-in based workflows, and high availability operational services. This talk will cover an introduction to the OSG, a discussion of the scale and chaotic nature of the environment, and the techniques used to provide autonomic, production-quality service.

How can we allocate “right” resources to virtual machines in virtualized data centers? – Workload-aware hierarchical scheduling with OpenStack

Talk Abstract

Data centers have been growing larger and more heterogeneous, and may be highly distributed. It is crucial to manage many heterogeneous resources effectively in order to provide services efficiently and cost-effectively; it is necessary to allocate the “right” resources to Virtual Machines (VMs) in virtualized data centers in order to decrease the cost of operation while meeting SLAs (Service Level Agreements), such as guaranteeing a latency requirement. One of the most effective ways to allocate the “right” resources to a VM would be to consider the characteristics of the VM, such as the memory intensiveness of the workload executed in it. However, existing schedulers, including the Nova scheduler of OpenStack and DRS (Distributed Resource Scheduler) of VMware, do not consider these kinds of characteristics. We propose a workload-aware hierarchical scheduler that schedules VMs on OpenStack clusters of nodes, considering the characteristics of the workloads executed in the VMs and the hierarchy of the resources to be allocated. Our experimental study shows that our memory-intensiveness-aware scheduler may outperform both the default scheduler of OpenStack and DRS in terms of throughput and latency.