Service-Oriented Models for Cloud Based Big Data Analytics

Juanhuan Wen, Junfeng Zhao, Da Qi Ren, Huawei
Focus Area: 

In this work, a service-oriented hierarchical model is introduced to help assure high-performance business services on the virtual clusters of a cloud where data-intensive computing paradigms are deployed. The model is illustrated by characterizing and modeling the business workload, the constraints of the software stack, and the low-level distributed resources. The modeling mechanisms, including the primitives and functionalities, are formulated. The service-oriented model aids the systematic and hierarchical development of global optimization for big data analytics on a cloud.

Who benefits and how: 

The model is suitable for government, education, finance, medical, telecom and other industries, and for all types of offices, including branches, call centers and mobile offices.

A growing number of enterprises use cloud computing to build their big data projects because the cloud offers a cost-effective way to support big data technologies and the advanced analytic applications that respond to real business needs and drive commercial value. Cloud computing provides a wide range of infrastructure and software services and manages large numbers of virtualized resources, which makes advantageous computing paradigms available for big data. A modern cloud can behave much like a local homogeneous computer cluster, providing high-performance, data-intensive computing platforms for public use. These platforms can potentially enhance business agility and productivity while enabling greater efficiency and reducing costs. The service-oriented hierarchical model introduced in this work is intended to help assure such high-performance business services on the virtual clusters of a cloud.

Hardware and Software Platform
A cloud solution for big data is an end-to-end solution covering hardware, software, network, terminal, security, consulting and design services. Cloud servers are composed of a cloud OS and virtualized platforms. By centrally managing and sharing computing and storage resources, the cloud platform helps customers solve the problems of traditional clusters, enhance information security, improve operations and maintenance (O&M) efficiency, and create a truly mobile office while improving service reliability. Cloud hardware integrates compute, storage and network. The computing devices usually include multi-core CPUs, GPUs, FPGAs and other multiprocessing facilities. Smart storage engines, intelligent networks, SSD caching mechanisms and other innovations work together to achieve high performance. Systems are designed as pre-validated infrastructure under unified physical and virtual resource management.

Analytics-as-a-Service
Big data analytics on the cloud is expanding quickly as the variety of business workloads on the cloud increases. For example, some big data workloads have more branch operations, some are dominated by data movement, and some have a larger instruction footprint. Typically, Hadoop- and Spark-based big data workloads have higher front-end stalls, and complex big data software stacks fail to use state-of-practice processors efficiently. The architectural design of a cloud service has to meet the performance requirements of the specific business characteristics. A service model that analyzes the factors of workload, cost, security, and data interoperability is vital to performance. Depending on the usage scenario and the performance requirements, the best use of the cloud platform may be to focus on analytics as a service (AaaS). Cloud service models can help accelerate the potential for scalable big data analytics solutions.
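The workload categories named above (branch-heavy, data-movement dominated, large instruction footprint) can be illustrated with a minimal sketch of a workload characterization step. The class name, field names and threshold values below are hypothetical and chosen only for illustration; the source does not specify which metrics or cutoffs the model uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical per-workload metrics; all field names are illustrative."""
    name: str
    branch_ratio: float              # fraction of instructions that are branches
    data_movement_ratio: float       # fraction of run time spent moving data
    instruction_footprint_kb: float  # size of the code working set, in KB


def characterize(
    profile: WorkloadProfile,
    branch_threshold: float = 0.2,           # assumed cutoff, not from the source
    movement_threshold: float = 0.5,         # assumed cutoff, not from the source
    footprint_threshold_kb: float = 512.0,   # assumed cutoff, not from the source
) -> List[str]:
    """Return the workload categories whose threshold the profile exceeds."""
    labels = []
    if profile.branch_ratio > branch_threshold:
        labels.append("branch-heavy")
    if profile.data_movement_ratio > movement_threshold:
        labels.append("data-movement-dominated")
    if profile.instruction_footprint_kb > footprint_threshold_kb:
        labels.append("large-instruction-footprint")
    # A profile that exceeds no threshold is reported explicitly.
    return labels or ["unclassified"]


if __name__ == "__main__":
    etl_job = WorkloadProfile("etl", 0.05, 0.7, 100.0)
    print(etl_job.name, characterize(etl_job))
```

A workload may fall into several categories at once, which is why the function returns a list of labels instead of a single class.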

The Service Oriented Model
Cloud-based big data analytics is not a one-size-fits-all solution. Organizations using cloud infrastructure to provide AaaS have multiple options. Businesses with varying needs and budgets determine the strategies for creating a service model in cloud environments. Cloud services can supply computing power and storage capacity for certain analytics initiatives and provide added capacity and scale as needed. Based on the service model chosen for a business on the cloud, data locality needs to be designed so that data is analyzed where it resides, whether in a cloud data center or in edge systems and client devices. By handling the critical configurable design constraints at each level of a cloud platform, optimized big data analytic services can be developed from the service-oriented model to achieve the best possible performance. The model yields design characteristic values at an early design stage, which benefits cloud administrators by providing the workload information needed to choose the best compute, storage and network alternatives. The model is embedded in the management system, where it has access to switches, virtual machines, storage volumes, application provisioning, automation and security. It supports suggesting flexible compute and storage configurations, scalability, configuration and expansion on demand, and improved storage I/O performance. Working together with the cloud management system, the service-oriented model virtualizes and schedules computing, storage and network resources and provides services such as elastic computing, load balancing and virtual private cloud.
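As a rough illustration of how a hierarchical model of this kind might suggest a configuration, the sketch below combines three levels (business workload requirements, a software stack constraint, and the available low-level resource options) and picks the cheapest option that satisfies all of them. Every class, field and function name here is an assumption made for the example, as is the cost-minimizing selection rule; the source does not describe the model's actual primitives or optimization procedure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class WorkloadRequirement:
    """Business level: what the analytics workload needs (hypothetical fields)."""
    min_vcpus: int
    min_memory_gb: int
    min_storage_iops: int


@dataclass(frozen=True)
class StackConstraint:
    """Software-stack level: a limit imposed by the framework (hypothetical)."""
    max_vcpus_per_node: int


@dataclass(frozen=True)
class ResourceOption:
    """Resource level: one virtual machine shape the cloud can provision."""
    name: str
    vcpus: int
    memory_gb: int
    storage_iops: int
    hourly_cost: float


def suggest_configuration(
    requirement: WorkloadRequirement,
    stack: StackConstraint,
    options: Sequence[ResourceOption],
) -> Optional[ResourceOption]:
    """Return the cheapest option meeting all levels, or None if none does."""
    feasible = [
        option
        for option in options
        if option.vcpus >= requirement.min_vcpus
        and option.memory_gb >= requirement.min_memory_gb
        and option.storage_iops >= requirement.min_storage_iops
        and option.vcpus <= stack.max_vcpus_per_node
    ]
    return min(feasible, key=lambda option: option.hourly_cost, default=None)


if __name__ == "__main__":
    catalog = [
        ResourceOption("small", 4, 16, 3000, 0.20),
        ResourceOption("medium", 8, 32, 6000, 0.40),
        ResourceOption("large", 32, 128, 12000, 1.60),
    ]
    need = WorkloadRequirement(min_vcpus=8, min_memory_gb=24, min_storage_iops=5000)
    choice = suggest_configuration(need, StackConstraint(max_vcpus_per_node=16), catalog)
    print(choice.name if choice else "no feasible configuration")
```

Keeping each level in its own type mirrors the hierarchical idea: a change in the software stack or the resource catalog can be evaluated without touching the workload description.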