Friday, April 15, 2016

Book Review: Big Data Appliances for In-Memory Computing

BOOK REVIEW
(28-March-2016)

(Source: Adapted from Ganapathi Pulipaka, 2015)

Title: Big Data Appliances for In-Memory Computing
Author: Ganapathi Pulipaka
Publisher: High-Performance Computing Institute of Technology
Pages: 210
ISBN: 978-0692599570
Audience: SAP HANA Developers, IT Professionals, Doctoral Students, Professors, and Organizational Business Managers
Rating: 5
Reviewer: ThienSi Le


     Dr. Pulipaka’s book presents a scholarly research guide for corporations and high-tech organizations that want to use SAP HANA (Systems, Applications, and Products; High-performance Analytic Appliance) to turn their big data (D) into meaningful information (I), holistic knowledge (K), and professional wisdom (W). SAP HANA is an in-memory computing platform for data processing developed by SAP. The insight it provides into DIKW helps organizations conduct an effective business strategy and achieve four objectives: (a) making practical and strategic decisions, (b) improving business performance, (c) increasing organizational productivity, and (d) gaining and sustaining a competitive edge in the dynamic market, locally and globally.
     The purpose of the qualitative study of SAP HANA is to examine big data technologies for analyzing smart computing capabilities and to provide recommendations for a user-friendly and cost-effective big data solution.

Chapter 1 Introduction
     In the first chapter, Dr. Pulipaka discusses the background of the problem, the problem statement, the study’s purpose, the significance of the study, the research questions, and the assumptions, limitations, and delimitations, and he presents the SAP HANA conceptual framework.
     With the explosion of colossal amounts of data and information during the evolution of the Internet at the end of the 20th century, traditional databases such as RDBMS (relational database management systems), CODASYL (Conference/Committee on Data Systems Languages), and IMS (Information Management System) can no longer handle data in such huge volumes and varied formats. The chapter states the problem: many organizations are unable to effectively access, extract, and process data into insightful information for sound and just decision-making on complex problems at the right time.
     The author states that the purpose of the qualitative SAP HANA study is to use secondary data to analyze industry trends in SAP HANA performance benchmarks as a standard appliance tool. The study focuses on resolving performance challenges, moving data from OLTP (online transaction processing) systems to OLAP (online analytical processing) systems, and resolving the global challenge of information delays in many fields. It also examines whether (a) SAP HANA addresses the speed, accuracy, and granularity of big data, (b) SAP HANA has an in-memory technology and architecture efficient enough to blend OLAP and OLTP, and (c) SAP HANA has shown promising results in emerging growth technologies such as mIOT (Medical Internet of Things) and speech-to-speech translation. Finally, the study explores multiple big data tools with in-memory capabilities to identify best practices.
     On the significance of the study, the author emphasizes the evolution of the next generation of big data and how the SAP HANA business cases from the research can apply to other industries and guide the future of in-memory computing. The results point to techniques for boosting performance to ultra-blazing speed, methods for delivering data in real time, and a strategic approach to gaining a competitive edge. The study also contributes SAP HANA performance benchmarks to the body of literature and offers an alternative in-memory solution.
     The author raises three primary research questions (RQ), each fervid and laudable, about the pioneering in-memory database computing platform SAP HANA. They are as follows:
     RQ1: How can SAP HANA speed up the diffusion of big data via awareness, interest, evaluation, trial, and adoption?
     RQ2: Why is SAP HANA more efficient than other big data appliances such as Oracle and IBM?
     RQ3: What data show that SAP HANA can shape the future of emerging growth technologies?
     The study assumes that SAP HANA uses a single database instance with Unicode enabled. A stated limitation is that data must be decompressed during query execution. Chapter 1 closes with a list of definitions of terms.

Chapter 2 Literature Review
     Chapter 2 is a comprehensive literature review relevant to SAP HANA. The author goes through the literature to establish the urgency of SAP HANA and the limitations that keep existing systems from faster, higher performance. SAP HANA provides insights that help customers increase productivity.
     The author devotes 53 pages to the prevailing literature on SAP HANA. He provides a historical overview of the problem of processing gigantic volumes of big data with SAP products, SAP HANA’s future in big data, and SAP business applications as a decision-making framework. In the context of global media distribution, he distinguishes hot data (big data in motion) from cold data (data at rest). SAP HANA is an in-memory computing platform integrated with Apache Hadoop, an open source distributed architecture from Apache, to support applications that demand high performance. The role of IBM DB2 Blu is compared with SAP HANA development. SAP HANA’s database compression technology is discussed in depth alongside the four Vs of big data (volume, variety, velocity, value). The chapter also critiques previous scholarly research on SAP BW (Business Information Warehouse) and on SAP HANA in its early development stage, which lacked in-memory technology. The literature review (LR) finds that a mature SAP HANA can outperform various in-memory database platforms on benchmarks, atomicity, and the ability to provide predictive analytics. The LR also finds that the conceptual framework of the third-generation in-memory database SAP HANA supports an analytic view of data for decision-making in most data management organizations that deal with big data. The author delves into the NoSQL databases in the marketplace and lists popular NoSQL tools such as Google Dremel, Tokyo Cabinet, Apache CouchDB, Redis, MongoDB, and Apache Cassandra. He also gives an overview of other analytical software products, such as Tableau for big data visualization, QlikView for business intelligence, Splunk, and the R language.
     The chapter concludes with the evolution of database giants such as SAP HANA, IBM, and Oracle Exadata and the problem organizations face in choosing among them. That problem leads to the present study, which explores the big data appliances that bring the speed of big data to organizations through awareness, interest, trial, evaluation, and adoption.

Chapter 3 Research Method
     Chapter 3 constructs a research methodology for studying big data appliances for in-memory computing, with attention to data movement between OLTP and OLAP and to the various information technology (IT) users among employees. It outlines the design and basic procedures underlying the study. The author uses a qualitative research method built on a highly structured IDC (International Data Corporation) survey that put predefined questions to participants around the world. The sample consists of 759 IT and business managers who are likely to understand the technological limitations and performance bottlenecks of data transfer between OLAP and OLTP, and it is applied to the three research questions together with case studies.
     The present study is based on secondary data and involves no human subjects. It uses IDC interviews and SAP case studies covering 405 IT managers and 352 business managers who face challenges in dealing with multiple database platforms for OLTP and OLAP. The IDC survey results, extracted from many fields and areas for data analysis, indicate information delays in the ad hoc reports delivered to each department in the organizations.

Chapter 4 Findings and Results
     Chapter 4 discusses empirical data from IDC and SAP on SAP HANA customers. The data show that SAP HANA helps many organizations resolve performance challenges, reduce complexity, improve flexibility in data transfer from OLTP systems to OLAP systems, and reduce information delays. IDC survey’s results indicate 40% of respondents in business organizations. 25% of business managers believe that information delays negatively affect their business. The time needed to transfer data from an RDBMS to an OLAP system is significant, with 70% of the time spent on data processing. The IDC survey results also show that the ROI of implementing and applying SAP HANA in the organizations is 509% over 5 years. Powered by HANA, SAP NetWeaver BW (Business Warehouse) performs better in several industries, such as beverages, utilities, and automotive. Many organizations that use SAP HANA have a bright worldwide future. For example, SAP HANA can retrieve 10 years of data in a couple of seconds, and it can quickly diagnose and analyze patient records on tablets in real time.
     Qualitative results from the IDC survey and from SAP interviews with its customers focus on performance, information delays in business, and runtime for data movement between the OLTP and OLAP systems, and they answer the three research questions below:
     Research Question 1: How can SAP HANA speed up the diffusion of big data via awareness, interest, evaluation, trial, and adoption?
     The study results show that SAP was the first organization to implement in-memory technology, in SAP HANA in 2010. SAP HANA can compress row-based tabular data into column-based data to improve performance by a factor of 100 to 1,000. The results show that business managers spent more than 48 hours closing the financial statements and that information delays increased. 35% of participants rated their satisfaction with the SAP HANA trial version (available for up to 30 days) at four stars. 30% of the respondents said that it takes them 48 hours to complete an operational build report. Query execution time improved by a factor of 15 to 255, and overall analytics improved by a factor of 15,000.
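     To make the row-to-column idea concrete, here is a minimal sketch in plain Python. It is not SAP HANA code and not taken from the book; the sample rows and the helper names (to_columns, dictionary_encode, dictionary_decode) are hypothetical. It assumes a simple dictionary encoding, one common way to compress a column with many repeated values, and it shows that an aggregate over one column only has to read that column.

```python
# Hypothetical sample rows (region, product, amount); not data from the book.
rows = [
    ("EMEA", "widget", 10),
    ("EMEA", "widget", 12),
    ("APAC", "gadget", 7),
    ("EMEA", "gadget", 5),
    ("APAC", "widget", 9),
    ("EMEA", "widget", 11),
]


def to_columns(rows):
    """Pivot a list of row tuples into one list per column."""
    return [list(column) for column in zip(*rows)]


def dictionary_encode(column):
    """Replace each value with a small integer code plus a lookup table."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in column]


def dictionary_decode(dictionary, codes):
    """Rebuild the original column values from their integer codes."""
    return [dictionary[code] for code in codes]


regions, products, amounts = to_columns(rows)
region_dict, region_codes = dictionary_encode(regions)

# An aggregate over one column reads only that column's values.
total_amount = sum(amounts)

print(region_dict, region_codes, total_amount)
```

     The repeated region strings are stored once in the lookup table and referenced by small integers. Reading the original values back requires a decode step, which matches the decompression limitation noted in Chapter 1.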
     Research Question 2: Why is SAP HANA more efficient than other big data appliances such as Oracle and IBM?
     SAP HANA emerged as an analytical tool in in-memory database (DB) technology, with a DB management system that blends OLTP and OLAP. It can scan DB records at an ultrafast 250 GB/s, e.g., 1.5 million INSERT operations per second and 12 million records per second in DB aggregation. Because SAP HANA blends OLAP and OLTP on a single DB system, business intelligence reporting can run on that same system. SAP HANA can handle data conversion and data migration challenges with SAP BODS (BusinessObjects Data Services). Competing against IBM DB2 Blu and Oracle Exadata, SAP HANA is the only solution that is both big-data-ready and enterprise-ready with column-based data.
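     The blended OLTP and OLAP idea can be illustrated with a small sketch. This is an analogy only: it uses Python’s standard sqlite3 module with an in-memory database, not SAP HANA, and the table and function names (sales, record_sale, sales_by_region) are hypothetical. It assumes nothing about SAP HANA’s internals; it only shows transactional writes and an analytical aggregate running against the same in-memory store, with no data movement between two systems.

```python
import sqlite3

# One in-memory database serves both the transactional and analytical paths.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT NOT NULL, amount INTEGER NOT NULL)"
)


def record_sale(conn, region, amount):
    """OLTP-style write: one small committed transaction per business event."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sales (region, amount) VALUES (?, ?)",
            (region, amount),
        )


def sales_by_region(conn):
    """OLAP-style read: aggregate over all rows written so far."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


for region, amount in [("EMEA", 10), ("APAC", 7), ("EMEA", 12)]:
    record_sale(conn, region, amount)

# The report sees every committed sale immediately, without an export step.
report = sales_by_region(conn)
print(report)
```

     In a split OLTP/OLAP landscape, the report would run against a separate warehouse after a transfer step, which is where the information delays described above come from.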
     Research Question 3: What data show that SAP HANA can shape the future of emerging growth technologies?
     SAP recently released SAP HANA SPS 09, a product that directly supports streaming medical data and clinical medical data through mIOT (Medical Internet of Things) devices. SAP HANA integrated with Hadoop and the R language could be used as a smart access platform in a private cloud for co-innovation among big pharma companies in clinical trial development. It also has the opportunity to move into speech-to-speech translation technologies on the CRM (Customer Relationship Management) platform and to improve other services such as CTI and IVR integration, SAP CRM Web UI, etc.
     The chapter answers the three research questions comprehensively. It shows that SAP HANA can resolve the biggest conundrums in many fields through awareness, interest, evaluation, trial, and adoption.

Chapter 5 Summary, Conclusion, and Recommendations
     The author closes the erudite study with a summary, conclusion, and recommendations as follows:
     The present study examined analytical big data tools such as SAP HANA, IBM DB2 Blu, Apache Cassandra, DataStax, MongoDB, and Oracle Exalytics. It covered data movement between OLTP and OLAP, ROI improvement and TCO (Total Cost of Ownership) reduction in organizations, and trends in future memory technologies. Facing the fast growth of high-speed big data, SAP HANA, which integrates ERP, CRM, SCM, FSCM, PLM, PPM, and SRM systems, sets benchmarks in the evolution of the big data movement with in-memory database computing for many industries.
     The research study described how the in-memory computing database platform SAP HANA provides an enterprise-ready big data solution for applications compared with other products, e.g., IBM DB2 Blu and Oracle Exadata. The 509% ROI and excellent performance benchmarks played a crucial role in SAP HANA’s application and deployment. SAP HANA provides many improvements. For example, the time for data movement between OLTP and OLAP was reduced by 87%, reporting improved by 80%, data compression improved by 511%, etc.
     SAP HANA was built and runs entirely on inexpensive DRAM. However, the future outlook for in-memory databases tilts towards expensive flash memory. A DRAM-based in-memory DB has scalability limitations, unlike in-memory grids, which can run massively in parallel. The author recommends that the SAP HANA research labs build future SAP HANA databases on flash memory to support a revolutionary and innovative architecture in coming models. The cost of applying, deploying, and maintaining SAP HANA is slightly higher than that of IBM DB2 Blu. SAP should look at flash options with hybrid memory at a lower cost. For example, SAP HANA with hardware and software for small companies may cost $300,000, while Aerospike offers a 1 TB database at $75,000. SAP may need a forklift upgrade similar to the Aerospike database to acquire more domestic and global customers. SAP may team up with Interactive Intelligence Customer Interaction Center for its ability to integrate and deploy SAP CRM 7.X for speech-to-speech translation and voice recognition. Integrating SAP HANA with the R language for neural network learning algorithms can provide enterprises with predictive analytics for forecasting. Based on neural networks, SAP HANA can build a natural language in ABAP (Advanced Business Application Programming Language) for predictive analytics in dynamic motion.

     Dr. Ganapathi Pulipaka spent two years on a qualitative research study of SAP HANA, a high-performance big data analytical platform for in-memory computing. The laudable study is intensive and comprehensive: it establishes the problem, the framework, and the research questions, and it provides a literature review and a qualitative analysis. He compared and contrasted several analytics and statistics tools, such as IBM DB2 Blu, Oracle Exadata, and Tableau, with interesting findings and lucid results drawn from qualitative surveys and interviews from credible sources such as IDC and SAP customers. His conclusion indicated that SAP HANA is manifestly an analytical tool that can be applied in various industries, such as aerospace, healthcare, and automotive. The fervid recommendation for SAP HANA, particularly on in-memory flash, is a great idea for future research.


