Inspur Information Computing Platform Accelerates Genetic Research of Renowned South Korean Research Institute

Background introduction:

Inspur Information built a converged platform for artificial intelligence (AI) and high-performance computing (HPC) that enables a well-known research institute in South Korea to collect and analyze genomic data efficiently.


Human genomics has become one of the most promising areas in the biotechnology industry. In 2021, a South Korean research institute built a genomic information database that uses big data analysis of the human genome to support the early diagnosis of diseases, the treatment of patients with chronic illnesses, and the rapid development of vaccines. It also provides data support to innovative local biotechnology companies in South Korea to promote their development.
The genomic information database created by this South Korean research institute will become the core of the region's genomic service industry, but it also faces the following challenges:
1. The huge scale of data
The genomic information database is tasked with analyzing the genomic information of more than 10,000 people. For whole genome sequencing (WGS), the human genome is equivalent to roughly 3 GB of data. At a sequencing depth of about 30x, the data set for a single person comes to approximately 100 GB. Consequently, the genomic information of more than 10,000 people amounts to data on the petabyte (PB) scale, necessitating a high-performance hardware platform.
2. High business complexity
The genomic information database must store and manage genetic and medical data, and analyze and compute a variety of human vital signs data. It also needs to provide AI data processing, present genomic data models visually, and offer cloud services and other online functions. Given these needs, choosing an appropriate system platform and maximizing the performance of the hardware platform are the biggest challenges.
3. High complexity of data center management
As its business scope has expanded, the customer's data center has grown from dozens to hundreds of units of equipment. The batch deployment and daily maintenance of this hardware require extensive human input. In addition, the need for business continuity in online cloud services makes rapid fault location and efficient maintenance of data center equipment more challenging.
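The data-scale estimate in challenge 1 can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the customer's platform: the function and constant names are hypothetical, the inputs are the round figures quoted above (3 GB genome, 30x depth, 10,000 people), and decimal units (1 PB = 1,000,000 GB) are assumed.

```python
# Back-of-the-envelope sizing using the figures quoted in the article.
# All names here are illustrative assumptions, not part of any Inspur tool.

GENOME_SIZE_GB = 3        # approximate size of one human genome
SEQUENCING_DEPTH = 30     # approximate WGS depth multiplier
COHORT_SIZE = 10_000      # number of people in the database
GB_PER_PB = 1_000_000     # decimal units assumed


def data_per_person_gb(genome_gb=GENOME_SIZE_GB, depth=SEQUENCING_DEPTH):
    """Raw sequencing data for one person, in GB."""
    return genome_gb * depth


def cohort_data_pb(people=COHORT_SIZE, per_person_gb=None):
    """Raw sequencing data for the whole cohort, in PB."""
    if per_person_gb is None:
        per_person_gb = data_per_person_gb()
    return people * per_person_gb / GB_PER_PB


if __name__ == "__main__":
    print(f"Per person: {data_per_person_gb()} GB")
    print(f"Cohort (3 GB x 30x): {cohort_data_pb():.1f} PB")
    print(f"Cohort (rounded 100 GB each): {cohort_data_pb(per_person_gb=100):.1f} PB")
```

The raw product is 90 GB per person, which the article rounds to about 100 GB. Either way, 10,000 people correspond to roughly 1 PB of raw sequencing data before any intermediate files, analysis results, or backups are counted.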

Solution introduction:

1. Jointly developed a massive data storage and analysis platform

Inspur and the customer jointly developed a data storage and analysis platform with a capacity of dozens of PB. Nodes are interconnected by a 200Gb/s InfiniBand network, and I/O read and write performance can exceed 2GB/s. This meets the needs of the genomic information database: massive data collection, high-speed transmission, large-capacity storage, low latency, and high bandwidth.
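To put these throughput figures in context, the sketch below estimates how long it takes to move genomic data at a given rate. It is an illustration under stated assumptions only: the helper name is hypothetical, the 2 GB/s figure is treated as a sustained rate (the article does not say whether it is per node or aggregate), and the 200 Gb/s link is converted to bytes at its raw line rate, ignoring protocol overhead.

```python
# Illustrative transfer-time arithmetic; names and assumptions are hypothetical.

IO_THROUGHPUT_GB_PER_S = 2      # quoted I/O read/write performance (lower bound)
IB_LINK_GBIT_PER_S = 200        # quoted InfiniBand link speed
IB_LINK_GB_PER_S = IB_LINK_GBIT_PER_S / 8   # 25 GB/s raw line rate


def seconds_to_move(size_gb, throughput_gb_per_s):
    """Time in seconds to move size_gb at a sustained throughput."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb / throughput_gb_per_s


if __name__ == "__main__":
    one_genome_gb = 100          # approximate per-person data set
    one_pb_gb = 1_000_000        # decimal units assumed

    print(f"One genome at 2 GB/s: {seconds_to_move(one_genome_gb, IO_THROUGHPUT_GB_PER_S):.0f} s")
    days = seconds_to_move(one_pb_gb, IO_THROUGHPUT_GB_PER_S) / 86_400
    print(f"1 PB at 2 GB/s: {days:.1f} days")
    print(f"One genome at link line rate: {seconds_to_move(one_genome_gb, IB_LINK_GB_PER_S):.0f} s")
```

Under these assumptions, a single 100 GB data set moves in under a minute at 2 GB/s, while a full petabyte takes several days over one such stream, which is why aggregate bandwidth across many nodes matters at this scale.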

2. Create diversified computing clusters and comprehensively improve application operating efficiency

Inspur has designed a number of business systems, including a biological big data storage system, a collection system, a high-speed analysis system, a sales management system, and an AI cloud service provision system. To satisfy the high memory requirements of the biological big data high-speed analysis system, Inspur adopted the TS860M5, an eight-socket fat-node server that can be configured with up to 12TB of memory in a single node. Its high performance and very small physical footprint minimize the number of deployed nodes and significantly reduce TCO. For cloud-based AI model training and data processing, Inspur built a GPU cluster from NF5488A5 AI servers equipped with A100 GPUs interconnected over NVLink 3.0. Each node is capable of delivering 5 petaFLOPS, which greatly improves the training efficiency of AI models. The cluster effectively supports the computation of massive genomic data and helps the customer deploy an AI development environment efficiently.

3. Provide a unified hardware management platform to reduce maintenance costs

Inspur has developed an automated operation and maintenance solution for the customer. With the Inspur physical infrastructure management (ISPIM) platform at its core, the solution unifies the deployment, monitoring, operation, maintenance, and alarm management of equipment from multiple vendors in the customer's data center. ISPIM's batch configuration and out-of-band operating system deployment functions greatly improve the efficiency of bringing equipment online, and its 3D computer room function faithfully reproduces the space and equipment layout of the data center. Parameters are visible at a glance, which makes fault prediction more efficient and improves overall operation and maintenance efficiency.

Customer benefits:

The genomic information database computing platform developed by Inspur Information for the research institute in South Korea meets the requirements of ultra-high-capacity genomic storage and computing, AI data processing, high-speed data transmission, and safe and efficient backup, while also improving operation and maintenance efficiency and helping reduce TCO by 16%. Based on this platform, the South Korean research institute will standardize the storage and management of genomic data and provide an extensive data foundation for hospitals, enterprises, and research institutions carrying out genomic research. At the same time, the integration of genomic and medical data has improved scientific research, accelerated the translation of basic research results into clinical practice, and vigorously promoted the development of the genomic industry in the region.
