Isma Gilani presented “EduHeLx - HeLx for Education,” an overview of HeLx, a scalable computing platform developed by RENCI for researchers and students to address the need for scalable and robust cyberinfrastructure in the cloud. Researchers and scientists across many domains use HeLx to ingest, analyze, share, and archive scientific data. HeLx provides a wide array of data science tools used in research communities in a modern, cloud-native environment with appropriate security, networking, and persistent storage, and it can be deployed across many cloud infrastructures as a customizable and configurable domain specific to the user community. The platform gives researchers in fields such as plant genomics, clinical informatics, biomedical science, and neuroscience computational workspaces close to the data in the cloud at scale. It can also facilitate instruction and learning in the classroom by providing customizable computing environments for data science courses. HeLx is deployed using technologies such as Docker and Kubernetes, and interfaces with iRODS, Google Cloud Platform (GCP), and Amazon S3 as needed. Several open source technologies are adapted into apps deployed on HeLx, including various notebooks, visualization tools, and Linux desktop environments.
A specialized extension of HeLx, called EduHeLx, was developed for classroom instruction, for use by individual instructors and courses, and was used successfully as a means of instruction for UNC Chapel Hill’s Data Science Minor course, “COMP 116: Introduction to Scientific Programming.” The HeLx team worked with UNC ITS to deploy EduHeLx in GCP for the Fall 2021 and Spring 2022 semesters of the course. This GCP cluster was equipped with enough resources to support simultaneous notebook launch and execution for hundreds of users.
The COMP 116 instance of EduHeLx was built with workspaces consisting of specialized Jupyter notebooks, hosted in a Kubernetes environment and preloaded with the Python environment and modules needed to complete and submit assignments and examinations. Additionally, Dockerized processes running on servers in the Computer Science department were used to grade student submissions, and continuous monitoring of cluster resources provided metrics that gave insight into resource allocation and usage. Gilani concluded her talk by mentioning that RENCI is collaborating with UNC’s School of Data Science and Society to leverage EduHeLx as its educational platform with essential enhancements, including unified workspaces for multiple course offerings, integration with an autograder, the addition of R/RStudio to the computational tools, and integration with the campus learning management system (LMS).
Click here to view the talk on YouTube.
Isma Gilani, Software Engineering & Solutions Manager & Interim Director of the Software Architecture Group
Department: Renaissance Computing Institute (RENCI) | Faculty Profile
Featured on: July 28, 2022 (Event Page)
Session Title: Data Science: Zero-to-Sixty (Event Recap)
Tools, Information, and Resources:
- HeLx Team: Please reach out to the HeLx team if you’d like to find out more about EduHeLx.
- Docker: Docker takes away repetitive, mundane configuration tasks and is used throughout the development lifecycle for fast, easy and portable application development – desktop and cloud.
- Kubernetes: Kubernetes, also known as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications.
- iRODS: The Integrated Rule-Oriented Data System (iRODS) is open source data management software used by research, commercial, and governmental organizations worldwide.
- Google Cloud Platform: Meet your business challenges head on with cloud computing services from Google, including data management, hybrid & multi-cloud, and AI & ML.
- Amazon S3: Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance.