One of the barriers to assessing the cost of risks and opportunities in climate-change research is the lack of reliable and readily accessible data about climate. This data gap prevents financial sector stakeholders and others from assessing the financial stability of mitigation and resilience efforts and channeling global capital flows towards them. It also forces businesses to engage in costly, improvised ingestion and curation efforts without the benefit of shared data or open protocols.

To address these problems, the Open Source Climate (OS-Climate) initiative is building an open data science platform that supports complex data ingestion, processing, and quality management requirements. It takes advantage of the latest advances in open source data platform tools and machine learning and the development of scenario-based predictive analytics by OS-Climate community members.

To build a data platform that is open, auditable, and supports durable and repeatable deployments, the OS-Climate initiative leverages the Operate First program. Operate First builds on GitOps to extend open source principles with open deployments and operational knowledge.


Operate First was founded on the principle that you can and should trust open communities with running and managing applications and infrastructure. Operate First aims to close the feedback loop on software development by providing developers and operators total open source visibility and participation in the configuration and deployment of production environments.

OS-Climate manages the OpenShift data platform cluster's configuration and deployments with GitHub issues and pull requests against the Operate First Git repository. This generates a public record of all platform configurations and the community discussions about them. It vastly increases the OS-Climate community's scalability by enabling multiple open source communities to share some common deployment configurations and knowledge.
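The GitOps idea behind this workflow can be illustrated with a minimal sketch: the configuration merged into Git is treated as the desired state, and a reconciliation step works out what must change on the cluster to match it. The function name `plan_changes`, the component names used as keys, and the replica counts are all hypothetical and chosen for illustration; this is not the tooling Operate First runs, only a sketch of the comparison that such tooling performs.

```python
# A minimal sketch of GitOps-style reconciliation, under stated assumptions.
# "desired" stands in for configuration merged into the Git repository;
# "live" stands in for what is currently deployed. Both are hypothetical.

def plan_changes(desired: dict, live: dict) -> list:
    """Compare desired state with live state and list the actions needed."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))   # declared in Git, not deployed
        elif live[name] != spec:
            actions.append(("update", name))   # deployed, but drifted from Git
    for name in sorted(live):
        if name not in desired:
            actions.append(("delete", name))   # deployed, but removed from Git
    return actions


# Hypothetical example values.
desired = {"trino": {"replicas": 3}, "jupyterhub": {"replicas": 1}}
live = {"trino": {"replicas": 2}, "superset": {"replicas": 1}}

for action, name in plan_changes(desired, live):
    print(action, name)
```

Because the desired state lives in Git, every entry in such a plan traces back to a reviewed pull request, which is what produces the public record described above.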

OS-Climate uses Operate First to manage its core Data Commons platform technologies: the Trino data mesh, JupyterHub and the Elyra pipeline editor, Apache Superset, and OpenShift.

OS-Climate Data Commons platform

Another key component of the OS-Climate data platform managed with Operate First is authentication and data access control. Due to the sensitive nature of many data sources in the financial industry, certain data may not be public or open in the way we think of open source code. OS-Climate data providers can set, manage, and enforce compliance and security rules to dictate which users can access their datasets. That could mean sharing some data to help build the model, other data with data scientists for environmental research, and more business-sensitive data with regulators only.

The interaction between privacy requirements and the principles of open source and open data results in a challenging data access control architecture. To maximize reliability and transparency, database access control configurations are managed fully in the open using Operate First principles. This allows the entire community to review and audit all access control policies for correctness before any update is applied.
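As a rough illustration of what reviewing an access control policy for correctness can involve, the sketch below models a policy as a mapping from dataset to the groups allowed to read it, with a default-deny check and a simple audit pass. The dataset names, group names, and functions (`can_read`, `audit`) are hypothetical; the actual OS-Climate policies are expressed in the platform's own configuration files, not in this form.

```python
# A minimal sketch of group-based dataset access control, under stated
# assumptions. All dataset and group names below are hypothetical.

KNOWN_GROUPS = {"public", "data_scientist", "regulator"}

# Each dataset lists the groups allowed to read it.
POLICIES = {
    "model_building_inputs": {"public", "data_scientist", "regulator"},
    "environmental_research": {"data_scientist"},
    "business_sensitive": {"regulator"},
}


def can_read(user_groups: set, dataset: str, policies: dict = POLICIES) -> bool:
    """Default deny: a dataset with no policy entry is readable by no one."""
    allowed = policies.get(dataset, set())
    return bool(user_groups & allowed)


def audit(policies: dict) -> list:
    """Return the problems a reviewer would want fixed before merging."""
    problems = []
    for dataset, groups in sorted(policies.items()):
        if not groups:
            problems.append(f"{dataset}: no group can read this dataset")
        for group in sorted(groups - KNOWN_GROUPS):
            problems.append(f"{dataset}: unknown group {group!r}")
    return problems


print(can_read({"data_scientist"}, "business_sensitive"))
print(audit(POLICIES))
```

Keeping the policy as plain data is the design choice that matters here: a change to who can read a dataset shows up as a small, reviewable diff, and a check like `audit` could run automatically on each proposed change.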

Community participation in the OS-Climate project has grown rapidly since the inception of the Data Commons platform. Using Operate First for platform configuration management and deployments has been a major factor in the community's ability to scale.

For more information about the project, please watch our presentation "Unlocking climate-related data through open source and data mesh architecture" from the Open Data Science Conference West 2021, or our keynote address "Powering open source climate with Operate First" at DevConf.CZ 2022.


About the author

Erik is a Software Engineer at Red Hat's Open Services Group, where he explores emerging open source technologies at the intersection of data science and the Kubernetes ecosystem.
