From diverse datasets to United Nations public good tasks

Over the last year, we have made significant progress towards composable systems for working with machine learning’s core asset: data.

 

https://data.mlr.press/assets/pdf/v01-5.pdf

Croissant, a new format for ML-ready datasets, has seen the light of day, with integrations across Hugging Face, Kaggle and OpenML, loaders for TensorFlow, PyTorch, JAX and Keras, and more than 400k datasets indexed and accessible to date.

https://github.com/mlcommons/croissant 

https://arxiv.org/abs/2403.19546  

 

 

A burgeoning DMLR (Data-Centric Machine Learning Research) ecosystem has emerged, complete with back-to-back workshops at NeurIPS, ICML and ICLR and an ambitious journal.

https://dmlr.ai/ 

https://data.mlr.press/ 

 

In this workshop, we are taking first steps to cross-pollinate ecosystems, matching advanced data-centric methods and infrastructure with high-impact public good expertise. 

 

On Friday, May 31, we will meet in Geneva to deliberate over a roadmap for one such domain, namely healthcare. 

The goals of our meeting are threefold: 

1) Take inventory of the assets we collectively work on in this space 

2) Align on our vision 

3) Define a roadmap for the adoption of the Croissant Health Extension (or a similar format)

This will be a no-nonsense, get-things-done workshop. We will meet from 9:30 AM to 1:30 PM, keeping a tight beat with a few coffee breaks in between. The schedule is split into two main components.

  A) Round-robin inventory presentations: every speaker will give a short 15-minute summary of their corner of the universe
  B) Roadmap discussion: we will define targets for format adoption and dissemination, plus any serendipitous ideas that come up during the discussion

 

 

 

 

Insights from building a cloud-based ML development platform for Medical Applications

Pratik Bijam

Recent AI challenges in clinical settings demand robust, generalizable models, built with extensive, diverse data and input from healthcare professionals. Bayer has developed (MVP by Q3 2024) an ecosystem of digital platforms (focusing on radiology to begin with) for FAIR data management, AI development on scalable cloud infrastructure, and a marketplace, with security, automation and collaboration at its core. This effort aligns with Bayer's innovation strategy, fostering external partnerships to merge medical knowledge with digital technologies and improve patient outcomes and healthcare systems on a global scale.

 

The ZODIAC Respiratory Disease Phenotype Observatory: an IAEA International Cooperative Study for Early Detection of New Pandemics (The IAEA CT Artificial Intelligence -Cooperative Study- ICAI Project).

Enrique Estrada-Lobato IAEA

The project aims to enhance disease management by using radiomics, AI, and data analysis to identify patterns in medical imaging and correlate them with clinical data, establishing characteristic disease patterns such as those for respiratory diseases. Radiomics enhances image interpretation and aggregates data for imaging analysis and medical training. The primary objective is to develop a global early detection and response system for pandemics using AI and ML. The project analyses CT and X-ray data to identify early indicators of critical disease phenotypes and generate prognostic predictors.

 

 
