DMLR: Data-centric Machine Learning Research – Past, Present and Future
- Luis Oala
- Manil Maskey
- Lilith Bat-Leah
- Alicia Parrish
- Nezihe Merve Gurel
- Tzu-Sheng Kuo
- Yang Liu
- Rotem Dror
- Danilo Brajovic
- Xiaozhe Yao
- Max Bartolo
- William Gaviria Rojas
- Ryan Hileman
- Rainier Aliment
- Michael W. Mahoney
- Meg Risdal
- Matthew Lease
- Wojciech Samek
- Debo Dutta
- Curtis Northcutt
- Cody Coleman
- Braden Hancock
- Bernard Koch
- Girmaw Abebe Tadesse
- Bojan Karlas
- Ahmed Alaa
- Adji Bousso Dieng
- Natasha Noy
- Vijay Janapa Reddi
- James Zou
- Praveen Paritosh
- Mihaela van der Schaar
- Kurt Bollacker
- Lora Aroyo
- Ce Zhang
- Joaquin Vanschoren
- Isabelle Guyon
- Peter Mattson
Journal of Data-centric Machine Learning Research
Drawing on discussions at the inaugural DMLR workshop at ICML 2023 and at earlier meetings, this report outlines the relevance of community engagement and infrastructure development for creating next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods towards positive scientific, societal and business impact.