Sustainability data are increasingly relevant for multinational enterprises (MNEs), financial institutions, and researchers. However, such data remain incomplete, fragmented, or scarce. In this paper, we propose a novel approach to address this challenge: using machine learning (ML) to predict sustainability metrics from readily available financial data. This method allows for a more detailed and accurate assessment of sustainability in MNEs and their global value chains. We test our approach on a comprehensive company-level dataset of financial and sustainability information. The results indicate that ML can help predict key sustainability metrics, such as corporate carbon emissions and water discharge. However, users should consider the specific use case when applying ML, since model performance can vary across sectors, regions, and time periods. In addition, we develop a metric to assess the uncertainty of the predictions and find that this uncertainty can substantially affect the model output. Regulators should build on our findings to encourage the use of ML-generated sustainability data while also requiring more transparency from data providers and model users.