Amazon SageMaker Pipelines now offers native EMR integration for large-scale data processing

Amazon SageMaker Pipelines is a fully managed service that allows customers to define and orchestrate their model-building steps as workflows. Today, we are happy to introduce a new step type that allows machine learning engineers to run data processing applications on Amazon EMR clusters using open-source frameworks such as Apache Spark, Presto, and Hive.
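
To make the idea of an EMR step more concrete, here is a minimal sketch in plain Python (standard library only) of what such a step might describe: a Spark job submitted to an existing EMR cluster through command-runner.jar. The helper names, the bucket path, the cluster ID, and the exact dictionary layout are assumptions made for illustration, not something the announcement specifies. The layout is loosely modeled on the HadoopJarStep structure (Jar, Args) that EMR uses for job steps. In practice, the SageMaker Python SDK provides its own classes for building this step.

```python
import json


def build_spark_step_config(script_uri, script_args=()):
    """Build a HadoopJarStep-style config that runs spark-submit on EMR.

    Hypothetical helper: the Jar/Args keys follow the shape EMR uses for
    job steps; script_uri and script_args are illustrative inputs.
    """
    return {
        "Jar": "command-runner.jar",
        "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri, *script_args],
    }


def build_emr_step(name, cluster_id, step_config):
    """Wrap a step config in a pipeline-step-like dictionary.

    Assumption: this dictionary only illustrates the pieces an EMR step
    needs (a name, a target cluster, and the job to run); it is not an
    authoritative pipeline definition schema.
    """
    return {
        "Name": name,
        "Type": "EMR",
        "Arguments": {
            "ClusterId": cluster_id,
            "StepConfig": {"HadoopJarStep": step_config},
        },
    }


if __name__ == "__main__":
    # Placeholder values: replace with a real script location and cluster ID.
    config = build_spark_step_config(
        "s3://example-bucket/scripts/preprocess.py",
        ["--input", "s3://example-bucket/raw/"],
    )
    step = build_emr_step("PreprocessOnEMR", "j-EXAMPLECLUSTER", config)
    print(json.dumps(step, indent=2))
```

The sketch keeps the job description (what to run) separate from the step wrapper (where to run it), so the same Spark job could be pointed at a different cluster by changing only the cluster ID.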

Source: AWS