Data Engineer ML (Open to Remote)

IHS Markit

Virginia

United States

Engineering

Our company is a leader in delivering fuel pricing data, market intelligence, and analytics to the global energy industry. We believe in having the best technology capabilities to deliver our products to our customers.

OPIS, part of IHS Markit, is expanding its Data Science and Machine Learning (MLOps) Engineering team. You will play a critical role in the design and development of a world-class Data Science and MLOps platform. We encourage creativity, autonomy, and flexibility to allow you to bring your ideas to life!

For this role, we are looking for a mid-level data engineer with a unique mix of technical capabilities, strong data analysis, and collaboration skills.

YOUR ROLE

  • Design and develop complex code in SQL or Python to optimize data pipelines for ML analysis and modeling

  • Write PySpark code to transform our model algorithms into production-ready code

  • Work with the latest AWS cloud computing and data warehousing technologies such as Snowflake

  • Work with the latest AWS services to deploy and run our ML code via CI/CD pipelines and containers

  • Be part of a diverse, global Agile team, employing best practices for the DSML Lifecycle Framework

  • Be proud of both your own growth and your team’s success!

ABOUT YOU

You have:

  • 3 to 5+ years of working experience in data management technologies (Microsoft SQL Server, Postgres, advanced SQL coding, relational database design, data warehousing)

  • 3+ years writing SQL ETL processes

  • 3+ years coding in Python or PySpark

  • 2-3+ years developing Big Data and ML technologies (Spark, AWS SageMaker) to optimize ML model processing (training, evaluation, and inference)

  • 2-3+ years in cloud computing, especially AWS

  • Experience working in a Linux and Windows environment

Bonus if you also have experience in:

  • Writing infrastructure as code for AWS (Terraform, Step Functions, CloudWatch)

  • Working with the Snowflake cloud data warehouse, Parquet file formats, and AWS S3 storage

  • Creating CI/CD deployment pipelines in Microsoft Azure DevOps

  • Working with containers (Docker, Kubernetes)

  • Deployments in MLOps software (AWS SageMaker)

  • Working with BI software (Microsoft Power BI)

  • Educational background in computer science, engineering, or data analytics

-----------------------------------------------

Inclusion and diversity are critical to the success of IHS Markit, and we actively encourage applications from people of all backgrounds. We are committed to providing equal employment opportunity without regard to race, color, religion, sex, sexual orientation, gender identity, age, national origin, disability, status as a protected veteran, or any other protected category. For more information on the many ways in which we enthusiastically support inclusion and diversity efforts for both candidates and employees, please access our Inclusion & Diversity Statement.
