Lead Data Scientist – Remote OK

The Nielsen Company

Cockeysville Maryland

United States

Scientific Research

Lead Data Scientist - Remote OK - 101791

Data Science - USA Offsite, Offsite 

The Lead Data Scientist’s primary responsibility in the Audio Data Science team is to develop creative solutions that enhance the data and analysis infrastructure and pipelines underpinning survey quality for all Nielsen Audio survey products. To meet high quality standards, the Data Scientist will work as a subject matter expert on a team of analysts to establish, maintain and continuously improve the data tools and processes supporting the Audio Data Science team.
Tasks will include developing system enhancements, writing procedural and technical documentation, working with cross-functional teams to implement solutions in production systems, supporting survey methodology enhancement projects, and supporting client-facing data requests.

What will I do?

  • Maintain and continuously improve the variety of data infrastructure, analysis, production and QA processes for the Audio Data Science team
  • Assist in the transition of the data science tech infrastructure away from legacy systems and methods
  • Work with cross-functional teams to implement and validate enhanced audience measurement methodologies
  • Build and refine data queries from large relational databases/data warehouses/data lakes for various analyses and/or requests
  • Use tools such as Python, Tableau, AWS and Databricks to independently develop, test and implement high-quality, custom, modular code that performs complex data analysis, produces visualizations and answers client queries
  • Maintain and update comprehensive documentation on departmental procedures, checklists and metrics
  • Implement prevention and detection controls to ensure data integrity, as well as detect and address quality escapes
  • Work closely with internal customers and IT personnel to improve current processes and engineer new methods, frameworks and data pipelines
  • Work as an integral member of the Audio Data Science team in a time-critical production environment
  • Key tasks include – but are not limited to – data integration, data harmonization, automation, examining large volumes of data, and identifying and implementing methodological, process and technology improvements
  • Develop and maintain the underlying infrastructure to support forecasting and statistical models, machine learning solutions, and big data pipelines (from internal and external sources) used in a production environment

Is this for me?

  • Undergraduate or graduate degree in mathematics, statistics, engineering, computer science, economics, business or fields that employ rigorous data analysis
  • Must be proficient with Python (and Spark/Scala) to develop shareable software with appropriate technical documentation
  • Experience using GitLab, Git or similar to manage code development
  • Experience utilizing Apache Spark, Databricks & Airflow
  • Expertise with Tableau or other data visualization software and techniques
  • Experience in containerization such as Docker and/or Kubernetes
  • Expertise in querying large datasets with SQL and in working with Oracle, Netezza, Data Warehouse and Data Lake data structures
  • Experience in leveraging CI/CD pipelines
  • Experience using cloud computing platforms such as AWS, Azure, etc.
  • Strong ability to proactively gather information and to work both independently and within a multidisciplinary team
  • Proficiency in MS Office suite (Excel, Access, PowerPoint and Word) and/or Google Office Apps (Sheets, Docs, Slides, Gmail)

Preferred

  • Knowledge of machine learning and data modeling techniques such as Time Series, Decision Trees, Random Forests, SVM, Neural Networks, Incremental Response Modeling, and Credit Scoring
  • Knowledge of survey sampling methodologies
  • Knowledge of statistical tests and procedures such as ANOVA, chi-squared tests, correlation, regression, etc.