Lead Data Scientist – Remote OK
The Nielsen Company
2021-10-13T11:35:25Z
Cockeysville
Maryland
United States
Scientific Research
(No Timezone Provided)
Lead Data Scientist - Remote OK - 101791
Data Science - USA Offsite, Offsite
The Lead Data Scientist’s primary responsibility on the Audio Data Science team is to develop creative solutions that enhance the data and analysis infrastructure and pipelines underpinning survey quality for all Nielsen Audio survey products. To deliver to high quality standards, the Lead Data Scientist will work as a subject matter expert on a team of analysts to establish, maintain and continuously improve the data tools and processes supporting the Audio Data Science team.
Tasks will include developing system enhancements, writing procedural and technical documentation, working with cross-functional teams to implement solutions in production systems, supporting survey methodology enhancement projects, and supporting client-facing data requests.
What will I do?
Maintain and continuously improve the variety of data infrastructure, analysis, production and QA processes for the Audio Data Science team
Assist in the transition of the data science tech infrastructure away from legacy systems and methods
Work with cross-functional teams to implement and validate enhanced audience measurement methodologies
Build and refine data queries from large relational databases/data warehouses/data lakes for various analyses and/or requests
Utilize tools such as Python, Tableau, AWS, Databricks, etc. to independently develop, test and implement high-quality, custom, modular code to perform complex data analysis, build visualizations, and answer client queries
Maintain and update comprehensive documentation on departmental procedures, checklists and metrics
Implement prevention and detection controls to ensure data integrity, as well as detect and address quality escapes
Work closely with internal customers and IT personnel to improve current processes and engineer new methods, frameworks and data pipelines
Work as an integral member of the Audio Data Science team in a time-critical production environment
Key tasks include, but are not limited to, data integration, data harmonization, automation, examining large volumes of data, and identifying & implementing methodological, process & technology improvements
Develop and maintain the underlying infrastructure to support forecasting & statistical models, machine learning solutions, and big data pipelines (from internal and external sources) used in a production environment
Is this for me?
Undergraduate or graduate degree in mathematics, statistics, engineering, computer science, economics, business or fields that employ rigorous data analysis
Must be proficient with Python (and Spark/Scala) to develop sharable software with the appropriate technical documentation
Experience utilizing GitLab, Git or similar to manage code development
Experience utilizing Apache Spark, Databricks & Airflow
Expertise with Tableau or other data visualization software and techniques
Experience in containerization such as Docker and/or Kubernetes
Expertise in querying large datasets with SQL and in working with Oracle, Netezza, Data Warehouse and Data Lake data structures
Experience in leveraging CI/CD pipelines
Experience utilizing cloud computing platforms such as AWS, Azure, etc.
Strong ability to proactively gather information and to work independently as well as within a multidisciplinary team
Proficiency in MS Office suite (Excel, Access, PowerPoint and Word) and/or Google Office Apps (Sheets, Docs, Slides, Gmail)
Preferred
Knowledge of machine learning and data modeling techniques such as Time Series, Decision Trees, Random Forests, SVM, Neural Networks, Incremental Response Modeling, and Credit Scoring
Knowledge of survey sampling methodologies
Knowledge of statistical tests and procedures such as ANOVA, Chi-squared, Correlation, Regression, etc.