Identify/develop appropriate machine learning/data mining/text mining techniques to enable better business outcomes.
Understand and analyze data sources including sampling biases, accuracy, and coverage.
Ask questions and break problems apart scientifically. Form hypotheses and validate them.
Apply analytical rigor, statistical methods, machine learning, programming, data modeling, simulation, and advanced mathematics to analyze large amounts of data: recognize patterns, identify opportunities, pose business questions, and make valuable discoveries.
Research new ways of modeling and predicting behavior for large-scale projects.
Generate and test hypotheses, designing experiments to answer targeted questions of advanced complexity.
Collaborate with data engineers to identify data preparation/cleansing/ETL pipelines.
Define data needs, evaluate data quality, and extract/manipulate data in a "Big Data" environment.
Document projects, including the business objective, data gathering and processes, leading approaches, final algorithm, detailed set of results, and analytical metrics.
Interpret and communicate insights and findings.
Validate score performance.
Conduct ROI and benefit analysis.
Document and present model process and performance.
Advanced degree in Machine Learning, Computer Science, Electrical Engineering, Physics, Statistics, Applied Math, or another quantitative field from a reputable university (Ph.D. a plus).
0-3 years of working experience in analytics, data mining, and/or predictive modeling.
Excellent knowledge of machine learning/data science approaches.
Comfortable interacting with business peers to understand and identify use cases; able to articulate solutions and present them to the business.
Familiarity with data visualization tools like Tableau.
Exceptional coding skills in Python, R, and PySpark.
Ability to adapt to Scrum and Agile methodology; comfortable with development tools like Jira/Confluence.
Experience applying data science to business problems (sales, IT, etc.) preferred.
Experience with Hadoop- and NoSQL-related technologies.