Our mission is to unlock human potential. We welcome you for who you are and the background you bring, and we embrace individuals who get excited about learning. Bring your experiences, your perspectives, and your passion; it’s in our differences that we empower the way the world learns.
About the Role:
The Data Analytics and Insights (DA&I) team at Wiley is a vibrant, truly multinational team with colleagues from Sri Lanka, the USA and the UK. The team builds end-to-end data products and solutions for customers around the globe in the business domains of Research Publications, Learning and Education platforms and services, Professional Learning solutions and Book Publications. The work spans extracting data from more than 100 diverse internal and external sources into Wiley’s Data Lake and Enterprise Data Warehouse built on AWS and Snowflake; processing that terabyte-scale data for insights and machine learning use cases using Spark, TensorFlow and other ML libraries; building Business Intelligence applications in Power BI; building high-performance, fault-tolerant and secure Data as a Service (DaaS) data integration microservice applications on Spring Boot and Kubernetes; and building external customer-facing web-based analytics solutions on React JS, Highcharts, D3 and similar libraries. The DA&I team is a one-stop shop that offers every colleague who joins us a depth and breadth of data solutioning opportunities that few competitors can match.
Data Engineering vacancies in the DA&I team give candidates a unique opportunity to be involved in building and maintaining the terabyte-scale Data Lake and EDW at Wiley, building large-scale structured and semi-structured datasets, using those datasets to derive insights and ML use cases based on the analytics teams’ requirements and specifications, and orchestrating such complex data pipelines using industry-leading orchestration platforms. Data engineers take part in the end-to-end life cycle of designing, developing, deploying and supporting such data use cases in production, gaining full end-to-end experience in maintaining a production data ecosystem at the scale and diversity of Wiley.
What we look for:
- BSc in Computer Science, Computer Engineering, Information Technology or equivalent.
- MSc in Big Data, Machine Learning, Data Analytics or equivalent will be beneficial.
- Big Data, Machine Learning and/or Cloud computing related professional certifications will be beneficial.
- Industry experience of:
  - 8+ years for Technical Specialist
  - 6+ years for Associate Technical Specialist
  - 4+ years for Senior Software Engineer
  - Note: Based on interview performance, a candidate may be considered for an accelerated career progress track.
- Strong verbal and written communication skills.
- Hands-on experience in producing architectural diagrams and documentation.
- Hands-on experience in developing data applications using Python, Scala or Java.
- Hands-on experience in, and deep understanding of, developing and optimizing SQL. Prior experience with Snowflake will be beneficial.
- Hands-on experience with data workflow tools such as Apache Airflow, LinkedIn Azkaban or Argo Workflows.
- Knowledge of database technologies including relational databases, NoSQL databases, data warehouses, data lakes and equivalent cloud-native technologies.
- Experience in building data pipelines against different source types such as relational databases, flat files, APIs, streaming sources and web scraping.
- Experience in dealing with various data storage formats such as CSV, XML, JSON, Parquet, Avro, ORC etc.
- Understanding of ELT and ETL approaches and knowledge of using either or both in large scale data eco-systems.
- Understanding of data privacy, governance and security aspects.
Enabling Discovery, Powering Education, Shaping Workforces.
We clear the way for seekers of knowledge: illuminating the path forward for research and education, tearing down barriers to society’s advancement, and giving seekers the help they need to turn their steps into strides. Wiley may have been founded over two centuries ago, but our secret to success remains the same: our people. We are willing to challenge the status quo, move the needle, and be innovative. Wiley’s headquarters are located in Hoboken, New Jersey, with operations across the globe in more than 40 countries.
How you will make an impact:
- Design and implement robust & secure data pipelines from both internal and external data sources into Wiley’s Data Lake and EDW built on AWS and Snowflake.
- Design and implement scalable and cost-effective insights and machine learning applications using AWS EMR, TensorFlow and other scalable ML libraries based on the specifications provided by analytics and data science teams.
- Design and implement containerized solutions to facilitate fault-tolerant and scalable data pipeline workflows using Airflow & Kubernetes.
- Design and implement automated CI/CD for data pipelines.
- Design and implement data quality and audit mechanisms to ensure the accuracy of data and data use case outputs.
- Design and implement information retrieval solutions using Search Engine technologies such as Solr and Elasticsearch.
- Follow a Test-Driven Development methodology and work closely with QE to ensure robust, high-quality output.
- Mentor other team members in data engineering, and guide and review development work, providing coaching and coding feedback aligned with best practices.
- Collaborate with quality engineering and solution management teams for capacity planning and Program Increment (PI) planning.
- Work with Business SMEs, Business Analysts and Product Management to translate functional specifications into technical requirements and designs.
- Define best practices and standards for data pipelines and integrations in collaboration with Data Architects and other Data Leads.
- Ensure enterprise security and access control policies are adhered to in the solutions.
- Create architecture and design artifacts and documents, and present such designs to a wider audience, both technical and non-technical.