Key Responsibilities include:
o Design and implement ETL pipelines to extract, transform, and load data from various sources and store large datasets
o Build and maintain data warehouses, including data modeling, data governance, and data quality
o Ensure data quality, integrity, and security by implementing data validation, data cleansing, and data governance policies
o Optimize data systems for performance, scalability, and reliability
o Collaborate with customers to understand their technical requirements and provide guidance on best practices for using Amazon Redshift
o Work with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data requirements and deliver data solutions
o Provide technical support for Amazon Redshift, including troubleshooting, performance optimization, and data modeling
o Identify and resolve data-related issues, including data pipeline failures, data quality issues, and performance bottlenecks
o Develop technical documentation and knowledge base articles to help customers and AWS engineers troubleshoot common issues
Key Skills Needed:
o Bachelor's or Master's degree in Computer Science or a related field, with at least 6 years of experience in Information Technology
o Proficiency in one or more programming languages (e.g., Python, Java, Scala)
o 8+ years of experience in data engineering, with a focus on designing and implementing large-scale data systems
o 5+ years of hands-on experience writing complex, highly optimized queries across large data sets using Oracle, SQL Server, and Redshift
o 5+ years of hands-on experience using AWS Glue and Python/PySpark to build ETL pipelines in a production setting, including writing test cases
o Strong understanding of database design principles, data modeling, and data governance
o Proficiency in SQL, including query optimization, indexing, and performance tuning
o Experience with data warehousing concepts, including star and snowflake schemas
o Strong analytical and problem-solving skills, with the ability to break down complex problems into manageable components
o Experience with data storage solutions such as relational databases (Oracle, SQL Server), NoSQL databases, or cloud-based data warehouses (Redshift)
o Experience with data streaming and integration tools such as Apache Kafka and Fivetran
o Experience in building ETL pipelines using AWS Glue, Apache Airflow, and programming languages including Python and PySpark
o Understanding of data quality and governance principles and best practices
o Experience with agile development methodologies such as Scrum or Kanban
EOE (Veteran/Disability)