Data Engineer
Sugar Land, Texas, United States

Design, develop, and support data engineering, data modeling, and data integrations, with a primary focus on accelerating data landing and curation in a Databricks data lake house. Build and maintain reliable, well-governed pipelines that ingest data from source systems into the lake house and curate it through a layered (medallion) architecture into analytics-ready, trusted datasets. The role also carries a strong reporting and data-analysis focus — partnering with business users to build semantic data models, dashboards, and reports, and performing hands-on analysis to answer business questions. The Data Engineer will help establish the data foundation that powers data-related AI and machine learning initiatives, ensuring high-quality, well-documented, AI-ready data products.


Key Responsibilities

•    Build, optimize, and support pipelines that land data from source systems into the Databricks lake house and curate it through a layered (medallion) architecture into trusted, analytics-ready datasets.

•    Produce and maintain high-quality, well-governed, documented, AI-ready data products that serve as the foundation for AI and machine learning initiatives.

•    Implement data quality, governance, and monitoring controls (e.g., Unity Catalog, automated testing, alerting) across lake house pipelines.

•    Develop and maintain reporting and analytics solutions — semantic data models, dashboards, and reports — and perform ad-hoc querying to support business decision-making.

•    Gather requirements for, design, and develop new data integrations and enhancements to existing code.

•    Partner with business users and the Business Relationship Management team on requirements gathering, testing, and support of existing integrations, analytics, and reporting.

•    Create and maintain documentation and process flows for integration solutions.


Required Experience & Skills

•    Minimum 5 years of IT/technology experience spanning data analysis, data engineering, and/or data integration, with a focus on building and curating pipelines in a cloud data lake or lake house environment.

•    At least 3 years writing SQL/NoSQL queries, with specific experience in MS SQL Server, Oracle, and/or Postgres.

•    Hands-on experience with a modern cloud data platform / lake house (Databricks, Microsoft Fabric, Snowflake, or comparable). Databricks strongly preferred.

•    Demonstrated experience landing data from diverse source systems into a lake/lake house and curating it through a medallion (bronze-silver-gold) architecture into clean, conformed, analytics-ready datasets.

•    Strong Python skills for data engineering, including PySpark.

•    Working knowledge of data quality, data governance, and pipeline reliability practices — automated testing, monitoring, alerting, and orchestration of batch and incremental/streaming workloads.

•    Experience designing simplified data models for integrations, analytics, and reporting; comfortable performing hands-on data analysis and ad-hoc querying.

•    Experience extracting data from source systems via web services (SOAP, REST, Web APIs), XML, and CSV/Excel exports.

•    Experience building the data foundation and automation pipelines for analytics and AI/ML initiatives, and partnering with business users on LLM/GenAI use cases.

•    Bachelor's degree in Information Systems, IT, or a related technical discipline — or equivalent demonstrated technical proficiency.

•    Strong interpersonal and communication skills; fluent in English (oral and written).


Preferred / Nice-to-Have

•    Additional experience with Python, cloud data warehouses (e.g., Snowflake, Synapse), and Spark SQL.

•    Experience with performance tuning, partitioning, and optimization of data pipelines and queries.

•    Modern LLM architectures and GenAI frameworks — retrieval-augmented generation (RAG), embeddings and vector databases, prompt orchestration, and integrating LLMs into data products and pipelines.

•    Familiarity with using LLMs in automation development and with vector/embedding data.

•    Experience in the Oil & Gas domain.

