Data Operations Support Engineer

Remote

Description

We’re GoSolve, a global company specialised in transforming our customers’ visions into digital applications. We love building large-scale cloud-based digital products and have the skills to make it happen. Join us and work with top tech talent from all over the globe in a driven, proactive environment. Looking to join a stable product development project? Then you’ll be a perfect fit!

Just GO for it.

We are looking for a Data Operations Support Engineer to support, monitor, and troubleshoot our data pipelines and workflows across our Airflow, Redshift, and Databricks platforms. The ideal candidate combines strong technical skills in data pipeline management, proactive problem-solving, and a collaborative mindset for working with data engineers, analysts, and other stakeholders. This role is integral to ensuring reliable data flow and high-quality data for the business.

Responsibilities:

  1. Data Pipeline Monitoring and Support:

    • Actively monitor and manage data pipelines and ETL workflows in Apache Airflow to ensure continuous data processing.
    • Oversee Redshift and Databricks-based data workflows to maintain pipeline reliability, data integrity, and timely data ingestion and transformation.
    • Use monitoring tools, logs, and alerts to detect and respond to issues promptly, minimizing downtime and data latency.

  2. Incident Management and Troubleshooting:

    • Troubleshoot and resolve issues in Airflow DAGs, Redshift data transformations, and Databricks notebooks by identifying root causes and coordinating with data engineers for resolution.
    • Establish incident management protocols, including documentation and rapid communication with relevant stakeholders for operational transparency.

  3. Data Quality Assurance:

    • Perform data validation, consistency checks, and quality assurance measures to detect and address anomalies across Redshift and Databricks tables.
    • Implement data quality controls within Airflow DAGs to catch issues proactively, ensuring data accuracy and reliability (see the sketch after this list).

  4. Stakeholder Collaboration and Communication:

    • Collaborate closely with data engineering, data science, and business analytics teams to support data availability and quality.
    • Act as the primary point of contact for data pipeline issues, communicating resolutions and impacts to data stakeholders.

  5. Process Optimization and Documentation:

    • Identify and implement optimizations for Airflow, Redshift, and Databricks workflows to improve pipeline efficiency and scalability.
    • Document troubleshooting steps, operational workflows, and processes to build a comprehensive support knowledge base for data operations.

  6. Tool Management and Continuous Improvement:

    • Maintain and improve data orchestration in Airflow, ensuring DAGs are efficient, well-structured, and robust.
    • Optimize Redshift performance through table management, query optimization, and monitoring cluster health.
    • Collaborate on enhancements within Databricks, managing notebooks, clusters, and Spark jobs to ensure cost-effective and efficient data processing.
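
To give a concrete flavour of the day-to-day work, here is a minimal sketch of the kind of quality gate described above: an Airflow DAG with a failure-alert callback and a row-count check against Redshift. The DAG id, connection id, schema, table, and notification channel are hypothetical placeholders, not a description of our actual pipelines.

```python
# Minimal sketch only: dag_id, conn id, and table below are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


def alert_on_failure(context):
    """Failure callback: surface the failing task so on-call can respond."""
    ti = context["task_instance"]
    # Swap the print for a real channel (Slack, PagerDuty, email, ...).
    print(f"ALERT: task {ti.task_id} in DAG {ti.dag_id} failed for {context['ds']}")


def check_row_count(**context):
    """Data-quality gate: fail the run if the day's load looks empty."""
    # Redshift speaks the Postgres wire protocol, so PostgresHook works here.
    hook = PostgresHook(postgres_conn_id="redshift_default")  # placeholder conn id
    row = hook.get_first(
        "SELECT COUNT(*) FROM analytics.orders WHERE load_date = %s",
        parameters=[context["ds"]],
    )
    if not row or row[0] == 0:
        raise ValueError(f"No rows loaded into analytics.orders for {context['ds']}")


with DAG(
    dag_id="orders_quality_check",  # hypothetical example DAG
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args={
        "retries": 2,
        "retry_delay": timedelta(minutes=5),
        "on_failure_callback": alert_on_failure,
    },
):
    PythonOperator(task_id="check_row_count", python_callable=check_row_count)
```

Because the check raises on bad data, downstream tasks are held back automatically and the failure callback notifies the team, which is exactly the monitor-detect-respond loop this role owns.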

Skills & Requirements:

  • Proficiency in Apache Airflow, Redshift, and Databricks; solid understanding of ETL processes, workflow automation, and SQL for data validation and transformations; familiarity with monitoring tools for proactive issue detection.  
  • Strong problem-solving abilities to identify root causes and manage incidents; experience in setting up protocols for quick stakeholder communication.  
  • Skills in data validation and quality checks to maintain data accuracy and reliability, with the ability to implement quality controls within pipelines.  
  • Effective collaboration with data engineering, data science, and business analytics teams; excellent communication skills to update stakeholders on pipeline issues and resolutions.
  • Proven ability to streamline data workflows for efficiency and scalability, along with documenting processes to build a knowledge base.
  • Dedication to enhancing data orchestration systems and ensuring cost-effective operations in Databricks and Redshift (see the Redshift health-check sketch after this list).
  • Bachelor’s degree in Computer Science, Information Systems, or a related field; relevant certifications in data tools are a plus.
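
As an illustration of the Redshift cost-and-health monitoring mentioned above, the sketch below queries the SVV_TABLE_INFO system view for tables with large unsorted regions or skewed row distribution. The view and its columns are standard Redshift; the cluster endpoint, credentials, and thresholds are illustrative placeholders only.

```python
# Minimal sketch only: endpoint, user, and thresholds are placeholders.
import psycopg2  # Redshift speaks the Postgres wire protocol

HEALTH_CHECK_SQL = """
    SELECT "table", unsorted, skew_rows, tbl_rows
    FROM svv_table_info
    WHERE unsorted > 20 OR skew_rows > 4
    ORDER BY unsorted DESC NULLS LAST;
"""

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
    port=5439,
    dbname="analytics",
    user="ops_user",      # placeholder
    password="REDACTED",  # use a secrets manager in practice
)
try:
    with conn.cursor() as cur:
        cur.execute(HEALTH_CHECK_SQL)
        for table, unsorted, skew, rows in cur.fetchall():
            # A large unsorted region makes the table a VACUUM candidate;
            # high skew_rows suggests revisiting the distribution key.
            print(f"{table}: unsorted={unsorted}%, skew_rows={skew}, rows={rows}")
finally:
    conn.close()
```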

Benefits:

  • Be part of a team that works for the most influential global brands.
  • Opportunities to create industry-defining services using the latest technologies.
  • Responsible position – leverage your knowledge beyond simple coding.
  • Advise customers on optimal solutions – we trust your expertise!
  • No rush! Work at your own pace in a quality-over-quantity environment.
  • Collaborate with highly experienced professionals.
  • Numerous opportunities for professional growth.
  • Full-time remote work from anywhere in the world.
  • Monthly budget for social benefits – tailored to your location and lifestyle.
  • 20 days of paid time off.
  • Annual training budget.

GDPR DATA PRIVACY NOTICE
In accordance with Article 13(1) and (2) of the GDPR, we inform you that:

  1. The controller of your personal data is the entity indicated in the job offer.
  2. We will process your personal data for the purpose of conducting the recruitment process for the position indicated in the job offer and, if you have given consent, also for the purpose of conducting future recruitment processes.
  3. You have the right to access your data and to request its rectification, erasure, or restriction of processing, as well as the right to data portability and the right to object to the processing of your data.
  4. We do not engage in automated decision-making or profiling.
  5. For more information on how we process your personal data, please refer to the full text of the Information Clause for Job Candidates.
