Bangalore, IN
In this role, you will be part of a growing, global team of data engineers who collaborate in DevOps mode to enable the business with state-of-the-art technology, leveraging data as an asset to make better-informed decisions.
The Enabling Functions Data Office Team is responsible for designing, developing, testing, and supporting automated end-to-end data pipelines and applications on the Enabling Functions data management and analytics platform (Palantir Foundry, AWS and other components).
Developing pipelines and applications on a cloud platform requires:
* Hands-on experience with Terraform or CloudFormation and other infrastructure-automation tools
* Proven track record in setting up CI/CD pipelines and automating cloud infrastructure
* Strong understanding of cloud infrastructure, with experience in AWS or other cloud providers
* Experience with Azure DevOps and a GitOps approach to automation
* Experience automating dbt orchestration using Azure DevOps pipelines
* Experience working with services like Glue, EC2, ELB, RDS, DynamoDB and S3 (see the sketch after this list)
* Ability to work independently, troubleshoot issues, and optimize performance
* Practical experience is valued more than certifications
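By way of illustration, the sketch below shows the kind of AWS automation this work involves: staging a file in S3 and triggering a Glue job with boto3. All bucket, job and file names are hypothetical placeholders, not part of the actual platform.

```python
# Minimal sketch, assuming boto3 is installed and AWS credentials are configured.
# Bucket, job and file names are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")
glue = boto3.client("glue")

# Upload a raw file to a landing zone in S3.
s3.upload_file("orders.csv", "example-landing-bucket", "raw/orders.csv")

# Kick off a pre-existing Glue ETL job against the new data.
run = glue.start_job_run(
    JobName="example-orders-etl",
    Arguments={"--input_path": "s3://example-landing-bucket/raw/orders.csv"},
)
print("Started Glue job run:", run["JobRunId"])
```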
This position is project-based; you may work across multiple smaller projects or a single large project using an agile methodology.
*Roles & Responsibilities:*
* B.Tech/B.Sc./M.Sc. in Computer Science or a related field and 6+ years of overall industry experience
* Strong experience in Big Data & Data Analytics
* Experience in building robust ETL pipelines for batch as well as streaming ingestion
* Firm grounding in object-oriented programming, with advanced knowledge of and commercial experience in Python, PySpark and SQL (see the sketch after this list)
* Experience interacting with RESTful APIs, including authentication via SAML and OAuth2
* Experience with test-driven development and CI/CD workflows
* Knowledge of Git for source control management
* Agile experience in Scrum environments, using tools such as Jira
* Knowledge of container technologies such as Docker and Kubernetes is an advantage
* Experience in Palantir Foundry, AWS or Snowflake is an advantage
* Problem-solving abilities
* Proficient in English with strong written and verbal communication
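To make the ingestion skills above concrete, here is a minimal sketch of a batch ingestion step: pulling JSON from a RESTful API via the OAuth2 client-credentials flow and transforming it with PySpark. All URLs, credentials, column names and paths are hypothetical.

```python
# Minimal sketch: REST ingestion with OAuth2, then a PySpark transformation.
# All URLs, credentials, column names and paths are hypothetical placeholders.
import requests
from pyspark.sql import SparkSession

# Obtain a bearer token (OAuth2 client-credentials flow).
token = requests.post(
    "https://auth.example.com/oauth2/token",
    data={"grant_type": "client_credentials",
          "client_id": "my-client", "client_secret": "my-secret"},
).json()["access_token"]

# Fetch the source records from the API.
records = requests.get(
    "https://api.example.com/v1/orders",
    headers={"Authorization": f"Bearer {token}"},
).json()

# Load into Spark, filter, and persist the curated result.
spark = SparkSession.builder.appName("orders-ingest").getOrCreate()
df = spark.createDataFrame(records)
df.filter(df.status == "COMPLETED") \
  .write.mode("overwrite").parquet("s3://example-curated/orders/")
```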
* Primary Responsibilities
o Design, develop, test and support data pipelines and applications
o Industrialize data pipelines
o Establish a continuous quality improvement process to systematically optimize data quality (see the sketch after this list)
o Collaborate with various stakeholders, including business and IT
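As one possible shape for such a quality process, the sketch below applies simple rule-based checks with PySpark and fails the pipeline when a rule is violated. The dataset path, column names and rules are hypothetical.

```python
# Minimal sketch of an automated data quality gate in a pipeline.
# Dataset path, column names and rules are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.read.parquet("s3://example-curated/orders/")

# Rule 1: the primary key must be non-null and unique.
null_keys = df.filter(F.col("order_id").isNull()).count()
dupe_keys = df.count() - df.select("order_id").distinct().count()

# Rule 2: amounts must be non-negative.
bad_amounts = df.filter(F.col("amount") < 0).count()

failures = {"null_keys": null_keys, "dupe_keys": dupe_keys,
            "bad_amounts": bad_amounts}
if any(v > 0 for v in failures.values()):
    # Failing fast keeps bad data out of downstream datasets.
    raise ValueError(f"Data quality checks failed: {failures}")
```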
*Education*
* Bachelor's (or higher) degree in Computer Science, Engineering, Mathematics, Physical Sciences or related fields
*Professional Experience*
* 6+ years of experience in system engineering or software development
* 3+ years of engineering experience, including ETL-type work with databases and cloud platforms
*Skills*
|*Big Data General*|Deep knowledge of distributed file system concepts, MapReduce principles and distributed computing. Knowledge of Spark and the differences between Spark and MapReduce. Familiarity with encryption and security in a Hadoop cluster.|
|*Data management / data structures*|Must be proficient in technical data management tasks, i.e. writing code to read, transform and store data
XML/JSON knowledge
Experience working with REST APIs|
|*Spark*|Experience in launching Spark jobs in client mode and cluster mode. Familiarity with the property settings of Spark jobs and their implications for performance (see the sketch below this table).|
|*SCC/Git*|Must be experienced in the use of source code control systems such as Git|
|*ETL* |Experience in developing ELT/ETL processes, including loading data from enterprise-scale RDBMSs such as Oracle, DB2, MySQL, etc.|
|*Authorization*|Basic understanding of user authorization (Apache Ranger preferred)|
|*Programming* |Must be able to code in Python, or be expert in at least one high-level language such as Python, Java or Scala.
Must have experience in using REST APIs|
|*SQL* |Must be an expert in manipulating database data using SQL. Familiarity with views, functions, stored procedures and exception handling.|
|*AWS* |General knowledge of AWS Stack (EC2, S3, EBS, …)|
|*IT Process Compliance*|SDLC experience and formalized change controls
Working in DevOps teams, based on Agile principles (e.g. Scrum)
ITIL knowledge (especially incident, problem and change management)|
|*Languages* |Fluent English skills|
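For reference against the Spark row above, here is a minimal sketch of a PySpark job and the spark-submit launch modes it distinguishes; paths, resource settings and the shuffle-partition value are hypothetical.

```python
# Minimal sketch of a Spark job; paths and settings are hypothetical.
# Launched via spark-submit, e.g.:
#
#   # client mode: the driver runs on the submitting machine (handy for debugging)
#   spark-submit --master yarn --deploy-mode client job.py
#
#   # cluster mode: the driver runs inside the cluster (typical for production)
#   spark-submit --master yarn --deploy-mode cluster \
#       --executor-memory 4g --num-executors 8 job.py
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("example-job")
    # Property settings like this one directly affect performance.
    .config("spark.sql.shuffle.partitions", "200")
    .getOrCreate()
)

df = spark.read.parquet("s3://example-curated/orders/")
df.groupBy("status").count().show()
```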
*Specific information related to the position:*
* Physical presence in primary work location (Bangalore)
* Flexibility to work CEST and US EST time zones (according to the team rotation plan)
* Willingness to travel to Germany, US and potentially other locations (as per project demand)