Date: Feb 17, 2025
Location: Bangalore, IN
Company: Marlabs Innovations Pvt Ltd
Description:

In this role, you will be part of a growing, global team of data engineers who collaborate in DevOps mode to enable the business with state-of-the-art technology, leveraging data as an asset to make better-informed decisions.

 

The Enabling Functions Data Office Team is responsible for designing, developing, testing, and supporting automated end-to-end data pipelines and applications on the Enabling Functions' data management and analytics platform (Palantir Foundry, AWS and other components).

 

Developing pipelines and applications on a cloud platform requires: 

  

 * Hands-on experience with Terraform or CloudFormation and other infrastructure-automation tools

 * Experience with Azure DevOps

 * Proven track record in setting up CI/CD pipelines and automating cloud infrastructure

 * Strong understanding of cloud infrastructure, with experience in AWS or other cloud providers

 * Experience with a GitOps approach to automation in Azure DevOps

 * Experience automating dbt orchestration using Azure DevOps pipelines

 * Experience working with AWS services such as Glue, EC2, ELB, RDS, DynamoDB and S3 (see the sketch after this list)

 * Ability to work independently, troubleshoot issues, and optimize performance

 * Practical experience is valued more than certifications
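
As a rough illustration of the AWS service interaction mentioned above, the following is a minimal Python (boto3) sketch that starts a Glue job and lists its S3 output. The job, bucket and prefix names are hypothetical placeholders, not details of this role's actual environment.

```python
# Minimal sketch, assuming boto3 is installed and AWS credentials are configured.
# Job, bucket and prefix names below are hypothetical placeholders.
import boto3

glue = boto3.client("glue")
s3 = boto3.client("s3")

def run_ingestion(job_name: str, bucket: str, prefix: str) -> str:
    """Start a Glue job run that writes to the given S3 prefix."""
    run = glue.start_job_run(
        JobName=job_name,
        Arguments={"--target_prefix": f"s3://{bucket}/{prefix}"},
    )
    return run["JobRunId"]

def list_output(bucket: str, prefix: str) -> list[str]:
    """Return the object keys currently stored under the output prefix."""
    response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in response.get("Contents", [])]

if __name__ == "__main__":
    run_id = run_ingestion("example-ingestion-job", "example-data-bucket", "curated/daily/")
    print(f"Started Glue run {run_id}")
    print(list_output("example-data-bucket", "curated/daily/"))
```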

 

This position is project-based; you may work across multiple smaller projects or on a single large project, following an agile project methodology.


*Roles & Responsibilities:* 

 * B.Tech/B.Sc./M.Sc. in Computer Science or a related field and 6+ years of overall industry experience

 * Strong experience in Big Data & Data Analytics

 * Experience in building robust ETL pipelines for batch as well as streaming ingestion.

 * Firm grounding in object-oriented programming, with advanced-level knowledge and commercial experience in Python, PySpark and SQL (see the PySpark sketch after this section)

 * Experience interacting with RESTful APIs, including authentication via SAML and OAuth2

 * Experience with test-driven development and CI/CD workflows

 * Knowledge of Git for source control management

 * Agile experience in Scrum environments, using tools such as Jira

 * Knowledge of container technologies such as Docker and Kubernetes is an advantage

 * Experience in Palantir Foundry, AWS or Snowflake is an advantage

 * Problem solving abilities

 * Proficient in English with strong written and verbal communication

 * Primary Responsibilities:

o   Design, develop, test and support data pipelines and applications

o   Industrialize data pipelines

o   Establish a continuous quality-improvement process to systematically optimize data quality

o   Collaborate with various stakeholders, incl. business and IT
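
For the PySpark/ETL items above, here is a minimal batch-ETL sketch that reads raw CSV data, applies a simple transformation and writes partitioned Parquet. The S3 paths, column names and app name are hypothetical and not taken from the Foundry/AWS platform named in this posting.

```python
# Minimal PySpark batch-ETL sketch; paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-batch-etl").getOrCreate()

# Extract: read raw CSV files with a header row.
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders/")

# Transform: basic cleansing plus a derived partition column.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount").cast("double") > 0)
)

# Load: write partitioned Parquet for downstream analytics.
(cleaned.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-bucket/curated/orders/"))

spark.stop()
```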

 

  *Education* 

 * Bachelor's (or higher) degree in Computer Science, Engineering, Mathematics, Physical Sciences or related fields 

 

  *Professional Experience*  

 * 6+ years of experience in system engineering or software development

 * 3+ years of engineering experience in ETL-type work with databases and cloud platforms.

 

  *Skills*

|*Big Data General*|Deep knowledge of distributed file system concepts, MapReduce principles and distributed computing; knowledge of Spark and the differences between Spark and MapReduce; familiarity with encryption and security in a Hadoop cluster|
|*Data management / data structures*|Proficiency in technical data management tasks, i.e. writing code to read, transform and store data; XML/JSON knowledge; experience working with REST APIs|
|*Spark*|Experience launching Spark jobs in client mode and cluster mode; familiarity with Spark job property settings and their implications for performance|
|*SCC/Git*|Experience with source code control systems such as Git|
|*ETL*|Experience developing ELT/ETL processes, including loading data from enterprise-scale RDBMS systems such as Oracle, DB2 and MySQL|
|*Authorization*|Basic understanding of user authorization (Apache Ranger preferred)|
|*Programming*|Ability to code in Python, or expert-level skills in at least one high-level language such as Python, Java or Scala; experience using REST APIs (a brief example follows this table)|
|*SQL*|Expert in manipulating database data using SQL; familiarity with views, functions, stored procedures and exception handling|
|*AWS*|General knowledge of the AWS stack (EC2, S3, EBS, …)|
|*IT Process Compliance*|SDLC experience and formalized change controls; experience working in DevOps teams based on Agile principles (e.g. Scrum); ITIL knowledge (especially incident, problem and change management)|
|*Languages*|Fluent English skills|
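
For the REST API items in the table above, here is a minimal Python sketch using the requests library with an OAuth2 client-credentials flow. The token URL, client ID/secret and API endpoint are hypothetical placeholders, not part of the actual platform described in this posting.

```python
# Minimal sketch of an OAuth2 client-credentials call to a REST API.
# URLs, client ID/secret and endpoint below are hypothetical placeholders.
import requests

TOKEN_URL = "https://auth.example.com/oauth2/token"
API_URL = "https://api.example.com/v1/datasets"

def get_access_token(client_id: str, client_secret: str) -> str:
    """Exchange client credentials for a bearer token."""
    response = requests.post(
        TOKEN_URL,
        data={"grant_type": "client_credentials"},
        auth=(client_id, client_secret),
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["access_token"]

def list_datasets(token: str) -> list:
    """Call a protected REST endpoint with the bearer token."""
    response = requests.get(
        API_URL,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    token = get_access_token("example-client-id", "example-client-secret")
    print(list_datasets(token))
```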

 

*Specific information related to the position:*

 * Physical presence in the primary work location (Bangalore)

 * Flexibility to work in CEST and US EST time zones (according to the team rotation plan)

 * Willingness to travel to Germany, the US and potentially other locations (as per project demand)