Sr. Big Data Developer with Spark & Scala in Raleigh, NC (Day 1 Onsite)
Job Description: We are looking for a data engineer who will help build new data pipelines or improve existing ones. You should be comfortable working with large or fast-moving data, have a solid understanding of distributed processing frameworks, and bring a software engineering mindset.
Mandatory Skills:
- Databases: Hive, Impala, HBase
- Data processing: Spark Core and SQL
- Build tool: Maven
- Testing framework: Cucumber
Required Skills:
- Overall 8 to 10 years of IT experience.
- Extensive experience in Big Data, Analytics, and ETL technologies.
- Application development background, along with knowledge of analytics, statistical, and big data computing libraries.
- Minimum 3 years of experience in Spark/PySpark and Python/Scala/Java programming.
- Hands-on experience in designing, coding, and developing complex data pipelines using big data technologies.
- Experience in developing Big Data applications and in designing and building highly scalable data pipelines.
- Expertise in Python, SQL databases, Spark, and non-relational databases.
- Ingest data from files, streams, and databases, and process it using Spark and Python.
- Develop programs in PySpark and Python for data cleaning and processing.
- Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.
- Develop efficient code leveraging Python and Big Data technologies for the various use cases built on the platform.
- Maintain operational excellence, ensuring high availability and platform stability.
- Implement scalable solutions to meet ever-increasing data volumes, using big data/Palantir technologies, PySpark, any cloud computing platform, etc.
- Able to work under their own direction toward agreed targets and goals, with a creative approach to work.
- Intuitive individual with the ability to manage change and proven time-management skills.
- Proven interpersonal skills, contributing to the team effort by accomplishing related results as needed.
Additional Skills:
- Experience in building CI/CD pipelines, Git, Jenkins
- Experience working with large datasets
- Proficient in reading and understanding enterprise-grade PySpark or Spark with Scala code