Position: Data Engineer
Location: Plano, TX
Duration: Contract
Job Description:
Required Skills: Unix, Python, PySpark, SQL, AWS
Roles & Responsibilities:
- Analyze business processes to identify areas for improvement and optimization.
- Develop and maintain data pipelines using Python and PySpark to ensure efficient data processing.
- Utilize AWS services to manage and deploy scalable data solutions.
- Write and optimize SQL queries to extract and manipulate data for analysis.
- Write Unix shell scripts to automate routine tasks and improve system performance.
- Collaborate with stakeholders to gather requirements and translate them into technical specifications.
- Provide actionable insights through data analysis to support strategic decision-making.
- Create and maintain comprehensive documentation for data processes and business workflows.
- Conduct data validation and quality checks to ensure accuracy and reliability of data.
- Lead data-driven projects from inception to completion, ensuring timely delivery and alignment with business goals.
- Communicate findings and recommendations to both technical and non-technical audiences.
- Stay updated with industry trends and best practices to continuously improve data solutions.
- Work closely with IT and development teams to integrate data solutions into existing systems.
Qualifications:
- Possess strong analytical skills with the ability to interpret complex data sets.
- Demonstrate proficiency in Python and PySpark for data processing and analysis.
- Have hands-on experience with AWS services for data management and deployment.
- Show expertise in writing and optimizing SQL queries for data extraction and manipulation.
- Be skilled in Unix shell scripting for automation and system performance improvement.
- Exhibit excellent communication skills to effectively collaborate with cross-functional teams.
- Display a proactive approach to problem-solving and continuous improvement.
- Hold a Bachelor's degree in Computer Science, Information Technology, or a related field.
- Have a proven track record of delivering data-driven projects on time.