Company Confidential
Responsibilities:
- Define the technology roadmap in support of the product development roadmap
- Architect complex solutions encompassing multiple product lines
- Provide technical consulting to multiple product development teams
- Develop custom batch-oriented and real-time streaming data pipelines within the MapReduce ecosystem, migrating flows from ELT to ETL (see the batch ETL sketch after this list)
- Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classification, etc.
- Act in a technical leadership capacity: Mentor junior engineers and new team members, and apply technical expertise to challenging programming and design problems
- Resolve defects/bugs during QA testing, pre-production, production, and post-release patches
- Bring a quality mindset: squash bugs with a passion, and work to prevent them in the first place through unit testing, test-driven development, version control, and continuous integration and deployment
- Lead change, be bold, and innovate; challenge the status quo
- Be passionate about solving customer problems, and develop solutions that earn a devoted customer and community following
- Conduct design and code reviews
- Analyze and improve efficiency, scalability, and stability of various system resources
- Contribute to the design and architecture of the project
- Operate within an Agile development environment and apply its methodologies
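
A minimal sketch of the kind of batch ETL flow described above, written with PySpark; the HDFS paths, field names (event_id, event_ts), and app name are illustrative assumptions, not details from this posting:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-etl-sketch").getOrCreate()

# Extract: read raw events from HDFS (path is hypothetical).
raw = spark.read.json("hdfs:///data/raw/events/")

# Transform: de-duplicate on a business key, apply a basic quality check,
# and derive a partition column (the lineage/quality concerns noted above).
clean = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("event_ts").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: write curated, date-partitioned output for downstream consumers.
clean.write.mode("overwrite").partitionBy("event_date").parquet(
    "hdfs:///data/curated/events/"
)
```
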
Required Skills and Knowledge:
- Advanced knowledge of data architectures, data pipelines, real-time processing, streaming, networking, and security
- Proficient understanding of distributed computing principles
- Good knowledge of Big Data querying tools, such as Pig or Hive
- Good understanding of Lambda Architecture, along with its advantages and drawbacks
- Proficiency with MapReduce and HDFS (a minimal word-count example follows this list)
- Experience with integration of data from multiple data sources
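
As a concrete instance of the MapReduce model required above, the classic word-count job written for Hadoop Streaming is sketched below; file names and paths are illustrative:

```python
#!/usr/bin/env python3
# mapper.py: emit (word, 1) pairs; Hadoop Streaming pipes input splits via stdin.
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(f"{word}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py: sum counts per word; input arrives sorted and grouped by key.
import sys

current_word, count = None, 0
for line in sys.stdin:
    word, value = line.rstrip("\n").split("\t", 1)
    if word != current_word:
        if current_word is not None:
            print(f"{current_word}\t{count}")
        current_word, count = word, 0
    count += int(value)
if current_word is not None:
    print(f"{current_word}\t{count}")
```

A typical submission uses the Hadoop Streaming jar, e.g. hadoop jar hadoop-streaming.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input /data/text -output /data/wordcount (jar location and paths vary by installation).
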
Basic Qualifications:
- Bachelor's degree in Software Engineering or a related field
- 12+ years of experience in software engineering
- Experience developing ETL processing flows using distributed processing technologies such as Hadoop MapReduce and Spark
- Experience developing with ingestion and cluster-coordination frameworks such as Kafka, ZooKeeper, and YARN
- Experience building stream-processing systems using solutions such as Storm or Spark Streaming (a streaming sketch follows this list)
- Experience with various messaging systems, such as Kafka or RabbitMQ
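
One plausible shape for the stream-processing experience above is a minimal Spark Structured Streaming job consuming from Kafka. The broker address, topic name, and checkpoint path are assumptions for illustration, and the job requires the spark-sql-kafka connector package on the classpath at submit time:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

# Source: subscribe to a Kafka topic (broker and topic are hypothetical).
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "events")
         .load()
)

# Kafka delivers key/value as binary; cast the payload and aggregate over
# one-minute windows of the source-provided timestamp column.
counts = (
    events.select(F.col("value").cast("string").alias("payload"), "timestamp")
          .withWatermark("timestamp", "2 minutes")
          .groupBy(F.window("timestamp", "1 minute"))
          .count()
)

# Sink: print windowed counts; a real pipeline would write to a durable store.
query = (
    counts.writeStream.outputMode("update")
          .format("console")
          .option("checkpointLocation", "/tmp/checkpoints/stream-sketch")
          .start()
)
query.awaitTermination()
```
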
Preferred Experience:
- Experience with Databricks and Spark
- Ability to troubleshoot and resolve ongoing operational issues with the cluster
- Management of Spark or Hadoop clusters, including all associated services
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
- Experience with Big Data ML toolkits, such as Mahout, Spark MLlib, or H2O
- Understanding of Service Oriented Architecture
- Technical writing, system documentation, and design-document management skills
To apply for this job, email your details to Info@princetonstaffingsolutions.com