Our Innovation Center is in search of a seasoned Principal Hadoop Architect for our Palo Alto office. The architect will have a major impact on the future of computing in financial services by helping streamline analytics and business insights using Big Data and by pioneering new Big Data engineering techniques. On a given day we process trillions of dollars of transactions, representing a significant share of global transactions and currency flows. This generates fascinating datasets for us to store and analyze using emerging Big Data technologies.
The Principal Hadoop Architect will play a key role in pioneering Hadoop technologies to improve the effectiveness of current processes by leveraging Big Data. The architect will work closely with business partners, developers, and other development teams to understand line-of-business requirements, and will design efficient, effective solutions on Hadoop by selecting the right set of open source tools and components from the Hadoop stack for all of the organization's Big Data use cases. The architect will also create reference implementations for teams worldwide to follow.
Responsibilities include the following:
• Apply a deep understanding of Hadoop design principles and the factors that affect high performance across distributed systems, including hardware and network considerations
• Evaluate emerging technologies in the Hadoop ecosystem, including Spark, Kafka, and Druid
• Architect and develop streaming and batch Big Data solutions using Spark and Hadoop technologies
• Evangelize the Hadoop ecosystem and help other groups take advantage of it
• Develop best practices and reference implementations
Qualifications
• BS or MS in Computer Science or equivalent, along with hands-on experience with large data sets and distributed computing in Hadoop-based big data systems
• Proficiency in Java or Scala and in writing software for distributed systems
• Experience writing software with Hadoop or Spark
• Experience developing code for large clusters with high volumes of data, both streaming and batch
• Strong knowledge of Linux
• At least 3 years of experience with, and a strong understanding of, big data technologies in the Hadoop ecosystem: Hive, HDFS, MapReduce, YARN, Pig, HBase, Sqoop, etc.
• At least 1 year of experience with data streaming technologies such as Spark, Kafka, or Storm
• At least 1 year of experience with NoSQL databases such as Druid, Cassandra, or HBase