Senior Software Engineer - Data and ML Infrastructure (Cloud Adoption Team)

  • Full-time

Company Description

Twitter is what’s happening and what people are talking about right now. For us, life's not about a job, it's about purpose. We believe real change starts with conversation. Here, your voice matters. Come as you are and together we'll do what's right (not what's easy) to serve the public conversation.

Job Description

If you are passionate about data and driven to take on data challenges at the scale of Twitter’s Product Teams (hundreds of terabytes daily!), please read on!

Twitter is looking to leverage the best of the Cloud and the best of our own investments. We've built a stack that scales to billions of tweets and hundreds of millions of users. To move forward faster and serve people inclusively and globally, we're leveraging the Cloud, but this takes precise thinking, decision-making and investments in our technology that meet our needs for velocity while balancing risks and our near-term effectiveness. You'll join a budding team that's focused on making this happen!

You will spend your day working with various teams (internal customers such as Timelines, Search, Explore, Spaces, Notifications and more) and partners to understand where Cloud investments can generate outsized ROI, and proposing specific yet extensible solutions. You will then extend that partnership into execution by supporting the team either directly or indirectly (whichever makes sense for the project).

As a Senior Software Engineer, you will join a high-visibility team whose core purpose is to help product teams leverage the best of cloud technologies, with a particular focus on data, analytics and machine learning needs.

Since this team engages with many teams and works across organizations, your involvement will vary across projects and teams, ranging from purely providing guidance and shepherding to designing and executing completely from scratch.

To highlight some responsibilities, you will be:

  • Collaborating with Data Platform & Infrastructure teams, product Data Scientists, ML modeling engineers and Cloud PSOs to develop and/or leverage frameworks, cloud technologies, off-the-shelf platform components, data processing solutions and any other technology needed to meet ML (and analytics) data and infrastructure needs

  • Understanding the existing architecture (both on-premises and multiple cloud environments) and building powerful, flexible, and user-friendly infrastructure that powers all of ML

  • Designing, prototyping and building a low-latency, high-throughput data pipeline for our ML models, including components such as ingestion, processing, transformation, inference and monitoring

  • Identifying areas for improvement and proposing small, medium and large initiatives that help accelerate the end-to-end ML pipeline

  • Running POCs and supporting future ML and data infrastructure investments and migrations

  • Shepherding high-quality data products that let product and engineering teams understand their product and evolve their ML investments

  • Filling the knowledge and onboarding gaps between product teams and other company-wide initiatives in the data and ML infrastructure space

  • Assessing and sharing experience with external open-source (as well as proprietary) software that may benefit Twitter's ML and data needs and could potentially be onboarded

  • Evaluating and guiding teams on cost optimization, standards and best practices for large-scale distributed systems

You will be part of an early-stage team and have a significant stake in defining not only its future but also that of multiple teams across organizations.

Qualifications

  • BS, MS, or Ph.D. in Computer Science (or similar field) with 5+ years of related or equivalent experience with data infrastructure and/or distributed systems

  • Proficiency with Scala, Python or Java, and willingness to adopt new languages in the future

  • Strong background in building scalable and fault-tolerant distributed data systems

    • Experience writing and debugging ETL (or ELT) jobs using a distributed data framework (such as Dataflow, Spark, MapReduce, Hadoop Pig, Hive, etc.)

    • Experience with lambda or kappa architectures: real-time streaming (Apache Kafka, Apache Beam, Heron, Spark Streaming) and batch pipelines

    • Proficiency with SQL (BigQuery, Redshift, Hive, Presto, Vertica)

  • Experience with ML model training and inference in production and at scale, focusing on infrastructure needs (rather than model building or model evaluation itself)

    • Familiarity with ML feature engineering and feature stores (batch and real-time)

  • Experience with Google Cloud Platform (including components such as BigQuery, Bigtable, BQML, Vertex AI, Dataflow) or other Cloud Solutions (AWS, Azure)

  • It’s not expected that you’ll have deep expertise in every dimension above, but you should be interested in learning the areas that are less familiar to you

Nice to have:

  • Prior Machine Learning background in modeling, training and inference at production scale

  • Experience with large-scale data warehousing architecture and data modeling

  • Ability to manage and communicate data warehouse project plans to internal clients

Additional Information

Job opportunities should be equal. We don't discriminate. Period. In legal terms, that means: Twitter is an equal opportunity employer and doesn’t discriminate based on race, color, ethnicity, ancestry, national origin, religion, sex, gender, gender identity, gender expression, sexual orientation, age, disability, veteran status, genetic information, marital status or any other legally protected status.

San Francisco applicants: In response to the San Francisco Fair Chance Ordinance, we’d like to mention that we consider qualified applicants with arrest and conviction records.
