Senior Site Reliability Engineer, FedRAMP Cloud Platform-REMOTE OK!

Splunk Inc

Santa Fe, New Mexico

United States

Information Technology

Join us as we pursue our disruptive vision to make machine data accessible, usable and valuable to everyone. We are a company filled with people who are passionate about our product and seek to deliver the best experience for our customers. At Splunk, we’re committed to our work, customers, having fun and most significantly to each other’s success. Learn more about Splunk careers and how you can become a part of our journey!

Role:

Splunk's Cloud Services group is looking for a Site Reliability Engineer to help lead, design and build the next generation of our large scale cloud offering. You will be working on core services and applications that form the primitives for our current and future cloud service offerings. Site Reliability Engineers in this role will be engaging with multiple service owners across the platform to teach and implement modern interpretations of SRE, observability, Chaos Engineering and DevOps. This role is highly visible and impactful to the organization and will help shape Splunk's Engineering culture for years to come. Your job, in a nutshell, is to make every team around you better... including your own!

This is a remote role available in all US states except AK, ND, and WY. You also have the option of an office desk in some locations if that's convenient and desirable for you!

You will:

  • Own Splunk Cloud in FedRAMP environments, abiding by all the FedRAMP prescriptions on location and access.
  • Work across the organization to deliver quality products that delight Splunk's passionate users.
  • Lead teams of tight-knit engineers who are building a state-of-the-art, cloud-based environment for massive-scale data processing.
  • Mentor and help new engineers to achieve more than they thought possible. You enjoy making other teams successful and are fulfilled through the success of others.

Qualifications:

  • You have experience operating within restrictive FedRAMP environments and are enthusiastic about doing it better.
  • You have assembled bricolages of open source components into cohesive services.
  • You have owned and operated Kubernetes clusters and the associated ecosystems.
  • You enjoy building and running distributed systems at scale in production. You understand the challenges and trade-offs to be made when building and deploying systems to production.
  • Expertise in working with container deployment and orchestration technologies at scale, with knowledge of fundamentals including service discovery, deployments, monitoring, scheduling, and load balancing. Knowledge of Kubernetes, Go and Docker preferred.
  • Deep understanding of systems programming (network stack, file system, OS services) and networking (L2 vs. L3, network architecture, VLANs, etc.).
  • Knowledge of standard methodologies related to security, performance, and disaster recovery.
  • Highly skilled in identifying performance bottlenecks, detecting anomalous system behavior, and resolving the root cause of service issues.
  • You've demonstrated the skills to effectively work across teams and functions to influence design, operations and deployment of highly available software.
  • You work hard to make the users of Splunk's products happier every day.

Preferred skills:

  • Experience monitoring cloud environments with Splunk.
  • Experience with development and deployment in a hosted cloud environment, preferably AWS & GCP.
  • Experience with large scale distributed cloud service development, infrastructure, traffic management and architecture.
  • Experience with distributed architectures/systems with optimized and scalable software that operates on a large number of nodes. You have a proven understanding of MapReduce fundamentals and lambda architecture, and have developed applications on systems like Spark/Flink/Hadoop and Kafka.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other applicable legally protected characteristics in the location in which the candidate is applying. For job positions in San Francisco, CA, and other locations where required, we will consider for employment qualified applicants with arrest and conviction records.

(Colorado only*) Minimum base salary of $115,000. You may also be eligible for incentive pay + equity + benefits. *Note: Disclosure per SB19-085 (8-5-201 et seq.).

Thank you for your interest in Splunk!
