[Remote] Senior Manager, Resiliency Engineering & L4 Support (Digital Banking)

BMO

Note: This is a remote position open to candidates in the USA. BMO is a financial institution focused on creating positive change for customers and communities. It is seeking a Senior Manager, Resiliency Engineering & L4 Support, to lead resiliency engineering and support for its digital banking channels, with a focus on performance optimization, stability, and scalability.

Responsibilities

  • Leads resiliency engineering and L4 development support for the digital banking channels (web and mobile)
  • Owns assessment, governance, and implementation of resilience patterns - circuit breakers, timeouts/retries, bulkheads, graceful degradation, failover/DR - and code-level observability
  • Acts as the development arm for SRE/Service Delivery to deliver instrumentation, production hardening, and fixes for incidents requiring code changes
  • Defines the strategy for continuous enhancement, performance optimization, stability, and scalability across our modern Java services, web clients, and native/hybrid mobile apps
  • Establishes and governs resiliency standards, design guardrails, readiness checks, and production change controls
  • Implements fault-tolerance patterns and libraries in services and clients; enables kill-switches, rate limiting, and backpressure
  • Delivers observability: distributed tracing, metrics, logs, health endpoints, synthetic probes, and error taxonomies
  • Serves as L4 dev support for production incidents
  • Defines and maintains runbooks, SLIs/SLOs for critical journeys, DR playbooks; conducts regular failover exercises
  • Partners with SRE/Service Delivery to translate operational needs into code-level instrumentation and monitoring enhancements
  • Collaborates with API Governance and Platform Engineering on gateway policies, dependency hardening, and release safety (canary/blue-green)
  • Improves performance and stability via caching, connection pooling, dependency isolation, capacity planning, and traffic shaping
  • Guides DR architecture (multi-AZ/region, active-active/passive) aligned to RTO/RPO and regulatory requirements
  • Influences cross-functional delivery and provides technical mentorship without formal line management
  • Fosters a culture aligned to BMO's purpose, values, and strategy, and role-models BMO values and behaviours in all that they do
  • Ensures alignment between values and behaviour that fosters diversity and inclusion
  • Regularly connects work to BMO’s purpose, sets inspirational goals, defines clear expected outcomes, and ensures clear accountability for follow through
  • Builds interdependent teams that collaborate across functional and operating groups to create the highest value for all stakeholders
  • Improves team performance, recognizes and rewards performance, coaches employees, supports their development, and manages poor performance
  • Operates at a group/enterprise-wide level and serves as a specialist resource to senior leaders and stakeholders
  • Applies expertise and thinks creatively to address unique or ambiguous situations and to find solutions to problems that can be complex and non-routine
  • Implements changes in response to shifting trends
  • Broader work or accountabilities may be assigned as needed

Skills

  • Resiliency patterns: circuit breakers, retries/timeouts, bulkheads, fallbacks, rate limiting, backpressure, feature flags/kill-switches
  • Observability: OpenTelemetry-based tracing; metrics/logging with Prometheus/Grafana and Splunk/ELK; health endpoints and synthetic monitoring
  • DR/BCP and failover: multi-AZ/region, active-active/passive designs; clear RTO/RPO ownership
  • CI/CD and release safety: Git-based pipelines, canary/blue-green, automated rollbacks, progressive delivery
  • Cloud and platforms: AWS (networking, compute, storage, databases, monitoring) and Red Hat OpenShift/Kubernetes; containerization
  • Backend: Java with Spring/Spring Boot; API gateways and governance; resilience libraries (e.g., Resilience4j)
  • Web and mobile: modern web apps (Angular) and hybrid/native mobile (Ionic, iOS, Android) including offline-first and graceful degradation
  • Data and integration: RDBMS/NoSQL, caching (Redis), messaging/streaming (Kafka/SQS); idempotency and exactly-once patterns
  • Web architecture, server-side concepts, and version control
  • Technical writing/documentation; verbal & written communication
  • Organization, collaboration & team skills; relationship building
  • Analytical and problem-solving skills; influence skills; data-driven decision making
  • Learning agility; ability to operate across multiple stakeholder groups
  • Technical leadership role with direct reports; candidates with informal team lead experience are encouraged to apply
  • Typically 7+ years of relevant experience in software engineering/tech lead roles and post-secondary degree in a related field, or equivalent experience

Benefits

  • Health insurance
  • Tuition reimbursement
  • Accident and life insurance
  • Retirement savings plans

Company Overview

  • We’re a bank, but there’s more to it than that. When you join BMO, it opens a world of opportunities. BMO was founded in 1817 and is headquartered in Toronto, Ontario, Canada, with more than 10,000 employees. Its website is http://www.bmo.com.

Company H1B Sponsorship

  • BMO has a track record of offering H1B sponsorships, with 7 in 2025, 2 in 2024, 6 in 2023, 4 in 2022, 2 in 2021, 2 in 2020. Please note that this does not guarantee sponsorship for this specific role.

Job Type

  • Job Type: Full Time
  • Location: United States
