Leads resiliency engineering and L4 development support for the digital banking channels (web and mobile). Owns assessment, governance, and implementation of resilience patterns - circuit breakers, timeouts/retries, bulkheads, graceful degradation, failover/DR - and code-level observability. Acts as the development arm for SRE/Service Delivery to deliver instrumentation, production hardening, and fixes for incidents requiring code changes. Defines the strategy for continuous enhancement, performance optimization, stability, and scalability across our modern Java services, web clients, and native/hybrid mobile apps.

Establishes and governs resiliency standards, design guardrails, readiness checks, and production change controls.
Implements fault-tolerance patterns and libraries in services and clients; enables kill-switches, rate limiting, and backpressure.
Delivers observability: distributed tracing, metrics, logs, health endpoints, synthetic probes, and error taxonomies.
Serves as L4 dev support for production incidents.
Defines and maintains runbooks, SLIs/SLOs for critical journeys, and DR playbooks; conducts regular failover exercises.
Partners with SRE/Service Delivery to translate operational needs into code-level instrumentation and monitoring enhancements.
Collaborates with API Governance and Platform Engineering on gateway policies, dependency hardening, and release safety (canary/blue-green).
Improves performance and stability via caching, connection pooling, dependency isolation, capacity planning, and traffic shaping.
Guides DR architecture (multi-AZ/region, active-active/passive) aligned to RTO/RPO and regulatory requirements.
Influences cross-functional delivery and provides technical mentorship without formal line management.

Fosters a culture aligned to BMO purpose, values and strategy and role models BMO values and behaviours in all that they do. Ensures alignment between values and behaviour that fosters diversity and inclusion.
Regularly connects work to BMO’s purpose, sets inspirational goals, defines clear expected outcomes, and ensures clear accountability for follow-through. Builds interdependent teams that collaborate across functional and operating groups to create the highest value for all stakeholders. Improves team performance, recognizes and rewards performance, coaches employees, supports their development, and manages poor performance. Operates at a group/enterprise-wide level and serves as a specialist resource to senior leaders and stakeholders. Applies expertise and thinks creatively to address unique or ambiguous situations and to find solutions to problems that can be complex and non-routine. Implements changes in response to shifting trends. Broader work or accountabilities may be assigned as needed.
Job Type
Full-time
Career Level
Mid Level