Production System Reliability Engineer
Montreal, QC

Title: Production System Reliability Engineer

Responsibilities - Internal or External:

• Managing critical incidents and ensuring all key management and business stakeholders are kept up to date

• Ensure Production Management is closely aligned/embedded in the Agile software development process and our code meets production standards

• Developing automated solutions to long-standing problems to minimize downtime and manual effort

• Configuring application monitors using industry-standard monitoring tools, as well as developing customized monitoring solutions

• Build the extensive business and application knowledge required to support client-facing applications

• Interface with clients and other technology teams to provide governance and control around the production environment


Qualifications – Internal or External:

You should apply to this requisition if you have, at minimum, the following profile:

• 3 years of application development (Python, HTML, JavaScript) or relevant production support experience

• Ability to manage an incident call and coordinate multiple teams towards a common goal of resolving the outage



While this is not a requirement, we are very interested in people who have exposure to the following technologies or subjects:


• Enthusiasm for modern development tools and practices, including test-driven development, Agile, and continuous integration

• Experience managing, deploying, and troubleshooting large-scale production environments

• Knowledge of DevOps testing and code quality tools

• Strong infrastructure knowledge in Linux / Unix, Windows, Databases (Sybase, DB2 & NoSQL), Storage, Networking and Web Technologies

• Cloud administrator / DevOps knowledge (Azure preferred)

• Advanced Linux admin level knowledge

• Advanced Unix Shell / Perl scripting experience

• Advanced SQL query language knowledge
