Server Specialist III

Position Overview: Server Specialist III

We are seeking a detail-oriented and dedicated Server Specialist to join our Network Operations Center (NOC) within the IT Server Operations team. This role is ideal for individuals who excel in structured, fast-paced environments and have hands-on experience in NOC workflows, server diagnostics, and infrastructure support. The position involves working fixed shifts, participating in an on-call rotation, and playing a critical role in system monitoring, alert management, and incident response. Success in this role hinges on maintaining system stability, communicating clearly across teams, and driving operational efficiency through timely reporting and stakeholder collaboration.


Core Responsibilities

Monitoring, Alerting & Incident Response

  • Monitor infrastructure health using tools such as SolarWinds Orion, Dynatrace, or equivalent platforms.
  • Respond promptly and accurately to alerts, ensuring timely escalation and resolution within SLA parameters.
  • Document incidents thoroughly to support Root Cause Analysis, post-mortems, and knowledge sharing.
  • Participate in a structured on-call rotation for after-hours support.
  • Execute maintenance window tasks, including application checkouts, maintenance mode validation, and alert suppression per schedule.

Server Operations & Troubleshooting

  • Conduct hands-on diagnostics and remediation for Windows and Linux servers in both physical and virtual environments.
  • Maintain up-to-date documentation of assets, configurations, and operational standards.
  • Troubleshoot technical issues, manage support tickets, and coordinate with vendor teams for onsite assistance.

Reporting, Communication & Stakeholder Engagement

  • Deliver concise updates during shift handoffs and operational briefings to ensure transparency and continuity.
  • Collaborate with cross-functional teams to align on incident priorities, escalation protocols, and service impact.
  • Work with stakeholders to define key performance indicators and tailor reporting and alerting solutions to specific application and infrastructure needs.
  • Track and report operational metrics, highlighting areas for improvement and potential risks.

Security & Compliance

  • Apply server security best practices and respond to vulnerability alerts promptly.
  • Ensure all operational activities adhere to internal policies and external regulatory requirements.

Operational Excellence & Reliability

  • Identify recurring issues and contribute to preventive strategies that enhance system reliability and reduce alert noise.
  • Maintain and improve runbooks and escalation workflows to support consistent execution.
  • Demonstrate high standards of punctuality, ownership, and accountability during assigned shifts.


Required Qualifications

  • Minimum 3 years of experience in NOC or server operations roles.
  • Proficiency in Windows Server and Linux environments.
  • Hands-on experience with infrastructure monitoring and alerting tools.
  • Familiarity with data center operations and hardware support.
  • Solid understanding of networking fundamentals (TCP/IP, DNS, DHCP).
  • Strong troubleshooting, documentation, and communication skills.
  • Willingness to work fixed shifts and participate in an on-call rotation.

Preferred Qualifications

  • Experience with SolarWinds Orion, Dynatrace, or similar observability platforms.
  • Exposure to virtualization technologies such as VMware or Hyper-V.
  • Familiarity with ITSM practices and ticketing systems (e.g., ServiceNow, Remedy).
  • Relevant certifications (e.g., Microsoft, CompTIA Server+, Red Hat).



Top Daily Tasks

  • Orion Alert Management: Tune and manage alerts in SolarWinds Orion to ensure clarity and suppress noise during maintenance windows.
  • Email & Notification Triage: Prioritize incoming alerts and system notifications, escalate critical issues, and maintain NOC-wide awareness.
  • System Failovers & Failbacks: Execute and validate failover/failback procedures, ensuring service continuity and proper documentation.
  • NOC Phone Support: Provide responsive support for infrastructure incidents, service requests, and operational escalations.
  • DNS Entry Management: Create and update DNS records to reflect infrastructure changes, failovers, and application status, and configure forwarding zones in Infoblox.

