Staff Infrastructure Engineer (R2975)

Shield AI
Full-time
On-site
Dallas Metro Area, Texas, United States
Aviation IT
Founded in 2015, Shield AI is a venture-backed defense technology company with the mission of protecting service members and civilians with intelligent, autonomous systems. Its products include Hivemind Enterprise (EdgeOS, Pilot, Commander, and Forge), as well as V-BAT and Sentient Vision Systems (wide-area motion imaging software). With offices in San Diego, Dallas, Washington, D.C., Abu Dhabi (UAE), Kyiv (Ukraine), and Melbourne (Australia), Shield AI’s technology actively supports U.S. and allied operations worldwide. For more information, visit www.shield.ai. Follow Shield AI on LinkedIn, X, and Instagram.

Shield AI is seeking a Staff Infrastructure Engineer to help maintain its leadership in applied artificial intelligence development. This role is responsible for engineering, deploying, provisioning, and maintaining critical systems to drive innovation across Shield AI’s primary and remote office environments in the United States and internationally. The Staff Infrastructure Engineer will also manage workloads hosted in both private and public cloud environments, ensuring optimal performance and scalability. This position may require occasional travel to Shield AI locations.

WHAT YOU'LL DO:

    • Systems Engineering:
    • Serve as a senior resource on the IT Cloud & Infrastructure team to develop solutions for complex requirements and issue resolution.
    • Oversee the day-to-day management and optimization of on-premises and cloud-based infrastructure (e.g., AWS, Azure).
    • Configure and assist in the advanced design and administration of Active Directory environments and modern workplace solutions via Microsoft 365.
    • Perform capacity planning for virtual machine environments and their systems, and manage new solutions that depend on that footprint.
    • Monitor infrastructure security and performance, recommending enhancements where necessary.
    • Implement and operate server, storage, and backup infrastructure systems and related disaster recovery processes.
    • Remediate and resolve vulnerabilities across various infrastructure systems.
    • Author the documentation for engineered and maintained systems, along with associated processes, so that supporting teams can leverage it.
    • Demonstrate a willingness to learn and work with unfamiliar network and machine operating systems, and a desire to automate repeatable tasks.
    • Manage and facilitate the progress of assigned technical IT-related projects and tasks.
    • Lead and participate in Agile practices and apply sound engineering principles.
    • Research, recommend, and develop innovative solutions.
    • Operations and Support:
    • Perform daily system monitoring, verifying the integrity and availability of all hardware, server resources, systems and key processes, reviewing system and application logs, and verifying completion of scheduled jobs such as backups.
    • Provide escalated support for operational issues, potentially during and after normal business hours, for systems, Kubernetes AI infrastructure, and the datacenters that host them.
    • Analyze, troubleshoot, and resolve system hardware, software, and networking issues.
    • Be available to participate in on-call, emergency, or maintenance roles.
    • Maintenance:
    • Assist with regularly applying OS patches and upgrades, and with upgrading administrative tools and utilities.
    • Upgrade and configure system software that supports the company’s infrastructure and its application services as operational needs require.
    • Perform system maintenance on network devices, storage arrays, and servers during approved maintenance windows.
    • Assist in performing ongoing performance tuning, hardware upgrades, and systems optimization as required.
    • Participate in and assist with disaster recovery planning, testing, and execution to meet compliance requirements.
    • Maintain and administer systems and software licensing as needed.
    • Maintain and draft operational, configuration, and other procedures and documentation.

REQUIRED QUALIFICATIONS:

    • Bachelor’s degree in a technical discipline, or at least 8 years of experience plus an engineer-level certification, such as a VCP, Azure/AWS Solutions Architect, or a comparable certification.
    • 8 years of experience supporting applications and systems in a production environment, preferably for a software and/or manufacturing development company.
    • Advanced, in-depth experience with and understanding of Microsoft Windows Server administration, Azure, and Active Directory environments.
    • Advanced, in-depth experience with and a fundamental understanding of at least one virtualization platform (e.g., VMware, Hyper-V, KVM).
    • Advanced, in-depth experience with and a fundamental understanding of at least one storage operating system (e.g., NetApp, Dell EMC, Pure Storage).
    • Proven engineering experience deploying and maintaining workloads in public cloud (e.g., Azure, AWS).
    • Experience with deployment and systems administration of at least one Linux distribution (e.g., RHEL, Ubuntu).
    • Ability to improve operational efficiency by implementing Infrastructure as Code (IaC) solutions (e.g., Terraform, Ansible).
    • Ability to automate repetitive tasks using scripting languages such as PowerShell, Python, or Bash.
    • Fundamental understanding of the OSI model and network communication layers.
    • Ability to work independently to accomplish assigned tasks.
    • Proven organizational and multi-tasking skills.
    • Effective verbal and written communication skills. Project management skills; process-oriented with attention to detail.
    • Comfortable performing in a process-oriented and change-controlled working atmosphere.
    • Solution-oriented, constructive approach to problem-solving.
    • As a senior resource, the engineer will be expected to lead and mentor teammates and to present topics to non-technical audiences.
    • Based in San Diego, CA; Dallas, TX; or, alternatively, Washington, D.C.
$128,000 - $192,000 a year

Full-time regular employee offer package:
Pay within range listed + Bonus + Benefits + Equity

Temporary employee offer package:
Pay within range listed above + temporary benefits package (applicable after 60 days of employment)

Salary compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location. All offers are contingent on a cleared background and possible reference check. Military fellows and part-time employees are not eligible for benefits. Please speak to your talent acquisition representative for more information.

###

Shield AI is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender identity or Veteran status. If you have a disability or special need that requires accommodation, please let us know.