
SharePoint operational governance

Here are some ideas for making SharePoint operational maintenance a reality.

Quick reminder of the difference between monitoring and reporting

  • Monitoring provides information about a specific component in near real time. Use performance counters with thresholds to prevent failures.
  • Reporting provides information about the entire platform rather than individual elements. This information is usually gathered over a period of time; web statistics reports are one example.
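
To make the distinction concrete, here is a minimal Python sketch. It is only an illustration: the `Sample` type, the 20% threshold and the server names are assumptions, not values from a real farm. `monitor` reacts to a single reading as it arrives, while `report` summarises many readings collected over a period.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical threshold: alert when free disk space drops below 20%.
WARNING_THRESHOLD_PCT = 20.0


@dataclass(frozen=True)
class Sample:
    """One reading of a single counter on a single server (illustrative)."""
    server: str
    free_disk_pct: float


def monitor(sample, threshold=WARNING_THRESHOLD_PCT):
    """Monitoring: check one component reading against a threshold."""
    if sample.free_disk_pct < threshold:
        return f"ALERT {sample.server}: {sample.free_disk_pct:.0f}% free disk"
    return None


def report(samples):
    """Reporting: summarise readings gathered across the whole platform."""
    values = [s.free_disk_pct for s in samples]
    return {
        "servers": len({s.server for s in samples}),
        "min": min(values),
        "avg": mean(values),
    }
```

In practice the readings would come from performance counters or a monitoring tool; the sketch only shows where the threshold logic and the aggregation sit.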

Start by defining maintenance tasks

For each task, agree on a frequency and estimated duration.
Example of maintenance tasks in a SharePoint environment:

  • Basic check of the SharePoint functionality (home page)
  • Monitor Windows Event logs
  • Monitor IIS logs
  • Monitor ULS logs
  • Monitor SQL logs
  • Monitor indexing logs
  • Monitor search logs
  • Check the physical environment where servers are located (access to the server room, temperature and humidity, network hardware)
  • Check that backups have been successful
  • Monitor free disk space on servers
  • Monitor system resources (CPU, RAM, etc.)
  • Monitor network state
  • Review SLAs from the previous week
  • Check SQL Server maintenance plans
  • Manage second level recycle bin
  • Check index fragmentation and run DBCC CHECKDB
  • Check updates applying to SharePoint and Windows
  • Run the SharePoint Health Analyzer and review its reports
  • Check usage reports
  • Check Web Analytics reports
  • Capacity planning: check whether the platform can take the load and forecast infrastructure updates
  • Optimisation of the Search service application
  • Management of the security policies at the Web Application level
  • Test restore procedures
  • Update disaster recovery procedures (contact details of employees, contractors and external parties, program versions in use, service packs, hotfixes, communication plan, etc.)
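
Once frequency and duration are agreed, the task list can be kept as simple structured data, which makes the weekly workload easy to estimate. The sketch below is one possible way to do this in Python. The frequencies, durations and the five-day working week are assumptions chosen for the example, not recommendations.

```python
from dataclasses import dataclass

# Assumed number of runs per week for each frequency label.
# "daily" assumes a five-day working week; "monthly" is approximated as 12/52.
RUNS_PER_WEEK = {"daily": 5, "weekly": 1, "monthly": 12 / 52}


@dataclass(frozen=True)
class Task:
    name: str
    frequency: str  # one of the keys of RUNS_PER_WEEK
    minutes: int    # estimated duration of one run


def weekly_minutes(tasks):
    """Return the estimated total effort per week, in minutes."""
    return sum(RUNS_PER_WEEK[t.frequency] * t.minutes for t in tasks)


# Illustrative entries only: the values are hypothetical.
EXAMPLE_TASKS = [
    Task("Basic check of the SharePoint functionality", "daily", 10),
    Task("Check that backups have been successful", "daily", 15),
    Task("Review SLAs from the previous week", "weekly", 30),
    Task("Test restore procedures", "monthly", 240),
]

if __name__ == "__main__":
    print(f"Estimated effort: {weekly_minutes(EXAMPLE_TASKS) / 60:.1f} hours per week")
```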

Operating Plan

An operational job plan can be defined as all automatic or manual jobs running during a given period, for example a week.

Once all jobs have been defined, it is a good idea to plan them so that they don’t impact each other.
This type of production planning can just be a spreadsheet to give an overview of what’s going on in the farm.
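
If the plan outgrows the spreadsheet, a small script can flag jobs whose time windows collide. The following Python sketch assumes each job is described by a weekday, a start time and an estimated duration. The `Job` type and the sample jobs are hypothetical, and windows that wrap from Sunday night into Monday are deliberately not handled.

```python
from dataclasses import dataclass
from itertools import combinations

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class Job:
    name: str
    day: int      # 0 = Monday ... 6 = Sunday
    start: str    # "HH:MM", 24-hour clock
    minutes: int  # estimated duration


def _window(job):
    """Return (begin, end) in minutes since Monday 00:00."""
    hours, mins = map(int, job.start.split(":"))
    begin = job.day * MINUTES_PER_DAY + hours * 60 + mins
    return begin, begin + job.minutes


def find_conflicts(jobs):
    """Return the pairs of job names whose time windows overlap."""
    conflicts = []
    for a, b in combinations(jobs, 2):
        a_begin, a_end = _window(a)
        b_begin, b_end = _window(b)
        # Two half-open intervals overlap when each starts before the other ends.
        if a_begin < b_end and b_begin < a_end:
            conflicts.append((a.name, b.name))
    return conflicts
```

Not every overlap is a problem (an antivirus signature update may happily run during a crawl), so the output is best treated as a list of pairs to review rather than as errors.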

Examples of operational jobs:

  • Incremental/continuous crawl of the search index
  • Antivirus scans and signature updates
  • Bare metal/File system backups
  • SQL Server backup (full, transaction log or differential)
  • PowerShell backup
  • Logs backup
  • Daily application pool recycling on SharePoint servers
  • Warm-up script: PowerShell script that requests a defined or calculated list of high-level site pages so that the SharePoint pages are compiled and cached in IIS (.NET-based technology).
  • Definition of frequency of server reboot
  • SharePoint WFE server reboot
  • SharePoint application server reboot
  • SQL Server reboot
  • Full crawl of the search index
  • Monitoring going off-line during certain periods to avoid unnecessary alerts.
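
The warm-up job mentioned above is usually written in PowerShell. Purely to illustrate the idea, here is an equivalent sketch in Python: it requests each page once and records the outcome. The function names are made up for this example, and the sketch assumes the pages can be reached without Windows authentication, which is rarely true of a real farm.

```python
import urllib.request


def fetch_status(url, timeout=30):
    """Request a page and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def warm_up(urls, fetch=fetch_status):
    """Request every URL once so the pages are compiled and cached.

    Returns a dict mapping each URL to its status code, or to a
    "failed: ..." message when the request raised an error.
    """
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError as exc:  # URLError and socket timeouts derive from OSError
            results[url] = f"failed: {exc}"
    return results
```

The `fetch` parameter exists only so that the loop can be exercised without a live farm. A natural place for such a job in the plan is right after the daily application pool recycling.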

Establish the different roles involved

The IT system team consists of a network of support professionals.

You should be able to define the escalation path for end-user support and incident resolution, as well as the list of functions for each IT team involved with the SharePoint farms.

Below is an example list of functions:

  • End-user support
  • Service incident resolution
  • SharePoint farm operations
  • Third-party application maintenance (TMA, from the French tierce maintenance applicative) of custom developments, including minor enhancements
  • Major enhancements (custom development and infrastructure)
  • Change management
  • User training

Use a RACI chart to define the responsibilities and scope of each team

A RACI chart identifies who is Responsible, Accountable, Consulted and Informed:

  • Responsible: those who do the work to achieve the task. Several resources can be responsible.
  • Accountable: the resource ultimately accountable for the completion of the task. There must be exactly one A specified for each task.
  • Consulted: those whose opinions are sought (two-way communication).
  • Informed: those who are kept up to date on progress (one-way communication).
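
The rule of exactly one Accountable per task is easy to check automatically once the chart is stored as data. Below is a minimal Python sketch. The chart layout (task, then team, then letters) and the team names in the usage example are assumptions, and the additional check for at least one Responsible is common practice rather than something stated above.

```python
def validate_raci(chart):
    """Check a RACI chart and return a list of problems found.

    `chart` maps each task to a dict of {team: letters}, where letters
    is a string made of R, A, C and I (for example "R" or "AR").
    """
    problems = []
    for task, assignments in chart.items():
        accountable = [team for team, letters in assignments.items() if "A" in letters]
        responsible = [team for team, letters in assignments.items() if "R" in letters]
        if len(accountable) != 1:
            problems.append(f"{task}: expected exactly one A, found {len(accountable)}")
        if not responsible:  # assumed rule: every task needs someone doing the work
            problems.append(f"{task}: no R assigned")
    return problems
```

For example, with hypothetical teams:

```python
chart = {
    "Test restore procedures": {"Farm operations": "R", "Service owner": "A", "End-user support": "I"},
    "Apply updates": {"Farm operations": "R", "Change management": "C"},
}
for problem in validate_raci(chart):
    print(problem)
```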

Your RACI chart can either be based on:

  • Your own set of roles
  • ITIL only
  • Or a mix of both!

It usually covers go-live activities, release management and run/support tasks.

Examples of standard ITIL roles:

Service Strategy

  • Business Relationship Manager
  • Demand Manager
  • Financial Manager
  • IT Steering Group (ISG)
  • Service Portfolio Manager
  • Service Strategy Manager

Service Design

  • Applications Analyst(s)
  • Availability Manager
  • Capacity Manager
  • Compliance Manager
  • Enterprise Architect
  • Information Security Manager
  • IT Service Continuity Manager
  • Risk Manager
  • Service Catalogue Manager
  • Service Design Manager
  • Service Level Manager
  • Service Owner
  • Supplier Manager
  • Technical Analyst(s)

Service Transition

  • Application Developer(s)
  • Change Advisory Board (CAB)
  • Emergency Change Advisory Board (ECAB)
  • Change Manager
  • Configuration Manager
  • Knowledge Manager
  • Project Manager
  • Release Manager
  • Test Manager

Service Operation

  • 1st Level Support (operators and helpdesk)
  • 2nd Level Support (= level 3 experts)
  • 3rd Level Support (= suppliers)
  • Access Manager
  • Facilities Manager
  • Incident Manager
  • IT Operations Manager
  • IT Operator(s)
  • Major Incident Team
  • Problem Manager
  • Service Request Fulfilment Group

Published by

Jean-François Pironneau

Having worked with successive versions of SharePoint since 2003, I have specialised over the years in infrastructure and technical matters, and more recently in Microsoft 365 governance.