How to Build a Great SOC

Author: Grant Hughes, CISA, CISM, CDPSE, CASP, CCSK, CCSP, CEH, CIH, CISSP, SSCP
Date Published: 29 November 2023

A college professor tasked a group of students with investigating and improving the jury deliberation process.1 The students interviewed several stakeholders and discovered that the shape of the table in the jury room had an impact on the decision process. In courtrooms where there were rectangular tables, the juror sitting at the head of the table often dominated the conversation. As a result, some jurors did not share their views openly, and a verdict was quickly reached.

The students concluded that juries with round tables came to the most accurate and just verdicts as all jurors participated openly and shared their views. The students were excited about their findings. They believed that they had discovered a way to improve the justice system—and it was an easy fix. Upon receiving their feedback, the judge ordered that all jury room tables be changed to rectangular tables. The students were shocked. After all, round tables led to more robust decision-making, not rectangular tables.

When the judge said that he wanted to improve the jury deliberation process, he meant that he wanted to reduce the time it took a jury to reach a decision. The students had understood improving the process to mean ensuring a more robust and fair process. Thus, the objectives of the task were not clear to all stakeholders upfront.

In the same way, the purpose of a security operations center (SOC) may not be clear to everyone in an organization. It is imperative for the purpose, scope and objectives of an SOC to be clearly defined and communicated from the start. Otherwise, the value of the investment in the SOC can be difficult to demonstrate to business stakeholders and security leaders, and the SOC may not support the organization’s objectives.

Defining an SOC

An SOC is a critical function that enables an organization to detect and respond to cyberthreats in real time, reduce cyberrisk to the organization, and maintain compliance with regulations and standards by providing ongoing compliance monitoring and reporting. CompTIA defines an SOC as a team of experts who proactively monitor an organization’s environment to ensure that it operates securely.2 Microsoft has a similar definition, defining an SOC as “a centralized function or team responsible for improving an organization’s cybersecurity posture and preventing, detecting, and responding to threats.”3

In 2022, a study found that attackers begin scanning for vulnerable systems within 15 minutes of the disclosure of a new common vulnerabilities and exposures (CVE) entry.4 An SOC with the correct processes and technology in place is well-positioned to protect an organization in real time with detection and response capabilities.

The difference between a good SOC and a great SOC is the degree of excellence and effectiveness with which the core services of the SOC are delivered. For example, all SOCs conduct incident response activities, but the measure of how quickly and effectively an SOC responds to incidents is what differentiates a good or mediocre SOC from a great SOC. A good SOC is considered adequate—it meets the minimum requirements. A great SOC is considered exceptional and exceeds expectations.

SOC Services and Technologies

An SOC provides a range of services to protect an organization's information assets from cyberthreats. The specific services provided by an SOC vary depending on the organization's size, industry and security requirements, and the definition of services may differ from one SOC to another. SOC services may include security log management, security incident management and response, security monitoring, threat detection, threat hunting, vulnerability management, regulatory compliance monitoring, cyberrisk management and cyberincident reporting.

Although an SOC may employ several tools to deliver those services, security information and event management (SIEM), security orchestration, automation and response (SOAR) and threat intelligence remain the primary tools for any SOC.

When considering SOC technologies and services, an important decision is whether to adopt managed detection and response (MDR) or a traditional SIEM. MDR is the next step in the evolution of the managed SOC. MDR uses a diverse set of data inputs and detectors to identify suspicious activity and attempts to determine in real time whether it is an actionable alert or a false positive. Similar to a traditional SOC, it relies on technology and requires human analyst involvement. Whether an organization is using a traditional SIEM or an MDR platform, the technology must be correctly implemented, tuned and regularly tested so that it produces high-fidelity alerts on an ongoing basis.

Technologies such as SOAR are considered force multipliers when used in combination with SIEM. SOAR technologies are adopted to improve detection and response by adding context and enrichment. This improves downstream prioritization and efficiency in an SOC. SOAR is used primarily for incident response, and vendors are increasingly building SOAR capabilities into other security tools such as SIEM solutions.

Threat intelligence is the element that differentiates good SOCs from great SOCs. Threat intelligence provides context to otherwise meaningless data and is used to support effective logging and monitoring. The platform that is ultimately used for centralizing log data must incorporate a threat intelligence feed for enrichment and context to enhance detection capabilities.

Insourced vs. Outsourced SOCs

The decision to build an internal SOC vs. using managed SOC services must be made by evaluating each option in the context of the organization. An internal SOC often has contextual awareness and understanding of internal systems and processes, allowing it to respond to incidents more effectively. However, an internal SOC raises several considerations. For example, providing 24/7 coverage requires enough security analysts, SOC managers and threat hunters to cover three shifts. Depending on the core business of the organization, this might not be feasible. Managed SOC services provide economies of scale, specialized skills, cost-efficiency and 24/7 coverage. Managed SOC providers are often more experienced and better able to provide a mature service offering to organizations because SOCs are their core business.

Whichever option an organization selects, the requirements for and cost of sufficient security personnel, ongoing analyst training and 24/7 service availability must be taken into consideration.

Defining an SOC Strategy

Without a defined SOC strategy, security leaders may struggle to prioritize resources. A strategy provides direction based on various inputs, such as the threat landscape, regulatory requirements and threat assessments. In the context of an SOC, the primary objective of the SOC strategy should be to avoid a situation in which both the cost and effort are high and the value and return on investment (ROI) are low. The aim of the SOC strategy is to ensure that the SOC effectively fulfills its function and in doing so helps the organization to fulfill its overall business objectives. Caution must be exercised to avoid the deployment of duplicate technologies requiring SOC analysts to navigate multiple security tools during incident investigations. When reviewing an SOC, key indicators of a great SOC are well-defined response playbooks and standard operating procedures (SOPs) that are regularly tested and updated. In addition, an updated technology capability map should support the SOC services.

A well-architected SOC provides a positive ROI by minimizing potential financial losses due to cyberincidents. At the same time, an SOC enhances an organization's ability to detect and respond to cyberthreats in real time, safeguarding sensitive data and protecting the organization’s reputation. Therefore, compliance, ROI and risk reduction are interconnected.

The proposed strategy in figure 1 can assist security leaders in justifying the investment in an SOC, realizing value from the investment and reducing risk to the organization.

Figure 1

Scope and Objectives
An SOC strategy must be developed with a clear scope and objectives. Examples of SOC objectives may include ensuring that effective and rapid response capabilities are in place to detect and respond to cyberincidents; ensuring compliance with cybersecurity and information protection regulations and standards such as the Payment Card Industry Data Security Standard (PCI DSS) or the International Organization for Standardization (ISO) standard ISO 27001; and reducing cyberrisk to an acceptable level for the organization. The scope and objectives of the SOC vary from organization to organization and must be underpinned by legal and regulatory requirements and business requirements.

The SOC strategy must be developed with the consumers of the SOC services in mind. Typical SOC stakeholders may include the chief information security officer (CISO), IT operations teams, legal and compliance teams, enterprise risk management teams and executive management. If it is envisaged that C-level employees or external vendors will interact with the SOC, upfront engagement with those stakeholders is required. Organizations must ensure that all stakeholders understand the objectives and engagement model of the SOC. Steps should be taken to avoid situations wherein the SOC is tasked with investigating incidents outside of its scope and objectives without the necessary tools, supporting logs or skills.

The SOC capability must be supported with sufficient resources. This includes consideration for the people, processes, technology and information required to support the scope and objectives of the SOC.

Threat intelligence must be an integral part of the SOC strategy to combat advanced threats because it provides context and insights, which translates to faster threat detection and response. Automating the use of threat intelligence in an SOC provides a significant benefit because it enables security solutions to automatically prioritize events associated with actively exploited vulnerabilities that may impact the organization. Security leaders must ensure that diverse threat intelligence feeds are ingested into the various SOC technologies to ensure detection capabilities are continuously enhanced and maintain effectiveness against emerging cyberthreats.
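
As a rough illustration of automated prioritization, the following Python sketch moves alerts linked to actively exploited vulnerabilities to the front of the queue. It is a minimal sketch under stated assumptions, not a description of any particular SIEM or SOAR product: the Alert fields, the 1-to-5 severity scale, the prioritize function and the placeholder CVE identifiers are all hypothetical, and the set of exploited CVEs stands in for whatever threat intelligence feed an organization actually ingests.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Alert:
    alert_id: str
    cve_id: Optional[str]  # CVE linked to the alert, if any
    base_severity: int     # assumed scale: 1 (low) to 5 (critical)


def prioritize(alerts: Iterable[Alert], exploited_cves: Set[str]) -> List[Alert]:
    """Sort alerts so those tied to actively exploited CVEs come first.

    Within each group, higher base severity comes first.
    """
    def sort_key(alert: Alert):
        is_exploited = alert.cve_id in exploited_cves
        # False sorts before True, so negate the flag to put exploited alerts first.
        return (not is_exploited, -alert.base_severity)

    return sorted(alerts, key=sort_key)


# Placeholder identifiers for illustration only; a real feed would supply these.
exploited_feed = {"CVE-0000-0001"}
queue = [
    Alert("a1", None, 5),
    Alert("a2", "CVE-0000-0001", 2),
    Alert("a3", "CVE-0000-0002", 4),
]
for alert in prioritize(queue, exploited_feed):
    print(alert.alert_id, alert.cve_id, alert.base_severity)
```

In this sketch the low-severity alert a2 is handled first because its CVE appears in the feed, which is the kind of context that severity scores alone do not provide.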

SOC services must be measured with metrics that drive the desired behaviors. However, metrics can also influence behavior in unintended ways. For example, if the incident resolution time of an SOC is measured, SOC analysts may rush through investigations and skip certain steps to reach a resolution more quickly. Metrics should measure things that provide insight into the efficiency of the SOC or the risk exposure of the organization. Metrics may include mean time to detect, alert fidelity and mean time to respond.
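
The three example metrics can be computed from basic incident and alert records. The Python sketch below is illustrative only and rests on assumed definitions that the SOC would need to agree on: mean time to detect is taken as the average gap between occurrence and detection, mean time to respond as the average gap between detection and response, and alert fidelity as the share of alerts confirmed as true positives. The field names and the soc_metrics function are hypothetical.

```python
from datetime import datetime
from statistics import mean


def mean_minutes(pairs):
    """Average gap in minutes across (start, end) datetime pairs."""
    return mean((end - start).total_seconds() / 60 for start, end in pairs)


def soc_metrics(incidents, true_positive_alerts, total_alerts):
    """Compute MTTD, MTTR (both in minutes) and alert fidelity (0.0 to 1.0).

    Each incident is a dict with 'occurred', 'detected' and 'responded' datetimes.
    """
    mttd = mean_minutes((i["occurred"], i["detected"]) for i in incidents)
    mttr = mean_minutes((i["detected"], i["responded"]) for i in incidents)
    # Guard against division by zero when no alerts were raised in the period.
    fidelity = true_positive_alerts / total_alerts if total_alerts else 0.0
    return {"mttd_minutes": mttd, "mttr_minutes": mttr, "alert_fidelity": fidelity}


sample_incidents = [
    {
        "occurred": datetime(2023, 11, 1, 10, 0),
        "detected": datetime(2023, 11, 1, 10, 30),
        "responded": datetime(2023, 11, 1, 11, 30),
    },
    {
        "occurred": datetime(2023, 11, 2, 12, 0),
        "detected": datetime(2023, 11, 2, 12, 10),
        "responded": datetime(2023, 11, 2, 12, 40),
    },
]
print(soc_metrics(sample_incidents, true_positive_alerts=45, total_alerts=60))
```

Reporting these values together is one way to limit the unintended behavior described above, since a falling response time is less persuasive if alert fidelity falls with it.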

Another essential element of a great SOC is defining security use cases. A security use case is an attack scenario that a security control is intended to prevent or mitigate. Although it might be tempting to start with generic use cases such as phishing, malware and denial-of-service (DoS) attacks, this approach is unlikely to resonate with business leaders. When engaging business leaders, security use cases such as fraud, business email compromise (BEC) and downtime of critical assets should be the focus of the conversation. The use cases covered by the SOC must align with and support the organization’s business objectives and risk exposure.

Asset Management
The SOC strategy must be underpinned by a good asset management program. The first and second Center for Internet Security (CIS) controls emphasize the importance of asset management.5 These controls are inventory and control of enterprise assets and inventory and control of software assets, respectively. It is only when an organization knows what assets it has that it can protect and monitor them. The SOC must have access to updated asset inventory information. Establishing the best controls and monitoring practices is pointless if not all assets have been identified. For example, if a domain controller is not known to the SOC and its logs are not ingested into a SIEM tool, the SOC's ability to detect a security incident involving that domain controller is impaired.
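
One simple way to surface such blind spots is to compare the asset inventory with the list of sources currently sending logs to the SIEM. The Python sketch below assumes both lists can be exported as hostnames. The unmonitored_assets function and the sample hostnames are hypothetical, and real environments would likely need more robust matching (for example, on asset IDs or fully qualified domain names).

```python
def unmonitored_assets(asset_inventory, siem_log_sources):
    """Return inventory hostnames that have no matching SIEM log source.

    Hostnames are compared case-insensitively with surrounding whitespace removed.
    """
    def normalize(name):
        return name.strip().lower()

    monitored = {normalize(source) for source in siem_log_sources}
    return sorted(
        host for host in {normalize(asset) for asset in asset_inventory}
        if host not in monitored
    )


# Illustrative data: the domain controller DC01 is in the inventory
# but is not forwarding logs to the SIEM.
inventory = ["DC01", "web01", "db01"]
log_sources = ["web01", "DB01 "]
print(unmonitored_assets(inventory, log_sources))
```

The reverse comparison (log sources that do not appear in the inventory) can be just as useful, because it points to assets that the asset management program has missed.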

Continuous Improvement
The SOC strategy must continuously be improved. The threat landscape is evolving, and what is sufficient today might not be sufficient tomorrow. Testing the effectiveness of the SOC is an integral part of continuously improving the SOC. Red team exercises, penetration tests and breach and attack simulations are some of the ways in which the effectiveness of an SOC can be assessed. In addition, audits against industry standards can potentially improve the operational efficiency of SOC processes.

Business-Aligned Use Case Development Life Cycle

Figure 2

As noted, the creation of security use cases is an essential element of a great SOC.

The steps of the development life cycle for business-aligned use cases are shown in figure 2:

  1. Understand the business objectives, mission and vision—The SOC should have a clear understanding of the organization's revenue-generating activities and business processes.
  2. Review the threat landscape—Understanding the current threat landscape, including the types of threats specific to the organization’s industry vertical, ensures that use cases are relevant to actual threats in the wild. Threat intelligence can add value in this area.
  3. Define use cases—Based on business objectives and the threat landscape, the SOC can define relevant use cases. All use cases must be linked to the organization’s objectives and mission.
  4. Identify data sources—In this phase, log sources are identified to support the various use cases. For example, if the use case is phishing, the data sources that may be identified include email gateways, Exchange servers, email security solutions, firewalls and endpoints with email clients. Log collection and correlation technologies such as SIEM can ingest both on-premises and cloud sources. For cloud systems, there are typically cloud connectors, while on-premises systems rely on either an agent that must be deployed or native syslog forwarding to a centralized log collector.
  5. Identify and ingest event logs—Although it is ideal, it is rarely feasible to ingest all event logs from the identified data sources. As a result, a purpose-driven event logging approach must be adopted. With the help of the SIEM vendor and the original equipment manufacturer (OEM) of the log source, specific events required to support the different use cases must be identified at a granular level. The output of this activity is a detailed mapping of specific event types that must be enabled on the log source and onboarded into the SIEM tool. There are three types of logging gaps that must be avoided: insufficient logging wherein required events are not logged at all, insufficient verbosity wherein the correct level of detail is not logged, and insufficient retention wherein event logs are not retained long enough. Given the cost of storing and retaining logs, organizations must strive to log the correct events and keep logs only for the duration required to support legal and regulatory compliance and business requirements.
  6. Develop the use case logic and alerts—Defining logic involves defining the rules that trigger an alert when a specific event occurs. An example of use case logic may be to generate an alert when there are more than 10 failed login attempts from a single IP address within five minutes. Another example is an alert generated when the same user logs in from two different geographical areas at the same time or within a short period of time.
  7. Continuous testing and improvement—After use cases have been implemented, they must be tested regularly to ensure that they remain functional. The SOC team should simulate use cases to test the technology and response processes as per a defined test plan. Use cases and alerts should be refined based on the results of the testing.
  8. Reports and dashboards—When defining reports and dashboards, the intended audience must be identified. It is likely that information must be presented at a strategic, operational, analytical and tactical level to address all stakeholders’ requirements. As defined in the SOC strategy, metrics must be meaningful and aligned with business objectives. Information must be presented at an appropriate level for the intended audience.
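
The failed-login example in step 6 can be sketched in Python as a sliding-window rule. This is a minimal illustration under stated assumptions rather than the rule syntax of any SIEM product: events are assumed to arrive as (timestamp, source IP) tuples sorted by time, the function name detect_failed_login_bursts is hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

THRESHOLD = 10                 # alert on MORE than 10 failures...
WINDOW = timedelta(minutes=5)  # ...from one IP within five minutes


def detect_failed_login_bursts(events, threshold=THRESHOLD, window=WINDOW):
    """Return (source_ip, timestamp) for each burst of failed logins.

    `events` is an iterable of (timestamp, source_ip) tuples for failed
    logins, assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # source_ip -> timestamps inside the window
    alerts = []
    for timestamp, source_ip in events:
        failures = recent[source_ip]
        failures.append(timestamp)
        # Drop failures older than the window that ends at this event.
        while timestamp - failures[0] > window:
            failures.popleft()
        if len(failures) > threshold:
            alerts.append((source_ip, timestamp))
            failures.clear()  # reset so one burst raises one alert
    return alerts


start = datetime(2023, 11, 29, 9, 0, 0)
# Eleven failures in under two minutes from one address.
burst = [(start + timedelta(seconds=10 * i), "192.0.2.1") for i in range(11)]
print(detect_failed_login_bursts(burst))
```

Clearing the window after an alert is a design choice that keeps a single burst from generating a stream of duplicate alerts. Testing the rule with simulated events like these also fits step 7, where use cases are exercised against a defined test plan.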

Conclusion

The SOC is an increasingly important function for organizations to defend against cybercrime. Although an SOC can vary in terms of scope, objectives and the services it provides, security monitoring, threat detection and incident response remain the core service offering for any SOC.

Given the vast number of stakeholders involved and the variety of SOC services potentially offered, it is important to define a strategy up front and follow a structured approach and methodology when building an SOC.

The core requirements for building a great SOC are to define the scope and objectives of the SOC, define and develop the SOC service offering in consultation with the consumers of the services, and ensure that sufficient resources are in place to support the mandate of the SOC. Threat intelligence and business-aligned metrics are instrumental in the success of the SOC because they drive its continuous optimization. Finally, the SOC strategy must be underpinned by a solid asset management program.

Endnotes

1 Arquimbau, J.; “Rectangle or Oval?” InteraWorks, http://www.interaworks.com/rectangle-or-oval/
2 CompTIA, “What Is a Security Operations Center?” http://www.comptia.org/content/articles/what-is-a-security-operations-center
3 Microsoft, “What Is a Security Operations Center (SOC)?” http://www.microsoft.com/en-us/security/business/security-101/what-is-a-security-operations-center-soc
4 Toulas, B.; “Hackers Scan for Vulnerabilities Within 15 Minutes of Disclosure,” Bleeping Computer, 26 July 2022, http://www.bleepingcomputer.com/news/security/hackers-scan-for-vulnerabilities-within-15-minutes-of-disclosure/
5 Center for Internet Security, “CIS Critical Security Controls Version 8,” USA, 2021, http://www.cisecurity.org/controls/v8

GRANT HUGHES | CISA, CISM, CDPSE, CASP, CCSK, CCSP, CEH, CIH, CISSP, SSCP

Is a principal security architect at Engen Petroleum Ltd. He is a strategic thinker, thought leader and public speaker with a background in security strategy, architecture, cybersecurity risk and security operations. Hughes has delivered multiple keynotes and has more than 13 years of experience in IT, seven of which have been spent in information and cybersecurity. He is a trusted advisor certified in many disciplines within the information security domain. He can be contacted on LinkedIn.
