How to build up a Security Operations Center (SOC) in reality

Born of the many questions I get from customers, curious acquaintances and friends alike, this piece should help them (better) understand what my usual tasks in this field are.

But let’s start from the basics!

By SOC we mean the set of integrated technological solutions, highly specialized staff and “rules” whose task is to identify and analyze an ongoing or foreseeable cyber-attack and to give indications on how to remedy it.

This environment is expected to deliver the maximum level of cyber-defense, as the proactive and reactive organization to be put in place for a company or any type of entity potentially exposed to cyber-attack, regardless of its relative or absolute exposure to the Big Internet.

In my creed, a SOC:

  • should be UNIQUE – like a perfect tailor-made dress (or, better, a bulletproof vest 😉)
  • should be developed AROUND the needs, the criticality and the organization (current and future) of the company or entity to defend
  • should be DYNAMIC to be effective – never static, it has to be in constant evolution and adaptation, following any change in the company or entity

 

The consequences are clear:

No “automatic”, auto-updating, self-whatever product is enough to run an efficient SOC. No single box, solution or vendor is enough to implement one. Based on my experience, that’s the reason why I always – without exception – start with a purely consultative phase of Context Analysis. In this phase it’s mandatory to analyze the entire organization from a security perspective. This phase is very similar to an ISO/IEC 27001 internal audit, and a similar methodology is applied. For this reason, it has to be conducted with the full collaboration of the SOC’s end users.

Of course, during this phase the auditor’s attention is fully focused on the ultimate scope of the SOC implementation, and further questions can easily arise.

When this phase is correctly conceived and performed in full collaboration with the customer’s organization, an exhaustive list of the activities needed to start up the SOC becomes clear to me and to all of the SOC’s end users, or beneficiaries if you wish. In most cases, if not all, the customer’s entire organization benefits from a SOC, as all departments are, to a more or less significant extent, digitized. To reinforce this awareness, I usually prepare one or more executive summaries of the audit outcome to share with the relevant stakeholders, to be sure they are aware of, and agree on, every single point.

It is very important that stakeholders at all levels are addressed and aligned. For this reason, the report should be tailored to the competences, roles and capabilities of its target audience: it is boring for the CEO to go through a list of poor firewall configurations, yet enlightening for him to understand the level of economic risk and impact his business potentially faces.

With this groundwork in place, the next step will be much easier for everyone involved.

In the next phase I like to conceive the SOC Design. The outcome of the previous activity will drive (and often justify) the decisions to be made throughout the deployment of the SOC itself.

In the design phase we make both technical and technological choices, alongside those with economic, logistical and human impact on the final SOC user. My experience has led me to design by paying adequate attention to organizational constraints, needs and priorities. It is critical not to miss this angle, as exactly these elements become roadblocks at the functionality and budgeting level.

SOC design is very interactive, and it is based both on the overall project and on the results of the Context Analysis (I’ll never stress enough how critical that is!). Every decision finalized in the Design stage needs the full consensus of all stakeholders. We address each of them in their individual “language”, with customized communication, and make sure the message and information come across clearly and comprehensibly.

People often have different perceptions of the SOC. Most of those influenced by the filmography on the subject see it as “the room with plenty of screens on the wall and an equal number of staff monitoring them, who at the pop-up of a red light would start screaming”: this is what I call the SOC Room. People with an IT background instead visualize the SOC as the sum of the single elements of the SOC functionality in a set of standard telco racks: this is the SOC DataCenter Room.

We are running two distinct but interconnected projects here. Both rooms, in fact, must be properly designed and implemented to realize a first-class SOC able to grow as needed. The prerequisites for both rooms will be a mix of the deploying organization’s experience and the customer’s needs identified in the fundamental Context Analysis.

Human capital is yet another aspect often underestimated in a SOC project.

A SOC is like a very complex machine, kept in perfect shape by those operating it. SOC operators are rare resources and hence a significant investment in terms of competence build-up. This competence is, in fact, created along with the customization of the upcoming SOC. Retention, because of this rarity and the increasing demand from the market, is a major factor to plan for in the long-term efficiency of the SOC. It is essential that all parties properly plan and budget for the activities dedicated to the human factor.

A preliminary step is to formalize the profiles of the people needed and start the selection of internal candidates potentially or actually available in the organization, in order to plan and execute the training plan and/or proceed with an external recruiting round. Remember: a SOC with incompetent and inadequate players is “less-useful”!

At last, the much-awaited Implementation Phase. Assuming we executed both previous phases diligently and correctly, this last one is expected to be free of surprises, Murphy’s law permitting 😉!

Those responsible for the implementation, in which I oftentimes remain fully involved, coordinate and supervise all the specialists putting the SOC in place. Requirements for networking configurations, IT set-ups, storage performance and configuration, and platform integration are ready to be handed over to the designated SOC specialists and verified at every step of the handover, because that’s how I operate.

The most common challenges in this phase relate to the specialists’ capabilities and their clear understanding of the scope of every single leg of the project, as well as of the final scope. That’s why not only is a professional project manager required, but the original solution architect must also always be accessible, possibly with an active role in continuously overseeing the correctness of the implementation in all phases.

The last device in the rack, the last integration of the last platform in the SOC DataCenter Room or the final set-up of the SOC Room does not represent the end of the project: at this point the implementation is in its final stage, but not at the end yet.

When all SOC systems are installed and integrated and the SOC infrastructure is up and running, it’s time to start onboarding all of the customer’s devices and systems. Don’t underestimate the effort required in this leg, because it is not a minor IT problem.

At this point of the implementation, all systems need to be configured and able to send events (through firewalls and routing) to the heart of the SOC. Most of the time this impacts the operations of the existing IT organization of the entity involved, and it is the first time the two groups (IT and SOC) come into direct contact, delivering activities together. I have experienced several different environments where segregation, rules, fear and an unclear history of network evolution create a void in the communication between servers or other devices and the rest of the network, including, and especially, the newly created SOC.

In these situations, the activities are heavily time-consuming, and it gets nearly impossible to satisfy all players. This led me to develop and patent a solution to simplify this step. I’ll talk about it in more detail in a separate post – stay tuned!

My suggestion is to include a significant buffer in the delivery timeline for this part and to start communicating its relevance as soon as the Context Analysis is closed.
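To make the onboarding requirement concrete (every system must be able to send events through firewalls and routing to the heart of the SOC), here is a minimal sketch of a reachability check. It is only an illustration, not the patented solution mentioned above: the function name is hypothetical, the collector host and port are placeholders you would supply, and it assumes the collector accepts TCP connections. To be meaningful it has to run on the source system itself, since that is the network path being tested.

```python
import socket


def can_reach_collector(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    A False result only says the path is not usable right now (blocked by a
    firewall, missing route, DNS failure, collector down); it does not say why.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return False
```

Run from each device class before the onboarding window (for example with the collector address agreed in the Design phase), a check like this turns "it should work" into a list of paths that IT and SOC can fix together.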

Once onboarding has reached a good 50% to 80%, the SOC machine will start producing tens of thousands of possibly malicious events per day, which leads to Optimization as part of Implementation.
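The 50% to 80% figure is easy to track if you keep an asset inventory from the Context Analysis. The sketch below assumes two sets of asset names, one for the inventory and one for the systems already seen sending events; the function names and the asset names in the usage note are made up for illustration.

```python
def onboarding_coverage(inventory: set, reporting: set) -> float:
    """Percentage of inventoried assets that are already sending events."""
    if not inventory:
        return 0.0
    # Only count reporting assets that are actually part of the inventory.
    return 100.0 * len(inventory & reporting) / len(inventory)


def not_yet_onboarded(inventory: set, reporting: set) -> list:
    """Inventoried assets from which no event has been seen yet, sorted by name."""
    return sorted(inventory - reporting)
```

For instance, with an inventory of `fw-01`, `srv-01`, `srv-02` and `sw-01`, and events seen only from `fw-01` and `srv-01`, the coverage is 50% and the two remaining names form the to-do list for the next onboarding round.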

The most successful optimization occurs when most of the workforce identified to work in the SOC is already engaged and available to participate in this final integration phase: integration with humans!

The lifecycle of a SOC, in fact, includes a variable period for tuning the systems (all of them!), based on the knowledge acquired about one’s own infrastructure, until the number of possibly malicious events discovered per day becomes reasonable. This phase can easily last 1 year, and in some big and complex environments it can reach 2 years. During this period the SOC ideally handles the possible events by sampling them from a huge base, always looking for lessons that drive further optimization and reduce the number of events in the future. This is a very sub-optimal utilization of the SOC, but it is inevitable on the way to optimum performance.
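As a rough illustration of this tuning loop, the sketch below ranks the detection rules that generate the most events and draws a random sample for analysts to review. It assumes each event is a dictionary with a `rule` key; that structure and the function names are assumptions made for the example, not the format of any specific SIEM.

```python
import random
from collections import Counter


def noisiest_rules(events: list, top_n: int = 3) -> list:
    """Return the `top_n` rules producing the most events, as (rule, count) pairs."""
    counts = Counter(event["rule"] for event in events)
    return counts.most_common(top_n)


def sample_for_review(events: list, k: int, seed=None) -> list:
    """Pick up to `k` events at random so analysts review a manageable subset."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(events, min(k, len(events)))
```

The noisiest rules are the natural first candidates for tuning, while the random sample keeps analysts in touch with the full event base rather than only with its loudest part.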

This final phase offers a good opportunity to start producing the first set of policies and procedures governing the SOC and its relationship with the rest of the organization.
