In conversation with
The VP of Product Strategy for Cortex at Palo Alto Networks on collaboration during security incidents
It all comes down to collaboration.
I think the biggest challenge I see when it comes to security incidents is collaboration: how you collaborate and who you collaborate with.
If you think about a security incident, it starts in one corner, with somebody noticing some sort of anomaly. The first step is assigning that incident to an incident commander, an owner, whatever different teams call it. That person’s role is to quickly figure out whether it’s a real problem or a false positive, which security products are known to produce. Ruling out a false positive typically requires pulling data from several different tools, including security information and event management (SIEM) systems.
Once you’ve figured out it’s a real incident, you need to figure out how severe the problem is. Does it involve a data leak? To an external party? Does it involve an executive’s laptop being compromised, where the sensitive data on it could harm the company in a big way? Do you need to involve legal? Do you need to involve HR? Once you’ve assessed the severity, that drives the next question: Who do I need to engage?
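As a rough illustration of how severity questions like these can drive who gets engaged, here is a minimal Python sketch. The `Incident` fields, the team names, and the routing rules are all hypothetical, chosen only to mirror the questions above; a real organization would take them from its own playbook.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # All fields are illustrative assumptions, not a standard schema.
    confirmed: bool                  # False means it was a false positive
    data_leak: bool = False
    external_party: bool = False
    executive_device: bool = False
    insider_suspected: bool = False


def teams_to_engage(incident: Incident) -> list[str]:
    """Map the severity questions to a list of teams (hypothetical rules)."""
    if not incident.confirmed:
        return []                    # false positive: nobody to engage
    teams = ["security"]
    if incident.data_leak and incident.external_party:
        teams.append("legal")        # external exposure may need disclosure
    if incident.executive_device:
        teams.append("executive-liaison")
    if incident.insider_suspected:
        teams.append("hr")
    return teams
```

Calling `teams_to_engage(Incident(confirmed=True, data_leak=True, external_party=True))` returns `["security", "legal"]` under these assumed rules.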
A playbook helps you prepare for future incidents.
Some organizations don’t have well-defined playbooks and processes. In those scenarios, what are you going to do in your post-incident review? What do you check against, to see if the incident was solved or not solved? But other organizations have very well-defined playbooks and processes, and what they check is adherence to those processes.
Some of those playbooks are used for quality assurance checks: you take, say, 20% of your incidents and assign them for a postmortem review. Did the incident commander or the owner follow the steps and the processes that they were supposed to follow? Has the right level of documentation been collected? Sometimes reviews of incidents with lots of complications don’t touch upon documentation. But if it becomes a compliance problem, or a big external breach, you’ll need to show that you did the right thing. Has that been documented?
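One simple way to pick that review sample is to draw it at random, so that no particular team or incident type is favored. The sketch below assumes incidents are identified by plain IDs; the function name, the `fraction` default of 0.2, and the optional `seed` are illustrative choices, not part of any specific tool.

```python
import random


def pick_for_review(incident_ids, fraction=0.2, seed=None):
    """Randomly select a fraction of incidents for postmortem review."""
    ids = list(incident_ids)
    if not ids:
        return []
    rng = random.Random(seed)        # a seed makes the draw reproducible
    # Always review at least one incident when any exist.
    count = max(1, round(len(ids) * fraction))
    return sorted(rng.sample(ids, count))
```

With 50 closed incidents and the default fraction, this selects 10 of them for review.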
Keep your communication charts up-to-date.
I’ve seen organizations that don't even have communication charts: Who do I contact if this happens? Where is that documented? This comes back to collaboration: the specific communication challenges include knowing who to contact. People are leaving all the time. How do you keep that directory current? And some of this may be about process: What level of information can I share with this person? Do I need to get legal approval before I share certain things? What if it’s actually an insider threat—a problem with somebody on their team? Those are some of the challenges, and I don't think most organizations know who to contact, and what exactly they can share.
People underestimate fire drills.
Many organizations don’t practice what I believe is a very, very critical piece: the tabletop exercise. People underestimate fire drills. When a real incident happens, people have less time to think—but if you’re able to document processes, and if you're able to make certain aspects of incident response second nature, then you go a long way towards creating more efficient ways to handle those incidents. And I think that's very, very often overlooked.
Breaking monotony can go a long way.
I’m a big proponent of giving every analyst some time to automate their job. There’s always some mundane work they have during the day that could be automated, so I’d encourage them to automate it. I’m also a big proponent of skill-building training, since the postmodern security world is changing very fast. Growth and breaking the monotony of the job are critical: give them the opportunity to learn and build new skills, to challenge themselves in different ways. Even change their role: they were a firewall expert, and now they become an endpoint expert. We don’t realize monotony is a big part of what pulls people down, doing the same thing over and over. Change goes a long way.
Auto-documentation encourages communication and learning.
In a security incident, it’s not just about the person who’s on call—it’s also about doing an efficient handover.
Say whoever’s on call starts to work on an incident, but at the end of the shift, the next person takes over from there. Documenting what they’ve just done is a non-trivial task; nobody likes to write down what they did in the last eight hours. Imagine a stressed security analyst trying to solve a problem for six hours—now they have to spend an hour writing everything down. They hate it, they absolutely hate it.
So what can be done? I’ll call it auto-documentation: document as you go along. There’s this whole concept of chat-ops: as you run commands, the commands and their output get logged. The next person comes in and knows exactly what the previous person did. Why did they do it this way? What did they handle? What have they already tried? That goes a long way for security analysts, because nobody wants to manually document all of that. If you’re able to collect documentation in an automated manner, then you’re able to encourage learning from each other.
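To make the chat-ops idea concrete, here is a minimal Python sketch of logging commands as they run. It is not how any particular chat-ops product works; the function names, the log entry fields, and the short `note` that captures why a command was run are all assumptions made for illustration.

```python
import subprocess
import sys
from datetime import datetime, timezone


def run_and_log(analyst, command, note, log):
    """Run a command and append what was done, why, and the result to log."""
    result = subprocess.run(command, capture_output=True, text=True)
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "analyst": analyst,
        "command": " ".join(command),
        "note": note,                      # why this step was taken
        "exit_code": result.returncode,
        "output": result.stdout.strip(),
    }
    log.append(entry)
    return entry


def handover_summary(log):
    """Render the log as one line per step for the next person on shift."""
    return "\n".join(
        f"{e['time']} {e['analyst']} ran `{e['command']}` "
        f"({e['note']}): exit {e['exit_code']}"
        for e in log
    )


if __name__ == "__main__":
    shift_log = []
    # Hypothetical step: a harmless command stands in for a real check.
    run_and_log("analyst-1", [sys.executable, "-c", "print('ok')"],
                "sanity check of tooling", shift_log)
    print(handover_summary(shift_log))
```

The point of the design is that the record is a side effect of doing the work: the analyst adds only a brief note on intent, and the handover summary is generated rather than written from memory at the end of a shift.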