Are your IR playbooks ready to go?

Security Playbooks

In both my previous and current gigs, my role has included rolling out a good IR management framework for the team. I remember when I first got the task, I thought it would be kind of boring… Because… a database, APIs, a front end, and tons of input fields… nothing too challenging here…

However, I was wrong. As I talked to more and more analysts, I quickly realized that designing, building, and rolling out a good IR platform requires a lot of work! Implementing the little details and adding the human touch that makes everyone’s jobs easier is… tough!

For leadership & execs, some of this will include case metrics and pretty dashboards with nice little charts. Managers will want evidence of the current case load and of analysts’ progress/metrics that they can take to leadership to ask for more headcount or training budget if needed. For the analysts, it will include automation, a well-defined workflow per attack category, good timestamps… (Anything to help the analyst focus on the analysis, and less on being a copy-and-paste monkey.) For the engineers, it will require well-documented APIs that they can read from and write to, so they can extract IOCs and case notes, write better detections, etc. It will be hard to keep everyone happy. However, in IR, the first rule is always DO NOT PANIC, and respond calmly!

So why build playbooks? And why automate? Doesn’t that take a ton of time?

Yes, it does. But at the same time, the benefits are:

  • Consistent responses from different analysts, with no missing steps
  • Keep analysts focused
  • Produce better, higher-quality metrics
  • Faster response time per case from analysts

So in this blog post, I’ll rant about some thoughts I had earlier on creating a process that makes response easier… A lot of this is work in the preparation phase. From there, it is a lot of flowchart and checklist creation.

This assumes that you have both detection AND response capabilities within your environment (i.e., getting good alerts, being able to reach a user’s host, having some process in place for artifact collection, etc.).

IR Procedures

(In case you work in a smaller company with few or no people in security, these are the usual steps in IR. Otherwise, feel free to skip this part if you know it already!)

  • Preparation

    • Policy/Procedures
    • Tools
    • Training
  • Detection

    • Endpoint & network monitoring
    • Alert on any suspicious events
  • Identification

    • Look at the alert: can the event be converted into an incident?
    • What category does it fall under? (Phish/Spam, Malware, Exfiltration, etc.)
  • Analysis

    • Identify the breadth and depth of the incident
    • How many users clicked on X links?
    • What are the indicators that XYZ happened?
  • Containment

    • How do you stop XYZ?
    • Enable firewalls? Remove email from all inboxes? Sinkhole?
  • Eradication

    • Remove malware, backdoors, and connections established or used by the attacker
  • Recovery

    • Get workstations and servers back up online
    • Write/improve rules to prevent a recurrence
    • Get the user back to their business
  • Post-Incident

    • Update procedures, modify policies, enforce better rules

Does your IR management platform already have something to cover each step? If so, what does it have in place to identify the Who/What/Why/When/How?

  • Who is responsible / accountable / informed / consulted?
  • What happened?
  • When did it happen?
  • How did it happen?
  • What has been done?
  • What else are we waiting on?
  • Who has been contacted?
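To make those questions concrete, here is a minimal sketch of a case record that tracks them. The `CaseRecord` class and all of its field names are hypothetical, made up for illustration; your platform will have its own schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class CaseRecord:
    """Hypothetical case record; field names are illustrative only."""
    title: str
    category: str                              # e.g. "Phish/Spam", "Malware"
    what_happened: str = ""
    how_it_happened: str = ""
    occurred_at: Optional[datetime] = None
    # Responsible / Accountable / Informed / Consulted
    raci: Dict[str, List[str]] = field(default_factory=lambda: {
        "responsible": [], "accountable": [], "informed": [], "consulted": [],
    })
    actions_taken: List[str] = field(default_factory=list)
    waiting_on: List[str] = field(default_factory=list)
    contacted: List[str] = field(default_factory=list)

    def open_questions(self) -> List[str]:
        """Return the questions from the list above that are still unanswered."""
        gaps = []
        if not self.raci["responsible"]:
            gaps.append("Who is responsible?")
        if not self.what_happened:
            gaps.append("What happened?")
        if self.occurred_at is None:
            gaps.append("When did it happen?")
        if not self.how_it_happened:
            gaps.append("How did it happen?")
        if not self.actions_taken:
            gaps.append("What has been done?")
        return gaps
```

Something like `open_questions()` could feed a "this case is not ready to close" banner, so analysts see what is still missing without rereading the whole case.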

If all of the above have been defined… great! Now let’s get into creating and automating playbooks!

Where to start?

Where to start? That is a good question, and it all depends on your priorities… Say, in this case, we want to spend less time on the most frequent alerts (reported phishing emails).

Looking into your environment

  • What alerts fire the most?

    • What are the analysis steps?
    • What are the containment steps?
    • Can any of this be automated?

In this step, if you notice alert X fires all the time… and it’s a true positive alert, then automatically convert it into a case. For that case, create well-defined playbooks that it can run through.
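As a rough sketch of that idea: count which rules fire the most, then auto-open a case for the ones you already trust as true positives. The alert shape (`rule`, `host`), the rule names, and the `PLAYBOOK_FOR_RULE` mapping are all assumptions for illustration, not any particular product’s format.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical mapping from a trusted, high-fidelity rule to its playbook.
PLAYBOOK_FOR_RULE: Dict[str, str] = {
    "reported_phish": "phishing_email_investigate_and_respond",
    "account_lockout": "excessive_account_lockouts",
}


def noisiest_alerts(alerts: Iterable[Dict[str, str]],
                    top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the top_n rule names by how often they fired."""
    counts = Counter(alert["rule"] for alert in alerts)
    return counts.most_common(top_n)


def auto_convert(alerts: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Open a case for every alert whose rule has a playbook mapped to it."""
    cases = []
    for alert in alerts:
        playbook = PLAYBOOK_FOR_RULE.get(alert["rule"])
        if playbook is None:
            continue  # not trusted enough yet; leave it for manual triage
        cases.append({
            "source_rule": alert["rule"],
            "host": alert["host"],
            "playbook": playbook,
            "status": "open",
        })
    return cases
```

Rules that are noisy but not in the mapping stay in manual triage, which keeps the automation limited to alerts you have already decided are worth a case.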

  • What case categories do you see the most?

    • What steps are needed before a case can be closed?
    • Can any of this be automated?

Again, a similar concept to the above. If there’s a case of potential password compromise, do you already have something in place to kick off this process?

  • Reset AD Password? Do you have a way to automate that?
  • If not, then automate.
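Here is a minimal sketch of what kicking off that process could look like. It deliberately does not call any real directory API: `reset_password` and `notify` are hypothetical callables that you would wire up to whatever your environment actually uses.

```python
from typing import Callable, Dict, List


def run_password_compromise_playbook(
    username: str,
    reset_password: Callable[[str], bool],   # hypothetical: returns True on success
    notify: Callable[[str, str], None],      # hypothetical: (recipient, message)
) -> Dict[str, object]:
    """Reset the account, notify the user, and keep a log for the case notes."""
    log: List[str] = []

    succeeded = reset_password(username)
    log.append(f"reset_password: {'ok' if succeeded else 'failed'}")

    if succeeded:
        notify(username, "Your password was reset by the security team.")
        log.append("notify_user: sent")
    else:
        # Never fail silently: hand the step back to a human.
        log.append("escalate: manual reset required")

    return {"username": username, "contained": succeeded, "log": log}
```

Passing the reset and notify functions in as arguments keeps the playbook logic testable with fakes, and lets the same flow serve different account systems.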

A well-defined process should include:

  • A chart of the workflow
  • A JSON/YAML file of the standard input fields for the info that you will collect
  • Code that will automate that process
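For example, the input-fields file and the code around it could look roughly like this. The playbook name, field names, and step names below are made up for illustration; the point is just that the definition lives in data and the code reads it.

```python
import json
from typing import Any, Dict, List

# Hypothetical playbook definition; in practice this would live in its own file.
PLAYBOOK_JSON = """
{
  "name": "phishing_email_investigate_and_respond",
  "required_fields": ["reporter", "sender", "subject", "received_at"],
  "steps": [
    "collect_headers",
    "extract_urls",
    "search_other_inboxes",
    "remove_email",
    "notify_reporter"
  ]
}
"""


def load_playbook(raw: str) -> Dict[str, Any]:
    """Parse a playbook definition from JSON text."""
    return json.loads(raw)


def missing_fields(playbook: Dict[str, Any],
                   case_input: Dict[str, str]) -> List[str]:
    """List required fields that are absent or empty in the case input."""
    return [name for name in playbook["required_fields"]
            if not case_input.get(name)]


def checklist(playbook: Dict[str, Any]) -> List[str]:
    """Render the playbook steps as a numbered checklist for the analyst."""
    return [f"[ ] {number}. {step}"
            for number, step in enumerate(playbook["steps"], start=1)]
```

With this shape, adding a new playbook is mostly a matter of writing a new definition file, and the same validation and checklist code can be reused across categories.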

A few good examples to start:

  • Active Directory Password Reset

    • Windows
    • Mac
  • Alerts on VIP/Executive account/reported
  • Botnet
  • C2 Investigate and contain
  • Compromised Email with suspicious file(s)
  • Compromised Email with suspicious link(s)
  • DNS Block URL
  • Lost devices (laptop/phone/iPad)
  • Email notification to manager / HR
  • Email to notify that a device has been contained and needs to be reimaged
  • Excessive account lockouts
  • Exfiltration detection
  • Isolate EC2 instance
  • Investigate rootkits
  • Phishing email investigation & response
  • Re-infected Endpoints
  • Rogue Wireless Access Point Report and Remediation etc…

Having a playbook, and training both your new and experienced analysts on the exact steps to take, will save them time, decrease brain fatigue, and allow them to focus on doing incident response.

Sources / Resources to check out: