🔒 Security Investigation Automation System

Comprehensive, automated security investigations powered by Microsoft Sentinel, Defender XDR, Graph API, and threat intelligence — with 25 specialized Agent Skills

📺 Video Walkthrough: See this project in action — Watch on YouTube (starts at the Security Investigator demo). Covers the end-to-end workflow: natural language investigations, MCP server integration, KQL query execution, threat intelligence enrichment, and automated report generation.

An investigation automation framework that combines GitHub Copilot, VS Code Agent Skills, and Model Context Protocol (MCP) servers to enable natural language security investigations. Ask questions like "Investigate this user for the last 7 days" or "Is this IP malicious?" and get comprehensive analysis with KQL queries, threat intelligence correlation, and professional reports.

Quick Start (TL;DR)

# 1. Clone and open in VS Code
git clone https://github.com/SCStelz/security-investigator.git
code security-investigator

# 2. Set up Python environment
python -m venv .venv
.venv\Scripts\Activate.ps1          # Windows
# source .venv/bin/activate          # macOS/Linux
pip install --require-hashes -r requirements.lock   # Hash-verified (recommended)
# pip install -r requirements.txt                   # Without hash verification

# 3. Configure environment
copy config.json.template config.json
# Edit config.json → add your Sentinel workspace ID, tenant ID
copy .env.template .env
# Edit .env → add your API tokens (ipinfo, AbuseIPDB, vpnapi, Shodan)

# 4. Configure MCP servers
copy .vscode\mcp.json.template .vscode\mcp.json
# All platform servers are pre-configured — just needs a GitHub PAT on first use

# 5. Open Copilot Chat (Ctrl+Shift+I) in Agent mode and start with:
#    "Run a threat pulse scan"

🚀 Recommended first run: The Threat Pulse skill is the best starting point. It runs a broad-spectrum scan across 9 security domains (incidents, identity, endpoint, exposure, email, UEBA, auth spray, privileged ops, CVEs) and produces prioritized findings with color-coded verdicts (🔴 Escalate / 🟠 Investigate / 🟡 Monitor / ✅ Clear). Each finding includes a drill-down recommendation pointing to a specialized skill — so after the scan, you'll know exactly where to focus and which follow-up command to run.

Other example prompts:

"Investigate user@domain.com for the last 7 days"    → user-investigation
"Analyze incident 12345"                              → incident-investigation
"Is this IP malicious? 203.0.113.42"                  → ioc-investigation
"What skills do you have access to?"                  → lists all 25 skills

For detailed workflows and KQL queries: → .github/copilot-instructions.md (universal patterns, skill detection) → .github/skills/ (25 specialized investigation workflows) → queries/ (verified KQL query library)

Architecture Overview

┌────────────────────────────────────────────────────────────────────┐
│                     GitHub Copilot (VS Code)                       │
├────────────────────────────────────────────────────────────────────┤
│                  .github/copilot-instructions.md                   │
│            (Skill detection, universal patterns, routing)          │
├────────────────────────────────────────────────────────────────────┤
│                     .github/skills/*.md                            │
│       (25 specialized workflows with KQL, risk assessment)         │
├────────────────────────────────────────────────────────────────────┤
│                     MCP Servers (Platform)                         │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────────────┐  │
│  │ Sentinel    │  │ Graph API    │  │ Sentinel Triage (XDR)     │  │
│  │ Data Lake   │  │ (Identity)   │  │ (Advanced Hunting)        │  │
│  └─────────────┘  └──────────────┘  └───────────────────────────┘  │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────────────┐  │
│  │ KQL Search  │  │ Microsoft    │  │ Azure MCP Server          │  │
│  │ (Schema)    │  │ Learn (Docs) │  │ (ARM + Monitor)           │  │
│  └─────────────┘  └──────────────┘  └───────────────────────────┘  │
│  ┌─────────────┐                                                   │
│  │ Sentinel    │                                                   │
│  │ Graph (Rel) │                                                   │
│  └─────────────┘                                                   │
├────────────────────────────────────────────────────────────────────┤
│               MCP Apps (Local Custom Servers)                      │
│  ┌─────────────┐  ┌──────────────┐  ┌───────────────────────────┐  │
│  │ Geomap      │  │ Heatmap      │  │ Incident Comment          │  │
│  │ (Attack Map)│  │ (Patterns)   │  │ (Sentinel Integration)    │  │
│  └─────────────┘  └──────────────┘  └───────────────────────────┘  │
├────────────────────────────────────────────────────────────────────┤
│                      Python Utilities                              │
│ generate_report_from_json.py  │  enrich_ips.py  │  report_generator│
└────────────────────────────────────────────────────────────────────┘

Key Components:

25 Agent Skills — Modular investigation workflows for incidents, users, devices, IoCs, authentication, scope drift (SPN/User/Device), MCP monitoring, exposure management, AI agent posture, app registration posture, identity posture, data security analysis, email threat posture, MITRE ATT&CK coverage, ingestion analysis, detection authoring, threat pulse scanning, SVG dashboards, and more
7 MCP Server Integrations — Sentinel Data Lake, Graph API, Defender XDR Triage, KQL Search, Microsoft Learn, Azure MCP Server, Sentinel Graph (private preview)
3 Local MCP Apps — Interactive heatmaps, geographic attack maps, incident commenting
Python Utilities — HTML report generation with IP enrichment (geolocation, VPN detection, abuse scores, Shodan port/service/CVE intelligence)

🤖 Agent Skills

This system uses VS Code Agent Skills to provide modular, domain-specific investigation workflows. Skills are automatically detected based on keywords in your prompts.

Available Skills (25)

Category	Skill	Description	Trigger Keywords
⚡ Quick Scan	threat-pulse	Rapid broad-spectrum security scan across 7 domains: active incidents, identity (human + NHI), endpoint, email threats, admin & cloud ops, exposure. Prioritized Threat Pulse findings with color-coded verdicts and drill-down recommendations	"threat pulse", "quick scan", "security pulse", "morning hunt", "what can you do", "where do I start", "what's going on"
🔍 Core Investigation	computer-investigation	Device security analysis for Entra Joined, Hybrid Joined, and Entra Registered devices: Defender alerts, compliance, logged-on users, vulnerabilities, process/network/file events	"investigate computer", "investigate device", "investigate endpoint", "check machine", hostname
🔍 Core Investigation	honeypot-investigation	Honeypot security analysis: attack patterns, threat intel, vulnerabilities, executive reports	"honeypot", "attack analysis", "threat actor"
🔍 Core Investigation	incident-investigation	Comprehensive incident analysis for Defender XDR and Sentinel incidents: criticality assessment, entity extraction, filtering, recursive entity investigation	"investigate incident", "incident ID", "analyze incident", "triage incident", incident number
🔍 Core Investigation	ioc-investigation	Indicator of Compromise analysis: IP addresses, domains, URLs, file hashes. Includes Defender Threat Intelligence, Sentinel TI tables, CVE correlation, organizational exposure	"investigate IP", "investigate domain", "investigate URL", "investigate hash", "IoC", "is this malicious"
🔍 Core Investigation	user-investigation	Entra ID user security analysis: sign-ins, anomalies, MFA, devices, audit logs, incidents, Identity Protection, HTML reports	"investigate user", "security investigation", "check user activity", UPN/email
🔐 Auth & Access	authentication-tracing	Entra ID authentication chain forensics: SessionId analysis, token reuse vs interactive MFA, geographic anomalies	"trace authentication", "SessionId analysis", "token reuse", "geographic anomaly"
🔐 Auth & Access	ca-policy-investigation	Conditional Access policy forensics: sign-in failure correlation, policy state changes, security bypass detection	"Conditional Access", "CA policy", "device compliance", "policy bypass"
📈 Behavioral Analysis	scope-drift-detection/device	Device process drift: configurable-window baseline, 5-dimension Drift Score (Volume/Processes/Accounts/Chains/Signing), fleet-wide or single-device, Heartbeat uptime corroboration	"device drift", "endpoint drift", "process baseline", "device behavioral change"
📈 Behavioral Analysis	scope-drift-detection/spn	SPN scope drift: 90-day baseline vs 7-day comparison, 5-dimension Drift Score, correlated with AuditLogs, SecurityAlert, DeviceNetworkEvents	"scope drift", "service principal drift", "SPN behavioral change", "SPN drift"
📈 Behavioral Analysis	scope-drift-detection/user	User scope drift: 90-day baseline vs 7-day comparison, dual Drift Scores (7-dim interactive + 6-dim non-interactive), correlated with AuditLogs, SecurityAlert, Identity Protection, CloudAppEvents, EmailEvents	"user drift", "user scope drift", "user behavioral change", "UPN drift"
🛡️ Posture & Exposure	exposure-investigation	Vulnerability & Exposure Management reporting: CVE assessment with exploit/CVSS data, security configuration compliance, end-of-support software, ExposureGraph critical assets, attack paths, Defender health, certificate status	"vulnerability report", "exposure report", "CVE assessment", "security posture", "TVM"
🛡️ Posture & Exposure	ai-agent-posture	AI agent security posture audit for Copilot Studio and M365 Copilot agents: agent inventory, authentication gaps, access control misconfigurations, MCP tool proliferation, knowledge source exposure, XPIA risk, credential detection, Agent Security Score	"AI agent posture", "agent security audit", "Copilot Studio agents", "agent inventory", "unauthenticated agents", "agent sprawl"
🛡️ Posture & Exposure	app-registration-posture	App registration and service principal security posture: Graph API permission inventory (dangerous grants, permission concentration), app ownership risk, credential hygiene (stale secrets, multi-credential apps), cross-tenant SPN exposure, KQL attack chain detection (AuditLogs, AADServicePrincipalSignInLogs, MicrosoftGraphActivityLogs), App Permission Risk Score with 5 dimensions	"app registration posture", "app registration abuse", "service principal permissions", "dangerous app permissions", "app ownership", "overprivileged apps"
🛡️ Posture & Exposure	email-threat-posture	Email threat protection posture report for Microsoft Defender for Office 365: inbound mail flow overview, threat composition (phishing/spam/malware), email authentication (DMARC/DKIM/SPF/CompAuth), ZAP post-delivery remediation, Safe Links click protection, attachment analysis, detection method breakdown, MDO security incidents, Email Protection Score with 5 dimensions. Inline chat, markdown file, and SVG dashboard output	"email threat report", "email security posture", "phishing report", "MDO report", "Defender for Office 365 report", "ZAP effectiveness", "Safe Links report", "DMARC report"
🛡️ Posture & Exposure	identity-posture	Identity security posture report using IdentityAccountInfo (MDI/Advanced Hunting): multi-provider account inventory (Entra ID, AD, Okta, SailPoint, CyberArk, Ping), privileged account audit with role distribution, stale/disabled/deleted account hygiene, password posture, risk distribution, multi-provider identity linking, MDI tag analysis, Identity Posture Score with 5 dimensions. Inline chat and markdown file output	"identity posture", "identity security report", "account hygiene", "stale accounts", "privileged accounts", "password posture", "identity providers", "honeytoken"
🔒 Data Security	data-security-analysis	DataSecurityEvents (Purview/IRM) analysis: SIT access breakdowns, user risk ranking, file inventory, DLP policy correlation, Copilot SIT exposure, SIT GUID-to-name resolution, anomaly detection. Designed for 100k+ user environments	"data security", "sensitive information type", "SIT access", "DLP events", "DataSecurityEvents", "EDM access", "insider risk activity", "Purview data security"
📊 Visualization	geomap-visualization	Interactive world map visualization for Sentinel data: attack origin maps, geographic threat distribution, IP geolocation with enrichment drill-down	"geomap", "world map", "geographic", "attack map", "attack origins"
📊 Visualization	heatmap-visualization	Interactive heatmap visualization for Sentinel data: attack patterns by time, activity grids, IP vs hour matrices, threat intel drill-down	"heatmap", "show heatmap", "visualize patterns", "activity grid"
📊 Visualization	svg-dashboard	SVG data visualization dashboards: dual-mode renderer supporting manifest-driven structured dashboards (from skill reports) and freeform adaptive visualizations from ad-hoc investigation data. 14-widget component library	"generate SVG dashboard", "create a visual dashboard", "visualize this report", "SVG from this data"
🔧 Tooling & Monitoring	detection-authoring	Create, deploy, update, and manage Defender XDR custom detection rules via Graph API. Query adaptation from Sentinel KQL, manifest-driven batch deployment via PowerShell, lifecycle management	"create custom detection", "deploy detection", "detection rule", "custom detection", "deploy rule", "batch deploy"
🔧 Tooling & Monitoring	kql-query-authoring	KQL query creation using schema validation, community examples, Microsoft Learn	"write KQL", "create KQL query", "help with KQL", "query [table]"
🔧 Tooling & Monitoring	mcp-usage-monitoring	MCP server usage monitoring and audit: Graph MCP endpoint analysis, Sentinel MCP auth events, Azure MCP ARM operations, workspace query governance, MCP Usage Score with 5 health/risk dimensions	"MCP usage", "MCP server monitoring", "MCP activity", "MCP audit", "Graph MCP", "Sentinel MCP", "Azure MCP"
🔧 Tooling & Monitoring	sentinel-ingestion-report	Sentinel workspace ingestion & cost analysis: table-level volume breakdown, tier classification (Analytics/Basic/Data Lake), SecurityEvent/Syslog/CommonSecurityLog deep dives, ingestion anomaly detection, analytic rule inventory via REST API, custom detection inventory via Graph API, rule health via SentinelHealth, data lake tier migration candidates, license benefit analysis (DfS P2, M365 E5)	"ingestion report", "usage report", "data volume", "cost analysis", "table breakdown", "data lake tier", "ingestion anomaly", "cost optimization"
🔧 Tooling & Monitoring	mitre-coverage-report	MITRE ATT&CK coverage analysis: YAML-driven PowerShell pipeline gathers analytic rule MITRE tags, custom detection techniques, SOC Optimization recommendations, alert/incident operational data. Tactic-level coverage matrix, technique-level drill-down with rule mapping, coverage gap identification, SOC Optimization threat scenario alignment, untagged rule remediation, MITRE Coverage Score (5 weighted dimensions). Inline chat and markdown file output	"MITRE coverage", "ATT&CK coverage", "MITRE report", "tactic coverage", "technique coverage", "coverage gaps", "MITRE score", "detection coverage report", "MITRE matrix"

How Skills Work

You ask Copilot a question (e.g., "Investigate user@domain.com for the last 7 days")
Copilot detects keywords and loads the appropriate skill from .github/skills/<skill-name>/SKILL.md
The skill provides specialized workflow, KQL queries, and risk assessment criteria
Universal patterns from .github/copilot-instructions.md are inherited automatically

Triggering Skills with Natural Language

You don't need to mention the skill name — keywords are detected automatically:

What you say	Skill triggered
"Investigate user@domain.com for the last 7 days"	user-investigation
"Analyze incident 12345"	incident-investigation
"Is this IP malicious? 203.0.113.42"	ioc-investigation
"Check the device WORKSTATION-01 for threats"	computer-investigation
"Show attack patterns on a heatmap"	heatmap-visualization
"Generate an SVG dashboard from the report"	svg-dashboard
"Map the geographic origins of these attacks"	geomap-visualization
"Write a KQL query to find failed sign-ins"	kql-query-authoring
"Trace this authentication back to the original MFA"	authentication-tracing
"Detect scope drift in service principals"	scope-drift-detection/spn
"Check user behavioral drift for user@domain.com"	scope-drift-detection/user
"Analyze device process drift across the fleet"	scope-drift-detection/device
"Show me MCP server usage for the last 30 days"	mcp-usage-monitoring
"Generate a Sentinel ingestion report"	sentinel-ingestion-report
"Create custom detections for Event ID 4799"	detection-authoring
"Audit AI agent security posture"	ai-agent-posture
"Who accessed files with credit card numbers?"	data-security-analysis
"Generate an email threat protection report"	email-threat-posture
"Run an identity posture report"	identity-posture
"Generate a MITRE ATT&CK coverage report"	mitre-coverage-report
"Run a threat pulse scan"	threat-pulse
"Audit our app registration security posture"	app-registration-posture

Follow-ups and Chaining

After running an investigation, ask follow-up questions without re-running the entire workflow:

Is that IP a VPN?
Trace authentication for that suspicious location
Was MFA used for those sign-ins?

Skills can be chained for comprehensive analysis:

1. "Investigate incident 12345" → incident-investigation extracts entities
2. "Now investigate the user from that incident" → user-investigation runs on extracted UPN
3. "Check if that IP is malicious" → ioc-investigation analyzes the suspicious IP
4. "Show me a heatmap of the attack patterns" → heatmap-visualization

Copilot uses existing investigation data from temp/investigation_*.json when available.

Discovering Skills

What investigation skills do you have access to?
Explain the high-level workflow of the user-investigation skill
What data sources does the ioc-investigation skill use?

📖 Reference: GitHub Agent Skills Documentation

Authoring New Skills & Queries from Investigations

Ad-hoc investigations naturally evolve into reusable assets. After completing an investigation, ask Copilot to package the verified queries, schema pitfalls, and analytical logic into a new SKILL.md or query file.

"Based on the investigation we just completed, create a new reusable skill"
"Read this threat intel article: <URL> — extract TTPs and IOCs, then write, test, and tune a queries file for reusable threat hunts"

📁 Project Structure

security-investigator/
├── enrich_ips.py                # Standalone IP enrichment utility
├── config.json                  # Configuration (workspace IDs, mappings)
├── config.json.template         # Config template (committed to Git)
├── .env                         # API tokens (gitignored, auto-loaded by python-dotenv)
├── .env.template                # Token template (committed to Git)
├── requirements.txt             # Python dependencies
├── requirements.lock            # Hash-verified dependency lockfile
├── .vscode/
│   └── mcp.json.template       # MCP server config template (copy to mcp.json)
├── .github/
│   ├── copilot-instructions.md  # Skill detection, universal patterns, routing
│   ├── manifests/               # Auto-generated discovery indexes
│   │   ├── discovery-manifest.yaml  # Query file + skill index (domains, MITRE, prompts)
│   │   └── build_manifest.py        # Manifest generator script
│   └── skills/                  # 25 Agent Skills (modular investigation workflows)
│       ├── ai-agent-posture/
│       ├── app-registration-posture/
│       ├── authentication-tracing/
│       ├── ca-policy-investigation/
│       ├── computer-investigation/
│       ├── data-security-analysis/
│       ├── detection-authoring/
│       ├── email-threat-posture/
│       ├── exposure-investigation/
│       ├── geomap-visualization/
│       ├── heatmap-visualization/
│       ├── honeypot-investigation/
│       ├── identity-posture/
│       ├── incident-investigation/
│       ├── ioc-investigation/
│       ├── kql-query-authoring/
│       ├── mcp-usage-monitoring/
│       ├── mitre-coverage-report/
│       ├── scope-drift-detection/
│       │   ├── spn/              # Service principal drift (5 dimensions)
│       │   ├── user/             # User account drift (7+6 dimensions)
│       │   └── device/           # Device process drift (5 dimensions)
│       ├── sentinel-ingestion-report/
│       ├── svg-dashboard/
│       ├── threat-pulse/
│       └── user-investigation/
├── queries/                     # Verified KQL query library (grep-searchable, by data domain)
│   ├── cloud/                  # Cloud app & exposure management queries
│   ├── email/                  # Defender for Office 365 email queries
│   ├── endpoint/               # Defender for Endpoint device queries
│   ├── identity/               # Entra ID / Azure AD identity queries
│   ├── incidents/              # SecurityIncident & SecurityAlert queries
│   └── network/                # Network telemetry queries
├── scripts/                     # Python utilities
│   ├── generate_report_from_json.py  # Report generator (main entry point)
│   ├── report_generator.py           # HTML report builder class
│   ├── investigator.py               # Data models and core types
│   ├── cleanup_old_investigations.py  # Automated cleanup (3+ days old)
│   └── generate_tocs.py              # Auto-generate query file TOCs
├── mcp-apps/                    # Local MCP servers (visualization, automation)
│   ├── sentinel-geomap-server/
│   ├── sentinel-heatmap-server/
│   └── sentinel-incident-comment/
├── docs/                        # Setup guides and reference documentation
├── authoring/                   # Blog drafts, writing guides, and marketing content
├── reports/                     # Generated investigation reports (organized by type)
│   ├── ai-agent-posture/       # AI agent security posture reports
│   ├── app-registration-posture/ # App registration posture reports
│   ├── computer-investigations/ # Device security investigation reports
│   ├── data-security/          # Data security SIT analysis reports
│   ├── email-threat-posture/   # Email threat protection posture reports
│   ├── exposure/               # Exposure management reports
│   ├── honeypot/               # Honeypot executive reports
│   ├── identity-posture/       # Identity security posture reports
│   ├── mcp-usage/              # MCP usage monitoring reports
│   ├── scope-drift/            # Scope drift analysis reports
│   ├── sentinel/               # Sentinel ingestion & cost analysis reports
│   ├── threat-pulse/           # Threat Pulse scan reports
│   └── user-investigations/    # HTML user investigation reports
├── temp/                        # Investigation JSON files (auto-cleaned after 3 days)
└── archive/                     # Legacy code and design docs

Query Library (`queries/`)

The queries/ folder contains verified, battle-tested KQL query collections organized by detection scenario. These are the Priority 2 lookup source in the KQL Pre-Flight Checklist — Copilot searches them before writing any ad-hoc KQL.

Each file uses a standardized metadata header for efficient grep_search discovery:

# <Title>
**Tables:** <exact KQL table names>
**Keywords:** <searchable terms — attack techniques, scenarios, field names>
**MITRE:** <ATT&CK technique IDs, e.g., T1021.001, TA0008>
**Domains:** <domain tags for manifest indexing, e.g., identity, endpoint, email>

Discovery Manifest (`.github/manifests/`)

The discovery manifest provides a machine-readable index of all query files and skills, enabling deterministic cross-referencing by domain and MITRE technique. The Threat Pulse skill loads this manifest to match findings to downstream query files and drill-down skills automatically.

discovery-manifest.yaml — Compact index (~500 lines) with title, path, domains, mitre, and prompt fields for each query file and skill
build_manifest.py — Generator script that scans queries/ metadata headers and skill YAML frontmatter to produce the manifest

How it works:

Query files declare **Domains:** tags in their metadata header (valid tags: incidents, identity, spn, endpoint, email, admin, cloud, exposure)
Skills declare threat_pulse_domains: and drill_down_prompt: in their YAML frontmatter
python .github/manifests/build_manifest.py scans both and emits the manifest
The Threat Pulse skill reads the manifest to match non-✅ findings → relevant query files and skills by domain tag and MITRE technique overlap

Regenerate after creating or renaming query files/skills, or changing Domains:/threat_pulse_domains: values:

python .github/manifests/build_manifest.py

🚀 Setup

Prerequisites

Requirement	Details
VS Code	Version 1.99+ recommended (Agent mode + MCP support).
GitHub Copilot	Active subscription — Copilot Pro+, Business, or Enterprise. Agent mode must be enabled.
Python 3.8+	For IP enrichment utility and report generation. Download
Azure CLI	Required for Azure MCP Server (underlying auth) and `sentinel-ingestion-report` skill (`az monitor log-analytics query` for all KQL queries, `az rest` for analytic rule inventory, `az monitor log-analytics workspace table list` for tier classification). Install. Authenticate: `az login --tenant <tenant_id>`, then `az account set --subscription <subscription_id>`. Requires Log Analytics Reader (KQL queries + table list) and Microsoft Sentinel Reader (analytic rule inventory) on the workspace.
`log-analytics` CLI extension	Required by the `sentinel-ingestion-report` skill for `az monitor log-analytics query` (all KQL queries in Phases 1-5). Install: `az extension add --name log-analytics`. Verify: `az extension list --query "[?name=='log-analytics']"`.
PowerShell 7.0+	Required for `sentinel-ingestion-report` skill (parallel query execution via `ForEach-Object -Parallel`). Install. Verify: `$PSVersionTable.PSVersion`.
Node.js 18+	Required for KQL Search MCP (`npx`) and building local MCP Apps. Download or install via `winget install OpenJS.NodeJS.LTS` (Windows) / `brew install node` (macOS).
Microsoft Sentinel	Log Analytics workspace with data. You'll need the workspace GUID and tenant ID.
Entra ID Permissions	If you can query Sentinel in the Azure Portal, you likely have sufficient access. The Graph MCP server requires a one-time tenant provisioning by an admin. See MCP Server Setup for detailed per-server requirements.
Microsoft.Graph PowerShell	Required for `detection-authoring` skill (`CustomDetection.ReadWrite.All` — create/update/delete custom detection rules via Graph API). Also used by `sentinel-ingestion-report` skill for rule inventory (`CustomDetection.Read.All` — read-only, degrades gracefully if not installed). `Install-Module Microsoft.Graph.Authentication -Scope CurrentUser`.
GitHub PAT	`public_repo` scope — Create one here. Used by KQL Search MCP.

1. Install Dependencies

Verify prerequisites:

python --version   # Requires 3.8+
node --version     # Requires 18+ (needed for KQL Search MCP)
az --version       # Azure CLI (needed for Azure MCP Server, ingestion report skill)
pwsh --version     # Requires 7.0+ (needed for sentinel-ingestion-report skill)

If Node.js is missing: Download or run winget install OpenJS.NodeJS.LTS (Windows) / brew install node (macOS). If Azure CLI is missing: Install, then az login --tenant <tenant_id> and az account set --subscription <subscription_id>. If the log-analytics extension is missing: az extension add --name log-analytics (required for sentinel-ingestion-report skill).

Set up Python environment:

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

2. Configure Environment

Copy config.json.template to config.json and fill in your workspace details:

{
  "sentinel_workspace_id": "YOUR_WORKSPACE_ID_HERE",
  "tenant_id": "YOUR_TENANT_ID_HERE",
  "subscription_id": "YOUR_SUBSCRIPTION_ID_HERE",
  "azure_mcp": {
    "resource_group": "YOUR_LOG_ANALYTICS_RESOURCE_GROUP",
    "workspace_name": "YOUR_LOG_ANALYTICS_WORKSPACE_NAME",
    "tenant": "YOUR_TENANT_ID_HERE",
    "subscription": "YOUR_SUBSCRIPTION_ID_HERE"
  },
  "output_dir": "reports"
}

Setting	Required	Description
`sentinel_workspace_id`	Yes	Microsoft Sentinel (Log Analytics) workspace GUID
`tenant_id`	Yes	Entra ID (Azure AD) tenant ID for your Sentinel workspace
`subscription_id`	Yes	Azure subscription ID containing the Sentinel workspace
`azure_mcp.*`	Yes	Azure MCP Server parameters — resource group, workspace name, tenant, subscription. Required to avoid cross-tenant auth errors.
`output_dir`	No	Directory for HTML reports (default: `reports`)

API Tokens (`.env` file)

API tokens for IP enrichment are stored in a .env file (gitignored) rather than config.json for security. Copy the template and add your keys:

copy .env.template .env
# Edit .env with your token values

IPINFO_TOKEN=your_token_here
ABUSEIPDB_TOKEN=your_token_here
VPNAPI_TOKEN=your_token_here
SHODAN_TOKEN=your_token_here

These are auto-loaded by enrich_ips.py via python-dotenv — no manual sourcing needed.

Token	Required	Description
`IPINFO_TOKEN`	Recommended	ipinfo.io API token — geolocation, ASN, org. Free: 1K/day; token: 50K/month; paid plans include VPN detection
`ABUSEIPDB_TOKEN`	Recommended	AbuseIPDB API token — IP reputation scoring (0-100 confidence). Free: 1K/day
`VPNAPI_TOKEN`	Optional	vpnapi.io API token — VPN/proxy/Tor detection. Not needed if ipinfo.io is on a paid plan
`SHODAN_TOKEN`	Optional	Shodan API key — open ports, services, CVEs, OS detection, tags. Free InternetDB fallback if no key or credits exhausted

3. Configure MCP Servers

Copy the MCP server template (all platform servers + 3 optional MCP Apps are pre-configured):

copy .vscode/mcp.json.template .vscode/mcp.json

The template includes inline documentation for each server. On first use, VS Code will prompt for:

Entra ID login — browser-based auth for Sentinel Data Lake, Graph, Triage, and Sentinel Graph servers
GitHub PAT — for KQL Search MCP (schema intelligence and query discovery). Needs public_repo scope.

See MCP Server Setup below for per-server permissions and installation guides.

4. Build MCP Apps (Optional — Visualization Skills)

PowerShell (Windows):

cd mcp-apps/sentinel-geomap-server; npm install; npm run build; cd ../..
cd mcp-apps/sentinel-heatmap-server; npm install; npm run build; cd ../..
cd mcp-apps/sentinel-incident-comment; npm install; npm run build; cd ../..

Bash (macOS/Linux):

cd mcp-apps/sentinel-geomap-server && npm install && npm run build && cd ../..
cd mcp-apps/sentinel-heatmap-server && npm install && npm run build && cd ../..
cd mcp-apps/sentinel-incident-comment && npm install && npm run build && cd ../..

The sentinel-incident-comment MCP App requires an Azure Logic App backend. See mcp-apps/sentinel-incident-comment/README.md for setup. Based on stefanpems/mcp-add-comment-to-sentinel-incident.

🔌 MCP Server Setup

The system uses several Model Context Protocol (MCP) servers. All are pre-configured in .vscode/mcp.json.template — copy it to .vscode/mcp.json to get started (see Step 3 above). The sections below document permissions, tools, and installation guides for each server.

At a Glance

#	Server	MCP URL / Transport	Setup Guide	Key Permissions
1	Sentinel Data Lake	`https://sentinel.microsoft.com/mcp/data-exploration`	Setup	Log Analytics Reader
2	Microsoft Graph	`https://mcp.svc.cloud.microsoft/enterprise`	Setup	User.Read.All, Device.Read.All
3	Sentinel Triage	`https://sentinel.microsoft.com/mcp/triage`	Setup	SecurityReader
4	KQL Search	`npx -y kql-search-mcp` (stdio)	Setup	GitHub PAT (`public_repo`)
5	Microsoft Learn	`https://learn.microsoft.com/api/mcp`	Setup	None (free)
6	Azure MCP Server	VS Code extension (stdio)	Setup	Contributor or Reader on subscription
7	Sentinel Graph ⚠️	`https://sentinel.microsoft.com/mcp/graph`	Blog	Sentinel Reader — Private Preview

1. Microsoft Sentinel MCP Server

📖 Installation Guide

Tools: query_lake, search_tables, list_sentinel_workspaces

Permissions:

Log Analytics Reader (minimum) — query workspace data
Sentinel Reader (recommended) — full investigation capabilities
Sentinel Contributor — watchlist management (optional)

2. MCP Server for Microsoft Graph

📖 Installation Guide

Tools: microsoft_graph_suggest_queries, microsoft_graph_get, microsoft_graph_list_properties

⚡ One-time tenant provisioning (requires Application Administrator or Cloud Application Administrator role):

# 1. Install the Entra Beta PowerShell module (v1.0.13+)
Install-Module Microsoft.Entra.Beta -Force -AllowClobber

# 2. Authenticate to your tenant
Connect-Entra -Scopes 'Application.ReadWrite.All', 'Directory.Read.All', 'DelegatedPermissionGrant.ReadWrite.All'

# 3. Register the MCP Server and grant permissions to VS Code
Grant-EntraBetaMCPServerPermission -ApplicationName VisualStudioCode

This only needs to be done once per tenant. After provisioning, all users in the tenant can use the Graph MCP server by signing in with their own account.

Permissions (delegated, per-user):

User.Read.All — user profiles and authentication methods
UserAuthenticationMethod.Read.All — MFA methods
Device.Read.All — device compliance and enrollment
IdentityRiskEvent.Read.All — Identity Protection risk detections

3. Microsoft Sentinel Triage MCP Server

📖 Installation Guide

Tools (30+): RunAdvancedHuntingQuery, ListIncidents, GetAlertById, GetDefenderMachine, GetDefenderFileInfo, GetDefenderIpAlerts, ListUserRelatedMachines, GetDefenderMachineVulnerabilities, and more.

Permissions:

Microsoft Defender for Endpoint API — SecurityReader role minimum
Advanced Hunting — read access to Defender XDR data

4. KQL Search MCP Server

📖 Installation Guide

Option A: VS Code Extension (Recommended)

Extensions panel → Search "KQL Search MCP" → Install
Command Palette → KQL Search MCP: Set GitHub Token

Option B: NPX — already configured in .vscode/mcp.json.template. Just needs a GitHub PAT with public_repo scope (prompted on first use).

Tools (34): Schema intelligence, query validation, GitHub search, ASIM support for 331+ tables.

5. Microsoft Learn MCP Server

📖 Installation Guide

One-click: Install in VS Code — or already configured in .vscode/mcp.json.template.

Tools: microsoft_docs_search, microsoft_docs_fetch, microsoft_code_sample_search

No API key required — free, cloud-hosted by Microsoft.

6. Azure MCP Server

📖 Installation Guide

Install via VS Code extension: search "Azure MCP Server" in Extensions, or install from the Marketplace. The extension registers as a stdio MCP server automatically.

Tools: monitor_workspace_log_query, monitor_activitylog_list, group_list, subscription_list, and 40+ namespaces covering AI, identity, security, databases, storage, compute, and networking.

Permissions:

Reader (minimum) — read-only access to Azure resources
Log Analytics Reader — for workspace_log_query (KQL against Log Analytics)
Contributor — for write/modify operations (optional)

Configuration: Requires azure_mcp parameters in config.json (tenant, subscription, resource group, workspace name) to avoid cross-tenant auth errors. See Configure Environment.

7. Sentinel Graph MCP Server ⚠️ Private Preview

Note: Sentinel Graph is currently in private preview and not available to all customers. If your tenant does not have access, this server will fail to connect — you can safely remove it from .vscode/mcp.json. See the announcement blog post for details and enrollment.

Tools: Entity graph exploration and relationship queries.

Permissions:

Sentinel Reader (minimum)

Pre-configured in .vscode/mcp.json.template. Browser-based Entra ID login on first use.

Verify Setup

Open Copilot Chat (Ctrl+Shift+I) in Agent mode and try these prompts:

Test	Prompt to type in Copilot Chat
Sentinel Data Lake	`List my Sentinel workspaces`
Microsoft Graph	`Look up my user profile in Graph`
Sentinel Triage	`List recent security incidents`
KQL Search	`What columns does the SigninLogs table have?`
Microsoft Learn	`Search Microsoft docs for KQL query language`
All skills	`What investigation skills do you have access to?`

If any server fails, check the MCP Servers panel in VS Code (click the {} icon in the bottom status bar) to verify each server shows a green connected status.

⚙️ Configuration Details

API Rate Limits (IP Enrichment)

Provider	Free Tier	With Token
ipinfo.io	1,000/day (geo, org, ASN)	50,000/month; paid plans include VPN detection
AbuseIPDB	1,000/day	10,000/day ($20/month)
vpnapi.io	1,000/month	10,000/month ($9.99/month)
Shodan	InternetDB (unlimited, ports/vulns/tags)	$49 one-time membership: 100 queries/month (adds services, banners, SSL, OS)

Token priority: If ipinfo_token is a paid plan, VPN detection is included and vpnapi_token is optional. Shodan uses the full API when a paid key is available; on 403/429 it automatically falls back to the free InternetDB.

IP enrichment happens during report generation (not data collection), so you can re-generate reports without re-querying Sentinel/Graph.

Dependencies

pip install -r requirements.txt

Core packages: requests (HTTP client for enrichment APIs), python-dateutil (date parsing for KQL time ranges).

🔒 Security Considerations

Confidential Data — Reports contain PII and sensitive security data. Mark as CONFIDENTIAL and follow organizational data classification policies.
Access Control — Restrict access to authorized SOC personnel. Use Azure RBAC for Sentinel, PIM for Graph API permissions.
Audit Trail — All investigations are timestamped. JSON files in temp/ preserve snapshots; HTML reports include generation metadata.
Data Retention — Investigations older than 3 days are auto-deleted (configurable). Archive important investigations before cleanup.
API Token Security — Never commit config.json with tokens (already in .gitignore). Use environment variables or Azure Key Vault for production.
Investigation JSON Files — Stored in temp/ (not committed to Git). Contain complete data including IP enrichment. Can be re-analyzed without re-querying.

🛠️ Troubleshooting

Issue	Solution
"No anomalies found"	`Signinlogs_Anomalies_KQL_CL` table doesn't exist or has no data. See user-investigation skill docs. Wait 24h for initial population.
"IP enrichment failed"	ipinfo.io rate limits (1K/day free). Add token to `config.json` for 50K/month.
"MCP server not available"	Check VS Code MCP server config. Verify authentication tokens are valid.
"User ID not found" (Graph)	Verify UPN is correct. Check Graph permissions: User.Read.All.
"Sentinel query timeout"	Reduce date range. Add `\| take 10` to limit results.
Report generation fails	Validate JSON: `python -m json.tool temp/investigation_*.json`. Check required fields.
SecurityIncident returns 0 results	Use BOTH `targetUPN` and `targetUserId` (Object ID). Some incidents use Object ID.
Risky sign-ins 404	Must use `/beta` endpoint, not `/v1.0`.

Verify Connectivity

In Copilot Chat (Agent mode):

"List my Sentinel workspaces" — verifies Sentinel Data Lake MCP
"Look up user@domain.com in Graph" — verifies Graph MCP
"List recent incidents" — verifies Sentinel Triage MCP

In terminal:

python enrich_ips.py 8.8.8.8    # Verifies IP enrichment API tokens

📄 License

This project is licensed under the MIT License. Use it, fork it, adapt it for your SOC — just keep the copyright notice.

🙏 Acknowledgments

Microsoft Security Platform

Microsoft Sentinel — SIEM/SOAR platform powering all KQL queries and incident management
Microsoft Defender XDR — Advanced Hunting, device telemetry, vulnerability management
Microsoft Graph API — Entra ID identity data, user/group management, role assignments
Microsoft Entra ID Protection — Risk detections, risky sign-ins, user risk states

MCP Servers

Sentinel Data Lake MCP — KQL query execution against Sentinel workspace
Sentinel Triage MCP — Incident triage, Advanced Hunting, Defender entity APIs
MCP Server for Microsoft Graph — Graph API endpoint discovery and execution
KQL Search MCP — Schema intelligence, GitHub KQL query discovery, ASIM validation
Microsoft Learn MCP — Official documentation search and code sample retrieval

Threat Intelligence APIs

ipinfo.io — IP geolocation, ISP/ASN identification, hosting provider detection
vpnapi.io — VPN, proxy, Tor exit node, and relay detection
AbuseIPDB — Community-sourced IP abuse scoring and recent attack reports
Shodan — Open port enumeration, service/banner detection, CVE identification, infrastructure tagging

Development Tools

GitHub Copilot — AI coding assistant powering the natural language investigation interface
VS Code Agent Skills — Modular skill framework for specialized investigation workflows
Model Context Protocol (MCP) — Open protocol connecting LLMs to external data sources and tools

Special thanks to the Microsoft Security community for sharing KQL queries and detection logic, and to stefanpems for the Sentinel incident commenting MCP pattern.

Version	Changes	Urgency	Date
main@2026-06-04	Latest activity on main branch	High	6/4/2026
0.0.0	No release found — using repo HEAD	High	4/8/2026

security-investigator

Description

README

🔒 Security Investigation Automation System

Quick Start (TL;DR)

Architecture Overview

🤖 Agent Skills

Available Skills (25)

How Skills Work

Triggering Skills with Natural Language

Follow-ups and Chaining

Discovering Skills

Authoring New Skills & Queries from Investigations

📁 Project Structure

Query Library (queries/)

Discovery Manifest (.github/manifests/)

🚀 Setup

Prerequisites

1. Install Dependencies

2. Configure Environment

API Tokens (.env file)

3. Configure MCP Servers

4. Build MCP Apps (Optional — Visualization Skills)

🔌 MCP Server Setup

At a Glance

1. Microsoft Sentinel MCP Server

2. MCP Server for Microsoft Graph

3. Microsoft Sentinel Triage MCP Server

4. KQL Search MCP Server

5. Microsoft Learn MCP Server

6. Azure MCP Server

7. Sentinel Graph MCP Server ⚠️ Private Preview

Verify Setup

⚙️ Configuration Details

API Rate Limits (IP Enrichment)

Dependencies

🔒 Security Considerations

🛠️ Troubleshooting

Verify Connectivity

📄 License

🙏 Acknowledgments

Microsoft Security Platform

MCP Servers

Threat Intelligence APIs

Development Tools

Release History

Dependencies & License Audit

Similar Packages

More in MCP Servers

Query Library (`queries/`)

Discovery Manifest (`.github/manifests/`)

API Tokens (`.env` file)