Configuration

The Huntsman API is configured through a set of YAML files located in the huntsman/config directory. This guide acts as a complete reference for modifying these files to add new capabilities or services.

Note

Changes to configuration files usually require a restart of the application or a call to the reload endpoint (if configured) to take effect.

API Recipes

File: api_recipes.yaml

This is the core configuration file for Huntsman. It defines how the system connects to external APIs (like VirusTotal, Shodan, or your own internal tools). The structure is hierarchical: Service -> Endpoint -> Identifier Type.

Top-Level Keys

For each service (e.g., virustotal), the following keys are available:

| Key | Required | Description |
|---|---|---|
| enabled | No | Boolean (true/false). Set to false to disable the entire service. Defaults to true. |
| base_url | Yes | The root URL of the API (e.g., https://api.example.com/v1). |
| auth | No | Authentication configuration object. See Authentication below. |
| static_headers | No | A dictionary of headers to send with every request (e.g., Accept: application/json). |
| endpoints | Yes | A dictionary mapping Identifier Types to their specific API calls. |

Authentication

The auth block tells Huntsman where to inject the API key on each request. It requires a type (header or param) and a config object.

Type: Header

Injects the API key into a request header.

auth:
  type: "header"
  config:
    header_name: "X-API-KEY" # The header expected by the API

Type: Param

Injects the API key into the query string parameters.

auth:
  type: "param"
  config:
    param_name: "key" # The query param expected (e.g., ?key=XYZ)
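
As a rough illustration of the two injection modes, the sketch below applies an auth block to a request's headers and query parameters. The helper name apply_auth and the way the API key is passed in are assumptions made for this example; they are not taken from Huntsman's code.

```python
def apply_auth(auth, api_key, headers=None, params=None):
    """Return (headers, params) with the API key injected per the auth block.

    Hypothetical helper: it mirrors only the 'header' and 'param' types.
    """
    headers = dict(headers or {})
    params = dict(params or {})
    if not auth:
        return headers, params  # the auth block is optional
    auth_type = auth["type"]
    config = auth["config"]
    if auth_type == "header":
        headers[config["header_name"]] = api_key
    elif auth_type == "param":
        params[config["param_name"]] = api_key
    else:
        raise ValueError(f"unsupported auth type: {auth_type!r}")
    return headers, params


header_auth = {"type": "header", "config": {"header_name": "X-API-KEY"}}
param_auth = {"type": "param", "config": {"param_name": "key"}}

print(apply_auth(header_auth, "XYZ"))  # key lands in the headers
print(apply_auth(param_auth, "XYZ"))   # key lands in the query parameters
```

The auth block names only where the key goes. How Huntsman obtains the key itself is outside the scope of this file.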

Endpoint Configuration

The endpoints block maps a specific Identifier Type (what you are analyzing) to an API action.

Supported Identifier Types: ipv4-addr, ipv6-addr, domain-name, url, file, sha256, md5, sha1, email-addr, mac-addr, vulnerability, software.

Endpoint Keys:

| Key | Required | Description |
|---|---|---|
| enabled | No | Boolean (true/false). Set to false to disable this specific endpoint. Defaults to true. |
| method | Yes | HTTP method: GET, POST, PUT, etc. |
| path_template | Yes | The URL path. Supports variable injection (see below). |
| params_template | No | Query parameters. Supports simple strings or advanced objects (see Advanced Parameters). |
| body_template | No | A dictionary defining the request body structure. Supports variable injection. |
| encoding | No | Request body encoding: json (default) or form (application/x-www-form-urlencoded). |
| db_pool | Yes | The name of the SuperDB pool where results will be stored. |
| ratelimit | No | Format: <requests>/<period> (e.g., "1/5s", "100/d"). |
| pivots | No | A mapping of relationships to extract. See Pivots below. |

Advanced Parameters & Templating

Values in path_template, params_template, and body_template support dynamic variable injection.

Available Variables:

  • {identifier}: The artifact being analyzed (e.g., “1.2.3.4”).

  • {today}: The current date in YYYY-MM-DD format.

  • {start_date}: The date 30 days prior to today in YYYY-MM-DD format.
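
The three variables can be pictured with the following minimal sketch, which builds them for a fixed date and substitutes them into a path and a body. The helper names and the /ip_addresses path are hypothetical, and the real substitution logic may differ.

```python
from datetime import date, timedelta


def template_variables(identifier, today=None):
    """Build the three variables listed above (hypothetical helper)."""
    today = today or date.today()
    return {
        "identifier": identifier,
        "today": today.isoformat(),  # YYYY-MM-DD
        "start_date": (today - timedelta(days=30)).isoformat(),
    }


def render(template, variables):
    """Substitute {name} placeholders in strings, recursing into dicts."""
    if isinstance(template, dict):
        return {key: render(value, variables) for key, value in template.items()}
    if isinstance(template, str):
        for name, value in variables.items():
            template = template.replace("{" + name + "}", value)
        return template
    return template  # non-string values pass through unchanged


variables = template_variables("1.2.3.4", today=date(2023, 10, 25))
print(render("/ip_addresses/{identifier}", variables))
print(render({"query": "{identifier}", "from": "{start_date}"}, variables))
```

The sketch uses plain string replacement rather than str.format so that unrelated braces in a template are left alone.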

Base64 Transformation:

You can base64-encode parameter values (URL-safe) by using a dictionary with a transform key in params_template.

params_template:
  raw_id: "{identifier}"
  # Example: Generates ?token=BASE64(2023-10-25:1.2.3.4)
  token:
    transform: "base64"
    template: "{today}:{identifier}"
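
To make the transform concrete, here is a small sketch of how one params_template entry might be rendered. The function name render_param is hypothetical. The sketch also assumes that the template is filled in before encoding and that base64 padding is kept; neither detail is confirmed by this guide.

```python
import base64


def render_param(value, variables):
    """Render one params_template entry: a plain string or a transform object."""
    if isinstance(value, dict):
        text = value["template"].format(**variables)
        if value.get("transform") == "base64":
            # URL-safe alphabet: '-' and '_' replace '+' and '/'
            return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")
        return text
    return value.format(**variables)


variables = {"identifier": "1.2.3.4", "today": "2023-10-25"}
params_template = {
    "raw_id": "{identifier}",
    "token": {"transform": "base64", "template": "{today}:{identifier}"},
}
params = {name: render_param(value, variables) for name, value in params_template.items()}
print(params)
```

Decoding the token with base64.urlsafe_b64decode recovers the original "2023-10-25:1.2.3.4" string, which is a quick way to check a recipe against what the target API expects.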

Request Body (JSON vs Form):

For POST or PUT requests, you can define a body.

# JSON Example (Default)
method: "POST"
body_template:
  query: "{identifier}"
  from: "{start_date}"

# Form Data Example
method: "POST"
encoding: "form"
body_template:
  id: "{identifier}"
  format: "xml"
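
The difference between the two encodings is easiest to see on the wire. The sketch below serializes an already-rendered body both ways using only the Python standard library. The encode_body helper is an illustration, not Huntsman's implementation.

```python
import json
from urllib.parse import urlencode


def encode_body(body, encoding="json"):
    """Return (content_type, payload) for a rendered body_template."""
    if encoding == "json":
        return "application/json", json.dumps(body)
    if encoding == "form":
        return "application/x-www-form-urlencoded", urlencode(body)
    raise ValueError(f"unsupported encoding: {encoding!r}")


json_body = {"query": "1.2.3.4", "from": "2023-09-25"}
form_body = {"id": "1.2.3.4", "format": "xml"}

print(encode_body(json_body))                   # JSON object as text
print(encode_body(form_body, encoding="form"))  # key=value pairs joined by '&'
```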

Pivots (Extraction Logic)

Pivots define how Huntsman “learns” from the data. They tell the system: “When you get a result from this API, run this query to find related artifacts.”

Format: <Target Identifier Type>: "<SuperQL Query>"

The SuperQL query operates on the JSON response from the API.

  • yield: Returns a value.

  • cut: Selects specific fields.

  • over: Iterates over a list.

Example:

# API Response: {"data": {"attributes": {"last_dns_records": [{"type": "A", "value": "1.2.3.4"}]}}}

pivots:
  # Extract 'A' records and treat them as new IPv4 addresses to investigate
  ipv4-addr: "over data.attributes.last_dns_records | type=='A' | yield value"
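
For readers new to SuperQL, the following plain-Python sketch produces the result the pivot above is intended to yield. It only illustrates the expected outcome and says nothing about how SuperDB evaluates the query. The MX record is added here to show the filter at work.

```python
def extract_a_records(response):
    """Plain-Python equivalent of:
    over data.attributes.last_dns_records | type=='A' | yield value
    """
    records = response.get("data", {}).get("attributes", {}).get("last_dns_records", [])
    return [record["value"] for record in records if record.get("type") == "A"]


response = {
    "data": {
        "attributes": {
            "last_dns_records": [
                {"type": "A", "value": "1.2.3.4"},
                {"type": "MX", "value": "mail.example.com"},  # filtered out
            ]
        }
    }
}

print(extract_a_records(response))
```

Each value returned this way would become a new ipv4-addr artifact to investigate.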

Full Recipe Example

virustotal:
  enabled: true
  base_url: "https://www.virustotal.com/api/v3"
  auth:
    type: "header"
    config:
      header_name: "x-apikey"
  endpoints:
    domain-name:
      enabled: true
      method: "GET"
      path_template: "/domains/{identifier}"
      db_pool: "virustotal"
      ratelimit: "1/5s"
      pivots:
        email-addr: "yield data.attributes.whois"

    # Example of advanced parameters and form encoding
    ipv4-addr:
      method: "POST"
      path_template: "/search"
      encoding: "form"
      params_template:
        # Base64 encode the identifier for a specific query param
        q:
          transform: "base64"
          template: "{identifier}"
      body_template:
        source: "huntsman"
        date: "{today}"
      db_pool: "virustotal"
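
The Service -> Endpoint -> Identifier Type hierarchy can be read as a two-step lookup. The sketch below walks a trimmed copy of the recipe, honours the enabled flags (which default to true), and builds the request URL. The resolve_request helper is hypothetical, and the sketch does not URL-encode the identifier because this guide does not say whether Huntsman does.

```python
recipes = {
    "virustotal": {
        "base_url": "https://www.virustotal.com/api/v3",
        "endpoints": {
            "domain-name": {
                "method": "GET",
                "path_template": "/domains/{identifier}",
                "db_pool": "virustotal",
            },
        },
    },
}


def resolve_request(recipes, service, identifier_type, identifier):
    """Return (method, url) for a lookup, or None if it is disabled or undefined."""
    recipe = recipes.get(service)
    if recipe is None or not recipe.get("enabled", True):
        return None
    endpoint = recipe["endpoints"].get(identifier_type)
    if endpoint is None or not endpoint.get("enabled", True):
        return None
    path = endpoint["path_template"].replace("{identifier}", identifier)
    return endpoint["method"], recipe["base_url"].rstrip("/") + path


print(resolve_request(recipes, "virustotal", "domain-name", "example.com"))
print(resolve_request(recipes, "virustotal", "url", "http://example.com"))  # no such endpoint
```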

Ratelimiting

To prevent abuse and respect the terms of service of external APIs, you can add a ratelimit setting to any endpoint in api_recipes.yaml.

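Format: <requests>/<period>

  • <requests>: Maximum number of requests allowed within the period.

  • <period>: Time window, written as an optional count followed by a unit (s = seconds, m = minutes, h = hours, d = days). For example, 5s is five seconds and d is one day.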

Example: ratelimit: "500/d" (500 calls per day).
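
A minimal sketch of how such a value could be parsed into a request budget and a window length in seconds follows. It assumes that a number before the unit is a multiplier (so "1/5s" means one request every five seconds), which is inferred from the examples in this guide. The parser itself is illustrative, not Huntsman's.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
RATELIMIT_RE = re.compile(r"(\d+)/(\d*)([smhd])")


def parse_ratelimit(value):
    """Parse a ratelimit string into (max_requests, window_in_seconds)."""
    match = RATELIMIT_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"invalid ratelimit: {value!r}")
    requests, count, unit = match.groups()
    # A missing count means a single unit, e.g. "d" is one day
    return int(requests), int(count or 1) * UNIT_SECONDS[unit]


for setting in ("1/5s", "100/d", "500/d"):
    print(setting, parse_ratelimit(setting))
```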

Internal Services

File: internal_services_recipe.yaml

This file defines services that query your local SuperDB instance instead of an external API. This is useful for “Context” lookups (e.g., “Have we seen this IP in our RSS feeds?”).

Structure:

rss-news:
  label: "RSS News Context"
  endpoints:
    ipv4-addr:
      # Standard SuperQL query. {identifier} is injected automatically.
      query_pattern: "from 'rss-news' | search '{identifier}' | head 1"
      pivots:
        all: "yield full_text"

IOC Patterns

File: ioc_patterns.yaml

This file contains the regular expressions used to automatically detect the type of an artifact (e.g., distinguishing an IP address from a domain).

ipv4-addr: \b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b
email-addr: \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
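
The sketch below shows one way such patterns could be used to classify an artifact. It assumes the whole artifact must match a pattern and that patterns are tried in file order; both are assumptions, as is the detect_type name. The email pattern writes its final character class as [A-Za-z], since a | inside a character class would match a literal pipe.

```python
import re

IOC_PATTERNS = {
    "ipv4-addr": r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b",
    "email-addr": r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",
}


def detect_type(artifact):
    """Return the first identifier type whose pattern matches the whole artifact."""
    for identifier_type, pattern in IOC_PATTERNS.items():
        if re.fullmatch(pattern, artifact):
            return identifier_type
    return None  # no pattern matched


for artifact in ("1.2.3.4", "analyst@example.com", "example.com"):
    print(artifact, "->", detect_type(artifact))
```

Note that the IPv4 pattern checks only the shape of the address, so a string such as 999.999.999.999 is also classified as ipv4-addr.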

Scraping Recipes

File: scraping_recipes.yaml

Defines how to extract intelligence from raw HTML pages.

example_blog:
  base_url: https://example.com
  endpoints:
    post:
      path_template: /blog/{identifier}
      db_pool: example_blog_posts
      data_to_extract:
        # Format: <field name>: <XPath expression>
        title: "//h1/text()"
        content: "//div[@class='post-content']"

RSS Recipes

File: rss_recipes.yaml

Configures RSS feeds for ingestion.

krebs_on_security:
  url: https://krebsonsecurity.com/feed/
  db_pool: krebs