Function to placeholder assignment

In YARA-L, function to placeholder assignments involve using functions to process and assign values to placeholder variables. Placeholder variables (denoted by $) are used to represent specific data points extracted from UDM events. Functions can then operate on these event fields, or combinations of fields, and the results of these operations are assigned to the placeholder variables for use in other parts of the query, such as the match, condition, and outcome sections.

Three concepts are involved:

  • Functions: YARA-L provides various built-in functions, including aggregate functions (such as count_distinct, count, array_distinct, array, and group) and metric functions, which perform calculations or data transformations on event fields.
  • Placeholder variables: These are variables denoted by a preceding dollar sign (such as $user or $ip) that hold values derived from event fields or function outputs. They are defined in the events section and can be used throughout the rule.
  • Assignment: The core concept is assigning the output of a function to a placeholder variable. This is typically done in the outcome section for generating statistical values.
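
As a minimal sketch of these three pieces working together, the rule below assigns the result of strings.to_lower() to a placeholder in the events section and then uses that placeholder in the match section. The rule name, the USER_LOGIN event type, the 5-minute window, and the threshold are hypothetical values chosen for illustration only.

rule function_to_placeholder_sketch {
  meta:
    description = "Illustrative sketch only; values are hypothetical"

  events:
    $e.metadata.event_type = "USER_LOGIN"
    // Assign the function output to the placeholder $host.
    $host = strings.to_lower($e.principal.hostname)

  match:
    // Group events by the normalized hostname (window is an assumption).
    $host over 5m

  outcome:
    // Aggregate over the events in each match group.
    $login_count = count_distinct($e.metadata.id)

  condition:
    $e and $login_count > 5
}

Normalizing the hostname before matching means that events whose hostnames differ only in letter case fall into the same match group.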

Example: Group fields into a placeholder

Consider the group() function, which groups fields of the same type into a placeholder.

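rule ExampleRule {
  meta:
    description = "Counts distinct private IPs seen per principal host"

  events:
    $e.metadata.event_type = "NETWORK_CONNECTION"
    $e.principal.ip = /192\.168\.\d+\.\d+/ nocase
    $e.target.ip = /192\.168\.\d+\.\d+/ nocase
    // Placeholders are assigned in the events section.
    $ip = group($e.principal.ip, $e.target.ip)
    // Assumed grouping key for the match section; adjust as needed.
    $hostname = $e.principal.hostname

  match:
    // The match section needs a time window; 5m is an assumed value.
    $hostname over 5m

  outcome:
    $distinct_ips = count_distinct($ip)

  condition:
    $e and $distinct_ips > 10
}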

In this example:

  • The group() function is used to gather all IP addresses from principal.ip and target.ip fields across events.
  • The result of group() is assigned to the placeholder variable $ip.
  • Then, count_distinct($ip) calculates the number of unique IP addresses and assigns it to $distinct_ips.
  • Finally, $distinct_ips is used in the condition section to trigger the rule if more than 10 distinct IPs are found.

This demonstrates how functions are used to process data and assign the results to placeholder variables for further analysis or rule conditions.

Limitations

There are two limitations when using function to placeholder assignment:

  1. Every placeholder in function to placeholder assignment must be assigned to an expression containing an event field.

    Valid examples

    $ph1 = $e.principal.hostname
    $ph2 = $e.src.hostname
    
    // Both $ph1 and $ph2 have been assigned to an expression containing an event field.
    $ph1 = strings.concat($ph2, ".com")
    
    $ph1 = $e.network.email.from
    $ph2 = strings.concat($e.principal.hostname, "@gmail.com")
    
    // Both $ph1 and $ph2 have been assigned to an expression containing an event field.
    $ph1 = strings.to_lower($ph2)
    

    Invalid example

    $ph1 = strings.concat($e.principal.hostname, "foo")
    $ph2 = strings.concat($ph1, "bar") // $ph2 has NOT been assigned to an expression containing an event field.
    
  2. The function call should depend on exactly one event. However, more than one field from the same event can be used in function call arguments.

    Valid examples

    $ph = strings.concat($event.principal.hostname, "string2")

    $ph = strings.concat($event.principal.hostname, $event.src.hostname)

    Invalid examples

    $ph = strings.concat("string1", "string2")

    $ph = strings.concat($event.principal.hostname, $anotherEvent.src.hostname)
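
The invalid assignments above could be rewritten along the following lines so that they satisfy both limitations. This is a sketch rather than a definitive fix: the event variable names follow the examples above, and it assumes that strings.concat accepts more than two arguments and that assigning the same placeholder from two events behaves as a join between them. Check both assumptions against the function reference before relying on them.

    // Limitation 1: put the event field directly in the expression
    // instead of chaining through another function-derived placeholder.
    $ph2 = strings.concat($e.principal.hostname, "foo", "bar")

    // Limitation 2: keep each function call tied to a single event.
    // Reusing $ph across the two assignments relates the two events.
    $ph = strings.to_lower($event.principal.hostname)
    $ph = strings.to_lower($anotherEvent.src.hostname)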
