OPA: logical or conditions

because an AND isn’t always the end of it

This one is like riding a bicycle. Once you know it, you know it. I’ve been going down some Kubernetes rabbit holes and I’ve landed on OPA - Open Policy Agent.

The Open Policy Agent (OPA, pronounced “oh-pa”) is an open source, general-purpose policy engine that unifies policy enforcement across the stack. OPA provides a high-level declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from your software. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more.

Basically, it’s a declarative decision engine which evaluates Rego policies for a given input.

Rego was inspired by Datalog, which is a well understood, decades old query language. Rego extends Datalog to support structured document models such as JSON.

Rego queries are assertions on data stored in OPA. These queries can be used to define policies that enumerate instances of data that violate the expected state of the system.

Rego comes packed with some very neat features.

The OPA project comes with extensive, easy-to-follow, and brief (good!!) documentation. However, there was one question I could not find an answer to: how do I express:

(if this OR this) AND that

Here’s my semi-realistic problem:

  • given a Go struct with a bunch of string properties
  • if a user reading the struct:
    • belongs to an HR role
      • they can see the values of all fields, regardless of whether a field is restricted
    • belongs to any other role
      • they can see non-restricted fields only
    • and, in either case, ONLY when the user is enabled

For the following structure:

type Person struct {
    first_name    string
    last_name     string
    email_address string // restricted
    address1      string // restricted
    address2      string // restricted
    postal_code   string // restricted
    city          string
}

anyone who’s not a member of the HR team should see ******** instead of the actual values of the restricted fields.
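
The masking itself happens outside the policy: the policy only answers yes or no for a given field. A minimal Go sketch of that last step is shown here, assuming a hypothetical `maskedView` helper and a hypothetical `canRead` callback that stands in for the policy decision (neither is part of OPA):

```go
package main

import "fmt"

// mask is the placeholder shown instead of a value the user may not read.
const mask = "********"

// maskedView returns a copy of fields in which every value the user is not
// allowed to read is replaced by the mask. canRead is a hypothetical callback
// standing in for the per-field policy decision.
func maskedView(fields map[string]string, canRead func(field string) bool) map[string]string {
	out := make(map[string]string, len(fields))
	for name, value := range fields {
		if canRead(name) {
			out[name] = value
		} else {
			out[name] = mask
		}
	}
	return out
}

func main() {
	person := map[string]string{
		"first_name":    "Jane",
		"email_address": "jane@example.com",
	}
	// Pretend the policy only allows reading first_name for this user.
	view := maskedView(person, func(field string) bool { return field == "first_name" })
	fmt.Println(view["first_name"], view["email_address"])
}
```

The field names are passed as strings so that they line up with the `field` value sent to the policy.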

There are two known facts while evaluating this policy:

  1. The list of restricted fields: email_address, address1, address2, and postal_code.
  2. The input. An example looks like this:
{
    "field": "email_address",
    "subject": {
        "enabled": true,
        "groups": ["employee", "hr"]
    }
}

and:

{
    "field": "email_address",
    "subject": {
        "enabled": true,
        "groups": ["employee", "chef"]
    }
}
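
On the Go side, the two input documents above could be modelled with a pair of small structs. This is only a sketch: the type names `Input` and `Subject` and the `parseInput` helper are made up for illustration, while the JSON keys are the ones from the examples.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Subject describes the user asking to read a field.
type Subject struct {
	Enabled bool     `json:"enabled"`
	Groups  []string `json:"groups"`
}

// Input is the document handed to the policy for one field read.
type Input struct {
	Field   string  `json:"field"`
	Subject Subject `json:"subject"`
}

// parseInput decodes a raw JSON input document into an Input value.
func parseInput(raw string) (Input, error) {
	var in Input
	err := json.Unmarshal([]byte(raw), &in)
	return in, err
}

func main() {
	raw := `{"field": "email_address", "subject": {"enabled": true, "groups": ["employee", "hr"]}}`
	in, err := parseInput(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", in)
}
```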

Okay, let’s start with the policy:

package play

# By default the decision is false:
default can_read_restricted_field = false

# The list of restricted fields is known:
restricted_fields := {"email_address", "address1", "address2", "postal_code"}

is_restricted_field {
    # A field is restricted if it is found in the `restricted_fields` set.
    restricted_fields[input.field]
}

can_read_restricted_field_when_enabled {
    # Not complete yet...
    input.subject.enabled
}

is_hr {
    # An employee is a member of the HR team if one
    # of their groups is `hr`.
    input.subject.groups[_] = "hr"
}
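
For readers more at home in Go, the two helper rules amount to a set lookup and a scan over the groups. The following is a rough equivalent, not how OPA evaluates them internally; the function names are hypothetical and simply mirror the rule names.

```go
package main

import "fmt"

// restrictedFields plays the role of the `restricted_fields` set in the policy.
var restrictedFields = map[string]struct{}{
	"email_address": {},
	"address1":      {},
	"address2":      {},
	"postal_code":   {},
}

// isRestrictedField mirrors `is_restricted_field`: a set membership test.
func isRestrictedField(field string) bool {
	_, ok := restrictedFields[field]
	return ok
}

// isHR mirrors `is_hr`: true if any of the groups equals "hr".
func isHR(groups []string) bool {
	for _, g := range groups {
		if g == "hr" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isRestrictedField("email_address"))
	fmt.Println(isHR([]string{"employee", "chef"}))
}
```

The loop in `isHR` is what `input.subject.groups[_]` expresses in one line: the rule holds if some element of the array satisfies the condition.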

I want this:

  • If the employee works in HR, there is no need to check if a field is restricted.
  • Otherwise (if not is_hr), the field can be accessed only when it isn’t restricted.

The first condition is easy:

can_read_restricted_field {
    is_hr
}

The second one is also easy:

can_read_restricted_field {
    not is_hr
    not is_restricted_field
}

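What wasn’t clear to me was how to combine these two conditions into a single logical OR. It turns out to be really easy: when several rules share the same name, Rego treats their bodies as alternatives, so the rule is true if any one body is true. The full policy looks like this: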

package play

# By default the decision is false:
default can_read_restricted_field = false
default can_read_restricted_field_when_enabled = false

# The list of restricted fields is known:
restricted_fields := {"email_address", "address1", "address2", "postal_code"}

can_read_restricted_field {
    is_hr
}

can_read_restricted_field {
    not is_hr
    not is_restricted_field
}

can_read_restricted_field_when_enabled {
    can_read_restricted_field
    input.subject.enabled
}

is_restricted_field {
    # A field is restricted if it is found in the `restricted_fields` set.
    restricted_fields[input.field]
}

is_hr {
    # An employee is a member of the HR team if one
    # of their groups is `hr`.
    input.subject.groups[_] = "hr"
}

It essentially boils down to this pattern (rule bodies left out for brevity):

or_rule {}
or_rule {}

some_other_rule {}

or_and_others {
    or_rule
    some_other_rule
}

For can_read_restricted_field_when_enabled, this policy returns:

  • false for disabled hr and field email_address
  • false for disabled hr and field first_name
  • true for enabled hr, any field
  • false for enabled chef, any field from the restricted list
  • true for enabled chef, any field outside of the restricted list
  • false for disabled chef, any field outside of the restricted list
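
The outcomes listed above can be double-checked without OPA by restating the policy as a single boolean expression. This is a sketch with a hypothetical `canReadWhenEnabled` function; it drops the `not is_hr` from the second rule body, since `is_hr OR (not is_hr AND not restricted)` is logically the same as `is_hr OR not restricted`.

```go
package main

import "fmt"

// canReadWhenEnabled restates the policy as one expression:
// (is_hr OR not is_restricted_field) AND enabled.
func canReadWhenEnabled(isHR, restricted, enabled bool) bool {
	return (isHR || !restricted) && enabled
}

func main() {
	cases := []struct {
		name                      string
		isHR, restricted, enabled bool
		want                      bool
	}{
		{"disabled hr, email_address", true, true, false, false},
		{"disabled hr, first_name", true, false, false, false},
		{"enabled hr, restricted field", true, true, true, true},
		{"enabled hr, unrestricted field", true, false, true, true},
		{"enabled chef, restricted field", false, true, true, false},
		{"enabled chef, unrestricted field", false, false, true, true},
		{"disabled chef, unrestricted field", false, false, false, false},
	}
	for _, c := range cases {
		got := canReadWhenEnabled(c.isHR, c.restricted, c.enabled)
		fmt.Printf("%-36s got=%v want=%v\n", c.name, got, c.want)
	}
}
```

Each row corresponds to one of the outcomes in the list, with "any field" for enabled hr split into a restricted and an unrestricted case.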

That’s it, pretty cool. Nothing that cannot be done with a bunch of conditionals and loops, but on highly variable input, iterating on the rules with Rego will be much faster!

Closing thought

Turns out to be completely unrelated:

  • Rego is inspired by Datalog.
  • Datalog is a subset of Prolog.
  • Prolog would be a perfect language to implement Google’s Zanzibar in.
  • Why did the Keto project drop the OPA route?