Back in May, I looked at an implementation of Zanzibar-style ACLs in Open Policy Agent Rego[1].
I’m revisiting that problem while learning Prolog, another language from my never-ending to-learn list. There are some similarities to the previous article, but I’m taking the problem further: I’m adding permissions inheritance from anywhere within the filesystem tree.
table of contents
§the problem
In Zanzibar-style ACLs with OPA Rego1 I have implemented the ACLs for the canonical example from the Zanzibar whitepaper:
name: "doc"
relation { name: "owner" }
relation {
name: "editor"
userset_rewrite {
union {
child { _this {} }
child { computed_userset { relation: "owner" } }
} } }
relation {
name: "viewer"
userset_rewrite {
union {
child { _this {} }
child { computed_userset { relation: "editor" } }
child { tuple_to_userset {
tupleset { relation: "parent" }
computed_userset {
object: $TUPLE_USERSET_OBJECT # parent folder
relation: "viewer"
} } }
} } }
It reads like this:
- A document has owners.
- Document owners are also editors.
- Document editors are also document viewers.
- Furthermore, the viewers of a parent folder are also viewers of the document.
The notation I came up with:
doc:some-doc#viewer@doc:some-doc#editor@doc:some-doc#owner
doc:some-doc#viewer@folder:some-doc-parent-folder#viewer
You can find the complete reasoning behind this in the article linked above in this section.
§iterating on the problem
While solving this problem in Prolog, I’m going to take this a step further. ACLs can be inherited from any parent directory in the tree, up to the root directory.
Code examples will operate on UUIDs exclusively but in the article, I will be referring to files and directories by their names. The tree used in this example is small, so that it can be easily reasoned about:
data root, a bucket, UUID: 851a7566-b83d-4993-bd04-9b8244091d45
├-- a-data-item.png file, UUID: 9c57448c-e901-4fd3-a47e-b6ecf0470c5f
└-- documents directory, UUID: b94c1f40-818b-47be-882b-ce1a8fbee254
├-- a-document.md file, UUID: f666cc22-1e92-457b-b7ee-0f6b48825185
└-- pdfs directory, UUID: 7f916bd9-f324-40f2-97d3-95e7cc7d722e
├-- a-file.pdf file, UUID: bee100a0-30b1-4184-94aa-3d408b75e938
└-- another.pdf file, UUID: ecf47371-fd08-44f1-9c06-7c970d948ae8
A permission can be assigned to a file or directory. A permission assigned to a directory applies to all child directories and files. For example:
- An editor of data directory can edit and view: a-data-item.png, a-document.md, and a-file.pdf.
- A viewer of pdfs can view a-file.pdf only.
- A viewer of a-file.pdf can view that file only and no other file inside of the parent directory pdfs.
Permissions will be stored outside of filesystem objects and they are optional for an object. An object may or may not have a set of permissions attached. If an object does not have permissions attached, it’s okay, they are simply inherited from a directory above.
§the facts: filesystem
The filesystem is defined using the following facts:
A directory
fact has three properties:
- A name.
- An object ID.
- A parent object ID. The root directory does not have a parent and that fact is encoded by using
parent(objectid())
with value inobjectid()
.
A document has exactly one parent pointing at a directory
.
§the facts: permissions
Permissions are optional for an object hence there are less permissions than a total of directories and documents. A permission
fact contains an object ID it attaches to, and a list of permissions.
Permissions are represented as a list of [role₁, [identity₁, ...], ... roleₙ, [identityₙ, ...]]
.
From the set of facts, we can deduce the following:
- Berta is the data owner.
- Charlie is an editor of data.
- Alan is a viewer of pdfs.
- Jake can only view the a-file.pdf file.
§utilities
The program requires a bunch of utilities:
append(+L1, +L2, -OutList)
: appends a list to another list, for exampleappend([a, b], [c, d], -Out) → Out = [a, b, c, d]
.in_list(+Atom, +List)
: an implementation ofmember
, it checks if an item exists in the list.find_kv(+Atom, +List, -OutList)
: given a list of[k₁, v₁, k₂, v₂, ... kₙ, vₙ]
, findsvₙ
for requestedkₙ
, returns a default value if nothing found. The program here assumes thatvₙ
is a list.
Here’s the code:
§object permissions and inheritance
Now, I need a method of fetching object permissions and figuring out inheritance. Here’s the strategy:
- Querying for permissions of an object implies that three facts are known:
- An ID of the object.
- An identity, in this case, a username.
- A relation type: viewer, editor, owner.
- The problem statement says that an owner is an editor, and an editor is a viewer. What does this mean, logically speaking? Well, it simply means that for a user to be a viewer it doesn’t matter if the user is an editor or an owner.
Example:
|
|
The above declares a set of permissions for the data directory (the bucket). It says that Berta belongs to the owners and Charlie is an editor. To find out if Berta is an editor, the program needs to find a list of editors and a list of owners. Two lists have to be combined, the result includes berta, Berta is an editor. Regardless of the actual assigned role is. Finding out which role is actually assigned to Berta would imply another API.
Here’s the code:
The relevant part is object_permissions(+Id, +Roles, -CombinedRoles)
.
- This predicate recurses on
+Roles
list, which is given as[role₁, role₂, ... roleₙ]
. For each role, the following happens:- The program tries to get the permission list for an object ID.
- As we said, permissions for an object ID are optional, hence we have a guarding
object_permissions(_, _, []).
.- If there are no permissions for an ID, return an empty list.
- If permissions for the ID exist, call
find_kv(+Role, +Permissions, -RolePermissions)
. - Recurse on the
+RoleTail
. - Whatever comes back from
find_kv
under-RolePermissions
gets appended to the accumulator-Out
.
Let’s load it all up in Prolog and see it in action.
§SWI-Prolog
By far the easiest way to start with Prolog is SWI-Prolog. To install SWI-Prolog:
- On macOS:
brew install swi-prolog
. - On Debian-based Linux:
sudo apt-get install swi-prolog -y
.
The installation will take a bit but when finished, SWI-Prolog can be started from the terminal by executing swipl
command.
§trying out object permissions
Okay, we have Prolog available. Let’s set up a workspace and try out those permissions.
|
|
Start swipl:
|
|
Welcome to SWI-Prolog (threaded, 64 bits, version 8.4.3)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.
For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).
?-
Load all 4 files:
|
|
So, let’s get a list of all permitted owners, editors, and viewers for object 851a7566-b83d-4993-bd04-9b8244091d45.
|
|
Now, owners only:
|
|
Note: You will have to type the final dot after receiving the answer to come back to the Prolog ?-
shell.
Awesome, so this shows has to retrieve a combined list of users allowed to access an object at an expected level. The role list containing multiple roles, for example [owners, editors, viewers]
is a very simple method of saying that all owners belong to a set of owners+editors+viewers
.
Note2: To quit swipl back to the terminal shell, press CTRL+D.
Let’s move on to the inheritance.
§explaining inheritance
Let’s recap: given a file ID, the program will traverse the parents up to the root directory (the bucket). To solve this question, the following facts need to be known:
- The object ID for which we want to know if the user has the required permission.
- The required access level.
- The identity.
The traversal must continue only until:
- The root has been reached.
- Or until the user has been found in any inspected permission for the first time.
- If the user has been found to have permission before reaching the root, there is no point in traversing all the way to the root anymore.
This process is implemented in the recursive inherits/3
predicate. This predicate recurses over parent directories. There are three variants of this predicate defined:
- When reaching a root (above the root of the bucket), there is nothing else to do, if the user isn’t found to be in a role until now, they are not allowed to perform the requested role:
|
|
- When permissions returned from
optional_object_permissions/3
contain the searched username, returntrue
because the user is found. The recursion stops, thus the traversal stops.
|
|
- Otherwise, get the parent directory and recursively call
inherits/3
for the parent directory.
|
|
§predicate order
For this code, it is important to remember that the predicate order in Prolog is significant. The second predicate uses in_list/2
to check if the solution has been found, in which case, the recursion gets interrupted. The third predicate does not have a negating \+ in_list/2
call so the order is important.
Those three predicates roughly translate to:
if reached_root
false
else if user_found in permissions
true
else
fetch_parent
recurse
Unlike Datalog—a subset of Prolog—Prolog itself evaluates predicates in order. It will attempt to solve a query with the first occurrence of the predicate, if that predicate returns false
at any point, Prolog will retry answering the question with the next defined predicate that can be unified. This process repeats until all unifying predicates have been exhausted, at which point, false
is returned. Another major difference between Prolog and Datalog: Datalog programs always terminate, and infinite loops aren’t possible while they can happen in Prolog.
I’m mentioning Datalog because OPA Rego is a Datalog-inspired language.
§on unification
The in-detail explanation of unification goes beyond this article. I recommend the following article to learn more about unification in Prolog[2].
Two terms unify if they are the same term or if they contain variables that can be uniformly instantiated with terms in such a way that the resulting terms are equal.
§trying out inheritance
Let’s see how to find out if the user is allowed to become a certain role using inherits/3
.
Is Berta allowed to view a-file.pdf (object ID bee100a0-30b1-4184-94aa-3d408b75e938)? Remember, inherits means exactly that: does the user inherit a role from any parent directory? This means we cannot query directly for a file object ID just yet. So, in this case, we are querying for the parent directory ID: 7f916bd9-f324-40f2-97d3-95e7cc7d722e.
|
|
What about Charlie, is he an editor?
|
|
But is he an owner?
|
|
§tracing inherits/3
Let’s verify that our inherits/3
traverses the directory tree only until the user has been found. We can trace the execution of selected predicates using the built-in trace/1
predicate.
|
|
And retry the query for Berta:
|
|
We can see that the traversal went all the way up to the top data directory, where Berta’s permissions are attached. What about Alan? Alan is a viewer of pdfs. So if we wanted to find out if Alan can view a file inside of that directory, given that his permissions are attached directly to the pdfs directory, we should not see a full traversal to the root. Is that the case?
|
|
Indeed, the traversal finished early. Of course, the side effect of how traversal is implemented is that if we want to know if Alan is an owner of a file in the pdfs directory, we have to traverse all the way up to reach false
.
|
|
This could become a bottleneck if there are many, many nested directories but for now, I’m okay with that.
We can disable tracing now:
|
|
§tracing: the curious case of the predicate order
An interesting trivia regarding the inherits/3
predicate. Semantically, these two variants solve the same problem:
- Variant 1:
|
|
- Variant 2:
|
|
Variant 2 is more explicit. However, Variant 2 is also very inefficient:
|
|
Variant 1 is a much more efficient:
|
|
While I wasn’t able to understand why, I will come back to this problem in the future. This is a good opportunity to learn how to debug Prolog programs.
§querying file permissions directly
In the previous step, we had no ability to query for file permissions directly. Instead of a file ID, we had to use the ID of its parent directory. Let’s define an API to query for file permissions directly. There are two cases to consider:
- Permissions can be attached directly to a file. This is the case for Jake who can view a-file.pdf and nothing else.
- If there are no permissions attached directly to a file, resort to the parent traversal technique.
Here’s the code:
There’s a lot of duplication here. This API could be generalized, but for now, it will be sufficient. There are three different APIs depending on the access level we intend to query for.
Let’s look at one of them:
|
|
Remembering that the predicate order is important, querying for a document ID will first check if there are permissions defined for the file object ID. This can be considered an O(1) operation, pretty cheap in terms of computing so safe to start with. If permissions have been found, we check if the user has been found in the complete list for that object.
Otherwise, we revert to the inheritance check. The fallback happens in case of:
- There are no permissions attached directly to the file object.
- Permissions were found but used hasn’t been granted the requested role directly on the file object.
I like this example because it shows how efficient Prolog code can be. It translates to some thing like:
function run_inheritance_check(objectid, roles, user)
parentid = get_document(objectid)
return inherits(parentid, roles, user)
end function
if exist, permitted = object_permissions(objectid, roles); exist
if in_list(user, permitted)
return true
// else
return run_inheritance_check(objectid, roles, user)
else
return run_inheritance_check(objectid, roles, user)
end if
Note: It reminds me very much of Erlang. The first version of Erlang was implemented in Prolog and the influence is clearly visible to this day.
Let’s load this up in swipl.
First, download the API program from GitHub:
|
|
If your previous swipl session is still running, simply load the API program:
|
|
Let’s talk about Jake.
|
|
What about Alan?
|
|
Actually, that’s not true. Alan owns an old Volvo. But he doesn’t own any files so everything works as expected.
§summary
This program is 52 lines of code, half of it is a clumsy, non-general API. The other half contains:
- Almost general solver for any inheritance problem where an object has exactly one parent. Almost, because
inherits/3
directly referencesdirectory(X, Y, Z).
facts.- To generalize this solver for other fact types, one has to either:
- find a way to provide the type of a fact to query for,
- or create a general fact container.
- The second approach would probably be easier to implement.
- Frankly, the reason why I’m calling the fact a
directory
and adocument
is illustrative only.
- To generalize this solver for other fact types, one has to either:
- A general solver for any number of roles.
- Roles in permissions are represented as a list of
[role₁, [identity₁, ...], ... roleₙ, [identityₙ, ...]]
. - There can be any number of roles in that list.
- The API is where the knowledge of exact roles is required.
- Roles in permissions are represented as a list of
- The API could be generalized.
§thoughts on Prolog
This section is strongly inspired by the Prolog Wikipedia page[3].
Prolog, short for PROgramming in LOGic, is nothing like your regular programming language. It focuses on declarative style logic programming. Programs are expressed as relations represented by facts and rules. The computation is executed by running queries over these relations. The program applies predefined logic and evaluates to true/false (or yes/no, tomato, tomato).
That, however, doesn’t mean that Prolog cannot be applied to interesting problems. In fact, Prolog is widely used in artificial intelligence, theorem proving, expert systems, or—its original intended application—natural language processing.
Prolog’s heritage includes the research on theorem provers and other automated deduction systems developed in the 1960s and 1970s. The first Prolog was developed and implemented in Marseille, France, in 1972 by Alain Colmerauer and Philippe Roussel. Their work was based on Robert Kowalski’s procedural implementation of Horn clauses at the University of Edinburgh. The first implementation of Prolog was called Marseille Prolog, the interpreter was implemented in Fortran by Gerard Battani and Henri Meloni. David H. D. Warren took this original interpreter to the University of Edinburgh where he developed an alternative front-end. Edinburgh Prolog syntax was born. This syntax is used by most of the modern Prolog implementations.
The language does not have conditional statements, there are no loops, and there’s no state mutation. Prolog provides logical variables, integers, floats, atoms, and compound terms. Working with lists is not uncommon. Individual implementations may provide additional types. For example, SWI-Prolog provides strings, the empty list Nil ([]
), list cons-cell ([|]
), and dictionaries.
The original pure Prolog was restricted to the use of a resolution theorem prover with Horn clauses:
|
|
Which means:
To solve H, solve B₁ and … Bₙ.
Later, Prolog has been extended to include negation as failure. Conditions in a form not(Bₜ) are shown by trying and failing to solve corresponding positive conditions Bₜ.
At least a couple of dozens free and commercial implementations of Prolog exist. This Wikipedia page gives a brief summary[4]. But even that isn’t complete. There exist Prolog implementations for almost every major programming language. For example:
- ichiban/prolog[5] for Go.
- mthom/scryer-prolog[6] for Rust.
These implementations enable scripting capabilities in otherwise compiled, statically typed languages.
I am very impressed with Prolog, but also kind of disappointed with myself that I haven’t learned it earlier. By the way, a month from now, Prolog will be 50 years old. That’s 10 years older than me.
Better late than never.