This document presents a conceptual overview of the folders and repositories system. It also summarizes the Dataform API fields and methods used for working with folders and repositories.
The Dataform API provides resources that you can use to organize code assets in a hierarchical structure that's similar to a typical operating system file system. This structure also enables Identity and Access Management (IAM) policy inheritance, allowing permissions to propagate down the path.
The following list defines key terms used to describe the folders and repositories system:
- Folder
- A folder is the basic container for organizing resources, similar to a standard file system folder. It lets you organize other folders and repositories, and you can move resources into and out of folders. You can grant permissions at the folder node, and these permissions propagate to all contents.
- User root folder
- A user root folder represents a user's personal space. It contains all the folders and repositories that a user creates or accesses. A user root folder isn't part of a team folder's subtree. A user root folder is a virtual concept that doesn't have an associated API resource.
- Team folder
- A team folder is similar to a folder, but it's designed for team collaboration, similar to a shared drive in Google Drive. It provides a dedicated space for core code assets and supports stricter sharing and access permissions for a team's core assets.
- File
- In the context of this folder structure, a file is represented by a Dataform repository resource. Each repository contains a single file asset, such as a notebook, saved query, data canvas, or data preparation.
Required roles
To get the permissions that you need to complete the tasks in this document, ask your administrator to grant you the appropriate IAM roles on the project, folder, or resource.
Permissions granted on a folder propagate to all the folders and files contained within it.
The following roles apply to folders and files:
| Role | Granted on | Permissions and use cases |
|---|---|---|
| Code Owner (roles/dataform.codeOwner) | Folder or file | Grants full control over a resource for managing code assets. A user with this role can perform all actions, including deleting the resource, setting its IAM policy, and moving it. |
| Code Editor (roles/dataform.codeEditor) | Folder or file | Allows for editing and managing content. A user with this role can add content to folders, edit files, and get the IAM policy for a folder or file. This role is also required on the destination folder when moving a resource. |
| Code Commenter (roles/dataform.codeCommenter) | Folder or file | Allows for commenting on code assets or folders. |
| Code Viewer (roles/dataform.codeViewer) | Folder or file | Provides read-only access. A user with this role can query the contents of folders and files. |
| Code Creator (roles/dataform.codeCreator) | Project | Grants permission to create new folders and files within a project. |
The following roles are specific to managing team folders:
| Role | Granted on | Permissions and use cases |
|---|---|---|
| Team Folder Owner (roles/dataform.teamFolderOwner) | Team folder | Grants full control over a team folder for managing code assets. A user with this role can delete the team folder and set its IAM policy. |
| Team Folder Contributor (roles/dataform.teamFolderContributor) | Team folder | Allows for content management within a team folder. A user with this role can update a team folder. |
| Team Folder Commenter (roles/dataform.teamFolderCommenter) | Team folder | Allows for commenting on a team folder and the code assets that it contains. |
| Team Folder Viewer (roles/dataform.teamFolderViewer) | Team folder | Provides read-only access to a team folder and its contents. A user with this role can view a team folder and get its IAM policy. |
| Team Folder Creator (roles/dataform.teamFolderCreator) | Project | Grants permission to create new team folders within a project. |
For more information about granting roles, see Manage access to projects, folders, and organizations.
These predefined roles contain the permissions required to complete the tasks in this document. The exact permissions that are required are listed in the Required permissions section:
Required permissions
- Create a folder:
  - folders.create on the parent user folder, team folder, or project
  - folders.addContents on the parent folder or team folder
- Retrieve the properties of a folder:
  - folders.get on the folder
- Query the contents of a folder or team folder:
  - folders.queryContents on the folder
- Update a folder:
  - folders.update on the folder
- Delete a folder:
  - folders.delete on the folder
- Get the IAM policy for a folder:
  - folders.getIamPolicy on the folder
- Set the IAM policy for a folder:
  - folders.setIamPolicy on the folder
- Move a folder:
  - folders.move on the folder being moved
  - folders.addContents on the destination folder or team folder (not needed if moving to a root folder)
- Create a team folder:
  - teamFolders.create on the project
- Delete a team folder:
  - teamFolders.delete on the team folder
- Get the IAM policy for a team folder:
  - teamFolders.getIamPolicy on the team folder
- Set the IAM policy for a team folder:
  - teamFolders.setIamPolicy on the team folder
- Retrieve the properties of a team folder:
  - teamFolders.get on the team folder
- Update a team folder:
  - teamFolders.update on the team folder
- Create a repository:
  - repositories.create on the parent user folder, team folder, or project
  - folders.addContents on the parent folder or team folder
- Read a repository:
  - repositories.readFile on the repository
- Write to a repository:
  - repositories.commit on the repository
- Move a repository:
  - repositories.move on the repository being moved
  - folders.addContents on the destination parent user folder, team folder, or project (not needed if moving to a root folder)
- Retrieve the properties of a repository:
  - repositories.get on the repository
- Update a repository:
  - repositories.update on the repository
- Delete a repository:
  - repositories.delete on the repository
You might also be able to get these permissions with custom roles or other predefined roles.
To gain full access for managing the code assets in your project, ask your administrator to grant you the following IAM roles on the project:
- Dataform Admin (roles/dataform.admin)
- Dataform Editor (roles/dataform.editor)
- Dataform Viewer (roles/dataform.viewer)
IAM policy inheritance
IAM access for folder and repository resources follows a hierarchical structure: the contents of a folder inherit the access policies that are set on that folder and on its parent folders.
When an IAM policy is set on a folder, the permissions granted by that policy also apply to all the repositories and nested subfolders in the folder's subtree. This has the following consequences:
- Permissions are inherited through the folder hierarchy. When a user is granted a specific role on a high-level folder, they possess the permissions included in that role for all the resources contained in that folder and its subfolders.
- The permissions that a user has on a resource consist of the policies set directly on that resource and all the policies inherited from every folder in its path up to the root.
As a result, you don't need project-level permissions to perform actions on resources located deep in a folder structure. You only need the proper permission on any folder in the path to that resource. For example, if you want to create a repository in a subfolder, you need the necessary permissions on either the specific subfolder or any of its parent folders, which includes the top-level folder.
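The inheritance rule described above can be sketched as a small in-memory model. This is an illustration only, not how the service is implemented: the Node class, the effective_roles function, and the example principal are hypothetical names that aren't part of the Dataform API. The sketch assumes that a principal's effective roles on a resource are the union of the roles bound directly on that resource and on every ancestor folder.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    """Hypothetical folder or repository node; not a Dataform API type."""
    name: str
    parent: Optional["Node"] = None
    # Maps a principal to the roles bound directly on this node.
    bindings: Dict[str, Set[str]] = field(default_factory=dict)


def effective_roles(node: Node, principal: str) -> Set[str]:
    """Union of the roles bound on the node and on each ancestor up to the root."""
    roles: Set[str] = set()
    current: Optional[Node] = node
    while current is not None:
        roles |= current.bindings.get(principal, set())
        current = current.parent
    return roles


# Example hierarchy: top-level folder > subfolder > repository.
USER = "user:ana@example.com"  # hypothetical principal
top = Node("top-level-folder", bindings={USER: {"roles/dataform.codeViewer"}})
sub = Node("subfolder", parent=top, bindings={USER: {"roles/dataform.codeEditor"}})
repo = Node("repository", parent=sub)

# The repository has no direct bindings, but inherits both roles from its path.
print(sorted(effective_roles(repo, USER)))
```

In this model, a role granted on any folder in the path is enough to act on the repository, which matches the point that project-level permissions aren't required for deeply nested resources.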
The following are best practices for applying IAM policies to folders and repositories:
- Apply IAM policies to the highest folder in the hierarchy where the permissions are uniformly needed. For example, if a team needs access to all the data in their team's directory, grant the necessary roles at the level of the team folder instead of at the level of individual project subfolders.
- Always grant the minimum set of permissions required for users or services to perform their tasks. Avoid granting broad roles where you can use more specific folder-level roles and permissions.
IAM roles granted on resource creation
The following roles are granted automatically upon resource creation:
- Users who create folders that are not in a team folder subtree automatically receive the Dataform Admin role (roles/dataform.admin) on those folders.
- The creator of a root team folder automatically receives the Dataform Admin role (roles/dataform.admin) on that team folder.
- When you set setAuthenticatedUserAdmin to true in the projects.locations.repositories resource, users who create a repository in the user root node automatically receive the Dataform Admin role (roles/dataform.admin) on the repository.
You can use the Config API to grant a specific role upon resource creation.
You don't automatically receive any roles when you create new folders or repositories within a team folder's subtree.
Limitations
Folders and repositories have the following limitations:
- You can only nest folders up to 5 levels deep.
- After moving a repository into a folder, the repository and its child resources aren't visible in Cloud Asset Inventory.
- A maximum of 100 resources can participate in a single move operation.
- Having a very large number of folders (hundreds of thousands) slows performance when working with folders.
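A client could check the nesting and move-size limits before calling the API. The following Python sketch shows one way to do that under stated assumptions: it assumes that a root folder counts as level 1 and that the moved resource itself counts toward the 100-resource limit, neither of which is confirmed by the text above. All function names and the dictionary-based tree representation are hypothetical.

```python
from typing import Dict, List, Optional

MAX_NESTING_DEPTH = 5         # limit stated in this document
MAX_RESOURCES_PER_MOVE = 100  # limit stated in this document


def folder_depth(parents: Dict[str, Optional[str]], folder_id: str) -> int:
    """Depth of a folder, assuming a root folder (parent None) is level 1."""
    depth = 1
    current = parents[folder_id]
    while current is not None:
        depth += 1
        current = parents[current]
    return depth


def can_nest_under(parents: Dict[str, Optional[str]], parent_id: str) -> bool:
    """True if a new folder created inside parent_id stays within the depth limit."""
    return folder_depth(parents, parent_id) + 1 <= MAX_NESTING_DEPTH


def subtree_size(children: Dict[str, List[str]], root: str) -> int:
    """Number of resources in the subtree, counting the root itself (assumption)."""
    count, stack = 0, [root]
    while stack:
        node = stack.pop()
        count += 1
        stack.extend(children.get(node, []))
    return count


def fits_in_single_move(children: Dict[str, List[str]], root: str) -> bool:
    """True if moving root and its subtree stays within the move-size limit."""
    return subtree_size(children, root) <= MAX_RESOURCES_PER_MOVE


# Example: a chain of five nested folders, a -> b -> c -> d -> e.
parents = {"a": None, "b": "a", "c": "b", "d": "c", "e": "d"}
print(can_nest_under(parents, "d"), can_nest_under(parents, "e"))
```

A check like this only avoids requests that are likely to be rejected; the service remains the authority on how the limits are counted.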
Organize resources
The following sections describe how you can organize folder, team folder, and repository resources with the Dataform API.
Folder resources
The following table describes the API fields for folders:
| Field | Description |
|---|---|
| containing_folder | A reference to the parent folder or the team folder's name. You can set this to a folder ID or a team folder ID. If you don't set this field, this is a root folder. |
| display_name | The user-visible name for the resource. The display_name field must be unique according to the following rules: |
The following table describes the main projects.locations.folders API methods:
| API method | Description |
|---|---|
| create | Creates a new folder. |
| get | Gets a folder's properties. |
| patch | Updates a folder's properties, such as its name. |
| queryFolderContents | Lists the items in a folder. |
| move | Moves the folder and its entire subtree to a new containing folder. A move operation is atomic: it succeeds only if every resource in the folder's subtree is moved, with no partial failures. |
| delete | Deletes the folder. Succeeds only if the folder is empty. |
| setIamPolicy | Grants roles to the folder. Granted roles automatically propagate to the entire subtree of the folder. |
The following example demonstrates how to create a root-level folder:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"displayName": "DISPLAY_NAME"
}' \
"https://dataform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/folders"
Replace the following:
- DISPLAY_NAME: the user-visible name for the resource.
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location of the Dataform repository where resources are created.
The following example demonstrates how to create a folder that's nested inside another folder:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"displayName": "DISPLAY_NAME",
"containingFolder": "projects/PROJECT_ID/locations/LOCATION/folders/PARENT_FOLDER_ID"
}' \
"https://dataform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/folders"
Replace the following:
- DISPLAY_NAME: the user-visible name for the resource.
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location of the Dataform repository where resources are created.
- PARENT_FOLDER_ID: the ID of the existing folder where you want to create the new folder.
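The two folder-creation requests above can also be assembled programmatically. The following Python sketch builds the same HTTP request with the standard library and stops short of sending it, so it runs without credentials. The build_create_folder_request helper and its parameters are hypothetical names introduced for illustration; the endpoint and JSON fields are the ones used in the curl examples, and the access token is assumed to be obtained separately (for example, from gcloud auth print-access-token).

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://dataform.googleapis.com/v1beta1"


def build_create_folder_request(
    project_id: str,
    location: str,
    display_name: str,
    access_token: str,
    parent_folder_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but don't send) a folders.create request.

    Without parent_folder_id, the body matches the root-level example;
    with it, the body matches the nested-folder example.
    """
    parent = f"projects/{project_id}/locations/{location}"
    body = {"displayName": display_name}
    if parent_folder_id is not None:
        body["containingFolder"] = f"{parent}/folders/{parent_folder_id}"
    return urllib.request.Request(
        url=f"{API_ROOT}/{parent}/folders",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace them as described for the curl examples.
request = build_create_folder_request(
    "PROJECT_ID", "LOCATION", "DISPLAY_NAME", "ACCESS_TOKEN",
    parent_folder_id="PARENT_FOLDER_ID",
)
print(request.get_method(), request.full_url)
```

To send the request, pass the returned object to urllib.request.urlopen.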
The following example demonstrates how to move a folder into another folder:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"destination_containing_folder": "projects/PROJECT_ID/locations/LOCATION/folders/DESTINATION_PARENT_FOLDER_ID"
}' \
"https://dataform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/folders/FOLDER_ID_TO_MOVE:move"
Replace the following:
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location of the Dataform repository.
- DESTINATION_PARENT_FOLDER_ID: the ID of the folder where you want to move the target folder.
- FOLDER_ID_TO_MOVE: the ID of the folder that you are moving.
Team folder resources
The following table describes the main projects.locations.teamFolders API methods:
| API method | Description |
|---|---|
| create | Creates a new team folder. |
| get | Gets a team folder's properties. |
| patch | Updates a team folder's properties, such as its name. |
| queryContents | Lists the items in a team folder. |
| delete | Deletes the team folder. Succeeds only if the team folder is empty. |
| setIamPolicy | Grants roles to the team folder. Granted roles automatically propagate to the entire subtree of the team folder. |
The following example demonstrates how to query the contents of a team folder:
curl -X GET \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://dataform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/teamFolders/TEAM_FOLDER_ID:queryContents"
Replace the following:
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location of the Dataform resources.
- TEAM_FOLDER_ID: the ID of the specific Dataform team folder you're querying.
Repository resources
You can place repository resources in folder and team folder resources by using the containing_folder field, which references the parent folder or team folder.
The following table describes the main projects.locations.repositories API methods:
| API method | Description |
|---|---|
| create | Creates a new repository. |
| get | Gets a repository's properties. |
| patch | Updates a repository's properties, such as its name. |
| move | Moves the repository to a new containing folder. |
| delete | Deletes the repository. |
| setIamPolicy | Grants roles to the repository. Granted roles automatically propagate to the entire subtree of the repository. |
The following example demonstrates how to create a repository in the user root node while setting setAuthenticatedUserAdmin to true:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"displayName": "REPOSITORY_DISPLAY_NAME",
"setAuthenticatedUserAdmin": true
}' \
"https://dataform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/repositories?repositoryId=REPOSITORY_ID"
Replace the following:
- REPOSITORY_DISPLAY_NAME: a user-friendly name for the repository.
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location for the repository.
- REPOSITORY_ID: the ID of the new repository.
The following example demonstrates how to create a repository inside a team folder:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"containingFolder": "projects/PROJECT_ID/locations/LOCATION/teamFolders/CONTAINING_TEAM_FOLDER_ID"
}' \
"https://dataform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/repositories?repositoryId=REPOSITORY_ID"
Replace the following:
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location where resources are created. This must be the same location as the team folder identified by CONTAINING_TEAM_FOLDER_ID.
- CONTAINING_TEAM_FOLDER_ID: the ID of the specific team folder where you want to place the new repository.
- REPOSITORY_ID: the ID for the new repository.
The following example demonstrates how to move a repository into the root folder:
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-X POST \
-d '{
"destination_containing_folder": ""
}' \
"https://dataform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/repositories/REPOSITORY_ID:move"
Replace the following:
- PROJECT_ID: your Google Cloud project ID.
- LOCATION: the location where the repository exists.
- REPOSITORY_ID: the ID of the repository you want to move to the root level.
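The move examples for folders and repositories share one shape: a POST to the resource name followed by :move, with the destination in the body. The following Python sketch captures that shape in a single hypothetical helper, build_move_request, which builds the request without sending it. It uses an empty destination string for the root, as in the repository example above. It assumes that the same convention applies to folders, and it handles only regular folders as destinations, not team folders.

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://dataform.googleapis.com/v1beta1"


def build_move_request(
    project_id: str,
    location: str,
    collection: str,  # "folders" or "repositories"
    resource_id: str,
    access_token: str,
    destination_folder_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but don't send) a :move request for a folder or repository.

    A destination_folder_id of None produces an empty destination string,
    which the repository example uses to move a resource to the root.
    """
    if collection not in ("folders", "repositories"):
        raise ValueError(f"unsupported collection: {collection!r}")
    parent = f"projects/{project_id}/locations/{location}"
    destination = (
        "" if destination_folder_id is None
        else f"{parent}/folders/{destination_folder_id}"
    )
    body = {"destination_containing_folder": destination}
    return urllib.request.Request(
        url=f"{API_ROOT}/{parent}/{collection}/{resource_id}:move",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace them as described for the curl examples.
request = build_move_request(
    "PROJECT_ID", "LOCATION", "repositories", "REPOSITORY_ID", "ACCESS_TOKEN"
)
print(request.full_url, request.data.decode("utf-8"))
```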
Busy resources
A folder, team folder, or repository is "busy" if it's actively involved in a move operation, either as the object being moved or the destination of the move. The system restricts busy resources from the following actions to ensure data integrity during the move:
- Being the object of another move operation.
- Being the destination of another move operation.
- Being an ancestor of a move object.
- Being the object of a delete operation.
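The restrictions above can be expressed as a small rule check. The following Python sketch is one interpretation, not the service's actual logic: it treats a resource as busy when it's the object or destination of an in-flight move, and it rejects a new move when its object, its destination, or any ancestor of its object is busy. The function names, the tuple-based representation of a move, and the parent map are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# (resource being moved, destination folder); a destination of None means the root.
Move = Tuple[str, Optional[str]]


def busy_resources(active_moves: List[Move]) -> Set[str]:
    """Resources that are the object or the destination of an in-flight move."""
    busy: Set[str] = set()
    for obj, destination in active_moves:
        busy.add(obj)
        if destination is not None:
            busy.add(destination)
    return busy


def ancestors(parents: Dict[str, Optional[str]], resource: str) -> List[str]:
    """Chain of containing folders, from the direct parent up to the top."""
    chain: List[str] = []
    current = parents.get(resource)
    while current is not None:
        chain.append(current)
        current = parents.get(current)
    return chain


def can_move(parents: Dict[str, Optional[str]], active_moves: List[Move],
             obj: str, destination: Optional[str]) -> bool:
    """Apply the move-related restrictions to a proposed new move."""
    busy = busy_resources(active_moves)
    if obj in busy:
        return False  # busy resource as the object of another move
    if destination is not None and destination in busy:
        return False  # busy resource as the destination of another move
    if any(folder in busy for folder in ancestors(parents, obj)):
        return False  # busy resource as an ancestor of the move object
    return True


def can_delete(active_moves: List[Move], resource: str) -> bool:
    """A busy resource can't be the object of a delete operation."""
    return resource not in busy_resources(active_moves)


# Example: folder "a" is being moved into folder "c".
parents = {"top": None, "a": "top", "b": "a", "repo": "b", "c": "top", "d": "top"}
active = [("a", "c")]
print(can_move(parents, active, "repo", "d"), can_move(parents, active, "d", None))
```

In the example, the repository can't be moved because its ancestor "a" is busy, while the unrelated folder "d" can be moved to the root.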
What's next
- To learn how to organize code assets in BigQuery, see Organize code assets with folders and Create and manage folders.
- To learn more about managing permissions for Dataform resources, see Control access with IAM.
- To view the complete details of Dataform API methods, see the Dataform API.