Data models for Slack Apps
When I was setting up the database tables for my first Slack app, Simple Poll, I was more or less flying blind. I wasn’t entirely sure what Slack entities I needed to care about, or what data I needed to store and why. Database tables are easy to get wrong at the beginning, and then can become quite hard to fix up later.
A lot has happened since those first lines of code in late December 2015! Slack has become more capable1 and Simple Poll is one of the most popular Slack apps. Thanks to its popularity, Simple Poll sees a lot of the edge cases where its database models don’t quite capture all the logic they need to. Luckily, we can all learn from my past mistakes and modelling shortfalls!
I recently built a completely new app from scratch (Simple Goals). This was an opportunity for me to revisit some of the data modelling choices I had made for Simple Poll and essentially start from scratch, with the lessons learned from both Simple Poll and from building the GitHub Slack integration. This blog post is very much an artefact of that exercise. It describes how I would go about modelling the basic entities for any new Slack app.
I hope this is useful to my future self as well as to anyone who’s going about building a new Slack app and wants to spend more time writing code that actually delivers value to customers than to implement and re-hash the boilerplate. Let’s get into it.
A quick note about stacks: My stack for Slack apps is Django + Postgres. This blog post is pretty stack agnostic, but it assumes you’re using a relational SQL database. I’d recommend using a relational database for your Slack app. It’s worked well for me.
Why store things at all?
The easiest way to avoid storage, data modelling, and related problems is to not store anything at all. Let Slack be the single source of truth and query the Slack API whenever you need to know anything.
Unfortunately, a no-storage approach only takes you so far. At the very basic level, your app will want to make API calls to the Slack API (for example to post a message or open a modal) and for that you need the right access token. And you use the right access token by storing it when Slack hands it to you at the end of the OAuth flow.
In addition to storing access tokens (and knowing which access token to use when), your app will need to be able to reason about some Slack entities. Say for example you’d like a user to be able to configure a setting within your Slack app. This setting needs to be stored in relation to the slack user entity that set it up.
It’s also very likely that your app will need to be able to answer questions that tie your “business-logic” entities (like a “Poll” for Simple Poll) to Slack entities, like a Slack user. For example: “Is this user allowed to delete this poll?” or “Has this user already voted on this poll?”
Figuring out how to model and reason about the most important Slack entities is the purpose of this blog post. Importantly, we only focus on the necessary minimum to get a Slack app up and running in a future proof way. There is always more data you could store, but we’ll focus on the minium.
The Tables
There are 5 tables I consider to be the absolute minimum future-proof set:
- Enterprise
- Workspace
- Bot
- LocalUser
- GlobalUser
Below is an overview of how these five relate to each other. Don’t worry if that overview looks a bit confusing, we’ll go over these one by one and cover why they’re laid out this way and what they all mean.
Theoretically, if you really don’t care about supporting enterprise grid, you could reduce this to 3 tables (cutting out enterprise and global user), but I really really wouldn’t recommend it. If you’re building an app to be used by the public, it’s almost guaranteed that it will be used on grid.
Enterprise
An enterprise (sometimes referred to as “org”) in Slack is the central entity of Slack’s Enterprise grid plan.
At a high level, when you’re on enterprise grid, you get to connect as many workspaces together under the umbrella of one “enterprise”. So you might have an enterprise called “Acme Corp” and then a Slack workspace for “Acme Corp Sales” and “Acme Corp London Office”, and so on.
It’s possible that you haven’t come across Slack enterprise grid before. The vast majority of Slack workspaces are non-grid workspaces. I myself have never been on a slack enterprise grid except for testing my apps. So why do we care about the enterprise entity?
- There are more companies on grid than you might think! Especially larger companies tend to opt for grid
- Companies on enterprise grid are very much part of the target market for most B2B Slack apps
- Any workspace can become part of an enterprise grid at any time
- Keeping track of the enterprise entity lets your app deliver a consistent experience for all those users on grid
So here’s how to structure the Enterprise table:
enterprise table
Column name | Type | Example | Notes |
---|---|---|---|
id | string (primary key) | “E5CBMT2HE” | Unique across all of Slack |
name | string | “Acme Corp” | |
created_at | Datetime | 2020-05-11T15:52:30.591Z | |
updated_at | Datetime | 2020-05-11T15:52:30.591Z |
The enterprise id (example: E5CBMT2HE
) is guaranteed to be globally unique across all of Slack. No two enterprises can have the same id, so we can safely use this as our primary key for the Enterprise table.
I’m going to omit the
created_at
andupdated_at
columns from now on. But it makes sense to have them on all of the 5 the tables to keep track of when rows were created and updated.
Want to test your enterprise implementation? Slack provides enterprise grid sandboxes for testing
Workspace
The Slack workspace (formerly known as the “Slack team”) is arguably the most important model. It’s a foundational Slack entity and many Slack entities strongly relate back to the workspace.
workspace table
Column name | Type | Example | Notes |
---|---|---|---|
id | string (primary key) | “TR06LG5PB” | Unique across all of Slack, starts with a “T” because workspaces used to be called teams |
name | string | “Acme Corp Workspace” | |
has_app_installed | boolean | true | |
enterprise_id | string, can be NULL, foreign key to enterprise table | NULL |
The team id (example: TR06LG5PB
) is guaranteed to be globally unique across all of Slack. No two teams can have the same id, so once again we can safely use this as the primary key for our table.
“Workspaces” and “Teams” are the same thing. Formally Slack “teams” are now called “workspaces”, but the Slack API largely still refers to workspaces by their “team” naming.
The has_app_installed
column lets us keep track of whether a workspace has our app installed. Although surely a mistake and only temporary of nature, it is possible for a workspace to uninstall our app. It’s useful to know which workspaces have our app installed and which don’t anymore.
The enterprise_id
column refers back to the corresponding column on the enterprise table. For most workspaces this will be NULL, but for all those workspaces on enterprise grid, this will point towards their row in the Enterprise table.
Now feels like a good place to mention again that these columns we’re setting up are the absolute minimum of what I’d set up. For each table it’s feasible that you’d want to add additional columns. For example, in the Workspaces table you might want to also keep track of the
domain
for each workspace and add a column for that.
Bot
This guide assumes that you’re using Slack’s “granular bot permissions”, which were announced in December 2019. If you’re building a Slack app in 2020 or later, this is the permission model you should default to.
All that this “permission model” means is that as your app is installed on a Slack workspace, Slack creates a bot user on the workspace. This user has a user id just like any other user on the workspace (e.g. U010VH3ALTH
) and its permissions are based on the OAuth scopes you provided at the start of the OAuth flow. At the end of the OAuth flow, Slack provides you with an access token alongside the bot user id. You’ll know that the access token is a bot token because it starts with xoxb
.
bot table
Column name | Type | Example | Notes |
---|---|---|---|
id | integer (primary key, autoincrement) | 1 | |
workspace_id | string, foreign key to workspace table | “TR06LG5PB” | Unique. Each workspace can have just one bot |
scope | string | “commands,im:write,chat:write” | |
access_token | string | “xoxb-17653672481-19874698323-pdFZKVeTuE8sk7oOcBrzbqgy” | |
bot_user_id | string | “U010VH3ALTH” | The user_id of your app’s bot user. This is different for each workspace. |
With the Bot entity, there is no slack-provided id that is guaranteed to be globally unique, so unlike with enterprises and workspaces, we’ll resort to using an auto incrementing integer field as our bot table’s primary key (id
field).
The workspace_id
links the bot back to the workspace that it is installed on. Importantly, this field must be unique (in Django-land this is a OneToOneField), since each workspace has your app’s bot just once. Theoretically, all of these fields could just live on the Workspace table because of that, but i’ve found it cleaner to move these columns onto a separate table.
The scope
field might seem a bit unnecessary at first since at first all rows will have the exact same value. But over time, it’s likely that your Slack app will request additional OAuth scopes and once that happens, this column will be invaluable in keeping track of which permissions each Bot has.
I like storing the bot_user_id
, because it means you’ll be able to @-mention the bot in communication to users. For example, in some cases Simple Poll will ask users to /invite <@{bot.bot_user_id}>
the bot into a channel before a poll can be posted there. Users can copy and paste that command without needing to manually type out the name of the bot again.
Users
Users in Slack are interesting.
As app-builders we face a set of unusual circumstances that we have to be aware of with our Slack apps. Specifically:
- Rule 1: All users in Slack exist only within the boundaries of their workspace and as a result user ids are not globally unique (this is why you have a separate email and password for every Slack workspace you’re a member of)
- Rule 2: Rule 1 (mostly) doesn’t apply in the context of enterprise grid and any workspace can become part of an enterprise grid at any time
The models proposed as part of this blog post, can handle these two complicating circumstances, but bear with me, it’s going to get a little complex. I’ll start by laying out the two tables we’ll add to handle users in Slack (GlobalUser and LocalUser) and then will go on to cover what that ends up looking like in practice.
Global User
In enterprise grid, users exist not just within the boundaries of their workspace, but within the boundaries of their enterprise. This means that they will have a user id that begins with W. The same human will have one W-based user_id across all of the workspaces that they are a part of in the enterprise grid.
The purpose of the global_user table is to model how users work in enterprise grid, as well as prepare for the regularly-occurring scenario where a non-grid workspace joins an enterprise and becomes a grid workspace.
globaluser table
Column name | Type | Example | Notes |
---|---|---|---|
id | integer (primary key, autoincrement) | 1 | |
slack_global_user_id | string, can be NULL | “W5CD6G53J” | |
enterprise_id | string, can be NULL, foreign key to enterprise table | “E5CBMT2HE” |
slack_global_user_id
is set to the W-based user_id, if the user has one (e.g. W5CD6G53J
). If the user has a U-based user_id, then this column will be NULL. In the case of a W-based user_id, we’ll also set use the enterprise_id
column to link back to the Enterprise the user is a part of.
This table does seem slightly overkill for the non-grid case, because each row will literally just be the auto increment id, and then NULL for the slack_global_user_id
column. But it sets us up well to cover both grid, non-grid, and the case where a workspace transitions from non-grid to grid.
To ensure, that we don’t accidentally get duplicate users in this table, I recommend setting up a unique_together constraint for slack_global_user_id
and enterprise_id
. In Django that would look something like this:
class GlobalUser(models.Model):
slack_global_user_id = models.CharField(
max_length=50,
null=True,
unique=True,
db_index=True,
)
enterprise = models.ForeignKey(Enterprise, null=True, on_delete=models.CASCADE)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
unique_together = (
('slack_global_user_id', 'enterprise',)
)
You may also want to add some validation logic that ensures for each row that if slack_global_user_id
is NULL, then enterprise_id
must be NULL as well. And if slack_global_user_id
is set, then enterprise_id
must be set too (and vice versa).
Local User
The Local User is the counterpart to the Global User and its purpose is to directly model users as they exist within the boundary of a single workspace.
localuser table
Column name | Type | Example | Notes |
---|---|---|---|
id | integer (primary key, autoincrement) | 1 | |
workspace_id | string, foreign key to workspace table | “TR06LG5PB” | |
local_user_id | string | “URAASU5P0” | |
global_user_id | int, foreign key to globaluser table | 1 |
You can also set up one additional database constraint for the localuser table: The value of workspace_id
and local_user_id
should always be unique together, and this can be enforced at the database level. In Django, that would look as follows:
class LocalUser(models.Model):
workspace = models.ForeignKey(Workspace, on_delete=models.CASCADE)
local_user_id = models.CharField(max_length=50, db_index=True)
global_user = models.ForeignKey(GlobalUser, on_delete=models.CASCADE)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
unique_together = (
('local_user_id', 'workspace',)
)
Once again, since none of the Slack provided values are individually globally unique, we’ll use an auto incrementing id
column as our primary key.
One of the core properties of local users is that they exist within the boundaries of their Slack workspace. As such, the workspace_id
field is a foreign key that refers back to the appropriate workspace that the user is a part of.
The local_user_id
is the user_id your app will receive from Slack when a user interacts with your app. You’ll use this value a lot when making API calls that involve a specific user. As mentioned before, when combined, workspace_id
and local_user_id
are globally unique.
Note that in the enterprise grid case, even the local_user_id
can start with a W. In that case the local_user_id
value in the localuser table and the slack_global_user_id
value in the globaluser table will be exactly the same.
The global_user_id
column refers back to the globaluser table. If a given user is on enterprise grid, this column provides a way for our app to understand that many different workspace-delimited local users are actually all the same global user.
All together, these tables look like this:
Global users and local users in practice
Now, how what do all these tables and entities look like in practice?
The above chart shows how your data will look for a single non-enterprise-grid workspace. Most of the workspaces where your app will be installed will likely be non-grid workspaces.
Notably, as you can see in the graph, there are no slack-provided values associated with the global user records. Here, the global user is a convenience model that makes our app ready for enterprise grid.
The enterprise grid case is a bit more interesting in practice. In the example shown in the graph above, we have one enterprise with two workspaces and two corresponding bots. And then we have 4 distinct humans, represented by by 4 global user records, and 6 local user records.
- User no 1 is straight forward. The user is present on both workspaces and uses the same W-based id for both their global user record and each of their local user records.
- User no 2 is also present on both workspaces. As you’d expect the user has a W-id on their global user record. However, in this case the user has different local user ids for each of their two local user records. Only via the global user record can we identify that this is in fact the same underlying human!
- User no 3 is only present on one of the workspaces in the grid. And this user happens to have a local user id that is different from the global user id.
- User no 4 is also only present on one of the workspaces (just like user no 3, although on the other workspace), but in this case the global user id and the local user id are the same.
Hopefully this graph illustrates how the localuser and globaluser tables are both needed in helping our app make sense of how users work in Slack. They help us figure out which user we’re dealing with, and whether two local are users are actually the same global human.
Yay, we made it! That was a lot of date modelling, so go get yourself a hot beverage ☕️ Then read on for a few brief notes on procedures and next steps ⬇️
Procedures
In addition to the data models covered so far, it’s worth briefly going over some of the less obvious “procedures” for creating and updating the data we’re storing in our fancy new models. Unlike the database models, these procedures can be more easily changed and adjusted over time, which is why we’re covering them just briefly and at a high level.
“Add to Slack” OAuth flow
At the end of the “Add to Slack” OAuth flow, Slack provides you with information about who just installed your app. It looks a little something like this:
{
"ok": true,
"access_token": "xoxb-17653672481-19874698323-pdFZKVeTuE8sk7oOcBrzbqgy",
"token_type": "bot",
"scope": "commands,im:write,chat:write",
"bot_user_id": "U0KRQLJ9H",
"app_id": "A0KRD7HC3",
"team": {
"name": "Slack Softball Team",
"id": "T9TK3CUKW"
},
"enterprise": {
"name": "slack-sports",
"id": "E12345678"
},
"authed_user": {
"id": "U010VH3ALTH",
}
}
Based on this data from Slack, we can now create all the corresponding records in our 5 tables.
More information about Slack OAuth
Looking up & creating users
The flow shown in the chart above is a pretty common one for Slack apps. I want to briefly talk about the particulars of the green box: Loading the user records from the database, now that we have our 5 database tables set up.
The following flow chart describes how to load (and if the user is new to us, create) the local and global user records. Once loaded, we can then use them (and any data we store alongside them) to carry out our app’s business logic like show a specific view to a user, call the Slack API, etc.
When a workspace joins an enterprise
When a workspace that has our app installed joins an enterprise grid, Slack will keep our app up to date by sending the grid_migration_started
and grid_migration_finished
events.
Our app should act on the grid_migration_finished
event and do the following:
- If the team joined an enterprise that’s not yet in our database, create a new enterprise record
- Set the enterprise_id field in the workspace table to point at the enterprise record
- For each localuser record we have stored on the workspace, set their corresponding globaluser’s W-id with the help of the
migration.exchange
endpoint
Read more about grid_migration_finished
Future work
As mentioned before, the 5 tables we set up as part of this blog post are only the beginning. Not only will you have many other tables to store the data that matters to your specific application, but there are some more Slack entities you’ll find you’ll end up modelling too. Specifically those entities might include:
- Slack Events to keep track of which events you’ve already processed
- Slack Channels, which can have a surprising amount of complexity! My mental model for slack channels (and especially shared channels) is still very much a work in progress
Let me know how you get on! [email protected]
[1] Slack’s increased capabilities have come with increased complexity. Most notable via the addition of enterprise grid and shared channels