November 19, 2020
Where Are Postgres Tables Found?
In regards to databases it’s easy to forget that the data we shuffle around all day physically lives somewhere on disk. We’re often so preoccupied with finding the data we need that we overlook just how it’s represented by the filesystem. And you never know, having a ready insight into things like this may just spark some further understanding down the road!
To find where our tables are on disk we’ll want to run the command line utility psql. Once the psql shell is open we can find all sorts of information about our local databases and the tables which compose them. For starters, running the following command will output where your Postgres databases reside:
The results of the query above will return something like this:
Next, while still in the
psql shell, we can lookup where a specific database table is stored on disk in relation to Postgres:
Let’s break the result of this SELECT into its three obvious pieces:
base: any databases we create will be stored in this directory
4360515: this is the OID (Object Identifier) of a specific database, in this case it’s the database to which the “sea_slugs” table belongs
4362363: this is the file on disk representing the “sea_slugs” table we queried for
Now that we have all of this, and outside of the
psql shell, we can
cd into our database’s directory and see all the files present:
There’s not much to do here except congratulate yourself. Nice.
What Are Postgres Tables?
So now we know where to find tables, but what are they? Well, like almost everything else on a computer they’re just files. Until you dig deeper into the data structures they represent there isn’t much to uncover here, but there are a couple things worth noting:
- Maximum file size
- Dot notation on the filename
Let’s start with the first one as it leads directly into the second. The default maximum size for these table files is set by Postgres at 1GB. Now, once you have so much data in the table that it exceeds this maximum size, Postgres has to create a new file to store the additional data. This is where we get into the dot notation on the filename.
If we go back to our original examples, where we looked up the file path for the table, we had this value:
base/4360515/4362363. The final value,
4362363, is the actual file for our table - where all the data lives. So, what happens if we accumulate so much data over time (more than 1GB) that we need to create a new file? Well, Postgres will create a new file with incrementing dot notation on the end of the filename. What this means is the second file will be
4362363.1, the third file will be
4362363.2, and so on.
That’s pretty much it! Easy, right?
Once you have a hang of where tables are stored and what they are it becomes even easier to dig more deeply into Postgres internals. It’s like a springboard into even cooler concepts like Pages, Free Space Maps, Indexes, Multiversion Concurrency Control (MVCC), and even the amazingly named The Oversized Attribute Storage Technique (TOAST).