Wednesday 13 December 2023

Learning PostgreSQL - Installation

 If you have not viewed my last post, then I request you to please go and check it out. The last post is about the architecture of PostgreSQL database, and this post is in continuation of that series.

In this post, we will discuss, different ways of installing a PostgreSQL software. Most of the organizations use Linux to deploy their PostgreSQL database, so we will be discussing ways to install PostgreSQL software on a Linux.

This will be a long post, so I would suggest you to keep a cup of coffee or a tea handy 😉.

Before installation, we will first understand how PostgreSQL versioning works.

PostgreSQL database has a Major version and Minor version. The major version comes with new features and is released at least once a year. The minor version includes bug fixes or security fixes and are released every three months. For more information on versioning of PostgreSQL, please visit PostgreSQL community website.

Starting PostgreSQL 10, the major version is indicated by increasing the first part of the version, e.g. 10 to 11, and minor releases are numbered by increasing the last part of the version, e.g. 10.0 to 10.1.

Now that we are clear about PostgreSQL versioning, let's start with today's agenda. So there are 2 ways you can install PostgreSQL software, one is Source code installation and another one YUM/Package installation.

Source code installation

This is totally a manual method, where we build the installation from source code of PostgreSQL and then install it. Here all the installation configuration and prerequisites should be taken care manually. This is the only installation type where we can specify a directory for the installation, and we can install minor versions in different locations.

Prerequisites

Depending upon your Linux distribution, install the below packages, before proceeding with the source code installation.
  • wget ⇾ To download the source code from WWW
  • gcc ⇾  As PostgreSQL is developed in C language, it is required to interpret the source code
  • zlib ⇾  This library is used by default for compression, and it enables support for compressed archives in pg_dump and pg_restore.
  • readline & readline-devel ⇾ This library is used to remember each command that you type, and allows arrow keys to recall and edit the previous commands.
For installation of these libraries, use:

# RHEL/CentOS/Fedora
sudo yum install wget gcc zlib readline readline-devel -y

# Debian/Ubuntu
sudo apt-get install wget gcc zlib readline readline-devel -y

Now to download the source code, go to PostgreSQL downloads, choose the desired version and either download postgresql-<version>.tar.gz file and then transfer to your Linux machine or just copy the download link and download using wget. Once the file is downloaded, use tar to decompress the file.

# download source code file
$ wget https://ftp.postgresql.org/pub/source/v13.7/postgresql-13.7.tar.gz

# untar the file
$ tar -xvf postgresql-13.7.tar.gz

Note: It may inform you that the connection to ftp.postgresql.org is not secure as it is unable to verify the certificate and if you want to connect use `--no-check-certificate`. So just append `--no-check-certificate` to the end of the `wget` command.

Once the file is decompressed, then it will have all the source for installing PostgreSQL database software. The decompression will create a folder with the version that you have downloaded, in this case folder `postgresql-13.7` will be created which will have the required files.

Now to start with the installation,
  1. Configure the source tree for our system using `configure` script which comes with source code
  2. Build the source tree using `make` command
  3. Installation of the software using `make install` command
When you start installation, the default location for this will be `/usr/local/pgsql` and if you want to use a custom location then you will have to use `--prefix` option with `configure`. So below will be the commands to use for installing the binaries:

# step 1 configure the source tree with custom path
# I will use custom path if you want to skip this then remove --prefix
# the installation will be one in default directory (/usr/local/pgsql)
$ sudo ./configure --prefix /usr/local/pgsql13.7/

# check the status of the previous command if you want using this file
$ ls -ltr config.log

# build the source tree with contrib models (please read INSTALL for more details)
$ sudo make world-bin

# install the software
$ sudo make install-world-bin

# verify the installation
$ /usr/local/pgsql13.7/bin/psql --version

The `contrib` models are nothing but some additional modules that are available with PostgreSQL installation. So the installation is success, now let's configure next steps which are manual in source code installation but auto in package installation.

Create `postgres`  user and make sure that all the installation directories are owned by `postgres` user.

# create user postgres
$ sudo useradd postgres -c "PostgreSQL Linux User"

# verify if it is created
$ id postgres
uid=1001(postgres) gid=1001(postgres) groups=1001(postgres)

# change ownership of the installed software
$ sudo chown -R postgres:postgres /usr/local/pgsql13.7/

To execute the binaries which are part of PostgreSQL installation, you would need to use the absolute paths instead of just the command. So now we will add the binary location in PATH environmental variable and to create the database we will create data directory.

$ sudo su - postgres
$ pwd
/home/postgres
$ mkdir data
# export PATH and DATADIR environmental variables
# depending upon your shell add this to either .bashrc/.bash_profile/.zshrc/.profile
$ export PATH=$PATH:/usr/local/pgsql13.7/bin/
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/pgsql13.7/lib
$ export DATADIR=/home/postgres/data/

# verify if PATH variable works
$ psql --version
psql (PostgreSQL) 13.7

# create the database using initdb
$ initdb -D $DATADIR
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /home/postgres/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Kolkata
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /home/postgres/data/ -l logfile start

# start the database
$ pg_ctl -D /home/postgres/data/ -l logfile start
waiting for server to start.... done
server started

# check by logging in
$ psql
psql (13.7)
Type "help" for help.

postgres=# \c
You are now connected to database "postgres" as user "postgres".
postgres=# \q
$ 

Now our installation and database initialization is complete, but one thing is left and that is creation of service. See we have installed the PostgreSQL database using source code method which is a manual method, and if something happens to the machine, like crash or reboot then the database will not come up automatically for that we will have to re-run the `pg_ctl` command. 

To avoid that, we will create a new service called `postgresql13`, which will make sure that our PostgreSQL service comes up automatically if machine reboots. Depending upon system that you are using the method might be different but a Google search will help you in that. I am using CentOS (RHEL Flavor), so here I will tell you how you can configure.

There are some start scripts available under `contrib/start-scripts`, we will use those to create a service that we can use.

$ ls -ltr contrib/start-scripts/
total 8
-rw-r--r--. 1 akgirme akgirme 1467 May 10  2022 freebsd
drwxrwxr-x. 2 akgirme akgirme   84 May 10  2022 macos
-rw-r--r--. 1 akgirme akgirme 3554 Dec 13 21:45 linux

# We will copy the linux file to init.d with name postgresql13
# this will create a service with name postgresql13
$ sudo cp contrib/start-scripts/linux /etc/init.d/postgresql13

# load the updated service file from disk
$ sudo systemctl daemon-reload
# enable the service to start after reboot
$ sudo systemctl enable postgresql13
postgresql13.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig postgresql13 on

# start the service and check if you are able to login to database
$ sudo systemctl start postgresql13
$ sudo systemctl status postgresql13
● postgresql13.service - SYSV: PostgreSQL RDBMS
   Loaded: loaded (/etc/rc.d/init.d/postgresql13; bad; vendor preset: disabled)
   Active: active (exited) since Wed 2023-12-13 21:56:01 IST; 5s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 1403 ExecStart=/etc/rc.d/init.d/postgresql13 start (code=exited, status=0/SUCCESS)

Dec 13 21:56:01 centos7 systemd[1]: Starting SYSV: PostgreSQL RDBMS...
Dec 13 21:56:01 centos7 su[1404]: (to postgres) root on none
Dec 13 21:56:01 centos7 systemd[1]: Started SYSV: PostgreSQL RDBMS.
Dec 13 21:56:01 centos7 postgresql13[1403]: Starting PostgreSQL: ok
$ sudo su - postgres
Last login: Wed Dec 13 21:56:01 IST 2023
$ psql
psql (13.7)
Type "help" for help.

postgres=# \q

Now we are done with the source code installation.

Package Installation

This is opposite of the source code installation. In this installation type, all prerequisites and post-installation tasks will be taken care by the package itself. You just have to use the package manager (yum/dnf/apt-get) to install the database software.

For this I will be using fresh installation and I will first show you that it does not have any `postgres` user or installation already in place.

# Check if user postgres exists
$ id postgres
id: postgres: no such user
$ cat /etc/passwd | grep postgres

# Check if any PostgreSQL installation exists
$ ls -ld /usr/local/pg*
ls: cannot access /usr/local/pg*: No such file or directory
$ ls -ld /usr/pg*
ls: cannot access /usr/local/post*: No such file or directory

From this we can confirm that this is fresh virtual machine which does not have any PostgreSQL installation, now lets start with the package installation. For this installation type, we would first need a repository file and then we can install the database software.

For creating the repository file, go to PostgreSQL downloads and then select the operating system you want to install on, for me I will select Linux → Red Hat/Rocky/AlmaLinux (Distribution). Then a new page will open from there, select Version and Platform. It will then present you with one copy paste script like below:

# Install repository RPM
sudo yum install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

# Install PostgreSQL
sudo yum install -y postgresql14-server

# Optionally initialize the database and enable automatic start
sudo /usr/pgsql-14/bin/postgresql-14-setup initdb
sudo systemctl enable postgresql-14
sudo systemctl start postgresql-14

Here I am going to install PostgreSQL 14, so I am presented with above snippet. You can run either all the commands at once or one by one. The `postgresql-14-setup` handles the post process which is required, it create symbolic links to required binaries like `psql`, `pg_dump`, etc. in `/usr/bin`.

This installation type will have default installation location as `/usr/pgsql-<MajorVersion>` and the data directory will be `/var/lib/pgsql/<MajorVersion>/data`. The home path for `postgres` user will be `/var/lib/pgsql/`. You can validate the installation the same way we did in source code installation.

Conclusion: In realtime, mostly the database administrators will prefer going will package installation where almost everything is automatically handled. Source code installation will only be preferred in scenarios where there is no repository in PostgreSQL community website or there any special requirements by the clients, like having a custom directory or create different locations for minor versions as well.

This all for this post, in the next post, I hope you have liked it, if there is any feedback that you want to give, please feel free to write a comment.

The next post will discuss about folder structure of a PostgreSQL database cluster.

No comments :

Post a Comment