
Alexey Lesovsky: Upgrading inheritance partitioning made easy.


An alternative way to upgrade to PostgreSQL 10 native partitioning using ALTER TABLEs and PLPGSQL.

Today's post is a bit spontaneous: a few days ago I read a post about PostgreSQL's new native table partitioning that explained how to migrate from the old inheritance-based partitioning to the new scheme.

The author of that post proposed using the pg_dump and pg_restore utilities. That approach is quite simple and requires minimal effort. However, in some instances, especially with large partitions, dumping the partitions' data, restoring it, and rebuilding indexes can take a long time.


In this post, I would like to offer an alternative way to upgrade the partitioning. The method is based on a handful of ALTER TABLE commands wrapped in a PLPGSQL procedure. Of course, this technique only works if you have already upgraded to PostgreSQL 10.

Let's consider a classic example: a master table and several child tables partitioned on a timestamp column. The following commands create our example tables.
# CREATE TABLE events (
    id serial primary key,
    cta_point integer,
    decay_factor integer,
    created_at timestamp without time zone);
# CREATE TABLE events_201708 (LIKE events INCLUDING ALL) INHERITS (events);
# CREATE TABLE events_201709 (LIKE events INCLUDING ALL) INHERITS (events);
# CREATE TABLE events_201710 (LIKE events INCLUDING ALL) INHERITS (events);
# CREATE TABLE events_201711 (LIKE events INCLUDING ALL) INHERITS (events);

The next step is to create a trigger function, and the trigger itself, which routes new records inserted into the master table to the right partition.

# CREATE OR REPLACE FUNCTION fn_insert_events() RETURNS TRIGGER AS
    $function$
    BEGIN 
    EXECUTE format($$INSERT INTO %I VALUES ($1.*)$$, 'events_'||to_char(NEW.created_at, 'YYYYMM')) USING NEW;
    RETURN NULL;
    END;
    $function$ LANGUAGE plpgsql;
# CREATE TRIGGER events_trigger BEFORE INSERT ON events FOR EACH ROW EXECUTE PROCEDURE fn_insert_events();

As the last step, let's add some generated data to our partitions.

# INSERT INTO events (cta_point,decay_factor,created_at)
SELECT (random() * 80 + 60)::int,(random() * 100 + 400)::int,i
FROM generate_series('2017-08-01'::timestamp,'2017-11-30','10 minutes') gs(i);


Check that everything is OK with the data distribution: the master table should not contain any rows (we will double-check this below), and the partitions should be evenly filled.
# \dt+
                         List of relations
 Schema |     Name      | Type  |   Owner  |   Size  | Description
--------+---------------+-------+----------+---------+-------------
 public |        events | table | postgres | 0 bytes |
 public | events_201708 | table | postgres |  256 kB |
 public | events_201709 | table | postgres |  248 kB |
 public | events_201710 | table | postgres |  256 kB |
 public | events_201711 | table | postgres |  240 kB |
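
To double-check that the master table itself holds no rows, we can also count over ONLY the parent:

# SELECT count(*) FROM ONLY events;   -- expected to return 0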

Once the test partitions are ready, we can prepare the upgrade procedure.

Declarative partitioning in PostgreSQL 10 supports not only CREATE TABLE syntax but ALTER TABLE as well. In particular, it allows attaching and detaching tables as partitions of an existing table. So there is no barrier to walking through all the partitions, switching off inheritance, and attaching each one to a new master table. That is the general idea, but how does it actually work?

First, of course, we must create the new master table with CREATE TABLE and specify the basic partitioning rule, RANGE or LIST. In our case, RANGE is the appropriate choice. The next step is to get the list of all partitions of the source table, which can be obtained from the pg_class and pg_inherits system catalogs. Once the list is ready, we disable inheritance on each partition and attach it to the new master table.

There are a few points to note here. First, when we attach a partition to the master table we must specify the minimum and maximum values that act as boundaries for the rows it holds. These values can be derived from the partitions' names: in my practice, partition names indirectly point to the interval of stored rows; the most obvious examples are daily or monthly partitions named with a YYYYMMDD, YYYY_MM, YYYY_MM_DD or similar suffix. Table names are not universal, though, so this does not guarantee 100% accuracy, but it is an easy way to identify the intervals and works in most cases.

Thus, we need to extract the required information from the table's name and translate it into values acceptable to the ALTER TABLE command. This can be done with various built-in string and date/time functions.
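
As a quick illustration, assuming monthly partitions suffixed with YYYYMM, the boundaries for events_201708 could be derived like this:

# SELECT to_timestamp(right('events_201708', 6), 'YYYYMM')::timestamp                      AS range_from,
         to_timestamp(right('events_201708', 6), 'YYYYMM')::timestamp + interval '1 month' AS range_to;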

Another key point is sequence handling. If the original table uses a sequence, it should also be attached to the new table.

It seems there are no further pitfalls to fall into, and all of these ideas can be implemented in a single function, sketched below.
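
Here is a minimal sketch of what such a function could contain (the version actually loaded from fn_upgrade_partitioning.sql below may differ). It assumes monthly partitions named <parent>_YYYYMM, builds the new master table as <parent>_new, and expects the default <parent>_<column>_seq sequence naming; adapt the name parsing and the interval to your own scheme.

CREATE OR REPLACE FUNCTION fn_upgrade_partitioning(parent text, part_key text, seq_col text)
RETURNS void AS
$function$
DECLARE
    child     text;
    new_table text := parent || '_new';
    lower_ts  timestamp;
BEGIN
    -- New master table with the same structure, using declarative partitioning.
    EXECUTE format('CREATE TABLE %I (LIKE %I INCLUDING DEFAULTS INCLUDING CONSTRAINTS) PARTITION BY RANGE (%I)',
                   new_table, parent, part_key);

    -- Walk over all inheritance children of the old master table.
    FOR child IN
        SELECT c.relname
          FROM pg_inherits i
          JOIN pg_class c ON c.oid = i.inhrelid
         WHERE i.inhparent = parent::regclass
         ORDER BY c.relname
    LOOP
        -- Derive the monthly boundaries from the partition name suffix (YYYYMM).
        lower_ts := to_timestamp(right(child, 6), 'YYYYMM');

        EXECUTE format('ALTER TABLE %I NO INHERIT %I', child, parent);
        EXECUTE format('ALTER TABLE %I ATTACH PARTITION %I FOR VALUES FROM (%L) TO (%L)',
                       new_table, child, lower_ts, lower_ts + interval '1 month');
        RAISE NOTICE '% reattached from % to %', child, parent, new_table;
    END LOOP;

    -- Hand the sequence over to the new master table, then swap the tables.
    EXECUTE format('ALTER SEQUENCE %I OWNED BY %I.%I',
                   parent || '_' || seq_col || '_seq', new_table, seq_col);
    EXECUTE format('DROP TABLE %I', parent);   -- the old parent holds no rows by now
    EXECUTE format('ALTER TABLE %I RENAME TO %I', new_table, parent);

    RAISE NOTICE 'partitioning for % upgraded.', parent;
END;
$function$ LANGUAGE plpgsql;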

# \i ~/Temp/fn_upgrade_partitioning.sql
CREATE FUNCTION

Once the function is created, we can upgrade our existing partitioning. The function takes three arguments: the name of the master table to upgrade, the name of the column used as the partition key, and the name of the column backed by a sequence.

# SELECT fn_upgrade_partitioning('events', 'created_at', 'id');
NOTICE: events_201708 reattached from events to events_new
NOTICE: events_201709 reattached from events to events_new
NOTICE: events_201710 reattached from events to events_new
NOTICE: events_201711 reattached from events to events_new
NOTICE: partitioning for events upgraded.
fn_upgrade_partitioning
-------------------------
(1 row)
After the upgrade, let's check that newly generated data is inserted without problems.

# INSERT INTO events (cta_point,decay_factor,created_at)
SELECT (random() * 80 + 60)::int,(random() * 100 + 400)::int,i
FROM generate_series('2017-08-01'::timestamp,'2017-11-30','5 minutes') gs(i);
INSERT 0 34849


Sizes and data distribution also look fine.
# \dt+
                         List of relations
 Schema |     Name      | Type  |   Owner  |   Size  | Description
--------+---------------+-------+----------+---------+-------------
 public |        events | table | postgres | 0 bytes |
 public | events_201708 | table | postgres |  712 kB |
 public | events_201709 | table | postgres |  688 kB |
 public | events_201710 | table | postgres |  712 kB |
 public | events_201711 | table | postgres |  664 kB |


As mentioned earlier, this approach is fast and does not require dumping and reloading the data. It does require some PLPGSQL skills, but that is a very useful skill to have. With small changes you can adapt the function to your particular partitioning scheme and upgrade to PostgreSQL 10 declarative partitioning quickly.

Hope you enjoyed this post. If you have tried alternative ways to migrate from the old inheritance partitioning to the new one, let me know in the comments!

Shaun M. Thomas: PG Phriday: Getting RAD with Docker [Part 2]


In our last article, we explored how to run Postgres in some very basic Docker scenarios. Based on our experiments back then, we can obtain images, create containers, and mount to various types of storage.

Boring!

It’s not just boring, it’s mundane. It doesn’t do anything. Sure we can run Postgres in a container, but that’s true about a lot of things. You know what’s exciting? Setting up Postgres streaming replication between two docker containers.

Let’s get started.

He say “I know you, you know me”

The first thing we need to do is create a container that will be our primary host. Postgres requires a user with REPLICATION permission, as well as a special entry in the pg_hba.conf file for the “replication” pseudo-database. We could start a regular Postgres container, connect to it, and set all of that up ourselves, but it’s much cooler to create our own image which does that for us.

The Postgres Docker Hub image has excellent instructions for extending the official image, so let’s start there.

Since we know the two modifications we need to make, let's create a basic script to automate the steps:

#!/bin/bash

if [ $(grep -c "replication repuser" ${PGDATA}/pg_hba.conf) -gt 0 ]; then
    exit 0
fi

psql -U "$POSTGRES_USER" \
     -c "CREATE USER repuser WITH REPLICATION" postgres

echo "host replication repuser all trust" >> ${PGDATA}/pg_hba.conf
pg_ctl -D ${PGDATA} reload

Since we could be attaching to a pre-existing Postgres volume, we need to check whether or not we already modified it so the repuser user could connect. Aside from that, all we do is create the user and add a very insecure host entry so it can connect to initiate streaming replication.

He got hair down to his knee

The next step is to create a Dockerfile to define our image. We didn’t do much, so that’s a thankfully easy task. Here’s how that looks:

FROM postgres:latest

COPY scripts/enable_replication.sh /docker-entrypoint-initdb.d

We saved our previous script as scripts/enable_replication.sh to keep things together. The official Postgres image suggests we can copy that file to a special location and it’ll run automatically. Not only is that convenient, but it keeps us from having to know a lot of Docker syntax at this point.

Here’s the full contents of the folder where we’re constructing our image source:

postgres-primary/
postgres-primary/scripts
postgres-primary/scripts/enable_replication.sh
postgres-primary/Dockerfile

And now we just create our image:

$> docker build --tag postgres-primary postgres-primary

Sending build context to Docker daemon  5.632kB
Step 1/2 : FROM postgres:latest
 ---> 4860bdf1a517
Step 2/2 : COPY scripts/enable_replication.sh /docker-entrypoint-initdb.d
 ---> 03935680f9e2
Successfully built 03935680f9e2
Successfully tagged postgres-primary:latest

Got to be a joker he just do what he please

That’s one image out of the way, but a replication stream requires at least one more component: a replica. To build that, we actually have a slight problem because we need to know the hostname of the container we’re cloning. We can remove that roadblock by requiring users to pass an environment variable with that information.

We also need to stop and erase any existing Postgres instance created by the official image, while also refraining from erasing a legitimate replica. It’s a tall order, but this very basic script has all the requisite parts:

#!/bin/bash
set -e

# This is where we tell the invoker to set PRIMARY_NAME.

if [ "${PRIMARY_NAME}" == "unset" ]; then
  echo "ERROR: Please set PRIMARY_NAME to stream source container."
  exit 1
fi

# Don't erase an existing replica. This means we've already bootstrapped.

if [ -f ${PGDATA}/recovery.conf ]; then
  exit 0
fi

pg_ctl -D ${PGDATA} stop -m fast

rm -Rf ${PGDATA}/*

pg_basebackup -U repuser -X stream -R -c fast \
              -h ${PRIMARY_NAME} -D ${PGDATA}

pg_ctl -D ${PGDATA} start

The “trick” to this script is that we look for a recovery.conf file. That tells us a replica already exists in this instance, and we can skip replacing it with an upstream clone. A more “correct” script might actually check the file to ensure the replica is subscribed to the indicated PRIMARY_NAME, but this is just a demo so it’s OK to be a little sloppy.

He got feet down below his knee

As before, we need another Dockerfile to actually coordinate the moving parts, such as they are. This time we also need to define our environment variable along with copying the script.

FROM postgres:latest

ENV PRIMARY_NAME=unset

COPY scripts/bootstrap_replica.sh /docker-entrypoint-initdb.d

The ENV command requires we set the variable to something, so we just use ‘unset’. Beyond that, there’s really not much else to do. All the heavy lifting is still in the official Postgres image.

Here’s the full contents of the folder where we’re constructing our image source:

postgres-replica/
postgres-replica/scripts
postgres-replica/scripts/bootstrap_replica.sh
postgres-replica/Dockerfile

And now we just create our image:

$> docker build --tag postgres-replica postgres-replica

Sending build context to Docker daemon  3.584kB
Step 1/3 : FROM postgres:latest
 ---> 4860bdf1a517
Step 2/3 : ENV PRIMARY_NAME unset
 ---> Running in 0a119ba1f6c8
 ---> 2f389fa04d93
Removing intermediate container 0a119ba1f6c8
Step 3/3 : COPY scripts/bootstrap_replica.sh /docker-entrypoint-initdb.d
 ---> 90796d5559b7
Successfully built 90796d5559b7
Successfully tagged postgres-replica:latest

He got muddy water, he one mojo filter

One of the nice things about Docker is that it has built-in networking capabilities. Networking can be an unwieldy and cantankerous Rube Goldberg contraption of the highest order, but Docker tucks all that away into some unholy dark recesses where it can’t hurt us.

All we need to know is that we can create a network bridge. Once we have a network bridge, we can assign it to newly launched containers. This gives all containers assigned to that bridge the same network operational context, and containers on the bridge can reach each other by container name, making it trivial to reference them later.

Making the bridge itself is pretty darn easy:

$> docker network create --driver bridge pg_stream

d7fbe614030ce14485c219db1f6e23d07a373c1667877dfaa6a5f1e287a51c9a

As with everything Docker, we get a hash designating our network. Now let’s put it to use!

He say, “One and one and one is three”

Now comes the fun part. In our last article, we exposed the network interface to the host machine, but this time we’re going to specifically avoid that. Port mapping is ugly and there are better ways to connect to containers, especially when they can interact in their own network sandbox.

Let’s begin by actually launching the containers:

$> docker run --name pg-primary --network pg_stream -d postgres-primary

96d1a822e436570708e5e3de39ada41c1f07077795cff64f2796fd7afe1aa213

$> docker run --name pg-replica --network pg_stream -d \
      -e PRIMARY_NAME=pg-primary postgres-replica

9077a33bf25be4553b037137df5d43fb67f4ece0df4db69cc40cd384bb518df8

Since running a daemonized container also hides the console output, we need some way of verifying those output hashes mean everything worked as expected. As it turns out, we can connect directly to the container of the primary node and outright ask it if there are any streaming replicas connected.

Let’s do that now:

$> docker exec -it pg-primary psql -U postgres -x \
       -c "SELECT * FROM pg_stat_replication"

-[ RECORD 1 ]----+------------------------------
pid              | 76
usesysid         | 16384
usename          | repuser
application_name | walreceiver
client_addr      | 172.18.0.3
client_hostname  | 
client_port      | 39600
backend_start    | 2017-11-05 23:44:01.613458+00
backend_xmin     | 
state            | streaming
sent_lsn         | 0/3000060
write_lsn        | 0/3000060
flush_lsn        | 0/3000060
replay_lsn       | 0/3000060
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async

Well how about that? One streaming replica. How about another?

$> docker run --name pg-replica2 --network pg_stream -d \
      -e PRIMARY_NAME=pg-primary postgres-replica

3e00427e330ab202a3f8ba1c64823c8b67bfaa134da1f3083c8f5caa05fa8241

$> docker exec -it pg-primary psql -U postgres \
       -c "SELECT pid, client_addr FROM pg_stat_replication"

 pid | client_addr 
-----+-------------
  76 | 172.18.0.3
  97 | 172.18.0.4

Has it ever been easier to clone a Postgres node?

One thing I can tell you is you got to be free

The cool thing about the replica recipe is that it works for any Postgres container. If we followed some of the steps in the last article, we may have created the Postgres data directory on the native filesystem. Maybe that’s even where it started, and we’re just using a temporary container to investigate its contents.

So long as the primary container is running, we can clone it over and over again. The replicas can also live on the native filesystem, and so long as there’s a recovery.conf file, the replica will run normally. This also means if we promote a replica and restart the container, it will clone upstream again.

If we want to avoid that, we can write a more advanced bootstrap script that also checks for recovery.done, and if found, applies pg_rewind. That will allow us to follow the primary again without all the overhead of rebuilding the replica from scratch.

There are obviously more safety checks we could apply, but as a proof-of-concept, this is actually slightly ridiculous. Someone armed with these two images can create a primary Postgres node and an army of replicas with minimal knowledge about Postgres itself. It’s a set of black boxes perfect for stamping out a very simple cluster.

You can’t really get more RAD than that.


Hubert 'depesz' Lubaczewski: Waiting for PostgreSQL 11 – Add hash partitioning.

On 9th of November 2017, Robert Haas committed patch: Add hash partitioning. Hash partitioning is useful when you want to partition a growing data set evenly. This can be useful to keep table sizes reasonable, which makes maintenance operations such as VACUUM faster, or to enable partition-wise join. At present, we still depend on constraint […]
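
For context, the committed syntax splits a table by a hash of the partition key into a fixed number of partitions chosen by modulus and remainder. A minimal illustration (table and column names invented for this example):

CREATE TABLE clicks (user_id int, url text) PARTITION BY HASH (user_id);
CREATE TABLE clicks_p0 PARTITION OF clicks FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE clicks_p1 PARTITION OF clicks FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE clicks_p2 PARTITION OF clicks FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE clicks_p3 PARTITION OF clicks FOR VALUES WITH (MODULUS 4, REMAINDER 3);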

Jobin Augustine: Migrating partitioned tables to PostgreSQL 10’s Native Partitioning


In this article, we will focus on upgrading from versions 9.4 and above to PostgreSQL 10, leveraging the full features of native partitioning. For elaborate details on PostgreSQL 10 partitioning, please refer to our KB article Postgresql 10 partitioning.

Test environment:

  • OS: CentOS 7
  • PostgreSQL versions: 9.6.3 and 10.0
  • pg_partman version: 3.1.0

Migrating from a 9.x version (partitioned tables with inheritance) to pg10 (native partitioning)

Consider a database with several tables, some of which are quite huge. In 9.6, as is well known, PostgreSQL improves performance on huge tables through what is called "partitioning": the huge tables are managed internally as partitioned tables using the inheritance feature. Suppose we have a table "inventory" with a million records. This table has four child tables (inv_quarter1, inv_quarter2, inv_quarter3, and inv_quarter4) which inherit their properties from the parent table inventory.

postgres=# \d+ test.inventory
                                  Table "test.inventory"
    Column    |         Type          | Modifiers | Storage  | Stats target | Description
--------------+-----------------------+-----------+----------+--------------+-------------
 product      | character varying(10) |           | extended |              |
 units_sold   | character varying(10) |           | extended |              |
 invoice_date | date                  |           | plain    |              |
 units_remain | character varying(10) |           | extended |              |
Triggers:
    orders_insert_trigger BEFORE INSERT ON test.inventory FOR EACH ROW EXECUTE PROCEDURE orders_insert_simple()
Child tables: test.inv_quarter1,
              test.inv_quarter2,
              test.inv_quarter3,
              test.inv_quarter4

postgres=# \d+ test.inv_quarter1
                                Table "test.inv_quarter1"
    Column    |         Type          | Modifiers | Storage  | Stats target | Description
--------------+-----------------------+-----------+----------+--------------+-------------
 product      | character varying(10) |           | extended |              |
 units_sold   | character varying(10) |           | extended |              |
 invoice_date | date                  |           | plain    |              |
 units_remain | character varying(10) |           | extended |              |
Check constraints:
    "inv_quarter1_invoice_date_check" CHECK (invoice_date >= '2017-01-01'::date AND invoice_date < '2017-04-01'::date)
Inherits: test.inventory

To upgrade to pg10, a full database dump and restore is required. When done this way, the restored huge tables continue to use the inheritance feature.

-bash-4.2$ /usr/pgsql-10/bin/pg_dump -d postgres -p 5433 -Fc  > /tmp/pg96.dump

-bash-4.2$ /usr/pgsql-10/bin/pg_restore -d postgres /tmp/pg96.dump

postgres=# \d+ inventory
                                            Table "test.inventory"
    Column    |         Type          | Collation | Nullable | Default | Storage  | Stats target | Description
--------------+-----------------------+-----------+----------+---------+----------+--------------+-------------
 product      | character varying(10) |           |          |         | extended |              |
 units_sold   | character varying(10) |           |          |         | extended |              |
 invoice_date | date                  |           |          |         | plain    |              |
 units_remain | character varying(10) |           |          |         | extended |              |
Triggers:
    orders_insert_trigger BEFORE INSERT ON inventory FOR EACH ROW EXECUTE PROCEDURE public.orders_insert_simple()
Child tables: inv_quarter1,
              inv_quarter2,
              inv_quarter3,
              inv_quarter4

NB: It is good practice to run pg_dump and pg_restore with binaries of the same version. In these test cases, since we are migrating to pg10, pg10 binaries are used for both the dump and the restore.

To exploit the performance benefits of pg10 native partitioning for these tables, we need to manually create the parent and child tables in the pg10 environment and then pg_restore the data only.

create table test.inventory (product varchar(10),
            units_sold varchar(10),
            invoice_date date,
            units_remain varchar(10))
   partition by RANGE (invoice_date);
CREATE TABLE

create table test.inv_quarter1 partition of inventory 
for values from ('2017-01-01') to ('2017-04-01');
CREATE TABLE
create table test.inv_quarter2 partition of inventory 
for values from ('2017-04-01') to ('2017-07-01');
CREATE TABLE
create table test.inv_quarter3 partition of inventory 
for values from ('2017-07-01') to ('2017-10-01');
CREATE TABLE
create table test.inv_quarter4 partition of inventory 
for values from ('2017-10-01') to ('2018-01-01');
CREATE TABLE

-bash-4.2$ /usr/pgsql-10/bin/pg_dump -d postgres -Fc -t test.inventory -t 'test.inv_quarter*' -a  -f /tmp/pg96.dump
-bash-4.2$ /usr/pgsql-10/bin/pg_restore -d postgres  /tmp/pg96.dump
-bash-4.2$ /usr/pgsql-10/bin/psql
psql (10.0)
Type "help" for help.
postgres=# \d+ test.inventory
                                            Table "test.inventory"
    Column    |         Type          | Collation | Nullable | Default | Storage  | Stats target | Description
--------------+-----------------------+-----------+----------+---------+----------+--------------+-------------
 product      | character varying(10) |           |          |         | extended |              |
 units_sold   | character varying(10) |           |          |         | extended |              |
 invoice_date | date                  |           |          |         | plain    |              |
 units_remain | character varying(10) |           |          |         | extended |              |
Partition key: RANGE (invoice_date)
Partitions: test.inv_quarter1 FOR VALUES FROM ('2017-01-01') TO ('2017-04-01'),
            test.inv_quarter2 FOR VALUES FROM ('2017-04-01') TO ('2017-07-01'),
            test.inv_quarter3 FOR VALUES FROM ('2017-07-01') TO ('2017-10-01'),
            test.inv_quarter4 FOR VALUES FROM ('2017-10-01') TO ('2018-01-01')

postgres=# select count(*) from only test.inventory;
 count
-------
     0
(1 row)

postgres=# select count(*) from only test.inv_quarter1;
 count
--------
 246600
(1 row)

postgres=# select count(*) from only test.inv_quarter2;
 count
--------
 249340
(1 row)

postgres=# select count(*) from only test.inv_quarter3;
 count
--------
 252071
(1 row)

postgres=# select count(*) from only test.inv_quarter4;
 count
--------
 251988
(1 row)

The above is an example of RANGE partitioning; similar steps apply to LIST partitioning, as in the sketch below.
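
A hypothetical LIST-partitioned variant, with an invented region column and invented values, would look like this:

create table test.inventory_by_region (
            product varchar(10),
            units_sold varchar(10),
            invoice_date date,
            region text)
   partition by LIST (region);

create table test.inv_emea partition of test.inventory_by_region for values in ('EMEA');
create table test.inv_apac partition of test.inventory_by_region for values in ('APAC');
create table test.inv_amer partition of test.inventory_by_region for values in ('AMER');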

Once the table structures are in place, a data-only pg_restore loads the records into the appropriate child tables. The upgrade is then complete, with the full benefit of pg10's native partitioning.

Migrating from 9.4 (partitioned tables managed by pg_partman)

Likewise, let us assume a 9.4 environment in which the partitioned tables are managed by pg_partman. For illustration, say the table "sales" has 1 million records spread across monthly child tables. Let us look at two methods with which we can upgrade to pg10.

postgres=# \dx
                               List of installed extensions
    Name    | Version |   Schema   |                     Description
------------+---------+------------+------------------------------------------------------
 pg_partman | 3.1.0   | partman    | Extension to manage partitioned tables by time or ID


postgres=# \d+ test.sales
                                                               Table "test.sales"
   Column   |           Type           |                            Modifiers                            | Storage  | Stats target | Description
------------+--------------------------+-----------------------------------------------------------------+----------+--------------+-------------
 inv_number | integer                  | not null default nextval('test.sales_inv_number_seq'::regclass) | plain    |              |
 product    | text                     |                                                                 | extended |              |
 inv_date   | timestamp with time zone | not null default now()                                          | plain    |              |
Triggers:
    sales_part_trig BEFORE INSERT ON test.sales FOR EACH ROW EXECUTE PROCEDURE test.sales_part_trig_func()
Child tables: test.sales_p2017_04,
              test.sales_p2017_05,
              test.sales_p2017_06,
              test.sales_p2017_07,
              test.sales_p2017_08,
              test.sales_p2017_09,
              test.sales_p2017_10,
              test.sales_p2017_11,
              test.sales_p2017_12,
              test.sales_p2018_01,
              test.sales_p2018_02,
              test.sales_p2018_03,
              test.sales_p2018_04

For the upgrade, take a pg_dump of the 9.x database. Before performing the pg_restore on pg10, ensure the postgresql.conf parameters below are set. A restart of the pg10 server is needed for these parameters to take effect.

--- postgresql.conf settings in pg10

shared_preload_libraries = 'pg_partman_bgw' 
pg_partman_bgw.interval = 3600 
pg_partman_bgw.role = 'postgres' 
pg_partman_bgw.dbname = 'postgres'

Now pg_restore will install the pg_partman extension and the respective tables, along with their metadata.

postgres=# \d+ test.sales
                                                                Table "test.sales"
   Column   |           Type           | Collation | Nullable |                  Default                  | Storage  | Stats target | Description
------------+--------------------------+-----------+----------+-------------------------------------------+----------+--------------+-------------
 inv_number | integer                  |           | not null | nextval('sales_inv_number_seq'::regclass) | plain    |              |
 product    | text                     |           |          |                                           | extended |              |
 inv_date   | timestamp with time zone |           | not null | now()                                     | plain    |              |
Triggers:
    sales_part_trig BEFORE INSERT ON sales FOR EACH ROW EXECUTE PROCEDURE sales_part_trig_func()
Child tables: sales_p2017_04,
              sales_p2017_05,
              sales_p2017_06,
              sales_p2017_07,
              sales_p2017_08,
              sales_p2017_09,
              sales_p2017_10,
              sales_p2017_11,
              sales_p2017_12,
              sales_p2018_01,
              sales_p2018_02,
              sales_p2018_03,
              sales_p2018_04

postgres=# select * from partman.part_config;
-[ RECORD 1 ]--------------+-----------
parent_table               | test.sales
control                    | inv_date
partition_type             | partman
partition_interval         | 1 mon
constraint_cols            |
premake                    | 4
optimize_trigger           | 4
optimize_constraint        | 30
epoch                      | none
inherit_fk                 | t
retention                  |
retention_schema           |
retention_keep_table       | t
retention_keep_index       | t
infinite_time_partitions   | f
datetime_string            | YYYY_MM
automatic_maintenance      | on
jobmon                     | t
sub_partition_set_full     | f
undo_in_progress           | f
trigger_exception_handling | f
upsert                     |
trigger_return_null        | t
template_table             |

As we have seen, in 9.x data is loaded into the child tables using triggers. When restored as is, the newly created child tables in pg10 also get their data loaded via triggers. We can still continue to manage these tables using pg_partman; however, the native partitioning feature will not be utilized.

Conversely, to use the native partitioning feature of pg10 while continuing to use pg_partman to manage the partitioned tables, first identify the partitioned tables in 9.x and create the parent tables manually in the pg10 environment. Then use pg_partman to create the child tables with pg10's native RANGE or LIST partitioning. Note that pg_partman 3.1.0 was used for this demonstration.

postgres=# CREATE TABLE test.sales (inv_number serial, product text, inv_date timestamptz NOT NULL DEFAULT now()) partition by RANGE (inv_date);
CREATE TABLE

postgres=# SELECT partman.create_parent('test.sales', 'inv_date', 'native', 'monthly');
 create_parent
---------------
 t
(1 row)

postgres=# \d+ test.sales
                                                                  Table "test.sales"
   Column   |           Type           | Collation | Nullable |                    Default                     | Storage  | Stats target | Description
------------+--------------------------+-----------+----------+------------------------------------------------+----------+--------------+-------------
 inv_number | integer                  |           | not null | nextval('test.sales_inv_number_seq'::regclass) | plain    |              |
 product    | text                     |           |          |                                                | extended |              |
 inv_date   | timestamp with time zone |           | not null | now()                                          | plain    |              |
Partition key: RANGE (inv_date)
Partitions: test.sales_p2017_04 FOR VALUES FROM ('2017-04-01 00:00:00+05:30') TO ('2017-05-01 00:00:00+05:30'),
            test.sales_p2017_05 FOR VALUES FROM ('2017-05-01 00:00:00+05:30') TO ('2017-06-01 00:00:00+05:30'),
            test.sales_p2017_06 FOR VALUES FROM ('2017-06-01 00:00:00+05:30') TO ('2017-07-01 00:00:00+05:30'),
            test.sales_p2017_07 FOR VALUES FROM ('2017-07-01 00:00:00+05:30') TO ('2017-08-01 00:00:00+05:30'),
            test.sales_p2017_08 FOR VALUES FROM ('2017-08-01 00:00:00+05:30') TO ('2017-09-01 00:00:00+05:30'),
            test.sales_p2017_09 FOR VALUES FROM ('2017-09-01 00:00:00+05:30') TO ('2017-10-01 00:00:00+05:30'),
            test.sales_p2017_10 FOR VALUES FROM ('2017-10-01 00:00:00+05:30') TO ('2017-11-01 00:00:00+05:30'),
            test.sales_p2017_11 FOR VALUES FROM ('2017-11-01 00:00:00+05:30') TO ('2017-12-01 00:00:00+05:30'),
            test.sales_p2017_12 FOR VALUES FROM ('2017-12-01 00:00:00+05:30') TO ('2018-01-01 00:00:00+05:30'),
            test.sales_p2018_01 FOR VALUES FROM ('2018-01-01 00:00:00+05:30') TO ('2018-02-01 00:00:00+05:30'),
            test.sales_p2018_02 FOR VALUES FROM ('2018-02-01 00:00:00+05:30') TO ('2018-03-01 00:00:00+05:30'),
            test.sales_p2018_03 FOR VALUES FROM ('2018-03-01 00:00:00+05:30') TO ('2018-04-01 00:00:00+05:30'),
            test.sales_p2018_04 FOR VALUES FROM ('2018-04-01 00:00:00+05:30') TO ('2018-05-01 00:00:00+05:30')

A data-only pg_dump of these tables on the 9.x server, followed by a pg_restore into pg10, completes the upgrade.

-bash-4.2$ /usr/pgsql-10/bin/pg_dump -d postgres -Fc -t test.sales -t 'test.sales_p*' -a -f /tmp/pg96_data.dump
-bash-4.2$ /usr/pgsql-10/bin/pg_restore -d postgres  /tmp/pg96_data.dump
-bash-4.2$ /usr/pgsql-10/bin/psql
psql (10.0)
Type "help" for help.

postgres=# select count(*) from test.sales;
 count
--------
 999999
(1 row)

postgres=# select count(*) from only test.sales;
 count
-------
     0
(1 row)

postgres=# select count(*) from test.sales_p2017_04;
 count
-------
 75960
(1 row)

Conclusion: pg_partman introduced a "native" way of managing partitioned tables in version 3.1.0. Since pg_partman 3.1.0 can be installed on PostgreSQL 9.4 and above, this upgrade path works from 9.4 onwards. For environments not managed by pg_partman, this restriction does not apply.

To summarize: if you are thinking about upgrading your current server to pg10, make sure your huge tables take advantage of PostgreSQL 10's native partitioning features.


Craig Kerstiens: Faster bulk loading in Postgres with copy


If you’ve used a relational database, you understand basic INSERT statements. Even if you come from a NoSQL background, you likely grok inserts. Within the Postgres world, there is a utility that is useful for fast bulk ingestion: \copy. Postgres \copy is a mechanism for you to bulk load data in or out of Postgre.

First, let's pause. Do you even need to bulk load data, and what does it have to do with Citus? We see customers leverage Citus for a number of different uses. When looking at Citus for a transactional workload, say as the system of record for some kind of multi-tenant SaaS app, your app is mostly performing standard inserts/updates/deletes.

But when you’re leveraging Citus for real-time analytics, you may already have a separate ingest pipeline. In this case you might be looking at event data, which can be higher in volume than a standard web app workload. If you already have an ingest pipeline that reads off Apache Kafka or Kinesis, you could be a great candidate for bulk ingest.

Back to our feature presentation: Postgres copy. Copy is interesting because you can achieve much higher throughput than with single row inserts.

Postgres \copy:

  • Can be executed from within psql or an application language
  • Supports CSV as well as binary data
  • Is transactional

A look at Postgres \copy performance for bulk ingest

Let’s take a look at both how Postgres copy performs over single row inserts, then we’ll move onto how you can leverage \copy. There are essentially 3 primary ways to scale out your writes in single-node Postgres. And if you need to scale out ingest even further beyond a single node you can look to Citus as each node in a Citus database cluster is able to help you at scaling writes.

Starting simple with inserts, using sample GitHub event data

First let’s see how we perform on a standard insert.

To perform this test, I created a set of 1,000,000 inserts from the GitHub event dataset. I then connected to Postgres with psql and ran \i single_row_inserts.sql. This command executed all the insert queries.
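
The contents of single_row_inserts.sql are not shown here; a hypothetical sketch of its shape (the github_events column names are invented for illustration) looks like this:

-- single_row_inserts.sql: one statement per event
INSERT INTO github_events (event_type, repo_id, actor, created_at)
    VALUES ('PushEvent', 42, 'octocat', '2017-11-01 00:00:00');
INSERT INTO github_events (event_type, repo_id, actor, created_at)
    VALUES ('WatchEvent', 43, 'hubber', '2017-11-01 00:00:05');
-- ... and so on, for 1,000,000 single-row INSERT statements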

The result: it took 15 minutes 30 seconds to load up 1 million events records. Or right at 1,075 inserts per second on a small-size Postgres instance. Yes, I could have had a few more writers going at once and further tuned my test, but this gives us a starting baseline to compare against.

Performance tuning our inserts

It is of note here that each insert is a transaction. What this means is Postgres is doing some extra coordination to make sure the transaction is completed before returning. On every single write this takes some overhead. Instead of single row transactions, if we wrap all of our inserts in a transaction like below, we’ll see some nice performance gains:

begin;
insert 1;
insert 2;
insert 3;
...
commit;

This took my inserts down from 15 minutes 30 seconds to 5 minutes and 4 seconds. We’ve suddenly boosted our throughput by 3x to about 3k inserts per second.

Bulk ingest with even greater performance ft. Postgres \copy

By batching our inserts into a single transaction, we saw our throughput go higher. But hold on, there is even more we can do. The \copy mechanism gives a way to bulk load data in an even more performant manner. Here instead of raw insert statements, we need the data in some CSV format. Copy accepts a number of delimiters and even forms of binary data as well, but a standard CSV should work. Once we have our CSV of events we can load it up from within Postgres:

\copy github_events from 1millionevents.csv WITH csv;

Running this \copy completes in 82 seconds! We’re now processing over 10k writes per second on some fairly modest hardware.

But psql?

Okay, so those of you who are frequent readers of our Citus Data blog know that I love psql and the command line utilities in Postgres, but the CLI may not be helpful directly in your application. Fortunately, the copy utilities are available in most languages. Here's the quick rundown for a few popular languages:

Leveraging bulk loading in Ruby

Within Ruby you’ll need to be using the standard pq driver or ActiveRecord. The following gives you a rough sketch for establishing a connection from within ActiveRecord and loading from CSV on disk:

  conn = ActiveRecord::Base.connection
  rc = conn.raw_connection
  rc.exec("COPY my_table (col1, col2, etc) FROM STDIN WITH CSV")

  file = File.open('myfile.csv', 'r')
  while !file.eof?
    rc.put_copy_data(file.readline)
  end

  rc.put_copy_end

Leveraging bulk loading in Python

Of course, anything Ruby can do, Python (can do better ;) ). In Python, we leverage a similar process as we did in Ruby. With this code example we're going to generate the data in memory, though, as opposed to reading a CSV from disk. In production, in both Ruby and Python, you could parse the CSV out from disk or generate it live from some other part of your pipeline, like Kafka/Kinesis.

import StringIO   # Python 2; on Python 3 use io.StringIO instead
import psycopg2

conn = psycopg2.connect(DSN)   # DSN is your libpq connection string
curs = conn.cursor()

# Build a small in-memory buffer with one row per line.
data = StringIO.StringIO()
data.write('\n'.join(['1', '2', '3']))
data.seek(0)

curs.copy_from(data, 'my_table')
conn.commit()

Go faster with Postgres \copy, and even faster with Citus

Bulk ingestion with \copy is great for a lot of workloads, and it doesn't require you to load up CSVs with millions of records either. You can start with micro-batches of 100, 1,000, or 10,000 records and still see significant performance gains.

Even better, if you’re running into limits on a single-node Postgres instance, \copy is supported out of the box with Citus and gets parallelized so you can see even higher throughputs. Need help ingesting large amounts of data? Contact my team here at Citus and we’d be happy to chat.


Bruce Momjian: Data in the Database vs the File System


An age-old question is whether it is better to put data in a database or a file system. Of course, the answer is "it depends," but let's look at what it depends on.

First, Postgres stores its data in the file system, so the file system must be good for something. Postgres also stores some of its configuration files in the file system, e.g. pg_hba.conf, so it isn't simply the case that once you have a database, everything is better in a database.

It usually comes down to evaluating database positives vs. negatives — first the database storage positives:

  • Do you need to view the data in multiple ways?
  • Is synchronizing database data and file system data going to be difficult?
  • Do you need multi-object commit synchronization?
  • Do applications need a single, guaranteed-consistent view of their data?
  • Is a file system API unreasonable or inaccessible to clients?

Continue Reading »
