In SQL Server it is quite common to use IDENTITY columns for surrogate (non-natural) primary keys. In PostgreSQL, until version 10, only SERIAL columns could be used for the same purpose. But that has changed.
Why IDENTITY and not SERIAL and SEQUENCES?
SERIAL is a friendly way to set up a SEQUENCE, but in the end it is a SEQUENCE: an object that doesn't belong to the table but creates a dependency with it. It's not a real data type. It's important to keep this in mind when using it, because it has implications:
* Permissions must be granted on the sequence, not only on the table.
* When dropping the SEQUENCE, the DEFAULT on the column must be dropped first, and then the SEQUENCE.
* When altering the next value, it has to be done on the SEQUENCE, not on the table.
* When cloning the table DDL, the same SEQUENCE would be reused by the clone.
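A short sketch of these implications (the table, role, and sequence names here are hypothetical; the sequence name follows the usual tablename_column_seq pattern):

```sql
CREATE TABLE orders (id SERIAL PRIMARY KEY, note TEXT);

-- Granting INSERT on the table is not enough; the sequence needs USAGE too:
GRANT INSERT ON orders TO app_user;
GRANT USAGE ON SEQUENCE orders_id_seq TO app_user;

-- The next value is altered on the sequence, not on the table:
ALTER SEQUENCE orders_id_seq RESTART WITH 1000;

-- Cloning the DDL with INCLUDING DEFAULTS makes the clone reuse the SAME sequence:
CREATE TABLE orders_clone (LIKE orders INCLUDING DEFAULTS);
```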
As Peter Eisentraut explains:
The new way creates the sequence as a proper internal dependency of the table, so that various weird implementation details of the serial pseudotype are not exposed.
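In PostgreSQL 10 and later, an identity column is declared directly on the table; a minimal sketch (the table name is hypothetical):

```sql
CREATE TABLE invoices (
    id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    total numeric(10,2) NOT NULL
);

-- The next value is changed on the table itself, not on a separate sequence:
ALTER TABLE invoices ALTER COLUMN id RESTART WITH 5000;
```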
IDENTITY's limitations
I really like the flexibility and the syntax of PostgreSQL's IDENTITY. But I was surprised to find that PostgreSQL is more restrictive than SQL Server in the data types that allow the IDENTITY property, when trying to use it with a NUMERIC data type:
ERROR: identity column type must be smallint, integer, or bigint
SQL state: 22023
The biggest data type possible is bigint (8 bytes), which has a range from -9223372036854775808 to +9223372036854775807. This means that a NUMERIC(19) column cannot be converted to IDENTITY, and SERIAL must still be used.
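The error above can be reproduced with a hypothetical table:

```sql
-- NUMERIC(19) exceeds the bigint range, so the IDENTITY property is rejected:
CREATE TABLE big_ids (id numeric(19,0) GENERATED BY DEFAULT AS IDENTITY);
-- ERROR: identity column type must be smallint, integer, or bigint
```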
To find the columns that cannot be straightforwardly converted to IDENTITY:
SELECT table_catalog, table_schema, table_name, column_name, column_default, is_nullable, data_type, numeric_precision, numeric_scale
FROM INFORMATION_SCHEMA.COLUMNS WHERE DATA_TYPE = 'numeric' AND NUMERIC_PRECISION > 18
AND column_default LIKE 'nextval(''%'
ORDER BY table_catalog, table_schema, table_name, column_name;
For NUMERIC columns with more than 18 digits, IDENTITY could still be used if it's possible to downcast them to BIGINT. This function, available on GitHub, checks for every NUMERIC column with more than 18 digits whether the conversion to bigint is possible (reported in the column `downcast_possible`):
CREATE OR REPLACE FUNCTION numeric_serials_bigger_than_bigint(schema_name regnamespace)
/*
Checks, for every NUMERIC column with more than 18 digits, whether it can be converted to bigint (reported in column 'downcast_possible')
The function can check it in one schema or in all the database:
- SELECT * FROM numeric_serials_bigger_than_bigint('my_schema');
- SELECT * FROM numeric_serials_bigger_than_bigint(null);
*/
RETURNS TABLE (
table_schema information_schema.sql_identifier, --regnamespace,
table_name information_schema.sql_identifier, --regclass,
column_name VARCHAR(1000),
column_default VARCHAR(1000),
is_nullable boolean,
data_type VARCHAR(50),
numeric_precision INT,
numeric_scale INT,
n_live_tup NUMERIC(100),
min_value NUMERIC(100),
max_value NUMERIC(100),
downcast_possible boolean
)
LANGUAGE plpgsql
AS $$
DECLARE
min_return NUMERIC(100);
max_return NUMERIC(100);
serial_row RECORD;
BEGIN
CREATE TEMPORARY TABLE serials (
table_schema information_schema.sql_identifier NOT NULL, --regnamespace NOT NULL,
table_name information_schema.sql_identifier NOT NULL, --regclass NOT NULL,
column_name VARCHAR(1000) NOT NULL,
column_default VARCHAR(1000) NOT NULL,
is_nullable boolean NOT NULL,
data_type VARCHAR(50) NOT NULL,
numeric_precision INT NOT NULL,
numeric_scale INT NOT NULL,
n_live_tup NUMERIC(100) NOT NULL,
min_value NUMERIC(100),
max_value NUMERIC(100),
downcast_possible boolean
) ON COMMIT DROP;
INSERT INTO serials (table_schema, table_name, column_name, column_default, is_nullable, data_type,
numeric_precision, numeric_scale, n_live_tup)
SELECT c.table_schema, c.table_name, c.column_name, c.column_default, CAST (c.is_nullable AS boolean), c.data_type,
c.numeric_precision, c.numeric_scale, t.n_live_tup
FROM information_schema.columns c
INNER JOIN pg_stat_user_tables t ON c.table_schema = t.schemaname AND c.table_name = t.relname
WHERE c.data_type = 'numeric' AND c.numeric_precision > 18 AND c.column_default LIKE 'nextval(''%'
AND c.table_schema = COALESCE(CAST(schema_name AS information_schema.sql_identifier), c.table_schema)
ORDER BY c.table_catalog, c.table_schema, c.table_name, c.column_name;
FOR serial_row IN
SELECT s.table_schema, s.table_name, s.column_name,
format('SELECT MIN(%I), MAX(%I) FROM %I.%I',
s.column_name, s.column_name, s.table_schema, s.table_name) AS sql_code
FROM serials s
LOOP
EXECUTE serial_row.sql_code INTO min_return, max_return;
UPDATE serials s SET min_value = min_return, max_value = max_return
WHERE s.table_schema = serial_row.table_schema AND s.table_name = serial_row.table_name AND s.column_name = serial_row.column_name;
END LOOP;
-- The downcast is possible when all the stored values fit in the bigint range
UPDATE serials s
SET downcast_possible = (s.min_value >= -9223372036854775808 AND s.max_value <= 9223372036854775807)
WHERE s.max_value IS NOT NULL;
RETURN QUERY SELECT s.table_schema, s.table_name, s.column_name, s.column_default, s.is_nullable, s.data_type,
s.numeric_precision, s.numeric_scale, s.n_live_tup, s.min_value, s.max_value, s.downcast_possible FROM serials s;
END;
$$;
The function can check one schema or the whole database:
SELECT * FROM numeric_serials_bigger_than_bigint('my_schema');
SELECT * FROM numeric_serials_bigger_than_bigint(null);
Convert a SERIAL to an IDENTITY
Having this table:
CREATE TABLE public.another_table
(
id SERIAL NOT NULL,
information CHARACTER VARYING(30) COLLATE pg_catalog."default" NOT NULL,
price NUMERIC(10,2) NOT NULL,
amount NUMERIC(12,0) NOT NULL,
CONSTRAINT pk_another_table PRIMARY KEY (id)
);
It can be done easily through these steps:
ALTER TABLE public.another_table
ALTER COLUMN id DROP DEFAULT;
DROP SEQUENCE public.another_table_id_seq;
ALTER TABLE public.another_table
ALTER COLUMN id SET DATA TYPE INT;
ALTER TABLE public.another_table
ALTER COLUMN id ADD GENERATED BY DEFAULT AS IDENTITY;
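One caveat: if another_table already contains rows, the new identity sequence starts at 1, so the generated values would collide with the existing ids. A sketch of continuing from the current maximum (the value 42 here is hypothetical):

```sql
SELECT MAX(id) FROM public.another_table; -- suppose it returns 41

ALTER TABLE public.another_table
ALTER COLUMN id RESTART WITH 42;
```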
Peter Eisentraut has also written a function that automates this conversion.
IDENTITYs are NOT shown in pgAdmin 4 version 2.1
But currently, the CREATE TABLE DDL option of pgAdmin 4 version 2.1 doesn't give information on IDENTITYs:
CREATE TABLE public.another_table
(
id bigint NOT NULL,
information character varying(30) COLLATE pg_catalog."default" NOT NULL,
price numeric(10,2) NOT NULL,
amount numeric(12,0) NOT NULL,
CONSTRAINT pk_another_table PRIMARY KEY (id)
)
There's no sign of IDENTITY! But it's there, because if we try to add it again, it fails:
ERROR: column "id" of relation "another_table" is already an identity column
SQL state: 55000
But the information is available by querying the INFORMATION_SCHEMA.COLUMNS view:
SELECT table_schema, table_name, column_name, is_identity, identity_generation,
identity_start, identity_maximum, identity_minimum, identity_cycle
FROM INFORMATION_SCHEMA.COLUMNS WHERE table_schema='public' AND table_name='another_table' AND column_name='id';
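Alternatively, psql's \d metacommand does show the identity clause among the column defaults (the exact layout depends on the psql version):

```sql
\d public.another_table
```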