Option 3: Add a new column of type bigint
Overview of the process
Add a new column to the table with type bigint. Copy over id values to id_new and then rename the new column to id (by renaming the existing id to id_old).
Steps:
- Add a new nullable column id_new.
- Script that runs an update to id_new with values from id in chunks of 1,000,000.
- Once in maintenance mode, drop primary key constraint, foreign keys constraints and sequence on id.
- Rename id and id_new.
- Add back primary key constraint.
- Add back foreign key constraint.
Details of the process
Environment
| Operating System | Disk type | Amount of data | Background time | Maintenance time |
|---|---|---|---|---|
| MAC Big Sur | SSD | 0.5 billion | 3h | 30m |
-
Record min id value and max id value. Example: latest negative max id value is -480383918
select max(id) as latest_negative_max_id from trade_logs where id < 0; --latest negative max id value -
Add new nullable bigint column without a default specified for id column.
-- 1s ALTER TABLE trade_logs ADD COLUMN id_new BIGINT NULL; -
Add new nullable bigint column without a default specified for allocation_log_id column.
-- 1s ALTER TABLE trade_logs ADD COLUMN allocation_log_id_new BIGINT NULL; -
Perform VACUUM command to reclaim dead tuples before coping data from id to id_new
VACUUM (VERBOSE, ANALYZE) trade_logs; -
Copy data from id to id_new in background mode. The reason we chose increments of 1,000,000 is mainly because it performed better in terms of incremental speed, disk utilization and cleanup between transactions(Need to be verified in AWS env again). Record the max id value printed in the log
-- 1s DROP PROCEDURE if exists syn_data(int8); -- 1s create or replace PROCEDURE syn_data(in_latest_negative_max_id int8) as $$ DECLARE counter int8; min_id int8; max_id int8; step int8 := 100000; begin select min(id) into min_id from trade_logs; select max(id) into max_id from trade_logs; RAISE NOTICE 'begin time is: %',CURRENT_TIMESTAMP; RAISE NOTICE 'latest negative max id value is: %',in_latest_negative_max_id; RAISE NOTICE 'min id value is: %',min_id; RAISE NOTICE 'max id value is: %',max_id; counter:=min_id; while counter <= in_latest_negative_max_id loop begin update trade_logs set id_new=id, allocation_log_id_new=allocation_log_id where id < 0 and id >= counter and id < counter + step; commit; end; counter := counter + step; RAISE NOTICE 'current datetime: % counter: %',CURRENT_TIMESTAMP,counter; end loop; counter:=0; while counter <= max_id loop begin update trade_logs set id_new=id, allocation_log_id_new=allocation_log_id where id >= 0 and id >= counter and id < counter + step; commit; end; counter := counter + step; RAISE NOTICE 'current datetime: % counter: %',CURRENT_TIMESTAMP,counter; end loop; RAISE NOTICE 'end time is: %',CURRENT_TIMESTAMP; END $$ language plpgsql; -- 3h 3m 8s 234ms call syn_data(-480383918); -
Once the background migration completed, we can start maintenance mode.
-
Make sure there are no queries hitting the DB.
SELECT * FROM pg_stat_activity WHERE datname = '<db>' AND state = 'active' LIMIT 10; -
Run updates that id value less than latest min id value.
-- 1s UPDATE trade_logs SET id_new = id, allocation_log_id_new = allocation_log_id WHERE id >= -2047483648 and id<0; -
Checking for not null values on the column id_new before dropping constraints on id, in order to avoid some data may not be copied successfully
-- 1s DROP PROCEDURE if exists check_null_and_fixed(); create or replace PROCEDURE check_null_and_fixed() as $$ DECLARE i RECORD; begin FOR i IN select * from trade_logs tl where tl.id_new is null loop begin RAISE NOTICE '% id is for null id_new', i.id; update trade_logs set id_new=i.id, allocation_log_id_new=i.allocation_log_id where id = i.id; commit; end; end loop; END $$ language plpgsql; -- 4 m 59 s 118 ms call check_null_and_fixed(); -
Select and remove Foreign Keys from the id column
-- 1s select tc.constraint_name, tc.table_name, kcu.column_name, ccu.table_name as foreign_table_name, ccu.column_name as foreign_column_name from information_schema.table_constraints as tc join information_schema.key_column_usage as kcu on tc.constraint_name = kcu.constraint_name join information_schema.constraint_column_usage as ccu on ccu.constraint_name = tc.constraint_name where constraint_type = 'FOREIGN KEY' and ccu.table_name = 'trade_logs'; -- 1s alter table trade_logs drop constraint <constraint_name>; -
Remove Primary Key constraint on id.
-- 1s ALTER TABLE trade_logs DROP CONSTRAINT trade_logs_pkey; -
Rename column id to id_old.
-- 1s ALTER TABLE trade_logs RENAME COLUMN id TO id_old; -
Rename column id_new to id.
-- 1s ALTER TABLE trade_logs RENAME COLUMN id_new TO id; -
Rename column allocation_log_id to allocation_log_id_old.
-- 1s ALTER TABLE trade_logs RENAME COLUMN allocation_log_id TO allocation_log_id_old; -
Rename column allocation_line_new to allocation_log_id.
-- 1s ALTER TABLE trade_logs RENAME COLUMN allocation_log_id_new TO allocation_log_id; -
Add Primary Key constraint to id .
-- 10m 15s 861ms ALTER TABLE trade_logs ADD PRIMARY KEY (id); -
Add Foreign Key constraints back to id .
-- 7m 42s 986ms ALTER TABLE trade_logs ADD CONSTRAINT <constraint_name> FOREIGN KEY (allocation_log_id) REFERENCES trade_logs (id); -
Add back the sequence to id.
-- 1s ALTER TABLE trade_logs ALTER COLUMN id_old drop default; -- 1s ALTER TABLE trade_logs ALTER COLUMN id SET DEFAULT NEXTVAL('trade_logs_id_seq'::regclass); -- 1s alter sequence trade_logs_id_seq owned by trade_logs.id; -
Drop not null constraint on id_old
-- 1s ALTER TABLE trade_logs ALTER COLUMN id_old DROP NOT NULL; -
Validate the data looks good by running automated tests with fixed input and expected output.
-
Drop column id_old
-- 1s ALTER TABLE trade_logs DROP COLUMN id_old; -
Drop column allocation_log_id_old
-- 1s ALTER TABLE trade_logs DROP COLUMN allocation_log_id_old; -
Drop procedures that are no longer used
-- 1s DROP PROCEDURE if exists syn_data(int8, int8); -- 1s DROP procedure if exists check_null_and_fixed();
Note
- Back up the schema for the table before performing the changes.