I’m currently trying to create a function that will create an index on a table in every schema in the data warehouse. This is the script I have so far:
create or replace function dwh.loan_type_id_indexing()
returns void language plpgsql AS
$PROC$
Declare
    myschema varchar;
    sql text;
Begin
    for myschema in
        SELECT nspname
        FROM pg_catalog.pg_namespace
        where nspname not in ('information_schema', 'pg_catalog', 'pg_temp_1',
                              'pg_temp_7', 'pg_toast', 'pg_toast_temp_1',
                              'pg_toast_temp_7', 'public', 'c1', 'dwh',
                              'users', 'c2'
                             )
        order by nspname
    loop
        sql = 'CREATE INDEX ' || myschema || '_' || 'type_id ON ' ||
              myschema || '.' || 'fact_tbl USING btree (loan_type_id)';
        execute sql;
    end loop;
END
$PROC$
volatile;
I know this isn’t correct but it will give you my thought process on what I’m trying to do.
Instead of filtering out schemas and assuming every other one has the table you want, query information_schema for them and loop over the resulting list.

That gives you everything you need to create the index with EXECUTE, by looping over the records returned from the query.

You might want to RAISE NOTICE 'Creating index on %.fact_loan.loan_type_id', table_schema; to allow you to track progress, too, as index builds can take a while.

If you were going to filter schemas, you’d be better off using schemaname NOT LIKE 'pg_%' AND lower(schemaname) <> 'information_schema'.

BTW, I usually find this sort of job more convenient to do from a script outside the database, where I have access to multiple connections, threading/multiprocessing, etc. A quick Python script with the psycopg2 driver for Pg would let you bang something like this together that does the index builds in parallel, say 4 at a time; the right number would depend on your disk configuration.
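
As a rough illustration of that external-script approach, here is a minimal sketch. It assumes the table is called fact_tbl, as in the question's script (adjust it if yours is fact_loan), and that psycopg2 is installed. The helper names (quote_ident, build_index_sql, list_schemas, create_index, build_all) and the connection string in the usage comment are made up for this example. Each worker opens its own connection, because a single connection only runs one statement at a time.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed names, taken from the question's script; change them to match your tables.
TABLE_NAME = "fact_tbl"
COLUMN_NAME = "loan_type_id"

# Ask information_schema which schemas actually contain the table,
# instead of excluding schemas and hoping the rest have it.
FIND_SCHEMAS_SQL = """
    SELECT table_schema
    FROM information_schema.tables
    WHERE table_name = %s
      AND table_type = 'BASE TABLE'
    ORDER BY table_schema
"""


def quote_ident(name):
    """Always double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_index_sql(schema):
    """Return the CREATE INDEX statement for the table in one schema."""
    index_name = quote_ident(f"{schema}_type_id")
    return (
        f"CREATE INDEX {index_name} "
        f"ON {quote_ident(schema)}.{quote_ident(TABLE_NAME)} "
        f"USING btree ({quote_ident(COLUMN_NAME)})"
    )


def list_schemas(connect):
    """Fetch the schemas that contain the table, using a short-lived connection."""
    conn = connect()
    try:
        with conn.cursor() as cur:
            cur.execute(FIND_SCHEMAS_SQL, (TABLE_NAME,))
            return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()


def create_index(connect, schema):
    """Build one index on its own connection and commit it."""
    conn = connect()
    try:
        with conn.cursor() as cur:
            cur.execute(build_index_sql(schema))
        conn.commit()
    finally:
        conn.close()
    return schema


def build_all(connect, workers=4):
    """Build the indexes for all matching schemas, `workers` at a time."""
    schemas = list_schemas(connect)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order and re-raises any worker error.
        for done in pool.map(lambda s: create_index(connect, s), schemas):
            print(f"Created index on {done}.{TABLE_NAME}.{COLUMN_NAME}")
    return schemas


# Usage (hypothetical connection string; requires the third-party psycopg2 driver):
#   import psycopg2
#   build_all(lambda: psycopg2.connect("dbname=dwh"))
```

Passing in `connect` as a callable keeps the driver out of the helpers, so you can swap in another DB-API driver or a stub for a dry run. The default of 4 workers is just the figure mentioned above; tune it to what your disks can handle.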