Postgres
From NYU CCPP Wiki
Contents |
Using the Postgres Database at NYU
This is a quick guide to using the postgres database at NYU. Note, currently one can only access the db from within our network.
For an SQL tutorial see this page: http://www.w3schools.com/sql/ See also Postgres Tricks for a list of useful commands and Postgres Administration for some administration tips.
Environment
SDSS database and user
The user should set the following environment variables for convenience. Note these are automatically set if you source the setup script described in the IDL interface section below.
- PGUSER=sdss
- PGDATABASE=sdss
These are the default user and database on the system. No authentication is required for this user. Then when the user runs the command line tool 'psql' or accesses the database through the IDL, C, python, etc APIs no connection information will be needed.
Authentication
For logins that require passwords, a file ~/.pgpass can be created (unreadable to the outside world) with the following form:
- *:*:*:username:password
which will allow login with no typing of a password. If you have more than one host and database with different accounts you specify this fully: hostname:port:database:username:password
For example, in order to run the COPY command you have to run as the postgres user. You will need a .pgpass file with the following definition:
- *:*:*:postgres:the_correct_pwd
Ask Erin Sheldon for the password. This priveledge is very useful especially from the IDL interface (e.g. struct2table)
Command Line
The command line tool is a simple way to look around at the data.
[esheldon@early ~] psql sdss=>
To see the existing tables
sdss=> \d
List of relations
Schema | Name | Type | Owner
--------+----------------------+-------+----------
public | datasweep | table | postgres
public | datasweep_meta | table | postgres
public | field_rotation | table | sdss
public | lowz | table | postgres
public | lssall0 | table | postgres
public | lssfull0 | table | postgres
public | maxbcg | table | postgres
public | maxbcg_input | table | postgres
public | maxbcg_input_meta | table | postgres
public | maxbcg_neigh | table | postgres
public | recipes | table | sdss
public | vagc | table | postgres
public | vagc_match | table | postgres
public | zphot | table | postgres
public | zphot_princeton_meta | table | postgres
To see what is in each table:
sdss=> \d zphot
Table "public.zphot"
Column | Type | Modifiers
-----------------+----------+-----------
photoid | bigint | not null
run | smallint | not null
rerun | smallint | not null
camcol | smallint | not null
field | smallint | not null
id | smallint | not null
photoz_z | real | not null
photoz_zerr1 | real | not null
photoz_zerr2 | real | not null
photoz_zerr3 | real | not null
photoz_zerr4 | real | not null
photoz_zwarning | smallint | not null
Indexes:
"zphot_pkey" PRIMARY KEY, btree (photoid)
"zphot_rrcfi_index" btree (run, rerun, camcol, field, id)
Note each column definition and the list of indexes.
Look at the first 10 items in the datasweeps:
sdss=> select run,camcol,field,modelflux[2] from datasweep limit 10; run | camcol | field | modelflux -----+--------+-------+----------- 94 | 1 | 11 | 39.1747 94 | 1 | 11 | 248.069 94 | 1 | 11 | 362.275 94 | 1 | 11 | 0.794669 94 | 1 | 11 | 2.42472 94 | 1 | 11 | 0.866375 94 | 1 | 11 | 6.57751 94 | 1 | 11 | 4.65699 94 | 1 | 11 | 0.781404 94 | 1 | 11 | 13.1877
IDL interface
In general, you will need to get SDSSIDL http://sdssidl.sourceforge.net/ set up. At NYU all you need to do is source this file:
- source ~esheldon/idl_libraries/sdssidl_config/sdssidl_setup.sh
or .csh for cshell.
There is an IDL interface to the postgresql database. The basic routine is called pgsql_query() (http://sdssidl.sourceforge.net/src/pgsql_query.c.html)
- IDL> st = pgsql_query('select * from datasweep where run=756 and modelflux[2] > 100')
Will return all data for all objects in run 756 with that flux bound. The data is returned in an IDL structure:
IDL> help, st, /str ** Structure <2168818>, 41 tags, length=704, data length=696, refs=1: PHOTOID LONG64 756137100011000140 RUN LONG 756 RERUN INT 137 CAMCOL INT 1 FIELD INT 11 ID LONG 140 OBJC_TYPE LONG 3 ... etc ...
There are a whole sweet of tools in the postgres class for manipulating databases http://sdssidl.sourceforge.net/postgres/postgres__define.html For a list of all methods do
- IDL> methods,'postgres'
For example, if you have a structure in memory, you can stuff it into a new (or existing) table.
pg=obj_new('postgres')
pg->struct2table, struct, tablename
This will by default be in the sdss database owned by the sdss user if that is how your environment is set (see above). The sdss database is the best place for public tables (although one would often create them as the postgres user to keep them from getting modified). Often we won't want to clog the main table space with user tables, so we can set you up with your own database. Just let Erin Sheldon know. To create the table under a different user and database:
- pg->struct2table, ...., connect_info='dbname=jsmith user=jsmith'
You can also do generic queries with error checking.
struct = pg->query('select * from datasweep where run=756 and modelflux[2] > 100', status=status)
if status ne pg->status_val('success') then message,'Failed to retrieve data'
To see what tables are available:
pg->describe # SCHEMANAME TABLENAME TABLEOWNER TABLESPACE HASINDEXES HASRULES HASTRIGGERS 0 public maxbcg_input postgres 1 0 0 1 public datasweep_meta postgres 0 0 0 2 public datasweep postgres 1 0 0 3 public lssfull0 postgres 1 0 0 .... etc.
And to describe a given table
pg->describe, 'datasweep'
** Structure <a4cd08>, 41 tags, length=704, data length=696, refs=1:
PHOTOID LONG64 94137100011000004
RUN LONG 94
RERUN INT 137
CAMCOL INT 1
FIELD INT 11
ID LONG 4
OBJC_TYPE LONG 3
OBJC_FLAGS LONG 302645268
OBJC_FLAGS2 LONG 24600
COLC FLOAT Array[5]
PETROR50 FLOAT Array[5]
PETROR90 FLOAT Array[5]
R_DEV FLOAT Array[5]
R_EXP FLOAT Array[5]
... snip ....
Indexes for 'datasweep'
index description
------------------------------------------------------------
datasweep_pkey PRIMARY KEY, btree (photoid)
datasweep_mf0 btree ((modelflux[0]))
datasweep_mf1 btree ((modelflux[1]))
datasweep_mf2 btree ((modelflux[2]))
datasweep_mf3 btree ((modelflux[3]))
datasweep_mf4 btree ((modelflux[4]))
datasweep_rrcfi btree (run, rerun, camcol, field, id)
