SDSS tutorial
From NYU CCPP Wiki
In order for the things on this page to work, you need to have "idlutils" and "photoop" installed with the setup command.
Where are the SDSS data?
One common data set to use is the SDSS data. We will describe to you the basics of the SDSS in person. Basically, it is a large astronomical survey in the optical which has obtained images of 8000 sq deg of sky and spectroscopy and redshifts for 500,000 galaxies (or so). It is a vast and complex data set. Much of what we do here is based on it.
One thing that would be useful to discuss in this section is what an image is, what a spectrum is, and what redshift is. If you want to write a couple of paragraphs describing these things, please do so.
Our SDSS data here is in the directories of:
/global/data/sdss
There are a lot of different sorts of directories here. You don't need to look at all of them. The basics are:
- /global/data/sdss/spectro -- the spectroscopic data
- /global/data/sdss/redux -- reduced catalog and image files
- /global/data/sdss/lss -- [#lss "large-scale structure samples"]
- /global/data/vagc-dr4/vagc0 -- [vagc value-added galaxy catalog]
How do I read the data?
It is good to know where these data are, but for the most part we have IDL code that automatically reads them in, in the "idlutils", "idlspec2d" and "photoop" products. If you look in your ".bash_profile" file that you copied from dataman, you will see you have set these up already.
For example, if you type in IDL:
objs= sdss_readobj(5079, 3, 30, rerun=137)
It returns (in "objs") detected objects in a particular run (5079), camcol (3), and field (30), in a particular set of reductions (137). Look at the SDSS documentation for a picture of what runs, camcols, and fields are. In that image, there are two interleaved "runs" (one shown by six purple rectangles, one shown by six horizontally-striped rectangles). Each rectangle is a "camcol", numbered 1 through 6. Each "camcol" is chopped into fields --- there are three example fields shown. So a run/camcol/field combination designates a particular patch of the sky.
In this example, "objs" is an array of "structures", which you should have read about in the IDL documentation. That is, "objs[5]" refers to the sixth structure in a list of structures. To see what elements this structure has, type:
help, /st, objs[5]
You will see a bunch of tags and values. For example, there will be a tag "RA" with a value associated with it. As time goes on, you will learn what these values mean. To see all the individual values of one of the tags, type:
print, objs.ra
or to see just one type:
print, objs[7].ra
Simple ones to know are "RA" and "DEC", or right ascension and declination. These are designations of the location of an object on the sky, and are analogous to longitude and latitude on the Earth. In fact, RA and DEC are more-or-less the longitude and latitude for which the object would be directly overhead at noon on March 21, 2000. To see the locations of the objects for the field 5079/3/30, type:
splot, objs.ra, objs.dec, psym=4
(This will only work if you have X windows. There are about 400,000 fields like this on the sky!
How do I find all objects in a patch?
Sometimes you are interested in a particular patch on the sky and just want to get all of the objects in that direction. A good tool for that is "sdss_circle", which takes an RA, DEC, and RADIUS. For example:
objs= sdss_circle(170.,55., 0.1, rerun=137)
This returns a structure containing all the objects within 0.1 deg of RA=170 and DEC=55. To verify that this is where the returned objects are, execute:
splot, objs.ra, objs.dec, psym=4
You can ask sdss_circle to only return stars or only galaxies, by passing the "objtype" keyword, as in:
gals= sdss_circle(170.,55., 0.1, rerun=137, objtype='galaxy') soplot,gals.ra,gals.dec,psym=4, color='red'
This will overplot the locations of just the galaxies. Set objtype to 'star' to get only the stars.
In addition you can ask it to make other cuts as well. The most generally useful one is to set the keyword "limits", which is a five-element array with the flux limit to apply in each band. To be kept, the object must pass the flux limits in ALL bands. Thus, we suggest you make only one of them positive and set the others to, for example, -1000.
Looking at images
To look at the reduced image of this field, type:
atvsdss, 5079, 3, 30, rerun=137, /fpbin
(This will only work if you have X windows. This will bring up an image of the field in ATV (because of oddities in atvsdss, you may have to press the "Autoscale" button to get a decent stretch). Note that some parts of the image are at full resolution (0.396 arcsec pixels) while others are "blocky". The blocky parts are empty regions of the sky, kept at lower resolution in the reduced data. The full resolution parts are where objects actually are.
There are actually five images for each patch of sky. These are images using the u, g, r, i, and z filters, which are five different filters in the SDSS system. Look at the SDSS documentation for a description of these; but basically from u to z is blue (3500 Angstrom light) to red (10000 Angstrom light). Thus, there are actually five images for each patch of the sky, one with each filter in front of it. atvsdss shows by default the r band image. Look at the documentation to see how to show the other filters, and look at the differences between the different ones. Then show the r-band filter again before continuing below.
Each detected "object" has an "id" number which is returned in the "objs" structure above. To see this, do the following: Set once again
objs= sdss_readobj(5079, 3, 30, rerun=137)
then,
atvplot, objs.objc_colc-0.5, objs.objc_rowc-0.5, psym=3
The objc_rowc and objc_colc tags yield the "x" and "y" positions of each object in the image (offset by a half pixel), so this should put a dot at the center of each object. To see a specific object do something like:
atvplot, [objs[119].objc_colc]-0.5, [objs[119].objc_rowc]-0.5, psym=4
which puts a diamond (which is what "psym=4" indicates) at the position of the 119th object in the structure.
Maggies and magnitudes of point sources in SDSS images
Each object has a certain flux associated with it, which we normally give in units called "nanomaggies". A "maggy" is the ratio of the flux density of the object to a standard flux density. These flux densities are the energy arriving from the object per unit time, per unit area, and per unit wavelength. That is, if we exposed for a longer time, or used a telescope with a bigger collecting area, or had a slightly broader or narrower filter in front of the camera, we would recover the same "flux density" once we calibrated the raw results coming from the telescope. A more common unit in astronomy is the "magnitude" m which is related to maggies by:
An advantage of the maggie unit system is that it is linear, and thus can when necessary accommodate negative flux estimates.
But you might want me to be more specific about what units these fluxes really are in. For the SDSS, the magnitudes are "AB-relative". The AB system is designed such that a flat spectrum object with fν = 3631 Jy = 3.631 × 10−20 ergs cm−2 s−1 Hz−1 should have every magnitude equal to zero (and all maggies equal to one). The other common system in astronomy is "Vega-relative"; magnitudes in the Vega-relative system are simpler to measure experimentally but more complex to interpret.
For point sources (like the object above) the appropriate measure of the flux is given in the catalogs as "psfflux". Type:
print, objs[119].psfflux
This should print an array of five numbers. These are the fluxes in the u, g, r, i, and z filters, the five different filters in the SDSS system. Since the r-band filter is the one we are looking at, type
print, objs[119].psfflux[2]
to show the r band flux alone. To convert its flux to a magnitude type:
rmag= 22.5 - 2.5*alog10(objs[119].psfflux[2]) print, rmag
Now test that this flux makes sense by centering the cursor on this object (the one with the diamond on it) and typing "p". This makes a very simple measurement of the flux in the image. Basically it adds the flux in a circle, then estimates the "background" to subtract by looking at the flux in an annulus outside that circle. This procedure should yield a flux in the window that pops up about the same as the flux in the catalog.(You'll have to play around with 'Photometery Settings' in the atv aperture photometry box to go between Counts and Magnitude, don't forget to ajust the Magnitude zero point)
The object above is a point source, meaning that its entire size is accounted for by the blurring of the image by the atmosphere. Stars in our own galaxy (and some other objects!) are point sources. To see why, calculate the angular size that the Sun would be if it were 3 parsecs away (the distance to the closest stars). A parsec is a "parallax arcsecond", meaning that it is equal to the apparent shift of a star relative to a very distant source during a six month period because of the motion of the Earth around the Sun. You should be able to calculate the angular size of something 3 parsec away only using the definition of a parsec and the fact that the Sun is about half a degree in size on the sky. The blurring due to the atmosphere is about 1 arcsec (and the pixels on the image are about 0.396 arcsec), so you will see why we can't resolve the sizes of stars!
Maggies and magnitudes of galaxies in SDSS images
Although galaxies are much more distant they are are also much, much bigger. Type:
atvplot, [objs[85].objc_colc-0.5], [objs[85].objc_rowc-0.5], psym=4
to plot the center of a galaxy in the field we are looking at. This galaxy is XXX Mpc (megaparsecs) away. Estimate the physical size of this galaxy by guesstimating its angular size by eye. So galaxies can be quite large (the Milky Way is about 20 kpc in diameter).
The fluxes of galaxies can be measured in a bunch of different ways. Two are "Model" fluxes, which fit a model to the galaxy profile and take the flux as the total flux in the model, and "Petrosian" fluxes, which do a measurement of the flux within an aperture, but with a clever definition of the aperture which actually depends on the galaxy size (see here for details). For example, type:
print, objs[85].modelflux print, objs[85].petroflux
to see the Model and Petrosian fluxes for this galaxy. Compare these to the simple measurement you get by pressing "p" in ATV.
Atlas images in the SDSS
Now, when the SDSS software determines these fluxes, it first cuts out a section of sky called an atlas image and assigns an amount of flux from each pixel assigned to that object. For example, go back to the star you looked at above. Type:
atv, sdss_atlas_image(objs[119])
This will bring up only that part of the image which has flux assigned to that object. (You might notice the units of the image have changed, which is because sdss_atlas_image() returns uncalibrated quantities, which are not in nanomaggies, but in something called "counts"). All the objects in the image have an atlas image associated with them.
Making mosaicked images from SDSS data
Sometimes we'll want to make an image of a certain patch of sky. You can generate an image in this way in a very simple way:
smosaic_make, 178., 3., 0.1, 0.1, rerun=137, /fpbin
This will create images in all five bands, centered at the location RA=178 and DEC=3, 0.1 by 0.1 degrees on the sky. These images are in a format called FITS, which can be used to store images or tabular information (like IDL structures) in a standard way. You can read them with the mrdfits command:
im=mrdfits('smosaic-g.fits.gz',0,hdr)
atv, im, header=hdr
You can use the "mwrfits" command to write FITS files.
Often you will want just a color picture of the patch of sky. You can generate one with:
smosaic_make_jpg, 178., 3., 0.1, 0.1, rerun=137, /fpbin
This will create a JPEG format image that you can display from the Unix command line with the command:
xloadimage smosaic_178.000+3.00000_0.100000x0.100000_irg.jpg
Sometimes you will want to very quickly search an area of the sky. There is an SDSS image server which will do so for that area of the sky which is actually public. Play around with this tool, it can be quite useful.
Accessing SDSS spectra
For bright galaxies (and other objects) the SDSS has obtained spectra. To take the spectrum, a three arcsecond fiber is centered on each galaxy. These spectra are taken 640 at a time in a field 1.49 deg in radius. Each such field is referred to as a "plate". For example, to get information on all the fibers observed on a given plate do the following:
readspec, 560, zans=zans, tsobj=tsobj
This returns a structure in "zans" which has all the spectroscopic information for each fiber (including the best-fit redshift) and a structure in "tsobj" which is basically the photometric information. For example, you can look at the locations of all the spectra on the sky:
splot, zans.plug_ra,zans.plug_dec,psym=4
Or plot the distribution of redshifts:
plothist, zans.z, bin=0.1
Look at the idlspec2d documentation for a complete description of the information in these structures.
Note that you can look at the atlas image of a particular galaxy as follows:
atv,sdss_atlas_image(tsobj[10])
This corresponds to "fiberid" 11:
print, zans[10].fiberid
You can look at the spectrum (in the rest frame) as follows:
plotspec, 560, 11, /restframe
The white line is the actual spectrum. The blue line is a fit to it. The red line is the estimate of the 1 sigma noise in the spectrum.
There are some features of this spectrum you should be aware of. For example, there is a break at 4000 Angstroms. This is a strong feature in old and/or metal-rich stellar populations. In fact, most of the flux in the spectrum is due to the light of the stars in the galaxy. You will learn that the models for galaxy spectra are quite good at reproducing these spectra. There are also lots of emission lines in the spectra. The most prominent one is the H-alpha line at 6563 Angstroms. These emission lines come from ionized gas between the stars, which is being kept ionized from the light of very young stars. You can look at their identifications by typing:
plotspec, 560, 11, /restframe, /zline
These emission lines, and the underlying continuum, contain a lot of information about the history of star formation in the galaxy, and a lot of our research focuses on how to use this information.
You can use readspec to get the data corresponding to the spectrum. As a project, read the readspec documentation, and write a simple version of "plotspec" which plots the spectrum (in the observed frame is fine) of a galaxy given its plate and fiberid.
There is one important note to remember about using the spectra. In addition to the fiberid, it is also sometimes necessary to specify the date of observation, in terms of the MJD. This is because the plate can be reobserved, and the fiberid's change. So if you are trying to look at a spectrum, and you have its MJD (as you usually will), use it to be certain you are looking at the right one. Otherwise "readspec" just uses what it thinks is the best MJD, if there are multiple ones.
How do you find spectra in a given area of sky? Use the command "findspec" as follows:
findspec, 180., 30., searchrad=2./3600., slist=slist, /best
which will find the best spectrum within 2 arcsec of RA=180, Dec=30. The plate, fiberid, and MJD get returned in the structure "slist." If you have an array of ra, dec pairs that you want to match simultaneously, us:
findspec, ra, dec, searchrad=2./3600., slist=slist, /best
Another way of using findspec is to find ALL the spectra surrounding some point. This is good, for example, if you want all the spectra in a big area of sky:
findspec, 180., 30., searchrad=1., slist=slist
In this case, slist returns all of the spectra in a 1 deg radius around RA=180, Dec=30.
