Simulations of SDSS photometry
From NYU CCPP Wiki
Introduction
The Sloan Digital Sky Survey (SDSS) has produced the most uniform wide-area, multi-wavelength imaging data set in history. A huge number of observational results in extragalactic astrophysics, Galactic astrophysics, stellar astrophysics, and solar system astrophysics have been based on the reduced catalogs and images produced by this survey.
Of course, no data reduction procedure is completely free of errors and biases. When interpreting weak signals, trying to find rare objects, working at the signal-to-noise limit of the data, and/or trying to do precise analyses, it is important to quantify the behavior of the data reduction procedure in the appropriate situation, in order to understand the limits of the data and to correct for possible biases.
At NYU, we have created a simulated data pipeline for the SDSS, which starts with the raw data, adds in fake images of an arbitrary nature, runs the photometric pipelines, and produces a set of results identical in form to the normal SDSS results, but with the fake images included in the processing.
This capability has been crucial to understanding our measurements of the structures of galaxies (Blanton et al. 2005a), the nature and number density of low luminosity galaxies (Blanton et al. 2005b), the growth by mergers of luminous red galaxies (Masjedi et al. 2006), and the interpretation of weak lensing results around galaxies (Mandelbaum et al. 2006). See those papers for what these simulations tell us about the behavior of photo, the SDSS photometric pipeline.
This page describes the format of some of these simulations and how to use them.
Software
The software is the "fakeobs" product in the CVS repository at
howdy.physics.nyu.edu:/usr/local/cvsroot
You need to set the environment variable FAKEOBS_DIR to the location of your copy of this product, and also to "evilmake" the software in the subdirectory:
fakeobs/src/galaxy
The basic piece of software used to make fake runs is "create_fake_run.pro". All of the software depends on having the idlutils and photoop products also installed.
If you're not on an NYU machine where you have followed the instructions at Dotfiles, Schlegel's IDL codebase has instructions for installing the required software, except that you need to request the 'cvs' versions instead of the v5_2_0 etc. tags given. (Even if you want tagged versions, you probably want the most up-to-date version tags from the top of that page rather than the older ones in the actual instructions.) After following the installation instructions there, you need to execute the following commands any time you want to run fakeobs:
setup photoop svn
setup idlspec2d cvs
setup idlutils cvs
You also want to add
+/usr/local/fakeobs:
(or wherever you put fakeobs) to your IDL_PATH environment variable. At this point, you can start idl and
.compile create_fake_run.pro
(There may be more steps to do before you can run it...)
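Before starting IDL, it can help to confirm that the two settings described above are in place. The following is a minimal Python sketch, not part of fakeobs: the function name `check_fakeobs_env` is hypothetical, and looking for the string "fakeobs" in IDL_PATH is only a heuristic that assumes your copy lives in a directory with that name.

```python
import os


def check_fakeobs_env(env):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    if not env.get("FAKEOBS_DIR", ""):
        problems.append("FAKEOBS_DIR is not set")
    # IDL_PATH entries are colon-separated; a leading "+" asks IDL to
    # search that directory recursively, so strip it before checking.
    entries = [e.lstrip("+") for e in env.get("IDL_PATH", "").split(":") if e]
    # ASSUMPTION: the fakeobs checkout sits in a path containing "fakeobs".
    if not any("fakeobs" in e for e in entries):
        problems.append("no fakeobs entry found in IDL_PATH")
    return problems


for problem in check_fakeobs_env(dict(os.environ)):
    print("warning:", problem)
```

An empty list means both settings were found; it does not check that idlutils and photoop are set up.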
General method
In order to produce our simulations, we use a package we developed called "fakeobs", which takes as an input the raw data plus calibrations for a given SDSS run, and a set of fake "stamp" images and RA and Dec locations at which to insert them. The stamps are given in calibrated nanomaggies per pixel, where each pixel is assumed to be SDSS-sized.
First, the code copies over the reductions for a particular rerun, say "137", to the fake rerun, say "9137". This puts the PSP results, the astrom results, and the calibration results in the right place. We don't want to simulate those; we only want to simulate the behavior of "frames", the code that finds and measures objects.
Next, for each field, the code finds the stamps that overlap it. For each of the five SDSS bands, it:
- convolves each stamp with the estimated PSF at its position;
- creates an empty image the same size as the raw data image;
- inserts the stamps in the appropriate places based on the astrometry;
- converts nanomaggies to counts, applying gain and flat-fields in reverse;
- writes the result to a fake idR file.
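The insertion and unit-conversion steps in the list above can be sketched as follows. This is a minimal Python illustration, not the fakeobs IDL code: it skips the PSF convolution, places the stamp at integer pixel offsets, and stands in for the real calibration with a single hypothetical factor `nmgy_per_count` (with the gain folded in) and a hypothetical `flat` array, assuming the convention calibrated = raw * flat.

```python
def insert_stamp(image, stamp, x0, y0):
    """Add stamp into image so that stamp[0][0] lands at image[y0][x0].

    Both are 2D lists; stamp pixels that fall outside the image are dropped.
    """
    ny, nx = len(image), len(image[0])
    for j, row in enumerate(stamp):
        for i, value in enumerate(row):
            y, x = y0 + j, x0 + i
            if 0 <= y < ny and 0 <= x < nx:
                image[y][x] += value
    return image


def to_raw_counts(image_nmgy, nmgy_per_count, flat):
    """Undo the calibration: nanomaggies -> counts, then divide by the flat.

    ASSUMPTION: calibrated = raw * flat, and one scale factor per image.
    """
    return [
        [pix / nmgy_per_count / f for pix, f in zip(img_row, flat_row)]
        for img_row, flat_row in zip(image_nmgy, flat)
    ]


# Empty 4-row by 5-column image (real raw frames are much larger).
image = [[0.0] * 5 for _ in range(4)]
stamp = [[1.0, 2.0], [3.0, 4.0]]        # nanomaggies per pixel
insert_stamp(image, stamp, x0=4, y0=1)  # right-hand stamp column falls off the edge
flat = [[1.0] * 5 for _ in range(4)]    # trivial flat-field for the example
raw = to_raw_counts(image, nmgy_per_count=0.005, flat=flat)
```

Building the fake image in calibrated units and converting once at the end keeps the stamp handling independent of the per-field calibration.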
After creating all of the fake idR files, it runs photo on the resulting set of files. It also runs "resolve_run", which produces a set of files with the information about how to resolve different fields in the same run.
The code has two modes:
- Putting the fake stamps in aligned with the x-y coordinate system of each image. This method means that when a single stamp falls into two different runs, a mosaic of those runs might end up looking weird, since the position angle with respect to North may be different in the two. However, this method retains the fidelity of the input image the best, since it involves minimal interpolation.
- Putting the fake stamps in aligned "north-up". This method makes mosaics of two different runs look correct. However, because this involves rotating the image, this can introduce some uncertainty in the interpretation of the very detailed measurements.
The fake idR files are put in a directory:
$PHOTO_REDUX/9137/[run]/fake_fields
with subdirectories for each camcol, just like the ordinary "fields" directory. The files in each directory come in pairs like the following:
idR-004508-g1-0114.fit.Z - the "fake" raw data file
idR-004508-g1-0114.par - the information about the stamps used
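The naming pattern of these file pairs is easy to handle programmatically. The Python sketch below infers the pattern (six-digit run, band letter plus camcol, four-digit field) from the example names above; the function names are hypothetical and not part of fakeobs.

```python
import re

# Pattern inferred from names like idR-004508-g1-0114.fit.Z
IDR_PATTERN = re.compile(
    r"^idR-(?P<run>\d{6})-(?P<band>[ugriz])(?P<camcol>[1-6])-(?P<field>\d{4})"
)


def idr_basename(run, band, camcol, field):
    """Build the common base name shared by the .fit.Z and .par files."""
    return f"idR-{run:06d}-{band}{camcol}-{field:04d}"


def parse_idr_name(name):
    """Split an idR file name into run, band, camcol, and field."""
    match = IDR_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an idR file name: {name!r}")
    return {
        "run": int(match.group("run")),
        "band": match.group("band"),
        "camcol": int(match.group("camcol")),
        "field": int(match.group("field")),
    }


base = idr_basename(4508, "g", 1, 114)
pair = (base + ".fit.Z", base + ".par")
info = parse_idr_name(pair[0])
```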
The .par file is in the Yanny format; the idlutils routine "yanny_readone()" will read it into an IDL structure, for example. However, the format is fairly straightforward:
# Created by create_fake_run on date Thu Apr 12 09:30:19 2007
idrfile /global/data/sdss/imaging/4508/fake_fields/1/idR-004508-g1-0114.fit
typedef struct {
double scale;
double sersicflux;
double sersicr50;
double sersicn;
double axisratio;
double orient;
char file[57];
double xpos;
double ypos;
} FAKEIDR;
FAKEIDR 1.0000000 1112.0699 11.210600 4.1428800 0.77185702 106.92700 /global/data/fakeobs/sersic-full1/gFakeStamp-000762.fits 700.48065 2440.7490
Note that the Sersic parameters given here are only appropriate for some of the fake data (e.g. the fake Main sample data). Because the data are input as arbitrary stamps, the stamps don't actually have to be Sersic profiles of any sort. However, the first simulations we did were Sersic profiles, so we included these parameters in this file.
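For readers working outside IDL, the .par file shown above is simple enough to read with a few lines of Python. The sketch below is a deliberately minimal reader that handles only this layout (one typedef struct with scalar fields and one `char` array, and whitespace-separated values without quoting). It is not a general Yanny parser, and `read_fakeidr` is a hypothetical name.

```python
import re

EXAMPLE = """\
# Created by create_fake_run on date Thu Apr 12 09:30:19 2007
idrfile /global/data/sdss/imaging/4508/fake_fields/1/idR-004508-g1-0114.fit
typedef struct {
 double scale;
 double sersicflux;
 double sersicr50;
 double sersicn;
 double axisratio;
 double orient;
 char file[57];
 double xpos;
 double ypos;
} FAKEIDR;
FAKEIDR 1.0000000 1112.0699 11.210600 4.1428800 0.77185702 106.92700 /global/data/fakeobs/sersic-full1/gFakeStamp-000762.fits 700.48065 2440.7490
"""


def read_fakeidr(text):
    """Return (header keywords, list of FAKEIDR rows as dicts)."""
    header, fields, rows = {}, [], []
    in_struct = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("typedef struct"):
            in_struct = True
        elif in_struct and line.startswith("}"):
            in_struct = False
        elif in_struct:
            # e.g. "double xpos;" or "char file[57];"
            ctype, name = line.rstrip(";").split()
            name = re.sub(r"\[\d+\]$", "", name)
            fields.append((name, str if ctype == "char" else float))
        elif line.startswith("FAKEIDR "):
            values = line.split()[1:]
            rows.append({n: conv(v) for (n, conv), v in zip(fields, values)})
        else:
            # keyword/value pair such as "idrfile <path>"
            key, _, value = line.partition(" ")
            header[key] = value.strip()
    return header, rows


header, rows = read_fakeidr(EXAMPLE)
```

Each row comes back as a dictionary keyed by the struct field names, so `rows[0]["xpos"]` and `rows[0]["ypos"]` give the stamp position recorded for this field.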
SDSS Main Sample simulations
We have created a set of simulations of SDSS Main sample galaxies, taken from the "full1" subsample of the NYU-VAGC LSS samples from DR4. These are galaxies between about 15th and 18th magnitude in the r-band, with measured redshifts between 0.001 and 0.4 and r-band absolute magnitudes between -23 and -17. The data pertaining to this set of simulations can be found at:
http://sdss.physics.nyu.edu/sdss-sim/sdss-main-full1
There are a number of files in here. Some refer to the full sample of possible galaxies, which have been gathered from the particular subsample. There are 10,000 possible galaxies to choose from, and their properties are stored in the following files:
fake-dr4full1-cal.fits
fake-dr4full1-im.fits
fake-dr4full1-kc.fits
fake-dr4full1-postcat.fits
fake-dr4full1-sersic.fits
For each of these galaxies, we produce a "fake stamp" in each band, based on the Sersic fits stored in the last file. These stamps are stored in FITS files in the subdirectory:
sersic-full1
The headers of the FITS files contain some information about their content. Otherwise, the number in the name refers to the position in the above files.
Of these galaxies, 10,000 are chosen (with replacement, so some are repeated and some aren't used) and are thrown into the SDSS raw data for parts of runs 4508 and 4518. The RA and Dec positions, and which stamp file was used for the r-band (the corresponding ones are used for the other bands), are stored in the file:
fake_full1_sample_stamps.fits
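Because the 10,000 stamps are drawn with replacement from 10,000 candidates, a sizable fraction of the candidates is never used: the expected unused fraction is (1 - 1/N)^N, which is close to 1/e, or roughly 37%. The Python sketch below illustrates this with the standard library. The seed and variable names are arbitrary, and this is not the code used to build the actual sample.

```python
import random
from collections import Counter

N_CANDIDATES = 10_000   # possible galaxies
N_DRAWS = 10_000        # stamps thrown into the raw data

rng = random.Random(0)  # fixed seed so the example is repeatable
picks = rng.choices(range(N_CANDIDATES), k=N_DRAWS)  # sampling with replacement

counts = Counter(picks)
n_unused = N_CANDIDATES - len(counts)
n_repeated = sum(1 for c in counts.values() if c > 1)

# Analytic expectation for the number of never-chosen candidates.
expected_unused = N_CANDIDATES * (1 - 1 / N_CANDIDATES) ** N_DRAWS

print("unused:", n_unused, "repeated:", n_repeated)
print("expected unused:", round(expected_unused))
```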
The SDSS reductions are run on these data. The results are stored in the usual form in the "9137" directory (the "rerun" used is 9137).
SDSS ICL simulations
For studying intracluster light, we are working on a set of simulations involving adding clusters into the data. These are currently available only locally. The stamps and the list of objects we are using are at:
/global/data/fakeobs/icl
The file "fake_icl_sample_stamps.fits" has the RAs and Decs of the objects we have thrown down. In this case, we have used runs 4569 and 4576, and the results are in rerun 9137 of those runs.
We have run the "masked_cluster" code on these fake objects, and the results are in the directory:
/global/data/fake_masked_clusters
in exactly the same format as the "masked_cluster" directory.
