IDL tutorial
From NYU CCPP Wiki
Introduction
In order for the things on this page to work, you need to have the package "idlutils" installed. Install it with the command setup.
For most of our scientific computing purposes, we use the language IDL. It can be used either interactively or in a batch mode. There are many ways in which it sucks. Don't worry, we understand this even better than you do. But there is also a huge amount of legacy software written in it, so we are stuck with it in the short term. It is still the best existing alternative for astronomy.
This section will be updated as time goes on based on what people actually ask about or need to do. Eventually, you will have the ability to update this page, so write down things that are unclear here or should be added, and insert them someday. Also, we may want to add more advanced descriptions of stuff on this page as well.
Making it work
First, make sure it works. Just type "idl" at a Unix prompt. It should come up with a new command line like:
IDL>
which you can type at. Type "exit" to get out of it.
Basic plotting
The first thing you want to do is to learn the basics of the syntax. Ask Blanton for a binder with some basic documentation and start up tutorials for IDL and how to structure programs in it. You should first read those.
Next, you should be able to look at data interactively using the "splot" command. Run the following commands:
xx= randomn(seed, 1000)
yy= randomn(seed, 1000)
splot, xx, yy, psym=3, color='yellow'
A plot should pop up with yellow points on a black background. (This won't work if you don't have X windows.) You can change the color with the "color" keyword, and the point type with the "psym" keyword.
If these commands don't work for you, something is wrong with your setup. Review that you have copied the dot files described above correctly. If you think you have, and this command still doesn't work, email me.
You should be able to navigate around the plot. The middle button recenters in the X-direction; the left and right buttons zoom and unzoom in the X-direction. If you press shift while pressing the button, the action applies in the Y-direction as well. Being able to look at data with this tool will be very useful to you.
Documentation
At this point, you may wonder what the "randomn" function does. It is an IDL built-in function, so look up its documentation using the command (while at an IDL prompt):
?randomn
This will pop up the IDL documentation PDF file.
You might also want to see documentation for "splot". "splot" is a procedure written in the IDL language, so you can look up its documentation in a different way:
doc_library, 'splot'
You'll note that the first line of the documentation it spews out is actually the location of the code itself, so you can look up the source IDL code for this procedure (which occasionally you may need to do for other procedures or functions).
Another useful procedure is the "which" procedure, which tells you just where the source code is. Typing:
which, 'splot'
should return:
% Compiled module: WHICH.
% Compiled module: STRSPLIT.
Currently-Compiled Module SPLOT in File: /global/data/products/Linux/idlutils/cvs/pro/plot/splot.pro
Often what you will be doing is looking at images. To look at images, you will often use the routine "atv". For example:
xx= findgen(200)#replicate(1., 200)
yy= transpose(xx)
rr2=(xx-100.)^2+(yy-100.)^2
image= exp(-rr2/30.^2)
atv, image
You can rescale the greyscale by holding down the left button and moving the mouse. Sometimes after doing so you will want to press the "Restretch" button. You'll get used to when it is appropriate to do so (it never hurts to). Recenter with the middle button, and there are "ZoomIn" and "ZoomOut" buttons.
At this point, you should try to understand the above lines a bit more. Use the IDL documentation to find out what "#", "^", "replicate()", "transpose()", and "findgen()" do. "exp()" should be obvious to you. Finally, play with the "help" command, as in:
help, image
You'll find that "image" is a two-dimensional array which is 200 by 200.
Postscript output files
One very common thing you will need to do is to make postscript files, which are files that you can print out on a printer, analogous to PDF files. With the way your account is set up, you will be able to type:
k_print, filename='blah.ps'
which will "start" a postscript file called blah.ps. Then you can type plotting commands, like:
djs_plot, [1., 2., 4.], [10., 40, 30], thick=5, xrange=[0.1, 4.9], yrange=[-5., 50], $
  color='red'
and then close the file with:
k_end_print
If you exit IDL and get back to the Unix prompt, you will then be able to type:
gv blah.ps &
(Find out what the "&" does from a Unix book if you don't know). This will bring up the "blah.ps" for you to look at.
The "djs_plot" command above is a wrapper around the IDL command "plot" that improves on it. Take a look at the online documentation for "plot" to find out about the underlying command. Note that interactive commands like "splot" or "atv" won't work while IDL is writing to a postscript file (that is, after you type the "k_print" command and before you type "k_end_print").
Now, "k_print" and "k_end_print" are not standard IDL. They are just wrappers we've written to set things up the way we like. You can use the "which" command to find out where the code is and check out what they do. If "k_print" doesn't work for you, make sure there is a line at the end of your .bash_profile file that says:
setup kcorrect
If it isn't there, add it.
The "where" Statement
One of the most useful tools in IDL is the "where" statement. This allows you to search an array (or an aligned set of arrays) for entries that satisfy a certain set of conditions. For example:
a=lindgen(20)-2L
print, a
indx=where(a gt 1 AND a le 3, count)
print, indx
print, a[indx]
print, count
As you can see, "indx" holds the zero-indexed position of the entries that satisfy your conditions. "a[indx]" will then give you the values in the array "a" at those positions. "count" holds the number of values. Now consider:
indx=where(a gt 100, count)
print, indx
print, count
Because no entries satisfy these conditions, "indx" is just -1 and "count" is 0. Indexing with "a[-1]" does not do what you want: older versions of IDL stop with an out-of-range error, and IDL 8.0 and later silently return the last element of "a". So always check "count" before using "indx".
The comparison operators you can use in this command are "GT", "LT", "LE", "GE", "EQ", and "NE" (greater than, less than, less than or equal, greater than or equal, equal, and not equal). You can also combine conditions, as above, using "AND" and "OR".
Creating IDL structures
There is a data structure in IDL called a "structure" which allows you to put lots of variables into a single bundle. For example, to create a structure with RA, DEC, ID, and FLUX you might do:
newstr= create_struct('ra', 0.D, 'dec', 0.D, 'id', 0L, 'flux', 0.)
This defines RA and DEC to be double-precision floating point, ID to be a 4-byte integer (a LONG in IDL parlance), and FLUX to be single-precision floating point. Alternatively, you could have gotten an identical result with:
newstr= {ra:0.D, dec:0.D, id:0L, flux:0.}
Once a structure like this is defined you can use help to look at its contents:
help, /st, newstr
or reference its contents with the "." symbol:
print, newstr.ra
newstr.ra= 180.
print, newstr.ra
Structures are usually most useful when you have a whole array of them. You can create such an array with the "replicate" command:
newstrs= replicate(newstr, 1000)
where you can use any (positive) number you need instead of 1000. Once you have done that, you can treat them like any other array:
newstrs.ra= randomn(seed, 1000)
newstrs.flux= randomn(seed, 1000)
splot, newstrs.ra, newstrs.flux, psym=3
To copy columns from one structure to a structure of a different format you can use the "struct_assign" command:
str1=replicate({flux:0., ra:0., dec:0.}, 1000)
str1.ra= randomn(seed, 1000)
str1.dec= randomn(seed, 1000)
str2=replicate({mag:0., ra:0., dec:0.}, 1000)
struct_assign, str1, str2
Reading and writing FITS files
In astrophysics, there is a special type of file called a FITS file, in which you can store tables and images. For example, an IDL structure can be written out as a FITS table as follows:
mwrfits, newstrs, 'output.fits', /create
Then you can read in the file:
in_newstrs= mrdfits('output.fits', 1)
splot, in_newstrs.ra, in_newstrs.flux, psym=3, color='red'
You should see that "in_newstrs" is the same as "newstrs".
Why the "/create" in "mwrfits"? Well, FITS files can actually have multiple tables. For example:
newstrs2= replicate({dec:0., blah:0.}, 1000)
newstrs2.dec= randomn(seed, 1000)
newstrs2.blah= randomn(seed, 1000)
mwrfits, newstrs, 'output.fits', /create
mwrfits, newstrs2, 'output.fits'
will create a file with two tables. You read in the first and second tables with
in_newstrs=mrdfits('output.fits',1)
in_newstrs2=mrdfits('output.fits',2)
Another variant of this that is useful for reading in large files is:
in_newstrs=hogg_mrdfits('output.fits',1,nrow=28800)
This reads in the file 28,800 rows at a time. For files with millions of rows, this can be very convenient (it saves a factor of two in memory, for example). Also, in combination with the "columns" keyword, which (look at the documentation) allows you to read in only some of the columns, it can save even more memory.
Calling IDL from a shell
The Basics
You can run IDL programs from the command line or in a shell script:
echo "print,'hello world'" | idl
Another way to do this which allows more complex commands is to use the << operator in a shell script. Edit a file, say stuff.sh, and write this
idl<<EOF
command1
command2
command3
EOF
where command1, etc. are whatever commands you want to run. If bash is your login shell, run the shell script like this:
bash stuff.sh &> stuff.out &
This sends all the output (both standard output and error messages) to the file stuff.out. For tcsh, use ">& stuff.out &" instead; note the change in order of & and >. If for some reason you want to put the error messages into a separate file:
bash stuff.sh 1> stuff.out 2> stuff.err &
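To try the here-document and the separate stdout/stderr redirection together without starting IDL, here is a minimal sketch. The function fake_idl is a hypothetical stand-in for idl (it only echoes each line it reads and then writes one line to stderr), and the temporary directory is an assumption made to keep the example self-contained; nothing here reflects how idl itself behaves.

```shell
#!/bin/sh
# fake_idl: hypothetical stand-in for "idl". It reads commands on stdin,
# echoes each one to stdout, and writes a single diagnostic to stderr.
fake_idl() {
    while IFS= read -r line; do
        echo "ran: $line"
    done
    echo "fake_idl: done" >&2
}

workdir=$(mktemp -d)

# Here-document: every line up to the closing EOF is fed to stdin.
# "1>" captures standard output, "2>" captures error messages.
fake_idl <<EOF 1> "$workdir/stuff.out" 2> "$workdir/stuff.err"
command1
command2
EOF

cat "$workdir/stuff.out"   # the two "ran: ..." lines
cat "$workdir/stuff.err"   # the diagnostic line
```

Swapping fake_idl for idl and the placeholder commands for real IDL statements gives the stuff.sh pattern described above.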
Remote Jobs
If you have logged in to a remote machine with ssh, these commands will be killed when you log out if you are "forwarding X", or at the very least your logout will hang and you will have to kill your shell. The solution is to also re-direct standard input. Add this to your command to prevent this:
bash stuff.sh < /dev/null &> stuff.out &
Sadly this does not work for IDL jobs because when you run IDL it actually calls a complex script. One would have to modify that script to take stdin from /dev/null.
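To see what redirecting standard input from /dev/null does on its own, here is a minimal sketch that uses a plain "sh -c" one-liner as a hypothetical stand-in for a job that reads stdin. It is not IDL, given the caveat above about the idl wrapper script, and the temporary directory is an assumption for the sake of a self-contained example.

```shell
#!/bin/sh
# Hypothetical stand-in for a batch job that tries to read from stdin.
job='if read -r line; then echo "got: $line"; else echo "stdin is empty"; fi'

workdir=$(mktemp -d)

# stdin comes from /dev/null, so "read" sees end-of-file immediately
# instead of holding on to the terminal. stdout and stderr both go to
# job.out, and the job runs in the background.
sh -c "$job" < /dev/null > "$workdir/job.out" 2>&1 &
wait    # only needed here so the sketch can show the result

cat "$workdir/job.out"
```

Because the job never touches the terminal for input, the login shell has nothing left to wait for on that stream when you log out.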
Running big jobs nicely
Say you have a big idl job called hello and it will take days to run. If you want to run it in the background and not interfere too much with other people on the same computer, do the following (for bash users):
echo hello | (nice -19 idl) > hello.$HOSTNAME 2> helloerr.$HOSTNAME &
This pipes the command hello into idl, runs idl "nice" so it takes as little CPU as possible from interactive (real-time) users, redirects stdout to the log file hello.wassup.physics.nyu.edu (if you are running on, say, wassup), redirects stderr to the log file helloerr..., and runs in the background (so you can log out and it will keep running).
If you don't mind interfering with other people on the same computer, just use "idl" in place of "(nice -19 idl)" in the above command.
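The pieces of that command (pipe, nice, per-host log files, background) can be tried out with a stand-in before committing to a multi-day run. This is a minimal sketch under stated assumptions: the "sh -c" one-liner is a hypothetical replacement for idl, the logs go to a temporary directory, "nice -n 19" is the POSIX spelling of the "nice -19" used above, and the fallback to "uname -n" covers shells that do not set $HOSTNAME.

```shell
#!/bin/sh
# Hypothetical stand-in for a long IDL job: reads one command name from
# stdin, reports it on stdout, and writes a warning to stderr.
job='read -r cmd; echo "running $cmd"; echo "warning from $cmd" >&2'

# $HOSTNAME is set by bash; fall back to uname -n in other shells.
host=${HOSTNAME:-$(uname -n)}
logdir=$(mktemp -d)

# Same shape as the command above: pipe the command in, run at the
# lowest priority, split stdout and stderr into per-host log files,
# and put the whole pipeline in the background.
echo hello | (nice -n 19 sh -c "$job") \
    > "$logdir/hello.$host" 2> "$logdir/helloerr.$host" &
wait    # a real multi-day job would be left running instead

cat "$logdir/hello.$host"      # stdout log
cat "$logdir/helloerr.$host"   # stderr log
```

Naming the logs by host keeps output from the same job on different machines from overwriting each other in a shared home directory.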
If you want to see the status of your job on the machine, use
top
which will show you the percentage of CPU, the amount of RAM, and the nice level of your job, among other things.
If you want to "watch the grass grow" you can
tail -f hello.$HOSTNAME
and watch the output being put in the file.
Licenses
There are a finite number of IDL licenses. That is, only a certain number of IDL jobs can be running at once. To check who is using them (since occasionally we run out and it is useful to know) you can run:
/usr/local/rsi/idl/bin/lmstat -a
There is also one trick to remember if you are running two IDL jobs on a single (two or more processor) machine. You can conserve IDL licenses by starting both jobs from the same window. That is, instead of sshing two times to the machine bias (say) and starting IDL in each, you can ssh in once and start both jobs from the same window. Most commonly you will do this when you have a big job that you start using the "echo blah | idl > output.out &" construction described in the previous section.
