IDL Programming Style Guidelines

Programming style concerns both the overall appearance of source code and how that code is organized and developed. It is possible to write a program without style that will work. It’s even possible to write a lot of software without style and have it function as intended. However, I have found, after decades of experience in writing, debugging, and running programs, that this approach is not a very good idea. Style is a framework that organizes how code is put together and allows one to more quickly understand the intent and execution of code without having to actually remember what it does. One might imagine a utopia where everyone has the same (good) programming style and source code can flow freely from person to person. I once thought this possible but gave up on the notion long ago, and it may well even be counter-productive. Different people have vastly different ways of tackling problems. Sometimes this just leads to code that is identical in function and differs only in implementation. But, regardless of what the code looks like, style is the key to achieving the goal of code development: code that is on time, on spec, without bugs, and amenable to future upgrades.

IDL imposes some requirements on style, the first of which is that the file containing a routine must have the same name as the routine, with a .pro extension. That is, a procedure named "sqrt" must be contained in a file named "sqrt.pro". This is required to support auto-compilation. It is possible to have more than one routine contained within a single source code file but the routine matching the name of the file must be the last one in the file.

Routines other than the one named for the file are treated as private routines and are intended to be called only by the named routine. This is very common, even required, for widget routines but is rare for non-widget routines and should be avoided in that case. One example of a non-widget routine using "private" routines is cvtsixty.pro. Note that the limit on name length is not as restrictive for private routines. Generally, the name of a private routine begins with the external name followed by an underscore.
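
As a minimal sketch of the file layout described above, assume a hypothetical file named sumsq.pro. The routine names sumsq and sumsq_sq are invented for illustration and are not part of the library, and the required documentation header is omitted for brevity.

```idl
; Contents of a hypothetical file named sumsq.pro.

; Private helper. Its name is the external name, an underscore, and a suffix.
function sumsq_sq,x
   return,x*x
end

; Public routine. It matches the file name and is the last one in the file.
function sumsq,x
   return,total(sumsq_sq(x))
end
```

Because the helper appears first, it is already compiled by the time the named routine at the end of the file is reached.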

Unless forced otherwise by the requirements of the language, all routine names should be limited to no more than 8 characters. Out of 485 routines as of this writing, 12 have names that are 9 characters long, 7 have names that are 10 characters long, and 8 have names that are 11-13 characters long. There are exceptions but they are rare.

All source code files must start with the standard IDL documentation header. A template, named idl_template, should be used as the starting point for the header. Look at a library routine for examples of how this header is used and populated. In general you will find that the documentation headers are more complete on routines that are most heavily used or that have seen multiple epochs of development. A minimal header is a sign of either a trivial or old routine or one that isn't really in general use. A poorly developed header is not allowed in new code, however. Consider this allowed only for legacy code.

No line in a source code file should exceed 80 characters. Rare exceptions are allowed but only in cases where following this rule would cause serious problems with code flow, understanding, or execution. The most common exception occurs with long format statement strings.
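
One way to stay within 80 characters is the IDL continuation character, $, which lets a single statement span several lines. The sketch below assumes a hypothetical routine named prtobs with invented argument names; the format string is only an example.

```idl
; Hypothetical routine; the documentation header is omitted for brevity.
pro prtobs,name,jd,mag,err
   fmt = '(a,1x,f13.5,1x,f8.4,1x,f6.4)'
   ; The trailing $ continues the statement on the next line.
   print,name,jd,mag,err, $
      format=fmt
end
```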

Tabbing is used to enhance code readability. How tabbing is handled in writing source code is between you and your editing program. However, the rule for the resulting source code file must be strictly followed: tab characters are not allowed in the file. Use hard spaces to set up the tabbing look. Tabbing is used to set up indentations in the code and the tab stops are at 3 space intervals (i.e., columns 1, 4, 7, ...).
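
A short sketch of the resulting indentation, using spaces only, with each nesting level starting at columns 1, 4, 7, and 10. The routine name indent3 is hypothetical and its documentation header is omitted for brevity.

```idl
pro indent3,nvals
   for i=0,nvals-1 do begin
      if i mod 2 eq 0 then begin
         print,i,' is even'
      endif else begin
         print,i,' is odd'
      endelse
   endfor
end
```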

A special note for those using Emacs for text and code editing: some versions (perhaps all?) of Emacs can, in some circumstances, omit the newline at the end of the last line in the file. This can cause problems with code compilation, with strange error messages. It can also create data or other input files that cannot be read properly with common IDL I/O tools. This, too, will lead to very strange error messages. If in doubt, you can use "od -c file.pro | tail -5" to see if the file is affected in this way.
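
The same check can be sketched in IDL itself. The function name endsnl is hypothetical and not part of the library; it reads the last byte of the file and reports whether that byte is a newline (byte value 10). The documentation header is omitted for brevity.

```idl
; Hypothetical helper. Returns 1 if the file ends with a newline, else 0.
function endsnl,fn
   openr,lun,fn,/get_lun
   info = fstat(lun)
   ; An empty file has no final newline.
   if info.size eq 0 then begin
      free_lun,lun
      return,0B
   endif
   ; Move to the last byte of the file and read it.
   lastbyte = 0B
   point_lun,lun,info.size-1
   readu,lun,lastbyte
   free_lun,lun
   return,lastbyte eq 10B
end
```

A call such as print,endsnl('file.pro') would then print 1 for a file that is properly terminated.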

Code is written in lower case. Comments should use upper case as dictated by normal prose writing rules. The only place where uppercase is used in code is in defining the keyword arguments of a routine. On the function/pro line, use uppercase for the keyword name and lowercase for the variable. This rule holds true in the parameter validation process as well.
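
A minimal sketch of the keyword case convention on the pro line. The routine kwdemo and its arguments are hypothetical, and the n_elements test is a generic stand-in for the library's own parameter validation, which this document does not show.

```idl
; Hypothetical routine; the documentation header is omitted for brevity.
pro kwdemo,jd,mag,ERR=err,VERBOSE=verbose
   ; Supply the default when the caller omits the ERR keyword.
   if n_elements(err) eq 0 then err = 0.01
   if keyword_set(verbose) then begin
      print,'kwdemo: ',n_elements(jd),' points, err=',err
   endif
end
```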

In the header there are a few fields worth special mention. The header is composed of lines that act like section headings and are provided in the template. Do not rename these headings and do not add information on these lines. NAME - this must match the name of the file and the routine found within. PURPOSE - a one-line string that describes what this routine does. CATEGORY - this identifies the general scope of the routine. Do not invent a new category without consultation. cat.txt shows the current categories. Use them and be mindful that case and spelling matter. The information for each of these three fields must be confined to a single line that follows the header keyword line, with no intervening blank lines between the keyword and the information.

DESCRIPTION is used to provide a high level review of what the routine is intended to do. This is usually just a paragraph or so. The CALLING SEQUENCE should show the command without any keywords. This must be kept in sync with the code and the rest of the documentation. I strongly recommend that all changes to the calling sequence begin by changing the documentation and then changing the code to match. INPUTS are those values on the command line that are input values. In the documentation it is important to document the units for any value that carries units. It is also useful to document the type, rank and size restrictions (if any) for each variable. OPTIONAL INPUT PARAMETERS are those that are part of the calling sequence but do not need to be supplied. KEYWORD INPUT PARAMETERS are keywords intended for input. In all cases of optional input the default value for the variable must be specified in the documentation header. Output values are generally treated the same as input, just realize that the language always treats output as optional and there is nothing the code can do about this. Keywords in each of their sections should be listed in alphabetical order. Inputs and outputs that appear on the command line should be listed in the order found on the command line. COMMON BLOCKS, SIDE EFFECTS, and RESTRICTIONS are rarely used but use them if needed. PROCEDURE is used for details of how to use the routine with as much or as little text as is needed to inform future users.

MODIFICATION HISTORY is very important once the first working version is reached. At this point a line will be added to indicate initial writing and the author. Modifications after initial release should be indicated as well but this is intended to be a high-level reminder. All such information must be tagged with a date in YYYY/MM/DD format (all numeric). See a recent routine for examples if in doubt.

Note that some of these header fields are read automatically by software to build documentation and web pages. All lines in the header should start with ";" which is the IDL comment character. Even blank lines should have a leading ; in the header. Blank lines in the header are mostly discretionary. Use them, or not, to enhance readability.
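
The following is a sketch of what a populated header might look like, not a copy of idl_template. The routine wtavg, its arguments, the author line, the order of the sections, the OUTPUTS and KEYWORD OUTPUT PARAMETERS headings, and the ;+ and ;- delimiter lines are all assumptions made for illustration; the real template governs the exact layout, and the CATEGORY entry must come from cat.txt.

```idl
;+
; NAME:
;  wtavg
; PURPOSE:
;  Compute the weighted average of a set of values.
; CATEGORY:
;  (one category copied exactly from cat.txt)
; DESCRIPTION:
;  Returns the sum of weight*x divided by the sum of the weights.
;
; CALLING SEQUENCE:
;  result = wtavg(x)
; INPUTS:
;  x - Values to average. Any numeric type, vector of any length.
;      Units are arbitrary and are carried through to the result.
; OPTIONAL INPUT PARAMETERS:
; KEYWORD INPUT PARAMETERS:
;  WEIGHT - Weight for each value, same length as x, dimensionless.
;             Default is 1.0 for every value.
; OUTPUTS:
;  Return value is the scalar weighted average, in the units of x.
; KEYWORD OUTPUT PARAMETERS:
; COMMON BLOCKS:
; SIDE EFFECTS:
; RESTRICTIONS:
; PROCEDURE:
; MODIFICATION HISTORY:
;  2009/09/28, Written by A. N. Author, Example Institute
;-
function wtavg,x,WEIGHT=weight
   if n_elements(weight) eq 0 then weight = replicate(1.0,n_elements(x))
   return,total(weight*x)/total(weight)
end
```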

Avoid one-character variable names. The exceptions are generic loop variables and the disposable return of the 'where' function. The use of i, j, k is permitted for loop variables but not anywhere else. z is used as a scratchpad variable for the return of a 'where' call. The scope of the z variable should not extend much beyond code that can be seen on a single editor screen at one time. A variable named "count" is used together with "z" as a scratch variable holding the length of the 'where' return. If 'where' return data needs to have a larger scope, use some other name (usually beginning with z) that can be tied to its count variable as well.
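
A brief sketch of both uses of the 'where' return. The routine clipmag and the names zbad and countbad are hypothetical; pairing the index name with a similarly named count is one reading of the rule above, not a documented library convention.

```idl
; Hypothetical routine; the documentation header is omitted for brevity.
pro clipmag,mag,maglim,nbad
   ; Short-lived scratch use. z and count are consumed within a few lines.
   z = where(mag gt maglim,count)
   if count ne 0 then mag[z] = maglim

   ; Longer-lived index. The name begins with z and has a matching count.
   zbad = where(finite(mag) eq 0,countbad)
   nbad = countbad
   if countbad ne 0 then mag[zbad] = 99.0
end
```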

Do not use gratuitous blanks in the code. For example:
wstr = repchar(wstr,':',' ')
is good,
wstr = repchar ( wstr, ':', ' ' )
is not. Blanks are used for meaningful syntactical separation, usually required by the language, e.g.,
for i=0,nvals-1 do begin
endfor

Comments are recommended to internally document the logic for a section of the code. These should generally be on lines of their own, not as tags at the end of the line. Comments on a line with code are allowed but only as very brief pointers and definitely should not be the start of a longer comment that runs on to additional lines.

Variable names should be meaningful but not excessively long. Reading the code should make sense but being terse is ok. It's also useful to choose names that support, rather than confuse, searching for that variable in the code.

The IDL language includes objects and object-oriented programming. These tools are strongly discouraged.

Common blocks are to be used very, very sparingly. The most common use is a private common block (private by convention but not enforced by the language) that is used by a routine to save information from one call to the next. Some fitting techniques will require the use of common blocks to pass information not allowed on the command line. In general, the reasons that often lead to using common blocks in other languages are not compelling reasons in IDL.
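
A minimal sketch of a private common block that saves information from one call to the next. The routine ncalls and the block name ncalls_com are hypothetical; naming the block after its routine is an assumption intended to keep it private by convention.

```idl
; Hypothetical routine; the documentation header is omitted for brevity.
function ncalls
   ; Private by convention only. No other routine should name this block.
   common ncalls_com,counter
   ; The variable is undefined on the first call.
   if n_elements(counter) eq 0 then counter = 0L
   counter = counter + 1L
   return,counter
end
```

Each call returns one more than the previous call did within the same IDL session.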

The goto statement is to be avoided. There are very few acceptable uses for goto. One of these is dealing with an error condition that requires exiting the routine where there are a few things that need to be done before the return. Before deciding to use a goto you should see if the code is made simpler and easier to read by adding the goto. There needs to be an overwhelming advantage to go down this path.
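
A sketch of the error-exit case described above, where a file unit must be released before returning no matter which check fails. The routine rdtwo, its arguments, and the label bailout are hypothetical.

```idl
; Hypothetical routine; the documentation header is omitted for brevity.
pro rdtwo,fn,line1,line2,error
   error = 1
   line1 = ''
   line2 = ''
   openr,lun,fn,/get_lun
   ; Either check can fail, and both need the same cleanup.
   if eof(lun) then goto,bailout
   readf,lun,line1
   if eof(lun) then goto,bailout
   readf,lun,line2
   error = 0
bailout:
   free_lun,lun
end
```

Whether this reads better than nested if blocks is exactly the judgment the rule asks for; with only one failure point the goto would not be justified.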

The character '&' can be used to put two or more statements on one line of the source code file. This construct is very strongly discouraged. Instead, put each statement of code on its own line.

Only one FOR command is allowed per line. If you need to write a nested FOR loop you should use multiple lines. Sometimes this will lead to code that looks funny. Good. FOR loops are your enemy. They shouldn't look nice.
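
A sketch of a nested loop written with one FOR per line, followed by the same work done as a single array operation. The routines skysub and skysub2 are hypothetical, each would live in its own .pro file with its own header, and the input image is assumed to be a 2-D array. Replacing the loops with an array expression is one reading of why FOR loops are treated as the enemy here.

```idl
; Hypothetical routine; the documentation header is omitted for brevity.
pro skysub,image,sky
   sz = size(image,/dimensions)
   ; Nested loops, one FOR per line.
   for i=0,sz[0]-1 do begin
      for j=0,sz[1]-1 do begin
         image[i,j] = image[i,j] - sky
      endfor
   endfor
end

; The same result as a single array operation, with no loop at all.
pro skysub2,image,sky
   image = image - sky
end
```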


Written by Marc W. Buie, Southwest Research Institute, 2009 Sep 28.