
1. Introduction

1.1 Output files on the HP1000

Like the input files on the HP1000, the output files used for recording observing data have had many and varied formats. Initially data were generally recorded in some form of ASCII, with cryptic numbers in some form of header, followed by the data written as sets of numbers. These files were largely unintelligible to a human reader. They were also soon found to fill large amounts of very expensive disk space, which then cost around R1000 / MB, and access to large files was very slow. This prompted a change to binary formats, for instance a pure binary file for spectra, and an ASCII header followed by binary data for pulsars. These used much less space and could be read and written rapidly. This was an efficient local solution.

However, the data were not portable. The outside astronomical world adopted the FITS format for making data transportable. FITS files use headers comprising keywords + parameters, with one keyword + parameter per 80-character line, all stored in one or more 2880-byte blocks. The header is followed by the data in an array of either binary or ASCII format, also stored in multiple 2880-byte blocks. FITS encompasses many different internal formats, and it takes a 96-page manual to define the FITS Standard; a copy is at /usr/local/src/fits_standard.ps. Optional internal FITS structures for data storage include random groups (deprecated), binary tables and ASCII tables.
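The header layout described above can be sketched in a few lines of Python. This is only an illustration of the 80-character line and 2880-byte block rules, not a full FITS writer; the function names `pad_card` and `build_header` are hypothetical and do not come from any FITS library.

```python
CARD_LEN = 80      # characters per header line ("card")
BLOCK_LEN = 2880   # bytes per FITS block, i.e. 36 cards


def pad_card(text: str) -> str:
    """Pad one header line with blanks to exactly 80 characters."""
    if len(text) > CARD_LEN:
        raise ValueError(f"card longer than {CARD_LEN} characters: {text!r}")
    return text.ljust(CARD_LEN)


def build_header(cards: list[str]) -> bytes:
    """Join the cards, append END, and blank-fill to whole 2880-byte blocks."""
    body = "".join(pad_card(card) for card in cards) + pad_card("END")
    remainder = len(body) % BLOCK_LEN
    if remainder:
        body += " " * (BLOCK_LEN - remainder)
    return body.encode("ascii")


# Hypothetical minimal header with no data array (NAXIS = 0).
header = build_header([
    "SIMPLE  =                    T / file does conform to FITS standard",
    "BITPIX  =                    8 / number of bits per data pixel",
    "NAXIS   =                    0 / number of data axes",
])
print(len(header))  # total header length in bytes
```

Because there are no line terminators, a reader must step through the header in fixed 80-byte slices rather than reading it line by line.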

The recording of coordinates in FITS has also evolved. For further discussion, see Representations of celestial coordinates in FITS, of 1996/09/09, by E W Greisen and M Calabretta, in /usr/local/src/fits/wcs.all.ps

Unfortunately, HP-Kermit cannot handle the large record length of FITS files, so recourse was made to unblocked ASCII formats for porting data from the HP. In the case of spectra, where I had attempted to implement FITS files on the HP, I used what is essentially an unblocked version of FITS. Pulsar data, stored in binary with a cryptic ASCII header, were converted to ASCII. Continuum data, never recorded in binary, were ported off the HP in their existing cryptic ASCII format.

Spectra and drift scans were also exported offsite in a minimalist format of a line of descriptive header, followed by column names, followed by X,Y pairs in two-column format, all written in ASCII. The advantage of this format is its extreme portability, for example being e-mailable and editable with any editor, and its ability to be imported into non-specialist data analysis software such as spreadsheets.
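A minimal sketch of writing and reading this kind of file follows. The exact layout used on the HP is not given here, so the whitespace delimiter, the number formats and the function names `write_xy` and `read_xy` are all assumptions made for illustration.

```python
import io


def write_xy(stream, description, x_name, y_name, pairs):
    """Write one descriptive line, one line of column names, then X Y pairs."""
    stream.write(description + "\n")
    stream.write(f"{x_name} {y_name}\n")
    for x, y in pairs:
        stream.write(f"{x:.5f} {y:.6E}\n")


def read_xy(stream):
    """Read the file back as (description, column names, list of pairs)."""
    description = stream.readline().rstrip("\n")
    columns = stream.readline().split()
    pairs = [tuple(float(v) for v in line.split())
             for line in stream if line.strip()]
    return description, columns, pairs


# Round trip through an in-memory buffer; the values are illustrative only.
buffer = io.StringIO()
write_xy(buffer, "339.88-1.26 HartRAO 26M 1999-02-13", "VLSR_KM/S", "TA_K",
         [(-43.193, 0.336205), (-43.08061, 0.342031)])
buffer.seek(0)
description, columns, pairs = read_xy(buffer)
print(columns, pairs[0])
```

The column names must not contain blanks in this sketch, since the reader splits the second line on whitespace.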

1.2 Recording data on the NCCS

What data should be recorded on the NCCS, and in what format?

In theory we could again return to the efficient binary files, with cryptic, minimal housekeeping, of the HP. However, R1000 now buys 10 GB of disk space, computer speeds are up by a factor of 1000, and experience suggests that we should be recording a maximum of housekeeping information about each observation, not a minimum, in order to provide maximum validation of the data against changes in the system.

It is also beneficial to standardize the definitions used for the housekeeping. In this regard, the use of the FITS concept of keywords followed by parameters is a suitable model to follow. The following example shows the housekeeping associated with spectra files currently being ported from the HP1000. Keywords that follow the FITS standard are identified.


BEGIN                           / start of new data set
NAXIS   =  1                    / number of axes in data array (FITS)
NAXIS1  =  256                  / number of values on AXIS1 (FITS)
BUNIT   = 'K'                   / brightness units of data (FITS)
OBJECT  = '339.88-1.26'         / name of object (FITS)
TELESCOP= 'HartRAO 26M'         / telescope (FITS)
OBSERVER= 'S. GOEDHART'         / observer's name (FITS)
DATE-OBS= '1999-02-13'          / date of observation (FITS)
DATE    = '1999-03-17T13:54:17' / date of creation of header (FITS)
CTYPE1  = 'VELO-LSR KM/S'       / coordinate type for axis 1 (FITS)
CTYPE2  = 'ANTENNA_TEMP'        / brightness type of the data
CRVAL1  =  -43.193              / value at CRPIX1 (FITS)
CRPIX1  =  1                    / location of ref point on axis (FITS)
CDELT1  =   0.11239             / pixel spacing on NAXIS1 (FITS)
RESTFREQ=   6.668518E+09        / line rest frequency in Hz
RA      =   252.1               / RA of EQUINOX in degrees
DEC     =  -46.059              / DEC of EQUINOX in degrees
RADOWN  =   252.1               / RA of EQUINOX for down spectrum
DECDOWN =  -46.059              / DEC of EQUINOX for down spectrum
EQUINOX =   1950.               / equinox in years for coords (FITS)
GLII    =   339.883079          / galactic longitude
GBII    =  -1.25706087          / galactic latitude
GLIIDOWN=   339.883079          / galactic longitude for down posn
GBIIDOWN=  -1.25706087          / galactic latitude for down posn
JULDATE =   2451222.67          / Julian date of SCAN
HA      =  -19.87               / hour angle of SCAN in degrees
POL     =  0                    / polarization
TCAL    =   14.2                / calibration signal value
TCALUNIT= 'K'                   / calibration signal units
TSYS    =   59.0315             / system temperature of SCAN in TCALUNITS
DTSYS   =   2.32639612          / uncertainty in TSYS in TCALUNITS
DUR     =   210.                / total integration time of SCAN in S
RMS     =   0.142293849         / expected rms noise in data in BUNITS
PSS     =   0.                  / point source sensitivity in Jy/K
BW      =   640000.             / original bandwidth of spectrum in Hz
FROFFSET=   320000.             / frequency offset for fr sw sp in Hz
SCAN    =  31322                / scan or observation number
ADDED   =   2.                  / number of spectra averaged
FIRST_CH=  1                    / first valid pixel in data
LAST_CH =  256                  / last valid pixel in data
POLYFIT =  3                    / order of polynomial fit to baseline
FOLDED  =  0                    / =0 unfolded, =1 folded freq. sw.sp.
SMOOTHED=  0                    / =0 unsmoothed =1 smoothed
TRANSFRM=  0                    / =0 spectrum =1 transform
ALTDEG  =  75.03                / telescope altitude (elev) in deg
ATMCORR =  1.0013               / atmospheric atten. corr. applied
PNTCORR =  1.07704318           / pointing correction applied
GAINCORR=  0.0                  / antenna gain correction applied
END                             / end of housekeeping for data set (FITS)
0.336205E+00 0.342031E+00 0.177201E-01 -.336548E+00   / data
-.878676E-02 -.546026E-01 -.264129E+00 -.471806E-03   / next line of data
... repeat to end of data
BEGIN                           / start of next data set
... 

It should be noted that the housekeeping in the example does not tell us about the details of the feedhorn or receiver or polarization (although there is a keyword for this) that was used, subtleties that we might want to know about.
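One practical consequence of the keyword + parameter layout is that the housekeeping is easy to read by program. The sketch below parses a few lines copied from the spectrum example and rebuilds the velocity axis. It assumes that keywords occupy the first 8 columns with "=" in column 9, as in the example, and that the axis follows the usual linear FITS convention, value = CRVAL1 + (pixel - CRPIX1) * CDELT1. The function names are hypothetical.

```python
def parse_value(raw):
    """Convert the text after '=' to a string, int or float."""
    raw = raw.strip()
    if raw.startswith("'"):
        return raw[1:raw.index("'", 1)]       # quoted string, quotes removed
    token = raw.split("/", 1)[0].strip()      # drop the trailing comment
    try:
        return int(token)
    except ValueError:
        return float(token)


def parse_housekeeping(lines):
    """Collect KEYWORD = value pairs found between BEGIN and END."""
    header = {}
    for line in lines:
        keyword = line[:8].strip()
        if keyword == "BEGIN":
            continue
        if keyword == "END":
            break
        if line[8:9] == "=":
            header[keyword] = parse_value(line[9:])
    return header


def velocity_axis(header):
    """Axis value for each pixel, counting pixels from 1 as FITS does."""
    return [header["CRVAL1"] + (pixel - header["CRPIX1"]) * header["CDELT1"]
            for pixel in range(1, header["NAXIS1"] + 1)]


sample = """\
BEGIN                           / start of new data set
NAXIS1  =  256                  / number of values on AXIS1 (FITS)
CTYPE1  = 'VELO-LSR KM/S'       / coordinate type for axis 1 (FITS)
CRVAL1  =  -43.193              / value at CRPIX1 (FITS)
CRPIX1  =  1                    / location of ref point on axis (FITS)
CDELT1  =   0.11239             / pixel spacing on NAXIS1 (FITS)
END                             / end of housekeeping for data set (FITS)
""".splitlines()

header = parse_housekeeping(sample)
velocities = velocity_axis(header)
print(header["CTYPE1"], velocities[0], velocities[-1])
```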

What are the advantages of the format shown above?

What are the disadvantages of this format?

The FITS file format is the "industry standard" for astronomy. So for comparison with the first example, this is the "primary HDU" of a small area map observed with the HP1000 (in fact a beam pattern, in decibels) after porting to Linux and conversion to FITS format using the FITSIO subroutine package:


SIMPLE  =                    T / file does conform to FITS standard
BITPIX  =                  -32 / number of bits per data pixel
NAXIS   =                    2 / number of data axes
NAXIS1  =                  112 / length of data axis 1
NAXIS2  =                  101 / length of data axis 2
EXTEND  =                    T / FITS dataset may contain extensions
COMMENT   FITS (Flexible Image Transport System) format defined in Astronomy and
COMMENT   Astrophysics Supplement Series v44/p363, v44/p371, v73/p359, v73/p365.
COMMENT   Contact the NASA Science Office of Standards and Technology for the
COMMENT   FITS Definition document #100 and other FITS information.
CTYPE1  = 'RA      '           / COORD TYPE
CRVAL1  =               85.840 / START RA (DEG)
CDELT1  =               -0.040 / RA INTERVAL (DEG)
CTYPE2  = 'DEC     '           / COORD TYPE
CRVAL2  =               20.000 / START DEC (DEG)
CDELT2  =                0.040 / DEC INTERVAL (DEG)
DATE-OBS= '26/03/1985'         / DATE OF OBSERVATION
BUNIT   = 'dB      '           / DATA IN DECIBELS
BSCALE  =                  1.0 / REAL = VALUE*BSCALE + BZERO
BZERO   =                  0.0 /
TELESCOP= 'HART-26M'           / TELESCOPE
OBSERVER= 'M.J.GAYLARD'        /
INSTRUME= '13CM CIRC FEED'     / RECEIVER
OBJECT  = 'TAURUS A'           /
HISTORY  beam offsets +0.052, +0.012, ie ~ on axis
HISTORY  Fsyn = 23.333MHz, Fobs = 2300 MHz, repeats = 4
HISTORY  ambient GAASFET amplifier, narrowband IF
HISTORY  unsmoothed median linear beam pattern map
END

This was followed by blocks containing the map data as 32-bit binary floating-point values, as indicated by BITPIX = -32.

While the above looks similar to the spectra example, note that each line is padded with blanks to exactly 80 characters, and the 29 lines shown were followed by 7 blank lines of 80 characters, to produce the header of the HDU, comprising 36 lines each of 80 characters, in one 2880-byte block.
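The block arithmetic for this map can be checked with a short calculation. This is a sketch using only the numbers quoted in the header above (29 header lines, BITPIX = -32, NAXIS1 = 112, NAXIS2 = 101); the function names are hypothetical.

```python
import math

CARD_LEN = 80                               # characters per header line
BLOCK_LEN = 2880                            # bytes per FITS block
CARDS_PER_BLOCK = BLOCK_LEN // CARD_LEN     # 36 lines per block


def header_padding(n_cards):
    """Blank 80-character lines needed to fill the last header block."""
    return -n_cards % CARDS_PER_BLOCK


def data_blocks(bitpix, axes):
    """Number of 2880-byte blocks needed to hold the data array."""
    n_bytes = abs(bitpix) // 8 * math.prod(axes)
    return math.ceil(n_bytes / BLOCK_LEN)


print(header_padding(29))             # 7 blank lines follow END
print(data_blocks(-32, [112, 101]))   # 16 blocks for the map data
```

The map itself occupies 112 x 101 x 4 = 45248 bytes, so the last of the 16 data blocks is only partly filled and the rest of it is padding.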

Note also that in this case a lot of technical information relevant to the data was presented simply as HISTORY, where appropriate keywords and associated parameters would have provided a better solution.

What are the advantages of true FITS format?

What are the disadvantages of true FITS format?

What should we do for the NCCS?

It is clear that for exporting data, at least two formats are needed:

Initially all data are to be recorded in FITS format. These files can then be converted into other file formats as required, e.g. multicolumn ASCII, TEMPO format for pulsars, etc.

A good model to follow in doing this appears to be the SDFITS standard developed for the GBT. A starting point for looking at the documentation on this is: GBT Software Project Note - Documentation Index at http://www.gb.nrao.edu/GBT/MC/doc/index/spn_doc_index/.

