The WFPC2 Archival Pure Parallels Project
ABSTRACT
We introduce the WFPC2 Archival Pure Parallels Project, in which the methods and procedures described here are used to obtain a near‐automatic combination of WFPC2 images obtained as part of the WFPC2 Archival Parallels Program. Several techniques have been developed or refined to ensure proper alignment, registration, and combination of overlapping images that can be obtained at different times and with different orientations. We quantify the success rate and the accuracy of the registration of different types of images, and we develop techniques that are suitable to equalize the sky background without unduly affecting extended emission. About 600 combined images of the 1500 eventually planned have already been publicly released through the STScI Archive. The images released to date are especially suited to the study of star formation in the Magellanic Clouds, the stellar population in the halo of nearby galaxies, and the properties of star‐forming galaxies at
.
Received 2005 June 27; accepted 2005 November 7; published 2006 March 9
1. INTRODUCTION
1.1. The WFPC2 Archival Parallels Program
Over a span of several years, from 1997 through 2003, the Wide Field and Planetary Camera 2 (WFPC2) on board the Hubble Space Telescope (HST) carried out an Archival Parallels Program under the auspices of the Parallels Working Group, chaired by J. Frogel. The program consisted of a large number of parallel images; i.e., images of the area in the sky toward which the camera was pointed while another instrument on HST was executing planned observations. Such pointings are constrained to be random, in the sense that they are not expected to contain any special sources, except by reason of proximity to the primary target, which is 5′ to 12′ away, depending on the instrument used.
The program was designed to produce a set of observations that would provide a valuable database for the HST archive, with the potential to impact a range of scientific programs that the community at large could carry out.3 The Parallels Working Group identified three areas of special interest: young stars and star‐forming regions in our Galaxy and other nearby galaxies, the stellar content of galaxies in the local universe (including our own), and large‐scale structure in the universe and the distribution and evolution of galaxies.
For WFPC2, the observations recommended by the Parallels Working Group were implemented in three different programs: Galactic, extragalactic, and special objects. The Galactic and extragalactic programs were used for generic pointings; i.e., those not in the special‐objects category. Galactic pointings are those at Galactic latitude
, and extragalactic are those at higher latitudes. Special‐objects pointings are those that fall close to objects of interest. The most common category for the WFPC2 parallels is of pointings within a specified distance (10
, except for a few very large galaxies, such as M31) of galaxies less than about 3 Mpc away.
The Parallels Working Group specified the observing strategy for each type of pointing. In general, observations were obtained in one or more of the four Hubble Deep Field filters (F300W, F450W, F606W, and F814W), depending on the available time (i.e., the length of the primary observation). Other filters were used for some of the programs. In almost all cases, the emphasis was on breadth of the survey, taking advantage of the expected large area of coverage, rather than depth, which could not match that of dedicated observations. Two features were common to all programs: first, exposures were always obtained in pairs for each filter, in order to facilitate data processing and especially the rejection of cosmic rays, and second, regardless of the program, the most sensitive filter, F606W, was always used first, although sometimes for brief (300 s) exposures.
Thereafter, the observing strategy varied, depending on the type of pointing. For extragalactic pointings, most observations were obtained as part of the Broad Band Survey, which aimed at obtaining multicolor data in the four Hubble Deep Field filters for relatively bright (
mag) and therefore uncommon galaxies. The four filters contribute depth (F606W), UV morphology and U dropouts (F300W), B morphology and dropouts (F450W), and photometric redshift information (F814W) and were generally obtained in that order of priority. A small fraction of observations were obtained in a special program targeting the medium‐width filters F410M and F467M, which could identify star‐forming galaxies at redshifts of 2.36 and 2.85, respectively.
For Galactic pointings, a selection of narrowband filters (F502N and F656N) was added to the extragalactic filters in order to study the properties of diffuse line emission in objects such as distant star‐forming regions and planetary nebulae, which could be expected to be observed relatively often in random fields at low Galactic latitude.
The special pointings in the vicinity of bright, nearby galaxies were chosen so that WFPC2 could resolve the stellar population into individual stars, thus permitting a study of their resolved stellar populations. For Local Group galaxies, the sensitivity achieved in a typical 1000 s observation was sufficient to detect young stars in Hα emission, and thus F656N was added to the complement of filters. In a few cases, this has resulted in spectacular observations of diffuse line emission in the Magellanic Clouds. For galaxies beyond the Local Group, the sensitivity was insufficient to take full advantage of F656N or of F300W, and thus only F606W, F450W, and F814W were used.
As of 2002 July, the WFPC2 archival parallels had accumulated over 14,000 separate exposures, with a total exposure time of 11.77 Ms. Of this, the primary broadband filters F300W, F450W, F606W, and F814W had exposure times of 2.39, 1.42, 6.67, and 0.65 Ms, respectively. The distribution of these observations over the sky is shown in Figure 1.
Fig. 1.— Distribution of APPP data sets on the sky in Galactic coordinates. The size of each point is scaled by the total exposure time at that location. The two concentrations in the bottom left quadrant are the Large and Small Magellanic Clouds.
1.2. The Archival Pure Parallels Project
However, the enormous scientific potential of the archival parallels has remained largely untapped. The primary reason is accessibility: the WFPC2 images are generally not available in a readily usable form. The WFPC2 images in the archive need reliable rejection of image artifacts, such as cosmic rays and hot pixels. They also require the combination of co‐aligned and non–co‐aligned exposures, in addition to source catalog information. Some of these services are available for a subset of WFPC2 observations from the WFPC2 Associations effort.4
The Archival Pure Parallels Project (APPP) is an ongoing HST Archival Legacy program that aims at processing, combining, and delivering a large fraction of the parallel images taken by the WFPC2 Archival Parallels Program. The project will prioritize the available pointings on the basis of the number of filters available, the length of integration, and pointing characteristics. Among the special classes of observations that will receive a high priority are (1) fields in the Magellanic Clouds, which will permit a much wider study of star formation in different regions of the Clouds, (2) fields close to known radio and X‐ray sources (about 40 sources in the FIRST5 catalog fall into areas observed for the WFPC2 Archival Parallels Program), (3) fields that overlap with Chandra and XMM‐Newton observations, and (4) fields that have data in more than two filters.6 Overall, the project will make available to the community a total of 2500 images in 1500 pointings, encompassing about 7000 separate exposures, or about half of the Archival Parallels Program images. To date, about 600 images in 150 pointings have been released through the Multimission Archive at STScI (MAST). These images can be obtained through the APPP page7 at MAST. We expect that about 3 years (full‐time equivalent) of labor will be expended in the completion of this project.
The APPP is in many ways complementary to the WFPC2 Associations effort (Micol et al. 2000). Its scope is more limited, since it specializes in images taken as part of the Archival Parallels Program, whereas the WFPC2 Associations project extends to all WFPC2 images. On the other hand, there are other important differences: the APPP images are produced in a single frame, while those produced by the WFPC2 Associations effort are separated by chip, the APPP images are corrected for geometric distortion, and the APPP combines images taken as part of different visits, even when taken at different orientations. For regions of the sky that are observed repeatedly, this approach results in the delivery of deeper images that can also seamlessly cover a larger area of the sky. Cross‐registration between filters also allows source colors to be measured directly, even for non–point sources. A more detailed comparison of the final products of the WFPC2 associations and the APPP is presented in § 4.
In this paper, we describe in detail the methodology and procedures used to produce the combined images that have been released and that will be used for all upcoming images. In § 2, we describe the techniques that we have developed for accurate image registration, background equalization, and cosmic‐ray rejection. In § 3, we explain the data‐processing steps in procedural form. Section 4 summarizes the quality‐control procedures that we use before the data are publicly released.
2. DATA‐REDUCTION PROCESSES
The aim of the data‐processing pipeline is to produce astrometrically registered, drizzled images with background equalization and reliable rejection of artifacts, such as cosmic rays and hot pixels, for all four of the WFPC2 chips. Our techniques have been optimized to exploit the strengths of the WFPC2 instrument while simultaneously trying to mitigate its deficiencies.
2.1. Alignment across Chips
It is known from science data of rich stellar fields, as well as from Kelsall‐spot data, that the four WFPC2 detectors move with respect to each other over time, presumably as a consequence of physical changes in the WFPC2 optical bench. Casertano & Wiggs (2001) used the positions of 43 Kelsall spots in 173 images taken throughout the life of WFPC2, approximately one every 2 weeks. They found that the shift in relative positions of the spots (and therefore the chips) with time could be effectively modeled using a fourth‐degree polynomial.
We use the coefficients for the fourth‐degree polynomial from Casertano & Wiggs (2001) to determine the relative position of the chips at the time of observation.
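As a minimal sketch of this step, the relative offset of a chip at a given epoch can be obtained by evaluating the polynomial at the time of observation. The coefficient values, the reference epoch, the use of years as the time unit, and the function names below are illustrative assumptions; they are not the values published by Casertano & Wiggs (2001).

```python
def poly_shift(coeffs, t):
    """Evaluate c0 + c1*t + c2*t**2 + c3*t**3 + c4*t**4 with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


# Placeholder coefficients in pixels (hypothetical, NOT the published values).
DEMO_COEFFS_X = [0.0, 0.10, -0.02, 0.003, -0.0001]


def chip_offset(obs_year, ref_year=1994.0, coeffs=DEMO_COEFFS_X):
    """Offset of one chip along one axis at obs_year (assumed time unit: years)."""
    return poly_shift(coeffs, obs_year - ref_year)
```

In practice one such polynomial would be needed per chip and per axis, and the resulting offsets would be applied to the chip positions before drizzling.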
2.2. Alignment of Images
The nominal world coordinate system (WCS) values for the WFPC2 are subject to small relative errors of 0.1″ or smaller when the same guide star is used. In a few instances in which different guide stars are used to obtain WCS coordinates for two overlapping images, the relative errors in position may be as high as 2″–3″. We have developed a procedure to correct for these astrometric errors, which we describe below.
The image registration is best performed using individual images wherein all four chips have been drizzled to the output frame. We first construct a sequence of images such that each successive image in the sequence has the maximum possible overlap with one or more of the preceding images in the sequence. Thus, if there are n data sets, a set of n − 1 pairs with optimal overlap is determined. Relative shifts and rotations between images are determined pairwise and then propagated as described below. The motivation for choosing a maximum overlap sequence is to maximize the number of matched sources during image registration.
We have developed and tested two distinct approaches to image registration:
Approach 1.—We first determine the centroid position for all sources in the reference image and the image to be shifted using SExtractor (Bertin & Arnouts 1996). During source extraction, it is critical to use the output‐weight data from Drizzle as the weight_image in SExtractor, to reduce the number of spurious source detections. We then filter the source lists to exclude sources with possibly incorrect positions (e.g., saturated sources) and pass the source centroid positions to a triangle‐matching program that attempts to find matching sources in the reference and shifted images. This algorithm uses the principle that similarity properties of triangles hold for any transformation that involves a shift and/or rotation. Once a matched source list is obtained, this list is passed to the IRAF GEOMAP program using fitgeometry=rotate, which determines the relative shift and rotation between the two images. This approach works well when the number of real sources detected in the image is comparable to (or exceeds) the number of cosmic rays. Such a situation exists only for the minority of low Galactic latitude and Magellanic Cloud pointings in our data. Pointings in which there are few real sources and many more cosmic rays are predominant. In such situations, the triangle‐matching method fails to find a sufficient number of matched triangles and sources for nearly 50% of pointing/filter combinations. We therefore explored an alternative approach to image registration.
Approach 2.—Centroid positions for sources in the reference and shifted images are obtained and filtered as in approach 1. For each source in the reference image, we search for sources that lie within 25 pixels (2
Fig. 2.— Deviations of sources within 25 pixels of sources in the reference image. The cluster of points near the center represents true source matches between the reference image and the source image. Most other points are chance coincidences of cosmic rays.
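The clustering visible in Figure 2 suggests a simple way to estimate the shift between two source lists: collect every candidate offset within the search radius and locate the densest concentration. The following is a minimal sketch of that idea only. The function name, the voting scheme, and the 1 pixel bin size are assumptions made for illustration, and the sketch is not a description of the pipeline code.

```python
from collections import Counter


def estimate_shift(ref_sources, img_sources, radius=25.0, bin_size=1.0):
    """Return the (dx, dy) bin centre that collects the most candidate offsets.

    Each source list holds (x, y) centroids in pixels. Real sources all vote
    for nearly the same offset, while chance pairings with cosmic rays
    scatter over the whole search circle.
    """
    votes = Counter()
    for xr, yr in ref_sources:
        for xs, ys in img_sources:
            dx, dy = xs - xr, ys - yr
            if dx * dx + dy * dy <= radius * radius:
                votes[(round(dx / bin_size), round(dy / bin_size))] += 1
    if not votes:
        return None
    (ix, iy), _count = votes.most_common(1)[0]
    return ix * bin_size, iy * bin_size
```

Pairs whose offsets fall in or near the winning bin could then be treated as matched sources and passed to a fitting routine such as GEOMAP for a refined shift and rotation.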
Image registration is then repeated in pairs. For example, given a set of four images, the two images with the highest overlap (say, images A and B) are identified, and the second image (image B) is registered relative to the first (image A). Next, among the remaining two images, the image with the highest overlap with either A or B is chosen. For example, we may find that image D has the highest overlap with image A. Image D is then registered to image A. The remaining image C is registered to the other image (image A, B, or D) with which it has the highest overlap. Pairwise shifts and rotations obtained from GEOMAP are adjusted such that they are relative to the one image that does not undergo a shift or rotation (image A in our example). These shifts and rotations are transformed to a corresponding change in the WCS of the four chips. In the above example, all chips of all images except those of image A would receive revised WCS values.
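The ordering and propagation just described can be sketched as follows. This is an illustrative reading of the text rather than the pipeline implementation: the overlap values are assumed to be precomputed, each pairwise transform is assumed to have the form (rotation in radians, x shift, y shift) mapping an image into its partner's frame, and all function names are hypothetical.

```python
import math


def overlap_sequence(names, overlap):
    """Greedy maximum-overlap ordering: return (reference, [(image, partner)])."""
    def ov(a, b):
        return overlap.get((a, b), overlap.get((b, a), 0.0))

    # Start from the pair of images with the largest mutual overlap.
    first, second = max(
        ((a, b) for i, a in enumerate(names) for b in names[i + 1:]),
        key=lambda pair: ov(*pair),
    )
    placed, pairs = [first, second], [(second, first)]
    remaining = [n for n in names if n not in placed]
    while remaining:
        # Next: the unplaced image with the best overlap with any placed image.
        image, partner = max(
            ((r, p) for r in remaining for p in placed),
            key=lambda pair: ov(*pair),
        )
        pairs.append((image, partner))
        placed.append(image)
        remaining.remove(image)
    return first, pairs


def compose(outer, inner):
    """Apply `inner` first, then `outer`; each is (theta, dx, dy)."""
    t1, x1, y1 = outer
    t2, x2, y2 = inner
    c, s = math.cos(t1), math.sin(t1)
    return (t1 + t2, c * x2 - s * y2 + x1, s * x2 + c * y2 + y1)


def to_reference(reference, pairs, pairwise):
    """Chain the pairwise fits so every transform is relative to the reference."""
    absolute = {reference: (0.0, 0.0, 0.0)}
    for image, partner in pairs:
        absolute[image] = compose(absolute[partner], pairwise[(image, partner)])
    return absolute
```

With overlaps chosen to mimic the four-image example above, the sequence comes out as B registered to A, D registered to A, and C registered to D, and the transform for C is the composition of the C-to-D and D-to-A fits.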
2.3. Sky‐Background Equalization
The observed sky background at a particular location on the sky can change due to light scattered into the aperture from the bright Earth limb. Due to this phenomenon, two images of the same part of the sky taken at different times may have different values of the sky background. It is important to correct for such differences to ensure a uniform sky in the drizzled output image and to accurately reject cosmic rays.
We correct for relative offsets between sky values by matching the sky pairwise in the maximum overlap sequence obtained above. The sky offset between two images is the median of the differences between valid pixels in the region where the two images overlap. Sky offsets are determined relative to the overall reference image (image A in our example) and are then corrected.
We illustrate the background equalization process in Figure 3. We plot the counts in a particular row of a reference image (black solid line) and in the same row of the comparison image (dashed line) that has a higher value of the sky background. The higher average counts at lower column values are due to extended nebular emission in that part of the image. After the sky‐background offset has been corrected for, the counts in the comparison image substantially match those in the reference image (red solid line). The figure only illustrates this process for one row of binned data, but in the pipeline, this procedure is carried out using all pixels in the region of overlap.
Fig. 3.— Plot of counts (DN s−1) for a single row of image data. The data have been binned for clarity. The black solid line depicts the source counts in the reference image. The dashed line shows the counts in the shift image, which are all systematically higher. The red solid line depicts counts in the shift image after the sky‐background equalization process.
Note that only relative sky offsets are corrected by our procedure. No attempt is made to subtract the sky background from the reference image.
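A minimal sketch of the relative sky correction is given below, assuming that both images have already been drizzled onto the same output grid and that a pixel is valid wherever its weight is positive. The function names and the nested-list image representation are illustrative choices, not those of the pipeline.

```python
from statistics import median


def sky_offset(reference, image, ref_weight, img_weight):
    """Median of (image - reference) over pixels that are valid in both frames."""
    diffs = [
        pix - ref
        for ref_row, img_row, rw_row, iw_row
        in zip(reference, image, ref_weight, img_weight)
        for ref, pix, rw, iw in zip(ref_row, img_row, rw_row, iw_row)
        if rw > 0 and iw > 0
    ]
    if not diffs:
        raise ValueError("images have no valid pixels in common")
    return median(diffs)


def equalize(image, offset):
    """Subtract the relative offset; the reference sky level itself is kept."""
    return [[pix - offset for pix in row] for row in image]
```

Because the median is taken over differences rather than over each image separately, extended emission that is present in both images cancels and does not bias the offset.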
2.4. Drizzling
Drizzle is a method used for the linear reconstruction of an image from undersampled, dithered data (Fruchter & Hook 2002). The algorithm, known as variable‐pixel linear reconstruction, or informally as “drizzle,” preserves photometry and resolution, can weight input images according to the statistical significance of each pixel, and removes the effects of geometric distortion on both image shape and photometry. In addition, it provides a method for robust cosmic‐ray rejection. Drizzle was first used in the analysis of the Hubble Deep Field North data (Williams et al. 1996). It has since been included as a part of the STSDAS software package.
Drizzle takes the input data and transfers each pixel to the output frame. This typically involves a shift (in x‐ and y‐direction), a rotation, and a scaling. In addition, each input data pixel may be shrunk before it is drizzled. During drizzling, the user must specify the shifts, rotations, and scaling needed to go from the input frame to the output frame. A version of Drizzle that simplifies this process by working directly with WCS parameters has recently been developed. With this code, called wdrizzle, the user only needs to specify the WCS keywords of the input frame (available in the image header) and the WCS keywords of the output frame.
For each input data set, we ran wdrizzle on all four chips, drizzling them onto that data set's own output image (as defined by the output WCS) and weighting the input pixels by the inverse variance map. A scale and pixfrac of 1.0 were used throughout, along with a square kernel. Kozhurina‐Platais et al. (2003) used the inner calibration field of ω Cen exposed through filters F300W, F555W, and F814W to determine the geometric distortion of WFPC2 as a function of wavelength. They incorporated the improved PSF‐fitting technique of Anderson & King (2003) to fit a bicubic polynomial model in order to derive geometric distortion coefficients in the F300W and F814W filters relative to the distortion‐free coordinates in the F555W filter. We supply these distortion coefficients to wdrizzle. If coefficients are unavailable for a particular filter, those from the nearest available filter are used.
2.5. Cosmic‐Ray Rejection
We adopt the procedure proposed by Fruchter & Hook (2002) for cosmic‐ray rejection. We construct the median image by performing the median operation on the sky‐offset–corrected images. The median‐mask allows us to exclude invalid pixels from the median image. The median image is relatively free of cosmic rays. We back‐propagate the median image to the input frame of each chip of each of the individual input images, taking into account the image shifts and/or rotation and geometric distortion. This is done by interpolating the values of the median image using a program called wblot (which is a WCS‐aware version of the blot program in the dither package).
We take the spatial derivative of each of the blotted images. The derivative image is useful for estimating the extent to which errors in the computed image shift or the blurring effect of the median operation have modified the counts in the blotted image. We compare counts in each original image with those in the corresponding blotted image. If the difference is larger than that expected from a combination of the expected noise, the blurring effect of taking the median, or an error in the shift, the pixel is flagged as cosmic‐ray affected. We repeat this step on pixels adjacent to pixels already flagged, using a more stringent comparison criterion.
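The comparison can be sketched as a two-pass threshold test. The form of the threshold used here (a noise term plus a term proportional to the local derivative of the blotted image) and the numerical values of `snr` and `scale` are assumptions chosen for illustration; they are not the values used in the pipeline.

```python
def derivative(blot):
    """Largest absolute difference between each pixel and its 4 neighbours."""
    ny, nx = len(blot), len(blot[0])
    out = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < ny and 0 <= xx < nx:
                    out[y][x] = max(out[y][x], abs(blot[yy][xx] - blot[y][x]))
    return out


def flag_cosmic_rays(data, blot, noise, snr=(4.0, 3.0), scale=(0.9, 0.5)):
    """Return a boolean mask of pixels that deviate too much from the blot."""
    ny, nx = len(data), len(data[0])
    deriv = derivative(blot)

    def exceeds(y, x, k):
        limit = snr[k] * noise[y][x] + scale[k] * deriv[y][x]
        return abs(data[y][x] - blot[y][x]) > limit

    # First pass: test every pixel against the looser limits (index 0).
    mask = [[exceeds(y, x, 0) for x in range(nx)] for y in range(ny)]
    # Second pass: re-test the neighbours of flagged pixels with the
    # more stringent limits (index 1).
    grown = [row[:] for row in mask]
    for y in range(ny):
        for x in range(nx):
            if not mask[y][x]:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < ny and 0 <= xx < nx and exceeds(yy, xx, 1):
                        grown[yy][xx] = True
    return grown
```

The derivative term relaxes the test near steep gradients, such as the cores of stars, where a small registration error or the smoothing introduced by the median would otherwise produce false detections.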
3. DATA‐REDUCTION PIPELINE
All the techniques described above were implemented as a fully automated data‐reduction pipeline that can be applied to the calibrated data obtained from the STScI Archive. The pipeline consists of two segments: a database management unit, developed in Python, which identifies data sets that are proximate on the sky and arranges the appropriate data location, and a data‐processing unit, written primarily in IDL, which performs the data alignment and combination. The latter uses callable versions of the wdrizzle and wblot programs, which are identical in function to the STSDAS versions. Most of the pipeline processing is straightforward and sequential; however, as described above in § 2.2, the image registration step is fragile and can fail for a variety of reasons (poor initial positions, lack of sources, insufficient exposure time, and insufficient image overlap). Automatic checks verify whether registration was successful; if not, the affected images—in some cases, a full pointing/filter combination—are excluded from further processing. With current procedures, about 10% of all images fail the automatic registration check. The majority of registration failures are in the F300W filter. An additional 5% fail the subsequent manual quality check (see § 4). There are no failures in any of the other steps. While future changes in our procedures could slightly improve our success rate, our experience to date suggests that a fully automated alignment procedure is unlikely to work under all possible circumstances.
We now describe the individual steps taken as part of the image‐processing pipeline.
3.1. Image Database and Pointings
The first step in the processing is the definition of the relevant pointings and of the images associated with each. We started by extracting from MAST8 the data files associated with all images obtained as part of the Archival Parallels Program, including both the data files (suffix _c0f.fits) and the data quality files (suffix _c1f.fits). We also obtained all the flat‐field files used in processing these images. All images were retrieved from MAST during a brief period in 2003 July. This ensured that the on‐the‐fly reprocessing (OTFR) performed automatically by MAST was identical for all data sets.
After retrieving all images, we extracted all header parameters, including both primary and group parameters, and captured them into a relational database using MySQL. The data sets were then partitioned into groups using a simple proximity relation. First, two data sets are related if their reference points (the positions of the WFALL reference point on the sky) are less than 80″ apart. Second, two data sets related to a third are considered related. This proximity relation is symmetric and transitive, and therefore it is an equivalence relation, thus defining a partition in the set of all images under consideration. Each of these equivalence groups is called a “pointing.” In practical terms, pointings are defined by starting with data sets less than 80″ apart and then extending the rule by the friends‐of‐friends algorithm; i.e., a data set belongs to a pointing if and only if it is less than 80″ apart from at least one other data set in that pointing.
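The friends-of-friends grouping can be sketched with a union-find structure. In this sketch the linking length of 80 is assumed to be in arcseconds, reference positions are assumed to be given as (RA, Dec) in degrees, and the function names are hypothetical. The all-pairs loop is adequate for illustration, although a database of about 15,000 data sets would benefit from a spatial index.

```python
import math


def separation_arcsec(p, q):
    """Angular separation of two (ra, dec) points in degrees, via haversine."""
    ra1, dec1, ra2, dec2 = map(math.radians, (*p, *q))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0


def group_pointings(positions, link_arcsec=80.0):
    """Partition data sets into pointings; returns sorted lists of indices."""
    parent = list(range(len(positions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Link every pair closer than the linking length.
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if separation_arcsec(positions[i], positions[j]) < link_arcsec:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(positions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Three data sets spaced 60″ apart along a line end up in a single pointing even though the two outer ones are 120″ apart, which is how a pointing can grow to span several arcminutes.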
For the purpose of defining a pointing, all data sets are considered together, regardless of the filter used or the length of exposure. This could result in some anomalies in which the combined images in different filters are disjoint or nonoverlapping, or in which a combined image contains only one data set. We find that such anomalies are exceedingly rare, and thus the definition of pointing we adopt is useful. Each pointing is assigned a unique nine‐character identification based on the name of the data set in that pointing with the earliest observation date.
With our criteria, the 14,965 data sets in our database are grouped into 2460 pointings, with an average of about six data sets per pointing. The largest pointing contains 161 data sets and spans a diameter of 12′ on the sky.
Of these, 2305 pointings have at least one filter with two or more data sets, each longer than 100 s, in that filter. All these pointings can be processed by our pipeline. However, we are prioritizing the processing based on the science drivers, total exposure time, and the number of filters available. A total of 573 images in 149 pointings have already been released to MAST. We anticipate that a total of about 2500 images in 1500 pointings will be released before the completion of this project.
3.2. Preparing Data for Pipeline Processing
Once a pointing is chosen for pipeline processing, several steps need to be undertaken in order to simplify the processing itself and to obtain uniform output products. These steps consist of compiling the relevant data, defining the output images, and preprocessing the data to the extent needed. We have developed a number of Python scripts to carry out these steps automatically and efficiently.
Data collection is directory‐based. Each pointing is assigned a directory, with a separate subdirectory for each filter. The subdirectory contains the relevant science, data quality, and inverse flat‐field images in IRAF group format (suffixes c0h, c1h, and r4h, respectively).
In each subdirectory, subsets of images that are nominally co‐aligned (shifts of less than 0.01° in roll angle and 0.01″ in position) are considered suitable for cosmic‐ray rejection, which is then performed using the standard STSDAS crrej task. Using co‐aligned cosmic‐ray rejection significantly reduces the number of images that fail the alignment step and therefore significantly increases the quality of the resulting combined images.
Finally, the output image needs to be defined. The output image is rectangular, with pixel size equal to the average pixel size in chip 3, and is oriented with north up and east to the left. The output image for each filter is large enough to accommodate all of the input images, with a sufficient margin to account for possible alignment refinements (see § 2.2). Output image sizes range from 1600 to 7200 pixels on a side.
3.3. Stage 1 Pipeline
The main steps of the stage 1 pipeline are summarized in a data flow diagram in Figure 4. The stage 1 pipeline includes three separate drizzle operations and one inverse drizzle operation (blot). The various steps of the procedure are described below.
Fig. 4.— Data flow diagram for the major steps in the stage 1 pipeline processing. There are three separate drizzle operations and one inverse drizzle operation (blot). Each step is described in detail in the text.
3.3.1. Bad Pixels and Variance Maps
Bad pixels in an image are those that satisfy at least one of the following conditions:
1. Flagged as bad in the data quality files. These include transmission and other failures, blocked columns, saturated pixels, and bad pixels listed in calibration reference files.
2. Lie within 30 pixels of the inner edges for the WFC chips, and within 50 pixels of the edge for the PC chip.
3. Exhibit an inverse flat‐field value higher than 1.7, indicating that nearly half of the total flux is lost because of proximity to the pyramid edge.
4. Are adjacent to a pixel marked as saturated in the data quality file.
For such bad pixels, we set the weight to zero. For the remaining pixels, the weight is computed as the inverse of the variance, according to the method given by Casertano et al. (2000). This computation of the variance takes into account noise contributions from the sky background, dark current, read noise, and flat field. Contributions to shot noise from sources are not included. These weight images are used as the input weights for the drizzle.
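The weight computation can be sketched as follows, assuming that the variance map has already been computed and that the data quality information is available as a nonzero flag per pixel. Treating the inner edges as the low-x and low-y sides of each chip is an assumption of this sketch, as are the function and parameter names.

```python
def weight_map(variance, dq, inv_flat, saturated, edge=30, flat_limit=1.7):
    """Inverse-variance weights, set to zero wherever a bad-pixel rule applies.

    All inputs are 2D lists indexed [y][x]. Use edge=30 for the WFC chips
    and edge=50 for the PC chip.
    """
    ny, nx = len(variance), len(variance[0])
    weights = [[0.0] * nx for _ in range(ny)]
    for y in range(ny):
        for x in range(nx):
            # Rule 4: a saturated pixel in the 3x3 box centred on (y, x).
            # The box includes the pixel itself, which rule 1 already covers.
            near_saturated = any(
                saturated[yy][xx]
                for yy in range(max(0, y - 1), min(ny, y + 2))
                for xx in range(max(0, x - 1), min(nx, x + 2))
            )
            bad = (
                dq[y][x] != 0                    # rule 1: data quality flag
                or x < edge or y < edge          # rule 2: near an inner edge
                or inv_flat[y][x] > flat_limit   # rule 3: strong vignetting
                or near_saturated
            )
            if not bad and variance[y][x] > 0:
                weights[y][x] = 1.0 / variance[y][x]
    return weights
```

The resulting array can be passed directly to the drizzle step as the input weight image, so that bad pixels contribute nothing to the output.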
3.3.2. First Drizzle Pass
For each input image, a separate four‐chip mosaic is generated using the Drizzle algorithm on an image that has the same frame as the predefined output image. At this stage, the image position and orientation are defined using the header values for chip 2; the positions of the other three chips are adjusted according to the time‐dependent chip separation correction described in § 2.1. Each image thus retains the imperfections (cosmic rays, interchip gaps, etc.) of the input images, and the registration between the images is based solely on the header parameters.
3.3.3. Intrafilter Image Registration
In order to improve the registration between images taken in the same filter, sources are identified using SExtractor in the single mosaicked images obtained from each data set, using the weights obtained in the first‐pass combination. The algorithms described in § 2.2 are then used to determine relative shifts and rotations needed to optimize the relative registration among images taken in the same filter. The alignment quality is assessed using the rms scatter of the position residuals after outlier rejection. We show in Figures 5 and 6 typical vector plots of residual offsets between the reference image and a second image whose coordinates have been transformed using the best‐fit transformation provided by GEOMAP. The residuals are random and small. There are no systematic chip‐to‐chip variations, which would indicate an error in the relative positioning of the four chips. There are no systematic variations within each chip either, which would indicate errors in the distortion correction. For images that have a large number of sources (such as those shown in the figures), the rms of the residual offset is typically 10 mas or smaller in each coordinate.
Fig. 5.— Plot of vectors showing residual offsets in source positions between a reference image and a second image that has its coordinates transformed (with a small shift and rotation) to align it with the reference image. The offset vectors have been scaled up by a factor of 800 to make them visible. The outline is the WFPC2 footprint. It is clear that residuals are random and small (rms = 7 mas). There are no chip‐to‐chip variations and no systematic variation within each chip.
Fig. 6.— Vector plot of offsets in source positions between a reference image and the shifted/rotated image in which the shifts are large (∼3 pixels in each coordinate). Vectors are scaled up by a factor of 800, as in Fig. 5. The deviations are still random; however, their amplitude is somewhat larger.
The improved alignment information is propagated back to the input images, and their WCS parameters are updated accordingly. The shift and rotation required for each image and the rms of the residuals are recorded in the log file.
3.3.4. Second Drizzle Pass and Sky Equalization
Once the images have been correctly registered to each other, they are drizzled a second time, again with all four chips of a data set drizzled to the output frame. The sky level in each image is adjusted to ensure that the relative backgrounds match, using the sky‐background equalization technique described in § 2.3. The goal of this step is not to zero out the sky, which would lead to incorrect results whenever diffuse emission is present, but simply to remove time‐dependent background variations, which can adversely affect the quality of the combined images. An overall sky background that is consistent with one of the input images is retained. As a result, all images produced by our pipeline include a contribution from the sky background as observed.
3.3.5. The Median‐combined Image and Cosmic‐Ray Rejection
The single, aligned, sky‐equalized images obtained as a result of the second drizzle pass are median‐combined to remove the impact of cosmic rays. The median‐combined image is blotted back to each input image, generating a reference image for each detector. This reference image and the original image data are compared, taking into account the possibility of a net offset due to sky equalization. Cosmic rays are identified as described in § 2.5, and the corresponding pixels are recorded in the cosmic‐ray masks.
3.3.6. Third Drizzle Pass and First Image Output
Once the cosmic‐ray rejection is complete, the third drizzle pass is performed using the same inputs as in the second pass, but with the input weights modified to zero out the weight of pixels identified as cosmic rays. Unlike the previous drizzle passes, all input images for each filter are now drizzled onto a common output image. This image, which is a weighted combination of all data for the relevant filter, is written to disk with a header that records the basic information of the images that have been included in the combination. Besides the combined (science) image and the corresponding weight image, the input weight file to the third drizzle pass is also recorded; this file will be needed as input for the stage 2 pipeline. The updated WCS values for each input image, after image registration, are written as a FITS extension table of the science image.
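The effect of zeroing the weights of cosmic-ray pixels can be illustrated with a simple weighted mean. This sketch is not the drizzle algorithm: it assumes all inputs are already resampled onto the common output grid and that each image carries one scalar weight (for example, its exposure time). The function name and interface are hypothetical.

```python
def weighted_combine(images, weights, cr_masks):
    """Weighted mean of registered images, ignoring cosmic-ray pixels.

    images:   list of 2D images on a common grid
    weights:  one scalar weight per image
    cr_masks: list of 2D boolean masks (True = cosmic ray, weight set to 0)
    Returns (science, weight), both 2D; science is 0.0 where weight is 0.
    """
    ny, nx = len(images[0]), len(images[0][0])
    science = [[0.0] * nx for _ in range(ny)]
    weight = [[0.0] * nx for _ in range(ny)]
    for img, w, mask in zip(images, weights, cr_masks):
        for j in range(ny):
            for i in range(nx):
                if not mask[j][i]:
                    science[j][i] += w * img[j][i]
                    weight[j][i] += w
    for j in range(ny):
        for i in range(nx):
            if weight[j][i] > 0:
                science[j][i] /= weight[j][i]
    return science, weight
```

The returned weight image plays the role of the weight image described above: pixels affected by a cosmic ray in one input have a lower total weight than their neighbors.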
In addition, many intermediate files (such as the median‐combined image, the blotted medians, and the cosmic‐ray flags) can be written to disk if the debug flag is set.
3.4. Stage 2 Pipeline
Once all filters in a particular pointing have been processed through the stage 1 pipeline, a science image and weight image for each filter are obtained. However, these images may not be properly aligned across filters, since the registration process is carried out independently for each filter. Cross‐filter registration is performed by the stage 2 pipeline.
3.4.1. Cross‐Filter Registration
In the current pipeline implementation, we treat the F606W science image as the reference image for registration across filters. All final science images are registered to the F606W image using the technique described in § 2.2. We determine a list of sources for each filter using SExtractor and the weight file from the stage 1 pipeline, determine the relative shifts and rotations needed to align each filter with the F606W image, and propagate these shifts and rotations back to the original images, updating the WCS information separately for each chip. Note that absolute astrometry of the resulting image will still suffer from any inaccuracies present in the original images, which are derived from guide star position information through the knowledge of the HST focal plane solution at that time. In principle, the absolute astrometry could be improved by matching sources in our images with those listed in the Guide Star Catalog II, a step that has not yet been implemented.
3.4.2. Final Drizzle
When all image headers have been updated, the final wdrizzle pass is performed for all filters except F606W, using the updated alignment information. A new drizzle pass is preferable to shifting the previously obtained image, because it introduces less image degradation and noise correlation; since the final pass adopts the previously determined weight information, including the identification of cosmic‐ray events, it can be carried out efficiently, with a modest impact on computing resources.
The images resulting from this final pass, together with updated header information and a table extension containing the updated WCS parameters for all input images, are the versions that are made available to the community through the MAST site (see § 4.4).
4. QUALITY VERIFICATION AND DATA DELIVERY
4.1. Quality Control
Each image goes through an extensive set of automated and human (visual) checks before delivery to the archive. Automated checks include numerous flags for poor astrometric registration (too few sources, extremely large image shifts, high rms residuals after GEOMAP transformations, etc.), anomalous background offsets, and over‐rejection of cosmic rays. These automated checks are supplemented by human checks. A typical quality‐control process for a pointing includes the following human check procedure:
1. Display the final science images in each filter. Verify that each science image is of acceptable quality. Any image anomalies, such as bias jump, optical telescope assembly Earth reflection patterns, PC1 stray light patterns, etc. (Biretta et al. 1995), seen in the science image are noted in a log file.
2. Register and blink through the final science images in each filter to verify accurate cross‐filter registration.
3. Display each weight image and examine it for data quality. Issues to look for include excessive coincidence between bright sources and low weight (may indicate rejection of source cores), large areas of low weight not coinciding with image overlap (may indicate background problems, such as an Earth‐cross pattern), excessively high or low values, and gridding (should not occur with pixfrac=1).
4. If the science or weight images show some anomaly (e.g., incorrect cosmic‐ray rejection, poor registration), the sky‐cube image is inspected to diagnose the problem. Any diagnosis or suspicion is noted in the log file for subsequent analysis.
5. Examine the log files for the registration process to confirm that all images in the alignment sequence were registered with low rms residuals in the GEOMAP alignment process.
We reexamine images that fail the quality‐control processes. If the issue can be fixed by reprocessing, this is done. For approximately 5% of the final images, quality issues remain unresolved; these images are not delivered to the archive. The most common cause of failure in quality control is the presence of the Earth‐cross pattern in most or all of the data sets contributing to the final drizzled image.
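The automated part of the quality control can be pictured as a set of threshold tests on quantities recorded during processing. The sketch below is purely illustrative: the field names and every threshold value are hypothetical placeholders, since the text lists the kinds of flags that are raised but not the values used.

```python
from dataclasses import dataclass

@dataclass
class ProcessingStats:
    n_sources: int        # sources used in the registration fit
    shift_pixels: float   # size of the fitted image shift
    rms_mas: float        # rms residual after the transformation
    sky_offset: float     # background offset applied in sky equalization
    cr_fraction: float    # fraction of pixels flagged as cosmic rays

def automated_flags(stats, min_sources=5, max_shift_pixels=10.0,
                    max_rms_mas=30.0, max_sky_offset=5.0,
                    max_cr_fraction=0.1):
    """Return a list of flag names; all thresholds are hypothetical."""
    flags = []
    if stats.n_sources < min_sources:
        flags.append("too_few_sources")
    if stats.shift_pixels > max_shift_pixels:
        flags.append("large_shift")
    if stats.rms_mas > max_rms_mas:
        flags.append("high_rms")
    if abs(stats.sky_offset) > max_sky_offset:
        flags.append("anomalous_background")
    if stats.cr_fraction > max_cr_fraction:
        flags.append("cr_over_rejection")
    return flags
```

An image that raises any flag would then be routed to the visual inspection and reexamination steps described above.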
A significant contributor to the remaining uncertainty in astrometric registration is the random error in source centroid positions computed by SExtractor. The most recent version of that software (ver. 2.4.3) claims to greatly reduce such errors by allowing for centroid computation using a window function. The SExtractor documentation claims that the accuracy offered by the window function–based computation is comparable to that obtained from PSF‐fitting software. We are currently in the process of testing the new version of SExtractor and its centroiding algorithm. If found to be appropriate, we will use it to replace the version of the software currently in use in our image registration procedure.
4.2. Photometry Tests
We tested the photometric accuracy of a small number of images produced by our pipeline. Our test consisted of comparing the instrumental magnitudes of objects in the combined image with those measured in the individual images that contributed to it. The source detection and measurement parameters in SExtractor were mostly set to default values (e.g., 3 σ thresholds for detection). Varying these parameters somewhat did not change the basic result of the test. We used SExtractor in dual‐image mode for the photometry. We detected objects in the combined image, and aperture magnitudes were measured in the combined image and in each of the individual images obtained after the second drizzle operation. We show in Figure 7 the plot of the difference in aperture magnitude as measured on the combined image and a clipped mean of the individual frames, as a function of aperture magnitude in the combined image. A circular aperture with radius of 4 pixels was used. Some objects appear brighter in the individual images than in the combined image; this is almost certainly due to cosmic‐ray contamination, since the combined images are largely free of cosmic rays, while the individual images are not.
Fig. 7.— Difference in aperture magnitudes of sources in the combined image and a clipped mean of the same sources measured on the individual data sets that contributed to the combined image, plotted against the aperture magnitude for an image in the LMC. Sources for which aperture magnitudes are brighter in the individual images are almost certainly affected by cosmic rays. The remaining sources are distributed tightly about zero magnitude, with increased scatter at fainter magnitudes, as expected.
For all the images we tested, we found no systematic offsets in photometry between the combined and individual images that would indicate a problem in our procedure. We believe the combined images are photometrically accurate and are suitable for scientific analysis. However, it should be noted that we have tested only a very small fraction of the hundreds of images that we have produced. Photometric errors in untested images may be present and may indicate problems with our procedure; users discovering any such errors are urged to contact us.
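The quantity plotted in Figure 7 can be sketched as follows. The clipping scheme is not specified in the text, so this example assumes a single clipping pass about the median, with the scatter estimated from the median absolute deviation; the function names are hypothetical.

```python
import math
from statistics import mean, median

def instrumental_mag(flux):
    """Instrumental magnitude for a positive aperture flux (no zero point)."""
    return -2.5 * math.log10(flux)

def clipped_mean(values, nsigma=3.0):
    """Mean of the values within nsigma robust sigmas of the median.

    The robust sigma is 1.4826 times the median absolute deviation,
    an assumed choice that copes with a small number of frames.
    """
    values = list(values)
    med = median(values)
    sigma = 1.4826 * median(abs(v - med) for v in values)
    kept = [v for v in values if abs(v - med) <= nsigma * sigma]
    return mean(kept)

def magnitude_difference(combined_mag, individual_mags):
    """Combined-image magnitude minus clipped mean of individual magnitudes."""
    return combined_mag - clipped_mean(individual_mags)
```

A measurement that is much brighter in one individual frame, as expected for a cosmic-ray hit inside the aperture, is excluded by the clipping step when enough frames are available; with only two or three frames it can still bias the mean.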
4.3. The WFPC2 Associations
The Canadian Astronomy Data Centre, the Space Telescope European Coordinating Facility, and the Multimission Archive at STScI have made available a large number of combined WFPC2 images. These combined images are the products of the basic registration and averaging of related sets of WFPC2 images, referred to as associations. As of 2002 November, over 15,000 combined images had been created from associations of nearly 50,000 individual WFPC2 images.
The WFPC2 associations are a much stricter grouping of data sets than the “pointings” in our project. Two (or more) exposures in a given filter are grouped into an association if they belong to the same program (same proposal ID), their sky‐projected distance is not greater than 10″ (100 WF4 pixels), and their position angle does not differ by more than 0.03°. The APPP places no restrictions on proposal ID and position angle; it requires only that the sky‐projected separation be less than 80″. With our considerably looser grouping criteria, more data sets are generally grouped together in a pointing, which implies a higher signal‐to‐noise ratio (S/N) in the final image. In addition, by our definition, any position on the sky produces only a single image in each filter. For the WFPC2 associations, this is not necessarily true.
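The two grouping rules can be contrasted with a short sketch. The angular units are assumptions made for this example (arcseconds for both separation limits, degrees for the position-angle tolerance), and the `Exposure` record and function names are hypothetical; only the numerical limits and the criteria themselves come from the text.

```python
import math
from dataclasses import dataclass

@dataclass
class Exposure:
    proposal_id: int
    filter_name: str
    ra_deg: float
    dec_deg: float
    pa_deg: float   # position angle of the exposure

def angular_separation_arcsec(a, b):
    """Great-circle separation of two exposures (haversine formula)."""
    r1, d1 = math.radians(a.ra_deg), math.radians(a.dec_deg)
    r2, d2 = math.radians(b.ra_deg), math.radians(b.dec_deg)
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0

def pa_difference_deg(a, b):
    """Smallest difference between two position angles, in degrees."""
    d = abs(a.pa_deg - b.pa_deg) % 360.0
    return min(d, 360.0 - d)

def same_association(a, b):
    """WFPC2 association rule (units assumed: 10 arcsec, 0.03 deg)."""
    return (a.proposal_id == b.proposal_id
            and a.filter_name == b.filter_name
            and angular_separation_arcsec(a, b) <= 10.0
            and pa_difference_deg(a, b) <= 0.03)

def same_appp_pointing(a, b):
    """APPP rule: same filter and separation below 80 (assumed arcsec)."""
    return (a.filter_name == b.filter_name
            and angular_separation_arcsec(a, b) < 80.0)
```

Two exposures from different programs, taken 20 arcseconds apart at different orientations, fail the association test but fall in the same APPP pointing, which is how the looser criteria bring more data sets into each combined image.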
There are also significant differences in the data‐processing approach taken by the two teams. The processing for the WFPC2 associations does not include drizzling of the images or accounting for shifts in chip positions with time. They also do not include distortion corrections. Their procedures for image registration and cosmic‐ray rejection are also different. With our more ambitious approach, there are more avenues for failure during the image‐combination process. To compensate for this, we have included visual checks as an integral part of our quality‐control procedures.
Given the two projects' differences in grouping data sets together and processing the data, a comparison of the final images produced by the two projects is not straightforward. Nevertheless, we have tried to compare our images with those from the WFPC2 associations for a few representative pointings. We find that both types of images have S/Ns that are consistent with the detector and sky‐background characteristics and the effective exposure times.
Comparing the PSFs of starlike sources, we find that the FWHMs of sources in our images are not systematically different from the corresponding sources in the WFPC2 associations. This is particularly encouraging, because in principle our PSF may be broadened relative to the WFPC2 associations, due to a combination of factors, including resampling during drizzling and errors in interchip registration and in source centroiding in our image‐registration procedure.
4.4. Data Delivery and Web Access
The final science images, weight images, and a log file containing a brief summary of the properties of the images being drizzled and their registration are delivered as high‐level science products to the MAST science archive. In addition, we also provide a feature‐rich, Web‐based front end to the data for easy browsing. This includes a Digitized Sky Survey image of an area centered on the pointing, a three‐color composite WFPC2 image made using F450W/F606W/F814W data in the blue/green/red channels, respectively, header information for each image, and a preview image in each filter. For each pointing, we also provide a coverage map in the four principal broadband filters. We provide a link for each pointing to the NASA/IPAC Extragalactic Database, which searches for objects within 10 of the pointing center. No source catalogs have been released yet, although we aim to add them in the future. Delivered data are grouped together according to the science questions that they are useful in addressing. Metadata from our delivered images have been incorporated into the MAST database and are searchable through the MAST search interface. Image header keywords (and other metadata) will be updated on an ongoing basis to make the data more accessible through the Virtual Observatory. Any modifications of the procedures described in this paper will be fully documented in the README files accompanying data released through MAST.
We are grateful to the anonymous referee, whose insightful comments greatly improved this paper. Support for program AR‐09540 was provided by NASA through a grant from the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5‐26555.
REFERENCES
- Anderson, J., & King, I. R. 2003, PASP, 115, 113
- Bertin, E., & Arnouts, S. 1996, A&AS, 117, 393
- Biretta, J., Ritchie, C., & Rudloff, K. 1995, A Field Guide to WFPC2 Image Anomalies (WFPC2 Instrum. Sci. Rep. 1995‐06; Baltimore: STScI)
- Casertano, S., & Wiggs, M. S. 2001, An Improved Geometric Solution for WFPC2 (WFPC2 Instrum. Sci. Rep. 2001‐10; Baltimore: STScI)
- Casertano, S., et al. 2000, AJ, 120, 2747
- Fruchter, A. S., & Hook, R. N. 2002, PASP, 114, 144
- Kozhurina‐Platais, V., Anderson, J., & Koekemoer, A. M. 2003, Toward a Multi‐Wavelength Geometric Distortion Solution for WFPC2 (WFPC2 Instrum. Sci. Rep. 2003‐02; Baltimore: STScI)
- Micol, A., et al. 2000, in ASP Conf. Ser. 216, Astronomical Data Analysis Software and Systems IX, ed. N. Manset, C. Veillet, & D. Crabtree (San Francisco: ASP), 223
- Williams, R. E., et al. 1996, AJ, 112, 1335
1 Space Telescope European Coordinating Facility, European Southern Observatory, Karl‐Schwarzschild‐Strasse 2, D‐85748 Garching, Germany.
2 Department of Astronomy and Astrophysics, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA 95064.
3 The final report of the Working Group is available at http://www.stsci.edu/instruments/parallels/HSTParallel.html; additional material is available at http://www.stsci.edu/instruments/parallels.
5 Faint Images of the Radio Sky at Twenty Centimeters.
6 An overview of the goals of the Archival Pure Parallels Project can be found online at http://www-int.stsci.edu/~yogesh/APPP.












