Next: Requirements list
Up: E-ALFA Software
Previous: SDFITS notes
Discussion on detection algorithms
- Martha's questions to Jon and Erwin (23Apr03):
I have some questions for you about your extant software and how you do
things now. I can't find much on the public HIPASS site of enough specificity
to answer my questions, and how universally the Barnes et al. paper applies.
* What software platform do you use for image processing of HIJASS/HIPASS?
Do you stick within aips++ or export and work in something else?
* What language do you code the detection algorithms in? Are they standalone?
* What data format (SDFITS? Measurement set?) are the data in when the detection
is performed?
* Are there differences in the above within the HIJASS/HIPASS collaborations?
- From Erwin (24Apr03):
Some info on HIPASS and the software we used for processing and
analysing the data. The HIJASS software is a variation on the same
theme, I believe, but the HIJASS people will be able to tell you.
> I have some questions for you about your extant software and how you do
> things now. I can't find much on the public HIPASS site of enough specificity
> to answer my questions, and how universally the Barnes et al. paper applies.
In principle, the Barnes et al. paper describes the procedure for the
pipeline processing that we used for standard HIPASS. The raw data
come in as single-dish FITS files and are then processed in AIPS++
using its measurement set scheme. Most of this was written in
GLISH, the scripting language for AIPS++, so all the hard work is done
in AIPS++. This was a conscious decision made at the time, which was
possible because of the large support and development effort for AIPS++
at ATNF at the time.
The end results were 3D FITS cubes, which were then used for all subsequent
analysis. All galaxy detection was done in the image domain (except for
the wavelets described below).
> * What software platform do you use for image processing of HIJASS/HIPASS?
> Do you stick within aips++ or export and work in something else?
Most of the processing is done outside AIPS++, generally with whatever
tools were available at the time. Given the exploratory and possibly
pioneering nature of the whole analysis, things sort of developed, and
some things were done for historical reasons that, with the hindsight we
have now, would obviously be done much better. The first attempt at galaxy
detection software was a Miriad script written by Virginia Kilborn,
which was essentially a peak detection algorithm that found all x-sigma
peaks, eliminated double detections, and extracted a total spectrum.
Other efforts were made with a wavelet algorithm, written in GLISH if I
remember correctly, as well as a detection algorithm based on connected
pixels, developed in Cardiff (Jon or Robert Minchin will be able to tell
you more about it). The wavelets turned out to be a bit of a dead end, as
the student in question left, and it wasn't possible to disentangle the
black box he left. There also was not much overlap between the lists
produced by the wavelet and peak detection algorithms, leading us to have
some doubt about the wavelet method. It would still be worth exploring though.
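For illustration only, the three steps Erwin lists for the first peak finder (find all x-sigma peaks, eliminate double detections, extract a total spectrum) might be sketched as follows. This is not the Miriad script itself. The nested-list cube layout `cube[chan][y][x]`, the MAD-based noise estimate, the separation limits and the box size are all assumptions made for the sketch.

```python
from statistics import median


def robust_noise(values):
    """Return (median, sigma), with sigma from the median absolute deviation."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return med, 1.4826 * mad  # 1.4826 converts MAD to sigma for Gaussian noise


def find_peaks(cube, nsigma=5.0, sep_pix=2, sep_chan=5):
    """Return de-duplicated (chan, y, x, flux) peaks, brightest first.

    cube is indexed as cube[chan][y][x] (an assumed layout).
    """
    flat = [v for plane in cube for row in plane for v in row]
    base, sigma = robust_noise(flat)
    # Step 1: every voxel more than nsigma above the baseline is a candidate.
    hits = [(v, c, y, x)
            for c, plane in enumerate(cube)
            for y, row in enumerate(plane)
            for x, v in enumerate(row)
            if v - base > nsigma * sigma]
    hits.sort(reverse=True)
    # Step 2: walk from brightest to faintest and drop any candidate that
    # lies close to one already kept (a "double detection").
    kept = []
    for v, c, y, x in hits:
        duplicate = any(abs(c - kc) <= sep_chan
                        and abs(y - ky) <= sep_pix
                        and abs(x - kx) <= sep_pix
                        for kc, ky, kx, _ in kept)
        if not duplicate:
            kept.append((c, y, x, v))
    return kept


def total_spectrum(cube, y, x, half_box=1):
    """Step 3: sum the spectra in a small pixel box centred on (y, x)."""
    ny, nx = len(cube[0]), len(cube[0][0])
    ys = range(max(0, y - half_box), min(ny, y + half_box + 1))
    xs = range(max(0, x - half_box), min(nx, x + half_box + 1))
    return [sum(plane[j][i] for j in ys for i in xs) for plane in cube]
```

Sorting the candidates by flux before merging means the brightest voxel of each group survives. That is one plausible reading of "eliminated double detections", not a documented rule.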
One of the main findings is that it is very, very difficult to develop an
algorithm that is both complete and reliable. With the conventional peak
detection methods it is possible to create an almost complete 5-sigma
peak-flux catalog, but completeness at high reliability is difficult at
the 3-sigma level. Partly this is caused by the sheer number of pixels
in an all-sky survey, so that there are a significant number of false
noise detections (even at 5 sigma we are talking about tens to hundreds
in HIPASS). These have to be filtered out by a human, or by
cross-correlation with other catalogs. Recently Melbourne has made a
Herculean effort to catalog all detections in HIPASS down to 3 sigma,
which meant that tens of thousands of detections had to be individually
checked, and even now the reliability is not extremely high. This is
not something we want to do with EALFA. Furthermore, for HIPASS the
analysis is complicated by the non-Gaussianity of the noise (and I
have no doubt that EALFA noise will be non-Gaussian as well). HIPASS has
a larger number of high-sigma noise peaks than you'd expect on the basis
of Gaussian noise.
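A back-of-the-envelope sketch shows why the number of pixels matters. For purely Gaussian noise, the expected number of false peaks is the number of independent samples times the one-sided tail probability. The sample count used below is a hypothetical round number, not a HIPASS figure, and real survey noise (non-Gaussian, as noted above) would give more high-sigma peaks than this estimate.

```python
import math


def gaussian_tail(nsigma):
    """One-sided probability that a Gaussian deviate exceeds nsigma."""
    return 0.5 * math.erfc(nsigma / math.sqrt(2.0))


def expected_false_peaks(n_independent, nsigma):
    """Expected count of pure-noise samples above nsigma (Gaussian noise only)."""
    return n_independent * gaussian_tail(nsigma)


if __name__ == "__main__":
    n_samples = 1e8  # hypothetical number of independent samples in a survey
    for nsigma in (3.0, 4.0, 5.0):
        print(nsigma, expected_false_peaks(n_samples, nsigma))
```

With the assumed 10^8 samples, the Gaussian expectation drops from roughly 135,000 false peaks at 3 sigma to roughly 29 at 5 sigma. Under this idealised assumption, that is why a 3-sigma catalog needs so much more vetting than a 5-sigma one.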
So to get back to your question, most of the detection algorithms that
are being used in Australia for HIPASS are essentially modified existing
algorithms from Miriad or AIPS or whatever was handy at the time.
Jon is in a better position to tell you about the Cardiff ones.
Unfortunately not much was published about the Australian HIPASS galaxy
finders. There is a little bit of information about them in
Kilborn et al. (2002, AJ, 124, 690), but not at a useful level of detail.
The efforts made by Cardiff are probably the best documented of all, in
some of Jon's papers.
Virginia Kilborn may be in the best position to give a summary of
developments on her galaxy finder.
Jon, can we get Virginia to write a one-page description of her program?
- From Jon (24Apr03):
I think Erwin answered most of your questions. The one remaining is on
automated galaxy detection. I think the only thing published is the one we
described in Davies et al. 2001, MNRAS, 328, 1151. As Erwin said, this initially
relies on a peak detection followed by convolution with a matched filter. Robert
has developed this further (so hopefully after reading this he will tell you
what he has done, but I don't think anything is published). As Erwin says, the
noise is non-Gaussian, so it is difficult to work out realistic detection
levels. This is a problem because for an HI survey you do not want to rely on
optical identifications, so either you set a very high S:N level or you have
to do lots of HI follow-ups. There were some very interesting things in the
'noise' in the HIPASS data which I didn't follow up. There is far more
positive-going noise than negative (I multiplied the spectra by -1). My guess is
that strong sources can be detected in spectra many tens of arcmin away from the
source, so there are many faint signals in spectra. A detailed look at the noise
characteristics of HIPASS would be an interesting project. I had a very
interesting chat yesterday with one of our people who works on the detection of
signals in gravity wave data. I think they are much further advanced than we
are. They use a large template bank and Fourier deconvolution. I think this
looks very promising (stand-alone and written in C, I think). I am in the process
of arranging a meeting with his group to see if we can carry out a few trial
runs using HI data. I will keep you informed.
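The sign-flip check Jon mentions (multiplying the spectra by -1 and comparing) can be sketched as a simple count of positive and negative excursions in one spectrum. The MAD-based noise estimate and the 3-sigma default are assumptions for illustration, not details taken from the HIPASS analysis.

```python
from statistics import median


def count_excursions(spectrum, nsigma=3.0):
    """Return (n_positive, n_negative) channels beyond nsigma of the median.

    For symmetric noise the two counts should be similar; a clear excess of
    positive excursions hints at faint real signal hiding in the "noise".
    """
    med = median(spectrum)
    sigma = 1.4826 * median(abs(v - med) for v in spectrum)  # MAD -> sigma
    positive = sum(1 for v in spectrum if v - med > nsigma * sigma)
    # Same test applied to the spectrum multiplied by -1.
    negative = sum(1 for v in spectrum if -(v - med) > nsigma * sigma)
    return positive, negative
```

Summing these counts over every spectrum in a cube would give a first measure of the asymmetry; the threshold would need tuning to the actual noise.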
- From Robert (24Apr03):
As Erwin and Jon have said, the primary selection relies on peak-flux
detection in a single spectrum at a time. After this selection a
matched filter is used to improve reliability and get a first guess at the
velocity widths. The whole cube is examined spectrum by spectrum and a
catalogue constructed. This catalogue is then purged to remove duplicate
sources, keeping those with the best match to the template. The current
version of the finder is coded in F77 and is pretty much a stand-alone
package.
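As a rough sketch of the scheme Robert describes (not the F77 finder itself), a matched filter can be run on each candidate peak with a small bank of templates, and duplicates then purged by keeping the best-matching entry. The top-hat template shape, the list of trial widths, the dictionary keys and the separation limits are all assumptions here; the template actually used is not specified in this exchange.

```python
import math


def matched_filter(spectrum, peak_chan, sigma, widths=(1, 2, 4, 8, 16, 32)):
    """Return (best_snr, best_width, best_start) for a candidate peak.

    Each top-hat template of width w is slid over every position that still
    covers peak_chan. The filtered S/N is sum / (sigma * sqrt(w)), and the
    best-scoring width is a first guess at the velocity width in channels.
    """
    n = len(spectrum)
    prefix = [0.0]  # prefix sums make each window sum O(1)
    for v in spectrum:
        prefix.append(prefix[-1] + v)
    best = (float("-inf"), 0, 0)
    for w in widths:
        if w > n:
            continue
        first = max(0, peak_chan - w + 1)
        last = min(peak_chan, n - w)
        for start in range(first, last + 1):
            snr = (prefix[start + w] - prefix[start]) / (sigma * math.sqrt(w))
            if snr > best[0]:
                best = (snr, w, start)
    return best


def purge_duplicates(candidates, sep_pix=2, sep_chan=10):
    """Keep only the best-matching candidate of each nearby group.

    candidates is a list of dicts with keys "x", "y", "chan" and "snr"
    (a hypothetical record layout).
    """
    kept = []
    for cand in sorted(candidates, key=lambda c: c["snr"], reverse=True):
        close = any(abs(cand["x"] - k["x"]) <= sep_pix
                    and abs(cand["y"] - k["y"]) <= sep_pix
                    and abs(cand["chan"] - k["chan"]) <= sep_chan
                    for k in kept)
        if not close:
            kept.append(cand)
    return kept
```

Dividing by sqrt(w) keeps the noise level of the filtered value the same for every template width, so S/N values from different widths can be compared directly.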
This page created and maintained by Martha Haynes.
Last modified: Thu Apr 24 21:54:40 EDT 2003