
SDP Kernel Workshop

The SDP Kernel Workshop, held in Cambridge on 29 November 2016, allowed key partners from industry to discuss algorithmic efficiency with radio-astronomy experts, focusing on the gridding algorithm.

The kernel we have selected is gridding, because:

  • The memory bandwidth and processing requirements are very demanding
  • The data processed has domain-specific characteristics
  • It is affected by numerous parameters

We separate the work that should be undertaken into the following sections:

  • Development of algorithms
  • Study of optimal data layout for cache re-use and processing
  • Alternative implementation of algorithms and data layout (e.g. sparse FFT)
  • Organization of the parameter space
  • Demonstrating the use of tools for profiling
  • Reporting the behavior for implemented algorithms
  • Description of the goals

For more information, please visit the GitHub repository for this work.

Frequently Asked Questions

A collection of clarifications to the work description

What is the role of gridding in the pipeline?

Gridding is currently projected to be one of the most expensive operations. We estimate that at least ~2-4 Pflop/s of the SDP's compute capacity will have to be spent on gridding. What makes gridding especially interesting for current pipeline planning is that it becomes more important as the telescope is scaled up, since naive scaling leads to O(n^4) complexity in gridding and kernels. There will obviously be other factors, but it looks like a pretty safe bet that sooner or later gridding performance will become very important.

Where does this much compute come from? To give a sense of scale, the data set described above represents:

  • 45 seconds (x480 for 6 hours)
  • 1 polarisation (x16 for all Mueller matrix terms)
  • 7.5 MHz (x40 for full Low band)
  • 1 facet (x25 for full FoV coverage to third null of antenna reception)
  • 1 Taylor term (x5 for recovering frequency dependence)
  • 1 major loop (x10 for 10 self-calibration iterations)

So we need to process a data set like this roughly 384 million times before a full observation is processed. The data sets will vary in terms of visibilities and w/A-kernels, but the uv-grid distribution will mostly stay the same.
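As a quick check of that figure, multiplying out the factors listed above (a minimal sketch, using only the numbers quoted in this FAQ):

factors = [480, 16, 40, 25, 5, 10]  # time, polarisation, frequency, facets, Taylor terms, major loops
total = 1
for f in factors:
    total *= f
print(total)  # 384000000, i.e. roughly 384 million data sets per observation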

What about weighting and deconvolution?

You might see these pop up in the reference implementation, but they are not expected to be relevant here.

Kernels switch fast. Do they have to be loaded from memory?

The visibility count per kernel depends on the baseline; note that baselines with many visibilities generally switch kernels more slowly. However, it is correct that (A-)kernels get switched extremely fast. After all, they depend not only on baseline, but also on time and frequency, so we get at minimum ~40 kernel switches per baseline even ignoring w-kernels.

For this reason it seems highly desirable to at least do the convolution on the fly. As argued in SDP Memo 028, this should be enough to ensure that gridding is not memory-limited on kernels in most situations. There are also more advanced algorithms which might have advantages in special situations, but this should be a good starting point.

Generating w/A-kernels on the fly would be possible as well. However, note that generation is generally not that expensive in the grand scheme of things: it only needs to be done per individual antenna/w-value, whereas convolution needs to be done for every antenna-pair/w-value combination. For example, the A-kernel and w-kernel sets are roughly 300 MB combined here, but all required kernel combinations would be ~4 TB!
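To illustrate why the combination count dominates, here is a small sketch with purely hypothetical station and w-plane counts (not the configuration behind the 300 MB / ~4 TB figures above, and ignoring the time/frequency dependence of the A-kernels):

stations = 512   # hypothetical station count, for illustration only
w_planes = 32    # hypothetical number of w-values, for illustration only

# Generation: one A-kernel per station plus one w-kernel per w-value
kernels_generated = stations + w_planes      # 544

# Convolution: one combined kernel per station pair and w-value
baselines = stations * (stations - 1) // 2   # 130816
kernel_combinations = baselines * w_planes   # 4186112

print(kernels_generated, kernel_combinations)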

What is the purpose of the C gridding code in examples/grid?

This code was written as a test-run to see how hard it would be to work with the HDF5 data set from a C-like environment. The data access code has been tested and should be correct.

However, while the gridding portions have been used for benchmarks, they have not been tested as thoroughly. We have verified that the results are correct for simple and w-projection imaging, but for Aw-projection the program is missing the actual convolution. So these portions of the program should be interpreted as illustrations, not as a reference.

How can convolution be implemented using FFTs?

If you look at the definition of aw_kernel_fn, you will notice that it is implemented as follows:

akern = scipy.signal.convolve2d(a1kern, a2kern, mode='same')

Here convolve2d implements a "naive" convolution algorithm. However, using the convolution theorem we can replace this by Fourier transforms. If we want to reproduce the original convolution exactly, this boils down to:

awkern = 29*29*extract_mid(
    numpy.fft.fftshift(numpy.fft.fft2(
        numpy.fft.ifft2(numpy.fft.ifftshift(pad_mid(a1kern, 29))) *
        numpy.fft.ifft2(numpy.fft.ifftshift(pad_mid(a2kern, 29))))),
    15)

Note that we need to pad the kernels to get the right border behaviour. However, the kernels for Aw-gridding will normally be chosen not to overflow the borders (i.e. padding should have happened when the kernels were generated). Therefore, the computationally simpler "wrapping" approach is permitted as well:

awkern = 15*15*numpy.fft.fftshift(numpy.fft.fft2(
    numpy.fft.ifft2(numpy.fft.ifftshift(a1kern)) *
    numpy.fft.ifft2(numpy.fft.ifftshift(a2kern))))

Note that the inner FFTs of the input kernels are repeated quite often, and should be shared as much as possible to reduce computation.
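As a sanity check, the padded variant can be compared directly against scipy.signal.convolve2d. The sketch below assumes 15x15 complex kernels and uses simplified pad_mid/extract_mid helpers standing in for the repository versions:

import numpy
import scipy.signal

def pad_mid(a, N):
    # Zero-pad an n x n array to N x N, keeping it centred (simplified stand-in)
    pad = (N - a.shape[0]) // 2
    return numpy.pad(a, pad, mode='constant')

def extract_mid(a, n):
    # Extract the central n x n region of an N x N array (simplified stand-in)
    lo = (a.shape[0] - n) // 2
    return a[lo:lo + n, lo:lo + n]

# Two random complex 15x15 test kernels
rng = numpy.random.default_rng(0)
a1kern = rng.standard_normal((15, 15)) + 1j * rng.standard_normal((15, 15))
a2kern = rng.standard_normal((15, 15)) + 1j * rng.standard_normal((15, 15))

# Reference: naive convolution
ref = scipy.signal.convolve2d(a1kern, a2kern, mode='same')

# FFT-based version, padded to 29 = 15 + 15 - 1 so nothing wraps around
awkern = 29*29*extract_mid(
    numpy.fft.fftshift(numpy.fft.fft2(
        numpy.fft.ifft2(numpy.fft.ifftshift(pad_mid(a1kern, 29))) *
        numpy.fft.ifft2(numpy.fft.ifftshift(pad_mid(a2kern, 29))))),
    15)

assert numpy.allclose(ref, awkern)

The wrapping variant agrees with this only when the kernels have been generated with enough padding that the convolution does not overflow the borders, which random test kernels like these do not satisfy.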

Recent news

Lovell telescope behind the new SKA HQ building (17 Jan 2019)

At the end of October 2018 the SKA SDP Consortium submitted its Critical Design Review (CDR) documentation pack. Contained in this were the formal deliverables of the design consortium covering all aspects of the SDP architecture, system engineering and programmatics (the documents are available here).  The documents were received by the CDR reviewers (consisting of 3 external members and 15 internal to SKAO) who proceeded to generate clarification questions, requests and notes in the form of observations (called OARs after the Observation Action Register approach commonly used for them). Where possible these were addressed via communication exchanges in JIRA OAR tickets.

From 15th to 19th January 2019 members of the SDP Consortium then visited the SKAO HQ at Jodrell Bank (see figure 1), where direct discussions took place in the new SKAO Council Chamber (see figure 2), to further explore the architecture and identify risks as part of an SEI Architecture Tradeoff Analysis Method (ATAM).


At the end of October the SDP Consortium submitted its full document set for Critical Design Review (these can be found at http://ska-sdp.org/publications/sdp-cdr-documentation), together with a large number of supporting memos (http://ska-sdp.org/publications/released-sdp-memos-i and http://ska-sdp.org/publications/released-sdp-memos-ii). Table 1 below shows the documents in three main categories: those associated with software and hardware architecture; those explaining the prototyping work that has been undertaken in support of the architecture; and those associated with system engineering (SE) and programmatics (e.g. specifications for how SDP interfaces with the wider telescope systems and how components will be constructed). The documents will receive observations from a panel of reviewers until the end of December. Responses to the observations and scenarios to ‘test’ the architecture will then be discussed at a review meeting from 15th to 18th January 2019 at Jodrell Bank.