Home > Publications > SDP Kernel Workshop

SDP Kernel Workshop

The SDP Kernel Workshop, held in Cambridge on 29 November 2016, allowed key partners from industry to discuss algorithmic efficiency with radio-astronomy experts, looking at the Gridding algorithm.

The kernel we have selected is gridding, because:

  • The memory bandwidth and processing requirements are very demanding
  • The data processed has domain specific characteristics
  • It is affected by numerous parameters

We separate the work that should be undertaken into the following sections:

  • Development of algorithms
  • Study of optimal data layout for cache re-use and processing
  • Alternative implementation of algorithms and data layout (e.g. sparse FFT)
  • Organization of the parameter space
  • Demonstrating the use of tools for profiling
  • Reporting the behavior for implemented algorithms
  • Description of the goals

For more information please visit the GitHub repository on this work.

Frequently Asked Questions

A collection of clarifications to the work description

What is the role of gridding in the pipeline?

Gridding is currently projected to be one of the most expensive operations. We estimate that at least ~2-4 Pflop/s of the SDP's compute capacity will have to be spent on gridding. What makes gridding especially interesting for the current pipeline planning is that it will become more important the more the telescope is scaled up, as naive scaling will lead to O(n^4)complexity in gridding and kernels. There will obviously be other factors, but it looks like a pretty safe bet that sooner or later gridding performance will become very important.

Where does this much compute come from? To give a sense of scale, a data set described above reflects:

  • 45 seconds (x480 for 6 hours)
  • 1 polarisation (x16 for all Mueller matrix terms)
  • 7.5 MHz (x40 for full Low band)
  • 1 facet (x25 for full FoV coverage to third null of antenna reception)
  • 1 taylor term (x5 for recovering frequency dependence)
  • 1 major loop (x10 for 10 self-calibration iterations)

So we need to process a data set like this roughly 384 million times before a full observation is processed. The data sets will vary in terms of visibilities and w/A-kernels, but the uv-grid distribution will mostly stay the same.

What about weighting and deconvolution?

You might see these pop up in the reference implementation, but it is expected that those will not be relevant.

Kernels switch fast. Do they have to be loaded from memory?

Visibility count per kernel depends on the baseline, and note that generally baselines with many visibilities switch kernels more slowly. However, it is correct that (A-)kernels get switched extremely fast. After all, they depend not only on baseline, but also on time and frequency, so we get at minimum ~40 kernel switches per baseline even ignoring w-kernels.

It seems highly desirable for this reason to do convolution on-the-fly at minimum. As argued in SDP Memo 028, this should be enough to ensure that the kernel is not memory-limited on kernels in most situations. There are also more advanced algorithms which might have advantages in special situations, but this should be a good starting point.

Generating w/A-kernels on-the-fly would be possible as well. However, note that generation is generally not that expensive in the grand scheme of things: It only needs to be done per individual antenna/w-value, whereas convolution needs to be done for every antenna-pair/w-value combination. For example, the A-kernel and w-kernel sets are roughly 300 MB combined here, but all required kernel combinations would be ~4TB!

What is the purpose of the C gridding code in examples/grid?

This code was written as a test-run to see how hard it would be to work with the HDF5 data set from a C-like environment. The data access code has been tested and should be correct.

However, while the gridding portions has been used for benchmarks, it has not been tested as thoroughly. We have verified that the results are correct for simple and w-projection imaging, but for Aw-projection the program is missing actual convolution. So these portions of the program should be interpreted as illustrations, not as a reference.

How can convolution be implemented using FFTs?

If you look at the definition of aw_kernel_fn, you will notice that it is implemented as follows:

akern = scipy.signal.convolve2d(a1kern, a2kern, mode='same')

Where concolve2d implements a "naive" convolution algorithm. However, using FFT laws we can replace this by Fourier transformation. If we want to implement the original convolution exactly this would boil down to:

awkern = 29*29*extract_mid(

numpy.fft.fftshift(numpy.fft.fft2(

numpy.fft.ifft2(numpy.fft.ifftshift(pad_mid(a1kern,29))) *

numpy.fft.ifft2(numpy.fft.ifftshift(pad_mid(a2kern,29))) )),

15)

Note that we need to pad the kernels to get the right border behaviour. However, the kernels for Aw-gridding will normally be chosen to /not/ overflow the borders (e.g. padding should have happened when the kernels get generated). Therefore, the computationally simpler "wrapping" approach is permitted as well:

awkern = 15*15*numpy.fft.fftshift(numpy.fft.fft2(

numpy.fft.ifft2(numpy.fft.ifftshift(a1kern)) *

numpy.fft.ifft2(numpy.fft.ifftshift(a2kern)) ))

Note that the inner FFTs of the input kernels are repeated quite often, and should be shared as much as possible to reduce computation.

Recent news

SIP ARL

Since the last eNews submission the SDP has successfully submitted and certified its last milestone, M19, and now is busy preparing for the next milestone, M20, Pre-CDR. The M20 Release Readiness Notice (RRN) was submitted in early March and contains 19 documents within the submission pack. This list includes Systems Engineering documents, Interface documentation (ICDs) and an updated snapshot of the latest SDP Architecture views since the M19 submission.

This pre-CDR milestone is a stepping stone to a successful CDR delivery in October.  An incremental submission allows valuable key stakeholder feedback to be gained in particular on the architectural design to ensure continuous improvement and alignment of architectural priorities.

In addition to the finalisation of deliverables for Pre-CDR submission, other areas of focus for the Consortium are - further iterations of the SDP software architecture, advances in the SEI views of the high-level SDP architecture, continued understanding of the SDP interfaces, progress on the SDP functional model using the Algorithmic Reference Library (ARL), progress on the SDP prototyping testbed (P3) in Cambridge, progress on the SDP Integration Prototype (SIP) including end-to-end testing, consolidation of data models work, monitoring of the hardware costs evolution and updates to the corresponding predicted cost to name a few!

The sections below expand in further detail the recent work efforts and progress in the areas of platform and systems integration prototyping (P3, SIP) as well as the ARL. This work reduces risk and provides the rationale for design choices ahead of System CDR.

SDP Architecture doc

The next significant milestone for the SDP Consortium is M19 on the 30th November. M19 is a lightweight review of the current state of the SDP software architecture and forms a checkpoint with the SKAO to present the current state of the architecture and progress towards Critical Design Review (CDR).

All relevant SDP SEI style views and other supporting documentation for functionally close blocks of architecture will be reviewed. The goal is to present the information in sufficient detail to facilitate immediate use by a knowledgeable audience member.

In preparation for this review, a Stakeholders Overview will take place. The required participants will include key members of the SDP architectural design team and knowledgeable stakeholders from SKAO. This review is to ensure that the high-level architecture documentation, for the purposes of providing an introductory, high-level explanation of the design, is sufficiently complete. In preparation for this overview, key members of the SDP Architecture design team recently selected a collection of views for review and this information has been distributed to the SKAO.

Comments and or suggestions made during the Stakeholder Overview and the official M19 review will guide further architectural priorities ahead of M20 (Element pre-CDR).