Title: Radio Astronomy Data Transfer and eVLBI using KAREN
Authors: Stuart Weston, Timothy Natusch, Sergei Gulyaev
Affiliation: Auckland University of Technology
With increasingly large datasets and surveys, moving astronomical data around the world is a non-trivial problem. Moving data can be done in two ways. First, one might put the data on a storage device (e.g. magnetic tapes, CDs, or hard drives) and then carry it down the mountain. A second option is to store the data remotely and download it over the internet. But whether you move the data physically or electronically, how fast and how accurately can this be done?
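To make the "how fast" part of that question concrete, here is a rough back-of-the-envelope sketch in Python. The function names, dataset sizes, link speed, and shipping time are all illustrative assumptions, not figures from the paper.

```python
def network_transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to send size_tb terabytes over a link of link_gbps gigabits/s.

    efficiency is the fraction of the nominal link rate actually achieved
    (an assumed number; real protocols rarely reach 1.0).
    """
    bits = size_tb * 1e12 * 8
    return bits / (link_gbps * 1e9 * efficiency) / 3600.0


def shipping_rate_gbps(size_tb: float, transit_hours: float) -> float:
    """Effective data rate, in gigabits/s, of physically shipping the disks."""
    bits = size_tb * 1e12 * 8
    return bits / (transit_hours * 3600.0) / 1e9


if __name__ == "__main__":
    # Hypothetical numbers: a 10 TB observation, a 1 Gbit/s link that is
    # only 30% utilised, and a courier that takes three days.
    print("network:", network_transfer_hours(10, 1.0, efficiency=0.3), "hours")
    print("courier:", shipping_rate_gbps(10, 72), "Gbit/s effective")
```

Under these assumed numbers, a box of disks in transit is a surprisingly competitive "link", which is why the efficiency of the network protocol matters so much.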
Unsurprisingly, the answer to this question depends on the method you use. I often find myself needing to move files from one computer to another, and I use FTP (File Transfer Protocol) to do so. FTP is a standard for transferring files between computers, designed specifically for TCP/IP networks. You can read all about TCP/IP networks online, but the salient feature is that TCP was developed to provide a reliable, ordered stream of data from one computer to another. Whenever you browse the web or check your email, you are most probably using TCP/IP. However, TCP is not efficient on high-speed wide area networks. These are networks that cover a broad geographic area, where you don’t want performance to depend on location in the network. The UDP-based Data Transfer protocol (UDT) is an alternative that runs on top of UDP rather than TCP and is designed to make better use of the available bandwidth on long, fast links. Because UDP itself does not guarantee delivery, UDT has to handle reliability on its own. Consequently, your choice of protocol depends on your network and on your speed and efficiency requirements.
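One common reason a TCP-based transfer struggles on a long-distance link can be sketched with simple arithmetic: a sender can have at most one window of unacknowledged data in flight per round trip, so throughput is capped at roughly window size divided by round-trip time. The sketch below is an illustration of that general rule, not the paper's analysis; the window size, round-trip time, and link speed are assumed values.

```python
def window_limited_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput (megabits/s) when at most one window
    of data can be in flight per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


def bandwidth_delay_product_bytes(link_gbps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a link of link_gbps full."""
    return link_gbps * 1e9 * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.2            # assumed 200 ms intercontinental round trip
    window = 64 * 1024   # assumed 64 KiB window (no window scaling)
    print("cap with 64 KiB window:", window_limited_throughput_mbps(window, rtt), "Mbit/s")
    print("window needed to fill 1 Gbit/s:", bandwidth_delay_product_bytes(1.0, rtt), "bytes")
```

With these assumed values the cap is a few megabits per second on a gigabit link, regardless of how fast the link itself is. Tuned TCP stacks use much larger windows, so treat this as the worst case rather than a measurement.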
This paper tested the performance of FTP and several UDP-based protocols, including UDT, for transferring radio data across a network of telescopes. The authors sent the data through various paths (see figure above) using the different protocols. The application was Very Long Baseline Interferometry (VLBI), which combines radio dishes spread across the world into a single array to obtain exquisite resolution. One of the difficulties of this technique is that the data from each dish must be combined with the data from all the others. While this can be done after the observation using stored data, high-speed network connections allow the calculation to be done in real time, a method called e-VLBI. As you might imagine, the choice of file transfer protocol is a limiting factor in how effectively single-dish data can be redistributed over the network. For this particular problem, UDT was much faster than FTP (see table below). This result might not be too surprising considering that UDT is optimized for networks that span countries or even the globe. However, the stark difference in protocol performance illustrates the importance of understanding not only the what and why of a measurement, but also the how.
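As an aside on why VLBI spreads its dishes so far apart: the standard diffraction estimate for angular resolution is roughly the observing wavelength divided by the longest baseline. The short Python sketch below applies that estimate; the wavelength and baseline lengths are assumed, illustrative values and are not taken from the paper.

```python
import math

# Milliarcseconds in one radian.
MAS_PER_RADIAN = math.degrees(1.0) * 3600.0 * 1000.0


def resolution_mas(wavelength_m: float, baseline_m: float) -> float:
    """Approximate angular resolution (milliarcseconds) from theta ~ wavelength / baseline."""
    return wavelength_m / baseline_m * MAS_PER_RADIAN


if __name__ == "__main__":
    wavelength = 0.036  # assumed 3.6 cm observing wavelength
    for baseline_km in (0.03, 2000.0, 10000.0):  # single dish vs. assumed long baselines
        theta = resolution_mas(wavelength, baseline_km * 1000.0)
        print(f"{baseline_km:>8} km baseline -> {theta:.3g} mas")
```

The resolution improves in direct proportion to the baseline, which is the motivation for linking telescopes across countries and, in turn, for moving their data over wide area networks.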
Featured image from this site.