Network Working Group                                         A. Bhushan
Request for Comments: 573                                         MIT-DM
NIC: 19083                                             14 September 1973

           DATA AND FILE TRANSFER - SOME MEASUREMENT RESULTS
During the last six months, we have been monitoring (although not
continuously) the performance of our FTP-user and FTP-server
programs. The purpose of this paper is to 1) discuss measurement
criteria, 2) describe the measurement facilities, 3) report the
relevant measurement results, 4) discuss the significance of results
and compare them with other measurement data, and 5) ask for
suggestions on our measurement and summarizing procedures.
I. THE MEASUREMENT CRITERIA
The FTP (Ref. "The File Transfer Protocol", by Abhay Bhushan, NWG/RFC
354, NIC 10596) may be considered a facility for data transfer
between file systems. The relevant measurement parameters for a data
transfer facility are:
1) Transfer rate (both peak and average, measured in bits per second)
which determines the throughput of the data transfer facility.
2) Response time or delay (measured in seconds) which determines the
"interactibility" of the facility.
3) Processing cost (measured in dollars or cpu-seconds per megabit
transferred) for transferring the data between the network and the
file system. This is only one component of the cost of transferring
data, the other component being the communication cost (including IMP
processing costs) which we take as given.
4) Failure-to-connect rate - average time elapsed between failures to
connect to the facility (measured in hours). Failures could be in
the Host (processor and file system) hardware or software, or in the
IMPs and telephone lines.
5) Availability - the percentage of time a given facility is
available, or alternately the probability of finding the facility
available at a given time.
6) Accuracy - measured by the probability of error in transferring
bits, bytes, blocks, or files.
Bhushan [Page 1]
RFC 573 DATA AND FILE TRANSFER September 1973
II. THE MEASUREMENT FACILITIES
The MIT-DMS survey program (ref. "A Report on the Survey Project" by
Abhay Bhushan, NWG/RFC 530, NIC 17375) measures the response-time,
failure-to-connect rate, and availability of the Host-logger facility
(on socket 1). Our preliminary experiments have indicated that the
corresponding measurement results for the FTP are very close to those
for the logger (at least they are the same order-of-magnitude). As
the use of FTP and the ARPANET is increasing rapidly, most Hosts have
their logger and FTP operational whenever their Host and NCP (Network
Control Program) are functioning. The response time for obtaining
the use of FTP service is very close to that for obtaining the use of
the logger service as both involve the use of the ICP (Initial
Connection Protocol).
Preliminary results from the Survey Project indicate that the average
response time in recent months has been about 2.7 seconds. The
average availability has been about 85% with the failure-to-connect
rate being about once every 10 hours. Table I shows summary results
for the time period August 26 through August 31, 1973, for three
Hosts with TENEX operating systems (SRI-ARC (NIC), BBN-TENEXA, and
USC-ISI).
The reader is cautioned that the data below reflect Host performance
as seen by the MIT-DMS survey program, which surveys the Hosts only
once every twenty minutes. Consequently, the actual Host performance
may be somewhat different. Also, we cannot distinguish among IMP,
telephone-line, and Host failures, and the response time of a Host is
affected by its distance (number of IMP hops) from the MIT IMP
(IMP 6).
In the data shown in Table I, each success or fail response is
considered to have a duration of 20 minutes, so Hosts are given the
benefit of the doubt for the time we are not surveying. In addition,
the response time has been averaged only over the successful (logger
available) responses. The logger is considered available if the
SURVEY program can establish a full-duplex connection within 20
seconds. The Host is considered available when it is not in the
"DEAD" state (the states in which the logger is not up but the Host
is available are "logger not responding" and "logger rejecting").
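The bookkeeping described above can be sketched as follows. This is
only an illustration under stated assumptions, not the SURVEY program
itself: the state labels, the function name, and the (state, seconds)
sample format are hypothetical.

```python
from collections import Counter

SAMPLE_MINUTES = 20  # each survey response is credited with 20 minutes

# Hypothetical state labels; the text names the states, not their encoding.
UP = "logger available"
NOT_RESPONDING = "logger not responding"
REJECTING = "logger rejecting"
DEAD = "dead"


def summarize(samples):
    """Return (host availability, logger availability, mean response time).

    `samples` is a list of (state, response_seconds) pairs, one per
    survey probe. response_seconds is only used when state == UP.
    """
    if not samples:
        raise ValueError("at least one survey sample is required")
    counts = Counter(state for state, _ in samples)
    total_minutes = len(samples) * SAMPLE_MINUTES
    # The Host counts as available in every state except DEAD.
    host_minutes = (len(samples) - counts[DEAD]) * SAMPLE_MINUTES
    logger_minutes = counts[UP] * SAMPLE_MINUTES
    # Response time is averaged over successful responses only.
    times = [seconds for state, seconds in samples if state == UP]
    mean_response = sum(times) / len(times) if times else None
    return (host_minutes / total_minutes,
            logger_minutes / total_minutes,
            mean_response)
```

Because every probe is weighted by the same 20 minutes, the
availability figures reduce to simple fractions of the probes in each
state.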
                                TABLE I

    RESPONSE TIME, AVAILABILITY, AND FAILURE RATE FOR SELECTED HOSTS
          (based on SURVEY data for 8/25/73 through 8/31/73)

   PARAMETER                          NIC       BBN       ISI

   Average Response-time (sec.)       2.7       2.4       3.0
   Host Availability                  93%       85%       87%
   Logger Availability                91%       79%       83%
   Failure-to-connect rate
     for Host (hours)                18.2       9.4      18.1
   Failure-to-connect rate
     for logger (hours)              16.0       6.0      10.0

The details on the above measurements will be reported in a
forthcoming paper. This paper will focus on the remaining parameters of
transmission rate, processing costs and accuracy, as measured by the
MIT-DMS File Transfer Measurement facility.
The FTP measurement facility exists in the MIT-DMS CALICO subsystem.
Each time the MIT-DMS FTP-user or FTP-server program in the CALICO
subsystem is used to transfer files (and data) via the ARPANET, it
records in a local disk file the following transfer parameters: the
remote Host involved, the date and time the transfer is initiated,
the total number of bits transferred, the real time taken (in
seconds) for the transfer, the CPU time (in micro-seconds) used by
the program, whether the program is the server or user, and the FTP
parameter settings for byte size (BYTE), representation type (TYPE),
transfer mode (MODE), and the file structure (STRU). Programs exist
in CALICO to display and summarize this data.
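One way to picture a recorded transfer and the two figures derived
from it (bits per second, and CPU microseconds per bit) is the sketch
below. The class and field names are hypothetical and do not reflect
the actual CALICO file format; the CPU time in the usage example is an
invented value.

```python
from dataclasses import dataclass


@dataclass
class TransferRecord:
    """One logged transfer; field names are illustrative assumptions."""
    host: str              # remote Host involved
    started: str           # date and time the transfer was initiated
    bits: int              # total number of bits transferred
    real_seconds: float    # real time taken for the transfer
    cpu_microseconds: int  # CPU time used by the program
    role: str              # "U" for FTP-user, "S" for FTP-server
    byte_size: int         # BYTE setting
    rep_type: str          # TYPE setting
    mode: str              # MODE setting
    structure: str         # STRU setting

    def bps(self):
        """Transfer rate in bits per second of real time."""
        return self.bits / self.real_seconds

    def cpu_us_per_bit(self):
        """CPU microseconds per bit (numerically CPU seconds per megabit)."""
        return self.cpu_microseconds / self.bits


if __name__ == "__main__":
    # An 80-bit transfer taking 7 seconds; the CPU time is made up.
    rec = TransferRecord("mit-ml", "73/08/29 22:30:55", 80, 7.0, 4000,
                         "U", 8, "A", "S", "F")
    print(int(rec.bps()), rec.cpu_us_per_bit())
```

Keeping only raw counts in the record and deriving the rates at
display time avoids storing rounded values.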
No measurements are recorded when the non-CALICO FTP-user and
FTP-server programs are used for transferring files, so the
measurements represent a small subset of our total FTP usage. The
CALICO FTP-server was operated only until May 1973, when we switched
to the non-CALICO FTP-server. (The switch was made because CALICO,
still undergoing development, is somewhat less reliable. As CALICO
stabilizes we may again operate the CALICO server and continue
measuring data transfer.) In addition, many users prefer to use the
simpler (involving fewer system resources) stand-alone FTP-user
program. The measurement does include the data transferred when FTP
is used indirectly by such commands as "copy", "print", "listf", and
"mail.file" in the CALICO NETWRK subsystem.
III. THE MEASUREMENT RESULTS
The measurement facility has been operational (though not
continuously) since 25 February 1973. It has recorded the transfer
of 304 files consisting of 57.6 million bits. Over 90% of the bits
transferred (but only 75% of the files) used the more efficient
Image-36 stream mode (TYPE I, BYTE 36, MODE S) of transfer. The
remainder of the files were transferred using the ASCII-8 stream mode
(TYPE A, BYTE 8, MODE S). It should be noted that even though block
mode was available, it was never used by our users (primarily because
many FTP-servers do not implement it, and it is less efficient to
use). All the files had a sequential non-record file structure (STRU
F). A summary of the measurement results is shown in Table II.
                               TABLE II

                  SUMMARY OF FTP MEASUREMENT RESULTS

   Subset of data      # Files   # bits   Av. File   Speed   CPU-use
                                  Mbits     Kbits     Kbps    sec/Mb

   Total                  304     57.6       189      7.56       4
   Image 36 mode          223     53.6       240      9.35       3
   ASCII-8 mode            81      4.0        49      2.09      19
   Server sending          62      3.8        61      7.50       2
   Server receiving       110     19.8       180      7.44       1
   User receiving          83     22.8       276      7.92       6
   User sending            49     11.1       225      7.09       4

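As a cross-check on Table II, the average file size column can be
recomputed from the file counts and bit totals. The sketch below is
illustrative only; the names are hypothetical, and small differences
from the printed column are to be expected because the Mbits figures
are themselves rounded.

```python
# (number of files, megabits transferred) taken from Table II
TABLE_II = {
    "Total": (304, 57.6),
    "Image 36 mode": (223, 53.6),
    "ASCII-8 mode": (81, 4.0),
    "Server sending": (62, 3.8),
    "Server receiving": (110, 19.8),
    "User receiving": (83, 22.8),
    "User sending": (49, 11.1),
}


def average_file_kbits(files, mbits):
    """Average file size in kilobits, given a file count and megabits."""
    return mbits * 1000 / files


if __name__ == "__main__":
    for subset, (files, mbits) in TABLE_II.items():
        print(f"{subset:18s} {average_file_kbits(files, mbits):6.0f} Kbits")
```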
The entire display of the measurement data and the summaries shown in
Table II are generated by the "PFTPST" (Print FTP Statistics)
program in the CALICO subsystem. A sample of the data displayed is
shown in Table III. The BPS (bits per second) and the M/B (CPU
microseconds per bit or CPU seconds per Megabit) information is
calculated by the displaying program. The largest file transferred
was 5.03 Mbits, a "STOR" by the FTP-user to MIT-AI. The transfer
took 10 minutes of real time for a transfer rate of a little over 10
Kbps. The highest data transfer rate recorded was 27.8 Kbps, a
"RETR" from BBN-TENEXA to MIT-DMS FTP-server. The length of the file
in the above case was 28 Kbits. Needless to say, both of the above
transfers used the more efficient Image-36 mode for transfer. The
smallest file, which also had the lowest transfer rate recorded, was
an 80-bit "MLFL" to MIT-ML (using ASCII-8), which took 7 seconds of
real time for an 11 bps transfer rate.
                               TABLE III

                SAMPLE DISPLAY OF FTP MEASUREMENT DATA

-#- ---HOST--- COMM --DATE-- --TIME-- --BITS-- -BPS- M/B T BY PRG
  2 sri-arc    STOR 73/08/09 18:19:49   121392  1395  21 I 36  U
198 mit-ml     STOR 73/08/15 15:00:30    50688  5336   8 I 36  U
198 mit-ml     RETR 73/08/15 15:01:14    50688 10137  12 I 36  U
198 mit-ml     STOR 73/08/15 15:02:33   255456  8808   7 I 36  U
198 mit-ml     RETR 73/08/15 15:03:58   258048  8601  12 I 36  U
134 mit-ai     STOR 73/08/15 15:13:17   286720  1898  29 A  8  U
134 mit-ai     RETR 73/08/15 15:18:39   258048  9557  14 I 36  U
134 mit-ai     STOR 73/08/15 15:19:42   258048  6974   7 I 36  U
  2 sri-arc    RETR 73/08/15 15:31:20     7236  3618  22 I 36  U
  2 sri-arc    STOR 73/08/15 15:32:55    49428  8238  31 I 36  U
  2 sri-arc    RETR 73/08/15 15:34:56    49428  3530  15 I 36  U
  2 sri-arc    STOR 73/08/15 15:38:09    49428  7061   8 I 36  U
  2 sri-arc    STOR 73/08/20 15:18:26    35460  2364   9 I 36  U
  2 sri-arc    RETR 73/08/20 16:08:09    58832   426 153 A  8  U
  2 sri-arc    RETR 73/08/22 12:46:10    10512   166 247 A  8  U
  2 sri-arc    RETR 73/08/23 16:29:37      320    64 369 A  8  U
  2 sri-arc    RETR 73/08/24 12:25:38     9992   262 254 A  8  U
  2 sri-arc    RETR 73/08/24 12:27:26     9992   454 250 A  8  U
198 mit-ml     STOR 73/08/29 10:40:58   768924  7538   7 I 36  U
198 mit-ml     STOR 73/08/29 10:44:09   166572  5552   7 I 36  U
198 mit-ml     STOR 73/08/29 10:54:32   166572  7932   7 I 36  U
198 mit-ml     STOR 73/08/29 13:48:18   158040 12156   7 I 36  U
 69 bbn-tenexa MLFL 73/08/29 22:30:55     5600  1866  51 A  8  U
 69 bbn-tenexa MLFL 73/08/29 22:31:42     5600  2800  50 A  8  U
 86 usc-isi    MLFL 73/08/29 22:33:55     5600  1400  54 A  8  U
 69 bbn-tenexa MLFL 73/08/29 22:36:15     5600  2800  48 A  8  U
 69 bbn-tenexa MLFL 73/08/29 22:36:54     5600  2800  49 A  8  U

It should be pointed out that recent measurement data for ASCII-8
transfer include retrieval of "NIC Journal" documents
("<Xjournal>xxxxx.nls;xnls" files) from SRI-ARC. SRI-ARC converts
these "xnls" files from NLS to sequential form on the fly, which
takes considerable time and gives a low transfer rate for these
transfers.
In transferring files we found the ARPANET and the FTP to be quite
reliable. On numerous occasions we transferred a complete listing of
our operating system (about 6 million bits), reassembled it and ran
it with no problem. No data lossage problems have been reported to
us as yet.
IV. THE SIGNIFICANCE OF MEASUREMENT RESULTS
First of all let me state my complete agreement with Barry Wessler
(Ref. "Revelations in Network Host Measurements" NWG/RFC 557, NIC
18457) that the measurement results should be taken in the spirit:
"Here is a place to make the Network better" rather than: "Look,
isn't the Network terrible." We take these measurements in the same
spirit and have found the measurement effort to be quite fruitful.
In several instances, with the aid of our measurement facilities, we
have been able to improve the performance of our Network programs by
an order-of-magnitude (just as Don Allen at BBN improved Greg Hicks'
RJS program).
Our measurement results are in close agreement with the BBN FTP
measurements (8.2 cpu seconds/Mb for 8-bit byte and 2 CPU seconds/Mb
for 36-bit byte transfers). We also find the 36-bit byte transfer to
be an order-of-magnitude more efficient than 8-bit byte transfer.
The processing cost (assuming $6.00 per CPU minute) for transferring
a Megabit of information comes to about $1.90 for ASCII-8 mode as
compared to only $0.30 for Image-36 mode. The difference in
transfer rate is equally astounding: 9.4 Kbps for Image-36 as
compared to only 2 Kbps for ASCII-8.
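The cost figures above follow directly from the CPU-use column of
Table II and the assumed charge of $6.00 per CPU minute. A small
sketch of the arithmetic (the function and constant names are
hypothetical):

```python
CPU_DOLLARS_PER_MINUTE = 6.00  # charging rate assumed in the text


def cost_per_megabit(cpu_seconds_per_mbit,
                     dollars_per_minute=CPU_DOLLARS_PER_MINUTE):
    """Processing cost in dollars for transferring one megabit."""
    return cpu_seconds_per_mbit * dollars_per_minute / 60.0


if __name__ == "__main__":
    # CPU seconds per megabit from Table II: 19 for ASCII-8, 3 for Image-36.
    print(f"ASCII-8:  ${cost_per_megabit(19):.2f} per megabit")
    print(f"Image-36: ${cost_per_megabit(3):.2f} per megabit")
```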
It is therefore recommended that Image-36 mode be used as much as
possible to transfer data between PDP-10s (of which there are many on
the ARPANET). It is strongly urged that protocols and programs allow
(and use) the Image-36 mode for all data transfers including mailing
files (MLFL), listing directories (LIST, NLST), and
sending/retrieving NIC Journal documents. Many of the MIT-DMS user
programs such as "COPY" and "FTP" take advantage of the fact that the
remote Host is a PDP-10 (there is a table of PDP-10's in "COPY") and
use the more efficient Image-36 mode. Such a procedure is highly
recommended.
The effective IMP-IMP data transfer rate is about 37.5 Kbps over the
50 Kbps telephone line (Ref. McQuillan John M., "Throughput in the
ARPA Network--Analysis and Measurement," BBN Report 2491, NIC 14188,
January 1971). The Host-to-Host data transfer measurements performed
by BBN (above reference, p. 28) have indicated a transfer rate of
30-35 Kbps BBN-to-BBN (0 IMP hops) and 12-16 Kbps BBN-to-SRI (5 hops)
using a single link. As FTP transfers data via a single link, a
maximum transfer rate between 12 and 35 Kbps (depending on number of
IMP hops) can be expected if that file transfer is the only activity
going on. In this light our maximum transfer rate of 27 Kbps to BBN
(2 hops) is probably the most one can expect out of any program. The
average transfer rate of 9.4 Kbps (for Image-36 transfer) also
appears reasonable in view of the fact that during many of the
transfers other network activity is also going on, and that many of
the transfers are performed when the respective computer systems are
quite heavily loaded. Our measurement data does reveal that transfer
rate is appreciably higher during the times a computer is likely to
be lightly loaded.
The above does not mean that improvements are not possible or not
required in the state of the ARPANET data transfer. Our measurement
data has revealed areas in which improvements can be and should be
made. For example, the transfer of data to other MIT Hosts (0 IMP
hops) and back to ourselves should be faster than what we currently
achieve (transfer to BBN is faster!). The probable reason for the
above discrepancy is that our allocation (Host-Host protocol) is very
small (2944 bits) as compared to that provided by BBN (17724 bits).
This means that to transfer data our Network Control Program (NCP)
has to wait for an allocation many more times while communicating to
an ITS system than to a TENEX system. Large allocations are always
desirable but even more so while transferring files. NCP designers
can (and should) modify NCP's to allow large allocates (larger NCP
buffers) for file transfer even at the expense of smaller allocates
for other types of connections (such as a terminal connected to a
computer system) which do not require or use the larger allocation.
In addition, a new allocate should be sent as soon as data is read by
the receiving program (the NCP should not wait for the allocation to
become zero before sending the new allocate).
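To give a rough feel for the effect of allocation size, the sketch
below counts how many allocations a sender consumes for a file of a
given size. It is a simplification under stated assumptions: it
considers only the bit allocation, assumes each allocation is used up
completely before the next is needed, and ignores message counts and
timing. The function and constant names are hypothetical.

```python
ITS_ALLOCATION_BITS = 2944     # our allocation, as quoted above
TENEX_ALLOCATION_BITS = 17724  # allocation provided by BBN, as quoted above


def allocations_needed(file_bits, allocation_bits):
    """Number of allocations consumed to send file_bits (ceiling division)."""
    if file_bits < 0 or allocation_bits <= 0:
        raise ValueError("file_bits must be >= 0 and allocation_bits > 0")
    return -(-file_bits // allocation_bits)


if __name__ == "__main__":
    file_bits = 258048  # one of the file sizes appearing in Table III
    for name, alloc in (("ITS", ITS_ALLOCATION_BITS),
                        ("TENEX", TENEX_ALLOCATION_BITS)):
        print(name, allocations_needed(file_bits, alloc))
```

Under these assumptions a 258048-bit file needs 88 allocations at
2944 bits each but only 15 at 17724 bits each, so the sender may have
to pause for a new allocation roughly six times as often.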
We also observed that small files are transferred at a significantly
lower transfer rate than large files but beyond a file size of 40
Kbits, the file size makes little difference in transfer rate or
processing cost per bit transferred. The figure of 40 Kbits is
probably related to the size of sending and receiving buffers used by
the programs. In general, for most practical values of buffer size,
the larger the buffer size and allocations, the faster and more
efficient will be the transfer. Unfortunately, large NCP buffers are
not easily available in many systems and come at a premium. The
information on average file size (240 Kbits for Image and 49 Kbits
for ASCII files) may be helpful in optimum allocation of buffer
space.
V. REQUEST FOR COMMENTS AND SUGGESTIONS
It is hoped that the above measurement results and our FTP and SURVEY
measurement facilities will help ARPANET users plan their modes of
Network usage and help Network programmers in making the Network
better. This RFC is indeed a Request For Comments and your
suggestions on the way we collect, store, and display measurement
data will be greatly appreciated. We can break down the measurement
data by Host and will be happy to provide the information if it is
considered desirable. Please let me know what other parameters we
should record or display. You may communicate with me via the
ARPANET (AKB at MIT-DMS (Host 70), NIC Ident AKB), via telephone
(617-253-1428 or 1449), or US mail (Rm. 208, 545 Tech Square,
Cambridge, Mass 02139).
[ This RFC was put into machine readable form for entry ]
[ into the online RFC archives by Robert Baskerville 9/98 ]