Network Working Group                                         A. Bhushan
Request for Comments: 573                                         MIT-DM
NIC: 19083                                             14 September 1973


           DATA AND FILE TRANSFER - SOME MEASUREMENT RESULTS


   During the last six months, we have been monitoring (although not
   continuously) the performance of our FTP-user and FTP-server
   programs.  The purpose of this paper is to  1) discuss measurement
   criteria,  2) describe the measurement facilities, 3) report the
   relevant measurement results,  4) discuss the significance of results
   and compare them with other measurement data, and 5) ask for
   suggestions on our measurement and summarizing procedures.

I. THE MEASUREMENT CRITERIA

   The FTP (Ref. "The File Transfer Protocol", by Abhay Bhushan, NWG/RFC
   354, NIC 10596) may be considered a facility for data transfer
   between file systems.  The relevant measurement parameters for a data
   transfer facility are:

   1) Transfer rate (both peak and average, measured in bits per second)
   which determines the throughput of the data transfer facility.

   2) Response time or delay (measured in seconds) which determines the
   "interactibility" of the facility.

   3) Processing cost (measured in dollars or cpu-seconds per megabit
   transferred) for transferring the data between the network and the
   file system.  This is only one component of the cost of transferring
   data, the other component being the communication cost (including IMP
   processing costs) which we take as given.

   4) Failure-to-connect rate - average time elapsed between failures to
   connect to the facility (measured in hours).  Failures could be in
   the Host (processor and file system) hardware or software, or in the
   IMPs and telephone lines.

   5) Availability - the percentage of time a given facility is
   available, or alternatively the probability of finding the facility
   available at a given time.

   6) Accuracy - measured by the probability of error in transferring
   bits, bytes, blocks, or files.





RFC 573                  DATA AND FILE TRANSFER           September 1973


II.  THE MEASUREMENT FACILITIES

   The MIT-DMS survey program (ref. "A Report on the Survey Project" by
   Abhay Bhushan, NWG/RFC 530, NIC 17375) measures the response-time,
   failure-to-connect rate, and availability of the Host-logger facility
   (on socket 1).  Our preliminary experiments have indicated that the
   corresponding measurement results for the FTP are very close to those
   for the logger (at least they are the same order-of-magnitude).  As
   the use of FTP and the ARPANET is increasing rapidly, most Hosts have
   their logger and FTP operational whenever their Host and NCP (Network
   Control Program) are functioning.  The response time for obtaining
   the use of FTP service is very close to that for obtaining the use of
   the logger service as both involve the use of the ICP (Initial
   Connection Protocol).

   Preliminary results from the Survey Project indicate that the average
   response time in recent months has been about 2.7 seconds.  The
   average availability has been about 85% with the failure-to-connect
   rate being about once every 10 hours.  Table I shows summary results
   for the time period August 26 through August 31, 1973, for three
   Hosts with TENEX operating systems (SRI-ARC (NIC), BBN-TENEXA, and
   USC-ISI).

   The reader is cautioned that the data below reflects the Host
   performance as seen by the MIT-DMS survey program which surveys the
   Hosts only once every twenty minutes.  Consequently, the actual host
   performance may be somewhat different.  Also, we cannot distinguish
   among IMP, telephone line, and Host failures, and the response time
   of a Host is affected by its distance (number of IMP hops) from the
   MIT IMP (IMP 6).

   In the data shown in Table I, each success or fail response is
   considered to have a duration of 20 minutes, so Hosts are given the
   benefit of the doubt for the time we are not surveying.  In addition,
   the response time has been averaged only over the successful
   (logger available) responses.  The logger is considered available if
   the SURVEY program can establish a full-duplex connection within 20
   seconds.  The Host is considered available when it is not in the
   "DEAD" state (the states in which the logger is not up but the Host
   is available are "logger not responding" and "logger rejecting").
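
   The bookkeeping described above can be sketched as follows.  This is
   a minimal illustration and not the SURVEY program itself: the state
   names, the Poll record, and the rule that a failure is counted at
   each transition from an available to an unavailable logger are all
   assumptions made for the example.

```python
from dataclasses import dataclass

SURVEY_INTERVAL_MIN = 20  # each Host is polled once every 20 minutes


@dataclass
class Poll:
    """One survey poll; the state names are assumptions for this sketch."""
    state: str                 # "OK", "DEAD", "NOT_RESPONDING" or "REJECTING"
    response_sec: float = 0.0  # meaningful only when state == "OK"


def summarize(polls):
    """Return availability, mean response time, and hours between failures."""
    n = len(polls)
    ok = [p for p in polls if p.state == "OK"]
    host_up = [p for p in polls if p.state != "DEAD"]
    # Assumed rule: one failure per transition from OK to any other state.
    failures = sum(
        1 for prev, cur in zip(polls, polls[1:])
        if prev.state == "OK" and cur.state != "OK"
    )
    # Each poll is taken to stand for a full 20-minute interval.
    hours = n * SURVEY_INTERVAL_MIN / 60
    return {
        "host_availability": len(host_up) / n,
        "logger_availability": len(ok) / n,
        # Response time is averaged over successful polls only.
        "mean_response_sec": sum(p.response_sec for p in ok) / len(ok) if ok else None,
        "hours_between_failures": hours / failures if failures else None,
    }
```

   Because every poll stands for a full 20 minutes, an outage shorter
   than the polling interval may be missed entirely or counted as 20
   minutes long.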













                                TABLE I

   RESPONSE TIME, AVAILABILITY, AND FAILURE RATE FOR SELECTED HOSTS
          (based on SURVEY data for 8/25/73 through 8/31/73)

            PARAMETER                      NIC     BBN     ISI

           Average Response-time (sec.)    2.7     2.4     3.0
           Host Availability               93%     85%     87%
           Logger Availability             91%     79%     83%
           Failure-to-connect rate
           for Host (hours)                18.2    9.4     18.1
           Failure-to-connect rate
           for logger (hours)              16.0    6.0     10.0

   The details on the above measurements will be reported in a forth-
   coming paper.  This paper will focus on the remaining parameters of
   transmission rate, processing costs and accuracy, as measured by the
   MIT-DMS File Transfer Measurement facility.

   The FTP measurement facility exists in the MIT-DMS CALICO subsystem.
   Each time the MIT-DMS FTP-user or FTP-server program in the CALICO
   subsystem is used to transfer files (and data) via the ARPANET, it
   records in a local disk file the following transfer parameters: the
   remote Host involved, the date and time the transfer is initiated,
   the total number of bits transferred, the real time taken (in
   seconds) for the transfer, the CPU time (in micro-seconds) used by
   the program, whether the program is the server or user, and the FTP
   parameter settings for byte size (BYTE), representation type (TYPE),
   transfer mode (MODE), and the file structure (STRU).  Programs exist
   in CALICO to display and summarize this data.
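
   As an illustration of the record described above, the following
   sketch holds one transfer and derives the two displayed quantities
   from it.  The class and field names are hypothetical; the actual
   layout of the CALICO disk file is not given here.

```python
from dataclasses import dataclass


@dataclass
class TransferRecord:
    """One recorded transfer; field names are illustrative only."""
    host: str        # remote Host involved
    started: str     # date and time the transfer was initiated
    bits: int        # total number of bits transferred
    real_sec: float  # real time taken, in seconds
    cpu_usec: int    # CPU time used by the program, in microseconds
    is_server: bool  # True for the FTP-server, False for the FTP-user
    byte_size: int   # BYTE
    rep_type: str    # TYPE
    mode: str        # MODE
    structure: str   # STRU

    def bps(self) -> float:
        """Transfer rate in bits per second."""
        return self.bits / self.real_sec

    def cpu_usec_per_bit(self) -> float:
        """CPU microseconds per bit (equally, CPU seconds per Megabit)."""
        return self.cpu_usec / self.bits
```

   CPU microseconds per bit and CPU seconds per Megabit are the same
   number, since both the numerator and the denominator scale by a
   million.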

   It should be noted that no measurements are recorded when the non-
   CALICO FTP-user and FTP-server programs are used for transferring
   files.  The measurements therefore represent a small subset of our
   total FTP usage.  The CALICO FTP-server was operated only until May
   1973, when we switched to the non-CALICO FTP-server.  (The switch
   was made because CALICO, still undergoing development, is somewhat
   less reliable.  As CALICO stabilizes we may again operate the CALICO
   server and continue measuring data transfer.)  In addition, many
   users prefer to use the simpler (involving fewer system resources)
   stand-alone FTP-user program.  The measurement does include the data
   transferred when FTP is used indirectly by such commands as "copy",
   "print", "listf", and "mail.file" in the CALICO NETWRK subsystem.

III.  THE MEASUREMENT RESULTS

   The measurement facility has been operational (though not
   continuously) since 25 February 1973.  It has recorded the transfer
   of 304 files consisting of 57.6 million bits.  Over 90% of the bits
   transferred (but only 75% of the files) used the more efficient
   Image-36 stream mode (TYPE I, BYTE 36, MODE S) of transfer.  The
   remainder of the files were transferred using the ASCII-8 stream mode
   (TYPE A, BYTE 8, MODE S).  It should be noted that even though block
   mode was available, it was never used by our users (primarily because
   many FTP-servers do not implement it, and it is less efficient to
   use).  All the files had a sequential non-record file structure (STRU
   F).  A summary of the measurement results is shown in Table II.

                               TABLE II

                  SUMMARY OF FTP MEASUREMENT RESULTS

   Subset of data  # Files  # bits  Av. File  Speed    CPU-use
                             Mbits    Kbits    Kbps     sec/Mb

   Total             304     57.6      189     7.56       4

   Image 36 mode     223     53.6      240     9.35       3

   ASCII-8 mode       81      4.0       49     2.09      19

   Server sending     62      3.8       61     7.50       2

   Server receiving  110     19.8      180     7.44       1

   User receiving     83     22.8      276     7.92       6

   User sending       49     11.1      225     7.09       4
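
   The average file sizes in Table II can be cross-checked from the
   file and bit counts in the same rows.  The sketch below does this
   for three rows; since the table entries are rounded, agreement is
   only to the nearest Kbit or so.

```python
# (number of files, Mbits transferred), copied from Table II
TABLE_II = {
    "Total": (304, 57.6),
    "Image-36": (223, 53.6),
    "ASCII-8": (81, 4.0),
}


def avg_file_kbits(files, mbits):
    """Average file size in Kbits."""
    return mbits * 1000 / files


total_files, total_mbits = TABLE_II["Total"]
for name, (files, mbits) in TABLE_II.items():
    print(f"{name:9} avg {avg_file_kbits(files, mbits):4.0f} Kbits  "
          f"{100 * files / total_files:3.0f}% of files  "
          f"{100 * mbits / total_mbits:3.0f}% of bits")
```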

   The entire display of the measurement data and the summaries shown in
   Table II  are generated by the "PFTPST" (Print FTP Statistics)
   program in the CALICO subsystem.  A sample of the data displayed is
   shown in Table III.  The BPS (bits per second) and the M/B (CPU
   microseconds per bit or CPU seconds per Megabit) information is
   calculated by the displaying program.  The largest file transferred
   was 5.03 Mbits, a "STOR" by the FTP-user to MIT-AI.  The transfer
   took 10 minutes of real time for a transfer rate of a little over 10
   Kbps.  The highest data transfer rate recorded was 27.8 Kbps, a
   "RETR" from BBN-TENEXA to MIT-DMS FTP-server.  The length of the file
   in the above case was 28 Kbits.  Both of the above transfers used
   the more efficient Image-36 mode.  The smallest file and the lowest
   transfer rate recorded were both an 80-bit "MLFL" to MIT-ML (using
   ASCII-8), which took 7 seconds of real time for a transfer rate of
   11 bps.

                               TABLE III

                SAMPLE DISPLAY OF FTP MEASUREMENT DATA

   -#- ---HOST--- COMM --DATE-- --TIME-- --BITS-- -BPS- M/B T BY PRG

     2 sri-arc    STOR 73/08/09 18:19:49   121392  1395  21 I 36 U
   198 mit-ml     STOR 73/08/15 15:00:30    50688  5336   8 I 36 U
   198 mit-ml     RETR 73/08/15 15:01:14    50688 10137  12 I 36 U
   198 mit-ml     STOR 73/08/15 15:02:33   255456  8808   7 I 36 U
   198 mit-ml     RETR 73/08/15 15:03:58   258048  8601  12 I 36 U
   134 mit-ai     STOR 73/08/15 15:13:17   286720  1898  29 A  8 U
   134 mit-ai     RETR 73/08/15 15:18:39   258048  9557  14 I 36 U
   134 mit-ai     STOR 73/08/15 15:19:42   258048  6974   7 I 36 U
     2 sri-arc    RETR 73/08/15 15:31:20     7236  3618  22 I 36 U
     2 sri-arc    STOR 73/08/15 15:32:55    49428  8238  31 I 36 U
     2 sri-arc    RETR 73/08/15 15:34:56    49428  3530  15 I 36 U
     2 sri-arc    STOR 73/08/15 15:38:09    49428  7061   8 I 36 U
     2 sri-arc    STOR 73/08/20 15:18:26    35460  2364   9 I 36 U
     2 sri-arc    RETR 73/08/20 16:08:09    58832   426 153 A  8 U
     2 sri-arc    RETR 73/08/22 12:46:10    10512   166 247 A  8 U
     2 sri-arc    RETR 73/08/23 16:29:37      320    64 369 A  8 U
     2 sri-arc    RETR 73/08/24 12:25:38     9992   262 254 A  8 U
     2 sri-arc    RETR 73/08/24 12:27:26     9992   454 250 A  8 U
   198 mit-ml     STOR 73/08/29 10:40:58   768924  7538   7 I 36 U
   198 mit-ml     STOR 73/08/29 10:44:09   166572  5552   7 I 36 U
   198 mit-ml     STOR 73/08/29 10:54:32   166572  7932   7 I 36 U
   198 mit-ml     STOR 73/08/29 13:48:18   158040 12156   7 I 36 U
    69 bbn-tenexa MLFL 73/08/29 22:30:55     5600  1866  51 A  8 U
    69 bbn-tenexa MLFL 73/08/29 22:31:42     5600  2800  50 A  8 U
    86 usc-isi    MLFL 73/08/29 22:33:55     5600  1400  54 A  8 U
    69 bbn-tenexa MLFL 73/08/29 22:36:15     5600  2800  48 A  8 U
    69 bbn-tenexa MLFL 73/08/29 22:36:54     5600  2800  49 A  8 U
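
   A display line in the format of Table III can be read back with a
   short routine such as the one below.  This is a sketch, not part of
   PFTPST: the field names are invented here, and the first column is
   assumed to be the Host number.  Real time and CPU time are not
   displayed, but can be recovered approximately from the BPS and M/B
   columns.

```python
from datetime import datetime

# Field names are hypothetical labels for the Table III columns.
FIELDS = ("host_no", "host", "command", "date", "time",
          "bits", "bps", "usec_per_bit", "type", "byte", "program")


def parse_row(line):
    """Parse one whitespace-separated display line into a dict."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    row = dict(zip(FIELDS, parts))
    for key in ("host_no", "bits", "bps", "usec_per_bit", "byte"):
        row[key] = int(row[key])
    stamp = row.pop("date") + " " + row.pop("time")
    row["started"] = datetime.strptime(stamp, "%y/%m/%d %H:%M:%S")
    # Derived back from the displayed (rounded) columns.
    row["real_sec"] = row["bits"] / row["bps"]
    row["cpu_sec"] = row["bits"] * row["usec_per_bit"] / 1e6
    return row
```

   Because BPS and M/B are displayed as whole numbers, the recovered
   times are approximate.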

   It should be pointed out that recent measurement data for ASCII-8
   transfer includes retrieval of "NIC Journal" documents
   ("<Xjournal>xxxxx.nls;xnls" files) from SRI-ARC.  SRI-ARC converts
   these "xnls" files from NLS to sequential form on the fly, which
   takes considerable time and gives a low transfer rate for these
   transfers.

   In transferring files we found the ARPANET and the FTP to be quite
   reliable.  On numerous occasions we transferred a complete listing of
   our operating system (about 6 million bits), reassembled it and ran
   it with no problem.  No data lossage problems have been reported to
   us as yet.

IV.  THE SIGNIFICANCE OF MEASUREMENT RESULTS

   First of all let me state my complete agreement with Barry Wessler
   (Ref. "Revelations in Network Host Measurements" NWG/RFC 557, NIC
   18457) that the measurement results should be taken in the spirit:
   "Here is a place to make the Network better" rather than:  "Look,
   isn't the Network terrible."  We take these measurements in the same
   spirit and have found the measurement effort to be quite fruitful.
   In several instances, with the aid of our measurement facilities, we
   have been able to improve the performance of our Network programs by
   an order-of-magnitude (just as Don Allen at BBN improved Greg Hicks'
   RJS program).

   Our measurement results are in close agreement with the BBN FTP
   measurements (8.2 cpu seconds/Mb for 8-bit byte and 2 CPU seconds/Mb
   for 36-bit byte transfers).  We also find the 36-bit byte transfer to
   be an order-of-magnitude more efficient than 8-bit byte transfer.
   The processing cost (assuming $6.00 per CPU minute) for transferring
   a Megabit of information comes to about $1.90 for ASCII-8 mode as
   compared to only $0.30 for Image-36 mode.  The difference in
   transfer rate is equally astounding, being 9.4 Kbps for Image-36 as
   compared to only 2 Kbps for ASCII-8.
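
   The cost figures above follow directly from the CPU-use column of
   Table II and the assumed rate of $6.00 per CPU minute, as this short
   calculation shows (the function name is illustrative):

```python
CPU_DOLLARS_PER_MINUTE = 6.00  # rate assumed in the text


def cost_per_megabit(cpu_sec_per_mbit):
    """Processing cost in dollars for one Megabit transferred."""
    return cpu_sec_per_mbit * CPU_DOLLARS_PER_MINUTE / 60


# CPU seconds per Megabit taken from Table II
for mode, cpu_sec in (("ASCII-8", 19), ("Image-36", 3)):
    print(f"{mode}: ${cost_per_megabit(cpu_sec):.2f} per Mbit")
```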

   It is therefore recommended that Image-36 mode be used as much as
   possible to transfer data between PDP-10s (of which there are many on
   the ARPANET).  It is strongly urged that protocols and programs allow
   (and use) the Image-36 mode for all data transfers including mailing
   files (MLFL), listing directories (LIST, NLST), and
   sending/retrieving NIC Journal documents.  Many of the MIT-DMS user
   programs such as "COPY" and "FTP" take advantage of the fact that the
   remote Host is a PDP-10 (there is a table of PDP-10's in "COPY") and
   use the more efficient Image-36 mode.  Such a procedure is highly
   recommended.

   The effective IMP-IMP data transfer rate is about 37.5 Kbps over the
   50 Kbps telephone line (Ref.  McQuillan John M., "Throughput in the
   ARPA Network--Analysis and Measurement," BBN Report 2491, NIC 14188,
   January 1971).  The Host-to-Host data transfer measurements performed
   by BBN (above reference, p. 28) have indicated a transfer rate of
   30-35 Kbps BBN-to-BBN (0 IMP hops) and 12-16 Kbps BBN-to-SRI (5 hops)
   using a single link.  As FTP transfers data via a single link, a
   maximum transfer rate between 12 and 35 Kbps (depending on number of
   IMP hops) can be expected if that file transfer is the only activity
   going on.  In this light our maximum transfer rate of 27 Kbps to BBN
   (2 hops) is probably the most one can expect out of any program.  The
   average transfer rate of 9.4 Kbps (for Image-36 transfer) also
   appears reasonable in view of the fact that during many of the
   transfers other network activity is also going on, and that many of
   the transfers are performed when the respective computer systems are
   quite heavily loaded.  Our measurement data does reveal that transfer
   rate is appreciably higher during the times a computer is likely to
   be lightly loaded.

   The above does not mean that improvements are not possible or not
   required in the state of the ARPANET data transfer.  Our measurement
   data has revealed areas in which improvements can be and should be
   made.  For example, the transfer of data to other MIT Hosts (0 IMP
   hops) and back to ourselves should be faster than what we currently
   achieve (transfer to BBN is faster!).  The probable reason for the
   above discrepancy is that our allocation (Host-Host protocol) is very
   small (2944 bits) as compared to that provided by BBN (17724 bits).
   This means that to transfer data our Network Control Program (NCP)
   has to wait for an allocation many more times while communicating to
   an ITS system than to a TENEX system.  Large allocations are always
   desirable but even more so while transferring files.  NCP designers
   can (and should) modify NCP's to allow large allocates (larger NCP
   buffers) for file transfer even at the expense of smaller allocates
   for other types of connections (such as a terminal connected to a
   computer system) which do not require or use the larger allocation.
   In addition, a new allocate should be sent as soon as data is read by
   the receiving program (the NCP should not wait for the allocation to
   become zero before sending the new allocate).
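
   The effect of allocation size can be illustrated by counting how
   many allocations a sender needs in order to move one file.  The
   sketch below is a simplification: it considers only the bit
   allocation and assumes that each allocation is used up completely
   before the next one arrives.

```python
import math


def allocation_rounds(file_bits, allocation_bits):
    """Allocations needed to send file_bits, one full allocation at a time."""
    return math.ceil(file_bits / allocation_bits)


FILE_BITS = 258048  # size of one of the transfers listed in Table III
for name, alloc in (("2944-bit allocation", 2944),
                    ("17724-bit allocation", 17724)):
    print(f"{name}: {allocation_rounds(FILE_BITS, alloc)} rounds")
```

   Under these assumptions the smaller allocation needs 88 rounds and
   the larger one 15 for the same file.  Each round in which the sender
   must wait adds delay, which is why sending the new allocate early
   helps.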

   We also observed that small files are transferred at a significantly
   lower transfer rate than large files but beyond a file size of 40
   Kbits, the file size makes little difference in transfer rate or
   processing cost per bit transferred.  The figure of 40 Kbits is
   probably related to the size of sending and receiving buffers used by
   the programs.  In general, for most practical values of buffer size,
   the larger the buffer size and allocations, the faster and more
   efficient will be the transfer.  Unfortunately, large NCP buffers are
   not easily available in many systems and come at a premium.  The
   information on average file size (240 Kbits for Image and 49 Kbits
   for ASCII files) may be helpful in optimum allocation of buffer
   space.
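
   One way to examine the file-size effect is to split the recorded
   transfers at a size threshold and compare the aggregate rate on each
   side.  The sketch below does so for a few Image-36 rows taken from
   Table III.  It is only an illustration on a handful of rows, not the
   analysis behind the 40 Kbit figure; taking 40 Kbits as 40,000 bits
   is an assumption, and the function names are hypothetical.

```python
def aggregate_bps(rows):
    """Total bits divided by total real time for (bits, bps) rows."""
    total_bits = sum(bits for bits, _ in rows)
    total_sec = sum(bits / bps for bits, bps in rows)
    return total_bits / total_sec


def split_by_size(rows, threshold_bits=40_000):
    """Separate rows into those below and those at or above the threshold."""
    small = [r for r in rows if r[0] < threshold_bits]
    large = [r for r in rows if r[0] >= threshold_bits]
    return small, large


# (bits, bps) for a few Image-36 transfers listed in Table III
SAMPLE = [(7236, 3618), (35460, 2364), (49428, 8238),
          (50688, 5336), (258048, 9557), (768924, 7538)]

small, large = split_by_size(SAMPLE)
print(f"under 40 Kbits:    {aggregate_bps(small):.0f} bps")
print(f"40 Kbits and over: {aggregate_bps(large):.0f} bps")
```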










V. REQUEST FOR COMMENTS AND SUGGESTIONS

   It is hoped that the above measurement results and our FTP and SURVEY
   measurement facilities will help ARPANET users plan their modes of
   Network usage and help Network programmers in making the Network
   better.  This RFC is indeed a Request For Comments and your
   suggestions on the way we collect, store, and display measurement
   data will be greatly appreciated.  We can break down the measurement
   data by Host and will be happy to provide the information if it is
   considered desirable.  Please let me know what other parameters we
   should record or display.  You may communicate with me via the
   ARPANET (AKB at MIT-DMS (Host 70), NIC Ident AKB), via telephone
   (617-253-1428 or 1449), or US mail (Rm. 208, 545 Tech Square,
   Cambridge, Mass 02139).


         [ This RFC was put into machine readable form for entry ]
        [ into the online RFC archives by Robert Baskerville 9/98 ]